This document provides essential context and guidance for LLMs working on the Algo VPN codebase.
Algo is an Ansible-based tool that sets up a personal VPN in the cloud. It's designed to be:
- Security-focused: Creates hardened VPN servers with minimal attack surface
- Easy to use: Automated deployment with sensible defaults
- Multi-platform: Supports various cloud providers and operating systems
- Privacy-preserving: No logging, minimal data retention
- VPN Protocols: WireGuard (preferred) and IPsec/IKEv2
- Configuration Management: Ansible (v12+)
- Languages: Python, YAML, Shell, Jinja2 templates
- Supported Providers: AWS, Azure, DigitalOcean, GCP, Vultr, Hetzner, local deployment
- Stability over features
- Security over convenience
- Clarity over cleverness
- Test everything
- Stay in scope - solve exactly what the issue asks, nothing more
- Test assumptions - run the code before committing
- Resist new dependencies - each one is attack surface and maintenance
```
algo/
├── main.yml            # Primary playbook
├── users.yml           # User management playbook
├── server.yml          # Server-specific tasks
├── config.cfg          # Main configuration file
├── pyproject.toml      # Python project configuration and dependencies
├── uv.lock             # Exact dependency versions lockfile
├── requirements.yml    # Ansible collections
├── roles/              # Ansible roles
│   ├── common/         # Base system configuration, firewall, hardening
│   ├── wireguard/      # WireGuard VPN setup
│   ├── strongswan/     # IPsec/IKEv2 setup
│   ├── dns/            # DNS configuration (dnscrypt-proxy)
│   └── cloud-*/        # Cloud provider specific roles
├── library/            # Custom Ansible modules
└── tests/unit/         # Python unit tests
```
All PRs must pass these checks locally before submission. CI will reject failures:
```shell
# Run the full lint suite (same as CI)
ansible-lint . && yamllint . && ruff check . && shellcheck scripts/*.sh && semgrep --config auto --exclude-rule dockerfile.security.last-user-is-root.last-user-is-root --error --quiet .
ansible-playbook main.yml --syntax-check
ansible-playbook users.yml --syntax-check
pytest tests/unit/ -q
```
Common lint issues to fix before submitting:
- YAML files missing `---` document start markers
- GitHub workflows with unquoted `on:` (must be `'on':`)
- Using `ignore_errors: true` instead of `failed_when: false`
- Jinja2 spacing errors (`{{foo}}` should be `{{ foo }}`)
- Missing `mode:` on file/directory tasks
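Several of these issues are regex-detectable before CI ever runs. A minimal sketch (the `find_spacing_errors` helper and its pattern are illustrative, not part of Algo's tooling):

```python
import re

# Flags Jinja2 expressions missing inner padding, e.g. "{{foo}}" or "{{ foo}}"
SPACING = re.compile(r"\{\{(?!\s)|(?<!\s)\}\}")

def find_spacing_errors(text: str) -> list[str]:
    """Return lines containing badly spaced Jinja2 delimiters."""
    return [line for line in text.splitlines() if SPACING.search(line)]

bad = find_spacing_errors('addr: "{{foo}}"\nok: "{{ bar }}"')
print(bad)  # → ['addr: "{{foo}}"']
```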
No warnings are tolerated in CI. Every linter finding must be either fixed or explicitly allowlisted in the tool's config file (.ansible-lint, pyproject.toml, etc.).
Why this matters for Algo:
- Security tool - VPN misconfigurations silently break privacy guarantees. A "cosmetic" warning today hides a real bug tomorrow.
- Ansible complexity - YAML+Jinja2 linting catches real runtime failures (wrong key order breaks `when` evaluation, spacing errors cause template failures). Warnings in Ansible are not style nits.
- CI signal integrity - If 30 warnings scroll by on every run, the 31st one (a real regression) goes unnoticed. Zero warnings means every new finding gets human attention.
Resolution order of preference:
- Fix it - Preferred. Most findings have straightforward fixes.
- Allowlist in config - If the rule is wrong for this project, add it to `skip_list` with a comment explaining why.
- Inline suppress - Last resort. Use `# noqa: rule-name` with a comment justifying the exception.
Never use `warn_list` in `.ansible-lint` — it exists as a migration tool, not a permanent home. Rules either pass or are explicitly skipped.
When adding or modifying features, verify these before requesting review:
- Validate inputs early - Check for empty lists, missing configs, permission mismatches before expensive operations
- Explicit file modes - Always specify `mode:` on file/directory tasks (never rely on umask)
- Fail vs warn - Permission/security issues should fail; optional features can warn
- Actionable errors - Include fix commands in error messages: `"Run: sudo chown -R $USER configs/"`
- Follow existing patterns - Search the codebase first: `rg "when:.*localhost" --type yaml`
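The "validate inputs early" and "actionable errors" rules combined, as a hypothetical Python helper (the function name and messages are illustrative, not Algo's actual code):

```python
def validate_users(users: list[str]) -> list[str]:
    """Fail fast, before any expensive cloud operations, and put the
    fix command directly in the error message."""
    if not users:
        raise ValueError(
            "No VPN users defined. Fix: add users to config.cfg and re-run."
        )
    dupes = {u for u in users if users.count(u) > 1}
    if dupes:
        raise ValueError(f"Duplicate users in config.cfg: {sorted(dupes)}")
    return users

print(validate_users(["alice", "bob"]))  # → ['alice', 'bob']
```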
| Tool | Target | Key Rules |
|---|---|---|
| `ansible-lint` | YAML tasks | Use `failed_when` not `ignore_errors`, add `mode:` to files |
| `yamllint` | All YAML | Document start `---`, quote `'on':` in workflows |
| `ruff` | Python | Line length 120, target Python 3.11 |
| `shellcheck` | Shell scripts | Quote variables, use `set -euo pipefail` |
| `semgrep` | All code | SAST scanner, `--config auto`, suppress with `# nosemgrep: rule-id` |
- Create feature branches from `master`
- Run all linters before pushing
- Make atomic commits with clear messages
- Update PR description with test results
Before creating a PR, review your own diff:
- Did I run all linters locally?
- Did I search for similar patterns in the codebase?
- Did I add explicit `mode:` to file/directory tasks?
- Did I validate inputs before expensive operations?
- Did I update tests if I changed file paths or behavior?
- Would a reviewer ask "what happens if X is empty/missing?"
`with_items` auto-flattens lists; `loop` does not. Never mechanically convert:
```yaml
# WRONG - treats list as single item, creates file named "['alice', 'bob']"
loop:
  - "{{ users }}"

# CORRECT - iterates over list contents
loop: "{{ users }}"

# CORRECT - combining lists (with_items did this automatically)
loop: "{{ users + [server_name] }}"
```
Always test loop conversions - verify the task creates expected files.
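The semantic difference can be sketched in Python (`with_items_flatten` is a stand-in for Ansible's behavior, not its actual implementation):

```python
def with_items_flatten(values):
    """Mimic with_items' single-level flatten of nested lists."""
    out = []
    for v in values:
        out.extend(v) if isinstance(v, list) else out.append(v)
    return out

users = ["alice", "bob"]

# with_items: - "{{ users }}"  → flattened into two items
assert with_items_flatten([users]) == ["alice", "bob"]

# loop: - "{{ users }}"  → no flattening: one item, the whole list,
# which str()s into a filename like "['alice', 'bob']"
assert [str(item) for item in [users]] == ["['alice', 'bob']"]
```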
Never include trailing slashes - causes double-slash bugs:
```yaml
# WRONG - creates paths like /etc/ipsec.d//private
ipsec_path: "configs/{{ server }}/ipsec/"

# CORRECT
ipsec_path: "configs/{{ server }}/ipsec"
```
Use `failed_when: false` rather than `ignore_errors: true` for expected failures:
```yaml
# WRONG - ansible-lint failure
- name: Clear history
  command: some_command
  ignore_errors: true

# CORRECT - explicit about expected failures
- name: Clear history
  command: some_command
  failed_when: false
```
Handlers and check commands that don't modify state need `changed_when: false`:
```yaml
- name: Check service status
  command: systemctl status foo
  changed_when: false
```
Ansible 12 enables `jinja2_native` by default, changing how values are evaluated:
Boolean conditionals require actual booleans:
```yaml
# WRONG - string "true" is not boolean
ipv6_support: "{% if ipv6 %}true{% else %}false{% endif %}"

# CORRECT - return actual boolean
ipv6_support: "{{ ipv6 is defined }}"
```
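The failure mode mirrors plain Python truthiness - any non-empty string, including `"false"`, is truthy:

```python
# The string "false" (what the string-template approach produces) is truthy
assert bool("false") is True

# An actual boolean expression evaluates as intended
ipv6 = None
assert (ipv6 is not None) is False
```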
No nested templates in `lookup()`:
```yaml
# WRONG - deprecated double-templating
key: "{{ lookup('file', '{{ SSH_keys.public }}') }}"

# CORRECT - pass variable directly
key: "{{ lookup('file', SSH_keys.public) }}"
```
JSON files need explicit parsing:
```yaml
# WRONG - returns string in native mode
creds: "{{ lookup('file', 'credentials.json') }}"

# CORRECT - parse JSON explicitly
creds: "{{ lookup('file', 'credentials.json') | from_json }}"
```
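What `from_json` does is ordinary JSON parsing; a Python analogue (the file contents here are a made-up sample):

```python
import json

# lookup('file', ...) hands back the raw file contents as a string
raw = '{"region": "nyc3", "do_token": "sample-token"}'

# the explicit parse that from_json performs
creds = json.loads(raw)
assert isinstance(raw, str)
assert creds["region"] == "nyc3"
```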
`default()` doesn't trigger on empty strings:
```yaml
# WRONG - empty string '' is not undefined
key: "{{ lookup('env', 'AWS_KEY') | default('fallback') }}"

# CORRECT - add true to handle falsy values
key: "{{ lookup('env', 'AWS_KEY') | default('fallback', true) }}"
```
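A Python sketch of the filter's semantics (`UNDEFINED` stands in for Jinja2's Undefined type; this is an analogue, not Jinja's source):

```python
UNDEFINED = object()  # stand-in for Jinja2's Undefined

def jinja_default(value, fallback, boolean=False):
    """Analogue of Jinja2's default(): replaces undefined values, and
    with boolean=True also replaces falsy ones - like the '' that an
    unset environment variable yields."""
    if value is UNDEFINED:
        return fallback
    if boolean and not value:
        return fallback
    return value

assert jinja_default("", "fallback") == ""                        # default('fallback')
assert jinja_default("", "fallback", boolean=True) == "fallback"  # default('fallback', true)
assert jinja_default(UNDEFINED, "fallback") == "fallback"
```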
Complex Jinja loops break in `set_fact`:
```yaml
# WRONG - list comprehension fails in native mode
servers: "[{% for s in configs %}{{ s.name }},{% endfor %}]"

# CORRECT - use Ansible loop
servers: "{{ servers | default([]) + [item.name] }}"
loop: "{{ configs }}"
```
Use tests (not filters) for boolean checks:
```yaml
# WRONG - filters return transformed data, not booleans
that: my_ip | ansible.utils.ipv4

# CORRECT - tests return native booleans
that: my_ip is ansible.utils.ipv4_address
```
Algo uses a randomly generated IP in 172.16.0.0/12 on the loopback interface (`local_service_ip`) for DNS. This provides consistency across WireGuard and IPsec but requires understanding systemd socket activation.
- Consistent DNS IP across both VPN protocols
- Survives interface changes and restarts
- Works identically across all cloud providers
- Trade-off: Requires the `route_localnet=1` sysctl
Ubuntu's dnscrypt-proxy uses socket activation, which completely ignores the `listen_addresses` config setting. You must configure the socket, not the service:
```ini
# /etc/systemd/system/dnscrypt-proxy.socket.d/10-algo-override.conf
[Socket]
# Clear the defaults first (systemd treats inline comments as part of the value,
# so comments go on their own lines)
ListenStream=
ListenDatagram=
# Then set the VPN IP
ListenStream=172.x.x.x:53
ListenDatagram=172.x.x.x:53
```
Common mistakes:
- Trying to disable/mask the socket (breaks service dependency)
- Only setting ListenStream (need ListenDatagram for UDP)
- Forgetting to restart socket after config changes
Many "routing" issues are actually DNS issues. Start here:
```shell
ss -lnup | grep :53                       # Should show local_service_ip:53
systemctl status dnscrypt-proxy.socket    # Check for config warnings
sysctl net.ipv4.conf.all.route_localnet   # Must be 1
dig @172.x.x.x google.com                 # Test resolution
```
For comprehensive diagnostics, see docs/troubleshooting.md.
Ubuntu 22.04+ defaults to iptables-nft which reorders rules unpredictably. Algo forces iptables-legacy for consistent behavior. Switching backends can break DNS routing that previously worked.
Servers with both public and private IPs on the same interface need explicit output interface for NAT:
```
-o {{ ansible_default_ipv4['interface'] }}
```
Don't overengineer with SNAT - MASQUERADE with interface specification works fine.

OpenSSL 3.x dropped support for legacy algorithms. Add the `-legacy` flag conditionally:
```
{{ (openssl_version is version('3', '>=')) | ternary('-legacy', '') }}
```
WireGuard configs must bracket IPv6 addresses:
```
{% if ':' in IP %}[{{ IP }}]:{{ port }}{% else %}{{ IP }}:{{ port }}{% endif %}
```
Many templates use Ansible-specific filters. Test with tests/unit/test_template_rendering.py and mock Ansible filters when testing.
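The IPv6 bracketing rule is easy to unit-test outside a template. A Python equivalent of that Jinja2 logic (the function name is mine, not from the codebase):

```python
def format_endpoint(ip: str, port: int) -> str:
    """Bracket IPv6 hosts; pass IPv4 addresses through unchanged."""
    return f"[{ip}]:{port}" if ":" in ip else f"{ip}:{port}"

assert format_endpoint("203.0.113.5", 51820) == "203.0.113.5:51820"
assert format_endpoint("2001:db8::1", 51820) == "[2001:db8::1]:51820"
```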
Lessons learned - don't spend time on these unless absolutely necessary:
- Converting MASQUERADE to SNAT - MASQUERADE works fine for Algo's use case
- Fighting systemd socket activation - Configure it properly instead of disabling
- Debugging NAT before checking DNS - Most "routing" issues are DNS issues
- Complex IPsec policy matching - Keep NAT rules simple
- Testing on existing servers - Always test on fresh deployments
- Interface-specific route_localnet - WireGuard interface doesn't exist until service starts
- DNAT for loopback addresses - Packets to local IPs don't traverse PREROUTING
- Speculative features - Don't add "might be useful" functionality. Open an issue instead.
- New dependencies without justification - Vanilla Ansible/Python can do most things.
- Bundling unrelated fixes - One PR, one purpose. Separate issues get separate PRs.
- Assuming behavior - If converting `with_items` to `loop`, test that it still works. If adding a firewall rule, verify packets flow.
- Configuration options - Don't add flags unless users actively need them. Each option doubles testing surface.
- Undocumented workarounds - When working around broken upstream modules, file an issue and add a comment linking to it. Future maintainers need to know why workarounds exist.
When writing tests, verify your test actually detects the failure case (mutation testing approach):
- Write the test for the bug you're preventing
- Temporarily introduce the bug to verify the test fails
- Fix the bug and verify the test passes
- Document what specific issue the test prevents
```python
def test_regression_openssl_inline_comments():
    """Tests that we detect inline comments in Jinja2 expressions."""
    # This pattern SHOULD fail (has inline comments)
    problematic = "{{ ['DNS:' + id, # comment ] }}"
    assert not validate(problematic), "Should detect inline comments"

    # This pattern SHOULD pass (no inline comments)
    fixed = "{{ ['DNS:' + id] }}"
    assert validate(fixed), "Should pass without comments"
```
```shell
uv sync
uv run ansible-galaxy install -r requirements.yml
ansible-playbook main.yml -e "provider=local"

# Add/update users
ansible-playbook users.yml -e "server=SERVER_NAME"

# Update dependencies
uv lock && pytest tests/unit/ -q

# Debug deployment
ansible-playbook main.yml -vvv
```
- `configs/` - Generated client configurations
- `roles/*/tasks/` - Main task files
- `roles/*/templates/` - Jinja2 templates
- `library/` - Custom Ansible modules (add to `mock_modules` in `.ansible-lint`)
All `pause:` prompts in input.yml and provider roles skip when their variable is pre-defined via `-e` or environment variables. This enables fully headless deployment for CI, agents, and scripted workflows.
See docs/deploy-from-ansible.md for
full human-facing documentation.
These bypass the main prompts in input.yml:
| Variable | Type | Default | Purpose |
|---|---|---|---|
| `provider` | string | (prompt) | Provider alias (e.g., `digitalocean`, `ec2`, `local`) |
| `server_name` | string | `algo` | VPN server name |
| `ondemand_cellular` | bool | `false` | iOS/macOS Connect On Demand for cellular |
| `ondemand_wifi` | bool | `false` | iOS/macOS Connect On Demand for Wi-Fi |
| `ondemand_wifi_exclude` | string | (none) | Comma-separated trusted Wi-Fi networks |
| `store_pki` | bool | `false` | Retain PKI keys (needed to add users later) |
| `dns_adblocking` | bool | `false` | Enable DNS ad blocking |
| `ssh_tunneling` | bool | `false` | Per-user SSH tunnel accounts |
| Provider | `-e` variables | Env var fallbacks |
|---|---|---|
| `digitalocean` | `do_token`, `region` | `DO_API_TOKEN` |
| `ec2` | `aws_access_key`, `aws_secret_key`, `region` | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` (also reads `~/.aws/credentials`) |
| `lightsail` | `aws_access_key`, `aws_secret_key`, `region` | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` |
| `azure` | `azure_secret`, `azure_tenant`, `azure_client_id`, `azure_subscription_id`, `region` | `AZURE_SECRET`, `AZURE_TENANT`, `AZURE_CLIENT_ID`, `AZURE_SUBSCRIPTION_ID` |
| `gce` | `gce_credentials_file`, `region` | `GCE_CREDENTIALS_FILE_PATH` |
| `hetzner` | `hcloud_token`, `region` | `HCLOUD_TOKEN` |
| `vultr` | `vultr_config`, `region` | `VULTR_API_CONFIG` |
| `scaleway` | `scaleway_token`, `scaleway_org_id`, `region` | `SCW_TOKEN`, `SCW_DEFAULT_ORGANIZATION_ID` |
| `linode` | `linode_token`, `region` | `LINODE_API_TOKEN` |
| `cloudstack` | `cs_key`, `cs_secret`, `cs_url`, `region` | `CLOUDSTACK_KEY`, `CLOUDSTACK_SECRET`, `CLOUDSTACK_ENDPOINT` |
| `openstack` | `region` | `OS_AUTH_URL` (source your openrc.sh) |
| `local` | `server`, `endpoint`, `local_install_confirmed` | (none) |
```shell
# DigitalOcean — fully headless
ansible-playbook main.yml -e \
  "provider=digitalocean
  server_name=algo
  region=nyc3
  do_token=YOUR_TOKEN
  ondemand_cellular=false
  ondemand_wifi=false
  dns_adblocking=false
  ssh_tunneling=false
  store_pki=false"

# Local — for CI/testing
ansible-playbook main.yml -e \
  "provider=local
  server=localhost
  endpoint=10.0.0.1
  local_install_confirmed=true
  ondemand_cellular=false
  ondemand_wifi=false
  dns_adblocking=false
  ssh_tunneling=false"
```
To add or update users on an existing server:
```shell
ansible-playbook users.yml -e "server=YOUR_SERVER ca_password=YOUR_CA_PASS"
```
The `server` variable bypasses the server selection prompt. `ca_password` is only required when IPsec is enabled.
- Never expose secrets - No passwords/keys in commits
- CVE Response - Update immediately when security issues found
- Least Privilege - Minimal permissions, dropped capabilities
- Secure Defaults - Strong crypto (secp384r1), no logging, strict firewall
- Primary OS: Ubuntu 22.04/24.04 LTS
- Secondary: Debian 11/12
- Architectures: x86_64 and ARM64
- Testing tip: DigitalOcean droplets have both public and private IPs on eth0, making them good test cases for multi-IP NAT scenarios