Version: 5.8.1
Axon is a self-hosted RAG stack for crawling, scraping, ingesting, embedding, searching, and asking questions over indexed content. The production release is Docker Compose first: one Axon server container, Qdrant, Hugging Face TEI with Qwen/Qwen3-Embedding-0.6B, and Chrome for JS-heavy pages.
Supported production runtime:
- Docker Compose only.
- Qdrant only for vector storage.
- Hugging Face TEI only for embeddings.
- Qwen/Qwen3-Embedding-0.6B as the production embedding model.
- Gemini CLI is the default LLM synthesis path; OpenAI-compatible endpoints such as llama.cpp are supported when configured with AXON_LLM_BACKEND=openai-compat.
- Web search/research uses a self-hosted SearXNG instance (AXON_SEARXNG_URL) when configured, falling back to Tavily (TAVILY_API_KEY) otherwise.
- Local NVIDIA RTX 4070 target with NVIDIA Container Toolkit.
- CLI and MCP run all actions in-process; deploy the axon serve container only when you need HTTP API access.
- One shared config home: ~/.axon/.env, ~/.axon/config.toml, ~/.axon/jobs.db, ~/.axon/output, ~/.axon/logs, ~/.axon/artifacts, ~/.axon/screenshots, ~/.axon/qdrant, and ~/.axon/tei.
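The SearXNG-then-Tavily fallback described in the list above can be pictured with a minimal Python sketch. This is an illustration only, not Axon's actual code: the function name `pick_search_backend` and the returned labels are hypothetical, and only the two environment variable names come from this README.

```python
from typing import Mapping, Optional


def pick_search_backend(env: Mapping[str, str]) -> Optional[str]:
    """Return "searxng", "tavily", or None, based on which variables are set.

    Assumption: an empty or whitespace-only value counts as "not configured".
    """
    if env.get("AXON_SEARXNG_URL", "").strip():
        return "searxng"  # self-hosted instance wins when configured
    if env.get("TAVILY_API_KEY", "").strip():
        return "tavily"  # fallback path
    return None  # neither backend is available
```

Passing `os.environ` as `env` would apply the same rule to the current process environment.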
Not supported in the production path:
- systemd deployment of the Axon binary.
- Postgres, Redis, RabbitMQ, AMQP, or external worker services.
- OpenAI-compatible first-run LLM configuration. Configure AXON_LLM_BACKEND=openai-compat manually after setup when using llama.cpp or another OpenAI-compatible /v1/chat/completions endpoint.
- Neo4j or graph retrieval.
- Multiple competing .env or config.toml locations.
Prerequisites:
- Linux x86_64.
- Docker and Docker Compose.
- NVIDIA driver, nvidia-smi, and NVIDIA Container Toolkit.
- Gemini CLI installed and already authenticated, unless using a configured OpenAI-compatible endpoint for LLM synthesis.
- curl, sha256sum, and install.
One-line installer:
curl -fsSL https://raw.githubusercontent.com/jmagar/axon/main/install.sh | sh
The installer verifies the release checksum, installs the axon binary to ~/.local/bin, then delegates setup to axon setup.
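For readers who want to see what a release checksum check amounts to, here is a minimal Python sketch. It is not the logic in install.sh: it assumes the expected value arrives in the usual sha256sum output format (hex digest, whitespace, file name), and the helper names are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_checksum(data: bytes, checksum_line: str) -> bool:
    """Compare data against a "<hex>  <filename>" line (or a bare hex digest)."""
    fields = checksum_line.split()
    if not fields:
        return False  # empty checksum input never verifies
    # compare_digest avoids leaking the mismatch position through timing
    return hmac.compare_digest(sha256_hex(data), fields[0].lower())
```

A caller would read the downloaded artifact as bytes and refuse to install it when `verify_checksum` returns False.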
Useful installer controls:
AXON_INSTALL_DRY_RUN=1 ./install.sh
AXON_INSTALL_PREFIX=/opt/axon ./install.sh
AXON_VERSION=vX.Y.Z ./install.sh # pin a specific release; defaults to latest
AXON_INSTALL_SKIP_SETUP=1 ./install.sh
Prerequisites (Windows):
- Windows x86_64 (PowerShell 5.1+ or PowerShell Core).
- Docker Desktop with the WSL 2 backend (for the full stack).
- NVIDIA driver and NVIDIA Container Toolkit (for GPU acceleration).
- Gemini CLI installed and already authenticated, or a configured OpenAI-compatible endpoint for LLM synthesis.
One-line installer (PowerShell):
irm https://raw.githubusercontent.com/jmagar/axon/main/install.ps1 | iex
The installer verifies the release checksum, installs axon.exe to %USERPROFILE%\.local\bin, adds it to your user PATH, then delegates setup to axon setup.
Useful installer controls:
$env:AXON_INSTALL_DRY_RUN='1'; irm .../install.ps1 | iex
$env:AXON_INSTALL_PREFIX='C:\tools\axon'; irm .../install.ps1 | iex
$env:AXON_VERSION='vX.Y.Z'; irm .../install.ps1 | iex # pin a version
$env:AXON_INSTALL_SKIP_SETUP='1'; irm .../install.ps1 | iex
Claude Code plugin install:
claude plugin install <path-to-this-repo>
The plugin ships no binary. Install axon first via install.sh, then install the plugin. Its SessionStart hook runs scripts/plugin-setup.sh, which syncs CLAUDE_PLUGIN_OPTION_* settings into the process environment and delegates to axon setup plugin-hook. That subcommand is probe-only and never deploys: it checks /readyz and exits silently when the stack is up, or prints a one-line advisory to run /axon-deploy when it is down. The ConfigChange hook runs the same script, so updated plugin settings take effect immediately. Provisioning happens through the /axon-deploy slash command (or axon setup / axon compose up). No systemd unit is created.
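The probe-only behavior can be sketched in Python as follows. This is a hedged illustration rather than the code behind axon setup plugin-hook: the function names and the advisory wording are hypothetical, and it assumes /readyz is served at the Axon server address and returns HTTP 200 when the stack is up.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def stack_is_ready(base_url: str, timeout: float = 2.0) -> bool:
    """True when GET <base_url>/readyz answers with HTTP 200 (assumed contract)."""
    url = base_url.rstrip("/") + "/readyz"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        return False  # unreachable, refused, timed out, or malformed URL


def plugin_hook_message(probe: Callable[[], bool]) -> Optional[str]:
    """Probe only: None when the stack is up, a one-line advisory when it is down."""
    if probe():
        return None  # stay silent; nothing is deployed either way
    return "Axon stack is not ready; run /axon-deploy"  # hypothetical wording
```

A hook built this way would call `plugin_hook_message(lambda: stack_is_ready("http://127.0.0.1:8001"))` and print the result only when it is not None.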
axon setup is the convenience bootstrap path. It is idempotent and safe to rerun. It:
- Creates or refreshes ~/.axon.
- Creates or preserves ~/.axon/config.toml.
- Creates or preserves ~/.axon/.env, filling only missing runtime values and preserving secrets.
- Writes Docker Compose assets under ~/.axon/compose.
- Checks Docker, Docker Compose, nvidia-smi, Gemini CLI auth, and OAuth config when requested.
- Pulls and starts the Compose stack.
- Waits for Qdrant, TEI, Chrome, and Axon server health.
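The "fill only missing values, preserve secrets" step is what makes a rerun safe. A minimal Python sketch of that idea, assuming a plain KEY=value file format and a hypothetical helper name (this is not Axon's implementation):

```python
from typing import Dict


def fill_missing_env(existing_text: str, defaults: Dict[str, str]) -> str:
    """Append KEY=value lines for absent keys; existing lines are never modified."""
    present = set()
    for line in existing_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue  # skip blanks, comments, and malformed lines
        present.add(stripped.split("=", 1)[0].strip())
    additions = [f"{key}={value}" for key, value in defaults.items() if key not in present]
    if not additions:
        return existing_text  # nothing missing: a rerun is a no-op
    separator = "" if existing_text == "" or existing_text.endswith("\n") else "\n"
    return existing_text + separator + "\n".join(additions) + "\n"
```

Because keys that already exist are skipped, applying the function twice with the same defaults yields the same text as applying it once.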
Focused commands:
axon setup # init + compose up + preflight
axon setup init # create ~/.axon, config.toml, .env, and compose assets
axon preflight # check prerequisites, auth config, and service readiness
axon compose up # pull/start services, then follow logs until Ctrl-C
axon compose down # stop services
axon compose restart # restart services
axon compose rebuild # rebuild the Axon image and start services
axon smoke # TEI prewarm + crawl/ask proof
axon setup plugin-hook # probe-only SessionStart path (never deploys; advises /axon-deploy when down)
axon setup targets # list SSH aliases discovered from ~/.ssh/config (informational)
For local bearer-token operation, no manual env values are required. axon setup init defaults to loopback MCP HTTP, writes AXON_MCP_AUTH_MODE=bearer, and generates AXON_MCP_HTTP_TOKEN. Optional features need credentials: Gemini auth under ~/.gemini for default LLM features, or AXON_LLM_BACKEND=openai-compat plus AXON_OPENAI_BASE_URL and AXON_OPENAI_MODEL for OpenAI-compatible synthesis; TAVILY_API_KEY for search/research; GITHUB_TOKEN for higher-rate GitHub ingest; and REDDIT_CLIENT_ID plus REDDIT_CLIENT_SECRET for Reddit ingest. OAuth mode also requires AXON_MCP_PUBLIC_URL, AXON_MCP_GOOGLE_CLIENT_ID, AXON_MCP_GOOGLE_CLIENT_SECRET, and AXON_MCP_AUTH_ADMIN_EMAIL.
The warm-path setup goal is under 2 minutes once images and model weights are cached. Cold starts that pull images and model weights can take longer; target-hardware timing still needs to be measured against published release artifacts.
The production compose file starts:
| Service | Purpose | Host bind |
|---|---|---|
| axon | HTTP server, web panel, MCP HTTP, action API, in-process workers | 127.0.0.1:8001 |
| axon-qdrant | vector storage | 127.0.0.1:53333, 127.0.0.1:53334 |
| axon-tei | Qwen3 embeddings through TEI | 127.0.0.1:52000 |
| axon-chrome | browser rendering and CDP proxy | 127.0.0.1:6000, 127.0.0.1:9222, 127.0.0.1:9223 |
Start manually:
docker compose --env-file ~/.axon/.env -f ~/.axon/compose/docker-compose.yaml up -d
Check:
docker compose --env-file ~/.axon/.env -f ~/.axon/compose/docker-compose.yaml ps
axon preflight
axon doctor
Stop:
docker compose --env-file ~/.axon/.env -f ~/.axon/compose/docker-compose.yaml down
Development stack:
cargo build --bin axon
docker compose --env-file .env.example -f docker-compose.yaml build axon
docker compose --env-file ~/.axon/.env -f docker-compose.yaml up -d axon
The development stack uses the production infrastructure definitions but runs axon from the bind-mounted local debug binary under target/debug, inside the newer axon:dev-runtime image.
Axon has two config layers:
| File | Purpose |
|---|---|
| ~/.axon/.env | URLs, secrets, auth, Docker interpolation, runtime bootstrap values |
| ~/.axon/config.toml | non-secret tuning and behavior |
Precedence:
CLI flags > environment variables > ~/.axon/config.toml > built-in defaults
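The precedence chain above is a first-match lookup across four layers. A small Python sketch of that rule, assuming each layer has already been normalized to the same setting names (the function name and the `collection` key in the usage note are hypothetical, not Axon's internal API):

```python
from typing import Any, Mapping, Optional


def resolve_setting(
    name: str,
    cli_flags: Mapping[str, Any],
    env_vars: Mapping[str, Any],
    config_toml: Mapping[str, Any],
    defaults: Mapping[str, Any],
) -> Optional[Any]:
    """Return the value from the highest-priority layer that defines `name`."""
    # Order mirrors: CLI flags > environment > config.toml > built-in defaults
    for layer in (cli_flags, env_vars, config_toml, defaults):
        value = layer.get(name)
        if value is not None:
            return value
    return None  # not defined anywhere
```

For example, with `collection` set to "docs" in config.toml and "scratch" in the environment, the environment value wins unless a CLI flag overrides both.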
Keep in .env:
- URLs: QDRANT_URL, TEI_URL, AXON_CHROME_REMOTE_URL, AXON_SEARXNG_URL.
- Secrets: AXON_MCP_HTTP_TOKEN, TAVILY_API_KEY, GITHUB_TOKEN, GITLAB_TOKEN, GITEA_TOKEN, Reddit credentials, OAuth credentials, HF_TOKEN.
- Docker/runtime bootstrap: AXON_HOME, AXON_DATA_DIR, AXON_IMAGE, AXON_MCP_HTTP_PUBLISH, TEI_HTTP_PORT, GPU device values.
- LLM runtime pointers when needed: AXON_HEADLESS_GEMINI_CMD, AXON_HEADLESS_GEMINI_HOME, AXON_LLM_BACKEND, AXON_OPENAI_BASE_URL, AXON_OPENAI_MODEL, and optional AXON_OPENAI_API_KEY.
Put in config.toml:
- collection/search/ask tuning.
- worker and job limits.
- TEI client tuning.
- Qdrant batch sizing.
- logging behavior that is not a process-launch concern.
- UI/output behavior.
The .env file under the config home (~/.axon/.env) is never loaded through a symlink.
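One way to picture the symlink rule is a loader that refuses to read the file when the path itself is a symbolic link. The Python sketch below is an assumption about the shape of such a check, not Axon's code: the function name is hypothetical, and it only inspects the final path component, not parent directories.

```python
from pathlib import Path


def read_env_file_no_symlink(path: Path) -> str:
    """Read an env file, refusing when the path is a symbolic link."""
    if path.is_symlink():
        # A symlink could redirect the loader to a file outside the config home
        raise PermissionError(f"refusing to load {path}: it is a symlink")
    return path.read_text(encoding="utf-8")
```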
After setup:
axon doctor
axon crawl https://example.com --wait true
axon ask "What did we crawl?"
CLI and MCP commands always run in-process against Qdrant and TEI. axon serve exposes the same operations over HTTP (/v1/*, MCP-over-HTTP) for clients that want API access to a deployed instance.
- Hybrid search. New Qdrant collections are created with named dense + bm42 sparse vectors and queried with Reciprocal Rank Fusion (RRF). Legacy unnamed collections fall back to dense-only cosine search. Tune via the [search] section in config.toml; run axon migrate --from <old> --to <new> to copy a legacy unnamed collection into a new named-mode one, then point AXON_COLLECTION at it.
- Vertical extractors. scrape (and the scrape MCP/REST action) auto-routes known URLs to structured per-site extractors (GitHub, PyPI, npm, crates.io, Reddit, YouTube, and more) instead of generic HTML→markdown. Disable with AXON_ENABLE_VERTICALS=false or the [verticals] config section.
- Web panel. axon serve hosts an Aurora-styled control panel at the bind address (default http://127.0.0.1:8001) with a first-run setup flow, config/stack inspection, and a command runner, alongside the /v1/* REST surface and OpenAPI docs at /docs.
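For intuition about the Reciprocal Rank Fusion step in hybrid search, here is a minimal Python sketch of the standard formula, score(d) = sum over rankings of 1 / (k + rank). Per this README the fusion happens inside the Qdrant query, so this is only an illustration; the constant k = 60 is the commonly cited default and an assumption here, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[Hashable]], k: int = 60) -> List[Hashable]:
    """Fuse ranked ID lists; each appearance contributes 1 / (k + rank), rank from 1."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID text for determinism
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], str(doc_id)))
```

Given a dense ranking `["a", "b", "c"]` and a sparse ranking `["b", "d", "a"]`, document `b` comes out on top because it ranks well in both lists, which is the behavior hybrid search relies on.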
Core:
- scrape <url>...
- crawl <url>...
- map <url>
- extract <url>...
- embed [input]
- query <text>
- retrieve <url>
- ask <question>
- summarize <url>...
- evaluate <question>
- diff <url-a> <url-b>: show what changed between two URLs (content/metadata/links)
- brand <url>: extract brand identity (colors, fonts, logos, favicon)
Discovery and ingest:
- search <query>: web search via SearXNG (or Tavily); auto-queues crawl jobs
- research <query>: multi-source web research with LLM synthesis
- suggest [focus]
- endpoints <url>: discover API endpoints from page HTML and JavaScript bundles
- ingest <target>: GitHub, GitLab, Gitea/Forgejo, generic Git, Reddit, or YouTube
- sessions: index AI session exports (Claude, Codex, Gemini)
Operations:
- setup
- doctor
- debug
- serve
- mcp
- status
- monitor jobs: stream job lifecycle events (start/complete/fail/cancel)
- sources
- domains
- stats
- watch
- refresh [filter]: re-crawl/re-ingest previously indexed origins (full docs refresh)
- dedupe
- migrate
- screenshot
- train: collect human preference votes on retrieved RAG candidates
- config
- completions
Use command-specific help:
axon --help
axon setup --help
axon crawl --help
axon mcp --help
Graph flags are not part of the production CLI, MCP, or /v1/ask request contract.
Axon exposes one MCP tool named axon; actions are routed by action and optional subaction.
Examples:
{ "action": "doctor" }
{ "action": "scrape", "url": "https://example.com" }
{ "action": "ask", "query": "How does setup work?" }
{ "action": "summarize", "url": "https://example.com" }
{ "action": "crawl", "subaction": "status", "job_id": "<uuid>" }
HTTP auth modes:
- Static bearer token with AXON_MCP_HTTP_TOKEN.
- OAuth/lab-auth with AXON_MCP_AUTH_MODE=oauth.
/mcp and the /v1/* REST routes (e.g. /v1/ask, /v1/scrape, /v1/query) share the same auth policy. Tokenless HTTP is only for loopback development binds.
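As a rough illustration of bearer-token access to the REST surface, the Python sketch below builds (but does not send) a request to /v1/ask. Several details are assumptions, not documented contract: the `Authorization: Bearer <token>` header scheme, the POST method, and the `{"query": ...}` body, which is modeled on the MCP `ask` example above. Check the OpenAPI docs at /docs for the real request shape.

```python
import json
import urllib.request


def build_ask_request(base_url: str, token: str, query: str) -> urllib.request.Request:
    """Build a POST to <base_url>/v1/ask carrying a bearer token (assumed scheme)."""
    body = json.dumps({"query": query}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/ask",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

The token would be the AXON_MCP_HTTP_TOKEN value from ~/.axon/.env, and the resulting object can be passed to `urllib.request.urlopen` against a running instance.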
Build:
cargo build --bin axon
cargo build --release --bin axon
Test and lint:
cargo fmt --all -- --check
cargo check --workspace --all-targets --features test-helpers
cargo clippy --workspace --all-targets --features test-helpers -- -D warnings
cargo test --workspace --features test-helpers
Common focused checks:
cargo test --test cli_help_contract -- --nocapture
cargo test parse_setup -- --nocapture
cargo test load_dotenv -- --nocapture
python3 scripts/generate_mcp_schema_doc.py --check
docker compose --env-file .env.example -f docker-compose.prod.yaml config --quiet
Module layout policy:
- Do not add mod.rs.
- Rust module roots live in foo.rs.
- Submodules live in foo/bar.rs.
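The module layout policy is easy to check mechanically. The Python sketch below is a hypothetical helper, not a script shipped in this repo: it lists every mod.rs under a source root, and it assumes the Cargo build directory is named target and should be skipped.

```python
from pathlib import Path
from typing import List


def find_mod_rs(root: Path) -> List[Path]:
    """Return all mod.rs files under root, ignoring anything inside target/."""
    return sorted(
        path
        for path in root.rglob("mod.rs")
        if "target" not in path.relative_to(root).parts
    )
```

An empty result means the tree follows the foo.rs plus foo/bar.rs convention; a non-empty result lists the files to rename.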
Required before production release:
- CI fmt/check/clippy/test.
- MCP schema doc sync.
- CLI help contract tests.
- Docker Compose config validation.
- Docker image build and GHCR publish workflow.
- Compose smoke workflow.
- Self-hosted RTX 4070 smoke for Qwen3 TEI cold/warm setup timing.
cargo machete should be run when installed; it is not vendored in this repo.
Fast checks:
axon preflight
axon doctor
docker compose --env-file ~/.axon/.env -f ~/.axon/compose/docker-compose.yaml ps
docker compose --env-file ~/.axon/.env -f ~/.axon/compose/docker-compose.yaml logs --tail=100 axon axon-tei axon-qdrant axon-chrome
Important paths:
- ~/.axon/.env
- ~/.axon/config.toml
- ~/.axon/jobs.db
- ~/.axon/logs
- ~/.axon/artifacts
- ~/.axon/output
- ~/.axon/tei
- ~/.axon/qdrant
Common failures:
- Docker missing: install Docker and Docker Compose, then rerun axon setup.
- GPU unavailable: verify nvidia-smi and NVIDIA Container Toolkit.
- Gemini unauthenticated: run Gemini CLI login outside Axon, then rerun setup, or configure AXON_LLM_BACKEND=openai-compat for an OpenAI-compatible endpoint.
- TEI slow on first boot: model download/cache warmup is the cold path.
- Auth failures: make sure Claude/plugin config uses the same token as AXON_MCP_HTTP_TOKEN in ~/.axon/.env.
- install.sh: verified one-line installer bootstrapper.
- docker-compose.prod.yaml: production Compose stack.
- docker-compose.yaml: local development stack.
- .env.example: production environment template.
- config.example.toml: non-secret tuning template.
- plugins/axon/.claude-plugin/plugin.json: Claude plugin manifest.
- scripts/plugin-setup.sh: plugin hook delegating to shared setup.
- docs/reference/mcp/tool-schema.md: generated MCP wire contract.
- docs/: full documentation tree (guides, reference/commands/, architecture, and operations).
MIT