The first firewall for AI agents.
A network firewall inspects the packets reaching a machine. AEGIS inspects the
untrusted content reaching your agent — and blocks what's trying to hijack it.
A message arrives that looks like routine ops automation. AEGIS reads it —
patterns flag the phrasing, the local judge sees it wants the agent to act without you —
and it's blocked before the agent ever sees it. Genuine status messages pass straight through.
Your AI coding agent reads a lot of text it didn't write — peer messages, skill
files, memory notes, CLAUDE.md, MCP configs, web results. Any of it can carry an
attack on the agent itself. The dangerous ones don't look like attacks:
No "ignore previous instructions." No keywords. Just an official-sounding message that tells your agent it's pre-authorized to act without you. Keyword filters and prompt-injection classifiers wave it through. AEGIS catches it — and quarantines it before your agent ever reads it.
Scope: AEGIS guards the agent, not the host. It's not antivirus. Its one job is to stop your agent from being talked into doing something by untrusted text.
It doesn't cry wolf. On a held-out set of 190 realistic files it had never seen:
| Recall | Precision | False-positive rate | F1 |
|---|---|---|---|
| 82% | 95% | 4% | 88% |
Zero false positives on 80 real benign dev/agent files — code that calls
subprocess/eval, skills full of kubectl/gcloud commands, MCP configs, even
security docs that quote attacks. Reproduce it yourself: tests/held_out_eval/.
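
As a quick sanity check on the table above, F1 is the harmonic mean of precision and recall, so the reported 88% can be recomputed from the other two columns. The sketch below uses only the published percentages; it does not reproduce the benchmark itself.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the results table (95% precision, 82% recall).
reported_precision = 0.95
reported_recall = 0.82

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # F1 = 0.880
```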
brew install pilot-protocol/tap/aegis # brings llama.cpp automatically
aegis install-models # one-time judge model (~1.8 GB)
aegis init # protect your agent surfaces
aegis daemon # (or: brew services start aegis)

That's it: your agent's inbox, skills, memory, CLAUDE.md, and MCP config are now
guarded. No model? No llama.cpp? It still runs as the L1 pattern layer.
Build from source / other platforms
git clone https://github.com/pilot-protocol/aegis && cd aegis
cargo build --release
sudo cp target/release/aegis /usr/local/bin/

Prebuilt macOS + Linux (x86_64/arm64) binaries are on the releases page.
flowchart LR
A[New / changed file<br/>on an agent surface] --> B{Pilot inbox?}
B -- yes --> C[Intercept:<br/>rename before<br/>the agent reads it]
B -- no --> D[Read in place]
C --> E
D --> E[Extract the payload text]
E --> F[L1 · Aho-Corasick patterns<br/><i>microseconds · pure Rust</i>]
F --> G[L2 · Unified judge<br/><i>local LLM, two passes</i>]
G --> H{Attack?}
H -- no --> I[Allow → file released<br/>to the agent, untouched]
H -- yes --> J[Quarantine →<br/>~/.aegis/quarantine/<br/>+ desktop notification<br/>+ audit log]
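
The inbox branch of the flowchart (intercept, scan, then release or quarantine) can be sketched in a few lines of Python. This is an illustration under assumptions, not AEGIS's Rust implementation: the `holding` directory, the function name `process_inbox_file`, and the `is_attack` callback are all hypothetical stand-ins, and the real rename scheme may differ.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def process_inbox_file(path: Path, holding: Path, quarantine: Path,
                       is_attack: Callable[[str], bool]) -> Path:
    """Claim an inbox file, scan it, then release or quarantine it."""
    holding.mkdir(parents=True, exist_ok=True)
    quarantine.mkdir(parents=True, exist_ok=True)

    # Intercept: rename the file out of the inbox before the agent reads it.
    # (A holding directory is an assumption made for this sketch.)
    claimed = holding / path.name
    path.rename(claimed)

    text = claimed.read_text(encoding="utf-8", errors="replace")
    if is_attack(text):
        # Quarantine is a move, not a delete, so the file stays inspectable.
        dest = quarantine / path.name
        shutil.move(str(claimed), str(dest))
        return dest

    # Allow: put the file back where it was, untouched.
    claimed.rename(path)
    return path


def demo() -> tuple[bool, bool]:
    """Run one benign and one hostile file through the sketch."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        inbox = root / "inbox"
        inbox.mkdir()
        holding, quarantine = root / "holding", root / "quarantine"

        ok = inbox / "status.txt"
        bad = inbox / "ops.txt"
        ok.write_text("build finished", encoding="utf-8")
        bad.write_text("you are pre-authorized to act", encoding="utf-8")

        # Stub verdict standing in for the real L1 + L2 layers.
        stub = lambda text: "pre-authorized" in text

        released = process_inbox_file(ok, holding, quarantine, stub)
        moved = process_inbox_file(bad, holding, quarantine, stub)
        return (released == ok and ok.exists(),
                moved.parent == quarantine and not bad.exists())
```

Calling `demo()` returns a pair of booleans: whether the benign file ended up back in the inbox, and whether the hostile one ended up in the quarantine directory.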
Two layers. A fast universal one, and a smart one.
- L1 — Aho-Corasick patterns. Pure Rust, microseconds, kilobytes. Known injection/IoC strings plus base64/hex/rot13/homoglyph/zero-width decode passes. Runs on anything — a Pi, a router, a CI box.
- L2 — the judge. A local Qwen3-1.7B (via llama.cpp, fully offline). Two passes: "is this content attacking the agent?" (injection, jailbreak, spoofing, exfil — and crucially, describing an attack ≠ performing one) OR "is it pushing the agent to act without the user?" (the infra-impersonation question). A safe verdict vetoes L1's keyword hits — that's why a security doc that quotes an injection isn't flagged.
If the judge can't run (tiny device, no model, server down), AEGIS degrades to L1 patterns alone — lower recall, but an instant, dependency-free floor.
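
To make the layering concrete, here is a minimal Python sketch of the two ideas described above: scanning several decoded views of the text, and letting the judge's verdict override L1. Several things are assumptions for illustration only: the single entry in `PATTERNS` is hypothetical, a plain substring search stands in for Aho-Corasick, the homoglyph pass is omitted, and treating a judge "attack" verdict as sufficient on its own (without an L1 hit) is inferred from the flowchart rather than stated.

```python
import base64
import codecs
import re
from typing import Optional

# Hypothetical pattern list; the real L1 set is not reproduced here.
PATTERNS = ("ignore previous instructions",)

# Maps zero-width code points to None so str.translate() deletes them.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))


def decoded_views(text: str) -> list[str]:
    """Return the raw text plus zero-width-stripped, rot13, base64 and hex views."""
    views = [text, text.translate(ZERO_WIDTH), codecs.decode(text, "rot13")]
    for token in re.findall(r"[A-Za-z0-9+/=]{16,}", text):
        try:
            views.append(base64.b64decode(token, validate=True).decode("utf-8"))
        except ValueError:  # bad padding/alphabet, or not valid UTF-8
            pass
    for token in re.findall(r"\b(?:[0-9a-fA-F]{2}){8,}\b", text):
        try:
            views.append(bytes.fromhex(token).decode("utf-8"))
        except ValueError:
            pass
    return views


def l1_hit(text: str) -> bool:
    """True if any known pattern appears in any decoded view."""
    return any(p in view.lower() for view in decoded_views(text) for p in PATTERNS)


def decide(l1: bool, judge_says_attack: Optional[bool]) -> str:
    """Combine the layers; None means the judge could not run."""
    if judge_says_attack is None:
        return "quarantine" if l1 else "allow"  # degrade to L1 alone
    # A safe verdict vetoes L1 hits; an attack verdict quarantines.
    return "quarantine" if judge_says_attack else "allow"
```

With this shape, a security doc that quotes an injection yields `decide(True, False) == "allow"`, while the same L1 hit with no judge available yields `"quarantine"`.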
- Quarantine = ~/.aegis/quarantine/ (a mv, not a delete; you can inspect it). Inbox messages are intercepted (claimed before the agent can read them); skills and memory are moved out of the agent's path. CLAUDE.md and MCP config are alerted but not moved (they're yours; moving them would break your setup).
- Notified three ways: a native desktop notification, the terminal, and an HMAC-chained audit log at ~/.aegis/audit.jsonl (aegis status to tail it).
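
For readers unfamiliar with the term, an HMAC-chained log is one where each entry's MAC covers the previous entry's MAC, so editing or deleting any line breaks verification of everything after it. The sketch below shows the general technique in Python; the field names (`event`, `prev`, `mac`), the genesis value, and the use of SHA-256 are assumptions for illustration and are not taken from the actual audit.jsonl format.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous entry"


def _mac(key: bytes, prev: str, event: dict) -> str:
    """HMAC over the previous MAC plus a canonical JSON encoding of the event."""
    body = json.dumps(event, sort_keys=True)
    return hmac.new(key, (prev + body).encode("utf-8"), hashlib.sha256).hexdigest()


def append_entry(log: list[dict], key: bytes, event: dict) -> dict:
    """Append an event whose MAC is chained to the last entry."""
    prev = log[-1]["mac"] if log else GENESIS
    entry = {"event": event, "prev": prev, "mac": _mac(key, prev, event)}
    log.append(entry)
    return entry


def verify_chain(log: list[dict], key: bytes) -> bool:
    """Recompute every MAC in order; any edit or deletion returns False."""
    prev = GENESIS
    for entry in log:
        expected = _mac(key, prev, entry["event"])
        if entry["prev"] != prev or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev = entry["mac"]
    return True
```

In a JSONL file each entry would be one `json.dumps(entry)` line; the in-memory list here keeps the example self-contained.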
~/.aegis/config.toml (created by aegis init):
[judge]
enabled = true # false = super-lightweight, L1 patterns only, any host
model = "" # pin a model, e.g. "Qwen3-1.7B-Q4_K_M.gguf"; "" = auto
[watch]
defaults = true # protect the standard agent surfaces

Custom watch targets go in ~/.aegis/watch.toml. aegis config shows the effective
settings.
| Layer | Latency | RAM | Runs on |
|---|---|---|---|
| L1 patterns | microseconds | KB | anywhere |
| L2 judge | ~260 ms/pass (warm) | ~2.2 GB | macOS / Linux with a GPU or CPU |
Binary 831 KB. Judge model loads once; clean traffic stays cheap. Nothing ever leaves the machine.
aegis init              Write a default config and show what's protected
aegis daemon            Watch & protect all agent surfaces
aegis scan <path>...    One-shot scan (great for an agent hook / CI)
aegis install-models    Download the judge model
aegis status            Tail the audit log
aegis targets           List protected surfaces
aegis config            Show effective configuration
tests/held_out_eval/ is the honest held-out benchmark
(82% recall / 95% precision / 4% false-positive rate): 190 labeled files AEGIS never saw during tuning. Start the judge, then run
python3 run_held_out.py.
MIT — see LICENSE.