An LLM-powered knowledge engine that builds structured, searchable wikis from codebases. Inspired by Karpathy's LLM Wiki pattern.
Point it at a Python project → it produces a rich, interlinked wiki with architectural context, not just API docs.
```bash
kb build ./my-project --output docs/wiki/
```

**Before:** "GroupService has 13 methods"

**After:** "GroupService is the business logic layer for chat groups. It handles creation, membership, agent assignment, and DMs. Authorization is owner-based — only the group owner can add/remove members. Uses batch queries to avoid N+1 problems when populating member details."
1. **SCAN**: Python's `ast` module extracts classes, functions, imports, and docstrings
2. **GRAPH**: Import analysis maps who depends on whom across all files
3. **COMPILE**: The LLM writes rich prose from AST data + raw code (or AST-only fallback)
4. **LINK**: The import graph creates backlinks between articles
5. **INDEX**: The concept graph tracks shared entities across articles
6. **EXPORT**: Markdown files with frontmatter → `docs/wiki/` or any directory
The LLM is used only in step 3 — everything else is pure Python. Without an LLM, you still get a functional wiki (just method signatures instead of prose).
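The scan step needs nothing beyond the standard library. A minimal sketch of that kind of AST walk (the actual extraction in `code_compiler.py` is richer; the sample source below is purely illustrative):

```python
import ast

# Toy module source to scan (hypothetical example code)
SOURCE = '''
class GroupService:
    """Business logic for chat groups."""
    def add_member(self, group_id: str, user_id: str) -> None: ...

def make_service() -> "GroupService":
    return GroupService()
'''

tree = ast.parse(SOURCE)

# Classes with their docstrings, and all function names
classes = {
    node.name: ast.get_docstring(node)
    for node in ast.walk(tree)
    if isinstance(node, ast.ClassDef)
}
functions = [
    node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)
]
```

Everything the compiler later turns into prose — names, signatures, docstrings, imports — is available from this tree without executing the scanned code.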
```bash
uv pip install -e .

# With LLM compilation (recommended)
uv pip install -e ".[anthropic]"

# With search
uv pip install -e ".[search]"

# Everything
uv pip install -e ".[all]"
```

```bash
# Build a wiki from a Python project
kb build ./src/myproject --scope myproject --output docs/wiki/

# Search
kb search "authentication" --scope myproject

# Show a specific article
kb show group_service --scope myproject

# Check wiki health
kb lint --scope myproject

# Stats
kb stats --scope myproject

# Ingest a URL or file
kb ingest https://docs.example.com/guide
kb ingest ./ARCHITECTURE.md
```

```python
from knowledge_base import KnowledgeEngine
from knowledge_base.compiler import AnthropicBackend

# With LLM compilation
engine = KnowledgeEngine(
    scope="myproject",
    backend=AnthropicBackend(model="claude-haiku-4-5-20251001"),
)

# Build from codebase
articles = await engine.build_from_code("./src/myproject")

# Search
results = await engine.search("how does auth work")

# Get context for an AI agent's system prompt
context = await engine.search_context("group membership", limit=3, max_chars=4000)

# Health check
issues = await engine.lint()
```

Implement the `CompilerBackend` protocol:
```python
from knowledge_base.compiler import CompilerBackend

class MyBackend:
    async def complete(self, prompt: str, system_prompt: str = "") -> str:
        # Call your LLM here
        return await my_llm.generate(prompt, system=system_prompt)

engine = KnowledgeEngine(scope="myproject", backend=MyBackend())
```

Each article is a markdown file with JSON frontmatter:
```markdown
---
{
  "title": "GroupService — Group and Channel Business Logic",
  "summary": "Stateless service for group CRUD, membership, agent assignment, and DMs...",
  "concepts": ["GroupService", "workspace scoping", "Beanie ODM", "event-driven"],
  "categories": ["chat domain", "service layer", "CRUD"],
  "backlinks": ["schemas", "group-model", "errors", "message-service"],
  "word_count": 1332,
  "version": 1
}
---

## Purpose
The group_service module encapsulates all group-related business logic...

## Key Classes and Methods
### GroupService
...

## Authorization and Security
...

## Dependencies and Integration
...
```

```
knowledge_base/
  __init__.py       KnowledgeEngine — main API
  models.py         RawDoc, WikiArticle, Concept, CodeModule, LintIssue
  store.py          File-based persistence (markdown + JSON)
  compiler.py       CompilerBackend protocol + AnthropicBackend
  code_compiler.py  AST parser, import graph, code-specific prompts
  indexer.py        Concept graph, backlinks, categories
  search.py         BM25 keyword search
  linter.py         LLM-powered health checks
  cli.py            Click CLI (kb command)
```
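The BM25 keyword search mentioned for `search.py` fits in a few lines of pure Python. This toy scorer illustrates the formula only — it is not the package's actual implementation, and the sample documents are made up:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against the query terms (Okapi BM25)."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: how many docs contain each term
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Pre-tokenized toy corpus: article word lists
docs = [["auth", "token", "login"], ["group", "member", "auth"], ["ui", "theme"]]
scores = bm25_scores(["auth", "login"], docs)
```

Documents matching more query terms rank higher, and rarer terms (here `login`) contribute more than common ones.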
| Layer | What | Where |
|---|---|---|
| Raw Sources | Original code files, URLs, text | `~/.knowledge-base/{scope}/raw/` |
| Compiled Wiki | LLM-written articles with frontmatter | `~/.knowledge-base/{scope}/wiki/` |
| Index | Concept graph, categories, metadata | `~/.knowledge-base/{scope}/index.json` |
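Because the wiki layer is plain markdown with a JSON frontmatter block, articles are easy to read back outside the engine. A simplified parser, assuming the `---`-delimited layout shown earlier (the real `store.py` may differ):

```python
import json

def parse_article(text: str):
    """Split a wiki article into (frontmatter dict, markdown body).

    Simplified: assumes the body contains no bare '---' lines.
    """
    _, raw_meta, body = text.split("---", 2)
    return json.loads(raw_meta), body.strip()

# Hypothetical article file contents
article = '''---
{"title": "GroupService", "word_count": 1332}
---
## Purpose
Business logic for chat groups.'''

meta, body = parse_article(article)
```

The JSON frontmatter keeps metadata machine-readable while the body stays renderable by any markdown viewer.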
Articles are connected through:
- Import backlinks — if module A imports module B, their articles link
- Shared concepts — "Beanie ODM" appears in 17 articles, connecting them
- Categories — "service layer", "data model", "API router" group related articles
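The shared-concept linking amounts to inverting each article's concept list into a concept-to-articles index. A simplified sketch of the idea (article slugs and concepts below are illustrative, not the indexer's real data structures):

```python
from collections import defaultdict

# Per-article concept lists, as stored in frontmatter
articles = {
    "group-service": ["GroupService", "Beanie ODM"],
    "group-model": ["Group", "Beanie ODM"],
    "theme": ["Ripple UI"],
}

# Invert: concept -> set of articles mentioning it
concept_index = defaultdict(set)
for slug, concepts in articles.items():
    for concept in concepts:
        concept_index[concept].add(slug)

# Articles sharing any concept with at least one other article are linked
linked = {
    slug
    for slugs in concept_index.values()
    if len(slugs) > 1
    for slug in slugs
}
```

Here "Beanie ODM" appears in two articles, so those two become connected while the unrelated "theme" article stays standalone.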
Add to your project's CLAUDE.md:
```markdown
## Knowledge Base
A codebase wiki lives at `docs/wiki/`. Read relevant articles before modifying modules.
```

Auto-rebuild on commits via `.claude/hooks/`:
```bash
#!/bin/bash
# .claude/hooks/kb-rebuild.sh
CHANGED=$(git diff --name-only HEAD~1 HEAD | grep "^src/" | head -1)
[ -z "$CHANGED" ] && exit 0
kb build ./src --scope myproject --output docs/wiki/
```

The knowledge-base package integrates with PocketPaw's enterprise cloud module:
- Agent context injection — KB articles injected into agent system prompt based on user query
- Wiki pocket template — interactive KB browser rendered via Ripple UI
- Agent-scoped KB — each agent can have its own knowledge base
- Workspace-scoped KB — shared knowledge across a team
MIT