Skip to content

feat(memory): add persistent memory layer for cross-session learning#149

Open
castabar wants to merge 16 commits intomasterfrom
feature/memory-layer
Open

feat(memory): add persistent memory layer for cross-session learning#149
castabar wants to merge 16 commits intomasterfrom
feature/memory-layer

Conversation

@castabar
Copy link
Copy Markdown
Contributor

@castabar castabar commented May 6, 2026

Memory Layer Implementation

Adds a persistent memory system to the Kimchi harness. Memory survives across sessions and is injected directly into the system prompt by the memory extension itself.

What Changed

Core Components

  • memory extension (src/extensions/memory/): New extension with full tooling
  • memory tool: Supports 4 actions against two targets:
    • add / replace / remove / read
    • memory target: agent notes (env facts, conventions, tool quirks)
    • user target: user profile (name, role, preferences, timezone)
  • §-delimited storage: MEMORY.md / USER.md, no frontmatter
  • Char-bounded: 2200 chars (memory), 1375 chars (user)
  • Atomic writes: temp file + fs.rename() + proper-lockfile
  • Content scanner: blocks prompt injection, role hijacking, exfiltration patterns, invisible unicode

System Prompt Integration

The memory extension registers its own before_agent_start handler. After prompt-enrichment builds the system prompt, memory appends its blocks directly:

[orchestrator/single-model prompt built by prompt-enrichment]

══════════════════════════════════════════════
MEMORY (your personal notes) [4% — 88/2200 chars]
══════════════════════════════════════════════
Use `rg` instead of `grep` and `fd` instead of `find` when running bash search commands.

══════════════════════════════════════════════
USER PROFILE (who the user is) [3% — 45/1375 chars]
══════════════════════════════════════════════
Timezone: EEST (Eastern European Summer Time)

No placeholders in prompt templates. No cross-extension coordination required.

Frozen snapshot: memory loads once at session_start — mid-session writes don't alter the running prompt, preserving prefix cache stability.

Subagent isolation: memory is not injected into subagent prompts (stateless workers).

Files Added/Modified

src/extensions/memory/
├── types.ts                    # MemoryTarget, MemoryAction, MemoryStoreOptions
├── content-scanner.ts          # Threat pattern detection
├── file-io.ts                  # §-delimited atomic I/O
├── memory-store.ts             # Core store (add/replace/remove/read)
├── memory-tool.ts              # Tool registration + dispatch
└── index.ts                    # Extension entry point + before_agent_start injection

src/cli.ts                      # Register memoryExtension
package.json                    # +proper-lockfile

Session Log Examples

Writing preferences (two mutations in one session)

User tells the model to use rg instead of grep:

user:  let's only use `rg` instead of `grep` when running bash commands
model: [calls memory add target=memory "Use `rg` instead of `grep`..."]
       Got it. I'll use `rg` instead of `grep` for bash search commands going forward.

User follows up with fd:

user:  also `fd` instead of `find`
model: [calls memory replace target=memory old_text="Use `rg`..." new_content="Use `rg`...and `fd`..."]
       Done.

User mentions their timezone:

user:  i'm in the eest time zone
model: [calls memory add target=user "Timezone: EEST (Eastern European Summer Time)"]
       Noted.

Successful recall from system prompt (next session)

user:  do i prefer anything to grep
model: <think>Looking at my system prompt, I see this in the MEMORY section:
       "Use `rg` instead of `grep` and `fd` instead of `find`..."</think>
       Yes — `rg` (ripgrep) instead of `grep`, and `fd` instead of `find`.

user:  what timezone am i in
model: <think>Looking at the user profile that was loaded, it says
       "Timezone: EEST (Eastern European Summer Time)".</think>
       EEST — Eastern European Summer Time.

No tool calls. Both answers came directly from the system prompt.

Design Decisions

Self-contained injection: The memory extension owns its before_agent_start handler rather than relying on {{MEMORY}} placeholders in prompt templates. The runner chains handlers — each receives the system prompt built by the previous one — so memory safely appends after orchestration without any coupling.

Frozen snapshots: Mid-session writes update disk but not the running system prompt. Next session picks up the new state.

Char limits vs tokens: Tokens vary by model. Characters are deterministic and match Hermes's implementation (2200/1375).

Testing

Test Files: 11 passed
Tests:      23 passed
Biome:      clean
Type check: clean

Security

Content scanner blocks: prompt injection, role hijacking, secret exfiltration via curl/wget, secret file reads (.env, .netrc), invisible unicode.


Kimchi Summary

What changed

Adds a persistent memory extension that lets the agent save and recall durable facts and user preferences across sessions via a new memory tool. Also removes several obsolete documentation files.

Why

Agents currently lose context about the user, project conventions, and environment quirks between sessions. A persistent memory layer reduces repetitive discovery, prevents recurring mistakes, and personalises behaviour by injecting known facts back into the system prompt.

Key changes

  • src/cli.ts: registers memoryExtension.
  • src/extensions/memory/index.ts: bootstraps the extension, captures a frozen snapshot on session_start, and injects it into the system prompt during before_agent_start.
  • src/extensions/memory/memory-tool.ts: exposes a memory tool supporting add, replace, remove, and read actions against memory (agent notes) or user (user profile) targets.
  • src/extensions/memory/memory-store.ts: implements CRUD with per-store character limits, deduplication, substring-based matching for updates, and frozen snapshots to keep prefix caches stable.
  • src/extensions/memory/file-io.ts: atomic §-delimited file I/O using proper-lockfile for safe concurrent access.
  • src/extensions/memory/content-scanner.ts: rejects prompt injection, role hijacking, secret exfiltration (curl/wget with secrets, reading .env, SSH backdoors), and invisible unicode characters before persistence.
  • src/extensions/memory/types.ts: defines store types and ENTRY_DELIMITER.
  • package.json / pnpm-lock.yaml: adds proper-lockfile and @types/proper-lockfile dependencies.
  • Removed outdated docs: docs/permissions.md, docs/phase-guidelines-research.md, docs/sandboxing.md, docs/troubleshooting.md, docs/superpowers/specs/2026-04-15-mcp-tool-result-offload-design.md.

Impact

  • Memory is persisted as MEMORY.md and USER.md under the agent directory.
  • Default character limits: 2200 for memory, 1375 for user. Attempts to exceed these limits are rejected.
  • Only the snapshot captured at session start is injected into the prompt; mid-session writes are not reflected until the next session, preserving prefix-cache stability.
  • Security-sensitive patterns are blocked at write time with detailed error messages.

@kimchi-review
Copy link
Copy Markdown

kimchi-review Bot commented May 6, 2026

Kimchi Code Review

Property Value
Commit 430fb8a
Author @castabar
Files changed 0
Review status Completed
Comments 5 (3 info, 2 warning)
Duration 56s

Summary

📊 Review Score: 82/100 (overall code quality — 0 lowest, 100 highest)
⏱️ Estimated effort to review: 3/5 (1 = trivial, 5 = very complex)

🧪 Tests: yes — Comprehensive test coverage added including: content-scanner tests for prompt injection patterns (9 test cases), file-io tests for atomic writes and entry splitting, memory-store tests for CRUD operations and limits, memory-tool tests for action dispatch, and integration tests for the extension registration. Tests properly use temporary directories and clean up after execution.

📝 Found 5 issue(s). See inline comments for details.

What to expect

Kimchi will analyze the changes in this pull request and post:

  • A summary of the overall changes
  • Inline comments on specific lines with findings categorized by issue type

The review typically completes within a few minutes. This comment will be updated once the review is ready.

Interact with Kimchi
  • @kimchi review — re-trigger a full review on the latest commit
  • @kimchi summary — regenerate the PR summary
  • @kimchi ignore — skip this PR (no review will be posted)
  • Reply to any inline comment to ask follow-up questions or request clarification
Configuration

Reviews are configured by your organization admin.
Review instructions, excluded directories, and severity thresholds can be adjusted per repository in the Kimchi dashboard.


Powered by Kimchi — AI-powered code review by CAST AI

Copy link
Copy Markdown

@kimchi-review kimchi-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📊 Review Score: 82/100 (overall code quality — 0 lowest, 100 highest)
⏱️ Estimated effort to review: 3/5 (1 = trivial, 5 = very complex)

🧪 Tests: yes — Comprehensive test coverage added including: content-scanner tests for prompt injection patterns (9 test cases), file-io tests for atomic writes and entry splitting, memory-store tests for CRUD operations and limits, memory-tool tests for action dispatch, and integration tests for the extension registration. Tests properly use temporary directories and clean up after execution.

📝 Found 5 issue(s). See inline comments for details.

export async function writeMemoryFile(filePath: string, entries: string[]): Promise<void> {
mkdirSync(dirname(filePath), { recursive: true })
const content = entries.length > 0 ? entries.join(ENTRY_DELIMITER) : ""
const tmpPath = join(dirname(filePath), `.${Date.now()}.tmp`)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️🔀 Concurrency

The catch block silently swallows ALL errors from lock(), not just the case where the file doesn't exist. If locking fails due to permissions, lock contention timeout, or system errors, the code proceeds without a lock, risking data corruption. Additionally, if the file doesn't exist, multiple concurrent writers could race: both proceed without locks, write to different temp files, and the last rename wins, causing the first write to be silently lost.

💡 Suggestion: Check specifically for file-not-exist errors (e.g., by checking err.code === 'ENOENT' or verifying !existsSync(filePath) in the catch block). If the error is not ENOENT, re-throw it. Consider creating an empty file first if it doesn't exist, then locking it, to ensure locking always occurs.

Comment thread src/extensions/memory/file-io.ts Outdated
// Acquire cross-platform file lock before writing
let release: (() => Promise<void>) | undefined
try {
release = await lock(filePath, { retries: { retries: 10, factor: 2, minTimeout: 50, maxTimeout: 1000 } })
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️⚠️ Error Handling

If writeFile succeeds but rename fails (e.g., due to permissions or disk full), the temporary file at tmpPath is not cleaned up, leaving orphaned temp files in the memory directory. The finally block only releases the lock, it doesn't unlink the temp file.

💡 Suggestion: Add cleanup logic in a catch block or finally block to remove tmpPath if it exists and the rename failed. For example: try { await writeFile(...); await rename(...) } catch (err) { await unlink(tmpPath).catch(() => {}); throw err }.

Comment thread src/extensions/memory/file-io.ts Outdated
import { dirname, join } from "node:path"
import { lock } from "proper-lockfile"

const ENTRY_DELIMITER = "\n§\n"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ℹ️🔧 Maintainability

The ENTRY_DELIMITER constant \n§\n is defined identically in both file-io.ts and memory-store.ts. This duplication risks inconsistency if one file is modified without updating the other.

💡 Suggestion: Export ENTRY_DELIMITER from file-io.ts (or a shared constants file) and import it in memory-store.ts to ensure a single source of truth.

if ((err as { code?: string }).code === "ENOENT") return []
throw err
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ℹ️🔧 Maintainability

The temp file name uses only Date.now() without randomness, which could theoretically collide if two operations occur within the same millisecond (e.g., on a fast system or if the clock hasn't updated). While the file lock prevents concurrent collisions, if the lock cannot be acquired (file doesn't exist case), collisions become possible.

💡 Suggestion: Append a random suffix to the temp filename, e.g., \.${Date.now()}-${randomBytes(4).toString('hex')}.tmp, to guarantee uniqueness.

private _ok(target: MemoryTarget, message: string): MemoryToolResult {
const entries = this._get(target)
const current = entries.join(ENTRY_DELIMITER).length
const limit = this._limit(target)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ℹ️🏗️ Design

The _resolveSingleMatch method truncates matched entries to 80 characters in the error message, but uses a slice without checking if the truncation actually occurred, and doesn't indicate that truncation happened.

💡 Suggestion: Consider adding an ellipsis indicator only when truncation occurs, or ensure the truncation logic is consistent with the displayed message to avoid confusion when users see partial strings.

@castabar castabar marked this pull request as ready for review May 6, 2026 09:05
Copy link
Copy Markdown
Contributor Author

@castabar castabar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

Clean architecture overall. Two stores (MEMORY.md / USER.md), §-delimited, char-bounded, injected once at session_start as a frozen snapshot. Reasonable approach.


Issues

1. Locking bug — real race condition (file-io.ts:258-263)

When the target file doesn't exist yet (first write), proper-lockfile throws and the catch silently proceeds with no lock:

```ts
try {
release = await lock(filePath, ...)
} catch {
// File doesn't exist yet — proceed without lock
}
```

Two concurrent writers both fail to lock → both write different tmp paths → both rename → last rename wins, first writer's entry is silently dropped. Fix: use a separate always-present lock file (e.g. `filePath + ".lock"`) or ensure the target file exists before locking.

2. ENTRY_DELIMITER defined in two places

file-io.ts:6 and memory-store.ts:4 both define const ENTRY_DELIMITER = "\n§\n". If one changes they silently diverge. Move to types.ts and import from there.

3. Dynamic import inside hot path (file-io.ts:266)

```ts
const { rename } = await import("node:fs/promises")
```

rename is available as a static import at the top of the module. This dynamic import runs on every write. Just import it statically.

4. _resolveSingleMatch — ambiguous multi-match behavior (memory-store.ts:580-597)

When matches.length > 1 but all matched entries are identical (file manually edited with dupes), uniqueTexts.size === 1 so it silently uses the first match. The user gets a success response but may not know which duplicate was targeted. At minimum document this.

5. Content scanner false positives

/you\s+are\s+now\s+/i blocks "you are now able to use this feature" in legitimate notes. The read_secrets pattern blocks any note that mentions cat ~/.env even in a "don't do this" context. These will produce confusing user-facing errors.

6. types.test.ts is noise

The test just assigns type literals and asserts equality — it tests nothing at runtime. Remove or replace with something meaningful.


Minor

  • _ok() returns live entries while formatForSystemPrompt returns the frozen snapshot. The agent sees fresh counts but a stale prompt — document this so future maintainers don't "fix" it.
  • No test covering that {{MEMORY}} / {{USER_PROFILE}} placeholders are actually replaced in the rendered system prompt.
  • ctx is cast twice via as unknown as { memorySnapshot?: ... } in two different files. Consider a shared interface.
  • proper-lockfile last released 2021, no longer actively maintained. Not a blocker but worth noting.

The locking bug (#1) is the only real correctness issue. Fix the lock-on-nonexistent-file race before merging.

castabar and others added 10 commits May 7, 2026 14:45
Co-Authored-By: Kimchi <noreply@kimchi.dev>
Co-Authored-By: Kimchi <noreply@kimchi.dev>
Co-Authored-By: Kimchi <noreply@kimchi.dev>
…to orchestrator and single-model system prompts

Co-Authored-By: Kimchi <noreply@kimchi.dev>
Co-Authored-By: Kimchi <noreply@kimchi.dev>
… to system prompt builders

Add optional memoryBlock and userProfileBlock parameters to buildOrchestratorSystemPrompt
and buildSingleModelSystemPrompt, with corresponding .replace() calls for {{MEMORY}} and
{{USER_PROFILE}} placeholders. buildSubagentSystemPrompt is intentionally unchanged as
subagent prompts should never receive memory.

Co-Authored-By: Kimchi <noreply@kimchi.dev>
- Create src/extensions/memory/index.ts entry point
- Wire memoryExtension into extensionFactories in cli.ts
- Add tests for tool registration and session_start snapshot
- Biome format fixes applied

Co-Authored-By: Kimchi <noreply@kimchi.dev>
- Lock companion .lock file instead of target to avoid race when target
  doesn't exist yet (proper-lockfile throws on missing files).
- Move ENTRY_DELIMITER to types.ts so it's defined once.
- Use static rename import instead of dynamic import in hot path.
- Add JSDoc on _resolveSingleMatch documenting duplicate handling.
- Add test for replace on duplicate-identical entries.
- Narrow role_hijack pattern with article requirement (a/an/the) to
  avoid false positives on benign phrases like 'you are now able to'.
- Add test for benign 'you are now' without article.
- Extract MemoryContext interface in types.ts, use in index.ts and
  prompt-enrichment.ts to eliminate repeated inline casts.
- Remove types.test.ts (noise — tests TS type system at runtime).
- Add JSDoc on formatForSystemPrompt and _ok documenting the
  live entries vs frozen-snapshot distinction.

Co-Authored-By: Kimchi <noreply@kimchi.dev>
@castabar castabar force-pushed the feature/memory-layer branch from 11f3d88 to 9333086 Compare May 7, 2026 11:50
castabar added 6 commits May 7, 2026 15:14
… to model

- Memory extension now owns its own before_agent_start handler and appends
  memory blocks directly to the system prompt built by prompt-enrichment,
  removing the brittle ctx-based snapshot approach (ctx is a fresh object
  per emit() call, so writes to it were silently dropped)
- Remove {{MEMORY}}/{{USER_PROFILE}} placeholders from prompt templates and
  all memoryBlock/userProfileBlock params from prompt-transformer functions
- Fix wrapResult in memory-tool: read action was returning empty string as
  text content; model received blank tool result and hallucinated an image
  attachment — now surfaces entries in the content the model sees
- Remove MemoryContext type and getMemorySnapshot export (no longer needed)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant