Daneel kept forgetting things. After every session restart, I had to re-explain what we were working on. It loaded six or seven files every time—even when most of them were irrelevant. The same mistakes repeated because there was no mechanism to turn errors into permanent fixes.
I designed a 3-tier memory system. Inspired by CPU cache architecture. Simple, predictable, maintainable.
The Problem
LLM sessions don’t persist. Every restart is a cold boot. Daneel had context files—NOW.md, daily logs—but no hierarchy. Everything had equal priority. Read everything every time.
Result:
- Slow startup (loading files “just in case”)
- Wasted tokens on stale context
- Repeated mistakes (no path from error → permanent fix)
- Manual context handoff after every restart
It worked. Barely. It didn’t scale.
The Solution: L1/L2/L3
L1: Hot Cache (<1.5KB)
File: NOW.md
Loaded every session, no exceptions. Contains only:
- Current task (1-2 sentences)
- Active blockers
- Open threads (max 2-3)
Think CPU L1 cache: tiny, fast, always in scope.
Hard rule: stays under 1.5KB. No history. No retrospectives. What’s happening right now.
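The 1.5KB rule is easy to enforce mechanically. A minimal sketch of such a check, assuming a plain file-size budget (the helper name and threshold constant are mine, not part of Daneel):

```python
from pathlib import Path

L1_LIMIT_BYTES = 1536  # the 1.5KB hard rule for the hot cache

def check_hot_cache(path: str = "NOW.md", limit: int = L1_LIMIT_BYTES) -> bool:
    """Return True if the L1 file fits its size budget, else warn."""
    size = Path(path).stat().st_size
    if size > limit:
        print(f"{path} is {size} bytes; trim to under {limit} before continuing")
        return False
    return True
```

Run it from a pre-commit hook or the maintenance cron so an oversized NOW.md fails loudly instead of silently bloating every session.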
L2: Warm Storage
File: MEMORY.md
Curated long-term knowledge. Loaded on demand—main session startup or after a break longer than 6 hours.
Contains:
- Distilled lessons learned
- Important context and relationships
- Architectural decisions and the reasoning behind them
Not append-only. Actively maintained. Stale entries get removed.
L3: Cold Archive
Files: memory/YYYY-MM-DD.md
Raw daily logs. Timestamped. Append-only. Never bulk-loaded.
Accessed only via memory_search(). Disk cache semantics: search when needed, never read in full.
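A grep-style scan is enough for those semantics. This is a minimal sketch of what a memory_search() over the archive might look like, not Daneel's actual implementation:

```python
import re
from pathlib import Path

def memory_search(query: str, archive_dir: str = "memory") -> list[tuple[str, str]]:
    """Scan daily logs line by line for a query.

    Only matching lines come back into context; no log is ever
    loaded whole -- the disk-cache semantics described above.
    """
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    hits = []
    for log in sorted(Path(archive_dir).glob("*.md")):
        for line in log.read_text().splitlines():
            if pattern.search(line):
                hits.append((log.name, line.strip()))
    return hits
```

Returning (filename, line) pairs keeps results cheap to inject into a session while preserving a pointer back to the full log if deeper context is needed.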
Session Restart Workflow
Before: always read 6-7 files → wasted tokens, slow startup.
After: 3-phase startup.
Phase 1: Mandatory (every session)
- Read NOW.md (~1.5KB)
- Read SOUL.md + USER.md (identity and preferences)
Takes roughly 30 seconds and about 8KB of context.
Phase 2: Context-dependent
- Break longer than 6h? Read today’s log.
- New topic? Run memory_search(topic).
- Main session after a long break? Read MEMORY.md.
Phase 3: Compression recovery
- Check NOW.md for compression checkpoint entries
- Resume from the checkpoint
- Run memory_search for the last active topic
Result: faster startup, fewer tokens consumed, nothing loaded that isn’t needed.
Memory Maintenance
The deeper problem: insights from L3 (the daily logs) were never promoted to L2 (MEMORY.md). Hard-won lessons stayed buried in raw logs and never became permanent knowledge.
Fix: scheduled maintenance every 3 days.
Process:
- Read last 3 days of daily logs
- Identify new lessons and critical decisions
- Update MEMORY.md: add insights, prune stale entries
- Review memory/self-review.md: any mistake at COUNT=3? Promote the fix to a permanent rule in AGENTS.md
- Log the maintenance run in the daily diary
Time cost: 5-10 minutes every 3 days. Trade-off is obvious.
MISS/FIX Auto-Graduation
File: memory/self-review.md
Every mistake gets logged with a COUNT field. Each repeat increments the counter.
- COUNT reaches 3 → fix auto-promoted to a permanent rule in AGENTS.md
- High severity (privacy, security) → immediate promotion at COUNT = 1
### MEMORY FAIL #2
TAG: Credentials
MISS: Asked for Zulip credentials without checking TOOLS.md
FIX: Always check TOOLS.md first, then memory_search, THEN ask
COUNT: 2
STATUS: Active
Systematic mistakes become systematic fixes. That’s the goal.
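The graduation rule is small enough to state in code. A sketch, assuming the field-per-line entry format shown above (the parser and the severity-tag convention are my own illustration):

```python
import re

PROMOTE_AT = 3
IMMEDIATE_TAGS = {"privacy", "security"}  # high severity: promote at once

def parse_entry(text: str) -> dict:
    """Parse one MISS/FIX entry (FIELD: value, one per line)."""
    fields = dict(re.findall(r"^(\w+):\s*(.+)$", text, re.MULTILINE))
    fields["COUNT"] = int(fields.get("COUNT", "1"))
    return fields

def should_promote(entry: dict) -> bool:
    """Auto-graduation: COUNT threshold, or immediate for severe tags."""
    if entry.get("TAG", "").lower() in IMMEDIATE_TAGS:
        return True
    return entry["COUNT"] >= PROMOTE_AT
```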
Compression Checkpoint Protocol
LLM contexts compress without warning. You lose work in progress.
At 70% context usage (140k/200k tokens), Daneel dumps current state to NOW.md.
## [2026-02-16 23:00] Checkpoint (context at 72%)
Working on: Gitea backup automation
Decisions made: Using daily cron at 8:00 CET
Pending: Test backup restore process
Key files: scripts/gitea-backup.sh, TOOLS.md#Gitea
Resume from: "Implement restore test"
When to checkpoint:
- Context above 70%
- Before complex multi-step work
- Before any potentially risky operation
- When accumulating important decisions that haven’t been written down yet
Implementation
Done in roughly one hour:
- Shrink NOW.md to <1.5KB (was 2.8KB)
- Create memory/self-review.md for MISS/FIX tracking
- Document L1/L2/L3 in AGENTS.md
- Update HEARTBEAT.md with the maintenance schedule
- Create memory/metrics.json for evaluation tracking
- Schedule cron: memory maintenance every 3 days
- Schedule cron: evaluation run on 2026-02-23
Evaluation
In one week, an automated cron job will analyze metrics.json:
- Did memory fails decrease?
- Is the maintenance overhead acceptable?
- Are checkpoints actually being used?
- Is NOW.md staying under 1.5KB?
Real data, not theory.
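The four questions map directly onto a metrics file. A sketch of what the evaluation pass could look like; the metrics.json schema here is entirely assumed, since the post doesn't specify one:

```python
import json
from pathlib import Path

def evaluate(metrics_path: str = "memory/metrics.json") -> dict[str, bool]:
    """Answer the four evaluation questions from recorded metrics.

    Assumed schema (hypothetical): weekly fail counters, maintenance
    minutes, checkpoint count, and the latest NOW.md size in bytes.
    """
    m = json.loads(Path(metrics_path).read_text())
    return {
        "fails_decreased": m["memory_fails_this_week"] < m["memory_fails_last_week"],
        "overhead_acceptable": m["maintenance_minutes"] <= 10,
        "checkpoints_used": m["checkpoints_written"] > 0,
        "now_md_within_budget": m["now_md_bytes"] < 1536,
    }
```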
Why It Matters
Memory architecture is values made explicit. What you choose to remember, forget, and optimize for defines what the system becomes.
L1/L2/L3 isn’t just caching. It’s:
- Intentionality — immediate recall vs. deep search, decided upfront
- Maintenance — knowledge without upkeep rots
- Learning — mistakes should compound into fixes, not repeat indefinitely
Daneel’s memory is now designed. Not accidental.
We’ll see in a week if it holds.