Tuning the Search: What the Parameters Actually Do


The previous post covered the basic setup: hybrid search enabled, minScore lowered to 0.25, OpenAI embeddings. That got retrieval working. This post is about what I changed after that—the parameters that didn’t exist in the simplified snippet.

Here’s the actual configuration Daneel runs now:

{
  "memorySearch": {
    "enabled": true,
    "provider": "openai",
    "model": "text-embedding-3-small",
    "sources": ["memory", "sessions"],
    "chunking": {
      "tokens": 400,
      "overlap": 80
    },
    "sync": {
      "onSessionStart": true,
      "onSearch": true,
      "watch": true
    },
    "query": {
      "maxResults": 20,
      "minScore": 0.25,
      "hybrid": {
        "enabled": true,
        "vectorWeight": 0.7,
        "textWeight": 0.3,
        "candidateMultiplier": 4,
        "mmr": {
          "enabled": true,
          "lambda": 0.7
        },
        "temporalDecay": {
          "enabled": true,
          "halfLifeDays": 60
        }
      }
    }
  }
}

What each parameter does and why it’s set the way it is:

  • sources: ["memory", "sessions"] — Search both memory files (memory/*.md) and session transcripts. Without sessions, Daneel can’t retrieve context from past conversations that didn’t make it into daily logs.

  • chunking.tokens: 400, overlap: 80 — Each file is split into 400-token chunks with 80 tokens of overlap between adjacent chunks. The overlap keeps a concept that straddles a chunk boundary from becoming unsearchable. 20% overlap is on the generous side, but it's cheap insurance for diary-style logs, where context carries across paragraphs.

  • vectorWeight: 0.7, textWeight: 0.3 — Hybrid scoring: 70% vector similarity, 30% BM25 keyword match. Vector search handles semantic intent (“how do I handle encoding in email?”); BM25 handles exact terms (“himalaya template send”). Neither alone is sufficient.

  • candidateMultiplier: 4 — Before returning results, retrieve 4× more candidates than maxResults (so 80 candidates for 20 results), then rerank. More candidates means better reranking quality; the cost is negligible since this happens in SQLite.

  • mmr.enabled: true, lambda: 0.7 — Maximal Marginal Relevance reranking. Without it, results cluster: you ask about email and get five near-identical chunks from the same file. MMR trades relevance against diversity, and lambda sets the balance. At 0.7, relevance still dominates, but near-duplicates of already-selected results get pushed down.

  • temporalDecay.halfLifeDays: 60 — Recent memories rank higher than old ones: a memory 60 days old gets half the retrieval weight of a fresh one. Cognitive science research suggests something like 30 days as a baseline; I set it conservatively at 60 because Daneel is three days old and I don't want early context to fade too fast. I'll revisit at the 30-day mark.
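
The chunking scheme is just a sliding window. A minimal sketch, using whitespace splitting as a stand-in for the embedding model's real tokenizer (so "token" here means "word", which is close enough to show the mechanics):

```python
def chunk(text, tokens=400, overlap=80):
    """Split text into fixed-size windows, each sharing `overlap` tokens
    with its predecessor, so boundary-spanning concepts stay searchable."""
    words = text.split()            # stand-in tokenizer; real one is model-specific
    step = tokens - overlap         # each window starts 320 tokens after the last
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + tokens]))
        if start + tokens >= len(words):
            break                   # final window reached the end of the file
    return chunks
```

With these defaults, a 1000-token file yields three chunks, and the last 80 tokens of each chunk reappear at the head of the next.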

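The query-side parameters compose into a single rerank pass over the candidate pool: blend the two scores, decay by age, then select greedily with MMR. A sketch under assumed field names (vector, bm25, age_days, emb — the real implementation's data structures will differ):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def rerank(candidates, max_results=20, vector_w=0.7, text_w=0.3,
           half_life_days=60, mmr_lambda=0.7):
    def decayed(c):
        # Hybrid score: 70% vector similarity, 30% BM25 keyword match,
        # then halved for every `half_life_days` of age.
        hybrid = vector_w * c["vector"] + text_w * c["bm25"]
        return hybrid * 0.5 ** (c["age_days"] / half_life_days)

    pool = sorted(candidates, key=decayed, reverse=True)
    selected = []
    while pool and len(selected) < max_results:
        def mmr(c):
            # Penalize similarity to anything already selected.
            redundancy = max((cosine(c["emb"], s["emb"]) for s in selected),
                             default=0.0)
            return mmr_lambda * decayed(c) - (1 - mmr_lambda) * redundancy
        best = max(pool, key=mmr)
        selected.append(best)
        pool.remove(best)
    return selected
```

This is where candidateMultiplier pays off: the greedy MMR loop can only diversify among candidates it actually has, so reranking 80 candidates down to 20 beats reranking exactly 20.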
What It Solves

Without MMR: searching “send email” returned five chunks from the same TOOLS.md section. Relevant, but redundant.

With MMR + multi-source: the same query now returns the credential setup, a session where we debugged encoding, and the DKIM warning from a different log. Three different useful angles instead of five copies of the same text.

The configuration isn’t revolutionary. These are standard IR techniques—BM25, MMR, temporal decay—applied to agent memory files. What makes it work is that all three address different failure modes: BM25 handles exact terms, MMR handles result clustering, temporal decay handles stale context. Each one earns its overhead.


See also