<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>K@claudecode on Martin Sukany</title><link>https://sukany.cz/tags/k@claudecode/</link><description>Recent content in K@claudecode on Martin Sukany</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Thu, 12 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://sukany.cz/tags/k@claudecode/index.xml" rel="self" type="application/rss+xml"/><item><title>Why I Use Two AI Assistants Instead of One</title><link>https://sukany.cz/blog/2026-03-12-two-ai-assistants/</link><pubDate>Thu, 12 Mar 2026 00:00:00 +0000</pubDate><guid>https://sukany.cz/blog/2026-03-12-two-ai-assistants/</guid><description>&lt;p&gt;I stopped asking my personal AI assistant to write code. That decision — more than any prompt engineering trick or model upgrade — improved the quality of what I get back. This post is about why, and what the setup actually looks like in practice.&lt;/p&gt;
&lt;h2 id="the-problem-with-asking-your-personal-assistant-to-write-code"&gt;The problem with asking your personal assistant to write code&lt;/h2&gt;
&lt;p&gt;My personal assistant, Daneel, knows a lot about me. It tracks my calendar, triages my email, controls my Home Assistant devices, remembers past conversations, and generates a morning briefing before I&amp;rsquo;ve had coffee. That rich context is exactly what makes it useful for life-admin. It&amp;rsquo;s also exactly what makes it a poor choice for writing code.&lt;/p&gt;
&lt;p&gt;When I asked Daneel to refactor a script in a session already loaded with calendar events, email threads, and device states, the suggestions came back hedged, occasionally irrelevant, and harder to trust. The model wasn&amp;rsquo;t broken — it was trying to reason across too many unrelated domains at once. Calendar management and Go module refactoring are not related problems, but they were sharing the same context window, and that matters.&lt;/p&gt;
&lt;p&gt;I think of it as desk space. A programmer works better with a clean desk focused on one problem than with every open project, email, and to-do list spread across the surface. A language model&amp;rsquo;s attention works the same way. Pack enough unrelated context into the window and the model starts making connections that aren&amp;rsquo;t there, hedging where it should be precise, or simply losing the thread.&lt;/p&gt;
&lt;h2 id="what-each-agent-actually-does"&gt;What each agent actually does&lt;/h2&gt;
&lt;p&gt;The split is clean by design. Daneel is the persistent layer — always on, full life context, memory across sessions, proactive. It handles the entire life-admin surface: heartbeats, email triage, Home Assistant automations, calendar nudges, morning briefings. It knows who I am and what I&amp;rsquo;m doing across every domain of my life. That&amp;rsquo;s its job.&lt;/p&gt;
&lt;p&gt;Claude Code is the specialist. It&amp;rsquo;s on-demand, scoped to a repository, and knows nothing about my calendar or email unless I explicitly tell it something. When it gets a task, it gets a working directory and a description. That&amp;rsquo;s the full context. Nothing else bleeds in.&lt;/p&gt;
&lt;p&gt;The analogy that fits best is a generalist doctor versus a surgeon. Your GP knows your full medical history — that breadth is valuable for holistic care. But when you need surgery, you want the surgeon focused on the procedure, not briefed on your tax situation. The surgeon&amp;rsquo;s narrow focus is a feature, not a limitation.&lt;/p&gt;
&lt;h2 id="why-narrow-context-produces-better-code"&gt;Why narrow context produces better code&lt;/h2&gt;
&lt;p&gt;The difference is observable before it&amp;rsquo;s explainable. When Claude Code gets a task with only the relevant repository in scope, the output is sharper. It references actual code, proposes concrete changes, and doesn&amp;rsquo;t pad responses with caveats about things it can&amp;rsquo;t see. When the same model does coding work inside a session loaded with unrelated context, the quality drops in ways that are subtle but consistent: more hedging, less precision, occasional suggestions that only make sense if you squint.&lt;/p&gt;
&lt;p&gt;I haven&amp;rsquo;t run controlled experiments. This is observational. But the pattern is consistent enough that I&amp;rsquo;ve made it a rule: coding tasks get their own context, every time.&lt;/p&gt;
&lt;p&gt;The mechanism matters too. Claude Code gets a specific working directory. It explores the repo, reads the relevant files, and builds its understanding from the code itself — not from my description of my life. That working-directory scoping is the primary context control, and it works.&lt;/p&gt;
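&lt;p&gt;A minimal sketch of that scoping, assuming Claude Code&amp;rsquo;s non-interactive &lt;code&gt;-p&lt;/code&gt; (print) mode and a small Python wrapper. The function names are my own illustration, not part of either tool:&lt;/p&gt;

```python
import subprocess
from pathlib import Path

def build_command(task_description):
    # "-p" runs Claude Code non-interactively: the task text is the
    # only prompt the agent receives.
    return ["claude", "-p", task_description]

def run_scoped_task(repo_dir, task_description, timeout=1800):
    # cwd is the primary context control: the agent explores repo_dir
    # and builds its understanding from the code itself, nothing else.
    return subprocess.run(
        build_command(task_description),
        cwd=Path(repo_dir),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
```

&lt;p&gt;Everything the agent knows starts from that directory; there is no channel through which calendar or email context could leak in.&lt;/p&gt;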
&lt;h2 id="how-the-handoff-works"&gt;How the handoff works&lt;/h2&gt;
&lt;p&gt;From my perspective, the interaction is simple. I tell Daneel what I want done: &amp;ldquo;refactor the caldav script to handle token refresh.&amp;rdquo; Daneel constructs the task, points Claude Code at the relevant file and any context it needs, spawns it as a background process, and monitors for completion. When it&amp;rsquo;s done, the result arrives in Matrix. I never have to switch tools or change context myself.&lt;/p&gt;
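&lt;p&gt;The spawn-and-monitor loop can be sketched like this. It is an illustrative reconstruction, not Daneel&amp;rsquo;s actual code: the &lt;code&gt;notify&lt;/code&gt; callback stands in for Matrix delivery, and &lt;code&gt;agent_cmd&lt;/code&gt; is a hypothetical override hook:&lt;/p&gt;

```python
import subprocess
import time

def spawn_and_monitor(repo_dir, task, notify, poll_seconds=30, agent_cmd=None):
    # agent_cmd defaults to Claude Code's non-interactive mode; it is
    # overridable so the loop can be exercised with any command.
    cmd = agent_cmd or ["claude", "-p", task]
    proc = subprocess.Popen(
        cmd,
        cwd=repo_dir,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    # Poll instead of blocking so the orchestrating assistant stays
    # responsive while the coding agent works in the background.
    while proc.poll() is None:
        time.sleep(poll_seconds)
    out, err = proc.communicate()
    if proc.returncode == 0:
        notify("Task finished: " + out.strip())
    else:
        notify("Task failed: " + err.strip())
    return proc.returncode
```

&lt;p&gt;The key design choice is that the orchestrator only ever sees the task text and the result; the coding session itself stays isolated.&lt;/p&gt;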
&lt;p&gt;The handoff is where the quality of the split lives or dies. Daneel has to construct a precise task description — if it&amp;rsquo;s vague, Claude Code still gets a muddled context, and the problem just moves upstream. Writing a clear task handoff is a real skill, and I&amp;rsquo;ve had to tune it. But a well-constructed handoff is much easier to get right than expecting a single model to maintain useful quality across a large mixed-domain context.&lt;/p&gt;
&lt;p&gt;The user experience is a single conversation. The complexity — spawning, monitoring, result delivery — is hidden. That&amp;rsquo;s the point.&lt;/p&gt;
&lt;h2 id="the-trade-offs-i-live-with"&gt;The trade-offs I live with&lt;/h2&gt;
&lt;p&gt;This setup is not free. Two agents means two failure modes, two configurations, and a non-trivial orchestration layer. When something breaks, it&amp;rsquo;s not always obvious whether the problem is in the task description, the spawning mechanism, or Claude Code itself. Debugging the pipeline is its own skill.&lt;/p&gt;
&lt;p&gt;There&amp;rsquo;s also latency. Spinning up a coding agent for every task has overhead. For a quick one-liner, it&amp;rsquo;s overkill. The split pays off for tasks with real scope — a refactor, a new feature, a bug that requires reading multiple files. For something trivial, I still just ask Daneel directly and accept the slightly lower quality.&lt;/p&gt;
&lt;p&gt;Maintenance is real. Two tools have separate update cycles, separate auth quirks, and separate failure modes. I&amp;rsquo;ve hit cases where an update changed the spawning interface, or where Claude Code&amp;rsquo;s behavior shifted between versions. Keeping both working smoothly is ongoing work, not a one-time setup.&lt;/p&gt;
&lt;p&gt;And this setup assumes comfort with CLI tooling and configuration files. It&amp;rsquo;s not plug-and-play for someone who wants a simpler life.&lt;/p&gt;
&lt;h2 id="what-i-d-do-differently"&gt;What I&amp;rsquo;d do differently&lt;/h2&gt;
&lt;p&gt;I&amp;rsquo;d set up the context separation earlier. For too long I tried to get Daneel to do everything, and I blamed the model when quality was inconsistent. The issue wasn&amp;rsquo;t the model — it was me asking it to be two things at once.&lt;/p&gt;
&lt;p&gt;If I were starting over, I&amp;rsquo;d also invest more upfront in the task handoff format. The quality of Claude Code&amp;rsquo;s output is almost entirely determined by the quality of the task description. Getting that right — concise, specific, with the right working directory and just enough background — is where the leverage is.&lt;/p&gt;
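&lt;p&gt;As an illustration, a handoff format might look like the following. The field names are my own convention, not anything either tool defines:&lt;/p&gt;

```python
from dataclasses import dataclass, field

@dataclass
class TaskHandoff:
    # A minimal handoff: one goal, one working directory, and just
    # enough background. Anything more risks re-importing the mixed
    # context the split exists to avoid.
    goal: str                                  # one-sentence outcome
    working_dir: str                           # repo the agent is scoped to
    files: list = field(default_factory=list)  # entry points, if known
    background: str = ""                       # minimal extra context

    def render(self):
        lines = [
            "Task: " + self.goal,
            "Working directory: " + self.working_dir,
        ]
        if self.files:
            lines.append("Start with: " + ", ".join(self.files))
        if self.background:
            lines.append("Background: " + self.background)
        return "\n".join(lines)
```

&lt;p&gt;Forcing the goal into one sentence is the useful constraint: if it will not compress that far, the task is not yet well-defined enough to hand off.&lt;/p&gt;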
&lt;p&gt;Would I set this up again? Yes. The cognitive overhead of the orchestration is less than the cognitive overhead of getting mediocre code back and figuring out why.&lt;/p&gt;
&lt;p&gt;The principle here doesn&amp;rsquo;t require my specific tooling. If you&amp;rsquo;re using any combination of AI assistants — whether that&amp;rsquo;s two Claude sessions, a personal assistant alongside a coding agent, or even just separate chat threads — the same logic applies: don&amp;rsquo;t mix life-admin context with coding context. Keep them separate. The model on the other end will produce better output, even if it can&amp;rsquo;t tell you why.&lt;/p&gt;</description></item></channel></rss>