Why I Moved from OpenClaw to Hermes

A month ago I thought I had the right answer: split everything into specialists.

At the peak, my setup had sixteen agents. One for email. One for writing. One for research. One for infrastructure. Several more for code, review, critique, QA, and orchestration. On paper it looked elegant — decomposition, clear ownership, domain-specific memory, explicit routing.

In practice it gradually became something else: an overengineered system that demanded more maintenance than it returned.

LLMs in Emacs: My Actual gptel Setup

I’ve been using gptel daily for three months now. This isn’t a review — it’s a field report from someone running LLMs inside Emacs on a corporate macOS machine with a MITM proxy, compliance requirements, and zero patience for black-box tooling.

From One Agent to Fifteen: Multi-Agent Architecture in Practice

For the first few weeks, Daneel did everything. One agent, all domains: email triage, code review, research, smart home control, calendar, blog drafts. The configuration was clean, the setup was simple, and the outputs were consistently mediocre.

Not broken. Just mediocre. And I eventually figured out why.

The single-agent problem

When an agent handles email classification at 09:00 and rewrites a Python module at 10:00, the same context window carries both concerns. A session loaded with inbox threads, calendar events, and Home Assistant device states isn’t an ideal substrate for code review advice. The model isn’t broken — it’s trying to maintain quality across too many unrelated domains simultaneously.

Why I Use Two AI Assistants Instead of One

I stopped asking my personal AI assistant to write code. That decision — more than any prompt engineering trick or model upgrade — improved the quality of what I get back. This post is about why, and what the setup actually looks like in practice.

The problem with asking your personal assistant to write code

My personal assistant, Daneel, knows a lot about me. It tracks my calendar, triages my email, controls my Home Assistant devices, remembers past conversations, and generates a morning briefing before I’ve had coffee. That rich context is exactly what makes it useful for life-admin. It’s also exactly what makes it a poor choice for writing code.

Why I Gave My AI Agent a Soul (Again)

Two weeks ago I published a post about giving Daneel a soul — replacing Asimov’s Laws with a real priority hierarchy and a decision model. Last week I rewrote it again. Not because the first version was wrong, but because running it in production taught me what was missing: harm prevention has to come before “follow instructions,” trust has to be explicit, and an agent that waits to be asked is an agent that will eventually do the wrong thing at the wrong moment. Here’s what changed and why.

FSA-Driven Multi-Agent Pipelines: How We Stopped Fighting Our Own Orchestrator

The Problem We Had

Our first multi-agent pipeline was a disaster waiting to happen. The architecture seemed clean: spawn workers, each does its thing, updates a shared `status.json` to record completion, and if it’s the last one in its phase, spawns the next batch. Workers know the workflow, workers drive progress. What could go wrong?

Plenty.

The race condition was textbook. Two parallel research workers, `researcher-a` and `researcher-b`, finish around the same time. At `t=0`, both read `status.json`, and each sees the other still marked as running. At `t=1`, both write back with only themselves marked completed. One write wins. The other is silently lost. The “winning” worker sees only its own completion, decides the phase isn’t done, and does nothing. The pipeline stalls. No error. No timeout for another ten minutes. Just silence.
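The lost update described above can be replayed deterministically. This is a minimal sketch, not the pipeline's actual code: the `"running"`/`"completed"` values, the flat layout of `status.json`, and every function name here are assumptions made for illustration. Only the worker names and the file name come from the post.

```python
import json
import tempfile
from pathlib import Path


def read_status(path: Path) -> dict:
    return json.loads(path.read_text())


def write_status(path: Path, status: dict) -> None:
    path.write_text(json.dumps(status))


def phase_done(status: dict) -> bool:
    # A worker only spawns the next batch if everyone looks completed.
    return all(state == "completed" for state in status.values())


def simulate_lost_update(path: Path) -> dict:
    # Hypothetical file layout: worker name -> state.
    write_status(path, {"researcher-a": "running", "researcher-b": "running"})

    # t=0: both workers take a snapshot before either has written.
    snapshot_a = read_status(path)
    snapshot_b = read_status(path)

    # t=1: each marks only itself completed in its own stale snapshot.
    snapshot_a["researcher-a"] = "completed"
    snapshot_b["researcher-b"] = "completed"

    # Each worker asks "am I the last one?" against its own snapshot.
    a_spawns_next = phase_done(snapshot_a)
    b_spawns_next = phase_done(snapshot_b)

    # Both write whole files; the second write erases the first completion.
    write_status(path, snapshot_a)
    write_status(path, snapshot_b)

    return {
        "final": read_status(path),
        "next_phase_spawned": a_spawns_next or b_spawns_next,
    }


with tempfile.TemporaryDirectory() as tmp:
    result = simulate_lost_update(Path(tmp) / "status.json")
    print(result)
```

Both workers finished, yet the file ends up recording only one completion and neither worker spawns the next phase. Nothing raises an error, which is why the stall is silent.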

Ten Days with an AI Agent

On day 2, the agent tried to re-enable a Twitter integration I had explicitly cancelled the night before. It had forgotten. Not because of a bug — because session restarts wipe context, and nothing in the default setup prevents an AI from re-deriving a decision you already vetoed.

That’s when I started building the infrastructure that turned a chatbot into something that actually works.

This is not a tutorial. It’s what running an autonomous AI agent looks like after 10 days: what it costs, what breaks, and what I’d change.

Why I Stopped Waiting for Announces: The Spawn-All-Wait Pattern for Multi-Agent AI

My multi-agent pipeline was failing at random. Not always, not predictably — just often enough to make me stop trusting it. Worker-2 would run, write its output, and then nothing would happen. The orchestrator was sitting there waiting for an announce that never arrived. The bug already had a ticket number: #17000. Description: hardcoded 60-second timeout, no retry. I’d built the entire coordination model on message delivery, and message delivery was the single point of failure. The fix wasn’t more retries. It was getting rid of message-based coordination entirely.
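The excerpt does not show what replaced message-based coordination, so the following is only one plausible reading of "spawn all, then wait": the orchestrator starts every worker, waits on the workers themselves, and then checks for their output files instead of listening for an announce. The thread pool, the `.md` output naming, and the function names are all assumptions for illustration.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor, wait
from pathlib import Path


def run_worker(name: str, out_dir: Path) -> None:
    # Stand-in for a real agent run; it only writes an output file.
    (out_dir / f"{name}.md").write_text(f"output from {name}\n")


def spawn_all_wait(names: list[str], out_dir: Path) -> list[str]:
    """Spawn every worker, wait for all, return names with no output file."""
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        # Spawn everything up front; no worker triggers another worker.
        futures = [pool.submit(run_worker, name, out_dir) for name in names]
        # Block until every worker has returned or raised.
        wait(futures)
    # Completion is judged from what is on disk, not from a delivered message.
    return [name for name in names if not (out_dir / f"{name}.md").exists()]


with tempfile.TemporaryDirectory() as tmp:
    missing = spawn_all_wait(["worker-1", "worker-2"], Path(tmp))
    print("missing outputs:", missing)
```

The design point is that the orchestrator's view of progress no longer depends on a notification arriving. A worker that crashes shows up as a missing file, which the orchestrator can retry or report.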

Day 5 with Daneel: Headless Browsers, Document Pipelines, and the Numbers So Far

Day 5 was the most varied day yet. Not in complexity—some earlier days had harder problems—but in range. The work touched browser automation, document tooling, and enough small fixes that by evening I had a reason to look at the numbers.

Running a Browser Without a Screen

One of the things an AI assistant can do is interact with web pages—read content, check status, fill forms. But this particular setup runs on a headless Linux server. No display, no window manager, no user session.

Rebuilding a Tool in Four Hours: What the AI Agent Actually Did

I have a small internal tool called Scénář Creator. It generates timetables for experiential courses — you know the kind: weekend trips where you have 14 programme blocks across three days and someone has to make sure nothing overlaps. I built version one in November 2025. It was a CGI Python app running on Apache, backed by Excel.

Yesterday I asked Daneel to rebuild it. Four hours later, version 4.7 was running in production. Here’s exactly what happened.
