<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>K@emacs on Martin Sukany</title><link>https://sukany.cz/tags/k@emacs/</link><description>Recent content in K@emacs on Martin Sukany</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Mon, 23 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://sukany.cz/tags/k@emacs/index.xml" rel="self" type="application/rss+xml"/><item><title>LLMs in Emacs: My Actual gptel Setup</title><link>https://sukany.cz/blog/2026-03-23-emacs-gptel-setup/</link><pubDate>Mon, 23 Mar 2026 00:00:00 +0000</pubDate><guid>https://sukany.cz/blog/2026-03-23-emacs-gptel-setup/</guid><description>&lt;p&gt;I&amp;rsquo;ve been using gptel daily for three months now. This isn&amp;rsquo;t a review — it&amp;rsquo;s a field report from someone running LLMs inside Emacs on a corporate macOS machine with a MITM proxy, compliance requirements, and zero patience for black-box tooling.&lt;/p&gt;
&lt;h2 id="why-emacs-for-llm-work"&gt;Why Emacs for LLM Work&lt;/h2&gt;
&lt;p&gt;gptel is a thin client. It sends text to an API, gets text back. That&amp;rsquo;s it. No hidden prompt injection, no telemetry you can&amp;rsquo;t inspect, no magic. You see exactly what goes over the wire.&lt;/p&gt;
&lt;p&gt;I came from VS Code&amp;rsquo;s Copilot Chat. It works fine until you need to understand what it&amp;rsquo;s actually doing. Which model is it using right now? What&amp;rsquo;s in the system prompt? Can I route this through a different backend? The answer is always: you can&amp;rsquo;t, or you need an extension that half-works.&lt;/p&gt;
&lt;p&gt;gptel gives you full control because there&amp;rsquo;s nothing to control. It&amp;rsquo;s Emacs — the config &lt;em&gt;is&lt;/em&gt; the product. Every backend, every model, every parameter is an elisp variable you can inspect and change at runtime.&lt;/p&gt;
&lt;p&gt;The corporate context matters here. I&amp;rsquo;m on a work macOS with a MITM proxy that intercepts TLS. Compliance says data must not be retained by third parties. I need to know exactly where my prompts go. With gptel, I do.&lt;/p&gt;
&lt;p&gt;Three months in, I can say: gptel is not the most polished LLM interface. It is the most transparent one.&lt;/p&gt;
&lt;h2 id="one-config-to-rule-them-all"&gt;One Config to Rule Them All&lt;/h2&gt;
&lt;p&gt;The first thing I did was centralize. One elisp file controls both gptel and aidermacs. One variable switches the default backend:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;;; One line to switch the default for both gptel and aidermacs:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;defvar&lt;/span&gt; &lt;span class="nv"&gt;my/llm-default-backend&lt;/span&gt; &lt;span class="s"&gt;&amp;#34;Copilot&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;;; (defvar my/llm-default-backend &amp;#34;Claude-Max&amp;#34;) ; personal machine&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The second piece is a preference list. Backends expose different models — Copilot gives you Claude, GPT-5, Gemini through one API. The preference list picks the best available model automatically:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;defvar&lt;/span&gt; &lt;span class="nv"&gt;my/gptel-model-preferences&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;claude-opus-4.6&lt;/span&gt; &lt;span class="nv"&gt;claude-opus-4.5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nv"&gt;claude-sonnet-4.6&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nv"&gt;gpt-5.4&lt;/span&gt; &lt;span class="nv"&gt;gpt-5.2&lt;/span&gt; &lt;span class="nv"&gt;gpt-4o&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nv"&gt;gemini-3.1-pro-preview&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s"&gt;&amp;#34;First match from dynamically fetched models wins.&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;When I switch machines or a model disappears from an API, the preference list falls through to the next option. No breakage, no manual editing. This pattern scales to any number of backends — everything downstream (gptel, aidermacs, org-babel helpers) reads from the same source.&lt;/p&gt;
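&lt;p&gt;A minimal sketch of that fall-through, assuming the fetched models arrive as a list of symbols. The helper name &lt;code&gt;my/pick-preferred-model&lt;/code&gt; is hypothetical and not taken from the actual config:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;;; Hypothetical helper: walk the preference list in order and return
;; the first entry the backend actually offers.
(require &amp;#39;seq)

(defun my/pick-preferred-model (preferences available)
  &amp;#34;Return the first symbol in PREFERENCES that is also in AVAILABLE, or nil.&amp;#34;
  (seq-find (lambda (model) (memq model available)) preferences))

;; claude-opus-4.6 is not offered here, so the next preference wins:
(my/pick-preferred-model &amp;#39;(claude-opus-4.6 gpt-5.4 gpt-4o)
                         &amp;#39;(gpt-4o gpt-5.4))
;; =&amp;gt; gpt-5.4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Returning &lt;code&gt;nil&lt;/code&gt; when nothing matches leaves the caller free to keep whatever default the backend already had.&lt;/p&gt;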
&lt;h2 id="github-copilot-for-business-as-primary-backend"&gt;GitHub Copilot for Business as Primary Backend&lt;/h2&gt;
&lt;p&gt;Why Copilot? Compliance. GitHub Copilot for Business does not retain prompts or completions — that&amp;rsquo;s contractual, not just a policy page. For a corporate environment where data retention matters, this is the deciding factor.&lt;/p&gt;
&lt;p&gt;The bonus is access. One Copilot subscription gives you Claude, GPT-5, Gemini, and others through a single API. No separate billing, no individual API keys. IT signs one contract, I get a model zoo.&lt;/p&gt;
&lt;p&gt;The auth flow uses a two-stage token exchange. You start with an OAuth token stored locally by a GitHub Copilot editor plugin such as copilot.vim or copilot.el in &lt;code&gt;~/.config/github-copilot/apps.json&lt;/code&gt;. That token gets exchanged for a short-lived session token via GitHub&amp;rsquo;s API:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;;; OAuth token from ~/.config/github-copilot/apps.json&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;;; -&amp;gt; exchanged for short-lived session token (TTL ~30 min)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;;; -&amp;gt; used against api.business.githubcopilot.com&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;defun&lt;/span&gt; &lt;span class="nv"&gt;my/copilot-get-session-token&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s"&gt;&amp;#34;Exchange OAuth token for Copilot session token. Cached for 30 min.&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;and&lt;/span&gt; &lt;span class="nv"&gt;my/copilot-session-token&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;my/copilot-session-expires&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;float-time&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nv"&gt;my/copilot-session-token&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;;; exchange via api.github.com/copilot_internal/v2/token&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;;; ... (see full config in repo)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;my/copilot-do-token-exchange&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The session token expires in roughly 30 minutes. The wrapper caches it and refreshes automatically with a 5-minute buffer. You never think about auth after initial setup.&lt;/p&gt;
&lt;p&gt;One gotcha that cost me an afternoon: model name normalization. Copilot&amp;rsquo;s API returns model names with dots (&lt;code&gt;claude-opus-4.6&lt;/code&gt;), while Anthropic&amp;rsquo;s convention uses dashes (&lt;code&gt;claude-opus-4-6&lt;/code&gt;). The preference list needs to match against both:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;defun&lt;/span&gt; &lt;span class="nv"&gt;my/model-normalize&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s"&gt;&amp;#34;Normalize model NAME: dots-&amp;gt;dashes, strip date suffix.&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;let&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nv"&gt;s&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;symbolp&lt;/span&gt; &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;symbol-name&lt;/span&gt; &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;setq&lt;/span&gt; &lt;span class="nv"&gt;s&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;replace-regexp-in-string&lt;/span&gt; &lt;span class="s"&gt;&amp;#34;\\.&amp;#34;&lt;/span&gt; &lt;span class="s"&gt;&amp;#34;-&amp;#34;&lt;/span&gt; &lt;span class="nv"&gt;s&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;replace-regexp-in-string&lt;/span&gt; &lt;span class="s"&gt;&amp;#34;-[0-9]\\{8\\}$&amp;#34;&lt;/span&gt; &lt;span class="s"&gt;&amp;#34;&amp;#34;&lt;/span&gt; &lt;span class="nv"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
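&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Dots become dashes and trailing eight-digit date stamps get stripped. Apply it to both sides of the comparison: without it, a preference written as &lt;code&gt;claude-opus-4.6&lt;/code&gt; silently never matches a backend that reports &lt;code&gt;claude-opus-4-6&lt;/code&gt;, and the lookup falls through to the next entry.&lt;/p&gt;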
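&lt;p&gt;To make the comparison concrete, here is a small sketch that normalizes both sides before matching. The normalizer is repeated so the block evaluates on its own; &lt;code&gt;my/model-same-p&lt;/code&gt; is a hypothetical name, and the dated model ID is only an example input:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;;; Copy of the normalizer above, so this block evaluates on its own.
(defun my/model-normalize (name)
  &amp;#34;Normalize model NAME: dots-&amp;gt;dashes, strip date suffix.&amp;#34;
  (let ((s (if (symbolp name) (symbol-name name) name)))
    (setq s (replace-regexp-in-string &amp;#34;\\.&amp;#34; &amp;#34;-&amp;#34; s))
    (replace-regexp-in-string &amp;#34;-[0-9]\\{8\\}$&amp;#34; &amp;#34;&amp;#34; s)))

;; Hypothetical predicate: compare two names after normalizing both.
(defun my/model-same-p (a b)
  &amp;#34;Return non-nil if model names A and B normalize to the same string.&amp;#34;
  (string= (my/model-normalize a) (my/model-normalize b)))

(my/model-normalize &amp;#39;claude-opus-4.6)
;; =&amp;gt; &amp;#34;claude-opus-4-6&amp;#34;
(my/model-normalize &amp;#34;claude-sonnet-4-5-20250929&amp;#34;)
;; =&amp;gt; &amp;#34;claude-sonnet-4-5&amp;#34;
(my/model-same-p &amp;#39;claude-opus-4.6 &amp;#34;claude-opus-4-6&amp;#34;)
;; =&amp;gt; t
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Comparing normalized strings means the preference list can stay in whichever spelling is most readable.&lt;/p&gt;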
&lt;h2 id="multiple-backends-dynamic-discovery"&gt;Multiple Backends, Dynamic Discovery&lt;/h2&gt;
&lt;p&gt;Copilot is the primary, but not the only backend. I have three others:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Claude-Max&lt;/strong&gt; &amp;mdash; a proxy to Anthropic&amp;rsquo;s API running on internal infrastructure, no per-token billing&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OpenWebUI&lt;/strong&gt; &amp;mdash; self-hosted, open models for experimentation&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Daneel&lt;/strong&gt; &amp;mdash; a custom agent system with its own API&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each backend fetches its available models from the API at startup and caches the result:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;defun&lt;/span&gt; &lt;span class="nv"&gt;my/setup-gptel-backends&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s"&gt;&amp;#34;Create all gptel backends with dynamically fetched models.&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;when&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;member&lt;/span&gt; &lt;span class="s"&gt;&amp;#34;Copilot&amp;#34;&lt;/span&gt; &lt;span class="nv"&gt;my/llm-enabled-backends&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt; &lt;span class="nf"&gt;#&amp;#39;&lt;/span&gt;&lt;span class="nv"&gt;gptel-make-gh-copilot&lt;/span&gt; &lt;span class="s"&gt;&amp;#34;Copilot&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt; &lt;span class="nb"&gt;:host&lt;/span&gt; &lt;span class="s"&gt;&amp;#34;api.business.githubcopilot.com&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nb"&gt;:models&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;my/fetch-copilot-models&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;;; Claude-Max, OpenWebUI, Daneel similarly...&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The preference list picks the best model across all backends. If Copilot is down, Claude-Max takes over automatically. &lt;code&gt;SPC o l R&lt;/code&gt; refreshes all backends. When a new model appears on Copilot&amp;rsquo;s API, I hit refresh; if it ranks higher in the preference list, it becomes the default without any config change.&lt;/p&gt;
&lt;h2 id="daily-workflows-rewrite-and-chat"&gt;Daily Workflows: Rewrite and Chat&lt;/h2&gt;
&lt;p&gt;Two workflows cover 90% of my LLM usage: rewrite and chat.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;gptel-rewrite&lt;/strong&gt; is the daily driver. Select a region, type an instruction, and the model rewrites the selection in place. The key addition is dispatch mode &amp;mdash; after a rewrite completes, you get a menu: Accept, Reject, Diff, or Merge:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;;; After rewrite completes: show Accept/Reject/Diff/Merge menu&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;after!&lt;/span&gt; &lt;span class="nv"&gt;gptel-rewrite&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;setq&lt;/span&gt; &lt;span class="nv"&gt;gptel-rewrite-default-action&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;dispatch&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Accept replaces the region. Reject restores the original. Diff shows the proposed change as a diff; Ediff is available as a separate action. Merge inserts the rewrite as merge-conflict markers so you can resolve it hunk by hunk. This single setting turned gptel-rewrite from &amp;ldquo;interesting&amp;rdquo; to &amp;ldquo;indispensable.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Chat buffers&lt;/strong&gt; use org-mode. Every conversation is a structured document I can export, search, refile. For batch work and scripting, a CLI helper wraps gptel for use in org-babel blocks:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-org" data-lang="org"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c"&gt;#+begin_src &lt;/span&gt;&lt;span class="cs"&gt;elisp&lt;/span&gt;&lt;span class="c"&gt; :results raw
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;my/gptel-cli&lt;/span&gt; &lt;span class="s"&gt;&amp;#34;Summarize this error log&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c"&gt;#+end_src&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This makes LLM calls composable with other org-babel languages. Shell block produces output, LLM block processes it, Python block handles the result. Pipelines, not chat.&lt;/p&gt;
&lt;h2 id="tool-use-and-mcp"&gt;Tool Use and MCP&lt;/h2&gt;
&lt;p&gt;gptel supports tool use &amp;mdash; the model can call functions, not just generate text:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;setq&lt;/span&gt; &lt;span class="nv"&gt;gptel-use-tools&lt;/span&gt; &lt;span class="no"&gt;t&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nv"&gt;gptel-confirm-tool-calls&lt;/span&gt; &lt;span class="no"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;; ask before each call&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;I keep confirmation on. Letting a model execute arbitrary functions without review defeats the purpose of a transparent setup.&lt;/p&gt;
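&lt;p&gt;For a sense of what a tool looks like, here is a minimal sketch that defines one read-only tool with &lt;code&gt;gptel-make-tool&lt;/code&gt;. The tool itself (&lt;code&gt;buffer_line_count&lt;/code&gt;) is a hypothetical example, not part of the config described here, and the block assumes gptel is installed:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;;; Hypothetical example tool; assumes gptel is installed and loadable.
(require &amp;#39;gptel)

(gptel-make-tool
 :name &amp;#34;buffer_line_count&amp;#34;
 :description &amp;#34;Return the number of lines in an Emacs buffer.&amp;#34;
 :args (list &amp;#39;(:name &amp;#34;buffer&amp;#34;
               :type string
               :description &amp;#34;Name of the buffer to inspect&amp;#34;))
 :category &amp;#34;emacs&amp;#34;
 ;; The function receives the arguments the model supplied, in order,
 ;; and returns a string that is sent back to the model.
 :function (lambda (buffer)
             (if (get-buffer buffer)
                 (with-current-buffer buffer
                   (number-to-string
                    (count-lines (point-min) (point-max))))
               (format &amp;#34;No such buffer: %s&amp;#34; buffer))))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;With &lt;code&gt;gptel-confirm-tool-calls&lt;/code&gt; set to &lt;code&gt;t&lt;/code&gt;, even a harmless tool like this one prompts before it runs.&lt;/p&gt;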
&lt;p&gt;The tool ecosystem has three layers. &lt;strong&gt;llm-tool-collection&lt;/strong&gt; provides filesystem and shell access &amp;mdash; read files, run commands. &lt;strong&gt;ragmacs&lt;/strong&gt; adds Emacs introspection &amp;mdash; the model can query buffers and read documentation. &lt;strong&gt;gptel-got&lt;/strong&gt; works with org structures.&lt;/p&gt;
&lt;p&gt;Then there&amp;rsquo;s MCP (Model Context Protocol). gptel bridges to MCP servers through mcp-hub:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;setq&lt;/span&gt; &lt;span class="nv"&gt;mcp-hub-servers&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="o"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;fetch&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;:command&lt;/span&gt; &lt;span class="s"&gt;&amp;#34;uvx&amp;#34;&lt;/span&gt; &lt;span class="nb"&gt;:args&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;mcp-server-fetch&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;sequential-thinking&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;:command&lt;/span&gt; &lt;span class="s"&gt;&amp;#34;npx&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nb"&gt;:args&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;-y&amp;#34;&lt;/span&gt; &lt;span class="s"&gt;&amp;#34;@modelcontextprotocol/server-sequential-thinking&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)))))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;code&gt;mcp-server-fetch&lt;/code&gt; lets the model pull web content. &lt;code&gt;sequential-thinking&lt;/code&gt; provides a scratchpad for multi-step reasoning. Agent mode (&lt;code&gt;SPC o l A&lt;/code&gt;) combines tool use with a planning loop. It works for well-scoped tasks; don&amp;rsquo;t expect it to handle more than five or six tool calls reliably yet.&lt;/p&gt;
&lt;h2 id="aidermacs-pair-programming"&gt;Aidermacs: Pair Programming&lt;/h2&gt;
&lt;p&gt;For actual code changes across multiple files, gptel-rewrite isn&amp;rsquo;t enough. Aidermacs brings Aider into Emacs &amp;mdash; architect/editor pair programming where one model designs and another applies changes:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;setq&lt;/span&gt; &lt;span class="nv"&gt;aidermacs-default-model&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;my/aider-architect-model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nv"&gt;aidermacs-default-chat-mode&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;architect&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nv"&gt;aidermacs-extra-args&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="o"&gt;`&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;--editor-model&amp;#34;&lt;/span&gt; &lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;my/aider-editor-model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s"&gt;&amp;#34;--editor-edit-format&amp;#34;&lt;/span&gt; &lt;span class="s"&gt;&amp;#34;diff&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s"&gt;&amp;#34;--no-auto-commits&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The architect model (typically Opus) proposes changes. The editor model (typically Haiku &amp;mdash; fast and cheap) applies them as diffs. This split keeps costs reasonable while maintaining quality for the planning phase.&lt;/p&gt;
&lt;p&gt;Aidermacs shares the Copilot auth flow. The same token exchange function provides credentials &amp;mdash; no separate auth setup. An auto-generated &lt;code&gt;.aider.model.settings.yml&lt;/code&gt; sets the Copilot IDE headers required by the business endpoint.&lt;/p&gt;
&lt;p&gt;The corporate proxy needs extra attention. Aider is a Python tool, and Python&amp;rsquo;s requests library needs its own CA bundle:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;REQUESTS_CA_BUNDLE=/path/to/corporate-ca-bundle.crt
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;SSL_CERT_FILE=/path/to/corporate-ca-bundle.crt
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;These environment variables get set in the aidermacs process environment. Without them, every Aider request fails with a TLS verification error.&lt;/p&gt;
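&lt;p&gt;One way to set them from Emacs is &lt;code&gt;setenv&lt;/code&gt;, sketched below. This is an assumption about the mechanism, not necessarily how the actual config does it, and the bundle path is a placeholder:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;;; Assumption: the path is a placeholder; point it at your real CA file.
(let ((ca-bundle (expand-file-name &amp;#34;/path/to/corporate-ca-bundle.crt&amp;#34;)))
  (if (file-readable-p ca-bundle)
      (progn
        ;; setenv updates `process-environment&amp;#39;, which subprocesses
        ;; started from Emacs afterwards (including Aider) inherit.
        (setenv &amp;#34;REQUESTS_CA_BUNDLE&amp;#34; ca-bundle)
        (setenv &amp;#34;SSL_CERT_FILE&amp;#34; ca-bundle))
    (message &amp;#34;CA bundle not found: %s&amp;#34; ca-bundle)))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Because &lt;code&gt;setenv&lt;/code&gt; changes the environment for every subprocess started afterwards, other Python tools launched from Emacs pick up the same bundle.&lt;/p&gt;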
&lt;h2 id="corporate-proxy-the-elephant-in-the-room"&gt;Corporate Proxy: The Elephant in the Room&lt;/h2&gt;
&lt;p&gt;If you&amp;rsquo;re on a corporate network with a MITM proxy, you already know the pain. The proxy terminates TLS, re-signs with its own CA, and every HTTPS tool needs to know about it.&lt;/p&gt;
&lt;p&gt;For Emacs itself:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;;; Relax TLS verification so certificates re-signed by the corporate MITM proxy are accepted&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;setq&lt;/span&gt; &lt;span class="nv"&gt;gnutls-verify-error&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nv"&gt;tls-checktrust&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nv"&gt;network-security-level&lt;/span&gt; &lt;span class="ss"&gt;&amp;#39;low&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;;; curl handles proxy better than url.el&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;setq&lt;/span&gt; &lt;span class="nv"&gt;gptel-use-curl&lt;/span&gt; &lt;span class="no"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;code&gt;gptel-use-curl t&lt;/code&gt; matters. Emacs&amp;rsquo;s built-in &lt;code&gt;url.el&lt;/code&gt; has inconsistent proxy support. curl picks up the system proxy configuration reliably and handles streaming better. Setting &lt;code&gt;gnutls-verify-error&lt;/code&gt; and &lt;code&gt;tls-checktrust&lt;/code&gt; to &lt;code&gt;nil&lt;/code&gt; and lowering &lt;code&gt;network-security-level&lt;/code&gt; is a known security trade-off: Emacs stops rejecting certificates it cannot verify, for every host. On a corporate machine where IT controls the network anyway, this is the pragmatic choice.&lt;/p&gt;
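&lt;p&gt;If the proxy&amp;rsquo;s CA certificate is available as a PEM file, a narrower option is to add it to the trust files Emacs hands to GnuTLS instead of relaxing verification. This is a sketch of that alternative, not what the config above does; the path is a placeholder, and it only affects Emacs&amp;rsquo;s built-in TLS connections, since curl uses its own trust store:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;;; Alternative sketch: trust the corporate CA explicitly.
;; Assumption: the path is a placeholder for your real PEM bundle.
(require &amp;#39;gnutls)

(let ((ca-bundle &amp;#34;/path/to/corporate-ca-bundle.crt&amp;#34;))
  (when (file-readable-p ca-bundle)
    ;; `gnutls-trustfiles&amp;#39; lists the CA bundles GnuTLS consults.
    (add-to-list &amp;#39;gnutls-trustfiles ca-bundle)))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;With the CA trusted this way, certificate verification can stay on for hosts the proxy does not re-sign.&lt;/p&gt;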
&lt;h2 id="three-months-in-what-i-d-change"&gt;Three Months In: What I&amp;rsquo;d Change&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;What works:&lt;/strong&gt; gptel-rewrite with dispatch is the single most valuable feature. Multi-backend setup with dynamic discovery means I never worry about model availability. The Copilot integration is solid once the auth plumbing is in place.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What doesn&amp;rsquo;t:&lt;/strong&gt; Copilot token refresh occasionally has a race condition &amp;mdash; two simultaneous requests can both trigger an exchange, and one gets a stale token. MCP is early: the ecosystem is small, and agent mode falls apart on complex tasks. The corporate proxy config breaks after macOS updates and needs manual fixes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Recommendation:&lt;/strong&gt; Start with gptel and one backend. Get comfortable with gptel-rewrite. Add aidermacs when you have a concrete use case. Add tools and MCP only when you&amp;rsquo;ve hit the ceiling of what chat alone can do. The config described here took weeks to build incrementally &amp;mdash; don&amp;rsquo;t start there.&lt;/p&gt;
&lt;p&gt;The full configuration is in my &lt;a href="https://git.apps.sukany.cz/martin/emacs-doom"&gt;doom-emacs repository&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>Bridging reMarkable and Emacs Org-mode</title><link>https://sukany.cz/blog/2026-03-12-remarkable-org-mode/</link><pubDate>Thu, 12 Mar 2026 00:00:00 +0000</pubDate><guid>https://sukany.cz/blog/2026-03-12-remarkable-org-mode/</guid><description>&lt;h2 id="why-i-still-write-on-paper"&gt;Why I still write on paper&lt;/h2&gt;
&lt;p&gt;There&amp;rsquo;s something that happens when I pick up a pen that doesn&amp;rsquo;t happen when I open a new buffer. The thinking is different — less filtered, less structured, more honest. Ideas that would never survive the friction of forming a heading and choosing a tag actually get written down. First drafts of decisions, rough task lists, things I&amp;rsquo;m trying to work out, all of it lands on paper before it&amp;rsquo;s ready to be digital.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ve tried replacing this with digital tools. Org-capture is excellent for structured input, but for capturing a fleeting thought mid-meeting or sketching out a problem while commuting, I still reach for paper. The reMarkable is my compromise: it&amp;rsquo;s close enough to writing on paper that it doesn&amp;rsquo;t disrupt the thinking, and close enough to a computer that the notes don&amp;rsquo;t stay trapped on dead wood.&lt;/p&gt;
&lt;p&gt;The problem is that notes on a reMarkable and notes in Org-mode don&amp;rsquo;t naturally talk to each other. This post is about the pipeline I built to close that gap.&lt;/p&gt;
&lt;h2 id="the-two-tools-and-why-they-complement-each-other"&gt;The two tools and why they complement each other&lt;/h2&gt;
&lt;p&gt;The reMarkable is good at one thing: letting you write without getting in the way. The e-ink display doesn&amp;rsquo;t glow, doesn&amp;rsquo;t notify you, doesn&amp;rsquo;t tempt you to check anything. The battery lasts days. The pen latency is low enough to feel like paper. Cloud sync happens automatically in the background — you don&amp;rsquo;t think about it. For first-pass capture of any kind of thinking, it&amp;rsquo;s hard to beat.&lt;/p&gt;
&lt;p&gt;Org-mode is good at different things. It&amp;rsquo;s plain text, version-controllable, programmable. It integrates with agenda, GTD-style workflows, time tracking, archiving. When information is in an &lt;code&gt;.org&lt;/code&gt; file, the full Emacs ecosystem is available — you can schedule it, tag it, refile it, clock time on it, link it to other notes. For organizing and acting on information, it&amp;rsquo;s where I want everything to end up.&lt;/p&gt;
&lt;p&gt;The gap is obvious. reMarkable is where I write things down. Org-mode is where things become actionable. Without a bridge, I was manually transcribing notes — which defeats most of the point of capturing on the device in the first place.&lt;/p&gt;
&lt;h2 id="how-the-sync-pipeline-works"&gt;How the sync pipeline works&lt;/h2&gt;
&lt;p&gt;The pipeline has three stages: download, recognize, and structure.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Download&lt;/em&gt; is handled by &lt;code&gt;rmapi&lt;/code&gt;, a command-line client for the reMarkable cloud API. It downloads notebooks as &lt;code&gt;.rmdoc&lt;/code&gt; files — the native reMarkable format, which is essentially a zip archive containing per-page binary stroke data and metadata. I run this for each notebook I want to sync.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Recognition&lt;/em&gt; is where handwriting becomes text. I use the MyScript Cloud API, which accepts raw reMarkable page data and returns recognized text. The API is HMAC-authenticated, accepts batched requests, and handles Latin-script handwriting well enough for practical use. The free tier covers 2000 pages per month, which is more than I need.&lt;/p&gt;
&lt;p&gt;One piece of engineering worth calling out: hash-based deduplication. Before sending a page to the API, the script computes a hash of the page content and compares it against a local cache. If the page hasn&amp;rsquo;t changed since the last run, it&amp;rsquo;s skipped. This matters in practice — a 20-page notebook where you&amp;rsquo;ve written 2 new pages today sends 2 pages to the API, not 20. Quota is preserved, and the run takes seconds.&lt;/p&gt;
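&lt;p&gt;A minimal sketch of the idea, written in Emacs Lisp rather than the Python the pipeline actually uses. The in-memory cache, the choice of SHA-256, and all names are assumptions:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-emacs-lisp" data-lang="emacs-lisp"&gt;;; Hypothetical cache: page id -&amp;gt; SHA-256 of the content last sent.
(defvar my/rm-page-hashes (make-hash-table :test #&amp;#39;equal)
  &amp;#34;Map from page id to the hash of the page content last recognized.&amp;#34;)

(defun my/rm-page-changed-p (page-id content)
  &amp;#34;Return non-nil if CONTENT for PAGE-ID differs from the cached hash.
Updates the cache as a side effect.&amp;#34;
  (let ((hash (secure-hash &amp;#39;sha256 content)))
    (unless (equal hash (gethash page-id my/rm-page-hashes))
      (puthash page-id hash my/rm-page-hashes)
      t)))

(my/rm-page-changed-p &amp;#34;page-1&amp;#34; &amp;#34;strokes v1&amp;#34;) ;; =&amp;gt; t   (first time seen)
(my/rm-page-changed-p &amp;#34;page-1&amp;#34; &amp;#34;strokes v1&amp;#34;) ;; =&amp;gt; nil (unchanged, skip)
(my/rm-page-changed-p &amp;#34;page-1&amp;#34; &amp;#34;strokes v2&amp;#34;) ;; =&amp;gt; t   (content changed)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;A real run would also need to persist the cache between invocations, for example as a small file next to the downloaded notebooks.&lt;/p&gt;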
&lt;p&gt;&lt;em&gt;Structuring&lt;/em&gt; is handled by a Python script that takes the raw recognized text from MyScript and organizes it into Org format. Each notebook becomes an &lt;code&gt;.org&lt;/code&gt; file; pages become headings or entries under headings, depending on how the notebook is organized. The results land in &lt;code&gt;emacs-org/remarkable/&lt;/code&gt;, organized by notebook name. A notebook called &amp;ldquo;Projects&amp;rdquo; on the device becomes &lt;code&gt;emacs-org/remarkable/Projects.org&lt;/code&gt; on disk.&lt;/p&gt;
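&lt;p&gt;As a rough illustration of the structuring stage, the sketch below renders a list of recognized page texts as an Org document with one top-level heading per page. The heading scheme and both function names are hypothetical; the real layout depends on how each notebook is organized.&lt;/p&gt;

```python
from pathlib import Path


def notebook_to_org(name: str, pages: list[str]) -> str:
    """Render recognized page texts as Org text, one heading per page."""
    lines = [f"#+TITLE: {name}", ""]
    for number, text in enumerate(pages, start=1):
        lines.append(f"* Page {number}")
        # Copy the recognized text line by line, trimming trailing whitespace.
        for raw in text.strip().splitlines():
            lines.append(raw.rstrip())
        lines.append("")  # blank line between pages
    return "\n".join(lines)


def org_path(root: Path, name: str) -> Path:
    """Map a notebook name to its file under the remarkable/ directory."""
    return root / "remarkable" / f"{name}.org"
```

&lt;p&gt;With &lt;code&gt;root&lt;/code&gt; set to &lt;code&gt;emacs-org&lt;/code&gt;, a notebook named &lt;code&gt;Projects&lt;/code&gt; maps to &lt;code&gt;emacs-org/remarkable/Projects.org&lt;/code&gt;, matching the layout described above.&lt;/p&gt;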
&lt;p&gt;The whole thing runs nightly via cron at 02:00. By the time I open Emacs in the morning, yesterday&amp;rsquo;s handwritten notes are already there.&lt;/p&gt;
&lt;h2 id="notebook-structure-on-the-device"&gt;Notebook structure on the device&lt;/h2&gt;
&lt;p&gt;I keep three notebooks on the reMarkable that feed into this pipeline: &lt;code&gt;Projects&lt;/code&gt;, &lt;code&gt;Areas&lt;/code&gt;, and &lt;code&gt;Quick Sheets&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Projects&lt;/code&gt; holds working notes tied to specific projects — meeting notes, rough plans, things I&amp;rsquo;m working through. &lt;code&gt;Areas&lt;/code&gt; holds reference material and ongoing concerns that don&amp;rsquo;t have a deadline. &lt;code&gt;Quick Sheets&lt;/code&gt; is the capture inbox: whatever I write when I don&amp;rsquo;t want to think about where it belongs yet. Random ideas, to-do items, things to look up, fragments of thought.&lt;/p&gt;
&lt;p&gt;After nightly sync, Quick Sheets becomes the Org file I triage first. Something like &amp;ldquo;Call Jan re: contract renewal — deadline Friday&amp;rdquo; appears as a plain text entry. Turning it into a scheduled TODO in Emacs takes ten seconds. The capture happened on paper; the action lives in the agenda.&lt;/p&gt;
&lt;h2 id="what-works-well"&gt;What works well&lt;/h2&gt;
&lt;p&gt;Handwriting recognition (HWR) accuracy for clean, reasonably paced handwriting is high enough to be useful without heavy editing. It is not perfect, and there are occasional word-level errors, but the recognized text is readable and the meaning can be recovered from context even when individual words are off.&lt;/p&gt;
&lt;p&gt;The automation itself is reliable. Once the cron job is set up, I don&amp;rsquo;t think about it. Notes appear. The hash deduplication means I can re-run the script manually without worrying about duplicates accumulating.&lt;/p&gt;
&lt;p&gt;The Org integration is the payoff. Once notes are in &lt;code&gt;.org&lt;/code&gt; files, everything Emacs offers is available. Tags, scheduling, refiling into project files, linking to related notes — none of that required any special work on the reMarkable side. The device just needed to get the text into a file.&lt;/p&gt;
&lt;h2 id="limitations-and-honest-trade-offs"&gt;Limitations and honest trade-offs&lt;/h2&gt;
&lt;p&gt;The Czech diacritics problem is real. Fast or slightly sloppy handwriting, especially with Czech-specific characters like &lt;code&gt;ě&lt;/code&gt;, &lt;code&gt;š&lt;/code&gt;, &lt;code&gt;č&lt;/code&gt;, &lt;code&gt;ř&lt;/code&gt;, produces more recognition errors than clean, unaccented text. &amp;ldquo;Přečíst knihu o Kafkovi&amp;rdquo; might come out as &amp;ldquo;Precist knihu o Kafkovi&amp;rdquo;. Readable and context-restorable, but not clean. For notes that matter, a proofread pass is necessary.&lt;/p&gt;
&lt;p&gt;Symbols don&amp;rsquo;t transfer. Diagrams, arrows, flowcharts, mathematical notation — anything that isn&amp;rsquo;t text is either garbled or empty in the output. A page with a hand-drawn architecture diagram produces only the text labels, if anything. This is a known constraint of the HWR approach, not a bug in the implementation.&lt;/p&gt;
&lt;p&gt;The pipeline is one-directional and always will be. reMarkable is a capture device. You write on it; the notes flow to Org. Nothing flows back. This is fine for my workflow, but worth stating clearly.&lt;/p&gt;
&lt;p&gt;There are two external dependencies worth noting. rmapi requires reMarkable cloud credentials — if you don&amp;rsquo;t want to use the cloud, this pipeline doesn&amp;rsquo;t work as described. MyScript is a third-party API that requires registration and could change pricing or availability. The free tier has been stable, but it&amp;rsquo;s not self-hosted.&lt;/p&gt;
&lt;p&gt;Sync is nightly, not real-time. If I write something at 10pm and need it in Emacs immediately, I run the script by hand. But the default cadence is once a night, and for most of what I write, that&amp;rsquo;s enough.&lt;/p&gt;
&lt;h2 id="how-to-replicate-this"&gt;How to replicate this&lt;/h2&gt;
&lt;p&gt;The setup requires some comfort with Python, command-line tools, and cron. It&amp;rsquo;s not complex, but it&amp;rsquo;s also not a one-click install.&lt;/p&gt;
&lt;p&gt;Tools you need:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;rmapi&lt;/em&gt; — CLI for reMarkable cloud. Handles authentication and &lt;code&gt;.rmdoc&lt;/code&gt; download. (&lt;a href="https://github.com/juruen/rmapi"&gt;GitHub: juruen/rmapi&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;em&gt;MyScript Cloud API&lt;/em&gt; — Handwriting recognition. Register at &lt;a href="https://developer.myscript.com/"&gt;developer.myscript.com&lt;/a&gt; for a free API key. The batch endpoint accepts reMarkable page data directly.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Python 3&lt;/em&gt; — For the post-processing script that structures recognized text into Org format. Standard library is sufficient; no exotic dependencies.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;cron&lt;/em&gt; — For nightly scheduling.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The rough flow: configure rmapi with your reMarkable credentials, register for a MyScript API key, wire up the Python script to call both, point it at your Org directory, and schedule it. The deduplication cache is a simple JSON file mapping page hashes to recognized text — straightforward to implement.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;m not publishing the script as a ready-made package because it&amp;rsquo;s too tied to my specific notebook structure and Org conventions. But the components are all documented, the API is straightforward, and the overall architecture is simple enough to re-implement in an afternoon.&lt;/p&gt;
&lt;p&gt;The result: I write on paper and my agenda knows about it by morning. Not magic — just a cron job and a working API key.&lt;/p&gt;</description></item><item><title> Fixing macOS Zoom "Follow Keyboard Focus" in GNU Emacs</title><link>https://sukany.cz/blog/2026-02-23-emacs-macos-zoom-fix/</link><pubDate>Mon, 23 Feb 2026 00:00:00 +0000</pubDate><guid>https://sukany.cz/blog/2026-02-23-emacs-macos-zoom-fix/</guid><description>&lt;p&gt;I run macOS Accessibility Zoom at 16× magnification. Not occasionally — all the time, every day. Apple&amp;rsquo;s built-in screen magnifier has a mode called &amp;ldquo;Follow keyboard focus&amp;rdquo; that&amp;rsquo;s supposed to track your text cursor as you type, keeping it visible on screen. Every app I use does this correctly. Terminal, VS Code, Safari, iTerm2 — they all work. Emacs did not.&lt;/p&gt;
&lt;p&gt;For years.&lt;/p&gt;
&lt;p&gt;I type something. The cursor moves. The Zoom viewport doesn&amp;rsquo;t follow. I have to scroll manually to find it again. Then type another character. Repeat. If you&amp;rsquo;re reading this without needing magnification, the description might sound like a minor inconvenience. It isn&amp;rsquo;t. It&amp;rsquo;s the kind of friction that makes a tool feel broken — and I use Emacs all day.&lt;/p&gt;
&lt;p&gt;So I finally decided to fix it properly.&lt;/p&gt;
&lt;h2 id="the-obvious-first-attempt"&gt;The Obvious First Attempt&lt;/h2&gt;
&lt;p&gt;The &amp;ldquo;Follow keyboard focus&amp;rdquo; feature in macOS Zoom is event-driven. When a focused UI element changes, Zoom picks it up via the Accessibility API and moves the viewport to where that element is on screen. The standard mechanism for announcing these changes is &lt;code&gt;NSAccessibilityPostNotification()&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Seemed straightforward: after each cursor draw, post a notification telling Zoom the selection changed.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-objc" data-lang="objc"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;NSAccessibilityPostNotification&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NSAccessibilitySelectedTextChangedNotification&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;NSAccessibilityPostNotification&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NSAccessibilityFocusedUIElementChangedNotification&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;I added this to &lt;code&gt;ns_draw_window_cursor()&lt;/code&gt; in &lt;code&gt;nsterm.m&lt;/code&gt;, rebuilt, tested.&lt;/p&gt;
&lt;p&gt;Nothing.&lt;/p&gt;
&lt;p&gt;The viewport didn&amp;rsquo;t move at all.&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s why. When Zoom receives &lt;code&gt;AXSelectedTextChanged&lt;/code&gt; or &lt;code&gt;AXFocusedUIElementChanged&lt;/code&gt;, it doesn&amp;rsquo;t just accept the notification and move on — it &lt;em&gt;queries back&lt;/em&gt;. It calls &lt;code&gt;AXBoundsForRange&lt;/code&gt; on the focused element to find out &lt;em&gt;where&lt;/em&gt; the cursor actually is. To answer that query, the view needs to conform to the &lt;code&gt;NSAccessibility&lt;/code&gt; protocol and implement &lt;code&gt;accessibilityBoundsForRange:&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;EmacsView&lt;/code&gt;, the main Emacs drawing surface in &lt;code&gt;nsterm.m&lt;/code&gt;, is a subclass of &lt;code&gt;NSView&lt;/code&gt;. It does not declare &lt;code&gt;NSAccessibility&lt;/code&gt; protocol conformance. There&amp;rsquo;s no &lt;code&gt;@interface EmacsView () &amp;lt;NSAccessibility&amp;gt;&lt;/code&gt;, no implementation of &lt;code&gt;accessibilityBoundsForRange:&lt;/code&gt;. So when Zoom sends the query, it gets nothing back. No bounds. Zoom shrugs and does nothing.&lt;/p&gt;
&lt;p&gt;The notification fires. Zoom hears it. Zoom asks &amp;ldquo;okay, so where is the cursor?&amp;rdquo; Emacs cannot answer. The viewport stays put.&lt;/p&gt;
&lt;p&gt;I could have gone down the road of implementing proper &lt;code&gt;NSAccessibility&lt;/code&gt; conformance on &lt;code&gt;EmacsView&lt;/code&gt;. That would technically work. It would also be a massive undertaking — you&amp;rsquo;d need a full accessibility tree, element hierarchy, all the associated protocol methods. A multi-month project, not a patch. I needed something more surgical.&lt;/p&gt;
&lt;h2 id="finding-the-real-answer"&gt;Finding the Real Answer&lt;/h2&gt;
&lt;p&gt;When you&amp;rsquo;re stuck on an obscure macOS API problem, the most useful thing you can do is read the source code of other apps that solved the same problem. iTerm2 is open source. So is Chromium.&lt;/p&gt;
&lt;p&gt;Both of them have exactly the same situation as Emacs: a custom &lt;code&gt;NSView&lt;/code&gt; for terminal or browser rendering that doesn&amp;rsquo;t expose a full accessibility tree. And both of them needed Zoom to follow the text cursor. I went looking for how they handled it.&lt;/p&gt;
&lt;p&gt;In iTerm2&amp;rsquo;s &lt;code&gt;PTYTextView.m&lt;/code&gt;, there&amp;rsquo;s a method called &lt;code&gt;refreshAccessibility&lt;/code&gt;. It calls a function I hadn&amp;rsquo;t seen before: &lt;code&gt;UAZoomChangeFocus()&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Chromium&amp;rsquo;s &lt;code&gt;render_widget_host_view_mac.mm&lt;/code&gt; does the same thing in its cursor tracking code.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;UAZoomChangeFocus()&lt;/code&gt; is part of &lt;code&gt;HIServices/UniversalAccess.h&lt;/code&gt;, accessible via &lt;code&gt;Carbon/Carbon.h&lt;/code&gt;. It&amp;rsquo;s a Carbon-era API that speaks directly to the Zoom subsystem — bypassing the Accessibility notification infrastructure entirely. No protocol conformance required. No callback. No &amp;ldquo;where is the cursor?&amp;rdquo; query. You just call it with the cursor&amp;rsquo;s screen coordinates and it moves the viewport.&lt;/p&gt;
&lt;p&gt;The signature:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-c" data-lang="c"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;OSStatus&lt;/span&gt; &lt;span class="nf"&gt;UAZoomChangeFocus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;CGRect&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;focusedItemBounds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;CGRect&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;caretBounds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;UAZoomChangeFocusType&lt;/span&gt; &lt;span class="n"&gt;focusType&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;code&gt;focusType&lt;/code&gt; argument is &lt;code&gt;kUAZoomFocusTypeInsertionPoint&lt;/code&gt;, which tells Zoom this is a text cursor — triggering exactly the keyboard focus tracking behavior I needed.&lt;/p&gt;
&lt;p&gt;This was the real fix. Not a notification, not an accessibility protocol — a direct API call with explicit coordinates.&lt;/p&gt;
&lt;h2 id="the-fix-and-a-coordinate-problem"&gt;The Fix, and a Coordinate Problem&lt;/h2&gt;
&lt;p&gt;The implementation goes into &lt;code&gt;ns_draw_window_cursor()&lt;/code&gt; in &lt;code&gt;nsterm.m&lt;/code&gt;, inside the &lt;code&gt;#ifdef NS_IMPL_COCOA&lt;/code&gt; block. When Emacs draws the cursor, we know exactly where it is in view-local coordinates. From there it&amp;rsquo;s a coordinate conversion chain:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Convert cursor rect from view-local to window coordinates&lt;/li&gt;
&lt;li&gt;Convert window coordinates to screen coordinates (AppKit convention)&lt;/li&gt;
&lt;li&gt;Convert to &lt;code&gt;CGRect&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Call &lt;code&gt;UAZoomChangeFocus()&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Steps 1–3 are standard AppKit. Step 4 would have been trivial, except that the rect coming out of step 3 is still expressed in AppKit&amp;rsquo;s coordinate system, a mismatch that took me a while to sort out.&lt;/p&gt;
&lt;p&gt;macOS has two coordinate conventions. AppKit (NSView, NSWindow) uses bottom-left as the origin, with y increasing upward. CoreGraphics (CGRect, HIServices) uses top-left as the origin, with y increasing downward. &lt;code&gt;UAZoomChangeFocus()&lt;/code&gt; expects CoreGraphics screen coordinates.&lt;/p&gt;
&lt;p&gt;The natural way to do this conversion is &lt;code&gt;accessibilityConvertScreenRect:&lt;/code&gt;, which handles the y-flip for you. But — and here&amp;rsquo;s the catch — that method is declared on objects that conform to the &lt;code&gt;NSAccessibility&lt;/code&gt; protocol. Which, as we&amp;rsquo;ve established, &lt;code&gt;EmacsView&lt;/code&gt; does not.&lt;/p&gt;
&lt;p&gt;I tried calling it anyway. Compilation error.&lt;/p&gt;
&lt;p&gt;So: manual y-flip. The primary screen height is the key reference point:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-objc" data-lang="objc"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;CGFloat&lt;/span&gt; &lt;span class="n"&gt;primaryH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[[&lt;/span&gt;&lt;span class="n"&gt;NSScreen&lt;/span&gt; &lt;span class="n"&gt;screens&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="n"&gt;firstObject&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;cgRect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;primaryH&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;cgRect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;cgRect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This converts the AppKit screen y-coordinate to the CoreGraphics y-coordinate using simple arithmetic. No protocol conformance needed.&lt;/p&gt;
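&lt;p&gt;To check the arithmetic with concrete numbers, here is the same flip as a tiny Python function. The 900-point screen height and the cursor rect are made-up values, and &lt;code&gt;appkit_to_cg_y&lt;/code&gt; is a hypothetical helper, not part of the patch.&lt;/p&gt;

```python
def appkit_to_cg_y(primary_height: float, y: float, height: float) -> float:
    """Flip a rect's y origin from AppKit (bottom-left) to CoreGraphics (top-left)."""
    return primary_height - y - height


# Hypothetical numbers: a 20-point-tall cursor whose bottom edge sits
# 100 points above the bottom of a 900-point-tall primary screen.
cg_y = appkit_to_cg_y(900.0, 100.0, 20.0)
print(cg_y)  # 780.0: the cursor's top edge is 780 points below the top of the screen
```

&lt;p&gt;Subtracting the rect height matters because AppKit anchors a rect at its bottom edge while CoreGraphics anchors it at its top edge.&lt;/p&gt;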
&lt;p&gt;The full implementation:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-objc" data-lang="objc"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;UAZoomEnabled&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;NSRect&lt;/span&gt; &lt;span class="n"&gt;windowRect&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt; &lt;span class="nl"&gt;convertRect&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="nl"&gt;toView&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;NSRect&lt;/span&gt; &lt;span class="n"&gt;screenRect&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt; &lt;span class="n"&gt;window&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="nl"&gt;convertRectToScreen&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;windowRect&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;CGRect&lt;/span&gt; &lt;span class="n"&gt;cgRect&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NSRectToCGRect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;screenRect&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;CGFloat&lt;/span&gt; &lt;span class="n"&gt;primaryH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[[&lt;/span&gt;&lt;span class="n"&gt;NSScreen&lt;/span&gt; &lt;span class="n"&gt;screens&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="n"&gt;firstObject&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;cgRect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;primaryH&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;cgRect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;cgRect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;UAZoomChangeFocus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;cgRect&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;cgRect&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kUAZoomFocusTypeInsertionPoint&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;code&gt;UAZoomEnabled()&lt;/code&gt; check at the top is important: when Zoom isn&amp;rsquo;t active, the coordinate conversion and the &lt;code&gt;UAZoomChangeFocus()&lt;/code&gt; call are skipped entirely, so the common case pays only for the check itself.&lt;/p&gt;
&lt;p&gt;One build note: after patching and rebuilding Emacs.app, you need to re-grant Accessibility permission in System Settings. The binary hash changes, macOS treats it as a new application, and Accessibility permissions are keyed to the binary.&lt;/p&gt;
&lt;h2 id="it-works"&gt;It Works&lt;/h2&gt;
&lt;p&gt;After the rebuild, I enabled Zoom at 16×, opened Emacs, started typing. The viewport followed the cursor. Every character, every line movement, every jump across the file — Zoom tracked it.&lt;/p&gt;
&lt;p&gt;I typed in Emacs for ten minutes just to make sure I wasn&amp;rsquo;t imagining it.&lt;/p&gt;
&lt;p&gt;The fix itself is small — about fifteen lines of Objective-C in &lt;code&gt;ns_draw_window_cursor()&lt;/code&gt;. The journey to find it was longer: trying the notification approach, understanding why it failed, going through the Accessibility API documentation, reading iTerm2 and Chromium source, finding &lt;code&gt;UAZoomChangeFocus()&lt;/code&gt;, working through the coordinate system issue, hitting the compilation error on &lt;code&gt;accessibilityConvertScreenRect:&lt;/code&gt;, figuring out the manual y-flip. That&amp;rsquo;s the actual work. The patch is just the result of it.&lt;/p&gt;
&lt;p&gt;The patch has been submitted to the GNU Emacs developers. Hopefully it lands in a future release so nobody else has to track this down. This was a long-standing problem — Emacs being the one editor on macOS that didn&amp;rsquo;t work with &amp;ldquo;Follow keyboard focus&amp;rdquo; — and it&amp;rsquo;s finally resolved. Everything works beautifully now.&lt;/p&gt;
&lt;p&gt;If you&amp;rsquo;re building a custom &lt;code&gt;NSView&lt;/code&gt; on macOS and need Zoom compatibility, skip the accessibility notification approach and go straight to &lt;code&gt;UAZoomChangeFocus()&lt;/code&gt;. That&amp;rsquo;s the right tool for the job.&lt;/p&gt;</description></item></channel></rss>