<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Zero-Shot Log</title><description>Tech Blog by Ten</description><link>https://zeroshotlog.com/</link><language>en-us</language><item><title>Claude Code Routines × GitHub: Pitfalls and Workarounds</title><link>https://zeroshotlog.com/en/blog/2026/04/26/claude-code-routine-github-mcp-pitfalls/</link><guid isPermaLink="true">https://zeroshotlog.com/en/blog/2026/04/26/claude-code-routine-github-mcp-pitfalls/</guid><description>A breakdown of timeout limits and sandbox authentication constraints I hit when saving files via the GitHub MCP from Claude Code Routines, along with workarounds for each.</description><pubDate>Sun, 26 Apr 2026 12:00:00 GMT</pubDate><content:encoded>&lt;h2&gt;TL;DR&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;The GitHub MCP&apos;s &lt;code&gt;create_or_update_file&lt;/code&gt; triggers a &lt;strong&gt;Stream idle timeout&lt;/strong&gt; when saving large files (noticeable above ~11,500 bytes).&lt;/li&gt;
&lt;li&gt;The sandbox is intentionally locked down, so &lt;code&gt;git push&lt;/code&gt; over Bash isn&apos;t a viable fallback. You have to solve the problem inside the MCP boundary.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h2&gt;Background&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;The information in this article reflects the state of Claude Code Routines as of April 2026. It&apos;s still in research preview, so behavior may change.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Released in April 2026 as a research preview, &lt;strong&gt;&lt;a href=&quot;https://claude.ai/code/routines&quot;&gt;Claude Code Routines&lt;/a&gt;&lt;/strong&gt; runs prompts on Claude.ai automatically — triggered by schedules, API calls, or GitHub events (available on Pro / Max / Team / Enterprise plans).&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/2026/04/26/routine-creation-form.png&quot; alt=&quot;Routine creation form. Configure triggers (schedule / GitHub event / API) and connectors&quot; /&gt;&lt;/p&gt;
&lt;p&gt;GitHub integration uses the account-level GitHub connection on Claude.ai (either via the GitHub App or a &lt;code&gt;gh&lt;/code&gt; token sync through &lt;code&gt;/web-setup&lt;/code&gt;). The &quot;Connectors&quot; section in the Routine creation form lists services like Slack and Linear, but GitHub isn&apos;t there. Once your account is connected to GitHub, adding a repository to a Routine lets that Routine read and write files in the repo.&lt;/p&gt;
&lt;p&gt;When a Routine committed files to a GitHub repository in my environment, it used the MCP tool &lt;code&gt;mcp__github__create_or_update_file&lt;/code&gt;. &quot;Auto-commit the daily output to GitHub&quot; sounds like an obvious fit — but in practice I ran into unexpected limits with that MCP call. This post shares the actual blockers and the workarounds I landed on.&lt;/p&gt;
&lt;h2&gt;Issue 1: MCP Timeout — Large Files Won&apos;t Save&lt;/h2&gt;
&lt;h3&gt;Symptom&lt;/h3&gt;
&lt;p&gt;I tried to save a Routine-generated report (~30KB Markdown) via &lt;code&gt;mcp__github__create_or_update_file&lt;/code&gt; and hit:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;API Error: Stream idle timeout - partial response received
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Retrying didn&apos;t help. Shrinking the file content made it work, so the cause was clearly file-size related.&lt;/p&gt;
&lt;h3&gt;Investigation: GitHub API limit?&lt;/h3&gt;
&lt;p&gt;My first guess was a GitHub REST API file size limit. But the GitHub Contents API allows up to &lt;strong&gt;100MB&lt;/strong&gt;, so a ~30KB text file shouldn&apos;t be anywhere near the cap.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;GitHub Contents API: 100MB max file size
Actual file: ~30KB
→ Not the API&apos;s limit
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Investigation: Claude.ai platform timeout?&lt;/h3&gt;
&lt;p&gt;Next guess: a timeout on the Claude.ai side of the MCP call.&lt;/p&gt;
&lt;p&gt;Claude Code CLI exposes an &lt;code&gt;MCP_TIMEOUT&lt;/code&gt; env var, but that mostly governs MCP server startup — it&apos;s unclear whether it controls per-tool-call timeouts. Either way, the Routines cloud runtime gives users no way to set this value.&lt;/p&gt;
&lt;h3&gt;Findings&lt;/h3&gt;
&lt;p&gt;Given the error message (&lt;code&gt;Stream idle timeout&lt;/code&gt;) and its correlation with file size, the most likely cause is a &lt;strong&gt;per-MCP-tool-call timeout&lt;/strong&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;create_or_update_file&lt;/code&gt; Base64-encodes the content and ships it to the GitHub API.&lt;/li&gt;
&lt;li&gt;Larger payloads take longer to process; the response likely doesn&apos;t arrive before the call times out.&lt;/li&gt;
&lt;li&gt;There&apos;s no user-facing way to adjust per-tool MCP timeouts.&lt;/li&gt;
&lt;li&gt;In my measurements, calls under ~11,500 bytes were stable; from ~13,000–15,000 bytes upward, timeouts kicked in.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Approximate threshold from my own runs:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;File size&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;~5,000 bytes&lt;/td&gt;
&lt;td&gt;Stable success&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;~8,500 bytes&lt;/td&gt;
&lt;td&gt;Stable success&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;~11,500 bytes&lt;/td&gt;
&lt;td&gt;Success (near the ceiling)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;~15,000+ bytes&lt;/td&gt;
&lt;td&gt;Timeout&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: These thresholds shift based on network conditions and server load. As a practical safety margin, &lt;strong&gt;keeping each file under 10,000 bytes&lt;/strong&gt; worked reliably for me.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Can &lt;code&gt;git push&lt;/code&gt; over Bash help?&lt;/h3&gt;
&lt;p&gt;I also checked whether I could bypass the MCP tool by saving files via Bash. Routines have a code execution capability, so a &lt;code&gt;git push&lt;/code&gt; from a shell sounded plausible.&lt;/p&gt;
&lt;p&gt;But inspecting the sandbox: no &lt;code&gt;GITHUB_TOKEN&lt;/code&gt;, no &lt;code&gt;GH_TOKEN&lt;/code&gt; env var, no &lt;code&gt;gh&lt;/code&gt; CLI, no SSH keys, no &lt;code&gt;~/.netrc&lt;/code&gt;. There&apos;s simply no GitHub auth surface inside the shell.&lt;/p&gt;
&lt;p&gt;This is the right design from a security standpoint. Anthropic&apos;s docs explicitly state that git credentials and signing keys are not placed inside the sandbox; GitHub operations are routed through a secure proxy with scoped credentials. By keeping raw git push credentials away from the LLM, the risk of unintended repo operations is contained.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/2026/04/26/routine-sandbox-auth.jpg&quot; alt=&quot;Routine sandbox auth model. Bash-based git push has no credentials and is blocked; only MCP can reach GitHub&quot; /&gt;&lt;/p&gt;
&lt;p&gt;So Bash-based git push is &lt;em&gt;not&lt;/em&gt; available as a fallback for the timeout issue. You have to solve it inside the MCP boundary.&lt;/p&gt;
&lt;h2&gt;Workarounds&lt;/h2&gt;
&lt;h3&gt;Avoiding the Timeout: Chunked File Saves&lt;/h3&gt;
&lt;p&gt;To work around the MCP timeout, the approach I settled on is &lt;strong&gt;splitting content into chunks of &amp;lt;= 10,000 bytes and saving each piece&lt;/strong&gt;.&lt;/p&gt;
&lt;h4&gt;Splitting rules in the prompt&lt;/h4&gt;
&lt;p&gt;I encode the splitting rules directly into the Routine&apos;s prompt:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;### File size limit and save procedure

Due to MCP timeout limits, a single `create_or_update_file` call can save
**up to ~10,000 bytes** of content.

If the output exceeds 10,000 bytes:
1. Split the content into sections (each part &amp;lt;= 10,000 bytes).
2. Call `create_or_update_file` for Part 1.
3. Wait for success, then save Part 2 (no parallel calls).
4. After all parts are saved, append links to each part at the end of Part 1.

If a timeout occurs:
→ Split into smaller chunks and retry (target &amp;lt;= 8,000 bytes per part).
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The key is to &lt;strong&gt;explicitly forbid parallel calls&lt;/strong&gt;. LLMs love to parallelize tool calls for efficiency, but here that&apos;s counterproductive.&lt;/p&gt;
&lt;p&gt;For navigation between parts, appending a link list to Part 1 makes the report much easier to browse on GitHub. To update Part 1 later, pass the SHA returned in Part 1&apos;s creation response back into &lt;code&gt;create_or_update_file&lt;/code&gt;.&lt;/p&gt;
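&lt;p&gt;The splitting step itself is mechanical enough to sketch. The following is a rough illustration, not the Routine&apos;s actual logic: it assumes a 10,000-byte budget and splits on top-level &lt;code&gt;## &lt;/code&gt; Markdown sections so each part stays readable.&lt;/p&gt;

```python
def split_report(markdown: str, max_bytes: int = 10_000) -> list[str]:
    """Split a Markdown report into parts that each fit the byte budget.

    Splits on "## " section boundaries; a single oversized section
    still ends up alone in its own (possibly too-big) part.
    """
    parts: list[str] = []
    current = ""
    for section in markdown.split("\n## "):
        # split() stripped the delimiter; re-attach it except on the first piece.
        piece = section if not parts and not current else "## " + section
        candidate = current + ("\n" if current else "") + piece
        if len(candidate.encode("utf-8")) > max_bytes and current:
            parts.append(current)
            current = piece
        else:
            current = candidate
    if current:
        parts.append(current)
    return parts


report = "# Daily report\n\n" + "\n## ".join(f"Section {i}\n" + "x" * 4000 for i in range(6))
parts = split_report(report)
print(len(parts))  # 3
```

&lt;p&gt;In practice the Routine&apos;s LLM does this from the prompt rules rather than by running code, but thinking of it as the algorithm above made the prompt rules easier to write.&lt;/p&gt;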
&lt;h4&gt;Trade-offs of chunked saves&lt;/h4&gt;
&lt;p&gt;This approach has clear downsides.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Worse readability on GitHub&lt;/strong&gt;: One report becomes multiple files; readers have to jump between them.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;More API calls&lt;/strong&gt;: A 3-way split needs at least 3 MCP calls (+1 to append the link list). Routine execution time grows.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;More complex prompt&lt;/strong&gt;: Splitting logic clutters the prompt with concerns unrelated to the actual task.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I still chose this approach because nothing else was practical. Bash-based &lt;code&gt;git push&lt;/code&gt; is blocked by the sandbox design, and per-tool-call timeouts can&apos;t be tuned. By elimination, file splitting was the most reliable workaround.&lt;/p&gt;
&lt;h3&gt;Side note: Watch out for timezone drift&lt;/h3&gt;
&lt;p&gt;It&apos;s no surprise that the cloud runtime is UTC, but because Routine schedules are configured in JST, it&apos;s easy to assume internal date logic will also be JST.&lt;/p&gt;
&lt;p&gt;In practice, when an LLM determines &quot;today&apos;s date,&quot; it sometimes uses UTC. Run at JST 4/25 07:00 — that&apos;s UTC 4/24 22:00, so date-based file names and commit messages can land on the previous day.&lt;/p&gt;
&lt;p&gt;A timezone directive at the top of the prompt prevents this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;**⚠️ Timezone: All date/time logic must use JST (UTC+9).**
Dates, weekday checks, file names, and commit messages all use JST.
If the runtime is UTC, add 9 hours before deciding.
&lt;/code&gt;&lt;/pre&gt;
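&lt;p&gt;For date logic that runs as code rather than inside the model, doing the conversion explicitly removes the ambiguity entirely. A minimal sketch (the file-name pattern is my own convention):&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9))

def report_filename(now_utc: datetime) -> str:
    """Name date-based files in JST, regardless of the runtime's clock."""
    return now_utc.astimezone(JST).strftime("report-%Y-%m-%d.md")

# JST 4/25 07:00 is UTC 4/24 22:00: the JST date should win.
run_at = datetime(2026, 4, 24, 22, 0, tzinfo=timezone.utc)
print(report_filename(run_at))  # report-2026-04-25.md
```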
&lt;h2&gt;Summary&lt;/h2&gt;
&lt;p&gt;The main limits I hit when using the GitHub MCP from Claude Code Routines:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Issue&lt;/th&gt;
&lt;th&gt;Cause&lt;/th&gt;
&lt;th&gt;Workaround&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Large files fail to save&lt;/td&gt;
&lt;td&gt;Per-MCP-tool-call timeout&lt;/td&gt;
&lt;td&gt;Split into chunks &amp;lt;= 10,000 bytes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bash-based &lt;code&gt;git push&lt;/code&gt; unusable&lt;/td&gt;
&lt;td&gt;Sandbox security design&lt;/td&gt;
&lt;td&gt;Stay within the MCP boundary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Date/weekday drift&lt;/td&gt;
&lt;td&gt;Runtime is UTC&lt;/td&gt;
&lt;td&gt;Force JST in the prompt&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Given that Claude Code Routines is still in research preview, these limits will likely improve over time. In particular, exposing per-tool MCP timeouts as a configurable value would close the most painful gap, and is plausible if enough users surface the need.&lt;/p&gt;
&lt;p&gt;For now, the most practical approach is to &lt;strong&gt;understand the limits and work around them at the prompt level&lt;/strong&gt;. Hopefully this saves someone hitting the same wall.&lt;/p&gt;
&lt;h2&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://code.claude.com/docs/en/routines&quot;&gt;Claude Code Docs — Routines&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://code.claude.com/docs/en/mcp&quot;&gt;Claude Code Docs — MCP&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://code.claude.com/docs/en/env-vars&quot;&gt;Claude Code Docs — Environment Variables&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.anthropic.com/engineering/claude-code-sandboxing&quot;&gt;Anthropic Engineering — Claude Code Sandboxing&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</content:encoded><enclosure url="https://zeroshotlog.com/images/2026/04/26/routine-github-pitfalls-hero.png" length="0" type="image/png"/></item><item><title>Enabling and Using Codex&apos;s experimental Hooks</title><link>https://zeroshotlog.com/en/blog/2026/03/22/codex-hooks/</link><guid isPermaLink="true">https://zeroshotlog.com/en/blog/2026/03/22/codex-hooks/</guid><description>How I discovered Codex&apos;s experimental Hooks feature, enabled it, and reverse-engineered its config structure and supported events by reading the source.</description><pubDate>Sun, 22 Mar 2026 12:00:00 GMT</pubDate><content:encoded>&lt;h2&gt;TL;DR&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Codex ships an &quot;under development&quot; Hooks feature that you can enable with &lt;code&gt;codex features enable codex_hooks&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;It supports three events: &lt;code&gt;SessionStart&lt;/code&gt;, &lt;code&gt;UserPromptSubmit&lt;/code&gt;, and &lt;code&gt;Stop&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h2&gt;Background&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;This article was verified against Codex v0.115.0 (March 2026). Since this is an in-development feature, behavior may change in future versions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Codex doesn&apos;t officially have a Hooks feature yet. The official docs say nothing about it, and the &lt;a href=&quot;https://developers.openai.com/codex/changelog&quot;&gt;Changelog&lt;/a&gt; only mentions a one-liner: &quot;experimental hooks engine.&quot;&lt;/p&gt;
&lt;p&gt;But it turns out a working Hooks implementation &lt;strong&gt;is already shipping&lt;/strong&gt; under the &quot;under development&quot; status. Running &lt;code&gt;codex features list&lt;/code&gt; revealed a &lt;code&gt;codex_hooks&lt;/code&gt; feature flag, and once enabled, hooks actually fire.&lt;/p&gt;
&lt;p&gt;The trouble is that the config JSON structure and the list of supported events are documented nowhere — so I read the Rust source in the OSS &lt;a href=&quot;https://github.com/openai/codex&quot;&gt;openai/codex repository&lt;/a&gt; (Apache License 2.0), specifically &lt;code&gt;codex-rs/hooks/&lt;/code&gt;, to nail down the spec. This post summarizes what I found.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;1. Discovery and enablement&lt;/h2&gt;
&lt;h3&gt;Checking the feature flag&lt;/h3&gt;
&lt;p&gt;Codex has a &lt;code&gt;codex features list&lt;/code&gt; command that prints the feature flag list. That&apos;s how I learned about Hooks in the first place: I ran it and saw &lt;code&gt;codex_hooks&lt;/code&gt; listed as &quot;under development.&quot;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ codex features list
codex_hooks    under development  false
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Disabled by default.&lt;/p&gt;
&lt;h3&gt;Enable it&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;# persisted to config.toml
codex features enable codex_hooks
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once enabled, you&apos;ll see this warning at startup:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;⚠ Under-development features enabled: codex_hooks. Under-development features are incomplete
  and may behave unpredictably.
&lt;/code&gt;&lt;/pre&gt;
&lt;hr /&gt;
&lt;h2&gt;2. Configuration file&lt;/h2&gt;
&lt;h3&gt;Path and format&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Global config&lt;/td&gt;
&lt;td&gt;&lt;code&gt;~/.codex/hooks.json&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Project config&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;project&amp;gt;/.codex/hooks.json&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Format&lt;/td&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Codex&apos;s general config lives in &lt;code&gt;~/.codex/config.toml&lt;/code&gt; (TOML), but Hooks config is a &lt;strong&gt;separate &lt;code&gt;hooks.json&lt;/code&gt; file in JSON&lt;/strong&gt;. Hooks written into &lt;code&gt;config.toml&lt;/code&gt; won&apos;t be picked up.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;~/.codex/
├── config.toml      ← general settings &amp;amp; feature flags
└── hooks.json       ← hooks config (JSON)
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;JSON structure&lt;/h3&gt;
&lt;p&gt;The structure is a bit unusual: instead of putting handlers directly under each event name, you need an &lt;strong&gt;extra wrapper layer&lt;/strong&gt;. In the source (&lt;code&gt;codex-rs/hooks/src/engine/config.rs&lt;/code&gt;), it&apos;s defined as a &lt;code&gt;MatcherGroup&lt;/code&gt; struct with &lt;code&gt;matcher&lt;/code&gt; (a regex filter) and &lt;code&gt;hooks&lt;/code&gt; (an array of handlers).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;hooks.&amp;lt;EventName&amp;gt;[].matcher  → filter condition (optional)
hooks.&amp;lt;EventName&amp;gt;[].hooks[]  → array of handlers to run
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A concrete JSON looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  &quot;hooks&quot;: {
    &quot;SessionStart&quot;: [
      {
        &quot;hooks&quot;: [
          {
            &quot;type&quot;: &quot;command&quot;,
            &quot;command&quot;: &quot;echo &apos;session started&apos;&quot;,
            &quot;timeout&quot;: 10
          }
        ]
      }
    ],
    &quot;Stop&quot;: [
      {
        &quot;hooks&quot;: [
          {
            &quot;type&quot;: &quot;command&quot;,
            &quot;command&quot;: &quot;echo &apos;session stopped&apos;&quot;,
            &quot;timeout&quot;: 10
          }
        ]
      }
    ]
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;timeout&lt;/code&gt; is optional and defaults to 600 seconds.&lt;/p&gt;
&lt;h3&gt;What doesn&apos;t work&lt;/h3&gt;
&lt;p&gt;If you don&apos;t know about the nested structure and put handlers directly under the event array, nothing fires:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  &quot;hooks&quot;: {
    &quot;SessionStart&quot;: [
      {
        &quot;type&quot;: &quot;command&quot;,
        &quot;command&quot;: &quot;echo &apos;session started&apos;&quot;
      }
    ]
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I actually tried this form first, hit silence, and only realized the nesting was required after reading the source. With no official docs, the source is the only accurate reference.&lt;/p&gt;
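&lt;p&gt;Because a mis-nested file fails silently, a quick structural check before pointing Codex at it can save a debugging round trip. This sketch encodes only the nesting rule described above, not the full schema from &lt;code&gt;config.rs&lt;/code&gt;:&lt;/p&gt;

```python
import json

def check_hooks_config(text: str) -> list[str]:
    """Flag the common mistake: a handler placed directly in the event
    array instead of inside a matcher group's "hooks" list."""
    problems = []
    for event, groups in json.loads(text).get("hooks", {}).items():
        for i, group in enumerate(groups):
            if "type" in group:
                problems.append(f"{event}[{i}]: handler at group level; wrap it in a 'hooks' array")
            elif "hooks" not in group:
                problems.append(f"{event}[{i}]: matcher group has no 'hooks' array")
    return problems

bad = '{"hooks": {"SessionStart": [{"type": "command", "command": "echo hi"}]}}'
good = '{"hooks": {"SessionStart": [{"hooks": [{"type": "command", "command": "echo hi"}]}]}}'
print(check_hooks_config(bad))   # flags the flat handler
print(check_hooks_config(good))  # []
```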
&lt;hr /&gt;
&lt;h2&gt;3. Supported events&lt;/h2&gt;
&lt;h3&gt;The three supported events&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;th&gt;When it fires&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SessionStart&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;When a session starts or resumes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;UserPromptSubmit&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;When the user submits a prompt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Stop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;When the agent finishes responding&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Tool execution and subagent-related events aren&apos;t implemented yet.&lt;/p&gt;
&lt;h3&gt;Handler types&lt;/h3&gt;
&lt;p&gt;A &quot;handler type&quot; specifies what to actually run when a hook fires. The source (&lt;code&gt;codex-rs/hooks/src/engine/config.rs&lt;/code&gt;) defines three types — &lt;code&gt;command&lt;/code&gt;, &lt;code&gt;prompt&lt;/code&gt;, and &lt;code&gt;agent&lt;/code&gt; — but only &lt;code&gt;command&lt;/code&gt; (run a shell command) actually works today.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Behavior&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;command&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Runs a shell command&lt;/td&gt;
&lt;td&gt;Works&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;prompt&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Injects a prompt into the agent&apos;s context&lt;/td&gt;
&lt;td&gt;Not implemented (logs &lt;code&gt;&quot;skipping prompt hook&quot;&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;agent&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Spawns another agent to handle the work&lt;/td&gt;
&lt;td&gt;Not implemented&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;code&gt;async: true&lt;/code&gt; (asynchronous execution) is also defined but not implemented. Once &lt;code&gt;prompt&lt;/code&gt; lands, you could &quot;inject extra instructions on every prompt submit&quot;; with &lt;code&gt;agent&lt;/code&gt;, you could &quot;kick off a review agent after each response.&quot; For now, though, the only option is &quot;event fires → shell command runs.&quot;&lt;/p&gt;
&lt;h3&gt;JSON payload passed to stdin&lt;/h3&gt;
&lt;p&gt;Hook commands run via &lt;code&gt;$SHELL -l -c &quot;&amp;lt;command&amp;gt;&quot;&lt;/code&gt;, and &lt;strong&gt;a JSON payload is piped to stdin&lt;/strong&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  &quot;session_id&quot;: &quot;019d07b0-...&quot;,
  &quot;cwd&quot;: &quot;/Users/user/project&quot;,
  &quot;hook_event_name&quot;: &quot;SessionStart&quot;,
  &quot;model&quot;: &quot;o3-pro&quot;,
  &quot;permission_mode&quot;: &quot;default&quot;,
  &quot;source&quot;: &quot;startup&quot;,
  &quot;transcript_path&quot;: null
}
&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;session_id&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Session ID (UUID v7)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cwd&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Working directory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;hook_event_name&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Event name&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;model&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Model in use&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;permission_mode&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Approval policy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;source&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Start type (SessionStart only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;transcript_path&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Path to the conversation history file (currently null)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Some fields are added per-event:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;source&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;SessionStart&lt;/td&gt;
&lt;td&gt;&lt;code&gt;startup&lt;/code&gt; / &lt;code&gt;resume&lt;/code&gt; / &lt;code&gt;clear&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;prompt&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;UserPromptSubmit&lt;/td&gt;
&lt;td&gt;The user&apos;s input text&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;turn_id&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;UserPromptSubmit / Stop&lt;/td&gt;
&lt;td&gt;Conversation turn identifier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;last_assistant_message&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Stop&lt;/td&gt;
&lt;td&gt;The last assistant response&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;stop_hook_active&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Stop&lt;/td&gt;
&lt;td&gt;Whether a Stop hook is active&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;code&gt;session_id&lt;/code&gt;, &lt;code&gt;cwd&lt;/code&gt;, &lt;code&gt;hook_event_name&lt;/code&gt;, &lt;code&gt;model&lt;/code&gt;, and &lt;code&gt;permission_mode&lt;/code&gt; are common to all events.&lt;/p&gt;
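&lt;p&gt;A &lt;code&gt;command&lt;/code&gt; hook can branch on these fields by parsing the stdin payload. A minimal sketch of such a handler (the event-to-log-line mapping is just my own formatting):&lt;/p&gt;

```python
import json

def summarize(payload: dict) -> str:
    """Build one log line per hook event from the common fields plus
    whichever per-event field is present."""
    event = payload.get("hook_event_name", "?")
    session = payload.get("session_id", "?")
    extra = ""
    if event == "SessionStart":
        extra = " source=" + str(payload.get("source"))
    elif event == "UserPromptSubmit":
        extra = " prompt=" + repr(payload.get("prompt", ""))[:60]
    elif event == "Stop":
        extra = " turn=" + str(payload.get("turn_id"))
    return f"[{event}] session={session}{extra}"

# In a real hook the payload arrives on stdin: json.load(sys.stdin).
payload = json.loads('{"hook_event_name": "SessionStart", "session_id": "019d07b0-x", "source": "startup"}')
print(summarize(payload))  # [SessionStart] session=019d07b0-x source=startup
```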
&lt;hr /&gt;
&lt;h2&gt;4. Use case: unified management of multiple agent sessions&lt;/h2&gt;
&lt;p&gt;I built a desktop app that uses AI coding agent Hooks to centrally track the state of multiple sessions. It listens for &lt;code&gt;SessionStart&lt;/code&gt; / &lt;code&gt;Stop&lt;/code&gt; events and updates a session list view.&lt;/p&gt;
&lt;p&gt;Now that Codex supports Hooks too, I extended the same app to manage Codex sessions alongside the others.&lt;/p&gt;
&lt;h3&gt;What I configure in hooks.json&lt;/h3&gt;
&lt;p&gt;When you write events and commands into &lt;code&gt;hooks.json&lt;/code&gt;, Codex runs the commands when those events fire. Conversely, if &lt;code&gt;hooks.json&lt;/code&gt; is empty or missing, nothing happens — even with the feature flag on.&lt;/p&gt;
&lt;p&gt;In the example below, every event does the same thing: read the JSON payload from stdin via &lt;code&gt;$(cat)&lt;/code&gt; and forward it as-is to my app&apos;s API with &lt;code&gt;curl&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  &quot;hooks&quot;: {
    &quot;SessionStart&quot;: [
      {
        &quot;hooks&quot;: [
          {
            &quot;type&quot;: &quot;command&quot;,
            &quot;command&quot;: &quot;curl -s -X POST http://localhost:3000/api/hook -H &apos;Content-Type: application/json&apos; -d \&quot;$(cat)\&quot;&quot;,
            &quot;timeout&quot;: 10
          }
        ]
      }
    ],
    &quot;UserPromptSubmit&quot;: [
      {
        &quot;hooks&quot;: [
          {
            &quot;type&quot;: &quot;command&quot;,
            &quot;command&quot;: &quot;curl -s -X POST http://localhost:3000/api/hook -H &apos;Content-Type: application/json&apos; -d \&quot;$(cat)\&quot;&quot;,
            &quot;timeout&quot;: 10
          }
        ]
      }
    ],
    &quot;Stop&quot;: [
      {
        &quot;hooks&quot;: [
          {
            &quot;type&quot;: &quot;command&quot;,
            &quot;command&quot;: &quot;curl -s -X POST http://localhost:3000/api/hook -H &apos;Content-Type: application/json&apos; -d \&quot;$(cat)\&quot;&quot;,
            &quot;timeout&quot;: 10
          }
        ]
      }
    ]
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The receiving app inspects &lt;code&gt;session_id&lt;/code&gt; and &lt;code&gt;hook_event_name&lt;/code&gt; in the JSON to update session state. Because this pattern is shared across other tools&apos; Hooks, a single API endpoint can manage multiple agent tools in one place.&lt;/p&gt;
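&lt;p&gt;The receiving side can be small. My app is more involved, but the shape of the idea fits in a sketch like this (port 3000 matches the &lt;code&gt;curl&lt;/code&gt; example above; the two-state session model is a simplification):&lt;/p&gt;

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

sessions: dict[str, str] = {}  # session_id -> last known state

def apply_event(state: dict, payload: dict) -> None:
    """Map hook events onto a simple per-session state table."""
    sid = payload["session_id"]
    event = payload["hook_event_name"]
    if event == "UserPromptSubmit":
        state[sid] = "working"
    elif event in ("SessionStart", "Stop"):
        state[sid] = "idle"

class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        apply_event(sessions, json.loads(self.rfile.read(length)))
        self.send_response(204)
        self.end_headers()

# To run the receiver:
# HTTPServer(("localhost", 3000), HookHandler).serve_forever()
```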
&lt;hr /&gt;
&lt;h2&gt;5. Limitations and outlook&lt;/h2&gt;
&lt;h3&gt;Current limitations&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Under-development status&lt;/strong&gt;: The API may change.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Coarse event granularity&lt;/strong&gt;: There&apos;s no per-tool-execution hook, so you can&apos;t track real-time progress mid-task.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;command type only&lt;/strong&gt;: &lt;code&gt;prompt&lt;/code&gt;, &lt;code&gt;agent&lt;/code&gt;, and &lt;code&gt;async&lt;/code&gt; handler types aren&apos;t supported.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No official docs&lt;/strong&gt;: You have to read the source to learn the actual spec.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;The OSS upside&lt;/h3&gt;
&lt;p&gt;For all those limitations, Codex is OSS under Apache License 2.0, which means you can verify the Hooks implementation directly against the source. Everything in this post came from reading &lt;code&gt;codex-rs/hooks/src/&lt;/code&gt;. Being able to fall back to the source when the spec is unclear is a real comfort.&lt;/p&gt;
&lt;h3&gt;What&apos;s next&lt;/h3&gt;
&lt;p&gt;The &lt;a href=&quot;https://github.com/openai/codex/discussions/2150&quot;&gt;Discussion #2150&lt;/a&gt; thread on GitHub has lots of community comments requesting Hooks features, which signals strong interest. I&apos;d expect tool events and additional handler types to land over time.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Summary&lt;/h2&gt;
&lt;p&gt;Codex Hooks is still in development with limited functionality, but flipping the feature flag gets you working hooks today. For use cases like detecting session start/end and running commands, it&apos;s already useful enough.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://developers.openai.com/codex/changelog&quot;&gt;Codex Changelog&lt;/a&gt; - announcement of the hooks engine&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/openai/codex&quot;&gt;openai/codex - GitHub&lt;/a&gt; - source code (Apache License 2.0)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/openai/codex/discussions/2150&quot;&gt;Discussion #2150: Hook would be a great feature&lt;/a&gt; - community request thread&lt;/li&gt;
&lt;/ul&gt;
</content:encoded></item><item><title>Hands-on with Claude Code&apos;s /loop Feature</title><link>https://zeroshotlog.com/en/blog/2026/03/11/claude-code-loop/</link><guid isPermaLink="true">https://zeroshotlog.com/en/blog/2026/03/11/claude-code-loop/</guid><description>Using Claude Code&apos;s /loop to define recurring jobs in natural language. Practical patterns for API and issue monitoring, gotchas around auto-compact and interval design, and how it interacts with Hooks.</description><pubDate>Wed, 11 Mar 2026 12:00:00 GMT</pubDate><content:encoded>&lt;h2&gt;TL;DR&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Claude Code&apos;s &lt;code&gt;/loop&lt;/code&gt; lets you define recurring jobs in natural language.&lt;/li&gt;
&lt;li&gt;Great for lightweight automation in your dev flow — periodic API checks, issue monitoring, etc.&lt;/li&gt;
&lt;li&gt;Recurring runs pause during auto-compact, so design intervals with context consumption in mind.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h2&gt;Background&lt;/h2&gt;
&lt;p&gt;Claude Code&apos;s &lt;code&gt;/loop&lt;/code&gt; feature is rolling out gradually. It became available on my account, so I gave it a spin.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;1. /loop basics&lt;/h2&gt;
&lt;h3&gt;Syntax&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;/loop [interval] &amp;lt;prompt&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you omit the interval, the &lt;strong&gt;default is 10 minutes&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;Three ways to specify the interval&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;# Pattern 1: interval at the start
/loop 5m check git status

# Pattern 2: trailing &quot;every&quot; at the end
/loop check the deploy status every 20m

# Pattern 3: no interval (defaults to 10 minutes)
/loop check the test results
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Allowed units are &lt;code&gt;s&lt;/code&gt; (seconds), &lt;code&gt;m&lt;/code&gt; (minutes), &lt;code&gt;h&lt;/code&gt; (hours), and &lt;code&gt;d&lt;/code&gt; (days). Cron&apos;s minimum granularity is one minute, though, so seconds get rounded up.&lt;/p&gt;
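&lt;p&gt;I haven&apos;t confirmed the internal conversion, but given that seconds get rounded up to cron&apos;s one-minute floor, I&apos;d expect behavior along these lines:&lt;/p&gt;

```python
import math

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def interval_to_cron_minutes(spec: str) -> int:
    """Convert an interval like "90s" or "2h" into whole cron minutes,
    rounding sub-minute remainders up (cron's minimum granularity)."""
    value, unit = int(spec[:-1]), spec[-1]
    return max(1, math.ceil(value * UNIT_SECONDS[unit] / 60))

print(interval_to_cron_minutes("30s"))  # 1
print(interval_to_cron_minutes("90s"))  # 2
print(interval_to_cron_minutes("2h"))   # 120
```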
&lt;h3&gt;What happens under the hood&lt;/h3&gt;
&lt;p&gt;When you run &lt;code&gt;/loop&lt;/code&gt;, three tools fire behind the scenes:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CronCreate&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Create a job&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CronList&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;List jobs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CronDelete&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Delete a job&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;You don&apos;t actually need to know these tool names — natural language is enough.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&quot;Any jobs running right now?&quot;  → calls CronList
&quot;Stop the loop.&quot;               → calls CronDelete
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This part of the experience is genuinely good. The fact that you can manage jobs in natural language alone makes it worth trying.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;2. Practical use cases&lt;/h2&gt;
&lt;h3&gt;Use case 1: Periodic API endpoint checks&lt;/h3&gt;
&lt;p&gt;Periodically poll an external API and have Claude report any changes from the previous run.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;/loop 5m check https://api.example.com/v1/status and report any change since last time
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Good fit for status monitoring or detecting changes in response payloads — anything along the lines of a simple API check.&lt;/p&gt;
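&lt;p&gt;The &quot;report any change since last time&quot; step boils down to comparing the current payload against a cached copy of the previous one. A minimal sketch of that comparison (hypothetical helper; in the loop itself, Claude fetches the payload and does this comparison in-context):&lt;/p&gt;

```python
import json

def diff_report(previous, current):
    """Compare two status payloads and summarize what changed.

    Hypothetical helper: in the /loop prompt above, Claude performs this
    comparison itself against the result it saved on the previous run.
    """
    if previous == current:
        return "no change"
    prev, curr = previous or {}, current or {}
    changed = sorted(k for k in set(prev) | set(curr) if prev.get(k) != curr.get(k))
    return "changed: " + ", ".join(changed)

last = json.loads('{"status": "ok", "latency_ms": 120}')
now = json.loads('{"status": "degraded", "latency_ms": 480}')
print(diff_report(last, now))  # changed: latency_ms, status
```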
&lt;h3&gt;Use case 2: GitHub issue monitoring&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;/loop 5m run `gh issue list --label &quot;bug&quot;` to check for new issues,
and if any show up, analyze them and propose a response plan
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This pairs well with issue-driven dev flows. You could imagine, for example, automatically analyzing a new issue and spinning up a branch for it.&lt;/p&gt;
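&lt;p&gt;The &quot;check for new issues&quot; part amounts to diffing the current issue list against the numbers already seen. A sketch under the assumption that the input comes from &lt;code&gt;gh issue list --label bug --json number,title&lt;/code&gt;:&lt;/p&gt;

```python
import json

def new_issues(gh_output: str, seen: set):
    """Return issues from `gh issue list --json number,title` output
    whose numbers haven't been seen on a previous iteration.
    (Illustrative helper; in practice Claude does this comparison in-context.)
    """
    return [issue for issue in json.loads(gh_output) if issue["number"] not in seen]

sample = '[{"number": 101, "title": "Crash on startup"}, {"number": 99, "title": "Old bug"}]'
fresh = new_issues(sample, seen={99})
print([i["number"] for i in fresh])  # [101]
```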
&lt;hr /&gt;
&lt;h2&gt;3. Auto-compact and interval design&lt;/h2&gt;
&lt;h3&gt;Things to be aware of&lt;/h3&gt;
&lt;p&gt;When you run information-gathering jobs through &lt;code&gt;/loop&lt;/code&gt;, results pile up in context with each iteration. Claude Code runs &lt;strong&gt;auto-compact&lt;/strong&gt; when the conversation history gets long, summarizing and compressing older content (you can also trigger it manually with &lt;code&gt;/compact&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;Auto-compact happens during normal Claude Code use too, but combining it with &lt;code&gt;/loop&lt;/code&gt; introduces a few quirks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Processing is blocked during auto-compact, so loop intervals drift while it runs.&lt;/li&gt;
&lt;li&gt;Cron events that pile up during auto-compact may all fire at once after it finishes.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Information-gathering jobs in particular consume a lot of context, so running them at short intervals tends to trigger auto-compact frequently. I don&apos;t think &lt;code&gt;/loop&lt;/code&gt; is meant for super-precise timing anyway, but it&apos;s worth knowing when you design intervals.&lt;/p&gt;
&lt;h3&gt;Interval guidelines&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;th&gt;Suggested interval&lt;/th&gt;
&lt;th&gt;Reason&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Lightweight checks (e.g. &lt;code&gt;git fetch&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;3–5 min&lt;/td&gt;
&lt;td&gt;Low context consumption&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API response monitoring&lt;/td&gt;
&lt;td&gt;5–10 min&lt;/td&gt;
&lt;td&gt;Depends on response size&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test runs&lt;/td&gt;
&lt;td&gt;10–30 min&lt;/td&gt;
&lt;td&gt;Long execution time and large output&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;When tuning intervals, think not just about &quot;how often do I want to check,&quot; but also &lt;strong&gt;&quot;how much context does each check consume.&quot;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;You can also add an instruction like &quot;if there&apos;s no change, just say &apos;no change&apos;&quot; to the prompt to keep context consumption down.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;/loop 5m check the site, and if nothing changed just say &quot;no change&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;hr /&gt;
&lt;h2&gt;4. Combining with Hooks&lt;/h2&gt;
&lt;h3&gt;Problem: cron triggers and manual input look identical&lt;/h3&gt;
&lt;p&gt;It&apos;s natural to want to combine Hooks with &lt;code&gt;/loop&lt;/code&gt; so that only cron-triggered runs fire a specific action. But right now, the &lt;code&gt;UserPromptSubmit&lt;/code&gt; event payload &lt;strong&gt;has no field indicating the trigger source&lt;/strong&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  &quot;session_id&quot;: &quot;abc123&quot;,
  &quot;hook_event_name&quot;: &quot;UserPromptSubmit&quot;,
  &quot;prompt&quot;: &quot;the submitted text&quot;
  // no field like trigger_source
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The Hook can&apos;t tell whether cron fired automatically or whether the user typed something in by hand.&lt;/p&gt;
&lt;h3&gt;Workaround: prefix convention&lt;/h3&gt;
&lt;p&gt;You can probably distinguish them by &lt;strong&gt;prefixing the prompt&lt;/strong&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;/loop 5m [CRON] run git fetch and analyze any new issues
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The Hook then branches on the presence of &lt;code&gt;[CRON]&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# check_cron.py
import json, sys

data = json.load(sys.stdin)
prompt = data.get(&quot;prompt&quot;, &quot;&quot;)

if &quot;[CRON]&quot; in prompt:
    # handle cron-triggered case
    print(&quot;Cron-triggered prompt detected&quot;, file=sys.stderr)
else:
    # handle manual input case
    pass
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Until an official trigger-source field shows up, this kind of workaround seems to be how you handle it.&lt;/p&gt;
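&lt;p&gt;For completeness, wiring the script in as a &lt;code&gt;UserPromptSubmit&lt;/code&gt; hook looks roughly like this (the script path here is an assumption; check the current hooks docs for the exact schema):&lt;/p&gt;

```json
// ~/.claude/settings.json (excerpt)
{
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          { "type": "command", "command": "python3 ~/.claude/hooks/check_cron.py" }
        ]
      }
    ]
  }
}
```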
&lt;hr /&gt;
&lt;h2&gt;5. Constraints and where it fits&lt;/h2&gt;
&lt;h3&gt;&lt;code&gt;/loop&lt;/code&gt; constraints&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Constraint&lt;/th&gt;
&lt;th&gt;Detail&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Session-bound&lt;/td&gt;
&lt;td&gt;Closing the session (terminal) stops every job&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3-day expiry&lt;/td&gt;
&lt;td&gt;Jobs are auto-deleted 3 days after creation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Approval prompts&lt;/td&gt;
&lt;td&gt;Destructive operations like &lt;code&gt;git push&lt;/code&gt; still surface a confirmation dialog&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Interval precision&lt;/td&gt;
&lt;td&gt;Cron-based, minimum 1-minute granularity, plus runtime delay&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Concurrent limit&lt;/td&gt;
&lt;td&gt;Up to 50 jobs per session&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3&gt;How it compares to GitHub Actions and friends&lt;/h3&gt;
&lt;p&gt;Given those constraints, &lt;code&gt;/loop&lt;/code&gt; feels best suited to &lt;strong&gt;lightweight, ephemeral automation&lt;/strong&gt;.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;th&gt;/loop&lt;/th&gt;
&lt;th&gt;GitHub Actions / traditional cron&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Watch a situation for a few hours&lt;/td&gt;
&lt;td&gt;Good fit&lt;/td&gt;
&lt;td&gt;Overkill&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monitor a PR through to merge&lt;/td&gt;
&lt;td&gt;Good fit&lt;/td&gt;
&lt;td&gt;Setup overhead&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Continuous production monitoring&lt;/td&gt;
&lt;td&gt;Bad fit&lt;/td&gt;
&lt;td&gt;The right tool&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Automation shared across the team&lt;/td&gt;
&lt;td&gt;Bad fit&lt;/td&gt;
&lt;td&gt;The right tool&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;For one-off automation needs inside my personal dev flow, the appeal is being able to wrap things up without standing up an external CI/CD pipeline.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Summary&lt;/h2&gt;
&lt;h3&gt;What I like about &lt;code&gt;/loop&lt;/code&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Easy to use&lt;/strong&gt;: create and manage jobs in natural language.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Flexible&lt;/strong&gt;: works across API checks, issue monitoring, test runs, and more.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Self-contained&lt;/strong&gt;: no external tools needed; you can even automate GitHub operations.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Even just messing around for a bit, the convenience of &quot;spinning up a quick automation in seconds&quot; was real. Designing intervals together with context consumption seems to be the right mindset.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://code.claude.com/docs/en/scheduled-tasks&quot;&gt;Claude Code Docs — Run prompts on a schedule&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://winbuzzer.com/2026/03/09/anthropic-claude-code-cron-scheduling-background-worker-loop-xcxwbn/&quot;&gt;Claude Code Gets Cron Scheduling to Run as a Background Worker — WinBuzzer&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://medium.com/@joe.njenga/claude-code-loop-create-new-native-autonomous-loops-that-work-29934d615402&quot;&gt;Claude Code /loop — How I Create New Native Autonomous Loops That Work! — Medium&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</content:encoded></item><item><title>Claude Code Agent Teams: Reusing Existing Skill and Agent Knowledge</title><link>https://zeroshotlog.com/en/blog/2026/02/13/claude-code-agent-teams/</link><guid isPermaLink="true">https://zeroshotlog.com/en/blog/2026/02/13/claude-code-agent-teams/</guid><description>An overview of Agent Teams, how it differs from traditional subagents, the current state of reusing existing skills and agent definitions, and what token consumption actually looks like.</description><pubDate>Fri, 13 Feb 2026 12:00:00 GMT</pubDate><content:encoded>&lt;h2&gt;TL;DR&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Agent Teams runs multiple Claude Code instances as a coordinating team. The architecture is different from traditional subagents (the Task tool).&lt;/li&gt;
&lt;li&gt;To reuse existing skills or agent definitions inside Agent Teams, you have to spell out file paths in the prompt and tell teammates to read them — there&apos;s no structural way to wire them in yet.&lt;/li&gt;
&lt;li&gt;Token consumption is several times higher than the skill approach. Use skills for work where skills are enough; reach for Agent Teams only when parallel execution clearly pays off.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h2&gt;What is Agent Teams?&lt;/h2&gt;
&lt;p&gt;Agent Teams was released as an experimental preview on February 5, 2026, alongside Claude Opus 4.6. It runs multiple Claude Code instances in parallel as a &quot;team.&quot;&lt;/p&gt;
&lt;p&gt;Claude Code&apos;s extension surface has several layers:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Skills (Skill tool)
  └── Expanded and executed inside the main session
       └── May invoke the Task tool internally

Subagents (Task tool)
  └── Spawned as independent instances
  └── subagent_type lets you point at a custom agent definition

Agent Teams (TeamCreate + SendMessage + TaskList, etc.)
  └── Spawning a teammate = Task tool + inter-team messaging + shared task list
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Traditional &lt;strong&gt;subagents&lt;/strong&gt; are independent instances spawned via the Task tool — a hub-and-spoke shape from parent to children. Children can&apos;t talk to each other; they only return results to the parent. The Task tool ships with four built-in types (Bash / general-purpose / Explore / Plan), and &lt;code&gt;subagent_type&lt;/code&gt; lets you point at a custom agent definition (&lt;code&gt;.claude/agents/*.md&lt;/code&gt;); the knowledge baked into that definition is loaded automatically.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Agent Teams&lt;/strong&gt; is an orchestration layer on top of the Task tool that adds inter-team messaging and a shared task list. The big difference: teammates can send messages directly to each other.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[Subagent model]
  Parent agent
    ├── Task → Child A (returns result to parent)
    ├── Task → Child B (returns result to parent)
    └── Task → Child C (returns result to parent)
  * Children can&apos;t talk to each other. subagent_type lets you specify a definition.

[Agent Teams model]
  Team Lead
    ├── Teammate A ←→ Teammate B
    ├── Teammate B ←→ Teammate C
    └── Teammate A ←→ Teammate C
  * Teammates can message each other directly.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that teammates are themselves independent Claude Code instances, so in principle they should be able to invoke subagents via the Task tool (the documented restriction is only &quot;no nested teams&quot; — using the Task tool by itself isn&apos;t restricted). In practice, though, &lt;strong&gt;this didn&apos;t work for me&lt;/strong&gt;. If it did, existing custom agent definitions could be reused as-is, so I&apos;d love to see this fixed.&lt;/p&gt;
&lt;p&gt;Here&apos;s a screenshot of seven reviewers running in parallel across tmux split panes:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/2026/02/13/agent-teams-tmux-split-panes.png&quot; alt=&quot;Code review team running in parallel via Agent Teams&quot; /&gt;&lt;/p&gt;
&lt;p&gt;The team has a shared task list with state and dependency management. When a blocker is cleared, an idle teammate autonomously claims the next task. There&apos;s no file-level locking, though, so concurrent writes to the same file need attention.&lt;/p&gt;
&lt;h3&gt;Enabling and using it&lt;/h3&gt;
&lt;p&gt;Enable it in &lt;code&gt;settings.json&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// ~/.claude/settings.json
{
  &quot;env&quot;: {
    &quot;CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS&quot;: &quot;1&quot;
  },
  &quot;teammateMode&quot;: &quot;auto&quot;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;&quot;auto&quot;&lt;/code&gt; for &lt;code&gt;teammateMode&lt;/code&gt; picks split panes when running inside tmux, and in-process mode (toggle with Shift+Up/Down) elsewhere. You drive it in plain natural language:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Set up a team to review PR #42 in this project.
Spawn three reviewers:
- Security
- Performance
- Test coverage
Have each one review and report results.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As a baseline, Max 5x ($100/month) or higher is recommended. The Pro plan ($20/month) hits limits quickly.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Reusing Existing Knowledge: The Current State of Natural-Language Prompts&lt;/h2&gt;
&lt;p&gt;There&apos;s something to be aware of when using Agent Teams.&lt;/p&gt;
&lt;p&gt;Claude Code already has extension mechanisms like skills (&lt;code&gt;SKILL.md&lt;/code&gt;) and custom agents (&lt;code&gt;.claude/agents/*.md&lt;/code&gt;). In a single session they&apos;re loaded automatically, and Task-tool subagents can invoke them simply by naming them in &lt;code&gt;subagent_type&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;But Agent Teams &lt;strong&gt;currently provides no structural way&lt;/strong&gt; to tell a teammate &quot;use this skill&quot; or &quot;run with this agent definition.&quot; If you want a teammate to use an existing definition file, you have to embed the file path in the prompt and tell them, in natural language, to read it.&lt;/p&gt;
&lt;h3&gt;You have to spell out paths in detail&lt;/h3&gt;
&lt;p&gt;Say you&apos;ve curated code-review agent definitions under &lt;code&gt;~/.claude/agents/&lt;/code&gt;. To use these from Agent Teams, you have to write out the directory structure and file paths in the prompt and tell each teammate &quot;which file to read and how to use it.&quot;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;### Directory structure

~/.claude/
├── agents/                       # Review perspective definitions
│   ├── review-architecture.md
│   ├── review-naming.md
│   └── review-frontend.md
└── knowledge/                    # Reference knowledge
    ├── architecture/
    │   ├── patterns.md
    │   └── anti-patterns.md
    └── naming/
        └── conventions.md

### Teammate read procedure

1. Read `~/.claude/agents/review-{your-area}.md` to learn
   the review perspective and output format.
2. If the definition references other files, load the matching
   knowledge file.
3. Use that knowledge to ground your review.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In skills or with the Task tool, the framework handles agent definition paths and read order. With Agent Teams, you currently have to &lt;strong&gt;write all of that yourself, inside the prompt&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;Comparison with the skill approach&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Skills / Task tool&lt;/th&gt;
&lt;th&gt;Agent Teams&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Specifying an agent definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Name it via &lt;code&gt;subagent_type&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Write the file path in the prompt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Loading knowledge&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Automatic via references inside the definition&lt;/td&gt;
&lt;td&gt;Spell out the procedure in the prompt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Output destination management&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Defined inside the skill&lt;/td&gt;
&lt;td&gt;Specified per-teammate in the prompt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Execution control&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Follows the skill&apos;s flow&lt;/td&gt;
&lt;td&gt;Design Phase structure in the prompt&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;In other words, even when you&apos;ve accumulated knowledge as skills or agent definitions, &lt;strong&gt;using it from Agent Teams requires translating that content into a natural-language prompt&lt;/strong&gt;. This should resolve once Agent Teams can reference skills or agent definitions directly, but as of February 2026 we&apos;re not there yet.&lt;/p&gt;
&lt;h2&gt;Prompt Design Guidelines&lt;/h2&gt;
&lt;p&gt;Combining the official Agent Teams docs with community findings, here are the points worth keeping in mind.&lt;/p&gt;
&lt;h3&gt;Sharing context with teammates&lt;/h3&gt;
&lt;p&gt;Teammates don&apos;t inherit the lead&apos;s conversation history — they spawn as independent instances. CLAUDE.md and MCP servers are loaded automatically, but anything else has to be passed in the spawn-time prompt or via files.&lt;/p&gt;
&lt;h3&gt;Separating output files&lt;/h3&gt;
&lt;p&gt;There&apos;s no file-level locking, so design things so each teammate owns a different file set.&lt;/p&gt;
&lt;h3&gt;Phase structure for staged control&lt;/h3&gt;
&lt;p&gt;Splitting the prompt into Phases — &quot;prep → parallel work → integration → completion&quot; — makes the Team Lead&apos;s behavior easier to control.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Delegate Mode (Shift+Tab)&lt;/strong&gt; is also useful. It restricts the lead&apos;s tool execution permissions so they focus on coordination, but as of February 2026 there&apos;s a reported bug where teammates lose tool access (&lt;a href=&quot;https://github.com/anthropics/claude-code/issues/24073&quot;&gt;GitHub Issue #24073&lt;/a&gt;).&lt;/p&gt;
&lt;h2&gt;Sample: Agent Teams Code Review Prompt&lt;/h2&gt;
&lt;p&gt;Below is a sample Agent Teams code review prompt that reflects the points above. It&apos;s structured so existing agent definition files get loaded by each teammate.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Assumption&lt;/strong&gt;: This assumes you have agent definition files in &lt;code&gt;~/.claude/agents/&lt;/code&gt;. The team will spawn even without them, but the review perspectives and output format depend on what&apos;s in those definitions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;details open&gt;
&lt;summary&gt;Boilerplate agent definition files (simple samples)&lt;/summary&gt;
&lt;p&gt;&lt;strong&gt;review-architecture.md (Architecture reviewer)&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Architecture Reviewer

## Role
Conduct architecture- and design-level code review.

## Review perspectives
- Soundness of directory structure and layer separation
- Direction of dependencies
- Adherence to single responsibility principle
- Excessive abstraction or unnecessary complexity

## Scoring
Tag each finding with severity:
- [Critical]: Serious issue
- [Warning]: Concern worth improving
- [Suggestion]: Suggestion for a better design

## Output format
- **[severity]** filename:line — issue
  - Why: why it&apos;s a problem
  - Fix: how to address it
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;review-naming.md (Naming reviewer)&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Naming Reviewer

## Role
Review naming conventions for variables, functions, classes, and files.

## Review perspectives
- Adherence to language-specific naming conventions (camelCase / snake_case / PascalCase)
- Whether the role can be inferred from the name (semantic clarity)
- Consistency of abbreviations (e.g. mixing btn vs button)
- Boolean prefixes (is / has / should)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;review-frontend.md (Frontend reviewer)&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Frontend Reviewer

## Role
Review frontend-specific patterns in React/Vue/etc.

## Review perspectives
- Component decomposition granularity and Props design
- Appropriateness of state management patterns
- Performance (unnecessary re-renders, over/under-memoization)
- Accessibility (semantic HTML, ARIA attributes)
&lt;/code&gt;&lt;/pre&gt;
&lt;/details&gt;
&lt;h3&gt;Full prompt&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;Run a code review on this repository using an agent team.
Follow the procedure and proceed autonomously. Only ask the user
when you&apos;re unsure.

## Reference file guide

The definition files and knowledge for the review live under `~/.claude/`.
Each teammate should load the files matching their assignment.

### Directory structure

~/.claude/
├── agents/                       # Review perspective definitions (perspective + output format)
│   ├── review-architecture.md    # Architecture review
│   ├── review-naming.md          # Naming convention review
│   └── review-frontend.md        # Frontend-specific (conditional)
│
└── knowledge/                    # Reference knowledge
    ├── architecture/
    │   ├── patterns.md
    │   └── anti-patterns.md
    ├── naming/
    │   └── conventions.md
    └── frontend/
        └── best-practices.md

### Teammate read procedure

1. Read `~/.claude/agents/review-{your-area}.md` to learn
   the review perspective and output format.
2. If the definition references other files, load the matching
   file from `~/.claude/knowledge/`.
3. Use that knowledge to ground your review.

---

## Procedure

### Phase 0: Confirm review scope

Confirm with the user:

1. **Review target**: specific files / recent commits / whole project
2. **Review depth**: full (default) / quick (Critical only)

### Phase 1: Preparation (Team Lead)

1. Create a scratch directory:
   `.claude/code-review-team/.scratch/{YYYY-MM-DD-HHmm}/`

2. Only when &quot;recent commits&quot; is selected, generate a diff context:
   - Save the result of `git diff HEAD~N` to `{scratch}/diff-context.md`

3. Run tech stack detection:
   Identify frameworks from `package.json`, `requirements.txt`, etc.
   Write the result to `{scratch}/stack-detection.md`

4. Read the detection result and decide the team composition for Phase 2

### Phase 2: Team creation &amp;amp; parallel review

Create a team named &quot;code-review&quot; and spawn the following
teammates in parallel.

#### Required members (always spawned)

1. **architecture-reviewer**
   - Role: Architecture / design-level review
   - Definition: `~/.claude/agents/review-architecture.md`
   - Output: `{scratch}/architecture-review.md`

2. **naming-reviewer**
   - Role: Naming convention review
   - Definition: `~/.claude/agents/review-naming.md`
   - Output: `{scratch}/naming-review.md`

#### Conditional members (added based on stack detection)

- **frontend-reviewer** — when React/Vue etc. is detected
  Definition: `~/.claude/agents/review-frontend.md`
  Output: `{scratch}/frontend-review.md`

#### Common rules for all teammates

- First, read `{scratch}/stack-detection.md`.
- Read your own agent definition and follow its perspective and output format.
- If the definition references other files, load them from `~/.claude/knowledge/`.
- Write review results incrementally to your output file.
- Tag each finding with severity: [Critical] / [Warning] / [Suggestion].
- When done, message the Team Lead:
  &quot;Review complete. Critical: X, Warning: Y. See {output} for details.&quot;

### Phase 3: Report integration

Once all teammates are done:

1. Read every review file under `{scratch}/`,
   deduplicate, normalize priorities, and produce an integrated report.
   **Note**: Only files inside *this* scratch directory should be integrated.
2. Output: `docs/code-review-team/{YYYY-MM-DD-HHmm}-review.md`

### Phase 4: Wrap up

1. Delete the &quot;code-review&quot; team
2. Present the report to the user and confirm the fix strategy:
   - Fix all findings in one pass
   - Fix only Critical/Warning
   - User will fix themselves (report only)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The point of this sample is that &lt;strong&gt;the directory structure is laid out at the top of the prompt and each teammate is told exactly which paths to read&lt;/strong&gt;. With skills you wouldn&apos;t need any of this; with Agent Teams, today, you do.&lt;/p&gt;
&lt;h2&gt;On Token Consumption&lt;/h2&gt;
&lt;p&gt;Agent Teams burns a lot of tokens. Each teammate runs as an independent Claude Code instance, so cost scales with team size.&lt;/p&gt;
&lt;p&gt;On the Max 20x ($200/month) plan, running a team with five or more teammates 2–3 times per hour consumed about 4% of my Max usage.&lt;/p&gt;
&lt;p&gt;Honestly, &lt;strong&gt;I haven&apos;t run it enough times to measure how much the final output quality differs between the skill approach (Task tool) and Agent Teams&lt;/strong&gt;. The speed benefit from parallel execution is tangible, but whether the quality improvement justifies the cost will take ongoing testing to find out.&lt;/p&gt;
&lt;p&gt;You can rein in cost by assigning Sonnet to the teammates, but the Pro plan ($20/month) is realistically too tight; Max ($100–200/month) feels like the floor.&lt;/p&gt;
&lt;h3&gt;Leveraging the weekly reset&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;Claude Code usage limits reset on a 7-day rolling window. The &lt;code&gt;/usage&lt;/code&gt; command shows the next reset, so timing Agent Teams sessions to weeks where you have headroom makes them easier to plan.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Limitations (as of February 2026)&lt;/h2&gt;
&lt;p&gt;Agent Teams is in experimental preview. The main limits:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Limitation&lt;/th&gt;
&lt;th&gt;Impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;No session resume&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/resume&lt;/code&gt; doesn&apos;t restore teammates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;No file locking&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Concurrent edits to the same file can overwrite each other&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;One team per session&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Multiple teams can&apos;t run simultaneously&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;No nested teams&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Teammates can&apos;t spawn sub-teams&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Split panes constraints&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;VS Code integrated terminal, Windows Terminal, and Ghostty are unsupported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Slow shutdown&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Shutdown waits for teammates to finish their current request or tool call, which takes time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;No direct reference to existing knowledge&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No structural way to point Agent Teams at skills or agent definitions&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2&gt;Wrap-up&lt;/h2&gt;
&lt;p&gt;Agent Teams enables autonomous coordination via direct teammate-to-teammate messaging and a shared task list.&lt;/p&gt;
&lt;p&gt;That said, you currently have to express the entire team configuration in natural language, and reusing existing skills or agent definitions means painstakingly enumerating file paths. What the framework used to do for you in the skill approach, you now write yourself inside the prompt.&lt;/p&gt;
&lt;p&gt;Token consumption is also high, and I haven&apos;t yet been able to clearly measure the quality delta against the skill approach. Looking forward to deeper integration with skills and agent definitions, but for now I&apos;d say &quot;use skills when skills are enough; reach for Agent Teams when parallel execution clearly adds value&quot; is the practical split.&lt;/p&gt;
&lt;h2&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://code.claude.com/docs/en/agent-teams&quot;&gt;Orchestrate teams of Claude Code sessions — Claude Code official docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-opus-4-6&quot;&gt;Introducing Claude Opus 4.6 — Anthropic&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</content:encoded><enclosure url="https://zeroshotlog.com/images/2026/02/13/agent-teams-hero.png" length="0" type="image/png"/></item><item><title>Lost in the Middle — Prompt Design that Beats LLM Position Bias</title><link>https://zeroshotlog.com/en/blog/2026/02/04/llm-prompt-design-pitfalls/</link><guid isPermaLink="true">https://zeroshotlog.com/en/blog/2026/02/04/llm-prompt-design-pitfalls/</guid><description>LLMs tend to overlook the middle of long prompts (the Lost in the Middle problem). This post covers the cause and practical countermeasures — tail checklists, the sandwich strategy, XML-tag structuring, and more.</description><pubDate>Wed, 04 Feb 2026 12:00:00 GMT</pubDate><content:encoded>&lt;h2&gt;TL;DR&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;LLMs are prone to &lt;strong&gt;missing information placed in the middle&lt;/strong&gt; of long prompts (the Lost in the Middle problem).&lt;/li&gt;
&lt;li&gt;One major driver is the long-range decay of RoPE (Rotary Position Embedding); causal attention masks and biases in training data also play a role — it&apos;s a multi-factor structural issue.&lt;/li&gt;
&lt;li&gt;Practical countermeasures include the &lt;strong&gt;tail checklist pattern&lt;/strong&gt;, the &lt;strong&gt;sandwich strategy&lt;/strong&gt;, and &lt;strong&gt;XML-tag structuring&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Each technique has different token costs and ideal use cases, so picking the right tool for the situation matters.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h2&gt;1. What is Lost in the Middle?&lt;/h2&gt;
&lt;h3&gt;LLM position bias&lt;/h3&gt;
&lt;p&gt;If you&apos;ve built an LLM application, you&apos;ve probably seen &quot;the instructions in the prompt got ignored.&quot; It happens often once the system prompt grows to hundreds of lines.&lt;/p&gt;
&lt;p&gt;This is the phenomenon known as &lt;strong&gt;Lost in the Middle&lt;/strong&gt;. It was systematically reported in Liu et al. (2023), &quot;&lt;a href=&quot;https://arxiv.org/abs/2307.03172&quot;&gt;Lost in the Middle: How Language Models Use Long Contexts&lt;/a&gt;&quot;.&lt;/p&gt;
&lt;p&gt;The core finding: LLMs exhibit a &lt;strong&gt;U-shaped performance curve&lt;/strong&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Performance
 ▲
 │  ★                                    ★
 │   ★                                 ★
 │    ★★                            ★★
 │      ★★★                     ★★★
 │         ★★★★★★★★★★★★★
 │
 └──────────────────────────────────────► Information position
   Beginning      Middle (degraded)        End
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Concretely, on tasks that ask the model to answer questions referencing multiple documents, performance for information placed in the middle drops by &lt;strong&gt;more than 30%&lt;/strong&gt; compared to the beginning or the end (Liu et al., 2023; the magnitude depends on model and task).&lt;/p&gt;
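&lt;p&gt;The multi-document QA setup behind this finding is easy to reproduce in miniature: hold the distractor documents fixed and slide the answer-bearing one through the positions. A scaffolding sketch only (querying and scoring an actual model is a separate step):&lt;/p&gt;

```python
def build_prompt(distractors, key_doc, position):
    """Place the answer-bearing document at a given index among distractors,
    mirroring the multi-document QA setup in Liu et al. (2023).
    (Scaffolding sketch only; model calls and accuracy scoring are separate.)
    """
    docs = list(distractors)
    docs.insert(position, key_doc)
    return "\n\n".join(f"Document {i + 1}: {d}" for i, d in enumerate(docs))

distractors = [f"Filler fact number {n}." for n in range(9)]
key = "The access code is 7421."
# One prompt per candidate position: start, middle, end
prompts = [build_prompt(distractors, key, p) for p in (0, 5, 9)]
print(prompts[0].splitlines()[0])  # Document 1: The access code is 7421.
```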
&lt;h3&gt;Why middle information gets lost — RoPE&apos;s long-range decay&lt;/h3&gt;
&lt;p&gt;A leading cause is the &lt;strong&gt;long-range decay effect of RoPE (Rotary Position Embedding)&lt;/strong&gt;, which most modern LLMs use. Note that position bias isn&apos;t only about RoPE — the structure of causal attention masks (the triangular masks that prevent each token from attending to tokens after it) and biases in the positional distribution of training data are also considered contributing factors.&lt;/p&gt;
&lt;p&gt;RoPE adjusts attention strength based on the relative position between tokens. A standard Transformer remembers absolute positions (&quot;which slot in the input is this token in?&quot;), while RoPE encodes the relative distance between two tokens as a rotation angle of the vector. The rotation angle grows with distance, which naturally attenuates attention scores between far-apart tokens.&lt;/p&gt;
&lt;p&gt;Instructions at the very start of a prompt are &quot;far&quot; from the most recent generated tokens. But because of how a causal language model works — generating tokens left to right with each token only able to attend to earlier tokens — the leading tokens are referenced repeatedly while processing every subsequent token. This cumulative effect ends up preserving information at the beginning and the end strongly.&lt;/p&gt;
&lt;p&gt;Tokens in the middle don&apos;t get the same cumulative leverage as the beginning, and they aren&apos;t close to the generation point like the end either. They fall into an &quot;attention valley.&quot;&lt;/p&gt;
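&lt;p&gt;The decay is easy to see numerically. The sketch below is my own simplified illustration (a single attention score between two identical vectors under the standard RoPE frequency schedule), not code from any actual model implementation:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import numpy as np

# Simplified RoPE illustration: with rotary embeddings, the query-key dot
# product between two identical vectors depends only on their relative
# distance d. Summed over the usual frequency schedule
# (theta_j = base ** (-2j / dim)), its magnitude tends to shrink as d grows.
def relative_score(d, dim=64, base=10000.0):
    freqs = base ** (-np.arange(0, dim, 2) / dim)
    # each 2-D dimension pair contributes cos(d * theta_j) to the score
    return np.cos(d * freqs).sum()

near = relative_score(1)    # adjacent tokens: close to the maximum, dim / 2
far = relative_score(512)   # distant tokens: typically much smaller in magnitude
print(near, far)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Plotting &lt;code&gt;relative_score&lt;/code&gt; over increasing &lt;code&gt;d&lt;/code&gt; shows the shrinking envelope that the long-range decay refers to.&lt;/p&gt;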
&lt;hr /&gt;
&lt;h2&gt;2. Concrete examples from real projects&lt;/h2&gt;
&lt;h3&gt;How long does a prompt have to be before this matters?&lt;/h3&gt;
&lt;p&gt;With recent models, you almost never see this on prompts a few dozen lines long. In my experience, the impact starts to show up at &lt;strong&gt;system prompts of several hundred lines&lt;/strong&gt; — for example, RAG setups injecting large amounts of context, or agent applications with complex rule sets.&lt;/p&gt;
&lt;p&gt;The problem persists in the latest models. Modarressi et al. (2025), &quot;&lt;a href=&quot;https://arxiv.org/abs/2502.05167&quot;&gt;NoLiMa: Long-Context Evaluation Beyond Literal Matching&lt;/a&gt;&quot; (ICML 2025), found that 11 of 13 models — including GPT-4o, Gemini 1.5 Pro, and Claude 3.5 Sonnet — fell below 50% of their short-prompt baseline performance at a 32K-token context. Subsequent evaluations showed the same trend on GPT-4.1 and Gemini 2.5 Flash.&lt;/p&gt;
&lt;p&gt;Chroma Research&apos;s &quot;&lt;a href=&quot;https://research.trychroma.com/context-rot&quot;&gt;Context Rot: How Increasing Input Tokens Impacts LLM Performance&lt;/a&gt;&quot; (July 2025) tested 18 models for long-context degradation. Interestingly, the failure mode differs by model family: GPT-family models tend to confidently return wrong answers (hallucinate), while Claude-family models tend to abstain when uncertain. The same study reports that the U-shape Liu et al. observed wasn&apos;t consistently reproduced — so position bias may manifest differently depending on task and model.&lt;/p&gt;
&lt;p&gt;I&apos;ve personally observed similar behavior on GPT-4.1 mini. Across model generations, position bias is &lt;strong&gt;easing but not eliminated&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The example below is simplified for clarity. In real settings, you&apos;d have dozens of similar sections stacking up to several hundred lines or thousands of tokens — that&apos;s when the problem appears.&lt;/p&gt;
&lt;h3&gt;Middle rules ignored in a system prompt&lt;/h3&gt;
&lt;p&gt;Imagine a system prompt with the following section structure spanning several hundred lines:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;You are a customer support assistant.            ← Near the top: followed

## Basic rules
- Respond politely
- Address the user by name

... (dozens more sections) ...

## Response format                                ← Buried in the middle
- Keep answers within 3 sentences
- Use bullet points

## Data reference guide                           ← Buried in the middle
- Always look up pricing in the database

... (many more sections) ...

## Prohibited                                     ← Near the bottom: followed
- Don&apos;t recommend competitor products
- Don&apos;t ask for personal information
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &quot;Basic rules&quot; at the top and the &quot;Prohibited&quot; list at the bottom get followed, but the &quot;Response format&quot; and &quot;Data reference guide&quot; buried in the middle get ignored. The longer the prompt, the more often you hit this pattern.&lt;/p&gt;
&lt;h3&gt;Missing requirements in code generation&lt;/h3&gt;
&lt;p&gt;The same pattern shows up in code generation when the requirements section is long. The tech stack at the top and the response format at the bottom get followed, while validation and error-handling requirements written in the middle drop out entirely. If the whole prompt is short, no problem — but as context and examples grow, the impact starts showing.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;3. The tail checklist pattern&lt;/h2&gt;
&lt;h3&gt;Overview&lt;/h3&gt;
&lt;p&gt;The most practical countermeasure for Lost in the Middle is the &lt;strong&gt;tail checklist pattern&lt;/strong&gt;. You restate the important instructions as a checklist at the very end of the prompt, prompting the LLM to &quot;double-check.&quot;&lt;/p&gt;
&lt;h3&gt;Before / After&lt;/h3&gt;
&lt;p&gt;The example below simplifies a system prompt that would normally span several hundred lines. In practice each section is more detailed, with many rules and chunks of context in between.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Before (middle instructions get buried):&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;You are a code review assistant.

## Review perspectives
... (5 items)

... (many sections: coding standards, language-specific rules, edge cases...)

## Output format                       ← Buried in the middle
- Classify severity as High/Medium/Low
- Attach a code example for each suggestion
- State the impact scope

... (more sections)

## Code under review
{code}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;After (checklist appended at the end):&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;You are a code review assistant.

## Review perspectives
... (5 items)

... (many sections: coding standards, language-specific rules, edge cases...)

## Output format
- Classify severity as High/Medium/Low
- Attach a code example for each suggestion
- State the impact scope

... (more sections)

## Code under review
{code}

---
## Final checklist before output       ← Added here
Before producing the answer, confirm all of the following:
- [ ] Did you address all 5 review perspectives?
- [ ] Did you assign a severity (High/Medium/Low) to each finding?
- [ ] Did you attach a code example to each finding?
- [ ] Did you state the impact scope?
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;By placing the checklist at the end, the LLM &quot;re-recognizes&quot; these constraints right before generating output. You&apos;re flipping the U-shape to your advantage by putting the verification items in the position where attention is highest — the end.&lt;/p&gt;
&lt;p&gt;In my experience, after introducing this pattern the rate of middle-buried instructions getting ignored dropped noticeably. It works especially well for instructions like &quot;output format,&quot; which tend to live in the middle of prompts.&lt;/p&gt;
&lt;h3&gt;Real example: improving structured JSON output&lt;/h3&gt;
&lt;p&gt;I ran into this concretely while using LangChain with OpenAI models for a task that extracted structured JSON from free-form user text. The setup used LangChain&apos;s &lt;code&gt;with_structured_output&lt;/code&gt;, so the schema and field descriptions were defined via Pydantic&apos;s &lt;code&gt;Field(description=...)&lt;/code&gt;, while extraction rules for each field (required vs. optional, default values, format specifications, etc.) lived in the prompt.&lt;/p&gt;
&lt;p&gt;As the number of fields grew, extraction accuracy for fields described in the middle of the prompt visibly dropped. Field rules near the top and bottom were applied fine, but fields buried in the middle came back as &lt;code&gt;null&lt;/code&gt; or with wrong values — exactly the U-shape.&lt;/p&gt;
&lt;p&gt;Adding a reminder at the end of the prompt (after the user input) measurably improved extraction accuracy for those fields.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field

# Pydantic schema (description is also passed to the LLM)
class TaskSchema(BaseModel):
    category: str = Field(description=&quot;Choose from the predefined categories&quot;)
    priority: int = Field(description=&quot;Integer 1-5&quot;)
    due_date: str | None = Field(description=&quot;ISO 8601 date, null if unknown&quot;)

prompt = ChatPromptTemplate.from_messages([
    (&quot;system&quot;, &quot;&quot;&quot;You are an assistant that extracts task information.
Extract structured data from the user&apos;s input.&quot;&quot;&quot;),
    (&quot;human&quot;, &quot;&quot;&quot;{user_input}

## Pre-output check
Before responding, confirm the following fields are extracted correctly:
- &quot;category&quot;: must be picked from the predefined categories (do not guess)
- &quot;priority&quot;: must be an integer 1-5
- &quot;due_date&quot;: must be ISO 8601 (null if no date is in the input)&quot;&quot;&quot;),
])

structured_llm = llm.with_structured_output(TaskSchema)
messages = prompt.format_messages(user_input=user_input)
result = structured_llm.invoke(messages)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What&apos;s notable is that adding this tail reminder had almost no effect on the other fields (the ones that were already being extracted correctly). When I tried the same instructions as emphasis on the field definition in the middle of the prompt, surrounding fields would sometimes drift slightly, but the tail placement showed virtually no such side effects. The tail checklist lets you reinforce a weak spot in a targeted way without breaking the existing output.&lt;/p&gt;
&lt;p&gt;One caveat: when you tweak the extraction rules in the prompt, you have to update the Pydantic model&apos;s &lt;code&gt;Field(description=...)&lt;/code&gt; to match — otherwise the prompt and the schema disagree, and accuracy can suffer despite your fix. &lt;code&gt;with_structured_output&lt;/code&gt; passes the schema&apos;s &lt;code&gt;description&lt;/code&gt; to the LLM as well, so prompt and schema need to stay in sync. It&apos;s a mundane point but easy to overlook in practice.&lt;/p&gt;
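&lt;p&gt;One way to remove that failure mode is to make the schema the single source of truth and generate the tail checklist from it. The snippet below is a sketch of that idea, not how my original setup worked: the &lt;code&gt;build_tail_checklist&lt;/code&gt; helper is hypothetical (not part of LangChain or Pydantic) and assumes Pydantic v2&apos;s &lt;code&gt;model_fields&lt;/code&gt; API.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from pydantic import BaseModel, Field

class TaskSchema(BaseModel):
    category: str = Field(description="Choose from the predefined categories")
    priority: int = Field(description="Integer 1-5")
    due_date: str | None = Field(description="ISO 8601 date, null if unknown")

def build_tail_checklist(model_cls):
    # Hypothetical helper: render one checklist line per field from the
    # schema, so editing a Field(description=...) updates the prompt too.
    lines = ["## Pre-output check",
             "Before responding, confirm the following fields are extracted correctly:"]
    for name, field in model_cls.model_fields.items():
        lines.append(f'- "{name}": {field.description}')
    return "\n".join(lines)

checklist = build_tail_checklist(TaskSchema)  # append after {user_input}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Appending the generated text after the user input keeps the prompt and the &lt;code&gt;Field(description=...)&lt;/code&gt; definitions in sync by construction.&lt;/p&gt;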
&lt;p&gt;On injecting domain-specific knowledge in LangChain, the LangChain blog post &quot;&lt;a href=&quot;https://blog.langchain.com/incorporating-domain-specific-knowledge-in-sql-llm-solutions/&quot;&gt;Incorporating domain specific knowledge in SQL-LLM solutions&lt;/a&gt;&quot; recommends dynamically retrieving relevant few-shot examples rather than relying on a static prompt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A more powerful approach is to have a robust dataset of good examples, and &lt;em&gt;dynamically&lt;/em&gt; include those which are relevant to the user question.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Specifically, the post shows building a custom Retriever Tool backed by a vector database to fetch examples semantically similar to the user&apos;s question. For structured-output tasks with many fields, dynamically selecting and placing the rules relevant to the input — rather than statically listing every rule — may be less susceptible to Lost in the Middle.&lt;/p&gt;
&lt;h3&gt;When the tail checklist isn&apos;t enough&lt;/h3&gt;
&lt;p&gt;The technique has limits.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Too many checklist items reduce effectiveness&lt;/strong&gt;: Per &quot;&lt;a href=&quot;https://openreview.net/forum?id=R6q67CDBCH&quot;&gt;Curse of Instructions&lt;/a&gt;&quot; (ManyIFEval, 2024; ICLR 2025), even an LLM that follows individual instructions 90% of the time has a theoretical success rate of only 0.9^10 ≈ 35% for satisfying 10 instructions simultaneously (actual values vary by model). Once the checklist gets long, you can also re-introduce Lost in the Middle within the checklist itself.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Vague instructions&lt;/strong&gt;: Items like &quot;handle this appropriately&quot; still leave the LLM free to interpret them loosely, so checking them off verifies little.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Highly creative generation&lt;/strong&gt;: For free-form writing, an over-constrained checklist can hurt output quality.&lt;/li&gt;
&lt;/ul&gt;
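&lt;p&gt;The compounding arithmetic behind the first point is worth internalizing. Under the simplifying assumption that each instruction is followed independently with the same probability:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# If each instruction is followed independently with probability p,
# the chance of satisfying all n of them at once is p ** n.
def all_followed(p, n):
    return p ** n

print(round(all_followed(0.9, 10), 3))  # 0.349, i.e. about 35% for 10 items
print(round(all_followed(0.9, 5), 3))   # capping the list at 5 items: about 59%
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Real models are not independent across instructions, so treat this as intuition for why shorter checklists hold up better, not as a prediction.&lt;/p&gt;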
&lt;h3&gt;Implementation tips&lt;/h3&gt;
&lt;p&gt;Tips for using the tail checklist effectively:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;1. Write the checklist as &quot;verification items,&quot; not as a copy of the body
   - Bad:  Pasting the same prose
   - Good: A concise list of points to verify

2. Cap the list at 5–10 items
   - Too many backfires (Lost in the Middle inside the checklist itself)

3. Explicitly say &quot;verify before answering&quot;
   - Encourage a verification pass

4. Prioritize the items most often missed
   - Don&apos;t restate everything; emphasize what&apos;s empirically dropped
&lt;/code&gt;&lt;/pre&gt;
&lt;hr /&gt;
&lt;h2&gt;4. Other approaches&lt;/h2&gt;
&lt;p&gt;The tail checklist is lightweight and effective, but there are also approaches that improve the structure of the prompt itself, or address the issue outside the prompt — like the RAG pipeline.&lt;/p&gt;
&lt;h3&gt;Sandwich strategy&lt;/h3&gt;
&lt;p&gt;Place the most important information &lt;strong&gt;at both the beginning and the end&lt;/strong&gt; of the prompt.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;## Most important rule
Always return output in JSON format.

## Context
{lots of context...}

## Additional info
{more context...}

## Reminder: always return output in JSON format.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You&apos;re putting the critical instruction at the two ends of the U-shape — the highest-performing positions — so it&apos;s simple but effective. The trade-off is that you have to pick a single &quot;most important&quot; item, which makes it a poor fit when you want to emphasize multiple instructions at once.&lt;/p&gt;
&lt;h3&gt;XML-tag structuring and section splitting&lt;/h3&gt;
&lt;p&gt;Use XML tags or Markdown headers to clearly partition the prompt into sections that are easy for the LLM to parse.&lt;/p&gt;
&lt;p&gt;Anthropic&apos;s prompt engineering tutorial recommends separating data and instructions with XML tags. By bracketing input data with tags like &lt;code&gt;&amp;lt;sentences&amp;gt;...&amp;lt;/sentences&amp;gt;&lt;/code&gt;, the LLM can more clearly distinguish the data region from the instruction region, which can reduce the risk of missing middle information. Note that XML structuring doesn&apos;t eliminate position bias by itself — it&apos;s better used in combination with other techniques.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;lt;system&amp;gt;
You are a data analysis assistant.
&amp;lt;/system&amp;gt;

&amp;lt;rules&amp;gt;
&amp;lt;rule priority=&quot;high&quot;&amp;gt;Always cite the source of any number&amp;lt;/rule&amp;gt;
&amp;lt;rule priority=&quot;high&quot;&amp;gt;Mark estimates explicitly as &quot;estimated&quot;&amp;lt;/rule&amp;gt;
&amp;lt;rule priority=&quot;medium&quot;&amp;gt;Include axis labels in chart descriptions&amp;lt;/rule&amp;gt;
&amp;lt;/rules&amp;gt;

&amp;lt;context&amp;gt;
{the data to analyze}
&amp;lt;/context&amp;gt;

&amp;lt;output_format&amp;gt;
{output format specification}
&amp;lt;/output_format&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Adding a &lt;code&gt;priority&lt;/code&gt; attribute also gives the LLM a hint for judging importance. Making it explicit &quot;what is written where&quot; through structure helps reduce the risk of middle information being buried.&lt;/p&gt;
&lt;h3&gt;Strategic document placement in RAG&lt;/h3&gt;
&lt;p&gt;In a RAG (Retrieval-Augmented Generation) pipeline, the &lt;strong&gt;ordering&lt;/strong&gt; of retrieved documents directly affects answer quality.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def reorder_documents(docs: list[str], scores: list[float]) -&amp;gt; list[str]:
    &quot;&quot;&quot;
    A Lost in the Middle countermeasure: place the highest-relevance
    documents at the beginning and the end.

    Example: scores [A(0.9), B(0.8), C(0.7), D(0.6), E(0.5)]
    Result:  [A(0.9), C(0.7), E(0.5), D(0.6), B(0.8)]
              ^^^^^^                           ^^^^^^
              High score at head        High score at tail
    &quot;&quot;&quot;
    scored_docs = list(zip(docs, scores))
    scored_docs.sort(key=lambda x: x[1], reverse=True)

    head = []  # head side (even indices: 1st, 3rd, 5th...)
    tail = []  # tail side (odd indices: 2nd, 4th, 6th...)

    for i, (doc, score) in enumerate(scored_docs):
        if i % 2 == 0:
            head.append(doc)
        else:
            tail.append(doc)

    # Reverse the tail so the highest-scoring item lands at the very end
    return head + tail[::-1]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;By keeping the lowest-relevance documents in the middle and the highest-relevance ones at the ends, you reduce the risk of important information being overlooked.&lt;/p&gt;
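&lt;p&gt;A quick sanity check of the interleaving logic, restated compactly here so it runs on its own (document IDs 1-5 stand in for real documents):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def reorder(docs, scores):
    # highest-scoring documents go to the two ends, lowest to the middle
    ranked = [doc for _, doc in sorted(zip(scores, docs), reverse=True)]
    head = ranked[0::2]       # ranks 1, 3, 5, ...
    tail = ranked[1::2]       # ranks 2, 4, ...
    return head + tail[::-1]  # reverse so rank 2 lands at the very end

print(reorder([1, 2, 3, 4, 5], [0.9, 0.8, 0.7, 0.6, 0.5]))  # [1, 3, 5, 4, 2]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The output matches the docstring example: the two highest-relevance documents land at the head and the tail.&lt;/p&gt;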
&lt;h3&gt;Quantitative validation of position vs. accuracy&lt;/h3&gt;
&lt;p&gt;Lost in the Middle also affects few-shot prompting. Anthropic&apos;s blog post &quot;&lt;a href=&quot;https://www.anthropic.com/news/prompting-long-context&quot;&gt;Prompt engineering for Claude&apos;s long context window&lt;/a&gt;&quot; quantitatively evaluates techniques for improving information retrieval from long contexts — like extracting relevant quotes first before answering, and adding correctly answered Q&amp;amp;A examples to the prompt.&lt;/p&gt;
&lt;p&gt;If you want to measure how much position bias affects your own prompts, building a validation pipeline informed by these benchmarks is a good starting point.&lt;/p&gt;
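&lt;p&gt;As a concrete starting point, a needle-in-a-haystack-style probe takes only a few lines. Everything below is a sketch of my own: &lt;code&gt;ask_model&lt;/code&gt; is a placeholder for whatever client you use, and the needle and filler texts are invented.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import random

def build_prompt(filler_sentences, needle, position):
    # insert the needle at a relative position (0.0 = start, 1.0 = end)
    docs = list(filler_sentences)
    docs.insert(int(position * len(docs)), needle)
    return "\n".join(docs) + "\n\nQuestion: what is the secret code?"

def probe(ask_model, filler_sentences, trials=20):
    # measure accuracy per needle position, to compare start / middle / end
    results = {}
    for position in (0.0, 0.25, 0.5, 0.75, 1.0):
        hits = 0
        for _ in range(trials):
            shuffled = random.sample(filler_sentences, len(filler_sentences))
            prompt = build_prompt(shuffled, "The secret code is 7431.", position)
            if "7431" in ask_model(prompt):
                hits += 1
        results[position] = hits / trials
    return results
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A U-shaped accuracy curve across the five positions indicates position bias for your particular prompt and model combination.&lt;/p&gt;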
&lt;hr /&gt;
&lt;h2&gt;5. Summary&lt;/h2&gt;
&lt;h3&gt;Comparing the techniques&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Technique&lt;/th&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;th&gt;Token cost&lt;/th&gt;
&lt;th&gt;Implementation difficulty&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tail checklist&lt;/td&gt;
&lt;td&gt;System prompts in general&lt;/td&gt;
&lt;td&gt;Low (just the list)&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sandwich strategy&lt;/td&gt;
&lt;td&gt;Single most-important rule&lt;/td&gt;
&lt;td&gt;Low (one restated line)&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;XML-tag structuring&lt;/td&gt;
&lt;td&gt;Multiple kinds of information&lt;/td&gt;
&lt;td&gt;Medium (tag overhead)&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RAG document placement&lt;/td&gt;
&lt;td&gt;RAG pipelines&lt;/td&gt;
&lt;td&gt;None (reorder only)&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3&gt;Pay attention to information placement&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Put the most important instructions at the &lt;strong&gt;beginning and the end&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;If something has to live in the middle, restate it at the end.&lt;/li&gt;
&lt;li&gt;The longer the prompt, the higher the Lost in the Middle risk.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Use a tail checklist for double-verification&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Add a &quot;Final checklist before output&quot; at the end of the prompt.&lt;/li&gt;
&lt;li&gt;Prioritize instructions that have been missed in past runs.&lt;/li&gt;
&lt;li&gt;Keep the list to 5–10 items.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Continuously monitor prompt quality&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Set up before/after accuracy comparisons for prompt changes.&lt;/li&gt;
&lt;li&gt;Periodically check whether failures cluster around specific input patterns.&lt;/li&gt;
&lt;li&gt;Position-bias impact can shift across model versions, so revalidate when you upgrade models.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h2&gt;Lessons learned&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;LLM position bias is a &lt;strong&gt;structural issue rooted in the architecture&lt;/strong&gt;, and you can&apos;t fully solve it with prompt wording alone. But understanding the structure and applying countermeasures can substantially improve practical accuracy.&lt;/li&gt;
&lt;li&gt;The latest 2025 models still don&apos;t eliminate the problem, and the failure mode varies by model family (hallucinate vs. abstain). Validation against your specific model is essential.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h2&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Liu et al. (2023) &quot;Lost in the Middle: How Language Models Use Long Contexts&quot; - &lt;a href=&quot;https://arxiv.org/abs/2307.03172&quot;&gt;arXiv:2307.03172&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Su et al. (2021) &quot;RoFormer: Enhanced Transformer with Rotary Position Embedding&quot; - &lt;a href=&quot;https://arxiv.org/abs/2104.09864&quot;&gt;arXiv:2104.09864&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Anthropic &quot;Prompt Engineering: Use XML Tags&quot; - &lt;a href=&quot;https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/use-xml-tags&quot;&gt;docs.anthropic.com&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;LangChain Blog &quot;Incorporating domain specific knowledge in SQL-LLM solutions&quot; - &lt;a href=&quot;https://blog.langchain.com/incorporating-domain-specific-knowledge-in-sql-llm-solutions/&quot;&gt;blog.langchain.com&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Anthropic &quot;Prompt engineering for Claude&apos;s long context window&quot; - &lt;a href=&quot;https://www.anthropic.com/news/prompting-long-context&quot;&gt;anthropic.com&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Modarressi et al. (2025) &quot;NoLiMa: Long-Context Evaluation Beyond Literal Matching&quot; - &lt;a href=&quot;https://arxiv.org/abs/2502.05167&quot;&gt;arXiv:2502.05167&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Chroma Research (2025) &quot;Context Rot: How Increasing Input Tokens Impacts LLM Performance&quot; - &lt;a href=&quot;https://research.trychroma.com/context-rot&quot;&gt;research.trychroma.com&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&quot;Curse of Instructions: Large Language Models Cannot Follow Multiple Instructions at Once&quot; (2024; ICLR 2025) - &lt;a href=&quot;https://openreview.net/forum?id=R6q67CDBCH&quot;&gt;OpenReview&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</content:encoded><enclosure url="https://zeroshotlog.com/images/2026/02/04/lost-in-the-middle.png" length="0" type="image/png"/></item><item><title>Escaping the AI-Generated &apos;Purple Gradient&apos;</title><link>https://zeroshotlog.com/en/blog/2026/01/18/design-renewal/</link><guid isPermaLink="true">https://zeroshotlog.com/en/blog/2026/01/18/design-renewal/</guid><description>I redesigned my tech blog using a workflow that combines Claude Code, Playwright MCP, and Stitch AI — and worked around the typical AI-generated design clichés.</description><pubDate>Sun, 18 Jan 2026 12:00:00 GMT</pubDate><content:encoded>
&lt;h2&gt;TL;DR&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Hand a UI to AI without specific direction and you&apos;ll almost certainly end up with a &quot;purple gradient.&quot;&lt;/li&gt;
&lt;li&gt;I rebuilt the design using Claude Code + Playwright MCP + a design AI (Stitch).&lt;/li&gt;
&lt;li&gt;The new palette is amber-500 (amber), with a terminal/code-inspired look.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h2&gt;The Purple Gradient Problem&lt;/h2&gt;
&lt;p&gt;This blog, &quot;Zero-Shot Log,&quot; originally had its design generated by AI.&lt;/p&gt;
&lt;p&gt;It didn&apos;t look bad. But it felt familiar — like I&apos;d seen it somewhere before.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/2026/01/18/before-desktop.png&quot; alt=&quot;The old design (purple gradient)&quot; /&gt;&lt;/p&gt;
&lt;p&gt;The culprit: the &quot;purple gradient.&quot;&lt;/p&gt;
&lt;p&gt;I came across an interesting post about this on X.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://twitter.com/seltzer/status/2010678415142039560&quot;&gt;twitter.com/seltzer/status/2010678415142039560&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;A reply pointed out that this is heavily influenced by Tailwind&apos;s &lt;code&gt;bg-indigo-500&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://twitter.com/masayanishigaki/status/2010689800030871814&quot;&gt;twitter.com/masayanishigaki/status/2010689800030871814&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The theory is that AI learned indigo/purple gradients as the &quot;optimal default&quot; from its training data.&lt;/p&gt;
&lt;p&gt;I decided it was time to break out of that pattern.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;The Renewal Workflow&lt;/h2&gt;
&lt;p&gt;For this redesign, I combined three AI tools.&lt;/p&gt;
&lt;h3&gt;1. Capture the current state: Claude Code + Playwright MCP&lt;/h3&gt;
&lt;p&gt;First, I connected Playwright MCP to Claude Code and took screenshots of the current site — the top page and an article page, on both desktop and mobile.&lt;/p&gt;
&lt;p&gt;This way I could communicate the &quot;current state&quot; accurately when briefing a designer AI.&lt;/p&gt;
&lt;h3&gt;2. Write the design brief&lt;/h3&gt;
&lt;p&gt;Based on those screenshots, I wrote a design brief (&lt;code&gt;design-brief.md&lt;/code&gt;) for the designer AI. It included:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Site info (name, concept, page structure)&lt;/li&gt;
&lt;li&gt;The current color scheme and what was wrong with it&lt;/li&gt;
&lt;li&gt;Design elements to avoid (purple gradients, flashy glows, etc.)&lt;/li&gt;
&lt;li&gt;The direction I wanted (minimal, warm dark, terminal/Vibe Coding aesthetic, etc.)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The idea is to control AI output by being explicit about what you &lt;em&gt;don&apos;t&lt;/em&gt; want.&lt;/p&gt;
&lt;h3&gt;3. Hand it to the design AI (Stitch)&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://stitch.withgoogle.com/&quot;&gt;Stitch&lt;/a&gt; is a UI generation tool from Google Labs. Powered by Gemini 2.5 Pro, it generates UI designs from text or images. Currently in beta and free to use.&lt;/p&gt;
&lt;p&gt;I passed the screenshots and the brief to Stitch and got back a new design proposal.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/2026/01/18/designer-output.png&quot; alt=&quot;Output from Stitch&quot; /&gt;&lt;/p&gt;
&lt;p&gt;What came back was an HTML file plus preview images. The warm amber palette was applied as requested.&lt;/p&gt;
&lt;h3&gt;4. Implementation&lt;/h3&gt;
&lt;p&gt;Stitch&apos;s output includes HTML, but it&apos;s mainly there to fill in details that an image alone can&apos;t convey. You can&apos;t just drop it into existing Astro components, so I rewrote the styles in Claude Code while taking the color palette and font choices as references.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Wrap-up&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;The &quot;purple gradient&quot; problem in AI-generated design is avoidable with explicit direction.&lt;/li&gt;
&lt;li&gt;The flow of &quot;screenshots + brief → design AI → implementation&quot; makes redesigns go smoothly.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It&apos;s a good time to be alive — design renewals are this easy with AI now.&lt;/p&gt;
</content:encoded></item><item><title>I Built a Prompt Expansion Tool: Query Expander</title><link>https://zeroshotlog.com/en/blog/query-expander-introduction/</link><guid isPermaLink="true">https://zeroshotlog.com/en/blog/query-expander-introduction/</guid><description>A tool that turns vague prompts into clear, structured ones. Built as a Claude Artifact, so it&apos;s available any time you&apos;re logged into Claude.</description><pubDate>Mon, 12 Jan 2026 12:00:00 GMT</pubDate><content:encoded>&lt;h2&gt;TL;DR&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;I built &lt;strong&gt;&lt;a href=&quot;https://claude.ai/public/artifacts/9a14629b-bc73-41ab-bd96-c8be99c8feee&quot;&gt;Query Expander&lt;/a&gt;&lt;/strong&gt;, a tool that expands LLM prompts into clearer, more explicit instructions.&lt;/li&gt;
&lt;li&gt;It runs as a Claude Artifact, so it&apos;s available any time you&apos;re logged into Claude.&lt;/li&gt;
&lt;li&gt;Three detail levels (Concise / Standard / Detailed) let you tune the output.&lt;/li&gt;
&lt;li&gt;It auto-detects the input language (Japanese / English) and matches the output to it.&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Updated February 2026&lt;/strong&gt;: Bumped to v1.1.1. Fixes a bug where the copy button didn&apos;t work.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr /&gt;
&lt;h2&gt;Why I Built It&lt;/h2&gt;
&lt;p&gt;When I ask Claude Code to investigate or implement something, the precision of the answer scales with how clear the request is.&lt;/p&gt;
&lt;p&gt;In RAG, turning a user&apos;s vague query into something search-friendly is called &quot;Query Expansion.&quot; I wanted to apply the same idea to LLM prompts in general.&lt;/p&gt;
&lt;p&gt;I used to keep a set of expansion rules configured in Claude itself. A vague request like &quot;look into X&quot; got rewritten into a clear prompt that included the goal, the scope, and the expected output format.&lt;/p&gt;
&lt;p&gt;The catch: I was doing this constantly. It came up so often during work that opening a chat every single time started to feel like friction.&lt;/p&gt;
&lt;p&gt;So I packaged it as a standalone Claude Artifact. As long as you&apos;re logged into Claude, you can use it — no more context switching to expand a prompt. It runs inside your own plan, so you don&apos;t have to worry about API token usage either.&lt;/p&gt;
&lt;p&gt;For the UI design, I used Google&apos;s &lt;a href=&quot;https://stitch.withgoogle.com/&quot;&gt;Stitch&lt;/a&gt;. It generates UI from prompts, which makes it fast to put together a base design.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;What Query Expander Is&lt;/h2&gt;
&lt;p&gt;Query Expander is a tool that turns vague prompts into clear, structured ones.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Main features:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Three detail levels&lt;/strong&gt;: Concise / Standard / Detailed&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Refine Only mode&lt;/strong&gt;: Polish the wording while keeping the original structure&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Automatic language detection&lt;/strong&gt;: Japanese in, Japanese out; English in, English out&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Markdown preserved&lt;/strong&gt;: Headings and list structure survive the expansion&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h2&gt;How to Use It&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;Paste the prompt you want to expand into the text area.&lt;/li&gt;
&lt;li&gt;Pick a detail level (Concise / Standard / Detailed).&lt;/li&gt;
&lt;li&gt;Click &quot;ENHANCE QUERY&quot;.&lt;/li&gt;
&lt;li&gt;Copy the expanded prompt and use it.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;Choosing a Detail Level&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Level&lt;/th&gt;
&lt;th&gt;Output Size&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Concise&lt;/td&gt;
&lt;td&gt;1–2 sentences&lt;/td&gt;
&lt;td&gt;Minor tweaks, typo fixes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Standard&lt;/td&gt;
&lt;td&gt;2–4 sentences&lt;/td&gt;
&lt;td&gt;General prompt improvement (default)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Detailed&lt;/td&gt;
&lt;td&gt;3–5 items&lt;/td&gt;
&lt;td&gt;Complex tasks, when detailed instructions are needed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3&gt;Refine Only Mode&lt;/h3&gt;
&lt;p&gt;The default mode &lt;em&gt;expands&lt;/em&gt; the prompt. Refine Only mode keeps the original structure and just &lt;em&gt;polishes&lt;/em&gt; it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Difference from the default mode:&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Behavior&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Default&lt;/td&gt;
&lt;td&gt;Adds purpose, scope, format, etc. to expand the prompt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Refine Only&lt;/td&gt;
&lt;td&gt;Keeps the original structure, refines the wording&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;When Refine Only is the right fit:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You already have a structured prompt and just want to sharpen the wording.&lt;/li&gt;
&lt;li&gt;You don&apos;t want the original format touched.&lt;/li&gt;
&lt;li&gt;You only want a tone adjustment (casual → formal, etc.).&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h2&gt;Example&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Look into React state management
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Standard output (illustrative):&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Research the main approaches to state management in React.
Cover useState, useReducer, the Context API, and external libraries (Redux, Zustand, Jotai, etc.).
For each, summarize its characteristics, appropriate use cases, and performance considerations in a comparison format.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A vague &quot;look into it&quot; turns into a clear prompt with purpose, scope, and expected format.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Links&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://claude.ai/public/artifacts/9a14629b-bc73-41ab-bd96-c8be99c8feee&quot;&gt;Query Expander&lt;/a&gt; — link to the Claude Artifact&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://query-expander.gitbook.io/docs&quot;&gt;Documentation&lt;/a&gt; — detailed usage&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h2&gt;Wrap-up&lt;/h2&gt;
&lt;p&gt;Tightening up your prompts is unglamorous work, but it directly affects LLM output quality. Query Expander makes that step quick.&lt;/p&gt;
&lt;p&gt;It&apos;s free to use as long as you&apos;re logged into Claude. If you use LLMs day to day, give it a try.&lt;/p&gt;
</content:encoded><enclosure url="https://zeroshotlog.com/images/2026/01/12/query-expander.png" length="0" type="image/png"/></item><item><title>Handing Off Conversations When Antigravity Slows Down</title><link>https://zeroshotlog.com/en/blog/antigravity-hidden-brain-feature/</link><guid isPermaLink="true">https://zeroshotlog.com/en/blog/antigravity-hidden-brain-feature/</guid><description>How to deal with Antigravity getting sluggish during long sessions, and how to use the brain directory to hand off context to a new conversation. Just pass a UUID and the next session picks up where the last one left off.</description><pubDate>Sat, 03 Jan 2026 10:00:00 GMT</pubDate><content:encoded>&lt;h2&gt;TL;DR&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Antigravity gets sluggish during long sessions.&lt;/li&gt;
&lt;li&gt;Splitting work across multiple sessions helps, but carrying context over becomes the next problem.&lt;/li&gt;
&lt;li&gt;Asking Antigravity itself how to hand off led me to the &quot;brain directory.&quot;&lt;/li&gt;
&lt;li&gt;With the brain directory, you can resume cleanly in a fresh session.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h2&gt;What Happened&lt;/h2&gt;
&lt;h3&gt;Long conversations slow things down&lt;/h3&gt;
&lt;p&gt;I was working on a longer implementation task in Antigravity.&lt;/p&gt;
&lt;p&gt;The agent was handling a sizable chunk of work, and gradually the responses started getting slower. The UI became unresponsive, and the agent was clearly lagging. A look at Activity Monitor showed memory usage was way up.&lt;/p&gt;
&lt;h3&gt;The cause: token budget pressure&lt;/h3&gt;
&lt;p&gt;Digging in, I learned Antigravity has roughly a 200,000-token budget per conversation. As the conversation grows, that budget gets consumed and processing load climbs.&lt;/p&gt;
&lt;p&gt;On top of that, conversation history fills up local storage. Antigravity stores screenshots, recordings, and other &quot;Artifacts&quot; locally, which likely contributes to the slowdown.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Splitting Sessions Helps&lt;/h2&gt;
&lt;h3&gt;Basic mitigations&lt;/h3&gt;
&lt;p&gt;A few things worked for me:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Delete old conversation logs&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In the Agent Manager history, delete conversations you no longer need. Once you have dozens of accumulated sessions, IDE-wide responsiveness can take a hit.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. Split long work across sessions&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Instead of pushing one conversation to the limit, switch to a new conversation at natural breakpoints. The token budget resets and things feel snappy again.&lt;/p&gt;
&lt;h3&gt;The remaining problem: handing off context&lt;/h3&gt;
&lt;p&gt;Splitting sessions fixed performance — but introduced a new problem.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;How do you bring the previous work&apos;s context into the new session?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Re-explaining &quot;we got this far last time&quot; and &quot;we were going with this approach&quot; every time is tedious. And in the middle of a complex implementation, it&apos;s easy to leave something out.&lt;/p&gt;
&lt;p&gt;Antigravity has a Knowledge Items feature that learns automatically, and the accumulated knowledge does carry over. But the docs aren&apos;t clear about scope (workspace? directory?) or how much actually carries. Sometimes you specifically want to hand off the tasks and files from one conversation.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;A Lucky Find: Antigravity&apos;s Own Answer&lt;/h2&gt;
&lt;p&gt;Before switching sessions, I asked Antigravity:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;How should I hand this off to a new conversation?&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The answer:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;When you start a separate conversation, just say &apos;continue from the previous conversation (a1b2c3d4-...)&apos; and I can pick up the work by referencing these documents. In particular, current_issues.md has a detailed summary of what the next session should tackle.&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Wait, that&apos;s a thing?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;What is &lt;code&gt;current_issues.md&lt;/code&gt;? Where does it live? I poked around and found an unfamiliar folder in my home directory:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ls ~/.gemini/antigravity/brain/
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is where the per-session Artifacts were being stored.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;What Is the Brain Directory?&lt;/h2&gt;
&lt;h3&gt;Overview&lt;/h3&gt;
&lt;p&gt;The brain directory (&lt;code&gt;~/.gemini/antigravity/brain/&lt;/code&gt;) is where the Antigravity agent stores the records it generates while working.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;~/.gemini/antigravity/brain/
├── a1b2c3d4-5678-90ab-cdef-1234567890ab/
│   ├── task.md
│   ├── implementation_plan.md
│   ├── walkthrough.md
│   ├── current_issues.md
│   ├── uploaded_image_1234567890123.png
│   └── uploaded_image_1234567890456.png
├── b2c3d4e5-6789-01bc-def0-2345678901bc/
│   └── ...
└── ...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Each UUID-named folder corresponds to one conversation session.&lt;/p&gt;
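&lt;p&gt;Because each session is just a folder, ordinary shell tools work on it. As a small sketch (the helper name and the parameterized path are mine, not part of Antigravity), this prints the most recently modified session folder, which is usually the conversation you just left:&lt;/p&gt;

```shell
# Hypothetical helper: show the most recently modified session folder.
# The directory is a parameter (so the sketch is easy to test elsewhere);
# it defaults to the real brain location.
latest_session() {
  local dir="${1:-$HOME/.gemini/antigravity/brain}"
  ls -t "$dir" | head -n 1
}
```

&lt;p&gt;Running it right after closing a sluggish conversation gives you the UUID to paste into the next one.&lt;/p&gt;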
&lt;h3&gt;What gets saved&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;File&lt;/th&gt;
&lt;th&gt;Contents&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;task.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Task progress (checklist format)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;implementation_plan.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Technical details of the implementation plan&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;walkthrough.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Summary of changes after implementation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;current_issues.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Issues to address in the next session&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;uploaded_image_*.png&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Images uploaded during the conversation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;code&gt;current_issues.md&lt;/code&gt; is the key one for handoffs. It captures what&apos;s unresolved and what the next session should pick up.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Handing Off Conversations Using the Brain Directory&lt;/h2&gt;
&lt;h3&gt;Approach 1: Hand off with a UUID&lt;/h3&gt;
&lt;p&gt;Start a new conversation and pass it the previous session&apos;s ID:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Continue from the previous conversation (a1b2c3d4-5678-90ab-cdef-1234567890ab).
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The agent reads the matching Artifacts in the brain directory, understands the context, and picks up the work.&lt;/p&gt;
&lt;h3&gt;Approach 2: Point it at current_issues.md directly&lt;/h3&gt;
&lt;p&gt;For a more reliable handoff, give it the exact file path:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Check ~/.gemini/antigravity/brain/a1b2c3d4-.../current_issues.md and
resume from the unresolved items.
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;How to find the conversation ID&lt;/h3&gt;
&lt;p&gt;Just ask Antigravity directly:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;What&apos;s the conversation ID (UUID) for this conversation?
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Example response:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;This conversation&apos;s ID is a1b2c3d4-5678-90ab-cdef-1234567890ab.&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Note: I confirmed this in Planning mode with Gemini 3 Pro. Other models or contexts may respond differently.&lt;/p&gt;
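&lt;p&gt;If you paste the agent&apos;s reply into a terminal, the UUID can also be pulled out mechanically. A sketch (the helper name is mine; the pattern is just the standard 8-4-4-4-12 hex UUID form):&lt;/p&gt;

```shell
# Hypothetical helper: extract the first UUID from a pasted agent reply.
extract_uuid() {
  printf '%s\n' "$1" |
    grep -oE '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}' |
    head -n 1
}

extract_uuid "This conversation's ID is a1b2c3d4-5678-90ab-cdef-1234567890ab."
# prints: a1b2c3d4-5678-90ab-cdef-1234567890ab
```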
&lt;hr /&gt;
&lt;h2&gt;A Practical Workflow&lt;/h2&gt;
&lt;h3&gt;Antigravity standalone&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Start work&lt;/strong&gt;: Begin a new conversation and assign the task.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;During work&lt;/strong&gt;: The agent automatically generates &lt;code&gt;task.md&lt;/code&gt;, &lt;code&gt;implementation_plan.md&lt;/code&gt;, etc.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;At a breakpoint&lt;/strong&gt;: When things get sluggish, ask the agent to &quot;bring the Artifacts up to date.&quot; &lt;code&gt;task.md&lt;/code&gt;, &lt;code&gt;implementation_plan.md&lt;/code&gt;, &lt;code&gt;current_issues.md&lt;/code&gt;, etc. all get refreshed.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Capture the UUID&lt;/strong&gt;: Ask &quot;What&apos;s the UUID for this conversation?&quot; and save it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hand off&lt;/strong&gt;: Start a new conversation and say &quot;continue from the previous conversation (UUID).&quot;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Key point&lt;/strong&gt;: Update the Artifacts before handing off. If they&apos;re stale, the new session will start from a misaligned picture.&lt;/p&gt;
&lt;h3&gt;Combined with Claude Code&lt;/h3&gt;
&lt;p&gt;Brain files are plain Markdown, so other AI tools like Claude Code can read them.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Inspect brain contents from Claude Code
cat ~/.gemini/antigravity/brain/a1b2c3d4-.../implementation_plan.md
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can hand an &lt;code&gt;implementation_plan.md&lt;/code&gt; written by Antigravity to Claude Code and ask it to &quot;review this plan.&quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Caveats&lt;/h2&gt;
&lt;h3&gt;The brain doesn&apos;t store full conversation transcripts&lt;/h3&gt;
&lt;p&gt;Only Artifacts are stored in the brain — not the full back-and-forth. Subtle nuances and the path of the discussion can be lost, so make sure the agent explicitly records important decisions to the Artifacts.&lt;/p&gt;
&lt;p&gt;For reference, &lt;code&gt;~/.gemini/antigravity/conversations/&lt;/code&gt; contains &lt;code&gt;.pb&lt;/code&gt; (Protocol Buffers) files, which appear to hold the conversation data in binary form. You can&apos;t read them directly, but the conversation history seems to live there.&lt;/p&gt;
&lt;h3&gt;Storage usage&lt;/h3&gt;
&lt;p&gt;Beyond the brain directory, Antigravity also writes evidence files to places like &lt;code&gt;browser_recordings/&lt;/code&gt;. Accumulated screenshots and browser recordings can eat through disk space. Periodically cleaning up unused sessions is a good idea.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Inspect old sessions
ls -la ~/.gemini/antigravity/brain/

# Delete sessions you don&apos;t need
rm -rf ~/.gemini/antigravity/brain/&amp;lt;unwanted-uuid&amp;gt;/
&lt;/code&gt;&lt;/pre&gt;
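&lt;p&gt;Before deleting anything, it helps to see which sessions are actually eating disk. Another sketch (the helper name is mine; the path is parameterized and defaults to the real location):&lt;/p&gt;

```shell
# Hypothetical helper: per-session disk usage in KB, largest first.
brain_usage() {
  local dir="${1:-$HOME/.gemini/antigravity/brain}"
  du -sk "$dir"/*/ 2>/dev/null | sort -rn
}
```

&lt;p&gt;Cross-check the biggest UUIDs against your conversation history before reaching for &lt;code&gt;rm -rf&lt;/code&gt;.&lt;/p&gt;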
&lt;hr /&gt;
&lt;h2&gt;Wrap-Up&lt;/h2&gt;
&lt;p&gt;The Antigravity slowdown is solvable by splitting sessions. And when handing off conversations, you can &lt;strong&gt;reference the brain directory by UUID&lt;/strong&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Problem&lt;/strong&gt;: Long conversations consume the token budget and slow everything down.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Solution&lt;/strong&gt;: Switch to a new session and use the brain Artifacts to carry context forward.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hidden feature&lt;/strong&gt;: Just say &quot;continue from the previous conversation (UUID)&quot; and the agent reads the brain Artifacts.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I stumbled into this feature by asking Antigravity itself how to hand off a conversation. There&apos;s a chance the answer was a hallucination — but the brain directory is unambiguously real, and pointing the agent at those files for handoff does work in practice. I couldn&apos;t find any mention in the official docs, so consider this a useful undocumented trick.&lt;/p&gt;
&lt;p&gt;If you&apos;re hitting performance issues, give brain-based session management a try.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://antigravity.codes/troubleshooting&quot;&gt;Troubleshooting Google Antigravity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://vertexdigest.com/blogs/mastering-anti-gravity-artifacts&quot;&gt;Mastering Anti-Gravity Artifacts - Vertex Digest&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://moghaoui.substack.com/p/hack-dont-lose-to-googles-antigravity&quot;&gt;Hack: Don&apos;t lose to Google&apos;s Antigravity - Substack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://note.com/biwakonbu/n/n04482fc86825&quot;&gt;note: Antigravity seems to slow down as conversation history grows (in Japanese)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://note.com/honest_kudu5817/n/ndcdc33f2538f&quot;&gt;note: [Google Antigravity] Heaps of evidence files eating local storage (in Japanese)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</content:encoded><enclosure url="https://zeroshotlog.com/images/2026/01/03/antigravity-hidden-brain-feature.png" length="0" type="image/png"/></item><item><title>A Practical Neovim Setup Guide for macOS</title><link>https://zeroshotlog.com/en/blog/neovim-macos-setup-guide/</link><guid isPermaLink="true">https://zeroshotlog.com/en/blog/neovim-macos-setup-guide/</guid><description>Solving the macOS IME problem, handling Neovim 0.11 plugin compatibility, building a minimal config that doesn&apos;t need Nerd Font, and setting up the basics of LSP and completion.</description><pubDate>Tue, 30 Dec 2025 11:00:00 GMT</pubDate><content:encoded>&lt;h2&gt;TL;DR&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;The macOS IME problem can be solved with &lt;code&gt;im-select.nvim&lt;/code&gt; + &lt;code&gt;macism&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;On Neovim 0.11, watch out for plugin compatibility (use the &lt;code&gt;0.1.x&lt;/code&gt; branch for Telescope).&lt;/li&gt;
&lt;li&gt;You can build a perfectly usable environment with a simple config that doesn&apos;t need Nerd Font.&lt;/li&gt;
&lt;li&gt;For LSP and completion, the &lt;code&gt;mason.nvim&lt;/code&gt; + &lt;code&gt;nvim-cmp&lt;/code&gt; combo is the most beginner-friendly option.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Background&lt;/h2&gt;
&lt;p&gt;This article walks through a Neovim setup that&apos;s comfortable to use on macOS. It assumes you know the basic Vim operations (mode switching, cursor movement, save/quit, etc.) but are new to configuring Neovim plugins.&lt;/p&gt;
&lt;p&gt;It covers four topics:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Solving the macOS IME problem&lt;/strong&gt; — the most common pain point for Japanese input&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Neovim 0.11 plugin compatibility&lt;/strong&gt; — what to watch out for on the latest version&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A minimal config that doesn&apos;t need Nerd Font&lt;/strong&gt; — an environment that works without font setup&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Basic LSP / completion setup&lt;/strong&gt; — go-to-definition and autocompletion&lt;/li&gt;
&lt;/ol&gt;
&lt;hr /&gt;
&lt;h2&gt;Prerequisites&lt;/h2&gt;
&lt;p&gt;The configuration in this article was verified on the following environment:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;macOS&lt;/td&gt;
&lt;td&gt;Sonoma 14.x or later&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Neovim&lt;/td&gt;
&lt;td&gt;0.11.3 or later (required for the new LSP API)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Homebrew&lt;/td&gt;
&lt;td&gt;Installed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Terminal&lt;/td&gt;
&lt;td&gt;iTerm2 (Terminal.app also works)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;If you haven&apos;t installed Neovim yet:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;brew install neovim
&lt;/code&gt;&lt;/pre&gt;
&lt;hr /&gt;
&lt;h2&gt;1. Solving the macOS IME Problem&lt;/h2&gt;
&lt;h3&gt;The problem&lt;/h3&gt;
&lt;p&gt;When using Neovim (Vim) on macOS, many people get tripped up by the IME (Japanese input) issue.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Specifically&lt;/strong&gt;: if you return to Normal mode while still in Japanese input mode, commands like &lt;code&gt;j&lt;/code&gt; and &lt;code&gt;k&lt;/code&gt; stop working.&lt;/p&gt;
&lt;p&gt;The reason: even after switching from Insert back to Normal, macOS keeps the IME state as-is. After typing some Japanese and pressing &lt;code&gt;Esc&lt;/code&gt;, the IME is still in Japanese mode — so typing &lt;code&gt;jjj...&lt;/code&gt; produces &lt;code&gt;っっっ...&lt;/code&gt; instead of moving the cursor.&lt;/p&gt;
&lt;h3&gt;The fix: im-select.nvim + macism&lt;/h3&gt;
&lt;p&gt;The fix combines two components:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;https://github.com/keaising/im-select.nvim&quot;&gt;im-select.nvim&lt;/a&gt;&lt;/strong&gt; (Neovim plugin)&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Detects mode switches (Insert → Normal, etc.)&lt;/li&gt;
&lt;li&gt;Calls an external CLI tool to switch the IME&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;https://github.com/laishulu/macism&quot;&gt;macism&lt;/a&gt;&lt;/strong&gt; (CLI tool)&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The command that actually flips the macOS IME&lt;/li&gt;
&lt;li&gt;Invoked from im-select.nvim&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;So im-select.nvim watches for mode changes inside Neovim, and when it sees one, it runs &lt;code&gt;macism&lt;/code&gt; to switch the IME back to ASCII.&lt;/p&gt;
&lt;h4&gt;Why macism&lt;/h4&gt;
&lt;p&gt;im-select.nvim shells out to a CLI tool to switch the IME on macOS. Several CLI tools exist for this, but &lt;strong&gt;the official im-select.nvim README recommends macism&lt;/strong&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Please install macism, this is the only one CLI tool can switch CJK and English input methods in macOS correctly.
— &lt;a href=&quot;https://github.com/keaising/im-select.nvim&quot;&gt;im-select.nvim README&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;macism reliably switches CJK input sources like Japanese. With other tools (&lt;code&gt;input-source-switcher&lt;/code&gt; and similar), there&apos;s a known macOS bug where the menu bar icon flips but the input source doesn&apos;t actually change.&lt;/p&gt;
&lt;h4&gt;Installation&lt;/h4&gt;
&lt;p&gt;&lt;code&gt;macism&lt;/code&gt; isn&apos;t in the official Homebrew tap, so you need to add a third-party tap:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Add the tap
brew tap laishulu/homebrew

# Install macism
brew install laishulu/homebrew/macism
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once installed, verify it works:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Check the current input source
macism

# Example output: com.apple.inputmethod.Kotoeri.RomajiTyping.Japanese
# or:            com.apple.keylayout.ABC
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Neovim configuration&lt;/h4&gt;
&lt;p&gt;Use the &lt;code&gt;im-select.nvim&lt;/code&gt; plugin to control behavior on mode transitions.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;-- Example config for the lazy.nvim plugin manager
{
  &quot;keaising/im-select.nvim&quot;,
  config = function()
    require(&quot;im_select&quot;).setup({
      -- Target input source (ASCII keyboard)
      default_im_select = &quot;com.apple.keylayout.ABC&quot;,

      -- Absolute path to macism (use /usr/local/bin/macism on Intel Mac)
      default_command = &quot;/opt/homebrew/bin/macism&quot;,

      -- When to switch to ASCII
      set_default_events = {
        &quot;VimEnter&quot;,       -- On Neovim startup
        &quot;FocusGained&quot;,    -- When the window regains focus
        &quot;InsertLeave&quot;,    -- When leaving Insert mode
        &quot;CmdlineLeave&quot;    -- When leaving command-line mode
      },

      -- Setting to restore the previous IME on InsertEnter (empty = disabled)
      set_previous_events = {},
    })
  end,
}
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Why I set &lt;code&gt;set_previous_events = {}&lt;/code&gt;&lt;/h4&gt;
&lt;p&gt;By default, &lt;code&gt;set_previous_events = { &quot;InsertEnter&quot; }&lt;/code&gt;, which restores the previous IME state when entering Insert mode.&lt;/p&gt;
&lt;p&gt;I disabled this by setting it to an empty table. Reasoning:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Trade-off comparison&lt;/strong&gt;:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Setting&lt;/th&gt;
&lt;th&gt;Pros&lt;/th&gt;
&lt;th&gt;Cons&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;{ &quot;InsertEnter&quot; }&lt;/code&gt; (default)&lt;/td&gt;
&lt;td&gt;Convenient for typing Japanese in succession&lt;/td&gt;
&lt;td&gt;Requires switching every time you write English code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;{}&lt;/code&gt; (disabled)&lt;/td&gt;
&lt;td&gt;Always starts in ASCII, so behavior is predictable&lt;/td&gt;
&lt;td&gt;Manual switch needed when writing Japanese&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;For programming, English input dominates by far, so &quot;always return to ASCII&quot; feels much smoother. When I do need Japanese, I switch manually — no big deal.&lt;/p&gt;
&lt;p&gt;If you write a lot of Japanese documentation, the default &lt;code&gt;{ &quot;InsertEnter&quot; }&lt;/code&gt; may be more convenient.&lt;/p&gt;
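&lt;p&gt;If you&apos;d rather not add a plugin at all, the core behavior (forcing ASCII when leaving Insert mode) can be approximated with a single autocmd. A minimal sketch, assuming macism at the Homebrew path; it lacks im-select.nvim&apos;s extras such as the &lt;code&gt;FocusGained&lt;/code&gt; handling above:&lt;/p&gt;

```lua
-- Plugin-free sketch: shell out to macism whenever Insert or
-- command-line mode is left. Assumes macism at /opt/homebrew/bin
-- (use /usr/local/bin/macism on Intel Macs).
vim.api.nvim_create_autocmd({ "InsertLeave", "CmdlineLeave" }, {
  callback = function()
    vim.system({ "/opt/homebrew/bin/macism", "com.apple.keylayout.ABC" })
  end,
})
```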
&lt;hr /&gt;
&lt;h2&gt;2. Neovim 0.11 Plugin Compatibility&lt;/h2&gt;
&lt;h3&gt;Background&lt;/h3&gt;
&lt;p&gt;Neovim 0.11, released in March 2025, made significant changes to the LSP-related APIs. As a result, some plugins won&apos;t work as-is.&lt;/p&gt;
&lt;h3&gt;Telescope.nvim compatibility issue&lt;/h3&gt;
&lt;p&gt;The most common one users hit is with Telescope.nvim (the fuzzy finder).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Symptom&lt;/strong&gt;: Errors when opening Telescope, or broken display.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cause&lt;/strong&gt;: The Telescope.nvim stable branch (&lt;code&gt;0.1.x&lt;/code&gt;) hadn&apos;t caught up with the Neovim 0.11 API changes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: Use the &lt;code&gt;0.1.x&lt;/code&gt; branch (its latest state is already patched).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;-- Neovim 0.11–compatible version
{
  &quot;nvim-telescope/telescope.nvim&quot;,
  branch = &quot;0.1.x&quot;,  -- Stable release branch (Neovim 0.11–compatible)
  dependencies = { &quot;nvim-lua/plenary.nvim&quot; },
}
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Choosing a version&lt;/h4&gt;
&lt;p&gt;As of late 2025, Telescope v0.2.0 has also been released.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Specification&lt;/th&gt;
&lt;th&gt;Characteristics&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;branch = &quot;0.1.x&quot;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Stable, Neovim 0.11–compatible&lt;/td&gt;
&lt;td&gt;Recommended&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;tag = &quot;0.1.8&quot;&lt;/code&gt; etc.&lt;/td&gt;
&lt;td&gt;Pinned to a specific version&lt;/td&gt;
&lt;td&gt;If reproducibility matters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;branch = &quot;master&quot;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Latest dev branch&lt;/td&gt;
&lt;td&gt;If you want to try new features&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The &lt;code&gt;0.1.x&lt;/code&gt; branch was previously incompatible with Neovim 0.11, but that&apos;s been fixed. Unless you have a specific reason otherwise, &lt;code&gt;branch = &quot;0.1.x&quot;&lt;/code&gt; is the way to go.&lt;/p&gt;
&lt;h3&gt;Other ways to check compatibility&lt;/h3&gt;
&lt;p&gt;When a plugin doesn&apos;t work, here&apos;s how I investigate:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Check GitHub Issues&lt;/strong&gt;: Search for &lt;code&gt;[plugin name] Neovim 0.11&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Check the Requirements section in the README&lt;/strong&gt;: The supported Neovim version is usually documented.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Update to the latest version&lt;/strong&gt;: Run &lt;code&gt;:Lazy sync&lt;/code&gt; to refresh plugins.&lt;/li&gt;
&lt;/ol&gt;
&lt;hr /&gt;
&lt;h2&gt;3. A Minimal Config That Doesn&apos;t Need Nerd Font&lt;/h2&gt;
&lt;h3&gt;What is Nerd Font&lt;/h3&gt;
&lt;p&gt;Most Neovim setup articles tell you to &quot;install Nerd Font.&quot; Nerd Font is a font family that adds icons (file types, Git status, folders, and so on) on top of regular fonts.&lt;/p&gt;
&lt;p&gt;But getting Nerd Font running requires several steps:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Download Nerd Font.&lt;/li&gt;
&lt;li&gt;Install it on your system.&lt;/li&gt;
&lt;li&gt;Change the font setting in your terminal.&lt;/li&gt;
&lt;li&gt;Verify the changes took effect.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;Why I went without Nerd Font&lt;/h3&gt;
&lt;p&gt;Reasons for skipping Nerd Font:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;No need to fiddle with terminal font settings.&lt;/li&gt;
&lt;li&gt;Easier to share configs across multiple machines (no font install required).&lt;/li&gt;
&lt;li&gt;Simple, lightweight look.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Harder to tell file types at a glance.&lt;/li&gt;
&lt;li&gt;Visually plain.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In actual coding, you can already tell the file type from the filename, so the lack of icons isn&apos;t a practical problem.&lt;/p&gt;
&lt;h3&gt;Disabling icons in plugins&lt;/h3&gt;
&lt;p&gt;How to disable icons in the major plugins.&lt;/p&gt;
&lt;h4&gt;nvim-tree (file explorer)&lt;/h4&gt;
&lt;pre&gt;&lt;code&gt;{
  &quot;nvim-tree/nvim-tree.lua&quot;,
  config = function()
    require(&quot;nvim-tree&quot;).setup({
      renderer = {
        icons = {
          show = {
            file = false,        -- Hide file icons
            folder = false,      -- Hide folder icons
            folder_arrow = true, -- Show expand/collapse arrows
            git = false,         -- Hide Git status icons
          },
          glyphs = {
            folder = {
              arrow_closed = &quot;&amp;gt;&quot;,  -- Arrow when collapsed
              arrow_open = &quot;v&quot;,    -- Arrow when expanded
            },
          },
        },
      },
    })
  end,
}
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;lualine (status line)&lt;/h4&gt;
&lt;pre&gt;&lt;code&gt;{
  &quot;nvim-lualine/lualine.nvim&quot;,
  config = function()
    require(&quot;lualine&quot;).setup({
      options = {
        icons_enabled = false,       -- Disable all icons
        section_separators = &quot;&quot;,     -- No section separators
        component_separators = &quot;|&quot;,  -- Simple component separator
      },
    })
  end,
}
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Visual comparison&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;With Nerd Font&lt;/strong&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;lt;file icon&amp;gt; init.lua  &amp;lt;branch icon&amp;gt; main  lua  utf-8  100%  42:1
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Without Nerd Font&lt;/strong&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;NORMAL | main | init.lua | lua | utf-8 | 100% | 42:1
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Information shows up as text instead of icons. Once you get used to it, it&apos;s perfectly readable.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;4. Basic LSP / Completion Setup&lt;/h2&gt;
&lt;h3&gt;What LSP is&lt;/h3&gt;
&lt;p&gt;LSP (Language Server Protocol) is a communication protocol between editors and language servers. It enables features like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Go to definition&lt;/strong&gt;: Jump to where a function or variable is defined.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Find references&lt;/strong&gt;: List everywhere a function or variable is used.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Autocompletion&lt;/strong&gt;: Show candidates while typing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Diagnostics&lt;/strong&gt;: Surface errors and warnings in real time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Rename&lt;/strong&gt;: Rename a symbol everywhere at once.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Plugin layout overview&lt;/h3&gt;
&lt;p&gt;To get LSP and completion working, combine these plugins:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;mason.nvim              # Language server installer
  └── mason-lspconfig   # Bridge between mason and lspconfig
        └── nvim-lspconfig  # Language server configuration

nvim-cmp                # Completion engine
  ├── cmp-nvim-lsp      # Completion source from LSP
  ├── cmp-buffer        # Complete words from the current buffer
  ├── cmp-path          # Complete file paths
  └── LuaSnip           # Snippet expansion
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Why this stack&lt;/h4&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Concern&lt;/th&gt;
&lt;th&gt;Choice&lt;/th&gt;
&lt;th&gt;Reason&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;LSP installation&lt;/td&gt;
&lt;td&gt;mason.nvim&lt;/td&gt;
&lt;td&gt;GUI-managed, easy for beginners&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Completion engine&lt;/td&gt;
&lt;td&gt;nvim-cmp&lt;/td&gt;
&lt;td&gt;Most widely used, lots of resources&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Snippets&lt;/td&gt;
&lt;td&gt;LuaSnip&lt;/td&gt;
&lt;td&gt;Smooth integration with nvim-cmp&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3&gt;Auto-installing language servers with mason.nvim&lt;/h3&gt;
&lt;p&gt;With mason.nvim, you can manage language servers from a GUI via the &lt;code&gt;:Mason&lt;/code&gt; command. With &lt;code&gt;mason-lspconfig&lt;/code&gt;, you can also auto-install the language servers you need.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  &quot;neovim/nvim-lspconfig&quot;,
  dependencies = {
    &quot;williamboman/mason.nvim&quot;,
    &quot;williamboman/mason-lspconfig.nvim&quot;,
  },
  config = function()
    -- Initialize mason (enables the :Mason command)
    require(&quot;mason&quot;).setup()

    -- Specify which language servers to auto-install
    require(&quot;mason-lspconfig&quot;).setup({
      ensure_installed = {
        &quot;lua_ls&quot;,   -- Lua
        &quot;ts_ls&quot;,    -- TypeScript/JavaScript
        &quot;pyright&quot;,  -- Python
      },
    })
  end,
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;On first launch, the specified language servers are installed automatically.&lt;/p&gt;
&lt;h3&gt;Neovim 0.11&apos;s new LSP configuration API&lt;/h3&gt;
&lt;p&gt;Neovim 0.11 significantly changed how LSP is configured.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The old way&lt;/strong&gt; (depends on nvim-lspconfig):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;local lspconfig = require(&quot;lspconfig&quot;)
lspconfig.lua_ls.setup({
  settings = {
    Lua = {
      diagnostics = { globals = { &quot;vim&quot; } },
    },
  },
})
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;The new way&lt;/strong&gt; (Neovim 0.11.3+ native):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;-- Per-language-server configuration
vim.lsp.config(&apos;lua_ls&apos;, {
  settings = {
    Lua = {
      diagnostics = { globals = { &quot;vim&quot; } },
    },
  },
})

-- Enable language servers
vim.lsp.enable({ &quot;lua_ls&quot;, &quot;pyright&quot;, &quot;ts_ls&quot; })
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: nvim-lspconfig isn&apos;t deprecated. Internally it functions as a wrapper that calls &lt;code&gt;vim.lsp.config&lt;/code&gt;, so the traditional &lt;code&gt;lspconfig.xxx.setup({})&lt;/code&gt; form still works. For new setups, the new API is recommended since it reduces plugin dependencies.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Benefits of the new API:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Built into Neovim, so it should remain stable long-term.&lt;/li&gt;
&lt;li&gt;Simple, easy-to-read syntax.&lt;/li&gt;
&lt;li&gt;Supports file-based configuration (under &lt;code&gt;~/.config/nvim/lsp/&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;
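&lt;p&gt;As a sketch of the file-based form (the filename determines the server name; this mirrors the &lt;code&gt;lua_ls&lt;/code&gt; settings above):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;-- ~/.config/nvim/lsp/lua_ls.lua
-- The returned table is merged as if passed to vim.lsp.config(&apos;lua_ls&apos;, ...);
-- you still opt in with vim.lsp.enable({ &quot;lua_ls&quot; }) in init.lua.
return {
  settings = {
    Lua = {
      diagnostics = { globals = { &quot;vim&quot; } },
    },
  },
}
&lt;/code&gt;&lt;/pre&gt;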
&lt;h3&gt;LSP keymap setup&lt;/h3&gt;
&lt;p&gt;Bind LSP features to keys. Using the &lt;code&gt;LspAttach&lt;/code&gt; event ensures keymaps are only set on buffers where LSP is active.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;vim.api.nvim_create_autocmd(&quot;LspAttach&quot;, {
  callback = function(args)
    local bufnr = args.buf
    local opts = { buffer = bufnr, silent = true }

    -- Jump to definition / declaration / implementation
    vim.keymap.set(&quot;n&quot;, &quot;gd&quot;, vim.lsp.buf.definition, opts)
    vim.keymap.set(&quot;n&quot;, &quot;gD&quot;, vim.lsp.buf.declaration, opts)
    vim.keymap.set(&quot;n&quot;, &quot;gi&quot;, vim.lsp.buf.implementation, opts)

    -- Documentation and edit operations
    vim.keymap.set(&quot;n&quot;, &quot;K&quot;, vim.lsp.buf.hover, opts)
    vim.keymap.set(&quot;n&quot;, &quot;&amp;lt;leader&amp;gt;rn&quot;, vim.lsp.buf.rename, opts)
    vim.keymap.set(&quot;n&quot;, &quot;&amp;lt;leader&amp;gt;ca&quot;, vim.lsp.buf.code_action, opts)

    -- Navigate diagnostics
    vim.keymap.set(&quot;n&quot;, &quot;[d&quot;, vim.diagnostic.goto_prev, opts)
    vim.keymap.set(&quot;n&quot;, &quot;]d&quot;, vim.diagnostic.goto_next, opts)
    vim.keymap.set(&quot;n&quot;, &quot;&amp;lt;leader&amp;gt;e&quot;, vim.diagnostic.open_float, opts)
  end,
})
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Why use the LspAttach event&lt;/h4&gt;
&lt;p&gt;There are several ways to set up keymaps:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Issue&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Global keymaps&lt;/td&gt;
&lt;td&gt;Active in every buffer&lt;/td&gt;
&lt;td&gt;Keys are bound even in files without LSP&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;lspconfig.on_attach&lt;/td&gt;
&lt;td&gt;Configured via lspconfig&lt;/td&gt;
&lt;td&gt;Depends on lspconfig&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LspAttach event&lt;/td&gt;
&lt;td&gt;Set when LSP attaches&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;code&gt;LspAttach&lt;/code&gt; is a built-in Neovim event, so it doesn&apos;t depend on any plugin and works reliably.&lt;/p&gt;
&lt;h3&gt;Completion setup with nvim-cmp&lt;/h3&gt;
&lt;p&gt;Basic configuration for the &lt;code&gt;nvim-cmp&lt;/code&gt; completion engine.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  &quot;hrsh7th/nvim-cmp&quot;,
  dependencies = {
    &quot;hrsh7th/cmp-nvim-lsp&quot;,   -- LSP completion source
    &quot;hrsh7th/cmp-buffer&quot;,     -- Buffer completion
    &quot;hrsh7th/cmp-path&quot;,       -- Path completion
    &quot;L3MON4D3/LuaSnip&quot;,       -- Snippet engine
    &quot;saadparwaiz1/cmp_luasnip&quot;, -- Snippet completion
  },
  config = function()
    local cmp = require(&quot;cmp&quot;)
    local luasnip = require(&quot;luasnip&quot;)

    cmp.setup({
      -- Snippet expansion settings
      snippet = {
        expand = function(args)
          luasnip.lsp_expand(args.body)
        end,
      },

      -- Keymaps
      mapping = cmp.mapping.preset.insert({
        [&quot;&amp;lt;C-Space&amp;gt;&quot;] = cmp.mapping.complete(),   -- Manually trigger completion
        [&quot;&amp;lt;C-e&amp;gt;&quot;] = cmp.mapping.abort(),          -- Cancel completion
        [&quot;&amp;lt;CR&amp;gt;&quot;] = cmp.mapping.confirm({ select = true }), -- Confirm
        [&quot;&amp;lt;Tab&amp;gt;&quot;] = cmp.mapping(function(fallback)
          if cmp.visible() then
            cmp.select_next_item()  -- Next candidate
          else
            fallback()
          end
        end, { &quot;i&quot;, &quot;s&quot; }),
        [&quot;&amp;lt;S-Tab&amp;gt;&quot;] = cmp.mapping(function(fallback)
          if cmp.visible() then
            cmp.select_prev_item()  -- Previous candidate
          else
            fallback()
          end
        end, { &quot;i&quot;, &quot;s&quot; }),
      }),

      -- Completion source priority
      sources = cmp.config.sources({
        { name = &quot;nvim_lsp&quot; },  -- LSP (highest priority)
        { name = &quot;luasnip&quot; },   -- Snippets
      }, {
        { name = &quot;buffer&quot; },    -- Words in the buffer
        { name = &quot;path&quot; },      -- File paths
      }),
    })
  end,
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The completion sources are prioritized: the first group (LSP, snippets) takes precedence, and if there are no matches, candidates come from the next group (buffer, path).&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;On Splitting the Config File&lt;/h2&gt;
&lt;p&gt;All the configuration in this article lives in a single &lt;code&gt;~/.config/nvim/init.lua&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Why a single file&lt;/strong&gt;:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layout&lt;/th&gt;
&lt;th&gt;Pros&lt;/th&gt;
&lt;th&gt;Cons&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Single file&lt;/td&gt;
&lt;td&gt;Easy to see and search the whole thing&lt;/td&gt;
&lt;td&gt;Hard to read once it gets long&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multiple files&lt;/td&gt;
&lt;td&gt;Clear separation of concerns, scales better&lt;/td&gt;
&lt;td&gt;Inter-file dependencies get complex&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;For a config of around 400 lines, a single file is plenty manageable. You can always split it up once it grows.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Summary&lt;/h2&gt;
&lt;p&gt;This article covered four configurations for using Neovim comfortably on macOS.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;IME problem&lt;/strong&gt;: Solved with &lt;code&gt;im-select.nvim&lt;/code&gt; + &lt;code&gt;macism&lt;/code&gt;. Setting &lt;code&gt;set_previous_events = {}&lt;/code&gt; (always return to ASCII) is the most programming-friendly choice.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Neovim 0.11 compatibility&lt;/strong&gt;: Use the &lt;code&gt;0.1.x&lt;/code&gt; tag for Telescope (already fixed).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No Nerd Font needed&lt;/strong&gt;: Disable icons in each plugin to get a simple environment.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;LSP / completion&lt;/strong&gt;: The mason.nvim + nvim-cmp combo gives you GUI-managed installation and rich completion.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This setup is just one option among many. As you use it, customize it to match your taste and use cases.&lt;/p&gt;
&lt;h2&gt;References&lt;/h2&gt;
&lt;h3&gt;Official documentation&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://neovim.io/doc/user/lsp.html&quot;&gt;Neovim Documentation - LSP&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://gpanders.com/blog/whats-new-in-neovim-0-11/&quot;&gt;What&apos;s New in Neovim 0.11&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Plugins&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/keaising/im-select.nvim&quot;&gt;im-select.nvim&lt;/a&gt; - Automatic IME switching&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/laishulu/macism&quot;&gt;macism&lt;/a&gt; - macOS IME control CLI&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/neovim/nvim-lspconfig&quot;&gt;nvim-lspconfig&lt;/a&gt; - LSP configuration&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/williamboman/mason.nvim&quot;&gt;mason.nvim&lt;/a&gt; - Language server management&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/hrsh7th/nvim-cmp&quot;&gt;nvim-cmp&lt;/a&gt; - Completion engine&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/nvim-telescope/telescope.nvim&quot;&gt;telescope.nvim&lt;/a&gt; - Fuzzy finder&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/folke/lazy.nvim&quot;&gt;lazy.nvim&lt;/a&gt; - Plugin manager&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Related articles&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://davelage.com/posts/neovim-lsp-0.11/&quot;&gt;Neovim LSP 0.11&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://blog.diovani.com/technology/2025/06/13/configuring-neovim-011-lsp.html&quot;&gt;Configuring Neovim 0.11 LSP from scratch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://0xunicorn.com/neovim-native-lsp-config/&quot;&gt;Native LSP config in Neovim V0.11&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</content:encoded><enclosure url="https://zeroshotlog.com/images/2025/12/30/neovim-macos.png" length="0" type="image/png"/></item><item><title>Auto-Switch Git and GitHub CLI Accounts Just by cd</title><link>https://zeroshotlog.com/en/blog/git-github-multi-account-auto-switch/</link><guid isPermaLink="true">https://zeroshotlog.com/en/blog/git-github-multi-account-auto-switch/</guid><description>How to auto-switch between work and personal GitHub accounts simply by changing directories. A 3-layer setup combining includeIf, insteadOf, and a gh function wrapper delivers full automation.</description><pubDate>Tue, 30 Dec 2025 10:00:00 GMT</pubDate><content:encoded>&lt;h2&gt;TL;DR&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Switching between work and personal GitHub accounts requires &lt;strong&gt;three layers&lt;/strong&gt; of configuration.&lt;/li&gt;
&lt;li&gt;Combining &lt;code&gt;includeIf&lt;/code&gt; + &lt;code&gt;insteadOf&lt;/code&gt; + a &lt;code&gt;gh&lt;/code&gt; function wrapper delivers full automation.&lt;/li&gt;
&lt;li&gt;Just &lt;code&gt;cd&lt;/code&gt; into &lt;code&gt;~/work/&lt;/code&gt; and everything switches over to the work account.&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;January 2026 update&lt;/strong&gt;: I revised the Layer 3 &lt;code&gt;gh&lt;/code&gt; command switching scheme. The previous &lt;code&gt;chpwd&lt;/code&gt; hook + &lt;code&gt;gh auth switch&lt;/code&gt; approach has been replaced with a function wrapper + &lt;code&gt;GH_TOKEN&lt;/code&gt; environment variable approach. This now handles parallel work across multiple terminal windows.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Background and Problem&lt;/h2&gt;
&lt;h3&gt;The problem I wanted to solve&lt;/h3&gt;
&lt;p&gt;&quot;I want to use separate work and personal GitHub accounts.&quot;&lt;/p&gt;
&lt;p&gt;This is a common challenge. Manual switching works, but it has issues:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Forgetting to switch&lt;/strong&gt;: Committing to a personal repo with the work email.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Operational friction&lt;/strong&gt;: Manually toggling &lt;code&gt;git config&lt;/code&gt; and SSH keys every time is tedious.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The gh CLI&lt;/strong&gt;: Not just Git — GitHub CLI (&lt;code&gt;gh&lt;/code&gt;) needs its own switching too.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;The ideal state&lt;/h3&gt;
&lt;p&gt;&quot;Just change directories, and everything switches automatically.&quot;&lt;/p&gt;
&lt;p&gt;Specifically: when I move under &lt;code&gt;~/work/&lt;/code&gt; it should be the work account; otherwise, personal. All of the following should switch automatically:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Git commit identity (&lt;code&gt;user.name&lt;/code&gt;, &lt;code&gt;user.email&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;SSH key (GitHub authentication)&lt;/li&gt;
&lt;li&gt;GitHub CLI (&lt;code&gt;gh&lt;/code&gt; command) account&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Options Considered&lt;/h2&gt;
&lt;h3&gt;Option A: Manual configuration each time&lt;/h3&gt;
&lt;p&gt;Manually set &lt;code&gt;git config user.email&lt;/code&gt; per repository.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;No additional setup needed.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Configuration required every time you create a repo.&lt;/li&gt;
&lt;li&gt;Forget to set it, and you commit with the wrong account.&lt;/li&gt;
&lt;li&gt;SSH keys and gh CLI still need separate management.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Option B: direnv + GH_TOKEN&lt;/h3&gt;
&lt;p&gt;Use &lt;a href=&quot;https://direnv.net/&quot;&gt;direnv&lt;/a&gt; to set environment variables per directory.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# ~/work/.envrc
export GH_TOKEN=&quot;ghp_xxxx&quot;
export GIT_AUTHOR_EMAIL=&quot;work@example.com&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Unified management via environment variables.&lt;/li&gt;
&lt;li&gt;direnv is widely adopted.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Requires installing direnv.&lt;/li&gt;
&lt;li&gt;Tokens have to be written to a file (security concern).&lt;/li&gt;
&lt;li&gt;Each project needs its own &lt;code&gt;.envrc&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Option C: includeIf + insteadOf + gh function wrapper (chosen)&lt;/h3&gt;
&lt;p&gt;Combine standard Git features with shell functions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;No extra tools (uses Git/Zsh built-ins).&lt;/li&gt;
&lt;li&gt;Configure once; everything is automatic afterward.&lt;/li&gt;
&lt;li&gt;No need to write tokens to a file.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Many configuration items (3 layers).&lt;/li&gt;
&lt;li&gt;Requires understanding the mechanism.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Comparison Table&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Manual&lt;/th&gt;
&lt;th&gt;direnv&lt;/th&gt;
&lt;th&gt;includeIf + wrappers (chosen)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Initial setup effort&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Daily friction&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extra tools&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;direnv&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;Token stored in file&lt;/td&gt;
&lt;td&gt;OS keychain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Risk of forgetting&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Parallel work across windows&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;Supported&lt;/td&gt;
&lt;td&gt;Supported&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2&gt;Final Decision&lt;/h2&gt;
&lt;h3&gt;Adopted: includeIf + insteadOf + gh function wrapper&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Deciding factors&lt;/strong&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;No additional tools&lt;/strong&gt;: Achievable with Git and Zsh built-ins alone.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No forgetting&lt;/strong&gt;: Determined automatically by directory structure.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Security&lt;/strong&gt;: gh CLI tokens are stored in the OS keychain.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Multi-window support&lt;/strong&gt;: Per-process auth via environment variables, so other terminals are unaffected.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Trade-offs accepted&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Initial setup is more involved (covered in this article).&lt;/li&gt;
&lt;li&gt;Without understanding the mechanism, troubleshooting is harder.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Why &quot;Three Layers&quot; Are Necessary&lt;/h2&gt;
&lt;p&gt;Here&apos;s the key point. Switching Git and GitHub accounts requires &lt;strong&gt;three distinct configurations&lt;/strong&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;┌─────────────────────────────────────────────────────┐
│                   Your Machine                      │
├─────────────────────────────────────────────────────┤
│                                                     │
│  [Layer 1] Git user settings                        │
│    → Name and email recorded in commits             │
│    → Switched via includeIf in .gitconfig           │
│                                                     │
│  [Layer 2] SSH key                                  │
│    → Authentication for git push/pull to GitHub     │
│    → Switched via SSH host alias + insteadOf        │
│                                                     │
│  [Layer 3] gh CLI account                           │
│    → Account used for gh pr create and other ops    │
│    → Switched via gh function wrapper + GH_TOKEN    │
│                                                     │
└─────────────────────────────────────────────────────┘
                         │
                         ▼
              ┌─────────────────────┐
              │      GitHub         │
              │ (personal or work)  │
              └─────────────────────┘
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Why each layer needs its own configuration&lt;/h3&gt;
&lt;p&gt;Each one is invoked at a &lt;strong&gt;different moment&lt;/strong&gt;.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;When used&lt;/th&gt;
&lt;th&gt;What it identifies&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Layer 1&lt;/td&gt;
&lt;td&gt;At &lt;code&gt;git commit&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Commit author&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Layer 2&lt;/td&gt;
&lt;td&gt;At &lt;code&gt;git push/pull&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Connection auth to GitHub&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Layer 3&lt;/td&gt;
&lt;td&gt;At &lt;code&gt;gh pr create&lt;/code&gt;, etc.&lt;/td&gt;
&lt;td&gt;The actor for GitHub API operations&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Configuring only Layer 1, for example, still leaves you pushing with the wrong SSH key. Full switching only works once all three layers are set up correctly.&lt;/p&gt;
&lt;h2&gt;How to Configure Each Layer&lt;/h2&gt;
&lt;p&gt;The example below treats &lt;code&gt;~/work/&lt;/code&gt; as work and everything else as personal.&lt;/p&gt;
&lt;h3&gt;Layer 1: Auto-switching user settings via includeIf&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;includeIf&lt;/code&gt; is a Git feature that loads a different config file based on a condition.&lt;/p&gt;
&lt;h4&gt;~/.gitconfig (the main config file)&lt;/h4&gt;
&lt;pre&gt;&lt;code&gt;# Default (personal) settings
[user]
    name = Your Personal Name
    email = personal@example.com

# Under ~/work/, additionally load the work config
[includeIf &quot;gitdir:~/work/&quot;]
    path = ~/.gitconfig-work
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;~/.gitconfig-work (the work config file)&lt;/h4&gt;
&lt;pre&gt;&lt;code&gt;[user]
    name = Your Work Name
    email = work@company.com
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;How it works&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;[includeIf &quot;gitdir:~/work/&quot;]&lt;/code&gt;: Applies when the current repo is under &lt;code&gt;~/work/&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;path = ~/.gitconfig-work&lt;/code&gt;: Loads this config file.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Important details&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The path after &lt;code&gt;gitdir:&lt;/code&gt; &lt;strong&gt;must end with &lt;code&gt;/&lt;/code&gt;&lt;/strong&gt; (it&apos;s &lt;code&gt;~/work/&lt;/code&gt;, not &lt;code&gt;~/work&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;The trailing &lt;code&gt;/&lt;/code&gt; is interpreted as &lt;code&gt;**&lt;/code&gt; (any subdirectory).&lt;/li&gt;
&lt;li&gt;Settings loaded later take precedence (which is why &lt;code&gt;user.name&lt;/code&gt; is overridden by the work value).&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Verification&lt;/h4&gt;
&lt;pre&gt;&lt;code&gt;# Check from a personal directory
cd ~/personal/some-repo
git config user.email
# → personal@example.com

# Check from a work directory
cd ~/work/some-project
git config user.email
# → work@company.com
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To confirm exactly which file each setting comes from:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git config --list --show-origin
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This shows the source file for every setting.&lt;/p&gt;
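&lt;p&gt;Each line is prefixed with the file it came from. Illustrative output from a repo under &lt;code&gt;~/work/&lt;/code&gt;, trimmed to the relevant lines (paths will differ on your machine):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;file:/Users/you/.gitconfig        user.name=Your Personal Name
file:/Users/you/.gitconfig        user.email=personal@example.com
file:/Users/you/.gitconfig-work   user.name=Your Work Name
file:/Users/you/.gitconfig-work   user.email=work@company.com
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The work entries appear last, which is why they win.&lt;/p&gt;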
&lt;h3&gt;Layer 2: SSH host alias + insteadOf for SSH key switching&lt;/h3&gt;
&lt;p&gt;SSH key switching combines two pieces of configuration.&lt;/p&gt;
&lt;h4&gt;Step 1: Define a host alias in the SSH config&lt;/h4&gt;
&lt;p&gt;Add the following to &lt;code&gt;~/.ssh/config&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Personal (default)
Host github.com
    IdentityFile ~/.ssh/id_ed25519_personal

# Work (alias)
Host github-work
    HostName github.com
    IdentityFile ~/.ssh/id_ed25519_work
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;How it works&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Host github.com&lt;/code&gt;: Settings used when connecting to &lt;code&gt;github.com&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Host github-work&lt;/code&gt;: Defines a &lt;strong&gt;fictional host name&lt;/strong&gt; called &lt;code&gt;github-work&lt;/code&gt;.
&lt;ul&gt;
&lt;li&gt;The actual destination is &lt;code&gt;HostName github.com&lt;/code&gt; (real GitHub).&lt;/li&gt;
&lt;li&gt;But the SSH key used is &lt;code&gt;id_ed25519_work&lt;/code&gt; (the work key).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So accessing &lt;code&gt;git@github-work:org/repo.git&lt;/code&gt; uses the work SSH key.&lt;/p&gt;
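&lt;p&gt;You can confirm the alias picks up the work key before wiring up Git, assuming that key is registered with your work GitHub account:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ssh -T git@github-work
# → Hi work-username! You&apos;ve successfully authenticated, but GitHub does
#   not provide shell access.
&lt;/code&gt;&lt;/pre&gt;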
&lt;h4&gt;Step 2: Auto-rewrite URLs with Git&apos;s insteadOf&lt;/h4&gt;
&lt;p&gt;For work repos, automatically rewrite &lt;code&gt;github.com&lt;/code&gt; access to &lt;code&gt;github-work&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Add the following to &lt;code&gt;~/.gitconfig-work&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[user]
    name = Your Work Name
    email = work@company.com

# Rewrite SSH-form URLs
[url &quot;git@github-work:&quot;]
    insteadOf = git@github.com:

# Rewrite HTTPS-form URLs too
[url &quot;git@github-work:&quot;]
    insteadOf = https://github.com/
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;How it works&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;insteadOf&lt;/code&gt;: Git auto-substitutes the URL during resolution.&lt;/li&gt;
&lt;li&gt;For example, &lt;code&gt;git clone git@github.com:org/repo.git&lt;/code&gt;:
&lt;ul&gt;
&lt;li&gt;Internally becomes &lt;code&gt;git@github-work:org/repo.git&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The SSH config then resolves &lt;code&gt;github-work&lt;/code&gt; to github.com and authenticates with the work SSH key.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
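&lt;p&gt;You can also watch the rewrite happen without any network access: &lt;code&gt;git ls-remote --get-url&lt;/code&gt; expands &lt;code&gt;insteadOf&lt;/code&gt; and exits without contacting the remote (&lt;code&gt;org/repo&lt;/code&gt; is a placeholder):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Run inside any repo where ~/.gitconfig-work applies
git ls-remote --get-url git@github.com:org/repo.git
# → git@github-work:org/repo.git
&lt;/code&gt;&lt;/pre&gt;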
&lt;h4&gt;Verification&lt;/h4&gt;
&lt;pre&gt;&lt;code&gt;# Check the remote URL in a work repo
cd ~/work/some-project
git remote -v
# → origin  git@github.com:company/repo.git (fetch)
#    ↑ The displayed value is still github.com

# Check the URL actually used
git config --get-regexp &apos;url.*&apos;
# → url.git@github-work:.insteadof git@github.com:
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Layer 3: gh CLI account switching via a gh function wrapper&lt;/h3&gt;
&lt;p&gt;GitHub CLI (&lt;code&gt;gh&lt;/code&gt;) gained multi-account support in v2.40.0.&lt;/p&gt;
&lt;h4&gt;Prerequisite: log in with both accounts&lt;/h4&gt;
&lt;pre&gt;&lt;code&gt;# Log in with the first account
gh auth login

# Log in with the second account (adds it alongside the first)
gh auth login

# Check login status
gh auth status
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;gh auth status&lt;/code&gt; lists every account you&apos;re logged into and which one is currently active.&lt;/p&gt;
&lt;h4&gt;Add to ~/.zshrc&lt;/h4&gt;
&lt;pre&gt;&lt;code&gt;########################################
# GitHub CLI account switching (function wrapper approach)
# Work account under ~/work, personal account elsewhere.
# Set the account names in ~/.zshrc.local:
#   GH_PERSONAL_ACCOUNT=&quot;your-personal&quot;
#   GH_WORK_ACCOUNT=&quot;your-work&quot;
########################################
gh() {
  local token
  if [[ &quot;$PWD&quot; == &quot;$HOME/work&quot; || &quot;$PWD&quot; == &quot;$HOME/work/&quot;* ]]; then
    token=$(command gh auth token --user &quot;$GH_WORK_ACCOUNT&quot; 2&amp;gt;/dev/null)
  else
    token=$(command gh auth token --user &quot;$GH_PERSONAL_ACCOUNT&quot; 2&amp;gt;/dev/null)
  fi

  if [[ -n &quot;$token&quot; ]]; then
    GH_TOKEN=&quot;$token&quot; command gh &quot;$@&quot;
  else
    command gh &quot;$@&quot;
  fi
}
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Set account names in ~/.zshrc.local&lt;/h4&gt;
&lt;pre&gt;&lt;code&gt;# Set your GitHub usernames
GH_PERSONAL_ACCOUNT=&quot;your-personal-username&quot;
GH_WORK_ACCOUNT=&quot;your-work-username&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;How it works&lt;/h4&gt;
&lt;p&gt;&lt;strong&gt;What the function wrapper does&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;It wraps the &lt;code&gt;gh&lt;/code&gt; command in a shell function that, at invocation time, checks the current directory and uses the appropriate token.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Element&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;gh auth token --user&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fetches the token for the specified account.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;GH_TOKEN=&quot;...&quot; command gh&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Passes the env var only to that command (process-local).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;command gh&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Calls the actual &lt;code&gt;gh&lt;/code&gt; command, not the function.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Why the &lt;code&gt;command&lt;/code&gt; keyword matters&lt;/strong&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;command gh &quot;$@&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Inside a shell function, calling &lt;code&gt;gh&lt;/code&gt; would recursively call the function itself. The &lt;code&gt;command&lt;/code&gt; keyword bypasses the function and invokes the real binary.&lt;/p&gt;
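&lt;p&gt;A quick way to see the difference in any shell (&lt;code&gt;date&lt;/code&gt; here is just a stand-in):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Shadow a binary with a function, then bypass the function
date() { echo &quot;wrapped&quot;; }
date           # → wrapped (the function runs)
command date   # prints the real date(1) output
&lt;/code&gt;&lt;/pre&gt;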
&lt;p&gt;&lt;strong&gt;Why this approach&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;I previously used a &lt;code&gt;chpwd&lt;/code&gt; hook + &lt;code&gt;gh auth switch&lt;/code&gt;, but &lt;strong&gt;it broke when working across multiple terminal windows in parallel&lt;/strong&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Old approach: &lt;code&gt;gh auth switch&lt;/code&gt; mutates global state, so switching in one window affected the other.&lt;/li&gt;
&lt;li&gt;New approach: &lt;code&gt;GH_TOKEN&lt;/code&gt; is only set for the duration of that command (process-local), so other windows are unaffected.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Why account names go in ~/.zshrc.local&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Account names are personal, so I don&apos;t want them in my dotfiles repository.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;~/.zshrc.local&lt;/code&gt; is treated as a Git-untracked file.&lt;/li&gt;
&lt;/ul&gt;
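&lt;p&gt;This assumes &lt;code&gt;~/.zshrc&lt;/code&gt; loads the local file when it exists, for example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Load machine-local, Git-untracked settings (account names, etc.)
if [[ -f ~/.zshrc.local ]]; then
  source ~/.zshrc.local
fi
&lt;/code&gt;&lt;/pre&gt;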
&lt;h4&gt;Verification&lt;/h4&gt;
&lt;pre&gt;&lt;code&gt;# Check from a work directory
cd ~/work/some-project
gh api user --jq &apos;.login&apos;
# → work-username

# Check from a personal directory
cd ~/personal/my-repo
gh api user --jq &apos;.login&apos;
# → personal-username
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: &lt;code&gt;gh auth status&lt;/code&gt; reports the globally active account, which the function wrapper never changes, so it won&apos;t tell you which account a given directory uses. To check the account actually in effect, run &lt;code&gt;gh api user&lt;/code&gt; as shown above.&lt;/p&gt;
&lt;h2&gt;End-to-End Verification&lt;/h2&gt;
&lt;p&gt;Once everything is configured, verify with these steps:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# 1. Check from a personal directory
cd ~/personal/my-repo
git config user.email          # → personal@example.com
gh api user --jq &apos;.login&apos;      # → personal-username

# 2. Check from a work directory
cd ~/work/company-project
git config user.email          # → work@company.com
gh api user --jq &apos;.login&apos;      # → work-username

# 3. Confirm push works (in a work repo)
git push --dry-run             # Authenticates with the work SSH key
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Results After Adoption&lt;/h2&gt;
&lt;h3&gt;What worked&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Zero forgotten switches&lt;/strong&gt;: Determined automatically by directory structure, no need to think about it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No setup overhead at the start of work&lt;/strong&gt;: Previously I&apos;d always check &quot;which account am I on?&quot;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fewer incidents&lt;/strong&gt;: No more accidental commits/pushes from the wrong account.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Parallel work across windows now works&lt;/strong&gt;: Thanks to the function wrapper approach, work and personal terminals can stay open side by side.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;How I got here&lt;/h3&gt;
&lt;p&gt;I originally adopted the &lt;code&gt;chpwd&lt;/code&gt; hook + &lt;code&gt;gh auth switch&lt;/code&gt; approach, but &lt;strong&gt;it broke when working in parallel across multiple terminal windows&lt;/strong&gt;. Because &lt;code&gt;gh auth switch&lt;/code&gt; mutates global state, switching in one window affected the other.&lt;/p&gt;
&lt;p&gt;So I moved to the function wrapper + &lt;code&gt;GH_TOKEN&lt;/code&gt; environment variable approach. With this approach:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The env var is only effective during command execution (process-local).&lt;/li&gt;
&lt;li&gt;No &lt;code&gt;cd&lt;/code&gt; hook is needed; it&apos;s simpler.&lt;/li&gt;
&lt;li&gt;Global state (&lt;code&gt;gh auth status&lt;/code&gt;) is never mutated.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;What I Learned&lt;/h2&gt;
&lt;h3&gt;1. Git auth and GitHub auth are different things&lt;/h3&gt;
&lt;p&gt;It&apos;s tempting to think &quot;just change Git settings and you&apos;re done,&quot; but in reality there are multiple layers — SSH auth, GitHub API auth, and so on. Each needs to be understood and configured.&lt;/p&gt;
&lt;h3&gt;2. Built-in features are often enough&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;includeIf&lt;/code&gt;, &lt;code&gt;insteadOf&lt;/code&gt;, and shell functions are all standard features. Without adding new tools, I achieved the goal by combining what was already there.&lt;/p&gt;
&lt;h3&gt;3. Pursuing UX is worth it&lt;/h3&gt;
&lt;p&gt;I obsessed over the &quot;just &lt;code&gt;cd&lt;/code&gt;&quot; experience. Setup is complex, but day-to-day operations stay simple. Trading a one-time setup cost for lower ongoing friction was worth it here.&lt;/p&gt;
&lt;h2&gt;References&lt;/h2&gt;
&lt;h3&gt;Official documentation&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://cli.github.com/manual/gh_auth_token&quot;&gt;gh auth token - GitHub CLI Manual&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://docs.github.com/en/github-cli/github-cli/using-multiple-accounts&quot;&gt;Using the GitHub CLI across GitHub platforms - GitHub Docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://git-scm.com/docs/git-config#_includes&quot;&gt;git-config - includeIf - Git Documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Related articles&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.blog/changelog/2023-12-17-log-in-to-multiple-github-accounts-with-the-cli/&quot;&gt;Log in to multiple GitHub accounts with the CLI - GitHub Changelog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://gist.github.com/bgauduch/06a8c4ec2fec8fef6354afe94358c89e&quot;&gt;Git config with multiple identities - GitHub Gist&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</content:encoded><enclosure url="https://zeroshotlog.com/images/2025/12/30/git-multi-account.png" length="0" type="image/png"/></item><item><title>I Built a Tech Blog with Astro</title><link>https://zeroshotlog.com/en/blog/hello-world/</link><guid isPermaLink="true">https://zeroshotlog.com/en/blog/hello-world/</guid><description>I built a blazing-fast tech blog with Astro + Tailwind CSS + Vercel. Here&apos;s why I picked this stack and how I set it up.</description><pubDate>Tue, 30 Dec 2025 09:00:00 GMT</pubDate><content:encoded>&lt;h2&gt;Welcome to Zero-Shot Log&lt;/h2&gt;
&lt;p&gt;This blog is built with &lt;strong&gt;Astro&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;Why Astro?&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Performance&lt;/strong&gt;: Ships no client-side JavaScript by default, so static pages stay extremely fast.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DX (Developer Experience)&lt;/strong&gt;: Drop in components from React, Vue, Svelte, or whatever framework you prefer.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Markdown-first&lt;/strong&gt;: Content is managed as Markdown, which is great for engineers.&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;console.log(&quot;Hello, Astro!&quot;);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;More technical posts coming soon!&lt;/p&gt;
</content:encoded><enclosure url="https://zeroshotlog.com/images/2025/12/30/astro-blog.png" length="0" type="image/png"/></item></channel></rss>