Merge pull request #1634 from NousResearch/hermes/hermes-a86162db

chore: release v0.3.0 (v2026.3.17)
2026-03-17 00:55:51 -07:00 · 2026-03-17 00:38:48 -07:00 · 2026-03-17 00:12:16 -07:00 · 2026-03-16 23:48:14 -07:00 · 2026-03-16 23:39:41 -07:00 · 2026-03-16 23:26:43 -07:00
152 changed files with 14449 additions and 2795 deletions
@@ -129,14 +129,50 @@ Messages follow OpenAI format: `{"role": "system/user/assistant/tool", ...}`. Re
 - **KawaiiSpinner** (`agent/display.py`) — animated faces during API calls, `┊` activity feed for tool results
 - `load_cli_config()` in cli.py merges hardcoded defaults + user config YAML
 - **Skin engine** (`hermes_cli/skin_engine.py`) — data-driven CLI theming; initialized from `display.skin` config key at startup; skins customize banner colors, spinner faces/verbs/wings, tool prefix, response box, branding text
- `process_command()` is a method on `HermesCLI` (not in commands.py)
+- `process_command()` is a method on `HermesCLI` — dispatches on canonical command name resolved via `resolve_command()` from the central registry
 - Skill slash commands: `agent/skill_commands.py` scans `~/.hermes/skills/`, injects as **user message** (not system prompt) to preserve prompt caching

-### Adding CLI Commands
+### Slash Command Registry (`hermes_cli/commands.py`)

-1. Add to `COMMANDS` dict in `hermes_cli/commands.py`
-2. Add handler in `HermesCLI.process_command()` in `cli.py`
-3. For persistent settings, use `save_config_value()` in `cli.py`
+All slash commands are defined in a central `COMMAND_REGISTRY` list of `CommandDef` objects. Every downstream consumer derives from this registry automatically:
+
+- **CLI** — `process_command()` resolves aliases via `resolve_command()`, dispatches on canonical name
+- **Gateway** — `GATEWAY_KNOWN_COMMANDS` frozenset for hook emission, `resolve_command()` for dispatch
+- **Gateway help** — `gateway_help_lines()` generates `/help` output
+- **Telegram** — `telegram_bot_commands()` generates the BotCommand menu
+- **Slack** — `slack_subcommand_map()` generates `/hermes` subcommand routing
+- **Autocomplete** — `COMMANDS` flat dict feeds `SlashCommandCompleter`
+- **CLI help** — `COMMANDS_BY_CATEGORY` dict feeds `show_help()`
+
+### Adding a Slash Command
+
+1. Add a `CommandDef` entry to `COMMAND_REGISTRY` in `hermes_cli/commands.py`:
+```python
+CommandDef("mycommand", "Description of what it does", "Session",
+           aliases=("mc",), args_hint="[arg]"),
+```
+2. Add handler in `HermesCLI.process_command()` in `cli.py`:
+```python
+elif canonical == "mycommand":
+    self._handle_mycommand(cmd_original)
+```
+3. If the command is available in the gateway, add a handler in `gateway/run.py`:
+```python
+if canonical == "mycommand":
+    return await self._handle_mycommand(event)
+```
+4. For persistent settings, use `save_config_value()` in `cli.py`
+
+**CommandDef fields:**
+- `name` — canonical name without slash (e.g. `"background"`)
+- `description` — human-readable description
+- `category` — one of `"Session"`, `"Configuration"`, `"Tools & Skills"`, `"Info"`, `"Exit"`
+- `aliases` — tuple of alternative names (e.g. `("bg",)`)
+- `args_hint` — argument placeholder shown in help (e.g. `"<prompt>"`, `"[name]"`)
+- `cli_only` — only available in the interactive CLI
+- `gateway_only` — only available in messaging platforms
+
+**Adding an alias** requires only adding it to the `aliases` tuple on the existing `CommandDef`. No other file changes needed — dispatch, help text, Telegram menu, Slack mapping, and autocomplete all update automatically.

 ---

@@ -136,7 +136,7 @@ hermes-agent/
 │   ├── auth.py                   # Provider resolution, OAuth, Nous Portal
 │   ├── models.py                 # OpenRouter model selection lists
 │   ├── banner.py                 # Welcome banner, ASCII art
-│   ├── commands.py               # Slash command definitions + autocomplete
+│   ├── commands.py               # Central slash command registry (CommandDef), autocomplete, gateway helpers
 │   ├── callbacks.py              # Interactive callbacks (clarify, sudo, approval)
 │   ├── doctor.py                 # Diagnostics
 │   ├── skills_hub.py             # Skills Hub CLI + /skills slash command
@@ -62,6 +62,24 @@ hermes doctor       # Diagnose any issues

 📖 **[Full documentation →](https://hermes-agent.nousresearch.com/docs/)**

+## CLI vs Messaging Quick Reference
+
+Hermes has two entry points: start the terminal UI with `hermes`, or run the gateway and talk to it from Telegram, Discord, Slack, WhatsApp, Signal, or Email. Once you're in a conversation, many slash commands are shared across both interfaces.
+
+| Action | CLI | Messaging platforms |
+|---------|-----|---------------------|
+| Start chatting | `hermes` | Run `hermes gateway setup` + `hermes gateway start`, then send the bot a message |
+| Start fresh conversation | `/new` or `/reset` | `/new` or `/reset` |
+| Change model | `/model [provider:model]` | `/model [provider:model]` |
+| Set a personality | `/personality [name]` | `/personality [name]` |
+| Retry or undo the last turn | `/retry`, `/undo` | `/retry`, `/undo` |
+| Compress context / check usage | `/compress`, `/usage`, `/insights [--days N]` | `/compress`, `/usage`, `/insights [days]` |
+| Browse skills | `/skills` or `/<skill-name>` | `/skills` or `/<skill-name>` |
+| Interrupt current work | `Ctrl+C` or send a new message | `/stop` or send a new message |
+| Platform-specific status | `/platforms` | `/status`, `/sethome` |
+
+For the full command lists, see the [CLI guide](https://hermes-agent.nousresearch.com/docs/user-guide/cli) and the [Messaging Gateway guide](https://hermes-agent.nousresearch.com/docs/user-guide/messaging).
+
 ---

 ## Documentation
@@ -0,0 +1,377 @@
+# Hermes Agent v0.3.0 (v2026.3.17)
+
+**Release Date:** March 17, 2026
+
+> The streaming, plugins, and provider release — unified real-time token delivery, first-class plugin architecture, rebuilt provider system with Vercel AI Gateway, native Anthropic provider, smart approvals, live Chrome CDP browser connect, ACP IDE integration, Honcho memory, voice mode, persistent shell, and 50+ bug fixes across every platform.
+
+---
+
+## ✨ Highlights
+
+- **Unified Streaming Infrastructure** — Real-time token-by-token delivery in CLI and all gateway platforms. Responses stream as they're generated instead of arriving as a block. ([#1538](https://github.com/NousResearch/hermes-agent/pull/1538))
+
+- **First-Class Plugin Architecture** — Drop Python files into `~/.hermes/plugins/` to extend Hermes with custom tools, commands, and hooks. No forking required. ([#1544](https://github.com/NousResearch/hermes-agent/pull/1544), [#1555](https://github.com/NousResearch/hermes-agent/pull/1555))
+
+- **Native Anthropic Provider** — Direct Anthropic API calls with Claude Code credential auto-discovery, OAuth PKCE flows, and native prompt caching. No OpenRouter middleman needed. ([#1097](https://github.com/NousResearch/hermes-agent/pull/1097))
+
+- **Smart Approvals + /stop Command** — Codex-inspired approval system that learns which commands are safe and remembers your preferences. `/stop` kills the current agent run immediately. ([#1543](https://github.com/NousResearch/hermes-agent/pull/1543))
+
+- **Honcho Memory Integration** — Async memory writes, configurable recall modes, session title integration, and multi-user isolation in gateway mode. By @erosika. ([#736](https://github.com/NousResearch/hermes-agent/pull/736))
+
+- **Voice Mode** — Push-to-talk in CLI, voice notes in Telegram/Discord, Discord voice channel support, and local Whisper transcription via faster-whisper. ([#1299](https://github.com/NousResearch/hermes-agent/pull/1299), [#1185](https://github.com/NousResearch/hermes-agent/pull/1185), [#1429](https://github.com/NousResearch/hermes-agent/pull/1429))
+
+- **Concurrent Tool Execution** — Multiple independent tool calls now run in parallel via ThreadPoolExecutor, significantly reducing latency for multi-tool turns. ([#1152](https://github.com/NousResearch/hermes-agent/pull/1152))
+
+- **PII Redaction** — When `privacy.redact_pii` is enabled, personally identifiable information is automatically scrubbed before sending context to LLM providers. ([#1542](https://github.com/NousResearch/hermes-agent/pull/1542))
+
+- **`/browser connect` via CDP** — Attach browser tools to a live Chrome instance through Chrome DevTools Protocol. Debug, inspect, and interact with pages you already have open. ([#1549](https://github.com/NousResearch/hermes-agent/pull/1549))
+
+- **Vercel AI Gateway Provider** — Route Hermes through Vercel's AI Gateway for access to their model catalog and infrastructure. ([#1628](https://github.com/NousResearch/hermes-agent/pull/1628))
+
+- **Centralized Provider Router** — Rebuilt provider system with `call_llm` API, unified `/model` command, auto-detect provider on model switch, and direct endpoint overrides for auxiliary/delegation clients. ([#1003](https://github.com/NousResearch/hermes-agent/pull/1003), [#1506](https://github.com/NousResearch/hermes-agent/pull/1506), [#1375](https://github.com/NousResearch/hermes-agent/pull/1375))
+
+- **ACP Server (IDE Integration)** — VS Code, Zed, and JetBrains can now connect to Hermes as an agent backend, with full slash command support. ([#1254](https://github.com/NousResearch/hermes-agent/pull/1254), [#1532](https://github.com/NousResearch/hermes-agent/pull/1532))
+
+- **Persistent Shell Mode** — Local and SSH terminal backends can maintain shell state across tool calls — cd, env vars, and aliases persist. By @alt-glitch. ([#1067](https://github.com/NousResearch/hermes-agent/pull/1067), [#1483](https://github.com/NousResearch/hermes-agent/pull/1483))
+
+- **Agentic On-Policy Distillation (OPD)** — New RL training environment for distilling agent policies, expanding the Atropos training ecosystem. ([#1149](https://github.com/NousResearch/hermes-agent/pull/1149))
+
+---
+
+## 🏗️ Core Agent & Architecture
+
+### Provider & Model Support
+- **Centralized provider router** with `call_llm` API and unified `/model` command — switch models and providers seamlessly ([#1003](https://github.com/NousResearch/hermes-agent/pull/1003))
+- **Vercel AI Gateway** provider support ([#1628](https://github.com/NousResearch/hermes-agent/pull/1628))
+- **Auto-detect provider** when switching models via `/model` ([#1506](https://github.com/NousResearch/hermes-agent/pull/1506))
+- **Direct endpoint overrides** for auxiliary and delegation clients — point vision/subagent calls at specific endpoints ([#1375](https://github.com/NousResearch/hermes-agent/pull/1375))
+- **Native Anthropic auxiliary vision** — use Claude's native vision API instead of routing through OpenAI-compatible endpoints ([#1377](https://github.com/NousResearch/hermes-agent/pull/1377))
+- Anthropic OAuth flow improvements — auto-run `claude setup-token`, reauthentication, PKCE state persistence, identity fingerprinting ([#1132](https://github.com/NousResearch/hermes-agent/pull/1132), [#1360](https://github.com/NousResearch/hermes-agent/pull/1360), [#1396](https://github.com/NousResearch/hermes-agent/pull/1396), [#1597](https://github.com/NousResearch/hermes-agent/pull/1597))
+- Fix adaptive thinking without `budget_tokens` for Claude 4.6 models — by @ASRagab ([#1128](https://github.com/NousResearch/hermes-agent/pull/1128))
+- Fix Anthropic cache markers through adapter — by @brandtcormorant ([#1216](https://github.com/NousResearch/hermes-agent/pull/1216))
+- Retry Anthropic 429/529 errors and surface details to users — by @0xbyt4 ([#1585](https://github.com/NousResearch/hermes-agent/pull/1585))
+- Fix Anthropic adapter max_tokens, fallback crash, proxy base_url — by @0xbyt4 ([#1121](https://github.com/NousResearch/hermes-agent/pull/1121))
+- Fix DeepSeek V3 parser dropping multiple parallel tool calls — by @mr-emmett-one ([#1365](https://github.com/NousResearch/hermes-agent/pull/1365), [#1300](https://github.com/NousResearch/hermes-agent/pull/1300))
+- Accept unlisted models with warning instead of rejecting ([#1047](https://github.com/NousResearch/hermes-agent/pull/1047), [#1102](https://github.com/NousResearch/hermes-agent/pull/1102))
+- Skip reasoning params for unsupported OpenRouter models ([#1485](https://github.com/NousResearch/hermes-agent/pull/1485))
+- MiniMax Anthropic API compatibility fix ([#1623](https://github.com/NousResearch/hermes-agent/pull/1623))
+- Custom endpoint `/models` verification and `/v1` base URL suggestion ([#1480](https://github.com/NousResearch/hermes-agent/pull/1480))
+- Resolve delegation providers from `custom_providers` config ([#1328](https://github.com/NousResearch/hermes-agent/pull/1328))
+- Kimi model additions and User-Agent fix ([#1039](https://github.com/NousResearch/hermes-agent/pull/1039))
+- Strip `call_id`/`response_item_id` for Mistral compatibility ([#1058](https://github.com/NousResearch/hermes-agent/pull/1058))
+
+### Agent Loop & Conversation
+- **Anthropic Context Editing API** support ([#1147](https://github.com/NousResearch/hermes-agent/pull/1147))
+- Improved context compaction handoff summaries — compressor now preserves more actionable state ([#1273](https://github.com/NousResearch/hermes-agent/pull/1273))
+- Sync session_id after mid-run context compression ([#1160](https://github.com/NousResearch/hermes-agent/pull/1160))
+- Session hygiene threshold tuned to 50% for more proactive compression ([#1096](https://github.com/NousResearch/hermes-agent/pull/1096), [#1161](https://github.com/NousResearch/hermes-agent/pull/1161))
+- Include session ID in system prompt via `--pass-session-id` flag ([#1040](https://github.com/NousResearch/hermes-agent/pull/1040))
+- Prevent closed OpenAI client reuse across retries ([#1391](https://github.com/NousResearch/hermes-agent/pull/1391))
+- Sanitize chat payloads and provider precedence ([#1253](https://github.com/NousResearch/hermes-agent/pull/1253))
+- Handle dict tool call arguments from Codex and local backends ([#1393](https://github.com/NousResearch/hermes-agent/pull/1393), [#1440](https://github.com/NousResearch/hermes-agent/pull/1440))
+
+### Memory & Sessions
+- **Improve memory prioritization** — user preferences and corrections weighted above procedural knowledge ([#1548](https://github.com/NousResearch/hermes-agent/pull/1548))
+- Tighter memory and session recall guidance in system prompts ([#1329](https://github.com/NousResearch/hermes-agent/pull/1329))
+- Persist CLI token counts to session DB for `/insights` ([#1498](https://github.com/NousResearch/hermes-agent/pull/1498))
+- Keep Honcho recall out of the cached system prefix ([#1201](https://github.com/NousResearch/hermes-agent/pull/1201))
+- Correct `seed_ai_identity` to use `session.add_messages()` ([#1475](https://github.com/NousResearch/hermes-agent/pull/1475))
+- Isolate Honcho session routing for multi-user gateway ([#1500](https://github.com/NousResearch/hermes-agent/pull/1500))
+
+---
+
+## 📱 Messaging Platforms (Gateway)
+
+### Gateway Core
+- **System gateway service mode** — run as a system-level systemd service, not just user-level ([#1371](https://github.com/NousResearch/hermes-agent/pull/1371))
+- **Gateway install scope prompts** — choose user vs system scope during setup ([#1374](https://github.com/NousResearch/hermes-agent/pull/1374))
+- **Reasoning hot reload** — change reasoning settings without restarting the gateway ([#1275](https://github.com/NousResearch/hermes-agent/pull/1275))
+- Default group sessions to per-user isolation — no more shared state across users in group chats ([#1495](https://github.com/NousResearch/hermes-agent/pull/1495), [#1417](https://github.com/NousResearch/hermes-agent/pull/1417))
+- Harden gateway restart recovery ([#1310](https://github.com/NousResearch/hermes-agent/pull/1310))
+- Cancel active runs during shutdown ([#1427](https://github.com/NousResearch/hermes-agent/pull/1427))
+- SSL certificate auto-detection for NixOS and non-standard systems ([#1494](https://github.com/NousResearch/hermes-agent/pull/1494))
+- Auto-detect D-Bus session bus for `systemctl --user` on headless servers ([#1601](https://github.com/NousResearch/hermes-agent/pull/1601))
+- Auto-enable systemd linger during gateway install on headless servers ([#1334](https://github.com/NousResearch/hermes-agent/pull/1334))
+- Fall back to module entrypoint when `hermes` is not on PATH ([#1355](https://github.com/NousResearch/hermes-agent/pull/1355))
+- Fix dual gateways on macOS launchd after `hermes update` ([#1567](https://github.com/NousResearch/hermes-agent/pull/1567))
+- Remove recursive ExecStop from systemd units ([#1530](https://github.com/NousResearch/hermes-agent/pull/1530))
+- Prevent logging handler accumulation in gateway mode ([#1251](https://github.com/NousResearch/hermes-agent/pull/1251))
+- Restart on retryable startup failures — by @jplew ([#1517](https://github.com/NousResearch/hermes-agent/pull/1517))
+- Backfill model on gateway sessions after agent runs ([#1306](https://github.com/NousResearch/hermes-agent/pull/1306))
+- PID-based gateway kill and deferred config write ([#1499](https://github.com/NousResearch/hermes-agent/pull/1499))
+
+### Telegram
+- Buffer media groups to prevent self-interruption from photo bursts ([#1341](https://github.com/NousResearch/hermes-agent/pull/1341), [#1422](https://github.com/NousResearch/hermes-agent/pull/1422))
+- Retry on transient TLS failures during connect and send ([#1535](https://github.com/NousResearch/hermes-agent/pull/1535))
+- Harden polling conflict handling ([#1339](https://github.com/NousResearch/hermes-agent/pull/1339))
+- Escape chunk indicators and inline code in MarkdownV2 ([#1478](https://github.com/NousResearch/hermes-agent/pull/1478), [#1626](https://github.com/NousResearch/hermes-agent/pull/1626))
+- Check updater/app state before disconnect ([#1389](https://github.com/NousResearch/hermes-agent/pull/1389))
+
+### Discord
+- `/thread` command with `auto_thread` config and media metadata fixes ([#1178](https://github.com/NousResearch/hermes-agent/pull/1178))
+- Auto-thread on @mention, skip mention text in bot threads ([#1438](https://github.com/NousResearch/hermes-agent/pull/1438))
+- Retry without reply reference for system messages ([#1385](https://github.com/NousResearch/hermes-agent/pull/1385))
+- Preserve native document and video attachment support ([#1392](https://github.com/NousResearch/hermes-agent/pull/1392))
+- Defer discord adapter annotations to avoid optional import crashes ([#1314](https://github.com/NousResearch/hermes-agent/pull/1314))
+
+### Slack
+- Thread handling overhaul — progress messages, responses, and session isolation all respect threads ([#1103](https://github.com/NousResearch/hermes-agent/pull/1103))
+- Formatting, reactions, user resolution, and command improvements ([#1106](https://github.com/NousResearch/hermes-agent/pull/1106))
+- Fix MAX_MESSAGE_LENGTH 3900 → 39000 ([#1117](https://github.com/NousResearch/hermes-agent/pull/1117))
+- File upload fallback preserves thread context — by @0xbyt4 ([#1122](https://github.com/NousResearch/hermes-agent/pull/1122))
+- Improve setup guidance ([#1387](https://github.com/NousResearch/hermes-agent/pull/1387))
+
+### Email
+- Fix IMAP UID tracking and SMTP TLS verification ([#1305](https://github.com/NousResearch/hermes-agent/pull/1305))
+- Add `skip_attachments` option via config.yaml ([#1536](https://github.com/NousResearch/hermes-agent/pull/1536))
+
+### Home Assistant
+- Event filtering closed by default ([#1169](https://github.com/NousResearch/hermes-agent/pull/1169))
+
+---
+
+## 🖥️ CLI & User Experience
+
+### Interactive CLI
+- **Persistent CLI status bar** — always-visible model, provider, and token counts ([#1522](https://github.com/NousResearch/hermes-agent/pull/1522))
+- **File path autocomplete** in the input prompt ([#1545](https://github.com/NousResearch/hermes-agent/pull/1545))
+- **`/plan` command** — generate implementation plans from specs ([#1372](https://github.com/NousResearch/hermes-agent/pull/1372), [#1381](https://github.com/NousResearch/hermes-agent/pull/1381))
+- **Major `/rollback` improvements** — richer checkpoint history, clearer UX ([#1505](https://github.com/NousResearch/hermes-agent/pull/1505))
+- **Preload CLI skills on launch** — skills are ready before the first prompt ([#1359](https://github.com/NousResearch/hermes-agent/pull/1359))
+- **Centralized slash command registry** — all commands defined once, consumed everywhere ([#1603](https://github.com/NousResearch/hermes-agent/pull/1603))
+- `/bg` alias for `/background` ([#1590](https://github.com/NousResearch/hermes-agent/pull/1590))
+- Prefix matching for slash commands — `/mod` resolves to `/model` ([#1320](https://github.com/NousResearch/hermes-agent/pull/1320))
+- `/new`, `/reset`, `/clear` now start genuinely fresh sessions ([#1237](https://github.com/NousResearch/hermes-agent/pull/1237))
+- Accept session ID prefixes for session actions ([#1425](https://github.com/NousResearch/hermes-agent/pull/1425))
+- TUI prompt and accent output now respect active skin ([#1282](https://github.com/NousResearch/hermes-agent/pull/1282))
+- Centralize tool emoji metadata in registry + skin integration ([#1484](https://github.com/NousResearch/hermes-agent/pull/1484))
+- "View full command" option added to dangerous command approval — by @teknium1 based on design by community ([#887](https://github.com/NousResearch/hermes-agent/pull/887))
+- Non-blocking startup update check and banner deduplication ([#1386](https://github.com/NousResearch/hermes-agent/pull/1386))
+- `/reasoning` command output ordering and inline think extraction fixes ([#1031](https://github.com/NousResearch/hermes-agent/pull/1031))
+- Verbose mode shows full untruncated output ([#1472](https://github.com/NousResearch/hermes-agent/pull/1472))
+- Fix `/status` to report live state and tokens ([#1476](https://github.com/NousResearch/hermes-agent/pull/1476))
+- Seed a default global SOUL.md ([#1311](https://github.com/NousResearch/hermes-agent/pull/1311))
+
+### Setup & Configuration
+- **OpenClaw migration** during first-time setup — by @kshitijk4poor ([#981](https://github.com/NousResearch/hermes-agent/pull/981))
+- `hermes claw migrate` command + migration docs ([#1059](https://github.com/NousResearch/hermes-agent/pull/1059))
+- Smart vision setup that respects the user's chosen provider ([#1323](https://github.com/NousResearch/hermes-agent/pull/1323))
+- Handle headless setup flows end-to-end ([#1274](https://github.com/NousResearch/hermes-agent/pull/1274))
+- Prefer curses over `simple_term_menu` in setup.py ([#1487](https://github.com/NousResearch/hermes-agent/pull/1487))
+- Show effective model and provider in `/status` ([#1284](https://github.com/NousResearch/hermes-agent/pull/1284))
+- Config set examples use placeholder syntax ([#1322](https://github.com/NousResearch/hermes-agent/pull/1322))
+- Reload .env over stale shell overrides ([#1434](https://github.com/NousResearch/hermes-agent/pull/1434))
+- Fix is_coding_plan NameError crash — by @0xbyt4 ([#1123](https://github.com/NousResearch/hermes-agent/pull/1123))
+- Add missing packages to setuptools config — by @alt-glitch ([#912](https://github.com/NousResearch/hermes-agent/pull/912))
+- Installer: clarify why sudo is needed at every prompt ([#1602](https://github.com/NousResearch/hermes-agent/pull/1602))
+
+---
+
+## 🔧 Tool System
+
+### Terminal & Execution
+- **Persistent shell mode** for local and SSH backends — maintain shell state across tool calls — by @alt-glitch ([#1067](https://github.com/NousResearch/hermes-agent/pull/1067), [#1483](https://github.com/NousResearch/hermes-agent/pull/1483))
+- **Tirith pre-exec command scanning** — security layer that analyzes commands before execution ([#1256](https://github.com/NousResearch/hermes-agent/pull/1256))
+- Strip Hermes provider env vars from all subprocess environments ([#1157](https://github.com/NousResearch/hermes-agent/pull/1157), [#1172](https://github.com/NousResearch/hermes-agent/pull/1172), [#1399](https://github.com/NousResearch/hermes-agent/pull/1399), [#1419](https://github.com/NousResearch/hermes-agent/pull/1419)) — initial fix by @eren-karakus0
+- SSH preflight check ([#1486](https://github.com/NousResearch/hermes-agent/pull/1486))
+- Docker backend: make cwd workspace mount explicit opt-in ([#1534](https://github.com/NousResearch/hermes-agent/pull/1534))
+- Add project root to PYTHONPATH in execute_code sandbox ([#1383](https://github.com/NousResearch/hermes-agent/pull/1383))
+- Eliminate execute_code progress spam on gateway platforms ([#1098](https://github.com/NousResearch/hermes-agent/pull/1098))
+- Clearer docker backend preflight errors ([#1276](https://github.com/NousResearch/hermes-agent/pull/1276))
+
+### Browser
+- **`/browser connect`** — attach browser tools to a live Chrome instance via CDP ([#1549](https://github.com/NousResearch/hermes-agent/pull/1549))
+- Improve browser cleanup, local browser PATH setup, and screenshot recovery ([#1333](https://github.com/NousResearch/hermes-agent/pull/1333))
+
+### MCP
+- **Selective tool loading** with utility policies — filter which MCP tools are available ([#1302](https://github.com/NousResearch/hermes-agent/pull/1302))
+- Auto-reload MCP tools when `mcp_servers` config changes without restart ([#1474](https://github.com/NousResearch/hermes-agent/pull/1474))
+- Resolve npx stdio connection failures ([#1291](https://github.com/NousResearch/hermes-agent/pull/1291))
+- Preserve MCP toolsets when saving platform tool config ([#1421](https://github.com/NousResearch/hermes-agent/pull/1421))
+
+### Vision
+- Unify vision backend gating ([#1367](https://github.com/NousResearch/hermes-agent/pull/1367))
+- Surface actual error reason instead of generic message ([#1338](https://github.com/NousResearch/hermes-agent/pull/1338))
+- Make Claude image handling work end-to-end ([#1408](https://github.com/NousResearch/hermes-agent/pull/1408))
+
+### Cron
+- **Compress cron management into one tool** — single `cronjob` tool replaces multiple commands ([#1343](https://github.com/NousResearch/hermes-agent/pull/1343))
+- Suppress duplicate cron sends to auto-delivery targets ([#1357](https://github.com/NousResearch/hermes-agent/pull/1357))
+- Persist cron sessions to SQLite ([#1255](https://github.com/NousResearch/hermes-agent/pull/1255))
+- Per-job runtime overrides (provider, model, base_url) ([#1398](https://github.com/NousResearch/hermes-agent/pull/1398))
+- Atomic write in `save_job_output` to prevent data loss on crash ([#1173](https://github.com/NousResearch/hermes-agent/pull/1173))
+- Preserve thread context for `deliver=origin` ([#1437](https://github.com/NousResearch/hermes-agent/pull/1437))
+
+### Patch Tool
+- Avoid corrupting pipe chars in V4A patch apply ([#1286](https://github.com/NousResearch/hermes-agent/pull/1286))
+- Permissive `block_anchor` thresholds and unicode normalization ([#1539](https://github.com/NousResearch/hermes-agent/pull/1539))
+
+### Delegation
+- Add observability metadata to subagent results (model, tokens, duration, tool trace) ([#1175](https://github.com/NousResearch/hermes-agent/pull/1175))
+
+---
+
+## 🧩 Skills Ecosystem
+
+### Skills System
+- **Integrate skills.sh** as a hub source alongside ClawHub ([#1303](https://github.com/NousResearch/hermes-agent/pull/1303))
+- Secure skill env setup on load ([#1153](https://github.com/NousResearch/hermes-agent/pull/1153))
+- Honor policy table for dangerous verdicts ([#1330](https://github.com/NousResearch/hermes-agent/pull/1330))
+- Harden ClawHub skill search exact matches ([#1400](https://github.com/NousResearch/hermes-agent/pull/1400))
+- Fix ClawHub skill install — use `/download` ZIP endpoint ([#1060](https://github.com/NousResearch/hermes-agent/pull/1060))
+- Avoid mislabeling local skills as builtin — by @arceus77-7 ([#862](https://github.com/NousResearch/hermes-agent/pull/862))
+
+### New Skills
+- **Linear** project management ([#1230](https://github.com/NousResearch/hermes-agent/pull/1230))
+- **X/Twitter** via x-cli ([#1285](https://github.com/NousResearch/hermes-agent/pull/1285))
+- **Telephony** — Twilio, SMS, and AI calls ([#1289](https://github.com/NousResearch/hermes-agent/pull/1289))
+- **1Password** — by @arceus77-7 ([#883](https://github.com/NousResearch/hermes-agent/pull/883), [#1179](https://github.com/NousResearch/hermes-agent/pull/1179))
+- **NeuroSkill BCI** integration ([#1135](https://github.com/NousResearch/hermes-agent/pull/1135))
+- **Blender MCP** for 3D modeling ([#1531](https://github.com/NousResearch/hermes-agent/pull/1531))
+- **OSS Security Forensics** ([#1482](https://github.com/NousResearch/hermes-agent/pull/1482))
+- **Parallel CLI** research skill ([#1301](https://github.com/NousResearch/hermes-agent/pull/1301))
+- **OpenCode** CLI skill ([#1174](https://github.com/NousResearch/hermes-agent/pull/1174))
+- **ASCII Video** skill refactored — by @SHL0MS ([#1213](https://github.com/NousResearch/hermes-agent/pull/1213), [#1598](https://github.com/NousResearch/hermes-agent/pull/1598))
+
+---
+
+## 🎙️ Voice Mode
+
+- Voice mode foundation — push-to-talk CLI, Telegram/Discord voice notes ([#1299](https://github.com/NousResearch/hermes-agent/pull/1299))
+- Free local Whisper transcription via faster-whisper ([#1185](https://github.com/NousResearch/hermes-agent/pull/1185))
+- Discord voice channel reliability fixes ([#1429](https://github.com/NousResearch/hermes-agent/pull/1429))
+- Restore local STT fallback for gateway voice notes ([#1490](https://github.com/NousResearch/hermes-agent/pull/1490))
+- Honor `stt.enabled: false` across gateway transcription ([#1394](https://github.com/NousResearch/hermes-agent/pull/1394))
+- Fix bogus incapability message on Telegram voice notes (Issue [#1033](https://github.com/NousResearch/hermes-agent/issues/1033))
+
+---
+
+## 🔌 ACP (IDE Integration)
+
+- Restore ACP server implementation ([#1254](https://github.com/NousResearch/hermes-agent/pull/1254))
+- Support slash commands in ACP adapter ([#1532](https://github.com/NousResearch/hermes-agent/pull/1532))
+
+---
+
+## 🧪 RL Training
+
+- **Agentic On-Policy Distillation (OPD)** environment — new RL training environment for agent policy distillation ([#1149](https://github.com/NousResearch/hermes-agent/pull/1149))
+- Make tinker-atropos RL training fully optional ([#1062](https://github.com/NousResearch/hermes-agent/pull/1062))
+
+---
+
+## 🔒 Security & Reliability
+
+### Security Hardening
+- **Tirith pre-exec command scanning** — static analysis of terminal commands before execution ([#1256](https://github.com/NousResearch/hermes-agent/pull/1256))
+- **PII redaction** when `privacy.redact_pii` is enabled ([#1542](https://github.com/NousResearch/hermes-agent/pull/1542))
+- Strip Hermes provider/gateway/tool env vars from all subprocess environments ([#1157](https://github.com/NousResearch/hermes-agent/pull/1157), [#1172](https://github.com/NousResearch/hermes-agent/pull/1172), [#1399](https://github.com/NousResearch/hermes-agent/pull/1399), [#1419](https://github.com/NousResearch/hermes-agent/pull/1419))
+- Docker cwd workspace mount now explicit opt-in — never auto-mount host directories ([#1534](https://github.com/NousResearch/hermes-agent/pull/1534))
+- Escape parens and braces in fork bomb regex pattern ([#1397](https://github.com/NousResearch/hermes-agent/pull/1397))
+- Harden `.worktreeinclude` path containment ([#1388](https://github.com/NousResearch/hermes-agent/pull/1388))
+- Use description as `pattern_key` to prevent approval collisions ([#1395](https://github.com/NousResearch/hermes-agent/pull/1395))
+
+### Reliability
+- Guard init-time stdio writes ([#1271](https://github.com/NousResearch/hermes-agent/pull/1271))
+- Session log writes reuse shared atomic JSON helper ([#1280](https://github.com/NousResearch/hermes-agent/pull/1280))
+- Atomic temp cleanup protected on interrupts ([#1401](https://github.com/NousResearch/hermes-agent/pull/1401))
+
+---
+
+## 🐛 Notable Bug Fixes
+
+- **`/status` always showing 0 tokens** — now reports live state (Issue [#1465](https://github.com/NousResearch/hermes-agent/issues/1465), [#1476](https://github.com/NousResearch/hermes-agent/pull/1476))
+- **Custom model endpoints not working** — restored config-saved endpoint resolution (Issue [#1460](https://github.com/NousResearch/hermes-agent/issues/1460), [#1373](https://github.com/NousResearch/hermes-agent/pull/1373))
+- **MCP tools not visible until restart** — auto-reload on config change (Issue [#1036](https://github.com/NousResearch/hermes-agent/issues/1036), [#1474](https://github.com/NousResearch/hermes-agent/pull/1474))
+- **`hermes tools` removing MCP tools** — preserve MCP toolsets when saving (Issue [#1247](https://github.com/NousResearch/hermes-agent/issues/1247), [#1421](https://github.com/NousResearch/hermes-agent/pull/1421))
+- **Terminal subprocesses inheriting `OPENAI_BASE_URL`** breaking external tools (Issue [#1002](https://github.com/NousResearch/hermes-agent/issues/1002), [#1399](https://github.com/NousResearch/hermes-agent/pull/1399))
+- **Background process lost on gateway restart** — improved recovery (Issue [#1144](https://github.com/NousResearch/hermes-agent/issues/1144))
+- **Cron jobs not persisting state** — now stored in SQLite (Issue [#1416](https://github.com/NousResearch/hermes-agent/issues/1416), [#1255](https://github.com/NousResearch/hermes-agent/pull/1255))
+- **Cronjob `deliver: origin` not preserving thread context** (Issue [#1219](https://github.com/NousResearch/hermes-agent/issues/1219), [#1437](https://github.com/NousResearch/hermes-agent/pull/1437))
+- **Gateway systemd service failing to auto-restart** when browser processes orphaned (Issue [#1617](https://github.com/NousResearch/hermes-agent/issues/1617))
+- **`/background` completion report cut off in Telegram** (Issue [#1443](https://github.com/NousResearch/hermes-agent/issues/1443))
+- **Model switching not taking effect** (Issue [#1244](https://github.com/NousResearch/hermes-agent/issues/1244), [#1183](https://github.com/NousResearch/hermes-agent/pull/1183))
+- **`hermes doctor` reporting cronjob as unavailable** (Issue [#878](https://github.com/NousResearch/hermes-agent/issues/878), [#1180](https://github.com/NousResearch/hermes-agent/pull/1180))
+- **WhatsApp bridge messages not received** from mobile (Issue [#1142](https://github.com/NousResearch/hermes-agent/issues/1142))
+- **Setup wizard hanging on headless SSH** (Issue [#905](https://github.com/NousResearch/hermes-agent/issues/905), [#1274](https://github.com/NousResearch/hermes-agent/pull/1274))
+- **Log handler accumulation** degrading gateway performance (Issue [#990](https://github.com/NousResearch/hermes-agent/issues/990), [#1251](https://github.com/NousResearch/hermes-agent/pull/1251))
+- **Gateway NULL model in DB** (Issue [#987](https://github.com/NousResearch/hermes-agent/issues/987), [#1306](https://github.com/NousResearch/hermes-agent/pull/1306))
+- **Strict endpoints rejecting replayed tool_calls** (Issue [#893](https://github.com/NousResearch/hermes-agent/issues/893))
+- **Remaining hardcoded `~/.hermes` paths** — all now respect `HERMES_HOME` (Issue [#892](https://github.com/NousResearch/hermes-agent/issues/892), [#1233](https://github.com/NousResearch/hermes-agent/pull/1233))
+- **Delegate tool not working with custom inference providers** (Issue [#1011](https://github.com/NousResearch/hermes-agent/issues/1011), [#1328](https://github.com/NousResearch/hermes-agent/pull/1328))
+- **Skills Guard blocking official skills** (Issue [#1006](https://github.com/NousResearch/hermes-agent/issues/1006), [#1330](https://github.com/NousResearch/hermes-agent/pull/1330))
+- **Setup writing provider before model selection** (Issue [#1182](https://github.com/NousResearch/hermes-agent/issues/1182))
+- **`GatewayConfig.get()` AttributeError** crashing all message handling (Issue [#1158](https://github.com/NousResearch/hermes-agent/issues/1158), [#1287](https://github.com/NousResearch/hermes-agent/pull/1287))
+- **`/update` hard-failing with "command not found"** (Issue [#1049](https://github.com/NousResearch/hermes-agent/issues/1049))
+- **Image analysis failing silently** (Issue [#1034](https://github.com/NousResearch/hermes-agent/issues/1034), [#1338](https://github.com/NousResearch/hermes-agent/pull/1338))
+- **API `BadRequestError` from `'dict'` object has no attribute `'strip'`** (Issue [#1071](https://github.com/NousResearch/hermes-agent/issues/1071))
+- **Slash commands requiring exact full name** — now uses prefix matching (Issue [#928](https://github.com/NousResearch/hermes-agent/issues/928), [#1320](https://github.com/NousResearch/hermes-agent/pull/1320))
+- **Gateway stops responding when terminal is closed on headless** (Issue [#1005](https://github.com/NousResearch/hermes-agent/issues/1005))
+
+---
+
+## 🧪 Testing
+
+- Cover empty cached Anthropic tool-call turns ([#1222](https://github.com/NousResearch/hermes-agent/pull/1222))
+- Fix stale CI assumptions in parser and quick-command coverage ([#1236](https://github.com/NousResearch/hermes-agent/pull/1236))
+- Fix gateway async tests without implicit event loop ([#1278](https://github.com/NousResearch/hermes-agent/pull/1278))
+- Make gateway async tests xdist-safe ([#1281](https://github.com/NousResearch/hermes-agent/pull/1281))
+- Cross-timezone naive timestamp regression for cron ([#1319](https://github.com/NousResearch/hermes-agent/pull/1319))
+- Isolate codex provider tests from local env ([#1335](https://github.com/NousResearch/hermes-agent/pull/1335))
+- Lock retry replacement semantics ([#1379](https://github.com/NousResearch/hermes-agent/pull/1379))
+- Improve error logging in session search tool — by @aydnOktay ([#1533](https://github.com/NousResearch/hermes-agent/pull/1533))
+
+---
+
+## 📚 Documentation
+
+- Comprehensive SOUL.md guide ([#1315](https://github.com/NousResearch/hermes-agent/pull/1315))
+- Voice mode documentation ([#1316](https://github.com/NousResearch/hermes-agent/pull/1316), [#1362](https://github.com/NousResearch/hermes-agent/pull/1362))
+- Provider contribution guide ([#1361](https://github.com/NousResearch/hermes-agent/pull/1361))
+- ACP and internal systems implementation guides ([#1259](https://github.com/NousResearch/hermes-agent/pull/1259))
+- Expand Docusaurus coverage across CLI, tools, skills, and skins ([#1232](https://github.com/NousResearch/hermes-agent/pull/1232))
+- Terminal backend and Windows troubleshooting ([#1297](https://github.com/NousResearch/hermes-agent/pull/1297))
+- Skills hub reference section ([#1317](https://github.com/NousResearch/hermes-agent/pull/1317))
+- Checkpoint, /rollback, and git worktrees guide ([#1493](https://github.com/NousResearch/hermes-agent/pull/1493), [#1524](https://github.com/NousResearch/hermes-agent/pull/1524))
+- CLI status bar and /usage reference ([#1523](https://github.com/NousResearch/hermes-agent/pull/1523))
+- Fallback providers + /background command docs ([#1430](https://github.com/NousResearch/hermes-agent/pull/1430))
+- Gateway service scopes docs ([#1378](https://github.com/NousResearch/hermes-agent/pull/1378))
+- Slack thread reply behavior docs ([#1407](https://github.com/NousResearch/hermes-agent/pull/1407))
+- Redesigned landing page with Nous blue palette — by @austinpickett ([#974](https://github.com/NousResearch/hermes-agent/pull/974))
+- Fix several documentation typos — by @JackTheGit ([#953](https://github.com/NousResearch/hermes-agent/pull/953))
+- Stabilize website diagrams ([#1405](https://github.com/NousResearch/hermes-agent/pull/1405))
+- CLI vs messaging quick reference in README ([#1491](https://github.com/NousResearch/hermes-agent/pull/1491))
+- Add search to Docusaurus ([#1053](https://github.com/NousResearch/hermes-agent/pull/1053))
+- Home Assistant integration docs ([#1170](https://github.com/NousResearch/hermes-agent/pull/1170))
+
+---
+
+## 👥 Contributors
+
+### Core
+- **@teknium1** — 220+ PRs spanning every area of the codebase
+
+### Top Community Contributors
+
+- **@0xbyt4** (4 PRs) — Anthropic adapter fixes (max_tokens, fallback crash, 429/529 retry), Slack file upload thread context, setup NameError fix
+- **@erosika** (1 PR) — Honcho memory integration: async writes, memory modes, session title integration
+- **@SHL0MS** (2 PRs) — ASCII video skill design patterns and refactoring
+- **@alt-glitch** (2 PRs) — Persistent shell mode for local/SSH backends, setuptools packaging fix
+- **@arceus77-7** (2 PRs) — 1Password skill, fix skills list mislabeling
+- **@kshitijk4poor** (1 PR) — OpenClaw migration during setup wizard
+- **@ASRagab** (1 PR) — Fix adaptive thinking for Claude 4.6 models
+- **@eren-karakus0** (1 PR) — Strip Hermes provider env vars from subprocess environment
+- **@mr-emmett-one** (1 PR) — Fix DeepSeek V3 parser multi-tool call support
+- **@jplew** (1 PR) — Gateway restart on retryable startup failures
+- **@brandtcormorant** (1 PR) — Fix Anthropic cache control for empty text blocks
+- **@aydnOktay** (1 PR) — Improve error logging in session search tool
+- **@austinpickett** (1 PR) — Landing page redesign with Nous blue palette
+- **@JackTheGit** (1 PR) — Documentation typo fixes
+
+### All Contributors
+
+@0xbyt4, @alt-glitch, @arceus77-7, @ASRagab, @austinpickett, @aydnOktay, @brandtcormorant, @eren-karakus0, @erosika, @JackTheGit, @jplew, @kshitijk4poor, @mr-emmett-one, @SHL0MS, @teknium1
+
+---
+
+**Full Changelog**: [v2026.3.12...v2026.3.17](https://github.com/NousResearch/hermes-agent/compare/v2026.3.12...v2026.3.17)
@@ -42,7 +42,7 @@ from acp_adapter.events import (
    make_tool_progress_cb,
 )
 from acp_adapter.permissions import make_approval_callback
-from acp_adapter.session import SessionManager
+from acp_adapter.session import SessionManager, SessionState

 logger = logging.getLogger(__name__)

@@ -226,10 +226,19 @@ class HermesACPAgent(acp.Agent):
            logger.error("prompt: session %s not found", session_id)
            return PromptResponse(stop_reason="refusal")

-        user_text = _extract_text(prompt)
-        if not user_text.strip():
+        user_text = _extract_text(prompt).strip()
+        if not user_text:
            return PromptResponse(stop_reason="end_turn")

+        # Intercept slash commands — handle locally without calling the LLM
+        if user_text.startswith("/"):
+            response_text = self._handle_slash_command(user_text, state)
+            if response_text is not None:
+                if self._conn:
+                    update = acp.update_agent_message_text(response_text)
+                    await self._conn.session_update(session_id, update)
+                return PromptResponse(stop_reason="end_turn")
+
        logger.info("Prompt on session %s: %s", session_id, user_text[:100])

        conn = self._conn
@@ -315,12 +324,149 @@ class HermesACPAgent(acp.Agent):
        stop_reason = "cancelled" if state.cancel_event and state.cancel_event.is_set() else "end_turn"
        return PromptResponse(stop_reason=stop_reason, usage=usage)

-    # ---- Model switching ----------------------------------------------------
+    # ---- Slash commands (headless) -------------------------------------------
+
+    _SLASH_COMMANDS = {
+        "help": "Show available commands",
+        "model": "Show or change current model",
+        "tools": "List available tools",
+        "context": "Show conversation context info",
+        "reset": "Clear conversation history",
+        "compact": "Compress conversation context",
+        "version": "Show Hermes version",
+    }
+
+    def _handle_slash_command(self, text: str, state: SessionState) -> str | None:
+        """Dispatch a slash command and return the response text.
+
+        Returns ``None`` for unrecognized commands so they fall through
+        to the LLM (the user may have typed ``/something`` as prose).
+        """
+        parts = text.split(maxsplit=1)
+        cmd = parts[0].lstrip("/").lower()
+        args = parts[1].strip() if len(parts) > 1 else ""
+
+        handler = {
+            "help": self._cmd_help,
+            "model": self._cmd_model,
+            "tools": self._cmd_tools,
+            "context": self._cmd_context,
+            "reset": self._cmd_reset,
+            "compact": self._cmd_compact,
+            "version": self._cmd_version,
+        }.get(cmd)
+
+        if handler is None:
+            return None  # not a known command — let the LLM handle it
+
+        try:
+            return handler(args, state)
+        except Exception as e:
+            logger.error("Slash command /%s error: %s", cmd, e, exc_info=True)
+            return f"Error executing /{cmd}: {e}"
+
+    def _cmd_help(self, args: str, state: SessionState) -> str:
+        lines = ["Available commands:", ""]
+        for cmd, desc in self._SLASH_COMMANDS.items():
+            lines.append(f"  /{cmd:10s}  {desc}")
+        lines.append("")
+        lines.append("Unrecognized /commands are sent to the model as normal messages.")
+        return "\n".join(lines)
+
+    def _cmd_model(self, args: str, state: SessionState) -> str:
+        if not args:
+            model = state.model or getattr(state.agent, "model", "unknown")
+            provider = getattr(state.agent, "provider", None) or "auto"
+            return f"Current model: {model}\nProvider: {provider}"
+
+        new_model = args.strip()
+        target_provider = None
+
+        # Auto-detect provider for the requested model
+        try:
+            from hermes_cli.models import parse_model_input, detect_provider_for_model
+            current_provider = getattr(state.agent, "provider", None) or "openrouter"
+            target_provider, new_model = parse_model_input(new_model, current_provider)
+            if target_provider == current_provider:
+                detected = detect_provider_for_model(new_model, current_provider)
+                if detected:
+                    target_provider, new_model = detected
+        except Exception:
+            logger.debug("Provider detection failed, using model as-is", exc_info=True)
+
+        state.model = new_model
+        state.agent = self.session_manager._make_agent(
+            session_id=state.session_id,
+            cwd=state.cwd,
+            model=new_model,
+        )
+        provider_label = target_provider or getattr(state.agent, "provider", "auto")
+        logger.info("Session %s: model switched to %s", state.session_id, new_model)
+        return f"Model switched to: {new_model}\nProvider: {provider_label}"
+
+    def _cmd_tools(self, args: str, state: SessionState) -> str:
+        try:
+            from model_tools import get_tool_definitions
+            toolsets = getattr(state.agent, "enabled_toolsets", None) or ["hermes-acp"]
+            tools = get_tool_definitions(enabled_toolsets=toolsets, quiet_mode=True)
+            if not tools:
+                return "No tools available."
+            lines = [f"Available tools ({len(tools)}):"]
+            for t in tools:
+                name = t.get("function", {}).get("name", "?")
+                desc = t.get("function", {}).get("description", "")
+                # Truncate long descriptions
+                if len(desc) > 80:
+                    desc = desc[:77] + "..."
+                lines.append(f"  {name}: {desc}")
+            return "\n".join(lines)
+        except Exception as e:
+            return f"Could not list tools: {e}"
+
+    def _cmd_context(self, args: str, state: SessionState) -> str:
+        n_messages = len(state.history)
+        if n_messages == 0:
+            return "Conversation is empty (no messages yet)."
+        # Count by role
+        roles: dict[str, int] = {}
+        for msg in state.history:
+            role = msg.get("role", "unknown")
+            roles[role] = roles.get(role, 0) + 1
+        lines = [
+            f"Conversation: {n_messages} messages",
+            f"  user: {roles.get('user', 0)}, assistant: {roles.get('assistant', 0)}, "
+            f"tool: {roles.get('tool', 0)}, system: {roles.get('system', 0)}",
+        ]
+        model = state.model or getattr(state.agent, "model", "")
+        if model:
+            lines.append(f"Model: {model}")
+        return "\n".join(lines)
+
+    def _cmd_reset(self, args: str, state: SessionState) -> str:
+        state.history.clear()
+        return "Conversation history cleared."
+
+    def _cmd_compact(self, args: str, state: SessionState) -> str:
+        if not state.history:
+            return "Nothing to compress — conversation is empty."
+        try:
+            agent = state.agent
+            if hasattr(agent, "compress_context"):
+                agent.compress_context(state.history)
+                return f"Context compressed. Messages: {len(state.history)}"
+            return "Context compression not available for this agent."
+        except Exception as e:
+            return f"Compression failed: {e}"
+
+    def _cmd_version(self, args: str, state: SessionState) -> str:
+        return f"Hermes Agent v{HERMES_VERSION}"
+
+    # ---- Model switching (ACP protocol method) -------------------------------

    async def set_session_model(
        self, model_id: str, session_id: str, **kwargs: Any
    ):
-        """Switch the model for a session."""
+        """Switch the model for a session (called by ACP protocol)."""
        state = self.session_manager.get_session(session_id)
        if state:
            state.model = model_id
@@ -45,14 +45,19 @@ _COMMON_BETAS = [
    "fine-grained-tool-streaming-2025-05-14",
 ]

-# Additional beta headers required for OAuth/subscription auth
-# Both clawdbot and OpenCode include claude-code-20250219 alongside oauth-2025-04-20.
-# Without claude-code-20250219, Anthropic's API rejects OAuth tokens with 401.
+# Additional beta headers required for OAuth/subscription auth.
+# Matches what Claude Code (and pi-ai / OpenCode) send.
 _OAUTH_ONLY_BETAS = [
    "claude-code-20250219",
    "oauth-2025-04-20",
 ]

+# Claude Code identity — required for OAuth requests to be routed correctly.
+# Without these, Anthropic's infrastructure intermittently 500s OAuth traffic.
+_CLAUDE_CODE_VERSION = "2.1.2"
+_CLAUDE_CODE_SYSTEM_PREFIX = "You are Claude Code, Anthropic's official CLI for Claude."
+_MCP_TOOL_PREFIX = "mcp_"
+

 def _is_oauth_token(key: str) -> bool:
    """Check if the key is an OAuth/setup token (not a regular Console API key).
@@ -88,10 +93,16 @@ def build_anthropic_client(api_key: str, base_url: str = None):
        kwargs["base_url"] = base_url

    if _is_oauth_token(api_key):
-        # OAuth access token / setup-token → Bearer auth + beta headers
+        # OAuth access token / setup-token → Bearer auth + Claude Code identity.
+        # Anthropic routes OAuth requests based on user-agent and headers;
+        # without Claude Code's fingerprint, requests get intermittent 500s.
        all_betas = _COMMON_BETAS + _OAUTH_ONLY_BETAS
        kwargs["auth_token"] = api_key
-        kwargs["default_headers"] = {"anthropic-beta": ",".join(all_betas)}
+        kwargs["default_headers"] = {
+            "anthropic-beta": ",".join(all_betas),
+            "user-agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
+            "x-app": "cli",
+        }
    else:
        # Regular API key → x-api-key header + common betas
        kwargs["api_key"] = api_key
@@ -189,7 +200,10 @@ def _refresh_oauth_token(creds: Dict[str, Any]) -> Optional[str]:
    req = urllib.request.Request(
        "https://console.anthropic.com/v1/oauth/token",
        data=data,
-        headers={"Content-Type": "application/x-www-form-urlencoded"},
+        headers={
+            "Content-Type": "application/x-www-form-urlencoded",
+            "User-Agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
+        },
        method="POST",
    )

@@ -332,12 +346,24 @@ def resolve_anthropic_token() -> Optional[str]:
            return preferred
        return cc_token

-    # 3. Claude Code credential file
+    # 3. Hermes-managed OAuth credentials (~/.hermes/.anthropic_oauth.json)
+    hermes_creds = read_hermes_oauth_credentials()
+    if hermes_creds:
+        if is_claude_code_token_valid(hermes_creds):
+            logger.debug("Using Hermes-managed OAuth credentials")
+            return hermes_creds["accessToken"]
+        # Expired — try refresh
+        logger.debug("Hermes OAuth token expired — attempting refresh")
+        refreshed = refresh_hermes_oauth_token()
+        if refreshed:
+            return refreshed
+
+    # 4. Claude Code credential file
    resolved_claude_token = _resolve_claude_code_token_from_credentials(creds)
    if resolved_claude_token:
        return resolved_claude_token

-    # 4. Regular API key, or a legacy OAuth token saved in ANTHROPIC_API_KEY.
+    # 5. Regular API key, or a legacy OAuth token saved in ANTHROPIC_API_KEY.
    # This remains as a compatibility fallback for pre-migration Hermes configs.
    api_key = os.getenv("ANTHROPIC_API_KEY", "").strip()
    if api_key:
@@ -386,6 +412,215 @@ def run_oauth_setup_token() -> Optional[str]:
    return None


+# ── Hermes-native PKCE OAuth flow ────────────────────────────────────────
+# Mirrors the flow used by Claude Code, pi-ai, and OpenCode.
+# Stores credentials in ~/.hermes/.anthropic_oauth.json (our own file).
+
+_OAUTH_CLIENT_ID = "9d1c250a-e61b-44d9-88ed-5944d1962f5e"
+_OAUTH_TOKEN_URL = "https://console.anthropic.com/v1/oauth/token"
+_OAUTH_REDIRECT_URI = "https://console.anthropic.com/oauth/code/callback"
+_OAUTH_SCOPES = "org:create_api_key user:profile user:inference"
+_HERMES_OAUTH_FILE = Path(os.getenv("HERMES_HOME", str(Path.home() / ".hermes"))) / ".anthropic_oauth.json"
+
+
+def _generate_pkce() -> tuple:
+    """Generate PKCE code_verifier and code_challenge (S256)."""
+    import base64
+    import hashlib
+    import secrets
+
+    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
+    challenge = base64.urlsafe_b64encode(
+        hashlib.sha256(verifier.encode()).digest()
+    ).rstrip(b"=").decode()
+    return verifier, challenge
+
+
+def run_hermes_oauth_login() -> Optional[str]:
+    """Run Hermes-native OAuth PKCE flow for Claude Pro/Max subscription.
+
+    Opens a browser to claude.ai for authorization, prompts for the code,
+    exchanges it for tokens, and stores them in ~/.hermes/.anthropic_oauth.json.
+
+    Returns the access token on success, None on failure.
+    """
+    import time
+    import webbrowser
+
+    verifier, challenge = _generate_pkce()
+
+    # Build authorization URL
+    params = {
+        "code": "true",
+        "client_id": _OAUTH_CLIENT_ID,
+        "response_type": "code",
+        "redirect_uri": _OAUTH_REDIRECT_URI,
+        "scope": _OAUTH_SCOPES,
+        "code_challenge": challenge,
+        "code_challenge_method": "S256",
+        "state": verifier,
+    }
+    from urllib.parse import urlencode
+    auth_url = f"https://claude.ai/oauth/authorize?{urlencode(params)}"
+
+    print()
+    print("Authorize Hermes with your Claude Pro/Max subscription.")
+    print()
+    print("╭─ Claude Pro/Max Authorization ────────────────────╮")
+    print("│                                                   │")
+    print("│  Open this link in your browser:                  │")
+    print("╰───────────────────────────────────────────────────╯")
+    print()
+    print(f"  {auth_url}")
+    print()
+
+    # Try to open browser automatically (works on desktop, silently fails on headless/SSH)
+    try:
+        webbrowser.open(auth_url)
+        print("  (Browser opened automatically)")
+    except Exception:
+        pass
+
+    print()
+    print("After authorizing, you'll see a code. Paste it below.")
+    print()
+    try:
+        auth_code = input("Authorization code: ").strip()
+    except (KeyboardInterrupt, EOFError):
+        return None
+
+    if not auth_code:
+        print("No code entered.")
+        return None
+
+    # Split code#state format
+    splits = auth_code.split("#")
+    code = splits[0]
+    state = splits[1] if len(splits) > 1 else ""
+
+    # Exchange code for tokens
+    try:
+        import urllib.request
+        exchange_data = json.dumps({
+            "grant_type": "authorization_code",
+            "client_id": _OAUTH_CLIENT_ID,
+            "code": code,
+            "state": state,
+            "redirect_uri": _OAUTH_REDIRECT_URI,
+            "code_verifier": verifier,
+        }).encode()
+
+        req = urllib.request.Request(
+            _OAUTH_TOKEN_URL,
+            data=exchange_data,
+            headers={
+                "Content-Type": "application/json",
+                "User-Agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
+            },
+            method="POST",
+        )
+
+        with urllib.request.urlopen(req, timeout=15) as resp:
+            result = json.loads(resp.read().decode())
+    except Exception as e:
+        print(f"Token exchange failed: {e}")
+        return None
+
+    access_token = result.get("access_token", "")
+    refresh_token = result.get("refresh_token", "")
+    expires_in = result.get("expires_in", 3600)
+
+    if not access_token:
+        print("No access token in response.")
+        return None
+
+    # Store credentials
+    expires_at_ms = int(time.time() * 1000) + (expires_in * 1000)
+    _save_hermes_oauth_credentials(access_token, refresh_token, expires_at_ms)
+
+    # Also write to Claude Code's credential file for backward compat
+    _write_claude_code_credentials(access_token, refresh_token, expires_at_ms)
+
+    print("Authentication successful!")
+    return access_token
+
+
+def _save_hermes_oauth_credentials(access_token: str, refresh_token: str, expires_at_ms: int) -> None:
+    """Save OAuth credentials to ~/.hermes/.anthropic_oauth.json."""
+    data = {
+        "accessToken": access_token,
+        "refreshToken": refresh_token,
+        "expiresAt": expires_at_ms,
+    }
+    try:
+        _HERMES_OAUTH_FILE.parent.mkdir(parents=True, exist_ok=True)
+        _HERMES_OAUTH_FILE.write_text(json.dumps(data, indent=2), encoding="utf-8")
+        _HERMES_OAUTH_FILE.chmod(0o600)
+    except (OSError, IOError) as e:
+        logger.debug("Failed to save Hermes OAuth credentials: %s", e)
+
+
+def read_hermes_oauth_credentials() -> Optional[Dict[str, Any]]:
+    """Read Hermes-managed OAuth credentials from ~/.hermes/.anthropic_oauth.json."""
+    if _HERMES_OAUTH_FILE.exists():
+        try:
+            data = json.loads(_HERMES_OAUTH_FILE.read_text(encoding="utf-8"))
+            if data.get("accessToken"):
+                return data
+        except (json.JSONDecodeError, OSError, IOError) as e:
+            logger.debug("Failed to read Hermes OAuth credentials: %s", e)
+    return None
+
+
+def refresh_hermes_oauth_token() -> Optional[str]:
+    """Refresh the Hermes-managed OAuth token using the stored refresh token.
+
+    Returns the new access token, or None if refresh fails.
+    """
+    import time
+    import urllib.request
+
+    creds = read_hermes_oauth_credentials()
+    if not creds or not creds.get("refreshToken"):
+        return None
+
+    try:
+        data = json.dumps({
+            "grant_type": "refresh_token",
+            "refresh_token": creds["refreshToken"],
+            "client_id": _OAUTH_CLIENT_ID,
+        }).encode()
+
+        req = urllib.request.Request(
+            _OAUTH_TOKEN_URL,
+            data=data,
+            headers={
+                "Content-Type": "application/json",
+                "User-Agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
+            },
+            method="POST",
+        )
+
+        with urllib.request.urlopen(req, timeout=10) as resp:
+            result = json.loads(resp.read().decode())
+
+        new_access = result.get("access_token", "")
+        new_refresh = result.get("refresh_token", creds["refreshToken"])
+        expires_in = result.get("expires_in", 3600)
+
+        if new_access:
+            new_expires_ms = int(time.time() * 1000) + (expires_in * 1000)
+            _save_hermes_oauth_credentials(new_access, new_refresh, new_expires_ms)
+            # Also update Claude Code's credential file
+            _write_claude_code_credentials(new_access, new_refresh, new_expires_ms)
+            logger.debug("Successfully refreshed Hermes OAuth token")
+            return new_access
+    except Exception as e:
+        logger.debug("Failed to refresh Hermes OAuth token: %s", e)
+
+    return None
+
+
 # ---------------------------------------------------------------------------
 # Message / tool / response format conversion
 # ---------------------------------------------------------------------------
@@ -714,14 +949,59 @@ def build_anthropic_kwargs(
    max_tokens: Optional[int],
    reasoning_config: Optional[Dict[str, Any]],
    tool_choice: Optional[str] = None,
+    is_oauth: bool = False,
 ) -> Dict[str, Any]:
-    """Build kwargs for anthropic.messages.create()."""
+    """Build kwargs for anthropic.messages.create().
+
+    When *is_oauth* is True, applies Claude Code compatibility transforms:
+    system prompt prefix, tool name prefixing, and prompt sanitization.
+    """
    system, anthropic_messages = convert_messages_to_anthropic(messages)
    anthropic_tools = convert_tools_to_anthropic(tools) if tools else []

    model = normalize_model_name(model)
    effective_max_tokens = max_tokens or 16384

+    # ── OAuth: Claude Code identity ──────────────────────────────────
+    if is_oauth:
+        # 1. Prepend Claude Code system prompt identity
+        cc_block = {"type": "text", "text": _CLAUDE_CODE_SYSTEM_PREFIX}
+        if isinstance(system, list):
+            system = [cc_block] + system
+        elif isinstance(system, str) and system:
+            system = [cc_block, {"type": "text", "text": system}]
+        else:
+            system = [cc_block]
+
+        # 2. Sanitize system prompt — replace product name references
+        #    to avoid Anthropic's server-side content filters.
+        for block in system:
+            if isinstance(block, dict) and block.get("type") == "text":
+                text = block.get("text", "")
+                text = text.replace("Hermes Agent", "Claude Code")
+                text = text.replace("Hermes agent", "Claude Code")
+                text = text.replace("hermes-agent", "claude-code")
+                text = text.replace("Nous Research", "Anthropic")
+                block["text"] = text
+
+        # 3. Prefix tool names with mcp_ (Claude Code convention)
+        if anthropic_tools:
+            for tool in anthropic_tools:
+                if "name" in tool:
+                    tool["name"] = _MCP_TOOL_PREFIX + tool["name"]
+
+        # 4. Prefix tool names in message history (tool_use and tool_result blocks)
+        for msg in anthropic_messages:
+            content = msg.get("content")
+            if isinstance(content, list):
+                for block in content:
+                    if isinstance(block, dict):
+                        if block.get("type") == "tool_use" and "name" in block:
+                            if not block["name"].startswith(_MCP_TOOL_PREFIX):
+                                block["name"] = _MCP_TOOL_PREFIX + block["name"]
+                        elif block.get("type") == "tool_result" and "tool_use_id" in block:
+                            pass  # tool_result uses ID, not name
+
    kwargs: Dict[str, Any] = {
        "model": model,
        "messages": anthropic_messages,
@@ -768,11 +1048,15 @@ def build_anthropic_kwargs(

 def normalize_anthropic_response(
    response,
+    strip_tool_prefix: bool = False,
 ) -> Tuple[SimpleNamespace, str]:
    """Normalize Anthropic response to match the shape expected by AIAgent.

    Returns (assistant_message, finish_reason) where assistant_message has
    .content, .tool_calls, and .reasoning attributes.
+
+    When *strip_tool_prefix* is True, removes the ``mcp_`` prefix that was
+    added to tool names for OAuth Claude Code compatibility.
    """
    text_parts = []
    reasoning_parts = []
@@ -784,12 +1068,15 @@ def normalize_anthropic_response(
        elif block.type == "thinking":
            reasoning_parts.append(block.thinking)
        elif block.type == "tool_use":
+            name = block.name
+            if strip_tool_prefix and name.startswith(_MCP_TOOL_PREFIX):
+                name = name[len(_MCP_TOOL_PREFIX):]
            tool_calls.append(
                SimpleNamespace(
                    id=block.id,
                    type="function",
                    function=SimpleNamespace(
-                        name=block.name,
+                        name=name,
                        arguments=json.dumps(block.input),
                    ),
                )
@@ -57,6 +57,7 @@ _API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
    "minimax": "MiniMax-M2.5-highspeed",
    "minimax-cn": "MiniMax-M2.5-highspeed",
    "anthropic": "claude-haiku-4-5-20251001",
+    "ai-gateway": "google/gemini-3-flash",
 }

 # OpenRouter app attribution headers
@@ -20,65 +20,16 @@ import json
 import time
 from collections import Counter, defaultdict
 from datetime import datetime
-from typing import Any, Dict, List, Optional
+from typing import Any, Dict, List

-# =========================================================================
-# Model pricing (USD per million tokens) — approximate as of early 2026
-# =========================================================================
-MODEL_PRICING = {
-    # OpenAI
-    "gpt-4o": {"input": 2.50, "output": 10.00},
-    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
-    "gpt-4.1": {"input": 2.00, "output": 8.00},
-    "gpt-4.1-mini": {"input": 0.40, "output": 1.60},
-    "gpt-4.1-nano": {"input": 0.10, "output": 0.40},
-    "gpt-4.5-preview": {"input": 75.00, "output": 150.00},
-    "gpt-5": {"input": 10.00, "output": 30.00},
-    "gpt-5.4": {"input": 10.00, "output": 30.00},
-    "o3": {"input": 10.00, "output": 40.00},
-    "o3-mini": {"input": 1.10, "output": 4.40},
-    "o4-mini": {"input": 1.10, "output": 4.40},
-    # Anthropic
-    "claude-opus-4-20250514": {"input": 15.00, "output": 75.00},
-    "claude-sonnet-4-20250514": {"input": 3.00, "output": 15.00},
-    "claude-3-5-sonnet-20241022": {"input": 3.00, "output": 15.00},
-    "claude-3-5-haiku-20241022": {"input": 0.80, "output": 4.00},
-    "claude-3-opus-20240229": {"input": 15.00, "output": 75.00},
-    "claude-3-haiku-20240307": {"input": 0.25, "output": 1.25},
-    # DeepSeek
-    "deepseek-chat": {"input": 0.14, "output": 0.28},
-    "deepseek-reasoner": {"input": 0.55, "output": 2.19},
-    # Google
-    "gemini-2.5-pro": {"input": 1.25, "output": 10.00},
-    "gemini-2.5-flash": {"input": 0.15, "output": 0.60},
-    "gemini-2.0-flash": {"input": 0.10, "output": 0.40},
-    # Meta (via providers)
-    "llama-4-maverick": {"input": 0.50, "output": 0.70},
-    "llama-4-scout": {"input": 0.20, "output": 0.30},
-    # Z.AI / GLM (direct provider — pricing not published externally, treat as local)
-    "glm-5": {"input": 0.0, "output": 0.0},
-    "glm-4.7": {"input": 0.0, "output": 0.0},
-    "glm-4.5": {"input": 0.0, "output": 0.0},
-    "glm-4.5-flash": {"input": 0.0, "output": 0.0},
-    # Kimi / Moonshot (direct provider — pricing not published externally, treat as local)
-    "kimi-k2.5": {"input": 0.0, "output": 0.0},
-    "kimi-k2-thinking": {"input": 0.0, "output": 0.0},
-    "kimi-k2-turbo-preview": {"input": 0.0, "output": 0.0},
-    "kimi-k2-0905-preview": {"input": 0.0, "output": 0.0},
-    # MiniMax (direct provider — pricing not published externally, treat as local)
-    "MiniMax-M2.5": {"input": 0.0, "output": 0.0},
-    "MiniMax-M2.5-highspeed": {"input": 0.0, "output": 0.0},
-    "MiniMax-M2.1": {"input": 0.0, "output": 0.0},
-}
+from agent.usage_pricing import DEFAULT_PRICING, estimate_cost_usd, format_duration_compact, get_pricing, has_known_pricing

-# Fallback: unknown/custom models get zero cost (we can't assume pricing
-# for self-hosted models, custom OAI endpoints, local inference, etc.)
-_DEFAULT_PRICING = {"input": 0.0, "output": 0.0}
+_DEFAULT_PRICING = DEFAULT_PRICING


 def _has_known_pricing(model_name: str) -> bool:
    """Check if a model has known pricing (vs unknown/custom endpoint)."""
-    return _get_pricing(model_name) is not _DEFAULT_PRICING
+    return has_known_pricing(model_name)


 def _get_pricing(model_name: str) -> Dict[str, float]:
@@ -87,67 +38,17 @@ def _get_pricing(model_name: str) -> Dict[str, float]:
    Returns _DEFAULT_PRICING (zero cost) for unknown/custom models —
    we can't assume costs for self-hosted endpoints, local inference, etc.
    """
-    if not model_name:
-        return _DEFAULT_PRICING
-
-    # Strip provider prefix (e.g., "anthropic/claude-..." -> "claude-...")
-    bare = model_name.split("/")[-1].lower()
-
-    # Exact match first
-    if bare in MODEL_PRICING:
-        return MODEL_PRICING[bare]
-
-    # Fuzzy prefix match — prefer the LONGEST matching key to avoid
-    # e.g. "gpt-4o" matching before "gpt-4o-mini" for "gpt-4o-mini-2024-07-18"
-    best_match = None
-    best_len = 0
-    for key, price in MODEL_PRICING.items():
-        if bare.startswith(key) and len(key) > best_len:
-            best_match = price
-            best_len = len(key)
-    if best_match:
-        return best_match
-
-    # Keyword heuristics (checked in most-specific-first order)
-    if "opus" in bare:
-        return {"input": 15.00, "output": 75.00}
-    if "sonnet" in bare:
-        return {"input": 3.00, "output": 15.00}
-    if "haiku" in bare:
-        return {"input": 0.80, "output": 4.00}
-    if "gpt-4o-mini" in bare:
-        return {"input": 0.15, "output": 0.60}
-    if "gpt-4o" in bare:
-        return {"input": 2.50, "output": 10.00}
-    if "gpt-5" in bare:
-        return {"input": 10.00, "output": 30.00}
-    if "deepseek" in bare:
-        return {"input": 0.14, "output": 0.28}
-    if "gemini" in bare:
-        return {"input": 0.15, "output": 0.60}
-
-    return _DEFAULT_PRICING
+    return get_pricing(model_name)


 def _estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost for a given model and token counts."""
-    pricing = _get_pricing(model)
-    return (input_tokens * pricing["input"] + output_tokens * pricing["output"]) / 1_000_000
+    return estimate_cost_usd(model, input_tokens, output_tokens)


 def _format_duration(seconds: float) -> str:
    """Format seconds into a human-readable duration string."""
-    if seconds < 60:
-        return f"{seconds:.0f}s"
-    minutes = seconds / 60
-    if minutes < 60:
-        return f"{minutes:.0f}m"
-    hours = minutes / 60
-    if hours < 24:
-        remaining_min = int(minutes % 60)
-        return f"{int(hours)}h {remaining_min}m" if remaining_min else f"{int(hours)}h"
-    days = hours / 24
-    return f"{days:.1f}d"
+    return format_duration_compact(seconds)


 def _bar_chart(values: List[int], max_width: int = 20) -> List[str]:
@@ -40,6 +40,8 @@ DEFAULT_CONTEXT_LENGTHS = {
    "anthropic/claude-opus-4.6": 200000,
    "anthropic/claude-sonnet-4": 200000,
    "anthropic/claude-sonnet-4-20250514": 200000,
+    "anthropic/claude-sonnet-4.5": 200000,
+    "anthropic/claude-sonnet-4.6": 200000,
    "anthropic/claude-haiku-4.5": 200000,
    # Bare Anthropic model IDs (for native API provider)
    "claude-opus-4-6": 200000,
@@ -50,11 +52,18 @@ DEFAULT_CONTEXT_LENGTHS = {
    "claude-opus-4-20250514": 200000,
    "claude-sonnet-4-20250514": 200000,
    "claude-haiku-4-5-20251001": 200000,
+    "openai/gpt-5": 128000,
+    "openai/gpt-4.1": 1047576,
+    "openai/gpt-4.1-mini": 1047576,
    "openai/gpt-4o": 128000,
    "openai/gpt-4-turbo": 128000,
    "openai/gpt-4o-mini": 128000,
+    "google/gemini-3-pro-preview": 1048576,
+    "google/gemini-3-flash": 1048576,
+    "google/gemini-2.5-flash": 1048576,
    "google/gemini-2.0-flash": 1048576,
    "google/gemini-2.5-pro": 1048576,
+    "deepseek/deepseek-v3.2": 65536,
    "meta-llama/llama-3.3-70b-instruct": 131072,
    "deepseek/deepseek-chat-v3": 65536,
    "qwen/qwen-2.5-72b-instruct": 32768,
@@ -73,9 +73,15 @@ DEFAULT_AGENT_IDENTITY = (
 MEMORY_GUIDANCE = (
    "You have persistent memory across sessions. Save durable facts using the memory "
    "tool: user preferences, environment details, tool quirks, and stable conventions. "
-    "Memory is injected into every turn, so keep it compact. Do NOT save task progress, "
-    "session outcomes, or completed-work logs to memory; use session_search to recall "
-    "those from past transcripts."
+    "Memory is injected into every turn, so keep it compact and focused on facts that "
+    "will still matter later.\n"
+    "Prioritize what reduces future user steering — the most valuable memory is one "
+    "that prevents the user from having to correct or remind you again. "
+    "User preferences and recurring corrections matter more than procedural task details.\n"
+    "Do NOT save task progress, session outcomes, completed-work logs, or temporary TODO "
+    "state to memory; use session_search to recall those from past transcripts. "
+    "If you've discovered a new way to do something, solved a problem that could be "
+    "necessary later, save it as a skill with the skill tool."
 )

 SESSION_SEARCH_GUIDANCE = (
@@ -86,8 +92,11 @@ SESSION_SEARCH_GUIDANCE = (

 SKILLS_GUIDANCE = (
    "After completing a complex task (5+ tool calls), fixing a tricky error, "
-    "or discovering a non-trivial workflow, consider saving the approach as a "
-    "skill with skill_manage so you can reuse it next time."
+    "or discovering a non-trivial workflow, save the approach as a "
+    "skill with skill_manage so you can reuse it next time.\n"
+    "When using a skill and finding it outdated, incomplete, or wrong, "
+    "patch it immediately with skill_manage(action='patch') — don't wait to be asked. "
+    "Skills that aren't maintained become liabilities."
 )

 PLATFORM_HINTS = {
@@ -326,6 +335,9 @@ def build_skills_system_prompt(
        "Before replying, scan the skills below. If one clearly matches your task, "
        "load it with skill_view(name) and follow its instructions. "
        "If a skill has issues, fix it with skill_manage(action='patch').\n"
+        "After difficult/iterative tasks, offer to save as a skill. "
+        "If a skill you loaded was missing steps, had wrong commands, or needed "
+        "pitfalls you discovered, update it before finishing.\n"
        "\n"
        "<available_skills>\n"
        + "\n".join(index_lines) + "\n"
@@ -0,0 +1,184 @@
+"""Helpers for optional cheap-vs-strong model routing."""
+
+from __future__ import annotations
+
+import os
+import re
+from typing import Any, Dict, Optional
+
+_COMPLEX_KEYWORDS = {
+    "debug",
+    "debugging",
+    "implement",
+    "implementation",
+    "refactor",
+    "patch",
+    "traceback",
+    "stacktrace",
+    "exception",
+    "error",
+    "analyze",
+    "analysis",
+    "investigate",
+    "architecture",
+    "design",
+    "compare",
+    "benchmark",
+    "optimize",
+    "optimise",
+    "review",
+    "terminal",
+    "shell",
+    "tool",
+    "tools",
+    "pytest",
+    "test",
+    "tests",
+    "plan",
+    "planning",
+    "delegate",
+    "subagent",
+    "cron",
+    "docker",
+    "kubernetes",
+}
+
+_URL_RE = re.compile(r"https?://|www\.", re.IGNORECASE)
+
+
+def _coerce_bool(value: Any, default: bool = False) -> bool:
+    if value is None:
+        return default
+    if isinstance(value, bool):
+        return value
+    if isinstance(value, str):
+        return value.strip().lower() in {"1", "true", "yes", "on"}
+    return bool(value)
+
+
+def _coerce_int(value: Any, default: int) -> int:
+    try:
+        return int(value)
+    except (TypeError, ValueError):
+        return default
+
+
+def choose_cheap_model_route(user_message: str, routing_config: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
+    """Return the configured cheap-model route when a message looks simple.
+
+    Conservative by design: if the message has signs of code/tool/debugging/
+    long-form work, keep the primary model.
+    """
+    cfg = routing_config or {}
+    if not _coerce_bool(cfg.get("enabled"), False):
+        return None
+
+    cheap_model = cfg.get("cheap_model") or {}
+    if not isinstance(cheap_model, dict):
+        return None
+    provider = str(cheap_model.get("provider") or "").strip().lower()
+    model = str(cheap_model.get("model") or "").strip()
+    if not provider or not model:
+        return None
+
+    text = (user_message or "").strip()
+    if not text:
+        return None
+
+    max_chars = _coerce_int(cfg.get("max_simple_chars"), 160)
+    max_words = _coerce_int(cfg.get("max_simple_words"), 28)
+
+    if len(text) > max_chars:
+        return None
+    if len(text.split()) > max_words:
+        return None
+    if text.count("\n") > 1:
+        return None
+    if "```" in text or "`" in text:
+        return None
+    if _URL_RE.search(text):
+        return None
+
+    lowered = text.lower()
+    words = {token.strip(".,:;!?()[]{}\"'`") for token in lowered.split()}
+    if words & _COMPLEX_KEYWORDS:
+        return None
+
+    route = dict(cheap_model)
+    route["provider"] = provider
+    route["model"] = model
+    route["routing_reason"] = "simple_turn"
+    return route
+
+
+def resolve_turn_route(user_message: str, routing_config: Optional[Dict[str, Any]], primary: Dict[str, Any]) -> Dict[str, Any]:
+    """Resolve the effective model/runtime for one turn.
+
+    Returns a dict with model/runtime/signature/label fields.
+    """
+    route = choose_cheap_model_route(user_message, routing_config)
+    if not route:
+        return {
+            "model": primary.get("model"),
+            "runtime": {
+                "api_key": primary.get("api_key"),
+                "base_url": primary.get("base_url"),
+                "provider": primary.get("provider"),
+                "api_mode": primary.get("api_mode"),
+            },
+            "label": None,
+            "signature": (
+                primary.get("model"),
+                primary.get("provider"),
+                primary.get("base_url"),
+                primary.get("api_mode"),
+            ),
+        }
+
+    from hermes_cli.runtime_provider import resolve_runtime_provider
+
+    explicit_api_key = None
+    api_key_env = str(route.get("api_key_env") or "").strip()
+    if api_key_env:
+        explicit_api_key = os.getenv(api_key_env) or None
+
+    try:
+        runtime = resolve_runtime_provider(
+            requested=route.get("provider"),
+            explicit_api_key=explicit_api_key,
+            explicit_base_url=route.get("base_url"),
+        )
+    except Exception:
+        return {
+            "model": primary.get("model"),
+            "runtime": {
+                "api_key": primary.get("api_key"),
+                "base_url": primary.get("base_url"),
+                "provider": primary.get("provider"),
+                "api_mode": primary.get("api_mode"),
+            },
+            "label": None,
+            "signature": (
+                primary.get("model"),
+                primary.get("provider"),
+                primary.get("base_url"),
+                primary.get("api_mode"),
+            ),
+        }
+
+    return {
+        "model": route.get("model"),
+        "runtime": {
+            "api_key": runtime.get("api_key"),
+            "base_url": runtime.get("base_url"),
+            "provider": runtime.get("provider"),
+            "api_mode": runtime.get("api_mode"),
+        },
+        "label": f"smart route → {route.get('model')} ({runtime.get('provider')})",
+        "signature": (
+            route.get("model"),
+            runtime.get("provider"),
+            runtime.get("base_url"),
+            runtime.get("api_mode"),
+        ),
+    }
@@ -0,0 +1,134 @@
+from __future__ import annotations
+
+from decimal import Decimal
+from typing import Dict
+
+
+MODEL_PRICING = {
+    "gpt-4o": {"input": 2.50, "output": 10.00},
+    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
+    "gpt-4.1": {"input": 2.00, "output": 8.00},
+    "gpt-4.1-mini": {"input": 0.40, "output": 1.60},
+    "gpt-4.1-nano": {"input": 0.10, "output": 0.40},
+    "gpt-4.5-preview": {"input": 75.00, "output": 150.00},
+    "gpt-5": {"input": 10.00, "output": 30.00},
+    "gpt-5.4": {"input": 10.00, "output": 30.00},
+    "o3": {"input": 10.00, "output": 40.00},
+    "o3-mini": {"input": 1.10, "output": 4.40},
+    "o4-mini": {"input": 1.10, "output": 4.40},
+    "claude-opus-4-20250514": {"input": 15.00, "output": 75.00},
+    "claude-sonnet-4-20250514": {"input": 3.00, "output": 15.00},
+    "claude-3-5-sonnet-20241022": {"input": 3.00, "output": 15.00},
+    "claude-3-5-haiku-20241022": {"input": 0.80, "output": 4.00},
+    "claude-3-opus-20240229": {"input": 15.00, "output": 75.00},
+    "claude-3-haiku-20240307": {"input": 0.25, "output": 1.25},
+    "deepseek-chat": {"input": 0.14, "output": 0.28},
+    "deepseek-reasoner": {"input": 0.55, "output": 2.19},
+    "gemini-2.5-pro": {"input": 1.25, "output": 10.00},
+    "gemini-2.5-flash": {"input": 0.15, "output": 0.60},
+    "gemini-2.0-flash": {"input": 0.10, "output": 0.40},
+    "llama-4-maverick": {"input": 0.50, "output": 0.70},
+    "llama-4-scout": {"input": 0.20, "output": 0.30},
+    "glm-5": {"input": 0.0, "output": 0.0},
+    "glm-4.7": {"input": 0.0, "output": 0.0},
+    "glm-4.5": {"input": 0.0, "output": 0.0},
+    "glm-4.5-flash": {"input": 0.0, "output": 0.0},
+    "kimi-k2.5": {"input": 0.0, "output": 0.0},
+    "kimi-k2-thinking": {"input": 0.0, "output": 0.0},
+    "kimi-k2-turbo-preview": {"input": 0.0, "output": 0.0},
+    "kimi-k2-0905-preview": {"input": 0.0, "output": 0.0},
+    "MiniMax-M2.5": {"input": 0.0, "output": 0.0},
+    "MiniMax-M2.5-highspeed": {"input": 0.0, "output": 0.0},
+    "MiniMax-M2.1": {"input": 0.0, "output": 0.0},
+}
+
+DEFAULT_PRICING = {"input": 0.0, "output": 0.0}
+
+
+def get_pricing(model_name: str) -> Dict[str, float]:
+    if not model_name:
+        return DEFAULT_PRICING
+
+    bare = model_name.split("/")[-1].lower()
+    if bare in MODEL_PRICING:
+        return MODEL_PRICING[bare]
+
+    best_match = None
+    best_len = 0
+    for key, price in MODEL_PRICING.items():
+        if bare.startswith(key) and len(key) > best_len:
+            best_match = price
+            best_len = len(key)
+    if best_match:
+        return best_match
+
+    if "opus" in bare:
+        return {"input": 15.00, "output": 75.00}
+    if "sonnet" in bare:
+        return {"input": 3.00, "output": 15.00}
+    if "haiku" in bare:
+        return {"input": 0.80, "output": 4.00}
+    if "gpt-4o-mini" in bare:
+        return {"input": 0.15, "output": 0.60}
+    if "gpt-4o" in bare:
+        return {"input": 2.50, "output": 10.00}
+    if "gpt-5" in bare:
+        return {"input": 10.00, "output": 30.00}
+    if "deepseek" in bare:
+        return {"input": 0.14, "output": 0.28}
+    if "gemini" in bare:
+        return {"input": 0.15, "output": 0.60}
+
+    return DEFAULT_PRICING
+
+
+def has_known_pricing(model_name: str) -> bool:
+    pricing = get_pricing(model_name)
+    return pricing is not DEFAULT_PRICING and any(
+        float(value) > 0 for value in pricing.values()
+    )
+
+
+def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
+    pricing = get_pricing(model)
+    total = (
+        Decimal(input_tokens) * Decimal(str(pricing["input"]))
+        + Decimal(output_tokens) * Decimal(str(pricing["output"]))
+    ) / Decimal("1000000")
+    return float(total)
+
+
+def format_duration_compact(seconds: float) -> str:
+    if seconds < 60:
+        return f"{seconds:.0f}s"
+    minutes = seconds / 60
+    if minutes < 60:
+        return f"{minutes:.0f}m"
+    hours = minutes / 60
+    if hours < 24:
+        remaining_min = int(minutes % 60)
+        return f"{int(hours)}h {remaining_min}m" if remaining_min else f"{int(hours)}h"
+    days = hours / 24
+    return f"{days:.1f}d"
+
+
+def format_token_count_compact(value: int) -> str:
+    abs_value = abs(int(value))
+    if abs_value < 1_000:
+        return str(int(value))
+
+    sign = "-" if value < 0 else ""
+    units = ((1_000_000_000, "B"), (1_000_000, "M"), (1_000, "K"))
+    for threshold, suffix in units:
+        if abs_value >= threshold:
+            scaled = abs_value / threshold
+            if scaled < 10:
+                text = f"{scaled:.2f}"
+            elif scaled < 100:
+                text = f"{scaled:.1f}"
+            else:
+                text = f"{scaled:.0f}"
+            text = text.rstrip("0").rstrip(".")
+            return f"{sign}{text}{suffix}"
+
+    return f"{value:,}"
@@ -51,6 +51,20 @@ model:
 #   # Data policy: "allow" (default) or "deny" to exclude providers that may store data
 #   # data_collection: "deny"

+# =============================================================================
+# Smart Model Routing (optional)
+# =============================================================================
+# Use a cheaper model for short/simple turns while keeping your main model for
+# more complex requests. Disabled by default.
+#
+# smart_model_routing:
+#   enabled: true
+#   max_simple_chars: 160
+#   max_simple_words: 28
+#   cheap_model:
+#     provider: openrouter
+#     model: google/gemini-2.5-flash
+
 # =============================================================================
 # Git Worktree Isolation
 # =============================================================================
@@ -76,8 +90,9 @@ model:
 #   - Messaging (Telegram/Discord): Uses MESSAGING_CWD from .env (default: home)
 terminal:
  backend: "local"
-  cwd: "."  # For local backend: "." = current directory. Ignored for remote backends.
+  cwd: "."  # For local backend: "." = current directory. Ignored for remote backends unless a backend documents otherwise.
  timeout: 180
+  docker_mount_cwd_to_workspace: false  # SECURITY: off by default. Opt in to mount the launch cwd into Docker /workspace.
  lifetime_seconds: 300
  # sudo_password: ""  # Enable sudo commands (pipes via sudo -S) - SECURITY WARNING: plaintext!

@@ -107,6 +122,7 @@ terminal:
 #   timeout: 180
 #   lifetime_seconds: 300
 #   docker_image: "nikolaik/python-nodejs:python3.11-nodejs20"
+#   docker_mount_cwd_to_workspace: true   # Explicit opt-in: mount your launch cwd into /workspace

 # -----------------------------------------------------------------------------
 # OPTION 4: Singularity/Apptainer container
@@ -333,6 +349,25 @@ session_reset:
  idle_minutes: 1440   # Inactivity timeout in minutes (default: 1440 = 24 hours)
  at_hour: 4           # Daily reset hour, 0-23 local time (default: 4 AM)

+# When true, group/channel chats use one session per participant when the platform
+# provides a user ID. This is the secure default and prevents users in the same
+# room from sharing context, interrupts, and token costs. Set false only if you
+# explicitly want one shared "room brain" per group/channel.
+group_sessions_per_user: true
+
+# ─────────────────────────────────────────────────────────────────────────────
+# Gateway Streaming
+# ─────────────────────────────────────────────────────────────────────────────
+# Stream tokens to messaging platforms in real-time. The bot sends a message
+# on first token, then progressively edits it as more tokens arrive.
+# Disabled by default — enable to try the streaming UX on Telegram/Discord/Slack.
+streaming:
+  enabled: false
+  # transport: edit           # "edit" = progressive editMessageText
+  # edit_interval: 0.3        # seconds between message edits
+  # buffer_threshold: 40      # chars before forcing an edit flush
+  # cursor: " ▉"              # cursor shown during streaming
+
 # =============================================================================
 # Skills Configuration
 # =============================================================================
@@ -694,6 +729,12 @@ display:
  # Toggle at runtime with /reasoning show or /reasoning hide.
  show_reasoning: false

+  # Stream tokens to the terminal as they arrive instead of waiting for the
+  # full response. The response box opens on first token and text appears
+  # line-by-line. Tool calls are still captured silently.
+  # Disabled by default — enable to try the streaming UX.
+  streaming: false
+
  # ───────────────────────────────────────────────────────────────────────────
  # Skin / Theme
  # ───────────────────────────────────────────────────────────────────────────
@@ -734,3 +775,14 @@ display:
  #   tool_prefix: "╎"                       # Tool output line prefix (default: ┊)
  #
  skin: default
+
+# =============================================================================
+# Privacy
+# =============================================================================
+# privacy:
+#   # Redact PII from the LLM context prompt.
+#   # When true, phone numbers are stripped and user/chat IDs are replaced
+#   # with deterministic hashes before being sent to the model.
+#   # Names and usernames are NOT affected (user-chosen, publicly visible).
+#   # Routing/delivery still uses the original values internally.
+#   redact_pii: false
@@ -6,6 +6,7 @@ Output is saved to ~/.hermes/cron/output/{job_id}/{timestamp}.md
 """

 import json
+import logging
 import tempfile
 import os
 import re
@@ -14,6 +15,8 @@ from datetime import datetime, timedelta
 from pathlib import Path
 from typing import Optional, Dict, List, Any

+logger = logging.getLogger(__name__)
+
 from hermes_time import now as _hermes_now

 try:
@@ -528,10 +531,18 @@ def mark_job_run(job_id: str, success: bool, error: Optional[str] = None):


 def get_due_jobs() -> List[Dict[str, Any]]:
-    """Get all jobs that are due to run now."""
+    """Get all jobs that are due to run now.
+
+    For recurring jobs (cron/interval), if the scheduled time is stale
+    (more than one period in the past, e.g. because the gateway was down),
+    the job is fast-forwarded to the next future run instead of firing
+    immediately.  This prevents a burst of missed jobs on gateway restart.
+    """
    now = _hermes_now()
    jobs = [_apply_skill_fields(j) for j in load_jobs()]
+    raw_jobs = load_jobs()  # For saving updates
    due = []
+    needs_save = False

    for job in jobs:
        if not job.get("enabled", True):
@@ -543,8 +554,37 @@ def get_due_jobs() -> List[Dict[str, Any]]:

        next_run_dt = _ensure_aware(datetime.fromisoformat(next_run))
        if next_run_dt <= now:
+            schedule = job.get("schedule", {})
+            kind = schedule.get("kind")
+
+            # For recurring jobs, check if the scheduled time is stale
+            # (gateway was down and missed the window). Fast-forward to
+            # the next future occurrence instead of firing a stale run.
+            if kind in ("cron", "interval") and (now - next_run_dt).total_seconds() > 120:
+                # More than 2 minutes late — this is a missed run, not a current one.
+                # Recompute next_run_at to the next future occurrence.
+                new_next = compute_next_run(schedule, now.isoformat())
+                if new_next:
+                    logger.info(
+                        "Job '%s' missed its scheduled time (%s). "
+                        "Fast-forwarding to next run: %s",
+                        job.get("name", job["id"]),
+                        next_run,
+                        new_next,
+                    )
+                    # Update the job in storage
+                    for rj in raw_jobs:
+                        if rj["id"] == job["id"]:
+                            rj["next_run_at"] = new_next
+                            needs_save = True
+                            break
+                    continue  # Skip this run
+
            due.append(job)

+    if needs_save:
+        save_jobs(raw_jobs)
+
    return due


@@ -315,6 +315,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:

        # Provider routing
        pr = _cfg.get("provider_routing", {})
+        smart_routing = _cfg.get("smart_model_routing", {}) or {}

        from hermes_cli.runtime_provider import (
            resolve_runtime_provider,
@@ -331,12 +332,25 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
            message = format_runtime_provider_error(exc)
            raise RuntimeError(message) from exc

+        from agent.smart_model_routing import resolve_turn_route
+        turn_route = resolve_turn_route(
+            prompt,
+            smart_routing,
+            {
+                "model": model,
+                "api_key": runtime.get("api_key"),
+                "base_url": runtime.get("base_url"),
+                "provider": runtime.get("provider"),
+                "api_mode": runtime.get("api_mode"),
+            },
+        )
+
        agent = AIAgent(
-            model=model,
-            api_key=runtime.get("api_key"),
-            base_url=runtime.get("base_url"),
-            provider=runtime.get("provider"),
-            api_mode=runtime.get("api_mode"),
+            model=turn_route["model"],
+            api_key=turn_route["runtime"].get("api_key"),
+            base_url=turn_route["runtime"].get("base_url"),
+            provider=turn_route["runtime"].get("provider"),
+            api_mode=turn_route["runtime"].get("api_mode"),
            max_iterations=max_iterations,
            reasoning_config=reasoning_config,
            prefill_messages=prefill_messages,
@@ -97,10 +97,11 @@ class SessionResetPolicy:
    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SessionResetPolicy":
        # Handle both missing keys and explicit null values (YAML null → None)
+        mode = data.get("mode")
        at_hour = data.get("at_hour")
        idle_minutes = data.get("idle_minutes")
        return cls(
-            mode=data.get("mode", "both"),
+            mode=mode if mode is not None else "both",
            at_hour=at_hour if at_hour is not None else 4,
            idle_minutes=idle_minutes if idle_minutes is not None else 1440,
        )
@@ -145,6 +146,37 @@ class PlatformConfig:
        )


+@dataclass
+class StreamingConfig:
+    """Configuration for real-time token streaming to messaging platforms."""
+    enabled: bool = False
+    transport: str = "edit"       # "edit" (progressive editMessageText) or "off"
+    edit_interval: float = 0.3    # Seconds between message edits
+    buffer_threshold: int = 40    # Chars before forcing an edit
+    cursor: str = " ▉"           # Cursor shown during streaming
+
+    def to_dict(self) -> Dict[str, Any]:
+        return {
+            "enabled": self.enabled,
+            "transport": self.transport,
+            "edit_interval": self.edit_interval,
+            "buffer_threshold": self.buffer_threshold,
+            "cursor": self.cursor,
+        }
+
+    @classmethod
+    def from_dict(cls, data: Dict[str, Any]) -> "StreamingConfig":
+        if not data:
+            return cls()
+        return cls(
+            enabled=data.get("enabled", False),
+            transport=data.get("transport", "edit"),
+            edit_interval=float(data.get("edit_interval", 0.3)),
+            buffer_threshold=int(data.get("buffer_threshold", 40)),
+            cursor=data.get("cursor", " ▉"),
+        )
+
+
@dataclass
 class GatewayConfig:
    """
@@ -174,7 +206,13 @@ class GatewayConfig:

    # STT settings
    stt_enabled: bool = True  # Whether to auto-transcribe inbound voice messages
-    
+
+    # Session isolation in shared chats
+    group_sessions_per_user: bool = True  # Isolate group/channel sessions per participant when user IDs are available
+
+    # Streaming configuration
+    streaming: StreamingConfig = field(default_factory=StreamingConfig)
+
    def get_connected_platforms(self) -> List[Platform]:
        """Return list of platforms that are enabled and configured."""
        connected = []
@@ -239,6 +277,8 @@ class GatewayConfig:
            "sessions_dir": str(self.sessions_dir),
            "always_log_local": self.always_log_local,
            "stt_enabled": self.stt_enabled,
+            "group_sessions_per_user": self.group_sessions_per_user,
+            "streaming": self.streaming.to_dict(),
        }
    
    @classmethod
@@ -279,6 +319,8 @@ class GatewayConfig:
        if stt_enabled is None:
            stt_enabled = data.get("stt", {}).get("enabled") if isinstance(data.get("stt"), dict) else None

+        group_sessions_per_user = data.get("group_sessions_per_user")
+
        return cls(
            platforms=platforms,
            default_reset_policy=default_policy,
@@ -289,6 +331,8 @@ class GatewayConfig:
            sessions_dir=sessions_dir,
            always_log_local=data.get("always_log_local", True),
            stt_enabled=_coerce_bool(stt_enabled, True),
+            group_sessions_per_user=_coerce_bool(group_sessions_per_user, True),
+            streaming=StreamingConfig.from_dict(data.get("streaming", {})),
        )


@@ -344,6 +388,14 @@ def load_gateway_config() -> GatewayConfig:
            if isinstance(stt_cfg, dict) and "enabled" in stt_cfg:
                config.stt_enabled = _coerce_bool(stt_cfg.get("enabled"), True)

+            # Bridge group session isolation from config.yaml into gateway runtime.
+            # Secure default is per-user isolation in shared chats.
+            if "group_sessions_per_user" in yaml_cfg:
+                config.group_sessions_per_user = _coerce_bool(
+                    yaml_cfg.get("group_sessions_per_user"),
+                    True,
+                )
+
            # Bridge discord settings from config.yaml to env vars
            # (env vars take precedence — only set if not already defined)
            discord_cfg = yaml_cfg.get("discord", {})
@@ -528,6 +528,7 @@ class BasePlatformAdapter(ABC):
        animation_url: str,
        caption: Optional[str] = None,
        reply_to: Optional[str] = None,
+        metadata: Optional[Dict[str, Any]] = None,
    ) -> SendResult:
        """
        Send an animated GIF natively via the platform API.
@@ -752,7 +753,10 @@ class BasePlatformAdapter(ABC):
        if not self._message_handler:
            return
        
-        session_key = build_session_key(event.source)
+        session_key = build_session_key(
+            event.source,
+            group_sessions_per_user=self.config.extra.get("group_sessions_per_user", True),
+        )
        
        # Check if there's already an active handler for this session
        if session_key in self._active_sessions:
@@ -1126,6 +1130,27 @@ class BasePlatformAdapter(ABC):
            if split_at < 1:
                split_at = headroom

+            # Avoid splitting inside an inline code span (`...`).
+            # If the text before split_at has an odd number of unescaped
+            # backticks, the split falls inside inline code — the resulting
+            # chunk would have an unpaired backtick and any special characters
+            # (like parentheses) inside the broken span would be unescaped,
+            # causing MarkdownV2 parse errors on Telegram.
+            candidate = remaining[:split_at]
+            backtick_count = candidate.count("`") - candidate.count("\\`")
+            if backtick_count % 2 == 1:
+                # Find the last unescaped backtick and split before it
+                last_bt = candidate.rfind("`")
+                while last_bt > 0 and candidate[last_bt - 1] == "\\":
+                    last_bt = candidate.rfind("`", 0, last_bt)
+                if last_bt > 0:
+                    # Try to find a space or newline just before the backtick
+                    safe_split = candidate.rfind(" ", 0, last_bt)
+                    nl_split = candidate.rfind("\n", 0, last_bt)
+                    safe_split = max(safe_split, nl_split)
+                    if safe_split > headroom // 4:
+                        split_at = safe_split
+
            chunk_body = remaining[:split_at]
            remaining = remaining[split_at:].lstrip()

@@ -135,14 +135,23 @@ def _extract_email_address(raw: str) -> str:
    return raw.strip().lower()


-def _extract_attachments(msg: email_lib.message.Message) -> List[Dict[str, Any]]:
-    """Extract attachment metadata and cache files locally."""
+def _extract_attachments(
+    msg: email_lib.message.Message,
+    skip_attachments: bool = False,
+) -> List[Dict[str, Any]]:
+    """Extract attachment metadata and cache files locally.
+
+    When *skip_attachments* is True, all attachment/inline parts are ignored
+    (useful for malware protection or bandwidth savings).
+    """
    attachments = []
    if not msg.is_multipart():
        return attachments

    for part in msg.walk():
        disposition = str(part.get("Content-Disposition", ""))
+        if skip_attachments and ("attachment" in disposition or "inline" in disposition):
+            continue
        if "attachment" not in disposition and "inline" not in disposition:
            continue
        # Skip text/plain and text/html body parts
@@ -196,6 +205,13 @@ class EmailAdapter(BasePlatformAdapter):
        self._smtp_port = int(os.getenv("EMAIL_SMTP_PORT", "587"))
        self._poll_interval = int(os.getenv("EMAIL_POLL_INTERVAL", "15"))

+        # Skip attachments — configured via config.yaml:
+        #   platforms:
+        #     email:
+        #       skip_attachments: true
+        extra = config.extra or {}
+        self._skip_attachments = extra.get("skip_attachments", False)
+
        # Track message IDs we've already processed to avoid duplicates
        self._seen_uids: set = set()
        self._poll_task: Optional[asyncio.Task] = None
@@ -306,7 +322,7 @@ class EmailAdapter(BasePlatformAdapter):
                message_id = msg.get("Message-ID", "")
                in_reply_to = msg.get("In-Reply-To", "")
                body = _extract_text_body(msg)
-                attachments = _extract_attachments(msg)
+                attachments = _extract_attachments(msg, skip_attachments=self._skip_attachments)

                results.append({
                    "uid": uid,
@@ -789,23 +789,11 @@ class SlackAdapter(BasePlatformAdapter):
        user_id = command.get("user_id", "")
        channel_id = command.get("channel_id", "")

-        # Map subcommands to gateway commands
-        subcommand_map = {
-            "new": "/reset", "reset": "/reset",
-            "status": "/status", "stop": "/stop",
-            "help": "/help",
-            "model": "/model", "personality": "/personality",
-            "retry": "/retry", "undo": "/undo",
-            "compact": "/compress", "compress": "/compress",
-            "resume": "/resume",
-            "background": "/background",
-            "usage": "/usage",
-            "insights": "/insights",
-            "title": "/title",
-            "reasoning": "/reasoning",
-            "provider": "/provider",
-            "rollback": "/rollback",
-        }
+        # Map subcommands to gateway commands — derived from central registry.
+        # Also keep "compact" as a Slack-specific alias for /compress.
+        from hermes_cli.commands import slack_subcommand_map
+        subcommand_map = slack_subcommand_map()
+        subcommand_map["compact"] = "/compress"
        first_word = text.split()[0] if text else ""
        if first_word in subcommand_map:
            # Preserve arguments after the subcommand
@@ -202,8 +202,26 @@ class TelegramAdapter(BasePlatformAdapter):
                self._handle_media_message
            ))
            
-            # Start polling in background
-            await self._app.initialize()
+            # Start polling — retry initialize() for transient TLS resets
+            try:
+                from telegram.error import NetworkError, TimedOut
+            except ImportError:
+                NetworkError = TimedOut = OSError  # type: ignore[misc,assignment]
+            _max_connect = 3
+            for _attempt in range(_max_connect):
+                try:
+                    await self._app.initialize()
+                    break
+                except (NetworkError, TimedOut, OSError) as init_err:
+                    if _attempt < _max_connect - 1:
+                        wait = 2 ** _attempt
+                        logger.warning(
+                            "[%s] Connect attempt %d/%d failed: %s — retrying in %ds",
+                            self.name, _attempt + 1, _max_connect, init_err, wait,
+                        )
+                        await asyncio.sleep(wait)
+                    else:
+                        raise
            await self._app.start()
            loop = asyncio.get_running_loop()

@@ -222,29 +240,13 @@ class TelegramAdapter(BasePlatformAdapter):
            )
            
            # Register bot commands so Telegram shows a hint menu when users type /
+            # List is derived from the central COMMAND_REGISTRY — adding a new
+            # gateway command there automatically adds it to the Telegram menu.
            try:
                from telegram import BotCommand
+                from hermes_cli.commands import telegram_bot_commands
                await self._bot.set_my_commands([
-                    BotCommand("new", "Start a new conversation"),
-                    BotCommand("reset", "Reset conversation history"),
-                    BotCommand("model", "Show or change the model"),
-                    BotCommand("reasoning", "Show or change reasoning effort"),
-                    BotCommand("personality", "Set a personality"),
-                    BotCommand("retry", "Retry your last message"),
-                    BotCommand("undo", "Remove the last exchange"),
-                    BotCommand("status", "Show session info"),
-                    BotCommand("stop", "Stop the running agent"),
-                    BotCommand("sethome", "Set this chat as the home channel"),
-                    BotCommand("compress", "Compress conversation context"),
-                    BotCommand("title", "Set or show the session title"),
-                    BotCommand("resume", "Resume a previously-named session"),
-                    BotCommand("usage", "Show token usage for this session"),
-                    BotCommand("provider", "Show available providers"),
-                    BotCommand("insights", "Show usage insights and analytics"),
-                    BotCommand("update", "Update Hermes to the latest version"),
-                    BotCommand("reload_mcp", "Reload MCP servers from config"),
-                    BotCommand("voice", "Toggle voice reply mode"),
-                    BotCommand("help", "Show available commands"),
+                    BotCommand(name, desc) for name, desc in telegram_bot_commands()
                ])
            except Exception as e:
                logger.warning(
@@ -265,6 +267,8 @@ class TelegramAdapter(BasePlatformAdapter):
                    release_scoped_lock("telegram-bot-token", self._token_lock_identity)
                except Exception:
                    pass
+            message = f"Telegram startup failed: {e}"
+            self._set_fatal_error("telegram_connect_error", message, retryable=True)
            logger.error("[%s] Failed to connect to Telegram: %s", self.name, e, exc_info=True)
            return False
    
@@ -334,32 +338,47 @@ class TelegramAdapter(BasePlatformAdapter):
            message_ids = []
            thread_id = metadata.get("thread_id") if metadata else None
            
+            try:
+                from telegram.error import NetworkError as _NetErr
+            except ImportError:
+                _NetErr = OSError  # type: ignore[misc,assignment]
+
            for i, chunk in enumerate(chunks):
-                # Try Markdown first, fall back to plain text if it fails
-                try:
-                    msg = await self._bot.send_message(
-                        chat_id=int(chat_id),
-                        text=chunk,
-                        parse_mode=ParseMode.MARKDOWN_V2,
-                        reply_to_message_id=int(reply_to) if reply_to and i == 0 else None,
-                        message_thread_id=int(thread_id) if thread_id else None,
-                    )
-                except Exception as md_error:
-                    # Markdown parsing failed, try plain text
-                    if "parse" in str(md_error).lower() or "markdown" in str(md_error).lower():
-                        logger.warning("[%s] MarkdownV2 parse failed, falling back to plain text: %s", self.name, md_error)
-                        # Strip MDV2 escape backslashes so the user doesn't
-                        # see raw backslashes littered through the message.
-                        plain_chunk = _strip_mdv2(chunk)
-                        msg = await self._bot.send_message(
-                            chat_id=int(chat_id),
-                            text=plain_chunk,
-                            parse_mode=None,  # Plain text
-                            reply_to_message_id=int(reply_to) if reply_to and i == 0 else None,
-                            message_thread_id=int(thread_id) if thread_id else None,
-                        )
-                    else:
-                        raise  # Re-raise if not a parse error
+                msg = None
+                for _send_attempt in range(3):
+                    try:
+                        # Try Markdown first, fall back to plain text if it fails
+                        try:
+                            msg = await self._bot.send_message(
+                                chat_id=int(chat_id),
+                                text=chunk,
+                                parse_mode=ParseMode.MARKDOWN_V2,
+                                reply_to_message_id=int(reply_to) if reply_to and i == 0 else None,
+                                message_thread_id=int(thread_id) if thread_id else None,
+                            )
+                        except Exception as md_error:
+                            # Markdown parsing failed, try plain text
+                            if "parse" in str(md_error).lower() or "markdown" in str(md_error).lower():
+                                logger.warning("[%s] MarkdownV2 parse failed, falling back to plain text: %s", self.name, md_error)
+                                plain_chunk = _strip_mdv2(chunk)
+                                msg = await self._bot.send_message(
+                                    chat_id=int(chat_id),
+                                    text=plain_chunk,
+                                    parse_mode=None,
+                                    reply_to_message_id=int(reply_to) if reply_to and i == 0 else None,
+                                    message_thread_id=int(thread_id) if thread_id else None,
+                                )
+                            else:
+                                raise
+                        break  # success
+                    except _NetErr as send_err:
+                        if _send_attempt < 2:
+                            wait = 2 ** _send_attempt
+                            logger.warning("[%s] Network error on send (attempt %d/3), retrying in %ds: %s",
+                                           self.name, _send_attempt + 1, wait, send_err)
+                            await asyncio.sleep(wait)
+                        else:
+                            raise
                message_ids.append(str(msg.message_id))
            
            return SendResult(
@@ -829,7 +848,10 @@ class TelegramAdapter(BasePlatformAdapter):
    def _photo_batch_key(self, event: MessageEvent, msg: Message) -> str:
        """Return a batching key for Telegram photos/albums."""
        from gateway.session import build_session_key
-        session_key = build_session_key(event.source)
+        session_key = build_session_key(
+            event.source,
+            group_sessions_per_user=self.config.extra.get("group_sessions_per_user", True),
+        )
        media_group_id = getattr(msg, "media_group_id", None)
        if media_group_id:
            return f"{session_key}:album:{media_group_id}"
@@ -29,6 +29,49 @@ from pathlib import Path
 from datetime import datetime
 from typing import Dict, Optional, Any, List

+# ---------------------------------------------------------------------------
+# SSL certificate auto-detection for NixOS and other non-standard systems.
+# Must run BEFORE any HTTP library (discord, aiohttp, etc.) is imported.
+# ---------------------------------------------------------------------------
+def _ensure_ssl_certs() -> None:
+    """Set SSL_CERT_FILE if the system doesn't expose CA certs to Python."""
+    if "SSL_CERT_FILE" in os.environ:
+        return  # user already configured it
+
+    import ssl
+
+    # 1. Python's compiled-in defaults
+    paths = ssl.get_default_verify_paths()
+    for candidate in (paths.cafile, paths.openssl_cafile):
+        if candidate and os.path.exists(candidate):
+            os.environ["SSL_CERT_FILE"] = candidate
+            return
+
+    # 2. certifi (ships its own Mozilla bundle)
+    try:
+        import certifi
+        os.environ["SSL_CERT_FILE"] = certifi.where()
+        return
+    except ImportError:
+        pass
+
+    # 3. Common distro / macOS locations
+    for candidate in (
+        "/etc/ssl/certs/ca-certificates.crt",               # Debian/Ubuntu/Gentoo
+        "/etc/pki/tls/certs/ca-bundle.crt",                 # RHEL/CentOS 7
+        "/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem", # RHEL/CentOS 8+
+        "/etc/ssl/ca-bundle.pem",                            # SUSE/OpenSUSE
+        "/etc/ssl/cert.pem",                                 # Alpine / macOS
+        "/etc/pki/tls/cert.pem",                             # Fedora
+        "/usr/local/etc/openssl@1.1/cert.pem",               # macOS Homebrew Intel
+        "/opt/homebrew/etc/openssl@1.1/cert.pem",            # macOS Homebrew ARM
+    ):
+        if os.path.exists(candidate):
+            os.environ["SSL_CERT_FILE"] = candidate
+            return
+
+_ensure_ssl_certs()
+
 # Add parent directory to path
 sys.path.insert(0, str(Path(__file__).parent.parent))

@@ -77,6 +120,7 @@ if _config_path.exists():
                "container_persistent": "TERMINAL_CONTAINER_PERSISTENT",
                "docker_volumes": "TERMINAL_DOCKER_VOLUMES",
                "sandbox_dir": "TERMINAL_SANDBOX_DIR",
+                "persistent_shell": "TERMINAL_PERSISTENT_SHELL",
            }
            for _cfg_key, _env_var in _terminal_env_map.items():
                if _cfg_key in _terminal_cfg:
@@ -113,6 +157,12 @@ if _config_path.exists():
                    "base_url": "AUXILIARY_WEB_EXTRACT_BASE_URL",
                    "api_key": "AUXILIARY_WEB_EXTRACT_API_KEY",
                },
+                "approval": {
+                    "provider": "AUXILIARY_APPROVAL_PROVIDER",
+                    "model": "AUXILIARY_APPROVAL_MODEL",
+                    "base_url": "AUXILIARY_APPROVAL_BASE_URL",
+                    "api_key": "AUXILIARY_APPROVAL_API_KEY",
+                },
            }
            for _task_key, _env_map in _aux_task_env.items():
                _task_cfg = _auxiliary_cfg.get(_task_key, {})
@@ -274,6 +324,7 @@ class GatewayRunner:
        self._show_reasoning = self._load_show_reasoning()
        self._provider_routing = self._load_provider_routing()
        self._fallback_model = self._load_fallback_model()
+        self._smart_model_routing = self._load_smart_model_routing()

        # Wire process registry into session store for reset protection
        from tools.process_registry import process_registry
@@ -434,7 +485,11 @@ class GatewayRunner:

    # -----------------------------------------------------------------

-    def _flush_memories_for_session(self, old_session_id: str):
+    def _flush_memories_for_session(
+        self,
+        old_session_id: str,
+        honcho_session_key: Optional[str] = None,
+    ):
        """Prompt the agent to save memories/skills before context is lost.

        Synchronous worker — meant to be called via run_in_executor from
@@ -462,6 +517,7 @@ class GatewayRunner:
                quiet_mode=True,
                enabled_toolsets=["memory", "skills"],
                session_id=old_session_id,
+                honcho_session_key=honcho_session_key,
            )

            # Build conversation history from transcript
@@ -489,6 +545,7 @@ class GatewayRunner:
            tmp_agent.run_conversation(
                user_message=flush_prompt,
                conversation_history=msgs,
+                sync_honcho=False,
            )
            logger.info("Pre-reset memory flush completed for session %s", old_session_id)
            # Flush any queued Honcho writes before the session is dropped
@@ -500,10 +557,19 @@ class GatewayRunner:
        except Exception as e:
            logger.debug("Pre-reset memory flush failed for session %s: %s", old_session_id, e)

-    async def _async_flush_memories(self, old_session_id: str):
+    async def _async_flush_memories(
+        self,
+        old_session_id: str,
+        honcho_session_key: Optional[str] = None,
+    ):
        """Run the sync memory flush in a thread pool so it won't block the event loop."""
        loop = asyncio.get_event_loop()
-        await loop.run_in_executor(None, self._flush_memories_for_session, old_session_id)
+        await loop.run_in_executor(
+            None,
+            self._flush_memories_for_session,
+            old_session_id,
+            honcho_session_key,
+        )

    @property
    def should_exit_cleanly(self) -> bool:
@@ -513,6 +579,33 @@ class GatewayRunner:
    def exit_reason(self) -> Optional[str]:
        return self._exit_reason

+    def _session_key_for_source(self, source: SessionSource) -> str:
+        """Resolve the current session key for a source, honoring gateway config when available."""
+        if hasattr(self, "session_store") and self.session_store is not None:
+            try:
+                session_key = self.session_store._generate_session_key(source)
+                if isinstance(session_key, str) and session_key:
+                    return session_key
+            except Exception:
+                pass
+        config = getattr(self, "config", None)
+        return build_session_key(
+            source,
+            group_sessions_per_user=getattr(config, "group_sessions_per_user", True),
+        )
+
+    def _resolve_turn_agent_config(self, user_message: str, model: str, runtime_kwargs: dict) -> dict:
+        from agent.smart_model_routing import resolve_turn_route
+
+        primary = {
+            "model": model,
+            "api_key": runtime_kwargs.get("api_key"),
+            "base_url": runtime_kwargs.get("base_url"),
+            "provider": runtime_kwargs.get("provider"),
+            "api_mode": runtime_kwargs.get("api_mode"),
+        }
+        return resolve_turn_route(user_message, getattr(self, "_smart_model_routing", {}), primary)
+
    async def _handle_adapter_fatal_error(self, adapter: BasePlatformAdapter) -> None:
        """React to a non-retryable adapter failure after startup."""
        logger.error(
@@ -715,6 +808,20 @@ class GatewayRunner:
            pass
        return None

+    @staticmethod
+    def _load_smart_model_routing() -> dict:
+        """Load optional smart cheap-vs-strong model routing config."""
+        try:
+            import yaml as _y
+            cfg_path = _hermes_home / "config.yaml"
+            if cfg_path.exists():
+                with open(cfg_path, encoding="utf-8") as _f:
+                    cfg = _y.safe_load(_f) or {}
+                return cfg.get("smart_model_routing", {}) or {}
+        except Exception:
+            pass
+        return {}
+
    async def start(self) -> bool:
        """
        Start the gateway and all configured platform adapters.
@@ -757,12 +864,15 @@ class GatewayRunner:
            logger.warning("Process checkpoint recovery: %s", e)
        
        connected_count = 0
+        enabled_platform_count = 0
        startup_nonretryable_errors: list[str] = []
+        startup_retryable_errors: list[str] = []
        
        # Initialize and connect each configured platform
        for platform, platform_config in self.config.platforms.items():
            if not platform_config.enabled:
                continue
+            enabled_platform_count += 1
            
            adapter = self._create_adapter(platform, platform_config)
            if not adapter:
@@ -784,12 +894,22 @@ class GatewayRunner:
                    logger.info("✓ %s connected", platform.value)
                else:
                    logger.warning("✗ %s failed to connect", platform.value)
-                    if adapter.has_fatal_error and not adapter.fatal_error_retryable:
-                        startup_nonretryable_errors.append(
+                    if adapter.has_fatal_error:
+                        target = (
+                            startup_retryable_errors
+                            if adapter.fatal_error_retryable
+                            else startup_nonretryable_errors
+                        )
+                        target.append(
                            f"{platform.value}: {adapter.fatal_error_message}"
                        )
+                    else:
+                        startup_retryable_errors.append(
+                            f"{platform.value}: failed to connect"
+                        )
            except Exception as e:
                logger.error("✗ %s error: %s", platform.value, e)
+                startup_retryable_errors.append(f"{platform.value}: {e}")
        
        if connected_count == 0:
            if startup_nonretryable_errors:
@@ -802,7 +922,16 @@ class GatewayRunner:
                    pass
                self._request_clean_exit(reason)
                return True
-            logger.warning("No messaging platforms connected.")
+            if enabled_platform_count > 0:
+                reason = "; ".join(startup_retryable_errors) or "all configured messaging platforms failed to connect"
+                logger.error("Gateway failed to connect any configured messaging platform: %s", reason)
+                try:
+                    from gateway.status import write_runtime_status
+                    write_runtime_status(gateway_state="startup_failed", exit_reason=reason)
+                except Exception:
+                    pass
+                return False
+            logger.warning("No messaging platforms enabled.")
            logger.info("Gateway will continue running for cron job execution.")
        
        # Update delivery router with adapters
@@ -879,7 +1008,7 @@ class GatewayRunner:
                        entry.session_id, key,
                    )
                    try:
-                        await self._async_flush_memories(entry.session_id)
+                        await self._async_flush_memories(entry.session_id, key)
                        self._shutdown_gateway_honcho(key)
                        self.session_store._pre_flushed_sessions.add(entry.session_id)
                    except Exception as e:
@@ -941,6 +1070,12 @@ class GatewayRunner:
        config: Any
    ) -> Optional[BasePlatformAdapter]:
        """Create the appropriate adapter for a platform."""
+        if hasattr(config, "extra") and isinstance(config.extra, dict):
+            config.extra.setdefault(
+                "group_sessions_per_user",
+                self.config.group_sessions_per_user,
+            )
+
        if platform == Platform.TELEGRAM:
            from gateway.platforms.telegram import TelegramAdapter, check_telegram_requirements
            if not check_telegram_requirements():
@@ -1112,7 +1247,7 @@ class GatewayRunner:
        # Special case: Telegram/photo bursts often arrive as multiple near-
        # simultaneous updates. Do NOT interrupt for photo-only follow-ups here;
        # let the adapter-level batching/queueing logic absorb them.
-        _quick_key = build_session_key(source)
+        _quick_key = self._session_key_for_source(source)
        if _quick_key in self._running_agents:
            if event.get_command() == "status":
                return await self._handle_status_command(event)
@@ -1150,45 +1285,47 @@ class GatewayRunner:
        # Check for commands
        command = event.get_command()
        
-        # Emit command:* hook for any recognized slash command
-        _known_commands = {"new", "reset", "help", "status", "stop", "model", "reasoning",
-                          "personality", "plan", "retry", "undo", "sethome", "set-home",
-                          "compress", "usage", "insights", "reload-mcp", "reload_mcp",
-                          "update", "title", "resume", "provider", "rollback",
-                          "background", "reasoning", "voice"}
-        if command and command in _known_commands:
+        # Emit command:* hook for any recognized slash command.
+        # GATEWAY_KNOWN_COMMANDS is derived from the central COMMAND_REGISTRY
+        # in hermes_cli/commands.py — no hardcoded set to maintain here.
+        from hermes_cli.commands import GATEWAY_KNOWN_COMMANDS, resolve_command as _resolve_cmd
+        if command and command in GATEWAY_KNOWN_COMMANDS:
            await self.hooks.emit(f"command:{command}", {
                "platform": source.platform.value if source.platform else "",
                "user_id": source.user_id,
                "command": command,
                "args": event.get_command_args().strip(),
            })
-        
-        if command in ["new", "reset"]:
+
+        # Resolve aliases to canonical name so dispatch only checks canonicals.
+        _cmd_def = _resolve_cmd(command) if command else None
+        canonical = _cmd_def.name if _cmd_def else command
+
+        if canonical == "new":
            return await self._handle_reset_command(event)
        
-        if command == "help":
+        if canonical == "help":
            return await self._handle_help_command(event)
        
-        if command == "status":
+        if canonical == "status":
            return await self._handle_status_command(event)
        
-        if command == "stop":
+        if canonical == "stop":
            return await self._handle_stop_command(event)
        
-        if command == "model":
+        if canonical == "model":
            return await self._handle_model_command(event)

-        if command == "reasoning":
+        if canonical == "reasoning":
            return await self._handle_reasoning_command(event)

-        if command == "provider":
+        if canonical == "provider":
            return await self._handle_provider_command(event)
        
-        if command == "personality":
+        if canonical == "personality":
            return await self._handle_personality_command(event)

-        if command == "plan":
+        if canonical == "plan":
            try:
                from agent.skill_commands import build_plan_path, build_skill_invocation_message

@@ -1205,51 +1342,48 @@ class GatewayRunner:
                )
                if not event.text:
                    return "Failed to load the bundled /plan skill."
-                command = None
+                canonical = None
            except Exception as e:
                logger.exception("Failed to prepare /plan command")
                return f"Failed to enter plan mode: {e}"
        
-        if command == "retry":
+        if canonical == "retry":
            return await self._handle_retry_command(event)
        
-        if command == "undo":
+        if canonical == "undo":
            return await self._handle_undo_command(event)
        
-        if command in ["sethome", "set-home"]:
+        if canonical == "sethome":
            return await self._handle_set_home_command(event)

-        if command == "compress":
+        if canonical == "compress":
            return await self._handle_compress_command(event)

-        if command == "usage":
+        if canonical == "usage":
            return await self._handle_usage_command(event)

-        if command == "insights":
+        if canonical == "insights":
            return await self._handle_insights_command(event)

-        if command in ("reload-mcp", "reload_mcp"):
+        if canonical == "reload-mcp":
            return await self._handle_reload_mcp_command(event)

-        if command == "update":
+        if canonical == "update":
            return await self._handle_update_command(event)

-        if command == "title":
+        if canonical == "title":
            return await self._handle_title_command(event)

-        if command == "resume":
+        if canonical == "resume":
            return await self._handle_resume_command(event)

-        if command == "rollback":
+        if canonical == "rollback":
            return await self._handle_rollback_command(event)

-        if command == "background":
+        if canonical == "background":
            return await self._handle_background_command(event)

-        if command == "reasoning":
-            return await self._handle_reasoning_command(event)
-
-        if command == "voice":
+        if canonical == "voice":
            return await self._handle_voice_command(event)

        # User-defined quick commands (bypass agent loop, no LLM call)
@@ -1301,7 +1435,7 @@ class GatewayRunner:
                logger.debug("Skill command check failed (non-fatal): %s", e)
        
        # Check for pending exec approval responses
-        session_key_preview = build_session_key(source)
+        session_key_preview = self._session_key_for_source(source)
        if session_key_preview in self._pending_approvals:
            user_text = event.text.strip().lower()
            if user_text in ("yes", "y", "approve", "ok", "go", "do it"):
@@ -1350,8 +1484,17 @@ class GatewayRunner:
        # Set environment variables for tools
        self._set_session_env(context)
        
+        # Read privacy.redact_pii from config (re-read per message)
+        _redact_pii = False
+        try:
+            with open(_config_path, encoding="utf-8") as _pf:
+                _pcfg = yaml.safe_load(_pf) or {}
+            _redact_pii = bool((_pcfg.get("privacy") or {}).get("redact_pii", False))
+        except Exception:
+            pass
+
        # Build the context prompt to inject
-        context_prompt = build_session_context_prompt(context)
+        context_prompt = build_session_context_prompt(context, redact_pii=_redact_pii)
        
        # If the previous session expired and was auto-reset, prepend a notice
        # so the agent knows this is a fresh conversation (not an intentional /reset).
@@ -1720,9 +1863,17 @@ class GatewayRunner:
                session_key=session_key
            )
            
-            response = agent_result.get("final_response", "")
+            response = agent_result.get("final_response") or ""
            agent_messages = agent_result.get("messages", [])

+            # Surface error details when the agent failed silently (final_response=None)
+            if not response and agent_result.get("failed"):
+                error_detail = agent_result.get("error", "unknown error")
+                response = (
+                    f"The request failed: {str(error_detail)[:300]}\n"
+                    "Try again or use /reset to start a fresh session."
+                )
+
            # If the agent's session_id changed during compression, update
            # session_entry so transcript writes below go to the right session.
            if agent_result.get("session_id") and agent_result["session_id"] != session_entry.session_id:
@@ -1835,13 +1986,29 @@ class GatewayRunner:
            if self._should_send_voice_reply(event, response, agent_messages):
                await self._send_voice_reply(event, response)

+            # If streaming already delivered the response, return None so
+            # _process_message_background doesn't send it again.
+            if agent_result.get("already_sent"):
+                return None
+
            return response
            
        except Exception as e:
            logger.exception("Agent error in session %s", session_key)
+            error_type = type(e).__name__
+            error_detail = str(e)[:300] if str(e) else "no details available"
+            status_hint = ""
+            status_code = getattr(e, "status_code", None)
+            if status_code == 401:
+                status_hint = " Check your API key or run `claude /login` to refresh OAuth credentials."
+            elif status_code == 429:
+                status_hint = " You are being rate-limited. Please wait a moment and try again."
+            elif status_code == 529:
+                status_hint = " The API is temporarily overloaded. Please try again shortly."
            return (
-                "Sorry, I encountered an unexpected error. "
-                "The details have been logged for debugging. "
+                f"Sorry, I encountered an error ({error_type}).\n"
+                f"{error_detail}\n"
+                f"{status_hint}"
                "Try again or use /reset to start a fresh session."
            )
        finally:
@@ -1853,14 +2020,16 @@ class GatewayRunner:
        source = event.source
        
        # Get existing session key
-        session_key = self.session_store._generate_session_key(source)
+        session_key = self._session_key_for_source(source)
        
        # Flush memories in the background (fire-and-forget) so the user
        # gets the "Session reset!" response immediately.
        try:
            old_entry = self.session_store._entries.get(session_key)
            if old_entry:
-                asyncio.create_task(self._async_flush_memories(old_entry.session_id))
+                asyncio.create_task(
+                    self._async_flush_memories(old_entry.session_id, session_key)
+                )
        except Exception as e:
            logger.debug("Gateway memory flush on reset failed: %s", e)

@@ -1923,30 +2092,10 @@ class GatewayRunner:
    
    async def _handle_help_command(self, event: MessageEvent) -> str:
        """Handle /help command - list available commands."""
+        from hermes_cli.commands import gateway_help_lines
        lines = [
            "📖 **Hermes Commands**\n",
-            "`/new` — Start a new conversation",
-            "`/reset` — Reset conversation history",
-            "`/status` — Show session info",
-            "`/stop` — Interrupt the running agent",
-            "`/model [provider:model]` — Show/change model (or switch provider)",
-            "`/provider` — Show available providers and auth status",
-            "`/personality [name]` — Set a personality",
-            "`/retry` — Retry your last message",
-            "`/undo` — Remove the last exchange",
-            "`/sethome` — Set this chat as the home channel",
-            "`/compress` — Compress conversation context",
-            "`/title [name]` — Set or show the session title",
-            "`/resume [name]` — Resume a previously-named session",
-            "`/usage` — Show token usage for this session",
-            "`/insights [days]` — Show usage insights and analytics",
-            "`/reasoning [level|show|hide]` — Set reasoning effort or toggle display",
-            "`/rollback [number]` — List or restore filesystem checkpoints",
-            "`/background <prompt>` — Run a prompt in a separate background session",
-            "`/voice [on|off|tts|status]` — Toggle voice reply mode",
-            "`/reload-mcp` — Reload MCP servers from config",
-            "`/update` — Update Hermes Agent to the latest version",
-            "`/help` — Show this message",
+            *gateway_help_lines(),
        ]
        try:
            from agent.skill_commands import get_skill_commands
@@ -2024,6 +2173,12 @@ class GatewayRunner:

        # Parse provider:model syntax
        target_provider, new_model = parse_model_input(args, current_provider)
+        # Auto-detect provider when no explicit provider:model syntax was used
+        if target_provider == current_provider:
+            from hermes_cli.models import detect_provider_for_model
+            detected = detect_provider_for_model(new_model, current_provider)
+            if detected:
+                target_provider, new_model = detected
        provider_changed = target_provider != current_provider

        # Resolve credentials for the target provider (for API probe)
@@ -2806,11 +2961,12 @@ class GatewayRunner:
            max_iterations = int(os.getenv("HERMES_MAX_ITERATIONS", "90"))
            reasoning_config = self._load_reasoning_config()
            self._reasoning_config = reasoning_config
+            turn_route = self._resolve_turn_agent_config(prompt, model, runtime_kwargs)

            def run_sync():
                agent = AIAgent(
-                    model=model,
-                    **runtime_kwargs,
+                    model=turn_route["model"],
+                    **turn_route["runtime"],
                    max_iterations=max_iterations,
                    quiet_mode=True,
                    verbose_logging=False,
@@ -3083,7 +3239,7 @@ class GatewayRunner:
            return "Session database not available."

        source = event.source
-        session_key = build_session_key(source)
+        session_key = self._session_key_for_source(source)
        name = event.get_command_args().strip()

        if not name:
@@ -3127,7 +3283,9 @@ class GatewayRunner:

        # Flush memories for current session before switching
        try:
-            asyncio.create_task(self._async_flush_memories(current_entry.session_id))
+            asyncio.create_task(
+                self._async_flush_memories(current_entry.session_id, session_key)
+            )
        except Exception as e:
            logger.debug("Memory flush on resume failed: %s", e)

@@ -3155,7 +3313,7 @@ class GatewayRunner:
    async def _handle_usage_command(self, event: MessageEvent) -> str:
        """Handle /usage command -- show token usage for the session's last agent run."""
        source = event.source
-        session_key = build_session_key(source)
+        session_key = self._session_key_for_source(source)

        agent = self._running_agents.get(session_key)
        if agent and hasattr(agent, "session_total_tokens") and agent.session_api_calls > 0:
@@ -3530,13 +3688,9 @@ class GatewayRunner:
          1. Immediately understand what the user sent (no extra tool call).
          2. Re-examine the image with vision_analyze if it needs more detail.

-        Athabasca persistence should happen through Athabasca's own POST
-        /api/uploads flow, using the returned asset.publicUrl rather than local
-        cache paths.
-
        Args:
-            user_text:      The user's original caption / message text.
-            image_paths:    List of local file paths to cached images.
+            user_text:   The user's original caption / message text.
+            image_paths: List of local file paths to cached images.

        Returns:
            The enriched message string with vision descriptions prepended.
@@ -3561,16 +3715,10 @@ class GatewayRunner:
                result = _json.loads(result_json)
                if result.get("success"):
                    description = result.get("analysis", "")
-                    athabasca_note = (
-                        "\n[If this image needs to persist in Athabasca state, upload the cached file "
-                        "through Athabasca POST /api/uploads and use the returned asset.publicUrl. "
-                        "Do not store the local cache path as the canonical imageUrl.]"
-                    )
                    enriched_parts.append(
                        f"[The user sent an image~ Here's what I can see:\n{description}]\n"
                        f"[If you need a closer look, use vision_analyze with "
                        f"image_url: {path} ~]"
-                        f"{athabasca_note}"
                    )
                else:
                    enriched_parts.append(
@@ -3634,7 +3782,10 @@ class GatewayRunner:
                    )
                else:
                    error = result.get("error", "unknown error")
-                    if "No STT provider" in error or "not set" in error:
+                    if (
+                        "No STT provider" in error
+                        or error.startswith("Neither VOICE_TOOLS_OPENAI_KEY nor OPENAI_API_KEY is set")
+                    ):
                        enriched_parts.append(
                            "[The user sent a voice message but I can't listen "
                            "to it right now~ No STT provider is configured "
@@ -3679,6 +3830,7 @@ class GatewayRunner:
        session_key = watcher.get("session_key", "")
        platform_name = watcher.get("platform", "")
        chat_id = watcher.get("chat_id", "")
+        thread_id = watcher.get("thread_id", "")
        notify_mode = self._load_background_notifications_mode()

        logger.debug("Process watcher started: %s (every %ss, notify=%s)",
@@ -3726,7 +3878,8 @@ class GatewayRunner:
                            break
                    if adapter and chat_id:
                        try:
-                            await adapter.send(chat_id, message_text)
+                            send_meta = {"thread_id": thread_id} if thread_id else None
+                            await adapter.send(chat_id, message_text, metadata=send_meta)
                        except Exception as e:
                            logger.error("Watcher delivery error: %s", e)
                break
@@ -3745,7 +3898,8 @@ class GatewayRunner:
                        break
                if adapter and chat_id:
                    try:
-                        await adapter.send(chat_id, message_text)
+                        send_meta = {"thread_id": thread_id} if thread_id else None
+                        await adapter.send(chat_id, message_text, metadata=send_meta)
                    except Exception as e:
                        logger.error("Watcher delivery error: %s", e)

@@ -3984,6 +4138,7 @@ class GatewayRunner:
        agent_holder = [None]  # Mutable container for the agent instance
        result_holder = [None]  # Mutable container for the result
        tools_holder = [None]   # Mutable container for the tool definitions
+        stream_consumer_holder = [None]  # Mutable container for stream consumer
        
        # Bridge sync step_callback → async hooks.emit for agent:step events
        _loop_for_step = asyncio.get_event_loop()
@@ -4046,9 +4201,39 @@ class GatewayRunner:
            honcho_manager, honcho_config = self._get_or_create_gateway_honcho(session_key)
            reasoning_config = self._load_reasoning_config()
            self._reasoning_config = reasoning_config
+            # Set up streaming consumer if enabled
+            _stream_consumer = None
+            _stream_delta_cb = None
+            _scfg = getattr(getattr(self, 'config', None), 'streaming', None)
+            if _scfg is None:
+                from gateway.config import StreamingConfig
+                _scfg = StreamingConfig()
+
+            if _scfg.enabled and _scfg.transport != "off":
+                try:
+                    from gateway.stream_consumer import GatewayStreamConsumer, StreamConsumerConfig
+                    _adapter = self.adapters.get(source.platform)
+                    if _adapter:
+                        _consumer_cfg = StreamConsumerConfig(
+                            edit_interval=_scfg.edit_interval,
+                            buffer_threshold=_scfg.buffer_threshold,
+                            cursor=_scfg.cursor,
+                        )
+                        _stream_consumer = GatewayStreamConsumer(
+                            adapter=_adapter,
+                            chat_id=source.chat_id,
+                            config=_consumer_cfg,
+                            metadata={"thread_id": source.thread_id} if source.thread_id else None,
+                        )
+                        _stream_delta_cb = _stream_consumer.on_delta
+                        stream_consumer_holder[0] = _stream_consumer
+                except Exception as _sc_err:
+                    logger.debug("Could not set up stream consumer: %s", _sc_err)
+
+            turn_route = self._resolve_turn_agent_config(message, model, runtime_kwargs)
            agent = AIAgent(
-                model=model,
-                **runtime_kwargs,
+                model=turn_route["model"],
+                **turn_route["runtime"],
                max_iterations=max_iterations,
                quiet_mode=True,
                verbose_logging=False,
@@ -4065,6 +4250,7 @@ class GatewayRunner:
                session_id=session_id,
                tool_progress_callback=progress_callback if tool_progress_enabled else None,
                step_callback=_step_callback_sync if _hooks_ref.loaded_hooks else None,
+                stream_delta_callback=_stream_delta_cb,
                platform=platform_key,
                honcho_session_key=session_key,
                honcho_manager=honcho_manager,
@@ -4135,6 +4321,10 @@ class GatewayRunner:
            
            result = agent.run_conversation(message, conversation_history=agent_history, task_id=session_id)
            result_holder[0] = result
+
+            # Signal the stream consumer that the agent is done
+            if _stream_consumer is not None:
+                _stream_consumer.finish()
            
            # Return final response, or a message if something went wrong
            final_response = result.get("final_response")
@@ -4234,6 +4424,20 @@ class GatewayRunner:
        progress_task = None
        if tool_progress_enabled:
            progress_task = asyncio.create_task(send_progress_messages())
+
+        # Start stream consumer task — polls for consumer creation since it
+        # happens inside run_sync (thread pool) after the agent is constructed.
+        stream_task = None
+
+        async def _start_stream_consumer():
+            """Wait for the stream consumer to be created, then run it."""
+            for _ in range(200):  # Up to 10s wait
+                if stream_consumer_holder[0] is not None:
+                    await stream_consumer_holder[0].run()
+                    return
+                await asyncio.sleep(0.05)
+
+        stream_task = asyncio.create_task(_start_stream_consumer())
        
        # Track this agent as running for this session (for interrupt support)
        # We do this in a callback after the agent is created
@@ -4316,6 +4520,17 @@ class GatewayRunner:
            if progress_task:
                progress_task.cancel()
            interrupt_monitor.cancel()
+
+            # Wait for stream consumer to finish its final edit
+            if stream_task:
+                try:
+                    await asyncio.wait_for(stream_task, timeout=5.0)
+                except (asyncio.TimeoutError, asyncio.CancelledError):
+                    stream_task.cancel()
+                    try:
+                        await stream_task
+                    except asyncio.CancelledError:
+                        pass
            
            # Clean up tracking
            tracking_task.cancel()
@@ -4329,6 +4544,12 @@ class GatewayRunner:
                        await task
                    except asyncio.CancelledError:
                        pass
+
+        # If streaming already delivered the response, mark it so the
+        # caller's send() is skipped (avoiding duplicate messages).
+        _sc = stream_consumer_holder[0]
+        if _sc and _sc.already_sent and isinstance(response, dict):
+            response["already_sent"] = True
        
        return response

@@ -8,9 +8,11 @@ Handles:
 - Dynamic system prompt injection (agent knows its context)
 """

+import hashlib
 import logging
 import os
 import json
+import re
 import uuid
 from pathlib import Path
 from datetime import datetime, timedelta
@@ -19,6 +21,41 @@ from typing import Dict, List, Optional, Any

 logger = logging.getLogger(__name__)

+
+# ---------------------------------------------------------------------------
+# PII redaction helpers
+# ---------------------------------------------------------------------------
+
+_PHONE_RE = re.compile(r"^\+?\d[\d\-\s]{6,}$")
+
+
+def _hash_id(value: str) -> str:
+    """Deterministic 12-char hex hash of an identifier."""
+    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
+
+
+def _hash_sender_id(value: str) -> str:
+    """Hash a sender ID to ``user_<12hex>``."""
+    return f"user_{_hash_id(value)}"
+
+
+def _hash_chat_id(value: str) -> str:
+    """Hash the numeric portion of a chat ID, preserving platform prefix.
+
+    ``telegram:12345`` → ``telegram:<hash>``
+    ``12345``          → ``<hash>``
+    """
+    colon = value.find(":")
+    if colon > 0:
+        prefix = value[:colon]
+        return f"{prefix}:{_hash_id(value[colon + 1:])}"
+    return _hash_id(value)
+
+
+def _looks_like_phone(value: str) -> bool:
+    """Return True if *value* looks like a phone number (E.164 or similar)."""
+    return bool(_PHONE_RE.match(value.strip()))
+
 from .config import (
    Platform,
    GatewayConfig,
@@ -146,7 +183,21 @@ class SessionContext:
        }


-def build_session_context_prompt(context: SessionContext) -> str:
+_PII_SAFE_PLATFORMS = frozenset({
+    Platform.WHATSAPP,
+    Platform.SIGNAL,
+    Platform.TELEGRAM,
+})
+"""Platforms where user IDs can be safely redacted (no in-message mention system
+that requires raw IDs).  Discord is excluded because mentions use ``<@user_id>``
+and the LLM needs the real ID to tag users."""
+
+
+def build_session_context_prompt(
+    context: SessionContext,
+    *,
+    redact_pii: bool = False,
+) -> str:
    """
    Build the dynamic system prompt section that tells the agent about its context.
    
@@ -154,7 +205,15 @@ def build_session_context_prompt(context: SessionContext) -> str:
    - Where messages are coming from
    - What platforms are connected
    - Where it can deliver scheduled task outputs
+
+    When *redact_pii* is True **and** the source platform is in
+    ``_PII_SAFE_PLATFORMS``, phone numbers are stripped and user/chat IDs
+    are replaced with deterministic hashes before being sent to the LLM.
+    Platforms like Discord are excluded because mentions need real IDs.
+    Routing still uses the original values (they stay in SessionSource).
    """
+    # Only apply redaction on platforms where IDs aren't needed for mentions
+    redact_pii = redact_pii and context.source.platform in _PII_SAFE_PLATFORMS
    lines = [
        "## Current Session Context",
        "",
@@ -165,7 +224,25 @@ def build_session_context_prompt(context: SessionContext) -> str:
    if context.source.platform == Platform.LOCAL:
        lines.append(f"**Source:** {platform_name} (the machine running this agent)")
    else:
-        lines.append(f"**Source:** {platform_name} ({context.source.description})")
+        # Build a description that respects PII redaction
+        src = context.source
+        if redact_pii:
+            # Build a safe description without raw IDs
+            _uname = src.user_name or (
+                _hash_sender_id(src.user_id) if src.user_id else "user"
+            )
+            _cname = src.chat_name or _hash_chat_id(src.chat_id)
+            if src.chat_type == "dm":
+                desc = f"DM with {_uname}"
+            elif src.chat_type == "group":
+                desc = f"group: {_cname}"
+            elif src.chat_type == "channel":
+                desc = f"channel: {_cname}"
+            else:
+                desc = _cname
+        else:
+            desc = src.description
+        lines.append(f"**Source:** {platform_name} ({desc})")
    
    # Channel topic (if available - provides context about the channel's purpose)
    if context.source.chat_topic:
@@ -175,7 +252,10 @@ def build_session_context_prompt(context: SessionContext) -> str:
    if context.source.user_name:
        lines.append(f"**User:** {context.source.user_name}")
    elif context.source.user_id:
-        lines.append(f"**User ID:** {context.source.user_id}")
+        uid = context.source.user_id
+        if redact_pii:
+            uid = _hash_sender_id(uid)
+        lines.append(f"**User ID:** {uid}")
    
    # Platform-specific behavioral notes
    if context.source.platform == Platform.SLACK:
@@ -210,7 +290,8 @@ def build_session_context_prompt(context: SessionContext) -> str:
        lines.append("")
        lines.append("**Home Channels (default destinations):**")
        for platform, home in context.home_channels.items():
-            lines.append(f"  - {platform.value}: {home.name} (ID: {home.chat_id})")
+            hc_id = _hash_chat_id(home.chat_id) if redact_pii else home.chat_id
+            lines.append(f"  - {platform.value}: {home.name} (ID: {hc_id})")
    
    # Delivery options for scheduled tasks
    lines.append("")
@@ -220,7 +301,10 @@ def build_session_context_prompt(context: SessionContext) -> str:
    if context.source.platform == Platform.LOCAL:
        lines.append("- `\"origin\"` → Local output (saved to files)")
    else:
-        lines.append(f"- `\"origin\"` → Back to this chat ({context.source.chat_name or context.source.chat_id})")
+        _origin_label = context.source.chat_name or (
+            _hash_chat_id(context.source.chat_id) if redact_pii else context.source.chat_id
+        )
+        lines.append(f"- `\"origin\"` → Back to this chat ({_origin_label})")
    
    # Local always available
    lines.append("- `\"local\"` → Save to local files only (~/.hermes/cron/output/)")
@@ -315,7 +399,7 @@ class SessionEntry:
        )


-def build_session_key(source: SessionSource) -> str:
+def build_session_key(source: SessionSource, group_sessions_per_user: bool = True) -> str:
    """Build a deterministic session key from a message source.

    This is the single source of truth for session key construction.
@@ -328,7 +412,11 @@ def build_session_key(source: SessionSource) -> str:

    Group/channel rules:
      - chat_id identifies the parent group/channel.
+      - user_id/user_id_alt isolates participants within that parent chat when available when
+        ``group_sessions_per_user`` is enabled.
      - thread_id differentiates threads within that parent chat.
+      - Without participant identifiers, or when isolation is disabled, messages fall back to one
+        shared session per chat.
      - Without identifiers, messages fall back to one session per platform/chat_type.
    """
    platform = source.platform.value
@@ -340,13 +428,18 @@ def build_session_key(source: SessionSource) -> str:
        if source.thread_id:
            return f"agent:main:{platform}:dm:{source.thread_id}"
        return f"agent:main:{platform}:dm"
+
+    participant_id = source.user_id_alt or source.user_id
+    key_parts = ["agent:main", platform, source.chat_type]
+
    if source.chat_id:
-        if source.thread_id:
-            return f"agent:main:{platform}:{source.chat_type}:{source.chat_id}:{source.thread_id}"
-        return f"agent:main:{platform}:{source.chat_type}:{source.chat_id}"
+        key_parts.append(source.chat_id)
    if source.thread_id:
-        return f"agent:main:{platform}:{source.chat_type}:{source.thread_id}"
-    return f"agent:main:{platform}:{source.chat_type}"
+        key_parts.append(source.thread_id)
+    if group_sessions_per_user and participant_id:
+        key_parts.append(str(participant_id))
+
+    return ":".join(key_parts)


 class SessionStore:
@@ -425,7 +518,10 @@ class SessionStore:
    
    def _generate_session_key(self, source: SessionSource) -> str:
        """Generate a session key from a source."""
-        return build_session_key(source)
+        return build_session_key(
+            source,
+            group_sessions_per_user=getattr(self.config, "group_sessions_per_user", True),
+        )
    
    def _is_session_expired(self, entry: SessionEntry) -> bool:
        """Check if a session has expired based on its reset policy.
@@ -83,8 +83,7 @@ def _looks_like_gateway_process(pid: int) -> bool:
    """Return True when the live PID still looks like the Hermes gateway."""
    cmdline = _read_process_cmdline(pid)
    if not cmdline:
-        # If we cannot inspect the process, fall back to the liveness check.
-        return True
+        return False

    patterns = (
        "hermes_cli.main gateway",
@@ -94,6 +93,24 @@ def _looks_like_gateway_process(pid: int) -> bool:
    return any(pattern in cmdline for pattern in patterns)


+def _record_looks_like_gateway(record: dict[str, Any]) -> bool:
+    """Validate gateway identity from PID-file metadata when cmdline is unavailable."""
+    if record.get("kind") != _GATEWAY_KIND:
+        return False
+
+    argv = record.get("argv")
+    if not isinstance(argv, list) or not argv:
+        return False
+
+    cmdline = " ".join(str(part) for part in argv)
+    patterns = (
+        "hermes_cli.main gateway",
+        "hermes gateway",
+        "gateway/run.py",
+    )
+    return any(pattern in cmdline for pattern in patterns)
+
+
 def _build_pid_record() -> dict:
    return {
        "pid": os.getpid(),
@@ -325,8 +342,9 @@ def get_running_pid() -> Optional[int]:
        return None

    if not _looks_like_gateway_process(pid):
-        remove_pid_file()
-        return None
+        if not _record_looks_like_gateway(record):
+            remove_pid_file()
+            return None

    return pid

@@ -0,0 +1,177 @@
+"""Gateway streaming consumer — bridges sync agent callbacks to async platform delivery.
+
+The agent fires stream_delta_callback(text) synchronously from its worker thread.
+GatewayStreamConsumer:
+  1. Receives deltas via on_delta() (thread-safe, sync)
+  2. Queues them to an asyncio task via queue.Queue
+  3. The async run() task buffers, rate-limits, and progressively edits
+     a single message on the target platform
+
+Design: Uses the edit transport (send initial message, then editMessageText).
+This is universally supported across Telegram, Discord, and Slack.
+
+Credit: jobless0x (#774, #1312), OutThisLife (#798), clicksingh (#697).
+"""
+
+from __future__ import annotations
+
+import asyncio
+import logging
+import queue
+import time
+from dataclasses import dataclass
+from typing import Any, Optional
+
+logger = logging.getLogger("gateway.stream_consumer")
+
+# Sentinel to signal the stream is complete
+_DONE = object()
+
+
+@dataclass
+class StreamConsumerConfig:
+    """Runtime config for a single stream consumer instance."""
+    edit_interval: float = 0.3
+    buffer_threshold: int = 40
+    cursor: str = " ▉"
+
+
+class GatewayStreamConsumer:
+    """Async consumer that progressively edits a platform message with streamed tokens.
+
+    Usage::
+
+        consumer = GatewayStreamConsumer(adapter, chat_id, config, metadata=metadata)
+        # Pass consumer.on_delta as stream_delta_callback to AIAgent
+        agent = AIAgent(..., stream_delta_callback=consumer.on_delta)
+        # Start the consumer as an asyncio task
+        task = asyncio.create_task(consumer.run())
+        # ... run agent in thread pool ...
+        consumer.finish()  # signal completion
+        await task         # wait for final edit
+    """
+
+    def __init__(
+        self,
+        adapter: Any,
+        chat_id: str,
+        config: Optional[StreamConsumerConfig] = None,
+        metadata: Optional[dict] = None,
+    ):
+        self.adapter = adapter
+        self.chat_id = chat_id
+        self.cfg = config or StreamConsumerConfig()
+        self.metadata = metadata
+        self._queue: queue.Queue = queue.Queue()
+        self._accumulated = ""
+        self._message_id: Optional[str] = None
+        self._already_sent = False
+        self._edit_supported = True  # Disabled on first edit failure (Signal/Email/HA)
+        self._last_edit_time = 0.0
+
+    @property
+    def already_sent(self) -> bool:
+        """True if at least one message was sent/edited — signals the base
+        adapter to skip re-sending the final response."""
+        return self._already_sent
+
+    def on_delta(self, text: str) -> None:
+        """Thread-safe callback — called from the agent's worker thread."""
+        if text:
+            self._queue.put(text)
+
+    def finish(self) -> None:
+        """Signal that the stream is complete."""
+        self._queue.put(_DONE)
+
+    async def run(self) -> None:
+        """Async task that drains the queue and edits the platform message."""
+        try:
+            while True:
+                # Drain all available items from the queue
+                got_done = False
+                while True:
+                    try:
+                        item = self._queue.get_nowait()
+                        if item is _DONE:
+                            got_done = True
+                            break
+                        self._accumulated += item
+                    except queue.Empty:
+                        break
+
+                # Decide whether to flush an edit
+                now = time.monotonic()
+                elapsed = now - self._last_edit_time
+                should_edit = (
+                    got_done
+                    or (elapsed >= self.cfg.edit_interval
+                        and len(self._accumulated) > 0)
+                    or len(self._accumulated) >= self.cfg.buffer_threshold
+                )
+
+                if should_edit and self._accumulated:
+                    display_text = self._accumulated
+                    if not got_done:
+                        display_text += self.cfg.cursor
+
+                    await self._send_or_edit(display_text)
+                    self._last_edit_time = time.monotonic()
+
+                if got_done:
+                    # Final edit without cursor
+                    if self._accumulated and self._message_id:
+                        await self._send_or_edit(self._accumulated)
+                    return
+
+                await asyncio.sleep(0.05)  # Small yield to not busy-loop
+
+        except asyncio.CancelledError:
+            # Best-effort final edit on cancellation
+            if self._accumulated and self._message_id:
+                try:
+                    await self._send_or_edit(self._accumulated)
+                except Exception:
+                    pass
+        except Exception as e:
+            logger.error("Stream consumer error: %s", e)
+
+    async def _send_or_edit(self, text: str) -> None:
+        """Send or edit the streaming message."""
+        try:
+            if self._message_id is not None:
+                if self._edit_supported:
+                    # Edit existing message
+                    result = await self.adapter.edit_message(
+                        chat_id=self.chat_id,
+                        message_id=self._message_id,
+                        content=text,
+                    )
+                    if result.success:
+                        self._already_sent = True
+                    else:
+                        # Edit not supported by this adapter — stop streaming,
+                        # let the normal send path handle the final response.
+                        # Without this guard, adapters like Signal/Email would
+                        # flood the chat with a new message every edit_interval.
+                        logger.debug("Edit failed, disabling streaming for this adapter")
+                        self._edit_supported = False
+                else:
+                    # Editing not supported — skip intermediate updates.
+                    # The final response will be sent by the normal path.
+                    pass
+            else:
+                # First message — send new
+                result = await self.adapter.send(
+                    chat_id=self.chat_id,
+                    content=text,
+                    metadata=self.metadata,
+                )
+                if result.success and result.message_id:
+                    self._message_id = result.message_id
+                    self._already_sent = True
+                else:
+                    # Initial send failed — disable streaming for this session
+                    self._edit_supported = False
+        except Exception as e:
+            logger.error("Stream send/edit error: %s", e)
@@ -11,5 +11,5 @@ Provides subcommands for:
 - hermes cron          - Manage cron jobs
 """

-__version__ = "0.2.0"
-__release_date__ = "2026.3.12"
+__version__ = "0.3.0"
+__release_date__ = "2026.3.17"
@@ -147,6 +147,22 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
        api_key_env_vars=("MINIMAX_CN_API_KEY",),
        base_url_env_var="MINIMAX_CN_BASE_URL",
    ),
+    "deepseek": ProviderConfig(
+        id="deepseek",
+        name="DeepSeek",
+        auth_type="api_key",
+        inference_base_url="https://api.deepseek.com/v1",
+        api_key_env_vars=("DEEPSEEK_API_KEY",),
+        base_url_env_var="DEEPSEEK_BASE_URL",
+    ),
+    "ai-gateway": ProviderConfig(
+        id="ai-gateway",
+        name="AI Gateway",
+        auth_type="api_key",
+        inference_base_url="https://ai-gateway.vercel.sh/v1",
+        api_key_env_vars=("AI_GATEWAY_API_KEY",),
+        base_url_env_var="AI_GATEWAY_BASE_URL",
+    ),
 }


@@ -524,6 +540,7 @@ def resolve_provider(
        "kimi": "kimi-coding", "moonshot": "kimi-coding",
        "minimax-china": "minimax-cn", "minimax_cn": "minimax-cn",
        "claude": "anthropic", "claude-code": "anthropic",
+        "aigateway": "ai-gateway", "vercel": "ai-gateway", "vercel-ai-gateway": "ai-gateway",
    }
    normalized = _PROVIDER_ALIASES.get(normalized, normalized)

@@ -1,68 +1,240 @@
 """Slash command definitions and autocomplete for the Hermes CLI.

-Contains the shared built-in ``COMMANDS`` dict and ``SlashCommandCompleter``.
-The completer can optionally include dynamic skill slash commands supplied by the
-interactive CLI.
+Central registry for all slash commands. Every consumer -- CLI help, gateway
+dispatch, Telegram BotCommands, Slack subcommand mapping, autocomplete --
+derives its data from ``COMMAND_REGISTRY``.
+
+To add a command: add a ``CommandDef`` entry to ``COMMAND_REGISTRY``.
+To add an alias: set ``aliases=("short",)`` on the existing ``CommandDef``.
 """

 from __future__ import annotations

+import os
 from collections.abc import Callable, Mapping
+from dataclasses import dataclass, field
+from pathlib import Path
 from typing import Any

 from prompt_toolkit.completion import Completer, Completion


-# Commands organized by category for better help display
-COMMANDS_BY_CATEGORY = {
-    "Session": {
-        "/new": "Start a new session (fresh session ID + history)",
-        "/reset": "Start a new session (alias for /new)",
-        "/clear": "Clear screen and start a new session",
-        "/history": "Show conversation history",
-        "/save": "Save the current conversation",
-        "/retry": "Retry the last message (resend to agent)",
-        "/undo": "Remove the last user/assistant exchange",
-        "/title": "Set a title for the current session (usage: /title My Session Name)",
-        "/compress": "Manually compress conversation context (flush memories + summarize)",
-        "/rollback": "List or restore filesystem checkpoints (usage: /rollback [number])",
-        "/background": "Run a prompt in the background (usage: /background <prompt>)",
-    },
-    "Configuration": {
-        "/config": "Show current configuration",
-        "/model": "Show or change the current model",
-        "/provider": "Show available providers and current provider",
-        "/prompt": "View/set custom system prompt",
-        "/personality": "Set a predefined personality",
-        "/verbose": "Cycle tool progress display: off → new → all → verbose",
-        "/reasoning": "Manage reasoning effort and display (usage: /reasoning [level|show|hide])",
-        "/skin": "Show or change the display skin/theme",
-        "/voice": "Toggle voice mode (Ctrl+B to record). Usage: /voice [on|off|tts|status]",
-    },
-    "Tools & Skills": {
-        "/tools": "List available tools",
-        "/toolsets": "List available toolsets",
-        "/skills": "Search, install, inspect, or manage skills from online registries",
-        "/cron": "Manage scheduled tasks (list, add/create, edit, pause, resume, run, remove)",
-        "/reload-mcp": "Reload MCP servers from config.yaml",
-    },
-    "Info": {
-        "/help": "Show this help message",
-        "/usage": "Show token usage for the current session",
-        "/insights": "Show usage insights and analytics (last 30 days)",
-        "/platforms": "Show gateway/messaging platform status",
-        "/paste": "Check clipboard for an image and attach it",
-    },
-    "Exit": {
-        "/quit": "Exit the CLI (also: /exit, /q)",
-    },
-}
+# ---------------------------------------------------------------------------
+# CommandDef dataclass
+# ---------------------------------------------------------------------------

-# Flat dict for backwards compatibility and autocomplete
-COMMANDS = {}
-for category_commands in COMMANDS_BY_CATEGORY.values():
-    COMMANDS.update(category_commands)
+@dataclass(frozen=True)
+class CommandDef:
+    """Definition of a single slash command."""

+    name: str                          # canonical name without slash: "background"
+    description: str                   # human-readable description
+    category: str                      # "Session", "Configuration", etc.
+    aliases: tuple[str, ...] = ()      # alternative names: ("bg",)
+    args_hint: str = ""                # argument placeholder: "<prompt>", "[name]"
+    cli_only: bool = False             # only available in CLI
+    gateway_only: bool = False         # only available in gateway/messaging
+
+
+# ---------------------------------------------------------------------------
+# Central registry -- single source of truth
+# ---------------------------------------------------------------------------
+
+COMMAND_REGISTRY: list[CommandDef] = [
+    # Session
+    CommandDef("new", "Start a new session (fresh session ID + history)", "Session",
+               aliases=("reset",)),
+    CommandDef("clear", "Clear screen and start a new session", "Session",
+               cli_only=True),
+    CommandDef("history", "Show conversation history", "Session",
+               cli_only=True),
+    CommandDef("save", "Save the current conversation", "Session",
+               cli_only=True),
+    CommandDef("retry", "Retry the last message (resend to agent)", "Session"),
+    CommandDef("undo", "Remove the last user/assistant exchange", "Session"),
+    CommandDef("title", "Set a title for the current session", "Session",
+               args_hint="[name]"),
+    CommandDef("compress", "Manually compress conversation context", "Session"),
+    CommandDef("rollback", "List or restore filesystem checkpoints", "Session",
+               args_hint="[number]"),
+    CommandDef("stop", "Kill all running background processes", "Session"),
+    CommandDef("background", "Run a prompt in the background", "Session",
+               aliases=("bg",), args_hint="<prompt>"),
+    CommandDef("status", "Show session info", "Session",
+               gateway_only=True),
+    CommandDef("sethome", "Set this chat as the home channel", "Session",
+               gateway_only=True, aliases=("set-home",)),
+    CommandDef("resume", "Resume a previously-named session", "Session",
+               args_hint="[name]"),
+
+    # Configuration
+    CommandDef("config", "Show current configuration", "Configuration",
+               cli_only=True),
+    CommandDef("model", "Show or change the current model", "Configuration",
+               args_hint="[name]"),
+    CommandDef("provider", "Show available providers and current provider",
+               "Configuration"),
+    CommandDef("prompt", "View/set custom system prompt", "Configuration",
+               cli_only=True, args_hint="[text]"),
+    CommandDef("personality", "Set a predefined personality", "Configuration",
+               args_hint="[name]"),
+    CommandDef("verbose", "Cycle tool progress display: off -> new -> all -> verbose",
+               "Configuration", cli_only=True),
+    CommandDef("reasoning", "Manage reasoning effort and display", "Configuration",
+               args_hint="[level|show|hide]"),
+    CommandDef("skin", "Show or change the display skin/theme", "Configuration",
+               cli_only=True, args_hint="[name]"),
+    CommandDef("voice", "Toggle voice mode", "Configuration",
+               args_hint="[on|off|tts|status]"),
+
+    # Tools & Skills
+    CommandDef("tools", "List available tools", "Tools & Skills",
+               cli_only=True),
+    CommandDef("toolsets", "List available toolsets", "Tools & Skills",
+               cli_only=True),
+    CommandDef("skills", "Search, install, inspect, or manage skills",
+               "Tools & Skills", cli_only=True),
+    CommandDef("cron", "Manage scheduled tasks", "Tools & Skills",
+               cli_only=True, args_hint="[subcommand]"),
+    CommandDef("reload-mcp", "Reload MCP servers from config", "Tools & Skills",
+               aliases=("reload_mcp",)),
+    CommandDef("plugins", "List installed plugins and their status",
+               "Tools & Skills", cli_only=True),
+
+    # Info
+    CommandDef("help", "Show available commands", "Info"),
+    CommandDef("usage", "Show token usage for the current session", "Info"),
+    CommandDef("insights", "Show usage insights and analytics", "Info",
+               args_hint="[days]"),
+    CommandDef("platforms", "Show gateway/messaging platform status", "Info",
+               cli_only=True, aliases=("gateway",)),
+    CommandDef("paste", "Check clipboard for an image and attach it", "Info",
+               cli_only=True),
+    CommandDef("update", "Update Hermes Agent to the latest version", "Info",
+               gateway_only=True),
+
+    # Exit
+    CommandDef("quit", "Exit the CLI", "Exit",
+               cli_only=True, aliases=("exit", "q")),
+]
+
+
+# ---------------------------------------------------------------------------
+# Derived lookups -- rebuilt once at import time
+# ---------------------------------------------------------------------------
+
+def _build_command_lookup() -> dict[str, CommandDef]:
+    """Map every name and alias to its CommandDef."""
+    lookup: dict[str, CommandDef] = {}
+    for cmd in COMMAND_REGISTRY:
+        lookup[cmd.name] = cmd
+        for alias in cmd.aliases:
+            lookup[alias] = cmd
+    return lookup
+
+
+_COMMAND_LOOKUP: dict[str, CommandDef] = _build_command_lookup()
+
+
+def resolve_command(name: str) -> CommandDef | None:
+    """Resolve a command name or alias to its CommandDef.
+
+    Accepts names with or without the leading slash.
+    """
+    return _COMMAND_LOOKUP.get(name.lower().lstrip("/"))
+
+
+def _build_description(cmd: CommandDef) -> str:
+    """Build a CLI-facing description string including usage hint."""
+    if cmd.args_hint:
+        return f"{cmd.description} (usage: /{cmd.name} {cmd.args_hint})"
+    return cmd.description
+
+
+# Backwards-compatible flat dict: "/command" -> description
+COMMANDS: dict[str, str] = {}
+for _cmd in COMMAND_REGISTRY:
+    if not _cmd.gateway_only:
+        COMMANDS[f"/{_cmd.name}"] = _build_description(_cmd)
+        for _alias in _cmd.aliases:
+            COMMANDS[f"/{_alias}"] = f"{_cmd.description} (alias for /{_cmd.name})"
+
+# Backwards-compatible categorized dict
+COMMANDS_BY_CATEGORY: dict[str, dict[str, str]] = {}
+for _cmd in COMMAND_REGISTRY:
+    if not _cmd.gateway_only:
+        _cat = COMMANDS_BY_CATEGORY.setdefault(_cmd.category, {})
+        _cat[f"/{_cmd.name}"] = COMMANDS[f"/{_cmd.name}"]
+        for _alias in _cmd.aliases:
+            _cat[f"/{_alias}"] = COMMANDS[f"/{_alias}"]
+
+
+# ---------------------------------------------------------------------------
+# Gateway helpers
+# ---------------------------------------------------------------------------
+
+# Set of all command names + aliases recognized by the gateway
+GATEWAY_KNOWN_COMMANDS: frozenset[str] = frozenset(
+    name
+    for cmd in COMMAND_REGISTRY
+    if not cmd.cli_only
+    for name in (cmd.name, *cmd.aliases)
+)
+
+
+def gateway_help_lines() -> list[str]:
+    """Generate gateway help text lines from the registry."""
+    lines: list[str] = []
+    for cmd in COMMAND_REGISTRY:
+        if cmd.cli_only:
+            continue
+        args = f" {cmd.args_hint}" if cmd.args_hint else ""
+        alias_parts: list[str] = []
+        for a in cmd.aliases:
+            # Skip internal aliases like reload_mcp (underscore variant)
+            if a.replace("-", "_") == cmd.name.replace("-", "_") and a != cmd.name:
+                continue
+            alias_parts.append(f"`/{a}`")
+        alias_note = f" (alias: {', '.join(alias_parts)})" if alias_parts else ""
+        lines.append(f"`/{cmd.name}{args}` -- {cmd.description}{alias_note}")
+    return lines
+
+
+def telegram_bot_commands() -> list[tuple[str, str]]:
+    """Return (command_name, description) pairs for Telegram setMyCommands.
+
+    Telegram command names cannot contain hyphens, so they are replaced with
+    underscores.  Aliases are skipped -- Telegram shows one menu entry per
+    canonical command.
+    """
+    result: list[tuple[str, str]] = []
+    for cmd in COMMAND_REGISTRY:
+        if cmd.cli_only:
+            continue
+        tg_name = cmd.name.replace("-", "_")
+        result.append((tg_name, cmd.description))
+    return result
+
+
+def slack_subcommand_map() -> dict[str, str]:
+    """Return subcommand -> /command mapping for Slack /hermes handler.
+
+    Maps both canonical names and aliases so /hermes bg do stuff works
+    the same as /hermes background do stuff.
+    """
+    mapping: dict[str, str] = {}
+    for cmd in COMMAND_REGISTRY:
+        if cmd.cli_only:
+            continue
+        mapping[cmd.name] = f"/{cmd.name}"
+        for alias in cmd.aliases:
+            mapping[alias] = f"/{alias}"
+    return mapping
+
+
+# ---------------------------------------------------------------------------
+# Autocomplete
+# ---------------------------------------------------------------------------

 class SlashCommandCompleter(Completer):
    """Autocomplete for built-in slash commands and optional skill commands."""
@@ -92,9 +264,88 @@ class SlashCommandCompleter(Completer):
        """
        return f"{cmd_name} " if cmd_name == word else cmd_name

+    @staticmethod
+    def _extract_path_word(text: str) -> str | None:
+        """Extract the current word if it looks like a file path.
+
+        Returns the path-like token under the cursor, or None if the
+        current word doesn't look like a path.  A word is path-like when
+        it starts with ``./``, ``../``, ``~/``, ``/``, or contains a
+        ``/`` separator (e.g. ``src/main.py``).
+        """
+        if not text:
+            return None
+        # Walk backwards to find the start of the current "word".
+        # Words are delimited by spaces, but paths can contain almost anything.
+        i = len(text) - 1
+        while i >= 0 and text[i] != " ":
+            i -= 1
+        word = text[i + 1:]
+        if not word:
+            return None
+        # Only trigger path completion for path-like tokens
+        if word.startswith(("./", "../", "~/", "/")) or "/" in word:
+            return word
+        return None
+
+    @staticmethod
+    def _path_completions(word: str, limit: int = 30):
+        """Yield Completion objects for file paths matching *word*."""
+        expanded = os.path.expanduser(word)
+        # Split into directory part and prefix to match inside it
+        if expanded.endswith("/"):
+            search_dir = expanded
+            prefix = ""
+        else:
+            search_dir = os.path.dirname(expanded) or "."
+            prefix = os.path.basename(expanded)
+
+        try:
+            entries = os.listdir(search_dir)
+        except OSError:
+            return
+
+        count = 0
+        prefix_lower = prefix.lower()
+        for entry in sorted(entries):
+            if prefix and not entry.lower().startswith(prefix_lower):
+                continue
+            if count >= limit:
+                break
+
+            full_path = os.path.join(search_dir, entry)
+            is_dir = os.path.isdir(full_path)
+
+            # Build the completion text (what replaces the typed word)
+            if word.startswith("~"):
+                display_path = "~/" + os.path.relpath(full_path, os.path.expanduser("~"))
+            elif os.path.isabs(word):
+                display_path = full_path
+            else:
+                # Keep relative
+                display_path = os.path.relpath(full_path)
+
+            if is_dir:
+                display_path += "/"
+
+            suffix = "/" if is_dir else ""
+            meta = "dir" if is_dir else _file_size_label(full_path)
+
+            yield Completion(
+                display_path,
+                start_position=-len(word),
+                display=entry + suffix,
+                display_meta=meta,
+            )
+            count += 1
+
    def get_completions(self, document, complete_event):
        text = document.text_before_cursor
        if not text.startswith("/"):
+            # Try file path completion for non-slash input
+            path_word = self._extract_path_word(text)
+            if path_word is not None:
+                yield from self._path_completions(path_word)
            return

        word = text[1:]
@@ -120,3 +371,18 @@ class SlashCommandCompleter(Completer):
                    display=cmd,
                    display_meta=f"⚡ {short_desc}",
                )
+
+
+def _file_size_label(path: str) -> str:
+    """Return a compact human-readable file size, or '' on error."""
+    try:
+        size = os.path.getsize(path)
+    except OSError:
+        return ""
+    if size < 1024:
+        return f"{size}B"
+    if size < 1024 * 1024:
+        return f"{size / 1024:.0f}K"
+    if size < 1024 * 1024 * 1024:
+        return f"{size / (1024 * 1024):.1f}M"
+    return f"{size / (1024 * 1024 * 1024):.1f}G"
@@ -118,6 +118,14 @@ DEFAULT_CONFIG = {
        # Each entry is "host_path:container_path" (standard Docker -v syntax).
        # Example: ["/home/user/projects:/workspace/projects", "/data:/data"]
        "docker_volumes": [],
+        # Explicit opt-in: mount the host cwd into /workspace for Docker sessions.
+        # Default off because passing host directories into a sandbox weakens isolation.
+        "docker_mount_cwd_to_workspace": False,
+        # Persistent shell — keep a long-lived bash shell across execute() calls
+        # so cwd/env vars/shell variables survive between commands.
+        # Enabled by default for non-local backends (SSH); local is always opt-in
+        # via TERMINAL_LOCAL_PERSISTENT env var.
+        "persistent_shell": True,
    },
    
    "browser": {
@@ -129,7 +137,7 @@ DEFAULT_CONFIG = {
    # When enabled, the agent takes a snapshot of the working directory once per
    # conversation turn (on first write_file/patch call).  Use /rollback to restore.
    "checkpoints": {
-        "enabled": False,
+        "enabled": True,
        "max_snapshots": 50,  # Max checkpoints to keep per directory
    },
    
@@ -139,6 +147,12 @@ DEFAULT_CONFIG = {
        "summary_model": "google/gemini-3-flash-preview",
        "summary_provider": "auto",
    },
+    "smart_model_routing": {
+        "enabled": False,
+        "max_simple_chars": 160,
+        "max_simple_words": 28,
+        "cheap_model": {},
+    },
    
    # Auxiliary model config — provider:model for each side task.
    # Format: provider is the provider name, model is the model slug.
@@ -177,6 +191,12 @@ DEFAULT_CONFIG = {
            "base_url": "",
            "api_key": "",
        },
+        "approval": {
+            "provider": "auto",
+            "model": "",           # fast/cheap model recommended (e.g. gemini-flash, haiku)
+            "base_url": "",
+            "api_key": "",
+        },
        "mcp": {
            "provider": "auto",
            "model": "",
@@ -197,8 +217,15 @@ DEFAULT_CONFIG = {
        "resume_display": "full",
        "bell_on_complete": False,
        "show_reasoning": False,
+        "streaming": False,
+        "show_cost": False,       # Show $ cost in the status bar (off by default)
        "skin": "default",
    },
+
+    # Privacy settings
+    "privacy": {
+        "redact_pii": False,  # When True, hash user IDs and strip phone numbers from LLM context
+    },
    
    # Text-to-speech configuration
    "tts": {
@@ -283,6 +310,14 @@ DEFAULT_CONFIG = {
        "auto_thread": True,           # Auto-create threads on @mention in channels (like Slack)
    },

+    # Approval mode for dangerous commands:
+    #   manual — always prompt the user (default)
+    #   smart  — use auxiliary LLM to auto-approve low-risk commands, prompt for high-risk
+    #   off    — skip all approval prompts (equivalent to --yolo)
+    "approvals": {
+        "mode": "manual",
+    },
+
    # Permanently allowed dangerous command patterns (added via "always" approval)
    "command_allowlist": [],
    # User-defined quick commands that bypass the agent loop (type: exec only)
@@ -424,6 +459,20 @@ OPTIONAL_ENV_VARS = {
        "category": "provider",
        "advanced": True,
    },
+    "DEEPSEEK_API_KEY": {
+        "description": "DeepSeek API key for direct DeepSeek access",
+        "prompt": "DeepSeek API Key",
+        "url": "https://platform.deepseek.com/api_keys",
+        "password": True,
+        "category": "provider",
+    },
+    "DEEPSEEK_BASE_URL": {
+        "description": "Custom DeepSeek API base URL (advanced)",
+        "prompt": "DeepSeek Base URL",
+        "url": "",
+        "password": False,
+        "category": "provider",
+    },

    # ── Tool API keys ──
    "FIRECRAWL_API_KEY": {
@@ -968,6 +1017,19 @@ _FALLBACK_COMMENT = """
 # fallback_model:
 #   provider: openrouter
 #   model: anthropic/claude-sonnet-4
+#
+# ── Smart Model Routing ────────────────────────────────────────────────
+# Optional cheap-vs-strong routing for simple turns.
+# Keeps the primary model for complex work, but can route short/simple
+# messages to a cheaper model across providers.
+#
+# smart_model_routing:
+#   enabled: true
+#   max_simple_chars: 160
+#   max_simple_words: 28
+#   cheap_model:
+#     provider: openrouter
+#     model: google/gemini-2.5-flash
 """


@@ -998,6 +1060,19 @@ _COMMENTED_SECTIONS = """
 # fallback_model:
 #   provider: openrouter
 #   model: anthropic/claude-sonnet-4
+#
+# ── Smart Model Routing ────────────────────────────────────────────────
+# Optional cheap-vs-strong routing for simple turns.
+# Keeps the primary model for complex work, but can route short/simple
+# messages to a cheaper model across providers.
+#
+# smart_model_routing:
+#   enabled: true
+#   max_simple_chars: 160
+#   max_simple_words: 28
+#   cheap_model:
+#     provider: openrouter
+#     model: google/gemini-2.5-flash
 """


@@ -1388,9 +1463,11 @@ def set_config_value(key: str, value: str):
        "terminal.singularity_image": "TERMINAL_SINGULARITY_IMAGE",
        "terminal.modal_image": "TERMINAL_MODAL_IMAGE",
        "terminal.daytona_image": "TERMINAL_DAYTONA_IMAGE",
+        "terminal.docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
        "terminal.cwd": "TERMINAL_CWD",
        "terminal.timeout": "TERMINAL_TIMEOUT",
        "terminal.sandbox_dir": "TERMINAL_SANDBOX_DIR",
+        "terminal.persistent_shell": "TERMINAL_PERSISTENT_SHELL",
    }
    if key in _config_to_env_sync:
        save_env_value(_config_to_env_sync[key], str(value))
@@ -570,6 +570,7 @@ def run_doctor(args):
        # MiniMax APIs don't support /models endpoint — https://github.com/NousResearch/hermes-agent/issues/811
        ("MiniMax",          ("MINIMAX_API_KEY",),                            None,                                  "MINIMAX_BASE_URL", False),
        ("MiniMax (China)",  ("MINIMAX_CN_API_KEY",),                         None,                                  "MINIMAX_CN_BASE_URL", False),
+        ("AI Gateway",       ("AI_GATEWAY_API_KEY",),                          "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
    ]
    for _pname, _env_vars, _default_url, _base_env, _supports_health_check in _apikey_providers:
        _key = ""
@@ -119,17 +119,62 @@ def is_windows() -> bool:
 # Service Configuration
 # =============================================================================

-SERVICE_NAME = "hermes-gateway"
+_SERVICE_BASE = "hermes-gateway"
 SERVICE_DESCRIPTION = "Hermes Agent Gateway - Messaging Platform Integration"


+def get_service_name() -> str:
+    """Derive a systemd service name scoped to this HERMES_HOME.
+
+    Default ``~/.hermes`` returns ``hermes-gateway`` (backward compatible).
+    Any other HERMES_HOME appends a short hash so multiple installations
+    can each have their own systemd service without conflicting.
+    """
+    import hashlib
+    from pathlib import Path as _Path  # local import to avoid monkeypatch interference
+    home = _Path(os.getenv("HERMES_HOME", _Path.home() / ".hermes")).resolve()
+    default = (_Path.home() / ".hermes").resolve()
+    if home == default:
+        return _SERVICE_BASE
+    suffix = hashlib.sha256(str(home).encode()).hexdigest()[:8]
+    return f"{_SERVICE_BASE}-{suffix}"
+
+
+SERVICE_NAME = _SERVICE_BASE  # backward-compat for external importers; prefer get_service_name()
+
+
 def get_systemd_unit_path(system: bool = False) -> Path:
+    name = get_service_name()
    if system:
-        return Path("/etc/systemd/system") / f"{SERVICE_NAME}.service"
-    return Path.home() / ".config" / "systemd" / "user" / f"{SERVICE_NAME}.service"
+        return Path("/etc/systemd/system") / f"{name}.service"
+    return Path.home() / ".config" / "systemd" / "user" / f"{name}.service"
+
+
+def _ensure_user_systemd_env() -> None:
+    """Ensure DBUS_SESSION_BUS_ADDRESS and XDG_RUNTIME_DIR are set for systemctl --user.
+
+    On headless servers (SSH sessions), these env vars may be missing even when
+    the user's systemd instance is running (via linger).  Without them,
+    ``systemctl --user`` fails with "Failed to connect to bus: No medium found".
+    We detect the standard socket path and set the vars so all subsequent
+    subprocess calls inherit them.
+    """
+    uid = os.getuid()
+    if "XDG_RUNTIME_DIR" not in os.environ:
+        runtime_dir = f"/run/user/{uid}"
+        if Path(runtime_dir).exists():
+            os.environ["XDG_RUNTIME_DIR"] = runtime_dir
+
+    if "DBUS_SESSION_BUS_ADDRESS" not in os.environ:
+        xdg_runtime = os.environ.get("XDG_RUNTIME_DIR", f"/run/user/{uid}")
+        bus_path = Path(xdg_runtime) / "bus"
+        if bus_path.exists():
+            os.environ["DBUS_SESSION_BUS_ADDRESS"] = f"unix:path={bus_path}"


 def _systemctl_cmd(system: bool = False) -> list[str]:
+    if not system:
+        _ensure_user_systemd_env()
    return ["systemctl"] if system else ["systemctl", "--user"]


@@ -350,8 +395,6 @@ def get_hermes_cli_path() -> str:
 # =============================================================================

 def generate_systemd_unit(system: bool = False, run_as_user: str | None = None) -> str:
-    import shutil
-
    python_path = get_python_path()
    working_dir = str(PROJECT_ROOT)
    venv_dir = str(PROJECT_ROOT / "venv")
@@ -360,7 +403,8 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)

    # Build a PATH that includes the venv, node_modules, and standard system dirs
    sane_path = f"{venv_bin}:{node_bin}:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
-    hermes_cli = shutil.which("hermes") or f"{python_path} -m hermes_cli.main"
+
+    hermes_home = str(Path(os.getenv("HERMES_HOME", Path.home() / ".hermes")).resolve())

    if system:
        username, group_name, home_dir = _system_service_identity(run_as_user)
@@ -380,11 +424,12 @@ Environment="USER={username}"
 Environment="LOGNAME={username}"
 Environment="PATH={sane_path}"
 Environment="VIRTUAL_ENV={venv_dir}"
+Environment="HERMES_HOME={hermes_home}"
 Restart=on-failure
 RestartSec=10
 KillMode=mixed
 KillSignal=SIGTERM
-TimeoutStopSec=15
+TimeoutStopSec=60
 StandardOutput=journal
 StandardError=journal

@@ -399,15 +444,15 @@ After=network.target
 [Service]
 Type=simple
 ExecStart={python_path} -m hermes_cli.main gateway run --replace
-ExecStop={hermes_cli} gateway stop
 WorkingDirectory={working_dir}
 Environment="PATH={sane_path}"
 Environment="VIRTUAL_ENV={venv_dir}"
+Environment="HERMES_HOME={hermes_home}"
 Restart=on-failure
 RestartSec=10
 KillMode=mixed
 KillSignal=SIGTERM
-TimeoutStopSec=15
+TimeoutStopSec=60
 StandardOutput=journal
 StandardError=journal

@@ -455,7 +500,7 @@ def _print_linger_enable_warning(username: str, detail: str | None = None) -> No
    print(f"    sudo loginctl enable-linger {username}")
    print()
    print("  Then restart the gateway:")
-    print(f"    systemctl --user restart {SERVICE_NAME}.service")
+    print(f"    systemctl --user restart {get_service_name()}.service")
    print()


@@ -526,7 +571,7 @@ def systemd_install(force: bool = False, system: bool = False, run_as_user: str
    unit_path.write_text(generate_systemd_unit(system=system, run_as_user=run_as_user), encoding="utf-8")

    subprocess.run(_systemctl_cmd(system) + ["daemon-reload"], check=True)
-    subprocess.run(_systemctl_cmd(system) + ["enable", SERVICE_NAME], check=True)
+    subprocess.run(_systemctl_cmd(system) + ["enable", get_service_name()], check=True)

    print()
    print(f"✓ {_service_scope_label(system).capitalize()} service installed and enabled!")
@@ -534,7 +579,7 @@ def systemd_install(force: bool = False, system: bool = False, run_as_user: str
    print("Next steps:")
    print(f"  {'sudo ' if system else ''}hermes gateway start{scope_flag}              # Start the service")
    print(f"  {'sudo ' if system else ''}hermes gateway status{scope_flag}             # Check status")
-    print(f"  {'journalctl' if system else 'journalctl --user'} -u {SERVICE_NAME} -f  # View logs")
+    print(f"  {'journalctl' if system else 'journalctl --user'} -u {get_service_name()} -f  # View logs")
    print()

    if system:
@@ -552,8 +597,8 @@ def systemd_uninstall(system: bool = False):
    if system:
        _require_root_for_system_service("uninstall")

-    subprocess.run(_systemctl_cmd(system) + ["stop", SERVICE_NAME], check=False)
-    subprocess.run(_systemctl_cmd(system) + ["disable", SERVICE_NAME], check=False)
+    subprocess.run(_systemctl_cmd(system) + ["stop", get_service_name()], check=False)
+    subprocess.run(_systemctl_cmd(system) + ["disable", get_service_name()], check=False)

    unit_path = get_systemd_unit_path(system=system)
    if unit_path.exists():
@@ -569,7 +614,7 @@ def systemd_start(system: bool = False):
    if system:
        _require_root_for_system_service("start")
    refresh_systemd_unit_if_needed(system=system)
-    subprocess.run(_systemctl_cmd(system) + ["start", SERVICE_NAME], check=True)
+    subprocess.run(_systemctl_cmd(system) + ["start", get_service_name()], check=True)
    print(f"✓ {_service_scope_label(system).capitalize()} service started")


@@ -578,7 +623,7 @@ def systemd_stop(system: bool = False):
    system = _select_systemd_scope(system)
    if system:
        _require_root_for_system_service("stop")
-    subprocess.run(_systemctl_cmd(system) + ["stop", SERVICE_NAME], check=True)
+    subprocess.run(_systemctl_cmd(system) + ["stop", get_service_name()], check=True)
    print(f"✓ {_service_scope_label(system).capitalize()} service stopped")


@@ -588,7 +633,7 @@ def systemd_restart(system: bool = False):
    if system:
        _require_root_for_system_service("restart")
    refresh_systemd_unit_if_needed(system=system)
-    subprocess.run(_systemctl_cmd(system) + ["restart", SERVICE_NAME], check=True)
+    subprocess.run(_systemctl_cmd(system) + ["restart", get_service_name()], check=True)
    print(f"✓ {_service_scope_label(system).capitalize()} service restarted")


@@ -613,12 +658,12 @@ def systemd_status(deep: bool = False, system: bool = False):
        print()

    subprocess.run(
-        _systemctl_cmd(system) + ["status", SERVICE_NAME, "--no-pager"],
+        _systemctl_cmd(system) + ["status", get_service_name(), "--no-pager"],
        capture_output=False,
    )

    result = subprocess.run(
-        _systemctl_cmd(system) + ["is-active", SERVICE_NAME],
+        _systemctl_cmd(system) + ["is-active", get_service_name()],
        capture_output=True,
        text=True,
    )
@@ -657,7 +702,7 @@ def systemd_status(deep: bool = False, system: bool = False):
    if deep:
        print()
        print("Recent logs:")
-        subprocess.run(_journalctl_cmd(system) + ["-u", SERVICE_NAME, "-n", "20", "--no-pager"])
+        subprocess.run(_journalctl_cmd(system) + ["-u", get_service_name(), "-n", "20", "--no-pager"])


 # =============================================================================
@@ -684,6 +729,7 @@ def generate_launchd_plist() -> str:
        <string>hermes_cli.main</string>
        <string>gateway</string>
        <string>run</string>
+        <string>--replace</string>
    </array>
    
    <key>WorkingDirectory</key>
@@ -707,6 +753,36 @@ def generate_launchd_plist() -> str:
 </plist>
 """

+def launchd_plist_is_current() -> bool:
+    """Check if the installed launchd plist matches the currently generated one."""
+    plist_path = get_launchd_plist_path()
+    if not plist_path.exists():
+        return False
+
+    installed = plist_path.read_text(encoding="utf-8")
+    expected = generate_launchd_plist()
+    return _normalize_service_definition(installed) == _normalize_service_definition(expected)
+
+
+def refresh_launchd_plist_if_needed() -> bool:
+    """Rewrite the installed launchd plist when the generated definition has changed.
+
+    Unlike systemd, launchd picks up plist changes on the next ``launchctl stop``/
+    ``launchctl start`` cycle — no daemon-reload is needed.  We still unload/reload
+    to make launchd re-read the updated plist immediately.
+    """
+    plist_path = get_launchd_plist_path()
+    if not plist_path.exists() or launchd_plist_is_current():
+        return False
+
+    plist_path.write_text(generate_launchd_plist(), encoding="utf-8")
+    # Unload/reload so launchd picks up the new definition
+    subprocess.run(["launchctl", "unload", str(plist_path)], check=False)
+    subprocess.run(["launchctl", "load", str(plist_path)], check=False)
+    print("↻ Updated gateway launchd service definition to match the current Hermes install")
+    return True
+
+
 def launchd_install(force: bool = False):
    plist_path = get_launchd_plist_path()
    
@@ -739,6 +815,7 @@ def launchd_uninstall():
    print("✓ Service uninstalled")

 def launchd_start():
+    refresh_launchd_plist_if_needed()
    subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
    print("✓ Service started")

@@ -747,6 +824,7 @@ def launchd_stop():
    print("✓ Service stopped")

 def launchd_restart():
+    refresh_launchd_plist_if_needed()
    launchd_stop()
    launchd_start()

@@ -1118,7 +1196,7 @@ def _is_service_running() -> bool:

        if user_unit_exists:
            result = subprocess.run(
-                _systemctl_cmd(False) + ["is-active", SERVICE_NAME],
+                _systemctl_cmd(False) + ["is-active", get_service_name()],
                capture_output=True, text=True
            )
            if result.stdout.strip() == "active":
@@ -1126,7 +1204,7 @@ def _is_service_running() -> bool:

        if system_unit_exists:
            result = subprocess.run(
-                _systemctl_cmd(True) + ["is-active", SERVICE_NAME],
+                _systemctl_cmd(True) + ["is-active", get_service_name()],
                capture_output=True, text=True
            )
            if result.stdout.strip() == "active":
@@ -1492,6 +1570,22 @@ def gateway_command(args):
                pass
        
        if not service_available:
+            # systemd/launchd restart failed — check if linger is the issue
+            if is_linux():
+                linger_ok, _detail = get_systemd_linger_status()
+                if linger_ok is not True:
+                    import getpass
+                    _username = getpass.getuser()
+                    print()
+                    print("⚠ Cannot restart gateway as a service — linger is not enabled.")
+                    print("  The gateway user service requires linger to function on headless servers.")
+                    print()
+                    print(f"  Run:  sudo loginctl enable-linger {_username}")
+                    print()
+                    print("  Then restart the gateway:")
+                    print("    hermes gateway restart")
+                    return
+
            # Manual restart: kill existing processes
            killed = kill_gateway_processes()
            if killed:
@@ -768,6 +768,7 @@ def cmd_model(args):
        "kimi-coding": "Kimi / Moonshot",
        "minimax": "MiniMax",
        "minimax-cn": "MiniMax (China)",
+        "ai-gateway": "AI Gateway",
        "custom": "Custom endpoint",
    }
    active_label = provider_labels.get(active, active)
@@ -787,6 +788,7 @@ def cmd_model(args):
        ("kimi-coding", "Kimi / Moonshot (Moonshot AI direct API)"),
        ("minimax", "MiniMax (global direct API)"),
        ("minimax-cn", "MiniMax China (domestic direct API)"),
+        ("ai-gateway", "AI Gateway (Vercel — 200+ models, pay-per-use)"),
    ]

    # Add user-defined custom providers from config.yaml
@@ -855,7 +857,7 @@ def cmd_model(args):
        _model_flow_anthropic(config, current_model)
    elif selected_provider == "kimi-coding":
        _model_flow_kimi(config, current_model)
-    elif selected_provider in ("zai", "minimax", "minimax-cn"):
+    elif selected_provider in ("zai", "minimax", "minimax-cn", "ai-gateway"):
        _model_flow_api_key_provider(config, selected_provider, current_model)


@@ -2301,26 +2303,121 @@ def cmd_update(args):
        print()
        print("✓ Update complete!")
        
-        # Auto-restart gateway if it's running as a systemd service
+        # Auto-restart gateway if it's running.
+        # Uses the PID file (scoped to HERMES_HOME) to find this
+        # installation's gateway — safe with multiple installations.
        try:
-            check = subprocess.run(
-                ["systemctl", "--user", "is-active", "hermes-gateway"],
-                capture_output=True, text=True, timeout=5,
+            from gateway.status import get_running_pid, remove_pid_file
+            from hermes_cli.gateway import (
+                get_service_name, get_launchd_plist_path, is_macos, is_linux,
+                refresh_launchd_plist_if_needed,
+                _ensure_user_systemd_env, get_systemd_linger_status,
            )
-            if check.stdout.strip() == "active":
-                print()
-                print("→ Gateway service is running — restarting to pick up changes...")
-                restart = subprocess.run(
-                    ["systemctl", "--user", "restart", "hermes-gateway"],
-                    capture_output=True, text=True, timeout=15,
+            import signal as _signal
+
+            _gw_service_name = get_service_name()
+            existing_pid = get_running_pid()
+            has_systemd_service = False
+            has_launchd_service = False
+
+            try:
+                _ensure_user_systemd_env()
+                check = subprocess.run(
+                    ["systemctl", "--user", "is-active", _gw_service_name],
+                    capture_output=True, text=True, timeout=5,
                )
-                if restart.returncode == 0:
-                    print("✓ Gateway restarted.")
-                else:
-                    print(f"⚠ Gateway restart failed: {restart.stderr.strip()}")
-                    print("  Try manually: hermes gateway restart")
-        except (FileNotFoundError, subprocess.TimeoutExpired):
-            pass  # No systemd (macOS, WSL1, etc.) — skip silently
+                has_systemd_service = check.stdout.strip() == "active"
+            except (FileNotFoundError, subprocess.TimeoutExpired):
+                pass
+
+            # Check for macOS launchd service
+            if is_macos():
+                try:
+                    plist_path = get_launchd_plist_path()
+                    if plist_path.exists():
+                        check = subprocess.run(
+                            ["launchctl", "list", "ai.hermes.gateway"],
+                            capture_output=True, text=True, timeout=5,
+                        )
+                        has_launchd_service = check.returncode == 0
+                except (FileNotFoundError, subprocess.TimeoutExpired):
+                    pass
+
+            if existing_pid or has_systemd_service or has_launchd_service:
+                print()
+
+                # When a service manager is handling the gateway, let it
+                # manage the lifecycle — don't manually SIGTERM the PID
+                # (launchd KeepAlive would respawn immediately, causing races).
+                if has_systemd_service:
+                    import time as _time
+                    if existing_pid:
+                        try:
+                            os.kill(existing_pid, _signal.SIGTERM)
+                            print(f"→ Stopped gateway process (PID {existing_pid})")
+                        except ProcessLookupError:
+                            pass
+                        except PermissionError:
+                            print(f"⚠ Permission denied killing gateway PID {existing_pid}")
+                        remove_pid_file()
+                    _time.sleep(1)  # Brief pause for port/socket release
+                    print("→ Restarting gateway service...")
+                    restart = subprocess.run(
+                        ["systemctl", "--user", "restart", _gw_service_name],
+                        capture_output=True, text=True, timeout=15,
+                    )
+                    if restart.returncode == 0:
+                        print("✓ Gateway restarted.")
+                    else:
+                        print(f"⚠ Gateway restart failed: {restart.stderr.strip()}")
+                        # Check if linger is the issue
+                        if is_linux():
+                            linger_ok, _detail = get_systemd_linger_status()
+                            if linger_ok is not True:
+                                import getpass
+                                _username = getpass.getuser()
+                                print()
+                                print("  Linger must be enabled for the gateway user service to function.")
+                                print(f"  Run:  sudo loginctl enable-linger {_username}")
+                                print()
+                                print("  Then restart the gateway:")
+                                print("    hermes gateway restart")
+                            else:
+                                print("  Try manually: hermes gateway restart")
+                elif has_launchd_service:
+                    # Refresh the plist first (picks up --replace and other
+                    # changes from the update we just pulled).
+                    refresh_launchd_plist_if_needed()
+                    # Explicit stop+start — don't rely on KeepAlive respawn
+                    # after a manual SIGTERM, which would race with the
+                    # PID file cleanup.
+                    print("→ Restarting gateway service...")
+                    stop = subprocess.run(
+                        ["launchctl", "stop", "ai.hermes.gateway"],
+                        capture_output=True, text=True, timeout=10,
+                    )
+                    start = subprocess.run(
+                        ["launchctl", "start", "ai.hermes.gateway"],
+                        capture_output=True, text=True, timeout=10,
+                    )
+                    if start.returncode == 0:
+                        print("✓ Gateway restarted via launchd.")
+                    else:
+                        print(f"⚠ Gateway restart failed: {start.stderr.strip()}")
+                        print("  Try manually: hermes gateway restart")
+                elif existing_pid:
+                    try:
+                        os.kill(existing_pid, _signal.SIGTERM)
+                        print(f"→ Stopped gateway process (PID {existing_pid})")
+                    except ProcessLookupError:
+                        pass  # Already gone
+                    except PermissionError:
+                        print(f"⚠ Permission denied killing gateway PID {existing_pid}")
+                    remove_pid_file()
+                    print("  ℹ️  Gateway was running manually (not as a service).")
+                    print("  Restart it with: hermes gateway run")
+        except Exception as e:
+            logger.debug("Gateway restart during update failed: %s", e)
        
        print()
        print("Tip: You can now select a provider and model:")
@@ -8,6 +8,7 @@ Add, remove, or reorder entries here — both `hermes setup` and
 from __future__ import annotations

 import json
+import os
 import urllib.request
 import urllib.error
 from difflib import get_close_matches
@@ -78,6 +79,24 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
        "claude-sonnet-4-20250514",
        "claude-haiku-4-5-20251001",
    ],
+    "deepseek": [
+        "deepseek-chat",
+        "deepseek-reasoner",
+    ],
+    "ai-gateway": [
+        "anthropic/claude-opus-4.6",
+        "anthropic/claude-sonnet-4.6",
+        "anthropic/claude-sonnet-4.5",
+        "anthropic/claude-haiku-4.5",
+        "openai/gpt-5",
+        "openai/gpt-4.1",
+        "openai/gpt-4.1-mini",
+        "google/gemini-3-pro-preview",
+        "google/gemini-3-flash",
+        "google/gemini-2.5-pro",
+        "google/gemini-2.5-flash",
+        "deepseek/deepseek-v3.2",
+    ],
 }

 _PROVIDER_LABELS = {
@@ -89,6 +108,8 @@ _PROVIDER_LABELS = {
    "minimax": "MiniMax",
    "minimax-cn": "MiniMax (China)",
    "anthropic": "Anthropic",
+    "deepseek": "DeepSeek",
+    "ai-gateway": "AI Gateway",
    "custom": "Custom endpoint",
 }

@@ -103,6 +124,10 @@ _PROVIDER_ALIASES = {
    "minimax_cn": "minimax-cn",
    "claude": "anthropic",
    "claude-code": "anthropic",
+    "deep-seek": "deepseek",
+    "aigateway": "ai-gateway",
+    "vercel": "ai-gateway",
+    "vercel-ai-gateway": "ai-gateway",
 }


@@ -137,6 +162,7 @@ def list_available_providers() -> list[dict[str, str]]:
    _PROVIDER_ORDER = [
        "openrouter", "nous", "openai-codex",
        "zai", "kimi-coding", "minimax", "minimax-cn", "anthropic",
+        "ai-gateway", "deepseek",
    ]
    # Build reverse alias map
    aliases_for: dict[str, list[str]] = {}
@@ -212,6 +238,111 @@ def curated_models_for_provider(provider: Optional[str]) -> list[tuple[str, str]
    return [(m, "") for m in models]


+def detect_provider_for_model(
+    model_name: str,
+    current_provider: str,
+) -> Optional[tuple[str, str]]:
+    """Auto-detect the best provider for a model name.
+
+    Returns ``(provider_id, model_name)`` — the model name may be remapped
+    (e.g. bare ``deepseek-chat`` → ``deepseek/deepseek-chat`` for OpenRouter).
+    Returns ``None`` when no confident match is found.
+
+    Priority:
+    1. Direct provider with credentials (highest)
+    2. Direct provider without credentials → remap to OpenRouter slug
+    3. OpenRouter catalog match
+    """
+    name = (model_name or "").strip()
+    if not name:
+        return None
+
+    name_lower = name.lower()
+
+    # Aggregators list other providers' models — never auto-switch TO them
+    _AGGREGATORS = {"nous", "openrouter"}
+
+    # If the model belongs to the current provider's catalog, don't suggest switching
+    current_models = _PROVIDER_MODELS.get(current_provider, [])
+    if any(name_lower == m.lower() for m in current_models):
+        return None
+
+    # --- Step 1: check static provider catalogs for a direct match ---
+    direct_match: Optional[str] = None
+    for pid, models in _PROVIDER_MODELS.items():
+        if pid == current_provider or pid in _AGGREGATORS:
+            continue
+        if any(name_lower == m.lower() for m in models):
+            direct_match = pid
+            break
+
+    if direct_match:
+        # Check if we have credentials for this provider
+        has_creds = False
+        try:
+            from hermes_cli.auth import PROVIDER_REGISTRY
+            pconfig = PROVIDER_REGISTRY.get(direct_match)
+            if pconfig:
+                import os
+                for env_var in pconfig.api_key_env_vars:
+                    if os.getenv(env_var, "").strip():
+                        has_creds = True
+                        break
+        except Exception:
+            pass
+
+        if has_creds:
+            return (direct_match, name)
+
+        # No direct creds — try to find this model on OpenRouter instead
+        or_slug = _find_openrouter_slug(name)
+        if or_slug:
+            return ("openrouter", or_slug)
+        # Still return the direct provider — credential resolution will
+        # give a clear error rather than silently using the wrong provider
+        return (direct_match, name)
+
+    # --- Step 2: check OpenRouter catalog ---
+    # First try exact match (handles provider/model format)
+    or_slug = _find_openrouter_slug(name)
+    if or_slug:
+        if current_provider != "openrouter":
+            return ("openrouter", or_slug)
+        # Already on openrouter, just return the resolved slug
+        if or_slug != name:
+            return ("openrouter", or_slug)
+        return None  # already on openrouter with matching name
+
+    return None
+
+
+def _find_openrouter_slug(model_name: str) -> Optional[str]:
+    """Find the full OpenRouter model slug for a bare or partial model name.
+
+    Handles:
+    - Exact match: ``anthropic/claude-opus-4.6`` → as-is
+    - Bare name: ``deepseek-chat`` → ``deepseek/deepseek-chat``
+    - Bare name: ``claude-opus-4.6`` → ``anthropic/claude-opus-4.6``
+    """
+    name_lower = model_name.strip().lower()
+    if not name_lower:
+        return None
+
+    # Exact match (already has provider/ prefix)
+    for mid, _ in OPENROUTER_MODELS:
+        if name_lower == mid.lower():
+            return mid
+
+    # Try matching just the model part (after the /)
+    for mid, _ in OPENROUTER_MODELS:
+        if "/" in mid:
+            _, model_part = mid.split("/", 1)
+            if name_lower == model_part.lower():
+                return mid
+
+    return None
+
+
 def normalize_provider(provider: Optional[str]) -> str:
    """Normalize provider aliases to Hermes' canonical provider ids.

@@ -261,6 +392,10 @@ def provider_model_ids(provider: Optional[str]) -> list[str]:
        live = _fetch_anthropic_models()
        if live:
            return live
+    if normalized == "ai-gateway":
+        live = _fetch_ai_gateway_models()
+        if live:
+            return live
    return list(_PROVIDER_MODELS.get(normalized, []))


@@ -364,6 +499,33 @@ def probe_api_models(
    }


+def _fetch_ai_gateway_models(timeout: float = 5.0) -> Optional[list[str]]:
+    """Fetch available language models with tool-use from AI Gateway."""
+    api_key = os.getenv("AI_GATEWAY_API_KEY", "").strip()
+    if not api_key:
+        return None
+    base_url = os.getenv("AI_GATEWAY_BASE_URL", "").strip()
+    if not base_url:
+        from hermes_constants import AI_GATEWAY_BASE_URL
+        base_url = AI_GATEWAY_BASE_URL
+
+    url = base_url.rstrip("/") + "/models"
+    headers: dict[str, str] = {"Authorization": f"Bearer {api_key}"}
+    req = urllib.request.Request(url, headers=headers)
+    try:
+        with urllib.request.urlopen(req, timeout=timeout) as resp:
+            data = json.loads(resp.read().decode())
+            return [
+                m["id"]
+                for m in data.get("data", [])
+                if m.get("id")
+                and m.get("type") == "language"
+                and "tool-use" in (m.get("tags") or [])
+            ]
+    except Exception:
+        return None
+
+
 def fetch_api_models(
    api_key: Optional[str],
    base_url: Optional[str],
@@ -0,0 +1,449 @@
+"""
+Hermes Plugin System
+====================
+
+Discovers, loads, and manages plugins from three sources:
+
+1. **User plugins**   – ``~/.hermes/plugins/<name>/``
+2. **Project plugins** – ``./.hermes/plugins/<name>/``
+3. **Pip plugins**     – packages that expose the ``hermes_agent.plugins``
+   entry-point group.
+
+Each directory plugin must contain a ``plugin.yaml`` manifest **and** an
+``__init__.py`` with a ``register(ctx)`` function.
+
+Lifecycle hooks
+---------------
+Plugins may register callbacks for any of the hooks in ``VALID_HOOKS``.
+The agent core calls ``invoke_hook(name, **kwargs)`` at the appropriate
+points.
+
+Tool registration
+-----------------
+``PluginContext.register_tool()`` delegates to ``tools.registry.register()``
+so plugin-defined tools appear alongside the built-in tools.
+"""
+
+from __future__ import annotations
+
+import importlib
+import importlib.metadata
+import importlib.util
+import logging
+import os
+import sys
+import types
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Any, Callable, Dict, List, Optional, Set
+
+try:
+    import yaml
+except ImportError:  # pragma: no cover – yaml is optional at import time
+    yaml = None  # type: ignore[assignment]
+
+logger = logging.getLogger(__name__)
+
+# ---------------------------------------------------------------------------
+# Constants
+# ---------------------------------------------------------------------------
+
+VALID_HOOKS: Set[str] = {
+    "pre_tool_call",
+    "post_tool_call",
+    "pre_llm_call",
+    "post_llm_call",
+    "on_session_start",
+    "on_session_end",
+}
+
+ENTRY_POINTS_GROUP = "hermes_agent.plugins"
+
+_NS_PARENT = "hermes_plugins"
+
+
+# ---------------------------------------------------------------------------
+# Data classes
+# ---------------------------------------------------------------------------
+
+@dataclass
+class PluginManifest:
+    """Parsed representation of a plugin.yaml manifest."""
+
+    name: str
+    version: str = ""
+    description: str = ""
+    author: str = ""
+    requires_env: List[str] = field(default_factory=list)
+    provides_tools: List[str] = field(default_factory=list)
+    provides_hooks: List[str] = field(default_factory=list)
+    source: str = ""        # "user", "project", or "entrypoint"
+    path: Optional[str] = None
+
+
+@dataclass
+class LoadedPlugin:
+    """Runtime state for a single loaded plugin."""
+
+    manifest: PluginManifest
+    module: Optional[types.ModuleType] = None
+    tools_registered: List[str] = field(default_factory=list)
+    hooks_registered: List[str] = field(default_factory=list)
+    enabled: bool = False
+    error: Optional[str] = None
+
+
+# ---------------------------------------------------------------------------
+# PluginContext  – handed to each plugin's ``register()`` function
+# ---------------------------------------------------------------------------
+
+class PluginContext:
+    """Facade given to plugins so they can register tools and hooks."""
+
+    def __init__(self, manifest: PluginManifest, manager: "PluginManager"):
+        self.manifest = manifest
+        self._manager = manager
+
+    # -- tool registration --------------------------------------------------
+
+    def register_tool(
+        self,
+        name: str,
+        toolset: str,
+        schema: dict,
+        handler: Callable,
+        check_fn: Callable | None = None,
+        requires_env: list | None = None,
+        is_async: bool = False,
+        description: str = "",
+        emoji: str = "",
+    ) -> None:
+        """Register a tool in the global registry **and** track it as plugin-provided."""
+        from tools.registry import registry
+
+        registry.register(
+            name=name,
+            toolset=toolset,
+            schema=schema,
+            handler=handler,
+            check_fn=check_fn,
+            requires_env=requires_env,
+            is_async=is_async,
+            description=description,
+            emoji=emoji,
+        )
+        self._manager._plugin_tool_names.add(name)
+        logger.debug("Plugin %s registered tool: %s", self.manifest.name, name)
+
+    # -- hook registration --------------------------------------------------
+
+    def register_hook(self, hook_name: str, callback: Callable) -> None:
+        """Register a lifecycle hook callback.
+
+        Unknown hook names produce a warning but are still stored so
+        forward-compatible plugins don't break.
+        """
+        if hook_name not in VALID_HOOKS:
+            logger.warning(
+                "Plugin '%s' registered unknown hook '%s' "
+                "(valid: %s)",
+                self.manifest.name,
+                hook_name,
+                ", ".join(sorted(VALID_HOOKS)),
+            )
+        self._manager._hooks.setdefault(hook_name, []).append(callback)
+        logger.debug("Plugin %s registered hook: %s", self.manifest.name, hook_name)
+
+
+# ---------------------------------------------------------------------------
+# PluginManager
+# ---------------------------------------------------------------------------
+
+class PluginManager:
+    """Central manager that discovers, loads, and invokes plugins."""
+
+    def __init__(self) -> None:
+        self._plugins: Dict[str, LoadedPlugin] = {}
+        self._hooks: Dict[str, List[Callable]] = {}
+        self._plugin_tool_names: Set[str] = set()
+        self._discovered: bool = False
+
+    # -----------------------------------------------------------------------
+    # Public
+    # -----------------------------------------------------------------------
+
+    def discover_and_load(self) -> None:
+        """Scan all plugin sources and load each plugin found."""
+        if self._discovered:
+            return
+        self._discovered = True
+
+        manifests: List[PluginManifest] = []
+
+        # 1. User plugins (~/.hermes/plugins/)
+        hermes_home = os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes"))
+        user_dir = Path(hermes_home) / "plugins"
+        manifests.extend(self._scan_directory(user_dir, source="user"))
+
+        # 2. Project plugins (./.hermes/plugins/)
+        project_dir = Path.cwd() / ".hermes" / "plugins"
+        manifests.extend(self._scan_directory(project_dir, source="project"))
+
+        # 3. Pip / entry-point plugins
+        manifests.extend(self._scan_entry_points())
+
+        # Load each manifest
+        for manifest in manifests:
+            self._load_plugin(manifest)
+
+        if manifests:
+            logger.info(
+                "Plugin discovery complete: %d found, %d enabled",
+                len(self._plugins),
+                sum(1 for p in self._plugins.values() if p.enabled),
+            )
+
+    # -----------------------------------------------------------------------
+    # Directory scanning
+    # -----------------------------------------------------------------------
+
+    def _scan_directory(self, path: Path, source: str) -> List[PluginManifest]:
+        """Read ``plugin.yaml`` manifests from subdirectories of *path*."""
+        manifests: List[PluginManifest] = []
+        if not path.is_dir():
+            return manifests
+
+        for child in sorted(path.iterdir()):
+            if not child.is_dir():
+                continue
+            manifest_file = child / "plugin.yaml"
+            if not manifest_file.exists():
+                manifest_file = child / "plugin.yml"
+            if not manifest_file.exists():
+                logger.debug("Skipping %s (no plugin.yaml)", child)
+                continue
+
+            try:
+                if yaml is None:
+                    logger.warning("PyYAML not installed – cannot load %s", manifest_file)
+                    continue
+                data = yaml.safe_load(manifest_file.read_text()) or {}
+                manifest = PluginManifest(
+                    name=data.get("name", child.name),
+                    version=str(data.get("version", "")),
+                    description=data.get("description", ""),
+                    author=data.get("author", ""),
+                    requires_env=data.get("requires_env", []),
+                    provides_tools=data.get("provides_tools", []),
+                    provides_hooks=data.get("provides_hooks", []),
+                    source=source,
+                    path=str(child),
+                )
+                manifests.append(manifest)
+            except Exception as exc:
+                logger.warning("Failed to parse %s: %s", manifest_file, exc)
+
+        return manifests
+
+    # -----------------------------------------------------------------------
+    # Entry-point scanning
+    # -----------------------------------------------------------------------
+
+    def _scan_entry_points(self) -> List[PluginManifest]:
+        """Check ``importlib.metadata`` for pip-installed plugins."""
+        manifests: List[PluginManifest] = []
+        try:
+            eps = importlib.metadata.entry_points()
+            # Python 3.12+ returns a SelectableGroups; earlier returns dict
+            if hasattr(eps, "select"):
+                group_eps = eps.select(group=ENTRY_POINTS_GROUP)
+            elif isinstance(eps, dict):
+                group_eps = eps.get(ENTRY_POINTS_GROUP, [])
+            else:
+                group_eps = [ep for ep in eps if ep.group == ENTRY_POINTS_GROUP]
+
+            for ep in group_eps:
+                manifest = PluginManifest(
+                    name=ep.name,
+                    source="entrypoint",
+                    path=ep.value,
+                )
+                manifests.append(manifest)
+        except Exception as exc:
+            logger.debug("Entry-point scan failed: %s", exc)
+
+        return manifests
+
+    # -----------------------------------------------------------------------
+    # Loading
+    # -----------------------------------------------------------------------
+
+    def _load_plugin(self, manifest: PluginManifest) -> None:
+        """Import a plugin module and call its ``register(ctx)`` function."""
+        loaded = LoadedPlugin(manifest=manifest)
+
+        try:
+            if manifest.source in ("user", "project"):
+                module = self._load_directory_module(manifest)
+            else:
+                module = self._load_entrypoint_module(manifest)
+
+            loaded.module = module
+
+            # Call register()
+            register_fn = getattr(module, "register", None)
+            if register_fn is None:
+                loaded.error = "no register() function"
+                logger.warning("Plugin '%s' has no register() function", manifest.name)
+            else:
+                ctx = PluginContext(manifest, self)
+                register_fn(ctx)
+                loaded.tools_registered = [
+                    t for t in self._plugin_tool_names
+                    if t not in {
+                        n
+                        for name, p in self._plugins.items()
+                        for n in p.tools_registered
+                    }
+                ]
+                loaded.hooks_registered = list(
+                    {
+                        h
+                        for h, cbs in self._hooks.items()
+                        if cbs  # non-empty
+                    }
+                    - {
+                        h
+                        for name, p in self._plugins.items()
+                        for h in p.hooks_registered
+                    }
+                )
+                loaded.enabled = True
+
+        except Exception as exc:
+            loaded.error = str(exc)
+            logger.warning("Failed to load plugin '%s': %s", manifest.name, exc)
+
+        self._plugins[manifest.name] = loaded
+
+    def _load_directory_module(self, manifest: PluginManifest) -> types.ModuleType:
+        """Import a directory-based plugin as ``hermes_plugins.<name>``."""
+        plugin_dir = Path(manifest.path)  # type: ignore[arg-type]
+        init_file = plugin_dir / "__init__.py"
+        if not init_file.exists():
+            raise FileNotFoundError(f"No __init__.py in {plugin_dir}")
+
+        # Ensure the namespace parent package exists
+        if _NS_PARENT not in sys.modules:
+            ns_pkg = types.ModuleType(_NS_PARENT)
+            ns_pkg.__path__ = []  # type: ignore[attr-defined]
+            ns_pkg.__package__ = _NS_PARENT
+            sys.modules[_NS_PARENT] = ns_pkg
+
+        module_name = f"{_NS_PARENT}.{manifest.name.replace('-', '_')}"
+        spec = importlib.util.spec_from_file_location(
+            module_name,
+            init_file,
+            submodule_search_locations=[str(plugin_dir)],
+        )
+        if spec is None or spec.loader is None:
+            raise ImportError(f"Cannot create module spec for {init_file}")
+
+        module = importlib.util.module_from_spec(spec)
+        module.__package__ = module_name
+        module.__path__ = [str(plugin_dir)]  # type: ignore[attr-defined]
+        sys.modules[module_name] = module
+        spec.loader.exec_module(module)
+        return module
+
+    def _load_entrypoint_module(self, manifest: PluginManifest) -> types.ModuleType:
+        """Load a pip-installed plugin via its entry-point reference."""
+        eps = importlib.metadata.entry_points()
+        if hasattr(eps, "select"):
+            group_eps = eps.select(group=ENTRY_POINTS_GROUP)
+        elif isinstance(eps, dict):
+            group_eps = eps.get(ENTRY_POINTS_GROUP, [])
+        else:
+            group_eps = [ep for ep in eps if ep.group == ENTRY_POINTS_GROUP]
+
+        for ep in group_eps:
+            if ep.name == manifest.name:
+                return ep.load()
+
+        raise ImportError(
+            f"Entry point '{manifest.name}' not found in group '{ENTRY_POINTS_GROUP}'"
+        )
+
+    # -----------------------------------------------------------------------
+    # Hook invocation
+    # -----------------------------------------------------------------------
+
+    def invoke_hook(self, hook_name: str, **kwargs: Any) -> None:
+        """Call all registered callbacks for *hook_name*.
+
+        Each callback is wrapped in its own try/except so a misbehaving
+        plugin cannot break the core agent loop.
+        """
+        callbacks = self._hooks.get(hook_name, [])
+        for cb in callbacks:
+            try:
+                cb(**kwargs)
+            except Exception as exc:
+                logger.warning(
+                    "Hook '%s' callback %s raised: %s",
+                    hook_name,
+                    getattr(cb, "__name__", repr(cb)),
+                    exc,
+                )
+
+    # -----------------------------------------------------------------------
+    # Introspection
+    # -----------------------------------------------------------------------
+
+    def list_plugins(self) -> List[Dict[str, Any]]:
+        """Return a list of info dicts for all discovered plugins."""
+        result: List[Dict[str, Any]] = []
+        for name, loaded in sorted(self._plugins.items()):
+            result.append(
+                {
+                    "name": name,
+                    "version": loaded.manifest.version,
+                    "description": loaded.manifest.description,
+                    "source": loaded.manifest.source,
+                    "enabled": loaded.enabled,
+                    "tools": len(loaded.tools_registered),
+                    "hooks": len(loaded.hooks_registered),
+                    "error": loaded.error,
+                }
+            )
+        return result
+
+
+# ---------------------------------------------------------------------------
+# Module-level singleton & convenience functions
+# ---------------------------------------------------------------------------
+
+_plugin_manager: Optional[PluginManager] = None
+
+
+def get_plugin_manager() -> PluginManager:
+    """Return (and lazily create) the global PluginManager singleton."""
+    global _plugin_manager
+    if _plugin_manager is None:
+        _plugin_manager = PluginManager()
+    return _plugin_manager
+
+
+def discover_plugins() -> None:
+    """Discover and load all plugins (idempotent)."""
+    get_plugin_manager().discover_and_load()
+
+
+def invoke_hook(hook_name: str, **kwargs: Any) -> None:
+    """Invoke a lifecycle hook on all loaded plugins."""
+    get_plugin_manager().invoke_hook(hook_name, **kwargs)
+
+
+def get_plugin_tool_names() -> Set[str]:
+    """Return the set of tool names registered by plugins."""
+    return get_plugin_manager()._plugin_tool_names
@@ -59,6 +59,7 @@ _DEFAULT_PROVIDER_MODELS = {
    "kimi-coding": ["kimi-k2.5", "kimi-k2-thinking", "kimi-k2-turbo-preview"],
    "minimax": ["MiniMax-M2.5", "MiniMax-M2.5-highspeed", "MiniMax-M2.1"],
    "minimax-cn": ["MiniMax-M2.5", "MiniMax-M2.5-highspeed", "MiniMax-M2.1"],
+    "ai-gateway": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5", "google/gemini-3-flash"],
 }


@@ -227,54 +228,86 @@ def prompt(question: str, default: str = None, password: bool = False) -> str:
        sys.exit(1)


+def _curses_prompt_choice(question: str, choices: list, default: int = 0) -> int:
+    """Single-select menu using curses to avoid simple_term_menu rendering bugs."""
+    try:
+        import curses
+        result_holder = [default]
+
+        def _curses_menu(stdscr):
+            curses.curs_set(0)
+            if curses.has_colors():
+                curses.start_color()
+                curses.use_default_colors()
+                curses.init_pair(1, curses.COLOR_GREEN, -1)
+                curses.init_pair(2, curses.COLOR_YELLOW, -1)
+            cursor = default
+
+            while True:
+                stdscr.clear()
+                max_y, max_x = stdscr.getmaxyx()
+                try:
+                    stdscr.addnstr(
+                        0,
+                        0,
+                        question,
+                        max_x - 1,
+                        curses.A_BOLD | (curses.color_pair(2) if curses.has_colors() else 0),
+                    )
+                except curses.error:
+                    pass
+
+                for i, choice in enumerate(choices):
+                    y = i + 2
+                    if y >= max_y - 1:
+                        break
+                    arrow = "→" if i == cursor else " "
+                    line = f" {arrow}  {choice}"
+                    attr = curses.A_NORMAL
+                    if i == cursor:
+                        attr = curses.A_BOLD
+                        if curses.has_colors():
+                            attr |= curses.color_pair(1)
+                    try:
+                        stdscr.addnstr(y, 0, line, max_x - 1, attr)
+                    except curses.error:
+                        pass
+
+                stdscr.refresh()
+                key = stdscr.getch()
+                if key in (curses.KEY_UP, ord("k")):
+                    cursor = (cursor - 1) % len(choices)
+                elif key in (curses.KEY_DOWN, ord("j")):
+                    cursor = (cursor + 1) % len(choices)
+                elif key in (curses.KEY_ENTER, 10, 13):
+                    result_holder[0] = cursor
+                    return
+                elif key in (27, ord("q")):
+                    return
+
+        curses.wrapper(_curses_menu)
+        return result_holder[0]
+    except Exception:
+        return -1
+
+
+
 def prompt_choice(question: str, choices: list, default: int = 0) -> int:
    """Prompt for a choice from a list with arrow key navigation.

    Escape keeps the current default (skips the question).
    Ctrl+C exits the wizard.
    """
-    print(color(question, Colors.YELLOW))
-
-    # Try to use interactive menu if available
-    try:
-        from simple_term_menu import TerminalMenu
-        import re
-
-        # Strip emoji characters — simple_term_menu miscalculates visual
-        # width of emojis, causing duplicated/garbled lines on redraw.
-        _emoji_re = re.compile(
-            "[\U0001f300-\U0001f9ff\U00002600-\U000027bf\U0000fe00-\U0000fe0f"
-            "\U0001fa00-\U0001fa6f\U0001fa70-\U0001faff\u200d]+",
-            flags=re.UNICODE,
-        )
-        menu_choices = [f"  {_emoji_re.sub('', choice).strip()}" for choice in choices]
-
-        print_info("  ↑/↓ Navigate  Enter Select  Esc Skip  Ctrl+C Exit")
-
-        terminal_menu = TerminalMenu(
-            menu_choices,
-            cursor_index=default,
-            menu_cursor="→ ",
-            menu_cursor_style=("fg_green", "bold"),
-            menu_highlight_style=("fg_green",),
-            cycle_cursor=True,
-            clear_screen=False,
-        )
-
-        idx = terminal_menu.show()
-        if idx is None:  # User pressed Escape — keep current value
-            print_info(f"  Skipped (keeping current)")
+    idx = _curses_prompt_choice(question, choices, default)
+    if idx >= 0:
+        if idx == default:
+            print_info("  Skipped (keeping current)")
            print()
            return default
-        print()  # Add newline after selection
+        print()
        return idx

-    except (ImportError, NotImplementedError):
-        pass
-    except Exception as e:
-        print(f"  (Interactive menu unavailable: {e})")
-
-    # Fallback to number-based selection (simple_term_menu doesn't support Windows)
+    print(color(question, Colors.YELLOW))
    for i, choice in enumerate(choices):
        marker = "●" if i == default else "○"
        if i == default:
@@ -344,84 +377,15 @@ def prompt_checklist(title: str, items: list, pre_selected: list = None) -> list
    if pre_selected is None:
        pre_selected = []

-    print(color(title, Colors.YELLOW))
-    print_info("  SPACE Toggle  ENTER Confirm  ESC Skip  Ctrl+C Exit")
-    print()
+    from hermes_cli.curses_ui import curses_checklist

-    try:
-        from simple_term_menu import TerminalMenu
-        import re
-
-        # Strip emoji characters from menu labels — simple_term_menu miscalculates
-        # visual width of emojis on macOS, causing duplicated/garbled lines.
-        _emoji_re = re.compile(
-            "[\U0001f300-\U0001f9ff\U00002600-\U000027bf\U0000fe00-\U0000fe0f"
-            "\U0001fa00-\U0001fa6f\U0001fa70-\U0001faff\u200d]+",
-            flags=re.UNICODE,
-        )
-        menu_items = [f"  {_emoji_re.sub('', item).strip()}" for item in items]
-
-        # Map pre-selected indices to the actual menu entry strings
-        preselected = [menu_items[i] for i in pre_selected if i < len(menu_items)]
-
-        terminal_menu = TerminalMenu(
-            menu_items,
-            multi_select=True,
-            show_multi_select_hint=False,
-            multi_select_cursor="[✓] ",
-            multi_select_select_on_accept=False,
-            multi_select_empty_ok=True,
-            preselected_entries=preselected if preselected else None,
-            menu_cursor="→ ",
-            menu_cursor_style=("fg_green", "bold"),
-            menu_highlight_style=("fg_green",),
-            cycle_cursor=True,
-            clear_screen=False,
-        )
-
-        terminal_menu.show()
-
-        if terminal_menu.chosen_menu_entries is None:
-            print_info("  Skipped (keeping current)")
-            return list(pre_selected)
-
-        selected = list(terminal_menu.chosen_menu_indices or [])
-        return selected
-
-    except (ImportError, NotImplementedError):
-        # Fallback: numbered toggle interface (simple_term_menu doesn't support Windows)
-        selected = set(pre_selected)
-
-        while True:
-            for i, item in enumerate(items):
-                marker = color("[✓]", Colors.GREEN) if i in selected else "[ ]"
-                print(f"  {marker} {i + 1}. {item}")
-            print()
-
-            try:
-                value = input(
-                    color("  Toggle # (or Enter to confirm): ", Colors.DIM)
-                ).strip()
-                if not value:
-                    break
-                idx = int(value) - 1
-                if 0 <= idx < len(items):
-                    if idx in selected:
-                        selected.discard(idx)
-                    else:
-                        selected.add(idx)
-                else:
-                    print_error(f"Enter a number between 1 and {len(items)}")
-            except ValueError:
-                print_error("Enter a number")
-            except (KeyboardInterrupt, EOFError):
-                print()
-                return []
-
-            # Clear and redraw (simple approach)
-            print()
-
-        return sorted(selected)
+    chosen = curses_checklist(
+        title,
+        items,
+        set(pre_selected),
+        cancel_returns=set(pre_selected),
+    )
+    return sorted(chosen)


 def _prompt_api_key(var: dict):
@@ -761,6 +725,7 @@ def setup_model_provider(config: dict):
        "MiniMax (global endpoint)",
        "MiniMax China (mainland China endpoint)",
        "Anthropic (Claude models — API key or Claude Code subscription)",
+        "AI Gateway (Vercel — 200+ models, pay-per-use)",
    ]
    if keep_label:
        provider_choices.append(keep_label)
@@ -780,6 +745,7 @@ def setup_model_provider(config: dict):
    selected_provider = (
        None  # "nous", "openai-codex", "openrouter", "custom", or None (keep)
    )
+    selected_base_url = None  # deferred until after model selection
    nous_models = []  # populated if Nous login succeeds

    if provider_idx == 0:  # Nous Portal (OAuth)
@@ -1062,8 +1028,8 @@ def setup_model_provider(config: dict):
        if existing_custom:
            save_env_value("OPENAI_BASE_URL", "")
            save_env_value("OPENAI_API_KEY", "")
-        _update_config_for_provider("zai", zai_base_url, default_model="glm-5")
        _set_model_provider(config, "zai", zai_base_url)
+        selected_base_url = zai_base_url

    elif provider_idx == 5:  # Kimi / Moonshot
        selected_provider = "kimi-coding"
@@ -1095,8 +1061,8 @@ def setup_model_provider(config: dict):
        if existing_custom:
            save_env_value("OPENAI_BASE_URL", "")
            save_env_value("OPENAI_API_KEY", "")
-        _update_config_for_provider("kimi-coding", pconfig.inference_base_url, default_model="kimi-k2.5")
        _set_model_provider(config, "kimi-coding", pconfig.inference_base_url)
+        selected_base_url = pconfig.inference_base_url

    elif provider_idx == 6:  # MiniMax
        selected_provider = "minimax"
@@ -1128,8 +1094,8 @@ def setup_model_provider(config: dict):
        if existing_custom:
            save_env_value("OPENAI_BASE_URL", "")
            save_env_value("OPENAI_API_KEY", "")
-        _update_config_for_provider("minimax", pconfig.inference_base_url, default_model="MiniMax-M2.5")
        _set_model_provider(config, "minimax", pconfig.inference_base_url)
+        selected_base_url = pconfig.inference_base_url

    elif provider_idx == 7:  # MiniMax China
        selected_provider = "minimax-cn"
@@ -1161,8 +1127,8 @@ def setup_model_provider(config: dict):
        if existing_custom:
            save_env_value("OPENAI_BASE_URL", "")
            save_env_value("OPENAI_API_KEY", "")
-        _update_config_for_provider("minimax-cn", pconfig.inference_base_url, default_model="MiniMax-M2.5")
        _set_model_provider(config, "minimax-cn", pconfig.inference_base_url)
+        selected_base_url = pconfig.inference_base_url

    elif provider_idx == 8:  # Anthropic
        selected_provider = "anthropic"
@@ -1265,10 +1231,42 @@ def setup_model_provider(config: dict):
            save_env_value("OPENAI_API_KEY", "")
        # Don't save base_url for Anthropic — resolve_runtime_provider()
        # always hardcodes it. Stale base_urls contaminate other providers.
-        _update_config_for_provider("anthropic", "", default_model="claude-opus-4-6")
        _set_model_provider(config, "anthropic")
+        selected_base_url = ""

-    # else: provider_idx == 9 (Keep current) — only shown when a provider already exists
+    elif provider_idx == 9:  # AI Gateway
+        selected_provider = "ai-gateway"
+        print()
+        print_header("AI Gateway API Key")
+        pconfig = PROVIDER_REGISTRY["ai-gateway"]
+        print_info(f"Provider: {pconfig.name}")
+        print_info("Get your API key at: https://vercel.com/docs/ai-gateway")
+        print()
+
+        existing_key = get_env_value("AI_GATEWAY_API_KEY")
+        if existing_key:
+            print_info(f"Current: {existing_key[:8]}... (configured)")
+            if prompt_yes_no("Update API key?", False):
+                api_key = prompt("  AI Gateway API key", password=True)
+                if api_key:
+                    save_env_value("AI_GATEWAY_API_KEY", api_key)
+                    print_success("AI Gateway API key updated")
+        else:
+            api_key = prompt("  AI Gateway API key", password=True)
+            if api_key:
+                save_env_value("AI_GATEWAY_API_KEY", api_key)
+                print_success("AI Gateway API key saved")
+            else:
+                print_warning("Skipped - agent won't work without an API key")
+
+        # Clear custom endpoint vars if switching
+        if existing_custom:
+            save_env_value("OPENAI_BASE_URL", "")
+            save_env_value("OPENAI_API_KEY", "")
+        _update_config_for_provider("ai-gateway", pconfig.inference_base_url, default_model="anthropic/claude-opus-4.6")
+        _set_model_provider(config, "ai-gateway", pconfig.inference_base_url)
+
+    # else: provider_idx == 10 (Keep current) — only shown when a provider already exists
    # Normalize "keep current" to an explicit provider so downstream logic
    # doesn't fall back to the generic OpenRouter/static-model path.
    if selected_provider is None:
@@ -1305,6 +1303,7 @@ def setup_model_provider(config: dict):
            "minimax": "MiniMax",
            "minimax-cn": "MiniMax CN",
            "anthropic": "Anthropic",
+            "ai-gateway": "AI Gateway",
            "custom": "your custom endpoint",
        }
        _prov_display = _prov_names.get(selected_provider, selected_provider or "your provider")
@@ -1438,7 +1437,7 @@ def setup_model_provider(config: dict):
                    _set_default_model(config, custom)
            _update_config_for_provider("openai-codex", DEFAULT_CODEX_BASE_URL)
            _set_model_provider(config, "openai-codex", DEFAULT_CODEX_BASE_URL)
-        elif selected_provider in ("zai", "kimi-coding", "minimax", "minimax-cn"):
+        elif selected_provider in ("zai", "kimi-coding", "minimax", "minimax-cn", "ai-gateway"):
            _setup_provider_model_selection(
                config, selected_provider, current_model,
                prompt_choice, prompt,
@@ -1496,6 +1495,12 @@ def setup_model_provider(config: dict):
            )
            print_success(f"Model set to: {_display}")

+    # Write provider+base_url to config.yaml only after model selection is complete.
+    # This prevents a race condition where the gateway picks up a new provider
+    # before the model name has been updated to match.
+    if selected_provider in ("zai", "kimi-coding", "minimax", "minimax-cn", "anthropic") and selected_base_url is not None:
+        _update_config_for_provider(selected_provider, selected_base_url)
+
    save_config(config)


@@ -275,8 +275,13 @@ def show_status(args):
    print(color("◆ Gateway Service", Colors.CYAN, Colors.BOLD))
    
    if sys.platform.startswith('linux'):
+        try:
+            from hermes_cli.gateway import get_service_name
+            _gw_svc = get_service_name()
+        except Exception:
+            _gw_svc = "hermes-gateway"
        result = subprocess.run(
-            ["systemctl", "--user", "is-active", "hermes-gateway"],
+            ["systemctl", "--user", "is-active", _gw_svc],
            capture_output=True,
            text=True
        )
@@ -133,7 +133,13 @@ def uninstall_gateway_service():
    if platform.system() != "Linux":
        return False
    
-    service_file = Path.home() / ".config" / "systemd" / "user" / "hermes-gateway.service"
+    try:
+        from hermes_cli.gateway import get_service_name
+        svc_name = get_service_name()
+    except Exception:
+        svc_name = "hermes-gateway"
+
+    service_file = Path.home() / ".config" / "systemd" / "user" / f"{svc_name}.service"
    
    if not service_file.exists():
        return False
@@ -141,14 +147,14 @@ def uninstall_gateway_service():
    try:
        # Stop the service
        subprocess.run(
-            ["systemctl", "--user", "stop", "hermes-gateway"],
+            ["systemctl", "--user", "stop", svc_name],
            capture_output=True,
            check=False
        )
        
        # Disable the service
        subprocess.run(
-            ["systemctl", "--user", "disable", "hermes-gateway"],
+            ["systemctl", "--user", "disable", svc_name],
            capture_output=True,
            check=False
        )
@@ -8,5 +8,9 @@ OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
 OPENROUTER_MODELS_URL = f"{OPENROUTER_BASE_URL}/models"
 OPENROUTER_CHAT_URL = f"{OPENROUTER_BASE_URL}/chat/completions"

+AI_GATEWAY_BASE_URL = "https://ai-gateway.vercel.sh/v1"
+AI_GATEWAY_MODELS_URL = f"{AI_GATEWAY_BASE_URL}/models"
+AI_GATEWAY_CHAT_URL = f"{AI_GATEWAY_BASE_URL}/chat/completions"
+
 NOUS_API_BASE_URL = "https://inference-api.nousresearch.com/v1"
 NOUS_API_CHAT_URL = f"{NOUS_API_BASE_URL}/chat/completions"
@@ -114,11 +114,12 @@ class HonchoClientConfig:
    @classmethod
    def from_env(cls, workspace_id: str = "hermes") -> HonchoClientConfig:
        """Create config from environment variables (fallback)."""
+        api_key = os.environ.get("HONCHO_API_KEY")
        return cls(
            workspace_id=workspace_id,
-            api_key=os.environ.get("HONCHO_API_KEY"),
+            api_key=api_key,
            environment=os.environ.get("HONCHO_ENVIRONMENT", "production"),
-            enabled=True,
+            enabled=bool(api_key),
        )

    @classmethod
@@ -113,6 +113,13 @@ try:
 except Exception as e:
    logger.debug("MCP tool discovery failed: %s", e)

+# Plugin tool discovery (user/project/pip plugins)
+try:
+    from hermes_cli.plugins import discover_plugins
+    discover_plugins()
+except Exception as e:
+    logger.debug("Plugin discovery failed: %s", e)
+

 # =============================================================================
 # Backward-compat constants  (built once after discovery)
@@ -222,6 +229,16 @@ def get_tool_definitions(
        for ts_name in get_all_toolsets():
            tools_to_include.update(resolve_toolset(ts_name))

+    # Always include plugin-registered tools — they bypass the toolset filter
+    # because their toolsets are dynamic (created at plugin load time).
+    try:
+        from hermes_cli.plugins import get_plugin_tool_names
+        plugin_tools = get_plugin_tool_names()
+        if plugin_tools:
+            tools_to_include.update(plugin_tools)
+    except Exception:
+        pass
+
    # Ask the registry for schemas (only returns tools whose check_fn passes)
    filtered_tools = registry.get_definitions(tools_to_include, quiet=quiet_mode)

@@ -267,6 +284,8 @@ def handle_function_call(
    task_id: Optional[str] = None,
    user_task: Optional[str] = None,
    enabled_tools: Optional[List[str]] = None,
+    honcho_manager: Optional[Any] = None,
+    honcho_session_key: Optional[str] = None,
 ) -> str:
    """
    Main function call dispatcher that routes calls to the tool registry.
@@ -298,21 +317,39 @@ def handle_function_call(
        if function_name in _AGENT_LOOP_TOOLS:
            return json.dumps({"error": f"{function_name} must be handled by the agent loop"})

+        try:
+            from hermes_cli.plugins import invoke_hook
+            invoke_hook("pre_tool_call", tool_name=function_name, args=function_args, task_id=task_id or "")
+        except Exception:
+            pass
+
        if function_name == "execute_code":
            # Prefer the caller-provided list so subagents can't overwrite
            # the parent's tool set via the process-global.
            sandbox_enabled = enabled_tools if enabled_tools is not None else _last_resolved_tool_names
-            return registry.dispatch(
+            result = registry.dispatch(
                function_name, function_args,
                task_id=task_id,
                enabled_tools=sandbox_enabled,
+                honcho_manager=honcho_manager,
+                honcho_session_key=honcho_session_key,
+            )
+        else:
+            result = registry.dispatch(
+                function_name, function_args,
+                task_id=task_id,
+                user_task=user_task,
+                honcho_manager=honcho_manager,
+                honcho_session_key=honcho_session_key,
            )

-        return registry.dispatch(
-            function_name, function_args,
-            task_id=task_id,
-            user_task=user_task,
-        )
+        try:
+            from hermes_cli.plugins import invoke_hook
+            invoke_hook("post_tool_call", tool_name=function_name, args=function_args, result=result, task_id=task_id or "")
+        except Exception:
+            pass
+
+        return result

    except Exception as e:
        error_msg = f"Error executing {function_name}: {str(e)}"
@@ -0,0 +1,116 @@
+---
+name: blender-mcp
+description: Control Blender directly from Hermes via socket connection to the blender-mcp addon. Create 3D objects, materials, animations, and run arbitrary Blender Python (bpy) code. Use when user wants to create or modify anything in Blender.
+version: 1.0.0
+requires: Blender 4.3+ (desktop instance required, headless not supported)
+author: alireza78a
+tags: [blender, 3d, animation, modeling, bpy, mcp]
+---
+
+# Blender MCP
+
+Control a running Blender instance from Hermes via socket on TCP port 9876.
+
+## Setup (one-time)
+
+### 1. Install the Blender addon
+
+    curl -sL https://raw.githubusercontent.com/ahujasid/blender-mcp/main/addon.py -o ~/Desktop/blender_mcp_addon.py
+
+In Blender:
+    Edit > Preferences > Add-ons > Install > select blender_mcp_addon.py
+    Enable "Interface: Blender MCP"
+
+### 2. Start the socket server in Blender
+
+Press N in Blender viewport to open sidebar.
+Find "BlenderMCP" tab and click "Start Server".
+
+### 3. Verify connection
+
+    nc -z -w2 localhost 9876 && echo "OPEN" || echo "CLOSED"
+
+## Protocol
+
+Plain UTF-8 JSON over TCP -- no length prefix.
+
+Send:     {"type": "<command>", "params": {<kwargs>}}
+Receive:  {"status": "success", "result": <value>}
+          {"status": "error",   "message": "<reason>"}
+
+## Available Commands
+
+| type                    | params            | description                     |
+|-------------------------|-------------------|---------------------------------|
+| execute_code            | code (str)        | Run arbitrary bpy Python code   |
+| get_scene_info          | (none)            | List all objects in scene       |
+| get_object_info         | object_name (str) | Details on a specific object    |
+| get_viewport_screenshot | (none)            | Screenshot of current viewport  |
+
+## Python Helper
+
+Use this inside execute_code tool calls:
+
+    import socket, json
+
+    def blender_exec(code: str, host="localhost", port=9876, timeout=15):
+        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+        s.connect((host, port))
+        s.settimeout(timeout)
+        payload = json.dumps({"type": "execute_code", "params": {"code": code}})
+        s.sendall(payload.encode("utf-8"))
+        buf = b""
+        while True:
+            try:
+                chunk = s.recv(4096)
+                if not chunk:
+                    break
+                buf += chunk
+                try:
+                    json.loads(buf.decode("utf-8"))
+                    break
+                except json.JSONDecodeError:
+                    continue
+            except socket.timeout:
+                break
+        s.close()
+        return json.loads(buf.decode("utf-8"))
+
+## Common bpy Patterns
+
+### Clear scene
+    bpy.ops.object.select_all(action='SELECT')
+    bpy.ops.object.delete()
+
+### Add mesh objects
+    bpy.ops.mesh.primitive_uv_sphere_add(radius=1, location=(0, 0, 0))
+    bpy.ops.mesh.primitive_cube_add(size=2, location=(3, 0, 0))
+    bpy.ops.mesh.primitive_cylinder_add(radius=0.5, depth=2, location=(-3, 0, 0))
+
+### Create and assign material
+    mat = bpy.data.materials.new(name="MyMat")
+    mat.use_nodes = True
+    bsdf = mat.node_tree.nodes.get("Principled BSDF")
+    bsdf.inputs["Base Color"].default_value = (R, G, B, 1.0)
+    bsdf.inputs["Roughness"].default_value = 0.3
+    bsdf.inputs["Metallic"].default_value = 0.0
+    obj.data.materials.append(mat)
+
+### Keyframe animation
+    obj.location = (0, 0, 0)
+    obj.keyframe_insert(data_path="location", frame=1)
+    obj.location = (0, 0, 3)
+    obj.keyframe_insert(data_path="location", frame=60)
+
+### Render to file
+    bpy.context.scene.render.filepath = "/tmp/render.png"
+    bpy.context.scene.render.engine = 'CYCLES'
+    bpy.ops.render.render(write_still=True)
+
+## Pitfalls
+
+- Must check socket is open before running (nc -z localhost 9876)
+- Addon server must be started inside Blender each session (N-panel > BlenderMCP > Connect)
+- Break complex scenes into multiple smaller execute_code calls to avoid timeouts
+- Render output path must be absolute (/tmp/...) not relative
+- shade_smooth() requires object to be selected and in object mode
@@ -0,0 +1,422 @@
+---
+name: oss-forensics
+description: |
+  Supply chain investigation, evidence recovery, and forensic analysis for GitHub repositories.
+  Covers deleted commit recovery, force-push detection, IOC extraction, multi-source evidence
+  collection, hypothesis formation/validation, and structured forensic reporting.
+  Inspired by RAPTOR's 1800+ line OSS Forensics system.
+category: security
+triggers:
+  - "investigate this repository"
+  - "investigate [owner/repo]"
+  - "check for supply chain compromise"
+  - "recover deleted commits"
+  - "forensic analysis of [owner/repo]"
+  - "was this repo compromised"
+  - "supply chain attack"
+  - "suspicious commit"
+  - "force push detected"
+  - "IOC extraction"
+toolsets:
+  - terminal
+  - web
+  - file
+  - delegation
+---
+
+# OSS Security Forensics Skill
+
+A 7-phase multi-agent investigation framework for researching open-source supply chain attacks.
+Adapted from RAPTOR's forensics system. Covers GitHub Archive, Wayback Machine, GitHub API,
+local git analysis, IOC extraction, evidence-backed hypothesis formation and validation,
+and final forensic report generation.
+
+---
+
+## ⚠️ Anti-Hallucination Guardrails
+
+Read these before every investigation step. Violating them invalidates the report.
+
+1. **Evidence-First Rule**: Every claim in any report, hypothesis, or summary MUST cite at least one evidence ID (`EV-XXXX`). Assertions without citations are forbidden.
+2. **STAY IN YOUR LANE**: Each sub-agent (investigator) has a single data source. Do NOT mix sources. The GH Archive investigator does not query the GitHub API, and vice versa. Role boundaries are hard.
+3. **Fact vs. Hypothesis Separation**: Mark all unverified inferences with `[HYPOTHESIS]`. Only statements verified against original sources may be stated as facts.
+4. **No Evidence Fabrication**: The hypothesis validator MUST mechanically check that every cited evidence ID actually exists in the evidence store before accepting a hypothesis.
+5. **Proof-Required Disproval**: A hypothesis cannot be dismissed without a specific, evidence-backed counter-argument. "No evidence found" is not sufficient to disprove—it only makes a hypothesis inconclusive.
+6. **SHA/URL Double-Verification**: Any commit SHA, URL, or external identifier cited as evidence must be independently confirmed from at least two sources before being marked as verified.
+7. **Suspicious Code Rule**: Never run code found inside the investigated repository locally. Analyze statically only, or use `execute_code` in a sandboxed environment.
+8. **Secret Redaction**: Any API keys, tokens, or credentials discovered during investigation must be redacted in the final report. Log them internally only.
+
+---
+
+## Example Scenarios
+
+- **Scenario A: Dependency Confusion**: A malicious package `internal-lib-v2` is uploaded to NPM with a higher version than the internal one. The investigator must track when this package was first seen and if any PushEvents in the target repo updated `package.json` to this version.
+- **Scenario B: Maintainer Takeover**: A long-term contributor's account is used to push a backdoored `.github/workflows/build.yml`. The investigator looks for PushEvents from this user after a long period of inactivity or from a new IP/location (if detectable via BigQuery).
+- **Scenario C: Force-Push Hide**: A developer accidentally commits a production secret, then force-pushes to "fix" it. The investigator uses `git fsck` and GH Archive to recover the original commit SHA and verify what was leaked.
+
+---
+
+> **Path convention**: Throughout this skill, `SKILL_DIR` refers to the root of this skill's
+> installation directory (the folder containing this `SKILL.md`). When the skill is loaded,
+> resolve `SKILL_DIR` to the actual path — e.g. `~/.hermes/skills/security/oss-forensics/`
+> or the `optional-skills/` equivalent. All script and template references are relative to it.
+
+## Phase 0: Initialization
+
+1. Create investigation working directory:
+   ```bash
+   mkdir investigation_$(echo "REPO_NAME" | tr '/' '_')
+   cd investigation_$(echo "REPO_NAME" | tr '/' '_')
+   ```
+2. Initialize the evidence store:
+   ```bash
+   python3 SKILL_DIR/scripts/evidence-store.py --store evidence.json list
+   ```
+3. Copy the forensic report template:
+   ```bash
+   cp SKILL_DIR/templates/forensic-report.md ./investigation-report.md
+   ```
+4. Create an `iocs.md` file to track Indicators of Compromise as they are discovered.
+5. Record the investigation start time, target repository, and stated investigation goal.
+
+---
+
+## Phase 1: Prompt Parsing and IOC Extraction
+
+**Goal**: Extract all structured investigative targets from the user's request.
+
+**Actions**:
+- Parse the user prompt and extract:
+  - Target repository (`owner/repo`)
+  - Target actors (GitHub handles, email addresses)
+  - Time window of interest (commit date ranges, PR timestamps)
+  - Provided Indicators of Compromise: commit SHAs, file paths, package names, IP addresses, domains, API keys/tokens, malicious URLs
+  - Any linked vendor security reports or blog posts
+
+**Tools**: Reasoning only, or `execute_code` for regex extraction from large text blocks.
+
+**Output**: Populate `iocs.md` with extracted IOCs. Each IOC must have:
+- Type (from: COMMIT_SHA, FILE_PATH, API_KEY, SECRET, IP_ADDRESS, DOMAIN, PACKAGE_NAME, ACTOR_USERNAME, MALICIOUS_URL, OTHER)
+- Value
+- Source (user-provided, inferred)
+
+**Reference**: See [evidence-types.md](./references/evidence-types.md) for IOC taxonomy.
+
+---
+
+## Phase 2: Parallel Evidence Collection
+
+Spawn up to 5 specialist investigator sub-agents using `delegate_task` (batch mode, max 3 concurrent). Each investigator has a **single data source** and must not mix sources.
+
+> **Orchestrator note**: Pass the IOC list from Phase 1 and the investigation time window in the `context` field of each delegated task.
+
+---
+
+### Investigator 1: Local Git Investigator
+
+**ROLE BOUNDARY**: You query the LOCAL GIT REPOSITORY ONLY. Do not call any external APIs.
+
+**Actions**:
+```bash
+# Clone repository
+git clone https://github.com/OWNER/REPO.git target_repo && cd target_repo
+
+# Full commit log with stats
+git log --all --full-history --stat --format="%H|%ae|%an|%ai|%s" > ../git_log.txt
+
+# Detect force-push evidence (orphaned/dangling commits)
+git fsck --lost-found --unreachable 2>&1 | grep commit > ../dangling_commits.txt
+
+# Check reflog for rewritten history
+git reflog --all > ../reflog.txt
+
+# List ALL branches including deleted remote refs
+git branch -a -v > ../branches.txt
+
+# Find suspicious large binary additions
+git log --all --diff-filter=A --name-only --format="%H %ai" -- "*.so" "*.dll" "*.exe" "*.bin" > ../binary_additions.txt
+
+# Check for GPG signature anomalies
+git log --show-signature --format="%H %ai %aN" > ../signature_check.txt 2>&1
+```
+
+**Evidence to collect** (add via `python3 SKILL_DIR/scripts/evidence-store.py add`):
+- Each dangling commit SHA → type: `git`
+- Force-push evidence (reflog showing history rewrite) → type: `git`
+- Unsigned commits from verified contributors → type: `git`
+- Suspicious binary file additions → type: `git`
+
+**Reference**: See [recovery-techniques.md](./references/recovery-techniques.md) for accessing force-pushed commits.
+
+---
+
+### Investigator 2: GitHub API Investigator
+
+**ROLE BOUNDARY**: You query the GITHUB REST API ONLY. Do not run git commands locally.
+
+**Actions**:
+```bash
+# Commits (paginated)
+curl -s "https://api.github.com/repos/OWNER/REPO/commits?per_page=100" > api_commits.json
+
+# Pull Requests including closed/deleted
+curl -s "https://api.github.com/repos/OWNER/REPO/pulls?state=all&per_page=100" > api_prs.json
+
+# Issues
+curl -s "https://api.github.com/repos/OWNER/REPO/issues?state=all&per_page=100" > api_issues.json
+
+# Contributors and collaborator changes
+curl -s "https://api.github.com/repos/OWNER/REPO/contributors" > api_contributors.json
+
+# Repository events (last 300)
+curl -s "https://api.github.com/repos/OWNER/REPO/events?per_page=100" > api_events.json
+
+# Check specific suspicious commit SHA details
+curl -s "https://api.github.com/repos/OWNER/REPO/git/commits/SHA" > commit_detail.json
+
+# Releases
+curl -s "https://api.github.com/repos/OWNER/REPO/releases?per_page=100" > api_releases.json
+
+# Check if a specific commit exists (force-pushed commits may 404 on commits/ but succeed on git/commits/)
+curl -s "https://api.github.com/repos/OWNER/REPO/commits/SHA" | jq .sha
+```
+
+**Cross-reference targets** (flag discrepancies as evidence):
+- PR exists in archive but missing from API → evidence of deletion
+- Contributor in archive events but not in contributors list → evidence of permission revocation
+- Commit in archive PushEvents but not in API commit list → evidence of force-push/deletion
+
+**Reference**: See [evidence-types.md](./references/evidence-types.md) for GH event types.
+
+---
+
+### Investigator 3: Wayback Machine Investigator
+
+**ROLE BOUNDARY**: You query the WAYBACK MACHINE CDX API ONLY. Do not use the GitHub API.
+
+**Goal**: Recover deleted GitHub pages (READMEs, issues, PRs, releases, wiki pages).
+
+**Actions**:
+```bash
+# Search for archived snapshots of the repo main page
+curl -s "https://web.archive.org/cdx/search/cdx?url=github.com/OWNER/REPO&output=json&limit=100&from=YYYYMMDD&to=YYYYMMDD" > wayback_main.json
+
+# Search for a specific deleted issue
+curl -s "https://web.archive.org/cdx/search/cdx?url=github.com/OWNER/REPO/issues/NUM&output=json&limit=50" > wayback_issue_NUM.json
+
+# Search for a specific deleted PR
+curl -s "https://web.archive.org/cdx/search/cdx?url=github.com/OWNER/REPO/pull/NUM&output=json&limit=50" > wayback_pr_NUM.json
+
+# Fetch the best snapshot of a page
+# Use the Wayback Machine URL: https://web.archive.org/web/TIMESTAMP/ORIGINAL_URL
+# Example: https://web.archive.org/web/20240101000000*/github.com/OWNER/REPO
+
+# Advanced: Search for deleted releases/tags
+curl -s "https://web.archive.org/cdx/search/cdx?url=github.com/OWNER/REPO/releases/tag/*&output=json" > wayback_tags.json
+
+# Advanced: Search for historical wiki changes
+curl -s "https://web.archive.org/cdx/search/cdx?url=github.com/OWNER/REPO/wiki/*&output=json" > wayback_wiki.json
+```
+
+**Evidence to collect**:
+- Archived snapshots of deleted issues/PRs with their content
+- Historical README versions showing changes
+- Evidence of content present in archive but missing from current GitHub state
+
+**Reference**: See [github-archive-guide.md](./references/github-archive-guide.md) for CDX API parameters.
+
+---
+
+### Investigator 4: GH Archive / BigQuery Investigator
+
+**ROLE BOUNDARY**: You query GITHUB ARCHIVE via BIGQUERY ONLY. This is a tamper-proof record of all public GitHub events.
+
+> **Prerequisites**: Requires Google Cloud credentials with BigQuery access (`gcloud auth application-default login`). If unavailable, skip this investigator and note it in the report.
+
+**Cost Optimization Rules** (MANDATORY):
+1. ALWAYS run a `--dry_run` before every query to estimate cost.
+2. Use `_TABLE_SUFFIX` to filter by date range and minimize scanned data.
+3. Only SELECT the columns you need.
+4. Add a LIMIT unless aggregating.
+
+```bash
+# Template: safe BigQuery query for PushEvents to OWNER/REPO
+bq query --use_legacy_sql=false --dry_run "
+SELECT created_at, actor.login, payload.commits, payload.before, payload.head,
+       payload.size, payload.distinct_size
+FROM \`githubarchive.month.*\`
+WHERE _TABLE_SUFFIX BETWEEN 'YYYYMM' AND 'YYYYMM'
+  AND type = 'PushEvent'
+  AND repo.name = 'OWNER/REPO'
+LIMIT 1000
+"
+# If cost is acceptable, re-run without --dry_run
+
+# Detect force-pushes: zero-distinct_size PushEvents mean commits were force-erased
+# payload.distinct_size = 0 AND payload.size > 0 → force push indicator
+
+# Check for deleted branch events
+bq query --use_legacy_sql=false "
+SELECT created_at, actor.login, payload.ref, payload.ref_type
+FROM \`githubarchive.month.*\`
+WHERE _TABLE_SUFFIX BETWEEN 'YYYYMM' AND 'YYYYMM'
+  AND type = 'DeleteEvent'
+  AND repo.name = 'OWNER/REPO'
+LIMIT 200
+"
+```
+
+**Evidence to collect**:
+- Force-push events (payload.size > 0, payload.distinct_size = 0)
+- DeleteEvents for branches/tags
+- WorkflowRunEvents for suspicious CI/CD automation
+- PushEvents that precede a "gap" in the git log (evidence of rewrite)
+
+**Reference**: See [github-archive-guide.md](./references/github-archive-guide.md) for all 12 event types and query patterns.
+
+---
+
+### Investigator 5: IOC Enrichment Investigator
+
+**ROLE BOUNDARY**: You enrich EXISTING IOCs from Phase 1 using passive public sources ONLY. Do not execute any code from the target repository.
+
+**Actions**:
+- For each commit SHA: attempt recovery via direct GitHub URL (`github.com/OWNER/REPO/commit/SHA.patch`)
+- For each domain/IP: check passive DNS, WHOIS records (via `web_extract` on public WHOIS services)
+- For each package name: check npm/PyPI for matching malicious package reports
+- For each actor username: check GitHub profile, contribution history, account age
+- Recover force-pushed commits using 3 methods (see [recovery-techniques.md](./references/recovery-techniques.md))
+
+---
+
+## Phase 3: Evidence Consolidation
+
+After all investigators complete:
+
+1. Run `python3 SKILL_DIR/scripts/evidence-store.py --store evidence.json list` to see all collected evidence.
+2. For each piece of evidence, verify the `content_sha256` hash matches the original source.
+3. Group evidence by:
+   - **Timeline**: Sort all timestamped evidence chronologically
+   - **Actor**: Group by GitHub handle or email
+   - **IOC**: Link evidence to the IOC it relates to
+4. Identify **discrepancies**: items present in one source but absent in another (key deletion indicators).
+5. Flag evidence as `[VERIFIED]` (confirmed from 2+ independent sources) or `[UNVERIFIED]` (single source only).
+
+---
+
+## Phase 4: Hypothesis Formation
+
+A hypothesis must:
+- State a specific claim (e.g., "Actor X force-pushed to BRANCH on DATE to erase commit SHA")
+- Cite at least 2 evidence IDs that support it (`EV-XXXX`, `EV-YYYY`)
+- Identify what evidence would disprove it
+- Be labeled `[HYPOTHESIS]` until validated
+
+**Common hypothesis templates** (see [investigation-templates.md](./references/investigation-templates.md)):
+- Maintainer Compromise: legitimate account used post-takeover to inject malicious code
+- Dependency Confusion: package name squatting to intercept installs
+- CI/CD Injection: malicious workflow changes to run code during builds
+- Typosquatting: near-identical package name targeting misspellers
+- Credential Leak: token/key accidentally committed then force-pushed to erase
+
+For each hypothesis, spawn a `delegate_task` sub-agent to attempt to find disconfirming evidence before confirming.
+
+---
+
+## Phase 5: Hypothesis Validation
+
+The validator sub-agent MUST mechanically check:
+
+1. For each hypothesis, extract all cited evidence IDs.
+2. Verify each ID exists in `evidence.json` (hard failure if any ID is missing → hypothesis rejected as potentially fabricated).
+3. Verify each `[VERIFIED]` piece of evidence was confirmed from 2+ sources.
+4. Check logical consistency: does the timeline depicted by the evidence support the hypothesis?
+5. Check for alternative explanations: could the same evidence pattern arise from a benign cause?
+
+**Output**:
+- `VALIDATED`: All evidence cited, verified, logically consistent, no plausible alternative explanation.
+- `INCONCLUSIVE`: Evidence supports hypothesis but alternative explanations exist or evidence is insufficient.
+- `REJECTED`: Missing evidence IDs, unverified evidence cited as fact, logical inconsistency detected.
+
+Rejected hypotheses feed back into Phase 4 for refinement (max 3 iterations).
+
+---
+
+## Phase 6: Final Report Generation
+
+Populate `investigation-report.md` using the template in [forensic-report.md](./templates/forensic-report.md).
+
+**Mandatory sections**:
+- Executive Summary: one-paragraph verdict (Compromised / Clean / Inconclusive) with confidence level
+- Timeline: chronological reconstruction of all significant events with evidence citations
+- Validated Hypotheses: each with status and supporting evidence IDs
+- Evidence Registry: table of all `EV-XXXX` entries with source, type, and verification status
+- IOC List: all extracted and enriched Indicators of Compromise
+- Chain of Custody: how evidence was collected, from what sources, at what timestamps
+- Recommendations: immediate mitigations if compromise detected; monitoring recommendations
+
+**Report rules**:
+- Every factual claim must have at least one `[EV-XXXX]` citation
+- Executive Summary must state confidence level (High / Medium / Low)
+- All secrets/credentials must be redacted to `[REDACTED]`
+
+---
+
+## Phase 7: Completion
+
+1. Run final evidence count: `python3 SKILL_DIR/scripts/evidence-store.py --store evidence.json list`
+2. Archive the full investigation directory.
+3. If compromise is confirmed:
+   - List immediate mitigations (rotate credentials, pin dependency hashes, notify affected users)
+   - Identify affected versions/packages
+   - Note disclosure obligations (if a public package: coordinate with the package registry)
+4. Present the final `investigation-report.md` to the user.
+
+---
+
+## Ethical Use Guidelines
+
+This skill is designed for **defensive security investigation** — protecting open-source software from supply chain attacks. It must not be used for:
+
+- **Harassment or stalking** of contributors or maintainers
+- **Doxing** — correlating GitHub activity to real identities for malicious purposes
+- **Competitive intelligence** — investigating proprietary or internal repositories without authorization
+- **False accusations** — publishing investigation results without validated evidence (see anti-hallucination guardrails)
+
+Investigations should be conducted with the principle of **minimal intrusion**: collect only the evidence necessary to validate or refute the hypothesis. When publishing results, follow responsible disclosure practices and coordinate with affected maintainers before public disclosure.
+
+If the investigation reveals a genuine compromise, follow the coordinated vulnerability disclosure process:
+1. Notify the repository maintainers privately first
+2. Allow reasonable time for remediation (typically 90 days)
+3. Coordinate with package registries (npm, PyPI, etc.) if published packages are affected
+4. File a CVE if appropriate
+
+---
+
+## API Rate Limiting
+
+GitHub REST API enforces rate limits that will interrupt large investigations if not managed.
+
+**Authenticated requests**: 5,000/hour (requires `GITHUB_TOKEN` env var or `gh` CLI auth)
+**Unauthenticated requests**: 60/hour (unusable for investigations)
+
+**Best practices**:
+- Always authenticate: `export GITHUB_TOKEN=ghp_...` or use `gh` CLI (auto-authenticates)
+- Use conditional requests (`If-None-Match` / `If-Modified-Since` headers) to avoid consuming quota on unchanged data
+- For paginated endpoints, fetch all pages in sequence — don't parallelize against the same endpoint
+- Check `X-RateLimit-Remaining` header; if below 100, pause for `X-RateLimit-Reset` timestamp
+- BigQuery has its own quotas (10 TiB/day free tier) — always dry-run first
+- Wayback Machine CDX API: no formal rate limit, but be courteous (1-2 req/sec max)
+
+If rate-limited mid-investigation, record the partial results in the evidence store and note the limitation in the report.
+
+---
+
+## Reference Materials
+
+- [github-archive-guide.md](./references/github-archive-guide.md) — BigQuery queries, CDX API, 12 event types
+- [evidence-types.md](./references/evidence-types.md) — IOC taxonomy, evidence source types, observation types
+- [recovery-techniques.md](./references/recovery-techniques.md) — Recovering deleted commits, PRs, issues
+- [investigation-templates.md](./references/investigation-templates.md) — Pre-built hypothesis templates per attack type
+- [evidence-store.py](./scripts/evidence-store.py) — CLI tool for managing the evidence JSON store
+- [forensic-report.md](./templates/forensic-report.md) — Structured report template
@@ -0,0 +1,89 @@
+# Evidence Types Reference
+
+Taxonomy of all evidence types, IOC types, GitHub event types, and observation types
+used in OSS forensic investigations.
+
+---
+
+## Evidence Source Types
+
+| Type | Description | Example Sources |
+|------|-------------|-----------------|
+| `git` | Data from local git repository analysis | `git log`, `git fsck`, `git reflog`, `git blame` |
+| `gh_api` | Data from GitHub REST API responses | `/repos/.../commits`, `/repos/.../pulls`, `/repos/.../events` |
+| `gh_archive` | Data from GitHub Archive (BigQuery) | `githubarchive.month.*` BigQuery tables |
+| `web_archive` | Archived web pages from Wayback Machine | CDX API results, `web.archive.org/web/...` snapshots |
+| `ioc` | Indicator of Compromise from any source | Extracted from vendor reports, git history, network traces |
+| `analysis` | Derived insight from cross-source correlation | "SHA present in archive but absent from API" |
+| `vendor_report` | External security vendor or researcher report | CVE advisories, blog posts, NVD records |
+| `manual` | Manually recorded observation by investigator | Notes on behavioral patterns, timeline gaps |
+
+---
+
+## IOC Types
+
+| Type | Description | Example |
+|------|-------------|---------|
+| `COMMIT_SHA` | A git commit hash linked to malicious activity | `abc123def456...` |
+| `FILE_PATH` | A suspicious file inside the repository | `src/utils/crypto.js`, `dist/index.min.js` |
+| `API_KEY` | An API key accidentally committed | `AKIA...` (AWS), `ghp_...` (GitHub PAT) |
+| `SECRET` | A generic secret / credential | Database password, private key blob |
+| `IP_ADDRESS` | A C2 server or attacker IP | `192.0.2.1` |
+| `DOMAIN` | A malicious or suspicious domain | `evil-cdn.io`, typosquatted package registry domain |
+| `PACKAGE_NAME` | A malicious or squatted package name | `colo-rs` (typosquatting `color`), `lodash-utils` |
+| `ACTOR_USERNAME` | A GitHub handle linked to the attack | `malicious-bot-account` |
+| `MALICIOUS_URL` | A URL to a malicious resource | `https://evil.example.com/payload.sh` |
+| `WORKFLOW_FILE` | A suspicious CI/CD workflow file | `.github/workflows/release.yml` |
+| `BRANCH_NAME` | A suspicious branch | `refs/heads/temp-fix-do-not-merge` |
+| `TAG_NAME` | A suspicious git tag | `v1.0.0-security-patch` |
+| `RELEASE_NAME` | A suspicious release | Release with no associated tag or changelog |
+| `OTHER` | Catch-all for unclassified IOCs | — |
+
+---
+
+## GitHub Archive Event Types (12 Types)
+
+| Event Type | Forensic Relevance |
+|------------|-------------------|
+| `PushEvent` | Core: `payload.distinct_size=0` with `payload.size>0` → force push. `payload.before`/`payload.head` shows rewritten history. |
+| `PullRequestEvent` | Detects deleted PRs, rapid open→close patterns, PRs from new accounts |
+| `IssueEvent` | Detects deleted issues, coordinated labeling, rapid closure of vulnerability reports |
+| `IssueCommentEvent` | Deleted comments, rapid activity bursts |
+| `WatchEvent` | Star-farming campaigns (coordinated starring from new accounts) |
+| `ForkEvent` | Unusual fork patterns before malicious commit |
+| `CreateEvent` | Branch/tag creation: signals new release or code injection point |
+| `DeleteEvent` | Branch/tag deletion: critical — often used to hide traces |
+| `ReleaseEvent` | Unauthorized releases, release artifacts modified post-publish |
+| `MemberEvent` | Collaborator added/removed: maintainer compromise indicator |
+| `PublicEvent` | Repository made public (sometimes to drop malicious code briefly) |
+| `WorkflowRunEvent` | CI/CD pipeline executions: workflow injection, secret exfiltration |
+
+---
+
+## Evidence Verification States
+
+| State | Meaning |
+|-------|---------|
+| `unverified` | Collected from a single source, not cross-referenced |
+| `single_source` | The primary source has been confirmed directly (e.g., SHA resolves on GitHub), but no second source |
+| `multi_source_verified` | Confirmed from 2+ independent sources (e.g., GH Archive AND GitHub API both show the same event) |
+
+Only `multi_source_verified` evidence may be cited as fact in validated hypotheses.
+`unverified` and `single_source` evidence must be labeled `[UNVERIFIED]` or `[SINGLE-SOURCE]`.
+
+---
+
+## Observation Types (Patterned after RAPTOR)
+
+| Type | Description |
+|------|-------------|
+| `CommitObservation` | Specific commit SHA with metadata (author, date, files changed) |
+| `ForceWashObservation` | Evidence that commits were force-erased from a branch |
+| `DanglingCommitObservation` | SHA present in git object store but unreachable from any ref |
+| `IssueObservation` | A GitHub issue (current or archived) with title, body, timestamp |
+| `PRObservation` | A GitHub PR (current or archived) with diff summary, reviewers |
+| `IOC` | A single Indicator of Compromise with context |
+| `TimelineGap` | A period with unusual absence of expected activity |
+| `ActorAnomalyObservation` | Behavioral anomaly for a specific GitHub actor |
+| `WorkflowAnomalyObservation` | Suspicious CI/CD workflow change or unexpected run |
+| `CrossSourceDiscrepancy` | Item present in one source but absent in another (strong deletion indicator) |
@@ -0,0 +1,184 @@
+# GitHub Archive Query Guide (BigQuery)
+
+GitHub Archive records every public event on GitHub as immutable JSON records. This data is accessible via Google BigQuery and is the most reliable source for forensic investigation — events cannot be deleted or modified after recording.
+
+## Public Dataset
+
+- **Project**: `githubarchive`
+- **Tables**: `day.YYYYMMDD`, `month.YYYYMM`, `year.YYYY`
+- **Cost**: $6.25 per TiB scanned. Always run dry runs first.
+- **Access**: Requires a Google Cloud account with BigQuery enabled. Free tier includes 1 TiB/month of queries.
+
+---
+
+## The 12 GitHub Event Types
+
+| Event Type | What It Records | Forensic Value |
+|------------|-----------------|----------------|
+| `PushEvent` | Commits pushed to a branch | Force-push detection, commit timeline, author attribution |
+| `PullRequestEvent` | PR opened, closed, merged, reopened | Deleted PR recovery, review timeline |
+| `IssuesEvent` | Issue opened, closed, reopened, labeled | Deleted issue recovery, social engineering traces |
+| `IssueCommentEvent` | Comments on issues and PRs | Deleted comment recovery, communication patterns |
+| `CreateEvent` | Branch, tag, or repository creation | Suspicious branch creation, tag timing |
+| `DeleteEvent` | Branch or tag deletion | Evidence of cleanup after compromise |
+| `MemberEvent` | Collaborator added or removed | Permission changes, access escalation |
+| `PublicEvent` | Repository made public | Accidental exposure of private repos |
+| `WatchEvent` | User stars a repository | Actor reconnaissance patterns |
+| `ForkEvent` | Repository forked | Exfiltration of code before cleanup |
+| `ReleaseEvent` | Release published, edited, deleted | Malicious release injection, deleted release recovery |
+| `WorkflowRunEvent` | GitHub Actions workflow triggered | CI/CD abuse, unauthorized workflow runs |
+
+---
+
+## Query Templates
+
+### Basic: All Events for a Repository
+
+```sql
+SELECT
+  created_at,
+  type,
+  actor.login,
+  repo.name,
+  payload
+FROM
+  `githubarchive.day.20240101`  -- Adjust date
+WHERE
+  repo.name = 'owner/repo'
+  AND type IN ('PushEvent', 'DeleteEvent', 'MemberEvent')
+ORDER BY
+  created_at ASC
+```
+
+### Force-Push Detection
+
+Force-pushes produce PushEvents where commits are overwritten. Key indicators:
+- `payload.distinct_size = 0` with `payload.size > 0` → commits were erased
+- `payload.before` contains the SHA before the rewrite (recoverable)
+
+```sql
+SELECT
+  created_at,
+  actor.login,
+  JSON_EXTRACT_SCALAR(payload, '$.before') AS before_sha,
+  JSON_EXTRACT_SCALAR(payload, '$.head') AS after_sha,
+  JSON_EXTRACT_SCALAR(payload, '$.size') AS total_commits,
+  JSON_EXTRACT_SCALAR(payload, '$.distinct_size') AS distinct_commits,
+  JSON_EXTRACT_SCALAR(payload, '$.ref') AS branch_ref
+FROM
+  `githubarchive.month.*`
+WHERE
+  _TABLE_SUFFIX BETWEEN '202401' AND '202403'
+  AND type = 'PushEvent'
+  AND repo.name = 'owner/repo'
+  AND CAST(JSON_EXTRACT_SCALAR(payload, '$.distinct_size') AS INT64) = 0
+ORDER BY
+  created_at ASC
+```
+
+### Deleted Branch/Tag Detection
+
+```sql
+SELECT
+  created_at,
+  actor.login,
+  JSON_EXTRACT_SCALAR(payload, '$.ref') AS deleted_ref,
+  JSON_EXTRACT_SCALAR(payload, '$.ref_type') AS ref_type
+FROM
+  `githubarchive.month.*`
+WHERE
+  _TABLE_SUFFIX BETWEEN '202401' AND '202403'
+  AND type = 'DeleteEvent'
+  AND repo.name = 'owner/repo'
+ORDER BY
+  created_at ASC
+```
+
+### Collaborator Permission Changes
+
+```sql
+SELECT
+  created_at,
+  actor.login,
+  JSON_EXTRACT_SCALAR(payload, '$.action') AS action,
+  JSON_EXTRACT_SCALAR(payload, '$.member.login') AS member
+FROM
+  `githubarchive.month.*`
+WHERE
+  _TABLE_SUFFIX BETWEEN '202401' AND '202403'
+  AND type = 'MemberEvent'
+  AND repo.name = 'owner/repo'
+ORDER BY
+  created_at ASC
+```
+
+### CI/CD Workflow Activity
+
+```sql
+SELECT
+  created_at,
+  actor.login,
+  JSON_EXTRACT_SCALAR(payload, '$.action') AS action,
+  JSON_EXTRACT_SCALAR(payload, '$.workflow_run.name') AS workflow_name,
+  JSON_EXTRACT_SCALAR(payload, '$.workflow_run.conclusion') AS conclusion,
+  JSON_EXTRACT_SCALAR(payload, '$.workflow_run.head_sha') AS head_sha
+FROM
+  `githubarchive.month.*`
+WHERE
+  _TABLE_SUFFIX BETWEEN '202401' AND '202403'
+  AND type = 'WorkflowRunEvent'
+  AND repo.name = 'owner/repo'
+ORDER BY
+  created_at ASC
+```
+
+### Actor Activity Profiling
+
+```sql
+SELECT
+  type,
+  COUNT(*) AS event_count,
+  MIN(created_at) AS first_event,
+  MAX(created_at) AS last_event
+FROM
+  `githubarchive.month.*`
+WHERE
+  _TABLE_SUFFIX BETWEEN '202301' AND '202412'
+  AND actor.login = 'suspicious-username'
+GROUP BY type
+ORDER BY event_count DESC
+```
+
+---
+
+## Cost Optimization (MANDATORY)
+
+1. **Always dry run first**: Add `--dry_run` flag to `bq query` to see estimated bytes scanned before executing.
+2. **Use `_TABLE_SUFFIX`**: Narrow the date range as much as possible. `day.*` tables are cheapest for narrow windows; `month.*` for broader sweeps.
+3. **Select only needed columns**: Avoid `SELECT *`. The `payload` column is large — only select specific JSON paths.
+4. **Add LIMIT**: Use `LIMIT 1000` during exploration. Remove only for final exhaustive queries.
+5. **Column filtering in WHERE**: Filter on indexed columns (`type`, `repo.name`, `actor.login`) before payload extraction.
+
+**Cost estimation**: A single month of GH Archive data is ~1-2 TiB uncompressed. Querying a specific repo + event type with `_TABLE_SUFFIX` typically scans 1-10 GiB ($0.006-$0.06).
+
+---
+
+## Accessing via Hermes
+
+**Option A: BigQuery CLI** (if `gcloud` is installed)
+```bash
+bq query --use_legacy_sql=false --format=json "YOUR QUERY"
+```
+
+**Option B: Python** (via `execute_code`)
+```python
+from google.cloud import bigquery
+client = bigquery.Client()
+query = "YOUR QUERY"
+results = client.query(query).result()
+for row in results:
+    print(dict(row))
+```
+
+**Option C: No GCP credentials available**
+If BigQuery is unavailable, document this limitation in the report. Use the other 4 investigators (Git, GitHub API, Wayback Machine, IOC Enrichment) — they cover most investigation needs without BigQuery.
@@ -0,0 +1,131 @@
+# Investigation Templates
+
+Pre-built hypothesis and investigation templates for common supply chain attack scenarios.
+Each template includes: attack pattern, key evidence to collect, and hypothesis starters.
+
+---
+
+## Template 1: Maintainer Account Compromise
+
+**Pattern**: Attacker gains access to a legitimate maintainer account (phishing, credential stuffing)
+and uses it to push malicious code, create backdoored releases, or exfiltrate CI secrets.
+
+**Real-world examples**: XZ Utils (2024), Codecov (2021), event-stream (2018)
+
+**Key Evidence to Collect**:
+- [ ] Push events from maintainer account outside normal working hours/timezone
+- [ ] Commits adding new dependencies, obfuscated code, or modified build scripts
+- [ ] Release creation immediately after suspicious push (to maximize package distribution)
+- [ ] MemberEvent adding unknown collaborators (attacker adding backup access)
+- [ ] WorkflowRunEvent with unexpected secret access or exfiltration-like behavior
+- [ ] Account login location changes (check social media, conference talks for corroboration)
+
+**Hypothesis Starters**:
+```
+[HYPOTHESIS] Actor <HANDLE>'s account was compromised on or around <DATE>, 
+based on anomalous commit timing [EV-XXXX] and geographic access patterns [EV-YYYY].
+```
+```
+[HYPOTHESIS] Release <VERSION> was published by the compromised account to push 
+malicious code to downstream users, evidenced by the malicious commit [EV-XXXX] 
+being added <N> hours before the release [EV-YYYY].
+```
+
+---
+
+## Template 2: Malicious Dependency Injection
+
+**Pattern**: A trusted package is modified to include malicious code in a dependency,
+or a new malicious dependency is injected into an existing package.
+
+**Key Evidence to Collect**:
+- [ ] Diff of `package.json`/`requirements.txt`/`go.mod` before and after suspicious commit
+- [ ] The new dependency's publication timestamp vs. the injection commit timestamp
+- [ ] Whether the new dependency exists on npm/PyPI and who owns it
+- [ ] Any obfuscation patterns in the injected dependency code
+- [ ] Install-time scripts (`postinstall`, `setup.py`, etc.) that execute code on install
+
+**Hypothesis Starters**:
+```
+[HYPOTHESIS] Commit <SHA> [EV-XXXX] introduced dependency <PACKAGE@VERSION> 
+which appears to be a malicious package published by actor <HANDLE> [EV-YYYY], 
+designed to execute <BEHAVIOR> during installation.
+```
+
+---
+
+## Template 3: CI/CD Pipeline Injection
+
+**Pattern**: Attacker modifies GitHub Actions workflows to steal secrets, exfiltrate code,
+or inject malicious artifacts into the build output.
+
+**Key Evidence to Collect**:
+- [ ] Diff of all `.github/workflows/*.yml` files before/after suspicious period
+- [ ] WorkflowRunEvents triggered by the modified workflows
+- [ ] Any `curl`, `wget`, or network calls added to workflow steps
+- [ ] New or modified `env:` sections referencing `secrets.*`
+- [ ] Artifacts produced by modified workflow runs
+
+**Hypothesis Starters**:
+```
+[HYPOTHESIS] Workflow file <FILE> was modified in commit <SHA> [EV-XXXX] to 
+exfiltrate repository secrets via <METHOD>, as evidenced by the added network 
+call pattern [EV-YYYY].
+```
+
+---
+
+## Template 4: Typosquatting / Dependency Confusion
+
+**Pattern**: Attacker registers a package with a name similar to a popular package
+(or an internal package name) to intercept installs from users who mistype.
+
+**Key Evidence to Collect**:
+- [ ] Registration timestamp of the suspicious package on the registry
+- [ ] Package content: does it contain malicious code or is it a stub?
+- [ ] Download statistics for the suspicious package
+- [ ] Names of internal packages that could be targeted (if private repo scope)
+- [ ] Any references to the legitimate package in the malicious one's metadata
+
+**Hypothesis Starters**:
+```
+[HYPOTHESIS] Package <MALICIOUS_NAME> was registered on <DATE> [EV-XXXX] to 
+typosquat on <LEGITIMATE_NAME>, targeting users who misspell the package name. 
+The package contains <BEHAVIOR> [EV-YYYY].
+```
+
+---
+
+## Template 5: Force-Push History Rewrite (Evidence Erasure)
+
+**Pattern**: After a malicious commit is detected (or before wider notice), the attacker
+force-pushes to remove the malicious commit from branch history.
+
+**Detection is key** — this template focuses on proving the erasure happened.
+
+**Key Evidence to Collect**:
+- [ ] GH Archive PushEvent with `distinct_size=0` (force push indicator) [EV-XXXX]
+- [ ] The SHA of the commit BEFORE the force push (from GH Archive `payload.before`)
+- [ ] Recovery of the erased commit via direct URL or `git fetch origin SHA`
+- [ ] Wayback Machine snapshot of the commit page before erasure
+- [ ] Timeline gap in git log (N commits visible in archive but M < N in current repo)
+
+**Hypothesis Starters**:
+```
+[HYPOTHESIS] Actor <HANDLE> force-pushed branch <BRANCH> on <DATE> [EV-XXXX] 
+to erase commit <SHA> [EV-YYYY], which contained <MALICIOUS_CONTENT>. 
+The erased commit was recovered via <METHOD> [EV-ZZZZ].
+```
+
+---
+
+## Cross-Cutting Investigation Checklist
+
+Apply to every investigation regardless of template:
+
+- [ ] Check all contributors for newly created accounts (< 30 days old at time of malicious activity)
+- [ ] Check if any maintainer account changed email in the period (sign of account takeover)
+- [ ] Verify GPG signatures on suspicious commits match known maintainer keys
+- [ ] Check if the repository changed ownership or transferred orgs near the incident
+- [ ] Look for "cleanup" commits immediately after the malicious commit (cover-up pattern)
+- [ ] Check related packages/repos by the same author for similar patterns
@@ -0,0 +1,164 @@
+# Deleted Content Recovery Techniques
+
+## Key Insight: GitHub Never Fully Deletes Force-Pushed Commits
+
+Force-pushed commits are removed from the branch history but REMAIN on GitHub's servers until garbage collection runs (which can take weeks to months). This is the foundation of deleted commit recovery.
+
+---
+
+## Method 1: Direct GitHub URL (Fastest — No Auth Required)
+
+If you have a commit SHA, access it directly even if it was force-pushed off a branch:
+
+```bash
+# View commit metadata
+curl -s "https://github.com/OWNER/REPO/commit/SHA"
+
+# Download as patch (includes full diff)
+curl -s "https://github.com/OWNER/REPO/commit/SHA.patch" > recovered_commit.patch
+
+# Download as diff
+curl -s "https://github.com/OWNER/REPO/commit/SHA.diff" > recovered_commit.diff
+
+# Example (Istio credential leak - real incident):
+curl -s "https://github.com/istio/istio/commit/FORCE_PUSHED_SHA.patch"
+```
+
+**When this works**: SHA is known (from GH Archive, Wayback Machine, or `git fsck`)
+**When this fails**: GitHub has already garbage-collected the object (rare, typically 30–90 days post-force-push)
+
+---
+
+## Method 2: GitHub REST API
+
+```bash
+# Works for commits force-pushed off branches but still on server
+# Note: /commits/SHA may 404, but /git/commits/SHA often succeeds for orphaned commits
+curl -s "https://api.github.com/repos/OWNER/REPO/git/commits/SHA" | jq .
+
+# Get the tree (file listing) of a force-pushed commit
+curl -s "https://api.github.com/repos/OWNER/REPO/git/trees/SHA?recursive=1" | jq .
+
+# Get a specific file from a force-pushed commit
+curl -s "https://api.github.com/repos/OWNER/REPO/contents/PATH?ref=SHA" | jq .content | base64 -d
+```
+
+---
+
+## Method 3: Git Fetch by SHA (Local — Requires Clone)
+
+```bash
+# Fetch an orphaned commit directly by SHA into local repo
+cd target_repo
+git fetch origin SHA
+git log FETCH_HEAD -1   # view the commit
+git diff FETCH_HEAD~1 FETCH_HEAD  # view the diff
+
+# If the SHA was recently force-pushed it will still be fetchable
+# This stops working once GitHub GC runs
+```
+
+---
+
+## Method 4: Dangling Commits via git fsck
+
+```bash
+cd target_repo
+
+# Find all unreachable objects (includes force-pushed commits)
+git fsck --unreachable --no-reflogs 2>&1 | grep "unreachable commit" | awk '{print $3}' > dangling_shas.txt
+
+# For each dangling commit, get its metadata
+while read sha; do
+  echo "=== $sha ===" >> dangling_details.txt
+  git show --stat "$sha" >> dangling_details.txt 2>&1
+done < dangling_shas.txt
+
+# Note: dangling objects only exist in LOCAL clone — not the same as GitHub's copies
+# GitHub's copies are accessible via Methods 1-3 until GC runs
+```
+
+---
+
+## Recovering Deleted GitHub Issues and PRs
+
+### Via Wayback Machine CDX API
+
+```bash
+# Find all archived snapshots of a specific issue
+curl -s "https://web.archive.org/cdx/search/cdx?url=github.com/OWNER/REPO/issues/NUMBER&output=json&limit=50&fl=timestamp,statuscode,original" | python3 -m json.tool
+
+# Fetch the best snapshot
+# Use the timestamp from the CDX result:
+# https://web.archive.org/web/TIMESTAMP/https://github.com/OWNER/REPO/issues/NUMBER
+curl -s "https://web.archive.org/web/TIMESTAMP/https://github.com/OWNER/REPO/issues/NUMBER" > issue_NUMBER_archived.html
+
+# Find all snapshots of the repo in a date range
+curl -s "https://web.archive.org/cdx/search/cdx?url=github.com/OWNER/REPO*&output=json&from=20240101&to=20240201&limit=200&fl=timestamp,urlkey,statuscode" | python3 -m json.tool
+```
+
+### Via GitHub API (Limited — Only Non-Deleted Content)
+
+```bash
+# Closed issues (not deleted) are retrievable
+curl -s "https://api.github.com/repos/OWNER/REPO/issues?state=closed&per_page=100" | jq '.[].number'
+
+# Note: DELETED issues/PRs do NOT appear in the API. Use Wayback Machine or GH Archive for those.
+```
+
+### Via GitHub Archive (For Event History — Not Content)
+
+```sql
+-- Find all IssueEvents for a repo in a date range
+SELECT created_at, actor.login, payload.action, payload.issue.number, payload.issue.title
+FROM `githubarchive.day.*`
+WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240201'
+  AND type = 'IssuesEvent'
+  AND repo.name = 'OWNER/REPO'
+ORDER BY created_at
+```
+
+---
+
+## Recovering Deleted Files from a Known Commit
+
+```bash
+# If you have the commit SHA (even force-pushed):
+git show SHA:path/to/file.py > recovered_file.py
+
+# Or via API (base64 encoded content):
+curl -s "https://api.github.com/repos/OWNER/REPO/contents/path/to/file.py?ref=SHA" | python3 -c "
+import sys, json, base64
+d = json.load(sys.stdin)
+print(base64.b64decode(d['content']).decode())
+"
+```
+
+---
+
+## Evidence Recording
+
+After recovering any deleted content, immediately record it:
+
+```bash
+python3 SKILL_DIR/scripts/evidence-store.py --store evidence.json add \
+  --source "git fetch origin FORCE_PUSHED_SHA" \
+  --content "Recovered commit: FORCE_PUSHED_SHA | Author: attacker@example.com | Date: 2024-01-15 | Added file: malicious.sh" \
+  --type git \
+  --actor "attacker-handle" \
+  --url "https://github.com/OWNER/REPO/commit/FORCE_PUSHED_SHA.patch" \
+  --timestamp "2024-01-15T00:00:00Z" \
+  --verification single_source \
+  --notes "Commit force-pushed off main branch on 2024-01-16. Recovered via direct fetch."
+```
+
+---
+
+## Recovery Failure Modes
+
+| Failure | Cause | Workaround |
+|---------|-------|------------|
+| `git fetch origin SHA` returns "not our ref" | GitHub GC already ran | Try Method 1/2, search Wayback Machine |
+| `github.com/OWNER/REPO/commit/SHA` returns 404 | GC ran or SHA is wrong | Verify SHA via GH Archive; try partial SHA search |
+| Wayback Machine has no snapshots | Page was never crawled by IA | Check `commoncrawl.org`, check Google Cache |
+| BigQuery shows event but no content | GH Archive stores event metadata, not file contents | Recovery only reveals the event occurred, not the content |
@@ -0,0 +1,313 @@
+#!/usr/bin/env python3
+"""
+OSS Forensics Evidence Store Manager
+Manages a JSON-based evidence store for forensic investigations.
+
+Commands:
+  add      - Add a piece of evidence
+  list     - List all evidence (optionally filter by type or actor)
+  verify   - Re-check SHA-256 hashes for integrity
+  query    - Search evidence by keyword
+  export   - Export evidence as a Markdown table
+  summary  - Print investigation statistics
+
+Usage example:
+  python3 evidence-store.py --store evidence.json add \
+    --source "git fsck output" --content "dangling commit abc123" \
+    --type git --actor "malicious-user" --url "https://github.com/owner/repo/commit/abc123"
+
+  python3 evidence-store.py --store evidence.json list --type git
+  python3 evidence-store.py --store evidence.json verify
+  python3 evidence-store.py --store evidence.json export > evidence-table.md
+"""
+
+import json
+import argparse
+import os
+import datetime
+import hashlib
+import sys
+
+EVIDENCE_TYPES = [
+    "git",           # Local git repository data (commits, reflog, fsck)
+    "gh_api",        # GitHub REST API responses
+    "gh_archive",    # GitHub Archive / BigQuery query results
+    "web_archive",   # Wayback Machine snapshots
+    "ioc",           # Indicator of Compromise (SHA, domain, IP, package name, etc.)
+    "analysis",      # Derived analysis / cross-source correlation result
+    "manual",        # Manually noted observation
+    "vendor_report", # External security vendor report excerpt
+]
+
+VERIFICATION_STATES = ["unverified", "single_source", "multi_source_verified"]
+
+IOC_TYPES = [
+    "COMMIT_SHA", "FILE_PATH", "API_KEY", "SECRET", "IP_ADDRESS",
+    "DOMAIN", "PACKAGE_NAME", "ACTOR_USERNAME", "MALICIOUS_URL",
+    "WORKFLOW_FILE", "BRANCH_NAME", "TAG_NAME", "RELEASE_NAME", "OTHER",
+]
+
+
+def _now_iso():
+    return datetime.datetime.now(datetime.timezone.utc).isoformat(timespec="seconds") + "Z"
+
+
+def _sha256(content: str) -> str:
+    return hashlib.sha256(content.encode("utf-8")).hexdigest()
+
+
+class EvidenceStore:
+    def __init__(self, filepath: str):
+        self.filepath = filepath
+        self.data = {
+            "metadata": {
+                "version": "2.0",
+                "created_at": _now_iso(),
+                "last_updated": _now_iso(),
+                "investigation": "",
+                "target_repo": "",
+            },
+            "evidence": [],
+            "chain_of_custody": [],
+        }
+        if os.path.exists(filepath):
+            try:
+                with open(filepath, "r", encoding="utf-8") as f:
+                    self.data = json.load(f)
+            except (json.JSONDecodeError, IOError) as e:
+                print(f"Error loading evidence store '{filepath}': {e}", file=sys.stderr)
+                print("Hint: The file might be corrupted. Check for manual edits or syntax errors.", file=sys.stderr)
+                sys.exit(1)
+
+    def _save(self):
+        self.data["metadata"]["last_updated"] = _now_iso()
+        with open(self.filepath, "w", encoding="utf-8") as f:
+            json.dump(self.data, f, indent=2, ensure_ascii=False)
+
+    def _next_id(self) -> str:
+        return f"EV-{len(self.data['evidence']) + 1:04d}"
+
+    def add(
+        self,
+        source: str,
+        content: str,
+        evidence_type: str,
+        actor: str = None,
+        url: str = None,
+        timestamp: str = None,
+        ioc_type: str = None,
+        verification: str = "unverified",
+        notes: str = None,
+    ) -> str:
+        evidence_id = self._next_id()
+        entry = {
+            "id": evidence_id,
+            "type": evidence_type,
+            "source": source,
+            "content": content,
+            "content_sha256": _sha256(content),
+            "actor": actor,
+            "url": url,
+            "event_timestamp": timestamp,
+            "collected_at": _now_iso(),
+            "ioc_type": ioc_type,
+            "verification": verification,
+            "notes": notes,
+        }
+        self.data["evidence"].append(entry)
+        self.data["chain_of_custody"].append({
+            "action": "add",
+            "evidence_id": evidence_id,
+            "timestamp": _now_iso(),
+            "source": source,
+        })
+        self._save()
+        return evidence_id
+
+    def list_evidence(self, filter_type: str = None, filter_actor: str = None):
+        results = self.data["evidence"]
+        if filter_type:
+            results = [e for e in results if e.get("type") == filter_type]
+        if filter_actor:
+            results = [e for e in results if e.get("actor") == filter_actor]
+        return results
+
+    def verify_integrity(self):
+        """Re-compute SHA-256 for all entries and report mismatches."""
+        issues = []
+        for entry in self.data["evidence"]:
+            expected = _sha256(entry["content"])
+            stored = entry.get("content_sha256", "")
+            if expected != stored:
+                issues.append({
+                    "id": entry["id"],
+                    "stored_sha256": stored,
+                    "computed_sha256": expected,
+                })
+        return issues
+
+    def query(self, keyword: str):
+        """Search for keyword in content, source, actor, or url."""
+        keyword_lower = keyword.lower()
+        return [
+            e for e in self.data["evidence"]
+            if keyword_lower in (e.get("content", "") or "").lower()
+            or keyword_lower in (e.get("source", "") or "").lower()
+            or keyword_lower in (e.get("actor", "") or "").lower()
+            or keyword_lower in (e.get("url", "") or "").lower()
+        ]
+
+    def export_markdown(self) -> str:
+        lines = [
+            "# Evidence Registry",
+            "",
+            f"**Store**: `{self.filepath}`",
+            f"**Last Updated**: {self.data['metadata'].get('last_updated', 'N/A')}",
+            f"**Total Evidence Items**: {len(self.data['evidence'])}",
+            "",
+            "| ID | Type | Source | Actor | Verification | Event Timestamp | URL |",
+            "|----|------|--------|-------|--------------|-----------------|-----|",
+        ]
+        for e in self.data["evidence"]:
+            url = e.get("url") or ""
+            url_display = f"[link]({url})" if url else ""
+            lines.append(
+                f"| {e['id']} | {e.get('type','')} | {e.get('source','')} "
+                f"| {e.get('actor') or ''} | {e.get('verification','')} "
+                f"| {e.get('event_timestamp') or ''} | {url_display} |"
+            )
+        lines.append("")
+        lines.append("## Chain of Custody")
+        lines.append("")
+        lines.append("| Evidence ID | Action | Timestamp | Source |")
+        lines.append("|-------------|--------|-----------|--------|")
+        for c in self.data["chain_of_custody"]:
+            lines.append(
+                f"| {c.get('evidence_id','')} | {c.get('action','')} "
+                f"| {c.get('timestamp','')} | {c.get('source','')} |"
+            )
+        return "\n".join(lines)
+
+    def summary(self) -> dict:
+        by_type = {}
+        by_verification = {}
+        actors = set()
+        for e in self.data["evidence"]:
+            t = e.get("type", "unknown")
+            by_type[t] = by_type.get(t, 0) + 1
+            v = e.get("verification", "unverified")
+            by_verification[v] = by_verification.get(v, 0) + 1
+            if e.get("actor"):
+                actors.add(e["actor"])
+        return {
+            "total": len(self.data["evidence"]),
+            "by_type": by_type,
+            "by_verification": by_verification,
+            "unique_actors": sorted(actors),
+        }
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="OSS Forensics Evidence Store Manager v2.0",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+    )
+    parser.add_argument("--store", default="evidence.json", help="Path to evidence JSON file (default: evidence.json)")
+
+    subparsers = parser.add_subparsers(dest="command", metavar="COMMAND")
+
+    # --- add ---
+    add_p = subparsers.add_parser("add", help="Add a new evidence entry")
+    add_p.add_argument("--source", required=True, help="Where this evidence came from (e.g. 'git fsck', 'GH API /commits')")
+    add_p.add_argument("--content", required=True, help="The evidence content (commit SHA, API response excerpt, etc.)")
+    add_p.add_argument("--type", required=True, choices=EVIDENCE_TYPES, dest="evidence_type", help="Evidence type")
+    add_p.add_argument("--actor", help="GitHub handle or email of associated actor")
+    add_p.add_argument("--url", help="URL to original source")
+    add_p.add_argument("--timestamp", help="When the event occurred (ISO 8601)")
+    add_p.add_argument("--ioc-type", choices=IOC_TYPES, help="IOC subtype (for --type ioc)")
+    add_p.add_argument("--verification", choices=VERIFICATION_STATES, default="unverified")
+    add_p.add_argument("--notes", help="Additional investigator notes")
+    add_p.add_argument("--quiet", action="store_true", help="Suppress success message")
+
+    # --- list ---
+    list_p = subparsers.add_parser("list", help="List all evidence entries")
+    list_p.add_argument("--type", dest="filter_type", choices=EVIDENCE_TYPES, help="Filter by type")
+    list_p.add_argument("--actor", dest="filter_actor", help="Filter by actor")
+
+    # --- verify ---
+    subparsers.add_parser("verify", help="Verify SHA-256 integrity of all evidence content")
+
+    # --- query ---
+    query_p = subparsers.add_parser("query", help="Search evidence by keyword")
+    query_p.add_argument("keyword", help="Keyword to search for")
+
+    # --- export ---
+    subparsers.add_parser("export", help="Export evidence as a Markdown table (stdout)")
+
+    # --- summary ---
+    subparsers.add_parser("summary", help="Print investigation statistics")
+
+    args = parser.parse_args()
+
+    if not args.command:
+        parser.print_help()
+        sys.exit(0)
+
+    store = EvidenceStore(args.store)
+
+    if args.command == "add":
+        eid = store.add(
+            source=args.source,
+            content=args.content,
+            evidence_type=args.evidence_type,
+            actor=args.actor,
+            url=args.url,
+            timestamp=args.timestamp,
+            ioc_type=args.ioc_type,
+            verification=args.verification,
+            notes=args.notes,
+        )
+        if not getattr(args, "quiet", False):
+            print(f"✓ Added evidence: {eid}")
+
+    elif args.command == "list":
+        items = store.list_evidence(
+            filter_type=getattr(args, "filter_type", None),
+            filter_actor=getattr(args, "filter_actor", None),
+        )
+        if not items:
+            print("No evidence found.")
+        for e in items:
+            actor_str = f" | actor: {e['actor']}" if e.get("actor") else ""
+            url_str = f" | {e['url']}" if e.get("url") else ""
+            print(f"[{e['id']}] {e['type']:12s} | {e['verification']:20s} | {e['source']}{actor_str}{url_str}")
+
+    elif args.command == "verify":
+        issues = store.verify_integrity()
+        if not issues:
+            print(f"✓ All {len(store.data['evidence'])} evidence entries passed SHA-256 integrity check.")
+        else:
+            print(f"✗ {len(issues)} integrity issue(s) detected:")
+            for i in issues:
+                print(f"  [{i['id']}] stored={i['stored_sha256'][:16]}... computed={i['computed_sha256'][:16]}...")
+            sys.exit(1)
+
+    elif args.command == "query":
+        results = store.query(args.keyword)
+        print(f"Found {len(results)} result(s) for '{args.keyword}':")
+        for e in results:
+            print(f"  [{e['id']}] {e['type']} | {e['source']} | {e['content'][:80]}")
+
+    elif args.command == "export":
+        print(store.export_markdown())
+
+    elif args.command == "summary":
+        s = store.summary()
+        print(f"Total evidence items : {s['total']}")
+        print(f"By type              : {json.dumps(s['by_type'], indent=2)}")
+        print(f"By verification      : {json.dumps(s['by_verification'], indent=2)}")
+        print(f"Unique actors        : {s['unique_actors']}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,151 @@
+# Forensic Investigation Report
+
+> **Instructions**: Fill in all sections. Every factual claim must cite at least one `[EV-XXXX]` evidence ID.
+> Remove placeholder text and instruction notes before finalizing. Redact all secrets to `[REDACTED]`.
+
+---
+
+## Executive Summary
+
+**Target Repository**: `OWNER/REPO`
+**Investigation Period**: YYYY-MM-DD to YYYY-MM-DD
+**Verdict**: <!-- Compromised / Clean / Inconclusive -->
+**Confidence Level**: <!-- High / Medium / Low -->
+**Report Date**: YYYY-MM-DD
+**Investigator**: <!-- Agent session ID or analyst name -->
+
+<!-- One paragraph: what was investigated, what was found, what is recommended. -->
+
+---
+
+## Timeline of Events
+
+> All timestamps in UTC. Each event must cite at least one evidence ID.
+
+| Timestamp (UTC) | Event | Evidence IDs | Source |
+|-----------------|-------|--------------|--------|
+| YYYY-MM-DDTHH:MM:SSZ | _Describe event_ | [EV-XXXX] | git / gh_api / gh_archive / web_archive |
+| | | | |
+
+---
+
+## Validated Hypotheses
+
+### Hypothesis 1: <!-- Short title -->
+
+**Status**: <!-- VALIDATED / INCONCLUSIVE / REJECTED -->
+
+**Claim**: _Full statement of the hypothesis._
+
+**Supporting Evidence**:
+- [EV-XXXX]: _What this evidence shows_
+- [EV-YYYY]: _What this evidence shows_
+
+**Counter-Evidence Considered**: _What might disprove this, and why it was ruled out or not._
+
+**Confidence**: <!-- High / Medium / Low, and why -->
+
+---
+
+## Indicators of Compromise (IOC List)
+
+| Type | Value | Status | Evidence |
+|------|-------|--------|----------|
+| COMMIT_SHA | `abc123...` | Confirmed malicious | [EV-XXXX] |
+| ACTOR_USERNAME | `handle` | Suspected compromised | [EV-YYYY] |
+| FILE_PATH | `src/evil.js` | Confirmed malicious | [EV-ZZZZ] |
+| DOMAIN | `evil-cdn.io` | Confirmed C2 | [EV-WWWW] |
+
+---
+
+## Affected Versions
+
+| Version / Tag | Published | Contains Malicious Code | Evidence |
+|---------------|-----------|------------------------|----------|
+| `v1.2.3` | YYYY-MM-DD | Yes / No / Unknown | [EV-XXXX] |
+
+---
+
+## Evidence Registry
+
+> Generated by: `python3 SKILL_DIR/scripts/evidence-store.py --store evidence.json export`
+
+<!-- Paste the Markdown table output from the evidence-store.py export command here -->
+
+| ID | Type | Source | Actor | Verification | Event Timestamp | URL |
+|----|------|--------|-------|--------------|-----------------|-----|
+| EV-0001 | | | | | | |
+
+---
+
+## Chain of Custody
+
+> Generated by: `python3 SKILL_DIR/scripts/evidence-store.py --store evidence.json export`
+
+<!-- Paste the chain of custody section from the export output here -->
+
+| Evidence ID | Action | Timestamp | Source |
+|-------------|--------|-----------|--------|
+| EV-0001 | add | | |
+
+---
+
+## Technical Findings
+
+### Git History Analysis
+
+_Summarize findings from local git analysis: dangling commits, reflog anomalies, unsigned commits, binary additions, etc._
+
+### GitHub API Analysis
+
+_Summarize findings from GitHub REST API: deleted PRs/issues, contributor changes, release anomalies, etc._
+
+### GitHub Archive Analysis
+
+_Summarize findings from BigQuery: force-push events, delete events, workflow anomalies, member changes, etc._
+_Note: If BigQuery was unavailable, state this explicitly._
+
+### Wayback Machine Analysis
+
+_Summarize findings from archive.org: recovered deleted pages, historical content differences, etc._
+
+### IOC Enrichment
+
+_Summarize enrichment results: WHOIS data for domains, recovered commit content, actor account analysis, etc._
+
+---
+
+## Recommendations
+
+### Immediate Actions (If Compromise Confirmed)
+
+- [ ] Rotate all GitHub tokens, API keys, and credentials that may have been exposed
+- [ ] Pin dependency versions to hashes in all affected packages
+- [ ] Publish a security advisory / CVE if applicable
+- [ ] Notify downstream users/package registries (npm, PyPI, etc.)
+- [ ] Revoke access for the compromised account and re-secure with hardware 2FA
+- [ ] Audit all CI/CD workflow files for unauthorized modifications
+- [ ] Review all releases published during the compromise window
+
+### Monitoring Recommendations
+
+- [ ] Enable branch protection on `main`/`master` (require code review, disallow force-push)
+- [ ] Enable required commit signing (GPG/SSH)
+- [ ] Set up GitHub audit log streaming for future monitoring
+- [ ] Pin critical dependencies to known-good SHAs in lock files
+
+---
+
+## Limitations and Caveats
+
+- _List any data sources that were unavailable (e.g., no BigQuery access)_
+- _Note any evidence that is single-source only (not independently verified)_
+- _Note any hypotheses that could not be confirmed or denied_
+
+---
+
+## References
+
+- Evidence store: `evidence.json` (SHA-256 integrity: run `python3 SKILL_DIR/scripts/evidence-store.py --store evidence.json verify`)
+- Related issues: <!-- Link to GitHub issues, CVEs, security advisories -->
+- RAPTOR framework: https://github.com/gadievron/raptor
@@ -0,0 +1,43 @@
+# Malicious Package Investigation Report
+
+---
+
+## 📦 Package Metadata
+- **Package Name**: 
+- **Registry**: [NPM / PyPI / RubyGems / etc.]
+- **Affected Versions**: 
+- **Malicious Version(s)**: 
+- **Downloads at Time of Detection**: 
+- **Package URL**: 
+
+---
+
+## 🚩 Indicators of Compromise (IOCs)
+- **Malicious URL(s)**: 
+- **Exfiltrated Data Types**: [Environment variables, ~/.ssh/id_rsa, /etc/shadow, etc.]
+- **Exfiltration Method**: [DNS tunneling, HTTP POST to C2, etc.]
+- **C2 IP/Domain**: 
+
+---
+
+## 🛠️ Analysis Summary
+- **Primary Mechanism**: [Typosquatting / Dependency Confusion / Maintainer Takeover]
+- **Behavior Description**: 
+  - [Example: Installs a postinstall script that exfiltrates environment variables.]
+  - [Example: Patches `setup.py` to download a secondary payload.]
+
+---
+
+## 🔍 Evidence Registry
+| Evidence ID | Type | Source | Description |
+|-------------|------|--------|-------------|
+| EV-XXXX     | ioc  | NPM    | Package install script snapshot |
+| EV-YYYY     | web  | Wayback| Historical version comparison |
+
+---
+
+## 🛡️ Recommended Mitigations
+1. [ ] Unpublish/Report the package to the registry.
+2. [ ] Audit `package-lock.json` or `requirements.txt` across all projects.
+3. [ ] Rotate secrets exfiltrated via environment variables.
+4. [ ] Pin specific hashes (SHASUM) for mission-critical dependencies.
@@ -1,218 +0,0 @@
-# Checkpoint & Rollback — Implementation Plan
-
-## Goal
-
-Automatic filesystem snapshots before destructive file operations, with user-facing rollback. The agent never sees or interacts with this — it's transparent infrastructure.
-
-## Design Principles
-
-1. **Not a tool** — the LLM never knows about it. Zero prompt tokens, zero tool schema overhead.
-2. **Once per turn** — checkpoint at most once per conversation turn (user message → agent response cycle), triggered lazily on the first file-mutating operation. Not on every write.
-3. **Opt-in via config** — disabled by default, enabled with `checkpoints: true` in config.yaml.
-4. **Works on any directory** — uses a shadow git repo completely separate from the user's project git. Works on git repos, non-git directories, anything.
-5. **User-facing rollback** — `/rollback` slash command (CLI + gateway) to list and restore checkpoints. Also `hermes rollback` CLI subcommand.
-
-## Architecture
-
-```
-~/.hermes/checkpoints/
-  {sha256(abs_dir)[:16]}/       # Shadow git repo per working directory
-    HEAD, refs/, objects/...    # Standard git internals
-    HERMES_WORKDIR              # Original dir path (for display)
-    info/exclude                # Default excludes (node_modules, .env, etc.)
-```
-
-### Core: CheckpointManager (new file: tools/checkpoint_manager.py)
-
-Adapted from PR #559's CheckpointStore. Key changes from the PR:
-
- **Not a tool** — no schema, no registry entry, no handler
- **Turn-scoped deduplication** — tracks `_checkpointed_dirs: Set[str]` per turn
- **Configurable** — reads `checkpoints` config key
- **Pruning** — keeps last N snapshots per directory (default 50), prunes on take
-
-```python
-class CheckpointManager:
-    def __init__(self, enabled: bool = False, max_snapshots: int = 50):
-        self.enabled = enabled
-        self.max_snapshots = max_snapshots
-        self._checkpointed_dirs: Set[str] = set()  # reset each turn
-
-    def new_turn(self):
-        """Call at start of each conversation turn to reset dedup."""
-        self._checkpointed_dirs.clear()
-
-    def ensure_checkpoint(self, working_dir: str, reason: str = "auto") -> None:
-        """Take a checkpoint if enabled and not already done this turn."""
-        if not self.enabled:
-            return
-        abs_dir = str(Path(working_dir).resolve())
-        if abs_dir in self._checkpointed_dirs:
-            return
-        self._checkpointed_dirs.add(abs_dir)
-        try:
-            self._take(abs_dir, reason)
-        except Exception as e:
-            logger.debug("Checkpoint failed (non-fatal): %s", e)
-
-    def list_checkpoints(self, working_dir: str) -> List[dict]:
-        """List available checkpoints for a directory."""
-        ...
-
-    def restore(self, working_dir: str, commit_hash: str) -> dict:
-        """Restore files to a checkpoint state."""
-        ...
-
-    def _take(self, working_dir: str, reason: str):
-        """Shadow git: add -A + commit. Prune if over max_snapshots."""
-        ...
-
-    def _prune(self, shadow_repo: Path):
-        """Keep only last max_snapshots commits."""
-        ...
-```
-
-### Integration Point: run_agent.py
-
-The AIAgent already owns the conversation loop. Add CheckpointManager as an instance attribute:
-
-```python
-class AIAgent:
-    def __init__(self, ...):
-        ...
-        # Checkpoint manager — reads config to determine if enabled
-        self._checkpoint_mgr = CheckpointManager(
-            enabled=config.get("checkpoints", False),
-            max_snapshots=config.get("checkpoint_max_snapshots", 50),
-        )
-```
-
-**Turn boundary** — in `run_conversation()`, call `new_turn()` at the start of each agent iteration (before processing tool calls):
-
-```python
-# Inside the main loop, before _execute_tool_calls():
-self._checkpoint_mgr.new_turn()
-```
-
-**Trigger point** — in `_execute_tool_calls()`, before dispatching file-mutating tools:
-
-```python
-# Before the handle_function_call dispatch:
-if function_name in ("write_file", "patch"):
-    # Determine working dir from the file path in the args
-    file_path = function_args.get("path", "") or function_args.get("old_string", "")
-    if file_path:
-        work_dir = str(Path(file_path).parent.resolve())
-        self._checkpoint_mgr.ensure_checkpoint(work_dir, f"before {function_name}")
-```
-
-This means:
- First `write_file` in a turn → checkpoint (fast, one `git add -A && git commit`)
- Subsequent writes in the same turn → no-op (already checkpointed)
- Next turn (new user message) → fresh checkpoint eligibility
-
-### Config
-
-Add to `DEFAULT_CONFIG` in `hermes_cli/config.py`:
-
-```python
-"checkpoints": False,          # Enable filesystem checkpoints before destructive ops
-"checkpoint_max_snapshots": 50, # Max snapshots to keep per directory
-```
-
-User enables with:
-```yaml
-# ~/.hermes/config.yaml
-checkpoints: true
-```
-
-### User-Facing Rollback
-
-**CLI slash command** — add `/rollback` to `process_command()` in `cli.py`:
-
-```
-/rollback         — List recent checkpoints for the current directory
-/rollback <hash>  — Restore files to that checkpoint
-```
-
-Shows a numbered list:
-```
-📸 Checkpoints for /home/user/project:
-  1. abc1234  2026-03-09 21:15  before write_file (3 files changed)
-  2. def5678  2026-03-09 20:42  before patch (1 file changed)
-  3. ghi9012  2026-03-09 20:30  before write_file (2 files changed)
-
-Use /rollback <number> to restore, e.g. /rollback 1
-```
-
-**Gateway slash command** — add `/rollback` to gateway/run.py with the same behavior.
-
-**CLI subcommand** — `hermes rollback` (optional, lower priority).
-
-### What Gets Excluded (not checkpointed)
-
-Same as the PR's defaults — written to the shadow repo's `info/exclude`:
-
-```
-node_modules/
-dist/
-build/
-.env
-.env.*
-__pycache__/
-*.pyc
-.DS_Store
-*.log
-.cache/
-.venv/
-.git/
-```
-
-Also respects the project's `.gitignore` if present (shadow repo can read it via `core.excludesFile`).
-
-### Safety
-
- `ensure_checkpoint()` wraps everything in try/except — a checkpoint failure never blocks the actual file operation
- Shadow repo is completely isolated — GIT_DIR + GIT_WORK_TREE env vars, never touches user's .git
- If git isn't installed, checkpoints silently disable
- Large directories: add a file count check — skip checkpoint if >50K files to avoid slowdowns
-
-## Files to Create/Modify
-
-| File | Change |
-|------|--------|
-| `tools/checkpoint_manager.py` | **NEW** — CheckpointManager class (adapted from PR #559) |
-| `run_agent.py` | Add CheckpointManager init + trigger in `_execute_tool_calls()` |
-| `hermes_cli/config.py` | Add `checkpoints` + `checkpoint_max_snapshots` to DEFAULT_CONFIG |
-| `cli.py` | Add `/rollback` slash command handler |
-| `gateway/run.py` | Add `/rollback` slash command handler |
-| `tests/tools/test_checkpoint_manager.py` | **NEW** — tests (adapted from PR #559's tests) |
-
-## What We Take From PR #559
-
- `_shadow_repo_path()` — deterministic path hashing ✅
- `_git_env()` — GIT_DIR/GIT_WORK_TREE isolation ✅
- `_run_git()` — subprocess wrapper with timeout ✅
- `_init_shadow_repo()` — shadow repo initialization ✅
- `DEFAULT_EXCLUDES` list ✅
- Test structure and patterns ✅
-
-## What We Change From PR #559
-
- **Remove tool schema/registry** — not a tool
- **Remove injection into file_operations.py and patch_parser.py** — trigger from run_agent.py instead
- **Add turn-scoped deduplication** — one checkpoint per turn, not per operation
- **Add pruning** — keep last N snapshots
- **Add config flag** — opt-in, not mandatory
- **Add /rollback command** — user-facing restore UI
- **Add file count guard** — skip huge directories
-
-## Implementation Order
-
-1. `tools/checkpoint_manager.py` — core class with take/list/restore/prune
-2. `tests/tools/test_checkpoint_manager.py` — tests
-3. `hermes_cli/config.py` — config keys
-4. `run_agent.py` — integration (init + trigger)
-5. `cli.py` — `/rollback` slash command
-6. `gateway/run.py` — `/rollback` slash command
-7. Full test suite run + manual smoke test
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "hermes-agent"
-version = "0.2.0"
+version = "0.3.0"
 description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -205,6 +205,33 @@ _NEVER_PARALLEL_TOOLS = frozenset({"clarify"})
 # Maximum number of concurrent worker threads for parallel tool execution.
 _MAX_TOOL_WORKERS = 8

+# Patterns that indicate a terminal command may modify/delete files.
+_DESTRUCTIVE_PATTERNS = re.compile(
+    r"""(?:^|\s|&&|\|\||;|`)(?:
+        rm\s|rmdir\s|
+        mv\s|
+        sed\s+-i|
+        truncate\s|
+        dd\s|
+        shred\s|
+        git\s+(?:reset|clean|checkout)\s
+    )""",
+    re.VERBOSE,
+)
+# Output redirects that overwrite files (> but not >>)
+_REDIRECT_OVERWRITE = re.compile(r'[^>]>[^>]|^>[^>]')
+
+
+def _is_destructive_command(cmd: str) -> bool:
+    """Heuristic: does this terminal command look like it modifies/deletes files?"""
+    if not cmd:
+        return False
+    if _DESTRUCTIVE_PATTERNS.search(cmd):
+        return True
+    if _REDIRECT_OVERWRITE.search(cmd):
+        return True
+    return False
+

 def _inject_honcho_turn_context(content, turn_context: str):
    """Append Honcho recall to the current-turn user message without mutating history.
@@ -269,6 +296,7 @@ class AIAgent:
        reasoning_callback: callable = None,
        clarify_callback: callable = None,
        step_callback: callable = None,
+        stream_delta_callback: callable = None,
        max_tokens: int = None,
        reasoning_config: Dict[str, Any] = None,
        prefill_messages: List[Dict[str, Any]] = None,
@@ -368,6 +396,7 @@ class AIAgent:
        self.reasoning_callback = reasoning_callback
        self.clarify_callback = clarify_callback
        self.step_callback = step_callback
+        self.stream_delta_callback = stream_delta_callback
        self._last_reported_tool = None  # Track for "new tool" mode
        
        # Interrupt mechanism for breaking out of tool loops
@@ -517,6 +546,8 @@ class AIAgent:
            effective_key = api_key or resolve_anthropic_token() or ""
            self._anthropic_api_key = effective_key
            self._anthropic_base_url = base_url
+            from agent.anthropic_adapter import _is_oauth_token as _is_oat
+            self._is_anthropic_oauth = _is_oat(effective_key)
            self._anthropic_client = build_anthropic_client(effective_key, base_url)
            # No OpenAI client needed for Anthropic mode
            self.client = None
@@ -785,7 +816,7 @@ class AIAgent:
                logger.debug("peer %s memory_mode=honcho: local USER.md writes disabled", _hcfg.peer_name or "user")

        # Skills config: nudge interval for skill creation reminders
-        self._skill_nudge_interval = 15
+        self._skill_nudge_interval = 10
        try:
            from hermes_cli.config import load_config as _load_skills_config
            skills_config = _load_skills_config().get("skills", {})
@@ -829,9 +860,9 @@ class AIAgent:
        """Verbose print — suppressed when streaming TTS is active.

        Pass ``force=True`` for error/warning messages that should always be
-        shown even during streaming TTS playback.
+        shown even during streaming playback (TTS or display).
        """
-        if not force and getattr(self, "_stream_callback", None) is not None:
+        if not force and self._has_stream_consumers():
            return
        print(*args, **kwargs)

@@ -2575,15 +2606,39 @@ class AIAgent:
    def _close_request_openai_client(self, client: Any, *, reason: str) -> None:
        self._close_openai_client(client, reason=reason, shared=False)

-    def _run_codex_stream(self, api_kwargs: dict, client: Any = None):
+    def _run_codex_stream(self, api_kwargs: dict, client: Any = None, on_first_delta: callable = None):
        """Execute one streaming Responses API request and return the final response."""
        active_client = client or self._ensure_primary_openai_client(reason="codex_stream_direct")
        max_stream_retries = 1
+        has_tool_calls = False
+        first_delta_fired = False
        for attempt in range(max_stream_retries + 1):
            try:
                with active_client.responses.stream(**api_kwargs) as stream:
-                    for _ in stream:
-                        pass
+                    for event in stream:
+                        if self._interrupt_requested:
+                            break
+                        event_type = getattr(event, "type", "")
+                        # Fire callbacks on text content deltas (suppress during tool calls)
+                        if "output_text.delta" in event_type or event_type == "response.output_text.delta":
+                            delta_text = getattr(event, "delta", "")
+                            if delta_text and not has_tool_calls:
+                                if not first_delta_fired:
+                                    first_delta_fired = True
+                                    if on_first_delta:
+                                        try:
+                                            on_first_delta()
+                                        except Exception:
+                                            pass
+                                self._fire_stream_delta(delta_text)
+                        # Track tool calls to suppress text streaming
+                        elif "function_call" in event_type:
+                            has_tool_calls = True
+                        # Fire reasoning callbacks
+                        elif "reasoning" in event_type and "delta" in event_type:
+                            reasoning_text = getattr(event, "delta", "")
+                            if reasoning_text:
+                                self._fire_reasoning_delta(reasoning_text)
                    return stream.get_final_response()
            except RuntimeError as exc:
                err_text = str(exc)
@@ -2764,6 +2819,7 @@ class AIAgent:
                    result["response"] = self._run_codex_stream(
                        api_kwargs,
                        client=request_client_holder["client"],
+                        on_first_delta=getattr(self, "_codex_on_first_delta", None),
                    )
                elif self.api_mode == "anthropic_messages":
                    result["response"] = self._anthropic_messages_create(api_kwargs)
@@ -2805,116 +2861,246 @@ class AIAgent:
            raise result["error"]
        return result["response"]

-    def _streaming_api_call(self, api_kwargs: dict, stream_callback):
-        """Streaming variant of _interruptible_api_call for voice TTS pipeline.
+    # ── Unified streaming API call ─────────────────────────────────────────

-        Uses ``stream=True`` and forwards content deltas to *stream_callback*
-        in real-time.  Returns a ``SimpleNamespace`` that mimics a normal
-        ``ChatCompletion`` so the rest of the agent loop works unchanged.
+    def _fire_stream_delta(self, text: str) -> None:
+        """Fire all registered stream delta callbacks (display + TTS)."""
+        for cb in (self.stream_delta_callback, self._stream_callback):
+            if cb is not None:
+                try:
+                    cb(text)
+                except Exception:
+                    pass

-        This method is separate from ``_interruptible_api_call`` to keep the
-        core agent loop untouched for non-voice users.
+    def _fire_reasoning_delta(self, text: str) -> None:
+        """Fire reasoning callback if registered."""
+        cb = self.reasoning_callback
+        if cb is not None:
+            try:
+                cb(text)
+            except Exception:
+                pass
+
+    def _has_stream_consumers(self) -> bool:
+        """Return True if any streaming consumer is registered."""
+        return (
+            self.stream_delta_callback is not None
+            or getattr(self, "_stream_callback", None) is not None
+        )
+
+    def _interruptible_streaming_api_call(
+        self, api_kwargs: dict, *, on_first_delta: callable = None
+    ):
+        """Streaming variant of _interruptible_api_call for real-time token delivery.
+
+        Handles all three api_modes:
+        - chat_completions: stream=True on OpenAI-compatible endpoints
+        - anthropic_messages: client.messages.stream() via Anthropic SDK
+        - codex_responses: delegates to _run_codex_stream (already streaming)
+
+        Fires stream_delta_callback and _stream_callback for each text token.
+        Tool-call turns suppress the callback — only text-only final responses
+        stream to the consumer.  Returns a SimpleNamespace that mimics the
+        non-streaming response shape so the rest of the agent loop is unchanged.
+
+        Falls back to _interruptible_api_call on provider errors indicating
+        streaming is not supported.
        """
+        if self.api_mode == "codex_responses":
+            # Codex streams internally via _run_codex_stream. The main dispatch
+            # in _interruptible_api_call already calls it; we just need to
+            # ensure on_first_delta reaches it. Store it on the instance
+            # temporarily so _run_codex_stream can pick it up.
+            self._codex_on_first_delta = on_first_delta
+            try:
+                return self._interruptible_api_call(api_kwargs)
+            finally:
+                self._codex_on_first_delta = None
+
        result = {"response": None, "error": None}
        request_client_holder = {"client": None}
+        first_delta_fired = {"done": False}
+        deltas_were_sent = {"yes": False}  # Track if any deltas were fired (for fallback)
+
+        def _fire_first_delta():
+            if not first_delta_fired["done"] and on_first_delta:
+                first_delta_fired["done"] = True
+                try:
+                    on_first_delta()
+                except Exception:
+                    pass
+
+        def _call_chat_completions():
+            """Stream a chat completions response."""
+            stream_kwargs = {**api_kwargs, "stream": True, "stream_options": {"include_usage": True}}
+            request_client_holder["client"] = self._create_request_openai_client(
+                reason="chat_completion_stream_request"
+            )
+            stream = request_client_holder["client"].chat.completions.create(**stream_kwargs)
+
+            content_parts: list = []
+            tool_calls_acc: dict = {}
+            finish_reason = None
+            model_name = None
+            role = "assistant"
+            reasoning_parts: list = []
+            usage_obj = None
+
+            for chunk in stream:
+                if self._interrupt_requested:
+                    break
+
+                if not chunk.choices:
+                    if hasattr(chunk, "model") and chunk.model:
+                        model_name = chunk.model
+                    # Usage comes in the final chunk with empty choices
+                    if hasattr(chunk, "usage") and chunk.usage:
+                        usage_obj = chunk.usage
+                    continue
+
+                delta = chunk.choices[0].delta
+                if hasattr(chunk, "model") and chunk.model:
+                    model_name = chunk.model
+
+                # Accumulate reasoning content
+                reasoning_text = getattr(delta, "reasoning_content", None) or getattr(delta, "reasoning", None)
+                if reasoning_text:
+                    reasoning_parts.append(reasoning_text)
+                    self._fire_reasoning_delta(reasoning_text)
+
+                # Accumulate text content — fire callback only when no tool calls
+                if delta and delta.content:
+                    content_parts.append(delta.content)
+                    if not tool_calls_acc:
+                        _fire_first_delta()
+                        self._fire_stream_delta(delta.content)
+                        deltas_were_sent["yes"] = True
+
+                # Accumulate tool call deltas (silently, no callback)
+                if delta and delta.tool_calls:
+                    for tc_delta in delta.tool_calls:
+                        idx = tc_delta.index if tc_delta.index is not None else 0
+                        if idx not in tool_calls_acc:
+                            tool_calls_acc[idx] = {
+                                "id": tc_delta.id or "",
+                                "type": "function",
+                                "function": {"name": "", "arguments": ""},
+                            }
+                        entry = tool_calls_acc[idx]
+                        if tc_delta.id:
+                            entry["id"] = tc_delta.id
+                        if tc_delta.function:
+                            if tc_delta.function.name:
+                                entry["function"]["name"] += tc_delta.function.name
+                            if tc_delta.function.arguments:
+                                entry["function"]["arguments"] += tc_delta.function.arguments
+
+                if chunk.choices[0].finish_reason:
+                    finish_reason = chunk.choices[0].finish_reason
+
+                # Usage in the final chunk
+                if hasattr(chunk, "usage") and chunk.usage:
+                    usage_obj = chunk.usage
+
+            # Build mock response matching non-streaming shape
+            full_content = "".join(content_parts) or None
+            mock_tool_calls = None
+            if tool_calls_acc:
+                mock_tool_calls = []
+                for idx in sorted(tool_calls_acc):
+                    tc = tool_calls_acc[idx]
+                    mock_tool_calls.append(SimpleNamespace(
+                        id=tc["id"],
+                        type=tc["type"],
+                        function=SimpleNamespace(
+                            name=tc["function"]["name"],
+                            arguments=tc["function"]["arguments"],
+                        ),
+                    ))
+
+            full_reasoning = "".join(reasoning_parts) or None
+            mock_message = SimpleNamespace(
+                role=role,
+                content=full_content,
+                tool_calls=mock_tool_calls,
+                reasoning_content=full_reasoning,
+            )
+            mock_choice = SimpleNamespace(
+                index=0,
+                message=mock_message,
+                finish_reason=finish_reason or "stop",
+            )
+            return SimpleNamespace(
+                id="stream-" + str(uuid.uuid4()),
+                model=model_name,
+                choices=[mock_choice],
+                usage=usage_obj,
+            )
+
+        def _call_anthropic():
+            """Stream an Anthropic Messages API response.
+
+            Fires delta callbacks for real-time token delivery, but returns
+            the native Anthropic Message object from get_final_message() so
+            the rest of the agent loop (validation, tool extraction, etc.)
+            works unchanged.
+            """
+            has_tool_use = False
+
+            # Use the Anthropic SDK's streaming context manager
+            with self._anthropic_client.messages.stream(**api_kwargs) as stream:
+                for event in stream:
+                    if self._interrupt_requested:
+                        break
+
+                    event_type = getattr(event, "type", None)
+
+                    if event_type == "content_block_start":
+                        block = getattr(event, "content_block", None)
+                        if block and getattr(block, "type", None) == "tool_use":
+                            has_tool_use = True
+
+                    elif event_type == "content_block_delta":
+                        delta = getattr(event, "delta", None)
+                        if delta:
+                            delta_type = getattr(delta, "type", None)
+                            if delta_type == "text_delta":
+                                text = getattr(delta, "text", "")
+                                if text and not has_tool_use:
+                                    _fire_first_delta()
+                                    self._fire_stream_delta(text)
+                            elif delta_type == "thinking_delta":
+                                thinking_text = getattr(delta, "thinking", "")
+                                if thinking_text:
+                                    self._fire_reasoning_delta(thinking_text)
+
+                # Return the native Anthropic Message for downstream processing
+                return stream.get_final_message()

        def _call():
            try:
-                stream_kwargs = {**api_kwargs, "stream": True}
-                request_client_holder["client"] = self._create_request_openai_client(
-                    reason="chat_completion_stream_request"
-                )
-                stream = request_client_holder["client"].chat.completions.create(**stream_kwargs)
-
-                content_parts: list[str] = []
-                tool_calls_acc: dict[int, dict] = {}
-                finish_reason = None
-                model_name = None
-                role = "assistant"
-
-                for chunk in stream:
-                    if not chunk.choices:
-                        if hasattr(chunk, "model") and chunk.model:
-                            model_name = chunk.model
-                        continue
-
-                    delta = chunk.choices[0].delta
-                    if hasattr(chunk, "model") and chunk.model:
-                        model_name = chunk.model
-
-                    if delta and delta.content:
-                        content_parts.append(delta.content)
-                        try:
-                            stream_callback(delta.content)
-                        except Exception:
-                            pass
-
-                    if delta and delta.tool_calls:
-                        for tc_delta in delta.tool_calls:
-                            idx = tc_delta.index if tc_delta.index is not None else 0
-                            if idx in tool_calls_acc and tc_delta.id and tc_delta.id != tool_calls_acc[idx]["id"]:
-                                matched = False
-                                for eidx, eentry in tool_calls_acc.items():
-                                    if eentry["id"] == tc_delta.id:
-                                        idx = eidx
-                                        matched = True
-                                        break
-                                if not matched:
-                                    idx = (max(k for k in tool_calls_acc if isinstance(k, int)) + 1) if tool_calls_acc else 0
-                            if idx not in tool_calls_acc:
-                                tool_calls_acc[idx] = {
-                                    "id": tc_delta.id or "",
-                                    "type": "function",
-                                    "function": {"name": "", "arguments": ""},
-                                }
-                            entry = tool_calls_acc[idx]
-                            if tc_delta.id:
-                                entry["id"] = tc_delta.id
-                            if tc_delta.function:
-                                if tc_delta.function.name:
-                                    entry["function"]["name"] += tc_delta.function.name
-                                if tc_delta.function.arguments:
-                                    entry["function"]["arguments"] += tc_delta.function.arguments
-
-                    if chunk.choices[0].finish_reason:
-                        finish_reason = chunk.choices[0].finish_reason
-
-                full_content = "".join(content_parts) or None
-                mock_tool_calls = None
-                if tool_calls_acc:
-                    mock_tool_calls = []
-                    for idx in sorted(tool_calls_acc):
-                        tc = tool_calls_acc[idx]
-                        mock_tool_calls.append(SimpleNamespace(
-                            id=tc["id"],
-                            type=tc["type"],
-                            function=SimpleNamespace(
-                                name=tc["function"]["name"],
-                                arguments=tc["function"]["arguments"],
-                            ),
-                        ))
-
-                mock_message = SimpleNamespace(
-                    role=role,
-                    content=full_content,
-                    tool_calls=mock_tool_calls,
-                    reasoning_content=None,
-                )
-                mock_choice = SimpleNamespace(
-                    index=0,
-                    message=mock_message,
-                    finish_reason=finish_reason or "stop",
-                )
-                mock_response = SimpleNamespace(
-                    id="stream-" + str(uuid.uuid4()),
-                    model=model_name,
-                    choices=[mock_choice],
-                    usage=None,
-                )
-                result["response"] = mock_response
-
+                if self.api_mode == "anthropic_messages":
+                    self._try_refresh_anthropic_client_credentials()
+                    result["response"] = _call_anthropic()
+                else:
+                    result["response"] = _call_chat_completions()
            except Exception as e:
-                result["error"] = e
+                if deltas_were_sent["yes"]:
+                    # Streaming failed AFTER some tokens were already delivered
+                    # to consumers. Don't fall back — that would cause
+                    # double-delivery (partial streamed + full non-streamed).
+                    # Let the error propagate; the partial content already
+                    # reached the user via the stream.
+                    logger.warning("Streaming failed after partial delivery, not falling back: %s", e)
+                    result["error"] = e
+                else:
+                    # Streaming failed before any tokens reached consumers.
+                    # Safe to fall back to the standard non-streaming path.
+                    logger.info("Streaming failed before delivery, falling back to non-streaming: %s", e)
+                    try:
+                        result["response"] = self._interruptible_api_call(api_kwargs)
+                    except Exception as fallback_err:
+                        result["error"] = fallback_err
            finally:
                request_client = request_client_holder.get("client")
                if request_client is not None:
@@ -2940,7 +3126,7 @@ class AIAgent:
                            self._close_request_openai_client(request_client, reason="stream_interrupt_abort")
                except Exception:
                    pass
-                raise InterruptedError("Agent interrupted during API call")
+                raise InterruptedError("Agent interrupted during streaming API call")
        if result["error"] is not None:
            raise result["error"]
        return result["response"]
@@ -3188,6 +3374,7 @@ class AIAgent:
                tools=self.tools,
                max_tokens=self.max_tokens,
                reasoning_config=self.reasoning_config,
+                is_oauth=getattr(self, "_is_anthropic_oauth", False),
            )

        if self.api_mode == "codex_responses":
@@ -3336,6 +3523,8 @@ class AIAgent:
        base_url = (self.base_url or "").lower()
        if "nousresearch" in base_url:
            return True
+        if "ai-gateway.vercel.sh" in base_url:
+            return True
        if "openrouter" not in base_url:
            return False
        if "api.mistral.ai" in base_url:
@@ -3515,7 +3704,8 @@ class AIAgent:

        flush_content = (
            "[System: The session is being compressed. "
-            "Please save anything worth remembering to your memories.]"
+            "Save anything worth remembering — prioritize user preferences, "
+            "corrections, and recurring patterns over task-specific details.]"
        )
        _sentinel = f"__flush_{id(self)}_{time.monotonic()}"
        flush_msg = {"role": "user", "content": flush_content, "_flush_sentinel": _sentinel}
@@ -3604,7 +3794,7 @@ class AIAgent:
                    tool_calls = assistant_msg.tool_calls
            elif self.api_mode == "anthropic_messages" and not _aux_available:
                from agent.anthropic_adapter import normalize_anthropic_response as _nar_flush
-                _flush_msg, _ = _nar_flush(response)
+                _flush_msg, _ = _nar_flush(response, strip_tool_prefix=getattr(self, '_is_anthropic_oauth', False))
                if _flush_msg and _flush_msg.tool_calls:
                    tool_calls = _flush_msg.tool_calls
            elif hasattr(response, "choices") and response.choices:
@@ -3790,6 +3980,8 @@ class AIAgent:
            return handle_function_call(
                function_name, function_args, effective_task_id,
                enabled_tools=list(self.valid_tool_names) if self.valid_tool_names else None,
+                honcho_manager=self._honcho,
+                honcho_session_key=self._honcho_session_key,
            )

    def _execute_tool_calls_concurrent(self, assistant_message, messages: list, effective_task_id: str, api_call_count: int = 0) -> None:
@@ -3840,6 +4032,18 @@ class AIAgent:
                except Exception:
                    pass

+            # Checkpoint before destructive terminal commands
+            if function_name == "terminal" and self._checkpoint_mgr.enabled:
+                try:
+                    cmd = function_args.get("command", "")
+                    if _is_destructive_command(cmd):
+                        cwd = function_args.get("workdir") or os.getenv("TERMINAL_CWD", os.getcwd())
+                        self._checkpoint_mgr.ensure_checkpoint(
+                            cwd, f"before terminal: {cmd[:60]}"
+                        )
+                except Exception:
+                    pass
+
            parsed_calls.append((tool_call, function_name, function_args))

        # ── Logging / callbacks ──────────────────────────────────────────
@@ -4033,6 +4237,18 @@ class AIAgent:
                except Exception:
                    pass  # never block tool execution

+            # Checkpoint before destructive terminal commands
+            if function_name == "terminal" and self._checkpoint_mgr.enabled:
+                try:
+                    cmd = function_args.get("command", "")
+                    if _is_destructive_command(cmd):
+                        cwd = function_args.get("workdir") or os.getenv("TERMINAL_CWD", os.getcwd())
+                        self._checkpoint_mgr.ensure_checkpoint(
+                            cwd, f"before terminal: {cmd[:60]}"
+                        )
+                except Exception:
+                    pass  # never block tool execution
+
            tool_start_time = time.time()

            if function_name == "todo":
@@ -4119,7 +4335,7 @@ class AIAgent:
                        spinner.stop(cute_msg)
                    elif self.quiet_mode:
                        self._vprint(f"  {cute_msg}")
-            elif self.quiet_mode and self._stream_callback is None:
+            elif self.quiet_mode and not self._has_stream_consumers():
                face = random.choice(KawaiiSpinner.KAWAII_WAITING)
                emoji = _get_tool_emoji(function_name)
                preview = _build_tool_preview(function_name, function_args) or function_name
@@ -4132,6 +4348,8 @@ class AIAgent:
                    function_result = handle_function_call(
                        function_name, function_args, effective_task_id,
                        enabled_tools=list(self.valid_tool_names) if self.valid_tool_names else None,
+                        honcho_manager=self._honcho,
+                        honcho_session_key=self._honcho_session_key,
                    )
                    _spinner_result = function_result
                except Exception as tool_error:
@@ -4146,6 +4364,8 @@ class AIAgent:
                    function_result = handle_function_call(
                        function_name, function_args, effective_task_id,
                        enabled_tools=list(self.valid_tool_names) if self.valid_tool_names else None,
+                        honcho_manager=self._honcho,
+                        honcho_session_key=self._honcho_session_key,
                    )
                except Exception as tool_error:
                    function_result = f"Error executing tool '{function_name}': {tool_error}"
@@ -4335,9 +4555,10 @@ class AIAgent:
                if self.api_mode == "anthropic_messages":
                    from agent.anthropic_adapter import build_anthropic_kwargs as _bak, normalize_anthropic_response as _nar
                    _ant_kw = _bak(model=self.model, messages=api_messages, tools=None,
-                                   max_tokens=self.max_tokens, reasoning_config=self.reasoning_config)
+                                   max_tokens=self.max_tokens, reasoning_config=self.reasoning_config,
+                                   is_oauth=getattr(self, '_is_anthropic_oauth', False))
                    summary_response = self._anthropic_messages_create(_ant_kw)
-                    _msg, _ = _nar(summary_response)
+                    _msg, _ = _nar(summary_response, strip_tool_prefix=getattr(self, '_is_anthropic_oauth', False))
                    final_response = (_msg.content or "").strip()
                else:
                    summary_response = self._ensure_primary_openai_client(reason="iteration_limit_summary").chat.completions.create(**summary_kwargs)
@@ -4365,9 +4586,10 @@ class AIAgent:
                elif self.api_mode == "anthropic_messages":
                    from agent.anthropic_adapter import build_anthropic_kwargs as _bak2, normalize_anthropic_response as _nar2
                    _ant_kw2 = _bak2(model=self.model, messages=api_messages, tools=None,
+                                    is_oauth=getattr(self, '_is_anthropic_oauth', False),
                                     max_tokens=self.max_tokens, reasoning_config=self.reasoning_config)
                    retry_response = self._anthropic_messages_create(_ant_kw2)
-                    _retry_msg, _ = _nar2(retry_response)
+                    _retry_msg, _ = _nar2(retry_response, strip_tool_prefix=getattr(self, '_is_anthropic_oauth', False))
                    final_response = (_retry_msg.content or "").strip()
                else:
                    summary_kwargs = {
@@ -4410,6 +4632,7 @@ class AIAgent:
        task_id: str = None,
        stream_callback: Optional[callable] = None,
        persist_user_message: Optional[str] = None,
+        sync_honcho: bool = True,
    ) -> Dict[str, Any]:
        """
        Run a complete conversation with tool calling until completion.
@@ -4425,6 +4648,8 @@ class AIAgent:
            persist_user_message: Optional clean user message to store in
                transcripts/history when user_message contains API-only
                synthetic prefixes.
+            sync_honcho: When False, skip writing the final synthetic turn back
+                to Honcho or queuing follow-up prefetch work.

        Returns:
            Dict: Complete conversation result with final response and message history
@@ -4481,8 +4706,9 @@ class AIAgent:
            self._turns_since_memory += 1
            if self._turns_since_memory >= self._memory_nudge_interval:
                user_message += (
-                    "\n\n[System: You've had several exchanges in this session. "
-                    "Consider whether there's anything worth saving to your memories.]"
+                    "\n\n[System: You've had several exchanges. Consider: "
+                    "has the user shared preferences, corrected you, or revealed "
+                    "something about their workflow worth remembering for future sessions?]"
                )
                self._turns_since_memory = 0

@@ -4492,8 +4718,9 @@ class AIAgent:
                and self._iters_since_skill >= self._skill_nudge_interval
                and "skill_manage" in self.valid_tool_names):
            user_message += (
-                "\n\n[System: The previous task involved many steps. "
-                "If you discovered a reusable workflow, consider saving it as a skill.]"
+                "\n\n[System: The previous task involved many tool calls. "
+                "Save the approach as a skill if it's reusable, or update "
+                "any existing skill you used if it was wrong or incomplete.]"
            )
            self._iters_since_skill = 0

@@ -4747,8 +4974,8 @@ class AIAgent:
                self._vprint(f"\n{self.log_prefix}🔄 Making API call #{api_call_count}/{self.max_iterations}...")
                self._vprint(f"{self.log_prefix}   📊 Request size: {len(api_messages)} messages, ~{approx_tokens:,} tokens (~{total_chars:,} chars)")
                self._vprint(f"{self.log_prefix}   🔧 Available tools: {len(self.tools) if self.tools else 0}")
-            elif self._stream_callback is None:
-                # Animated thinking spinner in quiet mode (skip during streaming TTS)
+            elif not self._has_stream_consumers():
+                # Animated thinking spinner in quiet mode (skip during streaming)
                face = random.choice(KawaiiSpinner.KAWAII_THINKING)
                verb = random.choice(KawaiiSpinner.THINKING_VERBS)
                if self.thinking_callback:
@@ -4788,33 +5015,22 @@ class AIAgent:
                    if os.getenv("HERMES_DUMP_REQUESTS", "").strip().lower() in {"1", "true", "yes", "on"}:
                        self._dump_api_request_debug(api_kwargs, reason="preflight")

-                    cb = getattr(self, "_stream_callback", None)
-                    if cb is not None and self.api_mode == "chat_completions":
-                        response = self._streaming_api_call(api_kwargs, cb)
+                    if self._has_stream_consumers():
+                        # Streaming path: fire delta callbacks for real-time
+                        # token delivery to CLI display, gateway, or TTS.
+                        def _stop_spinner():
+                            nonlocal thinking_spinner
+                            if thinking_spinner:
+                                thinking_spinner.stop("")
+                                thinking_spinner = None
+                            if self.thinking_callback:
+                                self.thinking_callback("")
+
+                        response = self._interruptible_streaming_api_call(
+                            api_kwargs, on_first_delta=_stop_spinner
+                        )
                    else:
                        response = self._interruptible_api_call(api_kwargs)
-                        # Forward full response to TTS callback for non-streaming providers
-                        # (e.g. Anthropic) so voice TTS still works via batch delivery.
-                        if cb is not None and response:
-                            try:
-                                content = None
-                                # Try choices first — _interruptible_api_call converts all
-                                # providers (including Anthropic) to this format.
-                                try:
-                                    content = response.choices[0].message.content
-                                except (AttributeError, IndexError):
-                                    pass
-                                # Fallback: Anthropic native content blocks
-                                if not content and self.api_mode == "anthropic_messages":
-                                    text_parts = [
-                                        block.text for block in getattr(response, "content", [])
-                                        if getattr(block, "type", None) == "text" and getattr(block, "text", None)
-                                    ]
-                                    content = " ".join(text_parts) if text_parts else None
-                                if content:
-                                    cb(content)
-                            except Exception:
-                                pass
                    
                    api_duration = time.time() - api_start_time
                    
@@ -5069,6 +5285,22 @@ class AIAgent:
                        self.session_completion_tokens += completion_tokens
                        self.session_total_tokens += total_tokens
                        self.session_api_calls += 1
+
+                        # Persist token counts to session DB for /insights.
+                        # Gateway sessions persist via session_store.update_session()
+                        # after run_conversation returns, so only persist here for
+                        # CLI (and other non-gateway) platforms to avoid double-counting.
+                        if (self._session_db and self.session_id
+                                and getattr(self, 'platform', None) == 'cli'):
+                            try:
+                                self._session_db.update_token_counts(
+                                    self.session_id,
+                                    input_tokens=prompt_tokens,
+                                    output_tokens=completion_tokens,
+                                    model=self.model,
+                                )
+                            except Exception:
+                                pass  # never block the agent loop
                        
                        if self.verbose_logging:
                            logging.debug(f"Token usage: prompt={usage_dict['prompt_tokens']:,}, completion={usage_dict['completion_tokens']:,}, total={usage_dict['total_tokens']:,}")
@@ -5317,10 +5549,13 @@ class AIAgent:
                    # These indicate a problem with the request itself (bad model ID,
                    # invalid API key, forbidden, etc.) and will never succeed on retry.
                    # Note: 413 and context-length errors are excluded — handled above.
+                    # 429 (rate limit) is transient and MUST be retried with backoff.
+                    # 529 (Anthropic overloaded) is also transient.
                    # Also catch local validation errors (ValueError, TypeError) — these
                    # are programming bugs, not transient failures.
+                    _RETRYABLE_STATUS_CODES = {413, 429, 529}
                    is_local_validation_error = isinstance(api_error, (ValueError, TypeError))
-                    is_client_status_error = isinstance(status_code, int) and 400 <= status_code < 500 and status_code != 413
+                    is_client_status_error = isinstance(status_code, int) and 400 <= status_code < 500 and status_code not in _RETRYABLE_STATUS_CODES
                    is_client_error = (is_local_validation_error or is_client_status_error or any(phrase in error_msg for phrase in [
                        'error code: 401', 'error code: 403',
                        'error code: 404', 'error code: 422',
@@ -5416,7 +5651,9 @@ class AIAgent:
                    assistant_message, finish_reason = self._normalize_codex_response(response)
                elif self.api_mode == "anthropic_messages":
                    from agent.anthropic_adapter import normalize_anthropic_response
-                    assistant_message, finish_reason = normalize_anthropic_response(response)
+                    assistant_message, finish_reason = normalize_anthropic_response(
+                        response, strip_tool_prefix=getattr(self, "_is_anthropic_oauth", False)
+                    )
                else:
                    assistant_message = response.choices[0].message
                
@@ -5917,7 +6154,7 @@ class AIAgent:
        self._persist_session(messages, conversation_history)

        # Sync conversation to Honcho for user modeling
-        if final_response and not interrupted:
+        if final_response and not interrupted and sync_honcho:
            self._honcho_sync(original_user_message, final_response)
            self._queue_honcho_prefetch(original_user_message)

@@ -483,6 +483,8 @@ install_system_packages() {
        elif command -v sudo &> /dev/null; then
            if [ "$IS_INTERACTIVE" = true ]; then
                echo ""
+                log_info "sudo is needed ONLY to install optional system packages (${pkgs[*]}) via your package manager."
+                log_info "Hermes Agent itself does not require or retain root access."
                read -p "Install ${description}? (requires sudo) [y/N] " -n 1 -r
                echo
                if [[ $REPLY =~ ^[Yy]$ ]]; then
@@ -496,8 +498,9 @@ install_system_packages() {
                # Non-interactive (e.g. curl | bash) but a terminal is available.
                # Read the prompt from /dev/tty (same approach the setup wizard uses).
                echo ""
-                log_info "Installing ${description} requires sudo."
-                read -p "Install? [Y/n] " -n 1 -r < /dev/tty
+                log_info "sudo is needed ONLY to install optional system packages (${pkgs[*]}) via your package manager."
+                log_info "Hermes Agent itself does not require or retain root access."
+                read -p "Install ${description}? [Y/n] " -n 1 -r < /dev/tty
                echo
                if [[ $REPLY =~ ^[Yy]$ ]] || [[ -z $REPLY ]]; then
                    if sudo DEBIAN_FRONTEND=noninteractive NEEDRESTART_MODE=a $install_cmd < /dev/tty; then
@@ -688,7 +691,9 @@ install_deps() {
                    sudo DEBIAN_FRONTEND=noninteractive NEEDRESTART_MODE=a apt-get update -qq && sudo DEBIAN_FRONTEND=noninteractive NEEDRESTART_MODE=a apt-get install -y -qq build-essential python3-dev libffi-dev >/dev/null 2>&1 || true
                    log_success "Build tools installed"
                else
-                    read -p "Install build tools (build-essential, python3-dev)? (requires sudo) [Y/n] " -n 1 -r < /dev/tty
+                    log_info "sudo is needed ONLY to install build tools (build-essential, python3-dev, libffi-dev) via apt."
+                    log_info "Hermes Agent itself does not require or retain root access."
+                    read -p "Install build tools? [Y/n] " -n 1 -r < /dev/tty
                    echo
                    if [[ $REPLY =~ ^[Yy]$ ]] || [[ -z $REPLY ]]; then
                        sudo DEBIAN_FRONTEND=noninteractive NEEDRESTART_MODE=a apt-get update -qq && sudo DEBIAN_FRONTEND=noninteractive NEEDRESTART_MODE=a apt-get install -y -qq build-essential python3-dev libffi-dev >/dev/null 2>&1 || true
@@ -908,6 +913,8 @@ install_node_deps() {
                cd "$INSTALL_DIR" && npx playwright install chromium 2>/dev/null || true
                ;;
            *)
+                log_info "Playwright may request sudo to install browser system dependencies (shared libraries)."
+                log_info "This is standard Playwright setup — Hermes itself does not require root access."
                cd "$INSTALL_DIR" && npx playwright install --with-deps chromium 2>/dev/null || true
                ;;
        esac
@@ -5,12 +5,26 @@ description: "Production pipeline for ASCII art video — any format. Converts v

 # ASCII Video Production Pipeline

-Full production pipeline for rendering any content as colored ASCII character video.
+## Creative Standard
+
+This is visual art. ASCII characters are the medium; cinema is the standard.
+
+**Before writing a single line of code**, articulate the creative concept. What is the mood? What visual story does this tell? What makes THIS project different from every other ASCII video? The user's prompt is a starting point — interpret it with creative ambition, not literal transcription.
+
+**First-render excellence is non-negotiable.** The output must be visually striking without requiring revision rounds. If something looks generic, flat, or like "AI-generated ASCII art," it is wrong — rethink the creative concept before shipping.
+
+**Go beyond the reference vocabulary.** The effect catalogs, shader presets, and palette libraries in the references are a starting vocabulary. For every project, combine, modify, and invent new patterns. The catalog is a palette of paints — you write the painting.
+
+**Be proactively creative.** Extend the skill's vocabulary when the project calls for it. If the references don't have what the vision demands, build it. Include at least one visual moment the user didn't ask for but will appreciate — a transition, an effect, a color choice that elevates the whole piece.
+
+**Cohesive aesthetic over technical correctness.** All scenes in a video must feel connected by a unifying visual language — shared color temperature, related character palettes, consistent motion vocabulary. A technically correct video where every scene uses a random different effect is an aesthetic failure.
+
+**Dense, layered, considered.** Every frame should reward viewing. Never flat black backgrounds. Always multi-grid composition. Always per-scene variation. Always intentional color.

 ## Modes

-| Mode | Input | Output | Read |
-|------|-------|--------|------|
+| Mode | Input | Output | Reference |
+|------|-------|--------|-----------|
 | **Video-to-ASCII** | Video file | ASCII recreation of source footage | `references/inputs.md` § Video Sampling |
 | **Audio-reactive** | Audio file | Generative visuals driven by audio features | `references/inputs.md` § Audio Analysis |
 | **Generative** | None (or seed params) | Procedural ASCII animation | `references/effects.md` |
@@ -20,210 +34,154 @@ Full production pipeline for rendering any content as colored ASCII character vi

 ## Stack

-Single self-contained Python script per project. No GPU.
+Single self-contained Python script per project. No GPU required.

 | Layer | Tool | Purpose |
 |-------|------|---------|
 | Core | Python 3.10+, NumPy | Math, array ops, vectorized effects |
-| Signal | SciPy | FFT, peak detection (audio modes only) |
-| Imaging | Pillow (PIL) | Font rasterization, video frame decoding, image I/O |
-| Video I/O | ffmpeg (CLI) | Decode input, encode output segments, mux audio, mix tracks |
-| Parallel | concurrent.futures / multiprocessing | N workers for batch/clip rendering |
-| TTS | ElevenLabs API (or similar) | Generate narration clips for quote/testimonial videos |
-| Optional | OpenCV | Video frame sampling, edge detection, optical flow |
+| Signal | SciPy | FFT, peak detection (audio modes) |
+| Imaging | Pillow (PIL) | Font rasterization, frame decoding, image I/O |
+| Video I/O | ffmpeg (CLI) | Decode input, encode output, mux audio |
+| Parallel | concurrent.futures | N workers for batch/clip rendering |
+| TTS | ElevenLabs API (optional) | Generate narration clips |
+| Optional | OpenCV | Video frame sampling, edge detection |

-## Pipeline Architecture (v2)
+## Pipeline Architecture

-Every mode follows the same 6-stage pipeline. See `references/architecture.md` for implementation details, `references/scenes.md` for scene protocol, and `references/composition.md` for multi-grid composition and tonemap.
+Every mode follows the same 6-stage pipeline:

 ```
-┌─────────┐   ┌──────────┐   ┌───────────┐   ┌──────────┐   ┌─────────┐   ┌────────┐
-│ 1.INPUT  │→│ 2.ANALYZE │→│ 3.SCENE_FN │→│ 4.TONEMAP │→│ 5.SHADE  │→│ 6.ENCODE│
-│ load src │  │ features  │  │ → canvas   │  │ normalize │  │ post-fx  │  │ → video │
-└─────────┘   └──────────┘   └───────────┘   └──────────┘   └─────────┘   └────────┘
+INPUT → ANALYZE → SCENE_FN → TONEMAP → SHADE → ENCODE
 ```

 1. **INPUT** — Load/decode source material (video frames, audio samples, images, or nothing)
 2. **ANALYZE** — Extract per-frame features (audio bands, video luminance/edges, motion vectors)
-3. **SCENE_FN** — Scene function renders directly to pixel canvas (`uint8 H,W,3`). May internally compose multiple character grids via `_render_vf()` + pixel blend modes. See `references/composition.md`
-4. **TONEMAP** — Percentile-based adaptive brightness normalization with per-scene gamma. Replaces linear brightness multipliers. See `references/composition.md` § Adaptive Tonemap
-5. **SHADE** — Apply post-processing `ShaderChain` + `FeedbackBuffer`. See `references/shaders.md`
+3. **SCENE_FN** — Scene function renders to pixel canvas (`uint8 H,W,3`). Composes multiple character grids via `_render_vf()` + pixel blend modes. See `references/composition.md`
+4. **TONEMAP** — Percentile-based adaptive brightness normalization. See `references/composition.md` § Adaptive Tonemap
+5. **SHADE** — Post-processing via `ShaderChain` + `FeedbackBuffer`. See `references/shaders.md`
 6. **ENCODE** — Pipe raw RGB frames to ffmpeg for H.264/GIF encoding

 ## Creative Direction

-**Every project should look and feel different.** The references provide a vocabulary of building blocks — don't copy them verbatim. Combine, modify, and invent.
-
-### Aesthetic Dimensions to Vary
+### Aesthetic Dimensions

 | Dimension | Options | Reference |
 |-----------|---------|-----------|
-| **Character palette** | Density ramps, block elements, symbols, scripts (katakana, Greek, runes, braille), dots, project-specific | `architecture.md` § Character Palettes |
-| **Color strategy** | HSV (angle/distance/time/value mapped), OKLAB/OKLCH (perceptually uniform), discrete RGB palettes, auto-generated harmony (complementary/triadic/analogous/tetradic), monochrome, temperature | `architecture.md` § Color System |
-| **Color tint** | Warm, cool, amber, matrix green, neon pink, sepia, ice, blood, void, sunset | `shaders.md` § Color Grade |
-| **Background texture** | Sine fields, fBM noise, domain warp, voronoi cells, reaction-diffusion, cellular automata, video source | `effects.md` § Background Fills, Noise-Based Fields, Simulation-Based Fields |
-| **Primary effects** | Rings, spirals, tunnel, vortex, waves, interference, aurora, ripple, fire, strange attractors, SDFs (geometric shapes with smooth booleans) | `effects.md` § Radial / Wave / Fire / SDF-Based Fields |
-| **Particles** | Energy sparks, snow, rain, bubbles, runes, binary data, orbits, gravity wells, flocking boids, flow-field followers, trail-drawing particles | `effects.md` § Particle Systems |
-| **Shader mood** | Retro CRT, clean modern, glitch art, cinematic, dreamy, harsh industrial, psychedelic | `shaders.md` § Design Philosophy |
+| **Character palette** | Density ramps, block elements, symbols, scripts (katakana, Greek, runes, braille), project-specific | `architecture.md` § Palettes |
+| **Color strategy** | HSV, OKLAB/OKLCH, discrete RGB palettes, auto-generated harmony, monochrome, temperature | `architecture.md` § Color System |
+| **Background texture** | Sine fields, fBM noise, domain warp, voronoi, reaction-diffusion, cellular automata, video | `effects.md` |
+| **Primary effects** | Rings, spirals, tunnel, vortex, waves, interference, aurora, fire, SDFs, strange attractors | `effects.md` |
+| **Particles** | Sparks, snow, rain, bubbles, runes, orbits, flocking boids, flow-field followers, trails | `effects.md` § Particles |
+| **Shader mood** | Retro CRT, clean modern, glitch art, cinematic, dreamy, industrial, psychedelic | `shaders.md` |
 | **Grid density** | xs(8px) through xxl(40px), mixed per layer | `architecture.md` § Grid System |
-| **Font** | Menlo, Monaco, Courier, SF Mono, JetBrains Mono, Fira Code, IBM Plex | `architecture.md` § Font Selection |
-| **Coordinate space** | Cartesian, polar, tiled, rotated, skewed, fisheye, twisted, Möbius, domain-warped | `effects.md` § Coordinate Transforms |
-| **Mirror mode** | None, horizontal, vertical, quad, diagonal, kaleidoscope | `shaders.md` § Mirror Effects |
-| **Masking** | Circle, rect, ring, gradient, text stencil, value-field-as-mask, animated iris/wipe/dissolve | `composition.md` § Masking |
-| **Temporal motion** | Static, audio-reactive, eased keyframes, morphing between fields, temporal noise (smooth in-place evolution) | `effects.md` § Temporal Coherence |
-| **Transition style** | Crossfade, wipe (directional/radial), dissolve, glitch cut, iris open/close, mask-based reveal | `shaders.md` § Transitions, `composition.md` § Animated Masks |
-| **Aspect ratio** | Landscape (16:9), portrait (9:16), square (1:1), ultrawide (21:9) | `architecture.md` § Resolution Presets |
+| **Coordinate space** | Cartesian, polar, tiled, rotated, fisheye, Möbius, domain-warped | `effects.md` § Transforms |
+| **Feedback** | Zoom tunnel, rainbow trails, ghostly echo, rotating mandala, color evolution | `composition.md` § Feedback |
+| **Masking** | Circle, ring, gradient, text stencil, animated iris/wipe/dissolve | `composition.md` § Masking |
+| **Transitions** | Crossfade, wipe, dissolve, glitch cut, iris, mask-based reveal | `shaders.md` § Transitions |

 ### Per-Section Variation

-Never use the same config for the entire video. For each section/scene/quote:
- Choose a **different background effect** (or compose 2-3)
- Choose a **different character palette** (match the mood)
- Choose a **different color strategy** (or at minimum a different hue)
- Vary **shader intensity** (more bloom during peaks, more grain during quiet)
- Use **different particle types** if particles are active
+Never use the same config for the entire video. For each section/scene:
+- **Different background effect** (or compose 2-3)
+- **Different character palette** (match the mood)
+- **Different color strategy** (or at minimum a different hue)
+- **Vary shader intensity** (more bloom during peaks, more grain during quiet)
+- **Different particle types** if particles are active

 ### Project-Specific Invention

 For every project, invent at least one of:
 - A custom character palette matching the theme
- A custom background effect (combine/modify existing ones)
+- A custom background effect (combine/modify existing building blocks)
 - A custom color palette (discrete RGB set matching the brand/mood)
 - A custom particle character set
+- A novel scene transition or visual moment
+
+Don't just pick from the catalog. The catalog is vocabulary — you write the poem.

 ## Workflow

-### Step 1: Determine Mode and Gather Requirements
+### Step 1: Creative Vision
+
+Before any code, articulate the creative concept:
+
+- **Mood/atmosphere**: What should the viewer feel? Energetic, meditative, chaotic, elegant, ominous?
+- **Visual story**: What happens over the duration? Build tension? Transform? Dissolve?
+- **Color world**: Warm/cool? Monochrome? Neon? Earth tones? What's the dominant hue?
+- **Character texture**: Dense data? Sparse stars? Organic dots? Geometric blocks?
+- **What makes THIS different**: What's the one thing that makes this project unique?
+- **Emotional arc**: How do scenes progress? Open with energy, build to climax, resolve?
+
+Map the user's prompt to aesthetic choices. A "chill lo-fi visualizer" demands different everything from a "glitch cyberpunk data stream."
+
+### Step 2: Technical Design

-Establish with user:
- **Input source** — file path, format, duration
 - **Mode** — which of the 6 modes above
- **Sections** — time-mapped style changes (timestamps → effect names)
- **Resolution** — landscape 1920x1080 (default), portrait 1080x1920, square 1080x1080 @ 24fps; GIFs typically 640x360 @ 15fps
- **Style direction** — dense/sparse, bright/dark, chaotic/minimal, color palette
- **Text/branding** — easter eggs, overlays, credits, themed character sets
- **Output format** — MP4 (default), GIF, PNG sequence
- **Aspect ratio** — landscape (16:9), portrait (9:16 for TikTok/Reels/Stories), square (1:1 for IG feed)
-
-### Step 2: Detect Hardware and Set Quality
-
-Before building the script, detect the user's hardware and set appropriate defaults. See `references/optimization.md` § Hardware Detection.
-
-```python
-hw = detect_hardware()
-profile = quality_profile(hw, target_duration, user_quality_pref)
-log(f"Hardware: {hw['cpu_count']} cores, {hw['mem_gb']:.1f}GB RAM")
-log(f"Render: {profile['vw']}x{profile['vh']} @{profile['fps']}fps, {profile['workers']} workers")
-```
-
-Never hardcode worker counts, resolution, or CRF. Always detect and adapt.
+- **Resolution** — landscape 1920x1080 (default), portrait 1080x1920, square 1080x1080 @ 24fps
+- **Hardware detection** — auto-detect cores/RAM, set quality profile. See `references/optimization.md`
+- **Sections** — map timestamps to scene functions, each with its own effect/palette/color/shader config
+- **Output format** — MP4 (default), GIF (640x360 @ 15fps), PNG sequence

 ### Step 3: Build the Script

-Write as a single Python file. Major components:
+Single Python file. Components (with references):

-1. **Hardware detection + quality profile** — see `references/optimization.md`
-2. **Input loader** — mode-dependent; see `references/inputs.md`
-3. **Feature analyzer** — audio FFT, video luminance, or pass-through
-4. **Grid + renderer** — multi-density character grids with bitmap cache; `_render_vf()` helper for value/hue field → canvas
-5. **Character palettes** — multiple palettes chosen per project theme; see `references/architecture.md`
-6. **Color system** — HSV + discrete RGB palettes as needed; see `references/architecture.md`
-7. **Scene functions** — each returns `canvas (uint8 H,W,3)` directly. May compose multiple grids internally via pixel blend modes. See `references/scenes.md` + `references/composition.md`
-8. **Tonemap** — adaptive brightness normalization with per-scene gamma; see `references/composition.md`
-9. **Shader pipeline** — `ShaderChain` + `FeedbackBuffer` per-section config; see `references/shaders.md`
-10. **Scene table + dispatcher** — maps time ranges to scene functions + shader/feedback configs; see `references/scenes.md`
-11. **Parallel encoder** — N-worker batch clip rendering with ffmpeg pipes
+1. **Hardware detection + quality profile** — `references/optimization.md`
+2. **Input loader** — mode-dependent; `references/inputs.md`
+3. **Feature analyzer** — audio FFT, video luminance, or synthetic
+4. **Grid + renderer** — multi-density grids with bitmap cache; `references/architecture.md`
+5. **Character palettes** — multiple per project; `references/architecture.md` § Palettes
+6. **Color system** — HSV + discrete RGB + harmony generation; `references/architecture.md` § Color
+7. **Scene functions** — each returns `canvas (uint8 H,W,3)`; `references/scenes.md`
+8. **Tonemap** — adaptive brightness normalization; `references/composition.md`
+9. **Shader pipeline** — `ShaderChain` + `FeedbackBuffer`; `references/shaders.md`
+10. **Scene table + dispatcher** — time → scene function + config; `references/scenes.md`
+11. **Parallel encoder** — N-worker clip rendering with ffmpeg pipes
 12. **Main** — orchestrate full pipeline

-### Step 4: Handle Critical Bugs
+### Step 4: Quality Verification

-#### Font Cell Height (macOS Pillow)
+- **Test frames first**: render single frames at key timestamps before full render
+- **Brightness check**: `canvas.mean() > 8` for all ASCII content. If dark, lower gamma
+- **Visual coherence**: do all scenes feel like they belong to the same video?
+- **Creative vision check**: does the output match the concept from Step 1? If it looks generic, go back

-`textbbox()` returns wrong height. Use `font.getmetrics()`:
+## Critical Implementation Notes

-```python
-ascent, descent = font.getmetrics()
-cell_height = ascent + descent  # correct
-```
+### Brightness — Use `tonemap()`, Not Linear Multipliers

-#### ffmpeg Pipe Deadlock
-
-Never use `stderr=subprocess.PIPE` with long-running ffmpeg. Redirect to file:
-
-```python
-stderr_fh = open(err_path, "w")
-pipe = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.DEVNULL, stderr=stderr_fh)
-```
-
-#### Brightness — Use `tonemap()`, Not Linear Multipliers
-
-ASCII on black is inherently dark. This is the #1 visual issue. **Do NOT use linear `* N` brightness multipliers** — they clip highlights and wash out the image. Instead, use the **adaptive tonemap** function from `references/composition.md`:
+This is the #1 visual issue. ASCII on black is inherently dark. **Never use `canvas * N` multipliers** — they clip highlights. Use adaptive tonemap:

 ```python
 def tonemap(canvas, gamma=0.75):
-    """Percentile-based adaptive normalization + gamma. Replaces all brightness multipliers."""
    f = canvas.astype(np.float32)
-    lo = np.percentile(f, 1)          # black point (1st percentile)
-    hi = np.percentile(f, 99.5)       # white point (99.5th percentile)
-    if hi - lo < 1: hi = lo + 1
-    f = (f - lo) / (hi - lo)
-    f = np.clip(f, 0, 1) ** gamma     # gamma < 1 = brighter mids
+    lo, hi = np.percentile(f[::4, ::4], [1, 99.5])
+    if hi - lo < 10: hi = lo + 10
+    f = np.clip((f - lo) / (hi - lo), 0, 1) ** gamma
    return (f * 255).astype(np.uint8)
 ```

-Pipeline ordering: `scene_fn() → tonemap() → FeedbackBuffer → ShaderChain → ffmpeg`
+Pipeline: `scene_fn() → tonemap() → FeedbackBuffer → ShaderChain → ffmpeg`

-Per-scene gamma overrides for destructive effects:
- Default: `gamma=0.75`
- Solarize scenes: `gamma=0.55` (solarize darkens above-threshold pixels)
- Posterize scenes: `gamma=0.50` (quantization loses brightness range)
- Already-bright scenes: `gamma=0.85`
+Per-scene gamma: default 0.75, solarize 0.55, posterize 0.50, bright scenes 0.85. Use `screen` blend (not `overlay`) for dark layers.

-Additional brightness best practices:
- Dense animated backgrounds — never flat black, always fill the grid
- Vignette minimum clamped to 0.15 (not 0.12)
- Bloom threshold lowered to 130 (not 170) so more pixels contribute to glow
- Use `screen` blend mode (not `overlay`) when compositing dark ASCII layers — overlay squares dark values: `2 * 0.12 * 0.12 = 0.03`
+### Font Cell Height

-#### Font Compatibility
+macOS Pillow: `textbbox()` returns wrong height. Use `font.getmetrics()`: `cell_height = ascent + descent`. See `references/troubleshooting.md`.

-Not all Unicode characters render in all fonts. Validate palettes at init:
-```python
-for c in palette:
-    img = Image.new("L", (20, 20), 0)
-    ImageDraw.Draw(img).text((0, 0), c, fill=255, font=font)
-    if np.array(img).max() == 0:
-        log(f"WARNING: char '{c}' (U+{ord(c):04X}) not in font, removing from palette")
-```
+### ffmpeg Pipe Deadlock

-### Step 4b: Per-Clip Architecture (for segmented videos)
+Never `stderr=subprocess.PIPE` with long-running ffmpeg — buffer fills at 64KB and deadlocks. Redirect to file. See `references/troubleshooting.md`.

-When the video has discrete segments (quotes, scenes, chapters), render each as a separate clip file. This enables:
- Re-rendering individual clips without touching the rest (`--clip q05`)
- Faster iteration on specific sections
- Easy reordering or trimming in post
+### Font Compatibility

-```python
-segments = [
-    {"id": "intro", "start": 0.0, "end": 5.0, "type": "intro"},
-    {"id": "q00", "start": 5.0, "end": 12.0, "type": "quote", "qi": 0, ...},
-    {"id": "t00", "start": 12.0, "end": 13.5, "type": "transition", ...},
-    {"id": "outro", "start": 208.0, "end": 211.6, "type": "outro"},
-]
+Not all Unicode chars render in all fonts. Validate palettes at init — render each char, check for blank output. See `references/troubleshooting.md`.

-from concurrent.futures import ProcessPoolExecutor, as_completed
-with ProcessPoolExecutor(max_workers=hw["workers"]) as pool:
-    futures = {pool.submit(render_clip, seg, features, path): seg["id"]
-               for seg, path in clip_args}
-    for fut in as_completed(futures):
-        fut.result()
-```
+### Per-Clip Architecture

-CLI: `--clip q00 t00 q01` to re-render specific clips, `--list` to show segments, `--skip-render` to re-stitch only.
+For segmented videos (quotes, scenes, chapters), render each as a separate clip file for parallel rendering and selective re-rendering. See `references/scenes.md`.

-### Step 5: Render and Iterate
-
-Performance targets per frame:
+## Performance Targets

 | Component | Budget |
 |-----------|--------|
@@ -233,24 +191,15 @@ Performance targets per frame:
 | Shader pipeline | 5-25ms |
 | **Total** | ~100-200ms/frame |

-**Fast iteration**: render single test frames to check brightness/layout before full render:
-```python
-canvas = render_single_frame(frame_index, features, renderer)
-Image.fromarray(canvas).save("test.png")
-```
-
-**Brightness verification**: sample 5-10 frames across video, check `mean > 8` for ASCII content.
-
 ## References

 | File | Contents |
 |------|----------|
-| `references/architecture.md` | Grid system (landscape/portrait/square resolution presets), font selection, character palettes (library of 20+), color system (HSV + OKLAB/OKLCH + discrete RGB + color harmony generation + perceptual gradient interpolation), `_render_vf()` helper, compositing, v2 effect function contract |
-| `references/inputs.md` | All input sources: audio analysis, video sampling, image conversion, text/lyrics, TTS integration (ElevenLabs, voice assignment, audio mixing) |
-| `references/effects.md` | Effect building blocks: 20+ value field generators (trig, noise/fBM, domain warp, voronoi, reaction-diffusion, cellular automata, strange attractors, SDFs), 8 hue field generators, coordinate transforms (rotate/tile/polar/Möbius), temporal coherence (easing, keyframes, morphing), radial/wave/fire effects, advanced particles (flocking, flow fields, trails), composing guide |
-| `references/shaders.md` | 38 shader implementations (geometry, channel, color, glow, noise, pattern, tone, glitch, mirror), `ShaderChain` class, full `_apply_shader_step()` dispatch, audio-reactive scaling, transitions, tint presets |
-| `references/composition.md` | **v2 core**: pixel blend modes (20 modes with implementations), multi-grid composition, `_render_vf()` helper, adaptive `tonemap()`, per-scene gamma, `FeedbackBuffer` with spatial transforms, `PixelBlendStack`, masking/stencil system (shape masks, text stencils, animated masks, boolean ops) |
-| `references/scenes.md` | **v2 scene protocol**: scene function contract (local time convention), `Renderer` class, `SCENES` table structure, `render_clip()` loop, beat-synced cutting, parallel rendering + pickling constraints, 4 complete scene examples, scene design checklist |
-| `references/design-patterns.md` | **Scene composition patterns**: layer hierarchy (bg/content/accent), directional parameter arcs vs oscillation, scene concepts and visual metaphors, counter-rotating dual systems, wave collision, progressive fragmentation, entropy/consumption, staggered layer entry (crescendo), scene ordering |
-| `references/troubleshooting.md` | NumPy broadcasting traps, blend mode pitfalls, multiprocessing/pickling issues, brightness diagnostics, ffmpeg deadlocks, font issues, performance bottlenecks, common mistakes |
-| `references/optimization.md` | Hardware detection, adaptive quality profiles (draft/preview/production/max), CLI integration, vectorized effect patterns, parallel rendering, memory management |
+| `references/architecture.md` | Grid system, resolution presets, font selection, character palettes (20+), color system (HSV + OKLAB + discrete RGB + harmony generation), `_render_vf()` helper, GridLayer class |
+| `references/composition.md` | Pixel blend modes (20 modes), `blend_canvas()`, multi-grid composition, adaptive `tonemap()`, `FeedbackBuffer`, `PixelBlendStack`, masking/stencil system |
+| `references/effects.md` | Effect building blocks: value field generators, hue fields, noise/fBM/domain warp, voronoi, reaction-diffusion, cellular automata, SDFs, strange attractors, particle systems, coordinate transforms, temporal coherence |
+| `references/shaders.md` | `ShaderChain`, `_apply_shader_step()` dispatch, 38 shader catalog, audio-reactive scaling, transitions, tint presets, output format encoding, terminal rendering |
+| `references/scenes.md` | Scene protocol, `Renderer` class, `SCENES` table, `render_clip()`, beat-synced cutting, parallel rendering, design patterns (layer hierarchy, directional arcs, visual metaphors, compositional techniques), complete scene examples at every complexity level, scene design checklist |
+| `references/inputs.md` | Audio analysis (FFT, bands, beats), video sampling, image conversion, text/lyrics, TTS integration (ElevenLabs, voice assignment, audio mixing) |
+| `references/optimization.md` | Hardware detection, quality profiles, vectorized patterns, parallel rendering, memory management, performance budgets |
+| `references/troubleshooting.md` | NumPy broadcasting traps, blend mode pitfalls, multiprocessing/pickling, brightness diagnostics, ffmpeg issues, font problems, common mistakes |
@@ -1,14 +1,6 @@
 # Architecture Reference

-**Cross-references:**
- Effect building blocks (value fields, noise, SDFs, particles): `effects.md`
- `_render_vf()`, blend modes, tonemap, masking: `composition.md`
- Scene protocol, render_clip, SCENES table: `scenes.md`
- Shader pipeline, feedback buffer, output encoding: `shaders.md`
- Complete scene examples: `examples.md`
- Input sources (audio analysis, video, TTS): `inputs.md`
- Performance tuning, hardware detection: `optimization.md`
- Common bugs (broadcasting, font, encoding): `troubleshooting.md`
+> **See also:** composition.md · effects.md · scenes.md · shaders.md · inputs.md · optimization.md · troubleshooting.md

 ## Grid System

@@ -2,13 +2,7 @@

 The composable system is the core of visual complexity. It operates at three levels: pixel-level blend modes, multi-grid composition, and adaptive brightness management. This document covers all three, plus the masking/stencil system for spatial control.

-**Cross-references:**
- Grid system, palettes, color (HSV + OKLAB): `architecture.md`
- Effect building blocks (value fields, hue fields, particles): `effects.md`
- Scene protocol, render_clip, SCENES table: `scenes.md`
- Shader pipeline, feedback buffer: `shaders.md`
- Complete scene examples with blend/mask usage: `examples.md`
- Blend mode pitfalls (overlay crush, division by zero): `troubleshooting.md`
+> **See also:** architecture.md · effects.md · scenes.md · shaders.md · troubleshooting.md

 ## Pixel-Level Blend Modes

@@ -1,193 +0,0 @@
-# Scene Design Patterns
-
-**Cross-references:**
- Scene protocol, SCENES table: `scenes.md`
- Blend modes, multi-grid composition, tonemap: `composition.md`
- Effect building blocks (value fields, noise, SDFs): `effects.md`
- Shader pipeline, feedback buffer: `shaders.md`
- Complete scene examples: `examples.md`
-
-Higher-order patterns for composing scenes that feel intentional rather than random. These patterns use the existing building blocks (value fields, blend modes, shaders, feedback) but organize them with compositional intent.
-
-## Layer Hierarchy
-
-Every scene should have clear visual layers with distinct roles:
-
-| Layer | Grid | Brightness | Purpose |
-|-------|------|-----------|---------|
-| **Background** | xs or sm (dense) | 0.1–0.25 | Atmosphere, texture. Never competes with content. |
-| **Content** | md (balanced) | 0.4–0.8 | The main visual idea. Carries the scene's concept. |
-| **Accent** | lg or sm (sparse) | 0.5–1.0 (sparse coverage) | Highlights, punctuation, sparse bright points. |
-
-The background sets mood. The content layer is what the scene *is about*. The accent adds visual interest without overwhelming.
-
-```python
-def fx_example(r, f, t, S):
-    local = t
-    progress = min(local / 5.0, 1.0)
-
-    g_bg = r.get_grid("sm")
-    g_main = r.get_grid("md")
-    g_accent = r.get_grid("lg")
-
-    # --- Background: dim atmosphere ---
-    bg_val = vf_smooth_noise(g_bg, f, t * 0.3, S, octaves=2, bri=0.15)
-    # ... render bg to canvas
-
-    # --- Content: the main visual idea ---
-    content_val = vf_spiral(g_main, f, t, S, n_arms=n_arms, tightness=tightness)
-    # ... render content on top of canvas
-
-    # --- Accent: sparse highlights ---
-    accent_val = vf_noise_static(g_accent, f, t, S, density=0.05)
-    # ... render accent on top
-
-    return canvas
-```
-
-## Directional Parameter Arcs
-
-Parameters should *go somewhere* over the scene's duration — not oscillate aimlessly with `sin(t * N)`.
-
-**Bad:** `twist = 3.0 + 2.0 * math.sin(t * 0.6)` — wobbles back and forth, feels aimless.
-
-**Good:** `twist = 2.0 + progress * 5.0` — starts gentle, ends intense. The scene *builds*.
-
-Use `progress = min(local / duration, 1.0)` (0→1 over the scene) to drive directional change:
-
-| Pattern | Formula | Feel |
-|---------|---------|------|
-| Linear ramp | `progress * range` | Steady buildup |
-| Ease-out | `1 - (1 - progress) ** 2` | Fast start, gentle finish |
-| Ease-in | `progress ** 2` | Slow start, accelerating |
-| Step reveal | `np.clip((progress - 0.5) / 0.25, 0, 1)` | Nothing until 50%, then fades in |
-| Build + plateau | `min(1.0, progress * 1.5)` | Reaches full at 67%, holds |
-
-Oscillation is fine for *secondary* parameters (saturation shimmer, hue drift). But the *defining* parameter of the scene should have a direction.
-
-### Examples of Directional Arcs
-
-| Scene concept | Parameter | Arc |
-|--------------|-----------|-----|
-| Emergence | Ring radius | 0 → max (ease-out) |
-| Shatter | Voronoi cell count | 8 → 38 (linear) |
-| Descent | Tunnel speed | 2.0 → 10.0 (linear) |
-| Mandala | Shape complexity | ring → +polygon → +star → +rosette (step reveals) |
-| Crescendo | Layer count | 1 → 7 (staggered entry) |
-| Entropy | Geometry visibility | 1.0 → 0.0 (consumed) |
-
-## Scene Concepts
-
-Each scene should be built around a *visual idea*, not an effect name.
-
-**Bad:** "fx_plasma_cascade" — named after the effect. No concept.
-**Good:** "fx_emergence" — a point of light expands into a field. The name tells you *what happens*.
-
-Good scene concepts have:
-1. A **visual metaphor** (emergence, descent, collision, entropy)
-2. A **directional arc** (things change from A to B, not oscillate)
-3. **Motivated layer choices** (each layer serves the concept)
-4. **Motivated feedback** (transform direction matches the metaphor)
-
-| Concept | Metaphor | Feedback transform | Why |
-|---------|----------|-------------------|-----|
-| Emergence | Birth, expansion | zoom-out | Past frames expand outward |
-| Descent | Falling, acceleration | zoom-in | Past frames rush toward center |
-| Inferno | Rising fire | shift-up | Past frames rise with the flames |
-| Entropy | Decay, dissolution | none | Clean, no persistence — things disappear |
-| Crescendo | Accumulation | zoom + hue_shift | Everything compounds and shifts |
-
-## Compositional Techniques
-
-### Counter-Rotating Dual Systems
-
-Two instances of the same effect rotating in opposite directions create visual interference:
-
-```python
-# Primary spiral (clockwise)
-s1_val = vf_spiral(g_main, f, t * 1.5, S, n_arms=n_arms_1, tightness=tightness_1)
-
-# Counter-rotating spiral (counter-clockwise via negative time)
-s2_val = vf_spiral(g_accent, f, -t * 1.2, S, n_arms=n_arms_2, tightness=tightness_2)
-
-# Screen blend creates bright interference at crossing points
-canvas = blend_canvas(canvas_with_s1, c2, "screen", 0.7)
-```
-
-Works with spirals, vortexes, rings. The counter-rotation creates constantly shifting interference patterns.
-
-### Wave Collision
-
-Two wave fronts converging from opposite sides, meeting at a collision point:
-
-```python
-collision_phase = abs(progress - 0.5) * 2  # 1→0→1 (0 at collision)
-
-# Wave A approaches from left
-offset_a = (1 - progress) * g.cols * 0.4
-wave_a = np.sin((g.cc + offset_a) * 0.08 + t * 2) * 0.5 + 0.5
-
-# Wave B approaches from right
-offset_b = -(1 - progress) * g.cols * 0.4
-wave_b = np.sin((g.cc + offset_b) * 0.08 - t * 2) * 0.5 + 0.5
-
-# Interference peaks at collision
-combined = wave_a * 0.5 + wave_b * 0.5 + np.abs(wave_a - wave_b) * (1 - collision_phase) * 0.5
-```
-
-### Progressive Fragmentation
-
-Voronoi with cell count increasing over time — visual shattering:
-
-```python
-n_pts = int(8 + progress * 30)  # 8 cells → 38 cells
-# Pre-generate enough points, slice to n_pts
-px = base_x[:n_pts] + np.sin(t * 0.3 + np.arange(n_pts) * 0.7) * (3 + progress * 3)
-```
-
-The edge glow width can also increase with progress to emphasize the cracks.
-
-### Entropy / Consumption
-
-A clean geometric pattern being overtaken by an organic process:
-
-```python
-# Geometry fades out
-geo_val = clean_pattern * max(0.05, 1.0 - progress * 0.9)
-
-# Organic process grows in
-rd_val = vf_reaction_diffusion(g, f, t, S) * min(1.0, progress * 1.5)
-
-# Render geometry first, organic on top — organic consumes geometry
-```
-
-### Staggered Layer Entry (Crescendo)
-
-Layers enter one at a time, building to overwhelming density:
-
-```python
-def layer_strength(enter_t, ramp=1.5):
-    """0.0 until enter_t, ramps to 1.0 over ramp seconds."""
-    return max(0.0, min(1.0, (local - enter_t) / ramp))
-
-# Layer 1: always present
-s1 = layer_strength(0.0)
-# Layer 2: enters at 2s
-s2 = layer_strength(2.0)
-# Layer 3: enters at 4s
-s3 = layer_strength(4.0)
-# ... etc
-
-# Each layer uses a different effect, grid, palette, and blend mode
-# Screen blend between layers so they accumulate light
-```
-
-For a 15-second crescendo, 7 layers entering every 2 seconds works well. Use different blend modes (screen for most, add for energy, colordodge for the final wash).
-
-## Scene Ordering
-
-For a multi-scene reel or video:
- **Vary mood between adjacent scenes** — don't put two calm scenes next to each other
- **Randomize order** rather than grouping by type — prevents "effect demo" feel
- **End on the strongest scene** — crescendo or something with a clear payoff
- **Open with energy** — grab attention in the first 2 seconds
@@ -2,13 +2,7 @@

 Effect building blocks that produce visual patterns. In v2, these are used **inside scene functions** that return a pixel canvas directly. The building blocks below operate on grid coordinate arrays and produce `(chars, colors)` or value/hue fields that the scene function renders to canvas via `_render_vf()`.

-**Cross-references:**
- Grid system, palettes, color: `architecture.md`
- `_render_vf()`, blend modes, tonemap, masking: `composition.md`
- Scene protocol, render_clip, SCENES table: `scenes.md`
- Shader pipeline, feedback buffer: `shaders.md`
- Complete scene examples using these effects: `examples.md`
- Common bugs (broadcasting, clipping): `troubleshooting.md`
+> **See also:** architecture.md · composition.md · scenes.md · shaders.md · troubleshooting.md

 ## Design Philosophy

@@ -109,142 +103,7 @@ def bg_cellular(g, f, t, n_centers=12, hue=0.5, bri=0.6, pal=PAL_BLOCKS):

 ---

-## Radial Effects
-
-### Concentric Rings
-Bass/sub-driven pulsing rings from center. Scale ring count and thickness with bass energy.
-```python
-def eff_rings(g, f, t, hue=0.5, n_base=6, pal=PAL_DEFAULT):
-    n_rings = int(n_base + f["sub_r"] * 25 + f["bass"] * 10)
-    spacing = 2 + f["bass_r"] * 7 + f["rms"] * 3
-    ring_cv = np.zeros((g.rows, g.cols), dtype=np.float32)
-    for ri in range(n_rings):
-        rad = (ri+1) * spacing + f["bdecay"] * 15
-        wobble = f["mid_r"]*5*np.sin(g.angle*3 + t*4) + f["hi_r"]*3*np.sin(g.angle*7 - t*6)
-        rd = np.abs(g.dist - rad - wobble)
-        th = 1 + f["sub"] * 3
-        ring_cv = np.maximum(ring_cv, np.clip((1 - rd/th) * (0.4 + f["bass"]*0.8), 0, 1))
-    # Color by angle + distance for rainbow rings
-    h = g.angle/(2*np.pi) + g.dist*0.005 + f["sub_r"]*0.2
-    return ring_cv, h
-```
-
-### Radial Rays
-```python
-def eff_rays(g, f, t, n_base=8, hue=0.5):
-    n_rays = int(n_base + f["hi_r"] * 25)
-    ray = np.clip(np.cos(g.angle*n_rays + t*3) * f["bdecay"]*0.6 * (1-g.dist_n), 0, 0.7)
-    return ray
-```
-
-### Spiral Arms (Logarithmic)
-```python
-def eff_spiral(g, f, t, n_arms=3, tightness=2.5, hue=0.5):
-    arm_cv = np.zeros((g.rows, g.cols), dtype=np.float32)
-    for ai in range(n_arms):
-        offset = ai * 2*np.pi / n_arms
-        log_r = np.log(g.dist + 1) * tightness
-        arm_phase = g.angle + offset - log_r + t * 0.8
-        arm_val = np.clip(np.cos(arm_phase * n_arms) * 0.6 + 0.2, 0, 1)
-        arm_val *= (0.4 + f["rms"]*0.6) * np.clip(1 - g.dist_n*0.5, 0.2, 1)
-        arm_cv = np.maximum(arm_cv, arm_val)
-    return arm_cv
-```
-
-### Center Glow / Pulse
-```python
-def eff_glow(g, f, t, intensity=0.6, spread=2.0):
-    return np.clip(intensity * np.exp(-g.dist_n * spread) * (0.5 + f["rms"]*2 + np.sin(t*1.2)*0.2), 0, 0.9)
-```
-
-### Tunnel / Depth
-```python
-def eff_tunnel(g, f, t, speed=3.0, complexity=6):
-    tunnel_d = 1.0 / (g.dist_n + 0.1)
-    v1 = np.sin(tunnel_d*2 - t*speed) * 0.45 + 0.55
-    v2 = np.sin(g.angle*complexity + tunnel_d*1.5 - t*2) * 0.35 + 0.55
-    return v1 * 0.5 + v2 * 0.5
-```
-
-### Vortex (Rotating Distortion)
-```python
-def eff_vortex(g, f, t, twist=3.0, pulse=True):
-    """Twisting radial pattern -- distance modulates angle."""
-    twisted = g.angle + g.dist_n * twist * np.sin(t * 0.5)
-    val = np.sin(twisted * 4 - t * 2) * 0.5 + 0.5
-    if pulse:
-        val *= 0.5 + f.get("bass", 0.3) * 0.8
-    return np.clip(val, 0, 1)
-```
-
---
-
-## Wave Effects
-
-### Multi-Band Frequency Waves
-Each frequency band draws its own wave at different spatial/temporal frequencies:
-```python
-def eff_freq_waves(g, f, t, bands=None):
-    if bands is None:
-        bands = [("sub",0.06,1.2,0.0), ("bass",0.10,2.0,0.08), ("lomid",0.15,3.0,0.16),
-                 ("mid",0.22,4.5,0.25), ("himid",0.32,6.5,0.4), ("hi",0.45,8.5,0.55)]
-    mid = g.rows / 2.0
-    composite = np.zeros((g.rows, g.cols), dtype=np.float32)
-    for band_key, sf, tf, hue_base in bands:
-        amp = f.get(band_key, 0.3) * g.rows * 0.4
-        y_wave = mid - np.sin(g.cc*sf + t*tf) * amp
-        y_wave += np.sin(g.cc*sf*2.3 + t*tf*1.7) * amp * 0.2  # harmonic
-        dist = np.abs(g.rr - y_wave)
-        thickness = 2 + f.get(band_key, 0.3) * 5
-        intensity = np.clip((1 - dist/thickness) * f.get(band_key, 0.3) * 1.5, 0, 1)
-        composite = np.maximum(composite, intensity)
-    return composite
-```
-
-### Interference Pattern
-6-8 overlapping sine waves creating moire-like patterns:
-```python
-def eff_interference(g, f, t, n_waves=5):
-    """Parametric interference -- vary n_waves for complexity."""
-    # Each wave has different orientation, frequency, and feature driver
-    drivers = ["mid_r", "himid_r", "bass_r", "lomid_r", "hi_r"]
-    vals = np.zeros((g.rows, g.cols), dtype=np.float32)
-    for i in range(min(n_waves, len(drivers))):
-        angle = i * np.pi / n_waves  # spread orientations
-        freq = 0.06 + i * 0.03
-        sp = 0.5 + i * 0.3
-        proj = g.cc * np.cos(angle) + g.rr * np.sin(angle)
-        vals += np.sin(proj * freq + t * sp) * f.get(drivers[i], 0.3) * 2.5
-    return np.clip(vals * 0.12 + 0.45, 0.1, 1)
-```
-
-### Aurora / Horizontal Bands
-```python
-def eff_aurora(g, f, t, hue=0.4, n_bands=3):
-    val = np.zeros((g.rows, g.cols), dtype=np.float32)
-    for i in range(n_bands):
-        freq_r = 0.08 + i * 0.04
-        freq_c = 0.012 + i * 0.008
-        sp_r = 0.7 + i * 0.3
-        sp_c = 0.18 + i * 0.12
-        val += np.sin(g.rr*freq_r + t*sp_r) * np.sin(g.cc*freq_c + t*sp_c) * (0.6 / n_bands)
-    return np.clip(val * (f.get("lomid_r", 0.3)*3 + 0.2), 0, 0.7)
-```
-
-### Ripple (Point-Source Waves)
-```python
-def eff_ripple(g, f, t, sources=None, freq=0.3, damping=0.02):
-    """Concentric ripples from point sources. Sources = [(row_frac, col_frac), ...]"""
-    if sources is None:
-        sources = [(0.5, 0.5)]  # center
-    val = np.zeros((g.rows, g.cols), dtype=np.float32)
-    for ry, rx in sources:
-        dy = g.rr - g.rows * ry
-        dx = g.cc - g.cols * rx
-        d = np.sqrt(dy**2 + dx**2)
-        val += np.sin(d * freq - t * 4) * np.exp(-d * damping) * 0.5
-    return np.clip(val + 0.5, 0, 1)
-```
+> **Note:** The v1 `eff_rings`, `eff_rays`, `eff_spiral`, `eff_glow`, `eff_tunnel`, `eff_vortex`, `eff_freq_waves`, `eff_interference`, `eff_aurora`, and `eff_ripple` functions are superseded by the `vf_*` value field generators below (used via `_render_vf()`). The `vf_*` versions integrate with the multi-grid composition pipeline and are preferred for all new scenes.

 ---

@@ -1967,3 +1826,40 @@ def scene_complex(r, f, t, S):
 ```

 Vary the **value field combo**, **hue field**, **palette**, **blend modes**, **feedback config**, and **shader chain** per section for maximum visual variety. With 12 value fields × 8 hue fields × 14 palettes × 20 blend modes × 7 feedback transforms × 38 shaders, the combinations are effectively infinite.
+
+---
+
+## Combining Effects — Creative Guide
+
+The catalog above is vocabulary. Here's how to compose it into something that looks intentional.
+
+### Layering for Depth
+Every scene should have at least two layers at different grid densities:
+- **Background** (sm or xs): dense, dim texture that prevents flat black. fBM, smooth noise, or domain warp at low brightness (bri=0.15-0.25).
+- **Content** (md): the main visual — rings, voronoi, spirals, tunnel. Full brightness.
+- **Accent** (lg or xl): sparse highlights — particles, text stencil, glow pulse. Screen-blended on top.
+
+### Interesting Effect Pairs
+| Pair | Blend | Why it works |
+|------|-------|-------------|
+| fBM + voronoi edges | `screen` | Organic fills the cells, edges add structure |
+| Domain warp + plasma | `difference` | Psychedelic organic interference |
+| Tunnel + vortex | `screen` | Depth perspective + rotational energy |
+| Spiral + interference | `exclusion` | Moire patterns from different spatial frequencies |
+| Reaction-diffusion + fire | `add` | Living organic base + dynamic foreground |
+| SDF geometry + domain warp | `screen` | Clean shapes floating in organic texture |
+
+### Effects as Masks
+Any value field can be used as a mask for another effect via `mask_from_vf()`:
+- Voronoi cells masking fire (fire visible only inside cells)
+- fBM masking a solid color layer (organic color clouds)
+- SDF shapes masking a reaction-diffusion field
+- Animated iris/wipe revealing one effect over another
+
+### Inventing New Effects
+For every project, create at least one effect that isn't in the catalog:
+- **Combine two vf_* functions** with math: `np.clip(vf_fbm(...) * vf_rings(...), 0, 1)`
+- **Apply coordinate transforms** before evaluation: `vf_plasma(twisted_grid, ...)`
+- **Use one field to modulate another's parameters**: `vf_spiral(..., tightness=2 + vf_fbm(...) * 5)`
+- **Stack time offsets**: render the same field at `t` and `t - 0.5`, difference-blend for motion trails
+- **Mirror a value field** through an SDF boundary for kaleidoscopic geometry
@@ -1,416 +0,0 @@
-# Scene Examples
-
-**Cross-references:**
- Grid system, palettes, color (HSV + OKLAB): `architecture.md`
- Effect building blocks (value fields, noise, SDFs, particles): `effects.md`
- `_render_vf()`, blend modes, tonemap, masking: `composition.md`
- Scene protocol, render_clip, SCENES table: `scenes.md`
- Shader pipeline, feedback buffer, ShaderChain: `shaders.md`
- Input sources (audio features, video features): `inputs.md`
- Performance tuning: `optimization.md`
- Common bugs: `troubleshooting.md`
-
-Copy-paste-ready scene functions at increasing complexity. Each is a complete, working v2 scene function that returns a pixel canvas. See `scenes.md` for the scene protocol and `composition.md` for blend modes and tonemap.
-
---
-
-## Minimal — Single Grid, Single Effect
-
-### Breathing Plasma
-
-One grid, one value field, one hue field. The simplest possible scene.
-
-```python
-def fx_breathing_plasma(r, f, t, S):
-    """Plasma field with time-cycling hue. Audio modulates brightness."""
-    canvas = _render_vf(r, "md",
-        lambda g, f, t, S: vf_plasma(g, f, t, S) * 1.3,
-        hf_time_cycle(0.08), PAL_DENSE, f, t, S, sat=0.8)
-    return canvas
-```
-
-### Reaction-Diffusion Coral
-
-Single grid, simulation-based field. Evolves organically over time.
-
-```python
-def fx_coral(r, f, t, S):
-    """Gray-Scott reaction-diffusion — coral branching pattern.
-    Slow-evolving, organic. Best for ambient/chill sections."""
-    canvas = _render_vf(r, "sm",
-        lambda g, f, t, S: vf_reaction_diffusion(g, f, t, S,
-            feed=0.037, kill=0.060, steps_per_frame=6, init_mode="center"),
-        hf_distance(0.55, 0.015), PAL_DOTS, f, t, S, sat=0.7)
-    return canvas
-```
-
-### SDF Geometry
-
-Geometric shapes from SDFs. Clean, precise, graphic.
-
-```python
-def fx_sdf_rings(r, f, t, S):
-    """Concentric SDF rings with smooth pulsing."""
-    def val_fn(g, f, t, S):
-        d1 = sdf_ring(g, radius=0.15 + f.get("bass", 0.3) * 0.05, thickness=0.015)
-        d2 = sdf_ring(g, radius=0.25 + f.get("mid", 0.3) * 0.05, thickness=0.012)
-        d3 = sdf_ring(g, radius=0.35 + f.get("hi", 0.3) * 0.04, thickness=0.010)
-        combined = sdf_smooth_union(sdf_smooth_union(d1, d2, 0.05), d3, 0.05)
-        return sdf_glow(combined, falloff=0.08) * (0.5 + f.get("rms", 0.3) * 0.8)
-    canvas = _render_vf(r, "md", val_fn, hf_angle(0.0), PAL_STARS, f, t, S, sat=0.85)
-    return canvas
-```
-
---
-
-## Standard — Two Grids + Blend
-
-### Tunnel Through Noise
-
-Two grids at different densities, screen blended. The fine noise texture shows through the coarser tunnel characters.
-
-```python
-def fx_tunnel_noise(r, f, t, S):
-    """Tunnel depth on md grid + fBM noise on sm grid, screen blended."""
-    canvas_a = _render_vf(r, "md",
-        lambda g, f, t, S: vf_tunnel(g, f, t, S, speed=4.0, complexity=8) * 1.2,
-        hf_distance(0.5, 0.02), PAL_BLOCKS, f, t, S, sat=0.7)
-
-    canvas_b = _render_vf(r, "sm",
-        lambda g, f, t, S: vf_fbm(g, f, t, S, octaves=4, freq=0.05, speed=0.15) * 1.3,
-        hf_time_cycle(0.06), PAL_RUNE, f, t, S, sat=0.6)
-
-    return blend_canvas(canvas_a, canvas_b, "screen", 0.7)
-```
-
-### Voronoi Cells + Spiral Overlay
-
-Voronoi cell edges with a spiral arm pattern overlaid.
-
-```python
-def fx_voronoi_spiral(r, f, t, S):
-    """Voronoi edge detection on md + logarithmic spiral on lg."""
-    canvas_a = _render_vf(r, "md",
-        lambda g, f, t, S: vf_voronoi(g, f, t, S,
-            n_cells=15, mode="edge", edge_width=2.0, speed=0.4),
-        hf_angle(0.2), PAL_CIRCUIT, f, t, S, sat=0.75)
-
-    canvas_b = _render_vf(r, "lg",
-        lambda g, f, t, S: vf_spiral(g, f, t, S, n_arms=4, tightness=3.0) * 1.2,
-        hf_distance(0.1, 0.03), PAL_BLOCKS, f, t, S, sat=0.9)
-
-    return blend_canvas(canvas_a, canvas_b, "exclusion", 0.6)
-```
-
-### Domain-Warped fBM
-
-Two layers of the same fBM, one domain-warped, difference-blended for psychedelic organic texture.
-
-```python
-def fx_organic_warp(r, f, t, S):
-    """Clean fBM vs domain-warped fBM, difference blended."""
-    canvas_a = _render_vf(r, "sm",
-        lambda g, f, t, S: vf_fbm(g, f, t, S, octaves=5, freq=0.04, speed=0.1),
-        hf_plasma(0.2), PAL_DENSE, f, t, S, sat=0.6)
-
-    canvas_b = _render_vf(r, "md",
-        lambda g, f, t, S: vf_domain_warp(g, f, t, S,
-            warp_strength=20.0, freq=0.05, speed=0.15),
-        hf_time_cycle(0.05), PAL_BRAILLE, f, t, S, sat=0.7)
-
-    return blend_canvas(canvas_a, canvas_b, "difference", 0.7)
-```
-
---
-
-## Complex — Three Grids + Conditional + Feedback
-
-### Psychedelic Cathedral
-
-Three-grid composition with beat-triggered kaleidoscope and feedback zoom tunnel. The most visually complex pattern.
-
-```python
-def fx_cathedral(r, f, t, S):
-    """Three-layer cathedral: interference + rings + noise, kaleidoscope on beat,
-    feedback zoom tunnel."""
-    # Layer 1: interference pattern on sm grid
-    canvas_a = _render_vf(r, "sm",
-        lambda g, f, t, S: vf_interference(g, f, t, S, n_waves=7) * 1.3,
-        hf_angle(0.0), PAL_MATH, f, t, S, sat=0.8)
-
-    # Layer 2: pulsing rings on md grid
-    canvas_b = _render_vf(r, "md",
-        lambda g, f, t, S: vf_rings(g, f, t, S, n_base=10, spacing_base=3) * 1.4,
-        hf_distance(0.3, 0.02), PAL_STARS, f, t, S, sat=0.9)
-
-    # Layer 3: temporal noise on lg grid (slow morph)
-    canvas_c = _render_vf(r, "lg",
-        lambda g, f, t, S: vf_temporal_noise(g, f, t, S,
-            freq=0.04, t_freq=0.2, octaves=3),
-        hf_time_cycle(0.12), PAL_BLOCKS, f, t, S, sat=0.7)
-
-    # Blend: A screen B, then difference with C
-    result = blend_canvas(canvas_a, canvas_b, "screen", 0.8)
-    result = blend_canvas(result, canvas_c, "difference", 0.5)
-
-    # Beat-triggered kaleidoscope
-    if f.get("bdecay", 0) > 0.3:
-        folds = 6 if f.get("sub_r", 0.3) > 0.4 else 8
-        result = sh_kaleidoscope(result.copy(), folds=folds)
-
-    return result
-
-# Scene table entry with feedback:
-# {"start": 30.0, "end": 50.0, "name": "cathedral", "fx": fx_cathedral,
-#  "gamma": 0.65, "shaders": [("bloom", {"thr": 110}), ("chromatic", {"amt": 4}),
-#                              ("vignette", {"s": 0.2}), ("grain", {"amt": 8})],
-#  "feedback": {"decay": 0.75, "blend": "screen", "opacity": 0.35,
-#               "transform": "zoom", "transform_amt": 0.012, "hue_shift": 0.015}}
-```
-
-### Masked Reaction-Diffusion with Attractor Overlay
-
-Reaction-diffusion visible only through an animated iris mask, with a strange attractor density field underneath.
-
-```python
-def fx_masked_life(r, f, t, S):
-    """Attractor base + reaction-diffusion visible through iris mask + particles."""
-    g_sm = r.get_grid("sm")
-    g_md = r.get_grid("md")
-
-    # Layer 1: strange attractor density field (background)
-    canvas_bg = _render_vf(r, "sm",
-        lambda g, f, t, S: vf_strange_attractor(g, f, t, S,
-            attractor="clifford", n_points=30000),
-        hf_time_cycle(0.04), PAL_DOTS, f, t, S, sat=0.5)
-
-    # Layer 2: reaction-diffusion (foreground, will be masked)
-    canvas_rd = _render_vf(r, "md",
-        lambda g, f, t, S: vf_reaction_diffusion(g, f, t, S,
-            feed=0.046, kill=0.063, steps_per_frame=4, init_mode="ring"),
-        hf_angle(0.15), PAL_HALFFILL, f, t, S, sat=0.85)
-
-    # Animated iris mask — opens over first 5 seconds of scene
-    scene_start = S.get("_scene_start", t)
-    if "_scene_start" not in S:
-        S["_scene_start"] = t
-    mask = mask_iris(g_md, t, scene_start, scene_start + 5.0,
-                     max_radius=0.6)
-    canvas_rd = apply_mask_canvas(canvas_rd, mask, bg_canvas=canvas_bg)
-
-    # Layer 3: flow-field particles following the R-D gradient
-    rd_field = vf_reaction_diffusion(g_sm, f, t, S,
-        feed=0.046, kill=0.063, steps_per_frame=0)  # read without stepping
-    ch_p, co_p = update_flow_particles(S, g_sm, f, rd_field,
-        n=300, speed=0.8, char_set=list("·•◦∘°"))
-    canvas_p = g_sm.render(ch_p, co_p)
-
-    result = blend_canvas(canvas_rd, canvas_p, "add", 0.7)
-    return result
-```
-
-### Morphing Field Sequence with Eased Keyframes
-
-Demonstrates temporal coherence: smooth morphing between effects with keyframed parameters.
-
-```python
-def fx_morphing_journey(r, f, t, S):
-    """Morphs through 4 value fields over 20 seconds with eased transitions.
-    Parameters (twist, arm count) also keyframed."""
-    # Keyframed twist parameter
-    twist = keyframe(t, [(0, 1.0), (5, 5.0), (10, 2.0), (15, 8.0), (20, 1.0)],
-                     ease_fn=ease_in_out_cubic, loop=True)
-
-    # Sequence of value fields with 2s crossfade
-    fields = [
-        lambda g, f, t, S: vf_plasma(g, f, t, S),
-        lambda g, f, t, S: vf_vortex(g, f, t, S, twist=twist),
-        lambda g, f, t, S: vf_fbm(g, f, t, S, octaves=5, freq=0.04),
-        lambda g, f, t, S: vf_domain_warp(g, f, t, S, warp_strength=15),
-    ]
-    durations = [5.0, 5.0, 5.0, 5.0]
-
-    val_fn = lambda g, f, t, S: vf_sequence(g, f, t, S, fields, durations,
-                                             crossfade=2.0)
-
-    # Render with slowly rotating hue
-    canvas = _render_vf(r, "md", val_fn, hf_time_cycle(0.06),
-                        PAL_DENSE, f, t, S, sat=0.8)
-
-    # Second layer: tiled version of same sequence at smaller grid
-    tiled_fn = lambda g, f, t, S: vf_sequence(
-        make_tgrid(g, *uv_tile(g, 3, 3, mirror=True)),
-        f, t, S, fields, durations, crossfade=2.0)
-    canvas_b = _render_vf(r, "sm", tiled_fn, hf_angle(0.1),
-                          PAL_RUNE, f, t, S, sat=0.6)
-
-    return blend_canvas(canvas, canvas_b, "screen", 0.5)
-```
-
---
-
-## Specialized — Unique State Patterns
-
-### Game of Life with Ghost Trails
-
-Cellular automaton with analog fade trails. Beat injects random cells.
-
-```python
-def fx_life(r, f, t, S):
-    """Conway's Game of Life with fading ghost trails.
-    Beat events inject random live cells for disruption."""
-    canvas = _render_vf(r, "sm",
-        lambda g, f, t, S: vf_game_of_life(g, f, t, S,
-            rule="life", steps_per_frame=1, fade=0.92, density=0.25),
-        hf_fixed(0.33), PAL_BLOCKS, f, t, S, sat=0.8)
-
-    # Overlay: coral automaton on lg grid for chunky texture
-    canvas_b = _render_vf(r, "lg",
-        lambda g, f, t, S: vf_game_of_life(g, f, t, S,
-            rule="coral", steps_per_frame=1, fade=0.85, density=0.15, seed=99),
-        hf_time_cycle(0.1), PAL_HATCH, f, t, S, sat=0.6)
-
-    return blend_canvas(canvas, canvas_b, "screen", 0.5)
-```
-
-### Boids Flock Over Voronoi
-
-Emergent swarm movement over a cellular background.
-
-```python
-def fx_boid_swarm(r, f, t, S):
-    """Flocking boids over animated voronoi cells."""
-    # Background: voronoi cells
-    canvas_bg = _render_vf(r, "md",
-        lambda g, f, t, S: vf_voronoi(g, f, t, S,
-            n_cells=20, mode="distance", speed=0.2),
-        hf_distance(0.4, 0.02), PAL_CIRCUIT, f, t, S, sat=0.5)
-
-    # Foreground: boids
-    g = r.get_grid("md")
-    ch_b, co_b = update_boids(S, g, f, n_boids=150, perception=6.0,
-                              max_speed=1.5, char_set=list("▸▹►▻→⟶"))
-    canvas_boids = g.render(ch_b, co_b)
-
-    # Trails for the boids
-    # (boid positions are stored in S["boid_x"], S["boid_y"])
-    S["px"] = list(S.get("boid_x", []))
-    S["py"] = list(S.get("boid_y", []))
-    ch_t, co_t = draw_particle_trails(S, g, max_trail=6, fade=0.6)
-    canvas_trails = g.render(ch_t, co_t)
-
-    result = blend_canvas(canvas_bg, canvas_trails, "add", 0.3)
-    result = blend_canvas(result, canvas_boids, "add", 0.9)
-    return result
-```
-
-### Fire Rising Through SDF Text Stencil
-
-Fire effect visible only through text letterforms.
-
-```python
-def fx_fire_text(r, f, t, S):
-    """Fire columns visible through text stencil. Text acts as window."""
-    g = r.get_grid("lg")
-
-    # Full-screen fire (will be masked)
-    canvas_fire = _render_vf(r, "sm",
-        lambda g, f, t, S: np.clip(
-            vf_fbm(g, f, t, S, octaves=4, freq=0.08, speed=0.8) *
-            (1.0 - g.rr / g.rows) *  # fade toward top
-            (0.6 + f.get("bass", 0.3) * 0.8), 0, 1),
-        hf_fixed(0.05), PAL_BLOCKS, f, t, S, sat=0.9)  # fire hue
-
-    # Background: dark domain warp
-    canvas_bg = _render_vf(r, "md",
-        lambda g, f, t, S: vf_domain_warp(g, f, t, S,
-            warp_strength=8, freq=0.03, speed=0.05) * 0.3,
-        hf_fixed(0.6), PAL_DENSE, f, t, S, sat=0.4)
-
-    # Text stencil mask
-    mask = mask_text(g, "FIRE", row_frac=0.45)
-    # Expand vertically for multi-row coverage
-    for offset in range(-2, 3):
-        shifted = mask_text(g, "FIRE", row_frac=0.45 + offset / g.rows)
-        mask = mask_union(mask, shifted)
-
-    canvas_masked = apply_mask_canvas(canvas_fire, mask, bg_canvas=canvas_bg)
-    return canvas_masked
-```
-
-### Portrait Mode: Vertical Rain + Quote
-
-Optimized for 9:16. Uses vertical space for long rain trails and stacked text.
-
-```python
-def fx_portrait_rain_quote(r, f, t, S):
-    """Portrait-optimized: matrix rain (long vertical trails) with stacked quote.
-    Designed for 1080x1920 (9:16)."""
-    g = r.get_grid("md")  # ~112x100 in portrait
-
-    # Matrix rain — long trails benefit from portrait's extra rows
-    ch, co, S = eff_matrix_rain(g, f, t, S,
-        hue=0.33, bri=0.6, pal=PAL_KATA, speed_base=0.4, speed_beat=2.5)
-    canvas_rain = g.render(ch, co)
-
-    # Tunnel depth underneath for texture
-    canvas_tunnel = _render_vf(r, "sm",
-        lambda g, f, t, S: vf_tunnel(g, f, t, S, speed=3.0, complexity=6) * 0.8,
-        hf_fixed(0.33), PAL_BLOCKS, f, t, S, sat=0.5)
-
-    result = blend_canvas(canvas_tunnel, canvas_rain, "screen", 0.8)
-
-    # Quote text — portrait layout: short lines, many of them
-    g_text = r.get_grid("lg")  # ~90x80 in portrait
-    quote_lines = layout_text_portrait(
-        "The code is the art and the art is the code",
-        max_chars_per_line=20)
-    # Center vertically
-    block_start = (g_text.rows - len(quote_lines)) // 2
-    ch_t = np.full((g_text.rows, g_text.cols), " ", dtype="U1")
-    co_t = np.zeros((g_text.rows, g_text.cols, 3), dtype=np.uint8)
-    total_chars = sum(len(l) for l in quote_lines)
-    progress = min(1.0, (t - S.get("_scene_start", t)) / 3.0)
-    if "_scene_start" not in S: S["_scene_start"] = t
-    render_typewriter(ch_t, co_t, quote_lines, block_start, g_text.cols,
-                      progress, total_chars, (200, 255, 220), t)
-    canvas_text = g_text.render(ch_t, co_t)
-
-    result = blend_canvas(result, canvas_text, "add", 0.9)
-    return result
-```
-
---
-
-## Scene Table Template
-
-Wire scenes into a complete video:
-
-```python
-SCENES = [
-    {"start": 0.0,  "end": 5.0,  "name": "coral",
-     "fx": fx_coral, "grid": "sm", "gamma": 0.70,
-     "shaders": [("bloom", {"thr": 110}), ("vignette", {"s": 0.2})],
-     "feedback": {"decay": 0.8, "blend": "screen", "opacity": 0.3,
-                  "transform": "zoom", "transform_amt": 0.01}},
-
-    {"start": 5.0,  "end": 15.0, "name": "tunnel_noise",
-     "fx": fx_tunnel_noise, "grid": "md", "gamma": 0.75,
-     "shaders": [("chromatic", {"amt": 3}), ("bloom", {"thr": 120}),
-                 ("scanlines", {"intensity": 0.06}), ("grain", {"amt": 8})],
-     "feedback": None},
-
-    {"start": 15.0, "end": 35.0, "name": "cathedral",
-     "fx": fx_cathedral, "grid": "sm", "gamma": 0.65,
-     "shaders": [("bloom", {"thr": 100}), ("chromatic", {"amt": 5}),
-                 ("color_wobble", {"amt": 0.2}), ("vignette", {"s": 0.18})],
-     "feedback": {"decay": 0.75, "blend": "screen", "opacity": 0.35,
-                  "transform": "zoom", "transform_amt": 0.012, "hue_shift": 0.015}},
-
-    {"start": 35.0, "end": 50.0, "name": "morphing",
-     "fx": fx_morphing_journey, "grid": "md", "gamma": 0.70,
-     "shaders": [("bloom", {"thr": 110}), ("grain", {"amt": 6})],
-     "feedback": {"decay": 0.7, "blend": "screen", "opacity": 0.25,
-                  "transform": "rotate_cw", "transform_amt": 0.003}},
-]
-```
@@ -1,13 +1,6 @@
 # Input Sources

-**Cross-references:**
- Grid system, resolution presets: `architecture.md`
- Effect building blocks (audio-reactive modulation): `effects.md`
- Scene protocol, SCENES table (feature routing): `scenes.md`
- Shader pipeline, output encoding: `shaders.md`
- Performance tuning (audio chunking, WAV caching): `optimization.md`
- Common bugs (sample rate, dtype, silence handling): `troubleshooting.md`
- Complete scene examples with feature usage: `examples.md`
+> **See also:** architecture.md · effects.md · scenes.md · shaders.md · optimization.md · troubleshooting.md

 ## Audio Analysis

@@ -1,14 +1,6 @@
 # Optimization Reference

-**Cross-references:**
- Grid system, resolution presets, portrait GridLayer: `architecture.md`
- Effect building blocks (pre-computation strategies): `effects.md`
- `_render_vf()`, tonemap (subsampled percentile): `composition.md`
- Scene protocol, render_clip: `scenes.md`
- Shader pipeline, encoding (ffmpeg flags): `shaders.md`
- Input sources (audio chunking, WAV extraction): `inputs.md`
- Common bugs (memory, OOM, frame drops): `troubleshooting.md`
- Complete scene examples: `examples.md`
+> **See also:** architecture.md · composition.md · scenes.md · shaders.md · inputs.md · troubleshooting.md

 ## Hardware Detection

@@ -1,18 +1,214 @@
-# Scene System Reference
+# Scene System & Creative Composition

-**Cross-references:**
- Grid system, palettes, color (HSV + OKLAB): `architecture.md`
- Effect building blocks (value fields, noise, SDFs, particles): `effects.md`
- `_render_vf()`, blend modes, tonemap, masking: `composition.md`
- Shader pipeline, feedback buffer, ShaderChain: `shaders.md`
- Complete scene examples at every complexity level: `examples.md`
- Input sources (audio features, video features): `inputs.md`
- Performance tuning, portrait CLI: `optimization.md`
- Common bugs (state leaks, frame drops): `troubleshooting.md`
+> **See also:** architecture.md · composition.md · effects.md · shaders.md
+
+## Scene Design Philosophy
+
+Scenes are storytelling units, not effect demos. Every scene needs:
+- A **concept** — what is happening visually? Not "plasma + rings" but "emergence from void" or "crystallization"
+- An **arc** — how does it change over its duration? Build, decay, transform, reveal?
+- A **role** — how does it serve the larger video narrative? Opening tension, peak energy, resolution?
+
+The design patterns below provide compositional techniques. The scene examples show them in practice at increasing complexity. The protocol section covers the technical contract.
+
+Good scene design starts with the concept, then selects effects and parameters that serve it. The design patterns section shows *how* to compose layers intentionally. The examples section shows complete working scenes at every complexity level. The protocol section covers the technical contract that all scenes must follow.
+
+---
+
+## Scene Design Patterns
+
+Higher-order patterns for composing scenes that feel intentional rather than random. These patterns use the existing building blocks (value fields, blend modes, shaders, feedback) but organize them with compositional intent.
+
+## Layer Hierarchy
+
+Every scene should have clear visual layers with distinct roles:
+
+| Layer | Grid | Brightness | Purpose |
+|-------|------|-----------|---------|
+| **Background** | xs or sm (dense) | 0.1–0.25 | Atmosphere, texture. Never competes with content. |
+| **Content** | md (balanced) | 0.4–0.8 | The main visual idea. Carries the scene's concept. |
+| **Accent** | lg or sm (sparse) | 0.5–1.0 (sparse coverage) | Highlights, punctuation, sparse bright points. |
+
+The background sets mood. The content layer is what the scene *is about*. The accent adds visual interest without overwhelming.
+
+```python
+def fx_example(r, f, t, S):
+    local = t
+    progress = min(local / 5.0, 1.0)
+
+    g_bg = r.get_grid("sm")
+    g_main = r.get_grid("md")
+    g_accent = r.get_grid("lg")
+
+    # --- Background: dim atmosphere ---
+    bg_val = vf_smooth_noise(g_bg, f, t * 0.3, S, octaves=2, bri=0.15)
+    # ... render bg to canvas
+
+    # --- Content: the main visual idea ---
+    content_val = vf_spiral(g_main, f, t, S, n_arms=n_arms, tightness=tightness)
+    # ... render content on top of canvas
+
+    # --- Accent: sparse highlights ---
+    accent_val = vf_noise_static(g_accent, f, t, S, density=0.05)
+    # ... render accent on top
+
+    return canvas
+```
+
+## Directional Parameter Arcs
+
+Parameters should *go somewhere* over the scene's duration — not oscillate aimlessly with `sin(t * N)`.
+
+**Bad:** `twist = 3.0 + 2.0 * math.sin(t * 0.6)` — wobbles back and forth, feels aimless.
+
+**Good:** `twist = 2.0 + progress * 5.0` — starts gentle, ends intense. The scene *builds*.
+
+Use `progress = min(local / duration, 1.0)` (0→1 over the scene) to drive directional change:
+
+| Pattern | Formula | Feel |
+|---------|---------|------|
+| Linear ramp | `progress * range` | Steady buildup |
+| Ease-out | `1 - (1 - progress) ** 2` | Fast start, gentle finish |
+| Ease-in | `progress ** 2` | Slow start, accelerating |
+| Step reveal | `np.clip((progress - 0.5) / 0.25, 0, 1)` | Nothing until 50%, then fades in |
+| Build + plateau | `min(1.0, progress * 1.5)` | Reaches full at 67%, holds |
+
+Oscillation is fine for *secondary* parameters (saturation shimmer, hue drift). But the *defining* parameter of the scene should have a direction.
+
+### Examples of Directional Arcs
+
+| Scene concept | Parameter | Arc |
+|--------------|-----------|-----|
+| Emergence | Ring radius | 0 → max (ease-out) |
+| Shatter | Voronoi cell count | 8 → 38 (linear) |
+| Descent | Tunnel speed | 2.0 → 10.0 (linear) |
+| Mandala | Shape complexity | ring → +polygon → +star → +rosette (step reveals) |
+| Crescendo | Layer count | 1 → 7 (staggered entry) |
+| Entropy | Geometry visibility | 1.0 → 0.0 (consumed) |
+
+## Scene Concepts
+
+Each scene should be built around a *visual idea*, not an effect name.
+
+**Bad:** "fx_plasma_cascade" — named after the effect. No concept.
+**Good:** "fx_emergence" — a point of light expands into a field. The name tells you *what happens*.
+
+Good scene concepts have:
+1. A **visual metaphor** (emergence, descent, collision, entropy)
+2. A **directional arc** (things change from A to B, not oscillate)
+3. **Motivated layer choices** (each layer serves the concept)
+4. **Motivated feedback** (transform direction matches the metaphor)
+
+| Concept | Metaphor | Feedback transform | Why |
+|---------|----------|-------------------|-----|
+| Emergence | Birth, expansion | zoom-out | Past frames expand outward |
+| Descent | Falling, acceleration | zoom-in | Past frames rush toward center |
+| Inferno | Rising fire | shift-up | Past frames rise with the flames |
+| Entropy | Decay, dissolution | none | Clean, no persistence — things disappear |
+| Crescendo | Accumulation | zoom + hue_shift | Everything compounds and shifts |
+
+## Compositional Techniques
+
+### Counter-Rotating Dual Systems
+
+Two instances of the same effect rotating in opposite directions create visual interference:
+
+```python
+# Primary spiral (clockwise)
+s1_val = vf_spiral(g_main, f, t * 1.5, S, n_arms=n_arms_1, tightness=tightness_1)
+
+# Counter-rotating spiral (counter-clockwise via negative time)
+s2_val = vf_spiral(g_accent, f, -t * 1.2, S, n_arms=n_arms_2, tightness=tightness_2)
+
+# Screen blend creates bright interference at crossing points
+canvas = blend_canvas(canvas_with_s1, c2, "screen", 0.7)
+```
+
+Works with spirals, vortexes, rings. The counter-rotation creates constantly shifting interference patterns.
+
+### Wave Collision
+
+Two wave fronts converging from opposite sides, meeting at a collision point:
+
+```python
+collision_phase = abs(progress - 0.5) * 2  # 1→0→1 (0 at collision)
+
+# Wave A approaches from left
+offset_a = (1 - progress) * g.cols * 0.4
+wave_a = np.sin((g.cc + offset_a) * 0.08 + t * 2) * 0.5 + 0.5
+
+# Wave B approaches from right
+offset_b = -(1 - progress) * g.cols * 0.4
+wave_b = np.sin((g.cc + offset_b) * 0.08 - t * 2) * 0.5 + 0.5
+
+# Interference peaks at collision
+combined = wave_a * 0.5 + wave_b * 0.5 + np.abs(wave_a - wave_b) * (1 - collision_phase) * 0.5
+```
+
+### Progressive Fragmentation
+
+Voronoi with cell count increasing over time — visual shattering:
+
+```python
+n_pts = int(8 + progress * 30)  # 8 cells → 38 cells
+# Pre-generate enough points, slice to n_pts
+px = base_x[:n_pts] + np.sin(t * 0.3 + np.arange(n_pts) * 0.7) * (3 + progress * 3)
+```
+
+The edge glow width can also increase with progress to emphasize the cracks.
+
+### Entropy / Consumption
+
+A clean geometric pattern being overtaken by an organic process:
+
+```python
+# Geometry fades out
+geo_val = clean_pattern * max(0.05, 1.0 - progress * 0.9)
+
+# Organic process grows in
+rd_val = vf_reaction_diffusion(g, f, t, S) * min(1.0, progress * 1.5)
+
+# Render geometry first, organic on top — organic consumes geometry
+```
+
+### Staggered Layer Entry (Crescendo)
+
+Layers enter one at a time, building to overwhelming density:
+
+```python
+def layer_strength(enter_t, ramp=1.5):
+    """0.0 until enter_t, ramps to 1.0 over ramp seconds."""
+    return max(0.0, min(1.0, (local - enter_t) / ramp))
+
+# Layer 1: always present
+s1 = layer_strength(0.0)
+# Layer 2: enters at 2s
+s2 = layer_strength(2.0)
+# Layer 3: enters at 4s
+s3 = layer_strength(4.0)
+# ... etc
+
+# Each layer uses a different effect, grid, palette, and blend mode
+# Screen blend between layers so they accumulate light
+```
+
+For a 15-second crescendo, 7 layers entering every 2 seconds works well. Use different blend modes (screen for most, add for energy, colordodge for the final wash).
+
+## Scene Ordering
+
+For a multi-scene reel or video:
+- **Vary mood between adjacent scenes** — don't put two calm scenes next to each other
+- **Randomize order** rather than grouping by type — prevents "effect demo" feel
+- **End on the strongest scene** — crescendo or something with a clear payoff
+- **Open with energy** — grab attention in the first 2 seconds
+
+---
+
+## Scene Protocol

 Scenes are the top-level creative unit. Each scene is a time-bounded segment with its own effect function, shader chain, feedback configuration, and tone-mapping gamma.

-## Scene Protocol (v2)
+### Scene Protocol (v2)

 ### Function Signature

@@ -404,3 +600,412 @@ For each scene:
 7. **Configure feedback** for trailing/recursive looks — or None for clean cuts
 8. **Set gamma** if using destructive shaders (solarize, posterize)
 9. **Test with --test-frame** at the scene's midpoint before full render
+
+---
+
+## Scene Examples
+
+Copy-paste-ready scene functions at increasing complexity. Each is a complete, working v2 scene function that returns a pixel canvas. See the Scene Protocol section above for the scene protocol and `composition.md` for blend modes and tonemap.
+
+---
+
+### Minimal — Single Grid, Single Effect
+
+### Breathing Plasma
+
+One grid, one value field, one hue field. The simplest possible scene.
+
+```python
+def fx_breathing_plasma(r, f, t, S):
+    """Plasma field with time-cycling hue. Audio modulates brightness."""
+    canvas = _render_vf(r, "md",
+        lambda g, f, t, S: vf_plasma(g, f, t, S) * 1.3,
+        hf_time_cycle(0.08), PAL_DENSE, f, t, S, sat=0.8)
+    return canvas
+```
+
+### Reaction-Diffusion Coral
+
+Single grid, simulation-based field. Evolves organically over time.
+
+```python
+def fx_coral(r, f, t, S):
+    """Gray-Scott reaction-diffusion — coral branching pattern.
+    Slow-evolving, organic. Best for ambient/chill sections."""
+    canvas = _render_vf(r, "sm",
+        lambda g, f, t, S: vf_reaction_diffusion(g, f, t, S,
+            feed=0.037, kill=0.060, steps_per_frame=6, init_mode="center"),
+        hf_distance(0.55, 0.015), PAL_DOTS, f, t, S, sat=0.7)
+    return canvas
+```
+
+### SDF Geometry
+
+Geometric shapes from SDFs. Clean, precise, graphic.
+
+```python
+def fx_sdf_rings(r, f, t, S):
+    """Concentric SDF rings with smooth pulsing."""
+    def val_fn(g, f, t, S):
+        d1 = sdf_ring(g, radius=0.15 + f.get("bass", 0.3) * 0.05, thickness=0.015)
+        d2 = sdf_ring(g, radius=0.25 + f.get("mid", 0.3) * 0.05, thickness=0.012)
+        d3 = sdf_ring(g, radius=0.35 + f.get("hi", 0.3) * 0.04, thickness=0.010)
+        combined = sdf_smooth_union(sdf_smooth_union(d1, d2, 0.05), d3, 0.05)
+        return sdf_glow(combined, falloff=0.08) * (0.5 + f.get("rms", 0.3) * 0.8)
+    canvas = _render_vf(r, "md", val_fn, hf_angle(0.0), PAL_STARS, f, t, S, sat=0.85)
+    return canvas
+```
+
+---
+
+### Standard — Two Grids + Blend
+
+### Tunnel Through Noise
+
+Two grids at different densities, screen blended. The fine noise texture shows through the coarser tunnel characters.
+
+```python
+def fx_tunnel_noise(r, f, t, S):
+    """Tunnel depth on md grid + fBM noise on sm grid, screen blended."""
+    canvas_a = _render_vf(r, "md",
+        lambda g, f, t, S: vf_tunnel(g, f, t, S, speed=4.0, complexity=8) * 1.2,
+        hf_distance(0.5, 0.02), PAL_BLOCKS, f, t, S, sat=0.7)
+
+    canvas_b = _render_vf(r, "sm",
+        lambda g, f, t, S: vf_fbm(g, f, t, S, octaves=4, freq=0.05, speed=0.15) * 1.3,
+        hf_time_cycle(0.06), PAL_RUNE, f, t, S, sat=0.6)
+
+    return blend_canvas(canvas_a, canvas_b, "screen", 0.7)
+```
+
+### Voronoi Cells + Spiral Overlay
+
+Voronoi cell edges with a spiral arm pattern overlaid.
+
+```python
+def fx_voronoi_spiral(r, f, t, S):
+    """Voronoi edge detection on md + logarithmic spiral on lg."""
+    canvas_a = _render_vf(r, "md",
+        lambda g, f, t, S: vf_voronoi(g, f, t, S,
+            n_cells=15, mode="edge", edge_width=2.0, speed=0.4),
+        hf_angle(0.2), PAL_CIRCUIT, f, t, S, sat=0.75)
+
+    canvas_b = _render_vf(r, "lg",
+        lambda g, f, t, S: vf_spiral(g, f, t, S, n_arms=4, tightness=3.0) * 1.2,
+        hf_distance(0.1, 0.03), PAL_BLOCKS, f, t, S, sat=0.9)
+
+    return blend_canvas(canvas_a, canvas_b, "exclusion", 0.6)
+```
+
+### Domain-Warped fBM
+
+Two layers of the same fBM, one domain-warped, difference-blended for psychedelic organic texture.
+
+```python
+def fx_organic_warp(r, f, t, S):
+    """Clean fBM vs domain-warped fBM, difference blended."""
+    canvas_a = _render_vf(r, "sm",
+        lambda g, f, t, S: vf_fbm(g, f, t, S, octaves=5, freq=0.04, speed=0.1),
+        hf_plasma(0.2), PAL_DENSE, f, t, S, sat=0.6)
+
+    canvas_b = _render_vf(r, "md",
+        lambda g, f, t, S: vf_domain_warp(g, f, t, S,
+            warp_strength=20.0, freq=0.05, speed=0.15),
+        hf_time_cycle(0.05), PAL_BRAILLE, f, t, S, sat=0.7)
+
+    return blend_canvas(canvas_a, canvas_b, "difference", 0.7)
+```
+
+---
+
+### Complex — Three Grids + Conditional + Feedback
+
+### Psychedelic Cathedral
+
+Three-grid composition with beat-triggered kaleidoscope and feedback zoom tunnel. The most visually complex pattern.
+
+```python
+def fx_cathedral(r, f, t, S):
+    """Three-layer cathedral: interference + rings + noise, kaleidoscope on beat,
+    feedback zoom tunnel."""
+    # Layer 1: interference pattern on sm grid
+    canvas_a = _render_vf(r, "sm",
+        lambda g, f, t, S: vf_interference(g, f, t, S, n_waves=7) * 1.3,
+        hf_angle(0.0), PAL_MATH, f, t, S, sat=0.8)
+
+    # Layer 2: pulsing rings on md grid
+    canvas_b = _render_vf(r, "md",
+        lambda g, f, t, S: vf_rings(g, f, t, S, n_base=10, spacing_base=3) * 1.4,
+        hf_distance(0.3, 0.02), PAL_STARS, f, t, S, sat=0.9)
+
+    # Layer 3: temporal noise on lg grid (slow morph)
+    canvas_c = _render_vf(r, "lg",
+        lambda g, f, t, S: vf_temporal_noise(g, f, t, S,
+            freq=0.04, t_freq=0.2, octaves=3),
+        hf_time_cycle(0.12), PAL_BLOCKS, f, t, S, sat=0.7)
+
+    # Blend: A screen B, then difference with C
+    result = blend_canvas(canvas_a, canvas_b, "screen", 0.8)
+    result = blend_canvas(result, canvas_c, "difference", 0.5)
+
+    # Beat-triggered kaleidoscope
+    if f.get("bdecay", 0) > 0.3:
+        folds = 6 if f.get("sub_r", 0.3) > 0.4 else 8
+        result = sh_kaleidoscope(result.copy(), folds=folds)
+
+    return result
+
+# Scene table entry with feedback:
+# {"start": 30.0, "end": 50.0, "name": "cathedral", "fx": fx_cathedral,
+#  "gamma": 0.65, "shaders": [("bloom", {"thr": 110}), ("chromatic", {"amt": 4}),
+#                              ("vignette", {"s": 0.2}), ("grain", {"amt": 8})],
+#  "feedback": {"decay": 0.75, "blend": "screen", "opacity": 0.35,
+#               "transform": "zoom", "transform_amt": 0.012, "hue_shift": 0.015}}
+```
+
+### Masked Reaction-Diffusion with Attractor Overlay
+
+Reaction-diffusion visible only through an animated iris mask, with a strange attractor density field underneath.
+
+```python
+def fx_masked_life(r, f, t, S):
+    """Attractor base + reaction-diffusion visible through iris mask + particles."""
+    g_sm = r.get_grid("sm")
+    g_md = r.get_grid("md")
+
+    # Layer 1: strange attractor density field (background)
+    canvas_bg = _render_vf(r, "sm",
+        lambda g, f, t, S: vf_strange_attractor(g, f, t, S,
+            attractor="clifford", n_points=30000),
+        hf_time_cycle(0.04), PAL_DOTS, f, t, S, sat=0.5)
+
+    # Layer 2: reaction-diffusion (foreground, will be masked)
+    canvas_rd = _render_vf(r, "md",
+        lambda g, f, t, S: vf_reaction_diffusion(g, f, t, S,
+            feed=0.046, kill=0.063, steps_per_frame=4, init_mode="ring"),
+        hf_angle(0.15), PAL_HALFFILL, f, t, S, sat=0.85)
+
+    # Animated iris mask — opens over first 5 seconds of scene
+    scene_start = S.get("_scene_start", t)
+    if "_scene_start" not in S:
+        S["_scene_start"] = t
+    mask = mask_iris(g_md, t, scene_start, scene_start + 5.0,
+                     max_radius=0.6)
+    canvas_rd = apply_mask_canvas(canvas_rd, mask, bg_canvas=canvas_bg)
+
+    # Layer 3: flow-field particles following the R-D gradient
+    rd_field = vf_reaction_diffusion(g_sm, f, t, S,
+        feed=0.046, kill=0.063, steps_per_frame=0)  # read without stepping
+    ch_p, co_p = update_flow_particles(S, g_sm, f, rd_field,
+        n=300, speed=0.8, char_set=list("·•◦∘°"))
+    canvas_p = g_sm.render(ch_p, co_p)
+
+    result = blend_canvas(canvas_rd, canvas_p, "add", 0.7)
+    return result
+```
+
+### Morphing Field Sequence with Eased Keyframes
+
+Demonstrates temporal coherence: smooth morphing between effects with keyframed parameters.
+
+```python
+def fx_morphing_journey(r, f, t, S):
+    """Morphs through 4 value fields over 20 seconds with eased transitions.
+    Parameters (twist, arm count) also keyframed."""
+    # Keyframed twist parameter
+    twist = keyframe(t, [(0, 1.0), (5, 5.0), (10, 2.0), (15, 8.0), (20, 1.0)],
+                     ease_fn=ease_in_out_cubic, loop=True)
+
+    # Sequence of value fields with 2s crossfade
+    fields = [
+        lambda g, f, t, S: vf_plasma(g, f, t, S),
+        lambda g, f, t, S: vf_vortex(g, f, t, S, twist=twist),
+        lambda g, f, t, S: vf_fbm(g, f, t, S, octaves=5, freq=0.04),
+        lambda g, f, t, S: vf_domain_warp(g, f, t, S, warp_strength=15),
+    ]
+    durations = [5.0, 5.0, 5.0, 5.0]
+
+    val_fn = lambda g, f, t, S: vf_sequence(g, f, t, S, fields, durations,
+                                             crossfade=2.0)
+
+    # Render with slowly rotating hue
+    canvas = _render_vf(r, "md", val_fn, hf_time_cycle(0.06),
+                        PAL_DENSE, f, t, S, sat=0.8)
+
+    # Second layer: tiled version of same sequence at smaller grid
+    tiled_fn = lambda g, f, t, S: vf_sequence(
+        make_tgrid(g, *uv_tile(g, 3, 3, mirror=True)),
+        f, t, S, fields, durations, crossfade=2.0)
+    canvas_b = _render_vf(r, "sm", tiled_fn, hf_angle(0.1),
+                          PAL_RUNE, f, t, S, sat=0.6)
+
+    return blend_canvas(canvas, canvas_b, "screen", 0.5)
+```
+
+---
+
+### Specialized — Unique State Patterns
+
+### Game of Life with Ghost Trails
+
+Cellular automaton with analog fade trails. Beat injects random cells.
+
+```python
+def fx_life(r, f, t, S):
+    """Conway's Game of Life with fading ghost trails.
+    Beat events inject random live cells for disruption."""
+    canvas = _render_vf(r, "sm",
+        lambda g, f, t, S: vf_game_of_life(g, f, t, S,
+            rule="life", steps_per_frame=1, fade=0.92, density=0.25),
+        hf_fixed(0.33), PAL_BLOCKS, f, t, S, sat=0.8)
+
+    # Overlay: coral automaton on lg grid for chunky texture
+    canvas_b = _render_vf(r, "lg",
+        lambda g, f, t, S: vf_game_of_life(g, f, t, S,
+            rule="coral", steps_per_frame=1, fade=0.85, density=0.15, seed=99),
+        hf_time_cycle(0.1), PAL_HATCH, f, t, S, sat=0.6)
+
+    return blend_canvas(canvas, canvas_b, "screen", 0.5)
+```
+
+### Boids Flock Over Voronoi
+
+Emergent swarm movement over a cellular background.
+
+```python
+def fx_boid_swarm(r, f, t, S):
+    """Flocking boids over animated voronoi cells."""
+    # Background: voronoi cells
+    canvas_bg = _render_vf(r, "md",
+        lambda g, f, t, S: vf_voronoi(g, f, t, S,
+            n_cells=20, mode="distance", speed=0.2),
+        hf_distance(0.4, 0.02), PAL_CIRCUIT, f, t, S, sat=0.5)
+
+    # Foreground: boids
+    g = r.get_grid("md")
+    ch_b, co_b = update_boids(S, g, f, n_boids=150, perception=6.0,
+                              max_speed=1.5, char_set=list("▸▹►▻→⟶"))
+    canvas_boids = g.render(ch_b, co_b)
+
+    # Trails for the boids
+    # (boid positions are stored in S["boid_x"], S["boid_y"])
+    S["px"] = list(S.get("boid_x", []))
+    S["py"] = list(S.get("boid_y", []))
+    ch_t, co_t = draw_particle_trails(S, g, max_trail=6, fade=0.6)
+    canvas_trails = g.render(ch_t, co_t)
+
+    result = blend_canvas(canvas_bg, canvas_trails, "add", 0.3)
+    result = blend_canvas(result, canvas_boids, "add", 0.9)
+    return result
+```
+
+### Fire Rising Through SDF Text Stencil
+
+Fire effect visible only through text letterforms.
+
+```python
+def fx_fire_text(r, f, t, S):
+    """Fire columns visible through text stencil. Text acts as window."""
+    g = r.get_grid("lg")
+
+    # Full-screen fire (will be masked)
+    canvas_fire = _render_vf(r, "sm",
+        lambda g, f, t, S: np.clip(
+            vf_fbm(g, f, t, S, octaves=4, freq=0.08, speed=0.8) *
+            (1.0 - g.rr / g.rows) *  # fade toward top
+            (0.6 + f.get("bass", 0.3) * 0.8), 0, 1),
+        hf_fixed(0.05), PAL_BLOCKS, f, t, S, sat=0.9)  # fire hue
+
+    # Background: dark domain warp
+    canvas_bg = _render_vf(r, "md",
+        lambda g, f, t, S: vf_domain_warp(g, f, t, S,
+            warp_strength=8, freq=0.03, speed=0.05) * 0.3,
+        hf_fixed(0.6), PAL_DENSE, f, t, S, sat=0.4)
+
+    # Text stencil mask
+    mask = mask_text(g, "FIRE", row_frac=0.45)
+    # Expand vertically for multi-row coverage
+    for offset in range(-2, 3):
+        shifted = mask_text(g, "FIRE", row_frac=0.45 + offset / g.rows)
+        mask = mask_union(mask, shifted)
+
+    canvas_masked = apply_mask_canvas(canvas_fire, mask, bg_canvas=canvas_bg)
+    return canvas_masked
+```
+
+### Portrait Mode: Vertical Rain + Quote
+
+Optimized for 9:16. Uses vertical space for long rain trails and stacked text.
+
+```python
+def fx_portrait_rain_quote(r, f, t, S):
+    """Portrait-optimized: matrix rain (long vertical trails) with stacked quote.
+    Designed for 1080x1920 (9:16)."""
+    g = r.get_grid("md")  # ~112x100 in portrait
+
+    # Matrix rain — long trails benefit from portrait's extra rows
+    ch, co, S = eff_matrix_rain(g, f, t, S,
+        hue=0.33, bri=0.6, pal=PAL_KATA, speed_base=0.4, speed_beat=2.5)
+    canvas_rain = g.render(ch, co)
+
+    # Tunnel depth underneath for texture
+    canvas_tunnel = _render_vf(r, "sm",
+        lambda g, f, t, S: vf_tunnel(g, f, t, S, speed=3.0, complexity=6) * 0.8,
+        hf_fixed(0.33), PAL_BLOCKS, f, t, S, sat=0.5)
+
+    result = blend_canvas(canvas_tunnel, canvas_rain, "screen", 0.8)
+
+    # Quote text — portrait layout: short lines, many of them
+    g_text = r.get_grid("lg")  # ~90x80 in portrait
+    quote_lines = layout_text_portrait(
+        "The code is the art and the art is the code",
+        max_chars_per_line=20)
+    # Center vertically
+    block_start = (g_text.rows - len(quote_lines)) // 2
+    ch_t = np.full((g_text.rows, g_text.cols), " ", dtype="U1")
+    co_t = np.zeros((g_text.rows, g_text.cols, 3), dtype=np.uint8)
+    total_chars = sum(len(l) for l in quote_lines)
+    progress = min(1.0, (t - S.get("_scene_start", t)) / 3.0)
+    if "_scene_start" not in S: S["_scene_start"] = t
+    render_typewriter(ch_t, co_t, quote_lines, block_start, g_text.cols,
+                      progress, total_chars, (200, 255, 220), t)
+    canvas_text = g_text.render(ch_t, co_t)
+
+    result = blend_canvas(result, canvas_text, "add", 0.9)
+    return result
+```
+
+---
+
+### Scene Table Template
+
+Wire scenes into a complete video:
+
+```python
+SCENES = [
+    {"start": 0.0,  "end": 5.0,  "name": "coral",
+     "fx": fx_coral, "grid": "sm", "gamma": 0.70,
+     "shaders": [("bloom", {"thr": 110}), ("vignette", {"s": 0.2})],
+     "feedback": {"decay": 0.8, "blend": "screen", "opacity": 0.3,
+                  "transform": "zoom", "transform_amt": 0.01}},
+
+    {"start": 5.0,  "end": 15.0, "name": "tunnel_noise",
+     "fx": fx_tunnel_noise, "grid": "md", "gamma": 0.75,
+     "shaders": [("chromatic", {"amt": 3}), ("bloom", {"thr": 120}),
+                 ("scanlines", {"intensity": 0.06}), ("grain", {"amt": 8})],
+     "feedback": None},
+
+    {"start": 15.0, "end": 35.0, "name": "cathedral",
+     "fx": fx_cathedral, "grid": "sm", "gamma": 0.65,
+     "shaders": [("bloom", {"thr": 100}), ("chromatic", {"amt": 5}),
+                 ("color_wobble", {"amt": 0.2}), ("vignette", {"s": 0.18})],
+     "feedback": {"decay": 0.75, "blend": "screen", "opacity": 0.35,
+                  "transform": "zoom", "transform_amt": 0.012, "hue_shift": 0.015}},
+
+    {"start": 35.0, "end": 50.0, "name": "morphing",
+     "fx": fx_morphing_journey, "grid": "md", "gamma": 0.70,
+     "shaders": [("bloom", {"thr": 110}), ("grain", {"amt": 6})],
+     "feedback": {"decay": 0.7, "blend": "screen", "opacity": 0.25,
+                  "transform": "rotate_cw", "transform_amt": 0.003}},
+]
+```
@@ -2,14 +2,9 @@

 Post-processing effects applied to the pixel canvas (`numpy uint8 array, shape (H,W,3)`) after character rendering and before encoding. Also covers **pixel-level blend modes**, **feedback buffers**, and the **ShaderChain** compositor.

-**Cross-references:**
- Grid system, palettes, color (HSV + OKLAB): `architecture.md`
- Effect building blocks (value fields, noise, SDFs): `effects.md`
- `_render_vf()`, blend modes, tonemap, masking: `composition.md`
- Scene protocol, render_clip, SCENES table: `scenes.md`
- Complete scene examples with shader usage: `examples.md`
- Performance tuning (frame budget, worker count): `optimization.md`
- Encoding pitfalls (ffmpeg flags, color space): `troubleshooting.md`
+> **See also:** composition.md (blend modes, tonemap) · effects.md · scenes.md · architecture.md · optimization.md · troubleshooting.md
+>
+> **Blend modes:** For the 20 pixel blend modes and `blend_canvas()`, see `composition.md`. All blending uses `blend_canvas(base, top, mode, opacity)`.

 ## Design Philosophy

@@ -1,14 +1,19 @@
 # Troubleshooting Reference

-**Cross-references:**
- Grid system, palettes, font selection: `architecture.md`
- Effect building blocks (value fields, noise, SDFs): `effects.md`
- `_render_vf()`, blend modes, tonemap: `composition.md`
- Scene protocol, render_clip, SCENES table: `scenes.md`
- Shader pipeline, feedback buffer, encoding: `shaders.md`
- Input sources (audio, video, TTS): `inputs.md`
- Performance tuning, hardware detection: `optimization.md`
- Complete scene examples: `examples.md`
+> **See also:** composition.md · architecture.md · shaders.md · scenes.md · optimization.md
+
+## Quick Diagnostic
+
+| Symptom | Likely Cause | Fix |
+|---------|-------------|-----|
+| All black output | tonemap gamma too high or no effects rendering | Lower gamma to 0.5, check scene_fn returns non-zero canvas |
+| Washed out / too bright | Linear brightness multiplier instead of tonemap | Replace `canvas * N` with `tonemap(canvas, gamma=0.75)` |
+| ffmpeg hangs mid-render | stderr=subprocess.PIPE deadlock | Redirect stderr to file |
+| "read-only" array error | broadcast_to view without .copy() | Add `.copy()` after broadcast_to |
+| PicklingError | Lambda or closure in SCENES table | Define all fx_* at module level |
+| Random dark holes in output | Font missing Unicode glyphs | Validate palettes at init |
+| Audio-visual desync | Frame timing accumulation | Use integer frame counter, compute t fresh each frame |
+| Single-color flat output | Hue field shape mismatch | Ensure h,s,v arrays all (rows,cols) before hsv2rgb |

 Common bugs, gotchas, and platform-specific issues encountered during ASCII video development.

@@ -339,3 +344,22 @@ val = np.clip(vf_plasma(g, f, t, S) * 1.5, 0, 1)
 ```

 The `_render_vf()` helper clips automatically, but if you're building custom scenes, clip explicitly.
+
+## Brightness Best Practices
+
+- Dense animated backgrounds — never flat black, always fill the grid
+- Vignette minimum clamped to 0.15 (not 0.12)
+- Bloom threshold 130 (not 170) so more pixels contribute to glow
+- Use `screen` blend mode (not `overlay`) for dark ASCII layers — overlay squares dark values: `2 * 0.12 * 0.12 = 0.03`
+- FeedbackBuffer decay minimum 0.5 — below that, feedback disappears too fast to see
+- Value field floor: `vf * 0.8 + 0.05` ensures no cell is truly zero
+- Per-scene gamma overrides: default 0.75, solarize 0.55, posterize 0.50, bright scenes 0.85
+- Test frames early: render single frames at key timestamps before committing to full render
+
+**Quick checklist before full render:**
+1. Render 3 test frames (start, middle, end)
+2. Check `canvas.mean() > 8` after tonemap
+3. Check no scene is visually flat black
+4. Verify per-section variation (different bg/palette/color per scene)
+5. Confirm shader chain includes bloom (threshold 130)
+6. Confirm vignette strength ≤ 0.25
@@ -114,6 +114,7 @@ curl -s "https://export.arxiv.org/api/query?id_list=2402.03300,2401.12345,2403.0

 After fetching metadata for a paper, generate a BibTeX entry:

+{% raw %}
 ```bash
 curl -s "https://export.arxiv.org/api/query?id_list=1706.03762" | python3 -c "
 import sys, xml.etree.ElementTree as ET
@@ -139,6 +140,7 @@ print(f'  url       = {{https://arxiv.org/abs/{raw_id}}}')
 print('}')
 "
 ```
+{% endraw %}

 ## Reading Paper Content

@@ -215,6 +215,7 @@ def generate_citation_key(bibtex: str) -> str:

 ### Complete Citation Manager Class

+{% raw %}
 ```python
 """
 Citation Manager - Verified citation workflow for ML papers.
@@ -377,6 +378,7 @@ if __name__ == "__main__":
    if bibtex:
        print(bibtex)
 ```
+{% endraw %}

 ### Quick Functions

@@ -295,3 +295,97 @@ class TestOnConnect:
        mock_conn = MagicMock(spec=acp.Client)
        agent.on_connect(mock_conn)
        assert agent._conn is mock_conn
+
+
+# ---------------------------------------------------------------------------
+# Slash commands
+# ---------------------------------------------------------------------------
+
+
+class TestSlashCommands:
+    """Test slash command dispatch in the ACP adapter."""
+
+    def _make_state(self, mock_manager):
+        state = mock_manager.create_session(cwd="/tmp")
+        state.agent.model = "test-model"
+        state.agent.provider = "openrouter"
+        state.model = "test-model"
+        return state
+
+    def test_help_lists_commands(self, agent, mock_manager):
+        state = self._make_state(mock_manager)
+        result = agent._handle_slash_command("/help", state)
+        assert result is not None
+        assert "/help" in result
+        assert "/model" in result
+        assert "/tools" in result
+        assert "/reset" in result
+
+    def test_model_shows_current(self, agent, mock_manager):
+        state = self._make_state(mock_manager)
+        result = agent._handle_slash_command("/model", state)
+        assert "test-model" in result
+
+    def test_context_empty(self, agent, mock_manager):
+        state = self._make_state(mock_manager)
+        state.history = []
+        result = agent._handle_slash_command("/context", state)
+        assert "empty" in result.lower()
+
+    def test_context_with_messages(self, agent, mock_manager):
+        state = self._make_state(mock_manager)
+        state.history = [
+            {"role": "user", "content": "hello"},
+            {"role": "assistant", "content": "hi"},
+        ]
+        result = agent._handle_slash_command("/context", state)
+        assert "2 messages" in result
+        assert "user: 1" in result
+
+    def test_reset_clears_history(self, agent, mock_manager):
+        state = self._make_state(mock_manager)
+        state.history = [{"role": "user", "content": "hello"}]
+        result = agent._handle_slash_command("/reset", state)
+        assert "cleared" in result.lower()
+        assert len(state.history) == 0
+
+    def test_version(self, agent, mock_manager):
+        state = self._make_state(mock_manager)
+        result = agent._handle_slash_command("/version", state)
+        assert HERMES_VERSION in result
+
+    def test_unknown_command_returns_none(self, agent, mock_manager):
+        state = self._make_state(mock_manager)
+        result = agent._handle_slash_command("/nonexistent", state)
+        assert result is None
+
+    @pytest.mark.asyncio
+    async def test_slash_command_intercepted_in_prompt(self, agent, mock_manager):
+        """Slash commands should be handled without calling the LLM."""
+        new_resp = await agent.new_session(cwd="/tmp")
+        mock_conn = AsyncMock(spec=acp.Client)
+        agent._conn = mock_conn
+
+        prompt = [TextContentBlock(type="text", text="/help")]
+        resp = await agent.prompt(prompt=prompt, session_id=new_resp.session_id)
+
+        assert resp.stop_reason == "end_turn"
+        mock_conn.session_update.assert_called_once()
+
+    @pytest.mark.asyncio
+    async def test_unknown_slash_falls_through_to_llm(self, agent, mock_manager):
+        """Unknown /commands should be sent to the LLM, not intercepted."""
+        new_resp = await agent.new_session(cwd="/tmp")
+        mock_conn = AsyncMock(spec=acp.Client)
+        agent._conn = mock_conn
+
+        # Mock run_in_executor to avoid actually running the agent
+        with patch("asyncio.get_running_loop") as mock_loop:
+            mock_loop.return_value.run_in_executor = AsyncMock(return_value={
+                "final_response": "I processed /foo",
+                "messages": [],
+            })
+            prompt = [TextContentBlock(type="text", text="/foo bar")]
+            resp = await agent.prompt(prompt=prompt, session_id=new_resp.session_id)
+
+        assert resp.stop_reason == "end_turn"
@@ -0,0 +1,61 @@
+from agent.smart_model_routing import choose_cheap_model_route
+
+
+_BASE_CONFIG = {
+    "enabled": True,
+    "cheap_model": {
+        "provider": "openrouter",
+        "model": "google/gemini-2.5-flash",
+    },
+}
+
+
+def test_returns_none_when_disabled():
+    cfg = {**_BASE_CONFIG, "enabled": False}
+    assert choose_cheap_model_route("what time is it in tokyo?", cfg) is None
+
+
+def test_routes_short_simple_prompt():
+    result = choose_cheap_model_route("what time is it in tokyo?", _BASE_CONFIG)
+    assert result is not None
+    assert result["provider"] == "openrouter"
+    assert result["model"] == "google/gemini-2.5-flash"
+    assert result["routing_reason"] == "simple_turn"
+
+
+def test_skips_long_prompt():
+    prompt = "please summarize this carefully " * 20
+    assert choose_cheap_model_route(prompt, _BASE_CONFIG) is None
+
+
+def test_skips_code_like_prompt():
+    prompt = "debug this traceback: ```python\nraise ValueError('bad')\n```"
+    assert choose_cheap_model_route(prompt, _BASE_CONFIG) is None
+
+
+def test_skips_tool_heavy_prompt_keywords():
+    prompt = "implement a patch for this docker error"
+    assert choose_cheap_model_route(prompt, _BASE_CONFIG) is None
+
+
+def test_resolve_turn_route_falls_back_to_primary_when_route_runtime_cannot_be_resolved(monkeypatch):
+    from agent.smart_model_routing import resolve_turn_route
+
+    monkeypatch.setattr(
+        "hermes_cli.runtime_provider.resolve_runtime_provider",
+        lambda **kwargs: (_ for _ in ()).throw(RuntimeError("bad route")),
+    )
+    result = resolve_turn_route(
+        "what time is it in tokyo?",
+        _BASE_CONFIG,
+        {
+            "model": "anthropic/claude-sonnet-4",
+            "provider": "openrouter",
+            "base_url": "https://openrouter.ai/api/v1",
+            "api_mode": "chat_completions",
+            "api_key": "sk-primary",
+        },
+    )
+    assert result["model"] == "anthropic/claude-sonnet-4"
+    assert result["runtime"]["provider"] == "openrouter"
+    assert result["label"] is None
@@ -26,6 +26,12 @@ def _isolate_hermes_home(tmp_path, monkeypatch):
    (fake_home / "memories").mkdir()
    (fake_home / "skills").mkdir()
    monkeypatch.setenv("HERMES_HOME", str(fake_home))
+    # Reset plugin singleton so tests don't leak plugins from ~/.hermes/plugins/
+    try:
+        import hermes_cli.plugins as _plugins_mod
+        monkeypatch.setattr(_plugins_mod, "_plugin_manager", None)
+    except Exception:
+        pass
    # Tests should not inherit the agent's current gateway/messaging surface.
    # Individual tests that need gateway behavior set these explicitly.
    monkeypatch.delenv("HERMES_SESSION_PLATFORM", raising=False)
@@ -304,17 +304,34 @@ class TestMarkJobRun:


 class TestGetDueJobs:
-    def test_past_due_returned(self, tmp_cron_dir):
+    def test_past_due_within_window_returned(self, tmp_cron_dir):
+        """Jobs less than 2 minutes late are still considered due (not stale)."""
        job = create_job(prompt="Due now", schedule="every 1h")
-        # Force next_run_at to the past
+        # Force next_run_at to just 1 minute ago (within the 2-min window)
        jobs = load_jobs()
-        jobs[0]["next_run_at"] = (datetime.now() - timedelta(minutes=5)).isoformat()
+        jobs[0]["next_run_at"] = (datetime.now() - timedelta(seconds=60)).isoformat()
        save_jobs(jobs)

        due = get_due_jobs()
        assert len(due) == 1
        assert due[0]["id"] == job["id"]

+    def test_stale_past_due_skipped(self, tmp_cron_dir):
+        """Recurring jobs more than 2 minutes late are fast-forwarded, not fired."""
+        job = create_job(prompt="Stale", schedule="every 1h")
+        # Force next_run_at to 5 minutes ago (beyond the 2-min window)
+        jobs = load_jobs()
+        jobs[0]["next_run_at"] = (datetime.now() - timedelta(minutes=5)).isoformat()
+        save_jobs(jobs)
+
+        due = get_due_jobs()
+        assert len(due) == 0
+        # next_run_at should be fast-forwarded to the future
+        updated = get_job(job["id"])
+        from cron.jobs import _ensure_aware, _hermes_now
+        next_dt = _ensure_aware(datetime.fromisoformat(updated["next_run_at"]))
+        assert next_dt > _hermes_now()
+
    def test_future_not_returned(self, tmp_cron_dir):
        create_job(prompt="Not yet", schedule="every 1h")
        due = get_due_jobs()
@@ -65,6 +65,14 @@ class TestHandleBackgroundCommand:
        assert "Usage:" in result
        assert "/background" in result

+    @pytest.mark.asyncio
+    async def test_bg_alias_no_prompt_shows_usage(self):
+        """Running /bg with no prompt shows usage."""
+        runner = _make_runner()
+        event = _make_event(text="/bg")
+        result = await runner._handle_background_command(event)
+        assert "Usage:" in result
+
    @pytest.mark.asyncio
    async def test_empty_prompt_shows_usage(self):
        """Running /background with only whitespace shows usage."""
@@ -264,11 +272,14 @@ class TestBackgroundInHelp:
        assert "/background" in result

    def test_background_is_known_command(self):
-        """The /background command is in the _known_commands set."""
-        from gateway.run import GatewayRunner
-        import inspect
-        source = inspect.getsource(GatewayRunner._handle_message)
-        assert '"background"' in source
+        """The /background command is in GATEWAY_KNOWN_COMMANDS."""
+        from hermes_cli.commands import GATEWAY_KNOWN_COMMANDS
+        assert "background" in GATEWAY_KNOWN_COMMANDS
+
+    def test_bg_alias_is_known_command(self):
+        """The /bg alias is in GATEWAY_KNOWN_COMMANDS."""
+        from hermes_cli.commands import GATEWAY_KNOWN_COMMANDS
+        assert "bg" in GATEWAY_KNOWN_COMMANDS


 # ---------------------------------------------------------------------------
@@ -284,6 +295,11 @@ class TestBackgroundInCLICommands:
        from hermes_cli.commands import COMMANDS
        assert "/background" in COMMANDS

+    def test_bg_alias_in_commands_dict(self):
+        """The /bg alias is in the COMMANDS dict."""
+        from hermes_cli.commands import COMMANDS
+        assert "/bg" in COMMANDS
+
    def test_background_in_session_category(self):
        """The /background command is in the Session category."""
        from hermes_cli.commands import COMMANDS_BY_CATEGORY
@@ -83,6 +83,14 @@ class TestSessionResetPolicy:
        assert policy.at_hour == 4
        assert policy.idle_minutes == 1440

+    def test_from_dict_treats_null_values_as_defaults(self):
+        restored = SessionResetPolicy.from_dict(
+            {"mode": None, "at_hour": None, "idle_minutes": None}
+        )
+        assert restored.mode == "both"
+        assert restored.at_hour == 4
+        assert restored.idle_minutes == 1440
+

 class TestGatewayConfigRoundtrip:
    def test_full_roundtrip(self):
@@ -96,6 +104,7 @@ class TestGatewayConfigRoundtrip:
            },
            reset_triggers=["/new"],
            quick_commands={"limits": {"type": "exec", "command": "echo ok"}},
+            group_sessions_per_user=False,
        )
        d = config.to_dict()
        restored = GatewayConfig.from_dict(d)
@@ -104,6 +113,7 @@ class TestGatewayConfigRoundtrip:
        assert restored.platforms[Platform.TELEGRAM].token == "tok_123"
        assert restored.reset_triggers == ["/new"]
        assert restored.quick_commands == {"limits": {"type": "exec", "command": "echo ok"}}
+        assert restored.group_sessions_per_user is False


 class TestLoadGatewayConfig:
@@ -125,6 +135,18 @@ class TestLoadGatewayConfig:

        assert config.quick_commands == {"limits": {"type": "exec", "command": "echo ok"}}

+    def test_bridges_group_sessions_per_user_from_config_yaml(self, tmp_path, monkeypatch):
+        hermes_home = tmp_path / ".hermes"
+        hermes_home.mkdir()
+        config_path = hermes_home / "config.yaml"
+        config_path.write_text("group_sessions_per_user: false\n", encoding="utf-8")
+
+        monkeypatch.setenv("HERMES_HOME", str(hermes_home))
+
+        config = load_gateway_config()
+
+        assert config.group_sessions_per_user is False
+
    def test_invalid_quick_commands_in_config_yaml_are_ignored(self, tmp_path, monkeypatch):
        hermes_home = tmp_path / ".hermes"
        hermes_home.mkdir()
@@ -90,6 +90,7 @@ class TestGatewayHonchoLifecycle:
        runner = _make_runner()
        event = _make_event()
        runner._shutdown_gateway_honcho = MagicMock()
+        runner._async_flush_memories = AsyncMock()
        runner.session_store = MagicMock()
        runner.session_store._generate_session_key.return_value = "gateway-key"
        runner.session_store._entries = {
@@ -100,4 +101,31 @@ class TestGatewayHonchoLifecycle:
        result = await runner._handle_reset_command(event)

        runner._shutdown_gateway_honcho.assert_called_once_with("gateway-key")
+        runner._async_flush_memories.assert_called_once_with("old-session", "gateway-key")
        assert "Session reset" in result
+
+    def test_flush_memories_reuses_gateway_session_key_and_skips_honcho_sync(self):
+        runner = _make_runner()
+        runner.session_store = MagicMock()
+        runner.session_store.load_transcript.return_value = [
+            {"role": "user", "content": "a"},
+            {"role": "assistant", "content": "b"},
+            {"role": "user", "content": "c"},
+            {"role": "assistant", "content": "d"},
+        ]
+        tmp_agent = MagicMock()
+
+        with (
+            patch("gateway.run._resolve_runtime_agent_kwargs", return_value={"api_key": "test-key"}),
+            patch("gateway.run._resolve_gateway_model", return_value="model-name"),
+            patch("run_agent.AIAgent", return_value=tmp_agent) as mock_agent_cls,
+        ):
+            runner._flush_memories_for_session("old-session", "gateway-key")
+
+        mock_agent_cls.assert_called_once()
+        _, kwargs = mock_agent_cls.call_args
+        assert kwargs["session_id"] == "old-session"
+        assert kwargs["honcho_session_key"] == "gateway-key"
+        tmp_agent.run_conversation.assert_called_once()
+        _, run_kwargs = tmp_agent.run_conversation.call_args
+        assert run_kwargs["sync_honcho"] is False
@@ -1,25 +0,0 @@
-from unittest.mock import patch
-
-import pytest
-
-
-@pytest.mark.asyncio
-async def test_image_enrichment_uses_athabasca_upload_guidance_without_stale_r2_warning():
-    from gateway.run import GatewayRunner
-
-    runner = object.__new__(GatewayRunner)
-
-    with patch(
-        "tools.vision_tools.vision_analyze_tool",
-        return_value='{"success": true, "analysis": "A painted serpent warrior."}',
-    ):
-        enriched = await runner._enrich_message_with_vision(
-            "caption",
-            ["/tmp/test.jpg"],
-        )
-
-    assert "R2 not configured" not in enriched
-    assert "Gateway media URL available for reference" not in enriched
-    assert "POST /api/uploads" in enriched
-    assert "Do not store the local cache path" in enriched
-    assert "caption" in enriched
@@ -0,0 +1,156 @@
+"""Tests for PII redaction in gateway session context prompts."""
+
+from gateway.session import (
+    SessionContext,
+    SessionSource,
+    build_session_context_prompt,
+    _hash_id,
+    _hash_sender_id,
+    _hash_chat_id,
+    _looks_like_phone,
+)
+from gateway.config import Platform, HomeChannel
+
+
+# ---------------------------------------------------------------------------
+# Low-level helpers
+# ---------------------------------------------------------------------------
+
+class TestHashHelpers:
+    def test_hash_id_deterministic(self):
+        assert _hash_id("12345") == _hash_id("12345")
+
+    def test_hash_id_12_hex_chars(self):
+        h = _hash_id("user-abc")
+        assert len(h) == 12
+        assert all(c in "0123456789abcdef" for c in h)
+
+    def test_hash_sender_id_prefix(self):
+        assert _hash_sender_id("12345").startswith("user_")
+        assert len(_hash_sender_id("12345")) == 17  # "user_" + 12
+
+    def test_hash_chat_id_preserves_prefix(self):
+        result = _hash_chat_id("telegram:12345")
+        assert result.startswith("telegram:")
+        assert "12345" not in result
+
+    def test_hash_chat_id_no_prefix(self):
+        result = _hash_chat_id("12345")
+        assert len(result) == 12
+        assert "12345" not in result
+
+    def test_looks_like_phone(self):
+        assert _looks_like_phone("+15551234567")
+        assert _looks_like_phone("15551234567")
+        assert _looks_like_phone("+1-555-123-4567")
+        assert not _looks_like_phone("alice")
+        assert not _looks_like_phone("user-123")
+        assert not _looks_like_phone("")
+
+
+# ---------------------------------------------------------------------------
+# Integration: build_session_context_prompt
+# ---------------------------------------------------------------------------
+
+def _make_context(
+    user_id="user-123",
+    user_name=None,
+    chat_id="telegram:99999",
+    platform=Platform.TELEGRAM,
+    home_channels=None,
+):
+    source = SessionSource(
+        platform=platform,
+        chat_id=chat_id,
+        chat_type="dm",
+        user_id=user_id,
+        user_name=user_name,
+    )
+    return SessionContext(
+        source=source,
+        connected_platforms=[platform],
+        home_channels=home_channels or {},
+    )
+
+
+class TestBuildSessionContextPromptRedaction:
+    def test_no_redaction_by_default(self):
+        ctx = _make_context(user_id="user-123")
+        prompt = build_session_context_prompt(ctx)
+        assert "user-123" in prompt
+
+    def test_user_id_hashed_when_redact_pii(self):
+        ctx = _make_context(user_id="user-123")
+        prompt = build_session_context_prompt(ctx, redact_pii=True)
+        assert "user-123" not in prompt
+        assert "user_" in prompt  # hashed ID present
+
+    def test_user_name_not_redacted(self):
+        ctx = _make_context(user_id="user-123", user_name="Alice")
+        prompt = build_session_context_prompt(ctx, redact_pii=True)
+        assert "Alice" in prompt
+        # user_id should not appear when user_name is present (name takes priority)
+        assert "user-123" not in prompt
+
+    def test_home_channel_id_hashed(self):
+        hc = {
+            Platform.TELEGRAM: HomeChannel(
+                platform=Platform.TELEGRAM,
+                chat_id="telegram:99999",
+                name="Home Chat",
+            )
+        }
+        ctx = _make_context(home_channels=hc)
+        prompt = build_session_context_prompt(ctx, redact_pii=True)
+        assert "99999" not in prompt
+        assert "telegram:" in prompt  # prefix preserved
+        assert "Home Chat" in prompt  # name not redacted
+
+    def test_home_channel_id_preserved_without_redaction(self):
+        hc = {
+            Platform.TELEGRAM: HomeChannel(
+                platform=Platform.TELEGRAM,
+                chat_id="telegram:99999",
+                name="Home Chat",
+            )
+        }
+        ctx = _make_context(home_channels=hc)
+        prompt = build_session_context_prompt(ctx, redact_pii=False)
+        assert "99999" in prompt
+
+    def test_redaction_is_deterministic(self):
+        ctx = _make_context(user_id="+15551234567")
+        prompt1 = build_session_context_prompt(ctx, redact_pii=True)
+        prompt2 = build_session_context_prompt(ctx, redact_pii=True)
+        assert prompt1 == prompt2
+
+    def test_different_ids_produce_different_hashes(self):
+        ctx1 = _make_context(user_id="user-A")
+        ctx2 = _make_context(user_id="user-B")
+        p1 = build_session_context_prompt(ctx1, redact_pii=True)
+        p2 = build_session_context_prompt(ctx2, redact_pii=True)
+        assert p1 != p2
+
+    def test_discord_ids_not_redacted_even_with_flag(self):
+        """Discord needs real IDs for <@user_id> mentions."""
+        ctx = _make_context(user_id="123456789", platform=Platform.DISCORD)
+        prompt = build_session_context_prompt(ctx, redact_pii=True)
+        assert "123456789" in prompt
+
+    def test_whatsapp_ids_redacted(self):
+        ctx = _make_context(user_id="+15551234567", platform=Platform.WHATSAPP)
+        prompt = build_session_context_prompt(ctx, redact_pii=True)
+        assert "+15551234567" not in prompt
+        assert "user_" in prompt
+
+    def test_signal_ids_redacted(self):
+        ctx = _make_context(user_id="+15551234567", platform=Platform.SIGNAL)
+        prompt = build_session_context_prompt(ctx, redact_pii=True)
+        assert "+15551234567" not in prompt
+        assert "user_" in prompt
+
+    def test_slack_ids_not_redacted(self):
+        """Slack may need IDs for mentions too."""
+        ctx = _make_context(user_id="U12345ABC", platform=Platform.SLACK)
+        prompt = build_session_context_prompt(ctx, redact_pii=True)
+        assert "U12345ABC" in prompt
@@ -199,3 +199,28 @@ class TestHandleResumeCommand:

        assert real_key not in runner._running_agents
        db.close()
+
+    @pytest.mark.asyncio
+    async def test_resume_flushes_memories_with_gateway_session_key(self, tmp_path):
+        """Resume should preserve the gateway session key for Honcho flushes."""
+        from hermes_state import SessionDB
+
+        db = SessionDB(db_path=tmp_path / "state.db")
+        db.create_session("old_session", "telegram")
+        db.set_session_title("old_session", "Old Work")
+        db.create_session("current_session_001", "telegram")
+
+        event = _make_event(text="/resume Old Work")
+        runner = _make_runner(
+            session_db=db,
+            current_session_id="current_session_001",
+            event=event,
+        )
+
+        await runner._handle_resume_command(event)
+
+        runner._async_flush_memories.assert_called_once_with(
+            "current_session_001",
+            _session_key_for_event(event),
+        )
+        db.close()
@@ -0,0 +1,89 @@
+import pytest
+
+from gateway.config import GatewayConfig, Platform, PlatformConfig
+from gateway.platforms.base import BasePlatformAdapter
+from gateway.run import GatewayRunner
+from gateway.status import read_runtime_status
+
+
+class _RetryableFailureAdapter(BasePlatformAdapter):
+    def __init__(self):
+        super().__init__(PlatformConfig(enabled=True, token="***"), Platform.TELEGRAM)
+
+    async def connect(self) -> bool:
+        self._set_fatal_error(
+            "telegram_connect_error",
+            "Telegram startup failed: temporary DNS resolution failure.",
+            retryable=True,
+        )
+        return False
+
+    async def disconnect(self) -> None:
+        self._mark_disconnected()
+
+    async def send(self, chat_id, content, reply_to=None, metadata=None):
+        raise NotImplementedError
+
+    async def get_chat_info(self, chat_id):
+        return {"id": chat_id}
+
+
+class _DisabledAdapter(BasePlatformAdapter):
+    def __init__(self):
+        super().__init__(PlatformConfig(enabled=False, token="***"), Platform.TELEGRAM)
+
+    async def connect(self) -> bool:
+        raise AssertionError("connect should not be called for disabled platforms")
+
+    async def disconnect(self) -> None:
+        self._mark_disconnected()
+
+    async def send(self, chat_id, content, reply_to=None, metadata=None):
+        raise NotImplementedError
+
+    async def get_chat_info(self, chat_id):
+        return {"id": chat_id}
+
+
+@pytest.mark.asyncio
+async def test_runner_returns_failure_for_retryable_startup_errors(monkeypatch, tmp_path):
+    monkeypatch.setenv("HERMES_HOME", str(tmp_path))
+    config = GatewayConfig(
+        platforms={
+            Platform.TELEGRAM: PlatformConfig(enabled=True, token="***")
+        },
+        sessions_dir=tmp_path / "sessions",
+    )
+    runner = GatewayRunner(config)
+
+    monkeypatch.setattr(runner, "_create_adapter", lambda platform, platform_config: _RetryableFailureAdapter())
+
+    ok = await runner.start()
+
+    assert ok is False
+    assert runner.should_exit_cleanly is False
+    state = read_runtime_status()
+    assert state["gateway_state"] == "startup_failed"
+    assert "temporary DNS resolution failure" in state["exit_reason"]
+    assert state["platforms"]["telegram"]["state"] == "fatal"
+    assert state["platforms"]["telegram"]["error_code"] == "telegram_connect_error"
+
+
+@pytest.mark.asyncio
+async def test_runner_allows_cron_only_mode_when_no_platforms_are_enabled(monkeypatch, tmp_path):
+    monkeypatch.setenv("HERMES_HOME", str(tmp_path))
+    config = GatewayConfig(
+        platforms={
+            Platform.TELEGRAM: PlatformConfig(enabled=False, token="***")
+        },
+        sessions_dir=tmp_path / "sessions",
+    )
+    runner = GatewayRunner(config)
+
+    ok = await runner.start()
+
+    assert ok is True
+    assert runner.should_exit_cleanly is False
+    assert runner.adapters == {}
+    state = read_runtime_status()
+    assert state["gateway_state"] == "running"
@@ -369,6 +369,54 @@ class TestWhatsAppDMSessionKeyConsistency:
        )
        assert store._generate_session_key(source) == build_session_key(source)

+    def test_store_creates_distinct_group_sessions_per_user(self, store):
+        first = SessionSource(
+            platform=Platform.DISCORD,
+            chat_id="guild-123",
+            chat_type="group",
+            user_id="alice",
+            user_name="Alice",
+        )
+        second = SessionSource(
+            platform=Platform.DISCORD,
+            chat_id="guild-123",
+            chat_type="group",
+            user_id="bob",
+            user_name="Bob",
+        )
+
+        first_entry = store.get_or_create_session(first)
+        second_entry = store.get_or_create_session(second)
+
+        assert first_entry.session_key == "agent:main:discord:group:guild-123:alice"
+        assert second_entry.session_key == "agent:main:discord:group:guild-123:bob"
+        assert first_entry.session_id != second_entry.session_id
+
+    def test_store_shares_group_sessions_when_disabled_in_config(self, store):
+        store.config.group_sessions_per_user = False
+
+        first = SessionSource(
+            platform=Platform.DISCORD,
+            chat_id="guild-123",
+            chat_type="group",
+            user_id="alice",
+            user_name="Alice",
+        )
+        second = SessionSource(
+            platform=Platform.DISCORD,
+            chat_id="guild-123",
+            chat_type="group",
+            user_id="bob",
+            user_name="Bob",
+        )
+
+        first_entry = store.get_or_create_session(first)
+        second_entry = store.get_or_create_session(second)
+
+        assert first_entry.session_key == "agent:main:discord:group:guild-123"
+        assert second_entry.session_key == "agent:main:discord:group:guild-123"
+        assert first_entry.session_id == second_entry.session_id
+
    def test_telegram_dm_includes_chat_id(self):
        """Non-WhatsApp DMs should also include chat_id to separate users."""
        source = SessionSource(
@@ -398,6 +446,41 @@ class TestWhatsAppDMSessionKeyConsistency:
        key = build_session_key(source)
        assert key == "agent:main:discord:group:guild-123"

+    def test_group_sessions_are_isolated_per_user_when_user_id_present(self):
+        first = SessionSource(
+            platform=Platform.DISCORD,
+            chat_id="guild-123",
+            chat_type="group",
+            user_id="alice",
+        )
+        second = SessionSource(
+            platform=Platform.DISCORD,
+            chat_id="guild-123",
+            chat_type="group",
+            user_id="bob",
+        )
+
+        assert build_session_key(first) == "agent:main:discord:group:guild-123:alice"
+        assert build_session_key(second) == "agent:main:discord:group:guild-123:bob"
+        assert build_session_key(first) != build_session_key(second)
+
+    def test_group_sessions_can_be_shared_when_isolation_disabled(self):
+        first = SessionSource(
+            platform=Platform.DISCORD,
+            chat_id="guild-123",
+            chat_type="group",
+            user_id="alice",
+        )
+        second = SessionSource(
+            platform=Platform.DISCORD,
+            chat_id="guild-123",
+            chat_type="group",
+            user_id="bob",
+        )
+
+        assert build_session_key(first, group_sessions_per_user=False) == "agent:main:discord:group:guild-123"
+        assert build_session_key(second, group_sessions_per_user=False) == "agent:main:discord:group:guild-123"
+
    def test_group_thread_includes_thread_id(self):
        """Forum-style threads need a distinct session key within one group."""
        source = SessionSource(
@@ -409,6 +492,17 @@ class TestWhatsAppDMSessionKeyConsistency:
        key = build_session_key(source)
        assert key == "agent:main:telegram:group:-1002285219667:17585"

+    def test_group_thread_sessions_are_isolated_per_user(self):
+        source = SessionSource(
+            platform=Platform.TELEGRAM,
+            chat_id="-1002285219667",
+            chat_type="group",
+            thread_id="17585",
+            user_id="42",
+        )
+        key = build_session_key(source)
+        assert key == "agent:main:telegram:group:-1002285219667:17585:42"
+

 class TestSessionStoreEntriesAttribute:
    """Regression: /reset must access _entries, not _sessions."""
@@ -0,0 +1,81 @@
+"""Tests for SSL certificate auto-detection in gateway/run.py."""
+
+import importlib
+import os
+from unittest.mock import patch, MagicMock
+
+
+def _load_ensure_ssl():
+    """Import _ensure_ssl_certs fresh (gateway/run.py has heavy deps, so we
+    extract just the function source to avoid importing the whole gateway)."""
+    # We can test via the actual module since conftest isolates HERMES_HOME,
+    # but we need to be careful about side effects.  Instead, replicate the
+    # logic in a controlled way.
+    from types import ModuleType
+    import textwrap, ssl as _ssl  # noqa: F401
+
+    code = textwrap.dedent("""\
+    import os, ssl
+
+    def _ensure_ssl_certs():
+        if "SSL_CERT_FILE" in os.environ:
+            return
+        paths = ssl.get_default_verify_paths()
+        for candidate in (paths.cafile, paths.openssl_cafile):
+            if candidate and os.path.exists(candidate):
+                os.environ["SSL_CERT_FILE"] = candidate
+                return
+        try:
+            import certifi
+            os.environ["SSL_CERT_FILE"] = certifi.where()
+            return
+        except ImportError:
+            pass
+        for candidate in (
+            "/etc/ssl/certs/ca-certificates.crt",
+            "/etc/ssl/cert.pem",
+        ):
+            if os.path.exists(candidate):
+                os.environ["SSL_CERT_FILE"] = candidate
+                return
+    """)
+    mod = ModuleType("_ssl_helper")
+    exec(code, mod.__dict__)
+    return mod._ensure_ssl_certs
+
+
+class TestEnsureSslCerts:
+    def test_respects_existing_env_var(self):
+        fn = _load_ensure_ssl()
+        with patch.dict(os.environ, {"SSL_CERT_FILE": "/custom/ca.pem"}):
+            fn()
+            assert os.environ["SSL_CERT_FILE"] == "/custom/ca.pem"
+
+    def test_sets_from_ssl_default_paths(self, tmp_path):
+        fn = _load_ensure_ssl()
+        cert = tmp_path / "ca.crt"
+        cert.write_text("FAKE CERT")
+
+        mock_paths = MagicMock()
+        mock_paths.cafile = str(cert)
+        mock_paths.openssl_cafile = None
+
+        env = {k: v for k, v in os.environ.items() if k != "SSL_CERT_FILE"}
+        with patch.dict(os.environ, env, clear=True), \
+             patch("ssl.get_default_verify_paths", return_value=mock_paths):
+            fn()
+            assert os.environ.get("SSL_CERT_FILE") == str(cert)
+
+    def test_no_op_when_nothing_found(self):
+        fn = _load_ensure_ssl()
+        mock_paths = MagicMock()
+        mock_paths.cafile = None
+        mock_paths.openssl_cafile = None
+
+        env = {k: v for k, v in os.environ.items() if k != "SSL_CERT_FILE"}
+        with patch.dict(os.environ, env, clear=True), \
+             patch("ssl.get_default_verify_paths", return_value=mock_paths), \
+             patch("os.path.exists", return_value=False), \
+             patch.dict("sys.modules", {"certifi": None}):
+            fn()
+            assert "SSL_CERT_FILE" not in os.environ
@@ -26,6 +26,22 @@ class TestGatewayPidState:
        assert status.get_running_pid() is None
        assert not pid_path.exists()

+    def test_get_running_pid_accepts_gateway_metadata_when_cmdline_unavailable(self, tmp_path, monkeypatch):
+        monkeypatch.setenv("HERMES_HOME", str(tmp_path))
+        pid_path = tmp_path / "gateway.pid"
+        pid_path.write_text(json.dumps({
+            "pid": os.getpid(),
+            "kind": "hermes-gateway",
+            "argv": ["python", "-m", "hermes_cli.main", "gateway"],
+            "start_time": 123,
+        }))
+
+        monkeypatch.setattr(status.os, "kill", lambda pid, sig: None)
+        monkeypatch.setattr(status, "_get_process_start_time", lambda pid: 123)
+        monkeypatch.setattr(status, "_read_process_cmdline", lambda pid: None)
+
+        assert status.get_running_pid() == os.getpid()
+

 class TestGatewayRuntimeStatus:
    def test_write_runtime_status_records_platform_failure(self, tmp_path, monkeypatch):
@@ -51,3 +51,27 @@ async def test_enrich_message_with_transcription_skips_when_stt_disabled():

    assert "transcription is disabled" in result.lower()
    assert "caption" in result
+
+
+@pytest.mark.asyncio
+async def test_enrich_message_with_transcription_avoids_bogus_no_provider_message_for_backend_key_errors():
+    from gateway.run import GatewayRunner
+
+    runner = GatewayRunner.__new__(GatewayRunner)
+    runner.config = GatewayConfig(stt_enabled=True)
+
+    with patch(
+        "tools.transcription_tools.transcribe_audio",
+        return_value={"success": False, "error": "VOICE_TOOLS_OPENAI_KEY not set"},
+    ), patch(
+        "tools.transcription_tools.get_stt_model_from_config",
+        return_value=None,
+    ):
+        result = await runner._enrich_message_with_transcription(
+            "caption",
+            ["/tmp/voice.ogg"],
+        )
+
+    assert "No STT provider is configured" not in result
+    assert "trouble transcribing" in result
+    assert "caption" in result
@@ -100,6 +100,39 @@ async def test_polling_conflict_stops_polling_and_notifies_handler(monkeypatch):
    fatal_handler.assert_awaited_once()


+@pytest.mark.asyncio
+async def test_connect_marks_retryable_fatal_error_for_startup_network_failure(monkeypatch):
+    adapter = TelegramAdapter(PlatformConfig(enabled=True, token="***"))
+
+    monkeypatch.setattr(
+        "gateway.status.acquire_scoped_lock",
+        lambda scope, identity, metadata=None: (True, None),
+    )
+    monkeypatch.setattr(
+        "gateway.status.release_scoped_lock",
+        lambda scope, identity: None,
+    )
+
+    builder = MagicMock()
+    builder.token.return_value = builder
+    app = SimpleNamespace(
+        bot=SimpleNamespace(),
+        updater=SimpleNamespace(),
+        add_handler=MagicMock(),
+        initialize=AsyncMock(side_effect=RuntimeError("Temporary failure in name resolution")),
+        start=AsyncMock(),
+    )
+    builder.build.return_value = app
+    monkeypatch.setattr("gateway.platforms.telegram.Application", SimpleNamespace(builder=MagicMock(return_value=builder)))
+
+    ok = await adapter.connect()
+
+    assert ok is False
+    assert adapter.fatal_error_code == "telegram_connect_error"
+    assert adapter.fatal_error_retryable is True
+    assert "Temporary failure in name resolution" in adapter.fatal_error_message
+
+
@pytest.mark.asyncio
 async def test_disconnect_skips_inactive_updater_and_app(monkeypatch):
    adapter = TelegramAdapter(PlatformConfig(enabled=True, token="***"))
@@ -475,16 +475,15 @@ class TestDiscordPlayTtsSkip:
 class TestVoiceInHelp:

    def test_voice_in_help_output(self):
-        from gateway.run import GatewayRunner
-        import inspect
-        source = inspect.getsource(GatewayRunner._handle_help_command)
-        assert "/voice" in source
+        """The gateway help text includes /voice (generated from registry)."""
+        from hermes_cli.commands import gateway_help_lines
+        help_text = "\n".join(gateway_help_lines())
+        assert "/voice" in help_text

    def test_voice_is_known_command(self):
-        from gateway.run import GatewayRunner
-        import inspect
-        source = inspect.getsource(GatewayRunner._handle_message)
-        assert '"voice"' in source
+        """The /voice command is in GATEWAY_KNOWN_COMMANDS."""
+        from hermes_cli.commands import GATEWAY_KNOWN_COMMANDS
+        assert "voice" in GATEWAY_KNOWN_COMMANDS


 # =====================================================================
@@ -1,19 +1,20 @@
-"""Tests for shared slash command definitions and autocomplete."""
+"""Tests for the central command registry and autocomplete."""

 from prompt_toolkit.completion import CompleteEvent
 from prompt_toolkit.document import Document

-from hermes_cli.commands import COMMANDS, SlashCommandCompleter
-
-
-# All commands that must be present in the shared COMMANDS dict.
-EXPECTED_COMMANDS = {
-    "/help", "/tools", "/toolsets", "/model", "/provider", "/prompt",
-    "/personality", "/clear", "/history", "/new", "/reset", "/retry",
-    "/undo", "/save", "/config", "/cron", "/skills", "/platforms",
-    "/verbose", "/reasoning", "/compress", "/title", "/usage", "/insights", "/paste",
-    "/reload-mcp", "/rollback", "/background", "/skin", "/voice", "/quit",
-}
+from hermes_cli.commands import (
+    COMMAND_REGISTRY,
+    COMMANDS,
+    COMMANDS_BY_CATEGORY,
+    CommandDef,
+    GATEWAY_KNOWN_COMMANDS,
+    SlashCommandCompleter,
+    gateway_help_lines,
+    resolve_command,
+    slack_subcommand_map,
+    telegram_bot_commands,
+)


 def _completions(completer: SlashCommandCompleter, text: str):
@@ -25,21 +26,200 @@ def _completions(completer: SlashCommandCompleter, text: str):
    )


-class TestCommands:
-    def test_shared_commands_include_cli_specific_entries(self):
-        """Entries that previously only existed in cli.py are now in the shared dict."""
-        assert COMMANDS["/paste"] == "Check clipboard for an image and attach it"
-        assert COMMANDS["/reload-mcp"] == "Reload MCP servers from config.yaml"
+# ---------------------------------------------------------------------------
+# CommandDef registry tests
+# ---------------------------------------------------------------------------

-    def test_all_expected_commands_present(self):
-        """Regression guard — every known command must appear in the shared dict."""
-        assert set(COMMANDS.keys()) == EXPECTED_COMMANDS
+class TestCommandRegistry:
+    def test_registry_is_nonempty(self):
+        assert len(COMMAND_REGISTRY) > 30
+
+    def test_every_entry_is_commanddef(self):
+        for entry in COMMAND_REGISTRY:
+            assert isinstance(entry, CommandDef), f"Unexpected type: {type(entry)}"
+
+    def test_no_duplicate_canonical_names(self):
+        names = [cmd.name for cmd in COMMAND_REGISTRY]
+        assert len(names) == len(set(names)), f"Duplicate names: {[n for n in names if names.count(n) > 1]}"
+
+    def test_no_alias_collides_with_canonical_name(self):
+        """An alias must not shadow another command's canonical name."""
+        canonical_names = {cmd.name for cmd in COMMAND_REGISTRY}
+        for cmd in COMMAND_REGISTRY:
+            for alias in cmd.aliases:
+                if alias in canonical_names:
+                    # reset -> new is intentional (reset IS an alias for new)
+                    target = next(c for c in COMMAND_REGISTRY if c.name == alias)
+                    # This should only happen if the alias points to the same entry
+                    assert resolve_command(alias).name == cmd.name or alias == cmd.name, \
+                        f"Alias '{alias}' of '{cmd.name}' shadows canonical '{target.name}'"
+
+    def test_every_entry_has_valid_category(self):
+        valid_categories = {"Session", "Configuration", "Tools & Skills", "Info", "Exit"}
+        for cmd in COMMAND_REGISTRY:
+            assert cmd.category in valid_categories, f"{cmd.name} has invalid category '{cmd.category}'"
+
+    def test_cli_only_and_gateway_only_are_mutually_exclusive(self):
+        for cmd in COMMAND_REGISTRY:
+            assert not (cmd.cli_only and cmd.gateway_only), \
+                f"{cmd.name} cannot be both cli_only and gateway_only"
+
+
+# ---------------------------------------------------------------------------
+# resolve_command tests
+# ---------------------------------------------------------------------------
+
+class TestResolveCommand:
+    def test_canonical_name_resolves(self):
+        assert resolve_command("help").name == "help"
+        assert resolve_command("background").name == "background"
+
+    def test_alias_resolves_to_canonical(self):
+        assert resolve_command("bg").name == "background"
+        assert resolve_command("reset").name == "new"
+        assert resolve_command("q").name == "quit"
+        assert resolve_command("exit").name == "quit"
+        assert resolve_command("gateway").name == "platforms"
+        assert resolve_command("set-home").name == "sethome"
+        assert resolve_command("reload_mcp").name == "reload-mcp"
+
+    def test_leading_slash_stripped(self):
+        assert resolve_command("/help").name == "help"
+        assert resolve_command("/bg").name == "background"
+
+    def test_unknown_returns_none(self):
+        assert resolve_command("nonexistent") is None
+        assert resolve_command("") is None
+
+
+# ---------------------------------------------------------------------------
+# Derived dicts (backwards compat)
+# ---------------------------------------------------------------------------
+
+class TestDerivedDicts:
+    def test_commands_dict_excludes_gateway_only(self):
+        """gateway_only commands should NOT appear in the CLI COMMANDS dict."""
+        for cmd in COMMAND_REGISTRY:
+            if cmd.gateway_only:
+                assert f"/{cmd.name}" not in COMMANDS, \
+                    f"gateway_only command /{cmd.name} should not be in COMMANDS"
+
+    def test_commands_dict_includes_all_cli_commands(self):
+        for cmd in COMMAND_REGISTRY:
+            if not cmd.gateway_only:
+                assert f"/{cmd.name}" in COMMANDS, \
+                    f"/{cmd.name} missing from COMMANDS dict"
+
+    def test_commands_dict_includes_aliases(self):
+        assert "/bg" in COMMANDS
+        assert "/reset" in COMMANDS
+        assert "/q" in COMMANDS
+        assert "/exit" in COMMANDS
+        assert "/reload_mcp" in COMMANDS
+        assert "/gateway" in COMMANDS
+
+    def test_commands_by_category_covers_all_categories(self):
+        registry_categories = {cmd.category for cmd in COMMAND_REGISTRY if not cmd.gateway_only}
+        assert set(COMMANDS_BY_CATEGORY.keys()) == registry_categories

    def test_every_command_has_nonempty_description(self):
        for cmd, desc in COMMANDS.items():
            assert isinstance(desc, str) and len(desc) > 0, f"{cmd} has empty description"


+# ---------------------------------------------------------------------------
+# Gateway helpers
+# ---------------------------------------------------------------------------
+
+class TestGatewayKnownCommands:
+    def test_excludes_cli_only(self):
+        for cmd in COMMAND_REGISTRY:
+            if cmd.cli_only:
+                assert cmd.name not in GATEWAY_KNOWN_COMMANDS, \
+                    f"cli_only command '{cmd.name}' should not be in GATEWAY_KNOWN_COMMANDS"
+
+    def test_includes_gateway_commands(self):
+        for cmd in COMMAND_REGISTRY:
+            if not cmd.cli_only:
+                assert cmd.name in GATEWAY_KNOWN_COMMANDS
+                for alias in cmd.aliases:
+                    assert alias in GATEWAY_KNOWN_COMMANDS
+
+    def test_bg_alias_in_gateway(self):
+        assert "bg" in GATEWAY_KNOWN_COMMANDS
+        assert "background" in GATEWAY_KNOWN_COMMANDS
+
+    def test_is_frozenset(self):
+        assert isinstance(GATEWAY_KNOWN_COMMANDS, frozenset)
+
+
+class TestGatewayHelpLines:
+    def test_returns_nonempty_list(self):
+        lines = gateway_help_lines()
+        assert len(lines) > 10
+
+    def test_excludes_cli_only_commands(self):
+        lines = gateway_help_lines()
+        joined = "\n".join(lines)
+        for cmd in COMMAND_REGISTRY:
+            if cmd.cli_only:
+                assert f"`/{cmd.name}" not in joined, \
+                    f"cli_only command /{cmd.name} should not be in gateway help"
+
+    def test_includes_alias_note_for_bg(self):
+        lines = gateway_help_lines()
+        bg_line = [l for l in lines if "/background" in l]
+        assert len(bg_line) == 1
+        assert "/bg" in bg_line[0]
+
+
+class TestTelegramBotCommands:
+    def test_returns_list_of_tuples(self):
+        cmds = telegram_bot_commands()
+        assert len(cmds) > 10
+        for name, desc in cmds:
+            assert isinstance(name, str)
+            assert isinstance(desc, str)
+
+    def test_no_hyphens_in_command_names(self):
+        """Telegram does not support hyphens in command names."""
+        for name, _ in telegram_bot_commands():
+            assert "-" not in name, f"Telegram command '{name}' contains a hyphen"
+
+    def test_excludes_cli_only(self):
+        names = {name for name, _ in telegram_bot_commands()}
+        for cmd in COMMAND_REGISTRY:
+            if cmd.cli_only:
+                tg_name = cmd.name.replace("-", "_")
+                assert tg_name not in names
+
+
+class TestSlackSubcommandMap:
+    def test_returns_dict(self):
+        mapping = slack_subcommand_map()
+        assert isinstance(mapping, dict)
+        assert len(mapping) > 10
+
+    def test_values_are_slash_prefixed(self):
+        for key, val in slack_subcommand_map().items():
+            assert val.startswith("/"), f"Slack mapping for '{key}' should start with /"
+
+    def test_includes_aliases(self):
+        mapping = slack_subcommand_map()
+        assert "bg" in mapping
+        assert "reset" in mapping
+
+    def test_excludes_cli_only(self):
+        mapping = slack_subcommand_map()
+        for cmd in COMMAND_REGISTRY:
+            if cmd.cli_only:
+                assert cmd.name not in mapping
+
+
+# ---------------------------------------------------------------------------
+# Autocomplete (SlashCommandCompleter)
+# ---------------------------------------------------------------------------
+
 class TestSlashCommandCompleter:
    # -- basic prefix completion -----------------------------------------

@@ -54,7 +234,7 @@ class TestSlashCommandCompleter:
    def test_builtin_completion_display_meta_shows_description(self):
        completions = _completions(SlashCommandCompleter(), "/help")
        assert len(completions) == 1
-        assert completions[0].display_meta_text == "Show this help message"
+        assert completions[0].display_meta_text == "Show available commands"

    # -- exact-match trailing space --------------------------------------

@@ -39,7 +39,7 @@ def test_systemd_status_warns_when_linger_disabled(monkeypatch, tmp_path, capsys
    monkeypatch.setattr(gateway, "get_systemd_linger_status", lambda: (False, ""))

    def fake_run(cmd, capture_output=False, text=False, check=False):
-        if cmd[:4] == ["systemctl", "--user", "status", gateway.SERVICE_NAME]:
+        if cmd[:4] == ["systemctl", "--user", "status", gateway.get_service_name()]:
            return SimpleNamespace(returncode=0, stdout="", stderr="")
        if cmd[:3] == ["systemctl", "--user", "is-active"]:
            return SimpleNamespace(returncode=0, stdout="active\n", stderr="")
@@ -76,7 +76,7 @@ def test_systemd_install_checks_linger_status(monkeypatch, tmp_path, capsys):
    assert unit_path.exists()
    assert [cmd for cmd, _ in calls] == [
        ["systemctl", "--user", "daemon-reload"],
-        ["systemctl", "--user", "enable", gateway.SERVICE_NAME],
+        ["systemctl", "--user", "enable", gateway.get_service_name()],
    ]
    assert helper_calls == [True]
    assert "User service installed and enabled" in out
@@ -110,7 +110,7 @@ def test_systemd_install_system_scope_skips_linger_and_uses_systemctl(monkeypatc
    assert unit_path.read_text(encoding="utf-8") == "scope=True user=alice\n"
    assert [cmd for cmd, _ in calls] == [
        ["systemctl", "daemon-reload"],
-        ["systemctl", "enable", gateway.SERVICE_NAME],
+        ["systemctl", "enable", gateway.get_service_name()],
    ]
    assert helper_calls == []
    assert "Configured to run as: alice" not in out  # generated test unit has no User= line
@@ -114,7 +114,7 @@ def test_systemd_install_calls_linger_helper(monkeypatch, tmp_path, capsys):
    assert unit_path.exists()
    assert [cmd for cmd, _ in calls] == [
        ["systemctl", "--user", "daemon-reload"],
-        ["systemctl", "--user", "enable", gateway.SERVICE_NAME],
+        ["systemctl", "--user", "enable", gateway.get_service_name()],
    ]
    assert helper_calls == [True]
    assert "User service installed and enabled" in out
@@ -1,5 +1,6 @@
 """Tests for gateway service management helpers."""

+import os
 from types import SimpleNamespace

 import hermes_cli.gateway as gateway_cli
@@ -26,7 +27,7 @@ class TestSystemdServiceRefresh:
        assert unit_path.read_text(encoding="utf-8") == "new unit\n"
        assert calls[:2] == [
            ["systemctl", "--user", "daemon-reload"],
-            ["systemctl", "--user", "start", gateway_cli.SERVICE_NAME],
+            ["systemctl", "--user", "start", gateway_cli.get_service_name()],
        ]

    def test_systemd_restart_refreshes_outdated_unit(self, tmp_path, monkeypatch):
@@ -49,10 +50,27 @@ class TestSystemdServiceRefresh:
        assert unit_path.read_text(encoding="utf-8") == "new unit\n"
        assert calls[:2] == [
            ["systemctl", "--user", "daemon-reload"],
-            ["systemctl", "--user", "restart", gateway_cli.SERVICE_NAME],
+            ["systemctl", "--user", "restart", gateway_cli.get_service_name()],
        ]


+class TestGeneratedSystemdUnits:
+    def test_user_unit_avoids_recursive_execstop_and_uses_extended_stop_timeout(self):
+        unit = gateway_cli.generate_systemd_unit(system=False)
+
+        assert "ExecStart=" in unit
+        assert "ExecStop=" not in unit
+        assert "TimeoutStopSec=60" in unit
+
+    def test_system_unit_avoids_recursive_execstop_and_uses_extended_stop_timeout(self):
+        unit = gateway_cli.generate_systemd_unit(system=True)
+
+        assert "ExecStart=" in unit
+        assert "ExecStop=" not in unit
+        assert "TimeoutStopSec=60" in unit
+        assert "WantedBy=multi-user.target" in unit
+
+
 class TestGatewayStopCleanup:
    def test_stop_sweeps_manual_gateway_processes_after_service_stop(self, tmp_path, monkeypatch):
        unit_path = tmp_path / "hermes-gateway.service"
@@ -92,9 +110,9 @@ class TestGatewayServiceDetection:
        )

        def fake_run(cmd, capture_output=True, text=True, **kwargs):
-            if cmd == ["systemctl", "--user", "is-active", gateway_cli.SERVICE_NAME]:
+            if cmd == ["systemctl", "--user", "is-active", gateway_cli.get_service_name()]:
                return SimpleNamespace(returncode=0, stdout="inactive\n", stderr="")
-            if cmd == ["systemctl", "is-active", gateway_cli.SERVICE_NAME]:
+            if cmd == ["systemctl", "is-active", gateway_cli.get_service_name()]:
                return SimpleNamespace(returncode=0, stdout="active\n", stderr="")
            raise AssertionError(f"Unexpected command: {cmd}")

@@ -139,3 +157,81 @@ class TestGatewaySystemServiceRouting:
        gateway_cli.gateway_command(SimpleNamespace(gateway_command="status", deep=False, system=False))

        assert calls == [(False, False)]
+
+
+class TestEnsureUserSystemdEnv:
+    """Tests for _ensure_user_systemd_env() D-Bus session bus auto-detection."""
+
+    def test_sets_xdg_runtime_dir_when_missing(self, tmp_path, monkeypatch):
+        monkeypatch.delenv("XDG_RUNTIME_DIR", raising=False)
+        monkeypatch.delenv("DBUS_SESSION_BUS_ADDRESS", raising=False)
+        monkeypatch.setattr(os, "getuid", lambda: 42)
+
+        # Patch Path so /run/user/42 resolves to our tmp dir (which exists)
+        from pathlib import Path as RealPath
+
+        class FakePath(type(RealPath())):
+            def __new__(cls, *args):
+                p = str(args[0]) if args else ""
+                if p == "/run/user/42":
+                    return RealPath.__new__(cls, str(tmp_path))
+                return RealPath.__new__(cls, *args)
+
+        monkeypatch.setattr(gateway_cli, "Path", FakePath)
+
+        gateway_cli._ensure_user_systemd_env()
+
+        # Function sets the canonical string, not the fake path
+        assert os.environ.get("XDG_RUNTIME_DIR") == "/run/user/42"
+
+    def test_sets_dbus_address_when_bus_socket_exists(self, tmp_path, monkeypatch):
+        runtime = tmp_path / "runtime"
+        runtime.mkdir()
+        bus_socket = runtime / "bus"
+        bus_socket.touch()  # simulate the socket file
+
+        monkeypatch.setenv("XDG_RUNTIME_DIR", str(runtime))
+        monkeypatch.delenv("DBUS_SESSION_BUS_ADDRESS", raising=False)
+        monkeypatch.setattr(os, "getuid", lambda: 99)
+
+        gateway_cli._ensure_user_systemd_env()
+
+        assert os.environ["DBUS_SESSION_BUS_ADDRESS"] == f"unix:path={bus_socket}"
+
+    def test_preserves_existing_env_vars(self, monkeypatch):
+        monkeypatch.setenv("XDG_RUNTIME_DIR", "/custom/runtime")
+        monkeypatch.setenv("DBUS_SESSION_BUS_ADDRESS", "unix:path=/custom/bus")
+
+        gateway_cli._ensure_user_systemd_env()
+
+        assert os.environ["XDG_RUNTIME_DIR"] == "/custom/runtime"
+        assert os.environ["DBUS_SESSION_BUS_ADDRESS"] == "unix:path=/custom/bus"
+
+    def test_no_dbus_when_bus_socket_missing(self, tmp_path, monkeypatch):
+        runtime = tmp_path / "runtime"
+        runtime.mkdir()
+        # no bus socket created
+
+        monkeypatch.setenv("XDG_RUNTIME_DIR", str(runtime))
+        monkeypatch.delenv("DBUS_SESSION_BUS_ADDRESS", raising=False)
+        monkeypatch.setattr(os, "getuid", lambda: 99)
+
+        gateway_cli._ensure_user_systemd_env()
+
+        assert "DBUS_SESSION_BUS_ADDRESS" not in os.environ
+
+    def test_systemctl_cmd_calls_ensure_for_user_mode(self, monkeypatch):
+        calls = []
+        monkeypatch.setattr(gateway_cli, "_ensure_user_systemd_env", lambda: calls.append("called"))
+
+        result = gateway_cli._systemctl_cmd(system=False)
+        assert result == ["systemctl", "--user"]
+        assert calls == ["called"]
+
+    def test_systemctl_cmd_skips_ensure_for_system_mode(self, monkeypatch):
+        calls = []
+        monkeypatch.setattr(gateway_cli, "_ensure_user_systemd_env", lambda: calls.append("called"))
+
+        result = gateway_cli._systemctl_cmd(system=True)
+        assert result == ["systemctl"]
+        assert calls == []
@@ -1,6 +1,6 @@
 """Tests for the hermes_cli models module."""

-from hermes_cli.models import OPENROUTER_MODELS, menu_labels, model_ids
+from hermes_cli.models import OPENROUTER_MODELS, menu_labels, model_ids, detect_provider_for_model


 class TestModelIds:
@@ -54,3 +54,66 @@ class TestOpenRouterModels:
    def test_at_least_5_models(self):
        """Sanity check that the models list hasn't been accidentally truncated."""
        assert len(OPENROUTER_MODELS) >= 5
+
+
+class TestFindOpenrouterSlug:
+    def test_exact_match(self):
+        from hermes_cli.models import _find_openrouter_slug
+        assert _find_openrouter_slug("anthropic/claude-opus-4.6") == "anthropic/claude-opus-4.6"
+
+    def test_bare_name_match(self):
+        from hermes_cli.models import _find_openrouter_slug
+        result = _find_openrouter_slug("claude-opus-4.6")
+        assert result == "anthropic/claude-opus-4.6"
+
+    def test_case_insensitive(self):
+        from hermes_cli.models import _find_openrouter_slug
+        result = _find_openrouter_slug("Anthropic/Claude-Opus-4.6")
+        assert result is not None
+
+    def test_unknown_returns_none(self):
+        from hermes_cli.models import _find_openrouter_slug
+        assert _find_openrouter_slug("totally-fake-model-xyz") is None
+
+
+class TestDetectProviderForModel:
+    def test_anthropic_model_detected(self):
+        """claude-opus-4-6 should resolve to anthropic provider."""
+        result = detect_provider_for_model("claude-opus-4-6", "openai-codex")
+        assert result is not None
+        assert result[0] == "anthropic"
+
+    def test_deepseek_model_detected(self):
+        """deepseek-chat should resolve to deepseek provider."""
+        result = detect_provider_for_model("deepseek-chat", "openai-codex")
+        assert result is not None
+        # Provider is deepseek (direct) or openrouter (fallback) depending on creds
+        assert result[0] in ("deepseek", "openrouter")
+
+    def test_current_provider_model_returns_none(self):
+        """Models belonging to the current provider should not trigger a switch."""
+        assert detect_provider_for_model("gpt-5.3-codex", "openai-codex") is None
+
+    def test_openrouter_slug_match(self):
+        """Models in the OpenRouter catalog should be found."""
+        result = detect_provider_for_model("anthropic/claude-opus-4.6", "openai-codex")
+        assert result is not None
+        assert result[0] == "openrouter"
+        assert result[1] == "anthropic/claude-opus-4.6"
+
+    def test_bare_name_gets_openrouter_slug(self):
+        """Bare model names should get mapped to full OpenRouter slugs."""
+        result = detect_provider_for_model("claude-opus-4.6", "openai-codex")
+        assert result is not None
+        # Should find it on OpenRouter with full slug
+        assert result[1] == "anthropic/claude-opus-4.6"
+
+    def test_unknown_model_returns_none(self):
+        """Completely unknown model names should return None."""
+        assert detect_provider_for_model("nonexistent-model-xyz", "openai-codex") is None
+
+    def test_aggregator_not_suggested(self):
+        """nous/openrouter should never be auto-suggested as target provider."""
+        result = detect_provider_for_model("claude-opus-4-6", "openai-codex")
+        assert result is not None
+        assert result[0] not in ("nous",)  # nous has claude models but shouldn't be suggested
@@ -0,0 +1,184 @@
+"""Tests for file path autocomplete in the CLI completer."""
+
+import os
+from unittest.mock import MagicMock
+
+import pytest
+from prompt_toolkit.document import Document
+from prompt_toolkit.formatted_text import to_plain_text
+
+from hermes_cli.commands import SlashCommandCompleter, _file_size_label
+
+
+def _display_names(completions):
+    """Extract plain-text display names from a list of Completion objects."""
+    return [to_plain_text(c.display) for c in completions]
+
+
+def _display_metas(completions):
+    """Extract plain-text display_meta from a list of Completion objects."""
+    return [to_plain_text(c.display_meta) if c.display_meta else "" for c in completions]
+
+
+@pytest.fixture
+def completer():
+    return SlashCommandCompleter()
+
+
+class TestExtractPathWord:
+    def test_relative_path(self):
+        assert SlashCommandCompleter._extract_path_word("look at ./src/main.py") == "./src/main.py"
+
+    def test_home_path(self):
+        assert SlashCommandCompleter._extract_path_word("edit ~/docs/") == "~/docs/"
+
+    def test_absolute_path(self):
+        assert SlashCommandCompleter._extract_path_word("read /etc/hosts") == "/etc/hosts"
+
+    def test_parent_path(self):
+        assert SlashCommandCompleter._extract_path_word("check ../config.yaml") == "../config.yaml"
+
+    def test_path_with_slash_in_middle(self):
+        assert SlashCommandCompleter._extract_path_word("open src/utils/helpers.py") == "src/utils/helpers.py"
+
+    def test_plain_word_not_path(self):
+        assert SlashCommandCompleter._extract_path_word("hello world") is None
+
+    def test_empty_string(self):
+        assert SlashCommandCompleter._extract_path_word("") is None
+
+    def test_single_word_no_slash(self):
+        assert SlashCommandCompleter._extract_path_word("README.md") is None
+
+    def test_word_after_space(self):
+        assert SlashCommandCompleter._extract_path_word("fix the bug in ./tools/") == "./tools/"
+
+    def test_just_dot_slash(self):
+        assert SlashCommandCompleter._extract_path_word("./") == "./"
+
+    def test_just_tilde_slash(self):
+        assert SlashCommandCompleter._extract_path_word("~/") == "~/"
+
+
+class TestPathCompletions:
+    def test_lists_current_directory(self, tmp_path):
+        (tmp_path / "file_a.py").touch()
+        (tmp_path / "file_b.txt").touch()
+        (tmp_path / "subdir").mkdir()
+
+        old_cwd = os.getcwd()
+        os.chdir(tmp_path)
+        try:
+            completions = list(SlashCommandCompleter._path_completions("./"))
+            names = _display_names(completions)
+            assert "file_a.py" in names
+            assert "file_b.txt" in names
+            assert "subdir/" in names
+        finally:
+            os.chdir(old_cwd)
+
+    def test_filters_by_prefix(self, tmp_path):
+        (tmp_path / "alpha.py").touch()
+        (tmp_path / "beta.py").touch()
+        (tmp_path / "alpha_test.py").touch()
+
+        completions = list(SlashCommandCompleter._path_completions(f"{tmp_path}/alpha"))
+        names = _display_names(completions)
+        assert "alpha.py" in names
+        assert "alpha_test.py" in names
+        assert "beta.py" not in names
+
+    def test_directories_have_trailing_slash(self, tmp_path):
+        (tmp_path / "mydir").mkdir()
+        (tmp_path / "myfile.txt").touch()
+
+        completions = list(SlashCommandCompleter._path_completions(f"{tmp_path}/"))
+        names = _display_names(completions)
+        metas = _display_metas(completions)
+        assert "mydir/" in names
+        idx = names.index("mydir/")
+        assert metas[idx] == "dir"
+
+    def test_home_expansion(self, tmp_path, monkeypatch):
+        monkeypatch.setenv("HOME", str(tmp_path))
+        (tmp_path / "testfile.md").touch()
+
+        completions = list(SlashCommandCompleter._path_completions("~/test"))
+        names = _display_names(completions)
+        assert "testfile.md" in names
+
+    def test_nonexistent_dir_returns_empty(self):
+        completions = list(SlashCommandCompleter._path_completions("/nonexistent_dir_xyz/"))
+        assert completions == []
+
+    def test_respects_limit(self, tmp_path):
+        for i in range(50):
+            (tmp_path / f"file_{i:03d}.txt").touch()
+
+        completions = list(SlashCommandCompleter._path_completions(f"{tmp_path}/", limit=10))
+        assert len(completions) == 10
+
+    def test_case_insensitive_prefix(self, tmp_path):
+        (tmp_path / "README.md").touch()
+
+        completions = list(SlashCommandCompleter._path_completions(f"{tmp_path}/read"))
+        names = _display_names(completions)
+        assert "README.md" in names
+
+
+class TestIntegration:
+    """Test the completer produces path completions via the prompt_toolkit API."""
+
+    def test_slash_commands_still_work(self, completer):
+        doc = Document("/hel", cursor_position=4)
+        event = MagicMock()
+        completions = list(completer.get_completions(doc, event))
+        names = _display_names(completions)
+        assert "/help" in names
+
+    def test_path_completion_triggers_on_dot_slash(self, completer, tmp_path):
+        (tmp_path / "test.py").touch()
+        old_cwd = os.getcwd()
+        os.chdir(tmp_path)
+        try:
+            doc = Document("edit ./te", cursor_position=9)
+            event = MagicMock()
+            completions = list(completer.get_completions(doc, event))
+            names = _display_names(completions)
+            assert "test.py" in names
+        finally:
+            os.chdir(old_cwd)
+
+    def test_no_completion_for_plain_words(self, completer):
+        doc = Document("hello world", cursor_position=11)
+        event = MagicMock()
+        completions = list(completer.get_completions(doc, event))
+        assert completions == []
+
+    def test_absolute_path_triggers_completion(self, completer):
+        doc = Document("check /etc/hos", cursor_position=14)
+        event = MagicMock()
+        completions = list(completer.get_completions(doc, event))
+        names = _display_names(completions)
+        # /etc/hosts should exist on Linux
+        assert any("host" in n.lower() for n in names)
+
+
+class TestFileSizeLabel:
+    def test_bytes(self, tmp_path):
+        f = tmp_path / "small.txt"
+        f.write_text("hi")
+        assert _file_size_label(str(f)) == "2B"
+
+    def test_kilobytes(self, tmp_path):
+        f = tmp_path / "medium.txt"
+        f.write_bytes(b"x" * 2048)
+        assert _file_size_label(str(f)) == "2K"
+
+    def test_megabytes(self, tmp_path):
+        f = tmp_path / "large.bin"
+        f.write_bytes(b"x" * (2 * 1024 * 1024))
+        assert _file_size_label(str(f)) == "2.0M"
+
+    def test_nonexistent(self):
+        assert _file_size_label("/nonexistent_xyz") == ""
@@ -115,3 +115,13 @@ class TestConfigYamlRouting:
        set_config_value("terminal.docker_image", "python:3.12")
        config = _read_config(_isolated_hermes_home)
        assert "python:3.12" in config
+
+    def test_terminal_docker_cwd_mount_flag_goes_to_config_and_env(self, _isolated_hermes_home):
+        set_config_value("terminal.docker_mount_cwd_to_workspace", "true")
+        config = _read_config(_isolated_hermes_home)
+        env_content = _read_env(_isolated_hermes_home)
+        assert "docker_mount_cwd_to_workspace: 'true'" in config or "docker_mount_cwd_to_workspace: true" in config
+        assert (
+            "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE=true" in env_content
+            or "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE=True" in env_content
+        )
@@ -0,0 +1,29 @@
+from hermes_cli import setup as setup_mod
+
+
+def test_prompt_choice_uses_curses_helper(monkeypatch):
+    monkeypatch.setattr(setup_mod, "_curses_prompt_choice", lambda question, choices, default=0: 1)
+
+    idx = setup_mod.prompt_choice("Pick one", ["a", "b", "c"], default=0)
+
+    assert idx == 1
+
+
+def test_prompt_choice_falls_back_to_numbered_input(monkeypatch):
+    monkeypatch.setattr(setup_mod, "_curses_prompt_choice", lambda question, choices, default=0: -1)
+    monkeypatch.setattr("builtins.input", lambda _prompt="": "2")
+
+    idx = setup_mod.prompt_choice("Pick one", ["a", "b", "c"], default=0)
+
+    assert idx == 1
+
+
+def test_prompt_checklist_uses_shared_curses_checklist(monkeypatch):
+    monkeypatch.setattr(
+        "hermes_cli.curses_ui.curses_checklist",
+        lambda title, items, selected, cancel_returns=None: {0, 2},
+    )
+
+    selected = setup_mod.prompt_checklist("Pick tools", ["one", "two", "three"], pre_selected=[1])
+
+    assert selected == [0, 2]
@@ -0,0 +1,305 @@
+"""Tests for cmd_update gateway auto-restart — systemd + launchd coverage.
+
+Ensures ``hermes update`` correctly detects running gateways managed by
+systemd (Linux) or launchd (macOS) and restarts/informs the user properly,
+rather than leaving zombie processes or telling users to manually restart
+when launchd will auto-respawn.
+"""
+
+import subprocess
+from types import SimpleNamespace
+from unittest.mock import patch, MagicMock
+
+import pytest
+
+import hermes_cli.gateway as gateway_cli
+from hermes_cli.main import cmd_update
+
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+def _make_run_side_effect(
+    branch="main",
+    verify_ok=True,
+    commit_count="3",
+    systemd_active=False,
+    launchctl_loaded=False,
+):
+    """Build a subprocess.run side_effect that simulates git + service commands."""
+
+    def side_effect(cmd, **kwargs):
+        joined = " ".join(str(c) for c in cmd)
+
+        # git rev-parse --abbrev-ref HEAD
+        if "rev-parse" in joined and "--abbrev-ref" in joined:
+            return subprocess.CompletedProcess(cmd, 0, stdout=f"{branch}\n", stderr="")
+
+        # git rev-parse --verify origin/{branch}
+        if "rev-parse" in joined and "--verify" in joined:
+            rc = 0 if verify_ok else 128
+            return subprocess.CompletedProcess(cmd, rc, stdout="", stderr="")
+
+        # git rev-list HEAD..origin/{branch} --count
+        if "rev-list" in joined:
+            return subprocess.CompletedProcess(cmd, 0, stdout=f"{commit_count}\n", stderr="")
+
+        # systemctl --user is-active
+        if "systemctl" in joined and "is-active" in joined:
+            if systemd_active:
+                return subprocess.CompletedProcess(cmd, 0, stdout="active\n", stderr="")
+            return subprocess.CompletedProcess(cmd, 3, stdout="inactive\n", stderr="")
+
+        # systemctl --user restart
+        if "systemctl" in joined and "restart" in joined:
+            return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")
+
+        # launchctl list ai.hermes.gateway
+        if "launchctl" in joined and "list" in joined:
+            if launchctl_loaded:
+                return subprocess.CompletedProcess(cmd, 0, stdout="PID\tStatus\tLabel\n123\t0\tai.hermes.gateway\n", stderr="")
+            return subprocess.CompletedProcess(cmd, 113, stdout="", stderr="Could not find service")
+
+        return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")
+
+    return side_effect
+
+
+@pytest.fixture
+def mock_args():
+    return SimpleNamespace()
+
+
+# ---------------------------------------------------------------------------
+# Launchd plist includes --replace
+# ---------------------------------------------------------------------------
+
+
+class TestLaunchdPlistReplace:
+    """The generated launchd plist must include --replace so respawned
+    gateways kill stale instances."""
+
+    def test_plist_contains_replace_flag(self):
+        plist = gateway_cli.generate_launchd_plist()
+        assert "--replace" in plist
+
+    def test_plist_program_arguments_order(self):
+        """--replace comes after 'run' in the ProgramArguments."""
+        plist = gateway_cli.generate_launchd_plist()
+        lines = [line.strip() for line in plist.splitlines()]
+        # Find 'run' and '--replace' in the string entries
+        string_values = [
+            line.replace("<string>", "").replace("</string>", "")
+            for line in lines
+            if "<string>" in line and "</string>" in line
+        ]
+        assert "run" in string_values
+        assert "--replace" in string_values
+        run_idx = string_values.index("run")
+        replace_idx = string_values.index("--replace")
+        assert replace_idx == run_idx + 1
+
+
+# ---------------------------------------------------------------------------
+# cmd_update — macOS launchd detection
+# ---------------------------------------------------------------------------
+
+
+class TestLaunchdPlistRefresh:
+    """refresh_launchd_plist_if_needed rewrites stale plists (like systemd's
+    refresh_systemd_unit_if_needed)."""
+
+    def test_refresh_rewrites_stale_plist(self, tmp_path, monkeypatch):
+        plist_path = tmp_path / "ai.hermes.gateway.plist"
+        plist_path.write_text("<plist>old content</plist>")
+
+        monkeypatch.setattr(gateway_cli, "get_launchd_plist_path", lambda: plist_path)
+
+        calls = []
+        def fake_run(cmd, check=False, **kwargs):
+            calls.append(cmd)
+            return SimpleNamespace(returncode=0, stdout="", stderr="")
+
+        monkeypatch.setattr(gateway_cli.subprocess, "run", fake_run)
+
+        result = gateway_cli.refresh_launchd_plist_if_needed()
+
+        assert result is True
+        # Plist should now contain the generated content (which includes --replace)
+        assert "--replace" in plist_path.read_text()
+        # Should have unloaded then reloaded
+        assert any("unload" in str(c) for c in calls)
+        assert any("load" in str(c) for c in calls)
+
+    def test_refresh_skips_when_current(self, tmp_path, monkeypatch):
+        plist_path = tmp_path / "ai.hermes.gateway.plist"
+        monkeypatch.setattr(gateway_cli, "get_launchd_plist_path", lambda: plist_path)
+
+        # Write the current expected content
+        plist_path.write_text(gateway_cli.generate_launchd_plist())
+
+        calls = []
+        monkeypatch.setattr(
+            gateway_cli.subprocess, "run",
+            lambda cmd, **kw: calls.append(cmd) or SimpleNamespace(returncode=0),
+        )
+
+        result = gateway_cli.refresh_launchd_plist_if_needed()
+
+        assert result is False
+        assert len(calls) == 0  # No launchctl calls needed
+
+    def test_refresh_skips_when_no_plist(self, tmp_path, monkeypatch):
+        plist_path = tmp_path / "nonexistent.plist"
+        monkeypatch.setattr(gateway_cli, "get_launchd_plist_path", lambda: plist_path)
+
+        result = gateway_cli.refresh_launchd_plist_if_needed()
+        assert result is False
+
+    def test_launchd_start_calls_refresh(self, tmp_path, monkeypatch):
+        """launchd_start refreshes the plist before starting."""
+        plist_path = tmp_path / "ai.hermes.gateway.plist"
+        plist_path.write_text("<plist>old</plist>")
+        monkeypatch.setattr(gateway_cli, "get_launchd_plist_path", lambda: plist_path)
+
+        calls = []
+        def fake_run(cmd, check=False, **kwargs):
+            calls.append(cmd)
+            return SimpleNamespace(returncode=0, stdout="", stderr="")
+
+        monkeypatch.setattr(gateway_cli.subprocess, "run", fake_run)
+
+        gateway_cli.launchd_start()
+
+        # First calls should be refresh (unload/load), then start
+        cmd_strs = [" ".join(c) for c in calls]
+        assert any("unload" in s for s in cmd_strs)
+        assert any("start" in s for s in cmd_strs)
+
+
+class TestCmdUpdateLaunchdRestart:
+    """cmd_update correctly detects and handles launchd on macOS."""
+
+    @patch("shutil.which", return_value=None)
+    @patch("subprocess.run")
+    def test_update_detects_launchd_and_skips_manual_restart_message(
+        self, mock_run, _mock_which, mock_args, capsys, tmp_path, monkeypatch,
+    ):
+        """When launchd is running the gateway, update should print
+        'auto-restart via launchd' instead of 'Restart it with: hermes gateway run'."""
+        # Create a fake launchd plist so is_macos + plist.exists() passes
+        plist_path = tmp_path / "ai.hermes.gateway.plist"
+        plist_path.write_text("<plist/>")
+
+        monkeypatch.setattr(
+            gateway_cli, "is_macos", lambda: True,
+        )
+        monkeypatch.setattr(
+            gateway_cli, "get_launchd_plist_path", lambda: plist_path,
+        )
+
+        mock_run.side_effect = _make_run_side_effect(
+            commit_count="3",
+            launchctl_loaded=True,
+        )
+
+        # Mock get_running_pid to return a PID
+        with patch("gateway.status.get_running_pid", return_value=12345), \
+             patch("gateway.status.remove_pid_file"):
+            cmd_update(mock_args)
+
+        captured = capsys.readouterr().out
+        assert "Gateway restarted via launchd" in captured
+        assert "Restart it with: hermes gateway run" not in captured
+        # Verify launchctl stop + start were called (not manual SIGTERM)
+        launchctl_calls = [
+            c for c in mock_run.call_args_list
+            if len(c.args[0]) > 0 and c.args[0][0] == "launchctl"
+        ]
+        stop_calls = [c for c in launchctl_calls if "stop" in c.args[0]]
+        start_calls = [c for c in launchctl_calls if "start" in c.args[0]]
+        assert len(stop_calls) >= 1
+        assert len(start_calls) >= 1
+
+    @patch("shutil.which", return_value=None)
+    @patch("subprocess.run")
+    def test_update_without_launchd_shows_manual_restart(
+        self, mock_run, _mock_which, mock_args, capsys, tmp_path, monkeypatch,
+    ):
+        """When no service manager is running, update should show the manual restart hint."""
+        monkeypatch.setattr(
+            gateway_cli, "is_macos", lambda: True,
+        )
+        plist_path = tmp_path / "ai.hermes.gateway.plist"
+        # plist does NOT exist — no launchd service
+        monkeypatch.setattr(
+            gateway_cli, "get_launchd_plist_path", lambda: plist_path,
+        )
+
+        mock_run.side_effect = _make_run_side_effect(
+            commit_count="3",
+            launchctl_loaded=False,
+        )
+
+        with patch("gateway.status.get_running_pid", return_value=12345), \
+             patch("gateway.status.remove_pid_file"), \
+             patch("os.kill"):
+            cmd_update(mock_args)
+
+        captured = capsys.readouterr().out
+        assert "Restart it with: hermes gateway run" in captured
+        assert "Gateway restarted via launchd" not in captured
+
+    @patch("shutil.which", return_value=None)
+    @patch("subprocess.run")
+    def test_update_with_systemd_still_restarts_via_systemd(
+        self, mock_run, _mock_which, mock_args, capsys, monkeypatch,
+    ):
+        """On Linux with systemd active, update should restart via systemctl."""
+        monkeypatch.setattr(
+            gateway_cli, "is_macos", lambda: False,
+        )
+
+        mock_run.side_effect = _make_run_side_effect(
+            commit_count="3",
+            systemd_active=True,
+        )
+
+        with patch("gateway.status.get_running_pid", return_value=12345), \
+             patch("gateway.status.remove_pid_file"), \
+             patch("os.kill"):
+            cmd_update(mock_args)
+
+        captured = capsys.readouterr().out
+        assert "Gateway restarted" in captured
+        # Verify systemctl restart was called
+        restart_calls = [
+            c for c in mock_run.call_args_list
+            if "restart" in " ".join(str(a) for a in c.args[0])
+            and "systemctl" in " ".join(str(a) for a in c.args[0])
+        ]
+        assert len(restart_calls) == 1
+
+    @patch("shutil.which", return_value=None)
+    @patch("subprocess.run")
+    def test_update_no_gateway_running_skips_restart(
+        self, mock_run, _mock_which, mock_args, capsys, monkeypatch,
+    ):
+        """When no gateway is running, update should skip the restart section entirely."""
+        monkeypatch.setattr(
+            gateway_cli, "is_macos", lambda: False,
+        )
+
+        mock_run.side_effect = _make_run_side_effect(
+            commit_count="3",
+            systemd_active=False,
+        )
+
+        with patch("gateway.status.get_running_pid", return_value=None):
+            cmd_update(mock_args)
+
+        captured = capsys.readouterr().out
+        assert "Stopped gateway" not in captured
+        assert "Gateway restarted" not in captured
+        assert "Gateway restarted via launchd" not in captured
@@ -16,126 +16,131 @@ from run_agent import AIAgent, IterationBudget
 from tools.delegate_tool import _run_single_child
 from tools.interrupt import set_interrupt, is_interrupted

-set_interrupt(False)
+def main() -> int:
+    set_interrupt(False)

-# Create parent agent (minimal)
-parent = AIAgent.__new__(AIAgent)
-parent._interrupt_requested = False
-parent._interrupt_message = None
-parent._active_children = []
-parent.quiet_mode = True
-parent.model = "test/model"
-parent.base_url = "http://localhost:1"
-parent.api_key = "test"
-parent.provider = "test"
-parent.api_mode = "chat_completions"
-parent.platform = "cli"
-parent.enabled_toolsets = ["terminal", "file"]
-parent.providers_allowed = None
-parent.providers_ignored = None
-parent.providers_order = None
-parent.provider_sort = None
-parent.max_tokens = None
-parent.reasoning_config = None
-parent.prefill_messages = None
-parent._session_db = None
-parent._delegate_depth = 0
-parent._delegate_spinner = None
-parent.tool_progress_callback = None
-parent.iteration_budget = IterationBudget(max_total=100)
-parent._client_kwargs = {"api_key": "test", "base_url": "http://localhost:1"}
+    # Create parent agent (minimal)
+    parent = AIAgent.__new__(AIAgent)
+    parent._interrupt_requested = False
+    parent._interrupt_message = None
+    parent._active_children = []
+    parent.quiet_mode = True
+    parent.model = "test/model"
+    parent.base_url = "http://localhost:1"
+    parent.api_key = "test"
+    parent.provider = "test"
+    parent.api_mode = "chat_completions"
+    parent.platform = "cli"
+    parent.enabled_toolsets = ["terminal", "file"]
+    parent.providers_allowed = None
+    parent.providers_ignored = None
+    parent.providers_order = None
+    parent.provider_sort = None
+    parent.max_tokens = None
+    parent.reasoning_config = None
+    parent.prefill_messages = None
+    parent._session_db = None
+    parent._delegate_depth = 0
+    parent._delegate_spinner = None
+    parent.tool_progress_callback = None
+    parent.iteration_budget = IterationBudget(max_total=100)
+    parent._client_kwargs = {"api_key": "test", "base_url": "http://localhost:1"}

-child_started = threading.Event()
-result_holder = [None]
+    child_started = threading.Event()
+    result_holder = [None]

+    def run_delegate():
+        with patch("run_agent.OpenAI") as MockOpenAI:
+            mock_client = MagicMock()

-def run_delegate():
-    with patch("run_agent.OpenAI") as MockOpenAI:
-        mock_client = MagicMock()
+            def slow_create(**kwargs):
+                time.sleep(3)
+                resp = MagicMock()
+                resp.choices = [MagicMock()]
+                resp.choices[0].message.content = "Done"
+                resp.choices[0].message.tool_calls = None
+                resp.choices[0].message.refusal = None
+                resp.choices[0].finish_reason = "stop"
+                resp.usage.prompt_tokens = 100
+                resp.usage.completion_tokens = 10
+                resp.usage.total_tokens = 110
+                resp.usage.prompt_tokens_details = None
+                return resp

-        def slow_create(**kwargs):
-            time.sleep(3)
-            resp = MagicMock()
-            resp.choices = [MagicMock()]
-            resp.choices[0].message.content = "Done"
-            resp.choices[0].message.tool_calls = None
-            resp.choices[0].message.refusal = None
-            resp.choices[0].finish_reason = "stop"
-            resp.usage.prompt_tokens = 100
-            resp.usage.completion_tokens = 10
-            resp.usage.total_tokens = 110
-            resp.usage.prompt_tokens_details = None
-            return resp
+            mock_client.chat.completions.create = slow_create
+            mock_client.close = MagicMock()
+            MockOpenAI.return_value = mock_client

-        mock_client.chat.completions.create = slow_create
-        mock_client.close = MagicMock()
-        MockOpenAI.return_value = mock_client
+            original_init = AIAgent.__init__

-        original_init = AIAgent.__init__
+            def patched_init(self_agent, *a, **kw):
+                original_init(self_agent, *a, **kw)
+                child_started.set()

-        def patched_init(self_agent, *a, **kw):
-            original_init(self_agent, *a, **kw)
-            child_started.set()
+            with patch.object(AIAgent, "__init__", patched_init):
+                try:
+                    result = _run_single_child(
+                        task_index=0,
+                        goal="Test slow task",
+                        context=None,
+                        toolsets=["terminal"],
+                        model="test/model",
+                        max_iterations=5,
+                        parent_agent=parent,
+                        task_count=1,
+                        override_provider="test",
+                        override_base_url="http://localhost:1",
+                        override_api_key="test",
+                        override_api_mode="chat_completions",
+                    )
+                    result_holder[0] = result
+                except Exception as e:
+                    print(f"ERROR in delegate: {e}")
+                    import traceback
+                    traceback.print_exc()

-        with patch.object(AIAgent, "__init__", patched_init):
-            try:
-                result = _run_single_child(
-                    task_index=0,
-                    goal="Test slow task",
-                    context=None,
-                    toolsets=["terminal"],
-                    model="test/model",
-                    max_iterations=5,
-                    parent_agent=parent,
-                    task_count=1,
-                    override_provider="test",
-                    override_base_url="http://localhost:1",
-                    override_api_key="test",
-                    override_api_mode="chat_completions",
-                )
-                result_holder[0] = result
-            except Exception as e:
-                print(f"ERROR in delegate: {e}")
-                import traceback
-                traceback.print_exc()
+    print("Starting agent thread...")
+    agent_thread = threading.Thread(target=run_delegate, daemon=True)
+    agent_thread.start()

+    started = child_started.wait(timeout=10)
+    if not started:
+        print("ERROR: Child never started")
+        set_interrupt(False)
+        return 1

-print("Starting agent thread...")
-agent_thread = threading.Thread(target=run_delegate, daemon=True)
-agent_thread.start()
+    time.sleep(0.5)

-started = child_started.wait(timeout=10)
-if not started:
-    print("ERROR: Child never started")
-    sys.exit(1)
+    print(f"Active children: {len(parent._active_children)}")
+    for i, c in enumerate(parent._active_children):
+        print(f"  Child {i}: _interrupt_requested={c._interrupt_requested}")

-time.sleep(0.5)
+    t0 = time.monotonic()
+    parent.interrupt("User typed a new message")
+    print("Called parent.interrupt()")

-print(f"Active children: {len(parent._active_children)}")
-for i, c in enumerate(parent._active_children):
-    print(f"  Child {i}: _interrupt_requested={c._interrupt_requested}")
+    for i, c in enumerate(parent._active_children):
+        print(f"  Child {i} after interrupt: _interrupt_requested={c._interrupt_requested}")
+    print(f"Global is_interrupted: {is_interrupted()}")

-t0 = time.monotonic()
-parent.interrupt("User typed a new message")
-print(f"Called parent.interrupt()")
+    agent_thread.join(timeout=10)
+    elapsed = time.monotonic() - t0
+    print(f"Agent thread finished in {elapsed:.2f}s")

-for i, c in enumerate(parent._active_children):
-    print(f"  Child {i} after interrupt: _interrupt_requested={c._interrupt_requested}")
-print(f"Global is_interrupted: {is_interrupted()}")
-
-agent_thread.join(timeout=10)
-elapsed = time.monotonic() - t0
-print(f"Agent thread finished in {elapsed:.2f}s")
-
-result = result_holder[0]
-if result:
-    print(f"Status: {result['status']}")
-    print(f"Duration: {result['duration_seconds']}s")
-    if elapsed < 2.0:
-        print("✅ PASS: Interrupt detected quickly!")
+    result = result_holder[0]
+    if result:
+        print(f"Status: {result['status']}")
+        print(f"Duration: {result['duration_seconds']}s")
+        if elapsed < 2.0:
+            print("✅ PASS: Interrupt detected quickly!")
+        else:
+            print(f"❌ FAIL: Took {elapsed:.2f}s — interrupt was too slow or not detected")
    else:
-        print(f"❌ FAIL: Took {elapsed:.2f}s — interrupt was too slow or not detected")
-else:
-    print("❌ FAIL: No result!")
+        print("❌ FAIL: No result!")

-set_interrupt(False)
+    set_interrupt(False)
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
@@ -0,0 +1,480 @@
+"""Tests for Anthropic error handling in the agent retry loop.
+
+Covers all error paths in run_agent.py's run_conversation() for api_mode=anthropic_messages:
+- 429 rate limit → retried with backoff
+- 529 overloaded → retried with backoff
+- 400 bad request → non-retryable, immediate fail
+- 401 unauthorized → credential refresh + retry
+- 500 server error → retried with backoff
+- "prompt is too long" → context length error triggers compression
+"""
+
+import asyncio
+import sys
+import types
+from types import SimpleNamespace
+from unittest.mock import MagicMock, AsyncMock
+
+import pytest
+
+sys.modules.setdefault("fire", types.SimpleNamespace(Fire=lambda *a, **k: None))
+sys.modules.setdefault("firecrawl", types.SimpleNamespace(Firecrawl=object))
+sys.modules.setdefault("fal_client", types.SimpleNamespace())
+
+import gateway.run as gateway_run
+import run_agent
+from gateway.config import Platform
+from gateway.session import SessionSource
+
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+
+def _patch_agent_bootstrap(monkeypatch):
+    monkeypatch.setattr(
+        run_agent,
+        "get_tool_definitions",
+        lambda **kwargs: [
+            {
+                "type": "function",
+                "function": {
+                    "name": "terminal",
+                    "description": "Run shell commands.",
+                    "parameters": {"type": "object", "properties": {}},
+                },
+            }
+        ],
+    )
+    monkeypatch.setattr(run_agent, "check_toolset_requirements", lambda: {})
+
+
+def _anthropic_response(text: str):
+    """Simulate an Anthropic messages.create() response object."""
+    return SimpleNamespace(
+        content=[SimpleNamespace(type="text", text=text)],
+        stop_reason="end_turn",
+        usage=SimpleNamespace(input_tokens=10, output_tokens=5),
+        model="claude-sonnet-4-6-20250514",
+    )
+
+
+class _RateLimitError(Exception):
+    """Simulates Anthropic 429 rate limit error."""
+    def __init__(self):
+        super().__init__("Error code: 429 - Rate limit exceeded. Please retry after 30s.")
+        self.status_code = 429
+
+
+class _OverloadedError(Exception):
+    """Simulates Anthropic 529 overloaded error."""
+    def __init__(self):
+        super().__init__("Error code: 529 - API is temporarily overloaded.")
+        self.status_code = 529
+
+
+class _BadRequestError(Exception):
+    """Simulates Anthropic 400 bad request error (non-retryable)."""
+    def __init__(self):
+        super().__init__("Error code: 400 - Invalid model specified.")
+        self.status_code = 400
+
+
+class _UnauthorizedError(Exception):
+    """Simulates Anthropic 401 unauthorized error."""
+    def __init__(self):
+        super().__init__("Error code: 401 - Unauthorized. Invalid API key.")
+        self.status_code = 401
+
+
+class _ServerError(Exception):
+    """Simulates Anthropic 500 internal server error."""
+    def __init__(self):
+        super().__init__("Error code: 500 - Internal server error.")
+        self.status_code = 500
+
+
+class _PromptTooLongError(Exception):
+    """Simulates Anthropic prompt-too-long error (triggers context compression)."""
+    def __init__(self):
+        super().__init__("prompt is too long: 250000 tokens > 200000 maximum")
+        self.status_code = 400
+
+
+class _FakeAnthropicClient:
+    def close(self):
+        pass
+
+
+def _fake_build_anthropic_client(key, base_url=None):
+    return _FakeAnthropicClient()
+
+
+def _make_agent_cls(error_cls, recover_after=None):
+    """Create an AIAgent subclass that raises error_cls on API calls.
+
+    If recover_after is set, the agent succeeds after that many failures.
+    """
+
+    class _Agent(run_agent.AIAgent):
+        def __init__(self, *args, **kwargs):
+            kwargs.setdefault("skip_context_files", True)
+            kwargs.setdefault("skip_memory", True)
+            kwargs.setdefault("max_iterations", 4)
+            super().__init__(*args, **kwargs)
+            self._cleanup_task_resources = lambda task_id: None
+            self._persist_session = lambda messages, history=None: None
+            self._save_trajectory = lambda messages, user_message, completed: None
+            self._save_session_log = lambda messages: None
+
+        def run_conversation(self, user_message, conversation_history=None, task_id=None):
+            calls = {"n": 0}
+
+            def _fake_api_call(api_kwargs):
+                calls["n"] += 1
+                if recover_after is not None and calls["n"] > recover_after:
+                    return _anthropic_response("Recovered")
+                raise error_cls()
+
+            self._interruptible_api_call = _fake_api_call
+            return super().run_conversation(
+                user_message, conversation_history=conversation_history, task_id=task_id
+            )
+
+    return _Agent
+
+
+def _run_with_agent(monkeypatch, agent_cls):
+    """Run _run_agent through the gateway with the given agent class."""
+    _patch_agent_bootstrap(monkeypatch)
+    monkeypatch.setattr(
+        "agent.anthropic_adapter.build_anthropic_client", _fake_build_anthropic_client
+    )
+    monkeypatch.setattr(run_agent, "AIAgent", agent_cls)
+    monkeypatch.setattr(
+        gateway_run,
+        "_resolve_runtime_agent_kwargs",
+        lambda: {
+            "provider": "anthropic",
+            "api_mode": "anthropic_messages",
+            "base_url": "https://api.anthropic.com",
+            "api_key": "sk-ant-api03-test-key",
+        },
+    )
+    monkeypatch.setenv("HERMES_TOOL_PROGRESS", "false")
+
+    runner = gateway_run.GatewayRunner.__new__(gateway_run.GatewayRunner)
+    runner.adapters = {}
+    runner._ephemeral_system_prompt = ""
+    runner._prefill_messages = []
+    runner._reasoning_config = None
+    runner._provider_routing = {}
+    runner._fallback_model = None
+    runner._running_agents = {}
+    runner.hooks = MagicMock()
+    runner.hooks.emit = AsyncMock()
+    runner.hooks.loaded_hooks = []
+    runner._session_db = None
+
+    source = SessionSource(
+        platform=Platform.LOCAL,
+        chat_id="cli",
+        chat_name="CLI",
+        chat_type="dm",
+        user_id="test-user-1",
+    )
+
+    return asyncio.run(
+        runner._run_agent(
+            message="hello",
+            context_prompt="",
+            history=[],
+            source=source,
+            session_id="test-session",
+            session_key="agent:main:local:dm",
+        )
+    )
+
+
+# ---------------------------------------------------------------------------
+# Tests
+# ---------------------------------------------------------------------------
+
+
+def test_429_rate_limit_is_retried_and_recovers(monkeypatch):
+    """429 should be retried with backoff. First call fails, second succeeds."""
+    agent_cls = _make_agent_cls(_RateLimitError, recover_after=1)
+    result = _run_with_agent(monkeypatch, agent_cls)
+    assert result["final_response"] == "Recovered"
+
+
+def test_529_overloaded_is_retried_and_recovers(monkeypatch):
+    """529 should be retried with backoff. First call fails, second succeeds."""
+    agent_cls = _make_agent_cls(_OverloadedError, recover_after=1)
+    result = _run_with_agent(monkeypatch, agent_cls)
+    assert result["final_response"] == "Recovered"
+
+
+def test_429_exhausts_all_retries_before_raising(monkeypatch):
+    """429 must retry max_retries times, not abort on first attempt."""
+    agent_cls = _make_agent_cls(_RateLimitError)  # always fails
+    with pytest.raises(_RateLimitError):
+        _run_with_agent(monkeypatch, agent_cls)
+
+
+def test_400_bad_request_is_non_retryable(monkeypatch):
+    """400 should fail immediately with only 1 API call (regression guard)."""
+    agent_cls = _make_agent_cls(_BadRequestError)
+    result = _run_with_agent(monkeypatch, agent_cls)
+    assert result["api_calls"] == 1
+    assert "400" in str(result.get("final_response", ""))
+
+
+def test_500_server_error_is_retried_and_recovers(monkeypatch):
+    """500 should be retried with backoff. First call fails, second succeeds."""
+    agent_cls = _make_agent_cls(_ServerError, recover_after=1)
+    result = _run_with_agent(monkeypatch, agent_cls)
+    assert result["final_response"] == "Recovered"
+
+
+def test_401_credential_refresh_recovers(monkeypatch):
+    """401 should trigger credential refresh and retry once."""
+    _patch_agent_bootstrap(monkeypatch)
+    monkeypatch.setattr(
+        "agent.anthropic_adapter.build_anthropic_client", _fake_build_anthropic_client
+    )
+    monkeypatch.setenv("HERMES_TOOL_PROGRESS", "false")
+
+    refresh_count = {"n": 0}
+
+    class _Auth401ThenSuccessAgent(run_agent.AIAgent):
+        def __init__(self, *args, **kwargs):
+            kwargs.setdefault("skip_context_files", True)
+            kwargs.setdefault("skip_memory", True)
+            kwargs.setdefault("max_iterations", 4)
+            super().__init__(*args, **kwargs)
+            self._cleanup_task_resources = lambda task_id: None
+            self._persist_session = lambda messages, history=None: None
+            self._save_trajectory = lambda messages, user_message, completed: None
+            self._save_session_log = lambda messages: None
+
+        def _try_refresh_anthropic_client_credentials(self) -> bool:
+            refresh_count["n"] += 1
+            return True  # Simulate successful credential refresh
+
+        def run_conversation(self, user_message, conversation_history=None, task_id=None):
+            calls = {"n": 0}
+
+            def _fake_api_call(api_kwargs):
+                calls["n"] += 1
+                if calls["n"] == 1:
+                    raise _UnauthorizedError()
+                return _anthropic_response("Auth refreshed")
+
+            self._interruptible_api_call = _fake_api_call
+            return super().run_conversation(
+                user_message, conversation_history=conversation_history, task_id=task_id
+            )
+
+    monkeypatch.setattr(run_agent, "AIAgent", _Auth401ThenSuccessAgent)
+    monkeypatch.setattr(
+        gateway_run,
+        "_resolve_runtime_agent_kwargs",
+        lambda: {
+            "provider": "anthropic",
+            "api_mode": "anthropic_messages",
+            "base_url": "https://api.anthropic.com",
+            "api_key": "sk-ant-api03-test-key",
+        },
+    )
+
+    runner = gateway_run.GatewayRunner.__new__(gateway_run.GatewayRunner)
+    runner.adapters = {}
+    runner._ephemeral_system_prompt = ""
+    runner._prefill_messages = []
+    runner._reasoning_config = None
+    runner._provider_routing = {}
+    runner._fallback_model = None
+    runner._running_agents = {}
+    runner.hooks = MagicMock()
+    runner.hooks.emit = AsyncMock()
+    runner.hooks.loaded_hooks = []
+    runner._session_db = None
+
+    source = SessionSource(
+        platform=Platform.LOCAL, chat_id="cli", chat_name="CLI",
+        chat_type="dm", user_id="test-user-1",
+    )
+
+    result = asyncio.run(
+        runner._run_agent(
+            message="hello", context_prompt="", history=[],
+            source=source, session_id="session-401",
+            session_key="agent:main:local:dm",
+        )
+    )
+
+    assert result["final_response"] == "Auth refreshed"
+    assert refresh_count["n"] == 1
+
+
+def test_401_refresh_fails_is_non_retryable(monkeypatch):
+    """401 with failed credential refresh should be treated as non-retryable."""
+    _patch_agent_bootstrap(monkeypatch)
+    monkeypatch.setattr(
+        "agent.anthropic_adapter.build_anthropic_client", _fake_build_anthropic_client
+    )
+    monkeypatch.setenv("HERMES_TOOL_PROGRESS", "false")
+
+    class _Auth401AlwaysFailAgent(run_agent.AIAgent):
+        def __init__(self, *args, **kwargs):
+            kwargs.setdefault("skip_context_files", True)
+            kwargs.setdefault("skip_memory", True)
+            kwargs.setdefault("max_iterations", 4)
+            super().__init__(*args, **kwargs)
+            self._cleanup_task_resources = lambda task_id: None
+            self._persist_session = lambda messages, history=None: None
+            self._save_trajectory = lambda messages, user_message, completed: None
+            self._save_session_log = lambda messages: None
+
+        def _try_refresh_anthropic_client_credentials(self) -> bool:
+            return False  # Simulate failed credential refresh
+
+        def run_conversation(self, user_message, conversation_history=None, task_id=None):
+            def _fake_api_call(api_kwargs):
+                raise _UnauthorizedError()
+
+            self._interruptible_api_call = _fake_api_call
+            return super().run_conversation(
+                user_message, conversation_history=conversation_history, task_id=task_id
+            )
+
+    monkeypatch.setattr(run_agent, "AIAgent", _Auth401AlwaysFailAgent)
+    monkeypatch.setattr(
+        gateway_run,
+        "_resolve_runtime_agent_kwargs",
+        lambda: {
+            "provider": "anthropic",
+            "api_mode": "anthropic_messages",
+            "base_url": "https://api.anthropic.com",
+            "api_key": "sk-ant-api03-test-key",
+        },
+    )
+
+    runner = gateway_run.GatewayRunner.__new__(gateway_run.GatewayRunner)
+    runner.adapters = {}
+    runner._ephemeral_system_prompt = ""
+    runner._prefill_messages = []
+    runner._reasoning_config = None
+    runner._provider_routing = {}
+    runner._fallback_model = None
+    runner._running_agents = {}
+    runner.hooks = MagicMock()
+    runner.hooks.emit = AsyncMock()
+    runner.hooks.loaded_hooks = []
+    runner._session_db = None
+
+    source = SessionSource(
+        platform=Platform.LOCAL, chat_id="cli", chat_name="CLI",
+        chat_type="dm", user_id="test-user-1",
+    )
+
+    result = asyncio.run(
+        runner._run_agent(
+            message="hello", context_prompt="", history=[],
+            source=source, session_id="session-401-fail",
+            session_key="agent:main:local:dm",
+        )
+    )
+
+    # 401 after failed refresh → non-retryable (falls through to is_client_error)
+    assert result["api_calls"] == 1
+    assert "401" in str(result.get("final_response", "")) or "unauthorized" in str(result.get("final_response", "")).lower()
+
+
+def test_prompt_too_long_triggers_compression(monkeypatch):
+    """Anthropic 'prompt is too long' error should trigger context compression, not immediate fail."""
+    _patch_agent_bootstrap(monkeypatch)
+    monkeypatch.setattr(
+        "agent.anthropic_adapter.build_anthropic_client", _fake_build_anthropic_client
+    )
+    monkeypatch.setenv("HERMES_TOOL_PROGRESS", "false")
+
+    class _PromptTooLongThenSuccessAgent(run_agent.AIAgent):
+        compress_called = 0
+
+        def __init__(self, *args, **kwargs):
+            kwargs.setdefault("skip_context_files", True)
+            kwargs.setdefault("skip_memory", True)
+            kwargs.setdefault("max_iterations", 4)
+            super().__init__(*args, **kwargs)
+            self._cleanup_task_resources = lambda task_id: None
+            self._persist_session = lambda messages, history=None: None
+            self._save_trajectory = lambda messages, user_message, completed: None
+            self._save_session_log = lambda messages: None
+
+        def _compress_context(self, messages, system_message, approx_tokens=0, task_id=None):
+            type(self).compress_called += 1
+            # Simulate compression by dropping oldest non-system message
+            if len(messages) > 2:
+                compressed = [messages[0]] + messages[2:]
+            else:
+                compressed = messages
+            return compressed, system_message
+
+        def run_conversation(self, user_message, conversation_history=None, task_id=None):
+            calls = {"n": 0}
+
+            def _fake_api_call(api_kwargs):
+                calls["n"] += 1
+                if calls["n"] == 1:
+                    raise _PromptTooLongError()
+                return _anthropic_response("Compressed and recovered")
+
+            self._interruptible_api_call = _fake_api_call
+            return super().run_conversation(
+                user_message, conversation_history=conversation_history, task_id=task_id
+            )
+
+    _PromptTooLongThenSuccessAgent.compress_called = 0
+    monkeypatch.setattr(run_agent, "AIAgent", _PromptTooLongThenSuccessAgent)
+    monkeypatch.setattr(
+        gateway_run,
+        "_resolve_runtime_agent_kwargs",
+        lambda: {
+            "provider": "anthropic",
+            "api_mode": "anthropic_messages",
+            "base_url": "https://api.anthropic.com",
+            "api_key": "sk-ant-api03-test-key",
+        },
+    )
+
+    runner = gateway_run.GatewayRunner.__new__(gateway_run.GatewayRunner)
+    runner.adapters = {}
+    runner._ephemeral_system_prompt = ""
+    runner._prefill_messages = []
+    runner._reasoning_config = None
+    runner._provider_routing = {}
+    runner._fallback_model = None
+    runner._running_agents = {}
+    runner.hooks = MagicMock()
+    runner.hooks.emit = AsyncMock()
+    runner.hooks.loaded_hooks = []
+    runner._session_db = None
+
+    source = SessionSource(
+        platform=Platform.LOCAL, chat_id="cli", chat_name="CLI",
+        chat_type="dm", user_id="test-user-1",
+    )
+
+    result = asyncio.run(
+        runner._run_agent(
+            message="hello", context_prompt="", history=[],
+            source=source, session_id="session-prompt-long",
+            session_key="agent:main:local:dm",
+        )
+    )
+
+    assert result["final_response"] == "Compressed and recovered"
+    assert _PromptTooLongThenSuccessAgent.compress_called >= 1
@@ -1,4 +1,4 @@
-"""Tests for API-key provider support (z.ai/GLM, Kimi, MiniMax)."""
+"""Tests for API-key provider support (z.ai/GLM, Kimi, MiniMax, AI Gateway)."""

 import os
 import sys
@@ -37,6 +37,7 @@ class TestProviderRegistry:
        ("kimi-coding", "Kimi / Moonshot", "api_key"),
        ("minimax", "MiniMax", "api_key"),
        ("minimax-cn", "MiniMax (China)", "api_key"),
+        ("ai-gateway", "AI Gateway", "api_key"),
    ])
    def test_provider_registered(self, provider_id, name, auth_type):
        assert provider_id in PROVIDER_REGISTRY
@@ -65,11 +66,17 @@ class TestProviderRegistry:
        assert pconfig.api_key_env_vars == ("MINIMAX_CN_API_KEY",)
        assert pconfig.base_url_env_var == "MINIMAX_CN_BASE_URL"

+    def test_ai_gateway_env_vars(self):
+        pconfig = PROVIDER_REGISTRY["ai-gateway"]
+        assert pconfig.api_key_env_vars == ("AI_GATEWAY_API_KEY",)
+        assert pconfig.base_url_env_var == "AI_GATEWAY_BASE_URL"
+
    def test_base_urls(self):
        assert PROVIDER_REGISTRY["zai"].inference_base_url == "https://api.z.ai/api/paas/v4"
        assert PROVIDER_REGISTRY["kimi-coding"].inference_base_url == "https://api.moonshot.ai/v1"
        assert PROVIDER_REGISTRY["minimax"].inference_base_url == "https://api.minimax.io/v1"
        assert PROVIDER_REGISTRY["minimax-cn"].inference_base_url == "https://api.minimaxi.com/v1"
+        assert PROVIDER_REGISTRY["ai-gateway"].inference_base_url == "https://ai-gateway.vercel.sh/v1"

    def test_oauth_providers_unchanged(self):
        """Ensure we didn't break the existing OAuth providers."""
@@ -87,6 +94,7 @@ PROVIDER_ENV_VARS = (
    "OPENROUTER_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY",
    "GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY",
    "KIMI_API_KEY", "KIMI_BASE_URL", "MINIMAX_API_KEY", "MINIMAX_CN_API_KEY",
+    "AI_GATEWAY_API_KEY", "AI_GATEWAY_BASE_URL",
    "OPENAI_BASE_URL",
 )

@@ -112,6 +120,9 @@ class TestResolveProvider:
    def test_explicit_minimax_cn(self):
        assert resolve_provider("minimax-cn") == "minimax-cn"

+    def test_explicit_ai_gateway(self):
+        assert resolve_provider("ai-gateway") == "ai-gateway"
+
    def test_alias_glm(self):
        assert resolve_provider("glm") == "zai"

@@ -130,6 +141,12 @@ class TestResolveProvider:
    def test_alias_minimax_underscore(self):
        assert resolve_provider("minimax_cn") == "minimax-cn"

+    def test_alias_aigateway(self):
+        assert resolve_provider("aigateway") == "ai-gateway"
+
+    def test_alias_vercel(self):
+        assert resolve_provider("vercel") == "ai-gateway"
+
    def test_alias_case_insensitive(self):
        assert resolve_provider("GLM") == "zai"
        assert resolve_provider("Z-AI") == "zai"
@@ -163,6 +180,10 @@ class TestResolveProvider:
        monkeypatch.setenv("MINIMAX_CN_API_KEY", "test-mm-cn-key")
        assert resolve_provider("auto") == "minimax-cn"

+    def test_auto_detects_ai_gateway_key(self, monkeypatch):
+        monkeypatch.setenv("AI_GATEWAY_API_KEY", "test-gw-key")
+        assert resolve_provider("auto") == "ai-gateway"
+
    def test_openrouter_takes_priority_over_glm(self, monkeypatch):
        """OpenRouter API key should win over GLM in auto-detection."""
        monkeypatch.setenv("OPENROUTER_API_KEY", "or-key")
@@ -248,6 +269,13 @@ class TestResolveApiKeyProviderCredentials:
        assert creds["api_key"] == "mmcn-secret-key"
        assert creds["base_url"] == "https://api.minimaxi.com/v1"

+    def test_resolve_ai_gateway_with_key(self, monkeypatch):
+        monkeypatch.setenv("AI_GATEWAY_API_KEY", "gw-secret-key")
+        creds = resolve_api_key_provider_credentials("ai-gateway")
+        assert creds["provider"] == "ai-gateway"
+        assert creds["api_key"] == "gw-secret-key"
+        assert creds["base_url"] == "https://ai-gateway.vercel.sh/v1"
+
    def test_resolve_with_custom_base_url(self, monkeypatch):
        monkeypatch.setenv("GLM_API_KEY", "glm-key")
        monkeypatch.setenv("GLM_BASE_URL", "https://custom.glm.example/v4")
@@ -309,6 +337,15 @@ class TestRuntimeProviderResolution:
        assert result["provider"] == "minimax"
        assert result["api_key"] == "mm-key"

+    def test_runtime_ai_gateway(self, monkeypatch):
+        monkeypatch.setenv("AI_GATEWAY_API_KEY", "gw-key")
+        from hermes_cli.runtime_provider import resolve_runtime_provider
+        result = resolve_runtime_provider(requested="ai-gateway")
+        assert result["provider"] == "ai-gateway"
+        assert result["api_mode"] == "chat_completions"
+        assert result["api_key"] == "gw-key"
+        assert "ai-gateway.vercel.sh" in result["base_url"]
+
    def test_runtime_auto_detects_api_key_provider(self, monkeypatch):
        monkeypatch.setenv("KIMI_API_KEY", "auto-kimi-key")
        from hermes_cli.runtime_provider import resolve_runtime_provider
@@ -64,8 +64,8 @@ class TestModelCommand:
            cli_obj.process_command("/model gpt-5.4")

        output = capsys.readouterr().out
-        # Model is accepted (with warning) even if not in API listing
-        assert cli_obj.model == "gpt-5.4"
+        # Auto-detection remaps bare model names to proper OpenRouter slugs
+        assert cli_obj.model == "openai/gpt-5.4"

    def test_validation_crash_falls_back_to_save(self, capsys):
        cli_obj = self._make_cli()
@@ -35,7 +35,9 @@ class TestSlashCommandPrefixMatching:
                raise RecursionError("process_command called too many times")
            return original(self_inner, cmd)

-        with patch.object(type(cli_obj), 'process_command', counting_process_command):
+        # Mock show_config since the test is about recursion, not config display
+        with patch.object(type(cli_obj), 'process_command', counting_process_command), \
+             patch.object(cli_obj, 'show_config'):
            try:
                cli_obj.process_command("/con set key value")
            except RecursionError:
@@ -57,7 +59,9 @@ class TestSlashCommandPrefixMatching:
                raise RecursionError("Infinite recursion detected")
            return original_pc(self_inner, cmd)

-        with patch.object(HermesCLI, 'process_command', guarded):
+        # Mock show_config since the test is about recursion, not config display
+        with patch.object(HermesCLI, 'process_command', guarded), \
+             patch.object(cli_obj, 'show_config'):
            try:
                cli_obj.process_command("/config set key value")
            except RecursionError:
@@ -162,6 +162,57 @@ def test_runtime_resolution_rebuilds_agent_on_routing_change(monkeypatch):
    assert shell.api_mode == "codex_responses"


+def test_cli_turn_routing_uses_primary_when_disabled(monkeypatch):
+    cli = _import_cli()
+    shell = cli.HermesCLI(model="gpt-5", compact=True, max_turns=1)
+    shell.provider = "openrouter"
+    shell.api_mode = "chat_completions"
+    shell.base_url = "https://openrouter.ai/api/v1"
+    shell.api_key = "sk-primary"
+    shell._smart_model_routing = {"enabled": False}
+
+    result = shell._resolve_turn_agent_config("what time is it in tokyo?")
+
+    assert result["model"] == "gpt-5"
+    assert result["runtime"]["provider"] == "openrouter"
+    assert result["label"] is None
+
+
+def test_cli_turn_routing_uses_cheap_model_when_simple(monkeypatch):
+    cli = _import_cli()
+
+    def _runtime_resolve(**kwargs):
+        assert kwargs["requested"] == "zai"
+        return {
+            "provider": "zai",
+            "api_mode": "chat_completions",
+            "base_url": "https://open.z.ai/api/v1",
+            "api_key": "cheap-key",
+            "source": "env/config",
+        }
+
+    monkeypatch.setattr("hermes_cli.runtime_provider.resolve_runtime_provider", _runtime_resolve)
+
+    shell = cli.HermesCLI(model="anthropic/claude-sonnet-4", compact=True, max_turns=1)
+    shell.provider = "openrouter"
+    shell.api_mode = "chat_completions"
+    shell.base_url = "https://openrouter.ai/api/v1"
+    shell.api_key = "primary-key"
+    shell._smart_model_routing = {
+        "enabled": True,
+        "cheap_model": {"provider": "zai", "model": "glm-5-air"},
+        "max_simple_chars": 160,
+        "max_simple_words": 28,
+    }
+
+    result = shell._resolve_turn_agent_config("what time is it in tokyo?")
+
+    assert result["model"] == "glm-5-air"
+    assert result["runtime"]["provider"] == "zai"
+    assert result["runtime"]["api_key"] == "cheap-key"
+    assert result["label"] is not None
+
+
 def test_cli_prefers_config_provider_over_stale_env_override(monkeypatch):
    cli = _import_cli()

@@ -0,0 +1,175 @@
+from datetime import datetime, timedelta
+from types import SimpleNamespace
+
+from cli import HermesCLI
+
+
+def _make_cli(model: str = "anthropic/claude-sonnet-4-20250514"):
+    cli_obj = HermesCLI.__new__(HermesCLI)
+    cli_obj.model = model
+    cli_obj.session_start = datetime.now() - timedelta(minutes=14, seconds=32)
+    cli_obj.conversation_history = [{"role": "user", "content": "hi"}]
+    cli_obj.agent = None
+    return cli_obj
+
+
+def _attach_agent(
+    cli_obj,
+    *,
+    prompt_tokens: int,
+    completion_tokens: int,
+    total_tokens: int,
+    api_calls: int,
+    context_tokens: int,
+    context_length: int,
+    compressions: int = 0,
+):
+    cli_obj.agent = SimpleNamespace(
+        model=cli_obj.model,
+        session_prompt_tokens=prompt_tokens,
+        session_completion_tokens=completion_tokens,
+        session_total_tokens=total_tokens,
+        session_api_calls=api_calls,
+        context_compressor=SimpleNamespace(
+            last_prompt_tokens=context_tokens,
+            context_length=context_length,
+            compression_count=compressions,
+        ),
+    )
+    return cli_obj
+
+
+class TestCLIStatusBar:
+    def test_context_style_thresholds(self):
+        cli_obj = _make_cli()
+
+        assert cli_obj._status_bar_context_style(None) == "class:status-bar-dim"
+        assert cli_obj._status_bar_context_style(10) == "class:status-bar-good"
+        assert cli_obj._status_bar_context_style(50) == "class:status-bar-warn"
+        assert cli_obj._status_bar_context_style(81) == "class:status-bar-bad"
+        assert cli_obj._status_bar_context_style(95) == "class:status-bar-critical"
+
+    def test_build_status_bar_text_for_wide_terminal(self):
+        cli_obj = _attach_agent(
+            _make_cli(),
+            prompt_tokens=10_230,
+            completion_tokens=2_220,
+            total_tokens=12_450,
+            api_calls=7,
+            context_tokens=12_450,
+            context_length=200_000,
+        )
+
+        text = cli_obj._build_status_bar_text(width=120)
+
+        assert "claude-sonnet-4-20250514" in text
+        assert "12.4K/200K" in text
+        assert "6%" in text
+        assert "$0.06" not in text  # cost hidden by default
+        assert "15m" in text
+
+    def test_build_status_bar_text_shows_cost_when_enabled(self):
+        cli_obj = _attach_agent(
+            _make_cli(),
+            prompt_tokens=10000,
+            completion_tokens=2400,
+            total_tokens=12400,
+            api_calls=7,
+            context_tokens=12400,
+            context_length=200_000,
+        )
+        cli_obj.show_cost = True
+
+        text = cli_obj._build_status_bar_text(width=120)
+        assert "$" in text  # cost is shown when enabled
+
+    def test_build_status_bar_text_collapses_for_narrow_terminal(self):
+        cli_obj = _attach_agent(
+            _make_cli(),
+            prompt_tokens=10000,
+            completion_tokens=2400,
+            total_tokens=12400,
+            api_calls=7,
+            context_tokens=12400,
+            context_length=200_000,
+        )
+
+        text = cli_obj._build_status_bar_text(width=60)
+
+        assert "⚕" in text
+        assert "$0.06" not in text  # cost hidden by default
+        assert "15m" in text
+        assert "200K" not in text
+
+    def test_build_status_bar_text_handles_missing_agent(self):
+        cli_obj = _make_cli()
+
+        text = cli_obj._build_status_bar_text(width=100)
+
+        assert "⚕" in text
+        assert "claude-sonnet-4-20250514" in text
+
+
+class TestCLIUsageReport:
+    def test_show_usage_includes_estimated_cost(self, capsys):
+        cli_obj = _attach_agent(
+            _make_cli(),
+            prompt_tokens=10_230,
+            completion_tokens=2_220,
+            total_tokens=12_450,
+            api_calls=7,
+            context_tokens=12_450,
+            context_length=200_000,
+            compressions=1,
+        )
+        cli_obj.verbose = False
+
+        cli_obj._show_usage()
+        output = capsys.readouterr().out
+
+        assert "Model:" in output
+        assert "Input cost:" in output
+        assert "Output cost:" in output
+        assert "Total cost:" in output
+        assert "$" in output
+        assert "0.064" in output
+        assert "Session duration:" in output
+        assert "Compressions:" in output
+
+    def test_show_usage_marks_unknown_pricing(self, capsys):
+        cli_obj = _attach_agent(
+            _make_cli(model="local/my-custom-model"),
+            prompt_tokens=1_000,
+            completion_tokens=500,
+            total_tokens=1_500,
+            api_calls=1,
+            context_tokens=1_000,
+            context_length=32_000,
+        )
+        cli_obj.verbose = False
+
+        cli_obj._show_usage()
+        output = capsys.readouterr().out
+
+        assert "Total cost:" in output
+        assert "n/a" in output
+        assert "Pricing unknown for local/my-custom-model" in output
+
+    def test_zero_priced_provider_models_stay_unknown(self, capsys):
+        cli_obj = _attach_agent(
+            _make_cli(model="glm-5"),
+            prompt_tokens=1_000,
+            completion_tokens=500,
+            total_tokens=1_500,
+            api_calls=1,
+            context_tokens=1_000,
+            context_length=32_000,
+        )
+        cli_obj.verbose = False
+
+        cli_obj._show_usage()
+        output = capsys.readouterr().out
+
+        assert "Total cost:" in output
+        assert "n/a" in output
+        assert "Pricing unknown for glm-5" in output
@@ -0,0 +1,186 @@
+import os
+import json
+import pytest
+from pathlib import Path
+import importlib.util
+
+# Load the hyphenated script name dynamically
+repo_root = Path(__file__).parent.parent
+script_path = repo_root / "optional-skills" / "security" / "oss-forensics" / "scripts" / "evidence-store.py"
+
+spec = importlib.util.spec_from_file_location("evidence_store", str(script_path))
+evidence_store = importlib.util.module_from_spec(spec)
+spec.loader.exec_module(evidence_store)
+EvidenceStore = evidence_store.EvidenceStore
+
+
+def test_evidence_store_init(tmp_path):
+    store_file = tmp_path / "test_evidence.json"
+    store = EvidenceStore(str(store_file))
+    assert store.filepath == str(store_file)
+    assert len(store.data["evidence"]) == 0
+    assert "metadata" in store.data
+    assert store.data["metadata"]["version"] == "2.0"
+    assert "chain_of_custody" in store.data
+
+
+def test_evidence_store_add(tmp_path):
+    store_file = tmp_path / "test_evidence.json"
+    store = EvidenceStore(str(store_file))
+
+    eid = store.add(
+        source="test_source",
+        content="test_content",
+        evidence_type="git",
+        actor="test_actor",
+        notes="test_notes",
+    )
+
+    assert eid == "EV-0001"
+    assert len(store.data["evidence"]) == 1
+    assert store.data["evidence"][0]["content"] == "test_content"
+    assert store.data["evidence"][0]["id"] == "EV-0001"
+    assert store.data["evidence"][0]["actor"] == "test_actor"
+    assert store.data["evidence"][0]["notes"] == "test_notes"
+    # Verify SHA-256 was computed
+    assert store.data["evidence"][0]["content_sha256"] is not None
+    assert len(store.data["evidence"][0]["content_sha256"]) == 64
+
+
+def test_evidence_store_add_persists(tmp_path):
+    store_file = tmp_path / "test_evidence.json"
+    store = EvidenceStore(str(store_file))
+    store.add(source="s1", content="c1", evidence_type="git")
+
+    # Reload from disk
+    store2 = EvidenceStore(str(store_file))
+    assert len(store2.data["evidence"]) == 1
+    assert store2.data["evidence"][0]["id"] == "EV-0001"
+
+
+def test_evidence_store_sequential_ids(tmp_path):
+    store_file = tmp_path / "test_evidence.json"
+    store = EvidenceStore(str(store_file))
+
+    eid1 = store.add(source="s1", content="c1", evidence_type="git")
+    eid2 = store.add(source="s2", content="c2", evidence_type="gh_api")
+    eid3 = store.add(source="s3", content="c3", evidence_type="ioc")
+
+    assert eid1 == "EV-0001"
+    assert eid2 == "EV-0002"
+    assert eid3 == "EV-0003"
+
+
+def test_evidence_store_list(tmp_path):
+    store_file = tmp_path / "test_evidence.json"
+    store = EvidenceStore(str(store_file))
+
+    store.add(source="s1", content="c1", evidence_type="git", actor="a1")
+    store.add(source="s2", content="c2", evidence_type="gh_api", actor="a2")
+
+    all_evidence = store.list_evidence()
+    assert len(all_evidence) == 2
+
+    git_evidence = store.list_evidence(filter_type="git")
+    assert len(git_evidence) == 1
+    assert git_evidence[0]["actor"] == "a1"
+
+    actor_evidence = store.list_evidence(filter_actor="a2")
+    assert len(actor_evidence) == 1
+    assert actor_evidence[0]["type"] == "gh_api"
+
+
+def test_evidence_store_verify_integrity(tmp_path):
+    store_file = tmp_path / "test_evidence.json"
+    store = EvidenceStore(str(store_file))
+
+    store.add(source="s1", content="c1", evidence_type="git")
+    assert len(store.verify_integrity()) == 0
+
+    # Manually corrupt the content to trigger a hash mismatch
+    store.data["evidence"][0]["content"] = "corrupted_content"
+    issues = store.verify_integrity()
+    assert len(issues) == 1
+    assert issues[0]["id"] == "EV-0001"
+
+
+def test_evidence_store_query(tmp_path):
+    store_file = tmp_path / "test_evidence.json"
+    store = EvidenceStore(str(store_file))
+
+    store.add(source="github_api", content="malicious activity detected", evidence_type="gh_api")
+    store.add(source="manual", content="clean observation", evidence_type="manual")
+
+    results = store.query("malicious")
+    assert len(results) == 1
+    assert results[0]["source"] == "github_api"
+
+    # Query should be case-insensitive
+    results = store.query("MALICIOUS")
+    assert len(results) == 1
+
+
+def test_evidence_store_query_searches_multiple_fields(tmp_path):
+    store_file = tmp_path / "test_evidence.json"
+    store = EvidenceStore(str(store_file))
+
+    store.add(source="git_fsck", content="dangling commit abc123", evidence_type="git", actor="attacker")
+    store.add(source="manual", content="clean", evidence_type="manual")
+
+    # Search by source
+    assert len(store.query("fsck")) == 1
+    # Search by actor
+    assert len(store.query("attacker")) == 1
+    # Search returns nothing for non-matching
+    assert len(store.query("nonexistent")) == 0
+
+
+def test_evidence_store_chain_of_custody(tmp_path):
+    store_file = tmp_path / "test_evidence.json"
+    store = EvidenceStore(str(store_file))
+
+    store.add(source="s1", content="c1", evidence_type="git")
+    store.add(source="s2", content="c2", evidence_type="gh_api")
+
+    chain = store.data["chain_of_custody"]
+    assert len(chain) == 2
+    assert chain[0]["evidence_id"] == "EV-0001"
+    assert chain[0]["action"] == "add"
+    assert chain[1]["evidence_id"] == "EV-0002"
+
+
+def test_evidence_store_export_markdown(tmp_path):
+    store_file = tmp_path / "test_evidence.json"
+    store = EvidenceStore(str(store_file))
+
+    store.add(source="git_log", content="suspicious commit", evidence_type="git", actor="actor1")
+
+    md = store.export_markdown()
+    assert "# Evidence Registry" in md
+    assert "EV-0001" in md
+    assert "Chain of Custody" in md
+    assert "actor1" in md
+
+
+def test_evidence_store_summary(tmp_path):
+    store_file = tmp_path / "test_evidence.json"
+    store = EvidenceStore(str(store_file))
+
+    store.add(source="s1", content="c1", evidence_type="git", actor="a1")
+    store.add(source="s2", content="c2", evidence_type="git", actor="a2")
+    store.add(source="s3", content="c3", evidence_type="gh_api", actor="a1")
+
+    s = store.summary()
+    assert s["total"] == 3
+    assert s["by_type"]["git"] == 2
+    assert s["by_type"]["gh_api"] == 1
+    assert "a1" in s["unique_actors"]
+    assert "a2" in s["unique_actors"]
+
+
+def test_evidence_store_corrupted_file(tmp_path):
+    store_file = tmp_path / "test_evidence.json"
+    store_file.write_text("NOT VALID JSON {{{")
+
+    with pytest.raises(SystemExit):
+        EvidenceStore(str(store_file))
--- a/Show More
+++ b/Show More