Merge remote-tracking branch 'origin/main' into feat/dashboard-chat

docs: document the dashboard Chat tab
AGENTS.md: new subsection under TUI Architecture explaining that the dashboard embeds the real hermes --tui rather than rewriting it, with pointers to the pty_bridge + WebSocket endpoint and the rule 'never add a parallel chat surface in React.'

website/docs/user-guide/features/web-dashboard.md: user-facing Chat section inside the existing Web Dashboard page, covering how it works (WebSocket + PTY + xterm.js), the Sessions-page resume flow, and prerequisites (Node.js, ptyprocess, POSIX kernel / WSL on Windows).
2026-04-22 21:42:14 -04:00 · 2026-04-21 03:10:30 -04:00 · 2026-04-21 03:10:30 -04:00 · 2026-04-21 03:10:30 -04:00 · 2026-04-21 02:48:16 -04:00
357 changed files with 2989 additions and 61273 deletions
@@ -14,6 +14,3 @@ node_modules
 .env

 *.md
-
-# Runtime data (bind-mounted at /opt/data; must not leak into build context)
-data/
@@ -53,9 +53,6 @@ jobs:
      - name: Extract skill metadata for dashboard
        run: python3 website/scripts/extract-skills.py

-      - name: Regenerate per-skill docs pages + catalogs
-        run: python3 website/scripts/generate-skill-docs.py
-
      - name: Build skills index (if not already present)
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
@@ -36,9 +36,6 @@ jobs:
      - name: Extract skill metadata for dashboard
        run: python3 website/scripts/extract-skills.py

-      - name: Regenerate per-skill docs pages + catalogs
-        run: python3 website/scripts/generate-skill-docs.py
-
      - name: Lint docs diagrams
        run: npm run lint:diagrams
        working-directory: website
@@ -1,4 +1,3 @@
-.DS_Store
 /venv/
 /_pycache/
 *.pyc*
@@ -5,61 +5,78 @@ Instructions for AI coding assistants and developers working on the hermes-agent
 ## Development Environment

 ```bash
-# Prefer .venv; fall back to venv if that's what your checkout has.
-source .venv/bin/activate   # or: source venv/bin/activate
+source venv/bin/activate  # ALWAYS activate before running Python
 ```

-`scripts/run_tests.sh` probes `.venv` first, then `venv`, then
-`$HOME/.hermes/hermes-agent/venv` (for worktrees that share a venv with the
-main checkout).
-
 ## Project Structure

-File counts shift constantly — don't treat the tree below as exhaustive.
-The canonical source is the filesystem. The notes call out the load-bearing
-entry points you'll actually edit.
-
 ```
 hermes-agent/
-├── run_agent.py          # AIAgent class — core conversation loop (~12k LOC)
+├── run_agent.py          # AIAgent class — core conversation loop
 ├── model_tools.py        # Tool orchestration, discover_builtin_tools(), handle_function_call()
 ├── toolsets.py           # Toolset definitions, _HERMES_CORE_TOOLS list
-├── cli.py                # HermesCLI class — interactive CLI orchestrator (~11k LOC)
+├── cli.py                # HermesCLI class — interactive CLI orchestrator
 ├── hermes_state.py       # SessionDB — SQLite session store (FTS5 search)
-├── hermes_constants.py   # get_hermes_home(), display_hermes_home() — profile-aware paths
-├── hermes_logging.py     # setup_logging() — agent.log / errors.log / gateway.log (profile-aware)
-├── batch_runner.py       # Parallel batch processing
-├── agent/                # Agent internals (provider adapters, memory, caching, compression, etc.)
-├── hermes_cli/           # CLI subcommands, setup wizard, plugins loader, skin engine
-├── tools/                # Tool implementations — auto-discovered via tools/registry.py
+├── agent/                # Agent internals
+│   ├── prompt_builder.py     # System prompt assembly
+│   ├── context_compressor.py # Auto context compression
+│   ├── prompt_caching.py     # Anthropic prompt caching
+│   ├── auxiliary_client.py   # Auxiliary LLM client (vision, summarization)
+│   ├── model_metadata.py     # Model context lengths, token estimation
+│   ├── models_dev.py         # models.dev registry integration (provider-aware context)
+│   ├── display.py            # KawaiiSpinner, tool preview formatting
+│   ├── skill_commands.py     # Skill slash commands (shared CLI/gateway)
+│   └── trajectory.py         # Trajectory saving helpers
+├── hermes_cli/           # CLI subcommands and setup
+│   ├── main.py           # Entry point — all `hermes` subcommands
+│   ├── config.py         # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
+│   ├── commands.py       # Slash command definitions + SlashCommandCompleter
+│   ├── callbacks.py      # Terminal callbacks (clarify, sudo, approval)
+│   ├── setup.py          # Interactive setup wizard
+│   ├── skin_engine.py    # Skin/theme engine — CLI visual customization
+│   ├── skills_config.py  # `hermes skills` — enable/disable skills per platform
+│   ├── tools_config.py   # `hermes tools` — enable/disable tools per platform
+│   ├── skills_hub.py     # `/skills` slash command (search, browse, install)
+│   ├── models.py         # Model catalog, provider model lists
+│   ├── model_switch.py   # Shared /model switch pipeline (CLI + gateway)
+│   └── auth.py           # Provider credential resolution
+├── tools/                # Tool implementations (one file per tool)
+│   ├── registry.py       # Central tool registry (schemas, handlers, dispatch)
+│   ├── approval.py       # Dangerous command detection
+│   ├── terminal_tool.py  # Terminal orchestration
+│   ├── process_registry.py # Background process management
+│   ├── file_tools.py     # File read/write/search/patch
+│   ├── web_tools.py      # Web search/extract (Parallel + Firecrawl)
+│   ├── browser_tool.py   # Browserbase browser automation
+│   ├── code_execution_tool.py # execute_code sandbox
+│   ├── delegate_tool.py  # Subagent delegation
+│   ├── mcp_tool.py       # MCP client (~1050 lines)
 │   └── environments/     # Terminal backends (local, docker, ssh, modal, daytona, singularity)
-├── gateway/              # Messaging gateway — run.py + session.py + platforms/
-│   ├── platforms/        # Adapter per platform (telegram, discord, slack, whatsapp,
-│   │                     #   homeassistant, signal, matrix, mattermost, email, sms,
-│   │                     #   dingtalk, wecom, weixin, feishu, qqbot, bluebubbles,
-│   │                     #   webhook, api_server, ...). See ADDING_A_PLATFORM.md.
-│   └── builtin_hooks/    # Always-registered gateway hooks (boot-md, ...)
-├── plugins/              # Plugin system (see "Plugins" section below)
-│   ├── memory/           # Memory-provider plugins (honcho, mem0, supermemory, ...)
-│   ├── context_engine/   # Context-engine plugins
-│   └── <others>/         # Dashboard, image-gen, disk-cleanup, examples, ...
-├── optional-skills/      # Heavier/niche skills shipped but NOT active by default
-├── skills/               # Built-in skills bundled with the repo
+├── gateway/              # Messaging platform gateway
+│   ├── run.py            # Main loop, slash commands, message dispatch
+│   ├── session.py        # SessionStore — conversation persistence
+│   └── platforms/        # Adapters: telegram, discord, slack, whatsapp, homeassistant, signal, qqbot
 ├── ui-tui/               # Ink (React) terminal UI — `hermes --tui`
-│   └── src/              # entry.tsx, app.tsx, gatewayClient.ts + app/components/hooks/lib
+│   ├── src/entry.tsx        # TTY gate + render()
+│   ├── src/app.tsx          # Main state machine and UI
+│   ├── src/gatewayClient.ts # Child process + JSON-RPC bridge
+│   ├── src/app/             # Decomposed app logic (event handler, slash handler, stores, hooks)
+│   ├── src/components/      # Ink components (branding, markdown, prompts, pickers, etc.)
+│   ├── src/hooks/           # useCompletion, useInputHistory, useQueue, useVirtualHistory
+│   └── src/lib/             # Pure helpers (history, osc52, text, rpc, messages)
 ├── tui_gateway/          # Python JSON-RPC backend for the TUI
+│   ├── entry.py             # stdio entrypoint
+│   ├── server.py            # RPC handlers and session logic
+│   ├── render.py            # Optional rich/ANSI bridge
+│   └── slash_worker.py      # Persistent HermesCLI subprocess for slash commands
 ├── acp_adapter/          # ACP server (VS Code / Zed / JetBrains integration)
-├── cron/                 # Scheduler — jobs.py, scheduler.py
+├── cron/                 # Scheduler (jobs.py, scheduler.py)
 ├── environments/         # RL training environments (Atropos)
-├── scripts/              # run_tests.sh, release.py, auxiliary scripts
-├── website/              # Docusaurus docs site
-└── tests/                # Pytest suite (~15k tests across ~700 files as of Apr 2026)
+├── tests/                # Pytest suite (~3000 tests)
+└── batch_runner.py       # Parallel batch processing
 ```

-**User config:** `~/.hermes/config.yaml` (settings), `~/.hermes/.env` (API keys only).
-**Logs:** `~/.hermes/logs/` — `agent.log` (INFO+), `errors.log` (WARNING+),
-`gateway.log` when running the gateway. Profile-aware via `get_hermes_home()`.
-Browse with `hermes logs [--follow] [--level ...] [--session ...]`.
+**User config:** `~/.hermes/config.yaml` (settings), `~/.hermes/.env` (API keys)

 ## File Dependency Chain

@@ -77,30 +94,20 @@ run_agent.py, cli.py, batch_runner.py, environments/

 ## AIAgent Class (run_agent.py)

-The real `AIAgent.__init__` takes ~60 parameters (credentials, routing, callbacks,
-session context, budget, credential pool, etc.). The signature below is the
-minimum subset you'll usually touch — read `run_agent.py` for the full list.
-
 ```python
 class AIAgent:
    def __init__(self,
-        base_url: str = None,
-        api_key: str = None,
-        provider: str = None,
-        api_mode: str = None,              # "chat_completions" | "codex_responses" | ...
-        model: str = "",                   # empty → resolved from config/provider later
-        max_iterations: int = 90,          # tool-calling iterations (shared with subagents)
+        model: str = "anthropic/claude-opus-4.6",
+        max_iterations: int = 90,
        enabled_toolsets: list = None,
        disabled_toolsets: list = None,
        quiet_mode: bool = False,
        save_trajectories: bool = False,
-        platform: str = None,              # "cli", "telegram", etc.
+        platform: str = None,           # "cli", "telegram", etc.
        session_id: str = None,
        skip_context_files: bool = False,
        skip_memory: bool = False,
-        credential_pool=None,
-        # ... plus callbacks, thread/user/chat IDs, iteration_budget, fallback_model,
-        # checkpoints config, prefill_messages, service_tier, reasoning_config, etc.
+        # ... plus provider, api_mode, callbacks, routing params
    ): ...

    def chat(self, message: str) -> str:
@@ -113,13 +120,10 @@ class AIAgent:

 ### Agent Loop

-The core loop is inside `run_conversation()` — entirely synchronous, with
-interrupt checks, budget tracking, and a one-turn grace call:
+The core loop is inside `run_conversation()` — entirely synchronous:

 ```python
-while (api_call_count < self.max_iterations and self.iteration_budget.remaining > 0) \
-        or self._budget_grace_call:
-    if self._interrupt_requested: break
+while api_call_count < self.max_iterations and self.iteration_budget.remaining > 0:
    response = client.chat.completions.create(model=model, messages=messages, tools=tool_schemas)
    if response.tool_calls:
        for tool_call in response.tool_calls:
@@ -130,8 +134,7 @@ while (api_call_count < self.max_iterations and self.iteration_budget.remaining
        return response.content
 ```

-Messages follow OpenAI format: `{"role": "system/user/assistant/tool", ...}`.
-Reasoning content is stored in `assistant_msg["reasoning"]`.
+Messages follow OpenAI format: `{"role": "system/user/assistant/tool", ...}`. Reasoning content is stored in `assistant_msg["reasoning"]`.

 ---
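The loop shown in the hunk above can be exercised end to end with a toy model. This is a simplified sketch, not the code in `run_agent.py`: `FakeResponse`, `run_loop`, and the scripted model are hypothetical, the iteration budget and interrupt checks are left out, and tool results are appended in a reduced form of the OpenAI message format.

```python
from dataclasses import dataclass, field


@dataclass
class FakeResponse:
    """Hypothetical stand-in for a model response."""
    content: str = ""
    tool_calls: list = field(default_factory=list)  # list of (name, kwargs) tuples


def run_loop(call_model, tools: dict, messages: list, max_iterations: int = 90) -> str:
    """Call the model until it answers without tool calls or the cap is hit."""
    api_call_count = 0
    while api_call_count < max_iterations:
        response = call_model(messages)
        api_call_count += 1
        if response.tool_calls:
            # Run each requested tool and feed the result back as a tool message.
            for name, kwargs in response.tool_calls:
                result = tools[name](**kwargs)
                messages.append({"role": "tool", "name": name, "content": str(result)})
        else:
            return response.content
    return ""  # iteration cap reached without a final answer


# Scripted model: first requests a tool call, then produces the final answer.
script = iter([
    FakeResponse(tool_calls=[("add", {"a": 2, "b": 3})]),
    FakeResponse(content="The sum is 5."),
])
messages = [{"role": "user", "content": "What is 2 + 3?"}]
answer = run_loop(lambda msgs: next(script), {"add": lambda a, b: a + b}, messages)
```

Because the loop is synchronous, the cap on `api_call_count` is the only thing in this sketch that bounds a model that keeps requesting tools.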

@@ -240,6 +243,17 @@ npm run fmt       # prettier
 npm test          # vitest
 ```

+### TUI in the Dashboard (`hermes dashboard` → `/chat`)
+
+The dashboard embeds the real `hermes --tui` — **not** a rewrite.  See `hermes_cli/pty_bridge.py` + the `@app.websocket("/api/pty")` endpoint in `hermes_cli/web_server.py`.
+
+- Browser loads `web/src/pages/ChatPage.tsx`, which mounts xterm.js's `Terminal` with the WebGL renderer, `@xterm/addon-fit` for container-driven resize, and `@xterm/addon-unicode11` for modern wide-character widths.
+- `/api/pty?token=…` upgrades to a WebSocket; auth uses the same ephemeral `_SESSION_TOKEN` as REST, via query param (browsers can't set `Authorization` on WS upgrade).
+- The server spawns whatever `hermes --tui` would spawn, through `ptyprocess` (POSIX PTY — WSL works, native Windows does not).
+- Frames: raw PTY bytes each direction; resize via `\x1b[RESIZE:<cols>;<rows>]` intercepted on the server and applied with `TIOCSWINSZ`.
+
+**Never add a parallel chat surface in React.** If you catch yourself re-implementing slash popover / model picker / tool cards for the dashboard, stop — the TUI already does those, and anything new you add to Ink will appear in the dashboard automatically.
+
 ---

 ## Adding New Tools
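The resize framing described in the "TUI in the Dashboard" hunk above can be sketched as follows. This is a minimal illustration, not the code in `hermes_cli/pty_bridge.py`: `RESIZE_RE`, `split_resize_frames`, and `winsize_bytes` are hypothetical names, and the sketch assumes each resize frame arrives whole inside a single WebSocket message.

```python
import re
import struct

# Hypothetical pattern for the in-band resize frame: ESC [ RESIZE:<cols>;<rows> ]
RESIZE_RE = re.compile(rb"\x1b\[RESIZE:(\d+);(\d+)\]")


def split_resize_frames(data: bytes) -> tuple[bytes, list[tuple[int, int]]]:
    """Strip resize frames from data; return (bytes for the PTY, [(cols, rows), ...])."""
    sizes = [(int(m.group(1)), int(m.group(2))) for m in RESIZE_RE.finditer(data)]
    return RESIZE_RE.sub(b"", data), sizes


def winsize_bytes(cols: int, rows: int) -> bytes:
    """Pack a struct winsize (rows, cols, xpixel, ypixel) for the TIOCSWINSZ ioctl."""
    return struct.pack("HHHH", rows, cols, 0, 0)


# On a POSIX host the packed struct would be applied to the PTY master fd with:
#   fcntl.ioctl(master_fd, termios.TIOCSWINSZ, winsize_bytes(cols, rows))
payload, sizes = split_resize_frames(b"ls\x1b[RESIZE:120;40]\n")
```

Note the ordering: the frame carries columns first, while `struct winsize` expects rows first, so the swap happens in `winsize_bytes`.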
@@ -277,7 +291,7 @@ The registry handles schema collection, dispatch, availability checking, and err

 **State files**: If a tool stores persistent state (caches, logs, checkpoints), use `get_hermes_home()` for the base directory — never `Path.home() / ".hermes"`. This ensures each profile gets its own state.

-**Agent-level tools** (todo, memory): intercepted by `run_agent.py` before `handle_function_call()`. See `tools/todo_tool.py` for the pattern.
+**Agent-level tools** (todo, memory): intercepted by `run_agent.py` before `handle_function_call()`. See `todo_tool.py` for the pattern.

 ---

@@ -285,13 +299,9 @@ The registry handles schema collection, dispatch, availability checking, and err

 ### config.yaml options:
 1. Add to `DEFAULT_CONFIG` in `hermes_cli/config.py`
-2. Bump `_config_version` (check the current value at the top of `DEFAULT_CONFIG`)
-   ONLY if you need to actively migrate/transform existing user config
-   (renaming keys, changing structure). Adding a new key to an existing
-   section is handled automatically by the deep-merge and does NOT require
-   a version bump.
+2. Bump `_config_version` (currently 5) to trigger migration for existing users

-### .env variables (SECRETS ONLY — API keys, tokens, passwords):
+### .env variables:
 1. Add to `OPTIONAL_ENV_VARS` in `hermes_cli/config.py` with metadata:
 ```python
 "NEW_API_KEY": {
@@ -303,29 +313,13 @@ The registry handles schema collection, dispatch, availability checking, and err
 },
 ```

-Non-secret settings (timeouts, thresholds, feature flags, paths, display
-preferences) belong in `config.yaml`, not `.env`. If internal code needs an
-env var mirror for backward compatibility, bridge it from `config.yaml` to
-the env var in code (see `gateway_timeout`, `terminal.cwd` → `TERMINAL_CWD`).
-
-### Config loaders (three paths — know which one you're in):
+### Config loaders (two separate systems):

 | Loader | Used by | Location |
 |--------|---------|----------|
-| `load_cli_config()` | CLI mode | `cli.py` — merges CLI-specific defaults + user YAML |
-| `load_config()` | `hermes tools`, `hermes setup`, most CLI subcommands | `hermes_cli/config.py` — merges `DEFAULT_CONFIG` + user YAML |
-| Direct YAML load | Gateway runtime | `gateway/run.py` + `gateway/config.py` — reads user YAML raw |
-
-If you add a new key and the CLI sees it but the gateway doesn't (or vice
-versa), you're on the wrong loader. Check `DEFAULT_CONFIG` coverage.
-
-### Working directory:
-- **CLI** — uses the process's current directory (`os.getcwd()`).
-- **Messaging** — uses `terminal.cwd` from `config.yaml`. The gateway bridges this
-  to the `TERMINAL_CWD` env var for child tools. **`MESSAGING_CWD` has been
-  removed** — the config loader prints a deprecation warning if it's set in
-  `.env`. Same for `TERMINAL_CWD` in `.env`; the canonical setting is
-  `terminal.cwd` in `config.yaml`.
+| `load_cli_config()` | CLI mode | `cli.py` |
+| `load_config()` | `hermes tools`, `hermes setup` | `hermes_cli/config.py` |
+| Direct YAML load | Gateway | `gateway/run.py` |

 ---
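The removed migration note in the hunk above says that a new key added to an existing section is picked up by the deep-merge, so only renames or structural changes need a `_config_version` bump. A small sketch of that behavior, under stated assumptions: `deep_merge` and the sample keys are hypothetical, and the actual merge logic in `hermes_cli/config.py` may differ.

```python
import copy


def deep_merge(defaults: dict, user: dict) -> dict:
    """Recursively overlay user values on defaults without mutating either input."""
    merged = copy.deepcopy(defaults)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative values only, not the real DEFAULT_CONFIG.
defaults = {"display": {"skin": "default", "compact": False}}
user_yaml = {"display": {"skin": "cyberpunk"}}
config = deep_merge(defaults, user_yaml)
```

Here the user's file predates the `compact` key, yet the merged result still contains it with its default value, which is why no migration step is needed for additive changes.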

@@ -418,95 +412,7 @@ Activate with `/skin cyberpunk` or `display.skin: cyberpunk` in config.yaml.

 ---

-## Plugins
-
-Hermes has two plugin surfaces. Both live under `plugins/` in the repo so
-repo-shipped plugins can be discovered alongside user-installed ones in
-`~/.hermes/plugins/` and pip-installed entry points.
-
-### General plugins (`hermes_cli/plugins.py` + `plugins/<name>/`)
-
-`PluginManager` discovers plugins from `~/.hermes/plugins/`, `./.hermes/plugins/`,
-and pip entry points. Each plugin exposes a `register(ctx)` function that
-can:
-
-- Register Python-callback lifecycle hooks:
-  `pre_tool_call`, `post_tool_call`, `pre_llm_call`, `post_llm_call`,
-  `on_session_start`, `on_session_end`
-- Register new tools via `ctx.register_tool(...)`
-- Register CLI subcommands via `ctx.register_cli_command(...)` — the
-  plugin's argparse tree is wired into `hermes` at startup so
-  `hermes <pluginname> <subcmd>` works with no change to `main.py`
-
-Hooks are invoked from `model_tools.py` (pre/post tool) and `run_agent.py`
-(lifecycle). **Discovery timing pitfall:** `discover_plugins()` only runs
-as a side effect of importing `model_tools.py`. Code paths that read plugin
-state without importing `model_tools.py` first must call `discover_plugins()`
-explicitly (it's idempotent).
-
-### Memory-provider plugins (`plugins/memory/<name>/`)
-
-Separate discovery system for pluggable memory backends. Current built-in
-providers include **honcho, mem0, supermemory, byterover, hindsight,
-holographic, openviking, retaindb**.
-
-Each provider implements the `MemoryProvider` ABC (see `agent/memory_provider.py`)
-and is orchestrated by `agent/memory_manager.py`. Lifecycle hooks include
-`sync_turn(turn_messages)`, `prefetch(query)`, `shutdown()`, and optional
-`post_setup(hermes_home, config)` for setup-wizard integration.
-
-**CLI commands via `plugins/memory/<name>/cli.py`:** if a memory plugin
-defines `register_cli(subparser)`, `discover_plugin_cli_commands()` finds
-it at argparse setup time and wires it into `hermes <plugin>`. The
-framework only exposes CLI commands for the **currently active** memory
-provider (read from `memory.provider` in config.yaml), so disabled
-providers don't clutter `hermes --help`.
-
-**Rule (Teknium, May 2026):** plugins MUST NOT modify core files
-(`run_agent.py`, `cli.py`, `gateway/run.py`, `hermes_cli/main.py`, etc.).
-If a plugin needs a capability the framework doesn't expose, expand the
-generic plugin surface (new hook, new ctx method) — never hardcode
-plugin-specific logic into core. PR #5295 removed 95 lines of hardcoded
-honcho argparse from `main.py` for exactly this reason.
-
-### Dashboard / context-engine / image-gen plugin directories
-
-`plugins/context_engine/`, `plugins/image_gen/`, `plugins/example-dashboard/`,
-etc. follow the same pattern (ABC + orchestrator + per-plugin directory).
-Context engines plug into `agent/context_engine.py`; image-gen providers
-into `agent/image_gen_provider.py`.
-
----
-
-## Skills
-
-Two parallel surfaces:
-
-- **`skills/`** — built-in skills shipped and loadable by default.
-  Organized by category directories (e.g. `skills/github/`, `skills/mlops/`).
-- **`optional-skills/`** — heavier or niche skills shipped with the repo but
-  NOT active by default. Installed explicitly via
-  `hermes skills install official/<category>/<skill>`. Adapter lives in
-  `tools/skills_hub.py` (`OptionalSkillSource`). Categories include
-  `autonomous-ai-agents`, `blockchain`, `communication`, `creative`,
-  `devops`, `email`, `health`, `mcp`, `migration`, `mlops`, `productivity`,
-  `research`, `security`, `web-development`.
-
-When reviewing skill PRs, check which directory they target — heavy-dep or
-niche skills belong in `optional-skills/`.
-
-### SKILL.md frontmatter
-
-Standard fields: `name`, `description`, `version`, `platforms`
-(OS-gating list: `[macos]`, `[linux, macos]`, ...),
-`metadata.hermes.tags`, `metadata.hermes.category`,
-`metadata.hermes.config` (config.yaml settings the skill needs — stored
-under `skills.config.<key>`, prompted during setup, injected at load time).
-
----
-
 ## Important Policies
-
 ### Prompt Caching Must Not Break

 Hermes-Agent ensures caching remains valid throughout a conversation. **Do NOT implement changes that would:**
@@ -516,10 +422,9 @@ Hermes-Agent ensures caching remains valid throughout a conversation. **Do NOT i

 Cache-breaking forces dramatically higher costs. The ONLY time we alter context is during context compression.

-Slash commands that mutate system-prompt state (skills, tools, memory, etc.)
-must be **cache-aware**: default to deferred invalidation (change takes
-effect next session), with an opt-in `--now` flag for immediate
-invalidation. See `/skills install --now` for the canonical pattern.
+### Working Directory Behavior
+- **CLI**: Uses current directory (`.` → `os.getcwd()`)
+- **Messaging**: Uses `MESSAGING_CWD` env var (default: home directory)

 ### Background Process Notifications (Gateway)

@@ -541,7 +446,7 @@ Hermes supports **profiles** — multiple fully isolated instances, each with it
 `HERMES_HOME` directory (config, API keys, memory, sessions, skills, gateway, etc.).

 The core mechanism: `_apply_profile_override()` in `hermes_cli/main.py` sets
-`HERMES_HOME` before any module imports. All `get_hermes_home()` references
+`HERMES_HOME` before any module imports. All 119+ references to `get_hermes_home()`
 automatically scope to the active profile.

 ### Rules for profile-safe code
@@ -598,12 +503,8 @@ Use `get_hermes_home()` from `hermes_constants` for code paths. Use `display_her
 for user-facing print/log messages. Hardcoding `~/.hermes` breaks profiles — each profile
 has its own `HERMES_HOME` directory. This was the source of 5 bugs fixed in PR #3575.

-### DO NOT introduce new `simple_term_menu` usage
-Existing call sites in `hermes_cli/main.py` remain for legacy fallback only;
-the preferred UI is curses (stdlib) because `simple_term_menu` has
-ghost-duplication rendering bugs in tmux/iTerm2 with arrow keys. New
-interactive menus must use `hermes_cli/curses_ui.py` — see
-`hermes_cli/tools_config.py` for the canonical pattern.
+### DO NOT use `simple_term_menu` for interactive menus
+Rendering bugs in tmux/iTerm2 — ghosting on scroll. Use `curses` (stdlib) instead. See `hermes_cli/tools_config.py` for the pattern.

 ### DO NOT use `\033[K` (ANSI erase-to-EOL) in spinner/display code
 Leaks as literal `?[K` text under `prompt_toolkit`'s `patch_stdout`. Use space-padding: `f"\r{line}{' ' * pad}"`.
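The space-padding workaround can be sketched as below. `padded_redraw` is a hypothetical helper, and it assumes the caller tracks the length of the previously drawn line.

```python
def padded_redraw(line: str, prev_len: int) -> str:
    """Return a carriage-return redraw that overwrites leftover characters with spaces."""
    pad = max(prev_len - len(line), 0)
    return f"\r{line}{' ' * pad}"


# A shorter line replacing a longer one needs trailing spaces to blank the remainder.
frame = padded_redraw("done", prev_len=len("working..."))
```

Padding only up to the previous line's length keeps the output free of escape sequences, so nothing can leak as literal text.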
@@ -614,30 +515,6 @@ Leaks as literal `?[K` text under `prompt_toolkit`'s `patch_stdout`. Use space-p
 ### DO NOT hardcode cross-tool references in schema descriptions
 Tool schema descriptions must not mention tools from other toolsets by name (e.g., `browser_navigate` saying "prefer web_search"). Those tools may be unavailable (missing API keys, disabled toolset), causing the model to hallucinate calls to non-existent tools. If a cross-reference is needed, add it dynamically in `get_tool_definitions()` in `model_tools.py` — see the `browser_navigate` / `execute_code` post-processing blocks for the pattern.

-### The gateway has TWO message guards — both must bypass approval/control commands
-When an agent is running, messages pass through two sequential guards:
-(1) **base adapter** (`gateway/platforms/base.py`) queues messages in
-`_pending_messages` when `session_key in self._active_sessions`, and
-(2) **gateway runner** (`gateway/run.py`) intercepts `/stop`, `/new`,
-`/queue`, `/status`, `/approve`, `/deny` before they reach
-`running_agent.interrupt()`. Any new command that must reach the runner
-while the agent is blocked (e.g. approval prompts) MUST bypass BOTH
-guards and be dispatched inline, not via `_process_message_background()`
-(which races session lifecycle).
-
-### Squash merges from stale branches silently revert recent fixes
-Before squash-merging a PR, ensure the branch is up to date with `main`
-(`git fetch origin main && git reset --hard origin/main` in the worktree,
-then re-apply the PR's commits). A stale branch's version of an unrelated
-file will silently overwrite recent fixes on main when squashed. Verify
-with `git diff HEAD~1..HEAD` after merging — unexpected deletions are a
-red flag.
-
-### Don't wire in dead code without E2E validation
-Unused code that was never shipped was dead for a reason. Before wiring an
-unused module into a live code path, E2E test the real resolution chain
-with actual imports (not mocks) against a temp `HERMES_HOME`.
-
 ### Tests must not write to `~/.hermes/`
 The `_isolate_hermes_home` autouse fixture in `tests/conftest.py` redirects `HERMES_HOME` to a temp dir. Never hardcode `~/.hermes/` paths in tests.
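
The profile-safe pattern behind these rules can be sketched as follows. This is an illustration, not the implementation in `hermes_constants`: `resolve_hermes_home` is a hypothetical stand-in for `get_hermes_home()`, and it assumes the active profile is communicated through the `HERMES_HOME` environment variable with `~/.hermes` as the fallback.

```python
import os
from pathlib import Path


def resolve_hermes_home(env=None) -> Path:
    """Hypothetical stand-in: honor HERMES_HOME if set, else fall back to ~/.hermes."""
    env = os.environ if env is None else env
    override = env.get("HERMES_HOME")
    return Path(override) if override else Path.home() / ".hermes"


def state_file(name: str, env=None) -> Path:
    """Resolve state paths at call time so profile and test overrides take effect."""
    return resolve_hermes_home(env) / name


# A test fixture that redirects HERMES_HOME changes every derived path at once.
isolated = state_file("cache.json", env={"HERMES_HOME": "/tmp/hermes-test"})
default = state_file("cache.json", env={})
```

Resolving the base directory on every call, rather than caching a module-level constant, is what lets a fixture or profile switch redirect all state without touching individual call sites.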

@@ -693,7 +570,7 @@ If you can't use the wrapper (e.g. on Windows or inside an IDE that shells
 pytest directly), at minimum activate the venv and pass `-n 4`:

 ```bash
-source .venv/bin/activate   # or: source venv/bin/activate
+source venv/bin/activate
 python -m pytest tests/ -q -n 4
 ```

@@ -9,7 +9,7 @@ Thank you for contributing to Hermes Agent! This guide covers everything you nee
 We value contributions in this order:

 1. **Bug fixes** — crashes, incorrect behavior, data loss. Always top priority.
-2. **Cross-platform compatibility** — macOS, different Linux distros, and WSL2 on Windows. We want Hermes to work everywhere.
+2. **Cross-platform compatibility** — Windows, macOS, different Linux distros, different terminal emulators. We want Hermes to work everywhere.
 3. **Security hardening** — shell injection, prompt injection, path traversal, privilege escalation. See [Security](#security-considerations).
 4. **Performance and robustness** — retry logic, error handling, graceful degradation.
 5. **New skills** — but only broadly useful ones. See [Should it be a Skill or a Tool?](#should-it-be-a-skill-or-a-tool)
@@ -55,10 +55,10 @@ If your skill is specialized, community-contributed, or niche, it's better suite

 | Requirement | Notes |
 |-------------|-------|
-| **Git** | With `--recurse-submodules` support, and the `git-lfs` extension installed |
+| **Git** | With `--recurse-submodules` support |
 | **Python 3.11+** | uv will install it if missing |
 | **uv** | Fast Python package manager ([install](https://docs.astral.sh/uv/)) |
-| **Node.js 20+** | Optional — needed for browser tools and WhatsApp bridge (matches root `package.json` engines) |
+| **Node.js 18+** | Optional — needed for browser tools and WhatsApp bridge |

 ### Clone and install

@@ -515,7 +515,7 @@ See `hermes_cli/skin_engine.py` for the full schema and existing skins as exampl

 ## Cross-Platform Compatibility

-Hermes runs on Linux, macOS, and WSL2 on Windows. When writing code that touches the OS:
+Hermes runs on Linux, macOS, and Windows. When writing code that touches the OS:

 ### Critical rules

@@ -597,7 +597,7 @@ refactor/description   # Code restructuring

 1. **Run tests**: `pytest tests/ -v`
 2. **Test manually**: Run `hermes` and exercise the code path you changed
-3. **Check cross-platform impact**: If you touch file I/O, process management, or terminal handling, consider macOS, Linux, and WSL2
+3. **Check cross-platform impact**: If you touch file I/O, process management, or terminal handling, consider Windows and macOS
 4. **Keep PRs focused**: One logical change per PR. Don't mix a bug fix with a refactor with a new feature.

 ### PR description
@@ -76,7 +76,7 @@ Hermes has two entry points: start the terminal UI with `hermes`, or run the gat
 | Set a personality | `/personality [name]` | `/personality [name]` |
 | Retry or undo the last turn | `/retry`, `/undo` | `/retry`, `/undo` |
 | Compress context / check usage | `/compress`, `/usage`, `/insights [--days N]` | `/compress`, `/usage`, `/insights [days]` |
-| Browse skills | `/skills` or `/<skill-name>` | `/<skill-name>` |
+| Browse skills | `/skills` or `/<skill-name>` | `/skills` or `/<skill-name>` |
 | Interrupt current work | `Ctrl+C` or send a new message | `/stop` or send a new message |
 | Platform-specific status | `/platforms` | `/status`, `/sethome` |

@@ -157,10 +157,14 @@ curl -LsSf https://astral.sh/uv/install.sh | sh
 uv venv venv --python 3.11
 source venv/bin/activate
 uv pip install -e ".[all,dev]"
-scripts/run_tests.sh
+python -m pytest tests/ -q
 ```

-> **RL Training (optional):** The RL/Atropos integration (`environments/`) ships via the `atroposlib` and `tinker` dependencies pulled in by `.[all,dev]` — no submodule setup required.
+> **RL Training (optional):** To work on the RL/Tinker-Atropos integration:
+> ```bash
+> git submodule update --init tinker-atropos
+> uv pip install -e "./tinker-atropos"
+> ```

 ---

@@ -1,453 +0,0 @@
-# Hermes Agent v0.11.0 (v2026.4.23)
-
-**Release Date:** April 23, 2026
-**Since v0.9.0:** 1,556 commits · 761 merged PRs · 1,314 files changed · 224,174 insertions · 29 community contributors (290 including co-authors)
-
-> The Interface release — a full React/Ink rewrite of the interactive CLI, a pluggable transport architecture underneath every provider, native AWS Bedrock support, five new inference paths, a 17th messaging platform (QQBot), a dramatically expanded plugin surface, and GPT-5.5 via Codex OAuth.
-
-This release also folds in all the highlights deferred from v0.10.0 (which shipped only the Nous Tool Gateway) — so it covers roughly two weeks of work across the whole stack.
-
----
-
-## ✨ Highlights
-
- **New Ink-based TUI** — `hermes --tui` is now a full React/Ink rewrite of the interactive CLI, with a Python JSON-RPC backend (`tui_gateway`). Sticky composer, live streaming with OSC-52 clipboard support, stable picker keys, status bar with per-turn stopwatch and git branch, `/clear` confirm, light-theme preset, and a subagent spawn observability overlay. ~310 commits to `ui-tui/` + `tui_gateway/`. (@OutThisLife + Teknium)
-
- **Transport ABC + Native AWS Bedrock** — Format conversion and HTTP transport were extracted from `run_agent.py` into a pluggable `agent/transports/` layer. `AnthropicTransport`, `ChatCompletionsTransport`, `ResponsesApiTransport`, and `BedrockTransport` each own their own format conversion and API shape. Native AWS Bedrock support via the Converse API ships on top of the new abstraction. ([#10549](https://github.com/NousResearch/hermes-agent/pull/10549), [#13347](https://github.com/NousResearch/hermes-agent/pull/13347), [#13366](https://github.com/NousResearch/hermes-agent/pull/13366), [#13430](https://github.com/NousResearch/hermes-agent/pull/13430), [#13805](https://github.com/NousResearch/hermes-agent/pull/13805), [#13814](https://github.com/NousResearch/hermes-agent/pull/13814) — @kshitijk4poor + Teknium)
-
- **Five new inference paths** — Native NVIDIA NIM ([#11774](https://github.com/NousResearch/hermes-agent/pull/11774)), Arcee AI ([#9276](https://github.com/NousResearch/hermes-agent/pull/9276)), Step Plan ([#13893](https://github.com/NousResearch/hermes-agent/pull/13893)), Google Gemini CLI OAuth ([#11270](https://github.com/NousResearch/hermes-agent/pull/11270)), and Vercel ai-gateway with pricing + dynamic discovery ([#13223](https://github.com/NousResearch/hermes-agent/pull/13223) — @jerilynzheng). Plus Gemini routed through the native AI Studio API for better performance ([#12674](https://github.com/NousResearch/hermes-agent/pull/12674)).
-
- **GPT-5.5 over Codex OAuth** — OpenAI's new GPT-5.5 reasoning model is now available through your ChatGPT Codex OAuth, with live model discovery wired into the model picker so new OpenAI releases show up without catalog updates. ([#14720](https://github.com/NousResearch/hermes-agent/pull/14720))
-
- **QQBot — 17th supported platform** — Native QQBot adapter via QQ Official API v2, with QR scan-to-configure setup wizard, streaming cursor, emoji reactions, and DM/group policy gating that matches WeCom/Weixin parity. ([#9364](https://github.com/NousResearch/hermes-agent/pull/9364), [#11831](https://github.com/NousResearch/hermes-agent/pull/11831))
-
- **Plugin surface expanded** — Plugins can now register slash commands (`register_command`), dispatch tools directly (`dispatch_tool`), block tool execution from hooks (`pre_tool_call` can veto), rewrite tool results (`transform_tool_result`), transform terminal output (`transform_terminal_output`), ship image_gen backends, and add custom dashboard tabs. The bundled disk-cleanup plugin is opt-in by default as a reference implementation. ([#9377](https://github.com/NousResearch/hermes-agent/pull/9377), [#10626](https://github.com/NousResearch/hermes-agent/pull/10626), [#10763](https://github.com/NousResearch/hermes-agent/pull/10763), [#10951](https://github.com/NousResearch/hermes-agent/pull/10951), [#12929](https://github.com/NousResearch/hermes-agent/pull/12929), [#12944](https://github.com/NousResearch/hermes-agent/pull/12944), [#12972](https://github.com/NousResearch/hermes-agent/pull/12972), [#13799](https://github.com/NousResearch/hermes-agent/pull/13799), [#14175](https://github.com/NousResearch/hermes-agent/pull/14175))
-
- **`/steer` — mid-run agent nudges** — `/steer <prompt>` injects a note that the running agent sees after its next tool call, without interrupting the turn or breaking prompt cache. For when you want to course-correct an agent in-flight. ([#12116](https://github.com/NousResearch/hermes-agent/pull/12116))
-
- **Shell hooks** — Wire any shell script as a Hermes lifecycle hook (pre_tool_call, post_tool_call, on_session_start, etc.) without writing a Python plugin. ([#13296](https://github.com/NousResearch/hermes-agent/pull/13296))
-
- **Webhook direct-delivery mode** — Webhook subscriptions can now forward payloads straight to a platform chat without going through the agent — zero-LLM push notifications for alerting, uptime checks, and event streams. ([#12473](https://github.com/NousResearch/hermes-agent/pull/12473))
-
- **Smarter delegation** — Subagents can now take an explicit `orchestrator` role and spawn their own workers, with a configurable `max_spawn_depth` (flat by default). Concurrent sibling subagents share filesystem state through a file-coordination layer so they don't clobber each other's edits. ([#13691](https://github.com/NousResearch/hermes-agent/pull/13691), [#13718](https://github.com/NousResearch/hermes-agent/pull/13718))
-
- **Auxiliary models — configurable UI + main-model-first** — `hermes model` has a dedicated "Configure auxiliary models" screen for per-task overrides (compression, vision, session_search, title_generation). `auto` routing now defaults to the main model for side tasks across all users (previously aggregator users were silently routed to a cheap provider-side default). ([#11891](https://github.com/NousResearch/hermes-agent/pull/11891), [#11900](https://github.com/NousResearch/hermes-agent/pull/11900))
-
- **Dashboard plugin system + live theme switching** — The web dashboard is now extensible. Third-party plugins can add custom tabs, widgets, and views without forking. Paired with a live-switching theme system — themes now control colors, fonts, layout, and density — so users can hot-swap the dashboard look without a reload. Same theming discipline the CLI has, now on the web. ([#10951](https://github.com/NousResearch/hermes-agent/pull/10951), [#10687](https://github.com/NousResearch/hermes-agent/pull/10687), [#14725](https://github.com/NousResearch/hermes-agent/pull/14725))
-
- **Dashboard polish** — i18n (English + Chinese), react-router sidebar layout, mobile-responsive, Vercel deployment, real per-session API call tracking, and one-click update + gateway restart buttons. ([#9228](https://github.com/NousResearch/hermes-agent/pull/9228), [#9370](https://github.com/NousResearch/hermes-agent/pull/9370), [#9453](https://github.com/NousResearch/hermes-agent/pull/9453), [#10686](https://github.com/NousResearch/hermes-agent/pull/10686), [#13526](https://github.com/NousResearch/hermes-agent/pull/13526), [#14004](https://github.com/NousResearch/hermes-agent/pull/14004) — @austinpickett + @DeployFaith + Teknium)
-
---
-
-## 🏗️ Core Agent & Architecture
-
-### Transport Layer (NEW)
- **Transport ABC** abstracts format conversion and HTTP transport from `run_agent.py` into `agent/transports/` ([#13347](https://github.com/NousResearch/hermes-agent/pull/13347))
- **AnthropicTransport** — Anthropic Messages API path ([#13366](https://github.com/NousResearch/hermes-agent/pull/13366), @kshitijk4poor)
- **ChatCompletionsTransport** — default path for OpenAI-compatible providers ([#13805](https://github.com/NousResearch/hermes-agent/pull/13805))
- **ResponsesApiTransport** — OpenAI Responses API + Codex build_kwargs wiring ([#13430](https://github.com/NousResearch/hermes-agent/pull/13430), @kshitijk4poor)
- **BedrockTransport** — AWS Bedrock Converse API transport ([#13814](https://github.com/NousResearch/hermes-agent/pull/13814))
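
As a rough illustration of the pattern the Transport Layer entries describe, each transport can own both the request shape and the response parsing for one API family. The sketch below is hypothetical: the class names, method names, and signatures are assumptions made for illustration and are not taken from `agent/transports/`.

```python
from abc import ABC, abstractmethod
from typing import Any


class Transport(ABC):
    """Hypothetical base class: one subclass per provider API shape."""

    @abstractmethod
    def build_request(self, messages: list[dict[str, Any]], model: str) -> dict[str, Any]:
        """Convert neutral chat messages into the provider's request body."""

    @abstractmethod
    def parse_response(self, payload: dict[str, Any]) -> str:
        """Extract the assistant text from the provider's response body."""


class ChatCompletionsSketch(Transport):
    """Toy transport for an OpenAI-compatible chat completions shape."""

    def build_request(self, messages, model):
        return {"model": model, "messages": messages}

    def parse_response(self, payload):
        return payload["choices"][0]["message"]["content"]


class MessagesSketch(Transport):
    """Toy transport for a Messages-style shape with a top-level system prompt."""

    def build_request(self, messages, model, max_tokens: int = 1024):
        # System messages are lifted out of the turn list into their own field.
        system = "\n".join(m["content"] for m in messages if m["role"] == "system")
        turns = [m for m in messages if m["role"] != "system"]
        body = {"model": model, "max_tokens": max_tokens, "messages": turns}
        if system:
            body["system"] = system
        return body

    def parse_response(self, payload):
        # Concatenate only the text blocks; other block types are ignored here.
        return "".join(b["text"] for b in payload["content"] if b.get("type") == "text")
```

With this split, the calling loop only needs to pick a transport and can stay unaware of which wire format is in use.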
-
-### Provider & Model Support
- **Native AWS Bedrock provider** via Converse API ([#10549](https://github.com/NousResearch/hermes-agent/pull/10549))
- **NVIDIA NIM native provider** (salvage of #11703) ([#11774](https://github.com/NousResearch/hermes-agent/pull/11774))
- **Arcee AI direct provider** ([#9276](https://github.com/NousResearch/hermes-agent/pull/9276))
- **Step Plan provider** (salvage #6005) ([#13893](https://github.com/NousResearch/hermes-agent/pull/13893), @kshitijk4poor)
- **Google Gemini CLI OAuth** inference provider ([#11270](https://github.com/NousResearch/hermes-agent/pull/11270))
- **Vercel ai-gateway** with pricing, attribution, and dynamic discovery ([#13223](https://github.com/NousResearch/hermes-agent/pull/13223), @jerilynzheng)
- **GPT-5.5 over Codex OAuth** with live model discovery in the picker ([#14720](https://github.com/NousResearch/hermes-agent/pull/14720))
- **Gemini routed through native AI Studio API** ([#12674](https://github.com/NousResearch/hermes-agent/pull/12674))
- **xAI Grok upgraded to Responses API** ([#10783](https://github.com/NousResearch/hermes-agent/pull/10783))
- **Ollama improvements** — Cloud provider support, GLM continuation, `think=false` control, surrogate sanitization, `/v1` hint ([#10782](https://github.com/NousResearch/hermes-agent/pull/10782))
- **Kimi K2.6** across OpenRouter, Nous Portal, native Kimi, and HuggingFace ([#13148](https://github.com/NousResearch/hermes-agent/pull/13148), [#13152](https://github.com/NousResearch/hermes-agent/pull/13152), [#13169](https://github.com/NousResearch/hermes-agent/pull/13169))
- **Kimi K2.5** promoted to first position in all model suggestion lists ([#11745](https://github.com/NousResearch/hermes-agent/pull/11745), @kshitijk4poor)
- **Xiaomi MiMo v2.5-pro + v2.5** on OpenRouter, Nous Portal, and native ([#14184](https://github.com/NousResearch/hermes-agent/pull/14184), [#14635](https://github.com/NousResearch/hermes-agent/pull/14635), @kshitijk4poor)
- **GLM-5V-Turbo** for coding plan ([#9907](https://github.com/NousResearch/hermes-agent/pull/9907))
- **Claude Opus 4.7** in Nous Portal catalog ([#11398](https://github.com/NousResearch/hermes-agent/pull/11398))
- **OpenRouter elephant-alpha** in curated lists ([#9378](https://github.com/NousResearch/hermes-agent/pull/9378))
- **OpenCode-Go** — Kimi K2.6 and Qwen3.5/3.6 Plus in curated catalog ([#13429](https://github.com/NousResearch/hermes-agent/pull/13429))
- **minimax/minimax-m2.5:free** in OpenRouter catalog ([#13836](https://github.com/NousResearch/hermes-agent/pull/13836))
- **`/model` merges models.dev entries** for lesser-loved providers ([#14221](https://github.com/NousResearch/hermes-agent/pull/14221))
- **Per-provider + per-model `request_timeout_seconds`** config ([#12652](https://github.com/NousResearch/hermes-agent/pull/12652))
- **Configurable API retry count** via `agent.api_max_retries` ([#14730](https://github.com/NousResearch/hermes-agent/pull/14730))
- **ctx_size context length key** for Lemonade server (salvage #8536) ([#14215](https://github.com/NousResearch/hermes-agent/pull/14215))
- **Custom provider display name prompt** ([#9420](https://github.com/NousResearch/hermes-agent/pull/9420))
- **Recommendation badges** on tool provider selection ([#9929](https://github.com/NousResearch/hermes-agent/pull/9929))
- Fix: correct GPT-5 family context lengths in fallback defaults ([#9309](https://github.com/NousResearch/hermes-agent/pull/9309))
- Fix: clamp `minimal` reasoning effort to `low` on Responses API ([#9429](https://github.com/NousResearch/hermes-agent/pull/9429))
- Fix: strip reasoning item IDs from Responses API input when `store=False` ([#10217](https://github.com/NousResearch/hermes-agent/pull/10217))
- Fix: OpenViking correct account default + commit session on `/new` and compress ([#10463](https://github.com/NousResearch/hermes-agent/pull/10463))
- Fix: Kimi `/coding` thinking block survival + empty reasoning_content + block ordering (multiple PRs)
- Fix: don't send Anthropic thinking to api.kimi.com/coding ([#13826](https://github.com/NousResearch/hermes-agent/pull/13826))
- Fix: send `max_tokens`, `reasoning_effort`, and `thinking` for Kimi/Moonshot
- Fix: stream reasoning content through OpenAI-compatible providers that emit it
-
-### Agent Loop & Conversation
- **`/steer <prompt>`** — mid-run agent nudges after next tool call ([#12116](https://github.com/NousResearch/hermes-agent/pull/12116))
- **Orchestrator role + configurable spawn depth** for `delegate_task` (default flat) ([#13691](https://github.com/NousResearch/hermes-agent/pull/13691))
- **Cross-agent file state coordination** for concurrent subagents ([#13718](https://github.com/NousResearch/hermes-agent/pull/13718))
- **Compressor smart collapse, dedup, anti-thrashing**, template upgrade, hardening ([#10088](https://github.com/NousResearch/hermes-agent/pull/10088))
- **Compression summaries respect the conversation's language** ([#12556](https://github.com/NousResearch/hermes-agent/pull/12556))
- **Compression model falls back to main model** on permanent 503/404 ([#10093](https://github.com/NousResearch/hermes-agent/pull/10093))
- **Auto-continue interrupted agent work** after gateway restart ([#9934](https://github.com/NousResearch/hermes-agent/pull/9934))
- **Activity heartbeats** prevent false gateway inactivity timeouts ([#10501](https://github.com/NousResearch/hermes-agent/pull/10501))
- **Auxiliary models UI** — dedicated screen for per-task overrides ([#11891](https://github.com/NousResearch/hermes-agent/pull/11891))
- **Auxiliary auto routing defaults to main model** for all users ([#11900](https://github.com/NousResearch/hermes-agent/pull/11900))
- **PLATFORM_HINTS for Matrix, Mattermost, Feishu** ([#14428](https://github.com/NousResearch/hermes-agent/pull/14428), @alt-glitch)
- Fix: reset retry counters after compression; stop poisoning conversation history ([#10055](https://github.com/NousResearch/hermes-agent/pull/10055))
- Fix: break compression-exhaustion infinite loop and auto-reset session ([#10063](https://github.com/NousResearch/hermes-agent/pull/10063))
- Fix: stale agent timeout, uv venv detection, empty response after tools ([#10065](https://github.com/NousResearch/hermes-agent/pull/10065))
- Fix: prevent premature loop exit when weak models return empty after substantive tool calls ([#10472](https://github.com/NousResearch/hermes-agent/pull/10472))
- Fix: preserve pre-start terminal interrupts ([#10504](https://github.com/NousResearch/hermes-agent/pull/10504))
- Fix: improve interrupt responsiveness during concurrent tool execution ([#10935](https://github.com/NousResearch/hermes-agent/pull/10935))
- Fix: word-wrap spinner, interruptible agent join, and delegate_task interrupt ([#10940](https://github.com/NousResearch/hermes-agent/pull/10940))
- Fix: `/stop` no longer resets the session ([#9224](https://github.com/NousResearch/hermes-agent/pull/9224))
- Fix: honor interrupts during MCP tool waits ([#9382](https://github.com/NousResearch/hermes-agent/pull/9382), @helix4u)
- Fix: break stuck session resume loops after repeated restarts ([#9941](https://github.com/NousResearch/hermes-agent/pull/9941))
- Fix: empty response nudge crash + placeholder leak to cron targets ([#11021](https://github.com/NousResearch/hermes-agent/pull/11021))
- Fix: streaming cursor sanitization to prevent message truncation (multiple PRs)
- Fix: resolve `context_length` for plugin context engines ([#9238](https://github.com/NousResearch/hermes-agent/pull/9238))
-
-### Session & Memory
- **Auto-prune old sessions + VACUUM state.db** at startup ([#13861](https://github.com/NousResearch/hermes-agent/pull/13861))
- **Honcho overhaul** — context injection, 5-tool surface, cost safety, session isolation ([#10619](https://github.com/NousResearch/hermes-agent/pull/10619))
- **Hindsight richer session-scoped retain metadata** (salvage of #6290) ([#13987](https://github.com/NousResearch/hermes-agent/pull/13987))
- Fix: deduplicate memory provider tools to prevent 400 on strict providers ([#10511](https://github.com/NousResearch/hermes-agent/pull/10511))
- Fix: discover user-installed memory providers from `$HERMES_HOME/plugins/` ([#10529](https://github.com/NousResearch/hermes-agent/pull/10529))
- Fix: add `on_memory_write` bridge to sequential tool execution path ([#10507](https://github.com/NousResearch/hermes-agent/pull/10507))
- Fix: preserve `session_id` across `previous_response_id` chains in `/v1/responses` ([#10059](https://github.com/NousResearch/hermes-agent/pull/10059))
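
The auto-prune entry above pairs row deletion with a SQLite `VACUUM`, since deleting rows alone does not shrink the database file. A minimal sketch of that pairing follows; the `sessions` table, its `updated_at` column, and the function name are assumptions for illustration, not the actual `state.db` schema.

```python
import sqlite3
import time


def prune_and_vacuum(conn: sqlite3.Connection, max_age_days: int) -> int:
    """Delete sessions older than max_age_days, then compact the database.

    Assumes a hypothetical table `sessions(id, updated_at)` where
    `updated_at` is a Unix timestamp in seconds.
    """
    cutoff = time.time() - max_age_days * 86400
    cur = conn.execute("DELETE FROM sessions WHERE updated_at < ?", (cutoff,))
    deleted = cur.rowcount
    # VACUUM cannot run inside an open transaction, so commit the delete first.
    conn.commit()
    conn.execute("VACUUM")
    return deleted
```

Running this once at startup keeps the cost off the hot path, at the price of a slower launch when many rows are reclaimed.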
-
---
-
-## 🖥️ New Ink-based TUI
-
-A full React/Ink rewrite of the interactive CLI — invoked via `hermes --tui` or `HERMES_TUI=1`. Shipped across ~310 commits to `ui-tui/` and `tui_gateway/`.
-
-### TUI Foundations
- New TUI based on Ink + Python JSON-RPC backend
- Prettier + ESLint + vitest tooling for `ui-tui/`
- Entry split between `src/entry.tsx` (TTY gate) and `src/app.tsx` (state machine)
- Persistent `_SlashWorker` subprocess for slash command dispatch
-
-### UX & Features
- **Stable picker keys, /clear confirm, light-theme preset** ([#12312](https://github.com/NousResearch/hermes-agent/pull/12312), @OutThisLife)
- **Git branch in status bar** cwd label ([#12305](https://github.com/NousResearch/hermes-agent/pull/12305), @OutThisLife)
- **Per-turn elapsed stopwatch in FaceTicker + done-in sys line** ([#13105](https://github.com/NousResearch/hermes-agent/pull/13105), @OutThisLife)
- **Subagent spawn observability overlay** ([#14045](https://github.com/NousResearch/hermes-agent/pull/14045), @OutThisLife)
- **Per-prompt elapsed stopwatch in status bar** ([#12948](https://github.com/NousResearch/hermes-agent/pull/12948))
- Sticky composer that freezes during scroll
- OSC-52 clipboard support for copy across SSH sessions
- Virtualized history rendering for performance
- Slash command autocomplete via `complete.slash` RPC
- Path autocomplete via `complete.path` RPC
- Dozens of resize/ghosting/sticky-prompt fixes landed through the week
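
For readers unfamiliar with OSC-52, the mechanism is a terminal escape sequence that carries base64-encoded text, which is why it works across SSH: the local terminal emulator, not the remote host, writes to the clipboard. The sketch below shows the sequence in Python for illustration only; the TUI itself is React/Ink, and the function names here are assumptions.

```python
import base64
import sys


def osc52_copy_sequence(text: str) -> str:
    """Build the OSC-52 sequence that asks the terminal to set the clipboard.

    Format: ESC ] 52 ; c ; <base64 payload> BEL, where `c` selects the
    system clipboard.
    """
    payload = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return f"\x1b]52;c;{payload}\x07"


def copy_to_clipboard(text: str) -> None:
    """Write the sequence to stdout; terminals without OSC-52 support ignore it."""
    sys.stdout.write(osc52_copy_sequence(text))
    sys.stdout.flush()
```

Some terminals and multiplexers disable OSC-52 by default or cap the payload size, so a fallback copy path is usually still worth keeping.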
-
-### Structural Refactors
- Decomposed `app.tsx` into `app/event-handler`, `app/slash-handler`, `app/stores`, `app/hooks` ([#14640](https://github.com/NousResearch/hermes-agent/pull/14640) and surrounding)
- Component split: `branding.tsx`, `markdown.tsx`, `prompts.tsx`, `sessionPicker.tsx`, `messageLine.tsx`, `thinking.tsx`, `maskedPrompt.tsx`
- Hook split: `useCompletion`, `useInputHistory`, `useQueue`, `useVirtualHistory`
-
---
-
-## 📱 Messaging Platforms (Gateway)
-
-### New Platforms
- **QQBot (17th platform)** — QQ Official API v2 adapter with QR setup, streaming, package split ([#9364](https://github.com/NousResearch/hermes-agent/pull/9364), [#11831](https://github.com/NousResearch/hermes-agent/pull/11831))
-
-### Telegram
- **Dedicated `TELEGRAM_PROXY` env var + config.yaml proxy support** (closes #9414, #6530, #9074, #7786) ([#10681](https://github.com/NousResearch/hermes-agent/pull/10681))
- **`ignored_threads` config** for Telegram groups ([#9530](https://github.com/NousResearch/hermes-agent/pull/9530))
- **Config option to disable link previews** (closes #8728) ([#10610](https://github.com/NousResearch/hermes-agent/pull/10610))
- **Auto-wrap markdown tables** in code blocks ([#11794](https://github.com/NousResearch/hermes-agent/pull/11794))
- Fix: prevent duplicate replies when stream task is cancelled ([#9319](https://github.com/NousResearch/hermes-agent/pull/9319))
- Fix: prevent streaming cursor (▉) from appearing as standalone messages ([#9538](https://github.com/NousResearch/hermes-agent/pull/9538))
- Fix: retry transient tool sends + cold-boot budget ([#10947](https://github.com/NousResearch/hermes-agent/pull/10947))
- Fix: Markdown special char escaping in `send_exec_approval`
- Fix: parentheses in URLs during MarkdownV2 link conversion
- Fix: Unicode dash normalization in model switch (closes iOS smart-punctuation issue)
- Many platform hint / streaming / session-key fixes
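
Two of the Telegram fixes above concern MarkdownV2 escaping. As background, Telegram's MarkdownV2 requires a fixed set of punctuation characters to be backslash-escaped in normal text, while inside the URL part of an inline link only `)` and `\` need escaping. The helper names below are hypothetical and this is a sketch of the rule, not the adapter's code.

```python
import re

# Characters that MarkdownV2 requires to be escaped in normal text.
_MDV2_SPECIAL = r"_*[]()~`>#+-=|{}.!"
_MDV2_PATTERN = re.compile(f"([{re.escape(_MDV2_SPECIAL)}])")


def escape_markdown_v2(text: str) -> str:
    """Prefix every MarkdownV2 special character with a backslash."""
    return _MDV2_PATTERN.sub(r"\\\1", text)


def escape_link_url(url: str) -> str:
    """Inside the (...) part of a link, only backslash and ')' are escaped."""
    return url.replace("\\", "\\\\").replace(")", "\\)")


def markdown_v2_link(label: str, url: str) -> str:
    """Render an inline link with label and URL escaped by their own rules."""
    return f"[{escape_markdown_v2(label)}]({escape_link_url(url)})"
```

Escaping the label and the URL separately is what keeps URLs containing parentheses, such as many Wikipedia links, from terminating the link early.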
-
-### Discord
- **Forum channel support** (salvage of #10145 + media + polish) ([#11920](https://github.com/NousResearch/hermes-agent/pull/11920))
- **`DISCORD_ALLOWED_ROLES`** for role-based access control ([#11608](https://github.com/NousResearch/hermes-agent/pull/11608))
- **Config option to disable slash commands** (salvage #13130) ([#14315](https://github.com/NousResearch/hermes-agent/pull/14315))
- **Native `send_animation`** for inline GIF playback ([#10283](https://github.com/NousResearch/hermes-agent/pull/10283))
- **`send_message` Discord media attachments** ([#10246](https://github.com/NousResearch/hermes-agent/pull/10246))
- **`/skill` command group** with category subcommands ([#9909](https://github.com/NousResearch/hermes-agent/pull/9909))
- **Extract reply text from message references** ([#9781](https://github.com/NousResearch/hermes-agent/pull/9781))
-
-### Feishu
- **Intelligent reply on document comments** with 3-tier access control ([#11898](https://github.com/NousResearch/hermes-agent/pull/11898))
- **Show processing state via reactions** on user messages ([#12927](https://github.com/NousResearch/hermes-agent/pull/12927))
- **Preserve @mention context for agent consumption** (salvage #13874) ([#14167](https://github.com/NousResearch/hermes-agent/pull/14167))
-
-### DingTalk
- **`require_mention` + `allowed_users` gating** (parity with Slack/Telegram/Discord) ([#11564](https://github.com/NousResearch/hermes-agent/pull/11564))
- **QR-code device-flow authorization** for setup wizard ([#11574](https://github.com/NousResearch/hermes-agent/pull/11574))
- **AI Cards streaming, emoji reactions, and media handling** (salvage of #10985) ([#11910](https://github.com/NousResearch/hermes-agent/pull/11910))
-
-### WhatsApp
- **`send_voice`** — native audio message delivery ([#13002](https://github.com/NousResearch/hermes-agent/pull/13002))
- **`dm_policy` and `group_policy`** parity with WeCom/Weixin/QQ adapters ([#13151](https://github.com/NousResearch/hermes-agent/pull/13151))
-
-### WeCom / Weixin
- **WeCom QR-scan bot creation + interactive setup wizard** (salvage #13923) ([#13961](https://github.com/NousResearch/hermes-agent/pull/13961))
-
-### Signal
- **Media delivery support** via `send_message` ([#13178](https://github.com/NousResearch/hermes-agent/pull/13178))
-
-### Slack
- **Per-thread sessions for DMs by default** ([#10987](https://github.com/NousResearch/hermes-agent/pull/10987))
-
-### BlueBubbles (iMessage)
- Group chat session separation, webhook registration & auth fixes ([#9806](https://github.com/NousResearch/hermes-agent/pull/9806))
-
-### Gateway Core
- **Gateway proxy mode** — forward messages to a remote API server ([#9787](https://github.com/NousResearch/hermes-agent/pull/9787))
- **Per-channel ephemeral prompts** (Discord, Telegram, Slack, Mattermost) ([#10564](https://github.com/NousResearch/hermes-agent/pull/10564))
- **Surface plugin slash commands** natively on all platforms + decision-capable command hook ([#14175](https://github.com/NousResearch/hermes-agent/pull/14175))
- **Support document/archive extensions in MEDIA: tag extraction** (salvage #8255) ([#14307](https://github.com/NousResearch/hermes-agent/pull/14307))
- **Recognize `.pdf` in MEDIA: tag extraction** ([#13683](https://github.com/NousResearch/hermes-agent/pull/13683))
- **`--all` flag for `gateway start` and `restart`** ([#10043](https://github.com/NousResearch/hermes-agent/pull/10043))
- **Notify active sessions on gateway shutdown** + update health check ([#9850](https://github.com/NousResearch/hermes-agent/pull/9850))
- **Block agent from self-destructing the gateway** via terminal (closes #6666) ([#9895](https://github.com/NousResearch/hermes-agent/pull/9895))
- Fix: suppress duplicate replies on interrupt and streaming flood control ([#10235](https://github.com/NousResearch/hermes-agent/pull/10235))
- Fix: close temporary agents after one-off tasks ([#11028](https://github.com/NousResearch/hermes-agent/pull/11028), @kshitijk4poor)
- Fix: busy-session ack when a user sends a message during an active agent run ([#10068](https://github.com/NousResearch/hermes-agent/pull/10068))
- Fix: route watch-pattern notifications to the originating session ([#10460](https://github.com/NousResearch/hermes-agent/pull/10460))
- Fix: preserve notify context in executor threads ([#10921](https://github.com/NousResearch/hermes-agent/pull/10921), @kshitijk4poor)
- Fix: avoid duplicate replies after interrupted long tasks ([#11018](https://github.com/NousResearch/hermes-agent/pull/11018))
- Fix: unlink stale PID + lock files on cleanup
- Fix: force-unlink stale PID file after `--replace` takeover
-
---
-
-## 🔧 Tool System
-
-### Plugin Surface (major expansion)
- **`register_command()`** — plugins can now add slash commands ([#10626](https://github.com/NousResearch/hermes-agent/pull/10626))
- **`dispatch_tool()`** — plugins can invoke tools from their code ([#10763](https://github.com/NousResearch/hermes-agent/pull/10763))
- **`pre_tool_call` blocking** — plugins can veto tool execution ([#9377](https://github.com/NousResearch/hermes-agent/pull/9377))
- **`transform_tool_result`** — plugins rewrite tool results generically ([#12972](https://github.com/NousResearch/hermes-agent/pull/12972))
- **`transform_terminal_output`** — plugins rewrite terminal tool output ([#12929](https://github.com/NousResearch/hermes-agent/pull/12929))
- **Namespaced skill registration** for plugin skill bundles ([#9786](https://github.com/NousResearch/hermes-agent/pull/9786))
- **Opt-in-by-default + bundled disk-cleanup plugin** (salvage #12212) ([#12944](https://github.com/NousResearch/hermes-agent/pull/12944))
- **Pluggable `image_gen` backends + OpenAI provider** ([#13799](https://github.com/NousResearch/hermes-agent/pull/13799))
- **`openai-codex` image_gen plugin** (gpt-image-2 via Codex OAuth) ([#14317](https://github.com/NousResearch/hermes-agent/pull/14317))
- **Shell hooks** — wire shell scripts as hook callbacks ([#13296](https://github.com/NousResearch/hermes-agent/pull/13296))
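
The hook entries above describe two distinct capabilities: a `pre_tool_call` hook that can veto execution, and a `transform_tool_result` hook that can rewrite output. The sketch below shows how such a pipeline might be ordered. It borrows the hook names from the list, but the `HookRegistry` class, the callback signatures, and the "return a reason to block" convention are all assumptions, not the Hermes plugin API.

```python
from typing import Any, Callable, Optional

# Hypothetical signatures: a pre-hook returns a reason string to block, or None to allow.
PreHook = Callable[[str, dict[str, Any]], Optional[str]]
ResultHook = Callable[[str, str], str]


class HookRegistry:
    """Toy registry that runs veto hooks before a tool and transforms after it."""

    def __init__(self) -> None:
        self.pre_tool_call: list[PreHook] = []
        self.transform_tool_result: list[ResultHook] = []

    def run_tool(self, name: str, args: dict[str, Any], tool: Callable[..., str]) -> str:
        # Any pre-hook may veto; the tool is never invoked once one objects.
        for hook in self.pre_tool_call:
            reason = hook(name, args)
            if reason is not None:
                return f"blocked: {reason}"
        result = tool(**args)
        # Transforms are applied in registration order, each seeing the previous output.
        for transform in self.transform_tool_result:
            result = transform(name, result)
        return result
```

Returning the block reason as the tool result, rather than raising, lets the agent see why a call was refused and adjust its next step.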
-
-### Browser
- **`browser_cdp` raw DevTools Protocol passthrough** ([#12369](https://github.com/NousResearch/hermes-agent/pull/12369))
- Camofox hardening + connection stability across the window
-
-### Execute Code
- **Project/strict execution modes** (default: project) ([#11971](https://github.com/NousResearch/hermes-agent/pull/11971))
-
-### Image Generation
- **Multi-model FAL support** with picker in `hermes tools` ([#11265](https://github.com/NousResearch/hermes-agent/pull/11265))
- **Recraft V3 → V4 Pro, Nano Banana → Pro upgrades** ([#11406](https://github.com/NousResearch/hermes-agent/pull/11406))
- **GPT Image 2** in FAL catalog ([#13677](https://github.com/NousResearch/hermes-agent/pull/13677))
- **xAI image generation provider** (grok-imagine-image) ([#14765](https://github.com/NousResearch/hermes-agent/pull/14765))
-
-### TTS / STT / Voice
- **Google Gemini TTS provider** ([#11229](https://github.com/NousResearch/hermes-agent/pull/11229))
- **xAI Grok STT provider** ([#14473](https://github.com/NousResearch/hermes-agent/pull/14473))
- **xAI TTS** (shipped with Responses API upgrade) ([#10783](https://github.com/NousResearch/hermes-agent/pull/10783))
- **KittenTTS local provider** (salvage of #2109) ([#13395](https://github.com/NousResearch/hermes-agent/pull/13395))
- **CLI record beep toggle** ([#13247](https://github.com/NousResearch/hermes-agent/pull/13247), @helix4u)
-
-### Webhook / Cron
- **Webhook direct-delivery mode** — zero-LLM push notifications ([#12473](https://github.com/NousResearch/hermes-agent/pull/12473))
- **Cron `wakeAgent` gate** — scripts can skip the agent entirely ([#12373](https://github.com/NousResearch/hermes-agent/pull/12373))
- **Cron per-job `enabled_toolsets`** — cap token overhead + cost per job ([#14767](https://github.com/NousResearch/hermes-agent/pull/14767))
-
-### Delegate
- **Orchestrator role** + configurable spawn depth (default flat) ([#13691](https://github.com/NousResearch/hermes-agent/pull/13691))
- **Cross-agent file state coordination** ([#13718](https://github.com/NousResearch/hermes-agent/pull/13718))
-
-### File / Patch
- **`patch` — "did you mean?" feedback** when patch fails to match ([#13435](https://github.com/NousResearch/hermes-agent/pull/13435))
-
-### API Server
- **Stream `/v1/responses` SSE tool events** (salvage #9779) ([#10049](https://github.com/NousResearch/hermes-agent/pull/10049))
- **Inline image inputs** on `/v1/chat/completions` and `/v1/responses` ([#12969](https://github.com/NousResearch/hermes-agent/pull/12969))
-
-### Docker / Podman
- **Entry-level Podman support** — `find_docker()` + rootless entrypoint ([#10066](https://github.com/NousResearch/hermes-agent/pull/10066))
- **Add docker-cli to Docker image** (salvage #10096) ([#14232](https://github.com/NousResearch/hermes-agent/pull/14232))
- **File-sync back to host on teardown** (salvage of #8189 + hardening) ([#11291](https://github.com/NousResearch/hermes-agent/pull/11291))
-
-### MCP
- 12 MCP improvements across the window (status, timeout handling, tool-call forwarding, etc.)
-
---
-
-## 🧩 Skills Ecosystem
-
-### Skill System
- **Namespaced skill registration** for plugin bundles ([#9786](https://github.com/NousResearch/hermes-agent/pull/9786))
- **`hermes skills reset`** to un-stick bundled skills ([#11468](https://github.com/NousResearch/hermes-agent/pull/11468))
- **Skills guard opt-in** — `config.skills.guard_agent_created` (default off) ([#14557](https://github.com/NousResearch/hermes-agent/pull/14557))
- **Bundled skill scripts runnable out of the box** ([#13384](https://github.com/NousResearch/hermes-agent/pull/13384))
- **`xitter` replaced with `xurl`** — the official X API CLI ([#12303](https://github.com/NousResearch/hermes-agent/pull/12303))
- **MiniMax-AI/cli as default skill tap** (salvage #7501) ([#14493](https://github.com/NousResearch/hermes-agent/pull/14493))
- **Fuzzy `@` file completions + mtime sorting** ([#9467](https://github.com/NousResearch/hermes-agent/pull/9467))
-
-### New Skills
- **concept-diagrams** (salvage of #11045, @v1k22) ([#11363](https://github.com/NousResearch/hermes-agent/pull/11363))
- **architecture-diagram** (Cocoon AI port) ([#9906](https://github.com/NousResearch/hermes-agent/pull/9906))
- **pixel-art** with hardware palettes and video animation ([#12663](https://github.com/NousResearch/hermes-agent/pull/12663), [#12725](https://github.com/NousResearch/hermes-agent/pull/12725))
- **baoyu-comic** ([#13257](https://github.com/NousResearch/hermes-agent/pull/13257), @JimLiu)
- **baoyu-infographic** — 21 layouts × 21 styles (salvage #9901) ([#12254](https://github.com/NousResearch/hermes-agent/pull/12254))
- **page-agent** — embed Alibaba's in-page GUI agent in your webapp ([#13976](https://github.com/NousResearch/hermes-agent/pull/13976))
- **fitness-nutrition** optional skill + optional env var support ([#9355](https://github.com/NousResearch/hermes-agent/pull/9355))
- **drug-discovery** — ChEMBL, PubChem, OpenFDA, ADMET ([#9443](https://github.com/NousResearch/hermes-agent/pull/9443))
- **touchdesigner-mcp** (salvage of #10081) ([#12298](https://github.com/NousResearch/hermes-agent/pull/12298))
- **adversarial-ux-test** optional skill (salvage of #2494, @omnissiah-comelse) ([#13425](https://github.com/NousResearch/hermes-agent/pull/13425))
- **maps** — added `guest_house`, `camp_site`, and dual-key bakery lookup ([#13398](https://github.com/NousResearch/hermes-agent/pull/13398))
- **llm-wiki** — port provenance markers, source hashing, and quality signals ([#13700](https://github.com/NousResearch/hermes-agent/pull/13700))
-
---
-
-## 📊 Web Dashboard
-
- **i18n (English + Chinese) language switcher** ([#9453](https://github.com/NousResearch/hermes-agent/pull/9453))
- **Live-switching theme system** ([#10687](https://github.com/NousResearch/hermes-agent/pull/10687))
- **Dashboard plugin system** — extend the web UI with custom tabs ([#10951](https://github.com/NousResearch/hermes-agent/pull/10951))
- **react-router, sidebar layout, sticky header, dropdown component** ([#9370](https://github.com/NousResearch/hermes-agent/pull/9370), @austinpickett)
- **Responsive for mobile** ([#9228](https://github.com/NousResearch/hermes-agent/pull/9228), @DeployFaith)
- **Vercel deployment** ([#10686](https://github.com/NousResearch/hermes-agent/pull/10686), [#11061](https://github.com/NousResearch/hermes-agent/pull/11061), @austinpickett)
- **Context window config support** ([#9357](https://github.com/NousResearch/hermes-agent/pull/9357))
- **HTTP health probe for cross-container gateway detection** ([#9894](https://github.com/NousResearch/hermes-agent/pull/9894))
- **Update + restart gateway buttons** ([#13526](https://github.com/NousResearch/hermes-agent/pull/13526), @austinpickett)
- **Real API call count per session** (salvages #10140) ([#14004](https://github.com/NousResearch/hermes-agent/pull/14004))
-
---
-
-## 🖱️ CLI & User Experience
-
- **Dynamic shell completion for bash, zsh, and fish** ([#9785](https://github.com/NousResearch/hermes-agent/pull/9785))
- **Light-mode skins + skin-aware completion menus** ([#9461](https://github.com/NousResearch/hermes-agent/pull/9461))
- **Numbered keyboard shortcuts** on approval and clarify prompts ([#13416](https://github.com/NousResearch/hermes-agent/pull/13416))
- **Markdown stripping, compact multiline previews, external editor** ([#12934](https://github.com/NousResearch/hermes-agent/pull/12934))
- **`--ignore-user-config` and `--ignore-rules` flags** (port codex#18646) ([#14277](https://github.com/NousResearch/hermes-agent/pull/14277))
- **Account limits section in `/usage`** ([#13428](https://github.com/NousResearch/hermes-agent/pull/13428))
- **Doctor: Command Installation check** for `hermes` bin symlink ([#10112](https://github.com/NousResearch/hermes-agent/pull/10112))
- **ESC cancels secret/sudo prompts**, clearer skip messaging ([#9902](https://github.com/NousResearch/hermes-agent/pull/9902))
- Fix: agent-facing text uses `display_hermes_home()` instead of hardcoded `~/.hermes` ([#10285](https://github.com/NousResearch/hermes-agent/pull/10285))
- Fix: enforce `config.yaml` as sole CWD source + deprecate `.env` CWD vars + add `hermes memory reset` ([#11029](https://github.com/NousResearch/hermes-agent/pull/11029))
-
---
-
-## 🔒 Security & Reliability
-
- **Global toggle to allow private/internal URL resolution** ([#14166](https://github.com/NousResearch/hermes-agent/pull/14166))
- **Block agent from self-destructing the gateway** via terminal (closes #6666) ([#9895](https://github.com/NousResearch/hermes-agent/pull/9895))
- **Telegram callback authorization** on update prompts ([#10536](https://github.com/NousResearch/hermes-agent/pull/10536))
- **SECURITY.md** added ([#10532](https://github.com/NousResearch/hermes-agent/pull/10532), @I3eg1nner)
- **Warn about legacy hermes.service units** during `hermes update` ([#11918](https://github.com/NousResearch/hermes-agent/pull/11918))
- **Complete ASCII-locale UnicodeEncodeError recovery** for `api_messages`/`reasoning_content` (closes #6843) ([#10537](https://github.com/NousResearch/hermes-agent/pull/10537))
- **Prevent stale `os.environ` leak** after `clear_session_vars` ([#10527](https://github.com/NousResearch/hermes-agent/pull/10527))
- **Prevent agent hang when backgrounding processes** via terminal tool ([#10584](https://github.com/NousResearch/hermes-agent/pull/10584))
- Many smaller session-resume, interrupt, streaming, and memory-race fixes throughout the window
-
---
-
-## 🐛 Notable Bug Fixes
-
-The `fix:` category in this window covers 482 PRs. Highlights:
-
- Streaming cursor artifacts filtered from Matrix, Telegram, WhatsApp, Discord (multiple PRs)
- `<think>` and `<thought>` blocks filtered from gateway stream consumers ([#9408](https://github.com/NousResearch/hermes-agent/pull/9408))
- Gateway display.streaming root-config override regression ([#9799](https://github.com/NousResearch/hermes-agent/pull/9799))
- Context `session_search` coerces limit to int (prevents TypeError) ([#10522](https://github.com/NousResearch/hermes-agent/pull/10522))
- Memory tool stays available when `fcntl` is unavailable (Windows) ([#9783](https://github.com/NousResearch/hermes-agent/pull/9783))
- Trajectory compressor credentials load from `HERMES_HOME/.env` ([#9632](https://github.com/NousResearch/hermes-agent/pull/9632), @Dusk1e)
- `@_context_completions` no longer crashes on `@` mention ([#9683](https://github.com/NousResearch/hermes-agent/pull/9683), @kshitijk4poor)
- Group session `user_id` no longer treated as `thread_id` in shutdown notifications ([#10546](https://github.com/NousResearch/hermes-agent/pull/10546))
- Telegram `platform_hint` — markdown is supported (closes #8261) ([#10612](https://github.com/NousResearch/hermes-agent/pull/10612))
- Doctor checks for Kimi China credentials fixed
- Streaming: don't suppress final response when commentary message is sent ([#10540](https://github.com/NousResearch/hermes-agent/pull/10540))
- Rapid Telegram follow-ups no longer get cut off
-
---
-
-## 🧪 Testing & CI
-
- **Contributor attribution CI check** on PRs ([#9376](https://github.com/NousResearch/hermes-agent/pull/9376))
- Hermetic test parity (`scripts/run_tests.sh`) held across this window
- Test count stabilized post-Transport refactor; CI matrix held green through the transport rollout
-
---
-
-## 📚 Documentation
-
- Atropos + wandb links in user guide
- ACP / VS Code / Zed / JetBrains integration docs refresh
- Webhook subscription docs updated for direct-delivery mode
- Plugin author guide expanded for new hooks (`register_command`, `dispatch_tool`, `transform_tool_result`)
- Transport layer developer guide added
- Website removed Discussions link from README
-
---
-
-## 👥 Contributors
-
-### Core
- **@teknium1** (Teknium)
-
-### Top Community Contributors (by merged PR count)
- **@kshitijk4poor** — 49 PRs · Transport refactor (AnthropicTransport, ResponsesApiTransport), Step Plan provider, Xiaomi MiMo v2.5 support, numerous gateway fixes, promoted Kimi K2.5, @ mention crash fix
- **@OutThisLife** (Brooklyn) — 31 PRs · TUI polish, git branch in status bar, per-turn stopwatch, stable picker keys, `/clear` confirm, light-theme preset, subagent spawn observability overlay
- **@helix4u** — 11 PRs · Voice CLI record beep, MCP tool interrupt handling, assorted stability fixes
- **@austinpickett** — 8 PRs · Dashboard react-router + sidebar + sticky header + dropdown, Vercel deployment, update + restart buttons
- **@alt-glitch** — 8 PRs · PLATFORM_HINTS for Matrix/Mattermost/Feishu, Matrix fixes
- **@ethernet8023** — 3 PRs
- **@benbarclay** — 3 PRs
- **@Aslaaen** — 2 PRs
-
-### Also contributing
-@jerilynzheng (ai-gateway pricing), @JimLiu (baoyu-comic skill), @Dusk1e (trajectory compressor credentials), @DeployFaith (mobile-responsive dashboard), @LeonSGP43, @v1k22 (concept-diagrams), @omnissiah-comelse (adversarial-ux-test), @coekfung (Telegram MarkdownV2 expandable blockquotes), @liftaris (TUI provider resolution), @arihantsethia (skill analytics dashboard), @topcheer + @xing8star (QQBot foundation), @kovyrin, @I3eg1nner (SECURITY.md), @PeterBerthelsen, @lengxii, @priveperfumes, @sjz-ks, @cuyua9, @Disaster-Terminator, @leozeli, @LehaoLin, @trevthefoolish, @loongfay, @MrNiceRicee, @WideLee, @bluefishs, @malaiwah, @bobashopcashier, @dsocolobsky, @iamagenius00, @IAvecilla, @aniruddhaadak80, @Es1la, @asheriif, @walli, @jquesnelle (original Tool Gateway work).
-
-### All Contributors (alphabetical)
-
-@0xyg3n, @10ishq, @A-afflatus, @Abnertheforeman, @admin28980, @adybag14-cyber, @akhater, @alexzhu0,
-@AllardQuek, @alt-glitch, @aniruddhaadak80, @anna-oake, @anniesurla, @anthhub, @areu01or00, @arihantsethia,
-@arthurbr11, @asheriif, @Aslaaen, @Asunfly, @austinpickett, @AviArora02-commits, @AxDSan, @azhengbot, @Bartok9,
-@benbarclay, @bennytimz, @bernylinville, @bingo906, @binhnt92, @bkadish, @bluefishs, @bobashopcashier,
-@brantzh6, @BrennerSpear, @brianclemens, @briandevans, @brooklynnicholson, @bugkill3r, @buray, @burtenshaw,
-@cdanis, @cgarwood82, @ChimingLiu, @chongweiliu, @christopherwoodall, @coekfung, @cola-runner, @corazzione,
-@counterposition, @cresslank, @cuyua9, @cypres0099, @danieldoderlein, @davetist, @davidvv, @DeployFaith,
-@Dev-Mriganka, @devorun, @dieutx, @Disaster-Terminator, @dodo-reach, @draix, @DrStrangerUJN, @dsocolobsky,
-@Dusk1e, @dyxushuai, @elkimek, @elmatadorgh, @emozilla, @entropidelic, @Erosika, @erosika, @Es1la, @etcircle,
-@etherman-os, @ethernet8023, @fancydirty, @farion1231, @fatinghenji, @Fatty911, @fengtianyu88, @Feranmi10,
-@flobo3, @francip, @fuleinist, @g-guthrie, @GenKoKo, @gianfrancopiana, @gnanam1990, @GuyCui, @haileymarshall,
-@haimu0x, @handsdiff, @hansnow, @hedgeho9X, @helix4u, @hengm3467, @HenkDz, @heykb, @hharry11, @HiddenPuppy,
-@honghua, @houko, @houziershi, @hsy5571616, @huangke19, @hxp-plus, @Hypn0sis, @I3eg1nner, @iacker,
-@iamagenius00, @IAvecilla, @iborazzi, @Ifkellx, @ifrederico, @imink, @isaachuangGMICLOUD, @ismell0992-afk,
-@j0sephz, @Jaaneek, @jackjin1997, @JackTheGit, @jaffarkeikei, @jerilynzheng, @JiaDe-Wu, @Jiawen-lee, @JimLiu,
-@jinzheng8115, @jneeee, @jplew, @jquesnelle, @Julientalbot, @Junass1, @jvcl, @kagura-agent, @keifergu,
-@kevinskysunny, @keyuyuan, @konsisumer, @kovyrin, @kshitijk4poor, @leeyang1990, @LehaoLin, @lengxii,
-@LeonSGP43, @leozeli, @li0near, @liftaris, @Lind3ey, @Linux2010, @liujinkun2025, @LLQWQ, @Llugaes, @lmoncany,
-@longsizhuo, @lrawnsley, @Lubrsy706, @lumenradley, @luyao618, @lvnilesh, @LVT382009, @m0n5t3r, @Magaav,
-@MagicRay1217, @malaiwah, @manuelschipper, @Marvae, @MassiveMassimo, @mavrickdeveloper, @maxchernin, @memosr,
-@meng93, @mengjian-github, @MestreY0d4-Uninter, @Mibayy, @MikeFac, @mikewaters, @milkoor, @minorgod,
-@MrNiceRicee, @ms-alan, @mvanhorn, @n-WN, @N0nb0at, @Nan93, @NIDNASSER-Abdelmajid, @nish3451, @niyoh120,
-@nocoo, @nosleepcassette, @NousResearch, @ogzerber, @omnissiah-comelse, @Only-Code-A, @opriz, @OwenYWT, @pedh,
-@pefontana, @PeterBerthelsen, @phpoh, @pinion05, @plgonzalezrx8, @pradeep7127, @priveperfumes,
-@projectadmin-dev, @PStarH, @rnijhara, @Roy-oss1, @roytian1217, @RucchiZ, @Ruzzgar, @RyanLee-Dev, @Salt-555,
-@Sanjays2402, @sgaofen, @sharziki, @shenuu, @shin4, @SHL0MS, @shushuzn, @sicnuyudidi, @simon-gtcl,
-@simon-marcus, @sirEven, @Sisyphus, @sjz-ks, @snreynolds, @Societus, @Somme4096, @sontianye, @sprmn24,
-@StefanIsMe, @stephenschoettler, @Swift42, @taeng0204, @taeuk178, @tannerfokkens-maker, @TaroballzChen,
-@ten-ltw, @teyrebaz33, @Tianworld, @topcheer, @Tranquil-Flow, @trevthefoolish, @TroyMitchell911, @UNLINEARITY,
-@v1k22, @vivganes, @vominh1919, @vrinek, @VTRiot, @WadydX, @walli, @wenhao7, @WhiteWorld, @WideLee, @wujhsu,
-@WuTianyi123, @Wysie, @xandersbell, @xiaoqiang243, @xiayh0107, @xinpengdr, @Xowiek, @ycbai, @yeyitech, @ygd58,
-@youngDoo, @yudaiyan, @Yukipukii1, @yule975, @yyq4193, @yzx9, @ZaynJarvis, @zhang9w0v5, @zhanggttry,
-@zhangxicen, @zhongyueming1121, @zhouxiaoya12, @zons-zhaozhy
-
-Also: @maelrx, @Marco Rutsch, @MaxsolcuCrypto, @Mind-Dragon, @Paul Bergeron, @say8hi, @whitehatjr1001.
-
-
---
-
-**Full Changelog**: [v2026.4.13...v2026.4.23](https://github.com/NousResearch/hermes-agent/compare/v2026.4.13...v2026.4.23)
@@ -17,6 +17,7 @@ import os
 from pathlib import Path

 from hermes_constants import get_hermes_home
+from types import SimpleNamespace
 from typing import Any, Dict, List, Optional, Tuple
 from utils import normalize_proxy_env_vars

@@ -1598,4 +1599,70 @@ def build_anthropic_kwargs(
    return kwargs


+def normalize_anthropic_response(
+    response,
+    strip_tool_prefix: bool = False,
+) -> Tuple[SimpleNamespace, str]:
+    """Normalize Anthropic response to match the shape expected by AIAgent.
+
+    Returns (assistant_message, finish_reason) where assistant_message has
+    .content, .tool_calls, and .reasoning attributes.
+
+    When *strip_tool_prefix* is True, removes the ``mcp_`` prefix that was
+    added to tool names for OAuth Claude Code compatibility.
+    """
+    text_parts = []
+    reasoning_parts = []
+    reasoning_details = []
+    tool_calls = []
+
+    for block in response.content:
+        if block.type == "text":
+            text_parts.append(block.text)
+        elif block.type == "thinking":
+            reasoning_parts.append(block.thinking)
+            block_dict = _to_plain_data(block)
+            if isinstance(block_dict, dict):
+                reasoning_details.append(block_dict)
+        elif block.type == "tool_use":
+            name = block.name
+            if strip_tool_prefix and name.startswith(_MCP_TOOL_PREFIX):
+                name = name[len(_MCP_TOOL_PREFIX):]
+            tool_calls.append(
+                SimpleNamespace(
+                    id=block.id,
+                    type="function",
+                    function=SimpleNamespace(
+                        name=name,
+                        arguments=json.dumps(block.input),
+                    ),
+                )
+            )
+
+    # Map Anthropic stop_reason to OpenAI finish_reason.
+    # Newer stop reasons added in Claude 4.5+ / 4.7:
+    #   - refusal: the model declined to answer (cyber safeguards, CSAM, etc.)
+    #   - model_context_window_exceeded: hit context limit (not max_tokens)
+    # Both need distinct handling upstream — a refusal should surface to the
+    # user with a clear message, and a context-window overflow should trigger
+    # compression/truncation rather than be treated as normal end-of-turn.
+    stop_reason_map = {
+        "end_turn": "stop",
+        "tool_use": "tool_calls",
+        "max_tokens": "length",
+        "stop_sequence": "stop",
+        "refusal": "content_filter",
+        "model_context_window_exceeded": "length",
+    }
+    finish_reason = stop_reason_map.get(response.stop_reason, "stop")
+
+    return (
+        SimpleNamespace(
+            content="\n".join(text_parts) if text_parts else None,
+            tool_calls=tool_calls or None,
+            reasoning="\n\n".join(reasoning_parts) if reasoning_parts else None,
+            reasoning_content=None,
+            reasoning_details=reasoning_details or None,
+        ),
+        finish_reason,
+    )
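
The stop-reason translation at the end of `normalize_anthropic_response` can be exercised in isolation. This is a minimal sketch, assuming a stand-in response built from `SimpleNamespace` rather than a real Anthropic SDK object; `map_stop_reason` and `STOP_REASON_MAP` are hypothetical names that do not appear in the diff.

```python
from types import SimpleNamespace

# Copied from the hunk above: Anthropic stop_reason -> OpenAI finish_reason.
STOP_REASON_MAP = {
    "end_turn": "stop",
    "tool_use": "tool_calls",
    "max_tokens": "length",
    "stop_sequence": "stop",
    "refusal": "content_filter",
    "model_context_window_exceeded": "length",
}


def map_stop_reason(response) -> str:
    """Hypothetical helper: stop reasons missing from the table fall back to "stop"."""
    return STOP_REASON_MAP.get(response.stop_reason, "stop")


# Stand-in objects; a real caller would pass the SDK's response object.
refusal = SimpleNamespace(stop_reason="refusal")
unlisted = SimpleNamespace(stop_reason="some_future_reason")

print(map_stop_reason(refusal))   # content_filter
print(map_stop_reason(unlisted))  # stop
```

Because the fallback is `"stop"`, a stop reason missing from the table is indistinguishable from a normal end of turn, which is presumably why the two newer reasons are listed explicitly.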
@@ -151,7 +151,7 @@ _API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
 # differs from their main chat model, map it here.  The vision auto-detect
 # "exotic provider" branch checks this before falling back to the main model.
 _PROVIDER_VISION_MODELS: Dict[str, str] = {
-    "xiaomi": "mimo-v2.5",
+    "xiaomi": "mimo-v2-omni",
    "zai": "glm-5v-turbo",
 }

@@ -573,8 +573,7 @@ class _AnthropicCompletionsAdapter:
        self._is_oauth = is_oauth

    def create(self, **kwargs) -> Any:
-        from agent.anthropic_adapter import build_anthropic_kwargs
-        from agent.transports import get_transport
+        from agent.anthropic_adapter import build_anthropic_kwargs, normalize_anthropic_response

        messages = kwargs.get("messages", [])
        model = kwargs.get("model", self._model)
@@ -611,19 +610,7 @@ class _AnthropicCompletionsAdapter:
                anthropic_kwargs["temperature"] = temperature

        response = self._client.messages.create(**anthropic_kwargs)
-        _transport = get_transport("anthropic_messages")
-        _nr = _transport.normalize_response(
-            response, strip_tool_prefix=self._is_oauth
-        )
-
-        # ToolCall already duck-types as OpenAI shape (.type, .function.name,
-        # .function.arguments) via properties, so no wrapping needed.
-        assistant_message = SimpleNamespace(
-            content=_nr.content,
-            tool_calls=_nr.tool_calls,
-            reasoning=_nr.reasoning,
-        )
-        finish_reason = _nr.finish_reason
+        assistant_message, finish_reason = normalize_anthropic_response(response)

        usage = None
        if hasattr(response, "usage") and response.usage:
@@ -916,19 +903,6 @@ def _try_openrouter() -> Tuple[Optional[OpenAI], Optional[str]]:
                   default_headers=_OR_HEADERS), _OPENROUTER_MODEL


-def _describe_openrouter_unavailable() -> str:
-    """Return a more precise OpenRouter auth failure reason for logs."""
-    pool_present, entry = _select_pool_entry("openrouter")
-    if pool_present:
-        if entry is None:
-            return "OpenRouter credential pool has no usable entries (credentials may be exhausted)"
-        if not _pool_runtime_api_key(entry):
-            return "OpenRouter credential pool entry is missing a runtime API key"
-    if not str(os.getenv("OPENROUTER_API_KEY") or "").strip():
-        return "OPENROUTER_API_KEY not set"
-    return "no usable OpenRouter credentials found"
-
-
 def _try_nous(vision: bool = False) -> Tuple[Optional[OpenAI], Optional[str]]:
    # Check cross-session rate limit guard before attempting Nous —
    # if another session already recorded a 429, skip Nous entirely
@@ -1640,10 +1614,8 @@ def resolve_provider_client(
    if provider == "openrouter":
        client, default = _try_openrouter()
        if client is None:
-            logger.warning(
-                "resolve_provider_client: openrouter requested but %s",
-                _describe_openrouter_unavailable(),
-            )
+            logger.warning("resolve_provider_client: openrouter requested "
+                           "but OPENROUTER_API_KEY not set")
            return None, None
        final_model = _normalize_resolved_model(model or default, provider)
        return (_to_async_client(client, final_model) if async_mode
@@ -45,7 +45,6 @@ class FailoverReason(enum.Enum):

    # Model
    model_not_found = "model_not_found"  # 404 or invalid model — fallback to different model
-    provider_policy_blocked = "provider_policy_blocked"  # Aggregator (e.g. OpenRouter) blocked the only endpoint due to account data/privacy policy

    # Request format
    format_error = "format_error"        # 400 bad request — abort or strip + retry
@@ -195,29 +194,6 @@ _MODEL_NOT_FOUND_PATTERNS = [
    "unsupported model",
 ]

-# OpenRouter aggregator policy-block patterns.
-#
-# When a user's OpenRouter account privacy setting (or a per-request
-# `provider.data_collection: deny` preference) excludes the only endpoint
-# serving a model, OpenRouter returns 404 with a *specific* message that is
-# distinct from "model not found":
-#
-#   "No endpoints available matching your guardrail restrictions and
-#    data policy. Configure: https://openrouter.ai/settings/privacy"
-#
-# We classify this as `provider_policy_blocked` rather than
-# `model_not_found` because:
-#   - The model *exists* — model_not_found is misleading in logs
-#   - Provider fallback won't help: the account-level setting applies to
-#     every call on the same OpenRouter account
-#   - The error body already contains the fix URL, so the user gets
-#     actionable guidance without us rewriting the message
-_PROVIDER_POLICY_BLOCKED_PATTERNS = [
-    "no endpoints available matching your guardrail",
-    "no endpoints available matching your data policy",
-    "no endpoints found matching your data policy",
-]
-
 # Auth patterns (non-status-code signals)
 _AUTH_PATTERNS = [
    "invalid api key",
@@ -547,17 +523,6 @@ def _classify_by_status(
        return _classify_402(error_msg, result_fn)

    if status_code == 404:
-        # OpenRouter policy-block 404 — distinct from "model not found".
-        # The model exists; the user's account privacy setting excludes the
-        # only endpoint serving it. Falling back to another provider won't
-        # help (same account setting applies).  The error body already
-        # contains the fix URL, so just surface it.
-        if any(p in error_msg for p in _PROVIDER_POLICY_BLOCKED_PATTERNS):
-            return result_fn(
-                FailoverReason.provider_policy_blocked,
-                retryable=False,
-                should_fallback=False,
-            )
        if any(p in error_msg for p in _MODEL_NOT_FOUND_PATTERNS):
            return result_fn(
                FailoverReason.model_not_found,
@@ -675,12 +640,6 @@ def _classify_400(
        )

    # Some providers return model-not-found as 400 instead of 404 (e.g. OpenRouter).
-    if any(p in error_msg for p in _PROVIDER_POLICY_BLOCKED_PATTERNS):
-        return result_fn(
-            FailoverReason.provider_policy_blocked,
-            retryable=False,
-            should_fallback=False,
-        )
    if any(p in error_msg for p in _MODEL_NOT_FOUND_PATTERNS):
        return result_fn(
            FailoverReason.model_not_found,
@@ -853,15 +812,6 @@ def _classify_by_message(
            should_fallback=True,
        )

-    # Provider policy-block (aggregator-side guardrail) — check before
-    # model_not_found so we don't mis-label as a missing model.
-    if any(p in error_msg for p in _PROVIDER_POLICY_BLOCKED_PATTERNS):
-        return result_fn(
-            FailoverReason.provider_policy_blocked,
-            retryable=False,
-            should_fallback=False,
-        )
-
    # Model not found patterns
    if any(p in error_msg for p in _MODEL_NOT_FOUND_PATTERNS):
        return result_fn(
@@ -123,10 +123,6 @@ DEFAULT_CONTEXT_LENGTHS = {
    "claude": 200000,
    # OpenAI — GPT-5 family (most have 400k; specific overrides first)
    # Source: https://developers.openai.com/api/docs/models
-    # GPT-5.5 (launched Apr 23 2026). 400k is the fallback for providers we
-    # can't probe live. ChatGPT Codex OAuth actually caps lower (272k as of
-    # Apr 2026) and is resolved via _resolve_codex_oauth_context_length().
-    "gpt-5.5": 400000,
    "gpt-5.4-nano": 400000,           # 400k (not 1.05M like full 5.4)
    "gpt-5.4-mini": 400000,           # 400k (not 1.05M like full 5.4)
    "gpt-5.4": 1050000,               # GPT-5.4, GPT-5.4 Pro (1.05M context)
@@ -187,12 +183,12 @@ DEFAULT_CONTEXT_LENGTHS = {
    "moonshotai/Kimi-K2.6": 262144,
    "moonshotai/Kimi-K2-Thinking": 262144,
    "MiniMaxAI/MiniMax-M2.5": 204800,
-    "XiaomiMiMo/MiMo-V2-Flash": 262144,
-    "mimo-v2-pro": 1048576,
-    "mimo-v2.5-pro": 1048576,
-    "mimo-v2.5": 1048576,
-    "mimo-v2-omni": 262144,
-    "mimo-v2-flash": 262144,
+    "XiaomiMiMo/MiMo-V2-Flash": 256000,
+    "mimo-v2-pro": 1000000,
+    "mimo-v2-omni": 256000,
+    "mimo-v2-flash": 256000,
+    "mimo-v2.5-pro": 1000000,
+    "mimo-v2.5": 1000000,
    "zai-org/GLM-5": 202752,
 }
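
Several keys in `DEFAULT_CONTEXT_LENGTHS` are prefixes of others (`gpt-5.4` versus `gpt-5.4-mini`, `mimo-v2.5` versus `mimo-v2.5-pro`), so a substring lookup has to try the most specific key first. This is a minimal sketch, assuming a longest-key-first substring match; `lookup_context_length` is a hypothetical name, and the lookup this module actually performs is not shown in the diff.

```python
from typing import Dict, Optional

# A few entries taken from the table in the diff (post-change values).
CONTEXT_LENGTHS: Dict[str, int] = {
    "gpt-5.4-nano": 400000,
    "gpt-5.4-mini": 400000,
    "gpt-5.4": 1050000,
    "mimo-v2.5-pro": 1000000,
    "mimo-v2.5": 1000000,
    "mimo-v2-omni": 256000,
}


def lookup_context_length(model: str) -> Optional[int]:
    """Hypothetical lookup: return the value of the longest key contained in the slug."""
    model_lower = model.lower()
    # Longest keys first, so "gpt-5.4-mini" wins over its prefix "gpt-5.4".
    for key in sorted(CONTEXT_LENGTHS, key=len, reverse=True):
        if key.lower() in model_lower:
            return CONTEXT_LENGTHS[key]
    return None


print(lookup_context_length("openai/gpt-5.4-mini"))  # 400000
print(lookup_context_length("gpt-5.4"))              # 1050000
print(lookup_context_length("unknown-model"))        # None
```

Without the length ordering, `openai/gpt-5.4-mini` could match `gpt-5.4` first and report 1.05M instead of 400k.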

@@ -1006,115 +1002,6 @@ def _query_anthropic_context_length(model: str, base_url: str, api_key: str) ->
    return None


-# Known ChatGPT Codex OAuth context windows (observed via live
-# chatgpt.com/backend-api/codex/models probe, Apr 2026). These are the
-# `context_window` values, which are what Codex actually enforces — the
-# direct OpenAI API has larger limits for the same slugs, but Codex OAuth
-# caps lower (e.g. gpt-5.5 is 1.05M on the API, 272K on Codex).
-#
-# Used as a fallback when the live probe fails (no token, network error).
-# Longest keys first so substring match picks the most specific entry.
-_CODEX_OAUTH_CONTEXT_FALLBACK: Dict[str, int] = {
-    "gpt-5.1-codex-max": 272_000,
-    "gpt-5.1-codex-mini": 272_000,
-    "gpt-5.3-codex": 272_000,
-    "gpt-5.2-codex": 272_000,
-    "gpt-5.4-mini": 272_000,
-    "gpt-5.5": 272_000,
-    "gpt-5.4": 272_000,
-    "gpt-5.2": 272_000,
-    "gpt-5": 272_000,
-}
-
-
-_codex_oauth_context_cache: Dict[str, int] = {}
-_codex_oauth_context_cache_time: float = 0.0
-_CODEX_OAUTH_CONTEXT_CACHE_TTL = 3600  # 1 hour
-
-
-def _fetch_codex_oauth_context_lengths(access_token: str) -> Dict[str, int]:
-    """Probe the ChatGPT Codex /models endpoint for per-slug context windows.
-
-    Codex OAuth imposes its own context limits that differ from the direct
-    OpenAI API (e.g. gpt-5.5 is 1.05M on the API, 272K on Codex). The
-    `context_window` field in each model entry is the authoritative source.
-
-    Returns a ``{slug: context_window}`` dict. Empty on failure.
-    """
-    global _codex_oauth_context_cache, _codex_oauth_context_cache_time
-    now = time.time()
-    if (
-        _codex_oauth_context_cache
-        and now - _codex_oauth_context_cache_time < _CODEX_OAUTH_CONTEXT_CACHE_TTL
-    ):
-        return _codex_oauth_context_cache
-
-    try:
-        resp = requests.get(
-            "https://chatgpt.com/backend-api/codex/models?client_version=1.0.0",
-            headers={"Authorization": f"Bearer {access_token}"},
-            timeout=10,
-        )
-        if resp.status_code != 200:
-            logger.debug(
-                "Codex /models probe returned HTTP %s; falling back to hardcoded defaults",
-                resp.status_code,
-            )
-            return {}
-        data = resp.json()
-    except Exception as exc:
-        logger.debug("Codex /models probe failed: %s", exc)
-        return {}
-
-    entries = data.get("models", []) if isinstance(data, dict) else []
-    result: Dict[str, int] = {}
-    for item in entries:
-        if not isinstance(item, dict):
-            continue
-        slug = item.get("slug")
-        ctx = item.get("context_window")
-        if isinstance(slug, str) and isinstance(ctx, int) and ctx > 0:
-            result[slug.strip()] = ctx
-
-    if result:
-        _codex_oauth_context_cache = result
-        _codex_oauth_context_cache_time = now
-    return result
-
-
-def _resolve_codex_oauth_context_length(
-    model: str, access_token: str = ""
-) -> Optional[int]:
-    """Resolve a Codex OAuth model's real context window.
-
-    Prefers a live probe of chatgpt.com/backend-api/codex/models (when we
-    have a bearer token), then falls back to ``_CODEX_OAUTH_CONTEXT_FALLBACK``.
-    """
-    model_bare = _strip_provider_prefix(model).strip()
-    if not model_bare:
-        return None
-
-    if access_token:
-        live = _fetch_codex_oauth_context_lengths(access_token)
-        if model_bare in live:
-            return live[model_bare]
-        # Case-insensitive match in case casing drifts
-        model_lower = model_bare.lower()
-        for slug, ctx in live.items():
-            if slug.lower() == model_lower:
-                return ctx
-
-    # Fallback: longest-key-first substring match over hardcoded defaults.
-    model_lower = model_bare.lower()
-    for slug, ctx in sorted(
-        _CODEX_OAUTH_CONTEXT_FALLBACK.items(), key=lambda x: len(x[0]), reverse=True
-    ):
-        if slug in model_lower:
-            return ctx
-
-    return None
-
-
 def _resolve_nous_context_length(model: str) -> Optional[int]:
    """Resolve Nous Portal model context length via OpenRouter metadata.

@@ -1259,15 +1146,6 @@ def get_model_context_length(
        ctx = _resolve_nous_context_length(model)
        if ctx:
            return ctx
-    if effective_provider == "openai-codex":
-        # Codex OAuth enforces lower context limits than the direct OpenAI
-        # API for the same slug (e.g. gpt-5.5 is 1.05M on the API but 272K
-        # on Codex). Authoritative source is Codex's own /models endpoint.
-        codex_ctx = _resolve_codex_oauth_context_length(model, access_token=api_key or "")
-        if codex_ctx:
-            if base_url:
-                save_context_length(model, base_url, codex_ctx)
-            return codex_ctx
    if effective_provider:
        from agent.models_dev import lookup_models_dev_context
        ctx = lookup_models_dev_context(effective_provider, model)
@@ -418,9 +418,6 @@ def list_provider_models(provider: str) -> List[str]:

    Returns an empty list if the provider is unknown or has no data.
    """
-    from hermes_cli.models import normalize_provider
-    provider = normalize_provider(provider) or provider
-    
    models = _get_provider_models(provider)
    if models is None:
        return []
@@ -1,190 +0,0 @@
-"""Helpers for translating OpenAI-style tool schemas to Moonshot's schema subset.
-
-Moonshot (Kimi) accepts a stricter subset of JSON Schema than standard OpenAI
-tool calling.  Requests that violate it fail with HTTP 400:
-
-    tools.function.parameters is not a valid moonshot flavored json schema,
-    details: <...>
-
-Known rejection modes documented at
-https://forum.moonshot.ai/t/tool-calling-specification-violation-on-moonshot-api/102
-and MoonshotAI/kimi-cli#1595:
-
-1. Every property schema must carry a ``type``.  Standard JSON Schema allows
-   type to be omitted (the value is then unconstrained); Moonshot refuses.
-2. When ``anyOf`` is used, ``type`` must be on the ``anyOf`` children, not
-   the parent.  Presence of both causes "type should be defined in anyOf
-   items instead of the parent schema".
-
-The ``#/definitions/...`` → ``#/$defs/...`` rewrite for draft-07 refs is
-handled separately in ``tools/mcp_tool._normalize_mcp_input_schema`` so it
-applies at MCP registration time for all providers.
-"""
-
-from __future__ import annotations
-
-import copy
-from typing import Any, Dict, List
-
-# Keys whose values are maps of name → schema (not schemas themselves).
-# When we recurse, we walk the values of these maps as schemas, but we do
-# NOT apply the missing-type repair to the map itself.
-_SCHEMA_MAP_KEYS = frozenset({"properties", "patternProperties", "$defs", "definitions"})
-
-# Keys whose values are lists of schemas.
-_SCHEMA_LIST_KEYS = frozenset({"anyOf", "oneOf", "allOf", "prefixItems"})
-
-# Keys whose values are a single nested schema.
-_SCHEMA_NODE_KEYS = frozenset({"items", "contains", "not", "additionalProperties", "propertyNames"})
-
-
-def _repair_schema(node: Any, is_schema: bool = True) -> Any:
-    """Recursively apply Moonshot repairs to a schema node.
-
-    ``is_schema=True`` means this dict is a JSON Schema node and gets the
-    missing-type + anyOf-parent repairs applied.  ``is_schema=False`` means
-    it's a container map (e.g. the value of ``properties``) and we only
-    recurse into its values.
-    """
-    if isinstance(node, list):
-        # Lists only show up under schema-list keys (anyOf/oneOf/allOf), so
-        # every element is itself a schema.
-        return [_repair_schema(item, is_schema=True) for item in node]
-    if not isinstance(node, dict):
-        return node
-
-    # Walk the dict, deciding per-key whether recursion is into a schema
-    # node, a container map, or a scalar.
-    repaired: Dict[str, Any] = {}
-    for key, value in node.items():
-        if key in _SCHEMA_MAP_KEYS and isinstance(value, dict):
-            # Map of name → schema.  Don't treat the map itself as a schema
-            # (it has no type / properties of its own), but each value is.
-            repaired[key] = {
-                sub_key: _repair_schema(sub_val, is_schema=True)
-                for sub_key, sub_val in value.items()
-            }
-        elif key in _SCHEMA_LIST_KEYS and isinstance(value, list):
-            repaired[key] = [_repair_schema(v, is_schema=True) for v in value]
-        elif key in _SCHEMA_NODE_KEYS:
-            # items / not / additionalProperties: single nested schema.
-            # additionalProperties can also be a bool — leave those alone.
-            if isinstance(value, dict):
-                repaired[key] = _repair_schema(value, is_schema=True)
-            else:
-                repaired[key] = value
-        else:
-            # Scalars (description, title, format, enum values, etc.) pass through.
-            repaired[key] = value
-
-    if not is_schema:
-        return repaired
-
-    # Rule 2: when anyOf is present, type belongs only on the children.
-    if "anyOf" in repaired and isinstance(repaired["anyOf"], list):
-        repaired.pop("type", None)
-        return repaired
-
-    # Rule 1: property schemas without type need one.  $ref nodes are exempt
-    # — their type comes from the referenced definition.
-    if "$ref" in repaired:
-        return repaired
-    return _fill_missing_type(repaired)
-
-
-def _fill_missing_type(node: Dict[str, Any]) -> Dict[str, Any]:
-    """Infer a reasonable ``type`` if this schema node has none."""
-    if "type" in node and node["type"] not in (None, ""):
-        return node
-
-    # Heuristic: presence of ``properties`` → object, ``items`` → array, ``enum``
-    # → type of first enum value, else fall back to ``string`` (safest scalar).
-    if "properties" in node or "required" in node or "additionalProperties" in node:
-        inferred = "object"
-    elif "items" in node or "prefixItems" in node:
-        inferred = "array"
-    elif "enum" in node and isinstance(node["enum"], list) and node["enum"]:
-        sample = node["enum"][0]
-        if isinstance(sample, bool):
-            inferred = "boolean"
-        elif isinstance(sample, int):
-            inferred = "integer"
-        elif isinstance(sample, float):
-            inferred = "number"
-        else:
-            inferred = "string"
-    else:
-        inferred = "string"
-
-    return {**node, "type": inferred}
-
-
-def sanitize_moonshot_tool_parameters(parameters: Any) -> Dict[str, Any]:
-    """Normalize tool parameters to a Moonshot-compatible object schema.
-
-    Returns a deep-copied schema with the two flavored-JSON-Schema repairs
-    applied.  Input is not mutated.
-    """
-    if not isinstance(parameters, dict):
-        return {"type": "object", "properties": {}}
-
-    repaired = _repair_schema(copy.deepcopy(parameters), is_schema=True)
-    if not isinstance(repaired, dict):
-        return {"type": "object", "properties": {}}
-
-    # Top-level must be an object schema
-    if repaired.get("type") != "object":
-        repaired["type"] = "object"
-    if "properties" not in repaired:
-        repaired["properties"] = {}
-
-    return repaired
-
-
-def sanitize_moonshot_tools(tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
-    """Apply ``sanitize_moonshot_tool_parameters`` to every tool's parameters."""
-    if not tools:
-        return tools
-
-    sanitized: List[Dict[str, Any]] = []
-    any_change = False
-    for tool in tools:
-        if not isinstance(tool, dict):
-            sanitized.append(tool)
-            continue
-        fn = tool.get("function")
-        if not isinstance(fn, dict):
-            sanitized.append(tool)
-            continue
-        params = fn.get("parameters")
-        repaired = sanitize_moonshot_tool_parameters(params)
-        if repaired is not params:
-            any_change = True
-            new_fn = {**fn, "parameters": repaired}
-            sanitized.append({**tool, "function": new_fn})
-        else:
-            sanitized.append(tool)
-
-    return sanitized if any_change else tools
-
-
-def is_moonshot_model(model: str | None) -> bool:
-    """True for any Kimi / Moonshot model slug, regardless of aggregator prefix.
-
-    Matches bare names (``kimi-k2.6``, ``moonshotai/Kimi-K2.6``) and aggregator-
-    prefixed slugs (``nous/moonshotai/kimi-k2.6``, ``openrouter/moonshotai/...``).
-    Detection by model name covers Nous / OpenRouter / other aggregators that
-    route to Moonshot's inference, where the base URL is the aggregator's, not
-    ``api.moonshot.ai``.
-    """
-    if not model:
-        return False
-    bare = model.strip().lower()
-    # Last path segment (covers aggregator-prefixed slugs)
-    tail = bare.rsplit("/", 1)[-1]
-    if tail.startswith("kimi-") or tail == "kimi":
-        return True
-    # Vendor-prefixed forms commonly used on aggregators
-    if "moonshot" in bare or "/kimi" in bare or bare.startswith("kimi"):
-        return True
-    return False
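Reviewer note: the slug check removed above is small enough to exercise on its own. The following is a minimal standalone sketch that copies the removed logic; the sample slugs in the loop are illustrative assumptions, not values taken from the test suite.

```python
from typing import Optional


def is_moonshot_model(model: Optional[str]) -> bool:
    """Standalone copy of the removed slug check, for illustration only."""
    if not model:
        return False
    bare = model.strip().lower()
    # The last path segment covers aggregator-prefixed slugs.
    tail = bare.rsplit("/", 1)[-1]
    if tail.startswith("kimi-") or tail == "kimi":
        return True
    # Vendor-prefixed forms commonly used on aggregators.
    return "moonshot" in bare or "/kimi" in bare or bare.startswith("kimi")


if __name__ == "__main__":
    # Sample slugs are assumptions chosen to hit each branch.
    for slug in ("kimi-k2.6", "nous/moonshotai/kimi-k2.6", "openai/gpt-4o", None):
        print(repr(slug), is_moonshot_model(slug))
```

Because detection keys on the model name rather than the base URL, the same check would apply to aggregator routes as well as direct endpoints, which is the behavior the removed docstring describes.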
@@ -370,32 +370,6 @@ PLATFORM_HINTS = {
        "MEDIA:/absolute/path/to/file in your response. Images (.jpg, .png, "
        ".heic) appear as photos and other files arrive as attachments."
    ),
-    "mattermost": (
-        "You are in a Mattermost workspace communicating with your user. "
-        "Mattermost renders standard Markdown — headings, bold, italic, code "
-        "blocks, and tables all work. "
-        "You can send media files natively: include MEDIA:/absolute/path/to/file "
-        "in your response. Images (.jpg, .png, .webp) are uploaded as photo "
-        "attachments, audio and video as file attachments. "
-        "Image URLs in markdown format ![alt](url) are rendered as inline previews automatically."
-    ),
-    "matrix": (
-        "You are in a Matrix room communicating with your user. "
-        "Matrix renders Markdown — bold, italic, code blocks, and links work; "
-        "the adapter converts your Markdown to HTML for rich display. "
-        "You can send media files natively: include MEDIA:/absolute/path/to/file "
-        "in your response. Images (.jpg, .png, .webp) are sent as inline photos, "
-        "audio (.ogg, .mp3) as voice/audio messages, video (.mp4) inline, "
-        "and other files as downloadable attachments."
-    ),
-    "feishu": (
-        "You are in a Feishu (Lark) workspace communicating with your user. "
-        "Feishu renders Markdown in messages — bold, italic, code blocks, and "
-        "links are supported. "
-        "You can send media files natively: include MEDIA:/absolute/path/to/file "
-        "in your response. Images (.jpg, .png, .webp) are uploaded and displayed "
-        "inline, audio files as voice messages, and other files as attachments."
-    ),
    "weixin": (
        "You are on Weixin/WeChat. Markdown formatting is supported, so you may use it when "
        "it improves readability, but keep the message compact and chat-friendly. You can send media files natively: "
@@ -345,7 +345,7 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
    _skill_commands = {}
    try:
        from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform, _get_disabled_skill_names
-        from agent.skill_utils import get_external_skills_dirs, iter_skill_index_files
+        from agent.skill_utils import get_external_skills_dirs
        disabled = _get_disabled_skill_names()
        seen_names: set = set()

@@ -356,7 +356,7 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
        dirs_to_scan.extend(get_external_skills_dirs())

        for scan_dir in dirs_to_scan:
-            for skill_md in iter_skill_index_files(scan_dir, "SKILL.md"):
+            for skill_md in scan_dir.rglob("SKILL.md"):
                if any(part in ('.git', '.github', '.hub') for part in skill_md.parts):
                    continue
                try:
@@ -38,7 +38,7 @@ def generate_title(user_message: str, assistant_response: str, timeout: float =
        response = call_llm(
            task="title_generation",
            messages=messages,
-            max_tokens=500,
+            max_tokens=30,
            temperature=0.3,
            timeout=timeout,
        )
@@ -78,52 +78,31 @@ class AnthropicTransport(ProviderTransport):
    def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:
        """Normalize Anthropic response to NormalizedResponse.

-        Parses content blocks (text, thinking, tool_use), maps stop_reason
-        to OpenAI finish_reason, and collects reasoning_details in provider_data.
+        Calls the adapter's v1 normalize and maps the (SimpleNamespace, finish_reason)
+        tuple to the shared NormalizedResponse type.
        """
-        import json
-        from agent.anthropic_adapter import _to_plain_data
-        from agent.transports.types import ToolCall
+        from agent.anthropic_adapter import normalize_anthropic_response
+        from agent.transports.types import build_tool_call

        strip_tool_prefix = kwargs.get("strip_tool_prefix", False)
-        _MCP_PREFIX = "mcp_"
+        assistant_msg, finish_reason = normalize_anthropic_response(response, strip_tool_prefix)

-        text_parts = []
-        reasoning_parts = []
-        reasoning_details = []
-        tool_calls = []
-
-        for block in response.content:
-            if block.type == "text":
-                text_parts.append(block.text)
-            elif block.type == "thinking":
-                reasoning_parts.append(block.thinking)
-                block_dict = _to_plain_data(block)
-                if isinstance(block_dict, dict):
-                    reasoning_details.append(block_dict)
-            elif block.type == "tool_use":
-                name = block.name
-                if strip_tool_prefix and name.startswith(_MCP_PREFIX):
-                    name = name[len(_MCP_PREFIX):]
-                tool_calls.append(
-                    ToolCall(
-                        id=block.id,
-                        name=name,
-                        arguments=json.dumps(block.input),
-                    )
-                )
-
-        finish_reason = self._STOP_REASON_MAP.get(response.stop_reason, "stop")
+        tool_calls = None
+        if assistant_msg.tool_calls:
+            tool_calls = [
+                build_tool_call(id=tc.id, name=tc.function.name, arguments=tc.function.arguments)
+                for tc in assistant_msg.tool_calls
+            ]

        provider_data = {}
-        if reasoning_details:
-            provider_data["reasoning_details"] = reasoning_details
+        if getattr(assistant_msg, "reasoning_details", None):
+            provider_data["reasoning_details"] = assistant_msg.reasoning_details

        return NormalizedResponse(
-            content="\n".join(text_parts) if text_parts else None,
-            tool_calls=tool_calls or None,
+            content=assistant_msg.content,
+            tool_calls=tool_calls,
            finish_reason=finish_reason,
-            reasoning="\n\n".join(reasoning_parts) if reasoning_parts else None,
+            reasoning=getattr(assistant_msg, "reasoning", None),
            usage=None,
            provider_data=provider_data or None,
        )
@@ -12,7 +12,6 @@ reasoning configuration, temperature handling, and extra_body assembly.
 import copy
 from typing import Any, Dict, List, Optional

-from agent.moonshot_schema import is_moonshot_model, sanitize_moonshot_tools
 from agent.prompt_builder import DEVELOPER_ROLE_MODELS
 from agent.transports.base import ProviderTransport
 from agent.transports.types import NormalizedResponse, ToolCall, Usage
@@ -173,11 +172,6 @@ class ChatCompletionsTransport(ProviderTransport):

        # Tools
        if tools:
-            # Moonshot/Kimi uses a stricter flavor of JSON Schema.  Rewriting
-            # tool parameters here keeps aggregator routes (Nous, OpenRouter,
-            # etc.) compatible, in addition to direct moonshot.ai endpoints.
-            if is_moonshot_model(model):
-                tools = sanitize_moonshot_tools(tools)
            api_kwargs["tools"] = tools

        # max_tokens resolution — priority: ephemeral > user > provider default
@@ -37,44 +37,6 @@ class ToolCall:
    arguments: str  # JSON string
    provider_data: Optional[Dict[str, Any]] = field(default=None, repr=False)

-    # ── Backward compatibility ──────────────────────────────────
-    # The agent loop reads tc.function.name / tc.function.arguments
-    # throughout run_agent.py (45+ sites).  These properties let
-    # NormalizedResponse pass through without the _nr_to_assistant_message
-    # shim, while keeping ToolCall's canonical fields flat.
-    @property
-    def type(self) -> str:
-        return "function"
-
-    @property
-    def function(self) -> "ToolCall":
-        """Return self so tc.function.name / tc.function.arguments work."""
-        return self
-
-    @property
-    def call_id(self) -> Optional[str]:
-        """Codex call_id from provider_data, accessed via getattr by _build_assistant_message."""
-        return (self.provider_data or {}).get("call_id")
-
-    @property
-    def response_item_id(self) -> Optional[str]:
-        """Codex response_item_id from provider_data."""
-        return (self.provider_data or {}).get("response_item_id")
-
-    @property
-    def extra_content(self) -> Optional[Dict[str, Any]]:
-        """Gemini extra_content (thought_signature) from provider_data.
-
-        Gemini 3 thinking models attach ``extra_content`` with a
-        ``thought_signature`` to each tool call.  This signature must be
-        replayed on subsequent API calls — without it the API rejects the
-        request with HTTP 400.  The chat_completions transport stores this
-        in ``provider_data["extra_content"]``; this property exposes it so
-        ``_build_assistant_message`` can ``getattr(tc, "extra_content")``
-        uniformly.
-        """
-        return (self.provider_data or {}).get("extra_content")
-

 @dataclass
 class Usage:
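Reviewer note: the backward-compatibility properties removed from `ToolCall` in this hunk rely on one trick, a `function` property that returns `self`. The sketch below is a reduced stand-in, not the real class: the `id` and `name` field types are assumptions, and only two of the removed properties are reproduced.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ToolCall:
    # Field types for id/name are assumed; the diff only shows the last two.
    id: Optional[str]
    name: str
    arguments: str  # JSON string
    provider_data: Optional[Dict[str, Any]] = field(default=None, repr=False)

    @property
    def function(self) -> "ToolCall":
        # Returning self makes tc.function.name resolve to tc.name.
        return self

    @property
    def extra_content(self) -> Optional[Dict[str, Any]]:
        # Surfaced from provider_data so getattr(tc, "extra_content") works.
        return (self.provider_data or {}).get("extra_content")


tc = ToolCall(
    id="call_1",
    name="read_file",
    arguments='{"path": "a.txt"}',
    provider_data={"extra_content": {"thought_signature": "sig"}},
)
print(tc.function.name, tc.function.arguments)
```

With these properties gone, callers that read `tc.function.name` would presumably need an object with a real nested `function` attribute again.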
@@ -108,24 +70,6 @@ class NormalizedResponse:
    usage: Optional[Usage] = None
    provider_data: Optional[Dict[str, Any]] = field(default=None, repr=False)

-    # ── Backward compatibility ──────────────────────────────────
-    # The shim _nr_to_assistant_message() mapped these from provider_data.
-    # These properties let NormalizedResponse pass through directly.
-    @property
-    def reasoning_content(self) -> Optional[str]:
-        pd = self.provider_data or {}
-        return pd.get("reasoning_content")
-
-    @property
-    def reasoning_details(self):
-        pd = self.provider_data or {}
-        return pd.get("reasoning_details")
-
-    @property
-    def codex_reasoning_items(self):
-        pd = self.provider_data or {}
-        return pd.get("codex_reasoning_items")
-

 # ---------------------------------------------------------------------------
 # Factory helpers
@@ -507,13 +507,6 @@ agent:
  # finish, then interrupts anything still running after this timeout.
  # 0 = no drain, interrupt immediately.
  # restart_drain_timeout: 60
-
-  # Max app-level retry attempts for API errors (connection drops, provider
-  # timeouts, 5xx, etc.) before the agent surfaces the failure. Lower this
-  # to 1 if you use fallback providers and want fast failover on flaky
-  # primaries (default 3). The OpenAI SDK does its own low-level retries
-  # underneath this wrapper — this is the Hermes-level loop.
-  # api_max_retries: 3
  
  # Enable verbose logging
  verbose: false
@@ -305,23 +305,13 @@ def load_cli_config() -> Dict[str, Any]:
    
    Environment variables take precedence over config file values.
    Returns default values if no config file exists.
-
-    If HERMES_IGNORE_USER_CONFIG=1 is set (via ``hermes chat --ignore-user-config``),
-    the user config at ``~/.hermes/config.yaml`` is skipped entirely and only the
-    built-in defaults plus the project-level ``cli-config.yaml`` (if any) are used.
-    Credentials in ``.env`` are still loaded — this flag only suppresses
-    behavioral/config settings.
    """
    # Check user config first ({HERMES_HOME}/config.yaml)
    user_config_path = _hermes_home / 'config.yaml'
    project_config_path = Path(__file__).parent / 'cli-config.yaml'

-    # --ignore-user-config: force-skip the user config.yaml (still honor project
-    # config as a fallback so defaults stay sensible).
-    ignore_user_config = os.environ.get("HERMES_IGNORE_USER_CONFIG") == "1"
-
    # Use user config if it exists, otherwise project config
-    if user_config_path.exists() and not ignore_user_config:
+    if user_config_path.exists():
        config_path = user_config_path
    else:
        config_path = project_config_path
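Reviewer note: the selection rule that this hunk simplifies can be sketched in isolation. `choose_config_path` and `demo` are hypothetical names introduced for illustration; only the `HERMES_IGNORE_USER_CONFIG` variable and the two file names come from the diff.

```python
import os
import tempfile
from pathlib import Path


def choose_config_path(user_config: Path, project_config: Path) -> Path:
    """Pick the config file the way the removed branch did (sketch)."""
    ignore_user_config = os.environ.get("HERMES_IGNORE_USER_CONFIG") == "1"
    if user_config.exists() and not ignore_user_config:
        return user_config
    return project_config


def demo() -> tuple:
    """Return the chosen file name without and with the env flag set."""
    with tempfile.TemporaryDirectory() as tmp:
        user = Path(tmp) / "config.yaml"
        user.write_text("verbose: true\n")
        project = Path(tmp) / "cli-config.yaml"
        os.environ.pop("HERMES_IGNORE_USER_CONFIG", None)
        default = choose_config_path(user, project)
        os.environ["HERMES_IGNORE_USER_CONFIG"] = "1"
        try:
            ignored = choose_config_path(user, project)
        finally:
            os.environ.pop("HERMES_IGNORE_USER_CONFIG", None)
        return default.name, ignored.name
```

After this change only the `user_config.exists()` test remains, so the flag no longer has any effect on which file is read.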
@@ -1812,7 +1802,6 @@ class HermesCLI:
        resume: str = None,
        checkpoints: bool = False,
        pass_session_id: bool = False,
-        ignore_rules: bool = False,
    ):
        """
        Initialize the Hermes CLI.
@@ -1966,11 +1955,6 @@ class HermesCLI:
        self.checkpoints_enabled = checkpoints or cp_cfg.get("enabled", False)
        self.checkpoint_max_snapshots = cp_cfg.get("max_snapshots", 50)
        self.pass_session_id = pass_session_id
-        # --ignore-rules: honor either the constructor flag or the env var set
-        # by `hermes chat --ignore-rules` in hermes_cli/main.py. When true we
-        # pass skip_context_files=True and skip_memory=True to AIAgent so
-        # AGENTS.md/SOUL.md/.cursorrules and persistent memory are not loaded.
-        self.ignore_rules = ignore_rules or os.environ.get("HERMES_IGNORE_RULES") == "1"
        
        # Ephemeral system prompt: env var takes precedence, then config
        self.system_prompt = (
@@ -3328,8 +3312,6 @@ class HermesCLI:
                checkpoints_enabled=self.checkpoints_enabled,
                checkpoint_max_snapshots=self.checkpoint_max_snapshots,
                pass_session_id=self.pass_session_id,
-                skip_context_files=self.ignore_rules,
-                skip_memory=self.ignore_rules,
                tool_progress_callback=self._on_tool_progress,
                tool_start_callback=self._on_tool_start if self._inline_diffs_enabled else None,
                tool_complete_callback=self._on_tool_complete if self._inline_diffs_enabled else None,
@@ -6685,13 +6667,6 @@ class HermesCLI:
                print(f"   ⚠ Port {_port} is not reachable at {cdp_url}")

            os.environ["BROWSER_CDP_URL"] = cdp_url
-            # Eagerly start the CDP supervisor so pending_dialogs + frame_tree
-            # show up in the next browser_snapshot.  No-op if already started.
-            try:
-                from tools.browser_tool import _ensure_cdp_supervisor  # type: ignore[import-not-found]
-                _ensure_cdp_supervisor("default")
-            except Exception:
-                pass
            print()
            print("🌐 Browser connected to live Chrome via CDP")
            print(f"   Endpoint: {cdp_url}")
@@ -6713,8 +6688,7 @@ class HermesCLI:
            if current:
                os.environ.pop("BROWSER_CDP_URL", None)
                try:
-                    from tools.browser_tool import cleanup_all_browsers, _stop_cdp_supervisor
-                    _stop_cdp_supervisor("default")
+                    from tools.browser_tool import cleanup_all_browsers
                    cleanup_all_browsers()
                except Exception:
                    pass
@@ -10842,8 +10816,6 @@ def main(
    w: bool = False,
    checkpoints: bool = False,
    pass_session_id: bool = False,
-    ignore_user_config: bool = False,
-    ignore_rules: bool = False,
 ):
    """
    Hermes Agent CLI - Interactive AI Assistant
@@ -10953,7 +10925,6 @@ def main(
        resume=resume,
        checkpoints=checkpoints,
        pass_session_id=pass_session_id,
-        ignore_rules=ignore_rules,
    )

    if parsed_skills:
@@ -384,7 +384,6 @@ def create_job(
    provider: Optional[str] = None,
    base_url: Optional[str] = None,
    script: Optional[str] = None,
-    enabled_toolsets: Optional[List[str]] = None,
 ) -> Dict[str, Any]:
    """
    Create a new cron job.
@@ -404,9 +403,6 @@ def create_job(
        script: Optional path to a Python script whose stdout is injected into the
                prompt each run.  The script runs before the agent turn, and its output
                is prepended as context.  Useful for data collection / change detection.
-        enabled_toolsets: Optional list of toolset names to restrict the agent to.
-                          When set, only tools from these toolsets are loaded, reducing
-                          token overhead. When omitted, all default tools are loaded.

    Returns:
        The created job dict
@@ -437,8 +433,6 @@ def create_job(
    normalized_base_url = normalized_base_url or None
    normalized_script = str(script).strip() if isinstance(script, str) else None
    normalized_script = normalized_script or None
-    normalized_toolsets = [str(t).strip() for t in enabled_toolsets if str(t).strip()] if enabled_toolsets else None
-    normalized_toolsets = normalized_toolsets or None

    label_source = (prompt or (normalized_skills[0] if normalized_skills else None)) or "cron job"
    job = {
@@ -470,7 +464,6 @@ def create_job(
        # Delivery configuration
        "deliver": deliver,
        "origin": origin,  # Tracks where job was created for "origin" delivery
-        "enabled_toolsets": normalized_toolsets,
    }

    jobs = load_jobs()
@@ -40,37 +40,6 @@ from hermes_time import now as _hermes_now

 logger = logging.getLogger(__name__)

-
-def _resolve_cron_enabled_toolsets(job: dict, cfg: dict) -> list[str] | None:
-    """Resolve the toolset list for a cron job.
-
-    Precedence:
-    1. Per-job ``enabled_toolsets`` (set via ``cronjob`` tool on create/update).
-       Keeps the agent's job-scoped toolset override intact — #6130.
-    2. Per-platform ``hermes tools`` config for the ``cron`` platform.
-       Mirrors gateway behavior (``_get_platform_tools(cfg, platform_key)``)
-       so users can gate cron toolsets globally without recreating every job.
-    3. ``None`` on any lookup failure — AIAgent loads the full default set
-       (legacy behavior before this change, preserved as the safety net).
-
-    _DEFAULT_OFF_TOOLSETS ({moa, homeassistant, rl}) are removed by
-    ``_get_platform_tools`` for unconfigured platforms, so fresh installs
-    get cron WITHOUT ``moa`` by default (issue reported by Norbert —
-    surprise $4.63 run).
-    """
-    per_job = job.get("enabled_toolsets")
-    if per_job:
-        return per_job
-    try:
-        from hermes_cli.tools_config import _get_platform_tools  # lazy: avoid heavy import at cron module load
-        return sorted(_get_platform_tools(cfg or {}, "cron"))
-    except Exception as exc:
-        logger.warning(
-            "Cron toolset resolution failed, falling back to full default toolset: %s",
-            exc,
-        )
-        return None
-
 # Valid delivery platforms — used to validate user-supplied platform names
 # in cron delivery targets, preventing env var enumeration via crafted names.
 _KNOWN_DELIVERY_PLATFORMS = frozenset({
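Reviewer note: the three-step precedence documented in the removed `_resolve_cron_enabled_toolsets` helper can be illustrated without the Hermes imports. In this sketch `platform_lookup` is a hypothetical injected callable standing in for `_get_platform_tools`; the real helper imported it lazily and logged a warning on failure.

```python
from typing import Callable, Dict, List, Optional, Set


def resolve_enabled_toolsets(
    job: Dict,
    cfg: Optional[Dict],
    platform_lookup: Callable[[Dict, str], Set[str]],
) -> Optional[List[str]]:
    """Per-job override first, then platform config, then None."""
    per_job = job.get("enabled_toolsets")
    if per_job:
        return per_job
    try:
        return sorted(platform_lookup(cfg or {}, "cron"))
    except Exception:
        # None tells the caller to load the full default toolset.
        return None


def failing_lookup(cfg: Dict, platform: str) -> Set[str]:
    raise RuntimeError("config unavailable")
```

Returning `None` rather than an empty list matters here: an empty list would plausibly mean "no tools", while `None` preserves the legacy behavior of loading everything.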
@@ -917,7 +886,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
            providers_ignored=pr.get("ignore"),
            providers_order=pr.get("order"),
            provider_sort=pr.get("sort"),
-            enabled_toolsets=_resolve_cron_enabled_toolsets(job, _cfg),
            disabled_toolsets=["cronjob", "messaging", "clarify"],
            quiet_mode=True,
            skip_context_files=True,  # Don't inject SOUL.md/AGENTS.md from scheduler cwd
@@ -1004,12 +972,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
                f"— last activity: {_last_desc}"
            )

-        # Guard against non-dict returns from run_conversation under error conditions
-        if not isinstance(result, dict):
-            raise RuntimeError(
-                f"agent.run_conversation returned {type(result).__name__} instead of dict: {result!r}"
-            )
-
        final_response = result.get("final_response", "") or ""
        # Strip leaked placeholder text that upstream may inject on empty completions.
        if final_response.strip() == "(No response generated)":
@@ -58,13 +58,6 @@ if [ ! -f "$HERMES_HOME/config.yaml" ]; then
    cp "$INSTALL_DIR/cli-config.yaml.example" "$HERMES_HOME/config.yaml"
 fi

-# Ensure the main config file remains accessible to the hermes runtime user
-# even if it was edited on the host after initial ownership setup.
-if [ -f "$HERMES_HOME/config.yaml" ]; then
-    chown hermes:hermes "$HERMES_HOME/config.yaml"
-    chmod 640 "$HERMES_HOME/config.yaml"
-fi
-
 # SOUL.md
 if [ ! -f "$HERMES_HOME/SOUL.md" ]; then
    cp "$INSTALL_DIR/docker/SOUL.md" "$HERMES_HOME/SOUL.md"
@@ -752,10 +752,7 @@ class MessageEvent:
        if not self.is_command():
            return self.text
        parts = self.text.split(maxsplit=1)
-        args = parts[1] if len(parts) > 1 else ""
-        # iOS auto-corrects -- to — (em dash) and - to – (en dash)
-        args = args.replace("\u2014\u2014", "--").replace("\u2014", "--").replace("\u2013", "-")
-        return args
+        return parts[1] if len(parts) > 1 else ""


 @dataclass
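Reviewer note: the dash normalization dropped from `get_command_args` is easy to check in isolation. `normalize_dashes` is a hypothetical helper name; the replacement chain is copied from the removed lines.

```python
def normalize_dashes(args: str) -> str:
    """Undo iOS auto-correction of '--' and '-' in command arguments."""
    return (
        args.replace("\u2014\u2014", "--")  # doubled em dash collapses to one "--"
        .replace("\u2014", "--")            # single em dash becomes "--"
        .replace("\u2013", "-")             # en dash becomes "-"
    )


print(normalize_dashes("\u2014force \u2013v"))
```

Without this step, a flag typed on an iOS keyboard would reach the command parser with the Unicode dash intact and would likely not be recognized as a flag.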
@@ -900,16 +897,10 @@ class BasePlatformAdapter(ABC):
        self._fatal_error_retryable = True
        self._fatal_error_handler: Optional[Callable[["BasePlatformAdapter"], Awaitable[None] | None]] = None
        
-        # Track active message handlers per session for interrupt support.
-        # _active_sessions stores the per-session interrupt Event; _session_tasks
-        # maps session → the specific Task currently processing it so that
-        # session-terminating commands (/stop, /new, /reset) can cancel the
-        # right task and release the adapter-level guard deterministically.
-        # Without the owner-task map, an old task's finally block could delete
-        # a newer task's guard, leaving stale busy state.
+        # Track active message handlers per session for interrupt support
+        # Key: session_key (e.g., chat_id), Value: asyncio.Event used to signal an interrupt
        self._active_sessions: Dict[str, asyncio.Event] = {}
        self._pending_messages: Dict[str, MessageEvent] = {}
-        self._session_tasks: Dict[str, asyncio.Task] = {}
        # Background message-processing tasks spawned by handle_message().
        # Gateway shutdown cancels these so an old gateway instance doesn't keep
        # working on a task after --replace or manual restarts.
@@ -1352,7 +1343,7 @@ class BasePlatformAdapter(ABC):
        # Extract MEDIA:<path> tags, allowing optional whitespace after the colon
        # and quoted/backticked paths for LLM-formatted outputs.
        media_pattern = re.compile(
-            r'''[`"']?MEDIA:\s*(?P<path>`[^`\n]+`|"[^"\n]+"|'[^'\n]+'|(?:~/|/)\S+(?:[^\S\n]+\S+)*?\.(?:png|jpe?g|gif|webp|mp4|mov|avi|mkv|webm|ogg|opus|mp3|wav|m4a|epub|pdf|zip|rar|7z|docx?|xlsx?|pptx?|txt|csv|apk|ipa)(?=[\s`"',;:)\]}]|$)|\S+)[`"']?'''
+            r'''[`"']?MEDIA:\s*(?P<path>`[^`\n]+`|"[^"\n]+"|'[^'\n]+'|(?:~/|/)\S+(?:[^\S\n]+\S+)*?\.(?:png|jpe?g|gif|webp|mp4|mov|avi|mkv|webm|ogg|opus|mp3|wav|m4a|pdf)(?=[\s`"',;:)\]}]|$)|\S+)[`"']?'''
        )
        for match in media_pattern.finditer(content):
            path = match.group("path").strip()
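Reviewer note: the practical effect of shortening the extension list shows up for unquoted paths that contain spaces. The sketch below rebuilds both patterns from the two lines in this hunk; `first_path` and the sample strings are assumptions added for illustration.

```python
import re

# Shared prefix and suffix of the pattern, split around the extension list.
_PREFIX = r'''[`"']?MEDIA:\s*(?P<path>`[^`\n]+`|"[^"\n]+"|'[^'\n]+'|(?:~/|/)\S+(?:[^\S\n]+\S+)*?\.(?:'''
_SUFFIX = r''')(?=[\s`"',;:)\]}]|$)|\S+)[`"']?'''

OLD_EXTS = (
    "png|jpe?g|gif|webp|mp4|mov|avi|mkv|webm|ogg|opus|mp3|wav|m4a|"
    "epub|pdf|zip|rar|7z|docx?|xlsx?|pptx?|txt|csv|apk|ipa"
)
NEW_EXTS = "png|jpe?g|gif|webp|mp4|mov|avi|mkv|webm|ogg|opus|mp3|wav|m4a|pdf"

old_pattern = re.compile(_PREFIX + OLD_EXTS + _SUFFIX)
new_pattern = re.compile(_PREFIX + NEW_EXTS + _SUFFIX)


def first_path(pattern: "re.Pattern[str]", content: str):
    """Return the first MEDIA path found in content, or None."""
    match = pattern.search(content)
    return match.group("path").strip() if match else None


sample = "MEDIA:/tmp/my book.epub"
print(first_path(old_pattern, sample))
print(first_path(new_pattern, sample))
```

Paths without spaces still match through the trailing `\S+` alternative regardless of extension, so the narrowed list mainly changes how far an unquoted path with spaces is allowed to extend.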
@@ -1686,222 +1677,6 @@ class BasePlatformAdapter(ABC):
            return f"{existing_text}\n\n{new_text}".strip()
        return existing_text

-    # ------------------------------------------------------------------
-    # Session task + guard ownership helpers
-    # ------------------------------------------------------------------
-    # These were introduced together with the _session_tasks owner map to
-    # make session lifecycle reconciliation deterministic across (a) the
-    # normal completion path, (b) /stop, /new, /reset bypass commands,
-    # and (c) stale-lock self-heal on the next inbound message.
-
-    def _release_session_guard(
-        self,
-        session_key: str,
-        *,
-        guard: Optional[asyncio.Event] = None,
-    ) -> None:
-        """Release the adapter-level guard for a session.
-
-        When ``guard`` is provided, only release the entry if it still points
-        at that exact Event.  This lets reset-like commands swap in a temporary
-        guard while the old processing task unwinds, without having the old
-        task's cleanup accidentally clear the replacement guard.
-        """
-        current_guard = self._active_sessions.get(session_key)
-        if current_guard is None:
-            return
-        if guard is not None and current_guard is not guard:
-            return
-        del self._active_sessions[session_key]
-
-    def _session_task_is_stale(self, session_key: str) -> bool:
-        """Return True if the owner task for ``session_key`` is done/cancelled.
-
-        A lock is "stale" when the adapter still has ``_active_sessions[key]``
-        AND a known owner task in ``_session_tasks`` that has already exited.
-        When there is no owner task at all, that usually means the guard was
-        installed by some path other than handle_message() (tests sometimes
-        install guards directly) — don't treat that as stale.  The on-entry
-        self-heal only needs to handle the production split-brain case where
-        an owner task was recorded, then exited without clearing its guard.
-        """
-        task = self._session_tasks.get(session_key)
-        if task is None:
-            return False
-        done = getattr(task, "done", None)
-        return bool(done and done())
-
-    def _heal_stale_session_lock(self, session_key: str) -> bool:
-        """Clear a stale session lock if the owner task is already gone.
-
-        Returns True if a stale lock was healed.  Returns False if there is
-        no lock, or the owner task is still alive (the normal busy case).
-
-        This is the on-entry safety net sidbin's issue #11016 analysis calls
-        for: without it, a split-brain — adapter still thinks the session is
-        active, but nothing is actually processing — traps the chat in
-        infinite "Interrupting current task..." until the gateway is
-        restarted.
-        """
-        if session_key not in self._active_sessions:
-            return False
-        if not self._session_task_is_stale(session_key):
-            return False
-        logger.warning(
-            "[%s] Healing stale session lock for %s (owner task is done/absent)",
-            self.name,
-            session_key,
-        )
-        self._active_sessions.pop(session_key, None)
-        self._pending_messages.pop(session_key, None)
-        self._session_tasks.pop(session_key, None)
-        return True
-
-    def _start_session_processing(
-        self,
-        event: MessageEvent,
-        session_key: str,
-        *,
-        interrupt_event: Optional[asyncio.Event] = None,
-    ) -> bool:
-        """Spawn a background processing task under the given session guard.
-
-        Returns True on success.  If the runtime stubs ``create_task`` with a
-        non-Task sentinel (some tests do this), the guard is rolled back and
-        False is returned so the caller isn't left holding a half-installed
-        session lock.
-        """
-        guard = interrupt_event or asyncio.Event()
-        self._active_sessions[session_key] = guard
-
-        task = asyncio.create_task(self._process_message_background(event, session_key))
-        self._session_tasks[session_key] = task
-        try:
-            self._background_tasks.add(task)
-        except TypeError:
-            # Tests stub create_task() with lightweight sentinels that are not
-            # hashable and do not support lifecycle callbacks.
-            self._session_tasks.pop(session_key, None)
-            self._release_session_guard(session_key, guard=guard)
-            return False
-        if hasattr(task, "add_done_callback"):
-            task.add_done_callback(self._background_tasks.discard)
-            task.add_done_callback(self._expected_cancelled_tasks.discard)
-        return True
-
-    async def cancel_session_processing(
-        self,
-        session_key: str,
-        *,
-        release_guard: bool = True,
-        discard_pending: bool = True,
-    ) -> None:
-        """Cancel in-flight processing for a single session.
-
-        ``release_guard=False`` keeps the adapter-level session guard in place
-        so reset-like commands can finish atomically before follow-up messages
-        are allowed to start a fresh background task.
-        """
-        task = self._session_tasks.pop(session_key, None)
-        if task is not None and not task.done():
-            logger.debug(
-                "[%s] Cancelling active processing for session %s",
-                self.name,
-                session_key,
-            )
-            self._expected_cancelled_tasks.add(task)
-            task.cancel()
-            try:
-                await task
-            except asyncio.CancelledError:
-                pass
-            except Exception:
-                logger.debug(
-                    "[%s] Session cancellation raised while unwinding %s",
-                    self.name,
-                    session_key,
-                    exc_info=True,
-                )
-        if discard_pending:
-            self._pending_messages.pop(session_key, None)
-        if release_guard:
-            self._release_session_guard(session_key)
-
-    async def _drain_pending_after_session_command(
-        self,
-        session_key: str,
-        command_guard: asyncio.Event,
-    ) -> None:
-        """Resume the latest queued follow-up once a session command completes.
-
-        Called at the tail of /stop, /new, and /reset dispatch.  Releases the
-        command-scoped guard, then — if a follow-up message landed while the
-        command was running — spawns a fresh processing task for it.
-        """
-        pending_event = self._pending_messages.pop(session_key, None)
-        self._release_session_guard(session_key, guard=command_guard)
-        if pending_event is None:
-            return
-        self._start_session_processing(pending_event, session_key)
-
-    async def _dispatch_active_session_command(
-        self,
-        event: MessageEvent,
-        session_key: str,
-        cmd: str,
-    ) -> None:
-        """Dispatch a reset-like bypass command while preserving guard ordering.
-
-        /stop, /new, and /reset must:
-          1. Keep the session guard installed while the runner processes the
-             command (so a racing follow-up message stays queued, not
-             dispatched as a second parallel run).
-          2. Cancel the old in-flight adapter task only AFTER the runner has
-             finished handling the command (so the runner sees consistent
-             state and its response is sent in order).
-          3. Release the command-scoped guard and drain the latest queued
-             follow-up exactly once, after 1 and 2 complete.
-        """
-        logger.debug(
-            "[%s] Command '/%s' bypassing active-session guard for %s",
-            self.name,
-            cmd,
-            session_key,
-        )
-
-        current_guard = self._active_sessions.get(session_key)
-        command_guard = asyncio.Event()
-        self._active_sessions[session_key] = command_guard
-        thread_meta = {"thread_id": event.source.thread_id} if event.source.thread_id else None
-
-        try:
-            response = await self._message_handler(event)
-            # Old adapter task (if any) is cancelled AFTER the runner has
-            # fully handled the command — keeps ordering deterministic.
-            await self.cancel_session_processing(
-                session_key,
-                release_guard=False,
-                discard_pending=False,
-            )
-            if response:
-                await self._send_with_retry(
-                    chat_id=event.source.chat_id,
-                    content=response,
-                    reply_to=event.message_id,
-                    metadata=thread_meta,
-                )
-        except Exception:
-            # On failure, restore the original guard if one still exists so
-            # we don't leave the session in a half-reset state.
-            if self._active_sessions.get(session_key) is command_guard:
-                if session_key in self._session_tasks and current_guard is not None:
-                    self._active_sessions[session_key] = current_guard
-                else:
-                    self._release_session_guard(session_key, guard=command_guard)
-            raise
-
-        await self._drain_pending_after_session_command(session_key, command_guard)
-
    async def handle_message(self, event: MessageEvent) -> None:
        """
        Process an incoming message.
@@ -1918,15 +1693,7 @@ class BasePlatformAdapter(ABC):
            group_sessions_per_user=self.config.extra.get("group_sessions_per_user", True),
            thread_sessions_per_user=self.config.extra.get("thread_sessions_per_user", False),
        )
-
-        # On-entry self-heal: if the adapter still has an _active_sessions
-        # entry for this key but the owner task has already exited (done or
-        # cancelled), the lock is stale.  Clear it and fall through to
-        # normal dispatch so the user isn't trapped behind a dead guard —
-        # this is the split-brain tail described in issue #11016.
-        if session_key in self._active_sessions:
-            self._heal_stale_session_lock(session_key)
-
+
        # Check if there's already an active handler for this session
        if session_key in self._active_sessions:
            # Certain commands must bypass the active-session guard and be
@@ -1943,23 +1710,6 @@ class BasePlatformAdapter(ABC):
            from hermes_cli.commands import should_bypass_active_session

            if should_bypass_active_session(cmd):
-                # /stop, /new, /reset must cancel the in-flight adapter task
-                # and preserve ordering of queued follow-ups.  Route those
-                # through the dedicated handoff path that serializes
-                # cancellation + runner response + pending drain.
-                if cmd in ("stop", "new", "reset"):
-                    try:
-                        await self._dispatch_active_session_command(event, session_key, cmd)
-                    except Exception as e:
-                        logger.error(
-                            "[%s] Command '/%s' dispatch failed: %s",
-                            self.name, cmd, e, exc_info=True,
-                        )
-                    return
-
-                # Other bypass commands (/approve, /deny, /status,
-                # /background, /restart) just need direct dispatch — they
-                # don't cancel the running task.
                logger.debug(
                    "[%s] Command '/%s' bypassing active-session guard for %s",
                    self.name, cmd, session_key,
@@ -2005,9 +1755,19 @@ class BasePlatformAdapter(ABC):
        # starts would also pass the _active_sessions check and spawn a
        # duplicate task.  (grammY sequentialize / aiogram EventIsolation
        # pattern — set the guard synchronously, not inside the task.)
-        # _start_session_processing installs the guard AND the owner-task
-        # mapping atomically so stale-lock detection works.
-        self._start_session_processing(event, session_key)
+        self._active_sessions[session_key] = asyncio.Event()
+
+        # Spawn background task to process this message
+        task = asyncio.create_task(self._process_message_background(event, session_key))
+        try:
+            self._background_tasks.add(task)
+        except TypeError:
+            # Some tests stub create_task() with lightweight sentinels that are not
+            # hashable and do not support lifecycle callbacks.
+            return
+        if hasattr(task, "add_done_callback"):
+            task.add_done_callback(self._background_tasks.discard)
+            task.add_done_callback(self._expected_cancelled_tasks.discard)
    
    @staticmethod
    def _get_human_delay() -> float:
@@ -2367,9 +2127,6 @@ class BasePlatformAdapter(ABC):
                drain_task = asyncio.create_task(
                    self._process_message_background(late_pending, session_key)
                )
-                # Hand ownership of the session to the drain task so stale-lock
-                # detection keeps working while it runs.
-                self._session_tasks[session_key] = drain_task
                try:
                    self._background_tasks.add(drain_task)
                    drain_task.add_done_callback(self._background_tasks.discard)
@@ -2379,14 +2136,9 @@ class BasePlatformAdapter(ABC):
                # Leave _active_sessions[session_key] populated — the drain
                # task's own lifecycle will clean it up.
            else:
-                # Clean up session tracking.  Guard-match both deletes so a
-                # reset-like command that already swapped in its own
-                # command_guard (and cancelled us) can't be accidentally
-                # cleared by our unwind.  The command owns the session now.
-                current_task = asyncio.current_task()
-                if current_task is not None and self._session_tasks.get(session_key) is current_task:
-                    del self._session_tasks[session_key]
-                self._release_session_guard(session_key, guard=interrupt_event)
+                # Clean up session tracking
+                if session_key in self._active_sessions:
+                    del self._active_sessions[session_key]
    
    async def cancel_background_tasks(self) -> None:
        """Cancel any in-flight background message-processing tasks.
@@ -2416,7 +2168,6 @@ class BasePlatformAdapter(ABC):
            # will be in self._background_tasks now.  Re-check.
        self._background_tasks.clear()
        self._expected_cancelled_tasks.clear()
-        self._session_tasks.clear()
        self._pending_messages.clear()
        self._active_sessions.clear()

@@ -2440,9 +2191,6 @@ class BasePlatformAdapter(ABC):
        user_id_alt: Optional[str] = None,
        chat_id_alt: Optional[str] = None,
        is_bot: bool = False,
-        guild_id: Optional[str] = None,
-        parent_chat_id: Optional[str] = None,
-        message_id: Optional[str] = None,
    ) -> SessionSource:
        """Helper to build a SessionSource for this platform."""
        # Normalize empty topic to None
@@ -2460,9 +2208,6 @@ class BasePlatformAdapter(ABC):
            user_id_alt=user_id_alt,
            chat_id_alt=chat_id_alt,
            is_bot=is_bot,
-            guild_id=str(guild_id) if guild_id else None,
-            parent_chat_id=str(parent_chat_id) if parent_chat_id else None,
-            message_id=str(message_id) if message_id else None,
        )
    
    @abstractmethod
@@ -23,7 +23,6 @@ from typing import Callable, Dict, Optional, Any
 logger = logging.getLogger(__name__)

 VALID_THREAD_AUTO_ARCHIVE_MINUTES = {60, 1440, 4320, 10080}
-_DISCORD_COMMAND_SYNC_POLICIES = {"safe", "bulk", "off"}

 try:
    import discord
@@ -528,7 +527,6 @@ class DiscordAdapter(BasePlatformAdapter):
        # Reply threading mode: "off" (no replies), "first" (reply on first
        # chunk only, default), "all" (reply-reference on every chunk).
        self._reply_to_mode: str = getattr(config, 'reply_to_mode', 'first') or 'first'
-        self._slash_commands: bool = self.config.extra.get("slash_commands", True)

    async def connect(self) -> bool:
        """Connect to Discord and start receiving events."""
@@ -746,8 +744,7 @@ class DiscordAdapter(BasePlatformAdapter):
                    )

            # Register slash commands
-            if self._slash_commands:
-                self._register_slash_commands()
+            self._register_slash_commands()

            # Start the bot in background
            self._bot_task = asyncio.create_task(self._client.start(self.config.token))
@@ -803,27 +800,8 @@ class DiscordAdapter(BasePlatformAdapter):
        if not self._client:
            return
        try:
-            sync_policy = self._get_discord_command_sync_policy()
-            if sync_policy == "off":
-                logger.info("[%s] Skipping Discord slash command sync (policy=off)", self.name)
-                return
-
-            if sync_policy == "bulk":
-                synced = await asyncio.wait_for(self._client.tree.sync(), timeout=30)
-                logger.info("[%s] Synced %d slash command(s) via bulk tree sync", self.name, len(synced))
-                return
-
-            summary = await asyncio.wait_for(self._safe_sync_slash_commands(), timeout=30)
-            logger.info(
-                "[%s] Safely reconciled %d slash command(s): unchanged=%d updated=%d recreated=%d created=%d deleted=%d",
-                self.name,
-                summary["total"],
-                summary["unchanged"],
-                summary["updated"],
-                summary["recreated"],
-                summary["created"],
-                summary["deleted"],
-            )
+            synced = await asyncio.wait_for(self._client.tree.sync(), timeout=30)
+            logger.info("[%s] Synced %d slash command(s)", self.name, len(synced))
        except asyncio.TimeoutError:
            logger.warning("[%s] Slash command sync timed out after 30s", self.name)
        except asyncio.CancelledError:
@@ -831,183 +809,6 @@ class DiscordAdapter(BasePlatformAdapter):
        except Exception as e:  # pragma: no cover - defensive logging
            logger.warning("[%s] Slash command sync failed: %s", self.name, e, exc_info=True)

-    def _get_discord_command_sync_policy(self) -> str:
-        raw = str(os.getenv("DISCORD_COMMAND_SYNC_POLICY", "safe") or "").strip().lower()
-        if raw in _DISCORD_COMMAND_SYNC_POLICIES:
-            return raw
-        if raw:
-            logger.warning(
-                "[%s] Invalid DISCORD_COMMAND_SYNC_POLICY=%r; falling back to 'safe'",
-                self.name,
-                raw,
-            )
-        return "safe"
-
-    def _canonicalize_app_command_payload(self, payload: Dict[str, Any]) -> Dict[str, Any]:
-        """Reduce command payloads to the semantic fields Hermes manages."""
-        contexts = payload.get("contexts")
-        integration_types = payload.get("integration_types")
-        return {
-            "type": int(payload.get("type", 1) or 1),
-            "name": str(payload.get("name", "") or ""),
-            "description": str(payload.get("description", "") or ""),
-            "default_member_permissions": self._normalize_permissions(
-                payload.get("default_member_permissions")
-            ),
-            "dm_permission": bool(payload.get("dm_permission", True)),
-            "nsfw": bool(payload.get("nsfw", False)),
-            "contexts": sorted(int(c) for c in contexts) if contexts else None,
-            "integration_types": (
-                sorted(int(i) for i in integration_types) if integration_types else None
-            ),
-            "options": [
-                self._canonicalize_app_command_option(item)
-                for item in payload.get("options", []) or []
-                if isinstance(item, dict)
-            ],
-        }
-
-    @staticmethod
-    def _normalize_permissions(value: Any) -> Optional[str]:
-        """Discord emits default_member_permissions as str server-side but discord.py
-        sets it as int locally. Normalize to str-or-None so the comparison is stable."""
-        if value is None:
-            return None
-        return str(value)
-
-    def _existing_command_to_payload(self, command: Any) -> Dict[str, Any]:
-        """Build a canonical-ready dict from an AppCommand.
-
-        discord.py's AppCommand.to_dict() does NOT include nsfw,
-        dm_permission, or default_member_permissions (they live only on the
-        attributes). Pull them from the attributes so the canonicalizer sees
-        the real server-side values instead of defaults — otherwise any
-        command using non-default permissions would diff on every startup.
-        """
-        payload = dict(command.to_dict())
-        nsfw = getattr(command, "nsfw", None)
-        if nsfw is not None:
-            payload["nsfw"] = bool(nsfw)
-        guild_only = getattr(command, "guild_only", None)
-        if guild_only is not None:
-            payload["dm_permission"] = not bool(guild_only)
-        default_permissions = getattr(command, "default_member_permissions", None)
-        if default_permissions is not None:
-            payload["default_member_permissions"] = getattr(
-                default_permissions, "value", default_permissions
-            )
-        return payload
-
-    def _canonicalize_app_command_option(self, payload: Dict[str, Any]) -> Dict[str, Any]:
-        return {
-            "type": int(payload.get("type", 0) or 0),
-            "name": str(payload.get("name", "") or ""),
-            "description": str(payload.get("description", "") or ""),
-            "required": bool(payload.get("required", False)),
-            "autocomplete": bool(payload.get("autocomplete", False)),
-            "choices": [
-                {
-                    "name": str(choice.get("name", "") or ""),
-                    "value": choice.get("value"),
-                }
-                for choice in payload.get("choices", []) or []
-                if isinstance(choice, dict)
-            ],
-            "channel_types": list(payload.get("channel_types", []) or []),
-            "min_value": payload.get("min_value"),
-            "max_value": payload.get("max_value"),
-            "min_length": payload.get("min_length"),
-            "max_length": payload.get("max_length"),
-            "options": [
-                self._canonicalize_app_command_option(item)
-                for item in payload.get("options", []) or []
-                if isinstance(item, dict)
-            ],
-        }
-
-    def _patchable_app_command_payload(self, payload: Dict[str, Any]) -> Dict[str, Any]:
-        """Fields supported by discord.py's edit_global_command route."""
-        canonical = self._canonicalize_app_command_payload(payload)
-        return {
-            "name": canonical["name"],
-            "description": canonical["description"],
-            "options": canonical["options"],
-        }
-
-    async def _safe_sync_slash_commands(self) -> Dict[str, int]:
-        """Diff existing global commands and only mutate the commands that changed."""
-        if not self._client:
-            return {
-                "total": 0,
-                "unchanged": 0,
-                "updated": 0,
-                "recreated": 0,
-                "created": 0,
-                "deleted": 0,
-            }
-
-        tree = self._client.tree
-        app_id = getattr(self._client, "application_id", None) or getattr(getattr(self._client, "user", None), "id", None)
-        if not app_id:
-            raise RuntimeError("Discord application ID is unavailable for slash command sync")
-
-        desired_payloads = [command.to_dict(tree) for command in tree.get_commands()]
-        desired_by_key = {
-            (int(payload.get("type", 1) or 1), str(payload.get("name", "") or "").lower()): payload
-            for payload in desired_payloads
-        }
-        existing_commands = await tree.fetch_commands()
-        existing_by_key = {
-            (
-                int(getattr(getattr(command, "type", None), "value", getattr(command, "type", 1)) or 1),
-                str(command.name or "").lower(),
-            ): command
-            for command in existing_commands
-        }
-
-        unchanged = 0
-        updated = 0
-        recreated = 0
-        created = 0
-        deleted = 0
-        http = self._client.http
-
-        for key, desired in desired_by_key.items():
-            current = existing_by_key.pop(key, None)
-            if current is None:
-                await http.upsert_global_command(app_id, desired)
-                created += 1
-                continue
-
-            current_existing_payload = self._existing_command_to_payload(current)
-            current_payload = self._canonicalize_app_command_payload(current_existing_payload)
-            desired_payload = self._canonicalize_app_command_payload(desired)
-            if current_payload == desired_payload:
-                unchanged += 1
-                continue
-
-            if self._patchable_app_command_payload(current_existing_payload) == self._patchable_app_command_payload(desired):
-                await http.delete_global_command(app_id, current.id)
-                await http.upsert_global_command(app_id, desired)
-                recreated += 1
-                continue
-
-            await http.edit_global_command(app_id, current.id, desired)
-            updated += 1
-
-        for current in existing_by_key.values():
-            await http.delete_global_command(app_id, current.id)
-            deleted += 1
-
-        return {
-            "total": len(desired_payloads),
-            "unchanged": unchanged,
-            "updated": updated,
-            "recreated": recreated,
-            "created": created,
-            "deleted": deleted,
-        }
-
    async def _add_reaction(self, message: Any, emoji: str) -> bool:
        """Add an emoji reaction to a Discord message."""
        if not message or not hasattr(message, "add_reaction"):
@@ -3256,7 +3057,6 @@ class DiscordAdapter(BasePlatformAdapter):
            if auto_thread and not skip_thread and not is_voice_linked_channel and not is_reply_message:
                thread = await self._auto_create_thread(message)
                if thread:
-                    parent_channel_id = str(message.channel.id)
                    is_thread = True
                    thread_id = str(thread.id)
                    auto_threaded_channel = thread
@@ -3316,9 +3116,6 @@ class DiscordAdapter(BasePlatformAdapter):
            thread_id=thread_id,
            chat_topic=chat_topic,
            is_bot=getattr(message.author, "bot", False),
-            guild_id=str(message.guild.id) if message.guild else None,
-            parent_chat_id=parent_channel_id,
-            message_id=str(message.id),
        )

        # Build media URLs -- download image attachments to local cache so the
@@ -1700,7 +1700,6 @@ class FeishuAdapter(BasePlatformAdapter):
        if not self._client:
            return SendResult(success=False, error="Not connected")

-        content = self.format_message(content)
        try:
            msg_type, payload = self._build_outbound_payload(content)
            body = self._build_update_message_body(msg_type=msg_type, content=payload)
@@ -535,9 +535,6 @@ class QQAdapter(BasePlatformAdapter):
                    quick_disconnect_count = 0
                else:
                    backoff_idx += 1
-                    if backoff_idx >= MAX_RECONNECT_ATTEMPTS:
-                        logger.error("[%s] Max reconnect attempts reached (QQCloseError)", self._log_tag)
-                        return

            except Exception as exc:
                if not self._running:
@@ -508,11 +508,6 @@ class WeComAdapter(BasePlatformAdapter):
        self._remember_chat_req_id(chat_id, self._payload_req_id(payload))

        text, reply_text = self._extract_text(body)
-        # Strip leading @mention in group chats so slash commands like
-        # "@BotName /approve" are correctly recognized as "/approve".
-        # Mirrors what the Telegram adapter does (re.sub @botname).
-        if is_group and text:
-            text = re.sub(r"^@\S+\s*", "", text).strip()
        media_urls, media_types = await self._extract_media(body)
        message_type = self._derive_message_type(body, text, media_types)
        has_reply_context = bool(reply_text and (text or media_urls))
@@ -1551,23 +1551,27 @@ class GatewayRunner:
            )
            return True

-        # Normal busy case (agent actively running a task)
+        # --- Normal busy case (agent actively running a task) ---
+        # The user sent a message while the agent is working.  Interrupt the
+        # agent immediately so it stops the current tool-calling loop and
+        # processes the new message.  The pending message is stored in the
+        # adapter so the base adapter picks it up once the interrupted run
+        # returns.  A brief ack tells the user what's happening (debounced
+        # to avoid spam when they fire multiple messages quickly).
+
        adapter = self.adapters.get(event.source.platform)
        if not adapter:
            return False  # let default path handle it

        # Store the message so it's processed as the next turn after the
-        # current run finishes (or is interrupted).
+        # interrupt causes the current run to exit.
        from gateway.platforms.base import merge_pending_message_event
        merge_pending_message_event(adapter._pending_messages, session_key, event)

-        is_queue_mode = self._busy_input_mode == "queue"
-
-        # If not in queue mode, interrupt the running agent immediately.
-        # This aborts in-flight tool calls and causes the agent loop to exit
-        # at the next check point.
+        # Interrupt the running agent — this aborts in-flight tool calls and
+        # causes the agent loop to exit at the next check point.
        running_agent = self._running_agents.get(session_key)
-        if not is_queue_mode and running_agent and running_agent is not _AGENT_PENDING_SENTINEL:
+        if running_agent and running_agent is not _AGENT_PENDING_SENTINEL:
            try:
                running_agent.interrupt(event.text)
            except Exception:
@@ -1579,7 +1583,7 @@ class GatewayRunner:
        now = time.time()
        last_ack = self._busy_ack_ts.get(session_key, 0)
        if now - last_ack < _BUSY_ACK_COOLDOWN:
-            return True  # interrupt sent (if not queue), ack already delivered recently
+            return True  # interrupt sent, ack already delivered recently

        self._busy_ack_ts[session_key] = now

@@ -1604,16 +1608,10 @@ class GatewayRunner:
                pass

        status_detail = f" ({', '.join(status_parts)})" if status_parts else ""
-        if is_queue_mode:
-            message = (
-                f"⏳ Queued for the next turn{status_detail}. "
-                f"I'll respond once the current task finishes."
-            )
-        else:
-            message = (
-                f"⚡ Interrupting current task{status_detail}. "
-                f"I'll respond to your message shortly."
-            )
+        message = (
+            f"⚡ Interrupting current task{status_detail}. "
+            f"I'll respond to your message shortly."
+        )

        thread_meta = {"thread_id": event.source.thread_id} if event.source.thread_id else None
        try:
@@ -2562,40 +2560,6 @@ class GatewayRunner:
            return

        async def _stop_impl() -> None:
-            def _kill_tool_subprocesses(phase: str) -> None:
-                """Kill tool subprocesses + tear down terminal envs + browsers.
-
-                Called twice in the shutdown path: once eagerly after a
-                drain timeout forces agent interrupt (so we reclaim bash/
-                sleep children before systemd TimeoutStopSec escalates to
-                SIGKILL on the cgroup — #8202), and once as a final
-                catch-all at the end of _stop_impl() for the graceful
-                path or anything respawned mid-teardown.
-
-                All steps are best-effort; exceptions are swallowed so
-                one subsystem's failure doesn't block the rest.
-                """
-                try:
-                    from tools.process_registry import process_registry
-                    _killed = process_registry.kill_all()
-                    if _killed:
-                        logger.info(
-                            "Shutdown (%s): killed %d tool subprocess(es)",
-                            phase, _killed,
-                        )
-                except Exception as _e:
-                    logger.debug("process_registry.kill_all (%s) error: %s", phase, _e)
-                try:
-                    from tools.terminal_tool import cleanup_all_environments
-                    cleanup_all_environments()
-                except Exception as _e:
-                    logger.debug("cleanup_all_environments (%s) error: %s", phase, _e)
-                try:
-                    from tools.browser_tool import cleanup_all_browsers
-                    cleanup_all_browsers()
-                except Exception as _e:
-                    logger.debug("cleanup_all_browsers (%s) error: %s", phase, _e)
-
            logger.info(
                "Stopping gateway%s...",
                " for restart" if self._restart_requested else "",
@@ -2657,16 +2621,6 @@ class GatewayRunner:
                    self._update_runtime_status("draining")
                    await asyncio.sleep(0.1)

-                # Kill lingering tool subprocesses NOW, before we spend more
-                # budget on adapter disconnect / session DB close.  Under
-                # systemd (TimeoutStopSec bounded by drain_timeout+headroom),
-                # deferring this to the end of stop() risks systemd escalating
-                # to SIGKILL on the cgroup first — at which point bash/sleep
-                # children left behind by an interrupted terminal tool get
-                # killed by systemd instead of us (issue #8202).  The final
-                # catch-all cleanup below still runs for the graceful path.
-                _kill_tool_subprocesses("post-interrupt")
-
            if self._restart_requested and self._restart_detached:
                try:
                    await self._launch_detached_restart_command()
@@ -2702,13 +2656,22 @@ class GatewayRunner:
            self._shutdown_event.set()

            # Global cleanup: kill any remaining tool subprocesses not tied
-            # to a specific agent (catch-all for zombie prevention). On the
-            # drain-timeout path we already did this earlier after agent
-            # interrupt — this second call catches (a) the graceful path
-            # where drain succeeded without interrupt, and (b) anything
-            # that got respawned between the earlier call and adapter
-            # disconnect (defense in depth; safe to call repeatedly).
-            _kill_tool_subprocesses("final-cleanup")
+            # to a specific agent (catch-all for zombie prevention).
+            try:
+                from tools.process_registry import process_registry
+                process_registry.kill_all()
+            except Exception:
+                pass
+            try:
+                from tools.terminal_tool import cleanup_all_environments
+                cleanup_all_environments()
+            except Exception:
+                pass
+            try:
+                from tools.browser_tool import cleanup_all_browsers
+                cleanup_all_browsers()
+            except Exception:
+                pass

            # Close SQLite session DBs so the WAL write lock is released.
            # Without this, --replace and similar restart flows leave the
@@ -5521,7 +5484,6 @@ class GatewayRunner:
                try:
                    providers = list_authenticated_providers(
                        current_provider=current_provider,
-                        current_base_url=current_base_url,
                        user_providers=user_provs,
                        custom_providers=custom_provs,
                        max_models=50,
@@ -5633,7 +5595,6 @@ class GatewayRunner:
            try:
                providers = list_authenticated_providers(
                    current_provider=current_provider,
-                    current_base_url=current_base_url,
                    user_providers=user_provs,
                    custom_providers=custom_provs,
                    max_models=5,
@@ -8702,12 +8663,7 @@ class GatewayRunner:
        override = self._session_model_overrides.get(session_key)
        return override is not None and override.get("model") == agent_model

-    def _release_running_agent_state(
-        self,
-        session_key: str,
-        *,
-        run_generation: Optional[int] = None,
-    ) -> bool:
+    def _release_running_agent_state(self, session_key: str) -> None:
        """Pop ALL per-running-agent state entries for ``session_key``.

        Replaces ad-hoc ``del self._running_agents[key]`` calls scattered
@@ -8723,25 +8679,13 @@ class GatewayRunner:
        across turns (``_session_model_overrides``, ``_voice_mode``,
        ``_pending_approvals``, ``_update_prompt_pending``) is NOT
        touched here — those have their own lifecycles.
-
-        When ``run_generation`` is provided, only clear the slot if that
-        generation is still current for the session.  This prevents an
-        older async run whose generation was bumped by /stop or /new from
-        clobbering a newer run's state during its own unwind.  Returns
-        True when the slot was cleared, False when an ownership guard
-        blocked it.
        """
        if not session_key:
-            return False
-        if run_generation is not None and not self._is_session_run_current(
-            session_key, run_generation
-        ):
-            return False
+            return
        self._running_agents.pop(session_key, None)
        self._running_agents_ts.pop(session_key, None)
        if hasattr(self, "_busy_ack_ts"):
            self._busy_ack_ts.pop(session_key, None)
-        return True

    def _clear_session_boundary_security_state(self, session_key: str) -> None:
        """Clear approval state that must not survive a real conversation switch."""
@@ -10303,24 +10247,10 @@ class GatewayRunner:
            # Wait for agent to be created
            while agent_holder[0] is None:
                await asyncio.sleep(0.05)
-            if not session_key:
-                return
-            # Only promote the sentinel to the real agent if this run is still
-            # current.  If /stop or /new bumped the generation while we were
-            # spinning up, leave the newer run's slot alone — we'll be
-            # discarded by the stale-result check in _handle_message_with_agent.
-            if run_generation is not None and not self._is_session_run_current(
-                session_key, run_generation
-            ):
-                logger.info(
-                    "Skipping stale agent promotion for %s — generation %s is no longer current",
-                    (session_key or "")[:20],
-                    run_generation,
-                )
-                return
-            self._running_agents[session_key] = agent_holder[0]
-            if self._draining:
-                self._update_runtime_status("draining")
+            if session_key:
+                self._running_agents[session_key] = agent_holder[0]
+                if self._draining:
+                    self._update_runtime_status("draining")
        
        tracking_task = asyncio.create_task(track_agent())
        
@@ -10375,9 +10305,9 @@ class GatewayRunner:
        # Periodic "still working" notifications for long-running tasks.
        # Fires every N seconds so the user knows the agent hasn't died.
        # Config: agent.gateway_notify_interval in config.yaml, or
-        # HERMES_AGENT_NOTIFY_INTERVAL env var.  Default 180s (3 min).
+        # HERMES_AGENT_NOTIFY_INTERVAL env var.  Default 600s (10 min).
        # 0 = disable notifications.
-        _NOTIFY_INTERVAL_RAW = float(os.getenv("HERMES_AGENT_NOTIFY_INTERVAL", 180))
+        _NOTIFY_INTERVAL_RAW = float(os.getenv("HERMES_AGENT_NOTIFY_INTERVAL", 600))
        _NOTIFY_INTERVAL = _NOTIFY_INTERVAL_RAW if _NOTIFY_INTERVAL_RAW > 0 else None
        _notify_start = time.time()

@@ -10826,14 +10756,7 @@ class GatewayRunner:
            # Clean up tracking
            tracking_task.cancel()
            if session_key:
-                # Only release the slot if this run's generation still owns
-                # it.  A /stop or /new that bumped the generation while we
-                # were unwinding has already installed its own state; this
-                # guard prevents an old run from clobbering it on the way
-                # out.
-                self._release_running_agent_state(
-                    session_key, run_generation=run_generation
-                )
+                self._release_running_agent_state(session_key)
            if self._draining:
                self._update_runtime_status("draining")
            
@@ -10956,7 +10879,6 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
    from gateway.status import (
        acquire_gateway_runtime_lock,
        get_running_pid,
-        get_process_start_time,
        release_gateway_runtime_lock,
        remove_pid_file,
        terminate_pid,
@@ -10964,7 +10886,6 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
    existing_pid = get_running_pid()
    if existing_pid is not None and existing_pid != os.getpid():
        if replace:
-            existing_start_time = get_process_start_time(existing_pid)
            logger.info(
                "Replacing existing gateway instance (PID %d) with --replace.",
                existing_pid,
@@ -11033,10 +10954,7 @@ async def start_gateway(config: Optional[GatewayConfig] = None, replace: bool =
            # leaving stale lock files that block the new gateway from starting.
            try:
                from gateway.status import release_all_scoped_locks
-                _released = release_all_scoped_locks(
-                    owner_pid=existing_pid,
-                    owner_start_time=existing_start_time,
-                )
+                _released = release_all_scoped_locks()
                if _released:
                    logger.info("Released %d stale scoped lock(s) from old gateway.", _released)
            except Exception:
@@ -83,9 +83,6 @@ class SessionSource:
    user_id_alt: Optional[str] = None  # Platform-specific stable alt ID (Signal UUID, Feishu union_id)
    chat_id_alt: Optional[str] = None  # Signal group internal ID
    is_bot: bool = False  # True when the message author is a bot/webhook (Discord)
-    guild_id: Optional[str] = None  # Discord guild / Slack workspace / Matrix server scope
-    parent_chat_id: Optional[str] = None  # Parent channel when chat_id refers to a thread
-    message_id: Optional[str] = None  # ID of the triggering message (for pin/reply/react)
    
    @property
    def description(self) -> str:
@@ -123,14 +120,8 @@ class SessionSource:
            d["user_id_alt"] = self.user_id_alt
        if self.chat_id_alt:
            d["chat_id_alt"] = self.chat_id_alt
-        if self.guild_id:
-            d["guild_id"] = self.guild_id
-        if self.parent_chat_id:
-            d["parent_chat_id"] = self.parent_chat_id
-        if self.message_id:
-            d["message_id"] = self.message_id
        return d
-
+    
    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SessionSource":
        return cls(
@@ -144,9 +135,6 @@ class SessionSource:
            chat_topic=data.get("chat_topic"),
            user_id_alt=data.get("user_id_alt"),
            chat_id_alt=data.get("chat_id_alt"),
-            guild_id=data.get("guild_id"),
-            parent_chat_id=data.get("parent_chat_id"),
-            message_id=data.get("message_id"),
        )
    

@@ -285,34 +273,14 @@ def build_session_context_prompt(
            "that you can only read messages sent directly to you and respond."
        )
    elif context.source.platform == Platform.DISCORD:
-        # The discord tool self-gates on DISCORD_BOT_TOKEN at registry
-        # check time.  Match that condition so the prompt stays honest:
-        # with a token the agent has fetch_messages/search_members/
-        # create_thread (and optionally discord_admin) and should know
-        # the IDs it can call them with; without one it really is
-        # limited to reading/replying via the gateway.
-        if (os.environ.get("DISCORD_BOT_TOKEN") or "").strip():
-            src = context.source
-            id_lines = ["", "**Discord IDs (for the `discord` / `discord_admin` tools):**"]
-            if src.guild_id:
-                id_lines.append(f"  - Guild: `{src.guild_id}`")
-            if src.thread_id and src.parent_chat_id:
-                id_lines.append(f"  - Parent channel: `{src.parent_chat_id}`")
-                id_lines.append(f"  - Thread: `{src.thread_id}` (use as `channel_id` for fetch_messages etc.)")
-            else:
-                id_lines.append(f"  - Channel: `{src.chat_id}`")
-            if src.message_id:
-                id_lines.append(f"  - Triggering message: `{src.message_id}`")
-            lines.extend(id_lines)
-        else:
-            lines.append("")
-            lines.append(
-                "**Platform notes:** You are running inside Discord. "
-                "You do NOT have access to Discord-specific APIs — you cannot search "
-                "channel history, pin messages, manage roles, or list server members. "
-                "Do not promise to perform these actions. If the user asks, explain "
-                "that you can only read messages sent directly to you and respond."
-            )
+        lines.append("")
+        lines.append(
+            "**Platform notes:** You are running inside Discord. "
+            "You do NOT have access to Discord-specific APIs — you cannot search "
+            "channel history, pin messages, manage roles, or list server members. "
+            "Do not promise to perform these actions. If the user asks, explain "
+            "that you can only read messages sent directly to you and respond."
+        )

    # Connected platforms
    platforms_list = ["local (files on this machine)"]
@@ -113,11 +113,6 @@ def _get_process_start_time(pid: int) -> Optional[int]:
        return None


-def get_process_start_time(pid: int) -> Optional[int]:
-    """Public wrapper for retrieving a process start time when available."""
-    return _get_process_start_time(pid)
-
-
 def _read_process_cmdline(pid: int) -> Optional[str]:
    """Return the process command line as a space-separated string."""
    cmdline_path = Path(f"/proc/{pid}/cmdline")
@@ -501,8 +496,7 @@ def acquire_scoped_lock(scope: str, identity: str, metadata: Optional[dict[str,
        if not stale:
            try:
                os.kill(existing_pid, 0)
-            except (ProcessLookupError, PermissionError, OSError):
-                # Windows raises OSError with WinError 87 for invalid pid check
+            except (ProcessLookupError, PermissionError):
                stale = True
            else:
                current_start = _get_process_start_time(existing_pid)
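Review note: the `except` clause change above narrows what counts as a stale lock owner. `ProcessLookupError` and `PermissionError` are both subclasses of `OSError`, so the "-" side caught every `OSError` while the "+" side lets any other `OSError` propagate. The sketch below reproduces the "+" side on a POSIX system; the function name `pid_looks_stale` is hypothetical.

```python
import os


def pid_looks_stale(pid: int) -> bool:
    """Probe a PID with signal 0, following the "+" side of the hunk.

    Assumes POSIX semantics for os.kill(pid, 0): nothing is delivered,
    only existence and permission checks are performed.
    """
    try:
        os.kill(pid, 0)
    except (ProcessLookupError, PermissionError):
        return True
    return False


# Both exception types caught above derive from OSError.
assert issubclass(ProcessLookupError, OSError)
assert issubclass(PermissionError, OSError)
```

The current process is always reported as not stale; a child that has exited and been reaped is normally reported as stale, barring PID reuse.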
@@ -567,43 +561,17 @@ def release_scoped_lock(scope: str, identity: str) -> None:
        pass


-def release_all_scoped_locks(
-    *,
-    owner_pid: Optional[int] = None,
-    owner_start_time: Optional[int] = None,
-) -> int:
-    """Remove scoped lock files in the lock directory.
+def release_all_scoped_locks() -> int:
+    """Remove all scoped lock files in the lock directory.

    Called during --replace to clean up stale locks left by stopped/killed
-    gateway processes that did not release their locks gracefully. When an
-    ``owner_pid`` is provided, only lock records belonging to that gateway
-    process are removed. ``owner_start_time`` further narrows the match to
-    protect against PID reuse.
-
-    When no owner is provided, preserves the legacy behavior and removes every
-    scoped lock file in the directory.
-
+    gateway processes that did not release their locks gracefully.
    Returns the number of lock files removed.
    """
    lock_dir = _get_lock_dir()
    removed = 0
    if lock_dir.exists():
        for lock_file in lock_dir.glob("*.lock"):
-            if owner_pid is not None:
-                record = _read_json_file(lock_file)
-                if not isinstance(record, dict):
-                    continue
-                try:
-                    record_pid = int(record.get("pid"))
-                except (TypeError, ValueError):
-                    continue
-                if record_pid != owner_pid:
-                    continue
-                if (
-                    owner_start_time is not None
-                    and record.get("start_time") != owner_start_time
-                ):
-                    continue
            try:
                lock_file.unlink(missing_ok=True)
                removed += 1
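Review note: with the owner filter removed, the loop deletes every `*.lock` file in the directory. A self-contained sketch of that behaviour follows, assuming the lock directory is passed in explicitly and that unlink failures are swallowed; the real `except` clause is outside this hunk, and `remove_all_lock_files` is a hypothetical name.

```python
from pathlib import Path


def remove_all_lock_files(lock_dir: Path) -> int:
    """Delete every *.lock file in lock_dir and return how many were removed."""
    removed = 0
    if lock_dir.exists():
        for lock_file in lock_dir.glob("*.lock"):
            try:
                lock_file.unlink(missing_ok=True)
                removed += 1
            except OSError:
                # Assumption: a file that cannot be removed is skipped, not fatal.
                pass
    return removed
```

Because no PID or start time is compared, locks held by any other live process that shares the directory are removed too, which is the behaviour the "-" side guarded against.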
@@ -775,10 +743,6 @@ def get_running_pid(
            if _record_looks_like_gateway(record):
                return pid
            continue
-        except OSError:
-            # Windows raises OSError with WinError 87 for an invalid pid
-            # (process is definitely gone). Treat as "process doesn't exist".
-            continue

        recorded_start = record.get("start_time")
        current_start = _get_process_start_time(pid)
@@ -11,5 +11,5 @@ Provides subcommands for:
 - hermes cron          - Manage cron jobs
 """

-__version__ = "0.11.0"
-__release_date__ = "2026.4.23"
+__version__ = "0.10.0"
+__release_date__ = "2026.4.16"
@@ -619,25 +619,7 @@ def _oauth_trace(event: str, *, sequence_id: Optional[str] = None, **fields: Any
 # =============================================================================

 def _auth_file_path() -> Path:
-    path = get_hermes_home() / "auth.json"
-    # Seat belt: if pytest is running and HERMES_HOME resolves to the real
-    # user's auth store, refuse rather than silently corrupt it. This catches
-    # tests that forgot to monkeypatch HERMES_HOME, tests invoked without the
-    # hermetic conftest, or sandbox escapes via threads/subprocesses. In
-    # production (no PYTEST_CURRENT_TEST) this is a single dict lookup.
-    if os.environ.get("PYTEST_CURRENT_TEST"):
-        real_home_auth = (Path.home() / ".hermes" / "auth.json").resolve(strict=False)
-        try:
-            resolved = path.resolve(strict=False)
-        except Exception:
-            resolved = path
-        if resolved == real_home_auth:
-            raise RuntimeError(
-                f"Refusing to touch real user auth store during test run: {path}. "
-                "Set HERMES_HOME to a tmp_path in your test fixture, or run "
-                "via scripts/run_tests.sh for hermetic CI-parity env."
-            )
-    return path
+    return get_hermes_home() / "auth.json"


 def _auth_lock_path() -> Path:
@@ -238,52 +238,6 @@ def get_git_banner_state(repo_dir: Optional[Path] = None) -> Optional[dict]:
    return {"upstream": upstream, "local": local, "ahead": max(ahead, 0)}


-_RELEASE_URL_BASE = "https://github.com/NousResearch/hermes-agent/releases/tag"
-_latest_release_cache: Optional[tuple] = None  # (tag, url) once resolved
-
-
-def get_latest_release_tag(repo_dir: Optional[Path] = None) -> Optional[tuple]:
-    """Return ``(tag, release_url)`` for the latest git tag, or None.
-
-    Local-only — runs ``git describe --tags --abbrev=0`` against the
-    Hermes checkout. Cached per-process. Release URL always points at the
-    canonical NousResearch/hermes-agent repo (forks don't get a link).
-    """
-    global _latest_release_cache
-    if _latest_release_cache is not None:
-        return _latest_release_cache or None
-
-    repo_dir = repo_dir or _resolve_repo_dir()
-    if repo_dir is None:
-        _latest_release_cache = ()  # falsy sentinel — skip future lookups
-        return None
-
-    try:
-        result = subprocess.run(
-            ["git", "describe", "--tags", "--abbrev=0"],
-            capture_output=True,
-            text=True,
-            timeout=3,
-            cwd=str(repo_dir),
-        )
-    except Exception:
-        _latest_release_cache = ()
-        return None
-
-    if result.returncode != 0:
-        _latest_release_cache = ()
-        return None
-
-    tag = (result.stdout or "").strip()
-    if not tag:
-        _latest_release_cache = ()
-        return None
-
-    url = f"{_RELEASE_URL_BASE}/{tag}"
-    _latest_release_cache = (tag, url)
-    return _latest_release_cache
-
-
 def format_banner_version_label() -> str:
    """Return the version label shown in the startup banner title."""
    base = f"Hermes Agent v{VERSION} ({RELEASE_DATE})"
@@ -565,16 +519,9 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
    agent_name = _skin_branding("agent_name", "Hermes Agent")
    title_color = _skin_color("banner_title", "#FFD700")
    border_color = _skin_color("banner_border", "#CD7F32")
-    version_label = format_banner_version_label()
-    release_info = get_latest_release_tag()
-    if release_info:
-        _tag, _url = release_info
-        title_markup = f"[bold {title_color}][link={_url}]{version_label}[/link][/]"
-    else:
-        title_markup = f"[bold {title_color}]{version_label}[/]"
    outer_panel = Panel(
        layout_table,
-        title=title_markup,
+        title=f"[bold {title_color}]{format_banner_version_label()}[/]",
        border_style=border_color,
        padding=(0, 2),
    )
@@ -12,7 +12,6 @@ import os
 logger = logging.getLogger(__name__)

 DEFAULT_CODEX_MODELS: List[str] = [
-    "gpt-5.5",
    "gpt-5.4-mini",
    "gpt-5.4",
    "gpt-5.3-codex",
@@ -22,7 +21,6 @@ DEFAULT_CODEX_MODELS: List[str] = [
 ]

 _FORWARD_COMPAT_TEMPLATE_MODELS: List[tuple[str, tuple[str, ...]]] = [
-    ("gpt-5.5", ("gpt-5.4", "gpt-5.4-mini", "gpt-5.3-codex")),
    ("gpt-5.4-mini", ("gpt-5.3-codex", "gpt-5.2-codex")),
    ("gpt-5.4", ("gpt-5.3-codex", "gpt-5.2-codex")),
    ("gpt-5.3-codex", ("gpt-5.2-codex",)),
@@ -361,15 +361,6 @@ DEFAULT_CONFIG = {
        # to finish, then interrupts any remaining runs after the timeout.
        # 0 = no drain, interrupt immediately.
        "restart_drain_timeout": 60,
-        # Max app-level retry attempts for API errors (connection drops,
-        # provider timeouts, 5xx, etc.) before the agent surfaces the
-        # failure.  The OpenAI SDK already does its own low-level retries
-        # (max_retries=2 default) for transient network errors; this is
-        # the Hermes-level retry loop that wraps the whole call.  Lower
-        # this to 1 if you use fallback providers and want fast failover
-        # on flaky primaries; raise it if you prefer to tolerate longer
-        # provider hiccups on a single provider.
-        "api_max_retries": 3,
        "service_tier": "",
        # Tool-use enforcement: injects system prompt guidance that tells the
        # model to actually call tools instead of describing intended actions.
@@ -384,11 +375,7 @@ DEFAULT_CONFIG = {
        # Periodic "still working" notification interval (seconds).
        # Sends a status message every N seconds so the user knows the
        # agent hasn't died during long tasks.  0 = disable notifications.
-        # Lower values mean faster feedback on slow tasks but more chat
-        # noise; 180s is a compromise that catches spinning weak-model runs
-        # (60+ tool iterations with tiny output) before users assume the
-        # bot is dead and /restart.
-        "gateway_notify_interval": 180,
+        "gateway_notify_interval": 600,
    },
    
    "terminal": {
@@ -407,23 +394,17 @@ DEFAULT_CONFIG = {
        # (bash doesn't source bashrc in non-interactive login mode) or
        # zsh-specific files like ``~/.zshrc`` / ``~/.zprofile``.
        # Paths support ``~`` / ``${VAR}``. Missing files are silently
-        # skipped. When empty, Hermes auto-sources ``~/.profile``,
-        # ``~/.bash_profile``, and ``~/.bashrc`` (in that order) if the
+        # skipped. When empty, Hermes auto-appends ``~/.bashrc`` if the
        # snapshot shell is bash (this is the ``auto_source_bashrc``
        # behaviour — disable with that key if you want strict login-only
        # semantics).
        "shell_init_files": [],
-        # When true (default), Hermes sources the user's shell rc files
-        # (``~/.profile``, ``~/.bash_profile``, ``~/.bashrc``) in the
-        # login shell used to build the environment snapshot. This
-        # captures PATH additions, shell functions, and aliases — which a
-        # plain ``bash -l -c`` would otherwise miss because bash skips
-        # bashrc in non-interactive login mode, and because a default
-        # Debian/Ubuntu ``~/.bashrc`` short-circuits on non-interactive
-        # sources. ``~/.profile`` and ``~/.bash_profile`` are tried first
-        # because ``n`` / ``nvm`` / ``asdf`` installers typically write
-        # their PATH exports there without an interactivity guard. Turn
-        # this off if your rc files misbehave when sourced
+        # When true (default), Hermes sources ``~/.bashrc`` in the login
+        # shell used to build the environment snapshot.  This captures
+        # PATH additions, shell functions, and aliases defined in the
+        # user's bashrc — which a plain ``bash -l -c`` would otherwise
+        # miss because bash skips bashrc in non-interactive login mode.
+        # Turn this off if you have a bashrc that misbehaves when sourced
        # non-interactively (e.g. one that hard-exits on TTY checks).
        "auto_source_bashrc": True,
        "docker_image": "nikolaik/python-nodejs:python3.11-nodejs20",
@@ -466,12 +447,6 @@ DEFAULT_CONFIG = {
        "record_sessions": False,  # Auto-record browser sessions as WebM videos
        "allow_private_urls": False,  # Allow navigating to private/internal IPs (localhost, 192.168.x.x, etc.)
        "cdp_url": "",  # Optional persistent CDP endpoint for attaching to an existing Chromium/Chrome
-        # CDP supervisor — dialog + frame detection via a persistent WebSocket.
-        # Active only when a CDP-capable backend is attached (Browserbase or
-        # local Chrome via /browser connect). See
-        # website/docs/developer-guide/browser-supervisor.md.
-        "dialog_policy": "must_respond",  # must_respond | auto_dismiss | auto_accept
-        "dialog_timeout_s": 300,  # Safety auto-dismiss after N seconds under must_respond
        "camofox": {
            # When true, Hermes sends a stable profile-scoped userId to Camofox
            # so the server maps it to a persistent Firefox profile automatically.
@@ -492,27 +467,7 @@ DEFAULT_CONFIG = {
    # exceed this are rejected with guidance to use offset+limit.
    # 100K chars ≈ 25–35K tokens across typical tokenisers.
    "file_read_max_chars": 100_000,
-
-    # Tool-output truncation thresholds. When terminal output or a
-    # single read_file page exceeds these limits, Hermes truncates the
-    # payload sent to the model (keeping head + tail for terminal,
-    # enforcing pagination for read_file). Tuning these trades context
-    # footprint against how much raw output the model can see in one
-    # shot. Ported from anomalyco/opencode PR #23770.
-    #
-    # - max_bytes:       terminal_tool output cap, in chars
-    #                    (default 50_000 ≈ 12-15K tokens).
-    # - max_lines:       read_file pagination cap — the maximum `limit`
-    #                    a single read_file call can request before
-    #                    being clamped (default 2000).
-    # - max_line_length: per-line cap applied when read_file emits a
-    #                    line-numbered view (default 2000 chars).
-    "tool_output": {
-        "max_bytes": 50_000,
-        "max_lines": 2000,
-        "max_line_length": 2000,
-    },
-
+    
    "compression": {
        "enabled": True,
        "threshold": 0.50,            # compress when context usage exceeds this ratio
@@ -765,10 +720,6 @@ DEFAULT_CONFIG = {
        "inherit_mcp_toolsets": True,
        "max_iterations": 50,  # per-subagent iteration cap (each subagent gets its own budget,
                               # independent of the parent's max_iterations)
-        "child_timeout_seconds": 600,  # wall-clock timeout for each child agent (floor 30s,
-                                       # no ceiling). High-reasoning models on large tasks
-                                       # (e.g. gpt-5.5 xhigh, opus-4.6) need generous budgets;
-                                       # raise if children time out before producing output.
        "reasoning_effort": "",  # reasoning effort for subagents: "xhigh", "high", "medium",
                                 # "low", "minimal", "none" (empty = inherit parent's level)
        "max_concurrent_children": 3,  # max parallel children per batch; floor of 1 enforced, no ceiling
@@ -803,17 +754,6 @@ DEFAULT_CONFIG = {
        "inline_shell": False,
        # Timeout (seconds) for each !`cmd` snippet when inline_shell is on.
        "inline_shell_timeout": 10,
-        # Run the keyword/pattern security scanner on skills the agent
-        # writes via skill_manage (create/edit/patch).  Off by default
-        # because the agent can already execute the same code paths via
-        # terminal() with no gate, so the scan adds friction (blocks
-        # skills that mention risky keywords in prose) without meaningful
-        # security.  Turn on if you want the belt-and-suspenders — a
-        # dangerous verdict will then surface as a tool error to the
-        # agent, which can retry with the flagged content removed.
-        # External hub installs (trusted/community sources) are always
-        # scanned regardless of this setting.
-        "guard_agent_created": False,
    },

    # Honcho AI-native memory -- reads ~/.honcho/config.json as single source of truth.
@@ -833,7 +773,7 @@ DEFAULT_CONFIG = {
        "auto_thread": True,           # Auto-create threads on @mention in channels (like Slack)
        "reactions": True,             # Add 👀/✅/❌ reactions to messages during processing
        "channel_prompts": {},         # Per-channel ephemeral system prompts (forum parents apply to child threads)
-        # discord / discord_admin tools: restrict which actions the agent may call.
+        # discord_server tool: restrict which actions the agent may call.
        # Default (empty) = all actions allowed (subject to bot privileged intents).
        # Accepts comma-separated string ("list_guilds,list_channels,fetch_messages")
        # or YAML list. Unknown names are dropped with a warning at load time.
@@ -1334,7 +1274,7 @@ OPTIONAL_ENV_VARS = {
        "advanced": True,
    },
    "XIAOMI_API_KEY": {
-        "description": "Xiaomi MiMo API key for MiMo models (mimo-v2.5-pro, mimo-v2.5, mimo-v2-pro, mimo-v2-omni, mimo-v2-flash)",
+        "description": "Xiaomi MiMo API key for MiMo models (mimo-v2-pro, mimo-v2-omni, mimo-v2-flash)",
        "prompt": "Xiaomi MiMo API Key",
        "url": "https://platform.xiaomimimo.com",
        "password": True,
@@ -2115,14 +2055,6 @@ def _normalize_custom_provider_entry(
    models = entry.get("models")
    if isinstance(models, dict) and models:
        normalized["models"] = models
-    elif isinstance(models, list) and models:
-        # Hand-edited configs (and older Hermes versions) write ``models`` as
-        # a plain list of model ids. Preserve them by converting to the dict
-        # shape downstream code expects; otherwise normalize silently drops
-        # the list and /model shows the provider with (0) models.
-        normalized["models"] = {
-            str(m): {} for m in models if isinstance(m, str) and m.strip()
-        }

    context_length = entry.get("context_length")
    if isinstance(context_length, int) and context_length > 0:
@@ -175,60 +175,6 @@ def _request_gateway_self_restart(pid: int) -> bool:
    return True


-def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:
-    """Send SIGUSR1 to a gateway PID and wait for it to exit gracefully.
-
-    SIGUSR1 is wired in gateway/run.py to ``request_restart(via_service=True)``
-    which drains in-flight agent runs (up to ``agent.restart_drain_timeout``
-    seconds), then exits with code 75.  Both systemd (``Restart=on-failure``
-    + ``RestartForceExitStatus=75``) and launchd (``KeepAlive.SuccessfulExit
-    = false``) relaunch the process after the graceful exit.
-
-    This is the drain-aware alternative to ``systemctl restart`` / ``SIGTERM``,
-    which SIGKILL in-flight agents after a short timeout.
-
-    Args:
-        pid: Gateway process PID (systemd MainPID, launchd PID, or bare
-            process PID).
-        drain_timeout: Seconds to wait for the process to exit after sending
-            SIGUSR1.  Should be slightly larger than the gateway's
-            ``agent.restart_drain_timeout`` to allow the drain loop to
-            finish cleanly.
-
-    Returns:
-        True if the PID was signalled and exited within the timeout.
-        False if SIGUSR1 couldn't be sent or the process didn't exit in
-        time (caller should fall back to a harder restart path).
-    """
-    if not hasattr(signal, "SIGUSR1"):
-        return False
-    if pid <= 0:
-        return False
-    try:
-        os.kill(pid, signal.SIGUSR1)
-    except ProcessLookupError:
-        # Already gone — nothing to drain.
-        return True
-    except (PermissionError, OSError):
-        return False
-
-    import time as _time
-
-    deadline = _time.monotonic() + max(drain_timeout, 1.0)
-    while _time.monotonic() < deadline:
-        try:
-            os.kill(pid, 0)  # signal 0 — probe liveness
-        except ProcessLookupError:
-            return True
-        except PermissionError:
-            # Process still exists but we can't signal it.  Treat as alive
-            # so the caller falls back.
-            pass
-        _time.sleep(0.5)
-    # Drain didn't finish in time.
-    return False
-
-
 def _append_unique_pid(pids: list[int], pid: int | None, exclude_pids: set[int]) -> None:
    if pid is None or pid <= 0:
        return
@@ -815,21 +761,6 @@ def get_systemd_unit_path(system: bool = False) -> Path:
    return Path.home() / ".config" / "systemd" / "user" / f"{name}.service"


-class UserSystemdUnavailableError(RuntimeError):
-    """Raised when ``systemctl --user`` cannot reach the user D-Bus session.
-
-    Typically hit on fresh RHEL/Debian SSH sessions where linger is disabled
-    and no user@.service is running, so ``/run/user/$UID/bus`` never exists.
-    Carries a user-facing remediation message in ``args[0]``.
-    """
-
-
-def _user_dbus_socket_path() -> Path:
-    """Return the expected per-user D-Bus socket path (regardless of existence)."""
-    xdg = os.environ.get("XDG_RUNTIME_DIR") or f"/run/user/{os.getuid()}"
-    return Path(xdg) / "bus"
-
-
 def _ensure_user_systemd_env() -> None:
    """Ensure DBUS_SESSION_BUS_ADDRESS and XDG_RUNTIME_DIR are set for systemctl --user.

@@ -852,126 +783,6 @@ def _ensure_user_systemd_env() -> None:
            os.environ["DBUS_SESSION_BUS_ADDRESS"] = f"unix:path={bus_path}"


-def _wait_for_user_dbus_socket(timeout: float = 3.0) -> bool:
-    """Poll for the user D-Bus socket to appear, up to ``timeout`` seconds.
-
-    Linger-enabled user@.service can take a second or two to spawn the socket
-    after ``loginctl enable-linger`` runs.  Returns True once the socket exists.
-    """
-    import time
-
-    deadline = time.monotonic() + timeout
-    while time.monotonic() < deadline:
-        if _user_dbus_socket_path().exists():
-            _ensure_user_systemd_env()
-            return True
-        time.sleep(0.2)
-    return _user_dbus_socket_path().exists()
-
-
-def _preflight_user_systemd(*, auto_enable_linger: bool = True) -> None:
-    """Ensure ``systemctl --user`` will reach the user D-Bus session bus.
-
-    No-op when the bus socket is already there (the common case on desktops
-    and linger-enabled servers).  On fresh SSH sessions where the socket is
-    missing:
-
-    * If linger is already enabled, wait briefly for user@.service to spawn
-      the socket.
-    * If linger is disabled and ``auto_enable_linger`` is True, try
-      ``loginctl enable-linger $USER`` (works as non-root when polkit permits
-      it, otherwise needs sudo).
-    * If the socket is still missing afterwards, raise
-      :class:`UserSystemdUnavailableError` with a precise remediation message.
-
-    Callers should treat the exception as a terminal condition for user-scope
-    systemd operations and surface the message to the user.
-    """
-    _ensure_user_systemd_env()
-    bus_path = _user_dbus_socket_path()
-    if bus_path.exists():
-        return
-
-    import getpass
-
-    username = getpass.getuser()
-    linger_enabled, linger_detail = get_systemd_linger_status()
-
-    if linger_enabled is True:
-        if _wait_for_user_dbus_socket(timeout=3.0):
-            return
-        # Linger is on but socket still missing — unusual; fall through to error.
-        _raise_user_systemd_unavailable(
-            username,
-            reason="User D-Bus socket is missing even though linger is enabled.",
-            fix_hint=(
-                f"  systemctl start user@{os.getuid()}.service\n"
-                "  (may require sudo; try again after the command succeeds)"
-            ),
-        )
-
-    if auto_enable_linger and shutil.which("loginctl"):
-        try:
-            result = subprocess.run(
-                ["loginctl", "enable-linger", username],
-                capture_output=True,
-                text=True,
-                check=False,
-                timeout=30,
-            )
-        except Exception as exc:
-            _raise_user_systemd_unavailable(
-                username,
-                reason=f"loginctl enable-linger failed ({exc}).",
-                fix_hint=f"  sudo loginctl enable-linger {username}",
-            )
-        else:
-            if result.returncode == 0:
-                if _wait_for_user_dbus_socket(timeout=5.0):
-                    print(f"✓ Enabled linger for {username} — user D-Bus now available")
-                    return
-                # enable-linger succeeded but the socket never appeared.
-                _raise_user_systemd_unavailable(
-                    username,
-                    reason="Linger was enabled, but the user D-Bus socket did not appear.",
-                    fix_hint=(
-                        "  Log out and log back in, then re-run the command.\n"
-                        f"  Or reboot and run: systemctl --user start {get_service_name()}"
-                    ),
-                )
-            detail = (result.stderr or result.stdout or f"exit {result.returncode}").strip()
-            _raise_user_systemd_unavailable(
-                username,
-                reason=f"loginctl enable-linger was denied: {detail}",
-                fix_hint=f"  sudo loginctl enable-linger {username}",
-            )
-
-    _raise_user_systemd_unavailable(
-        username,
-        reason=(
-            "User D-Bus session is not available "
-            f"({linger_detail or 'linger disabled'})."
-        ),
-        fix_hint=f"  sudo loginctl enable-linger {username}",
-    )
-
-
-def _raise_user_systemd_unavailable(username: str, *, reason: str, fix_hint: str) -> None:
-    """Build a user-facing error message and raise UserSystemdUnavailableError."""
-    msg = (
-        f"{reason}\n"
-        "  systemctl --user cannot reach the user D-Bus session in this shell.\n"
-        "\n"
-        "  To fix:\n"
-        f"{fix_hint}\n"
-        "\n"
-        "  Alternative: run the gateway in the foreground (stays up until\n"
-        "  you exit / close the terminal):\n"
-        "    hermes gateway run"
-    )
-    raise UserSystemdUnavailableError(msg)
-
-
 def _systemctl_cmd(system: bool = False) -> list[str]:
    if not system:
        _ensure_user_systemd_env()
@@ -1523,14 +1334,7 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
            path_entries.append(resolved_node_dir)

    common_bin_paths = ["/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"]
-    # systemd's TimeoutStopSec must exceed the gateway's drain_timeout so
-    # there's budget left for post-interrupt cleanup (tool subprocess kill,
-    # adapter disconnect, session DB close) before systemd escalates to
-    # SIGKILL on the cgroup — otherwise bash/sleep tool-call children left
-    # by a force-interrupted agent get reaped by systemd instead of us
-    # (#8202). 30s of headroom covers the worst case we've observed.
-    _drain_timeout = int(_get_restart_drain_timeout() or 0)
-    restart_timeout = max(60, _drain_timeout) + 30
+    restart_timeout = max(60, int(_get_restart_drain_timeout() or 0))

    if system:
        username, group_name, home_dir = _system_service_identity(run_as_user)
@@ -1819,11 +1623,6 @@ def systemd_start(system: bool = False):
    system = _select_systemd_scope(system)
    if system:
        _require_root_for_system_service("start")
-    else:
-        # Fail fast with actionable guidance if the user D-Bus session is not
-        # reachable (common on fresh RHEL/Debian SSH sessions without linger).
-        # Raises UserSystemdUnavailableError with a remediation message.
-        _preflight_user_systemd()
    refresh_systemd_unit_if_needed(system=system)
    _run_systemctl(["start", get_service_name()], system=system, check=True, timeout=30)
    print(f"✓ {_service_scope_label(system).capitalize()} service started")
@@ -1843,8 +1642,6 @@ def systemd_restart(system: bool = False):
    system = _select_systemd_scope(system)
    if system:
        _require_root_for_system_service("restart")
-    else:
-        _preflight_user_systemd()
    refresh_systemd_unit_if_needed(system=system)
    from gateway.status import get_running_pid

@@ -3719,10 +3516,6 @@ def gateway_setup():
                    systemd_start()
                elif is_macos():
                    launchd_start()
-            except UserSystemdUnavailableError as e:
-                print_error("  Failed to start — user systemd not reachable:")
-                for line in str(e).splitlines():
-                    print(f"  {line}")
            except subprocess.CalledProcessError as e:
                print_error(f"  Failed to start: {e}")
    else:
@@ -3787,10 +3580,6 @@ def gateway_setup():
                    else:
                        stop_profile_gateway()
                        print_info("Start manually: hermes gateway")
-                except UserSystemdUnavailableError as e:
-                    print_error("  Restart failed — user systemd not reachable:")
-                    for line in str(e).splitlines():
-                        print(f"  {line}")
                except subprocess.CalledProcessError as e:
                    print_error(f"  Restart failed: {e}")
        elif service_installed:
@@ -3800,10 +3589,6 @@ def gateway_setup():
                        systemd_start()
                    elif is_macos():
                        launchd_start()
-                except UserSystemdUnavailableError as e:
-                    print_error("  Start failed — user systemd not reachable:")
-                    for line in str(e).splitlines():
-                        print(f"  {line}")
                except subprocess.CalledProcessError as e:
                    print_error(f"  Start failed: {e}")
        else:
@@ -3827,10 +3612,6 @@ def gateway_setup():
                                    systemd_start(system=installed_scope == "system")
                                else:
                                    launchd_start()
-                            except UserSystemdUnavailableError as e:
-                                print_error("  Start failed — user systemd not reachable:")
-                                for line in str(e).splitlines():
-                                    print(f"  {line}")
                            except subprocess.CalledProcessError as e:
                                print_error(f"  Start failed: {e}")
                    except subprocess.CalledProcessError as e:
@@ -3868,18 +3649,6 @@ def gateway_setup():

 def gateway_command(args):
    """Handle gateway subcommands."""
-    try:
-        return _gateway_command_inner(args)
-    except UserSystemdUnavailableError as e:
-        # Clean, actionable message instead of a traceback when the user D-Bus
-        # session is unreachable (fresh SSH shell, no linger, container, etc.).
-        print_error("User systemd not reachable:")
-        for line in str(e).splitlines():
-            print(f"  {line}")
-        sys.exit(1)
-
-
-def _gateway_command_inner(args):
    subcmd = getattr(args, 'gateway_command', None)
    
    # Default to run if no subcommand
@@ -1131,20 +1131,6 @@ def cmd_chat(args):
    if getattr(args, "yolo", False):
        os.environ["HERMES_YOLO_MODE"] = "1"

-    # --ignore-user-config: make load_cli_config() / load_config() skip the
-    # user's ~/.hermes/config.yaml and return built-in defaults. Set BEFORE
-    # importing cli (which runs `CLI_CONFIG = load_cli_config()` at module
-    # import time). Credentials in .env are still loaded — this flag only
-    # ignores behavioral/config settings.
-    if getattr(args, "ignore_user_config", False):
-        os.environ["HERMES_IGNORE_USER_CONFIG"] = "1"
-
-    # --ignore-rules: skip auto-injection of AGENTS.md/SOUL.md/.cursorrules
-    # (rules), memory entries, and any preloaded skills coming from user config.
-    # Maps to AIAgent(skip_context_files=True, skip_memory=True).
-    if getattr(args, "ignore_rules", False):
-        os.environ["HERMES_IGNORE_RULES"] = "1"
-
    # --source: tag session source for filtering (e.g. 'tool' for third-party integrations)
    if getattr(args, "source", None):
        os.environ["HERMES_SESSION_SOURCE"] = args.source
@@ -1173,8 +1159,6 @@ def cmd_chat(args):
        "checkpoints": getattr(args, "checkpoints", False),
        "pass_session_id": getattr(args, "pass_session_id", False),
        "max_turns": getattr(args, "max_turns", None),
-        "ignore_rules": getattr(args, "ignore_rules", False),
-        "ignore_user_config": getattr(args, "ignore_user_config", False),
    }
    # Filter out None values
    kwargs = {k: v for k, v in kwargs.items() if v is not None}
@@ -3984,18 +3968,7 @@ def _model_flow_api_key_provider(config, provider_id, current_model=""):
            pass

        if mdev_models:
-            # Merge models.dev with curated list so newly added models
-            # (not yet in models.dev) still appear in the picker.
-            if curated:
-                seen = {m.lower() for m in mdev_models}
-                merged = list(mdev_models)
-                for m in curated:
-                    if m.lower() not in seen:
-                        merged.append(m)
-                        seen.add(m.lower())
-                model_list = merged
-            else:
-                model_list = mdev_models
+            model_list = mdev_models
            print(f"  Found {len(model_list)} model(s) from models.dev registry")
        elif curated and len(curated) >= 8:
            # Curated list is substantial — use it directly, skip live probe
@@ -5864,15 +5837,12 @@ def _cmd_update_impl(args, gateway_mode: bool):
        # Write exit code *before* the gateway restart attempt.
        # When running as ``hermes update --gateway`` (spawned by the gateway's
        # /update command), this process lives inside the gateway's systemd
-        # cgroup.  A graceful SIGUSR1 restart keeps the drain loop alive long
-        # enough for the exit-code marker to be written below, but the
-        # fallback ``systemctl restart`` path (see below) kills everything in
-        # the cgroup (KillMode=mixed → SIGKILL to remaining processes),
-        # including us and the wrapping bash shell.  The shell never reaches
-        # its ``printf $status > .update_exit_code`` epilogue, so the
-        # exit-code marker file would never be created.  The new gateway's
-        # update watcher would then poll for 30 minutes and send a spurious
-        # timeout message.
+        # cgroup.  ``systemctl restart hermes-gateway`` kills everything in the
+        # cgroup (KillMode=mixed → SIGKILL to remaining processes), including
+        # us and the wrapping bash shell.  The shell never reaches its
+        # ``printf $status > .update_exit_code`` epilogue, so the exit-code
+        # marker file is never created.  The new gateway's update watcher then
+        # polls for 30 minutes and sends a spurious timeout message.
        #
        # Writing the marker here — after git pull + pip install succeed but
        # before we attempt the restart — ensures the new gateway sees it
@@ -5894,37 +5864,9 @@ def _cmd_update_impl(args, gateway_mode: bool):
                _ensure_user_systemd_env,
                find_gateway_pids,
                _get_service_pids,
-                _graceful_restart_via_sigusr1,
            )
            import signal as _signal

-            # Drain budget for graceful SIGUSR1 restarts.  The gateway drains
-            # for up to ``agent.restart_drain_timeout`` (default 60s) before
-            # exiting with code 75; we wait slightly longer so the drain
-            # completes before we fall back to a hard restart.  On older
-            # systemd units without SIGUSR1 wiring this wait just times out
-            # and we fall back to ``systemctl restart`` (the old behaviour).
-            try:
-                from hermes_constants import (
-                    DEFAULT_GATEWAY_RESTART_DRAIN_TIMEOUT as _DEFAULT_DRAIN,
-                )
-            except Exception:
-                _DEFAULT_DRAIN = 60.0
-            _cfg_drain = None
-            try:
-                from hermes_cli.config import load_config
-                _cfg_agent = (load_config().get("agent") or {})
-                _cfg_drain = _cfg_agent.get("restart_drain_timeout")
-            except Exception:
-                pass
-            try:
-                _drain_budget = float(_cfg_drain) if _cfg_drain is not None else float(_DEFAULT_DRAIN)
-            except (TypeError, ValueError):
-                _drain_budget = float(_DEFAULT_DRAIN)
-            # Add a 15s margin so the drain loop + final exit finish before
-            # we escalate to ``systemctl restart`` / SIGTERM.
-            _drain_budget = max(_drain_budget, 30.0) + 15.0
-
            restarted_services = []
            killed_pids = set()

@@ -5971,114 +5913,59 @@ def _cmd_update_impl(args, gateway_mode: bool):
                                text=True,
                                timeout=5,
                            )
-                            if check.stdout.strip() != "active":
-                                continue
-
-                            # Prefer a graceful SIGUSR1 restart so in-flight
-                            # agent runs drain instead of being SIGKILLed.
-                            # The gateway's SIGUSR1 handler calls
-                            # request_restart(via_service=True) → drain →
-                            # exit(75); systemd's Restart=on-failure (and
-                            # RestartForceExitStatus=75) respawns the unit.
-                            _main_pid = 0
-                            try:
-                                _show = subprocess.run(
-                                    scope_cmd + [
-                                        "show", svc_name,
-                                        "--property=MainPID", "--value",
-                                    ],
-                                    capture_output=True, text=True, timeout=5,
-                                )
-                                _main_pid = int((_show.stdout or "").strip() or 0)
-                            except (ValueError, subprocess.TimeoutExpired, FileNotFoundError):
-                                _main_pid = 0
-
-                            _graceful_ok = False
-                            if _main_pid > 0:
-                                print(
-                                    f"  → {svc_name}: draining (up to {int(_drain_budget)}s)..."
-                                )
-                                _graceful_ok = _graceful_restart_via_sigusr1(
-                                    _main_pid, drain_timeout=_drain_budget,
-                                )
-
-                            if _graceful_ok:
-                                # Gateway exited 75; systemd should relaunch
-                                # via Restart=on-failure.  Verify the new
-                                # process came up.
-                                _time.sleep(3)
-                                verify = subprocess.run(
-                                    scope_cmd + ["is-active", svc_name],
-                                    capture_output=True, text=True, timeout=5,
-                                )
-                                if verify.stdout.strip() == "active":
-                                    restarted_services.append(svc_name)
-                                    continue
-                                # Process exited but wasn't respawned (older
-                                # unit without Restart=on-failure or
-                                # RestartForceExitStatus=75).  Fall through
-                                # to systemctl start/restart.
-                                print(
-                                    f"  ⚠ {svc_name} drained but didn't relaunch — forcing restart"
-                                )
-
-                            # Fallback: blunt systemctl restart.  This is
-                            # what the old code always did; we get here only
-                            # when the graceful path failed (unit missing
-                            # SIGUSR1 wiring, drain exceeded the budget,
-                            # restart-policy mismatch).
-                            restart = subprocess.run(
-                                scope_cmd + ["restart", svc_name],
-                                capture_output=True,
-                                text=True,
-                                timeout=15,
-                            )
-                            if restart.returncode == 0:
-                                # Verify the service actually survived the
-                                # restart.  systemctl restart returns 0 even
-                                # if the new process crashes immediately.
-                                _time.sleep(3)
-                                verify = subprocess.run(
-                                    scope_cmd + ["is-active", svc_name],
+                            if check.stdout.strip() == "active":
+                                restart = subprocess.run(
+                                    scope_cmd + ["restart", svc_name],
                                    capture_output=True,
                                    text=True,
-                                    timeout=5,
+                                    timeout=15,
                                )
-                                if verify.stdout.strip() == "active":
-                                    restarted_services.append(svc_name)
-                                else:
-                                    # Retry once — transient startup failures
-                                    # (stale module cache, import race) often
-                                    # resolve on the second attempt.
-                                    print(
-                                        f"  ⚠ {svc_name} died after restart, retrying..."
-                                    )
-                                    retry = subprocess.run(
-                                        scope_cmd + ["restart", svc_name],
-                                        capture_output=True,
-                                        text=True,
-                                        timeout=15,
-                                    )
+                                if restart.returncode == 0:
+                                    # Verify the service actually survived the
+                                    # restart.  systemctl restart returns 0 even
+                                    # if the new process crashes immediately.
                                    _time.sleep(3)
-                                    verify2 = subprocess.run(
+                                    verify = subprocess.run(
                                        scope_cmd + ["is-active", svc_name],
                                        capture_output=True,
                                        text=True,
                                        timeout=5,
                                    )
-                                    if verify2.stdout.strip() == "active":
+                                    if verify.stdout.strip() == "active":
                                        restarted_services.append(svc_name)
-                                        print(f"  ✓ {svc_name} recovered on retry")
                                    else:
+                                        # Retry once — transient startup failures
+                                        # (stale module cache, import race) often
+                                        # resolve on the second attempt.
                                        print(
-                                            f"  ✗ {svc_name} failed to stay running after restart.\n"
-                                            f"    Check logs: journalctl --user -u {svc_name} --since '2 min ago'\n"
-                                            f"    Restart manually: systemctl {'--user ' if scope == 'user' else ''}restart {svc_name}"
+                                            f"  ⚠ {svc_name} died after restart, retrying..."
                                        )
-                            else:
-                                print(
-                                    f"  ⚠ Failed to restart {svc_name}: {restart.stderr.strip()}"
-                                )
+                                        retry = subprocess.run(
+                                            scope_cmd + ["restart", svc_name],
+                                            capture_output=True,
+                                            text=True,
+                                            timeout=15,
+                                        )
+                                        _time.sleep(3)
+                                        verify2 = subprocess.run(
+                                            scope_cmd + ["is-active", svc_name],
+                                            capture_output=True,
+                                            text=True,
+                                            timeout=5,
+                                        )
+                                        if verify2.stdout.strip() == "active":
+                                            restarted_services.append(svc_name)
+                                            print(f"  ✓ {svc_name} recovered on retry")
+                                        else:
+                                            print(
+                                                f"  ✗ {svc_name} failed to stay running after restart.\n"
+                                                f"    Check logs: journalctl {'--user ' if scope == 'user' else ''}-u {svc_name} --since '2 min ago'\n"
+                                                f"    Restart manually: systemctl {'--user ' if scope == 'user' else ''}restart {svc_name}"
+                                            )
+                                else:
+                                    print(
+                                        f"  ⚠ Failed to restart {svc_name}: {restart.stderr.strip()}"
+                                    )
                    except (FileNotFoundError, subprocess.TimeoutExpired):
                        pass

@@ -6719,18 +6606,6 @@ For more help on a command:
        default=False,
        help="Include the session ID in the agent's system prompt",
    )
-    parser.add_argument(
-        "--ignore-user-config",
-        action="store_true",
-        default=False,
-        help="Ignore ~/.hermes/config.yaml and fall back to built-in defaults (credentials in .env are still loaded)",
-    )
-    parser.add_argument(
-        "--ignore-rules",
-        action="store_true",
-        default=False,
-        help="Skip auto-injection of AGENTS.md, SOUL.md, .cursorrules, memory, and preloaded skills",
-    )
    parser.add_argument(
        "--tui",
        action="store_true",
@@ -6870,18 +6745,6 @@ For more help on a command:
        default=argparse.SUPPRESS,
        help="Include the session ID in the agent's system prompt",
    )
-    chat_parser.add_argument(
-        "--ignore-user-config",
-        action="store_true",
-        default=argparse.SUPPRESS,
-        help="Ignore ~/.hermes/config.yaml and fall back to built-in defaults (credentials in .env are still loaded). Useful for isolated CI runs, reproduction, and third-party integrations.",
-    )
-    chat_parser.add_argument(
-        "--ignore-rules",
-        action="store_true",
-        default=argparse.SUPPRESS,
-        help="Skip auto-injection of AGENTS.md, SOUL.md, .cursorrules, memory, and preloaded skills. Combine with --ignore-user-config for a fully isolated run.",
-    )
    chat_parser.add_argument(
        "--source",
        default=None,
@@ -304,113 +304,6 @@ def parse_model_flags(raw_args: str) -> tuple[str, str, bool]:
 # Alias resolution
 # ---------------------------------------------------------------------------

-def _model_sort_key(model_id: str, prefix: str) -> tuple:
-    """Sort key for model version preference.
-
-    Extracts version numbers after the family prefix and returns a sort key
-    that prefers higher versions.  Suffix tokens (``pro``, ``omni``, etc.)
-    are used as tiebreakers, with common quality indicators ranked.
-
-    Examples (with prefix ``"mimo"``)::
-
-        mimo-v2.5-pro   → (-2.5, 0, 'pro')     # highest version wins
-        mimo-v2.5       → (-2.5, 1, '')          # no suffix = lower than pro
-        mimo-v2-pro     → (-2.0, 0, 'pro')
-        mimo-v2-omni    → (-2.0, 1, 'omni')
-        mimo-v2-flash   → (-2.0, 1, 'flash')
-    """
-    # Strip the prefix (and optional "/" separator for aggregator slugs)
-    rest = model_id[len(prefix):]
-    if rest.startswith("/"):
-        rest = rest[1:]
-    rest = rest.lstrip("-").strip()
-
-    # Parse version and suffix from the remainder.
-    # "v2.5-pro" → version [2.5], suffix "pro"
-    # "-omni"    → version [],    suffix "omni"
-    # State machine: start → in_version → between → in_suffix
-    nums: list[float] = []
-    suffix_buf = ""
-    state = "start"
-    num_buf = ""
-
-    for ch in rest:
-        if state == "start":
-            if ch in "vV":
-                state = "in_version"
-            elif ch.isdigit():
-                state = "in_version"
-                num_buf += ch
-            elif ch in "-_.":
-                pass  # skip separators before any content
-            else:
-                state = "in_suffix"
-                suffix_buf += ch
-        elif state == "in_version":
-            if ch.isdigit():
-                num_buf += ch
-            elif ch == ".":
-                if "." in num_buf:
-                    # Second dot — flush current number, start new component
-                    try:
-                        nums.append(float(num_buf.rstrip(".")))
-                    except ValueError:
-                        pass
-                    num_buf = ""
-                else:
-                    num_buf += ch
-            elif ch in "-_.":
-                if num_buf:
-                    try:
-                        nums.append(float(num_buf.rstrip(".")))
-                    except ValueError:
-                        pass
-                    num_buf = ""
-                state = "between"
-            else:
-                if num_buf:
-                    try:
-                        nums.append(float(num_buf.rstrip(".")))
-                    except ValueError:
-                        pass
-                    num_buf = ""
-                state = "in_suffix"
-                suffix_buf += ch
-        elif state == "between":
-            if ch.isdigit():
-                state = "in_version"
-                num_buf = ch
-            elif ch in "vV":
-                state = "in_version"
-            elif ch in "-_.":
-                pass
-            else:
-                state = "in_suffix"
-                suffix_buf += ch
-        elif state == "in_suffix":
-            suffix_buf += ch
-
-    # Flush remaining buffer (strip trailing dots — "5.4." → "5.4")
-    if num_buf and state == "in_version":
-        try:
-            nums.append(float(num_buf.rstrip(".")))
-        except ValueError:
-            pass
-
-    suffix = suffix_buf.lower().strip("-_.")
-    suffix = suffix.strip()
-
-    # Negate versions so higher → sorts first
-    version_key = tuple(-n for n in nums)
-
-    # Suffix quality ranking: pro/max > (no suffix) > omni/flash/mini/lite
-    # Lower number = preferred
-    _SUFFIX_RANK = {"pro": 0, "max": 0, "plus": 0, "turbo": 0}
-    suffix_rank = _SUFFIX_RANK.get(suffix, 1)
-
-    return version_key + (suffix_rank, suffix)
-
-
 def resolve_alias(
    raw_input: str,
    current_provider: str,
@@ -418,9 +311,9 @@ def resolve_alias(
    """Resolve a short alias against the current provider's catalog.

    Looks up *raw_input* in :data:`MODEL_ALIASES`, then searches the
-    current provider's models.dev catalog for the model whose ID starts
-    with ``vendor/family`` (or just ``family`` for non-aggregator
-    providers) and has the **highest version**.
+    current provider's models.dev catalog for the first model whose ID
+    starts with ``vendor/family`` (or just ``family`` for non-aggregator
+    providers).

    Returns:
        ``(provider, resolved_model_id, alias_name)`` if a match is
@@ -448,44 +341,28 @@ def resolve_alias(

    vendor, family = identity

-    # Build catalog from models.dev, then merge in static _PROVIDER_MODELS
-    # entries that models.dev may be missing (e.g. newly added models not
-    # yet synced to the registry).
+    # Search the provider's catalog from models.dev
    catalog = list_provider_models(current_provider)
-    try:
-        from hermes_cli.models import _PROVIDER_MODELS
-        static = _PROVIDER_MODELS.get(current_provider, [])
-        if static:
-            seen = {m.lower() for m in catalog}
-            for m in static:
-                if m.lower() not in seen:
-                    catalog.append(m)
-    except Exception:
-        pass
+    if not catalog:
+        return None

    # For aggregators, models are vendor/model-name format
    aggregator = is_aggregator(current_provider)

-    if aggregator:
-        prefix = f"{vendor}/{family}".lower()
-        matches = [
-            mid for mid in catalog
-            if mid.lower().startswith(prefix)
-        ]
-    else:
-        family_lower = family.lower()
-        matches = [
-            mid for mid in catalog
-            if mid.lower().startswith(family_lower)
-        ]
+    for model_id in catalog:
+        mid_lower = model_id.lower()
+        if aggregator:
+            # Match vendor/family prefix -- e.g. "anthropic/claude-sonnet"
+            prefix = f"{vendor}/{family}".lower()
+            if mid_lower.startswith(prefix):
+                return (current_provider, model_id, key)
+        else:
+            # Non-aggregator: bare names -- e.g. "claude-sonnet-4-6"
+            family_lower = family.lower()
+            if mid_lower.startswith(family_lower):
+                return (current_provider, model_id, key)

-    if not matches:
-        return None
-
-    # Sort by version descending — prefer the latest/highest version
-    prefix_for_sort = f"{vendor}/{family}" if aggregator else family
-    matches.sort(key=lambda m: _model_sort_key(m, prefix_for_sort))
-    return (current_provider, matches[0], key)
+    return None


 def get_authenticated_provider_slugs(
@@ -905,7 +782,6 @@ def switch_model(

 def list_authenticated_providers(
    current_provider: str = "",
-    current_base_url: str = "",
    user_providers: dict = None,
    custom_providers: list | None = None,
    max_models: int = 8,
@@ -971,10 +847,6 @@ def list_authenticated_providers(
        # source of truth.  models.dev can have wrong mappings (e.g.
        # minimax-cn → MINIMAX_API_KEY instead of MINIMAX_CN_API_KEY).
        pconfig = PROVIDER_REGISTRY.get(hermes_id)
-        # Skip non-API-key auth providers here — they are handled in
-        # section 2 (HERMES_OVERLAYS) with proper auth store checking.
-        if pconfig and pconfig.auth_type != "api_key":
-            continue
        if pconfig and pconfig.api_key_env_vars:
            env_vars = list(pconfig.api_key_env_vars)
        else:
@@ -1245,113 +1117,66 @@ def list_authenticated_providers(

    # --- 4. Saved custom providers from config ---
    # Each ``custom_providers`` entry represents one model under a named
-    # provider. Entries sharing the same endpoint (``base_url`` + ``api_key``)
-    # are grouped into a single picker row, so e.g. four Ollama entries
-    # pointing at ``http://localhost:11434/v1`` with per-model display names
-    # ("Ollama — GLM 5.1", "Ollama — Qwen3-coder", ...) appear as one
-    # "Ollama" row with four models inside instead of four near-duplicates
-    # that differ only by suffix. Entries with distinct endpoints still
-    # produce separate rows.
-    #
-    # When the grouped endpoint matches ``current_base_url`` the group's
-    # slug becomes ``current_provider`` so that selecting a model from the
-    # picker flows back through the runtime provider that already holds
-    # valid credentials — no re-resolution needed.
+    # provider. Entries sharing the same provider name are grouped into a
+    # single picker row so that e.g. four Ollama Cloud entries
+    # (qwen3-coder, glm-5.1, kimi-k2, minimax-m2.7) appear as one
+    # "Ollama Cloud" row with four models inside instead of four
+    # duplicate "Ollama Cloud" rows. Entries with distinct provider names
+    # still produce separate rows (e.g. Ollama Cloud vs Moonshot).
    if custom_providers and isinstance(custom_providers, list):
        from collections import OrderedDict

-        # Key by (base_url, api_key) instead of slug: names frequently
-        # differ per model ("Ollama — X") while the endpoint stays the
-        # same. Slug-based grouping left them as separate rows.
-        groups: "OrderedDict[tuple, dict]" = OrderedDict()
+        groups: "OrderedDict[str, dict]" = OrderedDict()
        for entry in custom_providers:
            if not isinstance(entry, dict):
                continue

-            raw_name = (entry.get("name") or "").strip()
+            display_name = (entry.get("name") or "").strip()
            api_url = (
                entry.get("base_url", "")
                or entry.get("url", "")
                or entry.get("api", "")
                or ""
-            ).strip().rstrip("/")
-            if not raw_name or not api_url:
+            ).strip()
+            if not display_name or not api_url:
                continue
-            api_key = (entry.get("api_key") or "").strip()

-            group_key = (api_url, api_key)
-            if group_key not in groups:
-                # Strip per-model suffix so "Ollama — GLM 5.1" becomes
-                # "Ollama" for the grouped row. Em dash is the convention
-                # Hermes's own writer uses; a hyphen variant is accepted
-                # for hand-edited configs.
-                display_name = raw_name
-                for sep in ("—", " - "):
-                    if sep in display_name:
-                        display_name = display_name.split(sep)[0].strip()
-                        break
-                if not display_name:
-                    display_name = raw_name
-                # If this endpoint matches the currently active one, use
-                # ``current_provider`` as the slug so picker-driven switches
-                # route through the live credential pipeline.
-                if (
-                    current_base_url
-                    and api_url == current_base_url.strip().rstrip("/")
-                ):
-                    slug = current_provider or custom_provider_slug(display_name)
-                else:
-                    slug = custom_provider_slug(display_name)
-                groups[group_key] = {
-                    "slug": slug,
+            slug = custom_provider_slug(display_name)
+            if slug not in groups:
+                groups[slug] = {
                    "name": display_name,
                    "api_url": api_url,
                    "models": [],
                }
-
            # The singular ``model:`` field only holds the currently
            # active model. Hermes's own writer (main.py::_save_custom_provider)
            # stores every configured model as a dict under ``models:``;
            # downstream readers (agent/models_dev.py, gateway/run.py,
            # run_agent.py, hermes_cli/config.py) already consume that dict.
+            # The /model picker previously ignored it, so multi-model
+            # custom providers appeared to have only the active model.
            default_model = (entry.get("model") or "").strip()
-            if default_model and default_model not in groups[group_key]["models"]:
-                groups[group_key]["models"].append(default_model)
+            if default_model and default_model not in groups[slug]["models"]:
+                groups[slug]["models"].append(default_model)

            cfg_models = entry.get("models", {})
            if isinstance(cfg_models, dict):
                for m in cfg_models:
-                    if m and m not in groups[group_key]["models"]:
-                        groups[group_key]["models"].append(m)
+                    if m and m not in groups[slug]["models"]:
+                        groups[slug]["models"].append(m)
            elif isinstance(cfg_models, list):
                for m in cfg_models:
-                    if m and m not in groups[group_key]["models"]:
-                        groups[group_key]["models"].append(m)
+                    if m and m not in groups[slug]["models"]:
+                        groups[slug]["models"].append(m)

-        _section4_emitted_slugs: set = set()
-        for grp in groups.values():
-            slug = grp["slug"]
-            # If the slug is already claimed by a built-in / overlay /
-            # user-provider row (sections 1-3), skip this custom group
-            # to avoid shadowing a real provider.
-            if slug.lower() in seen_slugs and slug.lower() not in _section4_emitted_slugs:
+        for slug, grp in groups.items():
+            if slug.lower() in seen_slugs:
                continue
-            # If a prior section-4 group already used this slug (two custom
-            # endpoints with the same cleaned name — e.g. two OpenAI-
-            # compatible gateways named identically with different keys),
-            # append a counter so both rows stay visible in the picker.
-            if slug.lower() in _section4_emitted_slugs:
-                base_slug = slug
-                n = 2
-                while f"{base_slug}-{n}".lower() in seen_slugs:
-                    n += 1
-                slug = f"{base_slug}-{n}"
-                grp["slug"] = slug
            # Skip if section 3 already emitted this endpoint under its
-            # ``providers:`` dict key — matches on (display_name, base_url).
-            # Prevents two picker rows labelled identically when callers
-            # pass both ``user_providers`` and a compatibility-merged
-            # ``custom_providers`` list.
+            # ``providers:`` dict key — matches on (display_name, base_url).
+            # Section 4 itself groups by slug, so this check is what prevents
+            # two identically labelled picker rows when callers pass both
+            # ``user_providers`` and a compatibility-merged ``custom_providers`` list.
            _pair_key = (
                str(grp["name"]).strip().lower(),
                str(grp["api_url"]).strip().rstrip("/").lower(),
@@ -1369,7 +1194,6 @@ def list_authenticated_providers(
                "api_url": grp["api_url"],
            })
            seen_slugs.add(slug.lower())
-            _section4_emitted_slugs.add(slug.lower())

    # Sort: current provider first, then by model count descending
    results.sort(key=lambda r: (not r["is_current"], -r["total_models"]))
@@ -33,8 +33,6 @@ COPILOT_REASONING_EFFORTS_O_SERIES = ["low", "medium", "high"]
 # (model_id, display description shown in menus)
 OPENROUTER_MODELS: list[tuple[str, str]] = [
    ("moonshotai/kimi-k2.6",            "recommended"),
-    ("deepseek/deepseek-v4-pro",        ""),
-    ("deepseek/deepseek-v4-flash",      ""),
    ("anthropic/claude-opus-4.7",       ""),
    ("anthropic/claude-opus-4.6",       ""),
    ("anthropic/claude-sonnet-4.6",     ""),
@@ -111,8 +109,6 @@ def _codex_curated_models() -> list[str]:
 _PROVIDER_MODELS: dict[str, list[str]] = {
    "nous": [
        "moonshotai/kimi-k2.6",
-        "deepseek/deepseek-v4-pro",
-        "deepseek/deepseek-v4-flash",
        "xiaomi/mimo-v2.5-pro",
        "xiaomi/mimo-v2.5",
        "anthropic/claude-opus-4.7",
@@ -250,14 +246,10 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
        "claude-haiku-4-5-20251001",
    ],
    "deepseek": [
-        "deepseek-v4-pro",
-        "deepseek-v4-flash",
        "deepseek-chat",
        "deepseek-reasoner",
    ],
    "xiaomi": [
-        "mimo-v2.5-pro",
-        "mimo-v2.5",
        "mimo-v2-pro",
        "mimo-v2-omni",
        "mimo-v2-flash",
@@ -309,8 +301,6 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
        "kimi-k2.5",
        "glm-5.1",
        "glm-5",
-        "mimo-v2.5-pro",
-        "mimo-v2.5",
        "mimo-v2-pro",
        "mimo-v2-omni",
        "minimax-m2.7",
@@ -702,7 +692,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
    ProviderEntry("ai-gateway",     "Vercel AI Gateway",        "Vercel AI Gateway (200+ models, $5 free credit, no markup)"),
    ProviderEntry("anthropic",      "Anthropic",                "Anthropic (Claude models — API key or Claude Code)"),
    ProviderEntry("openai-codex",   "OpenAI Codex",             "OpenAI Codex"),
-    ProviderEntry("xiaomi",         "Xiaomi MiMo",              "Xiaomi MiMo (MiMo-V2.5 and V2 models — pro, omni, flash)"),
+    ProviderEntry("xiaomi",         "Xiaomi MiMo",              "Xiaomi MiMo (MiMo-V2 models — pro, omni, flash)"),
    ProviderEntry("nvidia",         "NVIDIA NIM",               "NVIDIA NIM (Nemotron models — build.nvidia.com or local NIM)"),
    ProviderEntry("qwen-oauth",     "Qwen OAuth (Portal)",      "Qwen OAuth (reuses local Qwen CLI login)"),
    ProviderEntry("copilot",        "GitHub Copilot",           "GitHub Copilot (uses GITHUB_TOKEN or gh auth token)"),
@@ -1684,19 +1674,7 @@ def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False)
    if normalized == "openai-codex":
        from hermes_cli.codex_models import get_codex_model_ids

-        # Pass the live OAuth access token so the picker matches whatever
-        # ChatGPT lists for this account right now (new models appear without
-        # a Hermes release). Falls back to the hardcoded catalog if no token
-        # or the endpoint is unreachable.
-        access_token = None
-        try:
-            from hermes_cli.auth import resolve_codex_runtime_credentials
-
-            creds = resolve_codex_runtime_credentials(refresh_if_expiring=True)
-            access_token = creds.get("api_key")
-        except Exception:
-            access_token = None
-        return get_codex_model_ids(access_token=access_token)
+        return get_codex_model_ids()
    if normalized in {"copilot", "copilot-acp"}:
        try:
            live = _fetch_github_models(_resolve_copilot_catalog_api_key())
@@ -44,7 +44,7 @@ def _cmd_list(store):
        for p in pending:
            print(
                f"  {p['platform']:<12} {p['code']:<10} {p['user_id']:<20} "
-                f"{(p.get('user_name') or ''):<20} {p['age_minutes']}m ago"
+                f"{p.get('user_name', ''):<20} {p['age_minutes']}m ago"
            )
    else:
        print("\n  No pending pairing requests.")
@@ -54,7 +54,7 @@ def _cmd_list(store):
        print(f"  {'Platform':<12} {'User ID':<20} {'Name':<20}")
        print(f"  {'--------':<12} {'-------':<20} {'----':<20}")
        for a in approved:
-            print(f"  {a['platform']:<12} {a['user_id']:<20} {(a.get('user_name') or ''):<20}")
+            print(f"  {a['platform']:<12} {a['user_id']:<20} {a.get('user_name', ''):<20}")
    else:
        print("\n  No approved users.")

@@ -69,7 +69,7 @@ def _cmd_approve(store, platform: str, code: str):
    result = store.approve_code(platform, code)
    if result:
        uid = result["user_id"]
-        name = result.get("user_name") or ""
+        name = result.get("user_name", "")
        display = f"{name} ({uid})" if name else uid
        print(f"\n  Approved! User {display} on {platform} can now use the bot~")
        print("  They'll be recognized automatically on their next message.\n")
@@ -38,7 +38,6 @@ PLATFORMS: OrderedDict[str, PlatformInfo] = OrderedDict([
    ("qqbot",          PlatformInfo(label="💬 QQBot",           default_toolset="hermes-qqbot")),
    ("webhook",        PlatformInfo(label="🔗 Webhook",         default_toolset="hermes-webhook")),
    ("api_server",     PlatformInfo(label="🌐 API Server",      default_toolset="hermes-api-server")),
-    ("cron",           PlatformInfo(label="⏰ Cron",            default_toolset="hermes-cron")),
 ])


@@ -512,23 +512,10 @@ class PluginManager:
    # Public
    # -----------------------------------------------------------------------

-    def discover_and_load(self, force: bool = False) -> None:
-        """Scan all plugin sources and load each plugin found.
-
-        When ``force`` is true, clear cached discovery state first so config
-        changes or newly-added bundled backends become visible in long-lived
-        sessions without requiring a full agent restart.
-        """
-        if self._discovered and not force:
+    def discover_and_load(self) -> None:
+        """Scan all plugin sources and load each plugin found."""
+        if self._discovered:
            return
-        if force:
-            self._plugins.clear()
-            self._hooks.clear()
-            self._plugin_tool_names.clear()
-            self._cli_commands.clear()
-            self._plugin_commands.clear()
-            self._plugin_skills.clear()
-            self._context_engine = None
        self._discovered = True

        manifests: List[PluginManifest] = []
@@ -1042,13 +1029,9 @@ def get_plugin_manager() -> PluginManager:
    return _plugin_manager


-def discover_plugins(force: bool = False) -> None:
-    """Discover and load all plugins.
-
-    Default behavior is idempotent. Pass ``force=True`` to rescan plugin
-    manifests and reload state in the current process.
-    """
-    get_plugin_manager().discover_and_load(force=force)
+def discover_plugins() -> None:
+    """Discover and load all plugins (idempotent)."""
+    get_plugin_manager().discover_and_load()


 def invoke_hook(hook_name: str, **kwargs: Any) -> List[Any]:
@@ -1099,13 +1082,10 @@ def get_pre_tool_call_block_message(
    return None


-def _ensure_plugins_discovered(force: bool = False) -> PluginManager:
-    """Return the global manager after ensuring plugin discovery has run.
-
-    Pass ``force=True`` to rescan in the current process.
-    """
+def _ensure_plugins_discovered() -> PluginManager:
+    """Return the global manager after running idempotent plugin discovery."""
    manager = get_plugin_manager()
-    manager.discover_and_load(force=force)
+    manager.discover_and_load()
    return manager


@@ -863,15 +863,19 @@ def _safe_extract_profile_archive(archive: Path, destination: Path) -> None:
                pass


-def _inspect_profile_archive_roots(archive: Path) -> set[str]:
-    """Return the archive's top-level directory names.
+def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
+    """Import a profile from a tar.gz archive.

-    Profile imports expect exactly one root directory. Inspecting the archive
-    before extraction lets us stage the import safely instead of mutating a
-    live profile tree first and reconciling names later.
+    If *name* is not given, infers it from the archive's top-level directory.
+    Returns the imported profile directory.
    """
    import tarfile

+    archive = Path(archive_path)
+    if not archive.exists():
+        raise FileNotFoundError(f"Archive not found: {archive}")
+
+    # Peek at the archive to find the top-level directory name
    with tarfile.open(archive, "r:gz") as tf:
        top_dirs = {
            parts[0]
@@ -885,33 +889,13 @@ def _inspect_profile_archive_roots(archive: Path) -> set[str]:
                for member in tf.getmembers()
                if member.isdir()
            }
-    return top_dirs

-
-def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
-    """Import a profile from a tar.gz archive.
-
-    If *name* is not given, infers it from the archive's top-level directory.
-    Returns the imported profile directory.
-    """
-    import tempfile
-
-    archive = Path(archive_path)
-    if not archive.exists():
-        raise FileNotFoundError(f"Archive not found: {archive}")
-
-    top_dirs = _inspect_profile_archive_roots(archive)
-    archive_root = top_dirs.pop() if len(top_dirs) == 1 else None
-    inferred_name = name or archive_root
+    inferred_name = name or (top_dirs.pop() if len(top_dirs) == 1 else None)
    if not inferred_name:
        raise ValueError(
            "Cannot determine profile name from archive. "
            "Specify it explicitly: hermes profile import <archive> --name <name>"
        )
-    if archive_root is None:
-        raise ValueError(
-            "Profile archive must contain exactly one top-level directory."
-        )

    # Archives exported from the default profile have "default/" as top-level
    # dir.  Importing as "default" would target ~/.hermes itself — disallow
@@ -930,22 +914,12 @@ def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
    profiles_root = _get_profiles_root()
    profiles_root.mkdir(parents=True, exist_ok=True)

-    with tempfile.TemporaryDirectory(prefix="hermes_profile_import_") as tmpdir:
-        staging_root = Path(tmpdir)
-        _safe_extract_profile_archive(archive, staging_root)
+    _safe_extract_profile_archive(archive, profiles_root)

-        extracted = staging_root / archive_root
-        if not extracted.is_dir():
-            raise ValueError(
-                f"Profile archive root is missing or invalid: {archive_root}"
-            )
-
-        final_source = extracted
-        if archive_root != inferred_name:
-            final_source = staging_root / inferred_name
-            extracted.rename(final_source)
-
-        shutil.move(str(final_source), str(profile_dir))
+    # If the archive extracted under a different name, rename
+    extracted = profiles_root / (top_dirs.pop() if top_dirs else inferred_name)
+    if extracted != profile_dir and extracted.exists():
+        extracted.rename(profile_dir)

    return profile_dir

@@ -0,0 +1,221 @@
+"""PTY bridge for `hermes dashboard` chat tab.
+
+Wraps a child process behind a pseudo-terminal so its ANSI output can be
+streamed to a browser-side terminal emulator (xterm.js) and typed
+keystrokes can be fed back in.  The only caller today is the
+``/api/pty`` WebSocket endpoint in ``hermes_cli.web_server``.
+
+Design constraints:
+
+* **POSIX-only.**  Hermes Agent supports Windows exclusively via WSL, which
+  exposes a native POSIX PTY via ``openpty(3)``.  Native Windows Python
+  has no PTY; :class:`PtyUnavailableError` is raised with a user-readable
+  install/platform message so the dashboard can render a banner instead of
+  crashing.
+* **Zero Node dependency on the server side.**  We use :mod:`ptyprocess`,
+  which is a pure-Python wrapper around the OS calls.  The browser talks
+  to the same ``hermes --tui`` binary it would launch from the CLI, so
+  every TUI feature (slash popover, model picker, tool rows, markdown,
+  skin engine, clarify/sudo/approval prompts) ships automatically.
+* **Byte-safe I/O.**  Reads and writes go through the PTY master fd
+  directly — we avoid :class:`ptyprocess.PtyProcessUnicode` because
+  streaming ANSI is inherently byte-oriented and UTF-8 boundaries may land
+  mid-read.
+"""
+
+from __future__ import annotations
+
+import errno
+import fcntl
+import os
+import select
+import signal
+import struct
+import sys
+import termios
+import time
+from typing import Optional, Sequence
+
+try:
+    import ptyprocess  # type: ignore
+    _PTY_AVAILABLE = not sys.platform.startswith("win")
+except ImportError:  # pragma: no cover - dev env without ptyprocess
+    ptyprocess = None  # type: ignore
+    _PTY_AVAILABLE = False
+
+
+__all__ = ["PtyBridge", "PtyUnavailableError"]
+
+
+class PtyUnavailableError(RuntimeError):
+    """Raised when a PTY cannot be created on this platform.
+
+    Today this means native Windows (no ConPTY bindings) or a dev
+    environment missing the ``ptyprocess`` dependency.  The dashboard
+    surfaces the message to the user as a chat-tab banner.
+    """
+
+
+class PtyBridge:
+    """Thin wrapper around ``ptyprocess.PtyProcess`` for byte streaming.
+
+    Not thread-safe.  A single bridge is owned by the WebSocket handler
+    that spawned it; the reader runs in an executor thread while writes
+    happen on the event-loop thread.  Both sides are OK because the
+    kernel PTY is the actual synchronization point — we never call
+    :mod:`ptyprocess` methods concurrently, we only call ``os.read`` and
+    ``os.write`` on the master fd, which is safe.
+    """
+
+    def __init__(self, proc: "ptyprocess.PtyProcess"):  # type: ignore[name-defined]
+        self._proc = proc
+        self._fd: int = proc.fd
+        self._closed = False
+
+    # -- lifecycle --------------------------------------------------------
+
+    @classmethod
+    def is_available(cls) -> bool:
+        """True if a PTY can be spawned on this platform."""
+        return bool(_PTY_AVAILABLE)
+
+    @classmethod
+    def spawn(
+        cls,
+        argv: Sequence[str],
+        *,
+        cwd: Optional[str] = None,
+        env: Optional[dict] = None,
+        cols: int = 80,
+        rows: int = 24,
+    ) -> "PtyBridge":
+        """Spawn ``argv`` behind a new PTY and return a bridge.
+
+        Raises :class:`PtyUnavailableError` if the platform can't host a
+        PTY.  Raises :class:`FileNotFoundError` or :class:`OSError` for
+        ordinary exec failures (missing binary, bad cwd, etc.).
+        """
+        if not _PTY_AVAILABLE:
+            raise PtyUnavailableError(
+                "Pseudo-terminals are unavailable on this platform. "
+                "Hermes Agent supports Windows only via WSL."
+            )
+        # Let caller-supplied env fully override inheritance; if they pass
+        # None we inherit the server's env (same semantics as subprocess).
+        spawn_env = os.environ.copy() if env is None else env
+        proc = ptyprocess.PtyProcess.spawn(  # type: ignore[union-attr]
+            list(argv),
+            cwd=cwd,
+            env=spawn_env,
+            dimensions=(rows, cols),
+        )
+        return cls(proc)
+
+    @property
+    def pid(self) -> int:
+        return int(self._proc.pid)
+
+    def is_alive(self) -> bool:
+        if self._closed:
+            return False
+        try:
+            return bool(self._proc.isalive())
+        except Exception:
+            return False
+
+    # -- I/O --------------------------------------------------------------
+
+    def read(self, timeout: float = 0.2) -> Optional[bytes]:
+        """Read up to 64 KiB of raw bytes from the PTY master.
+
+        Returns:
+            * bytes — one or more bytes of child output
+            * empty bytes (``b""``) — no data available within ``timeout``
+            * None — child has exited and the master fd is at EOF
+
+        Never blocks longer than ``timeout`` seconds.  Safe to call after
+        :meth:`close`; returns ``None`` in that case.
+        """
+        if self._closed:
+            return None
+        try:
+            readable, _, _ = select.select([self._fd], [], [], timeout)
+        except (OSError, ValueError):
+            return None
+        if not readable:
+            return b""
+        try:
+            data = os.read(self._fd, 65536)
+        except OSError as exc:
+            # EIO on Linux = slave side closed.  EBADF = already closed.
+            if exc.errno in (errno.EIO, errno.EBADF):
+                return None
+            raise
+        if not data:
+            return None
+        return data
+
+    def write(self, data: bytes) -> None:
+        """Write raw bytes to the PTY master (i.e. the child's stdin)."""
+        if self._closed or not data:
+            return
+        # os.write can return a short write under load; loop until drained.
+        view = memoryview(data)
+        while view:
+            try:
+                n = os.write(self._fd, view)
+            except OSError as exc:
+                if exc.errno in (errno.EIO, errno.EBADF, errno.EPIPE):
+                    return
+                raise
+            if n <= 0:
+                return
+            view = view[n:]
+
+    def resize(self, cols: int, rows: int) -> None:
+        """Forward a terminal resize to the child via ``TIOCSWINSZ``."""
+        if self._closed:
+            return
+        # struct winsize: rows, cols, xpixel, ypixel (all unsigned short)
+        winsize = struct.pack("HHHH", max(1, rows), max(1, cols), 0, 0)
+        try:
+            fcntl.ioctl(self._fd, termios.TIOCSWINSZ, winsize)
+        except OSError:
+            pass
+
+    # -- teardown ---------------------------------------------------------
+
+    def close(self) -> None:
+        """Terminate the child (SIGHUP → SIGTERM → SIGKILL, 0.5s grace each) and close fds.
+
+        Idempotent.  Reaping the child is important so we don't leak
+        zombies across the lifetime of the dashboard process.
+        """
+        if self._closed:
+            return
+        self._closed = True
+
+        # SIGHUP is the conventional "your terminal went away" signal.
+        # We escalate if the child ignores it.
+        for sig in (signal.SIGHUP, signal.SIGTERM, signal.SIGKILL):
+            if not self._proc.isalive():
+                break
+            try:
+                self._proc.kill(sig)
+            except Exception:
+                pass
+            deadline = time.monotonic() + 0.5
+            while self._proc.isalive() and time.monotonic() < deadline:
+                time.sleep(0.02)
+
+        try:
+            self._proc.close(force=True)
+        except Exception:
+            pass
+
+    # Context-manager sugar — handy in tests and ad-hoc scripts.
+    def __enter__(self) -> "PtyBridge":
+        return self
+
+    def __exit__(self, *_exc) -> None:
+        self.close()
@@ -103,7 +103,7 @@ _DEFAULT_PROVIDER_MODELS = {
    "ai-gateway": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5", "google/gemini-3-flash"],
    "kilocode": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5.4", "google/gemini-3-pro-preview", "google/gemini-3-flash-preview"],
    "opencode-zen": ["gpt-5.4", "gpt-5.3-codex", "claude-sonnet-4-6", "gemini-3-flash", "glm-5", "kimi-k2.5", "minimax-m2.7"],
-    "opencode-go": ["kimi-k2.6", "kimi-k2.5", "glm-5.1", "glm-5", "mimo-v2.5-pro", "mimo-v2.5", "mimo-v2-pro", "mimo-v2-omni", "minimax-m2.7", "minimax-m2.5", "qwen3.6-plus", "qwen3.5-plus"],
+    "opencode-go": ["kimi-k2.6", "kimi-k2.5", "glm-5.1", "glm-5", "mimo-v2-pro", "mimo-v2-omni", "minimax-m2.5", "minimax-m2.7", "qwen3.6-plus", "qwen3.5-plus"],
    "huggingface": [
        "Qwen/Qwen3.5-397B-A17B", "Qwen/Qwen3-235B-A22B-Thinking-2507",
        "Qwen/Qwen3-Coder-480B-A35B-Instruct", "deepseek-ai/DeepSeek-R1-0528",
@@ -2334,7 +2334,6 @@ def setup_gateway(config: dict):
            launchd_install,
            launchd_start,
            launchd_restart,
-            UserSystemdUnavailableError,
        )

        service_installed = _is_service_installed()
@@ -2358,10 +2357,6 @@ def setup_gateway(config: dict):
                        systemd_restart()
                    elif _is_macos:
                        launchd_restart()
-                except UserSystemdUnavailableError as e:
-                    print_error("  Restart failed — user systemd not reachable:")
-                    for line in str(e).splitlines():
-                        print(f"  {line}")
                except Exception as e:
                    print_error(f"  Restart failed: {e}")
        elif service_installed:
@@ -2371,10 +2366,6 @@ def setup_gateway(config: dict):
                        systemd_start()
                    elif _is_macos:
                        launchd_start()
-                except UserSystemdUnavailableError as e:
-                    print_error("  Start failed — user systemd not reachable:")
-                    for line in str(e).splitlines():
-                        print(f"  {line}")
                except Exception as e:
                    print_error(f"  Start failed: {e}")
        elif supports_service_manager:
@@ -2398,10 +2389,6 @@ def setup_gateway(config: dict):
                                systemd_start(system=installed_scope == "system")
                            elif _is_macos:
                                launchd_start()
-                        except UserSystemdUnavailableError as e:
-                            print_error("  Start failed — user systemd not reachable:")
-                            for line in str(e).splitlines():
-                                print(f"  {line}")
                        except Exception as e:
                            print_error(f"  Start failed: {e}")
                except Exception as e:
@@ -289,7 +289,6 @@ TIPS = [
    "When a provider returns HTTP 402 (payment required), the auxiliary client auto-falls back to the next one.",
    "agent.tool_use_enforcement steers models that describe actions instead of calling tools — auto for GPT/Codex.",
    "agent.restart_drain_timeout (default 60s) lets running agents finish before a gateway restart takes effect.",
-    "agent.api_max_retries (default 3) controls how many times the agent retries a failed API call before surfacing the error — lower it for fast fallback.",
    "The gateway caches AIAgent instances per session — destroying this cache breaks Anthropic prompt caching.",
    "Any website can expose skills via /.well-known/skills/index.json — the skills hub discovers them automatically.",
    "The skills audit log at ~/.hermes/skills/.hub/audit.log tracks every install and removal operation.",
@@ -67,13 +67,12 @@ CONFIGURABLE_TOOLSETS = [
    ("messaging",       "📨 Cross-Platform Messaging",  "send_message"),
    ("rl",              "🧪 RL Training",               "Tinker-Atropos training tools"),
    ("homeassistant",    "🏠 Home Assistant",           "smart home device control"),
-    ("discord_admin",   "🛡️  Discord Server Admin",    "list channels/roles, pin, assign roles"),
 ]

 # Toolsets that are OFF by default for new installs.
 # They're still in _HERMES_CORE_TOOLS (available at runtime if enabled),
 # but the setup checklist won't pre-select them for first-time users.
-_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "discord_admin"}
+_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl"}


 def _get_effective_configurable_toolsets():
@@ -550,7 +549,7 @@ def _get_platform_tools(
    include_default_mcp_servers: bool = True,
 ) -> Set[str]:
    """Resolve which individual toolset names are enabled for a platform."""
-    from toolsets import resolve_toolset, TOOLSETS
+    from toolsets import resolve_toolset

    platform_toolsets = config.get("platform_toolsets") or {}
    toolset_names = platform_toolsets.get(platform)
@@ -564,8 +563,6 @@ def _get_platform_tools(
    toolset_names = [str(ts) for ts in toolset_names]

    configurable_keys = {ts_key for ts_key, _, _ in CONFIGURABLE_TOOLSETS}
-    plugin_ts_keys = _get_plugin_toolset_keys()
-    platform_default_keys = {p["default_toolset"] for p in PLATFORMS.values()}

    # If the saved list contains any configurable keys directly, the user
    # has explicitly configured this platform — use direct membership.
@@ -588,46 +585,16 @@ def _get_platform_tools(
            ts_tools = set(resolve_toolset(ts_key))
            if ts_tools and ts_tools.issubset(all_tool_names):
                enabled_toolsets.add(ts_key)
-
        default_off = set(_DEFAULT_OFF_TOOLSETS)
        if platform in default_off:
            default_off.remove(platform)
        enabled_toolsets -= default_off

-    # Recover non-configurable platform toolsets (e.g. discord, feishu_doc,
-    # feishu_drive).  These are part of the platform's default composite but
-    # absent from CONFIGURABLE_TOOLSETS, so they can't appear in the TUI
-    # checklist or in a user-saved config.  Must run in BOTH branches —
-    # otherwise saving via `hermes tools` (which flips has_explicit_config
-    # to True) silently drops them.
-    platform_tool_universe = set(resolve_toolset(PLATFORMS[platform]["default_toolset"]))
-    configurable_tool_universe = set()
-    for ck in configurable_keys:
-        configurable_tool_universe.update(resolve_toolset(ck))
-    claimed = set()
-    for ts_key in enabled_toolsets:
-        claimed.update(resolve_toolset(ts_key))
-    skip = configurable_keys | plugin_ts_keys | platform_default_keys
-    skip |= {k for k in TOOLSETS if k.startswith("hermes-")}
-    skip |= set(_DEFAULT_OFF_TOOLSETS) - {platform}
-    for ts_key, ts_def in TOOLSETS.items():
-        if ts_key in skip:
-            continue
-        if ts_def.get("includes"):
-            continue
-        ts_tools = set(resolve_toolset(ts_key))
-        if not ts_tools or not ts_tools.issubset(platform_tool_universe):
-            continue
-        if ts_tools.issubset(configurable_tool_universe):
-            continue
-        if not ts_tools.issubset(claimed):
-            enabled_toolsets.add(ts_key)
-            claimed.update(ts_tools)
-
    # Plugin toolsets: enabled by default unless explicitly disabled.
    # A plugin toolset is "known" for a platform once `hermes tools`
    # has been saved for that platform (tracked via known_plugin_toolsets).
    # Unknown plugins default to enabled; known-but-absent = disabled.
+    plugin_ts_keys = _get_plugin_toolset_keys()
    if plugin_ts_keys:
        known_map = config.get("known_plugin_toolsets", {})
        known_for_platform = set(known_map.get(platform, []))
@@ -642,6 +609,7 @@ def _get_platform_tools(

    # Preserve any explicit non-configurable toolset entries (for example,
    # custom toolsets or MCP server names saved in platform_toolsets).
+    platform_default_keys = {p["default_toolset"] for p in PLATFORMS.values()}
    explicit_passthrough = {
        ts
        for ts in toolset_names
@@ -701,7 +669,6 @@ def _save_platform_tools(config: dict, platform: str, enabled_toolset_keys: Set[
    existing_toolsets = config.get("platform_toolsets", {}).get(platform, [])
    if not isinstance(existing_toolsets, list):
        existing_toolsets = []
-    existing_toolsets = [str(ts) for ts in existing_toolsets]

    # Preserve any entries that are NOT configurable toolsets and NOT platform
    # defaults (i.e. only MCP server names should be preserved)
@@ -709,8 +676,6 @@ def _save_platform_tools(config: dict, platform: str, enabled_toolset_keys: Set[
        entry for entry in existing_toolsets
        if entry not in configurable_keys and entry not in platform_default_keys
    }
-    if "no_mcp" not in enabled_toolset_keys:
-        preserved_entries.discard("no_mcp")

    # Merge preserved entries with new enabled toolsets
    config["platform_toolsets"][platform] = sorted(enabled_toolset_keys | preserved_entries)
@@ -1054,11 +1019,6 @@ def _configure_tool_category(ts_key: str, cat: dict, config: dict):

 def _is_provider_active(provider: dict, config: dict) -> bool:
    """Check if a provider entry matches the currently active config."""
-    plugin_name = provider.get("image_gen_plugin_name")
-    if plugin_name:
-        image_cfg = config.get("image_gen", {})
-        return isinstance(image_cfg, dict) and image_cfg.get("provider") == plugin_name
-
    managed_feature = provider.get("managed_nous_feature")
    if managed_feature:
        features = get_nous_subscription_features(config)
@@ -1066,13 +1026,6 @@ def _is_provider_active(provider: dict, config: dict) -> bool:
        if feature is None:
            return False
        if managed_feature == "image_gen":
-            image_cfg = config.get("image_gen", {})
-            if isinstance(image_cfg, dict):
-                configured_provider = image_cfg.get("provider")
-                if configured_provider not in (None, "", "fal"):
-                    return False
-                if image_cfg.get("use_gateway") is False:
-                    return False
            return feature.managed_by_nous
        if provider.get("tts_provider"):
            return (
@@ -1095,16 +1048,6 @@ def _is_provider_active(provider: dict, config: dict) -> bool:
    if provider.get("web_backend"):
        current = config.get("web", {}).get("backend")
        return current == provider["web_backend"]
-    if provider.get("imagegen_backend"):
-        image_cfg = config.get("image_gen", {})
-        if not isinstance(image_cfg, dict):
-            return False
-        configured_provider = image_cfg.get("provider")
-        return (
-            provider["imagegen_backend"] == "fal"
-            and configured_provider in (None, "", "fal")
-            and not image_cfg.get("use_gateway")
-        )
    return False


@@ -1302,18 +1245,6 @@ def _configure_imagegen_model_for_plugin(plugin_name: str, config: dict) -> None
    _print_success(f"  Model set to: {chosen}")


-def _select_plugin_image_gen_provider(plugin_name: str, config: dict) -> None:
-    """Persist a plugin-backed image generation provider selection."""
-    img_cfg = config.setdefault("image_gen", {})
-    if not isinstance(img_cfg, dict):
-        img_cfg = {}
-        config["image_gen"] = img_cfg
-    img_cfg["provider"] = plugin_name
-    img_cfg["use_gateway"] = False
-    _print_success(f"  image_gen.provider set to: {plugin_name}")
-    _configure_imagegen_model_for_plugin(plugin_name, config)
-
-
 def _configure_provider(provider: dict, config: dict):
    """Configure a single provider - prompt for API keys and set config."""
    env_vars = provider.get("env_vars", [])
@@ -1374,7 +1305,13 @@ def _configure_provider(provider: dict, config: dict):
        # and route model selection to the plugin's own catalog.
        plugin_name = provider.get("image_gen_plugin_name")
        if plugin_name:
-            _select_plugin_image_gen_provider(plugin_name, config)
+            img_cfg = config.setdefault("image_gen", {})
+            if not isinstance(img_cfg, dict):
+                img_cfg = {}
+                config["image_gen"] = img_cfg
+            img_cfg["provider"] = plugin_name
+            _print_success(f"  image_gen.provider set to: {plugin_name}")
+            _configure_imagegen_model_for_plugin(plugin_name, config)
            return
        # Imagegen backends prompt for model selection after backend pick.
        backend = provider.get("imagegen_backend")
@@ -1422,7 +1359,13 @@ def _configure_provider(provider: dict, config: dict):
        _print_success(f"  {provider['name']} configured!")
        plugin_name = provider.get("image_gen_plugin_name")
        if plugin_name:
-            _select_plugin_image_gen_provider(plugin_name, config)
+            img_cfg = config.setdefault("image_gen", {})
+            if not isinstance(img_cfg, dict):
+                img_cfg = {}
+                config["image_gen"] = img_cfg
+            img_cfg["provider"] = plugin_name
+            _print_success(f"  image_gen.provider set to: {plugin_name}")
+            _configure_imagegen_model_for_plugin(plugin_name, config)
            return
        # Imagegen backends prompt for model selection after env vars are in.
        backend = provider.get("imagegen_backend")
@@ -1596,39 +1539,16 @@ def _reconfigure_provider(provider: dict, config: dict):
        config.setdefault("web", {})["backend"] = provider["web_backend"]
        _print_success(f"  Web backend set to: {provider['web_backend']}")

-    if managed_feature and managed_feature not in ("web", "tts", "browser"):
-        section = config.setdefault(managed_feature, {})
-        if not isinstance(section, dict):
-            section = {}
-            config[managed_feature] = section
-        section["use_gateway"] = True
-    elif not managed_feature:
-        for cat_key, cat in TOOL_CATEGORIES.items():
-            if provider in cat.get("providers", []):
-                section = config.get(cat_key)
-                if isinstance(section, dict) and section.get("use_gateway"):
-                    section["use_gateway"] = False
-                break
-
    if not env_vars:
        if provider.get("post_setup"):
            _run_post_setup(provider["post_setup"])
        _print_success(f"  {provider['name']} - no configuration needed!")
        if managed_feature:
            _print_info("  Requests for this tool will be billed to your Nous subscription.")
-        plugin_name = provider.get("image_gen_plugin_name")
-        if plugin_name:
-            _select_plugin_image_gen_provider(plugin_name, config)
-            return
        # Imagegen backends prompt for model selection on reconfig too.
        backend = provider.get("imagegen_backend")
        if backend:
            _configure_imagegen_model(backend, config)
-            if backend == "fal":
-                img_cfg = config.setdefault("image_gen", {})
-                if isinstance(img_cfg, dict):
-                    img_cfg["provider"] = "fal"
-                    img_cfg["use_gateway"] = False
        return

    for var in env_vars:
@@ -1647,19 +1567,9 @@ def _reconfigure_provider(provider: dict, config: dict):
            _print_info("    Kept current")

    # Imagegen backends prompt for model selection on reconfig too.
-    plugin_name = provider.get("image_gen_plugin_name")
-    if plugin_name:
-        _select_plugin_image_gen_provider(plugin_name, config)
-        return
-
    backend = provider.get("imagegen_backend")
    if backend:
        _configure_imagegen_model(backend, config)
-        if backend == "fal":
-            img_cfg = config.setdefault("image_gen", {})
-            if isinstance(img_cfg, dict):
-                img_cfg["provider"] = "fal"
-                img_cfg["use_gateway"] = False


 def _reconfigure_simple_requirements(ts_key: str):
@@ -1,548 +0,0 @@
-"""Process-wide voice recording + TTS API for the TUI gateway.
-
-Wraps ``tools.voice_mode`` (recording/transcription) and ``tools.tts_tool``
-(text-to-speech) behind idempotent, stateful entry points that the gateway's
-``voice.record``, ``voice.toggle``, and ``voice.tts`` JSON-RPC handlers can
-call from a dedicated thread. The gateway imports this module lazily so that
-missing optional audio deps (sounddevice, faster-whisper, numpy) surface as
-an ``ImportError`` at call time, not at startup.
-
-Two usage modes are exposed:
-
-* **Push-to-talk** (``start_recording`` / ``stop_and_transcribe``) — single
-  manually-bounded capture used when the caller drives the start/stop pair
-  explicitly.
-* **Continuous (VAD)** (``start_continuous`` / ``stop_continuous``) — mirrors
-  the classic CLI voice mode: recording auto-stops on silence, transcribes,
-  hands the result to a callback, and then auto-restarts for the next turn.
-  Three consecutive no-speech cycles stop the loop and fire
-  ``on_silent_limit`` so the UI can turn the mode off.
-"""
-
-from __future__ import annotations
-
-import logging
-import os
-import sys
-import threading
-from typing import Any, Callable, Optional
-
-from tools.voice_mode import (
-    create_audio_recorder,
-    is_whisper_hallucination,
-    play_audio_file,
-    transcribe_recording,
-)
-
-logger = logging.getLogger(__name__)
-
-
-def _debug(msg: str) -> None:
-    """Emit a debug breadcrumb when HERMES_VOICE_DEBUG=1.
-
-    Goes to stderr so the TUI gateway wraps it as a gateway.stderr event,
-    which createGatewayEventHandler shows as an Activity line — exactly
-    what we need to diagnose "why didn't the loop auto-restart?" in the
-    user's real terminal without shipping a separate debug RPC.
-
-    Any OSError / BrokenPipeError is swallowed because this fires from
-    background threads (silence callback, TTS daemon, beep) where a
-    broken stderr pipe must not kill the whole gateway — the main
-    command pipe (stdin+stdout) is what actually matters.
-    """
-    if os.environ.get("HERMES_VOICE_DEBUG", "").strip() != "1":
-        return
-    try:
-        print(f"[voice] {msg}", file=sys.stderr, flush=True)
-    except (BrokenPipeError, OSError):
-        pass
-
-
-def _beeps_enabled() -> bool:
-    """CLI parity: voice.beep_enabled in config.yaml (default True)."""
-    try:
-        from hermes_cli.config import load_config
-
-        voice_cfg = load_config().get("voice", {})
-        if isinstance(voice_cfg, dict):
-            return bool(voice_cfg.get("beep_enabled", True))
-    except Exception:
-        pass
-    return True
-
-
-def _play_beep(frequency: int, count: int = 1) -> None:
-    """Audible cue matching cli.py's record/stop beeps.
-
-    880 Hz single-beep on start (cli.py:_voice_start_recording line 7532),
-    660 Hz double-beep on stop (cli.py:_voice_stop_and_transcribe line 7585).
-    Best-effort — sounddevice failures are silently swallowed so the
-    voice loop never breaks because a speaker was unavailable.
-    """
-    if not _beeps_enabled():
-        return
-    try:
-        from tools.voice_mode import play_beep
-
-        play_beep(frequency=frequency, count=count)
-    except Exception as e:
-        _debug(f"beep {frequency}Hz failed: {e}")
-
-# ── Push-to-talk state ───────────────────────────────────────────────
-_recorder = None
-_recorder_lock = threading.Lock()
-
-# ── Continuous (VAD) state ───────────────────────────────────────────
-_continuous_lock = threading.Lock()
-_continuous_active = False
-_continuous_recorder: Any = None
-
-# ── TTS-vs-STT feedback guard ────────────────────────────────────────
-# When TTS plays the agent reply over the speakers, the live microphone
-# picks it up and transcribes the agent's own voice as user input — an
-# infinite loop the agent happily joins ("Ha, looks like we're in a loop").
-# This Event mirrors cli.py:_voice_tts_done: cleared while speak_text is
-# playing, set while silent. _continuous_on_silence waits on it before
-# re-arming the recorder, and speak_text itself cancels any live capture
-# before starting playback so the tail of the previous utterance doesn't
-# leak into the mic.
-_tts_playing = threading.Event()
-_tts_playing.set()  # initially "not playing"
-_continuous_on_transcript: Optional[Callable[[str], None]] = None
-_continuous_on_status: Optional[Callable[[str], None]] = None
-_continuous_on_silent_limit: Optional[Callable[[], None]] = None
-_continuous_no_speech_count = 0
-_CONTINUOUS_NO_SPEECH_LIMIT = 3
-
-
-# ── Push-to-talk API ─────────────────────────────────────────────────
-
-
-def start_recording() -> None:
-    """Begin capturing from the default input device (push-to-talk).
-
-    Idempotent — calling again while a recording is in progress is a no-op.
-    """
-    global _recorder
-
-    with _recorder_lock:
-        if _recorder is not None and getattr(_recorder, "is_recording", False):
-            return
-        rec = create_audio_recorder()
-        rec.start()
-        _recorder = rec
-
-
-def stop_and_transcribe() -> Optional[str]:
-    """Stop the active push-to-talk recording, transcribe, return text.
-
-    Returns ``None`` when no recording is active, when the microphone
-    captured no speech, or when Whisper returned a known hallucination.
-    """
-    global _recorder
-
-    with _recorder_lock:
-        rec = _recorder
-        _recorder = None
-
-    if rec is None:
-        return None
-
-    wav_path = rec.stop()
-    if not wav_path:
-        return None
-
-    try:
-        result = transcribe_recording(wav_path)
-    except Exception as e:
-        logger.warning("voice transcription failed: %s", e)
-        return None
-    finally:
-        try:
-            if os.path.isfile(wav_path):
-                os.unlink(wav_path)
-        except Exception:
-            pass
-
-    # transcribe_recording returns {"success": bool, "transcript": str, ...}
-    # — matches cli.py:_voice_stop_and_transcribe's result.get("transcript").
-    if not result.get("success"):
-        return None
-    text = (result.get("transcript") or "").strip()
-    if not text or is_whisper_hallucination(text):
-        return None
-
-    return text
-
-
-# ── Continuous (VAD) API ─────────────────────────────────────────────
-
-
-def start_continuous(
-    on_transcript: Callable[[str], None],
-    on_status: Optional[Callable[[str], None]] = None,
-    on_silent_limit: Optional[Callable[[], None]] = None,
-    silence_threshold: int = 200,
-    silence_duration: float = 3.0,
-) -> None:
-    """Start a VAD-driven continuous recording loop.
-
-    The loop calls ``on_transcript(text)`` each time speech is detected and
-    transcribed successfully, then auto-restarts. After
-    ``_CONTINUOUS_NO_SPEECH_LIMIT`` consecutive silent cycles (no speech
-    picked up at all) the loop stops itself and calls ``on_silent_limit``
-    so the UI can reflect "voice off". Idempotent — calling while already
-    active is a no-op.
-
-    ``on_status`` is called with ``"listening"`` / ``"transcribing"`` /
-    ``"idle"`` so the UI can show a live indicator.
-    """
-    global _continuous_active, _continuous_recorder
-    global _continuous_on_transcript, _continuous_on_status, _continuous_on_silent_limit
-    global _continuous_no_speech_count
-
-    with _continuous_lock:
-        if _continuous_active:
-            _debug("start_continuous: already active — no-op")
-            return
-        _continuous_active = True
-        _continuous_on_transcript = on_transcript
-        _continuous_on_status = on_status
-        _continuous_on_silent_limit = on_silent_limit
-        _continuous_no_speech_count = 0
-
-        if _continuous_recorder is None:
-            _continuous_recorder = create_audio_recorder()
-
-        _continuous_recorder._silence_threshold = silence_threshold
-        _continuous_recorder._silence_duration = silence_duration
-        rec = _continuous_recorder
-
-    _debug(
-        f"start_continuous: begin (threshold={silence_threshold}, duration={silence_duration}s)"
-    )
-
-    # CLI parity: single 880 Hz beep *before* opening the stream — placing
-    # the beep after stream.start() on macOS triggers a CoreAudio conflict
-    # (cli.py:7528 comment).
-    _play_beep(frequency=880, count=1)
-
-    try:
-        rec.start(on_silence_stop=_continuous_on_silence)
-    except Exception as e:
-        logger.error("failed to start continuous recording: %s", e)
-        _debug(f"start_continuous: rec.start raised {type(e).__name__}: {e}")
-        with _continuous_lock:
-            _continuous_active = False
-        raise
-
-    if on_status:
-        try:
-            on_status("listening")
-        except Exception:
-            pass
-
-
-def stop_continuous() -> None:
-    """Stop the active continuous loop and release the microphone.
-
-    Idempotent — calling while not active is a no-op. Any in-flight
-    transcription completes but its result is discarded (the callback
-    checks ``_continuous_active`` before firing).
-    """
-    global _continuous_active, _continuous_on_transcript
-    global _continuous_on_status, _continuous_on_silent_limit
-    global _continuous_recorder, _continuous_no_speech_count
-
-    with _continuous_lock:
-        if not _continuous_active:
-            return
-        _continuous_active = False
-        rec = _continuous_recorder
-        on_status = _continuous_on_status
-        _continuous_on_transcript = None
-        _continuous_on_status = None
-        _continuous_on_silent_limit = None
-        _continuous_no_speech_count = 0
-
-    if rec is not None:
-        try:
-            # cancel() (not stop()) discards buffered frames — the loop
-            # is over, we don't want to transcribe a half-captured turn.
-            rec.cancel()
-        except Exception as e:
-            logger.warning("failed to cancel recorder: %s", e)
-
-    # Audible "recording stopped" cue (CLI parity: same 660 Hz × 2 the
-    # silence-auto-stop path plays).
-    _play_beep(frequency=660, count=2)
-
-    if on_status:
-        try:
-            on_status("idle")
-        except Exception:
-            pass
-
-
-def is_continuous_active() -> bool:
-    """Whether a continuous voice loop is currently running."""
-    with _continuous_lock:
-        return _continuous_active
-
-
-def _continuous_on_silence() -> None:
-    """AudioRecorder silence callback — runs in a daemon thread.
-
-    Stops the current capture, transcribes, delivers the text via
-    ``on_transcript``, and — if the loop is still active — starts the
-    next capture. Three consecutive silent cycles end the loop.
-    """
-    global _continuous_active, _continuous_no_speech_count
-
-    _debug("_continuous_on_silence: fired")
-
-    with _continuous_lock:
-        if not _continuous_active:
-            _debug("_continuous_on_silence: loop inactive — abort")
-            return
-        rec = _continuous_recorder
-        on_transcript = _continuous_on_transcript
-        on_status = _continuous_on_status
-        on_silent_limit = _continuous_on_silent_limit
-
-    if rec is None:
-        _debug("_continuous_on_silence: no recorder — abort")
-        return
-
-    if on_status:
-        try:
-            on_status("transcribing")
-        except Exception:
-            pass
-
-    wav_path = rec.stop()
-    # Peak RMS is the critical diagnostic when stop() returns None despite
-    # the VAD firing — tells us at a glance whether the mic was too quiet
-    # for SILENCE_RMS_THRESHOLD (200) or the VAD + peak checks disagree.
-    peak_rms = getattr(rec, "_peak_rms", -1)
-    _debug(
-        f"_continuous_on_silence: rec.stop -> {wav_path!r} (peak_rms={peak_rms})"
-    )
-
-    # CLI parity: double 660 Hz beep after the stream stops (safe from the
-    # CoreAudio conflict that blocks pre-start beeps).
-    _play_beep(frequency=660, count=2)
-
-    transcript: Optional[str] = None
-
-    if wav_path:
-        try:
-            result = transcribe_recording(wav_path)
-            # transcribe_recording returns {"success": bool, "transcript": str,
-            # "error": str?} — NOT {"text": str}.  Using the wrong key silently
-            # produced empty transcripts even when Groq/local STT returned fine,
-            # which masqueraded as "not hearing the user" to the caller.
-            success = bool(result.get("success"))
-            text = (result.get("transcript") or "").strip()
-            err = result.get("error")
-            _debug(
-                f"_continuous_on_silence: transcribe -> success={success} "
-                f"text={text!r} err={err!r}"
-            )
-            if success and text and not is_whisper_hallucination(text):
-                transcript = text
-        except Exception as e:
-            logger.warning("continuous transcription failed: %s", e)
-            _debug(f"_continuous_on_silence: transcribe raised {type(e).__name__}: {e}")
-        finally:
-            try:
-                if os.path.isfile(wav_path):
-                    os.unlink(wav_path)
-            except Exception:
-                pass
-
-    with _continuous_lock:
-        if not _continuous_active:
-            # User stopped us while we were transcribing — discard.
-            _debug("_continuous_on_silence: stopped during transcribe — no restart")
-            return
-        if transcript:
-            _continuous_no_speech_count = 0
-        else:
-            _continuous_no_speech_count += 1
-        should_halt = _continuous_no_speech_count >= _CONTINUOUS_NO_SPEECH_LIMIT
-        no_speech = _continuous_no_speech_count
-
-    if transcript and on_transcript:
-        try:
-            on_transcript(transcript)
-        except Exception as e:
-            logger.warning("on_transcript callback raised: %s", e)
-
-    if should_halt:
-        _debug(f"_continuous_on_silence: {no_speech} silent cycles — halting")
-        with _continuous_lock:
-            _continuous_active = False
-            _continuous_no_speech_count = 0
-        if on_silent_limit:
-            try:
-                on_silent_limit()
-            except Exception:
-                pass
-        try:
-            rec.cancel()
-        except Exception:
-            pass
-        if on_status:
-            try:
-                on_status("idle")
-            except Exception:
-                pass
-        return
-
-    # CLI parity (cli.py:10619-10621): wait for any in-flight TTS to
-    # finish before re-arming the mic, then leave a small gap to avoid
-    # catching the tail of the speaker output.  Without this the voice
-    # loop becomes a feedback loop — the agent's spoken reply lands
-    # back in the mic and gets re-submitted.
-    if not _tts_playing.is_set():
-        _debug("_continuous_on_silence: waiting for TTS to finish")
-        _tts_playing.wait(timeout=60)
-        import time as _time
-        _time.sleep(0.3)
-
-        # User may have stopped the loop during the wait.
-        with _continuous_lock:
-            if not _continuous_active:
-                _debug("_continuous_on_silence: stopped while waiting for TTS")
-                return
-
-    # Restart for the next turn.
-    _debug(f"_continuous_on_silence: restarting loop (no_speech={no_speech})")
-    _play_beep(frequency=880, count=1)
-    try:
-        rec.start(on_silence_stop=_continuous_on_silence)
-    except Exception as e:
-        logger.error("failed to restart continuous recording: %s", e)
-        _debug(f"_continuous_on_silence: restart raised {type(e).__name__}: {e}")
-        with _continuous_lock:
-            _continuous_active = False
-        return
-
-    if on_status:
-        try:
-            on_status("listening")
-        except Exception:
-            pass
-
-
-# ── TTS API ──────────────────────────────────────────────────────────
-
-
-def speak_text(text: str) -> None:
-    """Synthesize ``text`` with the configured TTS provider and play it.
-
-    Mirrors cli.py:_voice_speak_response exactly — same markdown strip
-    pipeline, same 4000-char cap, same explicit mp3 output path, same
-    MP3-over-OGG playback choice (afplay misbehaves on OGG), same cleanup
-    of both extensions. Keeping these in sync means a voice-mode TTS
-    session in the TUI sounds identical to one in the classic CLI.
-
-    While playback is in flight the module-level _tts_playing Event is
-    cleared so the continuous-recording loop knows to wait before
-    re-arming the mic (otherwise the agent's spoken reply feedback-loops
-    through the microphone and the agent ends up replying to itself).
-    """
-    if not text or not text.strip():
-        return
-
-    import re
-    import tempfile
-    import time
-
-    # Cancel any live capture before we open the speakers — otherwise the
-    # last ~200ms of the user's turn tail + the first syllables of our TTS
-    # both end up in the next recording window.  The continuous loop will
-    # re-arm itself after _tts_playing flips back (see _continuous_on_silence).
-    paused_recording = False
-    with _continuous_lock:
-        if (
-            _continuous_active
-            and _continuous_recorder is not None
-            and getattr(_continuous_recorder, "is_recording", False)
-        ):
-            try:
-                _continuous_recorder.cancel()
-                paused_recording = True
-            except Exception as e:
-                logger.warning("failed to pause recorder for TTS: %s", e)
-
-    _tts_playing.clear()
-    _debug(f"speak_text: TTS begin (paused_recording={paused_recording})")
-
-    try:
-        from tools.tts_tool import text_to_speech_tool
-
-        tts_text = text[:4000] if len(text) > 4000 else text
-        tts_text = re.sub(r'```[\s\S]*?```', ' ', tts_text)             # fenced code blocks
-        tts_text = re.sub(r'\[([^\]]+)\]\([^)]+\)', r'\1', tts_text)    # [text](url) → text
-        tts_text = re.sub(r'https?://\S+', '', tts_text)                # bare URLs
-        tts_text = re.sub(r'\*\*(.+?)\*\*', r'\1', tts_text)            # bold
-        tts_text = re.sub(r'\*(.+?)\*', r'\1', tts_text)                # italic
-        tts_text = re.sub(r'`(.+?)`', r'\1', tts_text)                  # inline code
-        tts_text = re.sub(r'^#+\s*', '', tts_text, flags=re.MULTILINE)  # headers
-        tts_text = re.sub(r'^\s*[-*]\s+', '', tts_text, flags=re.MULTILINE)  # list bullets
-        tts_text = re.sub(r'---+', '', tts_text)                        # horizontal rules
-        tts_text = re.sub(r'\n{3,}', '\n\n', tts_text)                  # excess newlines
-        tts_text = tts_text.strip()
-        if not tts_text:
-            return
-
-        # MP3 output path, pre-chosen so we can play the MP3 directly even
-        # when text_to_speech_tool auto-converts to OGG for messaging
-        # platforms.  afplay's OGG support is flaky, MP3 always works.
-        os.makedirs(os.path.join(tempfile.gettempdir(), "hermes_voice"), exist_ok=True)
-        mp3_path = os.path.join(
-            tempfile.gettempdir(),
-            "hermes_voice",
-            f"tts_{time.strftime('%Y%m%d_%H%M%S')}.mp3",
-        )
-
-        _debug(f"speak_text: synthesizing {len(tts_text)} chars -> {mp3_path}")
-        text_to_speech_tool(text=tts_text, output_path=mp3_path)
-
-        if os.path.isfile(mp3_path) and os.path.getsize(mp3_path) > 0:
-            _debug(f"speak_text: playing {mp3_path} ({os.path.getsize(mp3_path)} bytes)")
-            play_audio_file(mp3_path)
-            try:
-                os.unlink(mp3_path)
-                ogg_path = mp3_path.rsplit(".", 1)[0] + ".ogg"
-                if os.path.isfile(ogg_path):
-                    os.unlink(ogg_path)
-            except OSError:
-                pass
-        else:
-            _debug(f"speak_text: TTS tool produced no audio at {mp3_path}")
-    except Exception as e:
-        logger.warning("Voice TTS playback failed: %s", e)
-        _debug(f"speak_text raised {type(e).__name__}: {e}")
-    finally:
-        _tts_playing.set()
-        _debug("speak_text: TTS done")
-
-        # Re-arm the mic so the user can answer without pressing Ctrl+B.
-        # Small delay lets the OS flush speaker output and afplay fully
-        # release the audio device before sounddevice re-opens the input.
-        if paused_recording:
-            time.sleep(0.3)
-            with _continuous_lock:
-                if _continuous_active and _continuous_recorder is not None:
-                    try:
-                        _continuous_recorder.start(
-                            on_silence_stop=_continuous_on_silence
-                        )
-                        _debug("speak_text: recording resumed after TTS")
-                    except Exception as e:
-                        logger.warning(
-                            "failed to resume recorder after TTS: %s", e
-                        )
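
Review note: the markdown-stripping pipeline inside the removed `speak_text` can be exercised on its own. The sketch below copies the regexes and the 4000-character cap from the deleted lines into a standalone helper. The name `strip_markdown_for_tts` and the `limit` parameter are assumptions made for illustration and do not exist in the repository.

```python
import re


def strip_markdown_for_tts(text: str, limit: int = 4000) -> str:
    """Hypothetical helper: reduce markdown to plain text before synthesis."""
    tts_text = text[:limit]                                          # cap length first, as the removed code did
    tts_text = re.sub(r'```[\s\S]*?```', ' ', tts_text)              # fenced code blocks
    tts_text = re.sub(r'\[([^\]]+)\]\([^)]+\)', r'\1', tts_text)     # [text](url) -> text
    tts_text = re.sub(r'https?://\S+', '', tts_text)                 # bare URLs
    tts_text = re.sub(r'\*\*(.+?)\*\*', r'\1', tts_text)             # bold
    tts_text = re.sub(r'\*(.+?)\*', r'\1', tts_text)                 # italic
    tts_text = re.sub(r'`(.+?)`', r'\1', tts_text)                   # inline code
    tts_text = re.sub(r'^#+\s*', '', tts_text, flags=re.MULTILINE)   # headers
    tts_text = re.sub(r'^\s*[-*]\s+', '', tts_text, flags=re.MULTILINE)  # list bullets
    tts_text = re.sub(r'---+', '', tts_text)                         # horizontal rules
    tts_text = re.sub(r'\n{3,}', '\n\n', tts_text)                   # excess newlines
    return tts_text.strip()
```

Order matters here: the bold pattern has to run before the italic one, otherwise `**bold**` would be partially consumed by the single-asterisk rule.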
@@ -49,7 +49,7 @@ from hermes_cli.config import (
 from gateway.status import get_running_pid, read_runtime_status

 try:
-    from fastapi import FastAPI, HTTPException, Request
+    from fastapi import FastAPI, HTTPException, Request, WebSocket, WebSocketDisconnect
    from fastapi.middleware.cors import CORSMiddleware
    from fastapi.responses import FileResponse, HTMLResponse, JSONResponse
    from fastapi.staticfiles import StaticFiles
@@ -71,7 +71,6 @@ app = FastAPI(title="Hermes Agent", version=__version__)
 # Injected into the SPA HTML so only the legitimate web UI can use it.
 # ---------------------------------------------------------------------------
 _SESSION_TOKEN = secrets.token_urlsafe(32)
-_SESSION_HEADER_NAME = "X-Hermes-Session-Token"

 # Simple rate limiter for the reveal endpoint
 _reveal_timestamps: List[float] = []
@@ -105,29 +104,14 @@ _PUBLIC_API_PATHS: frozenset = frozenset({
 })


-def _has_valid_session_token(request: Request) -> bool:
-    """True if the request carries a valid dashboard session token.
+def _require_token(request: Request) -> None:
+    """Validate the ephemeral session token.  Raises 401 on mismatch.

-    The dedicated session header avoids collisions with reverse proxies that
-    already use ``Authorization`` (for example Caddy ``basic_auth``). We still
-    accept the legacy Bearer path for backward compatibility with older
-    dashboard bundles.
+    Uses ``hmac.compare_digest`` to prevent timing side-channels.
    """
-    session_header = request.headers.get(_SESSION_HEADER_NAME, "")
-    if session_header and hmac.compare_digest(
-        session_header.encode(),
-        _SESSION_TOKEN.encode(),
-    ):
-        return True
-
    auth = request.headers.get("authorization", "")
    expected = f"Bearer {_SESSION_TOKEN}"
-    return hmac.compare_digest(auth.encode(), expected.encode())
-
-
-def _require_token(request: Request) -> None:
-    """Validate the ephemeral session token.  Raises 401 on mismatch."""
-    if not _has_valid_session_token(request):
+    if not hmac.compare_digest(auth.encode(), expected.encode()):
        raise HTTPException(status_code=401, detail="Unauthorized")
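
Review note: after this change the token check accepts only the `Authorization: Bearer <token>` form. A minimal, framework-free sketch of that comparison follows. The names `SESSION_TOKEN` and `is_authorized` are illustrative assumptions, while `hmac.compare_digest` and `secrets.token_urlsafe` are the standard-library calls the diff itself uses.

```python
import hmac
import secrets

# Ephemeral per-process token, generated the same way as in the diff.
SESSION_TOKEN = secrets.token_urlsafe(32)


def is_authorized(authorization_header: str, token: str = SESSION_TOKEN) -> bool:
    """Hypothetical helper: constant-time check of a Bearer header value."""
    expected = f"Bearer {token}"
    # compare_digest takes time independent of where the inputs differ,
    # which avoids leaking how much of the token matched.
    return hmac.compare_digest(authorization_header.encode(), expected.encode())
```

Encoding both sides to bytes, as the diff does, keeps the comparison valid even if a client sends non-ASCII characters in the header (on `str` inputs `compare_digest` raises `TypeError` for non-ASCII text).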


@@ -221,7 +205,9 @@ async def auth_middleware(request: Request, call_next):
    """Require the session token on all /api/ routes except the public list."""
    path = request.url.path
    if path.startswith("/api/") and path not in _PUBLIC_API_PATHS and not path.startswith("/api/plugins/"):
-        if not _has_valid_session_token(request):
+        auth = request.headers.get("authorization", "")
+        expected = f"Bearer {_SESSION_TOKEN}"
+        if not hmac.compare_digest(auth.encode(), expected.encode()):
            return JSONResponse(
                status_code=401,
                content={"detail": "Unauthorized"},
@@ -431,14 +417,7 @@ class EnvVarReveal(BaseModel):


 _GATEWAY_HEALTH_URL = os.getenv("GATEWAY_HEALTH_URL")
-try:
-    _GATEWAY_HEALTH_TIMEOUT = float(os.getenv("GATEWAY_HEALTH_TIMEOUT", "3"))
-except (ValueError, TypeError):
-    _log.warning(
-        "Invalid GATEWAY_HEALTH_TIMEOUT value %r — using default 3.0s",
-        os.getenv("GATEWAY_HEALTH_TIMEOUT"),
-    )
-    _GATEWAY_HEALTH_TIMEOUT = 3.0
+_GATEWAY_HEALTH_TIMEOUT = float(os.getenv("GATEWAY_HEALTH_TIMEOUT", "3"))


 def _probe_gateway_health() -> tuple[bool, dict | None]:
@@ -2263,6 +2242,148 @@ async def get_usage_analytics(days: int = 30):
        db.close()


+# ---------------------------------------------------------------------------
+# /api/pty — PTY-over-WebSocket bridge for the dashboard "Chat" tab.
+#
+# The endpoint spawns the same ``hermes --tui`` binary the CLI uses, behind
+# a POSIX pseudo-terminal, and forwards bytes + resize escapes across a
+# WebSocket.  The browser renders the ANSI through xterm.js (see
+# web/src/pages/ChatPage.tsx).
+#
+# Auth: ``?token=<session_token>`` query param (browsers can't set
+# Authorization on the WS upgrade).  Same ephemeral ``_SESSION_TOKEN`` as
+# REST.  Localhost-only — we defensively reject non-loopback clients even
+# though uvicorn binds to 127.0.0.1.
+# ---------------------------------------------------------------------------
+
+import re
+import asyncio
+
+from hermes_cli.pty_bridge import PtyBridge, PtyUnavailableError
+
+_RESIZE_RE = re.compile(rb"\x1b\[RESIZE:(\d+);(\d+)\]")
+_PTY_READ_CHUNK_TIMEOUT = 0.2
+# Starlette's TestClient reports the peer as "testclient"; treat it as
+# loopback so tests don't need to rewrite request scope.
+_LOOPBACK_HOSTS = frozenset({"127.0.0.1", "::1", "localhost", "testclient"})
+
+
+def _resolve_chat_argv(
+    resume: Optional[str] = None,
+) -> tuple[list[str], Optional[str], Optional[dict]]:
+    """Resolve the argv + cwd + env for the chat PTY.
+
+    Default: whatever ``hermes --tui`` would run.  Tests monkeypatch this
+    function to inject a tiny fake command (``cat``, ``sh -c 'printf …'``)
+    so nothing has to build Node or the TUI bundle.
+
+    Session resume is propagated via the ``HERMES_TUI_RESUME`` env var —
+    matching what ``hermes_cli.main._launch_tui`` does for the CLI path.
+    Appending ``--resume <id>`` to argv doesn't work because ``ui-tui`` does
+    not parse its argv.
+    """
+    from hermes_cli.main import PROJECT_ROOT, _make_tui_argv
+
+    argv, cwd = _make_tui_argv(PROJECT_ROOT / "ui-tui", tui_dev=False)
+    env: Optional[dict] = None
+    if resume:
+        env = os.environ.copy()
+        env["HERMES_TUI_RESUME"] = resume
+    return list(argv), str(cwd) if cwd else None, env
+
+
+@app.websocket("/api/pty")
+async def pty_ws(ws: WebSocket) -> None:
+    # --- auth + loopback check (before accept so we can close cleanly) ---
+    token = ws.query_params.get("token", "")
+    expected = _SESSION_TOKEN
+    if not hmac.compare_digest(token.encode(), expected.encode()):
+        await ws.close(code=4401)
+        return
+
+    client_host = ws.client.host if ws.client else ""
+    if client_host and client_host not in _LOOPBACK_HOSTS:
+        await ws.close(code=4403)
+        return
+
+    await ws.accept()
+
+    # --- spawn PTY ------------------------------------------------------
+    resume = ws.query_params.get("resume") or None
+    try:
+        argv, cwd, env = _resolve_chat_argv(resume=resume)
+    except SystemExit as exc:
+        # _make_tui_argv calls sys.exit(1) when node/npm is missing.
+        await ws.send_text(f"\r\n\x1b[31mChat unavailable: {exc}\x1b[0m\r\n")
+        await ws.close(code=1011)
+        return
+
+
+    try:
+        bridge = PtyBridge.spawn(argv, cwd=cwd, env=env)
+    except PtyUnavailableError as exc:
+        await ws.send_text(f"\r\n\x1b[31mChat unavailable: {exc}\x1b[0m\r\n")
+        await ws.close(code=1011)
+        return
+    except (FileNotFoundError, OSError) as exc:
+        await ws.send_text(f"\r\n\x1b[31mChat failed to start: {exc}\x1b[0m\r\n")
+        await ws.close(code=1011)
+        return
+
+    loop = asyncio.get_running_loop()
+
+    # --- reader task: PTY master → WebSocket ----------------------------
+    async def pump_pty_to_ws() -> None:
+        while True:
+            chunk = await loop.run_in_executor(
+                None, bridge.read, _PTY_READ_CHUNK_TIMEOUT
+            )
+            if chunk is None:  # EOF
+                return
+            if not chunk:  # no data this tick; yield control and retry
+                await asyncio.sleep(0)
+                continue
+            try:
+                await ws.send_bytes(chunk)
+            except Exception:
+                return
+
+    reader_task = asyncio.create_task(pump_pty_to_ws())
+
+    # --- writer loop: WebSocket → PTY master ----------------------------
+    try:
+        while True:
+            msg = await ws.receive()
+            msg_type = msg.get("type")
+            if msg_type == "websocket.disconnect":
+                break
+            raw = msg.get("bytes")
+            if raw is None:
+                text = msg.get("text")
+                raw = text.encode("utf-8") if isinstance(text, str) else b""
+            if not raw:
+                continue
+
+            # Resize escape is consumed locally, never written to the PTY.
+            match = _RESIZE_RE.match(raw)
+            if match and match.end() == len(raw):
+                cols = int(match.group(1))
+                rows = int(match.group(2))
+                bridge.resize(cols=cols, rows=rows)
+                continue
+
+            bridge.write(raw)
+    except WebSocketDisconnect:
+        pass
+    finally:
+        reader_task.cancel()
+        try:
+            await reader_task
+        except (asyncio.CancelledError, Exception):
+            pass
+        bridge.close()
+
+
 def mount_spa(application: FastAPI):
    """Mount the built SPA. Falls back to index.html for client-side routing.

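The resize handling in the writer loop above can be sketched in isolation. The pattern is copied from `_RESIZE_RE` in the diff; `parse_resize` is a hypothetical name, and `fullmatch` is used here as a stand-in for the diff's `match(...)` plus `match.end() == len(raw)` check, which should accept the same frames.

```python
import re
from typing import Optional, Tuple

# Same pattern as _RESIZE_RE in the diff: ESC [ RESIZE:<cols>;<rows> ]
RESIZE_RE = re.compile(rb"\x1b\[RESIZE:(\d+);(\d+)\]")


def parse_resize(frame: bytes) -> Optional[Tuple[int, int]]:
    """Return (cols, rows) if the whole frame is a resize escape, else None.

    Hypothetical helper: a frame that merely starts with, or contains, the
    escape is treated as ordinary input and would be written to the PTY.
    """
    m = RESIZE_RE.fullmatch(frame)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))
```

Requiring the escape to span the entire frame means keystrokes that happen to embed the sequence are passed through rather than silently dropped.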
@@ -2325,227 +2446,8 @@ _BUILTIN_DASHBOARD_THEMES = [
 ]


-def _parse_theme_layer(value: Any, default_hex: str, default_alpha: float = 1.0) -> Optional[Dict[str, Any]]:
-    """Normalise a theme layer spec from YAML into `{hex, alpha}` form.
-
-    Accepts shorthand (a bare hex string) or full dict form.  Returns
-    ``None`` on garbage input so the caller can fall back to a built-in
-    default rather than blowing up.
-    """
-    if value is None:
-        return {"hex": default_hex, "alpha": default_alpha}
-    if isinstance(value, str):
-        return {"hex": value, "alpha": default_alpha}
-    if isinstance(value, dict):
-        hex_val = value.get("hex", default_hex)
-        alpha_val = value.get("alpha", default_alpha)
-        if not isinstance(hex_val, str):
-            return None
-        try:
-            alpha_f = float(alpha_val)
-        except (TypeError, ValueError):
-            alpha_f = default_alpha
-        return {"hex": hex_val, "alpha": max(0.0, min(1.0, alpha_f))}
-    return None
-
-
-_THEME_DEFAULT_TYPOGRAPHY: Dict[str, str] = {
-    "fontSans": 'system-ui, -apple-system, "Segoe UI", Roboto, "Helvetica Neue", Arial, sans-serif',
-    "fontMono": 'ui-monospace, "SF Mono", "Cascadia Mono", Menlo, Consolas, monospace',
-    "baseSize": "15px",
-    "lineHeight": "1.55",
-    "letterSpacing": "0",
-}
-
-_THEME_DEFAULT_LAYOUT: Dict[str, str] = {
-    "radius": "0.5rem",
-    "density": "comfortable",
-}
-
-_THEME_OVERRIDE_KEYS = {
-    "card", "cardForeground", "popover", "popoverForeground",
-    "primary", "primaryForeground", "secondary", "secondaryForeground",
-    "muted", "mutedForeground", "accent", "accentForeground",
-    "destructive", "destructiveForeground", "success", "warning",
-    "border", "input", "ring",
-}
-
-# Well-known named asset slots themes can populate.  Any other keys under
-# ``assets.custom`` are exposed as ``--theme-asset-custom-<key>`` CSS vars
-# for plugin/shell use.
-_THEME_NAMED_ASSET_KEYS = {"bg", "hero", "logo", "crest", "sidebar", "header"}
-
-# Component-style buckets themes can override.  The value under each bucket
-# is a mapping from camelCase property name to CSS string; each pair emits
-# ``--component-<bucket>-<kebab-property>`` on :root.  The frontend's shell
-# components (Card, App header, Backdrop, etc.) consume these vars so themes
-# can restyle chrome (clip-path, border-image, segmented progress, etc.)
-# without shipping their own CSS.
-_THEME_COMPONENT_BUCKETS = {
-    "card", "header", "footer", "sidebar", "tab",
-    "progress", "badge", "backdrop", "page",
-}
-
-_THEME_LAYOUT_VARIANTS = {"standard", "cockpit", "tiled"}
-
-# Cap on customCSS length so a malformed/oversized theme YAML can't blow up
-# the response payload or the <style> tag.  32 KiB is plenty for every
-# practical reskin (the Strike Freedom demo is ~2 KiB).
-_THEME_CUSTOM_CSS_MAX = 32 * 1024
-
-
-def _normalise_theme_definition(data: Dict[str, Any]) -> Optional[Dict[str, Any]]:
-    """Normalise a user theme YAML into the wire format `ThemeProvider`
-    expects.  Returns ``None`` if the theme is unusable.
-
-    Accepts both the full schema (palette/typography/layout) and a loose
-    form with bare hex strings, so hand-written YAMLs stay friendly.
-    """
-    if not isinstance(data, dict):
-        return None
-    name = data.get("name")
-    if not isinstance(name, str) or not name.strip():
-        return None
-
-    # Palette
-    palette_src = data.get("palette", {}) if isinstance(data.get("palette"), dict) else {}
-    # Allow top-level `colors.background` as a shorthand too.
-    colors_src = data.get("colors", {}) if isinstance(data.get("colors"), dict) else {}
-
-    def _layer(key: str, default_hex: str, default_alpha: float = 1.0) -> Dict[str, Any]:
-        spec = palette_src.get(key, colors_src.get(key))
-        parsed = _parse_theme_layer(spec, default_hex, default_alpha)
-        return parsed if parsed is not None else {"hex": default_hex, "alpha": default_alpha}
-
-    palette = {
-        "background": _layer("background", "#041c1c", 1.0),
-        "midground": _layer("midground", "#ffe6cb", 1.0),
-        "foreground": _layer("foreground", "#ffffff", 0.0),
-        "warmGlow": palette_src.get("warmGlow") or data.get("warmGlow") or "rgba(255, 189, 56, 0.35)",
-        "noiseOpacity": 1.0,
-    }
-    raw_noise = palette_src.get("noiseOpacity", data.get("noiseOpacity"))
-    try:
-        palette["noiseOpacity"] = float(raw_noise) if raw_noise is not None else 1.0
-    except (TypeError, ValueError):
-        palette["noiseOpacity"] = 1.0
-
-    # Typography
-    typo_src = data.get("typography", {}) if isinstance(data.get("typography"), dict) else {}
-    typography = dict(_THEME_DEFAULT_TYPOGRAPHY)
-    for key in ("fontSans", "fontMono", "fontDisplay", "fontUrl", "baseSize", "lineHeight", "letterSpacing"):
-        val = typo_src.get(key)
-        if isinstance(val, str) and val.strip():
-            typography[key] = val
-
-    # Layout
-    layout_src = data.get("layout", {}) if isinstance(data.get("layout"), dict) else {}
-    layout = dict(_THEME_DEFAULT_LAYOUT)
-    radius = layout_src.get("radius")
-    if isinstance(radius, str) and radius.strip():
-        layout["radius"] = radius
-    density = layout_src.get("density")
-    if isinstance(density, str) and density in ("compact", "comfortable", "spacious"):
-        layout["density"] = density
-
-    # Color overrides — keep only valid keys with string values.
-    overrides_src = data.get("colorOverrides", {})
-    color_overrides: Dict[str, str] = {}
-    if isinstance(overrides_src, dict):
-        for key, val in overrides_src.items():
-            if key in _THEME_OVERRIDE_KEYS and isinstance(val, str) and val.strip():
-                color_overrides[key] = val
-
-    # Assets — named slots + arbitrary user-defined keys.  Values must be
-    # strings (URLs or CSS ``url(...)``/``linear-gradient(...)`` expressions).
-    # We don't fetch remote assets here; the frontend just injects them as
-    # CSS vars.  Empty values are dropped so a theme can explicitly clear a
-    # slot by setting ``hero: ""``.
-    assets_out: Dict[str, Any] = {}
-    assets_src = data.get("assets", {}) if isinstance(data.get("assets"), dict) else {}
-    for key in _THEME_NAMED_ASSET_KEYS:
-        val = assets_src.get(key)
-        if isinstance(val, str) and val.strip():
-            assets_out[key] = val
-    custom_assets_src = assets_src.get("custom")
-    if isinstance(custom_assets_src, dict):
-        custom_assets: Dict[str, str] = {}
-        for key, val in custom_assets_src.items():
-            if (
-                isinstance(key, str)
-                and key.replace("-", "").replace("_", "").isalnum()
-                and isinstance(val, str)
-                and val.strip()
-            ):
-                custom_assets[key] = val
-        if custom_assets:
-            assets_out["custom"] = custom_assets
-
-    # Custom CSS — raw CSS text the frontend injects as a scoped <style>
-    # tag on theme apply.  Clipped to _THEME_CUSTOM_CSS_MAX to keep the
-    # payload bounded.  We intentionally do NOT parse/sanitise the CSS
-    # here — the dashboard is localhost-only and themes are user-authored
-    # YAML in ~/.hermes/, same trust level as the config file itself.
-    custom_css_val = data.get("customCSS")
-    custom_css: Optional[str] = None
-    if isinstance(custom_css_val, str) and custom_css_val.strip():
-        custom_css = custom_css_val[:_THEME_CUSTOM_CSS_MAX]
-
-    # Component style overrides — per-bucket dicts of camelCase CSS
-    # property -> CSS string.  The frontend converts these into CSS vars
-    # that shell components (Card, App header, Backdrop) consume.
-    component_styles_src = data.get("componentStyles", {})
-    component_styles: Dict[str, Dict[str, str]] = {}
-    if isinstance(component_styles_src, dict):
-        for bucket, props in component_styles_src.items():
-            if bucket not in _THEME_COMPONENT_BUCKETS or not isinstance(props, dict):
-                continue
-            clean: Dict[str, str] = {}
-            for prop, value in props.items():
-                if (
-                    isinstance(prop, str)
-                    and prop.replace("-", "").replace("_", "").isalnum()
-                    and isinstance(value, (str, int, float))
-                    and str(value).strip()
-                ):
-                    clean[prop] = str(value)
-            if clean:
-                component_styles[bucket] = clean
-
-    layout_variant_src = data.get("layoutVariant")
-    layout_variant = (
-        layout_variant_src
-        if isinstance(layout_variant_src, str) and layout_variant_src in _THEME_LAYOUT_VARIANTS
-        else "standard"
-    )
-
-    result: Dict[str, Any] = {
-        "name": name,
-        "label": data.get("label") or name,
-        "description": data.get("description", ""),
-        "palette": palette,
-        "typography": typography,
-        "layout": layout,
-        "layoutVariant": layout_variant,
-    }
-    if color_overrides:
-        result["colorOverrides"] = color_overrides
-    if assets_out:
-        result["assets"] = assets_out
-    if custom_css is not None:
-        result["customCSS"] = custom_css
-    if component_styles:
-        result["componentStyles"] = component_styles
-    return result
-
-
 def _discover_user_themes() -> list:
-    """Scan ~/.hermes/dashboard-themes/*.yaml for user-created themes.
-
-    Returns a list of fully-normalised theme definitions ready to ship
-    to the frontend, so the client can apply them without a secondary
-    round-trip or a built-in stub.
-    """
+    """Scan ~/.hermes/dashboard-themes/*.yaml for user-created themes."""
    themes_dir = get_hermes_home() / "dashboard-themes"
    if not themes_dir.is_dir():
        return []
@@ -2553,42 +2455,33 @@ def _discover_user_themes() -> list:
    for f in sorted(themes_dir.glob("*.yaml")):
        try:
            data = yaml.safe_load(f.read_text(encoding="utf-8"))
+            if isinstance(data, dict) and data.get("name"):
+                result.append({
+                    "name": data["name"],
+                    "label": data.get("label", data["name"]),
+                    "description": data.get("description", ""),
+                })
        except Exception:
            continue
-        normalised = _normalise_theme_definition(data)
-        if normalised is not None:
-            result.append(normalised)
    return result


@app.get("/api/dashboard/themes")
 async def get_dashboard_themes():
-    """Return available themes and the currently active one.
-
-    Built-in entries ship name/label/description only (the frontend owns
-    their full definitions in `web/src/themes/presets.ts`).  User themes
-    from `~/.hermes/dashboard-themes/*.yaml` ship with their full
-    normalised definition under `definition`, so the client can apply
-    them without a stub.
-    """
+    """Return available themes and the currently active one."""
    config = load_config()
    active = config.get("dashboard", {}).get("theme", "default")
    user_themes = _discover_user_themes()
+    # Merge built-in + user; on a name clash the built-in wins and the user theme is skipped.
    seen = set()
    themes = []
    for t in _BUILTIN_DASHBOARD_THEMES:
        seen.add(t["name"])
        themes.append(t)
    for t in user_themes:
-        if t["name"] in seen:
-            continue
-        themes.append({
-            "name": t["name"],
-            "label": t["label"],
-            "description": t["description"],
-            "definition": t,
-        })
-        seen.add(t["name"])
+        if t["name"] not in seen:
+            themes.append(t)
+            seen.add(t["name"])
    return {"themes": themes, "active": active}
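The merge order in `get_dashboard_themes` can be sketched as a small pure function. This is an illustration under assumptions: `merge_themes` is a hypothetical name, themes are plain dicts with a `name` key, and unlike the handler it also de-duplicates within the built-in list.

```python
from typing import Dict, List


def merge_themes(
    builtin: List[Dict[str, str]], user: List[Dict[str, str]]
) -> List[Dict[str, str]]:
    """Concatenate built-in and user themes, keeping the first entry per name.

    Built-ins are visited first, so a user theme whose name collides with a
    built-in is skipped rather than replacing it.
    """
    seen = set()
    merged = []
    for theme in list(builtin) + list(user):
        if theme["name"] in seen:
            continue
        seen.add(theme["name"])
        merged.append(theme)
    return merged
```

For example, merging a built-in `default` with user themes `default` and `neon` yields the built-in `default` followed by `neon`.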


@@ -2645,35 +2538,13 @@ def _discover_dashboard_plugins() -> list:
                if name in seen_names:
                    continue
                seen_names.add(name)
-                # Tab options: ``path`` + ``position`` for a new tab, optional
-                # ``override`` to replace a built-in route, and ``hidden`` to
-                # register the plugin component/slots without adding a tab
-                # (useful for slot-only plugins like a header-crest injector).
-                raw_tab = data.get("tab", {}) if isinstance(data.get("tab"), dict) else {}
-                tab_info = {
-                    "path": raw_tab.get("path", f"/{name}"),
-                    "position": raw_tab.get("position", "end"),
-                }
-                override_path = raw_tab.get("override")
-                if isinstance(override_path, str) and override_path.startswith("/"):
-                    tab_info["override"] = override_path
-                if bool(raw_tab.get("hidden")):
-                    tab_info["hidden"] = True
-                # Slots: list of named slot locations this plugin populates.
-                # The frontend exposes ``registerSlot(pluginName, slotName, Component)``
-                # on window; plugins with non-empty slots call it from their JS bundle.
-                slots_src = data.get("slots")
-                slots: List[str] = []
-                if isinstance(slots_src, list):
-                    slots = [s for s in slots_src if isinstance(s, str) and s]
                plugins.append({
                    "name": name,
                    "label": data.get("label", name),
                    "description": data.get("description", ""),
                    "icon": data.get("icon", "Puzzle"),
                    "version": data.get("version", "0.0.0"),
-                    "tab": tab_info,
-                    "slots": slots,
+                    "tab": data.get("tab", {"path": f"/{name}", "position": "end"}),
                    "entry": data.get("entry", "dist/index.js"),
                    "css": data.get("css"),
                    "has_api": bool(data.get("api")),
@@ -288,34 +288,30 @@ def get_tool_definitions(
                filtered_tools[i] = {"type": "function", "function": dynamic_schema}
                break

-    # Rebuild discord / discord_admin schemas based on the bot's privileged
-    # intents (detected from GET /applications/@me) and the user's action
-    # allowlist in config.  Hides actions the bot's intents don't support so
-    # the model never attempts them, and annotates fetch_messages when the
+    # Rebuild discord_server schema based on the bot's privileged intents
+    # (detected from GET /applications/@me) and the user's action allowlist
+    # in config.  Hides actions the bot's intents don't support so the
+    # model never attempts them, and annotates fetch_messages when the
    # MESSAGE_CONTENT intent is missing.
-    _discord_schema_fns = {
-        "discord": "get_dynamic_schema_core",
-        "discord_admin": "get_dynamic_schema_admin",
-    }
-    for discord_tool_name in _discord_schema_fns:
-        if discord_tool_name in available_tool_names:
-            try:
-                from tools import discord_tool as _dt
-                schema_fn = getattr(_dt, _discord_schema_fns[discord_tool_name])
-                dynamic = schema_fn()
-            except Exception:
-                dynamic = None
-            if dynamic is None:
-                filtered_tools = [
-                    t for t in filtered_tools
-                    if t.get("function", {}).get("name") != discord_tool_name
-                ]
-                available_tool_names.discard(discord_tool_name)
-            else:
-                for i, td in enumerate(filtered_tools):
-                    if td.get("function", {}).get("name") == discord_tool_name:
-                        filtered_tools[i] = {"type": "function", "function": dynamic}
-                        break
+    if "discord_server" in available_tool_names:
+        try:
+            from tools.discord_tool import get_dynamic_schema
+            dynamic = get_dynamic_schema()
+        except Exception:  # pragma: no cover — defensive, fall back to static
+            dynamic = None
+        if dynamic is None:
+            # Tool filtered out entirely (empty allowlist or detection disabled
+            # the only remaining actions).  Drop it from the schema list.
+            filtered_tools = [
+                t for t in filtered_tools
+                if t.get("function", {}).get("name") != "discord_server"
+            ]
+            available_tool_names.discard("discord_server")
+        else:
+            for i, td in enumerate(filtered_tools):
+                if td.get("function", {}).get("name") == "discord_server":
+                    filtered_tools[i] = {"type": "function", "function": dynamic}
+                    break

    # Strip web tool cross-references from browser_navigate description when
    # web_search / web_extract are not available.  The static schema says
@@ -422,31 +418,6 @@ def _coerce_value(value: str, expected_type):
        return _coerce_number(value, integer_only=(expected_type == "integer"))
    if expected_type == "boolean":
        return _coerce_boolean(value)
-    if expected_type == "array":
-        return _coerce_json(value, list)
-    if expected_type == "object":
-        return _coerce_json(value, dict)
-    return value
-
-
-def _coerce_json(value: str, expected_python_type: type):
-    """Parse *value* as JSON when the schema expects an array or object.
-
-    Handles model output drift where a complex oneOf/discriminated-union schema
-    causes the LLM to emit the array/object as a JSON string instead of a native
-    structure.  Returns the original string if parsing fails or yields the wrong
-    Python type.
-    """
-    try:
-        parsed = json.loads(value)
-    except (ValueError, TypeError):
-        return value
-    if isinstance(parsed, expected_python_type):
-        logger.debug(
-            "coerce_tool_args: coerced string to %s via json.loads",
-            expected_python_type.__name__,
-        )
-        return parsed
    return value


@@ -777,10 +777,7 @@ HERMES_NIX_ENV_EOF
            NoNewPrivileges = true;
            ProtectSystem = "strict";
            ProtectHome = false;
-            ReadWritePaths = [
-              cfg.stateDir
-              cfg.workingDirectory
-            ];
+            ReadWritePaths = [ cfg.stateDir ];
            PrivateTmp = true;
          };

@@ -1,378 +0,0 @@
-"""OpenAI image generation backend — ChatGPT/Codex OAuth variant.
-
-Identical model catalog and tier semantics to the ``openai`` image-gen plugin
-(``gpt-image-2`` at low/medium/high quality), but routes the request through
-the Codex Responses API ``image_generation`` tool instead of the
-``images.generate`` REST endpoint. This lets users who are already
-authenticated with Codex/ChatGPT generate images without configuring a
-separate ``OPENAI_API_KEY``.
-
-Selection precedence for the tier (first hit wins):
-
-1. ``OPENAI_IMAGE_MODEL`` env var (escape hatch for scripts / tests)
-2. ``image_gen.openai-codex.model`` in ``config.yaml``
-3. ``image_gen.model`` in ``config.yaml`` (when it's one of our tier IDs)
-4. :data:`DEFAULT_MODEL` — ``gpt-image-2-medium``
-
-Output is saved as PNG under ``$HERMES_HOME/cache/images/``.
-"""
-
-from __future__ import annotations
-
-import logging
-from typing import Any, Dict, List, Optional, Tuple
-
-from agent.image_gen_provider import (
-    DEFAULT_ASPECT_RATIO,
-    ImageGenProvider,
-    error_response,
-    resolve_aspect_ratio,
-    save_b64_image,
-    success_response,
-)
-
-logger = logging.getLogger(__name__)
-
-
-# ---------------------------------------------------------------------------
-# Model catalog — mirrors the ``openai`` plugin so the picker UX is identical.
-# ---------------------------------------------------------------------------
-
-API_MODEL = "gpt-image-2"
-
-_MODELS: Dict[str, Dict[str, Any]] = {
-    "gpt-image-2-low": {
-        "display": "GPT Image 2 (Low)",
-        "speed": "~15s",
-        "strengths": "Fast iteration, lowest cost",
-        "quality": "low",
-    },
-    "gpt-image-2-medium": {
-        "display": "GPT Image 2 (Medium)",
-        "speed": "~40s",
-        "strengths": "Balanced — default",
-        "quality": "medium",
-    },
-    "gpt-image-2-high": {
-        "display": "GPT Image 2 (High)",
-        "speed": "~2min",
-        "strengths": "Highest fidelity, strongest prompt adherence",
-        "quality": "high",
-    },
-}
-
-DEFAULT_MODEL = "gpt-image-2-medium"
-
-_SIZES = {
-    "landscape": "1536x1024",
-    "square": "1024x1024",
-    "portrait": "1024x1536",
-}
-
-# Codex Responses surface used for the request. The chat model itself is only
-# the host that calls the ``image_generation`` tool; the actual image work is
-# done by ``API_MODEL``.
-_CODEX_CHAT_MODEL = "gpt-5.4"
-_CODEX_BASE_URL = "https://chatgpt.com/backend-api/codex"
-_CODEX_INSTRUCTIONS = (
-    "You are an assistant that must fulfill image generation requests by "
-    "using the image_generation tool when provided."
-)
-
-
-# ---------------------------------------------------------------------------
-# Config + auth helpers
-# ---------------------------------------------------------------------------
-
-
-def _load_image_gen_config() -> Dict[str, Any]:
-    """Read ``image_gen`` from config.yaml (returns {} on any failure)."""
-    try:
-        from hermes_cli.config import load_config
-
-        cfg = load_config()
-        section = cfg.get("image_gen") if isinstance(cfg, dict) else None
-        return section if isinstance(section, dict) else {}
-    except Exception as exc:
-        logger.debug("Could not load image_gen config: %s", exc)
-        return {}
-
-
-def _resolve_model() -> Tuple[str, Dict[str, Any]]:
-    """Decide which tier to use and return ``(model_id, meta)``."""
-    import os
-
-    env_override = os.environ.get("OPENAI_IMAGE_MODEL")
-    if env_override and env_override in _MODELS:
-        return env_override, _MODELS[env_override]
-
-    cfg = _load_image_gen_config()
-    sub = cfg.get("openai-codex") if isinstance(cfg.get("openai-codex"), dict) else {}
-    candidate: Optional[str] = None
-    if isinstance(sub, dict):
-        value = sub.get("model")
-        if isinstance(value, str) and value in _MODELS:
-            candidate = value
-    if candidate is None:
-        top = cfg.get("model")
-        if isinstance(top, str) and top in _MODELS:
-            candidate = top
-
-    if candidate is not None:
-        return candidate, _MODELS[candidate]
-
-    return DEFAULT_MODEL, _MODELS[DEFAULT_MODEL]
-
-
-def _read_codex_access_token() -> Optional[str]:
-    """Return a usable Codex OAuth token, or None.
-
-    Delegates to the canonical reader in ``agent.auxiliary_client`` so token
-    expiry, credential pool selection, and JWT decoding stay in one place.
-    """
-    try:
-        from agent.auxiliary_client import _read_codex_access_token as _reader
-
-        token = _reader()
-        if isinstance(token, str) and token.strip():
-            return token.strip()
-        return None
-    except Exception as exc:
-        logger.debug("Could not resolve Codex access token: %s", exc)
-        return None
-
-
-def _build_codex_client():
-    """Return an OpenAI client pointed at the ChatGPT/Codex backend, or None."""
-    token = _read_codex_access_token()
-    if not token:
-        return None
-    try:
-        import openai
-        from agent.auxiliary_client import _codex_cloudflare_headers
-
-        return openai.OpenAI(
-            api_key=token,
-            base_url=_CODEX_BASE_URL,
-            default_headers=_codex_cloudflare_headers(token),
-        )
-    except Exception as exc:
-        logger.debug("Could not build Codex image client: %s", exc)
-        return None
-
-
-def _collect_image_b64(client: Any, *, prompt: str, size: str, quality: str) -> Optional[str]:
-    """Stream a Codex Responses image_generation call and return the b64 image."""
-    image_b64: Optional[str] = None
-
-    with client.responses.stream(
-        model=_CODEX_CHAT_MODEL,
-        store=False,
-        instructions=_CODEX_INSTRUCTIONS,
-        input=[{
-            "type": "message",
-            "role": "user",
-            "content": [{"type": "input_text", "text": prompt}],
-        }],
-        tools=[{
-            "type": "image_generation",
-            "model": API_MODEL,
-            "size": size,
-            "quality": quality,
-            "output_format": "png",
-            "background": "opaque",
-            "partial_images": 1,
-        }],
-        tool_choice={
-            "type": "allowed_tools",
-            "mode": "required",
-            "tools": [{"type": "image_generation"}],
-        },
-    ) as stream:
-        for event in stream:
-            event_type = getattr(event, "type", "")
-            if event_type == "response.output_item.done":
-                item = getattr(event, "item", None)
-                if getattr(item, "type", None) == "image_generation_call":
-                    result = getattr(item, "result", None)
-                    if isinstance(result, str) and result:
-                        image_b64 = result
-            elif event_type == "response.image_generation_call.partial_image":
-                partial = getattr(event, "partial_image_b64", None)
-                if isinstance(partial, str) and partial:
-                    image_b64 = partial
-        final = stream.get_final_response()
-
-    # Final-response sweep covers the case where the stream finished before
-    # we observed the ``output_item.done`` event for the image call.
-    for item in getattr(final, "output", None) or []:
-        if getattr(item, "type", None) == "image_generation_call":
-            result = getattr(item, "result", None)
-            if isinstance(result, str) and result:
-                image_b64 = result
-
-    return image_b64
-
-
-# ---------------------------------------------------------------------------
-# Provider
-# ---------------------------------------------------------------------------
-
-
-class OpenAICodexImageGenProvider(ImageGenProvider):
-    """gpt-image-2 routed through ChatGPT/Codex OAuth instead of an API key."""
-
-    @property
-    def name(self) -> str:
-        return "openai-codex"
-
-    @property
-    def display_name(self) -> str:
-        return "OpenAI (Codex auth)"
-
-    def is_available(self) -> bool:
-        if not _read_codex_access_token():
-            return False
-        try:
-            import openai  # noqa: F401
-        except ImportError:
-            return False
-        return True
-
-    def list_models(self) -> List[Dict[str, Any]]:
-        return [
-            {
-                "id": model_id,
-                "display": meta["display"],
-                "speed": meta["speed"],
-                "strengths": meta["strengths"],
-                "price": "varies",
-            }
-            for model_id, meta in _MODELS.items()
-        ]
-
-    def default_model(self) -> Optional[str]:
-        return DEFAULT_MODEL
-
-    def get_setup_schema(self) -> Dict[str, Any]:
-        return {
-            "name": "OpenAI (Codex auth)",
-            "badge": "free",
-            "tag": "gpt-image-2 via ChatGPT/Codex OAuth — no API key required",
-            "env_vars": [],
-            "post_setup_hint": (
-                "Sign in with `hermes auth codex` (or `hermes setup` → Codex) "
-                "if you haven't already. No API key needed."
-            ),
-        }
-
-    def generate(
-        self,
-        prompt: str,
-        aspect_ratio: str = DEFAULT_ASPECT_RATIO,
-        **kwargs: Any,
-    ) -> Dict[str, Any]:
-        prompt = (prompt or "").strip()
-        aspect = resolve_aspect_ratio(aspect_ratio)
-
-        if not prompt:
-            return error_response(
-                error="Prompt is required and must be a non-empty string",
-                error_type="invalid_argument",
-                provider="openai-codex",
-                aspect_ratio=aspect,
-            )
-
-        if not _read_codex_access_token():
-            return error_response(
-                error=(
-                    "No Codex/ChatGPT OAuth credentials available. Run "
-                    "`hermes auth codex` (or `hermes setup` → Codex) to sign in."
-                ),
-                error_type="auth_required",
-                provider="openai-codex",
-                aspect_ratio=aspect,
-            )
-
-        try:
-            import openai  # noqa: F401
-        except ImportError:
-            return error_response(
-                error="openai Python package not installed (pip install openai)",
-                error_type="missing_dependency",
-                provider="openai-codex",
-                aspect_ratio=aspect,
-            )
-
-        tier_id, meta = _resolve_model()
-        size = _SIZES.get(aspect, _SIZES["square"])
-
-        client = _build_codex_client()
-        if client is None:
-            return error_response(
-                error="Could not initialize Codex image client",
-                error_type="auth_required",
-                provider="openai-codex",
-                model=tier_id,
-                prompt=prompt,
-                aspect_ratio=aspect,
-            )
-
-        try:
-            b64 = _collect_image_b64(
-                client,
-                prompt=prompt,
-                size=size,
-                quality=meta["quality"],
-            )
-        except Exception as exc:
-            logger.debug("Codex image generation failed", exc_info=True)
-            return error_response(
-                error=f"OpenAI image generation via Codex auth failed: {exc}",
-                error_type="api_error",
-                provider="openai-codex",
-                model=tier_id,
-                prompt=prompt,
-                aspect_ratio=aspect,
-            )
-
-        if not b64:
-            return error_response(
-                error="Codex response contained no image_generation_call result",
-                error_type="empty_response",
-                provider="openai-codex",
-                model=tier_id,
-                prompt=prompt,
-                aspect_ratio=aspect,
-            )
-
-        try:
-            saved_path = save_b64_image(b64, prefix=f"openai_codex_{tier_id}")
-        except Exception as exc:
-            return error_response(
-                error=f"Could not save image to cache: {exc}",
-                error_type="io_error",
-                provider="openai-codex",
-                model=tier_id,
-                prompt=prompt,
-                aspect_ratio=aspect,
-            )
-
-        return success_response(
-            image=str(saved_path),
-            model=tier_id,
-            prompt=prompt,
-            aspect_ratio=aspect,
-            provider="openai-codex",
-            extra={"size": size, "quality": meta["quality"]},
-        )
-
-
-# ---------------------------------------------------------------------------
-# Plugin entry point
-# ---------------------------------------------------------------------------
-
-
-def register(ctx) -> None:
-    """Plugin entry point — register the Codex-backed image-gen provider."""
-    ctx.register_image_gen_provider(OpenAICodexImageGenProvider())
@@ -1,5 +0,0 @@
-name: openai-codex
-version: 1.0.0
-description: "OpenAI image generation backed by ChatGPT/Codex OAuth (gpt-image-2 via the Responses image_generation tool). Saves generated images to $HERMES_HOME/cache/images/."
-author: NousResearch
-kind: backend
@@ -1,313 +0,0 @@
-"""xAI image generation backend.
-
-Exposes xAI's ``grok-imagine-image`` model as an
-:class:`ImageGenProvider` implementation.
-
-Features:
-- Text-to-image generation
-- Multiple aspect ratios (1:1, 16:9, 9:16, etc.)
-- Multiple resolutions (1K, 2K)
-- Base64 output saved to cache
-
-Selection precedence (first hit wins):
-1. ``XAI_IMAGE_MODEL`` env var
-2. ``image_gen.xai.model`` in ``config.yaml``
-3. :data:`DEFAULT_MODEL`
-"""
-
-from __future__ import annotations
-
-import logging
-import os
-from typing import Any, Dict, List, Optional, Tuple
-
-import requests
-
-from agent.image_gen_provider import (
-    DEFAULT_ASPECT_RATIO,
-    ImageGenProvider,
-    error_response,
-    resolve_aspect_ratio,
-    save_b64_image,
-    success_response,
-)
-from tools.xai_http import hermes_xai_user_agent
-
-logger = logging.getLogger(__name__)
-
-# ---------------------------------------------------------------------------
-# Model catalog
-# ---------------------------------------------------------------------------
-
-API_MODEL = "grok-imagine-image"
-
-_MODELS: Dict[str, Dict[str, Any]] = {
-    "grok-imagine-image": {
-        "display": "Grok Imagine Image",
-        "speed": "~5-10s",
-        "strengths": "Fast, high-quality",
-    },
-}
-
-DEFAULT_MODEL = "grok-imagine-image"
-
-# xAI aspect ratios (more options than FAL/OpenAI)
-_XAI_ASPECT_RATIOS = {
-    "landscape": "16:9",
-    "square": "1:1",
-    "portrait": "9:16",
-    "4:3": "4:3",
-    "3:4": "3:4",
-    "3:2": "3:2",
-    "2:3": "2:3",
-}
-
-# xAI resolutions
-_XAI_RESOLUTIONS = {
-    "1k": "1024",
-    "2k": "2048",
-}
-
-DEFAULT_RESOLUTION = "1k"
-
-
-# ---------------------------------------------------------------------------
-# Config
-# ---------------------------------------------------------------------------
-
-
-def _load_xai_config() -> Dict[str, Any]:
-    """Read ``image_gen.xai`` from config.yaml."""
-    try:
-        from hermes_cli.config import load_config
-
-        cfg = load_config()
-        section = cfg.get("image_gen") if isinstance(cfg, dict) else None
-        xai_section = section.get("xai") if isinstance(section, dict) else None
-        return xai_section if isinstance(xai_section, dict) else {}
-    except Exception as exc:
-        logger.debug("Could not load image_gen.xai config: %s", exc)
-        return {}
-
-
-def _resolve_model() -> Tuple[str, Dict[str, Any]]:
-    """Decide which model to use and return ``(model_id, meta)``."""
-    env_override = os.environ.get("XAI_IMAGE_MODEL")
-    if env_override and env_override in _MODELS:
-        return env_override, _MODELS[env_override]
-
-    cfg = _load_xai_config()
-    candidate = cfg.get("model") if isinstance(cfg.get("model"), str) else None
-    if candidate and candidate in _MODELS:
-        return candidate, _MODELS[candidate]
-
-    return DEFAULT_MODEL, _MODELS[DEFAULT_MODEL]
-
-
-def _resolve_resolution() -> str:
-    """Get configured resolution."""
-    cfg = _load_xai_config()
-    res = cfg.get("resolution") if isinstance(cfg.get("resolution"), str) else None
-    if res and res in _XAI_RESOLUTIONS:
-        return res
-    return DEFAULT_RESOLUTION
-
-
-# ---------------------------------------------------------------------------
-# Provider
-# ---------------------------------------------------------------------------
-
-
-class XAIImageGenProvider(ImageGenProvider):
-    """xAI ``grok-imagine-image`` backend."""
-
-    @property
-    def name(self) -> str:
-        return "xai"
-
-    @property
-    def display_name(self) -> str:
-        return "xAI (Grok)"
-
-    def is_available(self) -> bool:
-        return bool(os.getenv("XAI_API_KEY"))
-
-    def list_models(self) -> List[Dict[str, Any]]:
-        return [
-            {
-                "id": model_id,
-                "display": meta.get("display", model_id),
-                "speed": meta.get("speed", ""),
-                "strengths": meta.get("strengths", ""),
-            }
-            for model_id, meta in _MODELS.items()
-        ]
-
-    def get_setup_schema(self) -> Dict[str, Any]:
-        return {
-            "name": "xAI (Grok)",
-            "badge": "paid",
-            "tag": "Native xAI image generation via grok-imagine-image",
-            "env_vars": [
-                {
-                    "key": "XAI_API_KEY",
-                    "prompt": "xAI API key",
-                    "url": "https://console.x.ai/",
-                },
-            ],
-        }
-
-    def generate(
-        self,
-        prompt: str,
-        aspect_ratio: str = DEFAULT_ASPECT_RATIO,
-        **kwargs: Any,
-    ) -> Dict[str, Any]:
-        """Generate an image using xAI's grok-imagine-image."""
-        api_key = os.getenv("XAI_API_KEY", "").strip()
-        if not api_key:
-            return error_response(
-                error="XAI_API_KEY not set. Get one at https://console.x.ai/",
-                error_type="missing_api_key",
-                provider="xai",
-                aspect_ratio=aspect_ratio,
-            )
-
-        model_id, meta = _resolve_model()
-        aspect = resolve_aspect_ratio(aspect_ratio)
-        xai_ar = _XAI_ASPECT_RATIOS.get(aspect, "1:1")
-        resolution = _resolve_resolution()
-        xai_res = _XAI_RESOLUTIONS.get(resolution, "1024")
-
-        payload: Dict[str, Any] = {
-            "model": API_MODEL,
-            "prompt": prompt,
-            "aspect_ratio": xai_ar,
-            "resolution": xai_res,
-        }
-
-        headers = {
-            "Authorization": f"Bearer {api_key}",
-            "Content-Type": "application/json",
-            "User-Agent": hermes_xai_user_agent(),
-        }
-
-        base_url = (os.getenv("XAI_BASE_URL") or "https://api.x.ai/v1").strip().rstrip("/")
-
-        try:
-            response = requests.post(
-                f"{base_url}/images/generations",
-                headers=headers,
-                json=payload,
-                timeout=120,
-            )
-            response.raise_for_status()
-        except requests.HTTPError as exc:
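-            # Response.__bool__ mirrors ``.ok``, so 4xx/5xx responses are
-            # falsy; compare against None to keep the real status code.
-            status = exc.response.status_code if exc.response is not None else 0
-            try:
-                err_msg = exc.response.json().get("error", {}).get("message", exc.response.text[:300])
-            except Exception:
-                err_msg = exc.response.text[:300] if exc.response is not None else str(exc)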
-            logger.error("xAI image gen failed (%d): %s", status, err_msg)
-            return error_response(
-                error=f"xAI image generation failed ({status}): {err_msg}",
-                error_type="api_error",
-                provider="xai",
-                model=model_id,
-                prompt=prompt,
-                aspect_ratio=aspect,
-            )
-        except requests.Timeout:
-            return error_response(
-                error="xAI image generation timed out (120s)",
-                error_type="timeout",
-                provider="xai",
-                model=model_id,
-                prompt=prompt,
-                aspect_ratio=aspect,
-            )
-        except requests.ConnectionError as exc:
-            return error_response(
-                error=f"xAI connection error: {exc}",
-                error_type="connection_error",
-                provider="xai",
-                model=model_id,
-                prompt=prompt,
-                aspect_ratio=aspect,
-            )
-
-        try:
-            result = response.json()
-        except Exception as exc:
-            return error_response(
-                error=f"xAI returned invalid JSON: {exc}",
-                error_type="invalid_response",
-                provider="xai",
-                model=model_id,
-                prompt=prompt,
-                aspect_ratio=aspect,
-            )
-
-        # Parse response — xAI returns data[0].b64_json or data[0].url
-        data = result.get("data", [])
-        if not data:
-            return error_response(
-                error="xAI returned no image data",
-                error_type="empty_response",
-                provider="xai",
-                model=model_id,
-                prompt=prompt,
-                aspect_ratio=aspect,
-            )
-
-        first = data[0]
-        b64 = first.get("b64_json")
-        url = first.get("url")
-
-        if b64:
-            try:
-                saved_path = save_b64_image(b64, prefix=f"xai_{model_id}")
-            except Exception as exc:
-                return error_response(
-                    error=f"Could not save image to cache: {exc}",
-                    error_type="io_error",
-                    provider="xai",
-                    model=model_id,
-                    prompt=prompt,
-                    aspect_ratio=aspect,
-                )
-            image_ref = str(saved_path)
-        elif url:
-            image_ref = url
-        else:
-            return error_response(
-                error="xAI response contained neither b64_json nor URL",
-                error_type="empty_response",
-                provider="xai",
-                model=model_id,
-                prompt=prompt,
-                aspect_ratio=aspect,
-            )
-
-        extra: Dict[str, Any] = {
-            "resolution": xai_res,
-        }
-
-        return success_response(
-            image=image_ref,
-            model=model_id,
-            prompt=prompt,
-            aspect_ratio=aspect,
-            provider="xai",
-            extra=extra,
-        )
-
-
-# ---------------------------------------------------------------------------
-# Plugin registration
-# ---------------------------------------------------------------------------
-
-
-def register(ctx: Any) -> None:
-    """Register this provider with the image gen registry."""
-    ctx.register_image_gen_provider(XAIImageGenProvider())
@@ -1,7 +0,0 @@
-name: xai
-version: 1.0.0
-description: "xAI image generation backend (grok-imagine-image). Text-to-image."
-author: Julien Talbot
-kind: backend
-requires_env:
-  - XAI_API_KEY
@@ -1,70 +0,0 @@
-# Strike Freedom Cockpit — dashboard skin demo
-
-Demonstrates how the dashboard skin+plugin system can be used to build a
-fully custom cockpit-style reskin without touching the core dashboard.
-
-Two pieces:
-
-- `theme/strike-freedom.yaml`: a dashboard theme YAML that paints the
-  palette, typography, layout variant (`cockpit`), component chrome
-  (notched card corners, scanlines, accent colors), and declares asset
-  slots (`hero`, `crest`, `bg`).
-- `dashboard/`: a plugin that populates the `sidebar`, `header-left`,
-  and `footer-right` slots reserved by the cockpit layout. The sidebar
-  renders an MS-STATUS panel with segmented telemetry bars driven by
-  real agent status; the header-left injects a COMPASS crest; the
-  footer-right replaces the default org tagline.
-
-## Install
-
-1. **Theme** — copy the theme YAML into your Hermes home:
-
-   ```
-   cp theme/strike-freedom.yaml ~/.hermes/dashboard-themes/
-   ```
-
-2. **Plugin** — the `dashboard/` directory gets auto-discovered because
-   it lives under `plugins/` in the repo. On a user install, copy the
-   whole plugin directory into `~/.hermes/plugins/`:
-
-   ```
-   cp -r . ~/.hermes/plugins/strike-freedom-cockpit
-   ```
-
-3. Restart the web UI (or `GET /api/dashboard/plugins/rescan`), open it,
-   pick **Strike Freedom** from the theme switcher.
-
-## Customising the artwork
-
-The sidebar plugin reads `--theme-asset-hero` and `--theme-asset-crest`
-from the active theme. Drop your own URLs into the theme YAML:
-
-```yaml
-assets:
-  hero: "/my-images/strike-freedom.png"
-  crest: "/my-images/compass-crest.svg"
-  bg: "/my-images/cosmic-era-bg.jpg"
-```
-
-The plugin reads those at render time — no plugin code changes needed
-to swap artwork across themes.
-
-## What this demo proves
-
-The dashboard skin+plugin system supports (ref: `web/src/themes/types.ts`,
-`web/src/plugins/slots.ts`):
-
-- Palette, typography, font URLs, density, radius (already present)
-- **Asset URLs exposed as CSS vars** (bg / hero / crest / logo /
-  sidebar / header + arbitrary `custom.*`)
-- **Raw `customCSS` blocks** injected as scoped `<style>` tags
-- **Per-component style overrides** (card / header / sidebar / backdrop /
-  tab / progress / footer / badge / page) via CSS vars
-- **`layoutVariant`**: `standard`, `cockpit`, or `tiled`
-- **Plugin slots**: 10 named shell slots plugins can inject into
-  (`backdrop`, `header-left/right/banner`, `sidebar`, `pre-main`,
-  `post-main`, `footer-left/right`, `overlay`)
-- **Route overrides**: plugins can replace a built-in page entirely
-  (`tab.override: "/"`) instead of just adding a tab
-- **Hidden plugins**: slot-only plugins that never show in the nav
-  (`tab.hidden: true`), as used here
@@ -1,309 +0,0 @@
-/**
- * Strike Freedom Cockpit — dashboard plugin demo.
- *
- * A slot-only plugin (manifest sets tab.hidden: true) that populates
- * three shell slots when the user has the ``strike-freedom`` theme
- * selected (or any theme that picks layoutVariant: cockpit):
- *
- *   - sidebar       → MS-STATUS panel: ENERGY / SHIELD / POWER bars,
- *                     ZGMF-X20A identity line, pilot block, hero
- *                     render (from --theme-asset-hero when the theme
- *                     provides one).
- *   - header-left   → COMPASS faction crest (uses --theme-asset-crest
- *                     if provided, falls back to a geometric SVG).
- *   - footer-right  → COSMIC ERA tagline that replaces the default
- *                     footer org line.
- *
- * The plugin demonstrates every extension point added alongside the
- * slot system: registerSlot, tab.hidden, reading theme asset CSS vars
- * from plugin code, and rendering above the built-in route content.
- */
-(function () {
-  "use strict";
-
-  const SDK = window.__HERMES_PLUGIN_SDK__;
-  const PLUGINS = window.__HERMES_PLUGINS__;
-  if (!SDK || !PLUGINS || !PLUGINS.registerSlot) {
-    // Old dashboard bundle without slot support — bail silently rather
-    // than breaking the page.
-    return;
-  }
-
-  const { React } = SDK;
-  const { useState, useEffect } = SDK.hooks;
-  const { api } = SDK;
-
-  // ---------------------------------------------------------------------
-  // Helpers
-  // ---------------------------------------------------------------------
-
-  /** Read a CSS custom property from :root. Empty string when unset. */
-  function cssVar(name) {
-    if (typeof document === "undefined") return "";
-    return getComputedStyle(document.documentElement).getPropertyValue(name).trim();
-  }
-
-  /** Segmented chip progress bar — 10 cells filled proportionally to value. */
-  function TelemetryBar(props) {
-    const { label, value, color } = props;
-    const cells = [];
-    for (let i = 0; i < 10; i++) {
-      const filled = Math.round(value / 10) > i;
-      cells.push(
-        React.createElement("span", {
-          key: i,
-          style: {
-            flex: 1,
-            height: 8,
-            background: filled ? color : "rgba(255,255,255,0.06)",
-            transition: "background 200ms",
-            clipPath: "polygon(2px 0, 100% 0, calc(100% - 2px) 100%, 0 100%)",
-          },
-        }),
-      );
-    }
-    return React.createElement(
-      "div",
-      { style: { display: "flex", flexDirection: "column", gap: 4 } },
-      React.createElement(
-        "div",
-        {
-          style: {
-            display: "flex",
-            justifyContent: "space-between",
-            fontSize: "0.65rem",
-            letterSpacing: "0.12em",
-            opacity: 0.75,
-          },
-        },
-        React.createElement("span", null, label),
-        React.createElement("span", { style: { color, fontWeight: 700 } }, value + "%"),
-      ),
-      React.createElement(
-        "div",
-        { style: { display: "flex", gap: 2 } },
-        cells,
-      ),
-    );
-  }
-
-  // ---------------------------------------------------------------------
-  // Sidebar: MS-STATUS panel
-  // ---------------------------------------------------------------------
-
-  function SidebarSlot() {
-    // Pull live-ish numbers from the status API so the plugin isn't just
-    // a static decoration. Until the API responds (or if it fails), the
-    // bars show the placeholder values hard-coded below.
-    const [status, setStatus] = useState(null);
-    useEffect(function () {
-      let cancel = false;
-      api.getStatus()
-        .then(function (s) { if (!cancel) setStatus(s); })
-        .catch(function () {});
-      return function () { cancel = true; };
-    }, []);
-
-    // Map real status signals to HUD telemetry. Energy/shield/power
-    // aren't literal concepts on a software agent, so we derive them from
-    // adjacent signals: energy from gateway-online health, shield from
-    // the number of connected platforms, and power from active sessions.
-    const energy = status && status.gateway_online ? 92 : 18;
-    const shield = status && status.connected_platforms
-      ? Math.min(100, 40 + (status.connected_platforms.length * 15))
-      : 70;
-    const power = status && status.active_sessions
-      ? Math.min(100, 55 + (status.active_sessions.length * 10))
-      : 87;
-
-    const hero = cssVar("--theme-asset-hero");
-
-    return React.createElement(
-      "div",
-      {
-        style: {
-          padding: "1rem 0.75rem",
-          display: "flex",
-          flexDirection: "column",
-          gap: "1rem",
-          fontFamily: "var(--theme-font-display, sans-serif)",
-          letterSpacing: "0.08em",
-          textTransform: "uppercase",
-          fontSize: "0.65rem",
-        },
-      },
-      // Header line
-      React.createElement(
-        "div",
-        {
-          style: {
-            borderBottom: "1px solid rgba(64,200,255,0.3)",
-            paddingBottom: 8,
-            display: "flex",
-            flexDirection: "column",
-            gap: 2,
-          },
-        },
-        React.createElement("span", { style: { opacity: 0.6 } }, "ms status"),
-        React.createElement("span", { style: { fontWeight: 700, fontSize: "0.85rem" } }, "zgmf-x20a"),
-        React.createElement("span", { style: { opacity: 0.6, fontSize: "0.6rem" } }, "strike freedom"),
-      ),
-      // Hero slot — only renders when the theme provides one.
-      hero
-        ? React.createElement("div", {
-            style: {
-              width: "100%",
-              aspectRatio: "3 / 4",
-              backgroundImage: hero,
-              backgroundSize: "contain",
-              backgroundPosition: "center",
-              backgroundRepeat: "no-repeat",
-              opacity: 0.85,
-            },
-            "aria-hidden": true,
-          })
-        : React.createElement("div", {
-            style: {
-              width: "100%",
-              aspectRatio: "3 / 4",
-              border: "1px dashed rgba(64,200,255,0.25)",
-              display: "flex",
-              alignItems: "center",
-              justifyContent: "center",
-              fontSize: "0.55rem",
-              opacity: 0.4,
-            },
-          }, "hero slot — set assets.hero in theme"),
-      // Pilot block
-      React.createElement(
-        "div",
-        {
-          style: {
-            borderTop: "1px solid rgba(64,200,255,0.18)",
-            borderBottom: "1px solid rgba(64,200,255,0.18)",
-            padding: "8px 0",
-            display: "flex",
-            flexDirection: "column",
-            gap: 2,
-          },
-        },
-        React.createElement("span", { style: { opacity: 0.5, fontSize: "0.55rem" } }, "pilot"),
-        React.createElement("span", { style: { fontWeight: 700 } }, "hermes agent"),
-        React.createElement("span", { style: { opacity: 0.5, fontSize: "0.55rem" } }, "compass"),
-      ),
-      // Telemetry bars
-      React.createElement(TelemetryBar, { label: "energy",  value: energy, color: "#ffce3a" }),
-      React.createElement(TelemetryBar, { label: "shield",  value: shield, color: "#3fd3ff" }),
-      React.createElement(TelemetryBar, { label: "power",   value: power,  color: "#ff3a5e" }),
-      // System online
-      React.createElement(
-        "div",
-        {
-          style: {
-            marginTop: 4,
-            padding: "6px 8px",
-            border: "1px solid rgba(74,222,128,0.4)",
-            color: "#4ade80",
-            textAlign: "center",
-            fontWeight: 700,
-            fontSize: "0.6rem",
-          },
-        },
-        status && status.gateway_online ? "system online" : "system offline",
-      ),
-    );
-  }
-
-  // ---------------------------------------------------------------------
-  // Header-left: COMPASS crest
-  // ---------------------------------------------------------------------
-
-  function HeaderCrestSlot() {
-    const crest = cssVar("--theme-asset-crest");
-    const inner = crest
-      ? React.createElement("div", {
-          style: {
-            width: 28,
-            height: 28,
-            backgroundImage: crest,
-            backgroundSize: "contain",
-            backgroundPosition: "center",
-            backgroundRepeat: "no-repeat",
-          },
-          "aria-hidden": true,
-        })
-      : React.createElement(
-          "svg",
-          {
-            width: 28,
-            height: 28,
-            viewBox: "0 0 28 28",
-            fill: "none",
-            stroke: "currentColor",
-            strokeWidth: 1.5,
-            "aria-hidden": true,
-          },
-          React.createElement("path", { d: "M14 2 L26 14 L14 26 L2 14 Z" }),
-          React.createElement("path", { d: "M14 8 L20 14 L14 20 L8 14 Z" }),
-          React.createElement("circle", { cx: 14, cy: 14, r: 2, fill: "currentColor" }),
-        );
-    return React.createElement(
-      "div",
-      {
-        style: {
-          display: "flex",
-          alignItems: "center",
-          paddingLeft: 12,
-          paddingRight: 8,
-          color: "var(--color-accent, #3fd3ff)",
-        },
-      },
-      inner,
-    );
-  }
-
-  // ---------------------------------------------------------------------
-  // Footer-right: COSMIC ERA tagline
-  // ---------------------------------------------------------------------
-
-  function FooterTaglineSlot() {
-    return React.createElement(
-      "span",
-      {
-        style: {
-          fontFamily: "var(--theme-font-display, sans-serif)",
-          fontSize: "0.6rem",
-          letterSpacing: "0.18em",
-          textTransform: "uppercase",
-          opacity: 0.75,
-          mixBlendMode: "plus-lighter",
-        },
-      },
-      "compass hermes systems / cosmic era 71",
-    );
-  }
-
-  // ---------------------------------------------------------------------
-  // Hidden tab placeholder — tab.hidden=true means this never renders in
-  // the nav, but we still register something sensible in case someone
-  // manually navigates to /strike-freedom-cockpit (e.g. via a bookmark).
-  // ---------------------------------------------------------------------
-
-  function HiddenPage() {
-    return React.createElement(
-      "div",
-      { style: { padding: "2rem", opacity: 0.6, fontSize: "0.8rem" } },
-      "Strike Freedom cockpit is a slot-only plugin — it populates the sidebar, header, and footer instead of showing a tab page.",
-    );
-  }
-
-  // ---------------------------------------------------------------------
-  // Registration
-  // ---------------------------------------------------------------------
-
-  const NAME = "strike-freedom-cockpit";
-  PLUGINS.register(NAME, HiddenPage);
-  PLUGINS.registerSlot(NAME, "sidebar", SidebarSlot);
-  PLUGINS.registerSlot(NAME, "header-left", HeaderCrestSlot);
-  PLUGINS.registerSlot(NAME, "footer-right", FooterTaglineSlot);
-})();
@@ -1,14 +0,0 @@
-{
-  "name": "strike-freedom-cockpit",
-  "label": "Strike Freedom Cockpit",
-  "description": "MS-STATUS sidebar + header crest for the Strike Freedom theme",
-  "icon": "Shield",
-  "version": "1.0.0",
-  "tab": {
-    "path": "/strike-freedom-cockpit",
-    "position": "end",
-    "hidden": true
-  },
-  "slots": ["sidebar", "header-left", "footer-right"],
-  "entry": "dist/index.js"
-}
@@ -1,126 +0,0 @@
-# Strike Freedom — Hermes dashboard theme demo
-#
-# Copy this file to ~/.hermes/dashboard-themes/strike-freedom.yaml and
-# restart the web UI (or hit `/api/dashboard/plugins/rescan`). Pair with
-# the `strike-freedom-cockpit` plugin (plugins/strike-freedom-cockpit/)
-# for the full cockpit experience — this theme paints the palette,
-# chrome, and layout; the plugin supplies the MS-STATUS sidebar + header
-# crest that the cockpit layout variant reserves space for.
-#
-# Demonstrates every theme extension point added alongside the plugin
-# slot system: palette, typography, layoutVariant, assets, customCSS,
-# componentStyles, colorOverrides.
-name: strike-freedom
-label: "Strike Freedom"
-description: "Cockpit HUD — deep navy + cyan + gold accents"
-
-# ------- palette (3-layer) -------
-palette:
-  background: "#05091a"
-  midground: "#d8f0ff"
-  foreground:
-    hex: "#ffffff"
-    alpha: 0
-  warmGlow: "rgba(255, 199, 55, 0.24)"
-  noiseOpacity: 0.7
-
-# ------- typography -------
-typography:
-  fontSans: '"Orbitron", "Eurostile", "Bank Gothic", "Impact", sans-serif'
-  fontMono: '"Share Tech Mono", "JetBrains Mono", ui-monospace, monospace'
-  fontDisplay: '"Orbitron", "Eurostile", "Impact", sans-serif'
-  fontUrl: "https://fonts.googleapis.com/css2?family=Orbitron:wght@400;500;600;700;800&family=Share+Tech+Mono&display=swap"
-  baseSize: "14px"
-  lineHeight: "1.5"
-  letterSpacing: "0.04em"
-
-# ------- layout -------
-layout:
-  radius: "0"
-  density: "compact"
-
-# ``cockpit`` reserves a 260px left rail that the shell renders when the
-# user is on this theme. A paired plugin populates the rail via the
-# ``sidebar`` slot; with no plugin the rail shows a placeholder.
-layoutVariant: cockpit
-
-# ------- assets -------
-# Use any URL (https, data:, /dashboard-plugins/...) or a pre-wrapped
-# ``url(...)``/``linear-gradient(...)`` expression. The shell exposes
-# each as a CSS var so plugins can read the same imagery.
-assets:
-  bg: "linear-gradient(140deg, #05091a 0%, #0a1530 55%, #102048 100%)"
-  # Plugin reads --theme-asset-hero / --theme-asset-crest to populate
-  # its sidebar hero render + header crest. Replace these URLs with your
-  # own artwork (copy files into ~/.hermes/dashboard-themes/assets/ and
-  # reference them as /dashboard-themes-assets/strike-freedom/hero.png
-  # once that static route is wired up — for now use inline data URLs or
-  # remote URLs).
-  hero: ""
-  crest: ""
-
-# ------- component chrome -------
-# Each bucket's props become CSS vars (--component-<bucket>-<kebab>) that
-# built-in shell components (Card, header, sidebar, backdrop) consume.
-componentStyles:
-  card:
-    # Notched corners on the top-left + bottom-right — classic mecha UI.
-    clipPath: "polygon(12px 0, 100% 0, 100% calc(100% - 12px), calc(100% - 12px) 100%, 0 100%, 0 12px)"
-    background: "linear-gradient(180deg, rgba(10, 22, 52, 0.85) 0%, rgba(5, 9, 26, 0.92) 100%)"
-    boxShadow: "inset 0 0 0 1px rgba(64, 200, 255, 0.28), 0 0 18px -6px rgba(64, 200, 255, 0.4)"
-  header:
-    background: "linear-gradient(180deg, rgba(16, 32, 72, 0.95) 0%, rgba(5, 9, 26, 0.9) 100%)"
-  sidebar:
-    background: "linear-gradient(180deg, rgba(8, 18, 42, 0.88) 0%, rgba(5, 9, 26, 0.85) 100%)"
-  tab:
-    clipPath: "polygon(6px 0, 100% 0, calc(100% - 6px) 100%, 0 100%)"
-  backdrop:
-    backgroundSize: "cover"
-    backgroundPosition: "center"
-    fillerOpacity: "1"
-    fillerBlendMode: "normal"
-
-# ------- color overrides -------
-colorOverrides:
-  primary: "#ffce3a"
-  primaryForeground: "#05091a"
-  accent: "#3fd3ff"
-  accentForeground: "#05091a"
-  ring: "#3fd3ff"
-  success: "#4ade80"
-  warning: "#ffce3a"
-  destructive: "#ff3a5e"
-  border: "rgba(64, 200, 255, 0.28)"
-
-# ------- customCSS -------
-# Raw CSS injected as a scoped <style> tag on theme apply. Use this for
-# selector-level tweaks componentStyles can't express (pseudo-elements,
-# animations, media queries). Bounded to 32 KiB per theme.
-customCSS: |
-  /* Scanline overlay — subtle, only when theme is active. */
-  :root[data-layout-variant="cockpit"] body::before {
-    content: "";
-    position: fixed;
-    inset: 0;
-    pointer-events: none;
-    z-index: 100;
-    background: repeating-linear-gradient(
-      to bottom,
-      transparent 0px,
-      transparent 2px,
-      rgba(64, 200, 255, 0.035) 3px,
-      rgba(64, 200, 255, 0.035) 4px
-    );
-    mix-blend-mode: screen;
-  }
-
-  /* Chevron pips on card corners. */
-  [data-layout-variant="cockpit"] .border-border::before,
-  [data-layout-variant="cockpit"] .border-border::after {
-    content: "";
-    position: absolute;
-    width: 8px;
-    height: 8px;
-    border: 1px solid rgba(64, 200, 255, 0.55);
-    pointer-events: none;
-  }
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "hermes-agent"
-version = "0.11.0"
+version = "0.10.0"
 description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -39,7 +39,7 @@ dependencies = [
 [project.optional-dependencies]
 modal = ["modal>=1.0.0,<2"]
 daytona = ["daytona>=0.148.0,<1"]
-dev = ["debugpy>=1.8.0,<2", "pytest>=9.0.2,<10", "pytest-asyncio>=1.3.0,<2", "pytest-xdist>=3.0,<4", "mcp>=1.2.0,<2", "ty>=0.0.1a29,<0.0.22", "ruff"]
+dev = ["debugpy>=1.8.0,<2", "pytest>=9.0.2,<10", "pytest-asyncio>=1.3.0,<2", "pytest-xdist>=3.0,<4", "mcp>=1.2.0,<2"]
 messaging = ["python-telegram-bot[webhooks]>=22.6,<23", "discord.py[voice]>=2.7.1,<3", "aiohttp>=3.13.3,<4", "slack-bolt>=1.18.0,<2", "slack-sdk>=3.27.0,<4", "qrcode>=7.0,<8"]
 cron = ["croniter>=6.0.0,<7"]
 slack = ["slack-bolt>=1.18.0,<2", "slack-sdk>=3.27.0,<4"]
@@ -134,28 +134,3 @@ markers = [
    "integration: marks tests requiring external services (API keys, Modal, etc.)",
 ]
 addopts = "-m 'not integration' -n auto"
-
-[tool.ty.environment]
-python-version = "3.13"
-
-[tool.ty.rules]
-unknown-argument = "warn"
-redundant-cast = "ignore"
-
-[tool.ty.src]
-exclude = ["**"]
-
-[[tool.ty.overrides]]
-include = ["**"]
-
-[tool.ty.overrides.rules]
-unresolved-import = "ignore"
-invalid-method-override = "ignore"
-invalid-assignment = "ignore"
-not-iterable = "ignore"
-
-[tool.ruff]
-exclude = ["*"]
-
-[tool.uv]
-exclude-newer = "7 days"
@@ -262,7 +262,6 @@ _MAX_TOOL_WORKERS = 8
 _DESTRUCTIVE_PATTERNS = re.compile(
    r"""(?:^|\s|&&|\|\||;|`)(?:
        rm\s|rmdir\s|
-        cp\s|install\s|
        mv\s|
        sed\s+-i|
        truncate\s|
@@ -1549,17 +1548,6 @@ class AIAgent:
            _agent_section = {}
        self._tool_use_enforcement = _agent_section.get("tool_use_enforcement", "auto")

-        # App-level API retry count (wraps each model API call).  Default 3,
-        # overridable via agent.api_max_retries in config.yaml.  See #11616.
-        try:
-            _raw_api_retries = _agent_section.get("api_max_retries", 3)
-            _api_retries = int(_raw_api_retries)
-            if _api_retries < 1:
-                _api_retries = 1  # 1 = no retry (single attempt)
-        except (TypeError, ValueError):
-            _api_retries = 3
-        self._api_max_retries = _api_retries
-
        # Initialize context compressor for automatic context management
        # Compresses conversation when approaching model's context limit
        # Configuration via config.yaml (compression section)
@@ -6778,6 +6766,42 @@ class AIAgent:
            cache[mode] = t
        return t

+    @staticmethod
+    def _nr_to_assistant_message(nr):
+        """Convert a NormalizedResponse to the SimpleNamespace shape downstream expects.
+
+        This is the single back-compat shim between the transport layer
+        (NormalizedResponse) and the agent loop (SimpleNamespace with
+        .content, .tool_calls, .reasoning, .reasoning_content,
+        .reasoning_details, .codex_reasoning_items, and per-tool-call
+        .call_id / .response_item_id).
+
+        TODO: Remove when downstream code reads NormalizedResponse directly.
+        """
+        tc_list = None
+        if nr.tool_calls:
+            tc_list = []
+            for tc in nr.tool_calls:
+                tc_ns = SimpleNamespace(
+                    id=tc.id,
+                    type="function",
+                    function=SimpleNamespace(name=tc.name, arguments=tc.arguments),
+                )
+                if tc.provider_data:
+                    for key in ("call_id", "response_item_id"):
+                        if tc.provider_data.get(key):
+                            setattr(tc_ns, key, tc.provider_data[key])
+                tc_list.append(tc_ns)
+        pd = nr.provider_data or {}
+        return SimpleNamespace(
+            content=nr.content,
+            tool_calls=tc_list or None,
+            reasoning=nr.reasoning,
+            reasoning_content=pd.get("reasoning_content"),
+            reasoning_details=pd.get("reasoning_details"),
+            codex_reasoning_items=pd.get("codex_reasoning_items"),
+        )
+
    def _prepare_anthropic_messages_for_api(self, api_messages: list) -> list:
        if not any(
            isinstance(msg, dict) and self._content_has_image_parts(msg.get("content"))
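Review note on the hunk above: the shim added there can be exercised in isolation. The sketch below is a minimal, self-contained illustration and not the project's actual transport types. `FakeToolCall` and `FakeNormalizedResponse` are hypothetical stand-ins whose fields are inferred only from the attributes the shim reads (`id`, `name`, `arguments`, `provider_data`, `content`, `reasoning`, `tool_calls`), and `nr_to_assistant_message` is a free-function copy of the logic in the diff.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Optional


@dataclass
class FakeToolCall:
    # Hypothetical stand-in; fields inferred from what the shim reads.
    id: str
    name: str
    arguments: str
    provider_data: Optional[dict] = None


@dataclass
class FakeNormalizedResponse:
    # Hypothetical stand-in for the transport layer's NormalizedResponse.
    content: Optional[str] = None
    tool_calls: Optional[list] = None
    reasoning: Optional[str] = None
    provider_data: Optional[dict] = None


def nr_to_assistant_message(nr):
    """Free-function copy of the shim logic shown in the hunk above."""
    tc_list = None
    if nr.tool_calls:
        tc_list = []
        for tc in nr.tool_calls:
            tc_ns = SimpleNamespace(
                id=tc.id,
                type="function",
                function=SimpleNamespace(name=tc.name, arguments=tc.arguments),
            )
            # Promote provider-specific ids onto the tool call when present.
            if tc.provider_data:
                for key in ("call_id", "response_item_id"):
                    if tc.provider_data.get(key):
                        setattr(tc_ns, key, tc.provider_data[key])
            tc_list.append(tc_ns)
    pd = nr.provider_data or {}
    return SimpleNamespace(
        content=nr.content,
        tool_calls=tc_list or None,
        reasoning=nr.reasoning,
        reasoning_content=pd.get("reasoning_content"),
        reasoning_details=pd.get("reasoning_details"),
        codex_reasoning_items=pd.get("codex_reasoning_items"),
    )


nr = FakeNormalizedResponse(
    content="Searching...",
    tool_calls=[
        FakeToolCall(
            id="tc_1",
            name="search",
            arguments='{"query": "test"}',
            provider_data={"call_id": "call_abc"},
        )
    ],
)
msg = nr_to_assistant_message(nr)
empty = nr_to_assistant_message(FakeNormalizedResponse(content="hi", tool_calls=[]))
```

Under these assumptions, callers that expect the older nested shape (`msg.tool_calls[0].function.name`) keep working, while a response with an empty tool-call list comes back with `tool_calls` set to `None`.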
@@ -7479,25 +7503,20 @@ class AIAgent:
                    ]
            elif self.api_mode == "anthropic_messages" and not _aux_available:
                _tfn = self._get_transport()
-                _flush_result = _tfn.normalize_response(response, strip_tool_prefix=self._is_anthropic_oauth)
-                if _flush_result and _flush_result.tool_calls:
+                _flush_nr = _tfn.normalize_response(response, strip_tool_prefix=self._is_anthropic_oauth)
+                if _flush_nr and _flush_nr.tool_calls:
                    tool_calls = [
                        SimpleNamespace(
                            id=tc.id, type="function",
                            function=SimpleNamespace(name=tc.name, arguments=tc.arguments),
-                        ) for tc in _flush_result.tool_calls
+                        ) for tc in _flush_nr.tool_calls
                    ]
-            elif self.api_mode in ("chat_completions", "bedrock_converse"):
+            elif hasattr(response, "choices") and response.choices:
                # chat_completions / bedrock — normalize through transport
-                _flush_result = self._get_transport().normalize_response(response)
-                if _flush_result.tool_calls:
-                    tool_calls = _flush_result.tool_calls
-            elif _aux_available and hasattr(response, "choices") and response.choices:
-                # Auxiliary client returned OpenAI-shaped response while main
-                # api_mode is codex/anthropic — extract tool_calls from .choices
-                _aux_msg = response.choices[0].message
-                if hasattr(_aux_msg, "tool_calls") and _aux_msg.tool_calls:
-                    tool_calls = _aux_msg.tool_calls
+                _flush_cc_nr = self._get_transport().normalize_response(response)
+                _flush_msg = self._nr_to_assistant_message(_flush_cc_nr)
+                if _flush_msg.tool_calls:
+                    tool_calls = _flush_msg.tool_calls

            for tc in tool_calls:
                if tc.function.name == "memory":
@@ -8563,12 +8582,12 @@ class AIAgent:
                                   is_oauth=self._is_anthropic_oauth,
                                   preserve_dots=self._anthropic_preserve_dots())
                    summary_response = self._anthropic_messages_create(_ant_kw)
-                    _summary_result = _tsum.normalize_response(summary_response, strip_tool_prefix=self._is_anthropic_oauth)
-                    final_response = (_summary_result.content or "").strip()
+                    _sum_nr = _tsum.normalize_response(summary_response, strip_tool_prefix=self._is_anthropic_oauth)
+                    final_response = (_sum_nr.content or "").strip()
                else:
                    summary_response = self._ensure_primary_openai_client(reason="iteration_limit_summary").chat.completions.create(**summary_kwargs)
-                    _summary_result = self._get_transport().normalize_response(summary_response)
-                    final_response = (_summary_result.content or "").strip()
+                    _sum_cc_nr = self._get_transport().normalize_response(summary_response)
+                    final_response = (_sum_cc_nr.content or "").strip()

            if final_response:
                if "<think>" in final_response:
@@ -8593,8 +8612,8 @@ class AIAgent:
                                    max_tokens=self.max_tokens, reasoning_config=self.reasoning_config,
                                    preserve_dots=self._anthropic_preserve_dots())
                    retry_response = self._anthropic_messages_create(_ant_kw2)
-                    _retry_result = _tretry.normalize_response(retry_response, strip_tool_prefix=self._is_anthropic_oauth)
-                    final_response = (_retry_result.content or "").strip()
+                    _retry_nr = _tretry.normalize_response(retry_response, strip_tool_prefix=self._is_anthropic_oauth)
+                    final_response = (_retry_nr.content or "").strip()
                else:
                    summary_kwargs = {
                        "model": self.model,
@@ -8608,8 +8627,8 @@ class AIAgent:
                        summary_kwargs["extra_body"] = summary_extra_body

                    summary_response = self._ensure_primary_openai_client(reason="iteration_limit_summary_retry").chat.completions.create(**summary_kwargs)
-                    _retry_result = self._get_transport().normalize_response(summary_response)
-                    final_response = (_retry_result.content or "").strip()
+                    _retry_cc_nr = self._get_transport().normalize_response(summary_response)
+                    final_response = (_retry_cc_nr.content or "").strip()

                if final_response:
                    if "<think>" in final_response:
@@ -9271,7 +9290,7 @@ class AIAgent:
            
            api_start_time = time.time()
            retry_count = 0
-            max_retries = self._api_max_retries
+            max_retries = 3
            primary_recovery_attempted = False
            max_compression_attempts = 3
            codex_auth_retry_attempted=False
@@ -9638,13 +9657,13 @@ class AIAgent:
                    elif self.api_mode == "bedrock_converse":
                        # Bedrock response already normalized at dispatch — use transport
                        _bt_fr = self._get_transport()
-                        _bedrock_result = _bt_fr.normalize_response(response)
-                        finish_reason = _bedrock_result.finish_reason
+                        _bt_fr_nr = _bt_fr.normalize_response(response)
+                        finish_reason = _bt_fr_nr.finish_reason
                    else:
                        _cc_fr = self._get_transport()
-                        _finish_result = _cc_fr.normalize_response(response)
-                        finish_reason = _finish_result.finish_reason
-                        assistant_message = _finish_result
+                        _cc_fr_nr = _cc_fr.normalize_response(response)
+                        finish_reason = _cc_fr_nr.finish_reason
+                        assistant_message = self._nr_to_assistant_message(_cc_fr_nr)
                        if self._should_treat_stop_as_truncated(
                            finish_reason,
                            assistant_message,
@@ -9669,12 +9688,12 @@ class AIAgent:
                        _trunc_msg = None
                        _trunc_transport = self._get_transport()
                        if self.api_mode == "anthropic_messages":
-                            _trunc_result = _trunc_transport.normalize_response(
+                            _trunc_nr = _trunc_transport.normalize_response(
                                response, strip_tool_prefix=self._is_anthropic_oauth
                            )
                        else:
-                            _trunc_result = _trunc_transport.normalize_response(response)
-                        _trunc_msg = _trunc_result
+                            _trunc_nr = _trunc_transport.normalize_response(response)
+                        _trunc_msg = self._nr_to_assistant_message(_trunc_nr)

                        _trunc_content = getattr(_trunc_msg, "content", None) if _trunc_msg else None
                        _trunc_has_tool_calls = bool(getattr(_trunc_msg, "tool_calls", None)) if _trunc_msg else False
@@ -10575,30 +10594,9 @@ class AIAgent:
                        # Error is about the INPUT being too large — reduce context_length.
                        # Try to parse the actual limit from the error message
                        parsed_limit = parse_context_limit_from_error(error_msg)
-                        _provider_lower = (getattr(self, "provider", "") or "").lower()
-                        _base_lower = (getattr(self, "base_url", "") or "").rstrip("/").lower()
-                        is_minimax_provider = (
-                            _provider_lower in {"minimax", "minimax-cn"}
-                            or _base_lower.startswith((
-                                "https://api.minimax.io/anthropic",
-                                "https://api.minimaxi.com/anthropic",
-                            ))
-                        )
-                        minimax_delta_only_overflow = (
-                            is_minimax_provider
-                            and parsed_limit is None
-                            and "context window exceeds limit (" in error_msg
-                        )
                        if parsed_limit and parsed_limit < old_ctx:
                            new_ctx = parsed_limit
-                            self._vprint(f"{self.log_prefix}Context limit detected from API: {new_ctx:,} tokens (was {old_ctx:,})", force=True)
-                        elif minimax_delta_only_overflow:
-                            new_ctx = old_ctx
-                            self._vprint(
-                                f"{self.log_prefix}Provider reported overflow amount only; "
-                                f"keeping context_length at {old_ctx:,} tokens and compressing.",
-                                force=True,
-                            )
+                            self._vprint(f"{self.log_prefix}⚠️  Context limit detected from API: {new_ctx:,} tokens (was {old_ctx:,})", force=True)
                        else:
                            # Step down to the next probe tier
                            new_ctx = get_next_probe_tier(old_ctx)
@@ -10930,9 +10928,9 @@ class AIAgent:
                _normalize_kwargs = {}
                if self.api_mode == "anthropic_messages":
                    _normalize_kwargs["strip_tool_prefix"] = self._is_anthropic_oauth
-                normalized = _transport.normalize_response(response, **_normalize_kwargs)
-                assistant_message = normalized
-                finish_reason = normalized.finish_reason
+                _nr = _transport.normalize_response(response, **_normalize_kwargs)
+                assistant_message = self._nr_to_assistant_message(_nr)
+                finish_reason = _nr.finish_reason
                
                # Normalize content to string — some OpenAI-compatible servers
                # (llama-server, etc.) return content as a dict or list instead
@@ -43,12 +43,7 @@ AUTHOR_MAP = {
    "teknium1@gmail.com": "teknium1",
    "teknium@nousresearch.com": "teknium1",
    "127238744+teknium1@users.noreply.github.com": "teknium1",
-    "343873859@qq.com": "DrStrangerUJN",
-    "jefferson@heimdallstrategy.com": "Mind-Dragon",
-    "130918800+devorun@users.noreply.github.com": "devorun",
-    "maks.mir@yahoo.com": "say8hi",
    # contributors (from noreply pattern)
-    "david.vv@icloud.com": "davidvv",
    "wangqiang@wangqiangdeMac-mini.local": "xiaoqiang243",
    "snreynolds2506@gmail.com": "snreynolds",
    "35742124+0xbyt4@users.noreply.github.com": "0xbyt4",
@@ -103,7 +98,6 @@ AUTHOR_MAP = {
    "30841158+n-WN@users.noreply.github.com": "n-WN",
    "tsuijinglei@gmail.com": "hiddenpuppy",
    "jerome@clawwork.ai": "HiddenPuppy",
-    "wysie@users.noreply.github.com": "Wysie",
    "leoyuan0099@gmail.com": "keyuyuan",
    "bxzt2006@163.com": "Only-Code-A",
    "i@troy-y.org": "TroyMitchell911",
@@ -112,11 +106,8 @@ AUTHOR_MAP = {
    "134848055+UNLINEARITY@users.noreply.github.com": "UNLINEARITY",
    "ben.burtenshaw@gmail.com": "burtenshaw",
    "roopaknijhara@gmail.com": "rnijhara",
-    "josephzcan@gmail.com": "j0sephz",
    # contributors (manual mapping from git names)
    "ahmedsherif95@gmail.com": "asheriif",
-    "dyxushuai@gmail.com": "dyxushuai",
-    "33860762+etcircle@users.noreply.github.com": "etcircle",
    "liujinkun@bytedance.com": "liujinkun2025",
    "dmayhem93@gmail.com": "dmahan93",
    "fr@tecompanytea.com": "ifrederico",
@@ -167,10 +158,7 @@ AUTHOR_MAP = {
    "socrates1024@gmail.com": "socrates1024",
    "seanalt555@gmail.com": "Salt-555",
    "satelerd@gmail.com": "satelerd",
-    "dan@danlynn.com": "danklynn",
-    "mattmaximo@hotmail.com": "MattMaximo",
    "numman.ali@gmail.com": "nummanali",
-    "rohithsaimidigudla@gmail.com": "whitehatjr1001",
    "0xNyk@users.noreply.github.com": "0xNyk",
    "0xnykcd@googlemail.com": "0xNyk",
    "buraysandro9@gmail.com": "buray",
@@ -383,68 +371,6 @@ AUTHOR_MAP = {
    "projectadmin@wit.id": "projectadmin-dev",
    "mrigankamondal10@gmail.com": "Dev-Mriganka",
    "132275809+shushuzn@users.noreply.github.com": "shushuzn",
-    "ibrahimozsarac@gmail.com": "iborazzi",
-    "130149563+A-afflatus@users.noreply.github.com": "A-afflatus",
-    "huangkwell@163.com": "huangke19",
-    "tanishq@exa.ai": "10ishq",
-    "363708+christopherwoodall@users.noreply.github.com": "christopherwoodall",
-    "zhang9w0v5@qq.com": "zhang9w0v5",
-    "fuleinist@outlook.com": "fuleinist",
-    "43494187+Llugaes@users.noreply.github.com": "Llugaes",
-    "fengtianyu88@users.noreply.github.com": "fengtianyu88",
-    "l.moncany@gmail.com": "lmoncany",
-    "fatinghenji@users.noreply.github.com": "fatinghenji",
-    "xin.peng.dr@gmail.com": "xinpengdr",
-    "mike@mikewaters.net": "mikewaters",
-    "65117428+WadydX@users.noreply.github.com": "WadydX",
-    "216480837+isaachuangGMICLOUD@users.noreply.github.com": "isaachuangGMICLOUD",
-    "nukuom976228@gmail.com": "hsy5571616",
-    "11462216+Nan93@users.noreply.github.com": "Nan93",
-    "l973401489@126.com": "zhouxiaoya12",
-    "373119611@qq.com": "roytian1217",
-    "brett@brettbrewer.com": "minorgod",
-    "67779267+wenhao7@users.noreply.github.com": "wenhao7",
-    "git@yzx9.xyz": "yzx9",
-    "nilesh@cloudgeni.us": "lvnilesh",
-    "63502660+azhengbot@users.noreply.github.com": "azhengbot",
-    "sharvil.saxena@gmail.com": "sharziki",
-    "yuanhe@minimaxi.com": "RyanLee-Dev",
-    "curtis992250@gmail.com": "TaroballzChen",
-    "92638503+Lind3ey@users.noreply.github.com": "Lind3ey",
-    "1352808998@qq.com": "phpoh",
-    "caliberoviv@gmail.com": "vivganes",
-    "michaelfackerell@gmail.com": "MikeFac",
-    "18024642@qq.com": "GuyCui",
-    "eumael.mkt@gmail.com": "maelrx",
-    # v0.11.0 additions
-    "benbarclay@gmail.com": "benbarclay",
-    "lijiawen@umich.edu": "Jiawen-lee",
-    "oleksiy@kovyrin.net": "kovyrin",
-    "kovyrin.claw@gmail.com": "kovyrin",
-    "kaiobarb@gmail.com": "liftaris",
-    "me@arihantsethia.com": "arihantsethia",
-    "zhuofengwang2003@gmail.com": "coekfung",
-    "teknium@noreply.github.com": "teknium1",
-    "2114364329@qq.com": "cuyua9",
-    "2557058999@qq.com": "Disaster-Terminator",
-    "cine.dreamer.one@gmail.com": "LeonSGP43",
-    "leozeli@qq.com": "leozeli",
-    "linlehao@cuhk.edu.cn": "LehaoLin",
-    "liutong@isacas.ac.cn": "I3eg1nner",
-    "peterberthelsen@Peters-MacBook-Air.local": "PeterBerthelsen",
-    "root@debian.debian": "lengxii",
-    "roque@priveperfumeshn.com": "priveperfumes",
-    "shijianzhi@shijianzhideMacBook-Pro.local": "sjz-ks",
-    "topcheer@me.com": "topcheer",
-    "walli@tencent.com": "walli",
-    "zhuofengwang@tencent.com": "Zhuofeng-Wang",
-    # no-github-match — keep as display names
-    "clio-agent@sisyphuslabs.ai": "Sisyphus",
-    "marco@rutimka.de": "Marco Rutsch",
-    "paul@gamma.app": "Paul Bergeron",
-    "zhangxicen@example.com": "zhangxicen",
-    "codex@openai.invalid": "teknium1",
-    "screenmachine@gmail.com": "teknium1",
 }


@@ -1,196 +0,0 @@
----
-name: design-md
-description: Author, validate, diff, and export DESIGN.md files — Google's open-source format spec that gives coding agents a persistent, structured understanding of a design system (tokens + rationale in one file). Use when building a design system, porting style rules between projects, generating UI with consistent brand, or auditing accessibility/contrast.
-version: 1.0.0
-author: Hermes Agent
-license: MIT
-metadata:
-  hermes:
-    tags: [design, design-system, tokens, ui, accessibility, wcag, tailwind, dtcg, google]
-    related_skills: [popular-web-designs, excalidraw, architecture-diagram]
----
-
-# DESIGN.md Skill
-
-DESIGN.md is Google's open spec (Apache-2.0, `google-labs-code/design.md`) for
-describing a visual identity to coding agents. One file combines:
-
-- **YAML front matter** — machine-readable design tokens (normative values)
-- **Markdown body** — human-readable rationale, organized into canonical sections
-
-Tokens give exact values. Prose tells agents *why* those values exist and how to
-apply them. The CLI (`npx @google/design.md`) lints structure + WCAG contrast,
-diffs versions for regressions, and exports to Tailwind or W3C DTCG JSON.
-
-## When to use this skill
-
-- User asks for a DESIGN.md file, design tokens, or a design system spec
-- User wants consistent UI/brand across multiple projects or tools
-- User pastes an existing DESIGN.md and asks to lint, diff, export, or extend it
-- User asks to port a style guide into a format agents can consume
-- User wants contrast / WCAG accessibility validation on their color palette
-
-For purely visual inspiration or layout examples, use `popular-web-designs`
-instead. This skill is for the *formal spec file* itself.
-
-## File anatomy
-
-```md
----
-version: alpha
-name: Heritage
-description: Architectural minimalism meets journalistic gravitas.
-colors:
-  primary: "#1A1C1E"
-  secondary: "#6C7278"
-  tertiary: "#B8422E"
-  neutral: "#F7F5F2"
-typography:
-  h1:
-    fontFamily: Public Sans
-    fontSize: 3rem
-    fontWeight: 700
-    lineHeight: 1.1
-    letterSpacing: "-0.02em"
-  body-md:
-    fontFamily: Public Sans
-    fontSize: 1rem
-rounded:
-  sm: 4px
-  md: 8px
-  lg: 16px
-spacing:
-  sm: 8px
-  md: 16px
-  lg: 24px
-components:
-  button-primary:
-    backgroundColor: "{colors.tertiary}"
-    textColor: "#FFFFFF"
-    rounded: "{rounded.sm}"
-    padding: 12px
-  button-primary-hover:
-    backgroundColor: "{colors.primary}"
----
-
-## Overview
-
-Architectural Minimalism meets Journalistic Gravitas...
-
-## Colors
-
-- **Primary (#1A1C1E):** Deep ink for headlines and core text.
-- **Tertiary (#B8422E):** "Boston Clay" — the sole driver for interaction.
-
-## Typography
-
-Public Sans for everything except small all-caps labels...
-
-## Components
-
-`button-primary` is the only high-emphasis action on a page...
-```
-
-## Token types
-
-| Type | Format | Example |
-|------|--------|---------|
-| Color | `#` + hex (sRGB) | `"#1A1C1E"` |
-| Dimension | number + unit (`px`, `em`, `rem`) | `48px`, `-0.02em` |
-| Token reference | `{path.to.token}` | `{colors.primary}` |
-| Typography | object with `fontFamily`, `fontSize`, `fontWeight`, `lineHeight`, `letterSpacing`, `fontFeature`, `fontVariation` | see above |
-
-Component property whitelist: `backgroundColor`, `textColor`, `typography`,
-`rounded`, `padding`, `size`, `height`, `width`. Variants (hover, active,
-pressed) are **separate component entries** with related key names
-(`button-primary-hover`), not nested.
-
-## Canonical section order
-
-Sections are optional, but present ones MUST appear in this order. Duplicate
-headings reject the file.
-
-1. Overview (alias: Brand & Style)
-2. Colors
-3. Typography
-4. Layout (alias: Layout & Spacing)
-5. Elevation & Depth (alias: Elevation)
-6. Shapes
-7. Components
-8. Do's and Don'ts
-
-Unknown sections are preserved, not errored. Unknown token names are accepted
-if the value type is valid. Unknown component properties produce a warning.
-
-## Workflow: authoring a new DESIGN.md
-
-1. **Ask the user** (or infer) the brand tone, accent color, and typography
-   direction. If they provided a site, image, or vibe, translate it to the
-   token shape above.
-2. **Write `DESIGN.md`** in their project root using `write_file`. Always
-   include `name:` and `colors:`; other sections optional but encouraged.
-3. **Use token references** (`{colors.primary}`) in the `components:` section
-   instead of re-typing hex values. Keeps the palette single-source.
-4. **Lint it** (see below). Fix any broken references or WCAG failures
-   before returning.
-5. **If the user has an existing project**, also write Tailwind or DTCG
-   exports next to the file (`tailwind.theme.json`, `tokens.json`).
-
-## Workflow: lint / diff / export
-
-The CLI is `@google/design.md` (Node). Use `npx` — no global install needed.
-
-```bash
-# Validate structure + token references + WCAG contrast
-npx -y @google/design.md lint DESIGN.md
-
-# Compare two versions, fail on regression (exit 1 = regression)
-npx -y @google/design.md diff DESIGN.md DESIGN-v2.md
-
-# Export to Tailwind theme JSON
-npx -y @google/design.md export --format tailwind DESIGN.md > tailwind.theme.json
-
-# Export to W3C DTCG (Design Tokens Format Module) JSON
-npx -y @google/design.md export --format dtcg DESIGN.md > tokens.json
-
-# Print the spec itself — useful when injecting into an agent prompt
-npx -y @google/design.md spec --rules-only --format json
-```
-
-All commands accept `-` for stdin. `lint` returns exit 1 on errors. Use the
-`--format json` flag and parse the output if you need to report findings
-structurally.
-
-### Lint rule reference (what the 7 rules catch)
-
-- `broken-ref` (error) — `{colors.missing}` points at a non-existent token
-- `duplicate-section` (error) — same `## Heading` appears twice
-- `invalid-color`, `invalid-dimension`, `invalid-typography` (error)
-- `wcag-contrast` (warning/info) — component `textColor` vs `backgroundColor`
-  ratio against WCAG AA (4.5:1) and AAA (7:1)
-- `unknown-component-property` (warning) — outside the whitelist above
-
-When the user cares about accessibility, call this out explicitly in your
-summary — WCAG findings are the most load-bearing reason to use the CLI.
-
-## Pitfalls
-
-- **Don't nest component variants.** `button-primary.hover` is wrong;
-  `button-primary-hover` as a sibling key is right.
-- **Hex colors must be quoted strings.** YAML will otherwise choke on `#` or
-  truncate values like `#1A1C1E` oddly.
-- **Negative dimensions need quotes too.** `letterSpacing: -0.02em` parses as
-  a YAML flow — write `letterSpacing: "-0.02em"`.
-- **Section order is enforced.** If the user gives you prose in a random order,
-  reorder it to match the canonical list before saving.
-- **`version: alpha` is the current spec version** (as of Apr 2026). The spec
-  is marked alpha — watch for breaking changes.
-- **Token references resolve by dotted path.** `{colors.primary}` works;
-  `{primary}` does not.
-
-## Spec source of truth
-
-- Repo: https://github.com/google-labs-code/design.md (Apache-2.0)
-- CLI: `@google/design.md` on npm
-- License of generated DESIGN.md files: whatever the user's project uses;
-  the spec itself is Apache-2.0.
@@ -1,99 +0,0 @@
----
-version: alpha
-name: MyBrand
-description: One-sentence description of the visual identity.
-colors:
-  primary: "#0F172A"
-  secondary: "#64748B"
-  tertiary: "#2563EB"
-  neutral: "#F8FAFC"
-  on-primary: "#FFFFFF"
-  on-tertiary: "#FFFFFF"
-typography:
-  h1:
-    fontFamily: Inter
-    fontSize: 3rem
-    fontWeight: 700
-    lineHeight: 1.1
-    letterSpacing: "-0.02em"
-  h2:
-    fontFamily: Inter
-    fontSize: 2rem
-    fontWeight: 600
-    lineHeight: 1.2
-  body-md:
-    fontFamily: Inter
-    fontSize: 1rem
-    lineHeight: 1.5
-  label-caps:
-    fontFamily: Inter
-    fontSize: 0.75rem
-    fontWeight: 600
-    letterSpacing: "0.08em"
-rounded:
-  sm: 4px
-  md: 8px
-  lg: 16px
-  full: 9999px
-spacing:
-  xs: 4px
-  sm: 8px
-  md: 16px
-  lg: 24px
-  xl: 48px
-components:
-  button-primary:
-    backgroundColor: "{colors.tertiary}"
-    textColor: "{colors.on-tertiary}"
-    rounded: "{rounded.sm}"
-    padding: 12px
-  button-primary-hover:
-    backgroundColor: "{colors.primary}"
-    textColor: "{colors.on-primary}"
-  card:
-    backgroundColor: "{colors.neutral}"
-    textColor: "{colors.primary}"
-    rounded: "{rounded.md}"
-    padding: 24px
----
-
-## Overview
-
-Describe the voice and feel of the brand in one or two paragraphs. What mood
-does it evoke? What emotional response should a user have on first impression?
-
-## Colors
-
-- **Primary ({colors.primary}):** Core text, headlines, high-emphasis surfaces.
-- **Secondary ({colors.secondary}):** Supporting text, borders, metadata.
-- **Tertiary ({colors.tertiary}):** Interaction driver — buttons, links,
-  selected states. Use sparingly to preserve its signal.
-- **Neutral ({colors.neutral}):** Page background and surface fills.
-
-## Typography
-
-Inter for everything. Weight and size carry hierarchy, not font family. Tight
-letter-spacing on display sizes; default tracking on body.
-
-## Layout
-
-Spacing scale is a 4px baseline. Use `md` (16px) for intra-component gaps,
-`lg` (24px) for inter-component gaps, `xl` (48px) for section breaks.
-
-## Shapes
-
-Rounded corners are modest — `sm` on interactive elements, `md` on cards.
-`full` is reserved for avatars and pill badges.
-
-## Components
-
-- `button-primary` is the only high-emphasis action per screen.
-- `card` is the default surface for grouped content. No shadow by default.
-
-## Do's and Don'ts
-
-- **Do** use token references (`{colors.primary}`) instead of literal hex in
-  component definitions.
-- **Don't** introduce colors outside the palette — extend the palette first.
-- **Don't** nest component variants. `button-primary-hover` is a sibling,
-  not a child.
@@ -8,7 +8,7 @@ metadata:
  hermes:
    tags: [wiki, knowledge-base, research, notes, markdown, rag-alternative]
    category: research
-    related_skills: [obsidian, arxiv]
+    related_skills: [obsidian, arxiv, agentic-research-ideas]
 ---

 # Karpathy's LLM Wiki
@@ -18,12 +18,12 @@ from agent.anthropic_adapter import (
    convert_messages_to_anthropic,
    convert_tools_to_anthropic,
    is_claude_code_token_valid,
+    normalize_anthropic_response,
    normalize_model_name,
    read_claude_code_credentials,
    resolve_anthropic_token,
    run_oauth_setup_token,
 )
-from agent.transports import get_transport


 # ---------------------------------------------------------------------------
@@ -1242,10 +1242,10 @@ class TestNormalizeResponse:

    def test_text_response(self):
        block = SimpleNamespace(type="text", text="Hello world")
-        nr = get_transport("anthropic_messages").normalize_response(self._make_response([block]))
-        assert nr.content == "Hello world"
-        assert nr.finish_reason == "stop"
-        assert nr.tool_calls is None
+        msg, reason = normalize_anthropic_response(self._make_response([block]))
+        assert msg.content == "Hello world"
+        assert reason == "stop"
+        assert msg.tool_calls is None

    def test_tool_use_response(self):
        blocks = [
@@ -1257,24 +1257,24 @@ class TestNormalizeResponse:
                input={"query": "test"},
            ),
        ]
-        nr = get_transport("anthropic_messages").normalize_response(
+        msg, reason = normalize_anthropic_response(
            self._make_response(blocks, "tool_use")
        )
-        assert nr.content == "Searching..."
-        assert nr.finish_reason == "tool_calls"
-        assert len(nr.tool_calls) == 1
-        assert nr.tool_calls[0].name == "search"
-        assert json.loads(nr.tool_calls[0].arguments) == {"query": "test"}
+        assert msg.content == "Searching..."
+        assert reason == "tool_calls"
+        assert len(msg.tool_calls) == 1
+        assert msg.tool_calls[0].function.name == "search"
+        assert json.loads(msg.tool_calls[0].function.arguments) == {"query": "test"}

    def test_thinking_response(self):
        blocks = [
            SimpleNamespace(type="thinking", thinking="Let me reason about this..."),
            SimpleNamespace(type="text", text="The answer is 42."),
        ]
-        nr = get_transport("anthropic_messages").normalize_response(self._make_response(blocks))
-        assert nr.content == "The answer is 42."
-        assert nr.reasoning == "Let me reason about this..."
-        assert nr.provider_data["reasoning_details"] == [{"type": "thinking", "thinking": "Let me reason about this..."}]
+        msg, reason = normalize_anthropic_response(self._make_response(blocks))
+        assert msg.content == "The answer is 42."
+        assert msg.reasoning == "Let me reason about this..."
+        assert msg.reasoning_details == [{"type": "thinking", "thinking": "Let me reason about this..."}]

    def test_thinking_response_preserves_signature(self):
        blocks = [
@@ -1285,24 +1285,24 @@ class TestNormalizeResponse:
                redacted=False,
            ),
        ]
-        nr = get_transport("anthropic_messages").normalize_response(self._make_response(blocks))
-        assert nr.provider_data["reasoning_details"][0]["signature"] == "opaque_signature"
-        assert nr.provider_data["reasoning_details"][0]["thinking"] == "Let me reason about this..."
+        msg, _ = normalize_anthropic_response(self._make_response(blocks))
+        assert msg.reasoning_details[0]["signature"] == "opaque_signature"
+        assert msg.reasoning_details[0]["thinking"] == "Let me reason about this..."

    def test_stop_reason_mapping(self):
        block = SimpleNamespace(type="text", text="x")
-        nr1 = get_transport("anthropic_messages").normalize_response(
+        _, r1 = normalize_anthropic_response(
            self._make_response([block], "end_turn")
        )
-        nr2 = get_transport("anthropic_messages").normalize_response(
+        _, r2 = normalize_anthropic_response(
            self._make_response([block], "tool_use")
        )
-        nr3 = get_transport("anthropic_messages").normalize_response(
+        _, r3 = normalize_anthropic_response(
            self._make_response([block], "max_tokens")
        )
-        assert nr1.finish_reason == "stop"
-        assert nr2.finish_reason == "tool_calls"
-        assert nr3.finish_reason == "length"
+        assert r1 == "stop"
+        assert r2 == "tool_calls"
+        assert r3 == "length"

    def test_stop_reason_refusal_and_context_exceeded(self):
        # Claude 4.5+ introduced two new stop_reason values the Messages API
@@ -1310,24 +1310,24 @@ class TestNormalizeResponse:
        # handlers already understand, instead of silently collapsing to
        # "stop" (old behavior).
        block = SimpleNamespace(type="text", text="")
-        nr_refusal = get_transport("anthropic_messages").normalize_response(
+        _, refusal_reason = normalize_anthropic_response(
            self._make_response([block], "refusal")
        )
-        nr_overflow = get_transport("anthropic_messages").normalize_response(
+        _, overflow_reason = normalize_anthropic_response(
            self._make_response([block], "model_context_window_exceeded")
        )
-        assert nr_refusal.finish_reason == "content_filter"
-        assert nr_overflow.finish_reason == "length"
+        assert refusal_reason == "content_filter"
+        assert overflow_reason == "length"

    def test_no_text_content(self):
        block = SimpleNamespace(
            type="tool_use", id="tc_1", name="search", input={"q": "hi"}
        )
-        nr = get_transport("anthropic_messages").normalize_response(
+        msg, reason = normalize_anthropic_response(
            self._make_response([block], "tool_use")
        )
-        assert nr.content is None
-        assert len(nr.tool_calls) == 1
+        assert msg.content is None
+        assert len(msg.tool_calls) == 1


 # ---------------------------------------------------------------------------
@@ -447,34 +447,6 @@ class TestExplicitProviderRouting:
            adapter = client.chat.completions
            assert adapter._is_oauth is False

-    def test_explicit_openrouter_pool_exhausted_logs_precise_warning(self, monkeypatch, caplog):
-        monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
-        with patch("agent.auxiliary_client._select_pool_entry", return_value=(True, None)):
-            with caplog.at_level(logging.WARNING, logger="agent.auxiliary_client"):
-                client, model = resolve_provider_client("openrouter")
-        assert client is None
-        assert model is None
-        assert any(
-            "credential pool has no usable entries" in record.message
-            for record in caplog.records
-        )
-        assert not any(
-            "OPENROUTER_API_KEY not set" in record.message
-            for record in caplog.records
-        )
-
-    def test_explicit_openrouter_missing_env_keeps_not_set_warning(self, monkeypatch, caplog):
-        monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
-        with patch("agent.auxiliary_client._select_pool_entry", return_value=(False, None)):
-            with caplog.at_level(logging.WARNING, logger="agent.auxiliary_client"):
-                client, model = resolve_provider_client("openrouter")
-        assert client is None
-        assert model is None
-        assert any(
-            "OPENROUTER_API_KEY not set" in record.message
-            for record in caplog.records
-        )
-
 class TestGetTextAuxiliaryClient:
    """Test the full resolution chain for get_text_auxiliary_client."""

@@ -245,7 +245,7 @@ class TestResolveVisionMainFirst:
        assert model == "xiaomi/mimo-v2-omni"

    def test_exotic_provider_with_vision_override_preserved(self):
-        """xiaomi → mimo-v2.5 override still wins over main_model."""
+        """xiaomi → mimo-v2-omni override still wins over main_model."""
        with patch(
            "agent.auxiliary_client._read_main_provider", return_value="xiaomi",
        ), patch(
@@ -257,15 +257,15 @@ class TestResolveVisionMainFirst:
            "agent.auxiliary_client._resolve_task_provider_model",
            return_value=("auto", None, None, None, None),
        ):
-            mock_resolve.return_value = (MagicMock(), "mimo-v2.5")
+            mock_resolve.return_value = (MagicMock(), "mimo-v2-omni")

            from agent.auxiliary_client import resolve_vision_provider_client

            provider, client, model = resolve_vision_provider_client()

        assert provider == "xiaomi"
-        # Should use mimo-v2.5 (vision override), not mimo-v2-pro (text main)
-        assert mock_resolve.call_args.args[1] == "mimo-v2.5"
+        # Should use mimo-v2-omni (vision override), not mimo-v2-pro (text main)
+        assert mock_resolve.call_args.args[1] == "mimo-v2-omni"

    def test_main_unavailable_vision_falls_through_to_aggregators(self):
        """Main provider fails → fall back to OpenRouter/Nous strict backends."""
@@ -333,6 +333,66 @@ def test_mark_exhausted_and_rotate_persists_status(tmp_path, monkeypatch):
    assert persisted["last_error_code"] == 402


+def test_try_refresh_current_updates_only_current_entry(tmp_path, monkeypatch):
+    monkeypatch.setenv("HERMES_HOME", str(tmp_path / "hermes"))
+    _write_auth_store(
+        tmp_path,
+        {
+            "version": 1,
+            "credential_pool": {
+                "openai-codex": [
+                    {
+                        "id": "cred-1",
+                        "label": "primary",
+                        "auth_type": "oauth",
+                        "priority": 0,
+                        "source": "device_code",
+                        "access_token": "access-old",
+                        "refresh_token": "refresh-old",
+                        "base_url": "https://chatgpt.com/backend-api/codex",
+                    },
+                    {
+                        "id": "cred-2",
+                        "label": "secondary",
+                        "auth_type": "oauth",
+                        "priority": 1,
+                        "source": "device_code",
+                        "access_token": "access-other",
+                        "refresh_token": "refresh-other",
+                        "base_url": "https://chatgpt.com/backend-api/codex",
+                    },
+                ]
+            },
+        },
+    )
+
+    from agent.credential_pool import load_pool
+
+    monkeypatch.setattr(
+        "hermes_cli.auth.refresh_codex_oauth_pure",
+        lambda access_token, refresh_token, timeout_seconds=20.0: {
+            "access_token": "access-new",
+            "refresh_token": "refresh-new",
+        },
+    )
+
+    pool = load_pool("openai-codex")
+    current = pool.select()
+    assert current.id == "cred-1"
+
+    refreshed = pool.try_refresh_current()
+
+    assert refreshed is not None
+    assert refreshed.access_token == "access-new"
+
+    auth_payload = json.loads((tmp_path / "hermes" / "auth.json").read_text())
+    primary, secondary = auth_payload["credential_pool"]["openai-codex"]
+    assert primary["access_token"] == "access-new"
+    assert primary["refresh_token"] == "refresh-new"
+    assert secondary["access_token"] == "access-other"
+    assert secondary["refresh_token"] == "refresh-other"
+
+
 def test_load_pool_seeds_env_api_key(tmp_path, monkeypatch):
    monkeypatch.setenv("HERMES_HOME", str(tmp_path / "hermes"))
    monkeypatch.setenv("OPENROUTER_API_KEY", "sk-or-seeded")
@@ -56,7 +56,6 @@ class TestFailoverReason:
            "overloaded", "server_error", "timeout",
            "context_overflow", "payload_too_large",
            "model_not_found", "format_error",
-            "provider_policy_blocked",
            "thinking_signature", "long_context_tier", "unknown",
        }
        actual = {r.value for r in FailoverReason}
@@ -309,59 +308,6 @@ class TestClassifyApiError:
        assert result.retryable is True
        assert result.should_fallback is False

-    # ── Provider policy-block (OpenRouter privacy/guardrail) ──
-
-    def test_404_openrouter_policy_blocked(self):
-        # Real OpenRouter error when the user's account privacy setting
-        # excludes the only endpoint serving a model (e.g. DeepSeek V4 Pro
-        # which is hosted only by DeepSeek, and their endpoint may log
-        # inputs).  Must NOT classify as model_not_found — the model
-        # exists, falling back won't help (same account setting applies),
-        # and the error body already tells the user where to fix it.
-        e = MockAPIError(
-            "No endpoints available matching your guardrail restrictions "
-            "and data policy. Configure: https://openrouter.ai/settings/privacy",
-            status_code=404,
-        )
-        result = classify_api_error(e)
-        assert result.reason == FailoverReason.provider_policy_blocked
-        assert result.retryable is False
-        assert result.should_fallback is False
-
-    def test_400_openrouter_policy_blocked(self):
-        # Defense-in-depth: if OpenRouter ever returns this as 400 instead
-        # of 404, still classify it distinctly rather than as format_error
-        # or model_not_found.
-        e = MockAPIError(
-            "No endpoints available matching your data policy",
-            status_code=400,
-        )
-        result = classify_api_error(e)
-        assert result.reason == FailoverReason.provider_policy_blocked
-        assert result.retryable is False
-        assert result.should_fallback is False
-
-    def test_message_only_openrouter_policy_blocked(self):
-        # No status code — classifier should still catch the fingerprint
-        # via the message-pattern fallback.
-        e = Exception(
-            "No endpoints available matching your guardrail restrictions "
-            "and data policy"
-        )
-        result = classify_api_error(e)
-        assert result.reason == FailoverReason.provider_policy_blocked
-
-    def test_404_model_not_found_still_works(self):
-        # Regression guard: the new policy-block check must not swallow
-        # genuine model_not_found 404s.
-        e = MockAPIError(
-            "openrouter/nonexistent-model is not a valid model ID",
-            status_code=404,
-        )
-        result = classify_api_error(e)
-        assert result.reason == FailoverReason.model_not_found
-        assert result.should_fallback is True
-
    # ── Payload too large ──

    def test_413_payload_too_large(self):
@@ -200,126 +200,6 @@ class TestDefaultContextLengths:
        assert len(DEFAULT_CONTEXT_LENGTHS) >= 10


-# =========================================================================
-# Codex OAuth context-window resolution (provider="openai-codex")
-# =========================================================================
-
-class TestCodexOAuthContextLength:
-    """ChatGPT Codex OAuth imposes lower context limits than the direct
-    OpenAI API for the same slugs. Verified Apr 2026 via live probe of
-    chatgpt.com/backend-api/codex/models: every model returns 272k, while
-    models.dev reports 1.05M for gpt-5.5/gpt-5.4 and 400k for the rest.
-    """
-
-    def setup_method(self):
-        import agent.model_metadata as mm
-        mm._codex_oauth_context_cache = {}
-        mm._codex_oauth_context_cache_time = 0.0
-
-    def test_fallback_table_used_without_token(self):
-        """With no access token, the hardcoded Codex fallback table wins
-        over models.dev (which reports 1.05M for gpt-5.5 but Codex is 272k).
-        """
-        from agent.model_metadata import get_model_context_length
-
-        with patch("agent.model_metadata.get_cached_context_length", return_value=None), \
-             patch("agent.model_metadata.save_context_length"):
-            for model in (
-                "gpt-5.5",
-                "gpt-5.4",
-                "gpt-5.4-mini",
-                "gpt-5.3-codex",
-                "gpt-5.2-codex",
-                "gpt-5.1-codex-max",
-                "gpt-5.1-codex-mini",
-            ):
-                ctx = get_model_context_length(
-                    model=model,
-                    base_url="https://chatgpt.com/backend-api/codex",
-                    api_key="",
-                    provider="openai-codex",
-                )
-                assert ctx == 272_000, (
-                    f"Codex {model}: expected 272000 fallback, got {ctx} "
-                    "(models.dev leakage?)"
-                )
-
-    def test_live_probe_overrides_fallback(self):
-        """When a token is provided, the live /models probe is preferred
-        and its context_window drives the result."""
-        from agent.model_metadata import get_model_context_length
-
-        fake_response = MagicMock()
-        fake_response.status_code = 200
-        fake_response.json.return_value = {
-            "models": [
-                {"slug": "gpt-5.5", "context_window": 300_000},
-                {"slug": "gpt-5.4", "context_window": 400_000},
-            ]
-        }
-
-        with patch("agent.model_metadata.requests.get", return_value=fake_response), \
-             patch("agent.model_metadata.get_cached_context_length", return_value=None), \
-             patch("agent.model_metadata.save_context_length"):
-            ctx_55 = get_model_context_length(
-                model="gpt-5.5",
-                base_url="https://chatgpt.com/backend-api/codex",
-                api_key="fake-token",
-                provider="openai-codex",
-            )
-            ctx_54 = get_model_context_length(
-                model="gpt-5.4",
-                base_url="https://chatgpt.com/backend-api/codex",
-                api_key="fake-token",
-                provider="openai-codex",
-            )
-        assert ctx_55 == 300_000
-        assert ctx_54 == 400_000
-
-    def test_probe_failure_falls_back_to_hardcoded(self):
-        """If the probe fails (non-200 / network error), we still return
-        the hardcoded 272k rather than leaking through to models.dev 1.05M."""
-        from agent.model_metadata import get_model_context_length
-
-        fake_response = MagicMock()
-        fake_response.status_code = 401
-        fake_response.json.return_value = {}
-
-        with patch("agent.model_metadata.requests.get", return_value=fake_response), \
-             patch("agent.model_metadata.get_cached_context_length", return_value=None), \
-             patch("agent.model_metadata.save_context_length"):
-            ctx = get_model_context_length(
-                model="gpt-5.5",
-                base_url="https://chatgpt.com/backend-api/codex",
-                api_key="expired-token",
-                provider="openai-codex",
-            )
-        assert ctx == 272_000
-
-    def test_non_codex_providers_unaffected(self):
-        """Resolving gpt-5.5 on non-Codex providers must NOT use the Codex
-        272k override — OpenRouter / direct OpenAI API have different limits.
-        """
-        from agent.model_metadata import get_model_context_length
-
-        # OpenRouter — should hit its own catalog path first; when mocked
-        # empty, falls through to hardcoded DEFAULT_CONTEXT_LENGTHS (400k).
-        with patch("agent.model_metadata.fetch_model_metadata", return_value={}), \
-             patch("agent.model_metadata.fetch_endpoint_model_metadata", return_value={}), \
-             patch("agent.model_metadata.get_cached_context_length", return_value=None), \
-             patch("agent.models_dev.lookup_models_dev_context", return_value=None):
-            ctx = get_model_context_length(
-                model="openai/gpt-5.5",
-                base_url="https://openrouter.ai/api/v1",
-                api_key="",
-                provider="openrouter",
-            )
-        assert ctx == 400_000, (
-            f"Non-Codex gpt-5.5 resolved to {ctx}; Codex 272k override "
-            "leaked outside openai-codex provider"
-        )
-
-
 # =========================================================================
 # get_model_context_length — resolution order
 # =========================================================================
@@ -741,10 +621,6 @@ class TestParseContextLimitFromError:
        msg = "Error: context window of 4096 tokens exceeded"
        assert parse_context_limit_from_error(msg) == 4096

-    def test_minimax_delta_only_message_returns_none(self):
-        msg = "invalid params, context window exceeds limit (2013)"
-        assert parse_context_limit_from_error(msg) is None
-
    def test_completely_unrelated_error(self):
        assert parse_context_limit_from_error("Invalid API key") is None

@@ -1,254 +0,0 @@
-"""Tests for Moonshot/Kimi flavored-JSON-Schema sanitizer.
-
-Moonshot's tool-parameter validator rejects several shapes that the rest of
-the JSON Schema ecosystem accepts:
-
-1. Properties without ``type`` — Moonshot requires ``type`` on every node.
-2. ``type`` at the parent of ``anyOf`` — Moonshot requires it only inside
-   ``anyOf`` children.
-
-These tests cover the repairs applied by ``agent/moonshot_schema.py``.
-"""
-
-from __future__ import annotations
-
-import pytest
-
-from agent.moonshot_schema import (
-    is_moonshot_model,
-    sanitize_moonshot_tool_parameters,
-    sanitize_moonshot_tools,
-)
-
-
-class TestMoonshotModelDetection:
-    """is_moonshot_model() must match across aggregator prefixes."""
-
-    @pytest.mark.parametrize(
-        "model",
-        [
-            "kimi-k2.6",
-            "kimi-k2-thinking",
-            "moonshotai/Kimi-K2.6",
-            "moonshotai/kimi-k2.6",
-            "nous/moonshotai/kimi-k2.6",
-            "openrouter/moonshotai/kimi-k2-thinking",
-            "MOONSHOTAI/KIMI-K2.6",
-        ],
-    )
-    def test_positive_matches(self, model):
-        assert is_moonshot_model(model) is True
-
-    @pytest.mark.parametrize(
-        "model",
-        [
-            "",
-            None,
-            "anthropic/claude-sonnet-4.6",
-            "openai/gpt-5.4",
-            "google/gemini-3-flash-preview",
-            "deepseek-chat",
-        ],
-    )
-    def test_negative_matches(self, model):
-        assert is_moonshot_model(model) is False
-
-
-class TestMissingTypeFilled:
-    """Rule 1: every property must carry a type."""
-
-    def test_property_without_type_gets_string(self):
-        params = {
-            "type": "object",
-            "properties": {"query": {"description": "a bare property"}},
-        }
-        out = sanitize_moonshot_tool_parameters(params)
-        assert out["properties"]["query"]["type"] == "string"
-
-    def test_property_with_enum_infers_type_from_first_value(self):
-        params = {
-            "type": "object",
-            "properties": {"flag": {"enum": [True, False]}},
-        }
-        out = sanitize_moonshot_tool_parameters(params)
-        assert out["properties"]["flag"]["type"] == "boolean"
-
-    def test_nested_properties_are_repaired(self):
-        params = {
-            "type": "object",
-            "properties": {
-                "filter": {
-                    "type": "object",
-                    "properties": {
-                        "field": {"description": "no type"},
-                    },
-                },
-            },
-        }
-        out = sanitize_moonshot_tool_parameters(params)
-        assert out["properties"]["filter"]["properties"]["field"]["type"] == "string"
-
-    def test_array_items_without_type_get_repaired(self):
-        params = {
-            "type": "object",
-            "properties": {
-                "tags": {
-                    "type": "array",
-                    "items": {"description": "tag entry"},
-                },
-            },
-        }
-        out = sanitize_moonshot_tool_parameters(params)
-        assert out["properties"]["tags"]["items"]["type"] == "string"
-
-    def test_ref_node_is_not_given_synthetic_type(self):
-        """$ref nodes should NOT get a synthetic type — the referenced
-        definition supplies it, and Moonshot would reject the conflict."""
-        params = {
-            "type": "object",
-            "properties": {"payload": {"$ref": "#/$defs/Payload"}},
-            "$defs": {"Payload": {"type": "object", "properties": {}}},
-        }
-        out = sanitize_moonshot_tool_parameters(params)
-        assert "type" not in out["properties"]["payload"]
-        assert out["properties"]["payload"]["$ref"] == "#/$defs/Payload"
-
-
-class TestAnyOfParentType:
-    """Rule 2: type must not appear at the anyOf parent level."""
-
-    def test_parent_type_stripped_when_anyof_present(self):
-        params = {
-            "type": "object",
-            "properties": {
-                "from_format": {
-                    "type": "string",
-                    "anyOf": [
-                        {"type": "string"},
-                        {"type": "null"},
-                    ],
-                },
-            },
-        }
-        out = sanitize_moonshot_tool_parameters(params)
-        from_format = out["properties"]["from_format"]
-        assert "type" not in from_format
-        assert "anyOf" in from_format
-
-    def test_anyof_children_missing_type_get_filled(self):
-        params = {
-            "type": "object",
-            "properties": {
-                "value": {
-                    "anyOf": [
-                        {"type": "string"},
-                        {"description": "A typeless option"},
-                    ],
-                },
-            },
-        }
-        out = sanitize_moonshot_tool_parameters(params)
-        children = out["properties"]["value"]["anyOf"]
-        assert children[0]["type"] == "string"
-        assert "type" in children[1]
-
-
-class TestTopLevelGuarantees:
-    """The returned top-level schema is always a well-formed object."""
-
-    def test_non_dict_input_returns_empty_object(self):
-        assert sanitize_moonshot_tool_parameters(None) == {"type": "object", "properties": {}}
-        assert sanitize_moonshot_tool_parameters("garbage") == {"type": "object", "properties": {}}
-        assert sanitize_moonshot_tool_parameters([]) == {"type": "object", "properties": {}}
-
-    def test_non_object_top_level_coerced(self):
-        params = {"type": "string"}
-        out = sanitize_moonshot_tool_parameters(params)
-        assert out["type"] == "object"
-        assert "properties" in out
-
-    def test_does_not_mutate_input(self):
-        params = {
-            "type": "object",
-            "properties": {"q": {"description": "no type"}},
-        }
-        snapshot = {
-            "type": params["type"],
-            "properties": {"q": dict(params["properties"]["q"])},
-        }
-        sanitize_moonshot_tool_parameters(params)
-        assert params["type"] == snapshot["type"]
-        assert "type" not in params["properties"]["q"]
-
-
-class TestToolListSanitizer:
-    """sanitize_moonshot_tools() walks an OpenAI-format tool list."""
-
-    def test_applies_per_tool(self):
-        tools = [
-            {
-                "type": "function",
-                "function": {
-                    "name": "search",
-                    "description": "Search",
-                    "parameters": {
-                        "type": "object",
-                        "properties": {"q": {"description": "query"}},
-                    },
-                },
-            },
-            {
-                "type": "function",
-                "function": {
-                    "name": "noop",
-                    "description": "Does nothing",
-                    "parameters": {"type": "object", "properties": {}},
-                },
-            },
-        ]
-        out = sanitize_moonshot_tools(tools)
-        assert out[0]["function"]["parameters"]["properties"]["q"]["type"] == "string"
-        # Second tool already clean — should be structurally equivalent
-        assert out[1]["function"]["parameters"] == {"type": "object", "properties": {}}
-
-    def test_empty_list_is_passthrough(self):
-        assert sanitize_moonshot_tools([]) == []
-        assert sanitize_moonshot_tools(None) is None
-
-    def test_skips_malformed_entries(self):
-        """Entries without a function dict are passed through untouched."""
-        tools = [{"type": "function"}, {"not": "a tool"}]
-        out = sanitize_moonshot_tools(tools)
-        assert out == tools
-
-
-class TestRealWorldMCPShape:
-    """End-to-end: a realistic MCP-style schema that used to 400 on Moonshot."""
-
-    def test_combined_rewrites(self):
-        # Shape: missing type on a property, anyOf with parent type, array
-        # items without type — all in one tool.
-        params = {
-            "type": "object",
-            "properties": {
-                "query": {"description": "search text"},
-                "filter": {
-                    "type": "string",
-                    "anyOf": [
-                        {"type": "string"},
-                        {"type": "null"},
-                    ],
-                },
-                "tags": {
-                    "type": "array",
-                    "items": {"description": "tag"},
-                },
-            },
-            "required": ["query"],
-        }
-        out = sanitize_moonshot_tool_parameters(params)
-        assert out["properties"]["query"]["type"] == "string"
-        assert "type" not in out["properties"]["filter"]
-        assert out["properties"]["filter"]["anyOf"][0]["type"] == "string"
-        assert out["properties"]["tags"]["items"]["type"] == "string"
-        assert out["required"] == ["query"]
@@ -807,24 +807,6 @@ class TestPromptBuilderConstants:
        # check that this test is calibrated correctly).
        assert "include MEDIA:" in PLATFORM_HINTS["telegram"]

-    def test_platform_hints_mattermost(self):
-        hint = PLATFORM_HINTS["mattermost"]
-        assert "Mattermost" in hint
-        assert "MEDIA:" in hint
-        assert "Markdown" in hint
-
-    def test_platform_hints_matrix(self):
-        hint = PLATFORM_HINTS["matrix"]
-        assert "Matrix" in hint
-        assert "MEDIA:" in hint
-        assert "Markdown" in hint
-
-    def test_platform_hints_feishu(self):
-        hint = PLATFORM_HINTS["feishu"]
-        assert "Feishu" in hint
-        assert "MEDIA:" in hint
-        assert "Markdown" in hint
-

 # =========================================================================
 # Environment hints
@@ -38,18 +38,6 @@ description: Description for {name}.
    return skill_dir


-def _symlink_category(skills_dir: Path, linked_root: Path, category: str) -> Path:
-    """Create a category symlink under skills_dir pointing outside the tree."""
-    external_category = linked_root / category
-    external_category.mkdir(parents=True, exist_ok=True)
-    symlink_path = skills_dir / category
-    try:
-        symlink_path.symlink_to(external_category, target_is_directory=True)
-    except (OSError, NotImplementedError) as exc:
-        pytest.skip(f"symlinks unavailable in test environment: {exc}")
-    return external_category
-
-
 class TestScanSkillCommands:
    def test_finds_skills(self, tmp_path):
        with patch("tools.skills_tool.SKILLS_DIR", tmp_path):
@@ -113,20 +101,6 @@ class TestScanSkillCommands:
        assert "/enabled-skill" in result
        assert "/disabled-skill" not in result

-    def test_finds_skills_in_symlinked_category_dir(self, tmp_path):
-        external_root = tmp_path / "repo"
-        skills_root = tmp_path / "skills"
-        skills_root.mkdir()
-
-        external_category = _symlink_category(skills_root, external_root, "linked")
-        _make_skill(external_category.parent, "knowledge-brain", category="linked")
-
-        with patch("tools.skills_tool.SKILLS_DIR", skills_root):
-            result = scan_skill_commands()
-
-        assert "/knowledge-brain" in result
-        assert result["/knowledge-brain"]["name"] == "knowledge-brain"
-

    def test_special_chars_stripped_from_cmd_key(self, tmp_path):
        """Skill names with +, /, or other special chars produce clean cmd keys."""
@@ -238,56 +238,6 @@ class TestChatCompletionsKimi:
        )
        assert kw["extra_body"]["thinking"] == {"type": "disabled"}

-    def test_moonshot_tool_schemas_are_sanitized_by_model_name(self, transport):
-        """Aggregator routes (Nous, OpenRouter) hit Moonshot by model name, not base URL."""
-        tools = [
-            {
-                "type": "function",
-                "function": {
-                    "name": "search",
-                    "description": "Search",
-                    "parameters": {
-                        "type": "object",
-                        "properties": {
-                            "q": {"description": "query"},  # missing type
-                        },
-                    },
-                },
-            },
-        ]
-        kw = transport.build_kwargs(
-            model="moonshotai/kimi-k2.6",
-            messages=[{"role": "user", "content": "Hi"}],
-            tools=tools,
-            max_tokens_param_fn=lambda n: {"max_tokens": n},
-        )
-        assert kw["tools"][0]["function"]["parameters"]["properties"]["q"]["type"] == "string"
-
-    def test_non_moonshot_tools_are_not_mutated(self, transport):
-        """Other models don't go through the Moonshot sanitizer."""
-        original_params = {
-            "type": "object",
-            "properties": {"q": {"description": "query"}},  # missing type
-        }
-        tools = [
-            {
-                "type": "function",
-                "function": {
-                    "name": "search",
-                    "description": "Search",
-                    "parameters": original_params,
-                },
-            },
-        ]
-        kw = transport.build_kwargs(
-            model="anthropic/claude-sonnet-4.6",
-            messages=[{"role": "user", "content": "Hi"}],
-            tools=tools,
-            max_tokens_param_fn=lambda n: {"max_tokens": n},
-        )
-        # The parameters dict is passed through untouched (no synthetic type)
-        assert "type" not in kw["tools"][0]["function"]["parameters"]["properties"]["q"]
-

 class TestChatCompletionsValidate:

@@ -149,124 +149,3 @@ class TestMapFinishReason:

    def test_none_reason(self):
        assert map_finish_reason(None, self.ANTHROPIC_MAP) == "stop"
-
-
-# ---------------------------------------------------------------------------
-# Backward-compat property tests
-# ---------------------------------------------------------------------------
-
-class TestToolCallBackwardCompat:
-    """Test duck-typing properties that let ToolCall pass through code expecting
-    the old SimpleNamespace(id, type, function=SimpleNamespace(name, arguments)) shape."""
-
-    def test_type_is_function(self):
-        tc = ToolCall(id="1", name="search", arguments='{"q":"test"}')
-        assert tc.type == "function"
-
-    def test_function_returns_self(self):
-        tc = ToolCall(id="1", name="search", arguments='{"q":"test"}')
-        assert tc.function is tc
-
-    def test_function_name_matches(self):
-        tc = ToolCall(id="1", name="search", arguments='{"q":"test"}')
-        assert tc.function.name == "search"
-        assert tc.function.name == tc.name
-
-    def test_function_arguments_matches(self):
-        tc = ToolCall(id="1", name="search", arguments='{"q":"test"}')
-        assert tc.function.arguments == '{"q":"test"}'
-        assert tc.function.arguments == tc.arguments
-
-    def test_call_id_from_provider_data(self):
-        tc = ToolCall(id="1", name="fn", arguments="{}", provider_data={"call_id": "c1"})
-        assert tc.call_id == "c1"
-
-    def test_call_id_none_when_no_provider_data(self):
-        tc = ToolCall(id="1", name="fn", arguments="{}", provider_data=None)
-        assert tc.call_id is None
-
-    def test_response_item_id_from_provider_data(self):
-        tc = ToolCall(id="1", name="fn", arguments="{}", provider_data={"response_item_id": "r1"})
-        assert tc.response_item_id == "r1"
-
-    def test_response_item_id_none_when_missing(self):
-        tc = ToolCall(id="1", name="fn", arguments="{}", provider_data={"call_id": "c1"})
-        assert tc.response_item_id is None
-
-    def test_getattr_pattern_matches_agent_loop(self):
-        """run_agent.py uses getattr(tool_call, 'call_id', None) — verify it works."""
-        tc = ToolCall(id="1", name="fn", arguments="{}", provider_data={"call_id": "c1"})
-        assert getattr(tc, "call_id", None) == "c1"
-        tc_no_pd = ToolCall(id="1", name="fn", arguments="{}")
-        assert getattr(tc_no_pd, "call_id", None) is None
-
-    def test_extra_content_from_provider_data(self):
-        """Gemini thought_signature stored in provider_data is exposed via property."""
-        ec = {"google": {"thought_signature": "SIG_ABC123"}}
-        tc = ToolCall(id="1", name="fn", arguments="{}", provider_data={"extra_content": ec})
-        assert tc.extra_content == ec
-
-    def test_extra_content_none_when_no_provider_data(self):
-        tc = ToolCall(id="1", name="fn", arguments="{}", provider_data=None)
-        assert tc.extra_content is None
-
-    def test_extra_content_none_when_key_absent(self):
-        tc = ToolCall(id="1", name="fn", arguments="{}", provider_data={"call_id": "c1"})
-        assert tc.extra_content is None
-
-    def test_extra_content_getattr_pattern(self):
-        """_build_assistant_message uses getattr(tc, 'extra_content', None).
-
-        This is the exact pattern that was broken before the extra_content
-        property was added — ToolCall lacked the property so getattr always
-        returned None, silently dropping the Gemini thought_signature and
-        causing HTTP 400 on subsequent turns (issue #14488).
-        """
-        ec = {"google": {"thought_signature": "SIG_ABC123"}}
-        tc = ToolCall(id="1", name="fn", arguments="{}", provider_data={"extra_content": ec})
-        assert getattr(tc, "extra_content", None) == ec
-
-        tc_no_extra = ToolCall(id="1", name="fn", arguments="{}")
-        assert getattr(tc_no_extra, "extra_content", None) is None
-
-
-class TestNormalizedResponseBackwardCompat:
-    """Test properties that replaced _nr_to_assistant_message() shim."""
-
-    def test_reasoning_content_from_provider_data(self):
-        nr = NormalizedResponse(
-            content="hi", tool_calls=None, finish_reason="stop",
-            provider_data={"reasoning_content": "thought process"},
-        )
-        assert nr.reasoning_content == "thought process"
-
-    def test_reasoning_content_none_when_absent(self):
-        nr = NormalizedResponse(content="hi", tool_calls=None, finish_reason="stop")
-        assert nr.reasoning_content is None
-
-    def test_reasoning_details_from_provider_data(self):
-        details = [{"type": "thinking", "thinking": "hmm"}]
-        nr = NormalizedResponse(
-            content="hi", tool_calls=None, finish_reason="stop",
-            provider_data={"reasoning_details": details},
-        )
-        assert nr.reasoning_details == details
-
-    def test_reasoning_details_none_when_no_provider_data(self):
-        nr = NormalizedResponse(
-            content="hi", tool_calls=None, finish_reason="stop",
-            provider_data=None,
-        )
-        assert nr.reasoning_details is None
-
-    def test_codex_reasoning_items_from_provider_data(self):
-        items = ["item1", "item2"]
-        nr = NormalizedResponse(
-            content="hi", tool_calls=None, finish_reason="stop",
-            provider_data={"codex_reasoning_items": items},
-        )
-        assert nr.codex_reasoning_items == items
-
-    def test_codex_reasoning_items_none_when_absent(self):
-        nr = NormalizedResponse(content="hi", tool_calls=None, finish_reason="stop")
-        assert nr.codex_reasoning_items is None
@@ -566,35 +566,6 @@ class TestGetDueJobs:
        assert get_job("oneshot-stale")["next_run_at"] is None


-class TestEnabledToolsets:
-    def test_enabled_toolsets_stored(self, tmp_cron_dir):
-        job = create_job(prompt="monitor", schedule="every 1h", enabled_toolsets=["web", "terminal"])
-        assert job["enabled_toolsets"] == ["web", "terminal"]
-
-    def test_enabled_toolsets_persisted(self, tmp_cron_dir):
-        job = create_job(prompt="monitor", schedule="every 1h", enabled_toolsets=["web", "file"])
-        fetched = get_job(job["id"])
-        assert fetched["enabled_toolsets"] == ["web", "file"]
-
-    def test_enabled_toolsets_none_when_omitted(self, tmp_cron_dir):
-        job = create_job(prompt="monitor", schedule="every 1h")
-        assert job["enabled_toolsets"] is None
-
-    def test_enabled_toolsets_empty_list_normalizes_to_none(self, tmp_cron_dir):
-        job = create_job(prompt="monitor", schedule="every 1h", enabled_toolsets=[])
-        assert job["enabled_toolsets"] is None
-
-    def test_enabled_toolsets_whitespace_entries_stripped(self, tmp_cron_dir):
-        job = create_job(prompt="monitor", schedule="every 1h", enabled_toolsets=["web", " ", "file"])
-        assert job["enabled_toolsets"] == ["web", "file"]
-
-    def test_enabled_toolsets_updated_via_update_job(self, tmp_cron_dir):
-        job = create_job(prompt="monitor", schedule="every 1h")
-        update_job(job["id"], {"enabled_toolsets": ["web", "delegation"]})
-        fetched = get_job(job["id"])
-        assert fetched["enabled_toolsets"] == ["web", "delegation"]
-
-
 class TestSaveJobOutput:
    def test_creates_output_file(self, tmp_cron_dir):
        output_file = save_job_output("test123", "# Results\nEverything ok.")
@@ -673,100 +673,6 @@ class TestRunJobSessionPersistence:
        assert call_args[0][1] == "cron_complete"
        fake_db.close.assert_called_once()

-    def _make_run_job_patches(self, tmp_path):
-        """Common patches for run_job tests."""
-        fake_db = MagicMock()
-        return fake_db, [
-            patch("cron.scheduler._hermes_home", tmp_path),
-            patch("cron.scheduler._resolve_origin", return_value=None),
-            patch("dotenv.load_dotenv"),
-            patch("hermes_state.SessionDB", return_value=fake_db),
-            patch(
-                "hermes_cli.runtime_provider.resolve_runtime_provider",
-                return_value={
-                    "api_key": "test-key",
-                    "base_url": "https://example.invalid/v1",
-                    "provider": "openrouter",
-                    "api_mode": "chat_completions",
-                },
-            ),
-        ]
-
-    def test_run_job_passes_enabled_toolsets_to_agent(self, tmp_path):
-        job = {
-            "id": "toolset-job",
-            "name": "test",
-            "prompt": "hello",
-            "enabled_toolsets": ["web", "terminal", "file"],
-        }
-        fake_db, patches = self._make_run_job_patches(tmp_path)
-        with patches[0], patches[1], patches[2], patches[3], patches[4], \
-             patch("run_agent.AIAgent") as mock_agent_cls:
-            mock_agent = MagicMock()
-            mock_agent.run_conversation.return_value = {"final_response": "ok"}
-            mock_agent_cls.return_value = mock_agent
-            run_job(job)
-
-        kwargs = mock_agent_cls.call_args.kwargs
-        assert kwargs["enabled_toolsets"] == ["web", "terminal", "file"]
-
-    def test_run_job_enabled_toolsets_resolves_from_platform_config_when_not_set(self, tmp_path):
-        """When a job has no explicit enabled_toolsets, the scheduler now
-        resolves them from ``hermes tools`` platform config for ``cron``
-        (PR #14xxx — blanket fix for Norbert's surprise ``moa`` run).
-
-        The legacy "pass None → AIAgent loads full default" path is still
-        reachable, but only when ``_get_platform_tools`` raises (safety net
-        for any unexpected config shape).
-        """
-        job = {
-            "id": "no-toolset-job",
-            "name": "test",
-            "prompt": "hello",
-        }
-        fake_db, patches = self._make_run_job_patches(tmp_path)
-        with patches[0], patches[1], patches[2], patches[3], patches[4], \
-             patch("run_agent.AIAgent") as mock_agent_cls:
-            mock_agent = MagicMock()
-            mock_agent.run_conversation.return_value = {"final_response": "ok"}
-            mock_agent_cls.return_value = mock_agent
-            run_job(job)
-
-        kwargs = mock_agent_cls.call_args.kwargs
-        # Resolution happened — not None, is a list.
-        assert isinstance(kwargs["enabled_toolsets"], list)
-        # The cron default is _HERMES_CORE_TOOLS with _DEFAULT_OFF_TOOLSETS
-        # (``moa``, ``homeassistant``, ``rl``) removed. The most important
-        # invariant: ``moa`` is NOT in the default cron toolset, so a cron
-        # run cannot accidentally spin up frontier models.
-        assert "moa" not in kwargs["enabled_toolsets"]
-
-    def test_run_job_per_job_toolsets_win_over_platform_config(self, tmp_path):
-        """Per-job enabled_toolsets (via cronjob tool) always take precedence
-        over the platform-level ``hermes tools`` config."""
-        job = {
-            "id": "override-job",
-            "name": "test",
-            "prompt": "hello",
-            "enabled_toolsets": ["terminal"],
-        }
-        fake_db, patches = self._make_run_job_patches(tmp_path)
-        # Even if the user has ``hermes tools`` configured to enable web+file
-        # for cron, the per-job override wins.
-        with patches[0], patches[1], patches[2], patches[3], patches[4], \
-             patch("run_agent.AIAgent") as mock_agent_cls, \
-             patch(
-                 "hermes_cli.tools_config._get_platform_tools",
-                 return_value={"web", "file"},
-             ):
-            mock_agent = MagicMock()
-            mock_agent.run_conversation.return_value = {"final_response": "ok"}
-            mock_agent_cls.return_value = mock_agent
-            run_job(job)
-
-        kwargs = mock_agent_cls.call_args.kwargs
-        assert kwargs["enabled_toolsets"] == ["terminal"]
-
    def test_run_job_empty_response_returns_empty_not_placeholder(self, tmp_path):
        """Empty final_response should stay empty for delivery logic (issue #2234).

@@ -95,7 +95,6 @@ class TestBusySessionAck:
    async def test_sends_ack_when_agent_running(self):
        """First message during busy session should get a status ack."""
        runner, sentinel = _make_runner()
-        runner._busy_input_mode = "interrupt"
        adapter = _make_adapter()

        event = _make_event(text="Are you working?")
@@ -128,42 +127,16 @@ class TestBusySessionAck:
        assert "Interrupting" in content or "respond" in content
        assert "/stop" not in content  # no need — we ARE interrupting

+        # Verify message was queued in adapter pending
+        assert sk in adapter._pending_messages
+
        # Verify agent interrupt was called
        agent.interrupt.assert_called_once_with("Are you working?")

-    @pytest.mark.asyncio
-    async def test_queue_mode_suppresses_interrupt_and_updates_ack(self):
-        """When busy_input_mode is 'queue', message is queued WITHOUT interrupt."""
-        runner, sentinel = _make_runner()
-        runner._busy_input_mode = "queue"
-        adapter = _make_adapter()
-
-        event = _make_event(text="Add this to queue")
-        sk = build_session_key(event.source)
-        runner.adapters[event.source.platform] = adapter
-
-        agent = MagicMock()
-        runner._running_agents[sk] = agent
-
-        with patch("gateway.run.merge_pending_message_event"):
-            await runner._handle_active_session_busy_message(event, sk)
-
-        # VERIFY: Agent was NOT interrupted
-        agent.interrupt.assert_not_called()
-
-        # VERIFY: Ack sent with queue-specific wording
-        adapter._send_with_retry.assert_called_once()
-        call_kwargs = adapter._send_with_retry.call_args
-        content = call_kwargs.kwargs.get("content") or call_kwargs[1].get("content", "")
-        assert "Queued for the next turn" in content
-        assert "respond once the current task finishes" in content
-        assert "Interrupting" not in content
-
    @pytest.mark.asyncio
    async def test_debounce_suppresses_rapid_acks(self):
        """Second message within 30s should NOT send another ack."""
        runner, sentinel = _make_runner()
-        runner._busy_input_mode = "interrupt"
        adapter = _make_adapter()

        event1 = _make_event(text="hello?")
@@ -199,14 +172,13 @@ class TestBusySessionAck:
        assert result2 is True
        assert adapter._send_with_retry.call_count == 1  # still 1, no new ack

-        # But interrupt should still be called for both (since we are in interrupt mode)
+        # But interrupt should still be called for both
        assert agent.interrupt.call_count == 2

    @pytest.mark.asyncio
    async def test_ack_after_cooldown_expires(self):
        """After 30s cooldown, a new message should send a fresh ack."""
        runner, sentinel = _make_runner()
-        runner._busy_input_mode = "interrupt"
        adapter = _make_adapter()

        event = _make_event(text="hello?")
@@ -240,7 +212,6 @@ class TestBusySessionAck:
    async def test_includes_status_detail(self):
        """Ack message should include iteration and tool info when available."""
        runner, sentinel = _make_runner()
-        runner._busy_input_mode = "interrupt"
        adapter = _make_adapter()

        event = _make_event(text="yo")
@@ -272,7 +243,6 @@ class TestBusySessionAck:
        """Draining case should still produce the drain-specific message."""
        runner, sentinel = _make_runner()
        runner._draining = True
-        runner._busy_input_mode = "interrupt"
        adapter = _make_adapter()

        event = _make_event(text="hello")
@@ -294,7 +264,6 @@ class TestBusySessionAck:
    async def test_pending_sentinel_no_interrupt(self):
        """When agent is PENDING_SENTINEL, don't call interrupt (it has no method)."""
        runner, sentinel = _make_runner()
-        runner._busy_input_mode = "interrupt"
        adapter = _make_adapter()

        event = _make_event(text="hey")
@@ -1,28 +1,22 @@
 """Regression tests for the TUI gateway's `complete.path` handler.

-Reported during the TUI v2 blitz retest:
-  - typing `@folder:` (and `@folder` with no colon yet) surfaced files
-    alongside directories — the gateway-side completion lives in
-    `tui_gateway/server.py` and was never touched by the earlier fix to
-    `hermes_cli/commands.py`.
-  - typing `@appChrome` required the full `@ui-tui/src/components/app…`
-    path to find the file — users expect Cmd-P-style fuzzy basename
-    matching across the repo, not a strict directory prefix filter.
+Reported during the TUI v2 blitz retest: typing `@folder:` (and `@folder`
+with no colon yet) still surfaced files alongside directories in the
+TUI composer, because the gateway-side completion lives in
+`tui_gateway/server.py` and was never touched by the earlier fix to
+`hermes_cli/commands.py`.

 Covers:
  - `@folder:` only yields directories
  - `@file:` only yields regular files
  - Bare `@folder` / `@file` (no colon) lists cwd directly
  - Explicit prefix is preserved in the completion text
-  - `@<name>` with no slash fuzzy-matches basenames anywhere in the tree
 """

 from __future__ import annotations

 from pathlib import Path

-import pytest
-
 from tui_gateway import server


@@ -39,15 +33,6 @@ def _items(word: str):
    return [(it["text"], it["display"], it.get("meta", "")) for it in resp["result"]["items"]]


-@pytest.fixture(autouse=True)
-def _reset_fuzzy_cache(monkeypatch):
-    # Each test walks a fresh tmp dir; clear the cached listing so prior
-    # roots can't leak through the TTL window.
-    server._fuzzy_cache.clear()
-    yield
-    server._fuzzy_cache.clear()
-
-
 def test_at_folder_colon_only_dirs(tmp_path, monkeypatch):
    monkeypatch.chdir(tmp_path)
    _fixture(tmp_path)
@@ -104,176 +89,3 @@ def test_bare_at_still_shows_static_refs(tmp_path, monkeypatch):

    for expected in ("@diff", "@staged", "@file:", "@folder:", "@url:", "@git:"):
        assert expected in texts, f"missing static ref {expected!r} in {texts!r}"
-
-
-# ── Fuzzy basename matching ──────────────────────────────────────────────
-# Users shouldn't have to know the full path — typing `@appChrome` should
-# find `ui-tui/src/components/appChrome.tsx`.
-
-
-def _nested_fixture(tmp_path: Path):
-    (tmp_path / "readme.md").write_text("x")
-    (tmp_path / ".env").write_text("x")
-    (tmp_path / "ui-tui/src/components").mkdir(parents=True)
-    (tmp_path / "ui-tui/src/components/appChrome.tsx").write_text("x")
-    (tmp_path / "ui-tui/src/components/appLayout.tsx").write_text("x")
-    (tmp_path / "ui-tui/src/components/thinking.tsx").write_text("x")
-    (tmp_path / "ui-tui/src/hooks").mkdir(parents=True)
-    (tmp_path / "ui-tui/src/hooks/useCompletion.ts").write_text("x")
-    (tmp_path / "tui_gateway").mkdir()
-    (tmp_path / "tui_gateway/server.py").write_text("x")
-
-
-def test_fuzzy_at_finds_file_without_directory_prefix(tmp_path, monkeypatch):
-    """`@appChrome` — with no slash — should surface the nested file."""
-    monkeypatch.chdir(tmp_path)
-    _nested_fixture(tmp_path)
-
-    entries = _items("@appChrome")
-    texts = [t for t, _, _ in entries]
-
-    assert "@file:ui-tui/src/components/appChrome.tsx" in texts, texts
-
-    # Display is the basename, meta is the containing directory, so the
-    # picker can show `appChrome.tsx  ui-tui/src/components` on one row.
-    row = next(r for r in entries if r[0] == "@file:ui-tui/src/components/appChrome.tsx")
-    assert row[1] == "appChrome.tsx"
-    assert row[2] == "ui-tui/src/components"
-
-
-def test_fuzzy_ranks_exact_before_prefix_before_subseq(tmp_path, monkeypatch):
-    """Better matches sort before weaker matches regardless of path depth."""
-    monkeypatch.chdir(tmp_path)
-    _nested_fixture(tmp_path)
-    (tmp_path / "server.py").write_text("x")  # exact basename match at root
-
-    texts = [t for t, _, _ in _items("@server")]
-
-    # Exact `server.py` beats `tui_gateway/server.py` (prefix match) — both
-    # rank 1 on basename but exact basename wins on the sort key; shorter
-    # rel path breaks ties.
-    assert texts[0] == "@file:server.py", texts
-    assert "@file:tui_gateway/server.py" in texts
-
-
-def test_fuzzy_camelcase_word_boundary(tmp_path, monkeypatch):
-    """Mid-basename camelCase pieces match without substring scanning."""
-    monkeypatch.chdir(tmp_path)
-    _nested_fixture(tmp_path)
-
-    texts = [t for t, _, _ in _items("@Chrome")]
-
-    # `Chrome` starts a camelCase word inside `appChrome.tsx`.
-    assert "@file:ui-tui/src/components/appChrome.tsx" in texts, texts
-
-
-def test_fuzzy_subsequence_catches_sparse_queries(tmp_path, monkeypatch):
-    """`@uCo` → `useCompletion.ts` via subsequence, last-resort tier."""
-    monkeypatch.chdir(tmp_path)
-    _nested_fixture(tmp_path)
-
-    texts = [t for t, _, _ in _items("@uCo")]
-
-    assert "@file:ui-tui/src/hooks/useCompletion.ts" in texts, texts
-
-
-def test_fuzzy_at_file_prefix_preserved(tmp_path, monkeypatch):
-    """Explicit `@file:` prefix still wins the completion tag."""
-    monkeypatch.chdir(tmp_path)
-    _nested_fixture(tmp_path)
-
-    texts = [t for t, _, _ in _items("@file:appChrome")]
-
-    assert "@file:ui-tui/src/components/appChrome.tsx" in texts, texts
-
-
-def test_fuzzy_skipped_when_path_has_slash(tmp_path, monkeypatch):
-    """Any `/` in the query = user is navigating; keep directory listing."""
-    monkeypatch.chdir(tmp_path)
-    _nested_fixture(tmp_path)
-
-    texts = [t for t, _, _ in _items("@ui-tui/src/components/app")]
-
-    # Directory-listing mode prefixes with `@file:` / `@folder:` per entry.
-    # It should only surface direct children of the named dir — not the
-    # nested `useCompletion.ts`.
-    assert any("appChrome.tsx" in t for t in texts), texts
-    assert not any("useCompletion.ts" in t for t in texts), texts
-
-
-def test_fuzzy_skipped_when_folder_tag(tmp_path, monkeypatch):
-    """`@folder:<name>` still lists directories — fuzzy scanner only walks
-    files (git-tracked + untracked), so defer to the dir-listing path."""
-    monkeypatch.chdir(tmp_path)
-    _nested_fixture(tmp_path)
-
-    texts = [t for t, _, _ in _items("@folder:ui")]
-
-    # Root has `ui-tui/` as a directory; the listing branch should surface it.
-    assert any(t.startswith("@folder:ui-tui") for t in texts), texts
-
-
-def test_fuzzy_hides_dotfiles_unless_asked(tmp_path, monkeypatch):
-    """`.env` doesn't leak into `@env` but does show for `@.env`."""
-    monkeypatch.chdir(tmp_path)
-    _nested_fixture(tmp_path)
-
-    assert not any(".env" in t for t, _, _ in _items("@env"))
-    assert any(t.endswith(".env") for t, _, _ in _items("@.env"))
-
-
-def test_fuzzy_caps_results(tmp_path, monkeypatch):
-    """The 30-item cap survives a big tree."""
-    monkeypatch.chdir(tmp_path)
-    for i in range(60):
-        (tmp_path / f"mod_{i:03d}.py").write_text("x")
-
-    items = _items("@mod")
-
-    assert len(items) == 30
-
-
-def test_fuzzy_paths_relative_to_cwd_inside_subdir(tmp_path, monkeypatch):
-    """When the gateway runs from a subdirectory of a git repo, fuzzy
-    completion paths must resolve under that cwd — not under the repo root.
-
-    Without this, `@appChrome` from inside `apps/web/` would suggest
-    `@file:apps/web/src/foo.tsx` but the agent (resolving from cwd) would
-    look for `apps/web/apps/web/src/foo.tsx` and fail. We translate every
-    `git ls-files` result back to a `relpath(root)` and drop anything
-    outside `root` so the completion contract stays "paths are cwd-relative".
-    """
-    import subprocess
-
-    subprocess.run(["git", "init", "-q"], cwd=tmp_path, check=True)
-    subprocess.run(["git", "config", "user.email", "test@example.com"], cwd=tmp_path, check=True)
-    subprocess.run(["git", "config", "user.name", "test"], cwd=tmp_path, check=True)
-
-    (tmp_path / "apps" / "web" / "src").mkdir(parents=True)
-    (tmp_path / "apps" / "web" / "src" / "appChrome.tsx").write_text("x")
-    (tmp_path / "apps" / "api" / "src").mkdir(parents=True)
-    (tmp_path / "apps" / "api" / "src" / "server.ts").write_text("x")
-    (tmp_path / "README.md").write_text("x")
-
-    subprocess.run(["git", "add", "."], cwd=tmp_path, check=True)
-    subprocess.run(["git", "commit", "-q", "-m", "init"], cwd=tmp_path, check=True)
-
-    # Run from `apps/web/` — completions should be relative to here, and
-    # files outside this subtree (apps/api, README.md at root) shouldn't
-    # appear at all.
-    monkeypatch.chdir(tmp_path / "apps" / "web")
-
-    texts = [t for t, _, _ in _items("@appChrome")]
-
-    assert "@file:src/appChrome.tsx" in texts, texts
-    assert not any("apps/web/" in t for t in texts), texts
-
-    server._fuzzy_cache.clear()
-    other_texts = [t for t, _, _ in _items("@server")]
-
-    assert not any("server.ts" in t for t in other_texts), other_texts
-
-    server._fuzzy_cache.clear()
-    readme_texts = [t for t, _, _ in _items("@README")]
-
-    assert not any("README.md" in t for t in readme_texts), readme_texts
@@ -73,29 +73,18 @@ from gateway.platforms.discord import DiscordAdapter  # noqa: E402
 class FakeTree:
    def __init__(self):
        self.sync = AsyncMock(return_value=[])
-        self.fetch_commands = AsyncMock(return_value=[])
-        self._commands = []

    def command(self, *args, **kwargs):
        return lambda fn: fn

-    def get_commands(self, *args, **kwargs):
-        return list(self._commands)
-

 class FakeBot:
    def __init__(self, *, intents, proxy=None, allowed_mentions=None, **_):
        self.intents = intents
        self.allowed_mentions = allowed_mentions
-        self.application_id = 999
        self.user = SimpleNamespace(id=999, name="Hermes")
        self._events = {}
        self.tree = FakeTree()
-        self.http = SimpleNamespace(
-            upsert_global_command=AsyncMock(),
-            edit_global_command=AsyncMock(),
-            delete_global_command=AsyncMock(),
-        )

    def event(self, fn):
        self._events[fn.__name__] = fn
@@ -210,7 +199,6 @@ async def test_connect_releases_token_lock_on_timeout(monkeypatch):
 async def test_connect_does_not_wait_for_slash_sync(monkeypatch):
    adapter = DiscordAdapter(PlatformConfig(enabled=True, token="test-token"))

-    monkeypatch.setenv("DISCORD_COMMAND_SYNC_POLICY", "bulk")
    monkeypatch.setattr("gateway.status.acquire_scoped_lock", lambda scope, identity, metadata=None: (True, None))
    monkeypatch.setattr("gateway.status.release_scoped_lock", lambda scope, identity: None)

@@ -238,420 +226,3 @@ async def test_connect_does_not_wait_for_slash_sync(monkeypatch):
    created["bot"].tree.allow_finish.set()
    await asyncio.sleep(0)
    await adapter.disconnect()
-
-
-@pytest.mark.asyncio
-async def test_connect_respects_slash_commands_opt_out(monkeypatch):
-    adapter = DiscordAdapter(
-        PlatformConfig(enabled=True, token="test-token", extra={"slash_commands": False})
-    )
-
-    monkeypatch.setenv("DISCORD_COMMAND_SYNC_POLICY", "off")
-    monkeypatch.setattr("gateway.status.acquire_scoped_lock", lambda scope, identity, metadata=None: (True, None))
-    monkeypatch.setattr("gateway.status.release_scoped_lock", lambda scope, identity: None)
-
-    intents = SimpleNamespace(message_content=False, dm_messages=False, guild_messages=False, members=False, voice_states=False)
-    monkeypatch.setattr(discord_platform.Intents, "default", lambda: intents)
-    monkeypatch.setattr(
-        discord_platform.commands,
-        "Bot",
-        lambda **kwargs: FakeBot(
-            intents=kwargs["intents"],
-            proxy=kwargs.get("proxy"),
-            allowed_mentions=kwargs.get("allowed_mentions"),
-        ),
-    )
-    register_mock = MagicMock()
-    monkeypatch.setattr(adapter, "_register_slash_commands", register_mock)
-    monkeypatch.setattr(adapter, "_resolve_allowed_usernames", AsyncMock())
-
-    ok = await adapter.connect()
-
-    assert ok is True
-    register_mock.assert_not_called()
-
-    await adapter.disconnect()
-
-
-@pytest.mark.asyncio
-async def test_safe_sync_slash_commands_only_mutates_diffs():
-    adapter = DiscordAdapter(PlatformConfig(enabled=True, token="test-token"))
-
-    class _DesiredCommand:
-        def __init__(self, payload):
-            self._payload = payload
-
-        def to_dict(self, tree):
-            assert tree is not None
-            return dict(self._payload)
-
-    class _ExistingCommand:
-        def __init__(self, command_id, payload):
-            self.id = command_id
-            self.name = payload["name"]
-            self.type = SimpleNamespace(value=payload["type"])
-            self._payload = payload
-
-        def to_dict(self):
-            return {
-                "id": self.id,
-                "application_id": 999,
-                **self._payload,
-                "name_localizations": {},
-                "description_localizations": {},
-            }
-
-    desired_same = {
-        "name": "status",
-        "description": "Show Hermes session status",
-        "type": 1,
-        "options": [],
-        "nsfw": False,
-        "dm_permission": True,
-        "default_member_permissions": None,
-    }
-    desired_updated = {
-        "name": "help",
-        "description": "Show available commands",
-        "type": 1,
-        "options": [],
-        "nsfw": False,
-        "dm_permission": True,
-        "default_member_permissions": None,
-    }
-    desired_created = {
-        "name": "metricas",
-        "description": "Show Colmeio metrics dashboard",
-        "type": 1,
-        "options": [],
-        "nsfw": False,
-        "dm_permission": True,
-        "default_member_permissions": None,
-    }
-    existing_same = _ExistingCommand(11, desired_same)
-    existing_updated = _ExistingCommand(
-        12,
-        {
-            **desired_updated,
-            "description": "Old help text",
-        },
-    )
-    existing_deleted = _ExistingCommand(
-        13,
-        {
-            "name": "old-command",
-            "description": "To be deleted",
-            "type": 1,
-            "options": [],
-            "nsfw": False,
-            "dm_permission": True,
-            "default_member_permissions": None,
-        },
-    )
-
-    fake_tree = SimpleNamespace(
-        get_commands=lambda: [
-            _DesiredCommand(desired_same),
-            _DesiredCommand(desired_updated),
-            _DesiredCommand(desired_created),
-        ],
-        fetch_commands=AsyncMock(return_value=[existing_same, existing_updated, existing_deleted]),
-    )
-    fake_http = SimpleNamespace(
-        upsert_global_command=AsyncMock(),
-        edit_global_command=AsyncMock(),
-        delete_global_command=AsyncMock(),
-    )
-    adapter._client = SimpleNamespace(
-        tree=fake_tree,
-        http=fake_http,
-        application_id=999,
-        user=SimpleNamespace(id=999),
-    )
-
-    summary = await adapter._safe_sync_slash_commands()
-
-    assert summary == {
-        "total": 3,
-        "unchanged": 1,
-        "updated": 1,
-        "recreated": 0,
-        "created": 1,
-        "deleted": 1,
-    }
-    fake_http.edit_global_command.assert_awaited_once_with(999, 12, desired_updated)
-    fake_http.upsert_global_command.assert_awaited_once_with(999, desired_created)
-    fake_http.delete_global_command.assert_awaited_once_with(999, 13)
-
-
-@pytest.mark.asyncio
-async def test_safe_sync_slash_commands_recreates_metadata_only_diffs():
-    adapter = DiscordAdapter(PlatformConfig(enabled=True, token="test-token"))
-
-    class _DesiredCommand:
-        def __init__(self, payload):
-            self._payload = payload
-
-        def to_dict(self, tree):
-            assert tree is not None
-            return dict(self._payload)
-
-    class _ExistingCommand:
-        def __init__(self, command_id, payload):
-            self.id = command_id
-            self.name = payload["name"]
-            self.type = SimpleNamespace(value=payload["type"])
-            self._payload = payload
-
-        def to_dict(self):
-            return {
-                "id": self.id,
-                "application_id": 999,
-                **self._payload,
-                "name_localizations": {},
-                "description_localizations": {},
-            }
-
-    desired = {
-        "name": "help",
-        "description": "Show available commands",
-        "type": 1,
-        "options": [],
-        "nsfw": False,
-        "dm_permission": True,
-        "default_member_permissions": "8",
-    }
-    existing = _ExistingCommand(
-        12,
-        {
-            **desired,
-            "default_member_permissions": None,
-        },
-    )
-
-    fake_tree = SimpleNamespace(
-        get_commands=lambda: [_DesiredCommand(desired)],
-        fetch_commands=AsyncMock(return_value=[existing]),
-    )
-    fake_http = SimpleNamespace(
-        upsert_global_command=AsyncMock(),
-        edit_global_command=AsyncMock(),
-        delete_global_command=AsyncMock(),
-    )
-    adapter._client = SimpleNamespace(
-        tree=fake_tree,
-        http=fake_http,
-        application_id=999,
-        user=SimpleNamespace(id=999),
-    )
-
-    summary = await adapter._safe_sync_slash_commands()
-
-    assert summary == {
-        "total": 1,
-        "unchanged": 0,
-        "updated": 0,
-        "recreated": 1,
-        "created": 0,
-        "deleted": 0,
-    }
-    fake_http.edit_global_command.assert_not_awaited()
-    fake_http.delete_global_command.assert_awaited_once_with(999, 12)
-    fake_http.upsert_global_command.assert_awaited_once_with(999, desired)
-
-
-@pytest.mark.asyncio
-async def test_post_connect_initialization_skips_sync_when_policy_off(monkeypatch):
-    adapter = DiscordAdapter(PlatformConfig(enabled=True, token="test-token"))
-    monkeypatch.setenv("DISCORD_COMMAND_SYNC_POLICY", "off")
-
-    fake_tree = SimpleNamespace(sync=AsyncMock())
-    adapter._client = SimpleNamespace(tree=fake_tree)
-
-    await adapter._run_post_connect_initialization()
-
-    fake_tree.sync.assert_not_called()
-
-
-@pytest.mark.asyncio
-async def test_safe_sync_reads_permission_attrs_from_existing_command():
-    """Regression: AppCommand.to_dict() in discord.py does NOT include
-    nsfw, dm_permission, or default_member_permissions — they live only
-    on the attributes. Without reading those attrs, any command with
-    non-default permissions false-diffs on every startup.
-    """
-    adapter = DiscordAdapter(PlatformConfig(enabled=True, token="test-token"))
-
-    class _DesiredCommand:
-        def __init__(self, payload):
-            self._payload = payload
-
-        def to_dict(self, tree):
-            return dict(self._payload)
-
-    class _ExistingCommand:
-        """Mirrors discord.py's AppCommand — to_dict() omits nsfw/dm/perms."""
-
-        def __init__(self, command_id, name, description, *, nsfw, guild_only, default_permissions):
-            self.id = command_id
-            self.name = name
-            self.description = description
-            self.type = SimpleNamespace(value=1)
-            self.nsfw = nsfw
-            self.guild_only = guild_only
-            self.default_member_permissions = (
-                SimpleNamespace(value=default_permissions)
-                if default_permissions is not None
-                else None
-            )
-
-        def to_dict(self):
-            # Match real AppCommand.to_dict() — no nsfw/dm_permission/default_member_permissions
-            return {
-                "id": self.id,
-                "type": 1,
-                "application_id": 999,
-                "name": self.name,
-                "description": self.description,
-                "name_localizations": {},
-                "description_localizations": {},
-                "options": [],
-            }
-
-    desired = {
-        "name": "admin",
-        "description": "Admin-only command",
-        "type": 1,
-        "options": [],
-        "nsfw": True,
-        "dm_permission": False,
-        "default_member_permissions": "8",
-    }
-    # Existing command has matching attrs — should report unchanged, NOT falsely diff.
-    existing = _ExistingCommand(
-        42,
-        "admin",
-        "Admin-only command",
-        nsfw=True,
-        guild_only=True,
-        default_permissions=8,
-    )
-
-    fake_tree = SimpleNamespace(
-        get_commands=lambda: [_DesiredCommand(desired)],
-        fetch_commands=AsyncMock(return_value=[existing]),
-    )
-    fake_http = SimpleNamespace(
-        upsert_global_command=AsyncMock(),
-        edit_global_command=AsyncMock(),
-        delete_global_command=AsyncMock(),
-    )
-    adapter._client = SimpleNamespace(
-        tree=fake_tree,
-        http=fake_http,
-        application_id=999,
-        user=SimpleNamespace(id=999),
-    )
-
-    summary = await adapter._safe_sync_slash_commands()
-
-    # Without the fix, this would be unchanged=0, recreated=1 (false diff).
-    assert summary == {
-        "total": 1,
-        "unchanged": 1,
-        "updated": 0,
-        "recreated": 0,
-        "created": 0,
-        "deleted": 0,
-    }
-    fake_http.edit_global_command.assert_not_awaited()
-    fake_http.delete_global_command.assert_not_awaited()
-    fake_http.upsert_global_command.assert_not_awaited()
-
-
-@pytest.mark.asyncio
-async def test_safe_sync_detects_contexts_drift():
-    """Regression: contexts and integration_types must be canonicalized
-    so drift in those fields triggers reconciliation. Without this, the
-    diff silently reports 'unchanged' and never reconciles.
-    """
-    adapter = DiscordAdapter(PlatformConfig(enabled=True, token="test-token"))
-
-    class _DesiredCommand:
-        def __init__(self, payload):
-            self._payload = payload
-
-        def to_dict(self, tree):
-            return dict(self._payload)
-
-    class _ExistingCommand:
-        def __init__(self, command_id, payload):
-            self.id = command_id
-            self.name = payload["name"]
-            self.description = payload["description"]
-            self.type = SimpleNamespace(value=1)
-            self.nsfw = payload.get("nsfw", False)
-            self.guild_only = not payload.get("dm_permission", True)
-            self.default_member_permissions = None
-            self._payload = payload
-
-        def to_dict(self):
-            return {
-                "id": self.id,
-                "type": 1,
-                "application_id": 999,
-                "name": self.name,
-                "description": self.description,
-                "name_localizations": {},
-                "description_localizations": {},
-                "options": [],
-                "contexts": self._payload.get("contexts"),
-                "integration_types": self._payload.get("integration_types"),
-            }
-
-    desired = {
-        "name": "help",
-        "description": "Show available commands",
-        "type": 1,
-        "options": [],
-        "nsfw": False,
-        "dm_permission": True,
-        "default_member_permissions": None,
-        "contexts": [0, 1, 2],
-        "integration_types": [0, 1],
-    }
-    existing = _ExistingCommand(
-        77,
-        {
-            **desired,
-            "contexts": [0],  # server-side only
-            "integration_types": [0],
-        },
-    )
-
-    fake_tree = SimpleNamespace(
-        get_commands=lambda: [_DesiredCommand(desired)],
-        fetch_commands=AsyncMock(return_value=[existing]),
-    )
-    fake_http = SimpleNamespace(
-        upsert_global_command=AsyncMock(),
-        edit_global_command=AsyncMock(),
-        delete_global_command=AsyncMock(),
-    )
-    adapter._client = SimpleNamespace(
-        tree=fake_tree,
-        http=fake_http,
-        application_id=999,
-        user=SimpleNamespace(id=999),
-    )
-
-    summary = await adapter._safe_sync_slash_commands()
-
-    # contexts and integration_types are not patchable by
-    # edit_global_command, so the command must be recreated.
-    assert summary["unchanged"] == 0
-    assert summary["recreated"] == 1
-    assert summary["updated"] == 0
-    fake_http.edit_global_command.assert_not_awaited()
-    fake_http.delete_global_command.assert_awaited_once_with(999, 77)
-    fake_http.upsert_global_command.assert_awaited_once_with(999, desired)
@@ -145,86 +145,3 @@ async def test_drain_active_agents_throttles_status_updates():
    # Start, one count-change update, and final update. Allow one extra update
    # if the loop observes the zero-agent state before exiting.
    assert 3 <= runner._update_runtime_status.call_count <= 4
-
-
-@pytest.mark.asyncio
-async def test_gateway_stop_kills_tool_subprocesses_before_adapter_disconnect_on_timeout(monkeypatch):
-    """On drain timeout, tool subprocesses must be killed BEFORE adapter
-    disconnect so systemd's TimeoutStopSec doesn't SIGKILL the cgroup with
-    bash/sleep children still attached (#8202)."""
-    runner, adapter = make_restart_runner()
-    runner._restart_drain_timeout = 0.01  # force timeout path
-
-    call_order: list[str] = []
-
-    def _fake_kill_all(task_id=None):
-        call_order.append("kill_all")
-        return 2
-
-    def _fake_cleanup_envs():
-        call_order.append("cleanup_environments")
-
-    def _fake_cleanup_browsers():
-        call_order.append("cleanup_browsers")
-
-    async def _disconnect():
-        call_order.append("disconnect")
-
-    # Patch the module-level names the stop() helper imports lazily.
-    import tools.process_registry as _pr
-    import tools.terminal_tool as _tt
-    import tools.browser_tool as _bt
-    monkeypatch.setattr(_pr.process_registry, "kill_all", _fake_kill_all)
-    monkeypatch.setattr(_tt, "cleanup_all_environments", _fake_cleanup_envs)
-    monkeypatch.setattr(_bt, "cleanup_all_browsers", _fake_cleanup_browsers)
-
-    adapter.disconnect = _disconnect
-
-    runner._running_agents = {"session": MagicMock()}
-
-    with patch("gateway.status.remove_pid_file"), patch("gateway.status.write_runtime_status"):
-        await runner.stop()
-
-    # First kill_all must precede the first disconnect.  (Both the eager
-    # post-interrupt cleanup and the final catch-all call
-    # _kill_tool_subprocesses, so we expect kill_all to appear twice total.)
-    assert "kill_all" in call_order
-    assert "disconnect" in call_order
-    first_kill = call_order.index("kill_all")
-    first_disconnect = call_order.index("disconnect")
-    assert first_kill < first_disconnect, (
-        f"Tool subprocesses must be killed before adapter disconnect on "
-        f"drain timeout, got order: {call_order}"
-    )
-    # Defense-in-depth final cleanup still runs.
-    assert call_order.count("kill_all") >= 2
-
-
-@pytest.mark.asyncio
-async def test_gateway_stop_kills_tool_subprocesses_on_graceful_path(monkeypatch):
-    """Graceful shutdown (no drain timeout) must still kill tool subprocesses
-    exactly once via the final catch-all — regression guard against
-    accidentally removing that call when refactoring."""
-    runner, adapter = make_restart_runner()
-    adapter.disconnect = AsyncMock()
-
-    kill_count = 0
-
-    def _fake_kill_all(task_id=None):
-        nonlocal kill_count
-        kill_count += 1
-        return 0
-
-    import tools.process_registry as _pr
-    import tools.terminal_tool as _tt
-    import tools.browser_tool as _bt
-    monkeypatch.setattr(_pr.process_registry, "kill_all", _fake_kill_all)
-    monkeypatch.setattr(_tt, "cleanup_all_environments", lambda: None)
-    monkeypatch.setattr(_bt, "cleanup_all_browsers", lambda: None)
-
-    # No running agents → drain returns immediately, no timeout, no eager cleanup.
-    with patch("gateway.status.remove_pid_file"), patch("gateway.status.write_runtime_status"):
-        await runner.stop()
-
-    # Only the final catch-all fires on the graceful path.
-    assert kill_count == 1
@@ -193,10 +193,7 @@ async def test_start_gateway_replace_force_uses_terminate_pid(monkeypatch, tmp_p
        _pid_state["alive"] = False
    monkeypatch.setattr("gateway.status.get_running_pid", _mock_get_running_pid)
    monkeypatch.setattr("gateway.status.remove_pid_file", _mock_remove_pid_file)
-    monkeypatch.setattr(
-        "gateway.status.release_all_scoped_locks",
-        lambda **kwargs: 0,
-    )
+    monkeypatch.setattr("gateway.status.release_all_scoped_locks", lambda: 0)
    monkeypatch.setattr("gateway.status.terminate_pid", lambda pid, force=False: calls.append((pid, force)))
    monkeypatch.setattr("gateway.run.os.getpid", lambda: 100)
    monkeypatch.setattr("gateway.run.os.kill", lambda pid, sig: None)
@@ -270,10 +267,7 @@ async def test_start_gateway_replace_writes_takeover_marker_before_sigterm(
        _pid_state["alive"] = False
    monkeypatch.setattr("gateway.status.get_running_pid", _mock_get_running_pid)
    monkeypatch.setattr("gateway.status.remove_pid_file", _mock_remove_pid_file)
-    monkeypatch.setattr(
-        "gateway.status.release_all_scoped_locks",
-        lambda **kwargs: 0,
-    )
+    monkeypatch.setattr("gateway.status.release_all_scoped_locks", lambda: 0)
    monkeypatch.setattr("gateway.status.write_takeover_marker", record_write_marker)
    monkeypatch.setattr("gateway.status.terminate_pid", record_terminate)
    monkeypatch.setattr("gateway.run.os.getpid", lambda: 100)
@@ -1,399 +0,0 @@
-"""Regression tests for issue #11016 — Telegram sessions trapped in
-repeated 'Interrupting current task...' while /stop reports no active task.
-
-Covers three layers of the fix:
-
-1. Adapter-side task ownership (_session_tasks map): /stop, /new, /reset
-   actually cancel the in-flight adapter task and release the guard in
-   order, so follow-up messages reach the new session.
-
-2. Adapter-side on-entry self-heal: if _active_sessions still has an
-   entry but the recorded owner task is already done/cancelled, clear it
-   on the next inbound message rather than trapping the user.
-
-3. Runner-side generation guard: a stale async run can't promote itself
-   into _running_agents after /stop or /new bumped the generation, and
-   can't clear a newer run's slot on the way out.
-"""
-
-import asyncio
-from unittest.mock import AsyncMock, MagicMock, patch
-
-import pytest
-
-from gateway.config import GatewayConfig, Platform, PlatformConfig
-from gateway.platforms.base import (
-    BasePlatformAdapter,
-    MessageEvent,
-    MessageType,
-)
-from gateway.run import GatewayRunner, _AGENT_PENDING_SENTINEL
-from gateway.session import SessionSource, build_session_key
-
-
-# ---------------------------------------------------------------------------
-# Adapter helpers
-# ---------------------------------------------------------------------------
-
-
-class _StubAdapter(BasePlatformAdapter):
-    async def connect(self):
-        pass
-
-    async def disconnect(self):
-        pass
-
-    async def send(self, chat_id, text, **kwargs):
-        pass
-
-    async def get_chat_info(self, chat_id):
-        return {}
-
-
-def _make_adapter():
-    config = PlatformConfig(enabled=True, token="test-token")
-    adapter = _StubAdapter(config, Platform.TELEGRAM)
-    adapter.sent_responses = []
-
-    async def _mock_send_retry(chat_id, content, **kwargs):
-        adapter.sent_responses.append(content)
-
-    adapter._send_with_retry = _mock_send_retry
-    return adapter
-
-
-def _make_event(text="hello", chat_id="12345"):
-    source = SessionSource(
-        platform=Platform.TELEGRAM, chat_id=chat_id, chat_type="dm"
-    )
-    return MessageEvent(text=text, message_type=MessageType.TEXT, source=source)
-
-
-def _session_key(chat_id="12345"):
-    source = SessionSource(
-        platform=Platform.TELEGRAM, chat_id=chat_id, chat_type="dm"
-    )
-    return build_session_key(source)
-
-
-# ---------------------------------------------------------------------------
-# Runner helpers
-# ---------------------------------------------------------------------------
-
-
-def _make_runner():
-    runner = object.__new__(GatewayRunner)
-    runner.config = GatewayConfig(
-        platforms={Platform.TELEGRAM: PlatformConfig(enabled=True, token="***")}
-    )
-    runner.adapters = {}
-    runner._running_agents = {}
-    runner._running_agents_ts = {}
-    runner._session_run_generation = {}
-    runner._pending_messages = {}
-    runner._draining = False
-    runner._update_runtime_status = MagicMock()
-    return runner
-
-
-# ===========================================================================
-# Layer 1: Adapter-side session cancellation on /stop /new /reset
-# ===========================================================================
-
-
-class TestAdapterSessionCancellation:
-    @pytest.mark.asyncio
-    @pytest.mark.parametrize("command_text", ["/stop", "/new", "/reset"])
-    async def test_command_cancels_active_task_and_unblocks_follow_up(
-        self, command_text
-    ):
-        """/stop /new /reset must cancel the adapter task and let follow-ups through."""
-        adapter = _make_adapter()
-        sk = _session_key()
-        processing_started = asyncio.Event()
-        processing_cancelled = asyncio.Event()
-        blocked_first_message = True
-
-        async def _handler(event):
-            nonlocal blocked_first_message
-            cmd = event.get_command()
-            if cmd in {"stop", "new", "reset", "model"}:
-                return f"handled:{cmd}"
-
-            if blocked_first_message:
-                blocked_first_message = False
-                processing_started.set()
-                try:
-                    await asyncio.Event().wait()
-                except asyncio.CancelledError:
-                    processing_cancelled.set()
-                    raise
-            return f"handled:text:{event.text}"
-
-        adapter._message_handler = _handler
-
-        await adapter.handle_message(_make_event("hello world"))
-        await processing_started.wait()
-        await asyncio.sleep(0)
-
-        assert sk in adapter._active_sessions
-        assert sk in adapter._session_tasks
-
-        await adapter.handle_message(_make_event(command_text))
-
-        assert processing_cancelled.is_set(), (
-            f"{command_text} did not cancel the active processing task"
-        )
-        assert sk not in adapter._active_sessions
-        assert sk not in adapter._pending_messages
-        assert sk not in adapter._session_tasks
-        expected = command_text.lstrip("/")
-        assert any(f"handled:{expected}" in r for r in adapter.sent_responses)
-
-        # Follow-up must go through normally now that the session is clean.
-        await adapter.handle_message(
-            _make_event("/model xiaomi/mimo-v2-pro --provider nous")
-        )
-        await asyncio.sleep(0)
-        await asyncio.sleep(0)
-
-        assert any("handled:model" in r for r in adapter.sent_responses), (
-            f"follow-up /model stayed blocked after {command_text}"
-        )
-        assert sk not in adapter._pending_messages
-
-    @pytest.mark.asyncio
-    async def test_new_keeps_guard_until_command_finishes_then_runs_follow_up(self):
-        """/new must finish runner logic before cancelling old work or releasing the guard."""
-        adapter = _make_adapter()
-        sk = _session_key()
-        processing_started = asyncio.Event()
-        command_started = asyncio.Event()
-        allow_command_finish = asyncio.Event()
-        follow_up_processed = asyncio.Event()
-        call_order = []
-
-        async def _handler(event):
-            cmd = event.get_command()
-            if cmd == "new":
-                call_order.append("command:start")
-                command_started.set()
-                await allow_command_finish.wait()
-                call_order.append("command:end")
-                return "handled:new"
-
-            if event.text == "hello world":
-                processing_started.set()
-                try:
-                    await asyncio.Event().wait()
-                except asyncio.CancelledError:
-                    call_order.append("original:cancelled")
-                    raise
-
-            if event.text == "after reset":
-                call_order.append("followup:processed")
-                follow_up_processed.set()
-            return f"handled:text:{event.text}"
-
-        adapter._message_handler = _handler
-
-        await adapter.handle_message(_make_event("hello world"))
-        await processing_started.wait()
-
-        command_task = asyncio.create_task(adapter.handle_message(_make_event("/new")))
-        await command_started.wait()
-        await asyncio.sleep(0)
-
-        assert sk in adapter._active_sessions
-
-        await adapter.handle_message(_make_event("after reset"))
-        await asyncio.sleep(0)
-        await asyncio.sleep(0)
-
-        assert sk in adapter._active_sessions, "guard must stay active while /new is still running"
-        assert sk in adapter._pending_messages, "follow-up should stay queued until /new finishes"
-        assert not follow_up_processed.is_set(), "follow-up ran before /new completed"
-        assert "original:cancelled" not in call_order, "old task was cancelled before runner completed /new"
-
-        allow_command_finish.set()
-        await command_task
-        await asyncio.wait_for(follow_up_processed.wait(), timeout=1.0)
-
-        assert any("handled:new" in r for r in adapter.sent_responses)
-        assert call_order.index("command:end") < call_order.index("original:cancelled")
-        assert call_order.index("original:cancelled") < call_order.index("followup:processed")
-        assert sk not in adapter._pending_messages
-
-
-# ===========================================================================
-# Layer 2: Adapter-side on-entry self-heal for stale session locks
-# ===========================================================================
-
-
-class TestStaleSessionLockSelfHeal:
-    @pytest.mark.asyncio
-    async def test_stale_lock_with_done_task_is_healed_on_next_message(self):
-        """A split-brain guard (owner task done but entry still live) heals on next inbound."""
-        adapter = _make_adapter()
-        sk = _session_key()
-
-        # Simulate the production split-brain: an _active_sessions entry and
-        # a recorded owner task both remain, but that task is already done.
-        async def _done():
-            return None
-
-        done_task = asyncio.create_task(_done())
-        await done_task
-        assert done_task.done()
-
-        adapter._active_sessions[sk] = asyncio.Event()
-        adapter._session_tasks[sk] = done_task
-
-        assert adapter._session_task_is_stale(sk)
-
-        async def _handler(event):
-            return f"handled:{event.get_command() or 'text'}"
-
-        adapter._message_handler = _handler
-
-        # An ordinary message should heal the stale lock, then fall through
-        # to normal dispatch.  User gets a reply instead of a busy ack.
-        await adapter.handle_message(_make_event("hello"))
-        # Drain any spawned background tasks.
-        for _ in range(5):
-            await asyncio.sleep(0)
-
-        assert any("handled:text" in r for r in adapter.sent_responses), (
-            "stale lock trapped a normal message — split-brain not healed"
-        )
-
-    def test_no_owner_task_is_not_treated_as_stale(self):
-        """If _session_tasks has no entry at all, the guard isn't stale.
-
-        Tests and rare legitimate code paths install _active_sessions
-        entries directly.  Auto-healing those would break real fixtures.
-        """
-        adapter = _make_adapter()
-        sk = _session_key()
-
-        adapter._active_sessions[sk] = asyncio.Event()
-        # No _session_tasks entry.
-
-        assert adapter._session_task_is_stale(sk) is False
-        assert adapter._heal_stale_session_lock(sk) is False
-
-    def test_live_owner_task_is_not_stale(self):
-        """When the owner task is alive, do NOT heal — agent is really busy."""
-        adapter = _make_adapter()
-        sk = _session_key()
-
-        fake_task = MagicMock()
-        fake_task.done.return_value = False
-        adapter._active_sessions[sk] = asyncio.Event()
-        adapter._session_tasks[sk] = fake_task
-
-        assert adapter._session_task_is_stale(sk) is False
-        assert adapter._heal_stale_session_lock(sk) is False
-        # Lock still in place.
-        assert sk in adapter._active_sessions
-        assert sk in adapter._session_tasks
-
-
-# ===========================================================================
-# Layer 3: Runner-side generation guard on slot promotion + release
-# ===========================================================================
-
-
-class TestRunnerSessionGenerationGuard:
-    def test_release_without_generation_behaves_as_before(self):
-        runner = _make_runner()
-        sk = "agent:main:telegram:dm:12345"
-        runner._running_agents[sk] = "agent"
-        runner._running_agents_ts[sk] = 1.0
-        assert runner._release_running_agent_state(sk) is True
-        assert sk not in runner._running_agents
-        assert sk not in runner._running_agents_ts
-
-    def test_release_with_current_generation_clears_slot(self):
-        runner = _make_runner()
-        sk = "agent:main:telegram:dm:12345"
-        gen = runner._begin_session_run_generation(sk)
-        runner._running_agents[sk] = "agent"
-        runner._running_agents_ts[sk] = 1.0
-
-        assert runner._release_running_agent_state(sk, run_generation=gen) is True
-        assert sk not in runner._running_agents
-
-    def test_release_with_stale_generation_blocks(self):
-        runner = _make_runner()
-        sk = "agent:main:telegram:dm:12345"
-        stale_gen = runner._begin_session_run_generation(sk)
-        # /stop bumps the generation — stale run's generation is no longer current.
-        runner._invalidate_session_run_generation(sk, reason="stop")
-        # The fresh run lands next; imagine it has its own state installed.
-        runner._running_agents[sk] = "fresh_agent"
-        runner._running_agents_ts[sk] = 2.0
-
-        # Stale run's unwind MUST NOT clobber the fresh run's state.
-        released = runner._release_running_agent_state(sk, run_generation=stale_gen)
-
-        assert released is False
-        assert runner._running_agents[sk] == "fresh_agent"
-        assert runner._running_agents_ts[sk] == 2.0
-
-    def test_is_session_run_current_tracks_bumps(self):
-        runner = _make_runner()
-        sk = "agent:main:telegram:dm:12345"
-        gen1 = runner._begin_session_run_generation(sk)
-        assert runner._is_session_run_current(sk, gen1) is True
-
-        runner._invalidate_session_run_generation(sk, reason="test")
-        assert runner._is_session_run_current(sk, gen1) is False
-
-        gen2 = runner._begin_session_run_generation(sk)
-        assert gen2 > gen1
-        assert runner._is_session_run_current(sk, gen2) is True
-
-
-# ===========================================================================
-# Layer 1 (regression): old task's finally must NOT delete a newer guard
-# ===========================================================================
-
-
-class TestOldTaskCannotClobberNewerGuard:
-    """Direct regression for the unconditional-delete bug.
-
-    Before the guard-match fix, a task in its finally would delete
-    ``_active_sessions[session_key]`` unconditionally — even if a
-    /stop or /new command had already swapped in its own command_guard
-    (which then gets clobbered, opening a race for follow-up messages).
-    """
-
-    def test_release_session_guard_matches_on_event_identity(self):
-        adapter = _make_adapter()
-        sk = _session_key()
-
-        old_guard = asyncio.Event()
-        new_guard = asyncio.Event()
-        # Command swapped in a newer guard.
-        adapter._active_sessions[sk] = new_guard
-
-        # Old task tries to release using its captured (stale) guard.
-        adapter._release_session_guard(sk, guard=old_guard)
-
-        # The newer guard survives.
-        assert adapter._active_sessions.get(sk) is new_guard
-
-        # Now the command itself releases using the matching guard.
-        adapter._release_session_guard(sk, guard=new_guard)
-        assert sk not in adapter._active_sessions
-
-    def test_release_session_guard_without_guard_releases_unconditionally(self):
-        adapter = _make_adapter()
-        sk = _session_key()
-        adapter._active_sessions[sk] = asyncio.Event()
-        # Callers that don't know the guard (e.g. cancel_session_processing's
-        # default path) still work.
-        adapter._release_session_guard(sk)
-        assert sk not in adapter._active_sessions
-
@@ -404,53 +404,6 @@ class TestScopedLocks:
        status.release_scoped_lock("telegram-bot-token", "secret")
        assert not lock_path.exists()

-    def test_release_all_scoped_locks_can_target_single_owner(self, tmp_path, monkeypatch):
-        monkeypatch.setenv("HERMES_GATEWAY_LOCK_DIR", str(tmp_path / "locks"))
-        lock_dir = tmp_path / "locks"
-        lock_dir.mkdir(parents=True, exist_ok=True)
-
-        target_lock = lock_dir / "telegram-bot-token-target.lock"
-        other_lock = lock_dir / "slack-app-token-other.lock"
-        target_lock.write_text(json.dumps({
-            "pid": 111,
-            "start_time": 222,
-            "kind": "hermes-gateway",
-        }))
-        other_lock.write_text(json.dumps({
-            "pid": 999,
-            "start_time": 333,
-            "kind": "hermes-gateway",
-        }))
-
-        removed = status.release_all_scoped_locks(
-            owner_pid=111,
-            owner_start_time=222,
-        )
-
-        assert removed == 1
-        assert not target_lock.exists()
-        assert other_lock.exists()
-
-    def test_release_all_scoped_locks_skips_pid_reuse_mismatch(self, tmp_path, monkeypatch):
-        monkeypatch.setenv("HERMES_GATEWAY_LOCK_DIR", str(tmp_path / "locks"))
-        lock_dir = tmp_path / "locks"
-        lock_dir.mkdir(parents=True, exist_ok=True)
-
-        reused_pid_lock = lock_dir / "telegram-bot-token-reused.lock"
-        reused_pid_lock.write_text(json.dumps({
-            "pid": 111,
-            "start_time": 999,
-            "kind": "hermes-gateway",
-        }))
-
-        removed = status.release_all_scoped_locks(
-            owner_pid=111,
-            owner_start_time=222,
-        )
-
-        assert removed == 0
-        assert reused_pid_lock.exists()
-

 class TestTakeoverMarker:
    """Tests for the --replace takeover marker.
@@ -68,68 +68,3 @@ def test_build_welcome_banner_uses_normalized_toolset_names():
    assert "homeassistant_tools:" not in output
    assert "honcho_tools:" not in output
    assert "web_tools:" not in output
-
-
-def test_build_welcome_banner_title_is_hyperlinked_to_release():
-    """Panel title (version label) is wrapped in an OSC-8 hyperlink to the GitHub release."""
-    import io
-    from unittest.mock import patch as _patch
-    import hermes_cli.banner as _banner
-    import model_tools as _mt
-    import tools.mcp_tool as _mcp
-
-    _banner._latest_release_cache = None
-    tag_url = ("v2026.4.23", "https://github.com/NousResearch/hermes-agent/releases/tag/v2026.4.23")
-
-    buf = io.StringIO()
-    with (
-        _patch.object(_mt, "check_tool_availability", return_value=(["web"], [])),
-        _patch.object(_banner, "get_available_skills", return_value={}),
-        _patch.object(_banner, "get_update_result", return_value=None),
-        _patch.object(_mcp, "get_mcp_status", return_value=[]),
-        _patch.object(_banner, "get_latest_release_tag", return_value=tag_url),
-    ):
-        console = Console(file=buf, force_terminal=True, color_system="truecolor", width=160)
-        _banner.build_welcome_banner(
-            console=console, model="x", cwd="/tmp",
-            session_id="abc123",
-            tools=[{"function": {"name": "read_file"}}],
-            get_toolset_for_tool=lambda n: "file",
-        )
-
-    raw = buf.getvalue()
-    # The existing version label must still be present in the title
-    assert "Hermes Agent v" in raw, "Version label missing from title"
-    # OSC-8 hyperlink escape sequence present with the release URL
-    assert "\x1b]8;" in raw, "OSC-8 hyperlink not emitted"
-    assert "releases/tag/v2026.4.23" in raw, "Release URL missing from banner output"
-
-
-def test_build_welcome_banner_title_falls_back_when_no_tag():
-    """Without a resolvable tag, the panel title renders as plain text (no hyperlink escape)."""
-    import io
-    from unittest.mock import patch as _patch
-    import hermes_cli.banner as _banner
-    import model_tools as _mt
-    import tools.mcp_tool as _mcp
-
-    _banner._latest_release_cache = None
-    buf = io.StringIO()
-    with (
-        _patch.object(_mt, "check_tool_availability", return_value=(["web"], [])),
-        _patch.object(_banner, "get_available_skills", return_value={}),
-        _patch.object(_banner, "get_update_result", return_value=None),
-        _patch.object(_mcp, "get_mcp_status", return_value=[]),
-        _patch.object(_banner, "get_latest_release_tag", return_value=None),
-    ):
-        console = Console(file=buf, force_terminal=True, color_system="truecolor", width=160)
-        _banner.build_welcome_banner(
-            console=console, model="x", cwd="/tmp",
-            session_id="abc123",
-            tools=[{"function": {"name": "read_file"}}],
-            get_toolset_for_tool=lambda n: "file",
-        )
-
-    raw = buf.getvalue()
-    assert "Hermes Agent v" in raw, "Version label missing from title"
-    assert "\x1b]8;" not in raw, "OSC-8 hyperlink should not be emitted without a tag"
@@ -5,8 +5,6 @@ import pwd
 from pathlib import Path
 from types import SimpleNamespace

-import pytest
-
 import hermes_cli.gateway as gateway_cli
 from gateway.restart import (
    DEFAULT_GATEWAY_RESTART_DRAIN_TIMEOUT,
@@ -95,10 +93,7 @@ class TestGeneratedSystemdUnits:
        assert "ExecStop=" not in unit
        assert "ExecReload=/bin/kill -USR1 $MAINPID" in unit
        assert f"RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}" in unit
-        # TimeoutStopSec must exceed the default drain_timeout (60s) so
-        # systemd doesn't SIGKILL the cgroup before post-interrupt cleanup
-        # (tool subprocess kill, adapter disconnect) runs — issue #8202.
-        assert "TimeoutStopSec=90" in unit
+        assert "TimeoutStopSec=60" in unit

    def test_user_unit_includes_resolved_node_directory_in_path(self, monkeypatch):
        monkeypatch.setattr(gateway_cli.shutil, "which", lambda cmd: "/home/test/.nvm/versions/node/v24.14.0/bin/node" if cmd == "node" else None)
@@ -114,10 +109,7 @@ class TestGeneratedSystemdUnits:
        assert "ExecStop=" not in unit
        assert "ExecReload=/bin/kill -USR1 $MAINPID" in unit
        assert f"RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}" in unit
-        # TimeoutStopSec must exceed the default drain_timeout (60s) so
-        # systemd doesn't SIGKILL the cgroup before post-interrupt cleanup
-        # (tool subprocess kill, adapter disconnect) runs — issue #8202.
-        assert "TimeoutStopSec=90" in unit
+        assert "TimeoutStopSec=60" in unit
        assert "WantedBy=multi-user.target" in unit


@@ -1091,116 +1083,6 @@ class TestEnsureUserSystemdEnv:
        assert calls == []


-class TestPreflightUserSystemd:
-    """Tests for _preflight_user_systemd() — D-Bus reachability before systemctl --user.
-
-    Covers issue #5130 / Rick's RHEL 9.6 SSH scenario: setup tries to start the
-    gateway via ``systemctl --user start`` in a shell with no user D-Bus session,
-    which previously failed with a raw ``CalledProcessError`` and no remediation.
-    """
-
-    def test_noop_when_bus_socket_exists(self, monkeypatch):
-        """Socket already there (desktop / linger + prior login) → no-op."""
-        monkeypatch.setattr(
-            gateway_cli, "_user_dbus_socket_path",
-            lambda: type("P", (), {"exists": lambda self: True})(),
-        )
-        # Should not raise, no subprocess calls needed.
-        gateway_cli._preflight_user_systemd()
-
-    def test_raises_when_linger_disabled_and_loginctl_denied(self, monkeypatch):
-        """Rick's scenario: no D-Bus, no linger, non-root SSH → clear error."""
-        monkeypatch.setattr(
-            gateway_cli, "_user_dbus_socket_path",
-            lambda: type("P", (), {"exists": lambda self: False})(),
-        )
-        monkeypatch.setattr(
-            gateway_cli, "get_systemd_linger_status", lambda: (False, ""),
-        )
-        monkeypatch.setattr(gateway_cli.shutil, "which", lambda _: "/usr/bin/loginctl")
-
-        class _Result:
-            returncode = 1
-            stdout = ""
-            stderr = "Interactive authentication required."
-
-        monkeypatch.setattr(
-            gateway_cli.subprocess, "run", lambda *a, **kw: _Result(),
-        )
-
-        with pytest.raises(gateway_cli.UserSystemdUnavailableError) as exc_info:
-            gateway_cli._preflight_user_systemd()
-
-        msg = str(exc_info.value)
-        assert "sudo loginctl enable-linger" in msg
-        assert "hermes gateway run" in msg  # foreground fallback mentioned
-        assert "Interactive authentication required" in msg
-
-    def test_raises_when_loginctl_missing(self, monkeypatch):
-        """No loginctl binary at all → suggest sudo install + manual fix."""
-        monkeypatch.setattr(
-            gateway_cli, "_user_dbus_socket_path",
-            lambda: type("P", (), {"exists": lambda self: False})(),
-        )
-        monkeypatch.setattr(
-            gateway_cli, "get_systemd_linger_status",
-            lambda: (None, "loginctl not found"),
-        )
-        monkeypatch.setattr(gateway_cli.shutil, "which", lambda _: None)
-
-        with pytest.raises(gateway_cli.UserSystemdUnavailableError) as exc_info:
-            gateway_cli._preflight_user_systemd()
-
-        assert "sudo loginctl enable-linger" in str(exc_info.value)
-
-    def test_linger_enabled_but_socket_still_missing(self, monkeypatch):
-        """Edge case: linger says yes but the bus socket never came up."""
-        monkeypatch.setattr(
-            gateway_cli, "_user_dbus_socket_path",
-            lambda: type("P", (), {"exists": lambda self: False})(),
-        )
-        monkeypatch.setattr(
-            gateway_cli, "get_systemd_linger_status", lambda: (True, ""),
-        )
-        monkeypatch.setattr(
-            gateway_cli, "_wait_for_user_dbus_socket", lambda timeout=3.0: False,
-        )
-
-        with pytest.raises(gateway_cli.UserSystemdUnavailableError) as exc_info:
-            gateway_cli._preflight_user_systemd()
-
-        assert "linger is enabled" in str(exc_info.value)
-
-    def test_enable_linger_succeeds_and_socket_appears(self, monkeypatch, capsys):
-        """Happy remediation path: polkit allows enable-linger, socket spawns."""
-        monkeypatch.setattr(
-            gateway_cli, "_user_dbus_socket_path",
-            lambda: type("P", (), {"exists": lambda self: False})(),
-        )
-        monkeypatch.setattr(
-            gateway_cli, "get_systemd_linger_status", lambda: (False, ""),
-        )
-        monkeypatch.setattr(gateway_cli.shutil, "which", lambda _: "/usr/bin/loginctl")
-
-        class _OkResult:
-            returncode = 0
-            stdout = ""
-            stderr = ""
-
-        monkeypatch.setattr(
-            gateway_cli.subprocess, "run", lambda *a, **kw: _OkResult(),
-        )
-        monkeypatch.setattr(
-            gateway_cli, "_wait_for_user_dbus_socket",
-            lambda timeout=5.0: True,
-        )
-
-        # Should not raise.
-        gateway_cli._preflight_user_systemd()
-        out = capsys.readouterr().out
-        assert "Enabled linger" in out
-
-
 class TestProfileArg:
    """Tests for _profile_arg — returns '--profile <name>' for named profiles."""

@@ -1,245 +0,0 @@
-"""Tests for --ignore-user-config and --ignore-rules flags on `hermes chat`.
-
-Ported from openai/codex#18646 (`feat: add --ignore-user-config and --ignore-rules`).
-Codex's flags fully isolate a run from user-level config and exec-policy .rules
-files. In Hermes the equivalent isolation is:
-
-* ``--ignore-user-config`` → skip ``~/.hermes/config.yaml`` in ``load_cli_config()``
-  (credentials in ``.env`` are still loaded).
-* ``--ignore-rules`` → skip AGENTS.md / SOUL.md / .cursorrules auto-injection
-  and persistent memory (maps to ``AIAgent(skip_context_files=True,
-  skip_memory=True)``).
-
-Both flags are wired via env vars so they work cleanly across the
-argparse → cmd_chat → cli.main() → HermesCLI → AIAgent call chain.
-"""
-
-from __future__ import annotations
-
-import os
-import textwrap
-import importlib
-
-import pytest
-
-
-@pytest.fixture(autouse=True)
-def _clean_env(monkeypatch):
-    """Ensure the two env-var gates start AND end each test in a known state.
-
-    Some tests here write directly to ``os.environ`` (mirroring the real
-    ``cmd_chat`` logic), so ``monkeypatch.delenv`` alone isn't enough —
-    those writes aren't tracked by monkeypatch and won't be undone by it.
-    We add explicit cleanup on yield to prevent cross-test pollution.
-    """
-    for var in ("HERMES_IGNORE_USER_CONFIG", "HERMES_IGNORE_RULES"):
-        monkeypatch.delenv(var, raising=False)
-    yield
-    for var in ("HERMES_IGNORE_USER_CONFIG", "HERMES_IGNORE_RULES"):
-        os.environ.pop(var, None)
-
-
-class TestIgnoreUserConfigEnvGate:
-    """``load_cli_config()`` must honour ``HERMES_IGNORE_USER_CONFIG=1``.
-
-    When the env var is set, user config at ``<hermes_home>/config.yaml`` is
-    skipped even if present — the function returns only the built-in defaults
-    (merged with the project-level ``cli-config.yaml`` fallback).
-    """
-
-    def _write_user_config(self, tmp_path, model_default):
-        config_yaml = textwrap.dedent(
-            f"""
-            model:
-              default: {model_default}
-              provider: openrouter
-            agent:
-              system_prompt: "from user config"
-            """
-        ).lstrip()
-        (tmp_path / "config.yaml").write_text(config_yaml)
-
-    def _reload_cli(self, monkeypatch, tmp_path):
-        """Point cli._hermes_home at tmp_path and return a fresh load_cli_config."""
-        import cli
-        monkeypatch.setattr(cli, "_hermes_home", tmp_path)
-        return cli.load_cli_config
-
-    def test_user_config_loaded_when_flag_unset(self, tmp_path, monkeypatch):
-        self._write_user_config(tmp_path, "anthropic/claude-sonnet-4.6")
-        load_cli_config = self._reload_cli(monkeypatch, tmp_path)
-
-        cfg = load_cli_config()
-
-        # User config value wins
-        assert cfg["model"]["default"] == "anthropic/claude-sonnet-4.6"
-        assert cfg["agent"]["system_prompt"] == "from user config"
-
-    def test_user_config_skipped_when_flag_set(self, tmp_path, monkeypatch):
-        """With HERMES_IGNORE_USER_CONFIG=1, user config.yaml is ignored.
-
-        The built-in default ``model.default`` is empty string (no user override),
-        and the user's ``agent.system_prompt`` is not seen.
-        """
-        self._write_user_config(tmp_path, "anthropic/claude-sonnet-4.6")
-        monkeypatch.setenv("HERMES_IGNORE_USER_CONFIG", "1")
-
-        load_cli_config = self._reload_cli(monkeypatch, tmp_path)
-        cfg = load_cli_config()
-
-        # User-set "system_prompt: from user config" MUST NOT leak through
-        assert cfg["agent"].get("system_prompt", "") != "from user config"
-
-        # User-set model.default MUST NOT leak through — either the built-in
-        # default ("" or unset) or a project-level fallback, but never the
-        # user's value
-        assert cfg["model"].get("default", "") != "anthropic/claude-sonnet-4.6"
-
-    def test_flag_ignored_when_set_to_other_value(self, tmp_path, monkeypatch):
-        """Only the literal value "1" activates the bypass, matching the yolo pattern."""
-        self._write_user_config(tmp_path, "anthropic/claude-sonnet-4.6")
-        monkeypatch.setenv("HERMES_IGNORE_USER_CONFIG", "true")  # not "1"
-
-        load_cli_config = self._reload_cli(monkeypatch, tmp_path)
-        cfg = load_cli_config()
-
-        # "true" != "1", so user config IS loaded
-        assert cfg["model"]["default"] == "anthropic/claude-sonnet-4.6"
-
-
-class TestIgnoreRulesEnvGate:
-    """The constructor / env var must propagate to ``HermesCLI.ignore_rules``
-    so ``AIAgent`` is built with ``skip_context_files=True`` and
-    ``skip_memory=True``.
-    """
-
-    def test_env_var_enables_ignore_rules(self, monkeypatch):
-        """Setting HERMES_IGNORE_RULES=1 flips HermesCLI.ignore_rules True."""
-        monkeypatch.setenv("HERMES_IGNORE_RULES", "1")
-
-        # Import HermesCLI lazily — cli.py has heavy module-init side effects
-        # that we don't want to run at test collection time.
-        import cli
-        importlib.reload(cli)
-
-        # Build only enough of HermesCLI to reach the ignore_rules assignment.
-        # The full __init__ pulls in provider/auth/session DB, so we cheat:
-        # create the object via object.__new__ and manually run the assignment
-        # the same way the real constructor does.
-        obj = object.__new__(cli.HermesCLI)
-        # Replicate the exact logic from cli.py HermesCLI.__init__:
-        ignore_rules = False  # constructor default
-        obj.ignore_rules = ignore_rules or os.environ.get("HERMES_IGNORE_RULES") == "1"
-
-        assert obj.ignore_rules is True
-
-    def test_constructor_flag_alone_enables_ignore_rules(self, monkeypatch):
-        monkeypatch.delenv("HERMES_IGNORE_RULES", raising=False)
-        import cli
-        obj = object.__new__(cli.HermesCLI)
-        ignore_rules = True  # constructor argument
-        obj.ignore_rules = ignore_rules or os.environ.get("HERMES_IGNORE_RULES") == "1"
-        assert obj.ignore_rules is True
-
-    def test_neither_flag_nor_env_leaves_rules_enabled(self, monkeypatch):
-        monkeypatch.delenv("HERMES_IGNORE_RULES", raising=False)
-        import cli
-        obj = object.__new__(cli.HermesCLI)
-        ignore_rules = False
-        obj.ignore_rules = ignore_rules or os.environ.get("HERMES_IGNORE_RULES") == "1"
-        assert obj.ignore_rules is False
-
-
-class TestCmdChatWiring:
-    """The wiring inside ``cmd_chat()`` in ``hermes_cli/main.py`` must set
-    both env vars before importing ``cli`` (which evaluates
-    ``load_cli_config()`` at module import).
-    """
-
-    def _simulate_cmd_chat_env_setup(self, args):
-        """Replicate the exact snippet from cmd_chat in main.py."""
-        if getattr(args, "ignore_user_config", False):
-            os.environ["HERMES_IGNORE_USER_CONFIG"] = "1"
-        if getattr(args, "ignore_rules", False):
-            os.environ["HERMES_IGNORE_RULES"] = "1"
-
-    def test_both_flags_set_both_env_vars(self, monkeypatch):
-        monkeypatch.delenv("HERMES_IGNORE_USER_CONFIG", raising=False)
-        monkeypatch.delenv("HERMES_IGNORE_RULES", raising=False)
-
-        class FakeArgs:
-            ignore_user_config = True
-            ignore_rules = True
-
-        self._simulate_cmd_chat_env_setup(FakeArgs())
-
-        assert os.environ.get("HERMES_IGNORE_USER_CONFIG") == "1"
-        assert os.environ.get("HERMES_IGNORE_RULES") == "1"
-
-    def test_only_ignore_user_config(self, monkeypatch):
-        monkeypatch.delenv("HERMES_IGNORE_USER_CONFIG", raising=False)
-        monkeypatch.delenv("HERMES_IGNORE_RULES", raising=False)
-
-        class FakeArgs:
-            ignore_user_config = True
-            ignore_rules = False
-
-        self._simulate_cmd_chat_env_setup(FakeArgs())
-
-        assert os.environ.get("HERMES_IGNORE_USER_CONFIG") == "1"
-        assert "HERMES_IGNORE_RULES" not in os.environ
-
-    def test_flags_absent_sets_nothing(self, monkeypatch):
-        monkeypatch.delenv("HERMES_IGNORE_USER_CONFIG", raising=False)
-        monkeypatch.delenv("HERMES_IGNORE_RULES", raising=False)
-
-        class FakeArgs:
-            pass  # no attributes at all — getattr fallback must handle
-
-        self._simulate_cmd_chat_env_setup(FakeArgs())
-
-        assert "HERMES_IGNORE_USER_CONFIG" not in os.environ
-        assert "HERMES_IGNORE_RULES" not in os.environ
-
-
-class TestArgparseFlagsRegistered:
-    """Verify the `chat` subparser actually exposes --ignore-user-config
-    and --ignore-rules. This is the contract test for the CLI surface.
-    """
-
-    def test_flags_present_in_chat_parser(self):
-        """Parse a synthetic chat invocation and check both attributes exist."""
-        # Minimal argparse tree matching the real chat subparser shape for the
-        # two flags under test. If someone removes the flag from main.py, this
-        # test keeps passing in isolation — but the E2E test below catches it.
-        import argparse
-        parser = argparse.ArgumentParser(prog="hermes")
-        subs = parser.add_subparsers(dest="command")
-        chat = subs.add_parser("chat")
-        chat.add_argument("--ignore-user-config", action="store_true", default=False)
-        chat.add_argument("--ignore-rules", action="store_true", default=False)
-
-        args = parser.parse_args(["chat", "--ignore-user-config", "--ignore-rules"])
-        assert args.ignore_user_config is True
-        assert args.ignore_rules is True
-
-    def test_main_py_registers_both_flags(self):
-        """E2E: the real hermes_cli/main.py parser accepts both flags.
-
-        We invoke the real argparse tree builder from hermes_cli.main.
-        """
-        import hermes_cli.main as hm
-
-        # The argparse tree is built inside main(), so extracting the real
-        # parser would mean catching the SystemExit raised by --help.
-        # Simpler: grep the module source for the flag strings. This is
-        # brittle, but it fails as soon as either flag is removed.
-        import inspect
-        src = inspect.getsource(hm)
-        assert '"--ignore-user-config"' in src, \
-            "chat subparser must register --ignore-user-config"
-        assert '"--ignore-rules"' in src, \
-            "chat subparser must register --ignore-rules"
-        # And the cmd_chat env-var wiring must be present
-        assert "HERMES_IGNORE_USER_CONFIG" in src
-        assert "HERMES_IGNORE_RULES" in src
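The env-var gate these removed tests describe reduces to two small rules: only the literal value "1" enables a flag, and the constructor argument is OR-ed with the environment. A minimal sketch follows, assuming hypothetical helper names (`env_flag_enabled`, `resolve_ignore_rules`) that do not appear in the real `cli.py`; only the variable names and the `== "1"` comparison come from the tests.

```python
import os
from typing import Mapping, Optional


def env_flag_enabled(name: str, environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when the variable is set to the literal string "1"."""
    env = os.environ if environ is None else environ
    return env.get(name) == "1"


def resolve_ignore_rules(
    constructor_flag: bool, environ: Optional[Mapping[str, str]] = None
) -> bool:
    """Combine the constructor argument with the HERMES_IGNORE_RULES gate."""
    return bool(constructor_flag) or env_flag_enabled("HERMES_IGNORE_RULES", environ)
```

Passing the environment in as a mapping keeps the check testable without writing to `os.environ`, which avoids the cross-test pollution the `_clean_env` fixture has to guard against.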
@@ -6,8 +6,6 @@ Covers `_plugin_image_gen_providers`, `_visible_providers`, and

 from __future__ import annotations

-from types import SimpleNamespace
-
 import pytest

 from agent import image_gen_registry
@@ -174,78 +172,3 @@ class TestConfigWriting:

        assert config["image_gen"]["provider"] == "noenv"
        assert config["image_gen"]["model"] == "noenv-model-v1"
-
-    def test_reconfiguring_plugin_provider_writes_provider_and_model(self, monkeypatch, tmp_path):
-        """The reconfigure path should switch image_gen away from managed FAL
-        and onto the selected plugin provider."""
-        from hermes_cli import tools_config
-
-        monkeypatch.setenv("HERMES_HOME", str(tmp_path))
-        image_gen_registry.register_provider(_FakeProvider("testopenai"))
-        monkeypatch.setattr(tools_config, "_prompt_choice", lambda *a, **kw: 0)
-        monkeypatch.setattr(tools_config, "_prompt", lambda *a, **kw: "")
-        monkeypatch.setattr(
-            tools_config,
-            "get_env_value",
-            lambda key: "sk-test" if key == "OPENAI_API_KEY" else "",
-        )
-
-        config = {"image_gen": {"use_gateway": True}}
-        provider_row = {
-            "name": "OpenAI",
-            "env_vars": [{"key": "OPENAI_API_KEY", "prompt": "OpenAI API key"}],
-            "image_gen_plugin_name": "testopenai",
-        }
-
-        tools_config._reconfigure_provider(provider_row, config)
-
-        assert config["image_gen"]["provider"] == "testopenai"
-        assert config["image_gen"]["model"] == "testopenai-model-v1"
-        assert config["image_gen"]["use_gateway"] is False
-
-    def test_plugin_provider_active_overrides_managed_nous_active_label(self, monkeypatch):
-        from hermes_cli import tools_config
-
-        monkeypatch.setattr(
-            tools_config,
-            "get_nous_subscription_features",
-            lambda config: SimpleNamespace(
-                features={"image_gen": SimpleNamespace(managed_by_nous=True)}
-            ),
-        )
-
-        config = {"image_gen": {"provider": "openai", "use_gateway": False}}
-        nous_row = {
-            "name": "Nous Subscription",
-            "managed_nous_feature": "image_gen",
-        }
-        openai_row = {
-            "name": "OpenAI",
-            "image_gen_plugin_name": "openai",
-        }
-
-        assert tools_config._is_provider_active(openai_row, config) is True
-        assert tools_config._is_provider_active(nous_row, config) is False
-
-    def test_reconfiguring_fal_clears_plugin_provider(self, monkeypatch):
-        from hermes_cli import tools_config
-
-        monkeypatch.setattr(tools_config, "_prompt_choice", lambda *a, **kw: 0)
-        monkeypatch.setattr(tools_config, "_prompt", lambda *a, **kw: "")
-        monkeypatch.setattr(
-            tools_config,
-            "get_env_value",
-            lambda key: "fal-key" if key == "FAL_KEY" else "",
-        )
-
-        config = {"image_gen": {"provider": "openai", "use_gateway": False}}
-        provider_row = {
-            "name": "FAL.ai",
-            "env_vars": [{"key": "FAL_KEY", "prompt": "FAL API key"}],
-            "imagegen_backend": "fal",
-        }
-
-        tools_config._reconfigure_provider(provider_row, config)
-
-        assert config["image_gen"]["provider"] == "fal"
-        assert config["image_gen"]["use_gateway"] is False
@@ -253,148 +253,3 @@ def test_list_dedupes_dict_model_matching_singular_default(monkeypatch):
    ds_rows = [p for p in providers if p["name"] == "DeepSeek"]
    assert ds_rows[0]["models"].count("deepseek-chat") == 1
    assert ds_rows[0]["models"] == ["deepseek-chat", "deepseek-reasoner"]
-
-
-
-# ─────────────────────────────────────────────────────────────────────────────
-# #9210: group custom_providers by (base_url, api_key) in /model picker
-# ─────────────────────────────────────────────────────────────────────────────
-
-def test_list_authenticated_providers_groups_same_endpoint(monkeypatch):
-    """Multiple custom_providers entries sharing a base_url+api_key must be
-    returned as a single picker row with all their models merged."""
-    monkeypatch.setattr("agent.models_dev.fetch_models_dev", lambda: {})
-    monkeypatch.setattr(providers_mod, "HERMES_OVERLAYS", {})
-
-    providers = list_authenticated_providers(
-        current_provider="custom",
-        current_base_url="http://localhost:11434/v1",
-        user_providers={},
-        custom_providers=[
-            {"name": "Ollama — MiniMax M2.7", "base_url": "http://localhost:11434/v1",
-             "api_key": "ollama", "model": "minimax-m2.7"},
-            {"name": "Ollama — GLM 5.1",      "base_url": "http://localhost:11434/v1",
-             "api_key": "ollama", "model": "glm-5.1"},
-            {"name": "Ollama — Qwen3-coder", "base_url": "http://localhost:11434/v1",
-             "api_key": "ollama", "model": "qwen3-coder"},
-        ],
-        max_models=50,
-    )
-
-    custom_groups = [p for p in providers if p.get("is_user_defined")]
-    assert len(custom_groups) == 1, (
-        "Expected 1 group for shared endpoint, got "
-        f"{[p['slug'] for p in custom_groups]}"
-    )
-    group = custom_groups[0]
-    assert set(group["models"]) == {"minimax-m2.7", "glm-5.1", "qwen3-coder"}
-    assert group["total_models"] == 3
-    # Per-model suffix stripped from display name
-    assert group["name"] == "Ollama"
-
-
-def test_list_authenticated_providers_current_endpoint_uses_current_slug(monkeypatch):
-    """When current_base_url matches the grouped endpoint, the slug must
-    equal current_provider so picker selection routes through the live
-    credential pipeline."""
-    monkeypatch.setattr("agent.models_dev.fetch_models_dev", lambda: {})
-    monkeypatch.setattr(providers_mod, "HERMES_OVERLAYS", {})
-
-    providers = list_authenticated_providers(
-        current_provider="custom",
-        current_base_url="http://localhost:11434/v1",
-        user_providers={},
-        custom_providers=[
-            {"name": "Ollama — GLM 5.1", "base_url": "http://localhost:11434/v1",
-             "api_key": "ollama", "model": "glm-5.1"},
-        ],
-        max_models=50,
-    )
-
-    matches = [p for p in providers if p.get("is_user_defined")]
-    assert len(matches) == 1
-    group = matches[0]
-    assert group["slug"] == "custom"
-    assert group["is_current"] is True
-
-
-def test_list_authenticated_providers_distinct_endpoints_stay_separate(monkeypatch):
-    """Entries with different base_urls must produce separate picker rows
-    even if some display names happen to be similar."""
-    monkeypatch.setattr("agent.models_dev.fetch_models_dev", lambda: {})
-    monkeypatch.setattr(providers_mod, "HERMES_OVERLAYS", {})
-
-    providers = list_authenticated_providers(
-        user_providers={},
-        custom_providers=[
-            {"name": "Ollama — GLM 5.1", "base_url": "http://localhost:11434/v1",
-             "api_key": "ollama", "model": "glm-5.1"},
-            {"name": "Moonshot", "base_url": "https://api.moonshot.cn/v1",
-             "api_key": "sk-m", "model": "moonshot-v1"},
-            {"name": "Ollama — Qwen3-coder", "base_url": "http://localhost:11434/v1",
-             "api_key": "ollama", "model": "qwen3-coder"},
-        ],
-        max_models=50,
-    )
-
-    custom_groups = [p for p in providers if p.get("is_user_defined")]
-    assert len(custom_groups) == 2
-    # Ollama endpoint collapses to one row with both models
-    ollama = next(p for p in custom_groups if p["name"] == "Ollama")
-    assert set(ollama["models"]) == {"glm-5.1", "qwen3-coder"}
-    moonshot = next(p for p in custom_groups if p["name"] == "Moonshot")
-    assert moonshot["models"] == ["moonshot-v1"]
-
-
-def test_list_authenticated_providers_same_url_different_keys_disambiguated(monkeypatch):
-    """Two custom_providers entries with the same base_url but different
-    api_keys (and identical cleaned names) must both stay visible in the
-    picker — slug is suffixed to disambiguate."""
-    monkeypatch.setattr("agent.models_dev.fetch_models_dev", lambda: {})
-    monkeypatch.setattr(providers_mod, "HERMES_OVERLAYS", {})
-
-    providers = list_authenticated_providers(
-        user_providers={},
-        custom_providers=[
-            {"name": "OpenAI — key A", "base_url": "https://api.openai.com/v1",
-             "api_key": "sk-AAA", "model": "gpt-5.4"},
-            {"name": "OpenAI — key B", "base_url": "https://api.openai.com/v1",
-             "api_key": "sk-BBB", "model": "gpt-4.6"},
-        ],
-        max_models=50,
-    )
-
-    custom_groups = [p for p in providers if p.get("is_user_defined")]
-    assert len(custom_groups) == 2
-    slugs = sorted(p["slug"] for p in custom_groups)
-    # First group keeps the base slug, second gets a numeric suffix
-    assert slugs == ["custom:openai", "custom:openai-2"]
-    # Each row has a distinct model
-    models = {p["slug"]: p["models"] for p in custom_groups}
-    assert models["custom:openai"] == ["gpt-5.4"]
-    assert models["custom:openai-2"] == ["gpt-4.6"]
-
-
-def test_list_authenticated_providers_total_models_reflects_grouped_count(monkeypatch):
-    """After grouping six entries into one row, total_models must reflect
-    the full count, and every grouped model appears in the list."""
-    monkeypatch.setattr("agent.models_dev.fetch_models_dev", lambda: {})
-    monkeypatch.setattr(providers_mod, "HERMES_OVERLAYS", {})
-
-    entries = [
-        {"name": f"Ollama \u2014 Model {i}", "base_url": "http://localhost:11434/v1",
-         "api_key": "ollama", "model": f"model-{i}"}
-        for i in range(6)
-    ]
-    providers = list_authenticated_providers(
-        user_providers={},
-        custom_providers=entries,
-        max_models=4,
-    )
-
-    groups = [p for p in providers if p.get("is_user_defined")]
-    assert len(groups) == 1
-    group = groups[0]
-    assert group["total_models"] == 6
-    # All six models are preserved in the grouped row.
-    assert sorted(group["models"]) == sorted(f"model-{i}" for i in range(6))
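The grouping rule tested above (one picker row per `(base_url, api_key)` pair, with the per-model suffix stripped from the display name) can be sketched as follows. This is an illustration under stated assumptions, not the real `list_authenticated_providers`: `group_custom_providers` is a hypothetical helper, and slug generation, `is_current` handling, and `max_models` are deliberately left out.

```python
from collections import OrderedDict
from typing import Dict, List

# Display names in the tests look like "Ollama <em dash> GLM 5.1"; the part
# after the separator is treated as a per-model suffix.
SEPARATOR = " \u2014 "


def group_custom_providers(entries: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Merge entries that share an endpoint and credential into one row."""
    groups: "OrderedDict[tuple, Dict[str, object]]" = OrderedDict()
    for entry in entries:
        key = (entry["base_url"], entry.get("api_key", ""))
        group = groups.setdefault(
            key,
            {
                "name": entry["name"].split(SEPARATOR, 1)[0].strip(),
                "base_url": entry["base_url"],
                "models": [],
            },
        )
        model = entry.get("model")
        if model and model not in group["models"]:
            group["models"].append(model)
    rows = list(groups.values())
    for row in rows:
        row["total_models"] = len(row["models"])
    return rows
```

Keying on both the URL and the API key keeps two accounts on the same endpoint as separate rows, which is the case the removed `same_url_different_keys` test covers.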
@@ -455,47 +455,6 @@ class TestExportImport:
        with pytest.raises(FileExistsError):
            import_profile(str(archive_path), name="coder")

-    def test_import_with_explicit_name_does_not_mutate_existing_archive_root_profile(
-        self, profile_env, tmp_path
-    ):
-        create_profile("victim", no_alias=True)
-        victim_dir = get_profile_dir("victim")
-        (victim_dir / "marker.txt").write_text("original")
-
-        archive_path = tmp_path / "export" / "victim.tar.gz"
-        archive_path.parent.mkdir(parents=True, exist_ok=True)
-        with tarfile.open(archive_path, "w:gz") as tf:
-            data = b"imported"
-            info = tarfile.TarInfo("victim/marker.txt")
-            info.size = len(data)
-            tf.addfile(info, io.BytesIO(data))
-
-        imported = import_profile(str(archive_path), name="renamed")
-
-        assert imported == get_profile_dir("renamed")
-        assert (imported / "marker.txt").read_text() == "imported"
-        assert (victim_dir / "marker.txt").read_text() == "original"
-
-    def test_import_rejects_archive_with_multiple_top_level_directories(
-        self, profile_env, tmp_path
-    ):
-        archive_path = tmp_path / "export" / "multi-root.tar.gz"
-        archive_path.parent.mkdir(parents=True, exist_ok=True)
-
-        with tarfile.open(archive_path, "w:gz") as tf:
-            for member_name, data in (
-                ("alpha/marker.txt", b"a"),
-                ("beta/marker.txt", b"b"),
-            ):
-                info = tarfile.TarInfo(member_name)
-                info.size = len(data)
-                tf.addfile(info, io.BytesIO(data))
-
-        with pytest.raises(ValueError, match="exactly one top-level directory"):
-            import_profile(str(archive_path), name="coder")
-
-        assert not get_profile_dir("coder").exists()
-
    def test_import_rejects_traversal_archive_member(self, profile_env, tmp_path):
        archive_path = tmp_path / "export" / "evil.tar.gz"
        archive_path.parent.mkdir(parents=True, exist_ok=True)
@@ -135,48 +135,3 @@ class TestNormalizeCustomProviderEntry:
        }
        result = _normalize_custom_provider_entry(entry, provider_key="")
        assert result is None
-
-    def test_models_list_converted_to_dict(self):
-        """List-format models should be preserved as an empty-value dict so
-        /model picks them up instead of showing the provider with (0) models."""
-        entry = {
-            "name": "tencent-coding-plan",
-            "base_url": "https://api.lkeap.cloud.tencent.com/coding/v3",
-            "models": ["glm-5", "kimi-k2.5", "minimax-m2.5"],
-        }
-        result = _normalize_custom_provider_entry(entry)
-        assert result is not None
-        assert result["models"] == {"glm-5": {}, "kimi-k2.5": {}, "minimax-m2.5": {}}
-
-    def test_models_dict_preserved(self):
-        """Dict-format models should pass through unchanged."""
-        entry = {
-            "name": "acme",
-            "base_url": "https://api.example.com/v1",
-            "models": {"gpt-foo": {"context_length": 32000}},
-        }
-        result = _normalize_custom_provider_entry(entry)
-        assert result is not None
-        assert result["models"] == {"gpt-foo": {"context_length": 32000}}
-
-    def test_models_list_filters_empty_and_non_string(self):
-        """List entries that are empty strings or non-strings are skipped."""
-        entry = {
-            "name": "acme",
-            "base_url": "https://api.example.com/v1",
-            "models": ["valid", "", None, 42, "  ", "also-valid"],
-        }
-        result = _normalize_custom_provider_entry(entry)
-        assert result is not None
-        assert result["models"] == {"valid": {}, "also-valid": {}}
-
-    def test_models_empty_list_omitted(self):
-        """Empty list (falsy) should not produce a models key."""
-        entry = {
-            "name": "acme",
-            "base_url": "https://api.example.com/v1",
-            "models": [],
-        }
-        result = _normalize_custom_provider_entry(entry)
-        assert result is not None
-        assert "models" not in result
@@ -0,0 +1,172 @@
+"""Unit tests for hermes_cli.pty_bridge — PTY spawning + byte forwarding.
+
+These tests drive the bridge with minimal POSIX processes (true, sh, cat,
+printf, sleep, tput) to verify it behaves like a PTY you can read/write/resize/close.
+"""
+
+from __future__ import annotations
+
+import os
+import sys
+import time
+
+import pytest
+
+pytest.importorskip("ptyprocess", reason="ptyprocess not installed")
+
+from hermes_cli.pty_bridge import PtyBridge, PtyUnavailableError
+
+
+skip_on_windows = pytest.mark.skipif(
+    sys.platform.startswith("win"), reason="PTY bridge is POSIX-only"
+)
+
+
+def _read_until(bridge: PtyBridge, needle: bytes, timeout: float = 5.0) -> bytes:
+    """Accumulate PTY output until we see `needle` or time out."""
+    deadline = time.monotonic() + timeout
+    buf = bytearray()
+    while time.monotonic() < deadline:
+        chunk = bridge.read(timeout=0.2)
+        if chunk is None:
+            break
+        buf.extend(chunk)
+        if needle in buf:
+            return bytes(buf)
+    return bytes(buf)
+
+
+@skip_on_windows
+class TestPtyBridgeSpawn:
+    def test_is_available_on_posix(self):
+        assert PtyBridge.is_available() is True
+
+    def test_spawn_returns_bridge_with_pid(self):
+        bridge = PtyBridge.spawn(["true"])
+        try:
+            assert bridge.pid > 0
+        finally:
+            bridge.close()
+
+    def test_spawn_raises_on_missing_argv0(self, tmp_path):
+        with pytest.raises((FileNotFoundError, OSError)):
+            PtyBridge.spawn([str(tmp_path / "definitely-not-a-real-binary")])
+
+
+@skip_on_windows
+class TestPtyBridgeIO:
+    def test_reads_child_stdout(self):
+        bridge = PtyBridge.spawn(["/bin/sh", "-c", "printf hermes-ok"])
+        try:
+            output = _read_until(bridge, b"hermes-ok")
+            assert b"hermes-ok" in output
+        finally:
+            bridge.close()
+
+    def test_write_sends_to_child_stdin(self):
+        # `cat` with no args echoes stdin back to stdout.  We write a line
+        # and read it back; close() in the finally block then terminates cat.
+        bridge = PtyBridge.spawn(["/bin/cat"])
+        try:
+            bridge.write(b"hello-pty\n")
+            output = _read_until(bridge, b"hello-pty")
+            assert b"hello-pty" in output
+        finally:
+            bridge.close()
+
+    def test_read_returns_none_after_child_exits(self):
+        bridge = PtyBridge.spawn(["/bin/sh", "-c", "printf done"])
+        try:
+            _read_until(bridge, b"done")
+            # Give the child a beat to exit cleanly, then drain until EOF.
+            deadline = time.monotonic() + 3.0
+            while bridge.is_alive() and time.monotonic() < deadline:
+                bridge.read(timeout=0.1)
+            # Next reads after exit should return None (EOF), not raise.
+            got_none = False
+            for _ in range(10):
+                if bridge.read(timeout=0.1) is None:
+                    got_none = True
+                    break
+            assert got_none, "PtyBridge.read did not return None after child EOF"
+        finally:
+            bridge.close()
+
+
+@skip_on_windows
+class TestPtyBridgeResize:
+    def test_resize_updates_child_winsize(self):
+        # tput reports the size from the TTY ioctl (TIOCGWINSZ) unless COLUMNS/LINES are set.
+        # Spawn a shell, resize, then ask tput for the dimensions.
+        bridge = PtyBridge.spawn(
+            ["/bin/sh", "-c", "sleep 0.1; tput cols; tput lines"],
+            cols=80,
+            rows=24,
+        )
+        try:
+            bridge.resize(cols=123, rows=45)
+            output = _read_until(bridge, b"45", timeout=5.0)
+            # tput prints just the numbers, one per line
+            assert b"123" in output
+            assert b"45" in output
+        finally:
+            bridge.close()
+
+
+@skip_on_windows
+class TestPtyBridgeClose:
+    def test_close_is_idempotent(self):
+        bridge = PtyBridge.spawn(["/bin/sh", "-c", "sleep 30"])
+        bridge.close()
+        bridge.close()  # must not raise
+        assert not bridge.is_alive()
+
+    def test_close_terminates_long_running_child(self):
+        bridge = PtyBridge.spawn(["/bin/sh", "-c", "sleep 30"])
+        pid = bridge.pid
+        bridge.close()
+        # Give the kernel a moment to reap
+        deadline = time.monotonic() + 3.0
+        reaped = False
+        while time.monotonic() < deadline:
+            try:
+                os.kill(pid, 0)
+                time.sleep(0.05)
+            except ProcessLookupError:
+                reaped = True
+                break
+        assert reaped, f"pid {pid} still running after close()"
+
+
+@skip_on_windows
+class TestPtyBridgeEnv:
+    def test_cwd_is_respected(self, tmp_path):
+        bridge = PtyBridge.spawn(
+            ["/bin/sh", "-c", "pwd"],
+            cwd=str(tmp_path),
+        )
+        try:
+            output = _read_until(bridge, str(tmp_path).encode())
+            assert str(tmp_path).encode() in output
+        finally:
+            bridge.close()
+
+    def test_env_is_forwarded(self):
+        bridge = PtyBridge.spawn(
+            ["/bin/sh", "-c", "printf %s \"$HERMES_PTY_TEST\""],
+            env={**os.environ, "HERMES_PTY_TEST": "pty-env-works"},
+        )
+        try:
+            output = _read_until(bridge, b"pty-env-works")
+            assert b"pty-env-works" in output
+        finally:
+            bridge.close()
+
+
+class TestPtyBridgeUnavailable:
+    """Platform fallback semantics — PtyUnavailableError is importable and
+    carries a user-readable message."""
+
+    def test_error_carries_user_message(self):
+        err = PtyUnavailableError("platform not supported")
+        assert "platform" in str(err)
@@ -463,7 +463,7 @@ class TestPlatformToolsetConsistency:

        gateway_includes = set(TOOLSETS["hermes-gateway"]["includes"])
        # Exclude non-messaging platforms from the check
-        non_messaging = {"cli", "api_server", "cron"}
+        non_messaging = {"cli", "api_server"}
        for platform, meta in PLATFORMS.items():
            if platform in non_messaging:
                continue
@@ -601,122 +601,3 @@ class TestImagegenModelPicker:
            _configure_imagegen_model("fal", config)
        assert isinstance(config["image_gen"], dict)
        assert config["image_gen"]["model"] == "fal-ai/flux-2/klein/9b"
-
-
-def test_get_platform_tools_recovers_non_configurable_toolsets_from_composite():
-    """Non-configurable toolsets whose tools are in the composite but not in
-    CONFIGURABLE_TOOLSETS should still appear in the result.
-    """
-    from toolsets import TOOLSETS
-    from hermes_cli.tools_config import PLATFORMS
-    from unittest.mock import patch as mock_patch
-
-    fake_toolsets = dict(TOOLSETS)
-    fake_toolsets["_test_platform_tool"] = {
-        "description": "test",
-        "tools": ["_test_special_tool"],
-        "includes": [],
-    }
-    fake_toolsets["hermes-_test_platform"] = {
-        "description": "test composite",
-        "tools": ["web_search", "web_extract", "terminal", "process", "_test_special_tool"],
-        "includes": [],
-    }
-
-    test_platforms = {
-        "_test_platform": {"label": "Test", "default_toolset": "hermes-_test_platform"},
-    }
-
-    with mock_patch("hermes_cli.tools_config.PLATFORMS", {**PLATFORMS, **test_platforms}):
-        with mock_patch("toolsets.TOOLSETS", fake_toolsets):
-            enabled = _get_platform_tools({}, "_test_platform")
-
-    assert "_test_platform_tool" in enabled
-    assert "web" in enabled
-    assert "terminal" in enabled
-
-
-def test_get_platform_tools_second_pass_skips_fully_claimed_toolsets():
-    """Toolsets whose tools are fully covered by configurable keys should NOT
-    be added by the second pass (prevents 'search', 'hermes-acp' noise).
-    """
-    enabled = _get_platform_tools({}, "cli")
-
-    assert "search" not in enabled
-
-
-def test_get_platform_tools_discord_includes_discord_not_admin():
-    enabled = _get_platform_tools({}, "discord")
-    assert "discord" in enabled
-    assert "discord_admin" not in enabled
-
-
-def test_discord_admin_in_configurable_toolsets():
-    assert any(ts_key == "discord_admin" for ts_key, _, _ in CONFIGURABLE_TOOLSETS)
-
-
-def test_discord_admin_in_default_off():
-    assert "discord_admin" in _DEFAULT_OFF_TOOLSETS
-
-
-def test_get_platform_tools_feishu_includes_doc_and_drive():
-    enabled = _get_platform_tools({}, "feishu")
-    assert "feishu_doc" in enabled
-    assert "feishu_drive" in enabled
-
-
-def test_get_platform_tools_feishu_tools_not_on_other_platforms():
-    for plat in ["cli", "telegram", "discord"]:
-        enabled = _get_platform_tools({}, plat)
-        assert "feishu_doc" not in enabled, f"feishu_doc leaked onto {plat}"
-        assert "feishu_drive" not in enabled, f"feishu_drive leaked onto {plat}"
-
-
-def test_save_platform_tools_normalizes_numeric_entries():
-    """YAML may parse bare numeric toolset names as int. They should be
-    normalized to str so they survive the save round-trip.
-    """
-    config = {
-        "platform_toolsets": {
-            "cli": ["web", "terminal", 12306, "custom-mcp"]
-        }
-    }
-
-    with patch("hermes_cli.tools_config.save_config"):
-        _save_platform_tools(config, "cli", {"web", "browser"})
-
-    saved = config["platform_toolsets"]["cli"]
-    assert "12306" in saved
-    assert 12306 not in saved
-
-
-def test_save_platform_tools_clears_stale_no_mcp():
-    """When the new selection doesn't include no_mcp, the sentinel should
-    be stripped from preserved entries so MCP servers are re-enabled.
-    """
-    config = {
-        "platform_toolsets": {
-            "cli": ["web", "terminal", "no_mcp"]
-        }
-    }
-
-    with patch("hermes_cli.tools_config.save_config"):
-        _save_platform_tools(config, "cli", {"web", "browser"})
-
-    saved = config["platform_toolsets"]["cli"]
-    assert "no_mcp" not in saved
-
-
-def test_save_platform_tools_preserves_explicit_no_mcp():
-    """When the new selection explicitly includes no_mcp, it should be kept."""
-    config = {
-        "platform_toolsets": {
-            "cli": ["web", "no_mcp"]
-        }
-    }
-
-    with patch("hermes_cli.tools_config.save_config"):
-        _save_platform_tools(config, "cli", {"web", "no_mcp"})
-
-    saved = config["platform_toolsets"]["cli"]
-    assert "no_mcp" in saved
@@ -422,152 +422,6 @@ class TestCmdUpdateLaunchdRestart:
        ]
        assert len(restart_calls) == 1

-    @patch("shutil.which", return_value=None)
-    @patch("subprocess.run")
-    def test_update_prefers_sigusr1_over_systemctl_restart_when_mainpid_known(
-        self, mock_run, _mock_which, mock_args, capsys, monkeypatch,
-    ):
-        """Drain-aware update: when systemctl show reports a MainPID, the
-        update path sends SIGUSR1 and waits for graceful exit + respawn,
-        instead of ``systemctl restart`` (which SIGKILLs in-flight agents).
-        """
-        monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
-        monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
-        monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
-
-        # Track state: before kill → "active" (old PID),
-        # after kill + exit → briefly inactive, then "active" again (new PID).
-        state = {"killed": False}
-
-        def side_effect(cmd, **kwargs):
-            joined = " ".join(str(c) for c in cmd)
-
-            if "rev-parse" in joined and "--abbrev-ref" in joined:
-                return subprocess.CompletedProcess(cmd, 0, stdout="main\n", stderr="")
-            if "rev-parse" in joined and "--verify" in joined:
-                return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")
-            if "rev-list" in joined:
-                return subprocess.CompletedProcess(cmd, 0, stdout="3\n", stderr="")
-
-            # Only expose a user-scope service.
-            if "systemctl" in joined and "list-units" in joined:
-                if "--user" in joined:
-                    return subprocess.CompletedProcess(
-                        cmd, 0,
-                        stdout="hermes-gateway.service loaded active running\n",
-                        stderr="",
-                    )
-                return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")
-
-            if "systemctl" in joined and "is-active" in joined:
-                # Pre-kill: active.  Post-kill: active again (respawned by
-                # Restart=on-failure).  The drain loop verifies liveness
-                # separately via os.kill(pid, 0).
-                return subprocess.CompletedProcess(cmd, 0, stdout="active\n", stderr="")
-
-            # The new code path.
-            if "systemctl" in joined and "show" in joined and "MainPID" in joined:
-                return subprocess.CompletedProcess(cmd, 0, stdout="4242\n", stderr="")
-
-            # If systemctl restart is called, this test fails its intent —
-            # but still let it succeed so we can assert it was NOT called.
-            if "systemctl" in joined and "restart" in joined:
-                return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")
-
-            return subprocess.CompletedProcess(cmd, 0, stdout="", stderr="")
-
-        mock_run.side_effect = side_effect
-
-        # Track SIGUSR1 delivery and simulate the gateway draining + exiting.
-        sigusr1_sent = {"value": False}
-
-        def fake_kill(pid, sig):
-            import signal as _s
-            if pid == 4242 and sig == _s.SIGUSR1:
-                sigusr1_sent["value"] = True
-                state["killed"] = True
-                return
-            if pid == 4242 and sig == 0:
-                # Liveness probe — report dead once SIGUSR1 has been sent.
-                if state["killed"]:
-                    raise ProcessLookupError()
-                return
-            # For any other PID/sig combination, succeed silently.
-            return
-
-        monkeypatch.setattr("os.kill", fake_kill)
-
-        with patch.object(gateway_cli, "find_gateway_pids", return_value=[]):
-            cmd_update(mock_args)
-
-        # SIGUSR1 must have been delivered to the gateway MainPID.
-        assert sigusr1_sent["value"], "Expected SIGUSR1 to be sent to MainPID"
-
-        # And `systemctl restart` must NOT have been used (that's the
-        # non-draining kill-everything path we're moving away from).
-        restart_calls = [
-            c for c in mock_run.call_args_list
-            if "systemctl" in " ".join(str(a) for a in c.args[0])
-            and "restart" in " ".join(str(a) for a in c.args[0])
-        ]
-        assert restart_calls == [], (
-            "Graceful SIGUSR1 succeeded; `systemctl restart` should not "
-            f"have been called. Got: {restart_calls}"
-        )
-
-        captured = capsys.readouterr().out
-        assert "draining" in captured.lower()
-        assert "Restarted hermes-gateway" in captured
-
-    @patch("shutil.which", return_value=None)
-    @patch("subprocess.run")
-    def test_update_falls_back_to_systemctl_restart_when_sigusr1_times_out(
-        self, mock_run, _mock_which, mock_args, capsys, monkeypatch,
-    ):
-        """If the gateway doesn't exit within the drain budget (e.g. old unit
-        missing ``Restart=on-failure`` or an agent ignoring SIGUSR1), the
-        update path falls back to ``systemctl restart``.
-        """
-        monkeypatch.setattr(gateway_cli, "is_macos", lambda: False)
-        monkeypatch.setattr(gateway_cli, "supports_systemd_services", lambda: True)
-        monkeypatch.setattr(gateway_cli, "is_termux", lambda: False)
-
-        mock_run.side_effect = _make_run_side_effect(
-            commit_count="3",
-            systemd_active=True,
-        )
-
-        # Patch systemctl show to report MainPID=4242 so cmd_update attempts
-        # the graceful path.
-        orig = mock_run.side_effect
-        def wrapped(cmd, **kwargs):
-            joined = " ".join(str(c) for c in cmd)
-            if "systemctl" in joined and "show" in joined and "MainPID" in joined:
-                return subprocess.CompletedProcess(cmd, 0, stdout="4242\n", stderr="")
-            return orig(cmd, **kwargs)
-        mock_run.side_effect = wrapped
-
-        # Simulate the drain helper failing to confirm a clean exit — either
-        # because the gateway ignored SIGUSR1 or the drain budget was
-        # exceeded.  cmd_update() should detect this and escalate.
-        monkeypatch.setattr(
-            "hermes_cli.gateway._graceful_restart_via_sigusr1",
-            lambda pid, drain_timeout: False,
-        )
-
-        with patch.object(gateway_cli, "find_gateway_pids", return_value=[]):
-            cmd_update(mock_args)
-
-        # Fallback kicked in → systemctl restart was called.
-        restart_calls = [
-            c for c in mock_run.call_args_list
-            if "systemctl" in " ".join(str(a) for a in c.args[0])
-            and "restart" in " ".join(str(a) for a in c.args[0])
-        ]
-        assert len(restart_calls) >= 1, (
-            "Drain path failed; expected fallback `systemctl restart`."
-        )
-
    @patch("shutil.which", return_value=None)
    @patch("subprocess.run")
    def test_update_no_gateway_running_skips_restart(
@@ -1,255 +0,0 @@
-"""Tests for ``hermes_cli.voice`` — the TUI gateway's voice wrapper.
-
-The module is imported *lazily* by ``tui_gateway/server.py`` so that a
-box with missing audio deps fails at call time (returning a clean RPC
-error) rather than at gateway startup. These tests therefore only
-assert the public contract the gateway depends on: the three symbols
-exist, ``stop_and_transcribe`` is a no-op when nothing is recording,
-and ``speak_text`` tolerates empty input without touching the provider
-stack.
-"""
-
-import os
-import sys
-
-import pytest
-
-sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
-
-
-class TestPublicAPI:
-    def test_gateway_symbols_importable(self):
-        """Match the exact import shape tui_gateway/server.py uses."""
-        from hermes_cli.voice import (
-            speak_text,
-            start_recording,
-            stop_and_transcribe,
-        )
-
-        assert callable(start_recording)
-        assert callable(stop_and_transcribe)
-        assert callable(speak_text)
-
-
-class TestStopWithoutStart:
-    def test_returns_none_when_no_recording_active(self, monkeypatch):
-        """Idempotent no-op: stop before start must not raise or touch state."""
-        import hermes_cli.voice as voice
-
-        monkeypatch.setattr(voice, "_recorder", None)
-
-        assert voice.stop_and_transcribe() is None
-
-
-class TestSpeakTextGuards:
-    @pytest.mark.parametrize("text", ["", "   ", "\n\t  "])
-    def test_empty_text_is_noop(self, text):
-        """Empty / whitespace-only text must return without importing tts_tool
-        (the gateway spawns a thread per call, so a no-op on empty input
-        keeps the thread pool from churning on trivial inputs)."""
-        from hermes_cli.voice import speak_text
-
-        # Should simply return None without raising.
-        assert speak_text(text) is None
-
-
-class TestContinuousAPI:
-    """Continuous (VAD) mode API — CLI-parity loop entry points."""
-
-    def test_continuous_exports(self):
-        from hermes_cli.voice import (
-            is_continuous_active,
-            start_continuous,
-            stop_continuous,
-        )
-
-        assert callable(start_continuous)
-        assert callable(stop_continuous)
-        assert callable(is_continuous_active)
-
-    def test_not_active_by_default(self, monkeypatch):
-        import hermes_cli.voice as voice
-
-        # Isolate from any state left behind by other tests in the session.
-        monkeypatch.setattr(voice, "_continuous_active", False)
-        monkeypatch.setattr(voice, "_continuous_recorder", None)
-
-        assert voice.is_continuous_active() is False
-
-    def test_stop_continuous_idempotent_when_inactive(self, monkeypatch):
-        """stop_continuous must not raise when no loop is active — the
-        gateway's voice.toggle off path calls it unconditionally."""
-        import hermes_cli.voice as voice
-
-        monkeypatch.setattr(voice, "_continuous_active", False)
-        monkeypatch.setattr(voice, "_continuous_recorder", None)
-
-        # Should return cleanly without exceptions
-        assert voice.stop_continuous() is None
-        assert voice.is_continuous_active() is False
-
-    def test_double_start_is_idempotent(self, monkeypatch):
-        """A second start_continuous while already active is a no-op — prevents
-        two overlapping capture threads fighting over the microphone when the
-        UI double-fires (e.g. both /voice on and Ctrl+B within the same tick)."""
-        import hermes_cli.voice as voice
-
-        monkeypatch.setattr(voice, "_continuous_active", True)
-        called = {"n": 0}
-
-        class FakeRecorder:
-            def start(self, on_silence_stop=None):
-                called["n"] += 1
-
-            def cancel(self):
-                pass
-
-        monkeypatch.setattr(voice, "_continuous_recorder", FakeRecorder())
-
-        voice.start_continuous(on_transcript=lambda _t: None)
-
-        # The guard inside start_continuous short-circuits before rec.start()
-        assert called["n"] == 0
-
-
-class TestContinuousLoopSimulation:
-    """End-to-end simulation of the VAD loop with a fake recorder.
-
-    Proves auto-restart works: the silence callback must trigger transcribe →
-    on_transcript → re-call rec.start(on_silence_stop=same_cb). Also covers
-    the 3-strikes no-speech halt.
-    """
-
-    @pytest.fixture
-    def fake_recorder(self, monkeypatch):
-        import hermes_cli.voice as voice
-
-        # Reset module state between tests.
-        monkeypatch.setattr(voice, "_continuous_active", False)
-        monkeypatch.setattr(voice, "_continuous_recorder", None)
-        monkeypatch.setattr(voice, "_continuous_no_speech_count", 0)
-        monkeypatch.setattr(voice, "_continuous_on_transcript", None)
-        monkeypatch.setattr(voice, "_continuous_on_status", None)
-        monkeypatch.setattr(voice, "_continuous_on_silent_limit", None)
-
-        class FakeRecorder:
-            _silence_threshold = 200
-            _silence_duration = 3.0
-            is_recording = False
-
-            def __init__(self):
-                self.start_calls = 0
-                self.last_callback = None
-                self.stopped = 0
-                self.cancelled = 0
-                # Preset WAV path returned by stop()
-                self.next_stop_wav = "/tmp/fake.wav"
-
-            def start(self, on_silence_stop=None):
-                self.start_calls += 1
-                self.last_callback = on_silence_stop
-                self.is_recording = True
-
-            def stop(self):
-                self.stopped += 1
-                self.is_recording = False
-                return self.next_stop_wav
-
-            def cancel(self):
-                self.cancelled += 1
-                self.is_recording = False
-
-        rec = FakeRecorder()
-        monkeypatch.setattr(voice, "create_audio_recorder", lambda: rec)
-        # Skip real file ops in the silence callback.
-        monkeypatch.setattr(voice.os.path, "isfile", lambda _p: False)
-        return rec
-
-    def test_loop_auto_restarts_after_transcript(self, fake_recorder, monkeypatch):
-        import hermes_cli.voice as voice
-
-        monkeypatch.setattr(
-            voice,
-            "transcribe_recording",
-            lambda _p: {"success": True, "transcript": "hello world"},
-        )
-        monkeypatch.setattr(voice, "is_whisper_hallucination", lambda _t: False)
-
-        transcripts = []
-        statuses = []
-
-        voice.start_continuous(
-            on_transcript=lambda t: transcripts.append(t),
-            on_status=lambda s: statuses.append(s),
-        )
-
-        assert fake_recorder.start_calls == 1
-        assert statuses == ["listening"]
-
-        # Simulate AudioRecorder's silence detector firing.
-        fake_recorder.last_callback()
-
-        assert transcripts == ["hello world"]
-        assert fake_recorder.start_calls == 2  # auto-restarted
-        assert statuses == ["listening", "transcribing", "listening"]
-        assert voice.is_continuous_active() is True
-
-        voice.stop_continuous()
-
-    def test_silent_limit_halts_loop_after_three_strikes(self, fake_recorder, monkeypatch):
-        import hermes_cli.voice as voice
-
-        # Transcription returns no speech — fake_recorder.stop() returns the
-        # path, but transcribe returns empty text, counting as silence.
-        monkeypatch.setattr(
-            voice,
-            "transcribe_recording",
-            lambda _p: {"success": True, "transcript": ""},
-        )
-        monkeypatch.setattr(voice, "is_whisper_hallucination", lambda _t: False)
-
-        transcripts = []
-        silent_limit_fired = []
-
-        voice.start_continuous(
-            on_transcript=lambda t: transcripts.append(t),
-            on_silent_limit=lambda: silent_limit_fired.append(True),
-        )
-
-        # Fire silence callback 3 times
-        for _ in range(3):
-            fake_recorder.last_callback()
-
-        assert transcripts == []
-        assert silent_limit_fired == [True]
-        assert voice.is_continuous_active() is False
-        assert fake_recorder.cancelled >= 1
-
-    def test_stop_during_transcription_discards_restart(self, fake_recorder, monkeypatch):
-        """User hits Ctrl+B mid-transcription: the in-flight transcript is
-        suppressed and the loop must NOT restart."""
-        import hermes_cli.voice as voice
-
-        stop_triggered = {"flag": False}
-
-        def late_transcribe(_p):
-            # Simulate stop_continuous arriving while we're inside transcribe
-            voice.stop_continuous()
-            stop_triggered["flag"] = True
-            return {"success": True, "transcript": "final word"}
-
-        monkeypatch.setattr(voice, "transcribe_recording", late_transcribe)
-        monkeypatch.setattr(voice, "is_whisper_hallucination", lambda _t: False)
-
-        transcripts = []
-        voice.start_continuous(on_transcript=lambda t: transcripts.append(t))
-
-        initial_starts = fake_recorder.start_calls  # 1
-        fake_recorder.last_callback()
-
-        assert stop_triggered["flag"] is True
-        # Loop is stopped — no auto-restart
-        assert fake_recorder.start_calls == initial_starts
-        # The in-flight transcript was suppressed because we stopped mid-flight
-        assert transcripts == []
-        assert voice.is_continuous_active() is False
@@ -110,12 +110,12 @@ class TestWebServerEndpoints:

        import hermes_state
        from hermes_constants import get_hermes_home
-        from hermes_cli.web_server import app, _SESSION_HEADER_NAME, _SESSION_TOKEN
+        from hermes_cli.web_server import app, _SESSION_TOKEN

        monkeypatch.setattr(hermes_state, "DEFAULT_DB_PATH", get_hermes_home() / "state.db")

        self.client = TestClient(app)
-        self.client.headers[_SESSION_HEADER_NAME] = _SESSION_TOKEN
+        self.client.headers["Authorization"] = f"Bearer {_SESSION_TOKEN}"

    def test_get_status(self):
        resp = self.client.get("/api/status")
@@ -221,12 +221,12 @@ class TestWebServerEndpoints:
    def test_reveal_env_var(self, tmp_path):
        """POST /api/env/reveal should return the real unredacted value."""
        from hermes_cli.config import save_env_value
-        from hermes_cli.web_server import _SESSION_HEADER_NAME, _SESSION_TOKEN
+        from hermes_cli.web_server import _SESSION_TOKEN
        save_env_value("TEST_REVEAL_KEY", "super-secret-value-12345")
        resp = self.client.post(
            "/api/env/reveal",
            json={"key": "TEST_REVEAL_KEY"},
-            headers={_SESSION_HEADER_NAME: _SESSION_TOKEN},
+            headers={"Authorization": f"Bearer {_SESSION_TOKEN}"},
        )
        assert resp.status_code == 200
        data = resp.json()
@@ -235,11 +235,11 @@ class TestWebServerEndpoints:

    def test_reveal_env_var_not_found(self):
        """POST /api/env/reveal should 404 for unknown keys."""
-        from hermes_cli.web_server import _SESSION_HEADER_NAME, _SESSION_TOKEN
+        from hermes_cli.web_server import _SESSION_TOKEN
        resp = self.client.post(
            "/api/env/reveal",
            json={"key": "NONEXISTENT_KEY_XYZ"},
-            headers={_SESSION_HEADER_NAME: _SESSION_TOKEN},
+            headers={"Authorization": f"Bearer {_SESSION_TOKEN}"},
        )
        assert resp.status_code == 404

@@ -249,7 +249,7 @@ class TestWebServerEndpoints:
        from hermes_cli.web_server import app
        from hermes_cli.config import save_env_value
        save_env_value("TEST_REVEAL_NOAUTH", "secret-value")
-        # Use a fresh client WITHOUT the dashboard session header
+        # Use a fresh client WITHOUT the Authorization header
        unauth_client = TestClient(app)
        resp = unauth_client.post(
            "/api/env/reveal",
@@ -260,47 +260,14 @@ class TestWebServerEndpoints:
    def test_reveal_env_var_bad_token(self, tmp_path):
        """POST /api/env/reveal with wrong token should return 401."""
        from hermes_cli.config import save_env_value
-        from hermes_cli.web_server import _SESSION_HEADER_NAME
        save_env_value("TEST_REVEAL_BADAUTH", "secret-value")
        resp = self.client.post(
            "/api/env/reveal",
            json={"key": "TEST_REVEAL_BADAUTH"},
-            headers={_SESSION_HEADER_NAME: "wrong-token-here"},
+            headers={"Authorization": "Bearer wrong-token-here"},
        )
        assert resp.status_code == 401

-    def test_reveal_env_var_custom_session_header_ignores_proxy_authorization(self, tmp_path):
-        """A valid dashboard session header should coexist with proxy auth."""
-        from hermes_cli.config import save_env_value
-        from hermes_cli.web_server import _SESSION_HEADER_NAME, _SESSION_TOKEN
-
-        save_env_value("TEST_REVEAL_PROXY_AUTH", "secret-value")
-        resp = self.client.post(
-            "/api/env/reveal",
-            json={"key": "TEST_REVEAL_PROXY_AUTH"},
-            headers={
-                _SESSION_HEADER_NAME: _SESSION_TOKEN,
-                "Authorization": "Basic dXNlcjpwYXNz",
-            },
-        )
-
-        assert resp.status_code == 200
-        assert resp.json()["value"] == "secret-value"
-
-    def test_reveal_env_var_legacy_authorization_header_still_works(self, tmp_path):
-        """Keep old dashboard bundles working while the new header rolls out."""
-        from hermes_cli.config import save_env_value
-        from hermes_cli.web_server import _SESSION_TOKEN
-
-        save_env_value("TEST_REVEAL_LEGACY_AUTH", "secret-value")
-        resp = self.client.post(
-            "/api/env/reveal",
-            json={"key": "TEST_REVEAL_LEGACY_AUTH"},
-            headers={"Authorization": f"Bearer {_SESSION_TOKEN}"},
-        )
-
-        assert resp.status_code == 200
-
    def test_session_token_endpoint_removed(self):
        """GET /api/auth/session-token should no longer exist (token injected via HTML)."""
        resp = self.client.get("/api/auth/session-token")
@@ -318,7 +285,7 @@ class TestWebServerEndpoints:
        """API requests without the session token should be rejected."""
        from starlette.testclient import TestClient
        from hermes_cli.web_server import app
-        # Create a client WITHOUT the dashboard session header
+        # Create a client WITHOUT the Authorization header
        unauth_client = TestClient(app)
        resp = unauth_client.get("/api/env")
        assert resp.status_code == 401
@@ -421,9 +388,9 @@ class TestConfigRoundTrip:
            from starlette.testclient import TestClient
        except ImportError:
            pytest.skip("fastapi/starlette not installed")
-        from hermes_cli.web_server import app, _SESSION_HEADER_NAME, _SESSION_TOKEN
+        from hermes_cli.web_server import app, _SESSION_TOKEN
        self.client = TestClient(app)
-        self.client.headers[_SESSION_HEADER_NAME] = _SESSION_TOKEN
+        self.client.headers["Authorization"] = f"Bearer {_SESSION_TOKEN}"

    def test_get_config_no_internal_keys(self):
        """GET /api/config should not expose _config_version or _model_meta."""
@@ -557,12 +524,12 @@ class TestNewEndpoints:

        import hermes_state
        from hermes_constants import get_hermes_home
-        from hermes_cli.web_server import app, _SESSION_HEADER_NAME, _SESSION_TOKEN
+        from hermes_cli.web_server import app, _SESSION_TOKEN

        monkeypatch.setattr(hermes_state, "DEFAULT_DB_PATH", get_hermes_home() / "state.db")

        self.client = TestClient(app)
-        self.client.headers[_SESSION_HEADER_NAME] = _SESSION_TOKEN
+        self.client.headers["Authorization"] = f"Bearer {_SESSION_TOKEN}"

    def test_get_logs_default(self):
        resp = self.client.get("/api/logs")
@@ -1209,9 +1176,9 @@ class TestStatusRemoteGateway:
        except ImportError:
            pytest.skip("fastapi/starlette not installed")

-        from hermes_cli.web_server import app, _SESSION_HEADER_NAME, _SESSION_TOKEN
+        from hermes_cli.web_server import app, _SESSION_TOKEN
        self.client = TestClient(app)
-        self.client.headers[_SESSION_HEADER_NAME] = _SESSION_TOKEN
+        self.client.headers["Authorization"] = f"Bearer {_SESSION_TOKEN}"

    def test_status_falls_back_to_remote_probe(self, monkeypatch):
        """When local PID check fails and remote probe succeeds, gateway shows running."""
@@ -1292,388 +1259,183 @@ class TestStatusRemoteGateway:


 # ---------------------------------------------------------------------------
-# Dashboard theme normaliser tests
+# /api/pty WebSocket — terminal bridge for the dashboard "Chat" tab.
+#
+# These tests drive the endpoint with a tiny fake command (typically ``cat``
+# or ``sh -c 'printf …'``) instead of the real ``hermes --tui`` binary.  The
+# endpoint resolves its argv through ``_resolve_chat_argv``, so tests
+# monkeypatch that hook.
 # ---------------------------------------------------------------------------

-
-class TestNormaliseThemeDefinition:
-    """Tests for _normalise_theme_definition() — parses YAML theme files."""
-
-    def test_rejects_missing_name(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        assert _normalise_theme_definition({}) is None
-        assert _normalise_theme_definition({"name": ""}) is None
-        assert _normalise_theme_definition({"name": "   "}) is None
-
-    def test_rejects_non_dict(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        assert _normalise_theme_definition("string") is None
-        assert _normalise_theme_definition(None) is None
-        assert _normalise_theme_definition([1, 2, 3]) is None
-
-    def test_loose_colors_shorthand(self):
-        """Bare hex strings under `colors` parse as {hex, alpha=1.0}."""
-        from hermes_cli.web_server import _normalise_theme_definition
-        result = _normalise_theme_definition({
-            "name": "loose",
-            "colors": {"background": "#000000", "midground": "#ffffff"},
-        })
-        assert result is not None
-        assert result["palette"]["background"] == {"hex": "#000000", "alpha": 1.0}
-        assert result["palette"]["midground"] == {"hex": "#ffffff", "alpha": 1.0}
-        # foreground falls back to default (transparent white)
-        assert result["palette"]["foreground"]["hex"] == "#ffffff"
-        assert result["palette"]["foreground"]["alpha"] == 0.0
-
-    def test_full_palette_form(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        result = _normalise_theme_definition({
-            "name": "full",
-            "palette": {
-                "background": {"hex": "#0a1628", "alpha": 1.0},
-                "midground": {"hex": "#a8d0ff", "alpha": 0.9},
-                "warmGlow": "rgba(255, 0, 0, 0.5)",
-                "noiseOpacity": 0.5,
-            },
-        })
-        assert result["palette"]["background"]["hex"] == "#0a1628"
-        assert result["palette"]["midground"]["alpha"] == 0.9
-        assert result["palette"]["warmGlow"] == "rgba(255, 0, 0, 0.5)"
-        assert result["palette"]["noiseOpacity"] == 0.5
-
-    def test_default_typography_applied_when_missing(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        result = _normalise_theme_definition({"name": "minimal"})
-        typo = result["typography"]
-        assert "fontSans" in typo
-        assert "fontMono" in typo
-        assert typo["baseSize"] == "15px"
-        assert typo["lineHeight"] == "1.55"
-        assert typo["letterSpacing"] == "0"
-
-    def test_partial_typography_merges_with_defaults(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        result = _normalise_theme_definition({
-            "name": "partial",
-            "typography": {
-                "fontSans": "MyFont, sans-serif",
-                "baseSize": "12px",
-            },
-        })
-        assert result["typography"]["fontSans"] == "MyFont, sans-serif"
-        assert result["typography"]["baseSize"] == "12px"
-        # fontMono defaulted
-        assert "monospace" in result["typography"]["fontMono"]
-
-    def test_layout_defaults(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        result = _normalise_theme_definition({"name": "minimal"})
-        assert result["layout"]["radius"] == "0.5rem"
-        assert result["layout"]["density"] == "comfortable"
-
-    def test_invalid_density_falls_back(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        result = _normalise_theme_definition({
-            "name": "bad",
-            "layout": {"density": "ultra-spacious"},
-        })
-        assert result["layout"]["density"] == "comfortable"
-
-    def test_valid_densities_accepted(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        for d in ("compact", "comfortable", "spacious"):
-            r = _normalise_theme_definition({"name": "x", "layout": {"density": d}})
-            assert r["layout"]["density"] == d
-
-    def test_color_overrides_filter_unknown_keys(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        result = _normalise_theme_definition({
-            "name": "o",
-            "colorOverrides": {
-                "card": "#123456",
-                "fakeToken": "#abcdef",
-                "primary": 42,  # non-string rejected
-                "destructive": "#ff0000",
-            },
-        })
-        assert result["colorOverrides"] == {
-            "card": "#123456",
-            "destructive": "#ff0000",
-        }
-
-    def test_color_overrides_omitted_when_empty(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        result = _normalise_theme_definition({"name": "x"})
-        assert "colorOverrides" not in result
-
-    def test_alpha_clamped_to_unit_range(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        r = _normalise_theme_definition({
-            "name": "c",
-            "palette": {"background": {"hex": "#000", "alpha": 99.5}},
-        })
-        assert r["palette"]["background"]["alpha"] == 1.0
-        r2 = _normalise_theme_definition({
-            "name": "c",
-            "palette": {"background": {"hex": "#000", "alpha": -5}},
-        })
-        assert r2["palette"]["background"]["alpha"] == 0.0
-
-    def test_invalid_alpha_uses_default(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        r = _normalise_theme_definition({
-            "name": "c",
-            "palette": {"background": {"hex": "#000", "alpha": "not a number"}},
-        })
-        assert r["palette"]["background"]["alpha"] == 1.0
+import sys


-class TestDiscoverUserThemes:
-    """Tests for _discover_user_themes() — scans ~/.hermes/dashboard-themes/."""
+skip_on_windows = pytest.mark.skipif(
+    sys.platform.startswith("win"), reason="PTY bridge is POSIX-only"
+)

-    def test_returns_empty_when_dir_missing(self, tmp_path, monkeypatch):
-        monkeypatch.setenv("HERMES_HOME", str(tmp_path))
-        from hermes_cli import web_server
-        assert web_server._discover_user_themes() == []

-    def test_loads_and_normalises_yaml(self, tmp_path, monkeypatch):
-        monkeypatch.setenv("HERMES_HOME", str(tmp_path))
-        themes_dir = tmp_path / "dashboard-themes"
-        themes_dir.mkdir()
-        (themes_dir / "ocean.yaml").write_text(
-            "name: ocean\n"
-            "label: Ocean\n"
-            "palette:\n"
-            "  background:\n"
-            "    hex: \"#0a1628\"\n"
-            "    alpha: 1.0\n"
-            "layout:\n"
-            "  density: spacious\n"
+@skip_on_windows
+class TestPtyWebSocket:
+    @pytest.fixture(autouse=True)
+    def _setup(self, monkeypatch, _isolate_hermes_home):
+        from starlette.testclient import TestClient
+
+        import hermes_cli.web_server as ws
+
+        # Avoid exec'ing the actual TUI in tests: every test below installs
+        # its own fake argv via ``ws._resolve_chat_argv``.
+        self.ws_module = ws
+        self.token = ws._SESSION_TOKEN
+        self.client = TestClient(ws.app)
+
+    def _url(self, token: str | None = None, **params: str) -> str:
+        tok = token if token is not None else self.token
+        # TestClient.websocket_connect takes the path; it reconstructs the
+        # query string, so we pass it inline.
+        from urllib.parse import urlencode
+
+        q = {"token": tok, **params}
+        return f"/api/pty?{urlencode(q)}"
+
+    def test_rejects_missing_token(self, monkeypatch):
+        monkeypatch.setattr(
+            self.ws_module,
+            "_resolve_chat_argv",
+            lambda resume=None: (["/bin/cat"], None, None),
        )
-        from hermes_cli import web_server
-        results = web_server._discover_user_themes()
-        assert len(results) == 1
-        assert results[0]["name"] == "ocean"
-        assert results[0]["label"] == "Ocean"
-        assert results[0]["palette"]["background"]["hex"] == "#0a1628"
-        assert results[0]["layout"]["density"] == "spacious"
-        # defaults filled in
-        assert "fontSans" in results[0]["typography"]
+        from starlette.websockets import WebSocketDisconnect

-    def test_malformed_yaml_skipped(self, tmp_path, monkeypatch):
-        monkeypatch.setenv("HERMES_HOME", str(tmp_path))
-        themes_dir = tmp_path / "dashboard-themes"
-        themes_dir.mkdir()
-        (themes_dir / "bad.yaml").write_text("::: not valid yaml :::\n\tindent wrong")
-        (themes_dir / "nameless.yaml").write_text("label: No Name Here\n")
-        (themes_dir / "ok.yaml").write_text("name: ok\n")
-        from hermes_cli import web_server
-        results = web_server._discover_user_themes()
-        names = [r["name"] for r in results]
-        assert "ok" in names
-        assert "bad" not in names  # malformed YAML
-        assert len(results) == 1  # only the valid one
+        with pytest.raises(WebSocketDisconnect) as exc:
+            with self.client.websocket_connect("/api/pty"):
+                pass
+        assert exc.value.code == 4401

+    def test_rejects_bad_token(self, monkeypatch):
+        monkeypatch.setattr(
+            self.ws_module,
+            "_resolve_chat_argv",
+            lambda resume=None: (["/bin/cat"], None, None),
+        )
+        from starlette.websockets import WebSocketDisconnect

-class TestNormaliseThemeExtensions:
-    """Tests for the extended normaliser fields (assets, customCSS,
-    componentStyles, layoutVariant) — the surfaces themes use to reskin
-    the dashboard without shipping code."""
+        with pytest.raises(WebSocketDisconnect) as exc:
+            with self.client.websocket_connect(self._url(token="wrong")):
+                pass
+        assert exc.value.code == 4401

-    def test_layout_variant_defaults_to_standard(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        result = _normalise_theme_definition({"name": "t"})
-        assert result["layoutVariant"] == "standard"
+    def test_streams_child_stdout_to_client(self, monkeypatch):
+        monkeypatch.setattr(
+            self.ws_module,
+            "_resolve_chat_argv",
+            lambda resume=None: (
+                ["/bin/sh", "-c", "printf hermes-ws-ok"],
+                None,
+                None,
+            ),
+        )
+        with self.client.websocket_connect(self._url()) as conn:
+            # Drain frames until we see the needle or time out.  TestClient's
+            # receive_bytes blocks; loop until we have the signal byte string.
+            buf = b""
+            import time

-    def test_layout_variant_accepts_known_values(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        for variant in ("standard", "cockpit", "tiled"):
-            r = _normalise_theme_definition({"name": "t", "layoutVariant": variant})
-            assert r["layoutVariant"] == variant
+            deadline = time.monotonic() + 5.0
+            while time.monotonic() < deadline:
+                try:
+                    frame = conn.receive_bytes()
+                except Exception:
+                    break
+                if frame:
+                    buf += frame
+                if b"hermes-ws-ok" in buf:
+                    break
+            assert b"hermes-ws-ok" in buf

-    def test_layout_variant_rejects_unknown(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        r = _normalise_theme_definition({"name": "t", "layoutVariant": "warship"})
-        assert r["layoutVariant"] == "standard"
-        r2 = _normalise_theme_definition({"name": "t", "layoutVariant": 12})
-        assert r2["layoutVariant"] == "standard"
+    def test_client_input_reaches_child_stdin(self, monkeypatch):
+        # ``cat`` echoes stdin back, so a write → read round-trip proves
+        # the full duplex path.
+        monkeypatch.setattr(
+            self.ws_module,
+            "_resolve_chat_argv",
+            lambda resume=None: (["/bin/cat"], None, None),
+        )
+        with self.client.websocket_connect(self._url()) as conn:
+            conn.send_bytes(b"round-trip-payload\n")
+            buf = b""
+            import time

-    def test_assets_named_slots_passthrough(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        r = _normalise_theme_definition({
-            "name": "t",
-            "assets": {
-                "bg": "https://example.com/bg.jpg",
-                "hero": "linear-gradient(180deg, red, blue)",
-                "crest": "/ds-assets/crest.svg",
-                "logo": "  ",  # whitespace-only — dropped
-                "notAKnownKey": "ignored",
-            },
-        })
-        assert r["assets"]["bg"] == "https://example.com/bg.jpg"
-        assert r["assets"]["hero"].startswith("linear-gradient")
-        assert r["assets"]["crest"] == "/ds-assets/crest.svg"
-        assert "logo" not in r["assets"]  # whitespace-only rejected
-        assert "notAKnownKey" not in r["assets"]  # unknown slot ignored
+            deadline = time.monotonic() + 5.0
+            while time.monotonic() < deadline:
+                frame = conn.receive_bytes()
+                if frame:
+                    buf += frame
+                if b"round-trip-payload" in buf:
+                    break
+            assert b"round-trip-payload" in buf

-    def test_assets_custom_block(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        r = _normalise_theme_definition({
-            "name": "t",
-            "assets": {
-                "custom": {
-                    "scan-lines": "/img/scan.png",
-                    "my_overlay": "/img/ov.png",
-                    "bad key!": "x",  # non-alnum key — rejected
-                    "empty": "",        # empty value — rejected
-                },
-            },
-        })
-        assert r["assets"]["custom"] == {
-            "scan-lines": "/img/scan.png",
-            "my_overlay": "/img/ov.png",
-        }
+    def test_resize_escape_is_forwarded(self, monkeypatch):
+        # Resize escape gets intercepted and applied via TIOCSWINSZ,
+        # then ``tput cols/lines`` reports the new dimensions back.
+        monkeypatch.setattr(
+            self.ws_module,
+            "_resolve_chat_argv",
+            # sleep gives the test time to push the resize before tput runs
+            lambda resume=None: (
+                ["/bin/sh", "-c", "sleep 0.15; tput cols; tput lines"],
+                None,
+                None,
+            ),
+        )
+        with self.client.websocket_connect(self._url()) as conn:
+            conn.send_text("\x1b[RESIZE:99;41]")
+            buf = b""
+            import time

-    def test_assets_absent_means_no_field(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        r = _normalise_theme_definition({"name": "t"})
-        assert "assets" not in r
+            deadline = time.monotonic() + 5.0
+            while time.monotonic() < deadline:
+                frame = conn.receive_bytes()
+                if frame:
+                    buf += frame
+                if b"99" in buf and b"41" in buf:
+                    break
+            assert b"99" in buf and b"41" in buf

-    def test_custom_css_passthrough_and_capped(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        # Small CSS passes through verbatim.
-        r = _normalise_theme_definition({
-            "name": "t",
-            "customCSS": "body { color: red; }",
-        })
-        assert r["customCSS"] == "body { color: red; }"
+    def test_unavailable_platform_closes_with_message(self, monkeypatch):
+        from hermes_cli.pty_bridge import PtyUnavailableError

-        # 40 KiB of CSS gets clipped to the 32 KiB cap.
-        huge = "/* x */ " * (40 * 1024 // 8 + 10)
-        r2 = _normalise_theme_definition({"name": "t", "customCSS": huge})
-        assert len(r2["customCSS"]) <= 32 * 1024
+        def _raise(argv, **kwargs):
+            raise PtyUnavailableError("pty missing for tests")

-    def test_custom_css_empty_dropped(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        for val in ("", "   \n\t", None):
-            r = _normalise_theme_definition({"name": "t", "customCSS": val})
-            assert "customCSS" not in r
+        monkeypatch.setattr(
+            self.ws_module,
+            "_resolve_chat_argv",
+            lambda resume=None: (["/bin/cat"], None, None),
+        )
+        # Patch PtyBridge.spawn at the web_server module's binding.
+        import hermes_cli.web_server as ws_mod

-    def test_component_styles_per_bucket(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        r = _normalise_theme_definition({
-            "name": "t",
-            "componentStyles": {
-                "card": {
-                    "clipPath": "polygon(0 0, 100% 0, 100% 100%, 0 100%)",
-                    "boxShadow": "inset 0 0 0 1px red",
-                    "bad prop!": "ignored",  # non-alnum prop rejected
-                },
-                "header": {"background": "linear-gradient(red, blue)"},
-                "rogueBucket": {"foo": "bar"},  # not a known bucket — rejected
-            },
-        })
-        assert r["componentStyles"]["card"] == {
-            "clipPath": "polygon(0 0, 100% 0, 100% 100%, 0 100%)",
-            "boxShadow": "inset 0 0 0 1px red",
-        }
-        assert r["componentStyles"]["header"]["background"].startswith("linear-gradient")
-        assert "rogueBucket" not in r["componentStyles"]
+        monkeypatch.setattr(ws_mod.PtyBridge, "spawn", classmethod(lambda cls, *a, **k: _raise(*a, **k)))

-    def test_component_styles_empty_buckets_dropped(self):
-        from hermes_cli.web_server import _normalise_theme_definition
-        r = _normalise_theme_definition({
-            "name": "t",
-            "componentStyles": {
-                "card": {},        # empty — dropped entirely
-                "header": {"bad prop!": "ignored"},  # all props rejected — bucket dropped
-                "footer": {"background": "black"},
-            },
-        })
-        assert "card" not in r.get("componentStyles", {})
-        assert "header" not in r.get("componentStyles", {})
-        assert r["componentStyles"]["footer"]["background"] == "black"
+        with self.client.websocket_connect(self._url()) as conn:
+            # Expect a final text frame with the error message, then close.
+            msg = conn.receive_text()
+            assert "pty missing" in msg or "unavailable" in msg.lower() or "pty" in msg.lower()

-    def test_component_styles_accepts_numeric_values(self):
-        """Numeric values (e.g. opacity: 0.8) are coerced to strings."""
-        from hermes_cli.web_server import _normalise_theme_definition
-        r = _normalise_theme_definition({
-            "name": "t",
-            "componentStyles": {"card": {"opacity": 0.8, "zIndex": 5}},
-        })
-        assert r["componentStyles"]["card"] == {"opacity": "0.8", "zIndex": "5"}
+    def test_resume_parameter_is_forwarded_to_argv(self, monkeypatch):
+        captured: dict = {}

+        def fake_resolve(resume=None):
+            captured["resume"] = resume
+            return (["/bin/sh", "-c", "printf resume-arg-ok"], None, None)

-class TestDashboardPluginManifestExtensions:
-    """Tests for the extended plugin manifest fields (tab.override,
-    tab.hidden, slots) read by _discover_dashboard_plugins()."""
+        monkeypatch.setattr(self.ws_module, "_resolve_chat_argv", fake_resolve)

-    def _write_plugin(self, tmp_path, name, manifest):
-        import json
-        plug_dir = tmp_path / "plugins" / name / "dashboard"
-        plug_dir.mkdir(parents=True)
-        (plug_dir / "manifest.json").write_text(json.dumps(manifest))
-        return plug_dir
+        with self.client.websocket_connect(self._url(resume="sess-42")) as conn:
+            # Drain briefly so the handler actually invokes the resolver.
+            try:
+                conn.receive_bytes()
+            except Exception:
+                pass
+        assert captured.get("resume") == "sess-42"

-    def test_override_and_hidden_carried_through(self, tmp_path, monkeypatch):
-        monkeypatch.setenv("HERMES_HOME", str(tmp_path))
-        self._write_plugin(tmp_path, "skin-home", {
-            "name": "skin-home",
-            "label": "Skin Home",
-            "tab": {"path": "/skin-home", "override": "/", "hidden": True},
-            "slots": ["sidebar", "header-left"],
-            "entry": "dist/index.js",
-        })
-        from hermes_cli import web_server
-        # Bust the process-level cache so the test plugin is picked up.
-        web_server._dashboard_plugins_cache = None
-        plugins = web_server._get_dashboard_plugins(force_rescan=True)
-        entry = next(p for p in plugins if p["name"] == "skin-home")
-        assert entry["tab"]["override"] == "/"
-        assert entry["tab"]["hidden"] is True
-        assert entry["slots"] == ["sidebar", "header-left"]
-
-    def test_override_requires_leading_slash(self, tmp_path, monkeypatch):
-        monkeypatch.setenv("HERMES_HOME", str(tmp_path))
-        self._write_plugin(tmp_path, "bad-override", {
-            "name": "bad-override",
-            "label": "Bad",
-            "tab": {"path": "/bad", "override": "no-leading-slash"},
-            "entry": "dist/index.js",
-        })
-        from hermes_cli import web_server
-        web_server._dashboard_plugins_cache = None
-        plugins = web_server._get_dashboard_plugins(force_rescan=True)
-        entry = next(p for p in plugins if p["name"] == "bad-override")
-        assert "override" not in entry["tab"]
-
-    def test_slots_default_empty(self, tmp_path, monkeypatch):
-        monkeypatch.setenv("HERMES_HOME", str(tmp_path))
-        self._write_plugin(tmp_path, "no-slots", {
-            "name": "no-slots",
-            "label": "No Slots",
-            "tab": {"path": "/no-slots"},
-            "entry": "dist/index.js",
-        })
-        from hermes_cli import web_server
-        web_server._dashboard_plugins_cache = None
-        plugins = web_server._get_dashboard_plugins(force_rescan=True)
-        entry = next(p for p in plugins if p["name"] == "no-slots")
-        assert entry["slots"] == []
-        assert "hidden" not in entry["tab"]
-        assert "override" not in entry["tab"]
-
-    def test_slots_filters_non_string_entries(self, tmp_path, monkeypatch):
-        monkeypatch.setenv("HERMES_HOME", str(tmp_path))
-        self._write_plugin(tmp_path, "mixed-slots", {
-            "name": "mixed-slots",
-            "label": "Mixed",
-            "tab": {"path": "/mixed-slots"},
-            "slots": ["sidebar", "", 42, None, "header-right"],
-            "entry": "dist/index.js",
-        })
-        from hermes_cli import web_server
-        web_server._dashboard_plugins_cache = None
-        plugins = web_server._get_dashboard_plugins(force_rescan=True)
-        entry = next(p for p in plugins if p["name"] == "mixed-slots")
-        assert entry["slots"] == ["sidebar", "header-right"]
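
The resize test in this diff sends `\x1b[RESIZE:99;41]` over the socket and expects `tput cols` / `tput lines` to report 99 and 41, so the escape carries columns first, then rows. A minimal sketch of how such a frame could be recognised and turned into a `winsize` payload follows. The helper names `split_resize` and `pack_winsize` are hypothetical and are not taken from `hermes_cli`; the real writer loop may match the escape differently.

```python
import re
import struct
from typing import Optional, Tuple

# Assumption: the escape has exactly the shape used in the test,
# ESC [ RESIZE : <cols> ; <rows> ]
_RESIZE_RE = re.compile(r"\x1b\[RESIZE:(\d+);(\d+)\]")


def split_resize(frame: str) -> Tuple[Optional[Tuple[int, int]], str]:
    """Return ((cols, rows) or None, text left over to forward to the PTY)."""
    match = _RESIZE_RE.search(frame)
    if match is None:
        return None, frame
    cols, rows = int(match.group(1)), int(match.group(2))
    # Strip the escape so it never reaches the child process as input.
    passthrough = frame[: match.start()] + frame[match.end():]
    return (cols, rows), passthrough


def pack_winsize(cols: int, rows: int) -> bytes:
    """Pack a struct winsize: ws_row, ws_col, ws_xpixel, ws_ypixel."""
    return struct.pack("HHHH", rows, cols, 0, 0)


size, rest = split_resize("\x1b[RESIZE:99;41]")
print(size, repr(rest))
```

On a POSIX host the packed bytes are what a `fcntl.ioctl(fd, termios.TIOCSWINSZ, ...)` call on the PTY master expects. Note that `struct winsize` puts rows before columns, the reverse of the order in the escape.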
Author	SHA1	Message	Date
emozilla	1cd2b280fd	Merge remote-tracking branch 'origin/main' into feat/dashboard-chat	2026-04-22 21:42:14 -04:00
emozilla	2c2e32cc45	docs: document the dashboard Chat tab AGENTS.md — new subsection under TUI Architecture explaining that the dashboard embeds the real hermes --tui rather than rewriting it, with pointers to the pty_bridge + WebSocket endpoint and the rule 'never add a parallel chat surface in React.' website/docs/user-guide/features/web-dashboard.md — user-facing Chat section inside the existing Web Dashboard page, covering how it works (WebSocket + PTY + xterm.js), the Sessions-page resume flow, and prerequisites (Node.js, ptyprocess, POSIX kernel / WSL on Windows).	2026-04-21 03:10:30 -04:00
emozilla	a0701b1d5a	fix(tui): replace OSC 52 jargon in /copy confirmation When the user ran /copy successfully, Ink confirmed with: sent OSC52 copy sequence (terminal support required) That reads like a protocol spec to everyone who isn't a terminal implementer. The caveat was a historical artifact — OSC 52 wasn't universally supported when this message was written, so the TUI honestly couldn't guarantee the copy had landed anywhere. Today every modern terminal (including the dashboard's embedded xterm.js) handles OSC 52 reliably. Say what the user actually wants to know — that it copied, and how much — matching the message the TUI already uses for selection copy: copied 1482 chars	2026-04-21 03:10:30 -04:00
emozilla	3d21aee811	feat(web): add Chat tab with xterm.js terminal + Sessions resume button

Wires the new /api/pty WebSocket into the dashboard as a top-level Chat tab. Clicking Chat (or the ▶ play icon on any session row) spawns a PTY running hermes --tui and renders its ANSI output with xterm.js in the browser.

Frontend
--------

web/src/pages/ChatPage.tsx
  * @xterm/xterm v6 + @xterm/addon-webgl renderer (pixel-perfect cell grid; the DOM and canvas renderers each have layout artifacts that break box-drawing glyph connectivity in a browser)
  * @xterm/addon-fit for container-driven resize
  * @xterm/addon-unicode11 for modern wide-char widths (matches Ink's string-width computation so kaomoji / CJK / emoji land on the same cell boundaries as the host expects)
  * @xterm/addon-web-links for URL auto-linking
  * Rounded dark-teal "terminal window" container with 12px internal padding + drop shadow for visual identity within the dashboard
  * Clipboard wiring:
    - Ctrl/Cmd+Shift+C copies xterm selection to system clipboard
    - Ctrl/Cmd+Shift+V pastes system clipboard into the PTY
    - OSC 52 handler writes terminal-emitted clipboard sequences (how Ink's own Ctrl+C and /copy command deliver copy events); decodes via TextDecoder so multi-byte UTF-8 codepoints (U+2265, emoji, CJK) round-trip correctly
    - Plain Ctrl+C still passes through as SIGINT to interrupt a running response
  * Floating "copy last response" button in the bottom-right corner. Triggers Ink's /copy slash command by sending bytes in two frames with a 100ms gap. Ink's tokenizer coalesces rapid adjacent bytes into a paste event (which bypasses the slash dispatcher), so we deliberately split '/copy' and '\r' into separate packets to land them as individual keypresses.

web/src/App.tsx
  Chat nav entry (Terminal icon) at position 2 and <Route path="/chat">.

web/src/pages/SessionsPage.tsx
  Play-icon button per session row that navigates to /chat?resume=<id>; the PTY bridge forwards the resume param to hermes --tui --resume.

web/src/i18n/{en,zh,types}.ts
  nav.chat label + sessions.resumeInChat action label.

web/vite.config.ts
  /api proxy gains ws: true so WebSocket upgrades forward to :9119 when running Vite dev mode against a separate hermes dashboard backend.

web/src/index.css + web/public/fonts-terminal/
  Bundles JetBrains Mono (Regular/Bold/Italic, Apache-2.0, ~280 KB total) as a local webfont. Fonts live outside web/public/fonts/ because the sync-assets prebuild step wipes that directory from @nous-research/ui every build.

Package deps
------------

Net new: @xterm/xterm ^6.0.0, @xterm/addon-fit ^0.11.0, @xterm/addon-webgl ^0.19.0, @xterm/addon-unicode11 ^0.9.0, @xterm/addon-web-links ^0.12.0.

Bundle impact: +420 KB minified / +105 KB gzipped. Acceptable for a feature that replaces what would otherwise be a rewrite of the entire TUI surface in React.

Backend contract preserved
---------------------------

Every TUI affordance (slash popover, model picker, tool cards, markdown streaming, clarify/sudo/approval prompts, skin engine, wide chars, mouse tracking) lands in the browser unchanged because we are running the real Ink binary. Adding a feature to the TUI surfaces in the dashboard immediately. Do NOT add parallel React chat surfaces.	2026-04-21 03:10:30 -04:00
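The OSC 52 note in the commit above (decode as bytes first, then as UTF-8) can be illustrated with a small sketch. The dashboard itself does this in TypeScript with TextDecoder; Python is used here only to keep the illustration short. The function names are hypothetical, and the body layout `<selection>;<base64>` is the standard OSC 52 form, not something taken from the repository.

```python
import base64


def decode_osc52_body(body: str) -> str:
    """Decode an OSC 52 body of the form "<selection>;<base64 payload>".

    Hypothetical helper: the payload is base64 over UTF-8 bytes, so it is
    decoded to bytes first and only then interpreted as UTF-8.
    """
    _selection, _, payload = body.partition(";")
    raw = base64.b64decode(payload)
    return raw.decode("utf-8")


def decode_osc52_body_naive(body: str) -> str:
    """Same as above, but treats each byte as one character (Latin-1).

    This mirrors what a byte-per-character decode does and is shown only
    to make the failure mode visible for multi-byte codepoints.
    """
    _selection, _, payload = body.partition(";")
    return base64.b64decode(payload).decode("latin-1")


if __name__ == "__main__":
    text = "x \u2265 1"  # contains U+2265, a three-byte UTF-8 codepoint
    body = "c;" + base64.b64encode(text.encode("utf-8")).decode("ascii")
    print(decode_osc52_body(body) == text)        # round-trips
    print(decode_osc52_body_naive(body) == text)  # does not round-trip
```

ASCII-only clipboard text survives either path, which is why the difference only shows up with characters such as U+2265, emoji, or CJK.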
emozilla	29b337bca7	feat(web): add /api/pty WebSocket bridge to embed TUI in dashboard

Exposes hermes --tui over a PTY-backed WebSocket so the dashboard can embed the real TUI rather than reimplement its surface. The browser attaches xterm.js to the socket; keystrokes flow in, PTY output bytes flow out.

Architecture:

    browser <Terminal> (xterm.js)
      │  onData ───► ws.send(keystrokes)
      │  onResize ► ws.send('\x1b[RESIZE:cols;rows]')
      │  write ◄── ws.onmessage (PTY bytes)
      ▼
    FastAPI /api/pty (token-gated, loopback-only)
      ▼
    PtyBridge (ptyprocess) ── spawns node ui-tui/dist/entry.js ──► tui_gateway + AIAgent

Components
----------

hermes_cli/pty_bridge.py
  Thin wrapper around ptyprocess.PtyProcess: byte-safe read/write on the master fd via os.read/os.write (not PtyProcessUnicode, because ANSI is inherently byte-oriented and UTF-8 boundaries may land mid-read), non-blocking select-based reads, TIOCSWINSZ resize, idempotent SIGHUP→SIGTERM→SIGKILL teardown, platform guard (POSIX-only; on Windows it is supported only under WSL).

hermes_cli/web_server.py
  @app.websocket("/api/pty") endpoint gated by the existing _SESSION_TOKEN (via ?token= query param since browsers can't set Authorization on WS upgrades). Loopback-only enforcement. Reader task uses run_in_executor to pump PTY bytes without blocking the event loop. Writer loop intercepts a custom \x1b[RESIZE:cols;rows] escape before forwarding to the PTY. The endpoint resolves the TUI argv through a _resolve_chat_argv hook so tests can inject fake commands without building the real TUI.

Tests
-----

tests/hermes_cli/test_pty_bridge.py: 12 unit tests covering spawn, stdout, stdin round-trip, EOF, resize (via TIOCSWINSZ + tput readback), close idempotency, cwd, env forwarding, unavailable-platform error.

tests/hermes_cli/test_web_server.py: TestPtyWebSocket adds 7 tests covering missing/bad token rejection (close code 4401), stdout streaming, stdin round-trip, resize escape forwarding, unavailable-platform ANSI error frame + 1011 close, resume parameter forwarding to argv.

96 tests pass under scripts/run_tests.sh.	2026-04-21 02:48:16 -04:00
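The writer loop's handling of the custom `\x1b[RESIZE:cols;rows]` escape, described in the commit above, might look roughly like the following. This is a minimal sketch and not the code in hermes_cli/web_server.py: the regex and the function name `split_resize_frames` are assumptions, and it assumes each resize frame arrives whole within one WebSocket message.

```python
import re

# Assumed wire format, taken from the commit message: ESC [ RESIZE:<cols>;<rows> ]
RESIZE_RE = re.compile(rb"\x1b\[RESIZE:(\d+);(\d+)\]")


def split_resize_frames(data: bytes) -> tuple[bytes, list[tuple[int, int]]]:
    """Separate resize requests from ordinary keystroke bytes.

    Returns (passthrough, sizes): `passthrough` is what should still be
    written to the PTY, and `sizes` is a list of (cols, rows) pairs in the
    order they appeared. Hypothetical helper for illustration only.
    """
    sizes = [(int(m.group(1)), int(m.group(2))) for m in RESIZE_RE.finditer(data)]
    passthrough = RESIZE_RE.sub(b"", data)
    return passthrough, sizes


if __name__ == "__main__":
    keys, sizes = split_resize_frames(b"ls\x1b[RESIZE:120;40]\r")
    print(keys, sizes)
    # A caller would then apply each size to the PTY. With ptyprocess that
    # is setwinsize(rows, cols); note the argument order is the reverse of
    # the cols;rows order used in the frame.
```

Stripping the frame before forwarding matters because the child process would otherwise receive the escape as literal input.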