feat: add /recap slash command — summarize recent session activity

Inspired by Claude Code's /recap (v2.1.114, April 2026). Produces a compact text summary of recent activity in the current session: turn counts, tools used, files touched, last user ask, and last assistant reply. Useful when juggling multiple sessions or returning to a session after being away. Implementation notes: - Pure local computation from the in-memory conversation history / gateway transcript. No LLM call, no auxiliary model, no prompt-cache invalidation — a recap should be instant and free. - Works unchanged on CLI and every gateway platform (Telegram, Discord, Slack, …) via a shared hermes_cli.session_recap.build_recap helper. Claude Code only ships this on the CLI. - Tailored to hermes-agent's tool vocabulary: file-editing tools (patch, write_file, read_file, skill_manage, skill_view) surface touched paths; tool-call counts highlight which classes of work drove the session. - Added to ACTIVE_SESSION_BYPASS_COMMANDS and the Level-2 early intercept in gateway/run.py so /recap works while an agent is running (read-only, safe). Source: https://code.claude.com/docs/en/whats-new/2026-w17
2026-05-01 17:10:46 -07:00
492 changed files with 4081 additions and 48854 deletions
@@ -25,7 +25,3 @@ ui-tui/packages/hermes-ink/dist/

 # Runtime data (bind-mounted at /opt/data; must not leak into build context)
 data/
-
-# Compose/profile runtime state (bind-mounted; avoid ownership/secret issues)
-hermes-config/
-runtime/
@@ -1,44 +0,0 @@
-# Dependabot configuration for hermes-agent.
-#
-# Deliberately scoped to github-actions only.
-#
-# We do NOT enable Dependabot for pip / npm / any source-dependency ecosystem
-# because we pin source dependencies exactly (uv.lock, package-lock.json) as
-# part of our supply-chain posture. Automatic version-bump PRs against those
-# pins would undermine the strategy — pins are moved deliberately, after
-# review, not on a schedule.
-#
-# github-actions is the exception: action pins (we use full commit SHAs per
-# supply-chain policy) must be updated when upstream actions publish
-# patches — usually themselves security fixes. Dependabot opens a PR with
-# the new SHA and release notes; we review and merge like any other PR.
-#
-# Security-update PRs for source dependencies (opened ONLY when a CVE is
-# published affecting a currently-pinned version) are enabled separately
-# via the repo's Dependabot security updates setting
-# (Settings → Code security → Dependabot → Dependabot security updates).
-# Those are CVE-only, not schedule-driven, and do not conflict with our
-# pinning strategy — they fire when a pinned version becomes known-bad,
-# which is exactly when we want to move the pin.
-
-version: 2
-updates:
-  - package-ecosystem: "github-actions"
-    directory: "/"
-    schedule:
-      interval: "weekly"
-      day: "monday"
-    open-pull-requests-limit: 5
-    labels:
-      - "dependencies"
-      - "github-actions"
-    commit-message:
-      prefix: "chore(actions)"
-      include: "scope"
-    groups:
-      # Batch routine action bumps into one PR per week to reduce noise.
-      # Security updates still open individually and bypass grouping.
-      actions-minor-patch:
-        update-types:
-          - "minor"
-          - "patch"
@@ -1,67 +0,0 @@
-name: OSV-Scanner
-
-# Scans lockfiles (uv.lock, package-lock.json) against the OSV vulnerability
-# database. Runs on every PR that touches a lockfile and on a weekly schedule
-# against main.
-#
-# This is detection-only — OSV-Scanner does NOT open PRs or modify pins.
-# It reports known CVEs in currently-pinned dependency versions so we can
-# decide when and how to patch on our own schedule. Our pinning strategy
-# (full SHA / exact version) is preserved; only the notification signal
-# is added.
-#
-# Complements the existing supply-chain-audit.yml workflow (which scans
-# for malicious code patterns in PR diffs) by covering the orthogonal
-# "currently-pinned dep became known-vulnerable" case.
-#
-# Uses Google's officially-recommended reusable workflow, pinned by SHA.
-# Findings land in the repo's Security tab (Code Scanning > OSV-Scanner).
-# fail-on-vuln is disabled so the job does not block merges on pre-existing
-# vulnerabilities in pinned deps that we may need to patch deliberately.
-
-on:
-  pull_request:
-    branches: [main]
-    paths:
-      - 'uv.lock'
-      - 'pyproject.toml'
-      - 'package.json'
-      - 'package-lock.json'
-      - 'ui-tui/package.json'
-      - 'ui-tui/package-lock.json'
-      - 'website/package.json'
-      - 'website/package-lock.json'
-      - '.github/workflows/osv-scanner.yml'
-  push:
-    branches: [main]
-    paths:
-      - 'uv.lock'
-      - 'pyproject.toml'
-      - 'package.json'
-      - 'package-lock.json'
-      - 'ui-tui/package-lock.json'
-      - 'website/package-lock.json'
-  schedule:
-    # Weekly scan against main — catches CVEs published after merge for
-    # deps that haven't changed since.
-    - cron: '0 9 * * 1'
-  workflow_dispatch:
-
-permissions:
-  # Required by the reusable workflow to upload SARIF to the Security tab.
-  actions: read
-  contents: read
-  security-events: write
-
-jobs:
-  scan:
-    name: Scan lockfiles
-    uses: google/osv-scanner-action/.github/workflows/osv-scanner-reusable.yml@c51854704019a247608d928f370c98740469d4b5  # v2.3.5
-    with:
-      # Scan explicit lockfiles rather than recursing, so we only look at
-      # the three sources of truth and skip vendored / test / worktree dirs.
-      scan-args: |-
-        --lockfile=uv.lock
-        --lockfile=ui-tui/package-lock.json
-        --lockfile=website/package-lock.json
-      fail-on-vuln: false
@@ -37,17 +37,12 @@ hermes-agent/
 │   ├── platforms/        # Adapter per platform (telegram, discord, slack, whatsapp,
 │   │                     #   homeassistant, signal, matrix, mattermost, email, sms,
 │   │                     #   dingtalk, wecom, weixin, feishu, qqbot, bluebubbles,
-│   │                     #   yuanbao, webhook, api_server, ...). See ADDING_A_PLATFORM.md.
+│   │                     #   webhook, api_server, ...). See ADDING_A_PLATFORM.md.
 │   └── builtin_hooks/    # Extension point for always-registered gateway hooks (none shipped)
 ├── plugins/              # Plugin system (see "Plugins" section below)
 │   ├── memory/           # Memory-provider plugins (honcho, mem0, supermemory, ...)
 │   ├── context_engine/   # Context-engine plugins
-│   ├── kanban/           # Multi-agent board dispatcher + worker plugin
-│   ├── hermes-achievements/  # Gamified achievement tracking
-│   ├── observability/    # Metrics / traces / logs plugin
-│   ├── image_gen/        # Image-generation providers
-│   └── <others>/         # disk-cleanup, example-dashboard, google_meet, platforms,
-│                         #   spotify, strike-freedom-cockpit, ...
+│   └── <others>/         # Dashboard, image-gen, disk-cleanup, examples, ...
 ├── optional-skills/      # Heavier/niche skills shipped but NOT active by default
 ├── skills/               # Built-in skills bundled with the repo
 ├── ui-tui/               # Ink (React) terminal UI — `hermes --tui`
@@ -58,7 +53,7 @@ hermes-agent/
 ├── environments/         # RL training environments (Atropos)
 ├── scripts/              # run_tests.sh, release.py, auxiliary scripts
 ├── website/              # Docusaurus docs site
-└── tests/                # Pytest suite (~17k tests across ~900 files as of May 2026)
+└── tests/                # Pytest suite (~15k tests across ~700 files as of Apr 2026)
 ```

 **User config:** `~/.hermes/config.yaml` (settings), `~/.hermes/.env` (API keys only).
@@ -262,16 +257,7 @@ The dashboard embeds the real `hermes --tui` — **not** a rewrite.  See `hermes

 ## Adding New Tools

-For most custom or local-only tools, do **not** edit Hermes core. Use the plugin
-route instead: create `~/.hermes/plugins/<name>/plugin.yaml` and
-`~/.hermes/plugins/<name>/__init__.py`, then register tools with
-`ctx.register_tool(...)`. Plugin toolsets are discovered automatically and can be
-enabled or disabled without touching `tools/` or `toolsets.py`.
-
-Use the built-in route below only when the user is explicitly contributing a new
-core Hermes tool that should ship in the base system.
-
-Built-in/core tools require changes in **2 files**:
+Requires changes in **2 files**:

 **1. Create `tools/your_tool.py`:**
 ```python
@@ -294,9 +280,9 @@ registry.register(
 )
 ```

-**2. Add to `toolsets.py`** — either `_HERMES_CORE_TOOLS` (all platforms) or a new toolset. **This step is required:** auto-discovery imports the tool and registers its schema, but the tool is only *exposed to an agent* if its name appears in a toolset. `_HERMES_CORE_TOOLS` is not dead code — it's the default bundle every platform's base toolset inherits from.
+**2. Add to `toolsets.py`** — either `_HERMES_CORE_TOOLS` (all platforms) or a new toolset.

-Auto-discovery: any `tools/*.py` file with a top-level `registry.register()` call is imported automatically — no manual import list to maintain. Wiring into a toolset is still a deliberate, manual step.
+Auto-discovery: any `tools/*.py` file with a top-level `registry.register()` call is imported automatically — no manual import list to maintain.

 The registry handles schema collection, dispatch, availability checking, and error wrapping. All handlers MUST return a JSON string.

@@ -318,22 +304,6 @@ The registry handles schema collection, dispatch, availability checking, and err
   section is handled automatically by the deep-merge and does NOT require
   a version bump.

-### Top-level `config.yaml` sections (non-exhaustive):
-
-`model`, `agent`, `terminal`, `compression`, `display`, `stt`, `tts`,
-`memory`, `security`, `delegation`, `smart_model_routing`, `checkpoints`,
-`auxiliary`, `curator`, `skills`, `gateway`, `logging`, `cron`, `profiles`,
-`plugins`, `honcho`.
-
-`auxiliary` holds per-task overrides for side-LLM work (curator, vision,
-embedding, title generation, session_search, etc.) — each task can pin
-its own provider/model/base_url/max_tokens/reasoning_effort. See
-`agent/auxiliary_client.py::_resolve_auto` for resolution order.
-
-`curator` holds the background skill-maintenance config —
-`enabled`, `interval_hours`, `min_idle_hours`, `stale_after_days`,
-`archive_after_days`, `backup` (nested).
-
 ### .env variables (SECRETS ONLY — API keys, tokens, passwords):
 1. Add to `OPTIONAL_ENV_VARS` in `hermes_cli/config.py` with metadata:
 ```python
@@ -540,176 +510,11 @@ niche skills belong in `optional-skills/`.

 ### SKILL.md frontmatter

-Standard fields: `name`, `description`, `version`, `author`, `license`,
-`platforms` (OS-gating list: `[macos]`, `[linux, macos]`, ...),
+Standard fields: `name`, `description`, `version`, `platforms`
+(OS-gating list: `[macos]`, `[linux, macos]`, ...),
 `metadata.hermes.tags`, `metadata.hermes.category`,
-`metadata.hermes.related_skills`, `metadata.hermes.config` (config.yaml
-settings the skill needs — stored under `skills.config.<key>`, prompted
-during setup, injected at load time).
-
-Top-level `tags:` and `category:` are also accepted and mirrored from
-`metadata.hermes.*` by the loader.
-
---
-
-## Toolsets
-
-All toolsets are defined in `toolsets.py` as a single `TOOLSETS` dict.
-Each platform's adapter picks a base toolset (e.g. Telegram uses
-`"messaging"`); `_HERMES_CORE_TOOLS` is the default bundle most
-platforms inherit from.
-
-Current toolset keys: `browser`, `clarify`, `code_execution`, `cronjob`,
-`debugging`, `delegation`, `discord`, `discord_admin`, `feishu_doc`,
-`feishu_drive`, `file`, `homeassistant`, `image_gen`, `kanban`, `memory`,
-`messaging`, `moa`, `rl`, `safe`, `search`, `session_search`, `skills`,
-`spotify`, `terminal`, `todo`, `tts`, `video`, `vision`, `web`, `yuanbao`.
-
-Enable/disable per platform via `hermes tools` (the curses UI) or the
-`tools.<platform>.enabled` / `tools.<platform>.disabled` lists in
-`config.yaml`.
-
---
-
-## Delegation (`delegate_task`)
-
-`tools/delegate_tool.py` spawns a subagent with an isolated
-context + terminal session. Synchronous: the parent waits for the
-child's summary before continuing its own loop — if the parent is
-interrupted, the child is cancelled.
-
-Two shapes:
-
- **Single:** pass `goal` (+ optional `context`, `toolsets`).
- **Batch (parallel):** pass `tasks: [...]` — each gets its own subagent
-  running concurrently. Concurrency is capped by
-  `delegation.max_concurrent_children` (default 3).
-
-Roles:
-
- `role="leaf"` (default) — focused worker. Cannot call `delegate_task`,
-  `clarify`, `memory`, `send_message`, `execute_code`.
- `role="orchestrator"` — retains `delegate_task` so it can spawn its
-  own workers. Gated by `delegation.orchestrator_enabled` (default true)
-  and bounded by `delegation.max_spawn_depth` (default 2).
-
-Key config knobs (under `delegation:` in `config.yaml`):
-`max_concurrent_children`, `max_spawn_depth`, `child_timeout_seconds`,
-`orchestrator_enabled`, `subagent_auto_approve`, `inherit_mcp_toolsets`,
-`max_iterations`.
-
-Synchronicity rule: delegate_task is **not** durable. For long-running
-work that must outlive the current turn, use `cronjob` or
-`terminal(background=True, notify_on_complete=True)` instead.
-
---
-
-## Curator (skill lifecycle)
-
-Background skill-maintenance system that tracks usage on agent-created
-skills and auto-archives stale ones. Users never lose skills; archives
-go to `~/.hermes/skills/.archive/` and are restorable.
-
- **Core:** `agent/curator.py` (review loop, auto-transitions, LLM review
-  prompt) + `agent/curator_backup.py` (pre-run tar.gz snapshots).
- **CLI:** `hermes_cli/curator.py` wires `hermes curator <verb>` where
-  verbs are: `status`, `run`, `pause`, `resume`, `pin`, `unpin`,
-  `archive`, `restore`, `prune`, `backup`, `rollback`.
- **Telemetry:** `tools/skill_usage.py` owns the sidecar
-  `~/.hermes/skills/.usage.json` — per-skill `use_count`, `view_count`,
-  `patch_count`, `last_activity_at`, `state` (active / stale /
-  archived), `pinned`.
-
-Invariants:
- Curator only touches skills with `created_by: "agent"` provenance —
-  bundled + hub-installed skills are off-limits.
- Never deletes; max destructive action is archive.
- Pinned skills are exempt from every auto-transition and from the
-  LLM review pass.
- `skill_manage(action="delete")` refuses pinned skills; patch/edit/
-  write_file/remove_file go through so the agent can keep improving
-  pinned skills.
-
-Config section (`curator:` in `config.yaml`):
-`enabled`, `interval_hours`, `min_idle_hours`, `stale_after_days`,
-`archive_after_days`, `backup.*`.
-
-Full user-facing docs: `website/docs/user-guide/features/curator.md`.
-
---
-
-## Cron (scheduled jobs)
-
-`cron/jobs.py` (job store) + `cron/scheduler.py` (tick loop). Agents
-schedule jobs via the `cronjob` tool; users via `hermes cron <verb>`
-(`list`, `add`, `edit`, `pause`, `resume`, `run`, `remove`) or the
-`/cron` slash command.
-
-Supported schedule formats:
- Duration: `"30m"`, `"2h"`, `"1d"`
- "every" phrase: `"every 2h"`, `"every monday 9am"`
- 5-field cron expression: `"0 9 * * *"`
- ISO timestamp (one-shot): `"2026-06-01T09:00:00Z"`
-
-Per-job fields include `skills` (load specific skills), `model` /
-`provider` overrides, `script` (pre-run data-collection script whose
-stdout is injected into the prompt; `no_agent=True` turns the script
-into the entire job), `context_from` (chain job A's last output into
-job B's prompt), `workdir` (run in a specific directory with its
-`AGENTS.md`/`CLAUDE.md` loaded), and multi-platform delivery.
-
-Hardening invariants:
- **3-minute hard interrupt** on cron sessions — runaway agent loops
-  cannot monopolize the scheduler.
- Catchup window: half the job's period, clamped to 120s–2h.
- Grace window: 120s for one-shot jobs whose fire time was missed.
- File lock at `~/.hermes/cron/.tick.lock` prevents duplicate ticks
-  across processes.
- Cron sessions pass `skip_memory=True` by default; memory providers
-  intentionally do not run during cron.
-
-Cron deliveries are **not** mirrored into the target gateway session —
-they land in their own cron session with a header/footer frame so the
-main conversation's message-role alternation stays intact.
-
---
-
-## Kanban (multi-agent work queue)
-
-Durable SQLite-backed board that lets multiple profiles / workers
-collaborate on shared tasks. Users drive it via `hermes kanban <verb>`;
-workers spawned by the dispatcher drive it via a dedicated `kanban_*`
-toolset so their schema footprint is zero when they're not inside a
-kanban task.
-
- **CLI:** `hermes_cli/kanban.py` wires `hermes kanban` with verbs
-  `init`, `create`, `list` (alias `ls`), `show`, `assign`, `link`,
-  `unlink`, `comment`, `complete`, `block`, `unblock`, `archive`,
-  `tail`, plus less-commonly-used `watch`, `stats`, `runs`, `log`,
-  `assignees`, `heartbeat`, `notify-*`, `dispatch`, `daemon`, `gc`.
- **Worker toolset:** `tools/kanban_tools.py` exposes `kanban_show`,
-  `kanban_complete`, `kanban_block`, `kanban_heartbeat`, `kanban_comment`,
-  `kanban_create`, `kanban_link` — gated by `HERMES_KANBAN_TASK` so
-  the schema only appears for processes actually running as a worker.
- **Dispatcher:** long-lived loop that (default every 60s) reclaims
-  stale claims, promotes ready tasks, atomically claims, and spawns
-  assigned profiles. Runs **inside the gateway** by default via
-  `kanban.dispatch_in_gateway: true`.
- **Plugin assets:** `plugins/kanban/dashboard/` (web UI) +
-  `plugins/kanban/systemd/` (`hermes-kanban-dispatcher.service` for
-  standalone dispatcher deployment).
-
-Isolation model:
- **Board** is the hard boundary — workers are spawned with
-  `HERMES_KANBAN_BOARD` pinned in their env so they can't see other
-  boards.
- **Tenant** is a soft namespace *within* a board — one specialist
-  fleet can serve multiple businesses with workspace-path + memory-key
-  isolation.
- After ~5 consecutive spawn failures on the same task the dispatcher
-  auto-blocks it to prevent spin loops.
-
-Full user-facing docs: `website/docs/user-guide/features/kanban.md`.
+`metadata.hermes.config` (config.yaml settings the skill needs — stored
+under `skills.config.<key>`, prompted during setup, injected at load time).

 ---

@@ -4,7 +4,6 @@ from __future__ import annotations

 import asyncio
 import contextvars
-import json
 import logging
 import os
 from collections import defaultdict, deque
@@ -48,7 +47,6 @@ from acp.schema import (
    TextContentBlock,
    UnstructuredCommandInput,
    Usage,
-    UsageUpdate,
    UserMessageChunk,
 )

@@ -67,7 +65,6 @@ from acp_adapter.events import (
 )
 from acp_adapter.permissions import make_approval_callback
 from acp_adapter.session import SessionManager, SessionState, _expand_acp_enabled_toolsets
-from acp_adapter.tools import build_tool_complete, build_tool_start

 logger = logging.getLogger(__name__)

@@ -318,66 +315,6 @@ class HermesACPAgent(acp.Agent):

        return target_provider, new_model

-    @staticmethod
-    def _build_usage_update(state: SessionState) -> UsageUpdate | None:
-        """Build ACP native context-usage data for clients like Zed.
-
-        Zed's circular context indicator is driven by ACP ``usage_update``
-        session updates: ``size`` is the model context window and ``used`` is
-        the current request pressure.  Hermes estimates ``used`` from the same
-        buckets it sends to providers: system prompt, conversation history, and
-        tool schemas.
-        """
-        agent = state.agent
-        compressor = getattr(agent, "context_compressor", None)
-        size = int(getattr(compressor, "context_length", 0) or 0)
-        if size <= 0:
-            return None
-
-        try:
-            from agent.model_metadata import estimate_request_tokens_rough
-
-            used = estimate_request_tokens_rough(
-                state.history,
-                system_prompt=getattr(agent, "_cached_system_prompt", "") or "",
-                tools=getattr(agent, "tools", None) or None,
-            )
-        except Exception:
-            logger.debug("Could not estimate ACP native context usage", exc_info=True)
-            used = int(getattr(compressor, "last_prompt_tokens", 0) or 0)
-
-        return UsageUpdate(
-            session_update="usage_update",
-            size=max(size, 0),
-            used=max(used, 0),
-        )
-
-    async def _send_usage_update(self, state: SessionState) -> None:
-        """Send ACP native context usage to the connected client."""
-        if not self._conn:
-            return
-        update = self._build_usage_update(state)
-        if update is None:
-            return
-        try:
-            await self._conn.session_update(
-                session_id=state.session_id,
-                update=update,
-            )
-        except Exception:
-            logger.warning(
-                "Failed to send ACP usage update for session %s",
-                state.session_id,
-                exc_info=True,
-            )
-
-    def _schedule_usage_update(self, state: SessionState) -> None:
-        """Schedule native context indicator refresh after ACP responses."""
-        if not self._conn:
-            return
-        loop = asyncio.get_running_loop()
-        loop.call_soon(asyncio.create_task, self._send_usage_update(state))
-
    async def _register_session_mcp_servers(
        self,
        state: SessionState,
@@ -548,99 +485,37 @@ class HermesACPAgent(acp.Agent):
            )
        return None

-    @staticmethod
-    def _history_tool_call_name_args(tool_call: dict[str, Any]) -> tuple[str, dict[str, Any]]:
-        """Extract function name/arguments from an OpenAI-style tool_call."""
-        function = tool_call.get("function") if isinstance(tool_call.get("function"), dict) else {}
-        name = str(function.get("name") or tool_call.get("name") or "unknown_tool")
-        raw_args = function.get("arguments") or tool_call.get("arguments") or tool_call.get("args") or {}
-        if isinstance(raw_args, str):
-            try:
-                parsed = json.loads(raw_args)
-            except Exception:
-                parsed = {"raw": raw_args}
-            raw_args = parsed
-        if not isinstance(raw_args, dict):
-            raw_args = {}
-        return name, raw_args
-
-    @staticmethod
-    def _history_tool_call_id(tool_call: dict[str, Any]) -> str:
-        """Return the stable provider tool call id for ACP history replay."""
-        return str(
-            tool_call.get("id")
-            or tool_call.get("call_id")
-            or tool_call.get("tool_call_id")
-            or ""
-        ).strip()
-
    async def _replay_session_history(self, state: SessionState) -> None:
        """Send persisted user/assistant history to clients during session/load.

        Zed's ACP history UI calls ``session/load`` after the user picks an item
        from the Agents sidebar. The agent must then replay the full conversation
-        as user/assistant chunks plus reconstructed tool-call start/completion
-        notifications; merely restoring server-side state makes Hermes remember
-        context, but leaves the editor looking like a clean thread.
+        as ``user_message_chunk`` / ``agent_message_chunk`` notifications; merely
+        restoring server-side state makes Hermes remember context, but leaves the
+        editor looking like a clean thread.
        """
        if not self._conn or not state.history:
            return

-        active_tool_calls: dict[str, tuple[str, dict[str, Any]]] = {}
-
-        async def _send(update: Any) -> bool:
+        for message in state.history:
+            role = str(message.get("role") or "")
+            if role not in {"user", "assistant"}:
+                continue
+            text = self._history_message_text(message)
+            if not text:
+                continue
+            update = self._history_message_update(role=role, text=text)
+            if update is None:
+                continue
            try:
                await self._conn.session_update(session_id=state.session_id, update=update)
-                return True
            except Exception:
                logger.warning(
                    "Failed to replay ACP history for session %s",
                    state.session_id,
                    exc_info=True,
                )
-                return False
-
-        for message in state.history:
-            role = str(message.get("role") or "")
-
-            if role in {"user", "assistant"}:
-                text = self._history_message_text(message)
-                if text:
-                    update = self._history_message_update(role=role, text=text)
-                    if update is not None and not await _send(update):
-                        return
-
-            if role == "assistant" and isinstance(message.get("tool_calls"), list):
-                for tool_call in message["tool_calls"]:
-                    if not isinstance(tool_call, dict):
-                        continue
-                    tool_call_id = self._history_tool_call_id(tool_call)
-                    if not tool_call_id:
-                        continue
-                    tool_name, args = self._history_tool_call_name_args(tool_call)
-                    active_tool_calls[tool_call_id] = (tool_name, args)
-                    if not await _send(build_tool_start(tool_call_id, tool_name, args)):
-                        return
-                continue
-
-            if role == "tool":
-                tool_call_id = str(message.get("tool_call_id") or "").strip()
-                tool_name = str(message.get("tool_name") or "").strip()
-                function_args: dict[str, Any] | None = None
-                if tool_call_id in active_tool_calls:
-                    tool_name, function_args = active_tool_calls.pop(tool_call_id)
-                if not tool_call_id or not tool_name:
-                    continue
-                result = message.get("content")
-                if not await _send(
-                    build_tool_complete(
-                        tool_call_id,
-                        tool_name,
-                        result=result if isinstance(result, str) else None,
-                        function_args=function_args,
-                    )
-                ):
-                    return
+                return

    async def new_session(
        self,
@@ -652,24 +527,11 @@ class HermesACPAgent(acp.Agent):
        await self._register_session_mcp_servers(state, mcp_servers)
        logger.info("New session %s (cwd=%s)", state.session_id, cwd)
        self._schedule_available_commands_update(state.session_id)
-        self._schedule_usage_update(state)
        return NewSessionResponse(
            session_id=state.session_id,
            models=self._build_model_state(state),
        )

-    def _schedule_history_replay(self, state: SessionState) -> None:
-        """Replay persisted history after session/load or session/resume returns.
-
-        Zed only attaches streamed transcript/tool updates once the load/resume
-        response has completed. Sending replay notifications while the request is
-        still in-flight can make the server look correct in logs while the editor
-        drops or fails to attach the tool-call history.
-        """
-        loop = asyncio.get_running_loop()
-        replay_coro = self._replay_session_history(state)
-        loop.call_soon(asyncio.create_task, replay_coro)
-
    async def load_session(
        self,
        cwd: str,
@@ -683,9 +545,8 @@ class HermesACPAgent(acp.Agent):
            return None
        await self._register_session_mcp_servers(state, mcp_servers)
        logger.info("Loaded session %s", session_id)
-        self._schedule_history_replay(state)
+        await self._replay_session_history(state)
        self._schedule_available_commands_update(session_id)
-        self._schedule_usage_update(state)
        return LoadSessionResponse(models=self._build_model_state(state))

    async def resume_session(
@@ -701,9 +562,8 @@ class HermesACPAgent(acp.Agent):
            state = self.session_manager.create_session(cwd=cwd)
        await self._register_session_mcp_servers(state, mcp_servers)
        logger.info("Resumed session %s", state.session_id)
-        self._schedule_history_replay(state)
+        await self._replay_session_history(state)
        self._schedule_available_commands_update(state.session_id)
-        self._schedule_usage_update(state)
        return ResumeSessionResponse(models=self._build_model_state(state))

    async def cancel(self, session_id: str, **kwargs: Any) -> None:
@@ -852,7 +712,6 @@ class HermesACPAgent(acp.Agent):
                if self._conn:
                    update = acp.update_agent_message_text(response_text)
                    await self._conn.session_update(session_id, update)
-                    await self._send_usage_update(state)
                return PromptResponse(stop_reason="end_turn")

        # If Zed sends another regular prompt while the same ACP session is
@@ -885,37 +744,24 @@ class HermesACPAgent(acp.Agent):
        tool_call_meta: dict[str, dict[str, Any]] = {}
        previous_approval_cb = None

-        streamed_message = False
-
        if conn:
            tool_progress_cb = make_tool_progress_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
-            reasoning_cb = make_thinking_cb(conn, session_id, loop)
+            thinking_cb = make_thinking_cb(conn, session_id, loop)
            step_cb = make_step_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
            message_cb = make_message_cb(conn, session_id, loop)
-
-            def stream_delta_cb(text: str) -> None:
-                nonlocal streamed_message
-                if text:
-                    streamed_message = True
-                message_cb(text)
-
            approval_cb = make_approval_callback(conn.request_permission, loop, session_id)
        else:
            tool_progress_cb = None
-            reasoning_cb = None
+            thinking_cb = None
            step_cb = None
-            stream_delta_cb = None
+            message_cb = None
            approval_cb = None

        agent = state.agent
        agent.tool_progress_callback = tool_progress_cb
-        # ACP thought panes should not receive Hermes' local kawaii waiting/status
-        # updates. Route provider/model reasoning deltas instead; if the provider
-        # emits no reasoning, Zed should not get a fake "thinking" accordion.
-        agent.thinking_callback = None
-        agent.reasoning_callback = reasoning_cb
+        agent.thinking_callback = thinking_cb
        agent.step_callback = step_cb
-        agent.stream_delta_callback = stream_delta_cb
+        agent.message_callback = message_cb

        # Approval callback is per-thread (thread-local, GHSA-qg5c-hvr5-hjgr).
        # Set it INSIDE _run_agent so the TLS write happens in the executor
@@ -1021,7 +867,7 @@ class HermesACPAgent(acp.Agent):
                )
            except Exception:
                logger.debug("Failed to auto-title ACP session %s", session_id, exc_info=True)
-        if final_response and conn and not streamed_message:
+        if final_response and conn:
            update = acp.update_agent_message_text(final_response)
            await conn.session_update(session_id, update)

@@ -1057,8 +903,6 @@ class HermesACPAgent(acp.Agent):
                cached_read_tokens=result.get("cache_read_tokens"),
            )

-        await self._send_usage_update(state)
-
        stop_reason = "cancelled" if state.cancel_event and state.cancel_event.is_set() else "end_turn"
        return PromptResponse(stop_reason=stop_reason, usage=usage)

@@ -1191,84 +1035,22 @@ class HermesACPAgent(acp.Agent):
            return f"Could not list tools: {e}"

    def _cmd_context(self, args: str, state: SessionState) -> str:
-        """Show ACP session context pressure and compression guidance."""
        n_messages = len(state.history)
-
-        # Count by role.
+        if n_messages == 0:
+            return "Conversation is empty (no messages yet)."
+        # Count by role
        roles: dict[str, int] = {}
        for msg in state.history:
            role = msg.get("role", "unknown")
            roles[role] = roles.get(role, 0) + 1
-
-        agent = state.agent
-        model = state.model or getattr(agent, "model", "")
-        provider = getattr(agent, "provider", None) or "auto"
-        compressor = getattr(agent, "context_compressor", None)
-        context_length = int(getattr(compressor, "context_length", 0) or 0)
-        threshold_tokens = int(getattr(compressor, "threshold_tokens", 0) or 0)
-
-        try:
-            from agent.model_metadata import estimate_request_tokens_rough
-
-            system_prompt = getattr(agent, "_cached_system_prompt", "") or ""
-            tools = getattr(agent, "tools", None) or None
-            approx_tokens = estimate_request_tokens_rough(
-                state.history,
-                system_prompt=system_prompt,
-                tools=tools,
-            )
-        except Exception:
-            logger.debug("Could not estimate ACP context usage", exc_info=True)
-            approx_tokens = 0
-
-        if threshold_tokens <= 0 and context_length > 0:
-            threshold_tokens = int(context_length * 0.80)
-
        lines = [
-            f"Conversation: {n_messages} messages"
-            if n_messages
-            else "Conversation is empty (no messages yet).",
+            f"Conversation: {n_messages} messages",
            f"  user: {roles.get('user', 0)}, assistant: {roles.get('assistant', 0)}, "
            f"tool: {roles.get('tool', 0)}, system: {roles.get('system', 0)}",
        ]
+        model = state.model or getattr(state.agent, "model", "")
        if model:
            lines.append(f"Model: {model}")
-        lines.append(f"Provider: {provider}")
-
-        if approx_tokens > 0:
-            if context_length > 0:
-                usage_pct = (approx_tokens / context_length) * 100
-                lines.append(
-                    f"Context usage: ~{approx_tokens:,} / {context_length:,} tokens ({usage_pct:.1f}%)"
-                )
-            else:
-                lines.append(f"Context usage: ~{approx_tokens:,} tokens")
-
-        if threshold_tokens > 0:
-            if approx_tokens > 0:
-                threshold_pct = (threshold_tokens / context_length) * 100 if context_length > 0 else 0
-                remaining = max(threshold_tokens - approx_tokens, 0)
-                if approx_tokens >= threshold_tokens:
-                    lines.append(
-                        f"Compression: due now (threshold ~{threshold_tokens:,}"
-                        + (f", {threshold_pct:.0f}%" if threshold_pct else "")
-                        + "). Run /compact."
-                    )
-                else:
-                    lines.append(
-                        f"Compression: ~{remaining:,} tokens until threshold "
-                        f"(~{threshold_tokens:,}"
-                        + (f", {threshold_pct:.0f}%" if threshold_pct else "")
-                        + ")."
-                    )
-            else:
-                lines.append(f"Compression threshold: ~{threshold_tokens:,} tokens")
-
-        if getattr(agent, "compression_enabled", True) is False:
-            lines.append("Compression is disabled for this agent.")
-        else:
-            lines.append("Tip: run /compact to compress manually before the threshold.")
-
        return "\n".join(lines)

    def _cmd_reset(self, args: str, state: SessionState) -> str:
@@ -466,10 +466,17 @@ class SessionManager:
                except Exception:
                    logger.debug("Failed to update ACP session metadata", exc_info=True)

-            # Replace stored messages with current history atomically so a
-            # mid-rewrite failure rolls back and the previously persisted
-            # conversation is preserved (salvaged from #13675).
-            db.replace_messages(state.session_id, state.history)
+            # Replace stored messages with current history.
+            db.clear_messages(state.session_id)
+            for msg in state.history:
+                db.append_message(
+                    session_id=state.session_id,
+                    role=msg.get("role", "user"),
+                    content=msg.get("content"),
+                    tool_name=msg.get("tool_name") or msg.get("name"),
+                    tool_calls=msg.get("tool_calls"),
+                    tool_call_id=msg.get("tool_call_id"),
+                )
        except Exception:
            logger.warning("Failed to persist ACP session %s", state.session_id, exc_info=True)

@@ -28,11 +28,6 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
    "terminal": "execute",
    "process": "execute",
    "execute_code": "execute",
-    # Session/meta tools
-    "todo": "other",
-    "skill_view": "read",
-    "skills_list": "read",
-    "skill_manage": "edit",
    # Web / fetch
    "web_search": "fetch",
    "web_extract": "fetch",
@@ -56,28 +51,6 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
 }


-_POLISHED_TOOLS = {
-    # Core operator loop
-    "todo", "memory", "session_search", "delegate_task",
-    # Files / execution
-    "read_file", "write_file", "patch", "search_files", "terminal", "process", "execute_code",
-    # Skills / web / browser / media
-    "skill_view", "skills_list", "skill_manage", "web_search", "web_extract",
-    "browser_navigate", "browser_click", "browser_type", "browser_press", "browser_scroll",
-    "browser_back", "browser_snapshot", "browser_console", "browser_get_images", "browser_vision",
-    "vision_analyze", "image_generate", "text_to_speech",
-    # Schedulers / platform integrations
-    "cronjob", "send_message", "clarify", "discord", "discord_admin",
-    "ha_list_entities", "ha_get_state", "ha_list_services", "ha_call_service",
-    "feishu_doc_read", "feishu_drive_list_comments", "feishu_drive_list_comment_replies",
-    "feishu_drive_reply_comment", "feishu_drive_add_comment",
-    "kanban_create", "kanban_show", "kanban_comment", "kanban_complete",
-    "kanban_block", "kanban_link", "kanban_heartbeat",
-    "yb_query_group_info", "yb_query_group_members", "yb_search_sticker",
-    "yb_send_dm", "yb_send_sticker", "mixture_of_agents",
-}
-
-
 def get_tool_kind(tool_name: str) -> ToolKind:
    """Return the ACP ToolKind for a hermes tool, defaulting to 'other'."""
    return TOOL_KIND_MAP.get(tool_name, "other")
@@ -112,645 +85,18 @@ def build_tool_title(tool_name: str, args: Dict[str, Any]) -> str:
        if urls:
            return f"extract: {urls[0]}" + (f" (+{len(urls)-1})" if len(urls) > 1 else "")
        return "web extract"
-    if tool_name == "process":
-        action = str(args.get("action") or "").strip() or "manage"
-        sid = str(args.get("session_id") or "").strip()
-        return f"process {action}: {sid}" if sid else f"process {action}"
    if tool_name == "delegate_task":
-        tasks = args.get("tasks")
-        if isinstance(tasks, list) and tasks:
-            return f"delegate batch ({len(tasks)} tasks)"
        goal = args.get("goal", "")
        if goal and len(goal) > 60:
            goal = goal[:57] + "..."
        return f"delegate: {goal}" if goal else "delegate task"
-    if tool_name == "session_search":
-        query = str(args.get("query") or "").strip()
-        return f"session search: {query}" if query else "recent sessions"
-    if tool_name == "memory":
-        action = str(args.get("action") or "manage").strip() or "manage"
-        target = str(args.get("target") or "memory").strip() or "memory"
-        return f"memory {action}: {target}"
    if tool_name == "execute_code":
-        code = str(args.get("code") or "").strip()
-        first_line = next((line.strip() for line in code.splitlines() if line.strip()), "")
-        if first_line:
-            if len(first_line) > 70:
-                first_line = first_line[:67] + "..."
-            return f"python: {first_line}"
-        return "python code"
-    if tool_name == "todo":
-        items = args.get("todos")
-        if isinstance(items, list):
-            return f"todo ({len(items)} item{'s' if len(items) != 1 else ''})"
-        return "todo"
-    if tool_name == "skill_view":
-        name = str(args.get("name") or "?").strip() or "?"
-        file_path = str(args.get("file_path") or "").strip()
-        suffix = f"/{file_path}" if file_path else ""
-        return f"skill view ({name}{suffix})"
-    if tool_name == "skills_list":
-        category = str(args.get("category") or "").strip()
-        return f"skills list ({category})" if category else "skills list"
-    if tool_name == "skill_manage":
-        action = str(args.get("action") or "manage").strip() or "manage"
-        name = str(args.get("name") or "?").strip() or "?"
-        file_path = str(args.get("file_path") or "").strip()
-        target = f"{name}/{file_path}" if file_path else name
-        if len(target) > 64:
-            target = target[:61] + "..."
-        return f"skill {action}: {target}"
-    if tool_name == "browser_navigate":
-        return f"navigate: {args.get('url', '?')}"
-    if tool_name == "browser_snapshot":
-        return "browser snapshot"
-    if tool_name == "browser_vision":
-        return f"browser vision: {str(args.get('question', '?'))[:50]}"
-    if tool_name == "browser_get_images":
-        return "browser images"
+        return "execute code"
    if tool_name == "vision_analyze":
-        return f"analyze image: {str(args.get('question', '?'))[:50]}"
-    if tool_name == "image_generate":
-        prompt = str(args.get("prompt") or args.get("description") or "").strip()
-        return f"generate image: {prompt[:50]}" if prompt else "generate image"
-    if tool_name == "cronjob":
-        action = str(args.get("action") or "manage").strip() or "manage"
-        job_id = str(args.get("job_id") or args.get("id") or "").strip()
-        return f"cron {action}: {job_id}" if job_id else f"cron {action}"
+        return f"analyze image: {args.get('question', '?')[:50]}"
    return tool_name


-def _text(content: str) -> Any:
-    return acp.tool_content(acp.text_block(content))
-
-
-def _json_loads_maybe(value: Optional[str]) -> Any:
-    if not isinstance(value, str):
-        return value
-    try:
-        return json.loads(value)
-    except Exception:
-        pass
-
-    # Some Hermes tools append a human hint after a JSON payload, e.g.
-    # ``{...}\n\n[Hint: Results truncated...]``. Keep the structured rendering path
-    # by decoding the first JSON value instead of falling back to raw text.
-    try:
-        decoded, _ = json.JSONDecoder().raw_decode(value.lstrip())
-        return decoded
-    except Exception:
-        return None
-
-
-def _truncate_text(text: str, limit: int = 5000) -> str:
-    if len(text) <= limit:
-        return text
-    return text[: max(0, limit - 100)] + f"\n... ({len(text)} chars total, truncated)"
-
-
-def _fenced_text(text: str, language: str = "") -> str:
-    """Return a Markdown fence that cannot be broken by backticks in text."""
-    longest = max((len(run) for run in text.split("`")[1::2]), default=0)
-    fence = "`" * max(3, longest + 1)
-    return f"{fence}{language}\n{text}\n{fence}"
-
-
-def _format_todo_result(result: Optional[str]) -> Optional[str]:
-    data = _json_loads_maybe(result)
-    if not isinstance(data, dict) or not isinstance(data.get("todos"), list):
-        return None
-    summary = data.get("summary") if isinstance(data.get("summary"), dict) else {}
-    icon = {
-        "completed": "✅",
-        "in_progress": "🔄",
-        "pending": "⏳",
-        "cancelled": "✗",
-    }
-    lines = ["**Todo list**", ""]
-    for item in data["todos"]:
-        if not isinstance(item, dict):
-            continue
-        status = str(item.get("status") or "pending")
-        content = str(item.get("content") or item.get("id") or "").strip()
-        if content:
-            lines.append(f"- {icon.get(status, '•')} {content}")
-    if summary:
-        cancelled = summary.get("cancelled", 0)
-        lines.extend([
-            "",
-            "**Progress:** "
-            f"{summary.get('completed', 0)} completed, "
-            f"{summary.get('in_progress', 0)} in progress, "
-            f"{summary.get('pending', 0)} pending"
-            + (f", {cancelled} cancelled" if cancelled else ""),
-        ])
-    return "\n".join(lines)
-
-
-def _format_read_file_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
-    data = _json_loads_maybe(result)
-    if not isinstance(data, dict):
-        return None
-    if data.get("error") and not data.get("content"):
-        return f"Read failed: {data.get('error')}"
-    content = data.get("content")
-    if not isinstance(content, str):
-        return None
-    path = str((args or {}).get("path") or data.get("path") or "file").strip()
-    offset = (args or {}).get("offset")
-    limit = (args or {}).get("limit")
-    range_bits = []
-    if offset:
-        range_bits.append(f"from line {offset}")
-    if limit:
-        range_bits.append(f"limit {limit}")
-    suffix = f" ({', '.join(range_bits)})" if range_bits else ""
-    header = f"Read {path}{suffix}"
-    if data.get("total_lines") is not None:
-        header += f" — {data.get('total_lines')} total lines"
-    # Hermes read_file output is line-numbered with `|`. If we send it as raw
-    # Markdown, Zed can interpret pipes as tables and collapse the layout.
-    # Fence the payload so file lines stay readable and literal.
-    return _truncate_text(f"{header}\n\n{_fenced_text(content)}")
-
-
-def _format_search_files_result(result: Optional[str]) -> Optional[str]:
-    data = _json_loads_maybe(result)
-    if not isinstance(data, dict):
-        return None
-    matches = data.get("matches")
-    if not isinstance(matches, list):
-        return None
-
-    total = data.get("total_count", len(matches))
-    shown = min(len(matches), 12)
-    truncated = bool(data.get("truncated")) or len(matches) > shown
-    lines = [
-        "Search results",
-        f"Found {total} match{'es' if total != 1 else ''}; showing {shown}.",
-        "",
-    ]
-
-    for match in matches[:shown]:
-        if not isinstance(match, dict):
-            lines.append(f"- {match}")
-            continue
-
-        path = str(match.get("path") or match.get("file") or match.get("filename") or "?")
-        line = match.get("line") or match.get("line_number")
-        content = str(match.get("content") or match.get("text") or "").strip()
-        loc = f"{path}:{line}" if line else path
-        lines.append(f"- {loc}")
-        if content:
-            snippet = _truncate_text(" ".join(content.split()), 300)
-            lines.append(f"  {snippet}")
-
-    if truncated:
-        lines.extend([
-            "",
-            "Results truncated. Narrow the search, add file_glob, or use offset to page.",
-        ])
-    return _truncate_text("\n".join(lines), limit=7000)
-
-
-def _format_execute_code_result(result: Optional[str]) -> Optional[str]:
-    data = _json_loads_maybe(result)
-    if not isinstance(data, dict):
-        return result if isinstance(result, str) and result.strip() else None
-    output = str(data.get("output") or "")
-    error = str(data.get("error") or "")
-    exit_code = data.get("exit_code")
-    parts = [f"Exit code: {exit_code}" if exit_code is not None else "Execution complete"]
-    if output:
-        parts.extend(["", "Output:", output])
-    if error:
-        parts.extend(["", "Error:", error])
-    return _truncate_text("\n".join(parts))
-
-
-def _extract_markdown_headings(content: str, limit: int = 8) -> list[str]:
-    headings: list[str] = []
-    for line in content.splitlines():
-        stripped = line.strip()
-        if stripped.startswith("#"):
-            heading = stripped.lstrip("#").strip()
-            if heading:
-                headings.append(heading)
-        if len(headings) >= limit:
-            break
-    return headings
-
-
-def _format_skill_view_result(result: Optional[str]) -> Optional[str]:
-    data = _json_loads_maybe(result)
-    if not isinstance(data, dict):
-        return None
-    if data.get("success") is False:
-        return f"Skill view failed: {data.get('error', 'unknown error')}"
-    name = str(data.get("name") or "skill")
-    file_path = str(data.get("file") or data.get("path") or "SKILL.md")
-    description = str(data.get("description") or "").strip()
-    content = str(data.get("content") or "")
-    linked = data.get("linked_files") if isinstance(data.get("linked_files"), dict) else None
-
-    lines = ["**Skill loaded**", "", f"- **Name:** `{name}`", f"- **File:** `{file_path}`"]
-    if description:
-        lines.append(f"- **Description:** {description}")
-    if content:
-        lines.append(f"- **Content:** {len(content):,} chars loaded into agent context")
-    if linked:
-        linked_count = sum(len(v) for v in linked.values() if isinstance(v, list))
-        lines.append(f"- **Linked files:** {linked_count}")
-
-    headings = _extract_markdown_headings(content)
-    if headings:
-        lines.extend(["", "**Sections**"])
-        lines.extend(f"- {heading}" for heading in headings)
-
-    lines.extend([
-        "",
-        "_Full skill content is available to the agent but hidden here to keep ACP readable._",
-    ])
-    return "\n".join(lines)
-
-
-def _format_skill_manage_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
-    data = _json_loads_maybe(result)
-    if not isinstance(data, dict):
-        return None
-
-    action = str((args or {}).get("action") or "manage").strip() or "manage"
-    name = str((args or {}).get("name") or data.get("name") or "skill").strip() or "skill"
-    file_path = str((args or {}).get("file_path") or data.get("file_path") or "SKILL.md").strip() or "SKILL.md"
-    success = data.get("success")
-    status = "✅ Skill updated" if success is not False else "✗ Skill update failed"
-
-    lines = [f"**{status}**", "", f"- **Action:** `{action}`", f"- **Skill:** `{name}`"]
-    if action not in {"delete"}:
-        lines.append(f"- **File:** `{file_path}`")
-
-    message = str(data.get("message") or data.get("error") or "").strip()
-    if message:
-        lines.append(f"- **Result:** {message}")
-
-    replacements = data.get("replacements") or data.get("replacement_count")
-    if replacements is not None:
-        lines.append(f"- **Replacements:** {replacements}")
-
-    path = str(data.get("path") or "").strip()
-    if path:
-        lines.append(f"- **Path:** `{path}`")
-
-    return "\n".join(lines)
-
-
-def _format_web_search_result(result: Optional[str]) -> Optional[str]:
-    data = _json_loads_maybe(result)
-    if not isinstance(data, dict):
-        return None
-    web = data.get("data", {}).get("web") if isinstance(data.get("data"), dict) else data.get("web")
-    if not isinstance(web, list):
-        return None
-    lines = [f"Web results: {len(web)}"]
-    for item in web[:10]:
-        if not isinstance(item, dict):
-            continue
-        title = str(item.get("title") or item.get("url") or "result").strip()
-        url = str(item.get("url") or "").strip()
-        desc = str(item.get("description") or "").strip()
-        lines.append(f"• {title}" + (f" — {url}" if url else ""))
-        if desc:
-            lines.append(f"  {desc}")
-    return _truncate_text("\n".join(lines))
-
-
-def _format_web_extract_result(result: Optional[str]) -> Optional[str]:
-    """Return only web_extract errors for ACP; success stays compact via title."""
-    data = _json_loads_maybe(result)
-    if not isinstance(data, dict):
-        return None
-    if data.get("success") is False and data.get("error"):
-        return f"Web extract failed: {data.get('error')}"
-    results = data.get("results")
-    if not isinstance(results, list):
-        return None
-
-    failures: list[str] = []
-    for item in results[:10]:
-        if not isinstance(item, dict):
-            continue
-        error = str(item.get("error") or "").strip()
-        if not error or error in {"None", "null"}:
-            continue
-        url = str(item.get("url") or "").strip()
-        title = str(item.get("title") or url or "Untitled").strip()
-        failures.append(
-            f"- {title}" + (f" — {url}" if url and url != title else "") + f"\n  Error: {_truncate_text(error, limit=500)}"
-        )
-
-    if not failures:
-        return None
-    lines = [f"Web extract failed for {len(failures)} URL{'s' if len(failures) != 1 else ''}"]
-    lines.extend(failures)
-    return "\n".join(lines)
-
-
-def _format_process_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
-    data = _json_loads_maybe(result)
-    if not isinstance(data, dict):
-        return result if isinstance(result, str) and result.strip() else None
-    if data.get("success") is False and data.get("error"):
-        return f"Process error: {data.get('error')}"
-    action = str((args or {}).get("action") or "process").strip() or "process"
-    if isinstance(data.get("processes"), list):
-        processes = data["processes"]
-        lines = [f"Processes: {len(processes)}"]
-        for proc in processes[:20]:
-            if not isinstance(proc, dict):
-                lines.append(f"- {proc}")
-                continue
-            sid = str(proc.get("session_id") or proc.get("id") or "?")
-            status = str(proc.get("status") or ("exited" if proc.get("exited") else "running"))
-            cmd = str(proc.get("command") or "").strip()
-            pid = proc.get("pid")
-            code = proc.get("exit_code")
-            bits = [status]
-            if pid is not None:
-                bits.append(f"pid {pid}")
-            if code is not None:
-                bits.append(f"exit {code}")
-            lines.append(f"- `{sid}` — {', '.join(bits)}" + (f" — {cmd[:120]}" if cmd else ""))
-        if len(processes) > 20:
-            lines.append(f"... {len(processes) - 20} more process(es)")
-        return "\n".join(lines)
-
-    status = str(data.get("status") or data.get("state") or action).strip()
-    sid = str(data.get("session_id") or (args or {}).get("session_id") or "").strip()
-    lines = [f"Process {action}: {status}" + (f" (`{sid}`)" if sid else "")]
-    for key, label in (("command", "Command"), ("pid", "PID"), ("exit_code", "Exit code"), ("returncode", "Exit code"), ("lines", "Lines")):
-        if data.get(key) is not None:
-            lines.append(f"- **{label}:** {data.get(key)}")
-    output = data.get("output") or data.get("new_output") or data.get("log") or data.get("stdout")
-    error = data.get("error") or data.get("stderr")
-    if output:
-        lines.extend(["", "Output:", _truncate_text(str(output), limit=5000)])
-    if error:
-        lines.extend(["", "Error:", _truncate_text(str(error), limit=2000)])
-    msg = data.get("message")
-    if msg and not output and not error:
-        lines.append(str(msg))
-    return _truncate_text("\n".join(lines), limit=7000)
-
-
-def _format_delegate_result(result: Optional[str]) -> Optional[str]:
-    data = _json_loads_maybe(result)
-    if not isinstance(data, dict):
-        return None
-    if data.get("error") and not isinstance(data.get("results"), list):
-        return f"Delegation failed: {data.get('error')}"
-    results = data.get("results")
-    if not isinstance(results, list):
-        return None
-    total = data.get("total_duration_seconds")
-    lines = [f"Delegation results: {len(results)} task{'s' if len(results) != 1 else ''}" + (f" in {total}s" if total is not None else "")]
-    icon = {"completed": "✅", "failed": "✗", "error": "✗", "timeout": "⏱", "interrupted": "⚠"}
-    for item in results:
-        if not isinstance(item, dict):
-            lines.append(f"- {item}")
-            continue
-        idx = item.get("task_index")
-        status = str(item.get("status") or "unknown")
-        model = item.get("model")
-        dur = item.get("duration_seconds")
-        role = item.get("_child_role")
-        header = f"{icon.get(status, '•')} Task {idx + 1 if isinstance(idx, int) else '?'}: {status}"
-        bits = []
-        if model:
-            bits.append(str(model))
-        if role:
-            bits.append(f"role={role}")
-        if dur is not None:
-            bits.append(f"{dur}s")
-        if bits:
-            header += " (" + ", ".join(bits) + ")"
-        lines.extend(["", header])
-        summary = str(item.get("summary") or "").strip()
-        error = str(item.get("error") or "").strip()
-        if summary:
-            lines.append(_truncate_text(summary, limit=1200))
-        if error:
-            lines.append("Error: " + _truncate_text(error, limit=800))
-        trace = item.get("tool_trace")
-        if isinstance(trace, list) and trace:
-            names = [str(t.get("tool") or "?") for t in trace if isinstance(t, dict)]
-            if names:
-                lines.append("Tools: " + ", ".join(names[:12]) + (f" (+{len(names)-12})" if len(names) > 12 else ""))
-    return _truncate_text("\n".join(lines), limit=8000)
-
-
-def _format_session_search_result(result: Optional[str]) -> Optional[str]:
-    data = _json_loads_maybe(result)
-    if not isinstance(data, dict):
-        return None
-    if data.get("success") is False:
-        return f"Session search failed: {data.get('error', 'unknown error')}"
-    results = data.get("results")
-    if not isinstance(results, list):
-        return None
-    mode = data.get("mode") or "search"
-    query = data.get("query")
-    lines = ["Recent sessions" if mode == "recent" else f"Session search results" + (f" for `{query}`" if query else "")]
-    if not results:
-        lines.append(str(data.get("message") or "No matching sessions found."))
-        return "\n".join(lines)
-    for item in results:
-        if not isinstance(item, dict):
-            continue
-        sid = str(item.get("session_id") or "?")
-        title = str(item.get("title") or item.get("when") or "Untitled session").strip()
-        when = str(item.get("last_active") or item.get("started_at") or item.get("when") or "").strip()
-        count = item.get("message_count")
-        source = str(item.get("source") or "").strip()
-        meta = ", ".join(str(x) for x in [when, source, f"{count} msgs" if count is not None else ""] if x)
-        lines.append(f"- **{title}** (`{sid}`)" + (f" — {meta}" if meta else ""))
-        summary = str(item.get("summary") or item.get("preview") or "").strip()
-        if summary:
-            lines.append("  " + _truncate_text(" ".join(summary.split()), limit=500))
-    return _truncate_text("\n".join(lines), limit=7000)
-
-
-def _format_memory_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
-    data = _json_loads_maybe(result)
-    if not isinstance(data, dict):
-        return None
-    action = str((args or {}).get("action") or "memory").strip() or "memory"
-    target = str(data.get("target") or (args or {}).get("target") or "memory")
-    if data.get("success") is False:
-        lines = [f"✗ Memory {action} failed ({target})", str(data.get("error") or "unknown error")]
-        matches = data.get("matches")
-        if isinstance(matches, list) and matches:
-            lines.append("Matches:")
-            lines.extend(f"- {_truncate_text(str(m), 160)}" for m in matches[:5])
-        return "\n".join(lines)
-    lines = [f"✅ Memory {action} saved ({target})"]
-    if data.get("message"):
-        lines.append(str(data.get("message")))
-    if data.get("entry_count") is not None:
-        lines.append(f"Entries: {data.get('entry_count')}")
-    if data.get("usage"):
-        lines.append(f"Usage: {data.get('usage')}")
-    # Avoid dumping all memory entries into ACP UI; show only the explicit new value preview.
-    preview = str((args or {}).get("content") or (args or {}).get("old_text") or "").strip()
-    if preview:
-        lines.append("Preview: " + _truncate_text(preview, limit=300))
-    return "\n".join(lines)
-
-
-def _format_edit_result(tool_name: str, result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
-    data = _json_loads_maybe(result)
-    path = str((args or {}).get("path") or "file").strip()
-    if isinstance(data, dict):
-        if data.get("success") is False or data.get("error"):
-            return f"{tool_name} failed for {path}: {data.get('error', 'unknown error')}"
-        message = str(data.get("message") or "").strip()
-        replacements = data.get("replacements") or data.get("replacement_count")
-        lines = [f"✅ {tool_name} completed" + (f" for `{path}`" if path else "")]
-        if message:
-            lines.append(message)
-        if replacements is not None:
-            lines.append(f"Replacements: {replacements}")
-        if data.get("files_modified"):
-            files = data.get("files_modified")
-            if isinstance(files, list):
-                lines.append("Files: " + ", ".join(f"`{f}`" for f in files[:8]))
-        return "\n".join(lines)
-    if isinstance(result, str) and result.strip():
-        return _truncate_text(result, limit=3000)
-    return f"✅ {tool_name} completed" + (f" for `{path}`" if path else "")
-
-
-def _format_browser_result(tool_name: str, result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
-    data = _json_loads_maybe(result)
-    if not isinstance(data, dict):
-        return result if isinstance(result, str) and result.strip() else None
-    if data.get("success") is False or data.get("error"):
-        return f"{tool_name} failed: {data.get('error', 'unknown error')}"
-    if tool_name == "browser_get_images":
-        images = data.get("images") or data.get("data")
-        if isinstance(images, list):
-            lines = [f"Images found: {len(images)}"]
-            for img in images[:12]:
-                if isinstance(img, dict):
-                    alt = str(img.get("alt") or "").strip()
-                    url = str(img.get("url") or img.get("src") or "").strip()
-                    lines.append(f"- {alt or 'image'}" + (f" — {url}" if url else ""))
-            return _truncate_text("\n".join(lines), limit=5000)
-    title = str(data.get("title") or data.get("url") or data.get("status") or tool_name)
-    text = str(data.get("text") or data.get("content") or data.get("snapshot") or data.get("analysis") or data.get("message") or "").strip()
-    lines = [title]
-    if data.get("url") and data.get("url") != title:
-        lines.append(str(data.get("url")))
-    if text:
-        lines.extend(["", _truncate_text(text, limit=5000)])
-    return _truncate_text("\n".join(lines), limit=7000)
-
-
-def _format_media_or_cron_result(tool_name: str, result: Optional[str]) -> Optional[str]:
-    data = _json_loads_maybe(result)
-    if not isinstance(data, dict):
-        return result if isinstance(result, str) and result.strip() else None
-    if data.get("success") is False or data.get("error"):
-        return f"{tool_name} failed: {data.get('error', 'unknown error')}"
-    lines = [f"✅ {tool_name} completed"]
-    for key in ("file_path", "path", "url", "image_url", "job_id", "id", "status", "message", "next_run"):
-        if data.get(key):
-            lines.append(f"- **{key}:** {data.get(key)}")
-    return "\n".join(lines)
-
-
-def _format_generic_structured_result(tool_name: str, result: Optional[str]) -> Optional[str]:
-    data = _json_loads_maybe(result)
-    if not isinstance(data, (dict, list)):
-        return result if isinstance(result, str) and result.strip() else None
-    if isinstance(data, list):
-        lines = [f"{tool_name}: {len(data)} item{'s' if len(data) != 1 else ''}"]
-        for item in data[:12]:
-            lines.append(f"- {_truncate_text(str(item), limit=240)}")
-        return _truncate_text("\n".join(lines), limit=5000)
-
-    if data.get("success") is False or data.get("error"):
-        return f"{tool_name} failed: {data.get('error', 'unknown error')}"
-
-    lines = [f"✅ {tool_name} completed" if data.get("success") is True else f"{tool_name} result"]
-    priority_keys = (
-        "message", "status", "id", "task_id", "issue_id", "title", "name", "entity_id",
-        "state", "service", "url", "path", "file_path", "count", "total", "next_run",
-    )
-    seen = set()
-    for key in priority_keys:
-        value = data.get(key)
-        if value in (None, "", [], {}):
-            continue
-        seen.add(key)
-        lines.append(f"- **{key}:** {_truncate_text(str(value), limit=500)}")
-
-    for key, value in data.items():
-        if key in seen or key in {"success", "raw", "content", "entries"}:
-            continue
-        if value in (None, "", [], {}):
-            continue
-        if isinstance(value, (dict, list)):
-            preview = json.dumps(value, ensure_ascii=False, default=str)
-        else:
-            preview = str(value)
-        lines.append(f"- **{key}:** {_truncate_text(preview, limit=500)}")
-        if len(lines) >= 14:
-            break
-
-    content = data.get("content")
-    if isinstance(content, str) and content.strip():
-        lines.extend(["", _truncate_text(content.strip(), limit=1500)])
-    return _truncate_text("\n".join(lines), limit=7000)
-
-
-def _build_polished_completion_content(
-    tool_name: str,
-    result: Optional[str],
-    function_args: Optional[Dict[str, Any]],
-) -> Optional[List[Any]]:
-    formatter = {
-        "todo": lambda: _format_todo_result(result),
-        "read_file": lambda: _format_read_file_result(result, function_args),
-        "write_file": lambda: _format_edit_result(tool_name, result, function_args),
-        "patch": lambda: _format_edit_result(tool_name, result, function_args),
-        "search_files": lambda: _format_search_files_result(result),
-        "execute_code": lambda: _format_execute_code_result(result),
-        "process": lambda: _format_process_result(result, function_args),
-        "delegate_task": lambda: _format_delegate_result(result),
-        "session_search": lambda: _format_session_search_result(result),
-        "memory": lambda: _format_memory_result(result, function_args),
-        "skill_view": lambda: _format_skill_view_result(result),
-        "skill_manage": lambda: _format_skill_manage_result(result, function_args),
-        "web_search": lambda: _format_web_search_result(result),
-        "web_extract": lambda: _format_web_extract_result(result),
-        "browser_navigate": lambda: _format_browser_result(tool_name, result, function_args),
-        "browser_snapshot": lambda: _format_browser_result(tool_name, result, function_args),
-        "browser_vision": lambda: _format_browser_result(tool_name, result, function_args),
-        "browser_get_images": lambda: _format_browser_result(tool_name, result, function_args),
-        "vision_analyze": lambda: _format_media_or_cron_result(tool_name, result),
-        "image_generate": lambda: _format_media_or_cron_result(tool_name, result),
-        "cronjob": lambda: _format_media_or_cron_result(tool_name, result),
-    }.get(tool_name)
-    if formatter is None and tool_name in _POLISHED_TOOLS:
-        formatter = lambda: _format_generic_structured_result(tool_name, result)
-    if formatter is None:
-        return None
-    text = formatter()
-    if not text:
-        return None
-    return [_text(text)]
-
-
 def _build_patch_mode_content(patch_text: str) -> List[Any]:
    """Parse V4A patch mode input into ACP diff blocks when possible."""
    if not patch_text:
@@ -912,11 +258,7 @@ def _build_tool_complete_content(
        except Exception:
            pass

-    polished_content = _build_polished_completion_content(tool_name, result, function_args)
-    if polished_content:
-        return polished_content
-
-    return [_text(display_result)]
+    return [acp.tool_content(acp.text_block(display_result))]


 # ---------------------------------------------------------------------------
@@ -946,6 +288,7 @@ def build_tool_start(
            content = _build_patch_mode_content(patch_text)
        return acp.start_tool_call(
            tool_call_id, title, kind=kind, content=content, locations=locations,
+            raw_input=arguments,
        )

    if tool_name == "write_file":
@@ -954,172 +297,32 @@ def build_tool_start(
        content = [acp.tool_diff_content(path=path, new_text=file_content)]
        return acp.start_tool_call(
            tool_call_id, title, kind=kind, content=content, locations=locations,
+            raw_input=arguments,
        )

    if tool_name == "terminal":
        command = arguments.get("command", "")
-        content = [_text(f"$ {command}")]
+        content = [acp.tool_content(acp.text_block(f"$ {command}"))]
        return acp.start_tool_call(
            tool_call_id, title, kind=kind, content=content, locations=locations,
+            raw_input=arguments,
        )

    if tool_name == "read_file":
-        # The title and location already identify the file. Sending a synthetic
-        # "Reading ..." content block makes Zed render an unhelpful Output
-        # section before the real file contents arrive on completion.
+        path = arguments.get("path", "")
+        content = [acp.tool_content(acp.text_block(f"Reading {path}"))]
        return acp.start_tool_call(
-            tool_call_id, title, kind=kind, content=None, locations=locations,
+            tool_call_id, title, kind=kind, content=content, locations=locations,
+            raw_input=arguments,
        )

    if tool_name == "search_files":
        pattern = arguments.get("pattern", "")
        target = arguments.get("target", "content")
-        search_path = arguments.get("path")
-        where = f" in {search_path}" if search_path else ""
-        content = [_text(f"Searching for '{pattern}' ({target}){where}")]
-        return acp.start_tool_call(
-            tool_call_id, title, kind=kind, content=content, locations=locations,
-        )
-
-    if tool_name == "todo":
-        items = arguments.get("todos")
-        if isinstance(items, list):
-            preview_lines = ["Updating todo list", ""]
-            for item in items[:8]:
-                if isinstance(item, dict):
-                    preview_lines.append(f"- {item.get('status', 'pending')}: {item.get('content', item.get('id', ''))}")
-            if len(items) > 8:
-                preview_lines.append(f"... {len(items) - 8} more")
-            content = [_text("\n".join(preview_lines))]
-        else:
-            content = [_text("Reading todo list")]
-        return acp.start_tool_call(
-            tool_call_id, title, kind=kind, content=content, locations=locations,
-        )
-
-    if tool_name == "skill_view":
-        name = str(arguments.get("name") or "?").strip() or "?"
-        file_path = str(arguments.get("file_path") or "SKILL.md").strip() or "SKILL.md"
-        content = [_text(f"Loading skill '{name}' ({file_path})")]
-        return acp.start_tool_call(
-            tool_call_id, title, kind=kind, content=content, locations=locations,
-        )
-
-    if tool_name == "skill_manage":
-        action = str(arguments.get("action") or "manage").strip() or "manage"
-        name = str(arguments.get("name") or "?").strip() or "?"
-        file_path = str(arguments.get("file_path") or "SKILL.md").strip() or "SKILL.md"
-        path = f"skills/{name}/{file_path}" if file_path else f"skills/{name}"
-
-        if action == "patch":
-            old = str(arguments.get("old_string") or "")
-            new = str(arguments.get("new_string") or "")
-            content = [acp.tool_diff_content(path=path, old_text=old or None, new_text=new)]
-        elif action in {"edit", "create"}:
-            content = [
-                acp.tool_diff_content(
-                    path=path,
-                    new_text=str(arguments.get("content") or ""),
-                )
-            ]
-        elif action == "write_file":
-            target = str(arguments.get("file_path") or "file")
-            content = [
-                acp.tool_diff_content(
-                    path=f"skills/{name}/{target}",
-                    new_text=str(arguments.get("file_content") or ""),
-                )
-            ]
-        elif action in {"delete", "remove_file"}:
-            target = str(arguments.get("file_path") or file_path or name)
-            content = [_text(f"Removing {target} from skill '{name}'")]
-        else:
-            content = [_text(f"Running skill_manage action '{action}' on skill '{name}' ({file_path})")]
-
-        return acp.start_tool_call(
-            tool_call_id, title, kind=kind, content=content, locations=locations,
-        )
-
-    if tool_name == "execute_code":
-        code = str(arguments.get("code") or "").strip()
-        preview = code[:1200] + (f"\n... ({len(code)} chars total, truncated)" if len(code) > 1200 else "")
-        content = [_text(f"Running Python helper script:\n\n```python\n{preview}\n```" if preview else "Running Python helper script")]
-        return acp.start_tool_call(
-            tool_call_id, title, kind=kind, content=content, locations=locations,
-        )
-
-    if tool_name == "web_search":
-        query = str(arguments.get("query") or "").strip()
-        content = [_text(f"Searching the web for: {query}" if query else "Searching the web")]
-        return acp.start_tool_call(
-            tool_call_id, title, kind=kind, content=content, locations=locations,
-        )
-
-    if tool_name == "web_extract":
-        # The title identifies the URL(s). Avoid a duplicate content block so
-        # Zed renders this like read_file: compact start, concise completion.
-        return acp.start_tool_call(
-            tool_call_id, title, kind=kind, content=None, locations=locations,
-        )
-
-    if tool_name == "process":
-        action = str(arguments.get("action") or "").strip() or "manage"
-        sid = str(arguments.get("session_id") or "").strip()
-        data_preview = str(arguments.get("data") or "").strip()
-        text = f"Process action: {action}" + (f"\nSession: {sid}" if sid else "")
-        if data_preview:
-            text += "\nInput: " + _truncate_text(data_preview, limit=500)
-        content = [_text(text)]
-        return acp.start_tool_call(
-            tool_call_id, title, kind=kind, content=content, locations=locations,
-        )
-
-    if tool_name == "delegate_task":
-        tasks = arguments.get("tasks")
-        if isinstance(tasks, list) and tasks:
-            lines = [f"Delegating {len(tasks)} tasks", ""]
-            for i, task in enumerate(tasks[:8], 1):
-                if isinstance(task, dict):
-                    goal = str(task.get("goal") or "").strip()
-                    role = str(task.get("role") or "").strip()
-                    lines.append(f"{i}. " + _truncate_text(goal, limit=160) + (f" ({role})" if role else ""))
-            if len(tasks) > 8:
-                lines.append(f"... {len(tasks) - 8} more")
-            content = [_text("\n".join(lines))]
-        else:
-            goal = str(arguments.get("goal") or "").strip()
-            content = [_text("Delegating task" + (f":\n{_truncate_text(goal, limit=800)}" if goal else ""))]
-        return acp.start_tool_call(
-            tool_call_id, title, kind=kind, content=content, locations=locations,
-        )
-
-    if tool_name == "session_search":
-        query = str(arguments.get("query") or "").strip()
-        content = [_text(f"Searching past sessions for: {query}" if query else "Loading recent sessions")]
-        return acp.start_tool_call(
-            tool_call_id, title, kind=kind, content=content, locations=locations,
-        )
-
-    if tool_name == "memory":
-        action = str(arguments.get("action") or "manage").strip() or "manage"
-        target = str(arguments.get("target") or "memory").strip() or "memory"
-        preview = str(arguments.get("content") or arguments.get("old_text") or "").strip()
-        text = f"Memory {action} ({target})"
-        if preview:
-            text += "\nPreview: " + _truncate_text(preview, limit=500)
-        content = [_text(text)]
-        return acp.start_tool_call(
-            tool_call_id, title, kind=kind, content=content, locations=locations,
-        )
-
-    if tool_name in _POLISHED_TOOLS:
-        try:
-            args_text = json.dumps(arguments, indent=2, default=str)
-        except (TypeError, ValueError):
-            args_text = str(arguments)
-        content = [_text(_truncate_text(args_text, limit=1200))]
+        content = [acp.tool_content(acp.text_block(f"Searching for '{pattern}' ({target})"))]
        return acp.start_tool_call(
            tool_call_id, title, kind=kind, content=content, locations=locations,
+            raw_input=arguments,
        )

    # Generic fallback
@@ -1131,7 +334,7 @@ def build_tool_start(
    content = [acp.tool_content(acp.text_block(args_text))]
    return acp.start_tool_call(
        tool_call_id, title, kind=kind, content=content, locations=locations,
-        raw_input=None if tool_name in _POLISHED_TOOLS else arguments,
+        raw_input=arguments,
    )


@@ -1144,22 +347,18 @@ def build_tool_complete(
 ) -> ToolCallProgress:
    """Create a ToolCallUpdate (progress) event for a completed tool call."""
    kind = get_tool_kind(tool_name)
-    if tool_name == "web_extract":
-        error_text = _format_web_extract_result(result)
-        content = [_text(error_text)] if error_text else None
-    else:
-        content = _build_tool_complete_content(
-            tool_name,
-            result,
-            function_args=function_args,
-            snapshot=snapshot,
-        )
+    content = _build_tool_complete_content(
+        tool_name,
+        result,
+        function_args=function_args,
+        snapshot=snapshot,
+    )
    return acp.update_tool_call(
        tool_call_id,
        kind=kind,
        status="completed",
        content=content,
-        raw_output=None if tool_name in _POLISHED_TOOLS else result,
+        raw_output=result,
    )


@@ -76,7 +76,6 @@ _ADAPTIVE_THINKING_SUBSTRINGS = ("4-6", "4.6", "4-7", "4.7")
 # Models where temperature/top_p/top_k return 400 if set to non-default values.
 # This is the Opus 4.7 contract; future 4.x+ models are expected to follow it.
 _NO_SAMPLING_PARAMS_SUBSTRINGS = ("4-7", "4.7")
-_FAST_MODE_SUPPORTED_SUBSTRINGS = ("opus-4-6", "opus-4.6")

 # ── Max output token limits per Anthropic model ───────────────────────
 # Source: Anthropic docs + Cline model catalog.  Anthropic's API requires
@@ -106,9 +105,6 @@ _ANTHROPIC_OUTPUT_LIMITS = {
    "claude-3-haiku":      4_096,
    # Third-party Anthropic-compatible providers
    "minimax":            131_072,
-    # Qwen models via DashScope Anthropic-compatible endpoint
-    # DashScope enforces max_tokens ∈ [1, 65536]
-    "qwen3":               65_536,
 }

 # For any model not in the table, assume the highest current limit.
@@ -220,17 +216,6 @@ def _forbids_sampling_params(model: str) -> bool:
    return any(v in model for v in _NO_SAMPLING_PARAMS_SUBSTRINGS)


-def _supports_fast_mode(model: str) -> bool:
-    """Return True for models that support Anthropic Fast Mode (speed=fast).
-
-    Per Anthropic docs, fast mode is currently supported on Opus 4.6 only.
-    Sending ``speed: "fast"`` to any other Claude model (including Opus 4.7)
-    returns HTTP 400. This guard prevents silently 400'ing when stale config
-    or older callers leave fast mode enabled across a model upgrade.
-    """
-    return any(v in model for v in _FAST_MODE_SUPPORTED_SUBSTRINGS)
-
-
 # Beta headers for enhanced features (sent with ALL auth types).
 # As of Opus 4.7 (2026-04-16), the first two are GA on Claude 4.6+ — the
 # beta headers are still accepted (harmless no-op) but not required. Kept
@@ -1237,14 +1222,6 @@ def _normalize_tool_input_schema(schema: Any) -> Dict[str, Any]:
    ``keep_nullable_hint=False`` because the Anthropic validator does not
    recognize the OpenAPI-style ``nullable: true`` extension and strict
    schema-to-grammar converters may reject unknown keywords.
-
-    Top-level ``oneOf``/``allOf``/``anyOf`` are also stripped here: the
-    Anthropic API rejects union keywords at the schema root with a generic
-    HTTP 400. Several upstream and plugin tools ship schemas with one of
-    these keywords at the top level (commonly for Pydantic discriminated
-    unions). If we land here with those keywords still present after
-    nullable-union stripping, drop them and fall back to a plain object
-    schema so the tool still validates at the Anthropic boundary.
    """
    if not schema:
        return {"type": "object", "properties": {}}
@@ -1254,12 +1231,6 @@ def _normalize_tool_input_schema(schema: Any) -> Dict[str, Any]:
    normalized = strip_nullable_unions(schema, keep_nullable_hint=False)
    if not isinstance(normalized, dict):
        return {"type": "object", "properties": {}}
-    # Strip top-level union keywords that Anthropic's validator rejects.
-    banned = {"oneOf", "allOf", "anyOf"}
-    if banned & normalized.keys():
-        normalized = {k: v for k, v in normalized.items() if k not in banned}
-        if "type" not in normalized:
-            normalized["type"] = "object"
    if normalized.get("type") == "object" and not isinstance(normalized.get("properties"), dict):
        normalized = {**normalized, "properties": {}}
    return normalized
@@ -1270,24 +1241,10 @@ def convert_tools_to_anthropic(tools: List[Dict]) -> List[Dict]:
    if not tools:
        return []
    result = []
-    seen_names: set = set()
    for t in tools:
        fn = t.get("function", {})
-        name = fn.get("name", "")
-        # Defensive dedup: Anthropic rejects requests with duplicate tool
-        # names.  Upstream injection paths already dedup, but this guard
-        # converts a hard API failure into a warning.  See: #18478
-        if name and name in seen_names:
-            logger.warning(
-                "convert_tools_to_anthropic: duplicate tool name '%s' "
-                "— dropping second occurrence",
-                name,
-            )
-            continue
-        if name:
-            seen_names.add(name)
        result.append({
-            "name": name,
+            "name": fn.get("name", ""),
            "description": fn.get("description", ""),
            "input_schema": _normalize_tool_input_schema(
                fn.get("parameters", {"type": "object", "properties": {}})
@@ -1944,15 +1901,9 @@ def build_anthropic_kwargs(

    # ── Fast mode (Opus 4.6 only) ────────────────────────────────────
    # Adds extra_body.speed="fast" + the fast-mode beta header for ~2.5x
-    # output speed. Per Anthropic docs, fast mode is only supported on
-    # Opus 4.6 — Opus 4.7 and other models 400 on the speed parameter.
-    # Only for native Anthropic endpoints — third-party providers would
-    # reject the unknown beta header and speed parameter.
-    if (
-        fast_mode
-        and not _is_third_party_anthropic_endpoint(base_url)
-        and _supports_fast_mode(model)
-    ):
+    # output speed. Only for native Anthropic endpoints — third-party
+    # providers would reject the unknown beta header and speed parameter.
+    if fast_mode and not _is_third_party_anthropic_endpoint(base_url):
        kwargs.setdefault("extra_body", {})["speed"] = "fast"
        # Build extra_headers with ALL applicable betas (the per-request
        # extra_headers override the client-level anthropic-beta header).
@@ -216,26 +216,7 @@ def _fixed_temperature_for_model(
    return None

 # Default auxiliary models for direct API-key providers (cheap/fast for side tasks)
-def _get_aux_model_for_provider(provider_id: str) -> str:
-    """Return the cheap auxiliary model for a provider.
-
-    Reads from ProviderProfile.default_aux_model first, falling back to the
-    legacy hardcoded dict for providers that predate the profiles system.
-    """
-    try:
-        from providers import get_provider_profile
-        _p = get_provider_profile(provider_id)
-        if _p and _p.default_aux_model:
-            return _p.default_aux_model
-    except Exception:
-        pass
-    return _API_KEY_PROVIDER_AUX_MODELS_FALLBACK.get(provider_id, "")
-
-
-# Fallback for providers not yet migrated to ProviderProfile.default_aux_model,
-# plus providers we intentionally keep pinned here (e.g. Anthropic predates
-# profiles). New providers should set default_aux_model on their profile instead.
-_API_KEY_PROVIDER_AUX_MODELS_FALLBACK: Dict[str, str] = {
+_API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
    "gemini": "gemini-3-flash-preview",
    "zai": "glm-4.5-flash",
    "kimi-coding": "kimi-k2-turbo-preview",
@@ -254,10 +235,6 @@ _API_KEY_PROVIDER_AUX_MODELS_FALLBACK: Dict[str, str] = {
    "tencent-tokenhub": "hy3-preview",
 }

-# Legacy alias — callers that haven't been updated to _get_aux_model_for_provider()
-# can still use this dict directly. Kept in sync with _FALLBACK above.
-_API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = _API_KEY_PROVIDER_AUX_MODELS_FALLBACK
-
 # Vision-specific model overrides for direct providers.
 # When the user's main provider has a dedicated vision/multimodal model that
 # differs from their main chat model, map it here.  The vision auto-detect
@@ -282,70 +259,13 @@ _PROVIDERS_WITHOUT_VISION: frozenset = frozenset({
    "kimi-coding-cn",
 })

-# OpenRouter app attribution headers (base — always sent).
-# `X-Title` is the canonical attribution header OpenRouter's dashboard
-# reads; the previous `X-OpenRouter-Title` label was not recognized there.
-_OR_HEADERS_BASE = {
+# OpenRouter app attribution headers
+_OR_HEADERS = {
    "HTTP-Referer": "https://hermes-agent.nousresearch.com",
-    "X-Title": "Hermes Agent",
+    "X-OpenRouter-Title": "Hermes Agent",
    "X-OpenRouter-Categories": "productivity,cli-agent",
 }

-# Truthy values for boolean env-var parsing.
-_TRUTHY_ENV_VALUES = frozenset({"1", "true", "yes", "on"})
-
-
-def build_or_headers(or_config: dict | None = None) -> dict:
-    """Build OpenRouter headers, optionally including response-cache headers.
-
-    Precedence for response cache: env var > config.yaml > default (enabled).
-
-    Environment variables:
-        ``HERMES_OPENROUTER_CACHE`` — truthy (``1``/``true``/``yes``/``on``)
-            enables caching; ``0``/``false``/``no``/``off`` disables.
-            Overrides ``openrouter.response_cache`` in config.yaml.
-        ``HERMES_OPENROUTER_CACHE_TTL`` — integer seconds (1-86400).
-            Overrides ``openrouter.response_cache_ttl`` in config.yaml.
-
-    *or_config* is the ``openrouter`` section from config.yaml.  When *None*,
-    falls back to reading config from disk via ``load_config()``.
-    """
-    headers = dict(_OR_HEADERS_BASE)
-
-    # Resolve config from disk if not provided.
-    if or_config is None:
-        try:
-            from hermes_cli.config import load_config
-            or_config = load_config().get("openrouter", {})
-        except Exception:
-            or_config = {}
-
-    # Determine cache enabled: env var overrides config.
-    env_cache = os.environ.get("HERMES_OPENROUTER_CACHE", "").strip().lower()
-    if env_cache:
-        cache_enabled = env_cache in _TRUTHY_ENV_VALUES
-    else:
-        cache_enabled = or_config.get("response_cache", False)
-
-    if not cache_enabled:
-        return headers
-
-    headers["X-OpenRouter-Cache"] = "true"
-
-    # Determine TTL: env var overrides config.
-    env_ttl = os.environ.get("HERMES_OPENROUTER_CACHE_TTL", "").strip()
-    if env_ttl:
-        if env_ttl.isdigit():
-            ttl = int(env_ttl)
-            if 1 <= ttl <= 86400:
-                headers["X-OpenRouter-Cache-TTL"] = str(ttl)
-    else:
-        ttl = or_config.get("response_cache_ttl", 300)
-        if isinstance(ttl, (int, float)) and 1 <= ttl <= 86400:
-            headers["X-OpenRouter-Cache-TTL"] = str(int(ttl))
-
-    return headers
-
 # Vercel AI Gateway app attribution headers. HTTP-Referer maps to
 # referrerUrl and X-Title maps to appName in the gateway's analytics.
 from hermes_cli import __version__ as _HERMES_VERSION
@@ -592,12 +512,7 @@ class _CodexCompletionsAdapter:
                    # API allows it.
                    pass
                else:
-                    # Truthy-only check mirrors agent/transports/codex.py
-                    # build_kwargs(): falsy values (None, "", 0) fall back
-                    # to the default rather than being forwarded to the
-                    # Codex backend, which rejects e.g. {"effort": null}
-                    # with a 400.
-                    effort = reasoning_cfg.get("effort") or "medium"
+                    effort = reasoning_cfg.get("effort", "medium")
                    # Codex backend rejects "minimal"; clamp to "low" to
                    # match the main-agent Codex transport behavior.
                    if effort == "minimal":
@@ -1180,7 +1095,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:

            raw_base_url = _pool_runtime_base_url(entry, pconfig.inference_base_url) or pconfig.inference_base_url
            base_url = _to_openai_base_url(raw_base_url)
-            model = _get_aux_model_for_provider(provider_id) or None
+            model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id)
            if model is None:
                continue  # skip provider if we don't know a valid aux model
            logger.debug("Auxiliary text client: %s (%s) via pool", pconfig.name, model)
@@ -1196,14 +1111,6 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
                from hermes_cli.models import copilot_default_headers

                extra["default_headers"] = copilot_default_headers()
-            else:
-                try:
-                    from providers import get_provider_profile as _gpf_aux
-                    _ph_aux = _gpf_aux(provider_id)
-                    if _ph_aux and _ph_aux.default_headers:
-                        extra["default_headers"] = dict(_ph_aux.default_headers)
-                except Exception:
-                    pass
            _client = OpenAI(api_key=api_key, base_url=base_url, **extra)
            _client = _maybe_wrap_anthropic(_client, model, api_key, raw_base_url)
            return _client, model
@@ -1215,7 +1122,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:

        raw_base_url = str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
        base_url = _to_openai_base_url(raw_base_url)
-        model = _get_aux_model_for_provider(provider_id) or None
+        model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id)
        if model is None:
            continue  # skip provider if we don't know a valid aux model
        logger.debug("Auxiliary text client: %s (%s)", pconfig.name, model)
@@ -1231,14 +1138,6 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
            from hermes_cli.models import copilot_default_headers

            extra["default_headers"] = copilot_default_headers()
-        else:
-            try:
-                from providers import get_provider_profile as _gpf_aux2
-                _ph_aux2 = _gpf_aux2(provider_id)
-                if _ph_aux2 and _ph_aux2.default_headers:
-                    extra["default_headers"] = dict(_ph_aux2.default_headers)
-            except Exception:
-                pass
        _client = OpenAI(api_key=api_key, base_url=base_url, **extra)
        _client = _maybe_wrap_anthropic(_client, model, api_key, raw_base_url)
        return _client, model
@@ -1250,23 +1149,23 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:



-def _try_openrouter(explicit_api_key: str = None) -> Tuple[Optional[OpenAI], Optional[str]]:
+def _try_openrouter() -> Tuple[Optional[OpenAI], Optional[str]]:
    pool_present, entry = _select_pool_entry("openrouter")
    if pool_present:
-        or_key = explicit_api_key or _pool_runtime_api_key(entry)
+        or_key = _pool_runtime_api_key(entry)
        if not or_key:
            return None, None
        base_url = _pool_runtime_base_url(entry, OPENROUTER_BASE_URL) or OPENROUTER_BASE_URL
        logger.debug("Auxiliary client: OpenRouter via pool")
        return OpenAI(api_key=or_key, base_url=base_url,
-                       default_headers=build_or_headers()), _OPENROUTER_MODEL
+                       default_headers=_OR_HEADERS), _OPENROUTER_MODEL

-    or_key = explicit_api_key or os.getenv("OPENROUTER_API_KEY")
+    or_key = os.getenv("OPENROUTER_API_KEY")
    if not or_key:
        return None, None
    logger.debug("Auxiliary client: OpenRouter")
    return OpenAI(api_key=or_key, base_url=OPENROUTER_BASE_URL,
-                   default_headers=build_or_headers()), _OPENROUTER_MODEL
+                   default_headers=_OR_HEADERS), _OPENROUTER_MODEL


 def _describe_openrouter_unavailable() -> str:
@@ -1575,7 +1474,7 @@ def _build_codex_client(model: str) -> Tuple[Optional[Any], Optional[str]]:
    return CodexAuxiliaryClient(real_client, model), model


-def _try_anthropic(explicit_api_key: str = None) -> Tuple[Optional[Any], Optional[str]]:
+def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
    try:
        from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token
    except ImportError:
@@ -1585,10 +1484,10 @@ def _try_anthropic(explicit_api_key: str = None) -> Tuple[Optional[Any], Optiona
    if pool_present:
        if entry is None:
            return None, None
-        token = explicit_api_key or _pool_runtime_api_key(entry)
+        token = _pool_runtime_api_key(entry)
    else:
        entry = None
-        token = explicit_api_key or resolve_anthropic_token()
+        token = resolve_anthropic_token()
    if not token:
        return None, None

@@ -1611,7 +1510,7 @@ def _try_anthropic(explicit_api_key: str = None) -> Tuple[Optional[Any], Optiona

    from agent.anthropic_adapter import _is_oauth_token
    is_oauth = _is_oauth_token(token)
-    model = _get_aux_model_for_provider("anthropic") or "claude-haiku-4-5-20251001"
+    model = _API_KEY_PROVIDER_AUX_MODELS.get("anthropic", "claude-haiku-4-5-20251001")
    logger.debug("Auxiliary client: Anthropic native (%s) at %s (oauth=%s)", model, base_url, is_oauth)
    try:
        real_client = build_anthropic_client(token, base_url)
@@ -1689,39 +1588,6 @@ def _is_payment_error(exc: Exception) -> bool:
    return False


-def _is_rate_limit_error(exc: Exception) -> bool:
-    """Detect rate-limit errors that warrant provider fallback.
-
-    Returns True for HTTP 429 errors whose message indicates rate limiting
-    (as opposed to billing/quota exhaustion, which _is_payment_error handles).
-    Also catches OpenAI SDK RateLimitError instances that may not set
-    .status_code on the exception object.
-    """
-    status = getattr(exc, "status_code", None)
-    err_lower = str(exc).lower()
-
-    # OpenAI SDK's RateLimitError sometimes omits .status_code —
-    # detect by class name so we don't miss these.  (PR #8023 pattern)
-    if type(exc).__name__ == "RateLimitError":
-        return True
-
-    if status == 429:
-        # Distinguish rate-limit from billing: billing keywords are handled
-        # by _is_payment_error, everything else on 429 is a rate limit.
-        if any(kw in err_lower for kw in (
-            "rate limit", "rate_limit", "too many requests",
-            "try again", "retry after", "resets in",
-        )):
-            return True
-        # Generic 429 without billing keywords = likely a rate limit
-        if not any(kw in err_lower for kw in (
-            "credits", "insufficient funds", "billing",
-            "payment required", "can only afford",
-        )):
-            return True
-    return False
-
-
 def _is_connection_error(exc: Exception) -> bool:
    """Detect connection/network errors that warrant provider fallback.

@@ -2045,7 +1911,7 @@ def _to_async_client(sync_client, model: str, is_vision: bool = False):
    }
    sync_base_url = str(sync_client.base_url)
    if base_url_host_matches(sync_base_url, "openrouter.ai"):
-        async_kwargs["default_headers"] = build_or_headers()
+        async_kwargs["default_headers"] = dict(_OR_HEADERS)
    elif base_url_host_matches(sync_base_url, "api.githubcopilot.com"):
        from hermes_cli.copilot_auth import copilot_request_headers

@@ -2187,9 +2053,9 @@ def resolve_provider_client(
        return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
                else (client, final_model))

-    # ── OpenRouter ───────────────────────────────────────────
+    # ── OpenRouter ───────────────────────────────────────────────────
    if provider == "openrouter":
-        client, default = _try_openrouter(explicit_api_key=explicit_api_key)
+        client, default = _try_openrouter()
        if client is None:
            logger.warning(
                "resolve_provider_client: openrouter requested but %s",
@@ -2415,7 +2281,7 @@ def resolve_provider_client(

    if pconfig.auth_type == "api_key":
        if provider == "anthropic":
-            client, default_model = _try_anthropic(explicit_api_key=explicit_api_key)
+            client, default_model = _try_anthropic()
            if client is None:
                logger.warning("resolve_provider_client: anthropic requested but no Anthropic credentials found")
                return None, None
@@ -2447,7 +2313,7 @@ def resolve_provider_client(
        if explicit_base_url:
            base_url = _to_openai_base_url(explicit_base_url.strip().rstrip("/"))

-        default_model = _get_aux_model_for_provider(provider)
+        default_model = _API_KEY_PROVIDER_AUX_MODELS.get(provider, "")
        final_model = _normalize_resolved_model(model or default_model, provider)

        if provider == "gemini":
@@ -2727,11 +2593,8 @@ def resolve_vision_provider_client(
        return resolved_provider, sync_client, final_model

    if resolved_base_url:
-        provider_for_base_override = (
-            requested if requested and requested not in ("", "auto") else "custom"
-        )
        client, final_model = resolve_provider_client(
-            provider_for_base_override,
+            "custom",
            model=resolved_model,
            async_mode=async_mode,
            explicit_base_url=resolved_base_url,
@@ -2739,8 +2602,8 @@ def resolve_vision_provider_client(
            api_mode=resolved_api_mode,
        )
        if client is None:
-            return provider_for_base_override, None, None
-        return provider_for_base_override, client, final_model
+            return "custom", None, None
+        return "custom", client, final_model

    if requested == "auto":
        # Vision auto-detection order:
@@ -3206,14 +3069,8 @@ def _resolve_task_provider_model(

    if task:
        # Config.yaml is the primary source for per-task overrides.
-        if cfg_base_url and cfg_api_key:
-            # Both base_url and api_key explicitly set → custom endpoint.
+        if cfg_base_url:
            return "custom", resolved_model, cfg_base_url, cfg_api_key, resolved_api_mode
-        if cfg_base_url and cfg_provider and cfg_provider != "auto":
-            # base_url set without api_key but with a known provider — use
-            # the provider so it can resolve credentials from env vars
-            # (e.g. OPENROUTER_API_KEY) instead of locking into "custom".
-            return cfg_provider, resolved_model, cfg_base_url, None, resolved_api_mode
        if cfg_provider and cfg_provider != "auto":
            return cfg_provider, resolved_model, None, None, resolved_api_mode

@@ -3380,26 +3237,7 @@ def _build_call_kwargs(
            kwargs["max_tokens"] = max_tokens

    if tools:
-        # Defensive dedup: providers like Google Vertex, Azure, and Bedrock
-        # reject requests with duplicate tool names (HTTP 400).  The upstream
-        # injection paths (run_agent.py) already dedup, but this guard
-        # converts a hard API failure into a warning if an upstream regression
-        # reintroduces duplicates.  See: #18478
-        _seen: set = set()
-        _deduped: list = []
-        for _t in tools:
-            _tname = (_t.get("function") or {}).get("name", "")
-            if _tname and _tname in _seen:
-                logger.warning(
-                    "_build_call_kwargs: duplicate tool name '%s' removed "
-                    "(provider=%s model=%s)",
-                    _tname, provider, model,
-                )
-                continue
-            if _tname:
-                _seen.add(_tname)
-            _deduped.append(_t)
-        kwargs["tools"] = _deduped
+        kwargs["tools"] = tools

    # Provider-specific extra_body
    merged_extra = dict(extra_body or {})
@@ -3614,7 +3452,7 @@ def call_llm(
            except Exception as retry_err:
                # If the max_tokens retry also hits a payment or connection
                # error, fall through to the fallback chain below.
-                if not (_is_payment_error(retry_err) or _is_connection_error(retry_err) or _is_rate_limit_error(retry_err)):
+                if not (_is_payment_error(retry_err) or _is_connection_error(retry_err)):
                    raise
                first_err = retry_err

@@ -3697,27 +3535,13 @@ def call_llm(
        # Codex/OAuth tokens that authenticate but whose endpoint is down,
        # and providers the user never configured that got picked up by
        # the auto-detection chain.
-        #
-        # ── Rate-limit fallback (#13579) ─────────────────────────────
-        # When the provider returns a 429 rate-limit (not billing), fall
-        # back to an alternative provider instead of exhausting retries
-        # against the same rate-limited endpoint.
-        should_fallback = (
-            _is_payment_error(first_err)
-            or _is_connection_error(first_err)
-            or _is_rate_limit_error(first_err)
-        )
+        should_fallback = _is_payment_error(first_err) or _is_connection_error(first_err)
        # Only try alternative providers when the user didn't explicitly
        # configure this task's provider.  Explicit provider = hard constraint;
        # auto (the default) = best-effort fallback chain.  (#7559)
        is_auto = resolved_provider in ("auto", "", None)
        if should_fallback and is_auto:
-            if _is_payment_error(first_err):
-                reason = "payment error"
-            elif _is_rate_limit_error(first_err):
-                reason = "rate limit"
-            else:
-                reason = "connection error"
+            reason = "payment error" if _is_payment_error(first_err) else "connection error"
            logger.info("Auxiliary %s: %s on %s (%s), trying fallback",
                        task or "call", reason, resolved_provider, first_err)
            fb_client, fb_model, fb_label = _try_payment_fallback(
@@ -3920,7 +3744,7 @@ async def async_call_llm(
            except Exception as retry_err:
                # If the max_tokens retry also hits a payment or connection
                # error, fall through to the fallback chain below.
-                if not (_is_payment_error(retry_err) or _is_connection_error(retry_err) or _is_rate_limit_error(retry_err)):
+                if not (_is_payment_error(retry_err) or _is_connection_error(retry_err)):
                    raise
                first_err = retry_err

@@ -3989,20 +3813,11 @@ async def async_call_llm(
                    return _validate_llm_response(
                        await retry_client.chat.completions.create(**retry_kwargs), task)

-        # ── Payment / connection / rate-limit fallback (mirrors sync call_llm) ──
-        should_fallback = (
-            _is_payment_error(first_err)
-            or _is_connection_error(first_err)
-            or _is_rate_limit_error(first_err)
-        )
+        # ── Payment / connection fallback (mirrors sync call_llm) ─────
+        should_fallback = _is_payment_error(first_err) or _is_connection_error(first_err)
        is_auto = resolved_provider in ("auto", "", None)
        if should_fallback and is_auto:
-            if _is_payment_error(first_err):
-                reason = "payment error"
-            elif _is_rate_limit_error(first_err):
-                reason = "rate limit"
-            else:
-                reason = "connection error"
+            reason = "payment error" if _is_payment_error(first_err) else "connection error"
            logger.info("Auxiliary %s (async): %s on %s (%s), trying fallback",
                        task or "call", reason, resolved_provider, first_err)
            fb_client, fb_model, fb_label = _try_payment_fallback(
@@ -344,7 +344,6 @@ class ContextCompressor(ContextEngine):
        self._last_aux_model_failure_model = None
        self._last_compression_savings_pct = 100.0
        self._ineffective_compression_count = 0
-        self._summary_failure_cooldown_until = 0.0  # transient errors must not block a fresh session

    def update_model(
        self,
@@ -554,16 +553,7 @@ class ContextCompressor(ContextEngine):
                    break
                accumulated += msg_tokens
                boundary = i
-            # Translate the budget walk into a "protected count", apply the
-            # floor in count-space (where `max` reads naturally: protect at
-            # least `min_protect` messages or whatever the budget reserved,
-            # whichever is more), then convert back to a prune boundary.
-            # Doing this in index-space with `max` would invert the direction
-            # (smaller index = MORE protected), so a generous budget would
-            # silently get truncated back down to `min_protect`.
-            budget_protect_count = len(result) - boundary
-            protected_count = max(budget_protect_count, min_protect)
-            prune_boundary = len(result) - protected_count
+            prune_boundary = max(boundary, len(result) - min_protect)
        else:
            prune_boundary = len(result) - protect_tail_count

@@ -579,8 +569,6 @@ class ContextCompressor(ContextEngine):
            # Skip multimodal content (list of content blocks)
            if isinstance(content, list):
                continue
-            if not isinstance(content, str):
-                continue
            if len(content) < 200:
                continue
            h = hashlib.md5(content.encode("utf-8", errors="replace")).hexdigest()[:12]
@@ -600,8 +588,6 @@ class ContextCompressor(ContextEngine):
            # Skip multimodal content (list of content blocks)
            if isinstance(content, list):
                continue
-            if not isinstance(content, str):
-                continue
            if not content or content == _PRUNED_TOOL_PLACEHOLDER:
                continue
            # Skip already-deduplicated or previously-summarized results
@@ -917,19 +903,15 @@ The user has requested that this compaction PRIORITISE preserving all informatio
                or "does not exist" in _err_str
                or "no available channel" in _err_str
            )
-            _is_timeout = (
-                _status in (408, 429, 502, 504)
-                or "timeout" in _err_str
-            )
            if (
-                (_is_model_not_found or _is_timeout)
+                _is_model_not_found
                and self.summary_model
                and self.summary_model != self.model
                and not getattr(self, "_summary_model_fallen_back", False)
            ):
                self._summary_model_fallen_back = True
                logging.warning(
-                    "Summary model '%s' unavailable (%s). "
+                    "Summary model '%s' not available (%s). "
                    "Falling back to main model '%s' for compression.",
                    self.summary_model, e, self.model,
                )
@@ -993,39 +975,15 @@ The user has requested that this compaction PRIORITISE preserving all informatio
            return None

    @staticmethod
-    def _strip_summary_prefix(summary: str) -> str:
-        """Return summary body without the current or legacy handoff prefix."""
-        text = (summary or "").strip()
-        for prefix in (SUMMARY_PREFIX, LEGACY_SUMMARY_PREFIX):
-            if text.startswith(prefix):
-                return text[len(prefix):].lstrip()
-        return text
-
-    @classmethod
-    def _with_summary_prefix(cls, summary: str) -> str:
+    def _with_summary_prefix(summary: str) -> str:
        """Normalize summary text to the current compaction handoff format."""
-        text = cls._strip_summary_prefix(summary)
+        text = (summary or "").strip()
+        for prefix in (LEGACY_SUMMARY_PREFIX, SUMMARY_PREFIX):
+            if text.startswith(prefix):
+                text = text[len(prefix):].lstrip()
+                break
        return f"{SUMMARY_PREFIX}\n{text}" if text else SUMMARY_PREFIX

-    @staticmethod
-    def _is_context_summary_content(content: Any) -> bool:
-        text = _content_text_for_contains(content).lstrip()
-        return text.startswith(SUMMARY_PREFIX) or text.startswith(LEGACY_SUMMARY_PREFIX)
-
-    @classmethod
-    def _find_latest_context_summary(
-        cls,
-        messages: List[Dict[str, Any]],
-        start: int,
-        end: int,
-    ) -> tuple[Optional[int], str]:
-        """Find the newest handoff summary inside a compression window."""
-        for idx in range(end - 1, start - 1, -1):
-            content = messages[idx].get("content")
-            if cls._is_context_summary_content(content):
-                return idx, cls._strip_summary_prefix(_content_text_for_contains(content))
-        return None, ""
-
    # ------------------------------------------------------------------
    # Tool-call / tool-result pair integrity helpers
    # ------------------------------------------------------------------
@@ -1332,15 +1290,6 @@ The user has requested that this compaction PRIORITISE preserving all informatio
            return messages

        turns_to_summarize = messages[compress_start:compress_end]
-        summary_idx, summary_body = self._find_latest_context_summary(
-            messages,
-            compress_start,
-            compress_end,
-        )
-        if summary_idx is not None:
-            if summary_body and not self._previous_summary:
-                self._previous_summary = summary_body
-            turns_to_summarize = messages[summary_idx + 1:compress_end]

        if not self.quiet_mode:
            logger.info(
@@ -1418,19 +1367,6 @@ The user has requested that this compaction PRIORITISE preserving all informatio
                # Merge the summary into the first tail message instead
                # of inserting a standalone message that breaks alternation.
                _merge_summary_into_tail = True
-
-        # When the summary lands as a standalone role="user" message,
-        # weak models read the verbatim "## Active Task" quote of a past
-        # user request as fresh input (#11475, #14521). Append the explicit
-        # end marker — the same one used in the merge-into-tail path — so
-        # the model has a clear "summary above, not new input" signal.
-        if not _merge_summary_into_tail and summary_role == "user":
-            summary = (
-                summary
-                + "\n\n--- END OF CONTEXT SUMMARY — "
-                "respond to the message below, not the summary above ---"
-            )
-
        if not _merge_summary_into_tail:
            compressed.append({"role": summary_role, "content": summary})

@@ -3,7 +3,6 @@
 from __future__ import annotations

 import logging
-import os
 import random
 import threading
 import time
@@ -14,7 +13,7 @@ from datetime import datetime
 from typing import Any, Dict, List, Optional, Set, Tuple

 from hermes_constants import OPENROUTER_BASE_URL
-from hermes_cli.config import get_env_value, load_env
+from hermes_cli.config import get_env_value
 import hermes_cli.auth as auth_mod
 from hermes_cli.auth import (
    CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
@@ -1381,16 +1380,6 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
 def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool, Set[str]]:
    changed = False
    active_sources: Set[str] = set()
-
-    # Prefer ~/.hermes/.env over os.environ — the user's config file is the
-    # authoritative source for Hermes credentials. Stale env vars from parent
-    # processes (Codex CLI, test scripts, etc.) should not override deliberate
-    # changes to the .env file.
-    def _get_env_prefer_dotenv(key: str) -> str:
-        env_file = load_env()
-        val = env_file.get(key) or os.environ.get(key) or ""
-        return val.strip()
-
    # Honour user suppression — `hermes auth remove <provider> <N>` for an
    # env-seeded credential marks the env:<VAR> source as suppressed so it
    # won't be re-seeded from the user's shell environment or ~/.hermes/.env.
@@ -1402,8 +1391,8 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
        def _is_source_suppressed(_p, _s):  # type: ignore[misc]
            return False
    if provider == "openrouter":
-        # Prefer ~/.hermes/.env over os.environ
-        token = _get_env_prefer_dotenv("OPENROUTER_API_KEY")
+        # Check both os.environ and ~/.hermes/.env file
+        token = (get_env_value("OPENROUTER_API_KEY") or "").strip()
        if token:
            source = "env:OPENROUTER_API_KEY"
            if _is_source_suppressed(provider, source):
@@ -1429,7 +1418,7 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool

    env_url = ""
    if pconfig.base_url_env_var:
-        env_url = _get_env_prefer_dotenv(pconfig.base_url_env_var).rstrip("/")
+        env_url = (get_env_value(pconfig.base_url_env_var) or "").strip().rstrip("/")

    env_vars = list(pconfig.api_key_env_vars)
    if provider == "anthropic":
@@ -1440,8 +1429,8 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
        ]

    for env_var in env_vars:
-        # Prefer ~/.hermes/.env over os.environ
-        token = _get_env_prefer_dotenv(env_var)
+        # Check both os.environ and ~/.hermes/.env file
+        token = (get_env_value(env_var) or "").strip()
        if not token:
            continue
        source = f"env:{env_var}"
@@ -24,12 +24,11 @@ from __future__ import annotations
 import json
 import logging
 import os
-import re
 import tempfile
 import threading
 from datetime import datetime, timedelta, timezone
 from pathlib import Path
-from typing import Any, Callable, Dict, List, NamedTuple, Optional, Set
+from typing import Any, Callable, Dict, List, Optional, Set

 from hermes_constants import get_hermes_home
 from tools import skill_usage
@@ -37,22 +36,6 @@ from tools import skill_usage
 logger = logging.getLogger(__name__)


-def _strip_aux_credential(value: Any) -> Optional[str]:
-    if value is None:
-        return None
-    text = str(value).strip()
-    return text or None
-
-
-class _ReviewRuntimeBinding(NamedTuple):
-    """Provider/model for the curator review fork plus optional per-slot overrides."""
-
-    provider: str
-    model: str
-    explicit_api_key: Optional[str]
-    explicit_base_url: Optional[str]
-
-
 DEFAULT_INTERVAL_HOURS = 24 * 7  # 7 days
 DEFAULT_MIN_IDLE_HOURS = 2
 DEFAULT_STALE_AFTER_DAYS = 30
@@ -404,11 +387,6 @@ CURATOR_REVIEW_PROMPT = (
    "  - skill_manage action=write_file — add a references/, templates/, "
    "or scripts/ file under an existing skill (the skill must already "
    "exist)\n"
-    "  - skill_manage action=delete     — archive a skill. MUST pass "
-    "`absorbed_into=<umbrella>` when you've merged its content into another "
-    "skill, or `absorbed_into=\"\"` when you're truly pruning with no "
-    "forwarding target. This drives cron-job skill-reference migration — "
-    "guessing from your YAML summary after the fact is fragile.\n"
    "  - terminal                       — mv a sibling into the archive "
    "OR move its content into a support subfile\n\n"
    "'keep' is a legitimate decision ONLY when the skill is already a "
@@ -470,24 +448,6 @@ def _reports_root() -> Path:
    return root


-def _needle_in_path_component(needle: str, path: str) -> bool:
-    """Check if *needle* is a complete filename stem or directory name in *path*.
-
-    Unlike simple substring matching, this avoids false positives where short
-    skill names are embedded in longer filenames (e.g. "api" matching
-    "references/api-design.md").  Hyphens and underscores are normalised so
-    "open-webui-setup" matches "open_webui_setup.md".
-    """
-    norm_needle = needle.replace("-", "_")
-    for part in path.replace("\\", "/").split("/"):
-        if not part:
-            continue
-        stem = part.rsplit(".", 1)[0] if "." in part else part
-        if stem.replace("-", "_") == norm_needle:
-            return True
-    return False
-
-
 def _classify_removed_skills(
    removed: List[str],
    added: List[str],
@@ -566,29 +526,15 @@ def _classify_removed_skills(
                continue

            # Look for the removed skill's name in file_path / content / raw.
-            # Matching strategy differs by field type:
-            #   file_path — needle must be a complete path component
-            #     (filename stem or directory name), so "api" does NOT
-            #     falsely match "references/api-design.md".
-            #   content fields — word-boundary regex so "test" does NOT
-            #     falsely match "latest" or "testing".
-            haystacks: List[tuple[str, str]] = []
+            haystacks: List[str] = []
            for key in ("file_path", "file_content", "content", "new_string", "_raw"):
                v = args.get(key)
                if isinstance(v, str):
-                    haystacks.append((key, v))
+                    haystacks.append(v)
            hit = False
-            for key, hay in haystacks:
+            for hay in haystacks:
                for needle in needles:
-                    if not needle:
-                        continue
-                    if key == "file_path":
-                        matched = _needle_in_path_component(needle, hay)
-                    else:
-                        matched = bool(
-                            re.search(rf'\b{re.escape(needle)}\b', hay)
-                        )
-                    if matched:
+                    if needle and needle in hay:
                        hit = True
                        evidence = (
                            f"skill_manage action={args.get('action', '?')} "
@@ -691,76 +637,15 @@ def _parse_structured_summary(
    return out


-def _extract_absorbed_into_declarations(
-    tool_calls: List[Dict[str, Any]],
-) -> Dict[str, Dict[str, Any]]:
-    """Walk this run's tool calls and extract model-declared absorption targets.
-
-    The curator prompt requires every ``skill_manage(action='delete')`` call
-    to pass ``absorbed_into=<umbrella>`` when consolidating, or
-    ``absorbed_into=""`` when truly pruning. This is the single authoritative
-    signal for classification — the model's own declaration at the moment of
-    deletion, which beats both post-hoc YAML summary parsing and substring
-    heuristics on other tool calls.
-
-    Returns ``{skill_name: {"into": "<umbrella>" | "", "declared": True}}``.
-    Entries with ``into == ""`` are explicit prunings.
-    Skills without a ``skill_manage(delete)`` call, or with one that omitted
-    ``absorbed_into``, are not in the returned dict — caller falls back to
-    the existing heuristic/YAML logic for those (backward compat with older
-    curator runs and any callers that don't populate the arg).
-    """
-    out: Dict[str, Dict[str, Any]] = {}
-    for tc in tool_calls or []:
-        if not isinstance(tc, dict):
-            continue
-        if tc.get("name") != "skill_manage":
-            continue
-        raw = tc.get("arguments") or ""
-        args: Dict[str, Any] = {}
-        if isinstance(raw, dict):
-            args = raw
-        elif isinstance(raw, str):
-            try:
-                args = json.loads(raw)
-            except Exception:
-                continue
-        if not isinstance(args, dict):
-            continue
-        if args.get("action") != "delete":
-            continue
-        name = args.get("name")
-        if not isinstance(name, str) or not name.strip():
-            continue
-        # absorbed_into must be present (even empty string is meaningful);
-        # missing key means the model didn't declare intent.
-        if "absorbed_into" not in args:
-            continue
-        target = args.get("absorbed_into")
-        if target is None:
-            continue
-        if not isinstance(target, str):
-            continue
-        out[name.strip()] = {"into": target.strip(), "declared": True}
-    return out
-
-
 def _reconcile_classification(
    removed: List[str],
    heuristic: Dict[str, List[Dict[str, Any]]],
    model_block: Dict[str, List[Dict[str, str]]],
    destinations: Set[str],
-    absorbed_declarations: Optional[Dict[str, Dict[str, Any]]] = None,
 ) -> Dict[str, List[Dict[str, Any]]]:
    """Merge heuristic (tool-call evidence) with the model's structured block.

-    Rules (evaluated in order; first match wins):
-    - **Model-declared `absorbed_into` at delete time is authoritative.** Any
-      entry in ``absorbed_declarations`` beats every other signal. This is
-      the model telling us directly, at the moment of deletion, what it did.
-      ``into != ""`` and target exists → consolidated. ``into == ""`` →
-      pruned. ``into != ""`` but target doesn't exist → hallucination; fall
-      through to the usual signals.
+    Rules:
    - Model-declared consolidation wins when its ``into`` target exists
      in ``destinations`` (survived or newly-created). This gives the
      model authority over intent + rationale.
@@ -781,8 +666,6 @@ def _reconcile_classification(
    model_cons = {e["from"]: e for e in model_block.get("consolidations", [])}
    model_pruned = {e["name"]: e for e in model_block.get("prunings", [])}

-    declared = absorbed_declarations or {}
-
    consolidated: List[Dict[str, Any]] = []
    pruned: List[Dict[str, Any]] = []

@@ -790,36 +673,6 @@ def _reconcile_classification(
        mc = model_cons.get(name)
        mp = model_pruned.get(name)
        hc = heur_cons.get(name)
-        dec = declared.get(name)
-
-        # Authoritative: model declared `absorbed_into` at the delete call.
-        if dec is not None:
-            into_claim = dec.get("into", "")
-            if into_claim and into_claim in destinations:
-                entry: Dict[str, Any] = {
-                    "name": name,
-                    "into": into_claim,
-                    "source": "absorbed_into (model-declared at delete)",
-                    "reason": (mc.get("reason") or "") if mc else "",
-                }
-                if hc and hc.get("evidence"):
-                    entry["evidence"] = hc["evidence"]
-                consolidated.append(entry)
-                continue
-            if into_claim == "":
-                # Explicit prune declaration
-                pruned.append({
-                    "name": name,
-                    "source": "absorbed_into=\"\" (model-declared prune)",
-                    "reason": (mp.get("reason") or "") if mp else "",
-                })
-                continue
-            # into_claim is non-empty but target doesn't exist: the model
-            # named a nonexistent umbrella at delete time. The tool already
-            # rejects this at the skill_manage layer, so we shouldn't see it
-            # in practice — but if it slips through (e.g. the umbrella was
-            # deleted LATER in the same run), fall through to the usual
-            # signals rather than trusting a broken reference.

        # Model says consolidated — trust it if the destination is real.
        if mc and mc.get("into") in destinations:
@@ -955,20 +808,11 @@ def _write_run_report(
    )
    model_block = _parse_structured_summary(llm_meta.get("final", "") or "")
    destinations = set(after_names) | set(added or [])
-    # Authoritative signal: extract per-delete `absorbed_into` declarations
-    # from this run's tool calls. These beat both the YAML summary block and
-    # the substring heuristic — the model is telling us directly, at the
-    # moment of deletion, whether each archived skill was consolidated
-    # (into=<umbrella>) or pruned (into="").
-    absorbed_declarations = _extract_absorbed_into_declarations(
-        llm_meta.get("tool_calls", []) or []
-    )
    classification = _reconcile_classification(
        removed=removed,
        heuristic=heuristic,
        model_block=model_block,
        destinations=destinations,
-        absorbed_declarations=absorbed_declarations,
    )
    consolidated = classification["consolidated"]
    pruned = classification["pruned"]
@@ -1447,52 +1291,6 @@ def run_curator_review(
    }


-def _resolve_review_runtime(cfg: Dict[str, Any]) -> _ReviewRuntimeBinding:
-    """Resolve provider/model and per-slot credentials for the curator review fork.
-
-    Same precedence as `_resolve_review_model()`. Non-empty ``api_key`` /
-    ``base_url`` from the active slot are returned as explicit overrides so
-    ``resolve_runtime_provider`` does not silently reuse the main chat
-    credential chain for a routed auxiliary model.
-    """
-    _main = cfg.get("model", {}) if isinstance(cfg.get("model"), dict) else {}
-    _main_provider = _main.get("provider") or "auto"
-    _main_model = _main.get("default") or _main.get("model") or ""
-
-    # 1. Canonical aux task slot
-    _aux = cfg.get("auxiliary", {}) if isinstance(cfg.get("auxiliary"), dict) else {}
-    _cur_task = _aux.get("curator", {}) if isinstance(_aux.get("curator"), dict) else {}
-    _task_provider = (_cur_task.get("provider") or "").strip() or None
-    _task_model = (_cur_task.get("model") or "").strip() or None
-    if _task_provider and _task_provider != "auto" and _task_model:
-        return _ReviewRuntimeBinding(
-            _task_provider,
-            _task_model,
-            _strip_aux_credential(_cur_task.get("api_key")),
-            _strip_aux_credential(_cur_task.get("base_url")),
-        )
-
-    # 2. Legacy curator.auxiliary.{provider,model} (deprecated, pre-unification)
-    _cur = cfg.get("curator", {}) if isinstance(cfg.get("curator"), dict) else {}
-    _legacy = _cur.get("auxiliary", {}) if isinstance(_cur.get("auxiliary"), dict) else {}
-    _legacy_provider = _legacy.get("provider") or None
-    _legacy_model = _legacy.get("model") or None
-    if _legacy_provider and _legacy_model:
-        logger.info(
-            "curator: using deprecated curator.auxiliary.{provider,model} "
-            "config — please migrate to auxiliary.curator.{provider,model}"
-        )
-        return _ReviewRuntimeBinding(
-            str(_legacy_provider),
-            str(_legacy_model),
-            _strip_aux_credential(_legacy.get("api_key")),
-            _strip_aux_credential(_legacy.get("base_url")),
-        )
-
-    # 3. Fall through to the main chat model
-    return _ReviewRuntimeBinding(_main_provider, _main_model, None, None)
-
-
 def _resolve_review_model(cfg: Dict[str, Any]) -> tuple[str, str]:
    """Pick (provider, model) for the curator review fork.

@@ -1508,8 +1306,32 @@ def _resolve_review_model(cfg: Dict[str, Any]) -> tuple[str, str]:
      2. Legacy ``curator.auxiliary.{provider,model}`` when both are set
      3. Main ``model.{provider,default/model}`` pair
    """
-    b = _resolve_review_runtime(cfg)
-    return b.provider, b.model
+    _main = cfg.get("model", {}) if isinstance(cfg.get("model"), dict) else {}
+    _main_provider = _main.get("provider") or "auto"
+    _main_model = _main.get("default") or _main.get("model") or ""
+
+    # 1. Canonical aux task slot
+    _aux = cfg.get("auxiliary", {}) if isinstance(cfg.get("auxiliary"), dict) else {}
+    _cur_task = _aux.get("curator", {}) if isinstance(_aux.get("curator"), dict) else {}
+    _task_provider = (_cur_task.get("provider") or "").strip() or None
+    _task_model = (_cur_task.get("model") or "").strip() or None
+    if _task_provider and _task_provider != "auto" and _task_model:
+        return _task_provider, _task_model
+
+    # 2. Legacy curator.auxiliary.{provider,model} (deprecated, pre-unification)
+    _cur = cfg.get("curator", {}) if isinstance(cfg.get("curator"), dict) else {}
+    _legacy = _cur.get("auxiliary", {}) if isinstance(_cur.get("auxiliary"), dict) else {}
+    _legacy_provider = _legacy.get("provider") or None
+    _legacy_model = _legacy.get("model") or None
+    if _legacy_provider and _legacy_model:
+        logger.info(
+            "curator: using deprecated curator.auxiliary.{provider,model} "
+            "config — please migrate to auxiliary.curator.{provider,model}"
+        )
+        return _legacy_provider, _legacy_model
+
+    # 3. Fall through to the main chat model
+    return _main_provider, _main_model


 def _run_llm_review(prompt: str) -> Dict[str, Any]:
@@ -1548,10 +1370,10 @@ def _run_llm_review(prompt: str) -> Dict[str, Any]:
    # arguments hits an auto-resolution path that fails for OAuth-only
    # providers and for pool-backed credentials.
    #
-    # `_resolve_review_runtime()` honors `auxiliary.curator.{provider,model,...}`
+    # `_resolve_review_model()` honors `auxiliary.curator.{provider,model}`
    # (canonical aux-task slot, wired through `hermes model` → auxiliary
    # picker and the dashboard Models tab), with a legacy fallback to
-    # `curator.auxiliary.{provider,model,...}`. See docs/user-guide/features/curator.md.
+    # `curator.auxiliary.{provider,model}`. See docs/user-guide/features/curator.md.
    _api_key = None
    _base_url = None
    _api_mode = None
@@ -1561,13 +1383,9 @@ def _run_llm_review(prompt: str) -> Dict[str, Any]:
        from hermes_cli.config import load_config
        from hermes_cli.runtime_provider import resolve_runtime_provider
        _cfg = load_config()
-        _binding = _resolve_review_runtime(_cfg)
-        _provider, _model_name = _binding.provider, _binding.model
+        _provider, _model_name = _resolve_review_model(_cfg)
        _rp = resolve_runtime_provider(
-            requested=_provider,
-            target_model=_model_name,
-            explicit_api_key=_binding.explicit_api_key,
-            explicit_base_url=_binding.explicit_base_url,
+            requested=_provider, target_model=_model_name
        )
        _api_key = _rp.get("api_key")
        _base_url = _rp.get("base_url")
@@ -21,18 +21,6 @@ It DOES include:
    pointer — otherwise the curator would immediately re-fire on the next
    tick)
  - ``.bundled_manifest`` (so protection markers stay consistent)
-
-Alongside the skills tarball, each snapshot also captures a copy of
-``~/.hermes/cron/jobs.json`` as ``cron-jobs.json`` when it exists. Cron
-jobs reference skills by name in their ``skills``/``skill`` fields; the
-curator's consolidation pass rewrites those in place via
-``cron.jobs.rewrite_skill_refs()``. Without capturing the pre-run state,
-rolling back the skills tree would leave cron jobs pointing at the
-umbrella skills even though the narrow skills they were originally
-configured with have been restored. We store the whole jobs.json for
-fidelity but rollback only touches the ``skills``/``skill`` fields — the
-rest (schedule, next_run_at, enabled, prompt, etc.) is live state and
-we leave it alone.
 """

 from __future__ import annotations
@@ -75,60 +63,6 @@ def _skills_dir() -> Path:
    return get_hermes_home() / "skills"


-def _cron_jobs_file() -> Path:
-    """Source path for the live cron jobs store (``~/.hermes/cron/jobs.json``)."""
-    return get_hermes_home() / "cron" / "jobs.json"
-
-
-CRON_JOBS_FILENAME = "cron-jobs.json"
-
-
-def _backup_cron_jobs_into(dest: Path) -> Dict[str, Any]:
-    """Copy the live cron jobs.json into ``dest`` as ``cron-jobs.json``.
-
-    Returns a small dict describing what was captured so the caller can
-    fold it into the manifest. Never raises — if the cron file is missing
-    or unreadable, the return dict has ``backed_up=False`` and the reason,
-    and the snapshot proceeds without cron data (the snapshot is still
-    useful for rolling back skills).
-    """
-    src = _cron_jobs_file()
-    info: Dict[str, Any] = {"backed_up": False, "jobs_count": 0}
-    if not src.exists():
-        info["reason"] = "no cron/jobs.json present"
-        return info
-    try:
-        raw = src.read_text(encoding="utf-8")
-    except OSError as e:
-        logger.debug("Failed to read cron/jobs.json for backup: %s", e)
-        info["reason"] = f"read error: {e}"
-        return info
-    # Count jobs as a nice diagnostic — but don't fail the snapshot if the
-    # file is unparseable; just store the raw text and let rollback deal
-    # with it (or not, if it's corrupted). jobs.json wraps the list as
-    # `{"jobs": [...], "updated_at": ...}` — we count via that shape, and
-    # fall back to bare-list shape just in case the format ever changes.
-    try:
-        parsed = json.loads(raw)
-        if isinstance(parsed, dict):
-            inner = parsed.get("jobs")
-            if isinstance(inner, list):
-                info["jobs_count"] = len(inner)
-        elif isinstance(parsed, list):
-            info["jobs_count"] = len(parsed)
-    except (json.JSONDecodeError, TypeError):
-        info["jobs_count"] = 0
-        info["parse_warning"] = "jobs.json was not valid JSON at snapshot time"
-    try:
-        (dest / CRON_JOBS_FILENAME).write_text(raw, encoding="utf-8")
-    except OSError as e:
-        logger.debug("Failed to write cron backup file: %s", e)
-        info["reason"] = f"write error: {e}"
-        return info
-    info["backed_up"] = True
-    return info
-
-
 def _utc_id(now: Optional[datetime] = None) -> str:
    """UTC ISO-ish filesystem-safe timestamp: ``2026-05-01T13-05-42Z``."""
    if now is None:
@@ -182,8 +116,7 @@ def _count_skill_files(base: Path) -> int:


 def _write_manifest(dest: Path, reason: str, archive_path: Path,
-                    skills_counted: int,
-                    cron_info: Optional[Dict[str, Any]] = None) -> None:
+                    skills_counted: int) -> None:
    manifest = {
        "id": dest.name,
        "reason": reason,
@@ -192,15 +125,6 @@ def _write_manifest(dest: Path, reason: str, archive_path: Path,
        "archive_bytes": archive_path.stat().st_size,
        "skill_files": skills_counted,
    }
-    if cron_info is not None:
-        manifest["cron_jobs"] = {
-            "backed_up": bool(cron_info.get("backed_up", False)),
-            "jobs_count": int(cron_info.get("jobs_count", 0)),
-        }
-        if not cron_info.get("backed_up"):
-            manifest["cron_jobs"]["reason"] = cron_info.get("reason", "not captured")
-        if cron_info.get("parse_warning"):
-            manifest["cron_jobs"]["parse_warning"] = cron_info["parse_warning"]
    (dest / "manifest.json").write_text(
        json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8"
    )
@@ -257,14 +181,7 @@ def snapshot_skills(reason: str = "manual") -> Optional[Path]:
                # arcname: store paths relative to skills/ so extraction
                # drops cleanly back into the skills dir.
                tf.add(str(entry), arcname=entry.name, recursive=True)
-        # Capture cron/jobs.json alongside the tarball. Never fails the
-        # snapshot — the skills side is the core guarantee; cron is
-        # additive. We still record in the manifest whether it was
-        # captured so rollback can surface "no cron data in this snapshot".
-        cron_info = _backup_cron_jobs_into(dest)
-        _write_manifest(dest, reason, archive,
-                        _count_skill_files(skills),
-                        cron_info=cron_info)
+        _write_manifest(dest, reason, archive, _count_skill_files(skills))
    except (OSError, tarfile.TarError) as e:
        logger.debug("Curator snapshot failed: %s", e, exc_info=True)
        # Clean up partial snapshot
@@ -381,149 +298,6 @@ def _resolve_backup(backup_id: Optional[str]) -> Optional[Path]:
    return candidates[0] if candidates else None


-def _restore_cron_skill_links(snapshot_dir: Path) -> Dict[str, Any]:
-    """Reconcile backed-up cron skill links into the live ``cron/jobs.json``.
-
-    We do NOT overwrite the whole cron file. Only the ``skills`` and
-    ``skill`` fields are restored, and only on jobs that still exist in the
-    current file (matched by ``id``). Everything else about the job —
-    schedule, next_run_at, last_run_at, enabled, prompt, workdir, hooks —
-    is live state that the user/scheduler has modified since the snapshot;
-    overwriting it would regress unrelated cron activity.
-
-    Rules:
-    - Jobs present in backup AND live, with differing skills → skills restored.
-    - Jobs present in backup AND live, with matching skills → no-op.
-    - Jobs present in backup but gone from live (user deleted the job
-      after the snapshot) → skipped, noted in the return report.
-    - Jobs present in live but not in backup (user created a new cron
-      job after the snapshot) → left untouched.
-
-    Never raises; failures are captured in the return dict. Writes through
-    ``cron.jobs`` to pick up the same lock + atomic-write path that tick()
-    uses, so we don't race the scheduler.
-    """
-    report: Dict[str, Any] = {
-        "attempted": False,
-        "restored": [],
-        "skipped_missing": [],
-        "unchanged": 0,
-        "error": None,
-    }
-    backup_file = snapshot_dir / CRON_JOBS_FILENAME
-    if not backup_file.exists():
-        report["error"] = f"snapshot has no {CRON_JOBS_FILENAME}"
-        return report
-
-    try:
-        backup_text = backup_file.read_text(encoding="utf-8")
-        backup_parsed = json.loads(backup_text)
-    except (OSError, json.JSONDecodeError) as e:
-        report["error"] = f"failed to load backed-up jobs: {e}"
-        return report
-    # jobs.json on disk is `{"jobs": [...], "updated_at": ...}`; accept both
-    # that shape and a bare list for forward compat.
-    if isinstance(backup_parsed, dict):
-        backup_jobs = backup_parsed.get("jobs")
-    elif isinstance(backup_parsed, list):
-        backup_jobs = backup_parsed
-    else:
-        backup_jobs = None
-    if not isinstance(backup_jobs, list):
-        report["error"] = "backed-up cron-jobs.json has no jobs list"
-        return report
-
-    # Build a lookup of the backed-up skill state keyed by job id.
-    # We only need the two skill-ish fields (legacy single and modern list).
-    backup_by_id: Dict[str, Dict[str, Any]] = {}
-    for job in backup_jobs:
-        if not isinstance(job, dict):
-            continue
-        jid = job.get("id")
-        if not isinstance(jid, str) or not jid:
-            continue
-        backup_by_id[jid] = {
-            "skills": job.get("skills"),
-            "skill": job.get("skill"),
-            "name": job.get("name") or jid,
-        }
-
-    if not backup_by_id:
-        report["attempted"] = True  # we tried but there was nothing to do
-        return report
-
-    # Load and rewrite the live jobs under the scheduler's lock.
-    try:
-        from cron.jobs import load_jobs, save_jobs, _jobs_file_lock
-    except ImportError as e:
-        report["error"] = f"cron module unavailable: {e}"
-        return report
-
-    report["attempted"] = True
-    try:
-        with _jobs_file_lock:
-            live_jobs = load_jobs()
-            changed = False
-
-            live_ids = set()
-            for live in live_jobs:
-                if not isinstance(live, dict):
-                    continue
-                jid = live.get("id")
-                if not isinstance(jid, str) or not jid:
-                    continue
-                live_ids.add(jid)
-
-                backup = backup_by_id.get(jid)
-                if backup is None:
-                    continue  # live job didn't exist at snapshot time
-
-                cur_skills = live.get("skills")
-                cur_skill = live.get("skill")
-                bkp_skills = backup.get("skills")
-                bkp_skill = backup.get("skill")
-
-                if cur_skills == bkp_skills and cur_skill == bkp_skill:
-                    report["unchanged"] += 1
-                    continue
-
-                # Restore. Preserve absence (don't force the key to appear
-                # if the backup didn't have it either).
-                if bkp_skills is None:
-                    live.pop("skills", None)
-                else:
-                    live["skills"] = bkp_skills
-                if bkp_skill is None:
-                    live.pop("skill", None)
-                else:
-                    live["skill"] = bkp_skill
-
-                report["restored"].append({
-                    "job_id": jid,
-                    "job_name": backup.get("name") or jid,
-                    "from": {"skills": cur_skills, "skill": cur_skill},
-                    "to": {"skills": bkp_skills, "skill": bkp_skill},
-                })
-                changed = True
-
-            # Jobs in backup but not in live = user deleted them after snapshot
-            for jid, backup in backup_by_id.items():
-                if jid not in live_ids:
-                    report["skipped_missing"].append({
-                        "job_id": jid,
-                        "job_name": backup.get("name") or jid,
-                    })
-
-            if changed:
-                save_jobs(live_jobs)
-    except Exception as e:  # noqa: BLE001 — rollback must not die mid-restore
-        logger.debug("Cron skill-link restore failed: %s", e, exc_info=True)
-        report["error"] = f"restore failed mid-flight: {e}"
-
-    return report
-
-
-
 def rollback(backup_id: Optional[str] = None) -> Tuple[bool, str, Optional[Path]]:
    """Restore ``~/.hermes/skills/`` from a snapshot.

@@ -634,35 +408,8 @@ def rollback(backup_id: Optional[str] = None) -> Tuple[bool, str, Optional[Path]
    except OSError:
        pass

-    # Reconcile cron skill-links. Surgical: only the skills/skill fields
-    # on jobs matched by id. Everything else in jobs.json is live state
-    # (schedule, next_run_at, enabled, prompt, etc.) and we leave it
-    # alone. Failures here don't fail the overall rollback — the skills
-    # tree is already restored, which is the main guarantee.
-    cron_report = _restore_cron_skill_links(target)
-
-    summary_bits = [f"restored from snapshot {target.name}"]
-    if cron_report.get("attempted"):
-        restored_n = len(cron_report.get("restored") or [])
-        skipped_n = len(cron_report.get("skipped_missing") or [])
-        if cron_report.get("error"):
-            summary_bits.append(f"cron links: error — {cron_report['error']}")
-        elif restored_n == 0 and skipped_n == 0 and cron_report.get("unchanged", 0) == 0:
-            # Attempted but nothing matched — empty snapshot or no overlapping ids.
-            pass
-        else:
-            parts = []
-            if restored_n:
-                parts.append(f"{restored_n} job(s) had skill links restored")
-            if skipped_n:
-                parts.append(f"{skipped_n} backed-up job(s) no longer exist (skipped)")
-            if cron_report.get("unchanged"):
-                parts.append(f"{cron_report['unchanged']} already matched")
-            summary_bits.append("cron links: " + ", ".join(parts))
-
-    logger.info("Curator rollback: restored from %s (cron_report=%s)",
-                target.name, cron_report)
-    return (True, "; ".join(summary_bits), target)
+    logger.info("Curator rollback: restored from %s", target.name)
+    return (True, f"restored from snapshot {target.name}", target)


 # ---------------------------------------------------------------------------
@@ -55,7 +55,6 @@ class FailoverReason(enum.Enum):
    thinking_signature = "thinking_signature"  # Anthropic thinking block sig invalid
    long_context_tier = "long_context_tier"    # Anthropic "extra usage" tier gate
    oauth_long_context_beta_forbidden = "oauth_long_context_beta_forbidden"  # Anthropic OAuth subscription rejects 1M context beta — disable beta and retry
-    llama_cpp_grammar_pattern = "llama_cpp_grammar_pattern"  # llama.cpp json-schema-to-grammar rejects regex escapes in `pattern` / `format` — strip from tools and retry

    # Catch-all
    unknown = "unknown"                  # Unclassifiable — retry with backoff
@@ -471,31 +470,6 @@ def classify_api_error(
            should_compress=False,
        )

-    # llama.cpp's ``json-schema-to-grammar`` converter (used by its OAI
-    # server to build GBNF tool-call parsers) rejects regex escape classes
-    # like ``\d``/``\w``/``\s`` and most ``format`` values. MCP servers
-    # routinely emit ``"pattern": "\\d{4}-\\d{2}-\\d{2}"`` for date/phone/
-    # email params. llama.cpp surfaces this as HTTP 400 with one of a few
-    # recognizable phrases; on match we strip ``pattern``/``format`` from
-    # ``self.tools`` in the retry loop and retry once. Cloud providers are
-    # unaffected — they accept these keywords and we never hit this branch.
-    if (
-        status_code == 400
-        and (
-            "error parsing grammar" in error_msg
-            or "json-schema-to-grammar" in error_msg
-            or (
-                "unable to generate parser" in error_msg
-                and "template" in error_msg
-            )
-        )
-    ):
-        return _result(
-            FailoverReason.llama_cpp_grammar_pattern,
-            retryable=True,
-            should_compress=False,
-        )
-
    # ── 2. HTTP status code classification ──────────────────────────

    if status_code is not None:
@@ -546,12 +520,7 @@ def classify_api_error(

    is_disconnect = any(p in error_msg for p in _SERVER_DISCONNECT_PATTERNS)
    if is_disconnect and not status_code:
-        # Absolute token/message-count thresholds are only a proxy for smaller
-        # context windows.  Large-context sessions can have hundreds of
-        # messages while still being far below their actual token budget.
-        is_large = approx_tokens > context_length * 0.6 or (
-            context_length <= 256000 and (approx_tokens > 120000 or num_messages > 200)
-        )
+        is_large = approx_tokens > context_length * 0.6 or approx_tokens > 120000 or num_messages > 200
        if is_large:
            return _result(
                FailoverReason.context_overflow,
@@ -797,12 +766,7 @@ def _classify_400(
        if not err_body_msg:
            err_body_msg = str(body.get("message") or "").strip().lower()
    is_generic = len(err_body_msg) < 30 or err_body_msg in ("error", "")
-    # Absolute token/message-count thresholds are only a proxy for smaller
-    # context windows.  Large-context sessions can have many messages while
-    # still being far below their actual token budget.
-    is_large = approx_tokens > context_length * 0.4 or (
-        context_length <= 256000 and (approx_tokens > 80000 or num_messages > 80)
-    )
+    is_large = approx_tokens > context_length * 0.4 or approx_tokens > 80000 or num_messages > 80

    if is_generic and is_large:
        return result_fn(
@@ -679,21 +679,7 @@ def translate_stream_event(event: Dict[str, Any], model: str, tool_call_indices:
    finish_reason_raw = str(cand.get("finishReason") or "")
    if finish_reason_raw:
        mapped = "tool_calls" if tool_call_indices else _map_gemini_finish_reason(finish_reason_raw)
-        finish_chunk = _make_stream_chunk(model=model, finish_reason=mapped)
-        # Attach usage from this event's usageMetadata so the streaming
-        # loop in run_agent.py can record token counts (mirrors the
-        # non-streaming path in translate_gemini_response).
-        usage_meta = event.get("usageMetadata") or {}
-        if usage_meta:
-            finish_chunk.usage = SimpleNamespace(
-                prompt_tokens=int(usage_meta.get("promptTokenCount") or 0),
-                completion_tokens=int(usage_meta.get("candidatesTokenCount") or 0),
-                total_tokens=int(usage_meta.get("totalTokenCount") or 0),
-                prompt_tokens_details=SimpleNamespace(
-                    cached_tokens=int(usage_meta.get("cachedContentTokenCount") or 0),
-                ),
-            )
-        chunks.append(finish_chunk)
+        chunks.append(_make_stream_chunk(model=model, finish_reason=mapped))
    return chunks


@@ -489,29 +489,16 @@ def save_credentials(creds: GoogleCredentials) -> Path:
    """Atomically write creds to disk with 0o600 permissions."""
    path = _credentials_path()
    path.parent.mkdir(parents=True, exist_ok=True)
-    # Tighten parent dir to 0o700 so siblings can't traverse to the creds file.
-    # On Windows this is a no-op (POSIX mode bits aren't enforced); ignore failures.
-    try:
-        os.chmod(path.parent, 0o700)
-    except OSError:
-        pass
    payload = json.dumps(creds.to_dict(), indent=2, sort_keys=True) + "\n"

    with _credentials_lock():
        tmp_path = path.with_suffix(f".tmp.{os.getpid()}.{secrets.token_hex(4)}")
        try:
-            # Create with 0o600 atomically to close the TOCTOU window where the
-            # default umask (often 0o644) would briefly expose tokens to other
-            # local users between open() and chmod().
-            fd = os.open(
-                str(tmp_path),
-                os.O_WRONLY | os.O_CREAT | os.O_EXCL,
-                stat.S_IRUSR | stat.S_IWUSR,
-            )
-            with os.fdopen(fd, "w", encoding="utf-8") as fh:
+            with open(tmp_path, "w", encoding="utf-8") as fh:
                fh.write(payload)
                fh.flush()
                os.fsync(fh.fileno())
+            os.chmod(tmp_path, stat.S_IRUSR | stat.S_IWUSR)
            atomic_replace(tmp_path, path)
        finally:
            try:
@@ -1,230 +0,0 @@
-"""Lightweight internationalization (i18n) for Hermes static user-facing messages.
-
-Scope (thin slice, by design): only the highest-impact static strings shown
-to the user by Hermes itself -- approval prompts, a handful of gateway slash
-command replies, restart-drain notices.  Agent-generated output, log lines,
-error tracebacks, tool outputs, and slash-command descriptions all stay in
-English.
-
-Catalog files live under ``locales/<lang>.yaml`` at the repo root.  Each
-catalog is a flat dict keyed by dotted paths (e.g. ``approval.choose`` or
-``gateway.approval_expired``).  Missing keys fall back to English; if English
-is missing too, the key path itself is returned so a broken catalog never
-crashes the agent.
-
-Usage::
-
-    from agent.i18n import t
-    print(t("approval.choose_long"))                       # current lang
-    print(t("gateway.draining", count=3))                  # {count} formatted
-    print(t("approval.choose_long", lang="zh"))            # explicit override
-
-Language resolution order:
-    1. Explicit ``lang=`` argument passed to :func:`t`
-    2. ``HERMES_LANGUAGE`` environment variable (for tests / quick override)
-    3. ``display.language`` from config.yaml
-    4. ``"en"`` (baseline)
-
-Supported languages: en, zh, ja, de, es.  Unknown values fall back to en.
-"""
-
-from __future__ import annotations
-
-import logging
-import os
-import threading
-from functools import lru_cache
-from pathlib import Path
-from typing import Any
-
-logger = logging.getLogger(__name__)
-
-SUPPORTED_LANGUAGES: tuple[str, ...] = ("en", "zh", "ja", "de", "es")
-DEFAULT_LANGUAGE = "en"
-
-# Accept a few natural aliases so users who type "chinese" / "zh-CN" / "jp"
-# get the right catalog instead of silently falling back to English.
-_LANGUAGE_ALIASES: dict[str, str] = {
-    "english": "en", "en-us": "en", "en-gb": "en",
-    "chinese": "zh", "mandarin": "zh", "zh-cn": "zh", "zh-tw": "zh", "zh-hans": "zh", "zh-hant": "zh",
-    "japanese": "ja", "jp": "ja", "ja-jp": "ja",
-    "german": "de", "deutsch": "de", "de-de": "de",
-    "spanish": "es", "español": "es", "espanol": "es", "es-es": "es", "es-mx": "es",
-}
-
-_catalog_cache: dict[str, dict[str, str]] = {}
-_catalog_lock = threading.Lock()
-
-
-def _locales_dir() -> Path:
-    """Return the directory containing locale YAML files.
-
-    Lives next to the repo root so both the bundled install and editable
-    checkouts find it without PYTHONPATH gymnastics.
-    """
-    # agent/i18n.py -> agent/ -> repo root
-    return Path(__file__).resolve().parent.parent / "locales"
-
-
-def _normalize_lang(value: Any) -> str:
-    """Normalize a user-supplied language value to a supported code.
-
-    Accepts supported codes directly, common aliases (``chinese`` -> ``zh``),
-    and case-insensitive regional tags (``zh-CN`` -> ``zh``).  Returns the
-    default language for unknown values.
-    """
-    if not isinstance(value, str):
-        return DEFAULT_LANGUAGE
-    key = value.strip().lower()
-    if not key:
-        return DEFAULT_LANGUAGE
-    if key in SUPPORTED_LANGUAGES:
-        return key
-    if key in _LANGUAGE_ALIASES:
-        return _LANGUAGE_ALIASES[key]
-    # Try stripping a region suffix (e.g. "pt-br" -> "pt" won't be supported,
-    # but "zh-CN" -> "zh" will).
-    base = key.split("-", 1)[0]
-    if base in SUPPORTED_LANGUAGES:
-        return base
-    return DEFAULT_LANGUAGE
-
-
-def _load_catalog(lang: str) -> dict[str, str]:
-    """Load and flatten one locale YAML file into a dotted-key dict.
-
-    YAML files can be nested for human readability; this produces the flat
-    key space :func:`t` expects.  Cached per-language for the process.
-    """
-    with _catalog_lock:
-        cached = _catalog_cache.get(lang)
-        if cached is not None:
-            return cached
-
-    path = _locales_dir() / f"{lang}.yaml"
-    if not path.is_file():
-        logger.debug("i18n catalog missing for %s at %s", lang, path)
-        with _catalog_lock:
-            _catalog_cache[lang] = {}
-        return {}
-
-    try:
-        import yaml  # PyYAML is already a hermes dependency
-        with path.open("r", encoding="utf-8") as f:
-            raw = yaml.safe_load(f) or {}
-    except Exception as exc:
-        logger.warning("Failed to load i18n catalog %s: %s", path, exc)
-        with _catalog_lock:
-            _catalog_cache[lang] = {}
-        return {}
-
-    flat: dict[str, str] = {}
-    _flatten_into(raw, "", flat)
-    with _catalog_lock:
-        _catalog_cache[lang] = flat
-    return flat
-
-
-def _flatten_into(node: Any, prefix: str, out: dict[str, str]) -> None:
-    if isinstance(node, dict):
-        for key, value in node.items():
-            child_key = f"{prefix}.{key}" if prefix else str(key)
-            _flatten_into(value, child_key, out)
-    elif isinstance(node, str):
-        out[prefix] = node
-    # Non-string, non-dict leaves are ignored -- catalogs are text-only.
-
-
-@lru_cache(maxsize=1)
-def _config_language_cached() -> str | None:
-    """Read ``display.language`` from config.yaml once per process.
-
-    Cached because ``t()`` is called in hot paths (every approval prompt,
-    every gateway reply) and re-reading YAML each call would be wasteful.
-    ``reset_language_cache()`` clears this when config changes at runtime
-    (e.g. after the setup wizard).
-    """
-    try:
-        from hermes_cli.config import load_config
-        cfg = load_config()
-        lang = (cfg.get("display") or {}).get("language")
-        if lang:
-            return _normalize_lang(lang)
-    except Exception as exc:
-        logger.debug("Could not read display.language from config: %s", exc)
-    return None
-
-
-def reset_language_cache() -> None:
-    """Invalidate cached language resolution and catalogs.
-
-    Call after :func:`hermes_cli.config.save_config` if a running process
-    needs to pick up a changed ``display.language`` without restart.
-    """
-    _config_language_cached.cache_clear()
-    with _catalog_lock:
-        _catalog_cache.clear()
-
-
-def get_language() -> str:
-    """Resolve the active language using env > config > default order."""
-    env_lang = os.environ.get("HERMES_LANGUAGE")
-    if env_lang:
-        return _normalize_lang(env_lang)
-    cfg_lang = _config_language_cached()
-    if cfg_lang:
-        return cfg_lang
-    return DEFAULT_LANGUAGE
-
-
-def t(key: str, lang: str | None = None, **format_kwargs: Any) -> str:
-    """Translate a dotted key to the active language.
-
-    Parameters
-    ----------
-    key
-        Dotted path into the catalog, e.g. ``"approval.choose_long"``.
-    lang
-        Explicit language override.  Takes precedence over env + config.
-    **format_kwargs
-        ``str.format`` substitution arguments (``t("gateway.drain", count=3)``
-        expects a catalog entry with a ``{count}`` placeholder).
-
-    Returns
-    -------
-    The translated string, or the English fallback if the key is missing in
-    the target language, or the bare key if English is also missing.
-    """
-    target = _normalize_lang(lang) if lang else get_language()
-    catalog = _load_catalog(target)
-    value = catalog.get(key)
-
-    if value is None and target != DEFAULT_LANGUAGE:
-        # Fall through to English rather than showing a key path to the user.
-        value = _load_catalog(DEFAULT_LANGUAGE).get(key)
-
-    if value is None:
-        # Last-ditch: return the key itself.  A broken catalog should not
-        # crash anything; it just looks ugly until someone fixes it.
-        logger.debug("i18n miss: key=%r lang=%r", key, target)
-        value = key
-
-    if format_kwargs:
-        try:
-            return value.format(**format_kwargs)
-        except (KeyError, IndexError, ValueError) as exc:
-            logger.warning(
-                "i18n format failed for key=%r lang=%r kwargs=%r: %s",
-                key, target, format_kwargs, exc,
-            )
-            return value
-    return value
-
-
-__all__ = [
-    "SUPPORTED_LANGUAGES",
-    "DEFAULT_LANGUAGE",
-    "t",
-    "get_language",
-    "reset_language_cache",
-]
@@ -1,14 +1,17 @@
-"""MemoryManager — orchestrates memory providers for the agent.
+"""MemoryManager — orchestrates the built-in memory provider plus at most
+ONE external plugin memory provider.

 Single integration point in run_agent.py. Replaces scattered per-backend
 code with one manager that delegates to registered providers.

-Only ONE external plugin provider is allowed at a time — attempting to
-register a second external provider is rejected with a warning.  This
+The BuiltinMemoryProvider is always registered first and cannot be removed.
+Only ONE external (non-builtin) provider is allowed at a time — attempting
+to register a second external provider is rejected with a warning.  This
 prevents tool schema bloat and conflicting memory backends.

 Usage in run_agent.py:
    self._memory_manager = MemoryManager()
+    self._memory_manager.add_provider(BuiltinMemoryProvider(...))
    # Only ONE of these:
    self._memory_manager.add_provider(plugin_provider)

@@ -1,16 +1,17 @@
 """Abstract base class for pluggable memory providers.

-Memory providers give the agent persistent recall across sessions.
-The MemoryManager enforces a one-external-provider limit to prevent
-tool schema bloat and conflicting memory backends.
+Memory providers give the agent persistent recall across sessions. One
+external provider is active at a time alongside the always-on built-in
+memory (MEMORY.md / USER.md). The MemoryManager enforces this limit.

-External providers (Honcho, Hindsight, Mem0, etc.) are registered
-and managed via MemoryManager. Only one external provider runs at a
-time.
+Built-in memory is always active as the first provider and cannot be removed.
+External providers (Honcho, Hindsight, Mem0, etc.) are additive — they never
+disable the built-in store. Only one external provider runs at a time to
+prevent tool schema bloat and conflicting memory backends.

 Registration:
-  Plugins ship in plugins/memory/<name>/ and are activated via
-  the memory.provider config key.
+  1. Built-in: BuiltinMemoryProvider — always present, not removable.
+  2. Plugins: Ship in plugins/memory/<name>/, activated by memory.provider config.

 Lifecycle (called by MemoryManager, wired in run_agent.py):
  initialize()          — connect, create resources, warm up
@@ -318,17 +318,6 @@ _URL_TO_PROVIDER: Dict[str, str] = {
    "ollama.com": "ollama-cloud",
 }

-# Auto-extend with hostnames derived from provider profiles.
-# Any provider with a base_url not already in the map gets added automatically.
-try:
-    from providers import list_providers as _list_providers
-    for _pp in _list_providers():
-        _host = _pp.get_hostname()
-        if _host and _host not in _URL_TO_PROVIDER:
-            _URL_TO_PROVIDER[_host] = _pp.name
-except Exception:
-    pass
-

 def _infer_provider_from_url(base_url: str) -> Optional[str]:
    """Infer the models.dev provider name from a base URL.
@@ -183,8 +183,8 @@ SKILLS_GUIDANCE = (
 )

 KANBAN_GUIDANCE = (
-    "# Kanban task execution protocol\n"
-    "You have been assigned ONE task from "
+    "# You are a Kanban worker\n"
+    "You were spawned by the Hermes Kanban dispatcher to execute ONE task from "
    "the shared board at `~/.hermes/kanban.db`. Your task id is in "
    "`$HERMES_KANBAN_TASK`; your workspace is `$HERMES_KANBAN_WORKSPACE`. "
    "The `kanban_*` tools in your schema are your primary coordination surface — "
@@ -513,12 +513,6 @@ PLATFORM_HINTS = {
        "image and is the WRONG path. Bare Unicode emoji in text is also not a substitute "
        "— when a sticker is the right response, use yb_send_sticker."
    ),
-    "api_server": (
-        "You're responding through an API server. The rendering layer is unknown — "
-        "assume plain text. No markdown formatting (no asterisks, bullets, headers, "
-        "code fences). Treat this like a conversation, not a document. Keep responses "
-        "brief and natural."
-    ),
 }

 # ---------------------------------------------------------------------------
@@ -305,18 +305,13 @@ def _redact_form_body(text: str) -> str:
    return _redact_query_string(text.strip())


-def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = False) -> str:
+def redact_sensitive_text(text: str, *, force: bool = False) -> str:
    """Apply all redaction patterns to a block of text.

    Safe to call on any string -- non-matching text passes through unchanged.
    Disabled by default — enable via security.redact_secrets: true in config.yaml.
    Set force=True for safety boundaries that must never return raw secrets
    regardless of the user's global logging redaction preference.
-
-    Set code_file=True to skip the ENV-assignment and JSON-field regex
-    patterns when the text is known to be source code (e.g. MAX_TOKENS=***
-    constants, "apiKey": "test" fixtures). Prefix patterns, auth headers,
-    private keys, DB connstrings, JWTs, and URL secrets are still redacted.
    """
    if text is None:
        return None
@@ -330,18 +325,17 @@ def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = F
    # Known prefixes (sk-, ghp_, etc.)
    text = _PREFIX_RE.sub(lambda m: _mask_token(m.group(1)), text)

-    # ENV assignments: OPENAI_API_KEY=***  (skip for code files — false positives)
-    if not code_file:
-        def _redact_env(m):
-            name, quote, value = m.group(1), m.group(2), m.group(3)
-            return f"{name}={quote}{_mask_token(value)}{quote}"
-        text = _ENV_ASSIGN_RE.sub(_redact_env, text)
+    # ENV assignments: OPENAI_API_KEY=sk-abc...
+    def _redact_env(m):
+        name, quote, value = m.group(1), m.group(2), m.group(3)
+        return f"{name}={quote}{_mask_token(value)}{quote}"
+    text = _ENV_ASSIGN_RE.sub(_redact_env, text)

-        # JSON fields: "apiKey": "***"  (skip for code files — false positives)
-        def _redact_json(m):
-            key, value = m.group(1), m.group(2)
-            return f'{key}: "{_mask_token(value)}"'
-        text = _JSON_FIELD_RE.sub(_redact_json, text)
+    # JSON fields: "apiKey": "value"
+    def _redact_json(m):
+        key, value = m.group(1), m.group(2)
+        return f'{key}: "{_mask_token(value)}"'
+    text = _JSON_FIELD_RE.sub(_redact_json, text)

    # Authorization headers
    text = _AUTH_HEADER_RE.sub(
@@ -6,7 +6,6 @@ can invoke skills via /skill-name commands.

 import json
 import logging
-import os
 import re
 from pathlib import Path
 from typing import Any, Dict, Optional
@@ -21,35 +20,10 @@ from agent.skill_preprocessing import (
 logger = logging.getLogger(__name__)

 _skill_commands: Dict[str, Dict[str, Any]] = {}
-_skill_commands_platform: Optional[str] = None
 # Patterns for sanitizing skill names into clean hyphen-separated slugs.
 _SKILL_INVALID_CHARS = re.compile(r"[^a-z0-9-]")
 _SKILL_MULTI_HYPHEN = re.compile(r"-{2,}")

-
-def _resolve_skill_commands_platform() -> Optional[str]:
-    """Return the current platform scope used for disabled-skill filtering.
-
-    Used to detect when the active platform has shifted so
-    :func:`get_skill_commands` can drop a stale cache that was populated
-    for a different platform's ``skills.platform_disabled`` view (#14536).
-
-    Resolves from (in order) ``HERMES_PLATFORM`` env var and
-    ``HERMES_SESSION_PLATFORM`` from the gateway session context. Returns
-    ``None`` when no platform scope is active (e.g. classic CLI, RL
-    rollouts, standalone scripts).
-    """
-    try:
-        from gateway.session_context import get_session_env
-
-        resolved_platform = (
-            os.getenv("HERMES_PLATFORM")
-            or get_session_env("HERMES_SESSION_PLATFORM")
-        )
-    except Exception:
-        resolved_platform = os.getenv("HERMES_PLATFORM")
-    return resolved_platform or None
-
 def _load_skill_payload(skill_identifier: str, task_id: str | None = None) -> tuple[dict[str, Any], Path | None, str] | None:
    """Load a skill by name/path and return (loaded_payload, skill_dir, display_name)."""
    raw_identifier = (skill_identifier or "").strip()
@@ -244,8 +218,7 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
    Returns:
        Dict mapping "/skill-name" to {name, description, skill_md_path, skill_dir}.
    """
-    global _skill_commands, _skill_commands_platform
-    _skill_commands_platform = _resolve_skill_commands_platform()
+    global _skill_commands
    _skill_commands = {}
    try:
        from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform, _get_disabled_skill_names
@@ -305,16 +278,8 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:


 def get_skill_commands() -> Dict[str, Dict[str, Any]]:
-    """Return the current skill commands mapping (scan first if empty).
-
-    Rescans when the active platform scope changes (e.g. a gateway
-    process serving Telegram and Discord concurrently) so each platform
-    sees its own ``skills.platform_disabled`` view (#14536).
-    """
-    if (
-        not _skill_commands
-        or _skill_commands_platform != _resolve_skill_commands_platform()
-    ):
+    """Return the current skill commands mapping (scan first if empty)."""
+    if not _skill_commands:
        scan_skill_commands()
    return _skill_commands

@@ -1,386 +0,0 @@
-"""Stateful scrubber for reasoning/thinking blocks in streamed assistant text.
-
-``run_agent._strip_think_blocks`` is regex-based and correct for a complete
-string, but when it runs *per-delta* in ``_fire_stream_delta`` it destroys
-the state that downstream consumers (CLI ``_stream_delta``, gateway
-``GatewayStreamConsumer._filter_and_accumulate``) rely on.
-
-Concretely, when MiniMax-M2.7 streams
-
-    delta1 = "<think>"
-    delta2 = "Let me check their config"
-    delta3 = "</think>"
-
-the per-delta regex erases delta1 entirely (case 2: unterminated-open at
-boundary matches ``^<think>...``), so the downstream state machine never
-sees the open tag, treats delta2 as regular content, and leaks reasoning
-to the user.  Consumers that don't run their own state machine (ACP,
-api_server, TTS) never had any defence at all — they just emitted
-whatever survived the upstream regex.
-
-This module centralises the tag-suppression state machine at the
-upstream layer so every stream_delta_callback sees text that has
-already had reasoning blocks removed.  Partial tags at delta
-boundaries are held back until the next delta resolves them, and
-end-of-stream flushing surfaces any held-back prose that turned out
-not to be a real tag.
-
-Usage::
-
-    scrubber = StreamingThinkScrubber()
-    for delta in stream:
-        visible = scrubber.feed(delta)
-        if visible:
-            emit(visible)
-    tail = scrubber.flush()  # at end of stream
-    if tail:
-        emit(tail)
-
-The scrubber is re-entrant per agent instance.  Call ``reset()`` at
-the top of each new turn so a hung block from an interrupted prior
-stream cannot taint the next turn's output.
-
-Tag variants handled (case-insensitive):
-  ``<think>``, ``<thinking>``, ``<reasoning>``, ``<thought>``,
-  ``<REASONING_SCRATCHPAD>``.
-
-Block-boundary rule for opens: an opening tag is only treated as a
-reasoning-block opener when it appears at the start of the stream,
-after a newline (optionally followed by whitespace), or when only
-whitespace has been emitted on the current line.  This prevents prose
-that *mentions* the tag name (e.g. ``"use <think> tags here"``) from
-being incorrectly suppressed.  Closed pairs (``<think>X</think>``) are
-always suppressed regardless of boundary; a closed pair is an
-intentional, bounded construct.
-"""
-
-from __future__ import annotations
-
-from typing import Tuple
-
-__all__ = ["StreamingThinkScrubber"]
-
-
-class StreamingThinkScrubber:
-    """Stateful scrubber for streaming reasoning/thinking blocks.
-
-    State machine:
-      - ``_in_block``: True while inside an opened block, waiting for
-        a close tag.  All text inside is discarded.
-      - ``_buf``: held-back partial-tag tail.  Emitted / discarded on
-        the next ``feed()`` call or by ``flush()``.
-      - ``_last_emitted_ended_newline``: True iff the most recent
-        emission to the consumer ended with ``\\n``, or nothing has
-        been emitted yet (start-of-stream counts as a boundary).  Used
-        to decide whether an open tag at buffer position 0 is at a
-        block boundary.
-    """
-
-    _OPEN_TAG_NAMES: Tuple[str, ...] = (
-        "think",
-        "thinking",
-        "reasoning",
-        "thought",
-        "REASONING_SCRATCHPAD",
-    )
-
-    # Materialise literal tag strings so the hot path does string
-    # operations, not regex compilation per feed().
-    _OPEN_TAGS: Tuple[str, ...] = tuple(f"<{name}>" for name in _OPEN_TAG_NAMES)
-    _CLOSE_TAGS: Tuple[str, ...] = tuple(f"</{name}>" for name in _OPEN_TAG_NAMES)
-
-    # Pre-compute the longest tag (for partial-tag hold-back bound).
-    _MAX_TAG_LEN: int = max(len(tag) for tag in _OPEN_TAGS + _CLOSE_TAGS)
-
-    def __init__(self) -> None:
-        self._in_block: bool = False
-        self._buf: str = ""
-        self._last_emitted_ended_newline: bool = True
-
-    def reset(self) -> None:
-        """Reset all state.  Call at the top of every new turn."""
-        self._in_block = False
-        self._buf = ""
-        self._last_emitted_ended_newline = True
-
-    def feed(self, text: str) -> str:
-        """Feed one delta; return the scrubbed visible portion.
-
-        May return an empty string when the entire delta is reasoning
-        content or is being held back pending resolution of a partial
-        tag at the boundary.
-        """
-        if not text:
-            return ""
-        buf = self._buf + text
-        self._buf = ""
-        out: list[str] = []
-
-        while buf:
-            if self._in_block:
-                # Hunt for the earliest close tag.
-                close_idx, close_len = self._find_first_tag(
-                    buf, self._CLOSE_TAGS,
-                )
-                if close_idx == -1:
-                    # No close yet — hold back a potential partial
-                    # close-tag prefix; discard everything else.
-                    held = self._max_partial_suffix(buf, self._CLOSE_TAGS)
-                    self._buf = buf[-held:] if held else ""
-                    return "".join(out)
-                # Found close: discard block content + tag, continue.
-                buf = buf[close_idx + close_len:]
-                self._in_block = False
-            else:
-                # Priority 1 — closed <tag>X</tag> pair anywhere in
-                # buf.  Closed pairs are always an intentional,
-                # bounded construct (even mid-line prose containing
-                # an open/close pair is almost certainly a model
-                # leaking reasoning inline), so no boundary gating.
-                pair = self._find_earliest_closed_pair(buf)
-                # Priority 2 — unterminated open tag at a block
-                # boundary.  Boundary-gated so prose that mentions
-                # '<think>' isn't over-stripped.
-                open_idx, open_len = self._find_open_at_boundary(
-                    buf, out,
-                )
-
-                # Pick whichever match comes earliest in the buffer.
-                if pair is not None and (
-                    open_idx == -1 or pair[0] <= open_idx
-                ):
-                    start_idx, end_idx = pair
-                    preceding = buf[:start_idx]
-                    if preceding:
-                        preceding = self._strip_orphan_close_tags(preceding)
-                        if preceding:
-                            out.append(preceding)
-                            self._last_emitted_ended_newline = (
-                                preceding.endswith("\n")
-                            )
-                    buf = buf[end_idx:]
-                    continue
-
-                if open_idx != -1:
-                    # Unterminated open at boundary — emit preceding,
-                    # enter block, continue loop with remainder.
-                    preceding = buf[:open_idx]
-                    if preceding:
-                        preceding = self._strip_orphan_close_tags(preceding)
-                        if preceding:
-                            out.append(preceding)
-                            self._last_emitted_ended_newline = (
-                                preceding.endswith("\n")
-                            )
-                    self._in_block = True
-                    buf = buf[open_idx + open_len:]
-                    continue
-
-                # No resolvable tag structure in buf.  Hold back any
-                # partial-tag prefix at the tail so a split tag
-                # across deltas isn't missed, then emit the rest.
-                held = self._max_partial_suffix(buf, self._OPEN_TAGS)
-                held_close = self._max_partial_suffix(
-                    buf, self._CLOSE_TAGS,
-                )
-                held = max(held, held_close)
-                if held:
-                    emit_text = buf[:-held]
-                    self._buf = buf[-held:]
-                else:
-                    emit_text = buf
-                    self._buf = ""
-                if emit_text:
-                    emit_text = self._strip_orphan_close_tags(emit_text)
-                    if emit_text:
-                        out.append(emit_text)
-                        self._last_emitted_ended_newline = (
-                            emit_text.endswith("\n")
-                        )
-                return "".join(out)
-
-        return "".join(out)
-
-    def flush(self) -> str:
-        """End-of-stream flush.
-
-        If still inside an unterminated block, held-back content is
-        discarded — leaking partial reasoning is worse than a
-        truncated answer.  Otherwise the held-back partial-tag tail is
-        emitted verbatim (it turned out not to be a real tag prefix).
-        """
-        if self._in_block:
-            self._buf = ""
-            self._in_block = False
-            return ""
-        tail = self._buf
-        self._buf = ""
-        if not tail:
-            return ""
-        tail = self._strip_orphan_close_tags(tail)
-        if tail:
-            self._last_emitted_ended_newline = tail.endswith("\n")
-        return tail
-
-    # ── internal helpers ───────────────────────────────────────────────
-
-    @staticmethod
-    def _find_first_tag(
-        buf: str, tags: Tuple[str, ...],
-    ) -> Tuple[int, int]:
-        """Return (earliest_index, tag_length) over *tags*, or (-1, 0).
-
-        Case-insensitive match.
-        """
-        buf_lower = buf.lower()
-        best_idx = -1
-        best_len = 0
-        for tag in tags:
-            idx = buf_lower.find(tag.lower())
-            if idx != -1 and (best_idx == -1 or idx < best_idx):
-                best_idx = idx
-                best_len = len(tag)
-        return best_idx, best_len
-
-    def _find_earliest_closed_pair(self, buf: str):
-        """Return (start_idx, end_idx) of the earliest closed pair, else None.
-
-        A closed pair is ``<tag>...</tag>`` of any variant.  Matches are
-        case-insensitive and non-greedy (the closest close tag after
-        an open tag wins), matching the regex ``<tag>.*?</tag>``
-        semantics of ``_strip_think_blocks`` case 1.  When two tag
-        variants could both match, the one whose open tag appears
-        earlier wins.
-        """
-        buf_lower = buf.lower()
-        best: "tuple[int, int] | None" = None
-        for open_tag, close_tag in zip(self._OPEN_TAGS, self._CLOSE_TAGS):
-            open_lower = open_tag.lower()
-            close_lower = close_tag.lower()
-            open_idx = buf_lower.find(open_lower)
-            if open_idx == -1:
-                continue
-            close_idx = buf_lower.find(
-                close_lower, open_idx + len(open_lower),
-            )
-            if close_idx == -1:
-                continue
-            end_idx = close_idx + len(close_lower)
-            if best is None or open_idx < best[0]:
-                best = (open_idx, end_idx)
-        return best
-
-    def _find_open_at_boundary(
-        self, buf: str, already_emitted: list[str],
-    ) -> Tuple[int, int]:
-        """Return the earliest block-boundary open-tag (idx, len).
-
-        Returns (-1, 0) if no boundary-legal opener is present.
-        """
-        buf_lower = buf.lower()
-        best_idx = -1
-        best_len = 0
-        for tag in self._OPEN_TAGS:
-            tag_lower = tag.lower()
-            search_start = 0
-            while True:
-                idx = buf_lower.find(tag_lower, search_start)
-                if idx == -1:
-                    break
-                if self._is_block_boundary(buf, idx, already_emitted):
-                    if best_idx == -1 or idx < best_idx:
-                        best_idx = idx
-                        best_len = len(tag)
-                    break  # first boundary hit for this tag is enough
-                search_start = idx + 1
-        return best_idx, best_len
-
-    def _is_block_boundary(
-        self, buf: str, idx: int, already_emitted: list[str],
-    ) -> bool:
-        """True iff position *idx* in *buf* is a block boundary.
-
-        A block boundary is:
-          - buf position 0 AND the most recent emission ended with
-            a newline (or nothing has been emitted yet)
-          - any position whose preceding text on the current line
-            (since the last newline in buf) is whitespace-only, AND
-            if there is no newline in the preceding buf portion, the
-            most recent prior emission ended with a newline
-        """
-        if idx == 0:
-            # Check whether the last already-emitted chunk in THIS
-            # feed() call ended with a newline, otherwise fall back
-            # to the cross-feed flag.
-            if already_emitted:
-                return already_emitted[-1].endswith("\n")
-            return self._last_emitted_ended_newline
-        preceding = buf[:idx]
-        last_nl = preceding.rfind("\n")
-        if last_nl == -1:
-            # No newline in buf before the tag — boundary only if the
-            # prior emission ended with a newline AND everything since
-            # is whitespace.
-            if already_emitted:
-                prior_newline = already_emitted[-1].endswith("\n")
-            else:
-                prior_newline = self._last_emitted_ended_newline
-            return prior_newline and preceding.strip() == ""
-        # Newline present — text between it and the tag must be
-        # whitespace-only.
-        return preceding[last_nl + 1:].strip() == ""
-
-    @classmethod
-    def _max_partial_suffix(
-        cls, buf: str, tags: Tuple[str, ...],
-    ) -> int:
-        """Return the longest buf-suffix that is a prefix of any tag.
-
-        Only prefixes strictly shorter than the tag itself count
-        (full-length suffixes are the tag and are handled as matches,
-        not held-back partials).  Case-insensitive.
-        """
-        if not buf:
-            return 0
-        buf_lower = buf.lower()
-        max_check = min(len(buf_lower), cls._MAX_TAG_LEN - 1)
-        for i in range(max_check, 0, -1):
-            suffix = buf_lower[-i:]
-            for tag in tags:
-                tag_lower = tag.lower()
-                if len(tag_lower) > i and tag_lower.startswith(suffix):
-                    return i
-        return 0
-
-    @classmethod
-    def _strip_orphan_close_tags(cls, text: str) -> str:
-        """Remove any close tags from *text* (orphan-close handling).
-
-        An orphan close tag has no matching open in the current
-        scrubber state; it's always noise, stripped with any trailing
-        whitespace so the surrounding prose flows naturally.
-        """
-        if "</" not in text:
-            return text
-        text_lower = text.lower()
-        out: list[str] = []
-        i = 0
-        while i < len(text):
-            matched = False
-            if text_lower[i:i + 2] == "</":
-                for tag in cls._CLOSE_TAGS:
-                    tag_lower = tag.lower()
-                    tag_len = len(tag_lower)
-                    if text_lower[i:i + tag_len] == tag_lower:
-                        # Skip the tag and any trailing whitespace,
-                        # matching _strip_think_blocks case 3.
-                        j = i + tag_len
-                        while j < len(text) and text[j] in " \t\n\r":
-                            j += 1
-                        i = j
-                        matched = True
-                        break
-            if not matched:
-                out.append(text[i])
-                i += 1
-        return "".join(out)
@@ -17,7 +17,6 @@ logger = logging.getLogger(__name__)
 # so silent-drops (e.g. OpenRouter 402 exhausting the fallback chain)
 # become visible instead of piling up as NULL session titles.
 FailureCallback = Callable[[str, BaseException], None]
-TitleCallback = Callable[[str], None]

 _TITLE_PROMPT = (
    "Generate a short, descriptive title (3-7 words) for a conversation that starts with the "
@@ -91,7 +90,6 @@ def auto_title_session(
    assistant_response: str,
    failure_callback: Optional[FailureCallback] = None,
    main_runtime: dict = None,
-    title_callback: Optional[TitleCallback] = None,
 ) -> None:
    """Generate and set a session title if one doesn't already exist.

@@ -121,11 +119,6 @@ def auto_title_session(
    try:
        session_db.set_session_title(session_id, title)
        logger.debug("Auto-generated session title: %s", title)
-        if title_callback is not None:
-            try:
-                title_callback(title)
-            except Exception:
-                logger.debug("Auto-title callback failed", exc_info=True)
    except Exception as e:
        logger.debug("Failed to set auto-generated title: %s", e)

@@ -138,7 +131,6 @@ def maybe_auto_title(
    conversation_history: list,
    failure_callback: Optional[FailureCallback] = None,
    main_runtime: dict = None,
-    title_callback: Optional[TitleCallback] = None,
 ) -> None:
    """Fire-and-forget title generation after the first exchange.

@@ -160,11 +152,7 @@ def maybe_auto_title(
    thread = threading.Thread(
        target=auto_title_session,
        args=(session_db, session_id, user_message, assistant_response),
-        kwargs={
-            "failure_callback": failure_callback,
-            "main_runtime": main_runtime,
-            "title_callback": title_callback,
-        },
+        kwargs={"failure_callback": failure_callback, "main_runtime": main_runtime},
        daemon=True,
        name="auto-title",
    )
@@ -6,16 +6,9 @@ Usage:
    result = transport.normalize_response(raw_response)
 """

-from agent.transports.types import (
-    NormalizedResponse,
-    ToolCall,
-    Usage,
-    build_tool_call,
-    map_finish_reason,
-)  # noqa: F401
+from agent.transports.types import NormalizedResponse, ToolCall, Usage, build_tool_call, map_finish_reason  # noqa: F401

 _REGISTRY: dict = {}
-_discovered: bool = False


 def register_transport(api_mode: str, transport_cls: type) -> None:
@@ -30,9 +23,6 @@ def get_transport(api_mode: str):
    This allows gradual migration — call sites can check for None
    and fall back to the legacy code path.
    """
-    global _discovered
-    if not _discovered:
-        _discover_transports()
    cls = _REGISTRY.get(api_mode)
    if cls is None:
        # The registry can be partially populated when a specific transport
@@ -48,8 +38,6 @@ def get_transport(api_mode: str):

 def _discover_transports() -> None:
    """Import all transport modules to trigger auto-registration."""
-    global _discovered
-    _discovered = True
    try:
        import agent.transports.anthropic  # noqa: F401
    except ImportError:
@@ -109,9 +109,7 @@ class ChatCompletionsTransport(ProviderTransport):
    def api_mode(self) -> str:
        return "chat_completions"

-    def convert_messages(
-        self, messages: list[dict[str, Any]], **kwargs
-    ) -> list[dict[str, Any]]:
+    def convert_messages(self, messages: List[Dict[str, Any]], **kwargs) -> List[Dict[str, Any]]:
        """Messages are already in OpenAI format — sanitize Codex leaks only.

        Strips Codex Responses API fields (``codex_reasoning_items`` /
@@ -128,9 +126,7 @@ class ChatCompletionsTransport(ProviderTransport):
            tool_calls = msg.get("tool_calls")
            if isinstance(tool_calls, list):
                for tc in tool_calls:
-                    if isinstance(tc, dict) and (
-                        "call_id" in tc or "response_item_id" in tc
-                    ):
+                    if isinstance(tc, dict) and ("call_id" in tc or "response_item_id" in tc):
                        needs_sanitize = True
                        break
                if needs_sanitize:
@@ -153,41 +149,39 @@ class ChatCompletionsTransport(ProviderTransport):
                        tc.pop("response_item_id", None)
        return sanitized

-    def convert_tools(self, tools: list[dict[str, Any]]) -> list[dict[str, Any]]:
+    def convert_tools(self, tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """Tools are already in OpenAI format — identity."""
        return tools

    def build_kwargs(
        self,
        model: str,
-        messages: list[dict[str, Any]],
-        tools: list[dict[str, Any]] | None = None,
+        messages: List[Dict[str, Any]],
+        tools: Optional[List[Dict[str, Any]]] = None,
        **params,
-    ) -> dict[str, Any]:
+    ) -> Dict[str, Any]:
        """Build chat.completions.create() kwargs.

-        params (all optional):
+        This is the most complex transport method — it handles ~16 providers
+        via params rather than subclasses.
+
+        params:
            timeout: float — API call timeout
            max_tokens: int | None — user-configured max tokens
-            ephemeral_max_output_tokens: int | None — one-shot override
+            ephemeral_max_output_tokens: int | None — one-shot override (error recovery)
            max_tokens_param_fn: callable — returns {max_tokens: N} or {max_completion_tokens: N}
            reasoning_config: dict | None
            request_overrides: dict | None
            session_id: str | None
+            qwen_session_metadata: dict | None — {sessionId, promptId} precomputed
            model_lower: str — lowercase model name for pattern matching
-            # Provider profile path (all per-provider quirks live in providers/)
-            provider_profile: ProviderProfile | None — when present, delegates to
-                _build_kwargs_from_profile(); all flag params below are bypassed.
-            # Legacy-path flags — only used when provider_profile is None
-            # (i.e. custom / unregistered providers). Known providers all go
-            # through provider_profile.
+            # Provider detection flags (all optional, default False)
            is_openrouter: bool
            is_nous: bool
            is_qwen_portal: bool
            is_github_models: bool
            is_nvidia_nim: bool
            is_kimi: bool
-            is_tokenhub: bool
            is_lmstudio: bool
            is_custom_provider: bool
            ollama_num_ctx: int | None
@@ -196,7 +190,6 @@ class ChatCompletionsTransport(ProviderTransport):
            # Qwen-specific
            qwen_prepare_fn: callable | None — runs AFTER codex sanitization
            qwen_prepare_inplace_fn: callable | None — in-place variant for deepcopied lists
-            qwen_session_metadata: dict | None
            # Temperature
            fixed_temperature: Any — from _fixed_temperature_for_model()
            omit_temperature: bool
@@ -206,21 +199,28 @@ class ChatCompletionsTransport(ProviderTransport):
            lmstudio_reasoning_options: list[str] | None  # raw allowed_options from /api/v1/models
            # Claude on OpenRouter/Nous max output
            anthropic_max_output: int | None
-            extra_body_additions: dict | None
+            # Extra
+            extra_body_additions: dict | None — pre-built extra_body entries
        """
        # Codex sanitization: drop reasoning_items / call_id / response_item_id
        sanitized = self.convert_messages(messages)

-        # ── Provider profile: single-path when present ──────────────────
-        _profile = params.get("provider_profile")
-        if _profile:
-            return self._build_kwargs_from_profile(
-                _profile, model, sanitized, tools, params
-            )
-
-        # ── Legacy fallback (unregistered / unknown provider) ───────────
-        # Reached only when get_provider_profile() returned None.
-        # Known providers always go through the profile path above.
+        # Qwen portal prep AFTER codex sanitization.  If sanitize already
+        # deepcopied, reuse that copy via the in-place variant to avoid a
+        # second deepcopy.
+        is_qwen = params.get("is_qwen_portal", False)
+        if is_qwen:
+            qwen_prep = params.get("qwen_prepare_fn")
+            qwen_prep_inplace = params.get("qwen_prepare_inplace_fn")
+            if sanitized is messages:
+                if qwen_prep is not None:
+                    sanitized = qwen_prep(sanitized)
+            else:
+                # Already deepcopied — transform in place
+                if qwen_prep_inplace is not None:
+                    qwen_prep_inplace(sanitized)
+                elif qwen_prep is not None:
+                    sanitized = qwen_prep(sanitized)

        # Developer role swap for GPT-5/Codex models
        model_lower = params.get("model_lower", (model or "").lower())
@@ -233,7 +233,7 @@ class ChatCompletionsTransport(ProviderTransport):
            sanitized = list(sanitized)
            sanitized[0] = {**sanitized[0], "role": "developer"}

-        api_kwargs: dict[str, Any] = {
+        api_kwargs: Dict[str, Any] = {
            "model": model,
            "messages": sanitized,
        }
@@ -242,6 +242,19 @@ class ChatCompletionsTransport(ProviderTransport):
        if timeout is not None:
            api_kwargs["timeout"] = timeout

+        # Temperature
+        fixed_temp = params.get("fixed_temperature")
+        omit_temp = params.get("omit_temperature", False)
+        if omit_temp:
+            api_kwargs.pop("temperature", None)
+        elif fixed_temp is not None:
+            api_kwargs["temperature"] = fixed_temp
+
+        # Qwen metadata (caller precomputes {sessionId, promptId})
+        qwen_meta = params.get("qwen_session_metadata")
+        if qwen_meta and is_qwen:
+            api_kwargs["metadata"] = qwen_meta
+
        # Tools
        if tools:
            # Moonshot/Kimi uses a stricter flavored JSON Schema.  Rewriting
@@ -265,6 +278,13 @@ class ChatCompletionsTransport(ProviderTransport):
            api_kwargs.update(max_tokens_fn(ephemeral))
        elif max_tokens is not None and max_tokens_fn:
            api_kwargs.update(max_tokens_fn(max_tokens))
+        elif is_nvidia_nim and max_tokens_fn:
+            api_kwargs.update(max_tokens_fn(16384))
+        elif is_qwen and max_tokens_fn:
+            api_kwargs.update(max_tokens_fn(65536))
+        elif is_kimi and max_tokens_fn:
+            # Kimi/Moonshot: 32000 matches Kimi CLI's default
+            api_kwargs.update(max_tokens_fn(32000))
        elif anthropic_max_out is not None:
            api_kwargs["max_tokens"] = anthropic_max_out

@@ -311,7 +331,7 @@ class ChatCompletionsTransport(ProviderTransport):
                api_kwargs["reasoning_effort"] = _lm_effort

        # extra_body assembly
-        extra_body: dict[str, Any] = {}
+        extra_body: Dict[str, Any] = {}

        is_openrouter = params.get("is_openrouter", False)
        is_nous = params.get("is_nous", False)
@@ -341,7 +361,35 @@ class ChatCompletionsTransport(ProviderTransport):
                if gh_reasoning is not None:
                    extra_body["reasoning"] = gh_reasoning
            else:
-                extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
+                if reasoning_config is not None:
+                    rc = dict(reasoning_config)
+                    if is_nous and rc.get("enabled") is False:
+                        pass  # omit for Nous when disabled
+                    else:
+                        extra_body["reasoning"] = rc
+                else:
+                    extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
+
+        if is_nous:
+            extra_body["tags"] = ["product=hermes-agent"]
+
+        # Ollama num_ctx
+        ollama_ctx = params.get("ollama_num_ctx")
+        if ollama_ctx:
+            options = extra_body.get("options", {})
+            options["num_ctx"] = ollama_ctx
+            extra_body["options"] = options
+
+        # Ollama/custom think=false
+        if params.get("is_custom_provider", False):
+            if reasoning_config and isinstance(reasoning_config, dict):
+                _effort = (reasoning_config.get("effort") or "").strip().lower()
+                _enabled = reasoning_config.get("enabled", True)
+                if _effort == "none" or _enabled is False:
+                    extra_body["think"] = False
+
+        if is_qwen:
+            extra_body["vl_high_resolution_images"] = True

        if provider_name == "gemini":
            raw_thinking_config = _build_gemini_thinking_config(model, reasoning_config)
@@ -375,120 +423,6 @@ class ChatCompletionsTransport(ProviderTransport):

        return api_kwargs

-    def _build_kwargs_from_profile(self, profile, model, sanitized, tools, params):
-        """Build API kwargs using a ProviderProfile — single path, no legacy flags.
-
-        This method replaces the entire flag-based kwargs assembly when a
-        provider_profile is passed. Every quirk comes from the profile object.
-        """
-        from providers.base import OMIT_TEMPERATURE
-
-        # Message preprocessing
-        sanitized = profile.prepare_messages(sanitized)
-
-        # Developer role swap — model-name-based, applies to all providers
-        _model_lower = (model or "").lower()
-        if (
-            sanitized
-            and isinstance(sanitized[0], dict)
-            and sanitized[0].get("role") == "system"
-            and any(p in _model_lower for p in DEVELOPER_ROLE_MODELS)
-        ):
-            sanitized = list(sanitized)
-            sanitized[0] = {**sanitized[0], "role": "developer"}
-
-        api_kwargs: dict[str, Any] = {
-            "model": model,
-            "messages": sanitized,
-        }
-
-        # Temperature
-        if profile.fixed_temperature is OMIT_TEMPERATURE:
-            pass  # Don't include temperature at all
-        elif profile.fixed_temperature is not None:
-            api_kwargs["temperature"] = profile.fixed_temperature
-        else:
-            # Use caller's temperature if provided
-            temp = params.get("temperature")
-            if temp is not None:
-                api_kwargs["temperature"] = temp
-
-        # Timeout
-        timeout = params.get("timeout")
-        if timeout is not None:
-            api_kwargs["timeout"] = timeout
-
-        # Tools — apply Moonshot/Kimi schema sanitization regardless of path
-        if tools:
-            if is_moonshot_model(model):
-                tools = sanitize_moonshot_tools(tools)
-            api_kwargs["tools"] = tools
-
-        # max_tokens resolution — priority: ephemeral > user > profile default
-        max_tokens_fn = params.get("max_tokens_param_fn")
-        ephemeral = params.get("ephemeral_max_output_tokens")
-        user_max = params.get("max_tokens")
-        anthropic_max = params.get("anthropic_max_output")
-
-        if ephemeral is not None and max_tokens_fn:
-            api_kwargs.update(max_tokens_fn(ephemeral))
-        elif user_max is not None and max_tokens_fn:
-            api_kwargs.update(max_tokens_fn(user_max))
-        elif profile.default_max_tokens and max_tokens_fn:
-            api_kwargs.update(max_tokens_fn(profile.default_max_tokens))
-        elif anthropic_max is not None:
-            api_kwargs["max_tokens"] = anthropic_max
-
-        # Provider-specific api_kwargs extras (reasoning_effort, metadata, etc.)
-        reasoning_config = params.get("reasoning_config")
-        extra_body_from_profile, top_level_from_profile = (
-            profile.build_api_kwargs_extras(
-                reasoning_config=reasoning_config,
-                supports_reasoning=params.get("supports_reasoning", False),
-                qwen_session_metadata=params.get("qwen_session_metadata"),
-                model=model,
-                ollama_num_ctx=params.get("ollama_num_ctx"),
-            )
-        )
-        api_kwargs.update(top_level_from_profile)
-
-        # extra_body assembly
-        extra_body: dict[str, Any] = {}
-
-        # Profile's extra_body (tags, provider prefs, vl_high_resolution, etc.)
-        profile_body = profile.build_extra_body(
-            session_id=params.get("session_id"),
-            provider_preferences=params.get("provider_preferences"),
-            model=model,
-            base_url=params.get("base_url"),
-            reasoning_config=reasoning_config,
-        )
-        if profile_body:
-            extra_body.update(profile_body)
-
-        # Profile's reasoning/thinking extra_body entries
-        if extra_body_from_profile:
-            extra_body.update(extra_body_from_profile)
-
-        # Merge any pre-built extra_body additions from the caller
-        additions = params.get("extra_body_additions")
-        if additions:
-            extra_body.update(additions)
-
-        # Request overrides (user config)
-        overrides = params.get("request_overrides")
-        if overrides:
-            for k, v in overrides.items():
-                if k == "extra_body" and isinstance(v, dict):
-                    extra_body.update(v)
-                else:
-                    api_kwargs[k] = v
-
-        if extra_body:
-            api_kwargs["extra_body"] = extra_body
-
-        return api_kwargs
-
    def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:
        """Normalize OpenAI ChatCompletion to NormalizedResponse.

@@ -510,7 +444,7 @@ class ChatCompletionsTransport(ProviderTransport):
                # Gemini 3 thinking models attach extra_content with
                # thought_signature — without replay on the next turn the API
                # rejects the request with 400.
-                tc_provider_data: dict[str, Any] = {}
+                tc_provider_data: Dict[str, Any] = {}
                extra = getattr(tc, "extra_content", None)
                if extra is None and hasattr(tc, "model_extra"):
                    extra = (tc.model_extra or {}).get("extra_content")
@@ -521,14 +455,12 @@ class ChatCompletionsTransport(ProviderTransport):
                        except Exception:
                            pass
                    tc_provider_data["extra_content"] = extra
-                tool_calls.append(
-                    ToolCall(
-                        id=tc.id,
-                        name=tc.function.name,
-                        arguments=tc.function.arguments,
-                        provider_data=tc_provider_data or None,
-                    )
-                )
+                tool_calls.append(ToolCall(
+                    id=tc.id,
+                    name=tc.function.name,
+                    arguments=tc.function.arguments,
+                    provider_data=tc_provider_data or None,
+                ))

        usage = None
        if hasattr(response, "usage") and response.usage:
@@ -576,7 +508,7 @@ class ChatCompletionsTransport(ProviderTransport):
            return False
        return True

-    def extract_cache_stats(self, response: Any) -> dict[str, int] | None:
+    def extract_cache_stats(self, response: Any) -> Optional[Dict[str, int]]:
        """Extract OpenRouter/OpenAI cache stats from prompt_tokens_details."""
        usage = getattr(response, "usage", None)
        if usage is None:
@@ -143,18 +143,7 @@ class ResponsesApiTransport(ProviderTransport):
            kwargs["max_output_tokens"] = max_tokens

        if is_xai_responses and session_id:
-            existing_extra_headers = kwargs.get("extra_headers")
-            merged_extra_headers: Dict[str, str] = {}
-            if isinstance(existing_extra_headers, dict):
-                merged_extra_headers.update(
-                    {
-                        str(key): str(value)
-                        for key, value in existing_extra_headers.items()
-                        if key and value is not None
-                    }
-                )
-            merged_extra_headers["x-grok-conv-id"] = session_id
-            kwargs["extra_headers"] = merged_extra_headers
+            kwargs["extra_headers"] = {"x-grok-conv-id": session_id}

        return kwargs

@@ -12,7 +12,7 @@ from __future__ import annotations

 import json
 from dataclasses import dataclass, field
-from typing import Any
+from typing import Any, Dict, List, Optional


@dataclass
@@ -32,10 +32,10 @@ class ToolCall:
    * Others: ``None``
    """

-    id: str | None
+    id: Optional[str]
    name: str
    arguments: str  # JSON string
-    provider_data: dict[str, Any] | None = field(default=None, repr=False)
+    provider_data: Optional[Dict[str, Any]] = field(default=None, repr=False)

    # ── Backward compatibility ──────────────────────────────────
    # The agent loop reads tc.function.name / tc.function.arguments
@@ -47,17 +47,17 @@ class ToolCall:
        return "function"

    @property
-    def function(self) -> ToolCall:
+    def function(self) -> "ToolCall":
        """Return self so tc.function.name / tc.function.arguments work."""
        return self

    @property
-    def call_id(self) -> str | None:
+    def call_id(self) -> Optional[str]:
        """Codex call_id from provider_data, accessed via getattr by _build_assistant_message."""
        return (self.provider_data or {}).get("call_id")

    @property
-    def response_item_id(self) -> str | None:
+    def response_item_id(self) -> Optional[str]:
        """Codex response_item_id from provider_data."""
        return (self.provider_data or {}).get("response_item_id")

@@ -101,18 +101,18 @@ class NormalizedResponse:
    * Others: ``None``
    """

-    content: str | None
-    tool_calls: list[ToolCall] | None
+    content: Optional[str]
+    tool_calls: Optional[List[ToolCall]]
    finish_reason: str  # "stop", "tool_calls", "length", "content_filter"
-    reasoning: str | None = None
-    usage: Usage | None = None
-    provider_data: dict[str, Any] | None = field(default=None, repr=False)
+    reasoning: Optional[str] = None
+    usage: Optional[Usage] = None
+    provider_data: Optional[Dict[str, Any]] = field(default=None, repr=False)

    # ── Backward compatibility ──────────────────────────────────
    # The shim _nr_to_assistant_message() mapped these from provider_data.
    # These properties let NormalizedResponse pass through directly.
    @property
-    def reasoning_content(self) -> str | None:
+    def reasoning_content(self) -> Optional[str]:
        pd = self.provider_data or {}
        return pd.get("reasoning_content")

@@ -136,9 +136,8 @@ class NormalizedResponse:
 # Factory helpers
 # ---------------------------------------------------------------------------

-
 def build_tool_call(
-    id: str | None,
+    id: Optional[str],
    name: str,
    arguments: Any,
    **provider_fields: Any,
@@ -152,7 +151,7 @@ def build_tool_call(
    return ToolCall(id=id, name=name, arguments=args_str, provider_data=pd)


-def map_finish_reason(reason: str | None, mapping: dict[str, str]) -> str:
+def map_finish_reason(reason: Optional[str], mapping: Dict[str, str]) -> str:
    """Translate a provider-specific stop reason to the normalised set.

    Falls back to ``"stop"`` for unknown or ``None`` reasons.
@@ -121,18 +121,6 @@ model:
 #   # Data policy: "allow" (default) or "deny" to exclude providers that may store data
 #   # data_collection: "deny"

-# =============================================================================
-# OpenRouter Response Caching (only applies when using OpenRouter)
-# =============================================================================
-# Cache identical API responses at the OpenRouter edge for free instant replays.
-# When enabled, identical requests (same model, messages, parameters) return
-# cached responses with zero billing. Separate from Anthropic prompt caching.
-# See: https://openrouter.ai/docs/guides/features/response-caching
-#
-# openrouter:
-#   response_cache: true         # Enable response caching (default: true)
-#   response_cache_ttl: 300      # Cache TTL in seconds, 1-86400 (default: 300)
-
 # =============================================================================
 # Git Worktree Isolation
 # =============================================================================
@@ -459,19 +459,32 @@ def load_cli_config() -> Dict[str, Any]:
    if "backend" in terminal_config:
        terminal_config["env_type"] = terminal_config["backend"]
    
-    # CWD resolution for CLI/TUI. The gateway has its own config bridge in
-    # gateway/run.py but may lazily import cli.py (triggering this code).
-    # Local backend: always os.getcwd(). Use `cd /dir && hermes` to control it.
-    # Non-local with placeholder: pop so terminal_tool uses its per-backend default.
-    # Non-local with explicit path: keep as-is.
+    # Handle special cwd values: "." or "auto" means use current working directory.
+    # Only resolve to the host's CWD for the local backend where the host
+    # filesystem is directly accessible.  For ALL remote/container backends
+    # (ssh, docker, modal, singularity), the host path doesn't exist on the
+    # target -- remove the key so terminal_tool.py uses its per-backend default.
+    #
+    # GUARD: If TERMINAL_CWD is already set to a real absolute path (by the
+    # gateway's config bridge earlier in the process), don't clobber it.
+    # This prevents a lazy import of cli.py during gateway runtime from
+    # rewriting TERMINAL_CWD to the service's working directory.
+    # See issue #10817.
    _CWD_PLACEHOLDERS = (".", "auto", "cwd")
-    effective_backend = terminal_config.get("env_type", "local")
-
-    if effective_backend == "local":
-        terminal_config["cwd"] = os.getcwd()
-        defaults["terminal"]["cwd"] = terminal_config["cwd"]
-    elif terminal_config.get("cwd") in _CWD_PLACEHOLDERS:
-        terminal_config.pop("cwd", None)
+    if terminal_config.get("cwd") in _CWD_PLACEHOLDERS:
+        _existing_cwd = os.environ.get("TERMINAL_CWD", "")
+        if _existing_cwd and _existing_cwd not in _CWD_PLACEHOLDERS and os.path.isabs(_existing_cwd):
+            # Gateway (or earlier startup) already resolved a real path — keep it
+            terminal_config["cwd"] = _existing_cwd
+            defaults["terminal"]["cwd"] = _existing_cwd
+        else:
+            effective_backend = terminal_config.get("env_type", "local")
+            if effective_backend == "local":
+                terminal_config["cwd"] = os.getcwd()
+                defaults["terminal"]["cwd"] = terminal_config["cwd"]
+            else:
+                # Remove so TERMINAL_CWD stays unset → tool picks backend default
+                terminal_config.pop("cwd", None)
    
    env_mappings = {
        "env_type": "TERMINAL_ENV",
@@ -504,18 +517,13 @@ def load_cli_config() -> Dict[str, Any]:
        "sudo_password": "SUDO_PASSWORD",
    }
    
-    # Bridge config → env vars for terminal_tool. TERMINAL_CWD is force-exported
-    # UNLESS we're inside a gateway process (detected by _HERMES_GATEWAY marker)
-    # where it was already set correctly by gateway/run.py's config bridge.
-    _is_gateway = os.environ.get("_HERMES_GATEWAY") == "1"
+    # Apply config values to env vars so terminal_tool picks them up.
+    # If the config file explicitly has a [terminal] section, those values are
+    # authoritative and override any .env settings.  When using defaults only
+    # (no config file or no terminal section), don't overwrite env vars that
+    # were already set by .env -- the user's .env is the fallback source.
    for config_key, env_var in env_mappings.items():
        if config_key in terminal_config:
-            if env_var == "TERMINAL_CWD":
-                if _is_gateway:
-                    continue
-                # CLI: always export (overrides stale .env or inherited values)
-                os.environ[env_var] = str(terminal_config[config_key])
-                continue
            if _file_has_terminal_config or env_var not in os.environ:
                val = terminal_config[config_key]
                if isinstance(val, list):
@@ -940,18 +948,6 @@ def _run_state_db_auto_maintenance(session_db) -> None:
        except Exception as _prune_exc:
            logger.debug("Ghost session prune skipped: %s", _prune_exc)

-        # One-time finalize of orphaned compression continuations (#20001).
-        try:
-            if not session_db.get_meta("orphaned_compression_finalize_v1"):
-                finalized = session_db.finalize_orphaned_compression_sessions()
-                session_db.set_meta("orphaned_compression_finalize_v1", "1")
-                if finalized:
-                    logger.info(
-                        "Finalized %d orphaned compression sessions", finalized
-                    )
-        except Exception as _finalize_exc:
-            logger.debug("Orphan compression finalize skipped: %s", _finalize_exc)
-
        cfg = (_load_full_config().get("sessions") or {})
        if not cfg.get("auto_prune", False):
            return
@@ -1238,28 +1234,6 @@ def _strip_markdown_syntax(text: str) -> str:
    return plain.strip("\n")


-_WINDOWS_PATH_WITH_DOT_SEGMENT_RE = re.compile(
-    r"(?i)(?:\b[a-z]:\\|\\\\)[^\s`]*\\\.[^\s`]*"
-)
-
-
-def _preserve_windows_dot_segments_for_markdown(text: str) -> str:
-    r"""Keep Windows path separators before hidden directories in Markdown.
-
-    CommonMark treats ``\.`` as an escaped literal dot, so Rich Markdown would
-    render ``D:\repo\.ai`` as ``D:\repo.ai``.  Doubling only that separator
-    inside Windows path-looking tokens preserves the path without changing
-    ordinary markdown escapes like ``1\. not a list``.
-    """
-    if "\\." not in text:
-        return text
-
-    def _protect(match: re.Match[str]) -> str:
-        return re.sub(r"(?<!\\)\\(?=\.)", r"\\\\", match.group(0))
-
-    return _WINDOWS_PATH_WITH_DOT_SEGMENT_RE.sub(_protect, text)
-
-
 def _render_final_assistant_content(text: str, mode: str = "render"):
    """Render final assistant content as markdown, stripped text, or raw text."""
    from rich.markdown import Markdown
@@ -1271,7 +1245,6 @@ def _render_final_assistant_content(text: str, mode: str = "render"):
        return _rich_text_from_ansi(text or "")

    plain = _rich_text_from_ansi(text or "").plain
-    plain = _preserve_windows_dot_segments_for_markdown(plain)
    return Markdown(plain)


@@ -1540,10 +1513,6 @@ def _detect_file_drop(user_input: str) -> "dict | None":
        or stripped.startswith('"~')
        or stripped.startswith("'/")
        or stripped.startswith("'~")
-        or stripped.startswith('"./')
-        or stripped.startswith('"../')
-        or stripped.startswith("'./")
-        or stripped.startswith("'../")
        or (len(stripped) >= 4 and stripped[0] in ("'", '"') and stripped[2] == ":" and stripped[3] in ("\\", "/") and stripped[1].isalpha())
    )
    if not starts_like_path:
@@ -1902,8 +1871,8 @@ _skill_commands = scan_skill_commands()
 def _get_plugin_cmd_handler_names() -> set:
    """Return plugin command names (without slash prefix) for dispatch matching."""
    try:
-        from hermes_cli.plugins import get_plugin_commands
-        return set(get_plugin_commands().keys())
+        from hermes_cli.plugins import get_plugin_manager
+        return set(get_plugin_manager()._plugin_commands.keys())
    except Exception:
        return set()

@@ -2157,10 +2126,7 @@ class HermesCLI:
        elif CLI_CONFIG.get("max_turns"):  # Backwards compat: root-level max_turns
            self.max_turns = CLI_CONFIG["max_turns"]
        elif os.getenv("HERMES_MAX_ITERATIONS"):
-            try:
-                self.max_turns = int(os.getenv("HERMES_MAX_ITERATIONS", ""))
-            except (TypeError, ValueError):
-                self.max_turns = 90
+            self.max_turns = int(os.getenv("HERMES_MAX_ITERATIONS"))
        else:
            self.max_turns = 90
        
@@ -2594,59 +2560,23 @@ class HermesCLI:
            return f"  {txt}  ({elapsed_str})"
        return f"  {txt}"

-    def _voice_record_key_label(self) -> str:
-        """Return the configured voice push-to-talk key formatted for UI.
-
-        Shared helper so every voice-facing status line / placeholder /
-        recording hint advertises the SAME label as the registered
-        prompt_toolkit binding.
-
-        Cached at startup (see ``set_voice_record_key_cache``) rather
-        than re-read per render. Two reasons (Copilot round-13 on
-        #19835):
-
-        * The prompt_toolkit binding is registered once at session
-          start via ``@kb.add(_voice_key)``; re-reading config per
-          render meant the status bar could advertise a new shortcut
-          after a config edit while the actual binding was still the
-          startup chord — exactly the display/binding drift this PR
-          is trying to eliminate.
-        * The label is on the hot render path (status bar + composer
-          placeholder invalidated every 150ms during recording), so
-          reading config on every call added avoidable UI overhead.
-        """
-        return getattr(self, "_voice_record_key_display_cache", None) or "Ctrl+B"
-
-    def set_voice_record_key_cache(self, raw_key: object) -> None:
-        """Populate the voice label cache from a raw ``voice.record_key``.
-
-        Called at CLI startup after the prompt_toolkit binding is
-        registered so the cached label always matches the live binding.
-        """
-        try:
-            from hermes_cli.voice import format_voice_record_key_for_status
-            self._voice_record_key_display_cache = format_voice_record_key_for_status(raw_key)
-        except Exception:
-            self._voice_record_key_display_cache = "Ctrl+B"
-
    def _get_voice_status_fragments(self, width: Optional[int] = None):
        """Return the voice status bar fragments for the interactive TUI."""
        width = width or self._get_tui_terminal_width()
        compact = self._use_minimal_tui_chrome(width=width)
-        label = self._voice_record_key_label()
        if self._voice_recording:
            if compact:
                return [("class:voice-status-recording", " ● REC ")]
-            return [("class:voice-status-recording", f" ● REC  {label} to stop ")]
+            return [("class:voice-status-recording", " ● REC  Ctrl+B to stop ")]
        if self._voice_processing:
            if compact:
                return [("class:voice-status", " ◉ STT ")]
            return [("class:voice-status", " ◉ Transcribing... ")]
        if compact:
-            return [("class:voice-status", f" 🎤 {label} ")]
+            return [("class:voice-status", " 🎤 Ctrl+B ")]
        tts = " | TTS on" if self._voice_tts else ""
        cont = " | Continuous" if self._voice_continuous else ""
-        return [("class:voice-status", f" 🎤 Voice mode{tts}{cont}  —  {label} to record ")]
+        return [("class:voice-status", f" 🎤 Voice mode{tts}{cont}  —  Ctrl+B to record ")]

    def _build_status_bar_text(self, width: Optional[int] = None) -> str:
        """Return a compact one-line session status string for the TUI footer."""
@@ -2998,14 +2928,7 @@ class HermesCLI:

        def _expand_ref(match):
            path = Path(match.group(1))
-            # Use try/except instead of path.exists() to avoid TOCTOU race:
-            # the paste file may be deleted between check and read, causing
-            # the input to be silently dropped (#17666).
-            try:
-                return path.read_text(encoding="utf-8")
-            except (OSError, IOError):
-                logger.warning("Paste file gone or unreadable, returning placeholder: %s", path)
-                return match.group(0)
+            return path.read_text(encoding="utf-8") if path.exists() else match.group(0)

        return paste_ref_re.sub(_expand_ref, text)

@@ -4989,6 +4912,40 @@ class HermesCLI:

        flush_tool_summary()
        print()
+
+    def _handle_recap_command(self) -> None:
+        """Show a compact recap of recent activity in this session.
+
+        Inspired by Claude Code's ``/recap`` (v2.1.114, April 2026) — useful
+        when running multiple sessions simultaneously and returning to one
+        after a while. Purely local; no LLM call, no token cost, no cache
+        invalidation.
+        """
+        try:
+            from hermes_cli.session_recap import build_recap
+        except Exception as exc:  # pragma: no cover - defensive
+            print(f"  (recap unavailable: {exc})")
+            return
+
+        title = None
+        try:
+            if self._session_db and self.session_id:
+                row = self._session_db.get_session(self.session_id)
+                if row:
+                    title = row.get("title") or None
+        except Exception:
+            title = None
+
+        text = build_recap(
+            self.conversation_history or [],
+            session_title=title,
+            session_id=self.session_id,
+            platform="cli",
+        )
+        print()
+        for line in text.splitlines():
+            print(line)
+        print()
    
    def _notify_session_boundary(self, event_type: str) -> None:
        """Fire a session-boundary plugin hook (on_session_finalize or on_session_reset).
@@ -5006,7 +4963,7 @@ class HermesCLI:
        except Exception:
            pass

-    def new_session(self, silent=False, title=None):
+    def new_session(self, silent=False):
        """Start a fresh session with a new session ID and cleared agent state."""
        if self.agent and self.conversation_history:
            # Trigger memory extraction on the old session before session_id rotates.
@@ -5061,28 +5018,6 @@ class HermesCLI:
                    self.agent._session_db_created = True
                except Exception:
                    pass
-                if title and self._session_db:
-                    from hermes_state import SessionDB
-                    try:
-                        sanitized = SessionDB.sanitize_title(title)
-                    except ValueError as e:
-                        _cprint(f"  Title rejected: {e}")
-                        sanitized = None
-                        title = None
-                    if sanitized:
-                        try:
-                            self._session_db.set_session_title(self.session_id, sanitized)
-                            self._pending_title = None
-                            title = sanitized
-                        except ValueError as e:
-                            _cprint(f"  {e} — session started untitled.")
-                            title = None
-                        except Exception:
-                            title = None
-                    elif title is not None:
-                        # sanitize_title returned empty (whitespace-only / unprintable)
-                        _cprint("  Title is empty after cleanup — session started untitled.")
-                        title = None
            # Notify memory providers that session_id rotated to a fresh
            # conversation. reset=True signals providers to flush accumulated
            # per-session state (_session_turns, _turn_counter, _document_id).
@@ -5102,10 +5037,7 @@ class HermesCLI:
            self._notify_session_boundary("on_session_reset")

        if not silent:
-            if title:
-                print(f"(^_^)v New session started: {title}")
-            else:
-                print("(^_^)v New session started!")
+            print("(^_^)v New session started!")

    def _handle_resume_command(self, cmd_original: str) -> None:
        """Handle /resume <session_id_or_title> — switch to a previous session mid-conversation."""
@@ -6381,7 +6313,7 @@ class HermesCLI:
        _cmd_def = _resolve_cmd(_base_word)
        canonical = _cmd_def.name if _cmd_def else _base_word
        
-        if canonical in ("quit", "exit"):
+        if canonical in ("quit", "exit", "q"):
            return False
        elif canonical == "help":
            self.show_help()
@@ -6466,6 +6398,8 @@ class HermesCLI:
                    pass
        elif canonical == "history":
            self.show_history()
+        elif canonical == "recap":
+            self._handle_recap_command()
        elif canonical == "title":
            parts = cmd_original.split(maxsplit=1)
            if len(parts) > 1:
@@ -6517,9 +6451,7 @@ class HermesCLI:
                else:
                    _cprint("  Session database not available.")
        elif canonical == "new":
-            parts = cmd_original.split(maxsplit=1)
-            title = parts[1].strip() if len(parts) > 1 else None
-            self.new_session(title=title)
+            self.new_session()
        elif canonical == "resume":
            self._handle_resume_command(cmd_original)
        elif canonical == "model":
@@ -7674,10 +7606,6 @@ class HermesCLI:
                ):
                    self.session_id = self.agent.session_id
                    self._pending_title = None
-                    # Manual /compress replaces conversation_history with a new
-                    # compressed handoff for the child session. Persist it from
-                    # offset 0 so resume can recover the continuation after exit.
-                    self.agent._flush_messages_to_session_db(self.conversation_history, None)
                new_tokens = estimate_request_tokens_rough(
                    self.conversation_history,
                    system_prompt=_sys_prompt,
@@ -8325,38 +8253,20 @@ class HermesCLI:
                return
            self._voice_recording = True

-        # Load silence detection params from config. Shape-safe: a
-        # hand-edited ``voice: true`` / ``voice: cmd+b`` leaves
-        # ``load_config()['voice']`` as a non-dict; coerce to {} so
-        # continuous recording falls back to the documented defaults
-        # instead of crashing on ``.get()``.
-        voice_cfg: dict = {}
+        # Load silence detection params from config
+        voice_cfg = {}
        try:
            from hermes_cli.config import load_config
-            _cfg = load_config().get("voice")
-            voice_cfg = _cfg if isinstance(_cfg, dict) else {}
+            voice_cfg = load_config().get("voice", {})
        except Exception:
            pass

        if self._voice_recorder is None:
            self._voice_recorder = create_audio_recorder()

-        # Apply config-driven silence params (numeric-guarded so YAML
-        # scalar corruption doesn't break recording start-up).
-        #
-        # ``bool`` is explicitly excluded from the numeric check — in
-        # Python bool is a subclass of int, so a hand-edited
-        # ``silence_threshold: true`` would otherwise be forwarded as
-        # ``1`` instead of falling back to the 200 default (Copilot
-        # round-12 on #19835).
-        _threshold = voice_cfg.get("silence_threshold")
-        _duration = voice_cfg.get("silence_duration")
-        self._voice_recorder._silence_threshold = (
-            _threshold if isinstance(_threshold, (int, float)) and not isinstance(_threshold, bool) else 200
-        )
-        self._voice_recorder._silence_duration = (
-            _duration if isinstance(_duration, (int, float)) and not isinstance(_duration, bool) else 3.0
-        )
+        # Apply config-driven silence params
+        self._voice_recorder._silence_threshold = voice_cfg.get("silence_threshold", 200)
+        self._voice_recorder._silence_duration = voice_cfg.get("silence_duration", 3.0)

        def _on_silence():
            """Called by AudioRecorder when silence is detected after speech."""
@@ -8382,13 +8292,12 @@ class HermesCLI:
            with self._voice_lock:
                self._voice_recording = False
            raise
-        _label = self._voice_record_key_label()
        if getattr(self._voice_recorder, "supports_silence_autostop", True):
-            _recording_hint = f"auto-stops on silence | {_label} to stop & exit continuous"
+            _recording_hint = "auto-stops on silence | Ctrl+B to stop & exit continuous"
        elif _is_termux_environment():
-            _recording_hint = f"Termux:API capture | {_label} to stop"
+            _recording_hint = "Termux:API capture | Ctrl+B to stop"
        else:
-            _recording_hint = f"{_label} to stop"
+            _recording_hint = "Ctrl+B to stop"
        _cprint(f"\n{_ACCENT}● Recording...{_RST} {_DIM}({_recording_hint}){_RST}")

        # Periodically refresh prompt to update audio level indicator
@@ -8503,17 +8412,6 @@ class HermesCLI:
                        _cprint(f"{_DIM}Voice auto-restart failed: {e}{_RST}")
                threading.Thread(target=_restart_recording, daemon=True).start()

-    def _voice_speak_response_async(self, text: str) -> None:
-        """Schedule TTS and mark it pending before continuous recording can restart."""
-        if not self._voice_tts or not text:
-            return
-        self._voice_tts_done.clear()
-        threading.Thread(
-            target=self._voice_speak_response,
-            args=(text,),
-            daemon=True,
-        ).start()
-
    def _voice_speak_response(self, text: str):
        """Speak the agent's response aloud using TTS (runs in background thread)."""
        if not self._voice_tts:
@@ -8633,12 +8531,10 @@ class HermesCLI:
        with self._voice_lock:
            self._voice_mode = True

-        # Check config for auto_tts (shape-safe — malformed ``voice:`` YAML
-        # leaves ``voice_config`` as a non-dict, so guard before .get()).
+        # Check config for auto_tts
        try:
            from hermes_cli.config import load_config
-            _raw_voice = load_config().get("voice")
-            voice_config = _raw_voice if isinstance(_raw_voice, dict) else {}
+            voice_config = load_config().get("voice", {})
            if voice_config.get("auto_tts", False):
                with self._voice_lock:
                    self._voice_tts = True
@@ -8650,11 +8546,13 @@ class HermesCLI:
        # _voice_message_prefix property and its usage in _process_message().

        tts_status = " (TTS enabled)" if self._voice_tts else ""
-        # Use the startup-pinned cache so the advertised shortcut always
-        # matches the live prompt_toolkit binding — reading live config
-        # here would drift after a mid-session config edit (Copilot
-        # round-14 on #19835, same class as round-13).
-        _ptt_display = self._voice_record_key_label()
+        try:
+            from hermes_cli.config import load_config
+            _raw_ptt = load_config().get("voice", {}).get("record_key", "ctrl+b")
+            _ptt_key = _raw_ptt.lower().replace("ctrl+", "c-").replace("alt+", "a-")
+        except Exception:
+            _ptt_key = "c-b"
+        _ptt_display = _ptt_key.replace("c-", "Ctrl+").upper()
        _cprint(f"\n{_ACCENT}Voice mode enabled{tts_status}{_RST}")
        _cprint(f"  {_DIM}{_ptt_display} to start/stop recording{_RST}")
        _cprint(f"  {_DIM}/voice tts  to toggle speech output{_RST}")
@@ -8711,6 +8609,7 @@ class HermesCLI:

    def _show_voice_status(self):
        """Show current voice mode status."""
+        from hermes_cli.config import load_config
        from tools.voice_mode import check_voice_requirements

        reqs = check_voice_requirements()
@@ -8719,11 +8618,9 @@ class HermesCLI:
        _cprint(f"  Mode:      {'ON' if self._voice_mode else 'OFF'}")
        _cprint(f"  TTS:       {'ON' if self._voice_tts else 'OFF'}")
        _cprint(f"  Recording: {'YES' if self._voice_recording else 'no'}")
-        # Display the startup-pinned label so /voice status always
-        # matches the live prompt_toolkit binding (Copilot round-14 on
-        # #19835, same class as round-13). Reading live config here
-        # would drift after a mid-session config edit.
-        _cprint(f"  Record key: {self._voice_record_key_label()}")
+        _raw_key = load_config().get("voice", {}).get("record_key", "ctrl+b")
+        _display_key = _raw_key.replace("ctrl+", "Ctrl+").upper() if "ctrl+" in _raw_key.lower() else _raw_key
+        _cprint(f"  Record key: {_display_key}")
        _cprint(f"\n  {_BOLD}Requirements:{_RST}")
        for line in reqs["details"].split("\n"):
            _cprint(f"    {line}")
@@ -9675,7 +9572,11 @@ class HermesCLI:
            # Speak response aloud if voice TTS is enabled
            # Skip batch TTS when streaming TTS already handled it
            if self._voice_tts and response and not use_streaming_tts:
-                self._voice_speak_response_async(response)
+                threading.Thread(
+                    target=self._voice_speak_response,
+                    args=(response,),
+                    daemon=True,
+                ).start()


            # Re-queue the interrupt message (and any that arrived while we were
@@ -10558,92 +10459,7 @@ class HermesCLI:
                else:
                    self._should_exit = True
                    event.app.exit()
-
-        # Ctrl+Shift+C: no binding needed. Terminal emulators (GNOME Terminal,
-        # iTerm2, kitty, Windows Terminal, etc.) intercept Ctrl+Shift+C before
-        # the keystroke reaches the application's stdin — prompt_toolkit never
-        # sees it, and prompt_toolkit's key spec parser doesn't even recognise
-        # 'c-S-c' anyway (the Shift modifier is meaningless on control-sequence
-        # keys). #19884 added a handler for this; #19895 patched the resulting
-        # startup crash with try/except. Both were based on a misreading of how
-        # terminal key events propagate. Deleting the dead handler outright.
-
-        @kb.add('c-q')  # Ctrl+Q
-        def handle_ctrl_q(event):
-            """Alternative interrupt/exit shortcut (Ctrl+Q).
-
-            Behaves like Ctrl+C: cancels active prompts, interrupts the
-            running agent, or clears the input buffer. Does not support
-            the double-press 'force exit' feature of Ctrl+C.
-            """
-            # Cancel active voice recording.
-            _should_cancel_voice = False
-            _recorder_ref = None
-            with cli_ref._voice_lock:
-                if cli_ref._voice_recording and cli_ref._voice_recorder:
-                    _recorder_ref = cli_ref._voice_recorder
-                    cli_ref._voice_recording = False
-                    cli_ref._voice_continuous = False
-                    _should_cancel_voice = True
-            if _should_cancel_voice:
-                _cprint(f"\n{_DIM}Recording cancelled.{_RST}")
-                threading.Thread(
-                    target=_recorder_ref.cancel, daemon=True
-                ).start()
-                event.app.invalidate()
-                return
-
-            # Cancel sudo prompt
-            if self._sudo_state:
-                self._sudo_state["response_queue"].put("")
-                self._sudo_state = None
-                event.app.invalidate()
-                return
-
-            # Cancel secret prompt
-            if self._secret_state:
-                self._cancel_secret_capture()
-                event.app.current_buffer.reset()
-                event.app.invalidate()
-                return
-
-            # Cancel approval prompt (deny)
-            if self._approval_state:
-                self._approval_state["response_queue"].put("deny")
-                self._approval_state = None
-                event.app.invalidate()
-                return
-
-            # Cancel /model picker
-            if self._model_picker_state:
-                self._close_model_picker()
-                event.app.current_buffer.reset()
-                event.app.invalidate()
-                return
-
-            # Cancel clarify prompt
-            if self._clarify_state:
-                self._clarify_state["response_queue"].put(
-                    "The user cancelled. Use your best judgement to proceed."
-                )
-                self._clarify_state = None
-                self._clarify_freetext = False
-                event.app.current_buffer.reset()
-                event.app.invalidate()
-                return
-
-            if self._agent_running and self.agent:
-                print("\n⚡ Interrupting agent...")
-                self.agent.interrupt()
-            else:
-                if event.app.current_buffer.text or self._attached_images:
-                    event.app.current_buffer.reset()
-                    self._attached_images.clear()
-                    event.app.invalidate()
-                else:
-                    self._should_exit = True
-                    event.app.exit()
-
+        
        @kb.add('c-d')
        def handle_ctrl_d(event):
            """Ctrl+D: delete char under cursor (standard readline behaviour).
@@ -10697,44 +10513,15 @@ class HermesCLI:
            run_in_terminal(_suspend)

        # Voice push-to-talk key: configurable via config.yaml (voice.record_key)
-        # Default: Ctrl+B (avoids conflict with Ctrl+R readline reverse-search).
-        # Config spellings (ctrl/control/alt/option/opt) are normalized to
-        # prompt_toolkit's c-x / a-x format via ``normalize_voice_record_key_for_prompt_toolkit``
-        # so the same config value binds identically in the TUI and CLI
-        # (Copilot round-9 review on #19835). ``super``/``win``/``windows``
-        # configs silently fall back to the default here since prompt_toolkit
-        # has no super modifier — log a warning so users notice the
-        # TUI/CLI split instead of a silent mismatch (round-11).
-        _raw_key: object = "ctrl+b"
+        # Default: Ctrl+B (avoids conflict with Ctrl+R readline reverse-search)
+        # Config uses "ctrl+b" format; prompt_toolkit expects "c-b" format.
        try:
            from hermes_cli.config import load_config
-            from hermes_cli.voice import (
-                normalize_voice_record_key_for_prompt_toolkit,
-                voice_record_key_from_config,
-            )
-            _raw_key = voice_record_key_from_config(load_config())
-            _voice_key = normalize_voice_record_key_for_prompt_toolkit(_raw_key)
-            if (
-                isinstance(_raw_key, str)
-                and _raw_key.strip().lower().split("+", 1)[0].strip() in {"super", "win", "windows"}
-                and _voice_key == "c-b"
-            ):
-                logger.warning(
-                    "voice.record_key %r uses a TUI-only modifier (super/win); "
-                    "CLI fell back to Ctrl+B. Use ctrl+<key> or alt+<key> for "
-                    "cross-runtime parity.",
-                    _raw_key,
-                )
+            _raw_key = load_config().get("voice", {}).get("record_key", "ctrl+b")
+            _voice_key = _raw_key.lower().replace("ctrl+", "c-").replace("alt+", "a-")
        except Exception:
            _voice_key = "c-b"

-        # Cache the UI label here — same ``_raw_key`` that drives the
-        # prompt_toolkit binding below. Every status / placeholder /
-        # recording-hint render reads this cached value so display can
-        # never drift from the live keybinding even if the user edits
-        # voice.record_key mid-session (Copilot round-13 on #19835).
-        self.set_voice_record_key_cache(_raw_key)
-
        @kb.add(_voice_key)
        def handle_voice_record(event):
            """Toggle voice recording when voice mode is active.
@@ -11037,8 +10824,7 @@ class HermesCLI:

        def _get_placeholder():
            if cli_ref._voice_recording:
-                _label = cli_ref._voice_record_key_label()
-                return f"recording... {_label} to stop, Ctrl+C to cancel"
+                return "recording... Ctrl+B to stop, Ctrl+C to cancel"
            if cli_ref._voice_processing:
                return "transcribing..."
            if cli_ref._sudo_state:
@@ -11058,8 +10844,7 @@ class HermesCLI:
            if cli_ref._agent_running:
                return "msg=interrupt · /queue · /bg · /steer · Ctrl+C cancel"
            if cli_ref._voice_mode:
-                _label = cli_ref._voice_record_key_label()
-                return f"type or {_label} to record"
+                return "type or Ctrl+B to record"
            return ""

        input_area.control.input_processors.append(_PlaceholderProcessor(_get_placeholder))
@@ -11835,7 +11620,7 @@ class HermesCLI:
                            pass  # Non-fatal — don't break the main loop

                except Exception as e:
-                    logger.warning("process_loop unhandled error (msg may be lost): %s", e)
+                    print(f"Error: {e}")
        
        # Start processing thread
        process_thread = threading.Thread(target=process_loop, daemon=True)
@@ -420,7 +420,7 @@ def _normalize_workdir(workdir: Optional[str]) -> Optional[str]:


 def create_job(
-    prompt: Optional[str],
+    prompt: str,
    schedule: str,
    name: Optional[str] = None,
    repeat: Optional[int] = None,
@@ -435,14 +435,12 @@ def create_job(
    context_from: Optional[Union[str, List[str]]] = None,
    enabled_toolsets: Optional[List[str]] = None,
    workdir: Optional[str] = None,
-    no_agent: bool = False,
 ) -> Dict[str, Any]:
    """
    Create a new cron job.

    Args:
-        prompt: The prompt to run (must be self-contained, or a task instruction when skill is set).
-                Ignored when ``no_agent=True`` except as an optional name hint.
+        prompt: The prompt to run (must be self-contained, or a task instruction when skill is set)
        schedule: Schedule string (see parse_schedule)
        name: Optional friendly name
        repeat: How many times to run (None = forever, 1 = once)
@@ -453,33 +451,21 @@ def create_job(
        model: Optional per-job model override
        provider: Optional per-job provider override
        base_url: Optional per-job base URL override
-        script: Optional path to a script whose stdout feeds the job. With
-                ``no_agent=True`` the script IS the job — its stdout is
-                delivered verbatim. Without ``no_agent``, its stdout is
-                injected into the agent's prompt as context (data-collection /
-                change-detection pattern). Paths resolve under
-                ~/.hermes/scripts/; ``.sh`` / ``.bash`` files run via bash,
-                anything else via Python.
+        script: Optional path to a Python script whose stdout is injected into the
+                prompt each run.  The script runs before the agent turn, and its output
+                is prepended as context.  Useful for data collection / change detection.
        context_from: Optional job ID (or list of job IDs) whose most recent output
                      is injected into the prompt as context before each run.
                      Useful for chaining cron jobs: job A finds data, job B processes it.
        enabled_toolsets: Optional list of toolset names to restrict the agent to.
                          When set, only tools from these toolsets are loaded, reducing
                          token overhead. When omitted, all default tools are loaded.
-                          Ignored when ``no_agent=True``.
        workdir: Optional absolute path.  When set, the job runs as if launched
                from that directory: AGENTS.md / CLAUDE.md / .cursorrules from
                that directory are injected into the system prompt, and the
                terminal/file/code_exec tools use it as their working directory
                (via TERMINAL_CWD).  When unset, the old behaviour is preserved
                (no context files injected, tools use the scheduler's cwd).
-                With ``no_agent=True``, ``workdir`` is still applied as the
-                script's cwd so relative paths inside the script behave
-                predictably.
-        no_agent: When True, skip the agent entirely — run ``script`` on schedule
-                and deliver its stdout directly. Empty stdout = silent (no
-                delivery). Requires ``script`` to be set. Ideal for classic
-                watchdogs and periodic alerts that don't need LLM reasoning.

    Returns:
        The created job dict
@@ -513,16 +499,6 @@ def create_job(
    normalized_toolsets = [str(t).strip() for t in enabled_toolsets if str(t).strip()] if enabled_toolsets else None
    normalized_toolsets = normalized_toolsets or None
    normalized_workdir = _normalize_workdir(workdir)
-    normalized_no_agent = bool(no_agent)
-
-    # no_agent jobs are meaningless without a script — the script IS the job.
-    # Surface this as a clear ValueError at create time so bad configs never
-    # reach the scheduler.
-    if normalized_no_agent and not normalized_script:
-        raise ValueError(
-            "no_agent=True requires a script — with no agent and no script "
-            "there is nothing for the job to run."
-        )

    # Normalize context_from: accept str or list of str, store as list or None
    if isinstance(context_from, str):
@@ -532,7 +508,7 @@ def create_job(
    else:
        context_from = None

-    label_source = (prompt or (normalized_skills[0] if normalized_skills else None) or (normalized_script if normalized_no_agent else None)) or "cron job"
+    label_source = (prompt or (normalized_skills[0] if normalized_skills else None)) or "cron job"
    job = {
        "id": job_id,
        "name": name or label_source[:50].strip(),
@@ -543,7 +519,6 @@ def create_job(
        "provider": normalized_provider,
        "base_url": normalized_base_url,
        "script": normalized_script,
-        "no_agent": normalized_no_agent,
        "context_from": context_from,
        "schedule": parsed_schedule,
        "schedule_display": parsed_schedule.get("display", schedule),
@@ -810,12 +785,6 @@ def get_due_jobs() -> List[Dict[str, Any]]:
    the job is fast-forwarded to the next future run instead of firing
    immediately.  This prevents a burst of missed jobs on gateway restart.
    """
-    with _jobs_file_lock:
-        return _get_due_jobs_locked()
-
-
-def _get_due_jobs_locked() -> List[Dict[str, Any]]:
-    """Inner implementation of get_due_jobs(); must be called with _jobs_file_lock held."""
    now = _hermes_now()
    raw_jobs = load_jobs()
    jobs = [_apply_skill_fields(j) for j in copy.deepcopy(raw_jobs)]
@@ -828,36 +797,19 @@ def _get_due_jobs_locked() -> List[Dict[str, Any]]:

        next_run = job.get("next_run_at")
        if not next_run:
-            schedule = job.get("schedule", {})
-            kind = schedule.get("kind")
-
-            # One-shot jobs use a small grace window via the dedicated helper.
            recovered_next = _recoverable_oneshot_run_at(
-                schedule,
+                job.get("schedule", {}),
                now,
                last_run_at=job.get("last_run_at"),
            )
-            recovery_kind = "one-shot" if recovered_next else None
-
-            # Recurring jobs reach here only when something — typically a
-            # direct jobs.json edit that bypassed add_job() — left
-            # next_run_at unset.  Without this branch, such jobs are
-            # silently skipped forever; recompute next_run_at from the
-            # schedule so they pick up at their next scheduled tick.
-            if not recovered_next and kind in ("cron", "interval"):
-                recovered_next = compute_next_run(schedule, now.isoformat())
-                if recovered_next:
-                    recovery_kind = kind
-
            if not recovered_next:
                continue

            job["next_run_at"] = recovered_next
            next_run = recovered_next
            logger.info(
-                "Job '%s' had no next_run_at; recovering %s run at %s",
+                "Job '%s' had no next_run_at; recovering one-shot run at %s",
                job.get("name", job["id"]),
-                recovery_kind,
                recovered_next,
            )
            for rj in raw_jobs:
@@ -35,7 +35,7 @@ from typing import List, Optional
 sys.path.insert(0, str(Path(__file__).parent.parent))

 from hermes_constants import get_hermes_home
-from hermes_cli.config import load_config, _expand_env_vars
+from hermes_cli.config import load_config
 from hermes_time import now as _hermes_now

 logger = logging.getLogger(__name__)
@@ -114,36 +114,18 @@ from cron.jobs import get_due_jobs, mark_job_run, save_job_output, advance_next_
 # locally for audit.
 SILENT_MARKER = "[SILENT]"

-# Backward-compatible module override used by tests and emergency monkeypatches.
-_hermes_home: Path | None = None
+# Resolve Hermes home directory (respects HERMES_HOME override)
+_hermes_home = get_hermes_home()

-
-def _get_hermes_home() -> Path:
-    """Resolve Hermes home dynamically while preserving test monkeypatch hooks."""
-    return _hermes_home or get_hermes_home()
-
-
-def _get_lock_paths() -> tuple[Path, Path]:
-    """Resolve cron lock paths at call time so profile/env changes are honored."""
-    hermes_home = _get_hermes_home()
-    lock_dir = hermes_home / "cron"
-    return lock_dir, lock_dir / ".tick.lock"
+# File-based lock prevents concurrent ticks from gateway + daemon + systemd timer
+_LOCK_DIR = _hermes_home / "cron"
+_LOCK_FILE = _LOCK_DIR / ".tick.lock"


 def _resolve_origin(job: dict) -> Optional[dict]:
-    """Extract origin info from a job, preserving any extra routing metadata.
-
-    Treats non-dict origins (free-form provenance strings, ints, lists from
-    migration scripts or hand-edited jobs.json) as missing instead of
-    crashing with ``AttributeError`` on ``origin.get(...)``. Without this
-    guard, a job tagged with e.g. ``"combined-digest-replaces-x-and-y"``
-    crashed every fire attempt with
-    ``'str' object has no attribute 'get'`` — ``mark_job_run`` recorded the
-    failure, but the next tick re-loaded the same poisoned origin and
-    crashed identically until the field was patched manually (#18722).
-    """
+    """Extract origin info from a job, preserving any extra routing metadata."""
    origin = job.get("origin")
-    if not isinstance(origin, dict):
+    if not origin:
        return None
    platform = origin.get("platform")
    chat_id = origin.get("chat_id")
@@ -165,19 +147,6 @@ def _get_home_target_chat_id(platform_name: str) -> str:
    return value


-def _get_home_target_thread_id(platform_name: str) -> Optional[str]:
-    """Return the optional thread/topic ID for a platform home target."""
-    env_var = _HOME_TARGET_ENV_VARS.get(platform_name.lower())
-    if not env_var:
-        return None
-    value = os.getenv(f"{env_var}_THREAD_ID", "").strip()
-    if not value:
-        legacy = _LEGACY_HOME_TARGET_ENV_VARS.get(env_var)
-        if legacy:
-            value = os.getenv(f"{legacy}_THREAD_ID", "").strip()
-    return value or None
-
-
 def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[dict]:
    """Resolve one concrete auto-delivery target for a cron job."""

@@ -206,7 +175,7 @@ def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[d
                return {
                    "platform": platform_name,
                    "chat_id": chat_id,
-                    "thread_id": _get_home_target_thread_id(platform_name),
+                    "thread_id": None,
                }
        return None

@@ -260,7 +229,7 @@ def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[d
    return {
        "platform": platform_name,
        "chat_id": chat_id,
-        "thread_id": _get_home_target_thread_id(platform_name),
+        "thread_id": None,
    }


@@ -425,7 +394,7 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
        thread_id = target.get("thread_id")

        # Diagnostic: log thread_id for topic-aware delivery debugging
-        origin = _resolve_origin(job) or {}
+        origin = job.get("origin") or {}
        origin_thread = origin.get("thread_id")
        if origin_thread and not thread_id:
            logger.warning(
@@ -584,18 +553,8 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
    prevent arbitrary script execution via path traversal or absolute
    path injection.

-    Supported interpreters (chosen by file extension):
-
-    * ``.sh`` / ``.bash`` — run with ``/bin/bash``
-    * anything else — run with the current Python interpreter
-      (``sys.executable``), preserving the original behaviour for
-      Python-based pre-check and data-collection scripts.
-
-    Shell support lets ``no_agent=True`` jobs ship classic bash watchdogs
-    (the `memory-watchdog.sh` pattern) without wrapping them in Python.
-
    Args:
-        script_path: Path to the script.  Relative paths are resolved
+        script_path: Path to a Python script.  Relative paths are resolved
            against HERMES_HOME/scripts/.  Absolute and ~-prefixed paths
            are also validated to ensure they stay within the scripts dir.

@@ -605,7 +564,7 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
    """
    from hermes_constants import get_hermes_home

-    scripts_dir = _get_hermes_home() / "scripts"
+    scripts_dir = get_hermes_home() / "scripts"
    scripts_dir.mkdir(parents=True, exist_ok=True)
    scripts_dir_resolved = scripts_dir.resolve()

@@ -632,19 +591,9 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:

    script_timeout = _get_script_timeout()

-    # Pick an interpreter by extension.  Bash for .sh/.bash, Python for
-    # everything else.  We deliberately do NOT honour the file's own
-    # shebang: the scripts dir is trusted, but keeping the interpreter
-    # choice explicit here keeps the allowed surface small and auditable.
-    suffix = path.suffix.lower()
-    if suffix in (".sh", ".bash"):
-        argv = ["/bin/bash", str(path)]
-    else:
-        argv = [sys.executable, str(path)]
-
    try:
        result = subprocess.run(
-            argv,
+            [sys.executable, str(path)],
            capture_output=True,
            text=True,
            timeout=script_timeout,
@@ -734,8 +683,10 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
                    f"{prompt}"
                )
            else:
-                # Script produced no output — nothing to report, skip AI call.
-                return None
+                prompt = (
+                    "[Script ran successfully but produced no output.]\n\n"
+                    f"{prompt}"
+                )
        else:
            prompt = (
                "## Script Error\n"
@@ -808,7 +759,6 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
        return prompt

    from tools.skills_tool import skill_view
-    from tools.skill_usage import bump_use

    parts = []
    skipped: list[str] = []
@@ -820,12 +770,6 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
            skipped.append(skill_name)
            continue

-        # Bump usage so the curator sees this skill as actively used.
-        try:
-            bump_use(skill_name)
-        except Exception:
-            logger.debug("Cron job: failed to bump skill usage for '%s'", skill_name, exc_info=True)
-
        content = str(loaded.get("content") or "").strip()
        if parts:
            parts.append("")
@@ -858,120 +802,8 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
    Returns:
        Tuple of (success, full_output_doc, final_response, error_message)
    """
-    job_id = job["id"]
-    job_name = job["name"]
-
-    # ---------------------------------------------------------------
-    # no_agent short-circuit — the script IS the job, no LLM involvement.
-    # ---------------------------------------------------------------
-    # This mirrors the classic "run a bash script on a timer, send its
-    # stdout to telegram" watchdog pattern. The agent path is skipped
-    # entirely: no AIAgent, no prompt, no tool loop, no token spend.
-    #
-    # We check this BEFORE importing run_agent / constructing SessionDB so
-    # a pure-script tick never pays for the agent machinery it isn't going
-    # to use. Keep this block self-contained.
-    #
-    # Semantics:
-    #   - script stdout (trimmed) → delivered verbatim as the final message
-    #   - empty stdout            → silent run (no delivery, success=True)
-    #   - non-zero exit / timeout → delivered as an error alert, success=False
-    #   - wakeAgent=false gate    → treated like empty stdout (silent), since
-    #                               the whole point of no_agent is that there
-    #                               is no agent to wake
-    if job.get("no_agent"):
-        script_path = job.get("script")
-        if not script_path:
-            err = "no_agent=True but no script is set for this job"
-            logger.error("Job '%s': %s", job_id, err)
-            return False, "", "", err
-
-        # Apply workdir if configured — lets scripts use predictable relative
-        # paths. For no_agent jobs this is just the subprocess cwd (not an
-        # agent TERMINAL_CWD bridge).
-        _job_workdir = (job.get("workdir") or "").strip() or None
-        _prior_cwd = None
-        if _job_workdir and Path(_job_workdir).is_dir():
-            _prior_cwd = os.getcwd()
-            try:
-                os.chdir(_job_workdir)
-            except OSError:
-                _prior_cwd = None
-
-        try:
-            ok, output = _run_job_script(script_path)
-        finally:
-            if _prior_cwd is not None:
-                try:
-                    os.chdir(_prior_cwd)
-                except OSError:
-                    pass
-
-        now_iso = _hermes_now().strftime("%Y-%m-%d %H:%M:%S")
-
-        if not ok:
-            # Script crashed / timed out / exited non-zero.  Deliver the
-            # error so the user knows the watchdog itself broke — silent
-            # failure for an alerting job is the worst-case outcome.
-            alert = (
-                f"⚠ Cron watchdog '{job_name}' script failed\n\n"
-                f"{output}\n\n"
-                f"Time: {now_iso}"
-            )
-            doc = (
-                f"# Cron Job: {job_name}\n\n"
-                f"**Job ID:** {job_id}\n"
-                f"**Run Time:** {now_iso}\n"
-                f"**Mode:** no_agent (script)\n"
-                f"**Status:** script failed\n\n"
-                f"{output}\n"
-            )
-            return False, doc, alert, output
-
-        # Honour the wakeAgent gate as a silent signal — `wakeAgent: false`
-        # means "nothing to report this tick", same as empty stdout.
-        if not _parse_wake_gate(output):
-            logger.info(
-                "Job '%s' (no_agent): wakeAgent=false gate — silent run", job_id
-            )
-            silent_doc = (
-                f"# Cron Job: {job_name}\n\n"
-                f"**Job ID:** {job_id}\n"
-                f"**Run Time:** {now_iso}\n"
-                f"**Mode:** no_agent (script)\n"
-                f"**Status:** silent (wakeAgent=false)\n"
-            )
-            return True, silent_doc, SILENT_MARKER, None
-
-        if not output.strip():
-            logger.info("Job '%s' (no_agent): empty stdout — silent run", job_id)
-            silent_doc = (
-                f"# Cron Job: {job_name}\n\n"
-                f"**Job ID:** {job_id}\n"
-                f"**Run Time:** {now_iso}\n"
-                f"**Mode:** no_agent (script)\n"
-                f"**Status:** silent (empty output)\n"
-            )
-            return True, silent_doc, SILENT_MARKER, None
-
-        doc = (
-            f"# Cron Job: {job_name}\n\n"
-            f"**Job ID:** {job_id}\n"
-            f"**Run Time:** {now_iso}\n"
-            f"**Mode:** no_agent (script)\n\n"
-            f"---\n\n"
-            f"{output}\n"
-        )
-        return True, doc, output, None
-
-    # ---------------------------------------------------------------
-    # Default (LLM) path — import and construct the agent machinery now
-    # that we know we actually need it. Doing these imports here instead of
-    # at module top keeps no_agent ticks from paying for AIAgent / SessionDB
-    # construction costs.
-    # ---------------------------------------------------------------
    from run_agent import AIAgent
-
+    
    # Initialize SQLite session store so cron job messages are persisted
    # and discoverable via session_search (same pattern as gateway/run.py).
    _session_db = None
@@ -980,6 +812,9 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
        _session_db = SessionDB()
    except Exception as e:
        logger.debug("Job '%s': SQLite session store not available: %s", job.get("id", "?"), e)
+    
+    job_id = job["id"]
+    job_name = job["name"]

    # Wake-gate: if this job has a pre-check script, run it BEFORE building
    # the prompt so a ``{"wakeAgent": false}`` response can short-circuit
@@ -1004,9 +839,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
            return True, silent_doc, SILENT_MARKER, None

    prompt = _build_job_prompt(job, prerun_script=prerun_script)
-    if prompt is None:
-        logger.info("Job '%s': script produced no output, skipping AI call.", job_name)
-        return True, "", SILENT_MARKER, None
    origin = _resolve_origin(job)
    _cron_session_id = f"cron_{job_id}_{_hermes_now().strftime('%Y%m%d_%H%M%S')}"

@@ -1066,9 +898,9 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
        # changes take effect without a gateway restart.
        from dotenv import load_dotenv
        try:
-            load_dotenv(str(_get_hermes_home() / ".env"), override=True, encoding="utf-8")
+            load_dotenv(str(_hermes_home / ".env"), override=True, encoding="utf-8")
        except UnicodeDecodeError:
-            load_dotenv(str(_get_hermes_home() / ".env"), override=True, encoding="latin-1")
+            load_dotenv(str(_hermes_home / ".env"), override=True, encoding="latin-1")

        delivery_target = _resolve_delivery_target(job)
        if delivery_target:
@@ -1086,11 +918,10 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
        _cfg = {}
        try:
            import yaml
-            _cfg_path = str(_get_hermes_home() / "config.yaml")
+            _cfg_path = str(_hermes_home / "config.yaml")
            if os.path.exists(_cfg_path):
                with open(_cfg_path) as _f:
                    _cfg = yaml.safe_load(_f) or {}
-                _cfg = _expand_env_vars(_cfg)
                _model_cfg = _cfg.get("model", {})
                if not job.get("model"):
                    if isinstance(_model_cfg, str):
@@ -1120,7 +951,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
        if prefill_file:
            pfpath = Path(prefill_file).expanduser()
            if not pfpath.is_absolute():
-                pfpath = _get_hermes_home() / pfpath
+                pfpath = _hermes_home / pfpath
            if pfpath.exists():
                try:
                    with open(pfpath, "r", encoding="utf-8") as _pf:
@@ -1143,13 +974,8 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
        )
        from hermes_cli.auth import AuthError
        try:
-            # Do not inject HERMES_INFERENCE_PROVIDER here. resolve_runtime_provider()
-            # already prefers persisted config over stale shell/env overrides when
-            # no explicit provider is requested. Passing the env var here short-
-            # circuits that precedence and can resurrect old providers (for
-            # example DeepSeek) for cron jobs that do not pin provider/model.
            runtime_kwargs = {
-                "requested": job.get("provider"),
+                "requested": job.get("provider") or os.getenv("HERMES_INFERENCE_PROVIDER"),
            }
            if job.get("base_url"):
                runtime_kwargs["explicit_base_url"] = job.get("base_url")
@@ -1444,13 +1270,12 @@ def tick(verbose: bool = True, adapters=None, loop=None) -> int:
    Returns:
        Number of jobs executed (0 if another tick is already running)
    """
-    lock_dir, lock_file = _get_lock_paths()
-    lock_dir.mkdir(parents=True, exist_ok=True)
+    _LOCK_DIR.mkdir(parents=True, exist_ok=True)

    # Cross-platform file locking: fcntl on Unix, msvcrt on Windows
    lock_fd = None
    try:
-        lock_fd = open(lock_file, "w")
+        lock_fd = open(_LOCK_FILE, "w")
        if fcntl:
            fcntl.flock(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        elif msvcrt:
@@ -86,41 +86,6 @@ if [ -d "$INSTALL_DIR/skills" ]; then
    python3 "$INSTALL_DIR/tools/skills_sync.py"
 fi

-# Optionally start `hermes dashboard` as a side-process.
-#
-# Toggled by HERMES_DASHBOARD=1 (also accepts "true"/"yes", case-insensitive).
-# Host/port/TUI can be overridden via:
-#   HERMES_DASHBOARD_HOST  (default 0.0.0.0 — exposed outside the container)
-#   HERMES_DASHBOARD_PORT  (default 9119, matches `hermes dashboard` default)
-#   HERMES_DASHBOARD_TUI   (already honored by `hermes dashboard` itself)
-#
-# The dashboard is a long-lived server.  We background it *before* the final
-# `exec hermes "$@"` so the user's chosen foreground command (chat, gateway,
-# sleep infinity, …) remains PID-of-interest for the container runtime.  When
-# the container stops the whole process tree is torn down, so no explicit
-# cleanup is needed.
-case "${HERMES_DASHBOARD:-}" in
-    1|true|TRUE|True|yes|YES|Yes)
-        dash_host="${HERMES_DASHBOARD_HOST:-0.0.0.0}"
-        dash_port="${HERMES_DASHBOARD_PORT:-9119}"
-        dash_args=(--host "$dash_host" --port "$dash_port" --no-open)
-        # Binding to anything other than localhost requires --insecure — the
-        # dashboard refuses otherwise because it exposes API keys.  Inside a
-        # container this is the expected deployment (host reaches it via
-        # published port), so opt in automatically.
-        if [ "$dash_host" != "127.0.0.1" ] && [ "$dash_host" != "localhost" ]; then
-            dash_args+=(--insecure)
-        fi
-        echo "Starting hermes dashboard on ${dash_host}:${dash_port} (background)"
-        # Prefix dashboard output so it's distinguishable from the main
-        # process in `docker logs`.  stdbuf keeps the pipe line-buffered.
-        (
-            stdbuf -oL -eL hermes dashboard "${dash_args[@]}" 2>&1 \
-                | sed -u 's/^/[dashboard] /'
-        ) &
-        ;;
-esac
-
 # Final exec: two supported invocation patterns.
 #
 #   docker run <image>                 -> exec `hermes` with no args (legacy default)
@@ -1,473 +0,0 @@
-# Telegram DM User-Managed Multi-Session Topics Implementation Plan
-
-> **For Hermes:** Use test-driven-development for implementation. Use subagent-driven-development only after this plan is split into small reviewed tasks.
-
-**Goal:** Add an opt-in Telegram DM multi-session mode where Telegram user-created private-chat topics become independent Hermes session lanes, while the root DM becomes a system lobby.
-
-**Architecture:** Rely on Telegram's native private-chat topic UI. Users create new topics with the `+` button; Hermes maps each `message_thread_id` to a separate session lane. Hermes does not create topics for normal `/new` flow and does not try to manage topic lifecycle beyond activation/status, root-lobby behavior, and restoring legacy sessions into a user-created topic.
-
-**Tech Stack:** Hermes gateway, Telegram Bot API 9.4+, python-telegram-bot adapter, SQLite SessionDB / side tables, pytest.
-
---
-
-## 1. Product decisions
-
-### Accepted
-
- PR-quality implementation: migrations, tests, docs, backwards compatibility.
- Use SQLite persistence, not JSON sidecars.
- Live status suffixes in topic titles are out of MVP.
- Topic title sync/editing is out of MVP except future-compatible storage if cheap.
- User creates Telegram topics manually through the Telegram bot interface.
- `/new` does **not** create Telegram topics.
- Root/main DM becomes a system lobby after activation.
- Existing Telegram behavior remains unchanged until the feature is activated/enabled.
- Migration of old sessions is supported through `/topic` listing and `/topic <session_id>` restore inside a user-created topic.
-
-### Telegram API assumptions verified from Bot API docs
-
- `getMe` returns bot `User` fields:
-  - `has_topics_enabled`: forum/topic mode enabled in private chats.
-  - `allows_users_to_create_topics`: users may create/delete topics in private chats.
- `createForumTopic` works for private chats with a user, but MVP does not rely on it for normal flow.
- `Message.message_thread_id` identifies a topic in private chats.
- `sendMessage` supports `message_thread_id` for private-chat topics.
- `pinChatMessage` is allowed in private chats.
-
---
-
-## 2. Target UX
-
-### 2.1 Activation from root/main DM
-
-User sends:
-
-```text
-/topic
-```
-
-Hermes:
-
-1. calls Telegram `getMe`;
-2. verifies `has_topics_enabled` and `allows_users_to_create_topics`;
-3. enables multi-session topic mode for this Telegram DM user/chat;
-4. sends an onboarding message;
-5. pins the onboarding message if configured;
-6. shows old/unlinked sessions that can be restored into topics.
-
-Suggested onboarding text:
-
-```text
-Multi-session mode is enabled.
-
-Create new Hermes chats with the + button in this bot interface. Each Telegram topic is an independent Hermes session, so you can work on different tasks in parallel.
-
-This main chat is reserved for system commands, status, and session management.
-
-To restore an old session:
-1. Use /topic here to see unlinked sessions.
-2. Create a new topic with the + button.
-3. Send /topic <session_id> inside that topic.
-```
-
-### 2.2 Root/main DM after activation
-
-Root DM is a system lobby.
-
-Allowed/system commands include at least:
-
- `/topic`
- `/status`
- `/sessions` if available
- `/usage`
- `/help`
- `/platforms`
-
-Normal user prompts in root DM do not enter the agent loop. Reply:
-
-```text
-This main chat is reserved for system commands.
-
-To chat with Hermes, create a new topic using the + button in this bot interface. Each topic works as an independent Hermes session.
-```
-
-`/new` in root DM does not create a session/topic. Reply:
-
-```text
-To start a new parallel Hermes chat, create a new topic with the + button in this bot interface.
-
-Each topic is an independent Hermes session. Use /new inside a topic only if you want to replace that topic's current session.
-```
-
-### 2.3 First message in a user-created topic
-
-When a user creates a Telegram topic and sends the first message there:
-
-1. Hermes receives a Telegram DM message with `message_thread_id`.
-2. Hermes derives the existing thread-aware `session_key` from `(platform=telegram, chat_type=dm, chat_id, thread_id)`.
-3. If no binding exists, Hermes creates a fresh Hermes session for this topic lane and persists the binding.
-4. The message runs through the normal agent loop for that lane.
-
-### 2.4 `/new` inside a non-main topic
-
-`/new` remains supported but replaces the session attached to the current topic lane.
-
-Hermes should warn:
-
-```text
-Started a new Hermes session in this topic.
-
-Tip: for parallel work, create a new topic with the + button instead of using /new here. /new replaces the session attached to the current topic.
-```
-
-### 2.5 `/topic` in root/main DM after activation
-
-Shows:
-
- mode enabled/disabled;
- last capability check result;
- whether intro message is pinned if known;
- count of known topic bindings;
- list of old/unlinked sessions.
-
-Example:
-
-```text
-Telegram multi-session topics are enabled.
-
-Create new Hermes chats with the + button in this bot interface.
-
-Unlinked previous sessions:
-1. 2026-05-01 Research notes — id: abc123
-2. 2026-04-30 Deploy debugging — id: def456
-3. Untitled session — id: ghi789
-
-To restore one:
-1. Create a new topic with the + button.
-2. Open that topic.
-3. Send /topic <id>
-```
-
-### 2.6 `/topic` inside a non-main topic
-
-Without args, show the current topic binding:
-
-```text
-This topic is linked to:
-Session: Research notes
-ID: abc123
-
-Use /new to replace this topic with a fresh session.
-For parallel work, create another topic with the + button.
-```
-
-### 2.7 `/topic <session_id>` inside a non-main topic
-
-Restore an old/unlinked session into the current user-created topic.
-
-Behavior:
-
-1. reject if not in Telegram DM topic;
-2. verify session belongs to the same Telegram user/chat or is a safe legacy root DM session for this user;
-3. reject if session is already linked to another active topic in MVP;
-4. `SessionStore.switch_session(current_topic_session_key, target_session_id)`;
-5. upsert binding with `managed_mode = restored`;
-6. send two messages into the topic:
-   - session restored confirmation;
-   - last Hermes assistant message if available.
-
-Example:
-
-```text
-Session restored: Research notes
-
-Last Hermes message:
-...
-```
-
---
-
-## 3. Persistence model
-
-Use SQLite, but topic-mode schema changes are **explicit opt-in migrations**, not automatic startup reconciliation.
-
-Important rollback-safety rule:
-
- upgrading Hermes and starting the gateway must not create Telegram topic-mode tables or columns;
- old/default Telegram behavior must keep working on the existing `state.db`;
- the first `/topic` activation path calls an idempotent explicit migration, then enables topic mode for that chat;
- if activation fails before the migration is needed, the database remains in the pre-topic-mode shape.
-
-### 3.1 No eager `sessions` table mutation for MVP
-
-Do **not** add `chat_id`, `chat_type`, `thread_id`, or `session_key` columns to `sessions` as part of ordinary `SessionDB()` startup. The existing declarative `_reconcile_columns()` mechanism would add them eagerly on every process start, which violates the managed-migration requirement.
-
-For MVP, keep origin/session-lane data in topic-specific side tables created only by the explicit `/topic` migration. Legacy unlinked sessions can be discovered conservatively from existing data (`source = telegram`, `user_id = current Telegram user`) plus absence from topic bindings.
-
-If future PRs need richer origin metadata for all gateway sessions, introduce it behind a separate explicit migration/command or a compatibility-reviewed schema bump.
-
-### 3.2 Explicit `/topic` migration API
-
-Add an idempotent method such as:
-
-```python
-def apply_telegram_topic_migration(self) -> None: ...
-```
-
-It creates only topic-mode side tables/indexes and records:
-
-```text
-state_meta.telegram_dm_topic_schema_version = 1
-```
-
-This method is called from `/topic` activation/status paths before reading or writing topic-mode state. It is not called from generic `SessionDB.__init__`, gateway startup, CLI startup, or auto-maintenance.
-
-### 3.3 `telegram_dm_topic_mode`
-
-Stores per-user/chat activation state. Created only by `apply_telegram_topic_migration()`.
-
-Suggested fields:
-
- `chat_id` primary key
- `user_id`
- `enabled`
- `activated_at`
- `updated_at`
- `has_topics_enabled`
- `allows_users_to_create_topics`
- `capability_checked_at`
- `intro_message_id`
- `pinned_message_id`
-
-### 3.4 `telegram_dm_topic_bindings`
-
-Stores Telegram topic/thread to Hermes session binding. Created only by `apply_telegram_topic_migration()`.
-
-Suggested fields:
-
- `chat_id`
- `thread_id`
- `user_id`
- `session_key`
- `session_id`
- `managed_mode`
-  - `auto`
-  - `restored`
-  - `new_replaced`
- `linked_at`
- `updated_at`
-
-Recommended constraints:
-
- primary key `(chat_id, thread_id)`;
- unique index on `session_id` for MVP to prevent one session linked to multiple topics;
- index `(user_id, chat_id)` for status/listing.
-
-### 3.5 Unlinked session semantics
-
-For MVP, a session is unlinked if:
-
- `source = telegram`;
- `user_id = current Telegram user`;
- no row in `telegram_dm_topic_bindings` has `session_id = session_id`.
-
-This is intentionally conservative until a future explicit migration adds richer cross-platform origin metadata.
-
-Never dedupe by title.
-
---
-
-## 4. Config
-
-Suggested config block:
-
-```yaml
-platforms:
-  telegram:
-    extra:
-      multisession_topics:
-        enabled: false
-        mode: user_managed_topics
-        root_chat_behavior: system_lobby
-        pin_intro_message: true
-```
-
-Notes:
-
- `enabled: false` means existing Telegram behavior is unchanged.
- Activation via `/topic` may create per-chat enabled state only if global config permits it.
- `root_chat_behavior: system_lobby` is the MVP behavior for activated chats.
-
---
-
-## 5. Command behavior summary
-
-### `/topic` root/main DM
-
- If not activated: capability check, activate, send/pin onboarding, list unlinked sessions.
- If activated: show status and unlinked sessions.
-
-### `/topic` non-main topic
-
- Show current binding.
-
-### `/topic <session_id>` root/main DM
-
-Reject with instructions:
-
-```text
-Create a new topic with the + button, open it, then send /topic <session_id> there to restore this session.
-```
-
-### `/topic <session_id>` non-main topic
-
-Restore that session into this topic if ownership/linking checks pass.
-
-### `/new` root/main DM when activated
-
-Reply with instructions to use the `+` button. Do not enter agent loop.
-
-### `/new` non-main topic
-
-Create a new session in the current topic lane, persist/update binding, warn that `+` is preferred for parallel work.
-
-### Normal text root/main DM when activated
-
-Reply with system-lobby instruction. Do not enter agent loop.
-
-### Normal text non-main topic
-
-Normal Hermes agent flow for that topic's session lane.
-
---
-
-## 6. PR breakdown
-
-### PR 1 — Explicit topic-mode schema migration
-
-**Goal:** Add rollback-safe SQLite support for Telegram topic mode without mutating `state.db` on ordinary upgrade/startup.
-
-**Files likely touched:**
-
- `hermes_state.py`
- tests under `tests/`
-
-**Tests first:**
-
-1. opening an old/current DB with `SessionDB()` does not create topic-mode tables or `sessions` origin columns;
-2. calling `apply_telegram_topic_migration()` creates `telegram_dm_topic_mode` and `telegram_dm_topic_bindings` idempotently;
-3. migration records `state_meta.telegram_dm_topic_schema_version = 1`.
-
-### PR 2 — Topic mode activation and binding APIs
-
-**Goal:** Add SQLite persistence for activation and topic bindings.
-
-**Tests first:**
-
-1. enable/check mode row round-trips;
-2. binding upsert and lookup by `(chat_id, user_id, thread_id)`;
-3. linked sessions are excluded from unlinked list.
-
-### PR 3 — `/topic` activation/status command
-
-**Goal:** Implement root activation/status/listing behavior.
-
-**Tests first:**
-
-1. `/topic` in root checks `getMe` capabilities and records activation;
-2. capability failure returns readable instructions;
-3. activated root `/topic` lists unlinked sessions.
-
-### PR 4 — System lobby behavior
-
-**Goal:** Prevent root chat from entering agent loop after activation.
-
-**Tests first:**
-
-1. normal text in activated root returns lobby instruction;
-2. `/new` in activated root returns `+` button instruction;
-3. non-activated root behavior is unchanged.
-
-### PR 5 — Auto-bind user-created topics
-
-**Goal:** First message in non-main topic creates/uses an independent session lane.
-
-**Tests first:**
-
-1. new topic message creates binding with `auto_created`;
-2. repeated topic message reuses same binding/lane;
-3. two topics in same DM do not share sessions.
-
-### PR 6 — Restore legacy sessions into a topic
-
-**Goal:** Implement `/topic <session_id>` in non-main topics.
-
-**Tests first:**
-
-1. root `/topic <id>` rejects with instructions;
-2. topic `/topic <id>` switches current topic lane to target session;
-3. restore rejects sessions from other users/chats;
-4. restore rejects already-linked sessions;
-5. restore emits confirmation and last Hermes assistant message.
-
-### PR 7 — `/new` inside topic updates binding
-
-**Goal:** Keep existing `/new` semantics but persist topic binding replacement.
-
-**Tests first:**
-
-1. `/new` in topic creates a new session for same topic lane;
-2. binding updates to `managed_mode = new_replaced`;
-3. response includes guidance to use `+` for parallel work.
-
-### PR 8 — Docs and polish
-
-**Goal:** Document the feature and Telegram setup.
-
-**Files likely touched:**
-
- `website/docs/user-guide/messaging/telegram.md`
- maybe `website/docs/user-guide/sessions.md`
-
-Docs must explain:
-
- BotFather/Telegram settings for topic mode and user-created topics;
- `/topic` activation;
- root system lobby;
- using `+` for new parallel chats;
- restoring old sessions with `/topic <id>` inside a topic;
- limitations.
-
---
-
-## 7. Testing / quality gates
-
-Run targeted tests after each TDD cycle, then broader tests before completion.
-
-Suggested commands after inspection confirms test paths:
-
-```bash
-python -m pytest tests/test_hermes_state.py -q
-python -m pytest tests/gateway/ -q
-python -m pytest tests/ -o 'addopts=' -q
-```
-
-Do not ship without verifying disabled-feature backwards compatibility.
-
---
-
-## 8. Definition of done for MVP
-
- `/topic` activates/checks Telegram DM multi-session mode.
- Root DM becomes a system lobby after activation.
- Onboarding message tells users to create new chats with the Telegram `+` button.
- Onboarding message can be pinned in private chat.
- User-created topics automatically become independent Hermes session lanes.
- `/new` in root gives instructions, not a new agent run.
- `/new` in a topic creates a new session in that topic and warns that `+` is preferred for parallel work.
- `/topic` in root lists unlinked old sessions.
- `/topic <session_id>` inside a topic restores that session and sends confirmation + last Hermes assistant message.
- Ownership checks prevent restoring other users' sessions.
- Already-linked sessions are not restored into a second topic in MVP.
- Existing Telegram behavior is unchanged when the feature is disabled.
- Tests and docs are included.
@@ -186,24 +186,18 @@ class HomeChannel:
    Default destination for a platform.
    
    When a cron job specifies deliver="telegram" without a specific chat ID,
-    messages are sent to this home channel. Thread-aware platforms may also
-    store a thread/topic ID so the bare platform target routes to the exact
-    conversation where /sethome was run.
+    messages are sent to this home channel.
    """
    platform: Platform
    chat_id: str
    name: str  # Human-readable name for display
-    thread_id: Optional[str] = None
    
    def to_dict(self) -> Dict[str, Any]:
-        result = {
+        return {
            "platform": self.platform.value,
            "chat_id": self.chat_id,
            "name": self.name,
        }
-        if self.thread_id:
-            result["thread_id"] = self.thread_id
-        return result
    
    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "HomeChannel":
@@ -211,7 +205,6 @@ class HomeChannel:
            platform=Platform(data["platform"]),
            chat_id=str(data["chat_id"]),
            name=data.get("name", "Home"),
-            thread_id=str(data["thread_id"]) if data.get("thread_id") else None,
        )


@@ -845,36 +838,12 @@ def load_gateway_config() -> GatewayConfig:
                    ):
                        if yaml_key in allow_mentions_cfg and not os.getenv(env_key):
                            os.environ[env_key] = str(allow_mentions_cfg[yaml_key]).lower()
-                # reply_to_mode: top-level preferred, falls back to extra.reply_to_mode
-                # YAML 1.1 parses bare 'off' as boolean False — coerce to string "off".
-                _discord_extra = discord_cfg.get("extra") if isinstance(discord_cfg.get("extra"), dict) else {}
-                _discord_rtm = (
-                    discord_cfg["reply_to_mode"] if "reply_to_mode" in discord_cfg
-                    else _discord_extra.get("reply_to_mode")
-                )
-                if _discord_rtm is not None and not os.getenv("DISCORD_REPLY_TO_MODE"):
-                    _rtm_str = "off" if _discord_rtm is False else str(_discord_rtm).lower()
-                    os.environ["DISCORD_REPLY_TO_MODE"] = _rtm_str
-
-            # Bridge top-level require_mention to Telegram when the telegram: section
-            # does not already provide one.  Users often write "require_mention: true"
-            # at the top level alongside group_sessions_per_user, expecting it to work
-            # the same way (#3979).
-            _tl_require_mention = yaml_cfg.get("require_mention")
-            if _tl_require_mention is not None:
-                _tg_section = yaml_cfg.get("telegram") or {}
-                if "require_mention" not in _tg_section:
-                    _tg_plat = platforms_data.setdefault(Platform.TELEGRAM.value, {})
-                    _tg_extra = _tg_plat.setdefault("extra", {})
-                    _tg_extra.setdefault("require_mention", _tl_require_mention)

            # Telegram settings → env vars (env vars take precedence)
            telegram_cfg = yaml_cfg.get("telegram", {})
            if isinstance(telegram_cfg, dict):
-                # Prefer telegram.require_mention; fall back to the top-level shorthand.
-                _effective_rm = telegram_cfg.get("require_mention", yaml_cfg.get("require_mention"))
-                if _effective_rm is not None and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
-                    os.environ["TELEGRAM_REQUIRE_MENTION"] = str(_effective_rm).lower()
+                if "require_mention" in telegram_cfg and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
+                    os.environ["TELEGRAM_REQUIRE_MENTION"] = str(telegram_cfg["require_mention"]).lower()
                if "mention_patterns" in telegram_cfg and not os.getenv("TELEGRAM_MENTION_PATTERNS"):
                    os.environ["TELEGRAM_MENTION_PATTERNS"] = json.dumps(telegram_cfg["mention_patterns"])
                frc = telegram_cfg.get("free_response_chats")
@@ -891,16 +860,6 @@ def load_gateway_config() -> GatewayConfig:
                    os.environ["TELEGRAM_REACTIONS"] = str(telegram_cfg["reactions"]).lower()
                if "proxy_url" in telegram_cfg and not os.getenv("TELEGRAM_PROXY"):
                    os.environ["TELEGRAM_PROXY"] = str(telegram_cfg["proxy_url"]).strip()
-                # reply_to_mode: top-level preferred, falls back to extra.reply_to_mode
-                # YAML 1.1 parses bare 'off' as boolean False — coerce to string "off".
-                _telegram_extra = telegram_cfg.get("extra") if isinstance(telegram_cfg.get("extra"), dict) else {}
-                _telegram_rtm = (
-                    telegram_cfg["reply_to_mode"] if "reply_to_mode" in telegram_cfg
-                    else _telegram_extra.get("reply_to_mode")
-                )
-                if _telegram_rtm is not None and not os.getenv("TELEGRAM_REPLY_TO_MODE"):
-                    _rtm_str = "off" if _telegram_rtm is False else str(_telegram_rtm).lower()
-                    os.environ["TELEGRAM_REPLY_TO_MODE"] = _rtm_str
                allowed_users = telegram_cfg.get("allow_from")
                if allowed_users is not None and not os.getenv("TELEGRAM_ALLOWED_USERS"):
                    if isinstance(allowed_users, list):
@@ -1112,7 +1071,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.TELEGRAM,
            chat_id=telegram_home,
            name=os.getenv("TELEGRAM_HOME_CHANNEL_NAME", "Home"),
-            thread_id=os.getenv("TELEGRAM_HOME_CHANNEL_THREAD_ID") or None,
        )
    
    # Discord
@@ -1129,7 +1087,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.DISCORD,
            chat_id=discord_home,
            name=os.getenv("DISCORD_HOME_CHANNEL_NAME", "Home"),
-            thread_id=os.getenv("DISCORD_HOME_CHANNEL_THREAD_ID") or None,
        )
    
    # Reply threading mode for Discord (off/first/all)
@@ -1151,7 +1108,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.WHATSAPP,
            chat_id=whatsapp_home,
            name=os.getenv("WHATSAPP_HOME_CHANNEL_NAME", "Home"),
-            thread_id=os.getenv("WHATSAPP_HOME_CHANNEL_THREAD_ID") or None,
        )

    # Slack
@@ -1179,7 +1135,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.SLACK,
            chat_id=slack_home,
            name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
-            thread_id=os.getenv("SLACK_HOME_CHANNEL_THREAD_ID") or None,
        )
    
    # Signal
@@ -1200,7 +1155,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.SIGNAL,
            chat_id=signal_home,
            name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
-            thread_id=os.getenv("SIGNAL_HOME_CHANNEL_THREAD_ID") or None,
        )

    # Mattermost
@@ -1220,7 +1174,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.MATTERMOST,
            chat_id=mattermost_home,
            name=os.getenv("MATTERMOST_HOME_CHANNEL_NAME", "Home"),
-            thread_id=os.getenv("MATTERMOST_HOME_CHANNEL_THREAD_ID") or None,
        )

    # Matrix
@@ -1252,7 +1205,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.MATRIX,
            chat_id=matrix_home,
            name=os.getenv("MATRIX_HOME_ROOM_NAME", "Home"),
-            thread_id=os.getenv("MATRIX_HOME_ROOM_THREAD_ID") or None,
        )

    # Home Assistant
@@ -1286,7 +1238,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.EMAIL,
            chat_id=email_home,
            name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
-            thread_id=os.getenv("EMAIL_HOME_ADDRESS_THREAD_ID") or None,
        )

    # SMS (Twilio)
@@ -1302,7 +1253,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.SMS,
            chat_id=sms_home,
            name=os.getenv("SMS_HOME_CHANNEL_NAME", "Home"),
-            thread_id=os.getenv("SMS_HOME_CHANNEL_THREAD_ID") or None,
        )

    # API Server
@@ -1365,7 +1315,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
                platform=Platform.DINGTALK,
                chat_id=dingtalk_home,
                name=os.getenv("DINGTALK_HOME_CHANNEL_NAME", "Home"),
-                thread_id=os.getenv("DINGTALK_HOME_CHANNEL_THREAD_ID") or None,
            )

    # Feishu / Lark
@@ -1393,7 +1342,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
                platform=Platform.FEISHU,
                chat_id=feishu_home,
                name=os.getenv("FEISHU_HOME_CHANNEL_NAME", "Home"),
-                thread_id=os.getenv("FEISHU_HOME_CHANNEL_THREAD_ID") or None,
            )

    # WeCom (Enterprise WeChat)
@@ -1416,7 +1364,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
                platform=Platform.WECOM,
                chat_id=wecom_home,
                name=os.getenv("WECOM_HOME_CHANNEL_NAME", "Home"),
-                thread_id=os.getenv("WECOM_HOME_CHANNEL_THREAD_ID") or None,
            )

    # WeCom callback mode (self-built apps)
@@ -1475,7 +1422,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
                platform=Platform.WEIXIN,
                chat_id=weixin_home,
                name=os.getenv("WEIXIN_HOME_CHANNEL_NAME", "Home"),
-                thread_id=os.getenv("WEIXIN_HOME_CHANNEL_THREAD_ID") or None,
            )

    # BlueBubbles (iMessage)
@@ -1499,7 +1445,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.BLUEBUBBLES,
            chat_id=bluebubbles_home,
            name=os.getenv("BLUEBUBBLES_HOME_CHANNEL_NAME", "Home"),
-            thread_id=os.getenv("BLUEBUBBLES_HOME_CHANNEL_THREAD_ID") or None,
        )

    # QQ (Official Bot API v2)
@@ -1537,11 +1482,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
                platform=Platform.QQBOT,
                chat_id=qq_home,
                name=os.getenv("QQBOT_HOME_CHANNEL_NAME") or os.getenv(qq_home_name_env, "Home"),
-                thread_id=(
-                    os.getenv("QQBOT_HOME_CHANNEL_THREAD_ID")
-                    or os.getenv("QQ_HOME_CHANNEL_THREAD_ID")
-                    or None
-                ),
            )

    # Yuanbao — YUANBAO_APP_ID preferred
@@ -1572,7 +1512,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
                platform=Platform.YUANBAO,
                chat_id=yuanbao_home,
                name=os.getenv("YUANBAO_HOME_CHANNEL_NAME", "Home"),
-                thread_id=os.getenv("YUANBAO_HOME_CHANNEL_THREAD_ID") or None,
            )
        yuanbao_dm_policy = os.getenv("YUANBAO_DM_POLICY")
        if yuanbao_dm_policy:
@@ -1,84 +0,0 @@
-"""Shared HTTP client factory for long-lived platform adapters.
-
-Gateway messaging platforms (QQ Bot, Feishu, WeCom, DingTalk, Signal,
-BlueBubbles, WeCom-callback) keep a persistent ``httpx.AsyncClient``
-alive for the adapter's lifetime.  That amortises TLS/connection setup
-across many API calls, but it also means the process's file-descriptor
-pressure is sensitive to how aggressively the pool recycles idle keep-
-alive connections.
-
-httpx's default ``keepalive_expiry`` is 5 seconds.  On macOS behind
-Cloudflare Warp (and other transparent proxies), peer-initiated FIN can
-sit in ``CLOSE_WAIT`` longer than that before the local socket actually
-drains — which, multiplied across 7 long-lived adapters plus the LLM
-client and MCP clients, walks straight into the default 256 fd limit.
-See #18451.
-
-``platform_httpx_limits()`` returns a tighter ``httpx.Limits`` the
-adapter factories use instead of the httpx default.  The values chosen:
-
-* ``max_keepalive_connections=10`` — plenty for any single adapter;
-  platform APIs rarely parallelise beyond this.
-* ``keepalive_expiry=2.0`` — close idle sockets aggressively so a
-  proxy's lingering CLOSE_WAIT window can't starve the process.
-
-Override via ``HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY`` /
-``HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE`` env vars when tuning under load.
-"""
-
-from __future__ import annotations
-
-import os
-
-try:
-    import httpx
-except ImportError:  # pragma: no cover — optional dep
-    httpx = None  # type: ignore[assignment]
-
-
-_DEFAULT_KEEPALIVE_EXPIRY_S = 2.0
-_DEFAULT_MAX_KEEPALIVE = 10
-
-
-def platform_httpx_limits() -> "httpx.Limits | None":
-    """Return ``httpx.Limits`` tuned for persistent platform-adapter clients.
-
-    Returns ``None`` when httpx isn't importable, so callers can fall
-    back to httpx's built-in default without a hard dependency on this
-    helper being reachable.
-    """
-    if httpx is None:
-        return None
-
-    def _env_float(name: str, default: float) -> float:
-        raw = os.environ.get(name, "").strip()
-        if not raw:
-            return default
-        try:
-            val = float(raw)
-        except (TypeError, ValueError):
-            return default
-        return val if val > 0 else default
-
-    def _env_int(name: str, default: int) -> int:
-        raw = os.environ.get(name, "").strip()
-        if not raw:
-            return default
-        try:
-            val = int(raw)
-        except (TypeError, ValueError):
-            return default
-        return val if val > 0 else default
-
-    keepalive_expiry = _env_float(
-        "HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY", _DEFAULT_KEEPALIVE_EXPIRY_S
-    )
-    max_keepalive = _env_int(
-        "HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE", _DEFAULT_MAX_KEEPALIVE
-    )
-
-    return httpx.Limits(
-        max_keepalive_connections=max_keepalive,
-        # Leave max_connections at httpx default (100) — plenty of headroom.
-        keepalive_expiry=keepalive_expiry,
-    )
@@ -2,8 +2,8 @@
 OpenAI-compatible API server platform adapter.

 Exposes an HTTP server with endpoints:
- POST /v1/chat/completions        — OpenAI Chat Completions format (stateless; opt-in session continuity via X-Hermes-Session-Id header; opt-in long-term memory scoping via X-Hermes-Session-Key header)
- POST /v1/responses               — OpenAI Responses API format (stateful via previous_response_id; X-Hermes-Session-Key supported)
+- POST /v1/chat/completions        — OpenAI Chat Completions format (stateless; opt-in session continuity via X-Hermes-Session-Id header)
+- POST /v1/responses               — OpenAI Responses API format (stateful via previous_response_id)
 - GET  /v1/responses/{response_id} — Retrieve a stored response
 - DELETE /v1/responses/{response_id} — Delete a stored response
 - GET  /v1/models                  — lists hermes-agent as an available model
@@ -62,14 +62,6 @@ MAX_NORMALIZED_TEXT_LENGTH = 65_536  # 64 KB cap for normalized content parts
 MAX_CONTENT_LIST_SIZE = 1_000  # Max items when content is an array


-def _coerce_port(value: Any, default: int = DEFAULT_PORT) -> int:
-    """Parse a listen port without letting malformed env/config values crash startup."""
-    try:
-        return int(value)
-    except (TypeError, ValueError):
-        return default
-
-
 def _normalize_chat_content(
    content: Any, *, _max_depth: int = 10, _depth: int = 0,
 ) -> str:
@@ -581,10 +573,7 @@ class APIServerAdapter(BasePlatformAdapter):
        super().__init__(config, Platform.API_SERVER)
        extra = config.extra or {}
        self._host: str = extra.get("host", os.getenv("API_SERVER_HOST", DEFAULT_HOST))
-        raw_port = extra.get("port")
-        if raw_port is None:
-            raw_port = os.getenv("API_SERVER_PORT", str(DEFAULT_PORT))
-        self._port: int = _coerce_port(raw_port, DEFAULT_PORT)
+        self._port: int = int(extra.get("port", os.getenv("API_SERVER_PORT", str(DEFAULT_PORT))))
        self._api_key: str = extra.get("key", os.getenv("API_SERVER_KEY", ""))
        self._cors_origins: tuple[str, ...] = self._parse_cors_origins(
            extra.get("cors_origins", os.getenv("API_SERVER_CORS_ORIGINS", "")),
@@ -698,71 +687,6 @@ class APIServerAdapter(BasePlatformAdapter):
            status=401,
        )

-    # ------------------------------------------------------------------
-    # Session header helpers
-    # ------------------------------------------------------------------
-
-    # Soft length cap for session identifiers.  Headers are bounded in
-    # aggregate by aiohttp (``client_max_size`` / default 8 KiB per
-    # header), but we impose a tighter limit on the session headers so a
-    # caller can't burn memory by passing a multi-kilobyte "session key".
-    # 256 chars is well above any realistic stable channel identifier
-    # (e.g. ``agent:main:webui:dm:user-42``) while staying small enough
-    # that the sanitized form is safe to pass into Honcho / state.db.
-    _MAX_SESSION_HEADER_LEN = 256
-
-    def _parse_session_key_header(
-        self, request: "web.Request"
-    ) -> tuple[Optional[str], Optional["web.Response"]]:
-        """Extract and validate the ``X-Hermes-Session-Key`` header.
-
-        The session key is a stable per-channel identifier that scopes
-        long-term memory (e.g. Honcho sessions) across transcripts.  It
-        is independent of ``X-Hermes-Session-Id``: callers may send
-        either, both, or neither.
-
-        Returns ``(session_key, None)`` on success (with an empty/absent
-        header yielding ``None`` for the key), or ``(None, error_response)``
-        on validation failure.
-
-        Security: like session continuation, accepting a caller-supplied
-        memory scope requires API-key authentication so that an
-        unauthenticated client on a local-only server can't inject itself
-        into another user's long-term memory scope by guessing a key.
-        """
-        raw = request.headers.get("X-Hermes-Session-Key", "").strip()
-        if not raw:
-            return None, None
-
-        if not self._api_key:
-            logger.warning(
-                "X-Hermes-Session-Key rejected: no API key configured. "
-                "Set API_SERVER_KEY to enable long-term memory scoping."
-            )
-            return None, web.json_response(
-                _openai_error(
-                    "X-Hermes-Session-Key requires API key authentication. "
-                    "Configure API_SERVER_KEY to enable this feature."
-                ),
-                status=403,
-            )
-
-        # Reject control characters that could enable header injection on
-        # the echo path.
-        if re.search(r'[\r\n\x00]', raw):
-            return None, web.json_response(
-                {"error": {"message": "Invalid session key", "type": "invalid_request_error"}},
-                status=400,
-            )
-
-        if len(raw) > self._MAX_SESSION_HEADER_LEN:
-            return None, web.json_response(
-                {"error": {"message": "Session key too long", "type": "invalid_request_error"}},
-                status=400,
-            )
-
-        return raw, None
-
    # ------------------------------------------------------------------
    # Session DB helper
    # ------------------------------------------------------------------
@@ -793,7 +717,6 @@ class APIServerAdapter(BasePlatformAdapter):
        tool_progress_callback=None,
        tool_start_callback=None,
        tool_complete_callback=None,
-        gateway_session_key: Optional[str] = None,
    ) -> Any:
        """
        Create an AIAgent instance using the gateway's runtime config.
@@ -802,20 +725,12 @@ class APIServerAdapter(BasePlatformAdapter):
        base_url, etc. from config.yaml / env vars.  Toolsets are resolved
        from config.yaml platform_toolsets.api_server (same as all other
        gateway platforms), falling back to the hermes-api-server default.
-
-        ``gateway_session_key`` is a stable per-channel identifier supplied
-        by the client (via ``X-Hermes-Session-Key``).  Unlike ``session_id``
-        which scopes the short-term transcript and rotates on /new, this
-        key is meant to persist across transcripts so long-term memory
-        providers (e.g. Honcho) can scope their per-chat state correctly
-        — matching the semantics of the native gateway's ``session_key``.
        """
        from run_agent import AIAgent
-        from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config, GatewayRunner
+        from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config
        from hermes_cli.tools_config import _get_platform_tools

        runtime_kwargs = _resolve_runtime_agent_kwargs()
-        reasoning_config = GatewayRunner._load_reasoning_config()
        model = _resolve_gateway_model()

        user_config = _load_gateway_config()
@@ -825,6 +740,7 @@ class APIServerAdapter(BasePlatformAdapter):

        # Load fallback provider chain so the API server platform has the
        # same fallback behaviour as Telegram/Discord/Slack (fixes #4954).
+        from gateway.run import GatewayRunner
        fallback_model = GatewayRunner._load_fallback_model()

        agent = AIAgent(
@@ -843,8 +759,6 @@ class APIServerAdapter(BasePlatformAdapter):
            tool_complete_callback=tool_complete_callback,
            session_db=self._ensure_session_db(),
            fallback_model=fallback_model,
-            reasoning_config=reasoning_config,
-            gateway_session_key=gateway_session_key,
        )
        return agent

@@ -928,7 +842,6 @@ class APIServerAdapter(BasePlatformAdapter):
                "run_stop": True,
                "tool_progress_events": True,
                "session_continuity_header": "X-Hermes-Session-Id",
-                "session_key_header": "X-Hermes-Session-Key",
                "cors": bool(self._cors_origins),
            },
            "endpoints": {
@@ -1000,15 +913,6 @@ class APIServerAdapter(BasePlatformAdapter):
                status=400,
            )

-        # Allow caller to scope long-term memory (e.g. Honcho) with a
-        # stable per-channel identifier via X-Hermes-Session-Key.  This
-        # is independent of X-Hermes-Session-Id: the key persists across
-        # transcripts while the id rotates when the caller starts a new
-        # transcript (i.e. /new semantics).  See _parse_session_key_header.
-        gateway_session_key, key_err = self._parse_session_key_header(request)
-        if key_err is not None:
-            return key_err
-
        # Allow caller to continue an existing session by passing X-Hermes-Session-Id.
        # When provided, history is loaded from state.db instead of from the request body.
        #
@@ -1143,13 +1047,11 @@ class APIServerAdapter(BasePlatformAdapter):
                tool_start_callback=_on_tool_start,
                tool_complete_callback=_on_tool_complete,
                agent_ref=agent_ref,
-                gateway_session_key=gateway_session_key,
            ))

            return await self._write_sse_chat_completion(
                request, completion_id, model_name, created, _stream_q,
                agent_task, agent_ref, session_id=session_id,
-                gateway_session_key=gateway_session_key,
            )

        # Non-streaming: run the agent (with optional Idempotency-Key)
@@ -1159,7 +1061,6 @@ class APIServerAdapter(BasePlatformAdapter):
                conversation_history=history,
                ephemeral_system_prompt=system_prompt,
                session_id=session_id,
-                gateway_session_key=gateway_session_key,
            )

        idempotency_key = request.headers.get("Idempotency-Key")
@@ -1209,17 +1110,11 @@ class APIServerAdapter(BasePlatformAdapter):
            },
        }

-        response_headers = {
-            "X-Hermes-Session-Id": result.get("session_id", session_id),
-        }
-        if gateway_session_key:
-            response_headers["X-Hermes-Session-Key"] = gateway_session_key
-        return web.json_response(response_data, headers=response_headers)
+        return web.json_response(response_data, headers={"X-Hermes-Session-Id": session_id})

    async def _write_sse_chat_completion(
        self, request: "web.Request", completion_id: str, model: str,
        created: int, stream_q, agent_task, agent_ref=None, session_id: str = None,
-        gateway_session_key: str = None,
    ) -> "web.StreamResponse":
        """Write real streaming SSE from agent's stream_delta_callback queue.

@@ -1242,8 +1137,6 @@ class APIServerAdapter(BasePlatformAdapter):
            sse_headers.update(cors)
        if session_id:
            sse_headers["X-Hermes-Session-Id"] = session_id
-        if gateway_session_key:
-            sse_headers["X-Hermes-Session-Key"] = gateway_session_key
        response = web.StreamResponse(status=200, headers=sse_headers)
        await response.prepare(request)

@@ -1367,7 +1260,6 @@ class APIServerAdapter(BasePlatformAdapter):
        conversation: Optional[str],
        store: bool,
        session_id: str,
-        gateway_session_key: Optional[str] = None,
    ) -> "web.StreamResponse":
        """Write an SSE stream for POST /v1/responses (OpenAI Responses API).

@@ -1410,8 +1302,6 @@ class APIServerAdapter(BasePlatformAdapter):
            sse_headers.update(cors)
        if session_id:
            sse_headers["X-Hermes-Session-Id"] = session_id
-        if gateway_session_key:
-            sse_headers["X-Hermes-Session-Key"] = gateway_session_key
        response = web.StreamResponse(status=200, headers=sse_headers)
        await response.prepare(request)

@@ -1861,11 +1751,6 @@ class APIServerAdapter(BasePlatformAdapter):
        if auth_err:
            return auth_err

-        # Long-term memory scope header (see chat_completions for details).
-        gateway_session_key, key_err = self._parse_session_key_header(request)
-        if key_err is not None:
-            return key_err
-
        # Parse request body
        try:
            body = await request.json()
@@ -2017,7 +1902,6 @@ class APIServerAdapter(BasePlatformAdapter):
                tool_start_callback=_on_tool_start,
                tool_complete_callback=_on_tool_complete,
                agent_ref=agent_ref,
-                gateway_session_key=gateway_session_key,
            ))

            response_id = f"resp_{uuid.uuid4().hex[:28]}"
@@ -2038,7 +1922,6 @@ class APIServerAdapter(BasePlatformAdapter):
                conversation=conversation,
                store=store,
                session_id=session_id,
-                gateway_session_key=gateway_session_key,
            )

        async def _compute_response():
@@ -2047,7 +1930,6 @@ class APIServerAdapter(BasePlatformAdapter):
                conversation_history=conversation_history,
                ephemeral_system_prompt=instructions,
                session_id=session_id,
-                gateway_session_key=gateway_session_key,
            )

        idempotency_key = request.headers.get("Idempotency-Key")
@@ -2122,10 +2004,7 @@ class APIServerAdapter(BasePlatformAdapter):
            if conversation:
                self._response_store.set_conversation(conversation, response_id)

-        response_headers = {"X-Hermes-Session-Id": session_id}
-        if gateway_session_key:
-            response_headers["X-Hermes-Session-Key"] = gateway_session_key
-        return web.json_response(response_data, headers=response_headers)
+        return web.json_response(response_data)

    # ------------------------------------------------------------------
    # GET / DELETE response endpoints
@@ -2447,7 +2326,6 @@ class APIServerAdapter(BasePlatformAdapter):
        tool_start_callback=None,
        tool_complete_callback=None,
        agent_ref: Optional[list] = None,
-        gateway_session_key: Optional[str] = None,
    ) -> tuple:
        """
        Create an agent and run a conversation in a thread executor.
@@ -2470,7 +2348,6 @@ class APIServerAdapter(BasePlatformAdapter):
                tool_progress_callback=tool_progress_callback,
                tool_start_callback=tool_start_callback,
                tool_complete_callback=tool_complete_callback,
-                gateway_session_key=gateway_session_key,
            )
            if agent_ref is not None:
                agent_ref[0] = agent
@@ -2485,12 +2362,6 @@ class APIServerAdapter(BasePlatformAdapter):
                "output_tokens": getattr(agent, "session_completion_tokens", 0) or 0,
                "total_tokens": getattr(agent, "session_total_tokens", 0) or 0,
            }
-            # Include the effective session ID in the result so callers
-            # (e.g. X-Hermes-Session-Id header) can track compression-
-            # triggered session rotations. (#16938)
-            _eff_sid = getattr(agent, "session_id", session_id)
-            if isinstance(_eff_sid, str) and _eff_sid:
-                result["session_id"] = _eff_sid
            return result, usage

        return await loop.run_in_executor(None, _run)
@@ -2570,11 +2441,6 @@ class APIServerAdapter(BasePlatformAdapter):
        if auth_err:
            return auth_err

-        # Long-term memory scope header (see chat_completions for details).
-        gateway_session_key, key_err = self._parse_session_key_header(request)
-        if key_err is not None:
-            return key_err
-
        # Enforce concurrency limit
        if len(self._run_streams) >= self._MAX_CONCURRENT_RUNS:
            return web.json_response(
@@ -2683,7 +2549,6 @@ class APIServerAdapter(BasePlatformAdapter):
                    session_id=session_id,
                    stream_delta_callback=_text_cb,
                    tool_progress_callback=event_cb,
-                    gateway_session_key=gateway_session_key,
                )
                self._active_run_agents[run_id] = agent
                def _run_sync():
@@ -2701,39 +2566,21 @@ class APIServerAdapter(BasePlatformAdapter):
                    return r, u

                result, usage = await asyncio.get_running_loop().run_in_executor(None, _run_sync)
-                # Check for structured failure (non-retryable client errors like
-                # 401/400 return failed=True instead of raising, so the except
-                # block below never fires — issue #15561).
-                if isinstance(result, dict) and result.get("failed"):
-                    error_msg = result.get("error") or "agent run failed"
-                    q.put_nowait({
-                        "event": "run.failed",
-                        "run_id": run_id,
-                        "timestamp": time.time(),
-                        "error": error_msg,
-                    })
-                    self._set_run_status(
-                        run_id,
-                        "failed",
-                        error=error_msg,
-                        last_event="run.failed",
-                    )
-                else:
-                    final_response = result.get("final_response", "") if isinstance(result, dict) else ""
-                    q.put_nowait({
-                        "event": "run.completed",
-                        "run_id": run_id,
-                        "timestamp": time.time(),
-                        "output": final_response,
-                        "usage": usage,
-                    })
-                    self._set_run_status(
-                        run_id,
-                        "completed",
-                        output=final_response,
-                        usage=usage,
-                        last_event="run.completed",
-                    )
+                final_response = result.get("final_response", "") if isinstance(result, dict) else ""
+                q.put_nowait({
+                    "event": "run.completed",
+                    "run_id": run_id,
+                    "timestamp": time.time(),
+                    "output": final_response,
+                    "usage": usage,
+                })
+                self._set_run_status(
+                    run_id,
+                    "completed",
+                    output=final_response,
+                    usage=usage,
+                    last_event="run.completed",
+                )
            except asyncio.CancelledError:
                self._set_run_status(
                    run_id,
@@ -2784,14 +2631,7 @@ class APIServerAdapter(BasePlatformAdapter):
        if hasattr(task, "add_done_callback"):
            task.add_done_callback(self._background_tasks.discard)

-        response_headers = (
-            {"X-Hermes-Session-Key": gateway_session_key} if gateway_session_key else {}
-        )
-        return web.json_response(
-            {"run_id": run_id, "status": "started"},
-            status=202,
-            headers=response_headers,
-        )
+        return web.json_response({"run_id": run_id, "status": "started"}, status=202)

    async def _handle_get_run(self, request: "web.Request") -> "web.Response":
        """GET /v1/runs/{run_id} — return pollable run status for external UIs."""
@@ -2489,30 +2489,19 @@ class BasePlatformAdapter(ABC):

        try:
            response = await self._message_handler(event)
+            # Old adapter task (if any) is cancelled AFTER the runner has
+            # fully handled the command — keeps ordering deterministic.
+            await self.cancel_session_processing(
+                session_key,
+                release_guard=False,
+                discard_pending=False,
+            )
            _text, _eph_ttl = self._unwrap_ephemeral(response)
-            # Send the response BEFORE cancelling the old task so the send
-            # cannot be affected by task-cancellation side effects (race
-            # condition fix — issue #18912).  Previously the send happened
-            # after cancel_session_processing, which could silently drop the
-            # "/new" confirmation when an agent was actively running.
            if _text:
-                logger.info(
-                    "[%s] Sending command '/%s' response (%d chars) to %s",
-                    self.name,
-                    cmd,
-                    len(_text),
-                    event.source.chat_id,
-                )
                _r = await self._send_with_retry(
                    chat_id=event.source.chat_id,
                    content=_text,
-                    reply_to=(
-                        event.reply_to_message_id
-                        if event.source.platform == Platform.FEISHU
-                        and event.source.thread_id
-                        and event.reply_to_message_id
-                        else event.message_id
-                    ),
+                    reply_to=event.message_id,
                    metadata=thread_meta,
                )
                if _eph_ttl > 0 and _r.success and _r.message_id:
@@ -2521,13 +2510,6 @@ class BasePlatformAdapter(ABC):
                        message_id=_r.message_id,
                        ttl_seconds=_eph_ttl,
                    )
-            # Old adapter task (if any) is cancelled AFTER the response has
-            # been sent — keeps ordering deterministic and avoids the race.
-            await self.cancel_session_processing(
-                session_key,
-                release_guard=False,
-                discard_pending=False,
-            )
        except Exception:
            # On failure, restore the original guard if one still exists so
            # we don't leave the session in a half-reset state.
@@ -2612,13 +2594,7 @@ class BasePlatformAdapter(ABC):
                        _r = await self._send_with_retry(
                            chat_id=event.source.chat_id,
                            content=_text,
-                            reply_to=(
-                                event.reply_to_message_id
-                                if event.source.platform == Platform.FEISHU
-                                and event.source.thread_id
-                                and event.reply_to_message_id
-                                else event.message_id
-                            ),
+                            reply_to=event.message_id,
                            metadata=_thread_meta,
                        )
                        if _eph_ttl > 0 and _r.success and _r.message_id:
@@ -2675,18 +2651,10 @@ class BasePlatformAdapter(ABC):
        mode = os.getenv("HERMES_HUMAN_DELAY_MODE", "off").lower()
        if mode == "off":
            return 0.0
+        min_ms = int(os.getenv("HERMES_HUMAN_DELAY_MIN_MS", "800"))
+        max_ms = int(os.getenv("HERMES_HUMAN_DELAY_MAX_MS", "2500"))
        if mode == "natural":
            min_ms, max_ms = 800, 2500
-            return random.uniform(min_ms / 1000.0, max_ms / 1000.0)
-        # custom mode — tolerate malformed env vars instead of crashing.
-        try:
-            min_ms = int(os.getenv("HERMES_HUMAN_DELAY_MIN_MS", "800"))
-        except (TypeError, ValueError):
-            min_ms = 800
-        try:
-            max_ms = int(os.getenv("HERMES_HUMAN_DELAY_MAX_MS", "2500"))
-        except (TypeError, ValueError):
-            max_ms = 2500
        return random.uniform(min_ms / 1000.0, max_ms / 1000.0)

    async def _process_message_background(self, event: MessageEvent, session_key: str) -> None:
@@ -2830,15 +2798,10 @@ class BasePlatformAdapter(ABC):
                # Send the text portion
                if text_content:
                    logger.info("[%s] Sending response (%d chars) to %s", self.name, len(text_content), event.source.chat_id)
-                    _reply_anchor = (
-                        event.reply_to_message_id
-                        if event.source.platform == Platform.FEISHU and event.source.thread_id and event.reply_to_message_id
-                        else event.message_id
-                    )
                    result = await self._send_with_retry(
                        chat_id=event.source.chat_id,
                        content=text_content,
-                        reply_to=_reply_anchor,
+                        reply_to=event.message_id,
                        metadata=_thread_metadata,
                    )
                    _record_delivery(result)
@@ -162,9 +162,7 @@ class BlueBubblesAdapter(BasePlatformAdapter):
            return False
        from aiohttp import web

-        # Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
-        from gateway.platforms._http_client_limits import platform_httpx_limits
-        self.client = httpx.AsyncClient(timeout=30.0, limits=platform_httpx_limits())
+        self.client = httpx.AsyncClient(timeout=30.0)
        try:
            await self._api_get("/api/v1/ping")
            info = await self._api_get("/api/v1/server/info")
@@ -228,11 +228,7 @@ class DingTalkAdapter(BasePlatformAdapter):
            return False

        try:
-            # Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
-            from gateway.platforms._http_client_limits import platform_httpx_limits
-            self._http_client = httpx.AsyncClient(
-                timeout=30.0, limits=platform_httpx_limits(),
-            )
+            self._http_client = httpx.AsyncClient(timeout=30.0)

            credential = dingtalk_stream.Credential(
                self._client_id, self._client_secret
@@ -497,7 +497,6 @@ class DiscordAdapter(BasePlatformAdapter):
        self._ready_event = asyncio.Event()
        self._allowed_user_ids: set = set()  # For button approval authorization
        self._allowed_role_ids: set = set()  # For DISCORD_ALLOWED_ROLES filtering
-        self.gateway_runner = None  # Set by gateway/run.py for cross-platform delivery
        # Voice channel state (per-guild)
        self._voice_clients: Dict[int, Any] = {}  # guild_id -> VoiceClient
        self._voice_locks: Dict[int, asyncio.Lock] = {}  # guild_id -> serialize join/leave
@@ -614,21 +613,6 @@ class DiscordAdapter(BasePlatformAdapter):
            # so LLM output or echoed user content can't ping the whole
            # server; override per DISCORD_ALLOW_MENTION_* env vars or the
            # discord.allow_mentions.* block in config.yaml.
-
-            # Close any existing client to prevent zombie websocket connections
-            # on reconnect (see #18187). Without this, the old client remains
-            # connected to Discord gateway and both fire on_message, causing
-            # double responses.
-            if self._client is not None:
-                try:
-                    if not self._client.is_closed():
-                        await self._client.close()
-                except Exception:
-                    logger.debug("[%s] Failed to close previous Discord client", self.name)
-                finally:
-                    self._client = None
-                    self._ready_event.clear()
-
            self._client = commands.Bot(
                command_prefix="!",  # Not really used, we handle raw messages
                intents=intents,
@@ -720,22 +704,11 @@ class DiscordAdapter(BasePlatformAdapter):
                        return
                    # If humans are mentioned but we're not → not for us
                    # (preserves old DISCORD_IGNORE_NO_MENTION=true behavior)
-                    # EXCEPT in free-response channels where the bot should
-                    # answer regardless of who is mentioned.
                    _ignore_no_mention = os.getenv(
                        "DISCORD_IGNORE_NO_MENTION", "true"
                    ).lower() in ("true", "1", "yes")
                    if _ignore_no_mention and not _self_mentioned and not _other_bots_mentioned:
-                        _channel_id = str(message.channel.id)
-                        _parent_id = None
-                        if hasattr(message.channel, "parent_id") and message.channel.parent_id:
-                            _parent_id = str(message.channel.parent_id)
-                        _free_channels = adapter_self._discord_free_response_channels()
-                        _channel_ids = {_channel_id}
-                        if _parent_id:
-                            _channel_ids.add(_parent_id)
-                        if "*" not in _free_channels and not (_channel_ids & _free_channels):
-                            return
+                        return

                await self._handle_message(message)

@@ -1941,225 +1914,6 @@ class DiscordAdapter(BasePlatformAdapter):
                            return True
        return False

-    # ── Slash command authorization ─────────────────────────────────────
-    # Slash commands (``_run_simple_slash`` and ``_handle_thread_create_slash``)
-    # are a separate Discord interaction surface from regular messages and
-    # historically ran with NO authorization check — bypassing every gate
-    # ``on_message`` enforces (DISCORD_ALLOWED_USERS, DISCORD_ALLOWED_ROLES,
-    # DISCORD_ALLOWED_CHANNELS, DISCORD_IGNORED_CHANNELS). Any guild member
-    # could invoke ``/background``, ``/restart``, ``/sethome``, etc. as the
-    # operator. ``_check_slash_authorization`` mirrors the on_message gates
-    # one-for-one so the slash surface honors the same trust boundary.
-    #
-    # By design, this is a no-op for deployments with no allowlist env vars
-    # set — ``_is_allowed_user`` returns True and the channel checks early-out
-    # — preserving the existing "single-tenant, all guild members trusted"
-    # default. Deployments that DO set any DISCORD_ALLOWED_* var get slash
-    # parity with on_message.
-
-    def _evaluate_slash_authorization(
-        self, interaction: "discord.Interaction",
-    ) -> Tuple[bool, Optional[str]]:
-        """Evaluate slash authorization without producing any response.
-
-        Returns ``(allowed, reason)``. ``reason`` is populated only when
-        ``allowed`` is False. This is the shared core used by both the
-        responding wrapper (``_check_slash_authorization``) and side-effect-
-        free callers like the ``/skill`` autocomplete callback, which must
-        return an empty list for unauthorized users instead of leaking an
-        ephemeral rejection per-keystroke.
-
-        Fail-closed semantics for malformed payloads: when an allowlist is
-        configured but the interaction is missing the data needed to
-        evaluate it (no channel id with channel policy active, no user
-        with user/role policy active), the gate REJECTS rather than
-        falling through. Without these guards a guild interaction that
-        happens to deserialize without a channel id would silently bypass
-        ``DISCORD_ALLOWED_CHANNELS`` and a payload missing ``user`` would
-        raise ``AttributeError`` in the user check below, surfacing as
-        an opaque interaction failure rather than a clean rejection.
-        """
-        chan_obj = getattr(interaction, "channel", None)
-        in_dm = isinstance(chan_obj, discord.DMChannel) if chan_obj is not None else False
-
-        # ── Channel scope (mirrors on_message lines 3374-3388) ──
-        # DMs aren't channel-gated — DMs follow on_message's DM lockdown
-        # path which has its own user-allowlist enforcement.
-        if not in_dm:
-            chan_id_raw = getattr(interaction, "channel_id", None) or getattr(
-                chan_obj, "id", None,
-            )
-            channel_ids: set = set()
-            if chan_id_raw is not None:
-                channel_ids.add(str(chan_id_raw))
-                # Mirror on_message: also test the parent channel for threads
-                # so per-channel allow/deny lists work consistently.
-                if isinstance(chan_obj, discord.Thread):
-                    parent_id = self._get_parent_channel_id(chan_obj)
-                    if parent_id:
-                        channel_ids.add(str(parent_id))
-
-            allowed_raw = os.getenv("DISCORD_ALLOWED_CHANNELS", "")
-            if allowed_raw:
-                allowed = {c.strip() for c in allowed_raw.split(",") if c.strip()}
-                if "*" not in allowed:
-                    if not channel_ids:
-                        # Channel policy is configured but the interaction
-                        # has no resolvable channel id. Fail closed.
-                        return (
-                            False,
-                            "channel id missing with DISCORD_ALLOWED_CHANNELS configured",
-                        )
-                    if not (channel_ids & allowed):
-                        return (False, "channel not in DISCORD_ALLOWED_CHANNELS")
-
-            # Ignored beats allowed: even when a thread's parent channel
-            # is on the allowlist, an explicit DISCORD_IGNORED_CHANNELS
-            # entry on the thread or its parent rejects the interaction.
-            ignored_raw = os.getenv("DISCORD_IGNORED_CHANNELS", "")
-            if ignored_raw and channel_ids:
-                ignored = {c.strip() for c in ignored_raw.split(",") if c.strip()}
-                if "*" in ignored or (channel_ids & ignored):
-                    return (False, "channel in DISCORD_IGNORED_CHANNELS")
-
-        # ── User / role allowlist (mirrors on_message line 681) ──
-        user = getattr(interaction, "user", None)
-        allowed_users = getattr(self, "_allowed_user_ids", set()) or set()
-        allowed_roles = getattr(self, "_allowed_role_ids", set()) or set()
-        if user is None or getattr(user, "id", None) is None:
-            # No identifiable user. With any user/role allowlist
-            # configured, fail closed rather than raise AttributeError
-            # on ``interaction.user.id`` below. With no allowlist this
-            # is the existing "no allowlist = everyone" backwards-compat.
-            if allowed_users or allowed_roles:
-                return (False, "missing interaction.user with allowlist configured")
-            return (True, None)
-
-        user_id = str(user.id)
-        if not self._is_allowed_user(user_id, author=user):
-            return (
-                False,
-                "user not in DISCORD_ALLOWED_USERS / DISCORD_ALLOWED_ROLES",
-            )
-
-        return (True, None)
-
-    async def _check_slash_authorization(
-        self, interaction: "discord.Interaction", command_text: str,
-    ) -> bool:
-        """Mirror on_message's user/role/channel gates onto a slash invocation.
-
-        Returns True to proceed. Returns False *after* sending an ephemeral
-        rejection, logging a warning, and scheduling a cross-platform admin
-        alert — the caller must stop on False (the interaction has already
-        been responded to).
-        """
-        allowed, reason = self._evaluate_slash_authorization(interaction)
-        if allowed:
-            return True
-        return await self._reject_slash(
-            interaction, command_text, reason=reason or "unauthorized",
-        )
-
-    async def _reject_slash(
-        self, interaction: "discord.Interaction", command_text: str, *, reason: str,
-    ) -> bool:
-        """Send ephemeral reject + log warning + schedule admin alert. Returns False.
-
-        Tolerates a missing ``interaction.user`` -- the fail-closed branch
-        in ``_evaluate_slash_authorization`` deliberately routes here for
-        malformed payloads (no user) when an allowlist is configured, and
-        ``str(interaction.user.id)`` would raise AttributeError before the
-        ephemeral rejection could be sent.
-        """
-        user = getattr(interaction, "user", None)
-        if user is not None:
-            user_id = str(getattr(user, "id", "?"))
-            user_name = getattr(user, "name", "?")
-        else:
-            user_id = "?"
-            user_name = "?"
-        chan_id = getattr(interaction, "channel_id", None) or getattr(
-            getattr(interaction, "channel", None), "id", None,
-        )
-        guild_id = getattr(interaction, "guild_id", None)
-
-        logger.warning(
-            "[Discord] Unauthorized slash attempt: user=%s id=%s channel=%s "
-            "guild=%s cmd=%r reason=%r",
-            user_name, user_id, chan_id, guild_id, command_text, reason,
-        )
-
-        try:
-            await interaction.response.send_message(
-                "You're not authorized to use this command.",
-                ephemeral=True,
-            )
-        except Exception as e:
-            # Interaction may already be responded to (e.g. caller deferred
-            # before the auth check, or Discord retried). Best-effort only.
-            logger.debug("[Discord] Could not send unauthorized ephemeral: %s", e)
-
-        # Fire-and-forget: don't block the interaction handler on Telegram I/O.
-        try:
-            asyncio.create_task(self._notify_unauthorized_slash(
-                user_name, user_id, chan_id, guild_id, command_text, reason,
-            ))
-        except Exception as e:
-            logger.debug("[Discord] Could not schedule admin notify task: %s", e)
-
-        return False
-
-    async def _notify_unauthorized_slash(
-        self, user_name: str, user_id: str, chan_id, guild_id,
-        command_text: str, reason: str,
-    ) -> None:
-        """Best-effort cross-platform alert to the gateway operator.
-
-        Tries TELEGRAM first (most operators set TELEGRAM_HOME_CHANNEL),
-        then SLACK. Silently no-ops if no other platform is configured
-        with a home channel.
-
-        A soft send failure -- adapter.send() returning a result with
-        ``success=False`` rather than raising -- continues the fallback
-        chain. Treating a SendResult(success=False) as delivered would
-        mean a Telegram outage that the adapter politely surfaces (e.g.
-        rate-limit, auth failure) silently swallows the alert without
-        attempting Slack. Hard exceptions still take the same path via
-        the except branch below.
-        """
-        runner = getattr(self, "gateway_runner", None)
-        if not runner:
-            return
-        for target in (Platform.TELEGRAM, Platform.SLACK):
-            try:
-                adapter = runner.adapters.get(target)
-                if not adapter:
-                    continue
-                home = runner.config.get_home_channel(target)
-                if not home or not getattr(home, "chat_id", None):
-                    continue
-                msg = (
-                    "⚠️ Unauthorized Discord slash attempt\n"
-                    f"User: {user_name} ({user_id})\n"
-                    f"Channel: {chan_id} (guild {guild_id})\n"
-                    f"Command: {command_text}\n"
-                    f"Reason: {reason}"
-                )
-                result = await adapter.send(str(home.chat_id), msg)
-                # Only return on confirmed delivery. SendResult(success=False)
-                # -> continue to the next platform.
-                if getattr(result, "success", None) is False:
-                    logger.debug(
-                        "[Discord] Admin notify via %s returned success=False"
-                        " (error=%r); falling through",
-                        target, getattr(result, "error", None),
-                    )
-                    continue
-                return
-            except Exception as e:
-                logger.debug("[Discord] Admin notify via %s failed: %s", target, e)
-
    async def send_image_file(
        self,
        chat_id: str,
@@ -2547,11 +2301,6 @@ class DiscordAdapter(BasePlatformAdapter):
        except Exception:
            pass  # logging must never block command dispatch

-        # Auth gate — must run before defer() so an ephemeral rejection can
-        # be delivered on the still-unresponded interaction.
-        if not await self._check_slash_authorization(interaction, command_text):
-            return
-
        await interaction.response.defer(ephemeral=True)
        event = self._build_slash_event(interaction, command_text)
        await self.handle_message(event)
@@ -2696,8 +2445,7 @@ class DiscordAdapter(BasePlatformAdapter):
            message: str = "",
            auto_archive_duration: int = 1440,
        ):
-            # defer() is performed inside the handler *after* the auth gate
-            # so a rejected invoker can receive an ephemeral rejection.
+            await interaction.response.defer(ephemeral=True)
            await self._handle_thread_create_slash(interaction, name, message, auto_archive_duration)

        @tree.command(name="queue", description="Queue a prompt for the next turn (doesn't interrupt)")
@@ -2818,54 +2566,6 @@ class DiscordAdapter(BasePlatformAdapter):
        # supporting up to 25 categories × 25 skills = 625 skills.
        self._register_skill_group(tree)

-        # Optional defense-in-depth: hide every slash command from non-admin
-        # guild members in Discord's slash picker. Server-side authorization
-        # (``_check_slash_authorization``) is the actual gate; this is purely
-        # UX so users don't see commands they can't invoke. Off by default
-        # to preserve the slash UX for deployments that intentionally allow
-        # everyone in the guild.
-        if os.getenv("DISCORD_HIDE_SLASH_COMMANDS", "false").strip().lower() in (
-            "true", "1", "yes", "on",
-        ):
-            self._apply_owner_only_visibility(tree)
-
-    def _apply_owner_only_visibility(self, tree) -> None:
-        """Set default_member_permissions=0 on every registered slash command.
-
-        Discord interprets ``Permissions(0)`` as "requires no permissions",
-        which paradoxically means the command is hidden from every guild
-        member except those with the Administrator permission. Server admins
-        can re-grant per user/role via Server Settings → Integrations →
-        <bot> → Permissions.
-
-        Authoritative gate is ``_check_slash_authorization`` on every
-        invocation, which catches stale clients, role grants made by
-        mistake, and direct API calls bypassing Discord's UI hide.
-        """
-        try:
-            no_perms = discord.Permissions(0)
-        except Exception as e:
-            logger.warning(
-                "[Discord] _apply_owner_only_visibility: cannot build Permissions(0): %s",
-                e,
-            )
-            return
-        applied = 0
-        for cmd in tree.get_commands():
-            try:
-                cmd.default_permissions = no_perms
-                applied += 1
-            except Exception as e:
-                logger.debug(
-                    "[Discord] Could not set default_permissions on %r: %s",
-                    getattr(cmd, "name", "?"), e,
-                )
-        logger.info(
-            "[Discord] Hid %d slash command(s) from non-admin guild members "
-            "(opt-in defense in depth via DISCORD_HIDE_SLASH_COMMANDS).",
-            applied,
-        )
-
    def _register_skill_group(self, tree) -> None:
        """Register a single ``/skill`` command with autocomplete on the name.

@@ -2884,32 +2584,40 @@ class DiscordAdapter(BasePlatformAdapter):
        hidden skills. The slash picker also becomes more discoverable —
        Discord live-filters by the user's typed prefix against both the
        skill name and its description.
-
-        The entries list and lookup dict are stored on ``self`` rather
-        than captured in closure variables so :meth:`refresh_skill_group`
-        can repopulate them when the user runs ``/reload-skills`` without
-        needing to touch the Discord slash-command tree or trigger a
-        ``tree.sync()`` call.
        """
        try:
+            from hermes_cli.commands import discord_skill_commands_by_category
+
            existing_names = set()
            try:
                existing_names = {cmd.name for cmd in tree.get_commands()}
            except Exception:
                pass

-            # Populate the instance-level entries/lookup so the
-            # autocomplete + handler callbacks below always read the
-            # freshest state. refresh_skill_group() re-runs the same
-            # collector and mutates these two attributes in place.
-            self._skill_entries: list[tuple[str, str, str]] = []
-            self._skill_lookup: dict[str, tuple[str, str]] = {}
-            self._skill_group_reserved_names: set[str] = set(existing_names)
-            self._refresh_skill_catalog_state()
+            # Reuse the existing collector for consistent filtering
+            # (per-platform disabled, hub-excluded, name clamping), then
+            # flatten — the category grouping was only useful for the
+            # nested layout.
+            categories, uncategorized, hidden = discord_skill_commands_by_category(
+                reserved_names=existing_names,
+            )
+            entries: list[tuple[str, str, str]] = list(uncategorized)
+            for cat_skills in categories.values():
+                entries.extend(cat_skills)

-            if not self._skill_entries:
+            if not entries:
                return

+            # Stable alphabetical order so the autocomplete suggestion
+            # list is predictable across restarts.
+            entries.sort(key=lambda t: t[0])
+
+            # name -> (description, cmd_key) — used by both the autocomplete
+            # callback and the handler for O(1) dispatch.
+            skill_lookup: dict[str, tuple[str, str]] = {
+                n: (d, k) for n, d, k in entries
+            }
+
            async def _autocomplete_name(
                interaction: "discord.Interaction", current: str,
            ) -> list:
@@ -2919,29 +2627,10 @@ class DiscordAdapter(BasePlatformAdapter):
                "/skill pdf" surfaces skills whose description mentions
                PDFs even if the name doesn't. Discord caps this list at
                25 entries per query.
-
-                Authorization: a quiet pre-check evaluates the slash
-                allowlists and returns ``[]`` for unauthorized users so
-                the installed skill catalog is not leaked to anyone who
-                can see the command in the picker. Returning a generic
-                empty list here is intentional — sending a per-keystroke
-                ephemeral rejection would produce a barrage of error
-                popups during typing.
-
-                Reads ``self._skill_entries`` so a ``/reload-skills`` run
-                since process start shows up on the very next keystroke.
                """
-                try:
-                    allowed, _reason = self._evaluate_slash_authorization(interaction)
-                except Exception:
-                    # Defensive: never raise from autocomplete. Fail
-                    # closed by returning an empty suggestion list.
-                    return []
-                if not allowed:
-                    return []
                q = (current or "").strip().lower()
                choices: list = []
-                for name, desc, _key in self._skill_entries:
+                for name, desc, _key in entries:
                    if not q or q in name.lower() or (desc and q in desc.lower()):
                        if desc:
                            label = f"{name} — {desc}"
@@ -2965,13 +2654,7 @@ class DiscordAdapter(BasePlatformAdapter):
            async def _skill_handler(
                interaction: "discord.Interaction", name: str, args: str = "",
            ):
-                # Authorize BEFORE any skill lookup so that known and
-                # unknown skill names produce identical rejections for
-                # unauthorized users (no probing the installed catalog
-                # via "Unknown skill: <name>" responses).
-                if not await self._check_slash_authorization(interaction, "/skill"):
-                    return
-                entry = self._skill_lookup.get(name)
+                entry = skill_lookup.get(name)
                if not entry:
                    await interaction.response.send_message(
                        f"Unknown skill: `{name}`. Start typing for "
@@ -2993,74 +2676,16 @@ class DiscordAdapter(BasePlatformAdapter):

            logger.info(
                "[%s] Registered /skill command with %d skill(s) via autocomplete",
-                self.name, len(self._skill_entries),
+                self.name, len(entries),
            )
-            if self._skill_group_hidden_count:
+            if hidden:
                logger.info(
                    "[%s] %d skill(s) filtered out of /skill (name clamp / reserved)",
-                    self.name, self._skill_group_hidden_count,
+                    self.name, hidden,
                )
        except Exception as exc:
            logger.warning("[%s] Failed to register /skill command: %s", self.name, exc)

-    def _refresh_skill_catalog_state(self) -> None:
-        """Re-scan disk for skills and repopulate ``self._skill_entries``.
-
-        Called once from :meth:`_register_skill_group` at startup and
-        again from :meth:`refresh_skill_group` whenever the user runs
-        ``/reload-skills``. No Discord API calls are made — autocomplete
-        and the handler both read from these instance attributes
-        directly, so an in-place mutation is sufficient.
-        """
-        from hermes_cli.commands import discord_skill_commands_by_category
-
-        reserved = getattr(self, "_skill_group_reserved_names", set())
-        categories, uncategorized, hidden = discord_skill_commands_by_category(
-            reserved_names=set(reserved),
-        )
-        entries: list[tuple[str, str, str]] = list(uncategorized)
-        for cat_skills in categories.values():
-            entries.extend(cat_skills)
-        # Stable alphabetical order so the autocomplete suggestion
-        # list is predictable across restarts.
-        entries.sort(key=lambda t: t[0])
-
-        self._skill_entries = entries
-        self._skill_lookup = {n: (d, k) for n, d, k in entries}
-        self._skill_group_hidden_count = hidden
-
-    def refresh_skill_group(self) -> tuple[int, int]:
-        """Rescan skills and update the live ``/skill`` autocomplete state.
-
-        Invoked by :meth:`gateway.run.GatewayOrchestrator._handle_reload_skills_command`
-        after :func:`agent.skill_commands.reload_skills` has refreshed
-        the in-process skill-command registry. Without this call, the
-        ``/skill`` autocomplete dropdown keeps showing the list captured
-        at process start — new skills stay invisible and deleted skills
-        return an "Unknown skill" error when clicked.
-
-        Because autocomplete options are fetched dynamically by Discord,
-        we only need to mutate the entries/lookup attributes read by the
-        callbacks — no ``tree.sync()`` is required.
-
-        Returns ``(new_count, hidden_count)``.
-        """
-        try:
-            self._refresh_skill_catalog_state()
-        except Exception as exc:
-            logger.warning(
-                "[%s] Failed to refresh /skill autocomplete after reload: %s",
-                self.name, exc,
-            )
-            return (len(getattr(self, "_skill_entries", [])), 0)
-        logger.info(
-            "[%s] Refreshed /skill autocomplete: %d skill(s) available (%d filtered)",
-            self.name,
-            len(self._skill_entries),
-            self._skill_group_hidden_count,
-        )
-        return (len(self._skill_entries), self._skill_group_hidden_count)
-
    def _build_slash_event(self, interaction: discord.Interaction, text: str) -> MessageEvent:
        """Build a MessageEvent from a Discord slash command interaction."""
        is_dm = isinstance(interaction.channel, discord.DMChannel)
@@ -3118,9 +2743,6 @@ class DiscordAdapter(BasePlatformAdapter):
        auto_archive_duration: int = 1440,
    ) -> None:
        """Create a Discord thread from a slash command and start a session in it."""
-        if not await self._check_slash_authorization(interaction, "/thread"):
-            return
-        await interaction.response.defer(ephemeral=True)
        result = await self._create_thread(
            interaction,
            name=name,
@@ -3415,7 +3037,6 @@ class DiscordAdapter(BasePlatformAdapter):
            view = ExecApprovalView(
                session_key=session_key,
                allowed_user_ids=self._allowed_user_ids,
-                allowed_role_ids=self._allowed_role_ids,
            )

            msg = await channel.send(embed=embed, view=view)
@@ -3454,7 +3075,6 @@ class DiscordAdapter(BasePlatformAdapter):
                session_key=session_key,
                confirm_id=confirm_id,
                allowed_user_ids=self._allowed_user_ids,
-                allowed_role_ids=self._allowed_role_ids,
            )

            msg = await channel.send(embed=embed, view=view)
@@ -3489,7 +3109,6 @@ class DiscordAdapter(BasePlatformAdapter):
            view = UpdatePromptView(
                session_key=session_key,
                allowed_user_ids=self._allowed_user_ids,
-                allowed_role_ids=self._allowed_role_ids,
            )
            msg = await channel.send(embed=embed, view=view)
            return SendResult(success=True, message_id=str(msg.id))
@@ -3547,7 +3166,6 @@ class DiscordAdapter(BasePlatformAdapter):
                session_key=session_key,
                on_model_selected=on_model_selected,
                allowed_user_ids=self._allowed_user_ids,
-                allowed_role_ids=self._allowed_role_ids,
            )

            msg = await channel.send(embed=embed, view=view)
@@ -3808,7 +3426,7 @@ class DiscordAdapter(BasePlatformAdapter):
        if not is_thread and not isinstance(message.channel, discord.DMChannel):
            no_thread_channels_raw = os.getenv("DISCORD_NO_THREAD_CHANNELS", "")
            no_thread_channels = {ch.strip() for ch in no_thread_channels_raw.split(",") if ch.strip()}
-            skip_thread = bool(channel_ids & no_thread_channels)
+            skip_thread = bool(channel_ids & no_thread_channels) or is_free_channel
            auto_thread = os.getenv("DISCORD_AUTO_THREAD", "true").lower() in ("true", "1", "yes")
            is_reply_message = getattr(message, "type", None) == discord.MessageType.reply
            if auto_thread and not skip_thread and not is_voice_linked_channel and not is_reply_message:
@@ -4103,72 +3721,6 @@ class DiscordAdapter(BasePlatformAdapter):
 # Discord UI Components (outside the adapter class)
 # ---------------------------------------------------------------------------

-
-def _component_check_auth(
-    interaction,
-    allowed_user_ids: Optional[set],
-    allowed_role_ids: Optional[set],
-) -> bool:
-    """Shared user-or-role OR semantics for component view button clicks.
-
-    Mirrors ``DiscordAdapter._is_allowed_user`` / the slash and on_message
-    gates so every Discord interaction surface honors the same trust
-    boundary. Component views (ExecApprovalView, SlashConfirmView,
-    UpdatePromptView, ModelPickerView) used to receive only
-    ``allowed_user_ids``: in role-only deployments
-    (DISCORD_ALLOWED_ROLES set, DISCORD_ALLOWED_USERS empty) the user
-    set was empty and the legacy "no allowlist = allow everyone" branch
-    let any guild member click the buttons -- approving exec commands,
-    cancelling slash confirmations, switching the model.
-
-    Behavior:
-
-      - both allowlists empty -> allow (preserves existing no-allowlist
-        deployments, no regression)
-      - user is in user allowlist -> allow
-      - role allowlist set + user has a role in it -> allow
-      - role allowlist set + interaction.user has no resolvable
-        ``roles`` attribute (e.g. DM context with a role policy active)
-        -> reject (fail closed)
-      - otherwise -> reject
-    """
-    user_set = allowed_user_ids or set()
-    role_set = allowed_role_ids or set()
-    has_users = bool(user_set)
-    has_roles = bool(role_set)
-    if not has_users and not has_roles:
-        return True
-
-    user = getattr(interaction, "user", None)
-    if user is None:
-        return False
-
-    if has_users:
-        try:
-            uid = str(user.id)
-        except AttributeError:
-            uid = ""
-        if uid and uid in user_set:
-            return True
-
-    if has_roles:
-        roles_attr = getattr(user, "roles", None)
-        if roles_attr is None:
-            # Role policy is configured but the interaction doesn't
-            # carry role data (DM-context Member, raw User payload).
-            # Fail closed: a user without a resolvable role list cannot
-            # satisfy a role allowlist.
-            return False
-        try:
-            user_role_ids = {getattr(r, "id", None) for r in roles_attr}
-        except TypeError:
-            return False
-        if user_role_ids & role_set:
-            return True
-
-    return False
-
-
 if DISCORD_AVAILABLE:

    class ExecApprovalView(discord.ui.View):
@@ -4181,23 +3733,17 @@ if DISCORD_AVAILABLE:
        Only users in the allowed list can click.  Times out after 5 minutes.
        """

-        def __init__(
-            self,
-            session_key: str,
-            allowed_user_ids: set,
-            allowed_role_ids: Optional[set] = None,
-        ):
+        def __init__(self, session_key: str, allowed_user_ids: set):
            super().__init__(timeout=300)  # 5-minute timeout
            self.session_key = session_key
            self.allowed_user_ids = allowed_user_ids
-            self.allowed_role_ids = allowed_role_ids or set()
            self.resolved = False

        def _check_auth(self, interaction: discord.Interaction) -> bool:
            """Verify the user clicking is authorized."""
-            return _component_check_auth(
-                interaction, self.allowed_user_ids, self.allowed_role_ids,
-            )
+            if not self.allowed_user_ids:
+                return True  # No allowlist = anyone can approve
+            return str(interaction.user.id) in self.allowed_user_ids

        async def _resolve(
            self, interaction: discord.Interaction, choice: str,
@@ -4289,24 +3835,17 @@ if DISCORD_AVAILABLE:
        5 minutes (matches the gateway primitive's timeout).
        """

-        def __init__(
-            self,
-            session_key: str,
-            confirm_id: str,
-            allowed_user_ids: set,
-            allowed_role_ids: Optional[set] = None,
-        ):
+        def __init__(self, session_key: str, confirm_id: str, allowed_user_ids: set):
            super().__init__(timeout=300)
            self.session_key = session_key
            self.confirm_id = confirm_id
            self.allowed_user_ids = allowed_user_ids
-            self.allowed_role_ids = allowed_role_ids or set()
            self.resolved = False

        def _check_auth(self, interaction: discord.Interaction) -> bool:
-            return _component_check_auth(
-                interaction, self.allowed_user_ids, self.allowed_role_ids,
-            )
+            if not self.allowed_user_ids:
+                return True
+            return str(interaction.user.id) in self.allowed_user_ids

        async def _resolve(
            self, interaction: discord.Interaction, choice: str,
@@ -4384,22 +3923,16 @@ if DISCORD_AVAILABLE:
        5-minute timeout on its side).
        """

-        def __init__(
-            self,
-            session_key: str,
-            allowed_user_ids: set,
-            allowed_role_ids: Optional[set] = None,
-        ):
+        def __init__(self, session_key: str, allowed_user_ids: set):
            super().__init__(timeout=300)
            self.session_key = session_key
            self.allowed_user_ids = allowed_user_ids
-            self.allowed_role_ids = allowed_role_ids or set()
            self.resolved = False

        def _check_auth(self, interaction: discord.Interaction) -> bool:
-            return _component_check_auth(
-                interaction, self.allowed_user_ids, self.allowed_role_ids,
-            )
+            if not self.allowed_user_ids:
+                return True
+            return str(interaction.user.id) in self.allowed_user_ids

        async def _respond(
            self, interaction: discord.Interaction, answer: str,
@@ -4476,7 +4009,6 @@ if DISCORD_AVAILABLE:
            session_key: str,
            on_model_selected,
            allowed_user_ids: set,
-            allowed_role_ids: Optional[set] = None,
        ):
            super().__init__(timeout=120)
            self.providers = providers
@@ -4485,16 +4017,15 @@ if DISCORD_AVAILABLE:
            self.session_key = session_key
            self.on_model_selected = on_model_selected
            self.allowed_user_ids = allowed_user_ids
-            self.allowed_role_ids = allowed_role_ids or set()
            self.resolved = False
            self._selected_provider: str = ""

            self._build_provider_select()

        def _check_auth(self, interaction: discord.Interaction) -> bool:
-            return _component_check_auth(
-                interaction, self.allowed_user_ids, self.allowed_role_ids,
-            )
+            if not self.allowed_user_ids:
+                return True
+            return str(interaction.user.id) in self.allowed_user_ids

        def _build_provider_select(self):
            """Build the provider dropdown menu."""
@@ -416,18 +416,6 @@ class EmailAdapter(BasePlatformAdapter):
            logger.debug("[Email] Dropping automated sender at dispatch: %s", sender_addr)
            return

-        # Skip senders not in EMAIL_ALLOWED_USERS — prevents the adapter
-        # from creating a MessageEvent (and thus thread context) for senders
-        # that the gateway will never authorize.  Without this early guard,
-        # a race between dispatch and authorization can result in the adapter
-        # sending a reply even though the handler returned None.
-        allowed_raw = os.getenv("EMAIL_ALLOWED_USERS", "").strip()
-        if allowed_raw:
-            allowed = {addr.strip().lower() for addr in allowed_raw.split(",") if addr.strip()}
-            if sender_addr.lower() not in allowed:
-                logger.debug("[Email] Dropping non-allowlisted sender at dispatch: %s", sender_addr)
-                return
-
        subject = msg_data["subject"]
        body = msg_data["body"].strip()
        attachments = msg_data["attachments"]
@@ -153,9 +153,6 @@ _MARKDOWN_HINT_RE = re.compile(
    r"(^#{1,6}\s)|(^\s*[-*]\s)|(^\s*\d+\.\s)|(^\s*---+\s*$)|(```)|(`[^`\n]+`)|(\*\*[^*\n].+?\*\*)|(~~[^~\n].+?~~)|(<u>.+?</u>)|(\*[^*\n]+\*)|(\[[^\]]+\]\([^)]+\))|(^>\s)",
    re.MULTILINE,
 )
-# Detect markdown tables: a line starting with | followed by a separator line.
-# Feishu post-type 'md' elements do not render tables, so we force text mode.
-_MARKDOWN_TABLE_RE = re.compile(r"^\|.*\|\n\|[-|: ]+\|", re.MULTILINE)
 _MARKDOWN_LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")
 _MARKDOWN_FENCE_OPEN_RE = re.compile(r"^```([^\n`]*)\s*$")
 _MARKDOWN_FENCE_CLOSE_RE = re.compile(r"^```\s*$")
@@ -2760,11 +2757,9 @@ class FeishuAdapter(BasePlatformAdapter):
            if hint:
                text = f"{hint}\n\n{text}" if text else hint

-        thread_id = getattr(message, "thread_id", None) or getattr(message, "root_id", None) or None
        reply_to_message_id = (
            getattr(message, "parent_id", None)
            or getattr(message, "upper_message_id", None)
-            or getattr(message, "root_id", None)
            or None
        )
        reply_to_text = await self._fetch_message_text(reply_to_message_id) if reply_to_message_id else None
@@ -2796,7 +2791,7 @@ class FeishuAdapter(BasePlatformAdapter):
            chat_type=self._resolve_source_chat_type(chat_info=chat_info, event_chat_type=chat_type),
            user_id=sender_profile["user_id"],
            user_name=sender_profile["user_name"],
-            thread_id=thread_id,
+            thread_id=getattr(message, "thread_id", None) or None,
            user_id_alt=sender_profile["user_id_alt"],
            is_bot=is_bot,
        )
@@ -2927,18 +2922,13 @@ class FeishuAdapter(BasePlatformAdapter):
                },
            )
            response.raise_for_status()
-            # Snapshot Content-Type and body while the client context is
-            # still active so pooled connections fully release on exit.
-            # See #18451.
-            content_type_hdr = str(response.headers.get("Content-Type", ""))
-            body = response.content
        filename = self._derive_remote_filename(
            file_url,
-            content_type=content_type_hdr,
+            content_type=str(response.headers.get("Content-Type", "")),
            default_name=preferred_name,
            default_ext=default_ext,
        )
-        cached_path = cache_document_from_bytes(body, filename)
+        cached_path = cache_document_from_bytes(response.content, filename)
        return cached_path, filename

    @staticmethod
@@ -3865,50 +3855,47 @@ class FeishuAdapter(BasePlatformAdapter):
        and self-sent bot event filtering.

        Populates ``_bot_open_id`` and ``_bot_name`` from /open-apis/bot/v3/info
-        (no extra scopes required beyond the tenant access token). The probe
-        always runs when a client is available so stale env vars from app/bot
-        migrations do not break group @mention gating. Falls back to the
-        application info endpoint for ``_bot_name`` only when the first probe
-        doesn't return it. If the probe fails, env-provided values are preserved.
+        (no extra scopes required beyond the tenant access token). Falls back to
+        the application info endpoint for ``_bot_name`` only when the first probe
+        doesn't return it. Each field is hydrated independently — a value already
+        supplied via env vars (FEISHU_BOT_OPEN_ID / FEISHU_BOT_USER_ID /
+        FEISHU_BOT_NAME) is preserved and skips its probe.
        """
        if not self._client:
            return
+        if self._bot_open_id and self._bot_name:
+            # Everything the self-send filter and precise mention gate need is
+            # already in place; nothing to probe.
+            return

        # Primary probe: /open-apis/bot/v3/info — returns bot_name + open_id, no
        # extra scopes required. This is the same endpoint the onboarding wizard
        # uses via probe_bot().
-        try:
-            req = (
-                BaseRequest.builder()
-                .http_method(HttpMethod.GET)
-                .uri("/open-apis/bot/v3/info")
-                .token_types({AccessTokenType.TENANT})
-                .build()
-            )
-            resp = await asyncio.to_thread(self._client.request, req)
-            content = getattr(getattr(resp, "raw", None), "content", None)
-            if content:
-                payload = json.loads(content)
-                parsed = _parse_bot_response(payload) or {}
-                open_id = (parsed.get("bot_open_id") or "").strip()
-                bot_name = (parsed.get("bot_name") or "").strip()
-                if open_id:
-                    if self._bot_open_id and self._bot_open_id != open_id:
-                        logger.warning(
-                            "[Feishu] FEISHU_BOT_OPEN_ID is stale; using /bot/v3/info open_id for group @mention gating."
-                        )
-                    self._bot_open_id = open_id
-                if bot_name:
-                    if self._bot_name and self._bot_name != bot_name:
-                        logger.info(
-                            "[Feishu] FEISHU_BOT_NAME differs from /bot/v3/info; using hydrated bot name for group @mention gating."
-                        )
-                    self._bot_name = bot_name
-        except Exception:
-            logger.debug(
-                "[Feishu] /bot/v3/info probe failed during hydration",
-                exc_info=True,
-            )
+        if not self._bot_open_id or not self._bot_name:
+            try:
+                req = (
+                    BaseRequest.builder()
+                    .http_method(HttpMethod.GET)
+                    .uri("/open-apis/bot/v3/info")
+                    .token_types({AccessTokenType.TENANT})
+                    .build()
+                )
+                resp = await asyncio.to_thread(self._client.request, req)
+                content = getattr(getattr(resp, "raw", None), "content", None)
+                if content:
+                    payload = json.loads(content)
+                    parsed = _parse_bot_response(payload) or {}
+                    open_id = (parsed.get("bot_open_id") or "").strip()
+                    bot_name = (parsed.get("bot_name") or "").strip()
+                    if open_id and not self._bot_open_id:
+                        self._bot_open_id = open_id
+                    if bot_name and not self._bot_name:
+                        self._bot_name = bot_name
+            except Exception:
+                logger.debug(
+                    "[Feishu] /bot/v3/info probe failed during hydration",
+                    exc_info=True,
+                )

        # Fallback probe for _bot_name only: application info endpoint. Needs
        # admin:app.info:readonly or application:application:self_manage scope,
@@ -3953,14 +3940,7 @@ class FeishuAdapter(BasePlatformAdapter):
        if isinstance(seen_data, list):
            entries: Dict[str, float] = {str(item).strip(): 0.0 for item in seen_data if str(item).strip()}
        elif isinstance(seen_data, dict):
-            entries = {}
-            for key, value in seen_data.items():
-                if not isinstance(key, str) or not key.strip():
-                    continue
-                try:
-                    entries[key] = float(value)
-                except (TypeError, ValueError):
-                    continue
+            entries = {k: float(v) for k, v in seen_data.items() if isinstance(k, str) and k.strip()}
        else:
            return
        # Filter out TTL-expired entries (entries saved with ts=0.0 are treated as immortal
@@ -4005,12 +3985,6 @@ class FeishuAdapter(BasePlatformAdapter):
    # =========================================================================

    def _build_outbound_payload(self, content: str) -> tuple[str, str]:
-        # Feishu post-type 'md' elements do not render markdown tables; sending
-        # table content as post causes the message to appear blank on the client.
-        # Force plain text for anything that looks like a markdown table.
-        if _MARKDOWN_TABLE_RE.search(content):
-            text_payload = {"text": content}
-            return "text", json.dumps(text_payload, ensure_ascii=False)
        if _MARKDOWN_HINT_RE.search(content):
            return "post", _build_markdown_post_payload(content)
        text_payload = {"text": content}
@@ -4106,15 +4080,7 @@ class FeishuAdapter(BasePlatformAdapter):
            content=payload,
            uuid_value=str(uuid.uuid4()),
        )
-        # Detect whether chat_id is a user open_id (DM) or a chat_id (group).
-        # Feishu API expects receive_id_type="open_id" for user DMs (ou_ prefix)
-        # and receive_id_type="chat_id" for group chats (oc_ prefix, which IS
-        # the chat_id format — see https://open.feishu.cn/document/).
-        if chat_id.startswith("ou_"):
-            receive_id_type = "open_id"
-        else:
-            receive_id_type = "chat_id"
-        request = self._build_create_message_request(receive_id_type, body)
+        request = self._build_create_message_request("chat_id", body)
        return await asyncio.to_thread(self._client.im.v1.message.create, request)

    @staticmethod
@@ -4256,15 +4222,6 @@ class FeishuAdapter(BasePlatformAdapter):
                if active_reply_to and not self._response_succeeded(response):
                    code = getattr(response, "code", None)
                    if code in _FEISHU_REPLY_FALLBACK_CODES:
-                        if (metadata or {}).get("thread_id"):
-                            logger.warning(
-                                "[Feishu] Reply to %s failed in thread %s (code %s — message withdrawn/missing); "
-                                "skipping top-level fallback to avoid creating a new topic",
-                                active_reply_to,
-                                (metadata or {}).get("thread_id"),
-                                code,
-                            )
-                            return response
                        logger.warning(
                            "[Feishu] Reply to %s failed (code %s — message withdrawn/missing); "
                            "falling back to new message in chat %s",
@@ -222,37 +222,33 @@ class ThreadParticipationTracker:
    def __init__(self, platform_name: str, max_tracked: int = 500):
        self._platform = platform_name
        self._max_tracked = max_tracked
-        self._threads: dict[str, None] = {
-            str(thread_id): None for thread_id in self._load()
-        }
+        self._threads: set = self._load()

    def _state_path(self) -> Path:
        from hermes_constants import get_hermes_home
        return get_hermes_home() / f"{self._platform}_threads.json"

-    def _load(self) -> list[str]:
+    def _load(self) -> set:
        path = self._state_path()
        if path.exists():
            try:
-                data = json.loads(path.read_text(encoding="utf-8"))
-                if isinstance(data, list):
-                    return [str(thread_id) for thread_id in data]
+                return set(json.loads(path.read_text(encoding="utf-8")))
            except Exception:
                pass
-        return []
+        return set()

    def _save(self) -> None:
        path = self._state_path()
        thread_list = list(self._threads)
        if len(thread_list) > self._max_tracked:
            thread_list = thread_list[-self._max_tracked:]
-            self._threads = {thread_id: None for thread_id in thread_list}
+            self._threads = set(thread_list)
        atomic_json_write(path, thread_list, indent=None)

    def mark(self, thread_id: str) -> None:
        """Mark *thread_id* as participated and persist."""
        if thread_id not in self._threads:
-            self._threads[thread_id] = None
+            self._threads.add(thread_id)
            self._save()

    def __contains__(self, thread_id: str) -> bool:
@@ -139,7 +139,7 @@ class HomeAssistantAdapter(BasePlatformAdapter):

    async def _ws_connect(self) -> bool:
        """Establish WebSocket connection and authenticate."""
-        ws_url = self._hass_url.replace("https://", "wss://").replace("http://", "ws://")
+        ws_url = self._hass_url.replace("http://", "ws://").replace("https://", "wss://")
        ws_url = f"{ws_url}/api/websocket"

        self._session = aiohttp.ClientSession(
@@ -243,14 +243,10 @@ class QQAdapter(BasePlatformAdapter):
            return False

        try:
-            # Tighter keepalive pool so idle CLOSE_WAIT sockets drain
-            # faster behind proxies like Cloudflare Warp (#18451).
-            from gateway.platforms._http_client_limits import platform_httpx_limits
            self._http_client = httpx.AsyncClient(
                timeout=30.0,
                follow_redirects=True,
                event_hooks={"response": [_ssrf_redirect_guard]},
-                limits=platform_httpx_limits(),
            )

            # 1. Get access token
@@ -397,24 +393,13 @@ class QQAdapter(BasePlatformAdapter):
            await self._session.close()
        self._session = None

-        # Honor WSL proxy env for QQ WebSocket. Hermes upgrades overwrite this
-        # local patch, so QQ can regress to direct-connect timeouts after update.
-        self._session = aiohttp.ClientSession(trust_env=True)
-        ws_proxy = (
-            os.getenv("WSS_PROXY")
-            or os.getenv("wss_proxy")
-            or os.getenv("HTTPS_PROXY")
-            or os.getenv("https_proxy")
-            or os.getenv("ALL_PROXY")
-            or os.getenv("all_proxy")
-        )
+        self._session = aiohttp.ClientSession()
        self._ws = await self._session.ws_connect(
            gateway_url,
            headers={
                "User-Agent": build_user_agent(),
            },
            timeout=CONNECT_TIMEOUT_SECONDS,
-            proxy=ws_proxy,
        )
        logger.info("[%s] WebSocket connected to %s", self._log_tag, gateway_url)

@@ -192,15 +192,6 @@ class SignalAdapter(BasePlatformAdapter):
        group_allowed_str = os.getenv("SIGNAL_GROUP_ALLOWED_USERS", "")
        self.group_allow_from = set(_parse_comma_list(group_allowed_str))

-        # DM allowlist — mirrors SIGNAL_ALLOWED_USERS checked by run.py.
-        # Stored here so the reaction hooks can skip unauthorized senders
-        # (reactions fire before run.py's auth gate, so without this check
-        # every inbound DM from any contact gets a 👀 reaction).
-        # "*" means all users allowed (open mode); empty means no restriction
-        # recorded at adapter level (run.py still enforces auth separately).
-        dm_allowed_str = os.getenv("SIGNAL_ALLOWED_USERS", "*")
-        self.dm_allow_from = set(_parse_comma_list(dm_allowed_str))
-
        # HTTP client
        self.client: Optional[httpx.AsyncClient] = None

@@ -257,9 +248,7 @@ class SignalAdapter(BasePlatformAdapter):
        except Exception as e:
            logger.warning("Signal: Could not acquire phone lock (non-fatal): %s", e)

-        # Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
-        from gateway.platforms._http_client_limits import platform_httpx_limits
-        self.client = httpx.AsyncClient(timeout=30.0, limits=platform_httpx_limits())
+        self.client = httpx.AsyncClient(timeout=30.0)
        try:
            # Health check — verify signal-cli daemon is reachable
            try:
@@ -1439,28 +1428,8 @@ class SignalAdapter(BasePlatformAdapter):
            return None
        return (author, ts)

-    def _reactions_enabled(self, event: "MessageEvent" = None) -> bool:
-        """Check if message reactions are enabled for this event.
-
-        Two gates:
-        1. SIGNAL_REACTIONS env var — set to false/0/no to disable globally.
-        2. DM allowlist — if SIGNAL_ALLOWED_USERS is set, only react to
-           messages from senders in that list.  This prevents unauthorized
-           contacts from seeing the 👀 reaction (which fires before run.py's
-           auth gate and would otherwise reveal that a bot is listening).
-        """
-        if os.getenv("SIGNAL_REACTIONS", "true").lower() in ("false", "0", "no"):
-            return False
-        if event is not None:
-            sender = getattr(getattr(event, "source", None), "user_id", None)
-            if sender and "*" not in self.dm_allow_from and sender not in self.dm_allow_from:
-                return False
-        return True
-
    async def on_processing_start(self, event: MessageEvent) -> None:
        """React with 👀 when processing begins."""
-        if not self._reactions_enabled(event):
-            return
        target = self._extract_reaction_target(event)
        if target:
            await self.send_reaction(event.source.chat_id, "👀", *target)
@@ -1471,8 +1440,6 @@ class SignalAdapter(BasePlatformAdapter):
        On CANCELLED we leave the 👀 in place — no terminal outcome means
        the reaction should keep reflecting "in progress" (matches Telegram).
        """
-        if not self._reactions_enabled(event):
-            return
        if outcome == ProcessingOutcome.CANCELLED:
            return
        target = self._extract_reaction_target(event)
@@ -528,21 +528,6 @@ class SlackAdapter(BasePlatformAdapter):
                return False
            lock_acquired = True

-            # Close any previous handler before creating a new one so that
-            # calling connect() a second time (e.g. during a gateway restart or
-            # in-process reconnect attempt) does not leave a zombie Socket Mode
-            # connection alive.  Both the old and new connections would otherwise
-            # receive every Slack event and dispatch it twice, producing double
-            # responses — the same bug that affected DiscordAdapter (#18187).
-            if self._handler is not None:
-                try:
-                    await self._handler.close_async()
-                except Exception:
-                    logger.debug("[%s] Failed to close previous Slack handler", self.name)
-                finally:
-                    self._handler = None
-                    self._app = None
-
            # First token is the primary — used for AsyncApp / Socket Mode
            primary_token = bot_tokens[0]
            self._app = AsyncApp(token=primary_token)
@@ -10,7 +10,7 @@ Shares credentials with the optional telephony skill — same env vars:

 Gateway-specific env vars:
  - SMS_WEBHOOK_PORT     (default 8080)
-  - SMS_WEBHOOK_HOST     (default 127.0.0.1)
+  - SMS_WEBHOOK_HOST     (default 0.0.0.0)
  - SMS_WEBHOOK_URL      (public URL for Twilio signature validation — required)
  - SMS_INSECURE_NO_SIGNATURE  (true to disable signature validation — dev only)
  - SMS_ALLOWED_USERS    (comma-separated E.164 phone numbers)
@@ -41,7 +41,7 @@ logger = logging.getLogger(__name__)
 TWILIO_API_BASE = "https://api.twilio.com/2010-04-01/Accounts"
 MAX_SMS_LENGTH = 1600  # ~10 SMS segments
 DEFAULT_WEBHOOK_PORT = 8080
-DEFAULT_WEBHOOK_HOST = "127.0.0.1"
+DEFAULT_WEBHOOK_HOST = "0.0.0.0"


 def check_sms_requirements() -> bool:
@@ -91,23 +91,19 @@ class SmsAdapter(BasePlatformAdapter):
        from aiohttp import web

        if not self._from_number:
-            msg = "[sms] TWILIO_PHONE_NUMBER not set — cannot send replies"
-            logger.error(msg)
-            self._set_fatal_error("sms_missing_phone_number", msg, retryable=False)
+            logger.error("[sms] TWILIO_PHONE_NUMBER not set — cannot send replies")
            return False

        insecure_no_sig = os.getenv("SMS_INSECURE_NO_SIGNATURE", "").lower() == "true"

        if not self._webhook_url and not insecure_no_sig:
-            msg = (
+            logger.error(
                "[sms] Refusing to start: SMS_WEBHOOK_URL is required for Twilio "
                "signature validation. Set it to the public URL configured in your "
                "Twilio console (e.g. https://example.com/webhooks/twilio). "
                "For local development without validation, set "
-                "SMS_INSECURE_NO_SIGNATURE=true (NOT recommended for production)."
+                "SMS_INSECURE_NO_SIGNATURE=true (NOT recommended for production).",
            )
-            logger.error(msg)
-            self._set_fatal_error("sms_missing_webhook_url", msg, retryable=False)
            return False

        if insecure_no_sig and not self._webhook_url:
@@ -353,10 +353,7 @@ class TelegramAdapter(BasePlatformAdapter):

    @classmethod
    def _message_thread_id_for_typing(cls, thread_id: Optional[str]) -> Optional[int]:
-        # Mirrors _message_thread_id_for_send: the General forum topic (thread id
-        # "1") is represented as "no thread id" on the wire. User-created topics
-        # keep their real id so typing stays scoped to that topic.
-        if not thread_id or str(thread_id) == cls._GENERAL_TOPIC_THREAD_ID:
+        if not thread_id:
            return None
        return int(thread_id)

@@ -515,17 +512,6 @@ class TelegramAdapter(BasePlatformAdapter):
                self.name, attempt,
            )
            self._polling_network_error_count = 0
-            # start_polling() returning is necessary but not sufficient:
-            # PTB's Updater can be left in a state where `running` is True
-            # but the underlying long-poll task is wedged on a stale httpx
-            # connection and never makes progress. No error_callback fires
-            # in that state, so the reconnect ladder won't advance on its
-            # own. Schedule a deferred probe to detect the wedge and
-            # re-enter the ladder if needed.
-            if not self.has_fatal_error:
-                probe = asyncio.ensure_future(self._verify_polling_after_reconnect())
-                self._background_tasks.add(probe)
-                probe.add_done_callback(self._background_tasks.discard)
        except Exception as retry_err:
            logger.warning("[%s] Telegram polling reconnect failed: %s", self.name, retry_err)
            # start_polling failed — polling is dead and no further error
@@ -537,50 +523,6 @@ class TelegramAdapter(BasePlatformAdapter):
                self._background_tasks.add(task)
                task.add_done_callback(self._background_tasks.discard)

-    async def _verify_polling_after_reconnect(self) -> None:
-        """Heartbeat probe scheduled after a successful reconnect.
-
-        PTB's Updater can survive a botched stop()+start_polling() cycle
-        with `running=True` but a wedged consumer task. No error callback
-        fires, so the reconnect ladder doesn't advance on its own. This
-        probe detects the wedge by:
-
-        1. Sleeping HEARTBEAT_PROBE_DELAY so a healthy long-poll has time
-           to complete at least one cycle.
-        2. Verifying `Updater.running` is still True.
-        3. Probing the bot endpoint with a tight asyncio timeout. A
-           wedged httpx pool fails this probe; a healthy one returns
-           well under the timeout.
-
-        On any failure, re-enter the reconnect ladder so the existing
-        MAX_NETWORK_RETRIES path can ultimately escalate to fatal-error.
-        """
-        HEARTBEAT_PROBE_DELAY = 60
-        PROBE_TIMEOUT = 10
-
-        await asyncio.sleep(HEARTBEAT_PROBE_DELAY)
-
-        if self.has_fatal_error:
-            return
-        if not (self._app and self._app.updater and self._app.updater.running):
-            logger.warning(
-                "[%s] Updater not running %ds after reconnect — treating as wedged",
-                self.name, HEARTBEAT_PROBE_DELAY,
-            )
-            await self._handle_polling_network_error(
-                RuntimeError("Updater not running after reconnect heartbeat")
-            )
-            return
-
-        try:
-            await asyncio.wait_for(self._app.bot.get_me(), PROBE_TIMEOUT)
-        except Exception as probe_err:
-            logger.warning(
-                "[%s] Polling heartbeat probe failed %ds after reconnect: %s",
-                self.name, HEARTBEAT_PROBE_DELAY, probe_err,
-            )
-            await self._handle_polling_network_error(probe_err)
-
    async def _handle_polling_conflict(self, error: Exception) -> None:
        if self.has_fatal_error and self.fatal_error_code == "telegram_polling_conflict":
            return
@@ -691,29 +633,6 @@ class TelegramAdapter(BasePlatformAdapter):
                )
            return None

-    async def rename_dm_topic(
-        self,
-        chat_id: int,
-        thread_id: int,
-        name: str,
-    ) -> None:
-        """Rename a forum topic in a private (DM) chat."""
-        if not self._bot:
-            return
-        try:
-            chat_id_arg = int(chat_id)
-        except (TypeError, ValueError):
-            chat_id_arg = chat_id
-        await self._bot.edit_forum_topic(
-            chat_id=chat_id_arg,
-            message_thread_id=int(thread_id),
-            name=name,
-        )
-        logger.info(
-            "[%s] Renamed DM topic in chat %s thread_id=%s -> '%s'",
-            self.name, chat_id, thread_id, name,
-        )
-
    def _persist_dm_topic_thread_id(self, chat_id: int, topic_name: str, thread_id: int) -> None:
        """Save a newly created thread_id back into config.yaml so it persists across restarts."""
        try:
@@ -2293,54 +2212,13 @@ class TelegramAdapter(BasePlatformAdapter):
                )
            return SendResult(success=True, message_id=str(msg.message_id))
        except Exception as e:
-            error_str = str(e)
-            # Dimension-related errors are the expected case for valid image
-            # files that Telegram just refuses as photos (screenshots, extreme
-            # aspect ratios). Log at INFO because the document fallback is
-            # the correct path. Any other send_photo failure also falls back
-            # to document (rate limits, corrupt file markers, format edge
-            # cases), but at WARNING because it's unexpected and worth
-            # surfacing in logs.
-            is_dim_error = (
-                "Photo_invalid_dimensions" in error_str
-                or "PHOTO_INVALID_DIMENSIONS" in error_str
+            logger.error(
+                "[%s] Failed to send Telegram local image, falling back to base adapter: %s",
+                self.name,
+                e,
+                exc_info=True,
            )
-            if is_dim_error:
-                logger.info(
-                    "[%s] Image dimensions exceed Telegram photo limits, "
-                    "sending as document: %s",
-                    self.name,
-                    image_path,
-                )
-            else:
-                logger.warning(
-                    "[%s] Failed to send Telegram local image as photo, "
-                    "trying document fallback: %s",
-                    self.name,
-                    e,
-                    exc_info=True,
-                )
-            # Fallback to sending as document (file) — no dimension limit,
-            # only 50MB size limit. If even that fails, fall back to the
-            # base adapter's text-only "Image: /path" rendering.
-            try:
-                return await self.send_document(
-                    chat_id=chat_id,
-                    file_path=image_path,
-                    caption=caption,
-                    file_name=os.path.basename(image_path),
-                    reply_to=reply_to,
-                    metadata=metadata,
-                )
-            except Exception as doc_err:
-                logger.error(
-                    "[%s] Failed to send Telegram local image as document, "
-                    "falling back to base adapter: %s",
-                    self.name,
-                    doc_err,
-                    exc_info=True,
-                )
-                return await super().send_image_file(chat_id, image_path, caption, reply_to)
+            return await super().send_image_file(chat_id, image_path, caption, reply_to)

    async def send_document(
        self,
@@ -2511,16 +2389,21 @@ class TelegramAdapter(BasePlatformAdapter):
            try:
                _typing_thread = self._metadata_thread_id(metadata)
                message_thread_id = self._message_thread_id_for_typing(_typing_thread)
-                # No retry-without-thread fallback here: _message_thread_id_for_typing
-                # already maps the forum General topic to None, so any non-None value
-                # reaching this call is a user-created topic. If Telegram rejects it
-                # (e.g. topic deleted mid-session), we swallow the failure rather than
-                # showing a typing indicator in the wrong chat/All Messages.
-                await self._bot.send_chat_action(
-                    chat_id=int(chat_id),
-                    action="typing",
-                    message_thread_id=message_thread_id,
-                )
+                try:
+                    await self._bot.send_chat_action(
+                        chat_id=int(chat_id),
+                        action="typing",
+                        message_thread_id=message_thread_id,
+                    )
+                except Exception as e:
+                    if message_thread_id is not None and self._is_thread_not_found_error(e):
+                        await self._bot.send_chat_action(
+                            chat_id=int(chat_id),
+                            action="typing",
+                            message_thread_id=None,
+                        )
+                    else:
+                        raise
            except Exception as e:
                # Typing failures are non-fatal; log at debug level only.
                logger.debug(
@@ -185,13 +185,10 @@ async def _query_doh_provider(
 async def discover_fallback_ips() -> list[str]:
    """Auto-discover Telegram API IPs via DNS-over-HTTPS.

-    Resolves api.telegram.org through Google and Cloudflare DoH and returns all
-    unique A records.  IPs that match the local system resolver are kept rather
-    than excluded: in many networks the system-DNS IP is the most reliable path
-    to api.telegram.org and a transient primary-path failure should be retried
-    against the same address via the IP-rewrite path before the seed list is
-    consulted (#14520).  Falls back to a hardcoded seed list only when DoH
-    yields no usable answers.
+    Resolves api.telegram.org through Google and Cloudflare DoH, collects all
+    unique IPs, and excludes the system-DNS-resolved IP (which is presumably
+    unreachable on this network).  Falls back to a hardcoded seed list when DoH
+    is also unavailable.
    """
    async with httpx.AsyncClient(timeout=httpx.Timeout(_DOH_TIMEOUT)) as client:
        doh_tasks = [_query_doh_provider(client, p) for p in _DOH_PROVIDERS]
@@ -206,11 +203,11 @@ async def discover_fallback_ips() -> list[str]:
        if isinstance(r, list):
            doh_ips.extend(r)

-    # Deduplicate preserving order
+    # Deduplicate preserving order, exclude system-DNS IPs
    seen: set[str] = set()
    candidates: list[str] = []
    for ip in doh_ips:
-        if ip not in seen:
+        if ip not in seen and ip not in system_ips:
            seen.add(ip)
            candidates.append(ip)

@@ -222,7 +219,7 @@ async def discover_fallback_ips() -> list[str]:
        return validated

    logger.info(
-        "DoH discovery yielded no usable IPs (system DNS: %s); using seed fallback IPs %s",
+        "DoH discovery yielded no new IPs (system DNS: %s); using seed fallback IPs %s",
        ", ".join(system_ips) or "unknown",
        ", ".join(_SEED_FALLBACK_IPS),
    )
@@ -142,7 +142,6 @@ class WeComAdapter(BasePlatformAdapter):
    """WeCom AI Bot adapter backed by a persistent WebSocket connection."""

    MAX_MESSAGE_LENGTH = MAX_MESSAGE_LENGTH
-    SUPPORTS_MESSAGE_EDITING = False
    # Threshold for detecting WeCom client-side message splits.
    # When a chunk is near the 4000-char limit, a continuation is almost certain.
    _SPLIT_THRESHOLD = 3900
@@ -207,11 +206,7 @@ class WeComAdapter(BasePlatformAdapter):
            return False

        try:
-            # Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
-            from gateway.platforms._http_client_limits import platform_httpx_limits
-            self._http_client = httpx.AsyncClient(
-                timeout=30.0, follow_redirects=True, limits=platform_httpx_limits(),
-            )
+            self._http_client = httpx.AsyncClient(timeout=30.0, follow_redirects=True)
            await self._open_connection()
            self._mark_connected()
            self._listen_task = asyncio.create_task(self._listen_loop())
@@ -1015,8 +1010,6 @@ class WeComAdapter(BasePlatformAdapter):
        if not aes_key:
            raise ValueError("aes_key is required")

-        # WeCom doesn't pad base64 keys; add padding if needed
-        aes_key = aes_key + '=' * ((4 - len(aes_key) % 4) % 4)
        key = base64.b64decode(aes_key)
        if len(key) != 32:
            raise ValueError(f"Invalid WeCom AES key length: expected 32 bytes, got {len(key)}")
@@ -119,9 +119,7 @@ class WecomCallbackAdapter(BasePlatformAdapter):
            pass

        try:
-            # Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
-            from gateway.platforms._http_client_limits import platform_httpx_limits
-            self._http_client = httpx.AsyncClient(timeout=20.0, limits=platform_httpx_limits())
+            self._http_client = httpx.AsyncClient(timeout=20.0)
            self._app = web.Application()
            self._app.router.add_get("/health", self._handle_health)
            self._app.router.add_get(self._path, self._handle_verify)
@@ -1333,15 +1333,6 @@ class WeixinAdapter(BasePlatformAdapter):
        if message_id and self._dedup.is_duplicate(message_id):
            return

-        # Secondary content-fingerprint dedup for text messages
-        item_list = message.get("item_list") or []
-        text = _extract_text(item_list)
-        if text:
-            content_key = f"content:{sender_id}:{hashlib.md5(text.encode()).hexdigest()}"
-            if self._dedup.is_duplicate(content_key):
-                logger.debug("[%s] Content-dedup: skipping duplicate message from %s", self.name, sender_id)
-                return
-
        chat_type, effective_chat_id = _guess_chat_type(message, self._account_id)
        if chat_type == "group":
            if self._group_policy == "disabled":
@@ -1356,6 +1347,8 @@ class WeixinAdapter(BasePlatformAdapter):
            self._token_store.set(self._account_id, sender_id, context_token)
        asyncio.create_task(self._maybe_fetch_typing_ticket(sender_id, context_token or None))

+        item_list = message.get("item_list") or []
+        text = _extract_text(item_list)
        media_paths: List[str] = []
        media_types: List[str] = []

@@ -2037,9 +2030,7 @@ async def send_weixin_direct(

    live_adapter = _LIVE_ADAPTERS.get(resolved_token)
    send_session = getattr(live_adapter, '_send_session', None)
-    if (live_adapter is not None and send_session is not None
-            and not send_session.closed
-            and send_session._loop is asyncio.get_running_loop()):
+    if live_adapter is not None and send_session is not None and not send_session.closed:
        last_result: Optional[SendResult] = None
        cleaned = live_adapter.format_message(message)
        if cleaned:
@@ -185,13 +185,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
        self._bridge_log: Optional[Path] = None
        self._poll_task: Optional[asyncio.Task] = None
        self._http_session: Optional["aiohttp.ClientSession"] = None
-        # Set to True by disconnect() before we SIGTERM our child bridge so
-        # _check_managed_bridge_exit() can distinguish an intentional
-        # shutdown-time exit (returncode -15 / -2 / 0) from a real crash.
-        # Without this, every graceful gateway shutdown/restart would log
-        # "Fatal whatsapp adapter error" plus dispatch a fatal-error
-        # notification before the normal "✓ whatsapp disconnected" fires.
-        self._shutting_down: bool = False

    def _whatsapp_require_mention(self) -> bool:
        configured = self.config.extra.get("require_mention")
@@ -562,21 +555,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
        if returncode is None:
            return None

-        # Planned shutdown: disconnect() sets _shutting_down before it sends
-        # SIGTERM to the bridge, so a returncode of -15 (SIGTERM), -2 (SIGINT),
-        # or 0 (clean exit) at that point is expected, not a crash. Treat it
-        # as informational and skip the fatal-error path.
-        # getattr-with-default keeps tests that construct the adapter via
-        # ``WhatsAppAdapter.__new__`` (bypassing __init__) working without
-        # every _make_adapter() helper having to seed the attribute.
-        if getattr(self, "_shutting_down", False) and returncode in (0, -2, -15):
-            logger.info(
-                "[%s] Bridge exited during shutdown (code %d).",
-                self.name,
-                returncode,
-            )
-            return None
-
        message = f"WhatsApp bridge process exited unexpectedly (code {returncode})."
        if not self.has_fatal_error:
            logger.error("[%s] %s", self.name, message)
@@ -587,10 +565,6 @@ class WhatsAppAdapter(BasePlatformAdapter):

    async def disconnect(self) -> None:
        """Stop the WhatsApp bridge and clean up any orphaned processes."""
-        # Flip the shutdown flag BEFORE signalling the child so the exit-check
-        # path (which runs from other tasks like send() and the poll loop)
-        # doesn't race us and report the intentional termination as fatal.
-        self._shutting_down = True
        if self._bridge_process:
            try:
                try:
@@ -902,15 +876,11 @@ class WhatsAppAdapter(BasePlatformAdapter):
        try:
            import aiohttp

-            # Must wrap in `async with` — a bare `await session.post(...)`
-            # leaves the response object alive until GC, holding its TCP
-            # socket in CLOSE_WAIT. See #18451.
-            async with self._http_session.post(
+            await self._http_session.post(
                f"http://127.0.0.1:{self._bridge_port}/typing",
                json={"chatId": chat_id},
                timeout=aiohttp.ClientTimeout(total=5)
-            ):
-                pass
+            )
        except Exception:
            pass  # Ignore typing indicator failures
    
@@ -1086,22 +1086,19 @@ class SessionStore:
        return len(removed_keys)

    def suspend_recently_active(self, max_age_seconds: int = 120) -> int:
-        """Mark recently-active sessions as resumable after an unexpected exit.
+        """Mark recently-active sessions as suspended.

-        Called on gateway startup after a crash or fast restart to preserve
-        in-flight sessions instead of destroying their conversation history
-        (#7536).  Only marks sessions updated within *max_age_seconds* to
-        avoid touching long-idle sessions.  Sets ``resume_pending=True`` so
-        the next incoming message on the same session_key auto-resumes from
-        the existing transcript.
+        Called on gateway startup to prevent sessions that were likely
+        in-flight when the gateway last exited from being blindly resumed
+        (#7536).  Only suspends sessions updated within *max_age_seconds*
+        to avoid resetting long-idle sessions that are harmless to resume.
+        Returns the number of sessions that were suspended.

-        Entries already flagged ``resume_pending=True`` are skipped.  Entries
-        explicitly ``suspended=True`` (from /stop or stuck-loop escalation)
-        are also skipped.  Terminal escalation for genuinely stuck sessions
-        is still handled by the existing ``.restart_failure_counts`` counter
-        (threshold 3), which runs after this method and sets ``suspended=True``.
-
-        Returns the number of sessions marked resumable.
+        Entries flagged ``resume_pending=True`` are skipped — those were
+        marked intentionally by the drain-timeout path as recoverable.
+        Terminal escalation for genuinely stuck ``resume_pending`` sessions
+        is handled by the existing ``.restart_failure_counts`` stuck-loop
+        counter, which runs after this method on startup.
        """
        from datetime import timedelta

@@ -1113,15 +1110,13 @@ class SessionStore:
                if entry.resume_pending:
                    continue
                if not entry.suspended and entry.updated_at >= cutoff:
-                    entry.resume_pending = True
-                    entry.resume_reason = "restart_interrupted"
-                    entry.last_resume_marked_at = _now()
+                    entry.suspended = True
                    count += 1
            if count:
                self._save()
        return count

-    def reset_session(self, session_key: str, display_name: Optional[str] = None) -> Optional[SessionEntry]:
+    def reset_session(self, session_key: str) -> Optional[SessionEntry]:
        """Force reset a session, creating a new session ID."""
        db_end_session_id = None
        db_create_kwargs = None
@@ -1145,7 +1140,7 @@ class SessionStore:
                created_at=now,
                updated_at=now,
                origin=old_entry.origin,
-                display_name=display_name if display_name is not None else old_entry.display_name,
+                display_name=old_entry.display_name,
                platform=old_entry.platform,
                chat_type=old_entry.chat_type,
                is_fresh_reset=True,
@@ -1276,9 +1271,8 @@ class SessionStore:
        
        # Also write legacy JSONL (keeps existing tooling working during transition)
        transcript_path = self.get_transcript_path(session_id)
-        with self._lock:
-            with open(transcript_path, "a", encoding="utf-8") as f:
-                f.write(json.dumps(message, ensure_ascii=False) + "\n")
+        with open(transcript_path, "a", encoding="utf-8") as f:
+            f.write(json.dumps(message, ensure_ascii=False) + "\n")
    
    def rewrite_transcript(self, session_id: str, messages: List[Dict[str, Any]]) -> None:
        """Replace the entire transcript for a session with new messages.
@@ -637,8 +637,6 @@ def release_all_scoped_locks(

 _TAKEOVER_MARKER_FILENAME = ".gateway-takeover.json"
 _TAKEOVER_MARKER_TTL_S = 60  # Marker older than this is treated as stale
-_PLANNED_STOP_MARKER_FILENAME = ".gateway-planned-stop.json"
-_PLANNED_STOP_MARKER_TTL_S = 60


 def _get_takeover_marker_path() -> Path:
@@ -647,67 +645,6 @@ def _get_takeover_marker_path() -> Path:
    return home / _TAKEOVER_MARKER_FILENAME


-def _get_planned_stop_marker_path() -> Path:
-    """Return the path to the intentional gateway stop marker file."""
-    home = get_hermes_home()
-    return home / _PLANNED_STOP_MARKER_FILENAME
-
-
-def _marker_is_stale(written_at: str, ttl_s: int) -> bool:
-    try:
-        written_dt = datetime.fromisoformat(written_at)
-        age = (datetime.now(timezone.utc) - written_dt).total_seconds()
-        return age > ttl_s
-    except (TypeError, ValueError):
-        return True
-
-
-def _consume_pid_marker_for_self(
-    path: Path,
-    *,
-    pid_field: str,
-    start_time_field: str,
-    ttl_s: int,
-) -> bool:
-    record = _read_json_file(path)
-    if not record:
-        return False
-
-    try:
-        target_pid = int(record[pid_field])
-        target_start_time = record.get(start_time_field)
-        written_at = record.get("written_at") or ""
-    except (KeyError, TypeError, ValueError):
-        try:
-            path.unlink(missing_ok=True)
-        except OSError:
-            pass
-        return False
-
-    if _marker_is_stale(written_at, ttl_s):
-        try:
-            path.unlink(missing_ok=True)
-        except OSError:
-            pass
-        return False
-
-    our_pid = os.getpid()
-    our_start_time = _get_process_start_time(our_pid)
-    matches = (
-        target_pid == our_pid
-        and target_start_time is not None
-        and our_start_time is not None
-        and target_start_time == our_start_time
-    )
-
-    try:
-        path.unlink(missing_ok=True)
-    except OSError:
-        pass
-
-    return matches
-
-
 def write_takeover_marker(target_pid: int) -> bool:
    """Record that ``target_pid`` is being replaced by the current process.

@@ -744,13 +681,59 @@ def consume_takeover_marker_for_self() -> bool:
    Always unlinks the marker on match (and on detected staleness) so
    subsequent unrelated signals don't re-trigger.
    """
-    return _consume_pid_marker_for_self(
-        _get_takeover_marker_path(),
-        pid_field="target_pid",
-        start_time_field="target_start_time",
-        ttl_s=_TAKEOVER_MARKER_TTL_S,
+    path = _get_takeover_marker_path()
+    record = _read_json_file(path)
+    if not record:
+        return False
+
+    # Any malformed or stale marker → drop it and return False
+    try:
+        target_pid = int(record["target_pid"])
+        target_start_time = record.get("target_start_time")
+        written_at = record.get("written_at") or ""
+    except (KeyError, TypeError, ValueError):
+        try:
+            path.unlink(missing_ok=True)
+        except OSError:
+            pass
+        return False
+
+    # TTL guard: a stale marker older than _TAKEOVER_MARKER_TTL_S is ignored.
+    stale = False
+    try:
+        written_dt = datetime.fromisoformat(written_at)
+        age = (datetime.now(timezone.utc) - written_dt).total_seconds()
+        if age > _TAKEOVER_MARKER_TTL_S:
+            stale = True
+    except (TypeError, ValueError):
+        stale = True  # Unparseable timestamp — treat as stale
+
+    if stale:
+        try:
+            path.unlink(missing_ok=True)
+        except OSError:
+            pass
+        return False
+
+    # Does the marker name THIS process?
+    our_pid = os.getpid()
+    our_start_time = _get_process_start_time(our_pid)
+    matches = (
+        target_pid == our_pid
+        and target_start_time is not None
+        and our_start_time is not None
+        and target_start_time == our_start_time
    )

+    # Consume the marker whether it matched or not — a marker that doesn't
+    # match our identity is stale-for-us anyway.
+    try:
+        path.unlink(missing_ok=True)
+    except OSError:
+        pass
+
+    return matches
+

 def clear_takeover_marker() -> None:
    """Remove the takeover marker unconditionally. Safe to call repeatedly."""
@@ -760,45 +743,6 @@ def clear_takeover_marker() -> None:
        pass


-def write_planned_stop_marker(target_pid: int) -> bool:
-    """Record that ``target_pid`` is being stopped intentionally.
-
-    The gateway exits non-zero for unexpected SIGTERM so service managers can
-    revive it. Service stop commands send the same SIGTERM, so the CLI writes
-    this short-lived marker first to let the target process exit cleanly.
-    """
-    try:
-        target_start_time = _get_process_start_time(target_pid)
-        record = {
-            "target_pid": target_pid,
-            "target_start_time": target_start_time,
-            "stopper_pid": os.getpid(),
-            "written_at": _utc_now_iso(),
-        }
-        _write_json_file(_get_planned_stop_marker_path(), record)
-        return True
-    except (OSError, PermissionError):
-        return False
-
-
-def consume_planned_stop_marker_for_self() -> bool:
-    """Return True when the current process is being intentionally stopped."""
-    return _consume_pid_marker_for_self(
-        _get_planned_stop_marker_path(),
-        pid_field="target_pid",
-        start_time_field="target_start_time",
-        ttl_s=_PLANNED_STOP_MARKER_TTL_S,
-    )
-
-
-def clear_planned_stop_marker() -> None:
-    """Remove the planned-stop marker unconditionally."""
-    try:
-        _get_planned_stop_marker_path().unlink(missing_ok=True)
-    except OSError:
-        pass
-
-
 def get_running_pid(
    pid_path: Optional[Path] = None,
    *,
@@ -5,43 +5,11 @@ Provides subcommands for:
 - hermes chat          - Interactive chat (same as ./hermes)
 - hermes gateway       - Run gateway in foreground
 - hermes gateway start - Start gateway service
- hermes gateway stop  - Stop gateway service
+- hermes gateway stop  - Stop gateway service  
 - hermes setup         - Interactive setup wizard
 - hermes status        - Show status of all components
 - hermes cron          - Manage cron jobs
 """

-import os
-import sys
-
 __version__ = "0.12.0"
 __release_date__ = "2026.4.30"
-
-
-def _ensure_utf8():
-    """Force UTF-8 stdout/stderr on Windows to prevent UnicodeEncodeError.
-
-    Windows services and terminals default to cp1252, which cannot encode
-    box-drawing characters used in CLI output. This causes unhandled
-    UnicodeEncodeError crashes on gateway startup.
-    """
-    if sys.platform != "win32":
-        return
-    os.environ.setdefault("PYTHONUTF8", "1")
-    os.environ.setdefault("PYTHONIOENCODING", "utf-8")
-    for stream_name in ("stdout", "stderr"):
-        stream = getattr(sys, stream_name, None)
-        if stream is None:
-            continue
-        try:
-            if getattr(stream, "encoding", "").lower().replace("-", "") != "utf8":
-                new_stream = open(
-                    stream.fileno(), "w", encoding="utf-8",
-                    buffering=1, closefd=False,
-                )
-                setattr(sys, stream_name, new_stream)
-        except (AttributeError, OSError):
-            pass
-
-
-_ensure_utf8()
@@ -416,40 +416,6 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
    ),
 }

-# Auto-extend PROVIDER_REGISTRY with any api-key provider registered in
-# providers/ that is not already declared above.  New providers only need a
-# providers/*.py file — no edits to this file required.
-try:
-    from providers import list_providers as _list_providers_for_registry
-    for _pp in _list_providers_for_registry():
-        if _pp.name in PROVIDER_REGISTRY:
-            continue
-        if _pp.auth_type != "api_key" or not _pp.env_vars:
-            continue
-        # Skip providers that need custom token resolution or are special-cased
-        # in resolve_provider() (copilot/kimi/zai have bespoke token refresh;
-        # openrouter/custom are aggregator/user-supplied and handled outside
-        # the registry — adding them here breaks runtime_provider resolution
-        # that relies on `openrouter not in PROVIDER_REGISTRY`).
-        if _pp.name in {"copilot", "kimi-coding", "kimi-coding-cn", "zai", "openrouter", "custom"}:
-            continue
-        _api_key_vars = tuple(v for v in _pp.env_vars if not v.endswith("_BASE_URL") and not v.endswith("_URL"))
-        _base_url_var = next((v for v in _pp.env_vars if v.endswith("_BASE_URL") or v.endswith("_URL")), None)
-        PROVIDER_REGISTRY[_pp.name] = ProviderConfig(
-            id=_pp.name,
-            name=_pp.display_name or _pp.name,
-            auth_type="api_key",
-            inference_base_url=_pp.base_url,
-            api_key_env_vars=_api_key_vars or _pp.env_vars,
-            base_url_env_var=_base_url_var or "",
-        )
-        # Also register aliases so resolve_provider() resolves them
-        for _alias in _pp.aliases:
-            if _alias not in PROVIDER_REGISTRY:
-                PROVIDER_REGISTRY[_alias] = PROVIDER_REGISTRY[_pp.name]
-except Exception:
-    pass
-

 # =============================================================================
 # Anthropic Key Helper
@@ -1229,17 +1195,6 @@ def resolve_provider(
        "vllm": "custom", "llamacpp": "custom",
        "llama.cpp": "custom", "llama-cpp": "custom",
    }
-    # Extend with aliases declared in providers/*.py that aren't already mapped.
-    # This keeps providers/ as the single source for new aliases while the
-    # hardcoded dict above remains authoritative for existing ones.
-    try:
-        from providers import list_providers as _lp
-        for _pp in _lp():
-            for _alias in _pp.aliases:
-                if _alias not in _PROVIDER_ALIASES:
-                    _PROVIDER_ALIASES[_alias] = _pp.name
-    except Exception:
-        pass
    normalized = _PROVIDER_ALIASES.get(normalized, normalized)

    if normalized == "openrouter":
@@ -2634,208 +2589,6 @@ def _poll_for_token(
 # Nous Portal — token refresh, agent key minting, model discovery
 # =============================================================================

-# -----------------------------------------------------------------------------
-# Shared Nous token store — lets OAuth credentials persist across profiles
-# so a new `hermes --profile <name> auth add nous --type oauth` can one-tap
-# import instead of running the full device-code flow every time.
-#
-# File lives at ${HERMES_SHARED_AUTH_DIR}/nous_auth.json, defaulting to
-# ~/.hermes/shared/nous_auth.json. It is OUTSIDE any named profile's
-# HERMES_HOME so named profiles (which typically live under
-# ~/.hermes/profiles/<name>/) all see the same file.
-#
-# Written on successful login and on every runtime refresh so the stored
-# refresh_token stays current even if one profile refreshes and rotates it.
-# If ever the stored refresh_token does go stale server-side, import fails
-# gracefully and the user falls back to the normal device-code flow.
-# -----------------------------------------------------------------------------
-
-NOUS_SHARED_STORE_FILENAME = "nous_auth.json"
-
-
-def _nous_shared_auth_dir() -> Path:
-    """Resolve the directory that holds the shared Nous token store.
-
-    Honors ``HERMES_SHARED_AUTH_DIR`` so tests can redirect it to a tmp
-    path without touching the real user's home. Defaults to
-    ``~/.hermes/shared/``.
-    """
-    override = os.getenv("HERMES_SHARED_AUTH_DIR", "").strip()
-    if override:
-        return Path(override).expanduser()
-    return Path.home() / ".hermes" / "shared"
-
-
-def _nous_shared_store_path() -> Path:
-    path = _nous_shared_auth_dir() / NOUS_SHARED_STORE_FILENAME
-    # Seat belt: if pytest is running and this resolves to a path under the
-    # real user's home, refuse rather than silently corrupt cross-profile
-    # state. Tests must set HERMES_SHARED_AUTH_DIR to a tmp_path (conftest
-    # does not do this automatically — mirror the _auth_file_path() guard
-    # so forgetting to set it fails loudly instead of writing to the real
-    # shared store).
-    if os.environ.get("PYTEST_CURRENT_TEST"):
-        real_home_shared = (
-            Path.home() / ".hermes" / "shared" / NOUS_SHARED_STORE_FILENAME
-        ).resolve(strict=False)
-        try:
-            resolved = path.resolve(strict=False)
-        except Exception:
-            resolved = path
-        if resolved == real_home_shared:
-            raise RuntimeError(
-                f"Refusing to touch real user shared Nous auth store during test run: "
-                f"{path}. Set HERMES_SHARED_AUTH_DIR to a tmp_path in your test fixture."
-            )
-    return path
-
-
-def _write_shared_nous_state(state: Dict[str, Any]) -> None:
-    """Persist a minimal copy of the Nous OAuth state to the shared store.
-
-    Best-effort: any failure is swallowed after logging. The shared store
-    is a convenience layer; the per-profile auth.json remains the source
-    of truth.
-
-    We deliberately omit the short-lived ``agent_key`` (24h TTL, profile-
-    specific) — only the long-lived OAuth tokens are cross-profile useful.
-    """
-    refresh_token = state.get("refresh_token")
-    access_token = state.get("access_token")
-    if not (isinstance(refresh_token, str) and refresh_token.strip()):
-        # No refresh_token = nothing worth sharing across profiles
-        return
-    if not (isinstance(access_token, str) and access_token.strip()):
-        return
-
-    shared = {
-        "_schema": 1,
-        "access_token": access_token,
-        "refresh_token": refresh_token,
-        "token_type": state.get("token_type") or "Bearer",
-        "scope": state.get("scope") or DEFAULT_NOUS_SCOPE,
-        "client_id": state.get("client_id") or DEFAULT_NOUS_CLIENT_ID,
-        "portal_base_url": state.get("portal_base_url") or DEFAULT_NOUS_PORTAL_URL,
-        "inference_base_url": state.get("inference_base_url") or DEFAULT_NOUS_INFERENCE_URL,
-        "obtained_at": state.get("obtained_at"),
-        "expires_at": state.get("expires_at"),
-        "updated_at": datetime.now(timezone.utc).isoformat(),
-    }
-    try:
-        path = _nous_shared_store_path()
-        path.parent.mkdir(parents=True, exist_ok=True)
-        tmp = path.with_suffix(path.suffix + ".tmp")
-        tmp.write_text(json.dumps(shared, indent=2, sort_keys=True))
-        try:
-            os.chmod(tmp, 0o600)
-        except OSError:
-            pass
-        os.replace(tmp, path)
-        _oauth_trace(
-            "nous_shared_store_written",
-            path=str(path),
-            refresh_token_fp=_token_fingerprint(refresh_token),
-        )
-    except Exception as exc:
-        logger.debug("Failed to write shared Nous auth store: %s", exc)
-
-
-def _read_shared_nous_state() -> Optional[Dict[str, Any]]:
-    """Return the shared Nous OAuth state if present and well-formed.
-
-    Returns ``None`` when the file is missing, unreadable, malformed, or
-    lacks required fields. Callers should treat ``None`` as "no shared
-    credentials available — fall through to device-code".
-    """
-    try:
-        path = _nous_shared_store_path()
-    except RuntimeError:
-        # Test seat belt tripped — treat as missing
-        return None
-    if not path.is_file():
-        return None
-    try:
-        payload = json.loads(path.read_text())
-    except (OSError, ValueError) as exc:
-        logger.debug("Shared Nous auth store at %s is unreadable: %s", path, exc)
-        return None
-    if not isinstance(payload, dict):
-        return None
-    refresh_token = payload.get("refresh_token")
-    access_token = payload.get("access_token")
-    if not (isinstance(refresh_token, str) and refresh_token.strip()):
-        return None
-    if not (isinstance(access_token, str) and access_token.strip()):
-        return None
-    return payload
-
-
-def _try_import_shared_nous_state(
-    *,
-    timeout_seconds: float = 15.0,
-    min_key_ttl_seconds: int = 5 * 60,
-) -> Optional[Dict[str, Any]]:
-    """Attempt to rehydrate Nous OAuth state from the shared store.
-
-    Reads the shared file (if present), runs a forced refresh+mint using
-    the stored refresh_token to produce a fresh access_token + agent_key
-    scoped to this profile, and returns the full auth_state dict ready
-    for ``persist_nous_credentials()``.
-
-    Returns ``None`` when no shared state is available or the rehydrate
-    fails for any reason (expired refresh_token, portal unreachable,
-    etc.) — caller should then fall through to the normal device-code
-    flow.
-    """
-    shared = _read_shared_nous_state()
-    if not shared:
-        return None
-
-    # Build a full state dict so refresh_nous_oauth_from_state has every
-    # field it needs. force_refresh=True gets us a fresh access_token
-    # for this profile; force_mint=True gets us a fresh agent_key.
-    state: Dict[str, Any] = {
-        "access_token": shared.get("access_token"),
-        "refresh_token": shared.get("refresh_token"),
-        "client_id": shared.get("client_id") or DEFAULT_NOUS_CLIENT_ID,
-        "portal_base_url": shared.get("portal_base_url") or DEFAULT_NOUS_PORTAL_URL,
-        "inference_base_url": shared.get("inference_base_url") or DEFAULT_NOUS_INFERENCE_URL,
-        "token_type": shared.get("token_type") or "Bearer",
-        "scope": shared.get("scope") or DEFAULT_NOUS_SCOPE,
-        "obtained_at": shared.get("obtained_at"),
-        "expires_at": shared.get("expires_at"),
-        "agent_key": None,
-        "agent_key_expires_at": None,
-        "tls": {"insecure": False, "ca_bundle": None},
-    }
-
-    try:
-        refreshed = refresh_nous_oauth_from_state(
-            state,
-            min_key_ttl_seconds=min_key_ttl_seconds,
-            timeout_seconds=timeout_seconds,
-            force_refresh=True,
-            force_mint=True,
-        )
-    except AuthError as exc:
-        _oauth_trace(
-            "nous_shared_import_failed",
-            error_type=type(exc).__name__,
-            error_code=getattr(exc, "code", None),
-        )
-        logger.debug("Shared Nous import failed: %s", exc)
-        return None
-    except Exception as exc:
-        _oauth_trace(
-            "nous_shared_import_failed",
-            error_type=type(exc).__name__,
-        )
-        logger.debug("Shared Nous import failed: %s", exc)
-        return None
-
-    return refreshed
-
-
 def _refresh_access_token(
    *,
    client: httpx.Client,
@@ -3238,12 +2991,6 @@ def persist_nous_credentials(
        _save_provider_state(auth_store, "nous", state)
        _save_auth_store(auth_store)

-    # Mirror to the shared store so a new profile can one-tap import
-    # these credentials via `hermes auth add nous --type oauth`. Best-
-    # effort: any I/O failure is logged and swallowed (the per-profile
-    # auth.json is still the source of truth).
-    _write_shared_nous_state(state)
-
    pool = load_pool("nous")
    return next(
        (e for e in pool.entries() if e.source == NOUS_DEVICE_CODE_SOURCE),
@@ -3312,11 +3059,6 @@ def resolve_nous_runtime_credentials(
                refresh_token_fp=_token_fingerprint(state.get("refresh_token")),
                access_token_fp=_token_fingerprint(state.get("access_token")),
            )
-            # Mirror post-refresh state to the shared store so sibling
-            # profiles don't hold stale refresh_tokens after rotation.
-            # Best-effort — any failure is logged and swallowed inside
-            # _write_shared_nous_state.
-            _write_shared_nous_state(state)

        verify = _resolve_verify(insecure=insecure, ca_bundle=ca_bundle, auth_state=state)
        timeout = httpx.Timeout(timeout_seconds if timeout_seconds else 15.0)
@@ -4541,8 +4283,7 @@ def _minimax_oauth_login(
    print(f"Portal: {portal_base_url}")

    with httpx.Client(timeout=httpx.Timeout(timeout_seconds),
-                      headers={"Accept": "application/json"},
-                      follow_redirects=True) as client:
+                      headers={"Accept": "application/json"}) as client:
        code_data = _minimax_request_user_code(
            client, portal_base_url=portal_base_url,
            client_id=pconfig.client_id,
@@ -4619,8 +4360,7 @@ def _refresh_minimax_oauth_state(
        return state

    portal_base_url = state["portal_base_url"]
-    with httpx.Client(timeout=httpx.Timeout(timeout_seconds),
-                      follow_redirects=True) as client:
+    with httpx.Client(timeout=httpx.Timeout(timeout_seconds)) as client:
        response = client.post(
            f"{portal_base_url}/oauth/token",
            data={
@@ -4858,47 +4598,17 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
    )

    try:
-        auth_state = None
-
-        # Codex-style auto-import: before launching a fresh device-code
-        # flow, check the shared store for an existing Nous credential
-        # from any other profile. If present, offer to rehydrate it.
-        shared = _read_shared_nous_state()
-        if shared:
-            try:
-                shared_path = _nous_shared_store_path()
-            except RuntimeError:
-                shared_path = None
-            print()
-            if shared_path:
-                print(f"Found existing Nous OAuth credentials at {shared_path}")
-            else:
-                print("Found existing shared Nous OAuth credentials")
-            try:
-                do_import = input("Import these credentials? [Y/n]: ").strip().lower()
-            except (EOFError, KeyboardInterrupt):
-                do_import = "y"
-            if do_import in ("", "y", "yes"):
-                print("Rehydrating Nous session from shared credentials...")
-                auth_state = _try_import_shared_nous_state(
-                    timeout_seconds=timeout_seconds,
-                    min_key_ttl_seconds=5 * 60,
-                )
-                if auth_state is None:
-                    print("Could not refresh shared credentials — falling back to device-code login.")
-
-        if auth_state is None:
-            auth_state = _nous_device_code_login(
-                portal_base_url=getattr(args, "portal_url", None),
-                inference_base_url=getattr(args, "inference_url", None),
-                client_id=getattr(args, "client_id", None) or pconfig.client_id,
-                scope=getattr(args, "scope", None) or pconfig.scope,
-                open_browser=not getattr(args, "no_browser", False),
-                timeout_seconds=timeout_seconds,
-                insecure=insecure,
-                ca_bundle=ca_bundle,
-                min_key_ttl_seconds=5 * 60,
-            )
+        auth_state = _nous_device_code_login(
+            portal_base_url=getattr(args, "portal_url", None),
+            inference_base_url=getattr(args, "inference_url", None),
+            client_id=getattr(args, "client_id", None) or pconfig.client_id,
+            scope=getattr(args, "scope", None) or pconfig.scope,
+            open_browser=not getattr(args, "no_browser", False),
+            timeout_seconds=timeout_seconds,
+            insecure=insecure,
+            ca_bundle=ca_bundle,
+            min_key_ttl_seconds=5 * 60,
+        )

        inference_base_url = auth_state["inference_base_url"]

@@ -4915,11 +4625,6 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
            _save_provider_state(auth_store, "nous", auth_state)
            saved_to = _save_auth_store(auth_store)

-        # Mirror to the shared store so other profiles can one-tap import
-        # these credentials. Best-effort: any I/O failure is logged and
-        # swallowed inside the helper.
-        _write_shared_nous_state(auth_state)
-
        print()
        print("Login successful!")
        print(f"  Auth state: {saved_to}")
@@ -245,47 +245,6 @@ def auth_add_command(args) -> None:
        return

    if provider == "nous":
-        # Codex-style auto-import: if a shared Nous credential lives at
-        # ~/.hermes/shared/nous_auth.json (written by any previous
-        # successful login), offer to import it instead of running the
-        # full device-code flow. This makes `hermes --profile <name>
-        # auth add nous --type oauth` a one-tap operation for users who
-        # run multiple profiles.
-        shared = auth_mod._read_shared_nous_state()
-        if shared:
-            try:
-                path = auth_mod._nous_shared_store_path()
-            except RuntimeError:
-                path = None
-            print()
-            if path:
-                print(f"Found existing Nous OAuth credentials at {path}")
-            else:
-                print("Found existing shared Nous OAuth credentials")
-            try:
-                do_import = input("Import these credentials? [Y/n]: ").strip().lower()
-            except (EOFError, KeyboardInterrupt):
-                do_import = "y"
-            if do_import in ("", "y", "yes"):
-                print("Rehydrating Nous session from shared credentials...")
-                rehydrated = auth_mod._try_import_shared_nous_state(
-                    timeout_seconds=getattr(args, "timeout", None) or 15.0,
-                    min_key_ttl_seconds=max(
-                        60, int(getattr(args, "min_key_ttl_seconds", 5 * 60))
-                    ),
-                )
-                if rehydrated is not None:
-                    custom_label = (getattr(args, "label", None) or "").strip() or None
-                    entry = auth_mod.persist_nous_credentials(rehydrated, label=custom_label)
-                    shown_label = entry.label if entry is not None else label_from_token(
-                        rehydrated.get("access_token", ""), _oauth_default_label(provider, 1),
-                    )
-                    print(f'Imported {provider} OAuth credentials: "{shown_label}"')
-                    return
-                # Rehydrate failed (expired refresh_token, portal down, etc.)
-                # — fall through to device-code flow.
-                print("Could not refresh shared credentials — falling back to device-code login.")
-
        creds = auth_mod._nous_device_code_login(
            portal_base_url=getattr(args, "portal_url", None),
            inference_base_url=getattr(args, "inference_url", None),
@@ -61,9 +61,6 @@ _EXCLUDED_NAMES = {
    "cron.pid",
 }

-# zipfile.open() drops Unix mode bits on extract; restore tightens these to 0600.
-_SECRET_FILE_NAMES = {".env", "auth.json", "state.db"}
-

 def _should_exclude(rel_path: Path) -> bool:
    """Return True if *rel_path* (relative to hermes root) should be skipped."""
@@ -384,8 +381,6 @@ def run_import(args) -> None:
                target.parent.mkdir(parents=True, exist_ok=True)
                with zf.open(member) as src, open(target, "wb") as dst:
                    dst.write(src.read())
-                if target.name in _SECRET_FILE_NAMES:
-                    os.chmod(target, 0o600)
                restored += 1
            except (PermissionError, OSError) as exc:
                errors.append(f"  {rel}: {exc}")
@@ -793,17 +788,9 @@ def _prune_pre_update_backups(backup_dir: Path, keep: int) -> int:
    Returns the number of files deleted.  Only touches files matching
    ``pre-update-*.zip`` so hand-made zips dropped in the same directory
    are never touched.
-
-    ``keep`` is floored to 1 because this helper is only called immediately
-    after a fresh backup is written: deleting that backup right after the
-    user paid the disk/CPU cost to create it would leave them worse off
-    than no backup at all (and the wrapper in ``main.py`` would still print
-    a misleading ``Saved: <path>`` line for a file that no longer exists).
-    Operators who genuinely don't want a backup should set
-    ``updates.pre_update_backup: false`` in config — that gates creation.
    """
-    if keep < 1:
-        keep = 1
+    if keep < 0:
+        keep = 0
    if not backup_dir.exists():
        return 0

@@ -235,9 +235,6 @@ def _scan_workspace_state(source_dir: Path) -> list[tuple[Path, str]]:
    """
    findings: list[tuple[Path, str]] = []

-    if not source_dir.exists():
-        return findings
-
    # Direct state files in the root
    for name in ("todo.json", "sessions", "logs"):
        candidate = source_dir / name
@@ -246,12 +243,7 @@ def _scan_workspace_state(source_dir: Path) -> list[tuple[Path, str]]:
            findings.append((candidate, f"Root {kind}: {name}"))

    # State files inside workspace directories
-    try:
-        children = sorted(source_dir.iterdir())
-    except OSError:
-        return findings
-
-    for child in children:
+    for child in sorted(source_dir.iterdir()):
        if not child.is_dir() or child.name.startswith("."):
            continue
        # Check for workspace-like subdirectories
@@ -10,7 +10,6 @@ To add an alias: set ``aliases=("short",)`` on the existing ``CommandDef``.

 from __future__ import annotations

-import logging
 import os
 import re
 import shutil
@@ -22,8 +21,6 @@ from typing import Any

 from utils import is_truthy_value

-logger = logging.getLogger(__name__)
-
 # prompt_toolkit is an optional CLI dependency — only needed for
 # SlashCommandCompleter and SlashCommandAutoSuggest.  Gateway and test
 # environments that lack it must still be able to import this module
@@ -64,15 +61,14 @@ class CommandDef:
 COMMAND_REGISTRY: list[CommandDef] = [
    # Session
    CommandDef("new", "Start a new session (fresh session ID + history)", "Session",
-               aliases=("reset",), args_hint="[name]"),
-    CommandDef("topic", "Enable or inspect Telegram DM topic sessions", "Session",
-               gateway_only=True, args_hint="[off|help|session-id]"),
+               aliases=("reset",)),
    CommandDef("clear", "Clear screen and start a new session", "Session",
               cli_only=True),
    CommandDef("redraw", "Force a full UI repaint (recovers from terminal drift)", "Session",
               cli_only=True),
    CommandDef("history", "Show conversation history", "Session",
               cli_only=True),
+    CommandDef("recap", "Summarize recent activity in this session", "Session"),
    CommandDef("save", "Save the current conversation", "Session",
               cli_only=True),
    CommandDef("retry", "Retry the last message (resend to agent)", "Session"),
@@ -324,6 +320,7 @@ ACTIVE_SESSION_BYPASS_COMMANDS: frozenset[str] = frozenset(
        "new",
        "profile",
        "queue",
+        "recap",
        "restart",
        "status",
        "steer",
@@ -401,11 +398,6 @@ def _is_gateway_available(cmd: CommandDef, config_overrides: set[str] | None = N
    return False


-def _requires_argument(args_hint: str) -> bool:
-    """Return True when selecting a command without text would be incomplete."""
-    return args_hint.strip().startswith("<")
-
-
 def gateway_help_lines() -> list[str]:
    """Generate gateway help text lines from the registry."""
    overrides = _resolve_config_gates()
@@ -462,9 +454,7 @@ def telegram_bot_commands() -> list[tuple[str, str]]:

    Telegram command names cannot contain hyphens, so they are replaced with
    underscores.  Aliases are skipped -- Telegram shows one menu entry per
-    canonical command. Commands that require arguments are skipped because
-    selecting a Telegram BotCommand sends only ``/command`` and would execute
-    an incomplete command.
+    canonical command.

    Plugin-registered slash commands are included so plugins get native
    autocomplete in Telegram without touching core code.
@@ -474,14 +464,10 @@ def telegram_bot_commands() -> list[tuple[str, str]]:
    for cmd in COMMAND_REGISTRY:
        if not _is_gateway_available(cmd, overrides):
            continue
-        if _requires_argument(cmd.args_hint):
-            continue
        tg_name = _sanitize_telegram_name(cmd.name)
        if tg_name:
            result.append((tg_name, cmd.description))
-    for name, description, args_hint in _iter_plugin_command_entries():
-        if _requires_argument(args_hint):
-            continue
+    for name, description, _args_hint in _iter_plugin_command_entries():
        tg_name = _sanitize_telegram_name(name)
        if tg_name:
            result.append((tg_name, description))
@@ -515,9 +501,9 @@ def _sanitize_telegram_name(raw: str) -> str:


 def _clamp_command_names(
-    entries: list[tuple[str, ...]],
+    entries: list[tuple[str, str]],
    reserved: set[str],
-) -> list[tuple[str, ...]]:
+) -> list[tuple[str, str]]:
    """Enforce 32-char command name limit with collision avoidance.

    Both Telegram and Discord cap slash command names at 32 characters.
@@ -525,15 +511,10 @@ def _clamp_command_names(
    (against *reserved* names or earlier entries in the same batch), the name is
    shortened to 31 chars and a digit ``0``-``9`` is appended to differentiate.
    If all 10 digit slots are taken the entry is silently dropped.
-
-    Accepts tuples of any length >= 2.  Extra elements beyond ``(name, desc)``
-    (e.g. ``cmd_key``) are passed through unchanged, so callers can attach
-    metadata that survives the rename.
    """
    used: set[str] = set(reserved)
-    result: list[tuple] = []
-    for entry in entries:
-        name, desc, *extra = entry
+    result: list[tuple[str, str]] = []
+    for name, desc in entries:
        if len(name) > _CMD_NAME_LIMIT:
            candidate = name[:_CMD_NAME_LIMIT]
            if candidate in used:
@@ -549,7 +530,7 @@ def _clamp_command_names(
        if name in used:
            continue
        used.add(name)
-        result.append((name, desc, *extra))
+        result.append((name, desc))
    return result


@@ -632,26 +613,13 @@ def _collect_gateway_skill_entries(
    try:
        from agent.skill_commands import get_skill_commands
        from tools.skills_tool import SKILLS_DIR
-        from agent.skill_utils import get_external_skills_dirs
        _skills_dir = str(SKILLS_DIR.resolve())
-        _hub_dir = str((SKILLS_DIR / ".hub").resolve()).rstrip("/") + "/"
-        # Build set of allowed directory prefixes: local skills dir + any
-        # user-configured ``skills.external_dirs``. Ensure each prefix ends
-        # with ``/`` so ``/my-skills`` does not also match ``/my-skills-extra``.
-        # Without this widening, external skills are visible in
-        # ``hermes skills list`` and the agent's ``/skill-name`` dispatch but
-        # silently excluded from gateway slash menus (#8110).
-        _allowed_prefixes = [_skills_dir.rstrip("/") + "/"]
-        _allowed_prefixes.extend(
-            str(d).rstrip("/") + "/" for d in get_external_skills_dirs()
-        )
+        _hub_dir = str((SKILLS_DIR / ".hub").resolve())
        skill_cmds = get_skill_commands()
        for cmd_key in sorted(skill_cmds):
            info = skill_cmds[cmd_key]
            skill_path = info.get("skill_md_path", "")
-            if not skill_path:
-                continue
-            if not any(skill_path.startswith(prefix) for prefix in _allowed_prefixes):
+            if not skill_path.startswith(_skills_dir):
                continue
            if skill_path.startswith(_hub_dir):
                continue
@@ -669,15 +637,17 @@ def _collect_gateway_skill_entries(
    except Exception:
        pass

-    # Clamp names; cmd_key is passed through as extra payload so it survives
-    # any clamp-induced renames.
-    skill_triples = _clamp_command_names(skill_triples, reserved_names)
+    # Clamp names; _clamp_command_names works on (name, desc) pairs so we
+    # need to zip/unzip.
+    skill_pairs = [(n, d) for n, d, _ in skill_triples]
+    key_by_pair = {(n, d): k for n, d, k in skill_triples}
+    skill_pairs = _clamp_command_names(skill_pairs, reserved_names)

    # Skills fill remaining slots — only tier that gets trimmed
    remaining = max(0, max_slots - len(all_entries))
-    hidden_count = max(0, len(skill_triples) - remaining)
-    for n, d, k in skill_triples[:remaining]:
-        all_entries.append((n, d, k))
+    hidden_count = max(0, len(skill_pairs) - remaining)
+    for n, d in skill_pairs[:remaining]:
+        all_entries.append((n, d, key_by_pair.get((n, d), "")))

    return all_entries[:max_slots], hidden_count

@@ -753,40 +723,24 @@ def discord_skill_commands(
 def discord_skill_commands_by_category(
    reserved_names: set[str],
 ) -> tuple[dict[str, list[tuple[str, str, str]]], list[tuple[str, str, str]], int]:
-    """Return skill entries organized by category for Discord ``/skill`` autocomplete.
+    """Return skill entries organized by category for Discord ``/skill`` subcommand groups.

-    Skills whose directory is nested at least 2 levels under a scan root
+    Skills whose directory is nested at least 2 levels under ``SKILLS_DIR``
    (e.g. ``creative/ascii-art/SKILL.md``) are grouped by their top-level
    category.  Root-level skills (e.g. ``dogfood/SKILL.md``) are returned as
-    *uncategorized*.
+    *uncategorized* — the caller should register them as direct subcommands
+    of the ``/skill`` group.

-    Scan roots include the local ``SKILLS_DIR`` **and** any configured
-    ``skills.external_dirs`` — matching the widened filter applied to the
-    flat ``discord_skill_commands()`` collector in #18741. Without this
-    parity, external-dir skills are visible via ``hermes skills list`` and
-    the agent's ``/skill-name`` dispatch but silently absent from Discord's
-    ``/skill`` autocomplete.
-
-    Filtering mirrors :func:`discord_skill_commands`: hub skills excluded,
-    per-platform disabled excluded, names clamped to 32 chars, descriptions
-    clamped to 100 chars.
-
-    The legacy 25-group × 25-subcommand caps (from the old nested
-    ``/skill <cat> <name>`` layout) are **not** applied — the live caller
-    (``_register_skill_group`` in ``gateway/platforms/discord.py``, refactored
-    in PR #11580) flattens these results and feeds them into a single
-    autocomplete callback, which scales to thousands of entries without any
-    per-command payload concerns. ``hidden_count`` is retained in the return
-    tuple for backward compatibility and still reports skills dropped for
-    other reasons (32-char clamp collision vs a reserved name).
+    The same filtering as :func:`discord_skill_commands` is applied: hub
+    skills excluded, per-platform disabled excluded, names clamped.

    Returns:
        ``(categories, uncategorized, hidden_count)``

        - *categories*: ``{category_name: [(name, description, cmd_key), ...]}``
        - *uncategorized*: ``[(name, description, cmd_key), ...]``
-        - *hidden_count*: skills dropped due to name clamp collisions
-          against already-registered command names.
+        - *hidden_count*: skills dropped due to Discord group limits
+          (25 subcommand groups, 25 subcommands per group)
    """
    from pathlib import Path as _P

@@ -800,33 +754,14 @@ def discord_skill_commands_by_category(
    # Collect raw skill data --------------------------------------------------
    categories: dict[str, list[tuple[str, str, str]]] = {}
    uncategorized: list[tuple[str, str, str]] = []
-    # Map clamped-32-char-name → what it came from, so we can emit an
-    # actionable warning on collision. Reserved (gateway-builtin) command
-    # names are marked with a sentinel so the warning distinguishes
-    # "skill collided with a reserved command" from "two skills collided
-    # on the 32-char clamp" — the latter is the rename-worthy case.
-    _names_used: dict[str, str] = {n: "<reserved>" for n in reserved_names}
+    _names_used: set[str] = set(reserved_names)
    hidden = 0

    try:
        from agent.skill_commands import get_skill_commands
-        from agent.skill_utils import get_external_skills_dirs
        from tools.skills_tool import SKILLS_DIR
-
        _skills_dir = SKILLS_DIR.resolve()
        _hub_dir = (SKILLS_DIR / ".hub").resolve()
-        # Build list of (resolved_root, is_local) tuples. Each external dir
-        # becomes its own scan root for category derivation — a skill at
-        # ``<external>/mlops/foo/SKILL.md`` is still categorized as "mlops".
-        _scan_roots: list[_P] = [_skills_dir]
-        try:
-            for ext in get_external_skills_dirs():
-                try:
-                    _scan_roots.append(_P(ext).resolve())
-                except Exception:
-                    continue
-        except Exception:
-            pass
        skill_cmds = get_skill_commands()

        for cmd_key in sorted(skill_cmds):
@@ -835,21 +770,10 @@ def discord_skill_commands_by_category(
            if not skill_path:
                continue
            sp = _P(skill_path).resolve()
-            # Hub skills are loaded via the skill hub, not surfaced as
-            # slash commands.
-            if str(sp).startswith(str(_hub_dir)):
+            # Skip skills outside SKILLS_DIR or from the hub
+            if not str(sp).startswith(str(_skills_dir)):
                continue
-            # Accept skill if it lives under any scan root; record the
-            # matching root so we can derive the category correctly.
-            matched_root: _P | None = None
-            for root in _scan_roots:
-                try:
-                    sp.relative_to(root)
-                except ValueError:
-                    continue
-                matched_root = root
-                break
-            if matched_root is None:
+            if str(sp).startswith(str(_hub_dir)):
                continue

            skill_name = info.get("name", "")
@@ -857,50 +781,22 @@ def discord_skill_commands_by_category(
                continue

            raw_name = cmd_key.lstrip("/")
-            # Clamp to 32 chars (Discord per-command name limit)
+            # Clamp to 32 chars (Discord limit)
            discord_name = raw_name[:32]
            if discord_name in _names_used:
-                # Two skills whose first 32 chars are identical. One wins
-                # (the first one seen, which is alphabetical because the
-                # caller iterates ``sorted(skill_cmds)``); the other is
-                # dropped from Discord's /skill autocomplete.
-                #
-                # Silently counting this as ``hidden`` (the old behavior)
-                # meant skill authors had no way to discover the drop —
-                # their skill just didn't appear in the picker. Emit a
-                # WARNING naming both sides so the author can rename the
-                # losing skill's frontmatter name to something with a
-                # distinct 32-char prefix.
-                prior = _names_used[discord_name]
-                if prior == "<reserved>":
-                    logger.warning(
-                        "Discord /skill: %r (from %r) collides on its 32-char "
-                        "clamp with a reserved gateway command name %r — the "
-                        "skill will not appear in the /skill autocomplete. "
-                        "Rename the skill's frontmatter ``name:`` to differ "
-                        "in its first 32 chars.",
-                        discord_name, cmd_key, discord_name,
-                    )
-                else:
-                    logger.warning(
-                        "Discord /skill: %r and %r both clamp to %r on "
-                        "Discord's 32-char command-name limit — only %r "
-                        "will appear in the /skill autocomplete. Rename "
-                        "one skill's frontmatter ``name:`` to differ in "
-                        "its first 32 chars.",
-                        prior, cmd_key, discord_name, prior,
-                    )
-                hidden += 1
                continue
-            _names_used[discord_name] = cmd_key
+            _names_used.add(discord_name)

            desc = info.get("description", "")
            if len(desc) > 100:
                desc = desc[:97] + "..."

-            # Determine category from the relative path within the matched
-            # scan root. e.g. creative/ascii-art/SKILL.md → ("creative", ...)
-            rel = sp.parent.relative_to(matched_root)
+            # Determine category from the relative path within SKILLS_DIR.
+            # e.g. creative/ascii-art/SKILL.md → parts = ("creative", "ascii-art")
+            try:
+                rel = sp.parent.relative_to(_skills_dir)
+            except ValueError:
+                continue
            parts = rel.parts
            if len(parts) >= 2:
                cat = parts[0]
@@ -910,7 +806,28 @@ def discord_skill_commands_by_category(
    except Exception:
        pass

-    return categories, uncategorized, hidden
+    # Enforce Discord limits: 25 subcommand groups, 25 subcommands each ------
+    _MAX_GROUPS = 25
+    _MAX_PER_GROUP = 25
+
+    trimmed_categories: dict[str, list[tuple[str, str, str]]] = {}
+    group_count = 0
+    for cat in sorted(categories):
+        if group_count >= _MAX_GROUPS:
+            hidden += len(categories[cat])
+            continue
+        entries = categories[cat][:_MAX_PER_GROUP]
+        hidden += max(0, len(categories[cat]) - _MAX_PER_GROUP)
+        trimmed_categories[cat] = entries
+        group_count += 1
+
+    # Uncategorized skills also count against the 25 top-level limit
+    remaining_slots = _MAX_GROUPS - group_count
+    if len(uncategorized) > remaining_slots:
+        hidden += len(uncategorized) - remaining_slots
+        uncategorized = uncategorized[:remaining_slots]
+
+    return trimmed_categories, uncategorized, hidden


 # ---------------------------------------------------------------------------
@@ -1128,12 +1045,6 @@ class SlashCommandCompleter(Completer):
        except Exception:
            return {}

-    # Commands that open pickers when run without arguments.
-    # These should NOT receive a trailing space in completions because:
-    # - The TUI's submit handler applies completions on Enter if input differs
-    # - Adding space makes "/model" → "/model " which blocks picker execution
-    _PICKER_COMMANDS = frozenset({"model", "skin", "personality"})
-
    @staticmethod
    def _completion_text(cmd_name: str, word: str) -> str:
        """Return replacement text for a completion.
@@ -1142,17 +1053,8 @@ class SlashCommandCompleter(Completer):
        returning ``help`` would be a no-op and prompt_toolkit suppresses the
        menu. Appending a trailing space keeps the dropdown visible and makes
        backspacing retrigger it naturally.
-
-        However, commands that open pickers (model, skin, personality) should
-        NOT get a trailing space — the TUI would apply the completion on Enter
-        and block the picker from opening.
        """
-        if cmd_name != word:
-            return cmd_name
-        # Don't add space for picker commands — allows Enter to execute them
-        if cmd_name in SlashCommandCompleter._PICKER_COMMANDS:
-            return cmd_name
-        return f"{cmd_name} "
+        return f"{cmd_name} " if cmd_name == word else cmd_name

    @staticmethod
    def _extract_path_word(text: str) -> str | None:
@@ -400,12 +400,7 @@ DEFAULT_CONFIG = {
        # The gateway stops accepting new work, waits for running agents
        # to finish, then interrupts any remaining runs after the timeout.
        # 0 = no drain, interrupt immediately.
-        #
-        # 180s is calibrated for realistic in-flight agent turns: a typical
-        # coding conversation mid-reasoning runs 60–150s per call, so a 60s
-        # budget routinely interrupted legitimate work on /restart. Raise
-        # further in config.yaml if you run very-long-reasoning models.
-        "restart_drain_timeout": 180,
+        "restart_drain_timeout": 60,
        # Max app-level retry attempts for API errors (connection drops,
        # provider timeouts, 5xx, etc.) before the agent surfaces the
        # failure.  The OpenAI SDK already does its own low-level retries
@@ -644,18 +639,6 @@ DEFAULT_CONFIG = {
        "cache_ttl": "5m",
    },

-    # OpenRouter-specific settings.
-    # response_cache: enable OpenRouter response caching (X-OpenRouter-Cache header).
-    #   When enabled, identical requests return cached responses for free (zero billing).
-    #   This is separate from Anthropic prompt caching and works alongside it.
-    #   See: https://openrouter.ai/docs/guides/features/response-caching
-    # response_cache_ttl: how long cached responses remain valid, in seconds (1-86400).
-    #   Default 300 (5 minutes). Only used when response_cache is enabled.
-    "openrouter": {
-        "response_cache": True,
-        "response_cache_ttl": 300,
-    },
-
    # AWS Bedrock provider configuration.
    # Only used when model.provider is "bedrock".
    "bedrock": {
@@ -781,11 +764,6 @@ DEFAULT_CONFIG = {
        "inline_diffs": True,     # Show inline diff previews for write actions (write_file, patch, skill_manage)
        "show_cost": False,       # Show $ cost in the status bar (off by default)
        "skin": "default",
-        # UI language for static user-facing messages (approval prompts, a
-        # handful of gateway slash-command replies).  Does NOT affect agent
-        # responses, log lines, tool outputs, or slash-command descriptions.
-        # Supported: en, zh, ja, de, es.  Unknown values fall back to en.
-        "language": "en",
        # TUI busy indicator style: kaomoji (default), emoji, unicode (braille
        # spinner), or ascii.  Live-swappable via `/indicator <style>`.
        "tui_status_indicator": "kaomoji",
@@ -814,7 +792,6 @@ DEFAULT_CONFIG = {
            "enabled": False,
            "fields": ["model", "context_pct", "cwd"],  # Order shown; drop any to hide
        },
-        "copy_shortcut": "auto",  # "auto" (platform default) | "ctrl_c" | "ctrl_shift_c" | "disabled"
    },

    # Web dashboard settings
@@ -848,7 +825,7 @@ DEFAULT_CONFIG = {
            # Voices: alloy, echo, fable, onyx, nova, shimmer
        },
        "xai": {
-            "voice_id": "eve",  # or custom voice ID — see https://docs.x.ai/developers/model-capabilities/audio/custom-voices
+            "voice_id": "eve",
            "language": "en",
            "sample_rate": 24000,
            "bit_rate": 128000,
@@ -1292,10 +1269,7 @@ DEFAULT_CONFIG = {
        # for a single update run.
        "pre_update_backup": False,
        # How many pre-update backup zips to retain.  Older ones are pruned
-        # automatically after each successful backup.  Values below 1 are
-        # floored to 1 — the backup just created is always preserved.  To
-        # disable backups entirely, set ``pre_update_backup: false`` above
-        # rather than ``backup_keep: 0``.
+        # automatically after each successful backup.
        "backup_keep": 5,
    },

@@ -3952,7 +3926,6 @@ _FALLBACK_COMMENT = """
 #   kimi-coding-cn (KIMI_CN_API_KEY)   — Kimi / Moonshot (China)
 #   minimax      (MINIMAX_API_KEY)     — MiniMax
 #   minimax-cn   (MINIMAX_CN_API_KEY)  — MiniMax (China)
-#   bedrock      (AWS IAM / boto3)     — AWS Bedrock (Converse API)
 #
 # For custom OpenAI-compatible endpoints, add base_url and key_env.
 #
@@ -3984,7 +3957,6 @@ _COMMENTED_SECTIONS = """
 #   kimi-coding-cn (KIMI_CN_API_KEY)   — Kimi / Moonshot (China)
 #   minimax      (MINIMAX_API_KEY)     — MiniMax
 #   minimax-cn   (MINIMAX_CN_API_KEY)  — MiniMax (China)
-#   bedrock      (AWS IAM / boto3)     — AWS Bedrock (Converse API)
 #
 # For custom OpenAI-compatible endpoints, add base_url and key_env.
 #
@@ -4686,9 +4658,7 @@ def set_config_value(key: str, value: str):
        "terminal.vercel_runtime": "TERMINAL_VERCEL_RUNTIME",
        "terminal.docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
        "terminal.docker_run_as_host_user": "TERMINAL_DOCKER_RUN_AS_HOST_USER",
-        # terminal.cwd intentionally excluded — CLI resolves at runtime,
-        # gateway bridges it in gateway/run.py. Persisting to .env causes
-        # stale values to poison child processes.
+        "terminal.cwd": "TERMINAL_CWD",
        "terminal.timeout": "TERMINAL_TIMEOUT",
        "terminal.sandbox_dir": "TERMINAL_SANDBOX_DIR",
        "terminal.persistent_shell": "TERMINAL_PERSISTENT_SHELL",
@@ -4842,45 +4812,3 @@ def config_command(args):
        print("  hermes config path      Show config file path")
        print("  hermes config env-path  Show .env file path")
        sys.exit(1)
-
-
-# ── Profile-driven env var injection ─────────────────────────────────────────
-# Any provider registered in providers/ with auth_type="api_key" automatically
-# gets its env_vars exposed in OPTIONAL_ENV_VARS without editing this file.
-# Runs once at import time.
-
-_profile_env_vars_injected = False
-
-
-def _inject_profile_env_vars() -> None:
-    """Populate OPTIONAL_ENV_VARS from provider profiles not already listed.
-
-    Called once at module load time. Idempotent — repeated calls are no-ops.
-    """
-    global _profile_env_vars_injected
-    if _profile_env_vars_injected:
-        return
-    _profile_env_vars_injected = True
-    try:
-        from providers import list_providers
-        for _pp in list_providers():
-            if _pp.auth_type not in ("api_key",):
-                continue
-            for _var in _pp.env_vars:
-                if _var in OPTIONAL_ENV_VARS:
-                    continue
-                _is_key = not _var.endswith("_BASE_URL") and not _var.endswith("_URL")
-                OPTIONAL_ENV_VARS[_var] = {
-                    "description": f"{_pp.display_name or _pp.name} {'API key' if _is_key else 'base URL override'}",
-                    "prompt": f"{_pp.display_name or _pp.name} {'API key' if _is_key else 'base URL (leave empty for default)'}",
-                    "url": _pp.signup_url or None,
-                    "password": _is_key,
-                    "category": "provider",
-                    "advanced": True,
-                }
-    except Exception:
-        pass
-
-
-# Eagerly inject so that OPTIONAL_ENV_VARS is fully populated at import time.
-_inject_profile_env_vars()
@@ -93,8 +93,6 @@ def cron_list(show_all: bool = False):
        script = job.get("script")
        if script:
            print(f"    Script:    {script}")
-        if job.get("no_agent"):
-            print(f"    Mode:      {color('no-agent', Colors.DIM)} (script stdout delivered directly)")
        workdir = job.get("workdir")
        if workdir:
            print(f"    Workdir:   {workdir}")
@@ -174,7 +172,6 @@ def cron_create(args):
        skills=_normalize_skills(getattr(args, "skill", None), getattr(args, "skills", None)),
        script=getattr(args, "script", None),
        workdir=getattr(args, "workdir", None),
-        no_agent=getattr(args, "no_agent", False) or None,
    )
    if not result.get("success"):
        print(color(f"Failed to create job: {result.get('error', 'unknown error')}", Colors.RED))
@@ -187,8 +184,6 @@ def cron_create(args):
    job_data = result.get("job", {})
    if job_data.get("script"):
        print(f"  Script: {job_data['script']}")
-    if job_data.get("no_agent"):
-        print("  Mode: no-agent (script stdout delivered directly)")
    if job_data.get("workdir"):
        print(f"  Workdir: {job_data['workdir']}")
    print(f"  Next run: {result['next_run_at']}")
@@ -230,7 +225,6 @@ def cron_edit(args):
        skills=final_skills,
        script=getattr(args, "script", None),
        workdir=getattr(args, "workdir", None),
-        no_agent=getattr(args, "no_agent", None),
    )
    if not result.get("success"):
        print(color(f"Failed to update job: {result.get('error', 'unknown error')}", Colors.RED))
@@ -246,8 +240,6 @@ def cron_edit(args):
        print("  Skills: none")
    if updated.get("script"):
        print(f"  Script: {updated['script']}")
-    if updated.get("no_agent"):
-        print("  Mode: no-agent (script stdout delivered directly)")
    if updated.get("workdir"):
        print(f"  Workdir: {updated['workdir']}")
    return 0
@@ -245,111 +245,6 @@ def _cmd_restore(args) -> int:
    return 0 if ok else 1


-def _cmd_archive(args) -> int:
-    """Manually archive an agent-created skill. Refuses if pinned.
-
-    The auto-curator archives stale skills on its own schedule; this verb is
-    for the user who wants to archive *now* without waiting for a run.
-    """
-    from tools import skill_usage
-    if skill_usage.get_record(args.skill).get("pinned"):
-        print(
-            f"curator: '{args.skill}' is pinned — unpin first with "
-            f"`hermes curator unpin {args.skill}`"
-        )
-        return 1
-    ok, msg = skill_usage.archive_skill(args.skill)
-    print(f"curator: {msg}")
-    return 0 if ok else 1
-
-
-def _idle_days(record: dict) -> Optional[int]:
-    """Days since the skill's last activity (view / use / patch).
-
-    Falls back to ``created_at`` so a skill that was authored but never used
-    can still be pruned — otherwise never-touched skills would be immortal.
-    Returns None only when both fields are missing or unparseable.
-    """
-    ts = record.get("last_activity_at") or record.get("created_at")
-    if not ts:
-        return None
-    try:
-        dt = datetime.fromisoformat(str(ts))
-    except (TypeError, ValueError):
-        return None
-    if dt.tzinfo is None:
-        dt = dt.replace(tzinfo=timezone.utc)
-    return max(0, (datetime.now(timezone.utc) - dt).days)
-
-
-def _cmd_prune(args) -> int:
-    """Bulk-archive agent-created skills idle for >= N days.
-
-    Pinned skills are exempt. Already-archived skills are skipped. Default
-    ``--days 90`` matches a conservative read of the curator's own archive
-    threshold; adjust with ``--days``. Use ``--dry-run`` to preview.
-    """
-    from tools import skill_usage
-    days = getattr(args, "days", 90)
-    if days < 1:
-        print(f"curator: --days must be >= 1 (got {days})", file=sys.stderr)
-        return 2
-
-    dry_run = bool(getattr(args, "dry_run", False))
-    skip_confirm = bool(getattr(args, "yes", False))
-
-    candidates = []
-    for r in skill_usage.agent_created_report():
-        if r.get("pinned"):
-            continue
-        if r.get("state") == skill_usage.STATE_ARCHIVED:
-            continue
-        idle = _idle_days(r)
-        if idle is None or idle < days:
-            continue
-        candidates.append((r["name"], idle))
-
-    if not candidates:
-        print(f"curator: nothing to prune (no unpinned skills idle >= {days}d)")
-        return 0
-
-    candidates.sort(key=lambda c: -c[1])
-    print(f"curator: {len(candidates)} skill(s) idle >= {days}d:")
-    for name, idle in candidates:
-        print(f"  {name:40s} idle {idle}d")
-
-    if dry_run:
-        print("\n(dry run — no changes made)")
-        return 0
-
-    if not skip_confirm:
-        try:
-            reply = input(f"\nArchive {len(candidates)} skill(s)? [y/N] ").strip().lower()
-        except (EOFError, KeyboardInterrupt):
-            print("\ncurator: aborted")
-            return 1
-        if reply not in ("y", "yes"):
-            print("curator: aborted")
-            return 1
-
-    archived = 0
-    failures = []
-    for name, _ in candidates:
-        ok, msg = skill_usage.archive_skill(name)
-        if ok:
-            archived += 1
-        else:
-            failures.append((name, msg))
-
-    print(f"\ncurator: archived {archived}/{len(candidates)}")
-    if failures:
-        print("failures:")
-        for name, msg in failures:
-            print(f"  {name}: {msg}")
-        return 1
-    return 0
-
-
 def _cmd_backup(args) -> int:
    """Take a manual snapshot of the skills tree. Same mechanism as the
    automatic pre-run snapshot, just user-initiated."""
@@ -407,21 +302,9 @@ def _cmd_rollback(args) -> int:
        print(f"  reason:      {manifest.get('reason', '?')}")
        print(f"  created_at:  {manifest.get('created_at', '?')}")
        print(f"  skill files: {manifest.get('skill_files', '?')}")
-        cron = manifest.get("cron_jobs") or {}
-        if isinstance(cron, dict):
-            if cron.get("backed_up"):
-                print(
-                    f"  cron jobs:   {cron.get('jobs_count', 0)} "
-                    f"(will be restored for skill-link fields only)"
-                )
-            else:
-                reason = cron.get("reason", "not captured")
-                print(f"  cron jobs:   not in snapshot ({reason})")
    print(
        "\nThis will replace the current ~/.hermes/skills/ tree (a safety "
-        "snapshot of the current state is taken first so this is undoable). "
-        "Cron jobs that still exist will have their skills/skill fields "
-        "restored from the snapshot; all other cron fields are left alone."
+        "snapshot of the current state is taken first so this is undoable)."
    )

    if not getattr(args, "yes", False):
@@ -488,31 +371,6 @@ def register_cli(parent: argparse.ArgumentParser) -> None:
    p_restore.add_argument("skill", help="Skill name")
    p_restore.set_defaults(func=_cmd_restore)

-    p_archive = subs.add_parser(
-        "archive",
-        help="Manually archive a skill (move to .archive/, excluded from prompt)",
-    )
-    p_archive.add_argument("skill", help="Skill name")
-    p_archive.set_defaults(func=_cmd_archive)
-
-    p_prune = subs.add_parser(
-        "prune",
-        help="Bulk-archive agent-created skills idle for >= N days (default 90)",
-    )
-    p_prune.add_argument(
-        "--days", type=int, default=90,
-        help="Archive skills idle for at least N days (default: 90)",
-    )
-    p_prune.add_argument(
-        "-y", "--yes", action="store_true",
-        help="Skip the confirmation prompt",
-    )
-    p_prune.add_argument(
-        "--dry-run", dest="dry_run", action="store_true",
-        help="Show what would be archived without doing it",
-    )
-    p_prune.set_defaults(func=_cmd_prune)
-
    p_backup = subs.add_parser(
        "backup",
        help="Take a manual tar.gz snapshot of ~/.hermes/skills/ "
@@ -156,8 +156,6 @@ def curses_checklist(
        flush_stdin()
        return result_holder[0] if result_holder[0] is not None else cancel_returns

-    except KeyboardInterrupt:
-        return cancel_returns
    except Exception:
        return _numbered_fallback(title, items, selected, cancel_returns, status_fn)

@@ -280,8 +278,6 @@ def curses_radiolist(
        flush_stdin()
        return result_holder[0] if result_holder[0] is not None else cancel_returns

-    except KeyboardInterrupt:
-        return cancel_returns
    except Exception:
        return _radio_numbered_fallback(title, items, selected, cancel_returns)

@@ -405,8 +401,6 @@ def curses_single_select(
            return None
        return result_holder[0]

-    except KeyboardInterrupt:
-        return None
    except Exception:
        all_items = list(items) + [cancel_label]
        cancel_idx = len(items)
@@ -1,19 +1,12 @@
-"""``hermes debug`` debug tools for Hermes Agent.
+"""``hermes debug`` — debug tools for Hermes Agent.

 Currently supports:
    hermes debug share    Upload debug report (system info + logs) to a
                          paste service and print a shareable URL.
-                          By default, log content is run through
-                          ``agent.redact.redact_sensitive_text`` with
-                          ``force=True`` before upload so credentials in
-                          ``~/.hermes/logs/*.log`` are not leaked into
-                          the public paste service. Pass ``--no-redact``
-                          to disable.
 """

 import io
 import json
-import logging
 import sys
 import time
 import urllib.error
@@ -26,16 +19,6 @@ from typing import Optional
 from hermes_constants import get_hermes_home
 from utils import atomic_replace

-logger = logging.getLogger(__name__)
-
-# Banner prepended to upload-bound log content when redaction is enabled.
-# Visible in the public paste so reviewers know the content was sanitized.
-# Kept short; the trailing newline guarantees the banner sits on its own line.
-_REDACTION_BANNER = (
-    "[hermes debug share: log content redacted at upload time. "
-    "run with --no-redact to disable]\n"
-)
-

 # ---------------------------------------------------------------------------
 # Paste services — try paste.rs first, dpaste.com as fallback.
@@ -385,40 +368,17 @@ def _resolve_log_path(log_name: str) -> Optional[Path]:
    return None


-def _redact_log_text(text: str) -> str:
-    """Run ``redact_sensitive_text`` with ``force=True`` over upload-bound text.
-
-    Uses ``force=True`` so redaction fires regardless of the operator's
-    ``security.redact_secrets`` setting. The local on-disk log file is
-    not modified; only the in-memory copy headed for the public paste
-    service is sanitized. Returns the redacted text (or the original
-    when empty / non-string).
-    """
-    if not text:
-        return text
-    from agent.redact import redact_sensitive_text
-
-    return redact_sensitive_text(text, force=True)
-
-
 def _capture_log_snapshot(
    log_name: str,
    *,
    tail_lines: int,
    max_bytes: int = _MAX_LOG_BYTES,
-    redact: bool = True,
 ) -> LogSnapshot:
    """Capture a log once and derive summary/full-log views from it.

    The report tail and standalone log upload must come from the same file
    snapshot. Otherwise a rotation/truncate between reads can make the report
    look newer than the uploaded ``agent.log`` paste.
-
-    When ``redact`` is True (the default), both ``tail_text`` and
-    ``full_text`` are run through ``_redact_log_text`` so the snapshot
-    returned is upload-safe. The on-disk log file is never modified.
-    Pass ``redact=False`` to capture original log content (used by
-    ``hermes debug share --no-redact``).
    """
    log_path = _resolve_log_path(log_name)
    if log_path is None:
@@ -478,34 +438,18 @@ def _capture_log_snapshot(
        if truncated:
            full_text = f"[... truncated — showing last ~{max_bytes // 1024}KB ...]\n{full_text}"

-        if redact:
-            tail_text = _redact_log_text(tail_text)
-            full_text = _redact_log_text(full_text)
-
        return LogSnapshot(path=log_path, tail_text=tail_text, full_text=full_text)
    except Exception as exc:
        return LogSnapshot(path=log_path, tail_text=f"(error reading: {exc})", full_text=None)


-def _capture_default_log_snapshots(
-    log_lines: int, *, redact: bool = True
-) -> dict[str, LogSnapshot]:
-    """Capture all logs used by debug-share exactly once.
-
-    ``redact`` is forwarded to each ``_capture_log_snapshot`` call so all
-    captured logs share the same redaction policy for a given run.
-    """
+def _capture_default_log_snapshots(log_lines: int) -> dict[str, LogSnapshot]:
+    """Capture all logs used by debug-share exactly once."""
    errors_lines = min(log_lines, 100)
    return {
-        "agent": _capture_log_snapshot(
-            "agent", tail_lines=log_lines, redact=redact
-        ),
-        "errors": _capture_log_snapshot(
-            "errors", tail_lines=errors_lines, redact=redact
-        ),
-        "gateway": _capture_log_snapshot(
-            "gateway", tail_lines=errors_lines, redact=redact
-        ),
+        "agent": _capture_log_snapshot("agent", tail_lines=log_lines),
+        "errors": _capture_log_snapshot("errors", tail_lines=errors_lines),
+        "gateway": _capture_log_snapshot("gateway", tail_lines=errors_lines),
    }


@@ -588,7 +532,6 @@ def run_debug_share(args):
    log_lines = getattr(args, "lines", 200)
    expiry = getattr(args, "expire", 7)
    local_only = getattr(args, "local", False)
-    redact = not getattr(args, "no_redact", False)

    if not local_only:
        print(_PRIVACY_NOTICE)
@@ -596,16 +539,8 @@ def run_debug_share(args):
    print("Collecting debug report...")

    # Capture dump once — prepended to every paste for context.
-    # The dump is already redacted at extract time via dump.py:_redact;
-    # log_snapshots are redacted by _capture_default_log_snapshots when
-    # redact=True so credentials never reach the public paste service.
    dump_text = _capture_dump()
-    log_snapshots = _capture_default_log_snapshots(log_lines, redact=redact)
-
-    if redact:
-        logger.info(
-            "hermes debug share: applied force-mode redaction to log snapshots before upload"
-        )
+    log_snapshots = _capture_default_log_snapshots(log_lines)

    report = collect_debug_report(
        log_lines=log_lines,
@@ -621,15 +556,6 @@ def run_debug_share(args):
    if gateway_log:
        gateway_log = dump_text + "\n\n--- full gateway.log ---\n" + gateway_log

-    # Visible banner so reviewers reading the public paste know redaction
-    # was applied at upload time. Banner is omitted under --no-redact.
-    if redact:
-        report = _REDACTION_BANNER + report
-        if agent_log:
-            agent_log = _REDACTION_BANNER + agent_log
-        if gateway_log:
-            gateway_log = _REDACTION_BANNER + gateway_log
-
    if local_only:
        print(report)
        if agent_log:
@@ -740,7 +666,6 @@ def run_debug(args):
        print("  --lines N    Number of log lines to include (default: 200)")
        print("  --expire N   Paste expiry in days (default: 7)")
        print("  --local      Print report locally instead of uploading")
-        print("  --no-redact  Disable upload-time secret redaction (default: redact)")
        print()
        print("Options (delete):")
        print("  <url> ...    One or more paste URLs to delete")
@@ -12,7 +12,6 @@ import importlib.util
 from pathlib import Path

 from hermes_cli.config import get_project_root, get_hermes_home, get_env_path
-from hermes_cli.env_loader import load_hermes_dotenv
 from hermes_constants import display_hermes_home

 PROJECT_ROOT = get_project_root()
@@ -20,8 +19,15 @@ HERMES_HOME = get_hermes_home()
 _DHH = display_hermes_home()  # user-facing display path (e.g. ~/.hermes or ~/.hermes/profiles/coder)

 # Load environment variables from ~/.hermes/.env so API key checks work
+from dotenv import load_dotenv
 _env_path = get_env_path()
-load_hermes_dotenv(hermes_home=_env_path.parent, project_env=PROJECT_ROOT / ".env")
+if _env_path.exists():
+    try:
+        load_dotenv(_env_path, encoding="utf-8")
+    except UnicodeDecodeError:
+        load_dotenv(_env_path, encoding="latin-1")
+# Also try project .env as dev fallback
+load_dotenv(PROJECT_ROOT / ".env", override=False, encoding="utf-8")

 from hermes_cli.colors import Colors, color
 from hermes_cli.models import _HERMES_USER_AGENT
@@ -169,85 +175,6 @@ def _check_gateway_service_linger(issues: list[str]) -> None:
        check_warn("Could not verify systemd linger", f"({linger_detail})")


-_APIKEY_PROVIDERS_CACHE: list | None = None
-
-
-def _build_apikey_providers_list() -> list:
-    """Build the API-key provider health-check list once and cache it.
-
-    Tuple format: (name, env_vars, default_url, base_env, supports_models_endpoint)
-    Base list augmented with any ProviderProfile with auth_type="api_key" not
-    already present — adding providers/*.py is sufficient to get into doctor.
-    """
-    _static = [
-        ("Z.AI / GLM",      ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"), "https://api.z.ai/api/paas/v4/models", "GLM_BASE_URL", True),
-        ("Kimi / Moonshot",  ("KIMI_API_KEY",),                              "https://api.moonshot.ai/v1/models",   "KIMI_BASE_URL", True),
-        ("StepFun Step Plan", ("STEPFUN_API_KEY",),                          "https://api.stepfun.ai/step_plan/v1/models", "STEPFUN_BASE_URL", True),
-        ("Kimi / Moonshot (China)", ("KIMI_CN_API_KEY",),                    "https://api.moonshot.cn/v1/models",   None, True),
-        ("Arcee AI",         ("ARCEEAI_API_KEY",),                           "https://api.arcee.ai/api/v1/models",  "ARCEE_BASE_URL", True),
-        ("GMI Cloud",        ("GMI_API_KEY",),                               "https://api.gmi-serving.com/v1/models", "GMI_BASE_URL", True),
-        ("DeepSeek",         ("DEEPSEEK_API_KEY",),                          "https://api.deepseek.com/v1/models",  "DEEPSEEK_BASE_URL", True),
-        ("Hugging Face",     ("HF_TOKEN",),                                  "https://router.huggingface.co/v1/models", "HF_BASE_URL", True),
-        ("NVIDIA NIM",       ("NVIDIA_API_KEY",),                            "https://integrate.api.nvidia.com/v1/models", "NVIDIA_BASE_URL", True),
-        ("Alibaba/DashScope", ("DASHSCOPE_API_KEY",),                        "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models", "DASHSCOPE_BASE_URL", True),
-        # MiniMax global: /v1 endpoint supports /models.
-        ("MiniMax",          ("MINIMAX_API_KEY",),                           "https://api.minimax.io/v1/models",    "MINIMAX_BASE_URL", True),
-        # MiniMax CN: /v1 endpoint does NOT support /models (returns 404).
-        ("MiniMax (China)",  ("MINIMAX_CN_API_KEY",),                        "https://api.minimaxi.com/v1/models",  "MINIMAX_CN_BASE_URL", False),
-        ("Vercel AI Gateway", ("AI_GATEWAY_API_KEY",),                       "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
-        ("Kilo Code",        ("KILOCODE_API_KEY",),                          "https://api.kilo.ai/api/gateway/models", "KILOCODE_BASE_URL", True),
-        ("OpenCode Zen",     ("OPENCODE_ZEN_API_KEY",),                      "https://opencode.ai/zen/v1/models",  "OPENCODE_ZEN_BASE_URL", True),
-        # OpenCode Go has no shared /models endpoint; skip the health check.
-        ("OpenCode Go",      ("OPENCODE_GO_API_KEY",),                       None,                                  "OPENCODE_GO_BASE_URL", False),
-    ]
-    _known_names = {t[0] for t in _static}
-    # Also index by profile canonical name so profiles without display_name
-    # don't create duplicate entries for providers already in the static list.
-    _known_canonical: set[str] = set()
-    _name_to_canonical = {
-        "Z.AI / GLM": "zai", "Kimi / Moonshot": "kimi-coding",
-        "StepFun Step Plan": "stepfun", "Kimi / Moonshot (China)": "kimi-coding-cn",
-        "Arcee AI": "arcee", "GMI Cloud": "gmi", "DeepSeek": "deepseek",
-        "Hugging Face": "huggingface", "NVIDIA NIM": "nvidia",
-        "Alibaba/DashScope": "alibaba", "MiniMax": "minimax",
-        "MiniMax (China)": "minimax-cn", "Vercel AI Gateway": "ai-gateway",
-        "Kilo Code": "kilocode", "OpenCode Zen": "opencode-zen",
-        "OpenCode Go": "opencode-go",
-    }
-    for _label, _canonical in _name_to_canonical.items():
-        _known_canonical.add(_canonical)
-    try:
-        from providers import list_providers
-        from providers.base import ProviderProfile as _PP
-        for _pp in list_providers():
-            if not isinstance(_pp, _PP) or _pp.auth_type != "api_key" or not _pp.env_vars:
-                continue
-            _label = _pp.display_name or _pp.name
-            if _label in _known_names or _pp.name in _known_canonical:
-                continue
-            # Separate API-key vars from base-URL override vars — the health-check
-            # loop sends the first found value as Authorization: Bearer, so a URL
-            # string must never be picked.
-            _key_vars = tuple(
-                v for v in _pp.env_vars
-                if not v.endswith("_BASE_URL") and not v.endswith("_URL")
-            )
-            _base_var = next(
-                (v for v in _pp.env_vars if v.endswith("_BASE_URL") or v.endswith("_URL")),
-                None,
-            )
-            if not _key_vars:
-                continue
-            _models_url = (
-                (_pp.models_url or (_pp.base_url.rstrip("/") + "/models"))
-                if _pp.base_url else None
-            )
-            _static.append((_label, _key_vars, _models_url, _base_var, True))
-    except Exception:
-        pass
-    return _static
-
-
 def run_doctor(args):
    """Run diagnostic checks."""
    should_fix = getattr(args, 'fix', False)
@@ -336,11 +263,8 @@ def run_doctor(args):
    if env_path.exists():
        check_ok(f"{_DHH}/.env file exists")
        
-        # Check for common issues. Pin encoding to UTF-8 because .env files are
-        # written as UTF-8 everywhere in the codebase, while Path.read_text()
-        # defaults to the system locale — which crashes on non-UTF-8 Windows
-        # locales (e.g. GBK) as soon as the file contains any non-ASCII byte.
-        content = env_path.read_text(encoding="utf-8")
+        # Check for common issues
+        content = env_path.read_text()
        if _has_provider_env_config(content):
            check_ok("API key or custom endpoint configured")
        else:
@@ -1008,8 +932,6 @@ def run_doctor(args):
        agent_browser_path = PROJECT_ROOT / "node_modules" / "agent-browser"
        if agent_browser_path.exists():
            check_ok("agent-browser (Node.js)", "(browser automation)")
-        elif shutil.which("agent-browser"):
-            check_ok("agent-browser", "(browser automation)")
        else:
            if _is_termux():
                check_info("agent-browser is not installed (expected in the tested Termux path)")
@@ -1160,11 +1082,26 @@ def run_doctor(args):
    # -- API-key providers --
    # Tuple: (name, env_vars, default_url, base_env, supports_models_endpoint)
    # If supports_models_endpoint is False, we skip the health check and just show "configured"
-    # Cached at module level after first build — profiles auto-extend it.
-    global _APIKEY_PROVIDERS_CACHE
-    if _APIKEY_PROVIDERS_CACHE is None:
-        _APIKEY_PROVIDERS_CACHE = _build_apikey_providers_list()
-    _apikey_providers = _APIKEY_PROVIDERS_CACHE
+    _apikey_providers = [
+        ("Z.AI / GLM",      ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"), "https://api.z.ai/api/paas/v4/models", "GLM_BASE_URL", True),
+        ("Kimi / Moonshot",  ("KIMI_API_KEY",),                              "https://api.moonshot.ai/v1/models",   "KIMI_BASE_URL", True),
+        ("StepFun Step Plan",   ("STEPFUN_API_KEY",),                           "https://api.stepfun.ai/step_plan/v1/models", "STEPFUN_BASE_URL", True),
+        ("Kimi / Moonshot (China)", ("KIMI_CN_API_KEY",),                    "https://api.moonshot.cn/v1/models",   None, True),
+        ("Arcee AI",         ("ARCEEAI_API_KEY",),                            "https://api.arcee.ai/api/v1/models",  "ARCEE_BASE_URL", True),
+        ("GMI Cloud",        ("GMI_API_KEY",),                                "https://api.gmi-serving.com/v1/models", "GMI_BASE_URL", True),
+        ("DeepSeek",         ("DEEPSEEK_API_KEY",),                           "https://api.deepseek.com/v1/models",  "DEEPSEEK_BASE_URL", True),
+        ("Hugging Face",     ("HF_TOKEN",),                                   "https://router.huggingface.co/v1/models", "HF_BASE_URL", True),
+        ("NVIDIA NIM",       ("NVIDIA_API_KEY",),                             "https://integrate.api.nvidia.com/v1/models", "NVIDIA_BASE_URL", True),
+        ("Alibaba/DashScope", ("DASHSCOPE_API_KEY",),                         "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models", "DASHSCOPE_BASE_URL", True),
+        # MiniMax: the /anthropic endpoint doesn't support /models, but the /v1 endpoint does.
+        ("MiniMax",          ("MINIMAX_API_KEY",),                            "https://api.minimax.io/v1/models",    "MINIMAX_BASE_URL", True),
+        ("MiniMax (China)",  ("MINIMAX_CN_API_KEY",),                         "https://api.minimaxi.com/v1/models",  "MINIMAX_CN_BASE_URL", True),
+        ("Vercel AI Gateway",       ("AI_GATEWAY_API_KEY",),                          "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
+        ("Kilo Code",        ("KILOCODE_API_KEY",),                            "https://api.kilo.ai/api/gateway/models",  "KILOCODE_BASE_URL", True),
+        ("OpenCode Zen",     ("OPENCODE_ZEN_API_KEY",),                        "https://opencode.ai/zen/v1/models",  "OPENCODE_ZEN_BASE_URL", True),
+        # OpenCode Go has no shared /models endpoint; skip the health check.
+        ("OpenCode Go",      ("OPENCODE_GO_API_KEY",),                         None,                                  "OPENCODE_GO_BASE_URL", False),
+    ]
    for _pname, _env_vars, _default_url, _base_env, _supports_health_check in _apikey_providers:
        _key = ""
        for _ev in _env_vars:
@@ -1321,23 +1258,9 @@ def run_doctor(args):
        check_warn("Skills Hub directory not initialized", "(run: hermes skills list)")

    from hermes_cli.config import get_env_value
-
-    def _gh_authenticated() -> bool:
-        """Check if gh CLI is authenticated via token file or device flow."""
-        try:
-            result = subprocess.run(
-                ["gh", "auth", "status", "--json", "authenticated"],
-                capture_output=True, timeout=10,
-            )
-            return result.returncode == 0
-        except (FileNotFoundError, subprocess.TimeoutExpired):
-            return False
-
    github_token = get_env_value("GITHUB_TOKEN") or get_env_value("GH_TOKEN")
    if github_token:
        check_ok("GitHub token configured (authenticated API access)")
-    elif _gh_authenticated():
-        check_ok("GitHub authenticated via gh CLI", "(full API access — no GITHUB_TOKEN needed)")
    else:
        check_warn("No GITHUB_TOKEN", f"(60 req/hr rate limit — set in {_DHH}/.env for better rates)")

@@ -14,7 +14,6 @@ import sys
 from pathlib import Path

 from hermes_cli.config import get_hermes_home, get_env_path, get_project_root, load_config
-from hermes_cli.env_loader import load_hermes_dotenv
 from hermes_constants import display_hermes_home


@@ -196,11 +195,15 @@ def run_dump(args):
    show_keys = getattr(args, "show_keys", False)

    # Load env from .env file so key checks work
+    from dotenv import load_dotenv
    env_path = get_env_path()
-    load_hermes_dotenv(
-        hermes_home=env_path.parent,
-        project_env=get_project_root() / ".env",
-    )
+    if env_path.exists():
+        try:
+            load_dotenv(env_path, encoding="utf-8")
+        except UnicodeDecodeError:
+            load_dotenv(env_path, encoding="latin-1")
+    # Also try project .env as dev fallback
+    load_dotenv(get_project_root() / ".env", override=False, encoding="utf-8")

    project_root = get_project_root()
    hermes_home = get_hermes_home()
@@ -188,7 +188,7 @@ def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:

    SIGUSR1 is wired in gateway/run.py to ``request_restart(via_service=True)``
    which drains in-flight agent runs (up to ``agent.restart_drain_timeout``
-    seconds), then exits with code 75.  Both systemd (``Restart=always``
+    seconds), then exits with code 75.  Both systemd (``Restart=on-failure``
    + ``RestartForceExitStatus=75``) and launchd (``KeepAlive.SuccessfulExit
    = false``) relaunch the process after the graceful exit.

@@ -237,26 +237,6 @@ def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:
    return False


-def _get_ancestor_pids() -> set[int]:
-    """Return the set of PIDs in the current process's ancestor chain.
-
-    Walks from the current PID up to PID 1 (init) so that process-table scans
-    never match the calling CLI process or any of its parents.  This prevents
-    ``hermes gateway status`` from falsely counting the ``hermes`` CLI that
-    invoked it as a running gateway instance (see #13242).
-    """
-    ancestors: set[int] = set()
-    pid = os.getpid()
-    # Cap iterations to avoid infinite loops on exotic platforms.
-    for _ in range(64):
-        ancestors.add(pid)
-        parent = _get_parent_pid(pid)
-        if parent is None or parent <= 0 or parent in ancestors:
-            break
-        pid = parent
-    return ancestors
-
-
 def _append_unique_pid(pids: list[int], pid: int | None, exclude_pids: set[int]) -> None:
    if pid is None or pid <= 0:
        return
@@ -272,10 +252,6 @@ def _scan_gateway_pids(exclude_pids: set[int], all_profiles: bool = False) -> li
    a live gateway when the PID file is stale/missing, and ``--all`` sweeps can
    discover gateways outside the current profile.
    """
-    # Exclude the entire ancestor chain so the CLI process that invoked this
-    # scan (e.g. ``hermes gateway status``) is never mistaken for a running
-    # gateway.  See #13242.
-    exclude_pids = exclude_pids | _get_ancestor_pids()
    pids: list[int] = []
    patterns = [
        "hermes_cli.main gateway",
@@ -714,32 +690,6 @@ def _print_gateway_process_mismatch(snapshot: GatewayRuntimeSnapshot) -> None:
    print("  can refuse to start another copy until this process stops.")


-def _print_other_profiles_gateway_status() -> None:
-    """Print a summary of gateway status across all profiles.
-
-    Shown at the bottom of ``hermes gateway status`` output so users with
-    multiple profiles can tell at a glance which gateways are running and
-    avoid confusing another profile's process with the current one.
-    """
-    try:
-        from hermes_cli.profiles import get_active_profile_name
-
-        current = get_active_profile_name()
-        other_processes = [
-            p for p in find_profile_gateway_processes()
-            if p.profile != current
-        ]
-        if not other_processes:
-            return
-
-        print()
-        print("Other profiles:")
-        for proc in other_processes:
-            print(f"  ✓ {proc.profile:<16s} — PID {proc.pid}")
-    except Exception:
-        pass
-
-
 def kill_gateway_processes(force: bool = False, exclude_pids: set | None = None,
                           all_profiles: bool = False) -> int:
    """Kill any running gateway processes. Returns count killed.
@@ -785,12 +735,6 @@ def stop_profile_gateway() -> bool:
    if pid is None:
        return False

-    try:
-        from gateway.status import write_planned_stop_marker
-        write_planned_stop_marker(pid)
-    except Exception:
-        pass
-
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
@@ -1614,46 +1558,6 @@ def _build_user_local_paths(home: Path, path_entries: list[str]) -> list[str]:
    return [p for p in candidates if p not in path_entries and Path(p).exists()]


-def _build_wsl_interop_paths(path_entries: list[str]) -> list[str]:
-    """Return WSL Windows interop PATH entries for generated systemd units.
-
-    WSL shells normally inherit Windows PATH entries such as
-    ``/mnt/c/WINDOWS/System32``. systemd user services do not, so gateway tools
-    that call ``powershell.exe``/``cmd.exe`` work in a terminal but fail in the
-    background service unless we persist the relevant entries at install time.
-    """
-    if not is_wsl():
-        return []
-
-    candidates: list[str] = []
-    for entry in os.environ.get("PATH", "").split(os.pathsep):
-        if entry.startswith("/mnt/"):
-            candidates.append(entry)
-
-    for executable in ("powershell.exe", "cmd.exe", "explorer.exe", "wsl.exe"):
-        resolved = shutil.which(executable)
-        if resolved:
-            candidates.append(str(Path(resolved).parent))
-
-    for entry in (
-        "/mnt/c/WINDOWS/system32",
-        "/mnt/c/WINDOWS",
-        "/mnt/c/WINDOWS/System32/Wbem",
-        "/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0/",
-        "/mnt/c/WINDOWS/System32/OpenSSH/",
-    ):
-        if Path(entry).exists():
-            candidates.append(entry)
-
-    result: list[str] = []
-    seen = set(path_entries)
-    for entry in candidates:
-        if entry and entry not in seen:
-            seen.add(entry)
-            result.append(entry)
-    return result
-
-
 def _remap_path_for_user(path: str, target_home_dir: str) -> str:
    """Remap *path* from the current user's home to *target_home_dir*.

@@ -1745,14 +1649,14 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
        node_bin = _remap_path_for_user(node_bin, home_dir)
        path_entries = [_remap_path_for_user(p, home_dir) for p in path_entries]
        path_entries.extend(_build_user_local_paths(Path(home_dir), path_entries))
-        path_entries.extend(_build_wsl_interop_paths(path_entries))
        path_entries.extend(common_bin_paths)
        sane_path = ":".join(path_entries)
        return f"""[Unit]
 Description={SERVICE_DESCRIPTION}
 After=network-online.target
 Wants=network-online.target
-StartLimitIntervalSec=0
+StartLimitIntervalSec=600
+StartLimitBurst=5

 [Service]
 Type=simple
@@ -1766,10 +1670,8 @@ Environment="LOGNAME={username}"
 Environment="PATH={sane_path}"
 Environment="VIRTUAL_ENV={venv_dir}"
 Environment="HERMES_HOME={hermes_home}"
-Restart=always
-RestartSec=60
-RestartMaxDelaySec=300
-RestartSteps=5
+Restart=on-failure
+RestartSec=30
 RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
 KillMode=mixed
 KillSignal=SIGTERM
@@ -1785,14 +1687,13 @@ WantedBy=multi-user.target
    hermes_home = str(get_hermes_home().resolve())
    profile_arg = _profile_arg(hermes_home)
    path_entries.extend(_build_user_local_paths(Path.home(), path_entries))
-    path_entries.extend(_build_wsl_interop_paths(path_entries))
    path_entries.extend(common_bin_paths)
    sane_path = ":".join(path_entries)
    return f"""[Unit]
 Description={SERVICE_DESCRIPTION}
-After=network-online.target
-Wants=network-online.target
-StartLimitIntervalSec=0
+After=network.target
+StartLimitIntervalSec=600
+StartLimitBurst=5

 [Service]
 Type=simple
@@ -1801,10 +1702,8 @@ WorkingDirectory={working_dir}
 Environment="PATH={sane_path}"
 Environment="VIRTUAL_ENV={venv_dir}"
 Environment="HERMES_HOME={hermes_home}"
-Restart=always
-RestartSec=60
-RestartMaxDelaySec=300
-RestartSteps=5
+Restart=on-failure
+RestartSec=30
 RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
 KillMode=mixed
 KillSignal=SIGTERM
@@ -2019,15 +1918,6 @@ def systemd_uninstall(system: bool = False):
    print(f"✓ {_service_scope_label(system).capitalize()} service uninstalled")


-def _require_service_installed(action: str, system: bool = False) -> None:
-    unit_path = get_systemd_unit_path(system=system)
-    if not unit_path.exists():
-        scope_flag = " --system" if system else ""
-        print(f"✗ Gateway service is not installed")
-        print(f"  Run: {'sudo ' if system else ''}hermes gateway install{scope_flag}")
-        sys.exit(1)
-
-
 def systemd_start(system: bool = False):
    system = _select_systemd_scope(system)
    if system:
@@ -2037,7 +1927,6 @@ def systemd_start(system: bool = False):
        # reachable (common on fresh RHEL/Debian SSH sessions without linger).
        # Raises UserSystemdUnavailableError with a remediation message.
        _preflight_user_systemd()
-    _require_service_installed("start", system=system)
    refresh_systemd_unit_if_needed(system=system)
    _run_systemctl(["start", get_service_name()], system=system, check=True, timeout=30)
    print(f"✓ {_service_scope_label(system).capitalize()} service started")
@@ -2048,14 +1937,6 @@ def systemd_stop(system: bool = False):
    system = _select_systemd_scope(system)
    if system:
        _require_root_for_system_service("stop")
-    _require_service_installed("stop", system=system)
-    try:
-        from gateway.status import get_running_pid, write_planned_stop_marker
-        pid = get_running_pid(cleanup_stale=False)
-        if pid is not None:
-            write_planned_stop_marker(pid)
-    except Exception:
-        pass
    _run_systemctl(["stop", get_service_name()], system=system, check=True, timeout=90)
    print(f"✓ {_service_scope_label(system).capitalize()} service stopped")

@@ -2067,7 +1948,6 @@ def systemd_restart(system: bool = False):
        _require_root_for_system_service("restart")
    else:
        _preflight_user_systemd()
-    _require_service_installed("restart", system=system)
    refresh_systemd_unit_if_needed(system=system)
    from gateway.status import get_running_pid

@@ -2417,13 +2297,6 @@ def launchd_start():
 def launchd_stop():
    label = get_launchd_label()
    target = f"{_launchd_domain()}/{label}"
-    try:
-        from gateway.status import get_running_pid, write_planned_stop_marker
-        pid = get_running_pid(cleanup_stale=False)
-        if pid is not None:
-            write_planned_stop_marker(pid)
-    except Exception:
-        pass
    # bootout unloads the service definition so KeepAlive doesn't respawn
    # the process.  A plain `kill SIGTERM` only signals the process — launchd
    # immediately restarts it because KeepAlive.SuccessfulExit = false.
@@ -2566,20 +2439,6 @@ def run_gateway(verbose: int = 0, quiet: bool = False, replace: bool = False):
                 hasn't fully exited yet.
    """
    sys.path.insert(0, str(PROJECT_ROOT))
-
-    # Refresh the systemd unit definition on every boot so that restart
-    # settings (RestartSec, StartLimitIntervalSec, etc.) stay current even
-    # when the process was respawned via exit-code-75 (stale-code or
-    # /restart) rather than through `hermes gateway restart` which already
-    # calls refresh_systemd_unit_if_needed().  Without this, a code update
-    # that ships new unit settings won't take effect until the next manual
-    # `hermes gateway start/restart` — leaving the gateway vulnerable to
-    # the exact failure mode the new settings were meant to prevent.
-    if supports_systemd_services():
-        try:
-            refresh_systemd_unit_if_needed(system=False)
-        except Exception:
-            pass  # best-effort; don't block gateway startup
    
    from gateway.run import start_gateway
    
@@ -2592,7 +2451,7 @@ def run_gateway(verbose: int = 0, quiet: bool = False, replace: bool = False):
    print()
    
    # Exit with code 1 if gateway fails to connect any platform,
-    # so systemd Restart=always will retry on transient errors
+    # so systemd Restart=on-failure will retry on transient errors
    verbosity = None if quiet else verbose
    try:
        success = asyncio.run(start_gateway(replace=replace, verbosity=verbosity))
@@ -4594,9 +4453,6 @@ def _gateway_command_inner(args):
                    print("  hermes gateway install  # Install as user service")
                    print("  sudo hermes gateway install --system  # Install as boot-time system service")

-        # Show other profiles' gateway status for multi-profile awareness
-        _print_other_profiles_gateway_status()
-
    elif subcmd == "migrate-legacy":
        # Stop, disable, and remove legacy Hermes gateway unit files from
        # pre-rename installs (e.g. hermes.service). Profile units and
@@ -169,93 +169,11 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
            "or docs/hermes-kanban-v1-spec.pdf for the full design."
        ),
    )
-    # --- global --board flag ---
-    # Applies to every subcommand below. When set, scopes all reads and
-    # writes to that board's DB. When omitted, resolves via the
-    # HERMES_KANBAN_BOARD env var, then the persisted current-board
-    # file, then "default". See kanban_db.get_current_board().
-    kanban_parser.add_argument(
-        "--board",
-        default=None,
-        metavar="<slug>",
-        help=(
-            "Board slug to operate on. Defaults to the current board "
-            "(set via `hermes kanban boards switch <slug>` or the "
-            "HERMES_KANBAN_BOARD env var). Use `hermes kanban boards list` "
-            "to see all boards."
-        ),
-    )
    sub = kanban_parser.add_subparsers(dest="kanban_action")

    # --- init ---
    sub.add_parser("init", help="Create kanban.db if missing (idempotent)")

-    # --- boards (new in v2: multi-project support) ---
-    p_boards = sub.add_parser(
-        "boards",
-        help="Manage kanban boards (one board per project / workstream)",
-        description=(
-            "Boards let you separate unrelated streams of work "
-            "(projects, repos, domains) into isolated queues. Each "
-            "board has its own DB, workspaces directory, and dispatcher "
-            "loop — tasks on one board cannot collide with tasks on "
-            "another. The first board is 'default' and always exists."
-        ),
-    )
-    boards_sub = p_boards.add_subparsers(dest="boards_action")
-
-    b_list = boards_sub.add_parser(
-        "list", aliases=["ls"],
-        help="List all boards with task counts",
-    )
-    b_list.add_argument("--json", action="store_true")
-    b_list.add_argument("--all", action="store_true",
-                        help="Include archived boards too")
-
-    b_create = boards_sub.add_parser(
-        "create", aliases=["new"],
-        help="Create a new board",
-    )
-    b_create.add_argument("slug",
-                          help="Board slug (kebab-case, e.g. atm10-server)")
-    b_create.add_argument("--name", default=None,
-                          help="Human-readable display name (defaults to Title Case of slug)")
-    b_create.add_argument("--description", default=None,
-                          help="Optional description")
-    b_create.add_argument("--icon", default=None,
-                          help="Optional emoji or single-character icon for the dashboard")
-    b_create.add_argument("--color", default=None,
-                          help="Optional hex color (e.g. '#8b5cf6') for the dashboard")
-    b_create.add_argument("--switch", action="store_true",
-                          help="Switch to the new board after creating it")
-
-    b_rm = boards_sub.add_parser(
-        "rm", aliases=["remove", "delete"],
-        help="Archive (default) or delete a board",
-    )
-    b_rm.add_argument("slug")
-    b_rm.add_argument("--delete", action="store_true",
-                      help="Hard-delete the board directory instead of archiving it. "
-                           "Default is to move it to boards/_archived/ so it's recoverable.")
-
-    b_switch = boards_sub.add_parser(
-        "switch", aliases=["use"],
-        help="Set the active board for subsequent CLI calls",
-    )
-    b_switch.add_argument("slug")
-
-    boards_sub.add_parser(
-        "show", aliases=["current"],
-        help="Print the currently-active board slug",
-    )
-
-    b_rename = boards_sub.add_parser(
-        "rename",
-        help="Change a board's human-readable display name (slug is immutable)",
-    )
-    b_rename.add_argument("slug")
-    b_rename.add_argument("name", help="New display name")
-
    # --- create ---
    p_create = sub.add_parser("create", help="Create a new task")
    p_create.add_argument("title", help="Task title")
@@ -308,57 +226,6 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
    p_assign.add_argument("task_id")
    p_assign.add_argument("profile", help="Profile name (or 'none' to unassign)")

-    # --- reclaim / reassign (recovery) ---
-    p_reclaim = sub.add_parser(
-        "reclaim",
-        help="Release an active worker claim on a running task",
-    )
-    p_reclaim.add_argument("task_id")
-    p_reclaim.add_argument(
-        "--reason", default=None,
-        help="Human-readable reason (recorded on the reclaimed event)",
-    )
-
-    p_reassign = sub.add_parser(
-        "reassign",
-        help="Reassign a task to a different profile, optionally reclaiming first",
-    )
-    p_reassign.add_argument("task_id")
-    p_reassign.add_argument(
-        "profile",
-        help="New profile name (or 'none' to unassign)",
-    )
-    p_reassign.add_argument(
-        "--reclaim", action="store_true",
-        help="Release any active claim before reassigning (required if task is running)",
-    )
-    p_reassign.add_argument(
-        "--reason", default=None,
-        help="Human-readable reason (recorded on the reclaimed event)",
-    )
-
-    # --- diagnostics (board-wide health) ---
-    p_diag = sub.add_parser(
-        "diagnostics",
-        aliases=["diag"],
-        help="List active diagnostics on the current board",
-    )
-    p_diag.add_argument(
-        "--severity",
-        choices=["warning", "error", "critical"],
-        default=None,
-        help="Only show diagnostics at or above this severity",
-    )
-    p_diag.add_argument(
-        "--task",
-        default=None,
-        help="Only show diagnostics for one task id",
-    )
-    p_diag.add_argument(
-        "--json", action="store_true",
-        help="Emit JSON (structured) instead of the default human table",
-    )
-
    # --- link / unlink ---
    p_link = sub.add_parser("link", help="Add a parent->child dependency")
    p_link.add_argument("parent_id")
@@ -394,27 +261,6 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
                            help='JSON dict of structured facts (e.g. \'{"changed_files": [...], '
                                 '"tests_run": 12}\'). Stored on the closing run.')

-    p_edit = sub.add_parser(
-        "edit",
-        help="Edit recovery fields on an already-completed task",
-    )
-    p_edit.add_argument("task_id")
-    p_edit.add_argument(
-        "--result",
-        required=True,
-        help="Backfilled task result text for a done task",
-    )
-    p_edit.add_argument(
-        "--summary",
-        default=None,
-        help="Structured handoff summary. Falls back to --result if omitted.",
-    )
-    p_edit.add_argument(
-        "--metadata",
-        default=None,
-        help="JSON dict of structured facts to store on the latest completed run.",
-    )
-
    p_block = sub.add_parser("block", help="Mark one or more tasks blocked")
    p_block.add_argument("task_id")
    p_block.add_argument("reason", nargs="*", help="Reason (also appended as a comment)")
@@ -520,7 +366,7 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
    # --- log ---
    p_log = sub.add_parser(
        "log",
-        help="Print the worker log for a task (from <kanban-root>/kanban/logs/)",
+        help="Print the worker log for a task (from $HERMES_HOME/kanban/logs/)",
    )
    p_log.add_argument("task_id")
    p_log.add_argument("--tail", type=int, default=None,
@@ -596,38 +442,6 @@ def kanban_command(args: argparse.Namespace) -> int:
            )
        return 0

-    # `--board <slug>` applies to every subcommand below by way of an
-    # env-var pin for the duration of this call. Using HERMES_KANBAN_BOARD
-    # (rather than threading `board=` through 50+ kb.connect() sites)
-    # keeps the patch small and inherits the exact same resolution the
-    # dispatcher uses for workers — consistency is a feature here.
-    board_override = getattr(args, "board", None)
-    if board_override:
-        try:
-            normed = kb._normalize_board_slug(board_override)
-        except ValueError as exc:
-            print(f"kanban: {exc}", file=sys.stderr)
-            return 2
-        if not normed:
-            print("kanban: --board requires a slug", file=sys.stderr)
-            return 2
-        # Boards other than 'default' must already exist — typoed slugs
-        # would otherwise silently create an empty board.
-        if normed != kb.DEFAULT_BOARD and not kb.board_exists(normed):
-            print(
-                f"kanban: board {normed!r} does not exist. "
-                f"Create it with `hermes kanban boards create {normed}`.",
-                file=sys.stderr,
-            )
-            return 1
-        os.environ["HERMES_KANBAN_BOARD"] = normed
-
-    # Boards management doesn't touch the DB at all — dispatch early so
-    # fresh installs that haven't initialized any DB can still use
-    # `hermes kanban boards create …`.
-    if action == "boards":
-        return _dispatch_boards(args)
-
    # Auto-initialize the DB before dispatching any subcommand. init_db
    # is idempotent, so running it every invocation is cheap (one
    # SELECT against sqlite_master when tables already exist) and
@@ -648,16 +462,11 @@ def kanban_command(args: argparse.Namespace) -> int:
        "ls":       _cmd_list,
        "show":     _cmd_show,
        "assign":   _cmd_assign,
-        "reclaim":  _cmd_reclaim,
-        "reassign": _cmd_reassign,
-        "diagnostics": _cmd_diagnostics,
-        "diag":     _cmd_diagnostics,
        "link":     _cmd_link,
        "unlink":   _cmd_unlink,
        "claim":    _cmd_claim,
        "comment":  _cmd_comment,
        "complete": _cmd_complete,
-        "edit":     _cmd_edit,
        "block":    _cmd_block,
        "unblock":  _cmd_unblock,
        "archive":  _cmd_archive,
@@ -704,185 +513,6 @@ def _profile_author() -> str:
        return "user"


-# ---------------------------------------------------------------------------
-# Boards management (hermes kanban boards …)
-# ---------------------------------------------------------------------------
-
-def _dispatch_boards(args: argparse.Namespace) -> int:
-    """Handle ``hermes kanban boards <action>``.
-
-    Boards management is deliberately separate from the task-level
-    commands: it operates on the filesystem (board directories,
-    ``current`` pointer, ``board.json``), not on the per-board SQLite
-    DB, so a fresh HERMES_HOME that has never called ``kanban init``
-    can still run ``boards create`` / ``boards list``.
-    """
-    sub = getattr(args, "boards_action", None) or "list"
-    if sub in ("list", "ls"):
-        return _cmd_boards_list(args)
-    if sub in ("create", "new"):
-        return _cmd_boards_create(args)
-    if sub in ("rm", "remove", "delete"):
-        return _cmd_boards_rm(args)
-    if sub in ("switch", "use"):
-        return _cmd_boards_switch(args)
-    if sub in ("show", "current"):
-        return _cmd_boards_show(args)
-    if sub == "rename":
-        return _cmd_boards_rename(args)
-    print(f"kanban boards: unknown action {sub!r}", file=sys.stderr)
-    return 2
-
-
-def _board_task_counts(slug: str) -> dict[str, int]:
-    """Return ``{status: count}`` for a board. Safe to call on an empty DB."""
-    try:
-        path = kb.kanban_db_path(board=slug)
-        if not path.exists():
-            return {}
-        with kb.connect(board=slug) as conn:
-            rows = conn.execute(
-                "SELECT status, COUNT(*) AS n FROM tasks GROUP BY status"
-            ).fetchall()
-        return {r["status"]: int(r["n"]) for r in rows}
-    except Exception:
-        return {}
-
-
-def _cmd_boards_list(args: argparse.Namespace) -> int:
-    include_archived = bool(getattr(args, "all", False))
-    boards = kb.list_boards(include_archived=include_archived)
-    # Enrich each entry with task counts + whether it's the current board.
-    current = kb.get_current_board()
-    for b in boards:
-        b["is_current"] = (b["slug"] == current)
-        b["counts"] = _board_task_counts(b["slug"])
-        b["total"] = sum(b["counts"].values())
-    if getattr(args, "json", False):
-        print(json.dumps(boards, indent=2, ensure_ascii=False))
-        return 0
-    # Human table: marker (•) for current, slug, display name, counts.
-    if not boards:
-        print("(no boards — create one with `hermes kanban boards create <slug>`)")
-        return 0
-    print(f"{'':2s}  {'SLUG':24s}  {'NAME':28s}  COUNTS")
-    for b in boards:
-        marker = "●" if b["is_current"] else " "
-        counts = b["counts"] or {}
-        counts_str = (
-            ", ".join(f"{k}={v}" for k, v in sorted(counts.items()))
-            or "(empty)"
-        )
-        name = b.get("name") or ""
-        if b.get("archived"):
-            name += " [archived]"
-        print(f"{marker:2s}  {b['slug']:24s}  {name:28s}  {counts_str}")
-    print()
-    print(f"Current board: {current}")
-    if len(boards) > 1:
-        print("Switch boards with `hermes kanban boards switch <slug>`.")
-    return 0
-
-
-def _cmd_boards_create(args: argparse.Namespace) -> int:
-    try:
-        normed = kb._normalize_board_slug(args.slug)
-    except ValueError as exc:
-        print(f"kanban boards create: {exc}", file=sys.stderr)
-        return 2
-    if not normed:
-        print("kanban boards create: slug is required", file=sys.stderr)
-        return 2
-    already = kb.board_exists(normed) and normed != kb.DEFAULT_BOARD
-    meta = kb.create_board(
-        normed,
-        name=args.name,
-        description=args.description,
-        icon=args.icon,
-        color=args.color,
-    )
-    verb = "already exists" if already else "created"
-    print(f"Board {meta['slug']!r} {verb}.")
-    print(f"  Display name: {meta.get('name', '')}")
-    print(f"  DB path:      {meta['db_path']}")
-    if getattr(args, "switch", False):
-        kb.set_current_board(meta["slug"])
-        print(f"  Switched to {meta['slug']!r}.")
-    else:
-        print(f"  Use `hermes kanban boards switch {meta['slug']}` to make it current.")
-    return 0
-
-
-def _cmd_boards_rm(args: argparse.Namespace) -> int:
-    try:
-        res = kb.remove_board(args.slug, archive=not getattr(args, "delete", False))
-    except ValueError as exc:
-        print(f"kanban boards rm: {exc}", file=sys.stderr)
-        return 1
-    if res["action"] == "archived":
-        print(f"Board {res['slug']!r} archived → {res['new_path']}")
-        print("Recover by moving the directory back to "
-              "<root>/kanban/boards/<slug>/.")
-    else:
-        print(f"Board {res['slug']!r} deleted.")
-    return 0
-
-
-def _cmd_boards_switch(args: argparse.Namespace) -> int:
-    try:
-        normed = kb._normalize_board_slug(args.slug)
-    except ValueError as exc:
-        print(f"kanban boards switch: {exc}", file=sys.stderr)
-        return 2
-    if not normed:
-        print("kanban boards switch: slug is required", file=sys.stderr)
-        return 2
-    if not kb.board_exists(normed):
-        print(
-            f"kanban boards switch: board {normed!r} does not exist. "
-            f"Create it with `hermes kanban boards create {normed}`.",
-            file=sys.stderr,
-        )
-        return 1
-    kb.set_current_board(normed)
-    print(f"Active board is now {normed!r}.")
-    return 0
-
-
-def _cmd_boards_show(args: argparse.Namespace) -> int:
-    current = kb.get_current_board()
-    meta = kb.read_board_metadata(current)
-    counts = _board_task_counts(current)
-    total = sum(counts.values())
-    print(f"Current board: {current}")
-    print(f"  Display name: {meta.get('name', '')}")
-    if meta.get("description"):
-        print(f"  Description:  {meta['description']}")
-    print(f"  DB path:      {meta['db_path']}")
-    print(f"  Tasks:        {total} total"
-          + (f" ({', '.join(f'{k}={v}' for k, v in sorted(counts.items()))})"
-             if counts else ""))
-    return 0
-
-
-def _cmd_boards_rename(args: argparse.Namespace) -> int:
-    try:
-        normed = kb._normalize_board_slug(args.slug)
-    except ValueError as exc:
-        print(f"kanban boards rename: {exc}", file=sys.stderr)
-        return 2
-    if not normed or not kb.board_exists(normed):
-        print(f"kanban boards rename: board {args.slug!r} does not exist",
-              file=sys.stderr)
-        return 1
-    meta = kb.write_board_metadata(normed, name=args.name)
-    print(f"Board {normed!r} renamed to {meta['name']!r}.")
-    return 0
-
-
-# ---------------------------------------------------------------------------
-
-
 def _parse_duration(val) -> Optional[int]:
    """Parse ``30s`` / ``5m`` / ``2h`` / ``1d`` or a raw integer → seconds.

@@ -1032,21 +662,6 @@ def _cmd_list(args: argparse.Namespace) -> int:
    if getattr(args, "json", False):
        print(json.dumps([_task_to_dict(t) for t in tasks], indent=2, ensure_ascii=False))
        return 0
-    # Passive discoverability: when the user has multiple boards, surface
-    # which one they're looking at in the list header. Single-board users
-    # never see this — the feature stays invisible until you opt in.
-    try:
-        all_boards = kb.list_boards(include_archived=False)
-    except Exception:
-        all_boards = []
-    if len(all_boards) > 1:
-        current = kb.get_current_board()
-        other_count = len(all_boards) - 1
-        print(
-            f"Board: {current} "
-            f"({other_count} other board{'s' if other_count != 1 else ''} — "
-            f"`hermes kanban boards list`)\n"
-        )
    if not tasks:
        print("(no matching tasks)")
        return 0
@@ -1115,31 +730,6 @@ def _cmd_show(args: argparse.Namespace) -> int:
    if task.skills:
        print(f"  skills:    {', '.join(task.skills)}")
    print(f"  created:   {_fmt_ts(task.created_at)} by {task.created_by or '-'}")
-
-    # Diagnostics section — surface active distress signals at the top
-    # of show output so CLI users see them before scrolling through
-    # comments / runs.
-    from hermes_cli import kanban_diagnostics as kd
-    diags = kd.compute_task_diagnostics(task, events, runs)
-    if diags:
-        sev_marker = {"warning": "⚠", "error": "!!", "critical": "!!!"}
-        print(f"\n  Diagnostics ({len(diags)}):")
-        for d in diags:
-            print(f"    {sev_marker.get(d.severity, '?')} [{d.severity}] {d.title}")
-            if d.data:
-                bits = []
-                for k, v in d.data.items():
-                    if isinstance(v, list):
-                        bits.append(f"{k}={','.join(str(x) for x in v)}")
-                    else:
-                        bits.append(f"{k}={v}")
-                if bits:
-                    print(f"       data: {' | '.join(bits)}")
-            # Only show suggested actions in show output to keep it tight;
-            # full list is available via `kanban diagnostics --task <id>`.
-            for a in d.actions:
-                if a.suggested:
-                    print(f"       → {a.label}")
    if task.started_at:
        print(f"  started:   {_fmt_ts(task.started_at)}")
    if task.completed_at:
@@ -1197,167 +787,6 @@ def _cmd_assign(args: argparse.Namespace) -> int:
    return 0


-def _cmd_reclaim(args: argparse.Namespace) -> int:
-    with kb.connect() as conn:
-        ok = kb.reclaim_task(
-            conn, args.task_id,
-            reason=getattr(args, "reason", None),
-        )
-    if not ok:
-        print(
-            f"cannot reclaim {args.task_id} (not running or unknown id)",
-            file=sys.stderr,
-        )
-        return 1
-    print(f"Reclaimed {args.task_id}")
-    return 0
-
-
-def _cmd_reassign(args: argparse.Namespace) -> int:
-    profile = None if args.profile.lower() in ("none", "-", "null") else args.profile
-    with kb.connect() as conn:
-        ok = kb.reassign_task(
-            conn, args.task_id, profile,
-            reclaim_first=bool(getattr(args, "reclaim", False)),
-            reason=getattr(args, "reason", None),
-        )
-    if not ok:
-        print(
-            f"cannot reassign {args.task_id} "
-            f"(unknown id, or still running — pass --reclaim to release first)",
-            file=sys.stderr,
-        )
-        return 1
-    print(
-        f"Reassigned {args.task_id} to "
-        f"{profile or '(unassigned)'}"
-        + (" (claim reclaimed)" if getattr(args, "reclaim", False) else "")
-    )
-    return 0
-
-
-def _cmd_diagnostics(args: argparse.Namespace) -> int:
-    """List active diagnostics on the board. Wraps the same rule engine
-    the dashboard uses, so CLI output matches what the UI shows.
-    """
-    from hermes_cli import kanban_diagnostics as kd
-
-    with kb.connect() as conn:
-        # Either one-task mode or fleet mode.
-        if getattr(args, "task", None):
-            task = kb.get_task(conn, args.task)
-            if task is None:
-                print(f"no such task: {args.task}", file=sys.stderr)
-                return 1
-            diags_by_task = {
-                args.task: kd.compute_task_diagnostics(
-                    task,
-                    kb.list_events(conn, args.task),
-                    kb.list_runs(conn, args.task),
-                )
-            }
-        else:
-            # Fleet mode: pull all non-archived tasks + their events/runs.
-            rows = list(conn.execute(
-                "SELECT * FROM tasks WHERE status != 'archived'"
-            ).fetchall())
-            ids = [r["id"] for r in rows]
-            if not ids:
-                diags_by_task = {}
-            else:
-                placeholders = ",".join(["?"] * len(ids))
-                ev_by = {i: [] for i in ids}
-                for row in conn.execute(
-                    f"SELECT * FROM task_events WHERE task_id IN ({placeholders}) ORDER BY id",
-                    tuple(ids),
-                ):
-                    ev_by.setdefault(row["task_id"], []).append(row)
-                run_by = {i: [] for i in ids}
-                for row in conn.execute(
-                    f"SELECT * FROM task_runs WHERE task_id IN ({placeholders}) ORDER BY id",
-                    tuple(ids),
-                ):
-                    run_by.setdefault(row["task_id"], []).append(row)
-                diags_by_task = {}
-                for r in rows:
-                    tid = r["id"]
-                    dl = kd.compute_task_diagnostics(r, ev_by.get(tid, []), run_by.get(tid, []))
-                    if dl:
-                        diags_by_task[tid] = dl
-
-        # Severity filter.
-        sev = getattr(args, "severity", None)
-        if sev:
-            for tid in list(diags_by_task.keys()):
-                kept = [d for d in diags_by_task[tid] if d.severity == sev]
-                if kept:
-                    diags_by_task[tid] = kept
-                else:
-                    del diags_by_task[tid]
-
-        # Map task_id → title/status/assignee for the table output.
-        meta: dict[str, dict] = {}
-        if diags_by_task:
-            placeholders = ",".join(["?"] * len(diags_by_task))
-            for r in conn.execute(
-                f"SELECT id, title, status, assignee FROM tasks WHERE id IN ({placeholders})",
-                tuple(diags_by_task.keys()),
-            ):
-                meta[r["id"]] = {
-                    "title": r["title"], "status": r["status"],
-                    "assignee": r["assignee"],
-                }
-
-    if getattr(args, "json", False):
-        out_json = [
-            {
-                "task_id": tid,
-                **meta.get(tid, {}),
-                "diagnostics": [d.to_dict() for d in dl],
-            }
-            for tid, dl in diags_by_task.items()
-        ]
-        print(json.dumps(out_json, indent=2, ensure_ascii=False))
-        return 0
-
-    if not diags_by_task:
-        print("No active diagnostics on this board.")
-        return 0
-
-    # Human-readable summary: grouped by task, severity-marked, with
-    # suggested actions inline.
-    sev_marker = {"warning": "⚠", "error": "!!", "critical": "!!!"}
-    total = sum(len(dl) for dl in diags_by_task.values())
-    print(
-        f"{total} active diagnostic(s) across "
-        f"{len(diags_by_task)} task(s):\n"
-    )
-    for tid, dl in diags_by_task.items():
-        m = meta.get(tid, {})
-        title = m.get("title") or "(untitled)"
-        status = m.get("status") or "?"
-        assignee = m.get("assignee") or "(unassigned)"
-        print(f"  {tid}  {status:8s}  @{assignee:18s}  {title}")
-        for d in dl:
-            print(f"    {sev_marker.get(d.severity, '?')} [{d.severity}] {d.kind}: {d.title}")
-            if d.data:
-                # Compact key:value pairs on one line.
-                bits = []
-                for k, v in d.data.items():
-                    if isinstance(v, list):
-                        bits.append(f"{k}={','.join(str(x) for x in v)}")
-                    else:
-                        bits.append(f"{k}={v}")
-                if bits:
-                    print(f"       data: {' | '.join(bits)}")
-            # Suggested actions first.
-            for a in d.actions:
-                if a.suggested:
-                    print(f"       → {a.label}")
-        print()
-    return 0
-
-
 def _cmd_link(args: argparse.Namespace) -> int:
    with kb.connect() as conn:
        kb.link_tasks(conn, args.parent_id, args.child_id)
@@ -1450,34 +879,6 @@ def _cmd_complete(args: argparse.Namespace) -> int:
    return 0 if not failed else 1


-def _cmd_edit(args: argparse.Namespace) -> int:
-    raw_meta = getattr(args, "metadata", None)
-    metadata = None
-    if raw_meta:
-        try:
-            metadata = json.loads(raw_meta)
-            if not isinstance(metadata, dict):
-                raise ValueError("must be a JSON object")
-        except (ValueError, json.JSONDecodeError) as exc:
-            print(f"kanban: --metadata: {exc}", file=sys.stderr)
-            return 2
-    with kb.connect() as conn:
-        if not kb.edit_completed_task_result(
-            conn,
-            args.task_id,
-            result=args.result,
-            summary=getattr(args, "summary", None),
-            metadata=metadata,
-        ):
-            print(
-                f"cannot edit {args.task_id} (unknown id or task is not done)",
-                file=sys.stderr,
-            )
-            return 1
-    print(f"Edited {args.task_id}")
-    return 0
-
-
 def _cmd_block(args: argparse.Namespace) -> int:
    reason = " ".join(args.reason).strip() if args.reason else None
    author = _profile_author()
@@ -1565,7 +966,6 @@ def _cmd_dispatch(args: argparse.Namespace) -> int:
                for (tid, who, ws) in res.spawned
            ],
            "skipped_unassigned": res.skipped_unassigned,
-            "skipped_nonspawnable": res.skipped_nonspawnable,
        }, indent=2))
        return 0
    print(f"Reclaimed:    {res.reclaimed}")
@@ -1585,11 +985,6 @@ def _cmd_dispatch(args: argparse.Namespace) -> int:
        print(f"  - {tid}  ->  {who}  @ {ws or '-'}{tag}")
    if res.skipped_unassigned:
        print(f"Skipped (unassigned): {', '.join(res.skipped_unassigned)}")
-    if res.skipped_nonspawnable:
-        print(
-            f"Skipped (non-spawnable assignee — terminal lane, OK): "
-            f"{', '.join(res.skipped_nonspawnable)}"
-        )
    return 0


@@ -1701,18 +1096,16 @@ def _cmd_daemon(args: argparse.Namespace) -> int:
            )

    def _ready_queue_nonempty() -> bool:
-        """Cheap probe — is there at least one ready+assigned+unclaimed
-        task whose assignee maps to a real Hermes profile (i.e. one the
-        dispatcher would actually try to spawn for)?
-
-        Filters out tasks assigned to control-plane lanes
-        (e.g. ``orion-cc``, ``orion-research``) that are pulled by
-        terminals via ``claim_task`` directly — those are correctly idle
-        from the dispatcher's perspective, not stuck.
-        """
+        """Cheap SELECT — just asks whether there's at least one ready
+        task with an assignee that the dispatcher could have picked up."""
        try:
            with kb.connect() as conn:
-                return kb.has_spawnable_ready(conn)
+                row = conn.execute(
+                    "SELECT 1 FROM tasks "
+                    "WHERE status = 'ready' AND assignee IS NOT NULL "
+                    "    AND claim_lock IS NULL LIMIT 1"
+                ).fetchone()
+                return row is not None
        except Exception:
            return False

@@ -1,570 +0,0 @@
-"""Kanban diagnostics — structured, actionable distress signals for tasks.
-
-A ``Diagnostic`` is a machine-readable description of something that's wrong
-with a kanban task: a hallucinated card id, a spawn crash-loop, a task
-stuck blocked for too long, etc. Each one carries:
-
-* A **kind** (canonical code; UI/tests match on this).
-* A **severity** (``warning`` / ``error`` / ``critical``).
-* A **title** (one-line human description) and **detail** (longer text).
-* A list of **suggested actions** — structured entries the dashboard
-  turns into buttons and the CLI turns into hints.
-
-Rules run over (task, recent events, recent runs) and emit diagnostics.
-They are stateless and read-only — no DB writes. Callers compute
-diagnostics on demand (on ``/board`` load, ``/tasks/:id`` fetch, or
-``hermes kanban diagnostics``).
-
-Design goals:
-
-* Fixable-on-the-operator's-side signals only (missing config, phantom
-  ids, crash loop). Not "the provider returned 502 once" — that's a
-  transient runtime blip, not a diagnostic.
-* Recoverable: every diagnostic comes with at least one suggested
-  recovery action the operator can actually take from the UI.
-* Auto-clearing: when the underlying failure mode resolves (a clean
-  ``completed`` event arrives, a spawn succeeds, the task gets
-  unblocked), the diagnostic stops firing. The audit event trail stays.
-"""
-
-from __future__ import annotations
-
-from dataclasses import dataclass, field
-from typing import Any, Callable, Iterable, Optional
-import json
-import time
-
-
-# Severity rungs, ordered least → most urgent. The UI colors them
-# amber (warning), orange (error), red (critical). Sorted outputs put
-# critical first so operators see the worst fires at the top.
-SEVERITY_ORDER = ("warning", "error", "critical")
-
-
-@dataclass
-class DiagnosticAction:
-    """A single recovery action attached to a diagnostic.
-
-    The ``kind`` determines how both the UI and CLI render it:
-
-    * ``reclaim`` / ``reassign`` — POST to the matching /tasks/:id/*
-      endpoint; dashboard wires into the existing recovery popover.
-    * ``unblock`` — PATCH status back to ``ready`` (for stuck-blocked
-      diagnostics).
-    * ``cli_hint`` — print/copy a shell command (e.g.
-      ``hermes -p <profile> auth``). No HTTP side effect.
-    * ``open_docs`` — deep-link to the docs URL named in ``payload.url``.
-    * ``comment`` — nudge the operator to add a comment (for
-      stuck-blocked tasks that need human input).
-
-    ``suggested=True`` marks the action as the recommended first step;
-    the UI highlights it. Multiple actions can be suggested if they're
-    equally valid.
-    """
-
-    kind: str
-    label: str
-    payload: dict = field(default_factory=dict)
-    suggested: bool = False
-
-    def to_dict(self) -> dict:
-        return {
-            "kind": self.kind,
-            "label": self.label,
-            "payload": self.payload,
-            "suggested": self.suggested,
-        }
-
-
-@dataclass
-class Diagnostic:
-    """One active distress signal on a task."""
-
-    kind: str
-    severity: str  # "warning" | "error" | "critical"
-    title: str
-    detail: str
-    actions: list[DiagnosticAction] = field(default_factory=list)
-    first_seen_at: int = 0
-    last_seen_at: int = 0
-    count: int = 1
-    # Optional: the run id this diagnostic is scoped to. None = task-wide.
-    run_id: Optional[int] = None
-    # Optional structured payload for the UI (phantom ids, failure count).
-    data: dict = field(default_factory=dict)
-
-    def to_dict(self) -> dict:
-        return {
-            "kind": self.kind,
-            "severity": self.severity,
-            "title": self.title,
-            "detail": self.detail,
-            "actions": [a.to_dict() for a in self.actions],
-            "first_seen_at": self.first_seen_at,
-            "last_seen_at": self.last_seen_at,
-            "count": self.count,
-            "run_id": self.run_id,
-            "data": self.data,
-        }
-
-
-# ---------------------------------------------------------------------------
-# Rule helpers
-# ---------------------------------------------------------------------------
-
-def _task_field(task, name, default=None):
-    """Read a field from a task regardless of representation.
-
-    Callers pass sqlite3.Row (dict-like with [] but no attribute
-    access), kanban_db.Task dataclasses (attribute access), or plain
-    dicts (both). This normalises them so rule functions don't have
-    to branch on type each time.
-    """
-    if task is None:
-        return default
-    # sqlite Row + plain dicts both support mapping access; Row also
-    # supports .keys().
-    try:
-        # Row raises IndexError if the key isn't a column in the query;
-        # dicts return default via .get. Handle both.
-        if hasattr(task, "keys") and name in task.keys():
-            return task[name]
-    except Exception:
-        pass
-    if isinstance(task, dict):
-        return task.get(name, default)
-    return getattr(task, name, default)
-
-
-def _parse_payload(ev) -> dict:
-    """Tolerate event.payload being either a dict or a JSON string."""
-    p = _task_field(ev, "payload", None)
-    if p is None:
-        return {}
-    if isinstance(p, dict):
-        return p
-    if isinstance(p, str):
-        try:
-            return json.loads(p) or {}
-        except Exception:
-            return {}
-    return {}
-
-
-def _event_kind(ev) -> str:
-    return _task_field(ev, "kind", "") or ""
-
-
-def _event_ts(ev) -> int:
-    t = _task_field(ev, "created_at", 0)
-    return int(t or 0)
-
-
-def _active_hallucination_events(
-    events: Iterable[Any],
-    kind: str,
-) -> list[Any]:
-    """Return events of ``kind`` that have no ``completed``/``edited``
-    event *strictly after* them. Walks chronologically: each clean
-    event resets the accumulator; each matching event gets appended.
-
-    Events must be sorted by id (i.e. arrival order); callers pass the
-    task's full event list which the DB already returns in that order.
-    """
-    # Events arrive sorted by id asc (chronological). Walk once, track
-    # which hallucination events are still "active" (no clean event
-    # supersedes them).
-    active: list[Any] = []
-    for ev in events:
-        k = _event_kind(ev)
-        if k in ("completed", "edited"):
-            active.clear()
-        elif k == kind:
-            active.append(ev)
-    return active
-
-
-def _latest_clean_event_ts(events: Iterable[Any]) -> int:
-    """Timestamp of the most recent clean completion / edit event.
-
-    Kept for general "has this task ever been successfully completed"
-    lookups; hallucination rules use ``_active_hallucination_events``
-    instead because they need strict ordering.
-    """
-    latest = 0
-    for ev in events:
-        if _event_kind(ev) in ("completed", "edited"):
-            t = _event_ts(ev)
-            if t > latest:
-                latest = t
-    return latest
-
-
-# Standard always-available actions. Every diagnostic can offer these as
-# fallbacks regardless of kind — they're the two baseline recovery
-# primitives the kernel supports.
-def _generic_recovery_actions(task: Any, *, running: bool) -> list[DiagnosticAction]:
-    out: list[DiagnosticAction] = []
-    if running:
-        out.append(DiagnosticAction(
-            kind="reclaim",
-            label="Reclaim task",
-            payload={},
-        ))
-    out.append(DiagnosticAction(
-        kind="reassign",
-        label="Reassign to different profile",
-        payload={"reclaim_first": running},
-    ))
-    return out
-
-
-# ---------------------------------------------------------------------------
-# Rule implementations
-# ---------------------------------------------------------------------------
-
-# Each rule takes (task, events, runs, now_ts, config) and returns
-# zero or more Diagnostic instances. ``events`` / ``runs`` are lists of
-# kanban_db.Event / kanban_db.Run (or plain dicts matching the same
-# shape — for test convenience).
-
-RuleFn = Callable[[Any, list[Any], list[Any], int, dict], list[Diagnostic]]
-
-
-def _rule_hallucinated_cards(task, events, runs, now, cfg) -> list[Diagnostic]:
-    """Blocked-hallucination gate fires: a worker called kanban_complete
-    with created_cards that didn't exist or weren't created by the
-    completing profile. Task stayed in its prior state; the operator
-    needs to decide how to proceed.
-
-    Auto-clears when a successful completion (or edit) follows the
-    blocked event.
-    """
-    hits = _active_hallucination_events(events, "completion_blocked_hallucination")
-    if not hits:
-        return []
-    phantom_ids: list[str] = []
-    first = _event_ts(hits[0])
-    last = _event_ts(hits[-1])
-    for ev in hits:
-        payload = _parse_payload(ev)
-        for pid in payload.get("phantom_cards", []) or []:
-            if pid not in phantom_ids:
-                phantom_ids.append(pid)
-    running = _task_field(task, "status") == "running"
-    actions: list[DiagnosticAction] = []
-    actions.append(DiagnosticAction(
-        kind="comment",
-        label="Add a comment explaining what to do",
-        suggested=False,
-    ))
-    actions.extend(_generic_recovery_actions(task, running=running))
-    return [Diagnostic(
-        kind="hallucinated_cards",
-        severity="error",
-        title="Worker claimed cards that don't exist",
-        detail=(
-            f"The completing worker declared created_cards that either didn't "
-            f"exist or weren't created by its profile. The completion was "
-            f"blocked and the task stayed in its prior state. "
-            f"Usually means the worker hallucinated ids instead of capturing "
-            f"return values from kanban_create."
-        ),
-        actions=actions,
-        first_seen_at=first,
-        last_seen_at=last,
-        count=len(hits),
-        data={"phantom_ids": phantom_ids},
-    )]
-
-
-def _rule_prose_phantom_refs(task, events, runs, now, cfg) -> list[Diagnostic]:
-    """Advisory prose-scan: the completion summary mentions ``t_<hex>``
-    ids that don't resolve. Non-blocking; surfaced as a warning only.
-
-    Auto-clears when a fresh clean completion arrives AFTER the
-    suspected event.
-    """
-    hits = _active_hallucination_events(events, "suspected_hallucinated_references")
-    if not hits:
-        return []
-    phantom_refs: list[str] = []
-    for ev in hits:
-        for pid in _parse_payload(ev).get("phantom_refs", []) or []:
-            if pid not in phantom_refs:
-                phantom_refs.append(pid)
-    running = _task_field(task, "status") == "running"
-    return [Diagnostic(
-        kind="prose_phantom_refs",
-        severity="warning",
-        title="Completion summary references unknown task ids",
-        detail=(
-            "The completion summary mentions task ids that don't resolve "
-            "in this board's database. The completion itself succeeded, "
-            "but downstream consumers parsing the summary may be pointed "
-            "at cards that never existed."
-        ),
-        actions=_generic_recovery_actions(task, running=running),
-        first_seen_at=_event_ts(hits[0]),
-        last_seen_at=_event_ts(hits[-1]),
-        count=len(hits),
-        data={"phantom_refs": phantom_refs},
-    )]
-
-
-def _rule_repeated_spawn_failures(task, events, runs, now, cfg) -> list[Diagnostic]:
-    """Task's ``spawn_failures`` counter is climbing — worker can't
-    even start. Usually a profile misconfiguration (missing config.yaml,
-    bad PATH/venv, wrong credentials).
-
-    Threshold: cfg["spawn_failure_threshold"] (default 3).
-    """
-    threshold = int(cfg.get("spawn_failure_threshold", 3))
-    failures = _task_field(task, "spawn_failures", 0)
-    if failures is None or failures < threshold:
-        return []
-    last_err = _task_field(task, "last_spawn_error")
-    assignee = _task_field(task, "assignee")
-    actions: list[DiagnosticAction] = []
-    if assignee and assignee != "default":
-        actions.append(DiagnosticAction(
-            kind="cli_hint",
-            label=f"Verify profile: hermes -p {assignee} doctor",
-            payload={"command": f"hermes -p {assignee} doctor"},
-            suggested=True,
-        ))
-        actions.append(DiagnosticAction(
-            kind="cli_hint",
-            label=f"Fix profile auth: hermes -p {assignee} auth",
-            payload={"command": f"hermes -p {assignee} auth"},
-        ))
-    actions.extend(_generic_recovery_actions(task, running=False))
-    severity = "critical" if failures >= threshold * 2 else "error"
-    err_text = (last_err or "").strip() if last_err else ""
-    err_snippet = err_text[:500] + ("…" if len(err_text) > 500 else "") if err_text else ""
-    if err_snippet:
-        title = f"Agent spawn failed {failures}x: {err_snippet.splitlines()[0][:160]}"
-        detail = (
-            f"The dispatcher tried to launch a worker {failures} times "
-            f"and failed every time. Full last error:\n\n{err_snippet}\n\n"
-            f"Common causes: missing config.yaml, bad venv/PATH, or "
-            f"missing credentials for the profile's configured provider."
-        )
-    else:
-        title = f"Agent spawn failed {failures}x (no error recorded)"
-        detail = (
-            f"The dispatcher tried to launch a worker {failures} times "
-            f"and failed every time, but no error text was captured. "
-            f"Usually a profile configuration issue — check profile "
-            f"health with the suggested command."
-        )
-    return [Diagnostic(
-        kind="repeated_spawn_failures",
-        severity=severity,
-        title=title,
-        detail=detail,
-        actions=actions,
-        first_seen_at=now,
-        last_seen_at=now,
-        count=failures,
-        data={"spawn_failures": failures, "last_spawn_error": last_err},
-    )]
-
-
-def _rule_repeated_crashes(task, events, runs, now, cfg) -> list[Diagnostic]:
-    """The worker spawns fine but keeps crashing mid-run. Check the last
-    N runs' outcomes; N consecutive ``crashed`` without a successful
-    ``completed`` means something about the task + profile combo is
-    broken (OOM, missing dependency, tool it needs is down).
-
-    Threshold: cfg["crash_threshold"] (default 2).
-    """
-    threshold = int(cfg.get("crash_threshold", 2))
-    ordered = sorted(runs, key=lambda r: _task_field(r, "id", 0))
-    # Count trailing consecutive 'crashed' outcomes.
-    consecutive = 0
-    last_err = None
-    for r in reversed(ordered):
-        outcome = _task_field(r, "outcome")
-        if outcome == "crashed":
-            consecutive += 1
-            if last_err is None:
-                last_err = _task_field(r, "error")
-        elif outcome in ("completed", "reclaimed"):
-            # A success (or manual reclaim) breaks the streak.
-            break
-        else:
-            # Other outcomes (timed_out, blocked, spawn_failed, gave_up)
-            # aren't crash signals — don't count them, but they also
-            # don't break the crash streak.
-            continue
-    if consecutive < threshold:
-        return []
-    task_id = _task_field(task, "id")
-    actions: list[DiagnosticAction] = []
-    if task_id:
-        actions.append(DiagnosticAction(
-            kind="cli_hint",
-            label=f"Check logs: hermes kanban log {task_id}",
-            payload={"command": f"hermes kanban log {task_id}"},
-            suggested=True,
-        ))
-    running = _task_field(task, "status") == "running"
-    actions.extend(_generic_recovery_actions(task, running=running))
-    severity = "critical" if consecutive >= threshold * 2 else "error"
-    # Put the actual error up-front so operators see WHAT broke without
-    # having to open the logs. Truncate defensively — these can be huge
-    # (full tracebacks).
-    err_text = (last_err or "").strip() if last_err else ""
-    err_snippet = err_text[:500] + ("…" if len(err_text) > 500 else "") if err_text else ""
-    if err_snippet:
-        title = f"Agent crashed {consecutive}x: {err_snippet.splitlines()[0][:160]}"
-        detail = (
-            f"The last {consecutive} runs ended with outcome=crashed. "
-            f"Full last error:\n\n{err_snippet}"
-        )
-    else:
-        title = f"Agent crashed {consecutive}x (no error recorded)"
-        detail = (
-            f"The last {consecutive} runs ended with outcome=crashed but "
-            f"no error text was captured. Check the worker log for more."
-        )
-    return [Diagnostic(
-        kind="repeated_crashes",
-        severity=severity,
-        title=title,
-        detail=detail,
-        actions=actions,
-        first_seen_at=now,
-        last_seen_at=now,
-        count=consecutive,
-        data={"consecutive_crashes": consecutive, "last_error": last_err},
-    )]
-
-
-def _rule_stuck_in_blocked(task, events, runs, now, cfg) -> list[Diagnostic]:
-    """Task has been in ``blocked`` status for too long without a comment.
-
-    Threshold: cfg["blocked_stale_hours"] (default 24).
-    Surfaced as a warning so humans know there's a pending unblock.
-    """
-    hours = float(cfg.get("blocked_stale_hours", 24))
-    status = _task_field(task, "status")
-    if status != "blocked":
-        return []
-    # Find the most recent ``blocked`` event.
-    last_blocked_ts = 0
-    for ev in events:
-        if _event_kind(ev) == "blocked":
-            t = _event_ts(ev)
-            if t > last_blocked_ts:
-                last_blocked_ts = t
-    if last_blocked_ts == 0:
-        return []
-    age_hours = (now - last_blocked_ts) / 3600.0
-    if age_hours < hours:
-        return []
-    # Any comment / unblock after the block breaks the "stale" signal.
-    for ev in events:
-        if _event_kind(ev) in ("commented", "unblocked") and _event_ts(ev) > last_blocked_ts:
-            return []
-    actions: list[DiagnosticAction] = [
-        DiagnosticAction(
-            kind="comment",
-            label="Add a comment / unblock the task",
-            suggested=True,
-        ),
-    ]
-    return [Diagnostic(
-        kind="stuck_in_blocked",
-        severity="warning",
-        title=f"Task has been blocked for {int(age_hours)}h",
-        detail=(
-            f"This task transitioned to blocked {int(age_hours)}h ago and "
-            f"has had no comments or unblock attempts since. Blocked tasks "
-            f"are waiting for human input — check the block reason and "
-            f"either unblock with feedback or answer with a comment."
-        ),
-        actions=actions,
-        first_seen_at=last_blocked_ts,
-        last_seen_at=last_blocked_ts,
-        count=1,
-        data={"blocked_at": last_blocked_ts, "age_hours": round(age_hours, 1)},
-    )]
-
-
-# Registry — order matters: rules higher on the list render first when
-# severity ties. Add new rules here.
-_RULES: list[RuleFn] = [
-    _rule_hallucinated_cards,
-    _rule_prose_phantom_refs,
-    _rule_repeated_spawn_failures,
-    _rule_repeated_crashes,
-    _rule_stuck_in_blocked,
-]
-
-
-# Known kinds (for the UI's filter / legend / i18n keys). Update when
-# rules are added.
-DIAGNOSTIC_KINDS = (
-    "hallucinated_cards",
-    "prose_phantom_refs",
-    "repeated_spawn_failures",
-    "repeated_crashes",
-    "stuck_in_blocked",
-)
-
-
-DEFAULT_CONFIG = {
-    "spawn_failure_threshold": 3,
-    "crash_threshold": 2,
-    "blocked_stale_hours": 24,
-}
-
-
-def compute_task_diagnostics(
-    task,
-    events: list,
-    runs: list,
-    *,
-    now: Optional[int] = None,
-    config: Optional[dict] = None,
-) -> list[Diagnostic]:
-    """Run every rule against a single task's state and return a
-    severity-sorted list of active diagnostics.
-
-    Sorting: critical first, then error, then warning; ties broken by
-    most-recent ``last_seen_at``.
-    """
-    now_ts = int(now if now is not None else time.time())
-    cfg = {**DEFAULT_CONFIG, **(config or {})}
-    out: list[Diagnostic] = []
-    for rule in _RULES:
-        try:
-            out.extend(rule(task, events, runs, now_ts, cfg))
-        except Exception:
-            # A broken rule must never crash the dashboard. Rule bugs
-            # get caught in tests; in production we'd rather drop the
-            # diagnostic than 500 a whole /board request.
-            continue
-    severity_idx = {s: i for i, s in enumerate(SEVERITY_ORDER)}
-    out.sort(
-        key=lambda d: (
-            -severity_idx.get(d.severity, -1),
-            -(d.last_seen_at or 0),
-        )
-    )
-    return out
-
-
-def severity_of_highest(diagnostics: Iterable[Diagnostic]) -> Optional[str]:
-    """Highest severity present in the list, or None if empty. Useful
-    for card badges that need a single color."""
-    highest_idx = -1
-    highest = None
-    for d in diagnostics:
-        idx = SEVERITY_ORDER.index(d.severity) if d.severity in SEVERITY_ORDER else -1
-        if idx > highest_idx:
-            highest_idx = idx
-            highest = d.severity
-    return highest
@@ -361,7 +361,7 @@ def _write_env_vars(env_path: Path, env_writes: dict) -> None:

    existing_lines = []
    if env_path.exists():
-        existing_lines = env_path.read_text(encoding="utf-8").splitlines()
+        existing_lines = env_path.read_text().splitlines()

    updated_keys = set()
    new_lines = []
@@ -190,18 +190,11 @@ def _load_direct_aliases() -> dict[str, DirectAlias]:
            model: "minimax-m2.7"
            provider: custom
            base_url: "https://ollama.com/v1"
-
-    Also reads ``model.aliases`` (set by ``hermes config set model.aliases.xxx``)
-    and converts simple string entries (``ds-flash: deepseek/deepseek-v4-flash``)
-    into DirectAlias objects.  The provider is parsed from the ``provider/``
-    prefix in the value; if no slash, the current provider is used.
    """
    merged = dict(_BUILTIN_DIRECT_ALIASES)
    try:
        from hermes_cli.config import load_config
        cfg = load_config()
-
-        # --- model_aliases (dict-based format) ---
        user_aliases = cfg.get("model_aliases")
        if isinstance(user_aliases, dict):
            for name, entry in user_aliases.items():
@@ -214,30 +207,6 @@ def _load_direct_aliases() -> dict[str, DirectAlias]:
                    merged[name.strip().lower()] = DirectAlias(
                        model=model, provider=provider, base_url=base_url,
                    )
-
-        # --- model.aliases (string-based format, from config set) ---
-        model_section = cfg.get("model", {})
-        if isinstance(model_section, dict):
-            simple_aliases = model_section.get("aliases")
-            if isinstance(simple_aliases, dict):
-                current_provider = model_section.get("provider", "")
-                for name, value in simple_aliases.items():
-                    if not isinstance(value, str) or not value.strip():
-                        continue
-                    key = name.strip().lower()
-                    if key in merged:
-                        continue  # don't override explicit model_aliases entries
-                    val = value.strip()
-                    if "/" in val:
-                        provider, model = val.split("/", 1)
-                    else:
-                        provider = current_provider
-                        model = val
-                    merged[key] = DirectAlias(
-                        model=model.strip(),
-                        provider=provider.strip() or current_provider,
-                        base_url="",
-                    )
    except Exception:
        pass
    return merged
@@ -935,26 +904,6 @@ def switch_model(
                        if any(m.get("name") == new_model for m in cfg_models if isinstance(m, dict)):
                            override = True
                            break
-        # Also check custom_providers list — models declared there should be accepted
-        # even if the remote /v1/models endpoint doesn't list them.
-        if not override and custom_providers and isinstance(custom_providers, list):
-            for entry in custom_providers:
-                if not isinstance(entry, dict):
-                    continue
-                # Match by provider slug (custom:<name>) or by base_url
-                entry_name = entry.get("name", "")
-                entry_slug = f"custom:{entry_name}" if entry_name else ""
-                entry_url = entry.get("base_url", "")
-                if entry_slug == target_provider or entry_url == base_url:
-                    # Check if the requested model matches the entry's model
-                    entry_model = entry.get("model", "")
-                    entry_models = entry.get("models", {})
-                    if new_model == entry_model:
-                        override = True
-                        break
-                    if isinstance(entry_models, dict) and new_model in entry_models:
-                        override = True
-                        break
        if override:
            validation = {"accepted": True, "persist": True, "recognized": False, "message": validation.get("message", "")}
        else:
@@ -1108,45 +1057,6 @@ def list_authenticated_providers(
        if normed:
            _builtin_endpoints.add(normed)

-    def _has_fast_aws_sdk_signal() -> bool:
-        """Return True when explicit AWS auth config is present.
-
-        This intentionally avoids botocore's full credential chain. Provider
-        picker/model-switch discovery can run for non-Bedrock providers, and
-        botocore may otherwise probe EC2 IMDS (169.254.169.254) on local
-        machines before returning no credentials.
-        """
-        if os.environ.get("AWS_BEARER_TOKEN_BEDROCK", "").strip():
-            return True
-        if (
-            os.environ.get("AWS_ACCESS_KEY_ID", "").strip()
-            and os.environ.get("AWS_SECRET_ACCESS_KEY", "").strip()
-        ):
-            return True
-        return any(
-            os.environ.get(name, "").strip()
-            for name in (
-                "AWS_PROFILE",
-                "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
-                "AWS_CONTAINER_CREDENTIALS_FULL_URI",
-                "AWS_WEB_IDENTITY_TOKEN_FILE",
-            )
-        )
-
-    def _has_aws_sdk_creds_for_listing(slug: str) -> bool:
-        """Credential check for AWS SDK providers in non-runtime discovery."""
-        slug_norm = str(slug or "").strip().lower()
-        current_norm = str(current_provider or "").strip().lower()
-        if _has_fast_aws_sdk_signal():
-            return True
-        if slug_norm != current_norm:
-            return False
-        try:
-            from agent.bedrock_adapter import has_aws_credentials
-            return bool(has_aws_credentials())
-        except Exception:
-            return False
-
    data = fetch_models_dev()

    # Build curated model lists keyed by hermes provider ID
@@ -1274,9 +1184,7 @@ def list_authenticated_providers(

        # Check if credentials exist
        has_creds = False
-        if overlay.auth_type == "aws_sdk":
-            has_creds = _has_aws_sdk_creds_for_listing(hermes_slug)
-        elif overlay.extra_env_vars:
+        if overlay.extra_env_vars:
            has_creds = any(os.environ.get(ev) for ev in overlay.extra_env_vars)
        # Also check api_key_env_vars from PROVIDER_REGISTRY for api_key auth_type
        if not has_creds and overlay.auth_type == "api_key":
@@ -1295,7 +1203,11 @@ def list_authenticated_providers(
                from hermes_cli.auth import _load_auth_store
                store = _load_auth_store()
                providers_store = store.get("providers", {})
-                if store and (pid in providers_store or hermes_slug in providers_store):
+                pool_store = store.get("credential_pool", {})
+                if store and (
+                    pid in providers_store or hermes_slug in providers_store
+                    or pid in pool_store or hermes_slug in pool_store
+                ):
                    has_creds = True
            except Exception as exc:
                logger.debug("Auth store check failed for %s: %s", pid, exc)
@@ -1391,7 +1303,11 @@ def list_authenticated_providers(
                from hermes_cli.auth import _load_auth_store
                _cp_store = _load_auth_store()
                _cp_providers_store = _cp_store.get("providers", {})
-                if _cp_store and _cp.slug in _cp_providers_store:
+                _cp_pool_store = _cp_store.get("credential_pool", {})
+                if _cp_store and (
+                    _cp.slug in _cp_providers_store
+                    or _cp.slug in _cp_pool_store
+                ):
                    _cp_has_creds = True
            except Exception:
                pass
@@ -1408,7 +1324,11 @@ def list_authenticated_providers(
        # credentials come from the boto3 credential chain (env vars,
        # ~/.aws/credentials, instance roles, etc.)
        if not _cp_has_creds and _cp_config and getattr(_cp_config, "auth_type", "") == "aws_sdk":
-            _cp_has_creds = _has_aws_sdk_creds_for_listing(_cp.slug)
+            try:
+                from agent.bedrock_adapter import has_aws_credentials
+                _cp_has_creds = has_aws_credentials()
+            except Exception:
+                pass

        if not _cp_has_creds:
            continue
@@ -1683,59 +1603,3 @@ def list_authenticated_providers(
    results.sort(key=lambda r: (not r["is_current"], -r["total_models"]))

    return results
-
-
-def list_picker_providers(
-    current_provider: str = "",
-    user_providers: dict = None,
-    custom_providers: list | None = None,
-    max_models: int = 8,
-) -> List[dict]:
-    """Interactive-picker variant of :func:`list_authenticated_providers`.
-
-    Post-processes the base list so the ``/model`` picker (Telegram/Discord
-    inline keyboards) only surfaces models that are actually callable in the
-    current install:
-
-    - OpenRouter's model list is replaced with the output of
-      :func:`hermes_cli.models.fetch_openrouter_models`, which filters the
-      curated ``OPENROUTER_MODELS`` snapshot against the live OpenRouter
-      catalog.  IDs the live catalog no longer carries drop out, so the
-      picker never offers a model the user can't call.
-    - Provider rows whose model list ends up empty are dropped, except
-      custom endpoints (``is_user_defined=True`` with an ``api_url``) where
-      the user may supply their own model set through config.
-
-    All other providers and metadata fields are passed through unchanged.
-    The typed ``/model <name>`` path is unaffected -- only the interactive
-    picker payload is narrowed.
-    """
-    from hermes_cli.models import fetch_openrouter_models
-
-    providers = list_authenticated_providers(
-        current_provider=current_provider,
-        user_providers=user_providers,
-        custom_providers=custom_providers,
-        max_models=max_models,
-    )
-
-    filtered: List[dict] = []
-    for p in providers:
-        slug = str(p.get("slug", "")).lower()
-        if slug == "openrouter":
-            try:
-                live = fetch_openrouter_models()
-                live_ids = [mid for mid, _ in live]
-            except Exception:
-                live_ids = list(p.get("models", []))
-            p = dict(p)
-            p["models"] = live_ids[:max_models]
-            p["total_models"] = len(live_ids)
-
-        has_models = bool(p.get("models"))
-        is_custom_endpoint = bool(p.get("is_user_defined")) and bool(p.get("api_url"))
-        if not has_models and not is_custom_endpoint:
-            continue
-        filtered.append(p)
-
-    return filtered
@@ -806,25 +806,6 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
    ProviderEntry("ai-gateway",     "Vercel AI Gateway",        "Vercel AI Gateway"),
 ]

-# Auto-extend CANONICAL_PROVIDERS with any provider registered in providers/
-# that is not already in the list above.  Adding providers/*.py is sufficient
-# to expose a new provider in the model picker, /model, and all downstream
-# consumers — no edits to this file needed.
-_canonical_slugs = {p.slug for p in CANONICAL_PROVIDERS}
-try:
-    from providers import list_providers as _list_providers_for_canonical
-    for _pp in _list_providers_for_canonical():
-        if _pp.name in _canonical_slugs:
-            continue
-        if _pp.auth_type in ("oauth_device_code", "oauth_external", "external_process", "aws_sdk", "copilot"):
-            continue  # non-api-key flows need bespoke picker UX; skip auto-inject
-        _label = _pp.display_name or _pp.name
-        _desc = _pp.description or f"{_label} (direct API)"
-        CANONICAL_PROVIDERS.append(ProviderEntry(_pp.name, _label, _desc))
-        _canonical_slugs.add(_pp.name)
-except Exception:
-    pass
-
 # Derived dicts — used throughout the codebase
 _PROVIDER_LABELS = {p.slug: p.label for p in CANONICAL_PROVIDERS}
 _PROVIDER_LABELS["custom"] = "Custom endpoint"  # special case: not a named provider
@@ -1759,20 +1740,10 @@ def model_supports_fast_mode(model_id: Optional[str]) -> bool:


 def _is_anthropic_fast_model(model_id: Optional[str]) -> bool:
-    """Return True if the model is a Claude model eligible for Anthropic Fast Mode.
-
-    Fast mode is currently supported on Claude Opus 4.6 only. Per Anthropic's
-    docs (https://platform.claude.com/docs/en/build-with-claude/fast-mode):
-    "Fast mode is currently supported on Opus 4.6 only. Sending speed: fast
-    with an unsupported model returns an error." Opus 4.7 explicitly rejects
-    the ``speed`` parameter with HTTP 400.
-    """
+    """Return True if the model is a Claude model eligible for Anthropic Fast Mode."""
    raw = _strip_vendor_prefix(str(model_id or ""))
    base = raw.split(":")[0]
-    if not base.startswith("claude-"):
-        return False
-    # Only Opus 4.6 supports fast mode at present.
-    return "opus-4-6" in base or "opus-4.6" in base
+    return base.startswith("claude-")


 def resolve_fast_mode_overrides(model_id: Optional[str]) -> dict[str, Any] | None:
@@ -2042,34 +2013,6 @@ def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False)
                return ids
        except Exception:
            pass
-
-    # ── Profile-based generic live fetch (all simple api-key providers) ──
-    # Handles any provider registered in providers/ with auth_type="api_key".
-    # Replaces per-provider copy-paste blocks (stepfun, gmi, zai, etc.).
-    try:
-        from providers import get_provider_profile
-        from hermes_cli.auth import resolve_api_key_provider_credentials
-
-        _p = get_provider_profile(normalized)
-        if _p and _p.auth_type == "api_key" and _p.base_url:
-            try:
-                creds = resolve_api_key_provider_credentials(normalized)
-                api_key = str(creds.get("api_key") or "").strip()
-                base_url = str(creds.get("base_url") or "").strip()
-            except Exception:
-                api_key, base_url = "", _p.base_url
-            if not base_url:
-                base_url = _p.base_url
-            if api_key:
-                live = _p.fetch_models(api_key=api_key)
-                if live:
-                    return live
-            # Use profile's fallback_models if defined
-            if _p.fallback_models:
-                return list(_p.fallback_models)
-    except Exception:
-        pass
-
    curated_static = list(_PROVIDER_MODELS.get(normalized, []))
    if normalized in _MODELS_DEV_PREFERRED:
        return _merge_with_models_dev(normalized, curated_static)
@@ -2953,19 +2896,6 @@ def fetch_api_models(
 _OLLAMA_CLOUD_CACHE_TTL = 3600  # 1 hour


-def _strip_ollama_cloud_suffix(model_id: str) -> str:
-    """Strip :cloud / -cloud suffixes that models.dev appends to Ollama Cloud IDs.
-
-    The live API uses clean IDs (e.g. 'kimi-k2.6') while models.dev sometimes
-    returns them as 'kimi-k2.6:cloud'. Normalising before the dedup merge
-    prevents duplicate entries in the merged model list.
-    """
-    for suffix in (":cloud", "-cloud"):
-        if model_id.endswith(suffix):
-            return model_id[: -len(suffix)]
-    return model_id
-
-
 def _ollama_cloud_cache_path() -> Path:
    """Return the path for the Ollama Cloud model cache."""
    from hermes_constants import get_hermes_home
@@ -3061,10 +2991,9 @@ def fetch_ollama_cloud_models(
                seen.add(m)
                merged.append(m)
        for m in mdev_models:
-            normalized = _strip_ollama_cloud_suffix(m)
-            if normalized and normalized not in seen:
-                seen.add(normalized)
-                merged.append(normalized)
+            if m and m not in seen:
+                seen.add(m)
+                merged.append(m)
        if merged:
            _save_ollama_cloud_cache(merged)
            return merged
@@ -3158,7 +3087,7 @@ def validate_requested_model(
            "message": f"Model `{requested}` was not found in LM Studio's model listing.",
        }

-    if normalized == "custom" or normalized.startswith("custom:"):
+    if normalized == "custom":
        # Try probing with correct auth for the api_mode.
        if api_mode == "anthropic_messages":
            probe = probe_api_models(api_key, base_url, api_mode=api_mode)
@@ -3256,12 +3185,11 @@ def validate_requested_model(
            if suggestions:
                suggestion_text = "\n  Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
            return {
-                "accepted": True,
-                "persist": True,
+                "accepted": False,
+                "persist": False,
                "recognized": False,
                "message": (
-                    f"Note: `{requested}` was not found in the OpenAI Codex model listing. "
-                    "It may still work if your ChatGPT/Codex account has access to a newer or hidden model ID."
+                    f"Model `{requested}` was not found in the OpenAI Codex model listing."
                    f"{suggestion_text}"
                ),
            }
@@ -173,7 +173,7 @@ def _get_enabled_plugins() -> Optional[set]:
 # Data classes
 # ---------------------------------------------------------------------------

-_VALID_PLUGIN_KINDS: Set[str] = {"standalone", "backend", "exclusive", "platform", "model-provider"}
+_VALID_PLUGIN_KINDS: Set[str] = {"standalone", "backend", "exclusive", "platform"}


@dataclass
@@ -643,17 +643,15 @@ class PluginManager:
        #   - flat: ``plugins/disk-cleanup/plugin.yaml`` (standalone)
        #   - category: ``plugins/image_gen/openai/plugin.yaml`` (backend)
        #
-        # ``memory/``, ``context_engine/``, and ``model-providers/`` are
-        # skipped at the top level — they have their own discovery systems
-        # (plugins/memory/__init__.py, providers/__init__.py). ``platforms/``
-        # is a category holding platform adapters (scanned one level deeper
-        # below).
+        # ``memory/`` and ``context_engine/`` are skipped at the top level —
+        # they have their own discovery systems. ``platforms/`` is a category
+        # holding platform adapters (scanned one level deeper below).
        repo_plugins = get_bundled_plugins_dir()
        manifests.extend(
            self._scan_directory(
                repo_plugins,
                source="bundled",
-                skip_names={"memory", "context_engine", "platforms", "model-providers"},
+                skip_names={"memory", "context_engine", "platforms"},
            )
        )
        manifests.extend(
@@ -711,21 +709,6 @@ class PluginManager:
                )
                continue

-            # Model provider plugins are loaded by providers/__init__.py
-            # (its own lazy discovery keyed off first get_provider_profile()
-            # call). We record the manifest here for introspection but do
-            # not import the module — a second import would create two
-            # ProviderProfile instances and break the "last writer wins"
-            # override semantics between bundled and user plugins.
-            if manifest.kind == "model-provider":
-                loaded = LoadedPlugin(manifest=manifest, enabled=True)
-                self._plugins[lookup_key] = loaded
-                logger.debug(
-                    "Skipping '%s' (model-provider, handled by providers/ discovery)",
-                    lookup_key,
-                )
-                continue
-
            # Built-in backends auto-load — they ship with hermes and must
            # just work. Selection among them (e.g. which image_gen backend
            # services calls) is driven by ``<category>.provider`` config,
@@ -903,19 +886,6 @@ class PluginManager:
                                "treating as kind='exclusive'",
                                key,
                            )
-                        elif (
-                            "register_provider" in source_text
-                            and "ProviderProfile" in source_text
-                        ):
-                            # Model provider plugin (calls register_provider()
-                            # from ``providers`` with a ProviderProfile). Route
-                            # to providers/__init__.py discovery.
-                            kind = "model-provider"
-                            logger.debug(
-                                "Plugin %s: detected model provider, "
-                                "treating as kind='model-provider'",
-                                key,
-                            )
                    except Exception:
                        pass

@@ -179,33 +179,8 @@ def _get_wrapper_dir() -> Path:
 # Validation
 # ---------------------------------------------------------------------------

-def normalize_profile_name(name: str) -> str:
-    """Return the canonical profile id used on disk and in CLI ``-p`` argv.
-
-    Named profiles are stored lowercase under ``profiles/<id>/``. The special
-    alias ``default`` is matched case-insensitively (``Default`` → ``default``).
-    Dashboards and tools may pass title-cased display labels; normalize before
-    validation, assignment, and subprocess spawn (see issue #18498).
-    """
-    if not isinstance(name, str):
-        name = str(name)
-    stripped = name.strip()
-    if not stripped:
-        raise ValueError("profile name cannot be empty")
-    if stripped.casefold() == "default":
-        return "default"
-    return stripped.lower()
-
-
 def validate_profile_name(name: str) -> None:
-    """Raise ``ValueError`` if *name* is not a valid profile identifier.
-
-    Validates the input as-given — strict lowercase match. Callers that accept
-    mixed-case or title-cased input from users (dashboard UI, CLI args) should
-    call :func:`normalize_profile_name` first. This separation keeps validate
-    honest about what the on-disk directory name must look like, while
-    ingress-point normalization handles UX flexibility (see #18498).
-    """
+    """Raise ``ValueError`` if *name* is not a valid profile identifier."""
    if name == "default":
        return  # special alias for ~/.hermes
    if not _PROFILE_ID_RE.match(name):
@@ -217,18 +192,16 @@ def validate_profile_name(name: str) -> None:

 def get_profile_dir(name: str) -> Path:
    """Resolve a profile name to its HERMES_HOME directory."""
-    canon = normalize_profile_name(name)
-    if canon == "default":
+    if name == "default":
        return _get_default_hermes_home()
-    return _get_profiles_root() / canon
+    return _get_profiles_root() / name


 def profile_exists(name: str) -> bool:
    """Check whether a profile directory exists."""
-    canon = normalize_profile_name(name)
-    if canon == "default":
+    if name == "default":
        return True
-    return get_profile_dir(canon).is_dir()
+    return get_profile_dir(name).is_dir()


 # ---------------------------------------------------------------------------
@@ -240,29 +213,28 @@ def check_alias_collision(name: str) -> Optional[str]:

    Checks: reserved names, hermes subcommands, existing binaries in PATH.
    """
-    canon = normalize_profile_name(name)
-    if canon in _RESERVED_NAMES:
-        return f"'{canon}' is a reserved name"
-    if canon in _HERMES_SUBCOMMANDS:
-        return f"'{canon}' conflicts with a hermes subcommand"
+    if name in _RESERVED_NAMES:
+        return f"'{name}' is a reserved name"
+    if name in _HERMES_SUBCOMMANDS:
+        return f"'{name}' conflicts with a hermes subcommand"

    # Check existing commands in PATH
    wrapper_dir = _get_wrapper_dir()
    try:
        result = subprocess.run(
-            ["which", canon], capture_output=True, text=True, timeout=5,
+            ["which", name], capture_output=True, text=True, timeout=5,
        )
        if result.returncode == 0:
            existing_path = result.stdout.strip()
            # Allow overwriting our own wrappers
-            if existing_path == str(wrapper_dir / canon):
+            if existing_path == str(wrapper_dir / name):
                try:
-                    content = (wrapper_dir / canon).read_text()
+                    content = (wrapper_dir / name).read_text()
                    if "hermes -p" in content:
                        return None  # it's our wrapper, safe to overwrite
                except Exception:
                    pass
-            return f"'{canon}' conflicts with an existing command ({existing_path})"
+            return f"'{name}' conflicts with an existing command ({existing_path})"
    except (FileNotFoundError, subprocess.TimeoutExpired):
        pass

@@ -280,7 +252,6 @@ def create_wrapper_script(name: str) -> Optional[Path]:

    Returns the path to the created wrapper, or None if creation failed.
    """
-    canon = normalize_profile_name(name)
    wrapper_dir = _get_wrapper_dir()
    try:
        wrapper_dir.mkdir(parents=True, exist_ok=True)
@@ -288,9 +259,9 @@ def create_wrapper_script(name: str) -> Optional[Path]:
        print(f"⚠ Could not create {wrapper_dir}: {e}")
        return None

-    wrapper_path = wrapper_dir / canon
+    wrapper_path = wrapper_dir / name
    try:
-        wrapper_path.write_text(f'#!/bin/sh\nexec hermes -p {canon} "$@"\n')
+        wrapper_path.write_text(f'#!/bin/sh\nexec hermes -p {name} "$@"\n')
        wrapper_path.chmod(wrapper_path.stat().st_mode | stat.S_IEXEC | stat.S_IXGRP | stat.S_IXOTH)
        return wrapper_path
    except OSError as e:
@@ -300,7 +271,7 @@ def create_wrapper_script(name: str) -> Optional[Path]:

 def remove_wrapper_script(name: str) -> bool:
    """Remove the wrapper script for a profile. Returns True if removed."""
-    wrapper_path = _get_wrapper_dir() / normalize_profile_name(name)
+    wrapper_path = _get_wrapper_dir() / name
    if wrapper_path.exists():
        try:
            # Verify it's our wrapper before removing
@@ -450,17 +421,16 @@ def create_profile(
    Path
        The newly created profile directory.
    """
-    canon = normalize_profile_name(name)
-    validate_profile_name(canon)
+    validate_profile_name(name)

-    if canon == "default":
+    if name == "default":
        raise ValueError(
            "Cannot create a profile named 'default' — it is the built-in profile (~/.hermes)."
        )

-    profile_dir = get_profile_dir(canon)
+    profile_dir = get_profile_dir(name)
    if profile_dir.exists():
-        raise FileExistsError(f"Profile '{canon}' already exists at {profile_dir}")
+        raise FileExistsError(f"Profile '{name}' already exists at {profile_dir}")

    # Resolve clone source
    source_dir = None
@@ -470,7 +440,6 @@ def create_profile(
            from hermes_constants import get_hermes_home
            source_dir = get_hermes_home()
        else:
-            clone_from = normalize_profile_name(clone_from)
            validate_profile_name(clone_from)
            source_dir = get_profile_dir(clone_from)
        if not source_dir.is_dir():
@@ -571,25 +540,24 @@ def delete_profile(name: str, yes: bool = False) -> Path:

    Returns the path that was removed.
    """
-    canon = normalize_profile_name(name)
-    validate_profile_name(canon)
+    validate_profile_name(name)

-    if canon == "default":
+    if name == "default":
        raise ValueError(
            "Cannot delete the default profile (~/.hermes).\n"
            "To remove everything, use: hermes uninstall"
        )

-    profile_dir = get_profile_dir(canon)
+    profile_dir = get_profile_dir(name)
    if not profile_dir.is_dir():
-        raise FileNotFoundError(f"Profile '{canon}' does not exist.")
+        raise FileNotFoundError(f"Profile '{name}' does not exist.")

    # Show what will be deleted
    model, provider = _read_config_model(profile_dir)
    gw_running = _check_gateway_running(profile_dir)
    skill_count = _count_skills(profile_dir)

-    print(f"\nProfile: {canon}")
+    print(f"\nProfile: {name}")
    print(f"Path:    {profile_dir}")
    if model:
        print(f"Model:   {model}" + (f" ({provider})" if provider else ""))
@@ -601,7 +569,7 @@ def delete_profile(name: str, yes: bool = False) -> Path:
    ]

    # Check for service
-    wrapper_path = _get_wrapper_dir() / canon
+    wrapper_path = _get_wrapper_dir() / name
    has_wrapper = wrapper_path.exists()
    if has_wrapper:
        items.append(f"Command alias ({wrapper_path})")
@@ -616,16 +584,16 @@ def delete_profile(name: str, yes: bool = False) -> Path:
    if not yes:
        print()
        try:
-            confirm = input(f"Type '{canon}' to confirm: ").strip()
+            confirm = input(f"Type '{name}' to confirm: ").strip()
        except (KeyboardInterrupt, EOFError):
            print("\nCancelled.")
            return profile_dir
-        if confirm != canon:
+        if confirm != name:
            print("Cancelled.")
            return profile_dir

    # 1. Disable service (prevents auto-restart)
-    _cleanup_gateway_service(canon, profile_dir)
+    _cleanup_gateway_service(name, profile_dir)

    # 2. Stop running gateway
    if gw_running:
@@ -633,7 +601,7 @@ def delete_profile(name: str, yes: bool = False) -> Path:

    # 3. Remove wrapper script
    if has_wrapper:
-        if remove_wrapper_script(canon):
+        if remove_wrapper_script(name):
            print(f"✓ Removed {wrapper_path}")

    # 4. Remove profile directory
@@ -646,13 +614,13 @@ def delete_profile(name: str, yes: bool = False) -> Path:
    # 5. Clear active_profile if it pointed to this profile
    try:
        active = get_active_profile()
-        if active == canon:
+        if active == name:
            set_active_profile("default")
            print("✓ Active profile reset to default")
    except Exception:
        pass

-    print(f"\nProfile '{canon}' deleted.")
+    print(f"\nProfile '{name}' deleted.")
    return profile_dir


@@ -762,23 +730,22 @@ def set_active_profile(name: str) -> None:

    Writes to ``~/.hermes/active_profile``. Use ``"default"`` to clear.
    """
-    canon = normalize_profile_name(name)
-    validate_profile_name(canon)
-    if canon != "default" and not profile_exists(canon):
+    validate_profile_name(name)
+    if name != "default" and not profile_exists(name):
        raise FileNotFoundError(
-            f"Profile '{canon}' does not exist. "
-            f"Create it with: hermes profile create {canon}"
+            f"Profile '{name}' does not exist. "
+            f"Create it with: hermes profile create {name}"
        )

    path = _get_active_profile_path()
    path.parent.mkdir(parents=True, exist_ok=True)
-    if canon == "default":
+    if name == "default":
        # Remove the file to indicate default
        path.unlink(missing_ok=True)
    else:
        # Atomic write
        tmp = path.with_suffix(".tmp")
-        tmp.write_text(canon + "\n")
+        tmp.write_text(name + "\n")
        tmp.replace(path)


@@ -844,17 +811,16 @@ def export_profile(name: str, output_path: str) -> Path:
    """
    import tempfile

-    canon = normalize_profile_name(name)
-    validate_profile_name(canon)
-    profile_dir = get_profile_dir(canon)
+    validate_profile_name(name)
+    profile_dir = get_profile_dir(name)
    if not profile_dir.is_dir():
-        raise FileNotFoundError(f"Profile '{canon}' does not exist.")
+        raise FileNotFoundError(f"Profile '{name}' does not exist.")

    output = Path(output_path)
    # shutil.make_archive wants the base name without extension
    base = str(output).removesuffix(".tar.gz").removesuffix(".tgz")

-    if canon == "default":
+    if name == "default":
        # The default profile IS ~/.hermes itself — its parent is ~/ and its
        # directory name is ".hermes", not "default".  We stage a clean copy
        # under a temp dir so the archive contains ``default/...``.
@@ -870,14 +836,14 @@ def export_profile(name: str, output_path: str) -> Path:

    # Named profiles — stage a filtered copy to exclude credentials
    with tempfile.TemporaryDirectory() as tmpdir:
-        staged = Path(tmpdir) / canon
+        staged = Path(tmpdir) / name
        _CREDENTIAL_FILES = {"auth.json", ".env"}
        shutil.copytree(
            profile_dir,
            staged,
            ignore=lambda d, contents: _CREDENTIAL_FILES & set(contents),
        )
-        result = shutil.make_archive(base, "gztar", tmpdir, canon)
+        result = shutil.make_archive(base, "gztar", tmpdir, name)
        return Path(result)


@@ -986,17 +952,16 @@ def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
    # Archives exported from the default profile have "default/" as top-level
    # dir.  Importing as "default" would target ~/.hermes itself — disallow
    # that and guide the user toward a named profile.
-    canon = normalize_profile_name(inferred_name)
-    validate_profile_name(canon)
-    if canon == "default":
+    if inferred_name == "default":
        raise ValueError(
            "Cannot import as 'default' — that is the built-in root profile (~/.hermes). "
            "Specify a different name: hermes profile import <archive> --name <name>"
        )

-    profile_dir = get_profile_dir(canon)
+    validate_profile_name(inferred_name)
+    profile_dir = get_profile_dir(inferred_name)
    if profile_dir.exists():
-        raise FileExistsError(f"Profile '{canon}' already exists at {profile_dir}")
+        raise FileExistsError(f"Profile '{inferred_name}' already exists at {profile_dir}")

    profiles_root = _get_profiles_root()
    profiles_root.mkdir(parents=True, exist_ok=True)
@@ -1012,8 +977,8 @@ def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
            )

        final_source = extracted
-        if archive_root != canon:
-            final_source = staging_root / canon
+        if archive_root != inferred_name:
+            final_source = staging_root / inferred_name
            extracted.rename(final_source)

        shutil.move(str(final_source), str(profile_dir))
@@ -1083,27 +1048,25 @@ def rename_profile(old_name: str, new_name: str) -> Path:

    Returns the new profile directory.
    """
-    old_canon = normalize_profile_name(old_name)
-    new_canon = normalize_profile_name(new_name)
-    validate_profile_name(old_canon)
-    validate_profile_name(new_canon)
+    validate_profile_name(old_name)
+    validate_profile_name(new_name)

-    if old_canon == "default":
+    if old_name == "default":
        raise ValueError("Cannot rename the default profile.")
-    if new_canon == "default":
+    if new_name == "default":
        raise ValueError("Cannot rename to 'default' — it is reserved.")

-    old_dir = get_profile_dir(old_canon)
-    new_dir = get_profile_dir(new_canon)
+    old_dir = get_profile_dir(old_name)
+    new_dir = get_profile_dir(new_name)

    if not old_dir.is_dir():
-        raise FileNotFoundError(f"Profile '{old_canon}' does not exist.")
+        raise FileNotFoundError(f"Profile '{old_name}' does not exist.")
    if new_dir.exists():
-        raise FileExistsError(f"Profile '{new_canon}' already exists.")
+        raise FileExistsError(f"Profile '{new_name}' already exists.")

    # 1. Stop gateway if running
    if _check_gateway_running(old_dir):
-        _cleanup_gateway_service(old_canon, old_dir)
+        _cleanup_gateway_service(old_name, old_dir)
        _stop_gateway_process(old_dir)

    # 2. Rename directory
@@ -1111,22 +1074,22 @@ def rename_profile(old_name: str, new_name: str) -> Path:
    print(f"✓ Renamed {old_dir.name} → {new_dir.name}")

    # 3. Update profile-scoped Honcho host blocks, preserving aiPeer identity
-    _migrate_honcho_profile_host(old_canon, new_canon, new_dir)
+    _migrate_honcho_profile_host(old_name, new_name, new_dir)

    # 4. Update wrapper script
-    remove_wrapper_script(old_canon)
-    collision = check_alias_collision(new_canon)
+    remove_wrapper_script(old_name)
+    collision = check_alias_collision(new_name)
    if not collision:
-        create_wrapper_script(new_canon)
-        print(f"✓ Alias updated: {new_canon}")
+        create_wrapper_script(new_name)
+        print(f"✓ Alias updated: {new_name}")
    else:
-        print(f"⚠ Cannot create alias '{new_canon}' — {collision}")
+        print(f"⚠ Cannot create alias '{new_name}' — {collision}")

    # 5. Update active_profile if it pointed to old name
    try:
-        if get_active_profile() == old_canon:
-            set_active_profile(new_canon)
-            print(f"✓ Active profile updated: {new_canon}")
+        if get_active_profile() == old_name:
+            set_active_profile(new_name)
+            print(f"✓ Active profile updated: {new_name}")
    except Exception:
        pass

@@ -1228,14 +1191,13 @@ def resolve_profile_env(profile_name: str) -> str:
    Called early in the CLI entry point, before any hermes modules
    are imported, to set the HERMES_HOME environment variable.
    """
-    canon = normalize_profile_name(profile_name)
-    validate_profile_name(canon)
-    profile_dir = get_profile_dir(canon)
+    validate_profile_name(profile_name)
+    profile_dir = get_profile_dir(profile_name)

-    if canon != "default" and not profile_dir.is_dir():
+    if profile_name != "default" and not profile_dir.is_dir():
        raise FileNotFoundError(
-            f"Profile '{canon}' does not exist. "
-            f"Create it with: hermes profile create {canon}"
+            f"Profile '{profile_name}' does not exist. "
+            f"Create it with: hermes profile create {profile_name}"
        )

    return str(profile_dir)
@@ -108,14 +108,9 @@ class PtyBridge:
                    "(or pip install -e '.[pty]')."
                )
            raise PtyUnavailableError("Pseudo-terminals are unavailable.")
-        # PTY-hosted programs expect TERM to describe the terminal type.
-        # CI often runs without TERM in the parent process, which makes
-        # simple terminal probes like `tput cols` fail before winsize reads.
-        # Preserve explicit caller overrides, but backfill a sensible default
-        # when TERM is missing or blank.
-        spawn_env = (os.environ.copy() if env is None else env.copy())
-        if not spawn_env.get("TERM"):
-            spawn_env["TERM"] = "xterm-256color"
+        # Let caller-supplied env fully override inheritance; if they pass
+        # None we inherit the server's env (same semantics as subprocess).
+        spawn_env = os.environ.copy() if env is None else env
        proc = ptyprocess.PtyProcess.spawn(  # type: ignore[union-attr]
            list(argv),
            cwd=cwd,
@@ -0,0 +1,316 @@
+"""Session recap — summarize what's happened in the current session.
+
+Inspired by Claude Code's `/recap` command (v2.1.114, April 2026), which
+shows a one-line summary of what happened while a terminal was unfocused
+so users juggling multiple sessions can re-orient quickly.
+
+Source: https://code.claude.com/docs/en/whats-new/2026-w17
+
+Differences from Claude Code:
+    - Pure local computation from the in-memory conversation history. No
+      LLM call, no auxiliary model, no prompt-cache invalidation. A
+      recap should be instant and free.
+    - Works unchanged on CLI and every gateway platform (Telegram,
+      Discord, Slack, …) because both call into the same ``build_recap``
+      helper. Claude Code only shows this on the CLI.
+    - Tailored to hermes-agent's tool vocabulary (``terminal``, ``patch``,
+      ``write_file``, ``delegate_task``, ``browser_*``, ``web_*``) — the
+      recap surfaces which classes of work were most active.
+"""
+from __future__ import annotations
+
+import os
+from collections import Counter
+from typing import Any, Iterable, List, Mapping, Optional, Sequence, Tuple
+
+# How many recent user/assistant turns we consider "recent activity".
+_RECENT_TURN_WINDOW = 20
+
+# How many characters of the latest user prompt to show.
+_PROMPT_PREVIEW_CHARS = 140
+
+# How many characters of the latest assistant text to show.
+_ASSISTANT_PREVIEW_CHARS = 200
+
+# How many recently-touched files to list.
+_MAX_FILES_LISTED = 5
+
+# Tool names that identify a file-editing action and the argument key that
+# holds the path.
+_FILE_EDIT_TOOLS: Mapping[str, str] = {
+    "write_file": "path",
+    "patch": "path",
+    "read_file": "path",
+    "skill_manage": "file_path",
+    "skill_view": "file_path",
+}
+
+
+def _coerce_text(value: Any) -> str:
+    """Flatten assistant/user ``content`` into a plain string.
+
+    Content can be a string or a list of content blocks (for multimodal
+    or reasoning models). We concatenate every text-like block and
+    ignore the rest.
+    """
+    if value is None:
+        return ""
+    if isinstance(value, str):
+        return value
+    if isinstance(value, list):
+        parts: List[str] = []
+        for block in value:
+            if isinstance(block, str):
+                parts.append(block)
+                continue
+            if isinstance(block, Mapping):
+                text = block.get("text")
+                if isinstance(text, str) and text:
+                    parts.append(text)
+        return "\n".join(parts)
+    return str(value)
+
+
+def _tool_call_name_and_args(tool_call: Any) -> Tuple[str, Mapping[str, Any]]:
+    """Extract ``(name, arguments_dict)`` from a tool_call entry.
+
+    ``arguments`` may be a JSON string or a dict depending on provider.
+    Return an empty dict if it cannot be parsed.
+    """
+    if not isinstance(tool_call, Mapping):
+        return "", {}
+    fn = tool_call.get("function") or {}
+    if not isinstance(fn, Mapping):
+        return "", {}
+    name = str(fn.get("name") or "") or ""
+    raw_args = fn.get("arguments")
+    if isinstance(raw_args, Mapping):
+        return name, raw_args
+    if isinstance(raw_args, str) and raw_args:
+        try:
+            import json
+
+            parsed = json.loads(raw_args)
+            if isinstance(parsed, Mapping):
+                return name, parsed
+        except Exception:
+            return name, {}
+    return name, {}
+
+
+def _iter_assistant_tool_calls(
+    messages: Sequence[Mapping[str, Any]],
+) -> Iterable[Tuple[str, Mapping[str, Any]]]:
+    for msg in messages:
+        if not isinstance(msg, Mapping):
+            continue
+        if msg.get("role") != "assistant":
+            continue
+        tool_calls = msg.get("tool_calls") or []
+        if not isinstance(tool_calls, list):
+            continue
+        for tc in tool_calls:
+            name, args = _tool_call_name_and_args(tc)
+            if name:
+                yield name, args
+
+
+def _count_visible_turns(
+    messages: Sequence[Mapping[str, Any]],
+) -> Tuple[int, int, int]:
+    """Return ``(user_turn_count, assistant_turn_count, tool_message_count)``."""
+    users = assistants = tools = 0
+    for msg in messages:
+        if not isinstance(msg, Mapping):
+            continue
+        role = msg.get("role")
+        if role == "user":
+            users += 1
+        elif role == "assistant":
+            assistants += 1
+        elif role == "tool":
+            tools += 1
+    return users, assistants, tools
+
+
+def _latest_user_prompt(
+    messages: Sequence[Mapping[str, Any]],
+) -> Optional[str]:
+    for msg in reversed(messages):
+        if isinstance(msg, Mapping) and msg.get("role") == "user":
+            text = _coerce_text(msg.get("content")).strip()
+            if text:
+                return text
+    return None
+
+
+def _latest_assistant_text(
+    messages: Sequence[Mapping[str, Any]],
+) -> Optional[str]:
+    for msg in reversed(messages):
+        if not isinstance(msg, Mapping):
+            continue
+        if msg.get("role") != "assistant":
+            continue
+        text = _coerce_text(msg.get("content")).strip()
+        if text:
+            return text
+    return None
+
+
+def _recent_window(
+    messages: Sequence[Mapping[str, Any]], window: int = _RECENT_TURN_WINDOW
+) -> List[Mapping[str, Any]]:
+    """Return the tail slice of ``messages`` covering at most ``window``
+    user+assistant turns (tool messages ride along inside the window).
+
+    Iterating from the end, we count user and assistant messages and
+    keep everything from the first message that falls within the window.
+    """
+    count = 0
+    cut = 0
+    for i in range(len(messages) - 1, -1, -1):
+        msg = messages[i]
+        if isinstance(msg, Mapping) and msg.get("role") in ("user", "assistant"):
+            count += 1
+            if count >= window:
+                cut = i
+                break
+    else:
+        return list(messages)
+    return list(messages[cut:])
+
+
+def _shortened_path(path: str) -> str:
+    """Show a path relative to cwd when possible, otherwise with ~ expansion."""
+    if not path:
+        return path
+    try:
+        abs_path = os.path.abspath(os.path.expanduser(path))
+        cwd = os.getcwd()
+        if abs_path == cwd:
+            return "."
+        if abs_path.startswith(cwd + os.sep):
+            return abs_path[len(cwd) + 1 :]
+        home = os.path.expanduser("~")
+        if abs_path.startswith(home + os.sep):
+            return "~/" + abs_path[len(home) + 1 :]
+        return abs_path
+    except Exception:
+        return path
+
+
+def _summarise_tool_activity(
+    tool_calls: Sequence[Tuple[str, Mapping[str, Any]]],
+) -> Tuple[List[Tuple[str, int]], List[str]]:
+    """Return ``(tool_counts_sorted, recently_edited_files)``.
+
+    ``tool_counts_sorted`` is descending by count, keeping the full list
+    so callers can truncate for display. ``recently_edited_files`` lists
+    distinct paths (most recent first) from file-editing tools.
+    """
+    counter: Counter[str] = Counter()
+    files_seen: List[str] = []
+    files_set: set[str] = set()
+    # Walk in reverse so "most recent first" drops out of order-preserved iteration.
+    for name, args in reversed(list(tool_calls)):
+        counter[name] += 1
+        arg_key = _FILE_EDIT_TOOLS.get(name)
+        if arg_key:
+            path = args.get(arg_key)
+            if isinstance(path, str) and path and path not in files_set:
+                files_set.add(path)
+                files_seen.append(_shortened_path(path))
+    # Restore "reverse of reverse" for correct counts; Counter ignores order
+    # so only files_seen needed the reversal. Fix ordering: currently
+    # files_seen is newest→oldest which is what we want for display.
+    tool_counts = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
+    return tool_counts, files_seen
+
+
+def _truncate(text: str, limit: int) -> str:
+    text = " ".join(text.split())  # collapse newlines for a compact one-liner
+    if len(text) <= limit:
+        return text
+    return text[: limit - 1].rstrip() + "…"
+
+
+def build_recap(
+    messages: Sequence[Mapping[str, Any]],
+    *,
+    session_title: Optional[str] = None,
+    session_id: Optional[str] = None,
+    platform: Optional[str] = None,
+) -> str:
+    """Build a multi-line recap of recent activity.
+
+    Inputs:
+        messages: the full conversation history as a list of
+            chat-completion-style dicts (``role``, ``content``,
+            ``tool_calls``, …).
+        session_title: optional human title (from SessionDB).
+        session_id: optional session id.
+        platform: optional hint (``"cli"``, ``"telegram"``, …). Does not
+            change behavior today but is accepted for forward compat.
+
+    The output is plain text designed to render well in both a terminal
+    (with 80-col wrapping) and a gateway message bubble.
+    """
+    _ = platform  # reserved for future use
+    lines: List[str] = []
+
+    header_bits: List[str] = ["Session recap"]
+    if session_title:
+        header_bits.append(f"— {session_title}")
+    elif session_id:
+        header_bits.append(f"— {session_id[:8]}")
+    lines.append(" ".join(header_bits))
+
+    if not messages:
+        lines.append("  (nothing to recap — no messages yet)")
+        return "\n".join(lines)
+
+    users, assistants, tool_msgs = _count_visible_turns(messages)
+    window = _recent_window(messages)
+    win_users, win_assistants, _ = _count_visible_turns(window)
+
+    scope = (
+        f"{win_users} user turn{'s' if win_users != 1 else ''} / "
+        f"{win_assistants} assistant repl{'ies' if win_assistants != 1 else 'y'}"
+    )
+    if (users, assistants) != (win_users, win_assistants):
+        scope += f" (of {users}/{assistants} total)"
+    lines.append(f"  Recent: {scope}, {tool_msgs} tool result{'s' if tool_msgs != 1 else ''}")
+
+    tool_calls = list(_iter_assistant_tool_calls(window))
+    tool_counts, files = _summarise_tool_activity(tool_calls)
+    if tool_counts:
+        top = ", ".join(f"{name}×{count}" for name, count in tool_counts[:5])
+        extra = len(tool_counts) - 5
+        if extra > 0:
+            top += f" (+{extra} more)"
+        lines.append(f"  Tools used: {top}")
+    if files:
+        shown = files[:_MAX_FILES_LISTED]
+        extra = len(files) - len(shown)
+        entry = ", ".join(shown)
+        if extra > 0:
+            entry += f" (+{extra} more)"
+        lines.append(f"  Files touched: {entry}")
+
+    latest_user = _latest_user_prompt(window)
+    if latest_user:
+        lines.append(f"  Last ask: {_truncate(latest_user, _PROMPT_PREVIEW_CHARS)}")
+
+    latest_reply = _latest_assistant_text(window)
+    if latest_reply:
+        lines.append(f"  Last reply: {_truncate(latest_reply, _ASSISTANT_PREVIEW_CHARS)}")
+
+    if len(lines) == 2:
+        # Only the header + scope line — nothing substantive to show.
+        lines.append("  (no assistant activity yet in this window)")
+
+    return "\n".join(lines)
+
+
+__all__ = ["build_recap"]
@@ -15,7 +15,6 @@ import importlib.util
 import json
 import logging
 import os
-import re
 import shutil
 import sys
 import copy
@@ -209,23 +208,12 @@ def prompt(question: str, default: str = None, password: bool = False) -> str:
        else:
            value = input(color(display, Colors.YELLOW))

-        cleaned = _sanitize_pasted_input(value)
-        return cleaned.strip() or default or ""
+        return value.strip() or default or ""
    except (KeyboardInterrupt, EOFError):
        print()
        sys.exit(1)


-_BRACKETED_PASTE_PATTERN = re.compile(r"\x1b\[\s*200~|\x1b\[\s*201~")
-
-
-def _sanitize_pasted_input(value: str) -> str:
-    """Strip terminal bracketed-paste control markers from pasted text."""
-    if not isinstance(value, str) or not value:
-        return value
-    return _BRACKETED_PASTE_PATTERN.sub("", value)
-
-
 def _curses_prompt_choice(question: str, choices: list, default: int = 0, description: str | None = None) -> int:
    """Single-select menu using curses. Delegates to curses_radiolist."""
    from hermes_cli.curses_ui import curses_radiolist
@@ -976,8 +964,7 @@ def setup_model_provider(config: dict, *, quick: bool = False):
                    )
                else:
                    _selected_vision_model = prompt("  Vision model (blank = use main/custom default)").strip()
-                if _selected_vision_model:
-                    save_env_value("AUXILIARY_VISION_MODEL", _selected_vision_model)
+                save_env_value("AUXILIARY_VISION_MODEL", _selected_vision_model)
                print_success(
                    f"Vision configured with {_base_url}"
                    + (f" ({_selected_vision_model})" if _selected_vision_model else "")
@@ -1203,13 +1190,6 @@ def _setup_tts_provider(config: dict):
                    "Falling back to Edge TTS."
                )
                selected = "edge"
-        if selected == "xai":
-            print()
-            voice_id = prompt("xAI voice_id (Enter for 'eve', or paste a custom voice ID)")
-            if voice_id and voice_id.strip():
-                config.setdefault("tts", {}).setdefault("xai", {})["voice_id"] = voice_id.strip()
-                print_success(f"xAI voice_id set to: {voice_id.strip()}")
-

    elif selected == "minimax":
        existing = get_env_value("MINIMAX_API_KEY")
@@ -1341,13 +1321,15 @@ def setup_terminal_backend(config: dict):
        print_success("Terminal backend: Local")
        print_info("Commands run directly on this machine.")

-        # Gateway/cron working directory
+        # CWD for messaging
        print()
-        print_info("Gateway working directory:")
-        print_info("  Used by Telegram/Discord/cron sessions.")
-        print_info("  CLI/TUI always uses your launch directory instead.")
+        print_info("Working directory for messaging sessions:")
+        print_info("  When using Hermes via Telegram/Discord, this is where")
+        print_info(
+            "  the agent starts. CLI mode always starts in the current directory."
+        )
        current_cwd = cfg_get(config, "terminal", "cwd", default="")
-        cwd = prompt("  Gateway working directory", current_cwd or str(Path.home()))
+        cwd = prompt("  Messaging working directory", current_cwd or str(Path.home()))
        if cwd:
            config["terminal"]["cwd"] = cwd

@@ -1661,11 +1643,7 @@ def setup_terminal_backend(config: dict):
 def _apply_default_agent_settings(config: dict):
    """Apply recommended defaults for all agent settings without prompting."""
    config.setdefault("agent", {})["max_turns"] = 90
-    # config.yaml is the authoritative source for max_turns; the gateway
-    # bridges it into HERMES_MAX_ITERATIONS at startup. We no longer write
-    # to .env to avoid the dual-source inconsistency that caused the
-    # 60-vs-500 bug (stale .env entry silently shadowing config.yaml).
-    remove_env_value("HERMES_MAX_ITERATIONS")
+    save_env_value("HERMES_MAX_ITERATIONS", "90")

    config.setdefault("display", {})["tool_progress"] = "all"

@@ -1695,10 +1673,9 @@ def setup_agent_settings(config: dict):
    print()

    # ── Max Iterations ──
-    # config.yaml is authoritative; read from there. If a legacy .env
-    # entry is still around (from pre-PR#18413 setups), prefer the
-    # config value so we don't surface a stale number to the user.
-    current_max = str(cfg_get(config, "agent", "max_turns", default=90))
+    current_max = get_env_value("HERMES_MAX_ITERATIONS") or str(
+        cfg_get(config, "agent", "max_turns", default=90)
+    )
    print_info("Maximum tool-calling iterations per conversation.")
    print_info("Higher = more complex tasks, but costs more tokens.")
    print_info(
@@ -1709,13 +1686,9 @@ def setup_agent_settings(config: dict):
    try:
        max_iter = int(max_iter_str)
        if max_iter > 0:
-            # Write to config.yaml (authoritative) only. Also clean up any
-            # stale .env entry from earlier setup runs — the gateway's
-            # bridge in gateway/run.py now unconditionally derives
-            # HERMES_MAX_ITERATIONS from agent.max_turns at startup.
+            save_env_value("HERMES_MAX_ITERATIONS", str(max_iter))
            config.setdefault("agent", {})["max_turns"] = max_iter
            config.pop("max_turns", None)
-            remove_env_value("HERMES_MAX_ITERATIONS")
            print_success(f"Max iterations set to {max_iter}")
    except ValueError:
        print_warning("Invalid number, keeping current value")
@@ -2060,16 +2033,6 @@ def _setup_slack():
        print_warning("⚠️  No Slack allowlist set - unpaired users will be denied by default.")
        print_info("   Set SLACK_ALLOW_ALL_USERS=true or GATEWAY_ALLOW_ALL_USERS=true only if you intentionally want open workspace access.")

-    print()
-    print_info("📬 Home Channel: where Hermes delivers cron job results,")
-    print_info("   cross-platform messages, and notifications.")
-    print_info("   To get a channel ID: open the channel in Slack, then right-click")
-    print_info("   the channel name → Copy link — the ID starts with C (e.g. C01ABC2DE3F).")
-    print_info("   You can also set this later by typing /set-home in a Slack channel.")
-    home_channel = prompt("Home channel ID (leave empty to set later with /set-home)")
-    if home_channel:
-        save_env_value("SLACK_HOME_CHANNEL", home_channel.strip())
-

 def _write_slack_manifest_and_instruct():
    """Generate the Slack manifest, write it under HERMES_HOME, and print
@@ -3016,21 +2979,6 @@ def run_setup_wizard(args):
    config = load_config()
    hermes_home = get_hermes_home()

-    # Back up existing config before setup modifies it (#3522)
-    config_path = get_config_path()
-    if config_path.exists():
-        from datetime import datetime as _dt
-        _backup_path = config_path.with_suffix(
-            f".yaml.bak.{_dt.now().strftime('%Y%m%d_%H%M%S')}"
-        )
-        try:
-            import shutil
-            shutil.copy2(config_path, _backup_path)
-        except Exception:
-            _backup_path = None
-    else:
-        _backup_path = None
-
    # Detect non-interactive environments (headless SSH, Docker, CI/CD)
    non_interactive = getattr(args, 'non_interactive', False)
    if not non_interactive and not is_interactive_stdin():
@@ -3200,10 +3148,6 @@ def run_setup_wizard(args):

    # Save and show summary
    save_config(config)
-    if _backup_path and _backup_path.exists():
-        print_info(f"Previous config backed up to: {_backup_path}")
-        print_info("If setup changed a value you customized, restore it with:")
-        print_info(f"  cp {_backup_path} {config_path}")
    _print_setup_summary(config, hermes_home)

    _offer_launch_chat()
@@ -122,16 +122,11 @@ def show_status(args):
    print()
    print(color("◆ API Keys", Colors.CYAN, Colors.BOLD))

-    # Values may be a single env var name (str) or a tuple of alternates (first found wins).
-    keys: dict[str, str | tuple[str, ...]] = {
+    keys = {
        "OpenRouter": "OPENROUTER_API_KEY",
        "OpenAI": "OPENAI_API_KEY",
-        "Anthropic": ("ANTHROPIC_API_KEY", "ANTHROPIC_TOKEN"),
-        "Google / Gemini": ("GOOGLE_API_KEY", "GEMINI_API_KEY"),
-        "DeepSeek": "DEEPSEEK_API_KEY",
-        "xAI / Grok": "XAI_API_KEY",
-        "NVIDIA NIM": "NVIDIA_API_KEY",
-        "Z.AI / GLM": "GLM_API_KEY",
+        "NVIDIA": "NVIDIA_API_KEY",
+        "Z.AI/GLM": "GLM_API_KEY",
        "Kimi": "KIMI_API_KEY",
        "StepFun Step Plan": "STEPFUN_API_KEY",
        "MiniMax": "MINIMAX_API_KEY",
@@ -147,23 +142,8 @@ def show_status(args):
        "GitHub": "GITHUB_TOKEN",
    }

-    def _resolve_env(env_ref) -> str:
-        """Return first non-empty env var value from a str or tuple of names."""
-        if isinstance(env_ref, tuple):
-            for candidate in env_ref:
-                v = get_env_value(candidate) or ""
-                if v:
-                    return v
-            return ""
-        return get_env_value(env_ref) or ""
-
-    for name, env_ref in keys.items():
-        # Anthropic already has a dedicated lookup below; keep that as the
-        # single source of truth (it also resolves OAuth tokens), skip here
-        # so we don't print two "Anthropic" rows.
-        if name == "Anthropic":
-            continue
-        value = _resolve_env(env_ref)
+    for name, env_var in keys.items():
+        value = get_env_value(env_var) or ""
        has_key = bool(value)
        display = redact_key(value) if not show_all else value
        print(f"  {name:<12}  {check_mark(has_key)} {display}")
@@ -334,144 +334,6 @@ TIPS = [
    "MCP ${ENV_VAR} placeholders in config are resolved at server spawn — including vars from ~/.hermes/.env.",
    "Skills from trusted repos (NousResearch) get a 'trusted' security level; community skills get extra scanning.",
    "The skills quarantine at ~/.hermes/skills/.hub/quarantine/ holds skills pending security review.",
-
-    # --- Advanced Slash Commands ---
-    '/steer <prompt> injects a note after the next tool call — nudge direction mid-task without interrupting.',
-    '/goal <text> sets a standing Ralph-loop objective — Hermes auto-continues turn after turn until a judge says done.',
-    '/snapshot create [label] saves a full state snapshot of Hermes config; /snapshot restore <id> reverts later.',
-    '/copy [N] copies the last assistant response to your clipboard, or the Nth-from-last with a number.',
-    '/redraw forces a full UI repaint, fixing terminal drift after tmux resize or mouse selection artifacts.',
-    '/agents (alias /tasks) shows active agents and running background tasks across the current session.',
-    '/footer toggles the gateway footer on final replies showing model, tool counts, and turn timing.',
-    '/busy queue|steer|interrupt controls what pressing Enter does while Hermes is working.',
-    '/topic in Telegram DMs enables user-managed multi-session topic mode — /topic <id> restores past sessions inline.',
-    '/approve session|always runs a pending dangerous command with your chosen trust scope; /deny rejects it.',
-    '/restart gracefully restarts the gateway after draining active runs, then pings the requester when back up.',
-    '/kanban boards switch <slug> changes the active multi-project Kanban board from inside chat.',
-    '/reload reloads ~/.hermes/.env into the running session — pick up new API keys without restarting.',
-
-    # --- Cron (no-agent & scripts) ---
-    'cronjob with no_agent=True runs a script on schedule and sends its stdout directly — zero tokens, zero LLM.',
-    'An empty cron script stdout means silent tick — nothing is delivered, perfect for threshold watchdogs.',
-    "HERMES_CRON_MAX_PARALLEL (default 4) caps how many cron jobs run per tick so bursts don't saturate your keys.",
-
-    # --- Gateway Hooks ---
-    'Gateway hooks live under ~/.hermes/hooks/<name>/ with HOOK.yaml + handler.py — handler must be named `handle`.',
-    'Hook events include gateway:startup, session:start, agent:step, and command:* wildcard subscriptions.',
-    'Drop a ~/.hermes/BOOT.md checklist and a gateway:startup hook runs it as a one-shot agent every boot.',
-
-    # --- Curator ---
-    'hermes curator run --dry-run previews what the curator would archive or consolidate without mutating anything.',
-    "hermes curator pin <skill> hard-fences a skill against both auto-archival and the agent's skill_manage tool.",
-    'hermes curator rollback restores skills from a pre-run snapshot — backups live under skills/.curator_backups/.',
-
-    # --- Credential Pools & Routing ---
-    'hermes auth reset <provider> clears all cooldowns and exhaustion flags on a credential pool.',
-    'credential_pool_strategies.<provider>: round_robin cycles keys evenly instead of the fill_first default.',
-    'use_gateway: true per-tool routes web, image, tts, or browser through your Nous subscription — no extra keys.',
-    'provider_routing.data_collection: deny excludes data-storing providers on OpenRouter.',
-    'provider_routing.require_parameters: true only routes to providers that support every param in your request.',
-
-    # --- TUI & Dashboard ---
-    'HERMES_TUI_RESUME=1 auto-re-attaches to the most recent TUI session on launch — handy after SSH drops.',
-    "HERMES_TUI_THEME=light|dark|<hex> forces the TUI theme on terminals that don't set COLORFGBG.",
-    'Ctrl+G or Ctrl+X Ctrl+E in the TUI opens the input buffer in $EDITOR for long multi-line prompts.',
-    'The TUI renders LaTeX inline — $E=mc^2$ becomes Unicode math instead of raw TeX.',
-    'hermes dashboard launches a local web UI at 127.0.0.1:9119 — zero data leaves localhost.',
-    'hermes dashboard --tui embeds the full Hermes TUI in your browser via xterm.js and a WebSocket PTY.',
-    'Drop a YAML in ~/.hermes/dashboard-themes/ with two palette colors to reskin the entire dashboard.',
-    'Dashboard plugins are drop-in: manifest.json + JS bundle in ~/.hermes/dashboard-plugins/ — no npm build required.',
-    'layoutVariant: cockpit in a dashboard theme adds a 260px left rail that plugins can populate via the sidebar slot.',
-
-    # --- Env Vars & Config Gates ---
-    "display.tool_progress_command: true exposes /verbose on messaging platforms; it's CLI-only by default.",
-    'HERMES_BACKGROUND_NOTIFICATIONS=result only pings when background tasks finish (vs all/error/off).',
-    'HERMES_WRITE_SAFE_ROOT restricts write_file and patch to a directory prefix; writes outside require approval.',
-    'HERMES_IGNORE_RULES skips auto-injection of AGENTS.md, SOUL.md, .cursorrules, memory, and preloaded skills.',
-    'HERMES_ACCEPT_HOOKS auto-approves unseen shell hooks declared in config.yaml without a TTY prompt.',
-    'auxiliary.goal_judge.model routes the /goal judge to a cheap fast model to keep loop cost near zero.',
-    'Checkpoints skip directories with more than 50,000 files to avoid slow git operations on massive monorepos.',
-
-    # --- TTS ---
-    'tts.provider: piper runs 44-language local TTS on CPU — voices auto-download to ~/.hermes/cache/piper-voices/.',
-    'tts.providers.<name>.type: command wires any CLI TTS engine with {input_path} and {output_path} placeholders.',
-
-    # --- API Server & Proxy ---
-    'API_SERVER_ENABLED=true runs an OpenAI-compatible endpoint alongside the gateway for Open WebUI and LibreChat.',
-    'GATEWAY_PROXY_URL runs a split setup: platform I/O locally, agent work delegated to a remote API server.',
-
-    # --- Platform-specific ---
-    'MATRIX_DEVICE_ID pins a stable device ID for E2EE — without it, keys rotate every start and historic decrypt breaks.',
-    'TELEGRAM_WEBHOOK_SECRET is required whenever TELEGRAM_WEBHOOK_URL is set — generate with openssl rand -hex 32.',
-
-    # --- Batch ---
-    "batch_runner.py --resume content-matches completed prompts by text so dataset reorders don't re-run finished work.",
-
-    # --- Less-Known Slash Commands ---
-    '/new starts a fresh session in place (alias /reset) — fresh session ID, clean history, CLI stays open.',
-    '/clear wipes the terminal screen AND starts a new session — one shortcut for a visual reset.',
-    '/history prints the current conversation in-line without leaving the CLI — useful for a quick re-read.',
-    '/save writes the current conversation to disk without ending the session.',
-    '/status shows session info at a glance: ID, title, model, token usage, and elapsed time.',
-    '/image <path> attaches a local image file for your next prompt without pasting or drag-and-drop.',
-    '/platforms shows gateway and messaging-platform connection status right from inside chat.',
-    '/commands paginates the full slash-command + installed-skill list — useful on platforms without tab completion.',
-    '/toolsets lists every available toolset so you know what -t/--toolsets accepts.',
-    '/gquota shows Google Gemini Code Assist quota usage with progress bars when that provider is active.',
-    '/voice tts toggles TTS-only mode — agent replies out loud but you still type your prompts.',
-    '/reload-skills re-scans ~/.hermes/skills/ so drop-in skills appear without restarting the session.',
-    '/indicator kaomoji|emoji|unicode|ascii picks the TUI busy-indicator style shown during agent runs.',
-    '/debug uploads a support bundle (system info + logs) and returns shareable links — works in chat too.',
-
-    # --- CLI Subcommands & Flags ---
-    'hermes -z "<prompt>" is the purest one-shot: final answer on stdout, nothing else — ideal for piping in scripts.',
-    'hermes chat --pass-session-id injects the session ID into the system prompt so the agent can self-reference it.',
-    'hermes chat --image path/to/pic.png attaches a local image to a single -q query without a separate upload step.',
-    'hermes chat --ignore-user-config skips ~/.hermes/config.yaml — reproducible bug reports and CI runs.',
-    "hermes chat --source tool tags programmatic chats so they don't clutter hermes sessions list.",
-    'hermes dump --show-keys includes redacted API key fingerprints for deeper support debugging.',
-    'hermes sessions rename <ID> "new title" renames any past session; hermes sessions delete <ID> removes one.',
-    'hermes import restores a session export or profile archive produced by sessions export or profile export.',
-    'hermes fallback manages the fallback_model chain interactively — no hand-editing config.yaml.',
-    'hermes pairing rotates the DM pairing token — the first messager after rotation claims access to the bot.',
-    'hermes setup walks first-time users through provider, keys, and platform wiring in one interactive flow.',
-    'hermes status --deep runs the full health sweep across every component; plain hermes status is the quick view.',
-
-    # --- Agent Behavior Env Vars ---
-    'HERMES_AGENT_TIMEOUT=0 disables the gateway inactivity kill for a running agent — use for long research runs.',
-    'HERMES_ENABLE_PROJECT_PLUGINS=1 auto-loads repo-local plugins from ./.hermes/plugins/ — trust-gated by design.',
-    "HERMES_DISABLE_FILE_STATE_GUARD=1 turns off the 'file changed since you read it' guard on patch and write_file.",
-    'HERMES_ALLOW_PRIVATE_URLS=true lets web tools hit localhost and private networks — off by default in gateway mode.',
-    'HERMES_OPTIONAL_SKILLS=name1,name2 auto-installs extra optional-catalog skills on first run per profile.',
-    'HERMES_BUNDLED_SKILLS points at a custom bundled-skill tree — used by Homebrew and Nix packaging.',
-    'HERMES_DUMP_REQUEST_STDOUT=1 dumps every API request payload to stdout instead of log files.',
-    'HERMES_OAUTH_TRACE=1 logs redacted OAuth token exchange and refresh attempts for debugging provider auth.',
-    'HERMES_STREAM_RETRIES (default 3) controls mid-stream reconnect attempts on transient network errors.',
-
-    # --- Gateway Behavior Env Vars ---
-    'HERMES_GATEWAY_BUSY_ACK_ENABLED=false silences the ⚡/⏳/⏩ ack messages when a user messages a busy agent.',
-    'HERMES_AGENT_NOTIFY_INTERVAL (default 180s) sets how often the gateway pings with progress on long turns.',
-    'HERMES_RESTART_DRAIN_TIMEOUT (default 900s) caps how long /restart waits for in-flight runs before forcing.',
-    'HERMES_CHECKPOINT_TIMEOUT (default 30s) caps filesystem checkpoint creation — raise it on huge monorepos.',
-
-    # --- Auxiliary Tasks & Image Generation ---
-    'image_gen.model in config.yaml picks the FAL model: flux-2/klein, gpt-image-2, nano-banana-pro, and more.',
-    'image_gen.provider routes image generation through a plugin (OpenAI Images, Codex, FAL) instead of the default.',
-    'AUXILIARY_VISION_BASE_URL + AUXILIARY_VISION_API_KEY point vision analysis at any OpenAI-compatible endpoint.',
-    'auxiliary.session_search.max_concurrency bounds how many matched sessions are summarized in parallel (default 3).',
-    'auxiliary.session_search.extra_body forwards provider-specific OpenAI-compatible fields on summarization calls.',
-
-    # --- Security ---
-    'security.tirith_fail_open: false makes Hermes block commands when the tirith scanner itself errors out.',
-    'TIRITH_FAIL_OPEN env var overrides the tirith_fail_open config — a quick toggle without editing config.yaml.',
-
-    # --- Sessions & Source Tags ---
-    '--source tool chats are excluded from hermes sessions list by default — set --source explicitly to see them.',
-    'Session IDs are timestamp-prefixed (20250305_091523_abcd) so sorting works naturally in ls and jq.',
-
-    # --- Misc ---
-    'API_SERVER_MODEL_NAME customizes the model name on /v1/models — essential for multi-profile Open WebUI setups.',
-    'Dashboard plugins are served from /dashboard-plugins/<name>/ — drop files into ~/.hermes/dashboard-plugins/.',
 ]


@@ -56,7 +56,6 @@ CONFIGURABLE_TOOLSETS = [
    ("file",            "📁 File Operations",           "read, write, patch, search"),
    ("code_execution",  "⚡ Code Execution",            "execute_code"),
    ("vision",          "👁️  Vision / Image Analysis",  "vision_analyze"),
-    ("video",           "🎬 Video Analysis",            "video_analyze (requires video-capable model)"),
    ("image_gen",       "🎨 Image Generation",          "image_generate"),
    ("moa",             "🧠 Mixture of Agents",         "mixture_of_agents"),
    ("tts",             "🔊 Text-to-Speech",            "text_to_speech"),
@@ -79,7 +78,7 @@ CONFIGURABLE_TOOLSETS = [
 # Toolsets that are OFF by default for new installs.
 # They're still in _HERMES_CORE_TOOLS (available at runtime if enabled),
 # but the setup checklist won't pre-select them for first-time users.
-_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin", "video"}
+_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin"}

 # Platform-scoped toolsets: only appear in the `hermes tools` checklist for
 # these platforms, and only resolve/save for these platforms.  A toolset
@@ -1823,7 +1822,7 @@ def _reconfigure_tool(config: dict):
        cat = TOOL_CATEGORIES.get(ts_key)
        reqs = TOOLSET_ENV_REQUIREMENTS.get(ts_key)
        if cat or reqs:
-            if _toolset_has_keys(ts_key, config) or _toolset_enabled_for_reconfigure(ts_key, config):
+            if _toolset_has_keys(ts_key, config):
                configurable.append((ts_key, ts_label))

    if not configurable:
@@ -1849,28 +1848,6 @@ def _reconfigure_tool(config: dict):
    save_config(config)


-def _toolset_enabled_for_reconfigure(ts_key: str, config: dict) -> bool:
-    """Return True if a configurable toolset is enabled anywhere.
-
-    Reconfigure must include enabled-but-unconfigured categories so users can
-    finish provider/API-key setup without disabling and re-enabling the toolset.
-    """
-    for platform in PLATFORMS:
-        if not _toolset_allowed_for_platform(ts_key, platform):
-            continue
-        try:
-            enabled = _get_platform_tools(
-                config,
-                platform,
-                include_default_mcp_servers=False,
-            )
-        except Exception:
-            continue
-        if ts_key in enabled:
-            return True
-    return False
-
-
 def _configure_tool_category_for_reconfig(ts_key: str, cat: dict, config: dict):
    """Reconfigure a tool category - provider selection + API key update."""
    icon = cat.get("icon", "")
@@ -1920,27 +1897,21 @@ def _reconfigure_provider(provider: dict, config: dict):
            return

    if provider.get("tts_provider"):
-        tts_cfg = config.setdefault("tts", {})
-        tts_cfg["provider"] = provider["tts_provider"]
-        tts_cfg["use_gateway"] = bool(managed_feature)
+        config.setdefault("tts", {})["provider"] = provider["tts_provider"]
        _print_success(f"  TTS provider set to: {provider['tts_provider']}")

    if "browser_provider" in provider:
        bp = provider["browser_provider"]
-        browser_cfg = config.setdefault("browser", {})
        if bp == "local":
-            browser_cfg["cloud_provider"] = "local"
+            config.setdefault("browser", {})["cloud_provider"] = "local"
            _print_success("  Browser set to local mode")
        elif bp:
-            browser_cfg["cloud_provider"] = bp
+            config.setdefault("browser", {})["cloud_provider"] = bp
            _print_success(f"  Browser cloud provider set to: {bp}")
-        browser_cfg["use_gateway"] = bool(managed_feature)

    # Set web search backend in config if applicable
    if provider.get("web_backend"):
-        web_cfg = config.setdefault("web", {})
-        web_cfg["backend"] = provider["web_backend"]
-        web_cfg["use_gateway"] = bool(managed_feature)
+        config.setdefault("web", {})["backend"] = provider["web_backend"]
        _print_success(f"  Web backend set to: {provider['web_backend']}")

    if managed_feature and managed_feature not in ("web", "tts", "browser"):
@@ -27,192 +27,6 @@ import sys
 import threading
 from typing import Any, Callable, Optional

-# Modifier aliases mirrored from the TUI parser (``ui-tui/src/lib/platform.ts``)
-# ``_MOD_ALIASES`` table — the contract that removes the cross-runtime
-# mismatch Copilot flagged in round-9 on #19835.
-#
-# ``super``/``win``/``windows`` are intentionally absent: prompt_toolkit
-# has no super/meta modifier for the Cmd key, so those spellings are
-# TUI-only. The normalizer below returns the documented default
-# (``c-b``) for them — a silent fallback was preferred to a hard
-# startup crash (Copilot round-11). The CLI binding site
-# (``_register_voice_handler`` in cli.py) logs a warning when that
-# fallback fires so users see why their TUI-only shortcut isn't
-# bound in the classic CLI.
-_VOICE_MOD_ALIASES = {
-    "ctrl": "c-",
-    "control": "c-",
-    "alt": "a-",
-    "option": "a-",
-    "opt": "a-",
-}
-
-# Named keys prompt_toolkit accepts in ``c-<name>`` / ``a-<name>`` form.
-# Aliases collapse to prompt_toolkit's canonical spelling so the same
-# config value binds identically in both runtimes (Copilot round-10 on
-# #19835).
-_VOICE_NAMED_KEYS = {
-    "space": "space",
-    "spc": "space",
-    "enter": "enter",
-    "return": "enter",
-    "ret": "enter",
-    "tab": "tab",
-    "escape": "escape",
-    "esc": "escape",
-    "backspace": "backspace",
-    "bs": "backspace",
-    "delete": "delete",
-    "del": "delete",
-}
-
-# ``useInputHandlers()`` intercepts these before the voice check runs,
-# so a binding like ``ctrl+c`` (interrupt), ``ctrl+d`` (quit), or
-# ``ctrl+l`` (clear screen) would be advertised in /voice status but
-# never fire push-to-talk — the same blocklist the TUI parser uses.
-_VOICE_RESERVED_CTRL_CHARS = frozenset({"c", "d", "l"})
-
-# On macOS the classic CLI's prompt_toolkit bindings for copy / exit /
-# clear also claim ``a-c`` / ``a-d`` / ``a-l`` via the action-modifier
-# lookup, and hermes-ink reports Alt as ``key.meta`` on many terminals.
-# Mirror the TUI parser's darwin-only reservation so ``option+c`` etc.
-# don't bind Alt+C in the CLI while the TUI silently falls back to
-# Ctrl+B (Copilot round-14 on #19835).
-_VOICE_RESERVED_ALT_CHARS_MAC = frozenset({"c", "d", "l"})
-
-_DEFAULT_PT_KEY = "c-b"
-
-
-def voice_record_key_from_config(cfg: Any) -> Any:
-    """Shape-safe ``cfg.voice.record_key`` lookup.
-
-    ``load_config()`` deep-merges raw YAML and preserves scalar
-    overrides, so a hand-edited ``voice: true`` / ``voice: cmd+b``
-    leaves ``cfg["voice"]`` as a bool/str instead of a dict, and the
-    naive ``.get("voice", {}).get("record_key")`` chain raises
-    AttributeError before voice can even start (Copilot round-11 on
-    #19835). Return ``None`` for malformed shapes so call sites can
-    feed the result straight into the normalizer/formatter and get
-    the documented default.
-    """
-    if not isinstance(cfg, dict):
-        return None
-
-    voice = cfg.get("voice")
-    if not isinstance(voice, dict):
-        return None
-
-    return voice.get("record_key")
-
-
-def normalize_voice_record_key_for_prompt_toolkit(raw: Any) -> str:
-    """Coerce ``voice.record_key`` into prompt_toolkit's ``c-x`` / ``a-x`` format.
-
-    Mirrors the TUI parser contract (``ui-tui/src/lib/platform.ts``)
-    so one config value binds the same shortcut in both runtimes:
-
-    * non-string / empty / typo'd / bare-char / multi-modifier / reserved
-      ``ctrl+c|d|l`` → documented default ``c-b``
-    * single-char keys: ``ctrl+o`` → ``c-o``
-    * named keys: ``ctrl+space`` → ``c-space`` (aliases collapse:
-      ``ctrl+return`` → ``c-enter``)
-    * ``super`` / ``win`` / ``windows`` → ``c-b`` (TUI-only modifiers —
-      prompt_toolkit has no super mod; the CLI binding site is
-      expected to warn when this fallback fires so users see the
-      cross-runtime split, Copilot round-11 on #19835)
-    """
-    if not isinstance(raw, str):
-        return _DEFAULT_PT_KEY
-
-    lowered = raw.strip().lower()
-    if not lowered:
-        return _DEFAULT_PT_KEY
-
-    parts = [p.strip() for p in lowered.split("+") if p.strip()]
-    if not parts:
-        return _DEFAULT_PT_KEY
-
-    # Multi-modifier chords like ``ctrl+alt+r`` bind different shortcuts
-    # in prompt_toolkit (a-c-r form) and hermes-ink rejects them; collapse
-    # to the documented default instead of silently diverging.
-    if len(parts) > 2:
-        return _DEFAULT_PT_KEY
-
-    # Bare char / bare named key (no explicit modifier) — the CLI's
-    # prompt_toolkit binds the raw key without a modifier, which the TUI
-    # parser refuses; reject here too so both runtimes agree.
-    if len(parts) == 1:
-        return _DEFAULT_PT_KEY
-
-    modifier_token, key_token = parts
-
-    # ``super`` / ``win`` / ``windows`` are TUI-only (prompt_toolkit has
-    # no super modifier, so ``@kb.add(super+b)`` crashes the CLI at
-    # startup). Fall back to the documented default here; the CLI
-    # binding site is expected to log a warning when the configured
-    # value is one of these spellings so users know the TUI+CLI
-    # runtimes diverge on that shortcut (Copilot round-11 on #19835).
-    if modifier_token in {"super", "win", "windows"}:
-        return _DEFAULT_PT_KEY
-
-    normalized_mod = _VOICE_MOD_ALIASES.get(modifier_token)
-    if not normalized_mod:
-        return _DEFAULT_PT_KEY
-
-    # Single-char key: reject reserved-ctrl chords that the TUI would
-    # also block at parse time, plus the mac-only alt reservation.
-    if len(key_token) == 1:
-        if normalized_mod == "c-" and key_token in _VOICE_RESERVED_CTRL_CHARS:
-            return _DEFAULT_PT_KEY
-        if (
-            normalized_mod == "a-"
-            and sys.platform == "darwin"
-            and key_token in _VOICE_RESERVED_ALT_CHARS_MAC
-        ):
-            return _DEFAULT_PT_KEY
-        return f"{normalized_mod}{key_token}"
-
-    # Multi-char key token must be a known named key; typos like
-    # ``ctrl+spcae`` fall back to the default rather than being passed
-    # through as ``c-spcae`` (which prompt_toolkit would reject).
-    named = _VOICE_NAMED_KEYS.get(key_token)
-    if not named:
-        return _DEFAULT_PT_KEY
-
-    return f"{normalized_mod}{named}"
-
-
-def format_voice_record_key_for_status(raw: Any) -> str:
-    """Render ``voice.record_key`` for ``/voice status`` in CLI-friendly form.
-
-    Mirrors the TUI's ``formatVoiceRecordKey``: returns ``Ctrl+B`` /
-    ``Alt+Space`` / ``Ctrl+Enter``. Malformed configs surface as the
-    documented default so status never advertises a shortcut that
-    won't bind (Copilot round-10 on #19835).
-    """
-    normalized = normalize_voice_record_key_for_prompt_toolkit(raw)
-
-    if normalized.startswith("c-"):
-        prefix, key = "Ctrl+", normalized[2:]
-    elif normalized.startswith("a-"):
-        prefix, key = "Alt+", normalized[2:]
-    elif "+" in normalized:
-        # ``super+<key>`` / ``win+<key>`` — CLI won't bind them, but
-        # render in title case so status output is still readable.
-        mod, key = normalized.split("+", 1)
-        prefix = mod[0].upper() + mod[1:] + "+"
-    else:
-        return "Ctrl+B"
-
-    if not key:
-        return prefix.rstrip("+")
-
-    if len(key) == 1:
-        return prefix + key.upper()
-
-    return prefix + key[0].upper() + key[1:]
-
-
 from tools.voice_mode import (
    create_audio_recorder,
    is_whisper_hallucination,
@@ -470,23 +470,10 @@ except (ValueError, TypeError):
    )
    _GATEWAY_HEALTH_TIMEOUT = 3.0

-# DEPRECATED (scheduled for removal): GATEWAY_HEALTH_URL / GATEWAY_HEALTH_TIMEOUT.
-# Cross-container / cross-host gateway liveness detection will be folded into a
-# first-class dashboard config key so it's no longer Docker-adjacent lore buried
-# in env vars.  The env vars still work for now so existing Compose deployments
-# don't break.  Do not add new callers — wire new uses through the planned
-# config surface.
-

 def _probe_gateway_health() -> tuple[bool, dict | None]:
    """Probe the gateway via its HTTP health endpoint (cross-container).

-    .. deprecated::
-        Driven by the deprecated ``GATEWAY_HEALTH_URL`` /
-        ``GATEWAY_HEALTH_TIMEOUT`` env vars.  Scheduled for removal alongside
-        a move to a first-class dashboard config key.  See
-        :data:`_GATEWAY_HEALTH_URL` for context.
-
    Uses ``/health/detailed`` first (returns full state), falling back to
    the simpler ``/health`` endpoint.  Returns ``(is_alive, body_dict)``.

@@ -2895,25 +2882,6 @@ _VALID_CHANNEL_RE = re.compile(r"^[A-Za-z0-9._-]{1,128}$")
 # loopback so tests don't need to rewrite request scope.
 _LOOPBACK_HOSTS = frozenset({"127.0.0.1", "::1", "localhost", "testclient"})

-
-def _is_public_bind() -> bool:
-    """True when bound to all-interfaces (operator used --insecure)."""
-    return getattr(app.state, "bound_host", "") in ("0.0.0.0", "::")
-
-
-def _ws_client_is_allowed(ws: "WebSocket") -> bool:
-    """Check if the WebSocket client IP is acceptable.
-
-    Allows loopback always; allows any IP when bound to all-interfaces
-    (--insecure mode, guarded by session token auth).
-    """
-    if _is_public_bind():
-        return True
-    client_host = ws.client.host if ws.client else ""
-    if not client_host:
-        return True
-    return client_host in _LOOPBACK_HOSTS
-
 # Per-channel subscriber registry used by /api/pub (PTY-side gateway → dashboard)
 # and /api/events (dashboard → browser sidebar).  Keyed by an opaque channel id
 # the chat tab generates on mount; entries auto-evict when the last subscriber
@@ -3004,7 +2972,8 @@ async def pty_ws(ws: WebSocket) -> None:
        await ws.close(code=4401)
        return

-    if not _ws_client_is_allowed(ws):
+    client_host = ws.client.host if ws.client else ""
+    if client_host and client_host not in _LOOPBACK_HOSTS:
        await ws.close(code=4403)
        return

@@ -3111,7 +3080,8 @@ async def gateway_ws(ws: WebSocket) -> None:
        await ws.close(code=4401)
        return

-    if not _ws_client_is_allowed(ws):
+    client_host = ws.client.host if ws.client else ""
+    if client_host and client_host not in _LOOPBACK_HOSTS:
        await ws.close(code=4403)
        return

@@ -3143,7 +3113,8 @@ async def pub_ws(ws: WebSocket) -> None:
        await ws.close(code=4401)
        return

-    if not _ws_client_is_allowed(ws):
+    client_host = ws.client.host if ws.client else ""
+    if client_host and client_host not in _LOOPBACK_HOSTS:
        await ws.close(code=4403)
        return

@@ -3172,7 +3143,8 @@ async def events_ws(ws: WebSocket) -> None:
        await ws.close(code=4401)
        return

-    if not _ws_client_is_allowed(ws):
+    client_host = ws.client.host if ws.client else ""
+    if client_host and client_host not in _LOOPBACK_HOSTS:
        await ws.close(code=4403)
        return

@@ -8,64 +8,14 @@ import os
 from pathlib import Path


-_profile_fallback_warned: bool = False
-
-
 def get_hermes_home() -> Path:
    """Return the Hermes home directory (default: ~/.hermes).

    Reads HERMES_HOME env var, falls back to ~/.hermes.
    This is the single source of truth — all other copies should import this.
-
-    When ``HERMES_HOME`` is unset but an ``active_profile`` file indicates
-    a non-default profile is active, logs a loud one-shot warning to
-    ``errors.log`` so cross-profile data corruption is diagnosable instead
-    of silent.  Behavior is unchanged otherwise — we still return
-    ``~/.hermes`` — because raising here would brick 30+ module-level
-    callers that import this at load time.  Subprocess spawners are
-    expected to propagate ``HERMES_HOME`` explicitly (see the systemd
-    template in ``hermes_cli/gateway.py`` and the kanban dispatcher in
-    ``hermes_cli/kanban_db.py``).  See https://github.com/NousResearch/hermes-agent/issues/18594.
    """
    val = os.environ.get("HERMES_HOME", "").strip()
-    if val:
-        return Path(val)
-
-    # Guard: if a non-default profile is sticky-active, warn once that
-    # the fallback to the default profile is almost certainly wrong.
-    global _profile_fallback_warned
-    if not _profile_fallback_warned:
-        try:
-            # Inline the default-root resolution from get_default_hermes_root()
-            # to stay import-safe (this function is called from module scope
-            # in 30+ files; we cannot afford to trigger logging setup here).
-            active_path = (Path.home() / ".hermes" / "active_profile")
-            active = active_path.read_text().strip() if active_path.exists() else ""
-        except (UnicodeDecodeError, OSError):
-            active = ""
-        if active and active != "default":
-            _profile_fallback_warned = True
-            # Write directly to stderr.  We intentionally do NOT route this
-            # through ``logging`` because (a) this function is called at
-            # module-import time from 30+ sites, often before logging is
-            # configured, and (b) root-logger propagation would double-emit
-            # on consoles where a StreamHandler is already attached.
-            import sys
-            msg = (
-                f"[HERMES_HOME fallback] HERMES_HOME is unset but active "
-                f"profile is {active!r}. Falling back to ~/.hermes, which "
-                f"is the DEFAULT profile — not {active!r}. Any data this "
-                f"process writes will land in the wrong profile. The "
-                f"subprocess spawner should pass HERMES_HOME explicitly "
-                f"(see issue #18594)."
-            )
-            try:
-                sys.stderr.write(msg + "\n")
-                sys.stderr.flush()
-            except Exception:
-                pass
-
-    return Path.home() / ".hermes"
+    return Path(val) if val else Path.home() / ".hermes"


 def get_default_hermes_root() -> Path:
@@ -718,45 +718,6 @@ class SessionDB:
                self._remove_session_files(sessions_dir, sid)
        return len(removed_ids)

-    def finalize_orphaned_compression_sessions(self) -> int:
-        """Mark orphaned compression continuation sessions as ended.
-
-        Targets child sessions that were never finalized: parent is ended
-        with reason='compression', child has messages but no end_reason/ended_at
-        and api_call_count=0.  Non-destructive: preserves all messages and sets
-        end_reason='orphaned_compression'.  Fix for #20001.
-        """
-        cutoff = time.time() - 604800  # 7 days
-
-        def _do(conn):
-            now = time.time()
-            result = conn.execute(
-                """
-                UPDATE sessions
-                SET ended_at = ?,
-                    end_reason = 'orphaned_compression'
-                WHERE api_call_count = 0
-                  AND end_reason IS NULL
-                  AND ended_at IS NULL
-                  AND started_at < ?
-                  AND parent_session_id IS NOT NULL
-                  AND EXISTS (
-                      SELECT 1 FROM sessions p
-                      WHERE p.id = sessions.parent_session_id
-                        AND p.end_reason = 'compression'
-                        AND p.ended_at IS NOT NULL
-                  )
-                  AND EXISTS (
-                      SELECT 1 FROM messages m
-                      WHERE m.session_id = sessions.id
-                  )
-                """,
-                (now, cutoff),
-            )
-            return result.rowcount
-
-        return self._execute_write(_do) or 0
-
    def get_session(self, session_id: str) -> Optional[Dict[str, Any]]:
        """Get a session by ID."""
        with self._lock:
@@ -2187,388 +2148,6 @@ class SessionDB:
            )
        self._execute_write(_do)

-    def apply_telegram_topic_migration(self) -> None:
-        """Create Telegram DM topic-mode tables on explicit /topic opt-in.
-
-        This migration is deliberately not part of automatic SessionDB startup
-        reconciliation. Operators must be able to upgrade Hermes, keep the old
-        Telegram bot behavior running, and only mutate topic-mode state when the
-        user executes /topic to opt into the feature.
-
-        Schema versions:
-          v1 — initial shape (no ON DELETE CASCADE on session_id FK)
-          v2 — session_id FK gets ON DELETE CASCADE so session pruning
-               automatically clears bindings.
-        """
-        def _do(conn):
-            conn.executescript(
-                """
-                CREATE TABLE IF NOT EXISTS telegram_dm_topic_mode (
-                    chat_id TEXT PRIMARY KEY,
-                    user_id TEXT NOT NULL,
-                    enabled INTEGER NOT NULL DEFAULT 1,
-                    activated_at REAL NOT NULL,
-                    updated_at REAL NOT NULL,
-                    has_topics_enabled INTEGER,
-                    allows_users_to_create_topics INTEGER,
-                    capability_checked_at REAL,
-                    intro_message_id TEXT,
-                    pinned_message_id TEXT
-                );
-
-                CREATE TABLE IF NOT EXISTS telegram_dm_topic_bindings (
-                    chat_id TEXT NOT NULL,
-                    thread_id TEXT NOT NULL,
-                    user_id TEXT NOT NULL,
-                    session_key TEXT NOT NULL,
-                    session_id TEXT NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
-                    managed_mode TEXT NOT NULL DEFAULT 'auto',
-                    linked_at REAL NOT NULL,
-                    updated_at REAL NOT NULL,
-                    PRIMARY KEY (chat_id, thread_id)
-                );
-
-                CREATE UNIQUE INDEX IF NOT EXISTS idx_telegram_dm_topic_bindings_session
-                ON telegram_dm_topic_bindings(session_id);
-
-                CREATE INDEX IF NOT EXISTS idx_telegram_dm_topic_bindings_user
-                ON telegram_dm_topic_bindings(user_id, chat_id);
-                """
-            )
-
-            # v1 → v2: rebuild telegram_dm_topic_bindings if its session_id FK
-            # lacks ON DELETE CASCADE. SQLite can't ALTER a foreign key, so we
-            # rebuild the table. Only runs once per DB (version gate).
-            current = conn.execute(
-                "SELECT value FROM state_meta WHERE key = ?",
-                ("telegram_dm_topic_schema_version",),
-            ).fetchone()
-            current_version = int(current[0]) if current and str(current[0]).isdigit() else 0
-            if current_version < 2:
-                fk_rows = conn.execute(
-                    "PRAGMA foreign_key_list('telegram_dm_topic_bindings')"
-                ).fetchall()
-                needs_rebuild = any(
-                    row[2] == "sessions" and (row[6] or "") != "CASCADE"
-                    for row in fk_rows
-                )
-                if needs_rebuild:
-                    conn.executescript(
-                        """
-                        CREATE TABLE telegram_dm_topic_bindings_new (
-                            chat_id TEXT NOT NULL,
-                            thread_id TEXT NOT NULL,
-                            user_id TEXT NOT NULL,
-                            session_key TEXT NOT NULL,
-                            session_id TEXT NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
-                            managed_mode TEXT NOT NULL DEFAULT 'auto',
-                            linked_at REAL NOT NULL,
-                            updated_at REAL NOT NULL,
-                            PRIMARY KEY (chat_id, thread_id)
-                        );
-                        INSERT INTO telegram_dm_topic_bindings_new
-                            SELECT chat_id, thread_id, user_id, session_key,
-                                   session_id, managed_mode, linked_at, updated_at
-                            FROM telegram_dm_topic_bindings;
-                        DROP TABLE telegram_dm_topic_bindings;
-                        ALTER TABLE telegram_dm_topic_bindings_new
-                            RENAME TO telegram_dm_topic_bindings;
-                        CREATE UNIQUE INDEX idx_telegram_dm_topic_bindings_session
-                            ON telegram_dm_topic_bindings(session_id);
-                        CREATE INDEX idx_telegram_dm_topic_bindings_user
-                            ON telegram_dm_topic_bindings(user_id, chat_id);
-                        """
-                    )
-
-            conn.execute(
-                "INSERT INTO state_meta (key, value) VALUES (?, ?) "
-                "ON CONFLICT(key) DO UPDATE SET value = excluded.value",
-                ("telegram_dm_topic_schema_version", "2"),
-            )
-        self._execute_write(_do)
-
-    def enable_telegram_topic_mode(
-        self,
-        *,
-        chat_id: str,
-        user_id: str,
-        has_topics_enabled: Optional[bool] = None,
-        allows_users_to_create_topics: Optional[bool] = None,
-    ) -> None:
-        """Enable Telegram DM topic mode for one private chat/user.
-
-        This method intentionally owns the explicit topic migration. Ordinary
-        SessionDB startup must not create these side tables.
-        """
-        self.apply_telegram_topic_migration()
-        now = time.time()
-
-        def _to_int(value: Optional[bool]) -> Optional[int]:
-            if value is None:
-                return None
-            return 1 if value else 0
-
-        def _do(conn):
-            conn.execute(
-                """
-                INSERT INTO telegram_dm_topic_mode (
-                    chat_id, user_id, enabled, activated_at, updated_at,
-                    has_topics_enabled, allows_users_to_create_topics,
-                    capability_checked_at
-                ) VALUES (?, ?, 1, ?, ?, ?, ?, ?)
-                ON CONFLICT(chat_id) DO UPDATE SET
-                    user_id = excluded.user_id,
-                    enabled = 1,
-                    updated_at = excluded.updated_at,
-                    has_topics_enabled = excluded.has_topics_enabled,
-                    allows_users_to_create_topics = excluded.allows_users_to_create_topics,
-                    capability_checked_at = excluded.capability_checked_at
-                """,
-                (
-                    str(chat_id),
-                    str(user_id),
-                    now,
-                    now,
-                    _to_int(has_topics_enabled),
-                    _to_int(allows_users_to_create_topics),
-                    now,
-                ),
-            )
-        self._execute_write(_do)
-
-    def disable_telegram_topic_mode(
-        self,
-        *,
-        chat_id: str,
-        clear_bindings: bool = True,
-    ) -> None:
-        """Disable Telegram DM topic mode for one private chat.
-
-        When ``clear_bindings`` is True (default) the (chat_id, thread_id)
-        bindings for this chat are also cleared so re-enabling later
-        starts from a clean slate. Set to False if the operator wants to
-        preserve bindings for a later re-enable.
-
-        Never creates the topic-mode tables from scratch; if they don't
-        exist there is nothing to disable and the call is a no-op.
-        """
-        def _do(conn):
-            try:
-                conn.execute(
-                    "UPDATE telegram_dm_topic_mode SET enabled = 0, updated_at = ? "
-                    "WHERE chat_id = ?",
-                    (time.time(), str(chat_id)),
-                )
-                if clear_bindings:
-                    conn.execute(
-                        "DELETE FROM telegram_dm_topic_bindings WHERE chat_id = ?",
-                        (str(chat_id),),
-                    )
-            except sqlite3.OperationalError:
-                # Tables don't exist yet — nothing to disable.
-                return
-        self._execute_write(_do)
-
-    def is_telegram_topic_mode_enabled(self, *, chat_id: str, user_id: str) -> bool:
-        """Return whether Telegram DM topic mode is enabled for this chat/user."""
-        with self._lock:
-            try:
-                row = self._conn.execute(
-                    """
-                    SELECT enabled FROM telegram_dm_topic_mode
-                    WHERE chat_id = ? AND user_id = ?
-                    """,
-                    (str(chat_id), str(user_id)),
-                ).fetchone()
-            except sqlite3.OperationalError:
-                return False
-        if row is None:
-            return False
-        enabled = row["enabled"] if isinstance(row, sqlite3.Row) else row[0]
-        return bool(enabled)
-
-    def get_telegram_topic_binding(
-        self,
-        *,
-        chat_id: str,
-        thread_id: str,
-    ) -> Optional[Dict[str, Any]]:
-        """Return the session binding for a Telegram DM topic, if present."""
-        with self._lock:
-            try:
-                row = self._conn.execute(
-                    """
-                    SELECT * FROM telegram_dm_topic_bindings
-                    WHERE chat_id = ? AND thread_id = ?
-                    """,
-                    (str(chat_id), str(thread_id)),
-                ).fetchone()
-            except sqlite3.OperationalError:
-                return None
-        return dict(row) if row else None
-
-    def bind_telegram_topic(
-        self,
-        *,
-        chat_id: str,
-        thread_id: str,
-        user_id: str,
-        session_key: str,
-        session_id: str,
-        managed_mode: str = "auto",
-    ) -> None:
-        """Bind one Telegram DM topic thread to one Hermes session.
-
-        A Hermes session may only be linked to one Telegram topic in MVP.
-        Rebinding the same topic to the same session is idempotent; trying to
-        link the same session to a different topic raises ValueError.
-        """
-        self.apply_telegram_topic_migration()
-        now = time.time()
-        chat_id = str(chat_id)
-        thread_id = str(thread_id)
-        user_id = str(user_id)
-        session_key = str(session_key)
-        session_id = str(session_id)
-
-        def _do(conn):
-            existing_session = conn.execute(
-                """
-                SELECT chat_id, thread_id FROM telegram_dm_topic_bindings
-                WHERE session_id = ?
-                """,
-                (session_id,),
-            ).fetchone()
-            if existing_session is not None:
-                linked_chat = existing_session["chat_id"] if isinstance(existing_session, sqlite3.Row) else existing_session[0]
-                linked_thread = existing_session["thread_id"] if isinstance(existing_session, sqlite3.Row) else existing_session[1]
-                if str(linked_chat) != chat_id or str(linked_thread) != thread_id:
-                    raise ValueError("session is already linked to another Telegram topic")
-
-            conn.execute(
-                """
-                INSERT INTO telegram_dm_topic_bindings (
-                    chat_id, thread_id, user_id, session_key, session_id,
-                    managed_mode, linked_at, updated_at
-                ) VALUES (?, ?, ?, ?, ?, ?, ?, ?)
-                ON CONFLICT(chat_id, thread_id) DO UPDATE SET
-                    user_id = excluded.user_id,
-                    session_key = excluded.session_key,
-                    session_id = excluded.session_id,
-                    managed_mode = excluded.managed_mode,
-                    updated_at = excluded.updated_at
-                """,
-                (
-                    chat_id,
-                    thread_id,
-                    user_id,
-                    session_key,
-                    session_id,
-                    managed_mode,
-                    now,
-                    now,
-                ),
-            )
-        self._execute_write(_do)
-
-    def is_telegram_session_linked_to_topic(self, *, session_id: str) -> bool:
-        """Return True if a Hermes session is already bound to any Telegram DM topic.
-
-        Read-only: does NOT trigger the telegram-topic migration. If the
-        topic-mode tables have not been created yet (i.e. nobody has run
-        ``/topic`` in this profile), the session is by definition unbound
-        and we return False.
-        """
-        with self._lock:
-            try:
-                row = self._conn.execute(
-                    """
-                    SELECT 1 FROM telegram_dm_topic_bindings
-                    WHERE session_id = ?
-                    LIMIT 1
-                    """,
-                    (str(session_id),),
-                ).fetchone()
-            except sqlite3.OperationalError:
-                return False
-        return row is not None
-
-    def list_unlinked_telegram_sessions_for_user(
-        self,
-        *,
-        chat_id: str,
-        user_id: str,
-        limit: int = 10,
-    ) -> List[Dict[str, Any]]:
-        """List previous Telegram sessions for this user that are not bound to a topic.
-
-        Read-only: does NOT trigger the telegram-topic migration. If the
-        topic-mode tables are absent, fall back to a simpler query that
-        just returns this user's Telegram sessions — there can't be any
-        bindings yet.
-        """
-        with self._lock:
-            try:
-                rows = self._conn.execute(
-                    """
-                    SELECT s.*,
-                        COALESCE(
-                            (SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
-                             FROM messages m
-                             WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
-                             ORDER BY m.timestamp, m.id LIMIT 1),
-                            ''
-                        ) AS _preview_raw,
-                        COALESCE(
-                            (SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
-                            s.started_at
-                        ) AS last_active
-                    FROM sessions s
-                    WHERE s.source = 'telegram'
-                      AND s.user_id = ?
-                      AND NOT EXISTS (
-                          SELECT 1 FROM telegram_dm_topic_bindings b
-                          WHERE b.session_id = s.id
-                      )
-                    ORDER BY last_active DESC, s.started_at DESC
-                    LIMIT ?
-                    """,
-                    (str(user_id), int(limit)),
-                ).fetchall()
-            except sqlite3.OperationalError:
-                # telegram_dm_topic_bindings doesn't exist yet — no bindings
-                # means every telegram session for this user is "unlinked".
-                rows = self._conn.execute(
-                    """
-                    SELECT s.*,
-                        COALESCE(
-                            (SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
-                             FROM messages m
-                             WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
-                             ORDER BY m.timestamp, m.id LIMIT 1),
-                            ''
-                        ) AS _preview_raw,
-                        COALESCE(
-                            (SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
-                            s.started_at
-                        ) AS last_active
-                    FROM sessions s
-                    WHERE s.source = 'telegram'
-                      AND s.user_id = ?
-                    ORDER BY last_active DESC, s.started_at DESC
-                    LIMIT ?
-                    """,
-                    (str(user_id), int(limit)),
-                ).fetchall()
-
-        sessions: List[Dict[str, Any]] = []
-        for row in rows:
-            session = dict(row)
-            raw = str(session.pop("_preview_raw", "") or "").strip()
-            session["preview"] = raw[:60] + ("..." if len(raw) > 60 else "") if raw else ""
-            sessions.append(session)
-        return sessions
-
    # ── Space reclamation ──

    def vacuum(self) -> None:
@@ -1,24 +0,0 @@
-# Hermes-Katalog für statische Meldungen -- Deutsch
-# See locales/en.yaml for the source of truth; keep keys in sync.
-
-approval:
-  dangerous_header: "⚠️  GEFÄHRLICHER BEFEHL: {description}"
-  choose_long:     "      [o]einmal  |  [s]sitzung  |  [a]immer  |  [d]ablehnen"
-  choose_short:    "      [o]einmal  |  [s]sitzung  |  [d]ablehnen"
-  prompt_long:     "      Auswahl [o/s/a/D]: "
-  prompt_short:    "      Auswahl [o/s/D]: "
-  timeout:         "      ⏱ Zeitüberschreitung – Befehl wird abgelehnt"
-  allowed_once:    "      ✓ Einmalig erlaubt"
-  allowed_session: "      ✓ Für diese Sitzung erlaubt"
-  allowed_always:  "      ✓ Zur dauerhaften Erlaubnisliste hinzugefügt"
-  denied:          "      ✗ Abgelehnt"
-  cancelled:       "      ✗ Abgebrochen"
-  blocklist_message: "Dieser Befehl steht auf der unbedingten Sperrliste und kann nicht genehmigt werden."
-
-gateway:
-  approval_expired: "⚠️ Genehmigung abgelaufen (Agent wartet nicht mehr). Bitten Sie den Agenten, es erneut zu versuchen."
-  draining:         "⏳ Warte auf {count} aktive(n) Agent(en) vor dem Neustart..."
-  goal_cleared:     "✓ Ziel gelöscht."
-  no_active_goal:   "Kein aktives Ziel."
-  config_read_failed: "⚠️ config.yaml konnte nicht gelesen werden: {error}"
-  config_save_failed: "⚠️ Konfiguration konnte nicht gespeichert werden: {error}"
@@ -1,35 +0,0 @@
-# Hermes static-message catalog -- English (baseline / source of truth)
-#
-# Only user-facing static messages from the CLI approval prompt and a handful
-# of gateway slash-command replies live here.  Agent-generated output, log
-# lines, error tracebacks, tool outputs, and slash-command descriptions stay
-# in English and are NOT translated -- see agent/i18n.py for scope rationale.
-#
-# Keys are dotted paths; nesting below is purely for readability.  Values may
-# contain {placeholder} tokens for str.format substitution.  When adding a
-# new key, add it to EVERY locale file (en/zh/ja/de/es) in the same commit --
-# tests/agent/test_i18n.py asserts catalog parity.
-
-approval:
-  # CLI approval prompt -- shown when a dangerous command needs user review.
-  dangerous_header: "⚠️  DANGEROUS COMMAND: {description}"
-  choose_long:     "      [o]nce  |  [s]ession  |  [a]lways  |  [d]eny"
-  choose_short:    "      [o]nce  |  [s]ession  |  [d]eny"
-  prompt_long:     "      Choice [o/s/a/D]: "
-  prompt_short:    "      Choice [o/s/D]: "
-  timeout:         "      ⏱ Timeout - denying command"
-  allowed_once:    "      ✓ Allowed once"
-  allowed_session: "      ✓ Allowed for this session"
-  allowed_always:  "      ✓ Added to permanent allowlist"
-  denied:          "      ✗ Denied"
-  cancelled:       "      ✗ Cancelled"
-  blocklist_message: "This command is on the unconditional blocklist and cannot be approved."
-
-gateway:
-  # Messenger replies to slash commands and implicit state changes.
-  approval_expired: "⚠️ Approval expired (agent is no longer waiting). Ask the agent to try again."
-  draining:         "⏳ Draining {count} active agent(s) before restart..."
-  goal_cleared:     "✓ Goal cleared."
-  no_active_goal:   "No active goal."
-  config_read_failed: "⚠️ Could not read config.yaml: {error}"
-  config_save_failed: "⚠️ Could not save config: {error}"
@@ -1,24 +0,0 @@
-# Catálogo de mensajes estáticos de Hermes -- Español
-# See locales/en.yaml for the source of truth; keep keys in sync.
-
-approval:
-  dangerous_header: "⚠️  COMANDO PELIGROSO: {description}"
-  choose_long:     "      [o]una vez  |  [s]sesión  |  [a]siempre  |  [d]denegar"
-  choose_short:    "      [o]una vez  |  [s]sesión  |  [d]denegar"
-  prompt_long:     "      Opción [o/s/a/D]: "
-  prompt_short:    "      Opción [o/s/D]: "
-  timeout:         "      ⏱ Tiempo agotado — comando denegado"
-  allowed_once:    "      ✓ Permitido una vez"
-  allowed_session: "      ✓ Permitido en esta sesión"
-  allowed_always:  "      ✓ Añadido a la lista de permitidos permanente"
-  denied:          "      ✗ Denegado"
-  cancelled:       "      ✗ Cancelado"
-  blocklist_message: "Este comando está en la lista de bloqueo incondicional y no se puede aprobar."
-
-gateway:
-  approval_expired: "⚠️ La aprobación ha caducado (el agente ya no está esperando). Pida al agente que lo intente de nuevo."
-  draining:         "⏳ Esperando a que terminen {count} agente(s) activo(s) antes de reiniciar..."
-  goal_cleared:     "✓ Objetivo eliminado."
-  no_active_goal:   "No hay objetivo activo."
-  config_read_failed: "⚠️ No se pudo leer config.yaml: {error}"
-  config_save_failed: "⚠️ No se pudo guardar la configuración: {error}"
@@ -1,24 +0,0 @@
-# Hermes 静的メッセージカタログ -- 日本語
-# See locales/en.yaml for the source of truth; keep keys in sync.
-
-approval:
-  dangerous_header: "⚠️  危険なコマンド: {description}"
-  choose_long:     "      [o]今回のみ  |  [s]セッション中  |  [a]常に許可  |  [d]拒否"
-  choose_short:    "      [o]今回のみ  |  [s]セッション中  |  [d]拒否"
-  prompt_long:     "      選択 [o/s/a/D]: "
-  prompt_short:    "      選択 [o/s/D]: "
-  timeout:         "      ⏱ タイムアウト — コマンドを拒否しました"
-  allowed_once:    "      ✓ 今回のみ許可"
-  allowed_session: "      ✓ このセッション中は許可"
-  allowed_always:  "      ✓ 永続的な許可リストに追加"
-  denied:          "      ✗ 拒否しました"
-  cancelled:       "      ✗ キャンセルしました"
-  blocklist_message: "このコマンドは無条件ブロックリストに含まれており、承認できません。"
-
-gateway:
-  approval_expired: "⚠️ 承認の有効期限が切れました（エージェントはもう待機していません）。エージェントに再試行を依頼してください。"
-  draining:         "⏳ 再起動前に {count} 個のアクティブエージェントの終了を待っています..."
-  goal_cleared:     "✓ 目標をクリアしました。"
-  no_active_goal:   "アクティブな目標はありません。"
-  config_read_failed: "⚠️ config.yaml を読み込めませんでした: {error}"
-  config_save_failed: "⚠️ 設定を保存できませんでした: {error}"
@@ -1,24 +0,0 @@
-# Hermes 静态消息目录 -- 中文（简体）
-# See locales/en.yaml for the source of truth; keep keys in sync.
-
-approval:
-  dangerous_header: "⚠️  危险命令： {description}"
-  choose_long:     "      [o]仅此一次  |  [s]本次会话  |  [a]永久允许  |  [d]拒绝"
-  choose_short:    "      [o]仅此一次  |  [s]本次会话  |  [d]拒绝"
-  prompt_long:     "      选择 [o/s/a/D]: "
-  prompt_short:    "      选择 [o/s/D]: "
-  timeout:         "      ⏱ 超时 — 已拒绝命令"
-  allowed_once:    "      ✓ 本次允许"
-  allowed_session: "      ✓ 本次会话内允许"
-  allowed_always:  "      ✓ 已加入永久允许列表"
-  denied:          "      ✗ 已拒绝"
-  cancelled:       "      ✗ 已取消"
-  blocklist_message: "此命令位于无条件拦截列表中，无法被批准。"
-
-gateway:
-  approval_expired: "⚠️ 批准已过期（代理不再等待）。请让代理重试。"
-  draining:         "⏳ 正在等待 {count} 个活跃代理结束后重启..."
-  goal_cleared:     "✓ 目标已清除。"
-  no_active_goal:   "当前没有活跃的目标。"
-  config_read_failed: "⚠️ 无法读取 config.yaml：{error}"
-  config_save_failed: "⚠️ 无法保存配置：{error}"
@@ -511,12 +511,6 @@ def coerce_tool_args(tool_name: str, args: Dict[str, Any]) -> Dict[str, Any]:

    Handles ``"type": "integer"``, ``"type": "number"``, ``"type": "boolean"``,
    and union types (``"type": ["integer", "string"]``).
-
-    Also wraps bare scalar values in a single-element list when the schema
-    declares ``"type": "array"``.  Open-weight models (DeepSeek, Qwen, GLM)
-    sometimes emit ``{"urls": "https://a.com"}`` when the tool expects
-    ``{"urls": ["https://a.com"]}``; wrapping here avoids a confusing tool
-    failure on what is otherwise a well-formed call.
    """
    if not args or not isinstance(args, dict):
        return args
@@ -529,42 +523,13 @@ def coerce_tool_args(tool_name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    if not properties:
        return args

-    for key, value in list(args.items()):
+    for key, value in args.items():
+        if not isinstance(value, str):
+            continue
        prop_schema = properties.get(key)
        if not prop_schema:
            continue
        expected = prop_schema.get("type")
-
-        # Wrap bare non-list values when the schema declares ``array``.
-        # Strings still go through _coerce_value first so JSON-encoded
-        # arrays (``'["a","b"]'``) get parsed and nullable ``"null"``
-        # becomes ``None`` rather than ``["null"]``.
-        # ``None`` itself is preserved — we don't know whether the model
-        # meant "omit" or "empty list", and tools with sensible defaults
-        # (e.g. read_file's normalize_read_pagination) already handle it.
-        if expected == "array" and value is not None and not isinstance(value, (list, tuple)):
-            if isinstance(value, str):
-                coerced = _coerce_value(value, expected, schema=prop_schema)
-                if coerced is not value:
-                    # _coerce_value handled it (JSON-parsed list or
-                    # nullable "null" → None).
-                    args[key] = coerced
-                    continue
-                args[key] = [value]
-                logger.info(
-                    "coerce_tool_args: wrapped bare string in list for %s.%s",
-                    tool_name, key,
-                )
-                continue
-            args[key] = [value]
-            logger.info(
-                "coerce_tool_args: wrapped bare %s in list for %s.%s",
-                type(value).__name__, tool_name, key,
-            )
-            continue
-
-        if not isinstance(value, str):
-            continue
        if not expected and not _schema_allows_null(prop_schema):
            continue
        coerced = _coerce_value(value, expected, schema=prop_schema)
@@ -163,42 +163,35 @@
      for entry in "''${ENTRIES[@]}"; do
        IFS=":" read -r ATTR FOLDER NIX_FILE <<< "$entry"
        echo "==> .#$ATTR ($FOLDER -> $NIX_FILE)"
-
-        # Compute the actual hash from the lockfile directly using
-        # prefetch-npm-deps. This avoids false "ok" from nix build when
-        # an old derivation is cached in a substituter (cachix/cache.nixos.org).
-        LOCK_FILE="$FOLDER/package-lock.json"
-        NEW_HASH=$(${pkgs.lib.getExe pkgs.prefetch-npm-deps} "$LOCK_FILE" 2>/dev/null)
-        if [ -z "$NEW_HASH" ]; then
-          echo "    prefetch-npm-deps failed, falling back to nix build" >&2
-          OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --print-build-logs 2>&1)
-          STATUS=$?
-          if [ "$STATUS" -eq 0 ]; then
-            echo "    ok (via nix build)"
-            continue
-          fi
-          NEW_HASH=$(echo "$OUTPUT" | awk '/got:/ {print $2; exit}')
-          if [ -z "$NEW_HASH" ]; then
-            if echo "$OUTPUT" | grep -qE "throttled|HTTP error 418|substituter .* is disabled|some outputs of .* are not valid"; then
-              echo "    skipped (transient cache failure — see primary nix build for real status)" >&2
-              echo "$OUTPUT" | tail -8 >&2
-              continue
-            fi
-            echo "    build failed with no hash mismatch:" >&2
-            echo "$OUTPUT" | tail -40 >&2
-            exit 1
-          fi
-        fi
-
-        OLD_HASH=$(grep -oE 'hash = "sha256-[^"]+"' "$NIX_FILE" | head -1 \
-          | sed -E 's/hash = "(.*)"/\1/')
-
-        if [ "$NEW_HASH" = "$OLD_HASH" ]; then
+        OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --print-build-logs 2>&1)
+        STATUS=$?
+        if [ "$STATUS" -eq 0 ]; then
          echo "    ok"
          continue
        fi

+        NEW_HASH=$(echo "$OUTPUT" | awk '/got:/ {print $2; exit}')
+        if [ -z "$NEW_HASH" ]; then
+          # Magic-Nix-Cache occasionally returns HTTP 418 / cache-throttled
+          # mid-run; nix then prints "outputs … not valid, so checking is
+          # not possible" without a `got:` line.  That's an infrastructure
+          # blip, not a stale lockfile — warn + skip rather than failing
+          # the lint.  A real hash mismatch would still surface in the
+          # primary `.#$ATTR` build, which is a separate CI job.
+          if echo "$OUTPUT" | grep -qE "throttled|HTTP error 418|substituter .* is disabled|some outputs of .* are not valid"; then
+            echo "    skipped (transient cache failure — see primary nix build for real status)" >&2
+            echo "$OUTPUT" | tail -8 >&2
+            continue
+          fi
+          echo "    build failed with no hash mismatch:" >&2
+          echo "$OUTPUT" | tail -40 >&2
+          exit 1
+        fi
+
        HASH_LINE=$(grep -n 'hash = "sha256-' "$NIX_FILE" | head -1 | cut -d: -f1)
+        OLD_HASH=$(grep -oE 'hash = "sha256-[^"]+"' "$NIX_FILE" | head -1 \
+          | sed -E 's/hash = "(.*)"/\1/')
+        LOCK_FILE="$FOLDER/package-lock.json"
        echo "    stale: $NIX_FILE:$HASH_LINE $OLD_HASH -> $NEW_HASH"
        STALE=1

--- a/Show More
+++ b/Show More