feat(lint): observability for the LSP bridge with steady-state silence

Adds per-call structured logging on the dedicated ``hermes.lint.lsp``
logger so an opt-in user can answer "did LSP fire on that edit?" with
``rg 'lsp\['`` against ``~/.hermes/logs/agent.log``. Levels are tuned so
a 1000-write session emits exactly ONE INFO line at the default
threshold, not 1000.

Level model
-----------

* ``DEBUG`` (invisible at the default INFO threshold) for every per-call
  steady-state event: ``clean``, ``feature off``, ``extension not
  mapped``, ``backend not local``, repeated ``no project root`` for an
  already-announced file, repeated ``server unavailable`` for an
  already-announced binary.

* ``INFO`` for state transitions worth surfacing once: ``active for
  <root>`` the first time a (language, project_root) client starts, and
  ``no project root for <path>`` the first time we see that orphan file.
  Every ``N diags`` event is also INFO: diagnostics are inherently rare
  per edit and are exactly the failure signal users want to grep for.

* ``WARNING`` for action-required failures, the first time per
  (language, binary): ``server unavailable`` (binary not on PATH) and
  ``no server configured``. Timeouts, server errors, and unexpected
  bridge exceptions are ``WARNING`` on every call: these are inherently
  novel events, not steady state, and each one is its own signal.

Dedup uses in-process, module-level sets guarded by a lock. The sets
grow by at most the number of distinct (language, project_root) and
(language, binary) pairs touched in one Python process: a few hundred
entries in the most aggressive monorepo session, which is a negligible
amount of memory. A bounded LRU was rejected because evicting an entry
would risk re-firing the WARNING/INFO line we explicitly want to
suppress.

Why this matters
----------------

The previous draft logged every per-call event at INFO. ``agent.log``
is capped at 5 MB per file with 3 backups via ``RotatingFileHandler``
(20 MB in total, counting the active file), so nothing would crash, but
a normal coding session would bury the actual signal under hundreds of
``lsp[typescript] clean (...)`` lines.

The new model preserves the verification answer ("LSP active for
<root>") and the action-required signals while keeping clean steady
state out of the user's face.

Tests
-----

* ``TestLogLevelsSteadyState``: feature off, unmapped extension,
  non-local backend, and repeated clean writes all stay at DEBUG.
  Exactly one INFO ("active for ...") survives across N calls.

* ``TestLogLevelsNovelEvents``: diagnostics are INFO per call; ``active
  for`` fires once per (language, root).

* ``TestLogLevelsActionRequired``: server unavailable warns once per
  binary; orphan files INFO once per path; timeouts WARN every time.
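
Illustrative sketch (not the code in this commit): the first-time-only
promotion described under "Level model" could look roughly like the
following. The helper names (``_first_time``, ``log_active``,
``log_server_unavailable``, ``log_clean``) and the exact message strings
are assumptions made for illustration; only the logger name, the levels,
and the lock-guarded module-level sets come from the description above.

```python
import logging
import threading
from typing import Hashable, Set

# Logger name taken from the commit message; everything else is hypothetical.
logger = logging.getLogger("hermes.lint.lsp")

# Module-level dedup state, guarded by one lock (unbounded on purpose:
# evicting a key would let the once-only line fire again).
_lock = threading.Lock()
_announced_active: Set[Hashable] = set()       # (language, project_root)
_announced_unavailable: Set[Hashable] = set()  # (language, binary)


def _first_time(seen: Set[Hashable], key: Hashable) -> bool:
    """Return True only the first time ``key`` is seen in this process."""
    with _lock:
        if key in seen:
            return False
        seen.add(key)
        return True


def log_active(language: str, project_root: str) -> None:
    """INFO the first time a (language, root) client starts, DEBUG afterwards."""
    first = _first_time(_announced_active, (language, project_root))
    level = logging.INFO if first else logging.DEBUG
    logger.log(level, "lsp[%s] active for %s", language, project_root)


def log_server_unavailable(language: str, binary: str) -> None:
    """WARNING once per (language, binary), DEBUG on repeats."""
    first = _first_time(_announced_unavailable, (language, binary))
    level = logging.WARNING if first else logging.DEBUG
    logger.log(level, "lsp[%s] server unavailable (%s)", language, binary)


def log_clean(language: str, path: str) -> None:
    """Steady-state event: always DEBUG."""
    logger.debug("lsp[%s] clean (%s)", language, path)
```

Under these assumptions, calling ``log_active`` and ``log_clean`` once
per write for the same root leaves a single INFO record no matter how
many writes happen; everything else lands at DEBUG.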
feat(lint): opt-in LSP-backed lint path in _check_lint
2026-05-12 00:28:30 -04:00 · 2026-05-12 00:13:12 -04:00
367 changed files with 7470 additions and 39136 deletions
@@ -14,14 +14,6 @@
 # LLM_MODEL is no longer read from .env — this line is kept for reference only.
 # LLM_MODEL=anthropic/claude-opus-4.6

-# =============================================================================
-# LLM PROVIDER (NovitaAI)
-# =============================================================================
-# NovitaAI — 90+ models, pay-per-use
-# Get your key at: https://novita.ai/settings/key-management
-# NOVITA_API_KEY=
-# NOVITA_BASE_URL=https://api.novita.ai/openai/v1  # Override default base URL
-
 # =============================================================================
 # LLM PROVIDER (Google AI Studio / Gemini)
 # =============================================================================
@@ -281,20 +273,6 @@ BROWSER_SESSION_TIMEOUT=300
 # Browser sessions are automatically closed after this period of no activity
 BROWSER_INACTIVITY_TIMEOUT=120

-# Camofox local anti-detection browser (Camoufox-based Firefox).
-# Set CAMOFOX_URL to route the browser tools through a local Camofox server
-# instead of agent-browser/Browserbase. See docs/user-guide/features/browser.md.
-# CAMOFOX_URL=http://localhost:9377
-
-# Externally managed Camofox sessions — when another app owns the visible
-# Camofox browser, set these so Hermes shares the same userId/profile instead
-# of creating its own isolated session.
-# CAMOFOX_USER_ID=
-# CAMOFOX_SESSION_KEY=
-# Set to true to reuse an already-open Camofox tab for this identity before
-# creating a new one (useful for gateway restarts).
-# CAMOFOX_ADOPT_EXISTING_TAB=false
-
 # =============================================================================
 # SESSION LOGGING
 # =============================================================================
@@ -28,10 +28,9 @@ permissions:
  contents: read

 # Concurrency: push/release runs are NEVER cancelled so every merge gets its
-# own SHA-tagged image; :main and :latest are guarded separately by the
-# move-main and move-latest jobs.  PR runs reuse a PR-scoped group with
-# cancel-in-progress: true so rapid pushes to the same PR collapse to the
-# latest commit.
+# own SHA-tagged image; :latest is guarded separately by the move-latest job.
+# PR runs reuse a PR-scoped group with cancel-in-progress: true so rapid
+# pushes to the same PR collapse to the latest commit.
 concurrency:
  group: docker-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: ${{ github.event_name == 'pull_request' }}
@@ -92,10 +91,10 @@ jobs:
      # pattern for multi-runner multi-platform builds.
      #
      # We apply the OCI revision label here (and again on arm64) because
-      # the move-main / move-latest jobs read it off the linux/amd64
-      # sub-manifest config of the floating tag to decide whether it's safe
-      # to advance.  The label must be on each per-arch image — manifest
-      # lists themselves don't carry image config labels.
+      # the move-latest job reads it off the linux/amd64 sub-manifest config
+      # of `:latest` to decide whether it's safe to advance.  The label must
+      # be on each per-arch image — manifest lists themselves don't carry
+      # image config labels.
      - name: Push amd64 by digest
        id: push
        if: github.event_name == 'push' && github.ref == 'refs/heads/main' || github.event_name == 'release'
@@ -218,8 +217,6 @@ jobs:
    timeout-minutes: 10
    outputs:
      pushed_sha_tag: ${{ steps.mark_pushed.outputs.pushed }}
-      pushed_release_tag: ${{ steps.mark_release_pushed.outputs.pushed }}
-      release_tag: ${{ steps.tag.outputs.tag }}
    steps:
      - name: Download digests
        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093  # v4
@@ -274,43 +271,33 @@ jobs:
          IMAGE_NAME: ${{ env.IMAGE_NAME }}
          TAG: ${{ steps.tag.outputs.tag }}

-      # Signal to move-main that the SHA tag is live.  Only on main pushes;
-      # releases set pushed_release_tag instead.
+      # Signal to move-latest that the SHA tag is live.  Only on main pushes;
+      # releases don't trigger move-latest (they use their own release tag).
      - name: Mark SHA tag pushed
        id: mark_pushed
        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
        run: echo "pushed=true" >> "$GITHUB_OUTPUT"

-      # Signal to move-latest that the release tag is live.
-      - name: Mark release tag pushed
-        id: mark_release_pushed
-        if: github.event_name == 'release'
-        run: echo "pushed=true" >> "$GITHUB_OUTPUT"
-
  # ---------------------------------------------------------------------------
-  # Move :main to point at the SHA tag the merge job pushed.
-  #
-  # :main is the floating tag that tracks the tip of the main branch.  Every
-  # merge to main retags :main forward.  Users who want "latest dev build"
-  # pull :main; users who want stable releases pull :latest.
+  # Move :latest to point at the SHA tag the merge job pushed.
  #
  # The real serialization guarantee comes from the top-level concurrency
  # group (`docker-${{ github.ref }}` with `cancel-in-progress: false`),
  # which ensures at most one workflow run for this ref executes at a time.
-  # That means two move-main steps for the same ref cannot overlap.
+  # That means two move-latest steps for the same ref cannot overlap.
  #
  # This job has its own concurrency group as defense-in-depth: if the
-  # top-level group is ever loosened, queued move-mains will run serially
+  # top-level group is ever loosened, queued move-latests will run serially
  # in arrival order, each one running the ancestor check below and either
-  # advancing :main or skipping.  `cancel-in-progress: false` matches the
+  # advancing :latest or skipping.  `cancel-in-progress: false` matches the
  # top-level setting — we don't want rapid pushes to cancel a queued
-  # move-main, because the ancestor check is the real safety mechanism
-  # and queueing is cheap (move-main is a ~30s registry op).
+  # move-latest, because the ancestor check is the real safety mechanism
+  # and queueing is cheap (move-latest is a ~30s registry op).
  #
-  # Combined with the ancestor check, this means :main only ever moves
+  # Combined with the ancestor check, this means :latest only ever moves
  # forward in git history.
  # ---------------------------------------------------------------------------
-  move-main:
+  move-latest:
    if: |
      github.repository == 'NousResearch/hermes-agent'
      && github.event_name == 'push'
@@ -320,7 +307,7 @@ jobs:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    concurrency:
-      group: docker-move-main-${{ github.ref }}
+      group: docker-move-latest-${{ github.ref }}
      cancel-in-progress: false
    steps:
      - name: Checkout code
@@ -337,13 +324,13 @@ jobs:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

-      # Read the git revision label off the current :main manifest, then
+      # Read the git revision label off the current :latest manifest, then
      # use `git merge-base --is-ancestor` to check whether our commit is a
-      # descendant of it.  If :main doesn't exist yet, or its label is
+      # descendant of it.  If :latest doesn't exist yet, or its label is
      # missing, we treat that as "safe to publish".  If another run already
-      # advanced :main past us (or diverged), we skip and leave it alone.
-      - name: Decide whether to move :main
-        id: main_check
+      # advanced :latest past us (or diverged), we skip and leave it alone.
+      - name: Decide whether to move :latest
+        id: latest_check
        run: |
          set -euo pipefail
          image=nousresearch/hermes-agent
@@ -351,119 +338,6 @@ jobs:
          # Pull the JSON for the linux/amd64 sub-manifest's config and extract
          # the OCI revision label with jq — Go template field access can't
          # handle dots in map keys, so using json+jq is the robust route.
-          image_json=$(
-            docker buildx imagetools inspect "${image}:main" \
-              --format '{{ json (index .Image "linux/amd64") }}' \
-              2>/dev/null || true
-          )
-
-          if [ -z "${image_json}" ]; then
-            echo "No existing :main (or inspect failed) — safe to publish."
-            echo "push_main=true" >> "$GITHUB_OUTPUT"
-            exit 0
-          fi
-
-          current_sha=$(
-            printf '%s' "${image_json}" \
-              | jq -r '.config.Labels."org.opencontainers.image.revision" // ""'
-          )
-
-          if [ -z "${current_sha}" ]; then
-            echo "Registry :main has no revision label — safe to publish."
-            echo "push_main=true" >> "$GITHUB_OUTPUT"
-            exit 0
-          fi
-
-          echo "Registry :main is at ${current_sha}"
-          echo "This run is at      ${GITHUB_SHA}"
-
-          if [ "${current_sha}" = "${GITHUB_SHA}" ]; then
-            echo ":main already points at our SHA — nothing to do."
-            echo "push_main=false" >> "$GITHUB_OUTPUT"
-            exit 0
-          fi
-
-          # Make sure we have the :main commit locally for merge-base.
-          if ! git cat-file -e "${current_sha}^{commit}" 2>/dev/null; then
-            git fetch --no-tags --prune origin \
-              "+refs/heads/main:refs/remotes/origin/main" \
-              || true
-          fi
-
-          if ! git cat-file -e "${current_sha}^{commit}" 2>/dev/null; then
-            echo "Registry :main points at an unknown commit (${current_sha}); refusing to overwrite."
-            echo "push_main=false" >> "$GITHUB_OUTPUT"
-            exit 0
-          fi
-
-          # Our SHA must be a descendant of the current :main to be safe.
-          if git merge-base --is-ancestor "${current_sha}" "${GITHUB_SHA}"; then
-            echo "Our commit is a descendant of :main — safe to advance."
-            echo "push_main=true" >> "$GITHUB_OUTPUT"
-          else
-            echo "Another run advanced :main past us (or diverged) — leaving it alone."
-            echo "push_main=false" >> "$GITHUB_OUTPUT"
-          fi
-
-      # Retag the already-pushed SHA manifest as :main.  This is a registry-
-      # side operation — no rebuild, no layer re-push — so it's quick and
-      # atomic per-tag.  The ancestor check above plus the cancel-in-progress
-      # concurrency on this job together guarantee we only ever move :main
-      # forward in git history.
-      - name: Move :main to this SHA
-        if: steps.main_check.outputs.push_main == 'true'
-        run: |
-          set -euo pipefail
-          image=nousresearch/hermes-agent
-          docker buildx imagetools create \
-            --tag "${image}:main" \
-            "${image}:sha-${GITHUB_SHA}"
-
-  # ---------------------------------------------------------------------------
-  # Move :latest to point at the release tag the merge job pushed.
-  #
-  # :latest is the floating tag that tracks the most recent stable release.
-  # Only `release: published` events advance it — never main pushes.
-  #
-  # We still run an ancestor check against the existing :latest so that a
-  # backport release on an older branch (e.g. patching v1.1.5 after v1.2.3
-  # is out) doesn't drag :latest backwards.  The check is the same shape as
-  # move-main: read the OCI revision label off the current :latest, look up
-  # that commit in git, and only advance if our release commit is a strict
-  # descendant.
-  # ---------------------------------------------------------------------------
-  move-latest:
-    if: |
-      github.repository == 'NousResearch/hermes-agent'
-      && github.event_name == 'release'
-      && needs.merge.outputs.pushed_release_tag == 'true'
-    needs: merge
-    runs-on: ubuntu-latest
-    timeout-minutes: 10
-    concurrency:
-      group: docker-move-latest
-      cancel-in-progress: false
-    steps:
-      - name: Checkout code
-        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5  # v4
-        with:
-          fetch-depth: 1000
-
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f  # v3
-
-      - name: Log in to Docker Hub
-        uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9  # v3
-        with:
-          username: ${{ secrets.DOCKERHUB_USERNAME }}
-          password: ${{ secrets.DOCKERHUB_TOKEN }}
-
-      - name: Decide whether to move :latest
-        id: latest_check
-        run: |
-          set -euo pipefail
-          image=nousresearch/hermes-agent
-
          image_json=$(
            docker buildx imagetools inspect "${image}:latest" \
              --format '{{ json (index .Image "linux/amd64") }}' \
@@ -488,7 +362,7 @@ jobs:
          fi

          echo "Registry :latest is at ${current_sha}"
-          echo "This release is at  ${GITHUB_SHA}"
+          echo "This run is at      ${GITHUB_SHA}"

          if [ "${current_sha}" = "${GITHUB_SHA}" ]; then
            echo ":latest already points at our SHA — nothing to do."
@@ -497,7 +371,6 @@ jobs:
          fi

          # Make sure we have the :latest commit locally for merge-base.
-          # Releases can be cut from any branch, so fetch broadly.
          if ! git cat-file -e "${current_sha}^{commit}" 2>/dev/null; then
            git fetch --no-tags --prune origin \
              "+refs/heads/main:refs/remotes/origin/main" \
@@ -510,25 +383,25 @@ jobs:
            exit 0
          fi

-          # Our release SHA must be a descendant of the current :latest.
-          # Backport releases on older branches won't satisfy this and will
-          # be left alone — :latest stays on the newer release.
+          # Our SHA must be a descendant of the current :latest to be safe.
          if git merge-base --is-ancestor "${current_sha}" "${GITHUB_SHA}"; then
-            echo "Our release commit is a descendant of :latest — safe to advance."
+            echo "Our commit is a descendant of :latest — safe to advance."
            echo "push_latest=true" >> "$GITHUB_OUTPUT"
          else
-            echo "Existing :latest is newer than this release (likely a backport) — leaving it alone."
+            echo "Another run advanced :latest past us (or diverged) — leaving it alone."
            echo "push_latest=false" >> "$GITHUB_OUTPUT"
          fi

-      # Retag the already-pushed release manifest as :latest.
-      - name: Move :latest to this release tag
+      # Retag the already-pushed SHA manifest as :latest.  This is a registry-
+      # side operation — no rebuild, no layer re-push — so it's quick and
+      # atomic per-tag.  The ancestor check above plus the cancel-in-progress
+      # concurrency on this job together guarantee we only ever move :latest
+      # forward in git history.
+      - name: Move :latest to this SHA
        if: steps.latest_check.outputs.push_latest == 'true'
-        env:
-          RELEASE_TAG: ${{ needs.merge.outputs.release_tag }}
        run: |
          set -euo pipefail
          image=nousresearch/hermes-agent
          docker buildx imagetools create \
            --tag "${image}:latest" \
-            "${image}:${RELEASE_TAG}"
+            "${image}:sha-${GITHUB_SHA}"
@@ -55,14 +55,11 @@ jobs:

  e2e:
    runs-on: ubuntu-latest
-    timeout-minutes: 15
+    timeout-minutes: 10
    steps:
      - name: Checkout code
        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5  # v4

-      - name: Install system dependencies
-        run: sudo apt-get update && sudo apt-get install -y ripgrep
-
      - name: Install uv
        uses: astral-sh/setup-uv@d4b2f3b6ecc6e67c4457f6d3e41ec42d3d0fcb86  # v5

@@ -513,17 +513,6 @@ generic plugin surface (new hook, new ctx method) — never hardcode
 plugin-specific logic into core. PR #5295 removed 95 lines of hardcoded
 honcho argparse from `main.py` for exactly this reason.

-**No new in-tree memory providers (policy, May 2026):** the set of
-built-in memory providers under `plugins/memory/` is closed. New memory
-backends must ship as **standalone plugin repos** that users install
-into `~/.hermes/plugins/` (or via pip entry points) — they implement
-the same `MemoryProvider` ABC, register through the same discovery
-path, and integrate via `hermes memory setup` / `post_setup()` without
-landing in this tree. PRs that add a new directory under
-`plugins/memory/` will be closed with a pointer to publish the
-provider as its own repo. Existing in-tree providers stay; bug fixes
-to them are welcome.
-
 ### Model-provider plugins (`plugins/model-providers/<name>/`)

 Every inference backend (openrouter, anthropic, gmi, deepseek, nvidia, …)
@@ -591,86 +580,6 @@ during setup, injected at load time).
 Top-level `tags:` and `category:` are also accepted and mirrored from
 `metadata.hermes.*` by the loader.

-### Skill authoring standards (HARDLINE)
-
-Every new or modernized skill — bundled, optional, or contributed —
-must meet these standards before merge. Reviewers reject PRs that
-violate them.
-
-1. **`description` ≤ 60 characters, one sentence, ends with a period.**
-   Long descriptions bloat skill listings and dilute the model's
-   attention when many skills are loaded. State the capability, not
-   the implementation. No marketing words ("powerful",
-   "comprehensive", "seamless", "advanced"). Don't repeat the skill
-   name. Verify with:
-   ```python
-   import re, pathlib
-   m = re.search(r'^description: (.*)$',
-                 pathlib.Path('skills/<cat>/<name>/SKILL.md').read_text(),
-                 re.MULTILINE)
-   assert len(m.group(1)) <= 60, len(m.group(1))
-   ```
-
-2. **Tools referenced in SKILL.md prose must be native Hermes tools or
-   MCP servers the skill explicitly expects.** When the skill needs a
-   capability, point at the proper tool by name in backticks
-   (`` `terminal` ``, `` `web_extract` ``, `` `read_file` ``,
-   `` `patch` ``, `` `search_files` ``, `` `vision_analyze` ``,
-   `` `browser_navigate` ``, `` `delegate_task` ``, etc.). Do NOT
-   name shell utilities the agent already has wrapped — `grep` →
-   `search_files`, `cat`/`head`/`tail` → `read_file`, `sed`/`awk` →
-   `patch`, `find`/`ls` → `search_files target='files'`. If the skill
-   depends on an MCP server, name the MCP server and document the
-   expected setup in `## Prerequisites`. Anything else (third-party
-   CLIs, shell pipelines, etc.) is fair game inside script files but
-   should not be the headline interaction surface in the prose.
-
-3. **`platforms:` gating audited against actual script imports.**
-   Skills that use POSIX-only primitives (`fcntl`, `termios`,
-   `os.setsid`, `os.kill(pid, 0)` for liveness, `/proc`, `/tmp`
-   hardcoded, `signal.SIGKILL`, bash heredocs, `osascript`, `apt`,
-   `systemctl`) must declare their supported platforms. Default
-   posture: try to fix it cross-platform first — `tempfile.gettempdir`,
-   `pathlib.Path`, `psutil.pid_exists`, Python-level filtering instead
-   of `grep`. Gate to a narrower set only when the dependency is
-   genuinely platform-bound.
-
-4. **`author` credits the human contributor first.** For external
-   contributions, the contributor's real name + GitHub handle goes
-   first; "Hermes Agent" is the secondary collaborator. If the
-   contributor's commit shows "Hermes Agent" as author (because they
-   used Hermes to draft the skill), replace it with their actual name
-   — credit the human, not the tool.
-
-5. **SKILL.md body uses the modern section order.** `# <Skill> Skill`
-   title, 2-3 sentence intro stating what it does and doesn't do,
-   `## When to Use`, `## Prerequisites`, `## How to Run`,
-   `## Quick Reference`, `## Procedure`, `## Pitfalls`,
-   `## Verification`. Target ~200 lines for a complex skill,
-   ~100 lines for a simple one. Cut redundant intro fluff, marketing
-   prose, and re-explanations of env vars already in
-   `## Prerequisites`.
-
-6. **Scripts go in `scripts/`, references in `references/`,
-   templates in `templates/`.** Don't expect the model to inline-write
-   parsers, XML walkers, or non-trivial logic every call — ship a
-   helper script. Reference it from SKILL.md by path relative to the
-   skill directory.
-
-7. **Tests live at `tests/skills/test_<skill>_skill.py`** and use only
-   stdlib + pytest + `unittest.mock`. No live network calls. Run via
-   `scripts/run_tests.sh tests/skills/test_<skill>_skill.py -q`.
-
-8. **`.env.example` additions are isolated to a clearly delimited
-   block.** Don't touch the surrounding file — contributor-supplied
-   `.env.example` versions are usually stale and edits outside the
-   skill's own block must be dropped during salvage.
-
-The full salvage / modernization checklist for external skill PRs
-lives in the `hermes-agent-dev` skill at
-`references/new-skill-pr-salvage.md` — load it before polishing
-contributor skill PRs.
-
 ---

 ## Toolsets
@@ -49,24 +49,6 @@ If your skill is specialized, community-contributed, or niche, it's better suite

 ---

-## Memory Providers: Ship as a Standalone Plugin
-
-**We are no longer accepting new memory providers into this repo.** The set of built-in providers under `plugins/memory/` (honcho, mem0, supermemory, byterover, hindsight, holographic, openviking, retaindb) is closed. If you want to add a new memory backend, publish it as a **standalone plugin repo** that users install into `~/.hermes/plugins/` (or via a pip entry point).
-
-Standalone memory plugins:
-
- Implement the same `MemoryProvider` ABC (`agent/memory_provider.py`) — `sync_turn`, `prefetch`, `shutdown`, and optionally `post_setup(hermes_home, config)` for setup-wizard integration
- Use the same discovery system — `discover_memory_providers()` picks them up from user/project plugin directories and pip entry points
- Integrate with `hermes memory setup` via `post_setup()` — no need to touch core code
- Can register their own CLI subcommands via `register_cli(subparser)` in a `cli.py` file
- Get all the same lifecycle hooks and config plumbing as in-tree providers
-
-PRs that add a new directory under `plugins/memory/` will be closed with a pointer to publish the provider as its own repo. Existing in-tree providers stay; bug fixes to them are welcome.
-
-This isn't a quality bar — it's a coupling-and-maintenance decision. Memory providers are the most common plugin type and they shouldn't all live in this tree.
-
---
-
 ## Development Setup

 ### Prerequisites
@@ -479,58 +461,6 @@ Gateway and messaging sessions never collect secrets in-band; they instruct the

 See `skills/gifs/gif-search/` and `skills/email/himalaya/` for examples.

-### Skill authoring standards (HARDLINE)
-
-Every new or modernized skill — bundled, optional, or contributed — must meet these standards before merge. Reviewers reject PRs that violate them.
-
-1. **`description` ≤ 60 characters, one sentence, ends with a period.** Long descriptions bloat the skill listing UI and dilute the model's attention when many skills are loaded. State the capability, not the implementation. No marketing words ("powerful", "comprehensive", "seamless", "advanced"). Don't repeat the skill name. Verify with:
-   ```python
-   import re, pathlib
-   m = re.search(r'^description: (.*)$',
-                 pathlib.Path('skills/<cat>/<name>/SKILL.md').read_text(),
-                 re.MULTILINE)
-   assert len(m.group(1)) <= 60, len(m.group(1))
-   ```
-
-   Good: `Search arXiv papers by keyword, author, category, or ID.`
-   Bad: `A powerful and comprehensive skill that allows the agent to search arXiv for relevant academic papers using various criteria including keywords, authors, and categories.`
-
-2. **Tools referenced in SKILL.md prose must be native Hermes tools or MCP servers the skill explicitly expects.** When the skill needs a capability, point at the proper tool by name in backticks: `` `terminal` ``, `` `web_extract` ``, `` `web_search` ``, `` `read_file` ``, `` `write_file` ``, `` `patch` ``, `` `search_files` ``, `` `vision_analyze` ``, `` `browser_navigate` ``, `` `delegate_task` ``, `` `image_generate` ``, `` `text_to_speech` ``, `` `cronjob` ``, `` `memory` ``, `` `skill_view` ``, `` `todo` ``, `` `execute_code` ``.
-
-   Do NOT name shell utilities the agent already has wrapped:
-
-   | Don't say | Say |
-   |---|---|
-   | `grep`, `rg` | `search_files` |
-   | `cat`, `head`, `tail` | `read_file` |
-   | `sed`, `awk` | `patch` |
-   | `find`, `ls` | `search_files` (with `target='files'`) |
-   | `curl` for content extraction | `web_extract` |
-   | `echo > file`, `cat <<EOF` | `write_file` |
-
-   If the skill depends on an MCP server, name the MCP server and document its setup in `## Prerequisites`. Third-party CLIs (e.g. `ffmpeg`, `gh`, a specific SDK) are fine to invoke from inside script files, but the prose should frame the interaction as "invoke through the `terminal` tool", not as a manual shell session.
-
-3. **`platforms:` gating audited against actual script imports.** Skills that use POSIX-only primitives (`fcntl`, `termios`, `os.setsid`, `os.kill(pid, 0)` for liveness, `/proc`, hardcoded `/tmp` paths, `signal.SIGKILL`, bash heredocs, `osascript`, `apt`, `systemctl`) must declare their supported platforms via the `platforms:` frontmatter. Default posture is to fix it cross-platform first — `tempfile.gettempdir()`, `pathlib.Path`, `psutil.pid_exists()`, Python-level filtering instead of `grep`. Gate to a narrower set only when the dependency is genuinely platform-bound (e.g. `osascript` is macOS-only, `/proc` is Linux-only).
-
-4. **`author` credits the human contributor first.** For external contributions, the contributor's real name + GitHub handle goes first (`Jane Doe (jane-doe)`); "Hermes Agent" is the secondary collaborator. If the contributor's commit shows "Hermes Agent" as author because they used Hermes to draft the skill, replace it with their actual name — credit the human, not the tool.
-
-5. **SKILL.md body uses the modern section order.** `# <Skill> Skill` title, 2-3 sentence intro stating what it does and what it doesn't do, then:
-   - `## When to Use` — trigger conditions
-   - `## Prerequisites` — env vars, install steps, MCP setup, API key sourcing
-   - `## How to Run` — canonical invocation through the `terminal` tool
-   - `## Quick Reference` — flat command/API reference
-   - `## Procedure` — numbered steps with copy-paste commands
-   - `## Pitfalls` — known limits, rate limits, things that look broken but aren't
-   - `## Verification` — single command that proves the skill works
-
-   Target ~200 lines for a complex skill, ~100 lines for a simple one. Cut redundant intro fluff, marketing prose, and re-explanations of env vars already documented in `## Prerequisites`.
-
-6. **Scripts go in `scripts/`, references in `references/`, templates in `templates/`.** Don't expect the model to inline-write parsers, XML walkers, or non-trivial logic every call — ship a helper script. Reference scripts from SKILL.md by path relative to the skill directory.
-
-7. **Tests live at `tests/skills/test_<skill>_skill.py`** and use only stdlib + pytest + `unittest.mock`. No live network calls. Run via `scripts/run_tests.sh tests/skills/test_<skill>_skill.py -q`. Must pass under the hermetic CI env (no API keys leaking through). Use `monkeypatch` and `tmp_path` for any env-var or filesystem dependencies.
-
-8. **`.env.example` additions are isolated to a clearly delimited block.** Don't touch the surrounding file — contributor-supplied `.env.example` versions are usually stale, and edits outside the skill's own block will be dropped during salvage. Comment all values with `#` (it's documentation, not live config).
-
 ### Skill guidelines

 - **No external dependencies unless absolutely necessary.** Prefer stdlib Python, curl, and existing Hermes tools (`web_extract`, `terminal`, `read_file`).
@@ -94,13 +94,9 @@ RUN cd web && npm run build && \
 # hermes_cli/main.py succeeds (see #18800). /opt/hermes/web is build-time
 # only (HERMES_WEB_DIST points at hermes_cli/web_dist) and is intentionally
 # not chowned here.
-# The .venv MUST be hermes-writable so lazy_deps.py can install platform
-# packages (discord.py, telegram, slack, etc.) at first gateway boot.
-# Without this, `uv pip install` fails with EACCES and all messaging
-# adapters silently fail to load.  See tools/lazy_deps.py.
 USER root
 RUN chmod -R a+rX /opt/hermes && \
-    chown -R hermes:hermes /opt/hermes/.venv /opt/hermes/ui-tui /opt/hermes/node_modules
+    chown -R hermes:hermes /opt/hermes/ui-tui /opt/hermes/node_modules
 # Start as root so the entrypoint can usermod/groupmod + gosu.
 # If HERMES_UID is unset, the entrypoint drops to the default hermes user (10000).

@@ -14,7 +14,7 @@

 **The self-improving AI agent built by [Nous Research](https://nousresearch.com).** It's the only agent with a built-in learning loop — it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations, and builds a deepening model of who you are across sessions. Run it on a $5 VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. It's not tied to your laptop — talk to it from Telegram while it works on a cloud VM.

-Use any model you want — [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai) (200+ models), [NovitaAI](https://novita.ai) (AI-native cloud for Model API, Agent Sandbox, and GPU Cloud), [NVIDIA NIM](https://build.nvidia.com) (Nemotron), [Xiaomi MiMo](https://platform.xiaomimimo.com), [z.ai/GLM](https://z.ai), [Kimi/Moonshot](https://platform.moonshot.ai), [MiniMax](https://www.minimax.io), [Hugging Face](https://huggingface.co), OpenAI, or your own endpoint. Switch with `hermes model` — no code changes, no lock-in.
+Use any model you want — [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai) (200+ models), [NVIDIA NIM](https://build.nvidia.com) (Nemotron), [Xiaomi MiMo](https://platform.xiaomimimo.com), [z.ai/GLM](https://z.ai), [Kimi/Moonshot](https://platform.moonshot.ai), [MiniMax](https://www.minimax.io), [Hugging Face](https://huggingface.co), OpenAI, or your own endpoint. Switch with `hermes model` — no code changes, no lock-in.

 <table>
 <tr><td><b>A real terminal interface</b></td><td>Full TUI with multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, and streaming tool output.</td></tr>
@@ -1,11 +1,10 @@
-"""ACP permission bridging for Hermes dangerous-command approvals."""
+"""ACP permission bridging — maps ACP approval requests to hermes approval callbacks."""

 from __future__ import annotations

 import asyncio
 import logging
 from concurrent.futures import TimeoutError as FutureTimeout
-from itertools import count
 from typing import Callable

 from acp.schema import (
@@ -15,87 +14,24 @@ from acp.schema import (

 logger = logging.getLogger(__name__)

-# Maps ACP permission option ids to Hermes approval result strings.
-# Option ids are stable across both the ``allow_permanent=True`` and
-# ``allow_permanent=False`` paths even though the option list differs.
-_OPTION_ID_TO_HERMES = {
+# Maps ACP PermissionOptionKind -> hermes approval result strings
+_KIND_TO_HERMES = {
    "allow_once": "once",
-    "allow_session": "session",
    "allow_always": "always",
-    "deny": "deny",
+    "reject_once": "deny",
+    "reject_always": "deny",
 }

-_PERMISSION_REQUEST_IDS = count(1)
-
-
-def _build_permission_options(*, allow_permanent: bool) -> list[PermissionOption]:
-    """Return ACP options that match Hermes approval semantics."""
-    options = [
-        PermissionOption(option_id="allow_once", kind="allow_once", name="Allow once"),
-        PermissionOption(
-            option_id="allow_session",
-            # ACP has no session-scoped kind, so use the closest persistent
-            # hint while keeping Hermes semantics in the option id.
-            kind="allow_always",
-            name="Allow for session",
-        ),
-    ]
-    if allow_permanent:
-        options.append(
-            PermissionOption(
-                option_id="allow_always",
-                kind="allow_always",
-                name="Allow always",
-            ),
-        )
-    options.append(PermissionOption(option_id="deny", kind="reject_once", name="Deny"))
-    return options
-
-
-def _build_permission_tool_call(command: str, description: str):
-    """Return the ACP tool-call update attached to a permission request.
-
-    ``request_permission`` expects a ``ToolCallUpdate`` payload — produced
-    by ``_acp.update_tool_call`` — not a ``ToolCallStart``. Each request
-    gets a unique ``perm-check-N`` id so concurrent requests don't collide.
-    """
-    import acp as _acp
-
-    tool_call_id = f"perm-check-{next(_PERMISSION_REQUEST_IDS)}"
-    return _acp.update_tool_call(
-        tool_call_id,
-        title=description,
-        kind="execute",
-        status="pending",
-        content=[_acp.tool_content(_acp.text_block(f"$ {command}"))],
-        raw_input={"command": command, "description": description},
-    )
-
-
-def _map_outcome_to_hermes(outcome: object, *, allowed_option_ids: set[str]) -> str:
-    """Map an ACP permission outcome into Hermes approval strings."""
-    if not isinstance(outcome, AllowedOutcome):
-        return "deny"
-
-    option_id = outcome.option_id
-    if option_id not in allowed_option_ids:
-        logger.warning("Permission request returned unknown option_id: %s", option_id)
-        return "deny"
-    return _OPTION_ID_TO_HERMES.get(option_id, "deny")
-

 def make_approval_callback(
    request_permission_fn: Callable,
    loop: asyncio.AbstractEventLoop,
    session_id: str,
    timeout: float = 60.0,
-) -> Callable[..., str]:
+) -> Callable[[str, str], str]:
    """
-    Return a Hermes-compatible approval callback that bridges to ACP.
-
-    The callback accepts ``command`` and ``description`` plus optional
-    keyword arguments such as ``allow_permanent`` used by
-    ``tools.approval.prompt_dangerous_approval()``.
+    Return a hermes-compatible ``approval_callback(command, description) -> str``
+    that bridges to the ACP client's ``request_permission`` call.

    Args:
        request_permission_fn: The ACP connection's ``request_permission`` coroutine.
@@ -104,38 +40,41 @@ def make_approval_callback(
        timeout: Seconds to wait for a response before auto-denying.
    """

-    def _callback(
-        command: str,
-        description: str,
-        *,
-        allow_permanent: bool = True,
-        **_: object,
-    ) -> str:
-        options = _build_permission_options(allow_permanent=allow_permanent)
+    def _callback(command: str, description: str) -> str:
+        options = [
+            PermissionOption(option_id="allow_once", kind="allow_once", name="Allow once"),
+            PermissionOption(option_id="allow_always", kind="allow_always", name="Allow always"),
+            PermissionOption(option_id="deny", kind="reject_once", name="Deny"),
+        ]
+        import acp as _acp
+
+        tool_call = _acp.start_tool_call("perm-check", command, kind="execute")
+
+        coro = request_permission_fn(
+            session_id=session_id,
+            tool_call=tool_call,
+            options=options,
+        )

-        future = None
        try:
-            tool_call = _build_permission_tool_call(command, description)
-            coro = request_permission_fn(
-                session_id=session_id,
-                tool_call=tool_call,
-                options=options,
-            )
            future = asyncio.run_coroutine_threadsafe(coro, loop)
            response = future.result(timeout=timeout)
        except (FutureTimeout, Exception) as exc:
-            if future is not None:
-                future.cancel()
            logger.warning("Permission request timed out or failed: %s", exc)
            return "deny"

        if response is None:
            return "deny"

-        allowed_option_ids = {option.option_id for option in options}
-        return _map_outcome_to_hermes(
-            response.outcome,
-            allowed_option_ids=allowed_option_ids,
-        )
+        outcome = response.outcome
+        if isinstance(outcome, AllowedOutcome):
+            option_id = outcome.option_id
+            # Look up the kind from our options list
+            for opt in options:
+                if opt.option_id == option_id:
+                    return _KIND_TO_HERMES.get(opt.kind, "deny")
+            return "once"  # fallback for unknown option_id
+        else:
+            return "deny"

    return _callback
@@ -35,14 +35,6 @@ def _get_anthropic_sdk():
    """Return the ``anthropic`` SDK module, importing lazily. None if not installed."""
    global _anthropic_sdk
    if _anthropic_sdk is ...:
-        try:
-            from tools.lazy_deps import ensure as _lazy_ensure
-            _lazy_ensure("provider.anthropic", prompt=False)
-        except ImportError:
-            pass
-        except Exception:
-            # FeatureUnavailable — fall through to ImportError handling below
-            pass
        try:
            import anthropic as _sdk
            _anthropic_sdk = _sdk
@@ -1305,8 +1297,9 @@ def convert_tools_to_anthropic(tools: List[Dict]) -> List[Dict]:
            ),
        }
        # Forward cache_control marker when present on the OpenAI-format
-        # tool dict. Anthropic's tools array supports cache_control on the
-        # last tool to cache the entire schema cross-session.
+        # tool dict (set by ``mark_tools_for_long_lived_cache``). Anthropic's
+        # tools array supports cache_control on the last tool to cache the
+        # entire schema cross-session.
        cache_control = t.get("cache_control")
        if isinstance(cache_control, dict):
            anthropic_tool["cache_control"] = dict(cache_control)
@@ -382,28 +382,7 @@ _AI_GATEWAY_HEADERS = {
 # Nous Portal extra_body for product attribution.
 # Callers should pass this as extra_body in chat.completions.create()
 # when the auxiliary client is backed by Nous Portal.
-#
-# The tags are computed from agent.portal_tags so the client= marker stays
-# in lockstep with hermes_cli.__version__ across every Portal call site
-# (main loop, aux, compression, web_extract). Do not inline a literal here;
-# see agent/portal_tags.py for the rationale.
-from agent.portal_tags import nous_portal_tags as _nous_portal_tags
-
-
-def _nous_extra_body() -> dict:
-    """Return a fresh Nous Portal ``extra_body`` dict.
-
-    Computed at call time so a hot-reloaded ``hermes_cli.__version__`` is
-    reflected without restarting long-running processes.
-    """
-    return {"tags": _nous_portal_tags()}
-
-
-# Backwards-compatible module attribute. Some callers (tests, third-party
-# plugins) read ``NOUS_EXTRA_BODY`` directly; keep it as a snapshot of the
-# current tags. Callers that need the freshest value should call
-# ``_nous_extra_body()`` or import ``nous_portal_tags`` directly.
-NOUS_EXTRA_BODY = _nous_extra_body()
+NOUS_EXTRA_BODY = {"tags": ["product=hermes-agent"]}

 # Set at resolve time — True if the auxiliary client points to Nous Portal
 auxiliary_is_nous: bool = False
@@ -1407,7 +1386,6 @@ def _try_openrouter(explicit_api_key: str = None) -> Tuple[Optional[OpenAI], Opt
    if pool_present:
        or_key = explicit_api_key or _pool_runtime_api_key(entry)
        if not or_key:
-            _mark_provider_unhealthy("openrouter", ttl=60)
            return None, None
        base_url = _pool_runtime_base_url(entry, OPENROUTER_BASE_URL) or OPENROUTER_BASE_URL
        logger.debug("Auxiliary client: OpenRouter via pool")
@@ -1416,7 +1394,6 @@ def _try_openrouter(explicit_api_key: str = None) -> Tuple[Optional[OpenAI], Opt

    or_key = explicit_api_key or os.getenv("OPENROUTER_API_KEY")
    if not or_key:
-        _mark_provider_unhealthy("openrouter", ttl=60)
        return None, None
    logger.debug("Auxiliary client: OpenRouter")
    return OpenAI(api_key=or_key, base_url=OPENROUTER_BASE_URL,
@@ -1448,7 +1425,6 @@ def _try_nous(vision: bool = False) -> Tuple[Optional[OpenAI], Optional[str]]:
                "Auxiliary: skipping Nous Portal (rate-limited, resets in %.0fs)",
                _remaining,
            )
-            _mark_provider_unhealthy("nous", ttl=_remaining)
            return None, None
    except Exception:
        pass
@@ -1456,7 +1432,6 @@ def _try_nous(vision: bool = False) -> Tuple[Optional[OpenAI], Optional[str]]:
    nous = _read_nous_auth()
    runtime = _resolve_nous_runtime_api(force_refresh=False)
    if runtime is None and not nous:
-        _mark_provider_unhealthy("nous", ttl=60)
        return None, None
    global auxiliary_is_nous
    auxiliary_is_nous = True
@@ -3462,7 +3437,7 @@ def get_auxiliary_extra_body() -> dict:
    Includes Nous Portal product tags when the auxiliary client is backed
    by Nous Portal. Returns empty dict otherwise.
    """
-    return _nous_extra_body() if auxiliary_is_nous else {}
+    return dict(NOUS_EXTRA_BODY) if auxiliary_is_nous else {}


 def auxiliary_max_tokens_param(value: int) -> dict:
@@ -3853,7 +3828,7 @@ def _resolve_task_provider_model(
            # (e.g. OPENROUTER_API_KEY) instead of locking into "custom".
            return cfg_provider, resolved_model, cfg_base_url, None, resolved_api_mode
        if cfg_provider and cfg_provider != "auto":
-            return cfg_provider, resolved_model, cfg_base_url, cfg_api_key, resolved_api_mode
+            return cfg_provider, resolved_model, None, None, resolved_api_mode

        return "auto", resolved_model, None, None, resolved_api_mode

@@ -4051,7 +4026,7 @@ def _build_call_kwargs(
    # Provider-specific extra_body
    merged_extra = dict(extra_body or {})
    if provider == "nous" or auxiliary_is_nous:
-        merged_extra.setdefault("tags", []).extend(_nous_portal_tags())
+        merged_extra.setdefault("tags", []).extend(["product=hermes-agent"])
    if merged_extra:
        kwargs["extra_body"] = merged_extra

@@ -4436,7 +4411,7 @@ def extract_content_or_reasoning(response) -> str:
      1. ``message.content`` — strip inline think/reasoning blocks, check for
         remaining non-whitespace text.
      2. ``message.reasoning`` / ``message.reasoning_content`` — direct
-         structured reasoning fields (DeepSeek, Moonshot, NovitaAI, etc.).
+         structured reasoning fields (DeepSeek, Moonshot, Novita, etc.).
      3. ``message.reasoning_details`` — OpenRouter unified array format.

    Returns the best available text, or ``""`` if nothing found.
@@ -1185,26 +1185,6 @@ The user has requested that this compaction PRIORITISE preserving all informatio
            idx += 1
        return idx

-    def _protect_head_size(self, messages: List[Dict[str, Any]]) -> int:
-        """Total count of head messages to protect.
-
-        ``protect_first_n`` is defined as *additional* messages protected
-        beyond the system prompt.  The system prompt (if present at index 0)
-        is always implicitly protected — it's load-bearing context that
-        must never be summarised away.  This keeps semantics stable across
-        call paths where the system prompt may or may not be included in
-        the ``messages`` list (e.g. the gateway ``/compress`` handler
-        strips it before calling compress()).
-
-        Examples:
-          protect_first_n=0 → system prompt only (or nothing if no system msg)
-          protect_first_n=3 → system + first 3 non-system messages
-        """
-        head = 0
-        if messages and messages[0].get("role") == "system":
-            head = 1
-        return head + self.protect_first_n
-
    def _align_boundary_backward(self, messages: List[Dict[str, Any]], idx: int) -> int:
        """Pull a compress-end boundary backward to avoid splitting a
        tool_call / result group.
@@ -1363,7 +1343,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
        skip the LLM call when the transcript is still entirely inside
        the protected head/tail.
        """
-        compress_start = self._align_boundary_forward(messages, self._protect_head_size(messages))
+        compress_start = self._align_boundary_forward(messages, self.protect_first_n)
        compress_end = self._find_tail_cut_by_tokens(messages, compress_start)
        return compress_start < compress_end

@@ -1399,7 +1379,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
        self._last_aux_model_failure_model = None
        n_messages = len(messages)
        # Only need head + 3 tail messages minimum (token budget decides the real tail size)
-        _min_for_compress = self._protect_head_size(messages) + 3 + 1
+        _min_for_compress = self.protect_first_n + 3 + 1
        if n_messages <= _min_for_compress:
            if not self.quiet_mode:
                logger.warning(
@@ -1419,7 +1399,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
            logger.info("Pre-compression: pruned %d old tool result(s)", pruned_count)

        # Phase 2: Determine boundaries
-        compress_start = self._protect_head_size(messages)
+        compress_start = self.protect_first_n
        compress_start = self._align_boundary_forward(messages, compress_start)

        # Use token-budget tail protection instead of fixed message count
@@ -55,11 +55,6 @@ class ContextEngine(ABC):
    # These control the preflight compression check.  Subclasses may
    # override via __init__ or property; defaults are sensible for most
    # engines.
-    #
-    # protect_first_n semantics (since PR #13754): count of non-system head
-    # messages always preserved verbatim, IN ADDITION to the system prompt
-    # which is always implicitly protected.  Default 3 keeps the
-    # historical "system + first 3 non-system messages" head shape.

    threshold_percent: float = 0.75
    protect_first_n: int = 3
@@ -14,7 +14,6 @@ from difflib import unified_diff
 from pathlib import Path

 from utils import safe_json_loads
-from agent.tool_result_classification import file_mutation_result_landed

 # ANSI escape codes for coloring tool failure indicators
 _RED = "\033[31m"
@@ -811,8 +810,6 @@ def _detect_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]
    """
    if result is None:
        return False, ""
-    if file_mutation_result_landed(tool_name, result):
-        return False, ""

    if tool_name == "terminal":
        data = safe_json_loads(result)
@@ -450,13 +450,7 @@ def _make_stream_chunk(
    finish_reason: Optional[str] = None,
    reasoning: str = "",
 ) -> _GeminiStreamChunk:
-    delta_kwargs: Dict[str, Any] = {
-        "role": "assistant",
-        "content": None,
-        "tool_calls": None,
-        "reasoning": None,
-        "reasoning_content": None,
-    }
+    delta_kwargs: Dict[str, Any] = {"role": "assistant"}
    if content:
        delta_kwargs["content"] = content
    if tool_call_delta is not None:
@@ -77,17 +77,6 @@ def get_active_provider() -> Optional[ImageGenProvider]:

    Reads ``image_gen.provider`` from config.yaml; falls back per the
    module docstring.
-
-    **Availability semantics** (mirrors :mod:`agent.web_search_registry`):
-
-    - When ``image_gen.provider`` is explicitly set, the configured
-      provider is returned even if :meth:`ImageGenProvider.is_available`
-      reports False — the dispatcher surfaces a precise "X_API_KEY is not
-      set" error rather than silently switching backends.
-    - When ``image_gen.provider`` is unset, the fallback path (single-
-      provider shortcut and the FAL legacy preference) is filtered by
-      ``is_available()`` so we don't pick a provider the user has no
-      credentials for.
    """
    configured: Optional[str] = None
    try:
@@ -105,17 +94,6 @@ def get_active_provider() -> Optional[ImageGenProvider]:
    with _lock:
        snapshot = dict(_providers)

-    def _is_available_safe(p: ImageGenProvider) -> bool:
-        """Wrap ``is_available()`` so a buggy provider doesn't kill resolution."""
-        try:
-            return bool(p.is_available())
-        except Exception as exc:  # noqa: BLE001
-            logger.debug("image_gen provider %s.is_available() raised %s", p.name, exc)
-            return False
-
-    # 1. Explicit config wins — return regardless of is_available() so the
-    #    user gets a precise downstream error message rather than a silent
-    #    backend switch.
    if configured:
        provider = snapshot.get(configured)
        if provider is not None:
@@ -125,16 +103,13 @@ def get_active_provider() -> Optional[ImageGenProvider]:
            configured,
        )

-    # 2. Fallback: single registered provider — but only if it's actually
-    #    available (no credentials = don't surface it as "active").
-    available = [p for p in snapshot.values() if _is_available_safe(p)]
-    if len(available) == 1:
-        return available[0]
+    # Fallback: single-provider case
+    if len(snapshot) == 1:
+        return next(iter(snapshot.values()))

-    # 3. Fallback: prefer legacy FAL for backward compat, when available.
-    fal = snapshot.get("fal")
-    if fal is not None and _is_available_safe(fal):
-        return fal
+    # Fallback: prefer legacy FAL for backward compat
+    if "fal" in snapshot:
+        return snapshot["fal"]

    return None

@@ -1,106 +0,0 @@
-"""Language Server Protocol (LSP) integration for Hermes Agent.
-
-Hermes runs full language servers (pyright, gopls, rust-analyzer,
-typescript-language-server, etc.) as subprocesses and pipes their
-``textDocument/publishDiagnostics`` output into the post-write lint
-delta filter used by ``write_file`` and ``patch``.
-
-LSP is **gated on git workspace detection** — if the agent's cwd is
-inside a git repository, LSP runs against that workspace; otherwise the
-file_operations layer falls back to its existing in-process syntax
-checks.  This keeps users on user-home cwd's (e.g. Telegram gateway
-chats) from spawning daemons they don't need.
-
-Public API:
-
-    from agent.lsp import get_service
-
-    svc = get_service()
-    if svc and svc.enabled_for(path):
-        await svc.touch_file(path)
-        diags = svc.diagnostics_for(path)
-
-The bulk of the wiring is internal — most callers only need the layer
-in :func:`tools.file_operations.FileOperations._check_lint_delta`,
-which is already wired (see that module).
-
-Architecture is documented in ``website/docs/user-guide/features/lsp.md``.
-"""
-from __future__ import annotations
-
-import atexit
-import logging
-import threading
-from typing import Optional
-
-from agent.lsp.manager import LSPService
-
-logger = logging.getLogger("agent.lsp")
-
-_service: Optional[LSPService] = None
-_atexit_registered = False
-_service_lock = threading.Lock()
-
-
-def get_service() -> Optional[LSPService]:
-    """Return the process-wide LSP service singleton, or None when disabled.
-
-    The service is created lazily on first call.  ``None`` is returned
-    when LSP is disabled in config, when no workspace can be detected,
-    or when the platform doesn't support subprocess-based LSP servers.
-
-    On first creation, registers an :mod:`atexit` handler that tears
-    down spawned language servers on Python exit so a long-running
-    CLI or gateway session doesn't leak pyright/gopls/etc. processes
-    when it terminates.
-    """
-    global _service, _atexit_registered
-    if _service is not None:
-        return _service if _service.is_active() else None
-    with _service_lock:
-        if _service is not None:
-            return _service if _service.is_active() else None
-        _service = LSPService.create_from_config()
-        if not _atexit_registered:
-            # ``atexit`` handlers run in LIFO order on normal Python
-            # exit and on SystemExit, but NOT on os._exit() or
-            # uncaught signals.  Language servers are stateless
-            # subprocesses — losing them on SIGKILL is fine; they'll
-            # be reaped by the kernel along with their parent.  We
-            # care about clean exits where Python flushes stdio
-            # before terminating; without this hook every
-            # ``hermes chat`` exit would leak pyright processes that
-            # outlive the parent for a few seconds while their
-            # stdout buffers drain.
-            atexit.register(_atexit_shutdown)
-            _atexit_registered = True
-    return _service if (_service is not None and _service.is_active()) else None
-
-
-def shutdown_service() -> None:
-    """Tear down the LSP service if one was started.
-
-    Safe to call multiple times; safe to call when no service was created.
-    """
-    global _service
-    with _service_lock:
-        svc = _service
-        _service = None
-    if svc is not None:
-        try:
-            svc.shutdown()
-        except Exception as e:  # noqa: BLE001
-            logger.debug("LSP shutdown error: %s", e)
-
-
-def _atexit_shutdown() -> None:
-    """atexit-registered wrapper.  Logs at debug because by the time
-    atexit fires the user has already seen the agent's final output —
-    a noisy shutdown line on top of that is just clutter."""
-    try:
-        shutdown_service()
-    except Exception as e:  # noqa: BLE001
-        logger.debug("atexit LSP shutdown failed: %s", e)
-
-
-__all__ = ["get_service", "shutdown_service", "LSPService"]
@@ -1,308 +0,0 @@
-"""``hermes lsp`` CLI subcommand.
-
-Subcommands:
-
- ``status`` — show service state, configured servers, install status.
- ``install <server_id>`` — eagerly install one server's binary.
- ``install-all`` — try to install every server with a known recipe.
- ``restart`` — tear down running clients so the next edit re-spawns.
- ``which <server_id>`` — print the resolved binary path for one server.
- ``list`` — print the registry of supported servers.
-
-The handlers are kept here (rather than in
-``hermes_cli/main.py``) so the LSP module ships self-contained.
-"""
-from __future__ import annotations
-
-import argparse
-import sys
-from typing import Optional
-
-
-def register_subparser(subparsers: argparse._SubParsersAction) -> None:
-    """Wire the ``hermes lsp`` subcommand tree into the main argparse."""
-    parser = subparsers.add_parser(
-        "lsp",
-        help="Language Server Protocol management",
-        description=(
-            "Manage the LSP layer that powers post-write semantic "
-            "diagnostics in write_file/patch."
-        ),
-    )
-    sub = parser.add_subparsers(dest="lsp_command")
-
-    sub_status = sub.add_parser("status", help="Show LSP service status")
-    sub_status.add_argument(
-        "--json", action="store_true", help="Emit machine-readable JSON"
-    )
-
-    sub_list = sub.add_parser("list", help="List supported language servers")
-    sub_list.add_argument(
-        "--installed-only",
-        action="store_true",
-        help="Only show servers whose binary is currently available",
-    )
-
-    sub_install = sub.add_parser("install", help="Install a server binary")
-    sub_install.add_argument("server", help="Server id (e.g. pyright, gopls)")
-
-    sub_install_all = sub.add_parser(
-        "install-all",
-        help="Install every server with a known auto-install recipe",
-    )
-    sub_install_all.add_argument(
-        "--include-manual",
-        action="store_true",
-        help="Even attempt servers marked manual-install (best effort)",
-    )
-
-    sub_restart = sub.add_parser(
-        "restart",
-        help="Tear down running LSP clients (next edit re-spawns)",
-    )
-
-    sub_which = sub.add_parser("which", help="Print binary path for a server")
-    sub_which.add_argument("server", help="Server id")
-
-    parser.set_defaults(func=run_lsp_command)
-
-
-def run_lsp_command(args: argparse.Namespace) -> int:
-    """Top-level dispatcher for ``hermes lsp <subcommand>``."""
-    sub = getattr(args, "lsp_command", None) or "status"
-    try:
-        if sub == "status":
-            return _cmd_status(getattr(args, "json", False))
-        if sub == "list":
-            return _cmd_list(getattr(args, "installed_only", False))
-        if sub == "install":
-            return _cmd_install(args.server)
-        if sub == "install-all":
-            return _cmd_install_all(getattr(args, "include_manual", False))
-        if sub == "restart":
-            return _cmd_restart()
-        if sub == "which":
-            return _cmd_which(args.server)
-        sys.stderr.write(f"unknown lsp subcommand: {sub}\n")
-        return 2
-    except KeyboardInterrupt:
-        return 130
-
-
-def _cmd_status(emit_json: bool) -> int:
-    from agent.lsp import get_service
-    from agent.lsp.servers import SERVERS
-    from agent.lsp.install import detect_status
-
-    svc = get_service()
-    service_active = svc is not None
-    info = svc.get_status() if svc is not None else {"enabled": False}
-
-    if emit_json:
-        import json
-        payload = {
-            "service": info,
-            "registry": [
-                {
-                    "server_id": s.server_id,
-                    "extensions": list(s.extensions),
-                    "description": s.description,
-                    "binary_status": detect_status(_recipe_pkg_for(s.server_id)),
-                }
-                for s in SERVERS
-            ],
-        }
-        sys.stdout.write(json.dumps(payload, indent=2) + "\n")
-        return 0
-
-    out = []
-    out.append("LSP Service")
-    out.append("===========")
-    out.append(f"  enabled:         {info.get('enabled', False)}")
-    if service_active:
-        out.append(f"  wait_mode:       {info.get('wait_mode')}")
-        out.append(f"  wait_timeout:    {info.get('wait_timeout')}s")
-        out.append(f"  install_strategy:{info.get('install_strategy')}")
-        clients = info.get("clients") or []
-        if clients:
-            out.append(f"  active clients:  {len(clients)}")
-            for c in clients:
-                out.append(
-                    f"    - {c['server_id']:20s} state={c['state']:10s} root={c['workspace_root']}"
-                )
-        else:
-            out.append("  active clients:  none")
-        broken = info.get("broken") or []
-        if broken:
-            out.append(f"  broken pairs:    {len(broken)}")
-            for b in broken:
-                out.append(f"    - {b}")
-        disabled = info.get("disabled_servers") or []
-        if disabled:
-            out.append(f"  disabled in cfg: {', '.join(disabled)}")
-
-    # Surface backend-tool gaps that aren't visible in the registry table:
-    # some servers spawn fine but emit no diagnostics without a sidecar
-    # binary (bash-language-server -> shellcheck).
-    backend_warnings = _backend_warnings()
-    if backend_warnings:
-        out.append("")
-        out.append("Backend warnings")
-        out.append("================")
-        for line in backend_warnings:
-            out.append(f"  ! {line}")
-    out.append("")
-    out.append("Registered Servers")
-    out.append("==================")
-    for s in SERVERS:
-        pkg = _recipe_pkg_for(s.server_id)
-        status = detect_status(pkg)
-        marker = {
-            "installed": "✓",
-            "missing": "·",
-            "manual-only": "?",
-        }.get(status, " ")
-        ext_summary = ", ".join(list(s.extensions)[:5])
-        if len(s.extensions) > 5:
-            ext_summary += f", … (+{len(s.extensions) - 5})"
-        out.append(
-            f"  {marker} {s.server_id:24s} [{status:11s}] {ext_summary}"
-        )
-        if s.description:
-            out.append(f"      {s.description}")
-    sys.stdout.write("\n".join(out) + "\n")
-    return 0
-
-
-def _cmd_list(installed_only: bool) -> int:
-    from agent.lsp.servers import SERVERS
-    from agent.lsp.install import detect_status
-
-    for s in SERVERS:
-        pkg = _recipe_pkg_for(s.server_id)
-        status = detect_status(pkg)
-        if installed_only and status != "installed":
-            continue
-        sys.stdout.write(
-            f"{s.server_id:24s} [{status:11s}] {','.join(s.extensions)}\n"
-        )
-    return 0
-
-
-def _cmd_install(server_id: str) -> int:
-    from agent.lsp.install import try_install, INSTALL_RECIPES, detect_status
-    pkg = _recipe_pkg_for(server_id)
-    pre_status = detect_status(pkg)
-    if pre_status == "installed":
-        sys.stdout.write(f"{server_id} already installed\n")
-        return 0
-    sys.stdout.write(f"installing {server_id} (pkg={pkg}) ...\n")
-    sys.stdout.flush()
-    bin_path = try_install(pkg, "auto")
-    if bin_path is None:
-        recipe = INSTALL_RECIPES.get(pkg)
-        if recipe and recipe.get("strategy") == "manual":
-            sys.stderr.write(
-                f"{server_id}: this server requires a manual install. "
-                f"See documentation.\n"
-            )
-        else:
-            sys.stderr.write(f"{server_id}: install failed (see logs).\n")
-        return 1
-    sys.stdout.write(f"installed: {bin_path}\n")
-    return 0
-
-
-def _cmd_install_all(include_manual: bool) -> int:
-    from agent.lsp.servers import SERVERS
-    from agent.lsp.install import try_install, INSTALL_RECIPES, detect_status
-
-    rc = 0
-    for s in SERVERS:
-        pkg = _recipe_pkg_for(s.server_id)
-        recipe = INSTALL_RECIPES.get(pkg)
-        if recipe is None:
-            continue
-        if recipe.get("strategy") == "manual" and not include_manual:
-            continue
-        if detect_status(pkg) == "installed":
-            sys.stdout.write(f"  {s.server_id:24s} already installed\n")
-            continue
-        sys.stdout.write(f"  installing {s.server_id} (pkg={pkg}) ... ")
-        sys.stdout.flush()
-        path = try_install(pkg, "auto")
-        if path:
-            sys.stdout.write(f"ok ({path})\n")
-        else:
-            sys.stdout.write("FAILED\n")
-            rc = 1
-    return rc
-
-
-def _cmd_restart() -> int:
-    from agent.lsp import shutdown_service
-
-    shutdown_service()
-    sys.stdout.write("LSP service shut down. Next edit will respawn clients.\n")
-    return 0
-
-
-def _cmd_which(server_id: str) -> int:
-    from agent.lsp.install import INSTALL_RECIPES, hermes_lsp_bin_dir
-    import os
-    import shutil as _shutil
-
-    recipe = INSTALL_RECIPES.get(_recipe_pkg_for(server_id))
-    bin_name = (recipe or {}).get("bin", server_id)
-    staged = hermes_lsp_bin_dir() / bin_name
-    if staged.exists():
-        sys.stdout.write(str(staged) + "\n")
-        return 0
-    on_path = _shutil.which(bin_name)
-    if on_path:
-        sys.stdout.write(on_path + "\n")
-        return 0
-    sys.stderr.write(f"{server_id}: not installed\n")
-    return 1
-
-
-def _recipe_pkg_for(server_id: str) -> str:
-    """Map a registry ``server_id`` to its install-recipe package key."""
-    # The mapping lives here (not in install.py) because it's a CLI
-    # convenience layer.  Most server_ids are also their own recipe
-    # key, but a few differ (e.g. ``vue-language-server`` →
-    # ``@vue/language-server``).
-    aliases = {
-        "vue-language-server": "@vue/language-server",
-        "astro-language-server": "@astrojs/language-server",
-        "dockerfile-ls": "dockerfile-language-server-nodejs",
-        "typescript": "typescript-language-server",
-    }
-    return aliases.get(server_id, server_id)
-
-
-def _backend_warnings() -> list:
-    """Return human-readable notes about LSP backend tools that are missing
-    in a way that won't surface elsewhere.
-
-    Some language servers ship as thin wrappers around an external CLI for
-    actual diagnostics: they spawn cleanly but silently report nothing when
-    the sidecar binary isn't on PATH.  bash-language-server / shellcheck
-    is the load-bearing example.
-
-    Returned strings are short, actionable, and include the install
-    suggestion across common platforms.
-    """
-    import shutil as _shutil
-    from agent.lsp.install import hermes_lsp_bin_dir
-    notes: list = []
-    bash_installed = _shutil.which("bash-language-server") is not None or (
-        (hermes_lsp_bin_dir() / "bash-language-server").exists()
-    )
-    if bash_installed and _shutil.which("shellcheck") is None:
-        notes.append(
-            "bash-language-server is installed but shellcheck is missing — "
-            "diagnostics will be empty (apt: shellcheck, brew: shellcheck, "
-            "scoop: shellcheck)."
-        )
-    return notes
@@ -1,930 +0,0 @@
-"""Async LSP client over stdin/stdout.
-
-One :class:`LSPClient` corresponds to one ``(language_server, workspace_root)``
-pair — exactly what OpenCode keys clients on, and the same shape Claude
-Code uses.  The client owns a child process, drives the JSON-RPC
-exchange, and exposes:
-
- :meth:`open_file` / :meth:`change_file` — text document sync
- :meth:`wait_for_diagnostics` — block until the server emits fresh
-  diagnostics for a specific file (or a timeout fires)
- :meth:`diagnostics_for` — read the current per-file diagnostic store
- :meth:`shutdown` — graceful close + SIGTERM/SIGKILL fallback
-
-The class is designed for async use from a single asyncio event loop.
-The :class:`agent.lsp.manager.LSPService` runs an event loop in a
-background thread so the synchronous file_operations layer can call
-into it via :func:`agent.lsp.manager.LSPService.touch_file`.
-
-Implementation notes:
-
- Push diagnostics from ``textDocument/publishDiagnostics`` are stored per
-  file path (not URI) in :attr:`_push_diagnostics`.  Pull diagnostics go
-  in :attr:`_pull_diagnostics`.  The merged view dedupes by content.
-
- Whole-document sync.  Even when the server advertises incremental
-  sync, we send a single ``contentChanges`` entry replacing the
-  entire document.  Pretending to be incremental while sending a
-  full replacement is well-tolerated by every major server and saves
-  range bookkeeping.  See OpenCode's ``client.ts:584-659`` for the
-  same trick.
-
- The "touch-file dance": every ``open_file`` call also fires a
-  ``workspace/didChangeWatchedFiles`` notification (CREATED on the
-  first open, CHANGED thereafter).  Some servers (clangd, eslint)
-  only re-scan when this notification fires, even though the LSP spec
-  doesn't strictly require it.
-
- ``ContentModified`` (-32801) errors get retried with exponential
-  backoff up to 3 times.  This matches Claude Code's
-  ``LSPServerInstance.sendRequest``.
-"""
-from __future__ import annotations
-
-import asyncio
-import logging
-import os
-from pathlib import Path
-from typing import Any, Awaitable, Callable, Dict, List, Optional, Set
-from urllib.parse import quote, unquote
-
-from agent.lsp.protocol import (
-    ERROR_CONTENT_MODIFIED,
-    ERROR_METHOD_NOT_FOUND,
-    LSPProtocolError,
-    LSPRequestError,
-    classify_message,
-    encode_message,
-    make_error_response,
-    make_notification,
-    make_request,
-    make_response,
-    read_message,
-)
-
-logger = logging.getLogger("agent.lsp.client")
-
-# Timeouts (seconds) — mirror OpenCode's constants, scaled to seconds.
-INITIALIZE_TIMEOUT = 45.0
-DIAGNOSTICS_DOCUMENT_WAIT = 5.0
-DIAGNOSTICS_FULL_WAIT = 10.0
-DIAGNOSTICS_REQUEST_TIMEOUT = 3.0
-PUSH_DEBOUNCE = 0.15
-SHUTDOWN_GRACE = 1.0  # seconds between SIGTERM and SIGKILL
-
-# Retry policy for transient ContentModified errors.
-MAX_CONTENT_MODIFIED_RETRIES = 3
-RETRY_BASE_DELAY = 0.5  # 0.5, 1.0, 2.0 — exponential
-
-
-def file_uri(path: str) -> str:
-    """Return ``file://`` URI for an absolute filesystem path.
-
-    Mirrors Node's ``pathToFileURL`` — handles spaces, unicode, and
-    Windows drive letters (``C:\\foo`` → ``file:///C:/foo``).
-    """
-    abs_path = os.path.abspath(path)
-    if os.name == "nt":
-        # Windows: backslash → forward slash, prepend extra slash so
-        # the drive letter shows up as part of the path component.
-        abs_path = abs_path.replace("\\", "/")
-        if not abs_path.startswith("/"):
-            abs_path = "/" + abs_path
-    return "file://" + quote(abs_path, safe="/:")
-
-
-def uri_to_path(uri: str) -> str:
-    """Inverse of :func:`file_uri`."""
-    if not uri.startswith("file://"):
-        return uri
-    raw = uri[len("file://"):]
-    if os.name == "nt" and raw.startswith("/") and len(raw) > 2 and raw[2] == ":":
-        raw = raw[1:]  # strip leading slash before drive letter
-    return os.path.normpath(unquote(raw))
-
-
-def _end_position(text: str) -> Dict[str, int]:
-    """Return the LSP Position at the end of ``text``.
-
-    Used to construct a single-range "replace whole document" change
-    for ``textDocument/didChange`` regardless of the server's declared
-    sync mode.
-    """
-    if not text:
-        return {"line": 0, "character": 0}
-    lines = text.splitlines(keepends=False)
-    last_line = len(lines) - 1
-    last_col = len(lines[-1]) if lines else 0
-    # If the text ends with a trailing newline, ``splitlines`` won't
-    # represent it.  The end position is then the start of the next
-    # (empty) line — line index is len(lines), column 0.
-    if text.endswith(("\n", "\r")):
-        return {"line": last_line + 1, "character": 0}
-    return {"line": last_line, "character": last_col}
-
-
-class LSPClient:
-    """Async LSP client tied to one server process and one workspace root.
-
-    Lifecycle:
-
-        c = LSPClient(server_id=..., workspace_root=..., command=[...])
-        await c.start()       # spawn + initialize
-        ver = await c.open_file("/path/to/foo.py")
-        await c.wait_for_diagnostics("/path/to/foo.py", ver)
-        diags = c.diagnostics_for("/path/to/foo.py")
-        await c.shutdown()
-    """
-
-    # ------------------------------------------------------------------
-    # construction + lifecycle
-    # ------------------------------------------------------------------
-
-    def __init__(
-        self,
-        *,
-        server_id: str,
-        workspace_root: str,
-        command: List[str],
-        env: Optional[Dict[str, str]] = None,
-        cwd: Optional[str] = None,
-        initialization_options: Optional[Dict[str, Any]] = None,
-        seed_diagnostics_on_first_push: bool = False,
-    ) -> None:
-        self.server_id = server_id
-        self.workspace_root = workspace_root
-        self._command = list(command)
-        self._env = env
-        self._cwd = cwd or workspace_root
-        self._init_options = initialization_options or {}
-        self._seed_first_push = seed_diagnostics_on_first_push
-
-        # Process + streams
-        self._proc: Optional[asyncio.subprocess.Process] = None
-        self._stderr_task: Optional[asyncio.Task] = None
-        self._reader_task: Optional[asyncio.Task] = None
-
-        # Request/response correlation
-        self._next_id: int = 0
-        self._pending: Dict[int, asyncio.Future] = {}
-
-        # Server-side request handlers (server → client requests).
-        # Kept small and explicit; everything else returns method-not-found.
-        self._request_handlers: Dict[str, Callable[[Any], Awaitable[Any]]] = {
-            "window/workDoneProgress/create": self._handle_work_done_create,
-            "workspace/configuration": self._handle_workspace_configuration,
-            "client/registerCapability": self._handle_register_capability,
-            "client/unregisterCapability": self._handle_unregister_capability,
-            "workspace/workspaceFolders": self._handle_workspace_folders,
-            "workspace/diagnostic/refresh": self._handle_diagnostic_refresh,
-        }
-        # Notifications (server → client) we care about.
-        self._notification_handlers: Dict[str, Callable[[Any], None]] = {
-            "textDocument/publishDiagnostics": self._handle_publish_diagnostics,
-            # Everything else (window/showMessage, $/progress, etc.)
-            # is silently dropped by default.
-        }
-
-        # Tracked file state — required for didChange version bumps.
-        self._files: Dict[str, Dict[str, Any]] = {}
-        # Diagnostic stores, keyed by file path (NOT URI).
-        self._push_diagnostics: Dict[str, List[Dict[str, Any]]] = {}
-        self._pull_diagnostics: Dict[str, List[Dict[str, Any]]] = {}
-        # Per-path "last published" time so wait-for-fresh logic works.
-        self._published: Dict[str, float] = {}
-        # Per-path version of the latest push (matches our didChange
-        # version when the server respects it).
-        self._published_version: Dict[str, int] = {}
-        # First-push seen flag, for typescript-style seed-on-first-push.
-        self._first_push_seen: Set[str] = set()
-        # Capability registrations — only diagnostic ones are tracked.
-        self._diagnostic_registrations: Dict[str, Dict[str, Any]] = {}
-
-        # State machine
-        self._state: str = "stopped"
-        self._initialize_result: Optional[Dict[str, Any]] = None
-        self._sync_kind: int = 1  # 1=Full, 2=Incremental
-        self._stopping: bool = False
-
-        # Push event for waiters.
-        self._push_event = asyncio.Event()
-        # Monotonic counter incremented on every publishDiagnostics push.
-        # Waiters snapshot it on entry and treat any increase as
-        # "something happened, recheck the predicate".  Avoids the
-        # asyncio.Event sticky-state trap.
-        self._push_counter = 0
-        # Registration change event so wait_for_diagnostics can re-loop
-        # when the server announces a new dynamic provider.
-        self._registration_event = asyncio.Event()
-
-    @property
-    def is_running(self) -> bool:
-        return self._state == "running" and self._proc is not None and self._proc.returncode is None
-
-    @property
-    def state(self) -> str:
-        return self._state
-
-    async def start(self) -> None:
-        """Spawn the server and complete the initialize handshake.
-
-        Raises any exception encountered during spawn/init.  On failure
-        the process is killed and the client is left in state
-        ``"error"`` — re-call ``start()`` to retry.
-        """
-        if self._state in ("running", "starting"):
-            return
-        self._state = "starting"
-        try:
-            await self._spawn()
-            await self._initialize()
-            self._state = "running"
-        except Exception:
-            self._state = "error"
-            await self._cleanup_process()
-            raise
-
-    async def _spawn(self) -> None:
-        env = dict(os.environ)
-        if self._env:
-            env.update(self._env)
-
-        try:
-            self._proc = await asyncio.create_subprocess_exec(
-                self._command[0],
-                *self._command[1:],
-                stdin=asyncio.subprocess.PIPE,
-                stdout=asyncio.subprocess.PIPE,
-                stderr=asyncio.subprocess.PIPE,
-                env=env,
-                cwd=self._cwd,
-            )
-        except FileNotFoundError as e:
-            raise LSPProtocolError(
-                f"LSP server binary not found: {self._command[0]} ({e})"
-            ) from e
-
-        # Drain stderr at debug level — if we don't, the pipe buffer
-        # fills and the server hangs.
-        self._stderr_task = asyncio.create_task(self._drain_stderr())
-        # Start the reader loop.
-        self._reader_task = asyncio.create_task(self._reader_loop())
-
-    async def _drain_stderr(self) -> None:
-        if self._proc is None or self._proc.stderr is None:
-            return
-        try:
-            while True:
-                line = await self._proc.stderr.readline()
-                if not line:
-                    break
-                text = line.decode("utf-8", errors="replace").rstrip()
-                if text:
-                    logger.debug("[%s] stderr: %s", self.server_id, text[:1000])
-        except (asyncio.CancelledError, OSError):
-            pass
-
-    async def _reader_loop(self) -> None:
-        if self._proc is None or self._proc.stdout is None:
-            return
-        try:
-            while True:
-                msg = await read_message(self._proc.stdout)
-                if msg is None:
-                    logger.debug("[%s] server closed stdout cleanly", self.server_id)
-                    break
-                kind, key = classify_message(msg)
-                if kind == "response":
-                    self._dispatch_response(key, msg)
-                elif kind == "request":
-                    asyncio.create_task(self._dispatch_request(key, msg))
-                elif kind == "notification":
-                    self._dispatch_notification(key, msg)
-                else:
-                    logger.warning("[%s] dropping invalid message: %r", self.server_id, msg)
-        except LSPProtocolError as e:
-            logger.warning("[%s] protocol error in reader loop: %s", self.server_id, e)
-        except (asyncio.CancelledError, OSError):
-            pass
-        finally:
-            # Wake up any pending requests so they can fail fast.
-            for fut in list(self._pending.values()):
-                if not fut.done():
-                    fut.set_exception(LSPProtocolError("server connection closed"))
-            self._pending.clear()
-
-    async def _initialize(self) -> None:
-        params = {
-            "rootUri": file_uri(self.workspace_root),
-            "rootPath": self.workspace_root,
-            "processId": os.getpid(),
-            "workspaceFolders": [
-                {"name": "workspace", "uri": file_uri(self.workspace_root)}
-            ],
-            "initializationOptions": self._init_options,
-            "capabilities": {
-                "window": {"workDoneProgress": True},
-                "workspace": {
-                    "configuration": True,
-                    "workspaceFolders": True,
-                    "didChangeWatchedFiles": {"dynamicRegistration": True},
-                    "diagnostics": {"refreshSupport": False},
-                },
-                "textDocument": {
-                    "synchronization": {
-                        "dynamicRegistration": False,
-                        "didOpen": True,
-                        "didChange": True,
-                        "didSave": True,
-                        "willSave": False,
-                        "willSaveWaitUntil": False,
-                    },
-                    "diagnostic": {
-                        "dynamicRegistration": True,
-                        "relatedDocumentSupport": True,
-                    },
-                    "publishDiagnostics": {
-                        "relatedInformation": True,
-                        "tagSupport": {"valueSet": [1, 2]},
-                        "versionSupport": True,
-                        "codeDescriptionSupport": True,
-                        "dataSupport": False,
-                    },
-                    "hover": {"contentFormat": ["markdown", "plaintext"]},
-                    "definition": {"linkSupport": True},
-                    "references": {},
-                    "documentSymbol": {"hierarchicalDocumentSymbolSupport": True},
-                },
-                "general": {"positionEncodings": ["utf-16"]},
-            },
-        }
-
-        result = await asyncio.wait_for(
-            self._send_request("initialize", params),
-            timeout=INITIALIZE_TIMEOUT,
-        )
-        self._initialize_result = result
-        self._sync_kind = self._extract_sync_kind(result.get("capabilities") or {})
-
-        await self._send_notification("initialized", {})
-        if self._init_options:
-            # Some servers (vtsls, eslint) want config pushed via
-            # didChangeConfiguration even if it was sent in
-            # initializationOptions.
-            await self._send_notification(
-                "workspace/didChangeConfiguration",
-                {"settings": self._init_options},
-            )
-
-    @staticmethod
-    def _extract_sync_kind(capabilities: dict) -> int:
-        sync = capabilities.get("textDocumentSync")
-        if isinstance(sync, int):
-            return sync
-        if isinstance(sync, dict):
-            change = sync.get("change")
-            if isinstance(change, int):
-                return change
-        return 1  # default to Full
-
-    async def shutdown(self) -> None:
-        """Best-effort graceful shutdown.
-
-        Sends ``shutdown`` + ``exit``, then SIGTERMs/SIGKILLs the
-        process if it doesn't exit cleanly.  Idempotent.
-        """
-        if self._stopping:
-            return
-        self._stopping = True
-        try:
-            if self.is_running:
-                try:
-                    await asyncio.wait_for(self._send_request("shutdown", None), timeout=2.0)
-                except (asyncio.TimeoutError, LSPRequestError, LSPProtocolError):
-                    pass
-                try:
-                    await self._send_notification("exit", None)
-                except Exception:
-                    pass
-        finally:
-            self._state = "stopped"
-            await self._cleanup_process()
-
-    async def _cleanup_process(self) -> None:
-        if self._reader_task is not None and not self._reader_task.done():
-            self._reader_task.cancel()
-            try:
-                await self._reader_task
-            except (asyncio.CancelledError, Exception):  # noqa: BLE001
-                pass
-        if self._stderr_task is not None and not self._stderr_task.done():
-            self._stderr_task.cancel()
-            try:
-                await self._stderr_task
-            except (asyncio.CancelledError, Exception):  # noqa: BLE001
-                pass
-        proc = self._proc
-        self._proc = None
-        if proc is None:
-            return
-        if proc.returncode is None:
-            try:
-                proc.terminate()
-                try:
-                    await asyncio.wait_for(proc.wait(), timeout=SHUTDOWN_GRACE)
-                except asyncio.TimeoutError:
-                    try:
-                        proc.kill()
-                        await proc.wait()
-                    except ProcessLookupError:
-                        pass
-            except ProcessLookupError:
-                pass
-
-    # ------------------------------------------------------------------
-    # request / notification plumbing
-    # ------------------------------------------------------------------
-
-    async def _send_request(self, method: str, params: Any) -> Any:
-        if self._proc is None or self._proc.stdin is None or self._proc.stdin.is_closing():
-            raise LSPProtocolError(f"cannot send {method!r}: stdin closed")
-        loop = asyncio.get_running_loop()
-        req_id = self._next_id
-        self._next_id += 1
-        fut: asyncio.Future = loop.create_future()
-        self._pending[req_id] = fut
-        try:
-            self._proc.stdin.write(encode_message(make_request(req_id, method, params)))
-            await self._proc.stdin.drain()
-        except (BrokenPipeError, ConnectionResetError, OSError) as e:
-            self._pending.pop(req_id, None)
-            raise LSPProtocolError(f"send failed for {method!r}: {e}") from e
-        try:
-            return await fut
-        finally:
-            self._pending.pop(req_id, None)
-
-    async def _send_request_with_retry(self, method: str, params: Any, *, timeout: float) -> Any:
-        """Send a request, retrying on ``ContentModified`` (-32801).
-
-        Other errors propagate.  The retry policy matches Claude Code's
-        ``LSPServerInstance.sendRequest``: up to 3 retries after the
-        initial attempt, with delays of 0.5s, 1.0s, 2.0s.
-        """
-        for attempt in range(MAX_CONTENT_MODIFIED_RETRIES + 1):
-            try:
-                return await asyncio.wait_for(self._send_request(method, params), timeout=timeout)
-            except LSPRequestError as e:
-                if e.code == ERROR_CONTENT_MODIFIED and attempt < MAX_CONTENT_MODIFIED_RETRIES:
-                    await asyncio.sleep(RETRY_BASE_DELAY * (2 ** attempt))
-                    continue
-                raise
-
-    async def _send_notification(self, method: str, params: Any) -> None:
-        if self._proc is None or self._proc.stdin is None or self._proc.stdin.is_closing():
-            return
-        try:
-            self._proc.stdin.write(encode_message(make_notification(method, params)))
-            await self._proc.stdin.drain()
-        except (BrokenPipeError, ConnectionResetError, OSError) as e:
-            logger.debug("[%s] notify %s failed: %s", self.server_id, method, e)
-
-    async def _send_response(self, req_id: Any, result: Any) -> None:
-        if self._proc is None or self._proc.stdin is None or self._proc.stdin.is_closing():
-            return
-        try:
-            self._proc.stdin.write(encode_message(make_response(req_id, result)))
-            await self._proc.stdin.drain()
-        except (BrokenPipeError, ConnectionResetError, OSError):
-            pass
-
-    async def _send_error_response(self, req_id: Any, code: int, message: str) -> None:
-        if self._proc is None or self._proc.stdin is None or self._proc.stdin.is_closing():
-            return
-        try:
-            self._proc.stdin.write(encode_message(make_error_response(req_id, code, message)))
-            await self._proc.stdin.drain()
-        except (BrokenPipeError, ConnectionResetError, OSError):
-            pass
-
-    def _dispatch_response(self, req_id: int, msg: dict) -> None:
-        fut = self._pending.get(req_id)
-        if fut is None or fut.done():
-            return
-        if "error" in msg:
-            err = msg["error"] or {}
-            fut.set_exception(
-                LSPRequestError(
-                    code=int(err.get("code", -32000)),
-                    message=str(err.get("message", "unknown")),
-                    data=err.get("data"),
-                )
-            )
-        else:
-            fut.set_result(msg.get("result"))
-
-    async def _dispatch_request(self, req_id: Any, msg: dict) -> None:
-        method = msg.get("method", "")
-        params = msg.get("params")
-        handler = self._request_handlers.get(method)
-        if handler is None:
-            await self._send_error_response(req_id, ERROR_METHOD_NOT_FOUND, f"method not found: {method}")
-            return
-        try:
-            result = await handler(params)
-        except Exception as e:  # noqa: BLE001 — protocol must not blow up
-            logger.warning("[%s] request handler %s failed: %s", self.server_id, method, e)
-            await self._send_error_response(req_id, -32000, f"handler failed: {e}")
-            return
-        await self._send_response(req_id, result)
-
-    def _dispatch_notification(self, method: str, msg: dict) -> None:
-        handler = self._notification_handlers.get(method)
-        if handler is None:
-            return
-        try:
-            handler(msg.get("params"))
-        except Exception as e:  # noqa: BLE001
-            logger.debug("[%s] notification handler %s failed: %s", self.server_id, method, e)
-
-    # ------------------------------------------------------------------
-    # built-in server → client request handlers
-    # ------------------------------------------------------------------
-
-    async def _handle_work_done_create(self, params: Any) -> Any:
-        # Acknowledge progress tokens — required by some servers.
-        return None
-
-    async def _handle_workspace_configuration(self, params: Any) -> Any:
-        # Walk dotted sections through initializationOptions.  Mirrors
-        # OpenCode's `client.ts:198-220` — return null when missing.
-        if not isinstance(params, dict):
-            return [None]
-        items = params.get("items") or []
-        out: List[Any] = []
-        for item in items:
-            if not isinstance(item, dict):
-                out.append(None)
-                continue
-            section = item.get("section")
-            if not section or not self._init_options:
-                out.append(self._init_options or None)
-                continue
-            cur: Any = self._init_options
-            for part in str(section).split("."):
-                if isinstance(cur, dict) and part in cur:
-                    cur = cur[part]
-                else:
-                    cur = None
-                    break
-            out.append(cur)
-        return out
-
-    async def _handle_register_capability(self, params: Any) -> Any:
-        if not isinstance(params, dict):
-            return None
-        for reg in params.get("registrations") or []:
-            if not isinstance(reg, dict):
-                continue
-            method = reg.get("method")
-            reg_id = reg.get("id")
-            if method == "textDocument/diagnostic" and reg_id:
-                self._diagnostic_registrations[str(reg_id)] = reg
-                self._registration_event.set()
-        return None
-
-    async def _handle_unregister_capability(self, params: Any) -> Any:
-        if not isinstance(params, dict):
-            return None
-        for unreg in params.get("unregisterations") or []:
-            if not isinstance(unreg, dict):
-                continue
-            reg_id = unreg.get("id")
-            if reg_id:
-                self._diagnostic_registrations.pop(str(reg_id), None)
-        return None
-
-    async def _handle_workspace_folders(self, params: Any) -> Any:
-        return [{"name": "workspace", "uri": file_uri(self.workspace_root)}]
-
-    async def _handle_diagnostic_refresh(self, params: Any) -> Any:
-        # We don't honour refresh; we re-pull on every ``touch_file`` call.
-        return None
-
-    # ------------------------------------------------------------------
-    # publishDiagnostics handler
-    # ------------------------------------------------------------------
-
-    def _handle_publish_diagnostics(self, params: Any) -> None:
-        if not isinstance(params, dict):
-            return
-        uri = params.get("uri")
-        if not isinstance(uri, str):
-            return
-        path = uri_to_path(uri)
-        diagnostics = params.get("diagnostics") or []
-        if not isinstance(diagnostics, list):
-            diagnostics = []
-        version = params.get("version")
-        loop_time = asyncio.get_event_loop().time()
-
-        if self._seed_first_push and path not in self._first_push_seen:
-            # First push: seed without firing the event so a waiter
-            # doesn't resolve on the very first push (which arrives
-            # before the user-triggered didChange could've produced
-            # fresh diagnostics).
-            self._first_push_seen.add(path)
-            self._push_diagnostics[path] = diagnostics
-            self._published[path] = loop_time
-            if isinstance(version, int):
-                self._published_version[path] = version
-            return
-
-        self._push_diagnostics[path] = diagnostics
-        self._published[path] = loop_time
-        if isinstance(version, int):
-            self._published_version[path] = version
-        self._first_push_seen.add(path)
-        # Bump the monotonic push counter and wake every waiter.  We
-        # keep the Event sticky-set so any wait already in progress
-        # resolves; waiters re-check their predicate after waking and
-        # decide whether to keep waiting.  ``_push_counter`` is what
-        # they actually compare against to detect a fresh event.
-        self._push_counter += 1
-        self._push_event.set()
-
-    # ------------------------------------------------------------------
-    # public file-sync API
-    # ------------------------------------------------------------------
-
-    async def open_file(self, path: str, *, language_id: str = "plaintext") -> int:
-        """Send didOpen (first time) or didChange (subsequent) for ``path``.
-
-        Returns the new document version number that the agent's
-        ``wait_for_diagnostics`` should match against.
-        """
-        if not self.is_running:
-            raise LSPProtocolError("client not running")
-
-        abs_path = os.path.abspath(path)
-        try:
-            text = Path(abs_path).read_text(encoding="utf-8", errors="replace")
-        except OSError as e:
-            raise LSPProtocolError(f"cannot read {abs_path}: {e}") from e
-
-        uri = file_uri(abs_path)
-        existing = self._files.get(abs_path)
-
-        if existing is not None:
-            # Re-open: bump version, fire didChangeWatchedFiles + didChange.
-            await self._send_notification(
-                "workspace/didChangeWatchedFiles",
-                {"changes": [{"uri": uri, "type": 2}]},  # 2 = CHANGED
-            )
-            new_version = existing["version"] + 1
-            old_text = existing["text"]
-            content_changes: List[Dict[str, Any]]
-            if self._sync_kind == 2:
-                content_changes = [
-                    {
-                        "range": {
-                            "start": {"line": 0, "character": 0},
-                            "end": _end_position(old_text),
-                        },
-                        "text": text,
-                    }
-                ]
-            else:
-                content_changes = [{"text": text}]
-            await self._send_notification(
-                "textDocument/didChange",
-                {
-                    "textDocument": {"uri": uri, "version": new_version},
-                    "contentChanges": content_changes,
-                },
-            )
-            self._files[abs_path] = {"version": new_version, "text": text}
-            return new_version
-
-        # First open: didChangeWatchedFiles CREATED + didOpen.
-        await self._send_notification(
-            "workspace/didChangeWatchedFiles",
-            {"changes": [{"uri": uri, "type": 1}]},  # 1 = CREATED
-        )
-        # Clear any stale push/pull entries — fresh open should start
-        # from scratch.
-        self._push_diagnostics.pop(abs_path, None)
-        self._pull_diagnostics.pop(abs_path, None)
-        self._published.pop(abs_path, None)
-        self._published_version.pop(abs_path, None)
-        await self._send_notification(
-            "textDocument/didOpen",
-            {
-                "textDocument": {
-                    "uri": uri,
-                    "languageId": language_id,
-                    "version": 0,
-                    "text": text,
-                }
-            },
-        )
-        self._files[abs_path] = {"version": 0, "text": text}
-        return 0
-
-    async def save_file(self, path: str) -> None:
-        """Send didSave for ``path``.  Some linters re-scan only on save."""
-        if not self.is_running:
-            return
-        abs_path = os.path.abspath(path)
-        await self._send_notification(
-            "textDocument/didSave",
-            {"textDocument": {"uri": file_uri(abs_path)}},
-        )
-
-    # ------------------------------------------------------------------
-    # diagnostics: pull + wait
-    # ------------------------------------------------------------------
-
-    async def _pull_document_diagnostics(self, path: str) -> None:
-        """Send ``textDocument/diagnostic`` for one file.
-
-        Stores results into :attr:`_pull_diagnostics`.  Silently
-        no-ops on errors (server may not support the pull endpoint).
-        """
-        try:
-            params: Dict[str, Any] = {
-                "textDocument": {"uri": file_uri(os.path.abspath(path))}
-            }
-            result = await self._send_request_with_retry(
-                "textDocument/diagnostic",
-                params,
-                timeout=DIAGNOSTICS_REQUEST_TIMEOUT,
-            )
-        except (LSPRequestError, LSPProtocolError, asyncio.TimeoutError) as e:
-            logger.debug("[%s] document diagnostic pull failed: %s", self.server_id, e)
-            return
-        if not isinstance(result, dict):
-            return
-        items = result.get("items")
-        if isinstance(items, list):
-            self._pull_diagnostics[os.path.abspath(path)] = items
-        related = result.get("relatedDocuments")
-        if isinstance(related, dict):
-            for uri, sub in related.items():
-                if not isinstance(sub, dict):
-                    continue
-                sub_items = sub.get("items")
-                if isinstance(sub_items, list):
-                    self._pull_diagnostics[uri_to_path(uri)] = sub_items
-
-    async def wait_for_diagnostics(
-        self,
-        path: str,
-        version: int,
-        *,
-        mode: str = "document",
-    ) -> None:
-        """Wait for the server to publish diagnostics for ``path`` at ``version``.
-
-        ``mode`` is ``"document"`` (5s budget, document pulls) or
-        ``"full"`` (10s budget, also workspace pulls).  Best-effort —
-        returns silently on timeout.  Does NOT raise if the server
-        doesn't support pull diagnostics; we still get the push side.
-        """
-        budget = DIAGNOSTICS_FULL_WAIT if mode == "full" else DIAGNOSTICS_DOCUMENT_WAIT
-        deadline = asyncio.get_event_loop().time() + budget
-        abs_path = os.path.abspath(path)
-
-        while True:
-            remaining = deadline - asyncio.get_event_loop().time()
-            if remaining <= 0:
-                return
-
-            # Concurrent: document pull + push wait.
-            pull_task = asyncio.create_task(self._pull_document_diagnostics(abs_path))
-            push_task = asyncio.create_task(self._wait_for_fresh_push(abs_path, version, remaining))
-            done, pending = await asyncio.wait(
-                {pull_task, push_task},
-                timeout=remaining,
-                return_when=asyncio.FIRST_COMPLETED,
-            )
-            for t in pending:
-                t.cancel()
-            for t in pending:
-                try:
-                    await t
-                except (asyncio.CancelledError, Exception):  # noqa: BLE001
-                    pass
-
-            # If we got a fresh push for our version, we're done.
-            current_v = self._published_version.get(abs_path)
-            if abs_path in self._published and (
-                current_v is None or current_v >= version
-            ):
-                return
-
-            # Pull may have populated _pull_diagnostics — that's also
-            # success.
-            if abs_path in self._pull_diagnostics:
-                return
-
-            # Loop until budget runs out.
-
-    async def _wait_for_fresh_push(self, path: str, version: int, timeout: float) -> None:
-        """Wait until a publishDiagnostics arrives for ``path`` at ``version``+."""
-        deadline = asyncio.get_event_loop().time() + timeout
-        baseline = self._push_counter
-        while True:
-            current_v = self._published_version.get(path)
-            if path in self._published and (current_v is None or current_v >= version):
-                # Debounce — wait a tick in case more diagnostics arrive
-                # immediately after.  TS often emits in pairs.  We
-                # snapshot the counter so we wake on a *new* push, not
-                # on the one that satisfied us a moment ago.
-                debounce_baseline = self._push_counter
-                debounce_deadline = asyncio.get_event_loop().time() + PUSH_DEBOUNCE
-                while self._push_counter == debounce_baseline:
-                    remaining = debounce_deadline - asyncio.get_event_loop().time()
-                    if remaining <= 0:
-                        break
-                    self._push_event.clear()
-                    try:
-                        await asyncio.wait_for(self._push_event.wait(), timeout=remaining)
-                    except asyncio.TimeoutError:
-                        break
-                return
-            remaining = deadline - asyncio.get_event_loop().time()
-            if remaining <= 0:
-                return
-            if self._push_counter > baseline:
-                # New event arrived but predicate still false — re-check
-                # immediately without waiting again.
-                baseline = self._push_counter
-                continue
-            self._push_event.clear()
-            try:
-                await asyncio.wait_for(self._push_event.wait(), timeout=min(remaining, 0.5))
-            except asyncio.TimeoutError:
-                continue
-
-    def diagnostics_for(self, path: str) -> List[Dict[str, Any]]:
-        """Return current merged + deduped diagnostics for one file.
-
-        Diagnostics from push and pull stores are concatenated and
-        deduplicated by ``(severity, code, source, message, range)`` content
-        key.  Empty list if the server hasn't published anything.
-        """
-        abs_path = os.path.abspath(path)
-        push = self._push_diagnostics.get(abs_path) or []
-        pull = self._pull_diagnostics.get(abs_path) or []
-        return _dedupe(push, pull)
-
-
-def _dedupe(*lists: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
-    seen: Set[str] = set()
-    out: List[Dict[str, Any]] = []
-    for lst in lists:
-        for d in lst:
-            if not isinstance(d, dict):
-                continue
-            key = _diagnostic_key(d)
-            if key in seen:
-                continue
-            seen.add(key)
-            out.append(d)
-    return out
-
-
-def _diagnostic_key(d: Dict[str, Any]) -> str:
-    """Content-equality key for a diagnostic.
-
-    Matches the structural equality used in claude-code's
-    ``areDiagnosticsEqual``: message + severity + source + code +
-    range coords.  The range is flattened to a ``line:char-line:char``
-    string to keep the key stable across dict orderings.
-    """
-    rng = d.get("range") or {}
-    start = rng.get("start") or {}
-    end = rng.get("end") or {}
-    code = d.get("code")
-    if code is not None and not isinstance(code, str):
-        code = str(code)
-    return "\x00".join(
-        [
-            str(d.get("severity") or 1),
-            str(code or ""),
-            str(d.get("source") or ""),
-            str(d.get("message") or "").strip(),
-            f"{start.get('line', 0)}:{start.get('character', 0)}-{end.get('line', 0)}:{end.get('character', 0)}",
-        ]
-    )
-
-
-__all__ = [
-    "LSPClient",
-    "file_uri",
-    "uri_to_path",
-    "INITIALIZE_TIMEOUT",
-    "DIAGNOSTICS_DOCUMENT_WAIT",
-    "DIAGNOSTICS_FULL_WAIT",
-]
@@ -1,213 +0,0 @@
-"""Structured logging with steady-state silence for the LSP layer.
-
-The LSP layer fires on every write_file/patch.  In a busy session
-that's hundreds of events.  We want users to be able to ``rg`` the
-log for "did LSP fire on that edit?" without drowning in noise.
-
-The level model:
-
-* ``DEBUG`` for steady-state events that have no novel signal:
-  ``clean``, ``feature off``, ``extension not mapped``, ``no project
-  root for already-announced file``, ``server unavailable for
-  already-announced binary``.  These never reach ``agent.log`` at the
-  default INFO threshold.
-
-* ``INFO`` for state transitions worth surfacing exactly once per
-  session: ``active for <root>`` the first time a (server_id,
-  workspace_root) client starts, ``no project root for <path>``
-  the first time we see that file.  Plus every diagnostic event
-  (those are inherently rare and per-edit, exactly what users grep
-  for).
-
-* ``WARNING`` for action-required failures: ``server unavailable``
-  (binary not on PATH) the first time per (server_id, binary),
-  ``no server configured`` once per language.  Per-call WARNING for
-  timeouts and unexpected bridge exceptions.
-
-The dedup is in-process module-level sets.  Each set grows at most by
-the number of distinct (server_id, root) and (server_id, binary)
-pairs touched in one Python process — bytes of memory in even an
-aggressive monorepo session.  Bounded LRU was rejected: evicting an
-entry would risk re-firing the WARNING/INFO line we explicitly want
-to suppress.
-
-Grep recipe::
-
-    tail -f ~/.hermes/logs/agent.log | rg 'lsp\\['
-"""
-from __future__ import annotations
-
-import logging
-import os
-import threading
-from typing import Tuple
-
-# Dedicated logger name so the documented grep recipe survives a
-# ``logging.getLogger(__name__)`` rename of any internal module.
-event_log = logging.getLogger("hermes.lint.lsp")
-
-# ---------------------------------------------------------------------------
-# Once-per-X dedup sets
-# ---------------------------------------------------------------------------
-
-_announce_lock = threading.Lock()
-_announced_active: set = set()        # keys: (server_id, workspace_root)
-_announced_unavailable: set = set()   # keys: (server_id, binary_path_or_name)
-_announced_no_root: set = set()       # keys: (server_id, file_path)
-_announced_no_server: set = set()     # keys: (server_id,)
-
-
-def _short_path(file_path: str) -> str:
-    """Render *file_path* relative to the cwd when sensible, else absolute.
-
-    Keeps log lines readable for the common case (the user is inside
-    the project they're editing) without emitting brittle ``../../..``
-    chains for the cross-tree case.
-    """
-    if not file_path:
-        return file_path
-    try:
-        rel = os.path.relpath(file_path)
-    except ValueError:
-        return file_path
-    if rel.startswith(".." + os.sep) or rel == "..":
-        return file_path
-    return rel
-
-
-def _emit(server_id: str, level: int, message: str) -> None:
-    event_log.log(level, "lsp[%s] %s", server_id, message)
-
-
-def _announce_once(bucket: set, key: Tuple) -> bool:
-    """Return True if *key* has not been announced for *bucket* yet.
-
-    Atomically marks the key as announced so concurrent callers
-    cannot both win the race and double-log.
-    """
-    with _announce_lock:
-        if key in bucket:
-            return False
-        bucket.add(key)
-        return True
-
-
-# ---------------------------------------------------------------------------
-# Public event helpers — call these from the LSP layer.
-# ---------------------------------------------------------------------------
-
-
-def log_clean(server_id: str, file_path: str) -> None:
-    """No diagnostics emitted for *file_path*.  DEBUG (silent at default)."""
-    _emit(server_id, logging.DEBUG, f"clean ({_short_path(file_path)})")
-
-
-def log_disabled(server_id: str, file_path: str, reason: str) -> None:
-    """LSP intentionally skipped for this file (feature off, ext unmapped,
-    backend not local, etc.).  DEBUG."""
-    _emit(server_id, logging.DEBUG, f"skipped: {reason} ({_short_path(file_path)})")
-
-
-def log_active(server_id: str, workspace_root: str) -> None:
-    """A new LSP client started for (server_id, workspace_root).
-
-    INFO once per (server_id, workspace_root); DEBUG thereafter.
-    Lets users verify "is LSP actually running?" with a single grep.
-    """
-    key = (server_id, workspace_root)
-    if _announce_once(_announced_active, key):
-        _emit(server_id, logging.INFO, f"active for {workspace_root}")
-    else:
-        _emit(server_id, logging.DEBUG, f"reused client for {workspace_root}")
-
-
-def log_diagnostics(server_id: str, file_path: str, count: int) -> None:
-    """Diagnostics arrived for a file.  INFO every time — these are the
-    failure signals users actually want to grep for, and they are
-    inherently rare per edit."""
-    _emit(server_id, logging.INFO, f"{count} diags ({_short_path(file_path)})")
-
-
-def log_no_project_root(server_id: str, file_path: str) -> None:
-    """File had no recognised project marker.  INFO once per file,
-    DEBUG thereafter."""
-    key = (server_id, file_path)
-    if _announce_once(_announced_no_root, key):
-        _emit(server_id, logging.INFO, f"no project root for {_short_path(file_path)}")
-    else:
-        _emit(server_id, logging.DEBUG, f"no project root for {_short_path(file_path)}")
-
-
-def log_server_unavailable(server_id: str, binary_or_pkg: str) -> None:
-    """The server binary couldn't be resolved.  WARNING once per
-    (server_id, binary), DEBUG thereafter so a hundred subsequent
-    .py edits don't spam the log."""
-    key = (server_id, binary_or_pkg)
-    if _announce_once(_announced_unavailable, key):
-        _emit(
-            server_id,
-            logging.WARNING,
-            f"server unavailable: {binary_or_pkg} not found "
-            "(install via `hermes lsp install <id>` or set lsp.servers.<id>.command)",
-        )
-    else:
-        _emit(server_id, logging.DEBUG, f"server still unavailable: {binary_or_pkg}")
-
-
-def log_no_server_configured(server_id: str) -> None:
-    """No spawn recipe for this language.  WARNING once."""
-    if _announce_once(_announced_no_server, (server_id,)):
-        _emit(server_id, logging.WARNING, "no server configured")
-
-
-def log_timeout(server_id: str, file_path: str, kind: str = "diagnostics") -> None:
-    """A request to the server timed out.  WARNING every time — these are
-    inherently novel events worth surfacing on each occurrence."""
-    _emit(
-        server_id,
-        logging.WARNING,
-        f"{kind} timed out for {_short_path(file_path)}",
-    )
-
-
-def log_server_error(server_id: str, file_path: str, exc: BaseException) -> None:
-    """An unexpected exception bubbled out of the LSP layer.  WARNING."""
-    _emit(
-        server_id,
-        logging.WARNING,
-        f"unexpected error for {_short_path(file_path)}: {type(exc).__name__}: {exc}",
-    )
-
-
-def log_spawn_failed(server_id: str, workspace_root: str, exc: BaseException) -> None:
-    """The LSP server failed to spawn or initialize.  WARNING."""
-    _emit(
-        server_id,
-        logging.WARNING,
-        f"spawn/initialize failed for {workspace_root}: {type(exc).__name__}: {exc}",
-    )
-
-
-def reset_announce_caches() -> None:
-    """Test-only: clear the dedup caches.  Production code never calls this."""
-    with _announce_lock:
-        _announced_active.clear()
-        _announced_unavailable.clear()
-        _announced_no_root.clear()
-        _announced_no_server.clear()
-
-
-__all__ = [
-    "event_log",
-    "log_clean",
-    "log_disabled",
-    "log_active",
-    "log_diagnostics",
-    "log_no_project_root",
-    "log_server_unavailable",
-    "log_no_server_configured",
-    "log_timeout",
-    "log_server_error",
-    "log_spawn_failed",
-    "reset_announce_caches",
-]
@@ -1,376 +0,0 @@
-"""Auto-installation of LSP server binaries.
-
-Tries to install missing servers using whatever package manager is
-appropriate.  All installs go to a Hermes-owned bin staging dir,
-``<HERMES_HOME>/lsp/bin/``, so we don't pollute the user's global
-toolchain.
-
-Strategies:
-
-* ``auto`` — attempt to install with the best available package
-  manager.  This is the default.
-* ``manual`` — never install; if a binary is missing, the server is
-  silently skipped and the user is told about it via ``hermes lsp
-  status``.
-* ``off`` — same as ``manual`` for now (kept distinct so we can
-  evolve behavior later, e.g. logging differently).
-
-The actual installs happen synchronously the first time a server is
-needed and concurrent calls to :func:`try_install` for the same
-package are deduplicated via a per-package lock.
-
-Failure modes are non-fatal: every install path is wrapped in
-try/except and returns ``None`` on failure.  The tool layer then
-falls back to its in-process syntax checker, exactly as if the user
-hadn't enabled LSP at all.
-"""
-from __future__ import annotations
-
-import logging
-import os
-import shutil
-import subprocess
-import sys
-import threading
-from pathlib import Path
-from typing import Any, Dict, Optional
-
-logger = logging.getLogger("agent.lsp.install")
-
-# Package-name → install-recipe registry.  Each entry is a dict
-# with a strategy name, a package name, and an executable name.  When
-# the install completes, we look for the executable in
-# ``<HERMES_HOME>/lsp/bin/`` first, then on PATH.
-#
-# Optional fields:
-#   - ``extra_pkgs``: list of sibling packages to install alongside
-#     ``pkg`` in the same node_modules tree.  Used when an LSP server
-#     has a runtime peer dependency that npm doesn't auto-pull (e.g.
-#     typescript-language-server needs ``typescript``).
-INSTALL_RECIPES: Dict[str, Dict[str, Any]] = {
-    # Python
-    "pyright": {"strategy": "npm", "pkg": "pyright", "bin": "pyright-langserver"},
-    # JS/TS family
-    "typescript-language-server": {
-        "strategy": "npm",
-        "pkg": "typescript-language-server",
-        "bin": "typescript-language-server",
-        # typescript-language-server requires the `typescript` SDK
-        # (tsserver) to be importable from the same node_modules tree;
-        # otherwise initialize() fails with "Could not find a valid
-        # TypeScript installation".  Install them together.
-        "extra_pkgs": ["typescript"],
-    },
-    "@vue/language-server": {
-        "strategy": "npm",
-        "pkg": "@vue/language-server",
-        "bin": "vue-language-server",
-    },
-    "svelte-language-server": {
-        "strategy": "npm",
-        "pkg": "svelte-language-server",
-        "bin": "svelteserver",
-    },
-    "@astrojs/language-server": {
-        "strategy": "npm",
-        "pkg": "@astrojs/language-server",
-        "bin": "astro-ls",
-    },
-    "yaml-language-server": {
-        "strategy": "npm",
-        "pkg": "yaml-language-server",
-        "bin": "yaml-language-server",
-    },
-    "bash-language-server": {
-        "strategy": "npm",
-        "pkg": "bash-language-server",
-        "bin": "bash-language-server",
-    },
-    "intelephense": {"strategy": "npm", "pkg": "intelephense", "bin": "intelephense"},
-    "dockerfile-language-server-nodejs": {
-        "strategy": "npm",
-        "pkg": "dockerfile-language-server-nodejs",
-        "bin": "docker-langserver",
-    },
-    # Go
-    "gopls": {"strategy": "go", "pkg": "golang.org/x/tools/gopls@latest", "bin": "gopls"},
-    # Rust — too heavy (hundreds of MB to bootstrap).  We do NOT
-    # auto-install rust-analyzer; users install via rustup.
-    "rust-analyzer": {"strategy": "manual", "pkg": "", "bin": "rust-analyzer"},
-    # C/C++ — manual (clangd ships with LLVM, very heavy)
-    "clangd": {"strategy": "manual", "pkg": "", "bin": "clangd"},
-    # Lua — manual (LuaLS is platform-specific binaries from GitHub
-    # releases; complex enough that we punt to the user)
-    "lua-language-server": {"strategy": "manual", "pkg": "", "bin": "lua-language-server"},
-}
-
-
-_install_locks: Dict[str, threading.Lock] = {}
-_install_results: Dict[str, Optional[str]] = {}
-_install_lock_meta = threading.Lock()
-
-
-def hermes_lsp_bin_dir() -> Path:
-    """Return the Hermes-owned bin staging dir for LSP servers."""
-    home = os.environ.get("HERMES_HOME")
-    if home is None:
-        home = os.path.join(os.path.expanduser("~"), ".hermes")
-    p = Path(home) / "lsp" / "bin"
-    p.mkdir(parents=True, exist_ok=True)
-    return p
-
-
-def _existing_binary(name: str) -> Optional[str]:
-    """Probe the staging dir + PATH for a binary named ``name``."""
-    staged = hermes_lsp_bin_dir() / name
-    if staged.exists() and os.access(staged, os.X_OK):
-        return str(staged)
-    on_path = shutil.which(name)
-    if on_path:
-        return on_path
-    return None
-
-
-def _get_lock(pkg: str) -> threading.Lock:
-    with _install_lock_meta:
-        lock = _install_locks.get(pkg)
-        if lock is None:
-            lock = threading.Lock()
-            _install_locks[pkg] = lock
-        return lock
-
-
-def try_install(pkg: str, strategy: str = "auto") -> Optional[str]:
-    """Try to install ``pkg`` and return the binary path if successful.
-
-    ``strategy`` is ``"auto"``, ``"manual"``, or ``"off"``.  In
-    ``manual``/``off`` mode, this function only probes for an
-    existing binary and returns ``None`` if not found.
-
-    The install is cached per-package — a second call returns the
-    same path (or ``None``) without reinstalling.  Concurrent calls
-    are serialized.
-    """
-    if strategy != "auto":
-        # Only ``auto`` triggers an actual install.  In manual/off,
-        # we still check whether the binary already exists.
-        recipe = INSTALL_RECIPES.get(pkg, {})
-        bin_name = recipe.get("bin", pkg)
-        return _existing_binary(bin_name)
-
-    if pkg in _install_results:
-        return _install_results[pkg]
-
-    lock = _get_lock(pkg)
-    with lock:
-        # Double-check after acquiring lock.
-        if pkg in _install_results:
-            return _install_results[pkg]
-        result = _do_install(pkg)
-        _install_results[pkg] = result
-        return result
-
-
-def _do_install(pkg: str) -> Optional[str]:
-    recipe = INSTALL_RECIPES.get(pkg)
-    if recipe is None:
-        # Not in our registry — best-effort: just probe PATH.
-        return shutil.which(pkg)
-
-    strategy = recipe.get("strategy", "manual")
-    bin_name = recipe.get("bin", pkg)
-
-    # Check if already present (staging dir first, then PATH)
-    existing = _existing_binary(bin_name)
-    if existing:
-        return existing
-
-    if strategy == "manual":
-        logger.debug("[install] %s requires manual install (recipe=%s)", pkg, recipe)
-        return None
-
-    if strategy == "npm":
-        return _install_npm(
-            recipe.get("pkg", pkg),
-            bin_name,
-            extra_pkgs=recipe.get("extra_pkgs") or [],
-        )
-    if strategy == "go":
-        return _install_go(recipe.get("pkg", pkg), bin_name)
-    if strategy == "pip":
-        return _install_pip(recipe.get("pkg", pkg), bin_name)
-
-    logger.warning("[install] unknown strategy %r for %s", strategy, pkg)
-    return None
-
-
-def _install_npm(
-    pkg: str,
-    bin_name: str,
-    extra_pkgs: Optional[list] = None,
-) -> Optional[str]:
-    """Install an npm package into our staging dir.
-
-    Uses ``npm install --prefix`` so the binaries land in
-    ``<staging>/node_modules/.bin/<bin_name>`` and we symlink them up
-    one level for direct PATH-style access.
-
-    ``extra_pkgs`` is a list of sibling packages to install in the
-    same ``node_modules`` tree.  Used for LSP servers with runtime
-    peer deps that npm doesn't auto-pull (typescript-language-server
-    needs ``typescript`` next to it; intelephense ships standalone).
-    """
-    npm = shutil.which("npm")
-    if npm is None:
-        logger.info("[install] cannot install %s: npm not on PATH", pkg)
-        return None
-    staging = hermes_lsp_bin_dir().parent  # <HERMES_HOME>/lsp/
-    install_targets = [pkg] + list(extra_pkgs or [])
-    try:
-        logger.info(
-            "[install] npm install --prefix %s %s",
-            staging,
-            " ".join(install_targets),
-        )
-        proc = subprocess.run(
-            [npm, "install", "--prefix", str(staging), "--silent", "--no-fund", "--no-audit", *install_targets],
-            check=False,
-            capture_output=True,
-            text=True,
-            timeout=300,
-        )
-        if proc.returncode != 0:
-            logger.warning(
-                "[install] npm install failed for %s: %s", pkg, proc.stderr.strip()[:500]
-            )
-            return None
-    except (subprocess.TimeoutExpired, OSError) as e:
-        logger.warning("[install] npm install errored for %s: %s", pkg, e)
-        return None
-
-    # Find the bin
-    nm_bin = staging / "node_modules" / ".bin" / bin_name
-    if os.name == "nt":
-        # On Windows npm sometimes drops `.cmd` shims
-        candidates = [nm_bin, nm_bin.with_suffix(".cmd")]
-    else:
-        candidates = [nm_bin]
-    for c in candidates:
-        if c.exists():
-            # Symlink into our `lsp/bin/` for stable PATH access.
-            link = hermes_lsp_bin_dir() / c.name
-            if not link.exists():
-                try:
-                    link.symlink_to(c)
-                except (OSError, NotImplementedError):
-                    # Symlinks fail on some Windows setups — copy instead.
-                    try:
-                        shutil.copy2(c, link)
-                    except OSError:
-                        return str(c)
-            return str(link if link.exists() else c)
-    logger.warning("[install] npm install for %s succeeded but bin %s not found", pkg, bin_name)
-    return None
-
-
-def _install_go(pkg: str, bin_name: str) -> Optional[str]:
-    """Install a Go module to GOBIN=<staging>."""
-    go = shutil.which("go")
-    if go is None:
-        logger.info("[install] cannot install %s: go not on PATH", pkg)
-        return None
-    staging = hermes_lsp_bin_dir()
-    env = dict(os.environ)
-    env["GOBIN"] = str(staging)
-    try:
-        logger.info("[install] go install %s (GOBIN=%s)", pkg, staging)
-        proc = subprocess.run(
-            [go, "install", pkg],
-            check=False,
-            capture_output=True,
-            text=True,
-            timeout=600,
-            env=env,
-        )
-        if proc.returncode != 0:
-            logger.warning(
-                "[install] go install failed for %s: %s", pkg, proc.stderr.strip()[:500]
-            )
-            return None
-    except (subprocess.TimeoutExpired, OSError) as e:
-        logger.warning("[install] go install errored for %s: %s", pkg, e)
-        return None
-    bin_path = staging / bin_name
-    if os.name == "nt":
-        bin_path = bin_path.with_suffix(".exe")
-    if bin_path.exists():
-        return str(bin_path)
-    logger.warning("[install] go install for %s succeeded but bin %s not found", pkg, bin_name)
-    return None
-
-
-def _install_pip(pkg: str, bin_name: str) -> Optional[str]:
-    """Install a Python package into a hermes-owned target dir.
-
-    We avoid polluting the user's site-packages by using
-    ``pip install --target``.  Bins go into
-    ``<staging>/python-packages/bin/`` which we symlink into
-    ``<staging>/bin``.  Note: this only works for packages that ship a
-    console script.
-    """
-    pip_target = hermes_lsp_bin_dir().parent / "python-packages"
-    pip_target.mkdir(parents=True, exist_ok=True)
-    try:
-        logger.info("[install] pip install --target %s %s", pip_target, pkg)
-        proc = subprocess.run(
-            [sys.executable, "-m", "pip", "install", "--target", str(pip_target), "--quiet", pkg],
-            check=False,
-            capture_output=True,
-            text=True,
-            timeout=300,
-        )
-        if proc.returncode != 0:
-            logger.warning(
-                "[install] pip install failed for %s: %s", pkg, proc.stderr.strip()[:500]
-            )
-            return None
-    except (subprocess.TimeoutExpired, OSError) as e:
-        logger.warning("[install] pip install errored for %s: %s", pkg, e)
-        return None
-    # Look for the script
-    bin_path = pip_target / "bin" / bin_name
-    if bin_path.exists():
-        link = hermes_lsp_bin_dir() / bin_name
-        if not link.exists():
-            try:
-                link.symlink_to(bin_path)
-            except (OSError, NotImplementedError):
-                try:
-                    shutil.copy2(bin_path, link)
-                except OSError:
-                    return str(bin_path)
-        return str(link if link.exists() else bin_path)
-    return None
-
-
-def detect_status(pkg: str) -> str:
-    """Return ``installed``, ``missing``, or ``manual-only`` for a package.
-
-    Used by the ``hermes lsp status`` CLI to give users a quick
-    overview of what's available without spawning anything.
-    """
-    recipe = INSTALL_RECIPES.get(pkg)
-    bin_name = recipe.get("bin", pkg) if recipe else pkg
-    if _existing_binary(bin_name):
-        return "installed"
-    if recipe and recipe.get("strategy") == "manual":
-        return "manual-only"
-    return "missing"
-
-
-__all__ = [
-    "INSTALL_RECIPES",
-    "try_install",
-    "detect_status",
-    "hermes_lsp_bin_dir",
-]
@@ -1,607 +0,0 @@
-"""Service-level orchestration for LSP clients.
-
-The :class:`LSPService` is the bridge between the synchronous
-file_operations layer and the async :class:`agent.lsp.client.LSPClient`.
-
-Design choices:
-
-* A **single asyncio event loop** runs in a background thread.  All
-  client work happens on that loop.  Synchronous callers from
-  ``tools/file_operations.py`` use :meth:`get_diagnostics_sync` to
-  open + wait + drain in one blocking call.
-
-* One client per ``(server_id, workspace_root)`` key.  Lazy spawn:
-  the first request for a key spawns the client; subsequent requests
-  re-use it.
-
-- A **broken-set** records ``(server_id, workspace_root)`` pairs that
-  failed to spawn or initialize.  These are never retried for the
-  life of the service.  Mirrors OpenCode's design.
-
-- A **delta baseline** map keeps "diagnostics-as-of-the-last-snapshot"
-  per file.  ``snapshot_baseline()`` is called BEFORE a write; the
-  next ``get_diagnostics_sync()`` returns only diagnostics that
-  weren't in the baseline.  This is the lift from Claude Code's
-  ``beforeFileEdited`` / ``getNewDiagnostics`` pattern, except wired
-  to the local LSP layer instead of MCP IDE RPC.
-
-The service is **off by default**; call :meth:`is_active` to check
-whether LSP is enabled in config.  Per-file gating lives in
-:meth:`enabled_for`: it returns False when no git workspace can be
-detected, when no enabled server matches the file, or when the
-(server_id, workspace_root) pair is in the broken-set.  In each case
-the file_operations layer falls through to the in-process syntax check.
-"""
-from __future__ import annotations
-
-import asyncio
-import logging
-import os
-import threading
-import time
-from concurrent.futures import Future as ConcurrentFuture
-from typing import Any, Dict, List, Optional, Tuple
-
-from agent.lsp import eventlog
-from agent.lsp.client import (
-    DIAGNOSTICS_DOCUMENT_WAIT,
-    LSPClient,
-    file_uri,
-)
-from agent.lsp.servers import (
-    ServerContext,
-    ServerDef,
-    SpawnSpec,
-    find_server_for_file,
-    language_id_for,
-)
-from agent.lsp.workspace import (
-    clear_cache,
-    is_inside_workspace,
-    resolve_workspace_for_file,
-)
-
-logger = logging.getLogger("agent.lsp.manager")
-
-DEFAULT_IDLE_TIMEOUT = 600  # seconds; servers idle for >10min get reaped
-
-
-class _BackgroundLoop:
-    """A daemon thread that owns one asyncio event loop.
-
-    Provides :meth:`run` for synchronous callers — submits a coroutine
-    to the loop and blocks until it finishes (or a timeout fires).
-    """
-
-    def __init__(self) -> None:
-        self._loop: Optional[asyncio.AbstractEventLoop] = None
-        self._thread: Optional[threading.Thread] = None
-        self._ready = threading.Event()
-
-    def start(self) -> None:
-        if self._thread is not None:
-            return
-        self._thread = threading.Thread(
-            target=self._run_forever,
-            name="hermes-lsp-loop",
-            daemon=True,
-        )
-        self._thread.start()
-        self._ready.wait(timeout=5.0)
-
-    def _run_forever(self) -> None:
-        loop = asyncio.new_event_loop()
-        self._loop = loop
-        asyncio.set_event_loop(loop)
-        self._ready.set()
-        try:
-            loop.run_forever()
-        finally:
-            try:
-                loop.close()
-            except Exception:  # noqa: BLE001
-                pass
-
-    def run(self, coro, *, timeout: Optional[float] = None) -> Any:
-        """Submit a coroutine to the loop and block until done.
-
-        Returns the coroutine's result, or raises its exception.
-        """
-        if self._loop is None:
-            raise RuntimeError("background loop not started")
-        fut: ConcurrentFuture = asyncio.run_coroutine_threadsafe(coro, self._loop)
-        try:
-            return fut.result(timeout=timeout)
-        except Exception:
-            fut.cancel()
-            raise
-
-    def stop(self) -> None:
-        loop = self._loop
-        if loop is None:
-            return
-        try:
-            loop.call_soon_threadsafe(loop.stop)
-        except RuntimeError:
-            pass
-        if self._thread is not None:
-            self._thread.join(timeout=2.0)
-        self._loop = None
-        self._thread = None
-
-
-class LSPService:
-    """The process-wide LSP service.
-
-    Created once via :meth:`create_from_config`; the
-    :func:`agent.lsp.get_service` accessor manages the singleton.
-    Most callers should use that accessor rather than constructing
-    :class:`LSPService` directly.
-    """
-
-    # ------------------------------------------------------------------
-    # construction + factory
-    # ------------------------------------------------------------------
-
-    def __init__(
-        self,
-        *,
-        enabled: bool,
-        wait_mode: str,
-        wait_timeout: float,
-        install_strategy: str,
-        binary_overrides: Optional[Dict[str, List[str]]] = None,
-        env_overrides: Optional[Dict[str, Dict[str, str]]] = None,
-        init_overrides: Optional[Dict[str, Dict[str, Any]]] = None,
-        disabled_servers: Optional[List[str]] = None,
-        idle_timeout: float = DEFAULT_IDLE_TIMEOUT,
-    ) -> None:
-        self._enabled = enabled
-        self._wait_mode = wait_mode if wait_mode in ("document", "full") else "document"
-        self._wait_timeout = wait_timeout
-        self._install_strategy = install_strategy
-        self._binary_overrides = binary_overrides or {}
-        self._env_overrides = env_overrides or {}
-        self._init_overrides = init_overrides or {}
-        self._disabled_servers = set(disabled_servers or [])
-        self._idle_timeout = idle_timeout
-
-        self._loop = _BackgroundLoop()
-        if self._enabled:
-            self._loop.start()
-
-        # Per-(server_id, workspace_root) state
-        self._clients: Dict[Tuple[str, str], LSPClient] = {}
-        self._broken: set = set()
-        self._spawning: Dict[Tuple[str, str], asyncio.Future] = {}
-        self._last_used: Dict[Tuple[str, str], float] = {}
-        self._state_lock = threading.Lock()
-
-        # Delta baseline: file path → snapshot of diagnostics taken
-        # immediately before a write.  ``get_diagnostics_sync`` filters
-        # out anything in the baseline so the agent only sees errors
-        # introduced by the current edit.
-        self._delta_baseline: Dict[str, List[Dict[str, Any]]] = {}
-
-    @classmethod
-    def create_from_config(cls) -> Optional["LSPService"]:
-        """Build a service from ``hermes_cli.config`` settings.
-
-        Returns ``None`` if the config can't be loaded.  When LSP is
-        disabled the service is still built; ``is_active()`` returns False.
-        """
-        try:
-            from hermes_cli.config import load_config
-            cfg = load_config()
-        except Exception as e:  # noqa: BLE001
-            logger.debug("LSP config load failed: %s", e)
-            return None
-
-        lsp_cfg = (cfg.get("lsp") or {}) if isinstance(cfg, dict) else {}
-        if not isinstance(lsp_cfg, dict):
-            lsp_cfg = {}
-
-        enabled = bool(lsp_cfg.get("enabled", True))
-        wait_mode = lsp_cfg.get("wait_mode", "document")
-        wait_timeout = float(lsp_cfg.get("wait_timeout", DIAGNOSTICS_DOCUMENT_WAIT))
-        install_strategy = lsp_cfg.get("install_strategy", "auto")
-        servers_cfg = lsp_cfg.get("servers") or {}
-        disabled = []
-        binary_overrides: Dict[str, List[str]] = {}
-        env_overrides: Dict[str, Dict[str, str]] = {}
-        init_overrides: Dict[str, Dict[str, Any]] = {}
-        if isinstance(servers_cfg, dict):
-            for name, sub in servers_cfg.items():
-                if not isinstance(sub, dict):
-                    continue
-                if sub.get("disabled"):
-                    disabled.append(name)
-                cmd = sub.get("command")
-                if isinstance(cmd, list) and cmd:
-                    binary_overrides[name] = cmd
-                env = sub.get("env")
-                if isinstance(env, dict):
-                    env_overrides[name] = {k: str(v) for k, v in env.items()}
-                init = sub.get("initialization_options")
-                if isinstance(init, dict):
-                    init_overrides[name] = init
-
-        return cls(
-            enabled=enabled,
-            wait_mode=wait_mode,
-            wait_timeout=wait_timeout,
-            install_strategy=install_strategy,
-            binary_overrides=binary_overrides,
-            env_overrides=env_overrides,
-            init_overrides=init_overrides,
-            disabled_servers=disabled,
-        )
-
-    # ------------------------------------------------------------------
-    # public API
-    # ------------------------------------------------------------------
-
-    def is_active(self) -> bool:
-        """Return True iff this service should be consulted at all."""
-        return self._enabled
-
-    def enabled_for(self, file_path: str) -> bool:
-        """Return True iff LSP should run for this specific file.
-
-        Gates on workspace detection (file or cwd inside a git worktree),
-        on whether any registered server matches the extension, and
-        on whether the (server_id, workspace_root) pair is in the
-        broken-set from a previous spawn failure.
-
-        Files in already-broken pairs return False so the file_operations
-        layer skips the LSP path entirely — no spawn attempts, no
-        timeout cost — until the service is restarted (``hermes lsp
-        restart``) or the process exits.
-        """
-        if not self._enabled:
-            return False
-        srv = find_server_for_file(file_path)
-        if srv is None or srv.server_id in self._disabled_servers:
-            return False
-        ws_root, gated_in = resolve_workspace_for_file(file_path)
-        if not (ws_root and gated_in):
-            return False
-        # Broken-set short-circuit.  Use the per-server root if we can
-        # compute one cheaply; otherwise fall back to the workspace
-        # root as the broken key (which is what _get_or_spawn would
-        # have used anyway when it failed).
-        try:
-            per_server_root = srv.resolve_root(file_path, ws_root) or ws_root
-        except Exception:  # noqa: BLE001
-            per_server_root = ws_root
-        if (srv.server_id, per_server_root) in self._broken:
-            return False
-        return True
-
-    def snapshot_baseline(self, file_path: str) -> None:
-        """Snapshot current diagnostics for ``file_path`` as the delta baseline.
-
-        Called BEFORE a write so the next ``get_diagnostics_sync()``
-        can filter out pre-existing errors.  Best-effort — failures
-        are silently swallowed so a flaky server can't break a write.
-
-        Outer timeouts (e.g. server hangs during initialize) mark the
-        (server_id, workspace_root) pair as broken so subsequent edits
-        skip it instantly instead of re-paying the timeout cost.
-        """
-        if not self.enabled_for(file_path):
-            return
-        try:
-            diags = self._loop.run(self._snapshot_async(file_path), timeout=8.0)
-            self._delta_baseline[os.path.abspath(file_path)] = diags or []
-        except Exception as e:  # noqa: BLE001
-            logger.debug("baseline snapshot failed for %s: %s", file_path, e)
-            self._mark_broken_for_file(file_path, e)
-            self._delta_baseline[os.path.abspath(file_path)] = []
-
-    def get_diagnostics_sync(
-        self,
-        file_path: str,
-        *,
-        delta: bool = True,
-        timeout: Optional[float] = None,
-    ) -> List[Dict[str, Any]]:
-        """Synchronously open ``file_path`` in the right server, wait for
-        diagnostics, return them.
-
-        If ``delta`` is True (default), the result is filtered against
-        any baseline previously captured via :meth:`snapshot_baseline`.
-        Diagnostics present in the baseline are removed so the caller
-        only sees errors introduced by the current edit.
-
-        Returns an empty list when LSP is disabled, when no workspace
-        can be detected, when no server matches, or when the server
-        can't be spawned.  Never raises.
-        """
-        if not self.enabled_for(file_path):
-            return []
-
-        # Resolve server_id eagerly so we can emit structured logs even
-        # when the request errors out below.
-        srv = find_server_for_file(file_path)
-        server_id = srv.server_id if srv else "?"
-
-        try:
-            t = timeout if timeout is not None else self._wait_timeout + 2.0
-            diags = self._loop.run(self._open_and_wait_async(file_path), timeout=t) or []
-        except asyncio.TimeoutError as e:
-            eventlog.log_timeout(server_id, file_path)
-            logger.debug("LSP diagnostics timeout for %s: %s", file_path, e)
-            self._mark_broken_for_file(file_path, e)
-            return []
-        except Exception as e:  # noqa: BLE001
-            eventlog.log_server_error(server_id, file_path, e)
-            logger.debug("LSP diagnostics fetch failed for %s: %s", file_path, e)
-            self._mark_broken_for_file(file_path, e)
-            return []
-
-        abs_path = os.path.abspath(file_path)
-        if delta:
-            baseline = self._delta_baseline.get(abs_path) or []
-            if baseline:
-                seen = {_diag_key(d) for d in baseline}
-                diags = [d for d in diags if _diag_key(d) not in seen]
-            # Roll baseline forward — next call returns deltas relative
-            # to the just-emitted state, mirroring claude-code's
-            # diagnosticTracking.
-            try:
-                fresh = self._loop.run(self._current_diags_async(file_path), timeout=2.0) or []
-            except Exception:  # noqa: BLE001
-                fresh = []
-            if fresh:
-                self._delta_baseline[abs_path] = fresh
-
-        if diags:
-            eventlog.log_diagnostics(server_id, file_path, len(diags))
-        else:
-            eventlog.log_clean(server_id, file_path)
-        return diags
-
-    def _mark_broken_for_file(self, file_path: str, exc: BaseException) -> None:
-        """Mark the (server_id, workspace_root) pair as broken so subsequent
-        edits skip it instantly instead of re-paying timeout cost.
-
-        Called when the outer ``_loop.run`` timeout cancels an in-flight
-        spawn/initialize that the inner ``_get_or_spawn`` task was still
-        holding open.  Without this, every subsequent write would re-enter
-        the spawn path and re-pay the full ``snapshot_baseline``
-        timeout (8s) until the binary is fixed.
-
-        Also kills any orphan client process that survived the cancelled
-        future, and emits a single eventlog WARNING so the user knows
-        which server gave up.
-
-        ``exc`` is whatever exception the outer wrapper caught — used
-        only for logging, never re-raised.
-        """
-        srv = find_server_for_file(file_path)
-        if srv is None:
-            return
-        ws_root, gated = resolve_workspace_for_file(file_path)
-        if not (ws_root and gated):
-            return
-        try:
-            per_server_root = srv.resolve_root(file_path, ws_root) or ws_root
-        except Exception:  # noqa: BLE001
-            per_server_root = ws_root
-        key = (srv.server_id, per_server_root)
-        already_broken = key in self._broken
-        self._broken.add(key)
-
-        # Kill any client we managed to spawn before the timeout.  The
-        # cancelled future never reached the broken-set add inside
-        # ``_get_or_spawn`` so the client may still be hanging in
-        # ``_clients`` with a half-initialized state.
-        with self._state_lock:
-            client = self._clients.pop(key, None)
-        if client is not None:
-            try:
-                # Best-effort shutdown: block for at most one second while
-                # it cleans up.  We're already on a slow path.
-                self._loop.run(client.shutdown(), timeout=1.0)
-            except Exception:  # noqa: BLE001
-                pass
-
-        if not already_broken:
-            eventlog.log_spawn_failed(srv.server_id, per_server_root, exc)
-
-    def shutdown(self) -> None:
-        """Tear down all clients and stop the background loop."""
-        if not self._enabled:
-            return
-        try:
-            self._loop.run(self._shutdown_async(), timeout=10.0)
-        except Exception as e:  # noqa: BLE001
-            logger.debug("LSP shutdown error: %s", e)
-        self._loop.stop()
-        clear_cache()
-
-    # ------------------------------------------------------------------
-    # async internals
-    # ------------------------------------------------------------------
-
-    async def _snapshot_async(self, file_path: str) -> List[Dict[str, Any]]:
-        client = await self._get_or_spawn(file_path)
-        if client is None:
-            return []
-        try:
-            version = await client.open_file(file_path, language_id=language_id_for(file_path))
-            await client.wait_for_diagnostics(file_path, version, mode=self._wait_mode)
-        except Exception as e:  # noqa: BLE001
-            logger.debug("snapshot open/wait failed: %s", e)
-            return []
-        self._last_used[(client.server_id, client.workspace_root)] = time.time()
-        return list(client.diagnostics_for(file_path))
-
-    async def _open_and_wait_async(self, file_path: str) -> List[Dict[str, Any]]:
-        client = await self._get_or_spawn(file_path)
-        if client is None:
-            return []
-        try:
-            version = await client.open_file(file_path, language_id=language_id_for(file_path))
-            await client.save_file(file_path)
-            await client.wait_for_diagnostics(file_path, version, mode=self._wait_mode)
-        except Exception as e:  # noqa: BLE001
-            logger.debug("open/wait failed for %s: %s", file_path, e)
-            return []
-        self._last_used[(client.server_id, client.workspace_root)] = time.time()
-        return list(client.diagnostics_for(file_path))
-
-    async def _current_diags_async(self, file_path: str) -> List[Dict[str, Any]]:
-        ws, gated = resolve_workspace_for_file(file_path)
-        srv = find_server_for_file(file_path)
-        if not (ws and gated and srv):
-            return []
-        with self._state_lock:
-            client = self._clients.get((srv.server_id, srv.resolve_root(file_path, ws) or ws))
-        if client is None:
-            return []
-        return list(client.diagnostics_for(file_path))
-
-    async def _get_or_spawn(self, file_path: str) -> Optional[LSPClient]:
-        srv = find_server_for_file(file_path)
-        if srv is None:
-            return None
-        if srv.server_id in self._disabled_servers:
-            eventlog.log_disabled(srv.server_id, file_path, "disabled in config")
-            return None
-        ws_root, gated = resolve_workspace_for_file(file_path)
-        if not (ws_root and gated):
-            eventlog.log_no_project_root(srv.server_id, file_path)
-            return None
-        per_server_root = srv.resolve_root(file_path, ws_root)
-        if per_server_root is None:
-            eventlog.log_disabled(
-                srv.server_id, file_path, "exclude marker hit (server gated off)"
-            )
-            return None  # exclude marker hit, server gated off
-
-        key = (srv.server_id, per_server_root)
-        if key in self._broken:
-            return None
-        with self._state_lock:
-            client = self._clients.get(key)
-            if client is not None and client.is_running:
-                eventlog.log_active(srv.server_id, per_server_root)
-                return client
-            spawning = self._spawning.get(key)
-        if spawning is not None:
-            try:
-                return await spawning
-            except Exception:  # noqa: BLE001
-                return None
-
-        # Begin spawn
-        loop = asyncio.get_running_loop()
-        spawn_future: asyncio.Future = loop.create_future()
-        with self._state_lock:
-            self._spawning[key] = spawn_future
-        try:
-            ctx = ServerContext(
-                workspace_root=per_server_root,
-                install_strategy=self._install_strategy,
-                binary_overrides=self._binary_overrides,
-                env_overrides=self._env_overrides,
-                init_overrides=self._init_overrides,
-            )
-            spec = srv.build_spawn(per_server_root, ctx)
-            if spec is None:
-                # ``build_spawn`` returns None when the binary can't be
-                # located (auto-install disabled, manual-only server,
-                # or install attempt failed).  Surface this once via
-                # the structured logger so the user can act on it.
-                eventlog.log_server_unavailable(srv.server_id, srv.server_id)
-                self._broken.add(key)
-                spawn_future.set_result(None)
-                return None
-            client = LSPClient(
-                server_id=srv.server_id,
-                workspace_root=spec.workspace_root,
-                command=spec.command,
-                env=spec.env,
-                cwd=spec.cwd,
-                initialization_options=spec.initialization_options,
-                seed_diagnostics_on_first_push=spec.seed_diagnostics_on_first_push or srv.seed_first_push,
-            )
-            try:
-                await client.start()
-            except Exception as e:  # noqa: BLE001
-                eventlog.log_spawn_failed(srv.server_id, per_server_root, e)
-                self._broken.add(key)
-                spawn_future.set_result(None)
-                return None
-            with self._state_lock:
-                self._clients[key] = client
-            self._last_used[key] = time.time()
-            eventlog.log_active(srv.server_id, per_server_root)
-            spawn_future.set_result(client)
-            return client
-        finally:
-            with self._state_lock:
-                self._spawning.pop(key, None)
-
-    async def _shutdown_async(self) -> None:
-        with self._state_lock:
-            clients = list(self._clients.values())
-            self._clients.clear()
-            self._broken.clear()
-            self._last_used.clear()
-        await asyncio.gather(
-            *(c.shutdown() for c in clients),
-            return_exceptions=True,
-        )
-
-    # ------------------------------------------------------------------
-    # status / introspection (used by ``hermes lsp status``)
-    # ------------------------------------------------------------------
-
-    def get_status(self) -> Dict[str, Any]:
-        """Return a snapshot of the service for the CLI status command."""
-        with self._state_lock:
-            clients = [
-                {
-                    "server_id": k[0],
-                    "workspace_root": k[1],
-                    "state": c.state,
-                    "running": c.is_running,
-                }
-                for k, c in self._clients.items()
-            ]
-            broken = list(self._broken)
-        return {
-            "enabled": self._enabled,
-            "wait_mode": self._wait_mode,
-            "wait_timeout": self._wait_timeout,
-            "install_strategy": self._install_strategy,
-            "clients": clients,
-            "broken": broken,
-            "disabled_servers": sorted(self._disabled_servers),
-        }
-
-
-def _diag_key(d: Dict[str, Any]) -> str:
-    """Content equality key used for delta filtering.  Mirrors
-    :func:`agent.lsp.client._diagnostic_key`."""
-    rng = d.get("range") or {}
-    start = rng.get("start") or {}
-    end = rng.get("end") or {}
-    code = d.get("code")
-    if code is not None and not isinstance(code, str):
-        code = str(code)
-    return "\x00".join(
-        [
-            str(d.get("severity") or 1),
-            str(code or ""),
-            str(d.get("source") or ""),
-            str(d.get("message") or "").strip(),
-            f"{start.get('line', 0)}:{start.get('character', 0)}-{end.get('line', 0)}:{end.get('character', 0)}",
-        ]
-    )
-
-
-__all__ = ["LSPService"]
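Note on the delta baseline in the removed ``LSPService``: the filtering step can be sketched on its own, assuming diagnostics are plain LSP-shaped dicts. The names ``diag_key`` and ``new_diagnostics`` and the sample data are hypothetical, and the key is a tuple here rather than the NUL-joined string the module used.

```python
from typing import Any, Dict, List, Tuple


def diag_key(d: Dict[str, Any]) -> Tuple:
    """Content-equality key: severity, code, source, message and range."""
    rng = d.get("range") or {}
    start = rng.get("start") or {}
    end = rng.get("end") or {}
    return (
        d.get("severity") or 1,
        str(d.get("code") or ""),
        str(d.get("source") or ""),
        str(d.get("message") or "").strip(),
        start.get("line", 0),
        start.get("character", 0),
        end.get("line", 0),
        end.get("character", 0),
    )


def new_diagnostics(
    baseline: List[Dict[str, Any]], current: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Return only the diagnostics that were not present in the baseline."""
    seen = {diag_key(d) for d in baseline}
    return [d for d in current if diag_key(d) not in seen]


# Hypothetical data: one error existed before the write, one appeared after it.
old_error = {
    "severity": 1,
    "message": "Unused import 'os'",
    "range": {"start": {"line": 0, "character": 0}, "end": {"line": 0, "character": 9}},
}
new_error = {
    "severity": 1,
    "message": "Undefined name 'foo'",
    "range": {"start": {"line": 9, "character": 4}, "end": {"line": 9, "character": 7}},
}

baseline = [old_error]            # snapshot taken before the write
current = [old_error, new_error]  # what the server reports after the write
fresh = new_diagnostics(baseline, current)
```

Because the key includes the range, a pre-existing error that shifts to a different line after an edit no longer matches its baseline entry and would be reported again under this scheme.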
@@ -1,196 +0,0 @@
-"""Minimal LSP JSON-RPC 2.0 framer over async streams.
-
-LSP wire format:
-
-    Content-Length: <bytes>\\r\\n
-    \\r\\n
-    <utf-8 JSON body>
-
-The body is a JSON-RPC 2.0 envelope: request, response, or notification.
-
-This module replaces what ``vscode-jsonrpc/node`` would do in a
-TypeScript implementation.  We keep it deliberately small — just the
-framer + envelope helpers — so :class:`agent.lsp.client.LSPClient` can
-focus on protocol semantics.
-"""
-from __future__ import annotations
-
-import asyncio
-import json
-import logging
-from typing import Any, Optional, Tuple
-
-logger = logging.getLogger("agent.lsp.protocol")
-
-# LSP error codes we care about.  Full list in
-# https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#errorCodes
-ERROR_CONTENT_MODIFIED = -32801
-ERROR_REQUEST_CANCELLED = -32800
-ERROR_METHOD_NOT_FOUND = -32601
-
-
-class LSPProtocolError(Exception):
-    """Raised when the wire protocol is violated.
-
-    Distinct from :class:`LSPRequestError` which represents a server
-    returning a JSON-RPC error response — that's protocol-conformant.
-    This exception means the framing or envelope itself is broken.
-    """
-
-
-class LSPRequestError(Exception):
-    """Raised when an LSP request returns an error response.
-
-    Carries the JSON-RPC ``code``, ``message``, and optional ``data``.
-    """
-
-    def __init__(self, code: int, message: str, data: Any = None) -> None:
-        super().__init__(f"LSP error {code}: {message}")
-        self.code = code
-        self.message = message
-        self.data = data
-
-
-def encode_message(obj: dict) -> bytes:
-    """Encode a JSON-RPC envelope as a Content-Length framed byte string.
-
-    The body is encoded as compact UTF-8 JSON (no spaces between
-    separators) — matches what ``vscode-jsonrpc`` emits and keeps the
-    Content-Length count exact.
-    """
-    body = json.dumps(obj, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
-    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
-    return header + body
-
-
-async def read_message(reader: asyncio.StreamReader) -> Optional[dict]:
-    """Read one Content-Length framed JSON-RPC message from the stream.
-
-    Returns ``None`` on clean EOF (server closed stdout cleanly between
-    messages — typical shutdown).  Raises :class:`LSPProtocolError` on
-    malformed framing.
-
-    The reader is advanced to just past the JSON body on success.
-    """
-    headers: dict = {}
-    header_bytes = 0
-    while True:
-        try:
-            line = await reader.readuntil(b"\r\n")
-        except asyncio.IncompleteReadError as e:
-            # EOF while reading headers.  If we hadn't started a header
-            # block, treat as clean EOF; otherwise the framing is bad.
-            if not e.partial and not headers:
-                return None
-            raise LSPProtocolError(
-                f"unexpected EOF while reading LSP headers (partial={e.partial!r})"
-            ) from e
-        # Defensive cap against a server streaming headers without ever
-        # emitting CRLF-CRLF.  Caps total header bytes at 8 KiB — a
-        # well-behaved server fits in well under 200 bytes.
-        header_bytes += len(line)
-        if header_bytes > 8192:
-            raise LSPProtocolError(
-                "LSP header block exceeded 8 KiB without terminator"
-            )
-        line = line[:-2]  # strip CRLF
-        if not line:
-            break  # blank line ends header block
-        try:
-            key, _, value = line.decode("ascii").partition(":")
-        except UnicodeDecodeError as e:
-            raise LSPProtocolError(f"non-ASCII LSP header: {line!r}") from e
-        if not key:
-            raise LSPProtocolError(f"malformed LSP header line: {line!r}")
-        headers[key.strip().lower()] = value.strip()
-
-    cl = headers.get("content-length")
-    if cl is None:
-        raise LSPProtocolError(f"LSP message missing Content-Length: {headers!r}")
-    try:
-        n = int(cl)
-    except ValueError as e:
-        raise LSPProtocolError(f"non-integer Content-Length: {cl!r}") from e
-    if n < 0 or n > 64 * 1024 * 1024:  # 64 MiB sanity cap
-        raise LSPProtocolError(f"unreasonable Content-Length: {n}")
-
-    try:
-        body = await reader.readexactly(n)
-    except asyncio.IncompleteReadError as e:
-        raise LSPProtocolError(
-            f"truncated LSP body: expected {n} bytes, got {len(e.partial)}"
-        ) from e
-
-    try:
-        return json.loads(body.decode("utf-8"))
-    except json.JSONDecodeError as e:
-        raise LSPProtocolError(f"invalid JSON in LSP body: {e}") from e
-    except UnicodeDecodeError as e:
-        raise LSPProtocolError(f"non-UTF-8 LSP body: {e}") from e
-
-
-def make_request(req_id: int, method: str, params: Any) -> dict:
-    """Build a JSON-RPC 2.0 request envelope."""
-    msg: dict = {"jsonrpc": "2.0", "id": req_id, "method": method}
-    if params is not None:
-        msg["params"] = params
-    return msg
-
-
-def make_notification(method: str, params: Any) -> dict:
-    """Build a JSON-RPC 2.0 notification envelope (no ``id``)."""
-    msg: dict = {"jsonrpc": "2.0", "method": method}
-    if params is not None:
-        msg["params"] = params
-    return msg
-
-
-def make_response(req_id: Any, result: Any) -> dict:
-    """Build a JSON-RPC 2.0 success response envelope."""
-    return {"jsonrpc": "2.0", "id": req_id, "result": result}
-
-
-def make_error_response(req_id: Any, code: int, message: str, data: Any = None) -> dict:
-    """Build a JSON-RPC 2.0 error response envelope."""
-    err: dict = {"code": code, "message": message}
-    if data is not None:
-        err["data"] = data
-    return {"jsonrpc": "2.0", "id": req_id, "error": err}
-
-
-def classify_message(msg: dict) -> Tuple[str, Any]:
-    """Return ``(kind, key)`` where kind is one of ``request``,
-    ``response``, ``notification``, ``invalid``.
-
-    The key is the request id for request/response, the method name
-    for notifications, and ``None`` for invalid messages.
-    """
-    if not isinstance(msg, dict):
-        return "invalid", None
-    if msg.get("jsonrpc") != "2.0":
-        return "invalid", None
-    has_id = "id" in msg
-    has_method = "method" in msg
-    if has_id and has_method:
-        return "request", msg["id"]
-    if has_id and ("result" in msg or "error" in msg):
-        return "response", msg["id"]
-    if has_method and not has_id:
-        return "notification", msg["method"]
-    return "invalid", None
-
-
-__all__ = [
-    "ERROR_CONTENT_MODIFIED",
-    "ERROR_REQUEST_CANCELLED",
-    "ERROR_METHOD_NOT_FOUND",
-    "LSPProtocolError",
-    "LSPRequestError",
-    "encode_message",
-    "read_message",
-    "make_request",
-    "make_notification",
-    "make_response",
-    "make_error_response",
-    "classify_message",
-]
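Note on the framing in the removed protocol module: a minimal synchronous sketch of the same Content-Length wire format, operating on an in-memory byte string instead of an ``asyncio.StreamReader``. The helper names ``frame`` and ``unframe`` and the sample messages are hypothetical; the removed module's own entry points were ``encode_message`` and ``read_message``.

```python
import json
from typing import Any, Dict, Tuple


def frame(obj: Dict[str, Any]) -> bytes:
    """Serialize ``obj`` as compact UTF-8 JSON behind a Content-Length header."""
    body = json.dumps(obj, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
    # Content-Length counts bytes of the encoded body, not characters.
    return f"Content-Length: {len(body)}\r\n\r\n".encode("ascii") + body


def unframe(buf: bytes) -> Tuple[Dict[str, Any], bytes]:
    """Parse one framed message from ``buf``; return (message, leftover bytes)."""
    head, sep, rest = buf.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("header block is not terminated by CRLF CRLF")
    headers = {}
    for line in head.split(b"\r\n"):
        key, _, value = line.decode("ascii").partition(":")
        headers[key.strip().lower()] = value.strip()
    n = int(headers["content-length"])
    if len(rest) < n:
        raise ValueError(f"truncated body: expected {n} bytes, got {len(rest)}")
    return json.loads(rest[:n].decode("utf-8")), rest[n:]


# Two messages back to back, as they would appear on a server's stdout.
request = {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"note": "héllo"}}
notification = {"jsonrpc": "2.0", "method": "initialized"}
wire = frame(request) + frame(notification)

first, leftover = unframe(wire)
second, remaining = unframe(leftover)
```

The non-ASCII character in the sample is deliberate: it occupies two bytes in UTF-8, so a length computed from the string rather than from the encoded body would be off by one.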
@@ -1,78 +0,0 @@
-"""Format LSP diagnostics for inclusion in tool output.
-
-The model sees a compact, severity-filtered, line-bounded summary of
-diagnostics introduced by the latest edit.  Format matches what
-OpenCode's ``lsp/diagnostic.ts`` and Claude Code's
-``formatDiagnosticsSummary`` produce — ``<diagnostics>`` blocks with
-1-indexed line/column, capped at ``MAX_PER_FILE`` errors.
-"""
-from __future__ import annotations
-
-from typing import Any, Dict, List
-
-# Severity-1 only by default — warnings/info/hints would flood the
-# agent.  Lift this in config under ``lsp.severities`` if needed.
-SEVERITY_NAMES = {1: "ERROR", 2: "WARN", 3: "INFO", 4: "HINT"}
-DEFAULT_SEVERITIES = frozenset({1})  # ERROR only
-
-MAX_PER_FILE = 20
-MAX_TOTAL_CHARS = 4000
-
-
-def format_diagnostic(d: Dict[str, Any]) -> str:
-    """One-line representation of a single diagnostic."""
-    sev = SEVERITY_NAMES.get(d.get("severity") or 1, "ERROR")
-    rng = d.get("range") or {}
-    start = rng.get("start") or {}
-    line = int(start.get("line", 0)) + 1
-    col = int(start.get("character", 0)) + 1
-    msg = str(d.get("message") or "").rstrip()
-    code = d.get("code")
-    code_part = f" [{code}]" if code not in (None, "") else ""
-    source = d.get("source")
-    source_part = f" ({source})" if source else ""
-    return f"{sev} [{line}:{col}] {msg}{code_part}{source_part}"
-
-
-def report_for_file(
-    file_path: str,
-    diagnostics: List[Dict[str, Any]],
-    *,
-    severities: frozenset = DEFAULT_SEVERITIES,
-    max_per_file: int = MAX_PER_FILE,
-) -> str:
-    """Build a ``<diagnostics file=...>`` block for one file.
-
-    Returns an empty string when no diagnostics pass the severity
-    filter, so callers can do ``if block:`` to skip empty cases.
-    """
-    if not diagnostics:
-        return ""
-    filtered = [d for d in diagnostics if (d.get("severity") or 1) in severities]
-    if not filtered:
-        return ""
-    limited = filtered[:max_per_file]
-    extra = len(filtered) - len(limited)
-    lines = [format_diagnostic(d) for d in limited]
-    body = "\n".join(lines)
-    if extra > 0:
-        body += f"\n... and {extra} more"
-    return f"<diagnostics file=\"{file_path}\">\n{body}\n</diagnostics>"
-
-
-def truncate(s: str, *, limit: int = MAX_TOTAL_CHARS) -> str:
-    """Hard-cap a formatted summary string."""
-    if len(s) <= limit:
-        return s
-    marker = "\n…[truncated]"
-    return s[: limit - len(marker)] + marker
-
-
-__all__ = [
-    "SEVERITY_NAMES",
-    "DEFAULT_SEVERITIES",
-    "MAX_PER_FILE",
-    "format_diagnostic",
-    "report_for_file",
-    "truncate",
-]
@@ -1,223 +0,0 @@
-"""Workspace and project-root resolution for LSP.
-
-Two concerns live here:
-
-1. **Workspace gate** — the upper-level "is this directory a project?"
-   check.  Hermes only runs LSP when the cwd (or the file being edited)
-   sits inside a git worktree.  Files outside any git root never
-   trigger LSP, even if a server is configured.  This keeps Telegram
-   gateway users on user-home cwd's from spawning daemons.
-
-2. **NearestRoot** — the per-server project-root walk.  Each language
-   server cares about a different marker (``pyproject.toml`` for
-   Python, ``Cargo.toml`` for Rust, ``go.mod`` for Go, etc.) and
-   wants the directory containing that marker.  ``nearest_root()``
-   walks up from a starting path looking for any of a list of marker
-   files, optionally bailing if an exclude marker shows up first.
-"""
-from __future__ import annotations
-
-import logging
-import os
-from pathlib import Path
-from typing import Iterable, Optional, Tuple
-
-logger = logging.getLogger("agent.lsp.workspace")
-
-# Cache: cwd → (worktree_root, is_git) so repeated calls don't re-stat.
-# Cleared on shutdown.  Keyed by absolute resolved path so symlink
-# folds collapse to one entry.
-_workspace_cache: dict = {}
-
-
-def normalize_path(path: str) -> str:
-    """Normalize a path for use as a stable map key.
-
-    Resolves ``~``, makes absolute, and collapses ``.``/``..``.  We do
-    NOT resolve symlinks here — symlink stability matters for some
-    LSP servers (rust-analyzer cares about Cargo workspace identity)
-    and we want the canonical path the user typed when possible.
-    """
-    return os.path.abspath(os.path.expanduser(path))
-
-
-def find_git_worktree(start: str) -> Optional[str]:
-    """Walk up from ``start`` looking for a ``.git`` entry (file or dir).
-
-    Returns the directory containing ``.git``, or ``None`` if no git
-    root is found before hitting the filesystem root.
-
-    A ``.git`` *file* (not directory) means we're inside a git
-    worktree set up via ``git worktree add`` — both forms count.
-    """
-    try:
-        start_path = Path(normalize_path(start))
-        if start_path.is_file():
-            start_path = start_path.parent
-    except (OSError, RuntimeError, ValueError):
-        # Pathological input (loop in symlinks, encoding error, etc.) —
-        # bail out rather than crash the lint hook.
-        return None
-
-    # Cache check
-    cached = _workspace_cache.get(str(start_path))
-    if cached is not None:
-        root, _is_git = cached
-        return root
-
-    cur = start_path
-    # Defensive cap: the deepest reasonable monorepo is well under 64
-    # levels.  Caps the walk so a pathological cwd or a symlink cycle
-    # we somehow traverse can't keep us looping.
-    for _ in range(64):
-        git_marker = cur / ".git"
-        try:
-            if git_marker.exists():
-                resolved = str(cur)
-                _workspace_cache[str(start_path)] = (resolved, True)
-                return resolved
-        except OSError:
-            # Permission error on a parent dir — bail out cleanly.
-            break
-        parent = cur.parent
-        if parent == cur:
-            break
-        cur = parent
-
-    _workspace_cache[str(start_path)] = (None, False)
-    return None
-
-
-def is_inside_workspace(path: str, workspace_root: str) -> bool:
-    """Return True iff ``path`` is inside (or equal to) ``workspace_root``.
-
-    Uses absolute paths but does not resolve symlinks — a file accessed
-    via a symlink that points outside the workspace still counts as
-    outside.  This is the conservative interpretation; matches LSP
-    behaviour where servers reject didOpen for unrelated files.
-    """
-    p = normalize_path(path)
-    root = normalize_path(workspace_root)
-    if p == root:
-        return True
-    # Use os.path.commonpath to handle case-insensitive filesystems
-    # correctly on macOS/Windows.
-    try:
-        common = os.path.commonpath([p, root])
-    except ValueError:
-        # Different drives on Windows.
-        return False
-    return common == root
-
-
-def nearest_root(
-    start: str,
-    markers: Iterable[str],
-    *,
-    excludes: Optional[Iterable[str]] = None,
-    ceiling: Optional[str] = None,
-) -> Optional[str]:
-    """Walk up from ``start`` looking for any of the given marker files.
-
-    Returns the **directory containing** the first matched marker, or
-    ``None`` if no marker is found before hitting ``ceiling`` (or the
-    filesystem root if no ceiling).
-
-    If ``excludes`` is provided and an exclude marker matches *first*
-    in the upward walk, returns ``None`` — the server is gated off
-    for that file.  Mirrors OpenCode's NearestRoot exclude semantics
-    (e.g. typescript skips deno projects when ``deno.json`` is found
-    before ``package.json``).
-    """
-    start_path = Path(normalize_path(start))
-    try:
-        if start_path.is_file():
-            start_path = start_path.parent
-    except (OSError, RuntimeError, ValueError):
-        return None
-    ceiling_path = Path(normalize_path(ceiling)) if ceiling else None
-
-    markers_list = list(markers)
-    excludes_list = list(excludes) if excludes else []
-
-    cur = start_path
-    # Defensive cap matching ``find_git_worktree``.  Bounded walk
-    # protects against pathological inputs even though the
-    # parent-equality stop normally terminates within ~10 steps.
-    for _ in range(64):
-        # Check excludes first — if an exclude is found at this level,
-        # the server is gated off for this file.
-        for exc in excludes_list:
-            try:
-                if (cur / exc).exists():
-                    return None
-            except OSError:
-                continue
-        # Then check markers.
-        for marker in markers_list:
-            try:
-                if (cur / marker).exists():
-                    return str(cur)
-            except OSError:
-                continue
-        # Stop conditions.
-        if ceiling_path is not None and cur == ceiling_path:
-            return None
-        parent = cur.parent
-        if parent == cur:
-            return None
-        cur = parent
-    return None
-
-
-def resolve_workspace_for_file(
-    file_path: str,
-    *,
-    cwd: Optional[str] = None,
-) -> Tuple[Optional[str], bool]:
-    """Resolve the workspace root for a file.
-
-    Returns ``(workspace_root, gated_in)`` where ``gated_in`` is True
-    iff LSP should run for this file at all.  Currently the gate is
-    "file is inside a git worktree found by walking up from cwd OR
-    from the file itself".
-
-    The cwd path takes precedence — if the agent was launched in a
-    git project, that worktree is the workspace, and any edit inside
-    it (regardless of where the file lives) is in-scope.  If the cwd
-    isn't in a git worktree, we try the file's own location as a
-    fallback.
-
-    Returns ``(None, False)`` when neither path is in a git worktree.
-    """
-    cwd = cwd or os.getcwd()
-    cwd_root = find_git_worktree(cwd)
-    if cwd_root is not None:
-        if is_inside_workspace(file_path, cwd_root):
-            return cwd_root, True
-        # File is outside the cwd's worktree — try the file's own
-        # location as a secondary anchor.  Useful for monorepos where
-        # the user opens an unrelated checkout.
-    file_root = find_git_worktree(file_path)
-    if file_root is not None:
-        return file_root, True
-    return None, False
-
-
-def clear_cache() -> None:
-    """Clear the workspace-resolution cache.
-
-    Called on service shutdown so a subsequent re-init doesn't pick
-    up stale results from a previous session.
-    """
-    _workspace_cache.clear()
-
-
-__all__ = [
-    "find_git_worktree",
-    "is_inside_workspace",
-    "nearest_root",
-    "normalize_path",
-    "resolve_workspace_for_file",
-    "clear_cache",
-]
@@ -10,7 +10,7 @@ import os
 import re
 import time
 from pathlib import Path
-from typing import Any, Dict, List, Optional, Tuple
+from typing import Any, Dict, List, Optional
 from urllib.parse import urlparse

 import requests
@@ -47,7 +47,7 @@ def _resolve_requests_verify() -> bool | str:
 _PROVIDER_PREFIXES: frozenset[str] = frozenset({
    "openrouter", "nous", "openai-codex", "copilot", "copilot-acp",
    "gemini", "ollama-cloud", "zai", "kimi-coding", "kimi-coding-cn", "stepfun", "minimax", "minimax-oauth", "minimax-cn", "anthropic", "deepseek",
-    "opencode-zen", "opencode-go", "ai-gateway", "kilocode", "alibaba", "novita",
+    "opencode-zen", "opencode-go", "ai-gateway", "kilocode", "alibaba",
    "qwen-oauth",
    "xiaomi",
    "arcee",
@@ -66,7 +66,7 @@ _PROVIDER_PREFIXES: frozenset[str] = frozenset({
    "gmi-cloud", "gmicloud",
    "xai", "x-ai", "x.ai", "grok",
    "nvidia", "nim", "nvidia-nim", "nemotron",
-    "qwen-portal", "novita-ai", "novitaai",
+    "qwen-portal",
 })


@@ -104,8 +104,6 @@ def _strip_provider_prefix(model: str) -> str:

 _model_metadata_cache: Dict[str, Dict[str, Any]] = {}
 _model_metadata_cache_time: float = 0
-_novita_metadata_cache: Dict[str, Dict[str, Any]] = {}
-_novita_metadata_cache_time: float = 0
 _MODEL_CACHE_TTL = 3600
 _endpoint_model_metadata_cache: Dict[str, Dict[str, Dict[str, Any]]] = {}
 _endpoint_model_metadata_cache_time: Dict[str, float] = {}
@@ -287,7 +285,6 @@ def grok_supports_reasoning_effort(model: str) -> bool:
 _CONTEXT_LENGTH_KEYS = (
    "context_length",
    "context_window",
-    "context_size",
    "max_context_length",
    "max_position_embeddings",
    "max_model_len",
@@ -364,7 +361,6 @@ _URL_TO_PROVIDER: Dict[str, str] = {
    "api.xiaomimimo.com": "xiaomi",
    "xiaomimimo.com": "xiaomi",
    "api.gmi-serving.com": "gmi",
-    "api.novita.ai": "novita",
    "tokenhub.tencentmaas.com": "tencent-tokenhub",
    "ollama.com": "ollama-cloud",
 }
@@ -561,16 +557,6 @@ def _extract_max_completion_tokens(payload: Dict[str, Any]) -> Optional[int]:


 def _extract_pricing(payload: Dict[str, Any]) -> Dict[str, Any]:
-    novita_input = payload.get("input_token_price_per_m")
-    novita_output = payload.get("output_token_price_per_m")
-    if novita_input is not None or novita_output is not None:
-        pricing: Dict[str, Any] = {}
-        if novita_input is not None:
-            pricing["prompt"] = str(float(novita_input) / 10_000 / 1_000_000)
-        if novita_output is not None:
-            pricing["completion"] = str(float(novita_output) / 10_000 / 1_000_000)
-        return pricing
-
    alias_map = {
        "prompt": ("prompt", "input", "input_cost_per_token", "prompt_token_cost"),
        "completion": ("completion", "output", "output_cost_per_token", "completion_token_cost"),
@@ -1344,66 +1330,27 @@ def _resolve_codex_oauth_context_length(
    return None


-def _resolve_nous_context_length(
-    model: str,
-    base_url: str = "",
-    api_key: str = "",
-) -> Tuple[Optional[int], str]:
-    """Resolve Nous Portal model context length.
+def _resolve_nous_context_length(model: str) -> Optional[int]:
+    """Resolve Nous Portal model context length via OpenRouter metadata.

-    Tries the live Nous inference endpoint first (authoritative), then falls
-    back to OpenRouter metadata with suffix/version matching.
-
-    Nous model IDs are bare after prefix-stripping (e.g. 'qwen3.6-plus',
-    'claude-opus-4-6') while OpenRouter uses prefixed IDs (e.g.
-    'qwen/qwen3.6-plus', 'anthropic/claude-opus-4.6').  Version
-    normalization (dot↔dash) is applied to handle name drifts.
-
-    Returns ``(context_length, source)`` where ``source`` is one of:
-      - ``"portal"``    — live /v1/models response (authoritative)
-      - ``"openrouter"`` — OpenRouter cache fallback (non-authoritative;
-        callers must NOT persist this to the on-disk cache or a single
-        portal blip will freeze the wrong value in forever)
-      - ``""``           — could not resolve
+    Nous model IDs are bare (e.g. 'claude-opus-4-6') while OpenRouter uses
+    prefixed IDs (e.g. 'anthropic/claude-opus-4.6'). Try suffix matching
+    with version normalization (dot↔dash).
    """
-    # Portal first — the Nous /models endpoint is authoritative for what our
-    # infrastructure enforces and may differ from OR (e.g. OR reports 1M for
-    # qwen3.6-plus; the portal correctly says 262144).  Fall back to the OR
-    # catalog only if the portal doesn't list the model.
-    if base_url:
-        portal_ctx = _resolve_endpoint_context_length(model, base_url, api_key=api_key)
-        if portal_ctx is not None:
-            return portal_ctx, "portal"
-
-    metadata = fetch_model_metadata()
-
-    def _safe_ctx(or_id: str, entry: dict) -> Optional[int]:
-        ctx = entry.get("context_length")
-        if ctx is None:
-            return None
-        if ctx <= 32768 and _model_name_suggests_kimi(or_id):
-            logger.info(
-                "Rejecting OpenRouter metadata context=%s for %r "
-                "(Kimi-family underreport, Nous path); falling through to hardcoded defaults",
-                ctx, or_id,
-            )
-            return None
-        return ctx
-
+    metadata = fetch_model_metadata()  # OpenRouter cache
+    # Exact match first
    if model in metadata:
-        ctx = _safe_ctx(model, metadata[model])
-        if ctx is not None:
-            return ctx, "openrouter"
+        return metadata[model].get("context_length")

    normalized = _normalize_model_version(model).lower()

    for or_id, entry in metadata.items():
        bare = or_id.split("/", 1)[1] if "/" in or_id else or_id
        if bare.lower() == model.lower() or _normalize_model_version(bare).lower() == normalized:
-            ctx = _safe_ctx(or_id, entry)
-            if ctx is not None:
-                return ctx, "openrouter"
+            return entry.get("context_length")

+    # Partial prefix match for cases like gemini-3-flash → gemini-3-flash-preview
+    # Require match to be at a word boundary (followed by -, :, ., or end of string)
    model_lower = model.lower()
    for or_id, entry in metadata.items():
        bare = or_id.split("/", 1)[1] if "/" in or_id else or_id
@@ -1411,11 +1358,9 @@ def _resolve_nous_context_length(
            if candidate.startswith(query) and (
                len(candidate) == len(query) or candidate[len(query)] in "-:."
            ):
-                ctx = _safe_ctx(or_id, entry)
-                if ctx is not None:
-                    return ctx, "openrouter"
+                return entry.get("context_length")

-    return None, ""
+    return None


 def get_model_context_length(
@@ -1430,18 +1375,14 @@ def get_model_context_length(

    Resolution order:
    0. Explicit config override (model.context_length or custom_providers per-model)
-    1. Persistent cache (previously discovered via probing).  Nous URLs
-       bypass the cache here so step 5b can always reconcile against
-       the authoritative portal /v1/models response.
+    1. Persistent cache (previously discovered via probing)
    1b. AWS Bedrock static table (must precede custom-endpoint probe)
    2. Active endpoint metadata (/models for explicit custom endpoints)
    3. Local server query (for local endpoints)
    4. Anthropic /v1/models API (API-key users only, not OAuth)
    5. Provider-aware lookups (before generic OpenRouter cache):
       a. Copilot live /models API
-       b. Nous: live /v1/models probe first (authoritative), then OR
-          cache fallback with suffix/version normalisation.  Only
-          portal-derived values are persisted to disk.
+       b. Nous suffix-match via OpenRouter cache
       c. Codex OAuth /models probe
       d. GMI /models endpoint
       e. Ollama native /api/show probe (any base_url, provider-agnostic)
@@ -1496,28 +1437,6 @@ def get_model_context_length(
                    model, base_url, f"{cached:,}",
                )
                _invalidate_cached_context_length(model, base_url)
-            # Invalidate stale 32k cache entries for Kimi-family models.
-            elif cached <= 32768 and _model_name_suggests_kimi(model):
-                logger.info(
-                    "Dropping stale Kimi cache entry %s@%s -> %s (OpenRouter underreport); "
-                    "re-resolving via hardcoded defaults",
-                    model, base_url, f"{cached:,}",
-                )
-                _invalidate_cached_context_length(model, base_url)
-            # Nous Portal: the portal /v1/models endpoint is authoritative.
-            # Bypass the persistent cache so step 5b can always reconcile
-            # against it — this corrects pre-fix entries seeded from the
-            # OR catalog (the same OR underreport class that the Kimi/Qwen
-            # DEFAULT_CONTEXT_LENGTHS overrides exist to mitigate) without
-            # touching the on-disk file when the portal is unreachable.
-            # The in-memory 300s endpoint metadata cache makes the per-call
-            # cost amortise to ~0 within a process.
-            elif _infer_provider_from_url(base_url) == "nous":
-                logger.debug(
-                    "Bypassing persistent cache for %s@%s (Nous portal authoritative)",
-                    model, base_url,
-                )
-                # Fall through; step 5b reconciles and overwrites if portal responds.
            else:
                return cached

@@ -1541,13 +1460,6 @@ def get_model_context_length(
        except ImportError:
            pass  # boto3 not installed — fall through to generic resolution

-    if provider == "novita" or (base_url and base_url_host_matches(base_url, "api.novita.ai")):
-        ctx = _resolve_endpoint_context_length(model, base_url or "https://api.novita.ai/openai/v1", api_key=api_key)
-        if ctx is not None:
-            if base_url:
-                save_context_length(model, base_url, ctx)
-            return ctx
-
    # 2. Active endpoint metadata for truly custom/unknown endpoints.
    # Known providers (Copilot, OpenAI, Anthropic, etc.) skip this — their
    # /models endpoint may report a provider-imposed limit (e.g. Copilot
@@ -1616,18 +1528,8 @@ def get_model_context_length(
            pass  # Fall through to models.dev

    if effective_provider == "nous":
-        ctx, source = _resolve_nous_context_length(
-            model, base_url=base_url or "", api_key=api_key or ""
-        )
+        ctx = _resolve_nous_context_length(model)
        if ctx:
-            # Persist ONLY portal-derived values.  Caching an OR-fallback
-            # value here would freeze in a wrong number on the first portal
-            # blip / auth glitch and step-1 would short-circuit it forever.
-            # OR's catalog is community-maintained and is precisely why the
-            # Kimi/Qwen DEFAULT_CONTEXT_LENGTHS overrides exist — we don't
-            # want it leaking into the persistent cache for Nous URLs.
-            if base_url and source == "portal":
-                save_context_length(model, base_url, ctx)
            return ctx
    if effective_provider == "openai-codex":
        # Codex OAuth enforces lower context limits than the direct OpenAI
@@ -1673,6 +1575,14 @@ def get_model_context_length(
        if model in metadata:
            or_ctx = metadata[model].get("context_length", DEFAULT_FALLBACK_CONTEXT)
            # Guard against stale OpenRouter metadata for Kimi-family models.
+            # OpenRouter reports 32768 for moonshotai/kimi-k2.6, but the model
+            # actually supports 262144 (models.dev + official Kimi docs agree).
+            # Providers that host their own Kimi endpoints (Ollama Cloud, Kimi
+            # Coding, Moonshot) would otherwise trip the 64k minimum-context
+            # guard and reject a perfectly capable model.
+            # The filter is narrow: only reject exactly 32768 for Kimi-named
+            # models.  If OpenRouter ever updates its data, the stale path
+            # becomes dead code with no impact.
            if or_ctx == 32768 and _model_name_suggests_kimi(model):
                logger.info(
                    "Rejecting OpenRouter metadata context=%s for %r "
@@ -141,7 +141,6 @@ class ProviderInfo:
 # Hermes provider names → models.dev provider IDs
 PROVIDER_TO_MODELS_DEV: Dict[str, str] = {
    "openrouter": "openrouter",
-    "novita": "novita-ai",
    "anthropic": "anthropic",
    "openai": "openai",
    "openai-codex": "openai",
@@ -1,64 +0,0 @@
-"""Centralized Nous Portal request tags.
-
-Every Hermes request that hits the Nous Portal — main agent loop, auxiliary
-client (compression / titles / vision / web_extract / session_search / etc.),
-and any future code path — must carry the same product-attribution tags so
-Nous can attribute usage to Hermes Agent and bucket it by client release.
-
-Tag shape (sent in OpenAI-compatible ``extra_body['tags']``):
-
-    [
-        "product=hermes-agent",
-        "client=hermes-client-v<__version__>",
-    ]
-
-The version is sourced live from ``hermes_cli.__version__`` so it auto-aligns
-to whatever release is installed; the release script
-(``scripts/release.py``) regex-bumps that single string, and every Portal
-request picks up the new tag on the next process start.
-
-Why one helper instead of inlining the literal at each site:
-* Four call sites (main loop profile, aux client, run_agent compression
-  fallback, web_tools fallback) used to drift apart — see PR #24194 which
-  only got the aux site, leaving the main loop sending a different tag set.
-* Tests should assert the same tag list everywhere; centralizing makes that
-  assertion a one-liner against this module.
-
-Do NOT pre-compute these as module-level constants in the consumers. The
-version can change at runtime (editable installs, hot-reload tooling), and
-``hermes_cli.__version__`` is the canonical source of truth.
-"""
-
-from __future__ import annotations
-
-from typing import List
-
-
-def _hermes_version() -> str:
-    """Return the current Hermes release version, e.g. ``"0.13.0"``.
-
-    Falls back to ``"unknown"`` if ``hermes_cli`` cannot be imported (should
-    never happen in a real install — guarded for defensive testing).
-    """
-    try:
-        from hermes_cli import __version__
-        return __version__
-    except Exception:
-        return "unknown"
-
-
-def hermes_client_tag() -> str:
-    """Return the ``client=...`` tag for Nous Portal requests.
-
-    Format: ``client=hermes-client-v<MAJOR>.<MINOR>.<PATCH>``.
-    """
-    return f"client=hermes-client-v{_hermes_version()}"
-
-
-def nous_portal_tags() -> List[str]:
-    """Return the canonical list of Nous Portal product tags.
-
-    Always returns a fresh list so callers can mutate it freely
-    (e.g. ``merged_extra.setdefault("tags", []).extend(nous_portal_tags())``).
-    """
-    return ["product=hermes-agent", hermes_client_tag()]
@@ -268,7 +268,7 @@ TOOL_USE_ENFORCEMENT_GUIDANCE = (

 # Model name substrings that trigger tool-use enforcement guidance.
 # Add new patterns here when a model family needs explicit steering.
-TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex", "gemini", "gemma", "grok", "glm")
+TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex", "gemini", "gemma", "grok")

 # OpenAI GPT/Codex-specific execution guidance.  Addresses known failure modes
 # where GPT models abandon work on partial results, skip prerequisite lookups,
@@ -1,15 +1,25 @@
-"""Anthropic prompt caching strategy.
+"""Anthropic prompt caching strategies.

-Single layout: ``system_and_3``. 4 cache_control breakpoints — system
-prompt + last 3 non-system messages, all at the same TTL (5m or 1h).
-Reduces input token costs by ~75% on multi-turn conversations within a
-single session.
+Two layouts:
+
+* ``system_and_3`` (default, used everywhere except the long-lived path):
+  4 cache_control breakpoints — system prompt + last 3 non-system messages.
+  All at the same TTL (5m or 1h). Reduces input token costs by ~75% on
+  multi-turn conversations within a single session.
+
+* ``prefix_and_2`` (Claude on Anthropic / OpenRouter / Nous Portal):
+  4 breakpoints split across two TTL tiers — tools[-1] (1h) +
+  stable system prefix (1h) + last 2 non-system messages (5m). The
+  long-lived prefix is byte-stable across sessions for a given user
+  config, so every fresh session reads the cached system+tools instead
+  of re-paying for them. Within-session rolling window shrinks from 3
+  messages to 2 to free the breakpoint budget.

 Pure functions -- no class state, no AIAgent dependency.
 """

 import copy
-from typing import Any, Dict, List
+from typing import Any, Dict, List, Optional


 def _apply_cache_marker(msg: dict, cache_marker: dict, native_anthropic: bool = False) -> None:
@@ -77,3 +87,115 @@ def apply_anthropic_cache_control(
        _apply_cache_marker(messages[idx], marker, native_anthropic=native_anthropic)

    return messages
+
+
+def _mark_system_stable_block(
+    messages: List[Dict[str, Any]],
+    long_lived_marker: Dict[str, str],
+) -> bool:
+    """Mark the *first* content block of the system message with the 1h marker.
+
+    The system message is expected to have been split into multiple content
+    blocks beforehand by the caller — block[0] is the cross-session-stable
+    prefix, subsequent blocks carry context files + volatile suffix.
+    Falls back to marking the whole system message as a single block when
+    the message hasn't been split (preserves correctness on the fallback path).
+
+    Returns True when a marker was placed.
+    """
+    if not messages or messages[0].get("role") != "system":
+        return False
+
+    sys_msg = messages[0]
+    content = sys_msg.get("content")
+
+    # Already a list of blocks → mark the first block.
+    if isinstance(content, list) and content:
+        first = content[0]
+        if isinstance(first, dict):
+            first["cache_control"] = long_lived_marker
+            return True
+        return False
+
+    # String content (no split) → cannot place a stable-prefix breakpoint
+    # without changing the byte content.  Caller is responsible for
+    # splitting; if they didn't, wrap the whole string in one marked text
+    # block so we still cache *something* for this turn.
+    if isinstance(content, str) and content:
+        sys_msg["content"] = [
+            {"type": "text", "text": content, "cache_control": long_lived_marker}
+        ]
+        return True
+
+    return False
+
+
+def apply_anthropic_cache_control_long_lived(
+    api_messages: List[Dict[str, Any]],
+    long_lived_ttl: str = "1h",
+    rolling_ttl: str = "5m",
+    native_anthropic: bool = False,
+) -> List[Dict[str, Any]]:
+    """Apply prefix_and_2 caching: long-lived stable prefix + rolling window.
+
+    Layout (places up to 3 of the 4 breakpoints; tools[-1] is the 4th):
+      * Stable system prefix (block[0]) → ``long_lived_ttl`` TTL
+      * Last 2 non-system messages → ``rolling_ttl`` TTL each
+
+    NOTE: this function does NOT mark the tools array. Tools cache_control
+    is attached separately (see ``mark_tools_for_long_lived_cache``) because
+    tools live outside the messages list in the API payload.
+
+    The caller MUST have split the system message into ordered content
+    blocks where block[0] is the cross-session-stable portion. If the system
+    message is still a single string, it is wrapped into a single block and
+    marked — this is correct, just less effective (the volatile suffix is
+    not isolated, so the prefix invalidates per-session).
+
+    Returns:
+        Deep copy of messages with cache_control breakpoints injected.
+    """
+    messages = copy.deepcopy(api_messages)
+    if not messages:
+        return messages
+
+    long_marker = _build_marker(long_lived_ttl)
+    rolling_marker = _build_marker(rolling_ttl)
+
+    placed_prefix = _mark_system_stable_block(messages, long_marker)
+
+    # Anthropic allows 4 breakpoints total.  tools[-1] (when marked
+    # separately) takes one and the system prefix (when placed) takes
+    # another, leaving 2 for the rolling tail; with no prefix it gets 3.
+    rolling_budget = 2 if placed_prefix else 3
+    non_sys = [i for i in range(len(messages)) if messages[i].get("role") != "system"]
+    for idx in non_sys[-rolling_budget:]:
+        _apply_cache_marker(messages[idx], rolling_marker, native_anthropic=native_anthropic)
+
+    return messages
+
+
+def mark_tools_for_long_lived_cache(
+    tools: Optional[List[Dict[str, Any]]],
+    long_lived_ttl: str = "1h",
+) -> Optional[List[Dict[str, Any]]]:
+    """Attach cache_control to the last tool in the OpenAI-format tools list.
+
+    Anthropic prefix-cache order is ``tools → system → messages``.  Marking
+    the last tool dict caches the entire tools array (Anthropic's docs:
+    "the marker is placed on the last block you want included in the cached
+    prefix").  Marker is preserved across the OpenAI-wire boundary on
+    OpenRouter and Nous Portal (which proxies to OpenRouter); on native
+    Anthropic the marker is forwarded by ``convert_tools_to_anthropic``.
+
+    Returns a deep copy of the tools list with the marker attached, or the
+    input unchanged when tools is empty/None.  Pure function — does not
+    mutate the input.
+    """
+    if not tools:
+        return tools
+    out = copy.deepcopy(tools)
+    last = out[-1]
+    if isinstance(last, dict):
+        last["cache_control"] = _build_marker(long_lived_ttl)
+    return out
@@ -14,7 +14,6 @@ from dataclasses import dataclass, field
 from typing import Any, Mapping

 from utils import safe_json_loads
-from agent.tool_result_classification import file_mutation_result_landed


 IDEMPOTENT_TOOL_NAMES = frozenset(
@@ -197,8 +196,6 @@ def classify_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str
    """
    if result is None:
        return False, ""
-    if file_mutation_result_landed(tool_name, result):
-        return False, ""

    if tool_name == "terminal":
        data = safe_json_loads(result)
@@ -1,26 +0,0 @@
-"""Shared helpers for classifying tool result payloads."""
-
-from __future__ import annotations
-
-import json
-from typing import Any
-
-
-FILE_MUTATING_TOOL_NAMES = frozenset({"write_file", "patch"})
-
-
-def file_mutation_result_landed(tool_name: str, result: Any) -> bool:
-    """Return True when a file mutation result proves the write landed."""
-    if tool_name not in FILE_MUTATING_TOOL_NAMES or not isinstance(result, str):
-        return False
-    try:
-        data = json.loads(result.strip())
-    except Exception:
-        return False
-    if not isinstance(data, dict) or data.get("error"):
-        return False
-    if tool_name == "write_file":
-        return "bytes_written" in data
-    if tool_name == "patch":
-        return data.get("success") is True
-    return False
@@ -1,368 +0,0 @@
-"""Codex app-server JSON-RPC client.
-
-Speaks the protocol documented in codex-rs/app-server/README.md (codex 0.125+).
-Transport is newline-delimited JSON-RPC 2.0 over stdio: spawn `codex app-server`,
-do an `initialize` handshake, then drive `thread/start` + `turn/start` and
-consume streaming `item/*` notifications until `turn/completed`.
-
-This module is the wire-level speaker only. Higher-level concerns (event
-projection into Hermes' display, approval bridging, transcript projection into
-AIAgent.messages, plugin migration) live in sibling modules.
-
-Status: optional opt-in runtime gated behind `model.openai_runtime ==
-"codex_app_server"`. Hermes' default tool dispatch is unchanged when this
-runtime is not selected.
-"""
-
-from __future__ import annotations
-
-import json
-import os
-import queue
-import subprocess
-import threading
-import time
-from dataclasses import dataclass, field
-from typing import Any, Callable, Optional
-
-# Default minimum codex version we test against. The PR sets this from the
-# `codex --version` parsed at install time; bumping is a one-line change here.
-MIN_CODEX_VERSION = (0, 125, 0)
-
-
-@dataclass
-class CodexAppServerError(RuntimeError):
-    """Raised on JSON-RPC errors from the app-server."""
-
-    code: int
-    message: str
-    data: Optional[Any] = None
-
-    def __str__(self) -> str:  # pragma: no cover - trivial
-        return f"codex app-server error {self.code}: {self.message}"
-
-
-@dataclass
-class _Pending:
-    queue: queue.Queue
-    method: str
-    sent_at: float = field(default_factory=time.time)
-
-
-class CodexAppServerClient:
-    """Minimal JSON-RPC 2.0 client for `codex app-server` over stdio.
-
-    Threading model:
-      - Spawning thread (caller) drives request/response pairs synchronously.
-      - One reader thread parses stdout, dispatches replies to the right
-        pending future, and routes notifications + server-initiated requests
-        to bounded queues that the caller drains on their own cadence.
-      - One reader thread captures stderr for diagnostics; codex emits
-        tracing logs there at RUST_LOG-controlled levels.
-
-    Intentionally NOT async. AIAgent.run_conversation() is synchronous and
-    runs on the main thread; layering asyncio just to drive a stdio child
-    creates surprising interrupt semantics. We use blocking queues with
-    timeouts and rely on `turn/interrupt` for cancellation.
-    """
-
-    def __init__(
-        self,
-        codex_bin: str = "codex",
-        codex_home: Optional[str] = None,
-        extra_args: Optional[list[str]] = None,
-        env: Optional[dict[str, str]] = None,
-    ) -> None:
-        self._codex_bin = codex_bin
-        cmd = [codex_bin, "app-server"] + list(extra_args or [])
-        spawn_env = os.environ.copy()
-        if env:
-            spawn_env.update(env)
-        if codex_home:
-            spawn_env["CODEX_HOME"] = codex_home
-        # Codex emits tracing to stderr; default WARN keeps it quiet for users.
-        spawn_env.setdefault("RUST_LOG", "warn")
-
-        self._proc = subprocess.Popen(
-            cmd,
-            stdin=subprocess.PIPE,
-            stdout=subprocess.PIPE,
-            stderr=subprocess.PIPE,
-            bufsize=0,
-            env=spawn_env,
-        )
-        self._next_id = 1
-        self._pending: dict[int, _Pending] = {}
-        self._pending_lock = threading.Lock()
-        self._notifications: queue.Queue = queue.Queue()
-        self._server_requests: queue.Queue = queue.Queue()
-        self._stderr_lines: list[str] = []
-        self._stderr_lock = threading.Lock()
-        self._closed = False
-        self._initialized = False
-
-        self._reader = threading.Thread(target=self._read_stdout, daemon=True)
-        self._reader.start()
-        self._stderr_reader = threading.Thread(target=self._read_stderr, daemon=True)
-        self._stderr_reader.start()
-
-    # ---------- lifecycle ----------
-
-    def initialize(
-        self,
-        client_name: str = "hermes",
-        client_title: str = "Hermes Agent",
-        client_version: str = "0.1",
-        capabilities: Optional[dict] = None,
-        timeout: float = 10.0,
-    ) -> dict:
-        """Send `initialize` + `initialized` handshake. Returns the server's
-        InitializeResponse (userAgent, codexHome, platformFamily, platformOs)."""
-        if self._initialized:
-            raise RuntimeError("already initialized")
-        params = {
-            "clientInfo": {
-                "name": client_name,
-                "title": client_title,
-                "version": client_version,
-            },
-            "capabilities": capabilities or {},
-        }
-        result = self.request("initialize", params, timeout=timeout)
-        self.notify("initialized")
-        self._initialized = True
-        return result
-
-    def close(self, timeout: float = 3.0) -> None:
-        """Close stdin and wait for the subprocess to exit, escalating to kill."""
-        if self._closed:
-            return
-        self._closed = True
-        try:
-            if self._proc.stdin and not self._proc.stdin.closed:
-                self._proc.stdin.close()
-        except Exception:
-            pass
-        try:
-            self._proc.terminate()
-            self._proc.wait(timeout=timeout)
-        except subprocess.TimeoutExpired:
-            try:
-                self._proc.kill()
-                self._proc.wait(timeout=1.0)
-            except Exception:
-                pass
-
-    def __enter__(self) -> "CodexAppServerClient":
-        return self
-
-    def __exit__(self, *exc: Any) -> None:
-        self.close()
-
-    # ---------- send/receive ----------
-
-    def request(
-        self,
-        method: str,
-        params: Optional[dict] = None,
-        timeout: float = 30.0,
-    ) -> dict:
-        """Send a JSON-RPC request and block on the response. Returns `result`,
-        raises CodexAppServerError on `error`."""
-        rid = self._take_id()
-        q: queue.Queue = queue.Queue(maxsize=1)
-        with self._pending_lock:
-            self._pending[rid] = _Pending(queue=q, method=method)
-        self._send({"id": rid, "method": method, "params": params or {}})
-        try:
-            msg = q.get(timeout=timeout)
-        except queue.Empty:
-            with self._pending_lock:
-                self._pending.pop(rid, None)
-            raise TimeoutError(
-                f"codex app-server method {method!r} timed out after {timeout}s"
-            )
-        if "error" in msg:
-            err = msg["error"]
-            raise CodexAppServerError(
-                code=err.get("code", -1),
-                message=err.get("message", ""),
-                data=err.get("data"),
-            )
-        return msg.get("result", {})
-
-    def notify(self, method: str, params: Optional[dict] = None) -> None:
-        """Send a JSON-RPC notification (no id, no response expected)."""
-        self._send({"method": method, "params": params or {}})
-
-    def respond(self, request_id: Any, result: dict) -> None:
-        """Reply to a server-initiated request (e.g. approval prompts)."""
-        self._send({"id": request_id, "result": result})
-
-    def respond_error(
-        self, request_id: Any, code: int, message: str, data: Optional[Any] = None
-    ) -> None:
-        """Reply to a server-initiated request with an error."""
-        err: dict[str, Any] = {"code": code, "message": message}
-        if data is not None:
-            err["data"] = data
-        self._send({"id": request_id, "error": err})
-
-    def take_notification(self, timeout: float = 0.0) -> Optional[dict]:
-        """Pop the next streaming notification, or return None on timeout.
-
-        timeout=0.0 means non-blocking. Use small positive timeouts inside the
-        AIAgent turn loop to interleave reads with interrupt checks."""
-        try:
-            if timeout <= 0:
-                return self._notifications.get_nowait()
-            return self._notifications.get(timeout=timeout)
-        except queue.Empty:
-            return None
-
-    def take_server_request(self, timeout: float = 0.0) -> Optional[dict]:
-        """Pop the next server-initiated request (e.g. exec/applyPatch approval)."""
-        try:
-            if timeout <= 0:
-                return self._server_requests.get_nowait()
-            return self._server_requests.get(timeout=timeout)
-        except queue.Empty:
-            return None
-
-    # ---------- diagnostics ----------
-
-    def stderr_tail(self, n: int = 20) -> list[str]:
-        """Return last n lines of codex's stderr (for error reports)."""
-        with self._stderr_lock:
-            return list(self._stderr_lines[-n:])
-
-    def is_alive(self) -> bool:
-        return self._proc.poll() is None
-
-    # ---------- internals ----------
-
-    def _take_id(self) -> int:
-        # JSON-RPC ids only need to be unique per-connection. A simple
-        # monotonically increasing int is the common choice and matches what
-        # codex's own clients use.
-        rid = self._next_id
-        self._next_id += 1
-        return rid
-
-    def _send(self, obj: dict) -> None:
-        if self._closed:
-            raise RuntimeError("codex app-server client is closed")
-        if self._proc.stdin is None:
-            raise RuntimeError("codex app-server stdin not available")
-        try:
-            self._proc.stdin.write((json.dumps(obj) + "\n").encode("utf-8"))
-            self._proc.stdin.flush()
-        except (BrokenPipeError, ValueError) as exc:
-            raise RuntimeError(
-                f"codex app-server stdin closed unexpectedly: {exc}"
-            ) from exc
-
-    def _read_stdout(self) -> None:
-        if self._proc.stdout is None:
-            return
-        try:
-            for line in iter(self._proc.stdout.readline, b""):
-                if not line:
-                    break
-                line = line.strip()
-                if not line:
-                    continue
-                try:
-                    msg = json.loads(line)
-                except json.JSONDecodeError:
-                    # Non-JSON output is unexpected on stdout; tracing belongs
-                    # on stderr. Surface it via stderr buffer for diagnostics.
-                    with self._stderr_lock:
-                        self._stderr_lines.append(
-                            f"<non-json on stdout> {line[:200]!r}"
-                        )
-                    continue
-                self._dispatch(msg)
-        except Exception as exc:
-            with self._stderr_lock:
-                self._stderr_lines.append(f"<stdout reader error> {exc}")
-
-    def _dispatch(self, msg: dict) -> None:
-        # Reply (has id + result/error, no method)
-        if "id" in msg and ("result" in msg or "error" in msg):
-            with self._pending_lock:
-                pending = self._pending.pop(msg["id"], None)
-            if pending is not None:
-                try:
-                    pending.queue.put_nowait(msg)
-                except queue.Full:  # pragma: no cover - defensive
-                    pass
-            return
-        # Server-initiated request (has id + method)
-        if "id" in msg and "method" in msg:
-            self._server_requests.put(msg)
-            return
-        # Notification (no id)
-        if "method" in msg:
-            self._notifications.put(msg)
-
-    def _read_stderr(self) -> None:
-        if self._proc.stderr is None:
-            return
-        try:
-            for line in iter(self._proc.stderr.readline, b""):
-                if not line:
-                    break
-                with self._stderr_lock:
-                    self._stderr_lines.append(
-                        line.decode("utf-8", "replace").rstrip()
-                    )
-                    # Bound memory: keep last 500 lines.
-                    if len(self._stderr_lines) > 500:
-                        self._stderr_lines = self._stderr_lines[-500:]
-        except Exception:  # pragma: no cover
-            pass
-
-
-def parse_codex_version(output: str) -> Optional[tuple[int, int, int]]:
-    """Parse `codex --version` output. Returns (major, minor, patch) or None."""
-    # Output format: "codex-cli 0.130.0" possibly followed by metadata.
-    import re
-
-    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output or "")
-    if not match:
-        return None
-    return (int(match.group(1)), int(match.group(2)), int(match.group(3)))
-
-
-def check_codex_binary(
-    codex_bin: str = "codex", min_version: tuple[int, int, int] = MIN_CODEX_VERSION
-) -> tuple[bool, str]:
-    """Verify codex CLI is installed and meets minimum version.
-
-    Returns (ok, message). Used by setup wizard and runtime startup."""
-    try:
-        proc = subprocess.run(
-            [codex_bin, "--version"],
-            capture_output=True,
-            text=True,
-            timeout=10,
-        )
-    except FileNotFoundError:
-        return False, (
-            f"codex CLI not found at {codex_bin!r}. Install with: "
-            f"npm i -g @openai/codex"
-        )
-    except subprocess.TimeoutExpired:
-        return False, "codex --version timed out"
-    if proc.returncode != 0:
-        return False, f"codex --version exited {proc.returncode}: {proc.stderr.strip()}"
-    version = parse_codex_version(proc.stdout)
-    if version is None:
-        return False, f"could not parse codex version from: {proc.stdout!r}"
-    if version < min_version:
-        return False, (
-            f"codex {'.'.join(map(str, version))} is older than required "
-            f"{'.'.join(map(str, min_version))}. Run: npm i -g @openai/codex"
-        )
-    return True, ".".join(map(str, version))
@@ -1,810 +0,0 @@
-"""Session adapter for codex app-server runtime.
-
-Owns one Codex thread per Hermes session. Drives `turn/start`, consumes
-streaming notifications via CodexEventProjector, handles server-initiated
-approval requests (apply_patch, exec command), translates cancellation,
-and returns a clean turn result that AIAgent.run_conversation() can splice
-into its `messages` list.
-
-Lifecycle:
-    session = CodexAppServerSession(cwd="/home/x/proj")
-    session.ensure_started()                              # spawns + handshake + thread/start
-    result = session.run_turn(user_input="hello")         # blocks until turn/completed
-    # result.final_text          → assistant text returned to caller
-    # result.projected_messages  → list of {role, content, ...} for messages list
-    # result.tool_iterations     → how many tool-shaped items completed (skill nudge counter)
-    # result.interrupted         → True if Ctrl+C / interrupt_requested fired mid-turn
-    session.close()                                       # tears down subprocess
-
-Threading model: the adapter is single-threaded from the caller's perspective.
-The underlying CodexAppServerClient owns its own reader threads but exposes
-blocking-with-timeout queues that this adapter polls in a loop, so the run_turn
-call is synchronous and behaves like AIAgent's existing chat_completions loop.
-"""
-
-from __future__ import annotations
-
-import logging
-import os
-import threading
-import time
-from dataclasses import dataclass, field
-from typing import Any, Callable, Optional
-
-from agent.redact import redact_sensitive_text
-from agent.transports.codex_app_server import (
-    CodexAppServerClient,
-    CodexAppServerError,
-)
-from agent.transports.codex_event_projector import CodexEventProjector
-
-logger = logging.getLogger(__name__)
-
-
-# How many tailing stderr lines from the codex subprocess to attach to a
-# user-facing error when we don't have a more specific classification (OAuth,
-# wedge watchdog, etc.). Small enough to keep error messages legible, large
-# enough to surface a config/provider/auth diagnostic.
-_STDERR_TAIL_LINES = 12
-
-
-# Permission profile mapping mirrors the docstring in PR proposal:
-# Hermes' tools.terminal.security_mode → Codex's permissions profile id.
-# Defaults if config is missing → workspace-write (matches Codex's own default).
-_HERMES_TO_CODEX_PERMISSION_PROFILE = {
-    "auto": "workspace-write",
-    "approval-required": "read-only-with-approval",
-    "unrestricted": "full-access",
-    # Backstop alias used by some skills/tests.
-    "yolo": "full-access",
-}
-
-
-@dataclass
-class TurnResult:
-    """Result of one user→assistant→tool turn through the codex app-server."""
-
-    final_text: str = ""
-    projected_messages: list[dict] = field(default_factory=list)
-    tool_iterations: int = 0
-    interrupted: bool = False
-    error: Optional[str] = None  # Set if turn ended in a non-recoverable error
-    turn_id: Optional[str] = None
-    thread_id: Optional[str] = None
-    # Hint to the caller that the underlying codex subprocess is likely
-    # wedged (turn-level timeout fired, post-tool watchdog tripped, or
-    # token-refresh failure killed the child). The caller should retire
-    # the session so the next turn respawns codex from scratch instead
-    # of riding a CPU-spinning or auth-broken process. Mirrors openclaw
-    # beta.8's "retire timed-out app-server clients" fix.
-    should_retire: bool = False
-
-
-# Markers we accept as terminal even when codex never emits turn/completed.
-# Some codex versions stream `<turn_aborted>` as raw text in agentMessage
-# items when an interrupt or upstream error tears the turn down before the
-# normal completion path fires. Mirrors openclaw beta.8 fix.
-_TURN_ABORTED_MARKERS = ("<turn_aborted>", "<turn_aborted/>")
-
-
-# Substrings in codex stderr / JSON-RPC error messages that signal the
-# subprocess died because its OAuth credentials are no longer valid.
-# Kept conservative: we only redirect users to `codex login` when we're
-# reasonably sure that's the actual failure, otherwise we surface the
-# original error verbatim. Mirrors openclaw beta.8's auth-refresh
-# classification.
-_OAUTH_REFRESH_FAILURE_HINTS = (
-    "invalid_grant",
-    "invalid grant",
-    "refresh token",
-    "refresh_token",
-    "token refresh",
-    "token_refresh",
-    "token has expired",
-    "expired_token",
-    "expired token",
-    "not authenticated",
-    "unauthenticated",
-    "unauthorized",
-    "401 unauthorized",
-    "re-authenticate",
-    "reauthenticate",
-    "please log in",
-    "please login",
-    "auth profile",
-    "no auth profile",
-    "oauth",
-)
-
-
-def _classify_oauth_failure(*parts: str) -> Optional[str]:
-    """Return a user-friendly re-auth hint if any of the provided strings
-    look like a codex OAuth/token-refresh failure; otherwise None.
-
-    Used for both `turn/start` JSON-RPC errors and post-mortem stderr
-    inspection when the subprocess exits unexpectedly. Conservative on
-    purpose — we only redirect users to `codex login` when the signal
-    is strong, so unrelated runtime failures still surface verbatim.
-    """
-    haystack = " ".join(p for p in parts if p).lower()
-    if not haystack:
-        return None
-    for needle in _OAUTH_REFRESH_FAILURE_HINTS:
-        if needle in haystack:
-            return (
-                "Codex authentication failed — your ChatGPT/Codex login "
-                "looks expired or invalid. Run `codex login` to refresh, "
-                "then retry. (Fall back to default runtime with "
-                "`/codex-runtime auto` if the issue persists.)"
-            )
-    return None
-
-
-@dataclass
-class _ServerRequestRouting:
-    """Default policies for codex-side approval requests when no interactive
-    callback is wired in. These are only used by tests + cron / non-interactive
-    contexts; the live CLI path passes an approval_callback that defers to
-    tools.approval.prompt_dangerous_approval()."""
-
-    auto_approve_exec: bool = False
-    auto_approve_apply_patch: bool = False
-
-
-class CodexAppServerSession:
-    """One Codex thread per Hermes session, lifetime owned by AIAgent.
-
-    Not thread-safe — one caller drives it at a time, matching how AIAgent's
-    run_conversation() loop is structured today. The codex client itself can
-    handle interleaved reads/writes via its own threads, but the adapter's
-    state (projector, thread_id, turn counter) is owned by the caller thread.
-    """
-
-    def __init__(
-        self,
-        *,
-        cwd: Optional[str] = None,
-        codex_bin: str = "codex",
-        codex_home: Optional[str] = None,
-        permission_profile: Optional[str] = None,
-        approval_callback: Optional[Callable[..., str]] = None,
-        on_event: Optional[Callable[[dict], None]] = None,
-        request_routing: Optional[_ServerRequestRouting] = None,
-        client_factory: Optional[Callable[..., CodexAppServerClient]] = None,
-    ) -> None:
-        self._cwd = cwd or os.getcwd()
-        self._codex_bin = codex_bin
-        self._codex_home = codex_home
-        self._permission_profile = (
-            permission_profile or _HERMES_TO_CODEX_PERMISSION_PROFILE.get(
-                os.environ.get("HERMES_TERMINAL_SECURITY_MODE", "auto"),
-                "workspace-write",
-            )
-        )
-        self._approval_callback = approval_callback
-        self._on_event = on_event  # Display hook (kawaii spinner ticks etc.)
-        self._routing = request_routing or _ServerRequestRouting()
-        self._client_factory = client_factory or CodexAppServerClient
-
-        self._client: Optional[CodexAppServerClient] = None
-        self._thread_id: Optional[str] = None
-        self._interrupt_event = threading.Event()
-        # Pending file-change items, keyed by item id. Populated on
-        # item/started for fileChange items; consumed by the approval
-        # bridge when codex sends item/fileChange/requestApproval. The
-        # approval params don't carry the changeset, so we cache here
-        # to surface a real summary in the approval prompt (quirk #4).
-        self._pending_file_changes: dict[str, str] = {}
-        self._closed = False
-
-    # ---------- lifecycle ----------
-
-    def ensure_started(self) -> str:
-        """Spawn the subprocess, do the initialize handshake, and start a
-        thread. Returns the codex thread id. Idempotent — repeated calls
-        return the same thread id."""
-        if self._thread_id is not None:
-            return self._thread_id
-        if self._client is None:
-            self._client = self._client_factory(
-                codex_bin=self._codex_bin, codex_home=self._codex_home
-            )
-        self._client.initialize(
-            client_name="hermes",
-            client_title="Hermes Agent",
-            client_version=_get_hermes_version(),
-        )
-        # Permission selection is intentionally NOT sent on thread/start.
-        # Two reasons (live-tested against codex 0.130.0):
-        #   1. `thread/start.permissions` is gated behind the experimentalApi
-        #      capability on this codex version — we'd have to opt in during
-        #      initialize and accept the unstable surface.
-        #   2. Even with experimentalApi declared and the correct shape
-        #      (`{"type": "profile", "id": "..."}`, not `{"profileId": ...}`),
-        #      codex requires a matching `[permissions]` table in
-        #      ~/.codex/config.toml or it fails the request with
-        #      'default_permissions requires a [permissions] table'.
-        # Letting codex pick its default (`:read-only` unless the user has
-        # configured otherwise in their codex config.toml) is the standard
-        # codex CLI workflow and avoids fighting codex's own validation.
-        # Users who want a write-capable profile configure it in their
-        # ~/.codex/config.toml the same way they would for any codex usage.
-        params: dict[str, Any] = {"cwd": self._cwd}
-        result = self._client.request("thread/start", params, timeout=15)
-        # Cross-fill thread.id/sessionId — different codex versions have
-        # serialized this under either key. Mirrors openclaw beta.8's
-        # tolerance fix so future codex drops/renames don't KeyError us
-        # at handshake time.
-        thread_obj = result.get("thread") or {}
-        thread_id = (
-            thread_obj.get("id")
-            or thread_obj.get("sessionId")
-            or result.get("sessionId")
-            or result.get("threadId")
-        )
-        if not thread_id:
-            raise CodexAppServerError(
-                code=-32603,
-                message=(
-                    "codex thread/start returned no thread id "
-                    f"(payload keys: {sorted(result.keys())})"
-                ),
-            )
-        self._thread_id = thread_id
-        logger.info(
-            "codex app-server thread started: id=%s profile=%s cwd=%s",
-            self._thread_id[:8],
-            self._permission_profile,
-            self._cwd,
-        )
-        return self._thread_id
-
-    def close(self) -> None:
-        if self._closed:
-            return
-        self._closed = True
-        if self._client is not None:
-            try:
-                self._client.close()
-            except Exception:  # pragma: no cover - best-effort cleanup
-                pass
-            self._client = None
-        self._thread_id = None
-
-    def __enter__(self) -> "CodexAppServerSession":
-        return self
-
-    def __exit__(self, *exc: Any) -> None:
-        self.close()
-
-    # ---------- interrupt ----------
-
-    def request_interrupt(self) -> None:
-        """Idempotent: signal the active turn loop to issue turn/interrupt
-        and unwind. Called by AIAgent's _interrupt_requested path."""
-        self._interrupt_event.set()
-
-    # ---------- diagnostics ----------
-
-    def _format_error_with_stderr(
-        self,
-        prefix: str,
-        exc: Any = "",
-        *,
-        tail_lines: int = _STDERR_TAIL_LINES,
-    ) -> str:
-        """Build a user-facing error string for codex failures.
-
-        Appends the last few lines of codex's stderr buffer when available,
-        passed through agent.redact with force=True so secrets in provider
-        error responses (auth headers, query-string tokens, sk-* keys) never
-        leak into chat output or trajectories. The codex CLI's own error
-        text ('Internal error', 'turn/start failed: ...') is otherwise
-        opaque and forces users to re-run with verbose flags to diagnose
-        config / provider / auth-bridge problems.
-
-        Use this for the generic / catch-all branches. Specific
-        classifications (OAuth via _classify_oauth_failure, post-tool wedge
-        watchdog) already produce a clean hint and should be used instead.
-        """
-        exc_str = str(exc) if exc != "" and exc is not None else ""
-        base = f"{prefix}: {exc_str}" if exc_str else prefix
-        if self._client is None:
-            return base
-        try:
-            tail = self._client.stderr_tail(tail_lines)
-        except Exception:  # pragma: no cover - diagnostic best-effort
-            return base
-        if not tail:
-            return base
-        joined = "\n".join(line.rstrip() for line in tail if line)
-        if not joined.strip():
-            return base
-        redacted = redact_sensitive_text(joined, force=True)
-        return f"{base}\ncodex stderr (last {len(tail)} lines):\n{redacted}"
-
-    # ---------- per-turn ----------
-
-    def run_turn(
-        self,
-        user_input: str,
-        *,
-        turn_timeout: float = 600.0,
-        notification_poll_timeout: float = 0.25,
-        post_tool_quiet_timeout: float = 90.0,
-    ) -> TurnResult:
-        """Send a user message and block until turn/completed, while
-        forwarding server-initiated approval requests and projecting items
-        into Hermes' messages shape.
-
-        post_tool_quiet_timeout: if codex emits a tool completion and then
-        goes quiet for this many seconds without emitting another item or
-        `turn/completed`, fast-fail and mark the session for retirement.
-        Mirrors openclaw beta.8's post-tool completion watchdog (#81697)
-        so a wedged codex doesn't burn the full turn deadline.
-        """
-        # Pre-create the result so startup failures (codex subprocess can't
-        # spawn, initialize handshake rejects, thread/start blows up) surface
-        # the same way per-turn failures do — with a TurnResult.error string
-        # the caller can render — instead of bubbling raw codex exceptions
-        # up to AIAgent.run_conversation.
-        result = TurnResult()
-        try:
-            self.ensure_started()
-        except (CodexAppServerError, TimeoutError) as exc:
-            result.error = self._format_error_with_stderr(
-                "codex app-server startup failed", exc
-            )
-            # Subprocess almost certainly unhealthy — retire so the next
-            # turn re-spawns cleanly.
-            result.should_retire = True
-            return result
-        assert self._client is not None and self._thread_id is not None
-        result.thread_id = self._thread_id
-
-        self._interrupt_event.clear()
-        projector = CodexEventProjector()
-
-        # Send turn/start with the user input. Text-only for now (codex
-        # supports rich content but Hermes' text path is the common case).
-        try:
-            ts = self._client.request(
-                "turn/start",
-                {
-                    "threadId": self._thread_id,
-                    "input": [{"type": "text", "text": user_input}],
-                },
-                timeout=10,
-            )
-        except CodexAppServerError as exc:
-            # Classify auth/refresh failures so the user gets a clear
-            # `codex login` pointer instead of a raw RPC error string.
-            stderr_blob = "\n".join(self._client.stderr_tail(40))
-            hint = _classify_oauth_failure(exc.message, stderr_blob)
-            if hint is not None:
-                result.error = hint
-                # Subprocess is fine on a JSON-RPC level here, but the
-                # token store is broken — retire so the next turn does a
-                # clean handshake (and the user has a chance to re-auth
-                # via `codex login` between turns).
-                result.should_retire = True
-            else:
-                result.error = self._format_error_with_stderr(
-                    "turn/start failed", exc
-                )
-            return result
-        except TimeoutError as exc:
-            # turn/start hanging is a strong signal the subprocess is wedged.
-            stderr_blob = "\n".join(self._client.stderr_tail(40))
-            hint = _classify_oauth_failure(stderr_blob)
-            result.error = hint or self._format_error_with_stderr(
-                "turn/start timed out", exc
-            )
-            result.should_retire = True
-            return result
-
-        result.turn_id = (ts.get("turn") or {}).get("id")
-        deadline = time.time() + turn_timeout
-        turn_complete = False
-        # Post-tool watchdog state. last_tool_completion_at is set whenever
-        # a tool-shaped item completes; if no further notification arrives
-        # within post_tool_quiet_timeout and the turn hasn't completed, we
-        # fast-fail and retire the session.
-        last_tool_completion_at: Optional[float] = None
-
-        while time.time() < deadline and not turn_complete:
-            if self._interrupt_event.is_set():
-                self._issue_interrupt(result.turn_id)
-                result.interrupted = True
-                break
-
-            # Detect a dead subprocess between iterations. If codex exited
-            # (e.g. crashed, segfaulted, or its auth refresh thread killed
-            # the process), we won't get any more notifications — bail out
-            # rather than waiting for the full turn deadline.
-            if not self._client.is_alive():
-                stderr_blob = "\n".join(self._client.stderr_tail(60))
-                hint = _classify_oauth_failure(stderr_blob)
-                if hint is not None:
-                    result.error = hint
-                else:
-                    result.error = self._format_error_with_stderr(
-                        "codex app-server subprocess exited unexpectedly",
-                        tail_lines=20,
-                    )
-                result.should_retire = True
-                break
-
-            # Post-tool watchdog: if a tool completion was the most recent
-            # signal and codex has been silent past the quiet timeout, give
-            # up on this turn instead of waiting for the outer deadline.
-            if (
-                last_tool_completion_at is not None
-                and (time.time() - last_tool_completion_at)
-                    > post_tool_quiet_timeout
-            ):
-                self._issue_interrupt(result.turn_id)
-                result.interrupted = True
-                result.error = (
-                    "codex went silent for "
-                    f"{post_tool_quiet_timeout:.0f}s after a tool result; "
-                    "retiring app-server session."
-                )
-                result.should_retire = True
-                break
-
-            # Drain any server-initiated requests (approvals) before
-            # reading notifications, so the codex side isn't blocked.
-            sreq = self._client.take_server_request(timeout=0)
-            if sreq is not None:
-                # Drain any pending notifications first so per-turn state
-                # (e.g. _pending_file_changes for fileChange approvals) is
-                # up to date when we make the approval decision. Bounded
-                # to avoid starving the server-request response.
-                for _ in range(8):
-                    pending = self._client.take_notification(timeout=0)
-                    if pending is None:
-                        break
-                    self._track_pending_file_change(pending)
-                    proj = projector.project(pending)
-                    if proj.messages:
-                        result.projected_messages.extend(proj.messages)
-                    if proj.is_tool_iteration:
-                        result.tool_iterations += 1
-                        last_tool_completion_at = time.time()
-                    if proj.final_text is not None:
-                        result.final_text = proj.final_text
-                        if _has_turn_aborted_marker(proj.final_text):
-                            turn_complete = True
-                            result.interrupted = True
-                            result.error = (
-                                result.error
-                                or "codex reported turn_aborted"
-                            )
-                self._handle_server_request(sreq)
-                # Activity counts as live signal — reset the post-tool
-                # quiet timer so an approval round-trip doesn't trip it.
-                last_tool_completion_at = None
-                continue
-
-            note = self._client.take_notification(
-                timeout=notification_poll_timeout
-            )
-            if note is None:
-                continue
-
-            method = note.get("method", "")
-            if self._on_event is not None:
-                try:
-                    self._on_event(note)
-                except Exception:  # pragma: no cover - display callback
-                    logger.debug("on_event callback raised", exc_info=True)
-
-            # Track in-progress fileChange items so the approval bridge
-            # can surface a real change summary when codex requests
-            # approval (the approval params themselves don't carry the
-            # changeset). Quirk #4 fix.
-            self._track_pending_file_change(note)
-
-            # Project into messages
-            projection = projector.project(note)
-            if projection.messages:
-                result.projected_messages.extend(projection.messages)
-            if projection.is_tool_iteration:
-                result.tool_iterations += 1
-                # Arm/refresh the post-tool quiet watchdog whenever a
-                # tool-shaped item completes.
-                last_tool_completion_at = time.time()
-            else:
-                # Any non-tool projected activity (assistant message,
-                # status update, etc.) means codex is still producing
-                # output — clear the quiet timer so we don't fast-fail.
-                if projection.messages or projection.final_text is not None:
-                    last_tool_completion_at = None
-            if projection.final_text is not None:
-                # Codex can emit multiple agentMessage items in one turn
-                # (e.g. partial then final). Take the last one as canonical.
-                result.final_text = projection.final_text
-                # Some codex builds tear a turn down by emitting a
-                # `<turn_aborted>` marker in the agent message text and
-                # never sending turn/completed. Treat the marker itself
-                # as terminal so we don't burn the full deadline.
-                if _has_turn_aborted_marker(projection.final_text):
-                    turn_complete = True
-                    result.interrupted = True
-                    result.error = (
-                        result.error or "codex reported turn_aborted"
-                    )
-
-            if method == "turn/completed":
-                turn_complete = True
-                turn_status = (
-                    (note.get("params") or {}).get("turn") or {}
-                ).get("status")
-                if turn_status and turn_status not in ("completed", "interrupted"):
-                    err_obj = (
-                        (note.get("params") or {}).get("turn") or {}
-                    ).get("error")
-                    if err_obj:
-                        err_msg = err_obj.get("message") or str(err_obj)
-                        # If the turn failed for an auth/refresh reason,
-                        # rewrite the error into a re-auth hint AND mark
-                        # the session for retirement.
-                        stderr_blob = "\n".join(
-                            self._client.stderr_tail(40)
-                        )
-                        hint = _classify_oauth_failure(err_msg, stderr_blob)
-                        if hint is not None:
-                            result.error = hint
-                            result.should_retire = True
-                        else:
-                            result.error = self._format_error_with_stderr(
-                                f"turn ended status={turn_status}", err_msg
-                            )
-
-        if not turn_complete and not result.interrupted:
-            # Hit the deadline. Issue interrupt to stop wasted compute, and
-            # tell the caller to retire the session — a turn that never
-            # finished is a strong sign codex is wedged in a way the next
-            # turn shouldn't inherit.
-            self._issue_interrupt(result.turn_id)
-            result.interrupted = True
-            if not result.error:
-                result.error = self._format_error_with_stderr(
-                    f"turn timed out after {turn_timeout}s"
-                )
-            result.should_retire = True
-
-        return result
-
-    # ---------- internals ----------
-
-    def _issue_interrupt(self, turn_id: Optional[str]) -> None:
-        if self._client is None or self._thread_id is None or turn_id is None:
-            return
-        try:
-            self._client.request(
-                "turn/interrupt",
-                {"threadId": self._thread_id, "turnId": turn_id},
-                timeout=5,
-            )
-        except CodexAppServerError as exc:
-            # "no active turn to interrupt" is fine — already done.
-            logger.debug("turn/interrupt non-fatal: %s", exc)
-        except TimeoutError:
-            logger.warning("turn/interrupt timed out")
-
-    def _handle_server_request(self, req: dict) -> None:
-        """Translate a codex server request (approval) into Hermes' approval
-        flow, then send the response.
-
-        Method names verified live against codex 0.130.0 (Apr 2026):
-          item/commandExecution/requestApproval — exec approvals
-          item/fileChange/requestApproval       — apply_patch approvals
-          item/permissions/requestApproval      — permissions changes
-                                                  (we decline; user controls
-                                                  permission profile in
-                                                  ~/.codex/config.toml).
-        """
-        if self._client is None:
-            return
-        method = req.get("method", "")
-        rid = req.get("id")
-        params = req.get("params") or {}
-
-        if method == "item/commandExecution/requestApproval":
-            decision = self._decide_exec_approval(params)
-            self._client.respond(rid, {"decision": decision})
-        elif method == "item/fileChange/requestApproval":
-            decision = self._decide_apply_patch_approval(params)
-            self._client.respond(rid, {"decision": decision})
-        elif method == "item/permissions/requestApproval":
-            # Codex sometimes asks to escalate permissions mid-turn. We
-            # always decline — the user already chose their permission
-            # profile in ~/.codex/config.toml and surprise escalations
-            # shouldn't be silently accepted.
-            self._client.respond(rid, {"decision": "decline"})
-        elif method == "mcpServer/elicitation/request":
-            # Codex's MCP layer asks the user for structured input on
-            # behalf of an MCP server (e.g. tool-call confirmation,
-            # OAuth, form data). For our own hermes-tools callback we
-            # auto-accept — the user already approved Hermes' tools
-            # by enabling the runtime, and we never expose anything
-            # codex's built-in shell can't already do. For other MCP
-            # servers we decline so the user explicitly opts in via
-            # codex's own auth flow.
-            server_name = params.get("serverName") or ""
-            if server_name == "hermes-tools":
-                self._client.respond(
-                    rid,
-                    {"action": "accept", "content": None, "_meta": None},
-                )
-            else:
-                self._client.respond(
-                    rid,
-                    {"action": "decline", "content": None, "_meta": None},
-                )
-        else:
-            # Unknown server request — codex can extend this surface. Reject
-            # cleanly so codex doesn't hang waiting for us.
-            logger.warning("Unknown codex server request: %s", method)
-            self._client.respond_error(
-                rid, code=-32601, message=f"Unsupported method: {method}"
-            )
-
-    def _decide_exec_approval(self, params: dict) -> str:
-        if self._routing.auto_approve_exec:
-            return "accept"
-        command = params.get("command") or ""
-        # Codex's CommandExecutionRequestApprovalParams has cwd as Optional —
-        # fall back to the session's cwd when codex doesn't include it so the
-        # approval prompt is never empty (quirk #10 fix).
-        cwd = params.get("cwd") or self._cwd or "<unknown>"
-        reason = params.get("reason")
-        description = f"Codex requests exec in {cwd}"
-        if reason:
-            description += f" — {reason}"
-        if self._approval_callback is not None:
-            try:
-                choice = self._approval_callback(
-                    command, description, allow_permanent=False
-                )
-                return _approval_choice_to_codex_decision(choice)
-            except Exception:
-                logger.exception("approval_callback raised on exec request")
-                return "decline"
-        return "decline"  # fail-closed when no callback wired
-
-    def _decide_apply_patch_approval(self, params: dict) -> str:
-        if self._routing.auto_approve_apply_patch:
-            return "accept"
-        if self._approval_callback is not None:
-            # FileChangeRequestApprovalParams gives us reason + grantRoot.
-            # The actual changeset lives on the corresponding fileChange
-            # item which the projector has already cached for us — look it
-            # up by item_id so the user sees what's actually changing.
-            reason = params.get("reason")
-            grant_root = params.get("grantRoot")
-            item_id = params.get("itemId") or ""
-            change_summary = self._lookup_pending_file_change(item_id)
-            description_parts = []
-            if reason:
-                description_parts.append(reason)
-            if change_summary:
-                description_parts.append(change_summary)
-            if grant_root:
-                description_parts.append(f"grants write to {grant_root}")
-            description = (
-                "; ".join(description_parts)
-                if description_parts
-                else "Codex requests to apply a patch"
-            )
-            command_label = (
-                f"apply_patch: {change_summary}" if change_summary
-                else f"apply_patch: {reason}" if reason
-                else "apply_patch"
-            )
-            try:
-                choice = self._approval_callback(
-                    command_label,
-                    description,
-                    allow_permanent=False,
-                )
-                return _approval_choice_to_codex_decision(choice)
-            except Exception:
-                logger.exception("approval_callback raised on apply_patch")
-                return "decline"
-        return "decline"
-
-    def _track_pending_file_change(self, note: dict) -> None:
-        """Maintain self._pending_file_changes from item/started + item/completed
-        notifications. Lets the apply_patch approval prompt show what's
-        actually changing — codex's approval params don't carry the data."""
-        method = note.get("method", "")
-        params = note.get("params") or {}
-        item = params.get("item") or {}
-        if item.get("type") != "fileChange":
-            return
-        item_id = item.get("id") or ""
-        if not item_id:
-            return
-        if method == "item/started":
-            changes = item.get("changes") or []
-            if not changes:
-                self._pending_file_changes[item_id] = "1 change pending"
-                return
-            kinds: dict[str, int] = {}
-            paths: list[str] = []
-            for ch in changes:
-                if not isinstance(ch, dict):
-                    continue
-                kind = (ch.get("kind") or {}).get("type") or "update"
-                kinds[kind] = kinds.get(kind, 0) + 1
-                p = ch.get("path") or ""
-                if p:
-                    paths.append(p)
-            counts = ", ".join(f"{n} {k}" for k, n in sorted(kinds.items()))
-            preview = ", ".join(paths[:3])
-            if len(paths) > 3:
-                preview += f", +{len(paths) - 3} more"
-            self._pending_file_changes[item_id] = (
-                f"{counts}: {preview}" if preview else counts
-            )
-        elif method == "item/completed":
-            self._pending_file_changes.pop(item_id, None)
-
-    def _lookup_pending_file_change(self, item_id: str) -> Optional[str]:
-        """Return the cached change summary for an in-progress fileChange
-        item, for use in the approval prompt. Returns None when we don't have
-        the item cached (e.g. approval arrived before item/started, or
-        fileChange item content not tracked yet)."""
-        if not item_id:
-            return None
-        cached = self._pending_file_changes.get(item_id)
-        if not cached:
-            return None
-        return cached
-
-
-def _approval_choice_to_codex_decision(choice: str) -> str:
-    """Map Hermes approval choices onto codex's CommandExecutionApprovalDecision
-    / FileChangeApprovalDecision wire values.
-
-    Hermes returns 'once', 'session', 'always', or 'deny'.
-    Codex expects 'accept', 'acceptForSession', 'decline', or 'cancel'
-    (verified against codex-rs/app-server-protocol/src/protocol/v2/item.rs
-    on codex 0.130.0).
-    """
-    if choice == "once":
-        return "accept"
-    if choice in ("session", "always"):
-        return "acceptForSession"
-    return "decline"
-
-
-def _has_turn_aborted_marker(text: str) -> bool:
-    """Return True if `text` contains any of the raw markers codex uses
-    to signal a turn was aborted without emitting `turn/completed`.
-
-    Codex emits `<turn_aborted>` (and sometimes `<turn_aborted/>`) as raw
-    text inside agentMessage items when an interrupt or upstream error
-    tears the turn down before the normal completion path fires. Mirrors
-    openclaw beta.8's terminal-marker fix so we don't burn the full turn
-    deadline waiting for a turn/completed that never comes.
-    """
-    if not text:
-        return False
-    for marker in _TURN_ABORTED_MARKERS:
-        if marker in text:
-            return True
-    return False
-
-
-def _get_hermes_version() -> str:
-    """Best-effort Hermes version string for codex's userAgent line."""
-    try:
-        from importlib.metadata import version
-
-        return version("hermes-agent")
-    except Exception:  # pragma: no cover
-        return "0.0.0"
@@ -1,312 +0,0 @@
-"""Projects codex app-server events into Hermes' messages list.
-
-The translator that lets Hermes' memory/skill review keep working under the
-Codex runtime: it converts Codex `item/*` notifications into the standard
-OpenAI-shaped `{role, content, tool_calls, tool_call_id}` entries that
-`agent/curator.py` already knows how to read.
-
-Codex emits items with a discriminator field `type`:
-  - userMessage         → {role: "user", content}
-  - agentMessage        → {role: "assistant", content}
-  - reasoning           → stashed in the assistant's "reasoning" field
-  - commandExecution    → assistant tool_call(name="exec_command") + tool result
-  - fileChange          → assistant tool_call(name="apply_patch") + tool result
-  - mcpToolCall         → assistant tool_call(name=f"mcp.{server}.{tool}") + tool result
-  - dynamicToolCall     → assistant tool_call(name=tool) + tool result
-  - plan/hookPrompt/collabAgentToolCall → recorded as opaque assistant notes
-
-Each item maps to AT MOST one assistant entry + one tool entry, preserving
-Hermes' message-alternation invariants (system → user → assistant → user/tool
-→ assistant → ...). Multiple Codex tool calls within one Codex turn produce
-multiple consecutive (assistant, tool) pairs, which is the same shape Hermes
-already produces for parallel tool calls.
-
-Counters tracked alongside projection:
-  - tool_iterations: ticks once per completed tool-shaped item. Used by
-    AIAgent._iters_since_skill (skill nudge gate, default threshold 10).
-"""
-
-from __future__ import annotations
-
-import hashlib
-import json
-from dataclasses import dataclass, field
-from typing import Any, Optional
-
-
-def _deterministic_call_id(item_type: str, item_id: str) -> str:
-    """Stable id for tool_call message correlation.
-
-    Uses the codex item id directly when present (already a uuid); falls back
-    to a hash of the item type alone (so id-less items of one type share an
-    id) to keep replay stable and prefix caches valid. See AGENTS.md Pitfall
-    #16 (deterministic IDs in tool call history)."""
-    if item_id:
-        return f"codex_{item_type}_{item_id}"
-    digest = hashlib.sha256(f"{item_type}".encode()).hexdigest()[:16]
-    return f"codex_{item_type}_{digest}"
-
-
-def _format_tool_args(d: dict) -> str:
-    """Format a dict as JSON the way Hermes' existing tool_calls path does."""
-    return json.dumps(d, ensure_ascii=False, sort_keys=True)
-
-
-@dataclass
-class ProjectionResult:
-    """Output of projecting one Codex item.
-
-    `messages` is a list because some Codex items produce two messages
-    (assistant tool_call + tool result). Empty list = item ignored (e.g. a
-    streaming `outputDelta` that doesn't materialize into messages until the
-    `item/completed` event)."""
-
-    messages: list[dict] = field(default_factory=list)
-    is_tool_iteration: bool = False
-    final_text: Optional[str] = None  # Set when an agentMessage completes
-
-
-class CodexEventProjector:
-    """Stateful projector consuming Codex notifications in arrival order.
-
-    Owns the in-progress reasoning content (codex emits reasoning as separate
-    items but Hermes stashes it on the next assistant message)."""
-
-    def __init__(self) -> None:
-        self._pending_reasoning: list[str] = []
-
-    def project(self, notification: dict) -> ProjectionResult:
-        """Project a single notification. Every other method returns an empty
-        result; only `item/completed` materializes messages."""
-        method = notification.get("method", "")
-        params = notification.get("params", {}) or {}
-
-        # We only materialize messages on `item/completed`. Streaming deltas
-        # (`item/<type>/outputDelta`, `item/<type>/delta`) are display-only and
-        # don't enter the messages list — same way Hermes already only writes
-        # the assistant message after the streaming completion event.
-        if method != "item/completed":
-            return ProjectionResult()
-
-        item = params.get("item") or {}
-        item_type = item.get("type") or ""
-        item_id = item.get("id") or ""
-
-        if item_type == "agentMessage":
-            return self._project_agent_message(item)
-        if item_type == "reasoning":
-            self._pending_reasoning.extend(item.get("summary") or [])
-            self._pending_reasoning.extend(item.get("content") or [])
-            return ProjectionResult()
-        if item_type == "commandExecution":
-            return self._project_command(item, item_id)
-        if item_type == "fileChange":
-            return self._project_file_change(item, item_id)
-        if item_type == "mcpToolCall":
-            return self._project_mcp_tool_call(item, item_id)
-        if item_type == "dynamicToolCall":
-            return self._project_dynamic_tool_call(item, item_id)
-        if item_type == "userMessage":
-            return self._project_user_message(item)
-
-        # Unknown / rare items (plan, hookPrompt, collabAgentToolCall, etc.)
-        # — record as opaque assistant note so memory review can still see
-        # *something* happened, but don't fabricate tool_call structure.
-        return self._project_opaque(item, item_type)
-
-    # ---------- per-type projections ----------
-
-    def _project_agent_message(self, item: dict) -> ProjectionResult:
-        text = item.get("text") or ""
-        msg: dict[str, Any] = {"role": "assistant", "content": text}
-        if self._pending_reasoning:
-            msg["reasoning"] = "\n".join(self._pending_reasoning)
-            self._pending_reasoning = []
-        return ProjectionResult(messages=[msg], final_text=text)
-
-    def _project_user_message(self, item: dict) -> ProjectionResult:
-        # codex's userMessage content is a list of UserInput variants. For
-        # projection purposes we flatten any text fragments and ignore
-        # non-text parts (images, etc.) — Hermes' messages store text only.
-        text_parts: list[str] = []
-        for fragment in item.get("content") or []:
-            if isinstance(fragment, dict):
-                if fragment.get("type") == "text":
-                    text_parts.append(fragment.get("text") or "")
-                elif "text" in fragment:
-                    text_parts.append(str(fragment["text"]))
-        return ProjectionResult(
-            messages=[{"role": "user", "content": "\n".join(text_parts)}]
-        )
-
-    def _project_command(self, item: dict, item_id: str) -> ProjectionResult:
-        call_id = _deterministic_call_id("exec", item_id)
-        args = {
-            "command": item.get("command") or "",
-            "cwd": item.get("cwd") or "",
-        }
-        assistant_msg = {
-            "role": "assistant",
-            "content": None,
-            "tool_calls": [
-                {
-                    "id": call_id,
-                    "type": "function",
-                    "function": {
-                        "name": "exec_command",
-                        "arguments": _format_tool_args(args),
-                    },
-                }
-            ],
-        }
-        if self._pending_reasoning:
-            assistant_msg["reasoning"] = "\n".join(self._pending_reasoning)
-            self._pending_reasoning = []
-        output = item.get("aggregatedOutput") or ""
-        exit_code = item.get("exitCode")
-        if exit_code is not None and exit_code != 0:
-            output = f"[exit {exit_code}]\n{output}"
-        tool_msg = {
-            "role": "tool",
-            "tool_call_id": call_id,
-            "content": output,
-        }
-        return ProjectionResult(
-            messages=[assistant_msg, tool_msg], is_tool_iteration=True
-        )
-
-    def _project_file_change(self, item: dict, item_id: str) -> ProjectionResult:
-        call_id = _deterministic_call_id("apply_patch", item_id)
-        # Reduce the codex changes array to a digest the agent loop will
-        # find readable. We record per-file change kinds (Add/Update/Delete)
-        # without inlining full file contents — those can be huge.
-        changes_summary = []
-        for change in item.get("changes") or []:
-            kind = (change.get("kind") or {}).get("type") or "update"
-            path = change.get("path") or ""
-            changes_summary.append({"kind": kind, "path": path})
-        args = {"changes": changes_summary}
-        assistant_msg = {
-            "role": "assistant",
-            "content": None,
-            "tool_calls": [
-                {
-                    "id": call_id,
-                    "type": "function",
-                    "function": {
-                        "name": "apply_patch",
-                        "arguments": _format_tool_args(args),
-                    },
-                }
-            ],
-        }
-        if self._pending_reasoning:
-            assistant_msg["reasoning"] = "\n".join(self._pending_reasoning)
-            self._pending_reasoning = []
-        status = item.get("status") or "unknown"
-        n = len(changes_summary)
-        tool_msg = {
-            "role": "tool",
-            "tool_call_id": call_id,
-            "content": f"apply_patch status={status}, {n} change(s)",
-        }
-        return ProjectionResult(
-            messages=[assistant_msg, tool_msg], is_tool_iteration=True
-        )
-
-    def _project_mcp_tool_call(self, item: dict, item_id: str) -> ProjectionResult:
-        server = item.get("server") or "mcp"
-        tool = item.get("tool") or "unknown"
-        call_id = _deterministic_call_id(f"mcp_{server}_{tool}", item_id)
-        args = item.get("arguments") or {}
-        if not isinstance(args, dict):
-            args = {"arguments": args}
-        assistant_msg = {
-            "role": "assistant",
-            "content": None,
-            "tool_calls": [
-                {
-                    "id": call_id,
-                    "type": "function",
-                    "function": {
-                        "name": f"mcp.{server}.{tool}",
-                        "arguments": _format_tool_args(args),
-                    },
-                }
-            ],
-        }
-        if self._pending_reasoning:
-            assistant_msg["reasoning"] = "\n".join(self._pending_reasoning)
-            self._pending_reasoning = []
-        result = item.get("result")
-        error = item.get("error")
-        if error:
-            content = f"[error] {json.dumps(error, ensure_ascii=False)[:1000]}"
-        elif result is not None:
-            content = json.dumps(result, ensure_ascii=False)[:4000]
-        else:
-            content = ""
-        tool_msg = {
-            "role": "tool",
-            "tool_call_id": call_id,
-            "content": content,
-        }
-        return ProjectionResult(
-            messages=[assistant_msg, tool_msg], is_tool_iteration=True
-        )
-
-    def _project_dynamic_tool_call(
-        self, item: dict, item_id: str
-    ) -> ProjectionResult:
-        tool = item.get("tool") or "unknown"
-        call_id = _deterministic_call_id(f"dyn_{tool}", item_id)
-        args = item.get("arguments") or {}
-        if not isinstance(args, dict):
-            args = {"arguments": args}
-        assistant_msg = {
-            "role": "assistant",
-            "content": None,
-            "tool_calls": [
-                {
-                    "id": call_id,
-                    "type": "function",
-                    "function": {
-                        "name": tool,
-                        "arguments": _format_tool_args(args),
-                    },
-                }
-            ],
-        }
-        if self._pending_reasoning:
-            assistant_msg["reasoning"] = "\n".join(self._pending_reasoning)
-            self._pending_reasoning = []
-        content_items = item.get("contentItems") or []
-        if isinstance(content_items, list) and content_items:
-            content = json.dumps(content_items, ensure_ascii=False)[:4000]
-        else:
-            success = item.get("success")
-            content = f"success={success}"
-        tool_msg = {
-            "role": "tool",
-            "tool_call_id": call_id,
-            "content": content,
-        }
-        return ProjectionResult(
-            messages=[assistant_msg, tool_msg], is_tool_iteration=True
-        )
-
-    def _project_opaque(self, item: dict, item_type: str) -> ProjectionResult:
-        # Record the existence of the item without inventing tool_calls.
-        # Memory review will see this and may or may not save anything.
-        try:
-            payload = json.dumps(item, ensure_ascii=False)[:1500]
-        except (TypeError, ValueError):
-            payload = repr(item)[:1500]
-        return ProjectionResult(
-            messages=[
-                {
-                    "role": "assistant",
-                    "content": f"[codex {item_type}] {payload}",
-                }
-            ]
-        )
@@ -1,225 +0,0 @@
-"""Hermes-tools-as-MCP server for the codex_app_server runtime.
-
-When the user runs `openai/*` turns through the codex app-server, codex
-owns the loop and builds its own tool list. By default, that means
-Hermes' richer tool surface — web search, browser automation,
-delegate_task subagents, vision analysis, persistent memory, skills,
-cross-session search, image generation, TTS — is unreachable.
-
-This module exposes a curated subset of those Hermes tools to the
-spawned codex subprocess via stdio MCP. Codex registers it as a normal
-MCP server (per `~/.codex/config.toml [mcp_servers.hermes-tools]`) and
-the user gets full Hermes capability inside a Codex turn.
-
-Scope (what we expose):
-  - web_search, web_extract              — Firecrawl, no codex equivalent
-  - browser_navigate / _click / _type /  — Camofox/Browserbase automation
-    _press / _snapshot / _scroll / _back / _get_images / _console / _vision
-  - vision_analyze                       — image inspection by vision model
-  - image_generate                       — image generation
-  - skill_view, skills_list              — Hermes' skill library
-  - text_to_speech                       — TTS
-  - kanban_* (worker and orchestrator handoff; see EXPOSED_TOOLS)
-
-What we DO NOT expose:
-  - terminal / shell                     — codex's own shell tool
-  - read_file / write_file / patch       — codex's apply_patch + shell
-  - search_files / process               — codex's shell
-  - clarify, todo                        — codex's own UX
-  - delegate_task / memory / session_search (need live agent-loop state,
-    which a stateless MCP callback cannot provide)
-
-Run with: python -m agent.transports.hermes_tools_mcp_server
-Spawned by: CodexAppServerSession.ensure_started() when the runtime is
-            active and config opts in.
-"""
-
-from __future__ import annotations
-
-import json
-import logging
-import os
-import sys
-from typing import Any, Optional
-
-logger = logging.getLogger(__name__)
-
-
-# Tools we expose. Each name MUST match a registered Hermes tool that
-# `model_tools.handle_function_call()` can dispatch.
-#
-# What we deliberately DO NOT expose:
-#   - terminal / shell / read_file / write_file / patch / search_files /
-#     process — codex's built-ins cover these and approval routes through
-#     codex's own UI.
-#   - delegate_task / memory / session_search / todo — these are
-#     `_AGENT_LOOP_TOOLS` in Hermes (model_tools.py:493). They require
-#     the running AIAgent context to dispatch (mid-loop state), so a
-#     stateless MCP callback can't drive them. Hermes' default runtime
-#     keeps these working; the codex_app_server runtime cannot.
-EXPOSED_TOOLS: tuple[str, ...] = (
-    "web_search",
-    "web_extract",
-    "browser_navigate",
-    "browser_click",
-    "browser_type",
-    "browser_press",
-    "browser_snapshot",
-    "browser_scroll",
-    "browser_back",
-    "browser_get_images",
-    "browser_console",
-    "browser_vision",
-    "vision_analyze",
-    "image_generate",
-    "skill_view",
-    "skills_list",
-    "text_to_speech",
-    # Kanban worker handoff tools — gated on HERMES_KANBAN_TASK env var
-    # (set by the kanban dispatcher when spawning a worker). Without these
-    # in the callback, a worker spawned with openai_runtime=codex_app_server
-    # could do the work but couldn't report completion back to the kernel,
-    # making it hang until timeout. Stateless dispatch — they just read
-    # the env var and write to ~/.hermes/kanban.db.
-    "kanban_complete",
-    "kanban_block",
-    "kanban_comment",
-    "kanban_heartbeat",
-    "kanban_show",
-    "kanban_list",
-    # NOTE: kanban_create / kanban_unblock / kanban_link are orchestrator-
-    # only — the kanban tool gates them on HERMES_KANBAN_TASK being unset.
-    # They're exposed here for orchestrator agents running on the codex
-    # runtime that need to dispatch new tasks.
-    "kanban_create",
-    "kanban_unblock",
-    "kanban_link",
-)
-
-
-def _build_server() -> Any:
-    """Create the FastMCP server with Hermes tools attached. Lazy imports
-    so the module can be imported without the mcp package installed
-    (we degrade to a clear error only when actually run)."""
-    try:
-        from mcp.server.fastmcp import FastMCP
-    except ImportError as exc:  # pragma: no cover - install hint
-        raise ImportError(
-            f"hermes-tools MCP server requires the 'mcp' package: {exc}"
-        ) from exc
-
-    # Discover Hermes tools so dispatch works.
-    from model_tools import (
-        get_tool_definitions,
-        handle_function_call,
-    )
-
-    mcp = FastMCP(
-        "hermes-tools",
-        instructions=(
-            "Hermes Agent's tool surface, exposed for use inside a Codex "
-            "session. Use these for capabilities Codex's built-in toolset "
-            "doesn't cover: web search/extract, browser automation, "
-            "vision, image generation, skills, text-to-speech, and "
-            "kanban task handoff."
-        ),
-    )
-
-    # Pull authoritative Hermes tool schemas for the ones we expose, so
-    # MCP clients see the same parameter docs Hermes gives the model.
-    all_defs = {
-        td["function"]["name"]: td["function"]
-        for td in (get_tool_definitions(quiet_mode=True) or [])
-        if isinstance(td, dict) and td.get("type") == "function"
-    }
-
-    exposed_count = 0
-
-    for name in EXPOSED_TOOLS:
-        spec = all_defs.get(name)
-        if spec is None:
-            logger.debug(
-                "skipping %s — not registered in this Hermes process", name
-            )
-            continue
-
-        description = spec.get("description") or f"Hermes {name} tool"
-        params_schema = spec.get("parameters") or {"type": "object", "properties": {}}
-
-        # FastMCP wants a Python callable. Build a closure that takes the
-        # arguments dict, dispatches via handle_function_call, and returns
-        # the result string. We use add_tool() for full control over the
-        # input schema (FastMCP's @tool() decorator inspects type hints,
-        # which we can't get from a JSON schema at runtime).
-        def _make_handler(tool_name: str):
-            def _dispatch(**kwargs: Any) -> str:
-                try:
-                    return handle_function_call(tool_name, kwargs or {})
-                except Exception as exc:
-                    logger.exception("tool %s raised", tool_name)
-                    return json.dumps({"error": str(exc), "tool": tool_name})
-            _dispatch.__name__ = tool_name
-            _dispatch.__doc__ = description
-            return _dispatch
-
-        try:
-            mcp.add_tool(
-                _make_handler(name),
-                name=name,
-                description=description,
-                # NOTE: params_schema is computed above but not passed here.
-                # The intent was to supply it as input_schema (newer SDKs) or
-                # parameters_schema (older SDKs); neither keyword is sent.
-            )
-        except TypeError:
-            # Older mcp SDK signature — fall back to decorator-style.
-            handler = _make_handler(name)
-            handler = mcp.tool(name=name, description=description)(handler)
-
-        exposed_count += 1
-
-    logger.info(
-        "hermes-tools MCP server registered %d/%d tools",
-        exposed_count,
-        len(EXPOSED_TOOLS),
-    )
-    return mcp
-
-
-def main(argv: Optional[list[str]] = None) -> int:
-    """Entry point for `python -m agent.transports.hermes_tools_mcp_server`."""
-    argv = sys.argv[1:] if argv is None else argv
-    verbose = "--verbose" in argv or "-v" in argv
-
-    log_level = logging.INFO if verbose else logging.WARNING
-    logging.basicConfig(
-        level=log_level,
-        stream=sys.stderr,  # MCP uses stdio for protocol — logs MUST go to stderr
-        format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
-    )
-
-    # Quiet mode: keep Hermes' own banners off stdout (which is the MCP wire).
-    os.environ.setdefault("HERMES_QUIET", "1")
-    os.environ.setdefault("HERMES_REDACT_SECRETS", "true")
-
-    try:
-        server = _build_server()
-    except ImportError as exc:
-        sys.stderr.write(f"hermes-tools MCP server cannot start: {exc}\n")
-        return 2
-
-    # FastMCP runs with stdio transport by default when launched as a
-    # subprocess.
-    try:
-        server.run()
-    except KeyboardInterrupt:
-        return 0
-    except Exception as exc:
-        logger.exception("hermes-tools MCP server crashed")
-        sys.stderr.write(f"hermes-tools MCP server error: {exc}\n")
-        return 1
-    return 0
-
-
-if __name__ == "__main__":
-    sys.exit(main())
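Note on the handler factory in the removed module: the loop in `_build_server()` builds one closure per tool name and turns any exception into a JSON error string. A minimal, self-contained sketch of that pattern follows. `make_handler` and `fake_dispatch` are hypothetical names introduced for illustration only; `fake_dispatch` stands in for `handle_function_call()`, and no `mcp` API is used.

```python
import json
from typing import Any, Callable, Dict

Dispatch = Callable[[str, Dict[str, Any]], str]


def make_handler(tool_name: str, dispatch: Dispatch) -> Callable[..., str]:
    """Bind tool_name at creation time so each handler keeps its own name."""

    def _handler(**kwargs: Any) -> str:
        try:
            return dispatch(tool_name, kwargs)
        except Exception as exc:
            # Report the failure as a JSON payload instead of raising.
            return json.dumps({"error": str(exc), "tool": tool_name})

    _handler.__name__ = tool_name
    return _handler


def fake_dispatch(name: str, args: Dict[str, Any]) -> str:
    """Hypothetical stand-in for handle_function_call()."""
    if name == "boom":
        raise RuntimeError("backend down")
    return json.dumps({"tool": name, "args": args})


# One handler per tool name, as the registration loop would build them.
handlers = {n: make_handler(n, fake_dispatch) for n in ("web_search", "boom")}
```

Passing the name through a factory argument, rather than reading the loop variable inside the closure, is what keeps every handler bound to its own tool.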
@@ -370,17 +370,6 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
        source_url="https://api-docs.deepseek.com/quick_start/pricing",
        pricing_version="deepseek-pricing-2026-03-16",
    ),
-    (
-        "deepseek",
-        "deepseek-v4-pro",
-    ): PricingEntry(
-        input_cost_per_million=Decimal("1.74"),
-        output_cost_per_million=Decimal("3.48"),
-        cache_read_cost_per_million=Decimal("0.0145"),
-        source="official_docs_snapshot",
-        source_url="https://api-docs.deepseek.com/quick_start/pricing",
-        pricing_version="deepseek-pricing-2026-05-12",
-    ),
    # Google Gemini
    (
        "google",
@@ -1,299 +0,0 @@
-"""
-Video Generation Provider ABC
-=============================
-
-Defines the pluggable-backend interface for video generation. Providers register
-instances via ``PluginContext.register_video_gen_provider()``; the active one
-(selected via ``video_gen.provider`` in ``config.yaml``) services every
-``video_generate`` tool call.
-
-Providers live in ``<repo>/plugins/video_gen/<name>/`` (built-in, auto-loaded
-as ``kind: backend``) or ``~/.hermes/plugins/video_gen/<name>/`` (user, opt-in
-via ``plugins.enabled``).
-
-Mirrors the ``image_gen`` provider design (``agent/image_gen_provider.py``) so
-the two surfaces stay learnable together.
-
-Unified surface
---------------
-One tool — ``video_generate`` — covers **text-to-video** and **image-to-video**.
-The router is the presence of ``image_url``: if it's set, the provider routes
-to its image-to-video endpoint; if it's omitted, the provider routes to
-text-to-video. Users pick one **model family** (e.g. Pixverse v6, Veo 3.1,
-Kling O3 Standard); the provider handles which underlying FAL/xAI endpoint
-to hit.
-
-Video edit and video extend are intentionally NOT exposed in this surface —
-the inconsistency across backends is too large for one unified tool. If
-those use cases warrant attention later they can ship as separate tools.
-
-Response shape
--------------
-All providers return a dict built by :func:`success_response` /
-:func:`error_response`. Keys:
-
-    success         bool
-    video           str | None      URL or absolute file path
-    model           str             provider-specific model identifier
-    prompt          str             echoed prompt
-    modality        str             "text" | "image" (which mode was used)
-    aspect_ratio    str             provider-native (e.g. "16:9") or ""
-    duration        int             seconds (0 if not applicable)
-    provider        str             provider name (for diagnostics)
-    error           str             only when success=False
-    error_type      str             only when success=False
-"""
-
-from __future__ import annotations
-
-import abc
-import base64
-import datetime
-import logging
-import uuid
-from pathlib import Path
-from typing import Any, Dict, List, Optional, Tuple
-
-logger = logging.getLogger(__name__)
-
-
-# Common aspect ratios across providers (Veo / Kling / xAI / Pixverse). The
-# tool schema advertises this set as an enum hint, but providers may accept
-# a narrower or wider set — they are responsible for clamping.
-COMMON_ASPECT_RATIOS: Tuple[str, ...] = ("16:9", "9:16", "1:1", "4:3", "3:4", "3:2", "2:3")
-DEFAULT_ASPECT_RATIO = "16:9"
-
-COMMON_RESOLUTIONS: Tuple[str, ...] = ("480p", "540p", "720p", "1080p")
-DEFAULT_RESOLUTION = "720p"
-
-
-# ---------------------------------------------------------------------------
-# ABC
-# ---------------------------------------------------------------------------
-
-
-class VideoGenProvider(abc.ABC):
-    """Abstract base class for a video generation backend.
-
-    Subclasses must implement :meth:`generate`. Everything else has sane
-    defaults — override only what your provider needs.
-    """
-
-    @property
-    @abc.abstractmethod
-    def name(self) -> str:
-        """Stable short identifier used in ``video_gen.provider`` config.
-
-        Lowercase, no spaces. Examples: ``xai``, ``fal``, ``google``.
-        """
-
-    @property
-    def display_name(self) -> str:
-        """Human-readable label shown in ``hermes tools``. Defaults to ``name.title()``."""
-        return self.name.title()
-
-    def is_available(self) -> bool:
-        """Return True when this provider can service calls.
-
-        Typically checks for a required API key and optional-dependency
-        import. Default: True.
-        """
-        return True
-
-    def list_models(self) -> List[Dict[str, Any]]:
-        """Return catalog entries for ``hermes tools`` model picker.
-
-        Each entry represents a **model family** that supports text-to-video
-        and/or image-to-video routing internally::
-
-            {
-                "id": "veo-3.1",                       # required
-                "display": "Veo 3.1",                  # optional; defaults to id
-                "speed": "~60s",                       # optional
-                "strengths": "...",                    # optional
-                "price": "$0.20/s",                    # optional
-                "modalities": ["text", "image"],       # optional, advisory
-            }
-
-        Default: empty list (provider has no user-selectable models).
-        """
-        return []
-
-    def get_setup_schema(self) -> Dict[str, Any]:
-        """Return provider metadata for the ``hermes tools`` picker."""
-        return {
-            "name": self.display_name,
-            "badge": "",
-            "tag": "",
-            "env_vars": [],
-        }
-
-    def default_model(self) -> Optional[str]:
-        """Return the default model id, or None if not applicable."""
-        models = self.list_models()
-        if models:
-            return models[0].get("id")
-        return None
-
-    def capabilities(self) -> Dict[str, Any]:
-        """Return what this provider supports.
-
-        Returned dict (all keys optional)::
-
-            {
-                "modalities": ["text", "image"],      # which inputs the backend accepts
-                "aspect_ratios": ["16:9", "9:16", ...],
-                "resolutions": ["720p", "1080p"],
-                "max_duration": 15,                   # seconds
-                "min_duration": 1,
-                "supports_audio": True,
-                "supports_negative_prompt": True,
-                "max_reference_images": 7,
-            }
-
-        Used by the tool layer for soft validation and by ``hermes tools``
-        for the picker. Default: text-only.
-        """
-        return {
-            "modalities": ["text"],
-            "aspect_ratios": list(COMMON_ASPECT_RATIOS),
-            "resolutions": list(COMMON_RESOLUTIONS),
-            "max_duration": 10,
-            "min_duration": 1,
-            "supports_audio": False,
-            "supports_negative_prompt": False,
-            "max_reference_images": 0,
-        }
-
-    @abc.abstractmethod
-    def generate(
-        self,
-        prompt: str,
-        *,
-        model: Optional[str] = None,
-        image_url: Optional[str] = None,
-        reference_image_urls: Optional[List[str]] = None,
-        duration: Optional[int] = None,
-        aspect_ratio: str = DEFAULT_ASPECT_RATIO,
-        resolution: str = DEFAULT_RESOLUTION,
-        negative_prompt: Optional[str] = None,
-        audio: Optional[bool] = None,
-        seed: Optional[int] = None,
-        **kwargs: Any,
-    ) -> Dict[str, Any]:
-        """Generate a video from a prompt (text-to-video) or animate an image
-        (image-to-video).
-
-        Routing: if ``image_url`` is provided, the provider should route to
-        its image-to-video endpoint; otherwise text-to-video. The plugin
-        is responsible for picking the right underlying endpoint within
-        the user's chosen model family.
-
-        Implementations should return the dict from :func:`success_response`
-        or :func:`error_response`. ``kwargs`` may contain forward-compat
-        parameters future versions of the schema will expose —
-        implementations MUST ignore unknown keys (no TypeError).
-        """
-
-
-# ---------------------------------------------------------------------------
-# Helpers
-# ---------------------------------------------------------------------------
-
-
-def _videos_cache_dir() -> Path:
-    """Return ``$HERMES_HOME/cache/videos/``, creating parents as needed."""
-    from hermes_constants import get_hermes_home
-
-    path = get_hermes_home() / "cache" / "videos"
-    path.mkdir(parents=True, exist_ok=True)
-    return path
-
-
-def save_b64_video(
-    b64_data: str,
-    *,
-    prefix: str = "video",
-    extension: str = "mp4",
-) -> Path:
-    """Decode base64 video data and write under ``$HERMES_HOME/cache/videos/``.
-
-    Returns the absolute :class:`Path` to the saved file.
-
-    Filename format: ``<prefix>_<YYYYMMDD_HHMMSS>_<short-uuid>.<ext>``.
-    """
-    raw = base64.b64decode(b64_data)
-    ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
-    short = uuid.uuid4().hex[:8]
-    path = _videos_cache_dir() / f"{prefix}_{ts}_{short}.{extension}"
-    path.write_bytes(raw)
-    return path
-
-
-def save_bytes_video(
-    raw: bytes,
-    *,
-    prefix: str = "video",
-    extension: str = "mp4",
-) -> Path:
-    """Write raw video bytes (e.g. an HTTP download body) to the cache."""
-    ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
-    short = uuid.uuid4().hex[:8]
-    path = _videos_cache_dir() / f"{prefix}_{ts}_{short}.{extension}"
-    path.write_bytes(raw)
-    return path
-
-
-def success_response(
-    *,
-    video: str,
-    model: str,
-    prompt: str,
-    modality: str = "text",
-    aspect_ratio: str = "",
-    duration: int = 0,
-    provider: str,
-    extra: Optional[Dict[str, Any]] = None,
-) -> Dict[str, Any]:
-    """Build a uniform success response dict.
-
-    ``video`` may be an HTTP URL or an absolute filesystem path.
-    ``modality`` is ``"text"`` (text-to-video) or ``"image"`` (image-to-video) —
-    indicates which endpoint was actually hit, useful for diagnostics.
-    """
-    payload: Dict[str, Any] = {
-        "success": True,
-        "video": video,
-        "model": model,
-        "prompt": prompt,
-        "modality": modality,
-        "aspect_ratio": aspect_ratio,
-        "duration": int(duration) if duration else 0,
-        "provider": provider,
-    }
-    if extra:
-        for k, v in extra.items():
-            payload.setdefault(k, v)
-    return payload
-
-
-def error_response(
-    *,
-    error: str,
-    error_type: str = "provider_error",
-    provider: str = "",
-    model: str = "",
-    prompt: str = "",
-    aspect_ratio: str = "",
-) -> Dict[str, Any]:
-    """Build a uniform error response dict."""
-    return {
-        "success": False,
-        "video": None,
-        "error": error,
-        "error_type": error_type,
-        "model": model,
-        "prompt": prompt,
-        "aspect_ratio": aspect_ratio,
-        "provider": provider,
-    }
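Note on the routing contract in the removed ABC: `generate()` picks image-to-video when `image_url` is set, text-to-video otherwise, and must tolerate unknown keyword arguments. The sketch below illustrates that contract with a free function rather than a real provider. The endpoint string, the `example-family` model name, and the `example.invalid` URL are all made up for illustration.

```python
from typing import Any, Dict, Optional


def generate(
    prompt: str,
    *,
    image_url: Optional[str] = None,
    duration: Optional[int] = None,
    aspect_ratio: str = "16:9",
    **kwargs: Any,  # forward-compat keys are accepted and ignored
) -> Dict[str, Any]:
    """Route on the presence of image_url and echo the chosen modality."""
    modality = "image" if image_url else "text"
    # Hypothetical endpoint naming; real providers map to their own APIs.
    endpoint = f"example-family/{modality}-to-video"
    return {
        "success": True,
        "video": f"https://example.invalid/{endpoint}.mp4",
        "model": "example-family",
        "prompt": prompt,
        "modality": modality,
        "aspect_ratio": aspect_ratio,
        "duration": int(duration) if duration else 0,
        "provider": "example",
    }
```

The returned keys mirror the ones built by `success_response()`, so a caller can inspect `modality` to see which path was taken.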
@@ -1,117 +0,0 @@
-"""
-Video Generation Provider Registry
-==================================
-
-Central map of registered providers. Populated by plugins at import-time via
-``PluginContext.register_video_gen_provider()``; consumed by the
-``video_generate`` tool to dispatch each call to the active backend.
-
-Active selection
----------------
-The active provider is chosen by ``video_gen.provider`` in ``config.yaml``.
-If unset or not registered, :func:`get_active_provider` falls back:
-
-1. If exactly one provider is registered, use it.
-2. Otherwise return ``None`` (the tool surfaces a helpful error pointing
-   the user at ``hermes tools``).
-
-Mirrors ``agent/image_gen_registry.py`` so the two surfaces behave the
-same.
-"""
-
-from __future__ import annotations
-
-import logging
-import threading
-from typing import Dict, List, Optional
-
-from agent.video_gen_provider import VideoGenProvider
-
-logger = logging.getLogger(__name__)
-
-
-_providers: Dict[str, VideoGenProvider] = {}
-_lock = threading.Lock()
-
-
-def register_provider(provider: VideoGenProvider) -> None:
-    """Register a video generation provider.
-
-    Re-registration (same ``name``) overwrites the previous entry and logs
-    a debug message — this makes hot-reload scenarios (tests, dev loops)
-    behave predictably.
-    """
-    if not isinstance(provider, VideoGenProvider):
-        raise TypeError(
-            f"register_provider() expects a VideoGenProvider instance, "
-            f"got {type(provider).__name__}"
-        )
-    name = provider.name
-    if not isinstance(name, str) or not name.strip():
-        raise ValueError("Video gen provider .name must be a non-empty string")
-    with _lock:
-        existing = _providers.get(name)
-        _providers[name] = provider
-    if existing is not None:
-        logger.debug("Video gen provider '%s' re-registered (was %r)", name, type(existing).__name__)
-    else:
-        logger.debug("Registered video gen provider '%s' (%s)", name, type(provider).__name__)
-
-
-def list_providers() -> List[VideoGenProvider]:
-    """Return all registered providers, sorted by name."""
-    with _lock:
-        items = list(_providers.values())
-    return sorted(items, key=lambda p: p.name)
-
-
-def get_provider(name: str) -> Optional[VideoGenProvider]:
-    """Return the provider registered under *name*, or None."""
-    if not isinstance(name, str):
-        return None
-    with _lock:
-        return _providers.get(name.strip())
-
-
-def get_active_provider() -> Optional[VideoGenProvider]:
-    """Resolve the currently-active provider.
-
-    Reads ``video_gen.provider`` from config.yaml; falls back per the
-    module docstring.
-    """
-    configured: Optional[str] = None
-    try:
-        from hermes_cli.config import load_config
-
-        cfg = load_config()
-        section = cfg.get("video_gen") if isinstance(cfg, dict) else None
-        if isinstance(section, dict):
-            raw = section.get("provider")
-            if isinstance(raw, str) and raw.strip():
-                configured = raw.strip()
-    except Exception as exc:
-        logger.debug("Could not read video_gen.provider from config: %s", exc)
-
-    with _lock:
-        snapshot = dict(_providers)
-
-    if configured:
-        provider = snapshot.get(configured)
-        if provider is not None:
-            return provider
-        logger.debug(
-            "video_gen.provider='%s' configured but not registered; falling back",
-            configured,
-        )
-
-    # Fallback: single-provider case
-    if len(snapshot) == 1:
-        return next(iter(snapshot.values()))
-
-    return None
-
-
-def _reset_for_tests() -> None:
-    """Clear the registry. **Test-only.**"""
-    with _lock:
-        _providers.clear()
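Note on the selection order in the removed registry: the configured name wins if it is registered, a lone registered provider is used otherwise, and everything else resolves to `None`. A pure-function sketch of that order, assuming the config value and a snapshot of the registry are passed in directly; `resolve_active` is a hypothetical helper, not a function from the module.

```python
from typing import Dict, Optional, TypeVar

P = TypeVar("P")


def resolve_active(configured: Optional[str], providers: Dict[str, P]) -> Optional[P]:
    """Configured name first, then the single-provider fallback, else None."""
    if configured and configured.strip():
        hit = providers.get(configured.strip())
        if hit is not None:
            return hit
        # Configured but not registered: fall through to the fallback.
    if len(providers) == 1:
        return next(iter(providers.values()))
    return None
```

Working on a snapshot dict, as `get_active_provider()` does after releasing the lock, keeps the resolution logic free of locking concerns.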
@@ -1,221 +0,0 @@
-"""
-Web Search Provider ABC
-=======================
-
-Defines the pluggable-backend interface for web search and content extraction.
-Providers register instances via ``PluginContext.register_web_search_provider()``;
-the active one (selected via ``web.search_backend`` / ``web.extract_backend`` /
-``web.backend`` in ``config.yaml``) services every ``web_search`` /
-``web_extract`` tool call.
-
-Providers live in ``<repo>/plugins/web/<name>/`` (built-in, auto-loaded as
-``kind: backend``) or ``~/.hermes/plugins/web/<name>/`` (user, opt-in via
-``plugins.enabled``).
-
-This ABC is the SINGLE plugin-facing surface for web providers — every
-provider in the tree (brave-free, ddgs, searxng, exa, parallel, tavily,
-firecrawl) implements it. The legacy in-tree ``tools.web_providers.base``
-ABCs were deleted in PR #25182 along with the per-vendor inline helpers
-in ``tools/web_tools.py``; the response-shape contract documented below
-is preserved bit-for-bit so the tool wrapper does not have to translate.
-
-Response shape (preserved from the legacy contract):
-
-Search results::
-
-    {
-        "success": True,
-        "data": {
-            "web": [
-                {"title": str, "url": str, "description": str, "position": int},
-                ...
-            ]
-        }
-    }
-
-Extract results::
-
-    {
-        "success": True,
-        "data": [
-            {"url": str, "title": str, "content": str,
-             "raw_content": str, "metadata": dict},
-            ...
-        ]
-    }
-
-On failure (either capability)::
-
-    {"success": False, "error": str}
-"""
-
-from __future__ import annotations
-
-import abc
-from typing import Any, Dict, List
-
-
-# ---------------------------------------------------------------------------
-# ABC
-# ---------------------------------------------------------------------------
-
-
-class WebSearchProvider(abc.ABC):
-    """Abstract base class for a web search/extract/crawl backend.
-
-    Subclasses must implement :meth:`is_available` and at least one of
-    :meth:`search` / :meth:`extract` / :meth:`crawl`. The
-    :meth:`supports_search` / :meth:`supports_extract` / :meth:`supports_crawl`
-    capability flags let the registry route each tool call to the right
-    provider, and let multi-capability providers (Firecrawl, Tavily, Exa,
-    …) advertise multiple capabilities from a single class.
-    """
-
-    @property
-    @abc.abstractmethod
-    def name(self) -> str:
-        """Stable short identifier used in ``web.search_backend`` /
-        ``web.extract_backend`` / ``web.backend`` config keys.
-
-        Lowercase, no spaces; hyphens permitted to preserve existing
-        user-visible names. Examples: ``brave-free``, ``ddgs``,
-        ``searxng``, ``firecrawl``.
-        """
-
-    @property
-    def display_name(self) -> str:
-        """Human-readable label shown in ``hermes tools``. Defaults to ``name``."""
-        return self.name
-
-    @abc.abstractmethod
-    def is_available(self) -> bool:
-        """Return True when this provider can service calls.
-
-        Typically a cheap check (env var present, optional Python dep
-        importable, instance URL set). Must NOT make network calls — this
-        runs at tool-registration time and on every ``hermes tools`` paint.
-        """
-
-    def supports_search(self) -> bool:
-        """Return True if this provider implements :meth:`search`."""
-        return True
-
-    def supports_extract(self) -> bool:
-        """Return True if this provider implements :meth:`extract`.
-
-        Both sync and async :meth:`extract` implementations are valid — the
-        dispatcher detects coroutine functions via
-        :func:`inspect.iscoroutinefunction` and awaits as needed. Sync
-        implementations that perform blocking I/O (HTTP, SDK calls) should
-        ideally wrap in :func:`asyncio.to_thread` at the call site; small
-        providers can keep their sync shape and let the dispatcher handle
-        threading.
-        """
-        return False
-
-    def supports_crawl(self) -> bool:
-        """Return True if this provider implements :meth:`crawl`.
-
-        Crawl differs from extract in that the agent provides a *seed URL*
-        and the provider walks linked pages on its own — useful for
-        documentation sites where the agent doesn't know all relevant
-        URLs upfront. Tavily is the only built-in backend that natively
-        crawls today; Firecrawl provides a similar capability that we
-        don't currently surface as a tool.
-
-        Providers that don't crawl should leave this as False; the
-        dispatcher in :func:`tools.web_tools.web_crawl_tool` will fall
-        back to its auxiliary-model summarization path.
-        """
-        return False
-
-    def search(self, query: str, limit: int = 5) -> Dict[str, Any]:
-        """Execute a web search.
-
-        Override when :meth:`supports_search` returns True. The default
-        raises NotImplementedError; callers should gate on
-        :meth:`supports_search` before calling.
-        """
-        raise NotImplementedError(
-            f"{self.name} does not implement search (override search())"
-        )
-
-    def extract(self, urls: List[str], **kwargs: Any) -> Any:
-        """Extract content from one or more URLs.
-
-        Override when :meth:`supports_extract` returns True. The default
-        raises NotImplementedError; callers should gate on
-        :meth:`supports_extract` before calling.
-
-        Return shape: a list of result dicts matching what the legacy
-        :func:`tools.web_tools.web_extract_tool` post-processing pipeline
-        expects::
-
-            [
-                {
-                    "url": str,
-                    "title": str,
-                    "content": str,
-                    "raw_content": str,
-                    "metadata": dict,           # optional
-                    "error": str,               # optional, only on per-URL failure
-                },
-                ...
-            ]
-
-        Implementations MAY be ``async def`` — the dispatcher detects
-        coroutines via :func:`inspect.iscoroutinefunction` and awaits.
-
-        ``kwargs`` may carry forward-compat fields (``format``, ``include_raw``,
-        ``max_chars``) — implementations should ignore unknown keys.
-        """
-        raise NotImplementedError(
-            f"{self.name} does not support extract (override supports_extract)"
-        )
-
-    def crawl(self, url: str, **kwargs: Any) -> Any:
-        """Crawl a seed URL and return results.
-
-        Override when :meth:`supports_crawl` returns True. The default
-        raises NotImplementedError; callers should gate on
-        :meth:`supports_crawl` before calling.
-
-        Return shape: ``{"results": [{"url": str, "title": str,
-        "content": str, ...}, ...]}`` matching what
-        :func:`tools.web_tools.web_crawl_tool` post-processing expects.
-
-        Implementations MAY be ``async def``.
-
-        ``kwargs`` may carry forward-compat fields (e.g. ``max_depth``,
-        ``include_domains``) — implementations should ignore unknown keys.
-        """
-        raise NotImplementedError(
-            f"{self.name} does not support crawl (override supports_crawl)"
-        )
-
-    def get_setup_schema(self) -> Dict[str, Any]:
-        """Return provider metadata for the ``hermes tools`` picker.
-
-        Used by ``hermes_cli/tools_config.py`` to inject this provider as a
-        row in the Web Search / Web Extract picker. Shape::
-
-            {
-                "name": "Brave Search (Free)",
-                "badge": "free",
-                "tag": "No paid tier needed — uses Brave's free API.",
-                "env_vars": [
-                    {"key": "BRAVE_SEARCH_API_KEY",
-                     "prompt": "Brave Search API key",
-                     "url": "https://brave.com/search/api/"},
-                ],
-            }
-
-        Default: minimal entry derived from ``display_name``. Override to
-        expose API key prompts, badges, and instance URL fields.
-        """
-        return {
-            "name": self.display_name,
-            "badge": "",
-            "tag": "",
-            "env_vars": [],
-        }
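Note on capability gating in the removed ABC: callers are expected to check `supports_extract()` before calling `extract()`, and `extract()` may be sync or `async def`. The sketch below shows one way a dispatcher could honor both rules. The three provider classes and `run_extract` are hypothetical stand-ins, not the real dispatcher in `tools.web_tools`, and `asyncio.run` is used only because this sketch assumes no event loop is already running.

```python
import asyncio
import inspect
from typing import Any, Dict, List


class SyncProvider:
    def supports_extract(self) -> bool:
        return True

    def extract(self, urls: List[str], **kwargs: Any) -> List[Dict[str, Any]]:
        return [{"url": u, "title": "", "content": "sync", "raw_content": "sync"} for u in urls]


class AsyncProvider:
    def supports_extract(self) -> bool:
        return True

    async def extract(self, urls: List[str], **kwargs: Any) -> List[Dict[str, Any]]:
        await asyncio.sleep(0)  # placeholder for real async I/O
        return [{"url": u, "title": "", "content": "async", "raw_content": "async"} for u in urls]


class SearchOnlyProvider:
    def supports_extract(self) -> bool:
        return False


def run_extract(provider: Any, urls: List[str]) -> Dict[str, Any]:
    """Gate on the capability flag, then call extract() sync or async."""
    if not provider.supports_extract():
        return {"success": False, "error": "provider does not support extract"}
    if inspect.iscoroutinefunction(provider.extract):
        data = asyncio.run(provider.extract(urls))
    else:
        data = provider.extract(urls)
    return {"success": True, "data": data}
```

A dispatcher that already runs inside an event loop would `await` the coroutine instead of calling `asyncio.run`.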
@@ -1,262 +0,0 @@
-"""
-Web Search Provider Registry
-============================
-
-Central map of registered web providers. Populated by plugins at import-time
-via :meth:`PluginContext.register_web_search_provider`; consumed by the
-``web_search`` and ``web_extract`` tool wrappers in :mod:`tools.web_tools` to
-dispatch each call to the active backend.
-
-Active selection
-----------------
-The active provider is chosen by configuration with this precedence:
-
-1. ``web.search_backend`` / ``web.extract_backend`` / ``web.crawl_backend``
-   (per-capability override).
-2. ``web.backend`` (shared fallback).
-3. If exactly one capability-eligible provider is registered AND available,
-   use it.
-4. Legacy preference order — ``firecrawl`` → ``parallel`` → ``tavily`` →
-   ``exa`` → ``searxng`` → ``brave-free`` → ``ddgs`` — filtered by
-   availability. Matches the historic ``tools.web_tools._get_backend()``
-   candidate order so installs that never set a config key keep landing
-   on the same provider they did before the plugin migration.
-5. Otherwise ``None`` — the tool surfaces a helpful error pointing at
-   ``hermes tools``.
-
-The capability filter (``supports_search`` / ``supports_extract`` /
-``supports_crawl``) is applied at every step so a search-only provider
-(``brave-free``) configured as ``web.extract_backend`` correctly falls
-through to an extract-capable backend.
-"""
-
-from __future__ import annotations
-
-import logging
-import threading
-from typing import Dict, List, Optional
-
-from agent.web_search_provider import WebSearchProvider
-
-logger = logging.getLogger(__name__)
-
-
-_providers: Dict[str, WebSearchProvider] = {}
-_lock = threading.Lock()
-
-
-def register_provider(provider: WebSearchProvider) -> None:
-    """Register a web search/extract provider.
-
-    Re-registration (same ``name``) overwrites the previous entry and logs
-    a debug message — makes hot-reload scenarios (tests, dev loops) behave
-    predictably.
-    """
-    if not isinstance(provider, WebSearchProvider):
-        raise TypeError(
-            f"register_provider() expects a WebSearchProvider instance, "
-            f"got {type(provider).__name__}"
-        )
-    name = provider.name
-    if not isinstance(name, str) or not name.strip():
-        raise ValueError("Web provider .name must be a non-empty string")
-    with _lock:
-        existing = _providers.get(name)
-        _providers[name] = provider
-    if existing is not None:
-        logger.debug(
-            "Web provider '%s' re-registered (was %r)",
-            name, type(existing).__name__,
-        )
-    else:
-        logger.debug(
-            "Registered web provider '%s' (%s)",
-            name, type(provider).__name__,
-        )
-
-
-def list_providers() -> List[WebSearchProvider]:
-    """Return all registered providers, sorted by name."""
-    with _lock:
-        items = list(_providers.values())
-    return sorted(items, key=lambda p: p.name)
-
-
-def get_provider(name: str) -> Optional[WebSearchProvider]:
-    """Return the provider registered under *name*, or None."""
-    if not isinstance(name, str):
-        return None
-    with _lock:
-        return _providers.get(name.strip())
-
-
-# ---------------------------------------------------------------------------
-# Active-provider resolution
-# ---------------------------------------------------------------------------
-
-
-def _read_config_key(*path: str) -> Optional[str]:
-    """Resolve a dotted config key from ``config.yaml``. Returns None on miss."""
-    try:
-        from hermes_cli.config import load_config
-
-        cfg = load_config()
-        cur = cfg
-        for segment in path:
-            if not isinstance(cur, dict):
-                return None
-            cur = cur.get(segment)
-        if isinstance(cur, str) and cur.strip():
-            return cur.strip()
-    except Exception as exc:
-        logger.debug("Could not read config %s: %s", ".".join(path), exc)
-    return None
-
-
-# Legacy preference order — preserves behaviour for users who set no
-# ``web.backend`` / ``web.<capability>_backend`` config key at all. Matches
-# the historic candidate order in :func:`tools.web_tools._get_backend`
-# (paid providers first so existing paid setups don't get downgraded to
-# a free tier on upgrade). Filtered by ``is_available()`` at walk time so
-# we don't surface a provider the user has no credentials for.
-_LEGACY_PREFERENCE = (
-    "firecrawl",
-    "parallel",
-    "tavily",
-    "exa",
-    "searxng",
-    "brave-free",
-    "ddgs",
-)
-
-
-def _resolve(configured: Optional[str], *, capability: str) -> Optional[WebSearchProvider]:
-    """Resolve the active provider for a capability ("search" | "extract" | "crawl").
-
-    Resolution rules (in order):
-
-    1. **Explicit config wins, ignoring availability.** If
-       ``web.{capability}_backend`` or ``web.backend`` names a registered
-       provider that supports *capability*, return it even if its
-       :meth:`is_available` returns False — the dispatcher will surface a
-       precise "X_API_KEY is not set" error to the user instead of silently
-       routing somewhere else. Matches legacy
-       :func:`tools.web_tools._get_backend` behavior for configured names.
-
-    2. **Single-provider shortcut.** When only one registered provider
-       supports *capability* AND ``is_available()`` reports True, return it.
-
-    3. **Legacy preference walk, filtered by availability.** Walk the
-       :data:`_LEGACY_PREFERENCE` order (firecrawl → parallel → tavily →
-       exa → searxng → brave-free → ddgs) looking for a provider whose
-       ``supports_<capability>()`` is True AND whose ``is_available()`` is
-       True. Matches the historic ``tools.web_tools._get_backend()``
-       candidate order so users with credentials but no explicit config
-       key keep landing on the same provider as pre-migration. This is
-       the path that fires when no config key is set — pick the
-       highest-priority backend the user actually has credentials for.
-
-    Returns None when no provider is configured AND no available provider
-    matches the legacy preference; the dispatcher then returns a "set up a
-    provider" error to the user.
-    """
-    with _lock:
-        snapshot = dict(_providers)
-
-    def _capable(p: WebSearchProvider) -> bool:
-        if capability == "search":
-            return bool(p.supports_search())
-        if capability == "extract":
-            return bool(p.supports_extract())
-        if capability == "crawl":
-            return bool(p.supports_crawl())
-        return False
-
-    def _is_available_safe(p: WebSearchProvider) -> bool:
-        """Wrap ``is_available()`` so a buggy provider doesn't kill resolution."""
-        try:
-            return bool(p.is_available())
-        except Exception as exc:  # noqa: BLE001
-            logger.debug("provider %s.is_available() raised %s", p.name, exc)
-            return False
-
-    # 1. Explicit config wins — return regardless of is_available() so the
-    #    user gets a precise downstream error message rather than a silent
-    #    backend switch. Matches _get_backend() in web_tools.py.
-    if configured:
-        provider = snapshot.get(configured)
-        if provider is not None and _capable(provider):
-            return provider
-        if provider is None:
-            logger.debug(
-                "web backend '%s' configured but not registered; falling back",
-                configured,
-            )
-        else:
-            logger.debug(
-                "web backend '%s' configured but does not support '%s'; falling back",
-                configured, capability,
-            )
-
-    # 2. + 3. Fallback path — filter by availability so we don't surface
-    #    a provider the user has no credentials for. Without this filter,
-    #    a registered-but-unconfigured provider could end up "active" on
-    #    a fresh install with no API keys at all.
-    eligible = [
-        p for p in snapshot.values()
-        if _capable(p) and _is_available_safe(p)
-    ]
-    if len(eligible) == 1:
-        return eligible[0]
-
-    for legacy in _LEGACY_PREFERENCE:
-        provider = snapshot.get(legacy)
-        if (
-            provider is not None
-            and _capable(provider)
-            and _is_available_safe(provider)
-        ):
-            return provider
-
-    return None
-
-
-def get_active_search_provider() -> Optional[WebSearchProvider]:
-    """Resolve the currently-active web search provider.
-
-    Reads ``web.search_backend`` (preferred) or ``web.backend`` (shared
-    fallback) from config.yaml; falls back per the module docstring.
-    """
-    explicit = _read_config_key("web", "search_backend") or _read_config_key("web", "backend")
-    return _resolve(explicit, capability="search")
-
-
-def get_active_extract_provider() -> Optional[WebSearchProvider]:
-    """Resolve the currently-active web extract provider.
-
-    Reads ``web.extract_backend`` (preferred) or ``web.backend`` (shared
-    fallback) from config.yaml; falls back per the module docstring.
-    """
-    explicit = _read_config_key("web", "extract_backend") or _read_config_key("web", "backend")
-    return _resolve(explicit, capability="extract")
-
-
-def get_active_crawl_provider() -> Optional[WebSearchProvider]:
-    """Resolve the currently-active web crawl provider.
-
-    Reads ``web.crawl_backend`` (preferred) or ``web.backend`` (shared
-    fallback) from config.yaml; falls back per the module docstring.
-
-    Crawl is a niche capability — among built-in providers only Tavily and
-    Firecrawl implement it. Callers should expect ``None`` and fall back to
-    a different strategy (e.g. summarize-via-LLM) when neither is
-    configured.
-    """
-    explicit = _read_config_key("web", "crawl_backend") or _read_config_key("web", "backend")
-    return _resolve(explicit, capability="crawl")
-
-
-def _reset_for_tests() -> None:
-    """Clear the registry. **Test-only.**"""
-    with _lock:
-        _providers.clear()
@@ -364,18 +364,6 @@ compression:
  # compression of older turns.
  protect_last_n: 20

-  # Number of non-system messages to protect at the head of the transcript, in
-  # ADDITION to the system prompt (which is always implicitly protected).
-  # Head messages are NEVER summarized — they survive every compression
-  # indefinitely. This gives stable early context for short/medium sessions,
-  # but in long-running sessions that rely on rolling compaction the pinned
-  # opening turns may not match how you want the session framed over time.
-  # Set to 0 to preserve ONLY the system prompt (plus the rolling summary
-  # and recent tail) — the cleanest configuration for long-running sessions.
-  # Default 3 preserves the system prompt plus the first three non-system
-  # head messages, matching the pre-feature behaviour.
-  protect_first_n: 3
-
  # To pin a specific model/provider for compression summaries, use the
  # auxiliary section below (auxiliary.compression.provider / model).

@@ -111,7 +111,6 @@ _HOME_TARGET_ENV_VARS = {
    "weixin": "WEIXIN_HOME_CHANNEL",
    "bluebubbles": "BLUEBUBBLES_HOME_CHANNEL",
    "qqbot": "QQBOT_HOME_CHANNEL",
-    "whatsapp": "WHATSAPP_HOME_CHANNEL",
 }

 # Legacy env var names kept for back-compat.  Each entry is the current
@@ -39,10 +39,6 @@ if [ "$(id -u)" = "0" ]; then
        # by the mapped user on the host side.
        chown -R hermes:hermes "$HERMES_HOME" 2>/dev/null || \
            echo "Warning: chown failed (rootless container?) — continuing anyway"
-        # The .venv must also be re-chowned when UID is remapped, otherwise
-        # lazy_deps.py cannot install platform packages (discord.py, etc.).
-        chown -R hermes:hermes "$INSTALL_DIR/.venv" 2>/dev/null || \
-            echo "Warning: chown .venv failed (rootless container?) — continuing anyway"
    fi

    # Ensure config.yaml is readable by the hermes runtime user even if it was
@@ -2,7 +2,7 @@
 Hermes Gateway - Multi-platform messaging integration.

 This module provides a unified gateway for connecting the Hermes agent
-to various messaging platforms (Telegram, Discord, WhatsApp, Weixin, and more) with:
+to various messaging platforms (Telegram, Discord, WhatsApp) with:
 - Session management (persistent conversations with reset policies)
 - Dynamic context injection (agent knows where messages come from)
 - Delivery routing (cron job outputs to appropriate channels)
@@ -2,7 +2,7 @@
 Gateway configuration management.

 Handles loading and validating configuration for:
-- Connected platforms (Telegram, Discord, WhatsApp, Weixin, and more)
+- Connected platforms (Telegram, Discord, WhatsApp)
 - Home channels for each platform
 - Session reset policies
 - Delivery preferences
@@ -74,24 +74,6 @@ def _normalize_notice_delivery(value: Any, default: str = "public") -> str:
    return default


-def _ensure_platform_extra_dict(platforms_data: dict, name: str) -> tuple[dict, dict]:
-    """Get-or-create ``platforms_data[name]`` and its nested ``extra`` dict.
-
-    Both slots are coerced to ``{}`` if a non-dict value is encountered, so
-    callers can safely write keys without type-checking.  Returns
-    ``(plat_data, extra)`` for in-place mutation.
-    """
-    plat_data = platforms_data.setdefault(name, {})
-    if not isinstance(plat_data, dict):
-        plat_data = {}
-        platforms_data[name] = plat_data
-    extra = plat_data.setdefault("extra", {})
-    if not isinstance(extra, dict):
-        extra = {}
-        plat_data["extra"] = extra
-    return plat_data, extra
-
-
 # Module-level cache for bundled platform plugin names (lives outside the
 # enum so it doesn't become an accidental enum member).
 _Platform__bundled_plugin_names: Optional[set] = None
@@ -735,10 +717,6 @@ def load_gateway_config() -> GatewayConfig:
                gw_data["thread_sessions_per_user"] = yaml_cfg["thread_sessions_per_user"]

            streaming_cfg = yaml_cfg.get("streaming")
-            if not isinstance(streaming_cfg, dict):
-                # Fall back to nested gateway.streaming written by
-                # ``hermes config set gateway.streaming.*``
-                streaming_cfg = yaml_cfg.get("gateway", {}).get("streaming")
            if isinstance(streaming_cfg, dict):
                gw_data["streaming"] = streaming_cfg

@@ -777,27 +755,7 @@ def load_gateway_config() -> GatewayConfig:
                        merged["extra"] = merged_extra
                    platforms_data[plat_name] = merged
                gw_data["platforms"] = platforms_data
-            # Iterate built-in platforms plus any registered plugin platforms
-            # so plugin authors get the same shared-key bridging (#24836).
-            try:
-                from hermes_cli.plugins import discover_plugins
-                discover_plugins()  # idempotent
-                from gateway.platform_registry import platform_registry as _pr
-            except Exception as e:
-                logger.debug("plugin discovery skipped: %s", e)
-                _pr = None
-
-            _shared_loop_targets: list = list(Platform)
-            if _pr is not None:
-                for _entry in _pr.plugin_entries():
-                    try:
-                        _plat = Platform(_entry.name)
-                    except (ValueError, KeyError):
-                        continue
-                    if _plat not in _shared_loop_targets:
-                        _shared_loop_targets.append(_plat)
-
-            for plat in _shared_loop_targets:
+            for plat in Platform:
                if plat == Platform.LOCAL:
                    continue
                platform_cfg = yaml_cfg.get(plat.value)
@@ -852,38 +810,20 @@ def load_gateway_config() -> GatewayConfig:
                enabled_was_explicit = "enabled" in platform_cfg
                if not bridged and not enabled_was_explicit:
                    continue
-                plat_data, extra = _ensure_platform_extra_dict(platforms_data, plat.value)
+                plat_data = platforms_data.setdefault(plat.value, {})
+                if not isinstance(plat_data, dict):
+                    plat_data = {}
+                    platforms_data[plat.value] = plat_data
                if enabled_was_explicit:
                    plat_data["enabled"] = platform_cfg["enabled"]
+                extra = plat_data.setdefault("extra", {})
+                if not isinstance(extra, dict):
+                    extra = {}
+                    plat_data["extra"] = extra
                if plat == Platform.SLACK and enabled_was_explicit:
                    extra["_enabled_explicit"] = True
                extra.update(bridged)

-            # Plugin-owned YAML→env config bridges (#24836).  See
-            # ``PlatformEntry.apply_yaml_config_fn`` for the hook contract.
-            # Order: shared-key loop (above) → this dispatch → legacy hardcoded
-            # blocks (below; no-op when a hook already set their env var) →
-            # ``_apply_env_overrides()`` after ``GatewayConfig.from_dict``.
-            if _pr is not None:
-                for entry in _pr.all_entries():
-                    if entry.apply_yaml_config_fn is None:
-                        continue
-                    platform_cfg = yaml_cfg.get(entry.name)
-                    if not isinstance(platform_cfg, dict):
-                        continue
-                    try:
-                        seeded = entry.apply_yaml_config_fn(yaml_cfg, platform_cfg)
-                    except Exception as e:
-                        logger.debug(
-                            "apply_yaml_config_fn for %s raised: %s",
-                            entry.name, e,
-                        )
-                        continue
-                    if not isinstance(seeded, dict) or not seeded:
-                        continue
-                    _, extra = _ensure_platform_extra_dict(platforms_data, entry.name)
-                    extra.update(seeded)
-
            # Slack settings → env vars (env vars take precedence)
            slack_cfg = yaml_cfg.get("slack", {})
            if isinstance(slack_cfg, dict):
@@ -912,8 +852,6 @@ def load_gateway_config() -> GatewayConfig:
            if isinstance(discord_cfg, dict):
                if "require_mention" in discord_cfg and not os.getenv("DISCORD_REQUIRE_MENTION"):
                    os.environ["DISCORD_REQUIRE_MENTION"] = str(discord_cfg["require_mention"]).lower()
-                if "thread_require_mention" in discord_cfg and not os.getenv("DISCORD_THREAD_REQUIRE_MENTION"):
-                    os.environ["DISCORD_THREAD_REQUIRE_MENTION"] = str(discord_cfg["thread_require_mention"]).lower()
                frc = discord_cfg.get("free_response_channels")
                if frc is not None and not os.getenv("DISCORD_FREE_RESPONSE_CHANNELS"):
                    if isinstance(frc, list):
@@ -119,22 +119,6 @@ class PlatformEntry:
    # Signature: () -> Optional[dict[str, Any]]
    env_enablement_fn: Optional[Callable[[], Optional[dict]]] = None

-    # ── YAML→env config bridge ──
-    # Optional: translate this platform's ``config.yaml`` keys into env vars
-    # and/or seed ``PlatformConfig.extra`` directly.  Lets a plugin own its
-    # YAML config translation instead of forcing core ``gateway/config.py``
-    # to know every platform's schema.
-    #
-    # Signature: (yaml_cfg: dict, platform_cfg: dict) -> Optional[dict]
-    # Called from ``load_gateway_config()`` after the generic shared-key loop
-    # and before ``_apply_env_overrides``.  Mutating ``os.environ`` is allowed
-    # (use ``not os.getenv(...)`` guards to preserve env > YAML precedence);
-    # any returned dict is merged into ``PlatformConfig.extra``.  Exceptions
-    # are caught and logged at debug level.
-    # See website/docs/developer-guide/adding-platform-adapters.md for the
-    # full contract and a worked example.
-    apply_yaml_config_fn: Optional[Callable[[dict, dict], Optional[dict]]] = None
-
    # Optional: home-channel env var name for cron/notification delivery
    # (e.g. ``"IRC_HOME_CHANNEL"``).  When set, ``cron.scheduler`` treats this
    # platform as a valid ``deliver=<name>`` target and reads the env var to
@@ -21,14 +21,6 @@ status display, gateway setup, and more.
  constructed.  Without this, env-only setups don't surface in
  `hermes gateway status` or `get_connected_platforms()` until the SDK
  instantiates.
-- `apply_yaml_config_fn: (yaml_cfg, platform_cfg) -> Optional[dict]` —
-  translate this platform's `config.yaml` keys into env vars and/or seed
-  `PlatformConfig.extra` directly.  Lets a plugin own its YAML schema
-  instead of growing core `gateway/config.py` boilerplate per platform.
-  Mutating `os.environ` is allowed (use `not os.getenv(...)` guards to
-  preserve env > YAML precedence); the returned dict is merged into
-  `PlatformConfig.extra`.  Called during `load_gateway_config()` after
-  the generic shared-key loop and before `_apply_env_overrides()`.
 - `cron_deliver_env_var: str` — name of the `*_HOME_CHANNEL` env var.  When
  set, `deliver=<name>` cron jobs route to this var without editing
  `cron/scheduler.py`'s hardcoded sets.
@@ -1168,9 +1168,6 @@ class APIServerAdapter(BasePlatformAdapter):
                agent_ref=agent_ref,
                gateway_session_key=gateway_session_key,
            ))
-            # Ensure SSE drain loops can terminate without relying on polling
-            # agent_task.done(), which can race with queue timeout checks.
-            agent_task.add_done_callback(lambda _fut: _stream_q.put(None))

            return await self._write_sse_chat_completion(
                request, completion_id, model_name, created, _stream_q,
@@ -2200,9 +2197,6 @@ class APIServerAdapter(BasePlatformAdapter):
                agent_ref=agent_ref,
                gateway_session_key=gateway_session_key,
            ))
-            # Ensure SSE drain loops can terminate without relying on polling
-            # agent_task.done(), which can race with queue timeout checks.
-            agent_task.add_done_callback(lambda _fut: _stream_q.put(None))

            response_id = f"resp_{uuid.uuid4().hex[:28]}"
            model_name = body.get("model", self._model_name)
@@ -1,7 +1,7 @@
 """
 Base platform adapter interface.

-All platform adapters (Telegram, Discord, WhatsApp, Weixin, and more) inherit from this
+All platform adapters (Telegram, Discord, WhatsApp) inherit from this
 and implement the required methods.
 """

@@ -1743,63 +1743,6 @@ class BasePlatformAdapter(ABC):
        """
        return SendResult(success=False, error="Not supported")

-    async def send_clarify(
-        self,
-        chat_id: str,
-        question: str,
-        choices: Optional[list],
-        clarify_id: str,
-        session_key: str,
-        metadata: Optional[Dict[str, Any]] = None,
-    ) -> SendResult:
-        """Send a clarify prompt to the user.
-
-        Two render modes:
-
-          * **Multiple choice** (``choices`` is a non-empty list) — adapters
-            that override this should render inline buttons (one per choice
-            plus a final "Other" / free-text option).  Button callbacks
-            MUST resolve via
-            ``tools.clarify_gateway.resolve_gateway_clarify(clarify_id, response)``
-            with the chosen string.  Picking the "Other" button calls
-            ``mark_awaiting_text(clarify_id)`` so the next message in the
-            session is captured as the response.
-
-          * **Open-ended** (``choices`` is None or empty) — render the
-            question as a plain text message; the next user message in the
-            session is captured by the gateway's text-intercept and
-            resolves the clarify automatically (see
-            ``GatewayRunner._maybe_intercept_clarify_text``).
-
-        The default implementation falls back to a numbered text list,
-        which works on every platform — the user replies with a number
-        ("2") or with the literal choice text, and the gateway intercepts
-        and resolves.  For the text fallback path, the default calls
-        ``mark_awaiting_text()`` so that the gateway text-intercept
-        (:meth:`GatewayRunner._maybe_intercept_clarify_text`) catches the
-        user's reply instead of timing out.
-        Adapters with native button UIs (Telegram, Discord) SHOULD
-        override this for a richer UX.
-        """
-        if choices:
-            lines = [f"❓ {question}", ""]
-            for i, choice in enumerate(choices, start=1):
-                lines.append(f"  {i}. {choice}")
-            lines.append("")
-            lines.append("Reply with the number, the option text, or your own answer.")
-            text = "\n".join(lines)
-            # Text fallback: enable text-capture so the gateway intercept
-            # picks up the user's typed reply (e.g. "2" or choice text).
-            from tools.clarify_gateway import mark_awaiting_text
-            mark_awaiting_text(clarify_id)
-        else:
-            text = f"❓ {question}"
-        return await self.send(
-            chat_id=chat_id,
-            content=text,
-            metadata=metadata,
-        )
-
    async def send_private_notice(
        self,
        chat_id: str,
@@ -2888,58 +2831,6 @@ class BasePlatformAdapter(ABC):
                    logger.error("[%s] Command '/%s' dispatch failed: %s", self.name, cmd, e, exc_info=True)
                return

-            # Clarify text-capture bypass: if the agent is blocked on a
-            # clarify_tool call awaiting a free-form text response (open-
-            # ended clarify, or user picked "Other"), the next non-command
-            # message in this session MUST reach the runner so the
-            # clarify-intercept can resolve it and unblock the agent.
-            #
-            # Without this bypass: the message gets queued in
-            # _pending_messages AND triggers an interrupt, killing the
-            # agent run mid-clarify and discarding the user's answer.
-            # Same shape as the /approve deadlock fix (PR #4926) — both
-            # cases are "agent thread blocked on Event.wait, message must
-            # reach the resolver before being treated as a new turn."
-            if not cmd:
-                try:
-                    from tools import clarify_gateway as _clarify_mod
-                    _has_text_clarify = (
-                        _clarify_mod.get_pending_for_session(session_key) is not None
-                    )
-                except Exception:
-                    _has_text_clarify = False
-
-                if _has_text_clarify:
-                    logger.debug(
-                        "[%s] Routing message to clarify text-intercept for %s",
-                        self.name, session_key,
-                    )
-                    try:
-                        _thread_meta = _thread_metadata_for_source(
-                            event.source, _reply_anchor_for_event(event)
-                        )
-                        response = await self._message_handler(event)
-                        _text, _eph_ttl = self._unwrap_ephemeral(response)
-                        if _text:
-                            _r = await self._send_with_retry(
-                                chat_id=event.source.chat_id,
-                                content=_text,
-                                reply_to=_reply_anchor_for_event(event),
-                                metadata=_thread_meta,
-                            )
-                            if _eph_ttl > 0 and _r.success and _r.message_id:
-                                self._schedule_ephemeral_delete(
-                                    chat_id=event.source.chat_id,
-                                    message_id=_r.message_id,
-                                    ttl_seconds=_eph_ttl,
-                                )
-                    except Exception as e:
-                        logger.error(
-                            "[%s] Clarify text-intercept dispatch failed: %s",
-                            self.name, e, exc_info=True,
-                        )
-                    return
-
            if self._busy_session_handler is not None:
                try:
                    if await self._busy_session_handler(event, session_key):
@@ -111,33 +111,9 @@ DINGTALK_TYPE_MAPPING = {


 def check_dingtalk_requirements() -> bool:
-    """Check if DingTalk dependencies are available and configured.
-
-    Lazy-installs dingtalk-stream via ``tools.lazy_deps.ensure("platform.dingtalk")``
-    on first call if not present.
-    """
-    global DINGTALK_STREAM_AVAILABLE, dingtalk_stream, ChatbotMessage, CallbackMessage, AckMessage
-    global HTTPX_AVAILABLE, httpx
+    """Check if DingTalk dependencies are available and configured."""
    if not DINGTALK_STREAM_AVAILABLE or not HTTPX_AVAILABLE:
-        try:
-            from tools.lazy_deps import ensure as _lazy_ensure
-            _lazy_ensure("platform.dingtalk", prompt=False)
-        except Exception:
-            return False
-        try:
-            import dingtalk_stream as _ds
-            from dingtalk_stream import ChatbotMessage as _CM
-            from dingtalk_stream.frames import CallbackMessage as _CBM, AckMessage as _AM
-            import httpx as _httpx
-        except ImportError:
-            return False
-        dingtalk_stream = _ds
-        ChatbotMessage = _CM
-        CallbackMessage = _CBM
-        AckMessage = _AM
-        httpx = _httpx
-        DINGTALK_STREAM_AVAILABLE = True
-        HTTPX_AVAILABLE = True
+        return False
    if not os.getenv("DINGTALK_CLIENT_ID") or not os.getenv("DINGTALK_CLIENT_SECRET"):
        return False
    return True
@@ -86,32 +86,8 @@ def _clean_discord_id(entry: str) -> str:


 def check_discord_requirements() -> bool:
-    """Check if Discord dependencies are available.
-
-    Lazy-installs discord.py via ``tools.lazy_deps.ensure("platform.discord")``
-    on first call if not present. After successful install, re-binds module
-    globals so ``DISCORD_AVAILABLE`` becomes True.
-    """
-    global DISCORD_AVAILABLE, discord, DiscordMessage, Intents, commands
-    if DISCORD_AVAILABLE:
-        return True
-    try:
-        from tools.lazy_deps import ensure as _lazy_ensure
-        _lazy_ensure("platform.discord", prompt=False)
-    except Exception:
-        return False
-    try:
-        import discord as _discord
-        from discord import Message as _DM, Intents as _Intents
-        from discord.ext import commands as _commands
-    except ImportError:
-        return False
-    discord = _discord
-    DiscordMessage = _DM
-    Intents = _Intents
-    commands = _commands
-    DISCORD_AVAILABLE = True
-    return True
+    """Check if Discord dependencies are available."""
+    return DISCORD_AVAILABLE


 def _build_allowed_mentions():
@@ -3577,25 +3553,6 @@ class DiscordAdapter(BasePlatformAdapter):
            return {part.strip() for part in s.split(",") if part.strip()}
        return set()

-    def _discord_thread_require_mention(self) -> bool:
-        """Return whether thread participation requires @mention to follow up.
-
-        When ``False`` (default), once the bot has participated in a thread it
-        keeps responding to every message in that thread without needing to be
-        mentioned again — useful for one-on-one conversations.
-
-        When ``True``, the @mention requirement is enforced inside threads as
-        well.  Set this when multiple bots share a thread and you want each
-        one to only fire on explicit @mention, avoiding bot-to-bot loops or
-        unwanted cross-replies.
-        """
-        configured = self.config.extra.get("thread_require_mention")
-        if configured is not None:
-            if isinstance(configured, str):
-                return configured.lower() not in ("false", "0", "no", "off")
-            return bool(configured)
-        return os.getenv("DISCORD_THREAD_REQUIRE_MENTION", "false").lower() in ("true", "1", "yes", "on")
-
    def _thread_parent_channel(self, channel: Any) -> Any:
        """Return the parent text channel when invoked from a thread."""
        return getattr(channel, "parent", None) or channel
@@ -3896,84 +3853,6 @@ class DiscordAdapter(BasePlatformAdapter):
        except Exception as e:
            return SendResult(success=False, error=str(e))

-    async def send_clarify(
-        self,
-        chat_id: str,
-        question: str,
-        choices: Optional[list],
-        clarify_id: str,
-        session_key: str,
-        metadata: Optional[Dict[str, Any]] = None,
-    ) -> SendResult:
-        """Render a clarify prompt with one Discord button per choice.
-
-        Multi-choice mode (``choices`` non-empty): renders a button per option
-        plus a final "✏️ Other (type answer)" button. Picking "Other" flips
-        the clarify entry into text-capture mode so the next user message in
-        the session becomes the response. Numeric clicks resolve immediately
-        via ``resolve_gateway_clarify(clarify_id, choice_text)``.
-
-        Open-ended mode (``choices`` empty/None): renders the question as
-        plain embed text — no buttons. The gateway's text-intercept captures
-        the next message in this session and resolves the clarify.
-        """
-        if not self._client or not DISCORD_AVAILABLE:
-            return SendResult(success=False, error="Not connected")
-
-        try:
-            target_id = chat_id
-            if metadata and metadata.get("thread_id"):
-                target_id = metadata["thread_id"]
-
-            channel = self._client.get_channel(int(target_id))
-            if not channel:
-                channel = await self._client.fetch_channel(int(target_id))
-
-            # Discord embed description limit is 4096; trim conservatively.
-            max_desc = 4088
-            body = str(question or "").strip()
-            if len(body) > max_desc:
-                body = body[: max_desc - 3] + "..."
-
-            embed = discord.Embed(
-                title="❓ Hermes needs your input",
-                description=body,
-                color=discord.Color.orange(),
-            )
-
-            clean_choices = [
-                str(c).strip() for c in (choices or []) if c is not None and str(c).strip()
-            ]
-            # Discord allows up to 5 buttons per row, 5 rows per view = 25.
-            # We reserve one slot for the "Other" button, so cap at 24 choices.
-            clean_choices = clean_choices[:24]
-
-            if clean_choices:
-                embed.add_field(
-                    name="Choices",
-                    value="Pick one below, or click ✏️ Other to type a custom answer.",
-                    inline=False,
-                )
-                view = ClarifyChoiceView(
-                    choices=clean_choices,
-                    clarify_id=clarify_id,
-                    allowed_user_ids=self._allowed_user_ids,
-                    allowed_role_ids=self._allowed_role_ids,
-                )
-            else:
-                embed.add_field(
-                    name="Reply",
-                    value="Reply in this channel with your answer.",
-                    inline=False,
-                )
-                view = None
-
-            msg = await channel.send(embed=embed, view=view) if view else await channel.send(embed=embed)
-            return SendResult(success=True, message_id=str(msg.id))
-        except Exception as e:
-            logger.warning("[%s] send_clarify failed: %s", self.name, e)
-            return SendResult(success=False, error=str(e))
-
    async def send_update_prompt(
        self, chat_id: str, prompt: str, default: str = "",
        session_key: str = "",
@@ -4264,17 +4143,6 @@ class DiscordAdapter(BasePlatformAdapter):
        raw_content = message.content.strip()
        normalized_content = raw_content
        mention_prefix = False
-
-        snapshot_attachments = []
-        if hasattr(message, "message_snapshots") and message.message_snapshots:
-            snapshot_text_parts = []
-            for snap in message.message_snapshots:
-                if getattr(snap, "content", None):
-                    snapshot_text_parts.append(snap.content.strip())
-                snapshot_attachments.extend(getattr(snap, "attachments", []) or [])
-            if snapshot_text_parts and not raw_content:
-                raw_content = "\n".join(snapshot_text_parts)
-                normalized_content = raw_content
        if self._client.user and self._client.user in message.mentions:
            mention_prefix = True
            normalized_content = normalized_content.replace(f"<@{self._client.user.id}>", "").strip()
@@ -4317,15 +4185,8 @@ class DiscordAdapter(BasePlatformAdapter):
            )

            # Skip the mention check if the message is in a thread where
-            # the bot has previously participated (auto-created or replied in)
-            # — UNLESS thread_require_mention is enabled, in which case threads
-            # are gated the same as channels.  Useful when multiple bots share
-            # a thread.
-            in_bot_thread = (
-                is_thread
-                and thread_id in self._threads
-                and not self._discord_thread_require_mention()
-            )
+            # the bot has previously participated (auto-created or replied in).
+            in_bot_thread = is_thread and thread_id in self._threads

            if require_mention and not is_free_channel and not in_bot_thread:
                if self._client.user not in message.mentions and not mention_prefix:
@@ -4338,7 +4199,7 @@ class DiscordAdapter(BasePlatformAdapter):
        if not is_thread and not isinstance(message.channel, discord.DMChannel):
            no_thread_channels_raw = os.getenv("DISCORD_NO_THREAD_CHANNELS", "")
            no_thread_channels = {ch.strip() for ch in no_thread_channels_raw.split(",") if ch.strip()}
-            skip_thread = bool(channel_ids & no_thread_channels) or is_free_channel
+            skip_thread = bool(channel_ids & no_thread_channels)
            auto_thread = os.getenv("DISCORD_AUTO_THREAD", "true").lower() in {"true", "1", "yes"}
            is_reply_message = getattr(message, "type", None) == discord.MessageType.reply
            if auto_thread and not skip_thread and not is_voice_linked_channel and not is_reply_message:
@@ -4350,15 +4211,13 @@ class DiscordAdapter(BasePlatformAdapter):
                    auto_threaded_channel = thread
                    self._threads.mark(thread_id)

-        all_attachments = list(message.attachments) + snapshot_attachments
-
        # Determine message type
        msg_type = MessageType.TEXT
        if normalized_content.startswith("/"):
            msg_type = MessageType.COMMAND
-        elif all_attachments:
+        elif message.attachments:
            # Check attachment types
-            for att in all_attachments:
+            for att in message.attachments:
                if att.content_type:
                    if att.content_type.startswith("image/"):
                        msg_type = MessageType.PHOTO
@@ -4417,7 +4276,7 @@ class DiscordAdapter(BasePlatformAdapter):
        media_urls = []
        media_types = []
        pending_text_injection: Optional[str] = None
-        for att in all_attachments:
+        for att in message.attachments:
            content_type = att.content_type or "unknown"
            if content_type.startswith("image/"):
                try:
@@ -5216,188 +5075,3 @@ if DISCORD_AVAILABLE:
        async def on_timeout(self):
            self.resolved = True
            self.clear_items()
-
-
-    class ClarifyChoiceView(discord.ui.View):
-        """Interactive button view for the clarify tool's multiple-choice prompts.
-
-        Renders one button per choice (max 24) plus a final ``✏️ Other`` button.
-        Picking a numeric choice resolves the gateway clarify entry immediately;
-        picking ``Other`` flips the entry into text-capture mode so the next
-        user message in the session becomes the response (the gateway's
-        text-intercept handles the resolution).
-
-        Auth gating mirrors ``ExecApprovalView`` — only users/roles in the
-        Discord adapter's allowlist may answer. Single-use: after the first
-        valid click all buttons disable and the embed updates to show who
-        answered and what they chose.
-        """
-
-        def __init__(
-            self,
-            choices: List[str],
-            clarify_id: str,
-            allowed_user_ids: set,
-            allowed_role_ids: Optional[set] = None,
-        ):
-            super().__init__(timeout=300)  # 5-minute timeout
-            self.choices = list(choices)[:24]
-            self.clarify_id = clarify_id
-            self.allowed_user_ids = allowed_user_ids
-            self.allowed_role_ids = allowed_role_ids or set()
-            self.resolved = False
-
-            for index, choice in enumerate(self.choices):
-                # Discord button labels are capped at 80 chars.
-                label_body = choice if len(choice) <= 75 else choice[:72] + "..."
-                button = discord.ui.Button(
-                    label=f"{index + 1}. {label_body}",
-                    style=discord.ButtonStyle.primary,
-                    custom_id=f"clarify:{clarify_id}:{index}",
-                )
-                button.callback = self._make_choice_callback(index, choice)
-                self.add_item(button)
-
-            other_btn = discord.ui.Button(
-                label="✏️ Other (type answer)",
-                style=discord.ButtonStyle.secondary,
-                custom_id=f"clarify:{clarify_id}:other",
-            )
-            other_btn.callback = self._on_other
-            self.add_item(other_btn)
-
-        def _check_auth(self, interaction: "discord.Interaction") -> bool:
-            return _component_check_auth(
-                interaction, self.allowed_user_ids, self.allowed_role_ids,
-            )
-
-        def _make_choice_callback(self, index: int, choice: str):
-            async def _callback(interaction: "discord.Interaction"):
-                await self._resolve_choice(interaction, index, choice)
-            return _callback
-
-        async def _resolve_choice(
-            self,
-            interaction: "discord.Interaction",
-            index: int,
-            choice: str,
-        ) -> None:
-            """Resolve the clarify with a chosen option."""
-            if self.resolved:
-                await interaction.response.send_message(
-                    "This prompt has already been answered~", ephemeral=True,
-                )
-                return
-            if not self._check_auth(interaction):
-                await interaction.response.send_message(
-                    "You're not authorized to answer this prompt~", ephemeral=True,
-                )
-                return
-
-            self.resolved = True
-            for child in self.children:
-                child.disabled = True
-
-            embed = interaction.message.embeds[0] if (
-                interaction.message and interaction.message.embeds
-            ) else None
-            if embed:
-                user = getattr(interaction, "user", None)
-                display_name = getattr(user, "display_name", "user")
-                embed.color = discord.Color.green()
-                embed.set_footer(text=f"Answered by {display_name}: {choice}")
-
-            try:
-                await interaction.response.edit_message(embed=embed, view=self)
-            except Exception:
-                logger.debug(
-                    "Discord clarify edit_message failed for %s",
-                    self.clarify_id,
-                    exc_info=True,
-                )
-                try:
-                    await interaction.response.defer()
-                except Exception:
-                    pass
-
-            # Resolve via the gateway clarify primitive — same mechanism as
-            # Telegram. Look up the canonical choice text from the entry so
-            # we round-trip the original value, not a button-label variant.
-            resolved_text: Optional[str] = None
-            try:
-                from tools.clarify_gateway import _entries as _clarify_entries  # type: ignore
-                entry = _clarify_entries.get(self.clarify_id)
-                if entry and entry.choices and 0 <= index < len(entry.choices):
-                    resolved_text = entry.choices[index]
-            except Exception:
-                resolved_text = None
-            if resolved_text is None:
-                resolved_text = choice
-
-            try:
-                from tools.clarify_gateway import resolve_gateway_clarify
-                resolved = resolve_gateway_clarify(self.clarify_id, resolved_text)
-                logger.info(
-                    "Discord clarify button resolved (id=%s, choice=%r, user=%s, ok=%s)",
-                    self.clarify_id, resolved_text,
-                    getattr(getattr(interaction, "user", None), "display_name", "?"),
-                    resolved,
-                )
-            except Exception as exc:
-                logger.error(
-                    "Discord clarify resolve_gateway_clarify failed (id=%s): %s",
-                    self.clarify_id, exc,
-                )
-
-        async def _on_other(self, interaction: "discord.Interaction") -> None:
-            """Flip the clarify entry into text-capture mode."""
-            if self.resolved:
-                await interaction.response.send_message(
-                    "This prompt has already been answered~", ephemeral=True,
-                )
-                return
-            if not self._check_auth(interaction):
-                await interaction.response.send_message(
-                    "You're not authorized to answer this prompt~", ephemeral=True,
-                )
-                return
-
-            # Don't pop the entry — the gateway's text-intercept needs it
-            # until the user actually types. Just mark it as awaiting text
-            # and disable the buttons so the user can't double-click.
-            try:
-                from tools.clarify_gateway import mark_awaiting_text
-                mark_awaiting_text(self.clarify_id)
-            except Exception as exc:
-                logger.warning(
-                    "Discord clarify mark_awaiting_text failed (id=%s): %s",
-                    self.clarify_id, exc,
-                )
-
-            self.resolved = True
-            for child in self.children:
-                child.disabled = True
-
-            embed = interaction.message.embeds[0] if (
-                interaction.message and interaction.message.embeds
-            ) else None
-            if embed:
-                user = getattr(interaction, "user", None)
-                display_name = getattr(user, "display_name", "user")
-                embed.color = discord.Color.blue()
-                embed.set_footer(
-                    text=f"Awaiting typed response from {display_name}…",
-                )
-
-            try:
-                await interaction.response.edit_message(embed=embed, view=self)
-            except Exception:
-                try:
-                    await interaction.response.defer()
-                except Exception:
-                    pass
-
-        async def on_timeout(self):
-            self.resolved = True
-            for child in self.children:
-                child.disabled = True
@@ -1300,12 +1300,12 @@ def _run_official_feishu_ws_client(ws_client: Any, adapter: Any) -> None:
        except Exception:
            logger.debug("[Feishu] Failed to apply websocket runtime overrides", exc_info=True)

-    def _connect_with_overrides(*args: Any, **kwargs: Any) -> Any:
+    async def _connect_with_overrides(*args: Any, **kwargs: Any) -> Any:
        if adapter._ws_ping_interval is not None and "ping_interval" not in kwargs:
            kwargs["ping_interval"] = adapter._ws_ping_interval
        if adapter._ws_ping_timeout is not None and "ping_timeout" not in kwargs:
            kwargs["ping_timeout"] = adapter._ws_ping_timeout
-        return original_connect(*args, **kwargs)
+        return await original_connect(*args, **kwargs)

    def _configure_with_overrides(conf: Any) -> Any:
        if original_configure is None:
@@ -1343,65 +1343,8 @@ def _run_official_feishu_ws_client(ws_client: Any, adapter: Any) -> None:


 def check_feishu_requirements() -> bool:
-    """Check if Feishu/Lark dependencies are available.
-
-    Lazy-installs lark-oapi via ``tools.lazy_deps.ensure("platform.feishu")``
-    on first call if not present. Rebinds all module-level globals on success.
-    """
-    if FEISHU_AVAILABLE:
-        return True
-
-    def _import():
-        import lark_oapi as lark
-        from lark_oapi.api.application.v6 import GetApplicationRequest
-        from lark_oapi.api.im.v1 import (
-            CreateFileRequest, CreateFileRequestBody,
-            CreateImageRequest, CreateImageRequestBody,
-            CreateMessageRequest, CreateMessageRequestBody,
-            GetChatRequest, GetMessageRequest, GetMessageResourceRequest,
-            P2ImMessageMessageReadV1,
-            ReplyMessageRequest, ReplyMessageRequestBody,
-            UpdateMessageRequest, UpdateMessageRequestBody,
-        )
-        from lark_oapi.core import AccessTokenType, HttpMethod
-        from lark_oapi.core.const import FEISHU_DOMAIN, LARK_DOMAIN
-        from lark_oapi.core.model import BaseRequest
-        from lark_oapi.event.callback.model.p2_card_action_trigger import (
-            CallBackCard, P2CardActionTriggerResponse,
-        )
-        from lark_oapi.event.dispatcher_handler import EventDispatcherHandler
-        from lark_oapi.ws import Client as FeishuWSClient
-        return {
-            "lark": lark,
-            "GetApplicationRequest": GetApplicationRequest,
-            "CreateFileRequest": CreateFileRequest,
-            "CreateFileRequestBody": CreateFileRequestBody,
-            "CreateImageRequest": CreateImageRequest,
-            "CreateImageRequestBody": CreateImageRequestBody,
-            "CreateMessageRequest": CreateMessageRequest,
-            "CreateMessageRequestBody": CreateMessageRequestBody,
-            "GetChatRequest": GetChatRequest,
-            "GetMessageRequest": GetMessageRequest,
-            "GetMessageResourceRequest": GetMessageResourceRequest,
-            "P2ImMessageMessageReadV1": P2ImMessageMessageReadV1,
-            "ReplyMessageRequest": ReplyMessageRequest,
-            "ReplyMessageRequestBody": ReplyMessageRequestBody,
-            "UpdateMessageRequest": UpdateMessageRequest,
-            "UpdateMessageRequestBody": UpdateMessageRequestBody,
-            "AccessTokenType": AccessTokenType,
-            "HttpMethod": HttpMethod,
-            "FEISHU_DOMAIN": FEISHU_DOMAIN,
-            "LARK_DOMAIN": LARK_DOMAIN,
-            "BaseRequest": BaseRequest,
-            "CallBackCard": CallBackCard,
-            "P2CardActionTriggerResponse": P2CardActionTriggerResponse,
-            "EventDispatcherHandler": EventDispatcherHandler,
-            "FeishuWSClient": FeishuWSClient,
-            "FEISHU_AVAILABLE": True,
-        }
-
-    from tools.lazy_deps import ensure_and_bind
-    return ensure_and_bind("platform.feishu", _import, globals(), prompt=False)
+    """Check if Feishu/Lark dependencies are available."""
+    return FEISHU_AVAILABLE


 class FeishuAdapter(BasePlatformAdapter):
@@ -224,11 +224,7 @@ def _check_e2ee_deps() -> bool:


 def check_matrix_requirements() -> bool:
-    """Return True if the Matrix adapter can be used.
-
-    Lazy-installs mautrix via ``tools.lazy_deps.ensure("platform.matrix")``
-    on first call if not present. Rebinds all module-level type globals on success.
-    """
+    """Return True if the Matrix adapter can be used."""
    token = os.getenv("MATRIX_ACCESS_TOKEN", "")
    password = os.getenv("MATRIX_PASSWORD", "")
    homeserver = os.getenv("MATRIX_HOMESERVER", "")
@@ -242,31 +238,10 @@ def check_matrix_requirements() -> bool:
    try:
        import mautrix  # noqa: F401
    except ImportError:
-        def _import():
-            from mautrix.types import (
-                ContentURI, EventID, EventType, PaginationDirection,
-                PresenceState, RoomCreatePreset, RoomID, SyncToken,
-                TrustState, UserID,
-            )
-            return {
-                "ContentURI": ContentURI,
-                "EventID": EventID,
-                "EventType": EventType,
-                "PaginationDirection": PaginationDirection,
-                "PresenceState": PresenceState,
-                "RoomCreatePreset": RoomCreatePreset,
-                "RoomID": RoomID,
-                "SyncToken": SyncToken,
-                "TrustState": TrustState,
-                "UserID": UserID,
-            }
-
-        from tools.lazy_deps import ensure_and_bind
-        if not ensure_and_bind("platform.matrix", _import, globals(), prompt=False):
-            logger.warning(
-                "Matrix: mautrix not installed. Run: pip install 'mautrix[encryption]'"
-            )
-            return False
+        logger.warning(
+            "Matrix: mautrix not installed. Run: pip install 'mautrix[encryption]'"
+        )
+        return False

    # If encryption is requested, verify E2EE deps are available at startup
    # rather than silently degrading to plaintext-only at connect time.
@@ -176,28 +176,6 @@ class QQAdapter(BasePlatformAdapter):
                fut.set_exception(RuntimeError(reason))
        self._pending_responses.clear()

-    def _mark_transport_disconnected(self) -> None:
-        """Mark QQ WS down without stopping the reconnect loop.
-
-        BasePlatformAdapter uses _running for both process lifecycle and
-        connection status. QQBot needs to keep the listener task alive across
-        transient transport drops so it can continue reconnect attempts after a
-        short-lived gateway or network failure.
-        """
-        if self.has_fatal_error:
-            return
-        self._write_runtime_status_safe(
-            "disconnected",
-            platform_state="disconnected",
-            error_code=None,
-            error_message=None,
-        )
-
-    @property
-    def is_connected(self) -> bool:
-        """Return True only when the QQ WebSocket transport is usable."""
-        return bool(self._running and self._ws and not self._ws.closed)
-
    def __init__(self, config: PlatformConfig):
        super().__init__(config, Platform.QQBOT)

@@ -531,7 +509,7 @@ class QQAdapter(BasePlatformAdapter):
                else:
                    quick_disconnect_count = 0

-                self._mark_transport_disconnected()
+                self._mark_disconnected()
                self._fail_pending("Connection closed")

                # Stop reconnecting for fatal codes
@@ -553,7 +531,6 @@ class QQAdapter(BasePlatformAdapter):
                        RATE_LIMIT_DELAY,
                    )
                    if backoff_idx >= MAX_RECONNECT_ATTEMPTS:
-                        self._mark_disconnected()
                        return
                    await asyncio.sleep(RATE_LIMIT_DELAY)
                    if await self._reconnect(backoff_idx):
@@ -607,19 +584,17 @@ class QQAdapter(BasePlatformAdapter):
                    backoff_idx += 1
                    if backoff_idx >= MAX_RECONNECT_ATTEMPTS:
                        logger.error("[%s] Max reconnect attempts reached (QQCloseError)", self._log_tag)
-                        self._mark_disconnected()
                        return

            except Exception as exc:
                if not self._running:
                    return
                logger.warning("[%s] WebSocket error: %s", self._log_tag, exc)
-                self._mark_transport_disconnected()
+                self._mark_disconnected()
                self._fail_pending("Connection interrupted")

                if backoff_idx >= MAX_RECONNECT_ATTEMPTS:
                    logger.error("[%s] Max reconnect attempts reached", self._log_tag)
-                    self._mark_disconnected()
                    return

                if await self._reconnect(backoff_idx):
@@ -446,9 +446,7 @@ class SignalAdapter(BasePlatformAdapter):
                if sent_msg and isinstance(sent_msg, dict):
                    dest = sent_msg.get("destinationNumber") or sent_msg.get("destination")
                    sent_ts = sent_msg.get("timestamp")
-                    sent_msg_group_info = sent_msg.get("groupInfo") or {}
-                    sent_msg_group_id = sent_msg_group_info.get("groupId") if sent_msg_group_info else None
-                    if dest == self._account_normalized or sent_msg_group_id:
+                    if dest == self._account_normalized:
                        # Check if this is an echo of our own outbound reply
                        if sent_ts and sent_ts in self._recent_sent_timestamps:
                            self._recent_sent_timestamps.discard(sent_ts)
@@ -73,29 +73,8 @@ class _ThreadContextCache:


 def check_slack_requirements() -> bool:
-    """Check if Slack dependencies are available.
-
-    Lazy-installs slack-bolt/slack-sdk via ``tools.lazy_deps.ensure("platform.slack")``
-    on first call if not present. Rebinds all module-level globals on success.
-    """
-    if SLACK_AVAILABLE:
-        return True
-
-    def _import():
-        from slack_bolt.async_app import AsyncApp
-        from slack_bolt.adapter.socket_mode.async_handler import AsyncSocketModeHandler
-        from slack_sdk.web.async_client import AsyncWebClient
-        import aiohttp
-        return {
-            "AsyncApp": AsyncApp,
-            "AsyncSocketModeHandler": AsyncSocketModeHandler,
-            "AsyncWebClient": AsyncWebClient,
-            "aiohttp": aiohttp,
-            "SLACK_AVAILABLE": True,
-        }
-
-    from tools.lazy_deps import ensure_and_bind
-    return ensure_and_bind("platform.slack", _import, globals(), prompt=False)
+    """Check if Slack dependencies are available."""
+    return SLACK_AVAILABLE


 def _extract_text_from_slack_blocks(blocks: list) -> str:
@@ -1798,26 +1777,6 @@ class SlackAdapter(BasePlatformAdapter):
            return

        original_text = event.get("text", "")
-
-        # Slack blocks native slash commands inside threads ("/queue is not
-        # supported in threads. Sorry!").  As a workaround, recognise a
-        # leading ``!`` as an alternate command prefix and rewrite it to
-        # ``/`` so the rest of the pipeline (MessageType.COMMAND tagging,
-        # gateway dispatcher) handles it like a normal slash command.  Only
-        # rewrite when the first token resolves to a known gateway command
-        # so casual messages like "!nice work" pass through unchanged.
-        if original_text.startswith("!"):
-            try:
-                from hermes_cli.commands import is_gateway_known_command
-                first_token = original_text[1:].split(maxsplit=1)[0]
-                # Strip "@suffix" the same way get_command() does, so
-                # forms like ``!stop@hermes`` still resolve.
-                cmd_name = first_token.split("@", 1)[0].lower()
-                if cmd_name and "/" not in cmd_name and is_gateway_known_command(cmd_name):
-                    original_text = "/" + original_text[1:]
-            except Exception:  # pragma: no cover - defensive
-                pass
-
        text = original_text

        # Extract quoted/forwarded content from Slack blocks.
@@ -103,58 +103,8 @@ _TELEGRAM_IMAGE_EXT_TO_MIME = {


 def check_telegram_requirements() -> bool:
-    """Check if Telegram dependencies are available.
-
-    If python-telegram-bot is missing, attempts to lazy-install it via
-    ``tools.lazy_deps.ensure("platform.telegram")``. After a successful
-    install, re-imports the SDK and flips ``TELEGRAM_AVAILABLE`` to True
-    so the adapter's class-level type aliases get rebound.
-    """
-    global TELEGRAM_AVAILABLE, Update, Bot, Message, InlineKeyboardButton
-    global InlineKeyboardMarkup, LinkPreviewOptions, Application
-    global CommandHandler, CallbackQueryHandler, TelegramMessageHandler
-    global ContextTypes, filters, ParseMode, ChatType, HTTPXRequest
-    if TELEGRAM_AVAILABLE:
-        return True
-    try:
-        from tools.lazy_deps import ensure as _lazy_ensure
-        _lazy_ensure("platform.telegram", prompt=False)
-    except Exception:
-        return False
-    try:
-        from telegram import Update as _Update, Bot as _Bot, Message as _Message
-        from telegram import InlineKeyboardButton as _IKB, InlineKeyboardMarkup as _IKM
-        try:
-            from telegram import LinkPreviewOptions as _LPO
-        except ImportError:
-            _LPO = None
-        from telegram.ext import (
-            Application as _App, CommandHandler as _CH,
-            CallbackQueryHandler as _CQH,
-            MessageHandler as _MH,
-            ContextTypes as _CT, filters as _filters,
-        )
-        from telegram.constants import ParseMode as _PM, ChatType as _CtT
-        from telegram.request import HTTPXRequest as _HR
-    except ImportError:
-        return False
-    Update = _Update
-    Bot = _Bot
-    Message = _Message
-    InlineKeyboardButton = _IKB
-    InlineKeyboardMarkup = _IKM
-    LinkPreviewOptions = _LPO
-    Application = _App
-    CommandHandler = _CH
-    CallbackQueryHandler = _CQH
-    TelegramMessageHandler = _MH
-    ContextTypes = _CT
-    filters = _filters
-    ParseMode = _PM
-    ChatType = _CtT
-    HTTPXRequest = _HR
-    TELEGRAM_AVAILABLE = True
-    return True
+    """Check if Telegram dependencies are available."""
+    return TELEGRAM_AVAILABLE


 # Matches every character that MarkdownV2 requires to be backslash-escaped
@@ -332,13 +282,6 @@ class TelegramAdapter(BasePlatformAdapter):
    MEDIA_GROUP_WAIT_SECONDS = 0.8
    _GENERAL_TOPIC_THREAD_ID = "1"

-    # Telegram's edit_message applies MarkdownV2 formatting only on the
-    # finalize=True path.  Without this flag, stream_consumer._send_or_edit
-    # short-circuits when the raw text is unchanged between the last streamed
-    # edit and the final edit, skipping the plain-text → MarkdownV2 conversion.
-    # Fixes #25710.
-    REQUIRES_EDIT_FINALIZE: bool = True
-
    # Adaptive text-batch ingress: short messages need a tighter delay so the
    # first token reaches the agent fast.  Numbers tuned for "feels instant":
    # ≤320 codepoints (one short paragraph) settles in ~180ms; ≤1024
@@ -434,9 +377,6 @@ class TelegramAdapter(BasePlatformAdapter):
        # Slash-confirm button state: confirm_id → session_key (for /reload-mcp
        # and any other slash-confirm prompts; see GatewayRunner._request_slash_confirm).
        self._slash_confirm_state: Dict[str, str] = {}
-        # Clarify button state: clarify_id → session_key (for the clarify tool's
-        # multiple-choice prompts; see GatewayRunner clarify_callback wiring).
-        self._clarify_state: Dict[str, str] = {}
        # Notification mode for message sends.
        # "important" — only final responses, approvals, and slash confirmations
        #               trigger notifications; tool progress, streaming, status
@@ -2077,7 +2017,7 @@ class TelegramAdapter(BasePlatformAdapter):
            return SendResult(success=False, error="Not connected")
        try:
            default_hint = f" (default: {default})" if default else ""
-            text = self.format_message(f"⚕ *Update needs your input:*\n\n{prompt}{default_hint}")
+            text = f"⚕ *Update needs your input:*\n\n{prompt}{default_hint}"
            keyboard = InlineKeyboardMarkup([
                [
                    InlineKeyboardButton("✓ Yes", callback_data="update_prompt:y"),
@@ -2089,7 +2029,7 @@ class TelegramAdapter(BasePlatformAdapter):
            msg = await self._send_message_with_thread_fallback(
                chat_id=int(chat_id),
                text=text,
-                parse_mode=ParseMode.MARKDOWN_V2,
+                parse_mode=ParseMode.MARKDOWN,
                reply_markup=keyboard,
                reply_to_message_id=reply_to_id,
                **self._thread_kwargs_for_send(
@@ -2225,80 +2165,6 @@ class TelegramAdapter(BasePlatformAdapter):
            logger.warning("[%s] send_slash_confirm failed: %s", self.name, e)
            return SendResult(success=False, error=str(e))

-    async def send_clarify(
-        self,
-        chat_id: str,
-        question: str,
-        choices: Optional[list],
-        clarify_id: str,
-        session_key: str,
-        metadata: Optional[Dict[str, Any]] = None,
-    ) -> SendResult:
-        """Render a clarify prompt with one inline button per choice.
-
-        Multi-choice mode (``choices`` non-empty): renders one button per
-        option plus a final "✏️ Other (type answer)" button.  Picking the
-        "Other" button flips the entry into text-capture mode so the next
-        message becomes the response.
-
-        Open-ended mode (``choices`` empty): renders the question as plain
-        text — no buttons.  The next message in the session is captured by
-        the gateway's text-intercept and resolves the clarify.
-        """
-        if not self._bot:
-            return SendResult(success=False, error="Not connected")
-
-        try:
-            text = f"❓ {_html.escape(question)}"
-            thread_id = self._metadata_thread_id(metadata)
-
-            kwargs: Dict[str, Any] = {
-                "chat_id": int(chat_id),
-                "text": text,
-                "parse_mode": ParseMode.HTML,
-                **self._link_preview_kwargs(),
-            }
-
-            if choices:
-                # Telegram caps callback_data at 64 bytes; keep "cl:<id>:<idx>"
-                # short.  Button label is also capped (~64 chars in practice).
-                rows = []
-                for idx, choice in enumerate(choices):
-                    label = str(choice)
-                    if len(label) > 60:
-                        label = label[:57] + "..."
-                    rows.append([
-                        InlineKeyboardButton(
-                            f"{idx + 1}. {label}",
-                            callback_data=f"cl:{clarify_id}:{idx}",
-                        )
-                    ])
-                rows.append([
-                    InlineKeyboardButton(
-                        "✏️ Other (type answer)",
-                        callback_data=f"cl:{clarify_id}:other",
-                    )
-                ])
-                kwargs["reply_markup"] = InlineKeyboardMarkup(rows)
-
-            reply_to_id = self._reply_to_message_id_for_send(None, metadata)
-            kwargs["reply_to_message_id"] = reply_to_id
-            kwargs.update(
-                self._thread_kwargs_for_send(
-                    chat_id,
-                    thread_id,
-                    metadata,
-                    reply_to_message_id=reply_to_id,
-                )
-            )
-
-            msg = await self._send_message_with_thread_fallback(**kwargs)
-            self._clarify_state[clarify_id] = session_key
-            return SendResult(success=True, message_id=str(msg.message_id))
-        except Exception as e:
-            logger.warning("[%s] send_clarify failed: %s", self.name, e)
-            return SendResult(success=False, error=str(e))
-
    async def send_model_picker(
        self,
        chat_id: str,
@@ -2341,13 +2207,11 @@ class TelegramAdapter(BasePlatformAdapter):
            keyboard = InlineKeyboardMarkup(rows)

            provider_label = get_label(current_provider)
-            text = self.format_message(
-                (
-                    f"⚙ *Model Configuration*\n\n"
-                    f"Current model: `{current_model or 'unknown'}`\n"
-                    f"Provider: {provider_label}\n\n"
-                    f"Select a provider:"
-                )
+            text = (
+                f"⚙ *Model Configuration*\n\n"
+                f"Current model: `{current_model or 'unknown'}`\n"
+                f"Provider: {provider_label}\n\n"
+                f"Select a provider:"
            )

            thread_id = metadata.get("thread_id") if metadata else None
@@ -2355,7 +2219,7 @@ class TelegramAdapter(BasePlatformAdapter):
            msg = await self._send_message_with_thread_fallback(
                chat_id=int(chat_id),
                text=text,
-                parse_mode=ParseMode.MARKDOWN_V2,
+                parse_mode=ParseMode.MARKDOWN,
                reply_markup=keyboard,
                reply_to_message_id=reply_to_id,
                **self._thread_kwargs_for_send(
@@ -2465,14 +2329,12 @@ class TelegramAdapter(BasePlatformAdapter):
            extra = f"\n_{total - shown} more available — type `/model <name>` directly_" if total > shown else ""

            await query.edit_message_text(
-                text=self.format_message(
-                    (
-                        f"⚙ *Model Configuration*\n\n"
-                        f"Provider: *{pname}*{page_info}\n"
-                        f"Select a model:{extra}"
-                    )
+                text=(
+                    f"⚙ *Model Configuration*\n\n"
+                    f"Provider: *{pname}*{page_info}\n"
+                    f"Select a model:{extra}"
                ),
-                parse_mode=ParseMode.MARKDOWN_V2,
+                parse_mode=ParseMode.MARKDOWN,
                reply_markup=keyboard,
            )
            await query.answer()
@@ -2501,14 +2363,12 @@ class TelegramAdapter(BasePlatformAdapter):
            extra = f"\n_{total - shown} more available — type `/model <name>` directly_" if total > shown else ""

            await query.edit_message_text(
-                text=self.format_message(
-                    (
-                        f"⚙ *Model Configuration*\n\n"
-                        f"Provider: *{pname}*{page_info}\n"
-                        f"Select a model:{extra}"
-                    )
+                text=(
+                    f"⚙ *Model Configuration*\n\n"
+                    f"Provider: *{pname}*{page_info}\n"
+                    f"Select a model:{extra}"
                ),
-                parse_mode=ParseMode.MARKDOWN_V2,
+                parse_mode=ParseMode.MARKDOWN,
                reply_markup=keyboard,
            )
            await query.answer()
@@ -2543,8 +2403,8 @@ class TelegramAdapter(BasePlatformAdapter):
            # Edit message to show confirmation, remove buttons
            try:
                await query.edit_message_text(
-                    text=self.format_message(result_text),
-                    parse_mode=ParseMode.MARKDOWN_V2,
+                    text=result_text,
+                    parse_mode=ParseMode.MARKDOWN,
                    reply_markup=None,
                )
            except Exception:
@@ -2584,15 +2444,13 @@ class TelegramAdapter(BasePlatformAdapter):
                provider_label = state["current_provider"]

            await query.edit_message_text(
-                text=self.format_message(
-                    (
-                        f"⚙ *Model Configuration*\n\n"
-                        f"Current model: `{state['current_model'] or 'unknown'}`\n"
-                        f"Provider: {provider_label}\n\n"
-                        f"Select a provider:"
-                    )
+                text=(
+                    f"⚙ *Model Configuration*\n\n"
+                    f"Current model: `{state['current_model'] or 'unknown'}`\n"
+                    f"Provider: {provider_label}\n\n"
+                    f"Select a provider:"
                ),
-                parse_mode=ParseMode.MARKDOWN_V2,
+                parse_mode=ParseMode.MARKDOWN,
                reply_markup=keyboard,
            )
            await query.answer()
@@ -2675,8 +2533,8 @@ class TelegramAdapter(BasePlatformAdapter):
                # Edit message to show decision, remove buttons
                try:
                    await query.edit_message_text(
-                        text=self.format_message(f"{label} by {user_display}"),
-                        parse_mode=ParseMode.MARKDOWN_V2,
+                        text=f"{label} by {user_display}",
+                        parse_mode=ParseMode.MARKDOWN,
                        reply_markup=None,
                    )
                except Exception:
@@ -2729,8 +2587,8 @@ class TelegramAdapter(BasePlatformAdapter):

                try:
                    await query.edit_message_text(
-                        text=self.format_message(f"{label} by {user_display}"),
-                        parse_mode=ParseMode.MARKDOWN_V2,
+                        text=f"{label} by {user_display}",
+                        parse_mode=ParseMode.MARKDOWN,
                        reply_markup=None,
                    )
                except Exception:
@@ -2755,8 +2613,8 @@ class TelegramAdapter(BasePlatformAdapter):
                        prompt_message_id = getattr(query.message, "message_id", None)
                        send_kwargs: Dict[str, Any] = {
                            "chat_id": int(query.message.chat_id),
-                            "text": self.format_message(result_text),
-                            "parse_mode": ParseMode.MARKDOWN_V2,
+                            "text": result_text,
+                            "parse_mode": ParseMode.MARKDOWN,
                            **self._link_preview_kwargs(),
                        }
                        chat_type_value = getattr(chat_type, "value", chat_type)
@@ -2787,116 +2645,11 @@ class TelegramAdapter(BasePlatformAdapter):
                                    {"thread_id": str(thread_id)},
                                )
                            )
-                        await self._send_message_with_thread_fallback(**send_kwargs)
+                        await self._bot.send_message(**send_kwargs)
                except Exception as exc:
                    logger.error("[%s] slash-confirm callback failed: %s", self.name, exc, exc_info=True)
            return

-        # --- Clarify callbacks (cl:clarify_id:idx | cl:clarify_id:other) ---
-        if data.startswith("cl:"):
-            parts = data.split(":", 2)
-            if len(parts) == 3:
-                clarify_id = parts[1]
-                choice_token = parts[2]
-
-                caller_id = str(getattr(query.from_user, "id", ""))
-                if not self._is_callback_user_authorized(
-                    caller_id,
-                    chat_id=query_chat_id,
-                    chat_type=str(query_chat_type) if query_chat_type is not None else None,
-                    thread_id=str(query_thread_id) if query_thread_id is not None else None,
-                    user_name=query_user_name,
-                ):
-                    await query.answer(text="⛔ You are not authorized to answer this prompt.")
-                    return
-
-                session_key = self._clarify_state.get(clarify_id)
-                if not session_key:
-                    await query.answer(text="This prompt has already been resolved.")
-                    return
-
-                user_display = getattr(query.from_user, "first_name", "User")
-
-                if choice_token == "other":
-                    # Flip into text-capture mode and tell the user to type
-                    # their answer.  The gateway's text-intercept will pick
-                    # up the next message in this session and resolve the
-                    # clarify.  Do NOT pop _clarify_state yet — we still
-                    # need it if the user is slow to respond and the entry
-                    # is cleared by something else.
-                    try:
-                        from tools.clarify_gateway import mark_awaiting_text
-                        mark_awaiting_text(clarify_id)
-                    except Exception as exc:
-                        logger.warning("[%s] mark_awaiting_text failed: %s", self.name, exc)
-
-                    await query.answer(text="✏️ Type your answer in the chat.")
-                    try:
-                        await query.edit_message_text(
-                            text=f"❓ {query.message.text or ''}\n\n<i>Awaiting typed response from {_html.escape(user_display)}…</i>",
-                            parse_mode=ParseMode.HTML,
-                            reply_markup=None,
-                        )
-                    except Exception:
-                        pass
-                    return
-
-                # Numeric choice → resolve immediately with the chosen text
-                try:
-                    idx = int(choice_token)
-                except (ValueError, TypeError):
-                    await query.answer(text="Invalid choice.")
-                    return
-
-                # Look up the choice text from the entry registered in the
-                # clarify primitive.  Fall back to the index if the entry
-                # has been cleaned up (race with timeout / session reset).
-                resolved_text: Optional[str] = None
-                try:
-                    from tools.clarify_gateway import _entries as _clarify_entries  # type: ignore
-                    entry = _clarify_entries.get(clarify_id)
-                    if entry and entry.choices and 0 <= idx < len(entry.choices):
-                        resolved_text = entry.choices[idx]
-                except Exception:
-                    resolved_text = None
-
-                if resolved_text is None:
-                    # Race: entry vanished. Echo the index as a number so
-                    # the agent at least sees an intentional response
-                    # rather than nothing.
-                    resolved_text = f"choice {idx + 1}"
-
-                # Pop state and resolve
-                self._clarify_state.pop(clarify_id, None)
-                try:
-                    from tools.clarify_gateway import resolve_gateway_clarify
-                    resolved = resolve_gateway_clarify(clarify_id, resolved_text)
-                except Exception as exc:
-                    logger.error("[%s] resolve_gateway_clarify failed: %s", self.name, exc)
-                    resolved = False
-
-                await query.answer(text=f"✓ {resolved_text[:60]}")
-                try:
-                    await query.edit_message_text(
-                        text=f"❓ {_html.escape(query.message.text or '')}\n\n<b>{_html.escape(user_display)}:</b> {_html.escape(resolved_text)}",
-                        parse_mode=ParseMode.HTML,
-                        reply_markup=None,
-                    )
-                except Exception:
-                    pass
-
-                if resolved:
-                    logger.info(
-                        "Telegram clarify button resolved (id=%s, choice=%r, user=%s)",
-                        clarify_id, resolved_text, user_display,
-                    )
-                else:
-                    logger.warning(
-                        "Telegram clarify button: resolve_gateway_clarify returned False (id=%s)",
-                        clarify_id,
-                    )
-            return
-
        # --- Update prompt callbacks ---
        if not data.startswith("update_prompt:"):
            return
@@ -2916,8 +2669,8 @@ class TelegramAdapter(BasePlatformAdapter):
        label = "Yes" if answer == "y" else "No"
        try:
            await query.edit_message_text(
-                text=self.format_message(f"⚕ Update prompt answered: *{label}*"),
-                parse_mode=ParseMode.MARKDOWN_V2,
+                text=f"⚕ Update prompt answered: *{label}*",
+                parse_mode=ParseMode.MARKDOWN,
                reply_markup=None,
            )
        except Exception:
@@ -4776,27 +4529,6 @@ class TelegramAdapter(BasePlatformAdapter):
            logger.debug("[%s] set_message_reaction failed (%s): %s", self.name, emoji, e)
            return False

-    async def _clear_reactions(self, chat_id: str, message_id: str) -> bool:
-        """Clear all reactions from a Telegram message.
-
-        Calling ``set_message_reaction`` with ``reaction=None`` (or an empty
-        sequence) is the documented Bot API way to remove all bot-set
-        reactions on a message — equivalent to Bot API 10.0's
-        ``deleteMessageReaction`` but supported in PTB 22.6 already.
-        """
-        if not self._bot:
-            return False
-        try:
-            await self._bot.set_message_reaction(
-                chat_id=int(chat_id),
-                message_id=int(message_id),
-                reaction=None,
-            )
-            return True
-        except Exception as e:
-            logger.debug("[%s] clear reactions failed: %s", self.name, e)
-            return False
-
    async def on_processing_start(self, event: MessageEvent) -> None:
        """Add an in-progress reaction when message processing begins."""
        if not self._reactions_enabled():
@@ -4811,23 +4543,12 @@ class TelegramAdapter(BasePlatformAdapter):

        Unlike Discord (additive reactions), Telegram's set_message_reaction
        replaces all existing reactions in one call — no remove step needed.
-
-        On CANCELLED outcomes (e.g. the user runs ``/stop``, or a session is
-        interrupted mid-flight), we explicitly clear the 👀 in-progress
-        reaction so it doesn't linger on the user's message indefinitely.
-        Without this clear, the only way to remove the 👀 was to wait for
-        another agent run to swap it to 👍/👎 — which never happens if the
-        cancellation was the last activity in the chat.
        """
        if not self._reactions_enabled():
            return
        chat_id = getattr(event.source, "chat_id", None)
        message_id = getattr(event, "message_id", None)
-        if not (chat_id and message_id):
-            return
-        if outcome == ProcessingOutcome.CANCELLED:
-            await self._clear_reactions(chat_id, message_id)
-        else:
+        if chat_id and message_id and outcome != ProcessingOutcome.CANCELLED:
            await self._set_reaction(
                chat_id,
                message_id,
@@ -345,7 +345,6 @@ class WeComAdapter(BasePlatformAdapter):
                try:
                    await self._open_connection()
                    backoff_idx = 0
-                    self._mark_connected()
                    logger.info("[%s] Reconnected", self.name)
                except Exception as reconnect_exc:
                    logger.warning("[%s] Reconnect failed: %s", self.name, reconnect_exc)
@@ -322,26 +322,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
            return {str(part).strip() for part in raw if str(part).strip()}
        return {part.strip() for part in str(raw).split(",") if part.strip()}

-    @staticmethod
-    def _is_broadcast_chat(chat_id: str) -> bool:
-        """True for WhatsApp pseudo-chats that aren't real conversations.
-
-        Covers Status updates (Stories) and Channel/Newsletter broadcasts.
-        These show up as inbound messages on Baileys but the agent should
-        never reply — answering a Story update spams the contact's status
-        feed, and Channel posts aren't addressable in the first place.
-        """
-        if not chat_id:
-            return False
-        cid = chat_id.strip().lower()
-        if cid == "status@broadcast":
-            return True
-        # @broadcast suffix covers status@broadcast plus any future
-        # broadcast-list variants. @newsletter is the Channel JID suffix.
-        if cid.endswith("@broadcast") or cid.endswith("@newsletter"):
-            return True
-        return False
-
    def _is_dm_allowed(self, sender_id: str) -> bool:
        """Check whether a DM from the given sender should be processed."""
        if self._dm_policy == "disabled":
@@ -452,16 +432,9 @@ class WhatsAppAdapter(BasePlatformAdapter):
        return cleaned.strip() or text

    def _should_process_message(self, data: Dict[str, Any]) -> bool:
-        chat_id_raw = str(data.get("chatId") or "")
-        # WhatsApp uses pseudo-chats for Status updates (Stories) and
-        # Channel/Newsletter broadcasts. These are not real conversations
-        # and the agent should never reply to them — even in self-chat mode
-        # where the bridge may surface them as "fromMe" events.
-        if self._is_broadcast_chat(chat_id_raw):
-            return False
        is_group = data.get("isGroup", False)
        if is_group:
-            chat_id = chat_id_raw
+            chat_id = str(data.get("chatId") or "")
            if not self._is_group_allowed(chat_id):
                return False
        else:
@@ -521,15 +494,12 @@ class WhatsAppAdapter(BasePlatformAdapter):
                # plain executable path.
                _npm_bin = shutil.which("npm") or "npm"
                try:
-                    # Read timeout from environment variable, default to 300 seconds (5 minutes)
-                    # to accommodate slower systems like Unraid NAS
-                    npm_install_timeout = int(os.environ.get("WHATSAPP_NPM_INSTALL_TIMEOUT", "300"))
                    install_result = subprocess.run(
                        [_npm_bin, "install", "--silent"],
                        cwd=str(bridge_dir),
                        capture_output=True,
                        text=True,
-                        timeout=npm_install_timeout,
+                        timeout=60,
                    )
                    if install_result.returncode != 0:
                        print(f"[{self.name}] npm install failed: {install_result.stderr}")
@@ -1139,38 +1139,6 @@ def _should_clear_resume_pending_after_turn(agent_result: dict) -> bool:
    return True


-def _preserve_queued_followup_history_offset(
-    current_result: dict,
-    followup_result: dict,
-) -> dict:
-    """Carry the outer history offset through queued follow-up drains.
-
-    ``_process_message_background()`` persists transcript rows only once, after the
-    entire in-band queued-follow-up chain returns.  Each recursive ``_run_agent()``
-    call advances ``history_offset`` to the history it received, so without
-    correction the outermost persistence step sees only the *last* queued turn as
-    "new" and silently drops earlier turns from the same drain chain.
-
-    Preserve the earliest (outermost) history offset so the final transcript slice
-    still includes every queued turn that ran during the chain.
-    """
-    if not isinstance(followup_result, dict):
-        return followup_result
-    if not isinstance(current_result, dict):
-        return followup_result
-
-    current_offset = current_result.get("history_offset")
-    followup_offset = followup_result.get("history_offset")
-    if not isinstance(current_offset, int):
-        return followup_result
-    if isinstance(followup_offset, int) and followup_offset <= current_offset:
-        return followup_result
-
-    merged = dict(followup_result)
-    merged["history_offset"] = current_offset
-    return merged
-
-
 class GatewayRunner:
    """
    Main gateway controller.
@@ -3307,30 +3275,6 @@ class GatewayRunner:
            write_runtime_status(gateway_state="starting", exit_reason=None)
        except Exception:
            pass
-
-        # Log any active supply-chain security advisories. Operators see this
-        # in gateway.log and `hermes status` surfaces it; we do NOT block
-        # startup or surface it inline to user messages, since the gateway
-        # operator is the one who can act on it (uninstall the package,
-        # rotate credentials).  See hermes_cli/security_advisories.py.
-        try:
-            from hermes_cli.security_advisories import (
-                detect_compromised,
-                gateway_log_message,
-            )
-            _adv_hits = detect_compromised()
-            _adv_msg = gateway_log_message(_adv_hits)
-            if _adv_msg:
-                logger.warning("%s", _adv_msg)
-                logger.warning(
-                    "Run `hermes doctor` on the gateway host for full "
-                    "remediation steps."
-                )
-        except Exception:
-            logger.debug(
-                "security advisory check failed at gateway startup",
-                exc_info=True,
-            )
        
        # Warn if no user allowlists are configured and open access is not opted in
        _builtin_allowed_vars = (
@@ -5860,37 +5804,6 @@ class GatewayRunner:
                    )
                _update_prompts.pop(_quick_key, None)

-        # Intercept messages that are responses to a pending clarify
-        # request that is awaiting free-form text (either an open-ended
-        # clarify with no choices, or one where the user picked the
-        # "Other" button).  The first non-empty user message in the
-        # session resolves the clarify and unblocks the agent thread —
-        # we do NOT route it to the agent as a new turn.
-        try:
-            from tools import clarify_gateway as _clarify_mod
-            _pending_clarify = _clarify_mod.get_pending_for_session(_quick_key)
-        except Exception:
-            _pending_clarify = None
-        if _pending_clarify is not None:
-            _raw_clarify_reply = (event.text or "").strip()
-            # Skip slash commands — the user clearly wanted to issue a
-            # command, not answer the clarify.  Leave the clarify pending
-            # so the user can retry; if it times out, the agent unblocks
-            # with an empty response.
-            if _raw_clarify_reply and not _raw_clarify_reply.startswith("/"):
-                _resolved = _clarify_mod.resolve_gateway_clarify(
-                    _pending_clarify.clarify_id, _raw_clarify_reply,
-                )
-                if _resolved:
-                    logger.info(
-                        "Gateway intercepted clarify text response (session=%s, id=%s)",
-                        _quick_key, _pending_clarify.clarify_id,
-                    )
-                    # Acknowledge with empty string so adapters that emit
-                    # the agent's response don't double-post.  The agent
-                    # itself will produce the next user-facing message.
-                    return ""
-
        # Intercept messages that are responses to a pending /reload-mcp
        # (or future) slash-confirm prompt.  Recognized confirm replies are
        # /approve, /always, /cancel (plus short aliases).  Anything else
@@ -6128,12 +6041,6 @@ class GatewayRunner:
            if _cmd_def_inner and _cmd_def_inner.name == "model":
                return "Agent is running — wait or /stop first, then switch models."

-            # /codex-runtime must not be used while the agent is running.
-            # Switching mid-turn would split a turn across two transports.
-            if _cmd_def_inner and _cmd_def_inner.name == "codex-runtime":
-                return ("Agent is running — wait or /stop first, then "
-                        "change runtime.")
-
            # /approve and /deny must bypass the running-agent interrupt path.
            # The agent thread is blocked on a threading.Event inside
            # tools/approval.py — sending an interrupt won't unblock it.
@@ -6173,12 +6080,6 @@ class GatewayRunner:
                    return await self._handle_goal_command(event)
                return "Agent is running — use /goal status / pause / clear mid-run, or /stop before setting a new goal."

-            # /subgoal is safe mid-run — it only modifies the goal's
-            # subgoals list, which the judge reads at the next turn
-            # boundary. No race with the running turn.
-            if _cmd_def_inner and _cmd_def_inner.name == "subgoal":
-                return await self._handle_subgoal_command(event)
-
            # Session-level toggles that are safe to run mid-agent —
            # /yolo can unblock a pending approval prompt, /verbose cycles
            # the tool-progress display mode for the ongoing stream.
@@ -6474,9 +6375,6 @@ class GatewayRunner:
        if canonical == "model":
            return await self._handle_model_command(event)

-        if canonical == "codex-runtime":
-            return await self._handle_codex_runtime_command(event)
-
        if canonical == "personality":
            return await self._handle_personality_command(event)

@@ -6560,9 +6458,6 @@ class GatewayRunner:
        if canonical == "goal":
            return await self._handle_goal_command(event)

-        if canonical == "subgoal":
-            return await self._handle_subgoal_command(event)
-
        if canonical == "voice":
            return await self._handle_voice_command(event)

@@ -7593,7 +7488,6 @@ class GatewayRunner:
            hook_ctx = {
                "platform": source.platform.value if source.platform else "",
                "user_id": source.user_id,
-                "chat_id": source.chat_id or "",
                "session_id": session_entry.session_id,
                "message": message_text[:500],
            }
@@ -9260,51 +9154,6 @@ class GatewayRunner:

        return "\n".join(lines)

-    async def _handle_codex_runtime_command(self, event: MessageEvent) -> str:
-        """Handle /codex-runtime command in the gateway.
-
-        Same surface as the CLI handler in cli.py:
-            /codex-runtime                  — show current state
-            /codex-runtime auto             — Hermes default runtime
-            /codex-runtime codex_app_server — codex subprocess runtime
-            /codex-runtime on / off         — synonyms
-
-        On change, the cached agent for this session is evicted so the next
-        message creates a fresh AIAgent with the new api_mode wired in
-        (avoids prompt-cache invalidation mid-session)."""
-        from hermes_cli import codex_runtime_switch as crs
-
-        raw_args = event.get_command_args().strip() if event else ""
-        new_value, errors = crs.parse_args(raw_args)
-        if errors:
-            return "❌ " + "\n❌ ".join(errors)
-
-        # Load + persist via the same helpers used for /model and /yolo
-        try:
-            from hermes_cli.config import load_config, save_config
-        except Exception as exc:
-            return f"❌ Could not load config: {exc}"
-        cfg = load_config()
-
-        result = crs.apply(
-            cfg,
-            new_value,
-            persist_callback=(save_config if new_value is not None else None),
-        )
-
-        # On a real change, evict the cached agent so the new runtime takes
-        # effect on the next message rather than waiting for cache TTL.
-        if result.success and new_value is not None and result.requires_new_session:
-            try:
-                session_key = self._session_key_for_source(event.source)
-                self._evict_cached_agent(session_key)
-            except Exception:
-                logger.debug("could not evict cached agent after codex-runtime change",
-                             exc_info=True)
-
-        prefix = "✓" if result.success else "✗"
-        return f"{prefix} {result.message}"
-
    async def _handle_personality_command(self, event: MessageEvent) -> str:
        """Handle /personality command - list or set a personality."""
        from hermes_constants import display_hermes_home
@@ -9533,57 +9382,6 @@ class GatewayRunner:

        return t("gateway.goal.set", budget=state.max_turns, goal=state.goal)

-    async def _handle_subgoal_command(self, event: "MessageEvent") -> str:
-        """Handle /subgoal for gateway platforms (mirror of CLI handler).
-
-        Subgoals are extra criteria appended to the active goal mid-loop.
-        They modify state read at the next turn boundary, so this is safe
-        to invoke while the agent is running.
-        """
-        args = (event.get_command_args() or "").strip()
-        mgr, _session_entry = self._get_goal_manager_for_event(event)
-        if mgr is None:
-            return t("gateway.goal.unavailable")
-        if not mgr.has_goal():
-            return "No active goal. Set one with /goal <text>."
-
-        # No args → list current subgoals.
-        if not args:
-            return f"{mgr.status_line()}\n{mgr.render_subgoals()}"
-
-        tokens = args.split(None, 1)
-        verb = tokens[0].lower()
-        rest = tokens[1].strip() if len(tokens) > 1 else ""
-
-        if verb == "remove":
-            if not rest:
-                return "Usage: /subgoal remove <n>"
-            try:
-                idx = int(rest.split()[0])
-            except ValueError:
-                return "/subgoal remove: <n> must be an integer (1-based index)."
-            try:
-                removed = mgr.remove_subgoal(idx)
-            except (IndexError, RuntimeError) as exc:
-                return f"/subgoal remove: {exc}"
-            return f"✓ Removed subgoal {idx}: {removed}"
-
-        if verb == "clear":
-            try:
-                prev = mgr.clear_subgoals()
-            except RuntimeError as exc:
-                return f"/subgoal clear: {exc}"
-            if prev:
-                return f"✓ Cleared {prev} subgoal{'s' if prev != 1 else ''}."
-            return "No subgoals to clear."
-
-        try:
-            text = mgr.add_subgoal(args)
-        except (ValueError, RuntimeError) as exc:
-            return f"/subgoal: {exc}"
-        idx = len(mgr.state.subgoals) if mgr.state else 0
-        return f"✓ Added subgoal {idx}: {text}"
-
    async def _send_goal_status_notice(self, source: Any, message: str) -> None:
        """Send a /goal judge status line back to the originating chat/thread."""
        adapter = self.adapters.get(source.platform)
@@ -10355,10 +10153,6 @@ class GatewayRunner:

        event_message_id = self._reply_anchor_for_event(event)

-        # Forward image/audio attachments so the background agent can see them.
-        media_urls = list(event.media_urls) if event.media_urls else []
-        media_types = list(event.media_types) if event.media_types else []
-
        # Fire-and-forget the background task
        _task = asyncio.create_task(
            self._run_background_task(
@@ -10366,8 +10160,6 @@ class GatewayRunner:
                source,
                task_id,
                event_message_id=event_message_id,
-                media_urls=media_urls,
-                media_types=media_types,
            )
        )
        self._background_tasks.add(_task)
@@ -10382,15 +10174,10 @@ class GatewayRunner:
        source: "SessionSource",
        task_id: str,
        event_message_id: Optional[str] = None,
-        media_urls: Optional[List[str]] = None,
-        media_types: Optional[List[str]] = None,
    ) -> None:
        """Execute a background agent task and deliver the result to the chat."""
        from run_agent import AIAgent

-        media_urls = media_urls or []
-        media_types = media_types or []
-
        adapter = self.adapters.get(source.platform)
        if not adapter:
            logger.warning("No adapter for platform %s in background task %s", source.platform, task_id)
@@ -10426,23 +10213,6 @@ class GatewayRunner:
            self._service_tier = self._load_service_tier()
            turn_route = self._resolve_turn_agent_config(prompt, model, runtime_kwargs)

-            # Enrich the prompt with image descriptions so the background
-            # agent can see user-attached images (same as the main flow).
-            enriched_prompt = prompt
-            if media_urls:
-                image_paths = []
-                for i, path in enumerate(media_urls):
-                    mtype = media_types[i] if i < len(media_types) else ""
-                    if mtype.startswith("image/"):
-                        image_paths.append(path)
-                if image_paths:
-                    try:
-                        enriched_prompt = await self._enrich_message_with_vision(
-                            prompt, image_paths,
-                        )
-                    except Exception as e:
-                        logger.warning("Background task vision enrichment failed: %s", e)
-
            def run_sync():
                agent = AIAgent(
                    model=turn_route["model"],
@@ -10474,7 +10244,7 @@ class GatewayRunner:
                )
                try:
                    return agent.run_conversation(
-                        user_message=enriched_prompt,
+                        user_message=prompt,
                        task_id=task_id,
                    )
                finally:
@@ -15163,76 +14933,6 @@ class GatewayRunner:
                    if _pdc is not None:
                        _pdc[session_key] = _release_bg_review_messages

-            # ------------------------------------------------------------------
-            # Clarify callback: present a clarify prompt and block on a response.
-            #
-            # Runs on the agent's worker thread (see clarify_tool's synchronous
-            # callback contract).  Bridges sync→async by scheduling the
-            # adapter's send_clarify on the gateway event loop, then blocks on
-            # the clarify primitive's threading.Event with a configurable
-            # timeout.  Returns the user's response string, or a sentinel
-            # explaining that no response arrived (so the agent can adapt
-            # rather than hang forever).
-            # ------------------------------------------------------------------
-            def _clarify_callback_sync(question: str, choices) -> str:
-                from tools import clarify_gateway as _clarify_mod
-                import uuid as _uuid
-
-                if not _status_adapter:
-                    return ""
-
-                clarify_id = _uuid.uuid4().hex[:10]
-                _clarify_mod.register(
-                    clarify_id=clarify_id,
-                    session_key=session_key or "",
-                    question=question,
-                    choices=list(choices) if choices else None,
-                )
-
-                # Pause typing — like approval, we don't want a "thinking..."
-                # status to obscure the prompt or block the user from typing
-                # an "Other" response on platforms that disable input while
-                # typing is active (Slack Assistant API).
-                try:
-                    _status_adapter.pause_typing_for_chat(_status_chat_id)
-                except Exception:
-                    pass
-
-                send_ok = False
-                try:
-                    fut = asyncio.run_coroutine_threadsafe(
-                        _status_adapter.send_clarify(
-                            chat_id=_status_chat_id,
-                            question=question,
-                            choices=list(choices) if choices else None,
-                            clarify_id=clarify_id,
-                            session_key=session_key or "",
-                            metadata=_status_thread_metadata,
-                        ),
-                        _loop_for_step,
-                    )
-                    result = fut.result(timeout=15)
-                    send_ok = bool(getattr(result, "success", False))
-                except Exception as exc:
-                    logger.warning("Clarify send failed: %s", exc)
-                    send_ok = False
-
-                if not send_ok:
-                    # Couldn't deliver the prompt — clean up and return
-                    # sentinel so the agent can fall back to a sensible
-                    # default rather than hanging.
-                    _clarify_mod.clear_session(session_key or "")
-                    return "[clarify prompt could not be delivered]"
-
-                timeout = _clarify_mod.get_clarify_timeout()
-                response = _clarify_mod.wait_for_response(clarify_id, timeout=float(timeout))
-                if response is None or response == "":
-                    # Timeout or session-boundary cancellation
-                    return f"[user did not respond within {int(timeout / 60)}m]"
-                return response
-
-            agent.clarify_callback = _clarify_callback_sync
-
            # Store agent reference for interrupt support
            agent_holder[0] = agent
            # Capture the full tool definitions for transcript logging
@@ -15504,14 +15204,6 @@ class GatewayRunner:
                result = agent.run_conversation(_run_message, conversation_history=agent_history, task_id=session_id)
            finally:
                unregister_gateway_notify(_approval_session_key)
-                # Cancel any pending clarify entries so blocked agent
-                # threads don't hang past the end of the run (interrupt,
-                # completion, gateway shutdown).  Idempotent.
-                try:
-                    from tools.clarify_gateway import clear_session as _clear_clarify_session
-                    _clear_clarify_session(_approval_session_key)
-                except Exception:
-                    pass
                reset_current_session_key(_approval_session_token)
            result_holder[0] = result

@@ -16131,7 +15823,6 @@ class GatewayRunner:
                    _already_streamed = bool(
                        (_sc and getattr(_sc, "final_response_sent", False))
                        or _previewed
-                        or (_sc and getattr(_sc, "final_content_delivered", False))
                    )
                    first_response = result.get("final_response", "")
                    if first_response and not _already_streamed:
@@ -16217,7 +15908,7 @@ class GatewayRunner:
                    except Exception:
                        pass

-                followup_result = await self._run_agent(
+                return await self._run_agent(
                    message=next_message,
                    context_prompt=context_prompt,
                    history=updated_history,
@@ -16229,7 +15920,6 @@ class GatewayRunner:
                    event_message_id=next_message_id,
                    channel_prompt=next_channel_prompt,
                )
-                return _preserve_queued_followup_history_offset(result, followup_result)
        finally:
            # Stop progress sender, interrupt monitor, and notification task
            if progress_task:
@@ -16293,16 +15983,12 @@ class GatewayRunner:
            # response_previewed means the interim_assistant_callback already
            # sent the final text via the adapter (non-streaming path).
            _previewed = bool(response.get("response_previewed"))
-            _content_delivered = bool(
-                _sc and getattr(_sc, "final_content_delivered", False)
-            )
-            if not _is_empty_sentinel and (_streamed or _previewed or _content_delivered):
+            if not _is_empty_sentinel and (_streamed or _previewed):
                logger.info(
-                    "Suppressing normal final send for session %s: final delivery already confirmed (streamed=%s previewed=%s content_delivered=%s).",
+                    "Suppressing normal final send for session %s: final delivery already confirmed (streamed=%s previewed=%s).",
                    session_key or "?",
                    _streamed,
                    _previewed,
-                    _content_delivered,
                )
                response["already_sent"] = True

@@ -124,44 +124,16 @@ def get_process_start_time(pid: int) -> Optional[int]:


 def _read_process_cmdline(pid: int) -> Optional[str]:
-    """Return the process command line as a space-separated string.
-
-    On Linux, reads /proc/<pid>/cmdline directly.  On macOS and other
-    platforms without /proc, falls back to ``ps -p <pid> -o command=``.
-    On Windows (no /proc, no ps), uses psutil.
-    """
+    """Return the process command line as a space-separated string (read from /proc/<pid>/cmdline; None if unreadable or empty)."""
    cmdline_path = Path(f"/proc/{pid}/cmdline")
    try:
        raw = cmdline_path.read_bytes()
    except (FileNotFoundError, PermissionError, OSError):
-        pass
-    else:
-        if raw:
-            return raw.replace(b"\x00", b" ").decode("utf-8", errors="ignore").strip()
+        return None

-    try:
-        result = subprocess.run(
-            ["ps", "-p", str(pid), "-o", "command="],
-            capture_output=True,
-            text=True,
-            timeout=5,
-        )
-        if result.returncode == 0 and result.stdout.strip():
-            return result.stdout.strip()
-    except (OSError, subprocess.TimeoutExpired):
-        pass
-
-    # Windows fallback: psutil (already used by _pid_exists)
-    try:
-        import psutil  # type: ignore
-        proc = psutil.Process(pid)
-        cmdline_parts = proc.cmdline()
-        if cmdline_parts:
-            return " ".join(cmdline_parts)
-    except Exception:
-        pass
-
-    return None
+    if not raw:
+        return None
+    return raw.replace(b"\x00", b" ").decode("utf-8", errors="ignore").strip()


 def _looks_like_gateway_process(pid: int) -> bool:
@@ -189,8 +161,7 @@ def _record_looks_like_gateway(record: dict[str, Any]) -> bool:
    if not isinstance(argv, list) or not argv:
        return False

-    # Normalize Windows backslashes so patterns match cross-platform.
-    cmdline = " ".join(str(part) for part in argv).replace("\\", "/")
+    cmdline = " ".join(str(part) for part in argv)
    patterns = (
        "hermes_cli.main gateway",
        "hermes_cli/main.py gateway",
@@ -623,22 +594,6 @@ def acquire_scoped_lock(scope: str, identity: str, metadata: Optional[dict[str,
                    and current_start != existing.get("start_time")
                ):
                    stale = True
-                # When start_time comparison is unavailable (macOS / Windows
-                # have no /proc, so both sides are None), fall back to
-                # checking the live process command line.  When cmdline is
-                # also unreadable (Windows has no ps), consult the lock
-                # record's own argv — the gateway writes it at startup and
-                # it's the only identity signal on platforms without ps.
-                # Both oracles must indicate "not a gateway" to mark stale.
-                if (
-                    not stale
-                    and existing.get("start_time") is None
-                    and current_start is None
-                    and not _looks_like_gateway_process(existing_pid)
-                ):
-                    live_cmdline = _read_process_cmdline(existing_pid)
-                    if live_cmdline is not None or not _record_looks_like_gateway(existing):
-                        stale = True
                # Check if process is stopped (Ctrl+Z / SIGTSTP) — stopped
                # processes still appear alive to _pid_exists but are not
                # actually running. Treat them as stale so --replace works.
@@ -150,10 +150,6 @@ class GatewayStreamConsumer:
        self._flood_strikes = 0         # Consecutive flood-control edit failures
        self._current_edit_interval = self.cfg.edit_interval  # Adaptive backoff
        self._final_response_sent = False
-        # Set when the final response content was sent to the user via
-        # streaming, even if the final edit (cursor removal etc.)
-        # subsequently failed.
-        self._final_content_delivered = False
        # Cache adapter lifecycle capability: only platforms that need an
        # explicit finalize call (e.g. DingTalk AI Cards) force us to make
        # a redundant final edit.  Everyone else keeps the fast path.
@@ -191,12 +187,6 @@ class GatewayStreamConsumer:
        """True when the stream consumer delivered the final assistant reply."""
        return self._final_response_sent

-    @property
-    def final_content_delivered(self) -> bool:
-        """True when the final response content reached the user, even if
-        the subsequent cosmetic edit (cursor removal) failed."""
-        return self._final_content_delivered
-
    def on_segment_break(self) -> None:
        """Finalize the current stream segment and start a fresh message."""
        self._queue.put(_NEW_SEGMENT)
@@ -465,8 +455,6 @@ class GatewayStreamConsumer:
                            # tool-progress edits or fallback-mode promotion (#10748)
                            # — that doesn't mean the final answer reached the user.
                            self._final_response_sent = chunks_delivered
-                            if chunks_delivered:
-                                self._final_content_delivered = True
                            return
                        if got_segment_break:
                            self._message_id = None
@@ -517,11 +505,6 @@ class GatewayStreamConsumer:
                    self._last_edit_time = time.monotonic()

                if got_done:
-                    # Record that the final content reached the user even
-                    # if the cosmetic final edit below fails.
-                    if current_update_visible and self._accumulated:
-                        self._final_content_delivered = True
-
                    # Final edit without cursor. If progressive editing failed
                    # mid-stream, send a single continuation/fallback message
                    # here instead of letting the base gateway path send the
@@ -35,7 +35,7 @@ from dataclasses import dataclass, field
 from datetime import datetime, timezone
 from http.server import BaseHTTPRequestHandler, HTTPServer
 from pathlib import Path
-from typing import Any, Dict, List, Optional, Tuple
+from typing import Any, Dict, List, Optional
 from urllib.parse import parse_qs, urlencode, urlparse

 import httpx
@@ -284,7 +284,7 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
    ),
    "alibaba": ProviderConfig(
        id="alibaba",
-        name="Qwen Cloud",
+        name="Alibaba Cloud (DashScope)",
        auth_type="api_key",
        inference_base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
        api_key_env_vars=("DASHSCOPE_API_KEY",),
@@ -3870,39 +3870,6 @@ def _snapshot_nous_pool_status() -> Dict[str, Any]:
        return _empty_nous_auth_status()


-# ── Process-level memo for get_nous_auth_status() ──
-# get_nous_auth_status() validates state by calling resolve_nous_runtime_credentials(),
-# which does a synchronous OAuth refresh POST to portal.nousresearch.com. That can take
-# ~350ms even on the failure path, and read-only UI surfaces (`hermes tools`, status panels,
-# subscription-feature checks) call it many times per render — `hermes tools` → "All Platforms"
-# was firing the refresh ~31× during one menu paint, racking up >13s of HTTP and burning
-# single-use refresh tokens. Cache the snapshot for a few seconds, keyed on the auth.json
-# mtime so that `hermes auth login/logout/add/remove` invalidate naturally on the next call.
-_NOUS_AUTH_STATUS_CACHE_TTL = 15.0  # seconds
-_nous_auth_status_cache: Optional[Tuple[float, Optional[float], Dict[str, Any]]] = None
-
-
-def _auth_file_mtime() -> Optional[float]:
-    try:
-        return _auth_file_path().stat().st_mtime
-    except FileNotFoundError:
-        return None
-    except Exception:
-        return None
-
-
-def invalidate_nous_auth_status_cache() -> None:
-    """Clear the get_nous_auth_status() process-level memo.
-
-    Call this from any code path that mutates Nous auth state without going
-    through resolve_nous_runtime_credentials() (e.g. tests). Login/logout
-    flows touch auth.json, so the mtime check below invalidates them
-    automatically — explicit invalidation is the belt-and-braces option.
-    """
-    global _nous_auth_status_cache
-    _nous_auth_status_cache = None
-
-
 def get_nous_auth_status() -> Dict[str, Any]:
    """Status snapshot for Nous auth.

@@ -3911,32 +3878,7 @@ def get_nous_auth_status() -> Dict[str, Any]:
    by resolving runtime credentials so revoked refresh sessions do not show up
    as a healthy login. If provider state is absent, fall back to the credential
    pool for the just-logged-in / not-yet-promoted case.
-
-    The returned snapshot is memoised for ~15s keyed on the auth.json mtime,
-    so menu/status surfaces that ask repeatedly don't trigger one refresh POST
-    per call. Login/logout flows write to auth.json and therefore invalidate
-    the cache automatically; tests can also call
-    ``invalidate_nous_auth_status_cache()`` explicitly.
    """
-    global _nous_auth_status_cache
-    now = time.monotonic()
-    mtime = _auth_file_mtime()
-    cached = _nous_auth_status_cache
-    if cached is not None:
-        cached_at, cached_mtime, cached_status = cached
-        if (
-            cached_mtime == mtime
-            and (now - cached_at) < _NOUS_AUTH_STATUS_CACHE_TTL
-        ):
-            return dict(cached_status)
-
-    status = _compute_nous_auth_status()
-    _nous_auth_status_cache = (now, mtime, dict(status))
-    return status
-
-
-def _compute_nous_auth_status() -> Dict[str, Any]:
-    """Uncached implementation of get_nous_auth_status(). See that function."""
    state = get_provider_auth_state("nous")
    if state:
        base_status = {
@@ -4104,8 +4046,6 @@ def get_auth_status(provider_id: Optional[str] = None) -> Dict[str, Any]:
        return get_qwen_auth_status()
    if target == "google-gemini-cli":
        return get_gemini_oauth_auth_status()
-    if target == "minimax-oauth":
-        return get_minimax_oauth_auth_status()
    if target == "copilot-acp":
        return get_external_process_provider_status(target)
    # API-key providers
@@ -4817,20 +4757,6 @@ def _minimax_request_user_code(
    return payload


-def _minimax_expired_in_looks_like_unix_ms(expired_in: int, *, now_ms: int) -> bool:
-    """True if ``expired_in`` is plausibly a unix-ms absolute time (vs TTL seconds)."""
-    return int(expired_in) > (now_ms // 2)
-
-
-def _minimax_resolve_token_expiry_unix(expired_in: int, *, now: datetime) -> float:
-    """Return access-token expiry as unix seconds (MiniMax uses ms epoch or TTL seconds)."""
-    raw = int(expired_in)
-    now_ms = int(now.timestamp() * 1000)
-    if _minimax_expired_in_looks_like_unix_ms(raw, now_ms=now_ms):
-        return raw / 1000.0
-    return now.timestamp() + max(1, raw)
-
-
 def _minimax_poll_token(
    client: httpx.Client, *, portal_base_url: str, client_id: str,
    user_code: str, code_verifier: str, expired_in: int, interval_ms: Optional[int],
@@ -4839,11 +4765,12 @@ def _minimax_poll_token(
    # Defensive parsing: if it's small enough to be a duration, treat as seconds.
    import time as _time
    now_ms = int(_time.time() * 1000)
-    raw = int(expired_in)
-    if _minimax_expired_in_looks_like_unix_ms(raw, now_ms=now_ms):
-        deadline = raw / 1000.0
+    if expired_in > now_ms // 2:
+        # Looks like a unix-ms timestamp.
+        deadline = expired_in / 1000.0
    else:
-        deadline = _time.time() + max(1, raw)
+        # Treat as duration in seconds from now.
+        deadline = _time.time() + max(1, expired_in)
    interval = max(2.0, (interval_ms or 2000) / 1000.0)

    while _time.time() < deadline:
@@ -4957,10 +4884,8 @@ def _minimax_oauth_login(
        )

    now = datetime.now(timezone.utc)
-    expires_at_unix = _minimax_resolve_token_expiry_unix(
-        int(token_data["expired_in"]), now=now,
-    )
-    expires_in_s = max(0, int(expires_at_unix - now.timestamp()))
+    expires_in_s = int(token_data["expired_in"])
+    expires_at = now.timestamp() + expires_in_s

    auth_state = {
        "provider": "minimax-oauth",
@@ -4974,7 +4899,7 @@ def _minimax_oauth_login(
        "refresh_token": token_data["refresh_token"],
        "resource_url": token_data.get("resource_url"),
        "obtained_at": now.isoformat(),
-        "expires_at": datetime.fromtimestamp(expires_at_unix, tz=timezone.utc).isoformat(),
+        "expires_at": datetime.fromtimestamp(expires_at, tz=timezone.utc).isoformat(),
        "expires_in": expires_in_s,
    }

@@ -5035,16 +4960,14 @@ def _refresh_minimax_oauth_state(
            relogin_required=True,
        )
    now_dt = datetime.now(timezone.utc)
-    expires_at_unix = _minimax_resolve_token_expiry_unix(
-        int(payload["expired_in"]), now=now_dt,
-    )
-    expires_in_s = max(0, int(expires_at_unix - now_dt.timestamp()))
+    expires_in_s = int(payload["expired_in"])
    new_state = dict(state)
    new_state.update({
        "access_token": payload["access_token"],
        "refresh_token": payload.get("refresh_token", state["refresh_token"]),
        "obtained_at": now_dt.isoformat(),
-        "expires_at": datetime.fromtimestamp(expires_at_unix, tz=timezone.utc).isoformat(),
+        "expires_at": datetime.fromtimestamp(now_dt.timestamp() + expires_in_s,
+                                             tz=timezone.utc).isoformat(),
        "expires_in": expires_in_s,
    })
    _minimax_save_auth_state(new_state)
@@ -5329,7 +5252,6 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
                get_curated_nous_model_ids, get_pricing_for_provider,
                check_nous_free_tier, partition_nous_models_by_tier,
                union_with_portal_free_recommendations,
-                union_with_portal_paid_recommendations,
            )
            model_ids = get_curated_nous_model_ids()

@@ -5338,27 +5260,19 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
            if model_ids:
                pricing = get_pricing_for_provider("nous")
                free_tier = check_nous_free_tier()
-                _portal_for_recs = auth_state.get("portal_base_url", "")
                if free_tier:
                    # The Portal's freeRecommendedModels endpoint is the
                    # source of truth for what's free *right now*. Augment
                    # the curated list with anything new the Portal flags
                    # as free so users on older Hermes builds still see
                    # newly-launched free models without a CLI release.
+                    _portal_for_recs = auth_state.get("portal_base_url", "")
                    model_ids, pricing = union_with_portal_free_recommendations(
                        model_ids, pricing, _portal_for_recs,
                    )
                    model_ids, unavailable_models = partition_nous_models_by_tier(
                        model_ids, pricing, free_tier=True,
                    )
-                else:
-                    # Paid-tier mirror: pull paidRecommendedModels so newly
-                    # launched paid models surface in the picker even if
-                    # the in-repo curated list and docs-hosted manifest
-                    # haven't caught up yet.
-                    model_ids, pricing = union_with_portal_paid_recommendations(
-                        model_ids, pricing, _portal_for_recs,
-                    )
            _portal = auth_state.get("portal_base_url", "")
            if model_ids:
                print(f"Showing {len(model_ids)} curated models — use \"Enter custom model name\" for others.")
@@ -375,12 +375,10 @@ def auth_add_command(args) -> None:
        return

    if provider == "minimax-oauth":
-        creds = auth_mod._minimax_oauth_login(
-            open_browser=not getattr(args, "no_browser", False),
-            timeout_seconds=getattr(args, "timeout", None) or 15.0,
-        )
+        from hermes_cli.auth import resolve_minimax_oauth_runtime_credentials
+        creds = resolve_minimax_oauth_runtime_credentials()
        label = (getattr(args, "label", None) or "").strip() or label_from_token(
-            creds["access_token"],
+            creds["api_key"],
            _oauth_default_label(provider, len(pool.entries()) + 1),
        )
        entry = PooledCredential(
@@ -390,9 +388,8 @@ def auth_add_command(args) -> None:
            auth_type=AUTH_TYPE_OAUTH,
            priority=0,
            source=f"{SOURCE_MANUAL}:minimax_oauth",
-            access_token=creds["access_token"],
-            refresh_token=creds.get("refresh_token"),
-            base_url=creds.get("inference_base_url"),
+            access_token=creds["api_key"],
+            base_url=creds.get("base_url"),
        )
        pool.add_entry(entry)
        print(f'Added {provider} OAuth credential #{len(pool.entries())}: "{entry.label}"')
@@ -581,19 +581,6 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
    if mcp_connected:
        summary_parts.append(f"{mcp_connected} MCP servers")
    summary_parts.append("/help for commands")
-    # Indicate when the codex_app_server runtime is active so users
-    # understand why tool counts may not match what's actually reachable
-    # (codex builds its own tool list inside the spawned subprocess).
-    try:
-        from hermes_cli.codex_runtime_switch import get_current_runtime
-        from hermes_cli.config import load_config as _load_cfg
-        if get_current_runtime(_load_cfg()) == "codex_app_server":
-            right_lines.append(
-                f"[bold {accent}]Runtime:[/] [{text}]codex app-server[/] "
-                f"[dim {dim}](terminal/file ops/MCP run inside codex)[/]"
-            )
-    except Exception:
-        pass
    # Show active profile name when not 'default'
    try:
        from hermes_cli.profiles import get_active_profile_name
@@ -22,7 +22,6 @@ from pathlib import Path
 from hermes_constants import is_wsl as _is_wsl

 logger = logging.getLogger(__name__)
-_PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


 def save_clipboard_image(dest: Path) -> bool:
@@ -379,13 +378,10 @@ def _wayland_save(dest: Path) -> bool:
            dest.unlink(missing_ok=True)
            return False

-        # save_clipboard_image() promises a PNG output path. Wayland can offer
-        # JPEG/GIF/WebP/BMP payloads, so normalize every non-PNG result before
-        # returning success.
-        if mime != "image/png":
-            if not _convert_to_png(dest) or not _is_png_file(dest):
-                dest.unlink(missing_ok=True)
-                return False
+        # BMP needs conversion to PNG (common in WSLg where only BMP
+        # is bridged from Windows clipboard via RDP).
+        if mime == "image/bmp":
+            return _convert_to_png(dest)

        return True

@@ -437,15 +433,6 @@ def _convert_to_png(path: Path) -> bool:
    return path.exists() and path.stat().st_size > 0


-def _is_png_file(path: Path) -> bool:
-    """Return True when *path* starts with the PNG file signature."""
-    try:
-        with path.open("rb") as f:
-            return f.read(len(_PNG_SIGNATURE)) == _PNG_SIGNATURE
-    except OSError:
-        return False
-
-
 # ── X11 (xclip) ─────────────────────────────────────────────────────────

 def _xclip_has_image() -> bool:
@@ -1,614 +0,0 @@
-"""Migrate Hermes' MCP server config and Codex's installed curated plugins
-to the format Codex expects in ~/.codex/config.toml.
-
-When the user enables the codex_app_server runtime, the codex subprocess
-runs its own MCP client and its own plugin runtime (Linear, Atlassian,
-Asana, plus per-account ChatGPT apps via app/list). For both of those to
-be useful, the user's choices need to be visible to codex too. This
-module:
-
-  1. Reads Hermes' YAML and writes equivalent [mcp_servers.<name>]
-     entries to ~/.codex/config.toml.
-  2. Queries codex's `plugin/list` for the openai-curated marketplace
-     and writes [plugins."<name>@<marketplace>"] entries for any plugin
-     the user has installed=true on their codex CLI. (This is what
-     OpenClaw calls "migrate native codex plugins" — the YouTube-video-
-     worthy bit Pash highlighted: Canva, GitHub, Calendar, Gmail
-     pre-configured.)
-  3. Writes a top-level `default_permissions` profile name so users on
-     this runtime don't get an approval prompt on every write attempt.
-
-What translates (MCP servers):
-  Hermes mcp_servers.<n>.command/args/env  → codex stdio transport
-  Hermes mcp_servers.<n>.url/headers       → codex streamable_http transport
-  Hermes mcp_servers.<n>.timeout           → codex tool_timeout_sec
-  Hermes mcp_servers.<n>.connect_timeout   → codex startup_timeout_sec
-
-What does NOT translate (warned + skipped):
-  Hermes-specific keys (sampling, etc.) — codex's MCP client has no
-  equivalent. Listed in the per-server skipped[] field of the report.
-
-What's NOT migrated (intentional):
-  AGENTS.md — codex respects this file natively in its cwd. Hermes' own
-  AGENTS.md (project-level) is already in the worktree, so codex picks
-  it up without translation. No code needed.
-"""
-
-from __future__ import annotations
-
-import logging
-import os
-from dataclasses import dataclass, field
-from pathlib import Path
-from typing import Any, Optional
-
-logger = logging.getLogger(__name__)
-
-
-# Marker comments wrapping the managed section so re-runs can detect what's
-# ours and what's user-edited. Strip is a no-op without the start marker.
-MIGRATION_MARKER = (
-    "# managed by hermes-agent — `hermes codex-runtime migrate` regenerates this section"
-)
-MIGRATION_END_MARKER = (
-    "# end hermes-agent managed section"
-)
-
-
-@dataclass
-class MigrationReport:
-    """Outcome of a migration pass."""
-
-    target_path: Optional[Path] = None
-    migrated: list[str] = field(default_factory=list)
-    skipped_keys_per_server: dict[str, list[str]] = field(default_factory=dict)
-    migrated_plugins: list[str] = field(default_factory=list)
-    plugin_query_error: Optional[str] = None
-    wrote_permissions_default: Optional[str] = None
-    errors: list[str] = field(default_factory=list)
-    written: bool = False
-    dry_run: bool = False
-
-    def summary(self) -> str:
-        lines = []
-        if self.dry_run:
-            lines.append(f"(dry run) Would write {self.target_path}")
-        elif self.written:
-            lines.append(f"Wrote {self.target_path}")
-        if self.migrated:
-            lines.append(f"Migrated {len(self.migrated)} MCP server(s):")
-            for name in self.migrated:
-                skipped = self.skipped_keys_per_server.get(name, [])
-                note = (
-                    f" (skipped: {', '.join(skipped)})" if skipped else ""
-                )
-                lines.append(f"  - {name}{note}")
-        else:
-            lines.append("No MCP servers found in Hermes config.")
-        if self.migrated_plugins:
-            lines.append(
-                f"Migrated {len(self.migrated_plugins)} native Codex plugin(s):"
-            )
-            for name in self.migrated_plugins:
-                lines.append(f"  - {name}")
-        elif self.plugin_query_error:
-            lines.append(f"Codex plugin discovery skipped: {self.plugin_query_error}")
-        if self.wrote_permissions_default:
-            lines.append(
-                f"Wrote default_permissions = "
-                f"{self.wrote_permissions_default!r}"
-            )
-        for err in self.errors:
-            lines.append(f"⚠ {err}")
-        return "\n".join(lines)
-
-
-# Hermes keys the migration recognizes (transport, timeouts, general).
-# Any key that is neither listed here nor in _KEYS_DROPPED_WITH_WARNING
-# is reported in skipped as an unknown Hermes key.
-_KNOWN_HERMES_KEYS = {
-    # transport — stdio
-    "command", "args", "env", "cwd",
-    # transport — http
-    "url", "headers", "transport",
-    # timeouts
-    "timeout", "connect_timeout",
-    # general
-    "enabled", "description",
-}
-
-# Hermes keys with no codex equivalent; dropped and listed in skipped.
-_KEYS_DROPPED_WITH_WARNING = {
-    # Hermes' sampling subsection — codex MCP has no equivalent
-    "sampling",
-}
-
-
-def _translate_one_server(
-    name: str, hermes_cfg: dict
-) -> tuple[Optional[dict], list[str]]:
-    """Translate one Hermes MCP server config to the codex inline-table dict
-    representation. Returns (codex_entry, skipped_keys).
-
-    codex_entry is a dict ready for TOML serialization, or None when the
-    server can't be translated (e.g. neither command nor url present)."""
-    if not isinstance(hermes_cfg, dict):
-        return None, []
-
-    skipped: list[str] = []
-    out: dict[str, Any] = {}
-
-    has_command = bool(hermes_cfg.get("command"))
-    has_url = bool(hermes_cfg.get("url"))
-
-    if has_command and has_url:
-        skipped.append("url (both command and url set; preferring stdio)")
-        has_url = False
-
-    if has_command:
-        # Stdio transport
-        out["command"] = str(hermes_cfg["command"])
-        args = hermes_cfg.get("args") or []
-        if args:
-            out["args"] = [str(a) for a in args]
-        env = hermes_cfg.get("env") or {}
-        if env:
-            # Codex expects string values
-            out["env"] = {str(k): str(v) for k, v in env.items()}
-        cwd = hermes_cfg.get("cwd")
-        if cwd:
-            out["cwd"] = str(cwd)
-    elif has_url:
-        # streamable_http transport (codex covers both http and SSE here)
-        out["url"] = str(hermes_cfg["url"])
-        headers = hermes_cfg.get("headers") or {}
-        if headers:
-            out["http_headers"] = {str(k): str(v) for k, v in headers.items()}
-        # Hermes' transport: sse hint is informational; codex auto-negotiates
-        if hermes_cfg.get("transport") == "sse":
-            skipped.append("transport=sse (codex auto-negotiates)")
-    else:
-        return None, ["no command or url field"]
-
-    # Timeouts
-    if "timeout" in hermes_cfg:
-        try:
-            out["tool_timeout_sec"] = float(hermes_cfg["timeout"])
-        except (TypeError, ValueError):
-            skipped.append("timeout (not numeric)")
-    if "connect_timeout" in hermes_cfg:
-        try:
-            out["startup_timeout_sec"] = float(hermes_cfg["connect_timeout"])
-        except (TypeError, ValueError):
-            skipped.append("connect_timeout (not numeric)")
-
-    # Enabled flag (codex defaults to true so we only emit when explicitly false)
-    if hermes_cfg.get("enabled") is False:
-        out["enabled"] = False
-
-    # Detect keys we explicitly drop with warning
-    for key in hermes_cfg:
-        if key in _KEYS_DROPPED_WITH_WARNING:
-            skipped.append(f"{key} (no codex equivalent)")
-        elif key not in _KNOWN_HERMES_KEYS:
-            skipped.append(f"{key} (unknown Hermes key)")
-
-    return out, skipped
-
-
-def _format_toml_value(value: Any) -> str:
-    """Minimal TOML value formatter for the value types we emit.
-
-    We only emit strings, numbers, booleans, and arrays or inline tables of
-    those (no arrays of tables). This covers all that codex's MCP schema accepts."""
-    if isinstance(value, bool):
-        return "true" if value else "false"
-    if isinstance(value, (int, float)):
-        return repr(value)
-    if isinstance(value, str):
-        # Escape per TOML basic-string rules. Order matters: backslash
-        # first so the other escapes don't get re-escaped.
-        # Control characters (newline, tab, etc.) must use \-escapes
-        # because TOML basic strings don't allow literal control chars
-        # — passing them through would produce invalid TOML that codex
-        # would refuse to load. Paths usually don't contain control
-        # chars but env-var passthrough (HERMES_HOME, PYTHONPATH) could
-        # in pathological cases.
-        escaped = (
-            value
-            .replace("\\", "\\\\")
-            .replace('"', '\\"')
-            .replace("\b", "\\b")
-            .replace("\t", "\\t")
-            .replace("\n", "\\n")
-            .replace("\f", "\\f")
-            .replace("\r", "\\r")
-        )
-        return f'"{escaped}"'
-    if isinstance(value, list):
-        items = ", ".join(_format_toml_value(v) for v in value)
-        return f"[{items}]"
-    if isinstance(value, dict):
-        items = ", ".join(
-            f'{_quote_key(k)} = {_format_toml_value(v)}' for k, v in value.items()
-        )
-        return "{ " + items + " }" if items else "{}"
-    raise ValueError(f"Unsupported TOML value type: {type(value).__name__}")
-
-
-def _quote_key(key: str) -> str:
-    """Return key bare-or-quoted depending on whether it's a valid bare key."""
-    if all(c.isalnum() or c in "-_" for c in key) and key:
-        return key
-    escaped = key.replace("\\", "\\\\").replace('"', '\\"')
-    return f'"{escaped}"'
-
-def render_codex_toml_section(
-    servers: dict[str, dict],
-    plugins: Optional[list[dict]] = None,
-    default_permission_profile: Optional[str] = None,
-) -> str:
-    """Render the managed [mcp_servers.<n>] / [plugins.<id>] / [permissions]
-    block for ~/.codex/config.toml.
-
-    Args:
-        servers: dict of MCP server name → translated codex inline-table
-        plugins: optional list of {name, marketplace, enabled} for native
-            Codex plugins to enable. (E.g. the Linear / Atlassian / Asana
-            curated plugins, or per-account ChatGPT apps.)
-        default_permission_profile: when set, write the top-level
-            `default_permissions` key so the user doesn't get an approval
-            prompt on every write attempt; a leading ":" is added if absent.
-            Common values: "workspace-write", "read-only", "full-access".
-    """
-    out = [MIGRATION_MARKER]
-    if not servers and not plugins and not default_permission_profile:
-        out.append("# (no MCP servers, plugins, or permissions configured by Hermes)")
-        out.append(MIGRATION_END_MARKER)
-        return "\n".join(out) + "\n"
-
-    if default_permission_profile:
-        # Codex's config schema: `default_permissions` is a top-level
-        # string referencing a profile name. Built-in profile names start
-        # with ":" (":workspace-write", ":read-only", ":full-access"). The
-        # [permissions] table is for *user-defined* named profiles with
-        # structured fields — not what we want.
-        normalized = (
-            default_permission_profile
-            if default_permission_profile.startswith(":")
-            else f":{default_permission_profile}"
-        )
-        out.append("")
-        out.append(f"default_permissions = {_format_toml_value(normalized)}")
-
-    if servers:
-        for name in sorted(servers.keys()):
-            cfg = servers[name]
-            out.append("")
-            out.append(f"[mcp_servers.{_quote_key(name)}]")
-            for k, v in cfg.items():
-                out.append(f"{_quote_key(k)} = {_format_toml_value(v)}")
-
-    if plugins:
-        for plugin in sorted(plugins, key=lambda p: f"{p.get('name','')}@{p.get('marketplace','')}"):
-            name = plugin.get("name") or ""
-            marketplace = plugin.get("marketplace") or "openai-curated"
-            enabled = bool(plugin.get("enabled", True))
-            qualified = f"{name}@{marketplace}"
-            out.append("")
-            out.append(f'[plugins.{_quote_key(qualified)}]')
-            out.append(f"enabled = {_format_toml_value(enabled)}")
-
-    out.append("")
-    out.append(MIGRATION_END_MARKER)
-    return "\n".join(out) + "\n"
-
-
-def _strip_existing_managed_block(toml_text: str) -> str:
-    """Remove any prior managed section so re-runs idempotently replace it.
-
-    The managed section is everything between MIGRATION_MARKER (start) and
-    MIGRATION_END_MARKER (end), inclusive of both markers. User-edited
-    sections above or below are preserved verbatim.
-
-    Backward compatibility: if the start marker is found but no end marker
-    follows, we fall back to a heuristic that swallows lines until we hit
-    a section header other than [mcp_servers.*], [plugins.*],
-    [permissions], or [permissions.*]; `default_permissions =` keys are
-    swallowed too. This matches what older versions of this code wrote, so
-    re-runs don't break configs from prior Hermes versions."""
-    lines = toml_text.splitlines(keepends=True)
-    out: list[str] = []
-    in_managed = False
-    saw_end_marker = False
-    for line in lines:
-        line_stripped_nl = line.rstrip("\n")
-        if line_stripped_nl == MIGRATION_MARKER:
-            in_managed = True
-            saw_end_marker = False
-            continue
-        if in_managed:
-            if line_stripped_nl == MIGRATION_END_MARKER:
-                in_managed = False
-                saw_end_marker = True
-                continue
-            stripped = line.lstrip()
-            if not saw_end_marker and stripped.startswith("[") and not (
-                stripped.startswith("[mcp_servers")
-                or stripped.startswith("[plugins")
-                or stripped.startswith("[permissions]")
-                or stripped.startswith("[permissions.")
-            ):
-                # Old-format managed block without end marker: bail back
-                # to user content as soon as we see a non-managed section.
-                in_managed = False
-                out.append(line)
-                continue
-            # Otherwise swallow the line.
-            continue
-        out.append(line)
-    return "".join(out)
-
-
-def _query_codex_plugins(
-    codex_home: Optional[Path] = None,
-    timeout: float = 8.0,
-) -> tuple[list[dict], Optional[str]]:
-    """Query codex's `plugin/list` for installed curated plugins.
-
-    Spawns `codex app-server` briefly, sends initialize + plugin/list,
-    extracts plugins where installed=true. Returns (plugins, error).
-    Plugins is a list of {name, marketplace, enabled} dicts ready for
-    render_codex_toml_section().
-
-    On any failure (codex not installed, RPC error, timeout) returns
-    ([], error_message). Migration treats this as non-fatal — MCP
-    servers and permissions still write through.
-    """
-    try:
-        from agent.transports.codex_app_server import CodexAppServerClient
-    except Exception as exc:
-        return [], f"transport unavailable: {exc}"
-
-    try:
-        with CodexAppServerClient(
-            codex_home=str(codex_home) if codex_home else None
-        ) as client:
-            client.initialize(client_name="hermes-migration")
-            resp = client.request("plugin/list", {}, timeout=timeout)
-    except Exception as exc:
-        return [], f"plugin/list query failed: {exc}"
-
-    out: list[dict] = []
-    seen: set[tuple[str, str]] = set()
-    marketplaces = resp.get("marketplaces") or []
-    if not isinstance(marketplaces, list):
-        return [], "plugin/list response missing 'marketplaces'"
-    for marketplace in marketplaces:
-        if not isinstance(marketplace, dict):
-            continue
-        market_name = str(marketplace.get("name") or "openai-curated")
-        plugins = marketplace.get("plugins") or []
-        if not isinstance(plugins, list):
-            continue
-        for plugin in plugins:
-            if not isinstance(plugin, dict):
-                continue
-            installed = bool(plugin.get("installed", False))
-            if not installed:
-                continue
-            # Skip plugins codex itself reports as unavailable (broken
-            # install, missing OAuth, removed from marketplace, etc.).
-            # Cf. openclaw/openclaw#80815 — OpenClaw learned to gate
-            # migration on app readiness to avoid writing config that
-            # would fail at activation time. Our migration writes to
-            # codex's config.toml directly, so a broken plugin would
-            # surface as a codex error on first use. Skipping it here
-            # keeps the migrated config clean and the user's first
-            # codex turn from failing.
-            availability = str(plugin.get("availability") or "").upper()
-            if availability and availability != "AVAILABLE":
-                logger.debug(
-                    "skipping plugin %s: availability=%s",
-                    plugin.get("name"), availability,
-                )
-                continue
-            name = str(plugin.get("name") or "")
-            if not name:
-                continue
-            key = (name, market_name)
-            if key in seen:
-                continue
-            seen.add(key)
-            # Carry forward whatever 'enabled' codex reports — defaults to
-            # true for installed plugins. This is the same shape OpenClaw
-            # writes when migrating native codex plugins.
-            out.append({
-                "name": name,
-                "marketplace": market_name,
-                "enabled": bool(plugin.get("enabled", True)),
-            })
-    return out, None
-
-
-def _build_hermes_tools_mcp_entry() -> dict:
-    """Build the codex stdio-transport entry that launches Hermes' own
-    tool surface as an MCP server. Codex's subprocess will call back into
-    this for browser/web/delegate_task/vision/memory/skills tools.
-
-    The command runs the worktree's Python via the current sys.executable
-    so a hermes installed under /opt/, /usr/local/, or a venv all work.
-    HERMES_HOME and PYTHONPATH are passed through so the spawned process
-    sees the same config + module layout the user is running."""
-    import sys
-
-    env: dict[str, str] = {}
-    # HERMES_HOME passes through if set so the MCP subprocess sees the
-    # same config / auth / sessions DB as the parent CLI.
-    hermes_home = os.environ.get("HERMES_HOME")
-    if hermes_home:
-        env["HERMES_HOME"] = hermes_home
-    # PYTHONPATH passes through so a worktree-launched hermes finds the
-    # branch's modules instead of the installed package.
-    pythonpath = os.environ.get("PYTHONPATH")
-    if pythonpath:
-        env["PYTHONPATH"] = pythonpath
-    # Quiet mode + redaction defaults so the MCP wire stays clean.
-    env["HERMES_QUIET"] = "1"
-    env["HERMES_REDACT_SECRETS"] = os.environ.get("HERMES_REDACT_SECRETS", "true")
-
-    out: dict[str, Any] = {
-        "command": sys.executable,
-        "args": ["-m", "agent.transports.hermes_tools_mcp_server"],
-    }
-    if env:
-        out["env"] = env
-    # Generous timeouts — browser_navigate or delegate_task can take a
-    # while; we don't want codex's MCP client to give up too early.
-    out["startup_timeout_sec"] = 30.0
-    out["tool_timeout_sec"] = 600.0
-    return out
-
-
-def migrate(
-    hermes_config: dict,
-    *,
-    codex_home: Optional[Path] = None,
-    dry_run: bool = False,
-    discover_plugins: bool = True,
-    default_permission_profile: Optional[str] = ":workspace",
-    expose_hermes_tools: bool = True,
-) -> MigrationReport:
-    """Translate Hermes mcp_servers config + Codex curated plugins into
-    ~/.codex/config.toml.
-
-    Args:
-        hermes_config: full ~/.hermes/config.yaml dict
-        codex_home: override CODEX_HOME (defaults to ~/.codex)
-        dry_run: skip the actual write; report what would happen
-        discover_plugins: when True (default), query `plugin/list` against
-            the live codex CLI to migrate any installed curated plugins
-            into [plugins."<name>@<marketplace>"] entries. Set False to
-            skip the subprocess spawn (for tests or restricted environments).
-        default_permission_profile: when set (default ":workspace"), write
-            top-level `default_permissions = "<name>"` so users on this
-            runtime don't get an approval prompt on every write attempt.
-            Built-in codex profile names are ":workspace", ":read-only",
-            ":danger-no-sandbox" (note the leading ":"). Also accepts a
-            user-defined profile name (no leading ":") that the user has
-            configured in their own [permissions.<name>] table. Set None
-            to leave permissions unset and let codex use its compiled-in
-            default (which is read-only).
-        expose_hermes_tools: when True (default), register Hermes' own
-            tool surface (web_search, browser_*, delegate_task, vision,
-            memory, skills, etc.) as an MCP server in ~/.codex/config.toml
-            so the codex subprocess can call back into Hermes for tools
-            codex doesn't have built in. Set False to opt out.
-    """
-    report = MigrationReport(dry_run=dry_run)
-    codex_home = codex_home or Path.home() / ".codex"
-    target = codex_home / "config.toml"
-    report.target_path = target
-
-    hermes_servers = (hermes_config or {}).get("mcp_servers") or {}
-    if not isinstance(hermes_servers, dict):
-        report.errors.append(
-            "mcp_servers in Hermes config is not a dict; cannot migrate."
-        )
-        return report
-
-    translated: dict[str, dict] = {}
-    for name, cfg in hermes_servers.items():
-        out, skipped = _translate_one_server(str(name), cfg or {})
-        if out is None:
-            report.errors.append(
-                f"server {name!r} skipped: {', '.join(skipped) or 'no transport configured'}"
-            )
-            continue
-        translated[str(name)] = out
-        if skipped:
-            report.skipped_keys_per_server[str(name)] = skipped
-        report.migrated.append(str(name))
-
-    # Discover installed Codex curated plugins. Best-effort — never blocks
-    # the migration if codex is unreachable or the RPC fails.
-    plugins: list[dict] = []
-    if discover_plugins and not dry_run:
-        plugins, plugin_err = _query_codex_plugins(codex_home=codex_home)
-        if plugin_err:
-            report.plugin_query_error = plugin_err
-        for p in plugins:
-            report.migrated_plugins.append(f"{p['name']}@{p['marketplace']}")
-
-    # Track whether we wrote a default permission profile so the report
-    # surfaces it to the user.
-    if default_permission_profile:
-        report.wrote_permissions_default = default_permission_profile
-
-    # Inject Hermes' own tool surface as an MCP server so the spawned
-    # codex subprocess can call back into Hermes for the tools codex
-    # doesn't ship with — web_search, browser_*, delegate_task, vision,
-    # memory, skills, session_search, image_generate, text_to_speech.
-    # The server itself is agent/transports/hermes_tools_mcp_server.py
-    # and is launched on demand by codex (stdio MCP).
-    if expose_hermes_tools:
-        translated["hermes-tools"] = _build_hermes_tools_mcp_entry()
-        if "hermes-tools" not in report.migrated:
-            report.migrated.append("hermes-tools")
-
-    # Build the new managed block
-    managed_block = render_codex_toml_section(
-        translated, plugins=plugins,
-        default_permission_profile=default_permission_profile,
-    )
-
-    # Read existing codex config if any, strip the prior managed block,
-    # append the new one.
-    if target.exists():
-        try:
-            existing = target.read_text(encoding="utf-8")
-        except Exception as exc:
-            report.errors.append(f"could not read {target}: {exc}")
-            return report
-        without_managed = _strip_existing_managed_block(existing)
-        # Ensure exactly one blank line between user content and managed block
-        if without_managed and not without_managed.endswith("\n"):
-            without_managed += "\n"
-        new_text = (
-            without_managed.rstrip("\n") + "\n\n" + managed_block
-            if without_managed.strip()
-            else managed_block
-        )
-    else:
-        new_text = managed_block
-
-    if dry_run:
-        return report
-
-    try:
-        codex_home.mkdir(parents=True, exist_ok=True)
-        # Atomic write: write to a temp file in the same directory, then
-        # rename it over the target. Path.replace() is atomic on POSIX and
-        # overwrites an existing target on Windows. A crash mid-write thus
-        # never leaves a half-written config.toml that codex refuses to load.
-        import tempfile
-        tmp_fd, tmp_path_str = tempfile.mkstemp(
-            prefix=".config.toml.", dir=str(codex_home)
-        )
-        tmp_path = Path(tmp_path_str)
-        try:
-            with os.fdopen(tmp_fd, "w", encoding="utf-8") as fh:
-                fh.write(new_text)
-            tmp_path.replace(target)
-        except Exception:
-            # Clean up the temp file if the rename didn't happen.
-            try:
-                if tmp_path.exists():
-                    tmp_path.unlink()
-            except Exception:
-                pass
-            raise
-        report.written = True
-    except Exception as exc:
-        report.errors.append(f"could not write {target}: {exc}")
-    return report
@@ -1,266 +0,0 @@
-"""Shared logic for the /codex-runtime slash command.
-
-Toggles `model.openai_runtime` between "auto" (= chat_completions, Hermes'
-default) and "codex_app_server" (= hand turns to a codex subprocess).
-
-Both CLI (cli.py) and gateway (gateway/run.py) call into this module so the
-behavior stays identical across surfaces.
-
-The actual runtime resolution happens in hermes_cli.runtime_provider's
-_maybe_apply_codex_app_server_runtime() helper, which reads the persisted
-config value. This module just persists the value and reports the change.
-"""
-
-from __future__ import annotations
-
-import logging
-from dataclasses import dataclass
-from typing import Optional
-
-logger = logging.getLogger(__name__)
-
-
-VALID_RUNTIMES = ("auto", "codex_app_server")
-
-
-@dataclass
-class CodexRuntimeStatus:
-    """Result of a /codex-runtime invocation. Callers render this however
-    suits their surface (CLI uses Rich panels, gateway sends a text message)."""
-
-    success: bool
-    new_value: Optional[str] = None
-    old_value: Optional[str] = None
-    message: str = ""
-    requires_new_session: bool = False
-    codex_binary_ok: bool = True
-    codex_version: Optional[str] = None
-
-
-def parse_args(arg_string: str) -> tuple[Optional[str], list[str]]:
-    """Parse the slash-command argument string. Returns (value, errors).
-
-    No args         → value=None (caller reports the current state)
-    'auto' / 'codex_app_server' or a synonym ('on', 'off', ...) → canonical value
-    anything else   → (None, [error message])
-    """
-    raw = (arg_string or "").strip().lower()
-    if not raw:
-        return None, []
-    # Accept human-friendly synonyms
-    if raw in ("on", "codex", "enable"):
-        return "codex_app_server", []
-    if raw in ("off", "default", "disable", "hermes"):
-        return "auto", []
-    if raw in VALID_RUNTIMES:
-        return raw, []
-    return None, [
-        f"Unknown runtime {raw!r}. Use one of: auto, codex_app_server, on, off"
-    ]
-
-
-def get_current_runtime(config: dict) -> str:
-    """Read the current `model.openai_runtime` value from a config dict.
-    Returns 'auto' for unset / empty / unrecognized values."""
-    if not isinstance(config, dict):
-        return "auto"
-    model_cfg = config.get("model") or {}
-    if not isinstance(model_cfg, dict):
-        return "auto"
-    value = str(model_cfg.get("openai_runtime") or "").strip().lower()
-    if value in VALID_RUNTIMES:
-        return value
-    return "auto"
-
-
-def set_runtime(config: dict, new_value: str) -> str:
-    """Mutate the config dict in place to persist the new runtime value.
-    Returns the previous value for callers that want to report a delta."""
-    if new_value not in VALID_RUNTIMES:
-        raise ValueError(
-            f"invalid runtime {new_value!r}; must be one of {VALID_RUNTIMES}"
-        )
-    old = get_current_runtime(config)
-    if not isinstance(config.get("model"), dict):
-        config["model"] = {}
-    config["model"]["openai_runtime"] = new_value
-    return old
-
-
-def check_codex_binary_ok() -> tuple[bool, Optional[str]]:
-    """Best-effort verification that codex CLI is installed at acceptable
-    version. Returns (ok, version_or_message)."""
-    try:
-        from agent.transports.codex_app_server import check_codex_binary
-
-        return check_codex_binary()
-    except Exception as exc:  # pragma: no cover
-        return False, f"codex check failed: {exc}"
-
-
-def apply(
-    config: dict,
-    new_value: Optional[str],
-    *,
-    persist_callback=None,
-) -> CodexRuntimeStatus:
-    """Top-level entry point used by both CLI and gateway handlers.
-
-    Args:
-        config: in-memory config dict (will be mutated when new_value is set)
-        new_value: desired runtime; None means "show current state only"
-        persist_callback: optional callable taking the mutated config dict
-            and persisting it to disk. Skipped when None (used by tests).
-
-    Returns: CodexRuntimeStatus describing the outcome.
-    """
-    current = get_current_runtime(config)
-
-    # Cache the codex binary check for this apply() call. Subprocess spawn
-    # is cheap (~50ms for `codex --version`), but the enable path would
-    # otherwise run it twice (gate, then success message).
-    # None = not yet checked; (bool, Optional[str]) = cached result.
-    _binary_check: Optional[tuple[bool, Optional[str]]] = None
-
-    def _check_binary_cached() -> tuple[bool, Optional[str]]:
-        nonlocal _binary_check
-        if _binary_check is None:
-            _binary_check = check_codex_binary_ok()
-        return _binary_check
-
-    # Read-only call: just report state
-    if new_value is None:
-        ok, ver = _check_binary_cached()
-        msg = (
-            f"openai_runtime: {current}\n"
-            f"codex CLI: {'OK ' + ver if ok else 'not available — ' + (ver or 'install with `npm i -g @openai/codex`')}"
-        )
-        return CodexRuntimeStatus(
-            success=True,
-            new_value=current,
-            old_value=current,
-            message=msg,
-            codex_binary_ok=ok,
-            codex_version=ver if ok else None,
-        )
-
-    # No change requested
-    if new_value == current:
-        return CodexRuntimeStatus(
-            success=True,
-            new_value=current,
-            old_value=current,
-            message=f"openai_runtime already set to {current}",
-        )
-
-    # If switching ON, verify codex CLI is installed before persisting —
-    # an opt-in toggle that silently fails on the first turn is the
-    # worst possible UX. Block here with a clear install hint.
-    if new_value == "codex_app_server":
-        ok, ver_or_msg = _check_binary_cached()
-        if not ok:
-            return CodexRuntimeStatus(
-                success=False,
-                new_value=None,
-                old_value=current,
-                message=(
-                    "Cannot enable codex_app_server runtime: "
-                    f"{ver_or_msg or 'codex CLI not available'}\n"
-                    "Install with: npm i -g @openai/codex"
-                ),
-                codex_binary_ok=False,
-                codex_version=None,
-            )
-
-    set_runtime(config, new_value)
-    if persist_callback is not None:
-        try:
-            persist_callback(config)
-        except Exception as exc:
-            logger.exception("failed to persist openai_runtime change")
-            return CodexRuntimeStatus(
-                success=False,
-                new_value=new_value,
-                old_value=current,
-                message=f"updated config in memory but persist failed: {exc}",
-            )
-
-    msg_lines = [
-        f"openai_runtime: {current} → {new_value}",
-    ]
-    if new_value == "codex_app_server":
-        ok, ver = _check_binary_cached()
-        if ok:
-            msg_lines.append(f"codex CLI: {ver}")
-        # Auto-migrate Hermes' MCP servers + Codex's installed curated
-        # plugins into ~/.codex/config.toml so the spawned codex subprocess
-        # sees the same tool surface AND can call back into Hermes for
-        # browser/web/delegate_task/vision/memory tools (#7 fix).
-        # Failures are non-fatal — the runtime change still proceeds.
-        try:
-            from hermes_cli.codex_runtime_plugin_migration import migrate
-            mig_report = migrate(config)
-            # Tools/MCP servers (excluding the hermes-tools callback,
-            # which is internal plumbing — surface separately).
-            user_servers = [
-                s for s in mig_report.migrated if s != "hermes-tools"
-            ]
-            if user_servers:
-                msg_lines.append(
-                    f"Migrated {len(user_servers)} MCP server(s): "
-                    f"{', '.join(user_servers)}"
-                )
-            # Native Codex plugin migration (Linear, GitHub, etc.)
-            if mig_report.migrated_plugins:
-                msg_lines.append(
-                    f"Migrated {len(mig_report.migrated_plugins)} native "
-                    f"Codex plugin(s): {', '.join(mig_report.migrated_plugins)}"
-                )
-            elif mig_report.plugin_query_error:
-                msg_lines.append(
-                    f"Codex plugin discovery skipped: "
-                    f"{mig_report.plugin_query_error}"
-                )
-            # Permissions + Hermes tool callback are always-on production
-            # bits the user benefits from knowing about.
-            if mig_report.wrote_permissions_default:
-                msg_lines.append(
-                    f"Default sandbox: {mig_report.wrote_permissions_default} "
-                    f"(no approval prompt on every write)"
-                )
-            if "hermes-tools" in mig_report.migrated:
-                msg_lines.append(
-                    "Hermes tool callback registered: codex can now use "
-                    "web_search, web_extract, browser_*, vision_analyze, "
-                    "image_generate, skill_view, skills_list, text_to_speech, "
-                    "kanban_* (worker + orchestrator) via MCP."
-                )
-                msg_lines.append(
-                    "  (delegate_task, memory, session_search, todo run "
-                    "only on the default Hermes runtime — they need the "
-                    "agent loop context.)"
-                )
-            msg_lines.append(f"  (config: {mig_report.target_path})")
-            for err in mig_report.errors:
-                msg_lines.append(f"⚠ MCP migration: {err}")
-        except Exception as exc:
-            msg_lines.append(f"⚠ MCP migration skipped: {exc}")
-        msg_lines.append(
-            "OpenAI/Codex turns now run through `codex app-server` "
-            "(terminal/file ops/patching inside Codex; "
-            "Hermes tools available via MCP callback)."
-        )
-        msg_lines.append(
-            "Effective on next session — current cached agent keeps "
-            "the prior runtime to preserve prompt cache."
-        )
-    else:
-        msg_lines.append("OpenAI/Codex turns will use the default Hermes runtime.")
-        msg_lines.append("Effective on next session.")
-    return CodexRuntimeStatus(
-        success=True,
-        new_value=new_value,
-        old_value=current,
-        message="\n".join(msg_lines),
-        requires_new_session=True,
-    )
@@ -104,8 +104,6 @@ COMMAND_REGISTRY: list[CommandDef] = [
               args_hint="<prompt>"),
    CommandDef("goal", "Set a standing goal Hermes works on across turns until achieved", "Session",
               args_hint="[text | pause | resume | clear | status]"),
-    CommandDef("subgoal", "Add or manage extra criteria on the active goal", "Session",
-               args_hint="[text | remove N | clear]"),
    CommandDef("status", "Show session info", "Session"),
    CommandDef("whoami", "Show your slash command access (admin / user)", "Info"),
    CommandDef("profile", "Show active profile name and home directory", "Info"),
@@ -122,8 +120,6 @@ COMMAND_REGISTRY: list[CommandDef] = [
               cli_only=True),
    CommandDef("model", "Switch model for this session", "Configuration",
               aliases=("provider",), args_hint="[model] [--provider name] [--global]"),
-    CommandDef("codex-runtime", "Toggle codex app-server runtime for OpenAI/Codex models",
-               "Configuration", args_hint="[auto|codex_app_server]"),
    CommandDef("gquota", "Show Google Gemini Code Assist quota usage", "Info",
               cli_only=True),

@@ -472,23 +468,20 @@ def telegram_bot_commands() -> list[tuple[str, str]]:

    Telegram command names cannot contain hyphens, so they are replaced with
    underscores.  Aliases are skipped -- Telegram shows one menu entry per
-    canonical command.
+    canonical command. Commands that require arguments are skipped because
+    selecting a Telegram BotCommand sends only ``/command`` and would execute
+    an incomplete command.

-    Built-in commands that require arguments (e.g. /queue, /steer, /background)
-    are **included** because their handlers return usage text when selected
-    without a payload, making them discoverable via autocomplete.
-
-    Plugin-registered slash commands that require arguments are **excluded**
-    because plugins may not provide a no-arg usage fallback.
+    Plugin-registered slash commands are included so plugins get native
+    autocomplete in Telegram without touching core code.
    """
    overrides = _resolve_config_gates()
    result: list[tuple[str, str]] = []
    for cmd in COMMAND_REGISTRY:
        if not _is_gateway_available(cmd, overrides):
            continue
-        # Built-in arg-taking commands are included — their handlers show
-        # usage text when invoked without arguments, and hiding them from
-        # the menu hurts discoverability (issue #24312).
+        if _requires_argument(cmd.args_hint):
+            continue
        tg_name = _sanitize_telegram_name(cmd.name)
        if tg_name:
            result.append((tg_name, cmd.description))
@@ -1366,9 +1359,9 @@ class SlashCommandCompleter(Completer):
            try:
                proc = subprocess.run(
                    cmd, capture_output=True, text=True, timeout=2,
-                    cwd=cwd, encoding="utf-8", errors="replace",
+                    cwd=cwd,
                )
-                if proc.returncode == 0 and proc.stdout and proc.stdout.strip():
+                if proc.returncode == 0 and proc.stdout.strip():
                    raw = proc.stdout.strip().split("\n")
                    # Store relative paths
                    for p in raw[:5000]:
@@ -238,7 +238,7 @@ _hermes() {{
    esac
 }}

-compdef _hermes hermes
+_hermes "$@"
 """


@@ -477,12 +477,6 @@ DEFAULT_CONFIG = {
        # threshold before escalating to a full timeout.  The warning fires
        # once per run and does not interrupt the agent.  0 = disable warning.
        "gateway_timeout_warning": 900,
-        # Maximum time (seconds) the gateway will block an agent waiting for
-        # a clarify-tool response from the user.  Hit this and the agent
-        # unblocks with "[user did not respond within Xm]" so it can adapt
-        # rather than pinning the running-agent guard forever.  CLI clarify
-        # blocks indefinitely (input() is synchronous) and ignores this.
-        "clarify_timeout": 600,
        # Periodic "still working" notification interval (seconds).
        # Sends a status message every N seconds so the user knows the
        # agent hasn't died during long tasks.  0 = disable notifications.
@@ -634,12 +628,6 @@ DEFAULT_CONFIG = {
            # so the server maps it to a persistent Firefox profile automatically.
            # When false (default), each session gets a random userId (ephemeral).
            "managed_persistence": False,
-            # Optional externally managed Camofox identity. Useful when another
-            # app owns the visible browser and Hermes should operate in it.
-            "user_id": "",
-            "session_key": "",
-            # Rehydrate tab_id from Camofox before creating a new tab.
-            "adopt_existing_tab": False,
        },
    },

@@ -731,18 +719,19 @@ DEFAULT_CONFIG = {
        "target_ratio": 0.20,         # fraction of threshold to preserve as recent tail
        "protect_last_n": 20,         # minimum recent messages to keep uncompressed
        "hygiene_hard_message_limit": 400,  # gateway session-hygiene force-compress threshold by message count
-        "protect_first_n": 3,         # non-system head messages always preserved
-                                      # verbatim, in ADDITION to the system prompt
-                                      # (which is always implicitly protected). Set to
-                                      # 0 for long-running rolling-compaction sessions
-                                      # where you want nothing pinned except the
-                                      # system prompt + rolling summary + recent tail.
    },

    # Anthropic prompt caching (Claude via OpenRouter or native Anthropic API).
    # cache_ttl must be "5m" or "1h" (Anthropic-supported tiers); other values are ignored.
+    # long_lived_prefix: when true (default), Claude on Anthropic / OpenRouter / Nous
+    #   Portal uses a split layout: tools[-1] + stable system prefix at long_lived_ttl
+    #   (cross-session cache), last 2 messages at cache_ttl (within-session rolling).
+    #   Set false to keep the legacy "system + last 3 messages" single-tier layout.
+    # long_lived_ttl: TTL for the cross-session prefix tier ("5m" or "1h"; default "1h").
    "prompt_caching": {
        "cache_ttl": "5m",
+        "long_lived_prefix": True,
+        "long_lived_ttl": "1h",
    },

    # OpenRouter-specific settings.
@@ -928,14 +917,6 @@ DEFAULT_CONFIG = {
        "persistent_output": True,
        "persistent_output_max_lines": 200,
        "inline_diffs": True,     # Show inline diff previews for write actions (write_file, patch, skill_manage)
-        # File-mutation verifier footer.  When true (default), the agent
-        # appends a one-line advisory to its final response whenever a
-        # write_file / patch call failed during the turn and was never
-        # superseded by a successful write to the same path.  This catches
-        # the "batch of parallel patches, half fail, model claims success"
-        # class of over-claim that otherwise forces users to run
-        # `git status` to verify edits landed.  Set false to suppress.
-        "file_mutation_verifier": True,
        "show_cost": False,       # Show $ cost in the status bar (off by default)
        "skin": "default",
        # UI language for static user-facing messages (approval prompts, a
@@ -977,21 +958,6 @@ DEFAULT_CONFIG = {
    # Web dashboard settings
    "dashboard": {
        "theme": "default",  # Dashboard visual theme: "default", "midnight", "ember", "mono", "cyberpunk", "rose"
-        # Hide the token/cost analytics surfaces (Analytics page, token bars and
-        # cost figures on the Models page) by default.  The numbers shown there
-        # are a local debug estimate: they only count successful main-agent
-        # responses with a usable ``response.usage``, and silently exclude every
-        # auxiliary call (context compression, title generation, vision,
-        # session search, web extract, smart approval, MCP routing, plugin LLM
-        # access) plus provider-side retries, fallback attempts, and any call
-        # whose usage block didn't come back.  Cache writes are also missing
-        # from the API response.  On models with heavy auxiliary traffic
-        # (Kimi K2.6, MiniMax M2.7) the local total can be 10x-100x lower than
-        # the provider bill, which is worse than hiding the numbers entirely
-        # because they look precise enough to compare against the provider.
-        # Set this to True to re-enable the surfaces with the understanding
-        # that the numbers are a local lower-bound estimate, not billing.
-        "show_token_analytics": False,
    },

    # Privacy settings
@@ -1227,6 +1193,45 @@ DEFAULT_CONFIG = {
        },
    },

+    # Post-edit lint behaviour. Hermes runs a syntax check after every
+    # write/patch and surfaces only the errors that edit introduced (see
+    # ``ShellFileOperations._check_lint_delta`` in tools/file_operations.py).
+    # The defaults below leave the legacy shell linters (``npx tsc --noEmit``,
+    # ``go vet``, ``rustfmt --check``, ``py_compile``) in charge — the LSP
+    # path is opt-in until container/SSH backends grow a way to host the
+    # language server inside the sandbox.
+    "lint": {
+        "lsp": {
+            # Master switch. When false, ``_check_lint`` skips LSP entirely
+            # and the in-process / shell linter table runs as before.
+            "enabled": False,
+            # Per-language server launch command. Each value is either a
+            # whitespace-separated string or a list. Missing languages
+            # fall back to ``tools.lsp_lint._DEFAULT_SERVERS``.
+            "servers": {
+                "typescript": "typescript-language-server --stdio",
+                "typescriptreact": "typescript-language-server --stdio",
+                "javascript": "typescript-language-server --stdio",
+                "javascriptreact": "typescript-language-server --stdio",
+                "rust": "rust-analyzer",
+                "go": "gopls",
+            },
+            # Wall-clock timeout (seconds) for a single ``didOpen`` → diagnostics
+            # round-trip. Servers that take longer than this on cold start
+            # (notably rust-analyzer indexing a fresh checkout) cause the
+            # caller to fall through to the shell linter.
+            "diagnostic_timeout": 10,
+            # Settle window (milliseconds). Typescript-language-server
+            # publishes an empty diagnostics batch while the program graph
+            # loads, then re-publishes once the file is fully analysed.
+            # Re-snapshotting after this delay catches the real verdict.
+            "settle_ms": 400,
+            # Shut down a long-lived server process after this many seconds
+            # of inactivity. The reaper runs in-process; no daemons.
+            "idle_shutdown": 600,
+        },
+    },
+
    # Honcho AI-native memory -- reads ~/.honcho/config.json as single source of truth.
    # This section is only needed for hermes-specific overrides; everything else
    # (apiKey, workspace, peerName, sessions, enabled) comes from the global config.
@@ -1250,7 +1255,6 @@ DEFAULT_CONFIG = {
        "free_response_channels": "",  # Comma-separated channel IDs where bot responds without mention
        "allowed_channels": "",        # If set, bot ONLY responds in these channel IDs (whitelist)
        "auto_thread": True,           # Auto-create threads on @mention in channels (like Slack)
-        "thread_require_mention": False,  # If True, require @mention in threads too (multi-bot threads)
        "reactions": True,             # Add 👀/✅/❌ reactions to messages during processing
        "channel_prompts": {},         # Per-channel ephemeral system prompts (forum parents apply to child threads)
        # Opt-in DM role-based auth (#12136). By default, DISCORD_ALLOWED_ROLES
@@ -1367,21 +1371,6 @@ DEFAULT_CONFIG = {
            "domains": [],
            "shared_files": [],
        },
-        # Acknowledged supply-chain security advisories. Each entry is the
-        # ID of an advisory the user has read and acted on (uninstalled the
-        # compromised package, rotated credentials). Acked advisories no
-        # longer trigger the startup banner. Add via `hermes doctor --ack
-        # <id>`; remove by editing the list directly. See
-        # ``hermes_cli/security_advisories.py`` for the catalog.
-        "acked_advisories": [],
-        # Allow Hermes to lazy-install opt-in backend packages from PyPI
-        # the first time the user enables a backend that needs them
-        # (e.g. installing ``elevenlabs`` when the user picks ElevenLabs as
-        # their TTS provider). Set to false to require explicit
-        # ``pip install`` for everything beyond the base set — appropriate
-        # for restricted networks, audited environments, or air-gapped
-        # systems where any runtime install is unacceptable.
-        "allow_lazy_installs": True,
    },

    "cron": {
@@ -1520,53 +1509,6 @@ DEFAULT_CONFIG = {
        "backup_keep": 5,
    },

-    # Language Server Protocol — semantic diagnostics from real
-    # language servers (pyright, gopls, rust-analyzer, etc.) wired
-    # into the post-write lint check used by ``write_file`` and
-    # ``patch``.
-    #
-    # LSP is gated on git-workspace detection: when the agent's
-    # cwd (or the file being edited) is inside a git worktree, LSP
-    # runs against that workspace.  When neither is in a git repo,
-    # LSP stays dormant and the in-process syntax check is the only
-    # tier — handy for Telegram/Discord chats where the cwd is the
-    # user's home directory.
-    "lsp": {
-        # Master toggle.  Setting this to false disables the entire
-        # subsystem — no servers spawn, no background event loop, no
-        # cost.
-        "enabled": True,
-
-        # Diagnostic-wait mode for the post-write check.
-        # ``"document"`` waits up to ``wait_timeout`` seconds for the
-        # current file's diagnostics; ``"full"`` additionally requests
-        # workspace-wide diagnostics (slower).
-        "wait_mode": "document",
-        "wait_timeout": 5.0,
-
-        # How to handle missing server binaries.
-        # ``"auto"`` — try to install via npm/go/pip into
-        #              ``<HERMES_HOME>/lsp/bin/`` on first use.
-        # ``"manual"`` — only use binaries already on PATH.
-        # ``"off"`` — alias for ``manual``.
-        "install_strategy": "auto",
-
-        # Per-server overrides.  Each key is a server_id from the
-        # registry (``pyright``, ``typescript``, ``gopls``,
-        # ``rust-analyzer``, etc.) and accepts:
-        #   disabled: true
-        #     — skip this server even when its extensions match
-        #   command: ["full/path/to/server", "--stdio"]
-        #     — pin a custom binary path; bypasses auto-install
-        #   env: {"KEY": "value"}
-        #     — extra env vars passed to the spawned process
-        #   initialization_options: {...}
-        #     — merged into the LSP ``initializationOptions``
-        # Empty by default; the registry defaults work for typical
-        # setups.
-        "servers": {},
-    },
-
    # Config schema version - bump this when adding new required fields
    "_config_version": 23,
 }
@@ -2129,10 +2071,10 @@ OPTIONAL_ENV_VARS = {
        "category": "tool",
    },
    "FAL_KEY": {
-        "description": "FAL API key for image and video generation",
+        "description": "FAL API key for image generation",
        "prompt": "FAL API key",
        "url": "https://fal.ai/",
-        "tools": ["image_generate", "video_generate"],
+        "tools": ["image_generate"],
        "password": True,
        "category": "tool",
    },
@@ -4341,34 +4283,10 @@ def load_env() -> Dict[str, str]:
    concatenated KEY=VALUE pairs on a single line) are handled
    gracefully instead of producing mangled values such as duplicated
    bot tokens.  See #8908.
-
-    The parsed dict is memoised keyed on the .env file mtime, because
-    ``get_env_value()`` is called dozens-to-hundreds of times per
-    interactive menu render (`hermes tools`, `hermes setup`, status
-    panels). Sanitisation is O(lines × known-keys), so re-parsing the
-    same file on every call was burning ~300ms of CPU per `hermes tools`
-    menu paint on top of the OAuth-refresh slowness. The mtime check
-    invalidates the cache when the user edits .env mid-process.
    """
-    global _env_cache
    env_path = get_env_path()
-
-    try:
-        mtime = env_path.stat().st_mtime
-        size = env_path.stat().st_size
-        cache_key = (str(env_path), mtime, size)
-    except FileNotFoundError:
-        cache_key = (str(env_path), None, None)
-    except Exception:
-        cache_key = None
-
-    if cache_key is not None and _env_cache is not None:
-        cached_key, cached_vars = _env_cache
-        if cached_key == cache_key:
-            return dict(cached_vars)
-
-    env_vars: Dict[str, str] = {}
-
+    env_vars = {}
+    
    if env_path.exists():
        # On Windows, open() defaults to the system locale (cp1252) which can
        # fail on UTF-8 .env files. Always use explicit UTF-8; tolerate BOM
@@ -4384,33 +4302,10 @@ def load_env() -> Dict[str, str]:
            if line and not line.startswith('#') and '=' in line:
                key, _, value = line.partition('=')
                env_vars[key.strip()] = value.strip().strip('"\'')
-
-    if cache_key is not None:
-        _env_cache = (cache_key, dict(env_vars))
-
+    
    return env_vars


-# Module-level memo for load_env(), keyed on (path, mtime, size).
-# Editing .env bumps mtime → next load_env() rebuilds. invalidate_env_cache()
-# is the explicit knob for writers that update .env via this module
-# (set_env_value, save_env, etc.) without relying on filesystem mtime
-# resolution.
-_env_cache: Optional[Tuple[Tuple[str, Optional[float], Optional[int]], Dict[str, str]]] = None
-
-
-def invalidate_env_cache() -> None:
-    """Clear the load_env() process-level memo.
-
-    Writers that mutate .env (set_env_value, save_env, etc.) call this
-    to guarantee the next load_env() sees their change even on
-    filesystems with coarse mtime resolution. Reads invalidate naturally
-    via the mtime/size check.
-    """
-    global _env_cache
-    _env_cache = None
-
-
 def _sanitize_env_lines(lines: list) -> list:
    """Fix corrupted .env lines before reading or writing.

@@ -4513,7 +4408,6 @@ def sanitize_env_file() -> int:
            pass
        raise
    _secure_file(env_path)
-    invalidate_env_cache()
    return fixes


@@ -4625,7 +4519,6 @@ def save_env_value(key: str, value: str):
    _secure_file(env_path)

    os.environ[key] = value
-    invalidate_env_cache()


 def remove_env_value(key: str) -> bool:
@@ -4681,7 +4574,6 @@ def remove_env_value(key: str) -> bool:
        _secure_file(env_path)

    os.environ.pop(key, None)
-    invalidate_env_cache()
    return found


@@ -4868,7 +4760,6 @@ def show_config():
        print(f"  Threshold:    {compression.get('threshold', 0.50) * 100:.0f}%")
        print(f"  Target ratio: {compression.get('target_ratio', 0.20) * 100:.0f}% of threshold preserved")
        print(f"  Protect last: {compression.get('protect_last_n', 20)} messages")
-        print(f"  Protect first: {compression.get('protect_first_n', 3)} non-system head messages")
        _aux_comp = config.get('auxiliary', {}).get('compression', {})
        _sm = _aux_comp.get('model', '') or '(auto)'
        print(f"  Model:        {_sm}")
@@ -287,8 +287,7 @@ def _build_apikey_providers_list() -> list:
                (_pp.models_url or (_pp.base_url.rstrip("/") + "/models"))
                if _pp.base_url else None
            )
-            _hc = getattr(_pp, "supports_health_check", True)
-            _static.append((_label, _key_vars, _models_url, _base_var, _hc))
+            _static.append((_label, _key_vars, _models_url, _base_var, True))
    except Exception:
        pass
    return _static
@@ -297,101 +296,19 @@ def _build_apikey_providers_list() -> list:
 def run_doctor(args):
    """Run diagnostic checks."""
    should_fix = getattr(args, 'fix', False)
-    ack_target = getattr(args, 'ack', None)

    # Doctor runs from the interactive CLI, so CLI-gated tool availability
    # checks (like cronjob management) should see the same context as `hermes`.
    os.environ.setdefault("HERMES_INTERACTIVE", "1")
-
-    # Handle `hermes doctor --ack <id>` as a fast path. Persist the ack and
-    # return without running the rest of the diagnostics — the user has
-    # already seen the advisory and just wants to silence it.
-    if ack_target:
-        from hermes_cli.security_advisories import (
-            ADVISORIES,
-            ack_advisory,
-        )
-        valid_ids = {a.id for a in ADVISORIES}
-        if ack_target not in valid_ids:
-            print(color(
-                f"Unknown advisory ID: {ack_target!r}. Known IDs: "
-                f"{', '.join(sorted(valid_ids)) or '(none)'}",
-                Colors.RED,
-            ))
-            sys.exit(2)
-        if ack_advisory(ack_target):
-            print(color(
-                f"  ✓ Acknowledged advisory {ack_target}. "
-                f"It will no longer trigger startup banners.",
-                Colors.GREEN,
-            ))
-        else:
-            print(color(
-                f"  ✗ Failed to persist ack for {ack_target}. "
-                f"Check ~/.hermes/config.yaml is writable.",
-                Colors.RED,
-            ))
-            sys.exit(1)
-        return
-
+    
    issues = []
    manual_issues = []  # issues that can't be auto-fixed
    fixed_count = 0
-
+    
    print()
    print(color("┌─────────────────────────────────────────────────────────┐", Colors.CYAN))
    print(color("│                 🩺 Hermes Doctor                        │", Colors.CYAN))
    print(color("└─────────────────────────────────────────────────────────┘", Colors.CYAN))
-
-    # =========================================================================
-    # Check: Security advisories  (RUNS FIRST — these are the most urgent)
-    # =========================================================================
-    print()
-    print(color("◆ Security Advisories", Colors.CYAN, Colors.BOLD))
-    try:
-        from hermes_cli.security_advisories import (
-            detect_compromised,
-            filter_unacked,
-            full_remediation_text,
-            get_acked_ids,
-        )
-        all_hits = detect_compromised()
-        fresh_hits = filter_unacked(all_hits)
-        if fresh_hits:
-            for hit in fresh_hits:
-                check_fail(
-                    f"{hit.advisory.title}",
-                    f"({hit.package}=={hit.installed_version})",
-                )
-                # Print the full remediation block, indented under the
-                # check_fail header so it reads as a single section.
-                for line in full_remediation_text(hit):
-                    if line:
-                        print(f"    {color(line, Colors.YELLOW)}")
-                    else:
-                        print()
-                # Funnel into the action list so the summary block surfaces it
-                # for users who scroll past the section.
-                manual_issues.append(
-                    f"Resolve security advisory {hit.advisory.id}: "
-                    f"uninstall {hit.package}=={hit.installed_version} and "
-                    f"rotate credentials, then run "
-                    f"`hermes doctor --ack {hit.advisory.id}`."
-                )
-            # Acked-but-still-installed: show as informational so the user
-            # knows the package is still on disk after the ack.
-            acked_ids = get_acked_ids()
-            for h in all_hits:
-                if h.advisory.id in acked_ids:
-                    check_warn(
-                        f"{h.package}=={h.installed_version} still installed "
-                        f"(advisory {h.advisory.id} acknowledged)",
-                    )
-        else:
-            check_ok("No active security advisories")
-    except Exception as e:
-        # Never let a bug in the advisory check block the rest of doctor.
-        check_warn(f"Security advisory check failed: {e}")
    
    # =========================================================================
    # Check: Python version
@@ -2164,7 +2164,7 @@ Environment="PATH={sane_path}"
 Environment="VIRTUAL_ENV={venv_dir}"
 Environment="HERMES_HOME={hermes_home}"
 Restart=always
-RestartSec=5
+RestartSec=60
 RestartMaxDelaySec=300
 RestartSteps=5
 RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
@@ -2199,7 +2199,7 @@ Environment="PATH={sane_path}"
 Environment="VIRTUAL_ENV={venv_dir}"
 Environment="HERMES_HOME={hermes_home}"
 Restart=always
-RestartSec=5
+RestartSec=60
 RestartMaxDelaySec=300
 RestartSteps=5
 RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
@@ -3658,15 +3658,6 @@ def _all_platforms() -> list[dict]:
    ``hermes setup gateway`` without needing the gateway to be running.
    Built-ins keep their dict shape; plugin entries are adapted to the same
    shape with ``_registry_entry`` holding the source.
-
-    Platform-specific gating: some platforms can't be configured on
-    every host. Currently:
-      - Matrix is hidden on Windows. The [matrix] extra pulls
-        ``mautrix[encryption]`` -> ``python-olm``, which has no Windows
-        wheel and needs ``make`` + libolm to build from sdist. There's
-        no native Windows path that works, so we don't offer it in the
-        picker. Users who want Matrix on Windows can run hermes under
-        WSL.
    """
    # Populate the registry so plugin platforms are visible. Idempotent.
    # Bundled platform plugins (``kind: platform``) auto-load unconditionally,
@@ -3680,11 +3671,6 @@ def _all_platforms() -> list[dict]:
        logger.debug("plugin discovery failed during platform enumeration: %s", e)

    platforms = [dict(p) for p in _PLATFORMS]
-
-    # Drop platforms that can't function on this host. See docstring.
-    if sys.platform == "win32":
-        platforms = [p for p in platforms if p.get("key") != "matrix"]
-
    by_key = {p["key"]: p for p in platforms}

    try:
@@ -33,8 +33,8 @@ import json
 import logging
 import re
 import time
-from dataclasses import dataclass, field, asdict
-from typing import Any, Dict, List, Optional, Tuple
+from dataclasses import dataclass, asdict
+from typing import Any, Dict, Optional, Tuple

 logger = logging.getLogger(__name__)

@@ -65,21 +65,6 @@ CONTINUATION_PROMPT_TEMPLATE = (
    "If you are blocked and need input from the user, say so clearly and stop."
 )

-# Used when the user has added one or more /subgoal criteria. Surfaced
-# to the agent verbatim so it sees what to target on the next turn,
-# and surfaced to the judge so the verdict considers them too.
-CONTINUATION_PROMPT_WITH_SUBGOALS_TEMPLATE = (
-    "[Continuing toward your standing goal]\n"
-    "Goal: {goal}\n\n"
-    "Additional criteria the user added mid-loop:\n"
-    "{subgoals_block}\n\n"
-    "Continue working toward the goal AND all additional criteria. Take "
-    "the next concrete step. If you believe the goal and every "
-    "additional criterion are complete, state so explicitly and stop. "
-    "If you are blocked and need input from the user, say so clearly "
-    "and stop."
-)
-

 JUDGE_SYSTEM_PROMPT = (
    "You are a strict judge evaluating whether an autonomous agent has "
@@ -103,23 +88,6 @@ JUDGE_USER_PROMPT_TEMPLATE = (
    "Is the goal satisfied?"
 )

-# Used when the user has added /subgoal criteria. The judge must
-# evaluate ALL of them being met, not just the original goal.
-JUDGE_USER_PROMPT_WITH_SUBGOALS_TEMPLATE = (
-    "Goal:\n{goal}\n\n"
-    "Additional criteria the user added mid-loop (all must also be "
-    "satisfied for the goal to be DONE):\n{subgoals_block}\n\n"
-    "Agent's most recent response:\n{response}\n\n"
-    "Decision: For each numbered criterion above, find concrete "
-    "evidence in the agent's response that the criterion is "
-    "satisfied. Do not accept generic phrases like 'all requirements "
-    "met' or 'implying it was done' — require specific evidence (a "
-    "file contents excerpt, an output line, a command result). If "
-    "ANY criterion lacks specific evidence in the response, the goal "
-    "is NOT done — return CONTINUE.\n\n"
-    "Is the goal AND every additional criterion satisfied?"
-)
-

 # ──────────────────────────────────────────────────────────────────────
 # Dataclass
@@ -140,12 +108,6 @@ class GoalState:
    last_reason: Optional[str] = None
    paused_reason: Optional[str] = None       # why we auto-paused (budget, etc.)
    consecutive_parse_failures: int = 0       # judge-output parse failures in a row
-    # User-added criteria appended mid-loop via the /subgoal command.
-    # When non-empty the judge prompt and continuation prompt both
-    # include them so the agent works toward them and the judge factors
-    # them into the verdict. Backwards-compatible: defaults to empty so
-    # old state_meta rows load unchanged.
-    subgoals: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        return json.dumps(asdict(self), ensure_ascii=False)
@@ -153,10 +115,6 @@ class GoalState:
    @classmethod
    def from_json(cls, raw: str) -> "GoalState":
        data = json.loads(raw)
-        raw_subgoals = data.get("subgoals") or []
-        subgoals: List[str] = []
-        if isinstance(raw_subgoals, list):
-            subgoals = [str(s).strip() for s in raw_subgoals if str(s).strip()]
        return cls(
            goal=data.get("goal", ""),
            status=data.get("status", "active"),
@@ -168,18 +126,8 @@ class GoalState:
            last_reason=data.get("last_reason"),
            paused_reason=data.get("paused_reason"),
            consecutive_parse_failures=int(data.get("consecutive_parse_failures", 0) or 0),
-            subgoals=subgoals,
        )

-    # --- subgoals helpers -------------------------------------------------
-
-    def render_subgoals_block(self) -> str:
-        """Render the subgoals as a numbered ``- N. text`` block. Empty
-        when no subgoals exist."""
-        if not self.subgoals:
-            return ""
-        return "\n".join(f"- {i}. {text}" for i, text in enumerate(self.subgoals, start=1))
-

 # ──────────────────────────────────────────────────────────────────────
 # Persistence (SessionDB state_meta)
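
A note on compatibility for the removal above: as far as the visible hunks show, ``from_json`` reads each field with ``data.get(...)`` rather than passing the decoded dict straight to the constructor, so a ``state_meta`` row written while ``subgoals`` still existed should keep loading, with the extra key ignored. A minimal sketch of that behaviour, assuming a hypothetical trimmed-down ``MiniGoalState`` (not the real ``GoalState``, which has more fields):

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class MiniGoalState:
    """Hypothetical stand-in for GoalState with only a few fields."""

    goal: str
    status: str = "active"
    last_reason: Optional[str] = None
    consecutive_parse_failures: int = 0

    def to_json(self) -> str:
        return json.dumps(asdict(self), ensure_ascii=False)

    @classmethod
    def from_json(cls, raw: str) -> "MiniGoalState":
        data = json.loads(raw)
        # Each field is read explicitly, so unknown keys (such as a
        # leftover "subgoals" list) never reach the constructor.
        return cls(
            goal=data.get("goal", ""),
            status=data.get("status", "active"),
            last_reason=data.get("last_reason"),
            consecutive_parse_failures=int(data.get("consecutive_parse_failures", 0) or 0),
        )


# A row as it might have been persisted before this change.
old_row = json.dumps({"goal": "ship it", "status": "paused", "subgoals": ["add tests"]})
state = MiniGoalState.from_json(old_row)
```

Re-serialising ``state`` with ``to_json()`` drops the stale key, because ``asdict`` only emits declared fields.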
@@ -336,7 +284,6 @@ def judge_goal(
    last_response: str,
    *,
    timeout: float = DEFAULT_JUDGE_TIMEOUT,
-    subgoals: Optional[List[str]] = None,
 ) -> Tuple[str, str, bool]:
    """Ask the auxiliary model whether the goal is satisfied.

@@ -349,11 +296,6 @@ def judge_goal(
    auto-pause after N consecutive parse failures (see
    ``DEFAULT_MAX_CONSECUTIVE_PARSE_FAILURES``).

-    ``subgoals`` is an optional list of user-added criteria (from
-    ``/subgoal``) that the judge must also factor into its DONE/CONTINUE
-    decision. When non-empty the prompt switches to the with-subgoals
-    template; otherwise behavior is identical to the original judge.
-
    This is deliberately fail-open: any error returns ``("continue", "...", False)``
    so a broken judge doesn't wedge progress — the turn budget and the
    consecutive-parse-failures auto-pause are the backstops.
@@ -365,7 +307,7 @@ def judge_goal(
        return "continue", "empty response (nothing to evaluate)", False

    try:
-        from agent.auxiliary_client import get_auxiliary_extra_body, get_text_auxiliary_client
+        from agent.auxiliary_client import get_text_auxiliary_client
    except Exception as exc:
        logger.debug("goal judge: auxiliary client import failed: %s", exc)
        return "continue", "auxiliary client unavailable", False
@@ -379,22 +321,10 @@ def judge_goal(
    if client is None or not model:
        return "continue", "no auxiliary client configured", False

-    # Build the prompt — pick the with-subgoals variant when applicable.
-    clean_subgoals = [s.strip() for s in (subgoals or []) if s and s.strip()]
-    if clean_subgoals:
-        subgoals_block = "\n".join(
-            f"- {i}. {text}" for i, text in enumerate(clean_subgoals, start=1)
-        )
-        prompt = JUDGE_USER_PROMPT_WITH_SUBGOALS_TEMPLATE.format(
-            goal=_truncate(goal, 2000),
-            subgoals_block=_truncate(subgoals_block, 2000),
-            response=_truncate(last_response, _JUDGE_RESPONSE_SNIPPET_CHARS),
-        )
-    else:
-        prompt = JUDGE_USER_PROMPT_TEMPLATE.format(
-            goal=_truncate(goal, 2000),
-            response=_truncate(last_response, _JUDGE_RESPONSE_SNIPPET_CHARS),
-        )
+    prompt = JUDGE_USER_PROMPT_TEMPLATE.format(
+        goal=_truncate(goal, 2000),
+        response=_truncate(last_response, _JUDGE_RESPONSE_SNIPPET_CHARS),
+    )

    try:
        resp = client.chat.completions.create(
@@ -406,7 +336,6 @@ def judge_goal(
            temperature=0,
            max_tokens=200,
            timeout=timeout,
-            extra_body=get_auxiliary_extra_body() or None,
        )
    except Exception as exc:
        logger.info("goal judge: API call failed (%s) — falling through to continue", exc)
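
The fail-open contract described in the ``judge_goal`` docstring can be sketched in isolation. ``judge_fail_open`` and ``call_model`` are hypothetical names used only for illustration, and the ``DONE``/``CONTINUE`` prefix check is an invented placeholder rather than the project's actual reply parser:

```python
import logging
from typing import Callable, Tuple

logger = logging.getLogger("goal_judge_sketch")


def judge_fail_open(
    goal: str,
    last_response: str,
    call_model: Callable[[str], str],
) -> Tuple[str, str, bool]:
    """Return (verdict, reason, parse_failed); never raise."""
    if not last_response.strip():
        return "continue", "empty response (nothing to evaluate)", False
    try:
        reply = call_model(f"Goal:\n{goal}\n\nResponse:\n{last_response}")
    except Exception as exc:
        # Any transport or API error is treated as "keep going".
        logger.info("judge call failed (%s); falling through to continue", exc)
        return "continue", f"judge error: {type(exc).__name__}", False
    text = reply.strip().upper()
    if text.startswith("DONE"):
        return "done", reply.strip(), False
    if text.startswith("CONTINUE"):
        return "continue", reply.strip(), False
    # Unparseable output also continues, but is flagged so a caller can
    # count consecutive parse failures and auto-pause.
    return "continue", "unparseable judge output", True


def broken(_prompt: str) -> str:
    raise TimeoutError("simulated outage")


verdict_on_error = judge_fail_open("ship it", "work in progress", broken)
verdict_on_garbage = judge_fail_open("ship it", "work in progress", lambda p: "maybe?")
verdict_on_done = judge_fail_open("ship it", "all tests pass", lambda p: "DONE: tests pass")
```

Only the unparseable case sets the third element, which is what lets a turn budget and a consecutive-parse-failure counter act as the backstops.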
@@ -467,15 +396,14 @@ class GoalManager:
        if s is None or s.status in {"cleared",}:
            return "No active goal. Set one with /goal <text>."
        turns = f"{s.turns_used}/{s.max_turns} turns"
-        sub = f", {len(s.subgoals)} subgoal{'s' if len(s.subgoals) != 1 else ''}" if s.subgoals else ""
        if s.status == "active":
-            return f"⊙ Goal (active, {turns}{sub}): {s.goal}"
+            return f"⊙ Goal (active, {turns}): {s.goal}"
        if s.status == "paused":
            extra = f" — {s.paused_reason}" if s.paused_reason else ""
-            return f"⏸ Goal (paused, {turns}{sub}{extra}): {s.goal}"
+            return f"⏸ Goal (paused, {turns}{extra}): {s.goal}"
        if s.status == "done":
-            return f"✓ Goal done ({turns}{sub}): {s.goal}"
-        return f"Goal ({s.status}, {turns}{sub}): {s.goal}"
+            return f"✓ Goal done ({turns}): {s.goal}"
+        return f"Goal ({s.status}, {turns}): {s.goal}"

    # --- mutation -----------------------------------------------------

@@ -528,53 +456,6 @@ class GoalManager:
        self._state.last_reason = reason
        save_goal(self.session_id, self._state)

-    # --- /subgoal user controls ---------------------------------------
-
-    def add_subgoal(self, text: str) -> str:
-        """Append a user-added criterion to the active goal. Requires
-        ``has_goal()``; raises ``RuntimeError`` otherwise.
-
-        Returns the cleaned text so the caller can show it back to the user.
-        """
-        if self._state is None or not self.has_goal():
-            raise RuntimeError("no active goal")
-        text = (text or "").strip()
-        if not text:
-            raise ValueError("subgoal text is empty")
-        self._state.subgoals.append(text)
-        save_goal(self.session_id, self._state)
-        return text
-
-    def remove_subgoal(self, index_1based: int) -> str:
-        """Remove a subgoal by 1-based index. Returns the removed text."""
-        if self._state is None or not self.has_goal():
-            raise RuntimeError("no active goal")
-        idx = int(index_1based) - 1
-        if idx < 0 or idx >= len(self._state.subgoals):
-            raise IndexError(
-                f"index out of range (1..{len(self._state.subgoals)})"
-            )
-        removed = self._state.subgoals.pop(idx)
-        save_goal(self.session_id, self._state)
-        return removed
-
-    def clear_subgoals(self) -> int:
-        """Wipe all subgoals. Returns the previous count."""
-        if self._state is None or not self.has_goal():
-            raise RuntimeError("no active goal")
-        prev = len(self._state.subgoals)
-        self._state.subgoals = []
-        save_goal(self.session_id, self._state)
-        return prev
-
-    def render_subgoals(self) -> str:
-        """Public helper for the /subgoal slash command."""
-        if self._state is None:
-            return "(no active goal)"
-        if not self._state.subgoals:
-            return "(no subgoals — use /subgoal <text> to add criteria)"
-        return self._state.render_subgoals_block()
-
    # --- the main entry point called after every turn -----------------

    def evaluate_after_turn(
@@ -612,9 +493,7 @@ class GoalManager:
        state.turns_used += 1
        state.last_turn_at = time.time()

-        verdict, reason, parse_failed = judge_goal(
-            state.goal, last_response, subgoals=state.subgoals or None
-        )
+        verdict, reason, parse_failed = judge_goal(state.goal, last_response)
        state.last_verdict = verdict
        state.last_reason = reason

@@ -699,11 +578,6 @@ class GoalManager:
    def next_continuation_prompt(self) -> Optional[str]:
        if not self._state or self._state.status != "active":
            return None
-        if self._state.subgoals:
-            return CONTINUATION_PROMPT_WITH_SUBGOALS_TEMPLATE.format(
-                goal=self._state.goal,
-                subgoals_block=self._state.render_subgoals_block(),
-            )
        return CONTINUATION_PROMPT_TEMPLATE.format(goal=self._state.goal)


@@ -711,9 +585,6 @@ __all__ = [
    "GoalState",
    "GoalManager",
    "CONTINUATION_PROMPT_TEMPLATE",
-    "CONTINUATION_PROMPT_WITH_SUBGOALS_TEMPLATE",
-    "JUDGE_USER_PROMPT_TEMPLATE",
-    "JUDGE_USER_PROMPT_WITH_SUBGOALS_TEMPLATE",
    "DEFAULT_MAX_TURNS",
    "load_goal",
    "save_goal",
@@ -1,240 +0,0 @@
-"""Provider/model inventory context — shared substrate for the dashboard
-``/api/model/options``, the TUI ``model.options``/``model.save_key``
-JSON-RPC handlers, and the interactive picker.
-
-Before this module the three call-sites each duplicated:
-
-1. The 17-LOC config-slice that pulls ``model.{default,name,provider,base_url}``,
-   ``providers:``, and ``custom_providers:`` out of ``load_config()``;
-2. The call into ``list_authenticated_providers`` with the resulting kwargs;
-3. (TUI only) a 45-LOC post-pass that merges authenticated rows with
-   unconfigured ``CANONICAL_PROVIDERS`` rows and emits ``authenticated``/
-   ``auth_type``/``key_env``/``warning`` hints for the picker UI.
-
-Consolidating those three steps into one entry point eliminates two bugs
-the duplicates were hiding:
-
-- The dashboard read ``cfg.get("custom_providers")`` directly, missing the
-  v12+ keyed ``providers:`` form (which the TUI handled via
-  ``get_compatible_custom_providers``).
-- The TUI's canonical-merge keyed on ``is_user_defined`` to decide
-  ordering. Section 3 of ``list_authenticated_providers`` sets
-  ``is_user_defined=True`` even for canonical slugs that appear in the
-  ``providers:`` config dict, which silently demoted them to the tail of
-  the picker. ``_reorder_canonical`` keys on slug membership instead.
-
-Substrate facts (verified May 2026):
-- ``list_authenticated_providers`` already populates each row's
-  ``models`` from the curated catalog (same source as the picker). Do
-  NOT call ``provider_model_ids()`` per row to "freshen" — that bypasses
-  curation and pulls in non-agentic models (Nous /models returns ~400
-  IDs including TTS, embeddings, rerankers, image/video generators).
-"""
-
-from __future__ import annotations
-
-from dataclasses import dataclass, replace
-from typing import Optional
-
-
-# ─── Public types ───────────────────────────────────────────────────────
-
-
-@dataclass(frozen=True)
-class ConfigContext:
-    """Snapshot of the model + provider config every inventory caller
-    needs. Built once via ``load_picker_context()``; the TUI overlays
-    live agent state via ``with_overrides()`` before passing through.
-    """
-
-    current_provider: str
-    current_model: str
-    current_base_url: str
-    user_providers: dict
-    custom_providers: list
-
-    def with_overrides(
-        self,
-        *,
-        current_provider: Optional[str] = None,
-        current_model: Optional[str] = None,
-        current_base_url: Optional[str] = None,
-    ) -> "ConfigContext":
-        """Return a copy with truthy overrides applied.
-
-        Truthy-only because the TUI reads agent attributes that may be
-        empty strings before an agent is spawned — empties must NOT
-        clobber the disk-config values.
-        """
-        kw: dict = {}
-        if current_provider:
-            kw["current_provider"] = current_provider
-        if current_model:
-            kw["current_model"] = current_model
-        if current_base_url:
-            kw["current_base_url"] = current_base_url
-        return replace(self, **kw) if kw else self
-
-
-def load_picker_context() -> ConfigContext:
-    """Load the disk-config snapshot every consumer needs.
-
-    Replaces the inline 17-LOC config-slice that ``web_server.py`` and
-    ``tui_gateway/server.py`` (×2 sites) used to do.
-    """
-    from hermes_cli.config import get_compatible_custom_providers, load_config
-
-    cfg = load_config()
-    model_cfg = cfg.get("model", {})
-    if isinstance(model_cfg, dict):
-        current_model = model_cfg.get("default", model_cfg.get("name", "")) or ""
-        current_provider = model_cfg.get("provider", "") or ""
-        current_base_url = model_cfg.get("base_url", "") or ""
-    else:
-        # config.model can be a bare string in older configs.
-        current_model = str(model_cfg) if model_cfg else ""
-        current_provider = ""
-        current_base_url = ""
-    raw = cfg.get("providers")
-    return ConfigContext(
-        current_provider=current_provider,
-        current_model=current_model,
-        current_base_url=current_base_url,
-        user_providers=raw if isinstance(raw, dict) else {},
-        custom_providers=get_compatible_custom_providers(cfg),
-    )
-
-
-# ─── Public: payload builder ────────────────────────────────────────────
-
-
-def build_models_payload(
-    ctx: ConfigContext,
-    *,
-    include_unconfigured: bool = False,
-    picker_hints: bool = False,
-    canonical_order: bool = False,
-    max_models: int = 50,
-) -> dict:
-    """Build the ``{providers, model, provider}`` shape every consumer
-    needs from a single substrate call.
-
-    Flags:
-    - ``include_unconfigured``: append ``CANONICAL_PROVIDERS`` rows that
-      ``list_authenticated_providers`` didn't emit (TUI uses this to show
-      the full provider universe in the picker).
-    - ``picker_hints``: add ``authenticated``/``auth_type``/``key_env``/
-      ``warning`` per row (TUI ``ModelPickerDialog`` shape).
-    - ``canonical_order``: reorder canonical-slug rows to
-      ``CANONICAL_PROVIDERS`` declaration order; truly-custom rows go
-      last (TUI display order).
-    """
-    from hermes_cli.model_switch import list_authenticated_providers
-
-    rows = list_authenticated_providers(
-        current_provider=ctx.current_provider,
-        current_base_url=ctx.current_base_url,
-        current_model=ctx.current_model,
-        user_providers=ctx.user_providers,
-        custom_providers=ctx.custom_providers,
-        max_models=max_models,
-    )
-
-    if include_unconfigured:
-        rows = list(rows) + _append_unconfigured_rows(rows, ctx)
-    if picker_hints:
-        _apply_picker_hints(rows)
-    if canonical_order:
-        rows = _reorder_canonical(rows)
-
-    return {
-        "providers": rows,
-        "model": ctx.current_model,
-        "provider": ctx.current_provider,
-    }
-
-
-# ─── Internal: row post-processing ──────────────────────────────────────
-
-
-def _append_unconfigured_rows(rows: list[dict], ctx: ConfigContext) -> list[dict]:
-    """Build skeleton rows for canonical providers missing from ``rows``."""
-    from hermes_cli.models import CANONICAL_PROVIDERS, _PROVIDER_LABELS
-
-    seen = {r["slug"].lower() for r in rows}
-    cur = (ctx.current_provider or "").lower()
-    extras: list[dict] = []
-    for entry in CANONICAL_PROVIDERS:
-        if entry.slug.lower() in seen:
-            continue
-        extras.append(
-            {
-                "slug": entry.slug,
-                "name": _PROVIDER_LABELS.get(entry.slug, entry.label),
-                "is_current": entry.slug.lower() == cur,
-                "is_user_defined": False,
-                "models": [],
-                "total_models": 0,
-                "source": "canonical",
-            }
-        )
-    return extras
-
-
-def _apply_picker_hints(rows: list[dict]) -> None:
-    """Add ``authenticated``/``auth_type``/``key_env``/``warning`` per row.
-
-    Mutates ``rows`` in-place. Rows already from
-    ``list_authenticated_providers`` are marked ``authenticated=True``;
-    the unconfigured skeleton rows from ``_append_unconfigured_rows`` get
-    the picker's setup-hint shape.
-    """
-    from hermes_cli.auth import PROVIDER_REGISTRY
-
-    for row in rows:
-        if "authenticated" in row:
-            continue
-        # Distinguish authenticated rows (returned by
-        # list_authenticated_providers) from skeleton rows (from
-        # _append_unconfigured_rows). The skeleton rows have empty
-        # `models` AND source="canonical"; authenticated rows have
-        # populated `models` OR a non-canonical source.
-        is_skeleton = row.get("source") == "canonical" and not row.get("models")
-        row["authenticated"] = not is_skeleton
-        if not is_skeleton or row.get("is_user_defined"):
-            continue
-        cfg = PROVIDER_REGISTRY.get(row["slug"])
-        auth_type = cfg.auth_type if cfg else "api_key"
-        key_env = (
-            cfg.api_key_env_vars[0]
-            if (cfg and cfg.api_key_env_vars)
-            else ""
-        )
-        row["auth_type"] = auth_type
-        row["key_env"] = key_env
-        row["warning"] = (
-            f"paste {key_env} to activate"
-            if auth_type == "api_key" and key_env
-            else f"run `hermes model` to configure ({auth_type})"
-        )
-
-
-def _reorder_canonical(rows: list[dict]) -> list[dict]:
-    """Canonical slugs in ``CANONICAL_PROVIDERS`` declaration order;
-    truly-custom rows last.
-
-    Keys on slug membership, NOT ``is_user_defined`` — section 3 of
-    ``list_authenticated_providers`` sets ``is_user_defined=True`` on
-    rows from the ``providers:`` config dict even when the slug is
-    canonical. Keying on the flag would silently demote canonical
-    providers configured via the new keyed schema.
-    """
-    from hermes_cli.models import CANONICAL_PROVIDERS
-
-    order = {e.slug: i for i, e in enumerate(CANONICAL_PROVIDERS)}
-    canon = sorted(
-        (r for r in rows if r["slug"] in order),
-        key=lambda r: order[r["slug"]],
-    )
-    extras = [r for r in rows if r["slug"] not in order]
-    return canon + extras
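
The ordering rule in the removed ``_reorder_canonical``, and why its docstring insists on slug membership rather than ``is_user_defined``, can be illustrated with a small self-contained sketch. ``CANONICAL_SLUGS`` and the sample rows are made up for the example (the real ``CANONICAL_PROVIDERS`` entries are objects with a ``slug`` attribute), and ``reorder_by_flag`` is a simplified flag-based ordering written for contrast, not a copy of the earlier TUI code:

```python
# Hypothetical stand-in for the declaration order of CANONICAL_PROVIDERS.
CANONICAL_SLUGS = ["nous", "openrouter", "anthropic"]


def reorder_by_slug(rows: list[dict]) -> list[dict]:
    """Canonical slugs first, in declaration order; other rows keep their order."""
    order = {slug: i for i, slug in enumerate(CANONICAL_SLUGS)}
    canon = sorted(
        (r for r in rows if r["slug"] in order),
        key=lambda r: order[r["slug"]],
    )
    extras = [r for r in rows if r["slug"] not in order]
    return canon + extras


def reorder_by_flag(rows: list[dict]) -> list[dict]:
    """Flag-based variant: anything marked user-defined goes to the tail."""
    return [r for r in rows if not r["is_user_defined"]] + [
        r for r in rows if r["is_user_defined"]
    ]


rows = [
    {"slug": "my-local-llm", "is_user_defined": True},
    # Canonical slug configured through the keyed ``providers:`` dict,
    # so it carries is_user_defined=True.
    {"slug": "openrouter", "is_user_defined": True},
    {"slug": "anthropic", "is_user_defined": False},
]

by_slug = [r["slug"] for r in reorder_by_slug(rows)]
by_flag = [r["slug"] for r in reorder_by_flag(rows)]
```

With these rows, the slug-based ordering yields ``openrouter, anthropic, my-local-llm``, while the flag-based one pushes ``openrouter`` behind the custom endpoint, which is the demotion the docstring describes.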
@@ -155,7 +155,7 @@ def specify_task(
        )

    try:
-        from agent.auxiliary_client import get_auxiliary_extra_body, get_text_auxiliary_client
+        from agent.auxiliary_client import get_text_auxiliary_client
    except Exception as exc:  # pragma: no cover — import smoke test
        logger.debug("specify: auxiliary client import failed: %s", exc)
        return SpecifyOutcome(task_id, False, "auxiliary client unavailable")
@@ -187,7 +187,6 @@ def specify_task(
            temperature=0.3,
            max_tokens=1500,
            timeout=timeout or 120,
-            extra_body=get_auxiliary_extra_body() or None,
        )
    except Exception as exc:
        logger.info(
@@ -2414,31 +2414,30 @@ def _prompt_provider_choice(choices, *, default=0):
 def _model_flow_openrouter(config, current_model=""):
    """OpenRouter provider: ensure API key, then pick model."""
    from hermes_cli.auth import (
-        ProviderConfig,
        _prompt_model_selection,
        _save_model_choice,
        deactivate_provider,
    )
-    from hermes_cli.config import get_env_value
+    from hermes_cli.config import get_env_value, save_env_value

-    # Route through _prompt_api_key so users can replace a stale/broken key
-    # in-flow (K/R/C) instead of having to edit ~/.hermes/.env by hand. The
-    # previous bypass-when-key-exists branch left no way to recover from a
-    # bad paste short of re-running `hermes setup` from scratch. OpenRouter
-    # isn't in PROVIDER_REGISTRY so we synthesize a minimal pconfig.
-    pconfig = ProviderConfig(
-        id="openrouter",
-        name="OpenRouter",
-        auth_type="api_key",
-        api_key_env_vars=("OPENROUTER_API_KEY",),
-    )
-    existing_key = get_env_value("OPENROUTER_API_KEY") or ""
-    if not existing_key:
+    api_key = get_env_value("OPENROUTER_API_KEY")
+    if not api_key:
+        print("No OpenRouter API key configured.")
        print("Get one at: https://openrouter.ai/keys")
        print()
-    _resolved, abort = _prompt_api_key(pconfig, existing_key, provider_id="openrouter")
-    if abort:
-        return
+        try:
+            import getpass
+
+            key = getpass.getpass("OpenRouter API key (or Enter to cancel): ").strip()
+        except (KeyboardInterrupt, EOFError):
+            print()
+            return
+        if not key:
+            print("Cancelled.")
+            return
+        save_env_value("OPENROUTER_API_KEY", key)
+        print("API key saved.")
+        print()

    from hermes_cli.models import model_ids, get_pricing_for_provider
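
The key handling restored in this hunk prompts only when no key is stored; unlike the removed ``_prompt_api_key`` route, an existing key is used as-is, with no in-flow way to replace it. A minimal sketch of that control flow, assuming a hypothetical ``ensure_api_key`` helper and a plain dict in place of ``get_env_value``/``save_env_value``:

```python
import getpass
from typing import Callable, Dict, Optional


def ensure_api_key(
    env: Dict[str, str],
    var: str,
    prompt: Callable[[str], str] = getpass.getpass,
) -> Optional[str]:
    """Return the stored key, prompting once if it is missing.

    Returns None when the user cancels (empty input, Ctrl-C, or EOF).
    """
    existing = env.get(var)
    if existing:
        return existing  # existing key is never re-prompted
    try:
        key = prompt(f"{var} (or Enter to cancel): ").strip()
    except (KeyboardInterrupt, EOFError):
        return None
    if not key:
        return None
    env[var] = key  # stand-in for save_env_value(var, key)
    return key


def _must_not_prompt(_msg: str) -> str:
    raise AssertionError("prompt should be skipped when a key exists")


def _eof(_msg: str) -> str:
    raise EOFError


env: Dict[str, str] = {}
cancelled = ensure_api_key(env, "OPENROUTER_API_KEY", prompt=lambda _msg: "   ")
interrupted = ensure_api_key(env, "OPENROUTER_API_KEY", prompt=_eof)
saved = ensure_api_key(env, "OPENROUTER_API_KEY", prompt=lambda _msg: " sk-test ")
reused = ensure_api_key(env, "OPENROUTER_API_KEY", prompt=_must_not_prompt)
```

Passing the prompt function as a parameter is only a testing convenience in this sketch; the diff calls ``getpass.getpass`` directly.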

@@ -2474,26 +2473,33 @@ def _model_flow_openrouter(config, current_model=""):
 def _model_flow_ai_gateway(config, current_model=""):
    """Vercel AI Gateway provider: ensure API key, then pick model with pricing."""
    from hermes_cli.auth import (
-        PROVIDER_REGISTRY,
        _prompt_model_selection,
        _save_model_choice,
        deactivate_provider,
    )
-    from hermes_cli.config import get_env_value
+    from hermes_cli.config import get_env_value, save_env_value

-    # Route through _prompt_api_key so users can replace a stale/broken key
-    # in-flow (K/R/C) instead of having to edit ~/.hermes/.env by hand.
-    pconfig = PROVIDER_REGISTRY["ai-gateway"]
-    existing_key = get_env_value("AI_GATEWAY_API_KEY") or ""
-    if not existing_key:
+    api_key = get_env_value("AI_GATEWAY_API_KEY")
+    if not api_key:
+        print("No Vercel AI Gateway API key configured.")
        print(
            "Create API key here: https://vercel.com/d?to=%2F%5Bteam%5D%2F%7E%2Fai-gateway&title=AI+Gateway"
        )
        print("Add a payment method to get $5 in free credits.")
        print()
-    _resolved, abort = _prompt_api_key(pconfig, existing_key, provider_id="ai-gateway")
-    if abort:
-        return
+        try:
+            import getpass
+
+            key = getpass.getpass("AI Gateway API key (or Enter to cancel): ").strip()
+        except (KeyboardInterrupt, EOFError):
+            print()
+            return
+        if not key:
+            print("Cancelled.")
+            return
+        save_env_value("AI_GATEWAY_API_KEY", key)
+        print("API key saved.")
+        print()

    from hermes_cli.models import ai_gateway_model_ids, get_pricing_for_provider

@@ -2584,7 +2590,6 @@ def _model_flow_nous(config, current_model="", args=None):
        check_nous_free_tier,
        partition_nous_models_by_tier,
        union_with_portal_free_recommendations,
-        union_with_portal_paid_recommendations,
    )

    model_ids = get_curated_nous_model_ids()
@@ -2640,10 +2645,6 @@ def _model_flow_nous(config, current_model="", args=None):
    # with the Portal's freeRecommendedModels list so newly-launched free
    # models show up even if this CLI build's hardcoded curated list and
    # docs-hosted manifest haven't caught up yet.
-    #
-    # For paid users: mirror the same idea with paidRecommendedModels so
-    # newly-launched paid models surface in the picker too — independent
-    # of CLI release cadence.
    unavailable_models: list[str] = []
    if free_tier:
        model_ids, pricing = union_with_portal_free_recommendations(
@@ -2652,10 +2653,6 @@ def _model_flow_nous(config, current_model="", args=None):
        model_ids, unavailable_models = partition_nous_models_by_tier(
            model_ids, pricing, free_tier=True
        )
-    else:
-        model_ids, pricing = union_with_portal_paid_recommendations(
-            model_ids, pricing, _nous_portal_url,
-        )

    if not model_ids and not unavailable_models:
        print("No models available for Nous Portal after filtering.")
@@ -3073,21 +3070,6 @@ def _model_flow_custom(config):
            else:
                print(f"  If /v1 should not be in the base URL, try: {suggested}")

-    # Prompt for API compatibility mode explicitly so codex-compatible custom
-    # providers don't silently fall back to chat_completions.
-    current_model_cfg = config.get("model")
-    current_api_mode = ""
-    if isinstance(current_model_cfg, dict):
-        current_api_mode = str(current_model_cfg.get("api_mode") or "").strip()
-    api_mode = _prompt_custom_api_mode_selection(
-        effective_url,
-        current_api_mode=current_api_mode,
-    )
-    if api_mode:
-        print(f"  API mode: {api_mode}")
-    else:
-        print("  API mode: auto-detect")
-
    # Select model — use probe results when available, fall back to manual input
    model_name = ""
    detected_models = probe.get("models") or []
@@ -3151,10 +3133,7 @@ def _model_flow_custom(config):
        model["base_url"] = effective_url
        if effective_key:
            model["api_key"] = effective_key
-        if api_mode:
-            model["api_mode"] = api_mode
-        else:
-            model.pop("api_mode", None)
+        model.pop("api_mode", None)  # let runtime auto-detect from URL
        save_config(cfg)
        deactivate_provider()

@@ -3177,10 +3156,7 @@ def _model_flow_custom(config):
        _caller_model["base_url"] = effective_url
        if effective_key:
            _caller_model["api_key"] = effective_key
-        if api_mode:
-            _caller_model["api_mode"] = api_mode
-        else:
-            _caller_model.pop("api_mode", None)
+        _caller_model.pop("api_mode", None)
        config["model"] = _caller_model
        print("Endpoint saved. Use `/model` in chat or `hermes model` to set a model.")

@@ -3191,80 +3167,9 @@ def _model_flow_custom(config):
        model_name or "",
        context_length=context_length,
        name=display_name,
-        api_mode=api_mode,
    )


-def _prompt_custom_api_mode_selection(base_url: str, current_api_mode: str = "") -> Optional[str]:
-    """Prompt for a custom provider API mode.
-
-    Returns an explicit mode string, or None to keep auto-detect behavior.
-    """
-    from hermes_cli.runtime_provider import _detect_api_mode_for_url
-
-    detected_mode = _detect_api_mode_for_url(base_url)
-    normalized_current = str(current_api_mode or "").strip().lower()
-    default_mode = normalized_current or detected_mode or ""
-
-    mode_options = [
-        (
-            "",
-            "Auto-detect",
-            "Use Hermes URL heuristics; best for standard OpenAI-compatible endpoints.",
-        ),
-        (
-            "chat_completions",
-            "Chat Completions",
-            "Use /chat/completions for standard OpenAI-compatible servers.",
-        ),
-        (
-            "codex_responses",
-            "Responses / Codex",
-            "Use /responses for Codex-compatible tool-calling backends.",
-        ),
-        (
-            "anthropic_messages",
-            "Anthropic Messages",
-            "Use /v1/messages for Anthropic-compatible endpoints.",
-        ),
-    ]
-
-    print()
-    print("Select API compatibility mode:")
-    for idx, (value, label, description) in enumerate(mode_options, 1):
-        markers = []
-        if value == detected_mode:
-            markers.append("detected")
-        if value == default_mode:
-            markers.append("current")
-        suffix = f" [{' / '.join(markers)}]" if markers else ""
-        print(f"  {idx}. {label}{suffix}")
-        print(f"     {description}")
-
-    try:
-        raw = input(
-            "Choice [1-4, Enter to keep current/detected]: "
-        ).strip().lower()
-    except (KeyboardInterrupt, EOFError):
-        print("\nCancelled.")
-        raise
-
-    if not raw:
-        return default_mode or None
-
-    if raw in {"1", "auto", "detect", "auto-detect"}:
-        return None
-    if raw in {"2", "chat", "chat_completions", "completions"}:
-        return "chat_completions"
-    if raw in {"3", "responses", "codex", "codex_responses"}:
-        return "codex_responses"
-    if raw in {"4", "anthropic", "anthropic_messages", "messages"}:
-        return "anthropic_messages"
-
-    print(f"Invalid API mode choice: {raw}. Falling back to auto-detect.")
-    return None
-
-
 def _auto_provider_name(base_url: str) -> str:
    """Generate a display name from a custom endpoint URL.

@@ -3300,12 +3205,12 @@ def _custom_provider_api_key_config_value(provider_info, resolved_api_key=""):


 def _save_custom_provider(
-    base_url, api_key="", model="", context_length=None, name=None, api_mode=None
+    base_url, api_key="", model="", context_length=None, name=None
 ):
    """Save a custom endpoint to custom_providers in config.yaml.

    Deduplicates by base_url — if the URL already exists, updates the
-    model name, context_length, and api_mode but doesn't add a duplicate entry.
+    model name and context_length but doesn't add a duplicate entry.
    Uses *name* when provided, otherwise auto-generates from the URL.
    """
    from hermes_cli.config import load_config, save_config
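
The deduplication rule stated in the docstring can be sketched on a plain list, without the config I/O. ``upsert_custom_provider`` is a hypothetical helper for illustration only, and the trailing-slash normalisation is an assumption that may not match what ``_save_custom_provider`` actually does:

```python
from typing import Optional


def upsert_custom_provider(
    providers: list[dict],
    base_url: str,
    model: str = "",
    context_length: Optional[int] = None,
    name: Optional[str] = None,
) -> bool:
    """Update the entry matching base_url, or append a new one.

    Returns True when ``providers`` was modified.
    """
    key = base_url.rstrip("/")  # assumption: trailing slashes are ignored
    for entry in providers:
        if str(entry.get("base_url", "")).rstrip("/") != key:
            continue
        changed = False
        if model and entry.get("model") != model:
            entry["model"] = model
            changed = True
        if model and context_length:
            models_cfg = dict(entry.get("models") or {})
            if models_cfg.get(model) != {"context_length": context_length}:
                models_cfg[model] = {"context_length": context_length}
                entry["models"] = models_cfg
                changed = True
        return changed  # existing URL: never append a duplicate
    entry = {"name": name or key, "base_url": key}
    if model:
        entry["model"] = model
    if model and context_length:
        entry["models"] = {model: {"context_length": context_length}}
    providers.append(entry)
    return True


providers: list[dict] = []
first = upsert_custom_provider(providers, "http://localhost:8000/v1", model="llama")
second = upsert_custom_provider(
    providers, "http://localhost:8000/v1/", model="qwen", context_length=32768
)
third = upsert_custom_provider(
    providers, "http://localhost:8000/v1", model="qwen", context_length=32768
)
```

The boolean return mirrors the ``changed`` flag in the diff, which gates the write so an unchanged entry does not trigger a save.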
@@ -3331,13 +3236,6 @@ def _save_custom_provider(
                models_cfg[model] = {"context_length": context_length}
                entry["models"] = models_cfg
                changed = True
-            if api_mode:
-                if entry.get("api_mode") != api_mode:
-                    entry["api_mode"] = api_mode
-                    changed = True
-            elif "api_mode" in entry:
-                entry.pop("api_mode", None)
-                changed = True
            if changed:
                cfg["custom_providers"] = providers
                save_config(cfg)
@@ -3352,8 +3250,6 @@ def _save_custom_provider(
        entry["api_key"] = api_key
    if model:
        entry["model"] = model
-    if api_mode:
-        entry["api_mode"] = api_mode
    if model and context_length:
        entry["models"] = {model: {"context_length": context_length}}

@@ -3807,7 +3703,7 @@ def _model_flow_named_custom(config, provider_info):
                save_config(cfg)
    else:
        # Save model name to the custom_providers entry for next time
-        _save_custom_provider(base_url, config_api_key, model_name, api_mode=api_mode)
+        _save_custom_provider(base_url, config_api_key, model_name)

    print(f"\n✅ Model set to: {model_name}")
    print(f"   Provider: {name} ({base_url})")
@@ -4964,37 +4860,6 @@ def _model_flow_api_key_provider(config, provider_id, current_model=""):
        )
        if model_list:
            print(f"  Found {len(model_list)} model(s) from Ollama Cloud")
-    elif provider_id == "novita":
-        from hermes_cli.models import fetch_api_models
-
-        api_key_for_probe = existing_key or (get_env_value(key_env) if key_env else "")
-        curated = _PROVIDER_MODELS.get(provider_id, [])
-        live_models = fetch_api_models(api_key_for_probe, effective_base)
-        if live_models:
-            model_list = live_models
-            print(f"  Found {len(model_list)} model(s) from {pconfig.name} API")
-        else:
-            mdev_models: list = []
-            try:
-                from agent.models_dev import list_agentic_models
-
-                mdev_models = list_agentic_models(provider_id)
-            except Exception:
-                pass
-            if mdev_models:
-                seen = {m.lower() for m in mdev_models}
-                model_list = list(mdev_models)
-                for m in curated:
-                    if m.lower() not in seen:
-                        model_list.append(m)
-                        seen.add(m.lower())
-                print(f"  Found {len(model_list)} model(s) from models.dev registry")
-            else:
-                model_list = curated
-                if model_list:
-                    print(
-                        f'  Showing {len(model_list)} curated models — use "Enter custom model name" for others.'
-                    )
    else:
        curated = _PROVIDER_MODELS.get(provider_id, [])

@@ -6827,74 +6692,6 @@ def _cleanup_quarantined_exes(scripts_dir: Path | None = None) -> None:
        pass


-def _refresh_active_lazy_features() -> None:
-    """Refresh lazy-installed backends after a code update.
-
-    When pyproject.toml's ``[all]`` extra was slimmed down (May 2026), most
-    optional backends moved to ``tools/lazy_deps.py`` and only install on
-    first use. ``hermes update`` runs ``uv pip install -e .[all]`` which
-    leaves those packages untouched — so if we bump a pin in
-    :data:`LAZY_DEPS` (CVE response, transitive bug fix), users who already
-    activated the backend keep the stale version forever.
-
-    This function asks lazy_deps which features the user has previously
-    activated and reinstalls them under the current pins. Features the
-    user never enabled stay quiet — no churn for cold backends.
-
-    Never raises. A failure here must not block the rest of the update.
-    """
-    try:
-        from tools import lazy_deps
-    except Exception as exc:
-        logger.debug("Lazy refresh skipped (import failed): %s", exc)
-        return
-
-    try:
-        active = lazy_deps.active_features()
-    except Exception as exc:
-        logger.debug("Lazy refresh skipped (active_features failed): %s", exc)
-        return
-
-    if not active:
-        return
-
-    print()
-    print(f"→ Refreshing {len(active)} active lazy backend(s)...")
-
-    try:
-        results = lazy_deps.refresh_active_features(prompt=False)
-    except Exception as exc:
-        # refresh_active_features is documented as never-raise, but defend
-        # the update flow against future regressions.
-        print(f"  ⚠ Lazy refresh failed unexpectedly: {exc}")
-        return
-
-    refreshed = [f for f, s in results.items() if s == "refreshed"]
-    current = [f for f, s in results.items() if s == "current"]
-    failed = [(f, s) for f, s in results.items() if s.startswith("failed:")]
-    skipped = [(f, s) for f, s in results.items() if s.startswith("skipped:")]
-
-    if refreshed:
-        print(f"  ↑ {len(refreshed)} refreshed: {', '.join(refreshed)}")
-    if current:
-        print(f"  ✓ {len(current)} already current")
-    if skipped:
-        # Most common reason: security.allow_lazy_installs=false. Show one
-        # line so the user knows why; not an error.
-        names = ", ".join(f for f, _ in skipped)
-        reason = skipped[0][1].split(": ", 1)[-1]
-        print(f"  · {len(skipped)} skipped ({reason}): {names}")
-    if failed:
-        for feature, status in failed:
-            reason = status.split(": ", 1)[-1]
-            # Clip noisy pip stderr to keep update output legible.
-            if len(reason) > 200:
-                reason = reason[:200] + "..."
-            print(f"  ⚠ {feature} failed to refresh: {reason}")
-        print("  Backends keep their previously-installed version; rerun")
-        print("  `hermes update` once the upstream issue is resolved.")
-
-
 def _install_python_dependencies_with_optional_fallback(
    install_cmd_prefix: list[str],
    *,
@@ -7817,8 +7614,6 @@ def _cmd_update_impl(args, gateway_mode: bool):
                _install_psutil_android_compat(pip_cmd)
            _install_python_dependencies_with_optional_fallback(pip_cmd, group=install_group)

-        _refresh_active_lazy_features()
-
        _update_node_dependencies()
        _build_web_ui(PROJECT_ROOT / "web")

@@ -9364,7 +9159,7 @@ def _build_provider_choices() -> list[str]:
            "auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot",
            "anthropic", "gemini", "google-gemini-cli", "xai", "bedrock", "azure-foundry",
            "ollama-cloud", "huggingface", "zai", "kimi-coding", "kimi-coding-cn",
-            "stepfun", "minimax", "minimax-cn", "kilocode", "novita", "xiaomi", "arcee",
+            "stepfun", "minimax", "minimax-cn", "kilocode", "xiaomi", "arcee",
            "nvidia", "deepseek", "alibaba", "qwen-oauth", "opencode-zen", "opencode-go",
        ]

@@ -9384,10 +9179,10 @@ _BUILTIN_SUBCOMMANDS = frozenset(
        "computer-use",
        "config", "cron", "curator", "dashboard", "debug", "doctor",
        "dump", "fallback", "gateway", "hooks", "import", "insights",
-        "kanban", "login", "logout", "logs", "lsp", "mcp", "memory",
-        "model", "pairing", "plugins", "profile", "sessions", "setup",
-        "skills", "slack", "status", "tools", "uninstall", "update",
-        "version", "webhook", "whatsapp", "chat",
+        "kanban", "login", "logout", "logs", "mcp", "memory", "model",
+        "pairing", "plugins", "profile", "sessions", "setup", "skills",
+        "slack", "status", "tools", "uninstall", "update", "version",
+        "webhook", "whatsapp", "chat",
        # Help-ish invocations — plugin commands not being listed in
        # top-level --help is an acceptable trade-off for skipping an
        # expensive eager import of every bundled plugin module.
@@ -9586,7 +9381,7 @@ def main():
    gateway_parser = subparsers.add_parser(
        "gateway",
        help="Messaging gateway management",
-        description="Manage the messaging gateway (Telegram, Discord, WhatsApp, Weixin, and more)",
+        description="Manage the messaging gateway (Telegram, Discord, WhatsApp)",
    )
    gateway_subparsers = gateway_parser.add_subparsers(dest="gateway_command")

@@ -9729,17 +9524,6 @@ def main():

    gateway_parser.set_defaults(func=cmd_gateway)

-    # =========================================================================
-    # lsp command
-    # =========================================================================
-    try:
-        from agent.lsp.cli import register_subparser as _lsp_register
-        _lsp_register(subparsers)
-    except Exception as _lsp_err:  # noqa: BLE001
-        # LSP is optional infrastructure — never let a registration
-        # failure break the CLI overall.
-        logger.debug("LSP CLI registration failed: %s", _lsp_err)
-
    # =========================================================================
    # setup command
    # =========================================================================
@@ -10302,16 +10086,6 @@ def main():
    doctor_parser.add_argument(
        "--fix", action="store_true", help="Attempt to fix issues automatically"
    )
-    doctor_parser.add_argument(
-        "--ack",
-        metavar="ADVISORY_ID",
-        default=None,
-        help=(
-            "Acknowledge a security advisory by ID and exit. After ack, the "
-            "advisory will no longer trigger startup banners. Run `hermes "
-            "doctor` first to see active advisories and their IDs."
-        ),
-    )
    doctor_parser.set_defaults(func=cmd_doctor)

    # =========================================================================
@@ -10,7 +10,6 @@ from __future__ import annotations
 import getpass
 import os
 import sys
-import shlex
 from pathlib import Path

 from hermes_constants import get_hermes_home
@@ -135,7 +134,7 @@ def _install_dependencies(provider_name: str) -> None:
        if check_cmd:
            try:
                subprocess.run(
-                    shlex.split(check_cmd), check=True, capture_output=True, timeout=5
+                    check_cmd, shell=True, check=True, capture_output=True, timeout=5
                )
            except Exception:
                if install_cmd:
@@ -379,12 +378,6 @@ def _write_env_vars(env_path: Path, env_writes: dict) -> None:
            new_lines.append(f"{key}={val}")

    env_path.write_text("\n".join(new_lines) + "\n", encoding="utf-8")
-    # Restrict permissions — .env holds API keys and tokens.
-    try:
-        import stat
-        env_path.chmod(stat.S_IRUSR | stat.S_IWUSR)  # 0600
-    except OSError:
-        pass  # Windows or read-only FS


 # ---------------------------------------------------------------------------
@@ -445,14 +445,6 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
    # Azure Foundry: user-provided endpoint and model.
    # Empty list because models depend on the endpoint configuration.
    "azure-foundry": [],
-    "novita": [
-        "moonshotai/kimi-k2.5",
-        "minimax/minimax-m2.7",
-        "zai-org/glm-5",
-        "deepseek/deepseek-v3-0324",
-        "deepseek/deepseek-r1-0528",
-        "qwen/qwen3-235b-a22b-fp8",
-    ],
 }

 # Vercel AI Gateway: derive the bare-model-id catalog from the curated
@@ -629,71 +621,6 @@ def union_with_portal_free_recommendations(
    return (augmented_ids, augmented_pricing)


-def union_with_portal_paid_recommendations(
-    curated_ids: list[str],
-    pricing: dict[str, dict[str, str]],
-    portal_base_url: str = "",
-    *,
-    force_refresh: bool = False,
-) -> tuple[list[str], dict[str, dict[str, str]]]:
-    """Augment curated list with the Portal's ``paidRecommendedModels``.
-
-    Mirror of :func:`union_with_portal_free_recommendations` for paid-tier
-    users. The Portal's ``/api/nous/recommended-models`` endpoint advertises
-    which paid models are blessed *right now* — independent of what the
-    in-repo ``_PROVIDER_MODELS["nous"]`` list happens to contain or whether
-    the docs-hosted catalog manifest has been rebuilt since the last release.
-
-    For paid-tier users this lets newly-launched paid models surface in the
-    picker even if the user is running an older Hermes that doesn't ship
-    them in its hardcoded curated list. This function returns an augmented
-    ``(model_ids, pricing)`` pair where:
-
-    * Portal paid recommendations missing from ``curated_ids`` are
-      appended at the front (so the picker shows them first).
-    * ``pricing`` is left untouched — we deliberately do NOT synthesize
-      pricing entries for paid models. Live pricing is fetched separately
-      via :func:`get_pricing_for_provider`; if the live endpoint hasn't
-      published pricing yet, the picker shows a blank price column rather
-      than fabricating numbers. (The free helper synthesizes ``$0`` so
-      :func:`partition_nous_models_by_tier` keeps free models selectable;
-      no equivalent gating applies on the paid side, so synthesis would
-      only mislead the user.)
-
-    Failures (network, parse, missing field) are silent and degrade to
-    returning the inputs unchanged — never block the picker on a
-    Portal-side hiccup.
-    """
-    try:
-        payload = fetch_nous_recommended_models(
-            portal_base_url, force_refresh=force_refresh
-        )
-    except Exception:
-        return (list(curated_ids), dict(pricing))
-
-    paid_block = payload.get("paidRecommendedModels") if isinstance(payload, dict) else None
-    if not isinstance(paid_block, list) or not paid_block:
-        return (list(curated_ids), dict(pricing))
-
-    portal_paid_ids: list[str] = []
-    for entry in paid_block:
-        name = _extract_model_name(entry)
-        if name:
-            portal_paid_ids.append(name)
-    if not portal_paid_ids:
-        return (list(curated_ids), dict(pricing))
-
-    augmented_ids = list(curated_ids)
-    seen = set(augmented_ids)
-    # Prepend Portal paid recommendations that aren't already curated, so
-    # the Portal-blessed picks surface first in the picker.
-    new_ones = [mid for mid in portal_paid_ids if mid not in seen]
-    if new_ones:
-        augmented_ids = new_ones + augmented_ids
-
-    return (augmented_ids, dict(pricing))
-
-
 # ---------------------------------------------------------------------------
 # TTL cache for free-tier detection — avoids repeated API calls within a
 # session while still picking up upgrades quickly.
@@ -913,14 +840,13 @@ class ProviderEntry(NamedTuple):
 CANONICAL_PROVIDERS: list[ProviderEntry] = [
    ProviderEntry("nous",           "Nous Portal",              "Nous Portal (Nous Research subscription)"),
    ProviderEntry("openrouter",     "OpenRouter",               "OpenRouter (100+ models, pay-per-use)"),
-    ProviderEntry("novita",         "NovitaAI",                 "NovitaAI (AI-native cloud: Model API, Agent Sandbox, GPU Cloud)"),
    ProviderEntry("lmstudio",       "LM Studio",                "LM Studio (local desktop app with built-in model server)"),
    ProviderEntry("anthropic",      "Anthropic",                "Anthropic (Claude models — API key or Claude Code)"),
    ProviderEntry("openai-codex",   "OpenAI Codex",             "OpenAI Codex"),
-    ProviderEntry("alibaba",        "Qwen Cloud",               "Qwen Cloud / DashScope Coding (Qwen + multi-provider)"),
    ProviderEntry("xiaomi",         "Xiaomi MiMo",              "Xiaomi MiMo (MiMo-V2.5 and V2 models — pro, omni, flash)"),
    ProviderEntry("tencent-tokenhub", "Tencent TokenHub",       "Tencent TokenHub (Hy3 Preview — direct API via tokenhub.tencentmaas.com)"),
    ProviderEntry("nvidia",         "NVIDIA NIM",               "NVIDIA NIM (Nemotron models — build.nvidia.com or local NIM)"),
+    ProviderEntry("qwen-oauth",     "Qwen OAuth (Portal)",      "Qwen OAuth (reuses local Qwen CLI login)"),
    ProviderEntry("copilot",        "GitHub Copilot",           "GitHub Copilot (uses GITHUB_TOKEN or gh auth token)"),
    ProviderEntry("copilot-acp",    "GitHub Copilot ACP",       "GitHub Copilot ACP (spawns `copilot --acp --stdio`)"),
    ProviderEntry("huggingface",    "Hugging Face",             "Hugging Face Inference Providers (20+ open models)"),
@@ -935,6 +861,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
    ProviderEntry("minimax",        "MiniMax",                  "MiniMax (global direct API)"),
    ProviderEntry("minimax-oauth",  "MiniMax (OAuth)",          "MiniMax via OAuth browser login (Coding Plan, minimax.io)"),
    ProviderEntry("minimax-cn",     "MiniMax (China)",          "MiniMax China (domestic direct API)"),
+    ProviderEntry("alibaba",        "Alibaba Cloud (DashScope)","Alibaba Cloud / DashScope Coding (Qwen + multi-provider)"),
    ProviderEntry("ollama-cloud",   "Ollama Cloud",             "Ollama Cloud (cloud-hosted open models — ollama.com)"),
    ProviderEntry("arcee",          "Arcee AI",                 "Arcee AI (Trinity models — direct API)"),
    ProviderEntry("gmi",            "GMI Cloud",                "GMI Cloud (multi-model direct API)"),
@@ -944,7 +871,6 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
    ProviderEntry("bedrock",        "AWS Bedrock",              "AWS Bedrock (Claude, Nova, Llama, DeepSeek — IAM or API key)"),
    ProviderEntry("azure-foundry",  "Azure Foundry",            "Azure Foundry (OpenAI-style or Anthropic-style endpoint — your Azure AI deployment)"),
    ProviderEntry("ai-gateway",     "Vercel AI Gateway",        "Vercel AI Gateway"),
-    ProviderEntry("qwen-oauth",     "Qwen OAuth (Portal)",      "Qwen OAuth (reuses local Qwen CLI login)"),
 ]

 # Auto-extend CANONICAL_PROVIDERS with any provider registered in providers/
@@ -1023,8 +949,6 @@ _PROVIDER_ALIASES = {
    "hf": "huggingface",
    "hugging-face": "huggingface",
    "huggingface-hub": "huggingface",
-    "novita-ai": "novita",
-    "novitaai": "novita",
    "mimo": "xiaomi",
    "xiaomi-mimo": "xiaomi",
    "tencent": "tencent-tokenhub",
@@ -1505,7 +1429,7 @@ def _resolve_nous_pricing_credentials() -> tuple[str, str]:


 def get_pricing_for_provider(provider: str, *, force_refresh: bool = False) -> dict[str, dict[str, str]]:
-    """Return live pricing for providers that support it (openrouter, nous, ai-gateway, novita)."""
+    """Return live pricing for providers that support it (openrouter, nous, ai-gateway)."""
    normalized = normalize_provider(provider)
    if normalized == "openrouter":
        return fetch_models_with_pricing(
@@ -1515,8 +1439,6 @@ def get_pricing_for_provider(provider: str, *, force_refresh: bool = False) -> d
        )
    if normalized == "ai-gateway":
        return fetch_ai_gateway_pricing(force_refresh=force_refresh)
-    if normalized == "novita":
-        return _fetch_novita_pricing(force_refresh=force_refresh)
    if normalized == "nous":
        api_key, base_url = _resolve_nous_pricing_credentials()
        if base_url:
@@ -1533,65 +1455,6 @@ def get_pricing_for_provider(provider: str, *, force_refresh: bool = False) -> d
    return {}


-def _fetch_novita_pricing(
-    timeout: float = 8.0,
-    *,
-    force_refresh: bool = False,
-) -> dict[str, dict[str, str]]:
-    """Fetch pricing from NovitaAI /v1/models.
-
-    NovitaAI returns input/output prices per million tokens in units of
-    0.0001 USD. Convert them to the per-token strings used by the shared
-    pricing formatter.
-
-    Results are cached in ``_pricing_cache`` keyed on the resolved base URL,
-    matching the pattern used by ``fetch_ai_gateway_pricing`` — without this,
-    every menu render or pricing lookup re-hits the network.
-    """
-    api_key = os.getenv("NOVITA_API_KEY", "").strip()
-    if not api_key:
-        return {}
-
-    base_url = os.getenv("NOVITA_BASE_URL", "").strip() or "https://api.novita.ai/openai/v1"
-    cache_key = base_url.rstrip("/")
-    if not force_refresh and cache_key in _pricing_cache:
-        return _pricing_cache[cache_key]
-
-    url = cache_key + "/models"
-    headers = {
-        "Authorization": f"Bearer {api_key}",
-        "Accept": "application/json",
-        "User-Agent": _HERMES_USER_AGENT,
-    }
-
-    try:
-        req = urllib.request.Request(url, headers=headers)
-        with urllib.request.urlopen(req, timeout=timeout) as resp:
-            payload = json.loads(resp.read().decode())
-    except Exception:
-        _pricing_cache[cache_key] = {}
-        return {}
-
-    result: dict[str, dict[str, str]] = {}
-    for item in payload.get("data", []):
-        if not isinstance(item, dict):
-            continue
-        mid = item.get("id")
-        if not mid:
-            continue
-        inp = item.get("input_token_price_per_m")
-        out = item.get("output_token_price_per_m")
-        if inp is None and out is None:
-            continue
-        result[str(mid)] = {
-            "prompt": str(float(inp or 0) / 10_000 / 1_000_000),
-            "completion": str(float(out or 0) / 10_000 / 1_000_000),
-        }
-
-    _pricing_cache[cache_key] = result
-    return result
-
-
 # All provider IDs and aliases that are valid for the provider:model syntax.
 _KNOWN_PROVIDER_NAMES: set[str] = (
    set(_PROVIDER_LABELS.keys())
@@ -542,61 +542,6 @@ class PluginContext:
            self.manifest.name, provider.name,
        )

-    # -- video gen provider registration -------------------------------------
-
-    def register_video_gen_provider(self, provider) -> None:
-        """Register a video generation backend.
-
-        ``provider`` must be an instance of
-        :class:`agent.video_gen_provider.VideoGenProvider`. The
-        ``provider.name`` attribute is what ``video_gen.provider`` in
-        ``config.yaml`` matches against when routing ``video_generate``
-        tool calls.
-        """
-        from agent.video_gen_provider import VideoGenProvider
-        from agent.video_gen_registry import register_provider as _register_video_provider
-
-        if not isinstance(provider, VideoGenProvider):
-            logger.warning(
-                "Plugin '%s' tried to register a video_gen provider that does "
-                "not inherit from VideoGenProvider. Ignoring.",
-                self.manifest.name,
-            )
-            return
-        _register_video_provider(provider)
-        logger.info(
-            "Plugin '%s' registered video_gen provider: %s",
-            self.manifest.name, provider.name,
-        )
-
-    # -- web search/extract provider registration ----------------------------
-
-    def register_web_search_provider(self, provider) -> None:
-        """Register a web search/extract backend.
-
-        ``provider`` must be an instance of
-        :class:`agent.web_search_provider.WebSearchProvider`. The
-        ``provider.name`` attribute is what ``web.search_backend`` /
-        ``web.extract_backend`` / ``web.backend`` in ``config.yaml``
-        matches against when routing ``web_search`` / ``web_extract``
-        tool calls.
-        """
-        from agent.web_search_provider import WebSearchProvider
-        from agent.web_search_registry import register_provider as _register_web_provider
-
-        if not isinstance(provider, WebSearchProvider):
-            logger.warning(
-                "Plugin '%s' tried to register a web provider that does "
-                "not inherit from WebSearchProvider. Ignoring.",
-                self.manifest.name,
-            )
-            return
-        _register_web_provider(provider)
-        logger.info(
-            "Plugin '%s' registered web provider: %s",
-            self.manifest.name, provider.name,
-        )
-
    # -- platform adapter registration ---------------------------------------

    def register_platform(
@@ -1367,21 +1312,6 @@ def invoke_hook(hook_name: str, **kwargs: Any) -> List[Any]:



-_thread_tool_whitelist = threading.local()
-
-
-def set_thread_tool_whitelist(
-    allowed: Optional[Set[str]],
-    deny_msg_fmt: str = "Tool '{tool_name}' denied: not in this thread's tool whitelist",
-) -> None:
-    _thread_tool_whitelist.allowed = allowed
-    _thread_tool_whitelist.fmt = deny_msg_fmt
-
-
-def clear_thread_tool_whitelist() -> None:
-    _thread_tool_whitelist.allowed = None
-
-
 def get_pre_tool_call_block_message(
    tool_name: str,
    args: Optional[Dict[str, Any]],
@@ -1400,11 +1330,6 @@ def get_pre_tool_call_block_message(
    directive wins.  Invalid or irrelevant hook return values are
    silently ignored so existing observer-only hooks are unaffected.
    """
-    allowed = getattr(_thread_tool_whitelist, "allowed", None)
-    if allowed is not None and tool_name not in allowed:
-        fmt = getattr(_thread_tool_whitelist, "fmt", "Tool '{tool_name}' denied")
-        return fmt.format(tool_name=tool_name)
-
    hook_results = invoke_hook(
        "pre_tool_call",
        tool_name=tool_name,
@@ -1295,6 +1295,91 @@ def rename_profile(old_name: str, new_name: str) -> Path:
    return new_dir


+# ---------------------------------------------------------------------------
+# Tab completion
+# ---------------------------------------------------------------------------
+
+def generate_bash_completion() -> str:
+    """Generate a bash completion script for hermes profile names."""
+    return '''# Hermes Agent profile completion
+# Add to ~/.bashrc: eval "$(hermes completion bash)"
+
+_hermes_profiles() {
+    local profiles_dir="$HOME/.hermes/profiles"
+    local profiles="default"
+    if [ -d "$profiles_dir" ]; then
+        profiles="$profiles $(ls "$profiles_dir" 2>/dev/null)"
+    fi
+    echo "$profiles"
+}
+
+_hermes_completion() {
+    local cur prev
+    cur="${COMP_WORDS[COMP_CWORD]}"
+    prev="${COMP_WORDS[COMP_CWORD-1]}"
+
+    # Complete profile names after -p / --profile
+    if [[ "$prev" == "-p" || "$prev" == "--profile" ]]; then
+        COMPREPLY=($(compgen -W "$(_hermes_profiles)" -- "$cur"))
+        return
+    fi
+
+    # Complete profile subcommands
+    if [[ "${COMP_WORDS[1]}" == "profile" ]]; then
+        case "$prev" in
+            profile)
+                COMPREPLY=($(compgen -W "list use create delete show alias rename export import" -- "$cur"))
+                return
+                ;;
+            use|delete|show|alias|rename|export)
+                COMPREPLY=($(compgen -W "$(_hermes_profiles)" -- "$cur"))
+                return
+                ;;
+        esac
+    fi
+
+    # Top-level subcommands
+    if [[ "$COMP_CWORD" == 1 ]]; then
+        local commands="chat model gateway setup status cron doctor dump config skills tools mcp sessions profile update version"
+        COMPREPLY=($(compgen -W "$commands" -- "$cur"))
+    fi
+}
+
+complete -F _hermes_completion hermes
+'''
+
+
+def generate_zsh_completion() -> str:
+    """Generate a zsh completion script for hermes profile names."""
+    return '''#compdef hermes
+# Hermes Agent profile completion
+# Add to ~/.zshrc: eval "$(hermes completion zsh)"
+
+_hermes() {
+    local -a profiles
+    profiles=(default)
+    if [[ -d "$HOME/.hermes/profiles" ]]; then
+        profiles+=("${(@f)$(ls $HOME/.hermes/profiles 2>/dev/null)}")
+    fi
+
+    _arguments \\
+        '-p[Profile name]:profile:($profiles)' \\
+        '--profile[Profile name]:profile:($profiles)' \\
+        '1:command:(chat model gateway setup status cron doctor dump config skills tools mcp sessions profile update version)' \\
+        '*::arg:->args'
+
+    case $words[1] in
+        profile)
+            _arguments '1:action:(list use create delete show alias rename export import)' \\
+                        '2:profile:($profiles)'
+            ;;
+    esac
+}
+
+_hermes "$@"
+'''
+
+
 # ---------------------------------------------------------------------------
 # Profile env resolution (called from _apply_profile_override)
 # ---------------------------------------------------------------------------
@@ -156,11 +156,6 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
        is_aggregator=True,
        base_url_env_var="HF_BASE_URL",
    ),
-    "novita": HermesOverlay(
-        transport="openai_chat",
-        is_aggregator=True,
-        base_url_env_var="NOVITA_BASE_URL",
-    ),
    "xai": HermesOverlay(
        transport="codex_responses",
        base_url_override="https://api.x.ai/v1",
@@ -314,10 +309,6 @@ ALIASES: Dict[str, str] = {
    "hugging-face": "huggingface",
    "huggingface-hub": "huggingface",

-    # novita
-    "novita-ai": "novita",
-    "novitaai": "novita",
-
    # xiaomi
    "mimo": "xiaomi",
    "xiaomi-mimo": "xiaomi",
@@ -164,18 +164,7 @@ def _copilot_runtime_api_mode(model_cfg: Dict[str, Any], api_key: str) -> str:
        return "chat_completions"


-_VALID_API_MODES = {
-    "chat_completions",
-    "codex_responses",
-    "anthropic_messages",
-    "bedrock_converse",
-    # Optional opt-in: hand the entire turn to a `codex app-server` subprocess
-    # so terminal/file-ops/patching/sandboxing run inside Codex's own runtime
-    # instead of Hermes' tool dispatch. Gated behind config key
-    # `model.openai_runtime == "codex_app_server"` AND provider in
-    # {"openai", "openai-codex"}. Default is unchanged.
-    "codex_app_server",
-}
+_VALID_API_MODES = {"chat_completions", "codex_responses", "anthropic_messages", "bedrock_converse"}


 def _parse_api_mode(raw: Any) -> Optional[str]:
@@ -187,32 +176,6 @@ def _parse_api_mode(raw: Any) -> Optional[str]:
    return None


-def _maybe_apply_codex_app_server_runtime(
-    *,
-    provider: str,
-    api_mode: str,
-    model_cfg: Optional[Dict[str, Any]],
-) -> str:
-    """Optional opt-in: rewrite api_mode → "codex_app_server" for OpenAI/Codex
-    providers when the user has explicitly enabled that runtime via
-    `model.openai_runtime: codex_app_server` in config.yaml.
-
-    Default behavior is preserved: when the key is unset, "auto", or empty,
-    this function is a no-op. Only providers in {"openai", "openai-codex"}
-    are eligible — other providers (anthropic, openrouter, etc.) cannot be
-    rerouted through codex.
-
-    Returns the (possibly-rewritten) api_mode."""
-    if not model_cfg:
-        return api_mode
-    if provider not in ("openai", "openai-codex"):
-        return api_mode
-    runtime = str(model_cfg.get("openai_runtime") or "").strip().lower()
-    if runtime == "codex_app_server":
-        return "codex_app_server"
-    return api_mode
-
-
 def _resolve_runtime_from_pool_entry(
    *,
    provider: str,
@@ -242,14 +205,6 @@ def _resolve_runtime_from_pool_entry(
    elif provider == "google-gemini-cli":
        api_mode = "chat_completions"
        base_url = base_url or "cloudcode-pa://google"
-    elif provider == "minimax-oauth":
-        # MiniMax OAuth tokens are valid only against the Anthropic Messages
-        # compatible endpoint. Do not honor stale model.api_mode values from a
-        # prior OpenAI-compatible provider, or the client will hit
-        # /chat/completions under /anthropic and receive a bare nginx 404.
-        api_mode = "anthropic_messages"
-        pconfig = PROVIDER_REGISTRY.get(provider)
-        base_url = base_url or (pconfig.inference_base_url if pconfig else "")
    elif provider == "anthropic":
        api_mode = "anthropic_messages"
        cfg_provider = str(model_cfg.get("provider") or "").strip().lower()
@@ -330,12 +285,6 @@ def _resolve_runtime_from_pool_entry(
    if api_mode == "anthropic_messages" and provider in {"opencode-zen", "opencode-go"}:
        base_url = re.sub(r"/v1/?$", "", base_url)

-    # Optional opt-in: route OpenAI/Codex turns through `codex app-server`.
-    # Inert when `model.openai_runtime` is unset or "auto".
-    api_mode = _maybe_apply_codex_app_server_runtime(
-        provider=provider, api_mode=api_mode, model_cfg=model_cfg
-    )
-
    return {
        "provider": provider,
        "api_mode": api_mode,
@@ -1,451 +0,0 @@
-"""
-Security advisory checker for Hermes Agent.
-
-Detects known-compromised Python packages installed in the active venv
-(supply-chain attacks like the Mini Shai-Hulud worm of May 2026 that
-poisoned ``mistralai 2.4.6`` on PyPI) and surfaces remediation guidance to
-the user.
-
-Design goals:
-
- **Cheap.** A single ``importlib.metadata.version()`` call per advisory
-  package. Safe to run on every CLI startup.
- **Loud when it matters, silent otherwise.** If no compromised package is
-  installed, the user sees nothing.
- **Acknowledgeable.** Once the user has read and acted on an advisory they
-  can dismiss it via ``hermes doctor --ack <id>``; the ack is persisted to
-  ``config.security.acked_advisories`` and survives restart.
- **Extensible.** Adding a new advisory is one entry in ``ADVISORIES``;
-  adding a new compromised version is a one-line edit. No code changes
-  needed when the next worm hits.
-
-The check is invoked from three places:
-
-1. ``hermes doctor`` (and ``hermes doctor --ack <id>``)
-2. CLI startup banner (one short line, then full guidance via
-   ``hermes doctor``)
-3. Gateway startup (logged to gateway.log; first interactive message gets
-   a one-line operator banner)
-
-This module is intentionally dependency-free beyond the stdlib so it can
-run in environments where the rest of Hermes failed to import.
-"""
-
-from __future__ import annotations
-
-import logging
-import os
-import sys
-from dataclasses import dataclass, field
-from pathlib import Path
-from typing import Iterable, Optional
-
-logger = logging.getLogger(__name__)
-
-
-# =============================================================================
-# Advisory catalog
-#
-# Each advisory is a community-facing security warning about one or more
-# specific package versions that are known to be compromised. To add a new
-# advisory:
-#
-#   1. Append a new ``Advisory`` to ``ADVISORIES`` below
-#   2. Set ``compromised`` to a tuple of ``(pkg_name, frozenset_of_versions)``
-#      — version strings must match what ``importlib.metadata.version()``
-#      returns. Use an empty frozenset to flag *any installed version*
-#      (rare; only when the maintainer namespace itself is compromised).
-#   3. Write 2-4 short ``remediation`` lines a non-expert can copy/paste.
-#
-# Do NOT remove old advisories. Once an advisory ships, leave it in place so
-# users running an older release with the compromised package still get
-# warned. Mark superseded ones via ``superseded_by`` if needed.
-# =============================================================================
-
-
-@dataclass(frozen=True)
-class Advisory:
-    """One security advisory entry.
-
-    Attributes:
-        id: stable identifier used for acks (e.g. ``shai-hulud-2026-05``).
-            Lowercase-hyphen, never reused.
-        title: one-line headline shown in banners.
-        summary: 1-3 sentence description of what was compromised and how.
-        url: reference URL (Socket advisory, GitHub advisory, PyPI page).
-        compromised: tuple of ``(package_name, frozenset_of_versions)``
-            pairs. Empty frozenset means "any version of this package is
-            considered suspect" — use sparingly.
-        remediation: ordered list of steps the user should take. First step
-            should be the uninstall command; subsequent steps the credential
-            audit / rotation guidance.
-        published: ISO date string for sort order.
-    """
-
-    id: str
-    title: str
-    summary: str
-    url: str
-    compromised: tuple[tuple[str, frozenset[str]], ...]
-    remediation: tuple[str, ...]
-    published: str = ""
-    severity: str = "high"  # low / medium / high / critical
-
-
-ADVISORIES: tuple[Advisory, ...] = (
-    Advisory(
-        id="shai-hulud-2026-05",
-        title="Mini Shai-Hulud worm — mistralai 2.4.6 compromised on PyPI",
-        summary=(
-            "PyPI quarantined the mistralai package on 2026-05-12 after a "
-            "malicious 2.4.6 release. The worm steals credentials from "
-            "environment variables and credential files (~/.npmrc, ~/.pypirc, "
-            "~/.aws/credentials, GitHub PATs, cloud SDK tokens) and exfils "
-            "them to a hardcoded webhook. If you ran any Python process that "
-            "imported mistralai 2.4.6 — including hermes when configured "
-            "with provider=mistral for TTS or STT — assume those credentials "
-            "are exposed."
-        ),
-        url="https://socket.dev/blog/mini-shai-hulud-worm-pypi",
-        compromised=(
-            ("mistralai", frozenset({"2.4.6"})),
-        ),
-        remediation=(
-            "Run: pip uninstall -y mistralai  (or: uv pip uninstall mistralai)",
-            "Rotate API keys in ~/.hermes/.env (OpenRouter, Anthropic, OpenAI, "
-            "Nous, GitHub, AWS, Google, Mistral, etc.).",
-            "Audit ~/.npmrc, ~/.pypirc, ~/.aws/credentials, ~/.config/gh/hosts.yml, "
-            "and any other credential files for tokens that may have been read.",
-            "Check GitHub for unexpected new SSH keys, deploy keys, or webhook "
-            "additions on repos you have admin on.",
-            "After cleanup: hermes doctor --ack shai-hulud-2026-05  to dismiss "
-            "this warning.",
-        ),
-        published="2026-05-12",
-        severity="critical",
-    ),
-)
-
-
-# =============================================================================
-# Detection
-# =============================================================================
-
-
-@dataclass(frozen=True)
-class AdvisoryHit:
-    """One package-version match against an advisory."""
-
-    advisory: Advisory
-    package: str
-    installed_version: str
-
-
-def _installed_version(pkg_name: str) -> Optional[str]:
-    """Return the installed version of ``pkg_name``, or None if not installed.
-
-    Uses ``importlib.metadata`` so we don't depend on pip being importable
-    inside the active venv (uv-created venvs may lack pip).
-    """
-    try:
-        from importlib.metadata import PackageNotFoundError, version
-    except ImportError:  # py<3.8 — Hermes requires 3.10+ but defensive.
-        return None
-    try:
-        return version(pkg_name)
-    except PackageNotFoundError:
-        return None
-    except Exception:
-        # Some metadata corruption modes raise ValueError or OSError. Don't
-        # let advisory checking crash the CLI startup path.
-        logger.debug("importlib.metadata.version(%s) raised", pkg_name, exc_info=True)
-        return None
-
-
-def detect_compromised(
-    advisories: Iterable[Advisory] = ADVISORIES,
-) -> list[AdvisoryHit]:
-    """Scan installed packages and return all advisory hits.
-
-    A "hit" means an advisory's listed package is installed AND the version
-    is in the compromised set (or the compromised set is empty, meaning
-    *any* version is suspect).
-    """
-    hits: list[AdvisoryHit] = []
-    for advisory in advisories:
-        for pkg_name, bad_versions in advisory.compromised:
-            installed = _installed_version(pkg_name)
-            if installed is None:
-                continue
-            if not bad_versions or installed in bad_versions:
-                hits.append(AdvisoryHit(
-                    advisory=advisory,
-                    package=pkg_name,
-                    installed_version=installed,
-                ))
-    return hits
-
-
-# =============================================================================
-# Acknowledgement persistence
-#
-# Acks live under ``security.acked_advisories`` in config.yaml as a list of
-# advisory IDs. The list is the only state — no per-host data, no
-# timestamps, no fingerprints. Users sharing a config.yaml across machines
-# (rare but possible) get the same dismissal everywhere, which is the
-# correct behavior for a global advisory.
-# =============================================================================
-
-
-def get_acked_ids() -> set[str]:
-    """Return the set of advisory IDs the user has dismissed.
-
-    Returns an empty set if config can't be loaded (don't block startup
-    just because config is broken — the advisory will keep firing until
-    config is repaired, which is fine).
-    """
-    try:
-        from hermes_cli.config import load_config
-        cfg = load_config()
-    except Exception:
-        logger.debug("Could not load config for advisory acks", exc_info=True)
-        return set()
-    sec = cfg.get("security") or {}
-    raw = sec.get("acked_advisories") or []
-    if not isinstance(raw, list):
-        return set()
-    return {str(x).strip() for x in raw if str(x).strip()}
-
-
-def ack_advisory(advisory_id: str) -> bool:
-    """Persist an ack for ``advisory_id``. Returns True on success.
-
-    Idempotent — acking an already-acked ID is a no-op.
-    """
-    advisory_id = advisory_id.strip()
-    if not advisory_id:
-        return False
-    try:
-        from hermes_cli.config import load_config, save_config
-    except Exception:
-        logger.warning("Could not import config module to persist ack")
-        return False
-    try:
-        cfg = load_config()
-        sec = cfg.setdefault("security", {})
-        existing = sec.get("acked_advisories") or []
-        if not isinstance(existing, list):
-            existing = []
-        if advisory_id not in existing:
-            existing.append(advisory_id)
-            sec["acked_advisories"] = existing
-            save_config(cfg)
-        return True
-    except Exception:
-        logger.exception("Failed to persist advisory ack for %s", advisory_id)
-        return False
-
-
-def filter_unacked(hits: list[AdvisoryHit]) -> list[AdvisoryHit]:
-    """Return only hits whose advisories the user has not dismissed."""
-    if not hits:
-        return []
-    acked = get_acked_ids()
-    return [h for h in hits if h.advisory.id not in acked]
-
-
-# =============================================================================
-# Rendering helpers
-# =============================================================================
-
-
-def _term_supports_color() -> bool:
-    if os.environ.get("NO_COLOR"):
-        return False
-    if not sys.stdout.isatty():
-        return False
-    return True
-
-
-def short_banner_lines(hits: list[AdvisoryHit]) -> list[str]:
-    """Return 1-3 short lines suitable for a startup banner.
-
-    Caller is responsible for color/styling. Always names the first hit
-    explicitly so the user knows what's wrong without running doctor.
-    """
-    if not hits:
-        return []
-    primary = hits[0]
-    lines = [
-        f"SECURITY ADVISORY [{primary.advisory.id}]: {primary.advisory.title}",
-        f"  Detected: {primary.package}=={primary.installed_version}",
-        "  Run 'hermes doctor' for remediation steps.",
-    ]
-    if len(hits) > 1:
-        lines.insert(1, f"  ({len(hits) - 1} additional advisor"
-                       f"{'ies' if len(hits) > 2 else 'y'} also active.)")
-    return lines
-
-
-def full_remediation_text(hit: AdvisoryHit) -> list[str]:
-    """Return a multi-line block describing the advisory + remediation."""
-    a = hit.advisory
-    lines = [
-        f"=== {a.title} ===",
-        f"ID:        {a.id}    Severity: {a.severity}    Published: {a.published}",
-        f"Detected:  {hit.package}=={hit.installed_version}",
-        f"Reference: {a.url}",
-        "",
-        a.summary,
-        "",
-        "Remediation:",
-    ]
-    for i, step in enumerate(a.remediation, 1):
-        lines.append(f"  {i}. {step}")
-    return lines
-
-
-# =============================================================================
-# Startup-banner gating
-#
-# We do NOT want to hammer the user with the banner on every command. Once
-# they've seen it inside a 24h window we cache that fact in
-# ``~/.hermes/cache/advisory_banner_seen`` (a single line per advisory ID:
-# ``<id> <unix_epoch_seconds>``).
-#
-# Acked advisories never re-banner. Cached-but-not-acked advisories
-# re-banner after 24h so the user doesn't fully forget.
-# =============================================================================
-
-
-_BANNER_CACHE_FILE = "advisory_banner_seen"
-_BANNER_REPEAT_HOURS = 24
-
-
-def _banner_cache_path() -> Optional[Path]:
-    try:
-        from hermes_constants import get_hermes_home
-        cache_dir = Path(get_hermes_home()) / "cache"
-        cache_dir.mkdir(parents=True, exist_ok=True)
-        return cache_dir / _BANNER_CACHE_FILE
-    except Exception:
-        return None
-
-
-def _read_banner_cache() -> dict[str, float]:
-    p = _banner_cache_path()
-    if p is None or not p.exists():
-        return {}
-    out: dict[str, float] = {}
-    try:
-        for line in p.read_text(encoding="utf-8").splitlines():
-            line = line.strip()
-            if not line:
-                continue
-            parts = line.split(None, 1)
-            if len(parts) != 2:
-                continue
-            advisory_id, ts = parts
-            try:
-                out[advisory_id] = float(ts)
-            except ValueError:
-                continue
-    except Exception:
-        return {}
-    return out
-
-
-def _write_banner_cache(seen: dict[str, float]) -> None:
-    p = _banner_cache_path()
-    if p is None:
-        return
-    try:
-        lines = [f"{aid} {ts}" for aid, ts in seen.items()]
-        p.write_text("\n".join(lines) + "\n", encoding="utf-8")
-    except Exception:
-        logger.debug("Could not write advisory banner cache", exc_info=True)
-
-
-def hits_due_for_banner(
-    hits: list[AdvisoryHit],
-    *,
-    repeat_hours: int = _BANNER_REPEAT_HOURS,
-) -> list[AdvisoryHit]:
-    """Return only hits whose banner is due (not acked, not recently shown).
-
-    Side effect: stamps the banner cache for any hit that's about to be
-    shown. Callers should subsequently render the result.
-    """
-    import time
-
-    fresh = filter_unacked(hits)
-    if not fresh:
-        return []
-    now = time.time()
-    cache = _read_banner_cache()
-    cutoff = now - (repeat_hours * 3600)
-
-    due: list[AdvisoryHit] = []
-    for hit in fresh:
-        last = cache.get(hit.advisory.id, 0.0)
-        if last < cutoff:
-            due.append(hit)
-            cache[hit.advisory.id] = now
-    if due:
-        _write_banner_cache(cache)
-    return due
-
-
-# =============================================================================
-# Public entry points used by doctor / CLI / gateway
-# =============================================================================
-
-
-def render_doctor_section(hits: list[AdvisoryHit]) -> tuple[bool, list[str]]:
-    """Render the security-advisory section for ``hermes doctor``.
-
-    Returns ``(has_problems, lines)``. Caller is responsible for printing
-    with whatever color scheme it uses.
-    """
-    fresh = filter_unacked(hits)
-    if not fresh:
-        return False, ["No active security advisories.  ✓"]
-
-    lines: list[str] = []
-    for i, hit in enumerate(fresh):
-        if i:
-            lines.append("")
-        lines.extend(full_remediation_text(hit))
-    return True, lines
-
-
-def startup_banner(hits: list[AdvisoryHit]) -> Optional[str]:
-    """Return a printable startup banner, or None if nothing is due.
-
-    Updates the banner cache as a side effect (so the next call within
-    24h returns None for the same hit).
-    """
-    due = hits_due_for_banner(hits)
-    if not due:
-        return None
-    lines = short_banner_lines(due)
-    if _term_supports_color():
-        red = "\x1b[1;31m"
-        reset = "\x1b[0m"
-        return red + "\n".join(lines) + reset
-    return "\n".join(lines)
-
-
-def gateway_log_message(hits: list[AdvisoryHit]) -> Optional[str]:
-    """Return a one-line log message for gateway operators, or None."""
-    fresh = filter_unacked(hits)
-    if not fresh:
-        return None
-    if len(fresh) == 1:
-        h = fresh[0]
-        return (f"Security advisory [{h.advisory.id}] active: "
-                f"{h.package}=={h.installed_version} matches {h.advisory.title}. "
-                f"See {h.advisory.url}")
-    return (f"{len(fresh)} security advisories active "
-            f"(IDs: {', '.join(h.advisory.id for h in fresh)}). "
-            f"Run `hermes doctor` on the gateway host for details.")
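A minimal, self-contained sketch of the banner gating that the removed module implemented in ``hits_due_for_banner``: an advisory is shown again only if it has not been acked and its last banner is older than the repeat window. The name ``due_advisory_ids`` and the in-memory ``cache`` dict are illustrative assumptions; the removed code read acks from config.yaml and persisted the cache to a file.

```python
import time


def due_advisory_ids(hit_ids, acked, cache, now=None, repeat_hours=24):
    """Return the IDs whose banner is due, stamping ``cache`` for each one.

    Hypothetical stand-in for the removed ``hits_due_for_banner``; ``cache``
    maps advisory ID -> Unix timestamp of the last banner shown.
    """
    now = time.time() if now is None else now
    cutoff = now - repeat_hours * 3600
    due = []
    for advisory_id in hit_ids:
        if advisory_id in acked:
            continue  # acked advisories never re-banner
        if cache.get(advisory_id, 0.0) < cutoff:
            due.append(advisory_id)
            cache[advisory_id] = now  # suppress repeats inside the window
    return due


cache = {}
t0 = 1_000_000.0
first = due_advisory_ids(["shai-hulud-2026-05"], set(), cache, now=t0)
within_window = due_advisory_ids(["shai-hulud-2026-05"], set(), cache, now=t0 + 3600)
after_window = due_advisory_ids(["shai-hulud-2026-05"], set(), cache, now=t0 + 25 * 3600)
```

Passing ``now`` explicitly keeps the window logic testable without patching the clock; the removed function always used ``time.time()``.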
@@ -454,26 +454,6 @@ def _print_setup_summary(config: dict, hermes_home):
        else:
            tool_status.append(("Image Generation", False, "FAL_KEY or OPENAI_API_KEY"))

-    # Video generation — opt-in via `hermes tools` → Video Generation.
-    # Only show the row when a plugin reports available so we don't badger
-    # users who don't care about video gen with a "missing" status line.
-    try:
-        from agent.video_gen_registry import list_providers as _list_video_providers
-        from hermes_cli.plugins import _ensure_plugins_discovered as _ensure_plugins
-        _ensure_plugins()
-        _video_backend = None
-        for _vp in _list_video_providers():
-            try:
-                if _vp.is_available():
-                    _video_backend = _vp.display_name
-                    break
-            except Exception:
-                continue
-    except Exception:
-        _video_backend = None
-    if _video_backend:
-        tool_status.append((f"Video Generation ({_video_backend})", True, None))
-
    # TTS — show configured provider
    tts_provider = cfg_get(config, "tts", "provider", default="edge")
    if subscription_features.tts.managed_by_nous:
@@ -3266,6 +3246,18 @@ def run_setup_wizard(args):
        print_info(f"  cp {_backup_path} {config_path}")
    _print_setup_summary(config, hermes_home)

+    _offer_launch_chat()
+
+
+def _offer_launch_chat():
+    """Prompt the user to jump straight into chat after setup."""
+    print()
+    if not prompt_yes_no("Launch hermes chat now?", True):
+        return
+
+    from hermes_cli.relaunch import relaunch
+    relaunch(["chat"])
+

 def _run_first_time_quick_setup(config: dict, hermes_home, is_existing: bool):
    """Streamlined first-time setup: provider, model, terminal & messaging.
@@ -3309,6 +3301,8 @@ def _run_first_time_quick_setup(config: dict, hermes_home, is_existing: bool):

    _print_setup_summary(config, hermes_home)

+    _offer_launch_chat()
+

 def _run_quick_setup(config: dict, hermes_home):
    """Quick setup — only configure items that are missing."""
@@ -666,46 +666,25 @@ def _load_skin_from_yaml(path: Path) -> Optional[Dict[str, Any]]:
    return None


-def _mapping_or_empty(value: Any, *, section: str, skin_name: str) -> Dict[str, Any]:
-    """Return a mapping value or an empty dict when the section type is invalid."""
-    if isinstance(value, dict):
-        return value
-    if value is None:
-        return {}
-    logger.warning(
-        "Skin '%s' has invalid '%s' section type (%s); ignoring section",
-        skin_name,
-        section,
-        type(value).__name__,
-    )
-    return {}
-
-
 def _build_skin_config(data: Dict[str, Any]) -> SkinConfig:
    """Build a SkinConfig from a raw dict (built-in or loaded from YAML)."""
    # Start with default values as base for missing keys
    default = _BUILTIN_SKINS["default"]
-    skin_name = str(data.get("name", "unknown"))
-    color_overrides = _mapping_or_empty(data.get("colors"), section="colors", skin_name=skin_name)
-    spinner_overrides = _mapping_or_empty(data.get("spinner"), section="spinner", skin_name=skin_name)
-    branding_overrides = _mapping_or_empty(data.get("branding"), section="branding", skin_name=skin_name)
-    emoji_overrides = _mapping_or_empty(data.get("tool_emojis"), section="tool_emojis", skin_name=skin_name)
-
    colors = dict(default.get("colors", {}))
-    colors.update(color_overrides)
+    colors.update(data.get("colors", {}))
    spinner = dict(default.get("spinner", {}))
-    spinner.update(spinner_overrides)
+    spinner.update(data.get("spinner", {}))
    branding = dict(default.get("branding", {}))
-    branding.update(branding_overrides)
+    branding.update(data.get("branding", {}))

    return SkinConfig(
-        name=skin_name,
+        name=data.get("name", "unknown"),
        description=data.get("description", ""),
        colors=colors,
        spinner=spinner,
        branding=branding,
        tool_prefix=data.get("tool_prefix", default.get("tool_prefix", "┊")),
-        tool_emojis=emoji_overrides,
+        tool_emojis=data.get("tool_emojis", {}),
        banner_logo=data.get("banner_logo", ""),
        banner_hero=data.get("banner_hero", ""),
    )
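The hunk above replaces the ``_mapping_or_empty`` guard with direct ``data.get(section, {})`` lookups. A minimal sketch of how the two differ, assuming a skin dict whose ``colors`` key is present but null (as a YAML ``colors:`` entry with no value would load). ``merge_guarded`` and ``merge_direct`` are hypothetical names used only for illustration, and the sketch omits the warning the removed helper logged for invalid section types.

```python
DEFAULT_COLORS = {"prompt": "#FFF8DC"}


def merge_guarded(data):
    """Treat a non-dict section as empty, as the removed helper did."""
    overrides = data.get("colors")
    if not isinstance(overrides, dict):
        overrides = {}
    colors = dict(DEFAULT_COLORS)
    colors.update(overrides)
    return colors


def merge_direct(data):
    """Pass the raw section straight to dict.update, as the new code does."""
    colors = dict(DEFAULT_COLORS)
    # The {} default only applies when the key is absent, not when it is None.
    colors.update(data.get("colors", {}))
    return colors


skin = {"name": "demo", "colors": None}
guarded = merge_guarded(skin)
try:
    merge_direct(skin)
    direct_error = None
except TypeError as exc:
    direct_error = exc
```

With a well-formed mapping both helpers return the same merged dict; they diverge only when the section is present but not a mapping.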
@@ -849,14 +828,10 @@ def get_prompt_toolkit_style_overrides() -> Dict[str, str]:
    except Exception:
        return {}

-    # Input/prompt: leave unset by default so the typed text inherits
-    # the terminal's foreground color (readable in both light and dark
-    # color schemes).  Skins can opt into a colored prompt by setting
-    # `prompt` explicitly in their YAML.
-    prompt = skin.get_color("prompt", "")
+    prompt = skin.get_color("prompt", "#FFF8DC")
    input_rule = skin.get_color("input_rule", "#CD7F32")
    title = skin.get_color("banner_title", "#FFD700")
-    text = skin.get_color("banner_text", "#FFF8DC")
+    text = skin.get_color("banner_text", prompt)
    dim = skin.get_color("banner_dim", "#555555")
    label = skin.get_color("ui_label", title)
    warn = skin.get_color("ui_warn", "#FF8C00")
@@ -876,11 +851,7 @@ def get_prompt_toolkit_style_overrides() -> Dict[str, str]:
    menu_meta_current_bg = skin.get_color("completion_menu_meta_current_bg", menu_current_bg)

    return {
-        # Typed input always uses terminal default fg/bg so it's
-        # readable in both light and dark Terminal.app modes.  The
-        # skin's `prompt` color (if any) only styles the prompt symbol,
-        # NOT the user's typed text.
-        "input-area": "",
+        "input-area": prompt,
        "placeholder": f"{dim} italic",
        "prompt": prompt,
        "prompt-working": f"{dim} italic",
@@ -60,7 +60,6 @@ CONFIGURABLE_TOOLSETS = [
    ("vision",          "👁️  Vision / Image Analysis",  "vision_analyze"),
    ("video",           "🎬 Video Analysis",            "video_analyze (requires video-capable model)"),
    ("image_gen",       "🎨 Image Generation",          "image_generate"),
-    ("video_gen",       "🎬 Video Generation",          "video_generate (text-to-video + image-to-video)"),
    ("moa",             "🧠 Mixture of Agents",         "mixture_of_agents"),
    ("tts",             "🔊 Text-to-Speech",            "text_to_speech"),
    ("skills",          "📚 Skills",                    "list, view, manage"),
@@ -83,11 +82,7 @@ CONFIGURABLE_TOOLSETS = [
 # Toolsets that are OFF by default for new installs.
 # They're still in _HERMES_CORE_TOOLS (available at runtime if enabled),
 # but the setup checklist won't pre-select them for first-time users.
-#
-# Video gen is off by default — it's a niche, paid, slow feature. Users
-# who want it opt in via `hermes tools` → Video Generation, which walks
-# them through provider + model selection.
-_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin", "video", "video_gen"}
+_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin", "video"}

 # Platform-scoped toolsets: only appear in the `hermes tools` checklist for
 # these platforms, and only resolve/save for these platforms.  A toolset
@@ -210,9 +205,15 @@ TOOL_CATEGORIES = {
                ],
                "tts_provider": "elevenlabs",
            },
-            # Mistral (Voxtral TTS) temporarily hidden — `mistralai` PyPI
-            # package is currently quarantined (malicious 2.4.6 release on
-            # 2026-05-12). Restore this entry once PyPI un-quarantines.
+            {
+                "name": "Mistral (Voxtral TTS)",
+                "badge": "paid",
+                "tag": "Multilingual, native Opus",
+                "env_vars": [
+                    {"key": "MISTRAL_API_KEY", "prompt": "Mistral API key", "url": "https://console.mistral.ai/"},
+                ],
+                "tts_provider": "mistral",
+            },
            {
                "name": "Google Gemini TTS",
                "badge": "preview",
@@ -245,15 +246,6 @@ TOOL_CATEGORIES = {
        "setup_title": "Select Search Provider",
        "setup_note": "A free DuckDuckGo search skill is also included — skip this if you don't need a premium provider.",
        "icon": "🔍",
-        # Per-provider rows are injected at runtime from
-        # plugins.web.<vendor>.provider via _plugin_web_search_providers()
-        # in _visible_providers(). Only non-provider UX setup-flow rows
-        # for the firecrawl backend are listed here:
-        #   - "Nous Subscription" — managed Firecrawl billed via Nous
-        #     subscription (requires_nous_auth + override_env_vars).
-        #   - "Firecrawl Self-Hosted" — points firecrawl at a private
-        #     Docker instance via FIRECRAWL_API_URL only.
-        # See PR #25182 for the migration rationale.
        "providers": [
            {
                "name": "Nous Subscription",
@@ -265,6 +257,42 @@ TOOL_CATEGORIES = {
                "managed_nous_feature": "web",
                "override_env_vars": ["FIRECRAWL_API_KEY", "FIRECRAWL_API_URL"],
            },
+            {
+                "name": "Firecrawl Cloud",
+                "badge": "★ recommended",
+                "tag": "Full-featured search, extract, and crawl",
+                "web_backend": "firecrawl",
+                "env_vars": [
+                    {"key": "FIRECRAWL_API_KEY", "prompt": "Firecrawl API key", "url": "https://firecrawl.dev"},
+                ],
+            },
+            {
+                "name": "Exa",
+                "badge": "paid",
+                "tag": "Neural search with semantic understanding",
+                "web_backend": "exa",
+                "env_vars": [
+                    {"key": "EXA_API_KEY", "prompt": "Exa API key", "url": "https://exa.ai"},
+                ],
+            },
+            {
+                "name": "Parallel",
+                "badge": "paid",
+                "tag": "AI-powered search and extract",
+                "web_backend": "parallel",
+                "env_vars": [
+                    {"key": "PARALLEL_API_KEY", "prompt": "Parallel API key", "url": "https://parallel.ai"},
+                ],
+            },
+            {
+                "name": "Tavily",
+                "badge": "free tier",
+                "tag": "Search, extract, and crawl — 1000 free searches/mo",
+                "web_backend": "tavily",
+                "env_vars": [
+                    {"key": "TAVILY_API_KEY", "prompt": "Tavily API key", "url": "https://app.tavily.com/home"},
+                ],
+            },
            {
                "name": "Firecrawl Self-Hosted",
                "badge": "free · self-hosted",
@@ -274,6 +302,32 @@ TOOL_CATEGORIES = {
                    {"key": "FIRECRAWL_API_URL", "prompt": "Your Firecrawl instance URL (e.g., http://localhost:3002)"},
                ],
            },
+            {
+                "name": "SearXNG",
+                "badge": "free · self-hosted · search only",
+                "tag": "Privacy-respecting metasearch engine — search only (pair with any extract provider)",
+                "web_backend": "searxng",
+                "env_vars": [
+                    {"key": "SEARXNG_URL", "prompt": "Your SearXNG instance URL (e.g., http://localhost:8080)", "url": "https://searxng.github.io/searxng/"},
+                ],
+            },
+            {
+                "name": "Brave Search (Free Tier)",
+                "badge": "free tier · search only",
+                "tag": "2,000 queries/mo free — search only (pair with any extract provider)",
+                "web_backend": "brave-free",
+                "env_vars": [
+                    {"key": "BRAVE_SEARCH_API_KEY", "prompt": "Brave Search subscription token", "url": "https://brave.com/search/api/"},
+                ],
+            },
+            {
+                "name": "DuckDuckGo (ddgs)",
+                "badge": "free · no key · search only",
+                "tag": "Search via the ddgs Python package — no API key (pair with any extract provider)",
+                "web_backend": "ddgs",
+                "env_vars": [],
+                "post_setup": "ddgs",
+            },
        ],
    },
    "image_gen": {
@@ -301,15 +355,6 @@ TOOL_CATEGORIES = {
            },
        ],
    },
-    "video_gen": {
-        "name": "Video Generation",
-        "icon": "🎬",
-        # Providers list is intentionally empty — every video gen backend
-        # is a plugin, surfaced by ``_plugin_video_gen_providers()`` and
-        # injected by ``_visible_providers``. Mirrors the design we'll
-        # converge image_gen toward.
-        "providers": [],
-    },
    "browser": {
        "name": "Browser Automation",
        "icon": "🌐",
@@ -1486,101 +1531,6 @@ def _plugin_image_gen_providers() -> list[dict]:
    return rows


-def _plugin_video_gen_providers() -> list[dict]:
-    """Build picker-row dicts from plugin-registered video gen providers.
-
-    Mirrors ``_plugin_image_gen_providers`` exactly — every video backend
-    is a plugin, so this function is the *only* source of provider rows
-    for the Video Generation category. The hardcoded ``TOOL_CATEGORIES``
-    entry for ``video_gen`` keeps an empty providers list.
-    """
-    try:
-        from agent.video_gen_registry import list_providers
-        from hermes_cli.plugins import _ensure_plugins_discovered
-
-        _ensure_plugins_discovered()
-        providers = list_providers()
-    except Exception:
-        return []
-
-    rows: list[dict] = []
-    for provider in providers:
-        try:
-            schema = provider.get_setup_schema()
-        except Exception:
-            continue
-        if not isinstance(schema, dict):
-            continue
-        rows.append(
-            {
-                "name": schema.get("name", provider.display_name),
-                "badge": schema.get("badge", ""),
-                "tag": schema.get("tag", ""),
-                "env_vars": schema.get("env_vars", []),
-                "video_gen_plugin_name": provider.name,
-            }
-        )
-    return rows
-
-
-# Mirror of _plugin_image_gen_providers for web search backends. Surfaces
-# every plugin-registered web provider so it appears in the
-# "Web Search & Extract" picker. All seven providers (brave-free, ddgs,
-# searxng, exa, parallel, tavily, firecrawl) live as plugins after
-# PR #25182 — this helper is the sole source of truth for the category's
-# provider rows. The hardcoded entries that used to drive the category
-# were deleted in the same PR; only the two non-provider UX rows
-# ("Nous Subscription" managed-gateway entry, "Firecrawl Self-Hosted")
-# remain in TOOL_CATEGORIES because they describe alternative *setup
-# flows* for the firecrawl backend rather than distinct providers.
-def _plugin_web_search_providers() -> list[dict]:
-    """Build picker-row dicts from plugin-registered web search providers.
-
-    Each returned dict is a regular ``TOOL_CATEGORIES`` provider row. It
-    populates both ``web_backend`` (legacy field consumed by setup +
-    selection helpers) and ``web_search_plugin_name`` (informational
-    marker) so the picker behaves identically whether a provider is
-    hardcoded or plugin-registered.
-
-    After PR #25182, all seven web providers (brave-free, ddgs, searxng,
-    exa, parallel, tavily, firecrawl) are plugins; this helper is the sole
-    source of provider rows for the Web Search & Extract category.
-    """
-    try:
-        from agent.web_search_registry import list_providers as _list_web_providers
-        from hermes_cli.plugins import _ensure_plugins_discovered
-
-        _ensure_plugins_discovered()
-        providers = _list_web_providers()
-    except Exception:
-        return []
-
-    rows: list[dict] = []
-    for provider in providers:
-        name = getattr(provider, "name", None)
-        if not name:
-            continue
-        try:
-            schema = provider.get_setup_schema()
-        except Exception:
-            continue
-        if not isinstance(schema, dict):
-            continue
-        row = {
-            "name": schema.get("name", provider.display_name),
-            "badge": schema.get("badge", ""),
-            "tag": schema.get("tag", ""),
-            "env_vars": schema.get("env_vars", []),
-            "web_backend": name,
-            "web_search_plugin_name": name,
-        }
-        # Optional pass-through fields the schema can opt into.
-        if schema.get("post_setup"):
-            row["post_setup"] = schema["post_setup"]
-        rows.append(row)
-    return rows
-
-
 def _visible_providers(cat: dict, config: dict) -> list[dict]:
    """Return provider entries visible for the current auth/config state."""
    features = get_nous_subscription_features(config)
@@ -1597,19 +1547,6 @@ def _visible_providers(cat: dict, config: dict) -> list[dict]:
    if cat.get("name") == "Image Generation":
        visible.extend(_plugin_image_gen_providers())

-    # Inject plugin-registered video_gen backends. Unlike image_gen,
-    # video_gen has NO hardcoded providers — every backend is a plugin.
-    if cat.get("name") == "Video Generation":
-        visible.extend(_plugin_video_gen_providers())
-
-    # Inject plugin-registered web search backends. After PR #25182, this
-    # is the SOLE source of provider rows for the Web Search & Extract
-    # category — the per-provider hardcoded entries were deleted. The two
-    # remaining hardcoded rows ("Nous Subscription", "Firecrawl
-    # Self-Hosted") are non-provider UX setup-flow rows for firecrawl.
-    if cat.get("name") == "Web Search & Extract":
-        visible.extend(_plugin_web_search_providers())
-
    return visible


@@ -1677,23 +1614,6 @@ def _toolset_needs_configuration_prompt(ts_key: str, config: dict) -> bool:
            from agent.image_gen_registry import list_providers
            from hermes_cli.plugins import _ensure_plugins_discovered

-            _ensure_plugins_discovered()
-            for provider in list_providers():
-                try:
-                    if provider.is_available():
-                        return False
-                except Exception:
-                    continue
-        except Exception:
-            pass
-        return True
-    if ts_key == "video_gen":
-        # Satisfied when any plugin-registered video gen provider reports
-        # available — no in-tree fallback (every backend is a plugin).
-        try:
-            from agent.video_gen_registry import list_providers
-            from hermes_cli.plugins import _ensure_plugins_discovered
-
            _ensure_plugins_discovered()
            for provider in list_providers():
                try:
@@ -2038,106 +1958,6 @@ def _select_plugin_image_gen_provider(plugin_name: str, config: dict) -> None:
    _configure_imagegen_model_for_plugin(plugin_name, config)


-# ─── Video Generation Model Pickers ───────────────────────────────────────────
-
-
-def _plugin_video_gen_catalog(plugin_name: str):
-    """Return ``(catalog_dict, default_model_id)`` for a video gen plugin.
-
-    Mirrors :func:`_plugin_image_gen_catalog`. Returns ``({}, None)`` when
-    the plugin isn't registered or has no models.
-    """
-    try:
-        from agent.video_gen_registry import get_provider
-        from hermes_cli.plugins import _ensure_plugins_discovered
-
-        _ensure_plugins_discovered()
-        provider = get_provider(plugin_name)
-    except Exception:
-        return {}, None
-    if provider is None:
-        return {}, None
-    try:
-        models = provider.list_models() or []
-        default = provider.default_model()
-    except Exception:
-        return {}, None
-    catalog = {m["id"]: m for m in models if isinstance(m, dict) and "id" in m}
-    return catalog, default
-
-
-def _configure_videogen_model_for_plugin(plugin_name: str, config: dict) -> None:
-    """Prompt for a video gen model from a plugin's catalog.
-
-    Mirrors :func:`_configure_imagegen_model_for_plugin`. Writes the
-    selection to ``video_gen.model``.
-    """
-    catalog, default_model = _plugin_video_gen_catalog(plugin_name)
-    if not catalog:
-        return
-
-    cur_cfg = config.setdefault("video_gen", {})
-    if not isinstance(cur_cfg, dict):
-        cur_cfg = {}
-        config["video_gen"] = cur_cfg
-    current_model = cur_cfg.get("model") or default_model
-    if current_model not in catalog:
-        current_model = default_model
-
-    model_ids = list(catalog.keys())
-    ordered = [current_model] + [m for m in model_ids if m != current_model]
-
-    widths = {
-        "model": max(len(m) for m in model_ids),
-        "speed": max((len(catalog[m].get("speed", "")) for m in model_ids), default=6),
-        "strengths": max((len(catalog[m].get("strengths", "")) for m in model_ids), default=0),
-    }
-
-    print()
-    header = (
-        f"  {'Model':<{widths['model']}}  "
-        f"{'Speed':<{widths['speed']}}  "
-        f"{'Strengths':<{widths['strengths']}}  "
-        f"Price"
-    )
-    print(color(header, Colors.CYAN))
-
-    rows = []
-    for mid in ordered:
-        meta = catalog[mid]
-        row = (
-            f"  {mid:<{widths['model']}}  "
-            f"{meta.get('speed', ''):<{widths['speed']}}  "
-            f"{meta.get('strengths', ''):<{widths['strengths']}}  "
-            f"{meta.get('price', '')}"
-        )
-        if mid == current_model:
-            row += "  ← currently in use"
-        rows.append(row)
-
-    idx = _prompt_choice(
-        f"  Choose {plugin_name} model:",
-        rows,
-        default=0,
-    )
-
-    chosen = ordered[idx]
-    cur_cfg["model"] = chosen
-    _print_success(f"  Model set to: {chosen}")
-
-
-def _select_plugin_video_gen_provider(plugin_name: str, config: dict) -> None:
-    """Persist a plugin-backed video generation provider selection."""
-    vid_cfg = config.setdefault("video_gen", {})
-    if not isinstance(vid_cfg, dict):
-        vid_cfg = {}
-        config["video_gen"] = vid_cfg
-    vid_cfg["provider"] = plugin_name
-    vid_cfg["use_gateway"] = False
-    _print_success(f"  video_gen.provider set to: {plugin_name}")
-    _configure_videogen_model_for_plugin(plugin_name, config)
-
-
 def _configure_provider(provider: dict, config: dict):
    """Configure a single provider - prompt for API keys and set config."""
    env_vars = provider.get("env_vars", [])
@@ -2200,12 +2020,6 @@ def _configure_provider(provider: dict, config: dict):
        if plugin_name:
            _select_plugin_image_gen_provider(plugin_name, config)
            return
-        # Plugin-registered video_gen provider — same flow, different
-        # registry.
-        video_plugin = provider.get("video_gen_plugin_name")
-        if video_plugin:
-            _select_plugin_video_gen_provider(video_plugin, config)
-            return
        # Imagegen backends prompt for model selection after backend pick.
        backend = provider.get("imagegen_backend")
        if backend:
@@ -2254,10 +2068,6 @@ def _configure_provider(provider: dict, config: dict):
        if plugin_name:
            _select_plugin_image_gen_provider(plugin_name, config)
            return
-        video_plugin = provider.get("video_gen_plugin_name")
-        if video_plugin:
-            _select_plugin_video_gen_provider(video_plugin, config)
-            return
        # Imagegen backends prompt for model selection after env vars are in.
        backend = provider.get("imagegen_backend")
        if backend:
@@ -2482,11 +2292,6 @@ def _reconfigure_provider(provider: dict, config: dict):
        if plugin_name:
            _select_plugin_image_gen_provider(plugin_name, config)
            return
-        # Plugin-registered video_gen provider — same flow, different registry.
-        video_plugin = provider.get("video_gen_plugin_name")
-        if video_plugin:
-            _select_plugin_video_gen_provider(video_plugin, config)
-            return
        # Imagegen backends prompt for model selection on reconfig too.
        backend = provider.get("imagegen_backend")
        if backend:
@@ -2519,12 +2324,6 @@ def _reconfigure_provider(provider: dict, config: dict):
        _select_plugin_image_gen_provider(plugin_name, config)
        return

-    # Plugin-registered video_gen provider — same flow, different registry.
-    video_plugin = provider.get("video_gen_plugin_name")
-    if video_plugin:
-        _select_plugin_video_gen_provider(video_plugin, config)
-        return
-
    backend = provider.get("imagegen_backend")
    if backend:
        _configure_imagegen_model(backend, config)
@@ -56,22 +56,10 @@ try:
    from fastapi.staticfiles import StaticFiles
    from pydantic import BaseModel
 except ImportError:
-    # First try lazy-installing the dashboard extras. Only the user actually
-    # running `hermes dashboard` needs fastapi+uvicorn; lazy install keeps
-    # them out of every other install path. After install, re-import.
-    try:
-        from tools.lazy_deps import ensure as _lazy_ensure
-        _lazy_ensure("tool.dashboard", prompt=False)
-        from fastapi import FastAPI, HTTPException, Request, WebSocket, WebSocketDisconnect
-        from fastapi.middleware.cors import CORSMiddleware
-        from fastapi.responses import FileResponse, HTMLResponse, JSONResponse, Response
-        from fastapi.staticfiles import StaticFiles
-        from pydantic import BaseModel
-    except Exception:
-        raise SystemExit(
-            "Web UI requires fastapi and uvicorn.\n"
-            f"Install with: {sys.executable} -m pip install 'fastapi' 'uvicorn[standard]'"
-        )
+    raise SystemExit(
+        "Web UI requires fastapi and uvicorn.\n"
+        f"Install with: {sys.executable} -m pip install 'fastapi' 'uvicorn[standard]'"
+    )

 WEB_DIST = Path(os.environ["HERMES_WEB_DIST"]) if "HERMES_WEB_DIST" in os.environ else Path(__file__).parent / "web_dist"
 _log = logging.getLogger(__name__)
@@ -285,9 +273,7 @@ _SCHEMA_OVERRIDES: Dict[str, Dict[str, Any]] = {
    "stt.provider": {
        "type": "select",
        "description": "Speech-to-text provider",
-        # "mistral" temporarily removed — mistralai PyPI package quarantined
-        # (malicious 2.4.6 release on 2026-05-12). Restore once available.
-        "options": ["local", "openai"],
+        "options": ["local", "openai", "mistral"],
    },
    "display.skin": {
        "type": "select",
@@ -994,9 +980,39 @@ def get_model_options():
    can share the same types.
    """
    try:
-        from hermes_cli.inventory import build_models_payload, load_picker_context
+        from hermes_cli.model_switch import list_authenticated_providers

-        return build_models_payload(load_picker_context(), max_models=50)
+        cfg = load_config()
+        model_cfg = cfg.get("model", {})
+        if isinstance(model_cfg, dict):
+            current_model = model_cfg.get("default", model_cfg.get("name", "")) or ""
+            current_provider = model_cfg.get("provider", "") or ""
+            current_base_url = model_cfg.get("base_url", "") or ""
+        else:
+            current_model = str(model_cfg) if model_cfg else ""
+            current_provider = ""
+            current_base_url = ""
+
+        user_providers = cfg.get("providers") if isinstance(cfg.get("providers"), dict) else {}
+        custom_providers = (
+            cfg.get("custom_providers")
+            if isinstance(cfg.get("custom_providers"), list)
+            else []
+        )
+
+        providers = list_authenticated_providers(
+            current_provider=current_provider,
+            current_base_url=current_base_url,
+            current_model=current_model,
+            user_providers=user_providers,
+            custom_providers=custom_providers,
+            max_models=50,
+        )
+        return {
+            "providers": providers,
+            "model": current_model,
+            "provider": current_provider,
+        }
    except Exception:
        _log.exception("GET /api/model/options failed")
        raise HTTPException(status_code=500, detail="Failed to list model options")
@@ -2037,7 +2053,6 @@ def _minimax_poller(session_id: str) -> None:
    """
    from hermes_cli.auth import (
        _minimax_poll_token,
-        _minimax_resolve_token_expiry_unix,
        _minimax_save_auth_state,
        MINIMAX_OAUTH_GLOBAL_INFERENCE,
        MINIMAX_OAUTH_SCOPE,
@@ -2075,10 +2090,8 @@ def _minimax_poller(session_id: str) -> None:
        # dashboard path; cn-region operators can still use the CLI
        # flow which supports `--region cn`.
        now = datetime.now(timezone.utc)
-        expires_at_ts = _minimax_resolve_token_expiry_unix(
-            int(token_data["expired_in"]), now=now,
-        )
-        expires_in_s = max(0, int(expires_at_ts - now.timestamp()))
+        expires_in_s = int(token_data["expired_in"])
+        expires_at_ts = now.timestamp() + expires_in_s
        auth_state = {
            "provider": "minimax-oauth",
            "region": sess.get("region", "global"),
@@ -3991,9 +4004,6 @@ def _get_dashboard_plugins(force_rescan: bool = False) -> list:
    global _dashboard_plugins_cache
    if _dashboard_plugins_cache is None or force_rescan:
        _dashboard_plugins_cache = _discover_dashboard_plugins()
-    elif _dashboard_plugins_cache:
-        if any(not Path(p["_dir"]).is_dir() for p in _dashboard_plugins_cache):
-            _dashboard_plugins_cache = _discover_dashboard_plugins()
    return _dashboard_plugins_cache


@@ -4405,33 +4415,11 @@ def start_server(
    if open_browser:
        import webbrowser

-        # On headless Linux (no DISPLAY or WAYLAND_DISPLAY) some registered
-        # browsers are TUI programs (links, lynx, www-browser) that try to
-        # take over the terminal.  That can send SIGHUP to the server process
-        # and cause an immediate exit even though uvicorn bound successfully.
-        # Skip the auto-open attempt on headless systems and let the user
-        # open the URL manually.  macOS and Windows are always considered
-        # display-capable.
-        _has_display = (
-            sys.platform != "linux"
-            or bool(os.environ.get("DISPLAY"))
-            or bool(os.environ.get("WAYLAND_DISPLAY"))
-        )
+        def _open():
+            time.sleep(1.0)
+            webbrowser.open(f"http://{host}:{port}")

-        if _has_display:
-            def _open():
-                try:
-                    time.sleep(1.0)
-                    webbrowser.open(f"http://{host}:{port}")
-                except Exception:
-                    pass
-
-            threading.Thread(target=_open, daemon=True).start()
-        else:
-            _log.debug(
-                "Skipping browser-open: no DISPLAY or WAYLAND_DISPLAY detected "
-                "(headless Linux). Pass --no-open to suppress this detection."
-            )
+        threading.Thread(target=_open, daemon=True).start()

    print(f"  Hermes Web UI → http://{host}:{port}")
    uvicorn.run(app, host=host, port=port, log_level="warning")
@@ -1597,10 +1597,10 @@ class SessionDB:
        self._execute_write(_do)

    def get_messages(self, session_id: str) -> List[Dict[str, Any]]:
-        """Load all messages for a session, ordered by insertion order."""
+        """Load all messages for a session, ordered by timestamp."""
        with self._lock:
            cursor = self._conn.execute(
-                "SELECT * FROM messages WHERE session_id = ? ORDER BY id",
+                "SELECT * FROM messages WHERE session_id = ? ORDER BY timestamp, id",
                (session_id,),
            )
            rows = cursor.fetchall()
@@ -1700,7 +1700,7 @@ class SessionDB:
                "SELECT role, content, tool_call_id, tool_calls, tool_name, "
                "finish_reason, reasoning, reasoning_content, reasoning_details, "
                "codex_reasoning_items, codex_message_items "
-                f"FROM messages WHERE session_id IN ({placeholders}) ORDER BY id",
+                f"FROM messages WHERE session_id IN ({placeholders}) ORDER BY timestamp, id",
                tuple(session_ids),
            ).fetchall()

@@ -0,0 +1,232 @@
+---
+name: base
+description: Query Base (Ethereum L2) blockchain data with USD pricing — wallet balances, token info, transaction details, gas analysis, contract inspection, whale detection, and live network stats. Uses Base RPC + CoinGecko. No API key required.
+version: 0.1.0
+author: youssefea
+license: MIT
+platforms: [linux, macos, windows]
+metadata:
+  hermes:
+    tags: [Base, Blockchain, Crypto, Web3, RPC, DeFi, EVM, L2, Ethereum]
+    related_skills: []
+---
+
+# Base Blockchain Skill
+
+Query Base (Ethereum L2) on-chain data enriched with USD pricing via CoinGecko.
+8 commands: wallet portfolio, token info, transactions, gas analysis,
+contract inspection, whale detection, network stats, and price lookup.
+
+No API key needed. Uses only the Python standard library (urllib, json, argparse).
+
+---
+
+## When to Use
+
+- User asks for a Base wallet balance, token holdings, or portfolio value
+- User wants to inspect a specific transaction by hash
+- User wants ERC-20 token metadata, price, supply, or market cap
+- User wants to understand Base gas costs and L1 data fees
+- User wants to inspect a contract (ERC type detection, proxy resolution)
+- User wants to find large ETH transfers (whale detection)
+- User wants Base network health, gas price, or ETH price
+- User asks "what's the price of USDC/AERO/DEGEN/ETH?"
+
+---
+
+## Prerequisites
+
+The helper script uses only the Python standard library (urllib, json, argparse).
+No external packages required.
+
+Pricing data comes from CoinGecko's free API (no key needed, rate-limited
+to ~10-30 requests/minute). For faster lookups, use the `--no-prices` flag.
+
+---
+
+## Quick Reference
+
+RPC endpoint (default): https://mainnet.base.org
+Override: export BASE_RPC_URL=https://your-private-rpc.com
+
+Helper script path: ~/.hermes/skills/blockchain/base/scripts/base_client.py
+
+```
+python3 base_client.py wallet   <address> [--limit N] [--all] [--no-prices]
+python3 base_client.py tx       <hash>
+python3 base_client.py token    <contract_address>
+python3 base_client.py gas
+python3 base_client.py contract <address>
+python3 base_client.py whales   [--min-eth N]
+python3 base_client.py stats
+python3 base_client.py price    <contract_address_or_symbol>
+```
+
+---
+
+## Procedure
+
+### 0. Setup Check
+
+```bash
+python3 --version
+
+# Optional: override the RPC endpoint. The public default is shown here;
+# substitute a private RPC URL for better rate limits.
+export BASE_RPC_URL="https://mainnet.base.org"
+
+# Confirm connectivity
+python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py stats
+```
+
+### 1. Wallet Portfolio
+
+Get ETH balance and ERC-20 token holdings with USD values.
+Checks ~15 well-known Base tokens (USDC, WETH, AERO, DEGEN, etc.)
+via on-chain `balanceOf` calls. Tokens sorted by value, dust filtered.
+
+```bash
+python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py \
+  wallet 0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045
+```
+
+Flags:
+- `--limit N` — show top N tokens (default: 20)
+- `--all` — show all tokens, no dust filter, no limit
+- `--no-prices` — skip CoinGecko price lookups (faster, RPC-only)
+
+Output includes: ETH balance + USD value, token list with prices sorted
+by value, dust count, total portfolio value in USD.
+
+Note: Only checks known tokens. Unknown ERC-20s are not discovered.
+Use the `token` command with a specific contract address for any token.
+
+### 2. Transaction Details
+
+Inspect a full transaction by its hash. Shows ETH value transferred,
+gas used, fee in ETH/USD, status, and decoded ERC-20/ERC-721 transfers.
+
+```bash
+python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py \
+  tx 0xabc123...your_tx_hash_here
+```
+
+Output: hash, block, from, to, value (ETH + USD), gas price, gas used,
+fee, status, contract creation address (if any), token transfers.
+
+### 3. Token Info
+
+Get ERC-20 token metadata: name, symbol, decimals, total supply, price,
+market cap, and contract code size.
+
+```bash
+python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py \
+  token 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913
+```
+
+Output: name, symbol, decimals, total supply, price, market cap.
+Reads name/symbol/decimals directly from the contract via eth_call.
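The `eth_call` reads mentioned above can be sketched roughly as follows. This is an illustrative sketch, not the code in `base_client.py`: the helper names `build_eth_call`, `decode_uint`, and `decode_string` are hypothetical, and only the four-byte selectors are the standard ERC-20 ones.

```python
# Standard ERC-20 function selectors (first 4 bytes of keccak256 of the signature).
SELECTORS = {
    "name": "0x06fdde03",
    "symbol": "0x95d89b41",
    "decimals": "0x313ce567",
    "totalSupply": "0x18160ddd",
}


def build_eth_call(contract: str, method: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC eth_call payload for a zero-argument ERC-20 getter."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_call",
        "params": [{"to": contract, "data": SELECTORS[method]}, "latest"],
    }


def decode_uint(hex_result: str) -> int:
    """Decode a single ABI-encoded unsigned integer (decimals, totalSupply)."""
    return int(hex_result, 16)


def decode_string(hex_result: str) -> str:
    """Decode a dynamic ABI string: 32-byte offset, 32-byte length, then the bytes."""
    raw = bytes.fromhex(hex_result[2:])
    offset = int.from_bytes(raw[0:32], "big")
    length = int.from_bytes(raw[offset:offset + 32], "big")
    return raw[offset + 32:offset + 32 + length].decode("utf-8", errors="replace")
```

The payload would be POSTed as JSON to the RPC endpoint. A contract that does not implement a getter typically returns an empty result (`0x`), which a caller would need to handle before decoding.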
+
+### 4. Gas Analysis
+
+Detailed gas analysis with cost estimates for common operations.
+Shows current gas price, base fee trends over 10 blocks, block
+utilization, and estimated costs for ETH transfers, ERC-20 transfers,
+and swaps.
+
+```bash
+python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py gas
+```
+
+Output: current gas price, base fee, block utilization, 10-block trend,
+cost estimates in ETH and USD.
+
+Note: Base is an L2 — actual transaction costs include an L1 data
+posting fee that depends on calldata size and L1 gas prices. The
+estimates shown are for L2 execution only.
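The L2-only estimate described above comes down to simple arithmetic, sketched here. The function names and the sample gas price are assumptions for illustration. The 21,000 gas figure for a plain ETH transfer is the standard intrinsic cost; gas figures for ERC-20 transfers and swaps vary by contract and are not assumed here.

```python
WEI_PER_ETH = 10 ** 18
ETH_TRANSFER_GAS = 21_000  # intrinsic gas for a plain ETH transfer


def l2_fee_wei(gas_units: int, gas_price_wei: int) -> int:
    """L2 execution fee in wei. Excludes the L1 data posting fee."""
    return gas_units * gas_price_wei


def fee_in_usd(fee_wei: int, eth_price_usd: float) -> float:
    """Convert a fee in wei to USD at the given ETH price."""
    return fee_wei / WEI_PER_ETH * eth_price_usd


# Example with a hypothetical gas price of 0.01 gwei (10_000_000 wei).
fee = l2_fee_wei(ETH_TRANSFER_GAS, 10_000_000)
```

Keeping the fee in integer wei until the final conversion avoids accumulating floating-point error across intermediate steps.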
+
+### 5. Contract Inspection
+
+Inspect an address: determine if it's an EOA or contract, detect
+ERC-20/ERC-721/ERC-1155 interfaces, resolve EIP-1967 proxy
+implementation addresses.
+
+```bash
+python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py \
+  contract 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913
+```
+
+Output: is_contract, code size, ETH balance, detected interfaces
+(ERC-20, ERC-721, ERC-1155), ERC-20 metadata, proxy implementation
+address.
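EIP-1967 proxy resolution, as described above, amounts to reading one well-known storage slot. A minimal sketch follows, assuming the implementation address is read with `eth_getStorageAt`. The helper names are hypothetical and this is not taken from `base_client.py`.

```python
from typing import Optional

# EIP-1967 implementation slot: keccak256("eip1967.proxy.implementation") - 1
EIP1967_IMPL_SLOT = "0x360894a13ba1a32210667c828492db98dca3e2076cc3735a920a3ca505d382bc"


def build_storage_request(address: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC eth_getStorageAt payload for the implementation slot."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getStorageAt",
        "params": [address, EIP1967_IMPL_SLOT, "latest"],
    }


def implementation_from_slot(slot_value: str) -> Optional[str]:
    """Extract the implementation address from a 32-byte slot value.

    The address occupies the low 20 bytes. An all-zero slot means the
    contract is not an EIP-1967 proxy (or has no implementation set).
    """
    word = slot_value[2:].rjust(64, "0")
    addr = word[-40:]
    if int(addr, 16) == 0:
        return None
    return "0x" + addr
```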
+
+### 6. Whale Detector
+
+Scan the most recent block for large ETH transfers with USD values.
+
+```bash
+python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py \
+  whales --min-eth 1.0
+```
+
+Note: scans the latest block only — point-in-time snapshot, not historical.
+Default threshold is 1.0 ETH (lower than Solana's default, since ETH is
+priced higher per unit).
+
+### 7. Network Stats
+
+Live Base network health: latest block, chain ID, gas price, base fee,
+block utilization, transaction count, and ETH price.
+
+```bash
+python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py stats
+```
+
+### 8. Price Lookup
+
+Quick price check for any token by contract address or known symbol.
+
+```bash
+python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py price ETH
+python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py price USDC
+python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py price AERO
+python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py price DEGEN
+python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py price 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913
+```
+
+Known symbols: ETH, WETH, USDC, cbETH, AERO, DEGEN, TOSHI, BRETT,
+WELL, wstETH, rETH, cbBTC.
+
+---
+
+## Pitfalls
+
+- **CoinGecko rate-limits** — free tier allows ~10-30 requests/minute.
+  Price lookups use 1 request per token. Use `--no-prices` for speed.
+- **Public RPC rate-limits** — Base's public RPC limits requests.
+  For production use, set BASE_RPC_URL to a private endpoint
+  (Alchemy, QuickNode, Infura).
+- **Wallet shows known tokens only** — unlike Solana, EVM chains have no
+  built-in "get all tokens" RPC. The wallet command checks ~15 popular
+  Base tokens via `balanceOf`. Unknown ERC-20s won't appear. Use the
+  `token` command for any specific contract.
+- **Token names read from contract** — if a contract doesn't implement
+  `name()` or `symbol()`, these fields may be empty. Known tokens have
+  hardcoded labels as fallback.
+- **Gas estimates are L2 only** — Base transaction costs include an L1
+  data posting fee (depends on calldata size and L1 gas prices). The gas
+  command estimates L2 execution cost only.
+- **Whale detector scans latest block only** — not historical. Results
+  vary by the moment you query. Default threshold is 1.0 ETH.
+- **Proxy detection** — only EIP-1967 proxies are detected. Other proxy
+  patterns (EIP-1167 minimal proxy, custom storage slots) are not checked.
+- **Retry on 429** — both RPC and CoinGecko calls retry up to 2 times
+  with exponential backoff on rate-limit errors.
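The retry behaviour in the last pitfall can be sketched as below. This is a hedged illustration of retry-on-429 with exponential backoff, not the helper script's actual code. The function name `fetch_with_retry`, the one-second base delay, and the injectable `opener` and `sleep` parameters are assumptions made so the logic is easy to exercise.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, retries=2, base_delay=1.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """GET a URL, retrying on HTTP 429 with exponential backoff.

    Makes at most `retries + 1` attempts. Any other HTTP error, or a 429 on
    the final attempt, is re-raised to the caller.
    """
    for attempt in range(retries + 1):
        try:
            with opener(url) as resp:
                return resp.read()
        except urllib.error.HTTPError as exc:
            if exc.code != 429 or attempt == retries:
                raise
            # Delay doubles each time: base_delay, 2 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
```

With the defaults, a request that keeps hitting the rate limit waits 1 s and then 2 s before giving up on the third attempt.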
+
+---
+
+## Verification
+
+```bash
+# Should print Base chain ID (8453), latest block, gas price, and ETH price
+python3 ~/.hermes/skills/blockchain/base/scripts/base_client.py stats
+```
@@ -1,211 +0,0 @@
---
-name: evm
-description: "Read-only EVM client: wallets, tokens, gas across 8 chains."
-version: 1.0.0
-author: Mibayy (@Mibayy), youssefea (@youssefea), ethernet8023 (@ethernet8023), Hermes Agent
-license: MIT
-platforms: [linux, macos, windows]
-metadata:
-  hermes:
-    tags: [EVM, Ethereum, BNB, BSC, Base, Arbitrum, Polygon, Optimism, Avalanche, zkSync, Blockchain, Crypto, Web3, DeFi, NFT, ENS, Whale, Security]
-    category: blockchain
-    related_skills: [solana]
-    requires_toolsets: [terminal]
---
-
-# EVM Blockchain Skill
-
-Query EVM-compatible blockchain data across 8 chains with USD pricing.
-14 commands: wallet portfolio, token info, transactions, activity, gas tracker,
-network stats, price lookup, multi-chain scan, whale detection, ENS resolution,
-allowance checker, contract inspector, and transaction decoder.
-
-Supports 8 chains: Ethereum, BNB Chain (BSC), Base, Arbitrum One, Polygon,
-Optimism, Avalanche (C-Chain), zkSync Era.
-
-No API key needed. Zero external dependencies — Python standard library only
-(urllib, json, argparse, threading).
-
-> **Supersedes the standalone `base` skill.** Base-specific tokens (AERO, DEGEN,
-> TOSHI, BRETT, WELL, cbETH, cbBTC, wstETH, rETH) and all Base RPC functionality
-> previously living under `optional-skills/blockchain/base/` have been folded
-> into this skill. Pass `--chain base` to any command for Base coverage.
-
---
-
-## When to Use
- User asks for a wallet balance or portfolio on any EVM chain
- User wants to check the same wallet across ALL chains at once
- User wants to inspect a transaction by hash (or decode what it did)
- User wants ERC-20 token metadata, price, supply, or market cap
- User wants recent transaction history for an address
- User wants current gas prices or to compare fees across chains
- User wants to find large whale transfers in recent blocks
- User asks to resolve an ENS name (vitalik.eth) or reverse-lookup an address
- User wants to check if a contract has dangerous token approvals
- User wants to inspect a smart contract (proxy? ERC-20? ERC-721? bytecode size?)
- User wants to compare gas costs across chains before a transaction
-
---
-
-## Prerequisites
-Python 3.8+ standard library only. No pip installs required.
-Pricing: CoinGecko free API (rate-limited, ~10-30 req/min).
-ENS: ensideas.com public API.
-Tx decoding: 4byte.directory public API.
-
-Override RPC endpoint: `export EVM_RPC_URL=https://your-rpc.com`
-
-Helper script path: `~/.hermes/skills/blockchain/evm/scripts/evm_client.py`
-
---
-
-## Quick Reference
-
-```
-SCRIPT=~/.hermes/skills/blockchain/evm/scripts/evm_client.py
-
-# Network & prices
-python3 $SCRIPT stats                            # Ethereum stats
-python3 $SCRIPT stats --chain arbitrum           # Arbitrum stats
-python3 $SCRIPT compare                          # Gas + prices ALL 8 chains
-
-# Wallet
-python3 $SCRIPT wallet 0xd8dA...96045            # Portfolio (ETH + ERC-20)
-python3 $SCRIPT wallet 0xd8dA...96045 --chain bsc
-python3 $SCRIPT multichain 0xd8dA...96045        # Same wallet on ALL chains
-
-# Tokens & prices
-python3 $SCRIPT price ETH
-python3 $SCRIPT price 0xdAC1...1ec7              # By contract address
-python3 $SCRIPT token 0xdAC1...1ec7              # ERC-20 metadata + market cap
-
-# Transactions
-python3 $SCRIPT tx 0x5c50...f060                 # Transaction details
-python3 $SCRIPT decode 0x5c50...f060             # Decode input data (4byte.directory)
-python3 $SCRIPT activity 0xd8dA...96045          # Recent transactions
-
-# Gas
-python3 $SCRIPT gas                              # Gas prices + cost estimates
-python3 $SCRIPT gas --chain optimism
-
-# Security
-python3 $SCRIPT allowance 0xd8dA...96045         # Dangerous ERC-20 approvals
-python3 $SCRIPT contract 0xdAC1...1ec7           # Contract inspection (proxy? standards?)
-
-# ENS
-python3 $SCRIPT ens vitalik.eth                  # Name -> address + profile
-python3 $SCRIPT ens 0xd8dA...96045               # Address -> ENS name
-
-# Whale detection
-python3 $SCRIPT whale                            # Large transfers (last 20 blocks, >$10k)
-python3 $SCRIPT whale --blocks 50 --min-usd 100000 --chain arbitrum
-```
-
---
-
-## Procedure
-
-### 0. Setup Check
-```bash
-python3 --version   # 3.8+ required
-python3 ~/.hermes/skills/blockchain/evm/scripts/evm_client.py stats
-```
-
-### 1. Wallet Portfolio
-Native balance + known ERC-20 tokens, sorted by USD value.
-```bash
-python3 $SCRIPT wallet 0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045
-python3 $SCRIPT wallet 0xd8dA... --chain bsc --no-prices   # faster
-```
-
-### 2. Multi-Chain Scan
-Scans all 8 chains simultaneously for the same address using threads.
-```bash
-python3 $SCRIPT multichain 0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045
-```
-Output: per-chain native balance + token holdings + grand total USD.
-
-### 3. Compare (Gas + Prices)
-All 8 chains queried in parallel. Shows cheapest/most expensive chain.
-```bash
-python3 $SCRIPT compare
-```
-
-### 4. Transaction Details & Decode
-```bash
-python3 $SCRIPT tx 0x5c504ed432cb51138bcf09aa5e8a410dd4a1e204ef84bfed1be16dfba1b22060
-python3 $SCRIPT decode 0x5c504ed...   # Shows human-readable function signature
-```
-Decode uses 4byte.directory to translate 0xa9059cbb -> transfer(address,uint256).
-
-### 5. ENS Resolution
-```bash
-python3 $SCRIPT ens vitalik.eth          # -> 0xd8dA... + avatar + social links
-python3 $SCRIPT ens 0xd8dA...96045       # -> vitalik.eth
-```
-
-### 6. Allowance Checker (Security)
-Checks ERC-20 approvals granted to known DEX/bridge contracts.
-```bash
-python3 $SCRIPT allowance 0xYourWallet
-```
-Flags UNLIMITED approvals as HIGH risk.
-
-### 7. Contract Inspector
-```bash
-python3 $SCRIPT contract 0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48   # USDC (proxy)
-python3 $SCRIPT contract 0xdAC17F958D2ee523a2206206994597C13D831ec7   # USDT (ERC-20)
-```
-Detects: proxy (EIP-1967/EIP-1167), ERC-20, ERC-721, ERC-165. Shows bytecode size and implementation address for proxies.
-
-### 8. Whale Detection
-```bash
-python3 $SCRIPT whale                                    # ETH, last 20 blocks, >$10k
-python3 $SCRIPT whale --blocks 50 --min-usd 50000 --chain bsc
-```
-
-### 9. Gas Tracker
-```bash
-python3 $SCRIPT gas
-python3 $SCRIPT gas --chain polygon
-```
-Shows gwei price + USD cost for: transfer, ERC-20 transfer, approve, swap, NFT mint, NFT transfer.
-
---
-
-## Supported Chains
-| Key       | Name           | Native | Chain ID |
-|-----------|----------------|--------|----------|
-| ethereum  | Ethereum       | ETH    | 1        |
-| bsc       | BNB Chain      | BNB    | 56       |
-| base      | Base           | ETH    | 8453     |
-| arbitrum  | Arbitrum One   | ETH    | 42161    |
-| polygon   | Polygon        | POL    | 137      |
-| optimism  | Optimism       | ETH    | 10       |
-| avalanche | Avalanche C    | AVAX   | 43114    |
-| zksync    | zkSync Era     | ETH    | 324      |
-
---
-
-## Pitfalls
- CoinGecko free tier: ~10-30 req/min. Use `--no-prices` for faster wallet scans.
- Public RPCs may throttle. Set EVM_RPC_URL to a private endpoint for production.
- `wallet` and `allowance` only check known token list (~30 tokens per chain). Use a block explorer for complete token discovery.
- `activity` scans recent blocks only (max 200). For full history, use Etherscan API.
- `multichain` runs 8 parallel threads — can trigger rate limits on public RPCs.
- ENS resolution depends on a single public endpoint (ensideas.com / ens.vitalik.ca) with no fallback. If that endpoint is down, `ens` will fail — re-run later or use a block explorer.
- Tx decoding depends on a single public endpoint (4byte.directory) with no fallback. Selectors not in their database show up as `unknown`.
- **L2 gas estimates are L2-execution only.** On rollups like Base, Arbitrum, Optimism, and zkSync, the actual transaction cost also includes an L1 data-posting fee that depends on calldata size and current L1 gas prices. The `gas` command does not estimate that L1 component. For Base specifically, see the network's L1 fee oracle (contract `0x420000000000000000000000000000000000000F`).
- Address / tx-hash inputs are validated for 0x-prefix + correct length + hex, but EIP-55 checksum casing is **not** enforced (RPC endpoints accept any-case hex).
-
---
-
-## Verification
-```bash
-# Should print current block, gas price, ETH price
-python3 ~/.hermes/skills/blockchain/evm/scripts/evm_client.py stats
-
-# Should resolve vitalik.eth to 0xd8dA...
-python3 ~/.hermes/skills/blockchain/evm/scripts/evm_client.py ens vitalik.eth
-```
@@ -1,14 +0,0 @@
-{
-  "name": "example",
-  "label": "Example",
-  "description": "Example dashboard plugin — used by test suite for auth coverage",
-  "icon": "Sparkles",
-  "version": "1.0.0",
-  "tab": {
-    "path": "/example",
-    "position": "after:skills"
-  },
-  "slots": [],
-  "entry": "dist/index.js",
-  "api": "plugin_api.py"
-}
@@ -1,17 +0,0 @@
-"""Example dashboard plugin — backend API routes.
-
-Mounted at /api/plugins/example/ by the dashboard plugin system.
-
-This minimal plugin exists so the test suite has a stable, side-effect-free
-GET endpoint to verify that plugin API routes work with auth.
-"""
-
-from fastapi import APIRouter
-
-router = APIRouter()
-
-
-@router.get("/hello")
-async def hello():
-    """Simple greeting endpoint to demonstrate plugin API routes."""
-    return {"message": "Hello from the example plugin!", "plugin": "example", "version": "1.0.0"}
@@ -875,13 +875,6 @@ class HindsightMemoryProvider(MemoryProvider):
                        "Hindsight local runtime is unavailable"
                        + (f": {reason}" if reason else "")
                    )
-                try:
-                    from tools.lazy_deps import ensure as _lazy_ensure
-                    _lazy_ensure("memory.hindsight", prompt=False)
-                except ImportError:
-                    pass
-                except Exception as _e:
-                    raise ImportError(str(_e))
                from hindsight import HindsightEmbedded
                HindsightEmbedded.__del__ = lambda self: None
                llm_provider = self._config.get("llm_provider", "")
@@ -21,7 +21,6 @@ from dataclasses import dataclass, field
 from pathlib import Path

 from hermes_constants import get_hermes_home
-from hermes_cli.profiles import _get_default_hermes_home
 from typing import Any, TYPE_CHECKING

 if TYPE_CHECKING:
@@ -74,7 +73,7 @@ def resolve_config_path() -> Path:
        return local_path

    # Default profile's config — host blocks accumulate here via setup/clone
-    default_path = _get_default_hermes_home() / "honcho.json"
+    default_path = Path.home() / ".hermes" / "honcho.json"
    if default_path != local_path and default_path.exists():
        return default_path

@@ -688,28 +687,12 @@ def get_honcho_client(config: HonchoClientConfig | None = None) -> Honcho:
            "For local instances, set HONCHO_BASE_URL instead."
        )

-    # Lazy-install the honcho SDK on demand. ensure() honors
-    # security.allow_lazy_installs (default true). On failure we surface
-    # the original ImportError-shape message so existing callers still get
-    # the "go run hermes honcho setup" hint they used to.
-    try:
-        from tools.lazy_deps import FeatureUnavailable, ensure as _lazy_ensure
-        _lazy_ensure("memory.honcho", prompt=False)
-    except ImportError:
-        # lazy_deps module missing — fall through to the raw import below.
-        pass
-    except Exception:
-        # FeatureUnavailable or unexpected error. Don't crash here; let the
-        # actual import attempt produce the canonical error message.
-        pass
-
    try:
        from honcho import Honcho
    except ImportError:
        raise ImportError(
            "honcho-ai is required for Honcho integration. "
-            "Install it with: pip install honcho-ai  "
-            "(or run `hermes honcho setup` to configure)."
+            "Install it with: pip install honcho-ai"
        )

    # Allow config.yaml honcho.base_url to override the SDK's environment
@@ -336,17 +336,10 @@ ADD_RESOURCE_SCHEMA = {

 def _zip_directory(dir_path: Path) -> Path:
    """Create a temporary zip file containing a directory tree."""
-    root = dir_path.resolve()
    zip_path = Path(tempfile.gettempdir()) / f"openviking_upload_{uuid.uuid4().hex}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
        for file_path in dir_path.rglob("*"):
-            if file_path.is_symlink():
-                continue
            if file_path.is_file():
-                try:
-                    file_path.resolve().relative_to(root)
-                except ValueError:
-                    continue
                arcname = str(file_path.relative_to(dir_path)).replace("\\", "/")
                zipf.write(file_path, arcname=arcname)
    return zip_path