revert(gateway): remove stale-code self-check and auto-restart

Removes the _detect_stale_code / _trigger_stale_code_restart mechanism introduced in #17648 and iterated in #19740. On every incoming message the gateway compared the boot-time git HEAD SHA to the current SHA on disk, and if they differed it would reply with

    Gateway code was updated in the background -- restarting this gateway so your next message runs on the new code. Please retry in a moment.

and then kick off a graceful restart.

This is unwanted behaviour: users who run a long-lived gateway and do their own ad-hoc git operations on the checkout end up with their chat interrupted and the current message dropped every time HEAD moves, with no way to opt out.

If an operator really needs the old protection against stale sys.modules after "hermes update", the SIGKILL-survivor sweep in hermes update (hermes_cli/main.py, also tagged #17648) already handles the supervisor-respawn case on its own.

Removed:

gateway/run.py:
- _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS
- _read_git_head_sha(), _compute_repo_mtime() module helpers
- class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha / _stale_code_restart_triggered defaults
- __init__ boot-snapshot block (_boot_*, _cached_current_sha*, _repo_root_for_staleness, _stale_code_notified)
- _current_git_sha_cached(), _detect_stale_code(), _trigger_stale_code_restart() methods
- stale-code check + user-facing restart notice at the top of _handle_message()

tests/gateway/test_stale_code_self_check.py (deleted, 412 lines)

No new logic added. Zero remaining references to any removed symbol. Gateway test suite passes the same 4589 tests it passed before; the 3 pre-existing unrelated failures (discord free-channel, feishu bot admission, teams typing) are unchanged by this commit.
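
For readers who never saw the removed code, here is a minimal sketch of the kind of check described above: a boot-time SHA snapshot compared against a TTL-cached current SHA. It is not the removed implementation. The class name, the `read_sha` callable, and the `ttl_secs` parameter are hypothetical and exist only to illustrate why any movement of HEAD flipped the gateway into its restart path.

```python
import time
from typing import Callable, Optional


class StaleCodeCheck:
    """Illustrative only: compare a boot-time commit SHA to the current one."""

    def __init__(
        self,
        read_sha: Callable[[], Optional[str]],  # hypothetical: returns HEAD's SHA
        ttl_secs: float = 5.0,                  # hypothetical cache lifetime
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._read_sha = read_sha
        self._ttl_secs = ttl_secs
        self._clock = clock
        self._boot_sha = read_sha()  # snapshot taken once at startup
        self._cached_sha = self._boot_sha
        self._cached_at = clock()

    def _current_sha(self) -> Optional[str]:
        # Re-read the SHA only when the cached value is older than the TTL.
        now = self._clock()
        if now - self._cached_at >= self._ttl_secs:
            self._cached_sha = self._read_sha()
            self._cached_at = now
        return self._cached_sha

    def is_stale(self) -> bool:
        current = self._current_sha()
        if self._boot_sha is None or current is None:
            return False  # cannot tell, so do not report staleness
        return current != self._boot_sha
```

Under these assumptions, an operator running `git checkout` or `git pull` on the live tree changes what `read_sha` returns, so `is_stale()` turns true on the next message even though nothing asked the gateway to restart. That is the interruption this commit removes.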
fix(nix): refresh stale tui npmDepsHash + fix cache-blind detection (#20144)
2026-05-05 03:44:10 -07:00 · 2026-05-05 15:32:20 +05:30 · 2026-05-04 20:59:18 -07:00 · 2026-05-04 20:59:18 -07:00 · 2026-05-04 20:59:18 -07:00 · 2026-05-04 20:59:18 -07:00
498 changed files with 71519 additions and 3458 deletions
@@ -9,6 +9,12 @@ node_modules
 .venv
 **/.venv

+# Built artifacts that are regenerated inside the image.  Excluded so local
+# rebuilds on the developer's machine don't invalidate the npm-install layer
+# that now depends on the full ui-tui/packages/hermes-ink/ tree being present.
+ui-tui/dist/
+ui-tui/packages/hermes-ink/dist/
+
 # CI/CD
 .github

@@ -19,3 +25,7 @@ node_modules

 # Runtime data (bind-mounted at /opt/data; must not leak into build context)
 data/
+
+# Compose/profile runtime state (bind-mounted; avoid ownership/secret issues)
+hermes-config/
+runtime/
@@ -0,0 +1,44 @@
+# Dependabot configuration for hermes-agent.
+#
+# Deliberately scoped to github-actions only.
+#
+# We do NOT enable Dependabot for pip / npm / any source-dependency ecosystem
+# because we pin source dependencies exactly (uv.lock, package-lock.json) as
+# part of our supply-chain posture. Automatic version-bump PRs against those
+# pins would undermine the strategy — pins are moved deliberately, after
+# review, not on a schedule.
+#
+# github-actions is the exception: action pins (we use full commit SHAs per
+# supply-chain policy) must be updated when upstream actions publish
+# patches — usually themselves security fixes. Dependabot opens a PR with
+# the new SHA and release notes; we review and merge like any other PR.
+#
+# Security-update PRs for source dependencies (opened ONLY when a CVE is
+# published affecting a currently-pinned version) are enabled separately
+# via the repo's Dependabot security updates setting
+# (Settings → Code security → Dependabot → Dependabot security updates).
+# Those are CVE-only, not schedule-driven, and do not conflict with our
+# pinning strategy — they fire when a pinned version becomes known-bad,
+# which is exactly when we want to move the pin.
+
+version: 2
+updates:
+  - package-ecosystem: "github-actions"
+    directory: "/"
+    schedule:
+      interval: "weekly"
+      day: "monday"
+    open-pull-requests-limit: 5
+    labels:
+      - "dependencies"
+      - "github-actions"
+    commit-message:
+      prefix: "chore(actions)"
+      include: "scope"
+    groups:
+      # Batch routine action bumps into one PR per week to reduce noise.
+      # Security updates still open individually and bypass grouping.
+      actions-minor-patch:
+        update-types:
+          - "minor"
+          - "patch"
@@ -76,6 +76,16 @@ jobs:
        run: |
          mkdir -p _site/docs
          cp -r website/build/* _site/docs/
+          # llms.txt / llms-full.txt are also published at the site root
+          # (https://hermes-agent.nousresearch.com/llms.txt) because some
+          # agents and IDE plugins probe the classic root-level path rather
+          # than /docs/llms.txt. Same file, two URLs, one source of truth.
+          if [ -f website/build/llms.txt ]; then
+            cp website/build/llms.txt _site/llms.txt
+          fi
+          if [ -f website/build/llms-full.txt ]; then
+            cp website/build/llms-full.txt _site/llms-full.txt
+          fi

      - name: Upload artifact
        uses: actions/upload-pages-artifact@56afc609e74202658d3ffba0e8f6dda462b719fa  # v3
@@ -0,0 +1,67 @@
+name: OSV-Scanner
+
+# Scans lockfiles (uv.lock, package-lock.json) against the OSV vulnerability
+# database. Runs on every PR that touches a lockfile and on a weekly schedule
+# against main.
+#
+# This is detection-only — OSV-Scanner does NOT open PRs or modify pins.
+# It reports known CVEs in currently-pinned dependency versions so we can
+# decide when and how to patch on our own schedule. Our pinning strategy
+# (full SHA / exact version) is preserved; only the notification signal
+# is added.
+#
+# Complements the existing supply-chain-audit.yml workflow (which scans
+# for malicious code patterns in PR diffs) by covering the orthogonal
+# "currently-pinned dep became known-vulnerable" case.
+#
+# Uses Google's officially-recommended reusable workflow, pinned by SHA.
+# Findings land in the repo's Security tab (Code Scanning > OSV-Scanner).
+# fail-on-vuln is disabled so the job does not block merges on pre-existing
+# vulnerabilities in pinned deps that we may need to patch deliberately.
+
+on:
+  pull_request:
+    branches: [main]
+    paths:
+      - 'uv.lock'
+      - 'pyproject.toml'
+      - 'package.json'
+      - 'package-lock.json'
+      - 'ui-tui/package.json'
+      - 'ui-tui/package-lock.json'
+      - 'website/package.json'
+      - 'website/package-lock.json'
+      - '.github/workflows/osv-scanner.yml'
+  push:
+    branches: [main]
+    paths:
+      - 'uv.lock'
+      - 'pyproject.toml'
+      - 'package.json'
+      - 'package-lock.json'
+      - 'ui-tui/package-lock.json'
+      - 'website/package-lock.json'
+  schedule:
+    # Weekly scan against main — catches CVEs published after merge for
+    # deps that haven't changed since.
+    - cron: '0 9 * * 1'
+  workflow_dispatch:
+
+permissions:
+  # Required by the reusable workflow to upload SARIF to the Security tab.
+  actions: read
+  contents: read
+  security-events: write
+
+jobs:
+  scan:
+    name: Scan lockfiles
+    uses: google/osv-scanner-action/.github/workflows/osv-scanner-reusable.yml@c51854704019a247608d928f370c98740469d4b5  # v2.3.5
+    with:
+      # Scan explicit lockfiles rather than recursing, so we only look at
+      # the three sources of truth and skip vendored / test / worktree dirs.
+      scan-args: |-
+        --lockfile=uv.lock
+        --lockfile=ui-tui/package-lock.json
+        --lockfile=website/package-lock.json
+      fail-on-vuln: false
@@ -257,7 +257,16 @@ The dashboard embeds the real `hermes --tui` — **not** a rewrite.  See `hermes

 ## Adding New Tools

-Requires changes in **2 files**:
+For most custom or local-only tools, do **not** edit Hermes core. Use the plugin
+route instead: create `~/.hermes/plugins/<name>/plugin.yaml` and
+`~/.hermes/plugins/<name>/__init__.py`, then register tools with
+`ctx.register_tool(...)`. Plugin toolsets are discovered automatically and can be
+enabled or disabled without touching `tools/` or `toolsets.py`.
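
The plugin route described above might look roughly like the sketch below. The source only shows `ctx.register_tool(...)`, so the keyword arguments (`name`, `handler`, `description`), the `register(ctx)` entry-point name, and the `FakePluginContext` stand-in are all assumptions made so the example can run on its own. Check the real plugin documentation before copying any of it.

```python
from typing import Any, Callable, Dict


class FakePluginContext:
    """Hypothetical stand-in for the context object Hermes hands to a plugin."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., Any]] = {}

    def register_tool(
        self, name: str, handler: Callable[..., Any], description: str = ""
    ) -> None:
        # Assumed signature: the real register_tool parameters are not shown
        # in the source text.
        self.tools[name] = handler


def word_count(text: str) -> int:
    """Example tool body: count whitespace-separated words."""
    return len(text.split())


def register(ctx: FakePluginContext) -> None:
    """Assumed entry point placed in ~/.hermes/plugins/<name>/__init__.py."""
    ctx.register_tool(
        name="word_count",
        handler=word_count,
        description="Count words in a string",
    )
```

The point of the sketch is the shape of the workflow: the tool lives entirely under the plugin directory and is registered through the context object, so nothing under `tools/` or `toolsets.py` has to change.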
+
+Use the built-in route below only when the user is explicitly contributing a new
+core Hermes tool that should ship in the base system.
+
+Built-in/core tools require changes in **2 files**:

 **1. Create `tools/your_tool.py`:**
 ```python
@@ -28,10 +28,26 @@ WORKDIR /opt/hermes
 # ---------- Layer-cached dependency install ----------
 # Copy only package manifests first so npm install + Playwright are cached
 # unless the lockfiles themselves change.
+#
+# ui-tui/packages/hermes-ink/ is copied IN FULL (not just its manifests)
+# because it is referenced as a `file:` workspace dependency from
+# ui-tui/package.json.  Copying the tree up front lets npm resolve the
+# workspace to real content instead of stopping at a bare package.json.
 COPY package.json package-lock.json ./
 COPY web/package.json web/package-lock.json web/
 COPY ui-tui/package.json ui-tui/package-lock.json ui-tui/
-COPY ui-tui/packages/hermes-ink/package.json ui-tui/packages/hermes-ink/package-lock.json ui-tui/packages/hermes-ink/
+COPY ui-tui/packages/hermes-ink/ ui-tui/packages/hermes-ink/
+
+# `npm_config_install_links=false` forces npm to install `file:` deps as
+# symlinks (the npm 10+ default) even on Debian's older bundled npm 9.x,
+# which defaults to `install-links=true` and installs file deps as *copies*.
+# The host-side package-lock.json is generated with a newer npm that uses
+# symlinks, so an install-as-copy produces a hidden node_modules/.package-lock.json
+# that permanently disagrees with the root lock on the @hermes/ink entry.
+# That disagreement trips the TUI launcher's `_tui_need_npm_install()`
+# check on every startup and triggers a runtime `npm install` that then
+# fails with EACCES (node_modules/ is root-owned from build time).
+ENV npm_config_install_links=false

 RUN npm install --prefer-offline --no-audit && \
    npx playwright install --with-deps chromium --only-shell && \
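
The lockfile disagreement described in the comment above can be illustrated with a small sketch. It assumes the npm lockfile v2/v3 layout, where a symlinked `file:` dependency appears under `packages` with `"link": true` and a copied install does not. The helper names are hypothetical, and this is not the launcher's actual `_tui_need_npm_install()` logic, which the source does not show.

```python
from typing import Any, Dict

# Key under "packages" for the workspace dependency discussed above.
INK_KEY = "node_modules/@hermes/ink"


def is_linked(lock: Dict[str, Any], key: str = INK_KEY) -> bool:
    """Return True when the lockfile records the package as a symlink."""
    entry = lock.get("packages", {}).get(key, {})
    return bool(entry.get("link", False))


def locks_disagree(
    root_lock: Dict[str, Any], hidden_lock: Dict[str, Any], key: str = INK_KEY
) -> bool:
    """Compare how the root lock and node_modules/.package-lock.json see `key`."""
    return is_linked(root_lock, key) != is_linked(hidden_lock, key)


# Root lock generated on the host with a newer npm: dependency is a symlink.
root = {"packages": {INK_KEY: {"resolved": "packages/hermes-ink", "link": True}}}
# Hidden lock written by an install-as-copy: a plain versioned entry
# (the version string is made up for the example).
copied = {"packages": {INK_KEY: {"version": "0.0.0"}}}
```

With these inputs, `locks_disagree(root, copied)` is true and `locks_disagree(root, root)` is false. Setting `npm_config_install_links=false` is intended to keep the image in the second state.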
@@ -45,13 +61,7 @@ COPY --chown=hermes:hermes . .

 # Build browser dashboard and terminal UI assets.
 RUN cd web && npm run build && \
-    cd ../ui-tui && npm run build && \
-    rm -rf node_modules/@hermes/ink && \
-    rm -rf packages/hermes-ink/node_modules && \
-    cp -R packages/hermes-ink node_modules/@hermes/ink && \
-    npm install --omit=dev --prefer-offline --no-audit --prefix node_modules/@hermes/ink && \
-    rm -rf node_modules/@hermes/ink/node_modules/react && \
-    node --input-type=module -e "await import('@hermes/ink')"
+    cd ../ui-tui && npm run build

 # ---------- Permissions ----------
 # Make install dir world-readable so any HERMES_UID can read it at runtime.
@@ -0,0 +1,505 @@
+# Hermes Agent v0.12.0 (v2026.4.30)
+
+**Release Date:** April 30, 2026
+**Since v0.11.0:** 1,096 commits · 550 merged PRs · 1,270 files changed · 217,776 insertions · 213 community contributors (including co-authors)
+
+> The Curator release — Hermes Agent now maintains itself. An autonomous background Curator grades, prunes, and consolidates your skill library on its own schedule. The self-improvement loop that reviews what to save got a substantial upgrade. Four new inference providers, an 18th messaging platform, a 19th via Teams plugin, native Spotify + Google Meet integrations, ComfyUI and TouchDesigner-MCP moved from optional to bundled-by-default, and a ~57% cut to visible TUI cold start.
+
+---
+
+## ✨ Highlights
+
+- **Autonomous Curator** — `hermes curator` runs as a background agent on the gateway's cron ticker (7-day cycle default). It grades your skill library, consolidates related skills, prunes dead ones, and writes per-run reports to `logs/curator/run.json` + `REPORT.md`. Archived skills are classified consolidated-vs-pruned via model + heuristic. Defense-in-depth gates protect bundled/hub skills from mutation. Unified under `auxiliary.curator` — pick the curator's model in `hermes model`, manage it from the dashboard. `hermes curator status` ranks skills by usage (most-used / least-used). ([#17277](https://github.com/NousResearch/hermes-agent/pull/17277), [#17307](https://github.com/NousResearch/hermes-agent/pull/17307), [#17941](https://github.com/NousResearch/hermes-agent/pull/17941), [#17868](https://github.com/NousResearch/hermes-agent/pull/17868), [#18033](https://github.com/NousResearch/hermes-agent/pull/18033))
+
+- **Self-improvement loop — substantially upgraded** — The background review fork (the core of Hermes' self-improvement: after each turn it decides what memories/skills to save or update) is now class-first (rubric-based rather than free-form), active-update biased (prefers the skill the agent just loaded), handles `references/`/`templates/` sub-files, and properly inherits the parent's live runtime (provider, model, credentials actually propagate). Restricted to memory + skills toolsets so it can't sprawl. Memory providers shut down cleanly. Prior-turn tool messages excluded from the summary so the fork sees a clean context. ([#16026](https://github.com/NousResearch/hermes-agent/pull/16026), [#17213](https://github.com/NousResearch/hermes-agent/pull/17213), [#16099](https://github.com/NousResearch/hermes-agent/pull/16099), [#16569](https://github.com/NousResearch/hermes-agent/pull/16569), [#16204](https://github.com/NousResearch/hermes-agent/pull/16204), [#15057](https://github.com/NousResearch/hermes-agent/pull/15057))
+
+- **Skill integrations — major expansion** — **ComfyUI v5** with official CLI + REST + hardware-gated local install, moved from optional to **built-in by default** ([#17610](https://github.com/NousResearch/hermes-agent/pull/17610), [#17631](https://github.com/NousResearch/hermes-agent/pull/17631), [#17734](https://github.com/NousResearch/hermes-agent/pull/17734)). **TouchDesigner-MCP** bundled by default, expanded with GLSL, post-FX, audio, geometry, and 9 new reference docs ([#16753](https://github.com/NousResearch/hermes-agent/pull/16753), [#16624](https://github.com/NousResearch/hermes-agent/pull/16624), [#16768](https://github.com/NousResearch/hermes-agent/pull/16768) — @kshitijk4poor + @SHL0MS). **Humanizer** skill ports a text-cleaner that strips AI-isms ([#16787](https://github.com/NousResearch/hermes-agent/pull/16787)). **claude-design** HTML artifact skill + design-md (Google DESIGN.md spec) + airtable salvage + `skill_manage` edits in `external_dirs` + direct-URL skill install + `/reload-skills` slash command. ([#16358](https://github.com/NousResearch/hermes-agent/pull/16358), [#14876](https://github.com/NousResearch/hermes-agent/pull/14876), [#16291](https://github.com/NousResearch/hermes-agent/pull/16291), [#17512](https://github.com/NousResearch/hermes-agent/pull/17512), [#16323](https://github.com/NousResearch/hermes-agent/pull/16323), [#17744](https://github.com/NousResearch/hermes-agent/pull/17744))
+
+- **LM Studio — first-class provider** — upgraded from a custom-endpoint alias to a full-blown native provider: dedicated auth, `hermes doctor` checks, reasoning transport, live `/models` listing. (Salvage of @kshitijk4poor's #17061.) ([#17102](https://github.com/NousResearch/hermes-agent/pull/17102))
+
+- **Four more new inference providers** — **GMI Cloud** (first-class, salvage of #11955 — @isaachuangGMICLOUD), **Azure AI Foundry** with auto-detection, **MiniMax OAuth** with PKCE browser flow (salvage #15203), **Tencent Tokenhub** (salvage of #16860). ([#16663](https://github.com/NousResearch/hermes-agent/pull/16663), [#15845](https://github.com/NousResearch/hermes-agent/pull/15845), [#17524](https://github.com/NousResearch/hermes-agent/pull/17524), [#16960](https://github.com/NousResearch/hermes-agent/pull/16960))
+
+- **Pluggable gateway platforms + Microsoft Teams** — the gateway is now a plugin host. Drop-in messaging adapters live outside the core, and Microsoft Teams is the first plugin-shipped platform. (Salvage of #17664.) ([#17751](https://github.com/NousResearch/hermes-agent/pull/17751), [#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
+
+- **Tencent 元宝 (Yuanbao) — 18th messaging platform** — native gateway adapter with text + media delivery. ([#16298](https://github.com/NousResearch/hermes-agent/pull/16298), [#17424](https://github.com/NousResearch/hermes-agent/pull/17424))
+
+- **Spotify — native tools + bundled skill + wizard** — 7 tools (play, search, queue, playlists, devices) behind PKCE OAuth, interactive setup wizard, bundled skill, surfacing in `hermes tools`, cron usage documented. ([#15121](https://github.com/NousResearch/hermes-agent/pull/15121), [#15130](https://github.com/NousResearch/hermes-agent/pull/15130), [#15154](https://github.com/NousResearch/hermes-agent/pull/15154), [#15180](https://github.com/NousResearch/hermes-agent/pull/15180))
+
+- **Google Meet plugin** — join calls, transcribe, speak, follow up. Realtime OpenAI transport + Node bot server, full pipeline bundled as a plugin. ([#16364](https://github.com/NousResearch/hermes-agent/pull/16364))
+
+- **`hermes -z` one-shot mode + `hermes update --check`** — non-interactive `hermes -z <prompt>` with `--model`/`--provider`/`HERMES_INFERENCE_MODEL`. `hermes update --check` preflight. Opt-in pre-update HERMES_HOME backup. ([#15702](https://github.com/NousResearch/hermes-agent/pull/15702), [#15704](https://github.com/NousResearch/hermes-agent/pull/15704), [#15841](https://github.com/NousResearch/hermes-agent/pull/15841), [#16539](https://github.com/NousResearch/hermes-agent/pull/16539), [#16566](https://github.com/NousResearch/hermes-agent/pull/16566))
+
+- **Models dashboard tab + in-browser model config** — rich per-model analytics, switch main + auxiliary models from the dashboard. ([#17745](https://github.com/NousResearch/hermes-agent/pull/17745), [#17802](https://github.com/NousResearch/hermes-agent/pull/17802))
+
+- **Remote model catalog manifest** — OpenRouter + Nous Portal model catalogs are now pulled from a remote manifest so new models show up without a release. ([#16033](https://github.com/NousResearch/hermes-agent/pull/16033))
+
+- **Native multimodal image routing** — images now route based on the model's actual vision capability rather than provider defaults. ([#16506](https://github.com/NousResearch/hermes-agent/pull/16506))
+
+- **Gateway media parity** — native multi-image sending across Telegram, Discord, Slack, Mattermost, Email, and Signal; centralized audio routing with FLAC support + Telegram document fallback. ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909), [#17833](https://github.com/NousResearch/hermes-agent/pull/17833))
+
+- **TUI catches up to (and past) the classic CLI** — LaTeX rendering (@austinpickett), `/reload` .env hot-reload, pluggable busy-indicator styles (@OutThisLife, #13610), opt-in auto-resume of last session, expanded light-terminal auto-detection, session delete from `/resume` picker with `d`, modified mouse-wheel line scroll, and a `/mouse` toggle that kills ConPTY's phantom mouse injection (@kevin-ho). ([#17175](https://github.com/NousResearch/hermes-agent/pull/17175), [#17286](https://github.com/NousResearch/hermes-agent/pull/17286), [#17150](https://github.com/NousResearch/hermes-agent/pull/17150), [#17130](https://github.com/NousResearch/hermes-agent/pull/17130), [#17113](https://github.com/NousResearch/hermes-agent/pull/17113), [#17668](https://github.com/NousResearch/hermes-agent/pull/17668), [#17669](https://github.com/NousResearch/hermes-agent/pull/17669), [#15488](https://github.com/NousResearch/hermes-agent/pull/15488))
+
+- **Observability + achievements plugins** — bundled Langfuse observability plugin (salvage #16845) + bundled hermes-achievements plugin that scans full session history. ([#16917](https://github.com/NousResearch/hermes-agent/pull/16917), [#17754](https://github.com/NousResearch/hermes-agent/pull/17754))
+
+- **TTS provider registry + Piper local TTS** — pluggable `tts.providers.<name>` registry; Piper ships as a native local TTS provider. (Closes #8508.) ([#17843](https://github.com/NousResearch/hermes-agent/pull/17843), [#17885](https://github.com/NousResearch/hermes-agent/pull/17885))
+
+- **Vercel Sandbox backend** — Vercel sandboxes as an execute_code/terminal backend (@kshitijk4poor). ([#17445](https://github.com/NousResearch/hermes-agent/pull/17445))
+
+- **Secret redaction off by default** — default flipped to off. Prevents the long-standing patch-corruption incidents where fake secret-shaped substrings mangled tool outputs. Opt in via `redaction.enabled: true` when you need it. ([#16794](https://github.com/NousResearch/hermes-agent/pull/16794))
+
+- **Cold-start performance** — visible TUI cold start cut **~57%** via lazy agent init (@OutThisLife), lazy imports of OpenAI / Anthropic / Firecrawl / account_usage, mtime-cached `load_config()`, memoized `get_tool_definitions()` with TTL-cached `check_fn` results, precompiled dangerous-command patterns. ([#17190](https://github.com/NousResearch/hermes-agent/pull/17190), [#17046](https://github.com/NousResearch/hermes-agent/pull/17046), [#17041](https://github.com/NousResearch/hermes-agent/pull/17041), [#17098](https://github.com/NousResearch/hermes-agent/pull/17098), [#17206](https://github.com/NousResearch/hermes-agent/pull/17206))
+
+- **Configurable prompt cache TTL** — `prompt_caching.cache_ttl` (5m default, 1h opt-in — cost savings for bursty sessions that keep cache warm). Salvage of #12659. ([#15065](https://github.com/NousResearch/hermes-agent/pull/15065))
+
+---
+
+## 🧠 Autonomous Curator & Self-Improvement Loop
+
+### Curator — autonomous skill maintenance
+- **`hermes curator` as a background agent** — runs on the gateway's cron ticker, 7-day cycle by default, umbrella-first prompt, inherits parent config, unbounded iterations ([#17277](https://github.com/NousResearch/hermes-agent/pull/17277) — issue #7816)
+- **Per-run reports** — `logs/curator/run.json` + `REPORT.md` per cycle ([#17307](https://github.com/NousResearch/hermes-agent/pull/17307))
+- **Consolidated vs pruned classification** — archived skills split with model + heuristic ([#17941](https://github.com/NousResearch/hermes-agent/pull/17941))
+- **`hermes curator status`** — ranks skills by usage, shows most-used and least-used ([#18033](https://github.com/NousResearch/hermes-agent/pull/18033))
+- **Unified under `auxiliary.curator`** — pick the model in `hermes model`, configure from the dashboard ([#17868](https://github.com/NousResearch/hermes-agent/pull/17868))
+- **Documentation** — dedicated curator feature page on the docs site ([#17563](https://github.com/NousResearch/hermes-agent/pull/17563))
+- Fix: seed defaults on update, create `logs/curator/` directory, defer fire import ([#17927](https://github.com/NousResearch/hermes-agent/pull/17927))
+- Fix: scan nested archive subdirs in `restore_skill` (@0xDevNinja) ([#17951](https://github.com/NousResearch/hermes-agent/pull/17951))
+- Fix: use actual skill activity in curator status (@y0shua1ee) ([#17953](https://github.com/NousResearch/hermes-agent/pull/17953))
+- Fix: `skill_manage` refuses writes on pinned skills; pinning now blocks curator writes ([#17562](https://github.com/NousResearch/hermes-agent/pull/17562), [#17578](https://github.com/NousResearch/hermes-agent/pull/17578))
+- Fix: `bump_use()` wired into skill invocation + preload + skill_view (salvage #17782) ([#17932](https://github.com/NousResearch/hermes-agent/pull/17932))
+
+### Self-improvement loop (background review fork)
+- **Class-first skill-review prompt** — rubric-based grading rather than free-form "should this update" ([#16026](https://github.com/NousResearch/hermes-agent/pull/16026))
+- **Active-update bias** — prefers updating skills the agent just loaded, handles `references/` + `templates/` sub-files ([#17213](https://github.com/NousResearch/hermes-agent/pull/17213))
+- **Fork inherits parent's live runtime** — provider, model, credentials actually propagate now ([#16099](https://github.com/NousResearch/hermes-agent/pull/16099))
+- **Scoped toolsets** — review fork restricted to memory + skills (no shell, no web) ([#16569](https://github.com/NousResearch/hermes-agent/pull/16569))
+- **Clean shutdown** — background review memory providers exit properly (salvage #15289) ([#16204](https://github.com/NousResearch/hermes-agent/pull/16204))
+- **Clean context** — prior-history tool messages excluded from review summary (salvage #14967) ([#15057](https://github.com/NousResearch/hermes-agent/pull/15057))
+
+---
+
+## 🧩 Skills Ecosystem
+
+### Skill integrations — newly bundled or promoted
+- **ComfyUI v5** — official CLI + REST + hardware-gated local install; **moved from optional to built-in** ([#17610](https://github.com/NousResearch/hermes-agent/pull/17610), [#17631](https://github.com/NousResearch/hermes-agent/pull/17631), [#17734](https://github.com/NousResearch/hermes-agent/pull/17734), [#17612](https://github.com/NousResearch/hermes-agent/pull/17612))
+- **TouchDesigner-MCP** — **bundled by default** ([#16753](https://github.com/NousResearch/hermes-agent/pull/16753) — @kshitijk4poor), expanded with GLSL, post-FX, audio, geometry references ([#16624](https://github.com/NousResearch/hermes-agent/pull/16624)), 9 new reference docs ([#16768](https://github.com/NousResearch/hermes-agent/pull/16768) — @SHL0MS)
+- **Humanizer** — strips AI-isms from text ([#16787](https://github.com/NousResearch/hermes-agent/pull/16787))
+- **claude-design** — HTML artifact skill with disambiguation from other design skills ([#16358](https://github.com/NousResearch/hermes-agent/pull/16358))
+- **design-md** — Google's DESIGN.md spec skill ([#14876](https://github.com/NousResearch/hermes-agent/pull/14876))
+- **airtable** — salvaged skill + skill API keys wired into `.env` (#15838) ([#16291](https://github.com/NousResearch/hermes-agent/pull/16291))
+- **pretext** — creative browser demos with @chenglou/pretext ([#17259](https://github.com/NousResearch/hermes-agent/pull/17259))
+- **spike** + **sketch** — throwaway experiments + HTML mockups, adapted from gsd-build ([#17421](https://github.com/NousResearch/hermes-agent/pull/17421))
+
+### Skills UX
+- **Install skills from a direct HTTP(S) URL** — `hermes skills install <url>` ([#16323](https://github.com/NousResearch/hermes-agent/pull/16323))
+- **`/reload-skills`** slash command (salvage #17670) ([#17744](https://github.com/NousResearch/hermes-agent/pull/17744))
+- **`hermes skills list`** shows enabled/disabled status ([#16129](https://github.com/NousResearch/hermes-agent/pull/16129))
+- **`skill_manage` refuses writes on pinned skills** ([#17562](https://github.com/NousResearch/hermes-agent/pull/17562))
+- **`skill_manage` edits external_dirs skills in place** (salvage #9966) ([#17512](https://github.com/NousResearch/hermes-agent/pull/17512), [#17289](https://github.com/NousResearch/hermes-agent/pull/17289))
+- Fix: inline-shell rendering in `skill_view` ([#15376](https://github.com/NousResearch/hermes-agent/pull/15376))
+- Fix: exclude `.archive/` from skill index walk (salvage #17639) ([#17931](https://github.com/NousResearch/hermes-agent/pull/17931))
+- Fix: dedicated docs page per bundled + optional skill ([#14929](https://github.com/NousResearch/hermes-agent/pull/14929))
+- Fix: `google-workspace` shared HERMES_HOME helper + ship deps as optional extra ([#15405](https://github.com/NousResearch/hermes-agent/pull/15405))
+- Fix: auto-wrap ASCII-art code blocks in generated skill pages ([#16497](https://github.com/NousResearch/hermes-agent/pull/16497))
+- Point agent at `hermes-agent` skill + docs site for Hermes questions ([#16535](https://github.com/NousResearch/hermes-agent/pull/16535))
+
+---
+
+## 🏗️ Core Agent & Architecture
+
+### Provider & Model Support
+
+#### New providers
+- **GMI Cloud** — first-class API-key provider on par with Arcee/Kilocode/Xiaomi (salvage of #11955 — @isaachuangGMICLOUD) ([#16663](https://github.com/NousResearch/hermes-agent/pull/16663))
+- **Azure AI Foundry** — auto-detection, full wiring ([#15845](https://github.com/NousResearch/hermes-agent/pull/15845))
+- **LM Studio** — upgraded from custom-endpoint alias to first-class provider: dedicated auth, doctor checks, reasoning transport, live `/models` (salvage of #17061 — @kshitijk4poor) ([#17102](https://github.com/NousResearch/hermes-agent/pull/17102))
+- **MiniMax OAuth** — PKCE browser flow with full OAuth integration (salvage #15203) ([#17524](https://github.com/NousResearch/hermes-agent/pull/17524))
+- **Tencent Tokenhub** — new provider (salvage of #16860) ([#16960](https://github.com/NousResearch/hermes-agent/pull/16960))
+
+#### Model catalog
+- **Remote model catalog manifest** — OpenRouter + Nous Portal catalogs pulled from remote manifest so new models show up without a release ([#16033](https://github.com/NousResearch/hermes-agent/pull/16033))
+- `openai/gpt-5.5` and `gpt-5.5-pro` added to OpenRouter + Nous Portal ([#15343](https://github.com/NousResearch/hermes-agent/pull/15343))
+- `deepseek-v4-pro` and `deepseek-v4-flash` added ([#14934](https://github.com/NousResearch/hermes-agent/pull/14934))
+- `qwen3.6-plus` added to Alibaba-supported models ([#16896](https://github.com/NousResearch/hermes-agent/pull/16896))
+- Gemini free-tier keys blocked at setup with 429 guidance surfacing ([#15100](https://github.com/NousResearch/hermes-agent/pull/15100))
+
+#### Model configuration
+- **Configurable `prompt_caching.cache_ttl`** — 5m default, 1h opt-in (salvage #12659) ([#15065](https://github.com/NousResearch/hermes-agent/pull/15065))
+- `/fast` whitelist broadened to all OpenAI + Anthropic models ([#16883](https://github.com/NousResearch/hermes-agent/pull/16883))
+- `auxiliary.extra_body.reasoning` translates into Codex Responses API ([#17004](https://github.com/NousResearch/hermes-agent/pull/17004))
+- `hermes fallback` command for managing fallback providers ([#16052](https://github.com/NousResearch/hermes-agent/pull/16052))
+
+### Agent Loop & Conversation
+- **Native multimodal image routing** — based on model vision capability, not provider defaults ([#16506](https://github.com/NousResearch/hermes-agent/pull/16506))
+- **Delegate `child_timeout_seconds` default bumped to 600s** ([#14809](https://github.com/NousResearch/hermes-agent/pull/14809))
+- **Diagnostic dump when subagent times out with 0 API calls** ([#15105](https://github.com/NousResearch/hermes-agent/pull/15105))
+- **Gateway busts cached agent on compression/context_length config edits** ([#17008](https://github.com/NousResearch/hermes-agent/pull/17008))
+- **Opt-in runtime-metadata footer on final replies** ([#17026](https://github.com/NousResearch/hermes-agent/pull/17026))
+- `/reload-mcp` awareness — rebuild cached agents + prompt-cache cost confirmation ([#17729](https://github.com/NousResearch/hermes-agent/pull/17729))
+- Fix: repair CamelCase + `_tool` suffix tool-call emissions ([#15124](https://github.com/NousResearch/hermes-agent/pull/15124))
+- Fix: retry on `json.JSONDecodeError` instead of treating as local validation error ([#15107](https://github.com/NousResearch/hermes-agent/pull/15107))
+- Fix: handle unescaped control chars in `tool_call.arguments` ([#15356](https://github.com/NousResearch/hermes-agent/pull/15356))
+- Fix: ordering fix in `_copy_reasoning_content_for_api` — cross-provider reasoning isolation (@Zjianru) ([#15749](https://github.com/NousResearch/hermes-agent/pull/15749))
+- Fix: inject empty `reasoning_content` for DeepSeek/Kimi `tool_calls` unconditionally (@Zjianru) ([#15762](https://github.com/NousResearch/hermes-agent/pull/15762))
+- Fix: persist streamed `reasoning_content` on assistant turns (#16844) ([#16892](https://github.com/NousResearch/hermes-agent/pull/16892))
+- Fix: cancel coroutine on timeout so worker thread exits; full traceback on tool failure ([#17428](https://github.com/NousResearch/hermes-agent/pull/17428))
+- Fix: isolate `get_tool_definitions` quiet_mode cache + dedup LCM injection (#17335) ([#17889](https://github.com/NousResearch/hermes-agent/pull/17889))
+- Fix: serialize concurrent `hermes_tools` RPC calls from `execute_code` (#17770) ([#17894](https://github.com/NousResearch/hermes-agent/pull/17894), [#17902](https://github.com/NousResearch/hermes-agent/pull/17902))
+- Fix: rename `[SYSTEM:` → `[IMPORTANT:` in all user-injected markers (dodges Azure content filter) ([#16114](https://github.com/NousResearch/hermes-agent/pull/16114))
+
+### Compression
+- **Retry summary on main model for unknown errors before giving up** ([#16774](https://github.com/NousResearch/hermes-agent/pull/16774))
+- **Notify users when configured aux model fails even if main-model fallback recovers** ([#16775](https://github.com/NousResearch/hermes-agent/pull/16775))
+- `/compress` wrapped in `_busy_command` to block input during compression ([#15388](https://github.com/NousResearch/hermes-agent/pull/15388))
+- Fix: reserve system + tools headroom when aux binds threshold ([#15631](https://github.com/NousResearch/hermes-agent/pull/15631))
+- Fix: use text-char sum for multimodal token estimation in `_find_tail_cut_by_tokens` ([#16369](https://github.com/NousResearch/hermes-agent/pull/16369))
+
+### Session, Memory & State
+- **Trigram FTS5 index for CJK search, replace LIKE fallback** (@alt-glitch) ([#16651](https://github.com/NousResearch/hermes-agent/pull/16651))
+- **Index `tool_name` + `tool_calls` in FTS5, with repair + migration** (salvages #16866) ([#16914](https://github.com/NousResearch/hermes-agent/pull/16914))
+- **Checkpoints: auto-prune orphan and stale shadow repos at startup** ([#16303](https://github.com/NousResearch/hermes-agent/pull/16303))
+- **Memory providers notified on mid-process session_id rotation** (#6672) ([#17409](https://github.com/NousResearch/hermes-agent/pull/17409))
+- Fix: quote underscored terms in FTS5 query sanitization ([#16915](https://github.com/NousResearch/hermes-agent/pull/16915))
+- Fix: resolve viking_read 500/412 on file URIs + pseudo-summary URIs (salvage #5886) ([#17869](https://github.com/NousResearch/hermes-agent/pull/17869))
+- Fix: skip external-provider sync on interrupted turns ([#15395](https://github.com/NousResearch/hermes-agent/pull/15395))
+- Fix: close embedded Hindsight async client cleanly (salvage #14605) ([#16209](https://github.com/NousResearch/hermes-agent/pull/16209))
+- Fix: pass session transcript to `shutdown_memory_provider` on gateway + CLI (#15165) ([#16571](https://github.com/NousResearch/hermes-agent/pull/16571))
+- Fix: write-origin metadata seam ([#15346](https://github.com/NousResearch/hermes-agent/pull/15346))
+- Fix: preserve symlinks during atomic file writes ([#16980](https://github.com/NousResearch/hermes-agent/pull/16980))
+- Refactor: remove `flush_memories` entirely ([#15696](https://github.com/NousResearch/hermes-agent/pull/15696))
+
+### Auxiliary models
+- Fix: surface auxiliary failures in UI (previously silent) ([#15324](https://github.com/NousResearch/hermes-agent/pull/15324))
+- Fix: surface title-gen auxiliary failures instead of silently dropping ([#16371](https://github.com/NousResearch/hermes-agent/pull/16371))
+- Fix: generalize unsupported-parameter detector and harden `max_tokens` retry ([#15633](https://github.com/NousResearch/hermes-agent/pull/15633))
+
+---
+
+## 📱 Messaging Platforms (Gateway)
+
+### New Platforms
+- **Microsoft Teams (19th platform)** — as a plugin, + xdist collision guard ([#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
+- **Yuanbao (Tencent 元宝, 18th platform)** — native adapter with text + media delivery ([#16298](https://github.com/NousResearch/hermes-agent/pull/16298), [#17424](https://github.com/NousResearch/hermes-agent/pull/17424), [#16880](https://github.com/NousResearch/hermes-agent/pull/16880))
+
+### Pluggable Gateway Platforms
+- **Drop-in messaging adapters** — the gateway is now a plugin host for platforms (salvage of #17664) ([#17751](https://github.com/NousResearch/hermes-agent/pull/17751))
+
+### Telegram
+- **Chat allowlists for groups and forums** (@web3blind) ([#15027](https://github.com/NousResearch/hermes-agent/pull/15027))
+- **Send fresh finals for stale preview streams** (port openclaw#72038) ([#16261](https://github.com/NousResearch/hermes-agent/pull/16261))
+- **Render markdown tables as row-group bullets + prompt hint** ([#16997](https://github.com/NousResearch/hermes-agent/pull/16997))
+- Document fallback in centralized audio routing ([#17833](https://github.com/NousResearch/hermes-agent/pull/17833))
+- Native multi-image sending ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
+
+### Discord
+- **Opt-in toolsets + ID injection + tool split + Feishu wiring** (salvage #15457, #15458) ([#15610](https://github.com/NousResearch/hermes-agent/pull/15610), [#15613](https://github.com/NousResearch/hermes-agent/pull/15613))
+- Fix: coerce `limit` parameter to int before `min()` call ([#16319](https://github.com/NousResearch/hermes-agent/pull/16319))
+
+### Slack
+- **Register every gateway command as a native slash command (Discord/Telegram parity)** ([#16164](https://github.com/NousResearch/hermes-agent/pull/16164))
+- **`strict_mention` config** — prevents thread auto-engagement ([#16193](https://github.com/NousResearch/hermes-agent/pull/16193))
+- **`channel_skill_bindings`** — bind specific skills to specific Slack channels ([#16283](https://github.com/NousResearch/hermes-agent/pull/16283))
+
+### Signal
+- **Native formatting** — markdown → bodyRanges, reply quotes, reactions ([#17417](https://github.com/NousResearch/hermes-agent/pull/17417))
+- Native multi-image sending ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
+
+### Feishu / Mattermost / Email / Signal
+- All participate in **native multi-image sending** ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
+
+### Gateway Core
+- **Centralized audio routing + FLAC support + Telegram doc fallback** ([#17833](https://github.com/NousResearch/hermes-agent/pull/17833))
+- **Native multi-image sending** across Telegram, Discord, Slack, Mattermost, Email, Signal ([#17909](https://github.com/NousResearch/hermes-agent/pull/17909))
+- **Make hygiene hard message limit configurable** ([#17000](https://github.com/NousResearch/hermes-agent/pull/17000))
+- **Opt-in runtime-metadata footer on final replies** ([#17026](https://github.com/NousResearch/hermes-agent/pull/17026))
+- **`pre_gateway_dispatch` hook** — plugins can intercept before dispatch ([#15050](https://github.com/NousResearch/hermes-agent/pull/15050))
+- **`pre_approval_request` / `post_approval_response` hooks** ([#16776](https://github.com/NousResearch/hermes-agent/pull/16776))
+- Fix: timeouts — guard `load_config()` call against runtime exceptions ([#16318](https://github.com/NousResearch/hermes-agent/pull/16318))
+- Fix: support passing handler tools via registry ([#15613](https://github.com/NousResearch/hermes-agent/pull/15613))
+
+---
+
+## 🔧 Tool System
+
+### Plugin-first architecture
+- **Pluggable gateway platforms** — platforms can ship as plugins ([#17751](https://github.com/NousResearch/hermes-agent/pull/17751))
+- **Microsoft Teams as first plugin-shipped platform** ([#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
+- **`pre_gateway_dispatch` hook** ([#15050](https://github.com/NousResearch/hermes-agent/pull/15050))
+- **`pre_approval_request` + `post_approval_response` hooks** ([#16776](https://github.com/NousResearch/hermes-agent/pull/16776))
+- **`duration_ms` on `post_tool_call`** (inspired by Claude Code 2.1.119) ([#15429](https://github.com/NousResearch/hermes-agent/pull/15429))
+- **Bundled plugins**: Spotify ([#15174](https://github.com/NousResearch/hermes-agent/pull/15174)), Google Meet ([#16364](https://github.com/NousResearch/hermes-agent/pull/16364)), Langfuse observability ([#16917](https://github.com/NousResearch/hermes-agent/pull/16917)), hermes-achievements ([#17754](https://github.com/NousResearch/hermes-agent/pull/17754))
+- **Page-scoped plugin slots for built-in dashboard pages** ([#15658](https://github.com/NousResearch/hermes-agent/pull/15658))
+- **Declarative plugin installation for NixOS module** (@alt-glitch) ([#15953](https://github.com/NousResearch/hermes-agent/pull/15953))
+
+### Browser
+- **CDP supervisor** — dialog detection + response + cross-origin iframe eval ([#14540](https://github.com/NousResearch/hermes-agent/pull/14540))
+- **Auto-spawn local Chromium for LAN/localhost URLs** when cloud provider is configured ([#16136](https://github.com/NousResearch/hermes-agent/pull/16136))
+
+### Execute code / Terminal
+- **Vercel Sandbox backend** for `execute_code` / terminal (@kshitijk4poor) ([#17445](https://github.com/NousResearch/hermes-agent/pull/17445))
+- **Collapse subagent `task_id`s to shared container** ([#16177](https://github.com/NousResearch/hermes-agent/pull/16177))
+- **Docker: run container as host user** to avoid root-owned bind mounts (@benbarclay) ([#17305](https://github.com/NousResearch/hermes-agent/pull/17305))
+- Fix: safely quote `~/` subpaths in wrapped `cd` commands ([#15394](https://github.com/NousResearch/hermes-agent/pull/15394))
+- Fix: close file descriptor in `LocalEnvironment._update_cwd` ([#17300](https://github.com/NousResearch/hermes-agent/pull/17300))
+- Fix: SSH — prevent tar from overwriting remote home dir permissions ([#17898](https://github.com/NousResearch/hermes-agent/pull/17898), [#17867](https://github.com/NousResearch/hermes-agent/pull/17867))
+
+### Image generation
+- See Provider section for updates; no new image providers this window.
+
+### TTS / Voice
+- **Pluggable TTS provider registry** under `tts.providers.<name>` ([#17843](https://github.com/NousResearch/hermes-agent/pull/17843))
+- **Piper** as native local TTS provider (closes #8508) ([#17885](https://github.com/NousResearch/hermes-agent/pull/17885))
+- **Voice mode CLI parity in the TUI** — VAD loop + TTS + crash forensics ([#14810](https://github.com/NousResearch/hermes-agent/pull/14810))
+- Fix: vision — use HERMES_HOME-based cache dir instead of cwd ([#17719](https://github.com/NousResearch/hermes-agent/pull/17719))
+
+### Cron
+- **Honor `hermes tools` config for the cron platform** ([#14798](https://github.com/NousResearch/hermes-agent/pull/14798))
+- **Per-job `workdir`** — project-aware cron runs ([#15110](https://github.com/NousResearch/hermes-agent/pull/15110))
+- **`context_from` field** — chain cron job outputs ([#15606](https://github.com/NousResearch/hermes-agent/pull/15606))
+- Fix: promote `croniter` to a core dependency ([#17577](https://github.com/NousResearch/hermes-agent/pull/17577))
+
+### Web search
+- **Expose `limit` for `web_search`** ([#16934](https://github.com/NousResearch/hermes-agent/pull/16934))
+
+### Maps
+- Fix: include seconds in timezone UTC offset output ([#16300](https://github.com/NousResearch/hermes-agent/pull/16300))
+
+### Approvals
+- **Hardline blocklist for unrecoverable commands** ([#15878](https://github.com/NousResearch/hermes-agent/pull/15878))
+- Perf: precompile DANGEROUS_PATTERNS and HARDLINE_PATTERNS ([#17206](https://github.com/NousResearch/hermes-agent/pull/17206))
+
+### ACP
+- **Advertise and forward image prompts** ([#18030](https://github.com/NousResearch/hermes-agent/pull/18030))
+
+### API Server
+- **POST `/v1/runs/{run_id}/stop`** (salvage of #15656) ([#15842](https://github.com/NousResearch/hermes-agent/pull/15842))
+- **Expose run status for external UIs** (#17085) ([#17458](https://github.com/NousResearch/hermes-agent/pull/17458))
+
+### Nix
+- **Declarative plugin installation for NixOS module** (@alt-glitch) ([#15953](https://github.com/NousResearch/hermes-agent/pull/15953))
+- Fix: use `--rebuild` in fix-lockfiles to bypass cached FOD store paths ([#15444](https://github.com/NousResearch/hermes-agent/pull/15444))
+- Fix: `extraPackages` now actually works via per-user profile ([#17047](https://github.com/NousResearch/hermes-agent/pull/17047))
+- Fix: refresh web/ npm-deps hash to unblock main builds ([#17174](https://github.com/NousResearch/hermes-agent/pull/17174))
+- Fix: replace magic-nix-cache with Cachix ([#17928](https://github.com/NousResearch/hermes-agent/pull/17928))
+
+---
+
+## 🖥️ TUI
+
+### New features
+- **LaTeX rendering** (@austinpickett) ([#17175](https://github.com/NousResearch/hermes-agent/pull/17175))
+- **`/reload` .env hot-reload** — ported from the classic CLI ([#17286](https://github.com/NousResearch/hermes-agent/pull/17286))
+- **Pluggable busy-indicator styles** (@OutThisLife, #13610) ([#17150](https://github.com/NousResearch/hermes-agent/pull/17150))
+- **Opt-in auto-resume of the most recent session** (@OutThisLife) ([#17130](https://github.com/NousResearch/hermes-agent/pull/17130))
+- **Expanded light-terminal auto-detection** — `HERMES_TUI_THEME` + background hex (@OutThisLife) ([#17113](https://github.com/NousResearch/hermes-agent/pull/17113))
+- **Delete sessions from `/resume` picker with `d`** (@OutThisLife) ([#17668](https://github.com/NousResearch/hermes-agent/pull/17668))
+- **Line-by-line scroll on modified mouse wheel** (@OutThisLife) ([#17669](https://github.com/NousResearch/hermes-agent/pull/17669))
+- **Delete queued message while editing with ctrl-x / cancel with esc** (@OutThisLife) ([#16707](https://github.com/NousResearch/hermes-agent/pull/16707))
+- **Per-section visibility for the details accordion** (@OutThisLife) ([#14968](https://github.com/NousResearch/hermes-agent/pull/14968))
+- **Voice mode CLI parity** — VAD loop + TTS + crash forensics ([#14810](https://github.com/NousResearch/hermes-agent/pull/14810))
+- **Contextual first-touch hints ported to TUI** — `/busy`, `/verbose` ([#16054](https://github.com/NousResearch/hermes-agent/pull/16054))
+- **Mini help menu on `?` in the input field** (@ethernet8023) ([#18043](https://github.com/NousResearch/hermes-agent/pull/18043))
+
+### Fixes
+- Fix: proactive mouse disable on ConPTY + `/mouse` toggle command (@kevin-ho, WSL2 ghost-mouse fix) ([#15488](https://github.com/NousResearch/hermes-agent/pull/15488))
+- Fix: restore skills search RPC ([#15870](https://github.com/NousResearch/hermes-agent/pull/15870))
+- Perf: cache text measurements across yoga flex re-passes ([#14818](https://github.com/NousResearch/hermes-agent/pull/14818))
+- Perf: stabilize long-session scrolling ([#15926](https://github.com/NousResearch/hermes-agent/pull/15926))
+- Perf: lazily seed virtual history heights ([#16523](https://github.com/NousResearch/hermes-agent/pull/16523))
+- Perf: cut visible cold start ~57% with lazy agent init ([#17190](https://github.com/NousResearch/hermes-agent/pull/17190))
+
+---
+
+## 🖱️ CLI & User Experience
+
+### New commands
+- **`hermes -z <prompt>`** — non-interactive one-shot mode ([#15702](https://github.com/NousResearch/hermes-agent/pull/15702))
+- **`hermes -z` with `--model` / `--provider` / `HERMES_INFERENCE_MODEL`** ([#15704](https://github.com/NousResearch/hermes-agent/pull/15704))
+- **`hermes update --check`** preflight flag ([#15841](https://github.com/NousResearch/hermes-agent/pull/15841))
+- **`hermes fallback`** command for managing fallback providers ([#16052](https://github.com/NousResearch/hermes-agent/pull/16052))
+- **`/busy`** slash command for busy input mode ([#15382](https://github.com/NousResearch/hermes-agent/pull/15382))
+- **`/busy` input mode 'steer'** as a third option ([#16279](https://github.com/NousResearch/hermes-agent/pull/16279))
+- **`/btw` as alias for `/background`** ([#16053](https://github.com/NousResearch/hermes-agent/pull/16053))
+- **`/reload-skills`** slash command (salvage #17670) ([#17744](https://github.com/NousResearch/hermes-agent/pull/17744))
+- **Surface `/queue`, `/bg`, `/steer` in agent-running placeholder** ([#16118](https://github.com/NousResearch/hermes-agent/pull/16118))
+
+### Setup / onboarding
+- **Auto-reconfigure on existing installs** ([#15879](https://github.com/NousResearch/hermes-agent/pull/15879))
+- **Contextual first-touch hints for `/busy` and `/verbose`** ([#16046](https://github.com/NousResearch/hermes-agent/pull/16046))
+- **Cost-saving tips from the April 30 tip-of-the-day** ([#17841](https://github.com/NousResearch/hermes-agent/pull/17841))
+- **Hyperlink startup banner title to the latest GitHub Release** ([#14945](https://github.com/NousResearch/hermes-agent/pull/14945))
+
+### Update / backup
+- **Snapshot pairing data before `git pull`** ([#16383](https://github.com/NousResearch/hermes-agent/pull/16383))
+- **Auto-backup HERMES_HOME before `hermes update`** (opt-in, off by default) ([#16539](https://github.com/NousResearch/hermes-agent/pull/16539), [#16566](https://github.com/NousResearch/hermes-agent/pull/16566))
+- **Exclude `checkpoints/` from backups** ([#16572](https://github.com/NousResearch/hermes-agent/pull/16572))
+- **Exclude SQLite WAL/SHM/journal sidecars from backups** ([#16576](https://github.com/NousResearch/hermes-agent/pull/16576))
+- **Installer FHS layout for root installs on Linux** ([#15608](https://github.com/NousResearch/hermes-agent/pull/15608))
+- Fix: kill stale dashboards instead of warning ([#17832](https://github.com/NousResearch/hermes-agent/pull/17832))
+- Fix: show correct update status on nix-built hermes ([#17550](https://github.com/NousResearch/hermes-agent/pull/17550))
+
+### Slash-command housekeeping
+- Refactor: drop `/provider`, `/plan` handler, and clean up slash registry ([#15047](https://github.com/NousResearch/hermes-agent/pull/15047))
+- Refactor: drop `persist_session` plumbing + fix broken `/btw` mid-turn bypass ([#16075](https://github.com/NousResearch/hermes-agent/pull/16075))
+
+### OpenClaw migration (for folks coming from OpenClaw)
+- **Hardened OpenClaw import** — plan-first apply, redaction, pre-migration backup ([#16911](https://github.com/NousResearch/hermes-agent/pull/16911))
+- Fix: case-preserving brand rewrite + one-time `~/.openclaw` residue banner ([#16327](https://github.com/NousResearch/hermes-agent/pull/16327))
+- Fix: resolve `openclaw` workspace files from `agents.defaults.workspace` ([#16879](https://github.com/NousResearch/hermes-agent/pull/16879))
+- Fix: resolve model aliases against real OpenClaw catalog schema (salvage #16778) ([#16977](https://github.com/NousResearch/hermes-agent/pull/16977))
+
+---
+
+## 📊 Web Dashboard
+
+- **Models tab** — rich per-model analytics ([#17745](https://github.com/NousResearch/hermes-agent/pull/17745))
+- **Configure main + auxiliary models from the Models page** ([#17802](https://github.com/NousResearch/hermes-agent/pull/17802))
+- **Dashboard Chat tab — xterm.js + JSON-RPC sidecar** (supersedes #12710 + #13379, @OutThisLife) ([#14890](https://github.com/NousResearch/hermes-agent/pull/14890))
+- **Dashboard layout refresh** (@austinpickett) ([#14899](https://github.com/NousResearch/hermes-agent/pull/14899))
+- **`--stop` and `--status` flags** on the dashboard CLI ([#17840](https://github.com/NousResearch/hermes-agent/pull/17840))
+- **Page-scoped plugin slots for built-in pages** ([#15658](https://github.com/NousResearch/hermes-agent/pull/15658))
+- Fix: replace all buttons with design-system buttons ([#17007](https://github.com/NousResearch/hermes-agent/pull/17007))
+
+---
+
+## ⚡ Performance
+
+- **TUI visible cold start cut ~57%** via lazy agent init ([#17190](https://github.com/NousResearch/hermes-agent/pull/17190))
+- **Lazy-import OpenAI, Anthropic, Firecrawl, account_usage** ([#17046](https://github.com/NousResearch/hermes-agent/pull/17046))
+- **mtime-cache `load_config()` and `read_raw_config()`** ([#17041](https://github.com/NousResearch/hermes-agent/pull/17041))
+- **Memoize `get_tool_definitions()` + TTL-cache `check_fn` results** ([#17098](https://github.com/NousResearch/hermes-agent/pull/17098))
+- **Precompile DANGEROUS_PATTERNS and HARDLINE_PATTERNS** ([#17206](https://github.com/NousResearch/hermes-agent/pull/17206))
+- **Cache Ink text measurements across yoga flex re-passes** ([#14818](https://github.com/NousResearch/hermes-agent/pull/14818))
+- **Stabilize long-session scrolling** ([#15926](https://github.com/NousResearch/hermes-agent/pull/15926))
+- **Lazily seed virtual history heights** ([#16523](https://github.com/NousResearch/hermes-agent/pull/16523))
+
+---
+
+## 🔒 Security & Reliability
+
+- **Secret redaction off by default** — stops corrupting patches / API payloads with fake-key substitutions. Opt in via `redaction.enabled: true` ([#16794](https://github.com/NousResearch/hermes-agent/pull/16794))
+- **`[SYSTEM:` → `[IMPORTANT:`** in all user-injected markers (Azure content filter dodge) ([#16114](https://github.com/NousResearch/hermes-agent/pull/16114))
+- **Hardline blocklist for unrecoverable commands** ([#15878](https://github.com/NousResearch/hermes-agent/pull/15878))
+- **Canonical `mask_secret` helper; fix status.py DIM drift** ([#17207](https://github.com/NousResearch/hermes-agent/pull/17207))
+- **Sweep expired paste.rs uploads on a real timer** ([#16431](https://github.com/NousResearch/hermes-agent/pull/16431))
+- **Preserve symlinks during atomic file writes** ([#16980](https://github.com/NousResearch/hermes-agent/pull/16980))
+- **Probe `/dev/tty` by opening it, not bare existence** ([#17024](https://github.com/NousResearch/hermes-agent/pull/17024))
+
+---
+
+## 🐛 Notable Bug Fixes
+
+This window includes 360 `fix:` PRs. Selected highlights from across the stack:
+
+- **Background review fork inherits parent's live runtime** — provider/model/creds now propagate correctly ([#16099](https://github.com/NousResearch/hermes-agent/pull/16099))
+- **Hindsight: configurable `HINDSIGHT_TIMEOUT` env var** ([#15077](https://github.com/NousResearch/hermes-agent/pull/15077))
+- **Tools: normalize numeric entries + clear stale `no_mcp` in `_save_platform_tools`** ([#15607](https://github.com/NousResearch/hermes-agent/pull/15607))
+- **MCP: rewrite `definitions` refs to `$defs` in input schemas** — closes provider-side 400s
+- **Azure content filter compatibility** — renamed `[SYSTEM:` markers so Azure's content filter stops flagging them ([#16114](https://github.com/NousResearch/hermes-agent/pull/16114))
+- **Vision cache uses HERMES_HOME instead of cwd** ([#17719](https://github.com/NousResearch/hermes-agent/pull/17719))
+- **FTS5 search** — tool_name + tool_calls indexing with repair + migration ([#16914](https://github.com/NousResearch/hermes-agent/pull/16914))
+- **Streaming reasoning persists on assistant turns** ([#16892](https://github.com/NousResearch/hermes-agent/pull/16892))
+- **execute_code concurrent RPC serialization** (#17770) ([#17894](https://github.com/NousResearch/hermes-agent/pull/17894), [#17902](https://github.com/NousResearch/hermes-agent/pull/17902))
+- **Background reviewer scoped to memory + skills toolsets** — no more accidental web/shell escapes ([#16569](https://github.com/NousResearch/hermes-agent/pull/16569))
+- **Compression recovery** — retry on main before giving up; notify user when aux fails ([#16774](https://github.com/NousResearch/hermes-agent/pull/16774), [#16775](https://github.com/NousResearch/hermes-agent/pull/16775))
+- **`croniter` promoted to a core dependency** ([#17577](https://github.com/NousResearch/hermes-agent/pull/17577))
+- **Discord tool `limit` parameter coerced to int** before `min()` call ([#16319](https://github.com/NousResearch/hermes-agent/pull/16319))
+- **Yuanbao messaging platform entrance fix** ([#16880](https://github.com/NousResearch/hermes-agent/pull/16880))
+- **ACP advertise and forward image prompts** ([#18030](https://github.com/NousResearch/hermes-agent/pull/18030))
+- **DeepSeek / Kimi reasoning content isolation** across cross-provider histories (@Zjianru) ([#15749](https://github.com/NousResearch/hermes-agent/pull/15749), [#15762](https://github.com/NousResearch/hermes-agent/pull/15762))
+- **Preserve reasoning_content replay on DeepSeek v4 + Kimi/Moonshot thinking** ([#18045](https://github.com/NousResearch/hermes-agent/pull/18045))
+
+The vast majority of the 360 fixes landed in the streaming/compression/tool-calling paths across all providers — DeepSeek, Kimi, Moonshot, GLM, Qwen, MiniMax, Gemini, Anthropic, OpenAI — alongside TUI polish (resize, scroll, sticky-prompt) and gateway platform-specific edge cases.
+
+---
+
+## 🧪 Testing & CI
+
+- Hermetic test parity (`scripts/run_tests.sh`) held across this window
+- **Microsoft Teams xdist collision guard** — prevents worker collisions when Teams platform tests run in parallel ([#17828](https://github.com/NousResearch/hermes-agent/pull/17828))
+- Chore: remove unused imports and dead locals (ruff F401, F841) ([#17010](https://github.com/NousResearch/hermes-agent/pull/17010))
+
+---
+
+## 📚 Documentation
+
+- **Curator feature page** added to docs site ([#17563](https://github.com/NousResearch/hermes-agent/pull/17563))
+- **Document that a pin also blocks `skill_manage` writes** ([#17578](https://github.com/NousResearch/hermes-agent/pull/17578))
+- **Direct-URL skill install documented** across features, reference, guide, and `hermes-agent` skill ([#16355](https://github.com/NousResearch/hermes-agent/pull/16355))
+- **Hooks tutorial — build a BOOT.md startup checklist** (replaces the removed built-in hook) ([#17202](https://github.com/NousResearch/hermes-agent/pull/17202))
+- **ComfyUI docs: ask local vs cloud FIRST before hardware check** ([#17612](https://github.com/NousResearch/hermes-agent/pull/17612))
+- **Obliteratus skill: link YouTube video guide in SKILL.md** ([#15808](https://github.com/NousResearch/hermes-agent/pull/15808))
+- Per-skill docs pages generated for bundled + optional skills; ASCII art code blocks auto-wrapped ([#14929](https://github.com/NousResearch/hermes-agent/pull/14929), [#16497](https://github.com/NousResearch/hermes-agent/pull/16497))
+
+---
+
+## ⚖️ Removed / Reverted
+
+- **Kanban multi-profile collaboration board** landed in #16081 and was reverted in [#16098](https://github.com/NousResearch/hermes-agent/pull/16098) while the design is reworked
+- **computer-use cua-driver** preparatory work (3 PRs) landed and was then reverted in [#16927](https://github.com/NousResearch/hermes-agent/pull/16927)
+- **BOOT.md built-in hook** removed ([#17093](https://github.com/NousResearch/hermes-agent/pull/17093)); the hooks tutorial ([#17202](https://github.com/NousResearch/hermes-agent/pull/17202)) shows how to build the same workflow yourself with a shell hook
+- **`/provider` + `/plan` slash commands dropped** ([#15047](https://github.com/NousResearch/hermes-agent/pull/15047))
+- **`flush_memories` removed entirely** ([#15696](https://github.com/NousResearch/hermes-agent/pull/15696))
+
+---
+
+## 👥 Contributors
+
+### Core
+- **@teknium1** (Teknium)
+
+### Top Community Contributors (by merged PR count since v0.11.0)
+
+- **@OutThisLife** (Brooklyn) — 52 PRs · TUI — light-terminal detection + pluggable busy styles + auto-resume + session-delete from /resume + mouse-wheel scrolling + xterm.js dashboard Chat tab + cold-start cut + accordion polish
+- **@kshitijk4poor** — 12 PRs · LM Studio first-class provider (salvage), Vercel Sandbox backend, GMI Cloud salvage, bundled-by-default touchdesigner-mcp, many tool-call / reasoning fixes
+- **@helix4u** — 10 PRs · MCP schema robustness, assorted stability fixes
+- **@alt-glitch** — 8 PRs · trigram FTS5 CJK search, declarative Nix plugin install, matrix/feishu hints and fixes
+- **@ethernet8023** — 4 PRs
+- **@austinpickett** — 4 PRs · LaTeX rendering in TUI, dashboard layout refresh
+- **@benbarclay** — 3 PRs · Docker run-as-host-user so bind mounts don't get root-owned
+- **@vominh1919** — 2 PRs
+- **@stephenschoettler** — 2 PRs
+- **@kevin-ho** — ConPTY mouse-injection fix (#15488)
+- **@Zjianru** — cross-provider reasoning_content isolation + DeepSeek/Kimi empty-reasoning injection (#15749, #15762)
+- **@web3blind** — Telegram chat allowlists for groups and forums (#15027)
+- **@SHL0MS** — 9 new TouchDesigner-MCP reference docs (#16768)
+- **@0xDevNinja** — curator `restore_skill` nested-archive fix (#17951)
+- **@y0shua1ee** — curator `use` activity fix (#17953)
+
+### Also contributing
+Salvaged or co-authored work from **@isaachuangGMICLOUD** (GMI Cloud), earlier upstream PRs from the original author of each salvage chain, and a long tail of one-shot fixes, documentation nudges, and skill contributions from the community.
+
+### All Contributors (alphabetical, excluding @teknium1)
+
+@0xbyt4, @0xharryriddle, @0xDevNinja, @0z1-ghb, @5park1e, @A-FdL-Prog, @aj-nt, @akhater, @alblez, @alexg0bot,
+@alexzhu0, @AllardQuek, @alt-glitch, @amanning3390, @amanuel2, @AndreKurait, @andrewhosf, @Andy283, @andyylin,
+@angel12, @AntAISecurityLab, @ash, @austinpickett, @badgerbees, @BadTechBandit, @Bartok9, @beenherebefore,
+@beesrsj2500, @BeliefanX, @benbarclay, @benjaminsehl, @BlackishGreen33, @bloodcarter, @BlueBirdBack,
+@briandevans, @brooklynnicholson, @bsgdigital, @buray, @bwjoke, @camaragon, @cdanis, @cgarwood82,
+@charles-brooks, @chen1749144759, @chengoak, @ching-kaching, @Contentment003111, @crayfish-ai, @CruxExperts,
+@cyclingwithelephants, @dandaka, @danklynn, @ddupont808, @dhabibi, @difujia, @dimitrovi, @dlkakbs,
+@dontcallmejames, @EKKOLearnAI, @emozilla, @ericnicolaides, @Erosika, @ethernet8023, @exiao, @Feranmi10,
+@flobo3, @foxion37, @georgeglessner, @georgex8001, @ghostmfr, @H-Ali13381, @HangGlidersRule, @harryplusplus,
+@haru398801, @heathley, @hejuntt1014, @hekaru-agent, @helix4u, @Heltman, @HenkDz, @heyitsaamir, @hharry11,
+@hhhonzik, @hhuang91, @HiddenPuppy, @htsh, @iamagenius00, @in-liberty420, @innocarpe, @irispillars, @iRonin,
+@isaachuangGMICLOUD, @Ito-69, @j3ffffff, @jackjin1997, @jakubkrcmar, @Jason2031, @JayGwod, @jerome-benoit,
+@johnncenae, @Kailigithub, @keiravoss94, @kevin-ho, @knockyai, @konsisumer, @kshitijk4poor, @kunlabs, @l0hde,
+@Leihb, @leoneparise, @LeonSGP43, @liizfq, @liuhao1024, @loongzhao, @lsdsjy, @luyao618, @ma-pony, @Magaav,
+@MagicRay1217, @math0r-be, @MattMaximo, @maxims-oss, @MaxyMoos, @maymuneth, @mcndjxlefnd, @memosr,
+@MestreY0d4-Uninter, @mewwts, @Mirac1eSky, @MorAlekss, @mrhwick, @mrunmayee17, @mssteuer, @Nanako0129,
+@nazirulhafiy, @Nerijusas, @Nicecsh, @nicoloboschi, @nightq, @ningfangbin, @octo-patch, @Octopus,
+@OutThisLife, @Paperclip, @pein892, @perlowja, @prasadus92, @qike-ms, @qiyin-code, @Readon, @ReginaldasR,
+@revaraver, @rfilgueiras, @rmoen, @romanornr, @rugvedS07, @rylena, @samrusani, @Sanjays2402, @sasha-id,
+@Satoshi-agi, @scheidti, @scotttrinh, @season179, @SeeYangZhi, @sgaofen, @shamork, @shannonsands, @SHL0MS,
+@simbam99, @Societus, @socrates1024, @Sonoyunchu, @sprmn24, @stephenschoettler, @tangyuanjc, @TechPrototyper,
+@tekgnosis-net, @ThomassJonax, @tmimmanuel, @tochukwuada, @Tosko4, @Tranquil-Flow, @twozle, @txbxxx,
+@UgwujaGeorge, @Versun, @vlwkaos, @voidborne-d, @vominh1919, @Wang-tianhao, @Wangshengyang2004, @web3blind,
+@westers, @Wysie, @xandersbell, @xiahu88988, @XieNBi, @xinbenlv, @xnbi, @y0shua1ee, @yatesjalex, @yes999zc,
+@yeyitech, @Yoimex, @YueLich, @Yukipukii1, @zhiyanliu, @zicochaos, @Zjianru, @zkl2333, @zons-zhaozhy,
+@ztexydt-cqh.
+
+Also: @Siddharth Balyan, @YuShu.
+
+---
+
+**Full Changelog**: [v2026.4.23...v2026.4.30](https://github.com/NousResearch/hermes-agent/compare/v2026.4.23...v2026.4.30)
@@ -4,6 +4,7 @@ from __future__ import annotations

 import asyncio
 import contextvars
+import json
 import logging
 import os
 from collections import defaultdict, deque
@@ -47,6 +48,7 @@ from acp.schema import (
    TextContentBlock,
    UnstructuredCommandInput,
    Usage,
+    UsageUpdate,
    UserMessageChunk,
 )

@@ -65,6 +67,7 @@ from acp_adapter.events import (
 )
 from acp_adapter.permissions import make_approval_callback
 from acp_adapter.session import SessionManager, SessionState, _expand_acp_enabled_toolsets
+from acp_adapter.tools import build_tool_complete, build_tool_start

 logger = logging.getLogger(__name__)

@@ -164,6 +167,8 @@ class HermesACPAgent(acp.Agent):
        "context": "Show conversation context info",
        "reset": "Clear conversation history",
        "compact": "Compress conversation context",
+        "steer": "Inject guidance into the currently running agent turn",
+        "queue": "Queue a prompt to run after the current turn finishes",
        "version": "Show Hermes version",
    }

@@ -193,6 +198,16 @@ class HermesACPAgent(acp.Agent):
            "name": "compact",
            "description": "Compress conversation context",
        },
+        {
+            "name": "steer",
+            "description": "Inject guidance into the currently running agent turn",
+            "input_hint": "guidance for the active turn",
+        },
+        {
+            "name": "queue",
+            "description": "Queue a prompt to run after the current turn finishes",
+            "input_hint": "prompt to run next",
+        },
        {
            "name": "version",
            "description": "Show Hermes version",
@@ -303,6 +318,66 @@ class HermesACPAgent(acp.Agent):

        return target_provider, new_model

+    @staticmethod
+    def _build_usage_update(state: SessionState) -> UsageUpdate | None:
+        """Build ACP native context-usage data for clients like Zed.
+
+        Zed's circular context indicator is driven by ACP ``usage_update``
+        session updates: ``size`` is the model context window and ``used`` is
+        the current request pressure.  Hermes estimates ``used`` from the same
+        buckets it sends to providers: system prompt, conversation history, and
+        tool schemas.
+        """
+        agent = state.agent
+        compressor = getattr(agent, "context_compressor", None)
+        size = int(getattr(compressor, "context_length", 0) or 0)
+        if size <= 0:
+            return None
+
+        try:
+            from agent.model_metadata import estimate_request_tokens_rough
+
+            used = estimate_request_tokens_rough(
+                state.history,
+                system_prompt=getattr(agent, "_cached_system_prompt", "") or "",
+                tools=getattr(agent, "tools", None) or None,
+            )
+        except Exception:
+            logger.debug("Could not estimate ACP native context usage", exc_info=True)
+            used = int(getattr(compressor, "last_prompt_tokens", 0) or 0)
+
+        return UsageUpdate(
+            session_update="usage_update",
+            size=max(size, 0),
+            used=max(used, 0),
+        )
+
+    async def _send_usage_update(self, state: SessionState) -> None:
+        """Send ACP native context usage to the connected client."""
+        if not self._conn:
+            return
+        update = self._build_usage_update(state)
+        if update is None:
+            return
+        try:
+            await self._conn.session_update(
+                session_id=state.session_id,
+                update=update,
+            )
+        except Exception:
+            logger.warning(
+                "Failed to send ACP usage update for session %s",
+                state.session_id,
+                exc_info=True,
+            )
+
+    def _schedule_usage_update(self, state: SessionState) -> None:
+        """Schedule native context indicator refresh after ACP responses."""
+        if not self._conn:
+            return
+        loop = asyncio.get_running_loop()
+        loop.call_soon(asyncio.create_task, self._send_usage_update(state))
+
    async def _register_session_mcp_servers(
        self,
        state: SessionState,
@@ -473,37 +548,99 @@ class HermesACPAgent(acp.Agent):
            )
        return None

+    @staticmethod
+    def _history_tool_call_name_args(tool_call: dict[str, Any]) -> tuple[str, dict[str, Any]]:
+        """Extract function name/arguments from an OpenAI-style tool_call."""
+        function = tool_call.get("function") if isinstance(tool_call.get("function"), dict) else {}
+        name = str(function.get("name") or tool_call.get("name") or "unknown_tool")
+        raw_args = function.get("arguments") or tool_call.get("arguments") or tool_call.get("args") or {}
+        if isinstance(raw_args, str):
+            try:
+                parsed = json.loads(raw_args)
+            except Exception:
+                parsed = {"raw": raw_args}
+            raw_args = parsed
+        if not isinstance(raw_args, dict):
+            raw_args = {}
+        return name, raw_args
+
+    @staticmethod
+    def _history_tool_call_id(tool_call: dict[str, Any]) -> str:
+        """Return the stable provider tool call id for ACP history replay."""
+        return str(
+            tool_call.get("id")
+            or tool_call.get("call_id")
+            or tool_call.get("tool_call_id")
+            or ""
+        ).strip()
+
    async def _replay_session_history(self, state: SessionState) -> None:
        """Send persisted user/assistant history to clients during session/load.

        Zed's ACP history UI calls ``session/load`` after the user picks an item
        from the Agents sidebar. The agent must then replay the full conversation
-        as ``user_message_chunk`` / ``agent_message_chunk`` notifications; merely
-        restoring server-side state makes Hermes remember context, but leaves the
-        editor looking like a clean thread.
+        as user/assistant chunks plus reconstructed tool-call start/completion
+        notifications; merely restoring server-side state makes Hermes remember
+        context, but leaves the editor looking like a clean thread.
        """
        if not self._conn or not state.history:
            return

-        for message in state.history:
-            role = str(message.get("role") or "")
-            if role not in {"user", "assistant"}:
-                continue
-            text = self._history_message_text(message)
-            if not text:
-                continue
-            update = self._history_message_update(role=role, text=text)
-            if update is None:
-                continue
+        active_tool_calls: dict[str, tuple[str, dict[str, Any]]] = {}
+
+        async def _send(update: Any) -> bool:
            try:
                await self._conn.session_update(session_id=state.session_id, update=update)
+                return True
            except Exception:
                logger.warning(
                    "Failed to replay ACP history for session %s",
                    state.session_id,
                    exc_info=True,
                )
-                return
+                return False
+
+        for message in state.history:
+            role = str(message.get("role") or "")
+
+            if role in {"user", "assistant"}:
+                text = self._history_message_text(message)
+                if text:
+                    update = self._history_message_update(role=role, text=text)
+                    if update is not None and not await _send(update):
+                        return
+
+            if role == "assistant" and isinstance(message.get("tool_calls"), list):
+                for tool_call in message["tool_calls"]:
+                    if not isinstance(tool_call, dict):
+                        continue
+                    tool_call_id = self._history_tool_call_id(tool_call)
+                    if not tool_call_id:
+                        continue
+                    tool_name, args = self._history_tool_call_name_args(tool_call)
+                    active_tool_calls[tool_call_id] = (tool_name, args)
+                    if not await _send(build_tool_start(tool_call_id, tool_name, args)):
+                        return
+                continue
+
+            if role == "tool":
+                tool_call_id = str(message.get("tool_call_id") or "").strip()
+                tool_name = str(message.get("tool_name") or "").strip()
+                function_args: dict[str, Any] | None = None
+                if tool_call_id in active_tool_calls:
+                    tool_name, function_args = active_tool_calls.pop(tool_call_id)
+                if not tool_call_id or not tool_name:
+                    continue
+                result = message.get("content")
+                if not await _send(
+                    build_tool_complete(
+                        tool_call_id,
+                        tool_name,
+                        result=result if isinstance(result, str) else None,
+                        function_args=function_args,
+                    )
+                ):
+                    return

    async def new_session(
        self,
@@ -515,11 +652,24 @@ class HermesACPAgent(acp.Agent):
        await self._register_session_mcp_servers(state, mcp_servers)
        logger.info("New session %s (cwd=%s)", state.session_id, cwd)
        self._schedule_available_commands_update(state.session_id)
+        self._schedule_usage_update(state)
        return NewSessionResponse(
            session_id=state.session_id,
            models=self._build_model_state(state),
        )

+    def _schedule_history_replay(self, state: SessionState) -> None:
+        """Replay persisted history after session/load or session/resume returns.
+
+        Zed only attaches streamed transcript/tool updates once the load/resume
+        response has completed. Sending replay notifications while the request is
+        still in-flight can make the server look correct in logs while the editor
+        drops or fails to attach the tool-call history.
+        """
+        loop = asyncio.get_running_loop()
+        replay_coro = self._replay_session_history(state)
+        loop.call_soon(asyncio.create_task, replay_coro)
+
    async def load_session(
        self,
        cwd: str,
@@ -533,8 +683,9 @@ class HermesACPAgent(acp.Agent):
            return None
        await self._register_session_mcp_servers(state, mcp_servers)
        logger.info("Loaded session %s", session_id)
-        await self._replay_session_history(state)
+        self._schedule_history_replay(state)
        self._schedule_available_commands_update(session_id)
+        self._schedule_usage_update(state)
        return LoadSessionResponse(models=self._build_model_state(state))

    async def resume_session(
@@ -550,13 +701,17 @@ class HermesACPAgent(acp.Agent):
            state = self.session_manager.create_session(cwd=cwd)
        await self._register_session_mcp_servers(state, mcp_servers)
        logger.info("Resumed session %s", state.session_id)
-        await self._replay_session_history(state)
+        self._schedule_history_replay(state)
        self._schedule_available_commands_update(state.session_id)
+        self._schedule_usage_update(state)
        return ResumeSessionResponse(models=self._build_model_state(state))

    async def cancel(self, session_id: str, **kwargs: Any) -> None:
        state = self.session_manager.get_session(session_id)
        if state and state.cancel_event:
+            with state.runtime_lock:
+                if state.is_running and state.current_prompt_text:
+                    state.interrupted_prompt_text = state.current_prompt_text
            state.cancel_event.set()
            try:
                if getattr(state, "agent", None) and hasattr(state.agent, "interrupt"):
@@ -654,6 +809,39 @@ class HermesACPAgent(acp.Agent):
        if not has_content:
            return PromptResponse(stop_reason="end_turn")

+        # /steer on an idle session has no in-flight tool call to inject into.
+        # Rewrite it so the payload runs as a normal user prompt, matching the
+        # gateway's behavior (gateway/run.py ~L4898). Two sub-cases:
+        #   1. Zed-interrupt salvage — a prior prompt was cancelled by the
+        #      client right before /steer arrived; replay it with the steer
+        #      text attached as explicit correction/guidance so the user's
+        #      in-flight work isn't lost.
+        #   2. Plain idle — no prior work to salvage; just run the steer
+        #      payload as a regular prompt. Without this, _cmd_steer would
+        #      silently append to state.queued_prompts and respond with
+        #      "No active turn — queued for the next turn", which looks like
+        #      /queue even though the user never typed /queue.
+        if isinstance(user_content, str) and user_text.startswith("/steer"):
+            steer_text = user_text.split(maxsplit=1)[1].strip() if len(user_text.split(maxsplit=1)) > 1 else ""
+            interrupted_prompt = ""
+            rewrite_idle = False
+            with state.runtime_lock:
+                if not state.is_running and steer_text:
+                    if state.interrupted_prompt_text:
+                        interrupted_prompt = state.interrupted_prompt_text
+                        state.interrupted_prompt_text = ""
+                    else:
+                        rewrite_idle = True
+            if interrupted_prompt:
+                user_text = (
+                    f"{interrupted_prompt}\n\n"
+                    f"User correction/guidance after interrupt: {steer_text}"
+                )
+                user_content = user_text
+            elif rewrite_idle:
+                user_text = steer_text
+                user_content = steer_text
+
        # Intercept slash commands — handle locally without calling the LLM.
        # Slash commands are text-only; if the client included images/resources,
        # send the whole multimodal prompt to the agent instead of treating it as
@@ -664,8 +852,27 @@ class HermesACPAgent(acp.Agent):
                if self._conn:
                    update = acp.update_agent_message_text(response_text)
                    await self._conn.session_update(session_id, update)
+                    await self._send_usage_update(state)
                return PromptResponse(stop_reason="end_turn")

+        # If Zed sends another regular prompt while the same ACP session is
+        # still running, queue it instead of racing two AIAgent loops against
+        # the same state.history. /steer and /queue are handled above and can
+        # land immediately.
+        queued_depth = 0
+        with state.runtime_lock:
+            if state.is_running:
+                state.queued_prompts.append(user_text or "[Image attachment]")
+                queued_depth = len(state.queued_prompts)
+            else:
+                state.is_running = True
+                state.current_prompt_text = user_text or "[Image attachment]"
+        if queued_depth:
+            if self._conn:  # notify outside runtime_lock; never await under a threading lock
+                update = acp.update_agent_message_text(f"Queued for the next turn. ({queued_depth} queued)")
+                await self._conn.session_update(session_id, update)
+            return PromptResponse(stop_reason="end_turn")
+
        logger.info("Prompt on session %s: %s", session_id, user_text[:100])

        conn = self._conn
@@ -678,24 +885,37 @@ class HermesACPAgent(acp.Agent):
        tool_call_meta: dict[str, dict[str, Any]] = {}
        previous_approval_cb = None

+        streamed_message = False
+
        if conn:
            tool_progress_cb = make_tool_progress_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
-            thinking_cb = make_thinking_cb(conn, session_id, loop)
+            reasoning_cb = make_thinking_cb(conn, session_id, loop)
            step_cb = make_step_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
            message_cb = make_message_cb(conn, session_id, loop)
+
+            def stream_delta_cb(text: str) -> None:
+                nonlocal streamed_message
+                if text:
+                    streamed_message = True
+                message_cb(text)
+
            approval_cb = make_approval_callback(conn.request_permission, loop, session_id)
        else:
            tool_progress_cb = None
-            thinking_cb = None
+            reasoning_cb = None
            step_cb = None
-            message_cb = None
+            stream_delta_cb = None
            approval_cb = None

        agent = state.agent
        agent.tool_progress_callback = tool_progress_cb
-        agent.thinking_callback = thinking_cb
+        # ACP thought panes should not receive Hermes' local kawaii waiting/status
+        # updates. Route provider/model reasoning deltas instead; if the provider
+        # emits no reasoning, Zed should not get a fake "thinking" accordion.
+        agent.thinking_callback = None
+        agent.reasoning_callback = reasoning_cb
        agent.step_callback = step_cb
-        agent.message_callback = message_cb
+        agent.stream_delta_callback = stream_delta_cb

        # Approval callback is per-thread (thread-local, GHSA-qg5c-hvr5-hjgr).
        # Set it INSIDE _run_agent so the TLS write happens in the executor
@@ -777,6 +997,9 @@ class HermesACPAgent(acp.Agent):
            result = await loop.run_in_executor(_executor, ctx.run, _run_agent)
        except Exception:
            logger.exception("Executor error for session %s", session_id)
+            with state.runtime_lock:
+                state.is_running = False
+                state.current_prompt_text = ""
            return PromptResponse(stop_reason="end_turn")

        if result.get("messages"):
@@ -798,10 +1021,32 @@ class HermesACPAgent(acp.Agent):
                )
            except Exception:
                logger.debug("Failed to auto-title ACP session %s", session_id, exc_info=True)
-        if final_response and conn:
+        if final_response and conn and not streamed_message:
            update = acp.update_agent_message_text(final_response)
            await conn.session_update(session_id, update)

+        # Mark this turn idle before draining queued work so recursive prompt()
+        # calls can acquire the session. Queued turns are intentionally run as
+        # normal follow-up user prompts, preserving role alternation and history.
+        with state.runtime_lock:
+            state.is_running = False
+            state.current_prompt_text = ""
+
+        while True:
+            with state.runtime_lock:
+                if not state.queued_prompts:
+                    break
+                next_prompt = state.queued_prompts.pop(0)
+            if conn:
+                await conn.session_update(
+                    session_id,
+                    acp.update_user_message_text(next_prompt),
+                )
+            await self.prompt(
+                prompt=[TextContentBlock(type="text", text=next_prompt)],
+                session_id=session_id,
+            )
+
        usage = None
        if any(result.get(key) is not None for key in ("prompt_tokens", "completion_tokens", "total_tokens")):
            usage = Usage(
@@ -812,6 +1057,8 @@ class HermesACPAgent(acp.Agent):
                cached_read_tokens=result.get("cache_read_tokens"),
            )

+        await self._send_usage_update(state)
+
        stop_reason = "cancelled" if state.cancel_event and state.cancel_event.is_set() else "end_turn"
        return PromptResponse(stop_reason=stop_reason, usage=usage)

@@ -879,6 +1126,8 @@ class HermesACPAgent(acp.Agent):
            "context": self._cmd_context,
            "reset": self._cmd_reset,
            "compact": self._cmd_compact,
+            "steer": self._cmd_steer,
+            "queue": self._cmd_queue,
            "version": self._cmd_version,
        }.get(cmd)

@@ -942,22 +1191,84 @@ class HermesACPAgent(acp.Agent):
            return f"Could not list tools: {e}"

    def _cmd_context(self, args: str, state: SessionState) -> str:
+        """Show ACP session context pressure and compression guidance."""
        n_messages = len(state.history)
-        if n_messages == 0:
-            return "Conversation is empty (no messages yet)."
-        # Count by role
+
+        # Count by role.
        roles: dict[str, int] = {}
        for msg in state.history:
            role = msg.get("role", "unknown")
            roles[role] = roles.get(role, 0) + 1
+
+        agent = state.agent
+        model = state.model or getattr(agent, "model", "")
+        provider = getattr(agent, "provider", None) or "auto"
+        compressor = getattr(agent, "context_compressor", None)
+        context_length = int(getattr(compressor, "context_length", 0) or 0)
+        threshold_tokens = int(getattr(compressor, "threshold_tokens", 0) or 0)
+
+        try:
+            from agent.model_metadata import estimate_request_tokens_rough
+
+            system_prompt = getattr(agent, "_cached_system_prompt", "") or ""
+            tools = getattr(agent, "tools", None) or None
+            approx_tokens = estimate_request_tokens_rough(
+                state.history,
+                system_prompt=system_prompt,
+                tools=tools,
+            )
+        except Exception:
+            logger.debug("Could not estimate ACP context usage", exc_info=True)
+            approx_tokens = 0
+
+        if threshold_tokens <= 0 and context_length > 0:
+            threshold_tokens = int(context_length * 0.80)
+
        lines = [
-            f"Conversation: {n_messages} messages",
+            f"Conversation: {n_messages} messages"
+            if n_messages
+            else "Conversation is empty (no messages yet).",
            f"  user: {roles.get('user', 0)}, assistant: {roles.get('assistant', 0)}, "
            f"tool: {roles.get('tool', 0)}, system: {roles.get('system', 0)}",
        ]
-        model = state.model or getattr(state.agent, "model", "")
        if model:
            lines.append(f"Model: {model}")
+        lines.append(f"Provider: {provider}")
+
+        if approx_tokens > 0:
+            if context_length > 0:
+                usage_pct = (approx_tokens / context_length) * 100
+                lines.append(
+                    f"Context usage: ~{approx_tokens:,} / {context_length:,} tokens ({usage_pct:.1f}%)"
+                )
+            else:
+                lines.append(f"Context usage: ~{approx_tokens:,} tokens")
+
+        if threshold_tokens > 0:
+            if approx_tokens > 0:
+                threshold_pct = (threshold_tokens / context_length) * 100 if context_length > 0 else 0
+                remaining = max(threshold_tokens - approx_tokens, 0)
+                if approx_tokens >= threshold_tokens:
+                    lines.append(
+                        f"Compression: due now (threshold ~{threshold_tokens:,}"
+                        + (f", {threshold_pct:.0f}%" if threshold_pct else "")
+                        + "). Run /compact."
+                    )
+                else:
+                    lines.append(
+                        f"Compression: ~{remaining:,} tokens until threshold "
+                        f"(~{threshold_tokens:,}"
+                        + (f", {threshold_pct:.0f}%" if threshold_pct else "")
+                        + ")."
+                    )
+            else:
+                lines.append(f"Compression threshold: ~{threshold_tokens:,} tokens")
+
+        if getattr(agent, "compression_enabled", True) is False:
+            lines.append("Compression is disabled for this agent.")
+        else:
+            lines.append("Tip: run /compact to compress manually before the threshold.")
+
        return "\n".join(lines)

    def _cmd_reset(self, args: str, state: SessionState) -> str:
@@ -975,10 +1286,16 @@ class HermesACPAgent(acp.Agent):
            if not hasattr(agent, "_compress_context"):
                return "Context compression not available for this agent."

-            from agent.model_metadata import estimate_messages_tokens_rough
+            from agent.model_metadata import estimate_request_tokens_rough

            original_count = len(state.history)
-            approx_tokens = estimate_messages_tokens_rough(state.history)
+            # Include system prompt + tool schemas so the figure reflects real
+            # request pressure, not a transcript-only underestimate (#6217).
+            _sys_prompt = getattr(agent, "_cached_system_prompt", "") or ""
+            _tools = getattr(agent, "tools", None) or None
+            approx_tokens = estimate_request_tokens_rough(
+                state.history, system_prompt=_sys_prompt, tools=_tools
+            )
            original_session_db = getattr(agent, "_session_db", None)

            try:
@@ -998,7 +1315,13 @@ class HermesACPAgent(acp.Agent):
            self.session_manager.save_session(state.session_id)

            new_count = len(state.history)
-            new_tokens = estimate_messages_tokens_rough(state.history)
+            _sys_prompt_after = getattr(agent, "_cached_system_prompt", "") or _sys_prompt
+            _tools_after = getattr(agent, "tools", None) or _tools
+            new_tokens = estimate_request_tokens_rough(
+                state.history,
+                system_prompt=_sys_prompt_after,
+                tools=_tools_after,
+            )
            return (
                f"Context compressed: {original_count} -> {new_count} messages\n"
                f"~{approx_tokens:,} -> ~{new_tokens:,} tokens"
@@ -1006,6 +1329,34 @@ class HermesACPAgent(acp.Agent):
        except Exception as e:
            return f"Compression failed: {e}"

+    def _cmd_steer(self, args: str, state: SessionState) -> str:
+        steer_text = args.strip()
+        if not steer_text:
+            return "Usage: /steer <guidance>"
+
+        if state.is_running and hasattr(state.agent, "steer"):
+            try:
+                if state.agent.steer(steer_text):
+                    preview = steer_text[:80] + ("..." if len(steer_text) > 80 else "")
+                    return f"⏩ Steer queued for the active turn: {preview}"
+            except Exception as exc:
+                logger.warning("ACP steer failed for session %s: %s", state.session_id, exc)
+                return f"⚠️ Steer failed: {exc}"
+
+        with state.runtime_lock:
+            state.queued_prompts.append(steer_text)
+            depth = len(state.queued_prompts)
+        return f"No active turn — queued for the next turn. ({depth} queued)"
+
+    def _cmd_queue(self, args: str, state: SessionState) -> str:
+        queued_text = args.strip()
+        if not queued_text:
+            return "Usage: /queue <prompt>"
+        with state.runtime_lock:
+            state.queued_prompts.append(queued_text)
+            depth = len(state.queued_prompts)
+        return f"Queued for the next turn. ({depth} queued)"
+
    def _cmd_version(self, args: str, state: SessionState) -> str:
        return f"Hermes Agent v{HERMES_VERSION}"

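The queueing behaviour behind `/queue`, the `/steer` fallback, and the busy-session check can be modelled in isolation. The sketch below is a simplified, synchronous illustration and not the adapter's implementation: `MiniSession`, `submit`, and `finish_turn` are hypothetical names, whereas the real code runs the agent in an executor and drains the queue by re-entering `prompt()`.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from threading import Lock


@dataclass
class MiniSession:
    """Hypothetical stand-in for the runtime fields added to SessionState."""

    is_running: bool = False
    queued_prompts: list[str] = field(default_factory=list)
    runtime_lock: Lock = field(default_factory=Lock)

    def submit(self, text: str) -> str:
        """Start a turn if the session is idle; otherwise queue the prompt."""
        with self.runtime_lock:
            if self.is_running:
                self.queued_prompts.append(text)
                return f"Queued for the next turn. ({len(self.queued_prompts)} queued)"
            self.is_running = True
        return f"running: {text}"

    def finish_turn(self) -> list[str]:
        """Mark the session idle, then run queued prompts in FIFO order."""
        with self.runtime_lock:
            self.is_running = False
        ran: list[str] = []
        while True:
            # Hold the lock only while touching shared state, never while "running".
            with self.runtime_lock:
                if not self.queued_prompts:
                    break
                next_prompt = self.queued_prompts.pop(0)
            ran.append(self.submit(next_prompt))
            with self.runtime_lock:
                self.is_running = False  # the follow-up turn has completed
        return ran
```

Marking the session idle before draining matters: each queued prompt goes through the same entry point as a fresh prompt, so it has to be able to claim the session.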
@@ -26,6 +26,33 @@ from typing import Any, Dict, List, Optional
 logger = logging.getLogger(__name__)


+def _win_path_to_wsl(path: str) -> str | None:
+    """Convert a Windows drive path to its WSL /mnt/<drive>/... equivalent."""
+    match = re.match(r"^([A-Za-z]):[\\/](.*)$", path)
+    if not match:
+        return None
+    drive = match.group(1).lower()
+    tail = match.group(2).replace("\\", "/")
+    return f"/mnt/{drive}/{tail}"
+
+
+def _translate_acp_cwd(cwd: str) -> str:
+    """Translate Windows ACP cwd values when Hermes itself is running in WSL.
+
+    Windows ACP clients can launch ``hermes acp`` inside WSL while still sending
+    editor workspaces as Windows drive paths such as ``E:\\Projects``. Store
+    and execute against the WSL mount path so agents, tools, and persisted ACP
+    sessions all agree on the usable workspace. Native Linux/macOS keeps the
+    original cwd unchanged.
+    """
+    from hermes_constants import is_wsl
+
+    if not is_wsl():
+        return cwd
+    translated = _win_path_to_wsl(str(cwd))
+    return translated if translated is not None else cwd
+
+
 def _normalize_cwd_for_compare(cwd: str | None) -> str:
    raw = str(cwd or ".").strip()
    if not raw:
@@ -34,11 +61,9 @@ def _normalize_cwd_for_compare(cwd: str | None) -> str:

    # Normalize Windows drive paths into the equivalent WSL mount form so
    # ACP history filters match the same workspace across Windows and WSL.
-    match = re.match(r"^([A-Za-z]):[\\/](.*)$", expanded)
-    if match:
-        drive = match.group(1).lower()
-        tail = match.group(2).replace("\\", "/")
-        expanded = f"/mnt/{drive}/{tail}"
+    translated = _win_path_to_wsl(expanded)
+    if translated is not None:
+        expanded = translated
    elif re.match(r"^/mnt/[A-Za-z]/", expanded):
        expanded = f"/mnt/{expanded[5].lower()}/{expanded[7:]}"

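The drive-path translation factored out in the hunks above can be exercised on its own. This is a minimal standalone sketch, assuming the same regular expression as `_win_path_to_wsl`. The names `win_path_to_wsl` and `normalize_mount` are illustrative and not part of the Hermes codebase, and the `is_wsl()` guard is deliberately left out.

```python
from __future__ import annotations

import re


def win_path_to_wsl(path: str) -> str | None:
    """Map a Windows drive path (C:\\dir or C:/dir) to /mnt/<drive>/dir."""
    match = re.match(r"^([A-Za-z]):[\\/](.*)$", path)
    if not match:
        return None  # not a drive path; the caller keeps the original value
    drive = match.group(1).lower()
    tail = match.group(2).replace("\\", "/")
    return f"/mnt/{drive}/{tail}"


def normalize_mount(path: str) -> str:
    """Hypothetical helper: lower-case the drive letter of a /mnt/X/ path."""
    if re.match(r"^/mnt/[A-Za-z]/", path):
        # path[5] is the drive letter; path[7:] is everything after "/mnt/X/".
        return f"/mnt/{path[5].lower()}/{path[7:]}"
    return path


if __name__ == "__main__":
    print(win_path_to_wsl("E:\\Projects\\POTI"))   # /mnt/e/Projects/POTI
    print(win_path_to_wsl("/home/user/project"))    # None
    print(normalize_mount("/mnt/E/Projects"))       # /mnt/e/Projects
```

Returning `None` for non-drive paths lets a caller distinguish "nothing to translate" from a translated result, which is how `_translate_acp_cwd` falls back to the original cwd.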
@@ -96,12 +121,18 @@ def _acp_stderr_print(*args, **kwargs) -> None:


 def _register_task_cwd(task_id: str, cwd: str) -> None:
-    """Bind a task/session id to the editor's working directory for tools."""
+    """Bind a task/session id to the editor's working directory for tools.
+
+    Zed can launch Hermes from a Windows workspace while the ACP process runs
+    inside WSL. In that case ACP sends cwd as e.g. ``E:\\Projects\\POTI``;
+    local tools need the WSL mount equivalent or subprocess creation fails
+    before the command can run.
+    """
    if not task_id:
        return
    try:
        from tools.terminal_tool import register_task_env_overrides
-        register_task_env_overrides(task_id, {"cwd": cwd})
+        register_task_env_overrides(task_id, {"cwd": _translate_acp_cwd(cwd)})
    except Exception:
        logger.debug("Failed to register ACP task cwd override", exc_info=True)

@@ -145,6 +176,11 @@ class SessionState:
    model: str = ""
    history: List[Dict[str, Any]] = field(default_factory=list)
    cancel_event: Any = None  # threading.Event
+    is_running: bool = False
+    queued_prompts: List[str] = field(default_factory=list)
+    runtime_lock: Any = field(default_factory=Lock)
+    current_prompt_text: str = ""
+    interrupted_prompt_text: str = ""


 class SessionManager:
@@ -175,6 +211,7 @@ class SessionManager:
        """Create a new session with a unique ID and a fresh AIAgent."""
        import threading

+        cwd = _translate_acp_cwd(cwd)
        session_id = str(uuid.uuid4())
        agent = self._make_agent(session_id=session_id, cwd=cwd)
        state = SessionState(
@@ -217,6 +254,7 @@ class SessionManager:
        """Deep-copy a session's history into a new session."""
        import threading

+        cwd = _translate_acp_cwd(cwd)
        original = self.get_session(session_id)  # checks DB too
        if original is None:
            return None
@@ -318,6 +356,7 @@ class SessionManager:

    def update_cwd(self, session_id: str, cwd: str) -> Optional[SessionState]:
        """Update the working directory for a session and its tool overrides."""
+        cwd = _translate_acp_cwd(cwd)
        state = self.get_session(session_id)  # checks DB too
        if state is None:
            return None
@@ -28,6 +28,11 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
    "terminal": "execute",
    "process": "execute",
    "execute_code": "execute",
+    # Session/meta tools
+    "todo": "other",
+    "skill_view": "read",
+    "skills_list": "read",
+    "skill_manage": "edit",
    # Web / fetch
    "web_search": "fetch",
    "web_extract": "fetch",
@@ -51,6 +56,28 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
 }


+_POLISHED_TOOLS = {
+    # Core operator loop
+    "todo", "memory", "session_search", "delegate_task",
+    # Files / execution
+    "read_file", "write_file", "patch", "search_files", "terminal", "process", "execute_code",
+    # Skills / web / browser / media
+    "skill_view", "skills_list", "skill_manage", "web_search", "web_extract",
+    "browser_navigate", "browser_click", "browser_type", "browser_press", "browser_scroll",
+    "browser_back", "browser_snapshot", "browser_console", "browser_get_images", "browser_vision",
+    "vision_analyze", "image_generate", "text_to_speech",
+    # Schedulers / platform integrations
+    "cronjob", "send_message", "clarify", "discord", "discord_admin",
+    "ha_list_entities", "ha_get_state", "ha_list_services", "ha_call_service",
+    "feishu_doc_read", "feishu_drive_list_comments", "feishu_drive_list_comment_replies",
+    "feishu_drive_reply_comment", "feishu_drive_add_comment",
+    "kanban_create", "kanban_show", "kanban_comment", "kanban_complete",
+    "kanban_block", "kanban_link", "kanban_heartbeat",
+    "yb_query_group_info", "yb_query_group_members", "yb_search_sticker",
+    "yb_send_dm", "yb_send_sticker", "mixture_of_agents",
+}
+
+
 def get_tool_kind(tool_name: str) -> ToolKind:
    """Return the ACP ToolKind for a hermes tool, defaulting to 'other'."""
    return TOOL_KIND_MAP.get(tool_name, "other")
@@ -85,18 +112,645 @@ def build_tool_title(tool_name: str, args: Dict[str, Any]) -> str:
        if urls:
            return f"extract: {urls[0]}" + (f" (+{len(urls)-1})" if len(urls) > 1 else "")
        return "web extract"
+    if tool_name == "process":
+        action = str(args.get("action") or "").strip() or "manage"
+        sid = str(args.get("session_id") or "").strip()
+        return f"process {action}: {sid}" if sid else f"process {action}"
    if tool_name == "delegate_task":
+        tasks = args.get("tasks")
+        if isinstance(tasks, list) and tasks:
+            return f"delegate batch ({len(tasks)} tasks)"
        goal = args.get("goal", "")
        if goal and len(goal) > 60:
            goal = goal[:57] + "..."
        return f"delegate: {goal}" if goal else "delegate task"
+    if tool_name == "session_search":
+        query = str(args.get("query") or "").strip()
+        return f"session search: {query}" if query else "recent sessions"
+    if tool_name == "memory":
+        action = str(args.get("action") or "manage").strip() or "manage"
+        target = str(args.get("target") or "memory").strip() or "memory"
+        return f"memory {action}: {target}"
    if tool_name == "execute_code":
-        return "execute code"
+        code = str(args.get("code") or "").strip()
+        first_line = next((line.strip() for line in code.splitlines() if line.strip()), "")
+        if first_line:
+            if len(first_line) > 70:
+                first_line = first_line[:67] + "..."
+            return f"python: {first_line}"
+        return "python code"
+    if tool_name == "todo":
+        items = args.get("todos")
+        if isinstance(items, list):
+            return f"todo ({len(items)} item{'s' if len(items) != 1 else ''})"
+        return "todo"
+    if tool_name == "skill_view":
+        name = str(args.get("name") or "?").strip() or "?"
+        file_path = str(args.get("file_path") or "").strip()
+        suffix = f"/{file_path}" if file_path else ""
+        return f"skill view ({name}{suffix})"
+    if tool_name == "skills_list":
+        category = str(args.get("category") or "").strip()
+        return f"skills list ({category})" if category else "skills list"
+    if tool_name == "skill_manage":
+        action = str(args.get("action") or "manage").strip() or "manage"
+        name = str(args.get("name") or "?").strip() or "?"
+        file_path = str(args.get("file_path") or "").strip()
+        target = f"{name}/{file_path}" if file_path else name
+        if len(target) > 64:
+            target = target[:61] + "..."
+        return f"skill {action}: {target}"
+    if tool_name == "browser_navigate":
+        return f"navigate: {args.get('url', '?')}"
+    if tool_name == "browser_snapshot":
+        return "browser snapshot"
+    if tool_name == "browser_vision":
+        return f"browser vision: {str(args.get('question', '?'))[:50]}"
+    if tool_name == "browser_get_images":
+        return "browser images"
    if tool_name == "vision_analyze":
-        return f"analyze image: {args.get('question', '?')[:50]}"
+        return f"analyze image: {str(args.get('question', '?'))[:50]}"
+    if tool_name == "image_generate":
+        prompt = str(args.get("prompt") or args.get("description") or "").strip()
+        return f"generate image: {prompt[:50]}" if prompt else "generate image"
+    if tool_name == "cronjob":
+        action = str(args.get("action") or "manage").strip() or "manage"
+        job_id = str(args.get("job_id") or args.get("id") or "").strip()
+        return f"cron {action}: {job_id}" if job_id else f"cron {action}"
    return tool_name


+def _text(content: str) -> Any:
+    return acp.tool_content(acp.text_block(content))
+
+
+def _json_loads_maybe(value: Optional[str]) -> Any:
+    if not isinstance(value, str):
+        return value
+    try:
+        return json.loads(value)
+    except Exception:
+        pass
+
+    # Some Hermes tools append a human hint after a JSON payload, e.g.
+    # ``{...}\n\n[Hint: Results truncated...]``. Keep the structured rendering path
+    # by decoding the first JSON value instead of falling back to raw text.
+    try:
+        decoded, _ = json.JSONDecoder().raw_decode(value.lstrip())
+        return decoded
+    except Exception:
+        return None
+
+
+def _truncate_text(text: str, limit: int = 5000) -> str:
+    if len(text) <= limit:
+        return text
+    return text[: max(0, limit - 100)] + f"\n... ({len(text)} chars total, truncated)"
+
+
+def _fenced_text(text: str, language: str = "") -> str:
+    """Return a Markdown fence that cannot be broken by backticks in text."""
+    runs = "".join(ch if ch == "`" else " " for ch in text).split()
+    fence = "`" * max(3, max(map(len, runs), default=0) + 1)
+    return f"{fence}{language}\n{text}\n{fence}"
+
+
+def _format_todo_result(result: Optional[str]) -> Optional[str]:
+    data = _json_loads_maybe(result)
+    if not isinstance(data, dict) or not isinstance(data.get("todos"), list):
+        return None
+    summary = data.get("summary") if isinstance(data.get("summary"), dict) else {}
+    icon = {
+        "completed": "✅",
+        "in_progress": "🔄",
+        "pending": "⏳",
+        "cancelled": "✗",
+    }
+    lines = ["**Todo list**", ""]
+    for item in data["todos"]:
+        if not isinstance(item, dict):
+            continue
+        status = str(item.get("status") or "pending")
+        content = str(item.get("content") or item.get("id") or "").strip()
+        if content:
+            lines.append(f"- {icon.get(status, '•')} {content}")
+    if summary:
+        cancelled = summary.get("cancelled", 0)
+        lines.extend([
+            "",
+            "**Progress:** "
+            f"{summary.get('completed', 0)} completed, "
+            f"{summary.get('in_progress', 0)} in progress, "
+            f"{summary.get('pending', 0)} pending"
+            + (f", {cancelled} cancelled" if cancelled else ""),
+        ])
+    return "\n".join(lines)
+
+
+def _format_read_file_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
+    data = _json_loads_maybe(result)
+    if not isinstance(data, dict):
+        return None
+    if data.get("error") and not data.get("content"):
+        return f"Read failed: {data.get('error')}"
+    content = data.get("content")
+    if not isinstance(content, str):
+        return None
+    path = str((args or {}).get("path") or data.get("path") or "file").strip()
+    offset = (args or {}).get("offset")
+    limit = (args or {}).get("limit")
+    range_bits = []
+    if offset:
+        range_bits.append(f"from line {offset}")
+    if limit:
+        range_bits.append(f"limit {limit}")
+    suffix = f" ({', '.join(range_bits)})" if range_bits else ""
+    header = f"Read {path}{suffix}"
+    if data.get("total_lines") is not None:
+        header += f" — {data.get('total_lines')} total lines"
+    # Hermes read_file output is line-numbered with `|`; as raw Markdown, Zed can
+    # read the pipes as a table and collapse the layout, so fence the payload.
+    # Truncate before fencing so a long file cannot cut off the closing fence.
+    return f"{header}\n\n{_fenced_text(_truncate_text(content))}"
+
+
+def _format_search_files_result(result: Optional[str]) -> Optional[str]:
+    data = _json_loads_maybe(result)
+    if not isinstance(data, dict):
+        return None
+    matches = data.get("matches")
+    if not isinstance(matches, list):
+        return None
+
+    total = data.get("total_count", len(matches))
+    shown = min(len(matches), 12)
+    truncated = bool(data.get("truncated")) or len(matches) > shown
+    lines = [
+        "Search results",
+        f"Found {total} match{'es' if total != 1 else ''}; showing {shown}.",
+        "",
+    ]
+
+    for match in matches[:shown]:
+        if not isinstance(match, dict):
+            lines.append(f"- {match}")
+            continue
+
+        path = str(match.get("path") or match.get("file") or match.get("filename") or "?")
+        line = match.get("line") or match.get("line_number")
+        content = str(match.get("content") or match.get("text") or "").strip()
+        loc = f"{path}:{line}" if line else path
+        lines.append(f"- {loc}")
+        if content:
+            snippet = _truncate_text(" ".join(content.split()), 300)
+            lines.append(f"  {snippet}")
+
+    if truncated:
+        lines.extend([
+            "",
+            "Results truncated. Narrow the search, add file_glob, or use offset to page.",
+        ])
+    return _truncate_text("\n".join(lines), limit=7000)
+
+
+def _format_execute_code_result(result: Optional[str]) -> Optional[str]:
+    data = _json_loads_maybe(result)
+    if not isinstance(data, dict):
+        return result if isinstance(result, str) and result.strip() else None
+    output = str(data.get("output") or "")
+    error = str(data.get("error") or "")
+    exit_code = data.get("exit_code")
+    parts = [f"Exit code: {exit_code}" if exit_code is not None else "Execution complete"]
+    if output:
+        parts.extend(["", "Output:", output])
+    if error:
+        parts.extend(["", "Error:", error])
+    return _truncate_text("\n".join(parts))
+
+
+def _extract_markdown_headings(content: str, limit: int = 8) -> list[str]:
+    headings: list[str] = []
+    for line in content.splitlines():
+        stripped = line.strip()
+        if stripped.startswith("#"):
+            heading = stripped.lstrip("#").strip()
+            if heading:
+                headings.append(heading)
+        if len(headings) >= limit:
+            break
+    return headings
+
+
+def _format_skill_view_result(result: Optional[str]) -> Optional[str]:
+    data = _json_loads_maybe(result)
+    if not isinstance(data, dict):
+        return None
+    if data.get("success") is False:
+        return f"Skill view failed: {data.get('error', 'unknown error')}"
+    name = str(data.get("name") or "skill")
+    file_path = str(data.get("file") or data.get("path") or "SKILL.md")
+    description = str(data.get("description") or "").strip()
+    content = str(data.get("content") or "")
+    linked = data.get("linked_files") if isinstance(data.get("linked_files"), dict) else None
+
+    lines = ["**Skill loaded**", "", f"- **Name:** `{name}`", f"- **File:** `{file_path}`"]
+    if description:
+        lines.append(f"- **Description:** {description}")
+    if content:
+        lines.append(f"- **Content:** {len(content):,} chars loaded into agent context")
+    if linked:
+        linked_count = sum(len(v) for v in linked.values() if isinstance(v, list))
+        lines.append(f"- **Linked files:** {linked_count}")
+
+    headings = _extract_markdown_headings(content)
+    if headings:
+        lines.extend(["", "**Sections**"])
+        lines.extend(f"- {heading}" for heading in headings)
+
+    lines.extend([
+        "",
+        "_Full skill content is available to the agent but hidden here to keep ACP readable._",
+    ])
+    return "\n".join(lines)
+
+
+def _format_skill_manage_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
+    data = _json_loads_maybe(result)
+    if not isinstance(data, dict):
+        return None
+
+    action = str((args or {}).get("action") or "manage").strip() or "manage"
+    name = str((args or {}).get("name") or data.get("name") or "skill").strip() or "skill"
+    file_path = str((args or {}).get("file_path") or data.get("file_path") or "SKILL.md").strip() or "SKILL.md"
+    success = data.get("success")
+    status = "✅ Skill updated" if success is not False else "✗ Skill update failed"
+
+    lines = [f"**{status}**", "", f"- **Action:** `{action}`", f"- **Skill:** `{name}`"]
+    if action not in {"delete"}:
+        lines.append(f"- **File:** `{file_path}`")
+
+    message = str(data.get("message") or data.get("error") or "").strip()
+    if message:
+        lines.append(f"- **Result:** {message}")
+
+    replacements = data.get("replacements", data.get("replacement_count"))
+    if replacements is not None:
+        lines.append(f"- **Replacements:** {replacements}")
+
+    path = str(data.get("path") or "").strip()
+    if path:
+        lines.append(f"- **Path:** `{path}`")
+
+    return "\n".join(lines)
+
+
+def _format_web_search_result(result: Optional[str]) -> Optional[str]:
+    data = _json_loads_maybe(result)
+    if not isinstance(data, dict):
+        return None
+    web = data.get("data", {}).get("web") if isinstance(data.get("data"), dict) else data.get("web")
+    if not isinstance(web, list):
+        return None
+    lines = [f"Web results: {len(web)}"]
+    for item in web[:10]:
+        if not isinstance(item, dict):
+            continue
+        title = str(item.get("title") or item.get("url") or "result").strip()
+        url = str(item.get("url") or "").strip()
+        desc = str(item.get("description") or "").strip()
+        lines.append(f"• {title}" + (f" — {url}" if url else ""))
+        if desc:
+            lines.append(f"  {desc}")
+    return _truncate_text("\n".join(lines))
+
+
+def _format_web_extract_result(result: Optional[str]) -> Optional[str]:
+    """Return only web_extract errors for ACP; success stays compact via title."""
+    data = _json_loads_maybe(result)
+    if not isinstance(data, dict):
+        return None
+    if data.get("success") is False and data.get("error"):
+        return f"Web extract failed: {data.get('error')}"
+    results = data.get("results")
+    if not isinstance(results, list):
+        return None
+
+    failures: list[str] = []
+    for item in results[:10]:
+        if not isinstance(item, dict):
+            continue
+        error = str(item.get("error") or "").strip()
+        if not error or error in {"None", "null"}:
+            continue
+        url = str(item.get("url") or "").strip()
+        title = str(item.get("title") or url or "Untitled").strip()
+        failures.append(
+            f"- {title}" + (f" — {url}" if url and url != title else "") + f"\n  Error: {_truncate_text(error, limit=500)}"
+        )
+
+    if not failures:
+        return None
+    lines = [f"Web extract failed for {len(failures)} URL{'s' if len(failures) != 1 else ''}"]
+    lines.extend(failures)
+    return "\n".join(lines)
+
+
+def _format_process_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
+    data = _json_loads_maybe(result)
+    if not isinstance(data, dict):
+        return result if isinstance(result, str) and result.strip() else None
+    if data.get("success") is False and data.get("error"):
+        return f"Process error: {data.get('error')}"
+    action = str((args or {}).get("action") or "process").strip() or "process"
+    if isinstance(data.get("processes"), list):
+        processes = data["processes"]
+        lines = [f"Processes: {len(processes)}"]
+        for proc in processes[:20]:
+            if not isinstance(proc, dict):
+                lines.append(f"- {proc}")
+                continue
+            sid = str(proc.get("session_id") or proc.get("id") or "?")
+            status = str(proc.get("status") or ("exited" if proc.get("exited") else "running"))
+            cmd = str(proc.get("command") or "").strip()
+            pid = proc.get("pid")
+            code = proc.get("exit_code")
+            bits = [status]
+            if pid is not None:
+                bits.append(f"pid {pid}")
+            if code is not None:
+                bits.append(f"exit {code}")
+            lines.append(f"- `{sid}` — {', '.join(bits)}" + (f" — {cmd[:120]}" if cmd else ""))
+        if len(processes) > 20:
+            lines.append(f"... {len(processes) - 20} more process(es)")
+        return "\n".join(lines)
+
+    status = str(data.get("status") or data.get("state") or action).strip()
+    sid = str(data.get("session_id") or (args or {}).get("session_id") or "").strip()
+    lines = [f"Process {action}: {status}" + (f" (`{sid}`)" if sid else "")]
+    for key, label in (("command", "Command"), ("pid", "PID"), ("exit_code", "Exit code"), ("returncode", "Exit code"), ("lines", "Lines")):
+        if data.get(key) is not None:
+            lines.append(f"- **{label}:** {data.get(key)}")
+    output = data.get("output") or data.get("new_output") or data.get("log") or data.get("stdout")
+    error = data.get("error") or data.get("stderr")
+    if output:
+        lines.extend(["", "Output:", _truncate_text(str(output), limit=5000)])
+    if error:
+        lines.extend(["", "Error:", _truncate_text(str(error), limit=2000)])
+    msg = data.get("message")
+    if msg and not output and not error:
+        lines.append(str(msg))
+    return _truncate_text("\n".join(lines), limit=7000)
+
+
+def _format_delegate_result(result: Optional[str]) -> Optional[str]:
+    data = _json_loads_maybe(result)
+    if not isinstance(data, dict):
+        return None
+    if data.get("error") and not isinstance(data.get("results"), list):
+        return f"Delegation failed: {data.get('error')}"
+    results = data.get("results")
+    if not isinstance(results, list):
+        return None
+    total = data.get("total_duration_seconds")
+    lines = [f"Delegation results: {len(results)} task{'s' if len(results) != 1 else ''}" + (f" in {total}s" if total is not None else "")]
+    icon = {"completed": "✅", "failed": "✗", "error": "✗", "timeout": "⏱", "interrupted": "⚠"}
+    for item in results:
+        if not isinstance(item, dict):
+            lines.append(f"- {item}")
+            continue
+        idx = item.get("task_index")
+        status = str(item.get("status") or "unknown")
+        model = item.get("model")
+        dur = item.get("duration_seconds")
+        role = item.get("_child_role")
+        header = f"{icon.get(status, '•')} Task {idx + 1 if isinstance(idx, int) else '?'}: {status}"
+        bits = []
+        if model:
+            bits.append(str(model))
+        if role:
+            bits.append(f"role={role}")
+        if dur is not None:
+            bits.append(f"{dur}s")
+        if bits:
+            header += " (" + ", ".join(bits) + ")"
+        lines.extend(["", header])
+        summary = str(item.get("summary") or "").strip()
+        error = str(item.get("error") or "").strip()
+        if summary:
+            lines.append(_truncate_text(summary, limit=1200))
+        if error:
+            lines.append("Error: " + _truncate_text(error, limit=800))
+        trace = item.get("tool_trace")
+        if isinstance(trace, list) and trace:
+            names = [str(t.get("tool") or "?") for t in trace if isinstance(t, dict)]
+            if names:
+                lines.append("Tools: " + ", ".join(names[:12]) + (f" (+{len(names)-12})" if len(names) > 12 else ""))
+    return _truncate_text("\n".join(lines), limit=8000)
+
+
+def _format_session_search_result(result: Optional[str]) -> Optional[str]:
+    data = _json_loads_maybe(result)
+    if not isinstance(data, dict):
+        return None
+    if data.get("success") is False:
+        return f"Session search failed: {data.get('error', 'unknown error')}"
+    results = data.get("results")
+    if not isinstance(results, list):
+        return None
+    mode = data.get("mode") or "search"
+    query = data.get("query")
+    lines = ["Recent sessions" if mode == "recent" else "Session search results" + (f" for `{query}`" if query else "")]
+    if not results:
+        lines.append(str(data.get("message") or "No matching sessions found."))
+        return "\n".join(lines)
+    for item in results:
+        if not isinstance(item, dict):
+            continue
+        sid = str(item.get("session_id") or "?")
+        title = str(item.get("title") or item.get("when") or "Untitled session").strip()
+        when = str(item.get("last_active") or item.get("started_at") or item.get("when") or "").strip()
+        count = item.get("message_count")
+        source = str(item.get("source") or "").strip()
+        meta = ", ".join(str(x) for x in [when, source, f"{count} msgs" if count is not None else ""] if x)
+        lines.append(f"- **{title}** (`{sid}`)" + (f" — {meta}" if meta else ""))
+        summary = str(item.get("summary") or item.get("preview") or "").strip()
+        if summary:
+            lines.append("  " + _truncate_text(" ".join(summary.split()), limit=500))
+    return _truncate_text("\n".join(lines), limit=7000)
+
+
+def _format_memory_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
+    data = _json_loads_maybe(result)
+    if not isinstance(data, dict):
+        return None
+    action = str((args or {}).get("action") or "memory").strip() or "memory"
+    target = str(data.get("target") or (args or {}).get("target") or "memory")
+    if data.get("success") is False:
+        lines = [f"✗ Memory {action} failed ({target})", str(data.get("error") or "unknown error")]
+        matches = data.get("matches")
+        if isinstance(matches, list) and matches:
+            lines.append("Matches:")
+            lines.extend(f"- {_truncate_text(str(m), 160)}" for m in matches[:5])
+        return "\n".join(lines)
+    lines = [f"✅ Memory {action} saved ({target})"]
+    if data.get("message"):
+        lines.append(str(data.get("message")))
+    if data.get("entry_count") is not None:
+        lines.append(f"Entries: {data.get('entry_count')}")
+    if data.get("usage"):
+        lines.append(f"Usage: {data.get('usage')}")
+    # Avoid dumping all memory entries into ACP UI; show only the explicit new value preview.
+    preview = str((args or {}).get("content") or (args or {}).get("old_text") or "").strip()
+    if preview:
+        lines.append("Preview: " + _truncate_text(preview, limit=300))
+    return "\n".join(lines)
+
+
+def _format_edit_result(tool_name: str, result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
+    data = _json_loads_maybe(result)
+    path = str((args or {}).get("path") or "file").strip()
+    if isinstance(data, dict):
+        if data.get("success") is False or data.get("error"):
+            return f"{tool_name} failed for {path}: {data.get('error', 'unknown error')}"
+        message = str(data.get("message") or "").strip()
+        replacements = data.get("replacements", data.get("replacement_count"))
+        lines = [f"✅ {tool_name} completed" + (f" for `{path}`" if path else "")]
+        if message:
+            lines.append(message)
+        if replacements is not None:
+            lines.append(f"Replacements: {replacements}")
+        if data.get("files_modified"):
+            files = data.get("files_modified")
+            if isinstance(files, list):
+                lines.append("Files: " + ", ".join(f"`{f}`" for f in files[:8]))
+        return "\n".join(lines)
+    if isinstance(result, str) and result.strip():
+        return _truncate_text(result, limit=3000)
+    return f"✅ {tool_name} completed" + (f" for `{path}`" if path else "")
+
+
+def _format_browser_result(tool_name: str, result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
+    data = _json_loads_maybe(result)
+    if not isinstance(data, dict):
+        return result if isinstance(result, str) and result.strip() else None
+    if data.get("success") is False or data.get("error"):
+        return f"{tool_name} failed: {data.get('error', 'unknown error')}"
+    if tool_name == "browser_get_images":
+        images = data.get("images") or data.get("data")
+        if isinstance(images, list):
+            lines = [f"Images found: {len(images)}"]
+            for img in images[:12]:
+                if isinstance(img, dict):
+                    alt = str(img.get("alt") or "").strip()
+                    url = str(img.get("url") or img.get("src") or "").strip()
+                    lines.append(f"- {alt or 'image'}" + (f" — {url}" if url else ""))
+            return _truncate_text("\n".join(lines), limit=5000)
+    title = str(data.get("title") or data.get("url") or data.get("status") or tool_name)
+    text = str(data.get("text") or data.get("content") or data.get("snapshot") or data.get("analysis") or data.get("message") or "").strip()
+    lines = [title]
+    if data.get("url") and data.get("url") != title:
+        lines.append(str(data.get("url")))
+    if text:
+        lines.extend(["", _truncate_text(text, limit=5000)])
+    return _truncate_text("\n".join(lines), limit=7000)
+
+
+def _format_media_or_cron_result(tool_name: str, result: Optional[str]) -> Optional[str]:
+    data = _json_loads_maybe(result)
+    if not isinstance(data, dict):
+        return result if isinstance(result, str) and result.strip() else None
+    if data.get("success") is False or data.get("error"):
+        return f"{tool_name} failed: {data.get('error', 'unknown error')}"
+    lines = [f"✅ {tool_name} completed"]
+    for key in ("file_path", "path", "url", "image_url", "job_id", "id", "status", "message", "next_run"):
+        if data.get(key):
+            lines.append(f"- **{key}:** {data.get(key)}")
+    return "\n".join(lines)
+
+
+def _format_generic_structured_result(tool_name: str, result: Optional[str]) -> Optional[str]:
+    data = _json_loads_maybe(result)
+    if not isinstance(data, (dict, list)):
+        return result if isinstance(result, str) and result.strip() else None
+    if isinstance(data, list):
+        lines = [f"{tool_name}: {len(data)} item{'s' if len(data) != 1 else ''}"]
+        for item in data[:12]:
+            lines.append(f"- {_truncate_text(str(item), limit=240)}")
+        return _truncate_text("\n".join(lines), limit=5000)
+
+    if data.get("success") is False or data.get("error"):
+        return f"{tool_name} failed: {data.get('error', 'unknown error')}"
+
+    lines = [f"✅ {tool_name} completed" if data.get("success") is True else f"{tool_name} result"]
+    priority_keys = (
+        "message", "status", "id", "task_id", "issue_id", "title", "name", "entity_id",
+        "state", "service", "url", "path", "file_path", "count", "total", "next_run",
+    )
+    seen = set()
+    for key in priority_keys:
+        value = data.get(key)
+        if value in (None, "", [], {}):
+            continue
+        seen.add(key)
+        lines.append(f"- **{key}:** {_truncate_text(str(value), limit=500)}")
+
+    for key, value in data.items():
+        if key in seen or key in {"success", "raw", "content", "entries"}:
+            continue
+        if value in (None, "", [], {}):
+            continue
+        if isinstance(value, (dict, list)):
+            preview = json.dumps(value, ensure_ascii=False, default=str)
+        else:
+            preview = str(value)
+        lines.append(f"- **{key}:** {_truncate_text(preview, limit=500)}")
+        if len(lines) >= 14:
+            break
+
+    content = data.get("content")
+    if isinstance(content, str) and content.strip():
+        lines.extend(["", _truncate_text(content.strip(), limit=1500)])
+    return _truncate_text("\n".join(lines), limit=7000)
+
+
+def _build_polished_completion_content(
+    tool_name: str,
+    result: Optional[str],
+    function_args: Optional[Dict[str, Any]],
+) -> Optional[List[Any]]:
+    formatter = {
+        "todo": lambda: _format_todo_result(result),
+        "read_file": lambda: _format_read_file_result(result, function_args),
+        "write_file": lambda: _format_edit_result(tool_name, result, function_args),
+        "patch": lambda: _format_edit_result(tool_name, result, function_args),
+        "search_files": lambda: _format_search_files_result(result),
+        "execute_code": lambda: _format_execute_code_result(result),
+        "process": lambda: _format_process_result(result, function_args),
+        "delegate_task": lambda: _format_delegate_result(result),
+        "session_search": lambda: _format_session_search_result(result),
+        "memory": lambda: _format_memory_result(result, function_args),
+        "skill_view": lambda: _format_skill_view_result(result),
+        "skill_manage": lambda: _format_skill_manage_result(result, function_args),
+        "web_search": lambda: _format_web_search_result(result),
+        "web_extract": lambda: _format_web_extract_result(result),
+        "browser_navigate": lambda: _format_browser_result(tool_name, result, function_args),
+        "browser_snapshot": lambda: _format_browser_result(tool_name, result, function_args),
+        "browser_vision": lambda: _format_browser_result(tool_name, result, function_args),
+        "browser_get_images": lambda: _format_browser_result(tool_name, result, function_args),
+        "vision_analyze": lambda: _format_media_or_cron_result(tool_name, result),
+        "image_generate": lambda: _format_media_or_cron_result(tool_name, result),
+        "cronjob": lambda: _format_media_or_cron_result(tool_name, result),
+    }.get(tool_name)
+    if formatter is None and tool_name in _POLISHED_TOOLS:
+        formatter = lambda: _format_generic_structured_result(tool_name, result)
+    if formatter is None:
+        return None
+    text = formatter()
+    if not text:
+        return None
+    return [_text(text)]
+
+
 def _build_patch_mode_content(patch_text: str) -> List[Any]:
    """Parse V4A patch mode input into ACP diff blocks when possible."""
    if not patch_text:
@@ -258,7 +912,11 @@ def _build_tool_complete_content(
        except Exception:
            pass

-    return [acp.tool_content(acp.text_block(display_result))]
+    polished_content = _build_polished_completion_content(tool_name, result, function_args)
+    if polished_content:
+        return polished_content
+
+    return [_text(display_result)]


 # ---------------------------------------------------------------------------
@@ -288,7 +946,6 @@ def build_tool_start(
            content = _build_patch_mode_content(patch_text)
        return acp.start_tool_call(
            tool_call_id, title, kind=kind, content=content, locations=locations,
-            raw_input=arguments,
        )

    if tool_name == "write_file":
@@ -297,32 +954,172 @@ def build_tool_start(
        content = [acp.tool_diff_content(path=path, new_text=file_content)]
        return acp.start_tool_call(
            tool_call_id, title, kind=kind, content=content, locations=locations,
-            raw_input=arguments,
        )

    if tool_name == "terminal":
        command = arguments.get("command", "")
-        content = [acp.tool_content(acp.text_block(f"$ {command}"))]
+        content = [_text(f"$ {command}")]
        return acp.start_tool_call(
            tool_call_id, title, kind=kind, content=content, locations=locations,
-            raw_input=arguments,
        )

    if tool_name == "read_file":
-        path = arguments.get("path", "")
-        content = [acp.tool_content(acp.text_block(f"Reading {path}"))]
+        # The title and location already identify the file. Sending a synthetic
+        # "Reading ..." content block makes Zed render an unhelpful Output
+        # section before the real file contents arrive on completion.
        return acp.start_tool_call(
-            tool_call_id, title, kind=kind, content=content, locations=locations,
-            raw_input=arguments,
+            tool_call_id, title, kind=kind, content=None, locations=locations,
        )

    if tool_name == "search_files":
        pattern = arguments.get("pattern", "")
        target = arguments.get("target", "content")
-        content = [acp.tool_content(acp.text_block(f"Searching for '{pattern}' ({target})"))]
+        search_path = arguments.get("path")
+        where = f" in {search_path}" if search_path else ""
+        content = [_text(f"Searching for '{pattern}' ({target}){where}")]
+        return acp.start_tool_call(
+            tool_call_id, title, kind=kind, content=content, locations=locations,
+        )
+
+    if tool_name == "todo":
+        items = arguments.get("todos")
+        if isinstance(items, list):
+            preview_lines = ["Updating todo list", ""]
+            for item in items[:8]:
+                if isinstance(item, dict):
+                    preview_lines.append(f"- {item.get('status', 'pending')}: {item.get('content', item.get('id', ''))}")
+            if len(items) > 8:
+                preview_lines.append(f"... {len(items) - 8} more")
+            content = [_text("\n".join(preview_lines))]
+        else:
+            content = [_text("Reading todo list")]
+        return acp.start_tool_call(
+            tool_call_id, title, kind=kind, content=content, locations=locations,
+        )
+
+    if tool_name == "skill_view":
+        name = str(arguments.get("name") or "?").strip() or "?"
+        file_path = str(arguments.get("file_path") or "SKILL.md").strip() or "SKILL.md"
+        content = [_text(f"Loading skill '{name}' ({file_path})")]
+        return acp.start_tool_call(
+            tool_call_id, title, kind=kind, content=content, locations=locations,
+        )
+
+    if tool_name == "skill_manage":
+        action = str(arguments.get("action") or "manage").strip() or "manage"
+        name = str(arguments.get("name") or "?").strip() or "?"
+        file_path = str(arguments.get("file_path") or "SKILL.md").strip() or "SKILL.md"
+        path = f"skills/{name}/{file_path}" if file_path else f"skills/{name}"
+
+        if action == "patch":
+            old = str(arguments.get("old_string") or "")
+            new = str(arguments.get("new_string") or "")
+            content = [acp.tool_diff_content(path=path, old_text=old or None, new_text=new)]
+        elif action in {"edit", "create"}:
+            content = [
+                acp.tool_diff_content(
+                    path=path,
+                    new_text=str(arguments.get("content") or ""),
+                )
+            ]
+        elif action == "write_file":
+            target = str(arguments.get("file_path") or "file")
+            content = [
+                acp.tool_diff_content(
+                    path=f"skills/{name}/{target}",
+                    new_text=str(arguments.get("file_content") or ""),
+                )
+            ]
+        elif action in {"delete", "remove_file"}:
+            target = str(arguments.get("file_path") or file_path or name)
+            content = [_text(f"Removing {target} from skill '{name}'")]
+        else:
+            content = [_text(f"Running skill_manage action '{action}' on skill '{name}' ({file_path})")]
+
+        return acp.start_tool_call(
+            tool_call_id, title, kind=kind, content=content, locations=locations,
+        )
+
+    if tool_name == "execute_code":
+        code = str(arguments.get("code") or "").strip()
+        preview = code[:1200] + (f"\n... ({len(code)} chars total, truncated)" if len(code) > 1200 else "")
+        content = [_text(f"Running Python helper script:\n\n```python\n{preview}\n```" if preview else "Running Python helper script")]
+        return acp.start_tool_call(
+            tool_call_id, title, kind=kind, content=content, locations=locations,
+        )
+
+    if tool_name == "web_search":
+        query = str(arguments.get("query") or "").strip()
+        content = [_text(f"Searching the web for: {query}" if query else "Searching the web")]
+        return acp.start_tool_call(
+            tool_call_id, title, kind=kind, content=content, locations=locations,
+        )
+
+    if tool_name == "web_extract":
+        # The title identifies the URL(s). Avoid a duplicate content block so
+        # Zed renders this like read_file: compact start, concise completion.
+        return acp.start_tool_call(
+            tool_call_id, title, kind=kind, content=None, locations=locations,
+        )
+
+    if tool_name == "process":
+        action = str(arguments.get("action") or "").strip() or "manage"
+        sid = str(arguments.get("session_id") or "").strip()
+        data_preview = str(arguments.get("data") or "").strip()
+        text = f"Process action: {action}" + (f"\nSession: {sid}" if sid else "")
+        if data_preview:
+            text += "\nInput: " + _truncate_text(data_preview, limit=500)
+        content = [_text(text)]
+        return acp.start_tool_call(
+            tool_call_id, title, kind=kind, content=content, locations=locations,
+        )
+
+    if tool_name == "delegate_task":
+        tasks = arguments.get("tasks")
+        if isinstance(tasks, list) and tasks:
+            lines = [f"Delegating {len(tasks)} tasks", ""]
+            for i, task in enumerate(tasks[:8], 1):
+                if isinstance(task, dict):
+                    goal = str(task.get("goal") or "").strip()
+                    role = str(task.get("role") or "").strip()
+                    lines.append(f"{i}. " + _truncate_text(goal, limit=160) + (f" ({role})" if role else ""))
+            if len(tasks) > 8:
+                lines.append(f"... {len(tasks) - 8} more")
+            content = [_text("\n".join(lines))]
+        else:
+            goal = str(arguments.get("goal") or "").strip()
+            content = [_text("Delegating task" + (f":\n{_truncate_text(goal, limit=800)}" if goal else ""))]
+        return acp.start_tool_call(
+            tool_call_id, title, kind=kind, content=content, locations=locations,
+        )
+
+    if tool_name == "session_search":
+        query = str(arguments.get("query") or "").strip()
+        content = [_text(f"Searching past sessions for: {query}" if query else "Loading recent sessions")]
+        return acp.start_tool_call(
+            tool_call_id, title, kind=kind, content=content, locations=locations,
+        )
+
+    if tool_name == "memory":
+        action = str(arguments.get("action") or "manage").strip() or "manage"
+        target = str(arguments.get("target") or "memory").strip() or "memory"
+        preview = str(arguments.get("content") or arguments.get("old_text") or "").strip()
+        text = f"Memory {action} ({target})"
+        if preview:
+            text += "\nPreview: " + _truncate_text(preview, limit=500)
+        content = [_text(text)]
+        return acp.start_tool_call(
+            tool_call_id, title, kind=kind, content=content, locations=locations,
+        )
+
+    if tool_name in _POLISHED_TOOLS:
+        try:
+            args_text = json.dumps(arguments, indent=2, default=str)
+        except (TypeError, ValueError):
+            args_text = str(arguments)
+        content = [_text(_truncate_text(args_text, limit=1200))]
        return acp.start_tool_call(
            tool_call_id, title, kind=kind, content=content, locations=locations,
-            raw_input=arguments,
        )

    # Generic fallback
@@ -334,7 +1131,7 @@ def build_tool_start(
    content = [acp.tool_content(acp.text_block(args_text))]
    return acp.start_tool_call(
        tool_call_id, title, kind=kind, content=content, locations=locations,
-        raw_input=arguments,
+        raw_input=None if tool_name in _POLISHED_TOOLS else arguments,
    )


@@ -347,18 +1144,22 @@ def build_tool_complete(
 ) -> ToolCallProgress:
    """Create a ToolCallUpdate (progress) event for a completed tool call."""
    kind = get_tool_kind(tool_name)
-    content = _build_tool_complete_content(
-        tool_name,
-        result,
-        function_args=function_args,
-        snapshot=snapshot,
-    )
+    if tool_name == "web_extract":
+        error_text = _format_web_extract_result(result)
+        content = [_text(error_text)] if error_text else None
+    else:
+        content = _build_tool_complete_content(
+            tool_name,
+            result,
+            function_args=function_args,
+            snapshot=snapshot,
+        )
    return acp.update_tool_call(
        tool_call_id,
        kind=kind,
        status="completed",
        content=content,
-        raw_output=result,
+        raw_output=None if tool_name in _POLISHED_TOOLS else result,
    )


@@ -76,6 +76,7 @@ _ADAPTIVE_THINKING_SUBSTRINGS = ("4-6", "4.6", "4-7", "4.7")
 # Models where temperature/top_p/top_k return 400 if set to non-default values.
 # This is the Opus 4.7 contract; future 4.x+ models are expected to follow it.
 _NO_SAMPLING_PARAMS_SUBSTRINGS = ("4-7", "4.7")
+_FAST_MODE_SUPPORTED_SUBSTRINGS = ("opus-4-6", "opus-4.6")

 # ── Max output token limits per Anthropic model ───────────────────────
 # Source: Anthropic docs + Cline model catalog.  Anthropic's API requires
@@ -105,6 +106,9 @@ _ANTHROPIC_OUTPUT_LIMITS = {
    "claude-3-haiku":      4_096,
    # Third-party Anthropic-compatible providers
    "minimax":            131_072,
+    # Qwen models via DashScope Anthropic-compatible endpoint
+    # DashScope enforces max_tokens ∈ [1, 65536]
+    "qwen3":               65_536,
 }

 # For any model not in the table, assume the highest current limit.
@@ -216,6 +220,17 @@ def _forbids_sampling_params(model: str) -> bool:
    return any(v in model for v in _NO_SAMPLING_PARAMS_SUBSTRINGS)


+def _supports_fast_mode(model: str) -> bool:
+    """Return True for models that support Anthropic Fast Mode (speed=fast).
+
+    Per Anthropic docs, fast mode is currently supported on Opus 4.6 only.
+    Sending ``speed: "fast"`` to any other Claude model (including Opus 4.7)
+    returns HTTP 400. This guard prevents silently 400'ing when stale config
+    or older callers leave fast mode enabled across a model upgrade.
+    """
+    return any(v in model for v in _FAST_MODE_SUPPORTED_SUBSTRINGS)
+
+
 # Beta headers for enhanced features (sent with ALL auth types).
 # As of Opus 4.7 (2026-04-16), the first two are GA on Claude 4.6+ — the
 # beta headers are still accepted (harmless no-op) but not required. Kept
@@ -1222,6 +1237,14 @@ def _normalize_tool_input_schema(schema: Any) -> Dict[str, Any]:
    ``keep_nullable_hint=False`` because the Anthropic validator does not
    recognize the OpenAPI-style ``nullable: true`` extension and strict
    schema-to-grammar converters may reject unknown keywords.
+
+    Top-level ``oneOf``/``allOf``/``anyOf`` are also stripped here: the
+    Anthropic API rejects union keywords at the schema root with a generic
+    HTTP 400. Several upstream and plugin tools ship schemas with one of
+    these keywords at the top level (commonly for Pydantic discriminated
+    unions). If we land here with those keywords still present after
+    nullable-union stripping, drop them and fall back to a plain object
+    schema so the tool still validates at the Anthropic boundary.
    """
    if not schema:
        return {"type": "object", "properties": {}}
@@ -1231,6 +1254,12 @@ def _normalize_tool_input_schema(schema: Any) -> Dict[str, Any]:
    normalized = strip_nullable_unions(schema, keep_nullable_hint=False)
    if not isinstance(normalized, dict):
        return {"type": "object", "properties": {}}
+    # Strip top-level union keywords that Anthropic's validator rejects.
+    banned = {"oneOf", "allOf", "anyOf"}
+    if banned & normalized.keys():
+        normalized = {k: v for k, v in normalized.items() if k not in banned}
+        if "type" not in normalized:
+            normalized["type"] = "object"
    if normalized.get("type") == "object" and not isinstance(normalized.get("properties"), dict):
        normalized = {**normalized, "properties": {}}
    return normalized
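The root-level union stripping added in this hunk can be tried in isolation. The following is a minimal sketch, not the adapter's code: `strip_root_unions` and `BANNED_ROOT_KEYWORDS` are hypothetical names, and the nullable-union pass (`strip_nullable_unions`) that runs first in the real function is left out.

```python
from typing import Any, Dict

# Keywords the hunk above treats as unsupported at the schema root.
BANNED_ROOT_KEYWORDS = {"oneOf", "allOf", "anyOf"}


def strip_root_unions(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper covering only the root-level stripping step."""
    if not BANNED_ROOT_KEYWORDS & schema.keys():
        return schema
    cleaned = {k: v for k, v in schema.items() if k not in BANNED_ROOT_KEYWORDS}
    # With the union gone there may be no "type" left; default to an object.
    cleaned.setdefault("type", "object")
    if cleaned["type"] == "object" and not isinstance(cleaned.get("properties"), dict):
        cleaned["properties"] = {}
    return cleaned


discriminated = {
    "oneOf": [{"type": "object", "properties": {"kind": {"const": "a"}}}],
    "description": "union at the root",
}
print(strip_root_unions(discriminated))
```

Only the root is touched: a `oneOf` nested under `properties` passes through unchanged. The trade-off is that the variant definitions are discarded, so the tool is accepted but its arguments are no longer constrained by the union.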
@@ -1241,10 +1270,24 @@ def convert_tools_to_anthropic(tools: List[Dict]) -> List[Dict]:
    if not tools:
        return []
    result = []
+    seen_names: set = set()
    for t in tools:
        fn = t.get("function", {})
+        name = fn.get("name", "")
+        # Defensive dedup: Anthropic rejects requests with duplicate tool
+        # names.  Upstream injection paths already dedup, but this guard
+        # converts a hard API failure into a warning.  See: #18478
+        if name and name in seen_names:
+            logger.warning(
+                "convert_tools_to_anthropic: duplicate tool name '%s' "
+                "— dropping second occurrence",
+                name,
+            )
+            continue
+        if name:
+            seen_names.add(name)
        result.append({
-            "name": fn.get("name", ""),
+            "name": name,
            "description": fn.get("description", ""),
            "input_schema": _normalize_tool_input_schema(
                fn.get("parameters", {"type": "object", "properties": {}})
@@ -1901,9 +1944,15 @@ def build_anthropic_kwargs(

    # ── Fast mode (Opus 4.6 only) ────────────────────────────────────
    # Adds extra_body.speed="fast" + the fast-mode beta header for ~2.5x
-    # output speed. Only for native Anthropic endpoints — third-party
-    # providers would reject the unknown beta header and speed parameter.
-    if fast_mode and not _is_third_party_anthropic_endpoint(base_url):
+    # output speed. Per Anthropic docs, fast mode is only supported on
+    # Opus 4.6 — Opus 4.7 and other models 400 on the speed parameter.
+    # Only for native Anthropic endpoints — third-party providers would
+    # reject the unknown beta header and speed parameter.
+    if (
+        fast_mode
+        and not _is_third_party_anthropic_endpoint(base_url)
+        and _supports_fast_mode(model)
+    ):
        kwargs.setdefault("extra_body", {})["speed"] = "fast"
        # Build extra_headers with ALL applicable betas (the per-request
        # extra_headers override the client-level anthropic-beta header).
@@ -259,13 +259,68 @@ _PROVIDERS_WITHOUT_VISION: frozenset = frozenset({
    "kimi-coding-cn",
 })

-# OpenRouter app attribution headers
-_OR_HEADERS = {
+# OpenRouter app attribution headers (base — always sent)
+_OR_HEADERS_BASE = {
    "HTTP-Referer": "https://hermes-agent.nousresearch.com",
    "X-OpenRouter-Title": "Hermes Agent",
    "X-OpenRouter-Categories": "productivity,cli-agent",
 }

+# Truthy values for boolean env-var parsing.
+_TRUTHY_ENV_VALUES = frozenset({"1", "true", "yes", "on"})
+
+
+def build_or_headers(or_config: dict | None = None) -> dict:
+    """Build OpenRouter headers, optionally including response-cache headers.
+
+    Precedence for response cache: env var > config.yaml > default (disabled).
+
+    Environment variables:
+        ``HERMES_OPENROUTER_CACHE`` — truthy (``1``/``true``/``yes``/``on``)
+            enables caching; ``0``/``false``/``no``/``off`` disables.
+            Overrides ``openrouter.response_cache`` in config.yaml.
+        ``HERMES_OPENROUTER_CACHE_TTL`` — integer seconds (1-86400).
+            Overrides ``openrouter.response_cache_ttl`` in config.yaml.
+
+    *or_config* is the ``openrouter`` section from config.yaml.  When *None*,
+    falls back to reading config from disk via ``load_config()``.
+    """
+    headers = dict(_OR_HEADERS_BASE)
+
+    # Resolve config from disk if not provided.
+    if or_config is None:
+        try:
+            from hermes_cli.config import load_config
+            or_config = load_config().get("openrouter", {})
+        except Exception:
+            or_config = {}
+
+    # Determine cache enabled: env var overrides config.
+    env_cache = os.environ.get("HERMES_OPENROUTER_CACHE", "").strip().lower()
+    if env_cache:
+        cache_enabled = env_cache in _TRUTHY_ENV_VALUES
+    else:
+        cache_enabled = or_config.get("response_cache", False)
+
+    if not cache_enabled:
+        return headers
+
+    headers["X-OpenRouter-Cache"] = "true"
+
+    # Determine TTL: env var overrides config.
+    env_ttl = os.environ.get("HERMES_OPENROUTER_CACHE_TTL", "").strip()
+    if env_ttl:
+        if env_ttl.isdigit():
+            ttl = int(env_ttl)
+            if 1 <= ttl <= 86400:
+                headers["X-OpenRouter-Cache-TTL"] = str(ttl)
+    else:
+        ttl = or_config.get("response_cache_ttl", 300)
+        if isinstance(ttl, (int, float)) and 1 <= ttl <= 86400:
+            headers["X-OpenRouter-Cache-TTL"] = str(int(ttl))
+
+    return headers
+
 # Vercel AI Gateway app attribution headers. HTTP-Referer maps to
 # referrerUrl and X-Title maps to appName in the gateway's analytics.
 from hermes_cli import __version__ as _HERMES_VERSION
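The precedence rules in `build_or_headers` are easier to check with the environment and config passed in explicitly. This is a dependency-free sketch under stated assumptions: `cache_headers` and `TRUTHY` are hypothetical names, the attribution headers are omitted, and a missing `response_cache` key reads as off, as in the code shown.

```python
from typing import Mapping, Optional

TRUTHY = frozenset({"1", "true", "yes", "on"})


def cache_headers(env: Mapping[str, str], or_config: Optional[dict] = None) -> dict:
    """Hypothetical rendition of the cache-header logic, without disk or os.environ."""
    or_config = or_config or {}
    headers: dict = {}

    env_cache = env.get("HERMES_OPENROUTER_CACHE", "").strip().lower()
    # Any non-empty env value wins over config, including falsy ones like "0".
    if env_cache:
        enabled = env_cache in TRUTHY
    else:
        enabled = bool(or_config.get("response_cache", False))
    if not enabled:
        return headers
    headers["X-OpenRouter-Cache"] = "true"

    env_ttl = env.get("HERMES_OPENROUTER_CACHE_TTL", "").strip()
    if env_ttl:
        # An invalid env TTL does not fall back to config; the header is omitted.
        if env_ttl.isdigit() and 1 <= int(env_ttl) <= 86400:
            headers["X-OpenRouter-Cache-TTL"] = str(int(env_ttl))
    else:
        ttl = or_config.get("response_cache_ttl", 300)
        if isinstance(ttl, (int, float)) and 1 <= ttl <= 86400:
            headers["X-OpenRouter-Cache-TTL"] = str(int(ttl))
    return headers
```

Two edge cases follow from the structure: `HERMES_OPENROUTER_CACHE=0` disables caching even when config enables it, and a malformed `HERMES_OPENROUTER_CACHE_TTL` suppresses the TTL header rather than deferring to `response_cache_ttl`.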
@@ -1149,23 +1204,23 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:



-def _try_openrouter() -> Tuple[Optional[OpenAI], Optional[str]]:
+def _try_openrouter(explicit_api_key: Optional[str] = None) -> Tuple[Optional[OpenAI], Optional[str]]:
    pool_present, entry = _select_pool_entry("openrouter")
    if pool_present:
-        or_key = _pool_runtime_api_key(entry)
+        or_key = explicit_api_key or _pool_runtime_api_key(entry)
        if not or_key:
            return None, None
        base_url = _pool_runtime_base_url(entry, OPENROUTER_BASE_URL) or OPENROUTER_BASE_URL
        logger.debug("Auxiliary client: OpenRouter via pool")
        return OpenAI(api_key=or_key, base_url=base_url,
-                       default_headers=_OR_HEADERS), _OPENROUTER_MODEL
+                       default_headers=build_or_headers()), _OPENROUTER_MODEL

-    or_key = os.getenv("OPENROUTER_API_KEY")
+    or_key = explicit_api_key or os.getenv("OPENROUTER_API_KEY")
    if not or_key:
        return None, None
    logger.debug("Auxiliary client: OpenRouter")
    return OpenAI(api_key=or_key, base_url=OPENROUTER_BASE_URL,
-                   default_headers=_OR_HEADERS), _OPENROUTER_MODEL
+                   default_headers=build_or_headers()), _OPENROUTER_MODEL


 def _describe_openrouter_unavailable() -> str:
@@ -1474,7 +1529,7 @@ def _build_codex_client(model: str) -> Tuple[Optional[Any], Optional[str]]:
    return CodexAuxiliaryClient(real_client, model), model


-def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
+def _try_anthropic(explicit_api_key: Optional[str] = None) -> Tuple[Optional[Any], Optional[str]]:
    try:
        from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token
    except ImportError:
@@ -1484,10 +1539,10 @@ def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
    if pool_present:
        if entry is None:
            return None, None
-        token = _pool_runtime_api_key(entry)
+        token = explicit_api_key or _pool_runtime_api_key(entry)
    else:
        entry = None
-        token = resolve_anthropic_token()
+        token = explicit_api_key or resolve_anthropic_token()
    if not token:
        return None, None

@@ -1911,7 +1966,7 @@ def _to_async_client(sync_client, model: str, is_vision: bool = False):
    }
    sync_base_url = str(sync_client.base_url)
    if base_url_host_matches(sync_base_url, "openrouter.ai"):
-        async_kwargs["default_headers"] = dict(_OR_HEADERS)
+        async_kwargs["default_headers"] = build_or_headers()
    elif base_url_host_matches(sync_base_url, "api.githubcopilot.com"):
        from hermes_cli.copilot_auth import copilot_request_headers

@@ -1977,6 +2032,12 @@ def resolve_provider_client(
        (client, resolved_model) or (None, None) if auth is unavailable.
    """
    _validate_proxy_env_urls()
+    # Preserve the original provider name before alias normalization so a
+    # user-declared ``custom_providers`` entry whose name coincidentally
+    # matches a built-in alias (e.g. user names their custom provider "kimi"
+    # which aliases to "kimi-coding") is still reachable via the named-custom
+    # branch below.
+    original_provider = (provider or "").strip().lower()
    # Normalise aliases
    provider = _normalize_aux_provider(provider)

@@ -2047,9 +2108,9 @@ def resolve_provider_client(
        return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
                else (client, final_model))

-    # ── OpenRouter ───────────────────────────────────────────────────
+    # ── OpenRouter ───────────────────────────────────────────
    if provider == "openrouter":
-        client, default = _try_openrouter()
+        client, default = _try_openrouter(explicit_api_key=explicit_api_key)
        if client is None:
            logger.warning(
                "resolve_provider_client: openrouter requested but %s",
@@ -2163,7 +2224,18 @@ def resolve_provider_client(
    # ── Named custom providers (config.yaml providers dict / custom_providers list) ───
    try:
        from hermes_cli.runtime_provider import _get_named_custom_provider
-        custom_entry = _get_named_custom_provider(provider)
+        # When the raw requested name is an alias (``kimi`` → ``kimi-coding``)
+        # and the user defined a ``custom_providers`` entry under that alias
+        # name, the custom entry is the intended target — the built-in alias
+        # rewriting would otherwise hijack the request.  Only preferred when
+        # the raw name is an alias (not a canonical provider name) so custom
+        # entries that coincidentally match a canonical provider (e.g. ``nous``)
+        # still defer to the built-in per `_get_named_custom_provider`'s guard.
+        custom_entry = None
+        if original_provider and original_provider != provider:
+            custom_entry = _get_named_custom_provider(original_provider)
+        if custom_entry is None:
+            custom_entry = _get_named_custom_provider(provider)
        if custom_entry:
            custom_base = custom_entry.get("base_url", "").strip()
            custom_key = custom_entry.get("api_key", "").strip()
@@ -2264,7 +2336,7 @@ def resolve_provider_client(

    if pconfig.auth_type == "api_key":
        if provider == "anthropic":
-            client, default_model = _try_anthropic()
+            client, default_model = _try_anthropic(explicit_api_key=explicit_api_key)
            if client is None:
                logger.warning("resolve_provider_client: anthropic requested but no Anthropic credentials found")
                return None, None
@@ -2273,6 +2345,12 @@ def resolve_provider_client(

        creds = resolve_api_key_provider_credentials(provider)
        api_key = str(creds.get("api_key", "")).strip()
+        # Honour an explicit api_key override (e.g. from a fallback_model entry
+        # or a custom_providers entry) so callers that pass an explicit
+        # credential can authenticate against endpoints where no built-in
+        # credential is registered for this provider alias.
+        if explicit_api_key:
+            api_key = explicit_api_key.strip() or api_key
        if not api_key:
            tried_sources = list(pconfig.api_key_env_vars)
            if provider == "copilot":
@@ -2284,6 +2362,11 @@ def resolve_provider_client(

        raw_base_url = str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
        base_url = _to_openai_base_url(raw_base_url)
+        # Honour an explicit base_url override from the caller — used when a
+        # fallback_model entry (or custom_providers lookup) routes through a
+        # built-in provider name but targets a user-specified endpoint.
+        if explicit_base_url:
+            base_url = _to_openai_base_url(explicit_base_url.strip().rstrip("/"))

        default_model = _API_KEY_PROVIDER_AUX_MODELS.get(provider, "")
        final_model = _normalize_resolved_model(model or default_model, provider)
@@ -2565,8 +2648,11 @@ def resolve_vision_provider_client(
        return resolved_provider, sync_client, final_model

    if resolved_base_url:
+        provider_for_base_override = (
+            requested if requested and requested not in ("", "auto") else "custom"
+        )
        client, final_model = resolve_provider_client(
-            "custom",
+            provider_for_base_override,
            model=resolved_model,
            async_mode=async_mode,
            explicit_base_url=resolved_base_url,
@@ -2574,8 +2660,8 @@ def resolve_vision_provider_client(
            api_mode=resolved_api_mode,
        )
        if client is None:
-            return "custom", None, None
-        return "custom", client, final_model
+            return provider_for_base_override, None, None
+        return provider_for_base_override, client, final_model

    if requested == "auto":
        # Vision auto-detection order:
@@ -3209,7 +3295,26 @@ def _build_call_kwargs(
            kwargs["max_tokens"] = max_tokens

    if tools:
-        kwargs["tools"] = tools
+        # Defensive dedup: providers like Google Vertex, Azure, and Bedrock
+        # reject requests with duplicate tool names (HTTP 400).  The upstream
+        # injection paths (run_agent.py) already dedup, but this guard
+        # converts a hard API failure into a warning if an upstream regression
+        # reintroduces duplicates.  See: #18478
+        _seen: set = set()
+        _deduped: list = []
+        for _t in tools:
+            _tname = (_t.get("function") or {}).get("name", "")
+            if _tname and _tname in _seen:
+                logger.warning(
+                    "_build_call_kwargs: duplicate tool name '%s' removed "
+                    "(provider=%s model=%s)",
+                    _tname, provider, model,
+                )
+                continue
+            if _tname:
+                _seen.add(_tname)
+            _deduped.append(_t)
+        kwargs["tools"] = _deduped

    # Provider-specific extra_body
    merged_extra = dict(extra_body or {})
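The same first-wins dedup appears here and in `convert_tools_to_anthropic`. A minimal standalone sketch, assuming OpenAI-style tool dicts; `dedup_tools` is a hypothetical name, not a helper that exists in the codebase:

```python
import logging
from typing import Any, Dict, List

logger = logging.getLogger(__name__)


def dedup_tools(tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first tool for each name; later duplicates are dropped with a warning."""
    seen = set()
    kept = []
    for tool in tools:
        # "function" may be missing or None, so guard before reading the name.
        name = (tool.get("function") or {}).get("name", "")
        if name and name in seen:
            logger.warning("duplicate tool name %r removed", name)
            continue
        if name:
            seen.add(name)
        kept.append(tool)
    return kept
```

Unnamed tools are never treated as duplicates of each other, which matches the `if _tname` guards in the hunk above.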
@@ -344,6 +344,7 @@ class ContextCompressor(ContextEngine):
        self._last_aux_model_failure_model = None
        self._last_compression_savings_pct = 100.0
        self._ineffective_compression_count = 0
+        self._summary_failure_cooldown_until = 0.0  # transient errors must not block a fresh session

    def update_model(
        self,
@@ -538,7 +539,7 @@ class ContextCompressor(ContextEngine):
            # Token-budget approach: walk backward accumulating tokens
            accumulated = 0
            boundary = len(result)
-            min_protect = min(protect_tail_count, len(result) - 1)
+            min_protect = min(protect_tail_count, len(result))
            for i in range(len(result) - 1, -1, -1):
                msg = result[i]
                raw_content = msg.get("content") or ""
@@ -553,7 +554,16 @@ class ContextCompressor(ContextEngine):
                    break
                accumulated += msg_tokens
                boundary = i
-            prune_boundary = max(boundary, len(result) - min_protect)
+            # Translate the budget walk into a "protected count", apply the
+            # floor in count-space (where `max` reads naturally: protect at
+            # least `min_protect` messages or whatever the budget reserved,
+            # whichever is more), then convert back to a prune boundary.
+            # Doing this in index-space with `max` would invert the direction
+            # (smaller index = MORE protected), so a generous budget would
+            # silently get truncated back down to `min_protect`.
+            budget_protect_count = len(result) - boundary
+            protected_count = max(budget_protect_count, min_protect)
+            prune_boundary = len(result) - protected_count
        else:
            prune_boundary = len(result) - protect_tail_count
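The direction problem described in the comment above shows up with small numbers. This sketch isolates just the boundary arithmetic; the function names and sample values are illustrative assumptions, and the token-budget walk itself is replaced by a given `boundary`.

```python
def prune_boundary_index_space(n: int, boundary: int, min_protect: int) -> int:
    # Removed form: max() over indices picks the LARGER index,
    # which means FEWER protected tail messages.
    return max(boundary, n - min_protect)


def prune_boundary_count_space(n: int, boundary: int, min_protect: int) -> int:
    # New form: compare protected counts, then convert back to an index.
    budget_protect_count = n - boundary
    protected_count = max(budget_protect_count, min_protect)
    return n - protected_count


n = 20           # messages in the list
min_protect = 4  # floor on protected tail messages

# Generous budget: the walk reached index 8, reserving 12 tail messages.
print(prune_boundary_index_space(n, 8, min_protect))   # 16, only 4 protected
print(prune_boundary_count_space(n, 8, min_protect))   # 8, all 12 protected

# Tight budget: the walk stopped at index 18, reserving only 2 messages.
print(prune_boundary_index_space(n, 18, min_protect))  # 18, floor not applied
print(prune_boundary_count_space(n, 18, min_protect))  # 16, floor of 4 applied
```

In index space `max` behaves as a minimum over protected counts, so it both truncated generous budgets and failed to enforce the floor on tight ones.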

@@ -569,6 +579,8 @@ class ContextCompressor(ContextEngine):
            # Skip multimodal content (list of content blocks)
            if isinstance(content, list):
                continue
+            if not isinstance(content, str):
+                continue
            if len(content) < 200:
                continue
            h = hashlib.md5(content.encode("utf-8", errors="replace")).hexdigest()[:12]
@@ -588,6 +600,8 @@ class ContextCompressor(ContextEngine):
            # Skip multimodal content (list of content blocks)
            if isinstance(content, list):
                continue
+            if not isinstance(content, str):
+                continue
            if not content or content == _PRUNED_TOOL_PLACEHOLDER:
                continue
            # Skip already-deduplicated or previously-summarized results
@@ -903,15 +917,19 @@ The user has requested that this compaction PRIORITISE preserving all informatio
                or "does not exist" in _err_str
                or "no available channel" in _err_str
            )
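+            # Despite the name, this also covers rate limits (429) and
+            # gateway errors (502/504): transient failures that justify a
+            # one-time fallback to the main model.
+            _is_timeout = (
+                _status in (408, 429, 502, 504)
+                or "timeout" in _err_str
+            )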
            if (
-                _is_model_not_found
+                (_is_model_not_found or _is_timeout)
                and self.summary_model
                and self.summary_model != self.model
                and not getattr(self, "_summary_model_fallen_back", False)
            ):
                self._summary_model_fallen_back = True
                logging.warning(
-                    "Summary model '%s' not available (%s). "
+                    "Summary model '%s' unavailable (%s). "
                    "Falling back to main model '%s' for compression.",
                    self.summary_model, e, self.model,
                )
@@ -992,8 +1010,8 @@ The user has requested that this compaction PRIORITISE preserving all informatio
    def _get_tool_call_id(tc) -> str:
        """Extract the call ID from a tool_call entry (dict or SimpleNamespace)."""
        if isinstance(tc, dict):
-            return tc.get("id", "")
-        return getattr(tc, "id", "") or ""
+            return tc.get("call_id", "") or tc.get("id", "") or ""
+        return getattr(tc, "call_id", "") or getattr(tc, "id", "") or ""

    def _sanitize_tool_pairs(self, messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """Fix orphaned tool_call / tool_result pairs after compression.
@@ -3,6 +3,7 @@
 from __future__ import annotations

 import logging
+import os
 import random
 import threading
 import time
@@ -13,7 +14,7 @@ from datetime import datetime
 from typing import Any, Dict, List, Optional, Set, Tuple

 from hermes_constants import OPENROUTER_BASE_URL
-from hermes_cli.config import get_env_value
+from hermes_cli.config import get_env_value, load_env
 import hermes_cli.auth as auth_mod
 from hermes_cli.auth import (
    CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
@@ -1380,6 +1381,16 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
 def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool, Set[str]]:
    changed = False
    active_sources: Set[str] = set()
+
+    # Prefer ~/.hermes/.env over os.environ — the user's config file is the
+    # authoritative source for Hermes credentials. Stale env vars from parent
+    # processes (Codex CLI, test scripts, etc.) should not override deliberate
+    # changes to the .env file.
+    def _get_env_prefer_dotenv(key: str) -> str:
+        env_file = load_env()
+        val = env_file.get(key) or os.environ.get(key) or ""
+        return val.strip()
+
    # Honour user suppression — `hermes auth remove <provider> <N>` for an
    # env-seeded credential marks the env:<VAR> source as suppressed so it
    # won't be re-seeded from the user's shell environment or ~/.hermes/.env.
@@ -1391,8 +1402,8 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
        def _is_source_suppressed(_p, _s):  # type: ignore[misc]
            return False
    if provider == "openrouter":
-        # Check both os.environ and ~/.hermes/.env file
-        token = (get_env_value("OPENROUTER_API_KEY") or "").strip()
+        # Prefer ~/.hermes/.env over os.environ
+        token = _get_env_prefer_dotenv("OPENROUTER_API_KEY")
        if token:
            source = "env:OPENROUTER_API_KEY"
            if _is_source_suppressed(provider, source):
@@ -1418,7 +1429,7 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool

    env_url = ""
    if pconfig.base_url_env_var:
-        env_url = (get_env_value(pconfig.base_url_env_var) or "").strip().rstrip("/")
+        env_url = _get_env_prefer_dotenv(pconfig.base_url_env_var).rstrip("/")

    env_vars = list(pconfig.api_key_env_vars)
    if provider == "anthropic":
@@ -1429,8 +1440,8 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
        ]

    for env_var in env_vars:
-        # Check both os.environ and ~/.hermes/.env file
-        token = (get_env_value(env_var) or "").strip()
+        # Prefer ~/.hermes/.env over os.environ
+        token = _get_env_prefer_dotenv(env_var)
        if not token:
            continue
        source = f"env:{env_var}"
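The lookup order introduced by `_get_env_prefer_dotenv` can be sketched in isolation. This is a minimal illustration rather than the project's code: `resolve_credential` is a hypothetical name, and plain dicts stand in for the result of `load_env()` and for `os.environ`.

```python
from typing import Mapping


def resolve_credential(
    key: str, dotenv: Mapping[str, str], environ: Mapping[str, str]
) -> str:
    """Return the .env value when present, else the process environment value."""
    # Same shape as the helper in the diff: first truthy source wins, then strip.
    val = dotenv.get(key) or environ.get(key) or ""
    return val.strip()


# Stand-in data (hypothetical keys and values).
dotenv = {"OPENROUTER_API_KEY": " sk-from-dotenv ", "BLANK_KEY": "   "}
environ = {
    "OPENROUTER_API_KEY": "sk-stale-from-parent",
    "OTHER_KEY": "sk-env-only",
    "BLANK_KEY": "sk-env-value",
}

from_file = resolve_credential("OPENROUTER_API_KEY", dotenv, environ)
from_env = resolve_credential("OTHER_KEY", dotenv, environ)
missing = resolve_credential("UNSET_KEY", dotenv, environ)
blank = resolve_credential("BLANK_KEY", dotenv, environ)
```

One consequence of this shape is worth noting: a whitespace-only entry in the file is truthy, so it shadows the environment value and resolves to an empty string (`blank` above).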
@@ -24,11 +24,12 @@ from __future__ import annotations
 import json
 import logging
 import os
+import re
 import tempfile
 import threading
 from datetime import datetime, timedelta, timezone
 from pathlib import Path
-from typing import Any, Callable, Dict, List, Optional, Set
+from typing import Any, Callable, Dict, List, NamedTuple, Optional, Set

 from hermes_constants import get_hermes_home
 from tools import skill_usage
@@ -36,6 +37,22 @@ from tools import skill_usage
 logger = logging.getLogger(__name__)


+def _strip_aux_credential(value: Any) -> Optional[str]:
+    if value is None:
+        return None
+    text = str(value).strip()
+    return text or None
+
+
+class _ReviewRuntimeBinding(NamedTuple):
+    """Provider/model for the curator review fork plus optional per-slot overrides."""
+
+    provider: str
+    model: str
+    explicit_api_key: Optional[str]
+    explicit_base_url: Optional[str]
+
+
 DEFAULT_INTERVAL_HOURS = 24 * 7  # 7 days
 DEFAULT_MIN_IDLE_HOURS = 2
 DEFAULT_STALE_AFTER_DAYS = 30
@@ -55,6 +72,7 @@ def _default_state() -> Dict[str, Any]:
        "last_run_at": None,
        "last_run_duration_seconds": None,
        "last_run_summary": None,
+        "last_report_path": None,
        "paused": False,
        "run_count": 0,
    }
@@ -183,7 +201,16 @@ def should_run_now(now: Optional[datetime] = None) -> bool:
    Gates:
      - curator.enabled == True
      - not paused
-      - last_run_at missing, OR older than interval_hours
+      - last_run_at present AND older than interval_hours
+
+    First-run behavior: when there is no ``last_run_at`` (fresh install, or
+    install that predates the curator), we DO NOT run immediately. The
+    curator is designed to run after at least ``interval_hours`` (7 days by
+    default) of skill activity, not on the first background tick after
+    ``hermes update``. On first observation we seed ``last_run_at`` to "now"
+    and defer the first real pass by one full interval. Users who want to
+    run it sooner can always invoke ``hermes curator run`` (with or without
+    ``--dry-run``) explicitly — that path bypasses this gate.

    The idle check (min_idle_hours) is applied at the call site where we know
    whether an agent is actively running — here we only enforce the static
@@ -197,7 +224,21 @@ def should_run_now(now: Optional[datetime] = None) -> bool:
    state = load_state()
    last = _parse_iso(state.get("last_run_at"))
    if last is None:
-        return True
+        # Never run before. Seed state so we wait a full interval before the
+        # first real pass, and do not auto-mutate the library the very first
+        # time a gateway ticks after an update.
+        if now is None:
+            now = datetime.now(timezone.utc)
+        try:
+            state["last_run_at"] = now.isoformat()
+            state["last_run_summary"] = (
+                "deferred first run — curator seeded, will run after one "
+                "interval; use `hermes curator run --dry-run` to preview now"
+            )
+            save_state(state)
+        except Exception as e:  # pragma: no cover — best-effort persistence
+            logger.debug("Failed to seed curator last_run_at: %s", e)
+        return False

    if now is None:
        now = datetime.now(timezone.utc)
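The first-run deferral described in the `should_run_now` docstring can be sketched as follows. This is a simplified model under stated assumptions: `should_run` is a hypothetical name, the state lives in an in-memory dict instead of going through `load_state()` / `save_state()`, the `enabled` and `paused` gates are omitted, and the `>=` comparison against the interval is a guess at the elapsed-time check, which this hunk does not show.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional


def should_run(
    state: Dict[str, Any], now: datetime, interval_hours: int = 24 * 7
) -> bool:
    """Gate a periodic job; the first observation seeds the clock instead of running."""
    last: Optional[str] = state.get("last_run_at")
    if last is None:
        # First observation: record "now" and defer by one full interval.
        state["last_run_at"] = now.isoformat()
        return False
    elapsed = now - datetime.fromisoformat(last)
    return elapsed >= timedelta(hours=interval_hours)


state: Dict[str, Any] = {"last_run_at": None}
t0 = datetime(2026, 5, 1, tzinfo=timezone.utc)

first_tick = should_run(state, t0)                          # seeds, does not run
next_day = should_run(state, t0 + timedelta(days=1))        # still inside interval
after_interval = should_run(state, t0 + timedelta(days=7))  # interval elapsed
```

Seeding on first observation means a fresh install behaves like one that just completed a run, so the first real pass waits one full interval.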
@@ -258,6 +299,33 @@ def apply_automatic_transitions(now: Optional[datetime] = None) -> Dict[str, int
 # Review prompt for the forked agent
 # ---------------------------------------------------------------------------

+CURATOR_DRY_RUN_BANNER = (
+    "═══════════════════════════════════════════════════════════════\n"
+    "DRY-RUN — REPORT ONLY. DO NOT MUTATE THE SKILL LIBRARY.\n"
+    "═══════════════════════════════════════════════════════════════\n"
+    "\n"
+    "This is a PREVIEW pass. Follow every instruction below EXCEPT:\n"
+    "\n"
+    "  • DO NOT call skill_manage with action=patch, create, delete, "
+    "write_file, or remove_file.\n"
+    "  • DO NOT call terminal to mv skill directories into .archive/.\n"
+    "  • DO NOT call terminal to mv, cp, rm, or rewrite any file under "
+    "~/.hermes/skills/.\n"
+    "  • skills_list and skill_view are FINE — read as much as you need.\n"
+    "\n"
+    "Your output IS the deliverable. Produce the exact same "
+    "human-readable summary and structured YAML block you would "
+    "produce on a live run — but describe the actions you WOULD take, "
+    "not actions you took. A downstream reviewer will read the report "
+    "and decide whether to approve a live run with "
+    "`hermes curator run` (no flag).\n"
+    "\n"
+    "If you accidentally take a mutating action, say so explicitly in "
+    "the summary so the reviewer can revert it.\n"
+    "═══════════════════════════════════════════════════════════════"
+)
+
+
 CURATOR_REVIEW_PROMPT = (
    "You are running as Hermes' background skill CURATOR. This is an "
    "UMBRELLA-BUILDING consolidation pass, not a passive audit and not a "
@@ -336,6 +404,11 @@ CURATOR_REVIEW_PROMPT = (
    "  - skill_manage action=write_file — add a references/, templates/, "
    "or scripts/ file under an existing skill (the skill must already "
    "exist)\n"
+    "  - skill_manage action=delete     — archive a skill. MUST pass "
+    "`absorbed_into=<umbrella>` when you've merged its content into another "
+    "skill, or `absorbed_into=\"\"` when you're truly pruning with no "
+    "forwarding target. This drives cron-job skill-reference migration — "
+    "guessing from your YAML summary after the fact is fragile.\n"
    "  - terminal                       — mv a sibling into the archive "
    "OR move its content into a support subfile\n\n"
    "'keep' is a legitimate decision ONLY when the skill is already a "
@@ -397,6 +470,24 @@ def _reports_root() -> Path:
    return root


+def _needle_in_path_component(needle: str, path: str) -> bool:
+    """Check if *needle* is a complete filename stem or directory name in *path*.
+
+    Unlike simple substring matching, this avoids false positives where short
+    skill names are embedded in longer filenames (e.g. "api" matching
+    "references/api-design.md").  Hyphens and underscores are normalised so
+    "open-webui-setup" matches "open_webui_setup.md".
+    """
+    norm_needle = needle.replace("-", "_")
+    for part in path.replace("\\", "/").split("/"):
+        if not part:
+            continue
+        stem = part.rsplit(".", 1)[0] if "." in part else part
+        if stem.replace("-", "_") == norm_needle:
+            return True
+    return False
+
+
 def _classify_removed_skills(
    removed: List[str],
    added: List[str],
@@ -475,15 +566,29 @@ def _classify_removed_skills(
                continue

            # Look for the removed skill's name in file_path / content / raw.
-            haystacks: List[str] = []
+            # Matching strategy differs by field type:
+            #   file_path — needle must be a complete path component
+            #     (filename stem or directory name), so "api" does NOT
+            #     falsely match "references/api-design.md".
+            #   content fields — word-boundary regex so "test" does NOT
+            #     falsely match "latest" or "testing".
+            haystacks: List[tuple[str, str]] = []
            for key in ("file_path", "file_content", "content", "new_string", "_raw"):
                v = args.get(key)
                if isinstance(v, str):
-                    haystacks.append(v)
+                    haystacks.append((key, v))
            hit = False
-            for hay in haystacks:
+            for key, hay in haystacks:
                for needle in needles:
-                    if needle and needle in hay:
+                    if not needle:
+                        continue
+                    if key == "file_path":
+                        matched = _needle_in_path_component(needle, hay)
+                    else:
+                        matched = bool(
+                            re.search(rf'\b{re.escape(needle)}\b', hay)
+                        )
+                    if matched:
                        hit = True
                        evidence = (
                            f"skill_manage action={args.get('action', '?')} "
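The two matching strategies in this hunk can be exercised side by side. The sketch below copies the logic shown in the diff into standalone functions; the names `in_path_component` and `in_content` and the sample strings are illustrative only.

```python
import re


def in_path_component(needle: str, path: str) -> bool:
    """True when needle equals a whole directory name or filename stem in path."""
    norm = needle.replace("-", "_")
    for part in path.replace("\\", "/").split("/"):
        if not part:
            continue
        # Drop the final extension, if any, to get the stem.
        stem = part.rsplit(".", 1)[0] if "." in part else part
        if stem.replace("-", "_") == norm:
            return True
    return False


def in_content(needle: str, text: str) -> bool:
    """True when needle appears in text delimited by regex word boundaries."""
    return bool(re.search(rf"\b{re.escape(needle)}\b", text))


path_false_positive = in_path_component("api", "references/api-design.md")
path_normalised = in_path_component("open-webui-setup", "references/open_webui_setup.md")
content_embedded = in_content("test", "run the latest build")
content_prefix = in_content("test", "testing in progress")
content_word = in_content("test", "run test now")
content_hyphenated = in_content("api", "see api-design notes")
```

A caveat that follows from how `\b` works: a hyphen is a non-word character, so a short name still matches inside a hyphenated compound in content fields (`content_hyphenated` above is true), even though the same name is rejected as a path component.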
@@ -586,15 +691,76 @@ def _parse_structured_summary(
    return out


+def _extract_absorbed_into_declarations(
+    tool_calls: List[Dict[str, Any]],
+) -> Dict[str, Dict[str, Any]]:
+    """Walk this run's tool calls and extract model-declared absorption targets.
+
+    The curator prompt requires every ``skill_manage(action='delete')`` call
+    to pass ``absorbed_into=<umbrella>`` when consolidating, or
+    ``absorbed_into=""`` when truly pruning. This is the single authoritative
+    signal for classification — the model's own declaration at the moment of
+    deletion, which beats both post-hoc YAML summary parsing and substring
+    heuristics on other tool calls.
+
+    Returns ``{skill_name: {"into": "<umbrella>" | "", "declared": True}}``.
+    Entries with ``into == ""`` are explicit prunings.
+    Skills without a ``skill_manage(delete)`` call, or with one that omitted
+    ``absorbed_into``, are not in the returned dict — caller falls back to
+    the existing heuristic/YAML logic for those (backward compat with older
+    curator runs and any callers that don't populate the arg).
+    """
+    out: Dict[str, Dict[str, Any]] = {}
+    for tc in tool_calls or []:
+        if not isinstance(tc, dict):
+            continue
+        if tc.get("name") != "skill_manage":
+            continue
+        raw = tc.get("arguments") or ""
+        args: Dict[str, Any] = {}
+        if isinstance(raw, dict):
+            args = raw
+        elif isinstance(raw, str):
+            try:
+                args = json.loads(raw)
+            except Exception:
+                continue
+        if not isinstance(args, dict):
+            continue
+        if args.get("action") != "delete":
+            continue
+        name = args.get("name")
+        if not isinstance(name, str) or not name.strip():
+            continue
+        # absorbed_into must be present (even empty string is meaningful);
+        # missing key means the model didn't declare intent.
+        if "absorbed_into" not in args:
+            continue
+        target = args.get("absorbed_into")
+        # None and other non-string values (numbers, lists) carry no usable
+        # declaration, so treat them like a missing key.
+        if not isinstance(target, str):
+            continue
+        out[name.strip()] = {"into": target.strip(), "declared": True}
+    return out
+
+
 def _reconcile_classification(
    removed: List[str],
    heuristic: Dict[str, List[Dict[str, Any]]],
    model_block: Dict[str, List[Dict[str, str]]],
    destinations: Set[str],
+    absorbed_declarations: Optional[Dict[str, Dict[str, Any]]] = None,
 ) -> Dict[str, List[Dict[str, Any]]]:
    """Merge heuristic (tool-call evidence) with the model's structured block.

-    Rules:
+    Rules (evaluated in order; first match wins):
+    - **Model-declared `absorbed_into` at delete time is authoritative.** Any
+      entry in ``absorbed_declarations`` beats every other signal. This is
+      the model telling us directly, at the moment of deletion, what it did.
+      ``into != ""`` and target exists → consolidated. ``into == ""`` →
+      pruned. ``into != ""`` but target doesn't exist → hallucination; fall
+      through to the usual signals.
    - Model-declared consolidation wins when its ``into`` target exists
      in ``destinations`` (survived or newly-created). This gives the
      model authority over intent + rationale.
@@ -615,6 +781,8 @@ def _reconcile_classification(
    model_cons = {e["from"]: e for e in model_block.get("consolidations", [])}
    model_pruned = {e["name"]: e for e in model_block.get("prunings", [])}

+    declared = absorbed_declarations or {}
+
    consolidated: List[Dict[str, Any]] = []
    pruned: List[Dict[str, Any]] = []

@@ -622,6 +790,36 @@ def _reconcile_classification(
        mc = model_cons.get(name)
        mp = model_pruned.get(name)
        hc = heur_cons.get(name)
+        dec = declared.get(name)
+
+        # Authoritative: model declared `absorbed_into` at the delete call.
+        if dec is not None:
+            into_claim = dec.get("into", "")
+            if into_claim and into_claim in destinations:
+                entry: Dict[str, Any] = {
+                    "name": name,
+                    "into": into_claim,
+                    "source": "absorbed_into (model-declared at delete)",
+                    "reason": (mc.get("reason") or "") if mc else "",
+                }
+                if hc and hc.get("evidence"):
+                    entry["evidence"] = hc["evidence"]
+                consolidated.append(entry)
+                continue
+            if into_claim == "":
+                # Explicit prune declaration
+                pruned.append({
+                    "name": name,
+                    "source": "absorbed_into=\"\" (model-declared prune)",
+                    "reason": (mp.get("reason") or "") if mp else "",
+                })
+                continue
+            # into_claim is non-empty but target doesn't exist: the model
+            # named a nonexistent umbrella at delete time. The tool already
+            # rejects this at the skill_manage layer, so we shouldn't see it
+            # in practice — but if it slips through (e.g. the umbrella was
+            # deleted LATER in the same run), fall through to the usual
+            # signals rather than trusting a broken reference.

        # Model says consolidated — trust it if the destination is real.
        if mc and mc.get("into") in destinations:
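The precedence given to `absorbed_into` declarations can be sketched with a reduced model. This is an illustration under assumptions, not the project's implementation: `declared_targets` and `classify` are hypothetical names, the return values are plain strings, and the "fallback" branch stands in for the YAML-summary and heuristic rules that follow in `_reconcile_classification`.

```python
import json
from typing import Any, Dict, List, Set


def declared_targets(tool_calls: List[Dict[str, Any]]) -> Dict[str, str]:
    """Map deleted skill name -> declared absorbed_into ("" means pruned)."""
    out: Dict[str, str] = {}
    for tc in tool_calls:
        if tc.get("name") != "skill_manage":
            continue
        raw = tc.get("arguments") or {}
        try:
            args = json.loads(raw) if isinstance(raw, str) else raw
        except ValueError:  # malformed JSON arguments: ignore this call
            continue
        if not isinstance(args, dict) or args.get("action") != "delete":
            continue
        name, target = args.get("name"), args.get("absorbed_into")
        # A missing or non-string absorbed_into is not a declaration.
        if isinstance(name, str) and name.strip() and isinstance(target, str):
            out[name.strip()] = target.strip()
    return out


def classify(name: str, declared: Dict[str, str], destinations: Set[str]) -> str:
    """Return "consolidated", "pruned", or "fallback" for one removed skill."""
    target = declared.get(name)
    if target is None:
        return "fallback"  # no declaration: defer to the other signals
    if target == "":
        return "pruned"  # explicit prune
    # A named umbrella only counts when it actually exists after the run.
    return "consolidated" if target in destinations else "fallback"


calls = [
    {"name": "skill_manage",
     "arguments": '{"action": "delete", "name": "pdf-merge", "absorbed_into": "pdf-tools"}'},
    {"name": "skill_manage",
     "arguments": {"action": "delete", "name": "old-notes", "absorbed_into": ""}},
    {"name": "skill_manage",
     "arguments": {"action": "delete", "name": "legacy"}},
    {"name": "skill_manage",
     "arguments": {"action": "delete", "name": "ghost", "absorbed_into": "missing-umbrella"}},
]
declared = declared_targets(calls)
results = {
    n: classify(n, declared, destinations={"pdf-tools"})
    for n in ("pdf-merge", "old-notes", "legacy", "ghost")
}
```

The empty string is kept distinct from a missing key because the two mean different things: one is an explicit prune, the other is no declaration at all.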
@@ -757,15 +955,57 @@ def _write_run_report(
    )
    model_block = _parse_structured_summary(llm_meta.get("final", "") or "")
    destinations = set(after_names) | set(added or [])
+    # Authoritative signal: extract per-delete `absorbed_into` declarations
+    # from this run's tool calls. These beat both the YAML summary block and
+    # the substring heuristic — the model is telling us directly, at the
+    # moment of deletion, whether each archived skill was consolidated
+    # (into=<umbrella>) or pruned (into="").
+    absorbed_declarations = _extract_absorbed_into_declarations(
+        llm_meta.get("tool_calls", []) or []
+    )
    classification = _reconcile_classification(
        removed=removed,
        heuristic=heuristic,
        model_block=model_block,
        destinations=destinations,
+        absorbed_declarations=absorbed_declarations,
    )
    consolidated = classification["consolidated"]
    pruned = classification["pruned"]

+    # Rewrite cron job skill references. When the curator consolidates
+    # skill X into umbrella Y, any cron job that lists X fails to load
+    # it at run time — the scheduler skips it and the job runs without
+    # the instructions it was scheduled to follow. Rewriting the
+    # references in-place keeps scheduled jobs working across
+    # consolidation passes. Best-effort: never let a cron-module issue
+    # break the curator.
+    cron_rewrites: Dict[str, Any] = {"rewrites": [], "jobs_updated": 0, "jobs_scanned": 0}
+    try:
+        consolidated_map = {
+            e["name"]: e["into"]
+            for e in consolidated
+            if isinstance(e, dict) and e.get("name") and e.get("into")
+        }
+        pruned_names = [
+            e["name"] for e in pruned
+            if isinstance(e, dict) and e.get("name")
+        ]
+        if consolidated_map or pruned_names:
+            from cron.jobs import rewrite_skill_refs as _rewrite_cron_refs
+            cron_rewrites = _rewrite_cron_refs(
+                consolidated=consolidated_map,
+                pruned=pruned_names,
+            )
+    except Exception as e:
+        logger.debug("Curator cron skill rewrite failed: %s", e, exc_info=True)
+        cron_rewrites = {
+            "rewrites": [],
+            "jobs_updated": 0,
+            "jobs_scanned": 0,
+            "error": str(e),
+        }
+
    payload = {
        "started_at": started_at.isoformat(),
        "duration_seconds": round(elapsed_seconds, 2),
@@ -781,6 +1021,7 @@ def _write_run_report(
            "consolidated_this_run": len(consolidated),
            "pruned_this_run": len(pruned),
            "state_transitions": len(transitions),
+            "cron_jobs_rewritten": int(cron_rewrites.get("jobs_updated", 0)),
            "tool_calls_total": sum(tc_counts.values()),
        },
        "tool_call_counts": tc_counts,
@@ -790,6 +1031,7 @@ def _write_run_report(
        "pruned_names": [p["name"] for p in pruned],
        "added": added,
        "state_transitions": transitions,
+        "cron_rewrites": cron_rewrites,
        "llm_final": llm_meta.get("final", ""),
        "llm_summary": llm_meta.get("summary", ""),
        "llm_error": llm_meta.get("error"),
@@ -812,6 +1054,17 @@ def _write_run_report(
    except Exception as e:
        logger.debug("Curator REPORT.md write failed: %s", e)

+    # cron_rewrites.json — only when at least one job was touched, to
+    # keep run dirs uncluttered for the common no-op case.
+    try:
+        if int(cron_rewrites.get("jobs_updated", 0)) > 0:
+            (run_dir / "cron_rewrites.json").write_text(
+                json.dumps(cron_rewrites, indent=2, ensure_ascii=False) + "\n",
+                encoding="utf-8",
+            )
+    except Exception as e:
+        logger.debug("Curator cron_rewrites.json write failed: %s", e)
+
    return run_dir


@@ -942,6 +1195,39 @@ def _render_report_markdown(p: Dict[str, Any]) -> str:
            lines.append(f"- `{t.get('name')}`: {t.get('from')} → {t.get('to')}")
        lines.append("")

+    # Cron job rewrites — show which scheduled jobs had their skill
+    # references updated so users can audit that the auto-rewrite did
+    # the right thing. Only present when at least one job changed.
+    cron_rw = p.get("cron_rewrites") or {}
+    cron_rewrites_list = cron_rw.get("rewrites") or []
+    if cron_rewrites_list:
+        lines.append(f"### Cron job skill references rewritten ({len(cron_rewrites_list)})\n")
+        lines.append(
+            "_Cron jobs that referenced a consolidated or pruned skill were "
+            "updated in-place so they keep loading the right instructions "
+            "on their next run. See `cron_rewrites.json` for the full record._\n"
+        )
+        SHOW = 25
+        for entry in cron_rewrites_list[:SHOW]:
+            job_name = entry.get("job_name") or entry.get("job_id") or "?"
+            before = entry.get("before") or []
+            after = entry.get("after") or []
+            mapped = entry.get("mapped") or {}
+            dropped = entry.get("dropped") or []
+            lines.append(
+                f"- `{job_name}`: `{', '.join(before)}` → `{', '.join(after) or '(none)'}`"
+            )
+            for old, new in mapped.items():
+                lines.append(f"    - `{old}` → `{new}` (consolidated)")
+            for name in dropped:
+                lines.append(f"    - `{name}` dropped (pruned)")
+        if len(cron_rewrites_list) > SHOW:
+            lines.append(
+                f"- … and {len(cron_rewrites_list) - SHOW} more "
+                "(see `cron_rewrites.json`)"
+            )
+        lines.append("")
+
    # Full LLM final response
    final = (p.get("llm_final") or "").strip()
    if final:
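The report section above renders `before`, `after`, `mapped`, and `dropped` for each rewritten job. The implementation of `cron.jobs.rewrite_skill_refs` is not part of this diff, so the following is only a guess at what a per-job rewrite producing those fields might look like. `rewrite_job_skills` is a hypothetical helper, and the de-duplication of umbrella names is an assumption.

```python
from typing import Any, Dict, List


def rewrite_job_skills(
    skills: List[str], consolidated: Dict[str, str], pruned: List[str]
) -> Dict[str, Any]:
    """Hypothetical per-job rewrite producing the fields the report renders."""
    after: List[str] = []
    mapped: Dict[str, str] = {}
    dropped: List[str] = []
    for name in skills:
        if name in consolidated:
            # Consolidated skill: point the job at the umbrella instead.
            mapped[name] = consolidated[name]
            name = consolidated[name]
        elif name in pruned:
            # Pruned skill: nothing to forward to, so drop the reference.
            dropped.append(name)
            continue
        if name not in after:  # assumption: list each umbrella only once
            after.append(name)
    return {"before": list(skills), "after": after, "mapped": mapped, "dropped": dropped}


entry = rewrite_job_skills(
    ["pdf-merge", "pdf-split", "old-notes", "daily-brief"],
    consolidated={"pdf-merge": "pdf-tools", "pdf-split": "pdf-tools"},
    pruned=["old-notes"],
)
```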
@@ -992,6 +1278,7 @@ def _render_candidate_list() -> str:
 def run_curator_review(
    on_summary: Optional[Callable[[str], None]] = None,
    synchronous: bool = False,
+    dry_run: bool = False,
 ) -> Dict[str, Any]:
    """Execute a single curator review pass.

@@ -1004,9 +1291,43 @@ def run_curator_review(

    If *synchronous* is True, the LLM review runs in the calling thread; the
    default is to spawn a daemon thread so the caller returns immediately.
+
+    If *dry_run* is True, the automatic stale/archive transitions are SKIPPED
+    and the LLM review pass is instructed to produce a report only — no
+    skill_manage mutations, no terminal archive moves. The REPORT.md still
+    gets written and ``state.last_report_path`` still records it so users
+    can read what the curator WOULD have done.
    """
    start = datetime.now(timezone.utc)
-    counts = apply_automatic_transitions(now=start)
+    if dry_run:
+        # Count candidates without mutating state.
+        try:
+            report = skill_usage.agent_created_report()
+            counts = {
+                "checked": len(report),
+                "marked_stale": 0,
+                "archived": 0,
+                "reactivated": 0,
+            }
+        except Exception:
+            counts = {"checked": 0, "marked_stale": 0, "archived": 0, "reactivated": 0}
+    else:
+        # Pre-mutation snapshot: best-effort, never blocks the run. A failed
+        # snapshot is logged at debug level and the run continues; aborting
+        # instead would let a transient disk issue silently disable the
+        # curator. Users who require snapshots can disable the curator
+        # entirely until the underlying disk problem is fixed.
+        try:
+            from agent import curator_backup
+            snap = curator_backup.snapshot_skills(reason="pre-curator-run")
+            if snap is not None and on_summary:
+                try:
+                    on_summary(f"curator: snapshot created ({snap.name})")
+                except Exception:
+                    pass
+        except Exception as e:
+            logger.debug("Curator pre-run snapshot failed: %s", e, exc_info=True)
+        counts = apply_automatic_transitions(now=start)

    auto_summary_parts = []
    if counts["marked_stale"]:
@@ -1018,11 +1339,16 @@ def run_curator_review(
    auto_summary = ", ".join(auto_summary_parts) if auto_summary_parts else "no changes"

    # Persist state before the LLM pass so a crash mid-review still records
-    # the run and doesn't immediately re-trigger.
+    # the run and doesn't immediately re-trigger. In dry-run we do NOT bump
+    # last_run_at or run_count — a preview shouldn't push the next scheduled
+    # real pass out. We still record a summary so `hermes curator status`
+    # shows that a preview ran.
    state = load_state()
-    state["last_run_at"] = start.isoformat()
-    state["run_count"] = int(state.get("run_count", 0)) + 1
-    state["last_run_summary"] = f"auto: {auto_summary}"
+    if not dry_run:
+        state["last_run_at"] = start.isoformat()
+        state["run_count"] = int(state.get("run_count", 0)) + 1
+    prefix = "dry-run auto: " if dry_run else "auto: "
+    state["last_run_summary"] = f"{prefix}{auto_summary}"
    save_state(state)

    def _llm_pass():
@@ -1038,7 +1364,7 @@ def run_curator_review(
        try:
            candidate_list = _render_candidate_list()
            if "No agent-created skills" in candidate_list:
-                final_summary = f"auto: {auto_summary}; llm: skipped (no candidates)"
+                final_summary = f"{prefix}{auto_summary}; llm: skipped (no candidates)"
                llm_meta = {
                    "final": "",
                    "summary": "skipped (no candidates)",
@@ -1048,14 +1374,21 @@ def run_curator_review(
                    "error": None,
                }
            else:
-                prompt = f"{CURATOR_REVIEW_PROMPT}\n\n{candidate_list}"
+                if dry_run:
+                    prompt = (
+                        f"{CURATOR_DRY_RUN_BANNER}\n\n"
+                        f"{CURATOR_REVIEW_PROMPT}\n\n"
+                        f"{candidate_list}"
+                    )
+                else:
+                    prompt = f"{CURATOR_REVIEW_PROMPT}\n\n{candidate_list}"
                llm_meta = _run_llm_review(prompt)
                final_summary = (
-                    f"auto: {auto_summary}; llm: {llm_meta.get('summary', 'no change')}"
+                    f"{prefix}{auto_summary}; llm: {llm_meta.get('summary', 'no change')}"
                )
        except Exception as e:
            logger.debug("Curator LLM pass failed: %s", e, exc_info=True)
-            final_summary = f"auto: {auto_summary}; llm: error ({e})"
+            final_summary = f"{prefix}{auto_summary}; llm: error ({e})"
            llm_meta = {
                "final": "",
                "summary": f"error ({e})",
@@ -1114,6 +1447,52 @@ def run_curator_review(
    }


+def _resolve_review_runtime(cfg: Dict[str, Any]) -> _ReviewRuntimeBinding:
+    """Resolve provider/model and per-slot credentials for the curator review fork.
+
+    Same precedence as `_resolve_review_model()`. Non-empty ``api_key`` /
+    ``base_url`` from the active slot are returned as explicit overrides so
+    ``resolve_runtime_provider`` does not silently reuse the main chat
+    credential chain for a routed auxiliary model.
+    """
+    _main = cfg.get("model", {}) if isinstance(cfg.get("model"), dict) else {}
+    _main_provider = _main.get("provider") or "auto"
+    _main_model = _main.get("default") or _main.get("model") or ""
+
+    # 1. Canonical aux task slot
+    _aux = cfg.get("auxiliary", {}) if isinstance(cfg.get("auxiliary"), dict) else {}
+    _cur_task = _aux.get("curator", {}) if isinstance(_aux.get("curator"), dict) else {}
+    _task_provider = (_cur_task.get("provider") or "").strip() or None
+    _task_model = (_cur_task.get("model") or "").strip() or None
+    if _task_provider and _task_provider != "auto" and _task_model:
+        return _ReviewRuntimeBinding(
+            _task_provider,
+            _task_model,
+            _strip_aux_credential(_cur_task.get("api_key")),
+            _strip_aux_credential(_cur_task.get("base_url")),
+        )
+
+    # 2. Legacy curator.auxiliary.{provider,model} (deprecated, pre-unification)
+    _cur = cfg.get("curator", {}) if isinstance(cfg.get("curator"), dict) else {}
+    _legacy = _cur.get("auxiliary", {}) if isinstance(_cur.get("auxiliary"), dict) else {}
+    _legacy_provider = _legacy.get("provider") or None
+    _legacy_model = _legacy.get("model") or None
+    if _legacy_provider and _legacy_model:
+        logger.info(
+            "curator: using deprecated curator.auxiliary.{provider,model} "
+            "config — please migrate to auxiliary.curator.{provider,model}"
+        )
+        return _ReviewRuntimeBinding(
+            str(_legacy_provider),
+            str(_legacy_model),
+            _strip_aux_credential(_legacy.get("api_key")),
+            _strip_aux_credential(_legacy.get("base_url")),
+        )
+
+    # 3. Fall through to the main chat model
+    return _ReviewRuntimeBinding(_main_provider, _main_model, None, None)
+
+
 def _resolve_review_model(cfg: Dict[str, Any]) -> tuple[str, str]:
    """Pick (provider, model) for the curator review fork.

@@ -1129,32 +1508,8 @@ def _resolve_review_model(cfg: Dict[str, Any]) -> tuple[str, str]:
      2. Legacy ``curator.auxiliary.{provider,model}`` when both are set
      3. Main ``model.{provider,default/model}`` pair
    """
-    _main = cfg.get("model", {}) if isinstance(cfg.get("model"), dict) else {}
-    _main_provider = _main.get("provider") or "auto"
-    _main_model = _main.get("default") or _main.get("model") or ""
-
-    # 1. Canonical aux task slot
-    _aux = cfg.get("auxiliary", {}) if isinstance(cfg.get("auxiliary"), dict) else {}
-    _cur_task = _aux.get("curator", {}) if isinstance(_aux.get("curator"), dict) else {}
-    _task_provider = (_cur_task.get("provider") or "").strip() or None
-    _task_model = (_cur_task.get("model") or "").strip() or None
-    if _task_provider and _task_provider != "auto" and _task_model:
-        return _task_provider, _task_model
-
-    # 2. Legacy curator.auxiliary.{provider,model} (deprecated, pre-unification)
-    _cur = cfg.get("curator", {}) if isinstance(cfg.get("curator"), dict) else {}
-    _legacy = _cur.get("auxiliary", {}) if isinstance(_cur.get("auxiliary"), dict) else {}
-    _legacy_provider = _legacy.get("provider") or None
-    _legacy_model = _legacy.get("model") or None
-    if _legacy_provider and _legacy_model:
-        logger.info(
-            "curator: using deprecated curator.auxiliary.{provider,model} "
-            "config — please migrate to auxiliary.curator.{provider,model}"
-        )
-        return _legacy_provider, _legacy_model
-
-    # 3. Fall through to the main chat model
-    return _main_provider, _main_model
+    b = _resolve_review_runtime(cfg)
+    return b.provider, b.model


 def _run_llm_review(prompt: str) -> Dict[str, Any]:
@@ -1193,10 +1548,10 @@ def _run_llm_review(prompt: str) -> Dict[str, Any]:
    # arguments hits an auto-resolution path that fails for OAuth-only
    # providers and for pool-backed credentials.
    #
-    # `_resolve_review_model()` honors `auxiliary.curator.{provider,model}`
+    # `_resolve_review_runtime()` honors `auxiliary.curator.{provider,model,...}`
    # (canonical aux-task slot, wired through `hermes model` → auxiliary
    # picker and the dashboard Models tab), with a legacy fallback to
-    # `curator.auxiliary.{provider,model}`. See docs/user-guide/features/curator.md.
+    # `curator.auxiliary.{provider,model,...}`. See docs/user-guide/features/curator.md.
    _api_key = None
    _base_url = None
    _api_mode = None
@@ -1206,9 +1561,13 @@ def _run_llm_review(prompt: str) -> Dict[str, Any]:
        from hermes_cli.config import load_config
        from hermes_cli.runtime_provider import resolve_runtime_provider
        _cfg = load_config()
-        _provider, _model_name = _resolve_review_model(_cfg)
+        _binding = _resolve_review_runtime(_cfg)
+        _provider, _model_name = _binding.provider, _binding.model
        _rp = resolve_runtime_provider(
-            requested=_provider, target_model=_model_name
+            requested=_provider,
+            target_model=_model_name,
+            explicit_api_key=_binding.explicit_api_key,
+            explicit_base_url=_binding.explicit_base_url,
        )
        _api_key = _rp.get("api_key")
        _base_url = _rp.get("base_url")
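The three-step precedence in `_resolve_review_runtime` can be tried against sample configs with a condensed sketch. The names `Binding`, `_section`, `_clean`, and `resolve` are illustrative, the deprecation log line is omitted, and the config values are made up; the lookup order and the `"auto"` check follow the diff.

```python
from typing import Any, Dict, NamedTuple, Optional


class Binding(NamedTuple):
    provider: str
    model: str
    api_key: Optional[str]
    base_url: Optional[str]


def _section(cfg: Dict[str, Any], *keys: str) -> Dict[str, Any]:
    """Walk nested keys, returning {} when a level is missing or not a dict."""
    node: Any = cfg
    for key in keys:
        node = node.get(key) if isinstance(node, dict) else None
    return node if isinstance(node, dict) else {}


def _clean(value: Any) -> Optional[str]:
    """Normalise blank or missing credentials to None."""
    if value is None:
        return None
    return str(value).strip() or None


def resolve(cfg: Dict[str, Any]) -> Binding:
    # 1. Canonical slot: auxiliary.curator, unless provider is unset or "auto".
    task = _section(cfg, "auxiliary", "curator")
    provider = (task.get("provider") or "").strip()
    model = (task.get("model") or "").strip()
    if provider and provider != "auto" and model:
        return Binding(provider, model, _clean(task.get("api_key")), _clean(task.get("base_url")))
    # 2. Legacy slot: curator.auxiliary, when both fields are set.
    legacy = _section(cfg, "curator", "auxiliary")
    if legacy.get("provider") and legacy.get("model"):
        return Binding(
            str(legacy["provider"]), str(legacy["model"]),
            _clean(legacy.get("api_key")), _clean(legacy.get("base_url")),
        )
    # 3. Main chat model, with no explicit credential overrides.
    main = _section(cfg, "model")
    return Binding(
        main.get("provider") or "auto",
        main.get("default") or main.get("model") or "",
        None, None,
    )


main_cfg = {"provider": "openrouter", "default": "main-model"}
routed = resolve({
    "model": main_cfg,
    "auxiliary": {"curator": {"provider": "custom", "model": "small-model",
                              "api_key": "  sk-aux  ", "base_url": ""}},
})
auto_slot = resolve({
    "model": main_cfg,
    "auxiliary": {"curator": {"provider": "auto", "model": "small-model"}},
})
```

Returning `None` for blank credentials lets the caller distinguish "no override" from an explicit value when it forwards `explicit_api_key` and `explicit_base_url`.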
@@ -0,0 +1,693 @@
+"""Curator snapshot + rollback.
+
+A pre-run snapshot of ``~/.hermes/skills/`` (excluding ``.curator_backups/``
+itself) is taken before any mutating curator pass. Snapshots are tar.gz
+files under ``~/.hermes/skills/.curator_backups/<utc-iso>/`` with a
+companion ``manifest.json`` describing the snapshot (reason, time, size,
+counted skill files). Rollback picks a snapshot, moves the current
+``skills/`` tree aside into another snapshot so even the rollback itself
+is undoable, then extracts the chosen snapshot into place.
+
+The snapshot does NOT include:
+  - ``.curator_backups/`` (would recurse)
+  - ``.hub/`` (hub-installed skills — managed by the hub, not us)
+
+It DOES include:
+  - all SKILL.md files + their directories (``scripts/``, ``references/``,
+    ``templates/``, ``assets/``)
+  - ``.usage.json`` (usage telemetry — needed to rehydrate state cleanly)
+  - ``.archive/`` (so rollback restores previously-archived skills too)
+  - ``.curator_state`` (so rolling back also restores the last-run-at
+    pointer — otherwise the curator would immediately re-fire on the next
+    tick)
+  - ``.bundled_manifest`` (so protection markers stay consistent)
+
+Alongside the skills tarball, each snapshot also captures a copy of
+``~/.hermes/cron/jobs.json`` as ``cron-jobs.json`` when it exists. Cron
+jobs reference skills by name in their ``skills``/``skill`` fields; the
+curator's consolidation pass rewrites those in place via
+``cron.jobs.rewrite_skill_refs()``. Without capturing the pre-run state,
+rolling back the skills tree would leave cron jobs pointing at the
+umbrella skills even though the narrow skills they were originally
+configured with have been restored. We store the whole jobs.json for
+fidelity but rollback only touches the ``skills``/``skill`` fields — the
+rest (schedule, next_run_at, enabled, prompt, etc.) is live state and
+we leave it alone.
+"""
+
+from __future__ import annotations
+
+import json
+import logging
+import os
+import re
+import shutil
+import tarfile
+import tempfile
+import time
+from datetime import datetime, timezone
+from pathlib import Path
+from typing import Any, Dict, List, Optional, Tuple
+
+from hermes_constants import get_hermes_home
+
+logger = logging.getLogger(__name__)
+
+
+DEFAULT_KEEP = 5
+
+# Entries under skills/ that should NEVER be rolled up into a snapshot.
+# .hub/ is managed by the skills hub; rolling it back would break lockfile
+# invariants. .curator_backups is the backup dir itself — recursion bomb.
+_EXCLUDE_TOP_LEVEL = {".curator_backups", ".hub"}
+
+# Snapshot id regex: UTC ISO with colons replaced by dashes so the filename
+# is portable (Windows-safe). An optional ``-NN`` suffix handles two
+# snapshots landing in the same wallclock second.
+_ID_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}Z(-\d{2})?$")
+
+
+def _backups_dir() -> Path:
+    return get_hermes_home() / "skills" / ".curator_backups"
+
+
+def _skills_dir() -> Path:
+    return get_hermes_home() / "skills"
+
+
+def _cron_jobs_file() -> Path:
+    """Source path for the live cron jobs store (``~/.hermes/cron/jobs.json``)."""
+    return get_hermes_home() / "cron" / "jobs.json"
+
+
+CRON_JOBS_FILENAME = "cron-jobs.json"
+
+
+def _backup_cron_jobs_into(dest: Path) -> Dict[str, Any]:
+    """Copy the live cron jobs.json into ``dest`` as ``cron-jobs.json``.
+
+    Returns a small dict describing what was captured so the caller can
+    fold it into the manifest. Never raises — if the cron file is missing
+    or unreadable, the return dict has ``backed_up=False`` and the reason,
+    and the snapshot proceeds without cron data (the snapshot is still
+    useful for rolling back skills).
+    """
+    src = _cron_jobs_file()
+    info: Dict[str, Any] = {"backed_up": False, "jobs_count": 0}
+    if not src.exists():
+        info["reason"] = "no cron/jobs.json present"
+        return info
+    try:
+        raw = src.read_text(encoding="utf-8")
+    except OSError as e:
+        logger.debug("Failed to read cron/jobs.json for backup: %s", e)
+        info["reason"] = f"read error: {e}"
+        return info
+    # Count jobs as a nice diagnostic — but don't fail the snapshot if the
+    # file is unparseable; just store the raw text and let rollback deal
+    # with it (or not, if it's corrupted). jobs.json wraps the list as
+    # `{"jobs": [...], "updated_at": ...}` — we count via that shape, and
+    # fall back to bare-list shape just in case the format ever changes.
+    try:
+        parsed = json.loads(raw)
+        if isinstance(parsed, dict):
+            inner = parsed.get("jobs")
+            if isinstance(inner, list):
+                info["jobs_count"] = len(inner)
+        elif isinstance(parsed, list):
+            info["jobs_count"] = len(parsed)
+    except (json.JSONDecodeError, TypeError):
+        info["jobs_count"] = 0
+        info["parse_warning"] = "jobs.json was not valid JSON at snapshot time"
+    try:
+        (dest / CRON_JOBS_FILENAME).write_text(raw, encoding="utf-8")
+    except OSError as e:
+        logger.debug("Failed to write cron backup file: %s", e)
+        info["reason"] = f"write error: {e}"
+        return info
+    info["backed_up"] = True
+    return info
+
+
+def _utc_id(now: Optional[datetime] = None) -> str:
+    """UTC ISO-ish filesystem-safe timestamp: ``2026-05-01T13-05-42Z``."""
+    if now is None:
+        now = datetime.now(timezone.utc)
+    # isoformat → "2026-05-01T13:05:42.123456+00:00"; strip subseconds and tz.
+    s = now.replace(microsecond=0).isoformat()
+    if s.endswith("+00:00"):
+        s = s[:-6]
+    return s.replace(":", "-") + "Z"
+
+
+def _load_config() -> Dict[str, Any]:
+    try:
+        from hermes_cli.config import load_config
+        cfg = load_config()
+    except Exception as e:
+        logger.debug("Failed to load config for curator backup: %s", e)
+        return {}
+    if not isinstance(cfg, dict):
+        return {}
+    cur = cfg.get("curator") or {}
+    if not isinstance(cur, dict):
+        return {}
+    bk = cur.get("backup") or {}
+    return bk if isinstance(bk, dict) else {}
+
+
+def is_enabled() -> bool:
+    """Default ON — the whole point of the backup is safety by default."""
+    return bool(_load_config().get("enabled", True))
+
+
+def get_keep() -> int:
+    cfg = _load_config()
+    try:
+        n = int(cfg.get("keep", DEFAULT_KEEP))
+    except (TypeError, ValueError):
+        n = DEFAULT_KEEP
+    return max(1, n)
+
+
+# ---------------------------------------------------------------------------
+# Snapshot
+# ---------------------------------------------------------------------------
+
+def _count_skill_files(base: Path) -> int:
+    try:
+        return sum(1 for _ in base.rglob("SKILL.md"))
+    except OSError:
+        return 0
+
+
+def _write_manifest(dest: Path, reason: str, archive_path: Path,
+                    skills_counted: int,
+                    cron_info: Optional[Dict[str, Any]] = None) -> None:
+    manifest = {
+        "id": dest.name,
+        "reason": reason,
+        "created_at": datetime.now(timezone.utc).isoformat(),
+        "archive": archive_path.name,
+        "archive_bytes": archive_path.stat().st_size,
+        "skill_files": skills_counted,
+    }
+    if cron_info is not None:
+        manifest["cron_jobs"] = {
+            "backed_up": bool(cron_info.get("backed_up", False)),
+            "jobs_count": int(cron_info.get("jobs_count", 0)),
+        }
+        if not cron_info.get("backed_up"):
+            manifest["cron_jobs"]["reason"] = cron_info.get("reason", "not captured")
+        if cron_info.get("parse_warning"):
+            manifest["cron_jobs"]["parse_warning"] = cron_info["parse_warning"]
+    (dest / "manifest.json").write_text(
+        json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8"
+    )
+
+
+def snapshot_skills(reason: str = "manual") -> Optional[Path]:
+    """Create a tar.gz snapshot of ``~/.hermes/skills/`` and prune old ones.
+
+    Returns the snapshot directory path, or ``None`` if the snapshot was
+    skipped (backup disabled, skills dir missing, or an IO error occurred —
+    in which case we log at debug and return None so the curator never
+    aborts a pass because of a backup failure).
+    """
+    if not is_enabled():
+        logger.debug("Curator backup disabled by config; skipping snapshot")
+        return None
+
+    skills = _skills_dir()
+    if not skills.exists():
+        logger.debug("No ~/.hermes/skills/ directory — nothing to back up")
+        return None
+
+    backups = _backups_dir()
+    try:
+        backups.mkdir(parents=True, exist_ok=True)
+    except OSError as e:
+        logger.debug("Failed to create backups dir %s: %s", backups, e)
+        return None
+
+    # Uniquify: if a snapshot with the same second already exists (can
+    # happen if two curator runs fire in the same second), append a short
+    # counter. Avoids clobbering and avoids timestamp collisions.
+    base_id = _utc_id()
+    snap_id = base_id
+    counter = 1
+    while (backups / snap_id).exists():
+        snap_id = f"{base_id}-{counter:02d}"
+        counter += 1
+
+    dest = backups / snap_id
+    try:
+        dest.mkdir(parents=True, exist_ok=False)
+    except OSError as e:
+        logger.debug("Failed to create snapshot dir %s: %s", dest, e)
+        return None
+
+    archive = dest / "skills.tar.gz"
+    try:
+        # Stream into the tarball — no tempdir copy needed.
+        with tarfile.open(archive, "w:gz", compresslevel=6) as tf:
+            for entry in sorted(skills.iterdir()):
+                if entry.name in _EXCLUDE_TOP_LEVEL:
+                    continue
+                # arcname: store paths relative to skills/ so extraction
+                # drops cleanly back into the skills dir.
+                tf.add(str(entry), arcname=entry.name, recursive=True)
+        # Capture cron/jobs.json alongside the tarball. Never fails the
+        # snapshot — the skills side is the core guarantee; cron is
+        # additive. We still record in the manifest whether it was
+        # captured so rollback can surface "no cron data in this snapshot".
+        cron_info = _backup_cron_jobs_into(dest)
+        _write_manifest(dest, reason, archive,
+                        _count_skill_files(skills),
+                        cron_info=cron_info)
+    except (OSError, tarfile.TarError) as e:
+        logger.debug("Curator snapshot failed: %s", e, exc_info=True)
+        # Clean up partial snapshot
+        try:
+            shutil.rmtree(dest, ignore_errors=True)
+        except OSError:
+            pass
+        return None
+
+    _prune_old(keep=get_keep())
+    logger.info("Curator snapshot created: %s (%s)", snap_id, reason)
+    return dest
+
+
+def _prune_old(keep: int) -> List[str]:
+    """Delete regular snapshots beyond the newest *keep*. Returns deleted
+    ids. Staging dirs (``.rollback-staging-*``) are an implementation detail
+    and are pruned independently on every call."""
+    backups = _backups_dir()
+    if not backups.exists():
+        return []
+    entries: List[Tuple[str, Path]] = []
+    stale_staging: List[Path] = []
+    for child in backups.iterdir():
+        if not child.is_dir():
+            continue
+        if child.name.startswith(".rollback-staging-"):
+            # Staging dirs are only supposed to exist briefly during a
+            # rollback. If we find one here (e.g. from a crashed rollback),
+            # clean it up opportunistically.
+            stale_staging.append(child)
+            continue
+        if _ID_RE.match(child.name):
+            entries.append((child.name, child))
+    # Newest first (lexicographic works because the id is UTC ISO).
+    entries.sort(key=lambda t: t[0], reverse=True)
+    deleted: List[str] = []
+    for _, path in entries[keep:]:
+        try:
+            shutil.rmtree(path)
+            deleted.append(path.name)
+        except OSError as e:
+            logger.debug("Failed to prune %s: %s", path, e)
+    for path in stale_staging:
+        try:
+            shutil.rmtree(path)
+        except OSError as e:
+            logger.debug("Failed to clean stale staging dir %s: %s", path, e)
+    return deleted
+
+
+# ---------------------------------------------------------------------------
+# List + rollback
+# ---------------------------------------------------------------------------
+
+def _read_manifest(snap_dir: Path) -> Dict[str, Any]:
+    mf = snap_dir / "manifest.json"
+    if not mf.exists():
+        return {}
+    try:
+        return json.loads(mf.read_text(encoding="utf-8"))
+    except (OSError, json.JSONDecodeError):
+        return {}
+
+
+def list_backups() -> List[Dict[str, Any]]:
+    """Return all restorable snapshots, newest first. Only entries with a
+    real ``skills.tar.gz`` tarball are listed; transient
+    ``.rollback-staging-*`` directories created mid-rollback are an
+    implementation detail and are not shown."""
+    backups = _backups_dir()
+    if not backups.exists():
+        return []
+    out: List[Dict[str, Any]] = []
+    for child in sorted(backups.iterdir(), reverse=True):
+        if not child.is_dir():
+            continue
+        if not _ID_RE.match(child.name):
+            continue
+        if not (child / "skills.tar.gz").exists():
+            continue
+        mf = _read_manifest(child)
+        mf.setdefault("id", child.name)
+        mf.setdefault("path", str(child))
+        if "archive_bytes" not in mf:
+            arc = child / "skills.tar.gz"
+            try:
+                mf["archive_bytes"] = arc.stat().st_size
+            except OSError:
+                mf["archive_bytes"] = 0
+        out.append(mf)
+    return out
+
+
+def _resolve_backup(backup_id: Optional[str]) -> Optional[Path]:
+    """Return the path of the requested backup, or the newest one if
+    *backup_id* is None. Returns None if no match."""
+    backups = _backups_dir()
+    if not backups.exists():
+        return None
+    if backup_id:
+        target = backups / backup_id
+        if (
+            target.is_dir()
+            and _ID_RE.match(backup_id)
+            and (target / "skills.tar.gz").exists()
+        ):
+            return target
+        return None
+    candidates = [
+        c for c in sorted(backups.iterdir(), reverse=True)
+        if c.is_dir() and _ID_RE.match(c.name) and (c / "skills.tar.gz").exists()
+    ]
+    return candidates[0] if candidates else None
+
+
+def _restore_cron_skill_links(snapshot_dir: Path) -> Dict[str, Any]:
+    """Reconcile backed-up cron skill links into the live ``cron/jobs.json``.
+
+    We do NOT overwrite the whole cron file. Only the ``skills`` and
+    ``skill`` fields are restored, and only on jobs that still exist in the
+    current file (matched by ``id``). Everything else about the job —
+    schedule, next_run_at, last_run_at, enabled, prompt, workdir, hooks —
+    is live state that the user/scheduler has modified since the snapshot;
+    overwriting it would regress unrelated cron activity.
+
+    Rules:
+    - Jobs present in backup AND live, with differing skills → skills restored.
+    - Jobs present in backup AND live, with matching skills → no-op.
+    - Jobs present in backup but gone from live (user deleted the job
+      after the snapshot) → skipped, noted in the return report.
+    - Jobs present in live but not in backup (user created a new cron
+      job after the snapshot) → left untouched.
+
+    Never raises; failures are captured in the return dict. Writes through
+    ``cron.jobs`` to pick up the same lock + atomic-write path that tick()
+    uses, so we don't race the scheduler.
+    """
+    report: Dict[str, Any] = {
+        "attempted": False,
+        "restored": [],
+        "skipped_missing": [],
+        "unchanged": 0,
+        "error": None,
+    }
+    backup_file = snapshot_dir / CRON_JOBS_FILENAME
+    if not backup_file.exists():
+        report["error"] = f"snapshot has no {CRON_JOBS_FILENAME}"
+        return report
+
+    try:
+        backup_text = backup_file.read_text(encoding="utf-8")
+        backup_parsed = json.loads(backup_text)
+    except (OSError, json.JSONDecodeError) as e:
+        report["error"] = f"failed to load backed-up jobs: {e}"
+        return report
+    # jobs.json on disk is `{"jobs": [...], "updated_at": ...}`; accept both
+    # that shape and a bare list for forward compat.
+    if isinstance(backup_parsed, dict):
+        backup_jobs = backup_parsed.get("jobs")
+    elif isinstance(backup_parsed, list):
+        backup_jobs = backup_parsed
+    else:
+        backup_jobs = None
+    if not isinstance(backup_jobs, list):
+        report["error"] = "backed-up cron-jobs.json has no jobs list"
+        return report
+
+    # Build a lookup of the backed-up skill state keyed by job id.
+    # We only need the two skill-ish fields (legacy single and modern list).
+    backup_by_id: Dict[str, Dict[str, Any]] = {}
+    for job in backup_jobs:
+        if not isinstance(job, dict):
+            continue
+        jid = job.get("id")
+        if not isinstance(jid, str) or not jid:
+            continue
+        backup_by_id[jid] = {
+            "skills": job.get("skills"),
+            "skill": job.get("skill"),
+            "name": job.get("name") or jid,
+        }
+
+    if not backup_by_id:
+        report["attempted"] = True  # we tried but there was nothing to do
+        return report
+
+    # Load and rewrite the live jobs under the scheduler's lock.
+    try:
+        from cron.jobs import load_jobs, save_jobs, _jobs_file_lock
+    except ImportError as e:
+        report["error"] = f"cron module unavailable: {e}"
+        return report
+
+    report["attempted"] = True
+    try:
+        with _jobs_file_lock:
+            live_jobs = load_jobs()
+            changed = False
+
+            live_ids = set()
+            for live in live_jobs:
+                if not isinstance(live, dict):
+                    continue
+                jid = live.get("id")
+                if not isinstance(jid, str) or not jid:
+                    continue
+                live_ids.add(jid)
+
+                backup = backup_by_id.get(jid)
+                if backup is None:
+                    continue  # live job didn't exist at snapshot time
+
+                cur_skills = live.get("skills")
+                cur_skill = live.get("skill")
+                bkp_skills = backup.get("skills")
+                bkp_skill = backup.get("skill")
+
+                if cur_skills == bkp_skills and cur_skill == bkp_skill:
+                    report["unchanged"] += 1
+                    continue
+
+                # Restore. Preserve absence (don't force the key to appear
+                # if the backup didn't have it either).
+                if bkp_skills is None:
+                    live.pop("skills", None)
+                else:
+                    live["skills"] = bkp_skills
+                if bkp_skill is None:
+                    live.pop("skill", None)
+                else:
+                    live["skill"] = bkp_skill
+
+                report["restored"].append({
+                    "job_id": jid,
+                    "job_name": backup.get("name") or jid,
+                    "from": {"skills": cur_skills, "skill": cur_skill},
+                    "to": {"skills": bkp_skills, "skill": bkp_skill},
+                })
+                changed = True
+
+            # Jobs in backup but not in live = user deleted them after snapshot
+            for jid, backup in backup_by_id.items():
+                if jid not in live_ids:
+                    report["skipped_missing"].append({
+                        "job_id": jid,
+                        "job_name": backup.get("name") or jid,
+                    })
+
+            if changed:
+                save_jobs(live_jobs)
+    except Exception as e:  # noqa: BLE001 — rollback must not die mid-restore
+        logger.debug("Cron skill-link restore failed: %s", e, exc_info=True)
+        report["error"] = f"restore failed mid-flight: {e}"
+
+    return report
+
+
+
+def rollback(backup_id: Optional[str] = None) -> Tuple[bool, str, Optional[Path]]:
+    """Restore ``~/.hermes/skills/`` from a snapshot.
+
+    Strategy:
+      1. Resolve the target snapshot (explicit id or newest regular).
+      2. Take a safety snapshot of the CURRENT skills tree (a regular
+         ``.curator_backups/<utc-iso>/`` entry tagged with a pre-rollback
+         reason) so the rollback itself is undoable.
+      3. Move all current top-level entries (except ``.curator_backups``
+         and ``.hub``) into ``.curator_backups/.rollback-staging-<ts>/``.
+      4. Extract the chosen snapshot into ``~/.hermes/skills/``.
+      5. On failure during 4, move the staged contents back (best-effort)
+         and return failure.
+
+    Returns ``(ok, message, snapshot_path)``.
+    """
+    target = _resolve_backup(backup_id)
+    if target is None:
+        return (
+            False,
+            "no matching backup found"
+            + (f" for id '{backup_id}'" if backup_id else "")
+            + " (use `hermes curator rollback --list` to see available snapshots)",
+            None,
+        )
+    archive = target / "skills.tar.gz"
+    if not archive.exists():
+        return (False, f"snapshot {target.name} has no skills.tar.gz — corrupted?", None)
+
+    skills = _skills_dir()
+    skills.mkdir(parents=True, exist_ok=True)
+    backups = _backups_dir()
+    backups.mkdir(parents=True, exist_ok=True)
+
+    # Step 2: safety snapshot of the current state FIRST. snapshot_skills()
+    # returns None instead of raising when backups are disabled or IO fails,
+    # so this guard only stops the rollback on an unexpected exception.
+    try:
+        snapshot_skills(reason=f"pre-rollback to {target.name}")
+    except Exception as e:
+        return (False, f"pre-rollback safety snapshot failed: {e}", None)
+
+    # Additionally move current entries into an internal staging dir so
+    # the extract happens into an empty skills tree (predictable result).
+    # This dir is implementation detail — not listed as a restorable
+    # backup. The safety snapshot above is the user-facing undo handle.
+    staged = backups / f".rollback-staging-{_utc_id()}"
+    try:
+        staged.mkdir(parents=True, exist_ok=False)
+    except OSError as e:
+        return (False, f"failed to create staging dir: {e}", None)
+
+    moved: List[Tuple[Path, Path]] = []
+    try:
+        for entry in list(skills.iterdir()):
+            if entry.name in _EXCLUDE_TOP_LEVEL:
+                continue
+            dest = staged / entry.name
+            shutil.move(str(entry), str(dest))
+            moved.append((entry, dest))
+    except OSError as e:
+        # Best-effort rollback of the move
+        for orig, dest in moved:
+            try:
+                shutil.move(str(dest), str(orig))
+            except OSError:
+                pass
+        try:
+            shutil.rmtree(staged, ignore_errors=True)
+        except OSError:
+            pass
+        return (False, f"failed to stage current skills: {e}", None)
+
+    # Step 4: extract the snapshot into skills/
+    try:
+        with tarfile.open(archive, "r:gz") as tf:
+            # Python 3.12+ (and patched 3.8-3.11 releases) supports
+            # filter='data'. Fall back to the unfiltered call on older
+            # interpreters, but still reject absolute paths and .. components.
+            for member in tf.getmembers():
+                name = member.name
+                if name.startswith("/") or ".." in Path(name).parts:
+                    raise tarfile.TarError(
+                        f"refusing to extract unsafe path: {name!r}"
+                    )
+            try:
+                tf.extractall(str(skills), filter="data")  # type: ignore[call-arg]
+            except TypeError:
+                # Older interpreter without the filter kwarg
+                tf.extractall(str(skills))
+    except (OSError, tarfile.TarError) as e:
+        # Best-effort recover: move staged contents back
+        for orig, dest in moved:
+            try:
+                shutil.move(str(dest), str(orig))
+            except OSError:
+                pass
+        try:
+            shutil.rmtree(staged, ignore_errors=True)
+        except OSError:
+            pass
+        return (False, f"snapshot extract failed (state restored): {e}", None)
+
+    # Extract succeeded — the staging dir has served its purpose. The
+    # user's undo handle is the safety snapshot tarball we took earlier.
+    try:
+        shutil.rmtree(staged, ignore_errors=True)
+    except OSError:
+        pass
+
+    # Reconcile cron skill-links. Surgical: only the skills/skill fields
+    # on jobs matched by id. Everything else in jobs.json is live state
+    # (schedule, next_run_at, enabled, prompt, etc.) and we leave it
+    # alone. Failures here don't fail the overall rollback — the skills
+    # tree is already restored, which is the main guarantee.
+    cron_report = _restore_cron_skill_links(target)
+
+    summary_bits = [f"restored from snapshot {target.name}"]
+    if cron_report.get("attempted"):
+        restored_n = len(cron_report.get("restored") or [])
+        skipped_n = len(cron_report.get("skipped_missing") or [])
+        if cron_report.get("error"):
+            summary_bits.append(f"cron links: error — {cron_report['error']}")
+        elif restored_n == 0 and skipped_n == 0 and cron_report.get("unchanged", 0) == 0:
+            # Attempted but nothing matched — empty snapshot or no overlapping ids.
+            pass
+        else:
+            parts = []
+            if restored_n:
+                parts.append(f"{restored_n} job(s) had skill links restored")
+            if skipped_n:
+                parts.append(f"{skipped_n} backed-up job(s) no longer exist (skipped)")
+            if cron_report.get("unchanged"):
+                parts.append(f"{cron_report['unchanged']} already matched")
+            summary_bits.append("cron links: " + ", ".join(parts))
+
+    logger.info("Curator rollback: restored from %s (cron_report=%s)",
+                target.name, cron_report)
+    return (True, "; ".join(summary_bits), target)
+
+
+# ---------------------------------------------------------------------------
+# Human-readable summary for CLI
+# ---------------------------------------------------------------------------
+
+def format_size(n: int) -> str:
+    for unit in ("B", "KB", "MB", "GB"):
+        if n < 1024 or unit == "GB":
+            return f"{n:.1f} {unit}" if unit != "B" else f"{n} B"
+        n /= 1024
+    return f"{n:.1f} GB"
+
+
+def summarize_backups() -> str:
+    rows = list_backups()
+    if not rows:
+        return "No curator snapshots yet."
+    lines = [f"{'id':<24}  {'reason':<40}  {'skills':>6}  {'size':>8}"]
+    lines.append("─" * len(lines[0]))
+    for r in rows:
+        lines.append(
+            f"{r.get('id','?'):<24}  "
+            f"{(r.get('reason','?') or '?')[:40]:<40}  "
+            f"{r.get('skill_files', 0):>6}  "
+            f"{format_size(int(r.get('archive_bytes', 0))):>8}"
+        )
+    return "\n".join(lines)
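The pruning and listing code in the file above relies on snapshot ids sorting newest-first under a plain reverse string sort, including the ``-NN`` same-second suffix. A minimal sketch of that property, assuming a hypothetical helper `select_prunable` (not part of the diff) that takes directory names instead of walking the filesystem:

```python
import re

# Same pattern as _ID_RE in the diff above.
ID_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}Z(-\d{2})?$")


def select_prunable(names, keep):
    """Hypothetical helper: ids that fall outside the newest `keep`."""
    # Reverse lexicographic order is newest-first for these ids because
    # every field is zero-padded and ordered from year down to second.
    ids = sorted((n for n in names if ID_RE.match(n)), reverse=True)
    return ids[keep:]


names = [
    "2026-05-01T13-05-42Z",
    "2026-05-01T13-05-42Z-01",                 # same-second collision suffix
    "2026-05-01T13-05-43Z",
    ".rollback-staging-2026-05-01T13-05-44Z",  # not a snapshot id, ignored
    "notes",                                   # unrelated entry, ignored
]
prunable = select_prunable(names, keep=2)
print(prunable)
```

With `keep=2` only the oldest id is selected; the suffixed id sorts after its unsuffixed twin, so it counts as the newer of the two.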
@@ -520,7 +520,12 @@ def classify_api_error(

    is_disconnect = any(p in error_msg for p in _SERVER_DISCONNECT_PATTERNS)
    if is_disconnect and not status_code:
-        is_large = approx_tokens > context_length * 0.6 or approx_tokens > 120000 or num_messages > 200
+        # Absolute token/message-count thresholds are only a proxy for smaller
+        # context windows.  Large-context sessions can have hundreds of
+        # messages while still being far below their actual token budget.
+        is_large = approx_tokens > context_length * 0.6 or (
+            context_length <= 256000 and (approx_tokens > 120000 or num_messages > 200)
+        )
        if is_large:
            return _result(
                FailoverReason.context_overflow,
@@ -766,7 +771,12 @@ def _classify_400(
        if not err_body_msg:
            err_body_msg = str(body.get("message") or "").strip().lower()
    is_generic = len(err_body_msg) < 30 or err_body_msg in ("error", "")
-    is_large = approx_tokens > context_length * 0.4 or approx_tokens > 80000 or num_messages > 80
+    # Absolute token/message-count thresholds are only a proxy for smaller
+    # context windows.  Large-context sessions can have many messages while
+    # still being far below their actual token budget.
+    is_large = approx_tokens > context_length * 0.4 or (
+        context_length <= 256000 and (approx_tokens > 80000 or num_messages > 80)
+    )

    if is_generic and is_large:
        return result_fn(
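Both hunks above gate the absolute token and message-count thresholds on the context window size. A minimal sketch of the shared shape, assuming a hypothetical standalone function `looks_large` (the diff inlines the expression, and the parameter names here are illustrative):

```python
def looks_large(approx_tokens, num_messages, context_length,
                ratio, abs_tokens, abs_messages, small_window_cap=256_000):
    """Hypothetical standalone version of the is_large expression."""
    # The relative check applies to every context window.
    if approx_tokens > context_length * ratio:
        return True
    # Absolute thresholds only apply to smaller context windows.
    return context_length <= small_window_cap and (
        approx_tokens > abs_tokens or num_messages > abs_messages
    )


# Disconnect thresholds (0.6 / 120000 / 200) on a 1M-token window:
# 300 messages and 150k tokens no longer count as large.
big_window = looks_large(150_000, 300, 1_000_000,
                         ratio=0.6, abs_tokens=120_000, abs_messages=200)

# 400 thresholds (0.4 / 80000 / 80) on a 128k window:
# 90 messages still trips the message-count check.
small_window = looks_large(30_000, 90, 128_000,
                           ratio=0.4, abs_tokens=80_000, abs_messages=80)
print(big_window, small_window)
```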
@@ -679,7 +679,21 @@ def translate_stream_event(event: Dict[str, Any], model: str, tool_call_indices:
    finish_reason_raw = str(cand.get("finishReason") or "")
    if finish_reason_raw:
        mapped = "tool_calls" if tool_call_indices else _map_gemini_finish_reason(finish_reason_raw)
-        chunks.append(_make_stream_chunk(model=model, finish_reason=mapped))
+        finish_chunk = _make_stream_chunk(model=model, finish_reason=mapped)
+        # Attach usage from this event's usageMetadata so the streaming
+        # loop in run_agent.py can record token counts (mirrors the
+        # non-streaming path in translate_gemini_response).
+        usage_meta = event.get("usageMetadata") or {}
+        if usage_meta:
+            finish_chunk.usage = SimpleNamespace(
+                prompt_tokens=int(usage_meta.get("promptTokenCount") or 0),
+                completion_tokens=int(usage_meta.get("candidatesTokenCount") or 0),
+                total_tokens=int(usage_meta.get("totalTokenCount") or 0),
+                prompt_tokens_details=SimpleNamespace(
+                    cached_tokens=int(usage_meta.get("cachedContentTokenCount") or 0),
+                ),
+            )
+        chunks.append(finish_chunk)
    return chunks
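The usage mapping added in this hunk can be exercised in isolation. A minimal sketch, assuming a hypothetical helper `usage_from_metadata` (the diff builds the object inline on the finish chunk):

```python
from types import SimpleNamespace


def usage_from_metadata(usage_meta):
    """Hypothetical helper: map Gemini usageMetadata to an OpenAI-style usage object."""
    if not usage_meta:
        return None
    return SimpleNamespace(
        prompt_tokens=int(usage_meta.get("promptTokenCount") or 0),
        completion_tokens=int(usage_meta.get("candidatesTokenCount") or 0),
        total_tokens=int(usage_meta.get("totalTokenCount") or 0),
        prompt_tokens_details=SimpleNamespace(
            # Missing or null counts fall back to 0, as in the diff.
            cached_tokens=int(usage_meta.get("cachedContentTokenCount") or 0),
        ),
    )


event = {
    "usageMetadata": {
        "promptTokenCount": 120,
        "candidatesTokenCount": 30,
        "totalTokenCount": 150,
    }
}
usage = usage_from_metadata(event.get("usageMetadata") or {})
print(usage.prompt_tokens, usage.completion_tokens, usage.total_tokens)
```

The `or 0` guards matter because a key may be present with a null value, in which case `int(None)` would raise.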


@@ -489,16 +489,29 @@ def save_credentials(creds: GoogleCredentials) -> Path:
    """Atomically write creds to disk with 0o600 permissions."""
    path = _credentials_path()
    path.parent.mkdir(parents=True, exist_ok=True)
+    # Tighten parent dir to 0o700 so other users can't traverse to the creds file.
+    # On Windows this is a no-op (POSIX mode bits aren't enforced); ignore failures.
+    try:
+        os.chmod(path.parent, 0o700)
+    except OSError:
+        pass
    payload = json.dumps(creds.to_dict(), indent=2, sort_keys=True) + "\n"

    with _credentials_lock():
        tmp_path = path.with_suffix(f".tmp.{os.getpid()}.{secrets.token_hex(4)}")
        try:
-            with open(tmp_path, "w", encoding="utf-8") as fh:
+            # Create with 0o600 atomically to close the TOCTOU window where the
+            # default file mode (often 0o644 under a 0o022 umask) would briefly
+            # expose tokens to other local users between open() and chmod().
+            fd = os.open(
+                str(tmp_path),
+                os.O_WRONLY | os.O_CREAT | os.O_EXCL,
+                stat.S_IRUSR | stat.S_IWUSR,
+            )
+            with os.fdopen(fd, "w", encoding="utf-8") as fh:
                fh.write(payload)
                fh.flush()
                os.fsync(fh.fileno())
-            os.chmod(tmp_path, stat.S_IRUSR | stat.S_IWUSR)
            atomic_replace(tmp_path, path)
        finally:
            try:
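The create-with-mode pattern in this hunk can be tried on its own. A minimal sketch, assuming a hypothetical helper `write_private`; it uses `os.replace` in place of the project's `atomic_replace`, whose implementation is not shown in this diff:

```python
import os
import stat
import tempfile


def write_private(path, payload):
    """Hypothetical helper: write payload to path with owner-only permissions."""
    tmp_path = f"{path}.tmp.{os.getpid()}"
    # O_EXCL fails if the temp file already exists; the mode argument applies
    # at creation time, so the file is never readable by other users.
    fd = os.open(
        tmp_path,
        os.O_WRONLY | os.O_CREAT | os.O_EXCL,
        stat.S_IRUSR | stat.S_IWUSR,
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_path, path)
    finally:
        # Remove the temp file if the replace never happened.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "creds.json")
write_private(target, '{"token": "example"}\n')
mode = stat.S_IMODE(os.stat(target).st_mode)
print(oct(mode))
```

On POSIX systems the resulting file carries no group or other permission bits; on Windows the mode bits are largely ignored.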
@@ -20,25 +20,25 @@ def summarize_manual_compression(
        headline = f"No changes from compression: {before_count} messages"
        if after_tokens == before_tokens:
            token_line = (
-                f"Rough transcript estimate: ~{before_tokens:,} tokens (unchanged)"
+                f"Approx request size: ~{before_tokens:,} tokens (unchanged)"
            )
        else:
            token_line = (
-                f"Rough transcript estimate: ~{before_tokens:,} → "
+                f"Approx request size: ~{before_tokens:,} → "
                f"~{after_tokens:,} tokens"
            )
    else:
        headline = f"Compressed: {before_count} → {after_count} messages"
        token_line = (
-            f"Rough transcript estimate: ~{before_tokens:,} → "
+            f"Approx request size: ~{before_tokens:,} → "
            f"~{after_tokens:,} tokens"
        )

    note = None
    if not noop and after_count < before_count and after_tokens > before_tokens:
        note = (
-            "Note: fewer messages can still raise this rough transcript estimate "
-            "when compression rewrites the transcript into denser summaries."
+            "Note: fewer messages can still raise this estimate when "
+            "compression rewrites the transcript into denser summaries."
        )

    return {
@@ -81,15 +81,56 @@ def _repair_schema(node: Any, is_schema: bool = True) -> Any:
        return repaired

    # Rule 2: when anyOf is present, type belongs only on the children.
+    # Additionally, Moonshot rejects null-type branches inside anyOf
+    # (enum value (<nil>) does not match any type in [string]).
+    # Drop null branches; promote a lone survivor in place of the anyOf.
    if "anyOf" in repaired and isinstance(repaired["anyOf"], list):
        repaired.pop("type", None)
-        return repaired
+        non_null = [b for b in repaired["anyOf"]
+                    if isinstance(b, dict) and b.get("type") != "null"]
+        if non_null and len(non_null) < len(repaired["anyOf"]):
+            # Null branches were present. With a single non-null branch,
+            # drop the anyOf wrapper, promote that branch, and fall
+            # through to Rules 1/3 so nullable/enum cleanup still applies
+            # to the merged node. Otherwise keep the non-null branches.
+            if len(non_null) == 1:
+                merge = {k: v for k, v in repaired.items() if k != "anyOf"}
+                merge.update(non_null[0])
+                repaired = merge
+            else:
+                repaired["anyOf"] = non_null
+                return repaired
+        else:
+            # Nothing to collapse — parent type stripped, children already
+            # repaired by the recursive walk above.
+            return repaired
+
+    # Moonshot also rejects non-standard keywords like ``nullable`` on
+    # parameter schemas — strip it.
+    repaired.pop("nullable", None)

    # Rule 1: property schemas without type need one.  $ref nodes are exempt
    # — their type comes from the referenced definition.
-    if "$ref" in repaired:
-        return repaired
-    return _fill_missing_type(repaired)
+    # Fill missing type BEFORE Rule 3 so enum cleanup can check the type.
+    if "$ref" not in repaired:
+        repaired = _fill_missing_type(repaired)
+
+    # Rule 3: Moonshot rejects null/empty-string values inside enum arrays
+    # when the parent type is a scalar (string, integer, etc.).  The error:
+    #   "enum value (<nil>) does not match any type in [string]"
+    # Strip null and empty-string from enum values, and if the enum becomes
+    # empty, drop it entirely.
+    if "enum" in repaired and isinstance(repaired["enum"], list):
+        node_type = repaired.get("type")
+        if node_type in ("string", "integer", "number", "boolean"):
+            cleaned = [v for v in repaired["enum"]
+                       if v is not None and v != ""]
+            if cleaned:
+                repaired["enum"] = cleaned
+            else:
+                repaired.pop("enum")
+
+    return repaired


 def _fill_missing_type(node: Dict[str, Any]) -> Dict[str, Any]:
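The effect of the hunk above on a nullable enum parameter can be sketched in isolation. This is a reduced illustration, not the module's code: `collapse_nullable` and `SCALAR_TYPES` are hypothetical names, and the recursive walk and `_fill_missing_type` step are left out.

```python
from typing import Any, Dict

# Assumption: the same scalar types the Rule 3 check in the hunk lists.
SCALAR_TYPES = ("string", "integer", "number", "boolean")


def collapse_nullable(node: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: drop null anyOf branches, then clean enum values."""
    repaired = dict(node)
    branches = repaired.get("anyOf")
    if isinstance(branches, list):
        repaired.pop("type", None)
        non_null = [b for b in branches
                    if isinstance(b, dict) and b.get("type") != "null"]
        if non_null and len(non_null) < len(branches):
            if len(non_null) == 1:
                # Promote the lone surviving branch over the parent keys.
                merged = {k: v for k, v in repaired.items() if k != "anyOf"}
                merged.update(non_null[0])
                repaired = merged
            else:
                repaired["anyOf"] = non_null
                return repaired
        else:
            # No null branch to remove: only the parent type was stripped.
            return repaired

    repaired.pop("nullable", None)

    enum = repaired.get("enum")
    if isinstance(enum, list) and repaired.get("type") in SCALAR_TYPES:
        cleaned = [v for v in enum if v is not None and v != ""]
        if cleaned:
            repaired["enum"] = cleaned
        else:
            repaired.pop("enum")
    return repaired


schema = {
    "description": "Sort order",
    "anyOf": [{"type": "string", "enum": ["asc", "desc", None]}, {"type": "null"}],
}
print(collapse_nullable(schema))
# {'description': 'Sort order', 'type': 'string', 'enum': ['asc', 'desc']}
```

Because the promoted branch falls through to the enum cleanup, the `None` entry inside the surviving branch is removed in the same pass.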
@@ -182,6 +182,64 @@ SKILLS_GUIDANCE = (
    "Skills that aren't maintained become liabilities."
 )

+KANBAN_GUIDANCE = (
+    "# Kanban task execution protocol\n"
+    "You have been assigned ONE task from "
+    "the shared board at `~/.hermes/kanban.db`. Your task id is in "
+    "`$HERMES_KANBAN_TASK`; your workspace is `$HERMES_KANBAN_WORKSPACE`. "
+    "The `kanban_*` tools in your schema are your primary coordination surface — "
+    "they write directly to the shared SQLite DB and work regardless of terminal "
+    "backend (local/docker/modal/ssh).\n"
+    "\n"
+    "## Lifecycle\n"
+    "\n"
+    "1. **Orient.** Call `kanban_show()` first (no args — it defaults to your "
+    "task). The response includes title, body, parent-task handoffs (summary + "
+    "metadata), any prior attempts on this task if you're a retry, the full "
+    "comment thread, and a pre-formatted `worker_context` you can treat as "
+    "ground truth.\n"
+    "2. **Work inside the workspace.** `cd $HERMES_KANBAN_WORKSPACE` before "
+    "any file operations. The workspace is yours for this run. Don't modify "
+    "files outside it unless the task explicitly asks.\n"
+    "3. **Heartbeat on long operations.** Call `kanban_heartbeat(note=...)` "
+    "every few minutes during long subprocesses (training, encoding, crawling). "
+    "Skip heartbeats for short tasks.\n"
+    "4. **Block on genuine ambiguity.** If you need a human decision you cannot "
+    "infer (missing credentials, UX choice, paywalled source, peer output you "
+    "need first), call `kanban_block(reason=\"...\")` and stop. Don't guess. "
+    "The user will unblock with context and the dispatcher will respawn you.\n"
+    "5. **Complete with structured handoff.** Call `kanban_complete(summary=..., "
+    "metadata=...)`. `summary` is 1–3 human-readable sentences naming concrete "
+    "artifacts. `metadata` is machine-readable facts "
+    "(`{changed_files: [...], tests_run: N, decisions: [...]}`). Downstream "
+    "workers read both via their own `kanban_show`. Never put secrets / "
+    "tokens / raw PII in either field — run rows are durable forever.\n"
+    "6. **If follow-up work appears, create it; don't do it.** Use "
+    "`kanban_create(title=..., assignee=<right-profile>, parents=[your-task-id])` "
+    "to spawn a child task for the appropriate specialist profile instead of "
+    "scope-creeping into the next thing.\n"
+    "\n"
+    "## Orchestrator mode\n"
+    "\n"
+    "If your task is itself a decomposition task (e.g. a planner profile given "
+    "a high-level goal), use `kanban_create` to fan out into child tasks — one "
+    "per specialist, each with an explicit `assignee` and `parents=[...]` to "
+    "express dependencies. Then `kanban_complete` your own task with a summary "
+    "of the decomposition. Do NOT execute the work yourself; your job is "
+    "routing, not implementation.\n"
+    "\n"
+    "## Do NOT\n"
+    "\n"
+    "- Do not shell out to `hermes kanban <verb>` for board operations. Use "
+    "the `kanban_*` tools — they work across all terminal backends.\n"
+    "- Do not complete a task you didn't actually finish. Block it.\n"
+    "- Do not assign follow-up work to yourself. Assign it to the right "
+    "specialist profile.\n"
+    "- Do not call `delegate_task` as a board substitute. `delegate_task` is "
+    "for short reasoning subtasks inside your own run; board tasks are for "
+    "cross-agent handoffs that outlive one API loop."
+)
+
 TOOL_USE_ENFORCEMENT_GUIDANCE = (
    "# Tool-use enforcement\n"
    "You MUST use your tools to take action — do not describe what you would do "
@@ -305,13 +305,18 @@ def _redact_form_body(text: str) -> str:
    return _redact_query_string(text.strip())


-def redact_sensitive_text(text: str, *, force: bool = False) -> str:
+def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = False) -> str:
    """Apply all redaction patterns to a block of text.

    Safe to call on any string -- non-matching text passes through unchanged.
    Disabled by default — enable via security.redact_secrets: true in config.yaml.
    Set force=True for safety boundaries that must never return raw secrets
    regardless of the user's global logging redaction preference.
+
+    Set code_file=True to skip the ENV-assignment and JSON-field regex
+    patterns when the text is known to be source code (e.g. MAX_TOKENS=***
+    constants, "apiKey": "test" fixtures). Prefix patterns, auth headers,
+    private keys, DB connstrings, JWTs, and URL secrets are still redacted.
    """
    if text is None:
        return None
@@ -325,17 +330,18 @@ def redact_sensitive_text(text: str, *, force: bool = False) -> str:
    # Known prefixes (sk-, ghp_, etc.)
    text = _PREFIX_RE.sub(lambda m: _mask_token(m.group(1)), text)

-    # ENV assignments: OPENAI_API_KEY=sk-abc...
-    def _redact_env(m):
-        name, quote, value = m.group(1), m.group(2), m.group(3)
-        return f"{name}={quote}{_mask_token(value)}{quote}"
-    text = _ENV_ASSIGN_RE.sub(_redact_env, text)
+    # ENV assignments: OPENAI_API_KEY=***  (skip for code files — false positives)
+    if not code_file:
+        def _redact_env(m):
+            name, quote, value = m.group(1), m.group(2), m.group(3)
+            return f"{name}={quote}{_mask_token(value)}{quote}"
+        text = _ENV_ASSIGN_RE.sub(_redact_env, text)

-    # JSON fields: "apiKey": "value"
-    def _redact_json(m):
-        key, value = m.group(1), m.group(2)
-        return f'{key}: "{_mask_token(value)}"'
-    text = _JSON_FIELD_RE.sub(_redact_json, text)
+        # JSON fields: "apiKey": "***"  (skip for code files — false positives)
+        def _redact_json(m):
+            key, value = m.group(1), m.group(2)
+            return f'{key}: "{_mask_token(value)}"'
+        text = _JSON_FIELD_RE.sub(_redact_json, text)

    # Authorization headers
    text = _AUTH_HEADER_RE.sub(
@@ -6,6 +6,7 @@ can invoke skills via /skill-name commands.

 import json
 import logging
+import os
 import re
 from pathlib import Path
 from typing import Any, Dict, Optional
@@ -20,10 +21,35 @@ from agent.skill_preprocessing import (
 logger = logging.getLogger(__name__)

 _skill_commands: Dict[str, Dict[str, Any]] = {}
+_skill_commands_platform: Optional[str] = None
 # Patterns for sanitizing skill names into clean hyphen-separated slugs.
 _SKILL_INVALID_CHARS = re.compile(r"[^a-z0-9-]")
 _SKILL_MULTI_HYPHEN = re.compile(r"-{2,}")

+
+def _resolve_skill_commands_platform() -> Optional[str]:
+    """Return the current platform scope used for disabled-skill filtering.
+
+    Used to detect when the active platform has shifted so
+    :func:`get_skill_commands` can drop a stale cache that was populated
+    for a different platform's ``skills.platform_disabled`` view (#14536).
+
+    Resolves from (in order) ``HERMES_PLATFORM`` env var and
+    ``HERMES_SESSION_PLATFORM`` from the gateway session context. Returns
+    ``None`` when no platform scope is active (e.g. classic CLI, RL
+    rollouts, standalone scripts).
+    """
+    try:
+        from gateway.session_context import get_session_env
+
+        resolved_platform = (
+            os.getenv("HERMES_PLATFORM")
+            or get_session_env("HERMES_SESSION_PLATFORM")
+        )
+    except Exception:
+        resolved_platform = os.getenv("HERMES_PLATFORM")
+    return resolved_platform or None
+
 def _load_skill_payload(skill_identifier: str, task_id: str | None = None) -> tuple[dict[str, Any], Path | None, str] | None:
    """Load a skill by name/path and return (loaded_payload, skill_dir, display_name)."""
    raw_identifier = (skill_identifier or "").strip()
@@ -218,7 +244,8 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
    Returns:
        Dict mapping "/skill-name" to {name, description, skill_md_path, skill_dir}.
    """
-    global _skill_commands
+    global _skill_commands, _skill_commands_platform
+    _skill_commands_platform = _resolve_skill_commands_platform()
    _skill_commands = {}
    try:
        from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform, _get_disabled_skill_names
@@ -278,8 +305,16 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:


 def get_skill_commands() -> Dict[str, Dict[str, Any]]:
-    """Return the current skill commands mapping (scan first if empty)."""
-    if not _skill_commands:
+    """Return the current skill commands mapping (scan first if empty).
+
+    Rescans when the active platform scope changes (e.g. a gateway
+    process serving Telegram and Discord concurrently) so each platform
+    sees its own ``skills.platform_disabled`` view (#14536).
+    """
+    if (
+        not _skill_commands
+        or _skill_commands_platform != _resolve_skill_commands_platform()
+    ):
        scan_skill_commands()
    return _skill_commands

@@ -17,6 +17,7 @@ logger = logging.getLogger(__name__)
 # so silent-drops (e.g. OpenRouter 402 exhausting the fallback chain)
 # become visible instead of piling up as NULL session titles.
 FailureCallback = Callable[[str, BaseException], None]
+TitleCallback = Callable[[str], None]

 _TITLE_PROMPT = (
    "Generate a short, descriptive title (3-7 words) for a conversation that starts with the "
@@ -90,6 +91,7 @@ def auto_title_session(
    assistant_response: str,
    failure_callback: Optional[FailureCallback] = None,
    main_runtime: dict = None,
+    title_callback: Optional[TitleCallback] = None,
 ) -> None:
    """Generate and set a session title if one doesn't already exist.

@@ -119,6 +121,11 @@ def auto_title_session(
    try:
        session_db.set_session_title(session_id, title)
        logger.debug("Auto-generated session title: %s", title)
+        if title_callback is not None:
+            try:
+                title_callback(title)
+            except Exception:
+                logger.debug("Auto-title callback failed", exc_info=True)
    except Exception as e:
        logger.debug("Failed to set auto-generated title: %s", e)

@@ -131,6 +138,7 @@ def maybe_auto_title(
    conversation_history: list,
    failure_callback: Optional[FailureCallback] = None,
    main_runtime: dict = None,
+    title_callback: Optional[TitleCallback] = None,
 ) -> None:
    """Fire-and-forget title generation after the first exchange.

@@ -152,7 +160,11 @@ def maybe_auto_title(
    thread = threading.Thread(
        target=auto_title_session,
        args=(session_db, session_id, user_message, assistant_response),
-        kwargs={"failure_callback": failure_callback, "main_runtime": main_runtime},
+        kwargs={
+            "failure_callback": failure_callback,
+            "main_runtime": main_runtime,
+            "title_callback": title_callback,
+        },
        daemon=True,
        name="auto-title",
    )
@@ -0,0 +1,455 @@
+"""Pure tool-call loop guardrail primitives.
+
+The controller in this module is intentionally side-effect free: it tracks
+per-turn tool-call observations and returns decisions. Runtime code owns whether
+those decisions become warning guidance, synthetic tool results, or controlled
+turn halts.
+"""
+
+from __future__ import annotations
+
+import hashlib
+import json
+from dataclasses import dataclass, field
+from typing import Any, Mapping
+
+from utils import safe_json_loads
+
+
+IDEMPOTENT_TOOL_NAMES = frozenset(
+    {
+        "read_file",
+        "search_files",
+        "web_search",
+        "web_extract",
+        "session_search",
+        "browser_snapshot",
+        "browser_console",
+        "browser_get_images",
+        "mcp_filesystem_read_file",
+        "mcp_filesystem_read_text_file",
+        "mcp_filesystem_read_multiple_files",
+        "mcp_filesystem_list_directory",
+        "mcp_filesystem_list_directory_with_sizes",
+        "mcp_filesystem_directory_tree",
+        "mcp_filesystem_get_file_info",
+        "mcp_filesystem_search_files",
+    }
+)
+
+MUTATING_TOOL_NAMES = frozenset(
+    {
+        "terminal",
+        "execute_code",
+        "write_file",
+        "patch",
+        "todo",
+        "memory",
+        "skill_manage",
+        "browser_click",
+        "browser_type",
+        "browser_press",
+        "browser_scroll",
+        "browser_navigate",
+        "send_message",
+        "cronjob",
+        "delegate_task",
+        "process",
+    }
+)
+
+
+@dataclass(frozen=True)
+class ToolCallGuardrailConfig:
+    """Thresholds for per-turn tool-call loop detection.
+
+    Warnings are enabled by default and never prevent tool execution. Hard stops
+    are explicit opt-in so interactive CLI/TUI sessions get a gentle nudge unless
+    the user enables circuit-breaker behavior in config.yaml.
+    """
+
+    warnings_enabled: bool = True
+    hard_stop_enabled: bool = False
+    exact_failure_warn_after: int = 2
+    exact_failure_block_after: int = 5
+    same_tool_failure_warn_after: int = 3
+    same_tool_failure_halt_after: int = 8
+    no_progress_warn_after: int = 2
+    no_progress_block_after: int = 5
+    idempotent_tools: frozenset[str] = field(default_factory=lambda: IDEMPOTENT_TOOL_NAMES)
+    mutating_tools: frozenset[str] = field(default_factory=lambda: MUTATING_TOOL_NAMES)
+
+    @classmethod
+    def from_mapping(cls, data: Mapping[str, Any] | None) -> "ToolCallGuardrailConfig":
+        """Build config from the `tool_loop_guardrails` config.yaml section."""
+        if not isinstance(data, Mapping):
+            return cls()
+
+        warn_after = data.get("warn_after")
+        if not isinstance(warn_after, Mapping):
+            warn_after = {}
+        hard_stop_after = data.get("hard_stop_after")
+        if not isinstance(hard_stop_after, Mapping):
+            hard_stop_after = {}
+
+        defaults = cls()
+        return cls(
+            warnings_enabled=_as_bool(data.get("warnings_enabled"), defaults.warnings_enabled),
+            hard_stop_enabled=_as_bool(data.get("hard_stop_enabled"), defaults.hard_stop_enabled),
+            exact_failure_warn_after=_positive_int(
+                warn_after.get("exact_failure", data.get("exact_failure_warn_after")),
+                defaults.exact_failure_warn_after,
+            ),
+            same_tool_failure_warn_after=_positive_int(
+                warn_after.get("same_tool_failure", data.get("same_tool_failure_warn_after")),
+                defaults.same_tool_failure_warn_after,
+            ),
+            no_progress_warn_after=_positive_int(
+                warn_after.get("idempotent_no_progress", data.get("no_progress_warn_after")),
+                defaults.no_progress_warn_after,
+            ),
+            exact_failure_block_after=_positive_int(
+                hard_stop_after.get("exact_failure", data.get("exact_failure_block_after")),
+                defaults.exact_failure_block_after,
+            ),
+            same_tool_failure_halt_after=_positive_int(
+                hard_stop_after.get("same_tool_failure", data.get("same_tool_failure_halt_after")),
+                defaults.same_tool_failure_halt_after,
+            ),
+            no_progress_block_after=_positive_int(
+                hard_stop_after.get("idempotent_no_progress", data.get("no_progress_block_after")),
+                defaults.no_progress_block_after,
+            ),
+        )
+
+
+@dataclass(frozen=True)
+class ToolCallSignature:
+    """Stable, non-reversible identity for a tool name plus canonical args."""
+
+    tool_name: str
+    args_hash: str
+
+    @classmethod
+    def from_call(cls, tool_name: str, args: Mapping[str, Any] | None) -> "ToolCallSignature":
+        canonical = canonical_tool_args(args or {})
+        return cls(tool_name=tool_name, args_hash=_sha256(canonical))
+
+    def to_metadata(self) -> dict[str, str]:
+        """Return public metadata without raw argument values."""
+        return {"tool_name": self.tool_name, "args_hash": self.args_hash}
+
+
+@dataclass(frozen=True)
+class ToolGuardrailDecision:
+    """Decision returned by the tool-call guardrail controller."""
+
+    action: str = "allow"  # allow | warn | block | halt
+    code: str = "allow"
+    message: str = ""
+    tool_name: str = ""
+    count: int = 0
+    signature: ToolCallSignature | None = None
+
+    @property
+    def allows_execution(self) -> bool:
+        return self.action in {"allow", "warn"}
+
+    @property
+    def should_halt(self) -> bool:
+        return self.action in {"block", "halt"}
+
+    def to_metadata(self) -> dict[str, Any]:
+        data: dict[str, Any] = {
+            "action": self.action,
+            "code": self.code,
+            "message": self.message,
+            "tool_name": self.tool_name,
+            "count": self.count,
+        }
+        if self.signature is not None:
+            data["signature"] = self.signature.to_metadata()
+        return data
+
+
+def canonical_tool_args(args: Mapping[str, Any]) -> str:
+    """Return sorted compact JSON for parsed tool arguments."""
+    if not isinstance(args, Mapping):
+        raise TypeError(f"tool args must be a mapping, got {type(args).__name__}")
+    return json.dumps(
+        args,
+        ensure_ascii=False,
+        sort_keys=True,
+        separators=(",", ":"),
+        default=str,
+    )
+
+
+def classify_tool_failure(tool_name: str, result: str | None) -> tuple[bool, str]:
+    """Safety-fallback classifier used only when callers don't pass ``failed``.
+
+    Mirrors ``agent.display._detect_tool_failure`` exactly so the guardrail
+    never disagrees with the CLI's user-visible ``[error]`` tag. Production
+    callers in ``run_agent.py`` always pass an explicit ``failed=`` derived
+    from ``_detect_tool_failure``; this function exists so standalone callers
+    (tests, tooling) still get consistent behavior.
+    """
+    if result is None:
+        return False, ""
+
+    if tool_name == "terminal":
+        data = safe_json_loads(result)
+        if isinstance(data, dict):
+            exit_code = data.get("exit_code")
+            if exit_code is not None and exit_code != 0:
+                return True, f" [exit {exit_code}]"
+        return False, ""
+
+    if tool_name == "memory":
+        data = safe_json_loads(result)
+        if isinstance(data, dict):
+            if data.get("success") is False and "exceed the limit" in data.get("error", ""):
+                return True, " [full]"
+
+    lower = result[:500].lower()
+    if '"error"' in lower or '"failed"' in lower or result.startswith("Error"):
+        return True, " [error]"
+
+    return False, ""
+
+
+class ToolCallGuardrailController:
+    """Per-turn controller for repeated failed/non-progressing tool calls."""
+
+    def __init__(self, config: ToolCallGuardrailConfig | None = None):
+        self.config = config or ToolCallGuardrailConfig()
+        self.reset_for_turn()
+
+    def reset_for_turn(self) -> None:
+        self._exact_failure_counts: dict[ToolCallSignature, int] = {}
+        self._same_tool_failure_counts: dict[str, int] = {}
+        self._no_progress: dict[ToolCallSignature, tuple[str, int]] = {}
+        self._halt_decision: ToolGuardrailDecision | None = None
+
+    @property
+    def halt_decision(self) -> ToolGuardrailDecision | None:
+        return self._halt_decision
+
+    def before_call(self, tool_name: str, args: Mapping[str, Any] | None) -> ToolGuardrailDecision:
+        signature = ToolCallSignature.from_call(tool_name, _coerce_args(args))
+        if not self.config.hard_stop_enabled:
+            return ToolGuardrailDecision(tool_name=tool_name, signature=signature)
+
+        exact_count = self._exact_failure_counts.get(signature, 0)
+        if exact_count >= self.config.exact_failure_block_after:
+            decision = ToolGuardrailDecision(
+                action="block",
+                code="repeated_exact_failure_block",
+                message=(
+                    f"Blocked {tool_name}: the same tool call failed {exact_count} "
+                    "times with identical arguments. Stop retrying it unchanged; "
+                    "change strategy or explain the blocker."
+                ),
+                tool_name=tool_name,
+                count=exact_count,
+                signature=signature,
+            )
+            self._halt_decision = decision
+            return decision
+
+        if self._is_idempotent(tool_name):
+            record = self._no_progress.get(signature)
+            if record is not None:
+                _result_hash, repeat_count = record
+                if repeat_count >= self.config.no_progress_block_after:
+                    decision = ToolGuardrailDecision(
+                        action="block",
+                        code="idempotent_no_progress_block",
+                        message=(
+                            f"Blocked {tool_name}: this read-only call returned the same "
+                            f"result {repeat_count} times. Stop repeating it unchanged; "
+                            "use the result already provided or try a different query."
+                        ),
+                        tool_name=tool_name,
+                        count=repeat_count,
+                        signature=signature,
+                    )
+                    self._halt_decision = decision
+                    return decision
+
+        return ToolGuardrailDecision(tool_name=tool_name, signature=signature)
+
+    def after_call(
+        self,
+        tool_name: str,
+        args: Mapping[str, Any] | None,
+        result: str | None,
+        *,
+        failed: bool | None = None,
+    ) -> ToolGuardrailDecision:
+        args = _coerce_args(args)
+        signature = ToolCallSignature.from_call(tool_name, args)
+        if failed is None:
+            failed, _ = classify_tool_failure(tool_name, result)
+
+        if failed:
+            exact_count = self._exact_failure_counts.get(signature, 0) + 1
+            self._exact_failure_counts[signature] = exact_count
+            self._no_progress.pop(signature, None)
+
+            same_count = self._same_tool_failure_counts.get(tool_name, 0) + 1
+            self._same_tool_failure_counts[tool_name] = same_count
+
+            if self.config.hard_stop_enabled and same_count >= self.config.same_tool_failure_halt_after:
+                decision = ToolGuardrailDecision(
+                    action="halt",
+                    code="same_tool_failure_halt",
+                    message=(
+                        f"Stopped {tool_name}: it failed {same_count} times this turn. "
+                        "Stop retrying the same failing tool path and choose a different approach."
+                    ),
+                    tool_name=tool_name,
+                    count=same_count,
+                    signature=signature,
+                )
+                self._halt_decision = decision
+                return decision
+
+            if self.config.warnings_enabled and exact_count >= self.config.exact_failure_warn_after:
+                return ToolGuardrailDecision(
+                    action="warn",
+                    code="repeated_exact_failure_warning",
+                    message=(
+                        f"{tool_name} has failed {exact_count} times with identical arguments. "
+                        "This looks like a loop; inspect the error and change strategy "
+                        "instead of retrying it unchanged."
+                    ),
+                    tool_name=tool_name,
+                    count=exact_count,
+                    signature=signature,
+                )
+
+            if self.config.warnings_enabled and same_count >= self.config.same_tool_failure_warn_after:
+                return ToolGuardrailDecision(
+                    action="warn",
+                    code="same_tool_failure_warning",
+                    message=(
+                        f"{tool_name} has failed {same_count} times this turn. "
+                        "This looks like a loop; change approach before retrying."
+                    ),
+                    tool_name=tool_name,
+                    count=same_count,
+                    signature=signature,
+                )
+
+            return ToolGuardrailDecision(tool_name=tool_name, count=exact_count, signature=signature)
+
+        self._exact_failure_counts.pop(signature, None)
+        self._same_tool_failure_counts.pop(tool_name, None)
+
+        if not self._is_idempotent(tool_name):
+            self._no_progress.pop(signature, None)
+            return ToolGuardrailDecision(tool_name=tool_name, signature=signature)
+
+        result_hash = _result_hash(result)
+        previous = self._no_progress.get(signature)
+        repeat_count = 1
+        if previous is not None and previous[0] == result_hash:
+            repeat_count = previous[1] + 1
+        self._no_progress[signature] = (result_hash, repeat_count)
+
+        if self.config.warnings_enabled and repeat_count >= self.config.no_progress_warn_after:
+            return ToolGuardrailDecision(
+                action="warn",
+                code="idempotent_no_progress_warning",
+                message=(
+                    f"{tool_name} returned the same result {repeat_count} times. "
+                    "Use the result already provided or change the query instead of "
+                    "repeating it unchanged."
+                ),
+                tool_name=tool_name,
+                count=repeat_count,
+                signature=signature,
+            )
+
+        return ToolGuardrailDecision(tool_name=tool_name, count=repeat_count, signature=signature)
+
+    def _is_idempotent(self, tool_name: str) -> bool:
+        if tool_name in self.config.mutating_tools:
+            return False
+        return tool_name in self.config.idempotent_tools
+
+
+def toolguard_synthetic_result(decision: ToolGuardrailDecision) -> str:
+    """Build a synthetic role=tool content string for a blocked tool call."""
+    return json.dumps(
+        {
+            "error": decision.message,
+            "guardrail": decision.to_metadata(),
+        },
+        ensure_ascii=False,
+    )
+
+
+def append_toolguard_guidance(result: str, decision: ToolGuardrailDecision) -> str:
+    """Append runtime guidance to the current tool result content."""
+    if decision.action not in {"warn", "halt"} or not decision.message:
+        return result
+    label = "Tool loop hard stop" if decision.action == "halt" else "Tool loop warning"
+    suffix = (
+        f"\n\n[{label}: "
+        f"{decision.code}; count={decision.count}; {decision.message}]"
+    )
+    return (result or "") + suffix
+
+
+def _coerce_args(args: Mapping[str, Any] | None) -> Mapping[str, Any]:
+    return args if isinstance(args, Mapping) else {}
+
+
+def _result_hash(result: str | None) -> str:
+    parsed = safe_json_loads(result or "")
+    if parsed is not None:
+        try:
+            canonical = json.dumps(
+                parsed,
+                ensure_ascii=False,
+                sort_keys=True,
+                separators=(",", ":"),
+                default=str,
+            )
+        except TypeError:
+            canonical = str(parsed)
+    else:
+        canonical = result or ""
+    return _sha256(canonical)
+
+
+def _as_bool(value: Any, default: bool) -> bool:
+    if value is None:
+        return default
+    if isinstance(value, bool):
+        return value
+    if isinstance(value, (int, float)):
+        return bool(value)
+    if isinstance(value, str):
+        lowered = value.strip().lower()
+        if lowered in {"1", "true", "yes", "on", "enabled"}:
+            return True
+        if lowered in {"0", "false", "no", "off", "disabled"}:
+            return False
+    return default
+
+
+def _positive_int(value: Any, default: int) -> int:
+    if value is None:
+        return default
+    try:
+        parsed = int(value)
+    except (TypeError, ValueError):
+        return default
+    return parsed if parsed >= 1 else default
+
+
+def _sha256(value: str) -> str:
+    return hashlib.sha256(value.encode("utf-8")).hexdigest()
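The no-progress branch of the controller relies on two ideas: arguments are canonicalized before hashing, so key order does not matter, and a repeat only counts when the result hash is unchanged. A reduced sketch of just that branch follows; `NoProgressTracker` and `signature` are hypothetical names, and the failure counters, hard stops, and JSON-aware result hashing are omitted.

```python
import hashlib
import json
from typing import Any, Dict, Mapping, Tuple


def signature(tool_name: str, args: Mapping[str, Any]) -> Tuple[str, str]:
    # Sorted, compact JSON makes the hash independent of key order.
    canonical = json.dumps(args, ensure_ascii=False, sort_keys=True,
                           separators=(",", ":"), default=str)
    return tool_name, hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class NoProgressTracker:
    """Hypothetical, reduced version of the no-progress branch only."""

    def __init__(self, warn_after: int = 2) -> None:
        self.warn_after = warn_after
        self._seen: Dict[Tuple[str, str], Tuple[str, int]] = {}

    def after_call(self, tool_name: str, args: Mapping[str, Any], result: str) -> str:
        sig = signature(tool_name, args)
        result_hash = hashlib.sha256(result.encode("utf-8")).hexdigest()
        previous = self._seen.get(sig)
        count = 1
        if previous is not None and previous[0] == result_hash:
            count = previous[1] + 1  # same call, same result: no progress
        self._seen[sig] = (result_hash, count)
        return "warn" if count >= self.warn_after else "allow"


tracker = NoProgressTracker()
first = tracker.after_call("read_file", {"path": "a.txt", "offset": 0}, "hello")
second = tracker.after_call("read_file", {"offset": 0, "path": "a.txt"}, "hello")
third = tracker.after_call("read_file", {"path": "a.txt", "offset": 0}, "changed")
print(first, second, third)  # allow warn allow
```

A changed result resets the count to 1, so polling a file that is actually being updated does not trigger the warning.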
@@ -477,9 +477,13 @@ class ChatCompletionsTransport(ProviderTransport):
        # so keep them apart in provider_data rather than merging.
        reasoning = getattr(msg, "reasoning", None)
        reasoning_content = getattr(msg, "reasoning_content", None)
+        if reasoning_content is None and hasattr(msg, "model_extra"):
+            model_extra = getattr(msg, "model_extra", None) or {}
+            if isinstance(model_extra, dict) and "reasoning_content" in model_extra:
+                reasoning_content = model_extra["reasoning_content"]

        provider_data: Dict[str, Any] = {}
-        if reasoning_content:
+        if reasoning_content is not None:
            provider_data["reasoning_content"] = reasoning_content
        rd = getattr(msg, "reasoning_details", None)
        if rd:
@@ -143,7 +143,18 @@ class ResponsesApiTransport(ProviderTransport):
            kwargs["max_output_tokens"] = max_tokens

        if is_xai_responses and session_id:
-            kwargs["extra_headers"] = {"x-grok-conv-id": session_id}
+            existing_extra_headers = kwargs.get("extra_headers")
+            merged_extra_headers: Dict[str, str] = {}
+            if isinstance(existing_extra_headers, dict):
+                merged_extra_headers.update(
+                    {
+                        str(key): str(value)
+                        for key, value in existing_extra_headers.items()
+                        if key and value is not None
+                    }
+                )
+            merged_extra_headers["x-grok-conv-id"] = session_id
+            kwargs["extra_headers"] = merged_extra_headers

        return kwargs
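
Review note: the header merge in this hunk can be read in isolation as the sketch below. It is a minimal standalone restatement, assuming a hypothetical helper name `merge_extra_headers`; the transport itself does this inline on `kwargs` rather than through a helper.

```python
from typing import Any, Dict


def merge_extra_headers(existing: Any, session_id: str) -> Dict[str, str]:
    """Keep caller-supplied headers and add the xAI conversation id.

    Hypothetical helper for illustration only; mirrors the inline logic
    in the hunk above.
    """
    merged: Dict[str, str] = {}
    if isinstance(existing, dict):
        # Stringify keys/values; skip empty keys and None values.
        merged.update(
            {str(k): str(v) for k, v in existing.items() if k and v is not None}
        )
    # The conversation id is written last, so it overrides any caller value.
    merged["x-grok-conv-id"] = session_id
    return merged
```

Because the conversation id is assigned after the merge, a caller-provided `x-grok-conv-id` is overwritten rather than preserved, and a non-dict `extra_headers` value is discarded.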

@@ -121,6 +121,18 @@ model:
 #   # Data policy: "allow" (default) or "deny" to exclude providers that may store data
 #   # data_collection: "deny"

+# =============================================================================
+# OpenRouter Response Caching (only applies when using OpenRouter)
+# =============================================================================
+# Cache identical API responses at the OpenRouter edge for free instant replays.
+# When enabled, identical requests (same model, messages, parameters) return
+# cached responses with zero billing. Separate from Anthropic prompt caching.
+# See: https://openrouter.ai/docs/guides/features/response-caching
+#
+# openrouter:
+#   response_cache: true         # Enable response caching (default: true)
+#   response_cache_ttl: 300      # Cache TTL in seconds, 1-86400 (default: 300)
+
 # =============================================================================
 # Git Worktree Isolation
 # =============================================================================
@@ -289,6 +301,25 @@ browser:
  # after this period of no activity between agent loops (default: 120 = 2 minutes)
  inactivity_timeout: 120

+# =============================================================================
+# Tool Loop Guardrails
+# =============================================================================
+# Soft warnings are enabled by default. They append guidance to repeated failed
+# or non-progressing tool results but still let the tool execute. Hard stops are
+# opt-in circuit breakers for autonomous/cron sessions where stopping a loop is
+# preferable to spending the full iteration budget.
+tool_loop_guardrails:
+  warnings_enabled: true
+  hard_stop_enabled: false
+  warn_after:
+    exact_failure: 2
+    same_tool_failure: 3
+    idempotent_no_progress: 2
+  hard_stop_after:
+    exact_failure: 5
+    same_tool_failure: 8
+    idempotent_no_progress: 5
+
 # =============================================================================
 # Context Compression (Auto-shrinks long conversations)
 # =============================================================================
@@ -420,7 +420,7 @@ def _normalize_workdir(workdir: Optional[str]) -> Optional[str]:


 def create_job(
-    prompt: str,
+    prompt: Optional[str],
    schedule: str,
    name: Optional[str] = None,
    repeat: Optional[int] = None,
@@ -435,12 +435,14 @@ def create_job(
    context_from: Optional[Union[str, List[str]]] = None,
    enabled_toolsets: Optional[List[str]] = None,
    workdir: Optional[str] = None,
+    no_agent: bool = False,
 ) -> Dict[str, Any]:
    """
    Create a new cron job.

    Args:
-        prompt: The prompt to run (must be self-contained, or a task instruction when skill is set)
+        prompt: The prompt to run (must be self-contained, or a task instruction when skill is set).
+                Ignored when ``no_agent=True`` except as an optional name hint.
        schedule: Schedule string (see parse_schedule)
        name: Optional friendly name
        repeat: How many times to run (None = forever, 1 = once)
@@ -451,21 +453,33 @@ def create_job(
        model: Optional per-job model override
        provider: Optional per-job provider override
        base_url: Optional per-job base URL override
-        script: Optional path to a Python script whose stdout is injected into the
-                prompt each run.  The script runs before the agent turn, and its output
-                is prepended as context.  Useful for data collection / change detection.
+        script: Optional path to a script whose stdout feeds the job. With
+                ``no_agent=True`` the script IS the job — its stdout is
+                delivered verbatim. Without ``no_agent``, its stdout is
+                injected into the agent's prompt as context (data-collection /
+                change-detection pattern). Paths resolve under
+                ~/.hermes/scripts/; ``.sh`` / ``.bash`` files run via bash,
+                anything else via Python.
        context_from: Optional job ID (or list of job IDs) whose most recent output
                      is injected into the prompt as context before each run.
                      Useful for chaining cron jobs: job A finds data, job B processes it.
        enabled_toolsets: Optional list of toolset names to restrict the agent to.
                          When set, only tools from these toolsets are loaded, reducing
                          token overhead. When omitted, all default tools are loaded.
+                          Ignored when ``no_agent=True``.
        workdir: Optional absolute path.  When set, the job runs as if launched
                from that directory: AGENTS.md / CLAUDE.md / .cursorrules from
                that directory are injected into the system prompt, and the
                terminal/file/code_exec tools use it as their working directory
                (via TERMINAL_CWD).  When unset, the old behaviour is preserved
                (no context files injected, tools use the scheduler's cwd).
+                With ``no_agent=True``, ``workdir`` is still applied as the
+                script's cwd so relative paths inside the script behave
+                predictably.
+        no_agent: When True, skip the agent entirely — run ``script`` on schedule
+                and deliver its stdout directly. Empty stdout = silent (no
+                delivery). Requires ``script`` to be set. Ideal for classic
+                watchdogs and periodic alerts that don't need LLM reasoning.

    Returns:
        The created job dict
@@ -499,6 +513,16 @@ def create_job(
    normalized_toolsets = [str(t).strip() for t in enabled_toolsets if str(t).strip()] if enabled_toolsets else None
    normalized_toolsets = normalized_toolsets or None
    normalized_workdir = _normalize_workdir(workdir)
+    normalized_no_agent = bool(no_agent)
+
+    # no_agent jobs are meaningless without a script — the script IS the job.
+    # Surface this as a clear ValueError at create time so bad configs never
+    # reach the scheduler.
+    if normalized_no_agent and not normalized_script:
+        raise ValueError(
+            "no_agent=True requires a script — with no agent and no script "
+            "there is nothing for the job to run."
+        )

    # Normalize context_from: accept str or list of str, store as list or None
    if isinstance(context_from, str):
@@ -508,7 +532,7 @@ def create_job(
    else:
        context_from = None

-    label_source = (prompt or (normalized_skills[0] if normalized_skills else None)) or "cron job"
+    label_source = (prompt or (normalized_skills[0] if normalized_skills else None) or (normalized_script if normalized_no_agent else None)) or "cron job"
    job = {
        "id": job_id,
        "name": name or label_source[:50].strip(),
@@ -519,6 +543,7 @@ def create_job(
        "provider": normalized_provider,
        "base_url": normalized_base_url,
        "script": normalized_script,
+        "no_agent": normalized_no_agent,
        "context_from": context_from,
        "schedule": parsed_schedule,
        "schedule_display": parsed_schedule.get("display", schedule),
@@ -785,6 +810,12 @@ def get_due_jobs() -> List[Dict[str, Any]]:
    the job is fast-forwarded to the next future run instead of firing
    immediately.  This prevents a burst of missed jobs on gateway restart.
    """
+    with _jobs_file_lock:
+        return _get_due_jobs_locked()
+
+
+def _get_due_jobs_locked() -> List[Dict[str, Any]]:
+    """Inner implementation of get_due_jobs(); must be called with _jobs_file_lock held."""
    now = _hermes_now()
    raw_jobs = load_jobs()
    jobs = [_apply_skill_fields(j) for j in copy.deepcopy(raw_jobs)]
@@ -797,19 +828,36 @@ def get_due_jobs() -> List[Dict[str, Any]]:

        next_run = job.get("next_run_at")
        if not next_run:
+            schedule = job.get("schedule", {})
+            kind = schedule.get("kind")
+
+            # One-shot jobs use a small grace window via the dedicated helper.
            recovered_next = _recoverable_oneshot_run_at(
-                job.get("schedule", {}),
+                schedule,
                now,
                last_run_at=job.get("last_run_at"),
            )
+            recovery_kind = "one-shot" if recovered_next else None
+
+            # Recurring jobs reach here only when something — typically a
+            # direct jobs.json edit that bypassed add_job() — left
+            # next_run_at unset.  Without this branch, such jobs are
+            # silently skipped forever; recompute next_run_at from the
+            # schedule so they pick up at their next scheduled tick.
+            if not recovered_next and kind in ("cron", "interval"):
+                recovered_next = compute_next_run(schedule, now.isoformat())
+                if recovered_next:
+                    recovery_kind = kind
+
            if not recovered_next:
                continue

            job["next_run_at"] = recovered_next
            next_run = recovered_next
            logger.info(
-                "Job '%s' had no next_run_at; recovering one-shot run at %s",
+                "Job '%s' had no next_run_at; recovering %s run at %s",
                job.get("name", job["id"]),
+                recovery_kind,
                recovered_next,
            )
            for rj in raw_jobs:
@@ -882,3 +930,121 @@ def save_job_output(job_id: str, output: str):
        raise
    
    return output_file
+
+
+# =============================================================================
+# Skill reference rewriting (curator integration)
+# =============================================================================
+
+def rewrite_skill_refs(
+    consolidated: Optional[Dict[str, str]] = None,
+    pruned: Optional[List[str]] = None,
+) -> Dict[str, Any]:
+    """Rewrite cron job skill references after a curator consolidation pass.
+
+    When the curator consolidates a skill X into umbrella Y (or archives X
+    as pruned), any cron job that lists ``X`` in its ``skills`` field will
+    fail to load ``X`` at run time — the scheduler logs a warning and
+    skips the skill, so the job runs without the instructions it was
+    scheduled to follow. See cron/scheduler.py where ``skill_view`` is
+    called per skill name.
+
+    This function repairs cron jobs in-place:
+
+    - A skill listed in ``consolidated`` is replaced with its umbrella
+      target (the ``into`` value). If the umbrella is already in the
+      job's skill list, the stale name is dropped without duplication.
+    - A skill listed in ``pruned`` is dropped outright — there is no
+      forwarding target.
+    - Ordering and other skills in the list are preserved.
+    - The legacy ``skill`` field is realigned via ``_apply_skill_fields``.
+
+    Args:
+        consolidated: mapping of ``old_skill_name -> umbrella_skill_name``.
+        pruned: list of skill names that were archived with no forwarding
+            target.
+
+    Returns a report dict::
+
+        {
+            "rewrites": [
+                {
+                    "job_id": ...,
+                    "job_name": ...,
+                    "before": [...],
+                    "after": [...],
+                    "mapped": {"old": "new", ...},
+                    "dropped": ["old", ...],
+                },
+                ...
+            ],
+            "jobs_updated": N,
+            "jobs_scanned": M,
+        }
+
+    Error handling: exceptions from loading/saving propagate to the caller
+    so tests can assert behaviour. The curator invocation site wraps this
+    call in a try/except, so a failure here never breaks the curator.
+    """
+    consolidated = dict(consolidated or {})
+    pruned_set = set(pruned or [])
+    # A skill listed in both wins as "consolidated" — it has a target,
+    # which is the more useful of the two outcomes.
+    pruned_set -= set(consolidated.keys())
+
+    if not consolidated and not pruned_set:
+        return {"rewrites": [], "jobs_updated": 0, "jobs_scanned": 0}
+
+    with _jobs_file_lock:
+        jobs = load_jobs()
+        rewrites: List[Dict[str, Any]] = []
+        changed = False
+
+        for job in jobs:
+            skills_before = _normalize_skill_list(job.get("skill"), job.get("skills"))
+            if not skills_before:
+                continue
+
+            mapped: Dict[str, str] = {}
+            dropped: List[str] = []
+            new_skills: List[str] = []
+
+            for name in skills_before:
+                if name in consolidated:
+                    target = consolidated[name]
+                    mapped[name] = target
+                    if target and target not in new_skills:
+                        new_skills.append(target)
+                elif name in pruned_set:
+                    dropped.append(name)
+                else:
+                    if name not in new_skills:
+                        new_skills.append(name)
+
+            if not mapped and not dropped:
+                continue
+
+            job["skills"] = new_skills
+            job["skill"] = new_skills[0] if new_skills else None
+            changed = True
+
+            rewrites.append({
+                "job_id": job.get("id"),
+                "job_name": job.get("name") or job.get("id"),
+                "before": list(skills_before),
+                "after": list(new_skills),
+                "mapped": mapped,
+                "dropped": dropped,
+            })
+
+        if changed:
+            save_jobs(jobs)
+            logger.info(
+                "Curator rewrote skill references in %d cron job(s)", len(rewrites)
+            )
+
+        return {
+            "rewrites": rewrites,
+            "jobs_updated": len(rewrites),
+            "jobs_scanned": len(jobs),
+        }
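
Review note: the per-job rewrite loop above can be exercised without touching `jobs.json`. The sketch below extracts just the list transformation, assuming a hypothetical function name `rewrite_skill_list`; locking, loading, and saving are deliberately left out.

```python
from typing import Dict, List, Set, Tuple


def rewrite_skill_list(
    skills: List[str],
    consolidated: Dict[str, str],
    pruned: Set[str],
) -> Tuple[List[str], Dict[str, str], List[str]]:
    """Return (new_skills, mapped, dropped) for one job's skill list.

    Hypothetical helper for illustration; mirrors the inner loop of
    rewrite_skill_refs() shown in the hunk above.
    """
    mapped: Dict[str, str] = {}
    dropped: List[str] = []
    new_skills: List[str] = []
    for name in skills:
        if name in consolidated:
            # Consolidated skills are replaced by their umbrella target.
            target = consolidated[name]
            mapped[name] = target
            if target and target not in new_skills:
                new_skills.append(target)
        elif name in pruned:
            # Pruned skills have no forwarding target and are removed.
            dropped.append(name)
        elif name not in new_skills:
            new_skills.append(name)
    return new_skills, mapped, dropped
```

For example, a job listing `["a", "b", "c", "umbrella"]` with `a` consolidated into `umbrella` and `b` pruned ends up with `["umbrella", "c"]`: the umbrella takes the position of the first skill that maps to it and is not duplicated later.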
@@ -35,7 +35,7 @@ from typing import List, Optional
 sys.path.insert(0, str(Path(__file__).parent.parent))

 from hermes_constants import get_hermes_home
-from hermes_cli.config import load_config
+from hermes_cli.config import load_config, _expand_env_vars
 from hermes_time import now as _hermes_now

 logger = logging.getLogger(__name__)
@@ -123,9 +123,19 @@ _LOCK_FILE = _LOCK_DIR / ".tick.lock"


 def _resolve_origin(job: dict) -> Optional[dict]:
-    """Extract origin info from a job, preserving any extra routing metadata."""
+    """Extract origin info from a job, preserving any extra routing metadata.
+
+    Treats non-dict origins (free-form provenance strings, ints, lists from
+    migration scripts or hand-edited jobs.json) as missing instead of
+    crashing with ``AttributeError`` on ``origin.get(...)``. Without this
+    guard, a job tagged with e.g. ``"combined-digest-replaces-x-and-y"``
+    crashed every fire attempt with
+    ``'str' object has no attribute 'get'`` — ``mark_job_run`` recorded the
+    failure, but the next tick re-loaded the same poisoned origin and
+    crashed identically until the field was patched manually (#18722).
+    """
    origin = job.get("origin")
-    if not origin:
+    if not isinstance(origin, dict):
        return None
    platform = origin.get("platform")
    chat_id = origin.get("chat_id")
@@ -147,6 +157,19 @@ def _get_home_target_chat_id(platform_name: str) -> str:
    return value


+def _get_home_target_thread_id(platform_name: str) -> Optional[str]:
+    """Return the optional thread/topic ID for a platform home target."""
+    env_var = _HOME_TARGET_ENV_VARS.get(platform_name.lower())
+    if not env_var:
+        return None
+    value = os.getenv(f"{env_var}_THREAD_ID", "").strip()
+    if not value:
+        legacy = _LEGACY_HOME_TARGET_ENV_VARS.get(env_var)
+        if legacy:
+            value = os.getenv(f"{legacy}_THREAD_ID", "").strip()
+    return value or None
+
+
 def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[dict]:
    """Resolve one concrete auto-delivery target for a cron job."""

@@ -175,7 +198,7 @@ def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[d
                return {
                    "platform": platform_name,
                    "chat_id": chat_id,
-                    "thread_id": None,
+                    "thread_id": _get_home_target_thread_id(platform_name),
                }
        return None

@@ -229,7 +252,7 @@ def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[d
    return {
        "platform": platform_name,
        "chat_id": chat_id,
-        "thread_id": None,
+        "thread_id": _get_home_target_thread_id(platform_name),
    }


@@ -394,7 +417,7 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
        thread_id = target.get("thread_id")

        # Diagnostic: log thread_id for topic-aware delivery debugging
-        origin = job.get("origin") or {}
+        origin = _resolve_origin(job) or {}
        origin_thread = origin.get("thread_id")
        if origin_thread and not thread_id:
            logger.warning(
@@ -553,8 +576,18 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
    prevent arbitrary script execution via path traversal or absolute
    path injection.

+    Supported interpreters (chosen by file extension):
+
+    * ``.sh`` / ``.bash`` — run with ``/bin/bash``
+    * anything else — run with the current Python interpreter
+      (``sys.executable``), preserving the original behaviour for
+      Python-based pre-check and data-collection scripts.
+
+    Shell support lets ``no_agent=True`` jobs ship classic bash watchdogs
+    (the ``memory-watchdog.sh`` pattern) without wrapping them in Python.
+
    Args:
-        script_path: Path to a Python script.  Relative paths are resolved
+        script_path: Path to the script.  Relative paths are resolved
            against HERMES_HOME/scripts/.  Absolute and ~-prefixed paths
            are also validated to ensure they stay within the scripts dir.

@@ -591,9 +624,19 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:

    script_timeout = _get_script_timeout()

+    # Pick an interpreter by extension.  Bash for .sh/.bash, Python for
+    # everything else.  We deliberately do NOT honour the file's own
+    # shebang: the scripts dir is trusted, but keeping the interpreter
+    # choice explicit here keeps the allowed surface small and auditable.
+    suffix = path.suffix.lower()
+    if suffix in (".sh", ".bash"):
+        argv = ["/bin/bash", str(path)]
+    else:
+        argv = [sys.executable, str(path)]
+
    try:
        result = subprocess.run(
-            [sys.executable, str(path)],
+            argv,
            capture_output=True,
            text=True,
            timeout=script_timeout,
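
Review note: the interpreter choice in this hunk depends only on the file suffix, so it can be checked on its own. The sketch below assumes a hypothetical helper name `pick_argv`; the real code builds `argv` inline after path validation.

```python
import sys
from pathlib import Path
from typing import List


def pick_argv(path: Path) -> List[str]:
    """Choose the interpreter by extension, ignoring any shebang.

    Hypothetical helper for illustration; mirrors the suffix check in
    the hunk above.
    """
    suffix = path.suffix.lower()
    if suffix in (".sh", ".bash"):
        return ["/bin/bash", str(path)]
    # Everything else, including extensionless files, runs under Python.
    return [sys.executable, str(path)]
```

The suffix is lowercased first, so `watchdog.SH` still runs under bash, while a shell script saved without an extension would be handed to the Python interpreter.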
@@ -683,10 +726,8 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
                    f"{prompt}"
                )
            else:
-                prompt = (
-                    "[Script ran successfully but produced no output.]\n\n"
-                    f"{prompt}"
-                )
+                # Script produced no output — nothing to report, skip AI call.
+                return None
        else:
            prompt = (
                "## Script Error\n"
@@ -759,6 +800,7 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
        return prompt

    from tools.skills_tool import skill_view
+    from tools.skill_usage import bump_use

    parts = []
    skipped: list[str] = []
@@ -770,6 +812,12 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
            skipped.append(skill_name)
            continue

+        # Bump usage so the curator sees this skill as actively used.
+        try:
+            bump_use(skill_name)
+        except Exception:
+            logger.debug("Cron job: failed to bump skill usage for '%s'", skill_name, exc_info=True)
+
        content = str(loaded.get("content") or "").strip()
        if parts:
            parts.append("")
@@ -802,8 +850,120 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
    Returns:
        Tuple of (success, full_output_doc, final_response, error_message)
    """
+    job_id = job["id"]
+    job_name = job["name"]
+
+    # ---------------------------------------------------------------
+    # no_agent short-circuit — the script IS the job, no LLM involvement.
+    # ---------------------------------------------------------------
+    # This mirrors the classic "run a bash script on a timer, send its
+    # stdout to telegram" watchdog pattern. The agent path is skipped
+    # entirely: no AIAgent, no prompt, no tool loop, no token spend.
+    #
+    # We check this BEFORE importing run_agent / constructing SessionDB so
+    # a pure-script tick never pays for the agent machinery it isn't going
+    # to use. Keep this block self-contained.
+    #
+    # Semantics:
+    #   - script stdout (trimmed) → delivered verbatim as the final message
+    #   - empty stdout            → silent run (no delivery, success=True)
+    #   - non-zero exit / timeout → delivered as an error alert, success=False
+    #   - wakeAgent=false gate    → treated like empty stdout (silent), since
+    #                               the whole point of no_agent is that there
+    #                               is no agent to wake
+    if job.get("no_agent"):
+        script_path = job.get("script")
+        if not script_path:
+            err = "no_agent=True but no script is set for this job"
+            logger.error("Job '%s': %s", job_id, err)
+            return False, "", "", err
+
+        # Apply workdir if configured — lets scripts use predictable relative
+        # paths. For no_agent jobs this is just the subprocess cwd (not an
+        # agent TERMINAL_CWD bridge).
+        _job_workdir = (job.get("workdir") or "").strip() or None
+        _prior_cwd = None
+        if _job_workdir and Path(_job_workdir).is_dir():
+            _prior_cwd = os.getcwd()
+            try:
+                os.chdir(_job_workdir)
+            except OSError:
+                _prior_cwd = None
+
+        try:
+            ok, output = _run_job_script(script_path)
+        finally:
+            if _prior_cwd is not None:
+                try:
+                    os.chdir(_prior_cwd)
+                except OSError:
+                    pass
+
+        now_iso = _hermes_now().strftime("%Y-%m-%d %H:%M:%S")
+
+        if not ok:
+            # Script crashed / timed out / exited non-zero.  Deliver the
+            # error so the user knows the watchdog itself broke — silent
+            # failure for an alerting job is the worst-case outcome.
+            alert = (
+                f"⚠ Cron watchdog '{job_name}' script failed\n\n"
+                f"{output}\n\n"
+                f"Time: {now_iso}"
+            )
+            doc = (
+                f"# Cron Job: {job_name}\n\n"
+                f"**Job ID:** {job_id}\n"
+                f"**Run Time:** {now_iso}\n"
+                f"**Mode:** no_agent (script)\n"
+                f"**Status:** script failed\n\n"
+                f"{output}\n"
+            )
+            return False, doc, alert, output
+
+        # Honour the wakeAgent gate as a silent signal — `wakeAgent: false`
+        # means "nothing to report this tick", same as empty stdout.
+        if not _parse_wake_gate(output):
+            logger.info(
+                "Job '%s' (no_agent): wakeAgent=false gate — silent run", job_id
+            )
+            silent_doc = (
+                f"# Cron Job: {job_name}\n\n"
+                f"**Job ID:** {job_id}\n"
+                f"**Run Time:** {now_iso}\n"
+                f"**Mode:** no_agent (script)\n"
+                f"**Status:** silent (wakeAgent=false)\n"
+            )
+            return True, silent_doc, SILENT_MARKER, None
+
+        if not output.strip():
+            logger.info("Job '%s' (no_agent): empty stdout — silent run", job_id)
+            silent_doc = (
+                f"# Cron Job: {job_name}\n\n"
+                f"**Job ID:** {job_id}\n"
+                f"**Run Time:** {now_iso}\n"
+                f"**Mode:** no_agent (script)\n"
+                f"**Status:** silent (empty output)\n"
+            )
+            return True, silent_doc, SILENT_MARKER, None
+
+        doc = (
+            f"# Cron Job: {job_name}\n\n"
+            f"**Job ID:** {job_id}\n"
+            f"**Run Time:** {now_iso}\n"
+            f"**Mode:** no_agent (script)\n\n"
+            f"---\n\n"
+            f"{output}\n"
+        )
+        return True, doc, output, None
+
+    # ---------------------------------------------------------------
+    # Default (LLM) path — import and construct the agent machinery now
+    # that we know we actually need it. Doing these imports here instead of
+    # at module top keeps no_agent ticks from paying for AIAgent / SessionDB
+    # construction costs.
+    # ---------------------------------------------------------------
    from run_agent import AIAgent
-    
+
    # Initialize SQLite session store so cron job messages are persisted
    # and discoverable via session_search (same pattern as gateway/run.py).
    _session_db = None
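
Review note: the `no_agent` semantics listed in the comment block above reduce to three outcomes. The sketch below is a simplified model under stated assumptions: `classify_script_run` and the `SILENT` value are hypothetical stand-ins (the real code uses `SILENT_MARKER` and `_parse_wake_gate`, neither of which is shown in this diff), and the wake-gate result is passed in as a plain boolean.

```python
from typing import Optional, Tuple

SILENT = "[SILENT]"  # hypothetical stand-in for SILENT_MARKER


def classify_script_run(
    ok: bool, output: str, wake: bool = True
) -> Tuple[bool, str, Optional[str]]:
    """Return (success, message, error) for a no_agent script run.

    Hypothetical helper for illustration only.
    """
    if not ok:
        # Failure is delivered as an alert so a broken watchdog is visible.
        return False, f"script failed\n\n{output}", output
    if not wake or not output.strip():
        # wakeAgent=false and empty stdout both mean "nothing to report".
        return True, SILENT, None
    # Otherwise stdout is delivered verbatim.
    return True, output, None
```

The ordering matters: the failure check comes first, so a script that crashes with empty output still produces an alert instead of a silent run.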
@@ -812,9 +972,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
        _session_db = SessionDB()
    except Exception as e:
        logger.debug("Job '%s': SQLite session store not available: %s", job.get("id", "?"), e)
-    
-    job_id = job["id"]
-    job_name = job["name"]

    # Wake-gate: if this job has a pre-check script, run it BEFORE building
    # the prompt so a ``{"wakeAgent": false}`` response can short-circuit
@@ -839,6 +996,9 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
            return True, silent_doc, SILENT_MARKER, None

    prompt = _build_job_prompt(job, prerun_script=prerun_script)
+    if prompt is None:
+        logger.info("Job '%s': script produced no output, skipping AI call.", job_name)
+        return True, "", SILENT_MARKER, None
    origin = _resolve_origin(job)
    _cron_session_id = f"cron_{job_id}_{_hermes_now().strftime('%Y%m%d_%H%M%S')}"

@@ -922,6 +1082,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
            if os.path.exists(_cfg_path):
                with open(_cfg_path) as _f:
                    _cfg = yaml.safe_load(_f) or {}
+                _cfg = _expand_env_vars(_cfg)
                _model_cfg = _cfg.get("model", {})
                if not job.get("model"):
                    if isinstance(_model_cfg, str):
@@ -974,8 +1135,13 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
        )
        from hermes_cli.auth import AuthError
        try:
+            # Do not inject HERMES_INFERENCE_PROVIDER here. resolve_runtime_provider()
+            # already prefers persisted config over stale shell/env overrides when
+            # no explicit provider is requested. Passing the env var here short-
+            # circuits that precedence and can resurrect old providers (for
+            # example DeepSeek) for cron jobs that do not pin provider/model.
            runtime_kwargs = {
-                "requested": job.get("provider") or os.getenv("HERMES_INFERENCE_PROVIDER"),
+                "requested": job.get("provider"),
            }
            if job.get("base_url"):
                runtime_kwargs["explicit_base_url"] = job.get("base_url")
@@ -40,7 +40,7 @@ services:
      # - TEAMS_CLIENT_SECRET=${TEAMS_CLIENT_SECRET}
      # - TEAMS_TENANT_ID=${TEAMS_TENANT_ID}
      # - TEAMS_ALLOWED_USERS=${TEAMS_ALLOWED_USERS}
-      # - TEAMS_PORT=3978
+      # - TEAMS_PORT=${TEAMS_PORT:-3978}
    command: ["gateway", "run"]

  dashboard:
@@ -86,6 +86,41 @@ if [ -d "$INSTALL_DIR/skills" ]; then
    python3 "$INSTALL_DIR/tools/skills_sync.py"
 fi

+# Optionally start `hermes dashboard` as a side-process.
+#
+# Toggled by HERMES_DASHBOARD=1 (also accepts "true"/"yes" in lower, UPPER, or Title case).
+# Host/port/TUI can be overridden via:
+#   HERMES_DASHBOARD_HOST  (default 0.0.0.0 — exposed outside the container)
+#   HERMES_DASHBOARD_PORT  (default 9119, matches `hermes dashboard` default)
+#   HERMES_DASHBOARD_TUI   (already honored by `hermes dashboard` itself)
+#
+# The dashboard is a long-lived server.  We background it *before* the final
+# `exec hermes "$@"` so the user's chosen foreground command (chat, gateway,
+# sleep infinity, …) remains the process the container runtime tracks.  When
+# the container stops the whole process tree is torn down, so no explicit
+# cleanup is needed.
+case "${HERMES_DASHBOARD:-}" in
+    1|true|TRUE|True|yes|YES|Yes)
+        dash_host="${HERMES_DASHBOARD_HOST:-0.0.0.0}"
+        dash_port="${HERMES_DASHBOARD_PORT:-9119}"
+        dash_args=(--host "$dash_host" --port "$dash_port" --no-open)
+        # Binding to anything other than localhost requires --insecure — the
+        # dashboard refuses otherwise because it exposes API keys.  Inside a
+        # container this is the expected deployment (host reaches it via
+        # published port), so opt in automatically.
+        if [ "$dash_host" != "127.0.0.1" ] && [ "$dash_host" != "localhost" ]; then
+            dash_args+=(--insecure)
+        fi
+        echo "Starting hermes dashboard on ${dash_host}:${dash_port} (background)"
+        # Prefix dashboard output so it's distinguishable from the main
+        # process in `docker logs`.  stdbuf keeps the pipe line-buffered.
+        (
+            stdbuf -oL -eL hermes dashboard "${dash_args[@]}" 2>&1 \
+                | sed -u 's/^/[dashboard] /'
+        ) &
+        ;;
+esac
+
 # Final exec: two supported invocation patterns.
 #
 #   docker run <image>                 -> exec `hermes` with no args (legacy default)
@@ -0,0 +1,473 @@
+# Telegram DM User-Managed Multi-Session Topics Implementation Plan
+
+> **For Hermes:** Use test-driven-development for implementation. Use subagent-driven-development only after this plan is split into small reviewed tasks.
+
+**Goal:** Add an opt-in Telegram DM multi-session mode where Telegram user-created private-chat topics become independent Hermes session lanes, while the root DM becomes a system lobby.
+
+**Architecture:** Rely on Telegram's native private-chat topic UI. Users create new topics with the `+` button; Hermes maps each `message_thread_id` to a separate session lane. Hermes does not create topics for normal `/new` flow and does not try to manage topic lifecycle beyond activation/status, root-lobby behavior, and restoring legacy sessions into a user-created topic.
+
+**Tech Stack:** Hermes gateway, Telegram Bot API 9.4+, python-telegram-bot adapter, SQLite SessionDB / side tables, pytest.
+
+---
+
+## 1. Product decisions
+
+### Accepted
+
+- PR-quality implementation: migrations, tests, docs, backwards compatibility.
+- Use SQLite persistence, not JSON sidecars.
+- Live status suffixes in topic titles are out of MVP.
+- Topic title sync/editing is out of MVP except future-compatible storage if cheap.
+- User creates Telegram topics manually through the Telegram bot interface.
+- `/new` does **not** create Telegram topics.
+- Root/main DM becomes a system lobby after activation.
+- Existing Telegram behavior remains unchanged until the feature is activated/enabled.
+- Migration of old sessions is supported through `/topic` listing and `/topic <session_id>` restore inside a user-created topic.
+
+### Telegram API assumptions verified from Bot API docs
+
+- `getMe` returns bot `User` fields:
+  - `has_topics_enabled`: forum/topic mode enabled in private chats.
+  - `allows_users_to_create_topics`: users may create/delete topics in private chats.
+- `createForumTopic` works for private chats with a user, but MVP does not rely on it for normal flow.
+- `Message.message_thread_id` identifies a topic in private chats.
+- `sendMessage` supports `message_thread_id` for private-chat topics.
+- `pinChatMessage` is allowed in private chats.
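+
+A minimal sketch of the capability gate these assumptions imply, assuming the `getMe` result has already been decoded into a plain dict. `check_topic_capabilities` is a hypothetical helper name used only for illustration; it is not an existing Hermes or python-telegram-bot function.
+
+```python
+from typing import Any, Dict, List
+
+
+def check_topic_capabilities(me: Dict[str, Any]) -> List[str]:
+    """Return the names of required getMe flags that are missing or false."""
+    required = ("has_topics_enabled", "allows_users_to_create_topics")
+    # Treat an absent field the same as False so the check fails closed.
+    return [name for name in required if not me.get(name)]
+
+
+ok = check_topic_capabilities(
+    {"has_topics_enabled": True, "allows_users_to_create_topics": True}
+)
+missing = check_topic_capabilities({"has_topics_enabled": True})
+```
+
+An empty list means activation may proceed; otherwise the returned names can be echoed back in the capability-failure instructions.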
+
+---
+
+## 2. Target UX
+
+### 2.1 Activation from root/main DM
+
+User sends:
+
+```text
+/topic
+```
+
+Hermes:
+
+1. calls Telegram `getMe`;
+2. verifies `has_topics_enabled` and `allows_users_to_create_topics`;
+3. enables multi-session topic mode for this Telegram DM user/chat;
+4. sends an onboarding message;
+5. pins the onboarding message if configured;
+6. shows old/unlinked sessions that can be restored into topics.
+
+Suggested onboarding text:
+
+```text
+Multi-session mode is enabled.
+
+Create new Hermes chats with the + button in this bot interface. Each Telegram topic is an independent Hermes session, so you can work on different tasks in parallel.
+
+This main chat is reserved for system commands, status, and session management.
+
+To restore an old session:
+1. Use /topic here to see unlinked sessions.
+2. Create a new topic with the + button.
+3. Send /topic <session_id> inside that topic.
+```
+
+### 2.2 Root/main DM after activation
+
+Root DM is a system lobby.
+
+Allowed/system commands include at least:
+
+- `/topic`
+- `/status`
+- `/sessions` if available
+- `/usage`
+- `/help`
+- `/platforms`
+
+Normal user prompts in root DM do not enter the agent loop. Reply:
+
+```text
+This main chat is reserved for system commands.
+
+To chat with Hermes, create a new topic using the + button in this bot interface. Each topic works as an independent Hermes session.
+```
+
+`/new` in root DM does not create a session/topic. Reply:
+
+```text
+To start a new parallel Hermes chat, create a new topic with the + button in this bot interface.
+
+Each topic is an independent Hermes session. Use /new inside a topic only if you want to replace that topic's current session.
+```
+
+### 2.3 First message in a user-created topic
+
+When a user creates a Telegram topic and sends the first message there:
+
+1. Hermes receives a Telegram DM message with `message_thread_id`.
+2. Hermes derives the existing thread-aware `session_key` from `(platform=telegram, chat_type=dm, chat_id, thread_id)`.
+3. If no binding exists, Hermes creates a fresh Hermes session for this topic lane and persists the binding.
+4. The message runs through the normal agent loop for that lane.
+
+### 2.4 `/new` inside a non-main topic
+
+`/new` remains supported but replaces the session attached to the current topic lane.
+
+Hermes should warn:
+
+```text
+Started a new Hermes session in this topic.
+
+Tip: for parallel work, create a new topic with the + button instead of using /new here. /new replaces the session attached to the current topic.
+```
+
+### 2.5 `/topic` in root/main DM after activation
+
+Shows:
+
+- mode enabled/disabled;
+- last capability check result;
+- whether the intro message is pinned, if known;
+- count of known topic bindings;
+- list of old/unlinked sessions.
+
+Example:
+
+```text
+Telegram multi-session topics are enabled.
+
+Create new Hermes chats with the + button in this bot interface.
+
+Unlinked previous sessions:
+1. 2026-05-01 Research notes — id: abc123
+2. 2026-04-30 Deploy debugging — id: def456
+3. Untitled session — id: ghi789
+
+To restore one:
+1. Create a new topic with the + button.
+2. Open that topic.
+3. Send /topic <id>
+```
+
+### 2.6 `/topic` inside a non-main topic
+
+Without args, show the current topic binding:
+
+```text
+This topic is linked to:
+Session: Research notes
+ID: abc123
+
+Use /new to replace this topic with a fresh session.
+For parallel work, create another topic with the + button.
+```
+
+### 2.7 `/topic <session_id>` inside a non-main topic
+
+Restore an old/unlinked session into the current user-created topic.
+
+Behavior:
+
+1. reject if the command was not sent inside a Telegram DM topic;
+2. verify session belongs to the same Telegram user/chat or is a safe legacy root DM session for this user;
+3. reject if session is already linked to another active topic in MVP;
+4. `SessionStore.switch_session(current_topic_session_key, target_session_id)`;
+5. upsert binding with `managed_mode = restored`;
+6. send two messages into the topic:
+   - session restored confirmation;
+   - last Hermes assistant message if available.
+
+Example:
+
+```text
+Session restored: Research notes
+
+Last Hermes message:
+...
+```
+
+---
+
+## 3. Persistence model
+
+Use SQLite, but topic-mode schema changes are **explicit opt-in migrations**, not automatic startup reconciliation.
+
+Important rollback-safety rule:
+
+- upgrading Hermes and starting the gateway must not create Telegram topic-mode tables or columns;
+- old/default Telegram behavior must keep working on the existing `state.db`;
+- the first `/topic` activation path calls an idempotent explicit migration, then enables topic mode for that chat;
+- if activation fails before the migration is needed, the database remains in the pre-topic-mode shape.
+
+### 3.1 No eager `sessions` table mutation for MVP
+
+Do **not** add `chat_id`, `chat_type`, `thread_id`, or `session_key` columns to `sessions` as part of ordinary `SessionDB()` startup. The existing declarative `_reconcile_columns()` mechanism would add them eagerly on every process start, which violates the managed-migration requirement.
+
+For MVP, keep origin/session-lane data in topic-specific side tables created only by the explicit `/topic` migration. Legacy unlinked sessions can be discovered conservatively from existing data (`source = telegram`, `user_id = current Telegram user`) plus absence from topic bindings.
+
+If future PRs need richer origin metadata for all gateway sessions, introduce it behind a separate explicit migration/command or a compatibility-reviewed schema bump.
+
+### 3.2 Explicit `/topic` migration API
+
+Add an idempotent method such as:
+
+```python
+def apply_telegram_topic_migration(self) -> None: ...
+```
+
+It creates only topic-mode side tables/indexes and records:
+
+```text
+state_meta.telegram_dm_topic_schema_version = 1
+```
+
+This method is called from `/topic` activation/status paths before reading or writing topic-mode state. It is not called from generic `SessionDB.__init__`, gateway startup, CLI startup, or auto-maintenance.
+
+### 3.3 `telegram_dm_topic_mode`
+
+Stores per-user/chat activation state. Created only by `apply_telegram_topic_migration()`.
+
+Suggested fields:
+
+- `chat_id` primary key
+- `user_id`
+- `enabled`
+- `activated_at`
+- `updated_at`
+- `has_topics_enabled`
+- `allows_users_to_create_topics`
+- `capability_checked_at`
+- `intro_message_id`
+- `pinned_message_id`
+
+### 3.4 `telegram_dm_topic_bindings`
+
+Stores Telegram topic/thread to Hermes session binding. Created only by `apply_telegram_topic_migration()`.
+
+Suggested fields:
+
+- `chat_id`
+- `thread_id`
+- `user_id`
+- `session_key`
+- `session_id`
+- `managed_mode`
+  - `auto`
+  - `restored`
+  - `new_replaced`
+- `linked_at`
+- `updated_at`
+
+Recommended constraints:
+
+- primary key `(chat_id, thread_id)`;
+- unique index on `session_id` for MVP, to prevent one session from being linked to multiple topics;
+- index `(user_id, chat_id)` for status/listing.
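+
+The side tables and the idempotent migration from sections 3.2 to 3.4 could be sketched roughly as follows. This is an illustration under stated assumptions, not the final schema: the column types, the index names, and a `key`/`value` layout for `state_meta` are all guesses, and the function takes a raw `sqlite3.Connection` instead of being a `SessionDB` method.
+
+```python
+import sqlite3
+
+TOPIC_SCHEMA_VERSION = "1"
+
+
+def apply_telegram_topic_migration(conn: sqlite3.Connection) -> None:
+    """Create topic-mode side tables; safe to call repeatedly."""
+    conn.executescript(
+        """
+        -- Assumed layout; the real state_meta table may differ.
+        CREATE TABLE IF NOT EXISTS state_meta (
+            key TEXT PRIMARY KEY,
+            value TEXT
+        );
+        CREATE TABLE IF NOT EXISTS telegram_dm_topic_mode (
+            chat_id TEXT PRIMARY KEY,
+            user_id TEXT,
+            enabled INTEGER NOT NULL DEFAULT 0,
+            activated_at REAL,
+            updated_at REAL,
+            has_topics_enabled INTEGER,
+            allows_users_to_create_topics INTEGER,
+            capability_checked_at REAL,
+            intro_message_id TEXT,
+            pinned_message_id TEXT
+        );
+        CREATE TABLE IF NOT EXISTS telegram_dm_topic_bindings (
+            chat_id TEXT NOT NULL,
+            thread_id TEXT NOT NULL,
+            user_id TEXT,
+            session_key TEXT,
+            session_id TEXT NOT NULL,
+            managed_mode TEXT NOT NULL DEFAULT 'auto',
+            linked_at REAL,
+            updated_at REAL,
+            PRIMARY KEY (chat_id, thread_id)
+        );
+        CREATE UNIQUE INDEX IF NOT EXISTS idx_tg_topic_bindings_session
+            ON telegram_dm_topic_bindings (session_id);
+        CREATE INDEX IF NOT EXISTS idx_tg_topic_bindings_user_chat
+            ON telegram_dm_topic_bindings (user_id, chat_id);
+        """
+    )
+    # Record the schema version; INSERT OR REPLACE keeps reruns idempotent.
+    conn.execute(
+        "INSERT OR REPLACE INTO state_meta (key, value) VALUES (?, ?)",
+        ("telegram_dm_topic_schema_version", TOPIC_SCHEMA_VERSION),
+    )
+    conn.commit()
+
+
+conn = sqlite3.connect(":memory:")
+apply_telegram_topic_migration(conn)
+apply_telegram_topic_migration(conn)  # second call changes nothing
+```
+
+Because every statement uses `IF NOT EXISTS` or `INSERT OR REPLACE`, calling the function from both the activation and status paths should be harmless.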
+
+### 3.5 Unlinked session semantics
+
+For MVP, a session is unlinked if:
+
+- `source = telegram`;
+- `user_id = current Telegram user`;
+- no row in `telegram_dm_topic_bindings` references that session's `session_id`.
+
+This is intentionally conservative until a future explicit migration adds richer cross-platform origin metadata.
+
+Never dedupe by title.
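+
+One way to express the unlinked-session rule is a single anti-join query. The sketch below assumes a `sessions` table with `id`, `source`, and `user_id` columns; those column names are assumptions for illustration, and `list_unlinked_session_ids` is a hypothetical helper, not an existing `SessionDB` method.
+
+```python
+import sqlite3
+from typing import List
+
+
+def list_unlinked_session_ids(conn: sqlite3.Connection, user_id: str) -> List[str]:
+    """Return Telegram session ids for this user that have no topic binding."""
+    rows = conn.execute(
+        """
+        SELECT s.id
+        FROM sessions AS s
+        LEFT JOIN telegram_dm_topic_bindings AS b ON b.session_id = s.id
+        WHERE s.source = 'telegram' AND s.user_id = ? AND b.session_id IS NULL
+        ORDER BY s.id
+        """,
+        (user_id,),
+    ).fetchall()
+    return [row[0] for row in rows]
+
+
+# Tiny in-memory fixture to exercise the query.
+conn = sqlite3.connect(":memory:")
+conn.executescript(
+    """
+    CREATE TABLE sessions (id TEXT PRIMARY KEY, source TEXT, user_id TEXT);
+    CREATE TABLE telegram_dm_topic_bindings (
+        chat_id TEXT, thread_id TEXT, session_id TEXT,
+        PRIMARY KEY (chat_id, thread_id)
+    );
+    INSERT INTO sessions VALUES ('abc123', 'telegram', 'u1');
+    INSERT INTO sessions VALUES ('def456', 'telegram', 'u1');
+    INSERT INTO sessions VALUES ('ghi789', 'discord', 'u1');
+    INSERT INTO telegram_dm_topic_bindings VALUES ('c1', 't1', 'def456');
+    """
+)
+unlinked = list_unlinked_session_ids(conn, "u1")
+```
+
+The query matches on `session_id` only, so two sessions with the same title are always listed separately.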
+
+---
+
+## 4. Config
+
+Suggested config block:
+
+```yaml
+platforms:
+  telegram:
+    extra:
+      multisession_topics:
+        enabled: false
+        mode: user_managed_topics
+        root_chat_behavior: system_lobby
+        pin_intro_message: true
+```
+
+Notes:
+
+- `enabled: false` means existing Telegram behavior is unchanged.
+- `/topic` activation may write per-chat enabled state only when the global config permits it (`enabled: true`).
+- `root_chat_behavior: system_lobby` is the MVP behavior for activated chats.
+
+---
+
+## 5. Command behavior summary
+
+### `/topic` root/main DM
+
+- If not activated: capability check, activate, send/pin onboarding, list unlinked sessions.
+- If activated: show status and unlinked sessions.
+
+### `/topic` non-main topic
+
+- Show current binding.
+
+### `/topic <session_id>` root/main DM
+
+Reject with instructions:
+
+```text
+Create a new topic with the + button, open it, then send /topic <session_id> there to restore this session.
+```
+
+### `/topic <session_id>` non-main topic
+
+Restore that session into this topic if ownership/linking checks pass.
+
+### `/new` root/main DM when activated
+
+Reply with instructions to use the `+` button. Do not enter agent loop.
+
+### `/new` non-main topic
+
+Create a new session in the current topic lane, persist/update binding, warn that `+` is preferred for parallel work.
+
+### Normal text root/main DM when activated
+
+Reply with system-lobby instruction. Do not enter agent loop.
+
+### Normal text non-main topic
+
+Normal Hermes agent flow for that topic's session lane.
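+
+The summary above amounts to a small decision table. The sketch below encodes it as a pure function so it can be unit-tested without Telegram. The action names are made up for illustration, `thread_id is None` is assumed to mean the root/main DM, and the handling of `/topic` inside a topic before activation is a guess the plan does not spell out.
+
+```python
+from typing import Optional
+
+# System commands allowed in the root lobby (from section 2.2).
+SYSTEM_COMMANDS = {"/status", "/sessions", "/usage", "/help", "/platforms"}
+
+
+def route_telegram_dm(activated: bool, thread_id: Optional[str], text: str) -> str:
+    """Map an incoming Telegram DM message to a named action."""
+    parts = text.strip().split()
+    command = parts[0] if parts else ""
+    in_root = thread_id is None
+
+    if command == "/topic":
+        has_arg = len(parts) > 1
+        if in_root:
+            if has_arg:
+                return "reject_restore_in_root"
+            return "show_status" if activated else "activate"
+        return "restore_session" if has_arg else "show_binding"
+
+    if not activated:
+        # Feature off for this chat: existing behavior stays untouched.
+        return "legacy_behavior"
+
+    if in_root:
+        if command == "/new":
+            return "new_instructions"
+        if command in SYSTEM_COMMANDS:
+            return "system_command"
+        return "lobby_instructions"
+
+    # Inside a user-created topic.
+    return "replace_session" if command == "/new" else "normal_flow"
+```
+
+Keeping the routing free of I/O makes the backwards-compatibility case (`activated=False`) cheap to cover in tests.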
+
+---
+
+## 6. PR breakdown
+
+### PR 1 — Explicit topic-mode schema migration
+
+**Goal:** Add rollback-safe SQLite support for Telegram topic mode without mutating `state.db` on ordinary upgrade/startup.
+
+**Files likely touched:**
+
+- `hermes_state.py`
+- tests under `tests/`
+
+**Tests first:**
+
+1. opening an old/current DB with `SessionDB()` does not create topic-mode tables or `sessions` origin columns;
+2. calling `apply_telegram_topic_migration()` creates `telegram_dm_topic_mode` and `telegram_dm_topic_bindings` idempotently;
+3. migration records `state_meta.telegram_dm_topic_schema_version = 1`.
+
+### PR 2 — Topic mode activation and binding APIs
+
+**Goal:** Add SQLite persistence for activation and topic bindings.
+
+**Tests first:**
+
+1. enable/check mode row round-trips;
+2. binding upsert and lookup by `(chat_id, user_id, thread_id)`;
+3. linked sessions are excluded from unlinked list.
+
+### PR 3 — `/topic` activation/status command
+
+**Goal:** Implement root activation/status/listing behavior.
+
+**Tests first:**
+
+1. `/topic` in root checks `getMe` capabilities and records activation;
+2. capability failure returns readable instructions;
+3. activated root `/topic` lists unlinked sessions.
+
+### PR 4 — System lobby behavior
+
+**Goal:** Prevent root chat from entering agent loop after activation.
+
+**Tests first:**
+
+1. normal text in activated root returns lobby instruction;
+2. `/new` in activated root returns `+` button instruction;
+3. non-activated root behavior is unchanged.
+
+### PR 5 — Auto-bind user-created topics
+
+**Goal:** First message in non-main topic creates/uses an independent session lane.
+
+**Tests first:**
+
+1. new topic message creates binding with `managed_mode = auto`;
+2. repeated topic message reuses same binding/lane;
+3. two topics in same DM do not share sessions.
+
+### PR 6 — Restore legacy sessions into a topic
+
+**Goal:** Implement `/topic <session_id>` in non-main topics.
+
+**Tests first:**
+
+1. root `/topic <id>` rejects with instructions;
+2. topic `/topic <id>` switches current topic lane to target session;
+3. restore rejects sessions from other users/chats;
+4. restore rejects already-linked sessions;
+5. restore emits confirmation and last Hermes assistant message.
+
+### PR 7 — `/new` inside topic updates binding
+
+**Goal:** Keep existing `/new` semantics but persist topic binding replacement.
+
+**Tests first:**
+
+1. `/new` in topic creates a new session for same topic lane;
+2. binding updates to `managed_mode = new_replaced`;
+3. response includes guidance to use `+` for parallel work.
+
+### PR 8 — Docs and polish
+
+**Goal:** Document the feature and Telegram setup.
+
+**Files likely touched:**
+
+- `website/docs/user-guide/messaging/telegram.md`
+- possibly `website/docs/user-guide/sessions.md`
+
+Docs must explain:
+
+- BotFather/Telegram settings for topic mode and user-created topics;
+- `/topic` activation;
+- root system lobby;
+- using `+` for new parallel chats;
+- restoring old sessions with `/topic <id>` inside a topic;
+- limitations.
+
+---
+
+## 7. Testing / quality gates
+
+Run targeted tests after each TDD cycle, then broader tests before completion.
+
+Suggested commands, once inspection confirms the test paths:
+
+```bash
+python -m pytest tests/test_hermes_state.py -q
+python -m pytest tests/gateway/ -q
+python -m pytest tests/ -o 'addopts=' -q
+```
+
+Do not ship without verifying disabled-feature backwards compatibility.
+
+---
+
+## 8. Definition of done for MVP
+
+- `/topic` activates/checks Telegram DM multi-session mode.
+- Root DM becomes a system lobby after activation.
+- Onboarding message tells users to create new chats with the Telegram `+` button.
+- Onboarding message can be pinned in private chat.
+- User-created topics automatically become independent Hermes session lanes.
+- `/new` in root gives instructions, not a new agent run.
+- `/new` in a topic creates a new session in that topic and warns that `+` is preferred for parallel work.
+- `/topic` in root lists unlinked old sessions.
+- `/topic <session_id>` inside a topic restores that session and sends confirmation + last Hermes assistant message.
+- Ownership checks prevent restoring other users' sessions.
+- Already-linked sessions are not restored into a second topic in MVP.
+- Existing Telegram behavior is unchanged when the feature is disabled.
+- Tests and docs are included.
@@ -36,6 +36,26 @@ def _coerce_bool(value: Any, default: bool = True) -> bool:
    return is_truthy_value(value, default=default)


+def _coerce_float(value: Any, default: float) -> float:
+    """Coerce numeric config values, falling back on malformed input."""
+    if value is None:
+        return default
+    try:
+        return float(value)
+    except (TypeError, ValueError):
+        return default
+
+
+def _coerce_int(value: Any, default: int) -> int:
+    """Coerce integer config values, falling back on malformed input."""
+    if value is None:
+        return default
+    try:
+        return int(value)
+    except (TypeError, ValueError):
+        return default
+
+
 def _normalize_unauthorized_dm_behavior(value: Any, default: str = "pair") -> str:
    """Normalize unauthorized DM behavior to a supported value."""
    if isinstance(value, str):
@@ -45,6 +65,15 @@ def _normalize_unauthorized_dm_behavior(value: Any, default: str = "pair") -> st
    return default


+def _normalize_notice_delivery(value: Any, default: str = "public") -> str:
+    """Normalize notice delivery mode to a supported value."""
+    if isinstance(value, str):
+        normalized = value.strip().lower()
+        if normalized in {"public", "private"}:
+            return normalized
+    return default
+
+
 # Module-level cache for bundled platform plugin names (lives outside the
 # enum so it doesn't become an accidental enum member).
 _Platform__bundled_plugin_names: Optional[set] = None
@@ -157,18 +186,24 @@ class HomeChannel:
    Default destination for a platform.
    
    When a cron job specifies deliver="telegram" without a specific chat ID,
-    messages are sent to this home channel.
+    messages are sent to this home channel. Thread-aware platforms may also
+    store a thread/topic ID so the bare platform target routes to the exact
+    conversation where /sethome was run.
    """
    platform: Platform
    chat_id: str
    name: str  # Human-readable name for display
+    thread_id: Optional[str] = None
    
    def to_dict(self) -> Dict[str, Any]:
-        return {
+        result = {
            "platform": self.platform.value,
            "chat_id": self.chat_id,
            "name": self.name,
        }
+        if self.thread_id:
+            result["thread_id"] = self.thread_id
+        return result
    
    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "HomeChannel":
@@ -176,6 +211,7 @@ class HomeChannel:
            platform=Platform(data["platform"]),
            chat_id=str(data["chat_id"]),
            name=data.get("name", "Home"),
+            thread_id=str(data["thread_id"]) if data.get("thread_id") else None,
        )


@@ -301,13 +337,13 @@ class StreamingConfig:
        if not data:
            return cls()
        return cls(
-            enabled=data.get("enabled", False),
+            enabled=_coerce_bool(data.get("enabled"), False),
            transport=data.get("transport", "edit"),
-            edit_interval=float(data.get("edit_interval", 1.0)),
-            buffer_threshold=int(data.get("buffer_threshold", 40)),
+            edit_interval=_coerce_float(data.get("edit_interval"), 1.0),
+            buffer_threshold=_coerce_int(data.get("buffer_threshold"), 40),
            cursor=data.get("cursor", " ▉"),
-            fresh_final_after_seconds=float(
-                data.get("fresh_final_after_seconds", 60.0)
+            fresh_final_after_seconds=_coerce_float(
+                data.get("fresh_final_after_seconds"), 60.0
            ),
        )

@@ -572,6 +608,17 @@ class GatewayConfig:
                )
        return self.unauthorized_dm_behavior

+    def get_notice_delivery(self, platform: Optional[Platform] = None) -> str:
+        """Return the effective notice-delivery mode for a platform."""
+        if platform:
+            platform_cfg = self.platforms.get(platform)
+            if platform_cfg and "notice_delivery" in platform_cfg.extra:
+                return _normalize_notice_delivery(
+                    platform_cfg.extra.get("notice_delivery"),
+                    "public",
+                )
+        return "public"
+

 def load_gateway_config() -> GatewayConfig:
    """
@@ -687,6 +734,11 @@ def load_gateway_config() -> GatewayConfig:
                        platform_cfg.get("unauthorized_dm_behavior"),
                        gw_data.get("unauthorized_dm_behavior", "pair"),
                    )
+                if "notice_delivery" in platform_cfg:
+                    bridged["notice_delivery"] = _normalize_notice_delivery(
+                        platform_cfg.get("notice_delivery"),
+                        "public",
+                    )
                if "reply_prefix" in platform_cfg:
                    bridged["reply_prefix"] = platform_cfg["reply_prefix"]
                if "reply_in_thread" in platform_cfg:
@@ -794,11 +846,25 @@ def load_gateway_config() -> GatewayConfig:
                        if yaml_key in allow_mentions_cfg and not os.getenv(env_key):
                            os.environ[env_key] = str(allow_mentions_cfg[yaml_key]).lower()

+            # Bridge top-level require_mention to Telegram when the telegram: section
+            # does not already provide one.  Users often write "require_mention: true"
+            # at the top level alongside group_sessions_per_user, expecting it to work
+            # the same way (#3979).
+            _tl_require_mention = yaml_cfg.get("require_mention")
+            if _tl_require_mention is not None:
+                _tg_section = yaml_cfg.get("telegram") or {}
+                if "require_mention" not in _tg_section:
+                    _tg_plat = platforms_data.setdefault(Platform.TELEGRAM.value, {})
+                    _tg_extra = _tg_plat.setdefault("extra", {})
+                    _tg_extra.setdefault("require_mention", _tl_require_mention)
+
            # Telegram settings → env vars (env vars take precedence)
            telegram_cfg = yaml_cfg.get("telegram", {})
            if isinstance(telegram_cfg, dict):
-                if "require_mention" in telegram_cfg and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
-                    os.environ["TELEGRAM_REQUIRE_MENTION"] = str(telegram_cfg["require_mention"]).lower()
+                # Prefer telegram.require_mention; fall back to the top-level shorthand.
+                _effective_rm = telegram_cfg.get("require_mention", yaml_cfg.get("require_mention"))
+                if _effective_rm is not None and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
+                    os.environ["TELEGRAM_REQUIRE_MENTION"] = str(_effective_rm).lower()
                if "mention_patterns" in telegram_cfg and not os.getenv("TELEGRAM_MENTION_PATTERNS"):
                    os.environ["TELEGRAM_MENTION_PATTERNS"] = json.dumps(telegram_cfg["mention_patterns"])
                frc = telegram_cfg.get("free_response_chats")
@@ -900,6 +966,12 @@ def load_gateway_config() -> GatewayConfig:
                if "dm_mention_threads" in matrix_cfg and not os.getenv("MATRIX_DM_MENTION_THREADS"):
                    os.environ["MATRIX_DM_MENTION_THREADS"] = str(matrix_cfg["dm_mention_threads"]).lower()

+            # Feishu settings → env vars (env vars take precedence)
+            feishu_cfg = yaml_cfg.get("feishu", {})
+            if isinstance(feishu_cfg, dict):
+                if "allow_bots" in feishu_cfg and not os.getenv("FEISHU_ALLOW_BOTS"):
+                    os.environ["FEISHU_ALLOW_BOTS"] = str(feishu_cfg["allow_bots"]).lower()
+
    except Exception as e:
        logger.warning(
            "Failed to process config.yaml — falling back to .env / gateway.json values. "
@@ -1020,6 +1092,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.TELEGRAM,
            chat_id=telegram_home,
            name=os.getenv("TELEGRAM_HOME_CHANNEL_NAME", "Home"),
+            thread_id=os.getenv("TELEGRAM_HOME_CHANNEL_THREAD_ID") or None,
        )
    
    # Discord
@@ -1036,6 +1109,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.DISCORD,
            chat_id=discord_home,
            name=os.getenv("DISCORD_HOME_CHANNEL_NAME", "Home"),
+            thread_id=os.getenv("DISCORD_HOME_CHANNEL_THREAD_ID") or None,
        )
    
    # Reply threading mode for Discord (off/first/all)
@@ -1051,7 +1125,15 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
        if Platform.WHATSAPP not in config.platforms:
            config.platforms[Platform.WHATSAPP] = PlatformConfig()
        config.platforms[Platform.WHATSAPP].enabled = True
-    
+    whatsapp_home = os.getenv("WHATSAPP_HOME_CHANNEL")
+    if whatsapp_home and Platform.WHATSAPP in config.platforms:
+        config.platforms[Platform.WHATSAPP].home_channel = HomeChannel(
+            platform=Platform.WHATSAPP,
+            chat_id=whatsapp_home,
+            name=os.getenv("WHATSAPP_HOME_CHANNEL_NAME", "Home"),
+            thread_id=os.getenv("WHATSAPP_HOME_CHANNEL_THREAD_ID") or None,
+        )
+
    # Slack
    slack_token = os.getenv("SLACK_BOT_TOKEN")
    if slack_token:
@@ -1077,6 +1159,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.SLACK,
            chat_id=slack_home,
            name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
+            thread_id=os.getenv("SLACK_HOME_CHANNEL_THREAD_ID") or None,
        )
    
    # Signal
@@ -1097,6 +1180,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.SIGNAL,
            chat_id=signal_home,
            name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
+            thread_id=os.getenv("SIGNAL_HOME_CHANNEL_THREAD_ID") or None,
        )

    # Mattermost
@@ -1116,6 +1200,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.MATTERMOST,
            chat_id=mattermost_home,
            name=os.getenv("MATTERMOST_HOME_CHANNEL_NAME", "Home"),
+            thread_id=os.getenv("MATTERMOST_HOME_CHANNEL_THREAD_ID") or None,
        )

    # Matrix
@@ -1147,6 +1232,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.MATRIX,
            chat_id=matrix_home,
            name=os.getenv("MATRIX_HOME_ROOM_NAME", "Home"),
+            thread_id=os.getenv("MATRIX_HOME_ROOM_THREAD_ID") or None,
        )

    # Home Assistant
@@ -1180,6 +1266,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.EMAIL,
            chat_id=email_home,
            name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
+            thread_id=os.getenv("EMAIL_HOME_ADDRESS_THREAD_ID") or None,
        )

    # SMS (Twilio)
@@ -1195,6 +1282,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.SMS,
            chat_id=sms_home,
            name=os.getenv("SMS_HOME_CHANNEL_NAME", "Home"),
+            thread_id=os.getenv("SMS_HOME_CHANNEL_THREAD_ID") or None,
        )

    # API Server
@@ -1257,6 +1345,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
                platform=Platform.DINGTALK,
                chat_id=dingtalk_home,
                name=os.getenv("DINGTALK_HOME_CHANNEL_NAME", "Home"),
+                thread_id=os.getenv("DINGTALK_HOME_CHANNEL_THREAD_ID") or None,
            )

    # Feishu / Lark
@@ -1284,6 +1373,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
                platform=Platform.FEISHU,
                chat_id=feishu_home,
                name=os.getenv("FEISHU_HOME_CHANNEL_NAME", "Home"),
+                thread_id=os.getenv("FEISHU_HOME_CHANNEL_THREAD_ID") or None,
            )

    # WeCom (Enterprise WeChat)
@@ -1306,6 +1396,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
                platform=Platform.WECOM,
                chat_id=wecom_home,
                name=os.getenv("WECOM_HOME_CHANNEL_NAME", "Home"),
+                thread_id=os.getenv("WECOM_HOME_CHANNEL_THREAD_ID") or None,
            )

    # WeCom callback mode (self-built apps)
@@ -1364,6 +1455,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
                platform=Platform.WEIXIN,
                chat_id=weixin_home,
                name=os.getenv("WEIXIN_HOME_CHANNEL_NAME", "Home"),
+                thread_id=os.getenv("WEIXIN_HOME_CHANNEL_THREAD_ID") or None,
            )

    # BlueBubbles (iMessage)
@@ -1387,6 +1479,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            platform=Platform.BLUEBUBBLES,
            chat_id=bluebubbles_home,
            name=os.getenv("BLUEBUBBLES_HOME_CHANNEL_NAME", "Home"),
+            thread_id=os.getenv("BLUEBUBBLES_HOME_CHANNEL_THREAD_ID") or None,
        )

    # QQ (Official Bot API v2)
@@ -1424,6 +1517,11 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
                platform=Platform.QQBOT,
                chat_id=qq_home,
                name=os.getenv("QQBOT_HOME_CHANNEL_NAME") or os.getenv(qq_home_name_env, "Home"),
+                thread_id=(
+                    os.getenv("QQBOT_HOME_CHANNEL_THREAD_ID")
+                    or os.getenv("QQ_HOME_CHANNEL_THREAD_ID")
+                    or None
+                ),
            )

    # Yuanbao — YUANBAO_APP_ID preferred
@@ -1454,6 +1552,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
                platform=Platform.YUANBAO,
                chat_id=yuanbao_home,
                name=os.getenv("YUANBAO_HOME_CHANNEL_NAME", "Home"),
+                thread_id=os.getenv("YUANBAO_HOME_CHANNEL_THREAD_ID") or None,
            )
        yuanbao_dm_policy = os.getenv("YUANBAO_DM_POLICY")
        if yuanbao_dm_policy:
@@ -53,9 +53,10 @@ class DeliveryTarget:
        - "telegram" → Telegram home channel
        - "telegram:123456" → specific Telegram chat
        """
-        target = target.strip().lower()
+        target_stripped = target.strip()
+        target_lower = target_stripped.lower()
        
-        if target == "origin":
+        if target_lower == "origin":
            if origin:
                return cls(
                    platform=origin.platform,
@@ -67,13 +68,14 @@ class DeliveryTarget:
                # Fallback to local if no origin
                return cls(platform=Platform.LOCAL, is_origin=True)
        
-        if target == "local":
+        if target_lower == "local":
            return cls(platform=Platform.LOCAL)
        
        # Check for platform:chat_id or platform:chat_id:thread_id format
-        if ":" in target:
-            parts = target.split(":", 2)
-            platform_str = parts[0]
+        # Use the original case for chat_id/thread_id to preserve case-sensitive IDs
+        if ":" in target_stripped:
+            parts = target_stripped.split(":", 2)
+            platform_str = parts[0].lower()  # Platform names are case-insensitive
            chat_id = parts[1] if len(parts) > 1 else None
            thread_id = parts[2] if len(parts) > 2 else None
            try:
@@ -85,7 +87,7 @@ class DeliveryTarget:
        
        # Just a platform name (use home channel)
        try:
-            platform = Platform(target)
+            platform = Platform(target_lower)
            return cls(platform=platform)
        except ValueError:
            # Unknown platform, treat as local
@@ -0,0 +1,84 @@
+"""Shared HTTP client factory for long-lived platform adapters.
+
+Gateway messaging platforms (QQ Bot, Feishu, WeCom, DingTalk, Signal,
+BlueBubbles, WeCom-callback) keep a persistent ``httpx.AsyncClient``
+alive for the adapter's lifetime.  That amortises TLS/connection setup
+across many API calls, but it also means the process's file-descriptor
+pressure is sensitive to how aggressively the pool recycles idle keep-
+alive connections.
+
+httpx's default ``keepalive_expiry`` is 5 seconds.  On macOS behind
+Cloudflare Warp (and other transparent proxies), peer-initiated FIN can
+sit in ``CLOSE_WAIT`` longer than that before the local socket actually
+drains — which, multiplied across 7 long-lived adapters plus the LLM
+client and MCP clients, walks straight into the default 256 fd limit.
+See #18451.
+
+``platform_httpx_limits()`` returns a tighter ``httpx.Limits`` the
+adapter factories use instead of the httpx default.  The values chosen:
+
+* ``max_keepalive_connections=10`` — plenty for any single adapter;
+  platform APIs rarely parallelise beyond this.
+* ``keepalive_expiry=2.0`` — close idle sockets aggressively so a
+  proxy's lingering CLOSE_WAIT window can't starve the process.
+
+Override via ``HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY`` /
+``HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE`` env vars when tuning under load.
+"""
+
+from __future__ import annotations
+
+import os
+
+try:
+    import httpx
+except ImportError:  # pragma: no cover — optional dep
+    httpx = None  # type: ignore[assignment]
+
+
+_DEFAULT_KEEPALIVE_EXPIRY_S = 2.0
+_DEFAULT_MAX_KEEPALIVE = 10
+
+
+def platform_httpx_limits() -> "httpx.Limits | None":
+    """Return ``httpx.Limits`` tuned for persistent platform-adapter clients.
+
+    Returns ``None`` when httpx isn't importable, so callers can fall
+    back to httpx's built-in default without a hard dependency on this
+    helper being reachable.
+    """
+    if httpx is None:
+        return None
+
+    def _env_float(name: str, default: float) -> float:
+        raw = os.environ.get(name, "").strip()
+        if not raw:
+            return default
+        try:
+            val = float(raw)
+        except (TypeError, ValueError):
+            return default
+        return val if val > 0 else default
+
+    def _env_int(name: str, default: int) -> int:
+        raw = os.environ.get(name, "").strip()
+        if not raw:
+            return default
+        try:
+            val = int(raw)
+        except (TypeError, ValueError):
+            return default
+        return val if val > 0 else default
+
+    keepalive_expiry = _env_float(
+        "HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY", _DEFAULT_KEEPALIVE_EXPIRY_S
+    )
+    max_keepalive = _env_int(
+        "HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE", _DEFAULT_MAX_KEEPALIVE
+    )
+
+    return httpx.Limits(
+        max_keepalive_connections=max_keepalive,
+        # Leave max_connections at httpx default (100) — plenty of headroom.
+        keepalive_expiry=keepalive_expiry,
+    )
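Reviewer note: the override parsing above can be exercised without httpx installed. The following is a minimal sketch of the same fallback rule (empty, malformed, or non-positive values fall back to the default); `env_positive_float` is a hypothetical name used only for illustration and is not part of the diff.

```python
import os


def env_positive_float(name: str, default: float) -> float:
    """Return a positive float read from the environment, else ``default``.

    Hypothetical helper mirroring the nested ``_env_float`` in the diff.
    """
    raw = os.environ.get(name, "").strip()
    if not raw:
        return default
    try:
        value = float(raw)
    except ValueError:
        # Malformed input must never crash adapter startup.
        return default
    # Zero and negative values are treated as "unset".
    return value if value > 0 else default


if __name__ == "__main__":
    os.environ["HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY"] = "5.5"
    print(env_positive_float("HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY", 2.0))
```

Falling back silently keeps a mistyped override from taking the gateway down, at the cost of the operator not being told the value was ignored.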
@@ -62,6 +62,14 @@ MAX_NORMALIZED_TEXT_LENGTH = 65_536  # 64 KB cap for normalized content parts
 MAX_CONTENT_LIST_SIZE = 1_000  # Max items when content is an array


+def _coerce_port(value: Any, default: int = DEFAULT_PORT) -> int:
+    """Parse a listen port without letting malformed env/config values crash startup."""
+    try:
+        return int(value)
+    except (TypeError, ValueError):
+        return default
+
+
 def _normalize_chat_content(
    content: Any, *, _max_depth: int = 10, _depth: int = 0,
 ) -> str:
@@ -573,7 +581,10 @@ class APIServerAdapter(BasePlatformAdapter):
        super().__init__(config, Platform.API_SERVER)
        extra = config.extra or {}
        self._host: str = extra.get("host", os.getenv("API_SERVER_HOST", DEFAULT_HOST))
-        self._port: int = int(extra.get("port", os.getenv("API_SERVER_PORT", str(DEFAULT_PORT))))
+        raw_port = extra.get("port")
+        if raw_port is None:
+            raw_port = os.getenv("API_SERVER_PORT", str(DEFAULT_PORT))
+        self._port: int = _coerce_port(raw_port, DEFAULT_PORT)
        self._api_key: str = extra.get("key", os.getenv("API_SERVER_KEY", ""))
        self._cors_origins: tuple[str, ...] = self._parse_cors_origins(
            extra.get("cors_origins", os.getenv("API_SERVER_CORS_ORIGINS", "")),
@@ -727,10 +738,11 @@ class APIServerAdapter(BasePlatformAdapter):
        gateway platforms), falling back to the hermes-api-server default.
        """
        from run_agent import AIAgent
-        from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config
+        from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config, GatewayRunner
        from hermes_cli.tools_config import _get_platform_tools

        runtime_kwargs = _resolve_runtime_agent_kwargs()
+        reasoning_config = GatewayRunner._load_reasoning_config()
        model = _resolve_gateway_model()

        user_config = _load_gateway_config()
@@ -740,7 +752,6 @@ class APIServerAdapter(BasePlatformAdapter):

        # Load fallback provider chain so the API server platform has the
        # same fallback behaviour as Telegram/Discord/Slack (fixes #4954).
-        from gateway.run import GatewayRunner
        fallback_model = GatewayRunner._load_fallback_model()

        agent = AIAgent(
@@ -759,6 +770,7 @@ class APIServerAdapter(BasePlatformAdapter):
            tool_complete_callback=tool_complete_callback,
            session_db=self._ensure_session_db(),
            fallback_model=fallback_model,
+            reasoning_config=reasoning_config,
        )
        return agent

@@ -2351,10 +2363,11 @@ class APIServerAdapter(BasePlatformAdapter):
            )
            if agent_ref is not None:
                agent_ref[0] = agent
+            effective_task_id = session_id or str(uuid.uuid4())
            result = agent.run_conversation(
                user_message=user_message,
                conversation_history=conversation_history,
-                task_id="default",
+                task_id=effective_task_id,
            )
            usage = {
                "input_tokens": getattr(agent, "session_prompt_tokens", 0) or 0,
@@ -2551,10 +2564,11 @@ class APIServerAdapter(BasePlatformAdapter):
                )
                self._active_run_agents[run_id] = agent
                def _run_sync():
+                    effective_task_id = session_id or run_id
                    r = agent.run_conversation(
                        user_message=user_message,
                        conversation_history=conversation_history,
-                        task_id="default",
+                        task_id=effective_task_id,
                    )
                    u = {
                        "input_tokens": getattr(agent, "session_prompt_tokens", 0) or 0,
@@ -2564,21 +2578,39 @@ class APIServerAdapter(BasePlatformAdapter):
                    return r, u

                result, usage = await asyncio.get_running_loop().run_in_executor(None, _run_sync)
-                final_response = result.get("final_response", "") if isinstance(result, dict) else ""
-                q.put_nowait({
-                    "event": "run.completed",
-                    "run_id": run_id,
-                    "timestamp": time.time(),
-                    "output": final_response,
-                    "usage": usage,
-                })
-                self._set_run_status(
-                    run_id,
-                    "completed",
-                    output=final_response,
-                    usage=usage,
-                    last_event="run.completed",
-                )
+                # Check for structured failure (non-retryable client errors like
+                # 401/400 return failed=True instead of raising, so the except
+                # block below never fires — issue #15561).
+                if isinstance(result, dict) and result.get("failed"):
+                    error_msg = result.get("error") or "agent run failed"
+                    q.put_nowait({
+                        "event": "run.failed",
+                        "run_id": run_id,
+                        "timestamp": time.time(),
+                        "error": error_msg,
+                    })
+                    self._set_run_status(
+                        run_id,
+                        "failed",
+                        error=error_msg,
+                        last_event="run.failed",
+                    )
+                else:
+                    final_response = result.get("final_response", "") if isinstance(result, dict) else ""
+                    q.put_nowait({
+                        "event": "run.completed",
+                        "run_id": run_id,
+                        "timestamp": time.time(),
+                        "output": final_response,
+                        "usage": usage,
+                    })
+                    self._set_run_status(
+                        run_id,
+                        "completed",
+                        output=final_response,
+                        usage=usage,
+                        last_event="run.completed",
+                    )
            except asyncio.CancelledError:
                self._set_run_status(
                    run_id,
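Reviewer note: the hunk above distinguishes a structured failure (`failed=True` in the result dict) from a normal completion. A minimal sketch of that decision as a pure function, assuming the result shape shown in the diff; `build_terminal_event` is a hypothetical name and the real code also updates run status, which is omitted here.

```python
import time
from typing import Any, Dict


def build_terminal_event(run_id: str, result: Any, usage: Dict[str, int]) -> Dict[str, Any]:
    """Map an agent result onto a ``run.failed`` or ``run.completed`` event."""
    if isinstance(result, dict) and result.get("failed"):
        # Non-retryable errors are reported in the dict instead of raised.
        return {
            "event": "run.failed",
            "run_id": run_id,
            "timestamp": time.time(),
            "error": result.get("error") or "agent run failed",
        }
    output = result.get("final_response", "") if isinstance(result, dict) else ""
    return {
        "event": "run.completed",
        "run_id": run_id,
        "timestamp": time.time(),
        "output": output,
        "usage": usage,
    }
```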
@@ -416,7 +416,7 @@ def is_host_excluded_by_no_proxy(hostname: str, no_proxy_value: str | None = Non
 from dataclasses import dataclass, field
 from datetime import datetime
 from pathlib import Path
-from typing import Dict, List, Optional, Any, Callable, Awaitable, Tuple
+from typing import Dict, List, Optional, Any, Callable, Awaitable, Tuple, Union
 from enum import Enum

 from pathlib import Path as _Path
@@ -981,7 +981,7 @@ def coerce_plaintext_gateway_command(event: "MessageEvent") -> None:
        return


-@dataclass 
+@dataclass
 class SendResult:
    """Result of sending a message."""
    success: bool
@@ -991,6 +991,45 @@ class SendResult:
    retryable: bool = False  # True for transient connection errors — base will retry automatically


+class EphemeralReply(str):
+    """System-notice reply that auto-deletes after a TTL.
+
+    Slash-command handlers in ``gateway/run.py`` can return this wrapper
+    instead of a plain string to request that the reply message be deleted
+    after ``ttl_seconds`` on platforms that support ``delete_message``.
+
+    Subclassing ``str`` keeps the wrapper transparent to anything that
+    treats handler return values as text (existing tests use ``in`` /
+    ``startswith`` / equality; the ``_process_message_background`` pipeline
+    extracts attachments from the string content).  ``isinstance(r,
+    EphemeralReply)`` still distinguishes ephemeral replies from plain
+    strings so the send path can schedule deletion.
+
+    Platforms that don't override :meth:`BasePlatformAdapter.delete_message`
+    silently ignore the TTL — the message is sent normally and left in
+    place.  When ``ttl_seconds`` is ``None``, the pipeline uses the
+    configured ``display.ephemeral_system_ttl`` default.  A default of ``0``
+    disables auto-deletion globally, preserving prior behavior.
+    """
+
+    ttl_seconds: Optional[int]
+
+    def __new__(cls, text: str, ttl_seconds: Optional[int] = None):
+        instance = super().__new__(cls, text)
+        instance.ttl_seconds = ttl_seconds
+        return instance
+
+    @property
+    def text(self) -> str:
+        """Return the underlying text.
+
+        Provided for call sites that want an explicit string conversion,
+        though ``str(reply)`` and using ``reply`` directly where a string
+        is expected both work identically.
+        """
+        return str.__str__(self)
+
+
 def merge_pending_message_event(
    pending_messages: Dict[str, MessageEvent],
    session_key: str,
@@ -1034,6 +1073,11 @@ def merge_pending_message_event(
                    existing.text = event.text
            if existing_is_photo or incoming_is_photo:
                existing.message_type = MessageType.PHOTO
+            elif (
+                getattr(existing, "message_type", None) == MessageType.TEXT
+                and event.message_type != MessageType.TEXT
+            ):
+                existing.message_type = event.message_type
            return

        if (
@@ -1068,8 +1112,10 @@ _RETRYABLE_ERROR_PATTERNS = (
 )


-# Type for message handlers
-MessageHandler = Callable[[MessageEvent], Awaitable[Optional[str]]]
+# Type for message handlers.  Handlers may return a plain string (normal
+# reply), an ``EphemeralReply`` to opt the reply into auto-deletion, or
+# ``None`` when the response was already delivered (e.g. via streaming).
+MessageHandler = Callable[[MessageEvent], Awaitable[Optional[Union[str, "EphemeralReply"]]]]


 def resolve_channel_prompt(
@@ -1454,6 +1500,64 @@ class BasePlatformAdapter(ABC):
        """
        return False

+    def _get_ephemeral_system_ttl_default(self) -> int:
+        """Read ``display.ephemeral_system_ttl`` from config.
+
+        Returns the TTL in seconds to use when an :class:`EphemeralReply`
+        does not specify one explicitly.  ``0`` (the default) disables
+        auto-deletion.  Non-fatal if config is unreadable.
+        """
+        try:
+            from hermes_cli.config import load_config as _load_config
+        except Exception:
+            return 0
+        try:
+            cfg = _load_config()
+        except Exception:
+            return 0
+        display = cfg.get("display", {}) if isinstance(cfg, dict) else {}
+        if not isinstance(display, dict):
+            return 0
+        raw = display.get("ephemeral_system_ttl", 0)
+        try:
+            return int(raw)
+        except (TypeError, ValueError):
+            return 0
+
+    def _schedule_ephemeral_delete(
+        self,
+        chat_id: str,
+        message_id: str,
+        ttl_seconds: int,
+    ) -> None:
+        """Spawn a detached task that deletes ``message_id`` after ``ttl_seconds``.
+
+        Best-effort — failures (gateway restart, permission denied, message
+        too old for Telegram's 48h window) are swallowed at debug level.
+        Does not block the caller.
+        """
+
+        async def _run_delete() -> None:
+            try:
+                await asyncio.sleep(max(1, int(ttl_seconds)))
+                await self.delete_message(chat_id=chat_id, message_id=message_id)
+            except asyncio.CancelledError:
+                raise
+            except Exception as e:
+                logger.debug(
+                    "[%s] Ephemeral delete failed for %s/%s: %s",
+                    self.name, chat_id, message_id, e,
+                )
+
+        coro = _run_delete()
+        try:
+            asyncio.create_task(coro)
+        except RuntimeError:
+            # No running loop (e.g. unit tests that never reach the async
+            # path).  Close the coroutine cleanly so Python doesn't warn
+            # about it never being awaited, then drop silently.
+            coro.close()
+
    async def send_slash_confirm(
        self,
        chat_id: str,
@@ -1489,6 +1593,26 @@ class BasePlatformAdapter(ABC):
        """
        return SendResult(success=False, error="Not supported")

+    async def send_private_notice(
+        self,
+        chat_id: str,
+        user_id: Optional[str],
+        content: str,
+        reply_to: Optional[str] = None,
+        metadata: Optional[Dict[str, Any]] = None,
+    ) -> SendResult:
+        """Send a notice privately when the platform supports it.
+
+        The default implementation falls back to a normal send so callers can
+        use one code path across platforms.
+        """
+        return await self.send(
+            chat_id=chat_id,
+            content=content,
+            reply_to=reply_to,
+            metadata=metadata,
+        )
+
    async def send_typing(self, chat_id: str, metadata=None) -> None:
        """
        Send a typing indicator.
@@ -2043,6 +2167,28 @@ class BasePlatformAdapter(ABC):
        lowered = error.lower()
        return "timed out" in lowered or "readtimeout" in lowered or "writetimeout" in lowered

+    def _unwrap_ephemeral(self, response: Any) -> Tuple[Optional[str], int]:
+        """Unwrap a handler response into (text, ttl_seconds).
+
+        Accepts a plain string, ``None``, or an :class:`EphemeralReply`.
+        Returns ``(text, ttl)`` where ``ttl > 0`` means the caller should
+        schedule a deletion via :meth:`_schedule_ephemeral_delete` after
+        the send succeeds.  ``ttl`` is forced to 0 when the adapter
+        doesn't override :meth:`delete_message` so non-supporting
+        platforms silently degrade to normal sends.
+        """
+        if isinstance(response, EphemeralReply):
+            ttl = response.ttl_seconds
+            if ttl is None:
+                try:
+                    ttl = int(self._get_ephemeral_system_ttl_default())
+                except Exception:
+                    ttl = 0
+            if ttl and ttl > 0 and type(self).delete_message is BasePlatformAdapter.delete_message:
+                ttl = 0
+            return response.text, int(ttl or 0)
+        return response, 0
+
    async def _send_with_retry(
        self,
        chat_id: str,
@@ -2343,20 +2489,45 @@ class BasePlatformAdapter(ABC):

        try:
            response = await self._message_handler(event)
-            # Old adapter task (if any) is cancelled AFTER the runner has
-            # fully handled the command — keeps ordering deterministic.
+            _text, _eph_ttl = self._unwrap_ephemeral(response)
+            # Send the response BEFORE cancelling the old task so the send
+            # cannot be affected by task-cancellation side effects (race
+            # condition fix — issue #18912).  Previously the send happened
+            # after cancel_session_processing, which could silently drop the
+            # "/new" confirmation when an agent was actively running.
+            if _text:
+                logger.info(
+                    "[%s] Sending command '/%s' response (%d chars) to %s",
+                    self.name,
+                    cmd,
+                    len(_text),
+                    event.source.chat_id,
+                )
+                _r = await self._send_with_retry(
+                    chat_id=event.source.chat_id,
+                    content=_text,
+                    reply_to=(
+                        event.reply_to_message_id
+                        if event.source.platform == Platform.FEISHU
+                        and event.source.thread_id
+                        and event.reply_to_message_id
+                        else event.message_id
+                    ),
+                    metadata=thread_meta,
+                )
+                if _eph_ttl > 0 and _r.success and _r.message_id:
+                    self._schedule_ephemeral_delete(
+                        chat_id=event.source.chat_id,
+                        message_id=_r.message_id,
+                        ttl_seconds=_eph_ttl,
+                    )
+            # Old adapter task (if any) is cancelled AFTER the response has
+            # been sent — keeps ordering deterministic and avoids the race.
            await self.cancel_session_processing(
                session_key,
                release_guard=False,
                discard_pending=False,
            )
-            if response:
-                await self._send_with_retry(
-                    chat_id=event.source.chat_id,
-                    content=response,
-                    reply_to=event.message_id,
-                    metadata=thread_meta,
-                )
        except Exception:
            # On failure, restore the original guard if one still exists so
            # we don't leave the session in a half-reset state.
@@ -2436,13 +2607,26 @@ class BasePlatformAdapter(ABC):
                try:
                    _thread_meta = {"thread_id": event.source.thread_id} if event.source.thread_id else None
                    response = await self._message_handler(event)
-                    if response:
-                        await self._send_with_retry(
+                    _text, _eph_ttl = self._unwrap_ephemeral(response)
+                    if _text:
+                        _r = await self._send_with_retry(
                            chat_id=event.source.chat_id,
-                            content=response,
-                            reply_to=event.message_id,
+                            content=_text,
+                            reply_to=(
+                                event.reply_to_message_id
+                                if event.source.platform == Platform.FEISHU
+                                and event.source.thread_id
+                                and event.reply_to_message_id
+                                else event.message_id
+                            ),
                            metadata=_thread_meta,
                        )
+                        if _eph_ttl > 0 and _r.success and _r.message_id:
+                            self._schedule_ephemeral_delete(
+                                chat_id=event.source.chat_id,
+                                message_id=_r.message_id,
+                                ttl_seconds=_eph_ttl,
+                            )
                except Exception as e:
                    logger.error("[%s] Command '/%s' dispatch failed: %s", self.name, cmd, e, exc_info=True)
                return
@@ -2516,7 +2700,6 @@ class BasePlatformAdapter(ABC):
        # Fall back to a new Event only if the entry was removed externally.
        interrupt_event = self._active_sessions.get(session_key) or asyncio.Event()
        self._active_sessions[session_key] = interrupt_event
-        callback_generation = getattr(interrupt_event, "_hermes_run_generation", None)
        
        # Start continuous typing indicator (refreshes every 2 seconds)
        _thread_metadata = {"thread_id": event.source.thread_id} if event.source.thread_id else None
@@ -2549,7 +2732,16 @@ class BasePlatformAdapter(ABC):

            # Call the handler (this can take a while with tool calls)
            response = await self._message_handler(event)
-            
+
+            # Slash-command handlers may return an EphemeralReply sentinel to
+            # request that their reply message auto-delete after a TTL (used
+            # for system notices like "✨ New session started!" that the user
+            # doesn't need to keep in the thread).  Unwrap here so all the
+            # downstream extract_media / text-processing logic sees a plain
+            # string, and keep the TTL (already forced to 0 on platforms
+            # without delete support) so the post-send block can schedule it.
+            response, _ephemeral_ttl = self._unwrap_ephemeral(response)
+
            # Send response if any.  A None/empty response is normal when
            # streaming already delivered the text (already_sent=True) or
            # when the message was queued behind an active agent.  Log at
@@ -2630,14 +2822,34 @@ class BasePlatformAdapter(ABC):
                # Send the text portion
                if text_content:
                    logger.info("[%s] Sending response (%d chars) to %s", self.name, len(text_content), event.source.chat_id)
+                    _reply_anchor = (
+                        event.reply_to_message_id
+                        if event.source.platform == Platform.FEISHU and event.source.thread_id and event.reply_to_message_id
+                        else event.message_id
+                    )
                    result = await self._send_with_retry(
                        chat_id=event.source.chat_id,
                        content=text_content,
-                        reply_to=event.message_id,
+                        reply_to=_reply_anchor,
                        metadata=_thread_metadata,
                    )
                    _record_delivery(result)

+                    # Schedule auto-deletion of system-notice replies.
+                    # Detached so the handler returns immediately; errors
+                    # (permission denied, message too old) are swallowed.
+                    if (
+                        _ephemeral_ttl
+                        and _ephemeral_ttl > 0
+                        and result.success
+                        and result.message_id
+                    ):
+                        self._schedule_ephemeral_delete(
+                            chat_id=event.source.chat_id,
+                            message_id=result.message_id,
+                            ttl_seconds=_ephemeral_ttl,
+                        )
+
                # Human-like pacing delay between text and media
                human_delay = self._get_human_delay()

@@ -2815,7 +3027,20 @@ class BasePlatformAdapter(ABC):
        finally:
            # Fire any one-shot post-delivery callback registered for this
            # session (e.g. deferred background-review notifications).
-            _callback_generation = callback_generation
+            #
+            # Snapshot the callback generation HERE (after the agent has run),
+            # not at the top of this task.  _hermes_run_generation is set on
+            # the interrupt event by GatewayRunner._bind_adapter_run_generation
+            # during _handle_message_with_agent — which happens DURING the
+            # self._message_handler(event) await above.  Snapshotting earlier
+            # always captured None, which bypassed the generation-ownership
+            # check in pop_post_delivery_callback and let stale runs fire a
+            # fresher run's callbacks.
+            _callback_generation = getattr(
+                interrupt_event,
+                "_hermes_run_generation",
+                None,
+            )
            if hasattr(self, "pop_post_delivery_callback"):
                _post_cb = self.pop_post_delivery_callback(
                    session_key,
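Reviewer note: the `EphemeralReply` / `_unwrap_ephemeral` pair introduced in this file relies on a `str` subclass staying transparent to text-handling code while still being detectable with `isinstance`. A minimal sketch of that pattern follows; `EphemeralText` and `unwrap` are hypothetical stand-ins, and `default_ttl` and `can_delete` replace the config lookup and the `delete_message` override check from the diff.

```python
from typing import Optional, Tuple


class EphemeralText(str):
    """Hypothetical stand-in for the diff's ``EphemeralReply`` wrapper."""

    ttl_seconds: Optional[int]

    def __new__(cls, text: str, ttl_seconds: Optional[int] = None):
        instance = super().__new__(cls, text)
        instance.ttl_seconds = ttl_seconds
        return instance


def unwrap(
    response: Optional[str], default_ttl: int, can_delete: bool
) -> Tuple[Optional[str], int]:
    """Return ``(text, ttl)``; ``ttl > 0`` means a deletion should be scheduled."""
    if isinstance(response, EphemeralText):
        ttl = response.ttl_seconds
        if ttl is None:
            ttl = default_ttl
        if not can_delete:
            # Platforms without delete support degrade to a normal send.
            ttl = 0
        return str(response), int(ttl or 0)
    return response, 0


if __name__ == "__main__":
    reply = EphemeralText("New session started", ttl_seconds=5)
    print(reply.startswith("New"), unwrap(reply, default_ttl=0, can_delete=True))
```

Because the wrapper is a `str`, existing callers that use `in`, `startswith`, or equality keep working unchanged; only the send path needs to know about the TTL.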
@@ -162,7 +162,9 @@ class BlueBubblesAdapter(BasePlatformAdapter):
            return False
        from aiohttp import web

-        self.client = httpx.AsyncClient(timeout=30.0)
+        # Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
+        from gateway.platforms._http_client_limits import platform_httpx_limits
+        self.client = httpx.AsyncClient(timeout=30.0, limits=platform_httpx_limits())
        try:
            await self._api_get("/api/v1/ping")
            info = await self._api_get("/api/v1/server/info")
@@ -228,7 +228,11 @@ class DingTalkAdapter(BasePlatformAdapter):
            return False

        try:
-            self._http_client = httpx.AsyncClient(timeout=30.0)
+            # Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
+            from gateway.platforms._http_client_limits import platform_httpx_limits
+            self._http_client = httpx.AsyncClient(
+                timeout=30.0, limits=platform_httpx_limits(),
+            )

            credential = dingtalk_stream.Credential(
                self._client_id, self._client_secret
@@ -497,6 +497,7 @@ class DiscordAdapter(BasePlatformAdapter):
        self._ready_event = asyncio.Event()
        self._allowed_user_ids: set = set()  # For button approval authorization
        self._allowed_role_ids: set = set()  # For DISCORD_ALLOWED_ROLES filtering
+        self.gateway_runner = None  # Set by gateway/run.py for cross-platform delivery
        # Voice channel state (per-guild)
        self._voice_clients: Dict[int, Any] = {}  # guild_id -> VoiceClient
        self._voice_locks: Dict[int, asyncio.Lock] = {}  # guild_id -> serialize join/leave
@@ -613,6 +614,21 @@ class DiscordAdapter(BasePlatformAdapter):
            # so LLM output or echoed user content can't ping the whole
            # server; override per DISCORD_ALLOW_MENTION_* env vars or the
            # discord.allow_mentions.* block in config.yaml.
+
+            # Close any existing client to prevent zombie websocket connections
+            # on reconnect (see #18187). Without this, the old client remains
+            # connected to Discord gateway and both fire on_message, causing
+            # double responses.
+            if self._client is not None:
+                try:
+                    if not self._client.is_closed():
+                        await self._client.close()
+                except Exception:
+                    logger.debug("[%s] Failed to close previous Discord client", self.name)
+                finally:
+                    self._client = None
+                    self._ready_event.clear()
+
            self._client = commands.Bot(
                command_prefix="!",  # Not really used, we handle raw messages
                intents=intents,
@@ -704,11 +720,22 @@ class DiscordAdapter(BasePlatformAdapter):
                        return
                    # If humans are mentioned but we're not → not for us
                    # (preserves old DISCORD_IGNORE_NO_MENTION=true behavior)
+                    # EXCEPT in free-response channels where the bot should
+                    # answer regardless of who is mentioned.
                    _ignore_no_mention = os.getenv(
                        "DISCORD_IGNORE_NO_MENTION", "true"
                    ).lower() in ("true", "1", "yes")
                    if _ignore_no_mention and not _self_mentioned and not _other_bots_mentioned:
-                        return
+                        _channel_id = str(message.channel.id)
+                        _parent_id = None
+                        if hasattr(message.channel, "parent_id") and message.channel.parent_id:
+                            _parent_id = str(message.channel.parent_id)
+                        _free_channels = adapter_self._discord_free_response_channels()
+                        _channel_ids = {_channel_id}
+                        if _parent_id:
+                            _channel_ids.add(_parent_id)
+                        if "*" not in _free_channels and not (_channel_ids & _free_channels):
+                            return

                await self._handle_message(message)

@@ -1914,6 +1941,225 @@ class DiscordAdapter(BasePlatformAdapter):
                            return True
        return False

+    # ── Slash command authorization ─────────────────────────────────────
+    # Slash commands (``_run_simple_slash`` and ``_handle_thread_create_slash``)
+    # are a separate Discord interaction surface from regular messages and
+    # historically ran with NO authorization check — bypassing every gate
+    # ``on_message`` enforces (DISCORD_ALLOWED_USERS, DISCORD_ALLOWED_ROLES,
+    # DISCORD_ALLOWED_CHANNELS, DISCORD_IGNORED_CHANNELS). Any guild member
+    # could invoke ``/background``, ``/restart``, ``/sethome``, etc. as the
+    # operator. ``_check_slash_authorization`` mirrors the on_message gates
+    # one-for-one so the slash surface honors the same trust boundary.
+    #
+    # By design, this is a no-op for deployments with no allowlist env vars
+    # set — ``_is_allowed_user`` returns True and the channel checks early-out
+    # — preserving the existing "single-tenant, all guild members trusted"
+    # default. Deployments that DO set any DISCORD_ALLOWED_* var get slash
+    # parity with on_message.
+
+    def _evaluate_slash_authorization(
+        self, interaction: "discord.Interaction",
+    ) -> Tuple[bool, Optional[str]]:
+        """Evaluate slash authorization without producing any response.
+
+        Returns ``(allowed, reason)``. ``reason`` is populated only when
+        ``allowed`` is False. This is the shared core used by both the
+        responding wrapper (``_check_slash_authorization``) and side-effect-
+        free callers like the ``/skill`` autocomplete callback, which must
+        return an empty list for unauthorized users instead of leaking an
+        ephemeral rejection per-keystroke.
+
+        Fail-closed semantics for malformed payloads: when an allowlist is
+        configured but the interaction is missing the data needed to
+        evaluate it (no channel id with channel policy active, no user
+        with user/role policy active), the gate REJECTS rather than
+        falling through. Without these guards a guild interaction that
+        happens to deserialize without a channel id would silently bypass
+        ``DISCORD_ALLOWED_CHANNELS`` and a payload missing ``user`` would
+        raise ``AttributeError`` in the user check below, surfacing as
+        an opaque interaction failure rather than a clean rejection.
+        """
+        chan_obj = getattr(interaction, "channel", None)
+        in_dm = isinstance(chan_obj, discord.DMChannel) if chan_obj is not None else False
+
+        # ── Channel scope (mirrors on_message lines 3374-3388) ──
+        # DMs aren't channel-gated — DMs follow on_message's DM lockdown
+        # path which has its own user-allowlist enforcement.
+        if not in_dm:
+            chan_id_raw = getattr(interaction, "channel_id", None) or getattr(
+                chan_obj, "id", None,
+            )
+            channel_ids: set = set()
+            if chan_id_raw is not None:
+                channel_ids.add(str(chan_id_raw))
+                # Mirror on_message: also test the parent channel for threads
+                # so per-channel allow/deny lists work consistently.
+                if isinstance(chan_obj, discord.Thread):
+                    parent_id = self._get_parent_channel_id(chan_obj)
+                    if parent_id:
+                        channel_ids.add(str(parent_id))
+
+            allowed_raw = os.getenv("DISCORD_ALLOWED_CHANNELS", "")
+            if allowed_raw:
+                allowed = {c.strip() for c in allowed_raw.split(",") if c.strip()}
+                if "*" not in allowed:
+                    if not channel_ids:
+                        # Channel policy is configured but the interaction
+                        # has no resolvable channel id. Fail closed.
+                        return (
+                            False,
+                            "channel id missing with DISCORD_ALLOWED_CHANNELS configured",
+                        )
+                    if not (channel_ids & allowed):
+                        return (False, "channel not in DISCORD_ALLOWED_CHANNELS")
+
+            # Ignored beats allowed: even when a thread's parent channel
+            # is on the allowlist, an explicit DISCORD_IGNORED_CHANNELS
+            # entry on the thread or its parent rejects the interaction.
+            ignored_raw = os.getenv("DISCORD_IGNORED_CHANNELS", "")
+            if ignored_raw and channel_ids:
+                ignored = {c.strip() for c in ignored_raw.split(",") if c.strip()}
+                if "*" in ignored or (channel_ids & ignored):
+                    return (False, "channel in DISCORD_IGNORED_CHANNELS")
+
+        # ── User / role allowlist (mirrors the on_message gate) ──
+        user = getattr(interaction, "user", None)
+        allowed_users = getattr(self, "_allowed_user_ids", set()) or set()
+        allowed_roles = getattr(self, "_allowed_role_ids", set()) or set()
+        if user is None or getattr(user, "id", None) is None:
+            # No identifiable user. With any user/role allowlist
+            # configured, fail closed rather than raise AttributeError
+            # on ``user.id`` below. With no allowlist configured, keep
+            # the existing "no allowlist = everyone" behavior.
+            if allowed_users or allowed_roles:
+                return (False, "missing interaction.user with allowlist configured")
+            return (True, None)
+
+        user_id = str(user.id)
+        if not self._is_allowed_user(user_id, author=user):
+            return (
+                False,
+                "user not in DISCORD_ALLOWED_USERS / DISCORD_ALLOWED_ROLES",
+            )
+
+        return (True, None)
+
+    async def _check_slash_authorization(
+        self, interaction: "discord.Interaction", command_text: str,
+    ) -> bool:
+        """Mirror on_message's user/role/channel gates onto a slash invocation.
+
+        Returns True to proceed. Returns False *after* sending an ephemeral
+        rejection, logging a warning, and scheduling a cross-platform admin
+        alert — the caller must stop on False (the interaction has already
+        been responded to).
+        """
+        allowed, reason = self._evaluate_slash_authorization(interaction)
+        if allowed:
+            return True
+        return await self._reject_slash(
+            interaction, command_text, reason=reason or "unauthorized",
+        )
+
+    async def _reject_slash(
+        self, interaction: "discord.Interaction", command_text: str, *, reason: str,
+    ) -> bool:
+        """Send ephemeral reject + log warning + schedule admin alert. Returns False.
+
+        Tolerates a missing ``interaction.user`` -- the fail-closed branch
+        in ``_evaluate_slash_authorization`` deliberately routes here for
+        malformed payloads (no user) when an allowlist is configured, and
+        ``str(interaction.user.id)`` would raise AttributeError before the
+        ephemeral rejection could be sent.
+        """
+        user = getattr(interaction, "user", None)
+        if user is not None:
+            user_id = str(getattr(user, "id", "?"))
+            user_name = getattr(user, "name", "?")
+        else:
+            user_id = "?"
+            user_name = "?"
+        chan_id = getattr(interaction, "channel_id", None) or getattr(
+            getattr(interaction, "channel", None), "id", None,
+        )
+        guild_id = getattr(interaction, "guild_id", None)
+
+        logger.warning(
+            "[Discord] Unauthorized slash attempt: user=%s id=%s channel=%s "
+            "guild=%s cmd=%r reason=%r",
+            user_name, user_id, chan_id, guild_id, command_text, reason,
+        )
+
+        try:
+            await interaction.response.send_message(
+                "You're not authorized to use this command.",
+                ephemeral=True,
+            )
+        except Exception as e:
+            # Interaction may already be responded to (e.g. caller deferred
+            # before the auth check, or Discord retried). Best-effort only.
+            logger.debug("[Discord] Could not send unauthorized ephemeral: %s", e)
+
+        # Fire-and-forget: don't block the interaction handler on admin-alert I/O.
+        try:
+            asyncio.create_task(self._notify_unauthorized_slash(
+                user_name, user_id, chan_id, guild_id, command_text, reason,
+            ))
+        except Exception as e:
+            logger.debug("[Discord] Could not schedule admin notify task: %s", e)
+
+        return False
+
+    async def _notify_unauthorized_slash(
+        self, user_name: str, user_id: str, chan_id, guild_id,
+        command_text: str, reason: str,
+    ) -> None:
+        """Best-effort cross-platform alert to the gateway operator.
+
+        Tries TELEGRAM first (most operators set TELEGRAM_HOME_CHANNEL),
+        then SLACK. Silently no-ops if no other platform is configured
+        with a home channel.
+
+        A soft send failure -- adapter.send() returning a result with
+        ``success=False`` rather than raising -- continues the fallback
+        chain. Treating a SendResult(success=False) as delivered would
+        mean a Telegram outage that the adapter politely surfaces (e.g.
+        rate-limit, auth failure) silently swallows the alert without
+        attempting Slack. Hard exceptions still take the same path via
+        the except branch below.
+        """
+        runner = getattr(self, "gateway_runner", None)
+        if not runner:
+            return
+        for target in (Platform.TELEGRAM, Platform.SLACK):
+            try:
+                adapter = runner.adapters.get(target)
+                if not adapter:
+                    continue
+                home = runner.config.get_home_channel(target)
+                if not home or not getattr(home, "chat_id", None):
+                    continue
+                msg = (
+                    "⚠️ Unauthorized Discord slash attempt\n"
+                    f"User: {user_name} ({user_id})\n"
+                    f"Channel: {chan_id} (guild {guild_id})\n"
+                    f"Command: {command_text}\n"
+                    f"Reason: {reason}"
+                )
+                result = await adapter.send(str(home.chat_id), msg)
+                # Stop after the first send that does not report failure.
+                # SendResult(success=False) -> continue to the next platform.
+                if getattr(result, "success", None) is False:
+                    logger.debug(
+                        "[Discord] Admin notify via %s returned success=False"
+                        " (error=%r); falling through",
+                        target, getattr(result, "error", None),
+                    )
+                    continue
+                return
+            except Exception as e:
+                logger.debug("[Discord] Admin notify via %s failed: %s", target, e)
+
    async def send_image_file(
        self,
        chat_id: str,
@@ -2301,6 +2547,11 @@ class DiscordAdapter(BasePlatformAdapter):
        except Exception:
            pass  # logging must never block command dispatch

+        # Auth gate — must run before defer() so an ephemeral rejection can
+        # be delivered on the still-unresponded interaction.
+        if not await self._check_slash_authorization(interaction, command_text):
+            return
+
        await interaction.response.defer(ephemeral=True)
        event = self._build_slash_event(interaction, command_text)
        await self.handle_message(event)
@@ -2445,7 +2696,8 @@ class DiscordAdapter(BasePlatformAdapter):
            message: str = "",
            auto_archive_duration: int = 1440,
        ):
-            await interaction.response.defer(ephemeral=True)
+            # defer() is performed inside the handler *after* the auth gate
+            # so a rejected invoker can receive an ephemeral rejection.
            await self._handle_thread_create_slash(interaction, name, message, auto_archive_duration)

        @tree.command(name="queue", description="Queue a prompt for the next turn (doesn't interrupt)")
@@ -2566,6 +2818,54 @@ class DiscordAdapter(BasePlatformAdapter):
        # supporting up to 25 categories × 25 skills = 625 skills.
        self._register_skill_group(tree)

+        # Optional defense-in-depth: hide every slash command from non-admin
+        # guild members in Discord's slash picker. Server-side authorization
+        # (``_check_slash_authorization``) is the actual gate; this is purely
+        # UX so users don't see commands they can't invoke. Off by default
+        # to preserve the slash UX for deployments that intentionally allow
+        # everyone in the guild.
+        if os.getenv("DISCORD_HIDE_SLASH_COMMANDS", "false").strip().lower() in (
+            "true", "1", "yes", "on",
+        ):
+            self._apply_owner_only_visibility(tree)
+
+    def _apply_owner_only_visibility(self, tree) -> None:
+        """Set default_member_permissions=0 on every registered slash command.
+
+        An empty permission set (``Permissions(0)``) does not mean "open to
+        everyone": Discord disables the command by default for every guild
+        member except those with the Administrator permission. Server admins
+        can re-grant per user/role via Server Settings → Integrations →
+        <bot> → Permissions.
+
+        Authoritative gate is ``_check_slash_authorization`` on every
+        invocation, which catches stale clients, role grants made by
+        mistake, and direct API calls bypassing Discord's UI hide.
+        """
+        try:
+            no_perms = discord.Permissions(0)
+        except Exception as e:
+            logger.warning(
+                "[Discord] _apply_owner_only_visibility: cannot build Permissions(0): %s",
+                e,
+            )
+            return
+        applied = 0
+        for cmd in tree.get_commands():
+            try:
+                cmd.default_permissions = no_perms
+                applied += 1
+            except Exception as e:
+                logger.debug(
+                    "[Discord] Could not set default_permissions on %r: %s",
+                    getattr(cmd, "name", "?"), e,
+                )
+        logger.info(
+            "[Discord] Hid %d slash command(s) from non-admin guild members "
+            "(opt-in defense in depth via DISCORD_HIDE_SLASH_COMMANDS).",
+            applied,
+        )
+
    def _register_skill_group(self, tree) -> None:
        """Register a single ``/skill`` command with autocomplete on the name.

@@ -2584,40 +2884,32 @@ class DiscordAdapter(BasePlatformAdapter):
        hidden skills. The slash picker also becomes more discoverable —
        Discord live-filters by the user's typed prefix against both the
        skill name and its description.
+
+        The entries list and lookup dict are stored on ``self`` rather
+        than captured in closure variables so :meth:`refresh_skill_group`
+        can repopulate them when the user runs ``/reload-skills`` without
+        needing to touch the Discord slash-command tree or trigger a
+        ``tree.sync()`` call.
        """
        try:
-            from hermes_cli.commands import discord_skill_commands_by_category
-
            existing_names = set()
            try:
                existing_names = {cmd.name for cmd in tree.get_commands()}
            except Exception:
                pass

-            # Reuse the existing collector for consistent filtering
-            # (per-platform disabled, hub-excluded, name clamping), then
-            # flatten — the category grouping was only useful for the
-            # nested layout.
-            categories, uncategorized, hidden = discord_skill_commands_by_category(
-                reserved_names=existing_names,
-            )
-            entries: list[tuple[str, str, str]] = list(uncategorized)
-            for cat_skills in categories.values():
-                entries.extend(cat_skills)
+            # Populate the instance-level entries/lookup so the
+            # autocomplete + handler callbacks below always read the
+            # freshest state. refresh_skill_group() re-runs the same
+            # collector and mutates these two attributes in place.
+            self._skill_entries: list[tuple[str, str, str]] = []
+            self._skill_lookup: dict[str, tuple[str, str]] = {}
+            self._skill_group_reserved_names: set[str] = set(existing_names)
+            self._refresh_skill_catalog_state()

-            if not entries:
+            if not self._skill_entries:
                return

-            # Stable alphabetical order so the autocomplete suggestion
-            # list is predictable across restarts.
-            entries.sort(key=lambda t: t[0])
-
-            # name -> (description, cmd_key) — used by both the autocomplete
-            # callback and the handler for O(1) dispatch.
-            skill_lookup: dict[str, tuple[str, str]] = {
-                n: (d, k) for n, d, k in entries
-            }
-
            async def _autocomplete_name(
                interaction: "discord.Interaction", current: str,
            ) -> list:
@@ -2627,10 +2919,29 @@ class DiscordAdapter(BasePlatformAdapter):
                "/skill pdf" surfaces skills whose description mentions
                PDFs even if the name doesn't. Discord caps this list at
                25 entries per query.
+
+                Authorization: a quiet pre-check evaluates the slash
+                allowlists and returns ``[]`` for unauthorized users so
+                the installed skill catalog is not leaked to anyone who
+                can see the command in the picker. Returning a generic
+                empty list here is intentional — sending a per-keystroke
+                ephemeral rejection would produce a barrage of error
+                popups during typing.
+
+                Reads ``self._skill_entries`` so a ``/reload-skills`` run
+                since process start shows up on the very next keystroke.
                """
+                try:
+                    allowed, _reason = self._evaluate_slash_authorization(interaction)
+                except Exception:
+                    # Defensive: never raise from autocomplete. Fail
+                    # closed by returning an empty suggestion list.
+                    return []
+                if not allowed:
+                    return []
                q = (current or "").strip().lower()
                choices: list = []
-                for name, desc, _key in entries:
+                for name, desc, _key in self._skill_entries:
                    if not q or q in name.lower() or (desc and q in desc.lower()):
                        if desc:
                            label = f"{name} — {desc}"
@@ -2654,7 +2965,13 @@ class DiscordAdapter(BasePlatformAdapter):
            async def _skill_handler(
                interaction: "discord.Interaction", name: str, args: str = "",
            ):
-                entry = skill_lookup.get(name)
+                # Authorize BEFORE any skill lookup so that known and
+                # unknown skill names produce identical rejections for
+                # unauthorized users (no probing the installed catalog
+                # via "Unknown skill: <name>" responses).
+                if not await self._check_slash_authorization(interaction, "/skill"):
+                    return
+                entry = self._skill_lookup.get(name)
                if not entry:
                    await interaction.response.send_message(
                        f"Unknown skill: `{name}`. Start typing for "
@@ -2676,16 +2993,74 @@ class DiscordAdapter(BasePlatformAdapter):

            logger.info(
                "[%s] Registered /skill command with %d skill(s) via autocomplete",
-                self.name, len(entries),
+                self.name, len(self._skill_entries),
            )
-            if hidden:
+            if self._skill_group_hidden_count:
                logger.info(
                    "[%s] %d skill(s) filtered out of /skill (name clamp / reserved)",
-                    self.name, hidden,
+                    self.name, self._skill_group_hidden_count,
                )
        except Exception as exc:
            logger.warning("[%s] Failed to register /skill command: %s", self.name, exc)

+    def _refresh_skill_catalog_state(self) -> None:
+        """Re-scan disk for skills and repopulate ``self._skill_entries``.
+
+        Called once from :meth:`_register_skill_group` at startup and
+        again from :meth:`refresh_skill_group` whenever the user runs
+        ``/reload-skills``. No Discord API calls are made — autocomplete
+        and the handler both read from these instance attributes
+        directly, so an in-place mutation is sufficient.
+        """
+        from hermes_cli.commands import discord_skill_commands_by_category
+
+        reserved = getattr(self, "_skill_group_reserved_names", set())
+        categories, uncategorized, hidden = discord_skill_commands_by_category(
+            reserved_names=set(reserved),
+        )
+        entries: list[tuple[str, str, str]] = list(uncategorized)
+        for cat_skills in categories.values():
+            entries.extend(cat_skills)
+        # Stable alphabetical order so the autocomplete suggestion
+        # list is predictable across restarts.
+        entries.sort(key=lambda t: t[0])
+
+        self._skill_entries = entries
+        self._skill_lookup = {n: (d, k) for n, d, k in entries}
+        self._skill_group_hidden_count = hidden
+
+    def refresh_skill_group(self) -> tuple[int, int]:
+        """Rescan skills and update the live ``/skill`` autocomplete state.
+
+        Invoked by :meth:`gateway.run.GatewayOrchestrator._handle_reload_skills_command`
+        after :func:`agent.skill_commands.reload_skills` has refreshed
+        the in-process skill-command registry. Without this call, the
+        ``/skill`` autocomplete dropdown keeps showing the list captured
+        at process start — new skills stay invisible and deleted skills
+        return an "Unknown skill" error when clicked.
+
+        Because autocomplete options are fetched dynamically by Discord,
+        we only need to mutate the entries/lookup attributes read by the
+        callbacks — no ``tree.sync()`` is required.
+
+        Returns ``(new_count, hidden_count)``.
+        """
+        try:
+            self._refresh_skill_catalog_state()
+        except Exception as exc:
+            logger.warning(
+                "[%s] Failed to refresh /skill autocomplete after reload: %s",
+                self.name, exc,
+            )
+            return (len(getattr(self, "_skill_entries", [])), 0)
+        logger.info(
+            "[%s] Refreshed /skill autocomplete: %d skill(s) available (%d filtered)",
+            self.name,
+            len(self._skill_entries),
+            self._skill_group_hidden_count,
+        )
+        return (len(self._skill_entries), self._skill_group_hidden_count)
+
    def _build_slash_event(self, interaction: discord.Interaction, text: str) -> MessageEvent:
        """Build a MessageEvent from a Discord slash command interaction."""
        is_dm = isinstance(interaction.channel, discord.DMChannel)
@@ -2743,6 +3118,9 @@ class DiscordAdapter(BasePlatformAdapter):
        auto_archive_duration: int = 1440,
    ) -> None:
        """Create a Discord thread from a slash command and start a session in it."""
+        if not await self._check_slash_authorization(interaction, "/thread"):
+            return
+        await interaction.response.defer(ephemeral=True)
        result = await self._create_thread(
            interaction,
            name=name,
@@ -2851,8 +3229,15 @@ class DiscordAdapter(BasePlatformAdapter):
            raw = os.getenv("DISCORD_FREE_RESPONSE_CHANNELS", "")
        if isinstance(raw, list):
            return {str(part).strip() for part in raw if str(part).strip()}
-        if isinstance(raw, str) and raw.strip():
-            return {part.strip() for part in raw.split(",") if part.strip()}
+        # Coerce non-list scalars (str/int/float) to str before splitting.
+        # YAML parses a bare numeric value such as
+        # `free_response_channels: 1491973769726791812` as int, which
+        # previously failed the isinstance(raw, str) check and silently
+        # returned an empty set. str() here accepts whatever scalar the YAML
+        # loader hands us without changing existing string/CSV semantics.
+        s = str(raw).strip() if raw is not None else ""
+        if s:
+            return {part.strip() for part in s.split(",") if part.strip()}
        return set()

    def _thread_parent_channel(self, channel: Any) -> Any:
@@ -3030,6 +3415,7 @@ class DiscordAdapter(BasePlatformAdapter):
            view = ExecApprovalView(
                session_key=session_key,
                allowed_user_ids=self._allowed_user_ids,
+                allowed_role_ids=self._allowed_role_ids,
            )

            msg = await channel.send(embed=embed, view=view)
@@ -3068,6 +3454,7 @@ class DiscordAdapter(BasePlatformAdapter):
                session_key=session_key,
                confirm_id=confirm_id,
                allowed_user_ids=self._allowed_user_ids,
+                allowed_role_ids=self._allowed_role_ids,
            )

            msg = await channel.send(embed=embed, view=view)
@@ -3078,6 +3465,7 @@ class DiscordAdapter(BasePlatformAdapter):
    async def send_update_prompt(
        self, chat_id: str, prompt: str, default: str = "",
        session_key: str = "",
+        metadata: Optional[Dict[str, Any]] = None,
    ) -> SendResult:
        """Send an interactive button-based update prompt (Yes / No).

@@ -3087,9 +3475,10 @@ class DiscordAdapter(BasePlatformAdapter):
        if not self._client or not DISCORD_AVAILABLE:
            return SendResult(success=False, error="Not connected")
        try:
-            channel = self._client.get_channel(int(chat_id))
+            target_id = (metadata or {}).get("thread_id") or chat_id
+            channel = self._client.get_channel(int(target_id))
            if not channel:
-                channel = await self._client.fetch_channel(int(chat_id))
+                channel = await self._client.fetch_channel(int(target_id))

            default_hint = f" (default: {default})" if default else ""
            embed = discord.Embed(
@@ -3100,6 +3489,7 @@ class DiscordAdapter(BasePlatformAdapter):
            view = UpdatePromptView(
                session_key=session_key,
                allowed_user_ids=self._allowed_user_ids,
+                allowed_role_ids=self._allowed_role_ids,
            )
            msg = await channel.send(embed=embed, view=view)
            return SendResult(success=True, message_id=str(msg.id))
@@ -3157,6 +3547,7 @@ class DiscordAdapter(BasePlatformAdapter):
                session_key=session_key,
                on_model_selected=on_model_selected,
                allowed_user_ids=self._allowed_user_ids,
+                allowed_role_ids=self._allowed_role_ids,
            )

            msg = await channel.send(embed=embed, view=view)
@@ -3417,7 +3808,7 @@ class DiscordAdapter(BasePlatformAdapter):
        if not is_thread and not isinstance(message.channel, discord.DMChannel):
            no_thread_channels_raw = os.getenv("DISCORD_NO_THREAD_CHANNELS", "")
            no_thread_channels = {ch.strip() for ch in no_thread_channels_raw.split(",") if ch.strip()}
-            skip_thread = bool(channel_ids & no_thread_channels) or is_free_channel
+            skip_thread = bool(channel_ids & no_thread_channels)
            auto_thread = os.getenv("DISCORD_AUTO_THREAD", "true").lower() in ("true", "1", "yes")
            is_reply_message = getattr(message, "type", None) == discord.MessageType.reply
            if auto_thread and not skip_thread and not is_voice_linked_channel and not is_reply_message:
@@ -3712,6 +4103,72 @@ class DiscordAdapter(BasePlatformAdapter):
 # Discord UI Components (outside the adapter class)
 # ---------------------------------------------------------------------------

+
+def _component_check_auth(
+    interaction,
+    allowed_user_ids: Optional[set],
+    allowed_role_ids: Optional[set],
+) -> bool:
+    """Shared user-or-role OR semantics for component view button clicks.
+
+    Mirrors ``DiscordAdapter._is_allowed_user`` / the slash and on_message
+    gates so every Discord interaction surface honors the same trust
+    boundary. Component views (ExecApprovalView, SlashConfirmView,
+    UpdatePromptView, ModelPickerView) used to receive only
+    ``allowed_user_ids``: in role-only deployments
+    (DISCORD_ALLOWED_ROLES set, DISCORD_ALLOWED_USERS empty) the user
+    set was empty and the legacy "no allowlist = allow everyone" branch
+    let any guild member click the buttons -- approving exec commands,
+    cancelling slash confirmations, switching the model.
+
+    Behavior:
+
+      - both allowlists empty -> allow (preserves existing no-allowlist
+        deployments, no regression)
+      - user is in user allowlist -> allow
+      - role allowlist set + user has a role in it -> allow
+      - role allowlist set + interaction.user has no resolvable
+        ``roles`` attribute (e.g. DM context with a role policy active)
+        -> reject (fail closed)
+      - otherwise -> reject
+    """
+    user_set = allowed_user_ids or set()
+    role_set = allowed_role_ids or set()
+    has_users = bool(user_set)
+    has_roles = bool(role_set)
+    if not has_users and not has_roles:
+        return True
+
+    user = getattr(interaction, "user", None)
+    if user is None:
+        return False
+
+    if has_users:
+        try:
+            uid = str(user.id)
+        except AttributeError:
+            uid = ""
+        if uid and uid in user_set:
+            return True
+
+    if has_roles:
+        roles_attr = getattr(user, "roles", None)
+        if roles_attr is None:
+            # Role policy is configured but the interaction doesn't
+            # carry role data (DM-context Member, raw User payload).
+            # Fail closed: a user without a resolvable role list cannot
+            # satisfy a role allowlist.
+            return False
+        try:
+            user_role_ids = {getattr(r, "id", None) for r in roles_attr}
+        except TypeError:
+            return False
+        if user_role_ids & role_set:
+            return True
+
+    return False
+
+
 if DISCORD_AVAILABLE:

    class ExecApprovalView(discord.ui.View):
@@ -3724,17 +4181,23 @@ if DISCORD_AVAILABLE:
        Only users in the allowed list can click.  Times out after 5 minutes.
        """

-        def __init__(self, session_key: str, allowed_user_ids: set):
+        def __init__(
+            self,
+            session_key: str,
+            allowed_user_ids: set,
+            allowed_role_ids: Optional[set] = None,
+        ):
            super().__init__(timeout=300)  # 5-minute timeout
            self.session_key = session_key
            self.allowed_user_ids = allowed_user_ids
+            self.allowed_role_ids = allowed_role_ids or set()
            self.resolved = False

        def _check_auth(self, interaction: discord.Interaction) -> bool:
            """Verify the user clicking is authorized."""
-            if not self.allowed_user_ids:
-                return True  # No allowlist = anyone can approve
-            return str(interaction.user.id) in self.allowed_user_ids
+            return _component_check_auth(
+                interaction, self.allowed_user_ids, self.allowed_role_ids,
+            )

        async def _resolve(
            self, interaction: discord.Interaction, choice: str,
@@ -3826,17 +4289,24 @@ if DISCORD_AVAILABLE:
        5 minutes (matches the gateway primitive's timeout).
        """

-        def __init__(self, session_key: str, confirm_id: str, allowed_user_ids: set):
+        def __init__(
+            self,
+            session_key: str,
+            confirm_id: str,
+            allowed_user_ids: set,
+            allowed_role_ids: Optional[set] = None,
+        ):
            super().__init__(timeout=300)
            self.session_key = session_key
            self.confirm_id = confirm_id
            self.allowed_user_ids = allowed_user_ids
+            self.allowed_role_ids = allowed_role_ids or set()
            self.resolved = False

        def _check_auth(self, interaction: discord.Interaction) -> bool:
-            if not self.allowed_user_ids:
-                return True
-            return str(interaction.user.id) in self.allowed_user_ids
+            return _component_check_auth(
+                interaction, self.allowed_user_ids, self.allowed_role_ids,
+            )

        async def _resolve(
            self, interaction: discord.Interaction, choice: str,
@@ -3914,16 +4384,22 @@ if DISCORD_AVAILABLE:
        5-minute timeout on its side).
        """

-        def __init__(self, session_key: str, allowed_user_ids: set):
+        def __init__(
+            self,
+            session_key: str,
+            allowed_user_ids: set,
+            allowed_role_ids: Optional[set] = None,
+        ):
            super().__init__(timeout=300)
            self.session_key = session_key
            self.allowed_user_ids = allowed_user_ids
+            self.allowed_role_ids = allowed_role_ids or set()
            self.resolved = False

        def _check_auth(self, interaction: discord.Interaction) -> bool:
-            if not self.allowed_user_ids:
-                return True
-            return str(interaction.user.id) in self.allowed_user_ids
+            return _component_check_auth(
+                interaction, self.allowed_user_ids, self.allowed_role_ids,
+            )

        async def _respond(
            self, interaction: discord.Interaction, answer: str,
@@ -4000,6 +4476,7 @@ if DISCORD_AVAILABLE:
            session_key: str,
            on_model_selected,
            allowed_user_ids: set,
+            allowed_role_ids: Optional[set] = None,
        ):
            super().__init__(timeout=120)
            self.providers = providers
@@ -4008,15 +4485,16 @@ if DISCORD_AVAILABLE:
            self.session_key = session_key
            self.on_model_selected = on_model_selected
            self.allowed_user_ids = allowed_user_ids
+            self.allowed_role_ids = allowed_role_ids or set()
            self.resolved = False
            self._selected_provider: str = ""

            self._build_provider_select()

        def _check_auth(self, interaction: discord.Interaction) -> bool:
-            if not self.allowed_user_ids:
-                return True
-            return str(interaction.user.id) in self.allowed_user_ids
+            return _component_check_auth(
+                interaction, self.allowed_user_ids, self.allowed_role_ids,
+            )

        def _build_provider_select(self):
            """Build the provider dropdown menu."""
@@ -416,6 +416,18 @@ class EmailAdapter(BasePlatformAdapter):
            logger.debug("[Email] Dropping automated sender at dispatch: %s", sender_addr)
            return

+        # Skip senders not in EMAIL_ALLOWED_USERS — prevents the adapter
+        # from creating a MessageEvent (and thus thread context) for senders
+        # that the gateway will never authorize.  Without this early guard,
+        # a race between dispatch and authorization can result in the adapter
+        # sending a reply even though the handler returned None.
+        allowed_raw = os.getenv("EMAIL_ALLOWED_USERS", "").strip()
+        if allowed_raw:
+            allowed = {addr.strip().lower() for addr in allowed_raw.split(",") if addr.strip()}
+            if sender_addr.lower() not in allowed:
+                logger.debug("[Email] Dropping non-allowlisted sender at dispatch: %s", sender_addr)
+                return
+
        subject = msg_data["subject"]
        body = msg_data["body"].strip()
        attachments = msg_data["attachments"]
@@ -64,7 +64,7 @@ from dataclasses import dataclass, field
 from datetime import datetime
 from pathlib import Path
 from types import SimpleNamespace
-from typing import Any, Dict, List, Optional, Sequence
+from typing import Any, Dict, List, Literal, Optional, Sequence
 from urllib.error import HTTPError, URLError
 from urllib.parse import urlencode
 from urllib.request import Request, urlopen
@@ -141,6 +141,7 @@ from gateway.platforms.base import (
 )
 from gateway.status import acquire_scoped_lock, release_scoped_lock
 from hermes_constants import get_hermes_home
+from utils import atomic_json_write

 logger = logging.getLogger(__name__)

@@ -387,6 +388,8 @@ class FeishuAdapterSettings:
    admins: frozenset[str] = frozenset()
    default_group_policy: str = ""
    group_rules: Dict[str, FeishuGroupRule] = field(default_factory=dict)
+    allow_bots: str = "none"  # "none" | "mentions" | "all"
+    require_mention: bool = True


@dataclass
@@ -396,6 +399,7 @@ class FeishuGroupRule:
    policy: str  # "open" | "allowlist" | "blacklist" | "admin_only" | "disabled"
    allowlist: set[str] = field(default_factory=set)
    blacklist: set[str] = field(default_factory=set)
+    require_mention: Optional[bool] = None  # None = inherit global


@dataclass
@@ -405,6 +409,40 @@ class FeishuBatchState:
    counts: Dict[str, int] = field(default_factory=dict)


+# ---------------------------------------------------------------------------
+# Admission: policy types
+# ---------------------------------------------------------------------------
+
+
+RejectReason = Literal[
+    "self_echo",
+    "self_ids_unknown",
+    "bots_disabled",
+    "bot_not_mentioned",
+    "group_policy_rejected",
+]
+
+
+def _is_bot_sender(sender: Any) -> bool:
+    # receive_v1 docs say {user, bot}; accept "app" defensively.
+    return getattr(sender, "sender_type", "") in ("bot", "app")
+
+
+def _sender_identity(sender: Any) -> frozenset:
+    # Take any non-empty id variant — tenant sender_id_type decides which are populated.
+    sid = getattr(sender, "sender_id", None)
+    if sid is None:
+        return frozenset()
+    return frozenset(
+        v for v in (
+            getattr(sid, "open_id", None),
+            getattr(sid, "user_id", None),
+            getattr(sid, "union_id", None),
+        )
+        if v
+    )
+
+
 # ---------------------------------------------------------------------------
 # Markdown rendering helpers
 # ---------------------------------------------------------------------------
@@ -1377,10 +1415,16 @@ class FeishuAdapter(BasePlatformAdapter):
            for chat_id, rule_cfg in raw_group_rules.items():
                if not isinstance(rule_cfg, dict):
                    continue
+                # Only override when the key is explicitly set — missing vs false
+                # must not collapse.
+                per_chat_require_mention: Optional[bool] = None
+                if "require_mention" in rule_cfg:
+                    per_chat_require_mention = _to_boolean(rule_cfg.get("require_mention"))
                group_rules[str(chat_id)] = FeishuGroupRule(
                    policy=str(rule_cfg.get("policy", "open")).strip().lower(),
                    allowlist=set(str(u).strip() for u in rule_cfg.get("allowlist", []) if str(u).strip()),
                    blacklist=set(str(u).strip() for u in rule_cfg.get("blacklist", []) if str(u).strip()),
+                    require_mention=per_chat_require_mention,
                )

        # Bot-level admins
@@ -1390,6 +1434,16 @@ class FeishuAdapter(BasePlatformAdapter):
        # Default group policy (for groups not in group_rules)
        default_group_policy = str(extra.get("default_group_policy", "")).strip().lower()

+        # Env-only so adapter and gateway auth bypass share one source; yaml
+        # feishu.allow_bots is bridged to this env var at config load.
+        allow_bots = os.getenv("FEISHU_ALLOW_BOTS", "none").strip().lower()
+        if allow_bots not in ("none", "mentions", "all"):
+            logger.warning(
+                "[Feishu] Unknown allow_bots=%r, falling back to 'none'. Valid: none, mentions, all.",
+                allow_bots,
+            )
+            allow_bots = "none"
+
        return FeishuAdapterSettings(
            app_id=str(extra.get("app_id") or os.getenv("FEISHU_APP_ID", "")).strip(),
            app_secret=str(extra.get("app_secret") or os.getenv("FEISHU_APP_SECRET", "")).strip(),
@@ -1446,6 +1500,10 @@ class FeishuAdapter(BasePlatformAdapter):
            admins=admins,
            default_group_policy=default_group_policy,
            group_rules=group_rules,
+            allow_bots=allow_bots,
+            require_mention=_to_boolean(
+                extra.get("require_mention", os.getenv("FEISHU_REQUIRE_MENTION", "true"))
+            ),
        )

    def _apply_settings(self, settings: FeishuAdapterSettings) -> None:
@@ -1476,6 +1534,8 @@ class FeishuAdapter(BasePlatformAdapter):
        self._ws_reconnect_interval = settings.ws_reconnect_interval
        self._ws_ping_interval = settings.ws_ping_interval
        self._ws_ping_timeout = settings.ws_ping_timeout
+        self._allow_bots = settings.allow_bots
+        self._require_mention = settings.require_mention

    def _build_event_handler(self) -> Any:
        if EventDispatcherHandler is None:
@@ -2189,30 +2249,28 @@ class FeishuAdapter(BasePlatformAdapter):
        event = getattr(data, "event", None)
        message = getattr(event, "message", None)
        sender = getattr(event, "sender", None)
-        sender_id = getattr(sender, "sender_id", None)
-        if not message or not sender_id:
-            logger.debug("[Feishu] Dropping malformed inbound event: missing message or sender_id")
+        if not message or not sender or not getattr(sender, "sender_id", None):
+            logger.debug("[Feishu] Dropping malformed inbound event: missing message/sender")
            return

        message_id = getattr(message, "message_id", None)
        if not message_id or self._is_duplicate(message_id):
            logger.debug("[Feishu] Dropping duplicate/missing message_id: %s", message_id)
            return
-        if self._is_self_sent_bot_message(event):
-            logger.debug("[Feishu] Dropping self-sent bot event: %s", message_id)
+
+        reason = self._admit(sender, message)
+        if reason is not None:
+            logger.debug("[Feishu] Dropping inbound event: %s", reason)
            return

        chat_type = getattr(message, "chat_type", "p2p")
-        chat_id = getattr(message, "chat_id", "") or ""
-        if chat_type != "p2p" and not self._should_accept_group_message(message, sender_id, chat_id):
-            logger.debug("[Feishu] Dropping group message that failed mention/policy gate: %s", message_id)
-            return
        await self._process_inbound_message(
            data=data,
            message=message,
-            sender_id=sender_id,
+            sender_id=getattr(sender, "sender_id", None),
            chat_type=chat_type,
            message_id=message_id,
+            is_bot=_is_bot_sender(sender),
        )

    def _on_message_read_event(self, data: P2ImMessageMessageReadV1) -> None:
@@ -2389,10 +2447,11 @@ class FeishuAdapter(BasePlatformAdapter):
            msg = items[0] if items else None
            if not msg:
                return
+            # GET im/v1/messages returns sender.id=app_id for bot messages —
+            # peer bots and us share sender_type="app" but differ on app_id.
            sender = getattr(msg, "sender", None)
-            sender_type = str(getattr(sender, "sender_type", "") or "").lower()
-            if sender_type != "app":
-                return  # only route reactions on our own bot messages
+            if str(getattr(sender, "id", "") or "") != self._app_id:
+                return  # only route reactions on this bot's own messages
            chat_id = str(getattr(msg, "chat_id", "") or "")
            chat_type_raw = str(getattr(msg, "chat_type", "p2p") or "p2p")
            if not chat_id:
@@ -2679,6 +2738,7 @@ class FeishuAdapter(BasePlatformAdapter):
        sender_id: Any,
        chat_type: str,
        message_id: str,
+        is_bot: bool = False,
    ) -> None:
        text, inbound_type, media_urls, media_types, mentions = await self._extract_message_content(message)

@@ -2697,34 +2757,45 @@ class FeishuAdapter(BasePlatformAdapter):
            if hint:
                text = f"{hint}\n\n{text}" if text else hint

+        thread_id = getattr(message, "thread_id", None) or getattr(message, "root_id", None) or None
        reply_to_message_id = (
            getattr(message, "parent_id", None)
            or getattr(message, "upper_message_id", None)
+            or getattr(message, "root_id", None)
            or None
        )
        reply_to_text = await self._fetch_message_text(reply_to_message_id) if reply_to_message_id else None

+        sender_primary = (
+            getattr(sender_id, "open_id", None)
+            or getattr(sender_id, "user_id", None)
+            or getattr(sender_id, "union_id", None)
+            or "<unknown>"
+        )
        logger.info(
-            "[Feishu] Inbound %s message received: id=%s type=%s chat_id=%s text=%r media=%d",
+            "[Feishu] Inbound %s message received: id=%s type=%s chat_id=%s sender=%s:%s text=%r media=%d",
            "dm" if chat_type == "p2p" else "group",
            message_id,
            inbound_type.value,
            getattr(message, "chat_id", "") or "",
+            "bot" if is_bot else "user",
+            sender_primary,
            text[:120],
            len(media_urls),
        )

        chat_id = getattr(message, "chat_id", "") or ""
        chat_info = await self.get_chat_info(chat_id)
-        sender_profile = await self._resolve_sender_profile(sender_id)
+        sender_profile = await self._resolve_sender_profile(sender_id, is_bot=is_bot)
        source = self.build_source(
            chat_id=chat_id,
            chat_name=chat_info.get("name") or chat_id or "Feishu Chat",
            chat_type=self._resolve_source_chat_type(chat_info=chat_info, event_chat_type=chat_type),
            user_id=sender_profile["user_id"],
            user_name=sender_profile["user_name"],
-            thread_id=getattr(message, "thread_id", None) or None,
+            thread_id=thread_id,
            user_id_alt=sender_profile["user_id_alt"],
+            is_bot=is_bot,
        )
        normalized = MessageEvent(
            text=text,
@@ -2853,13 +2924,18 @@ class FeishuAdapter(BasePlatformAdapter):
                },
            )
            response.raise_for_status()
+            # Snapshot Content-Type and body while the client context is
+            # still active so pooled connections fully release on exit.
+            # See #18451.
+            content_type_hdr = str(response.headers.get("Content-Type", ""))
+            body = response.content
        filename = self._derive_remote_filename(
            file_url,
-            content_type=str(response.headers.get("Content-Type", "")),
+            content_type=content_type_hdr,
            default_name=preferred_name,
            default_ext=default_ext,
        )
-        cached_path = cache_document_from_bytes(response.content, filename)
+        cached_path = cache_document_from_bytes(body, filename)
        return cached_path, filename

    @staticmethod
@@ -3447,7 +3523,12 @@ class FeishuAdapter(BasePlatformAdapter):
            return "dm"
        return "group"

-    async def _resolve_sender_profile(self, sender_id: Any) -> Dict[str, Optional[str]]:
+    async def _resolve_sender_profile(
+        self,
+        sender_id: Any,
+        *,
+        is_bot: bool = False,
+    ) -> Dict[str, Optional[str]]:
        """Map Feishu's three-tier user IDs onto Hermes' SessionSource fields.

        Preference order for the primary ``user_id`` field:
@@ -3464,7 +3545,11 @@ class FeishuAdapter(BasePlatformAdapter):
        union_id = getattr(sender_id, "union_id", None) or None
        # Prefer tenant-scoped user_id; fall back to app-scoped open_id.
        primary_id = user_id or open_id
-        display_name = await self._resolve_sender_name_from_api(primary_id or union_id)
+        # bot/v3/bots/basic_batch only accepts open_id.
+        name_lookup_id = open_id if is_bot else (primary_id or union_id)
+        display_name = await self._resolve_sender_name_from_api(
+            name_lookup_id, is_bot=is_bot,
+        )
        return {
            "user_id": primary_id,
            "user_name": display_name,
@@ -3484,11 +3569,14 @@ class FeishuAdapter(BasePlatformAdapter):
        self._sender_name_cache.pop(sender_id, None)
        return None

-    async def _resolve_sender_name_from_api(self, sender_id: Optional[str]) -> Optional[str]:
-        """Fetch the sender's display name from the Feishu contact API with a 10-minute cache.
-
-        ID-type detection mirrors openclaw: ou_ → open_id, on_ → union_id, else user_id.
-        Failures are silently suppressed; the message pipeline must not block on name resolution.
+    async def _resolve_sender_name_from_api(
+        self,
+        sender_id: Optional[str],
+        *,
+        is_bot: bool = False,
+    ) -> Optional[str]:
+        """Resolve a display name; bots use bot/v3/bots/basic_batch because the contact API does not return bot names.
+        Results are cached; failures are silent so the pipeline never blocks on name resolution.
        """
        if not sender_id or not self._client:
            return None
@@ -3498,7 +3586,16 @@ class FeishuAdapter(BasePlatformAdapter):
        now = time.time()
        cached_name = self._get_cached_sender_name(trimmed)
        if cached_name is not None:
-            return cached_name
+            return cached_name or None  # "" cached means "known nameless"
+        if is_bot:
+            names = await self._fetch_bot_names([trimmed])
+            if names is None:
+                return None
+            expire_at = now + _FEISHU_SENDER_NAME_TTL_SECONDS
+            for oid, name in names.items():
+                self._sender_name_cache[oid] = (name, expire_at)
+            hit = self._sender_name_cache.get(trimmed)
+            return (hit[0] or None) if hit else None
        try:
            from lark_oapi.api.contact.v3 import GetUserRequest  # lazy import
            if trimmed.startswith("ou_"):
@@ -3527,6 +3624,35 @@ class FeishuAdapter(BasePlatformAdapter):
            logger.debug("[Feishu] Failed to resolve sender name for %s", sender_id, exc_info=True)
        return None

+    async def _fetch_bot_names(self, bot_ids: List[str]) -> Optional[Dict[str, str]]:
+        if not self._client or not bot_ids:
+            return None
+        try:
+            req = (
+                BaseRequest.builder()
+                .http_method(HttpMethod.GET)
+                .uri("/open-apis/bot/v3/bots/basic_batch")
+                .queries([("bot_ids", oid) for oid in bot_ids])
+                .token_types({AccessTokenType.TENANT})
+                .build()
+            )
+            resp = await asyncio.to_thread(self._client.request, req)
+            content = getattr(getattr(resp, "raw", None), "content", None)
+            if not content:
+                return None
+            payload = json.loads(content)
+            if payload.get("code") != 0:
+                return None
+            bots = (payload.get("data") or {}).get("bots") or {}
+            return {
+                oid: str(info.get("name") or "").strip()
+                for oid, info in bots.items()
+                if oid
+            }
+        except Exception:
+            logger.debug("[Feishu] Failed to fetch bot names for %s", bot_ids, exc_info=True)
+            return None
+
    async def _fetch_message_text(self, message_id: str) -> Optional[str]:
        if not self._client or not message_id:
            return None
@@ -3590,10 +3716,60 @@ class FeishuAdapter(BasePlatformAdapter):
            logger.exception("[Feishu] Background inbound processing failed")

    # =========================================================================
-    # Group policy and mention gating
+    # Inbound admission
    # =========================================================================

-    def _allow_group_message(self, sender_id: Any, chat_id: str = "") -> bool:
+    def _admit(self, sender: Any, message: Any) -> Optional[RejectReason]:
+        sender_ids = _sender_identity(sender)
+        self_ids = frozenset(v for v in (self._bot_open_id, self._bot_user_id) if v)
+        is_bot = _is_bot_sender(sender)
+        is_group = getattr(message, "chat_type", "p2p") != "p2p"
+        chat_id = getattr(message, "chat_id", "") or ""
+        require_mention = is_group and self._require_mention_for(chat_id)
+
+        # Defensive only — Feishu doesn't echo our outbound back as inbound,
+        # and open_id is always populated on both sides.
+        if self_ids and sender_ids & self_ids:
+            return "self_echo"
+
+        if is_bot:
+            mode = self._allow_bots
+            if mode not in ("mentions", "all"):
+                return "bots_disabled"
+            # Defensive: pre-hydration or malformed payloads.
+            if not self_ids or not sender_ids:
+                return "self_ids_unknown"
+            # The group mention gate below enforces mentions when require_mention
+            # is on; check here only on paths that gate won't reach.
+            if mode == "mentions" and not require_mention and not self._mentions_self(message):
+                return "bot_not_mentioned"
+
+        if not is_group:
+            return None
+
+        if not self._allow_group_message(
+            getattr(sender, "sender_id", None), chat_id, is_bot=is_bot,
+        ):
+            return "group_policy_rejected"
+        if require_mention and not self._mentions_self(message):
+            return "group_policy_rejected"
+        return None
+
+    def _require_mention_for(self, chat_id: str) -> bool:
+        rule = self._group_rules.get(chat_id) if chat_id else None
+        if rule and rule.require_mention is not None:
+            return rule.require_mention
+        return self._require_mention
+
+    # --- Group policy ---------------------------------------------------------
+
+    def _allow_group_message(
+        self,
+        sender_id: Any,
+        chat_id: str = "",
+        *,
+        is_bot: bool = False,
+    ) -> bool:
        """Per-group policy gate for non-DM traffic."""
        sender_open_id = getattr(sender_id, "open_id", None)
        sender_user_id = getattr(sender_id, "user_id", None)
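
Reviewer note: the per-chat `require_mention` override is tri-state (`None` inherits the global setting, `True`/`False` override it). The sketch below isolates that resolution, assuming simplified stand-in names (`GroupRule`, `require_mention_for`) rather than the adapter's own classes.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class GroupRule:
    # None = inherit the global default; True/False = explicit override.
    require_mention: Optional[bool] = None


def require_mention_for(
    chat_id: str, rules: Dict[str, GroupRule], global_default: bool
) -> bool:
    """Resolve the effective mention requirement for one chat."""
    rule = rules.get(chat_id) if chat_id else None
    if rule is not None and rule.require_mention is not None:
        return rule.require_mention
    return global_default
```

Keeping `None` distinct from `False` is what lets a missing key inherit the global value while an explicit `false` turns the requirement off for that chat.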
@@ -3612,12 +3788,17 @@ class FeishuAdapter(BasePlatformAdapter):
            allowlist = self._allowed_group_users
            blacklist = set()

+        # Channel locks apply to everyone; allowlist/blacklist only gate humans
+        # (bots were already cleared upstream by FEISHU_ALLOW_BOTS).
        if policy == "disabled":
            return False
        if policy == "open":
            return True
        if policy == "admin_only":
            return False
+        if is_bot:
+            return True
+
        if policy == "allowlist":
            return bool(sender_ids and (sender_ids & allowlist))
        if policy == "blacklist":
@@ -3625,17 +3806,16 @@ class FeishuAdapter(BasePlatformAdapter):

        return bool(sender_ids and (sender_ids & self._allowed_group_users))

-    def _should_accept_group_message(self, message: Any, sender_id: Any, chat_id: str = "") -> bool:
-        """Require an explicit @mention before group messages enter the agent."""
-        if not self._allow_group_message(sender_id, chat_id):
-            return False
-        # @_all is Feishu's @everyone placeholder — always route to the bot.
+    # --- Mention detection ----------------------------------------------------
+
+    def _mentions_self(self, message: Any) -> bool:
+        # @_all is Feishu's @everyone placeholder.
        raw_content = getattr(message, "content", "") or ""
        if "@_all" in raw_content:
            return True
        mentions = getattr(message, "mentions", None) or []
-        if mentions:
-            return self._message_mentions_bot(mentions)
+        if mentions and self._message_mentions_bot(mentions):
+            return True
        normalized = normalize_feishu_message(
            message_type=getattr(message, "message_type", "") or "",
            raw_content=raw_content,
@@ -3644,23 +3824,6 @@ class FeishuAdapter(BasePlatformAdapter):
        )
        return self._post_mentions_bot(normalized.mentions)

-    def _is_self_sent_bot_message(self, event: Any) -> bool:
-        """Return True only for Feishu events emitted by this Hermes bot."""
-        sender = getattr(event, "sender", None)
-        sender_type = str(getattr(sender, "sender_type", "") or "").strip().lower()
-        if sender_type not in {"bot", "app"}:
-            return False
-
-        sender_id = getattr(sender, "sender_id", None)
-        sender_open_id = str(getattr(sender_id, "open_id", "") or "").strip()
-        sender_user_id = str(getattr(sender_id, "user_id", "") or "").strip()
-
-        if self._bot_open_id and sender_open_id == self._bot_open_id:
-            return True
-        if self._bot_user_id and sender_user_id == self._bot_user_id:
-            return True
-        return False
-
    def _message_mentions_bot(self, mentions: List[Any]) -> bool:
        # IDs trump names: when both sides have open_id (or both user_id),
        # match requires equal IDs. Name fallback only when either side
@@ -3804,7 +3967,7 @@ class FeishuAdapter(BasePlatformAdapter):
            recent = self._seen_message_order[-self._dedup_cache_size:]
            # Save as {msg_id: timestamp} so TTL filtering works across restarts.
            payload = {"message_ids": {k: self._seen_message_ids[k] for k in recent if k in self._seen_message_ids}}
-            self._dedup_state_path.write_text(json.dumps(payload, ensure_ascii=False), encoding="utf-8")
+            atomic_json_write(self._dedup_state_path, payload, indent=None)
        except OSError:
            logger.warning("[Feishu] Failed to persist dedup state to %s", self._dedup_state_path, exc_info=True)
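
Reviewer note: the diff does not show the body of `utils.atomic_json_write`. The sketch below shows one common way such a helper can be written (temp file in the same directory, then `os.replace`); the function name `atomic_json_write_sketch` and its details are assumptions, not the project's implementation.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Optional


def atomic_json_write_sketch(path: Path, data: Any, indent: Optional[int] = 2) -> None:
    """Write JSON so readers never observe a half-written file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh, ensure_ascii=False, indent=indent)
            fh.flush()
            os.fsync(fh.fileno())
        # Swap the finished file into place in a single step.
        os.replace(tmp_name, path)
    except BaseException:
        # Best-effort cleanup of the temp file before re-raising.
        try:
            os.unlink(tmp_name)
        except OSError:
            pass
        raise
```

A crash mid-write then leaves the previous state file intact instead of a truncated one, which matters for state reloaded at startup such as the dedup cache.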

@@ -4066,6 +4229,15 @@ class FeishuAdapter(BasePlatformAdapter):
                if active_reply_to and not self._response_succeeded(response):
                    code = getattr(response, "code", None)
                    if code in _FEISHU_REPLY_FALLBACK_CODES:
+                        if (metadata or {}).get("thread_id"):
+                            logger.warning(
+                                "[Feishu] Reply to %s failed in thread %s (code %s — message withdrawn/missing); "
+                                "skipping top-level fallback to avoid creating a new topic",
+                                active_reply_to,
+                                (metadata or {}).get("thread_id"),
+                                code,
+                            )
+                            return response
                        logger.warning(
                            "[Feishu] Reply to %s failed (code %s — message withdrawn/missing); "
                            "falling back to new message in chat %s",
@@ -13,6 +13,8 @@ import time
 from pathlib import Path
 from typing import TYPE_CHECKING, Dict

+from utils import atomic_json_write
+
 if TYPE_CHECKING:
    from gateway.platforms.base import MessageEvent

@@ -237,12 +239,11 @@ class ThreadParticipationTracker:

    def _save(self) -> None:
        path = self._state_path()
-        path.parent.mkdir(parents=True, exist_ok=True)
        thread_list = list(self._threads)
        if len(thread_list) > self._max_tracked:
            thread_list = thread_list[-self._max_tracked:]
            self._threads = set(thread_list)
-        path.write_text(json.dumps(thread_list), encoding="utf-8")
+        atomic_json_write(path, thread_list, indent=None)

    def mark(self, thread_id: str) -> None:
        """Mark *thread_id* as participated and persist."""
@@ -139,7 +139,7 @@ class HomeAssistantAdapter(BasePlatformAdapter):

    async def _ws_connect(self) -> bool:
        """Establish WebSocket connection and authenticate."""
-        ws_url = self._hass_url.replace("http://", "ws://").replace("https://", "wss://")
+        ws_url = self._hass_url.replace("https://", "wss://").replace("http://", "ws://")
        ws_url = f"{ws_url}/api/websocket"

        self._session = aiohttp.ClientSession(
@@ -243,10 +243,14 @@ class QQAdapter(BasePlatformAdapter):
            return False

        try:
+            # Tighter keepalive pool so idle CLOSE_WAIT sockets drain
+            # faster behind proxies like Cloudflare Warp (#18451).
+            from gateway.platforms._http_client_limits import platform_httpx_limits
            self._http_client = httpx.AsyncClient(
                timeout=30.0,
                follow_redirects=True,
                event_hooks={"response": [_ssrf_redirect_guard]},
+                limits=platform_httpx_limits(),
            )

            # 1. Get access token
@@ -393,13 +397,24 @@ class QQAdapter(BasePlatformAdapter):
            await self._session.close()
        self._session = None

-        self._session = aiohttp.ClientSession()
+        # Honor proxy environment variables (e.g. under WSL) for the QQ WebSocket;
+        # without a proxy the connection can regress to direct-connect timeouts.
+        self._session = aiohttp.ClientSession(trust_env=True)
+        ws_proxy = (
+            os.getenv("WSS_PROXY")
+            or os.getenv("wss_proxy")
+            or os.getenv("HTTPS_PROXY")
+            or os.getenv("https_proxy")
+            or os.getenv("ALL_PROXY")
+            or os.getenv("all_proxy")
+        )
        self._ws = await self._session.ws_connect(
            gateway_url,
            headers={
                "User-Agent": build_user_agent(),
            },
            timeout=CONNECT_TIMEOUT_SECONDS,
+            proxy=ws_proxy,
        )
        logger.info("[%s] WebSocket connected to %s", self._log_tag, gateway_url)

@@ -192,6 +192,15 @@ class SignalAdapter(BasePlatformAdapter):
        group_allowed_str = os.getenv("SIGNAL_GROUP_ALLOWED_USERS", "")
        self.group_allow_from = set(_parse_comma_list(group_allowed_str))

+        # DM allowlist — mirrors SIGNAL_ALLOWED_USERS checked by run.py.
+        # Stored here so the reaction hooks can skip unauthorized senders
+        # (reactions fire before run.py's auth gate, so without this check
+        # every inbound DM from any contact gets a 👀 reaction).
+        # "*" (also the default when the variable is unset) means all users are
+        # allowed; run.py still enforces auth separately in every case.
+        dm_allowed_str = os.getenv("SIGNAL_ALLOWED_USERS", "*")
+        self.dm_allow_from = set(_parse_comma_list(dm_allowed_str))
+
        # HTTP client
        self.client: Optional[httpx.AsyncClient] = None

@@ -248,7 +257,9 @@ class SignalAdapter(BasePlatformAdapter):
        except Exception as e:
            logger.warning("Signal: Could not acquire phone lock (non-fatal): %s", e)

-        self.client = httpx.AsyncClient(timeout=30.0)
+        # Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
+        from gateway.platforms._http_client_limits import platform_httpx_limits
+        self.client = httpx.AsyncClient(timeout=30.0, limits=platform_httpx_limits())
        try:
            # Health check — verify signal-cli daemon is reachable
            try:
@@ -534,6 +545,18 @@ class SignalAdapter(BasePlatformAdapter):
                except Exception:
                    logger.exception("Signal: failed to fetch attachment %s", att_id)

+        # Skip envelopes with no meaningful content (no text, no attachments).
+        # Catches profile key updates, empty messages, and other metadata-only
+        # envelopes that still carry a dataMessage wrapper but have nothing
+        # worth processing. Observed symptom: signal-cli logs "Profile key update"
+        # and Hermes receives msg='', triggering a full agent turn for nothing.
+        if (not text or not text.strip()) and not media_urls:
+            logger.debug(
+                "Signal: skipping contentless envelope from %s (%d attachments)",
+                redact_phone(sender), len(media_urls) if media_urls else 0,
+            )
+            return
+
        # Build session source
        source = self.build_source(
            chat_id=chat_id,
@@ -1416,8 +1439,28 @@ class SignalAdapter(BasePlatformAdapter):
            return None
        return (author, ts)

+    def _reactions_enabled(self, event: "MessageEvent" = None) -> bool:
+        """Check if message reactions are enabled for this event.
+
+        Two gates:
+        1. SIGNAL_REACTIONS env var — set to false/0/no to disable globally.
+        2. DM allowlist — if SIGNAL_ALLOWED_USERS is set, only react to
+           messages from senders in that list.  This prevents unauthorized
+           contacts from seeing the 👀 reaction (which fires before run.py's
+           auth gate and would otherwise reveal that a bot is listening).
+        """
+        if os.getenv("SIGNAL_REACTIONS", "true").lower() in ("false", "0", "no"):
+            return False
+        if event is not None:
+            sender = getattr(getattr(event, "source", None), "user_id", None)
+            if sender and "*" not in self.dm_allow_from and sender not in self.dm_allow_from:
+                return False
+        return True
+
    async def on_processing_start(self, event: MessageEvent) -> None:
        """React with 👀 when processing begins."""
+        if not self._reactions_enabled(event):
+            return
        target = self._extract_reaction_target(event)
        if target:
            await self.send_reaction(event.source.chat_id, "👀", *target)
@@ -1428,6 +1471,8 @@ class SignalAdapter(BasePlatformAdapter):
        On CANCELLED we leave the 👀 in place — no terminal outcome means
        the reaction should keep reflecting "in progress" (matches Telegram).
        """
+        if not self._reactions_enabled(event):
+            return
        if outcome == ProcessingOutcome.CANCELLED:
            return
        target = self._extract_reaction_target(event)
@@ -9,6 +9,7 @@ Uses slack-bolt (Python) with Socket Mode for:
 """

 import asyncio
+import contextvars
 import json
 import logging
 import os
@@ -21,6 +22,7 @@ try:
    from slack_bolt.async_app import AsyncApp
    from slack_bolt.adapter.socket_mode.async_handler import AsyncSocketModeHandler
    from slack_sdk.web.async_client import AsyncWebClient
+    import aiohttp
    SLACK_AVAILABLE = True
 except ImportError:
    SLACK_AVAILABLE = False
@@ -50,6 +52,16 @@ from gateway.platforms.base import (

 logger = logging.getLogger(__name__)

+# ContextVar carrying the user_id of the slash-command invoker.
+# Set in _handle_slash_command, read in send() to match the correct
+# stashed response_url when multiple users issue commands on the same
+# channel concurrently.  asyncio.Tasks copy the current context when
+# created (Python 3.7+), so the value set in _handle_slash_command's task
+# is visible in _process_message_background's child task.
+_slash_user_id: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
+    "_slash_user_id", default=None,
+)
+

@dataclass
 class _ThreadContextCache:
@@ -310,6 +322,11 @@ class SlackAdapter(BasePlatformAdapter):
        # Track active assistant thread status indicators so stop_typing can
        # clear them (chat_id → thread_ts).
        self._active_status_threads: Dict[str, str] = {}
+        # Slash-command contexts: stash response_url + user_id so send()
+        # can route the first reply ephemerally.  Keyed by
+        # (channel_id, user_id) to avoid cross-user collisions.
+        # Each value: {"response_url": str, "ts": float}
+        self._slash_command_contexts: Dict[Tuple[str, str], Dict[str, Any]] = {}

    def _describe_slack_api_error(self, response: Any, *, file_obj: Optional[Dict[str, Any]] = None) -> Optional[str]:
        """Convert Slack API auth/permission failures into actionable user-facing text."""
@@ -368,6 +385,103 @@ class SlackAdapter(BasePlatformAdapter):
            )
        return None

+    # ------------------------------------------------------------------
+    # Slash-command ephemeral helpers
+    # ------------------------------------------------------------------
+
+    _SLASH_CTX_TTL = 120.0  # seconds. Slack's response_url stays valid for
+    # 30 minutes, but we use a much shorter TTL so unrelated messages are not
+    # routed as ephemeral if the command handler was slow or dropped.
+
+    def _pop_slash_context(
+        self, chat_id: str,
+    ) -> Optional[Dict[str, Any]]:
+        """Return and remove the slash-command context for *chat_id*, if fresh.
+
+        Contexts older than ``_SLASH_CTX_TTL`` seconds are silently discarded.
+
+        Uses the ``_slash_user_id`` ContextVar (set in ``_handle_slash_command``)
+        to match the exact ``(channel_id, user_id)`` key.  This prevents a
+        concurrent slash command from a different user on the same channel from
+        stealing another user's ephemeral context.  Falls back to a
+        channel-only scan when the ContextVar is unset (e.g. send() called
+        from a non-slash code path); the scan takes the first fresh entry.
+        """
+        now = time.monotonic()
+        # Clean up stale entries on every lookup — dict is small.
+        stale_keys = [
+            k for k, v in self._slash_command_contexts.items()
+            if now - v["ts"] > self._SLASH_CTX_TTL
+        ]
+        for k in stale_keys:
+            self._slash_command_contexts.pop(k, None)
+
+        # Precise match: (channel_id, user_id) from ContextVar.
+        uid = _slash_user_id.get()
+        if uid:
+            return self._slash_command_contexts.pop((chat_id, uid), None)
+
+        # Fallback: channel-only scan (only reachable when ContextVar is
+        # unset, i.e. send() called outside a slash-command async context).
+        match_key = None
+        for key in list(self._slash_command_contexts):
+            if key[0] == chat_id:
+                match_key = key
+                break
+        if match_key is None:
+            return None
+        return self._slash_command_contexts.pop(match_key)
+
+    async def _send_slash_ephemeral(
+        self,
+        ctx: Dict[str, Any],
+        content: str,
+    ) -> "SendResult":
+        """Replace the initial ephemeral ack via ``response_url``.
+
+        Slack's ``response_url`` accepts a POST with ``replace_original``
+        for up to 30 minutes after the slash command was invoked.  This
+        lets us swap the "Running /cmd…" placeholder with the real reply,
+        and the message stays ephemeral ("Only visible to you").
+
+        Returns a successful ``SendResult`` even if the POST fails:
+        the user already saw the initial ack, so a delivery failure here
+        is non-critical.
+        """
+        formatted = self.format_message(content)
+        # Slack's response_url has the same ~40k char limit as chat_postMessage.
+        # Truncate to MAX_MESSAGE_LENGTH and use only the first chunk — the
+        # response_url replaces a single ephemeral ack, so multi-chunk isn't
+        # possible.  Long responses are rare for command replies.
+        chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
+        text = chunks[0] if chunks else formatted
+        payload = {
+            "response_type": "ephemeral",
+            "replace_original": True,
+            "text": text,
+        }
+        try:
+            async with aiohttp.ClientSession() as session:
+                async with session.post(
+                    ctx["response_url"],
+                    json=payload,
+                    timeout=aiohttp.ClientTimeout(total=10),
+                ) as resp:
+                    if resp.status == 200:
+                        return SendResult(success=True, message_id=None)
+                    body = await resp.text()
+                    logger.warning(
+                        "[Slack] response_url POST returned %s: %s",
+                        resp.status,
+                        body[:200],
+                    )
+        except Exception as e:
+            logger.warning(
+                "[Slack] response_url POST failed: %s", e,
+            )
+        # Non-fatal — the user saw the initial ack already.
+        return SendResult(success=True, message_id=None)
+
    async def connect(self) -> bool:
        """Connect to Slack via Socket Mode."""
        if not SLACK_AVAILABLE:
@@ -414,6 +528,21 @@ class SlackAdapter(BasePlatformAdapter):
                return False
            lock_acquired = True

+            # Close any previous handler before creating a new one so that
+            # calling connect() a second time (e.g. during a gateway restart or
+            # in-process reconnect attempt) does not leave a zombie Socket Mode
+            # connection alive.  Both the old and new connections would otherwise
+            # receive every Slack event and dispatch it twice, producing double
+            # responses — the same bug that affected DiscordAdapter (#18187).
+            if self._handler is not None:
+                try:
+                    await self._handler.close_async()
+                except Exception:
+                    logger.debug("[%s] Failed to close previous Slack handler", self.name)
+                finally:
+                    self._handler = None
+                    self._app = None
+
            # First token is the primary — used for AsyncApp / Socket Mode
            primary_token = bot_tokens[0]
            self._app = AsyncApp(token=primary_token)
@@ -446,12 +575,16 @@ class SlackAdapter(BasePlatformAdapter):
            async def handle_message_event(event, say):
                await self._handle_slack_message(event)

-            # Acknowledge app_mention events to prevent Bolt 404 errors.
-            # The "message" handler above already processes @mentions in
-            # channels, so this is intentionally a no-op to avoid duplicates.
+            # Handle app_mention explicitly. In some Slack app configurations,
+            # channel mentions arrive only as app_mention events rather than the
+            # generic message event. Forward them into the normal message
+            # pipeline so @mentions reliably produce replies.
+            # NOTE: when Slack fires BOTH message and app_mention for the same
+            # @mention, they share the same event ts — the dedup in
+            # _handle_slack_message (MessageDeduplicator) suppresses the second.
            @self._app.event("app_mention")
            async def handle_app_mention(event, say):
-                pass
+                await self._handle_slack_message(event)

            # File lifecycle events can arrive around snippet uploads even when
            # the actual user message is what we care about. Ack them so Slack
@@ -502,7 +635,11 @@ class SlackAdapter(BasePlatformAdapter):

            @self._app.command(_slash_pattern)
            async def handle_hermes_command(ack, command):
-                await ack()
+                slash = (command.get("command") or "").lstrip("/")
+                await ack(
+                    response_type="ephemeral",
+                    text=f"Running `/{slash}`…",
+                )
                await self._handle_slash_command(command)

            # Register Block Kit action handlers for approval buttons
@@ -574,6 +711,17 @@ class SlackAdapter(BasePlatformAdapter):
            return SendResult(success=False, error="Not connected")

        try:
+            # Check for a pending slash-command context.  When the user ran a
+            # native slash command (e.g. /q, /stop, /model), the initial ack
+            # already showed an ephemeral "Running /cmd…" message.  If we have
+            # a stashed response_url for this channel, replace that ack with
+            # the actual command reply ephemerally instead of posting publicly.
+            slash_ctx = self._pop_slash_context(chat_id)
+            if slash_ctx:
+                return await self._send_slash_ephemeral(
+                    slash_ctx, content,
+                )
+
            # Convert standard markdown → Slack mrkdwn
            formatted = self.format_message(content)

@@ -601,6 +749,10 @@ class SlackAdapter(BasePlatformAdapter):

                last_result = await self._get_client(chat_id).chat_postMessage(**kwargs)

+            # Clear Slack Assistant status as soon as the final message is posted.
+            if thread_ts:
+                await self.stop_typing(chat_id)
+
            # Track the sent message ts so we can auto-respond to thread
            # replies without requiring @mention.
            sent_ts = last_result.get("ts") if last_result else None
@@ -624,6 +776,42 @@ class SlackAdapter(BasePlatformAdapter):
            logger.error("[Slack] Send error: %s", e, exc_info=True)
            return SendResult(success=False, error=str(e))

+    async def send_private_notice(
+        self,
+        chat_id: str,
+        user_id: str,
+        content: str,
+        reply_to: Optional[str] = None,
+        metadata: Optional[Dict[str, Any]] = None,
+    ) -> SendResult:
+        """Send a Slack ephemeral message visible only to one user."""
+        if not self._app:
+            return SendResult(success=False, error="Not connected")
+        if not chat_id or not user_id:
+            return SendResult(success=False, error="chat_id and user_id are required")
+
+        try:
+            formatted = self.format_message(content)
+            thread_ts = self._resolve_thread_ts(reply_to, metadata)
+            kwargs = {
+                "channel": chat_id,
+                "user": user_id,
+                "text": formatted,
+                "mrkdwn": True,
+            }
+            if thread_ts:
+                kwargs["thread_ts"] = thread_ts
+
+            result = await self._get_client(chat_id).chat_postEphemeral(**kwargs)
+            return SendResult(
+                success=True,
+                message_id=result.get("message_ts") or result.get("ts"),
+                raw_response=result,
+            )
+        except Exception as e:  # pragma: no cover - defensive logging
+            logger.error("[Slack] Ephemeral send error: %s", e, exc_info=True)
+            return SendResult(success=False, error=str(e))
+
    async def edit_message(
        self,
        chat_id: str,
@@ -642,6 +830,8 @@ class SlackAdapter(BasePlatformAdapter):
                ts=message_id,
                text=formatted,
            )
+            if finalize:
+                await self.stop_typing(chat_id)
            return SendResult(success=True, message_id=message_id)
        except Exception as e:  # pragma: no cover - defensive logging
            logger.error(
@@ -682,7 +872,7 @@ class SlackAdapter(BasePlatformAdapter):
            # in an assistant-enabled context. Falls back to reactions.
            logger.debug("[Slack] assistant.threads.setStatus failed: %s", e)

-    async def stop_typing(self, chat_id: str) -> None:
+    async def stop_typing(self, chat_id: str, metadata=None) -> None:
        """Clear the assistant thread status indicator."""
        if not self._app:
            return
@@ -969,7 +1159,7 @@ class SlackAdapter(BasePlatformAdapter):
            return _ph(f'<{url}|{label}>')

        text = re.sub(
-            r'\[([^\]]+)\]\(([^()]*(?:\([^()]*\)[^()]*)*)\)',
+            r'(?<!!)\[([^\]]+)\]\(([^()]*(?:\([^()]*\)[^()]*)*)\)',
            _convert_markdown_link,
            text,
        )
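
Review note: the negative lookbehind added to the link pattern can be exercised on its own. This is a minimal sketch, assuming a simplified converter that emits `<url|label>` directly; the adapter routes the result through its `_ph` placeholder helper, which is omitted here, and `convert_links` is a hypothetical name.

```python
import re

# Pattern copied from the updated hunk: (?<!!) rejects a "[" preceded by "!".
LINK_RE = re.compile(r'(?<!!)\[([^\]]+)\]\(([^()]*(?:\([^()]*\)[^()]*)*)\)')

def convert_links(text: str) -> str:
    """Hypothetical helper: rewrite [label](url) as Slack's <url|label>."""
    return LINK_RE.sub(lambda m: f"<{m.group(2)}|{m.group(1)}>", text)

link = convert_links("[docs](https://example.com)")
image = convert_links("![alt](https://example.com/a.png)")
nested = convert_links("[wiki](https://en.wikipedia.org/wiki/Foo_(bar))")
```

Ordinary links are converted, Markdown image syntax is left alone, and one level of parentheses inside the URL still matches.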
@@ -1016,9 +1206,11 @@ class SlackAdapter(BasePlatformAdapter):
        )

        # 10) Convert italic: _text_ stays as _text_ (already Slack italic)
-        #     Single *text* → _text_ (Slack italic)
+        #     Single *text* → _text_ (Slack italic), but only when the
+        #     emphasized text touches non-whitespace on both sides so literal
+        #     delimiters like "a * b * c" are preserved.
        text = re.sub(
-            r'(?<!\*)\*([^*\n]+)\*(?!\*)',
+            r'(?<!\*)\*(\S(?:[^*\n]*?\S)?)\*(?!\*)',
            lambda m: _ph(f'_{m.group(1)}_'),
            text,
        )
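
Review note: the effect of the tightened italic pattern can be compared against the old one directly. This is a sketch under the assumption that the substitution writes `_text_` inline; the adapter wraps it in `_ph`, and `to_italic` is a hypothetical helper.

```python
import re

# Both patterns are copied from the hunk above.
OLD = re.compile(r'(?<!\*)\*([^*\n]+)\*(?!\*)')
NEW = re.compile(r'(?<!\*)\*(\S(?:[^*\n]*?\S)?)\*(?!\*)')

def to_italic(pattern: "re.Pattern[str]", text: str) -> str:
    """Hypothetical helper: replace *text* with _text_ using the pattern."""
    return pattern.sub(lambda m: f"_{m.group(1)}_", text)

old_math = to_italic(OLD, "a * b * c")      # old pattern rewrites the operators
new_math = to_italic(NEW, "a * b * c")      # new pattern leaves them alone
new_emph = to_italic(NEW, "*hello world*")  # real emphasis still converts
new_one = to_italic(NEW, "*x*")             # single-character emphasis too
```

Requiring `\S` immediately inside both asterisks is what distinguishes emphasis from literal asterisks surrounded by spaces.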
@@ -2524,9 +2716,14 @@ class SlackAdapter(BasePlatformAdapter):
            # gateway command dispatcher by prepending the slash.
            text = f"/{slash_name} {text}".strip()

+        # Slack slash commands can originate from DMs or shared channels.
+        # Preserve DM semantics only for DM channel IDs; shared channels must
+        # keep group semantics so different users do not collide into one
+        # session key.
+        is_dm = str(channel_id).startswith("D")
        source = self.build_source(
            chat_id=channel_id,
-            chat_type="dm",  # Slash commands are always in DM-like context
+            chat_type="dm" if is_dm else "group",
            user_id=user_id,
        )

@@ -2537,7 +2734,26 @@ class SlackAdapter(BasePlatformAdapter):
            raw_message=command,
        )

-        await self.handle_message(event)
+        # Stash the Slack response_url so the first reply for this
+        # channel+user can be routed ephemerally (replaces the initial
+        # "Running /cmd…" ack shown by handle_hermes_command).
+        # Only stash for COMMAND events (text starts with "/") — free-form
+        # questions via "/hermes <question>" must produce public replies so
+        # the whole channel can see the agent's answer.
+        response_url = command.get("response_url", "")
+        if response_url and user_id and channel_id and text.startswith("/"):
+            self._slash_command_contexts[(channel_id, user_id)] = {
+                "response_url": response_url,
+                "ts": time.monotonic(),
+            }
+
+        # Set the ContextVar so send() can match the correct stashed
+        # response_url even when multiple users slash concurrently.
+        _slash_user_id_token = _slash_user_id.set(user_id or None)
+        try:
+            await self.handle_message(event)
+        finally:
+            _slash_user_id.reset(_slash_user_id_token)

    def _has_active_session_for_thread(
        self,
@@ -2698,6 +2914,13 @@ class SlackAdapter(BasePlatformAdapter):
            raw = os.getenv("SLACK_FREE_RESPONSE_CHANNELS", "")
        if isinstance(raw, list):
            return {str(part).strip() for part in raw if str(part).strip()}
-        if isinstance(raw, str) and raw.strip():
-            return {part.strip() for part in raw.split(",") if part.strip()}
+        # Coerce non-list scalars (str/int/float) to str before splitting.
+        # A bare numeric YAML value (`free_response_channels: 1234567890`) is
+        # loaded as int, so it previously failed the isinstance(str) check
+        # and produced an empty set.  str() here accepts whatever scalar
+        # the YAML loader hands us without changing existing string/CSV
+        # semantics.
+        s = str(raw).strip() if raw is not None else ""
+        if s:
+            return {part.strip() for part in s.split(",") if part.strip()}
        return set()
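
Review note: the coercion logic above can be lifted into a standalone function for a quick check. This is a minimal sketch; `parse_channels` is a hypothetical name, and the config/env lookup that produces `raw` in the adapter is left out.

```python
from typing import Any, Set

def parse_channels(raw: Any) -> Set[str]:
    """Hypothetical standalone copy of the parsing branch shown above."""
    if isinstance(raw, list):
        return {str(part).strip() for part in raw if str(part).strip()}
    # Coerce any other scalar (str/int/float) to str before splitting on commas.
    s = str(raw).strip() if raw is not None else ""
    if s:
        return {part.strip() for part in s.split(",") if part.strip()}
    return set()

from_int = parse_channels(1234567890)   # bare numeric YAML value
from_csv = parse_channels("C1, C2,")    # existing CSV behaviour
from_none = parse_channels(None)        # unset value
from_list = parse_channels(["C1", 42])  # YAML list with mixed types
```

The integer case is the one that previously returned an empty set; the string and list cases behave as before.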
@@ -10,7 +10,7 @@ Shares credentials with the optional telephony skill — same env vars:

 Gateway-specific env vars:
  - SMS_WEBHOOK_PORT     (default 8080)
-  - SMS_WEBHOOK_HOST     (default 0.0.0.0)
+  - SMS_WEBHOOK_HOST     (default 127.0.0.1)
  - SMS_WEBHOOK_URL      (public URL for Twilio signature validation — required)
  - SMS_INSECURE_NO_SIGNATURE  (true to disable signature validation — dev only)
  - SMS_ALLOWED_USERS    (comma-separated E.164 phone numbers)
@@ -41,7 +41,7 @@ logger = logging.getLogger(__name__)
 TWILIO_API_BASE = "https://api.twilio.com/2010-04-01/Accounts"
 MAX_SMS_LENGTH = 1600  # ~10 SMS segments
 DEFAULT_WEBHOOK_PORT = 8080
-DEFAULT_WEBHOOK_HOST = "0.0.0.0"
+DEFAULT_WEBHOOK_HOST = "127.0.0.1"


 def check_sms_requirements() -> bool:
@@ -91,19 +91,23 @@ class SmsAdapter(BasePlatformAdapter):
        from aiohttp import web

        if not self._from_number:
-            logger.error("[sms] TWILIO_PHONE_NUMBER not set — cannot send replies")
+            msg = "[sms] TWILIO_PHONE_NUMBER not set — cannot send replies"
+            logger.error(msg)
+            self._set_fatal_error("sms_missing_phone_number", msg, retryable=False)
            return False

        insecure_no_sig = os.getenv("SMS_INSECURE_NO_SIGNATURE", "").lower() == "true"

        if not self._webhook_url and not insecure_no_sig:
-            logger.error(
+            msg = (
                "[sms] Refusing to start: SMS_WEBHOOK_URL is required for Twilio "
                "signature validation. Set it to the public URL configured in your "
                "Twilio console (e.g. https://example.com/webhooks/twilio). "
                "For local development without validation, set "
-                "SMS_INSECURE_NO_SIGNATURE=true (NOT recommended for production).",
+                "SMS_INSECURE_NO_SIGNATURE=true (NOT recommended for production)."
            )
+            logger.error(msg)
+            self._set_fatal_error("sms_missing_webhook_url", msg, retryable=False)
            return False

        if insecure_no_sig and not self._webhook_url:
@@ -290,14 +290,53 @@ class TelegramAdapter(BasePlatformAdapter):
        # and any other slash-confirm prompts; see GatewayRunner._request_slash_confirm).
        self._slash_confirm_state: Dict[str, str] = {}

-    @staticmethod
-    def _is_callback_user_authorized(user_id: str) -> bool:
+    def _is_callback_user_authorized(
+        self,
+        user_id: str,
+        *,
+        chat_id: Optional[str] = None,
+        chat_type: Optional[str] = None,
+        thread_id: Optional[str] = None,
+        user_name: Optional[str] = None,
+    ) -> bool:
        """Return whether a Telegram inline-button caller may perform gated actions."""
+        normalized_user_id = str(user_id or "").strip()
+        if not normalized_user_id:
+            return False
+
+        runner = getattr(getattr(self, "_message_handler", None), "__self__", None)
+        auth_fn = getattr(runner, "_is_user_authorized", None)
+        if callable(auth_fn):
+            try:
+                from gateway.session import SessionSource
+
+                normalized_chat_type = str(chat_type or "dm").strip().lower() or "dm"
+                if normalized_chat_type == "private":
+                    normalized_chat_type = "dm"
+                elif normalized_chat_type == "supergroup":
+                    normalized_chat_type = "forum" if thread_id is not None else "group"
+
+                source = SessionSource(
+                    platform=Platform.TELEGRAM,
+                    chat_id=str(chat_id or normalized_user_id),
+                    chat_type=normalized_chat_type,
+                    user_id=normalized_user_id,
+                    user_name=str(user_name).strip() if user_name else None,
+                    thread_id=str(thread_id) if thread_id is not None else None,
+                )
+                return bool(auth_fn(source))
+            except Exception:
+                logger.debug(
+                    "[Telegram] Falling back to env-only callback auth for user %s",
+                    normalized_user_id,
+                    exc_info=True,
+                )
+
        allowed_csv = os.getenv("TELEGRAM_ALLOWED_USERS", "").strip()
        if not allowed_csv:
            return True
        allowed_ids = {uid.strip() for uid in allowed_csv.split(",") if uid.strip()}
-        return "*" in allowed_ids or user_id in allowed_ids
+        return "*" in allowed_ids or normalized_user_id in allowed_ids

    @classmethod
    def _metadata_thread_id(cls, metadata: Optional[Dict[str, Any]]) -> Optional[str]:
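
Review note: the chat-type normalization inside `_is_callback_user_authorized` is easy to verify separately. This is a sketch that mirrors the branch above as a free function; `normalize_chat_type` is a hypothetical name, and the construction of `SessionSource` is omitted.

```python
from typing import Optional

def normalize_chat_type(chat_type: Optional[str], thread_id: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the mapping used for callback auth."""
    normalized = str(chat_type or "dm").strip().lower() or "dm"
    if normalized == "private":
        return "dm"
    if normalized == "supergroup":
        # A supergroup callback that carries a thread id is treated as a forum.
        return "forum" if thread_id is not None else "group"
    return normalized

private = normalize_chat_type("private")
forum = normalize_chat_type("supergroup", thread_id="7")
plain_supergroup = normalize_chat_type("supergroup")
missing = normalize_chat_type(None)
```

Unknown or missing chat types default to `"dm"`, which matches the `str(chat_type or "dm")` fallback in the diff.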
@@ -473,6 +512,17 @@ class TelegramAdapter(BasePlatformAdapter):
                self.name, attempt,
            )
            self._polling_network_error_count = 0
+            # start_polling() returning is necessary but not sufficient:
+            # PTB's Updater can be left in a state where `running` is True
+            # but the underlying long-poll task is wedged on a stale httpx
+            # connection and never makes progress. No error_callback fires
+            # in that state, so the reconnect ladder won't advance on its
+            # own. Schedule a deferred probe to detect the wedge and
+            # re-enter the ladder if needed.
+            if not self.has_fatal_error:
+                probe = asyncio.ensure_future(self._verify_polling_after_reconnect())
+                self._background_tasks.add(probe)
+                probe.add_done_callback(self._background_tasks.discard)
        except Exception as retry_err:
            logger.warning("[%s] Telegram polling reconnect failed: %s", self.name, retry_err)
            # start_polling failed — polling is dead and no further error
@@ -484,6 +534,50 @@ class TelegramAdapter(BasePlatformAdapter):
                self._background_tasks.add(task)
                task.add_done_callback(self._background_tasks.discard)

+    async def _verify_polling_after_reconnect(self) -> None:
+        """Heartbeat probe scheduled after a successful reconnect.
+
+        PTB's Updater can survive a botched stop()+start_polling() cycle
+        with `running=True` but a wedged consumer task. No error callback
+        fires, so the reconnect ladder doesn't advance on its own. This
+        probe detects the wedge by:
+
+        1. Sleeping HEARTBEAT_PROBE_DELAY so a healthy long-poll has time
+           to complete at least one cycle.
+        2. Verifying `Updater.running` is still True.
+        3. Probing the bot endpoint with a tight asyncio timeout. A
+           wedged httpx pool fails this probe; a healthy one returns
+           well under the timeout.
+
+        On any failure, re-enter the reconnect ladder so the existing
+        MAX_NETWORK_RETRIES path can ultimately escalate to fatal-error.
+        """
+        HEARTBEAT_PROBE_DELAY = 60
+        PROBE_TIMEOUT = 10
+
+        await asyncio.sleep(HEARTBEAT_PROBE_DELAY)
+
+        if self.has_fatal_error:
+            return
+        if not (self._app and self._app.updater and self._app.updater.running):
+            logger.warning(
+                "[%s] Updater not running %ds after reconnect — treating as wedged",
+                self.name, HEARTBEAT_PROBE_DELAY,
+            )
+            await self._handle_polling_network_error(
+                RuntimeError("Updater not running after reconnect heartbeat")
+            )
+            return
+
+        try:
+            await asyncio.wait_for(self._app.bot.get_me(), PROBE_TIMEOUT)
+        except Exception as probe_err:
+            logger.warning(
+                "[%s] Polling heartbeat probe failed %ds after reconnect: %s",
+                self.name, HEARTBEAT_PROBE_DELAY, probe_err,
+            )
+            await self._handle_polling_network_error(probe_err)
+
    async def _handle_polling_conflict(self, error: Exception) -> None:
        if self.has_fatal_error and self.fatal_error_code == "telegram_polling_conflict":
            return
@@ -594,6 +688,29 @@ class TelegramAdapter(BasePlatformAdapter):
                )
            return None

+    async def rename_dm_topic(
+        self,
+        chat_id: int,
+        thread_id: int,
+        name: str,
+    ) -> None:
+        """Rename a forum topic in a private (DM) chat."""
+        if not self._bot:
+            return
+        try:
+            chat_id_arg = int(chat_id)
+        except (TypeError, ValueError):
+            chat_id_arg = chat_id
+        await self._bot.edit_forum_topic(
+            chat_id=chat_id_arg,
+            message_thread_id=int(thread_id),
+            name=name,
+        )
+        logger.info(
+            "[%s] Renamed DM topic in chat %s thread_id=%s -> '%s'",
+            self.name, chat_id, thread_id, name,
+        )
+
    def _persist_dm_topic_thread_id(self, chat_id: int, topic_name: str, thread_id: int) -> None:
        """Save a newly created thread_id back into config.yaml so it persists across restarts."""
        try:
@@ -722,6 +839,20 @@ class TelegramAdapter(BasePlatformAdapter):
                    # Persist thread_id to config so we don't recreate on next restart
                    self._persist_dm_topic_thread_id(int(chat_id), topic_name, thread_id)

+                    # Send a seed message so the topic is visible in Telegram's client.
+                    # Empty topics are hidden by the client UI until they contain a message.
+                    try:
+                        await self._bot.send_message(
+                            chat_id=int(chat_id),
+                            message_thread_id=thread_id,
+                            text=f"\U0001f4cc {topic_name}",
+                        )
+                    except Exception as seed_err:
+                        logger.debug(
+                            "[%s] Could not send seed message to topic '%s': %s",
+                            self.name, topic_name, seed_err,
+                        )
+
    async def connect(self) -> bool:
        """Connect to Telegram via polling or webhook.

@@ -1321,6 +1452,7 @@ class TelegramAdapter(BasePlatformAdapter):
    async def send_update_prompt(
        self, chat_id: str, prompt: str, default: str = "",
        session_key: str = "",
+        metadata: Optional[Dict[str, Any]] = None,
    ) -> SendResult:
        """Send an inline-keyboard update prompt (Yes / No buttons).

@@ -1338,11 +1470,14 @@ class TelegramAdapter(BasePlatformAdapter):
                    InlineKeyboardButton("✗ No", callback_data="update_prompt:n"),
                ]
            ])
+            thread_id = self._metadata_thread_id(metadata)
+            message_thread_id = self._message_thread_id_for_send(thread_id)
            msg = await self._bot.send_message(
                chat_id=int(chat_id),
                text=text,
                parse_mode=ParseMode.MARKDOWN,
                reply_markup=keyboard,
+                message_thread_id=message_thread_id,
                **self._link_preview_kwargs(),
            )
            return SendResult(success=True, message_id=str(msg.message_id))
@@ -1760,6 +1895,12 @@ class TelegramAdapter(BasePlatformAdapter):
        if not query or not query.data:
            return
        data = query.data
+        query_message = getattr(query, "message", None)
+        query_chat_id = getattr(query_message, "chat_id", None)
+        query_chat = getattr(query_message, "chat", None)
+        query_chat_type = getattr(query_chat, "type", None)
+        query_thread_id = getattr(query_message, "message_thread_id", None)
+        query_user_name = getattr(query.from_user, "first_name", None)

        # --- Model picker callbacks ---
        if data.startswith(("mp:", "mm:", "mb", "mx", "mg:")):
@@ -1781,7 +1922,13 @@ class TelegramAdapter(BasePlatformAdapter):

                # Only authorized users may click approval buttons.
                caller_id = str(getattr(query.from_user, "id", ""))
-                if not self._is_callback_user_authorized(caller_id):
+                if not self._is_callback_user_authorized(
+                    caller_id,
+                    chat_id=query_chat_id,
+                    chat_type=str(query_chat_type) if query_chat_type is not None else None,
+                    thread_id=str(query_thread_id) if query_thread_id is not None else None,
+                    user_name=query_user_name,
+                ):
                    await query.answer(text="⛔ You are not authorized to approve commands.")
                    return

@@ -1831,8 +1978,14 @@ class TelegramAdapter(BasePlatformAdapter):
                choice = parts[1]  # once, always, cancel
                confirm_id = parts[2]

-                caller_id = str(getattr(query.from_user, "id", "")) 
-                if not self._is_callback_user_authorized(caller_id):
+                caller_id = str(getattr(query.from_user, "id", ""))
+                if not self._is_callback_user_authorized(
+                    caller_id,
+                    chat_id=query_chat_id,
+                    chat_type=str(query_chat_type) if query_chat_type is not None else None,
+                    thread_id=str(query_thread_id) if query_thread_id is not None else None,
+                    user_name=query_user_name,
+                ):
                    await query.answer(text="⛔ You are not authorized to answer this prompt.")
                    return

@@ -1891,7 +2044,13 @@ class TelegramAdapter(BasePlatformAdapter):
            return
        answer = data.split(":", 1)[1]  # "y" or "n"
        caller_id = str(getattr(query.from_user, "id", ""))
-        if not self._is_callback_user_authorized(caller_id):
+        if not self._is_callback_user_authorized(
+            caller_id,
+            chat_id=query_chat_id,
+            chat_type=str(query_chat_type) if query_chat_type is not None else None,
+            thread_id=str(query_thread_id) if query_thread_id is not None else None,
+            user_name=query_user_name,
+        ):
            await query.answer(text="⛔ You are not authorized to answer update prompts.")
            return
        await query.answer(text=f"Sent '{answer}' to the update process.")
@@ -2131,13 +2290,54 @@ class TelegramAdapter(BasePlatformAdapter):
                )
            return SendResult(success=True, message_id=str(msg.message_id))
        except Exception as e:
-            logger.error(
-                "[%s] Failed to send Telegram local image, falling back to base adapter: %s",
-                self.name,
-                e,
-                exc_info=True,
+            error_str = str(e)
+            # Dimension-related errors are the expected case for valid image
+            # files that Telegram just refuses as photos (screenshots, extreme
+            # aspect ratios). Log at INFO because the document fallback is
+            # the correct path. Any other send_photo failure also falls back
+            # to document (rate limits, corrupt file markers, format edge
+            # cases), but at WARNING because it's unexpected and worth
+            # surfacing in logs.
+            is_dim_error = (
+                "Photo_invalid_dimensions" in error_str
+                or "PHOTO_INVALID_DIMENSIONS" in error_str
            )
-            return await super().send_image_file(chat_id, image_path, caption, reply_to)
+            if is_dim_error:
+                logger.info(
+                    "[%s] Image dimensions exceed Telegram photo limits, "
+                    "sending as document: %s",
+                    self.name,
+                    image_path,
+                )
+            else:
+                logger.warning(
+                    "[%s] Failed to send Telegram local image as photo, "
+                    "trying document fallback: %s",
+                    self.name,
+                    e,
+                    exc_info=True,
+                )
+            # Fallback to sending as document (file) — no dimension limit,
+            # only 50MB size limit. If even that fails, fall back to the
+            # base adapter's text-only "Image: /path" rendering.
+            try:
+                return await self.send_document(
+                    chat_id=chat_id,
+                    file_path=image_path,
+                    caption=caption,
+                    file_name=os.path.basename(image_path),
+                    reply_to=reply_to,
+                    metadata=metadata,
+                )
+            except Exception as doc_err:
+                logger.error(
+                    "[%s] Failed to send Telegram local image as document, "
+                    "falling back to base adapter: %s",
+                    self.name,
+                    doc_err,
+                    exc_info=True,
+                )
+                return await super().send_image_file(chat_id, image_path, caption, reply_to)

    async def send_document(
        self,
@@ -142,6 +142,7 @@ class WeComAdapter(BasePlatformAdapter):
    """WeCom AI Bot adapter backed by a persistent WebSocket connection."""

    MAX_MESSAGE_LENGTH = MAX_MESSAGE_LENGTH
+    SUPPORTS_MESSAGE_EDITING = False
    # Threshold for detecting WeCom client-side message splits.
    # When a chunk is near the 4000-char limit, a continuation is almost certain.
    _SPLIT_THRESHOLD = 3900
@@ -206,7 +207,11 @@ class WeComAdapter(BasePlatformAdapter):
            return False

        try:
-            self._http_client = httpx.AsyncClient(timeout=30.0, follow_redirects=True)
+            # Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
+            from gateway.platforms._http_client_limits import platform_httpx_limits
+            self._http_client = httpx.AsyncClient(
+                timeout=30.0, follow_redirects=True, limits=platform_httpx_limits(),
+            )
            await self._open_connection()
            self._mark_connected()
            self._listen_task = asyncio.create_task(self._listen_loop())
@@ -119,7 +119,9 @@ class WecomCallbackAdapter(BasePlatformAdapter):
            pass

        try:
-            self._http_client = httpx.AsyncClient(timeout=20.0)
+            # Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
+            from gateway.platforms._http_client_limits import platform_httpx_limits
+            self._http_client = httpx.AsyncClient(timeout=20.0, limits=platform_httpx_limits())
            self._app = web.Application()
            self._app.router.add_get("/health", self._handle_health)
            self._app.router.add_get(self._path, self._handle_verify)
@@ -1333,6 +1333,15 @@ class WeixinAdapter(BasePlatformAdapter):
        if message_id and self._dedup.is_duplicate(message_id):
            return

+        # Secondary content-fingerprint dedup for text messages
+        item_list = message.get("item_list") or []
+        text = _extract_text(item_list)
+        if text:
+            content_key = f"content:{sender_id}:{hashlib.md5(text.encode()).hexdigest()}"
+            if self._dedup.is_duplicate(content_key):
+                logger.debug("[%s] Content-dedup: skipping duplicate message from %s", self.name, sender_id)
+                return
+
        chat_type, effective_chat_id = _guess_chat_type(message, self._account_id)
        if chat_type == "group":
            if self._group_policy == "disabled":
@@ -1347,8 +1356,6 @@ class WeixinAdapter(BasePlatformAdapter):
            self._token_store.set(self._account_id, sender_id, context_token)
        asyncio.create_task(self._maybe_fetch_typing_ticket(sender_id, context_token or None))

-        item_list = message.get("item_list") or []
-        text = _extract_text(item_list)
        media_paths: List[str] = []
        media_types: List[str] = []

@@ -2030,7 +2037,9 @@ async def send_weixin_direct(

    live_adapter = _LIVE_ADAPTERS.get(resolved_token)
    send_session = getattr(live_adapter, '_send_session', None)
-    if live_adapter is not None and send_session is not None and not send_session.closed:
+    if (live_adapter is not None and send_session is not None
+            and not send_session.closed
+            and send_session._loop is asyncio.get_running_loop()):
        last_result: Optional[SendResult] = None
        cleaned = live_adapter.format_message(message)
        if cleaned:
@@ -185,6 +185,13 @@ class WhatsAppAdapter(BasePlatformAdapter):
        self._bridge_log: Optional[Path] = None
        self._poll_task: Optional[asyncio.Task] = None
        self._http_session: Optional["aiohttp.ClientSession"] = None
+        # Set to True by disconnect() before we SIGTERM our child bridge so
+        # _check_managed_bridge_exit() can distinguish an intentional
+        # shutdown-time exit (returncode -15 / -2 / 0) from a real crash.
+        # Without this, every graceful gateway shutdown/restart would log
+        # "Fatal whatsapp adapter error" plus dispatch a fatal-error
+        # notification before the normal "✓ whatsapp disconnected" fires.
+        self._shutting_down: bool = False

    def _whatsapp_require_mention(self) -> bool:
        configured = self.config.extra.get("require_mention")
@@ -555,6 +562,21 @@ class WhatsAppAdapter(BasePlatformAdapter):
        if returncode is None:
            return None

+        # Planned shutdown: disconnect() sets _shutting_down before it sends
+        # SIGTERM to the bridge, so a returncode of -15 (SIGTERM), -2 (SIGINT),
+        # or 0 (clean exit) at that point is expected, not a crash. Treat it
+        # as informational and skip the fatal-error path.
+        # getattr-with-default keeps tests that construct the adapter via
+        # ``WhatsAppAdapter.__new__`` (bypassing __init__) working without
+        # every _make_adapter() helper having to seed the attribute.
+        if getattr(self, "_shutting_down", False) and returncode in (0, -2, -15):
+            logger.info(
+                "[%s] Bridge exited during shutdown (code %d).",
+                self.name,
+                returncode,
+            )
+            return None
+
        message = f"WhatsApp bridge process exited unexpectedly (code {returncode})."
        if not self.has_fatal_error:
            logger.error("[%s] %s", self.name, message)
@@ -565,6 +587,10 @@ class WhatsAppAdapter(BasePlatformAdapter):

    async def disconnect(self) -> None:
        """Stop the WhatsApp bridge and clean up any orphaned processes."""
+        # Flip the shutdown flag BEFORE signalling the child so the exit-check
+        # path (which runs from other tasks like send() and the poll loop)
+        # doesn't race us and report the intentional termination as fatal.
+        self._shutting_down = True
        if self._bridge_process:
            try:
                try:
@@ -876,11 +902,15 @@ class WhatsAppAdapter(BasePlatformAdapter):
        try:
            import aiohttp

-            await self._http_session.post(
+            # Must wrap in `async with` — a bare `await session.post(...)`
+            # leaves the response object alive until GC, holding its TCP
+            # socket in CLOSE_WAIT. See #18451.
+            async with self._http_session.post(
                f"http://127.0.0.1:{self._bridge_port}/typing",
                json={"chatId": chat_id},
                timeout=aiohttp.ClientTimeout(total=5)
-            )
+            ):
+                pass
        except Exception:
            pass  # Ignore typing indicator failures
    
@@ -1896,10 +1896,12 @@ class OwnerCommandMiddleware(InboundMiddleware):
        if cmd not in cls.ALLOWLIST:
            return None, None, False

-        # Sender identity check: bot owner <-> push.from_account == push.bot_owner_id
-        # owner_id = (push or {}).get("bot_owner_id") or ""
-        # is_owner = bool(owner_id) and owner_id == from_account
-        is_owner = True
+        # Sender identity check: bot owner <-> push.from_account == push.bot_owner_id.
+        # The allowlisted commands (/approve, /deny, /stop, /reset, ...) are
+        # privileged — leaking them to non-owners lets any group member approve
+        # a dangerous tool call, kill the owner's task, or wipe session state.
+        owner_id = str((push or {}).get("bot_owner_id") or "").strip()
+        is_owner = bool(owner_id) and owner_id == from_account
        return cmd, cmd_line, is_owner

    async def handle(self, ctx: InboundContext, next_fn) -> None:
@@ -458,6 +458,15 @@ class SessionEntry:
    was_auto_reset: bool = False
    auto_reset_reason: Optional[str] = None  # "idle" or "daily"
    reset_had_activity: bool = False  # whether the expired session had any messages
+
+    # Set by reset_session() when the user explicitly sends /new or /reset.
+    # Consumed once by _handle_message_with_agent to trigger topic/channel
+    # skill re-injection on the first message of the new session.  We can't
+    # reuse was_auto_reset for this because that flag fires the "session
+    # expired due to inactivity" user-facing notice and a misleading
+    # context-note prepend — both wrong for an explicit manual reset.
+    # See issue #6508.
+    is_fresh_reset: bool = False
    
    # Set by the background expiry watcher after it finalizes an expired
    # session (invoking on_session_finalize hooks and evicting the cached
@@ -508,6 +517,7 @@ class SessionEntry:
                if self.last_resume_marked_at
                else None
            ),
+            "is_fresh_reset": self.is_fresh_reset,
        }
        if self.origin:
            result["origin"] = self.origin.to_dict()
@@ -556,6 +566,7 @@ class SessionEntry:
            resume_pending=data.get("resume_pending", False),
            resume_reason=data.get("resume_reason"),
            last_resume_marked_at=last_resume_marked_at,
+            is_fresh_reset=data.get("is_fresh_reset", False),
        )


@@ -1075,19 +1086,22 @@ class SessionStore:
        return len(removed_keys)

    def suspend_recently_active(self, max_age_seconds: int = 120) -> int:
-        """Mark recently-active sessions as suspended.
+        """Mark recently-active sessions as resumable after an unexpected exit.

-        Called on gateway startup to prevent sessions that were likely
-        in-flight when the gateway last exited from being blindly resumed
-        (#7536).  Only suspends sessions updated within *max_age_seconds*
-        to avoid resetting long-idle sessions that are harmless to resume.
-        Returns the number of sessions that were suspended.
+        Called on gateway startup after a crash or fast restart to preserve
+        in-flight sessions instead of destroying their conversation history
+        (#7536).  Only marks sessions updated within *max_age_seconds* to
+        avoid touching long-idle sessions.  Sets ``resume_pending=True`` so
+        the next incoming message on the same session_key auto-resumes from
+        the existing transcript.

-        Entries flagged ``resume_pending=True`` are skipped — those were
-        marked intentionally by the drain-timeout path as recoverable.
-        Terminal escalation for genuinely stuck ``resume_pending`` sessions
-        is handled by the existing ``.restart_failure_counts`` stuck-loop
-        counter, which runs after this method on startup.
+        Entries already flagged ``resume_pending=True`` are skipped.  Entries
+        explicitly ``suspended=True`` (from /stop or stuck-loop escalation)
+        are also skipped.  Terminal escalation for genuinely stuck sessions
+        is still handled by the existing ``.restart_failure_counts`` counter
+        (threshold 3), which runs after this method and sets ``suspended=True``.
+
+        Returns the number of sessions marked resumable.
        """
        from datetime import timedelta

@@ -1099,13 +1113,15 @@ class SessionStore:
                if entry.resume_pending:
                    continue
                if not entry.suspended and entry.updated_at >= cutoff:
-                    entry.suspended = True
+                    entry.resume_pending = True
+                    entry.resume_reason = "restart_interrupted"
+                    entry.last_resume_marked_at = _now()
                    count += 1
            if count:
                self._save()
        return count

-    def reset_session(self, session_key: str) -> Optional[SessionEntry]:
+    def reset_session(self, session_key: str, display_name: Optional[str] = None) -> Optional[SessionEntry]:
        """Force reset a session, creating a new session ID."""
        db_end_session_id = None
        db_create_kwargs = None
@@ -1129,9 +1145,10 @@ class SessionStore:
                created_at=now,
                updated_at=now,
                origin=old_entry.origin,
-                display_name=old_entry.display_name,
+                display_name=display_name if display_name is not None else old_entry.display_name,
                platform=old_entry.platform,
                chat_type=old_entry.chat_type,
+                is_fresh_reset=True,
            )

            self._entries[session_key] = new_entry
@@ -21,6 +21,7 @@ from datetime import datetime, timezone
 from pathlib import Path
 from hermes_constants import get_hermes_home
 from typing import Any, Optional
+from utils import atomic_json_write

 if sys.platform == "win32":
    import msvcrt
@@ -34,6 +35,10 @@ _IS_WINDOWS = sys.platform == "win32"
 _UNSET = object()
 _GATEWAY_LOCK_FILENAME = "gateway.lock"
 _gateway_lock_handle = None
+# Windows byte-range locks are mandatory for other readers. Lock a byte well
+# past the JSON payload so runtime status / PID readers can still read the file
+# while another process holds the mutual-exclusion lock.
+_WINDOWS_LOCK_OFFSET = 1024 * 1024


 def _get_pid_path() -> Path:
@@ -205,8 +210,7 @@ def _read_json_file(path: Path) -> Optional[dict[str, Any]]:


 def _write_json_file(path: Path, payload: dict[str, Any]) -> None:
-    path.parent.mkdir(parents=True, exist_ok=True)
-    path.write_text(json.dumps(payload))
+    atomic_json_write(path, payload, indent=None, separators=(",", ":"))


 def _read_pid_record(pid_path: Optional[Path] = None) -> Optional[dict]:
@@ -286,7 +290,7 @@ def _try_acquire_file_lock(handle) -> bool:
            if handle.tell() == 0:
                handle.write("\n")
                handle.flush()
-            handle.seek(0)
+            handle.seek(_WINDOWS_LOCK_OFFSET)
            msvcrt.locking(handle.fileno(), msvcrt.LK_NBLCK, 1)
        else:
            fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
@@ -298,7 +302,7 @@ def _try_acquire_file_lock(handle) -> bool:
 def _release_file_lock(handle) -> None:
    try:
        if _IS_WINDOWS:
-            handle.seek(0)
+            handle.seek(_WINDOWS_LOCK_OFFSET)
            msvcrt.locking(handle.fileno(), msvcrt.LK_UNLCK, 1)
        else:
            fcntl.flock(handle.fileno(), fcntl.LOCK_UN)
@@ -633,6 +637,8 @@ def release_all_scoped_locks(

 _TAKEOVER_MARKER_FILENAME = ".gateway-takeover.json"
 _TAKEOVER_MARKER_TTL_S = 60  # Marker older than this is treated as stale
+_PLANNED_STOP_MARKER_FILENAME = ".gateway-planned-stop.json"
+_PLANNED_STOP_MARKER_TTL_S = 60


 def _get_takeover_marker_path() -> Path:
@@ -641,6 +647,67 @@ def _get_takeover_marker_path() -> Path:
    return home / _TAKEOVER_MARKER_FILENAME


+def _get_planned_stop_marker_path() -> Path:
+    """Return the path to the intentional gateway stop marker file."""
+    home = get_hermes_home()
+    return home / _PLANNED_STOP_MARKER_FILENAME
+
+
+def _marker_is_stale(written_at: str, ttl_s: int) -> bool:
+    try:
+        written_dt = datetime.fromisoformat(written_at)
+        age = (datetime.now(timezone.utc) - written_dt).total_seconds()
+        return age > ttl_s
+    except (TypeError, ValueError):
+        return True
+
+
+def _consume_pid_marker_for_self(
+    path: Path,
+    *,
+    pid_field: str,
+    start_time_field: str,
+    ttl_s: int,
+) -> bool:
+    record = _read_json_file(path)
+    if not record:
+        return False
+
+    try:
+        target_pid = int(record[pid_field])
+        target_start_time = record.get(start_time_field)
+        written_at = record.get("written_at") or ""
+    except (KeyError, TypeError, ValueError):
+        try:
+            path.unlink(missing_ok=True)
+        except OSError:
+            pass
+        return False
+
+    if _marker_is_stale(written_at, ttl_s):
+        try:
+            path.unlink(missing_ok=True)
+        except OSError:
+            pass
+        return False
+
+    our_pid = os.getpid()
+    our_start_time = _get_process_start_time(our_pid)
+    matches = (
+        target_pid == our_pid
+        and target_start_time is not None
+        and our_start_time is not None
+        and target_start_time == our_start_time
+    )
+
+    try:
+        path.unlink(missing_ok=True)
+    except OSError:
+        pass
+
+    return matches
+
+
 def write_takeover_marker(target_pid: int) -> bool:
    """Record that ``target_pid`` is being replaced by the current process.

@@ -677,59 +744,13 @@ def consume_takeover_marker_for_self() -> bool:
    Always unlinks the marker on match (and on detected staleness) so
    subsequent unrelated signals don't re-trigger.
    """
-    path = _get_takeover_marker_path()
-    record = _read_json_file(path)
-    if not record:
-        return False
-
-    # Any malformed or stale marker → drop it and return False
-    try:
-        target_pid = int(record["target_pid"])
-        target_start_time = record.get("target_start_time")
-        written_at = record.get("written_at") or ""
-    except (KeyError, TypeError, ValueError):
-        try:
-            path.unlink(missing_ok=True)
-        except OSError:
-            pass
-        return False
-
-    # TTL guard: a stale marker older than _TAKEOVER_MARKER_TTL_S is ignored.
-    stale = False
-    try:
-        written_dt = datetime.fromisoformat(written_at)
-        age = (datetime.now(timezone.utc) - written_dt).total_seconds()
-        if age > _TAKEOVER_MARKER_TTL_S:
-            stale = True
-    except (TypeError, ValueError):
-        stale = True  # Unparseable timestamp — treat as stale
-
-    if stale:
-        try:
-            path.unlink(missing_ok=True)
-        except OSError:
-            pass
-        return False
-
-    # Does the marker name THIS process?
-    our_pid = os.getpid()
-    our_start_time = _get_process_start_time(our_pid)
-    matches = (
-        target_pid == our_pid
-        and target_start_time is not None
-        and our_start_time is not None
-        and target_start_time == our_start_time
+    return _consume_pid_marker_for_self(
+        _get_takeover_marker_path(),
+        pid_field="target_pid",
+        start_time_field="target_start_time",
+        ttl_s=_TAKEOVER_MARKER_TTL_S,
    )

-    # Consume the marker whether it matched or not — a marker that doesn't
-    # match our identity is stale-for-us anyway.
-    try:
-        path.unlink(missing_ok=True)
-    except OSError:
-        pass
-
-    return matches
-

 def clear_takeover_marker() -> None:
    """Remove the takeover marker unconditionally. Safe to call repeatedly."""
@@ -739,6 +760,45 @@ def clear_takeover_marker() -> None:
        pass


+def write_planned_stop_marker(target_pid: int) -> bool:
+    """Record that ``target_pid`` is being stopped intentionally.
+
+    The gateway exits non-zero for unexpected SIGTERM so service managers can
+    revive it. Service stop commands send the same SIGTERM, so the CLI writes
+    this short-lived marker first to let the target process exit cleanly.
+    """
+    try:
+        target_start_time = _get_process_start_time(target_pid)
+        record = {
+            "target_pid": target_pid,
+            "target_start_time": target_start_time,
+            "stopper_pid": os.getpid(),
+            "written_at": _utc_now_iso(),
+        }
+        _write_json_file(_get_planned_stop_marker_path(), record)
+        return True
+    except OSError:  # PermissionError is a subclass of OSError
+        return False
+
+
+def consume_planned_stop_marker_for_self() -> bool:
+    """Return True when the current process is being intentionally stopped."""
+    return _consume_pid_marker_for_self(
+        _get_planned_stop_marker_path(),
+        pid_field="target_pid",
+        start_time_field="target_start_time",
+        ttl_s=_PLANNED_STOP_MARKER_TTL_S,
+    )
+
+
+def clear_planned_stop_marker() -> None:
+    """Remove the planned-stop marker unconditionally."""
+    try:
+        _get_planned_stop_marker_path().unlink(missing_ok=True)
+    except OSError:
+        pass
+
+
 def get_running_pid(
    pid_path: Optional[Path] = None,
    *,
@@ -5,11 +5,43 @@ Provides subcommands for:
 - hermes chat          - Interactive chat (same as ./hermes)
 - hermes gateway       - Run gateway in foreground
 - hermes gateway start - Start gateway service
- hermes gateway stop  - Stop gateway service  
+- hermes gateway stop  - Stop gateway service
 - hermes setup         - Interactive setup wizard
 - hermes status        - Show status of all components
 - hermes cron          - Manage cron jobs
 """

-__version__ = "0.11.0"
-__release_date__ = "2026.4.23"
+import os
+import sys
+
+__version__ = "0.12.0"
+__release_date__ = "2026.4.30"
+
+
+def _ensure_utf8():
+    """Force UTF-8 stdout/stderr on Windows to prevent UnicodeEncodeError.
+
+    Windows services and terminals often default to a legacy code page such as
+    cp1252, which cannot encode the box-drawing characters used in CLI output.
+    This causes unhandled UnicodeEncodeError crashes on gateway startup.
+    """
+    if sys.platform != "win32":
+        return
+    os.environ.setdefault("PYTHONUTF8", "1")
+    os.environ.setdefault("PYTHONIOENCODING", "utf-8")
+    for stream_name in ("stdout", "stderr"):
+        stream = getattr(sys, stream_name, None)
+        if stream is None:
+            continue
+        try:
+            if getattr(stream, "encoding", "").lower().replace("-", "") != "utf8":
+                new_stream = open(
+                    stream.fileno(), "w", encoding="utf-8",
+                    buffering=1, closefd=False,
+                )
+                setattr(sys, stream_name, new_stream)
+        except (AttributeError, OSError):
+            pass
+
+
+_ensure_utf8()
@@ -43,7 +43,7 @@ import yaml

 from hermes_cli.config import get_hermes_home, get_config_path, read_raw_config
 from hermes_constants import OPENROUTER_BASE_URL
-from utils import atomic_replace
+from utils import atomic_replace, atomic_yaml_write, is_truthy_value

 logger = logging.getLogger(__name__)

@@ -2480,8 +2480,8 @@ def _resolve_verify(
    tls_state = tls_state if isinstance(tls_state, dict) else {}

    effective_insecure = (
-        bool(insecure) if insecure is not None
-        else bool(tls_state.get("insecure", False))
+        is_truthy_value(insecure, default=False) if insecure is not None
+        else is_truthy_value(tls_state.get("insecure", False), default=False)
    )
    effective_ca = (
        ca_bundle
@@ -2589,6 +2589,208 @@ def _poll_for_token(
 # Nous Portal — token refresh, agent key minting, model discovery
 # =============================================================================

+# -----------------------------------------------------------------------------
+# Shared Nous token store — lets OAuth credentials persist across profiles
+# so a new `hermes --profile <name> auth add nous --type oauth` can one-tap
+# import instead of running the full device-code flow every time.
+#
+# File lives at ${HERMES_SHARED_AUTH_DIR}/nous_auth.json, defaulting to
+# ~/.hermes/shared/nous_auth.json. It is OUTSIDE any named profile's
+# HERMES_HOME so named profiles (which typically live under
+# ~/.hermes/profiles/<name>/) all see the same file.
+#
+# Written on successful login and on every runtime refresh so the stored
+# refresh_token stays current even if one profile refreshes and rotates it.
+# If the stored refresh_token ever goes stale server-side, import fails
+# gracefully and the user falls back to the normal device-code flow.
+# -----------------------------------------------------------------------------
+
+NOUS_SHARED_STORE_FILENAME = "nous_auth.json"
+
+
+def _nous_shared_auth_dir() -> Path:
+    """Resolve the directory that holds the shared Nous token store.
+
+    Honors ``HERMES_SHARED_AUTH_DIR`` so tests can redirect it to a tmp
+    path without touching the real user's home. Defaults to
+    ``~/.hermes/shared/``.
+    """
+    override = os.getenv("HERMES_SHARED_AUTH_DIR", "").strip()
+    if override:
+        return Path(override).expanduser()
+    return Path.home() / ".hermes" / "shared"
+
+
+def _nous_shared_store_path() -> Path:
+    path = _nous_shared_auth_dir() / NOUS_SHARED_STORE_FILENAME
+    # Seat belt: if pytest is running and this resolves to a path under the
+    # real user's home, refuse rather than silently corrupt cross-profile
+    # state. Tests must set HERMES_SHARED_AUTH_DIR to a tmp_path; conftest
+    # does not do this automatically. This mirrors the _auth_file_path()
+    # guard, so forgetting to set it fails loudly instead of writing to the
+    # real shared store.
+    if os.environ.get("PYTEST_CURRENT_TEST"):
+        real_home_shared = (
+            Path.home() / ".hermes" / "shared" / NOUS_SHARED_STORE_FILENAME
+        ).resolve(strict=False)
+        try:
+            resolved = path.resolve(strict=False)
+        except Exception:
+            resolved = path
+        if resolved == real_home_shared:
+            raise RuntimeError(
+                f"Refusing to touch real user shared Nous auth store during test run: "
+                f"{path}. Set HERMES_SHARED_AUTH_DIR to a tmp_path in your test fixture."
+            )
+    return path
+
+
+def _write_shared_nous_state(state: Dict[str, Any]) -> None:
+    """Persist a minimal copy of the Nous OAuth state to the shared store.
+
+    Best-effort: any failure is swallowed after logging. The shared store
+    is a convenience layer; the per-profile auth.json remains the source
+    of truth.
+
+    We deliberately omit the short-lived ``agent_key`` (24h TTL and
+    profile-specific); only the long-lived OAuth tokens are useful across profiles.
+    """
+    refresh_token = state.get("refresh_token")
+    access_token = state.get("access_token")
+    if not (isinstance(refresh_token, str) and refresh_token.strip()):
+        # No refresh_token = nothing worth sharing across profiles
+        return
+    if not (isinstance(access_token, str) and access_token.strip()):
+        return
+
+    shared = {
+        "_schema": 1,
+        "access_token": access_token,
+        "refresh_token": refresh_token,
+        "token_type": state.get("token_type") or "Bearer",
+        "scope": state.get("scope") or DEFAULT_NOUS_SCOPE,
+        "client_id": state.get("client_id") or DEFAULT_NOUS_CLIENT_ID,
+        "portal_base_url": state.get("portal_base_url") or DEFAULT_NOUS_PORTAL_URL,
+        "inference_base_url": state.get("inference_base_url") or DEFAULT_NOUS_INFERENCE_URL,
+        "obtained_at": state.get("obtained_at"),
+        "expires_at": state.get("expires_at"),
+        "updated_at": datetime.now(timezone.utc).isoformat(),
+    }
+    try:
+        path = _nous_shared_store_path()
+        path.parent.mkdir(parents=True, exist_ok=True)
+        tmp = path.with_suffix(path.suffix + ".tmp")
+        tmp.write_text(json.dumps(shared, indent=2, sort_keys=True))
+        try:
+            os.chmod(tmp, 0o600)
+        except OSError:
+            pass
+        os.replace(tmp, path)
+        _oauth_trace(
+            "nous_shared_store_written",
+            path=str(path),
+            refresh_token_fp=_token_fingerprint(refresh_token),
+        )
+    except Exception as exc:
+        logger.debug("Failed to write shared Nous auth store: %s", exc)
+
+
+def _read_shared_nous_state() -> Optional[Dict[str, Any]]:
+    """Return the shared Nous OAuth state if present and well-formed.
+
+    Returns ``None`` when the file is missing, unreadable, malformed, or
+    lacks required fields. Callers should treat ``None`` as "no shared
+    credentials available — fall through to device-code".
+    """
+    try:
+        path = _nous_shared_store_path()
+    except RuntimeError:
+        # Test seat belt tripped — treat as missing
+        return None
+    if not path.is_file():
+        return None
+    try:
+        payload = json.loads(path.read_text())
+    except (OSError, ValueError) as exc:
+        logger.debug("Shared Nous auth store at %s is unreadable: %s", path, exc)
+        return None
+    if not isinstance(payload, dict):
+        return None
+    refresh_token = payload.get("refresh_token")
+    access_token = payload.get("access_token")
+    if not (isinstance(refresh_token, str) and refresh_token.strip()):
+        return None
+    if not (isinstance(access_token, str) and access_token.strip()):
+        return None
+    return payload
+
+
+def _try_import_shared_nous_state(
+    *,
+    timeout_seconds: float = 15.0,
+    min_key_ttl_seconds: int = 5 * 60,
+) -> Optional[Dict[str, Any]]:
+    """Attempt to rehydrate Nous OAuth state from the shared store.
+
+    Reads the shared file (if present), runs a forced refresh+mint using
+    the stored refresh_token to produce a fresh access_token + agent_key
+    scoped to this profile, and returns the full auth_state dict ready
+    for ``persist_nous_credentials()``.
+
+    Returns ``None`` when no shared state is available or the rehydrate
+    fails for any reason (expired refresh_token, portal unreachable,
+    etc.) — caller should then fall through to the normal device-code
+    flow.
+    """
+    shared = _read_shared_nous_state()
+    if not shared:
+        return None
+
+    # Build a full state dict so refresh_nous_oauth_from_state has every
+    # field it needs. force_refresh=True gets us a fresh access_token
+    # for this profile; force_mint=True gets us a fresh agent_key.
+    state: Dict[str, Any] = {
+        "access_token": shared.get("access_token"),
+        "refresh_token": shared.get("refresh_token"),
+        "client_id": shared.get("client_id") or DEFAULT_NOUS_CLIENT_ID,
+        "portal_base_url": shared.get("portal_base_url") or DEFAULT_NOUS_PORTAL_URL,
+        "inference_base_url": shared.get("inference_base_url") or DEFAULT_NOUS_INFERENCE_URL,
+        "token_type": shared.get("token_type") or "Bearer",
+        "scope": shared.get("scope") or DEFAULT_NOUS_SCOPE,
+        "obtained_at": shared.get("obtained_at"),
+        "expires_at": shared.get("expires_at"),
+        "agent_key": None,
+        "agent_key_expires_at": None,
+        "tls": {"insecure": False, "ca_bundle": None},
+    }
+
+    try:
+        refreshed = refresh_nous_oauth_from_state(
+            state,
+            min_key_ttl_seconds=min_key_ttl_seconds,
+            timeout_seconds=timeout_seconds,
+            force_refresh=True,
+            force_mint=True,
+        )
+    except AuthError as exc:
+        _oauth_trace(
+            "nous_shared_import_failed",
+            error_type=type(exc).__name__,
+            error_code=getattr(exc, "code", None),
+        )
+        logger.debug("Shared Nous import failed: %s", exc)
+        return None
+    except Exception as exc:
+        _oauth_trace(
+            "nous_shared_import_failed",
+            error_type=type(exc).__name__,
+        )
+        logger.debug("Shared Nous import failed: %s", exc)
+        return None
+
+    return refreshed
+
+
 def _refresh_access_token(
    *,
    client: httpx.Client,
@@ -2991,6 +3193,12 @@ def persist_nous_credentials(
        _save_provider_state(auth_store, "nous", state)
        _save_auth_store(auth_store)

+    # Mirror to the shared store so a new profile can one-tap import
+    # these credentials via `hermes auth add nous --type oauth`. This is
+    # best-effort: any I/O failure is logged and swallowed (the per-profile
+    # auth.json is still the source of truth).
+    _write_shared_nous_state(state)
+
    pool = load_pool("nous")
    return next(
        (e for e in pool.entries() if e.source == NOUS_DEVICE_CODE_SOURCE),
@@ -3059,6 +3267,11 @@ def resolve_nous_runtime_credentials(
                refresh_token_fp=_token_fingerprint(state.get("refresh_token")),
                access_token_fp=_token_fingerprint(state.get("access_token")),
            )
+            # Mirror post-refresh state to the shared store so sibling
+            # profiles don't hold stale refresh_tokens after rotation.
+            # Best-effort — any failure is logged and swallowed inside
+            # _write_shared_nous_state.
+            _write_shared_nous_state(state)

        verify = _resolve_verify(insecure=insecure, ca_bundle=ca_bundle, auth_state=state)
        timeout = httpx.Timeout(timeout_seconds if timeout_seconds else 15.0)
@@ -3653,7 +3866,7 @@ def _update_config_for_provider(

    config["model"] = model_cfg

-    config_path.write_text(yaml.safe_dump(config, sort_keys=False))
+    atomic_yaml_write(config_path, config, sort_keys=False)
    return config_path


@@ -3712,7 +3925,7 @@ def _reset_config_provider() -> Path:
        model["provider"] = "auto"
        if "base_url" in model:
            model["base_url"] = OPENROUTER_BASE_URL
-    config_path.write_text(yaml.safe_dump(config, sort_keys=False))
+    atomic_yaml_write(config_path, config, sort_keys=False)
    return config_path


@@ -4283,7 +4496,8 @@ def _minimax_oauth_login(
    print(f"Portal: {portal_base_url}")

    with httpx.Client(timeout=httpx.Timeout(timeout_seconds),
-                      headers={"Accept": "application/json"}) as client:
+                      headers={"Accept": "application/json"},
+                      follow_redirects=True) as client:
        code_data = _minimax_request_user_code(
            client, portal_base_url=portal_base_url,
            client_id=pconfig.client_id,
@@ -4360,7 +4574,8 @@ def _refresh_minimax_oauth_state(
        return state

    portal_base_url = state["portal_base_url"]
-    with httpx.Client(timeout=httpx.Timeout(timeout_seconds)) as client:
+    with httpx.Client(timeout=httpx.Timeout(timeout_seconds),
+                      follow_redirects=True) as client:
        response = client.post(
            f"{portal_base_url}/oauth/token",
            data={
@@ -4598,17 +4813,47 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
    )

    try:
-        auth_state = _nous_device_code_login(
-            portal_base_url=getattr(args, "portal_url", None),
-            inference_base_url=getattr(args, "inference_url", None),
-            client_id=getattr(args, "client_id", None) or pconfig.client_id,
-            scope=getattr(args, "scope", None) or pconfig.scope,
-            open_browser=not getattr(args, "no_browser", False),
-            timeout_seconds=timeout_seconds,
-            insecure=insecure,
-            ca_bundle=ca_bundle,
-            min_key_ttl_seconds=5 * 60,
-        )
+        auth_state = None
+
+        # Codex-style auto-import: before launching a fresh device-code
+        # flow, check the shared store for an existing Nous credential
+        # from any other profile. If present, offer to rehydrate it.
+        shared = _read_shared_nous_state()
+        if shared:
+            try:
+                shared_path = _nous_shared_store_path()
+            except RuntimeError:
+                shared_path = None
+            print()
+            if shared_path:
+                print(f"Found existing Nous OAuth credentials at {shared_path}")
+            else:
+                print("Found existing shared Nous OAuth credentials")
+            try:
+                do_import = input("Import these credentials? [Y/n]: ").strip().lower()
+            except (EOFError, KeyboardInterrupt):
+                do_import = "y"
+            if do_import in ("", "y", "yes"):
+                print("Rehydrating Nous session from shared credentials...")
+                auth_state = _try_import_shared_nous_state(
+                    timeout_seconds=timeout_seconds,
+                    min_key_ttl_seconds=5 * 60,
+                )
+                if auth_state is None:
+                    print("Could not refresh shared credentials — falling back to device-code login.")
+
+        if auth_state is None:
+            auth_state = _nous_device_code_login(
+                portal_base_url=getattr(args, "portal_url", None),
+                inference_base_url=getattr(args, "inference_url", None),
+                client_id=getattr(args, "client_id", None) or pconfig.client_id,
+                scope=getattr(args, "scope", None) or pconfig.scope,
+                open_browser=not getattr(args, "no_browser", False),
+                timeout_seconds=timeout_seconds,
+                insecure=insecure,
+                ca_bundle=ca_bundle,
+                min_key_ttl_seconds=5 * 60,
+            )

        inference_base_url = auth_state["inference_base_url"]

@@ -4625,6 +4870,11 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
            _save_provider_state(auth_store, "nous", auth_state)
            saved_to = _save_auth_store(auth_store)

+        # Mirror to the shared store so other profiles can one-tap import
+        # these credentials. Best-effort: any I/O failure is logged and
+        # swallowed inside the helper.
+        _write_shared_nous_state(auth_state)
+
        print()
        print("Login successful!")
        print(f"  Auth state: {saved_to}")
@@ -245,6 +245,47 @@ def auth_add_command(args) -> None:
        return

    if provider == "nous":
+        # Codex-style auto-import: if a shared Nous credential lives at
+        # ~/.hermes/shared/nous_auth.json (written by any previous
+        # successful login), offer to import it instead of running the
+        # full device-code flow. This makes `hermes --profile <name>
+        # auth add nous --type oauth` a one-tap operation for users who
+        # run multiple profiles.
+        shared = auth_mod._read_shared_nous_state()
+        if shared:
+            try:
+                path = auth_mod._nous_shared_store_path()
+            except RuntimeError:
+                path = None
+            print()
+            if path:
+                print(f"Found existing Nous OAuth credentials at {path}")
+            else:
+                print("Found existing shared Nous OAuth credentials")
+            try:
+                do_import = input("Import these credentials? [Y/n]: ").strip().lower()
+            except (EOFError, KeyboardInterrupt):
+                do_import = "y"
+            if do_import in ("", "y", "yes"):
+                print("Rehydrating Nous session from shared credentials...")
+                rehydrated = auth_mod._try_import_shared_nous_state(
+                    timeout_seconds=getattr(args, "timeout", None) or 15.0,
+                    min_key_ttl_seconds=max(
+                        60, int(getattr(args, "min_key_ttl_seconds", 5 * 60))
+                    ),
+                )
+                if rehydrated is not None:
+                    custom_label = (getattr(args, "label", None) or "").strip() or None
+                    entry = auth_mod.persist_nous_credentials(rehydrated, label=custom_label)
+                    shown_label = entry.label if entry is not None else label_from_token(
+                        rehydrated.get("access_token", ""), _oauth_default_label(provider, 1),
+                    )
+                    print(f'Imported {provider} OAuth credentials: "{shown_label}"')
+                    return
+                # Rehydrate failed (expired refresh_token, portal down, etc.)
+                # — fall through to device-code flow.
+                print("Could not refresh shared credentials — falling back to device-code login.")
+
        creds = auth_mod._nous_device_code_login(
            portal_base_url=getattr(args, "portal_url", None),
            inference_base_url=getattr(args, "inference_url", None),
@@ -61,6 +61,9 @@ _EXCLUDED_NAMES = {
    "cron.pid",
 }

+# Copying via ZipFile.open() does not restore Unix mode bits; run_import tightens these to 0600.
+_SECRET_FILE_NAMES = {".env", "auth.json", "state.db"}
+

 def _should_exclude(rel_path: Path) -> bool:
    """Return True if *rel_path* (relative to hermes root) should be skipped."""
@@ -381,6 +384,8 @@ def run_import(args) -> None:
                target.parent.mkdir(parents=True, exist_ok=True)
                with zf.open(member) as src, open(target, "wb") as dst:
                    dst.write(src.read())
+                if target.name in _SECRET_FILE_NAMES:
+                    os.chmod(target, 0o600)
                restored += 1
            except (PermissionError, OSError) as exc:
                errors.append(f"  {rel}: {exc}")
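The restore-time permission fix in the hunk above can be sketched in isolation. This is a minimal illustration, not the project's `run_import`: the `restore` function and the sample archive are hypothetical, member-path validation is omitted, and the mode check assumes a POSIX filesystem.

```python
import os
import stat
import tempfile
import zipfile
from pathlib import Path

# Same names as _SECRET_FILE_NAMES in the hunk above.
SECRET_FILE_NAMES = {".env", "auth.json", "state.db"}


def restore(zip_path: Path, dest: Path) -> list[Path]:
    """Copy every file member out of the archive, tightening secrets to 0600."""
    restored: list[Path] = []
    with zipfile.ZipFile(zip_path) as zf:
        for member in zf.namelist():
            if member.endswith("/"):
                continue  # directory entry, nothing to copy
            target = dest / member
            target.parent.mkdir(parents=True, exist_ok=True)
            # A plain stream copy creates the file with umask-default permissions.
            with zf.open(member) as src, open(target, "wb") as dst:
                dst.write(src.read())
            if target.name in SECRET_FILE_NAMES:
                os.chmod(target, 0o600)
            restored.append(target)
    return restored


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    archive = root / "backup.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("auth.json", "{}")
        zf.writestr("notes/readme.txt", "hello")
    files = restore(archive, root / "out")
    RESTORED_NAMES = sorted(p.name for p in files)
    SECRET_MODE = stat.S_IMODE(os.stat(root / "out" / "auth.json").st_mode)
```

Matching on `target.name` rather than the full relative path means a nested `auth.json` is tightened too, which errs on the side of stricter permissions.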
@@ -788,9 +793,17 @@ def _prune_pre_update_backups(backup_dir: Path, keep: int) -> int:
    Returns the number of files deleted.  Only touches files matching
    ``pre-update-*.zip`` so hand-made zips dropped in the same directory
    are never touched.
+
+    ``keep`` is floored to 1 because this helper is only called immediately
+    after a fresh backup is written: deleting that backup right after the
+    user paid the disk/CPU cost to create it would leave them worse off
+    than no backup at all (and the wrapper in ``main.py`` would still print
+    a misleading ``Saved: <path>`` line for a file that no longer exists).
+    Operators who genuinely don't want a backup should set
+    ``updates.pre_update_backup: false`` in config — that gates creation.
    """
-    if keep < 0:
-        keep = 0
+    if keep < 1:
+        keep = 1
    if not backup_dir.exists():
        return 0

@@ -10,6 +10,7 @@ To add an alias: set ``aliases=("short",)`` on the existing ``CommandDef``.

 from __future__ import annotations

+import logging
 import os
 import re
 import shutil
@@ -19,6 +20,10 @@ from collections.abc import Callable, Mapping
 from dataclasses import dataclass
 from typing import Any

+from utils import is_truthy_value
+
+logger = logging.getLogger(__name__)
+
 # prompt_toolkit is an optional CLI dependency — only needed for
 # SlashCommandCompleter and SlashCommandAutoSuggest.  Gateway and test
 # environments that lack it must still be able to import this module
@@ -59,7 +64,9 @@ class CommandDef:
 COMMAND_REGISTRY: list[CommandDef] = [
    # Session
    CommandDef("new", "Start a new session (fresh session ID + history)", "Session",
-               aliases=("reset",)),
+               aliases=("reset",), args_hint="[name]"),
+    CommandDef("topic", "Enable or inspect Telegram DM topic sessions", "Session",
+               gateway_only=True, args_hint="[off|help|session-id]"),
    CommandDef("clear", "Clear screen and start a new session", "Session",
               cli_only=True),
    CommandDef("redraw", "Force a full UI repaint (recovers from terminal drift)", "Session",
@@ -93,6 +100,8 @@ COMMAND_REGISTRY: list[CommandDef] = [
               aliases=("q",), args_hint="<prompt>"),
    CommandDef("steer", "Inject a message after the next tool call without interrupting", "Session",
               args_hint="<prompt>"),
+    CommandDef("goal", "Set a standing goal Hermes works on across turns until achieved", "Session",
+               args_hint="[text | pause | resume | clear | status]"),
    CommandDef("status", "Show session info", "Session"),
    CommandDef("profile", "Show active profile name and home directory", "Info"),
    CommandDef("sethome", "Set this chat as the home channel", "Session",
@@ -151,6 +160,11 @@ COMMAND_REGISTRY: list[CommandDef] = [
    CommandDef("curator", "Background skill maintenance (status, run, pin, archive)",
               "Tools & Skills", args_hint="[subcommand]",
               subcommands=("status", "run", "pause", "resume", "pin", "unpin", "restore")),
+    CommandDef("kanban", "Multi-profile collaboration board (tasks, links, comments)",
+               "Tools & Skills", args_hint="[subcommand]",
+               subcommands=("list", "ls", "show", "create", "assign", "link", "unlink",
+                            "claim", "comment", "complete", "block", "unblock", "archive",
+                            "tail", "dispatch", "context", "init", "gc")),
    CommandDef("reload", "Reload .env variables into the running session", "Tools & Skills",
               cli_only=True),
    CommandDef("reload-mcp", "Reload MCP servers from config", "Tools & Skills",
@@ -366,7 +380,7 @@ def _resolve_config_gates() -> set[str]:
            else:
                val = None
                break
-        if val:
+        if is_truthy_value(val, default=False):
            result.add(cmd.name)
    return result

@@ -387,6 +401,11 @@ def _is_gateway_available(cmd: CommandDef, config_overrides: set[str] | None = N
    return False


+def _requires_argument(args_hint: str) -> bool:
+    """Return True when selecting a command without text would be incomplete."""
+    return args_hint.strip().startswith("<")
+
+
 def gateway_help_lines() -> list[str]:
    """Generate gateway help text lines from the registry."""
    overrides = _resolve_config_gates()
@@ -443,7 +462,9 @@ def telegram_bot_commands() -> list[tuple[str, str]]:

    Telegram command names cannot contain hyphens, so they are replaced with
    underscores.  Aliases are skipped -- Telegram shows one menu entry per
-    canonical command.
+    canonical command. Commands that require arguments are skipped because
+    selecting a Telegram BotCommand sends only ``/command`` and would execute
+    an incomplete command.

    Plugin-registered slash commands are included so plugins get native
    autocomplete in Telegram without touching core code.
@@ -453,10 +474,14 @@ def telegram_bot_commands() -> list[tuple[str, str]]:
    for cmd in COMMAND_REGISTRY:
        if not _is_gateway_available(cmd, overrides):
            continue
+        if _requires_argument(cmd.args_hint):
+            continue
        tg_name = _sanitize_telegram_name(cmd.name)
        if tg_name:
            result.append((tg_name, cmd.description))
-    for name, description, _args_hint in _iter_plugin_command_entries():
+    for name, description, args_hint in _iter_plugin_command_entries():
+        if _requires_argument(args_hint):
+            continue
        tg_name = _sanitize_telegram_name(name)
        if tg_name:
            result.append((tg_name, description))
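The effect of the `_requires_argument` filter on the Telegram menu can be shown with a small standalone sketch. The `menu_entries` helper and the sample rows are hypothetical; only the `startswith("<")` rule is taken from the hunk above.

```python
from __future__ import annotations


def requires_argument(args_hint: str) -> bool:
    """Same rule as the helper above: a hint like "<prompt>" is mandatory."""
    return args_hint.strip().startswith("<")


def menu_entries(commands: list[tuple[str, str, str]]) -> list[tuple[str, str]]:
    """Keep only (name, description) pairs that are safe to send as a bare /command."""
    return [(name, desc) for name, desc, hint in commands if not requires_argument(hint)]


# Hypothetical registry rows: (name, description, args_hint).
SAMPLE = [
    ("new", "Start a new session", "[name]"),
    ("steer", "Inject a message", "<prompt>"),
    ("status", "Show session info", ""),
]

MENU = menu_entries(SAMPLE)
```

Optional hints in square brackets and empty hints pass through, so `/new` and `/status` stay in the menu while `/steer` is left out.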
@@ -490,9 +515,9 @@ def _sanitize_telegram_name(raw: str) -> str:


 def _clamp_command_names(
-    entries: list[tuple[str, str]],
+    entries: list[tuple[str, ...]],
    reserved: set[str],
-) -> list[tuple[str, str]]:
+) -> list[tuple[str, ...]]:
    """Enforce 32-char command name limit with collision avoidance.

    Both Telegram and Discord cap slash command names at 32 characters.
@@ -500,10 +525,15 @@ def _clamp_command_names(
    (against *reserved* names or earlier entries in the same batch), the name is
    shortened to 31 chars and a digit ``0``-``9`` is appended to differentiate.
    If all 10 digit slots are taken the entry is silently dropped.
+
+    Accepts tuples of any length >= 2.  Extra elements beyond ``(name, desc)``
+    (e.g. ``cmd_key``) are passed through unchanged, so callers can attach
+    metadata that survives the rename.
    """
    used: set[str] = set(reserved)
-    result: list[tuple[str, str]] = []
-    for name, desc in entries:
+    result: list[tuple] = []
+    for entry in entries:
+        name, desc, *extra = entry
        if len(name) > _CMD_NAME_LIMIT:
            candidate = name[:_CMD_NAME_LIMIT]
            if candidate in used:
@@ -519,7 +549,7 @@ def _clamp_command_names(
        if name in used:
            continue
        used.add(name)
-        result.append((name, desc))
+        result.append((name, desc, *extra))
    return result
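A simplified sketch of the pass-through behaviour, assuming the collision handling described in the docstring (truncate to 32 chars, then try a 31-char prefix plus a digit). The lines elided from the hunk are reconstructed from that description and may differ from the real code; the limit of 32 and the sample entries are assumptions.

```python
from __future__ import annotations

CMD_NAME_LIMIT = 32  # assumed value of _CMD_NAME_LIMIT, per the docstring


def clamp_command_names(entries, reserved):
    """Clamp names to the limit; elements after (name, desc) ride along untouched."""
    used = set(reserved)
    result = []
    for entry in entries:
        name, desc, *extra = entry
        if len(name) > CMD_NAME_LIMIT:
            candidate = name[:CMD_NAME_LIMIT]
            if candidate in used:
                prefix = name[:CMD_NAME_LIMIT - 1]
                candidate = next(
                    (prefix + str(d) for d in range(10) if prefix + str(d) not in used),
                    None,
                )
                if candidate is None:
                    continue  # all ten digit slots taken: drop the entry
            name = candidate
        if name in used:
            continue
        used.add(name)
        result.append((name, desc, *extra))
    return result


LONG_A = "a" * 40
TRIPLES = [
    (LONG_A, "first long skill", "/key-a"),
    (LONG_A + "x", "second long skill", "/key-b"),
    ("short", "short skill", "/key-c"),
]
CLAMPED = clamp_command_names(TRIPLES, reserved={"help"})
```

Because the third element travels with its entry, the caller no longer needs a `(name, desc)` lookup table, which would miss any entry whose name was rewritten by the clamp.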


@@ -602,13 +632,26 @@ def _collect_gateway_skill_entries(
    try:
        from agent.skill_commands import get_skill_commands
        from tools.skills_tool import SKILLS_DIR
+        from agent.skill_utils import get_external_skills_dirs
        _skills_dir = str(SKILLS_DIR.resolve())
-        _hub_dir = str((SKILLS_DIR / ".hub").resolve())
+        _hub_dir = str((SKILLS_DIR / ".hub").resolve()).rstrip("/") + "/"
+        # Build set of allowed directory prefixes: local skills dir + any
+        # user-configured ``skills.external_dirs``. Ensure each prefix ends
+        # with ``/`` so ``/my-skills`` does not also match ``/my-skills-extra``.
+        # Without this widening, external skills are visible in
+        # ``hermes skills list`` and the agent's ``/skill-name`` dispatch but
+        # silently excluded from gateway slash menus (#8110).
+        _allowed_prefixes = [_skills_dir.rstrip("/") + "/"]
+        _allowed_prefixes.extend(
+            str(d).rstrip("/") + "/" for d in get_external_skills_dirs()
+        )
        skill_cmds = get_skill_commands()
        for cmd_key in sorted(skill_cmds):
            info = skill_cmds[cmd_key]
            skill_path = info.get("skill_md_path", "")
-            if not skill_path.startswith(_skills_dir):
+            if not skill_path:
+                continue
+            if not any(skill_path.startswith(prefix) for prefix in _allowed_prefixes):
                continue
            if skill_path.startswith(_hub_dir):
                continue
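Why each allowed prefix is normalised to end in `/` can be seen in a short sketch. The helper names and paths below are hypothetical; only the `rstrip("/") + "/"` normalisation and the `startswith` test come from the hunk above.

```python
from __future__ import annotations


def as_dir_prefix(path: str) -> str:
    """Normalise a directory path so it always ends in exactly one slash."""
    return path.rstrip("/") + "/"


def is_under_any(skill_path: str, roots: list[str]) -> bool:
    """True when skill_path lies inside one of the root directories."""
    prefixes = [as_dir_prefix(r) for r in roots]
    return any(skill_path.startswith(p) for p in prefixes)


ROOTS = ["/home/u/.hermes/skills", "/my-skills"]  # hypothetical paths
INSIDE = is_under_any("/my-skills/mlops/foo/SKILL.md", ROOTS)
SIBLING = is_under_any("/my-skills-extra/foo/SKILL.md", ROOTS)
# Without the trailing slash, the sibling directory matches by accident.
NAIVE_SIBLING = "/my-skills-extra/foo/SKILL.md".startswith("/my-skills")
```

A string-prefix test is only as good as its inputs: it assumes both sides are already absolute and resolved, as the `SKILLS_DIR.resolve()` call in the hunk suggests for the local directory.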
@@ -626,17 +669,15 @@ def _collect_gateway_skill_entries(
    except Exception:
        pass

-    # Clamp names; _clamp_command_names works on (name, desc) pairs so we
-    # need to zip/unzip.
-    skill_pairs = [(n, d) for n, d, _ in skill_triples]
-    key_by_pair = {(n, d): k for n, d, k in skill_triples}
-    skill_pairs = _clamp_command_names(skill_pairs, reserved_names)
+    # Clamp names; cmd_key is passed through as extra payload so it survives
+    # any clamp-induced renames.
+    skill_triples = _clamp_command_names(skill_triples, reserved_names)

    # Skills fill remaining slots — only tier that gets trimmed
    remaining = max(0, max_slots - len(all_entries))
-    hidden_count = max(0, len(skill_pairs) - remaining)
-    for n, d in skill_pairs[:remaining]:
-        all_entries.append((n, d, key_by_pair.get((n, d), "")))
+    hidden_count = max(0, len(skill_triples) - remaining)
+    for n, d, k in skill_triples[:remaining]:
+        all_entries.append((n, d, k))

    return all_entries[:max_slots], hidden_count

@@ -712,24 +753,40 @@ def discord_skill_commands(
 def discord_skill_commands_by_category(
    reserved_names: set[str],
 ) -> tuple[dict[str, list[tuple[str, str, str]]], list[tuple[str, str, str]], int]:
-    """Return skill entries organized by category for Discord ``/skill`` subcommand groups.
+    """Return skill entries organized by category for Discord ``/skill`` autocomplete.

-    Skills whose directory is nested at least 2 levels under ``SKILLS_DIR``
+    Skills whose directory is nested at least 2 levels under a scan root
    (e.g. ``creative/ascii-art/SKILL.md``) are grouped by their top-level
    category.  Root-level skills (e.g. ``dogfood/SKILL.md``) are returned as
-    *uncategorized* — the caller should register them as direct subcommands
-    of the ``/skill`` group.
+    *uncategorized*.

-    The same filtering as :func:`discord_skill_commands` is applied: hub
-    skills excluded, per-platform disabled excluded, names clamped.
+    Scan roots include the local ``SKILLS_DIR`` **and** any configured
+    ``skills.external_dirs`` — matching the widened filter applied to the
+    flat ``discord_skill_commands()`` collector in #18741. Without this
+    parity, external-dir skills are visible via ``hermes skills list`` and
+    the agent's ``/skill-name`` dispatch but silently absent from Discord's
+    ``/skill`` autocomplete.
+
+    Filtering mirrors :func:`discord_skill_commands`: hub skills excluded,
+    per-platform disabled excluded, names clamped to 32 chars, descriptions
+    clamped to 100 chars.
+
+    The legacy 25-group × 25-subcommand caps (from the old nested
+    ``/skill <cat> <name>`` layout) are **not** applied — the live caller
+    (``_register_skill_group`` in ``gateway/platforms/discord.py``, refactored
+    in PR #11580) flattens these results and feeds them into a single
+    autocomplete callback, which scales to thousands of entries without any
+    per-command payload concerns. ``hidden_count`` is retained in the return
+    tuple for backward compatibility and now counts only skills dropped by a
+    32-char clamp collision (with a reserved name or with an earlier skill).

    Returns:
        ``(categories, uncategorized, hidden_count)``

        - *categories*: ``{category_name: [(name, description, cmd_key), ...]}``
        - *uncategorized*: ``[(name, description, cmd_key), ...]``
-        - *hidden_count*: skills dropped due to Discord group limits
-          (25 subcommand groups, 25 subcommands per group)
+        - *hidden_count*: skills dropped due to name clamp collisions
+          against already-registered command names.
    """
    from pathlib import Path as _P

@@ -743,14 +800,33 @@ def discord_skill_commands_by_category(
    # Collect raw skill data --------------------------------------------------
    categories: dict[str, list[tuple[str, str, str]]] = {}
    uncategorized: list[tuple[str, str, str]] = []
-    _names_used: set[str] = set(reserved_names)
+    # Map clamped-32-char-name → what it came from, so we can emit an
+    # actionable warning on collision. Reserved (gateway-builtin) command
+    # names are marked with a sentinel so the warning distinguishes
+    # "skill collided with a reserved command" from "two skills collided
+    # on the 32-char clamp" — the latter is the rename-worthy case.
+    _names_used: dict[str, str] = {n: "<reserved>" for n in reserved_names}
    hidden = 0

    try:
        from agent.skill_commands import get_skill_commands
+        from agent.skill_utils import get_external_skills_dirs
        from tools.skills_tool import SKILLS_DIR
+
        _skills_dir = SKILLS_DIR.resolve()
        _hub_dir = (SKILLS_DIR / ".hub").resolve()
+        # Build list of resolved scan roots (local dir first). Each external dir
+        # becomes its own scan root for category derivation — a skill at
+        # ``<external>/mlops/foo/SKILL.md`` is still categorized as "mlops".
+        _scan_roots: list[_P] = [_skills_dir]
+        try:
+            for ext in get_external_skills_dirs():
+                try:
+                    _scan_roots.append(_P(ext).resolve())
+                except Exception:
+                    continue
+        except Exception:
+            pass
        skill_cmds = get_skill_commands()

        for cmd_key in sorted(skill_cmds):
@@ -759,33 +835,72 @@ def discord_skill_commands_by_category(
            if not skill_path:
                continue
            sp = _P(skill_path).resolve()
-            # Skip skills outside SKILLS_DIR or from the hub
-            if not str(sp).startswith(str(_skills_dir)):
-                continue
+            # Hub skills are loaded via the skill hub, not surfaced as
+            # slash commands.
            if str(sp).startswith(str(_hub_dir)):
                continue
+            # Accept skill if it lives under any scan root; record the
+            # matching root so we can derive the category correctly.
+            matched_root: _P | None = None
+            for root in _scan_roots:
+                try:
+                    sp.relative_to(root)
+                except ValueError:
+                    continue
+                matched_root = root
+                break
+            if matched_root is None:
+                continue

            skill_name = info.get("name", "")
            if skill_name in _platform_disabled:
                continue

            raw_name = cmd_key.lstrip("/")
-            # Clamp to 32 chars (Discord limit)
+            # Clamp to 32 chars (Discord per-command name limit)
            discord_name = raw_name[:32]
            if discord_name in _names_used:
+                # The clamped name is already taken, by a reserved command or
+                # by an earlier skill sharing its first 32 chars. The earlier
+                # entry wins (this loop iterates ``sorted(skill_cmds)``, so
+                # alphabetically first); this one is dropped from autocomplete.
+                #
+                # Silently counting this as ``hidden`` (the old behavior)
+                # meant skill authors had no way to discover the drop —
+                # their skill just didn't appear in the picker. Emit a
+                # WARNING naming both sides so the author can rename the
+                # losing skill's frontmatter name to something with a
+                # distinct 32-char prefix.
+                prior = _names_used[discord_name]
+                if prior == "<reserved>":
+                    logger.warning(
+                        "Discord /skill: %r (from %r) collides on its 32-char "
+                        "clamp with a reserved gateway command name %r — the "
+                        "skill will not appear in the /skill autocomplete. "
+                        "Rename the skill's frontmatter ``name:`` to differ "
+                        "in its first 32 chars.",
+                        discord_name, cmd_key, discord_name,
+                    )
+                else:
+                    logger.warning(
+                        "Discord /skill: %r and %r both clamp to %r on "
+                        "Discord's 32-char command-name limit — only %r "
+                        "will appear in the /skill autocomplete. Rename "
+                        "one skill's frontmatter ``name:`` to differ in "
+                        "its first 32 chars.",
+                        prior, cmd_key, discord_name, prior,
+                    )
+                hidden += 1
                continue
-            _names_used.add(discord_name)
+            _names_used[discord_name] = cmd_key

            desc = info.get("description", "")
            if len(desc) > 100:
                desc = desc[:97] + "..."

-            # Determine category from the relative path within SKILLS_DIR.
-            # e.g. creative/ascii-art/SKILL.md → parts = ("creative", "ascii-art")
-            try:
-                rel = sp.parent.relative_to(_skills_dir)
-            except ValueError:
-                continue
+            # Determine category from the relative path within the matched
+            # scan root. e.g. creative/ascii-art/SKILL.md → ("creative", ...)
+            rel = sp.parent.relative_to(matched_root)
            parts = rel.parts
            if len(parts) >= 2:
                cat = parts[0]
@@ -795,28 +910,7 @@ def discord_skill_commands_by_category(
    except Exception:
        pass

-    # Enforce Discord limits: 25 subcommand groups, 25 subcommands each ------
-    _MAX_GROUPS = 25
-    _MAX_PER_GROUP = 25
-
-    trimmed_categories: dict[str, list[tuple[str, str, str]]] = {}
-    group_count = 0
-    for cat in sorted(categories):
-        if group_count >= _MAX_GROUPS:
-            hidden += len(categories[cat])
-            continue
-        entries = categories[cat][:_MAX_PER_GROUP]
-        hidden += max(0, len(categories[cat]) - _MAX_PER_GROUP)
-        trimmed_categories[cat] = entries
-        group_count += 1
-
-    # Uncategorized skills also count against the 25 top-level limit
-    remaining_slots = _MAX_GROUPS - group_count
-    if len(uncategorized) > remaining_slots:
-        hidden += len(uncategorized) - remaining_slots
-        uncategorized = uncategorized[:remaining_slots]
-
-    return trimmed_categories, uncategorized, hidden
+    return categories, uncategorized, hidden


 # ---------------------------------------------------------------------------
@@ -829,6 +923,13 @@ def discord_skill_commands_by_category(
 _SLACK_MAX_SLASH_COMMANDS = 50
 _SLACK_NAME_LIMIT = 32
 _SLACK_INVALID_CHARS = re.compile(r"[^a-z0-9_\-]")
+_SLACK_RESERVED_COMMANDS = frozenset({
+    # Built-in Slack slash commands that cannot be registered by apps.
+    # https://slack.com/help/articles/201259356-Use-built-in-slash-commands
+    "me", "status", "away", "dnd", "shrug", "remind", "msg", "feed",
+    "who", "collapse", "expand", "leave", "join", "open", "search",
+    "topic", "mute", "pro", "shortcuts",
+})


 def _sanitize_slack_name(raw: str) -> str:
@@ -855,6 +956,10 @@ def slack_native_slashes() -> list[tuple[str, str, str]]:
    documented form (e.g. ``/background``, ``/bg``, and ``/btw`` all work).
    Plugin-registered slash commands are included too.

+    Commands whose sanitized name collides with a Slack built-in
+    (e.g. ``/status``, ``/me``, ``/join``) are silently skipped.  Users
+    can still reach them via ``/hermes <command>``.
+
    Results are clamped to Slack's 50-command limit with duplicate-name
    avoidance. ``/hermes`` is always reserved as the first entry so the
    legacy ``/hermes <subcommand>`` form keeps working for anything that
@@ -872,6 +977,8 @@ def slack_native_slashes() -> list[tuple[str, str, str]]:
        slack_name = _sanitize_slack_name(name)
        if not slack_name or slack_name in seen:
            return
+        if slack_name in _SLACK_RESERVED_COMMANDS:
+            return
        if len(entries) >= _SLACK_MAX_SLASH_COMMANDS:
            return
        # Slack description cap is 2000 chars; keep it short.
@@ -1021,6 +1128,12 @@ class SlashCommandCompleter(Completer):
        except Exception:
            return {}

+    # Commands that open pickers when run without arguments.
+    # These should NOT receive a trailing space in completions because:
+    # - The TUI's submit handler applies completions on Enter if input differs
+    # - Adding space makes "/model" → "/model " which blocks picker execution
+    _PICKER_COMMANDS = frozenset({"model", "skin", "personality"})
+
    @staticmethod
    def _completion_text(cmd_name: str, word: str) -> str:
        """Return replacement text for a completion.
@@ -1029,8 +1142,17 @@ class SlashCommandCompleter(Completer):
        returning ``help`` would be a no-op and prompt_toolkit suppresses the
        menu. Appending a trailing space keeps the dropdown visible and makes
        backspacing retrigger it naturally.
+
+        However, commands that open pickers (model, skin, personality) should
+        NOT get a trailing space — the TUI would apply the completion on Enter
+        and block the picker from opening.
        """
-        return f"{cmd_name} " if cmd_name == word else cmd_name
+        if cmd_name != word:
+            return cmd_name
+        # Don't add space for picker commands — allows Enter to execute them
+        if cmd_name in SlashCommandCompleter._PICKER_COMMANDS:
+            return cmd_name
+        return f"{cmd_name} "

    @staticmethod
    def _extract_path_word(text: str) -> str | None:
@@ -400,7 +400,12 @@ DEFAULT_CONFIG = {
        # The gateway stops accepting new work, waits for running agents
        # to finish, then interrupts any remaining runs after the timeout.
        # 0 = no drain, interrupt immediately.
-        "restart_drain_timeout": 60,
+        #
+        # 180s is calibrated for realistic in-flight agent turns: a typical
+        # coding conversation mid-reasoning runs 60–150s per call, so a 60s
+        # budget routinely interrupted legitimate work on /restart. Raise
+        # further in config.yaml if you run very-long-reasoning models.
+        "restart_drain_timeout": 180,
        # Max app-level retry attempts for API errors (connection drops,
        # provider timeouts, 5xx, etc.) before the agent surfaces the
        # failure.  The OpenAI SDK already does its own low-level retries
@@ -457,6 +462,7 @@ DEFAULT_CONFIG = {
        # remains available as a tool regardless of this setting — the routing
        # only controls how inbound user images are presented.
        "image_input_mode": "auto",
+        "disabled_toolsets": [],
    },
    
    "terminal": {
@@ -606,6 +612,24 @@ DEFAULT_CONFIG = {
        "max_line_length": 2000,
    },

+    # Tool loop guardrails nudge models when they repeat failed or
+    # non-progressing tool calls. Soft warnings are enabled by default;
+    # hard stops are opt-in so interactive CLI/TUI sessions keep flowing.
+    "tool_loop_guardrails": {
+        "warnings_enabled": True,
+        "hard_stop_enabled": False,
+        "warn_after": {
+            "exact_failure": 2,
+            "same_tool_failure": 3,
+            "idempotent_no_progress": 2,
+        },
+        "hard_stop_after": {
+            "exact_failure": 5,
+            "same_tool_failure": 8,
+            "idempotent_no_progress": 5,
+        },
+    },
+
    "compression": {
        "enabled": True,
        "threshold": 0.50,            # compress when context usage exceeds this ratio
@@ -620,6 +644,18 @@ DEFAULT_CONFIG = {
        "cache_ttl": "5m",
    },

+    # OpenRouter-specific settings.
+    # response_cache: enable OpenRouter response caching (X-OpenRouter-Cache header).
+    #   When enabled, identical requests return cached responses for free (zero billing).
+    #   This is separate from Anthropic prompt caching and works alongside it.
+    #   See: https://openrouter.ai/docs/guides/features/response-caching
+    # response_cache_ttl: how long cached responses remain valid, in seconds (1-86400).
+    #   Default 300 (5 minutes). Only used when response_cache is enabled.
+    "openrouter": {
+        "response_cache": True,
+        "response_cache_ttl": 300,
+    },
+
    # AWS Bedrock provider configuration.
    # Only used when model.provider is "bedrock".
    "bedrock": {
@@ -756,6 +792,14 @@ DEFAULT_CONFIG = {
        "tool_progress_command": False,  # Enable /verbose command in messaging gateway
        "tool_progress_overrides": {},  # DEPRECATED — use display.platforms instead
        "tool_preview_length": 0,  # Max chars for tool call previews (0 = no limit, show full paths/commands)
+        # Auto-delete system-notice replies (e.g. "✨ New session started!",
+        # "♻ Restarting gateway…", "⚡ Stopped…") after N seconds on platforms
+        # that support message deletion (currently Telegram; other platforms
+        # ignore and leave the message in place).  Only affects slash-command
+        # replies wrapped with gateway.platforms.base.EphemeralReply — agent
+        # responses and content messages are never touched.  Default 0
+        # (disabled) preserves prior behavior.
+        "ephemeral_system_ttl": 0,
        "platforms": {},  # Per-platform display overrides: {"telegram": {"tool_progress": "all"}, "slack": {"tool_progress": "off"}}
        # Gateway runtime-metadata footer appended to the FINAL message of a turn
        # (disabled by default to keep replies minimal). When enabled, renders
@@ -765,6 +809,7 @@ DEFAULT_CONFIG = {
            "enabled": False,
            "fields": ["model", "context_pct", "cwd"],  # Order shown; drop any to hide
        },
+        "copy_shortcut": "auto",  # "auto" (platform default) | "ctrl_c" | "ctrl_shift_c" | "disabled"
    },

    # Web dashboard settings
@@ -798,7 +843,7 @@ DEFAULT_CONFIG = {
            # Voices: alloy, echo, fable, onyx, nova, shimmer
        },
        "xai": {
-            "voice_id": "eve",
+            "voice_id": "eve",  # or custom voice ID — see https://docs.x.ai/developers/model-capabilities/audio/custom-voices
            "language": "en",
            "sample_rate": 24000,
            "bit_rate": 128000,
@@ -925,7 +970,23 @@ DEFAULT_CONFIG = {
    # injected at the start of every API call for few-shot priming.
    # Never saved to sessions, logs, or trajectories.
    "prefill_messages_file": "",
-    
+
+    # Goals — persistent cross-turn goals (Ralph-style loop).
+    # After every turn, a lightweight judge call asks the auxiliary model
+    # whether the active /goal is satisfied by the assistant's last
+    # response. If not, Hermes feeds a continuation prompt back into the
+    # same session and keeps working until the goal is done, the turn
+    # budget is exhausted, or the user pauses/clears it. Judge failures
+    # fail OPEN (continue) so a flaky judge never wedges progress — the
+    # turn budget is the real backstop.
+    "goals": {
+        # Max continuation turns before Hermes auto-pauses the goal and
+        # asks the user to /goal resume. Protects against judge false
+        # negatives (goal actually done but judge says continue) and
+        # unbounded model spend on fuzzy / unachievable goals.
+        "max_turns": 20,
+    },
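The comment above describes the goal loop in prose: judge after every turn, continue if the goal is not met, fail open on judge errors, and pause once the turn budget is spent. A minimal sketch of that control flow, assuming hypothetical callables `continue_turn` and `judge` (the real Hermes interfaces are not shown in this diff):

```python
from typing import Callable, Tuple


def run_goal_loop(
    first_response: str,
    continue_turn: Callable[[], str],
    judge: Callable[[str], bool],
    max_turns: int = 20,
) -> Tuple[str, int]:
    """Hypothetical sketch: return (status, continuation_turns_used)."""
    response = first_response
    used = 0
    while True:
        try:
            done = judge(response)
        except Exception:
            # Fail open: a broken judge means "keep going", so the
            # turn budget below is the only hard backstop.
            done = False
        if done:
            return "done", used
        if used >= max_turns:
            return "paused", used
        response = continue_turn()
        used += 1
```

A judge that always raises therefore ends in `"paused"` after exactly `max_turns` continuation turns, which matches the "turn budget is the real backstop" wording in the comment.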
+
    # Skills — external skill directories for sharing skills across tools/agents.
    # Each path is expanded (~, ${VAR}) and resolved.  Read-only — skill creation
    # always goes to ~/.hermes/skills/.
@@ -979,6 +1040,14 @@ DEFAULT_CONFIG = {
        # Archive a skill (move to skills/.archive/) after this many days
        # without use. Archived skills are recoverable — no auto-deletion.
        "archive_after_days": 90,
+        # Pre-run backup: before every real curator pass (dry-run is
+        # skipped), snapshot ~/.hermes/skills/ into
+        # ~/.hermes/skills/.curator_backups/<utc-iso>/skills.tar.gz so the
+        # user can roll back with `hermes curator rollback`.
+        "backup": {
+            "enabled": True,
+            "keep": 5,  # retain last N regular snapshots
+        },
    },

    # Honcho AI-native memory -- reads ~/.honcho/config.json as single source of truth.
@@ -1104,6 +1173,24 @@ DEFAULT_CONFIG = {
        "max_parallel_jobs": None,
    },

+    # Kanban multi-agent coordination — controls the dispatcher loop that
+    # spawns workers for ready tasks. The dispatcher ticks every N seconds
+    # (default 60), reclaims stale claims, promotes dependency-satisfied
+    # todos to ready, and fires `hermes -p <assignee> chat -q ...` for
+    # each claimable ready task. One dispatcher per profile is sufficient;
+    # running more than one on the same kanban.db will race for claims.
+    "kanban": {
+        # Run the dispatcher inside the gateway process. On by default —
+        # the cost is ~300µs every `dispatch_interval_seconds` when idle,
+        # and gateway is the supervisor users already have. Set to false
+        # only if you run the dispatcher as a separate systemd unit or
+        # don't want the gateway to spawn workers.
+        "dispatch_in_gateway": True,
+        # Seconds between dispatcher ticks (idle or not). Lower = snappier
+        # pickup of newly-ready tasks; higher = less SQL pressure.
+        "dispatch_interval_seconds": 60,
+    },
+
    # execute_code settings — controls the tool used for programmatic tool calls.
    "code_execution": {
        # Execution mode:
@@ -1200,7 +1287,10 @@ DEFAULT_CONFIG = {
        # for a single update run.
        "pre_update_backup": False,
        # How many pre-update backup zips to retain.  Older ones are pruned
-        # automatically after each successful backup.
+        # automatically after each successful backup.  Values below 1 are
+        # floored to 1 — the backup just created is always preserved.  To
+        # disable backups entirely, set ``pre_update_backup: false`` above
+        # rather than ``backup_keep: 0``.
        "backup_keep": 5,
    },
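The updated comment says values below 1 are floored to 1 so the backup just created always survives pruning. The pruning code is not part of this hunk; a sketch of the rule, assuming hypothetical helpers `effective_backup_keep` and `prune_backups` and a fallback of 5 for unparseable input (the fallback is an assumption):

```python
from typing import List, Tuple


def effective_backup_keep(raw) -> int:
    """Hypothetical helper: floor backup_keep to 1."""
    try:
        keep = int(raw)
    except (TypeError, ValueError):
        return 5  # assumed fallback, matching the default in DEFAULT_CONFIG
    return max(1, keep)


def prune_backups(names: List[str], keep) -> Tuple[List[str], List[str]]:
    """Split backup names into (kept, pruned).

    Assumes names sort chronologically, e.g. timestamped file names.
    """
    ordered = sorted(names)
    n = effective_backup_keep(keep)
    return ordered[-n:], ordered[:-n]
```

Under this sketch `backup_keep: 0` behaves like `backup_keep: 1`: only the newest archive is retained, which is why the comment points to `pre_update_backup: false` for disabling backups.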

@@ -2400,7 +2490,17 @@ def get_missing_skill_config_vars() -> List[Dict[str, Any]]:
    except Exception:
        return []

-    all_vars = discover_all_skill_config_vars()
+    try:
+        all_vars = discover_all_skill_config_vars()
+    except Exception as e:
+        # A malformed SKILL.md, unreadable external skill dir, or similar
+        # should never break `hermes update`.  Skill-config prompting is a
+        # post-migration nicety, not a blocker.
+        import logging
+        logging.getLogger(__name__).debug(
+            "discover_all_skill_config_vars failed: %s", e
+        )
+        return []
    if not all_vars:
        return []

@@ -4579,7 +4679,9 @@ def set_config_value(key: str, value: str):
        "terminal.vercel_runtime": "TERMINAL_VERCEL_RUNTIME",
        "terminal.docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
        "terminal.docker_run_as_host_user": "TERMINAL_DOCKER_RUN_AS_HOST_USER",
-        "terminal.cwd": "TERMINAL_CWD",
+        # terminal.cwd intentionally excluded — CLI resolves at runtime,
+        # gateway bridges it in gateway/run.py. Persisting to .env causes
+        # stale values to poison child processes.
        "terminal.timeout": "TERMINAL_TIMEOUT",
        "terminal.sandbox_dir": "TERMINAL_SANDBOX_DIR",
        "terminal.persistent_shell": "TERMINAL_PERSISTENT_SHELL",
@@ -93,6 +93,8 @@ def cron_list(show_all: bool = False):
        script = job.get("script")
        if script:
            print(f"    Script:    {script}")
+        if job.get("no_agent"):
+            print(f"    Mode:      {color('no-agent', Colors.DIM)} (script stdout delivered directly)")
        workdir = job.get("workdir")
        if workdir:
            print(f"    Workdir:   {workdir}")
@@ -172,6 +174,7 @@ def cron_create(args):
        skills=_normalize_skills(getattr(args, "skill", None), getattr(args, "skills", None)),
        script=getattr(args, "script", None),
        workdir=getattr(args, "workdir", None),
+        no_agent=getattr(args, "no_agent", False) or None,
    )
    if not result.get("success"):
        print(color(f"Failed to create job: {result.get('error', 'unknown error')}", Colors.RED))
@@ -184,6 +187,8 @@ def cron_create(args):
    job_data = result.get("job", {})
    if job_data.get("script"):
        print(f"  Script: {job_data['script']}")
+    if job_data.get("no_agent"):
+        print("  Mode: no-agent (script stdout delivered directly)")
    if job_data.get("workdir"):
        print(f"  Workdir: {job_data['workdir']}")
    print(f"  Next run: {result['next_run_at']}")
@@ -225,6 +230,7 @@ def cron_edit(args):
        skills=final_skills,
        script=getattr(args, "script", None),
        workdir=getattr(args, "workdir", None),
+        no_agent=getattr(args, "no_agent", None),
    )
    if not result.get("success"):
        print(color(f"Failed to update job: {result.get('error', 'unknown error')}", Colors.RED))
@@ -240,6 +246,8 @@ def cron_edit(args):
        print("  Skills: none")
    if updated.get("script"):
        print(f"  Script: {updated['script']}")
+    if updated.get("no_agent"):
+        print("  Mode: no-agent (script stdout delivered directly)")
    if updated.get("workdir"):
        print(f"  Workdir: {updated['workdir']}")
    return 0
@@ -108,6 +108,49 @@ def _cmd_status(args) -> int:
                f"last_activity={last}"
            )

+    # Show top 5 most-active and least-active skills by activity_count
+    # (use + view + patch). This is a different signal from
+    # least-recently-active: activity_count reflects frequency,
+    # last_activity_at reflects recency. A skill touched 30 times a year
+    # ago is high-frequency but stale; a skill touched once yesterday is
+    # recent but low-frequency. Both can matter.
+    active_all = by_state.get("active", [])
+    if active_all:
+        most_active = sorted(
+            active_all,
+            key=lambda r: (r.get("activity_count") or 0, r.get("last_activity_at") or ""),
+            reverse=True,
+        )[:5]
+        if most_active and (most_active[0].get("activity_count") or 0) > 0:
+            print("\nmost active (top 5):")
+            for r in most_active:
+                last = _fmt_ts(r.get("last_activity_at"))
+                print(
+                    f"  {r['name']:40s}  "
+                    f"activity={r.get('activity_count', 0):3d}  "
+                    f"use={r.get('use_count', 0):3d}  "
+                    f"view={r.get('view_count', 0):3d}  "
+                    f"patches={r.get('patch_count', 0):3d}  "
+                    f"last_activity={last}"
+                )
+
+        least_active = sorted(
+            active_all,
+            key=lambda r: (r.get("activity_count") or 0, r.get("last_activity_at") or ""),
+        )[:5]
+        if least_active:
+            print("\nleast active (bottom 5):")
+            for r in least_active:
+                last = _fmt_ts(r.get("last_activity_at"))
+                print(
+                    f"  {r['name']:40s}  "
+                    f"activity={r.get('activity_count', 0):3d}  "
+                    f"use={r.get('use_count', 0):3d}  "
+                    f"view={r.get('view_count', 0):3d}  "
+                    f"patches={r.get('patch_count', 0):3d}  "
+                    f"last_activity={last}"
+                )
+
    return 0
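The two listings above share one sort key: frequency first, recency as a tiebreaker. A small standalone illustration with made-up rows, assuming `last_activity_at` is an ISO-8601 string so that plain string comparison orders it chronologically:

```python
# Made-up sample rows; field names follow the hunk above.
rows = [
    {"name": "a", "activity_count": 30, "last_activity_at": "2025-01-01T00:00:00Z"},
    {"name": "b", "activity_count": 1, "last_activity_at": "2026-05-01T00:00:00Z"},
    {"name": "c", "activity_count": None, "last_activity_at": None},
    {"name": "d", "activity_count": 30, "last_activity_at": "2026-04-01T00:00:00Z"},
]


def activity_key(r: dict) -> tuple:
    # Missing or None values sort as 0 / "" so they land at the low end.
    return (r.get("activity_count") or 0, r.get("last_activity_at") or "")


most_active = [r["name"] for r in sorted(rows, key=activity_key, reverse=True)[:2]]
least_active = [r["name"] for r in sorted(rows, key=activity_key)[:2]]

print(most_active)   # ['d', 'a']
print(least_active)  # ['c', 'b']
```

Skills `a` and `d` tie on frequency, so the more recently touched `d` ranks first; `c` has no recorded activity and sorts lowest.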


@@ -117,7 +160,11 @@ def _cmd_run(args) -> int:
        print("curator: disabled via config; enable with `curator.enabled: true`")
        return 1

-    print("curator: running review pass...")
+    dry = bool(getattr(args, "dry_run", False))
+    if dry:
+        print("curator: running DRY-RUN (report only, no mutations)...")
+    else:
+        print("curator: running review pass...")

    def _on_summary(msg: str) -> None:
        print(msg)
@@ -125,17 +172,29 @@ def _cmd_run(args) -> int:
    result = curator.run_curator_review(
        on_summary=_on_summary,
        synchronous=bool(args.synchronous),
+        dry_run=dry,
    )
    auto = result.get("auto_transitions", {})
    if auto:
-        print(
-            f"auto: checked={auto.get('checked', 0)} "
-            f"stale={auto.get('marked_stale', 0)} "
-            f"archived={auto.get('archived', 0)} "
-            f"reactivated={auto.get('reactivated', 0)}"
-        )
+        if dry:
+            print(
+                f"auto (preview): {auto.get('checked', 0)} candidate skill(s) "
+                "— no transitions applied in dry-run"
+            )
+        else:
+            print(
+                f"auto: checked={auto.get('checked', 0)} "
+                f"stale={auto.get('marked_stale', 0)} "
+                f"archived={auto.get('archived', 0)} "
+                f"reactivated={auto.get('reactivated', 0)}"
+            )
    if not args.synchronous:
        print("llm pass running in background — check `hermes curator status` later")
+    if dry:
+        print(
+            "dry-run: no changes applied. When the report lands, read it with "
+            "`hermes curator status` and run `hermes curator run` (no flag) to apply."
+        )
    return 0


@@ -186,6 +245,98 @@ def _cmd_restore(args) -> int:
    return 0 if ok else 1


+def _cmd_backup(args) -> int:
+    """Take a manual snapshot of the skills tree. Same mechanism as the
+    automatic pre-run snapshot, just user-initiated."""
+    from agent import curator_backup
+    if not curator_backup.is_enabled():
+        print(
+            "curator: backups are disabled via config "
+            "(`curator.backup.enabled: false`); re-enable to snapshot"
+        )
+        return 1
+    reason = getattr(args, "reason", None) or "manual"
+    snap = curator_backup.snapshot_skills(reason=reason)
+    if snap is None:
+        print("curator: snapshot failed — check logs (backup disabled or IO error)")
+        return 1
+    print(f"curator: snapshot created at ~/.hermes/skills/.curator_backups/{snap.name}")
+    return 0
+
+
+def _cmd_rollback(args) -> int:
+    """Restore the skills tree from a snapshot. Defaults to newest.
+
+    ``--list`` prints available snapshots and exits. ``--id <stamp>`` picks
+    a specific one. Without ``-y``, prompts for confirmation. A safety
+    snapshot of the current tree is always taken first, so rollbacks are
+    themselves undoable.
+    """
+    from agent import curator_backup
+
+    if getattr(args, "list", False):
+        print(curator_backup.summarize_backups())
+        return 0
+
+    backup_id = getattr(args, "backup_id", None)
+    target_path = curator_backup._resolve_backup(backup_id)
+    if target_path is None:
+        rows = curator_backup.list_backups()
+        if not rows:
+            print(
+                "curator: no snapshots exist yet. Take one with "
+                "`hermes curator backup` or wait for the next curator run."
+            )
+        else:
+            print(
+                f"curator: no snapshot matching "
+                f"{'id ' + repr(backup_id) if backup_id else 'your query'}."
+            )
+            print("Available:")
+            print(curator_backup.summarize_backups())
+        return 1
+
+    manifest = curator_backup._read_manifest(target_path)
+    print(f"Rollback target: {target_path.name}")
+    if manifest:
+        print(f"  reason:      {manifest.get('reason', '?')}")
+        print(f"  created_at:  {manifest.get('created_at', '?')}")
+        print(f"  skill files: {manifest.get('skill_files', '?')}")
+        cron = manifest.get("cron_jobs") or {}
+        if isinstance(cron, dict):
+            if cron.get("backed_up"):
+                print(
+                    f"  cron jobs:   {cron.get('jobs_count', 0)} "
+                    "(will be restored for skill-link fields only)"
+                )
+            else:
+                reason = cron.get("reason", "not captured")
+                print(f"  cron jobs:   not in snapshot ({reason})")
+    print(
+        "\nThis will replace the current ~/.hermes/skills/ tree (a safety "
+        "snapshot of the current state is taken first so this is undoable). "
+        "Cron jobs that still exist will have their skills/skill fields "
+        "restored from the snapshot; all other cron fields are left alone."
+    )
+
+    if not getattr(args, "yes", False):
+        try:
+            ans = input("Proceed? [y/N] ").strip().lower()
+        except (EOFError, KeyboardInterrupt):
+            print("\ncancelled")
+            return 1
+        if ans not in ("y", "yes"):
+            print("cancelled")
+            return 1
+
+    ok, msg, _ = curator_backup.rollback(backup_id=target_path.name)
+    if ok:
+        print(f"curator: {msg}")
+        return 0
+    print(f"curator: rollback failed — {msg}")
+    return 1
+
+
 # ---------------------------------------------------------------------------
 # argparse wiring (called from hermes_cli.main)
 # ---------------------------------------------------------------------------
@@ -207,6 +358,11 @@ def register_cli(parent: argparse.ArgumentParser) -> None:
        "--sync", "--synchronous", dest="synchronous", action="store_true",
        help="Wait for the LLM review pass to finish (default: background thread)",
    )
+    p_run.add_argument(
+        "--dry-run", dest="dry_run", action="store_true",
+        help="Report only — no state changes, no archives, no consolidation "
+             "(use this to preview what curator would do)",
+    )
    p_run.set_defaults(func=_cmd_run)

    p_pause = subs.add_parser("pause", help="Pause the curator until resumed")
@@ -227,6 +383,36 @@ def register_cli(parent: argparse.ArgumentParser) -> None:
    p_restore.add_argument("skill", help="Skill name")
    p_restore.set_defaults(func=_cmd_restore)

+    p_backup = subs.add_parser(
+        "backup",
+        help="Take a manual tar.gz snapshot of ~/.hermes/skills/ "
+             "(curator also does this automatically before every real run)",
+    )
+    p_backup.add_argument(
+        "--reason", default=None,
+        help="Free-text label stored in manifest.json (default: 'manual')",
+    )
+    p_backup.set_defaults(func=_cmd_backup)
+
+    p_rollback = subs.add_parser(
+        "rollback",
+        help="Restore ~/.hermes/skills/ from a curator snapshot "
+             "(defaults to the newest)",
+    )
+    p_rollback.add_argument(
+        "--list", action="store_true",
+        help="List available snapshots and exit without restoring",
+    )
+    p_rollback.add_argument(
+        "--id", dest="backup_id", default=None,
+        help="Snapshot id to restore (see `--list`); default: newest",
+    )
+    p_rollback.add_argument(
+        "-y", "--yes", action="store_true",
+        help="Skip confirmation prompt",
+    )
+    p_rollback.set_defaults(func=_cmd_rollback)
+

 def cli_main(argv=None) -> int:
    """Standalone entry (also usable by hermes_cli.main fallthrough)."""
@@ -156,6 +156,8 @@ def curses_checklist(
        flush_stdin()
        return result_holder[0] if result_holder[0] is not None else cancel_returns

+    except KeyboardInterrupt:
+        return cancel_returns
    except Exception:
        return _numbered_fallback(title, items, selected, cancel_returns, status_fn)

@@ -278,6 +280,8 @@ def curses_radiolist(
        flush_stdin()
        return result_holder[0] if result_holder[0] is not None else cancel_returns

+    except KeyboardInterrupt:
+        return cancel_returns
    except Exception:
        return _radio_numbered_fallback(title, items, selected, cancel_returns)

@@ -401,6 +405,8 @@ def curses_single_select(
            return None
        return result_holder[0]

+    except KeyboardInterrupt:
+        return None
    except Exception:
        all_items = list(items) + [cancel_label]
        cancel_idx = len(items)
@@ -1,12 +1,19 @@
-"""``hermes debug`` — debug tools for Hermes Agent.
+"""``hermes debug``: debug tools for Hermes Agent.

 Currently supports:
    hermes debug share    Upload debug report (system info + logs) to a
                          paste service and print a shareable URL.
+                          By default, log content is run through
+                          ``agent.redact.redact_sensitive_text`` with
+                          ``force=True`` before upload so credentials in
+                          ``~/.hermes/logs/*.log`` are not leaked into
+                          the public paste service. Pass ``--no-redact``
+                          to disable.
 """

 import io
 import json
+import logging
 import sys
 import time
 import urllib.error
@@ -19,6 +26,16 @@ from typing import Optional
 from hermes_constants import get_hermes_home
 from utils import atomic_replace

+logger = logging.getLogger(__name__)
+
+# Banner prepended to upload-bound log content when redaction is enabled.
+# Visible in the public paste so reviewers know the content was sanitized.
+# Kept short; the trailing newline guarantees the banner sits on its own line.
+_REDACTION_BANNER = (
+    "[hermes debug share: log content redacted at upload time. "
+    "run with --no-redact to disable]\n"
+)
+

 # ---------------------------------------------------------------------------
 # Paste services — try paste.rs first, dpaste.com as fallback.
@@ -368,17 +385,40 @@ def _resolve_log_path(log_name: str) -> Optional[Path]:
    return None


+def _redact_log_text(text: str) -> str:
+    """Run ``redact_sensitive_text`` with ``force=True`` over upload-bound text.
+
+    Uses ``force=True`` so redaction fires regardless of the operator's
+    ``security.redact_secrets`` setting. The local on-disk log file is
+    not modified; only the in-memory copy headed for the public paste
+    service is sanitized. Returns the redacted text, or the input
+    unchanged when it is falsy (empty string or ``None``).
+    """
+    if not text:
+        return text
+    from agent.redact import redact_sensitive_text
+
+    return redact_sensitive_text(text, force=True)
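The pattern here is "read once, redact the in-memory copy, never touch the file". A self-contained sketch of that pattern follows; `fake_redact` and its regex are stand-ins invented for illustration, since the real rules in `agent.redact.redact_sensitive_text` are not shown in this diff:

```python
import re
from dataclasses import dataclass
from typing import Optional

# Stand-in pattern for illustration only; not the project's rules.
_SECRET_RE = re.compile(r"\b(?:sk-|ghp_)[A-Za-z0-9]{8,}\b")


def fake_redact(text: str) -> str:
    """Hypothetical redactor: mask token-looking substrings."""
    if not text:
        return text
    return _SECRET_RE.sub("[REDACTED]", text)


@dataclass
class Snapshot:
    tail_text: str
    full_text: Optional[str]


def capture(raw: str, tail_lines: int, redact: bool = True) -> Snapshot:
    """Derive tail and full views from one read, then redact both."""
    lines = raw.splitlines()
    tail = "\n".join(lines[-tail_lines:]) if tail_lines > 0 else ""
    full = raw
    if redact:
        tail, full = fake_redact(tail), fake_redact(full)
    return Snapshot(tail_text=tail, full_text=full)
```

Because both views come from the same `raw` string and pass through the same redactor, the report tail and the full-log paste cannot disagree about what was masked.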
+
+
 def _capture_log_snapshot(
    log_name: str,
    *,
    tail_lines: int,
    max_bytes: int = _MAX_LOG_BYTES,
+    redact: bool = True,
 ) -> LogSnapshot:
    """Capture a log once and derive summary/full-log views from it.

    The report tail and standalone log upload must come from the same file
    snapshot. Otherwise a rotation/truncate between reads can make the report
    look newer than the uploaded ``agent.log`` paste.
+
+    When ``redact`` is True (the default), both ``tail_text`` and
+    ``full_text`` are run through ``_redact_log_text`` so the snapshot
+    returned is upload-safe. The on-disk log file is never modified.
+    Pass ``redact=False`` to capture original log content (used by
+    ``hermes debug share --no-redact``).
    """
    log_path = _resolve_log_path(log_name)
    if log_path is None:
@@ -438,18 +478,34 @@ def _capture_log_snapshot(
        if truncated:
            full_text = f"[... truncated — showing last ~{max_bytes // 1024}KB ...]\n{full_text}"

+        if redact:
+            tail_text = _redact_log_text(tail_text)
+            full_text = _redact_log_text(full_text)
+
        return LogSnapshot(path=log_path, tail_text=tail_text, full_text=full_text)
    except Exception as exc:
        return LogSnapshot(path=log_path, tail_text=f"(error reading: {exc})", full_text=None)


-def _capture_default_log_snapshots(log_lines: int) -> dict[str, LogSnapshot]:
-    """Capture all logs used by debug-share exactly once."""
+def _capture_default_log_snapshots(
+    log_lines: int, *, redact: bool = True
+) -> dict[str, LogSnapshot]:
+    """Capture all logs used by debug-share exactly once.
+
+    ``redact`` is forwarded to each ``_capture_log_snapshot`` call so all
+    captured logs share the same redaction policy for a given run.
+    """
    errors_lines = min(log_lines, 100)
    return {
-        "agent": _capture_log_snapshot("agent", tail_lines=log_lines),
-        "errors": _capture_log_snapshot("errors", tail_lines=errors_lines),
-        "gateway": _capture_log_snapshot("gateway", tail_lines=errors_lines),
+        "agent": _capture_log_snapshot(
+            "agent", tail_lines=log_lines, redact=redact
+        ),
+        "errors": _capture_log_snapshot(
+            "errors", tail_lines=errors_lines, redact=redact
+        ),
+        "gateway": _capture_log_snapshot(
+            "gateway", tail_lines=errors_lines, redact=redact
+        ),
    }


@@ -532,6 +588,7 @@ def run_debug_share(args):
    log_lines = getattr(args, "lines", 200)
    expiry = getattr(args, "expire", 7)
    local_only = getattr(args, "local", False)
+    redact = not getattr(args, "no_redact", False)

    if not local_only:
        print(_PRIVACY_NOTICE)
@@ -539,8 +596,16 @@ def run_debug_share(args):
    print("Collecting debug report...")

    # Capture dump once — prepended to every paste for context.
+    # The dump is already redacted at extract time via dump.py:_redact;
+    # log_snapshots are redacted by _capture_default_log_snapshots when
+    # redact=True so credentials never reach the public paste service.
    dump_text = _capture_dump()
-    log_snapshots = _capture_default_log_snapshots(log_lines)
+    log_snapshots = _capture_default_log_snapshots(log_lines, redact=redact)
+
+    if redact:
+        logger.info(
+            "hermes debug share: applied force-mode redaction to log snapshots before upload"
+        )

    report = collect_debug_report(
        log_lines=log_lines,
@@ -556,6 +621,15 @@ def run_debug_share(args):
    if gateway_log:
        gateway_log = dump_text + "\n\n--- full gateway.log ---\n" + gateway_log

+    # Visible banner so reviewers reading the public paste know redaction
+    # was applied at upload time. Banner is omitted under --no-redact.
+    if redact:
+        report = _REDACTION_BANNER + report
+        if agent_log:
+            agent_log = _REDACTION_BANNER + agent_log
+        if gateway_log:
+            gateway_log = _REDACTION_BANNER + gateway_log
+
    if local_only:
        print(report)
        if agent_log:
@@ -666,6 +740,7 @@ def run_debug(args):
        print("  --lines N    Number of log lines to include (default: 200)")
        print("  --expire N   Paste expiry in days (default: 7)")
        print("  --local      Print report locally instead of uploading")
+        print("  --no-redact  Disable upload-time secret redaction (default: redact)")
        print()
        print("Options (delete):")
        print("  <url> ...    One or more paste URLs to delete")
@@ -263,8 +263,11 @@ def run_doctor(args):
    if env_path.exists():
        check_ok(f"{_DHH}/.env file exists")
        
-        # Check for common issues
-        content = env_path.read_text()
+        # Check for common issues. Pin encoding to UTF-8 because .env files are
+        # written as UTF-8 everywhere in the codebase, while Path.read_text()
+        # defaults to the system locale — which crashes on non-UTF-8 Windows
+        # locales (e.g. GBK) as soon as the file contains any non-ASCII byte.
+        content = env_path.read_text(encoding="utf-8")
        if _has_provider_env_config(content):
            check_ok("API key or custom endpoint configured")
        else:
@@ -932,6 +935,8 @@ def run_doctor(args):
        agent_browser_path = PROJECT_ROOT / "node_modules" / "agent-browser"
        if agent_browser_path.exists():
            check_ok("agent-browser (Node.js)", "(browser automation)")
+        elif shutil.which("agent-browser"):
+            check_ok("agent-browser", "(browser automation)")
        else:
            if _is_termux():
                check_info("agent-browser is not installed (expected in the tested Termux path)")
@@ -1093,9 +1098,10 @@ def run_doctor(args):
        ("Hugging Face",     ("HF_TOKEN",),                                   "https://router.huggingface.co/v1/models", "HF_BASE_URL", True),
        ("NVIDIA NIM",       ("NVIDIA_API_KEY",),                             "https://integrate.api.nvidia.com/v1/models", "NVIDIA_BASE_URL", True),
        ("Alibaba/DashScope", ("DASHSCOPE_API_KEY",),                         "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models", "DASHSCOPE_BASE_URL", True),
-        # MiniMax: the /anthropic endpoint doesn't support /models, but the /v1 endpoint does.
+        # MiniMax global: /v1 endpoint supports /models.
        ("MiniMax",          ("MINIMAX_API_KEY",),                            "https://api.minimax.io/v1/models",    "MINIMAX_BASE_URL", True),
-        ("MiniMax (China)",  ("MINIMAX_CN_API_KEY",),                         "https://api.minimaxi.com/v1/models",  "MINIMAX_CN_BASE_URL", True),
+        # MiniMax CN: /v1 endpoint does NOT support /models (returns 404).
+        ("MiniMax (China)",  ("MINIMAX_CN_API_KEY",),                         "https://api.minimaxi.com/v1/models",  "MINIMAX_CN_BASE_URL", False),
        ("Vercel AI Gateway",       ("AI_GATEWAY_API_KEY",),                          "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
        ("Kilo Code",        ("KILOCODE_API_KEY",),                            "https://api.kilo.ai/api/gateway/models",  "KILOCODE_BASE_URL", True),
        ("OpenCode Zen",     ("OPENCODE_ZEN_API_KEY",),                        "https://opencode.ai/zen/v1/models",  "OPENCODE_ZEN_BASE_URL", True),
@@ -1258,9 +1264,23 @@ def run_doctor(args):
        check_warn("Skills Hub directory not initialized", "(run: hermes skills list)")

    from hermes_cli.config import get_env_value
+
+    def _gh_authenticated() -> bool:
+        """Check if gh CLI is authenticated via token file or device flow."""
+        try:
+            result = subprocess.run(
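+                # Rely on the exit code only: plain ``gh auth status`` exits
+                # non-zero when not logged in and works on gh builds that
+                # predate the ``--json`` flag.
+                ["gh", "auth", "status"],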
+                capture_output=True, timeout=10,
+            )
+            return result.returncode == 0
+        except (FileNotFoundError, subprocess.TimeoutExpired):
+            return False
+
    github_token = get_env_value("GITHUB_TOKEN") or get_env_value("GH_TOKEN")
    if github_token:
        check_ok("GitHub token configured (authenticated API access)")
+    elif _gh_authenticated():
+        check_ok("GitHub authenticated via gh CLI", "(full API access — no GITHUB_TOKEN needed)")
    else:
        check_warn("No GITHUB_TOKEN", f"(60 req/hr rate limit — set in {_DHH}/.env for better rates)")

@@ -10,6 +10,7 @@ import shutil
 import signal
 import subprocess
 import sys
+import textwrap
 from dataclasses import dataclass
 from pathlib import Path

@@ -59,6 +60,13 @@ class GatewayRuntimeSnapshot:
    def has_process_service_mismatch(self) -> bool:
        return self.service_installed and self.running and not self.service_running

+
+@dataclass(frozen=True)
+class ProfileGatewayProcess:
+    profile: str
+    path: Path
+    pid: int
+
 def _get_service_pids() -> set:
    """Return PIDs currently managed by systemd or launchd gateway services.

@@ -180,7 +188,7 @@ def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:

    SIGUSR1 is wired in gateway/run.py to ``request_restart(via_service=True)``
    which drains in-flight agent runs (up to ``agent.restart_drain_timeout``
-    seconds), then exits with code 75.  Both systemd (``Restart=on-failure``
+    seconds), then exits with code 75.  Both systemd (``Restart=always``
    + ``RestartForceExitStatus=75``) and launchd (``KeepAlive.SuccessfulExit
    = false``) relaunch the process after the graceful exit.

@@ -229,6 +237,26 @@ def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:
    return False


+def _get_ancestor_pids() -> set[int]:
+    """Return the set of PIDs in the current process's ancestor chain.
+
+    Walks from the current PID up to PID 1 (init) so that process-table scans
+    never match the calling CLI process or any of its parents.  This prevents
+    ``hermes gateway status`` from falsely counting the ``hermes`` CLI that
+    invoked it as a running gateway instance (see #13242).
+    """
+    ancestors: set[int] = set()
+    pid = os.getpid()
+    # Cap iterations to avoid infinite loops on exotic platforms.
+    for _ in range(64):
+        ancestors.add(pid)
+        parent = _get_parent_pid(pid)
+        if parent is None or parent <= 0 or parent in ancestors:
+            break
+        pid = parent
+    return ancestors
+
+
 def _append_unique_pid(pids: list[int], pid: int | None, exclude_pids: set[int]) -> None:
    if pid is None or pid <= 0:
        return
@@ -244,6 +272,10 @@ def _scan_gateway_pids(exclude_pids: set[int], all_profiles: bool = False) -> li
    a live gateway when the PID file is stale/missing, and ``--all`` sweeps can
    discover gateways outside the current profile.
    """
+    # Exclude the entire ancestor chain so the CLI process that invoked this
+    # scan (e.g. ``hermes gateway status``) is never mistaken for a running
+    # gateway.  See #13242.
+    exclude_pids = exclude_pids | _get_ancestor_pids()
    pids: list[int] = []
    patterns = [
        "hermes_cli.main gateway",
@@ -371,6 +403,83 @@ def find_gateway_pids(exclude_pids: set | None = None, all_profiles: bool = Fals
    return pids


+def find_profile_gateway_processes(
+    exclude_pids: set | None = None,
+) -> list[ProfileGatewayProcess]:
+    """Return running gateway PIDs mapped to Hermes profiles via PID files."""
+    _exclude = set(exclude_pids or set())
+    processes: list[ProfileGatewayProcess] = []
+    try:
+        from gateway.status import get_running_pid
+        from hermes_cli.profiles import list_profiles
+    except Exception:
+        return processes
+
+    seen: set[int] = set()
+    for profile in list_profiles():
+        try:
+            pid = get_running_pid(profile.path / "gateway.pid", cleanup_stale=False)
+        except Exception:
+            continue
+        if pid is None or pid <= 0 or pid in _exclude or pid in seen:
+            continue
+        seen.add(pid)
+        processes.append(ProfileGatewayProcess(profile=profile.name, path=profile.path, pid=pid))
+    return processes
+
+
+def _gateway_run_args_for_profile(profile: str) -> list[str]:
+    args = [get_python_path(), "-m", "hermes_cli.main"]
+    if profile != "default":
+        args.extend(["--profile", profile])
+    args.extend(["gateway", "run", "--replace"])
+    return args
+
+
+def launch_detached_profile_gateway_restart(profile: str, old_pid: int) -> bool:
+    """Relaunch a manually-run profile gateway after its current PID exits."""
+    if old_pid <= 0:
+        return False
+
+    watcher = textwrap.dedent(
+        """
+        import os
+        import subprocess
+        import sys
+        import time
+
+        pid = int(sys.argv[1])
+        cmd = sys.argv[2:]
+        deadline = time.monotonic() + 120
+        while time.monotonic() < deadline:
+            try:
+                os.kill(pid, 0)
+            except ProcessLookupError:
+                break
+            except PermissionError:
+                pass
+            time.sleep(0.2)
+        subprocess.Popen(
+            cmd,
+            stdout=subprocess.DEVNULL,
+            stderr=subprocess.DEVNULL,
+            start_new_session=True,
+        )
+        """
+    ).strip()
+
+    try:
+        subprocess.Popen(
+            [sys.executable, "-c", watcher, str(old_pid), *_gateway_run_args_for_profile(profile)],
+            stdout=subprocess.DEVNULL,
+            stderr=subprocess.DEVNULL,
+            start_new_session=True,
+        )
+    except OSError:
+        return False
+    return True
+
+
 def _probe_systemd_service_running(system: bool = False) -> tuple[bool, bool]:
    selected_system = _select_systemd_scope(system)
    unit_exists = get_systemd_unit_path(system=selected_system).exists()
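
The embedded watcher in `launch_detached_profile_gateway_restart` polls the old gateway with `os.kill(pid, 0)` until it disappears. A standalone sketch of that liveness probe is shown here, assuming a POSIX host (on Windows `os.kill` behaves differently and should not be used this way); `pid_is_alive` and `wait_for_exit` are illustrative names, not part of the diff.

```python
import os
import time


def pid_is_alive(pid: int) -> bool:
    """POSIX-only probe: signal 0 checks that the PID exists without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True   # process exists but belongs to another user
    return True


def wait_for_exit(pid: int, timeout: float = 120.0, poll: float = 0.2) -> bool:
    """Return True once the PID is gone, or False if it is still alive at the deadline."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not pid_is_alive(pid):
            return True
        time.sleep(poll)
    return not pid_is_alive(pid)
```

Unlike the watcher in the diff, which launches the replacement gateway even when the deadline passes, this sketch returns a boolean so the caller can decide what to do on timeout.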
@@ -605,6 +714,32 @@ def _print_gateway_process_mismatch(snapshot: GatewayRuntimeSnapshot) -> None:
    print("  can refuse to start another copy until this process stops.")


+def _print_other_profiles_gateway_status() -> None:
+    """Print a summary of gateway status across all profiles.
+
+    Shown at the bottom of ``hermes gateway status`` output so users with
+    multiple profiles can tell at a glance which gateways are running and
+    avoid confusing another profile's process with the current one.
+    """
+    try:
+        from hermes_cli.profiles import get_active_profile_name
+
+        current = get_active_profile_name()
+        other_processes = [
+            p for p in find_profile_gateway_processes()
+            if p.profile != current
+        ]
+        if not other_processes:
+            return
+
+        print()
+        print("Other profiles:")
+        for proc in other_processes:
+            print(f"  ✓ {proc.profile:<16s} — PID {proc.pid}")
+    except Exception:
+        pass
+
+
 def kill_gateway_processes(force: bool = False, exclude_pids: set | None = None,
                           all_profiles: bool = False) -> int:
    """Kill any running gateway processes. Returns count killed.
@@ -650,6 +785,12 @@ def stop_profile_gateway() -> bool:
    if pid is None:
        return False

+    try:
+        from gateway.status import write_planned_stop_marker
+        write_planned_stop_marker(pid)
+    except Exception:
+        pass
+
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
@@ -1473,6 +1614,46 @@ def _build_user_local_paths(home: Path, path_entries: list[str]) -> list[str]:
    return [p for p in candidates if p not in path_entries and Path(p).exists()]


+def _build_wsl_interop_paths(path_entries: list[str]) -> list[str]:
+    """Return WSL Windows interop PATH entries for generated systemd units.
+
+    WSL shells normally inherit Windows PATH entries such as
+    ``/mnt/c/WINDOWS/System32``. systemd user services do not, so gateway tools
+    that call ``powershell.exe``/``cmd.exe`` work in a terminal but fail in the
+    background service unless we persist the relevant entries at install time.
+    """
+    if not is_wsl():
+        return []
+
+    candidates: list[str] = []
+    for entry in os.environ.get("PATH", "").split(os.pathsep):
+        if entry.startswith("/mnt/"):
+            candidates.append(entry)
+
+    for executable in ("powershell.exe", "cmd.exe", "explorer.exe", "wsl.exe"):
+        resolved = shutil.which(executable)
+        if resolved:
+            candidates.append(str(Path(resolved).parent))
+
+    for entry in (
+        "/mnt/c/WINDOWS/system32",
+        "/mnt/c/WINDOWS",
+        "/mnt/c/WINDOWS/System32/Wbem",
+        "/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0/",
+        "/mnt/c/WINDOWS/System32/OpenSSH/",
+    ):
+        if Path(entry).exists():
+            candidates.append(entry)
+
+    result: list[str] = []
+    seen = set(path_entries)
+    for entry in candidates:
+        if entry and entry not in seen:
+            seen.add(entry)
+            result.append(entry)
+    return result
+
+
 def _remap_path_for_user(path: str, target_home_dir: str) -> str:
    """Remap *path* from the current user's home to *target_home_dir*.

@@ -1564,14 +1745,14 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
        node_bin = _remap_path_for_user(node_bin, home_dir)
        path_entries = [_remap_path_for_user(p, home_dir) for p in path_entries]
        path_entries.extend(_build_user_local_paths(Path(home_dir), path_entries))
+        path_entries.extend(_build_wsl_interop_paths(path_entries))
        path_entries.extend(common_bin_paths)
        sane_path = ":".join(path_entries)
        return f"""[Unit]
 Description={SERVICE_DESCRIPTION}
 After=network-online.target
 Wants=network-online.target
-StartLimitIntervalSec=600
-StartLimitBurst=5
+StartLimitIntervalSec=0

 [Service]
 Type=simple
@@ -1585,8 +1766,10 @@ Environment="LOGNAME={username}"
 Environment="PATH={sane_path}"
 Environment="VIRTUAL_ENV={venv_dir}"
 Environment="HERMES_HOME={hermes_home}"
-Restart=on-failure
-RestartSec=30
+Restart=always
+RestartSec=60
+RestartMaxDelaySec=300
+RestartSteps=5
 RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
 KillMode=mixed
 KillSignal=SIGTERM
@@ -1602,13 +1785,14 @@ WantedBy=multi-user.target
    hermes_home = str(get_hermes_home().resolve())
    profile_arg = _profile_arg(hermes_home)
    path_entries.extend(_build_user_local_paths(Path.home(), path_entries))
+    path_entries.extend(_build_wsl_interop_paths(path_entries))
    path_entries.extend(common_bin_paths)
    sane_path = ":".join(path_entries)
    return f"""[Unit]
 Description={SERVICE_DESCRIPTION}
-After=network.target
-StartLimitIntervalSec=600
-StartLimitBurst=5
+After=network-online.target
+Wants=network-online.target
+StartLimitIntervalSec=0

 [Service]
 Type=simple
@@ -1617,8 +1801,10 @@ WorkingDirectory={working_dir}
 Environment="PATH={sane_path}"
 Environment="VIRTUAL_ENV={venv_dir}"
 Environment="HERMES_HOME={hermes_home}"
-Restart=on-failure
-RestartSec=30
+Restart=always
+RestartSec=60
+RestartMaxDelaySec=300
+RestartSteps=5
 RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
 KillMode=mixed
 KillSignal=SIGTERM
@@ -1833,6 +2019,15 @@ def systemd_uninstall(system: bool = False):
    print(f"✓ {_service_scope_label(system).capitalize()} service uninstalled")


+def _require_service_installed(action: str, system: bool = False) -> None:
+    unit_path = get_systemd_unit_path(system=system)
+    if not unit_path.exists():
+        scope_flag = " --system" if system else ""
+        print("✗ Gateway service is not installed")
+        print(f"  Run: {'sudo ' if system else ''}hermes gateway install{scope_flag}")
+        sys.exit(1)
+
+
 def systemd_start(system: bool = False):
    system = _select_systemd_scope(system)
    if system:
@@ -1842,6 +2037,7 @@ def systemd_start(system: bool = False):
        # reachable (common on fresh RHEL/Debian SSH sessions without linger).
        # Raises UserSystemdUnavailableError with a remediation message.
        _preflight_user_systemd()
+    _require_service_installed("start", system=system)
    refresh_systemd_unit_if_needed(system=system)
    _run_systemctl(["start", get_service_name()], system=system, check=True, timeout=30)
    print(f"✓ {_service_scope_label(system).capitalize()} service started")
@@ -1852,6 +2048,14 @@ def systemd_stop(system: bool = False):
    system = _select_systemd_scope(system)
    if system:
        _require_root_for_system_service("stop")
+    _require_service_installed("stop", system=system)
+    try:
+        from gateway.status import get_running_pid, write_planned_stop_marker
+        pid = get_running_pid(cleanup_stale=False)
+        if pid is not None:
+            write_planned_stop_marker(pid)
+    except Exception:
+        pass
    _run_systemctl(["stop", get_service_name()], system=system, check=True, timeout=90)
    print(f"✓ {_service_scope_label(system).capitalize()} service stopped")

@@ -1863,6 +2067,7 @@ def systemd_restart(system: bool = False):
        _require_root_for_system_service("restart")
    else:
        _preflight_user_systemd()
+    _require_service_installed("restart", system=system)
    refresh_systemd_unit_if_needed(system=system)
    from gateway.status import get_running_pid

@@ -2212,6 +2417,13 @@ def launchd_start():
 def launchd_stop():
    label = get_launchd_label()
    target = f"{_launchd_domain()}/{label}"
+    try:
+        from gateway.status import get_running_pid, write_planned_stop_marker
+        pid = get_running_pid(cleanup_stale=False)
+        if pid is not None:
+            write_planned_stop_marker(pid)
+    except Exception:
+        pass
    # bootout unloads the service definition so KeepAlive doesn't respawn
    # the process.  A plain `kill SIGTERM` only signals the process — launchd
    # immediately restarts it because KeepAlive.SuccessfulExit = false.
@@ -2354,6 +2566,20 @@ def run_gateway(verbose: int = 0, quiet: bool = False, replace: bool = False):
                 hasn't fully exited yet.
    """
    sys.path.insert(0, str(PROJECT_ROOT))
+
+    # Refresh the systemd unit definition on every boot so that restart
+    # settings (RestartSec, StartLimitIntervalSec, etc.) stay current even
+    # when the process was respawned via exit-code-75 (stale-code or
+    # /restart) rather than through `hermes gateway restart` which already
+    # calls refresh_systemd_unit_if_needed().  Without this, a code update
+    # that ships new unit settings won't take effect until the next manual
+    # `hermes gateway start/restart` — leaving the gateway vulnerable to
+    # the exact failure mode the new settings were meant to prevent.
+    if supports_systemd_services():
+        try:
+            refresh_systemd_unit_if_needed(system=False)
+        except Exception:
+            pass  # best-effort; don't block gateway startup
    
    from gateway.run import start_gateway
    
@@ -2366,7 +2592,7 @@ def run_gateway(verbose: int = 0, quiet: bool = False, replace: bool = False):
    print()
    
    # Exit with code 1 if gateway fails to connect any platform,
-    # so systemd Restart=on-failure will retry on transient errors
+    # so systemd Restart=always will retry on transient errors
    verbosity = None if quiet else verbose
    try:
        success = asyncio.run(start_gateway(replace=replace, verbosity=verbosity))
@@ -4368,6 +4594,9 @@ def _gateway_command_inner(args):
                    print("  hermes gateway install  # Install as user service")
                    print("  sudo hermes gateway install --system  # Install as boot-time system service")

+        # Show other profiles' gateway status for multi-profile awareness
+        _print_other_profiles_gateway_status()
+
    elif subcmd == "migrate-legacy":
        # Stop, disable, and remove legacy Hermes gateway unit files from
        # pre-rename installs (e.g. hermes.service). Profile units and
@@ -4377,4 +4606,4 @@ def _gateway_command_inner(args):
        if not supports_systemd_services() and not is_macos():
            print("Legacy unit migration only applies to systemd-based Linux hosts.")
            return
-        remove_legacy_hermes_units(interactive=not yes, dry_run=dry_run)
+        remove_legacy_hermes_units(interactive=not yes, dry_run=dry_run)
@@ -0,0 +1,535 @@
+"""Persistent session goals — the Ralph loop for Hermes.
+
+A goal is a free-form user objective that stays active across turns. After
+each turn completes, a small judge call asks an auxiliary model "is this
+goal satisfied by the assistant's last response?". If not, Hermes feeds a
+continuation prompt back into the same session and keeps working until the
+goal is done, turn budget is exhausted, the user pauses/clears it, or the
+user sends a new message (which takes priority and pauses the goal loop).
+
+State is persisted in SessionDB's ``state_meta`` table keyed by
+``goal:<session_id>`` so ``/resume`` picks it up.
+
+Design notes / invariants:
+
+- The continuation prompt is just a normal user message appended to the
+  session via ``run_conversation``. No system-prompt mutation, no toolset
+  swap — prompt caching stays intact.
+- Judge failures are fail-OPEN: ``continue``. A broken judge must not wedge
+  progress; the turn budget is the backstop.
+- When a real user message arrives mid-loop it preempts the continuation
+  prompt and also pauses the goal loop for that turn (we still re-judge
+  after, so if the user's message happens to complete the goal the judge
+  will say ``done``).
+- This module has zero hard dependency on ``cli.HermesCLI`` or the gateway
+  runner — both wire the same ``GoalManager`` in.
+
+Nothing in this module touches the agent's system prompt or toolset.
+"""
+
+from __future__ import annotations
+
+import json
+import logging
+import re
+import time
+from dataclasses import dataclass, asdict
+from typing import Any, Dict, Optional, Tuple
+
+logger = logging.getLogger(__name__)
+
+
+# ──────────────────────────────────────────────────────────────────────
+# Constants & defaults
+# ──────────────────────────────────────────────────────────────────────
+
+DEFAULT_MAX_TURNS = 20
+DEFAULT_JUDGE_TIMEOUT = 30.0
+# Cap how much of the last response we send to the judge.
+_JUDGE_RESPONSE_SNIPPET_CHARS = 4000
+
+
+CONTINUATION_PROMPT_TEMPLATE = (
+    "[Continuing toward your standing goal]\n"
+    "Goal: {goal}\n\n"
+    "Continue working toward this goal. Take the next concrete step. "
+    "If you believe the goal is complete, state so explicitly and stop. "
+    "If you are blocked and need input from the user, say so clearly and stop."
+)
+
+
+JUDGE_SYSTEM_PROMPT = (
+    "You are a strict judge evaluating whether an autonomous agent has "
+    "achieved a user's stated goal. You receive the goal text and the "
+    "agent's most recent response. Your only job is to decide whether "
+    "the goal is fully satisfied based on that response.\n\n"
+    "A goal is DONE only when:\n"
+    "- The response explicitly confirms the goal was completed, OR\n"
+    "- The response clearly shows the final deliverable was produced, OR\n"
+    "- The response explains the goal is unachievable / blocked / needs "
+    "user input (treat this as DONE with reason describing the block).\n\n"
+    "Otherwise the goal is NOT done — CONTINUE.\n\n"
+    "Reply ONLY with a single JSON object on one line:\n"
+    '{\"done\": <true|false>, \"reason\": \"<one-sentence rationale>\"}'
+)
+
+
+JUDGE_USER_PROMPT_TEMPLATE = (
+    "Goal:\n{goal}\n\n"
+    "Agent's most recent response:\n{response}\n\n"
+    "Is the goal satisfied?"
+)
+
+
+# ──────────────────────────────────────────────────────────────────────
+# Dataclass
+# ──────────────────────────────────────────────────────────────────────
+
+
+@dataclass
+class GoalState:
+    """Serializable goal state stored per session."""
+
+    goal: str
+    status: str = "active"          # active | paused | done | cleared
+    turns_used: int = 0
+    max_turns: int = DEFAULT_MAX_TURNS
+    created_at: float = 0.0
+    last_turn_at: float = 0.0
+    last_verdict: Optional[str] = None        # "done" | "continue" | "skipped"
+    last_reason: Optional[str] = None
+    paused_reason: Optional[str] = None       # why we auto-paused (budget, etc.)
+
+    def to_json(self) -> str:
+        return json.dumps(asdict(self), ensure_ascii=False)
+
+    @classmethod
+    def from_json(cls, raw: str) -> "GoalState":
+        data = json.loads(raw)
+        return cls(
+            goal=data.get("goal", ""),
+            status=data.get("status", "active"),
+            turns_used=int(data.get("turns_used", 0) or 0),
+            max_turns=int(data.get("max_turns", DEFAULT_MAX_TURNS) or DEFAULT_MAX_TURNS),
+            created_at=float(data.get("created_at", 0.0) or 0.0),
+            last_turn_at=float(data.get("last_turn_at", 0.0) or 0.0),
+            last_verdict=data.get("last_verdict"),
+            last_reason=data.get("last_reason"),
+            paused_reason=data.get("paused_reason"),
+        )
+
+
+# ──────────────────────────────────────────────────────────────────────
+# Persistence (SessionDB state_meta)
+# ──────────────────────────────────────────────────────────────────────
+
+
+def _meta_key(session_id: str) -> str:
+    return f"goal:{session_id}"
+
+
+_DB_CACHE: Dict[str, Any] = {}
+
+
+def _get_session_db() -> Optional[Any]:
+    """Return a SessionDB instance for the current HERMES_HOME.
+
+    SessionDB has no built-in singleton, but opening a new connection per
+    /goal call would thrash the file. We cache one instance per
+    ``hermes_home`` path so profile switches still pick up the right DB.
+    Defensive against import/instantiation failures so tests and
+    non-standard launchers can still use the GoalManager.
+    """
+    try:
+        from hermes_constants import get_hermes_home
+        from hermes_state import SessionDB
+
+        home = str(get_hermes_home())
+    except Exception as exc:  # pragma: no cover
+        logger.debug("GoalManager: SessionDB bootstrap failed (%s)", exc)
+        return None
+
+    cached = _DB_CACHE.get(home)
+    if cached is not None:
+        return cached
+    try:
+        db = SessionDB()
+    except Exception as exc:  # pragma: no cover
+        logger.debug("GoalManager: SessionDB() raised (%s)", exc)
+        return None
+    _DB_CACHE[home] = db
+    return db
+
+
+def load_goal(session_id: str) -> Optional[GoalState]:
+    """Load the goal for a session, or None if none exists."""
+    if not session_id:
+        return None
+    db = _get_session_db()
+    if db is None:
+        return None
+    try:
+        raw = db.get_meta(_meta_key(session_id))
+    except Exception as exc:
+        logger.debug("GoalManager: get_meta failed: %s", exc)
+        return None
+    if not raw:
+        return None
+    try:
+        return GoalState.from_json(raw)
+    except Exception as exc:
+        logger.warning("GoalManager: could not parse stored goal for %s: %s", session_id, exc)
+        return None
+
+
+def save_goal(session_id: str, state: GoalState) -> None:
+    """Persist a goal to SessionDB. No-op if DB unavailable."""
+    if not session_id:
+        return
+    db = _get_session_db()
+    if db is None:
+        return
+    try:
+        db.set_meta(_meta_key(session_id), state.to_json())
+    except Exception as exc:
+        logger.debug("GoalManager: set_meta failed: %s", exc)
+
+
+def clear_goal(session_id: str) -> None:
+    """Mark a goal cleared in the DB (preserved for audit, status=cleared)."""
+    state = load_goal(session_id)
+    if state is None:
+        return
+    state.status = "cleared"
+    save_goal(session_id, state)
+
+
+# ──────────────────────────────────────────────────────────────────────
+# Judge
+# ──────────────────────────────────────────────────────────────────────
+
+
+def _truncate(text: str, limit: int) -> str:
+    if not text:
+        return ""
+    if len(text) <= limit:
+        return text
+    return text[:limit] + "… [truncated]"
+
+
+_JSON_OBJECT_RE = re.compile(r"\{.*?\}", re.DOTALL)
+
+
+def _parse_judge_response(raw: str) -> Tuple[bool, str]:
+    """Parse the judge's reply. Fail-open to ``(False, "<reason>")``.
+
+    Returns ``(done, reason)``.
+    """
+    if not raw:
+        return False, "judge returned empty response"
+
+    text = raw.strip()
+
+    # Strip markdown code fences the model may wrap JSON in.
+    if text.startswith("```"):
+        text = text.strip("`")
+        # Peel off leading json/JSON/etc tag
+        nl = text.find("\n")
+        if nl != -1:
+            text = text[nl + 1:]
+
+    # First try: parse the whole blob.
+    data: Optional[Dict[str, Any]] = None
+    try:
+        data = json.loads(text)
+    except Exception:
+        # Second try: pull the first JSON object out.
+        match = _JSON_OBJECT_RE.search(text)
+        if match:
+            try:
+                data = json.loads(match.group(0))
+            except Exception:
+                data = None
+
+    if not isinstance(data, dict):
+        return False, f"judge reply was not JSON: {_truncate(raw, 200)!r}"
+
+    done_val = data.get("done")
+    if isinstance(done_val, str):
+        done = done_val.strip().lower() in ("true", "yes", "1", "done")
+    else:
+        done = bool(done_val)
+    reason = str(data.get("reason") or "").strip()
+    if not reason:
+        reason = "no reason provided"
+    return done, reason
+
+
+def judge_goal(
+    goal: str,
+    last_response: str,
+    *,
+    timeout: float = DEFAULT_JUDGE_TIMEOUT,
+) -> Tuple[str, str]:
+    """Ask the auxiliary model whether the goal is satisfied.
+
+    Returns ``(verdict, reason)`` where verdict is ``"done"``, ``"continue"``,
+    or ``"skipped"`` (only when the goal text is empty).
+
+    This is deliberately fail-open: any error returns ``("continue", "...")``
+    so a broken judge doesn't wedge progress — the turn budget is the
+    backstop.
+    """
+    if not goal.strip():
+        return "skipped", "empty goal"
+    if not last_response.strip():
+        # No substantive reply this turn — almost certainly not done yet.
+        return "continue", "empty response (nothing to evaluate)"
+
+    try:
+        from agent.auxiliary_client import get_text_auxiliary_client
+    except Exception as exc:
+        logger.debug("goal judge: auxiliary client import failed: %s", exc)
+        return "continue", "auxiliary client unavailable"
+
+    try:
+        client, model = get_text_auxiliary_client("goal_judge")
+    except Exception as exc:
+        logger.debug("goal judge: get_text_auxiliary_client failed: %s", exc)
+        return "continue", "auxiliary client unavailable"
+
+    if client is None or not model:
+        return "continue", "no auxiliary client configured"
+
+    prompt = JUDGE_USER_PROMPT_TEMPLATE.format(
+        goal=_truncate(goal, 2000),
+        response=_truncate(last_response, _JUDGE_RESPONSE_SNIPPET_CHARS),
+    )
+
+    try:
+        resp = client.chat.completions.create(
+            model=model,
+            messages=[
+                {"role": "system", "content": JUDGE_SYSTEM_PROMPT},
+                {"role": "user", "content": prompt},
+            ],
+            temperature=0,
+            max_tokens=200,
+            timeout=timeout,
+        )
+    except Exception as exc:
+        logger.info("goal judge: API call failed (%s) — falling through to continue", exc)
+        return "continue", f"judge error: {type(exc).__name__}"
+
+    try:
+        raw = resp.choices[0].message.content or ""
+    except Exception:
+        raw = ""
+
+    done, reason = _parse_judge_response(raw)
+    verdict = "done" if done else "continue"
+    logger.info("goal judge: verdict=%s reason=%s", verdict, _truncate(reason, 120))
+    return verdict, reason
+
+
+# ──────────────────────────────────────────────────────────────────────
+# GoalManager — the orchestration surface CLI + gateway talk to
+# ──────────────────────────────────────────────────────────────────────
+
+
+class GoalManager:
+    """Per-session goal state + continuation decisions.
+
+    The CLI and gateway each hold one ``GoalManager`` per live session.
+
+    Methods:
+
+    - ``set(goal)`` — start a new standing goal.
+    - ``clear()`` — remove the active goal.
+    - ``pause()`` / ``resume()`` — explicit user controls.
+    - ``status_line()`` — printable one-liner.
+    - ``evaluate_after_turn(last_response)`` — call the judge, update state,
+      and return a decision dict the caller uses to drive the next turn.
+    - ``next_continuation_prompt()`` — the canonical user-role message to
+      feed back into ``run_conversation``.
+    """
+
+    def __init__(self, session_id: str, *, default_max_turns: int = DEFAULT_MAX_TURNS):
+        self.session_id = session_id
+        self.default_max_turns = int(default_max_turns or DEFAULT_MAX_TURNS)
+        self._state: Optional[GoalState] = load_goal(session_id)
+
+    # --- introspection ------------------------------------------------
+
+    @property
+    def state(self) -> Optional[GoalState]:
+        return self._state
+
+    def is_active(self) -> bool:
+        return self._state is not None and self._state.status == "active"
+
+    def has_goal(self) -> bool:
+        return self._state is not None and self._state.status in ("active", "paused")
+
+    def status_line(self) -> str:
+        s = self._state
+        if s is None or s.status in ("cleared",):
+            return "No active goal. Set one with /goal <text>."
+        turns = f"{s.turns_used}/{s.max_turns} turns"
+        if s.status == "active":
+            return f"⊙ Goal (active, {turns}): {s.goal}"
+        if s.status == "paused":
+            extra = f" — {s.paused_reason}" if s.paused_reason else ""
+            return f"⏸ Goal (paused, {turns}{extra}): {s.goal}"
+        if s.status == "done":
+            return f"✓ Goal done ({turns}): {s.goal}"
+        return f"Goal ({s.status}, {turns}): {s.goal}"
+
+    # --- mutation -----------------------------------------------------
+
+    def set(self, goal: str, *, max_turns: Optional[int] = None) -> GoalState:
+        goal = (goal or "").strip()
+        if not goal:
+            raise ValueError("goal text is empty")
+        state = GoalState(
+            goal=goal,
+            status="active",
+            turns_used=0,
+            max_turns=int(max_turns) if max_turns else self.default_max_turns,
+            created_at=time.time(),
+            last_turn_at=0.0,
+        )
+        self._state = state
+        save_goal(self.session_id, state)
+        return state
+
+    def pause(self, reason: str = "user-paused") -> Optional[GoalState]:
+        if not self._state:
+            return None
+        self._state.status = "paused"
+        self._state.paused_reason = reason
+        save_goal(self.session_id, self._state)
+        return self._state
+
+    def resume(self, *, reset_budget: bool = True) -> Optional[GoalState]:
+        if not self._state:
+            return None
+        self._state.status = "active"
+        self._state.paused_reason = None
+        if reset_budget:
+            self._state.turns_used = 0
+        save_goal(self.session_id, self._state)
+        return self._state
+
+    def clear(self) -> None:
+        if self._state is None:
+            return
+        self._state.status = "cleared"
+        save_goal(self.session_id, self._state)
+        self._state = None
+
+    def mark_done(self, reason: str) -> None:
+        if not self._state:
+            return
+        self._state.status = "done"
+        self._state.last_verdict = "done"
+        self._state.last_reason = reason
+        save_goal(self.session_id, self._state)
+
+    # --- the main entry point called after every turn -----------------
+
+    def evaluate_after_turn(
+        self,
+        last_response: str,
+        *,
+        user_initiated: bool = True,
+    ) -> Dict[str, Any]:
+        """Run the judge and update state. Return a decision dict.
+
+        ``user_initiated`` distinguishes a real user prompt (True) from a
+        continuation prompt we fed ourselves (False). Both increment
+        ``turns_used`` because both consume model budget.
+
+        Decision keys:
+          - ``status``: current goal status after update
+          - ``should_continue``: bool — caller should fire another turn
+          - ``continuation_prompt``: str or None
+          - ``verdict``: "done" | "continue" | "skipped" | "inactive"
+          - ``reason``: str
+          - ``message``: user-visible one-liner to print/send
+        """
+        state = self._state
+        if state is None or state.status != "active":
+            return {
+                "status": state.status if state else None,
+                "should_continue": False,
+                "continuation_prompt": None,
+                "verdict": "inactive",
+                "reason": "no active goal",
+                "message": "",
+            }
+
+        # Count the turn that just finished.
+        state.turns_used += 1
+        state.last_turn_at = time.time()
+
+        verdict, reason = judge_goal(state.goal, last_response)
+        state.last_verdict = verdict
+        state.last_reason = reason
+
+        if verdict == "done":
+            state.status = "done"
+            save_goal(self.session_id, state)
+            return {
+                "status": "done",
+                "should_continue": False,
+                "continuation_prompt": None,
+                "verdict": "done",
+                "reason": reason,
+                "message": f"✓ Goal achieved: {reason}",
+            }
+
+        if state.turns_used >= state.max_turns:
+            state.status = "paused"
+            state.paused_reason = f"turn budget exhausted ({state.turns_used}/{state.max_turns})"
+            save_goal(self.session_id, state)
+            return {
+                "status": "paused",
+                "should_continue": False,
+                "continuation_prompt": None,
+                "verdict": "continue",
+                "reason": reason,
+                "message": (
+                    f"⏸ Goal paused — {state.turns_used}/{state.max_turns} turns used. "
+                    "Use /goal resume to keep going, or /goal clear to stop."
+                ),
+            }
+
+        save_goal(self.session_id, state)
+        return {
+            "status": "active",
+            "should_continue": True,
+            "continuation_prompt": self.next_continuation_prompt(),
+            "verdict": "continue",
+            "reason": reason,
+            "message": (
+                f"↻ Continuing toward goal ({state.turns_used}/{state.max_turns}): {reason}"
+            ),
+        }
+
+    def next_continuation_prompt(self) -> Optional[str]:
+        if not self._state or self._state.status != "active":
+            return None
+        return CONTINUATION_PROMPT_TEMPLATE.format(goal=self._state.goal)
+
+
+__all__ = [
+    "GoalState",
+    "GoalManager",
+    "CONTINUATION_PROMPT_TEMPLATE",
+    "DEFAULT_MAX_TURNS",
+    "load_goal",
+    "save_goal",
+    "clear_goal",
+    "judge_goal",
+]
@@ -114,6 +114,16 @@ def _apply_profile_override() -> None:
            consume = 1
            break

+    # 1b. Reject values that can't be valid profile names (e.g. pytest's
+    # "-p no:xdist" would be misread as profile "no:xdist" otherwise).
+    # Mirrors hermes_cli.profiles._PROFILE_ID_RE so we never call
+    # resolve_profile_env() with a value it must reject + sys.exit on.
+    if profile_name is not None and consume == 2:
+        import re as _re
+        if not _re.match(r"^[a-z0-9][a-z0-9_-]{0,63}$", profile_name):
+            profile_name = None
+            consume = 0
+
    # 1.5 If HERMES_HOME is already set and no explicit flag was given, trust it.
    # This lets child processes (relaunch, subprocess) inherit the parent's
    # profile choice without having to pass --profile again.
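Review note (not part of the patch): the step 1b check above can be exercised in isolation. This is a minimal sketch; the pattern is copied from the hunk, while the helper name `looks_like_profile_name` is hypothetical and does not exist in the codebase.

```python
import re

# Pattern copied from the hunk above. Per the patch comment it mirrors
# hermes_cli.profiles._PROFILE_ID_RE (not shown in this diff).
_PROFILE_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")


def looks_like_profile_name(value: str) -> bool:
    """Hypothetical helper: True if value could be a valid profile name."""
    return _PROFILE_NAME_RE.match(value) is not None


# pytest's "-p no:xdist" yields the candidate "no:xdist", which is rejected
# because ":" is outside the allowed character set.
print(looks_like_profile_name("no:xdist"))
print(looks_like_profile_name("work-laptop"))
```

Rejecting the value up front means the override is simply ignored (`profile_name = None`, `consume = 0`), so the argument is left in place for whichever tool actually owns `-p`.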
@@ -289,7 +299,7 @@ def _has_any_provider_configured() -> bool:
    env_file = get_env_path()
    if env_file.exists():
        try:
-            for line in env_file.read_text().splitlines():
+            for line in env_file.read_text(encoding="utf-8").splitlines():
                line = line.strip()
                if line.startswith("#") or "=" not in line:
                    continue
@@ -800,6 +810,8 @@ def _print_tui_exit_summary(session_id: Optional[str], active_session_file: Opti

        title = db.get_session_title(target)
        message_count = int(session.get("message_count") or 0)
+        if message_count == 0:
+            return  # No real conversation — don't show resume info
        input_tokens = int(session.get("input_tokens") or 0)
        output_tokens = int(session.get("output_tokens") or 0)
        cache_read_tokens = int(session.get("cache_read_tokens") or 0)
@@ -835,7 +847,17 @@ def _print_tui_exit_summary(session_id: Optional[str], active_session_file: Opti
    )


-_NPM_LOCK_RUNTIME_KEYS = frozenset({"ideallyInert"})
+_NPM_LOCK_RUNTIME_KEYS = frozenset({"ideallyInert", "peer"})
+"""Lockfile fields npm writes non-deterministically at install time.
+
+``ideallyInert`` is npm's runtime annotation for packages it skipped installing
+(per-platform opt-outs).  ``peer`` is dropped from the hidden ``.package-lock.json``
+on dev-dependencies that are *also* declared as peers — the canonical
+``package-lock.json`` records the dual role, but npm 9's actualized tree strips
+it.  Neither key represents a real skew between what was declared and what was
+installed, so we exclude them from the comparison in :func:`_tui_need_npm_install`
+to avoid false-positive reinstalls on every launch.
+"""


 def _tui_need_npm_install(root: Path) -> bool:
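Review note (not part of the patch): the body of `_tui_need_npm_install` is not shown in this hunk, so the following is only a sketch of the idea described in the new docstring, assuming the comparison is made per package entry. The helper names `strip_runtime_keys` and `lockfiles_match` are hypothetical.

```python
# Keys copied from the patched _NPM_LOCK_RUNTIME_KEYS constant.
RUNTIME_KEYS = frozenset({"ideallyInert", "peer"})


def strip_runtime_keys(packages: dict) -> dict:
    """Drop install-time annotations from each package entry (hypothetical helper)."""
    return {
        path: {k: v for k, v in meta.items() if k not in RUNTIME_KEYS}
        for path, meta in packages.items()
    }


def lockfiles_match(declared: dict, installed: dict) -> bool:
    """Compare two 'packages' maps while ignoring the runtime-only keys."""
    return strip_runtime_keys(declared) == strip_runtime_keys(installed)


declared = {"node_modules/a": {"version": "1.0.0", "dev": True, "peer": True}}
installed = {"node_modules/a": {"version": "1.0.0", "dev": True}}
print(lockfiles_match(declared, installed))
```

Under this assumption, a dev dependency that is also a peer no longer looks like a skew, while a real version difference still does.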
@@ -1040,17 +1062,21 @@ def _make_tui_argv(tui_dir: Path, tui_dev: bool) -> tuple[list[str], Path]:
    if _tui_need_npm_install(tui_dir):
        if not os.environ.get("HERMES_QUIET"):
            print("Installing TUI dependencies…")
+        # Capture stdout as well as stderr — some npm errors (notably EACCES on a
+        # root-owned node_modules in containers) are emitted on stdout, and a
+        # bare "npm install failed." with no preview defeats debugging.  We keep
+        # the failure-only print path so a successful install stays silent.
        result = subprocess.run(
            [npm, "install", "--silent", "--no-fund", "--no-audit", "--progress=false"],
            cwd=str(tui_dir),
-            stdout=subprocess.DEVNULL,
+            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,
            env={**os.environ, "CI": "1"},
        )
        if result.returncode != 0:
-            err = (result.stderr or "").strip()
-            preview = "\n".join(err.splitlines()[-30:])
+            combined = f"{result.stdout or ''}\n{result.stderr or ''}".strip()
+            preview = "\n".join(combined.splitlines()[-30:])
            print("npm install failed.")
            if preview:
                print(preview)
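Review note (not part of the patch): the preview logic in this hunk can be checked without running npm. A minimal sketch, with the two changed lines lifted into a hypothetical helper named `failure_preview`:

```python
from typing import Optional


def failure_preview(stdout: Optional[str], stderr: Optional[str], max_lines: int = 30) -> str:
    """Join both streams and keep only the trailing lines, as the hunk does."""
    combined = f"{stdout or ''}\n{stderr or ''}".strip()
    return "\n".join(combined.splitlines()[-max_lines:])


# An error that npm wrote to stdout is no longer lost when stderr is empty.
print(failure_preview("npm ERR! code EACCES", ""))
```

Because stdout is placed first, a long stderr can push stdout lines out of the 30-line window; that ordering is taken from the patch as written.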
@@ -3397,10 +3423,10 @@ def _model_flow_named_custom(config, provider_info):
    print()

    print("Fetching available models...")
-    models = fetch_api_models(
-        api_key, base_url, timeout=8.0,
-        api_mode=api_mode or None,
-    )
+    fetch_kwargs = {"timeout": 8.0}
+    if api_mode:
+        fetch_kwargs["api_mode"] = api_mode
+    models = fetch_api_models(api_key, base_url, **fetch_kwargs)

    if models:
        default_idx = 0
@@ -5041,6 +5067,13 @@ def cmd_slack(args):
    return 1


+def cmd_kanban(args):
+    """Multi-profile collaboration board."""
+    from hermes_cli.kanban import kanban_command
+
+    return kanban_command(args)
+
+
 def cmd_hooks(args):
    """Shell-hook inspection and management."""
    from hermes_cli.hooks import hooks_command
@@ -5424,6 +5457,45 @@ def _find_stale_dashboard_pids() -> list[int]:
    return dashboard_pids


+def _print_curator_first_run_notice() -> None:
+    """Print a short heads-up about the skill curator after `hermes update`.
+
+    Only fires when the curator is enabled AND has no recorded run yet, which
+    is exactly the window where the gateway ticker used to fire Curator
+    against a fresh skill library immediately after an update. We defer the
+    first real pass by one ``interval_hours``; this notice tells the user how
+    to preview or disable before then. Silent on steady state.
+    """
+    try:
+        from agent import curator
+    except Exception:
+        return
+    try:
+        if not curator.is_enabled():
+            return
+        state = curator.load_state()
+    except Exception:
+        return
+    if state.get("last_run_at"):
+        # Curator has run before (real or already seeded) — no notice needed.
+        return
+    try:
+        hours = curator.get_interval_hours()
+    except Exception:
+        hours = 24 * 7
+    days = max(1, hours // 24)
+    print()
+    print("ℹ Skill curator")
+    print(
+        f"  Background skill maintenance is enabled. First pass is deferred "
+        f"~{days}d after installation; only agent-created skills are in "
+        f"scope and nothing is ever auto-deleted (archive is recoverable)."
+    )
+    print("  Preview now:  hermes curator run --dry-run")
+    print("  Pause it:     hermes curator pause")
+    print("  Docs:         https://hermes-agent.nousresearch.com/docs/user-guide/features/curator")
+
+
 def _kill_stale_dashboard_processes(
    reason: str = "the running backend no longer matches the updated frontend",
 ) -> None:
@@ -5661,6 +5733,10 @@ def _update_via_zip(args):

    print()
    print("✓ Update complete!")
+    try:
+        _print_curator_first_run_notice()
+    except Exception as e:
+        logger.debug("Curator first-run notice failed: %s", e)
    _kill_stale_dashboard_processes()


@@ -6419,13 +6495,29 @@ def _cmd_update_check():
    if sys.platform == "win32":
        git_cmd = ["git", "-c", "windows.appendAtomically=false"]

-    print("→ Fetching from origin...")
+    # Fetch upstream first as the canonical reference; fall back to origin below.
+    print("→ Fetching from upstream...")
    fetch_result = subprocess.run(
-        git_cmd + ["fetch", "origin"],
+        git_cmd + ["fetch", "upstream"],
        cwd=PROJECT_ROOT,
        capture_output=True,
        text=True,
    )
+    if fetch_result.returncode != 0:
+        # Fall back to origin if the upstream fetch failed (e.g. no such remote)
+        print("→ Fetching from origin...")
+        fetch_result = subprocess.run(
+            git_cmd + ["fetch", "origin"],
+            cwd=PROJECT_ROOT,
+            capture_output=True,
+            text=True,
+        )
+        upstream_exists = False
+        compare_branch = "origin/main"
+    else:
+        upstream_exists = True
+        compare_branch = "upstream/main"
+
    if fetch_result.returncode != 0:
        stderr = fetch_result.stderr.strip()
        if "Could not resolve host" in stderr or "unable to access" in stderr:
@@ -6433,13 +6525,13 @@ def _cmd_update_check():
        elif "Authentication failed" in stderr or "could not read Username" in stderr:
            print("✗ Authentication failed — check your git credentials or SSH key.")
        else:
-            print("✗ Failed to fetch from origin.")
+            print("✗ Failed to fetch.")
            if stderr:
                print(f"  {stderr.splitlines()[0]}")
        sys.exit(1)

    rev_result = subprocess.run(
-        git_cmd + ["rev-list", "HEAD..origin/main", "--count"],
+        git_cmd + ["rev-list", f"HEAD..{compare_branch}", "--count"],
        cwd=PROJECT_ROOT,
        capture_output=True,
        text=True,
@@ -6451,7 +6543,7 @@ def _cmd_update_check():
        print("✓ Already up to date.")
    else:
        commits_word = "commit" if behind == 1 else "commits"
-        print(f"⚕ Update available: {behind} {commits_word} behind origin/main.")
+        print(f"⚕ Update available: {behind} {commits_word} behind {compare_branch}.")
        from hermes_cli.config import recommended_update_command
        print(f"  Run '{recommended_update_command()}' to install.")

@@ -6666,6 +6758,7 @@ def _cmd_update_impl(args, gateway_mode: bool):
        if gateway_mode
        else None
    )
+    assume_yes = bool(getattr(args, "yes", False))

    print("⚕ Updating Hermes Agent...")
    print()
@@ -6785,8 +6878,10 @@ def _cmd_update_impl(args, gateway_mode: bool):
        else:
            auto_stash_ref = _stash_local_changes_if_needed(git_cmd, PROJECT_ROOT)

-        prompt_for_restore = auto_stash_ref is not None and (
-            gateway_mode or (sys.stdin.isatty() and sys.stdout.isatty())
+        prompt_for_restore = (
+            auto_stash_ref is not None
+            and not assume_yes
+            and (gateway_mode or (sys.stdin.isatty() and sys.stdout.isatty()))
        )

        # Check if there are updates
@@ -6974,20 +7069,22 @@ def _cmd_update_impl(args, gateway_mode: bool):
        except Exception as e:
            logger.debug("Skills sync during update failed: %s", e)

-        # Sync bundled skills to all other profiles
+        # Sync bundled skills to all profiles (including the active one).
+        # seed_profile_skills() uses subprocess with an explicit HERMES_HOME so
+        # it is not affected by sync_skills()'s module-level HERMES_HOME cache,
+        # which means the active profile is reliably synced regardless of whether
+        # the caller's HERMES_HOME env var points at the default or a named profile.
        try:
            from hermes_cli.profiles import (
                list_profiles,
-                get_active_profile_name,
                seed_profile_skills,
            )

-            active = get_active_profile_name()
-            other_profiles = [p for p in list_profiles() if p.name != active]
-            if other_profiles:
+            all_profiles = list_profiles()
+            if all_profiles:
                print()
-                print("→ Syncing bundled skills to other profiles...")
-                for p in other_profiles:
+                print("→ Syncing bundled skills to all profiles...")
+                for p in all_profiles:
                    try:
                        r = seed_profile_skills(p.path, quiet=True)
                        if r:
@@ -7047,7 +7144,10 @@ def _cmd_update_impl(args, gateway_mode: bool):
                print(f"  ℹ️  {len(missing_config)} new config option(s) available")

            print()
-            if gateway_mode:
+            if assume_yes:
+                print("  ℹ --yes: auto-applying config migration (skipping API-key prompts).")
+                response = "y"
+            elif gateway_mode:
                response = (
                    _gateway_prompt(
                        "Would you like to configure new options now? [Y/n]", "n"
@@ -7073,14 +7173,17 @@ def _cmd_update_impl(args, gateway_mode: bool):

            if response in ("", "y", "yes"):
                print()
-                # In gateway mode, run auto-migrations only (no input() prompts
-                # for API keys which would hang the detached process).
-                results = migrate_config(interactive=not gateway_mode, quiet=False)
+                # In gateway mode OR under --yes, run auto-migrations only (no
+                # input() prompts for API keys which would hang the detached
+                # process / defeat the point of --yes).
+                results = migrate_config(
+                    interactive=not (gateway_mode or assume_yes), quiet=False
+                )

                if results["env_added"] or results["config_added"]:
                    print()
                    print("✓ Configuration updated!")
-                if gateway_mode and missing_env:
+                if (gateway_mode or assume_yes) and missing_env:
                    print("  ℹ API keys require manual entry: hermes config migrate")
            else:
                print()
@@ -7091,6 +7194,15 @@ def _cmd_update_impl(args, gateway_mode: bool):
        print()
        print("✓ Update complete!")

+        # Curator first-run heads-up. Only prints when curator is enabled AND
+        # has never run — i.e. the window where the ticker would otherwise
+        # have fired against a fresh skill library. Kept silent on steady
+        # state so we don't nag.
+        try:
+            _print_curator_first_run_notice()
+        except Exception as e:
+            logger.debug("Curator first-run notice failed: %s", e)
+
        # Repair RHEL-family root installs where /usr/local/bin isn't on PATH
        # for non-login interactive shells.  No-op on every other platform.
        try:
@@ -7130,6 +7242,8 @@ def _cmd_update_impl(args, gateway_mode: bool):
                supports_systemd_services,
                _ensure_user_systemd_env,
                find_gateway_pids,
+                find_profile_gateway_processes,
+                launch_detached_profile_gateway_restart,
                _get_service_pids,
                _graceful_restart_via_sigusr1,
            )
@@ -7233,6 +7347,7 @@ def _cmd_update_impl(args, gateway_mode: bool):

            restarted_services = []
            killed_pids = set()
+            relaunched_profiles = []

            # --- Systemd services (Linux) ---
            # Discover all hermes-gateway* units (default + profiles)
@@ -7422,7 +7537,33 @@ def _cmd_update_impl(args, gateway_mode: bool):
            manual_pids = find_gateway_pids(
                exclude_pids=service_pids, all_profiles=True
            )
+            profile_processes = {
+                proc.pid: proc
+                for proc in find_profile_gateway_processes(exclude_pids=service_pids)
+                if proc.pid in manual_pids
+            }
+            for pid, proc in profile_processes.items():
+                if not launch_detached_profile_gateway_restart(proc.profile, pid):
+                    continue
+                # Prefer a graceful SIGUSR1 drain so in-flight agent runs
+                # finish before the watcher respawns the gateway.  If the
+                # gateway doesn't support SIGUSR1 or doesn't exit within
+                # the drain budget, fall back to SIGTERM — the watcher
+                # still sees the exit and relaunches either way.
+                drained = _graceful_restart_via_sigusr1(
+                    pid, drain_timeout=_drain_budget,
+                )
+                if not drained:
+                    try:
+                        os.kill(pid, _signal.SIGTERM)
+                    except (ProcessLookupError, PermissionError):
+                        pass
+                killed_pids.add(pid)
+                relaunched_profiles.append(proc.profile)
+
            for pid in manual_pids:
+                if pid in profile_processes:
+                    continue
                try:
                    os.kill(pid, _signal.SIGTERM)
                    killed_pids.add(pid)
@@ -7433,11 +7574,14 @@ def _cmd_update_impl(args, gateway_mode: bool):
                print()
                for svc in restarted_services:
                    print(f"  ✓ Restarted {svc}")
-                if killed_pids:
-                    print(f"  → Stopped {len(killed_pids)} manual gateway process(es)")
+                if relaunched_profiles:
+                    names = ", ".join(relaunched_profiles)
+                    print(f"  ✓ Restarting manual gateway profile(s): {names}")
+                unmapped_count = len(killed_pids) - len(relaunched_profiles)
+                if unmapped_count:
+                    print(f"  → Stopped {unmapped_count} manual gateway process(es)")
                    print("    Restart manually: hermes gateway run")
-                    # Also restart for each profile if needed
-                    if len(killed_pids) > 1:
+                    if unmapped_count > 1:
                        print(
                            "    (or: hermes -p <profile> gateway run  for each profile)"
                        )
@@ -7446,6 +7590,42 @@ def _cmd_update_impl(args, gateway_mode: bool):
                # No gateways were running — nothing to do
                pass

+            # --- Post-restart survivor sweep -----------------------------
+            # Issue #17648: some gateways ignore SIGTERM (stuck drain,
+            # blocked I/O, PID dead but zombie).  The detached profile
+            # watchers wait 120s for the old PID to exit — if it never
+            # does, no respawn happens and the user keeps hitting
+            # ImportError against a stale sys.modules.  Give the
+            # graceful paths a brief window to complete, then SIGKILL
+            # any remaining pre-update PIDs so the watcher / service
+            # manager can relaunch with fresh code.
+            try:
+                _time.sleep(3.0)
+                _service_pids_after = _get_service_pids()
+                _surviving = find_gateway_pids(
+                    exclude_pids=_service_pids_after, all_profiles=True,
+                )
+                # Scope to PIDs we already tried to kill during this
+                # update (killed_pids).  Anything new is a gateway that
+                # started AFTER our restart attempt — respecting user
+                # intent, we don't kill those.
+                _stuck = [pid for pid in _surviving if pid in killed_pids]
+                if _stuck:
+                    print()
+                    print(
+                        f"  ⚠ {len(_stuck)} gateway process(es) ignored SIGTERM — force-killing"
+                    )
+                    for pid in _stuck:
+                        try:
+                            os.kill(pid, _signal.SIGKILL)
+                        except (ProcessLookupError, PermissionError):
+                            pass
+                    # Give the OS a beat to reap the processes so the
+                    # watchers see them exit and respawn.
+                    _time.sleep(1.5)
+            except Exception as _sweep_exc:
+                logger.debug("Post-restart survivor sweep failed: %s", _sweep_exc)
+
        except Exception as e:
            logger.debug("Gateway restart during update failed: %s", e)

@@ -7682,7 +7862,7 @@ def cmd_profile(args):
                if clone_all:
                    print(f"Full copy from {source_label}.")
                else:
-                    print(f"Cloned config, .env, SOUL.md from {source_label}.")
+                    print(f"Cloned config, .env, SOUL.md, and skills from {source_label}.")

            # Auto-clone Honcho config for the new profile (only with --clone/--clone-all)
            if clone or clone_all:
@@ -8500,7 +8680,24 @@ def main():
    )
    cron_create.add_argument(
        "--script",
-        help="Path to a Python script whose stdout is injected into the prompt each run",
+        help=(
+            "Path to a script under ~/.hermes/scripts/. Default mode: "
+            "script stdout is injected into the agent's prompt each run. "
+            "With --no-agent: the script IS the job and its stdout is "
+            "delivered verbatim. .sh/.bash files run via bash, everything "
+            "else via Python."
+        ),
+    )
+    cron_create.add_argument(
+        "--no-agent",
+        dest="no_agent",
+        action="store_true",
+        default=False,
+        help=(
+            "Skip the LLM entirely — run --script on schedule and deliver "
+            "its stdout directly. Empty stdout = silent. Classic watchdog "
+            "pattern (memory alerts, disk alerts, CI pings)."
+        ),
    )
    cron_create.add_argument(
        "--workdir",
@@ -8542,7 +8739,29 @@ def main():
    )
    cron_edit.add_argument(
        "--script",
-        help="Path to a Python script whose stdout is injected into the prompt each run. Pass empty string to clear.",
+        help=(
+            "Path to a script under ~/.hermes/scripts/. Pass empty string to clear. "
+            "With --no-agent the script IS the job; otherwise its stdout is "
+            "injected into the agent's prompt each run."
+        ),
+    )
+    cron_edit.add_argument(
+        "--no-agent",
+        dest="no_agent",
+        action="store_const",
+        const=True,
+        default=None,
+        help=(
+            "Enable no-agent mode on this job (requires --script or an "
+            "existing script on the job)."
+        ),
+    )
+    cron_edit.add_argument(
+        "--agent",
+        dest="no_agent",
+        action="store_const",
+        const=False,
+        help="Disable no-agent mode on this job (reverts to LLM-driven execution).",
    )
    cron_edit.add_argument(
        "--workdir",
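Review note (not part of the patch): `cron edit` pairs `--no-agent` and `--agent` on one `dest` so that "flag not given" stays distinguishable from an explicit choice. A minimal standalone sketch of that tri-state pattern follows; the parser name and `build_parser` are hypothetical, and the two argument definitions mirror the hunk with the help text omitted.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Standalone parser reproducing only the two flags from the hunk."""
    parser = argparse.ArgumentParser(prog="cron-edit-sketch")
    parser.add_argument(
        "--no-agent", dest="no_agent", action="store_const", const=True, default=None
    )
    parser.add_argument(
        "--agent", dest="no_agent", action="store_const", const=False
    )
    return parser


parser = build_parser()
# None means "leave the job's current mode alone".
print(parser.parse_args([]).no_agent)
print(parser.parse_args(["--no-agent"]).no_agent)
print(parser.parse_args(["--agent"]).no_agent)
```

`cron create` can use a plain `store_true` because a new job has no previous mode to preserve; only the edit path needs the third `None` state.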
@@ -8640,6 +8859,13 @@ def main():

    webhook_parser.set_defaults(func=cmd_webhook)

+    # =========================================================================
+    # kanban command — multi-profile collaboration board
+    # =========================================================================
+    from hermes_cli.kanban import build_parser as _build_kanban_parser
+    kanban_parser = _build_kanban_parser(subparsers)
+    kanban_parser.set_defaults(func=cmd_kanban)
+
    # =========================================================================
    # hooks command — shell-hook inspection and management
    # =========================================================================
@@ -8746,6 +8972,7 @@ Examples:
    hermes debug share --lines 500  Include more log lines
    hermes debug share --expire 30  Keep paste for 30 days
    hermes debug share --local      Print report locally (no upload)
+    hermes debug share --no-redact  Disable upload-time secret redaction
    hermes debug delete <url>       Delete a previously uploaded paste
 """,
    )
@@ -8771,6 +8998,16 @@ Examples:
        action="store_true",
        help="Print the report locally instead of uploading",
    )
+    share_parser.add_argument(
+        "--no-redact",
+        action="store_true",
+        help=(
+            "Disable upload-time secret redaction (default: redact). Logs "
+            "are normally run through agent.redact.redact_sensitive_text "
+            "with force=True before upload so credentials are not leaked "
+            "into the public paste service."
+        ),
+    )
    delete_parser = debug_sub.add_parser(
        "delete",
        help="Delete a paste uploaded by 'hermes debug share'",
@@ -9847,6 +10084,13 @@ Examples:
        default=False,
        help="Force a pre-update backup for this run (off by default; overrides updates.pre_update_backup)",
    )
+    update_parser.add_argument(
+        "--yes",
+        "-y",
+        action="store_true",
+        default=False,
+        help="Assume yes for interactive prompts (config migration, stash restore). API-key entry is skipped; run 'hermes config migrate' separately for those.",
+    )
    update_parser.set_defaults(func=cmd_update)

    # =========================================================================
@@ -361,7 +361,7 @@ def _write_env_vars(env_path: Path, env_writes: dict) -> None:

    existing_lines = []
    if env_path.exists():
-        existing_lines = env_path.read_text().splitlines()
+        existing_lines = env_path.read_text(encoding="utf-8").splitlines()

    updated_keys = set()
    new_lines = []
@@ -891,12 +891,37 @@ def switch_model(
    if not validation.get("accepted"):
        override = False
        if user_providers:
-            for up in user_providers:
-                if isinstance(up, dict) and up.get("provider") == target_provider:
-                    cfg_models = up.get("models", [])
-                    if new_model in cfg_models or any(
-                        m.get("name") == new_model for m in cfg_models if isinstance(m, dict)
-                    ):
+            # user_providers is a dict: {provider_slug: config_dict}
+            for slug, cfg in user_providers.items():
+                if slug == target_provider:
+                    cfg_models = cfg.get("models", {})
+                    # Direct membership works for dict (keys) and list (strings)
+                    if new_model in cfg_models:
+                        override = True
+                        break
+                    # Also accept if models is a list of dicts with 'name' field
+                    if isinstance(cfg_models, list):
+                        if any(m.get("name") == new_model for m in cfg_models if isinstance(m, dict)):
+                            override = True
+                            break
+        # Also check custom_providers list — models declared there should be accepted
+        # even if the remote /v1/models endpoint doesn't list them.
+        if not override and custom_providers and isinstance(custom_providers, list):
+            for entry in custom_providers:
+                if not isinstance(entry, dict):
+                    continue
+                # Match by provider slug (custom:<name>) or by base_url
+                entry_name = entry.get("name", "")
+                entry_slug = f"custom:{entry_name}" if entry_name else ""
+                entry_url = entry.get("base_url", "")
+                if entry_slug == target_provider or entry_url == base_url:
+                    # Check if the requested model matches the entry's model
+                    entry_model = entry.get("model", "")
+                    entry_models = entry.get("models", {})
+                    if new_model == entry_model:
+                        override = True
+                        break
+                    if isinstance(entry_models, dict) and new_model in entry_models:
                        override = True
                        break
        if override:
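Review note (not part of the patch): the `user_providers` branch accepts three shapes for `models`. A minimal sketch of that membership test, assuming the shapes named in the patch comments; the helper name `model_declared` is hypothetical, and the early return for an empty or null value is an addition that the patch does not have.

```python
def model_declared(cfg_models, model: str) -> bool:
    """True if model appears in a dict (keys), a list of names, or a list of {'name': ...} dicts."""
    if not cfg_models:
        # Added guard: a null or empty 'models' value declares nothing.
        return False
    if model in cfg_models:
        # Covers dict keys and lists of plain strings.
        return True
    if isinstance(cfg_models, list):
        return any(isinstance(m, dict) and m.get("name") == model for m in cfg_models)
    return False


print(model_declared({"my-model": {"context": 8192}}, "my-model"))
print(model_declared([{"name": "my-model"}], "my-model"))
```

The same override is what lets a locally declared model pass even when the remote models endpoint does not list it.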
@@ -1052,6 +1077,45 @@ def list_authenticated_providers(
        if normed:
            _builtin_endpoints.add(normed)

+    def _has_fast_aws_sdk_signal() -> bool:
+        """Return True when explicit AWS auth config is present.
+
+        This intentionally avoids botocore's full credential chain. Provider
+        picker/model-switch discovery can run for non-Bedrock providers, and
+        botocore may otherwise probe EC2 IMDS (169.254.169.254) on local
+        machines before returning no credentials.
+        """
+        if os.environ.get("AWS_BEARER_TOKEN_BEDROCK", "").strip():
+            return True
+        if (
+            os.environ.get("AWS_ACCESS_KEY_ID", "").strip()
+            and os.environ.get("AWS_SECRET_ACCESS_KEY", "").strip()
+        ):
+            return True
+        return any(
+            os.environ.get(name, "").strip()
+            for name in (
+                "AWS_PROFILE",
+                "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
+                "AWS_CONTAINER_CREDENTIALS_FULL_URI",
+                "AWS_WEB_IDENTITY_TOKEN_FILE",
+            )
+        )
+
+    def _has_aws_sdk_creds_for_listing(slug: str) -> bool:
+        """Credential check for AWS SDK providers in non-runtime discovery."""
+        slug_norm = str(slug or "").strip().lower()
+        current_norm = str(current_provider or "").strip().lower()
+        if _has_fast_aws_sdk_signal():
+            return True
+        if slug_norm != current_norm:
+            return False
+        try:
+            from agent.bedrock_adapter import has_aws_credentials
+            return bool(has_aws_credentials())
+        except Exception:
+            return False
+
    data = fetch_models_dev()

    # Build curated model lists keyed by hermes provider ID
@@ -1179,7 +1243,9 @@ def list_authenticated_providers(

        # Check if credentials exist
        has_creds = False
-        if overlay.extra_env_vars:
+        if overlay.auth_type == "aws_sdk":
+            has_creds = _has_aws_sdk_creds_for_listing(hermes_slug)
+        elif overlay.extra_env_vars:
            has_creds = any(os.environ.get(ev) for ev in overlay.extra_env_vars)
        # Also check api_key_env_vars from PROVIDER_REGISTRY for api_key auth_type
        if not has_creds and overlay.auth_type == "api_key":
@@ -1198,11 +1264,7 @@ def list_authenticated_providers(
                from hermes_cli.auth import _load_auth_store
                store = _load_auth_store()
                providers_store = store.get("providers", {})
-                pool_store = store.get("credential_pool", {})
-                if store and (
-                    pid in providers_store or hermes_slug in providers_store
-                    or pid in pool_store or hermes_slug in pool_store
-                ):
+                if store and (pid in providers_store or hermes_slug in providers_store):
                    has_creds = True
            except Exception as exc:
                logger.debug("Auth store check failed for %s: %s", pid, exc)
@@ -1298,11 +1360,7 @@ def list_authenticated_providers(
                from hermes_cli.auth import _load_auth_store
                _cp_store = _load_auth_store()
                _cp_providers_store = _cp_store.get("providers", {})
-                _cp_pool_store = _cp_store.get("credential_pool", {})
-                if _cp_store and (
-                    _cp.slug in _cp_providers_store
-                    or _cp.slug in _cp_pool_store
-                ):
+                if _cp_store and _cp.slug in _cp_providers_store:
                    _cp_has_creds = True
            except Exception:
                pass
@@ -1319,11 +1377,7 @@ def list_authenticated_providers(
        # credentials come from the boto3 credential chain (env vars,
        # ~/.aws/credentials, instance roles, etc.)
        if not _cp_has_creds and _cp_config and getattr(_cp_config, "auth_type", "") == "aws_sdk":
-            try:
-                from agent.bedrock_adapter import has_aws_credentials
-                _cp_has_creds = has_aws_credentials()
-            except Exception:
-                pass
+            _cp_has_creds = _has_aws_sdk_creds_for_listing(_cp.slug)

        if not _cp_has_creds:
            continue
@@ -1412,14 +1466,17 @@ def list_authenticated_providers(
                        models_list = list(fb)

            # Prefer the endpoint's live /models list when credentials are
-            # available. This keeps OpenAI-compatible relays (for example CRS)
-            # in sync when the server catalog changes without requiring the
-            # user to mirror every model into config.yaml.
+            # available, unless the provider explicitly opts out via
+            # discover_models: false (e.g. dedicated endpoints that expose
+            # the entire aggregator catalog via /models).
            api_key = str(ep_cfg.get("api_key", "") or "").strip()
            if not api_key:
                key_env = str(ep_cfg.get("key_env", "") or "").strip()
                api_key = os.environ.get(key_env, "").strip() if key_env else ""
-            if api_url and api_key:
+            discover = ep_cfg.get("discover_models", True)
+            if isinstance(discover, str):
+                discover = discover.lower() not in ("false", "no", "0")
+            if api_url and api_key and discover:
                try:
                    from hermes_cli.models import fetch_api_models
                    live_models = fetch_api_models(api_key, api_url)
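Reviewer note: the `discover_models` handling in the hunk above can be read in isolation as a small coercion step. The sketch below is a minimal stand-alone version, assuming the same rules as the diff (strings are matched case-insensitively against "false", "no" and "0"; everything else uses normal truthiness). The helper name `coerce_discover_flag` and the sample endpoint configs are hypothetical and do not exist in the codebase.

```python
from typing import Any


def coerce_discover_flag(value: Any) -> bool:
    """Interpret a ``discover_models`` config value as a boolean.

    Strings are compared case-insensitively against "false", "no" and "0";
    any other value falls through to ordinary truthiness, as in the hunk.
    """
    if isinstance(value, str):
        return value.lower() not in ("false", "no", "0")
    return bool(value)


# Hypothetical endpoint configs, shaped like entries loaded from config.yaml.
endpoints = {
    "relay": {},                                # key absent: discovery stays on
    "dedicated": {"discover_models": False},    # explicit opt-out
    "quoted": {"discover_models": "No"},        # YAML value that arrived as a string
}

flags = {
    name: coerce_discover_flag(ep_cfg.get("discover_models", True))
    for name, ep_cfg in endpoints.items()
}
```

Defaulting to `True` when the key is absent keeps existing configs on live discovery, so only endpoints that set the flag change behaviour.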
@@ -40,6 +40,7 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
    ("anthropic/claude-sonnet-4.5",     ""),
    ("anthropic/claude-haiku-4.5",      ""),
    ("openrouter/elephant-alpha",       "free"),
+    ("openrouter/owl-alpha",            "free"),
    ("openai/gpt-5.5",                  ""),
    ("openai/gpt-5.4-mini",             ""),
    ("xiaomi/mimo-v2.5-pro",             ""),
@@ -773,7 +774,6 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
    ProviderEntry("nous",           "Nous Portal",              "Nous Portal (Nous Research subscription)"),
    ProviderEntry("openrouter",     "OpenRouter",               "OpenRouter (100+ models, pay-per-use)"),
    ProviderEntry("lmstudio",       "LM Studio",                "LM Studio (local desktop app with built-in model server)"),
-    ProviderEntry("ai-gateway",     "Vercel AI Gateway",        "Vercel AI Gateway (200+ models, $5 free credit, no markup)"),
    ProviderEntry("anthropic",      "Anthropic",                "Anthropic (Claude models — API key or Claude Code)"),
    ProviderEntry("openai-codex",   "OpenAI Codex",             "OpenAI Codex"),
    ProviderEntry("xiaomi",         "Xiaomi MiMo",              "Xiaomi MiMo (MiMo-V2.5 and V2 models — pro, omni, flash)"),
@@ -803,6 +803,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
    ProviderEntry("opencode-go",    "OpenCode Go",              "OpenCode Go (open models, $10/month subscription)"),
    ProviderEntry("bedrock",        "AWS Bedrock",              "AWS Bedrock (Claude, Nova, Llama, DeepSeek — IAM or API key)"),
    ProviderEntry("azure-foundry",  "Azure Foundry",            "Azure Foundry (OpenAI-style or Anthropic-style endpoint — your Azure AI deployment)"),
+    ProviderEntry("ai-gateway",     "Vercel AI Gateway",        "Vercel AI Gateway"),
 ]

 # Derived dicts — used throughout the codebase
@@ -1739,10 +1740,20 @@ def model_supports_fast_mode(model_id: Optional[str]) -> bool:


 def _is_anthropic_fast_model(model_id: Optional[str]) -> bool:
-    """Return True if the model is a Claude model eligible for Anthropic Fast Mode."""
+    """Return True if the model is a Claude model eligible for Anthropic Fast Mode.
+
+    Only Claude Opus 4.6 qualifies at present. Anthropic's docs
+    (https://platform.claude.com/docs/en/build-with-claude/fast-mode) state:
+    "Fast mode is currently supported on Opus 4.6 only. Sending speed: fast
+    with an unsupported model returns an error." Opus 4.7 explicitly rejects
+    the ``speed`` parameter with HTTP 400.
+    """
    raw = _strip_vendor_prefix(str(model_id or ""))
    base = raw.split(":")[0]
-    return base.startswith("claude-")
+    if not base.startswith("claude-"):
+        return False
+    # Only Opus 4.6 supports fast mode at present.
+    return "opus-4-6" in base or "opus-4.6" in base


 def resolve_fast_mode_overrides(model_id: Optional[str]) -> dict[str, Any] | None:
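Reviewer note: the narrowed fast-mode check can be exercised on its own. This is a sketch only; `strip_vendor_prefix` below is a stand-in whose behaviour (dropping a leading `vendor/` segment) is an assumption, since the real `_strip_vendor_prefix` is not shown in this diff.

```python
from typing import Optional


def strip_vendor_prefix(model_id: str) -> str:
    """Assumed stand-in: drop a leading "vendor/" segment such as "anthropic/"."""
    return model_id.split("/", 1)[1] if "/" in model_id else model_id


def is_anthropic_fast_model(model_id: Optional[str]) -> bool:
    """Same logic as the hunk: Claude models only, and only Opus 4.6."""
    raw = strip_vendor_prefix(str(model_id or ""))
    base = raw.split(":")[0]  # discard any ":variant" tag
    if not base.startswith("claude-"):
        return False
    # Accept both the dashed and dotted spellings of the version.
    return "opus-4-6" in base or "opus-4.6" in base


candidates = [
    "anthropic/claude-opus-4.6",
    "claude-opus-4-6:beta",
    "claude-opus-4-7",
    "anthropic/claude-sonnet-4.5",
    None,
]
eligible = [m for m in candidates if is_anthropic_fast_model(m)]
```

Because the match is a substring test, any future ID that merely contains `opus-4-6` would also pass; that trade-off is inherited from the diff, not introduced here.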
@@ -2895,6 +2906,19 @@ def fetch_api_models(
 _OLLAMA_CLOUD_CACHE_TTL = 3600  # 1 hour


+def _strip_ollama_cloud_suffix(model_id: str) -> str:
+    """Strip :cloud / -cloud suffixes that models.dev appends to Ollama Cloud IDs.
+
+    The live API uses clean IDs (e.g. 'kimi-k2.6') while models.dev sometimes
+    returns them as 'kimi-k2.6:cloud'. Normalising before the dedup merge
+    prevents duplicate entries in the merged model list.
+    """
+    for suffix in (":cloud", "-cloud"):
+        if model_id.endswith(suffix):
+            return model_id[: -len(suffix)]
+    return model_id
+
+
 def _ollama_cloud_cache_path() -> Path:
    """Return the path for the Ollama Cloud model cache."""
    from hermes_constants import get_hermes_home
@@ -2990,9 +3014,10 @@ def fetch_ollama_cloud_models(
                seen.add(m)
                merged.append(m)
        for m in mdev_models:
-            if m and m not in seen:
-                seen.add(m)
-                merged.append(m)
+            normalized = _strip_ollama_cloud_suffix(m)
+            if normalized and normalized not in seen:
+                seen.add(normalized)
+                merged.append(normalized)
        if merged:
            _save_ollama_cloud_cache(merged)
            return merged
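Reviewer note: the effect of normalising before the dedup merge is easiest to see with two small lists. The sketch below copies the suffix-stripping helper from the diff and wraps the merge loop in a hypothetical `merge_model_ids` function; the guard on the live-API loop is assumed to match the one on the models.dev loop, and `example-model` is an invented ID used only for illustration.

```python
def strip_ollama_cloud_suffix(model_id: str) -> str:
    """Drop a trailing ":cloud" or "-cloud" marker, if present."""
    for suffix in (":cloud", "-cloud"):
        if model_id.endswith(suffix):
            return model_id[: -len(suffix)]
    return model_id


def merge_model_ids(live_models: list[str], mdev_models: list[str]) -> list[str]:
    """Merge live IDs first, then models.dev IDs, skipping duplicates."""
    seen: set[str] = set()
    merged: list[str] = []
    for m in live_models:
        if m and m not in seen:
            seen.add(m)
            merged.append(m)
    for m in mdev_models:
        normalized = strip_ollama_cloud_suffix(m)
        if normalized and normalized not in seen:
            seen.add(normalized)
            merged.append(normalized)
    return merged


merged = merge_model_ids(
    ["kimi-k2.6"],
    ["kimi-k2.6:cloud", "example-model-cloud"],
)
```

Without the normalisation step, `kimi-k2.6:cloud` would be appended as a second entry next to `kimi-k2.6`.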
@@ -3086,7 +3111,7 @@ def validate_requested_model(
            "message": f"Model `{requested}` was not found in LM Studio's model listing.",
        }

-    if normalized == "custom":
+    if normalized == "custom" or normalized.startswith("custom:"):
        # Try probing with correct auth for the api_mode.
        if api_mode == "anthropic_messages":
            probe = probe_api_models(api_key, base_url, api_mode=api_mode)
@@ -3184,11 +3209,12 @@ def validate_requested_model(
            if suggestions:
                suggestion_text = "\n  Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
            return {
-                "accepted": False,
-                "persist": False,
+                "accepted": True,
+                "persist": True,
                "recognized": False,
                "message": (
-                    f"Model `{requested}` was not found in the OpenAI Codex model listing."
+                    f"Note: `{requested}` was not found in the OpenAI Codex model listing. "
+                    "It may still work if your ChatGPT/Codex account has access to a newer or hidden model ID."
                    f"{suggestion_text}"
                ),
            }
@@ -33,12 +33,15 @@ so plugin-defined tools appear alongside the built-in tools.

 from __future__ import annotations

+import asyncio
 import importlib
 import importlib.metadata
 import importlib.util
+import inspect
 import logging
 import os
 import sys
+import threading
 import types
 from dataclasses import dataclass, field
 from pathlib import Path
@@ -1226,6 +1229,55 @@ def get_plugin_command_handler(name: str) -> Optional[Callable]:
    return entry["handler"] if entry else None


+_PLUGIN_COMMAND_AWAIT_TIMEOUT_SECS = 30.0
+
+
+def resolve_plugin_command_result(result: Any) -> Any:
+    """Resolve a plugin command return value, awaiting async handlers when needed.
+
+    Sync CLI/TUI dispatch sites call plugin handlers from plain functions.
+    If a handler is async, await it directly when no loop is running; if
+    we're already inside an active loop, run it in a helper thread with its
+    own loop so the caller still gets a concrete result synchronously. The
+    threaded path is bounded by a 30s timeout so a hung async handler cannot
+    wedge the terminal indefinitely.
+    """
+    if not inspect.isawaitable(result):
+        return result
+
+    try:
+        asyncio.get_running_loop()
+    except RuntimeError:
+        return asyncio.run(result)
+
+    outcome: Dict[str, Any] = {}
+    failure: Dict[str, BaseException] = {}
+    done = threading.Event()
+
+    def _runner() -> None:
+        try:
+            outcome["value"] = asyncio.run(result)
+        except BaseException as exc:  # pragma: no cover - re-raised below
+            failure["exc"] = exc
+        finally:
+            done.set()
+
+    thread = threading.Thread(
+        target=_runner,
+        name="hermes-plugin-command-await",
+        daemon=True,
+    )
+    thread.start()
+    if not done.wait(timeout=_PLUGIN_COMMAND_AWAIT_TIMEOUT_SECS):
+        raise TimeoutError(
+            "Plugin command async handler did not complete within "
+            f"{_PLUGIN_COMMAND_AWAIT_TIMEOUT_SECS:.0f}s"
+        )
+    if "exc" in failure:
+        raise failure["exc"]
+    return outcome.get("value")
+
+
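Reviewer note: the two dispatch paths in `resolve_plugin_command_result` (no running loop vs. already inside one) can be demonstrated with a condensed copy of the helper. This is a sketch, not the shipped function: `resolve_result`, `sync_handler`, `async_handler` and `call_from_running_loop` are illustrative names, and the timeout is passed as a parameter instead of a module constant.

```python
import asyncio
import inspect
import threading
from typing import Any


def resolve_result(result: Any, timeout: float = 30.0) -> Any:
    """Condensed copy of the helper above, for illustration only."""
    if not inspect.isawaitable(result):
        return result  # sync handler: nothing to await
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(result)  # no loop in this thread: run it directly

    box: dict[str, Any] = {}
    done = threading.Event()

    def runner() -> None:
        try:
            box["value"] = asyncio.run(result)  # fresh loop in the worker thread
        except BaseException as exc:
            box["exc"] = exc
        finally:
            done.set()

    threading.Thread(target=runner, daemon=True).start()
    if not done.wait(timeout=timeout):
        raise TimeoutError(f"handler did not complete within {timeout:.0f}s")
    if "exc" in box:
        raise box["exc"]
    return box.get("value")


def sync_handler(args: str) -> str:
    return f"sync:{args}"


async def async_handler(args: str) -> str:
    await asyncio.sleep(0)
    return f"async:{args}"


async def call_from_running_loop() -> str:
    # A loop is already running here, so the worker-thread path is taken.
    return resolve_result(async_handler("b"))


plain = resolve_result(sync_handler("a"))
direct = resolve_result(async_handler("a"))
threaded = asyncio.run(call_from_running_loop())
```

Note that the threaded path blocks the calling thread until the handler finishes or the timeout expires, which is the point of the helper: sync call sites get a concrete value back.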
 def get_plugin_commands() -> Dict[str, dict]:
    """Return the full plugin commands dict (name → {handler, description, plugin}).

@@ -15,13 +15,18 @@ import shutil
 import subprocess
 import sys
 from pathlib import Path
-from typing import Optional
+from typing import Any, Optional

 from hermes_constants import get_hermes_home
 from hermes_cli.config import cfg_get

 logger = logging.getLogger(__name__)

+
+class PluginOperationError(Exception):
+    """Recoverable plugin install/update failure (CLI exits; HTTP maps to 4xx)."""
+
+
 # Minimum manifest version this installer understands.
 # Plugins may declare ``manifest_version: 1`` in plugin.yaml;
 # future breaking changes to the manifest schema bump this.
@@ -150,6 +155,24 @@ def _copy_example_files(plugin_dir: Path, console) -> None:
                )


+def _missing_requires_env_names(manifest: dict) -> list[str]:
+    """Return declared ``requires_env`` names that are unset in ``~/.hermes/.env``."""
+    requires_env = manifest.get("requires_env") or []
+    if not requires_env:
+        return []
+
+    from hermes_cli.config import get_env_value
+
+    env_specs: list[dict] = []
+    for entry in requires_env:
+        if isinstance(entry, str):
+            env_specs.append({"name": entry})
+        elif isinstance(entry, dict) and entry.get("name"):
+            env_specs.append(entry)
+
+    return [s["name"] for s in env_specs if s.get("name") and not get_env_value(s["name"])]
+
+
 def _prompt_plugin_env_vars(manifest: dict, console) -> None:
    """Prompt for required environment variables declared in plugin.yaml.

@@ -283,6 +306,95 @@ def _require_installed_plugin(name: str, plugins_dir: Path, console) -> Path:
 # ---------------------------------------------------------------------------


+def _install_plugin_core(identifier: str, *, force: bool) -> tuple[Path, dict, str]:
+    """Clone a Git plugin into ``~/.hermes/plugins``.
+
+    Returns ``(target_dir, installed_manifest, canonical_name)``.
+    Raises ``PluginOperationError`` on failure.
+    """
+    import tempfile
+
+    try:
+        git_url = _resolve_git_url(identifier)
+    except ValueError as e:
+        raise PluginOperationError(str(e)) from e
+
+    plugins_dir = _plugins_dir()
+
+    with tempfile.TemporaryDirectory() as tmp:
+        tmp_target = Path(tmp) / "plugin"
+
+        try:
+            result = subprocess.run(
+                ["git", "clone", "--depth", "1", git_url, str(tmp_target)],
+                capture_output=True,
+                text=True,
+                timeout=60,
+            )
+        except FileNotFoundError as e:
+            raise PluginOperationError(
+                "git is not installed or not in PATH.",
+            ) from e
+        except subprocess.TimeoutExpired as e:
+            raise PluginOperationError(
+                "Git clone timed out after 60 seconds.",
+            ) from e
+
+        if result.returncode != 0:
+            err = (result.stderr or result.stdout or "").strip()
+            raise PluginOperationError(f"Git clone failed:\n{err}")
+
+        manifest = _read_manifest(tmp_target)
+        plugin_name = manifest.get("name") or _repo_name_from_url(git_url)
+
+        try:
+            target = _sanitize_plugin_name(plugin_name, plugins_dir)
+        except ValueError as e:
+            raise PluginOperationError(str(e)) from e
+
+        mv = manifest.get("manifest_version")
+        if mv is not None:
+            try:
+                mv_int = int(mv)
+            except (ValueError, TypeError):
+                raise PluginOperationError(
+                    f"Plugin '{plugin_name}' has invalid manifest_version "
+                    f"'{mv}' (expected an integer).",
+                ) from None
+            if mv_int > _SUPPORTED_MANIFEST_VERSION:
+                from hermes_cli.config import recommended_update_command
+
+                raise PluginOperationError(
+                    f"Plugin '{plugin_name}' requires manifest_version {mv}, "
+                    f"but this installer only supports up to {_SUPPORTED_MANIFEST_VERSION}. "
+                    f"Run {recommended_update_command()} to update Hermes.",
+                ) from None
+
+        if target.exists():
+            if not force:
+                raise PluginOperationError(
+                    f"Plugin '{plugin_name}' already exists. Use force reinstall "
+                    f"or run `hermes plugins update {plugin_name}`.",
+                )
+            shutil.rmtree(target)
+
+        shutil.move(str(tmp_target), str(target))
+
+    has_yaml = (target / "plugin.yaml").exists() or (target / "plugin.yml").exists()
+    if not has_yaml and not (target / "__init__.py").exists():
+        logger.warning(
+            "%s has no plugin.yaml / __init__.py; may not be a valid plugin",
+            plugin_name,
+        )
+
+    from rich.console import Console
+
+    _copy_example_files(target, Console())
+    installed_manifest = _read_manifest(target)
+    installed_name = installed_manifest.get("name") or target.name
+    return target, installed_manifest, installed_name
+
+
 def cmd_install(
    identifier: str,
    force: bool = False,
@@ -293,7 +405,6 @@ def cmd_install(
    After install, prompt "Enable now? [y/N]" unless *enable* is provided
    (True = auto-enable without prompting, False = install disabled).
    """
-    import tempfile
    from rich.console import Console

    console = Console()
@@ -304,114 +415,41 @@ def cmd_install(
        console.print(f"[red]Error:[/red] {e}")
        sys.exit(1)

-    # Warn about insecure / local URL schemes
    if git_url.startswith(("http://", "file://")):
        console.print(
            "[yellow]Warning:[/yellow] Using insecure/local URL scheme. "
-            "Consider using https:// or git@ for production installs."
+            "Consider using https:// or git@ for production installs.",
        )

-    plugins_dir = _plugins_dir()
+    console.print(f"[dim]Cloning {git_url}...[/dim]")

-    # Clone into a temp directory first so we can read plugin.yaml for the name
-    with tempfile.TemporaryDirectory() as tmp:
-        tmp_target = Path(tmp) / "plugin"
-        console.print(f"[dim]Cloning {git_url}...[/dim]")
+    try:
+        target, installed_manifest, installed_name = _install_plugin_core(
+            identifier,
+            force=force,
+        )
+    except PluginOperationError as e:
+        console.print(f"[red]Error:[/red] {e}")
+        sys.exit(1)

-        try:
-            result = subprocess.run(
-                ["git", "clone", "--depth", "1", git_url, str(tmp_target)],
-                capture_output=True,
-                text=True,
-                timeout=60,
-            )
-        except FileNotFoundError:
-            console.print("[red]Error:[/red] git is not installed or not in PATH.")
-            sys.exit(1)
-        except subprocess.TimeoutExpired:
-            console.print("[red]Error:[/red] Git clone timed out after 60 seconds.")
-            sys.exit(1)
-
-        if result.returncode != 0:
-            console.print(
-                f"[red]Error:[/red] Git clone failed:\n{result.stderr.strip()}"
-            )
-            sys.exit(1)
-
-        # Read manifest
-        manifest = _read_manifest(tmp_target)
-        plugin_name = manifest.get("name") or _repo_name_from_url(git_url)
-
-        # Sanitize plugin name against path traversal
-        try:
-            target = _sanitize_plugin_name(plugin_name, plugins_dir)
-        except ValueError as e:
-            console.print(f"[red]Error:[/red] {e}")
-            sys.exit(1)
-
-        # Check manifest_version compatibility
-        mv = manifest.get("manifest_version")
-        if mv is not None:
-            try:
-                mv_int = int(mv)
-            except (ValueError, TypeError):
-                console.print(
-                    f"[red]Error:[/red] Plugin '{plugin_name}' has invalid "
-                    f"manifest_version '{mv}' (expected an integer)."
-                )
-                sys.exit(1)
-            if mv_int > _SUPPORTED_MANIFEST_VERSION:
-                from hermes_cli.config import recommended_update_command
-                console.print(
-                    f"[red]Error:[/red] Plugin '{plugin_name}' requires manifest_version "
-                    f"{mv}, but this installer only supports up to {_SUPPORTED_MANIFEST_VERSION}.\n"
-                    f"Run [bold]{recommended_update_command()}[/bold] to get a newer installer."
-                )
-                sys.exit(1)
-
-        if target.exists():
-            if not force:
-                console.print(
-                    f"[red]Error:[/red] Plugin '{plugin_name}' already exists at {target}.\n"
-                    f"Use [bold]--force[/bold] to remove and reinstall, or "
-                    f"[bold]hermes plugins update {plugin_name}[/bold] to pull latest."
-                )
-                sys.exit(1)
-            console.print(f"[dim]  Removing existing {plugin_name}...[/dim]")
-            shutil.rmtree(target)
-
-        # Move from temp to final location
-        shutil.move(str(tmp_target), str(target))
-
-    # Validate it looks like a plugin
-    if not (target / "plugin.yaml").exists() and not (target / "__init__.py").exists():
+    if not (target / "plugin.yaml").exists() and not (target / "plugin.yml").exists() and not (
+        target / "__init__.py"
+    ).exists():
        console.print(
-            f"[yellow]Warning:[/yellow] {plugin_name} doesn't contain plugin.yaml "
-            f"or __init__.py. It may not be a valid Hermes plugin."
+            f"[yellow]Warning:[/yellow] {installed_name} doesn't contain plugin.yaml "
+            f"or __init__.py. It may not be a valid Hermes plugin.",
        )

-    # Copy .example files to their real names (e.g. config.yaml.example → config.yaml)
-    _copy_example_files(target, console)
-
-    # Re-read manifest from installed location (for env var prompting)
-    installed_manifest = _read_manifest(target)
-
-    # Prompt for required environment variables before showing after-install docs
    _prompt_plugin_env_vars(installed_manifest, console)

    _display_after_install(target, identifier)

-    # Determine the canonical plugin name for enable-list bookkeeping.
-    installed_name = installed_manifest.get("name") or target.name
-
-    # Decide whether to enable: explicit flag > interactive prompt > default off
    should_enable = enable
    if should_enable is None:
-        # Interactive prompt unless stdin isn't a TTY (scripted install).
        if sys.stdin.isatty() and sys.stdout.isatty():
            try:
                answer = input(
-                    f"  Enable '{installed_name}' now? [y/N]: "
+                    f"  Enable '{installed_name}' now? [y/N]: ",
                ).strip().lower()
                should_enable = answer in ("y", "yes")
            except (EOFError, KeyboardInterrupt):
@@ -427,12 +465,12 @@ def cmd_install(
        _save_enabled_set(enabled)
        _save_disabled_set(disabled)
        console.print(
-            f"[green]✓[/green] Plugin [bold]{installed_name}[/bold] enabled."
+            f"[green]✓[/green] Plugin [bold]{installed_name}[/bold] enabled.",
        )
    else:
        console.print(
            f"[dim]Plugin installed but not enabled. "
-            f"Run `hermes plugins enable {installed_name}` to activate.[/dim]"
+            f"Run `hermes plugins enable {installed_name}` to activate.[/dim]",
        )

    console.print("[dim]Restart the gateway for the plugin to take effect:[/dim]")
@@ -462,36 +500,22 @@ def cmd_update(name: str) -> None:

    console.print(f"[dim]Updating {name}...[/dim]")

-    try:
-        result = subprocess.run(
-            ["git", "pull", "--ff-only"],
-            capture_output=True,
-            text=True,
-            timeout=60,
-            cwd=str(target),
-        )
-    except FileNotFoundError:
-        console.print("[red]Error:[/red] git is not installed or not in PATH.")
-        sys.exit(1)
-    except subprocess.TimeoutExpired:
-        console.print("[red]Error:[/red] Git pull timed out after 60 seconds.")
-        sys.exit(1)
-
-    if result.returncode != 0:
-        console.print(f"[red]Error:[/red] Git pull failed:\n{result.stderr.strip()}")
+    ok, output = _git_pull_plugin_dir(target)
+    if not ok:
+        console.print(f"[red]Error:[/red] {output}")
        sys.exit(1)

    # Copy any new .example files
    _copy_example_files(target, console)

-    output = result.stdout.strip()
-    if "Already up to date" in output:
+    out = output.strip()
+    if "Already up to date" in out:
        console.print(
            f"[green]✓[/green] Plugin [bold]{name}[/bold] is already up to date."
        )
    else:
        console.print(f"[green]✓[/green] Plugin [bold]{name}[/bold] updated.")
-        console.print(f"[dim]{output}[/dim]")
+        console.print(f"[dim]{out}[/dim]")


 def cmd_remove(name: str) -> None:
@@ -1244,6 +1268,247 @@ def _run_composite_fallback(plugin_names, plugin_labels, plugin_selected,
    print()


+def dashboard_install_plugin(
+    identifier: str,
+    *,
+    force: bool,
+    enable: bool,
+) -> dict[str, Any]:
+    """Non-interactive install for the web dashboard. Returns a JSON-serializable dict."""
+    warnings: list[str] = []
+    try:
+        git_url = _resolve_git_url(identifier)
+        if git_url.startswith(("http://", "file://")):
+            warnings.append(
+                "Insecure URL scheme; prefer https:// or git@ for production installs.",
+            )
+    except ValueError:
+        pass
+
+    try:
+        target, installed_manifest, installed_name = _install_plugin_core(
+            identifier,
+            force=force,
+        )
+    except PluginOperationError as exc:
+        return {"ok": False, "error": str(exc)}
+
+    missing_env = _missing_requires_env_names(installed_manifest)
+    if enable:
+        en = _get_enabled_set()
+        dis = _get_disabled_set()
+        en.add(installed_name)
+        dis.discard(installed_name)
+        _save_enabled_set(en)
+        _save_disabled_set(dis)
+
+    hint: str | None = None
+    ap = target / "after-install.md"
+    if ap.exists():
+        hint = str(ap)
+
+    return {
+        "ok": True,
+        "plugin_name": installed_name,
+        "warnings": warnings,
+        "missing_env": missing_env,
+        "after_install_path": hint,
+        "enabled": enable,
+    }
+
+
+def _get_plugin_toolset_key(name: str) -> Optional[str]:
+    """Return the toolset key a plugin registers its tools under, or None.
+
+    Queries the live tool registry — the plugin must already be loaded.
+    Falls back to reading ``provides_tools`` from plugin.yaml and looking
+    up the toolset from the registry for the first tool name found.
+    """
+    try:
+        from tools.registry import registry
+    except Exception:
+        return None
+
+    # Check the plugin manager for tools this plugin registered
+    try:
+        from hermes_cli.plugins import discover_plugins, get_plugin_manager
+        discover_plugins()  # idempotent — ensures plugins are loaded
+        manager = get_plugin_manager()
+        for _key, loaded in manager._plugins.items():
+            if loaded.manifest.name == name or _key == name:
+                for tool_name in loaded.tools_registered:
+                    entry = registry.get_entry(tool_name)
+                    if entry and entry.toolset:
+                        return entry.toolset
+                break
+    except Exception:
+        pass
+
+    # Fallback: read provides_tools from manifest on disk and query registry
+    try:
+        from hermes_cli.plugins import get_bundled_plugins_dir
+        for base in (get_bundled_plugins_dir(), _plugins_dir()):
+            if not base.is_dir():
+                continue
+            candidate = base / name
+            if candidate.is_dir():
+                manifest = _read_manifest(candidate)
+                for tool_name in manifest.get("provides_tools") or []:
+                    entry = registry.get_entry(tool_name)
+                    if entry and entry.toolset:
+                        return entry.toolset
+    except Exception:
+        pass
+
+    return None
+
+
+def _toggle_plugin_toolset(name: str, *, enable: bool) -> None:
+    """Add a plugin's toolset to, or remove it from, ``platform_toolsets`` on every configured platform.
+
+    Only acts if the plugin actually provides tools (has a toolset key).
+    """
+    toolset_key = _get_plugin_toolset_key(name)
+    if not toolset_key:
+        return
+
+    from hermes_cli.config import load_config, save_config
+
+    config = load_config()
+    platform_toolsets = config.get("platform_toolsets")
+    if not isinstance(platform_toolsets, dict):
+        platform_toolsets = {}
+        config["platform_toolsets"] = platform_toolsets
+
+    changed = False
+    for platform, ts_list in platform_toolsets.items():
+        if not isinstance(ts_list, list):
+            continue
+        if enable:
+            if toolset_key not in ts_list:
+                ts_list.append(toolset_key)
+                changed = True
+        else:
+            if toolset_key in ts_list:
+                ts_list.remove(toolset_key)
+                changed = True
+
+    # If enabling and no platforms have toolset lists yet, add to "cli" at minimum
+    if enable and not changed and not platform_toolsets:
+        platform_toolsets["cli"] = [toolset_key]
+        changed = True
+
+    if changed:
+        save_config(config)
+
+
+def dashboard_set_agent_plugin_enabled(name: str, *, enabled: bool) -> dict[str, Any]:
+    """Enable or disable a plugin in ``config.yaml`` (runtime allow/deny lists).
+
+    For plugins that provide tools (toolsets), also toggles the toolset in
+    ``platform_toolsets`` so the agent actually sees the tools in sessions.
+    """
+    if not _plugin_exists(name):
+        return {"ok": False, "error": f"Plugin '{name}' is not installed or bundled."}
+
+    en = _get_enabled_set()
+    dis = _get_disabled_set()
+
+    if enabled:
+        if name in en and name not in dis:
+            return {"ok": True, "name": name, "unchanged": True}
+        en.add(name)
+        dis.discard(name)
+        _save_enabled_set(en)
+        _save_disabled_set(dis)
+        _toggle_plugin_toolset(name, enable=True)
+        return {"ok": True, "name": name, "unchanged": False}
+
+    if name not in en and name in dis:
+        return {"ok": True, "name": name, "unchanged": True}
+
+    en.discard(name)
+    dis.add(name)
+    _save_enabled_set(en)
+    _save_disabled_set(dis)
+    _toggle_plugin_toolset(name, enable=False)
+    return {"ok": True, "name": name, "unchanged": False}
+
+
+def _user_installed_plugin_dir(name: str) -> Optional[Path]:
+    """Resolved path under ``~/.hermes/plugins/<name>`` if it exists."""
+    plugins_dir = _plugins_dir()
+    try:
+        target = _sanitize_plugin_name(name, plugins_dir)
+    except ValueError:
+        return None
+    return target if target.is_dir() else None
+
+
+def dashboard_update_user_plugin(name: str) -> dict[str, Any]:
+    """Run ``git pull --ff-only`` inside ``~/.hermes/plugins/<name>``."""
+    target = _user_installed_plugin_dir(name)
+    if target is None:
+        return {
+            "ok": False,
+            "error": f"Plugin '{name}' was not found under {_plugins_dir()}.",
+        }
+
+    if not (target / ".git").exists():
+        return {
+            "ok": False,
+            "error": f"Plugin '{name}' is not a git checkout; cannot pull updates.",
+        }
+
+    ok, msg = _git_pull_plugin_dir(target)
+    if not ok:
+        return {"ok": False, "error": msg}
+
+    from rich.console import Console
+
+    _copy_example_files(target, Console())
+    unchanged = "Already up to date" in msg
+    return {"ok": True, "name": name, "output": msg, "unchanged": unchanged}
+
+
+def _git_pull_plugin_dir(target: Path) -> tuple[bool, str]:
+    try:
+        result = subprocess.run(
+            ["git", "pull", "--ff-only"],
+            capture_output=True,
+            text=True,
+            timeout=60,
+            cwd=str(target),
+        )
+    except FileNotFoundError:
+        return False, "git is not installed or not in PATH."
+    except subprocess.TimeoutExpired:
+        return False, "Git pull timed out after 60 seconds."
+
+    if result.returncode != 0:
+        err = (result.stderr or "").strip() or result.stdout.strip()
+        return False, err or "git pull failed."
+    return True, result.stdout.strip()
+
+
+def dashboard_remove_user_plugin(name: str) -> dict[str, Any]:
+    """Delete a plugin tree under ``~/.hermes/plugins/`` only."""
+    plugins_dir = _plugins_dir()
+    for n, _ver, _d, src, _path in _discover_all_plugins():
+        if n == name and src == "bundled":
+            return {"ok": False, "error": "Bundled plugins cannot be removed from the dashboard."}
+
+    target = _user_installed_plugin_dir(name)
+    if target is None:
+        return {
+            "ok": False,
+            "error": f"Plugin '{name}' was not found under {plugins_dir}.",
+        }
+
+    shutil.rmtree(target)
+    return {"ok": True, "name": name}
+
+
 def plugins_command(args) -> None:
    """Dispatch hermes plugins subcommands."""
    action = getattr(args, "plugins_action", None)
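Reviewer note: the refactor above follows one pattern throughout: a shared core raises `PluginOperationError`, the CLI wrapper turns that into an exit, and the dashboard wrapper turns it into a JSON-serializable dict. The sketch below shows that shape in miniature. `install_core`, `cli_install` and `dashboard_install` are hypothetical stand-ins; the real core clones a repository, which is omitted here.

```python
import sys
from typing import Any


class PluginOperationError(Exception):
    """Recoverable failure raised by the shared core."""


def install_core(identifier: str) -> str:
    """Hypothetical stand-in for the core: returns a plugin name or raises."""
    if not identifier.strip():
        raise PluginOperationError("Plugin identifier cannot be empty.")
    return identifier.rstrip("/").rsplit("/", 1)[-1]


def cli_install(identifier: str) -> str:
    """CLI wrapper: report the error and exit with status 1."""
    try:
        return install_core(identifier)
    except PluginOperationError as exc:
        print(f"Error: {exc}", file=sys.stderr)
        sys.exit(1)


def dashboard_install(identifier: str) -> dict[str, Any]:
    """Dashboard wrapper: never exits, always returns a result dict."""
    try:
        name = install_core(identifier)
    except PluginOperationError as exc:
        return {"ok": False, "error": str(exc)}
    return {"ok": True, "plugin_name": name}


success = dashboard_install("owner/demo-plugin")
failure = dashboard_install("  ")
```

Keeping `sys.exit` out of the core is what lets a long-lived web process reuse the same install logic without being terminated by a bad request.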
@@ -11,7 +11,7 @@ zero migration needed.
 Usage::

    hermes profile create coder          # fresh profile + bundled skills
-    hermes profile create coder --clone  # also copy config, .env, SOUL.md
+    hermes profile create coder --clone  # also copy config, .env, SOUL.md, skills
    hermes profile create coder --clone-all  # full copy of source profile
    coder chat                           # use via wrapper alias
    hermes -p coder chat                 # or via flag
@@ -179,8 +179,33 @@ def _get_wrapper_dir() -> Path:
 # Validation
 # ---------------------------------------------------------------------------

+def normalize_profile_name(name: str) -> str:
+    """Return the canonical profile id used on disk and in CLI ``-p`` argv.
+
+    Named profiles are stored lowercase under ``profiles/<id>/``. The special
+    alias ``default`` is matched case-insensitively (``Default`` → ``default``).
+    Dashboards and tools may pass title-cased display labels; normalize before
+    validation, assignment, and subprocess spawn (see issue #18498).
+    """
+    if not isinstance(name, str):
+        name = str(name)
+    stripped = name.strip()
+    if not stripped:
+        raise ValueError("profile name cannot be empty")
+    if stripped.casefold() == "default":
+        return "default"
+    return stripped.lower()
+
+
 def validate_profile_name(name: str) -> None:
-    """Raise ``ValueError`` if *name* is not a valid profile identifier."""
+    """Raise ``ValueError`` if *name* is not a valid profile identifier.
+
+    Validates the input as given, with a strict lowercase match. Callers that
+    accept mixed-case or title-cased input from users (dashboard UI, CLI args)
+    should call :func:`normalize_profile_name` first. This separation keeps
+    validation strict about what the on-disk directory name must look like,
+    while ingress-point normalization handles UX flexibility (see #18498).
+    """
    if name == "default":
        return  # special alias for ~/.hermes
    if not _PROFILE_ID_RE.match(name):
@@ -192,16 +217,18 @@ def validate_profile_name(name: str) -> None:

 def get_profile_dir(name: str) -> Path:
    """Resolve a profile name to its HERMES_HOME directory."""
-    if name == "default":
+    canon = normalize_profile_name(name)
+    if canon == "default":
        return _get_default_hermes_home()
-    return _get_profiles_root() / name
+    return _get_profiles_root() / canon


 def profile_exists(name: str) -> bool:
    """Check whether a profile directory exists."""
-    if name == "default":
+    canon = normalize_profile_name(name)
+    if canon == "default":
        return True
-    return get_profile_dir(name).is_dir()
+    return get_profile_dir(canon).is_dir()


 # ---------------------------------------------------------------------------
@@ -213,28 +240,29 @@ def check_alias_collision(name: str) -> Optional[str]:

    Checks: reserved names, hermes subcommands, existing binaries in PATH.
    """
-    if name in _RESERVED_NAMES:
-        return f"'{name}' is a reserved name"
-    if name in _HERMES_SUBCOMMANDS:
-        return f"'{name}' conflicts with a hermes subcommand"
+    canon = normalize_profile_name(name)
+    if canon in _RESERVED_NAMES:
+        return f"'{canon}' is a reserved name"
+    if canon in _HERMES_SUBCOMMANDS:
+        return f"'{canon}' conflicts with a hermes subcommand"

    # Check existing commands in PATH
    wrapper_dir = _get_wrapper_dir()
    try:
        result = subprocess.run(
-            ["which", name], capture_output=True, text=True, timeout=5,
+            ["which", canon], capture_output=True, text=True, timeout=5,
        )
        if result.returncode == 0:
            existing_path = result.stdout.strip()
            # Allow overwriting our own wrappers
-            if existing_path == str(wrapper_dir / name):
+            if existing_path == str(wrapper_dir / canon):
                try:
-                    content = (wrapper_dir / name).read_text()
+                    content = (wrapper_dir / canon).read_text()
                    if "hermes -p" in content:
                        return None  # it's our wrapper, safe to overwrite
                except Exception:
                    pass
-            return f"'{name}' conflicts with an existing command ({existing_path})"
+            return f"'{canon}' conflicts with an existing command ({existing_path})"
    except (FileNotFoundError, subprocess.TimeoutExpired):
        pass

@@ -252,6 +280,7 @@ def create_wrapper_script(name: str) -> Optional[Path]:

    Returns the path to the created wrapper, or None if creation failed.
    """
+    canon = normalize_profile_name(name)
    wrapper_dir = _get_wrapper_dir()
    try:
        wrapper_dir.mkdir(parents=True, exist_ok=True)
@@ -259,9 +288,9 @@ def create_wrapper_script(name: str) -> Optional[Path]:
        print(f"⚠ Could not create {wrapper_dir}: {e}")
        return None

-    wrapper_path = wrapper_dir / name
+    wrapper_path = wrapper_dir / canon
    try:
-        wrapper_path.write_text(f'#!/bin/sh\nexec hermes -p {name} "$@"\n')
+        wrapper_path.write_text(f'#!/bin/sh\nexec hermes -p {canon} "$@"\n')
        wrapper_path.chmod(wrapper_path.stat().st_mode | stat.S_IEXEC | stat.S_IXGRP | stat.S_IXOTH)
        return wrapper_path
    except OSError as e:
@@ -271,7 +300,7 @@ def create_wrapper_script(name: str) -> Optional[Path]:

 def remove_wrapper_script(name: str) -> bool:
    """Remove the wrapper script for a profile. Returns True if removed."""
-    wrapper_path = _get_wrapper_dir() / name
+    wrapper_path = _get_wrapper_dir() / normalize_profile_name(name)
    if wrapper_path.exists():
        try:
            # Verify it's our wrapper before removing
@@ -411,7 +440,8 @@ def create_profile(
    clone_all:
        If True, do a full copytree of the source (all state).
    clone_config:
-        If True, copy only config files (config.yaml, .env, SOUL.md).
+        If True, copy config files (config.yaml, .env, SOUL.md), installed
+        skills, and selected profile identity files from the source profile.
    no_alias:
        If True, skip wrapper script creation.

@@ -420,16 +450,17 @@ def create_profile(
    Path
        The newly created profile directory.
    """
-    validate_profile_name(name)
+    canon = normalize_profile_name(name)
+    validate_profile_name(canon)

-    if name == "default":
+    if canon == "default":
        raise ValueError(
            "Cannot create a profile named 'default' — it is the built-in profile (~/.hermes)."
        )

-    profile_dir = get_profile_dir(name)
+    profile_dir = get_profile_dir(canon)
    if profile_dir.exists():
-        raise FileExistsError(f"Profile '{name}' already exists at {profile_dir}")
+        raise FileExistsError(f"Profile '{canon}' already exists at {profile_dir}")

    # Resolve clone source
    source_dir = None
@@ -439,6 +470,7 @@ def create_profile(
            from hermes_constants import get_hermes_home
            source_dir = get_hermes_home()
        else:
+            clone_from = normalize_profile_name(clone_from)
            validate_profile_name(clone_from)
            source_dir = get_profile_dir(clone_from)
        if not source_dir.is_dir():
@@ -469,6 +501,14 @@ def create_profile(
                if src.exists():
                    shutil.copy2(src, profile_dir / filename)

+            # Clone installed skills from the source profile. The dashboard's
+            # "clone from default" flow is expected to preserve both bundled
+            # and user-installed skills so the new profile immediately has the
+            # same agent capabilities as the source profile.
+            source_skills = source_dir / "skills"
+            if source_skills.is_dir():
+                shutil.copytree(source_skills, profile_dir / "skills", dirs_exist_ok=True)
+
            # Clone memory and other subdirectory files
            for relpath in _CLONE_SUBDIR_FILES:
                src = source_dir / relpath
@@ -531,24 +571,25 @@ def delete_profile(name: str, yes: bool = False) -> Path:

    Returns the path that was removed.
    """
-    validate_profile_name(name)
+    canon = normalize_profile_name(name)
+    validate_profile_name(canon)

-    if name == "default":
+    if canon == "default":
        raise ValueError(
            "Cannot delete the default profile (~/.hermes).\n"
            "To remove everything, use: hermes uninstall"
        )

-    profile_dir = get_profile_dir(name)
+    profile_dir = get_profile_dir(canon)
    if not profile_dir.is_dir():
-        raise FileNotFoundError(f"Profile '{name}' does not exist.")
+        raise FileNotFoundError(f"Profile '{canon}' does not exist.")

    # Show what will be deleted
    model, provider = _read_config_model(profile_dir)
    gw_running = _check_gateway_running(profile_dir)
    skill_count = _count_skills(profile_dir)

-    print(f"\nProfile: {name}")
+    print(f"\nProfile: {canon}")
    print(f"Path:    {profile_dir}")
    if model:
        print(f"Model:   {model}" + (f" ({provider})" if provider else ""))
@@ -560,7 +601,7 @@ def delete_profile(name: str, yes: bool = False) -> Path:
    ]

    # Check for service
-    wrapper_path = _get_wrapper_dir() / name
+    wrapper_path = _get_wrapper_dir() / canon
    has_wrapper = wrapper_path.exists()
    if has_wrapper:
        items.append(f"Command alias ({wrapper_path})")
@@ -575,16 +616,16 @@ def delete_profile(name: str, yes: bool = False) -> Path:
    if not yes:
        print()
        try:
-            confirm = input(f"Type '{name}' to confirm: ").strip()
+            confirm = input(f"Type '{canon}' to confirm: ").strip()
        except (KeyboardInterrupt, EOFError):
            print("\nCancelled.")
            return profile_dir
-        if confirm != name:
+        if confirm != canon:
            print("Cancelled.")
            return profile_dir

    # 1. Disable service (prevents auto-restart)
-    _cleanup_gateway_service(name, profile_dir)
+    _cleanup_gateway_service(canon, profile_dir)

    # 2. Stop running gateway
    if gw_running:
@@ -592,7 +633,7 @@ def delete_profile(name: str, yes: bool = False) -> Path:

    # 3. Remove wrapper script
    if has_wrapper:
-        if remove_wrapper_script(name):
+        if remove_wrapper_script(canon):
            print(f"✓ Removed {wrapper_path}")

    # 4. Remove profile directory
@@ -605,13 +646,13 @@ def delete_profile(name: str, yes: bool = False) -> Path:
    # 5. Clear active_profile if it pointed to this profile
    try:
        active = get_active_profile()
-        if active == name:
+        if active == canon:
            set_active_profile("default")
            print("✓ Active profile reset to default")
    except Exception:
        pass

-    print(f"\nProfile '{name}' deleted.")
+    print(f"\nProfile '{canon}' deleted.")
    return profile_dir


@@ -721,22 +762,23 @@ def set_active_profile(name: str) -> None:

    Writes to ``~/.hermes/active_profile``. Use ``"default"`` to clear.
    """
-    validate_profile_name(name)
-    if name != "default" and not profile_exists(name):
+    canon = normalize_profile_name(name)
+    validate_profile_name(canon)
+    if canon != "default" and not profile_exists(canon):
        raise FileNotFoundError(
-            f"Profile '{name}' does not exist. "
-            f"Create it with: hermes profile create {name}"
+            f"Profile '{canon}' does not exist. "
+            f"Create it with: hermes profile create {canon}"
        )

    path = _get_active_profile_path()
    path.parent.mkdir(parents=True, exist_ok=True)
-    if name == "default":
+    if canon == "default":
        # Remove the file to indicate default
        path.unlink(missing_ok=True)
    else:
        # Atomic write
        tmp = path.with_suffix(".tmp")
-        tmp.write_text(name + "\n")
+        tmp.write_text(canon + "\n")
        tmp.replace(path)


@@ -802,16 +844,17 @@ def export_profile(name: str, output_path: str) -> Path:
    """
    import tempfile

-    validate_profile_name(name)
-    profile_dir = get_profile_dir(name)
+    canon = normalize_profile_name(name)
+    validate_profile_name(canon)
+    profile_dir = get_profile_dir(canon)
    if not profile_dir.is_dir():
-        raise FileNotFoundError(f"Profile '{name}' does not exist.")
+        raise FileNotFoundError(f"Profile '{canon}' does not exist.")

    output = Path(output_path)
    # shutil.make_archive wants the base name without extension
    base = str(output).removesuffix(".tar.gz").removesuffix(".tgz")

-    if name == "default":
+    if canon == "default":
        # The default profile IS ~/.hermes itself — its parent is ~/ and its
        # directory name is ".hermes", not "default".  We stage a clean copy
        # under a temp dir so the archive contains ``default/...``.
@@ -827,14 +870,14 @@ def export_profile(name: str, output_path: str) -> Path:

    # Named profiles — stage a filtered copy to exclude credentials
    with tempfile.TemporaryDirectory() as tmpdir:
-        staged = Path(tmpdir) / name
+        staged = Path(tmpdir) / canon
        _CREDENTIAL_FILES = {"auth.json", ".env"}
        shutil.copytree(
            profile_dir,
            staged,
            ignore=lambda d, contents: _CREDENTIAL_FILES & set(contents),
        )
-        result = shutil.make_archive(base, "gztar", tmpdir, name)
+        result = shutil.make_archive(base, "gztar", tmpdir, canon)
        return Path(result)


@@ -943,16 +986,17 @@ def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
    # Archives exported from the default profile have "default/" as top-level
    # dir.  Importing as "default" would target ~/.hermes itself — disallow
    # that and guide the user toward a named profile.
-    if inferred_name == "default":
+    canon = normalize_profile_name(inferred_name)
+    validate_profile_name(canon)
+    if canon == "default":
        raise ValueError(
            "Cannot import as 'default' — that is the built-in root profile (~/.hermes). "
            "Specify a different name: hermes profile import <archive> --name <name>"
        )

-    validate_profile_name(inferred_name)
-    profile_dir = get_profile_dir(inferred_name)
+    profile_dir = get_profile_dir(canon)
    if profile_dir.exists():
-        raise FileExistsError(f"Profile '{inferred_name}' already exists at {profile_dir}")
+        raise FileExistsError(f"Profile '{canon}' already exists at {profile_dir}")

    profiles_root = _get_profiles_root()
    profiles_root.mkdir(parents=True, exist_ok=True)
@@ -968,8 +1012,8 @@ def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
            )

        final_source = extracted
-        if archive_root != inferred_name:
-            final_source = staging_root / inferred_name
+        if archive_root != canon:
+            final_source = staging_root / canon
            extracted.rename(final_source)

        shutil.move(str(final_source), str(profile_dir))
@@ -1039,25 +1083,27 @@ def rename_profile(old_name: str, new_name: str) -> Path:

    Returns the new profile directory.
    """
-    validate_profile_name(old_name)
-    validate_profile_name(new_name)
+    old_canon = normalize_profile_name(old_name)
+    new_canon = normalize_profile_name(new_name)
+    validate_profile_name(old_canon)
+    validate_profile_name(new_canon)

-    if old_name == "default":
+    if old_canon == "default":
        raise ValueError("Cannot rename the default profile.")
-    if new_name == "default":
+    if new_canon == "default":
        raise ValueError("Cannot rename to 'default' — it is reserved.")

-    old_dir = get_profile_dir(old_name)
-    new_dir = get_profile_dir(new_name)
+    old_dir = get_profile_dir(old_canon)
+    new_dir = get_profile_dir(new_canon)

    if not old_dir.is_dir():
-        raise FileNotFoundError(f"Profile '{old_name}' does not exist.")
+        raise FileNotFoundError(f"Profile '{old_canon}' does not exist.")
    if new_dir.exists():
-        raise FileExistsError(f"Profile '{new_name}' already exists.")
+        raise FileExistsError(f"Profile '{new_canon}' already exists.")

    # 1. Stop gateway if running
    if _check_gateway_running(old_dir):
-        _cleanup_gateway_service(old_name, old_dir)
+        _cleanup_gateway_service(old_canon, old_dir)
        _stop_gateway_process(old_dir)

    # 2. Rename directory
@@ -1065,22 +1111,22 @@ def rename_profile(old_name: str, new_name: str) -> Path:
    print(f"✓ Renamed {old_dir.name} → {new_dir.name}")

    # 3. Update profile-scoped Honcho host blocks, preserving aiPeer identity
-    _migrate_honcho_profile_host(old_name, new_name, new_dir)
+    _migrate_honcho_profile_host(old_canon, new_canon, new_dir)

    # 4. Update wrapper script
-    remove_wrapper_script(old_name)
-    collision = check_alias_collision(new_name)
+    remove_wrapper_script(old_canon)
+    collision = check_alias_collision(new_canon)
    if not collision:
-        create_wrapper_script(new_name)
-        print(f"✓ Alias updated: {new_name}")
+        create_wrapper_script(new_canon)
+        print(f"✓ Alias updated: {new_canon}")
    else:
-        print(f"⚠ Cannot create alias '{new_name}' — {collision}")
+        print(f"⚠ Cannot create alias '{new_canon}' — {collision}")

    # 5. Update active_profile if it pointed to old name
    try:
-        if get_active_profile() == old_name:
-            set_active_profile(new_name)
-            print(f"✓ Active profile updated: {new_name}")
+        if get_active_profile() == old_canon:
+            set_active_profile(new_canon)
+            print(f"✓ Active profile updated: {new_canon}")
    except Exception:
        pass

@@ -1182,13 +1228,14 @@ def resolve_profile_env(profile_name: str) -> str:
    Called early in the CLI entry point, before any hermes modules
    are imported, to set the HERMES_HOME environment variable.
    """
-    validate_profile_name(profile_name)
-    profile_dir = get_profile_dir(profile_name)
+    canon = normalize_profile_name(profile_name)
+    validate_profile_name(canon)
+    profile_dir = get_profile_dir(canon)

-    if profile_name != "default" and not profile_dir.is_dir():
+    if canon != "default" and not profile_dir.is_dir():
        raise FileNotFoundError(
-            f"Profile '{profile_name}' does not exist. "
-            f"Create it with: hermes profile create {profile_name}"
+            f"Profile '{canon}' does not exist. "
+            f"Create it with: hermes profile create {canon}"
        )

    return str(profile_dir)
@@ -108,9 +108,14 @@ class PtyBridge:
                    "(or pip install -e '.[pty]')."
                )
            raise PtyUnavailableError("Pseudo-terminals are unavailable.")
-        # Let caller-supplied env fully override inheritance; if they pass
-        # None we inherit the server's env (same semantics as subprocess).
-        spawn_env = os.environ.copy() if env is None else env
+        # PTY-hosted programs expect TERM to describe the terminal type.
+        # CI often runs without TERM in the parent process, which makes
+        # simple terminal probes like `tput cols` fail before winsize reads.
+        # Preserve explicit caller overrides, but backfill a sensible default
+        # when TERM is missing or blank.
+        spawn_env = (os.environ.copy() if env is None else env.copy())
+        if not spawn_env.get("TERM"):
+            spawn_env["TERM"] = "xterm-256color"
        proc = ptyprocess.PtyProcess.spawn(  # type: ignore[union-attr]
            list(argv),
            cwd=cwd,
@@ -358,11 +358,20 @@ def _get_named_custom_provider(requested_provider: str) -> Optional[Dict[str, An
        return None
    if not requested_norm.startswith("custom:"):
        try:
-            auth_mod.resolve_provider(requested_norm)
+            canonical = auth_mod.resolve_provider(requested_norm)
        except AuthError:
            pass
        else:
-            return None
+            # A user-declared ``custom_providers`` entry whose name matches
+            # only an *alias* (``kimi`` → built-in ``kimi-coding``) is the
+            # user's intended target — alias rewriting would otherwise hijack
+            # the request.  We only defer to the built-in when the raw name is
+            # the canonical provider itself (``nous``, ``openrouter``, …) so
+            # accidentally shadowing a canonical provider still resolves to
+            # the built-in. See tests/hermes_cli/test_runtime_provider_resolution.py
+            # ``test_named_custom_provider_does_not_shadow_builtin_provider``.
+            if (canonical or "").strip().lower() == requested_norm:
+                return None

    config = load_config()
    
@@ -964,7 +964,8 @@ def setup_model_provider(config: dict, *, quick: bool = False):
                    )
                else:
                    _selected_vision_model = prompt("  Vision model (blank = use main/custom default)").strip()
-                save_env_value("AUXILIARY_VISION_MODEL", _selected_vision_model)
+                if _selected_vision_model:
+                    save_env_value("AUXILIARY_VISION_MODEL", _selected_vision_model)
                print_success(
                    f"Vision configured with {_base_url}"
                    + (f" ({_selected_vision_model})" if _selected_vision_model else "")
@@ -1190,6 +1191,13 @@ def _setup_tts_provider(config: dict):
                    "Falling back to Edge TTS."
                )
                selected = "edge"
+        if selected == "xai":
+            print()
+            voice_id = prompt("xAI voice_id (Enter for 'eve', or paste a custom voice ID)")
+            if voice_id and voice_id.strip():
+                config.setdefault("tts", {}).setdefault("xai", {})["voice_id"] = voice_id.strip()
+                print_success(f"xAI voice_id set to: {voice_id.strip()}")
+

    elif selected == "minimax":
        existing = get_env_value("MINIMAX_API_KEY")
@@ -1321,15 +1329,13 @@ def setup_terminal_backend(config: dict):
        print_success("Terminal backend: Local")
        print_info("Commands run directly on this machine.")

-        # CWD for messaging
+        # Gateway/cron working directory
        print()
-        print_info("Working directory for messaging sessions:")
-        print_info("  When using Hermes via Telegram/Discord, this is where")
-        print_info(
-            "  the agent starts. CLI mode always starts in the current directory."
-        )
+        print_info("Gateway working directory:")
+        print_info("  Used by Telegram/Discord/cron sessions.")
+        print_info("  CLI/TUI always uses your launch directory instead.")
        current_cwd = cfg_get(config, "terminal", "cwd", default="")
-        cwd = prompt("  Messaging working directory", current_cwd or str(Path.home()))
+        cwd = prompt("  Gateway working directory", current_cwd or str(Path.home()))
        if cwd:
            config["terminal"]["cwd"] = cwd

@@ -1643,7 +1649,11 @@ def setup_terminal_backend(config: dict):
 def _apply_default_agent_settings(config: dict):
    """Apply recommended defaults for all agent settings without prompting."""
    config.setdefault("agent", {})["max_turns"] = 90
-    save_env_value("HERMES_MAX_ITERATIONS", "90")
+    # config.yaml is the authoritative source for max_turns; the gateway
+    # bridges it into HERMES_MAX_ITERATIONS at startup. We no longer write
+    # to .env to avoid the dual-source inconsistency that caused the
+    # 60-vs-500 bug (stale .env entry silently shadowing config.yaml).
+    remove_env_value("HERMES_MAX_ITERATIONS")

    config.setdefault("display", {})["tool_progress"] = "all"

@@ -1673,9 +1683,10 @@ def setup_agent_settings(config: dict):
    print()

    # ── Max Iterations ──
-    current_max = get_env_value("HERMES_MAX_ITERATIONS") or str(
-        cfg_get(config, "agent", "max_turns", default=90)
-    )
+    # config.yaml is authoritative; read from there. If a legacy .env
+    # entry is still around (from pre-PR#18413 setups), prefer the
+    # config value so we don't surface a stale number to the user.
+    current_max = str(cfg_get(config, "agent", "max_turns", default=90))
    print_info("Maximum tool-calling iterations per conversation.")
    print_info("Higher = more complex tasks, but costs more tokens.")
    print_info(
@@ -1686,9 +1697,13 @@ def setup_agent_settings(config: dict):
    try:
        max_iter = int(max_iter_str)
        if max_iter > 0:
-            save_env_value("HERMES_MAX_ITERATIONS", str(max_iter))
+            # Write to config.yaml (authoritative) only. Also clean up any
+            # stale .env entry from earlier setup runs — the gateway's
+            # bridge in gateway/run.py now unconditionally derives
+            # HERMES_MAX_ITERATIONS from agent.max_turns at startup.
            config.setdefault("agent", {})["max_turns"] = max_iter
            config.pop("max_turns", None)
+            remove_env_value("HERMES_MAX_ITERATIONS")
            print_success(f"Max iterations set to {max_iter}")
    except ValueError:
        print_warning("Invalid number, keeping current value")
@@ -2033,6 +2048,16 @@ def _setup_slack():
        print_warning("⚠️  No Slack allowlist set - unpaired users will be denied by default.")
        print_info("   Set SLACK_ALLOW_ALL_USERS=true or GATEWAY_ALLOW_ALL_USERS=true only if you intentionally want open workspace access.")

+    print()
+    print_info("📬 Home Channel: where Hermes delivers cron job results,")
+    print_info("   cross-platform messages, and notifications.")
+    print_info("   To get a channel ID: open the channel in Slack, then right-click")
+    print_info("   the channel name → Copy link — the ID starts with C (e.g. C01ABC2DE3F).")
+    print_info("   You can also set this later by typing /set-home in a Slack channel.")
+    home_channel = prompt("Home channel ID (leave empty to set later with /set-home)")
+    if home_channel and home_channel.strip():
+        save_env_value("SLACK_HOME_CHANNEL", home_channel.strip())
+

 def _write_slack_manifest_and_instruct():
    """Generate the Slack manifest, write it under HERMES_HOME, and print
@@ -2979,6 +3004,21 @@ def run_setup_wizard(args):
    config = load_config()
    hermes_home = get_hermes_home()

+    # Back up existing config before setup modifies it (#3522)
+    config_path = get_config_path()
+    if config_path.exists():
+        from datetime import datetime as _dt
+        _backup_path = config_path.with_suffix(
+            f".yaml.bak.{_dt.now().strftime('%Y%m%d_%H%M%S')}"
+        )
+        try:
+            import shutil
+            shutil.copy2(config_path, _backup_path)
+        except Exception:
+            _backup_path = None
+    else:
+        _backup_path = None
+
    # Detect non-interactive environments (headless SSH, Docker, CI/CD)
    non_interactive = getattr(args, 'non_interactive', False)
    if not non_interactive and not is_interactive_stdin():
@@ -3148,6 +3188,10 @@ def run_setup_wizard(args):

    # Save and show summary
    save_config(config)
+    if _backup_path and _backup_path.exists():
+        print_info(f"Previous config backed up to: {_backup_path}")
+        print_info("If setup changed a value you customized, restore it with:")
+        print_info(f"  cp {_backup_path} {config_path}")
    _print_setup_summary(config, hermes_home)

    _offer_launch_chat()
@@ -18,6 +18,7 @@ for reinstall when scopes/commands change.
 from __future__ import annotations

 import json
+import os
 import sys
 from pathlib import Path

@@ -128,7 +129,7 @@ def slack_manifest_command(args) -> int:

                target = Path(get_hermes_home()) / "slack-manifest.json"
            except Exception:
-                target = Path.home() / ".hermes" / "slack-manifest.json"
+                target = Path(os.environ.get("HERMES_HOME") or str(Path.home() / ".hermes")) / "slack-manifest.json"
        else:
            target = Path(write_target).expanduser()
        target.parent.mkdir(parents=True, exist_ok=True)
@@ -122,10 +122,16 @@ def show_status(args):
    print()
    print(color("◆ API Keys", Colors.CYAN, Colors.BOLD))

-    keys = {
+    # Values are a single env var name (str) or a tuple of alternates (first non-empty value wins).
+    keys: dict[str, str | tuple[str, ...]] = {
        "OpenRouter": "OPENROUTER_API_KEY",
        "OpenAI": "OPENAI_API_KEY",
-        "Z.AI/GLM": "GLM_API_KEY",
+        "Anthropic": ("ANTHROPIC_API_KEY", "ANTHROPIC_TOKEN"),
+        "Google / Gemini": ("GOOGLE_API_KEY", "GEMINI_API_KEY"),
+        "DeepSeek": "DEEPSEEK_API_KEY",
+        "xAI / Grok": "XAI_API_KEY",
+        "NVIDIA NIM": "NVIDIA_API_KEY",
+        "Z.AI / GLM": "GLM_API_KEY",
        "Kimi": "KIMI_API_KEY",
        "StepFun Step Plan": "STEPFUN_API_KEY",
        "MiniMax": "MINIMAX_API_KEY",
@@ -141,8 +147,23 @@ def show_status(args):
        "GitHub": "GITHUB_TOKEN",
    }

-    for name, env_var in keys.items():
-        value = get_env_value(env_var) or ""
+    def _resolve_env(env_ref) -> str:
+        """Return first non-empty env var value from a str or tuple of names."""
+        if isinstance(env_ref, tuple):
+            for candidate in env_ref:
+                v = get_env_value(candidate) or ""
+                if v:
+                    return v
+            return ""
+        return get_env_value(env_ref) or ""
+
+    for name, env_ref in keys.items():
+        # Anthropic already has a dedicated lookup below; keep that as the
+        # single source of truth (it also resolves OAuth tokens), skip here
+        # so we don't print two "Anthropic" rows.
+        if name == "Anthropic":
+            continue
+        value = _resolve_env(env_ref)
        has_key = bool(value)
        display = redact_key(value) if not show_all else value
        print(f"  {name:<12}  {check_mark(has_key)} {display}")
@@ -56,6 +56,7 @@ CONFIGURABLE_TOOLSETS = [
    ("file",            "📁 File Operations",           "read, write, patch, search"),
    ("code_execution",  "⚡ Code Execution",            "execute_code"),
    ("vision",          "👁️  Vision / Image Analysis",  "vision_analyze"),
+    ("video",           "🎬 Video Analysis",            "video_analyze (requires video-capable model)"),
    ("image_gen",       "🎨 Image Generation",          "image_generate"),
    ("moa",             "🧠 Mixture of Agents",         "mixture_of_agents"),
    ("tts",             "🔊 Text-to-Speech",            "text_to_speech"),
@@ -78,7 +79,7 @@ CONFIGURABLE_TOOLSETS = [
 # Toolsets that are OFF by default for new installs.
 # They're still in _HERMES_CORE_TOOLS (available at runtime if enabled),
 # but the setup checklist won't pre-select them for first-time users.
-_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin"}
+_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin", "video"}

 # Platform-scoped toolsets: only appear in the `hermes tools` checklist for
 # these platforms, and only resolve/save for these platforms.  A toolset
@@ -1822,7 +1823,7 @@ def _reconfigure_tool(config: dict):
        cat = TOOL_CATEGORIES.get(ts_key)
        reqs = TOOLSET_ENV_REQUIREMENTS.get(ts_key)
        if cat or reqs:
-            if _toolset_has_keys(ts_key, config):
+            if _toolset_has_keys(ts_key, config) or _toolset_enabled_for_reconfigure(ts_key, config):
                configurable.append((ts_key, ts_label))

    if not configurable:
@@ -1848,6 +1849,28 @@ def _reconfigure_tool(config: dict):
    save_config(config)


+def _toolset_enabled_for_reconfigure(ts_key: str, config: dict) -> bool:
+    """Return True if a configurable toolset is enabled on any platform.
+
+    Reconfigure must include enabled-but-unconfigured categories so users can
+    finish provider/API-key setup without disabling and re-enabling the toolset.
+    """
+    for platform in PLATFORMS:
+        if not _toolset_allowed_for_platform(ts_key, platform):
+            continue
+        try:
+            enabled = _get_platform_tools(
+                config,
+                platform,
+                include_default_mcp_servers=False,
+            )
+        except Exception:
+            continue
+        if ts_key in enabled:
+            return True
+    return False
+
+
 def _configure_tool_category_for_reconfig(ts_key: str, cat: dict, config: dict):
    """Reconfigure a tool category - provider selection + API key update."""
    icon = cat.get("icon", "")
@@ -1897,21 +1920,27 @@ def _reconfigure_provider(provider: dict, config: dict):
            return

    if provider.get("tts_provider"):
-        config.setdefault("tts", {})["provider"] = provider["tts_provider"]
+        tts_cfg = config.setdefault("tts", {})
+        tts_cfg["provider"] = provider["tts_provider"]
+        tts_cfg["use_gateway"] = bool(managed_feature)
        _print_success(f"  TTS provider set to: {provider['tts_provider']}")

    if "browser_provider" in provider:
        bp = provider["browser_provider"]
+        browser_cfg = config.setdefault("browser", {})
        if bp == "local":
-            config.setdefault("browser", {})["cloud_provider"] = "local"
+            browser_cfg["cloud_provider"] = "local"
            _print_success("  Browser set to local mode")
        elif bp:
-            config.setdefault("browser", {})["cloud_provider"] = bp
+            browser_cfg["cloud_provider"] = bp
            _print_success(f"  Browser cloud provider set to: {bp}")
+        browser_cfg["use_gateway"] = bool(managed_feature)

    # Set web search backend in config if applicable
    if provider.get("web_backend"):
-        config.setdefault("web", {})["backend"] = provider["web_backend"]
+        web_cfg = config.setdefault("web", {})
+        web_cfg["backend"] = provider["web_backend"]
+        web_cfg["use_gateway"] = bool(managed_feature)
        _print_success(f"  Web backend set to: {provider['web_backend']}")

    if managed_feature and managed_feature not in ("web", "tts", "browser"):
@@ -27,6 +27,192 @@ import sys
 import threading
 from typing import Any, Callable, Optional

+# Modifier aliases mirrored from the TUI parser (``ui-tui/src/lib/platform.ts``)
+# ``_MOD_ALIASES`` table — the contract that removes the cross-runtime
+# mismatch Copilot flagged in round-9 on #19835.
+#
+# ``super``/``win``/``windows`` are intentionally absent: prompt_toolkit
+# has no super/meta modifier for the Cmd key, so those spellings are
+# TUI-only. The normalizer below returns the documented default
+# (``c-b``) for them — a silent fallback was preferred to a hard
+# startup crash (Copilot round-11). The CLI binding site
+# (``_register_voice_handler`` in cli.py) logs a warning when that
+# fallback fires so users see why their TUI-only shortcut isn't
+# bound in the classic CLI.
+_VOICE_MOD_ALIASES = {
+    "ctrl": "c-",
+    "control": "c-",
+    "alt": "a-",
+    "option": "a-",
+    "opt": "a-",
+}
+
+# Named keys prompt_toolkit accepts in ``c-<name>`` / ``a-<name>`` form.
+# Aliases collapse to prompt_toolkit's canonical spelling so the same
+# config value binds identically in both runtimes (Copilot round-10 on
+# #19835).
+_VOICE_NAMED_KEYS = {
+    "space": "space",
+    "spc": "space",
+    "enter": "enter",
+    "return": "enter",
+    "ret": "enter",
+    "tab": "tab",
+    "escape": "escape",
+    "esc": "escape",
+    "backspace": "backspace",
+    "bs": "backspace",
+    "delete": "delete",
+    "del": "delete",
+}
+
+# ``useInputHandlers()`` intercepts these before the voice check runs,
+# so a binding like ``ctrl+c`` (interrupt), ``ctrl+d`` (quit), or
+# ``ctrl+l`` (clear screen) would be advertised in /voice status but
+# never fire push-to-talk — the same blocklist the TUI parser uses.
+_VOICE_RESERVED_CTRL_CHARS = frozenset({"c", "d", "l"})
+
+# On macOS the classic CLI's prompt_toolkit bindings for copy / exit /
+# clear also claim ``a-c`` / ``a-d`` / ``a-l`` via the action-modifier
+# lookup, and hermes-ink reports Alt as ``key.meta`` on many terminals.
+# Mirror the TUI parser's darwin-only reservation so ``option+c`` etc.
+# don't bind Alt+C in the CLI while the TUI silently falls back to
+# Ctrl+B (Copilot round-14 on #19835).
+_VOICE_RESERVED_ALT_CHARS_MAC = frozenset({"c", "d", "l"})
+
+_DEFAULT_PT_KEY = "c-b"
+
+
+def voice_record_key_from_config(cfg: Any) -> Any:
+    """Shape-safe ``cfg.voice.record_key`` lookup.
+
+    ``load_config()`` deep-merges raw YAML and preserves scalar
+    overrides, so a hand-edited ``voice: true`` / ``voice: cmd+b``
+    leaves ``cfg["voice"]`` as a bool/str instead of a dict, and the
+    naive ``.get("voice", {}).get("record_key")`` chain raises
+    AttributeError before voice can even start (Copilot round-11 on
+    #19835). Return ``None`` for malformed shapes so call sites can
+    feed the result straight into the normalizer/formatter and get
+    the documented default.
+    """
+    if not isinstance(cfg, dict):
+        return None
+
+    voice = cfg.get("voice")
+    if not isinstance(voice, dict):
+        return None
+
+    return voice.get("record_key")
+
+
+def normalize_voice_record_key_for_prompt_toolkit(raw: Any) -> str:
+    """Coerce ``voice.record_key`` into prompt_toolkit's ``c-x`` / ``a-x`` format.
+
+    Mirrors the TUI parser contract (``ui-tui/src/lib/platform.ts``)
+    so one config value binds the same shortcut in both runtimes:
+
+    * non-string / empty / typo'd / bare-char / multi-modifier / reserved
+      ``ctrl+c|d|l`` (plus ``alt+c|d|l`` on macOS) → documented default ``c-b``
+    * single-char keys: ``ctrl+o`` → ``c-o``
+    * named keys: ``ctrl+space`` → ``c-space`` (aliases collapse:
+      ``ctrl+return`` → ``c-enter``)
+    * ``super`` / ``win`` / ``windows`` → ``c-b`` (TUI-only modifiers —
+      prompt_toolkit has no super mod; the CLI binding site is
+      expected to warn when this fallback fires so users see the
+      cross-runtime split, Copilot round-11 on #19835)
+    """
+    if not isinstance(raw, str):
+        return _DEFAULT_PT_KEY
+
+    lowered = raw.strip().lower()
+    if not lowered:
+        return _DEFAULT_PT_KEY
+
+    parts = [p.strip() for p in lowered.split("+") if p.strip()]
+    if not parts:
+        return _DEFAULT_PT_KEY
+
+    # Multi-modifier chords like ``ctrl+alt+r`` bind different shortcuts
+    # in prompt_toolkit (a-c-r form) and hermes-ink rejects them; collapse
+    # to the documented default instead of silently diverging.
+    if len(parts) > 2:
+        return _DEFAULT_PT_KEY
+
+    # Bare char / bare named key (no explicit modifier): prompt_toolkit
+    # would bind the raw key without a modifier, which the TUI parser
+    # refuses; reject here too so both runtimes agree.
+    if len(parts) == 1:
+        return _DEFAULT_PT_KEY
+
+    modifier_token, key_token = parts
+
+    # ``super`` / ``win`` / ``windows`` are TUI-only (prompt_toolkit has
+    # no super modifier, so ``@kb.add(super+b)`` crashes the CLI at
+    # startup). Fall back to the documented default here; the CLI
+    # binding site is expected to log a warning when the configured
+    # value is one of these spellings so users know the TUI+CLI
+    # runtimes diverge on that shortcut (Copilot round-11 on #19835).
+    if modifier_token in {"super", "win", "windows"}:
+        return _DEFAULT_PT_KEY
+
+    normalized_mod = _VOICE_MOD_ALIASES.get(modifier_token)
+    if not normalized_mod:
+        return _DEFAULT_PT_KEY
+
+    # Single-char key: reject reserved-ctrl chords that the TUI would
+    # also block at parse time, plus the mac-only alt reservation.
+    if len(key_token) == 1:
+        if normalized_mod == "c-" and key_token in _VOICE_RESERVED_CTRL_CHARS:
+            return _DEFAULT_PT_KEY
+        if (
+            normalized_mod == "a-"
+            and sys.platform == "darwin"
+            and key_token in _VOICE_RESERVED_ALT_CHARS_MAC
+        ):
+            return _DEFAULT_PT_KEY
+        return f"{normalized_mod}{key_token}"
+
+    # Multi-char key token must be a known named key; typos like
+    # ``ctrl+spcae`` fall back to the default rather than being passed
+    # through as ``c-spcae`` (which prompt_toolkit would reject).
+    named = _VOICE_NAMED_KEYS.get(key_token)
+    if not named:
+        return _DEFAULT_PT_KEY
+
+    return f"{normalized_mod}{named}"
+
+
+def format_voice_record_key_for_status(raw: Any) -> str:
+    """Render ``voice.record_key`` for ``/voice status`` in CLI-friendly form.
+
+    Mirrors the TUI's ``formatVoiceRecordKey``: returns ``Ctrl+B`` /
+    ``Alt+Space`` / ``Ctrl+Enter``. Malformed configs surface as the
+    documented default so status never advertises a shortcut that
+    won't bind (Copilot round-10 on #19835).
+    """
+    normalized = normalize_voice_record_key_for_prompt_toolkit(raw)
+
+    if normalized.startswith("c-"):
+        prefix, key = "Ctrl+", normalized[2:]
+    elif normalized.startswith("a-"):
+        prefix, key = "Alt+", normalized[2:]
+    elif "+" in normalized:
+        # Defensive only: the normalizer never returns a ``mod+key`` string
+        # today (``super``/``win`` fall back to ``c-b``), so this is a guard.
+        mod, key = normalized.split("+", 1)
+        prefix = mod[0].upper() + mod[1:] + "+"
+    else:
+        return "Ctrl+B"
+
+    if not key:
+        return prefix.rstrip("+")
+
+    if len(key) == 1:
+        return prefix + key.upper()
+
+    return prefix + key[0].upper() + key[1:]
+
+
 from tools.voice_mode import (
    create_audio_recorder,
    is_whisper_hallucination,
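
Note on the voice record-key hunk above: the contract is easiest to check against concrete inputs. The block below is a condensed sketch of the same rules, not the code from the diff. The names `normalize` and `record_key_from_config`, and the `platform` parameter (added here only so the macOS branch can be exercised on any host), are assumptions made for illustration.

```python
import sys
from typing import Any

MOD_ALIASES = {"ctrl": "c-", "control": "c-", "alt": "a-", "option": "a-", "opt": "a-"}
NAMED_KEYS = {
    "space": "space", "spc": "space",
    "enter": "enter", "return": "enter", "ret": "enter",
    "tab": "tab", "escape": "escape", "esc": "escape",
    "backspace": "backspace", "bs": "backspace",
    "delete": "delete", "del": "delete",
}
RESERVED_CHARS = frozenset({"c", "d", "l"})
DEFAULT_KEY = "c-b"


def record_key_from_config(cfg: Any) -> Any:
    """Shape-safe lookup: any non-dict level yields None."""
    voice = cfg.get("voice") if isinstance(cfg, dict) else None
    return voice.get("record_key") if isinstance(voice, dict) else None


def normalize(raw: Any, platform: str = sys.platform) -> str:
    """Map 'ctrl+o' style strings to 'c-o'; fall back to the default."""
    if not isinstance(raw, str):
        return DEFAULT_KEY
    parts = [p.strip() for p in raw.strip().lower().split("+") if p.strip()]
    if len(parts) != 2:  # empty, bare key, or multi-modifier chord
        return DEFAULT_KEY
    mod = MOD_ALIASES.get(parts[0])
    if mod is None:  # unknown or TUI-only modifier such as super/win
        return DEFAULT_KEY
    key = parts[1]
    if len(key) == 1:
        if mod == "c-" and key in RESERVED_CHARS:
            return DEFAULT_KEY
        if mod == "a-" and platform == "darwin" and key in RESERVED_CHARS:
            return DEFAULT_KEY
        return mod + key
    named = NAMED_KEYS.get(key)
    return mod + named if named else DEFAULT_KEY


for value in ("Ctrl+O", "ctrl+return", "super+b", "ctrl+c", "ctrl+alt+r", "b"):
    print(f"{value!r:14} -> {normalize(value)}")

# A hand-edited ``voice: true`` still resolves to the default binding.
print(normalize(record_key_from_config({"voice": True})))
```

Every rejected shape collapses to the same default, which is what lets the status formatter avoid advertising a shortcut that will not bind.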
@@ -345,6 +345,7 @@ _CATEGORY_MERGE: Dict[str, str] = {
    "dashboard": "display",
    "code_execution": "agent",
    "prompt_caching": "agent",
+    "goals": "agent",
    # Only `telegram.reactions` currently lives under telegram — fold it in
    # with the other messaging-platform config (discord) so it isn't an
    # orphan tab of one field.
@@ -469,10 +470,23 @@ except (ValueError, TypeError):
    )
    _GATEWAY_HEALTH_TIMEOUT = 3.0

+# DEPRECATED (scheduled for removal): GATEWAY_HEALTH_URL / GATEWAY_HEALTH_TIMEOUT.
+# Cross-container / cross-host gateway liveness detection will be folded into a
+# first-class dashboard config key so it's no longer Docker-adjacent lore buried
+# in env vars.  The env vars still work for now so existing Compose deployments
+# don't break.  Do not add new callers — wire new uses through the planned
+# config surface.
+

 def _probe_gateway_health() -> tuple[bool, dict | None]:
    """Probe the gateway via its HTTP health endpoint (cross-container).

+    .. deprecated::
+        Driven by the deprecated ``GATEWAY_HEALTH_URL`` /
+        ``GATEWAY_HEALTH_TIMEOUT`` env vars.  Scheduled for removal alongside
+        a move to a first-class dashboard config key.  See
+        :data:`_GATEWAY_HEALTH_URL` for context.
+
    Uses ``/health/detailed`` first (returns full state), falling back to
    the simpler ``/health`` endpoint.  Returns ``(is_alive, body_dict)``.

@@ -2344,6 +2358,254 @@ async def delete_cron_job(job_id: str):
    return {"ok": True}


+# ---------------------------------------------------------------------------
+# Profile management endpoints (minimal — list/create/rename/delete + SOUL.md)
+# ---------------------------------------------------------------------------
+
+
+class ProfileCreate(BaseModel):
+    name: str
+    clone_from_default: bool = False
+
+
+class ProfileRename(BaseModel):
+    new_name: str
+
+
+class ProfileSoulUpdate(BaseModel):
+    content: str
+
+
+def _profile_attr(info, name: str, default: Any = None) -> Any:
+    try:
+        return getattr(info, name)
+    except Exception:
+        return default
+
+
+def _profile_to_dict(info) -> Dict[str, Any]:
+    return {
+        "name": _profile_attr(info, "name", ""),
+        "path": str(_profile_attr(info, "path", "")),
+        "is_default": bool(_profile_attr(info, "is_default", False)),
+        "model": _profile_attr(info, "model"),
+        "provider": _profile_attr(info, "provider"),
+        "has_env": bool(_profile_attr(info, "has_env", False)),
+        "skill_count": int(_profile_attr(info, "skill_count", 0) or 0),
+    }
+
+
+def _fallback_profile_dicts(profiles_mod) -> List[Dict[str, Any]]:
+    def _safe(callable_, default):
+        try:
+            return callable_()
+        except Exception:
+            return default
+
+    profiles: List[Dict[str, Any]] = []
+    default_home = profiles_mod._get_default_hermes_home()
+    if default_home.is_dir():
+        model, provider = _safe(lambda: profiles_mod._read_config_model(default_home), (None, None))
+        profiles.append({
+            "name": "default",
+            "path": str(default_home),
+            "is_default": True,
+            "model": model,
+            "provider": provider,
+            "has_env": (default_home / ".env").exists(),
+            "skill_count": _safe(lambda: profiles_mod._count_skills(default_home), 0),
+        })
+
+    profiles_root = profiles_mod._get_profiles_root()
+    if profiles_root.is_dir():
+        for entry in sorted(profiles_root.iterdir()):
+            if not entry.is_dir() or not profiles_mod._PROFILE_ID_RE.match(entry.name):
+                continue
+            model, provider = _safe(lambda entry=entry: profiles_mod._read_config_model(entry), (None, None))
+            profiles.append({
+                "name": entry.name,
+                "path": str(entry),
+                "is_default": False,
+                "model": model,
+                "provider": provider,
+                "has_env": (entry / ".env").exists(),
+                "skill_count": _safe(lambda entry=entry: profiles_mod._count_skills(entry), 0),
+            })
+
+    return profiles
+
+
+def _resolve_profile_dir(name: str) -> Path:
+    """Validate ``name`` and resolve to its directory or raise an HTTPException."""
+    from hermes_cli import profiles as profiles_mod
+    try:
+        profiles_mod.validate_profile_name(name)
+    except ValueError as e:
+        raise HTTPException(status_code=400, detail=str(e))
+    if not profiles_mod.profile_exists(name):
+        raise HTTPException(status_code=404, detail=f"Profile '{name}' does not exist.")
+    return profiles_mod.get_profile_dir(name)
+
+
+def _profile_setup_command(name: str) -> str:
+    """Return the shell command used to configure a profile in the CLI."""
+    _resolve_profile_dir(name)
+    return "hermes setup" if name == "default" else f"{name} setup"
+
+
+@app.get("/api/profiles")
+async def list_profiles_endpoint():
+    from hermes_cli import profiles as profiles_mod
+    try:
+        return {"profiles": [_profile_to_dict(p) for p in profiles_mod.list_profiles()]}
+    except Exception:
+        _log.exception("GET /api/profiles failed; falling back to profile directory scan")
+        return {"profiles": _fallback_profile_dicts(profiles_mod)}
+
+
+@app.post("/api/profiles")
+async def create_profile_endpoint(body: ProfileCreate):
+    from hermes_cli import profiles as profiles_mod
+    try:
+        path = profiles_mod.create_profile(
+            name=body.name,
+            clone_from="default" if body.clone_from_default else None,
+            clone_config=body.clone_from_default,
+        )
+        # Match the CLI's profile-create flow: fresh named profiles get the
+        # bundled skills installed. When cloning from default, create_profile()
+        # has already copied the source profile's skills, including any
+        # user-installed skills.
+        if not body.clone_from_default:
+            profiles_mod.seed_profile_skills(path, quiet=True)
+
+        # Match the CLI's profile-create flow: named profiles should get a
+        # wrapper in ~/.local/bin when the alias is safe to create.
+        collision = profiles_mod.check_alias_collision(body.name)
+        if not collision:
+            profiles_mod.create_wrapper_script(body.name)
+    except (ValueError, FileExistsError, FileNotFoundError) as e:
+        raise HTTPException(status_code=400, detail=str(e))
+    except Exception as e:
+        _log.exception("POST /api/profiles failed")
+        raise HTTPException(status_code=500, detail=str(e))
+    return {"ok": True, "name": body.name, "path": str(path)}
+
+
+@app.get("/api/profiles/{name}/setup-command")
+async def get_profile_setup_command(name: str):
+    return {"command": _profile_setup_command(name)}
+
+
+@app.post("/api/profiles/{name}/open-terminal")
+async def open_profile_terminal_endpoint(name: str):
+    try:
+        command = _profile_setup_command(name)
+
+        if sys.platform.startswith("win"):
+            subprocess.Popen(["cmd.exe", "/c", "start", "", command])
+        elif sys.platform == "darwin":
+            escaped = command.replace("\\", "\\\\").replace('"', '\\"')
+            applescript = (
+                'tell application "Terminal"\n'
+                "activate\n"
+                f'do script "{escaped}"\n'
+                "end tell"
+            )
+            subprocess.Popen(["osascript", "-e", applescript])
+        else:
+            terminal_commands = [
+                ("x-terminal-emulator", ["x-terminal-emulator", "-e", "sh", "-lc", command]),
+                ("gnome-terminal", ["gnome-terminal", "--", "sh", "-lc", command]),
+                ("konsole", ["konsole", "-e", "sh", "-lc", command]),
+                ("xfce4-terminal", ["xfce4-terminal", "-e", f"sh -lc '{command}'"]),
+                ("mate-terminal", ["mate-terminal", "-e", f"sh -lc '{command}'"]),
+                ("lxterminal", ["lxterminal", "-e", f"sh -lc '{command}'"]),
+                ("tilix", ["tilix", "-e", "sh", "-lc", command]),
+                ("alacritty", ["alacritty", "-e", "sh", "-lc", command]),
+                ("kitty", ["kitty", "sh", "-lc", command]),
+                ("xterm", ["xterm", "-e", "sh", "-lc", command]),
+            ]
+            for executable, popen_args in terminal_commands:
+                if subprocess.call(
+                    ["which", executable],
+                    stdout=subprocess.DEVNULL,
+                    stderr=subprocess.DEVNULL,
+                ) == 0:
+                    subprocess.Popen(popen_args)
+                    break
+            else:
+                raise HTTPException(
+                    status_code=400,
+                    detail="No supported terminal emulator found",
+                )
+    except FileNotFoundError as e:
+        raise HTTPException(status_code=404, detail=str(e))
+    except ValueError as e:
+        raise HTTPException(status_code=400, detail=str(e))
+    except HTTPException:
+        raise
+    except Exception as e:
+        _log.exception("POST /api/profiles/%s/open-terminal failed", name)
+        raise HTTPException(status_code=500, detail=str(e))
+    return {"ok": True, "command": command}
+
+
+@app.patch("/api/profiles/{name}")
+async def rename_profile_endpoint(name: str, body: ProfileRename):
+    from hermes_cli import profiles as profiles_mod
+    try:
+        path = profiles_mod.rename_profile(name, body.new_name)
+    except FileNotFoundError as e:
+        raise HTTPException(status_code=404, detail=str(e))
+    except (ValueError, FileExistsError) as e:
+        raise HTTPException(status_code=400, detail=str(e))
+    except Exception as e:
+        _log.exception("PATCH /api/profiles/%s failed", name)
+        raise HTTPException(status_code=500, detail=str(e))
+    return {"ok": True, "name": body.new_name, "path": str(path)}
+
+
+@app.delete("/api/profiles/{name}")
+async def delete_profile_endpoint(name: str):
+    """Delete a profile. The dashboard collects the user's confirmation in
+    its own dialog before this request, so we always pass ``yes=True`` to
+    skip the CLI's interactive prompt."""
+    from hermes_cli import profiles as profiles_mod
+    try:
+        path = profiles_mod.delete_profile(name, yes=True)
+    except FileNotFoundError as e:
+        raise HTTPException(status_code=404, detail=str(e))
+    except ValueError as e:
+        raise HTTPException(status_code=400, detail=str(e))
+    except Exception as e:
+        _log.exception("DELETE /api/profiles/%s failed", name)
+        raise HTTPException(status_code=500, detail=str(e))
+    return {"ok": True, "path": str(path)}
+
+
+@app.get("/api/profiles/{name}/soul")
+async def get_profile_soul(name: str):
+    soul_path = _resolve_profile_dir(name) / "SOUL.md"
+    if soul_path.exists():
+        try:
+            return {"content": soul_path.read_text(encoding="utf-8"), "exists": True}
+        except OSError as e:
+            raise HTTPException(status_code=500, detail=f"Could not read SOUL.md: {e}")
+    return {"content": "", "exists": False}
+
+
+@app.put("/api/profiles/{name}/soul")
+async def update_profile_soul(name: str, body: ProfileSoulUpdate):
+    soul_path = _resolve_profile_dir(name) / "SOUL.md"
+    try:
+        soul_path.write_text(body.content, encoding="utf-8")
+    except OSError as e:
+        _log.exception("PUT /api/profiles/%s/soul failed", name)
+        raise HTTPException(status_code=500, detail=f"Could not write SOUL.md: {e}")
+    return {"ok": True}
+
+
 # ---------------------------------------------------------------------------
 # Skills & Tools endpoints
 # ---------------------------------------------------------------------------
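
Note on `open_profile_terminal_endpoint` above: three of the Linux launchers receive the command as a single pre-quoted string (`sh -lc '...'`), which is only safe because profile names are validated before a command is built. The sketch below shows one possible hardening under that reading, using the standard library's `shlex.quote` and `shutil.which`. `build_terminal_argv` and its injectable `which` parameter are hypothetical, and only a subset of the emulators is listed.

```python
import shlex
import shutil
from typing import Callable, List, Optional


def build_terminal_argv(
    command: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[List[str]]:
    """Return argv for the first installed terminal emulator, or None."""
    quoted = shlex.quote(command)  # safe to embed in a single shell string
    candidates = [
        ("x-terminal-emulator", ["x-terminal-emulator", "-e", "sh", "-lc", command]),
        ("gnome-terminal", ["gnome-terminal", "--", "sh", "-lc", command]),
        # These take the whole command line as ONE argument, so quote it.
        ("xfce4-terminal", ["xfce4-terminal", "-e", f"sh -lc {quoted}"]),
        ("xterm", ["xterm", "-e", "sh", "-lc", command]),
    ]
    for executable, argv in candidates:
        if which(executable):
            return argv
    return None


def only_xfce(exe: str) -> Optional[str]:
    # Fake lookup for illustration: pretend only xfce4-terminal is installed.
    return "/usr/bin/xfce4-terminal" if exe == "xfce4-terminal" else None


print(build_terminal_argv("coder setup", which=only_xfce))
```

Passing `which` as a parameter keeps the selection logic testable without spawning processes; the endpoint itself would still hand the result to `subprocess.Popen`.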
@@ -2633,6 +2895,25 @@ _VALID_CHANNEL_RE = re.compile(r"^[A-Za-z0-9._-]{1,128}$")
 # loopback so tests don't need to rewrite request scope.
 _LOOPBACK_HOSTS = frozenset({"127.0.0.1", "::1", "localhost", "testclient"})

+
+def _is_public_bind() -> bool:
+    """True when bound to all-interfaces (operator used --insecure)."""
+    return getattr(app.state, "bound_host", "") in ("0.0.0.0", "::")
+
+
+def _ws_client_is_allowed(ws: "WebSocket") -> bool:
+    """Check if the WebSocket client IP is acceptable.
+
+    Allows loopback (and clients with no reported address) always; allows any
+    IP when bound to all-interfaces (--insecure, guarded by session token auth).
+    """
+    if _is_public_bind():
+        return True
+    client_host = ws.client.host if ws.client else ""
+    if not client_host:
+        return True
+    return client_host in _LOOPBACK_HOSTS
+
 # Per-channel subscriber registry used by /api/pub (PTY-side gateway → dashboard)
 # and /api/events (dashboard → browser sidebar).  Keyed by an opaque channel id
 # the chat tab generates on mount; entries auto-evict when the last subscriber
@@ -2723,8 +3004,7 @@ async def pty_ws(ws: WebSocket) -> None:
        await ws.close(code=4401)
        return

-    client_host = ws.client.host if ws.client else ""
-    if client_host and client_host not in _LOOPBACK_HOSTS:
+    if not _ws_client_is_allowed(ws):
        await ws.close(code=4403)
        return

@@ -2831,8 +3111,7 @@ async def gateway_ws(ws: WebSocket) -> None:
        await ws.close(code=4401)
        return

-    client_host = ws.client.host if ws.client else ""
-    if client_host and client_host not in _LOOPBACK_HOSTS:
+    if not _ws_client_is_allowed(ws):
        await ws.close(code=4403)
        return

@@ -2864,8 +3143,7 @@ async def pub_ws(ws: WebSocket) -> None:
        await ws.close(code=4401)
        return

-    client_host = ws.client.host if ws.client else ""
-    if client_host and client_host not in _LOOPBACK_HOSTS:
+    if not _ws_client_is_allowed(ws):
        await ws.close(code=4403)
        return

@@ -2894,8 +3172,7 @@ async def events_ws(ws: WebSocket) -> None:
        await ws.close(code=4401)
        return

-    client_host = ws.client.host if ws.client else ""
-    if client_host and client_host not in _LOOPBACK_HOSTS:
+    if not _ws_client_is_allowed(ws):
        await ws.close(code=4403)
        return
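
Note on the four WebSocket hunks above: they all swap the inline loopback test for `_ws_client_is_allowed`. The decision table can be sketched as a pure function. `client_is_allowed` and its two string parameters are hypothetical; the real helper reads `app.state.bound_host` and `ws.client`.

```python
LOOPBACK_HOSTS = frozenset({"127.0.0.1", "::1", "localhost", "testclient"})


def client_is_allowed(bound_host: str, client_host: str) -> bool:
    """Mirror of the allow rule: public bind, unknown peer, or loopback."""
    if bound_host in ("0.0.0.0", "::"):
        # Bound to all interfaces: rely on session token auth instead.
        return True
    if not client_host:
        # No peer address reported: same outcome as the pre-refactor check.
        return True
    return client_host in LOOPBACK_HOSTS


for bound, client in [
    ("127.0.0.1", "127.0.0.1"),
    ("127.0.0.1", "10.0.0.5"),
    ("0.0.0.0", "10.0.0.5"),
    ("127.0.0.1", ""),
]:
    print(f"bound={bound!r:12} client={client!r:12} -> {client_is_allowed(bound, client)}")
```

Only the second case, a non-loopback peer on a loopback bind, is rejected, matching the 4403 close in each handler.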

@@ -3369,12 +3646,16 @@ def _get_dashboard_plugins(force_rescan: bool = False) -> list:

 @app.get("/api/dashboard/plugins")
 async def get_dashboard_plugins():
-    """Return discovered dashboard plugins."""
+    """Return discovered dashboard plugins (excludes user-hidden ones)."""
    plugins = _get_dashboard_plugins()
-    # Strip internal fields before sending to frontend.
+    # Read user's hidden plugins list from config.
+    config = load_config()
+    hidden: list = cfg_get(config, "dashboard", "hidden_plugins", default=[]) or []
+    # Strip internal fields before sending to frontend and filter out hidden.
    return [
        {k: v for k, v in p.items() if not k.startswith("_")}
        for p in plugins
+        if p["name"] not in hidden
    ]


@@ -3385,6 +3666,268 @@ async def rescan_dashboard_plugins():
    return {"ok": True, "count": len(plugins)}


+class _AgentPluginInstallBody(BaseModel):
+    identifier: str
+    force: bool = False
+    enable: bool = True
+
+
+def _strip_dashboard_manifest(p: Dict[str, Any]) -> Dict[str, Any]:
+    return {k: v for k, v in p.items() if not k.startswith("_")}
+
+
+def _merged_plugins_hub() -> Dict[str, Any]:
+    """Agent discovery + dashboard manifests + optional provider picker metadata."""
+    from hermes_cli.plugins_cmd import (
+        _discover_all_plugins,
+        _get_current_context_engine,
+        _get_current_memory_provider,
+        _discover_context_engines,
+        _discover_memory_providers,
+        _get_disabled_set,
+        _get_enabled_set,
+        _read_manifest as _read_plugin_manifest_at,
+    )
+
+    dashboard_list = _get_dashboard_plugins()
+    dash_by_name = {str(p["name"]): p for p in dashboard_list}
+
+    disabled_set = _get_disabled_set()
+    enabled_set = _get_enabled_set()
+
+    # Read user-hidden plugins from config for the user_hidden field.
+    config = load_config()
+    hidden_plugins: list = cfg_get(config, "dashboard", "hidden_plugins", default=[]) or []
+
+    plugins_root_resolved = (get_hermes_home() / "plugins").resolve()
+    rows: List[Dict[str, Any]] = []
+
+    for name, version, description, source, dir_str in _discover_all_plugins():
+        if name in disabled_set:
+            runtime_status = "disabled"
+        elif name in enabled_set:
+            runtime_status = "enabled"
+        else:
+            runtime_status = "inactive"
+
+        dir_path = Path(dir_str)
+        dm = dash_by_name.get(name)
+        has_dash_manifest = dm is not None or (dir_path / "dashboard" / "manifest.json").exists()
+
+        under_user_tree = False
+        try:
+            dir_path.resolve().relative_to(plugins_root_resolved)
+            under_user_tree = True
+        except ValueError:
+            pass
+
+        can_remove_update = (
+            source in ("user", "git") and under_user_tree and dir_path.is_dir()
+        )
+
+        # Check if this plugin provides tools that require auth
+        auth_required = False
+        auth_command = ""
+        manifest_data = _read_plugin_manifest_at(dir_path)
+        provides_tools = manifest_data.get("provides_tools") or []
+        if provides_tools:
+            try:
+                from tools.registry import registry
+                for tname in provides_tools:
+                    entry = registry.get_entry(tname)
+                    if entry and entry.check_fn and not entry.check_fn():
+                        auth_required = True
+                        auth_command = f"hermes auth {name}"
+                        break
+            except Exception:
+                pass
+
+        rows.append({
+            "name": name,
+            "version": version or "",
+            "description": description or "",
+            "source": source,
+            "runtime_status": runtime_status,
+            "has_dashboard_manifest": has_dash_manifest,
+            "dashboard_manifest": _strip_dashboard_manifest(dm) if dm else None,
+            "path": dir_str,
+            "can_remove": can_remove_update,
+            "can_update_git": can_remove_update and (dir_path / ".git").exists(),
+            "auth_required": auth_required,
+            "auth_command": auth_command,
+            "user_hidden": name in hidden_plugins,
+        })
+
+    agent_names = {r["name"] for r in rows}
+    orphan_dashboard = [
+        _strip_dashboard_manifest(p)
+        for p in dashboard_list
+        if str(p["name"]) not in agent_names
+    ]
+
+    memory_providers: List[Dict[str, str]] = []
+    try:
+        for n, desc in _discover_memory_providers():
+            memory_providers.append({"name": n, "description": desc})
+    except Exception:
+        memory_providers = []
+
+    context_engines: List[Dict[str, str]] = []
+    try:
+        for n, desc in _discover_context_engines():
+            context_engines.append({"name": n, "description": desc})
+    except Exception:
+        context_engines = []
+
+    return {
+        "plugins": rows,
+        "orphan_dashboard_plugins": orphan_dashboard,
+        "providers": {
+            "memory_provider": _get_current_memory_provider() or "",
+            "memory_options": memory_providers,
+            "context_engine": _get_current_context_engine(),
+            "context_options": context_engines,
+        },
+    }
+
+
+@app.get("/api/dashboard/plugins/hub")
+async def get_plugins_hub(request: Request):
+    """Unified agent plugins + dashboard extension metadata (session protected)."""
+    _require_token(request)
+    try:
+        return _merged_plugins_hub()
+    except Exception as exc:
+        _log.warning("plugins/hub failed: %s", exc)
+        raise HTTPException(status_code=500, detail="Failed to build plugins hub.") from exc
+
+
+@app.post("/api/dashboard/agent-plugins/install")
+async def post_agent_plugin_install(request: Request, body: _AgentPluginInstallBody):
+    _require_token(request)
+    from hermes_cli.plugins_cmd import dashboard_install_plugin
+
+    result = dashboard_install_plugin(
+        body.identifier.strip(),
+        force=body.force,
+        enable=body.enable,
+    )
+    if not result.get("ok"):
+        raise HTTPException(
+            status_code=400,
+            detail=result.get("error") or "Install failed.",
+        )
+    _get_dashboard_plugins(force_rescan=True)
+    # Strip internal paths from the response
+    result.pop("after_install_path", None)
+    return result
+
+
+def _validate_plugin_name(name: str) -> str:
+    """Reject path-traversal attempts in plugin name URL parameters."""
+    if not name or "/" in name or "\\" in name or ".." in name:
+        raise HTTPException(status_code=400, detail="Invalid plugin name.")
+    return name
+
+
+@app.post("/api/dashboard/agent-plugins/{name}/enable")
+async def post_agent_plugin_enable(request: Request, name: str):
+    _require_token(request)
+    name = _validate_plugin_name(name)
+    from hermes_cli.plugins_cmd import dashboard_set_agent_plugin_enabled
+
+    result = dashboard_set_agent_plugin_enabled(name, enabled=True)
+    if not result.get("ok"):
+        raise HTTPException(status_code=400, detail=result.get("error") or "Enable failed.")
+    return result
+
+
+@app.post("/api/dashboard/agent-plugins/{name}/disable")
+async def post_agent_plugin_disable(request: Request, name: str):
+    _require_token(request)
+    name = _validate_plugin_name(name)
+    from hermes_cli.plugins_cmd import dashboard_set_agent_plugin_enabled
+
+    result = dashboard_set_agent_plugin_enabled(name, enabled=False)
+    if not result.get("ok"):
+        raise HTTPException(status_code=400, detail=result.get("error") or "Disable failed.")
+    return result
+
+
+@app.post("/api/dashboard/agent-plugins/{name}/update")
+async def post_agent_plugin_update(request: Request, name: str):
+    _require_token(request)
+    name = _validate_plugin_name(name)
+    from hermes_cli.plugins_cmd import dashboard_update_user_plugin
+
+    result = dashboard_update_user_plugin(name)
+    if not result.get("ok"):
+        raise HTTPException(status_code=400, detail=result.get("error") or "Update failed.")
+    _get_dashboard_plugins(force_rescan=True)
+    return result
+
+
+@app.delete("/api/dashboard/agent-plugins/{name}")
+async def delete_agent_plugin(request: Request, name: str):
+    _require_token(request)
+    name = _validate_plugin_name(name)
+    from hermes_cli.plugins_cmd import dashboard_remove_user_plugin
+
+    result = dashboard_remove_user_plugin(name)
+    if not result.get("ok"):
+        raise HTTPException(status_code=400, detail=result.get("error") or "Remove failed.")
+    _get_dashboard_plugins(force_rescan=True)
+    return result
+
+
+class _PluginProvidersPutBody(BaseModel):
+    memory_provider: Optional[str] = None
+    context_engine: Optional[str] = None
+
+
+@app.put("/api/dashboard/plugin-providers")
+async def put_plugin_providers(request: Request, body: _PluginProvidersPutBody):
+    """Persist memory provider / context engine selection (writes config.yaml)."""
+    _require_token(request)
+    from hermes_cli.plugins_cmd import (
+        _save_context_engine,
+        _save_memory_provider,
+    )
+
+    if body.memory_provider is not None:
+        _save_memory_provider(body.memory_provider)
+    if body.context_engine is not None:
+        _save_context_engine(body.context_engine)
+    return {"ok": True}
+
+
+class _PluginVisibilityBody(BaseModel):
+    hidden: bool
+
+
+@app.post("/api/dashboard/plugins/{name}/visibility")
+async def post_plugin_visibility(request: Request, name: str, body: _PluginVisibilityBody):
+    """Toggle a plugin's sidebar visibility (persists to config.yaml dashboard.hidden_plugins)."""
+    _require_token(request)
+    name = _validate_plugin_name(name)
+
+    config = load_config()
+    if "dashboard" not in config or not isinstance(config.get("dashboard"), dict):
+        config["dashboard"] = {}
+    hidden_list: list = config["dashboard"].get("hidden_plugins") or []
+    if not isinstance(hidden_list, list):
+        hidden_list = []
+
+    if body.hidden and name not in hidden_list:
+        hidden_list.append(name)
+    elif not body.hidden and name in hidden_list:
+        hidden_list.remove(name)
+
+    config["dashboard"]["hidden_plugins"] = hidden_list
+    save_config(config)
+    return {"ok": True, "name": name, "hidden": body.hidden}
+
+
@app.get("/dashboard-plugins/{plugin_name}/{file_path:path}")
 async def serve_plugin_asset(plugin_name: str, file_path: str):
    """Serve static assets from a dashboard plugin directory.
@@ -8,14 +8,64 @@ import os
 from pathlib import Path


+_profile_fallback_warned: bool = False
+
+
 def get_hermes_home() -> Path:
    """Return the Hermes home directory (default: ~/.hermes).

    Reads HERMES_HOME env var, falls back to ~/.hermes.
    This is the single source of truth — all other copies should import this.
+
+    When ``HERMES_HOME`` is unset but an ``active_profile`` file indicates
+    a non-default profile is active, writes a loud one-shot warning to
+    stderr so cross-profile data corruption is diagnosable instead
+    of silent.  Behavior is unchanged otherwise — we still return
+    ``~/.hermes`` — because raising here would brick 30+ module-level
+    callers that import this at load time.  Subprocess spawners are
+    expected to propagate ``HERMES_HOME`` explicitly (see the systemd
+    template in ``hermes_cli/gateway.py`` and the kanban dispatcher in
+    ``hermes_cli/kanban_db.py``).  See https://github.com/NousResearch/hermes-agent/issues/18594.
    """
    val = os.environ.get("HERMES_HOME", "").strip()
-    return Path(val) if val else Path.home() / ".hermes"
+    if val:
+        return Path(val)
+
+    # Guard: if a non-default profile is sticky-active, warn once that
+    # the fallback to the default profile is almost certainly wrong.
+    global _profile_fallback_warned
+    if not _profile_fallback_warned:
+        try:
+            # Inline the default-root resolution from get_default_hermes_root()
+            # to stay import-safe (this function is called from module scope
+            # in 30+ files; we cannot afford to trigger logging setup here).
+            active_path = (Path.home() / ".hermes" / "active_profile")
+            active = active_path.read_text().strip() if active_path.exists() else ""
+        except (UnicodeDecodeError, OSError):
+            active = ""
+        if active and active != "default":
+            _profile_fallback_warned = True
+            # Write directly to stderr.  We intentionally do NOT route this
+            # through ``logging`` because (a) this function is called at
+            # module-import time from 30+ sites, often before logging is
+            # configured, and (b) root-logger propagation would double-emit
+            # on consoles where a StreamHandler is already attached.
+            import sys
+            msg = (
+                f"[HERMES_HOME fallback] HERMES_HOME is unset but active "
+                f"profile is {active!r}. Falling back to ~/.hermes, which "
+                f"is the DEFAULT profile — not {active!r}. Any data this "
+                f"process writes will land in the wrong profile. The "
+                f"subprocess spawner should pass HERMES_HOME explicitly "
+                f"(see issue #18594)."
+            )
+            try:
+                sys.stderr.write(msg + "\n")
+                sys.stderr.flush()
+            except Exception:
+                pass
+
+    return Path.home() / ".hermes"


 def get_default_hermes_root() -> Path:
@@ -514,7 +514,7 @@ class SessionDB:
    # Session lifecycle
    # =========================================================================

-    def create_session(
+    def _insert_session_row(
        self,
        session_id: str,
        source: str,
@@ -523,8 +523,8 @@ class SessionDB:
        system_prompt: str = None,
        user_id: str = None,
        parent_session_id: str = None,
-    ) -> str:
-        """Create a new session record. Returns the session_id."""
+    ) -> None:
+        """Shared INSERT OR IGNORE for session rows."""
        def _do(conn):
            conn.execute(
                """INSERT OR IGNORE INTO sessions (id, source, user_id, model, model_config,
@@ -542,8 +542,11 @@ class SessionDB:
                ),
            )
        self._execute_write(_do)
-        return session_id

+    def create_session(self, session_id: str, source: str, **kwargs) -> str:
+        """Create a new session record. Returns the session_id."""
+        self._insert_session_row(session_id, source, **kwargs)
+        return session_id
    def end_session(self, session_id: str, end_reason: str) -> None:
        """Mark a session as ended.

@@ -679,21 +682,41 @@ class SessionDB:
        session_id: str,
        source: str = "unknown",
        model: str = None,
-    ) -> None:
-        """Ensure a session row exists, creating it with minimal metadata if absent.
+        **kwargs,
+    ) -> str:
+        """Ensure a session row exists (INSERT OR IGNORE); extra kwargs are forwarded to the insert. Returns the session_id."""
+        self._insert_session_row(session_id, source, model=model, **kwargs)
+        return session_id
+
+    def prune_empty_ghost_sessions(self, sessions_dir: "Optional[Path]" = None) -> int:
+        """Remove ended, empty TUI ghost sessions (no messages, no title, started >24h ago). Returns the number removed."""
+        cutoff = time.time() - 86400  # Only sessions older than 24 hours

-        Used by _flush_messages_to_session_db to recover from a failed
-        create_session() call (e.g. transient SQLite lock at agent startup).
-        INSERT OR IGNORE is safe to call even when the row already exists.
-        """
        def _do(conn):
-            conn.execute(
-                """INSERT OR IGNORE INTO sessions
-                   (id, source, model, started_at)
-                   VALUES (?, ?, ?, ?)""",
-                (session_id, source, model, time.time()),
-            )
-        self._execute_write(_do)
+            rows = conn.execute("""
+                SELECT id FROM sessions
+                WHERE source = 'tui'
+                  AND title IS NULL
+                  AND ended_at IS NOT NULL
+                  AND started_at < ?
+                  AND NOT EXISTS (
+                      SELECT 1 FROM messages WHERE messages.session_id = sessions.id
+                  )
+            """, (cutoff,)).fetchall()
+            ids = [r[0] if isinstance(r, (tuple, list)) else r["id"] for r in rows]
+            if ids:
+                placeholders = ",".join("?" * len(ids))
+                conn.execute(
+                    f"DELETE FROM sessions WHERE id IN ({placeholders})", ids
+                )
+            return ids
+
+        removed_ids = self._execute_write(_do) or []
+        # Clean up any on-disk session files (belt-and-suspenders)
+        if sessions_dir and removed_ids:
+            for sid in removed_ids:
+                self._remove_session_files(sessions_dir, sid)
+        return len(removed_ids)

    def get_session(self, session_id: str) -> Optional[Dict[str, Any]]:
        """Get a session by ID."""
@@ -933,6 +956,7 @@ class SessionDB:
        offset: int = 0,
        include_children: bool = False,
        project_compression_tips: bool = True,
+        order_by_last_active: bool = False,
    ) -> List[Dict[str, Any]]:
        """List sessions with preview (first user message) and last active timestamp.

@@ -952,6 +976,14 @@ class SessionDB:
        compressed continuations from being invisible to users while keeping
        delegate subagents and branches hidden. Pass ``False`` to return the
        raw root rows (useful for admin/debug UIs).
+
+        Pass ``order_by_last_active=True`` to sort by most-recent activity
+        instead of original conversation start time. For compression chains,
+        the "most-recent activity" is taken from the live tip (not the root),
+        so an old conversation that was compressed and continued recently
+        surfaces in the correct slot. Ordering is computed at SQL level via
+        a recursive CTE that walks compression-continuation edges, so LIMIT
+        and OFFSET still apply efficiently.
        """
        where_clauses = []
        params = []
@@ -979,25 +1011,80 @@ class SessionDB:
            params.extend(exclude_sources)

        where_sql = f"WHERE {' AND '.join(where_clauses)}" if where_clauses else ""
-        query = f"""
-            SELECT s.*,
-                COALESCE(
-                    (SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
-                     FROM messages m
-                     WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
-                     ORDER BY m.timestamp, m.id LIMIT 1),
-                    ''
-                ) AS _preview_raw,
-                COALESCE(
-                    (SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
-                    s.started_at
-                ) AS last_active
-            FROM sessions s
-            {where_sql}
-            ORDER BY s.started_at DESC
-            LIMIT ? OFFSET ?
-        """
-        params.extend([limit, offset])
+        if order_by_last_active:
+            # Compute effective_last_active by walking each surfaced session's
+            # compression-continuation chain forward in SQL and taking the MAX
+            # timestamp across the chain. This lets us ORDER BY + LIMIT at SQL
+            # level instead of fetching every row and sorting in Python, while
+            # still surfacing old compression roots whose live tip is fresh.
+            #
+            # The CTE seeds from rows the outer WHERE admits (roots + branch
+            # children), then recursively joins forward through
+            # compression-continuation edges using the same criteria as
+            # get_compression_tip (parent.end_reason='compression' AND
+            # child.started_at >= parent.ended_at).
+            query = f"""
+                WITH RECURSIVE chain(root_id, cur_id) AS (
+                    SELECT s.id, s.id FROM sessions s {where_sql}
+                    UNION ALL
+                    SELECT c.root_id, child.id
+                    FROM chain c
+                    JOIN sessions parent ON parent.id = c.cur_id
+                    JOIN sessions child ON child.parent_session_id = c.cur_id
+                    WHERE parent.end_reason = 'compression'
+                      AND child.started_at >= parent.ended_at
+                ),
+                chain_max AS (
+                    SELECT
+                        root_id,
+                        MAX(COALESCE(
+                            (SELECT MAX(m.timestamp) FROM messages m WHERE m.session_id = cur_id),
+                            (SELECT started_at FROM sessions ss WHERE ss.id = cur_id)
+                        )) AS effective_last_active
+                    FROM chain
+                    GROUP BY root_id
+                )
+                SELECT s.*,
+                    COALESCE(
+                        (SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
+                         FROM messages m
+                         WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
+                         ORDER BY m.timestamp, m.id LIMIT 1),
+                        ''
+                    ) AS _preview_raw,
+                    COALESCE(
+                        (SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
+                        s.started_at
+                    ) AS last_active,
+                    COALESCE(cm.effective_last_active, s.started_at) AS _effective_last_active
+                FROM sessions s
+                LEFT JOIN chain_max cm ON cm.root_id = s.id
+                {where_sql}
+                ORDER BY _effective_last_active DESC, s.started_at DESC, s.id DESC
+                LIMIT ? OFFSET ?
+            """
+            # WHERE params apply twice (CTE seed + outer select).
+            params = params + params + [limit, offset]
+        else:
+            query = f"""
+                SELECT s.*,
+                    COALESCE(
+                        (SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
+                         FROM messages m
+                         WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
+                         ORDER BY m.timestamp, m.id LIMIT 1),
+                        ''
+                    ) AS _preview_raw,
+                    COALESCE(
+                        (SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
+                        s.started_at
+                    ) AS last_active
+                FROM sessions s
+                {where_sql}
+                ORDER BY s.started_at DESC
+                LIMIT ? OFFSET ?
+            """
+            params.extend([limit, offset])
        with self._lock:
            cursor = self._conn.execute(query, params)
            rows = cursor.fetchall()
@@ -1011,6 +1098,8 @@ class SessionDB:
                s["preview"] = text + ("..." if len(raw) > 60 else "")
            else:
                s["preview"] = ""
+            # Drop the internal ordering column so callers see a clean dict.
+            s.pop("_effective_last_active", None)
            sessions.append(s)

        # Project compression roots forward to their tips. Each row whose
@@ -1088,6 +1177,48 @@ class SessionDB:
    # Message storage
    # =========================================================================

+    # Sentinel prefix used to distinguish JSON-encoded structured content
+    # (multimodal messages: lists of parts like text + image_url) from plain
+    # string content. A leading NUL byte is very unlikely in normal text, so
+    # a collision with real user content is improbable (though not impossible).
+    _CONTENT_JSON_PREFIX = "\x00json:"
+
+    @classmethod
+    def _encode_content(cls, content: Any) -> Any:
+        """Serialize structured (list/dict) message content for sqlite.
+
+        sqlite3 can only bind ``str``, ``bytes``, ``int``, ``float``, and ``None``
+        to query parameters. Multimodal messages have ``content`` as a list of
+        parts (``[{"type": "text", ...}, {"type": "image_url", ...}]``), which
+        raises ``ProgrammingError: Error binding parameter N: type 'list' is
+        not supported`` when bound directly.
+
+        Returns the value unchanged when it's already a safe scalar, or a
+        sentinel-prefixed JSON string for lists/dicts. Paired with
+        :meth:`_decode_content` on read.
+        """
+        if content is None or isinstance(content, (str, bytes, int, float)):
+            return content
+        try:
+            return cls._CONTENT_JSON_PREFIX + json.dumps(content)
+        except (TypeError, ValueError):
+            # Last-resort fallback: stringify so persistence never fails.
+            return str(content)
+
+    @classmethod
+    def _decode_content(cls, content: Any) -> Any:
+        """Reverse :meth:`_encode_content`; returns scalars unchanged."""
+        if isinstance(content, str) and content.startswith(cls._CONTENT_JSON_PREFIX):
+            try:
+                return json.loads(content[len(cls._CONTENT_JSON_PREFIX):])
+            except (json.JSONDecodeError, TypeError):
+                logger.warning(
+                    "Failed to decode JSON-encoded message content; "
+                    "returning raw string"
+                )
+                return content
+        return content
+
    def append_message(
        self,
        session_id: str,
@@ -1124,6 +1255,9 @@ class SessionDB:
            if codex_message_items else None
        )
        tool_calls_json = json.dumps(tool_calls) if tool_calls else None
+        # Multimodal content (list of parts) must be JSON-encoded: sqlite3
+        # cannot bind list/dict parameters directly.
+        stored_content = self._encode_content(content)

        # Pre-compute tool call count
        num_tool_calls = 0
@@ -1140,7 +1274,7 @@ class SessionDB:
                (
                    session_id,
                    role,
-                    content,
+                    stored_content,
                    tool_call_id,
                    tool_calls_json,
                    tool_name,
@@ -1223,7 +1357,7 @@ class SessionDB:
                    (
                        session_id,
                        role,
-                        msg.get("content"),
+                        self._encode_content(msg.get("content")),
                        msg.get("tool_call_id"),
                        tool_calls_json,
                        msg.get("tool_name"),
@@ -1262,6 +1396,8 @@ class SessionDB:
        result = []
        for row in rows:
            msg = dict(row)
+            if "content" in msg:
+                msg["content"] = self._decode_content(msg["content"])
            if msg.get("tool_calls"):
                try:
                    msg["tool_calls"] = json.loads(msg["tool_calls"])
@@ -1351,15 +1487,15 @@ class SessionDB:
            placeholders = ",".join("?" for _ in session_ids)
            rows = self._conn.execute(
                "SELECT role, content, tool_call_id, tool_calls, tool_name, "
-                "reasoning, reasoning_content, reasoning_details, codex_reasoning_items, "
-                "codex_message_items "
+                "finish_reason, reasoning, reasoning_content, reasoning_details, "
+                "codex_reasoning_items, codex_message_items "
                f"FROM messages WHERE session_id IN ({placeholders}) ORDER BY timestamp, id",
                tuple(session_ids),
            ).fetchall()

        messages = []
        for row in rows:
-            content = row["content"]
+            content = self._decode_content(row["content"])
            if row["role"] in {"user", "assistant"} and isinstance(content, str):
                content = sanitize_context(content).strip()
            msg = {"role": row["role"], "content": content}
@@ -1377,6 +1513,8 @@ class SessionDB:
            # that replay reasoning (OpenRouter, OpenAI, Nous) receive
            # coherent multi-turn reasoning context.
            if row["role"] == "assistant":
+                if row["finish_reason"]:
+                    msg["finish_reason"] = row["finish_reason"]
                if row["reasoning"]:
                    msg["reasoning"] = row["reasoning"]
                if row["reasoning_content"] is not None:
@@ -1744,10 +1882,26 @@ class SessionDB:
                           )""",
                        (match["id"], match["id"]),
                    )
-                    context_msgs = [
-                        {"role": r["role"], "content": (r["content"] or "")[:200]}
-                        for r in ctx_cursor.fetchall()
-                    ]
+                    context_msgs = []
+                    for r in ctx_cursor.fetchall():
+                        raw = r["content"]
+                        decoded = self._decode_content(raw)
+                        # Multimodal context: render a compact text-only
+                        # summary for search previews.
+                        if isinstance(decoded, list):
+                            text_parts = [
+                                p.get("text", "") for p in decoded
+                                if isinstance(p, dict) and p.get("type") == "text"
+                            ]
+                            text = " ".join(t for t in text_parts if t).strip()
+                            preview = text or "[multimodal content]"
+                        elif isinstance(decoded, str):
+                            preview = decoded
+                        else:
+                            preview = ""
+                        context_msgs.append(
+                            {"role": r["role"], "content": preview[:200]}
+                        )
                match["context"] = context_msgs
            except Exception:
                match["context"] = []
@@ -1994,6 +2148,388 @@ class SessionDB:
            )
        self._execute_write(_do)

+    def apply_telegram_topic_migration(self) -> None:
+        """Create Telegram DM topic-mode tables on explicit /topic opt-in.
+
+        This migration is deliberately not part of automatic SessionDB startup
+        reconciliation. Operators must be able to upgrade Hermes, keep the old
+        Telegram bot behavior running, and only mutate topic-mode state when the
+        user executes /topic to opt into the feature.
+
+        Schema versions:
+          v1 — initial shape (no ON DELETE CASCADE on session_id FK)
+          v2 — session_id FK gets ON DELETE CASCADE so session pruning
+               automatically clears bindings.
+        """
+        def _do(conn):
+            conn.executescript(
+                """
+                CREATE TABLE IF NOT EXISTS telegram_dm_topic_mode (
+                    chat_id TEXT PRIMARY KEY,
+                    user_id TEXT NOT NULL,
+                    enabled INTEGER NOT NULL DEFAULT 1,
+                    activated_at REAL NOT NULL,
+                    updated_at REAL NOT NULL,
+                    has_topics_enabled INTEGER,
+                    allows_users_to_create_topics INTEGER,
+                    capability_checked_at REAL,
+                    intro_message_id TEXT,
+                    pinned_message_id TEXT
+                );
+
+                CREATE TABLE IF NOT EXISTS telegram_dm_topic_bindings (
+                    chat_id TEXT NOT NULL,
+                    thread_id TEXT NOT NULL,
+                    user_id TEXT NOT NULL,
+                    session_key TEXT NOT NULL,
+                    session_id TEXT NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
+                    managed_mode TEXT NOT NULL DEFAULT 'auto',
+                    linked_at REAL NOT NULL,
+                    updated_at REAL NOT NULL,
+                    PRIMARY KEY (chat_id, thread_id)
+                );
+
+                CREATE UNIQUE INDEX IF NOT EXISTS idx_telegram_dm_topic_bindings_session
+                ON telegram_dm_topic_bindings(session_id);
+
+                CREATE INDEX IF NOT EXISTS idx_telegram_dm_topic_bindings_user
+                ON telegram_dm_topic_bindings(user_id, chat_id);
+                """
+            )
+
+            # v1 → v2: rebuild telegram_dm_topic_bindings if its session_id FK
+            # lacks ON DELETE CASCADE. SQLite can't ALTER a foreign key, so we
+            # rebuild the table. Only runs once per DB (version gate).
+            current = conn.execute(
+                "SELECT value FROM state_meta WHERE key = ?",
+                ("telegram_dm_topic_schema_version",),
+            ).fetchone()
+            current_version = int(current[0]) if current and str(current[0]).isdigit() else 0
+            if current_version < 2:
+                fk_rows = conn.execute(
+                    "PRAGMA foreign_key_list('telegram_dm_topic_bindings')"
+                ).fetchall()
+                needs_rebuild = any(
+                    row[2] == "sessions" and (row[6] or "") != "CASCADE"
+                    for row in fk_rows
+                )
+                if needs_rebuild:
+                    conn.executescript(
+                        """
+                        CREATE TABLE telegram_dm_topic_bindings_new (
+                            chat_id TEXT NOT NULL,
+                            thread_id TEXT NOT NULL,
+                            user_id TEXT NOT NULL,
+                            session_key TEXT NOT NULL,
+                            session_id TEXT NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
+                            managed_mode TEXT NOT NULL DEFAULT 'auto',
+                            linked_at REAL NOT NULL,
+                            updated_at REAL NOT NULL,
+                            PRIMARY KEY (chat_id, thread_id)
+                        );
+                        INSERT INTO telegram_dm_topic_bindings_new
+                            SELECT chat_id, thread_id, user_id, session_key,
+                                   session_id, managed_mode, linked_at, updated_at
+                            FROM telegram_dm_topic_bindings;
+                        DROP TABLE telegram_dm_topic_bindings;
+                        ALTER TABLE telegram_dm_topic_bindings_new
+                            RENAME TO telegram_dm_topic_bindings;
+                        CREATE UNIQUE INDEX idx_telegram_dm_topic_bindings_session
+                            ON telegram_dm_topic_bindings(session_id);
+                        CREATE INDEX idx_telegram_dm_topic_bindings_user
+                            ON telegram_dm_topic_bindings(user_id, chat_id);
+                        """
+                    )
+
+            conn.execute(
+                "INSERT INTO state_meta (key, value) VALUES (?, ?) "
+                "ON CONFLICT(key) DO UPDATE SET value = excluded.value",
+                ("telegram_dm_topic_schema_version", "2"),
+            )
+        self._execute_write(_do)
+
+    def enable_telegram_topic_mode(
+        self,
+        *,
+        chat_id: str,
+        user_id: str,
+        has_topics_enabled: Optional[bool] = None,
+        allows_users_to_create_topics: Optional[bool] = None,
+    ) -> None:
+        """Enable Telegram DM topic mode for one private chat/user.
+
+        This method intentionally owns the explicit topic migration. Ordinary
+        SessionDB startup must not create these side tables.
+        """
+        self.apply_telegram_topic_migration()
+        now = time.time()
+
+        def _to_int(value: Optional[bool]) -> Optional[int]:
+            if value is None:
+                return None
+            return 1 if value else 0
+
+        def _do(conn):
+            conn.execute(
+                """
+                INSERT INTO telegram_dm_topic_mode (
+                    chat_id, user_id, enabled, activated_at, updated_at,
+                    has_topics_enabled, allows_users_to_create_topics,
+                    capability_checked_at
+                ) VALUES (?, ?, 1, ?, ?, ?, ?, ?)
+                ON CONFLICT(chat_id) DO UPDATE SET
+                    user_id = excluded.user_id,
+                    enabled = 1,
+                    updated_at = excluded.updated_at,
+                    has_topics_enabled = excluded.has_topics_enabled,
+                    allows_users_to_create_topics = excluded.allows_users_to_create_topics,
+                    capability_checked_at = excluded.capability_checked_at
+                """,
+                (
+                    str(chat_id),
+                    str(user_id),
+                    now,
+                    now,
+                    _to_int(has_topics_enabled),
+                    _to_int(allows_users_to_create_topics),
+                    now,
+                ),
+            )
+        self._execute_write(_do)
+
+    def disable_telegram_topic_mode(
+        self,
+        *,
+        chat_id: str,
+        clear_bindings: bool = True,
+    ) -> None:
+        """Disable Telegram DM topic mode for one private chat.
+
+        When ``clear_bindings`` is True (default) the (chat_id, thread_id)
+        bindings for this chat are also cleared so re-enabling later
+        starts from a clean slate. Set to False if the operator wants to
+        preserve bindings for a later re-enable.
+
+        Never creates the topic-mode tables from scratch; if they don't
+        exist there is nothing to disable and the call is a no-op.
+        """
+        def _do(conn):
+            try:
+                conn.execute(
+                    "UPDATE telegram_dm_topic_mode SET enabled = 0, updated_at = ? "
+                    "WHERE chat_id = ?",
+                    (time.time(), str(chat_id)),
+                )
+                if clear_bindings:
+                    conn.execute(
+                        "DELETE FROM telegram_dm_topic_bindings WHERE chat_id = ?",
+                        (str(chat_id),),
+                    )
+            except sqlite3.OperationalError:
+                # Tables don't exist yet — nothing to disable.
+                return
+        self._execute_write(_do)
+
+    def is_telegram_topic_mode_enabled(self, *, chat_id: str, user_id: str) -> bool:
+        """Return whether Telegram DM topic mode is enabled for this chat/user."""
+        with self._lock:
+            try:
+                row = self._conn.execute(
+                    """
+                    SELECT enabled FROM telegram_dm_topic_mode
+                    WHERE chat_id = ? AND user_id = ?
+                    """,
+                    (str(chat_id), str(user_id)),
+                ).fetchone()
+            except sqlite3.OperationalError:
+                return False
+        if row is None:
+            return False
+        enabled = row["enabled"] if isinstance(row, sqlite3.Row) else row[0]
+        return bool(enabled)
+
+    def get_telegram_topic_binding(
+        self,
+        *,
+        chat_id: str,
+        thread_id: str,
+    ) -> Optional[Dict[str, Any]]:
+        """Return the session binding for a Telegram DM topic, if present."""
+        with self._lock:
+            try:
+                row = self._conn.execute(
+                    """
+                    SELECT * FROM telegram_dm_topic_bindings
+                    WHERE chat_id = ? AND thread_id = ?
+                    """,
+                    (str(chat_id), str(thread_id)),
+                ).fetchone()
+            except sqlite3.OperationalError:
+                return None
+        return dict(row) if row else None
+
+    def bind_telegram_topic(
+        self,
+        *,
+        chat_id: str,
+        thread_id: str,
+        user_id: str,
+        session_key: str,
+        session_id: str,
+        managed_mode: str = "auto",
+    ) -> None:
+        """Bind one Telegram DM topic thread to one Hermes session.
+
+        A Hermes session may only be linked to one Telegram topic in MVP.
+        Rebinding the same topic to the same session is idempotent; trying to
+        link the same session to a different topic raises ValueError.
+        """
+        self.apply_telegram_topic_migration()
+        now = time.time()
+        chat_id = str(chat_id)
+        thread_id = str(thread_id)
+        user_id = str(user_id)
+        session_key = str(session_key)
+        session_id = str(session_id)
+
+        def _do(conn):
+            existing_session = conn.execute(
+                """
+                SELECT chat_id, thread_id FROM telegram_dm_topic_bindings
+                WHERE session_id = ?
+                """,
+                (session_id,),
+            ).fetchone()
+            if existing_session is not None:
+                linked_chat = existing_session["chat_id"] if isinstance(existing_session, sqlite3.Row) else existing_session[0]
+                linked_thread = existing_session["thread_id"] if isinstance(existing_session, sqlite3.Row) else existing_session[1]
+                if str(linked_chat) != chat_id or str(linked_thread) != thread_id:
+                    raise ValueError("session is already linked to another Telegram topic")
+
+            conn.execute(
+                """
+                INSERT INTO telegram_dm_topic_bindings (
+                    chat_id, thread_id, user_id, session_key, session_id,
+                    managed_mode, linked_at, updated_at
+                ) VALUES (?, ?, ?, ?, ?, ?, ?, ?)
+                ON CONFLICT(chat_id, thread_id) DO UPDATE SET
+                    user_id = excluded.user_id,
+                    session_key = excluded.session_key,
+                    session_id = excluded.session_id,
+                    managed_mode = excluded.managed_mode,
+                    updated_at = excluded.updated_at
+                """,
+                (
+                    chat_id,
+                    thread_id,
+                    user_id,
+                    session_key,
+                    session_id,
+                    managed_mode,
+                    now,
+                    now,
+                ),
+            )
+        self._execute_write(_do)
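The binding rules in `bind_telegram_topic` can be exercised in isolation. The sketch below is a minimal stand-in, not the Hermes implementation: the table schema (in particular the `(chat_id, thread_id)` primary key that the `ON CONFLICT` clause relies on) and the helper names `make_db` and `bind` are assumptions, because the migration itself is not part of this hunk.

```python
import sqlite3
import time


def make_db() -> sqlite3.Connection:
    """Create an in-memory DB with an ASSUMED bindings schema."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        """
        CREATE TABLE telegram_dm_topic_bindings (
            chat_id TEXT NOT NULL,
            thread_id TEXT NOT NULL,
            user_id TEXT NOT NULL,
            session_key TEXT NOT NULL,
            session_id TEXT NOT NULL,
            managed_mode TEXT NOT NULL,
            linked_at REAL NOT NULL,
            updated_at REAL NOT NULL,
            PRIMARY KEY (chat_id, thread_id)
        )
        """
    )
    return conn


def bind(conn, *, chat_id, thread_id, user_id, session_key, session_id,
         managed_mode="auto"):
    """Hypothetical helper mirroring the check-then-upsert flow."""
    chat_id, thread_id, session_id = str(chat_id), str(thread_id), str(session_id)
    row = conn.execute(
        "SELECT chat_id, thread_id FROM telegram_dm_topic_bindings "
        "WHERE session_id = ?",
        (session_id,),
    ).fetchone()
    # A session may be linked to at most one topic.
    if row is not None and (row["chat_id"], row["thread_id"]) != (chat_id, thread_id):
        raise ValueError("session is already linked to another Telegram topic")
    now = time.time()
    conn.execute(
        """
        INSERT INTO telegram_dm_topic_bindings (
            chat_id, thread_id, user_id, session_key, session_id,
            managed_mode, linked_at, updated_at
        ) VALUES (?, ?, ?, ?, ?, ?, ?, ?)
        ON CONFLICT(chat_id, thread_id) DO UPDATE SET
            user_id = excluded.user_id,
            session_key = excluded.session_key,
            session_id = excluded.session_id,
            managed_mode = excluded.managed_mode,
            updated_at = excluded.updated_at
        """,
        (chat_id, thread_id, str(user_id), str(session_key), session_id,
         managed_mode, now, now),
    )
    conn.commit()


conn = make_db()
args = dict(chat_id="1", thread_id="10", user_id="u1",
            session_key="k1", session_id="s1")
bind(conn, **args)
bind(conn, **args)  # same topic, same session: the upsert keeps one row
count = conn.execute(
    "SELECT COUNT(*) FROM telegram_dm_topic_bindings"
).fetchone()[0]
```

Note that the upsert leaves `linked_at` untouched on conflict, so a rebind only refreshes `updated_at`.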
+
+    def is_telegram_session_linked_to_topic(self, *, session_id: str) -> bool:
+        """Return True if a Hermes session is already bound to any Telegram DM topic.
+
+        Read-only: does NOT trigger the telegram-topic migration. If the
+        topic-mode tables have not been created yet (i.e. nobody has run
+        ``/topic`` in this profile), the session is by definition unbound
+        and we return False.
+        """
+        with self._lock:
+            try:
+                row = self._conn.execute(
+                    """
+                    SELECT 1 FROM telegram_dm_topic_bindings
+                    WHERE session_id = ?
+                    LIMIT 1
+                    """,
+                    (str(session_id),),
+                ).fetchone()
+            except sqlite3.OperationalError:
+                return False
+        return row is not None
+
+    def list_unlinked_telegram_sessions_for_user(
+        self,
+        *,
+        chat_id: str,
+        user_id: str,
+        limit: int = 10,
+    ) -> List[Dict[str, Any]]:
+        """List previous Telegram sessions for this user that are not bound to a topic.
+
+        Read-only: does NOT trigger the telegram-topic migration. If the
+        topic-mode tables are absent, fall back to a simpler query that
+        just returns this user's Telegram sessions — there can't be any
+        bindings yet.
+        """
+        with self._lock:
+            try:
+                rows = self._conn.execute(
+                    """
+                    SELECT s.*,
+                        COALESCE(
+                            (SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
+                             FROM messages m
+                             WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
+                             ORDER BY m.timestamp, m.id LIMIT 1),
+                            ''
+                        ) AS _preview_raw,
+                        COALESCE(
+                            (SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
+                            s.started_at
+                        ) AS last_active
+                    FROM sessions s
+                    WHERE s.source = 'telegram'
+                      AND s.user_id = ?
+                      AND NOT EXISTS (
+                          SELECT 1 FROM telegram_dm_topic_bindings b
+                          WHERE b.session_id = s.id
+                      )
+                    ORDER BY last_active DESC, s.started_at DESC
+                    LIMIT ?
+                    """,
+                    (str(user_id), int(limit)),
+                ).fetchall()
+            except sqlite3.OperationalError:
+                # telegram_dm_topic_bindings doesn't exist yet — no bindings
+                # means every telegram session for this user is "unlinked".
+                rows = self._conn.execute(
+                    """
+                    SELECT s.*,
+                        COALESCE(
+                            (SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
+                             FROM messages m
+                             WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
+                             ORDER BY m.timestamp, m.id LIMIT 1),
+                            ''
+                        ) AS _preview_raw,
+                        COALESCE(
+                            (SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
+                            s.started_at
+                        ) AS last_active
+                    FROM sessions s
+                    WHERE s.source = 'telegram'
+                      AND s.user_id = ?
+                    ORDER BY last_active DESC, s.started_at DESC
+                    LIMIT ?
+                    """,
+                    (str(user_id), int(limit)),
+                ).fetchall()
+
+        sessions: List[Dict[str, Any]] = []
+        for row in rows:
+            session = dict(row)
+            raw = str(session.pop("_preview_raw", "") or "").strip()
+            session["preview"] = (raw[:60] + ("..." if len(raw) > 60 else "")) if raw else ""
+            sessions.append(session)
+        return sessions
+
    # ── Space reclamation ──

    def vacuum(self) -> None:
@@ -356,12 +356,17 @@ def _compute_tool_definitions(
            else:
                if not quiet_mode:
                    print(f"⚠️  Unknown toolset: {toolset_name}")
-
-    elif disabled_toolsets:
+    else:
+        # Default: start with everything
        from toolsets import get_all_toolsets
        for ts_name in get_all_toolsets():
            tools_to_include.update(resolve_toolset(ts_name))

+    # Always apply disabled toolsets as a subtraction step at the end.
+    # This ensures that even if a composite toolset (like hermes-cli)
+    # is enabled, any tools belonging to a disabled toolset are strictly
+    # stripped out. See issue #17309.
+    if disabled_toolsets:
        for toolset_name in disabled_toolsets:
            if validate_toolset(toolset_name):
                resolved = resolve_toolset(toolset_name)
@@ -376,10 +381,6 @@ def _compute_tool_definitions(
            else:
                if not quiet_mode:
                    print(f"⚠️  Unknown toolset: {toolset_name}")
-    else:
-        from toolsets import get_all_toolsets
-        for ts_name in get_all_toolsets():
-            tools_to_include.update(resolve_toolset(ts_name))
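The reordering in this hunk amounts to "build the include set first, subtract disabled toolsets last". A minimal sketch of that ordering follows; the `TOOLSETS` registry and `compute_tools` are hypothetical stand-ins for the real `get_all_toolsets()` / `resolve_toolset()` helpers, and the tool names are illustrative only.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical registry; the real code resolves names via resolve_toolset().
TOOLSETS: Dict[str, Set[str]] = {
    "web": {"web_search", "web_extract"},
    "terminal": {"terminal"},
    # A composite toolset that overlaps with "web" and "terminal".
    "hermes-cli": {"web_search", "web_extract", "terminal", "read_file"},
}


def compute_tools(enabled: Optional[Iterable[str]] = None,
                  disabled: Optional[Iterable[str]] = None) -> Set[str]:
    tools: Set[str] = set()
    # Step 1: an explicit enable list, or every toolset by default.
    for name in (enabled if enabled else TOOLSETS):
        tools |= TOOLSETS.get(name, set())
    # Step 2: subtraction always runs last, whatever step 1 selected.
    for name in (disabled or ()):
        tools -= TOOLSETS.get(name, set())
    return tools


result = compute_tools(enabled=["hermes-cli"], disabled=["web"])
print(sorted(result))  # ['read_file', 'terminal']
```

With the old `elif`, an enabled composite toolset skipped the subtraction branch entirely, which is the case the second step now covers.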

    # Plugin-registered tools are now resolved through the normal toolset
    # path — validate_toolset() / resolve_toolset() / get_all_toolsets()
@@ -510,6 +511,12 @@ def coerce_tool_args(tool_name: str, args: Dict[str, Any]) -> Dict[str, Any]:

    Handles ``"type": "integer"``, ``"type": "number"``, ``"type": "boolean"``,
    and union types (``"type": ["integer", "string"]``).
+
+    Also wraps bare scalar values in a single-element list when the schema
+    declares ``"type": "array"``.  Open-weight models (DeepSeek, Qwen, GLM)
+    sometimes emit ``{"urls": "https://a.com"}`` when the tool expects
+    ``{"urls": ["https://a.com"]}``; wrapping here avoids a confusing tool
+    failure on what is otherwise a well-formed call.
    """
    if not args or not isinstance(args, dict):
        return args
@@ -522,13 +529,42 @@ def coerce_tool_args(tool_name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    if not properties:
        return args

-    for key, value in args.items():
-        if not isinstance(value, str):
-            continue
+    for key, value in list(args.items()):
        prop_schema = properties.get(key)
        if not prop_schema:
            continue
        expected = prop_schema.get("type")
+
+        # Wrap bare non-list values when the schema declares ``array``.
+        # Strings still go through _coerce_value first so JSON-encoded
+        # arrays (``'["a","b"]'``) get parsed and nullable ``"null"``
+        # becomes ``None`` rather than ``["null"]``.
+        # ``None`` itself is preserved — we don't know whether the model
+        # meant "omit" or "empty list", and tools with sensible defaults
+        # (e.g. read_file's normalize_read_pagination) already handle it.
+        if expected == "array" and value is not None and not isinstance(value, (list, tuple)):
+            if isinstance(value, str):
+                coerced = _coerce_value(value, expected, schema=prop_schema)
+                if coerced is not value:
+                    # _coerce_value handled it (JSON-parsed list or
+                    # nullable "null" → None).
+                    args[key] = coerced
+                    continue
+                args[key] = [value]
+                logger.info(
+                    "coerce_tool_args: wrapped bare string in list for %s.%s",
+                    tool_name, key,
+                )
+                continue
+            args[key] = [value]
+            logger.info(
+                "coerce_tool_args: wrapped bare %s in list for %s.%s",
+                type(value).__name__, tool_name, key,
+            )
+            continue
+
+        if not isinstance(value, str):
+            continue
        if not expected and not _schema_allows_null(prop_schema):
            continue
        coerced = _coerce_value(value, expected, schema=prop_schema)
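The array-wrapping branch can be summarised as a small decision table. The sketch below is an approximation under stated assumptions: `wrap_for_array` is a hypothetical name, and its JSON and `"null"` handling only imitate what the hunk attributes to `_coerce_value`, whose body is not shown here.

```python
import json
from typing import Any


def wrap_for_array(value: Any, nullable: bool = False) -> Any:
    """Approximate the coercion applied when the schema type is ``array``."""
    # None and real sequences pass through untouched.
    if value is None or isinstance(value, (list, tuple)):
        return value
    if isinstance(value, str):
        text = value.strip()
        # A nullable "null" string becomes None rather than ["null"].
        if nullable and text == "null":
            return None
        # A JSON-encoded array is parsed instead of being wrapped.
        if text.startswith("["):
            try:
                parsed = json.loads(text)
            except json.JSONDecodeError:
                parsed = None
            if isinstance(parsed, list):
                return parsed
    # Any other bare scalar is wrapped in a single-element list.
    return [value]


examples = [
    wrap_for_array("https://a.com"),
    wrap_for_array('["a", "b"]'),
    wrap_for_array("null", nullable=True),
    wrap_for_array(5),
    wrap_for_array(None),
]
```

The order matters: parsing is attempted before wrapping, so a model that emits a stringified list does not end up with a nested one-element list.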
@@ -163,35 +163,42 @@
      for entry in "''${ENTRIES[@]}"; do
        IFS=":" read -r ATTR FOLDER NIX_FILE <<< "$entry"
        echo "==> .#$ATTR ($FOLDER -> $NIX_FILE)"
-        OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --rebuild --print-build-logs 2>&1)
-        STATUS=$?
-        if [ "$STATUS" -eq 0 ]; then
+
+        # Compute the actual hash from the lockfile directly using
+        # prefetch-npm-deps. This avoids false "ok" from nix build when
+        # an old derivation is cached in a substituter (cachix/cache.nixos.org).
+        LOCK_FILE="$FOLDER/package-lock.json"
+        NEW_HASH=$(${pkgs.lib.getExe pkgs.prefetch-npm-deps} "$LOCK_FILE" 2>/dev/null)
+        if [ -z "$NEW_HASH" ]; then
+          echo "    prefetch-npm-deps failed, falling back to nix build" >&2
+          OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --print-build-logs 2>&1)
+          STATUS=$?
+          if [ "$STATUS" -eq 0 ]; then
+            echo "    ok (via nix build)"
+            continue
+          fi
+          NEW_HASH=$(echo "$OUTPUT" | awk '/got:/ {print $2; exit}')
+          if [ -z "$NEW_HASH" ]; then
+            if echo "$OUTPUT" | grep -qE "throttled|HTTP error 418|substituter .* is disabled|some outputs of .* are not valid"; then
+              echo "    skipped (transient cache failure — see primary nix build for real status)" >&2
+              echo "$OUTPUT" | tail -8 >&2
+              continue
+            fi
+            echo "    build failed with no hash mismatch:" >&2
+            echo "$OUTPUT" | tail -40 >&2
+            exit 1
+          fi
+        fi
+
+        OLD_HASH=$(grep -oE 'hash = "sha256-[^"]+"' "$NIX_FILE" | head -1 \
+          | sed -E 's/hash = "(.*)"/\1/')
+
+        if [ "$NEW_HASH" = "$OLD_HASH" ]; then
          echo "    ok"
          continue
        fi

-        NEW_HASH=$(echo "$OUTPUT" | awk '/got:/ {print $2; exit}')
-        if [ -z "$NEW_HASH" ]; then
-          # Magic-Nix-Cache occasionally returns HTTP 418 / cache-throttled
-          # mid-run; nix then prints "outputs … not valid, so checking is
-          # not possible" without a `got:` line.  That's an infrastructure
-          # blip, not a stale lockfile — warn + skip rather than failing
-          # the lint.  A real hash mismatch would still surface in the
-          # primary `.#$ATTR` build, which is a separate CI job.
-          if echo "$OUTPUT" | grep -qE "throttled|HTTP error 418|substituter .* is disabled|some outputs of .* are not valid"; then
-            echo "    skipped (transient cache failure — see primary nix build for real status)" >&2
-            echo "$OUTPUT" | tail -8 >&2
-            continue
-          fi
-          echo "    build failed with no hash mismatch:" >&2
-          echo "$OUTPUT" | tail -40 >&2
-          exit 1
-        fi
-
        HASH_LINE=$(grep -n 'hash = "sha256-' "$NIX_FILE" | head -1 | cut -d: -f1)
-        OLD_HASH=$(grep -oE 'hash = "sha256-[^"]+"' "$NIX_FILE" | head -1 \
-          | sed -E 's/hash = "(.*)"/\1/')
-        LOCK_FILE="$FOLDER/package-lock.json"
        echo "    stale: $NIX_FILE:$HASH_LINE $OLD_HASH -> $NEW_HASH"
        STALE=1

@@ -4,7 +4,7 @@ let
  src = ../ui-tui;
  npmDeps = pkgs.fetchNpmDeps {
    inherit src;
-    hash = "sha256-Chz+NW9NXqboXHOa6PKwf5bhAkkcFtKNhvKWwg2XSPc=";
+    hash = "sha256-MLcLhjTF6dgdvNBtJWzo8Nh19eNh/ZitD2b07nm61Tc=";
  };

  npm = hermesNpmLib.mkNpmPassthru { folder = "ui-tui"; attr = "tui"; pname = "hermes-tui"; };
@@ -0,0 +1,190 @@
+---
+name: hyperframes
+description: Create HTML-based video compositions, animated title cards, social overlays, captioned talking-head videos, audio-reactive visuals, and shader transitions using HyperFrames. HTML is the source of truth for video. Use when the user wants a rendered MP4/WebM from an HTML composition, wants to animate text/logos/charts over media, needs captions synced to audio, wants TTS narration, or wants to convert a website into a video.
+version: 1.0.0
+author: heygen-com
+license: Apache-2.0
+prerequisites:
+  commands: [node, ffmpeg, npx]
+metadata:
+  hermes:
+    tags: [creative, video, animation, html, gsap, motion-graphics]
+    related_skills: [manim-video, meme-generation]
+    category: creative
+    requires_toolsets: [terminal]
+---
+
+# HyperFrames
+
+HTML is the source of truth for video. A composition is an HTML file with `data-*` attributes for timing, a GSAP timeline for animation, and CSS for appearance. The HyperFrames engine captures the page frame-by-frame and encodes to MP4/WebM with FFmpeg.
+
+**Complement to `manim-video`:** Use `manim-video` for mathematical/geometric explainers (equations, 3B1B-style). Use `hyperframes` for motion-graphics, talking-head with captions, product tours, social overlays, shader transitions, and anything driven by real video/audio media.
+
+## When to Use
+
+- User asks for a rendered video from text, a script, or a website
+- Animated title cards, lower thirds, or typographic intros
+- Captioned narration video (TTS + captions synced to waveform)
+- Audio-reactive visuals (beat sync, spectrum bars, pulsing glow)
+- Scene-to-scene transitions (crossfade, wipe, shader warp, flash-through-white)
+- Social overlays (Instagram/TikTok/YouTube style)
+- Website-to-video pipeline (capture a URL, produce a promo)
+- Any HTML/CSS/JS animation that must render deterministically to a video file
+
+Do **not** use this skill for:
+- Pure math/equation animation (→ `manim-video`)
+- Image generation or memes (→ `meme-generation`, image models)
+- Live video conferencing or streaming
+
+## Quick Reference
+
+```bash
+npx hyperframes init my-video               # scaffold a project
+cd my-video
+npx hyperframes lint                        # validate before preview/render
+npx hyperframes preview                     # live-reload browser preview (port 3002)
+npx hyperframes render --output final.mp4   # render to MP4
+npx hyperframes doctor                      # diagnose environment issues
+```
+
+Render flags: `--quality draft|standard|high` · `--fps 24|30|60` · `--format mp4|webm` · `--docker` (reproducible) · `--strict`.
+
+Full CLI reference: [references/cli.md](references/cli.md).
+
+## Setup (one-time)
+
+```bash
+bash "$(dirname "$(find ~/.hermes/skills -path '*/hyperframes/SKILL.md' 2>/dev/null | head -1)")/scripts/setup.sh"
+```
+
+The script:
+1. Verifies Node.js >= 22 and FFmpeg are installed (prints fix instructions if not).
+2. Installs the `hyperframes` CLI globally (`npm install -g "hyperframes@>=0.4.2"`; the quotes stop the shell from treating `>` as a redirect).
+3. Pre-caches `chrome-headless-shell` via Puppeteer — **required** for best-quality rendering via Chrome's `HeadlessExperimental.beginFrame` capture path.
+4. Runs `npx hyperframes doctor` and reports the result.
+
+See [references/troubleshooting.md](references/troubleshooting.md) if setup fails.
+
+## Procedure
+
+### 1. Plan before writing HTML
+
+Before touching code, articulate at a high level:
+- **What** — narrative arc, key moments, emotional beats
+- **Structure** — compositions, tracks (video/audio/overlays), durations
+- **Visual identity** — colors, fonts, motion character (explosive / cinematic / fluid / technical)
+- **Hero frame** — for each scene, the moment when the most elements are simultaneously visible. This is the static layout you'll build first.
+
+**Visual Identity Gate (HARD-GATE).** Before writing ANY composition HTML, a visual identity must be defined. Do NOT write compositions with default or generic colors (`#333`, `#3b82f6`, `Roboto` are tells that this step was skipped). Check in order:
+
+1. **`DESIGN.md` at project root?** → Use its exact colors, fonts, motion rules, and "What NOT to Do" constraints.
+2. **User named a style** (e.g. "Swiss Pulse", "dark and techy", "luxury brand")? → Generate a minimal `DESIGN.md` with `## Style Prompt`, `## Colors` (3-5 hex with roles), `## Typography` (1-2 families), `## What NOT to Do` (3-5 anti-patterns).
+3. **None of the above?** → Ask 3 questions before writing any HTML:
+   - Mood? (explosive / cinematic / fluid / technical / chaotic / warm)
+   - Light or dark canvas?
+   - Any brand colors, fonts, or visual references?
+
+   Then generate a `DESIGN.md` from the answers. Every composition must trace its palette and typography back to `DESIGN.md` or explicit user direction.
+
+### 2. Scaffold
+
+```bash
+npx hyperframes init my-video --non-interactive
+```
+
+Templates: `blank`, `warm-grain`, `play-mode`, `swiss-grid`, `vignelli`, `decision-tree`, `kinetic-type`, `product-promo`, `nyt-graph`. Pass `--example <name>` to pick one, `--video clip.mp4` or `--audio track.mp3` to seed with media.
+
+### 3. Layout before animation
+
+Write the static HTML+CSS for the **hero frame first** — no GSAP yet. The `.scene-content` container must fill the scene (`width:100%; height:100%; padding:Npx`) with `display:flex` + `gap`. Use padding to push content inward — never `position: absolute; top: Npx` on a content container (content overflows when taller than the remaining space).
+
+Only after the hero frame looks right, add `gsap.from()` entrances (animate **to** the CSS position) and `gsap.to()` exits (animate **from** it).
+
+See [references/composition.md](references/composition.md) for the full data-attribute schema and composition rules.
+
+### 4. Animate with GSAP
+
+Every composition must:
+- Register its timeline: `window.__timelines["<composition-id>"] = tl`
+- Start paused: `gsap.timeline({ paused: true })` — the player controls playback
+- Use finite `repeat` values (no `repeat: -1` — breaks the capture engine). Calculate: `repeat: Math.ceil(duration / cycleDuration) - 1`.
+- Be deterministic — no `Math.random()`, `Date.now()`, or wall-clock logic. Use a seeded PRNG if you need pseudo-randomness.
+- Build synchronously — no `async`/`await`, `setTimeout`, or Promises around timeline construction.
+
+See [references/gsap.md](references/gsap.md) for the core GSAP API (tweens, eases, stagger, timelines).
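The finite-repeat rule and the determinism rule above are both plain arithmetic, so they can be checked outside a composition. The Python sketch below only mirrors the `Math.ceil(duration / cycleDuration) - 1` formula; `finite_repeat` and `seeded_offsets` are hypothetical helper names, and Python's `random.Random` stands in for whatever seeded PRNG a real JavaScript composition would use.

```python
import math
import random
from typing import List


def finite_repeat(duration: float, cycle_duration: float) -> int:
    """Repeat count so that (repeat + 1) cycles cover the whole duration."""
    if cycle_duration <= 0:
        raise ValueError("cycle_duration must be positive")
    return max(0, math.ceil(duration / cycle_duration) - 1)


def seeded_offsets(seed: int, count: int) -> List[float]:
    """Pseudo-random values that are identical on every run for a given seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]


repeat = finite_repeat(10, 2.5)      # 4 cycles cover 10s, so 3 repeats
first = seeded_offsets(42, 3)
second = seeded_offsets(42, 3)       # same seed, same sequence
```

A seeded generator keeps frame N identical across renders, which is the property the capture engine depends on.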
+
+### 5. Transitions between scenes
+
+Multi-scene compositions require transitions. Rules:
+1. **Always use a transition between scenes** — no jump cuts.
+2. **Always use entrance animations** on every scene element (`gsap.from(...)`).
+3. **Never use exit animations** except on the final scene — the transition IS the exit.
+4. The final scene may fade out.
+
+Use `npx hyperframes add <transition-name>` to install shader transitions (`flash-through-white`, `liquid-wipe`, etc.). Full list: `npx hyperframes add --list`.
+
+### 6. Audio, captions, TTS, audio-reactive, highlighting
+
+- **Audio:** always a separate `<audio>` element (video is `muted playsinline`).
+- **TTS:** `npx hyperframes tts "Script text" --voice af_nova --output narration.wav`. List voices with `--list`. Voice ID first letter encodes language (`a`/`b`=English, `e`=Spanish, `f`=French, `j`=Japanese, `z`=Mandarin, etc.) — the CLI auto-infers the phonemizer locale; pass `--lang` only to override. Non-English phonemization requires `espeak-ng` installed system-wide.
+- **Captions:** `npx hyperframes transcribe narration.wav` → word-level transcript. Pick style from the transcript tone (hype / corporate / tutorial / storytelling / social — see the table in `references/features.md`). **Language rule:** never use `.en` whisper models unless the audio is confirmed English — `.en` translates non-English audio instead of transcribing it. Every caption group MUST have a hard `tl.set(el, { opacity: 0, visibility: "hidden" }, group.end)` kill after its exit tween — otherwise groups leak visible into later ones.
+- **Audio-reactive visuals:** pre-extract audio bands (bass / mid / treble) and sample per-frame inside the timeline with a `for` loop of `tl.call(draw, [], f / fps)` — a single long tween does NOT react to audio. Map bass → `scale` (pulse), treble → `textShadow`/`boxShadow` (glow), overall amplitude → `opacity`/`y`/`backgroundColor`. Avoid equalizer-bar clichés — let content guide the visual, audio drive its behavior.
+- **Marker-style highlighting:** highlight, circle, burst, scribble, sketchout effects for text emphasis are deterministic CSS+GSAP — see `references/features.md#marker-highlighting`. Fully seekable, no animated SVG filters.
+- **Scene transitions:** every multi-scene composition MUST use transitions (no jump cuts). Pick from CSS primitives (push slide, blur crossfade, zoom through, staggered blocks) or shader transitions (`flash-through-white`, `liquid-wipe`, `cross-warp-morph`, `chromatic-split`, etc.) via `npx hyperframes add`. Mood and energy tables live in `references/features.md#transitions`. Do not mix CSS and shader transitions in the same composition.
+
+### 7. Lint, validate, inspect, preview, render
+
+```bash
+npx hyperframes lint              # catches missing data-composition-id, overlapping tracks, unregistered timelines
+npx hyperframes validate          # WCAG contrast audit at 5 timestamps
+npx hyperframes inspect           # visual layout audit — overflow, off-frame elements, occluded text
+npx hyperframes preview           # live browser preview
+npx hyperframes render --quality draft --output draft.mp4    # fast iteration
+npx hyperframes render --quality high --output final.mp4     # final delivery
+```
+
+`hyperframes validate` samples background pixels behind every text element and warns on contrast ratios below 4.5:1 (or 3:1 for large text). `hyperframes inspect` is the layout-side companion: it runs the page at multiple timestamps and flags issues that static lint can't see (a caption that wraps past the safe area only at 4.5s, a card that overflows when its title is the longest variant, an element that ends up behind a transition shader). Run `inspect` especially on compositions with speech bubbles, cards, captions, or tight typography.
+
+### 8. Website-to-video (if the user gives a URL)
+
+Use the 7-step capture-to-video workflow in [references/website-to-video.md](references/website-to-video.md): capture → DESIGN.md → SCRIPT.md → storyboard → composition → render → deliver.
+
+## Pitfalls
+
+- **`'HeadlessExperimental.beginFrame' wasn't found`** — Chromium 147+ removed this protocol. Ensure you're on `hyperframes@>=0.4.2` (auto-detects and falls back to screenshot mode). Escape hatch: `export PRODUCER_FORCE_SCREENSHOT=true`. See [hyperframes#294](https://github.com/heygen-com/hyperframes/issues/294) and [references/troubleshooting.md](references/troubleshooting.md).
+- **System Chrome (not `chrome-headless-shell`)** — renders hang for 120s, then time out. Run `npx puppeteer browsers install chrome-headless-shell` (setup.sh does this). `hyperframes doctor` reports which binary will be used.
+- **`repeat: -1` anywhere** — breaks the capture engine. Always compute a finite repeat count.
+- **`gsap.set()` on clip elements that enter later** — the element doesn't exist at page load. Use `tl.set(selector, vars, timePosition)` inside the timeline instead, at or after the clip's `data-start`.
+- **`<br>` inside content text** — forced breaks don't know the rendered font width, so natural wrap + `<br>` double-breaks. Use `max-width` to let text wrap. Exception: short display titles where each word is deliberately on its own line.
+- **Animating `visibility` or `display`** — GSAP can't tween these. Use `autoAlpha` (handles both visibility and opacity).
+- **Calling `video.play()` or `audio.play()`** — the framework owns playback. Never call these yourself.
+- **Building timelines async** — the capture engine reads `window.__timelines` synchronously after page load. Never wrap timeline construction in `async`, `setTimeout`, or a Promise.
+- **Standalone `index.html` wrapped in `<template>`** — hides all content from the browser. Only **sub-compositions** loaded via `data-composition-src` use `<template>`.
+- **Using video for audio** — always muted `<video>` + separate `<audio>`.
+
+## Verification
+
+Before and after rendering:
+
+1. **Lint + validate + inspect pass:** `npx hyperframes lint --strict && npx hyperframes validate && npx hyperframes inspect` (lint catches structural issues, validate catches contrast, inspect catches visual layout / overflow issues — see troubleshooting.md if warnings appear).
+2. **Animation choreography** — for new compositions or significant animation changes, run the animation map. `npx hyperframes init` copies the skill scripts into the project, so the path is project-local:
+   ```bash
+   node skills/hyperframes/scripts/animation-map.mjs <composition-dir> \
+     --out <composition-dir>/.hyperframes/anim-map
+   ```
+   Outputs a single `animation-map.json` with per-tween summaries, ASCII Gantt timeline, stagger detection, dead zones (>1s with no animation), element lifecycles, and flags (`offscreen`, `collision`, `invisible`, `paced-fast` <0.2s, `paced-slow` >2s). Scan summaries and flags — fix or justify each. Skip on small edits.
+3. **File exists + non-zero:** `ls -lh final.mp4`.
+4. **Duration matches `data-duration`:** `ffprobe -v error -show_entries format=duration -of default=nw=1:nk=1 final.mp4`.
+5. **Visual check:** extract a mid-composition frame: `ffmpeg -i final.mp4 -ss 00:00:05 -vframes 1 preview.png`.
+6. **Audio present if expected:** `ffprobe -v error -show_streams -select_streams a -of default=nw=1:nk=1 final.mp4 | head -1`.
+
+If `hyperframes render` fails, run `npx hyperframes doctor` and attach its output when reporting.
+
+## References
+
+- [composition.md](references/composition.md) — data attributes, timeline contract, non-negotiable rules, typography/asset rules
+- [cli.md](references/cli.md) — every CLI command (init, capture, lint, validate, inspect, preview, render, transcribe, tts, doctor, browser, info, upgrade, benchmark)
+- [gsap.md](references/gsap.md) — GSAP core API for HyperFrames (tweens, eases, stagger, timelines, matchMedia)
+- [features.md](references/features.md) — captions, TTS, audio-reactive, marker highlighting, transitions (load on demand)
+- [website-to-video.md](references/website-to-video.md) — 7-step capture-to-video workflow
+- [troubleshooting.md](references/troubleshooting.md) — OpenClaw fix, env vars, common render errors
@@ -0,0 +1,185 @@
+# HyperFrames CLI
+
+Everything runs through `npx hyperframes` (or the globally-installed `hyperframes` after `npm install -g hyperframes`). Requires Node.js >= 22 and FFmpeg.
+
+## Workflow
+
+1. **Scaffold** — `npx hyperframes init my-video` (or `npx hyperframes capture <url>` if starting from a website)
+2. **Write** — author HTML composition (see `composition.md`)
+3. **Lint** — `npx hyperframes lint`
+4. **Validate** — `npx hyperframes validate` (WCAG contrast audit)
+5. **Inspect** — `npx hyperframes inspect` (visual layout audit)
+6. **Preview** — `npx hyperframes preview`
+7. **Render** — `npx hyperframes render`
+
+Always lint before preview/render — catches missing `data-composition-id`, overlapping tracks, and unregistered timelines.
+
+## init — Scaffold a Project
+
+```bash
+npx hyperframes init my-video                        # interactive wizard
+npx hyperframes init my-video --example warm-grain   # pick an example template
+npx hyperframes init my-video --video clip.mp4       # seed with a video file
+npx hyperframes init my-video --audio track.mp3      # seed with an audio file
+npx hyperframes init my-video --non-interactive      # skip prompts (CI / agent use)
+```
+
+Templates: `blank`, `warm-grain`, `play-mode`, `swiss-grid`, `vignelli`, `decision-tree`, `kinetic-type`, `product-promo`, `nyt-graph`.
+
+`init` creates the correct file structure, copies media, transcribes audio with Whisper, and installs authoring skills. Use it instead of creating files by hand.
+
+## capture — Website → Editable Components
+
+```bash
+npx hyperframes capture https://example.com                  # → captures/example.com/
+npx hyperframes capture https://stripe.com -o stripe-video   # custom output dir
+npx hyperframes capture https://example.com --json           # machine-readable output
+npx hyperframes capture https://example.com --skip-assets    # skip images/SVGs
+```
+
+Captures the site into `captures/<hostname>/capture/` by default, producing `capture/screenshots/`, `capture/assets/`, `capture/extracted/` (tokens.json, visible-text.txt, fonts.json), and a self-contained snapshot.
+
+All downstream steps (DESIGN.md, SCRIPT.md, STORYBOARD, composition) read from the `capture/` subfolder — see `website-to-video.md`.
+
+## lint
+
+```bash
+npx hyperframes lint                # current directory
+npx hyperframes lint ./my-project   # specific project
+npx hyperframes lint --verbose      # include info-level findings
+npx hyperframes lint --json         # machine-readable output
+```
+
+Lints `index.html` and all files in `compositions/`. Reports errors (must fix), warnings (should fix), and info (only with `--verbose`).
+
+## validate
+
+```bash
+npx hyperframes validate                 # WCAG contrast audit at 5 timestamps
+npx hyperframes validate --no-contrast   # skip while iterating
+```
+
+Seeks to 5 timestamps, screenshots the page, samples background pixels behind every text element, and warns on contrast ratios below 4.5:1 (normal text) or 3:1 (large text — 24px+, or 19px+ bold). Run before final render.
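The 4.5:1 and 3:1 thresholds come from the WCAG 2.x contrast-ratio definition. The sketch below implements that published formula for a single foreground/background pair; whether `hyperframes validate` computes it exactly this way is an assumption, and the function names are illustrative.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def _linear(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: RGB) -> float:
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: RGB, bg: RGB) -> float:
    # The lighter colour always goes in the numerator.
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes(fg: RGB, bg: RGB, large_text: bool = False) -> bool:
    """Apply the 4.5:1 (normal) or 3:1 (large text) threshold."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)


white = (255, 255, 255)
black_on_white = contrast_ratio((0, 0, 0), white)   # maximum possible ratio
grey_ok = passes((118, 118, 118), white)            # #767676 on white
grey_fail = passes((119, 119, 119), white)          # #777777 on white
```

The two greys sit on either side of the 4.5:1 line, which shows how small a palette change can flip a warning.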
+
+## inspect
+
+```bash
+npx hyperframes inspect                 # visual layout audit at 5 timestamps
+npx hyperframes inspect ./my-project    # specific project
+npx hyperframes inspect --json          # agent-readable findings
+npx hyperframes inspect --samples 15    # denser timeline sweep
+npx hyperframes inspect --at 1.5,4,7.25 # explicit hero-frame timestamps
+```
+
+Use this after `lint` and `validate`, especially for compositions with speech bubbles, cards, captions, or tight typography. Reports overflow, off-frame elements, occluded text, contrast warnings, and per-timestamp layout summaries — catches issues that pure timeline lint can't see (e.g., a caption that wraps past the safe area only at a specific timestamp).
+
+`npx hyperframes layout` is a compatibility alias for the same visual inspection pass.
+
+## preview
+
+```bash
+npx hyperframes preview                # serve current directory (port 3002)
+npx hyperframes preview --port 4567    # custom port
+```
+
+Hot-reloads on file changes. Opens the Studio in your browser automatically.
+
+## render
+
+```bash
+npx hyperframes render                              # standard MP4
+npx hyperframes render --output final.mp4           # named output
+npx hyperframes render --quality draft              # fast iteration
+npx hyperframes render --fps 60 --quality high      # final delivery
+npx hyperframes render --format webm                # transparent WebM
+npx hyperframes render --docker                     # byte-identical reproducible render
+```
+
+| Flag           | Options                 | Default                        | Notes                       |
+| -------------- | ----------------------- | ------------------------------ | --------------------------- |
+| `--output`     | path                    | `renders/<name>_<timestamp>.mp4` | Output path                 |
+| `--fps`        | 24, 30, 60              | 30                             | 60fps doubles render time   |
+| `--quality`    | `draft`, `standard`, `high` | standard                   | draft for iterating         |
+| `--format`     | `mp4`, `webm`           | mp4                            | WebM supports transparency  |
+| `--workers`    | 1–8 or `auto`           | auto                           | Each spawns Chrome          |
+| `--docker`     | flag                    | off                            | Reproducible output         |
+| `--gpu`        | flag                    | off                            | GPU-accelerated encoding    |
+| `--strict`     | flag                    | off                            | Fail on lint errors         |
+| `--strict-all` | flag                    | off                            | Fail on errors AND warnings |
+
+**Quality guidance:** `draft` while iterating, `standard` for review, `high` for final delivery.
+
+## transcribe
+
+```bash
+npx hyperframes transcribe audio.mp3
+npx hyperframes transcribe video.mp4 --model medium.en --language en
+npx hyperframes transcribe subtitles.srt     # import existing
+npx hyperframes transcribe subtitles.vtt
+npx hyperframes transcribe openai-response.json
+```
+
+Produces word-level timings suitable for caption components. The first run downloads the Whisper model, which is cached for later runs.
+
+## tts
+
+```bash
+npx hyperframes tts "Text here" --voice af_nova --output narration.wav
+npx hyperframes tts script.txt --voice bf_emma
+npx hyperframes tts "La reunión empieza a las nueve" --voice ef_dora --output es.wav
+npx hyperframes tts "Hello there" --voice af_heart --lang fr-fr --output accented.wav
+npx hyperframes tts --list                    # show all voices
+```
+
+Uses Kokoro (local, no API key). Voice ID first letter encodes language: `a` American English, `b` British English, `e` Spanish, `f` French, `h` Hindi, `i` Italian, `j` Japanese, `p` Brazilian Portuguese, `z` Mandarin. The CLI auto-infers the phonemizer locale from that prefix — pass `--lang` only to override (e.g. stylized accents). Valid `--lang` codes: `en-us`, `en-gb`, `es`, `fr-fr`, `hi`, `it`, `pt-br`, `ja`, `zh`. Non-English phonemization requires `espeak-ng` installed system-wide (`apt-get install espeak-ng` / `brew install espeak-ng`).
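+
+The prefix rule described above can be sketched as a lookup table. This is a hypothetical helper, not CLI code: the pairing of each prefix with a `--lang` code is inferred from the two lists in the paragraph above, and the names `LOCALE_BY_VOICE_PREFIX` and `inferLocale` are invented for illustration.
+
+```javascript
+// Hypothetical helper: the prefix-to-locale pairing is inferred from the
+// documented lists, not taken from the hyperframes source.
+const LOCALE_BY_VOICE_PREFIX = {
+  a: "en-us", // American English
+  b: "en-gb", // British English
+  e: "es",    // Spanish
+  f: "fr-fr", // French
+  h: "hi",    // Hindi
+  i: "it",    // Italian
+  j: "ja",    // Japanese
+  p: "pt-br", // Brazilian Portuguese
+  z: "zh",    // Mandarin
+};
+
+function inferLocale(voiceId, langOverride) {
+  const validLocales = Object.values(LOCALE_BY_VOICE_PREFIX);
+  if (langOverride !== undefined) {
+    // An explicit override wins, mirroring the documented --lang behaviour.
+    if (!validLocales.includes(langOverride)) {
+      throw new Error(`Unsupported lang code: ${langOverride}`);
+    }
+    return langOverride;
+  }
+  const locale = LOCALE_BY_VOICE_PREFIX[voiceId[0]];
+  if (locale === undefined) {
+    throw new Error(`Unknown voice prefix in "${voiceId}"`);
+  }
+  return locale;
+}
+```
+
+With this table, `inferLocale("ef_dora")` resolves to `es`, while `inferLocale("af_heart", "fr-fr")` returns the override, matching the accented-voice example in the command list.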
+
+## doctor
+
+```bash
+npx hyperframes doctor
+```
+
+Verifies environment:
+- Node.js >= 22
+- FFmpeg present on PATH
+- Available RAM (renders are memory-hungry — 4 GB minimum)
+- Chrome binary resolution (`chrome-headless-shell` preferred over system Chrome)
+- Current `hyperframes` version
+
+Run this **first** when a render fails. See `troubleshooting.md` for interpreting the output.
+
+## browser
+
+```bash
+npx hyperframes browser --install      # install the bundled chrome-headless-shell
+npx hyperframes browser --path         # print the resolved browser binary path
+npx hyperframes browser --clean        # clear the bundled browser cache
+```
+
+## info
+
+```bash
+npx hyperframes info
+```
+
+Prints version, Node version, FFmpeg version, OS, and resolved browser path — useful in bug reports.
+
+## upgrade
+
+```bash
+npx hyperframes upgrade -y
+```
+
+Checks for and installs updates. Run this if you hit `HeadlessExperimental.beginFrame` errors; the auto-detect fix shipped in `hyperframes@0.4.2` (commit 4c72ba4, March 2026).
+
+## Other
+
+```bash
+npx hyperframes compositions    # list compositions in the project
+npx hyperframes docs            # open documentation in browser
+npx hyperframes benchmark .     # benchmark render performance
+npx hyperframes add <block>     # install a block/component from the catalog
+npx hyperframes add --list      # browse the catalog
+```
+
+Popular catalog blocks: `flash-through-white` (shader transition), `instagram-follow` (social overlay), `data-chart` (animated chart), `lower-third` (talking-head overlay). See [hyperframes.heygen.com/catalog](https://hyperframes.heygen.com/catalog).
@@ -0,0 +1,129 @@
+# Composition Authoring
+
+HTML structure, data attributes, timeline contract, and non-negotiable rules.
+
+## Root Structure
+
+Standalone `index.html` — the top-level composition. **Does NOT use `<template>`**. Put the `data-composition-id` div directly in `<body>`.
+
+```html
+<!doctype html>
+<html>
+  <body>
+    <div
+      id="stage"
+      data-composition-id="root"
+      data-start="0"
+      data-duration="10"
+      data-width="1920"
+      data-height="1080"
+    >
+      <!-- clips go here -->
+      <video id="clip-1" data-start="0" data-duration="5" data-track-index="0" src="intro.mp4" muted playsinline></video>
+      <img id="logo" data-start="2" data-duration="3" data-track-index="1" src="logo.png" />
+      <audio id="music" data-start="0" data-duration="10" data-track-index="2" data-volume="0.5" src="music.wav"></audio>
+    </div>
+
+    <script src="https://cdn.jsdelivr.net/npm/gsap@3.14.2/dist/gsap.min.js"></script>
+    <script>
+      window.__timelines = window.__timelines || {};
+      const tl = gsap.timeline({ paused: true });
+      tl.from("#logo", { opacity: 0, y: 40, duration: 0.6 }, 2);
+      window.__timelines["root"] = tl;
+    </script>
+  </body>
+</html>
+```
+
+Sub-compositions loaded via `data-composition-src` **DO** use `<template>`:
+
+```html
+<template id="my-comp-template">
+  <div data-composition-id="my-comp" data-width="1920" data-height="1080">
+    <!-- content + scoped <style> + <script> with window.__timelines["my-comp"] -->
+  </div>
+</template>
+```
+
+Load from the root: `<div id="el-1" data-composition-id="my-comp" data-composition-src="compositions/my-comp.html" data-start="0" data-duration="10" data-track-index="1"></div>`
+
+## Data Attributes
+
+### All clips
+
+| Attribute          | Required                          | Values                                                 |
+| ------------------ | --------------------------------- | ------------------------------------------------------ |
+| `id`               | Yes                               | Unique identifier                                      |
+| `data-start`       | Yes                               | Seconds, or clip ID reference (`"el-1"`, `"intro + 2"`) |
+| `data-duration`    | Required for img/div/compositions | Seconds. Video/audio defaults to media duration.       |
+| `data-track-index` | Yes                               | Integer. Same-track clips cannot overlap.              |
+| `data-media-start` | No                                | Trim offset into source (seconds)                      |
+| `data-volume`      | No                                | 0–1 (default 1)                                        |
+
+`data-track-index` controls timeline layout only — **not** visual layering. Use CSS `z-index` for layering.
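+
+The "same-track clips cannot overlap" rule can be checked before rendering. The sketch below is a hypothetical pre-flight check, not part of the CLI. It assumes each clip has already been reduced to `{ id, start, duration, trackIndex }` with `start` resolved to seconds, and it treats clips that merely touch (one ends exactly where the next begins) as non-overlapping.
+
+```javascript
+// Hypothetical lint-style check; clip objects mirror the data attributes
+// above, with data-start already resolved to seconds.
+function findSameTrackOverlaps(clips) {
+  const byTrack = new Map();
+  for (const clip of clips) {
+    if (!byTrack.has(clip.trackIndex)) byTrack.set(clip.trackIndex, []);
+    byTrack.get(clip.trackIndex).push(clip);
+  }
+
+  const overlaps = [];
+  for (const trackClips of byTrack.values()) {
+    const sorted = [...trackClips].sort((a, b) => a.start - b.start);
+    let latest = sorted[0]; // clip with the furthest end time seen so far
+    for (const clip of sorted.slice(1)) {
+      const latestEnd = latest.start + latest.duration;
+      if (clip.start < latestEnd) overlaps.push([latest.id, clip.id]);
+      if (clip.start + clip.duration > latestEnd) latest = clip;
+    }
+  }
+  return overlaps; // array of [earlierId, laterId] pairs
+}
+```
+
+Clips on different tracks are never reported, which reflects the note above that `data-track-index` governs timeline layout rather than visual layering.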
+
+### Composition clips
+
+| Attribute                    | Required | Values                                       |
+| ---------------------------- | -------- | -------------------------------------------- |
+| `data-composition-id`        | Yes      | Unique composition ID                        |
+| `data-start`                 | Yes      | Start time (root composition: `"0"`)         |
+| `data-duration`              | Yes      | Takes precedence over GSAP timeline duration |
+| `data-width` / `data-height` | Yes      | Pixel dimensions (1920x1080 or 1080x1920)    |
+| `data-composition-src`       | No       | Path to external HTML file                   |
+
+## Timeline Contract
+
+- Every timeline starts `{ paused: true }` — the player controls playback.
+- Register every timeline: `window.__timelines["<composition-id>"] = tl`.
+- Duration comes from `data-duration`, not from the GSAP timeline length.
+- Framework auto-nests sub-timelines — do NOT manually add them.
+- Never create empty tweens just to set duration.
+
+## Non-Negotiable Rules
+
+1. **Deterministic.** No `Math.random()`, `Date.now()`, or time-based logic. Use a seeded PRNG (e.g. mulberry32) if you need pseudo-randomness.
+2. **GSAP only on visual properties.** `opacity`, `x`, `y`, `scale`, `rotation`, `color`, `backgroundColor`, `borderRadius`, transforms. Never animate `visibility`, `display`, or call `video.play()`/`audio.play()`.
+3. **No property conflicts across timelines.** Never animate the same property on the same element from multiple timelines simultaneously.
+4. **No `repeat: -1`.** Infinite-repeat tweens break the capture engine. Compute `repeat: Math.ceil(duration / cycleDuration) - 1`.
+5. **Synchronous timeline construction.** Never build timelines inside `async`/`await`, `setTimeout`, or Promises. The capture engine reads `window.__timelines` synchronously after page load. Fonts are embedded by the compiler — no need to wait for load.
+6. **Root composition has no `<template>` wrapper.** Only sub-compositions use `<template>`.
+7. **Video is always `muted playsinline`.** Audio is always a separate `<audio>` element — even if it's the same source file.
+8. **Content containers use padding, not absolute positioning.** `.scene-content { width: 100%; height: 100%; padding: Npx; display: flex; flex-direction: column; gap: Npx; box-sizing: border-box }`. Absolute-positioned content containers overflow. Reserve `position: absolute` for decoratives only.
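+
+Rules 1 and 4 can be sketched in plain JavaScript. The block below uses the widely circulated mulberry32 generator as the seeded PRNG, plus a helper that applies the repeat formula from rule 4. The names `finiteRepeat` and `#pulse`, and the zero-clamp guard, are assumptions added for illustration.
+
+```javascript
+// mulberry32: a small seeded PRNG. The same seed always yields the same
+// sequence, so renders stay deterministic.
+function mulberry32(seed) {
+  let state = seed >>> 0;
+  return function next() {
+    state = (state + 0x6d2b79f5) >>> 0;
+    let t = Math.imul(state ^ (state >>> 15), 1 | state);
+    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
+    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
+  };
+}
+
+// Finite replacement for repeat: -1, using the formula from rule 4.
+// The clamp to 0 is an added guard so degenerate inputs never yield -1.
+function finiteRepeat(duration, cycleDuration) {
+  return Math.max(0, Math.ceil(duration / cycleDuration) - 1);
+}
+
+// Usage inside a composition script (requires GSAP; "#pulse" is hypothetical):
+//   const rand = mulberry32(42);
+//   tl.to("#pulse", { scale: 1 + rand() * 0.2, duration: 0.5,
+//                     repeat: finiteRepeat(10, 0.5) }, 0.2);
+```
+
+Keeping the seed as a literal in the composition means every render and every worker sees the same "random" values.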
+
+## Scene Transitions
+
+Multi-scene compositions MUST follow all of these:
+
+1. **Always use a transition between scenes.** No jump cuts.
+2. **Always use entrance animations** on every scene element. Every element animates IN via `gsap.from(...)`. No element may appear fully-formed.
+3. **Never use exit animations** (except on the final scene). This means NO `gsap.to()` that animates `opacity` to 0, `y` offscreen, etc. The transition IS the exit. Outgoing scene content must be fully visible at the moment the transition starts.
+4. **Final scene only:** may fade elements out. This is the only scene where `gsap.to(..., { opacity: 0 })` is allowed.
+
+## Typography and Assets
+
+- **Fonts:** write the `font-family` you want in CSS — the compiler embeds supported fonts automatically. Unsupported fonts produce a compiler warning.
+- Add `crossorigin="anonymous"` to external media.
+- For dynamic text sizing, use `window.__hyperframes.fitTextFontSize(text, { maxWidth, fontFamily, fontWeight })`.
+- All project files live at the project root alongside `index.html`. Sub-compositions reference assets with `../`.
+- For rendered video: 60px+ headlines, 20px+ body, 16px+ data labels. `font-variant-numeric: tabular-nums` on number columns. Avoid full-screen linear gradients on dark backgrounds (H.264 banding — use radial or solid + localized glow).
+
+## Animation Guardrails
+
+- Offset the first animation 0.1–0.3s (not `t=0`).
+- Vary eases across entrance tweens — at least 3 different eases per scene.
+- Don't repeat an entrance pattern within a scene.
+
+## Never Do
+
+1. Forget `window.__timelines` registration.
+2. Use video for audio — always muted video + separate `<audio>`.
+3. Nest video inside a timed div — use a non-timed wrapper.
+4. Use `data-layer` (use `data-track-index`) or `data-end` (use `data-duration`).
+5. Animate video element dimensions — animate a wrapper div instead.
+6. Call `play`/`pause`/`seek` on media — framework owns playback.
+7. Create a top-level container without `data-composition-id`.
+8. Use `repeat: -1` on any timeline or tween.
+9. Build timelines asynchronously.
+10. Use `gsap.set()` on elements from later scenes — they don't exist in the DOM at page load. Use `tl.set(selector, vars, timePosition)` inside the timeline at or after the clip's `data-start`.
+11. Use `<br>` in content text — causes unwanted extra breaks when the text wraps naturally. Use `max-width` instead. Exception: short display titles (e.g., "THE\nIMMORTAL\nGAME") where each word is deliberately on its own line.