fix: add explicit hermes-api-server toolset for API server platform

The API server adapter was creating agents without specifying enabled_toolsets, causing ALL tools from ALL toolsets to be loaded (including clarify, send_message, and text_to_speech, which don't work without interactive callbacks or gateway dispatch). This could confuse models by presenting too many irrelevant tools, and meant the platform_toolsets config override didn't apply to the API server.

Changes:
- Add hermes-api-server toolset to toolsets.py with appropriate tools (web, terminal, files, browser, vision, skills, HA tools, etc.) but excluding clarify, send_message, and text_to_speech
- Update _create_agent() in api_server.py to use enabled_toolsets=[hermes-api-server]
- Add api_server to PLATFORMS dict in tools_config.py for config override support
- Add tests for toolset definition, tool inclusion/exclusion, and adapter wiring
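
The behavior change described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `ALL_TOOLS` set, the `TOOLSETS` dict layout, the `resolve_tools()` helper, and every tool name other than clarify, send_message, and text_to_speech are hypothetical and are not taken from the real toolsets.py or api_server.py.

```python
# Hypothetical sketch only: structure and most tool names are assumptions,
# not the actual contents of toolsets.py.
ALL_TOOLS = {
    "web_search", "terminal", "read_file", "browser_navigate",
    "vision_analyze", "skills_list",
    "clarify", "send_message", "text_to_speech",
}

# Tools named in the commit message as unusable on the API server because
# they need interactive callbacks or gateway dispatch.
API_SERVER_EXCLUDED = frozenset({"clarify", "send_message", "text_to_speech"})

TOOLSETS = {
    "hermes-api-server": {
        "description": "Tools that work over the API server platform",
        "tools": sorted(ALL_TOOLS - API_SERVER_EXCLUDED),
    },
}


def resolve_tools(enabled_toolsets=None):
    """Return the sorted tool names exposed to the model.

    Passing None mirrors the old adapter behavior: no toolset was
    specified, so every tool from every toolset was loaded.
    """
    if enabled_toolsets is None:
        return sorted(ALL_TOOLS)
    names = set()
    for toolset in enabled_toolsets:
        names.update(TOOLSETS[toolset]["tools"])
    return sorted(names)
```

Under these assumptions, `resolve_tools()` with no argument still returns clarify, while `resolve_tools(["hermes-api-server"])` omits all three excluded tools, which is the difference the adapter change is meant to produce.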
2026-03-26 16:04:39 -07:00
241 changed files with 1525 additions and 23858 deletions
@@ -1,13 +0,0 @@
-# Git
-.git
-.gitignore
-.gitmodules
-
-# Dependencies
-node_modules
-
-# CI/CD
-.github
-
-# Environment files
-.env
@@ -59,25 +59,12 @@ OPENCODE_ZEN_API_KEY=
 # OpenCode Go provides access to open models (GLM-5, Kimi K2.5, MiniMax M2.5)
 # $10/month subscription. Get your key at: https://opencode.ai/auth
 OPENCODE_GO_API_KEY=
-
-# =============================================================================
-# LLM PROVIDER (Hugging Face Inference Providers)
-# =============================================================================
-# Hugging Face routes to 20+ open models via unified OpenAI-compatible endpoint.
-# Free tier included ($0.10/month), no markup on provider rates.
-# Get your token at: https://huggingface.co/settings/tokens
-# Required permission: "Make calls to Inference Providers"
-HF_TOKEN=
 # OPENCODE_GO_BASE_URL=https://opencode.ai/zen/go/v1  # Override default base URL

 # =============================================================================
 # TOOL API KEYS
 # =============================================================================

-# Exa API Key - AI-native web search and contents
-# Get at: https://exa.ai
-EXA_API_KEY=
-
 # Parallel API Key - AI-native web search and extract
 # Get at: https://parallel.ai
 PARALLEL_API_KEY=
@@ -1,61 +0,0 @@
-name: Docker Build and Publish
-
-on:
-  push:
-    branches: [main]
-  pull_request:
-    branches: [main]
-
-concurrency:
-  group: docker-${{ github.ref }}
-  cancel-in-progress: true
-
-jobs:
-  build-and-push:
-    runs-on: ubuntu-latest
-    timeout-minutes: 30
-    steps:
-      - name: Checkout code
-        uses: actions/checkout@v4
-        with:
-          submodules: recursive
-
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3
-
-      - name: Build image
-        uses: docker/build-push-action@v6
-        with:
-          context: .
-          file: Dockerfile
-          load: true
-          tags: nousresearch/hermes-agent:test
-          cache-from: type=gha
-          cache-to: type=gha,mode=max
-
-      - name: Test image starts
-        run: |
-          docker run --rm \
-            -v /tmp/hermes-test:/opt/data \
-            --entrypoint /opt/hermes/docker/entrypoint.sh \
-            nousresearch/hermes-agent:test --help
-
-      - name: Log in to Docker Hub
-        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
-        uses: docker/login-action@v3
-        with:
-          username: ${{ secrets.DOCKERHUB_USERNAME }}
-          password: ${{ secrets.DOCKERHUB_TOKEN }}
-
-      - name: Push image
-        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
-        uses: docker/build-push-action@v6
-        with:
-          context: .
-          file: Dockerfile
-          push: true
-          tags: |
-            nousresearch/hermes-agent:latest
-            nousresearch/hermes-agent:${{ github.sha }}
-          cache-from: type=gha
-          cache-to: type=gha,mode=max
@@ -210,10 +210,6 @@ registry.register(

 The registry handles schema collection, dispatch, availability checking, and error wrapping. All handlers MUST return a JSON string.

-**Path references in tool schemas**: If the schema description mentions file paths (e.g. default output directories), use `display_hermes_home()` to make them profile-aware. The schema is generated at import time, which is after `_apply_profile_override()` sets `HERMES_HOME`.
-
-**State files**: If a tool stores persistent state (caches, logs, checkpoints), use `get_hermes_home()` for the base directory — never `Path.home() / ".hermes"`. This ensures each profile gets its own state.
-
 **Agent-level tools** (todo, memory): intercepted by `run_agent.py` before `handle_function_call()`. See `todo_tool.py` for the pattern.

 ---
@@ -362,69 +358,8 @@ in config.yaml (or `HERMES_BACKGROUND_NOTIFICATIONS` env var):

 ---

-## Profiles: Multi-Instance Support
-
-Hermes supports **profiles** — multiple fully isolated instances, each with its own
-`HERMES_HOME` directory (config, API keys, memory, sessions, skills, gateway, etc.).
-
-The core mechanism: `_apply_profile_override()` in `hermes_cli/main.py` sets
-`HERMES_HOME` before any module imports. All 119+ references to `get_hermes_home()`
-automatically scope to the active profile.
-
-### Rules for profile-safe code
-
-1. **Use `get_hermes_home()` for all HERMES_HOME paths.** Import from `hermes_constants`.
-   NEVER hardcode `~/.hermes` or `Path.home() / ".hermes"` in code that reads/writes state.
-   ```python
-   # GOOD
-   from hermes_constants import get_hermes_home
-   config_path = get_hermes_home() / "config.yaml"
-
-   # BAD — breaks profiles
-   config_path = Path.home() / ".hermes" / "config.yaml"
-   ```
-
-2. **Use `display_hermes_home()` for user-facing messages.** Import from `hermes_constants`.
-   This returns `~/.hermes` for default or `~/.hermes/profiles/<name>` for profiles.
-   ```python
-   # GOOD
-   from hermes_constants import display_hermes_home
-   print(f"Config saved to {display_hermes_home()}/config.yaml")
-
-   # BAD — shows wrong path for profiles
-   print("Config saved to ~/.hermes/config.yaml")
-   ```
-
-3. **Module-level constants are fine** — they cache `get_hermes_home()` at import time,
-   which is AFTER `_apply_profile_override()` sets the env var. Just use `get_hermes_home()`,
-   not `Path.home() / ".hermes"`.
-
-4. **Tests that mock `Path.home()` must also set `HERMES_HOME`** — since code now uses
-   `get_hermes_home()` (reads env var), not `Path.home() / ".hermes"`:
-   ```python
-   with patch.object(Path, "home", return_value=tmp_path), \
-        patch.dict(os.environ, {"HERMES_HOME": str(tmp_path / ".hermes")}):
-       ...
-   ```
-
-5. **Gateway platform adapters should use token locks** — if the adapter connects with
-   a unique credential (bot token, API key), call `acquire_scoped_lock()` from
-   `gateway.status` in the `connect()`/`start()` method and `release_scoped_lock()` in
-   `disconnect()`/`stop()`. This prevents two profiles from using the same credential.
-   See `gateway/platforms/telegram.py` for the canonical pattern.
-
-6. **Profile operations are HOME-anchored, not HERMES_HOME-anchored** — `_get_profiles_root()`
-   returns `Path.home() / ".hermes" / "profiles"`, NOT `get_hermes_home() / "profiles"`.
-   This is intentional — it lets `hermes -p coder profile list` see all profiles regardless
-   of which one is active.
-
 ## Known Pitfalls

-### DO NOT hardcode `~/.hermes` paths
-Use `get_hermes_home()` from `hermes_constants` for code paths. Use `display_hermes_home()`
-for user-facing print/log messages. Hardcoding `~/.hermes` breaks profiles — each profile
-has its own `HERMES_HOME` directory. This was the source of 5 bugs fixed in PR #3575.
-
 ### DO NOT use `simple_term_menu` for interactive menus
 Rendering bugs in tmux/iTerm2 — ghosting on scroll. Use `curses` (stdlib) instead. See `hermes_cli/tools_config.py` for the pattern.

@@ -440,19 +375,6 @@ Tool schema descriptions must not mention tools from other toolsets by name (e.g
 ### Tests must not write to `~/.hermes/`
 The `_isolate_hermes_home` autouse fixture in `tests/conftest.py` redirects `HERMES_HOME` to a temp dir. Never hardcode `~/.hermes/` paths in tests.

-**Profile tests**: When testing profile features, also mock `Path.home()` so that
-`_get_profiles_root()` and `_get_default_hermes_home()` resolve within the temp dir.
-Use the pattern from `tests/hermes_cli/test_profiles.py`:
-```python
-@pytest.fixture
-def profile_env(tmp_path, monkeypatch):
-    home = tmp_path / ".hermes"
-    home.mkdir()
-    monkeypatch.setattr(Path, "home", lambda: tmp_path)
-    monkeypatch.setenv("HERMES_HOME", str(home))
-    return home
-```
-
 ---

 ## Testing
@@ -1,20 +0,0 @@
-FROM debian:13.4
-
-RUN apt-get update
-RUN apt-get install -y nodejs npm python3 python3-pip ripgrep ffmpeg gcc python3-dev libffi-dev
-
-COPY . /opt/hermes
-WORKDIR /opt/hermes
-
-RUN pip install -e ".[all]" --break-system-packages
-RUN npm install
-RUN npx playwright install --with-deps chromium
-WORKDIR /opt/hermes/scripts/whatsapp-bridge
-RUN npm install
-
-WORKDIR /opt/hermes
-RUN chmod +x /opt/hermes/docker/entrypoint.sh
-
-ENV HERMES_HOME=/opt/data
-VOLUME [ "/opt/data" ]
-ENTRYPOINT [ "/opt/hermes/docker/entrypoint.sh" ]
@@ -1,348 +0,0 @@
-# Hermes Agent v0.5.0 (v2026.3.28)
-
-**Release Date:** March 28, 2026
-
-> The hardening release — Hugging Face provider, /model command overhaul, Telegram Private Chat Topics, native Modal SDK, plugin lifecycle hooks, tool-use enforcement for GPT models, Nix flake, 50+ security and reliability fixes, and a comprehensive supply chain audit.
-
---
-
-## ✨ Highlights
-
- **Nous Portal now supports 400+ models** — The Nous Research inference portal has expanded dramatically, giving Hermes Agent users access to over 400 models through a single provider endpoint
-
- **Hugging Face as a first-class inference provider** — Full integration with HF Inference API including curated agentic model picker that maps to OpenRouter analogues, live `/models` endpoint probe, and setup wizard flow ([#3419](https://github.com/NousResearch/hermes-agent/pull/3419), [#3440](https://github.com/NousResearch/hermes-agent/pull/3440))
-
- **Telegram Private Chat Topics** — Project-based conversations with functional skill binding per topic, enabling isolated workflows within a single Telegram chat ([#3163](https://github.com/NousResearch/hermes-agent/pull/3163))
-
- **Native Modal SDK backend** — Replaced swe-rex dependency with native Modal SDK (`Sandbox.create.aio` + `exec.aio`), eliminating tunnels and simplifying the Modal terminal backend ([#3538](https://github.com/NousResearch/hermes-agent/pull/3538))
-
- **Plugin lifecycle hooks activated** — `pre_llm_call`, `post_llm_call`, `on_session_start`, and `on_session_end` hooks now fire in the agent loop and CLI/gateway, completing the plugin hook system ([#3542](https://github.com/NousResearch/hermes-agent/pull/3542))
-
- **Improved OpenAI Model Reliability** — Added `GPT_TOOL_USE_GUIDANCE` to prevent GPT models from describing intended actions instead of making tool calls, plus automatic stripping of stale budget warnings from conversation history that caused models to avoid tools across turns ([#3528](https://github.com/NousResearch/hermes-agent/pull/3528))
-
- **Nix flake** — Full uv2nix build, NixOS module with persistent container mode, auto-generated config keys from Python source, and suffix PATHs for agent-friendliness ([#20](https://github.com/NousResearch/hermes-agent/pull/20), [#3274](https://github.com/NousResearch/hermes-agent/pull/3274), [#3061](https://github.com/NousResearch/hermes-agent/pull/3061)) by @alt-glitch
-
- **Supply chain hardening** — Removed compromised `litellm` dependency, pinned all dependency version ranges, regenerated `uv.lock` with hashes, added CI workflow scanning PRs for supply chain attack patterns, and bumped deps to fix CVEs ([#2796](https://github.com/NousResearch/hermes-agent/pull/2796), [#2810](https://github.com/NousResearch/hermes-agent/pull/2810), [#2812](https://github.com/NousResearch/hermes-agent/pull/2812), [#2816](https://github.com/NousResearch/hermes-agent/pull/2816), [#3073](https://github.com/NousResearch/hermes-agent/pull/3073))
-
- **Anthropic output limits fix** — Replaced hardcoded 16K `max_tokens` with per-model native output limits (128K for Opus 4.6, 64K for Sonnet 4.6), fixing "Response truncated" and thinking-budget exhaustion on direct Anthropic API ([#3426](https://github.com/NousResearch/hermes-agent/pull/3426), [#3444](https://github.com/NousResearch/hermes-agent/pull/3444))
-
---
-
-## 🏗️ Core Agent & Architecture
-
-### New Provider: Hugging Face
- First-class Hugging Face Inference API integration with auth, setup wizard, and model picker ([#3419](https://github.com/NousResearch/hermes-agent/pull/3419))
- Curated model list mapping OpenRouter agentic defaults to HF equivalents — providers with 8+ curated models skip live `/models` probe for speed ([#3440](https://github.com/NousResearch/hermes-agent/pull/3440))
- Added glm-5-turbo to Z.AI provider model list ([#3095](https://github.com/NousResearch/hermes-agent/pull/3095))
-
-### Provider & Model Improvements
- `/model` command overhaul — extracted shared `switch_model()` pipeline for CLI and gateway, custom endpoint support, provider-aware routing ([#2795](https://github.com/NousResearch/hermes-agent/pull/2795), [#2799](https://github.com/NousResearch/hermes-agent/pull/2799))
- Removed `/model` slash command from CLI and gateway in favor of `hermes model` subcommand ([#3080](https://github.com/NousResearch/hermes-agent/pull/3080))
- Preserve `custom` provider instead of silently remapping to `openrouter` ([#2792](https://github.com/NousResearch/hermes-agent/pull/2792))
- Read root-level `provider` and `base_url` from config.yaml into model config ([#3112](https://github.com/NousResearch/hermes-agent/pull/3112))
- Align Nous Portal model slugs with OpenRouter naming ([#3253](https://github.com/NousResearch/hermes-agent/pull/3253))
- Fix Alibaba provider default endpoint and model list ([#3484](https://github.com/NousResearch/hermes-agent/pull/3484))
- Allow MiniMax users to override `/v1` → `/anthropic` auto-correction ([#3553](https://github.com/NousResearch/hermes-agent/pull/3553))
- Migrate OAuth token refresh to `platform.claude.com` with fallback ([#3246](https://github.com/NousResearch/hermes-agent/pull/3246))
-
-### Agent Loop & Conversation
- **Improved OpenAI model reliability** — `GPT_TOOL_USE_GUIDANCE` prevents GPT models from describing actions instead of calling tools + automatic budget warning stripping from history ([#3528](https://github.com/NousResearch/hermes-agent/pull/3528))
- **Surface lifecycle events** — All retry, fallback, and compression events now surface to the user as formatted messages ([#3153](https://github.com/NousResearch/hermes-agent/pull/3153))
- **Anthropic output limits** — Per-model native output limits instead of hardcoded 16K `max_tokens` ([#3426](https://github.com/NousResearch/hermes-agent/pull/3426))
- **Thinking-budget exhaustion detection** — Skip useless continuation retries when model uses all output tokens on reasoning ([#3444](https://github.com/NousResearch/hermes-agent/pull/3444))
- Always prefer streaming for API calls to prevent hung subagents ([#3120](https://github.com/NousResearch/hermes-agent/pull/3120))
- Restore safe non-streaming fallback after stream failures ([#3020](https://github.com/NousResearch/hermes-agent/pull/3020))
- Give subagents independent iteration budgets ([#3004](https://github.com/NousResearch/hermes-agent/pull/3004))
- Update `api_key` in `_try_activate_fallback` for subagent auth ([#3103](https://github.com/NousResearch/hermes-agent/pull/3103))
- Graceful return on max retries instead of crashing thread ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Count compression restarts toward retry limit ([#3070](https://github.com/NousResearch/hermes-agent/pull/3070))
- Include tool tokens in preflight estimate, guard context probe persistence ([#3164](https://github.com/NousResearch/hermes-agent/pull/3164))
- Update context compressor limits after fallback activation ([#3305](https://github.com/NousResearch/hermes-agent/pull/3305))
- Validate empty user messages to prevent Anthropic API 400 errors ([#3322](https://github.com/NousResearch/hermes-agent/pull/3322))
- GLM reasoning-only and max-length handling ([#3010](https://github.com/NousResearch/hermes-agent/pull/3010))
- Increase API timeout default from 900s to 1800s for slow-thinking models ([#3431](https://github.com/NousResearch/hermes-agent/pull/3431))
- Send `max_tokens` for Claude/OpenRouter + retry SSE connection errors ([#3497](https://github.com/NousResearch/hermes-agent/pull/3497))
- Prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode ([#2701](https://github.com/NousResearch/hermes-agent/pull/2701)) by @ctlst
-
-### Streaming & Reasoning
- **Persist reasoning across gateway session turns** with new schema v6 columns (`reasoning`, `reasoning_details`, `codex_reasoning_items`) ([#2974](https://github.com/NousResearch/hermes-agent/pull/2974))
- Detect and kill stale SSE connections ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Fix stale stream detector race causing spurious `RemoteProtocolError` ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Skip duplicate callback for `<think>`-extracted reasoning during streaming ([#3116](https://github.com/NousResearch/hermes-agent/pull/3116))
- Preserve reasoning fields in `rewrite_transcript` ([#3311](https://github.com/NousResearch/hermes-agent/pull/3311))
- Preserve Gemini thought signatures in streamed tool calls ([#2997](https://github.com/NousResearch/hermes-agent/pull/2997))
- Ensure first delta is fired during reasoning updates ([untagged commit](https://github.com/NousResearch/hermes-agent))
-
-### Session & Memory
- **Session search recent sessions mode** — Omit query to browse recent sessions with titles, previews, and timestamps ([#2533](https://github.com/NousResearch/hermes-agent/pull/2533))
- **Session config surfacing** on `/new`, `/reset`, and auto-reset ([#3321](https://github.com/NousResearch/hermes-agent/pull/3321))
- **Third-party session isolation** — `--source` flag for isolating sessions by origin ([#3255](https://github.com/NousResearch/hermes-agent/pull/3255))
- Add `/resume` CLI handler, session log truncation guard, `reopen_session` API ([#3315](https://github.com/NousResearch/hermes-agent/pull/3315))
- Clear compressor summary and turn counter on `/clear` and `/new` ([#3102](https://github.com/NousResearch/hermes-agent/pull/3102))
- Surface silent SessionDB failures that cause session data loss ([#2999](https://github.com/NousResearch/hermes-agent/pull/2999))
- Session search fallback preview on summarization failure ([#3478](https://github.com/NousResearch/hermes-agent/pull/3478))
- Prevent stale memory overwrites by flush agent ([#2687](https://github.com/NousResearch/hermes-agent/pull/2687))
-
-### Context Compression
- Replace dead `summary_target_tokens` with ratio-based scaling ([#2554](https://github.com/NousResearch/hermes-agent/pull/2554))
- Expose `compression.target_ratio`, `protect_last_n`, and `threshold` in `DEFAULT_CONFIG` ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Restore sane defaults and cap summary at 12K tokens ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Preserve transcript on `/compress` and hygiene compression ([#3556](https://github.com/NousResearch/hermes-agent/pull/3556))
- Update context pressure warnings and token estimates after compaction ([untagged commit](https://github.com/NousResearch/hermes-agent))
-
-### Architecture & Dependencies
- **Remove mini-swe-agent dependency** — Inline Docker and Modal backends directly ([#2804](https://github.com/NousResearch/hermes-agent/pull/2804))
- **Replace swe-rex with native Modal SDK** for Modal backend ([#3538](https://github.com/NousResearch/hermes-agent/pull/3538))
- **Plugin lifecycle hooks** — `pre_llm_call`, `post_llm_call`, `on_session_start`, `on_session_end` now fire in the agent loop ([#3542](https://github.com/NousResearch/hermes-agent/pull/3542))
- Fix plugin toolsets invisible in `hermes tools` and standalone processes ([#3457](https://github.com/NousResearch/hermes-agent/pull/3457))
- Consolidate `get_hermes_home()` and `parse_reasoning_effort()` ([#3062](https://github.com/NousResearch/hermes-agent/pull/3062))
- Remove unused Hermes-native PKCE OAuth flow ([#3107](https://github.com/NousResearch/hermes-agent/pull/3107))
- Remove ~100 unused imports across 55 files ([#3016](https://github.com/NousResearch/hermes-agent/pull/3016))
- Fix 154 f-strings, simplify getattr/URL patterns, remove dead code ([#3119](https://github.com/NousResearch/hermes-agent/pull/3119))
-
---
-
-## 📱 Messaging Platforms (Gateway)
-
-### Telegram
- **Private Chat Topics** — Project-based conversations with functional skill binding per topic, enabling isolated workflows within a single Telegram chat ([#3163](https://github.com/NousResearch/hermes-agent/pull/3163))
- **Auto-discover fallback IPs via DNS-over-HTTPS** when `api.telegram.org` is unreachable ([#3376](https://github.com/NousResearch/hermes-agent/pull/3376))
- **Configurable reply threading mode** ([#2907](https://github.com/NousResearch/hermes-agent/pull/2907))
- Fall back to no `thread_id` on "Message thread not found" BadRequest ([#3390](https://github.com/NousResearch/hermes-agent/pull/3390))
- Self-reschedule reconnect when `start_polling` fails after 502 ([#3268](https://github.com/NousResearch/hermes-agent/pull/3268))
-
-### Discord
- Stop phantom typing indicator after agent turn completes ([#3003](https://github.com/NousResearch/hermes-agent/pull/3003))
-
-### Slack
- Send tool call progress messages to correct Slack thread ([#3063](https://github.com/NousResearch/hermes-agent/pull/3063))
- Scope progress thread fallback to Slack only ([#3488](https://github.com/NousResearch/hermes-agent/pull/3488))
-
-### WhatsApp
- Download documents, audio, and video media from messages ([#2978](https://github.com/NousResearch/hermes-agent/pull/2978))
-
-### Matrix
- Add missing Matrix entry in `PLATFORMS` dict ([#3473](https://github.com/NousResearch/hermes-agent/pull/3473))
- Harden e2ee access-token handling ([#3562](https://github.com/NousResearch/hermes-agent/pull/3562))
- Add backoff for `SyncError` in sync loop ([#3280](https://github.com/NousResearch/hermes-agent/pull/3280))
-
-### Signal
- Track SSE keepalive comments as connection activity ([#3316](https://github.com/NousResearch/hermes-agent/pull/3316))
-
-### Email
- Prevent unbounded growth of `_seen_uids` in EmailAdapter ([#3490](https://github.com/NousResearch/hermes-agent/pull/3490))
-
-### Gateway Core
- **Config-gated `/verbose` command** for messaging platforms — toggle tool output verbosity from chat ([#3262](https://github.com/NousResearch/hermes-agent/pull/3262))
- **Background review notifications** delivered to user chat ([#3293](https://github.com/NousResearch/hermes-agent/pull/3293))
- **Retry transient send failures** and notify user on exhaustion ([#3288](https://github.com/NousResearch/hermes-agent/pull/3288))
- Recover from hung agents — `/stop` hard-kills session lock ([#3104](https://github.com/NousResearch/hermes-agent/pull/3104))
- Thread-safe `SessionStore` — protect `_entries` with `threading.Lock` ([#3052](https://github.com/NousResearch/hermes-agent/pull/3052))
- Fix gateway token double-counting with cached agents — use absolute set instead of increment ([#3306](https://github.com/NousResearch/hermes-agent/pull/3306), [#3317](https://github.com/NousResearch/hermes-agent/pull/3317))
- Fingerprint full auth token in agent cache signature ([#3247](https://github.com/NousResearch/hermes-agent/pull/3247))
- Silence background agent terminal output ([#3297](https://github.com/NousResearch/hermes-agent/pull/3297))
- Include per-platform `ALLOW_ALL` and `SIGNAL_GROUP` in startup allowlist check ([#3313](https://github.com/NousResearch/hermes-agent/pull/3313))
- Include user-local bin paths in systemd unit PATH ([#3527](https://github.com/NousResearch/hermes-agent/pull/3527))
- Track background task references in `GatewayRunner` ([#3254](https://github.com/NousResearch/hermes-agent/pull/3254))
- Add request timeouts to HA, Email, Mattermost, SMS adapters ([#3258](https://github.com/NousResearch/hermes-agent/pull/3258))
- Add media download retry to Mattermost, Slack, and base cache ([#3323](https://github.com/NousResearch/hermes-agent/pull/3323))
- Detect virtualenv path instead of hardcoding `venv/` ([#2797](https://github.com/NousResearch/hermes-agent/pull/2797))
- Use `TERMINAL_CWD` for context file discovery, not process cwd ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Stop loading hermes repo AGENTS.md into gateway sessions (~10k wasted tokens) ([#2891](https://github.com/NousResearch/hermes-agent/pull/2891))
-
---
-
-## 🖥️ CLI & User Experience
-
-### Interactive CLI
- **Configurable busy input mode** + fix `/queue` always working ([#3298](https://github.com/NousResearch/hermes-agent/pull/3298))
- **Preserve user input on multiline paste** ([#3065](https://github.com/NousResearch/hermes-agent/pull/3065))
- **Tool generation callback** — streaming "preparing terminal…" updates during tool argument generation ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Show tool progress for substantive tools, not just "preparing" ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Buffer reasoning preview chunks and fix duplicate display ([#3013](https://github.com/NousResearch/hermes-agent/pull/3013))
- Prevent reasoning box from rendering 3x during tool-calling loops ([#3405](https://github.com/NousResearch/hermes-agent/pull/3405))
- Eliminate "Event loop is closed" / "Press ENTER to continue" during idle — three-layer fix with `neuter_async_httpx_del()`, custom exception handler, and stale client cleanup ([#3398](https://github.com/NousResearch/hermes-agent/pull/3398))
- Fix status bar shows 26K instead of 260K for token counts with trailing zeros ([#3024](https://github.com/NousResearch/hermes-agent/pull/3024))
- Fix status bar duplicates and degrades during long sessions ([#3291](https://github.com/NousResearch/hermes-agent/pull/3291))
- Refresh TUI before background task output to prevent status bar overlap ([#3048](https://github.com/NousResearch/hermes-agent/pull/3048))
- Suppress KawaiiSpinner animation under `patch_stdout` ([#2994](https://github.com/NousResearch/hermes-agent/pull/2994))
- Skip KawaiiSpinner when TUI handles tool progress ([#2973](https://github.com/NousResearch/hermes-agent/pull/2973))
- Guard `isatty()` against closed streams via `_is_tty` property ([#3056](https://github.com/NousResearch/hermes-agent/pull/3056))
- Ensure single closure of streaming boxes during tool generation ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Cap context pressure percentage at 100% in display ([#3480](https://github.com/NousResearch/hermes-agent/pull/3480))
- Clean up HTML error messages in CLI display ([#3069](https://github.com/NousResearch/hermes-agent/pull/3069))
- Show HTTP status code and 400 body in API error output ([#3096](https://github.com/NousResearch/hermes-agent/pull/3096))
- Extract useful info from HTML error pages, dump debug on max retries ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Prevent TypeError on startup when `base_url` is None ([#3068](https://github.com/NousResearch/hermes-agent/pull/3068))
- Prevent update crash in non-TTY environments ([#3094](https://github.com/NousResearch/hermes-agent/pull/3094))
- Handle EOFError in sessions delete/prune confirmation prompts ([#3101](https://github.com/NousResearch/hermes-agent/pull/3101))
- Catch KeyboardInterrupt during `flush_memories` on exit and in exit cleanup handlers ([#3025](https://github.com/NousResearch/hermes-agent/pull/3025), [#3257](https://github.com/NousResearch/hermes-agent/pull/3257))
- Guard `.strip()` against None values from YAML config ([#3552](https://github.com/NousResearch/hermes-agent/pull/3552))
- Guard `config.get()` against YAML null values to prevent AttributeError ([#3377](https://github.com/NousResearch/hermes-agent/pull/3377))
- Store asyncio task references to prevent GC mid-execution ([#3267](https://github.com/NousResearch/hermes-agent/pull/3267))
-
-### Setup & Configuration
- Use explicit key mapping for returning-user menu dispatch instead of positional index ([#3083](https://github.com/NousResearch/hermes-agent/pull/3083))
- Use `sys.executable` for pip in update commands to fix PEP 668 ([#3099](https://github.com/NousResearch/hermes-agent/pull/3099))
- Harden `hermes update` against diverged history, non-main branches, and gateway edge cases ([#3492](https://github.com/NousResearch/hermes-agent/pull/3492))
- OpenClaw migration overwrites defaults and setup wizard skips imported sections — fixed ([#3282](https://github.com/NousResearch/hermes-agent/pull/3282))
- Stop recursive AGENTS.md walk, load top-level only ([#3110](https://github.com/NousResearch/hermes-agent/pull/3110))
- Add macOS Homebrew paths to browser and terminal PATH resolution ([#2713](https://github.com/NousResearch/hermes-agent/pull/2713))
- YAML boolean handling for `tool_progress` config ([#3300](https://github.com/NousResearch/hermes-agent/pull/3300))
- Reset default SOUL.md to baseline identity text ([#3159](https://github.com/NousResearch/hermes-agent/pull/3159))
- Reject relative cwd paths for container terminal backends ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Add explicit `hermes-api-server` toolset for API server platform ([#3304](https://github.com/NousResearch/hermes-agent/pull/3304))
- Reorder setup wizard providers — OpenRouter first ([untagged commit](https://github.com/NousResearch/hermes-agent))
-
---
-
-## 🔧 Tool System
-
-### API Server
- **Idempotency-Key support**, body size limit, and OpenAI error envelope ([#2903](https://github.com/NousResearch/hermes-agent/pull/2903))
- Allow Idempotency-Key in CORS headers ([#3530](https://github.com/NousResearch/hermes-agent/pull/3530))
- Cancel orphaned agent + true interrupt on SSE disconnect ([#3427](https://github.com/NousResearch/hermes-agent/pull/3427))
- Fix streaming breaks when agent makes tool calls ([#2985](https://github.com/NousResearch/hermes-agent/pull/2985))
-
-### Terminal & File Operations
- Handle addition-only hunks in V4A patch parser ([#3325](https://github.com/NousResearch/hermes-agent/pull/3325))
- Exponential backoff for persistent shell polling ([#2996](https://github.com/NousResearch/hermes-agent/pull/2996))
- Add timeout to subprocess calls in `context_references` ([#3469](https://github.com/NousResearch/hermes-agent/pull/3469))
-
-### Browser & Vision
- Handle 402 insufficient credits error in vision tool ([#2802](https://github.com/NousResearch/hermes-agent/pull/2802))
- Fix `browser_vision` ignores `auxiliary.vision.timeout` config ([#2901](https://github.com/NousResearch/hermes-agent/pull/2901))
- Make browser command timeout configurable via config.yaml ([#2801](https://github.com/NousResearch/hermes-agent/pull/2801))
-
-### MCP
- MCP toolset resolution for runtime and config ([#3252](https://github.com/NousResearch/hermes-agent/pull/3252))
- Add MCP tool name collision protection ([#3077](https://github.com/NousResearch/hermes-agent/pull/3077))
-
-### Auxiliary LLM
- Guard aux LLM calls against None content + reasoning fallback + retry ([#3449](https://github.com/NousResearch/hermes-agent/pull/3449))
- Catch ImportError from `build_anthropic_client` in vision auto-detection ([#3312](https://github.com/NousResearch/hermes-agent/pull/3312))
-
-### Other Tools
- Add request timeouts to `send_message_tool` HTTP calls ([#3162](https://github.com/NousResearch/hermes-agent/pull/3162)) by @memosr
- Auto-repair `jobs.json` with invalid control characters ([#3537](https://github.com/NousResearch/hermes-agent/pull/3537))
- Enable fine-grained tool streaming for Claude/OpenRouter ([#3497](https://github.com/NousResearch/hermes-agent/pull/3497))
-
---
-
-## 🧩 Skills Ecosystem
-
-### Skills System
- **Env var passthrough** for skills and user config — skills can declare environment variables to pass through ([#2807](https://github.com/NousResearch/hermes-agent/pull/2807))
- Cache skills prompt with shared `skill_utils` module for faster TTFT ([#3421](https://github.com/NousResearch/hermes-agent/pull/3421))
- Avoid redundant file re-read for skill conditions ([#2992](https://github.com/NousResearch/hermes-agent/pull/2992))
- Use Git Trees API to prevent silent subdirectory loss during install ([#2995](https://github.com/NousResearch/hermes-agent/pull/2995))
- Fix skills-sh install for deeply nested repo structures ([#2980](https://github.com/NousResearch/hermes-agent/pull/2980))
- Handle null metadata in skill frontmatter ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Preserve trust for skills-sh identifiers + reduce resolution churn ([#3251](https://github.com/NousResearch/hermes-agent/pull/3251))
- Agent-created skills were incorrectly treated as untrusted community content — fixed ([untagged commit](https://github.com/NousResearch/hermes-agent))
-
-### New Skills
- **G0DM0D3 godmode jailbreaking skill** + docs ([#3157](https://github.com/NousResearch/hermes-agent/pull/3157))
- **Docker management skill** added to optional-skills ([#3060](https://github.com/NousResearch/hermes-agent/pull/3060))
- **OpenClaw migration v2** — 17 new modules, terminal recap for migrating from OpenClaw to Hermes ([#2906](https://github.com/NousResearch/hermes-agent/pull/2906))
-
---
-
-## 🔒 Security & Reliability
-
-### Security Hardening
- **SSRF protection** added to `browser_navigate` ([#3058](https://github.com/NousResearch/hermes-agent/pull/3058))
- **SSRF protection** added to `vision_tools` and `web_tools` (hardened) ([#2679](https://github.com/NousResearch/hermes-agent/pull/2679))
- **Restrict subagent toolsets** to parent's enabled set ([#3269](https://github.com/NousResearch/hermes-agent/pull/3269))
- **Prevent zip-slip path traversal** in self-update ([#3250](https://github.com/NousResearch/hermes-agent/pull/3250))
- **Prevent shell injection** in `_expand_path` via `~user` path suffix ([#2685](https://github.com/NousResearch/hermes-agent/pull/2685))
- **Normalize input** before dangerous command detection ([#3260](https://github.com/NousResearch/hermes-agent/pull/3260))
- Make tirith block verdicts approvable instead of hard-blocking ([#3428](https://github.com/NousResearch/hermes-agent/pull/3428))
- Remove compromised `litellm`/`typer`/`platformdirs` from deps ([#2796](https://github.com/NousResearch/hermes-agent/pull/2796))
- Pin all dependency version ranges ([#2810](https://github.com/NousResearch/hermes-agent/pull/2810))
- Regenerate `uv.lock` with hashes, use lockfile in setup ([#2812](https://github.com/NousResearch/hermes-agent/pull/2812))
- Bump dependencies to fix CVEs + regenerate `uv.lock` ([#3073](https://github.com/NousResearch/hermes-agent/pull/3073))
- Supply chain audit CI workflow for PR scanning ([#2816](https://github.com/NousResearch/hermes-agent/pull/2816))
-
-### Reliability
- **SQLite WAL write-lock contention** causing 15-20s TUI freeze — fixed ([#3385](https://github.com/NousResearch/hermes-agent/pull/3385))
- **SQLite concurrency hardening** + session transcript integrity ([#3249](https://github.com/NousResearch/hermes-agent/pull/3249))
- Prevent recurring cron job re-fire on gateway crash/restart loop ([#3396](https://github.com/NousResearch/hermes-agent/pull/3396))
- Mark cron session as ended after job completes ([#2998](https://github.com/NousResearch/hermes-agent/pull/2998))
-
---
-
-## ⚡ Performance
-
- **TTFT startup optimizations** — salvaged easy-win startup improvements ([#3395](https://github.com/NousResearch/hermes-agent/pull/3395))
- Cache skills prompt with shared `skill_utils` module ([#3421](https://github.com/NousResearch/hermes-agent/pull/3421))
- Avoid redundant file re-read for skill conditions in prompt builder ([#2992](https://github.com/NousResearch/hermes-agent/pull/2992))
-
---
-
-## 🐛 Notable Bug Fixes
-
- Fix gateway token double-counting with cached agents ([#3306](https://github.com/NousResearch/hermes-agent/pull/3306), [#3317](https://github.com/NousResearch/hermes-agent/pull/3317))
- Fix "Event loop is closed" / "Press ENTER to continue" during idle sessions ([#3398](https://github.com/NousResearch/hermes-agent/pull/3398))
- Fix reasoning box rendering 3x during tool-calling loops ([#3405](https://github.com/NousResearch/hermes-agent/pull/3405))
- Fix status bar showing 26K instead of 260K for token counts ([#3024](https://github.com/NousResearch/hermes-agent/pull/3024))
- Fix `/queue` always working regardless of config ([#3298](https://github.com/NousResearch/hermes-agent/pull/3298))
- Fix phantom Discord typing indicator after agent turn ([#3003](https://github.com/NousResearch/hermes-agent/pull/3003))
- Fix Slack progress messages appearing in wrong thread ([#3063](https://github.com/NousResearch/hermes-agent/pull/3063))
- Fix WhatsApp media downloads (documents, audio, video) ([#2978](https://github.com/NousResearch/hermes-agent/pull/2978))
- Fix Telegram "Message thread not found" killing progress messages ([#3390](https://github.com/NousResearch/hermes-agent/pull/3390))
- Fix OpenClaw migration overwriting defaults ([#3282](https://github.com/NousResearch/hermes-agent/pull/3282))
- Fix returning-user setup menu dispatching wrong section ([#3083](https://github.com/NousResearch/hermes-agent/pull/3083))
- Fix `hermes update` PEP 668 "externally-managed-environment" error ([#3099](https://github.com/NousResearch/hermes-agent/pull/3099))
- Fix subagents hitting `max_iterations` prematurely via shared budget ([#3004](https://github.com/NousResearch/hermes-agent/pull/3004))
- Fix YAML boolean handling for `tool_progress` config ([#3300](https://github.com/NousResearch/hermes-agent/pull/3300))
- Fix `config.get()` crashes on YAML null values ([#3377](https://github.com/NousResearch/hermes-agent/pull/3377))
- Fix `.strip()` crash on None values from YAML config ([#3552](https://github.com/NousResearch/hermes-agent/pull/3552))
- Fix hung agents on gateway — `/stop` now hard-kills session lock ([#3104](https://github.com/NousResearch/hermes-agent/pull/3104))
- Fix `_custom` provider silently remapped to `openrouter` ([#2792](https://github.com/NousResearch/hermes-agent/pull/2792))
- Fix Matrix missing from `PLATFORMS` dict ([#3473](https://github.com/NousResearch/hermes-agent/pull/3473))
- Fix Email adapter unbounded `_seen_uids` growth ([#3490](https://github.com/NousResearch/hermes-agent/pull/3490))
-
---
-
-## 🧪 Testing
-
- Pin `agent-client-protocol` < 0.9 to handle breaking upstream release ([#3320](https://github.com/NousResearch/hermes-agent/pull/3320))
- Catch anthropic ImportError in vision auto-detection tests ([#3312](https://github.com/NousResearch/hermes-agent/pull/3312))
- Update retry-exhaust test for new graceful return behavior ([#3320](https://github.com/NousResearch/hermes-agent/pull/3320))
- Add regression tests for null metadata frontmatter ([untagged commit](https://github.com/NousResearch/hermes-agent))
-
---
-
-## 📚 Documentation
-
- Update all docs for `/model` command overhaul and custom provider support ([#2800](https://github.com/NousResearch/hermes-agent/pull/2800))
- Fix stale and incorrect documentation across 18 files ([#2805](https://github.com/NousResearch/hermes-agent/pull/2805))
- Document 9 previously undocumented features ([#2814](https://github.com/NousResearch/hermes-agent/pull/2814))
- Add missing skills, CLI commands, and messaging env vars to docs ([#2809](https://github.com/NousResearch/hermes-agent/pull/2809))
- Fix api-server response storage documentation — SQLite, not in-memory ([#2819](https://github.com/NousResearch/hermes-agent/pull/2819))
- Quote pip install extras to fix zsh glob errors ([#2815](https://github.com/NousResearch/hermes-agent/pull/2815))
- Unify hooks documentation — add plugin hooks to hooks page, add `session:end` event ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Clarify two-mode behavior in `session_search` schema description ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Fix Discord Public Bot setting for Discord-provided invite link ([#3519](https://github.com/NousResearch/hermes-agent/pull/3519)) by @mehmoodosman
- Revise v0.4.0 changelog — fix feature attribution, reorder sections ([untagged commit](https://github.com/NousResearch/hermes-agent))
-
---
-
-## 👥 Contributors
-
-### Core
- **@teknium1** — 157 PRs covering the full scope of this release
-
-### Community Contributors
- **@alt-glitch** (Siddharth Balyan) — 2 PRs: Nix flake with uv2nix build, NixOS module, and persistent container mode ([#20](https://github.com/NousResearch/hermes-agent/pull/20)); auto-generated config keys and suffix PATHs for Nix builds ([#3061](https://github.com/NousResearch/hermes-agent/pull/3061), [#3274](https://github.com/NousResearch/hermes-agent/pull/3274))
- **@ctlst** — 1 PR: Prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode ([#2701](https://github.com/NousResearch/hermes-agent/pull/2701))
- **@memosr** (memosr.eth) — 1 PR: Add request timeouts to `send_message_tool` HTTP calls ([#3162](https://github.com/NousResearch/hermes-agent/pull/3162))
- **@mehmoodosman** (Osman Mehmood) — 1 PR: Fix Discord docs for Public Bot setting ([#3519](https://github.com/NousResearch/hermes-agent/pull/3519))
-
-### All Contributors
-@alt-glitch, @ctlst, @mehmoodosman, @memosr, @teknium1
-
---
-
-**Full Changelog**: [v2026.3.23...v2026.3.28](https://github.com/NousResearch/hermes-agent/compare/v2026.3.23...v2026.3.28)
@@ -74,7 +74,7 @@ def main() -> None:

    agent = HermesACPAgent()
    try:
-        asyncio.run(acp.run_agent(agent, use_unstable_protocol=True))
+        asyncio.run(acp.run_agent(agent))
    except KeyboardInterrupt:
        logger.info("Shutting down (KeyboardInterrupt)")
    except Exception:
@@ -25,9 +25,6 @@ from acp.schema import (
    NewSessionResponse,
    PromptResponse,
    ResumeSessionResponse,
-    SetSessionConfigOptionResponse,
-    SetSessionModelResponse,
-    SetSessionModeResponse,
    ResourceContentBlock,
    SessionCapabilities,
    SessionForkCapabilities,
@@ -97,14 +94,11 @@ class HermesACPAgent(acp.Agent):

    async def initialize(
        self,
-        protocol_version: int | None = None,
+        protocol_version: int,
        client_capabilities: ClientCapabilities | None = None,
        client_info: Implementation | None = None,
        **kwargs: Any,
    ) -> InitializeResponse:
-        resolved_protocol_version = (
-            protocol_version if isinstance(protocol_version, int) else acp.PROTOCOL_VERSION
-        )
        provider = detect_provider()
        auth_methods = None
        if provider:
@@ -117,11 +111,7 @@ class HermesACPAgent(acp.Agent):
            ]

        client_name = client_info.name if client_info else "unknown"
-        logger.info(
-            "Initialize from %s (protocol v%s)",
-            client_name,
-            resolved_protocol_version,
-        )
+        logger.info("Initialize from %s (protocol v%s)", client_name, protocol_version)

        return InitializeResponse(
            protocol_version=acp.PROTOCOL_VERSION,
@@ -481,7 +471,7 @@ class HermesACPAgent(acp.Agent):

    async def set_session_model(
        self, model_id: str, session_id: str, **kwargs: Any
-    ) -> SetSessionModelResponse | None:
+    ):
        """Switch the model for a session (called by ACP protocol)."""
        state = self.session_manager.get_session(session_id)
        if state:
@@ -499,37 +489,4 @@ class HermesACPAgent(acp.Agent):
            )
            self.session_manager.save_session(session_id)
            logger.info("Session %s: model switched to %s", session_id, model_id)
-            return SetSessionModelResponse()
-        logger.warning("Session %s: model switch requested for missing session", session_id)
        return None
-
-    async def set_session_mode(
-        self, mode_id: str, session_id: str, **kwargs: Any
-    ) -> SetSessionModeResponse | None:
-        """Persist the editor-requested mode so ACP clients do not fail on mode switches."""
-        state = self.session_manager.get_session(session_id)
-        if state is None:
-            logger.warning("Session %s: mode switch requested for missing session", session_id)
-            return None
-        setattr(state, "mode", mode_id)
-        self.session_manager.save_session(session_id)
-        logger.info("Session %s: mode switched to %s", session_id, mode_id)
-        return SetSessionModeResponse()
-
-    async def set_config_option(
-        self, config_id: str, session_id: str, value: str, **kwargs: Any
-    ) -> SetSessionConfigOptionResponse | None:
-        """Accept ACP config option updates even when Hermes has no typed ACP config surface yet."""
-        state = self.session_manager.get_session(session_id)
-        if state is None:
-            logger.warning("Session %s: config update requested for missing session", session_id)
-            return None
-
-        options = getattr(state, "config_options", None)
-        if not isinstance(options, dict):
-            options = {}
-        options[str(config_id)] = value
-        setattr(state, "config_options", options)
-        self.session_manager.save_session(session_id)
-        logger.info("Session %s: config option %s updated", session_id, config_id)
-        return SetSessionConfigOptionResponse(config_options=[])
@@ -35,54 +35,6 @@ ADAPTIVE_EFFORT_MAP = {
    "minimal": "low",
 }

-# ── Max output token limits per Anthropic model ───────────────────────
-# Source: Anthropic docs + Cline model catalog.  Anthropic's API requires
-# max_tokens as a mandatory field.  Previously we hardcoded 16384, which
-# starves thinking-enabled models (thinking tokens count toward the limit).
-_ANTHROPIC_OUTPUT_LIMITS = {
-    # Claude 4.6
-    "claude-opus-4-6":   128_000,
-    "claude-sonnet-4-6":  64_000,
-    # Claude 4.5
-    "claude-opus-4-5":    64_000,
-    "claude-sonnet-4-5":  64_000,
-    "claude-haiku-4-5":   64_000,
-    # Claude 4
-    "claude-opus-4":      32_000,
-    "claude-sonnet-4":    64_000,
-    # Claude 3.7
-    "claude-3-7-sonnet": 128_000,
-    # Claude 3.5
-    "claude-3-5-sonnet":   8_192,
-    "claude-3-5-haiku":    8_192,
-    # Claude 3
-    "claude-3-opus":       4_096,
-    "claude-3-sonnet":     4_096,
-    "claude-3-haiku":      4_096,
-}
-
-# For any model not in the table, assume the highest current limit.
-# Future Anthropic models are unlikely to have *less* output capacity.
-_ANTHROPIC_DEFAULT_OUTPUT_LIMIT = 128_000
-
-
-def _get_anthropic_max_output(model: str) -> int:
-    """Look up the max output token limit for an Anthropic model.
-
-    Uses substring matching against _ANTHROPIC_OUTPUT_LIMITS so date-stamped
-    model IDs (claude-sonnet-4-5-20250929) and variant suffixes (:1m, :fast)
-    resolve correctly.  Longest-prefix match wins to avoid e.g. "claude-3-5"
-    matching before "claude-3-5-sonnet".
-    """
-    m = model.lower()
-    best_key = ""
-    best_val = _ANTHROPIC_DEFAULT_OUTPUT_LIMIT
-    for key, val in _ANTHROPIC_OUTPUT_LIMITS.items():
-        if key in m and len(key) > len(best_key):
-            best_key = key
-            best_val = val
-    return best_val
-

 def _supports_adaptive_thinking(model: str) -> bool:
    """Return True for Claude 4.6 models that support adaptive thinking."""
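Reviewer note: the longest-match rule used by the removed `_get_anthropic_max_output` can be sketched in isolation. The table below is a trimmed copy of the removed `_ANTHROPIC_OUTPUT_LIMITS`, and the name `max_output_for` is hypothetical; this is an illustration of the matching rule under those assumptions, not the project's code.

```python
# Trimmed copy of the removed table; values come from the hunk above.
OUTPUT_LIMITS = {
    "claude-opus-4-6": 128_000,
    "claude-opus-4": 32_000,
    "claude-3-5-sonnet": 8_192,
}
DEFAULT_LIMIT = 128_000  # fallback for models missing from the table


def max_output_for(model: str) -> int:
    """Return the limit of the longest table key contained in the model ID."""
    m = model.lower()
    best_key, best_val = "", DEFAULT_LIMIT
    for key, val in OUTPUT_LIMITS.items():
        # A longer key is more specific, so it overrides a shorter match.
        if key in m and len(key) > len(best_key):
            best_key, best_val = key, val
    return best_val
```

Because "claude-opus-4" is a substring of "claude-opus-4-6", the length comparison is what keeps a date-stamped Opus 4.6 ID from resolving to the Opus 4 limit.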
@@ -107,7 +59,6 @@ _OAUTH_ONLY_BETAS = [
 # The version must stay reasonably current — Anthropic rejects OAuth requests
 # when the spoofed user-agent version is too far behind the actual release.
 _CLAUDE_CODE_VERSION_FALLBACK = "2.1.74"
-_claude_code_version_cache: Optional[str] = None


 def _detect_claude_code_version() -> str:
@@ -135,18 +86,11 @@ def _detect_claude_code_version() -> str:
    return _CLAUDE_CODE_VERSION_FALLBACK


+_CLAUDE_CODE_VERSION = _detect_claude_code_version()
 _CLAUDE_CODE_SYSTEM_PREFIX = "You are Claude Code, Anthropic's official CLI for Claude."
 _MCP_TOOL_PREFIX = "mcp_"


-def _get_claude_code_version() -> str:
-    """Lazily detect the installed Claude Code version when OAuth headers need it."""
-    global _claude_code_version_cache
-    if _claude_code_version_cache is None:
-        _claude_code_version_cache = _detect_claude_code_version()
-    return _claude_code_version_cache
-
-
 def _is_oauth_token(key: str) -> bool:
    """Check if the key is an OAuth/setup token (not a regular Console API key).

@@ -188,7 +132,7 @@ def build_anthropic_client(api_key: str, base_url: str = None):
        kwargs["auth_token"] = api_key
        kwargs["default_headers"] = {
            "anthropic-beta": ",".join(all_betas),
-            "user-agent": f"claude-cli/{_get_claude_code_version()} (external, cli)",
+            "user-agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
            "x-app": "cli",
        }
    else:
@@ -297,7 +241,7 @@ def _refresh_oauth_token(creds: Dict[str, Any]) -> Optional[str]:

    headers = {
        "Content-Type": "application/json",
-        "User-Agent": f"claude-cli/{_get_claude_code_version()} (external, cli)",
+        "User-Agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
    }

    for endpoint in token_endpoints:
@@ -762,21 +706,14 @@ def convert_messages_to_anthropic(
                result.append({"role": "user", "content": [tool_result]})
            continue

-        # Regular user message — validate non-empty content (Anthropic rejects empty)
+        # Regular user message
        if isinstance(content, list):
            converted_blocks = _convert_content_to_anthropic(content)
-            # Check if all text blocks are empty
-            if not converted_blocks or all(
-                b.get("text", "").strip() == ""
-                for b in converted_blocks
-                if isinstance(b, dict) and b.get("type") == "text"
-            ):
-                converted_blocks = [{"type": "text", "text": "(empty message)"}]
-            result.append({"role": "user", "content": converted_blocks})
+            result.append({
+                "role": "user",
+                "content": converted_blocks or [{"type": "text", "text": ""}],
+            })
        else:
-            # Validate string content is non-empty
-            if not content or (isinstance(content, str) and not content.strip()):
-                content = "(empty message)"
            result.append({"role": "user", "content": content})

    # Strip orphaned tool_use blocks (no matching tool_result follows)
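Reviewer note: the removed validation relied on `all(...)` over only the text blocks. In Python, `all()` over an empty iterable is `True`, so a list holding only non-text blocks (for example an image) would also have been replaced by the placeholder. The sketch below is a hypothetical variant that keeps such content; the helper name `normalize_user_content` and the block shapes are assumptions for illustration only.

```python
PLACEHOLDER = "(empty message)"


def normalize_user_content(content):
    """Replace empty user content with a placeholder, keeping non-text blocks."""
    if content is None or isinstance(content, str):
        return content if content and content.strip() else PLACEHOLDER

    blocks = [b for b in content if isinstance(b, dict)]
    has_text = any(
        b.get("type") == "text" and (b.get("text") or "").strip()
        for b in blocks
    )
    has_non_text = any(b.get("type") != "text" for b in blocks)
    if has_text or has_non_text:
        return content  # something meaningful is present; leave it untouched
    return [{"type": "text", "text": PLACEHOLDER}]
```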
@@ -866,15 +803,9 @@ def build_anthropic_kwargs(
    tool_choice: Optional[str] = None,
    is_oauth: bool = False,
    preserve_dots: bool = False,
-    context_length: Optional[int] = None,
 ) -> Dict[str, Any]:
    """Build kwargs for anthropic.messages.create().

-    When *max_tokens* is None, the model's native output limit is used
-    (e.g. 128K for Opus 4.6, 64K for Sonnet 4.6).  If *context_length*
-    is provided, the effective limit is clamped so it doesn't exceed
-    the context window.
-
    When *is_oauth* is True, applies Claude Code compatibility transforms:
    system prompt prefix, tool name prefixing, and prompt sanitization.

@@ -885,12 +816,7 @@ def build_anthropic_kwargs(
    anthropic_tools = convert_tools_to_anthropic(tools) if tools else []

    model = normalize_model_name(model, preserve_dots=preserve_dots)
-    effective_max_tokens = max_tokens or _get_anthropic_max_output(model)
-
-    # Clamp to context window if the user set a lower context_length
-    # (e.g. custom endpoint with limited capacity).
-    if context_length and effective_max_tokens > context_length:
-        effective_max_tokens = max(context_length - 1, 1)
+    effective_max_tokens = max_tokens or 16384

    # ── OAuth: Claude Code identity ──────────────────────────────────
    if is_oauth:
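Reviewer note: the removed clamping logic amounts to a small pure function. The sketch below restates it with the model's native limit passed in as a parameter; the name `effective_max_tokens` and that parameterization are assumptions made for illustration.

```python
from typing import Optional


def effective_max_tokens(
    max_tokens: Optional[int],
    native_limit: int,
    context_length: Optional[int] = None,
) -> int:
    """Pick the output budget, then clamp it below the context window."""
    effective = max_tokens or native_limit
    if context_length and effective > context_length:
        # Leave at least one token of room, and never return less than 1.
        effective = max(context_length - 1, 1)
    return effective
```

With this change applied, the fallback is the fixed value 16384 again, so no clamping takes place.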
@@ -693,13 +693,7 @@ def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
    is_oauth = _is_oauth_token(token)
    model = _API_KEY_PROVIDER_AUX_MODELS.get("anthropic", "claude-haiku-4-5-20251001")
    logger.debug("Auxiliary client: Anthropic native (%s) at %s (oauth=%s)", model, base_url, is_oauth)
-    try:
-        real_client = build_anthropic_client(token, base_url)
-    except ImportError:
-        # The anthropic_adapter module imports fine but the SDK itself is
-        # missing — build_anthropic_client raises ImportError at call time
-        # when _anthropic_sdk is None.  Treat as unavailable.
-        return None, None
+    real_client = build_anthropic_client(token, base_url)
    return AnthropicAuxiliaryClient(real_client, model, token, base_url, is_oauth=is_oauth), model


@@ -1137,13 +1131,7 @@ def resolve_vision_provider_client(
        return "custom", client, final_model

    if requested == "auto":
-        ordered = list(_VISION_AUTO_PROVIDER_ORDER)
-        preferred = _preferred_main_vision_provider()
-        if preferred in ordered:
-            ordered.remove(preferred)
-            ordered.insert(0, preferred)
-
-        for candidate in ordered:
+        for candidate in get_available_vision_backends():
            sync_client, default_model = _resolve_strict_vision_backend(candidate)
            if sync_client is not None:
                return _finalize(candidate, sync_client, default_model)
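Reviewer note: the removed ordering step moves the preferred provider to the front of the default order before trying each candidate. A minimal sketch follows; the function name and the provider names in the usage line are placeholders, not the project's actual values.

```python
from typing import Iterable, List, Optional


def order_candidates(default_order: Iterable[str], preferred: Optional[str]) -> List[str]:
    """Return a copy of default_order with `preferred` moved to the front."""
    ordered = list(default_order)
    if preferred in ordered:
        ordered.remove(preferred)
        ordered.insert(0, preferred)
    return ordered


# Hypothetical provider names, for illustration only.
print(order_candidates(["provider-a", "provider-b", "provider-c"], "provider-c"))
```

A preferred provider that is not in the default order is ignored, so the candidate list is never extended.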
@@ -1216,39 +1204,6 @@ _client_cache: Dict[tuple, tuple] = {}
 _client_cache_lock = threading.Lock()


-def neuter_async_httpx_del() -> None:
-    """Monkey-patch ``AsyncHttpxClientWrapper.__del__`` to be a no-op.
-
-    The OpenAI SDK's ``AsyncHttpxClientWrapper.__del__`` schedules
-    ``self.aclose()`` via ``asyncio.get_running_loop().create_task()``.
-    When an ``AsyncOpenAI`` client is garbage-collected while
-    prompt_toolkit's event loop is running (the common CLI idle state),
-    the ``aclose()`` task runs on prompt_toolkit's loop but the
-    underlying TCP transport is bound to a *different* loop (the worker
-    thread's loop that the client was originally created on).  If that
-    loop is closed or its thread is dead, the transport's
-    ``self._loop.call_soon()`` raises ``RuntimeError("Event loop is
-    closed")``, which prompt_toolkit surfaces as "Unhandled exception
-    in event loop ... Press ENTER to continue...".
-
-    Neutering ``__del__`` is safe because:
-    - Cached clients are explicitly cleaned via ``_force_close_async_httpx``
-      on stale-loop detection and ``shutdown_cached_clients`` on exit.
-    - Uncached clients' TCP connections are cleaned up by the OS when the
-      process exits.
-    - The OpenAI SDK itself marks this as a TODO (``# TODO(someday):
-      support non asyncio runtimes here``).
-
-    Call this once at CLI startup, before any ``AsyncOpenAI`` clients are
-    created.
-    """
-    try:
-        from openai._base_client import AsyncHttpxClientWrapper
-        AsyncHttpxClientWrapper.__del__ = lambda self: None  # type: ignore[assignment]
-    except (ImportError, AttributeError):
-        pass  # Graceful degradation if the SDK changes its internals
-
-
 def _force_close_async_httpx(client: Any) -> None:
    """Mark the httpx AsyncClient inside an AsyncOpenAI client as closed.

@@ -1296,25 +1251,6 @@ def shutdown_cached_clients() -> None:
        _client_cache.clear()


-def cleanup_stale_async_clients() -> None:
-    """Force-close cached async clients whose event loop is closed.
-
-    Call this after each agent turn to proactively clean up stale clients
-    before GC can trigger ``AsyncHttpxClientWrapper.__del__`` on them.
-    This is defense-in-depth — the primary fix is ``neuter_async_httpx_del``
-    which disables ``__del__`` entirely.
-    """
-    with _client_cache_lock:
-        stale_keys = []
-        for key, entry in _client_cache.items():
-            client, _default, cached_loop = entry
-            if cached_loop is not None and cached_loop.is_closed():
-                _force_close_async_httpx(client)
-                stale_keys.append(key)
-        for key in stale_keys:
-            del _client_cache[key]
-
-
 def _get_cached_client(
    provider: str,
    model: str = None,
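Reviewer note: the removed `cleanup_stale_async_clients` evicts cache entries whose event loop has been closed. The sketch below shows that pattern on a plain dict, assuming entries shaped like `(client, default_model, loop)` as in the removed code. The `on_stale` callback is a hypothetical stand-in for `_force_close_async_httpx`, and locking is omitted.

```python
import asyncio
from typing import Any, Callable, Dict, List, Optional, Tuple

Entry = Tuple[Any, Any, Optional[asyncio.AbstractEventLoop]]


def drop_stale_entries(
    cache: Dict[tuple, Entry],
    on_stale: Callable[[Any], None] = lambda client: None,
) -> List[tuple]:
    """Remove entries bound to a closed loop and return their keys."""
    # Collect keys first so the dict is not mutated while iterating.
    stale = [
        key
        for key, (_client, _default, loop) in cache.items()
        if loop is not None and loop.is_closed()
    ]
    for key in stale:
        on_stale(cache[key][0])
        del cache[key]
    return stale
```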
@@ -1458,29 +1394,6 @@ def _resolve_task_provider_model(
    return "auto", resolved_model, None, None


-_DEFAULT_AUX_TIMEOUT = 30.0
-
-
-def _get_task_timeout(task: str, default: float = _DEFAULT_AUX_TIMEOUT) -> float:
-    """Read timeout from auxiliary.{task}.timeout in config, falling back to *default*."""
-    if not task:
-        return default
-    try:
-        from hermes_cli.config import load_config
-        config = load_config()
-    except ImportError:
-        return default
-    aux = config.get("auxiliary", {}) if isinstance(config, dict) else {}
-    task_config = aux.get(task, {}) if isinstance(aux, dict) else {}
-    raw = task_config.get("timeout")
-    if raw is not None:
-        try:
-            return float(raw)
-        except (ValueError, TypeError):
-            pass
-    return default
-
-
 def _build_call_kwargs(
    provider: str,
    model: str,
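Reviewer note: the removed `_get_task_timeout` reads `auxiliary.{task}.timeout` from the loaded config. The sketch below takes the config dict as a parameter instead of calling `load_config`, and also tolerates a YAML `null` at the task level; both choices are assumptions for illustration and differ from the removed code.

```python
from typing import Any

DEFAULT_TIMEOUT = 30.0


def task_timeout(config: Any, task: str, default: float = DEFAULT_TIMEOUT) -> float:
    """Look up auxiliary.<task>.timeout, falling back to `default`."""
    if not task or not isinstance(config, dict):
        return default
    aux = config.get("auxiliary")
    task_cfg = aux.get(task) if isinstance(aux, dict) else None
    raw = task_cfg.get("timeout") if isinstance(task_cfg, dict) else None
    if raw is None:
        return default
    try:
        return float(raw)
    except (TypeError, ValueError):
        return default  # unparsable value: keep the default


print(task_timeout({"auxiliary": {"vision": {"timeout": "45"}}}, "vision"))
```

After this change the callers pass a fixed `timeout=30.0` default, so a per-task config value is no longer consulted by `call_llm` itself.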
@@ -1538,7 +1451,7 @@ def call_llm(
    temperature: float = None,
    max_tokens: int = None,
    tools: list = None,
-    timeout: float = None,
+    timeout: float = 30.0,
    extra_body: dict = None,
 ) -> Any:
    """Centralized synchronous LLM call.
@@ -1556,7 +1469,7 @@ def call_llm(
        temperature: Sampling temperature (None = provider default).
        max_tokens: Max output tokens (handles max_tokens vs max_completion_tokens).
        tools: Tool definitions (for function calling).
-        timeout: Request timeout in seconds (None = read from auxiliary.{task}.timeout config).
+        timeout: Request timeout in seconds.
        extra_body: Additional request body fields.

    Returns:
@@ -1621,12 +1534,10 @@ def call_llm(
                f"No LLM provider configured for task={task} provider={resolved_provider}. "
                f"Run: hermes setup")

-    effective_timeout = timeout if timeout is not None else _get_task_timeout(task)
-
    kwargs = _build_call_kwargs(
        resolved_provider, final_model, messages,
        temperature=temperature, max_tokens=max_tokens,
-        tools=tools, timeout=effective_timeout, extra_body=extra_body,
+        tools=tools, timeout=timeout, extra_body=extra_body,
        base_url=resolved_base_url)

    # Handle max_tokens vs max_completion_tokens retry
@@ -1641,62 +1552,6 @@ def call_llm(
        raise


-def extract_content_or_reasoning(response) -> str:
-    """Extract content from an LLM response, falling back to reasoning fields.
-
-    Mirrors the main agent loop's behavior when a reasoning model (DeepSeek-R1,
-    Qwen-QwQ, etc.) returns ``content=None`` with reasoning in structured fields.
-
-    Resolution order:
-      1. ``message.content`` — strip inline think/reasoning blocks, check for
-         remaining non-whitespace text.
-      2. ``message.reasoning`` / ``message.reasoning_content`` — direct
-         structured reasoning fields (DeepSeek, Moonshot, Novita, etc.).
-      3. ``message.reasoning_details`` — OpenRouter unified array format.
-
-    Returns the best available text, or ``""`` if nothing found.
-    """
-    import re
-
-    msg = response.choices[0].message
-    content = (msg.content or "").strip()
-
-    if content:
-        # Strip inline think/reasoning blocks (mirrors _strip_think_blocks)
-        cleaned = re.sub(
-            r"<(?:think|thinking|reasoning|REASONING_SCRATCHPAD)>"
-            r".*?"
-            r"</(?:think|thinking|reasoning|REASONING_SCRATCHPAD)>",
-            "", content, flags=re.DOTALL | re.IGNORECASE,
-        ).strip()
-        if cleaned:
-            return cleaned
-
-    # Content is empty or reasoning-only — try structured reasoning fields
-    reasoning_parts: list[str] = []
-    for field in ("reasoning", "reasoning_content"):
-        val = getattr(msg, field, None)
-        if val and isinstance(val, str) and val.strip() and val not in reasoning_parts:
-            reasoning_parts.append(val.strip())
-
-    details = getattr(msg, "reasoning_details", None)
-    if details and isinstance(details, list):
-        for detail in details:
-            if isinstance(detail, dict):
-                summary = (
-                    detail.get("summary")
-                    or detail.get("content")
-                    or detail.get("text")
-                )
-                if summary and summary not in reasoning_parts:
-                    reasoning_parts.append(summary.strip() if isinstance(summary, str) else str(summary))
-
-    if reasoning_parts:
-        return "\n\n".join(reasoning_parts)
-
-    return ""
-
-
 async def async_call_llm(
    task: str = None,
    *,
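Reviewer note: the first step of the removed `extract_content_or_reasoning` strips inline think/reasoning blocks before deciding whether any real content remains. The sketch below isolates that step using the same pattern as the removed code; the helper name `strip_think_blocks` is hypothetical.

```python
import re

# Same tag set as the removed function; DOTALL lets blocks span lines.
_THINK_BLOCK = re.compile(
    r"<(?:think|thinking|reasoning|REASONING_SCRATCHPAD)>"
    r".*?"
    r"</(?:think|thinking|reasoning|REASONING_SCRATCHPAD)>",
    re.DOTALL | re.IGNORECASE,
)


def strip_think_blocks(text: str) -> str:
    """Remove inline reasoning blocks and surrounding whitespace."""
    return _THINK_BLOCK.sub("", text or "").strip()
```

An empty result here is what triggered the fallback to the structured `reasoning`, `reasoning_content`, and `reasoning_details` fields in the removed code.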
@@ -1708,7 +1563,7 @@ async def async_call_llm(
    temperature: float = None,
    max_tokens: int = None,
    tools: list = None,
-    timeout: float = None,
+    timeout: float = 30.0,
    extra_body: dict = None,
 ) -> Any:
    """Centralized asynchronous LLM call.
@@ -1769,12 +1624,10 @@ async def async_call_llm(
                f"No LLM provider configured for task={task} provider={resolved_provider}. "
                f"Run: hermes setup")

-    effective_timeout = timeout if timeout is not None else _get_task_timeout(task)
-
    kwargs = _build_call_kwargs(
        resolved_provider, final_model, messages,
        temperature=temperature, max_tokens=max_tokens,
-        tools=tools, timeout=effective_timeout, extra_body=extra_body,
+        tools=tools, timeout=timeout, extra_body=extra_body,
        base_url=resolved_base_url)

    try:
@@ -1,113 +0,0 @@
-"""BuiltinMemoryProvider — wraps MEMORY.md / USER.md as a MemoryProvider.
-
-Always registered as the first provider. Cannot be disabled or removed.
-This is the existing Hermes memory system exposed through the provider
-interface for compatibility with the MemoryManager.
-
-The actual storage logic lives in tools/memory_tool.py (MemoryStore).
-This provider is a thin adapter that delegates to MemoryStore and
-exposes the memory tool schema.
-"""
-
-from __future__ import annotations
-
-import json
-import logging
-from typing import Any, Dict, List, Optional
-
-from agent.memory_provider import MemoryProvider
-
-logger = logging.getLogger(__name__)
-
-
-class BuiltinMemoryProvider(MemoryProvider):
-    """Built-in file-backed memory (MEMORY.md + USER.md).
-
-    Always active, never disabled by other providers. The `memory` tool
-    is handled by run_agent.py's agent-level tool interception (not through
-    the normal registry), so get_tool_schemas() returns an empty list —
-    the memory tool is already wired separately.
-    """
-
-    def __init__(
-        self,
-        memory_store=None,
-        memory_enabled: bool = False,
-        user_profile_enabled: bool = False,
-    ):
-        self._store = memory_store
-        self._memory_enabled = memory_enabled
-        self._user_profile_enabled = user_profile_enabled
-
-    @property
-    def name(self) -> str:
-        return "builtin"
-
-    def is_available(self) -> bool:
-        """Built-in memory is always available."""
-        return True
-
-    def initialize(self, session_id: str, **kwargs) -> None:
-        """Load memory from disk if not already loaded."""
-        if self._store is not None:
-            self._store.load_from_disk()
-
-    def system_prompt_block(self) -> str:
-        """Return MEMORY.md and USER.md content for the system prompt.
-
-        Uses the frozen snapshot captured at load time. This ensures the
-        system prompt stays stable throughout a session (preserving the
-        prompt cache), even though the live entries may change via tool calls.
-        """
-        if not self._store:
-            return ""
-
-        parts = []
-        if self._memory_enabled:
-            mem_block = self._store.format_for_system_prompt("memory")
-            if mem_block:
-                parts.append(mem_block)
-        if self._user_profile_enabled:
-            user_block = self._store.format_for_system_prompt("user")
-            if user_block:
-                parts.append(user_block)
-
-        return "\n\n".join(parts)
-
-    def prefetch(self, query: str) -> str:
-        """Built-in memory doesn't do query-based recall — it's injected via system_prompt_block."""
-        return ""
-
-    def sync_turn(self, user_content: str, assistant_content: str) -> None:
-        """Built-in memory doesn't auto-sync turns — writes happen via the memory tool."""
-
-    def get_tool_schemas(self) -> List[Dict[str, Any]]:
-        """Return empty list.
-
-        The `memory` tool is an agent-level intercepted tool, handled
-        specially in run_agent.py before normal tool dispatch. It's not
-        part of the standard tool registry. We don't duplicate it here.
-        """
-        return []
-
-    def handle_tool_call(self, tool_name: str, args: Dict[str, Any], **kwargs) -> str:
-        """Not used — the memory tool is intercepted in run_agent.py."""
-        return json.dumps({"error": "Built-in memory tool is handled by the agent loop"})
-
-    def shutdown(self) -> None:
-        """No cleanup needed — files are saved on every write."""
-
-    # -- Property access for backward compatibility --------------------------
-
-    @property
-    def store(self):
-        """Access the underlying MemoryStore for legacy code paths."""
-        return self._store
-
-    @property
-    def memory_enabled(self) -> bool:
-        return self._memory_enabled
-
-    @property
-    def user_profile_enabled(self) -> bool:
-        return self._user_profile_enabled
@@ -141,7 +141,7 @@ class ContextCompressor:
            "last_prompt_tokens": self.last_prompt_tokens,
            "threshold_tokens": self.threshold_tokens,
            "context_length": self.context_length,
-            "usage_percent": min(100, (self.last_prompt_tokens / self.context_length * 100)) if self.context_length else 0,
+            "usage_percent": (self.last_prompt_tokens / self.context_length * 100) if self.context_length else 0,
            "compression_count": self.compression_count,
        }

@@ -347,7 +347,7 @@ Write only the summary body. Do not include any preamble or prefix."""
                "messages": [{"role": "user", "content": prompt}],
                "temperature": 0.3,
                "max_tokens": summary_budget * 2,
-                # timeout resolved from auxiliary.compression.timeout config by call_llm
+                "timeout": 45.0,
            }
            if self.summary_model:
                call_kwargs["model"] = self.summary_model
@@ -286,16 +286,12 @@ def _expand_git_reference(
    args: list[str],
    label: str,
 ) -> tuple[str | None, str | None]:
-    try:
-        result = subprocess.run(
-            ["git", *args],
-            cwd=cwd,
-            capture_output=True,
-            text=True,
-            timeout=30,
-        )
-    except subprocess.TimeoutExpired:
-        return f"{ref.raw}: git command timed out (30s)", None
+    result = subprocess.run(
+        ["git", *args],
+        cwd=cwd,
+        capture_output=True,
+        text=True,
+    )
    if result.returncode != 0:
        stderr = (result.stderr or "").strip() or "git command failed"
        return f"{ref.raw}: {stderr}", None
@@ -453,12 +449,9 @@ def _rg_files(path: Path, cwd: Path, limit: int) -> list[Path] | None:
            cwd=cwd,
            capture_output=True,
            text=True,
-            timeout=10,
        )
    except FileNotFoundError:
        return None
-    except subprocess.TimeoutExpired:
-        return None
    if result.returncode != 0:
        return None
    files = [Path(line.strip()) for line in result.stdout.splitlines() if line.strip()]
@@ -231,7 +231,7 @@ class KawaiiSpinner:
        "analyzing", "computing", "synthesizing", "formulating", "brainstorming",
    ]

-    def __init__(self, message: str = "", spinner_type: str = 'dots', print_fn=None):
+    def __init__(self, message: str = "", spinner_type: str = 'dots'):
        self.message = message
        self.spinner_frames = self.SPINNERS.get(spinner_type, self.SPINNERS['dots'])
        self.running = False
@@ -239,26 +239,12 @@ class KawaiiSpinner:
        self.frame_idx = 0
        self.start_time = None
        self.last_line_len = 0
-        # Optional callable to route all output through (e.g. a no-op for silent
-        # background agents).  When set, bypasses self._out entirely so that
-        # agents with _print_fn overridden remain fully silent.
-        self._print_fn = print_fn
        # Capture stdout NOW, before any redirect_stdout(devnull) from
        # child agents can replace sys.stdout with a black hole.
        self._out = sys.stdout

    def _write(self, text: str, end: str = '\n', flush: bool = False):
-        """Write to the stdout captured at spinner creation time.
-
-        If a print_fn was supplied at construction, all output is routed through
-        it instead — allowing callers to silence the spinner with a no-op lambda.
-        """
-        if self._print_fn is not None:
-            try:
-                self._print_fn(text)
-            except Exception:
-                pass
-            return
+        """Write to the stdout captured at spinner creation time."""
        try:
            self._out.write(text + end)
            if flush:
@@ -284,11 +270,11 @@ class KawaiiSpinner:
        The CLI already drives a TUI widget (_spinner_text) for spinner display,
        so KawaiiSpinner's \\r-based animation is redundant under StdoutProxy.
        """
-        try:
-            from prompt_toolkit.patch_stdout import StdoutProxy
-            return isinstance(self._out, StdoutProxy)
-        except ImportError:
-            return False
+        out = self._out
+        # StdoutProxy has a 'raw' attribute (bool) that plain file objects lack.
+        if hasattr(out, 'raw') and type(out).__name__ == 'StdoutProxy':
+            return True
+        return False

    def _animate(self):
        # When stdout is not a real terminal (e.g. Docker, systemd, pipe),
@@ -699,7 +685,7 @@ def format_context_pressure(
        threshold_percent: Compaction threshold as a fraction of context window.
        compression_enabled: Whether auto-compression is active.
    """
-    pct_int = min(int(compaction_progress * 100), 100)
+    pct_int = int(compaction_progress * 100)
    filled = min(int(compaction_progress * _BAR_WIDTH), _BAR_WIDTH)
    bar = _BAR_FILLED * filled + _BAR_EMPTY * (_BAR_WIDTH - filled)

@@ -729,7 +715,7 @@ def format_context_pressure_gateway(
    No ANSI — just Unicode and plain text suitable for Telegram/Discord/etc.
    The percentage shows progress toward the compaction threshold.
    """
-    pct_int = min(int(compaction_progress * 100), 100)
+    pct_int = int(compaction_progress * 100)
    filled = min(int(compaction_progress * _BAR_WIDTH), _BAR_WIDTH)
    bar = _BAR_FILLED * filled + _BAR_EMPTY * (_BAR_WIDTH - filled)

@@ -1,281 +0,0 @@
-"""MemoryManager — orchestrates multiple memory providers.
-
-Single integration point in run_agent.py. Replaces scattered per-backend
-code with one manager that delegates to all registered providers.
-
-The BuiltinMemoryProvider is always registered first and cannot be removed.
-External providers are additive — they never disable the built-in store.
-
-Usage in run_agent.py:
-    self._memory_manager = MemoryManager()
-    self._memory_manager.add_provider(BuiltinMemoryProvider(...))
-    if honcho_configured:
-        self._memory_manager.add_provider(HonchoProvider(...))
-    # Plugin providers are added via register_memory_provider()
-
-    # System prompt
-    prompt_parts.append(self._memory_manager.build_system_prompt())
-
-    # Pre-turn
-    context = self._memory_manager.prefetch_all(user_message)
-
-    # Post-turn
-    self._memory_manager.sync_all(user_msg, assistant_response)
-    self._memory_manager.queue_prefetch_all(user_msg)
-"""
-
-from __future__ import annotations
-
-import json
-import logging
-from typing import Any, Dict, List, Optional
-
-from agent.memory_provider import MemoryProvider
-
-logger = logging.getLogger(__name__)
-
-
-class MemoryManager:
-    """Orchestrates multiple memory providers.
-
-    Providers are called in registration order. The builtin provider
-    is always first. Failures in one provider never block others.
-    """
-
-    def __init__(self) -> None:
-        self._providers: List[MemoryProvider] = []
-        self._tool_to_provider: Dict[str, MemoryProvider] = {}
-
-    # -- Registration --------------------------------------------------------
-
-    def add_provider(self, provider: MemoryProvider) -> None:
-        """Register a memory provider.
-
-        Providers are called in registration order for all operations.
-        Tool name conflicts are resolved first-registered-wins.
-        """
-        self._providers.append(provider)
-
-        # Index tool names → provider for routing
-        for schema in provider.get_tool_schemas():
-            tool_name = schema.get("name", "")
-            if tool_name and tool_name not in self._tool_to_provider:
-                self._tool_to_provider[tool_name] = provider
-            elif tool_name in self._tool_to_provider:
-                logger.warning(
-                    "Memory tool name conflict: '%s' already registered by %s, "
-                    "ignoring from %s",
-                    tool_name,
-                    self._tool_to_provider[tool_name].name,
-                    provider.name,
-                )
-
-        logger.info(
-            "Memory provider '%s' registered (%d tools)",
-            provider.name,
-            len(provider.get_tool_schemas()),
-        )
-
-    @property
-    def providers(self) -> List[MemoryProvider]:
-        """All registered providers in order."""
-        return list(self._providers)
-
-    @property
-    def provider_names(self) -> List[str]:
-        """Names of all registered providers."""
-        return [p.name for p in self._providers]
-
-    def get_provider(self, name: str) -> Optional[MemoryProvider]:
-        """Get a provider by name, or None if not registered."""
-        for p in self._providers:
-            if p.name == name:
-                return p
-        return None
-
-    # -- System prompt -------------------------------------------------------
-
-    def build_system_prompt(self) -> str:
-        """Collect system prompt blocks from all providers.
-
-        Returns combined text, or empty string if no providers contribute.
-        Each non-empty block is labeled with the provider name.
-        """
-        blocks = []
-        for provider in self._providers:
-            try:
-                block = provider.system_prompt_block()
-                if block and block.strip():
-                    blocks.append(block)
-            except Exception as e:
-                logger.warning(
-                    "Memory provider '%s' system_prompt_block() failed: %s",
-                    provider.name, e,
-                )
-        return "\n\n".join(blocks)
-
-    # -- Prefetch / recall ---------------------------------------------------
-
-    def prefetch_all(self, query: str) -> str:
-        """Collect prefetch context from all providers.
-
-        Returns merged context text labeled by provider. Empty providers
-        are skipped. Failures in one provider don't block others.
-        """
-        parts = []
-        for provider in self._providers:
-            try:
-                result = provider.prefetch(query)
-                if result and result.strip():
-                    parts.append(result)
-            except Exception as e:
-                logger.debug(
-                    "Memory provider '%s' prefetch failed (non-fatal): %s",
-                    provider.name, e,
-                )
-        return "\n\n".join(parts)
-
-    def queue_prefetch_all(self, query: str) -> None:
-        """Queue background prefetch on all providers for the next turn."""
-        for provider in self._providers:
-            try:
-                provider.queue_prefetch(query)
-            except Exception as e:
-                logger.debug(
-                    "Memory provider '%s' queue_prefetch failed (non-fatal): %s",
-                    provider.name, e,
-                )
-
-    # -- Sync ----------------------------------------------------------------
-
-    def sync_all(self, user_content: str, assistant_content: str) -> None:
-        """Sync a completed turn to all providers."""
-        for provider in self._providers:
-            try:
-                provider.sync_turn(user_content, assistant_content)
-            except Exception as e:
-                logger.warning(
-                    "Memory provider '%s' sync_turn failed: %s",
-                    provider.name, e,
-                )
-
-    # -- Tools ---------------------------------------------------------------
-
-    def get_all_tool_schemas(self) -> List[Dict[str, Any]]:
-        """Collect tool schemas from all providers."""
-        schemas = []
-        seen = set()
-        for provider in self._providers:
-            try:
-                for schema in provider.get_tool_schemas():
-                    name = schema.get("name", "")
-                    if name and name not in seen:
-                        schemas.append(schema)
-                        seen.add(name)
-            except Exception as e:
-                logger.warning(
-                    "Memory provider '%s' get_tool_schemas() failed: %s",
-                    provider.name, e,
-                )
-        return schemas
-
-    def get_all_tool_names(self) -> set:
-        """Return set of all tool names across all providers."""
-        return set(self._tool_to_provider.keys())
-
-    def has_tool(self, tool_name: str) -> bool:
-        """Check if any provider handles this tool."""
-        return tool_name in self._tool_to_provider
-
-    def handle_tool_call(
-        self, tool_name: str, args: Dict[str, Any], **kwargs
-    ) -> str:
-        """Route a tool call to the correct provider.
-
-        Returns JSON string result. Raises ValueError if no provider
-        handles the tool.
-        """
-        provider = self._tool_to_provider.get(tool_name)
-        if provider is None:
-            return json.dumps({"error": f"No memory provider handles tool '{tool_name}'"})
-        try:
-            return provider.handle_tool_call(tool_name, args, **kwargs)
-        except Exception as e:
-            logger.error(
-                "Memory provider '%s' handle_tool_call(%s) failed: %s",
-                provider.name, tool_name, e,
-            )
-            return json.dumps({"error": f"Memory tool '{tool_name}' failed: {e}"})
-
-    # -- Lifecycle hooks -----------------------------------------------------
-
-    def on_turn_start(self, turn_number: int, message: str) -> None:
-        """Notify all providers of a new turn."""
-        for provider in self._providers:
-            try:
-                provider.on_turn_start(turn_number, message)
-            except Exception as e:
-                logger.debug(
-                    "Memory provider '%s' on_turn_start failed: %s",
-                    provider.name, e,
-                )
-
-    def on_session_end(self, messages: List[Dict[str, Any]]) -> None:
-        """Notify all providers of session end."""
-        for provider in self._providers:
-            try:
-                provider.on_session_end(messages)
-            except Exception as e:
-                logger.debug(
-                    "Memory provider '%s' on_session_end failed: %s",
-                    provider.name, e,
-                )
-
-    def on_pre_compress(self, messages: List[Dict[str, Any]]) -> None:
-        """Notify all providers before context compression."""
-        for provider in self._providers:
-            try:
-                provider.on_pre_compress(messages)
-            except Exception as e:
-                logger.debug(
-                    "Memory provider '%s' on_pre_compress failed: %s",
-                    provider.name, e,
-                )
-
-    def on_memory_write(self, action: str, target: str, content: str) -> None:
-        """Notify external providers when the built-in memory tool writes.
-
-        Skips the builtin provider itself (it's the source of the write).
-        """
-        for provider in self._providers:
-            if provider.name == "builtin":
-                continue
-            try:
-                provider.on_memory_write(action, target, content)
-            except Exception as e:
-                logger.debug(
-                    "Memory provider '%s' on_memory_write failed: %s",
-                    provider.name, e,
-                )
-
-    def shutdown_all(self) -> None:
-        """Shut down all providers (reverse order for clean teardown)."""
-        for provider in reversed(self._providers):
-            try:
-                provider.shutdown()
-            except Exception as e:
-                logger.warning(
-                    "Memory provider '%s' shutdown failed: %s",
-                    provider.name, e,
-                )
-
-    def initialize_all(self, session_id: str, **kwargs) -> None:
-        """Initialize all providers."""
-        for provider in self._providers:
-            try:
-                provider.initialize(session_id=session_id, **kwargs)
-            except Exception as e:
-                logger.warning(
-                    "Memory provider '%s' initialize failed: %s",
-                    provider.name, e,
-                )
@@ -1,171 +0,0 @@
-"""Abstract base class for pluggable memory providers.
-
-Memory providers give the agent persistent recall across sessions. Multiple
-providers can be active simultaneously — the MemoryManager orchestrates them.
-
-Built-in memory (MEMORY.md / USER.md) is always active as the first provider.
-External providers (Honcho, Hindsight, Mem0, etc.) are additive — they never
-disable the built-in store.
-
-Three registration paths:
-  1. Built-in: BuiltinMemoryProvider — always present, not removable.
-  2. First-party: Ship with the repo, activated by config (e.g. Honcho).
-  3. Plugin: External packages register via ctx.register_memory_provider().
-
-Lifecycle (called by MemoryManager, wired in run_agent.py):
-  initialize()          — connect, create resources, warm up
-  system_prompt_block()  — static text for the system prompt
-  prefetch(query)        — background recall before each turn
-  sync_turn(user, asst)  — async write after each turn
-  get_tool_schemas()     — tool schemas to expose to the model
-  handle_tool_call()     — dispatch a tool call
-  shutdown()             — clean exit
-
-Optional hooks (override to opt in):
-  on_turn_start(turn, message)     — per-turn tick (scope cooling, etc.)
-  on_session_end(messages)         — end-of-session extraction
-  on_pre_compress(messages)        — extract before context compression
-  on_memory_write(action, target, content) — mirror built-in memory writes
-"""
-
-from __future__ import annotations
-
-import logging
-from abc import ABC, abstractmethod
-from typing import Any, Dict, List, Optional
-
-logger = logging.getLogger(__name__)
-
-
-class MemoryProvider(ABC):
-    """Abstract base class for memory providers."""
-
-    @property
-    @abstractmethod
-    def name(self) -> str:
-        """Short identifier for this provider (e.g. 'builtin', 'honcho', 'hindsight')."""
-
-    # -- Core lifecycle (implement these) ------------------------------------
-
-    @abstractmethod
-    def is_available(self) -> bool:
-        """Return True if this provider is configured, has credentials, and is ready.
-
-        Called during agent init to decide whether to activate the provider.
-        Should not make network calls — just check config and installed deps.
-        """
-
-    @abstractmethod
-    def initialize(self, session_id: str, **kwargs) -> None:
-        """Initialize for a session.
-
-        Called once at agent startup. May create resources (banks, tables),
-        establish connections, start background threads, etc.
-
-        kwargs may include: platform, model, user_id, and other session context.
-        """
-
-    def system_prompt_block(self) -> str:
-        """Return text to include in the system prompt.
-
-        Called during system prompt assembly. Return empty string to skip.
-        This is for STATIC provider info (instructions, status). Prefetched
-        recall context is injected separately via prefetch().
-        """
-        return ""
-
-    def prefetch(self, query: str) -> str:
-        """Recall relevant context for the upcoming turn.
-
-        Called before each API call. Return formatted text to inject as
-        context, or empty string if nothing relevant. Implementations
-        should be fast — use background threads for the actual recall
-        and return cached results here.
-        """
-        return ""
-
-    def queue_prefetch(self, query: str) -> None:
-        """Queue a background recall for the NEXT turn.
-
-        Called after each turn completes. The result will be consumed
-        by prefetch() on the next turn. Default is no-op — providers
-        that do background prefetching should override this.
-        """
-
-    def sync_turn(self, user_content: str, assistant_content: str) -> None:
-        """Persist a completed turn to the backend.
-
-        Called after each turn. Should be non-blocking — queue for
-        background processing if the backend has latency.
-        """
-
-    @abstractmethod
-    def get_tool_schemas(self) -> List[Dict[str, Any]]:
-        """Return tool schemas this provider exposes.
-
-        Each schema follows the OpenAI function calling format:
-        {"name": "...", "description": "...", "parameters": {...}}
-
-        Return empty list if this provider has no tools (context-only).
-        """
-
-    def handle_tool_call(self, tool_name: str, args: Dict[str, Any], **kwargs) -> str:
-        """Handle a tool call for one of this provider's tools.
-
-        Must return a JSON string (the tool result).
-        Only called for tool names returned by get_tool_schemas().
-        """
-        raise NotImplementedError(f"Provider {self.name} does not handle tool {tool_name}")
-
-    def shutdown(self) -> None:
-        """Clean shutdown — flush queues, close connections."""
-
-    # -- Optional hooks (override to opt in) ---------------------------------
-
-    def on_turn_start(self, turn_number: int, message: str) -> None:
-        """Called at the start of each turn with the user message.
-
-        Use for turn-counting, scope management, periodic maintenance.
-        """
-
-    def on_session_end(self, messages: List[Dict[str, Any]]) -> None:
-        """Called when a session ends (explicit exit or timeout).
-
-        Use for end-of-session fact extraction, summarization, etc.
-        messages is the full conversation history.
-        """
-
-    def on_pre_compress(self, messages: List[Dict[str, Any]]) -> None:
-        """Called before context compression discards old messages.
-
-        Use to extract insights from messages about to be compressed.
-        messages is the list that will be summarized/discarded.
-        """
-
-    def get_config_schema(self) -> List[Dict[str, Any]]:
-        """Return config fields this provider needs for setup.
-
-        Used by 'hermes memory setup' to walk the user through configuration.
-        Each field is a dict with:
-          key:         config key name (e.g. 'api_key', 'mode')
-          description: human-readable description
-          secret:      True if this should go to .env (default: False)
-          required:    True if required (default: False)
-          default:     default value (optional)
-          choices:     list of valid values (optional)
-          url:         URL where user can get this credential (optional)
-          env_var:     explicit env var name for secrets (default: auto-generated)
-
-        Return empty list if no config needed (e.g. local-only providers).
-        """
-        return []
-
-    def on_memory_write(self, action: str, target: str, content: str) -> None:
-        """Called when the built-in memory tool writes an entry.
-
-        action: 'add', 'replace', or 'remove'
-        target: 'memory' or 'user'
-        content: the entry content
-
-        Use to mirror built-in memory writes to your backend.
-        """
@@ -113,15 +113,6 @@ DEFAULT_CONTEXT_LENGTHS = {
    "glm": 202752,
    # Kimi
    "kimi": 262144,
-    # Hugging Face Inference Providers — model IDs use org/name format
-    "Qwen/Qwen3.5-397B-A17B": 131072,
-    "Qwen/Qwen3.5-35B-A3B": 131072,
-    "deepseek-ai/DeepSeek-V3.2": 65536,
-    "moonshotai/Kimi-K2.5": 262144,
-    "moonshotai/Kimi-K2-Thinking": 262144,
-    "MiniMaxAI/MiniMax-M2.5": 204800,
-    "XiaomiMiMo/MiMo-V2-Flash": 32768,
-    "zai-org/GLM-5": 202752,
 }

 _CONTEXT_LENGTH_KEYS = (
@@ -15,8 +15,6 @@ import time
 from pathlib import Path
 from typing import Any, Dict, Optional

-from utils import atomic_json_write
-
 import requests

 logger = logging.getLogger(__name__)
@@ -66,10 +64,12 @@ def _load_disk_cache() -> Dict[str, Any]:


 def _save_disk_cache(data: Dict[str, Any]) -> None:
-    """Save models.dev data to disk cache atomically."""
+    """Save models.dev data to disk cache."""
    try:
        cache_path = _get_cache_path()
-        atomic_json_write(cache_path, data, indent=None, separators=(",", ":"))
+        cache_path.parent.mkdir(parents=True, exist_ok=True)
+        with open(cache_path, "w", encoding="utf-8") as f:
+            json.dump(data, f, separators=(",", ":"))
    except Exception as e:
        logger.debug("Failed to save models.dev disk cache: %s", e)

@@ -4,28 +4,14 @@ All functions are stateless. AIAgent._build_system_prompt() calls these to
 assemble pieces, then combines them with memory and ephemeral prompts.
 """

-import json
 import logging
 import os
 import re
-import threading
-from collections import OrderedDict
 from pathlib import Path

 from hermes_constants import get_hermes_home
 from typing import Optional

-from agent.skill_utils import (
-    extract_skill_conditions,
-    extract_skill_description,
-    get_all_skills_dirs,
-    get_disabled_skill_names,
-    iter_skill_index_files,
-    parse_frontmatter,
-    skill_matches_platform,
-)
-from utils import atomic_json_write
-
 logger = logging.getLogger(__name__)

 # ---------------------------------------------------------------------------
@@ -170,25 +156,6 @@ SKILLS_GUIDANCE = (
    "Skills that aren't maintained become liabilities."
 )

-TOOL_USE_ENFORCEMENT_GUIDANCE = (
-    "# Tool-use enforcement\n"
-    "You MUST use your tools to take action — do not describe what you would do "
-    "or plan to do without actually doing it. When you say you will perform an "
-    "action (e.g. 'I will run the tests', 'Let me check the file', 'I will create "
-    "the project'), you MUST immediately make the corresponding tool call in the same "
-    "response. Never end your turn with a promise of future action — execute it now.\n"
-    "Keep working until the task is actually complete. Do not stop with a summary of "
-    "what you plan to do next time. If you have tools available that can accomplish "
-    "the task, use them instead of telling the user what you would do.\n"
-    "Every response should either (a) contain tool calls that make progress, or "
-    "(b) deliver a final result to the user. Responses that only describe intentions "
-    "without acting are not acceptable."
-)
-
-# Model name substrings that trigger tool-use enforcement guidance.
-# Add new patterns here when a model family needs explicit steering.
-TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex")
-
 PLATFORM_HINTS = {
    "whatsapp": (
        "You are on a text messaging communication platform, WhatsApp. "
@@ -263,111 +230,6 @@ CONTEXT_TRUNCATE_HEAD_RATIO = 0.7
 CONTEXT_TRUNCATE_TAIL_RATIO = 0.2


-# =========================================================================
-# Skills prompt cache
-# =========================================================================
-
-_SKILLS_PROMPT_CACHE_MAX = 8
-_SKILLS_PROMPT_CACHE: OrderedDict[tuple, str] = OrderedDict()
-_SKILLS_PROMPT_CACHE_LOCK = threading.Lock()
-_SKILLS_SNAPSHOT_VERSION = 1
-
-
-def _skills_prompt_snapshot_path() -> Path:
-    return get_hermes_home() / ".skills_prompt_snapshot.json"
-
-
-def clear_skills_system_prompt_cache(*, clear_snapshot: bool = False) -> None:
-    """Drop the in-process skills prompt cache (and optionally the disk snapshot)."""
-    with _SKILLS_PROMPT_CACHE_LOCK:
-        _SKILLS_PROMPT_CACHE.clear()
-    if clear_snapshot:
-        try:
-            _skills_prompt_snapshot_path().unlink(missing_ok=True)
-        except OSError as e:
-            logger.debug("Could not remove skills prompt snapshot: %s", e)
-
-
-def _build_skills_manifest(skills_dir: Path) -> dict[str, list[int]]:
-    """Build an mtime/size manifest of all SKILL.md and DESCRIPTION.md files."""
-    manifest: dict[str, list[int]] = {}
-    for filename in ("SKILL.md", "DESCRIPTION.md"):
-        for path in iter_skill_index_files(skills_dir, filename):
-            try:
-                st = path.stat()
-            except OSError:
-                continue
-            manifest[str(path.relative_to(skills_dir))] = [st.st_mtime_ns, st.st_size]
-    return manifest
-
-
-def _load_skills_snapshot(skills_dir: Path) -> Optional[dict]:
-    """Load the disk snapshot if it exists and its manifest still matches."""
-    snapshot_path = _skills_prompt_snapshot_path()
-    if not snapshot_path.exists():
-        return None
-    try:
-        snapshot = json.loads(snapshot_path.read_text(encoding="utf-8"))
-    except Exception:
-        return None
-    if not isinstance(snapshot, dict):
-        return None
-    if snapshot.get("version") != _SKILLS_SNAPSHOT_VERSION:
-        return None
-    if snapshot.get("manifest") != _build_skills_manifest(skills_dir):
-        return None
-    return snapshot
-
-
-def _write_skills_snapshot(
-    skills_dir: Path,
-    manifest: dict[str, list[int]],
-    skill_entries: list[dict],
-    category_descriptions: dict[str, str],
-) -> None:
-    """Persist skill metadata to disk for fast cold-start reuse."""
-    payload = {
-        "version": _SKILLS_SNAPSHOT_VERSION,
-        "manifest": manifest,
-        "skills": skill_entries,
-        "category_descriptions": category_descriptions,
-    }
-    try:
-        atomic_json_write(_skills_prompt_snapshot_path(), payload)
-    except Exception as e:
-        logger.debug("Could not write skills prompt snapshot: %s", e)
-
-
-def _build_snapshot_entry(
-    skill_file: Path,
-    skills_dir: Path,
-    frontmatter: dict,
-    description: str,
-) -> dict:
-    """Build a serialisable metadata dict for one skill."""
-    rel_path = skill_file.relative_to(skills_dir)
-    parts = rel_path.parts
-    if len(parts) >= 2:
-        skill_name = parts[-2]
-        category = "/".join(parts[:-2]) if len(parts) > 2 else parts[0]
-    else:
-        category = "general"
-        skill_name = skill_file.parent.name
-
-    platforms = frontmatter.get("platforms") or []
-    if isinstance(platforms, str):
-        platforms = [platforms]
-
-    return {
-        "skill_name": skill_name,
-        "category": category,
-        "frontmatter_name": str(frontmatter.get("name", skill_name)),
-        "description": description,
-        "platforms": [str(p).strip() for p in platforms if str(p).strip()],
-        "conditions": extract_skill_conditions(frontmatter),
-    }
-
-
 # =========================================================================
 # Skills index
 # =========================================================================
@@ -379,13 +241,22 @@ def _parse_skill_file(skill_file: Path) -> tuple[bool, dict, str]:
    (True, {}, "") to err on the side of showing the skill.
    """
    try:
+        from tools.skills_tool import _parse_frontmatter, skill_matches_platform
+
        raw = skill_file.read_text(encoding="utf-8")[:2000]
-        frontmatter, _ = parse_frontmatter(raw)
+        frontmatter, _ = _parse_frontmatter(raw)

        if not skill_matches_platform(frontmatter):
-            return False, frontmatter, ""
+            return False, {}, ""

-        return True, frontmatter, extract_skill_description(frontmatter)
+        desc = ""
+        raw_desc = frontmatter.get("description", "")
+        if raw_desc:
+            desc = str(raw_desc).strip().strip("'\"")
+            if len(desc) > 60:
+                desc = desc[:57] + "..."
+
+        return True, frontmatter, desc
    except Exception as e:
        logger.debug("Failed to parse skill file %s: %s", skill_file, e)
        return True, {}, ""
@@ -394,9 +265,16 @@ def _parse_skill_file(skill_file: Path) -> tuple[bool, dict, str]:
 def _read_skill_conditions(skill_file: Path) -> dict:
    """Extract conditional activation fields from SKILL.md frontmatter."""
    try:
+        from tools.skills_tool import _parse_frontmatter
        raw = skill_file.read_text(encoding="utf-8")[:2000]
-        frontmatter, _ = parse_frontmatter(raw)
-        return extract_skill_conditions(frontmatter)
+        frontmatter, _ = _parse_frontmatter(raw)
+        hermes = (frontmatter.get("metadata") or {}).get("hermes") or {}
+        return {
+            "fallback_for_toolsets": hermes.get("fallback_for_toolsets", []),
+            "requires_toolsets": hermes.get("requires_toolsets", []),
+            "fallback_for_tools": hermes.get("fallback_for_tools", []),
+            "requires_tools": hermes.get("requires_tools", []),
+        }
    except Exception as e:
        logger.debug("Failed to read skill conditions from %s: %s", skill_file, e)
        return {}
@@ -439,210 +317,109 @@ def build_skills_system_prompt(
 ) -> str:
    """Build a compact skill index for the system prompt.

-    Two-layer cache:
-      1. In-process LRU dict keyed by (skills_dir, tools, toolsets)
-      2. Disk snapshot (``.skills_prompt_snapshot.json``) validated by
-         mtime/size manifest — survives process restarts
-
-    Falls back to a full filesystem scan when both layers miss.
-
-    External skill directories (``skills.external_dirs`` in config.yaml) are
-    scanned alongside the local ``~/.hermes/skills/`` directory.  External dirs
-    are read-only — they appear in the index but new skills are always created
-    in the local dir.  Local skills take precedence when names collide.
+    Scans ~/.hermes/skills/ for SKILL.md files grouped by category.
+    Includes per-skill descriptions from frontmatter so the model can
+    match skills by meaning, not just name.
+    Filters out skills incompatible with the current OS platform.
    """
    hermes_home = get_hermes_home()
    skills_dir = hermes_home / "skills"
-    external_dirs = get_all_skills_dirs()[1:]  # skip local (index 0)

-    if not skills_dir.exists() and not external_dirs:
+    if not skills_dir.exists():
        return ""

-    # ── Layer 1: in-process LRU cache ─────────────────────────────────
-    cache_key = (
-        str(skills_dir.resolve()),
-        tuple(str(d) for d in external_dirs),
-        tuple(sorted(str(t) for t in (available_tools or set()))),
-        tuple(sorted(str(ts) for ts in (available_toolsets or set()))),
-    )
-    with _SKILLS_PROMPT_CACHE_LOCK:
-        cached = _SKILLS_PROMPT_CACHE.get(cache_key)
-        if cached is not None:
-            _SKILLS_PROMPT_CACHE.move_to_end(cache_key)
-            return cached
-
-    disabled = get_disabled_skill_names()
-
-    # ── Layer 2: disk snapshot ────────────────────────────────────────
-    snapshot = _load_skills_snapshot(skills_dir)
+    # Collect skills with descriptions, grouped by category.
+    # Each entry: (skill_name, description)
+    # Supports sub-categories: skills/mlops/training/axolotl/SKILL.md
+    # -> category "mlops/training", skill "axolotl"
+    # Load disabled skill names once for the entire scan
+    try:
+        from tools.skills_tool import _get_disabled_skill_names
+        disabled = _get_disabled_skill_names()
+    except Exception:
+        disabled = set()

    skills_by_category: dict[str, list[tuple[str, str]]] = {}
-    category_descriptions: dict[str, str] = {}
-
-    if snapshot is not None:
-        # Fast path: use pre-parsed metadata from disk
-        for entry in snapshot.get("skills", []):
-            if not isinstance(entry, dict):
-                continue
-            skill_name = entry.get("skill_name") or ""
-            category = entry.get("category") or "general"
-            frontmatter_name = entry.get("frontmatter_name") or skill_name
-            platforms = entry.get("platforms") or []
-            if not skill_matches_platform({"platforms": platforms}):
-                continue
-            if frontmatter_name in disabled or skill_name in disabled:
-                continue
-            if not _skill_should_show(
-                entry.get("conditions") or {},
-                available_tools,
-                available_toolsets,
-            ):
-                continue
-            skills_by_category.setdefault(category, []).append(
-                (skill_name, entry.get("description", ""))
-            )
-        category_descriptions = {
-            str(k): str(v)
-            for k, v in (snapshot.get("category_descriptions") or {}).items()
+    for skill_file in skills_dir.rglob("SKILL.md"):
+        is_compatible, frontmatter, desc = _parse_skill_file(skill_file)
+        if not is_compatible:
+            continue
+        rel_path = skill_file.relative_to(skills_dir)
+        parts = rel_path.parts
+        if len(parts) >= 2:
+            skill_name = parts[-2]
+            category = "/".join(parts[:-2]) if len(parts) > 2 else parts[0]
+        else:
+            category = "general"
+            skill_name = skill_file.parent.name
+        # Respect user's disabled skills config
+        fm_name = frontmatter.get("name", skill_name)
+        if fm_name in disabled or skill_name in disabled:
+            continue
+        # Extract conditions inline from already-parsed frontmatter
+        # (avoids redundant file re-read that _read_skill_conditions would do)
+        hermes_meta = (frontmatter.get("metadata") or {}).get("hermes") or {}
+        conditions = {
+            "fallback_for_toolsets": hermes_meta.get("fallback_for_toolsets", []),
+            "requires_toolsets": hermes_meta.get("requires_toolsets", []),
+            "fallback_for_tools": hermes_meta.get("fallback_for_tools", []),
+            "requires_tools": hermes_meta.get("requires_tools", []),
        }
-    else:
-        # Cold path: full filesystem scan + write snapshot for next time
-        skill_entries: list[dict] = []
-        for skill_file in iter_skill_index_files(skills_dir, "SKILL.md"):
-            is_compatible, frontmatter, desc = _parse_skill_file(skill_file)
-            entry = _build_snapshot_entry(skill_file, skills_dir, frontmatter, desc)
-            skill_entries.append(entry)
-            if not is_compatible:
-                continue
-            skill_name = entry["skill_name"]
-            if entry["frontmatter_name"] in disabled or skill_name in disabled:
-                continue
-            if not _skill_should_show(
-                extract_skill_conditions(frontmatter),
-                available_tools,
-                available_toolsets,
-            ):
-                continue
-            skills_by_category.setdefault(entry["category"], []).append(
-                (skill_name, entry["description"])
-            )
+        if not _skill_should_show(conditions, available_tools, available_toolsets):
+            continue
+        skills_by_category.setdefault(category, []).append((skill_name, desc))

-        # Read category-level DESCRIPTION.md files
-        for desc_file in iter_skill_index_files(skills_dir, "DESCRIPTION.md"):
+    if not skills_by_category:
+        return ""
+
+    # Read category-level descriptions from DESCRIPTION.md
+    # Only the exact category directory is checked (no parent-directory lookup)
+    category_descriptions = {}
+    for category in skills_by_category:
+        cat_path = Path(category)
+        desc_file = skills_dir / cat_path / "DESCRIPTION.md"
+        if desc_file.exists():
            try:
                content = desc_file.read_text(encoding="utf-8")
-                fm, _ = parse_frontmatter(content)
-                cat_desc = fm.get("description")
-                if not cat_desc:
-                    continue
-                rel = desc_file.relative_to(skills_dir)
-                cat = "/".join(rel.parts[:-1]) if len(rel.parts) > 1 else "general"
-                category_descriptions[cat] = str(cat_desc).strip().strip("'\"")
+                match = re.search(r"^---\s*\n.*?description:\s*(.+?)\s*\n.*?^---", content, re.MULTILINE | re.DOTALL)
+                if match:
+                    category_descriptions[category] = match.group(1).strip()
            except Exception as e:
                logger.debug("Could not read skill description %s: %s", desc_file, e)

-        _write_skills_snapshot(
-            skills_dir,
-            _build_skills_manifest(skills_dir),
-            skill_entries,
-            category_descriptions,
-        )
-
-    # ── External skill directories ─────────────────────────────────────
-    # Scan external dirs directly (no snapshot caching — they're read-only
-    # and typically small).  Local skills already in skills_by_category take
-    # precedence: we track seen names and skip duplicates from external dirs.
-    seen_skill_names: set[str] = set()
-    for cat_skills in skills_by_category.values():
-        for name, _desc in cat_skills:
-            seen_skill_names.add(name)
-
-    for ext_dir in external_dirs:
-        if not ext_dir.exists():
-            continue
-        for skill_file in iter_skill_index_files(ext_dir, "SKILL.md"):
-            try:
-                is_compatible, frontmatter, desc = _parse_skill_file(skill_file)
-                if not is_compatible:
-                    continue
-                entry = _build_snapshot_entry(skill_file, ext_dir, frontmatter, desc)
-                skill_name = entry["skill_name"]
-                if skill_name in seen_skill_names:
-                    continue
-                if entry["frontmatter_name"] in disabled or skill_name in disabled:
-                    continue
-                if not _skill_should_show(
-                    extract_skill_conditions(frontmatter),
-                    available_tools,
-                    available_toolsets,
-                ):
-                    continue
-                seen_skill_names.add(skill_name)
-                skills_by_category.setdefault(entry["category"], []).append(
-                    (skill_name, entry["description"])
-                )
-            except Exception as e:
-                logger.debug("Error reading external skill %s: %s", skill_file, e)
-
-        # External category descriptions
-        for desc_file in iter_skill_index_files(ext_dir, "DESCRIPTION.md"):
-            try:
-                content = desc_file.read_text(encoding="utf-8")
-                fm, _ = parse_frontmatter(content)
-                cat_desc = fm.get("description")
-                if not cat_desc:
-                    continue
-                rel = desc_file.relative_to(ext_dir)
-                cat = "/".join(rel.parts[:-1]) if len(rel.parts) > 1 else "general"
-                category_descriptions.setdefault(cat, str(cat_desc).strip().strip("'\""))
-            except Exception as e:
-                logger.debug("Could not read external skill description %s: %s", desc_file, e)
-
-    if not skills_by_category:
-        result = ""
-    else:
-        index_lines = []
-        for category in sorted(skills_by_category.keys()):
-            cat_desc = category_descriptions.get(category, "")
-            if cat_desc:
-                index_lines.append(f"  {category}: {cat_desc}")
+    index_lines = []
+    for category in sorted(skills_by_category.keys()):
+        cat_desc = category_descriptions.get(category, "")
+        if cat_desc:
+            index_lines.append(f"  {category}: {cat_desc}")
+        else:
+            index_lines.append(f"  {category}:")
+        # Deduplicate and sort skills within each category
+        seen = set()
+        for name, desc in sorted(skills_by_category[category], key=lambda x: x[0]):
+            if name in seen:
+                continue
+            seen.add(name)
+            if desc:
+                index_lines.append(f"    - {name}: {desc}")
            else:
-                index_lines.append(f"  {category}:")
-            # Deduplicate and sort skills within each category
-            seen = set()
-            for name, desc in sorted(skills_by_category[category], key=lambda x: x[0]):
-                if name in seen:
-                    continue
-                seen.add(name)
-                if desc:
-                    index_lines.append(f"    - {name}: {desc}")
-                else:
-                    index_lines.append(f"    - {name}")
+                index_lines.append(f"    - {name}")

-        result = (
-            "## Skills (mandatory)\n"
-            "Before replying, scan the skills below. If one clearly matches your task, "
-            "load it with skill_view(name) and follow its instructions. "
-            "If a skill has issues, fix it with skill_manage(action='patch').\n"
-            "After difficult/iterative tasks, offer to save as a skill. "
-            "If a skill you loaded was missing steps, had wrong commands, or needed "
-            "pitfalls you discovered, update it before finishing.\n"
-            "\n"
-            "<available_skills>\n"
-            + "\n".join(index_lines) + "\n"
-            "</available_skills>\n"
-            "\n"
-            "If none match, proceed normally without loading a skill."
-        )
-
-    # ── Store in LRU cache ────────────────────────────────────────────
-    with _SKILLS_PROMPT_CACHE_LOCK:
-        _SKILLS_PROMPT_CACHE[cache_key] = result
-        _SKILLS_PROMPT_CACHE.move_to_end(cache_key)
-        while len(_SKILLS_PROMPT_CACHE) > _SKILLS_PROMPT_CACHE_MAX:
-            _SKILLS_PROMPT_CACHE.popitem(last=False)
-
-    return result
+    return (
+        "## Skills (mandatory)\n"
+        "Before replying, scan the skills below. If one clearly matches your task, "
+        "load it with skill_view(name) and follow its instructions. "
+        "If a skill has issues, fix it with skill_manage(action='patch').\n"
+        "After difficult/iterative tasks, offer to save as a skill. "
+        "If a skill you loaded was missing steps, had wrong commands, or needed "
+        "pitfalls you discovered, update it before finishing.\n"
+        "\n"
+        "<available_skills>\n"
+        + "\n".join(index_lines) + "\n"
+        "</available_skills>\n"
+        "\n"
+        "If none match, proceed normally without loading a skill."
+    )


 # =========================================================================
@@ -128,11 +128,7 @@ def _build_skill_message(
                        supporting.append(rel)

    if supporting and skill_dir:
-        try:
-            skill_view_target = str(skill_dir.relative_to(SKILLS_DIR))
-        except ValueError:
-            # Skill is from an external dir — use the skill name instead
-            skill_view_target = skill_dir.name
+        skill_view_target = str(skill_dir.relative_to(SKILLS_DIR))
        parts.append("")
        parts.append("[This skill has supporting files you can load with the skill_view tool:]")
        for sf in supporting:
@@ -162,49 +158,38 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
    _skill_commands = {}
    try:
        from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform, _get_disabled_skill_names
-        from agent.skill_utils import get_external_skills_dirs
+        if not SKILLS_DIR.exists():
+            return _skill_commands
        disabled = _get_disabled_skill_names()
-        seen_names: set = set()
-
-        # Scan local dir first, then external dirs
-        dirs_to_scan = []
-        if SKILLS_DIR.exists():
-            dirs_to_scan.append(SKILLS_DIR)
-        dirs_to_scan.extend(get_external_skills_dirs())
-
-        for scan_dir in dirs_to_scan:
-            for skill_md in scan_dir.rglob("SKILL.md"):
-                if any(part in ('.git', '.github', '.hub') for part in skill_md.parts):
+        for skill_md in SKILLS_DIR.rglob("SKILL.md"):
+            if any(part in ('.git', '.github', '.hub') for part in skill_md.parts):
+                continue
+            try:
+                content = skill_md.read_text(encoding='utf-8')
+                frontmatter, body = _parse_frontmatter(content)
+                # Skip skills incompatible with the current OS platform
+                if not skill_matches_platform(frontmatter):
                    continue
-                try:
-                    content = skill_md.read_text(encoding='utf-8')
-                    frontmatter, body = _parse_frontmatter(content)
-                    # Skip skills incompatible with the current OS platform
-                    if not skill_matches_platform(frontmatter):
-                        continue
-                    name = frontmatter.get('name', skill_md.parent.name)
-                    if name in seen_names:
-                        continue
-                    # Respect user's disabled skills config
-                    if name in disabled:
-                        continue
-                    description = frontmatter.get('description', '')
-                    if not description:
-                        for line in body.strip().split('\n'):
-                            line = line.strip()
-                            if line and not line.startswith('#'):
-                                description = line[:80]
-                                break
-                    seen_names.add(name)
-                    cmd_name = name.lower().replace(' ', '-').replace('_', '-')
-                    _skill_commands[f"/{cmd_name}"] = {
-                        "name": name,
-                        "description": description or f"Invoke the {name} skill",
-                        "skill_md_path": str(skill_md),
-                        "skill_dir": str(skill_md.parent),
-                    }
-                except Exception:
+                name = frontmatter.get('name', skill_md.parent.name)
+                # Respect user's disabled skills config
+                if name in disabled:
                    continue
+                description = frontmatter.get('description', '')
+                if not description:
+                    for line in body.strip().split('\n'):
+                        line = line.strip()
+                        if line and not line.startswith('#'):
+                            description = line[:80]
+                            break
+                cmd_name = name.lower().replace(' ', '-').replace('_', '-')
+                _skill_commands[f"/{cmd_name}"] = {
+                    "name": name,
+                    "description": description or f"Invoke the {name} skill",
+                    "skill_md_path": str(skill_md),
+                    "skill_dir": str(skill_md.parent),
+                }
+            except Exception:
+                continue
    except Exception:
        pass
    return _skill_commands
@@ -1,270 +0,0 @@
-"""Lightweight skill metadata utilities shared by prompt_builder and skills_tool.
-
-This module intentionally avoids importing the tool registry, CLI config, or any
-heavy dependency chain.  It is safe to import at module level without triggering
-tool registration or provider resolution.
-"""
-
-import logging
-import os
-import re
-import sys
-from pathlib import Path
-from typing import Any, Dict, List, Optional, Set, Tuple
-
-from hermes_constants import get_hermes_home
-
-logger = logging.getLogger(__name__)
-
-# ── Platform mapping ──────────────────────────────────────────────────────
-
-PLATFORM_MAP = {
-    "macos": "darwin",
-    "linux": "linux",
-    "windows": "win32",
-}
-
-EXCLUDED_SKILL_DIRS = frozenset((".git", ".github", ".hub"))
-
-# ── Lazy YAML loader ─────────────────────────────────────────────────────
-
-_yaml_load_fn = None
-
-
-def yaml_load(content: str):
-    """Parse YAML with lazy import and CSafeLoader preference."""
-    global _yaml_load_fn
-    if _yaml_load_fn is None:
-        import yaml
-
-        loader = getattr(yaml, "CSafeLoader", None) or yaml.SafeLoader
-
-        def _load(value: str):
-            return yaml.load(value, Loader=loader)
-
-        _yaml_load_fn = _load
-    return _yaml_load_fn(content)
-
-
-# ── Frontmatter parsing ──────────────────────────────────────────────────
-
-
-def parse_frontmatter(content: str) -> Tuple[Dict[str, Any], str]:
-    """Parse YAML frontmatter from a markdown string.
-
-    Uses yaml with CSafeLoader for full YAML support (nested metadata, lists)
-    with a fallback to simple key:value splitting for robustness.
-
-    Returns:
-        (frontmatter_dict, remaining_body)
-    """
-    frontmatter: Dict[str, Any] = {}
-    body = content
-
-    if not content.startswith("---"):
-        return frontmatter, body
-
-    end_match = re.search(r"\n---\s*\n", content[3:])
-    if not end_match:
-        return frontmatter, body
-
-    yaml_content = content[3 : end_match.start() + 3]
-    body = content[end_match.end() + 3 :]
-
-    try:
-        parsed = yaml_load(yaml_content)
-        if isinstance(parsed, dict):
-            frontmatter = parsed
-    except Exception:
-        # Fallback: simple key:value parsing for malformed YAML
-        for line in yaml_content.strip().split("\n"):
-            if ":" not in line:
-                continue
-            key, value = line.split(":", 1)
-            frontmatter[key.strip()] = value.strip()
-
-    return frontmatter, body
-
-
-# ── Platform matching ─────────────────────────────────────────────────────
-
-
-def skill_matches_platform(frontmatter: Dict[str, Any]) -> bool:
-    """Return True when the skill is compatible with the current OS.
-
-    Skills declare platform requirements via a top-level ``platforms`` list
-    in their YAML frontmatter::
-
-        platforms: [macos]          # macOS only
-        platforms: [macos, linux]   # macOS and Linux
-
-    If the field is absent or empty the skill is compatible with **all**
-    platforms (backward-compatible default).
-    """
-    platforms = frontmatter.get("platforms")
-    if not platforms:
-        return True
-    if not isinstance(platforms, list):
-        platforms = [platforms]
-    current = sys.platform
-    for platform in platforms:
-        normalized = str(platform).lower().strip()
-        mapped = PLATFORM_MAP.get(normalized, normalized)
-        if current.startswith(mapped):
-            return True
-    return False
-
-
-# ── Disabled skills ───────────────────────────────────────────────────────
-
-
-def get_disabled_skill_names() -> Set[str]:
-    """Read disabled skill names from config.yaml.
-
-    Resolves platform from ``HERMES_PLATFORM`` env var, falls back to
-    the global disabled list.  Reads the config file directly (no CLI
-    config imports) to stay lightweight.
-    """
-    config_path = get_hermes_home() / "config.yaml"
-    if not config_path.exists():
-        return set()
-    try:
-        parsed = yaml_load(config_path.read_text(encoding="utf-8"))
-    except Exception as e:
-        logger.debug("Could not read skill config %s: %s", config_path, e)
-        return set()
-    if not isinstance(parsed, dict):
-        return set()
-
-    skills_cfg = parsed.get("skills")
-    if not isinstance(skills_cfg, dict):
-        return set()
-
-    resolved_platform = os.getenv("HERMES_PLATFORM")
-    if resolved_platform:
-        platform_disabled = (skills_cfg.get("platform_disabled") or {}).get(
-            resolved_platform
-        )
-        if platform_disabled is not None:
-            return _normalize_string_set(platform_disabled)
-    return _normalize_string_set(skills_cfg.get("disabled"))
-
-
-def _normalize_string_set(values) -> Set[str]:
-    if values is None:
-        return set()
-    if isinstance(values, str):
-        values = [values]
-    return {str(v).strip() for v in values if str(v).strip()}
-
-
-# ── External skills directories ──────────────────────────────────────────
-
-
-def get_external_skills_dirs() -> List[Path]:
-    """Read ``skills.external_dirs`` from config.yaml and return validated paths.
-
-    Each entry is expanded (``~`` and ``${VAR}``) and resolved to an absolute
-    path.  Only directories that actually exist are returned.  Duplicates and
-    paths that resolve to the local ``~/.hermes/skills/`` are silently skipped.
-    """
-    config_path = get_hermes_home() / "config.yaml"
-    if not config_path.exists():
-        return []
-    try:
-        parsed = yaml_load(config_path.read_text(encoding="utf-8"))
-    except Exception:
-        return []
-    if not isinstance(parsed, dict):
-        return []
-
-    skills_cfg = parsed.get("skills")
-    if not isinstance(skills_cfg, dict):
-        return []
-
-    raw_dirs = skills_cfg.get("external_dirs")
-    if not raw_dirs:
-        return []
-    if isinstance(raw_dirs, str):
-        raw_dirs = [raw_dirs]
-    if not isinstance(raw_dirs, list):
-        return []
-
-    local_skills = (get_hermes_home() / "skills").resolve()
-    seen: Set[Path] = set()
-    result: List[Path] = []
-
-    for entry in raw_dirs:
-        entry = str(entry).strip()
-        if not entry:
-            continue
-        # Expand ~ and environment variables
-        expanded = os.path.expanduser(os.path.expandvars(entry))
-        p = Path(expanded).resolve()
-        if p == local_skills:
-            continue
-        if p in seen:
-            continue
-        if p.is_dir():
-            seen.add(p)
-            result.append(p)
-        else:
-            logger.debug("External skills dir does not exist, skipping: %s", p)
-
-    return result
-
-
-def get_all_skills_dirs() -> List[Path]:
-    """Return all skill directories: local ``~/.hermes/skills/`` first, then external.
-
-    The local dir is always first (and always included even if it doesn't exist
-    yet — callers handle that).  External dirs follow in config order.
-    """
-    dirs = [get_hermes_home() / "skills"]
-    dirs.extend(get_external_skills_dirs())
-    return dirs
-
-
-# ── Condition extraction ──────────────────────────────────────────────────
-
-
-def extract_skill_conditions(frontmatter: Dict[str, Any]) -> Dict[str, List]:
-    """Extract conditional activation fields from parsed frontmatter."""
-    hermes = (frontmatter.get("metadata") or {}).get("hermes") or {}
-    return {
-        "fallback_for_toolsets": hermes.get("fallback_for_toolsets", []),
-        "requires_toolsets": hermes.get("requires_toolsets", []),
-        "fallback_for_tools": hermes.get("fallback_for_tools", []),
-        "requires_tools": hermes.get("requires_tools", []),
-    }
-
-
-# ── Description extraction ────────────────────────────────────────────────
-
-
-def extract_skill_description(frontmatter: Dict[str, Any]) -> str:
-    """Extract a truncated description from parsed frontmatter."""
-    raw_desc = frontmatter.get("description", "")
-    if not raw_desc:
-        return ""
-    desc = str(raw_desc).strip().strip("'\"")
-    if len(desc) > 60:
-        return desc[:57] + "..."
-    return desc
-
-
-# ── File iteration ────────────────────────────────────────────────────────
-
-
-def iter_skill_index_files(skills_dir: Path, filename: str):
-    """Walk skills_dir yielding sorted paths matching *filename*.
-
-    Excludes ``.git``, ``.github``, ``.hub`` directories.
-    """
-    matches = []
-    for root, dirs, files in os.walk(skills_dir):
-        dirs[:] = [d for d in dirs if d not in EXCLUDED_SKILL_DIRS]
-        if filename in files:
-            matches.append(Path(root) / filename)
-    for path in sorted(matches, key=lambda p: str(p.relative_to(skills_dir))):
-        yield path
@@ -19,7 +19,7 @@ _TITLE_PROMPT = (
 )


-def generate_title(user_message: str, assistant_response: str, timeout: float = 30.0) -> Optional[str]:
+def generate_title(user_message: str, assistant_response: str, timeout: float = 15.0) -> Optional[str]:
    """Generate a session title from the first exchange.

    Uses the auxiliary LLM client (cheapest/fastest available model).
@@ -7,7 +7,6 @@
 # =============================================================================
 model:
  # Default model to use (can be overridden with --model flag)
-  # Both "default" and "model" work as the key name here.
  default: "anthropic/claude-opus-4.6"
  
  # Inference provider selection:
@@ -402,15 +401,6 @@ skills:
  # Set to 0 to disable.
  creation_nudge_interval: 15

-  # External skill directories — share skills across tools/agents without
-  # copying them into ~/.hermes/skills/.  Each path is expanded (~ and ${VAR})
-  # and resolved to an absolute path.  External dirs are read-only: skill
-  # creation always writes to ~/.hermes/skills/.  Local skills take precedence
-  # when names collide.
-  # external_dirs:
-  #   - ~/.agents/skills
-  #   - /home/shared/team-skills
-
 # =============================================================================
 # Agent Behavior
 # =============================================================================
@@ -698,12 +688,6 @@ display:
  # Toggle at runtime with /verbose in the CLI
  tool_progress: all

-  # What Enter does when Hermes is already busy in the CLI.
-  #   interrupt: Interrupt the current run and redirect Hermes (default)
-  #   queue:     Queue your message for the next turn
-  # Ctrl+C always interrupts regardless of this setting.
-  busy_input_mode: interrupt
-
  # Background process notifications (gateway/messaging only).
  # Controls how chatty the process watcher is when you use
  # terminal(background=true, check_interval=...) from Telegram/Discord/etc.
@@ -70,7 +70,7 @@ _COMMAND_SPINNER_FRAMES = ("⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧

 # Load .env from ~/.hermes/.env first, then project root as dev fallback.
 # User-managed env files should override stale shell exports on restart.
-from hermes_constants import get_hermes_home, display_hermes_home, OPENROUTER_BASE_URL
+from hermes_constants import get_hermes_home, OPENROUTER_BASE_URL
 from hermes_cli.env_loader import load_hermes_dotenv

 _hermes_home = get_hermes_home()
@@ -205,7 +205,6 @@ def load_cli_config() -> Dict[str, Any]:
            "resume_display": "full",
            "show_reasoning": False,
            "streaming": True,
-            "busy_input_mode": "interrupt",

            "skin": "default",
        },
@@ -449,17 +448,6 @@ try:
 except Exception:
    pass  # Skin engine is optional — default skin used if unavailable

-# Neuter AsyncHttpxClientWrapper.__del__ before any AsyncOpenAI clients are
-# created.  The SDK's __del__ schedules aclose() on asyncio.get_running_loop()
-# which, during CLI idle time, finds prompt_toolkit's event loop and tries to
-# close TCP transports bound to dead worker loops — producing
-# "Event loop is closed" / "Press ENTER to continue..." errors.
-try:
-    from agent.auxiliary_client import neuter_async_httpx_del
-    neuter_async_httpx_del()
-except Exception:
-    pass
-
 from rich import box as rich_box
 from rich.console import Console
 from rich.markup import escape as _escape
@@ -1047,18 +1035,13 @@ class HermesCLI:
        self.config = CLI_CONFIG
        self.compact = compact if compact is not None else CLI_CONFIG["display"].get("compact", False)
        # tool_progress: "off", "new", "all", "verbose" (from config.yaml display section)
-        # YAML 1.1 parses bare `off` as boolean False — normalise to string.
-        _raw_tp = CLI_CONFIG["display"].get("tool_progress", "all")
-        self.tool_progress_mode = "off" if _raw_tp is False else str(_raw_tp)
+        self.tool_progress_mode = CLI_CONFIG["display"].get("tool_progress", "all")
        # resume_display: "full" (show history) | "minimal" (one-liner only)
        self.resume_display = CLI_CONFIG["display"].get("resume_display", "full")
        # bell_on_complete: play terminal bell (\a) when agent finishes a response
        self.bell_on_complete = CLI_CONFIG["display"].get("bell_on_complete", False)
        # show_reasoning: display model thinking/reasoning before the response
        self.show_reasoning = CLI_CONFIG["display"].get("show_reasoning", False)
-        # busy_input_mode: "interrupt" (Enter interrupts current run) or "queue" (Enter queues for next turn)
-        _bim = CLI_CONFIG["display"].get("busy_input_mode", "interrupt")
-        self.busy_input_mode = "queue" if str(_bim).strip().lower() == "queue" else "interrupt"

        self.verbose = verbose if verbose is not None else (self.tool_progress_mode == "verbose")
        
@@ -1078,12 +1061,12 @@ class HermesCLI:
        # authoritative.  This avoids conflicts in multi-agent setups where
        # env vars would stomp each other.
        _model_config = CLI_CONFIG.get("model", {})
-        _config_model = (_model_config.get("default") or _model_config.get("model") or "") if isinstance(_model_config, dict) else (_model_config or "")
+        _config_model = _model_config.get("default", "") if isinstance(_model_config, dict) else (_model_config or "")
        _FALLBACK_MODEL = "anthropic/claude-opus-4.6"
        self.model = model or _config_model or _FALLBACK_MODEL
        # Auto-detect model from local server if still on fallback
        if self.model == _FALLBACK_MODEL:
-            _base_url = (_model_config.get("base_url") or "") if isinstance(_model_config, dict) else ""
+            _base_url = _model_config.get("base_url", "") if isinstance(_model_config, dict) else ""
            if "localhost" in _base_url or "127.0.0.1" in _base_url:
                from hermes_cli.runtime_provider import _auto_detect_local_model
                _detected = _auto_detect_local_model(_base_url)
@@ -1346,12 +1329,7 @@ class HermesCLI:
    def _build_status_bar_text(self, width: Optional[int] = None) -> str:
        try:
            snapshot = self._get_status_bar_snapshot()
-            if width is None:
-                try:
-                    from prompt_toolkit.application import get_app
-                    width = get_app().output.get_size().columns
-                except Exception:
-                    width = shutil.get_terminal_size((80, 24)).columns
+            width = width or shutil.get_terminal_size((80, 24)).columns
            percent = snapshot["context_percent"]
            percent_label = f"{percent}%" if percent is not None else "--"
            duration_label = snapshot["duration"]
@@ -1381,16 +1359,7 @@ class HermesCLI:
            return []
        try:
            snapshot = self._get_status_bar_snapshot()
-            # Use prompt_toolkit's own terminal width when running inside the
-            # TUI — shutil.get_terminal_size() can return stale or fallback
-            # values (especially on SSH) that differ from what prompt_toolkit
-            # actually renders, causing the fragments to overflow to a second
-            # line and produce duplicated status bar rows over long sessions.
-            try:
-                from prompt_toolkit.application import get_app
-                width = get_app().output.get_size().columns
-            except Exception:
-                width = shutil.get_terminal_size((80, 24)).columns
+            width = shutil.get_terminal_size((80, 24)).columns
            duration_label = snapshot["duration"]

            if width < 52:
@@ -1625,7 +1594,6 @@ class HermesCLI:
        if not text:
            return
        self._reasoning_stream_started = True
-        self._reasoning_shown_this_turn = True
        if getattr(self, "_stream_box_opened", False):
            return

@@ -2961,82 +2929,6 @@ class HermesCLI:
        if not silent:
            print("(^_^)v New session started!")

-    def _handle_resume_command(self, cmd_original: str) -> None:
-        """Handle /resume <session_id_or_title> — switch to a previous session mid-conversation."""
-        parts = cmd_original.split(None, 1)
-        target = parts[1].strip() if len(parts) > 1 else ""
-
-        if not target:
-            _cprint("  Usage: /resume <session_id_or_title>")
-            _cprint("  Tip:   Use /history or `hermes sessions list` to find sessions.")
-            return
-
-        if not self._session_db:
-            _cprint("  Session database not available.")
-            return
-
-        # Resolve title or ID
-        from hermes_cli.main import _resolve_session_by_name_or_id
-        resolved = _resolve_session_by_name_or_id(target)
-        target_id = resolved or target
-
-        session_meta = self._session_db.get_session(target_id)
-        if not session_meta:
-            _cprint(f"  Session not found: {target}")
-            _cprint("  Use /history or `hermes sessions list` to see available sessions.")
-            return
-
-        if target_id == self.session_id:
-            _cprint("  Already on that session.")
-            return
-
-        # End current session
-        try:
-            self._session_db.end_session(self.session_id, "resumed_other")
-        except Exception:
-            pass
-
-        # Switch to the target session
-        self.session_id = target_id
-        self._resumed = True
-        self._pending_title = None
-
-        # Load conversation history
-        restored = self._session_db.get_messages_as_conversation(target_id)
-        self.conversation_history = restored or []
-
-        # Re-open the target session so it's not marked as ended
-        try:
-            self._session_db.reopen_session(target_id)
-        except Exception:
-            pass
-
-        # Sync the agent if already initialised
-        if self.agent:
-            self.agent.session_id = target_id
-            self.agent.reset_session_state()
-            if hasattr(self.agent, "_last_flushed_db_idx"):
-                self.agent._last_flushed_db_idx = len(self.conversation_history)
-            if hasattr(self.agent, "_todo_store"):
-                try:
-                    from tools.todo_tool import TodoStore
-                    self.agent._todo_store = TodoStore()
-                except Exception:
-                    pass
-            if hasattr(self.agent, "_invalidate_system_prompt"):
-                self.agent._invalidate_system_prompt()
-
-        title_part = f" \"{session_meta['title']}\"" if session_meta.get("title") else ""
-        msg_count = len([m for m in self.conversation_history if m.get("role") == "user"])
-        if self.conversation_history:
-            _cprint(
-                f"  ↻ Resumed session {target_id}{title_part}"
-                f" ({msg_count} user message{'s' if msg_count != 1 else ''},"
-                f" {len(self.conversation_history)} total)"
-            )
-        else:
-            _cprint(f"  ↻ Resumed session {target_id}{title_part} — no messages, starting fresh.")
-
    def reset_conversation(self):
        """Reset the conversation by starting a new session."""
        self.new_session()
@@ -3594,7 +3486,7 @@ class HermesCLI:
            print("  To start the gateway:")
            print("    python cli.py --gateway")
            print()
-            print(f"  Configuration file: {display_hermes_home()}/config.yaml")
+            print("  Configuration file: ~/.hermes/config.yaml")
            print()
            
        except Exception as e:
@@ -3604,7 +3496,7 @@ class HermesCLI:
            print("    1. Set environment variables:")
            print("       TELEGRAM_BOT_TOKEN=your_token")
            print("       DISCORD_BOT_TOKEN=your_token")
-            print(f"    2. Or configure settings in {display_hermes_home()}/config.yaml")
+            print("    2. Or configure settings in ~/.hermes/config.yaml")
            print()
    
    def process_command(self, command: str) -> bool:
@@ -3755,8 +3647,6 @@ class HermesCLI:
                    _cprint("  Session database not available.")
        elif canonical == "new":
            self.new_session()
-        elif canonical == "resume":
-            self._handle_resume_command(cmd_original)
        elif canonical == "provider":
            self._show_model_and_providers()
        elif canonical == "prompt":
@@ -3811,7 +3701,7 @@ class HermesCLI:
                plugins = mgr.list_plugins()
                if not plugins:
                    print("No plugins installed.")
-                    print(f"Drop plugin directories into {display_hermes_home()}/plugins/ to get started.")
+                    print("Drop plugin directories into ~/.hermes/plugins/ to get started.")
                else:
                    print(f"Plugins ({len(plugins)}):")
                    for p in plugins:
@@ -3832,17 +3722,17 @@ class HermesCLI:
        elif canonical == "background":
            self._handle_background_command(cmd_original)
        elif canonical == "queue":
-            # Extract prompt after "/queue " or "/q "
-            parts = cmd_original.split(None, 1)
-            payload = parts[1].strip() if len(parts) > 1 else ""
-            if not payload:
-                _cprint("  Usage: /queue <prompt>")
+            if not self._agent_running:
+                _cprint("  /queue only works while Hermes is busy. Just type your message normally.")
            else:
-                self._pending_input.put(payload)
-                if self._agent_running:
-                    _cprint(f"  Queued for the next turn: {payload[:80]}{'...' if len(payload) > 80 else ''}")
+                # Extract prompt after "/queue " or "/q "
+                parts = cmd_original.split(None, 1)
+                payload = parts[1].strip() if len(parts) > 1 else ""
+                if not payload:
+                    _cprint("  Usage: /queue <prompt>")
                else:
-                    _cprint(f"  Queued: {payload[:80]}{'...' if len(payload) > 80 else ''}")
+                    self._pending_input.put(payload)
+                    _cprint(f"  Queued for the next turn: {payload[:80]}{'...' if len(payload) > 80 else ''}")
        elif canonical == "skin":
            self._handle_skin_command(cmd_original)
        elif canonical == "voice":
@@ -4034,17 +3924,6 @@ class HermesCLI:
                    provider_data_collection=self._provider_data_collection,
                    fallback_model=self._fallback_model,
                )
-                # Silence raw spinner; route thinking through TUI widget when no foreground agent is active.
-                bg_agent._print_fn = lambda *_a, **_kw: None
-
-                def _bg_thinking(text: str) -> None:
-                    # Concurrent bg tasks may race on _spinner_text; acceptable for best-effort UI.
-                    if not self._agent_running:
-                        self._spinner_text = text
-                        if self._app:
-                            self._app.invalidate()
-
-                bg_agent.thinking_callback = _bg_thinking

                result = bg_agent.run_conversation(
                    user_message=prompt,
@@ -4107,9 +3986,6 @@ class HermesCLI:
                _cprint(f"  ❌ Background task #{task_num} failed: {e}")
            finally:
                self._background_tasks.pop(task_id, None)
-                # Clear spinner only if no foreground agent owns it
-                if not self._agent_running:
-                    self._spinner_text = ""
                if self._app:
                    self._invalidate(min_interval=0)

@@ -4340,7 +4216,7 @@ class HermesCLI:
                source = f" ({s['source']})" if s["source"] == "user" else ""
                print(f"   {marker} {s['name']}{source} — {s['description']}")
            print("\n  Usage: /skin <name>")
-            print(f"  Custom skins: drop a YAML file in {display_hermes_home()}/skins/\n")
+            print("  Custom skins: drop a YAML file in ~/.hermes/skins/\n")
            return

        new_skin = parts[1].strip().lower()
@@ -4520,7 +4396,7 @@ class HermesCLI:
        compressor = agent.context_compressor
        last_prompt = compressor.last_prompt_tokens
        ctx_len = compressor.context_length
-        pct = min(100, (last_prompt / ctx_len * 100)) if ctx_len else 0
+        pct = (last_prompt / ctx_len * 100) if ctx_len else 0
        compressions = compressor.compression_count

        msg_count = len(self.conversation_history)
@@ -5548,13 +5424,6 @@ class HermesCLI:
            except Exception as e:
                logging.debug("@ context reference expansion failed: %s", e)

-        # Sanitize surrogate characters that can arrive via clipboard paste from
-        # rich-text editors (Google Docs, Word, etc.).  Lone surrogates are invalid
-        # UTF-8 and crash JSON serialization in the OpenAI SDK.
-        if isinstance(message, str):
-            from run_agent import _sanitize_surrogates
-            message = _sanitize_surrogates(message)
-
        # Add user message to history
        self.conversation_history.append({"role": "user", "content": message})

@@ -5567,10 +5436,6 @@ class HermesCLI:

            # Reset streaming display state for this turn
            self._reset_stream_state()
-            # Separate from _reset_stream_state because this must persist
-            # across intermediate turn boundaries (tool-calling loops) — only
-            # reset at the start of each user turn.
-            self._reasoning_shown_this_turn = False

            # --- Streaming TTS setup ---
            # When ElevenLabs is the TTS provider and sounddevice is available,
@@ -5715,16 +5580,6 @@ class HermesCLI:

            agent_thread.join()  # Ensure agent thread completes

-            # Proactively clean up async clients whose event loop is dead.
-            # The agent thread may have created AsyncOpenAI clients bound
-            # to a per-thread event loop; if that loop is now closed, those
-            # clients' __del__ would crash prompt_toolkit's loop on GC.
-            try:
-                from agent.auxiliary_client import cleanup_stale_async_clients
-                cleanup_stale_async_clients()
-            except Exception:
-                pass
-
            # Flush any remaining streamed text and close the box
            self._flush_stream()

@@ -5785,13 +5640,8 @@ class HermesCLI:
            response_previewed = result.get("response_previewed", False) if result else False

            # Display reasoning (thinking) box if enabled and available.
-            # Skip when streaming already showed reasoning live.  Use the
-            # turn-persistent flag (_reasoning_shown_this_turn) instead of
-            # _reasoning_stream_started — the latter gets reset during
-            # intermediate turn boundaries (tool-calling loops), which caused
-            # the reasoning box to re-render after the final response.
-            _reasoning_already_shown = getattr(self, '_reasoning_shown_this_turn', False)
-            if self.show_reasoning and result and not _reasoning_already_shown:
+            # Skip when streaming already showed reasoning live.
+            if self.show_reasoning and result and not self._reasoning_stream_started:
                reasoning = result.get("last_reasoning")
                if reasoning:
                    w = shutil.get_terminal_size().columns
@@ -5912,22 +5762,10 @@ class HermesCLI:
            else:
                duration_str = f"{seconds}s"
            
-            # Look up session title for resume-by-name hint
-            session_title = None
-            if self._session_db:
-                try:
-                    session_title = self._session_db.get_session_title(self.session_id)
-                except Exception:
-                    pass
-
            print("Resume this session with:")
            print(f"  hermes --resume {self.session_id}")
-            if session_title:
-                print(f"  hermes -c \"{session_title}\"")
            print()
            print(f"Session:        {self.session_id}")
-            if session_title:
-                print(f"Title:          {session_title}")
            print(f"Duration:       {duration_str}")
            print(f"Messages:       {msg_count} ({user_msgs} user, {tool_calls} tool calls)")
        else:
@@ -5944,9 +5782,6 @@ class HermesCLI:
        ``normal_prompt`` is the full ``branding.prompt_symbol``.
        ``state_suffix`` is what special states (sudo/secret/approval/agent)
        should render after their leading icon.
-
-        When a profile is active (not "default"), the profile name is
-        prepended to the prompt symbol: ``coder ❯`` instead of ``❯``.
        """
        try:
            from hermes_cli.skin_engine import get_active_prompt_symbol
@@ -5955,15 +5790,6 @@ class HermesCLI:
            symbol = "❯ "

        symbol = (symbol or "❯ ").rstrip() + " "
-
-        # Prepend profile name when not default
-        try:
-            from hermes_cli.profiles import get_active_profile_name
-            profile = get_active_profile_name()
-            if profile not in ("default", "custom"):
-                symbol = f"{profile} {symbol}"
-        except Exception:
-            pass
        stripped = symbol.rstrip()
        if not stripped:
            return "❯ ", "❯ "
@@ -6115,7 +5941,7 @@ class HermesCLI:
            from honcho_integration.client import HonchoClientConfig
            from agent.display import honcho_session_line, write_tty
            hcfg = HonchoClientConfig.from_global_config()
-            if hcfg.enabled and (hcfg.api_key or hcfg.base_url) and hcfg.explicitly_configured:
+            if hcfg.enabled and hcfg.api_key and hcfg.explicitly_configured:
                sname = hcfg.resolve_session_name(session_id=self.session_id)
                if sname:
                    write_tty(honcho_session_line(hcfg.workspace_id, sname) + "\n")
@@ -6202,18 +6028,10 @@ class HermesCLI:
        set_approval_callback(self._approval_callback)
        set_secret_capture_callback(self._secret_capture_callback)

-        # Ensure tirith security scanner is available (downloads if needed).
-        # Warn the user if tirith is enabled in config but not available,
-        # so they know command security scanning is degraded.
+        # Ensure tirith security scanner is available (downloads if needed)
        try:
            from tools.tirith_security import ensure_installed
-            tirith_path = ensure_installed(log_failures=False)
-            if tirith_path is None:
-                security_cfg = self.config.get("security", {}) or {}
-                tirith_enabled = security_cfg.get("tirith_enabled", True)
-                if tirith_enabled:
-                    _cprint(f"  {_DIM}⚠ tirith security scanner enabled but not available "
-                            f"— command scanning will use pattern matching only{_RST}")
+            ensure_installed(log_failures=False)
        except Exception:
            pass  # Non-fatal — fail-open at scan time if unavailable
        
@@ -6294,22 +6112,16 @@ class HermesCLI:
                # Bundle text + images as a tuple when images are present
                payload = (text, images) if images else text
                if self._agent_running and not (text and text.startswith("/")):
-                    if self.busy_input_mode == "queue":
-                        # Queue for the next turn instead of interrupting
-                        self._pending_input.put(payload)
-                        preview = text if text else f"[{len(images)} image{'s' if len(images) != 1 else ''} attached]"
-                        _cprint(f"  Queued for the next turn: {preview[:80]}{'...' if len(preview) > 80 else ''}")
-                    else:
-                        self._interrupt_queue.put(payload)
-                        # Debug: log to file when message enters interrupt queue
-                        try:
-                            _dbg = _hermes_home / "interrupt_debug.log"
-                            with open(_dbg, "a") as _f:
-                                import time as _t
-                                _f.write(f"{_t.strftime('%H:%M:%S')} ENTER: queued interrupt msg={str(payload)[:60]!r}, "
-                                         f"agent_running={self._agent_running}\n")
-                        except Exception:
-                            pass
+                    self._interrupt_queue.put(payload)
+                    # Debug: log to file when message enters interrupt queue
+                    try:
+                        _dbg = _hermes_home / "interrupt_debug.log"
+                        with open(_dbg, "a") as _f:
+                            import time as _t
+                            _f.write(f"{_t.strftime('%H:%M:%S')} ENTER: queued interrupt msg={str(payload)[:60]!r}, "
+                                     f"agent_running={self._agent_running}\n")
+                    except Exception:
+                        pass
                else:
                    self._pending_input.put(payload)
                event.app.current_buffer.reset(append_to_history=True)
@@ -6689,7 +6501,6 @@ class HermesCLI:
        # Paste collapsing: detect large pastes and save to temp file
        _paste_counter = [0]
        _prev_text_len = [0]
-        _prev_newline_count = [0]
        _paste_just_collapsed = [False]

        def _on_text_changed(buf):
@@ -6698,27 +6509,18 @@ class HermesCLI:
            When bracketed paste is available, handle_paste collapses
            large pastes directly.  This handler is a fallback for
            terminals without bracketed paste support.
-
-            Two heuristics (either triggers collapse):
-            1. Many characters added at once (chars_added > 1) — works
-               when the terminal delivers the paste in one event-loop tick.
-            2. Newline count jumped by 4+ in a single text-change event —
-               catches terminals that feed characters individually but
-               still batch newlines.  Alt+Enter only adds 1 newline per
-               event so it never triggers this.
            """
            text = buf.text
            chars_added = len(text) - _prev_text_len[0]
            _prev_text_len[0] = len(text)
            if _paste_just_collapsed[0]:
                _paste_just_collapsed[0] = False
-                _prev_newline_count[0] = text.count('\n')
                return
            line_count = text.count('\n')
-            newlines_added = line_count - _prev_newline_count[0]
-            _prev_newline_count[0] = line_count
-            is_paste = chars_added > 1 or newlines_added >= 4
-            if line_count >= 5 and is_paste and not text.startswith('/'):
+            # Heuristic: a real paste adds many characters at once (not just a
+            # single newline from Alt+Enter) AND the result has 5+ lines.
+            # Fallback for terminals without bracketed paste support.
+            if line_count >= 5 and chars_added > 1 and not text.startswith('/'):
                _paste_counter[0] += 1
                # Save to temp file
                paste_dir = _hermes_home / "pastes"
@@ -6726,7 +6528,6 @@ class HermesCLI:
                paste_file = paste_dir / f"paste_{_paste_counter[0]}_{datetime.now().strftime('%H%M%S')}.txt"
                paste_file.write_text(text, encoding="utf-8")
                # Replace buffer with compact reference
-                _paste_just_collapsed[0] = True
                buf.text = f"[Pasted text #{_paste_counter[0]}: {line_count + 1} lines \u2192 {paste_file}]"
                buf.cursor_position = len(buf.text)

@@ -7093,15 +6894,6 @@ class HermesCLI:
            Window(
                content=FormattedTextControl(lambda: cli_ref._get_status_bar_fragments()),
                height=1,
-                # Prevent fragments that overflow the terminal width from
-                # wrapping onto a second line, which causes the status bar to
-                # appear duplicated (one full + one partial row) during long
-                # sessions, especially on SSH where shutil.get_terminal_size
-                # may return stale values.  _get_status_bar_fragments now reads
-                # width from prompt_toolkit's own output object, so fragments
-                # will always fit; wrap_lines=False is the belt-and-suspenders
-                # guard against any future width mismatch.
-                wrap_lines=False,
            ),
            filter=Condition(lambda: cli_ref._status_bar_visible),
        )
@@ -7336,28 +7128,9 @@ class HermesCLI:
        # Register atexit cleanup so resources are freed even on unexpected exit
        atexit.register(_run_cleanup)
        
-        # Install a custom asyncio exception handler that suppresses the
-        # "Event loop is closed" RuntimeError from httpx transport cleanup.
-        # This is defense-in-depth — the primary fix is neuter_async_httpx_del
-        # which disables __del__ entirely, but older clients or SDK upgrades
-        # could bypass it.
-        def _suppress_closed_loop_errors(loop, context):
-            exc = context.get("exception")
-            if isinstance(exc, RuntimeError) and "Event loop is closed" in str(exc):
-                return  # silently suppress
-            # Fall back to default handler for everything else
-            loop.default_exception_handler(context)
-
        # Run the application with patch_stdout for proper output handling
        try:
            with patch_stdout():
-                # Set the custom handler on prompt_toolkit's event loop
-                try:
-                    import asyncio as _aio
-                    _loop = _aio.get_event_loop()
-                    _loop.set_exception_handler(_suppress_closed_loop_errors)
-                except Exception:
-                    pass
                app.run()
        except (EOFError, KeyboardInterrupt):
            pass
@@ -327,20 +327,7 @@ def load_jobs() -> List[Dict[str, Any]]:
        with open(JOBS_FILE, 'r', encoding='utf-8') as f:
            data = json.load(f)
            return data.get("jobs", [])
-    except json.JSONDecodeError:
-        # Retry with strict=False to handle bare control chars in string values
-        try:
-            with open(JOBS_FILE, 'r', encoding='utf-8') as f:
-                data = json.loads(f.read(), strict=False)
-                jobs = data.get("jobs", [])
-                if jobs:
-                    # Auto-repair: rewrite with proper escaping
-                    save_jobs(jobs)
-                    logger.warning("Auto-repaired jobs.json (had invalid control characters)")
-                return jobs
-        except Exception:
-            return []
-    except IOError:
+    except (json.JSONDecodeError, IOError):
        return []


@@ -611,34 +598,6 @@ def mark_job_run(job_id: str, success: bool, error: Optional[str] = None):
    save_jobs(jobs)


-def advance_next_run(job_id: str) -> bool:
-    """Preemptively advance next_run_at for a recurring job before execution.
-
-    Call this BEFORE run_job() so that if the process crashes mid-execution,
-    the job won't re-fire on the next gateway restart.  This converts the
-    scheduler from at-least-once to at-most-once for recurring jobs — missing
-    one run is far better than firing dozens of times in a crash loop.
-
-    One-shot jobs are left unchanged so they can still retry on restart.
-
-    Returns True if next_run_at was advanced, False otherwise.
-    """
-    jobs = load_jobs()
-    for job in jobs:
-        if job["id"] == job_id:
-            kind = job.get("schedule", {}).get("kind")
-            if kind not in ("cron", "interval"):
-                return False
-            now = _hermes_now().isoformat()
-            new_next = compute_next_run(job["schedule"], now)
-            if new_next and new_next != job.get("next_run_at"):
-                job["next_run_at"] = new_next
-                save_jobs(jobs)
-                return True
-            return False
-    return False
-
-
 def get_due_jobs() -> List[Dict[str, Any]]:
    """Get all jobs that are due to run now.

@@ -35,7 +35,7 @@ logger = logging.getLogger(__name__)
 # Add parent directory to path for imports
 sys.path.insert(0, str(Path(__file__).parent.parent))

-from cron.jobs import get_due_jobs, mark_job_run, save_job_output, advance_next_run
+from cron.jobs import get_due_jobs, mark_job_run, save_job_output

 # Sentinel: when a cron agent has nothing new to report, it can start its
 # response with this marker to suppress delivery.  Output is still saved
@@ -524,12 +524,6 @@ def tick(verbose: bool = True) -> int:
        executed = 0
        for job in due_jobs:
            try:
-                # For recurring jobs (cron/interval), advance next_run_at to the
-                # next future occurrence BEFORE execution.  This way, if the
-                # process crashes mid-run, the job won't re-fire on restart.
-                # One-shot jobs are left alone so they can retry on restart.
-                advance_next_run(job["id"])
-
                success, output, final_response, error = run_job(job)

                output_file = save_job_output(job["id"], output)
@@ -1,15 +0,0 @@
-# Hermes Agent Persona
-
-<!--
-This file defines the agent's personality and tone.
-The agent will embody whatever you write here.
-Edit this to customize how Hermes communicates with you.
-
-Examples:
-  - "You are a warm, playful assistant who uses kaomoji occasionally."
-  - "You are a concise technical expert. No fluff, just facts."
-  - "You speak like a friendly coworker who happens to know everything."
-
-This file is loaded fresh each message -- no restart needed.
-Delete the contents (or this file) to use the default personality.
-->
@@ -1,34 +0,0 @@
-#!/bin/bash
-# Docker entrypoint: bootstrap config files into the mounted volume, then run hermes.
-set -e
-
-HERMES_HOME="/opt/data"
-INSTALL_DIR="/opt/hermes"
-
-# Create essential directory structure.  Cache and platform directories
-# (cache/images, cache/audio, platforms/whatsapp, etc.) are created on
-# demand by the application — don't pre-create them here so new installs
-# get the consolidated layout from get_hermes_dir().
-mkdir -p "$HERMES_HOME"/{cron,sessions,logs,hooks,memories,skills}
-
-# .env
-if [ ! -f "$HERMES_HOME/.env" ]; then
-    cp "$INSTALL_DIR/.env.example" "$HERMES_HOME/.env"
-fi
-
-# config.yaml
-if [ ! -f "$HERMES_HOME/config.yaml" ]; then
-    cp "$INSTALL_DIR/cli-config.yaml.example" "$HERMES_HOME/config.yaml"
-fi
-
-# SOUL.md
-if [ ! -f "$HERMES_HOME/SOUL.md" ]; then
-    cp "$INSTALL_DIR/docker/SOUL.md" "$HERMES_HOME/SOUL.md"
-fi
-
-# Sync bundled skills (manifest-based so user edits are preserved)
-if [ -d "$INSTALL_DIR/skills" ]; then
-    python3 "$INSTALL_DIR/tools/skills_sync.py"
-fi
-
-exec hermes "$@"
@@ -101,11 +101,21 @@ Available methods:

 ### Patches (`patches.py`)

-**Problem**: Some hermes-agent tools use `asyncio.run()` internally (e.g., the Modal backend). This crashes when called from inside Atropos's event loop because `asyncio.run()` cannot be nested.
+**Problem**: Some hermes-agent tools use `asyncio.run()` internally (e.g., the Modal backend via SWE-ReX). This crashes when called from inside Atropos's event loop because `asyncio.run()` cannot be nested.

-**Solution**: `ModalEnvironment` uses a dedicated `_AsyncWorker` background thread with its own event loop. The calling code sees a sync interface, but internally all async Modal SDK calls happen on the worker thread so they don't conflict with Atropos's loop. This is built directly into `tools/environments/modal.py` — no monkey-patching required.
+**Solution**: `patches.py` monkey-patches `SwerexModalEnvironment` to use a dedicated background thread (`_AsyncWorker`) with its own event loop. The calling code sees the same sync interface, but internally the async work happens on a separate thread that doesn't conflict with Atropos's loop.

-`patches.py` is now a no-op (kept for backward compatibility with imports).
+What gets patched:
+- `SwerexModalEnvironment.__init__` -- creates Modal deployment on a background thread
+- `SwerexModalEnvironment.execute` -- runs commands on the same background thread
+- `SwerexModalEnvironment.stop` -- stops deployment on the background thread
+
+The patches are:
+- **Idempotent** -- calling `apply_patches()` multiple times is safe
+- **Transparent** -- same interface and behavior, only the internal async execution changes
+- **Universal** -- works identically in normal CLI use (no running event loop)
+
+Applied automatically at import time by `hermes_base_env.py`.
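
A minimal sketch of the background-thread pattern this section describes, assuming nothing about the real `_AsyncWorker` beyond what the text says (a dedicated thread with its own event loop, exposed through a sync interface). The class `AsyncWorker`, the coroutine `fake_modal_call`, and the helper `sync_tool` are hypothetical names used only for illustration.

```python
import asyncio
import threading


class AsyncWorker:
    """Hypothetical stand-in: owns an event loop running on a background thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(
            target=self._loop.run_forever, name="async-worker", daemon=True
        )
        self._thread.start()

    def run(self, coro, timeout=None):
        """Schedule `coro` on the worker loop and block until it finishes."""
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout=timeout)

    def stop(self) -> None:
        """Stop the worker loop, wait for the thread, then release the loop."""
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def fake_modal_call(x: int) -> int:
    # Placeholder for an async SDK call (assumption, not the Modal API).
    await asyncio.sleep(0.01)
    return x * 2


def sync_tool(worker: AsyncWorker, x: int) -> int:
    # Sync interface: callers never touch an event loop directly.
    return worker.run(fake_modal_call(x))


async def outer(worker: AsyncWorker) -> int:
    # Called from inside a running loop, where asyncio.run() would raise
    # RuntimeError. The worker's loop lives on another thread, so this works.
    return sync_tool(worker, 21)
```

Note that `sync_tool` still blocks the calling thread while it waits. The pattern avoids the nested-loop crash but does not make the call non-blocking for the outer loop.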

 ### Tool Call Parsers (`tool_call_parsers/`)

@@ -1 +0,0 @@
-"""Built-in gateway hooks that are always registered."""
@@ -1,86 +0,0 @@
-"""Built-in boot-md hook — run ~/.hermes/BOOT.md on gateway startup.
-
-This hook is always registered. It silently skips if no BOOT.md exists.
-To activate, create ``~/.hermes/BOOT.md`` with instructions for the
-agent to execute on every gateway restart.
-
-Example BOOT.md::
-
-    # Startup Checklist
-
-    1. Check if any cron jobs failed overnight
-    2. Send a status update to Discord #general
-    3. If there are errors in /opt/app/deploy.log, summarize them
-
-The agent runs in a background thread so it doesn't block gateway
-startup. If nothing needs attention, it replies with [SILENT] to
-suppress delivery.
-"""
-
-import logging
-import os
-import threading
-from pathlib import Path
-
-logger = logging.getLogger("hooks.boot-md")
-
-HERMES_HOME = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes"))
-BOOT_FILE = HERMES_HOME / "BOOT.md"
-
-
-def _build_boot_prompt(content: str) -> str:
-    """Wrap BOOT.md content in a system-level instruction."""
-    return (
-        "You are running a startup boot checklist. Follow the BOOT.md "
-        "instructions below exactly.\n\n"
-        "---\n"
-        f"{content}\n"
-        "---\n\n"
-        "Execute each instruction. If you need to send a message to a "
-        "platform, use the send_message tool.\n"
-        "If nothing needs attention and there is nothing to report, "
-        "reply with ONLY: [SILENT]"
-    )
-
-
-def _run_boot_agent(content: str) -> None:
-    """Spawn a one-shot agent session to execute the boot instructions."""
-    try:
-        from run_agent import AIAgent
-
-        prompt = _build_boot_prompt(content)
-        agent = AIAgent(
-            quiet_mode=True,
-            skip_context_files=True,
-            skip_memory=True,
-            max_iterations=20,
-        )
-        result = agent.run_conversation(prompt)
-        response = result.get("final_response", "")
-        if response and "[SILENT]" not in response:
-            logger.info("boot-md completed: %s", response[:200])
-        else:
-            logger.info("boot-md completed (nothing to report)")
-    except Exception as e:
-        logger.error("boot-md agent failed: %s", e)
-
-
-async def handle(event_type: str, context: dict) -> None:
-    """Gateway startup handler — run BOOT.md if it exists."""
-    if not BOOT_FILE.exists():
-        return
-
-    content = BOOT_FILE.read_text(encoding="utf-8").strip()
-    if not content:
-        return
-
-    logger.info("Running BOOT.md (%d chars)", len(content))
-
-    # Run in a background thread so we don't block gateway startup.
-    thread = threading.Thread(
-        target=_run_boot_agent,
-        args=(content,),
-        name="boot-md",
-        daemon=True,
-    )
-    thread.start()
@@ -601,14 +601,6 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
            config.platforms[Platform.TELEGRAM] = PlatformConfig()
        config.platforms[Platform.TELEGRAM].reply_to_mode = telegram_reply_mode
    
-    telegram_fallback_ips = os.getenv("TELEGRAM_FALLBACK_IPS", "")
-    if telegram_fallback_ips:
-        if Platform.TELEGRAM not in config.platforms:
-            config.platforms[Platform.TELEGRAM] = PlatformConfig()
-        config.platforms[Platform.TELEGRAM].extra["fallback_ips"] = [
-            ip.strip() for ip in telegram_fallback_ips.split(",") if ip.strip()
-        ]
-
    telegram_home = os.getenv("TELEGRAM_HOME_CHANNEL")
    if telegram_home and Platform.TELEGRAM in config.platforms:
        config.platforms[Platform.TELEGRAM].home_channel = HomeChannel(
@@ -51,33 +51,14 @@ class HookRegistry:
        """Return metadata about all loaded hooks."""
        return list(self._loaded_hooks)

-    def _register_builtin_hooks(self) -> None:
-        """Register built-in hooks that are always active."""
-        try:
-            from gateway.builtin_hooks.boot_md import handle as boot_md_handle
-
-            self._handlers.setdefault("gateway:startup", []).append(boot_md_handle)
-            self._loaded_hooks.append({
-                "name": "boot-md",
-                "description": "Run ~/.hermes/BOOT.md on gateway startup",
-                "events": ["gateway:startup"],
-                "path": "(builtin)",
-            })
-        except Exception as e:
-            print(f"[hooks] Could not load built-in boot-md hook: {e}", flush=True)
-
    def discover_and_load(self) -> None:
        """
        Scan the hooks directory for hook directories and load their handlers.

-        Also registers built-in hooks that are always active.
-
        Each hook directory must contain:
          - HOOK.yaml with at least 'name' and 'events' keys
          - handler.py with a top-level 'handle' function (sync or async)
        """
-        self._register_builtin_hooks()
-
        if not HOOKS_DIR.exists():
            return

@@ -25,7 +25,7 @@ import time
 from pathlib import Path
 from typing import Optional

-from hermes_constants import get_hermes_dir
+from hermes_cli.config import get_hermes_home


 # Unambiguous alphabet -- excludes 0/O, 1/I to prevent confusion
@@ -41,7 +41,7 @@ LOCKOUT_SECONDS = 3600              # Lockout duration after too many failures
 MAX_PENDING_PER_PLATFORM = 3        # Max pending codes per platform
 MAX_FAILED_ATTEMPTS = 5             # Failed approvals before lockout

-PAIRING_DIR = get_hermes_dir("platforms/pairing", "pairing")
+PAIRING_DIR = get_hermes_home() / "pairing"


 def _secure_write(path: Path, data: str) -> None:
@@ -166,7 +166,7 @@ class ResponseStore:

 _CORS_HEADERS = {
    "Access-Control-Allow-Methods": "GET, POST, DELETE, OPTIONS",
-    "Access-Control-Allow-Headers": "Authorization, Content-Type, Idempotency-Key",
+    "Access-Control-Allow-Headers": "Authorization, Content-Type",
 }


@@ -223,23 +223,6 @@ if AIOHTTP_AVAILABLE:
 else:
    body_limit_middleware = None  # type: ignore[assignment]

-_SECURITY_HEADERS = {
-    "X-Content-Type-Options": "nosniff",
-    "Referrer-Policy": "no-referrer",
-}
-
-
-if AIOHTTP_AVAILABLE:
-    @web.middleware
-    async def security_headers_middleware(request, handler):
-        """Add security headers to all responses (including errors)."""
-        response = await handler(request)
-        for k, v in _SECURITY_HEADERS.items():
-            response.headers.setdefault(k, v)
-        return response
-else:
-    security_headers_middleware = None  # type: ignore[assignment]
-

 class _IdempotencyCache:
    """In-memory idempotency cache with TTL and basic LRU semantics."""
@@ -324,7 +307,6 @@ class APIServerAdapter(BasePlatformAdapter):
        if "*" in self._cors_origins:
            headers = dict(_CORS_HEADERS)
            headers["Access-Control-Allow-Origin"] = "*"
-            headers["Access-Control-Max-Age"] = "600"
            return headers

        if origin not in self._cors_origins:
@@ -333,7 +315,6 @@ class APIServerAdapter(BasePlatformAdapter):
        headers = dict(_CORS_HEADERS)
        headers["Access-Control-Allow-Origin"] = origin
        headers["Vary"] = "Origin"
-        headers["Access-Control-Max-Age"] = "600"
        return headers

    def _origin_allowed(self, origin: str) -> bool:
@@ -385,20 +366,14 @@ class APIServerAdapter(BasePlatformAdapter):
        Create an AIAgent instance using the gateway's runtime config.

        Uses _resolve_runtime_agent_kwargs() to pick up model, api_key,
-        base_url, etc. from config.yaml / env vars.  Toolsets are resolved
-        from config.yaml platform_toolsets.api_server (same as all other
-        gateway platforms), falling back to the hermes-api-server default.
+        base_url, etc. from config.yaml / env vars.
        """
        from run_agent import AIAgent
-        from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config
-        from hermes_cli.tools_config import _get_platform_tools
+        from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model

        runtime_kwargs = _resolve_runtime_agent_kwargs()
        model = _resolve_gateway_model()

-        user_config = _load_gateway_config()
-        enabled_toolsets = sorted(_get_platform_tools(user_config, "api_server"))
-
        max_iterations = int(os.getenv("HERMES_MAX_ITERATIONS", "90"))

        agent = AIAgent(
@@ -408,7 +383,7 @@ class APIServerAdapter(BasePlatformAdapter):
            quiet_mode=True,
            verbose_logging=False,
            ephemeral_system_prompt=ephemeral_system_prompt or None,
-            enabled_toolsets=enabled_toolsets,
+            enabled_toolsets=["hermes-api-server"],
            session_id=session_id,
            platform="api_server",
            stream_delta_callback=stream_delta_callback,
@@ -514,21 +489,17 @@ class APIServerAdapter(BasePlatformAdapter):
                if delta is not None:
                    _stream_q.put(delta)

-            # Start agent in background.  agent_ref is a mutable container
-            # so the SSE writer can interrupt the agent on client disconnect.
-            agent_ref = [None]
+            # Start agent in background
            agent_task = asyncio.ensure_future(self._run_agent(
                user_message=user_message,
                conversation_history=history,
                ephemeral_system_prompt=system_prompt,
                session_id=session_id,
                stream_delta_callback=_on_delta,
-                agent_ref=agent_ref,
            ))

            return await self._write_sse_chat_completion(
-                request, completion_id, model_name, created, _stream_q,
-                agent_task, agent_ref,
+                request, completion_id, model_name, created, _stream_q, agent_task
            )

        # Non-streaming: run the agent (with optional Idempotency-Key)
@@ -591,107 +562,80 @@ class APIServerAdapter(BasePlatformAdapter):

    async def _write_sse_chat_completion(
        self, request: "web.Request", completion_id: str, model: str,
-        created: int, stream_q, agent_task, agent_ref=None,
+        created: int, stream_q, agent_task,
    ) -> "web.StreamResponse":
-        """Write real streaming SSE from agent's stream_delta_callback queue.
-
-        If the client disconnects mid-stream (network drop, browser tab close),
-        the agent is interrupted via ``agent.interrupt()`` so it stops making
-        LLM API calls, and the asyncio task wrapper is cancelled.
-        """
+        """Write real streaming SSE from agent's stream_delta_callback queue."""
        import queue as _q

-        sse_headers = {"Content-Type": "text/event-stream", "Cache-Control": "no-cache"}
-        # CORS middleware can't inject headers into StreamResponse after
-        # prepare() flushes them, so resolve CORS headers up front.
-        origin = request.headers.get("Origin", "")
-        cors = self._cors_headers_for_origin(origin) if origin else None
-        if cors:
-            sse_headers.update(cors)
-        response = web.StreamResponse(status=200, headers=sse_headers)
+        response = web.StreamResponse(
+            status=200,
+            headers={"Content-Type": "text/event-stream", "Cache-Control": "no-cache"},
+        )
        await response.prepare(request)

-        try:
-            # Role chunk
-            role_chunk = {
-                "id": completion_id, "object": "chat.completion.chunk",
-                "created": created, "model": model,
-                "choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": None}],
-            }
-            await response.write(f"data: {json.dumps(role_chunk)}\n\n".encode())
+        # Role chunk
+        role_chunk = {
+            "id": completion_id, "object": "chat.completion.chunk",
+            "created": created, "model": model,
+            "choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": None}],
+        }
+        await response.write(f"data: {json.dumps(role_chunk)}\n\n".encode())

-            # Stream content chunks as they arrive from the agent
-            loop = asyncio.get_event_loop()
-            while True:
-                try:
-                    delta = await loop.run_in_executor(None, lambda: stream_q.get(timeout=0.5))
-                except _q.Empty:
-                    if agent_task.done():
-                        # Drain any remaining items
-                        while True:
-                            try:
-                                delta = stream_q.get_nowait()
-                                if delta is None:
-                                    break
-                                content_chunk = {
-                                    "id": completion_id, "object": "chat.completion.chunk",
-                                    "created": created, "model": model,
-                                    "choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
-                                }
-                                await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
-                            except _q.Empty:
-                                break
-                        break
-                    continue
-
-                if delta is None:  # End of stream sentinel
-                    break
-
-                content_chunk = {
-                    "id": completion_id, "object": "chat.completion.chunk",
-                    "created": created, "model": model,
-                    "choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
-                }
-                await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
-
-            # Get usage from completed agent
-            usage = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
+        # Stream content chunks as they arrive from the agent
+        loop = asyncio.get_event_loop()
+        while True:
            try:
-                result, agent_usage = await agent_task
-                usage = agent_usage or usage
-            except Exception:
-                pass
+                delta = await loop.run_in_executor(None, lambda: stream_q.get(timeout=0.5))
+            except _q.Empty:
+                if agent_task.done():
+                    # Drain any remaining items
+                    while True:
+                        try:
+                            delta = stream_q.get_nowait()
+                            if delta is None:
+                                break
+                            content_chunk = {
+                                "id": completion_id, "object": "chat.completion.chunk",
+                                "created": created, "model": model,
+                                "choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
+                            }
+                            await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
+                        except _q.Empty:
+                            break
+                    break
+                continue

-            # Finish chunk
-            finish_chunk = {
+            if delta is None:  # End of stream sentinel
+                break
+
+            content_chunk = {
                "id": completion_id, "object": "chat.completion.chunk",
                "created": created, "model": model,
-                "choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
-                "usage": {
-                    "prompt_tokens": usage.get("input_tokens", 0),
-                    "completion_tokens": usage.get("output_tokens", 0),
-                    "total_tokens": usage.get("total_tokens", 0),
-                },
+                "choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
            }
-            await response.write(f"data: {json.dumps(finish_chunk)}\n\n".encode())
-            await response.write(b"data: [DONE]\n\n")
-        except (ConnectionResetError, ConnectionAbortedError, BrokenPipeError, OSError):
-            # Client disconnected mid-stream.  Interrupt the agent so it
-            # stops making LLM API calls at the next loop iteration, then
-            # cancel the asyncio task wrapper.
-            agent = agent_ref[0] if agent_ref else None
-            if agent is not None:
-                try:
-                    agent.interrupt("SSE client disconnected")
-                except Exception:
-                    pass
-            if not agent_task.done():
-                agent_task.cancel()
-                try:
-                    await agent_task
-                except (asyncio.CancelledError, Exception):
-                    pass
-            logger.info("SSE client disconnected; interrupted agent task %s", completion_id)
+            await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
+
+        # Get usage from completed agent
+        usage = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
+        try:
+            result, agent_usage = await agent_task
+            usage = agent_usage or usage
+        except Exception:
+            pass
+
+        # Finish chunk
+        finish_chunk = {
+            "id": completion_id, "object": "chat.completion.chunk",
+            "created": created, "model": model,
+            "choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
+            "usage": {
+                "prompt_tokens": usage.get("input_tokens", 0),
+                "completion_tokens": usage.get("output_tokens", 0),
+                "total_tokens": usage.get("total_tokens", 0),
+            },
+        }
+        await response.write(f"data: {json.dumps(finish_chunk)}\n\n".encode())
+        await response.write(b"data: [DONE]\n\n")

        return response
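
For readers following the streaming hunk above, here is a small standalone sketch of the two mechanics it relies on: framing each chunk as a `data: ...` SSE event, and draining a queue until a `None` sentinel. The helper names (`sse_event`, `content_chunk`, `drain`, `parse_sse`) are hypothetical and do not exist in `api_server.py`. The chunk shape is copied from the diff.

```python
import json
import queue


def sse_event(payload: dict) -> bytes:
    """Frame one JSON payload as a server-sent event."""
    return f"data: {json.dumps(payload)}\n\n".encode()


def content_chunk(completion_id: str, model: str, created: int, delta: str) -> dict:
    """Build a chat.completion.chunk carrying one content delta."""
    return {
        "id": completion_id, "object": "chat.completion.chunk",
        "created": created, "model": model,
        "choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
    }


def drain(stream_q: "queue.Queue"):
    """Yield queued deltas until the None sentinel or an empty queue."""
    while True:
        try:
            delta = stream_q.get_nowait()
        except queue.Empty:
            return
        if delta is None:  # end-of-stream sentinel
            return
        yield delta


def parse_sse(raw: bytes) -> list:
    """Client-side view: split on blank lines and decode each data payload."""
    events = []
    for block in raw.decode().split("\n\n"):
        if not block.startswith("data: "):
            continue
        body = block[len("data: "):]
        events.append(body if body == "[DONE]" else json.loads(body))
    return events
```

Because `json.dumps` escapes newlines inside strings, a delta containing `\n` cannot terminate the event early.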

@@ -1194,18 +1138,12 @@ class APIServerAdapter(BasePlatformAdapter):
        ephemeral_system_prompt: Optional[str] = None,
        session_id: Optional[str] = None,
        stream_delta_callback=None,
-        agent_ref: Optional[list] = None,
    ) -> tuple:
        """
        Create an agent and run a conversation in a thread executor.

        Returns ``(result_dict, usage_dict)`` where *usage_dict* contains
        ``input_tokens``, ``output_tokens`` and ``total_tokens``.
-
-        If *agent_ref* is a one-element list, the AIAgent instance is stored
-        at ``agent_ref[0]`` before ``run_conversation`` begins.  This allows
-        callers (e.g. the SSE writer) to call ``agent.interrupt()`` from
-        another thread to stop in-progress LLM calls.
        """
        loop = asyncio.get_event_loop()

@@ -1215,8 +1153,6 @@ class APIServerAdapter(BasePlatformAdapter):
                session_id=session_id,
                stream_delta_callback=stream_delta_callback,
            )
-            if agent_ref is not None:
-                agent_ref[0] = agent
            result = agent.run_conversation(
                user_message=user_message,
                conversation_history=conversation_history,
@@ -1241,11 +1177,10 @@ class APIServerAdapter(BasePlatformAdapter):
            return False

        try:
-            mws = [mw for mw in (cors_middleware, body_limit_middleware, security_headers_middleware) if mw is not None]
+            mws = [mw for mw in (cors_middleware, body_limit_middleware) if mw is not None]
            self._app = web.Application(middlewares=mws)
            self._app["api_server_adapter"] = self
            self._app.router.add_get("/health", self._handle_health)
-            self._app.router.add_get("/v1/health", self._handle_health)
            self._app.router.add_get("/v1/models", self._handle_models)
            self._app.router.add_post("/v1/chat/completions", self._handle_chat_completions)
            self._app.router.add_post("/v1/responses", self._handle_responses)
@@ -1261,17 +1196,6 @@ class APIServerAdapter(BasePlatformAdapter):
            self._app.router.add_post("/api/jobs/{job_id}/resume", self._handle_resume_job)
            self._app.router.add_post("/api/jobs/{job_id}/run", self._handle_run_job)

-            # Port conflict detection — fail fast if port is already in use
-            import socket as _socket
-            try:
-                with _socket.socket(_socket.AF_INET, _socket.SOCK_STREAM) as _s:
-                    _s.settimeout(1)
-                    _s.connect(('127.0.0.1', self._port))
-                logger.error('[%s] Port %d already in use. Set a different port in config.yaml: platforms.api_server.port', self.name, self._port)
-                return False
-            except (ConnectionRefusedError, OSError):
-                pass  # port is free
-
            self._runner = web.AppRunner(self._app)
            await self._runner.setup()
            self._site = web.TCPSite(self._runner, self._host, self._port)
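
The port-conflict probe removed in this hunk can be isolated as a small helper. The sketch below is one way to write it, assuming a hypothetical function name `port_in_use`. It uses `connect_ex`, which returns an error code instead of raising, whereas the removed code used `connect` inside a `try` block.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value on failure.
        return s.connect_ex((host, port)) == 0
```

A probe like this is advisory only: another process can still take the port between the check and the actual bind, so the bind error must be handled either way.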
@@ -8,7 +8,6 @@ and implement the required methods.
 import asyncio
 import logging
 import os
-import random
 import re
 import uuid
 from abc import ABC, abstractmethod
@@ -27,7 +26,6 @@ sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
 from gateway.config import Platform, PlatformConfig
 from gateway.session import SessionSource, build_session_key
 from hermes_cli.config import get_hermes_home
-from hermes_constants import get_hermes_dir


 GATEWAY_SECRET_CAPTURE_UNSUPPORTED_MESSAGE = (
@@ -45,8 +43,8 @@ GATEWAY_SECRET_CAPTURE_UNSUPPORTED_MESSAGE = (
 # (e.g. Telegram file URLs expire after ~1 hour).
 # ---------------------------------------------------------------------------

-# Default location: {HERMES_HOME}/cache/images/ (legacy: image_cache/)
-IMAGE_CACHE_DIR = get_hermes_dir("cache/images", "image_cache")
+# Default location: {HERMES_HOME}/image_cache/
+IMAGE_CACHE_DIR = get_hermes_home() / "image_cache"


 def get_image_cache_dir() -> Path:
@@ -73,51 +71,31 @@ def cache_image_from_bytes(data: bytes, ext: str = ".jpg") -> str:
    return str(filepath)


-async def cache_image_from_url(url: str, ext: str = ".jpg", retries: int = 2) -> str:
+async def cache_image_from_url(url: str, ext: str = ".jpg") -> str:
    """
    Download an image from a URL and save it to the local cache.

-    Retries on transient failures (timeouts, 429, 5xx) with exponential
-    backoff so a single slow CDN response doesn't lose the media.
+    Uses httpx for async download with a reasonable timeout.

    Args:
        url: The HTTP/HTTPS URL to download from.
        ext: File extension including the dot (e.g. ".jpg", ".png").
-        retries: Number of retry attempts on transient failures.

    Returns:
        Absolute path to the cached image file as a string.
    """
-    import asyncio
    import httpx
-    import logging as _logging
-    _log = _logging.getLogger(__name__)

-    last_exc = None
    async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
-        for attempt in range(retries + 1):
-            try:
-                response = await client.get(
-                    url,
-                    headers={
-                        "User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
-                        "Accept": "image/*,*/*;q=0.8",
-                    },
-                )
-                response.raise_for_status()
-                return cache_image_from_bytes(response.content, ext)
-            except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
-                last_exc = exc
-                if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
-                    raise
-                if attempt < retries:
-                    wait = 1.5 * (attempt + 1)
-                    _log.debug("Media cache retry %d/%d for %s (%.1fs): %s",
-                               attempt + 1, retries, url[:80], wait, exc)
-                    await asyncio.sleep(wait)
-                    continue
-                raise
-    raise last_exc
+        response = await client.get(
+            url,
+            headers={
+                "User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
+                "Accept": "image/*,*/*;q=0.8",
+            },
+        )
+        response.raise_for_status()
+        return cache_image_from_bytes(response.content, ext)
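
The retry policy on the removed side of this hunk can be summarised without any network code. The removed loop retried on timeouts and on HTTP statuses of 429 and above, and waited `1.5 * (attempt + 1)` seconds between attempts, so the delay grows linearly even though the docstring calls it exponential. The sketch below models that policy with hypothetical helpers (`should_retry`, `retry_delay`, `simulate`). It treats any status below 400 as success, which is a simplification of `raise_for_status()`.

```python
from typing import List, Optional, Tuple


def should_retry(status_code: Optional[int]) -> bool:
    """None models a timeout; statuses below 429 are treated as permanent."""
    return status_code is None or status_code >= 429


def retry_delay(attempt: int, step: float = 1.5) -> float:
    """Linear backoff matching the removed code: 1.5s, 3.0s, 4.5s, ..."""
    return step * (attempt + 1)


def simulate(outcomes: List[Optional[int]], retries: int = 2) -> Tuple[str, List[float]]:
    """Replay simulated attempt outcomes and report the result and the waits."""
    waits: List[float] = []
    for attempt in range(retries + 1):
        status = outcomes[attempt]
        if status is not None and status < 400:
            return "ok", waits
        if not should_retry(status) or attempt == retries:
            return "raise", waits
        waits.append(retry_delay(attempt))
    return "raise", waits
```

For example, `simulate([503, None, 200])` succeeds on the third attempt after two waits, while `simulate([404])` gives up immediately without waiting.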


 def cleanup_image_cache(max_age_hours: int = 24) -> int:
@@ -148,7 +126,7 @@ def cleanup_image_cache(max_age_hours: int = 24) -> int:
 # here so the STT tool (OpenAI Whisper) can transcribe them from local files.
 # ---------------------------------------------------------------------------

-AUDIO_CACHE_DIR = get_hermes_dir("cache/audio", "audio_cache")
+AUDIO_CACHE_DIR = get_hermes_home() / "audio_cache"


 def get_audio_cache_dir() -> Path:
@@ -175,51 +153,29 @@ def cache_audio_from_bytes(data: bytes, ext: str = ".ogg") -> str:
    return str(filepath)


-async def cache_audio_from_url(url: str, ext: str = ".ogg", retries: int = 2) -> str:
+async def cache_audio_from_url(url: str, ext: str = ".ogg") -> str:
    """
    Download an audio file from a URL and save it to the local cache.

-    Retries on transient failures (timeouts, 429, 5xx) with exponential
-    backoff so a single slow CDN response doesn't lose the media.
-
    Args:
        url: The HTTP/HTTPS URL to download from.
        ext: File extension including the dot (e.g. ".ogg", ".mp3").
-        retries: Number of retry attempts on transient failures.

    Returns:
        Absolute path to the cached audio file as a string.
    """
-    import asyncio
    import httpx
-    import logging as _logging
-    _log = _logging.getLogger(__name__)

-    last_exc = None
    async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
-        for attempt in range(retries + 1):
-            try:
-                response = await client.get(
-                    url,
-                    headers={
-                        "User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
-                        "Accept": "audio/*,*/*;q=0.8",
-                    },
-                )
-                response.raise_for_status()
-                return cache_audio_from_bytes(response.content, ext)
-            except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
-                last_exc = exc
-                if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
-                    raise
-                if attempt < retries:
-                    wait = 1.5 * (attempt + 1)
-                    _log.debug("Audio cache retry %d/%d for %s (%.1fs): %s",
-                               attempt + 1, retries, url[:80], wait, exc)
-                    await asyncio.sleep(wait)
-                    continue
-                raise
-    raise last_exc
+        response = await client.get(
+            url,
+            headers={
+                "User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
+                "Accept": "audio/*,*/*;q=0.8",
+            },
+        )
+        response.raise_for_status()
+        return cache_audio_from_bytes(response.content, ext)


 # ---------------------------------------------------------------------------
@@ -229,7 +185,7 @@ async def cache_audio_from_url(url: str, ext: str = ".ogg", retries: int = 2) ->
 # here so the agent can reference them by local file path.
 # ---------------------------------------------------------------------------

-DOCUMENT_CACHE_DIR = get_hermes_dir("cache/documents", "document_cache")
+DOCUMENT_CACHE_DIR = get_hermes_home() / "document_cache"

 SUPPORTED_DOCUMENT_TYPES = {
    ".pdf": "application/pdf",
@@ -356,10 +312,7 @@ class MessageEvent:
            return None
        # Split on space and get first word, strip the /
        parts = self.text.split(maxsplit=1)
-        raw = parts[0][1:].lower() if parts else None
-        if raw and "@" in raw:
-            raw = raw.split("@", 1)[0]
-        return raw
+        return parts[0][1:].lower() if parts else None
    
    def get_command_args(self) -> str:
        """Get the arguments after a command."""
@@ -376,24 +329,6 @@ class SendResult:
    message_id: Optional[str] = None
    error: Optional[str] = None
    raw_response: Any = None
-    retryable: bool = False  # True for transient errors (network, timeout) — base will retry automatically
-
-
-# Error substrings that indicate a transient network failure worth retrying
-_RETRYABLE_ERROR_PATTERNS = (
-    "connecterror",
-    "connectionerror",
-    "connectionreset",
-    "connectionrefused",
-    "timeout",
-    "timed out",
-    "network",
-    "broken pipe",
-    "remotedisconnected",
-    "eoferror",
-    "readtimeout",
-    "writetimeout",
-)


 # Type for message handlers
@@ -898,91 +833,6 @@ class BasePlatformAdapter(ABC):
                except Exception:
                    pass
    
-    @staticmethod
-    def _is_retryable_error(error: Optional[str]) -> bool:
-        """Return True if the error string looks like a transient network failure."""
-        if not error:
-            return False
-        lowered = error.lower()
-        return any(pat in lowered for pat in _RETRYABLE_ERROR_PATTERNS)
-
-    async def _send_with_retry(
-        self,
-        chat_id: str,
-        content: str,
-        reply_to: Optional[str] = None,
-        metadata: Any = None,
-        max_retries: int = 2,
-        base_delay: float = 2.0,
-    ) -> "SendResult":
-        """
-        Send a message with automatic retry for transient network errors.
-
-        On permanent failures (e.g. formatting / permission errors) falls back
-        to a plain-text version before giving up. If all attempts fail due to
-        network errors, sends the user a brief delivery-failure notice so they
-        know to retry rather than waiting indefinitely.
-        """
-
-        result = await self.send(
-            chat_id=chat_id,
-            content=content,
-            reply_to=reply_to,
-            metadata=metadata,
-        )
-
-        if result.success:
-            return result
-
-        error_str = result.error or ""
-        is_network = result.retryable or self._is_retryable_error(error_str)
-
-        if is_network:
-            # Retry with exponential backoff for transient errors
-            for attempt in range(1, max_retries + 1):
-                delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 1)
-                logger.warning(
-                    "[%s] Send failed (attempt %d/%d, retrying in %.1fs): %s",
-                    self.name, attempt, max_retries, delay, error_str,
-                )
-                await asyncio.sleep(delay)
-                result = await self.send(
-                    chat_id=chat_id,
-                    content=content,
-                    reply_to=reply_to,
-                    metadata=metadata,
-                )
-                if result.success:
-                    logger.info("[%s] Send succeeded on retry %d", self.name, attempt)
-                    return result
-                error_str = result.error or ""
-                if not (result.retryable or self._is_retryable_error(error_str)):
-                    break  # error switched to non-transient — fall through to plain-text fallback
-            else:
-                # All retries exhausted (loop completed without break) — notify user
-                logger.error("[%s] Failed to deliver response after %d retries: %s", self.name, max_retries, error_str)
-                notice = (
-                    "\u26a0\ufe0f Message delivery failed after multiple attempts. "
-                    "Please try again \u2014 your request was processed but the response could not be sent."
-                )
-                try:
-                    await self.send(chat_id=chat_id, content=notice, reply_to=reply_to, metadata=metadata)
-                except Exception as notify_err:
-                    logger.debug("[%s] Could not send delivery-failure notice: %s", self.name, notify_err)
-                return result
-
-        # Non-network / post-retry formatting failure: try plain text as fallback
-        logger.warning("[%s] Send failed: %s — trying plain-text fallback", self.name, error_str)
-        fallback_result = await self.send(
-            chat_id=chat_id,
-            content=f"(Response formatting failed, plain text:)\n\n{content[:3500]}",
-            reply_to=reply_to,
-            metadata=metadata,
-        )
-        if not fallback_result.success:
-            logger.error("[%s] Fallback send also failed: %s", self.name, fallback_result.error)
-        return fallback_result
-
    async def handle_message(self, event: MessageEvent) -> None:
        """
        Process an incoming message.
@@ -1005,7 +855,7 @@ class BasePlatformAdapter(ABC):
            # simultaneous messages. Queue them without interrupting the active run,
            # then process them immediately after the current task finishes.
            if event.message_type == MessageType.PHOTO:
-                logger.debug("[%s] Queuing photo follow-up for session %s without interrupt", self.name, session_key)
+                print(f"[{self.name}] 🖼️ Queuing photo follow-up for session {session_key} without interrupt")
                existing = self._pending_messages.get(session_key)
                if existing and existing.message_type == MessageType.PHOTO:
                    existing.media_urls.extend(event.media_urls)
@@ -1020,7 +870,7 @@ class BasePlatformAdapter(ABC):
                return  # Don't interrupt now - will run after current task completes

            # Default behavior for non-photo follow-ups: interrupt the running agent
-            logger.debug("[%s] New message while session %s is active — triggering interrupt", self.name, session_key)
+            print(f"[{self.name}] ⚡ New message while session {session_key} is active - triggering interrupt")
            self._pending_messages[session_key] = event
            # Signal the interrupt (the processing task checks this)
            self._active_sessions[session_key].set()
@@ -1132,13 +982,26 @@ class BasePlatformAdapter(ABC):
                # Send the text portion
                if text_content:
                    logger.info("[%s] Sending response (%d chars) to %s", self.name, len(text_content), event.source.chat_id)
-                    result = await self._send_with_retry(
+                    result = await self.send(
                        chat_id=event.source.chat_id,
                        content=text_content,
                        reply_to=event.message_id,
                        metadata=_thread_metadata,
                    )

+                    # Log send failures (don't raise - user already saw tool progress)
+                    if not result.success:
+                        print(f"[{self.name}] Failed to send response: {result.error}")
+                        # Try sending without markdown as fallback
+                        fallback_result = await self.send(
+                            chat_id=event.source.chat_id,
+                            content=f"(Response formatting failed, plain text:)\n\n{text_content[:3500]}",
+                            reply_to=event.message_id,
+                            metadata=_thread_metadata,
+                        )
+                        if not fallback_result.success:
+                            print(f"[{self.name}] Fallback send also failed: {fallback_result.error}")
+
                # Human-like pacing delay between text and media
                human_delay = self._get_human_delay()

@@ -1206,9 +1069,9 @@ class BasePlatformAdapter(ABC):
                            )

                        if not media_result.success:
-                            logger.warning("[%s] Failed to send media (%s): %s", self.name, ext, media_result.error)
+                            print(f"[{self.name}] Failed to send media ({ext}): {media_result.error}")
                    except Exception as media_err:
-                        logger.warning("[%s] Error sending media: %s", self.name, media_err)
+                        print(f"[{self.name}] Error sending media: {media_err}")

                # Send auto-detected local files as native attachments
                for file_path in local_files:
@@ -1240,7 +1103,7 @@ class BasePlatformAdapter(ABC):
            # Check if there's a pending message that was queued during our processing
            if session_key in self._pending_messages:
                pending_event = self._pending_messages.pop(session_key)
-                logger.debug("[%s] Processing queued message from interrupt", self.name)
+                print(f"[{self.name}] 📨 Processing queued message from interrupt")
                # Clean up current session before processing pending
                if session_key in self._active_sessions:
                    del self._active_sessions[session_key]
@@ -1254,7 +1117,9 @@ class BasePlatformAdapter(ABC):
                return  # Already cleaned up
                
        except Exception as e:
-            logger.error("[%s] Error handling message: %s", self.name, e, exc_info=True)
+            print(f"[{self.name}] Error handling message: {e}")
+            import traceback
+            traceback.print_exc()
            # Send the error to the user so they aren't left with radio silence
            try:
                error_type = type(e).__name__
@@ -486,17 +486,6 @@ class DiscordAdapter(BasePlatformAdapter):
            return False
        
        try:
-            # Acquire scoped lock to prevent duplicate bot token usage
-            from gateway.status import acquire_scoped_lock
-            self._token_lock_identity = self.config.token
-            acquired, existing = acquire_scoped_lock('discord-bot-token', self._token_lock_identity, metadata={'platform': 'discord'})
-            if not acquired:
-                owner_pid = existing.get('pid') if isinstance(existing, dict) else None
-                message = f'Discord bot token already in use' + (f' (PID {owner_pid})' if owner_pid else '') + '. Stop the other gateway first.'
-                logger.error('[%s] %s', self.name, message)
-                self._set_fatal_error('discord_token_lock', message, retryable=False)
-                return False
-
            # Set up intents -- members intent needed for username-to-ID resolution
            intents = Intents.default()
            intents.message_content = True
@@ -561,22 +550,6 @@ class DiscordAdapter(BasePlatformAdapter):
                            return
                    # "all" falls through to handle_message
                
-                # If the message @mentions other users but NOT the bot, the
-                # sender is talking to someone else — stay silent.  Only
-                # applies in server channels; in DMs the user is always
-                # talking to the bot (mentions are just references).
-                # Controlled by DISCORD_IGNORE_NO_MENTION (default: true).
-                _ignore_no_mention = os.getenv(
-                    "DISCORD_IGNORE_NO_MENTION", "true"
-                ).lower() in ("true", "1", "yes")
-                if _ignore_no_mention and message.mentions and not isinstance(message.channel, discord.DMChannel):
-                    _bot_mentioned = (
-                        self._client.user is not None
-                        and self._client.user in message.mentions
-                    )
-                    if not _bot_mentioned:
-                        return  # Talking to someone else, don't interrupt
-
                await self._handle_message(message)

            @self._client.event
@@ -649,16 +622,6 @@ class DiscordAdapter(BasePlatformAdapter):
        self._running = False
        self._client = None
        self._ready_event.clear()
-
-        # Release the token lock
-        try:
-            from gateway.status import release_scoped_lock
-            if getattr(self, '_token_lock_identity', None):
-                release_scoped_lock('discord-bot-token', self._token_lock_identity)
-                self._token_lock_identity = None
-        except Exception:
-            pass
-
        logger.info("[%s] Disconnected", self.name)
    
    async def send(
@@ -1450,23 +1413,15 @@ class DiscordAdapter(BasePlatformAdapter):
        command_text: str,
        followup_msg: str | None = None,
    ) -> None:
-        """Common handler for simple slash commands that dispatch a command string.
-
-        Defers the interaction (shows "thinking..."), dispatches the command,
-        then cleans up the deferred response.  If *followup_msg* is provided
-        the "thinking..." indicator is replaced with that text; otherwise it
-        is deleted so the channel isn't cluttered.
-        """
+        """Common handler for simple slash commands that dispatch a command string."""
        await interaction.response.defer(ephemeral=True)
        event = self._build_slash_event(interaction, command_text)
        await self.handle_message(event)
-        try:
-            if followup_msg:
-                await interaction.edit_original_response(content=followup_msg)
-            else:
-                await interaction.delete_original_response()
-        except Exception as e:
-            logger.debug("Discord interaction cleanup failed: %s", e)
+        if followup_msg:
+            try:
+                await interaction.followup.send(followup_msg, ephemeral=True)
+            except Exception as e:
+                logger.debug("Discord followup failed: %s", e)

    def _register_slash_commands(self) -> None:
        """Register Discord slash commands on the command tree."""
@@ -1491,7 +1446,9 @@ class DiscordAdapter(BasePlatformAdapter):
        @tree.command(name="reasoning", description="Show or change reasoning effort")
        @discord.app_commands.describe(effort="Reasoning effort: xhigh, high, medium, low, minimal, or none.")
        async def slash_reasoning(interaction: discord.Interaction, effort: str = ""):
-            await self._run_simple_slash(interaction, f"/reasoning {effort}".strip())
+            await interaction.response.defer(ephemeral=True)
+            event = self._build_slash_event(interaction, f"/reasoning {effort}".strip())
+            await self.handle_message(event)

        @tree.command(name="personality", description="Set a personality")
        @discord.app_commands.describe(name="Personality name. Leave empty to list available.")
@@ -1564,7 +1521,9 @@ class DiscordAdapter(BasePlatformAdapter):
            discord.app_commands.Choice(name="status — show current mode", value="status"),
        ])
        async def slash_voice(interaction: discord.Interaction, mode: str = ""):
-            await self._run_simple_slash(interaction, f"/voice {mode}".strip())
+            await interaction.response.defer(ephemeral=True)
+            event = self._build_slash_event(interaction, f"/voice {mode}".strip())
+            await self.handle_message(event)

        @tree.command(name="update", description="Update Hermes Agent to the latest version")
        async def slash_update(interaction: discord.Interaction):
@@ -2137,11 +2096,6 @@ class DiscordAdapter(BasePlatformAdapter):
        if pending_text_injection:
            event_text = f"{pending_text_injection}\n\n{event_text}" if event_text else pending_text_injection

-        # Defense-in-depth: prevent empty user messages from entering session
-        # (can happen when user sends @mention-only with no other text)
-        if not event_text or not event_text.strip():
-            event_text = "(The user sent a message with no text content)"
-
        event = MessageEvent(
            text=event_text,
            message_type=msg_type,
@@ -43,20 +43,6 @@ from gateway.platforms.base import (
 from gateway.config import Platform, PlatformConfig

 logger = logging.getLogger(__name__)
-# Automated sender patterns — emails from these are silently ignored
-_NOREPLY_PATTERNS = (
-    "noreply", "no-reply", "no_reply", "donotreply", "do-not-reply",
-    "mailer-daemon", "postmaster", "bounce", "notifications@",
-    "automated@", "auto-confirm", "auto-reply", "automailer",
-)
-
-# RFC headers that indicate bulk/automated mail
-_AUTOMATED_HEADERS = {
-    "Auto-Submitted": lambda v: v.lower() != "no",
-    "Precedence": lambda v: v.lower() in ("bulk", "list", "junk"),
-    "X-Auto-Response-Suppress": lambda v: bool(v),
-    "List-Unsubscribe": lambda v: bool(v),
-}

 # Gmail-safe max length per email body
 MAX_MESSAGE_LENGTH = 50_000
@@ -64,17 +50,7 @@ MAX_MESSAGE_LENGTH = 50_000
 # Supported image extensions for inline detection
 _IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}

-def _is_automated_sender(address: str, headers: dict) -> bool:
-    """Return True if this email is from an automated/noreply source."""
-    addr = address.lower()
-    if any(pattern in addr for pattern in _NOREPLY_PATTERNS):
-        return True
-    for header, check in _AUTOMATED_HEADERS.items():
-        value = headers.get(header, "")
-        if value and check(value):
-            return True
-    return False
-    
+
 def check_email_requirements() -> bool:
    """Check if email platform dependencies are available."""
    addr = os.getenv("EMAIL_ADDRESS")
@@ -237,7 +213,6 @@ class EmailAdapter(BasePlatformAdapter):

        # Track message IDs we've already processed to avoid duplicates
        self._seen_uids: set = set()
-        self._seen_uids_max: int = 2000   # cap to prevent unbounded memory growth
        self._poll_task: Optional[asyncio.Task] = None

        # Map chat_id (sender email) -> last subject + message-id for threading
@@ -245,26 +220,6 @@ class EmailAdapter(BasePlatformAdapter):

        logger.info("[Email] Adapter initialized for %s", self._address)

-    def _trim_seen_uids(self) -> None:
-        """Keep only the most recent UIDs to prevent unbounded memory growth.
-
-        IMAP UIDs are monotonically increasing integers. When the set grows
-        beyond the cap, we keep only the highest half — old UIDs are safe to
-        drop because new messages always have higher UIDs and IMAP's UNSEEN
-        flag prevents re-delivery regardless.
-        """
-        if len(self._seen_uids) <= self._seen_uids_max:
-            return
-        try:
-            # UIDs are bytes like b'1234' — sort numerically and keep top half
-            sorted_uids = sorted(self._seen_uids, key=lambda u: int(u))
-            keep = self._seen_uids_max // 2
-            self._seen_uids = set(sorted_uids[-keep:])
-            logger.debug("[Email] Trimmed seen UIDs to %d entries", len(self._seen_uids))
-        except (ValueError, TypeError):
-            # Fallback: just clear old entries if sort fails
-            self._seen_uids = set(list(self._seen_uids)[-self._seen_uids_max // 2:])
-
    async def connect(self) -> bool:
        """Connect to the IMAP server and start polling for new messages."""
        try:
@@ -277,8 +232,6 @@ class EmailAdapter(BasePlatformAdapter):
            if status == "OK" and data and data[0]:
                for uid in data[0].split():
                    self._seen_uids.add(uid)
-            # Keep only the most recent UIDs to prevent unbounded growth
-            self._trim_seen_uids()
            imap.logout()
            logger.info("[Email] IMAP connection test passed. %d existing messages skipped.", len(self._seen_uids))
        except Exception as e:
@@ -349,9 +302,6 @@ class EmailAdapter(BasePlatformAdapter):
                if uid in self._seen_uids:
                    continue
                self._seen_uids.add(uid)
-                # Trim periodically to prevent unbounded memory growth
-                if len(self._seen_uids) > self._seen_uids_max:
-                    self._trim_seen_uids()

                status, msg_data = imap.uid("fetch", uid, "(RFC822)")
                if status != "OK":
@@ -370,11 +320,6 @@ class EmailAdapter(BasePlatformAdapter):
                subject = _decode_header_value(msg.get("Subject", "(no subject)"))
                message_id = msg.get("Message-ID", "")
                in_reply_to = msg.get("In-Reply-To", "")
-                # Skip automated/noreply senders before any processing
-                msg_headers = dict(msg.items())
-                if _is_automated_sender(sender_addr, msg_headers):
-                    logger.debug("[Email] Skipping automated sender: %s", sender_addr)
-                    continue
                body = _extract_text_body(msg)
                attachments = _extract_attachments(msg, skip_attachments=self._skip_attachments)

@@ -403,11 +348,6 @@ class EmailAdapter(BasePlatformAdapter):
        if sender_addr == self._address.lower():
            return

-        # Never reply to automated senders
-        if _is_automated_sender(sender_addr, {}):
-            logger.debug("[Email] Dropping automated sender at dispatch: %s", sender_addr)
-            return
-
        subject = msg_data["subject"]
        body = msg_data["body"].strip()
        attachments = msg_data["attachments"]
@@ -40,9 +40,7 @@ logger = logging.getLogger(__name__)
 MAX_MESSAGE_LENGTH = 4000

 # Store directory for E2EE keys and sync state.
-# Uses get_hermes_home() so each profile gets its own Matrix store.
-from hermes_constants import get_hermes_dir as _get_hermes_dir
-_STORE_DIR = _get_hermes_dir("platforms/matrix/store", "matrix/store")
+_STORE_DIR = Path.home() / ".hermes" / "matrix" / "store"

 # Grace period: ignore messages older than this many seconds before startup.
 _STARTUP_GRACE_SECONDS = 5
@@ -163,49 +161,22 @@ class MatrixAdapter(BasePlatformAdapter):
        # Authenticate.
        if self._access_token:
            client.access_token = self._access_token
-
-            # With access-token auth, always resolve whoami so we validate the
-            # token and learn the device_id. The device_id matters for E2EE:
-            # without it, matrix-nio can send plain messages but may fail to
-            # decrypt inbound encrypted events or encrypt outbound room sends.
-            resp = await client.whoami()
-            if isinstance(resp, nio.WhoamiResponse):
-                resolved_user_id = getattr(resp, "user_id", "") or self._user_id
-                resolved_device_id = getattr(resp, "device_id", "")
-                if resolved_user_id:
-                    self._user_id = resolved_user_id
-
-                # restore_login() is the matrix-nio path that binds the access
-                # token to a specific device and loads the crypto store.
-                if resolved_device_id and hasattr(client, "restore_login"):
-                    client.restore_login(
-                        self._user_id or resolved_user_id,
-                        resolved_device_id,
-                        self._access_token,
-                    )
+            # Resolve user_id if not set.
+            if not self._user_id:
+                resp = await client.whoami()
+                if isinstance(resp, nio.WhoamiResponse):
+                    self._user_id = resp.user_id
+                    client.user_id = resp.user_id
+                    logger.info("Matrix: authenticated as %s", self._user_id)
                else:
-                    if self._user_id:
-                        client.user_id = self._user_id
-                    if resolved_device_id:
-                        client.device_id = resolved_device_id
-                    client.access_token = self._access_token
-                    if self._encryption:
-                        logger.warning(
-                            "Matrix: access-token login did not restore E2EE state; "
-                            "encrypted rooms may fail until a device_id is available"
-                        )
-
-                logger.info(
-                    "Matrix: using access token for %s%s",
-                    self._user_id or "(unknown user)",
-                    f" (device {resolved_device_id})" if resolved_device_id else "",
-                )
+                    logger.error(
+                        "Matrix: whoami failed — check MATRIX_ACCESS_TOKEN and MATRIX_HOMESERVER"
+                    )
+                    await client.close()
+                    return False
            else:
-                logger.error(
-                    "Matrix: whoami failed — check MATRIX_ACCESS_TOKEN and MATRIX_HOMESERVER"
-                )
-                await client.close()
-                return False
+                client.user_id = self._user_id
+                logger.info("Matrix: using access token for %s", self._user_id)
        elif self._password and self._user_id:
            resp = await client.login(
                self._password,
@@ -223,18 +194,13 @@ class MatrixAdapter(BasePlatformAdapter):
            return False

        # If E2EE is enabled, load the crypto store.
-        if self._encryption and getattr(client, "olm", None):
+        if self._encryption and hasattr(client, "olm"):
            try:
                if client.should_upload_keys:
                    await client.keys_upload()
                logger.info("Matrix: E2EE crypto initialized")
            except Exception as exc:
                logger.warning("Matrix: crypto init issue: %s", exc)
-        elif self._encryption:
-            logger.warning(
-                "Matrix: E2EE requested but crypto store is not loaded; "
-                "encrypted rooms may fail"
-            )

        # Register event callbacks.
        client.add_event_callback(self._on_room_message, nio.RoomMessageText)
@@ -264,7 +230,6 @@ class MatrixAdapter(BasePlatformAdapter):
            )
            # Build DM room cache from m.direct account data.
            await self._refresh_dm_cache()
-            await self._run_e2ee_maintenance()
        else:
            logger.warning("Matrix: initial sync returned %s", type(resp).__name__)

@@ -336,48 +301,13 @@ class MatrixAdapter(BasePlatformAdapter):
                    relates_to["m.in_reply_to"] = {"event_id": reply_to}
                msg_content["m.relates_to"] = relates_to

-            async def _room_send_once(*, ignore_unverified_devices: bool = False):
-                return await asyncio.wait_for(
-                    self._client.room_send(
-                        chat_id,
-                        "m.room.message",
-                        msg_content,
-                        ignore_unverified_devices=ignore_unverified_devices,
-                    ),
-                    timeout=45,
-                )
-
-            try:
-                resp = await _room_send_once(ignore_unverified_devices=False)
-            except Exception as exc:
-                retryable = isinstance(exc, asyncio.TimeoutError)
-                olm_unverified = getattr(nio, "OlmUnverifiedDeviceError", None)
-                send_retry = getattr(nio, "SendRetryError", None)
-                if isinstance(olm_unverified, type) and isinstance(exc, olm_unverified):
-                    retryable = True
-                if isinstance(send_retry, type) and isinstance(exc, send_retry):
-                    retryable = True
-
-                if not retryable:
-                    logger.error("Matrix: failed to send to %s: %s", chat_id, exc)
-                    return SendResult(success=False, error=str(exc))
-
-                logger.warning(
-                    "Matrix: initial encrypted send to %s failed (%s); "
-                    "retrying after E2EE maintenance with ignored unverified devices",
-                    chat_id,
-                    exc,
-                )
-                await self._run_e2ee_maintenance()
-                try:
-                    resp = await _room_send_once(ignore_unverified_devices=True)
-                except Exception as retry_exc:
-                    logger.error("Matrix: failed to send to %s after retry: %s", chat_id, retry_exc)
-                    return SendResult(success=False, error=str(retry_exc))
-
+            resp = await self._client.room_send(
+                chat_id,
+                "m.room.message",
+                msg_content,
+            )
            if isinstance(resp, nio.RoomSendResponse):
                last_event_id = resp.event_id
-                logger.info("Matrix: sent event %s to %s", last_event_id, chat_id)
            else:
                err = getattr(resp, "message", str(resp))
                logger.error("Matrix: failed to send to %s: %s", chat_id, err)
@@ -621,23 +551,9 @@ class MatrixAdapter(BasePlatformAdapter):

    async def _sync_loop(self) -> None:
        """Continuously sync with the homeserver."""
-        import nio
-
        while not self._closing:
            try:
-                resp = await self._client.sync(timeout=30000)
-                if isinstance(resp, nio.SyncError):
-                    if self._closing:
-                        return
-                    logger.warning(
-                        "Matrix: sync returned %s: %s — retrying in 5s",
-                        type(resp).__name__,
-                        getattr(resp, "message", resp),
-                    )
-                    await asyncio.sleep(5)
-                    continue
-
-                await self._run_e2ee_maintenance()
+                await self._client.sync(timeout=30000)
            except asyncio.CancelledError:
                return
            except Exception as exc:
@@ -646,38 +562,6 @@ class MatrixAdapter(BasePlatformAdapter):
                logger.warning("Matrix: sync error: %s — retrying in 5s", exc)
                await asyncio.sleep(5)

-    async def _run_e2ee_maintenance(self) -> None:
-        """Run matrix-nio E2EE housekeeping between syncs.
-
-        Hermes uses a custom sync loop instead of matrix-nio's sync_forever(),
-        so we need to explicitly drive the key management work that sync_forever()
-        normally handles for encrypted rooms.
-        """
-        client = self._client
-        if not client or not self._encryption or not getattr(client, "olm", None):
-            return
-
-        tasks = [asyncio.create_task(client.send_to_device_messages())]
-
-        if client.should_upload_keys:
-            tasks.append(asyncio.create_task(client.keys_upload()))
-
-        if client.should_query_keys:
-            tasks.append(asyncio.create_task(client.keys_query()))
-
-        if client.should_claim_keys:
-            users = client.get_users_for_key_claiming()
-            if users:
-                tasks.append(asyncio.create_task(client.keys_claim(users)))
-
-        for task in asyncio.as_completed(tasks):
-            try:
-                await task
-            except asyncio.CancelledError:
-                raise
-            except Exception as exc:
-                logger.warning("Matrix: E2EE maintenance task failed: %s", exc)
-
    # ------------------------------------------------------------------
    # Event callbacks
    # ------------------------------------------------------------------
@@ -407,38 +407,18 @@ class MattermostAdapter(BasePlatformAdapter):
        kind: str = "file",
    ) -> SendResult:
        """Download a URL and upload it as a file attachment."""
-        import asyncio
        import aiohttp
-
-        last_exc = None
-        file_data = None
-        ct = "application/octet-stream"
-        fname = url.rsplit("/", 1)[-1].split("?")[0] or f"{kind}.png"
-
-        for attempt in range(3):
-            try:
-                async with self._session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
-                    if resp.status >= 500 or resp.status == 429:
-                        if attempt < 2:
-                            logger.debug("Mattermost download retry %d/2 for %s (status %d)",
-                                         attempt + 1, url[:80], resp.status)
-                            await asyncio.sleep(1.5 * (attempt + 1))
-                            continue
-                    if resp.status >= 400:
-                        return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
-                    file_data = await resp.read()
-                    ct = resp.content_type or "application/octet-stream"
-                    break
-            except (aiohttp.ClientError, asyncio.TimeoutError) as exc:
-                last_exc = exc
-                if attempt < 2:
-                    await asyncio.sleep(1.5 * (attempt + 1))
-                    continue
-                logger.warning("Mattermost: failed to download %s after %d attempts: %s", url, attempt + 1, exc)
-                return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
-
-        if file_data is None:
-            logger.warning("Mattermost: download returned no data for %s", url)
+        try:
+            async with self._session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
+                if resp.status >= 400:
+                    # Fall back to sending the URL as text.
+                    return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
+                file_data = await resp.read()
+                ct = resp.content_type or "application/octet-stream"
+                # Derive filename from URL.
+                fname = url.rsplit("/", 1)[-1].split("?")[0] or f"{kind}.png"
+        except Exception as exc:
+            logger.warning("Mattermost: failed to download %s: %s", url, exc)
            return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)

        file_id = await self._upload_file(chat_id, file_data, fname, ct)
@@ -603,19 +583,9 @@ class MattermostAdapter(BasePlatformAdapter):
        # For DMs, user_id is sufficient.  For channels, check for @mention.
        message_text = post.get("message", "")

-        # Mention-gating for non-DM channels.
-        # Config (env vars):
-        #   MATTERMOST_REQUIRE_MENTION: Require @mention in channels (default: true)
-        #   MATTERMOST_FREE_RESPONSE_CHANNELS: Channel IDs where bot responds without mention
+        # Mention-only mode: skip channel messages that don't @mention the bot.
+        # DMs (type "D") are always processed.
        if channel_type_raw != "D":
-            require_mention = os.getenv(
-                "MATTERMOST_REQUIRE_MENTION", "true"
-            ).lower() not in ("false", "0", "no")
-
-            free_channels_raw = os.getenv("MATTERMOST_FREE_RESPONSE_CHANNELS", "")
-            free_channels = {ch.strip() for ch in free_channels_raw.split(",") if ch.strip()}
-            is_free_channel = channel_id in free_channels
-
            mention_patterns = [
                f"@{self._bot_username}",
                f"@{self._bot_user_id}",
@@ -624,21 +594,13 @@ class MattermostAdapter(BasePlatformAdapter):
                pattern.lower() in message_text.lower()
                for pattern in mention_patterns
            )
-
-            if require_mention and not is_free_channel and not has_mention:
+            if not has_mention:
                logger.debug(
                    "Mattermost: skipping non-DM message without @mention (channel=%s)",
                    channel_id,
                )
                return

-            # Strip @mention from the message text so the agent sees clean input.
-            if has_mention:
-                for pattern in mention_patterns:
-                    message_text = re.sub(
-                        re.escape(pattern), "", message_text, flags=re.IGNORECASE
-                    ).strip()
-
        # Resolve sender info.
        sender_id = post.get("user_id", "")
        sender_name = data.get("sender_name", "").lstrip("@") or sender_id
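Review note: after this hunk, every non-DM post must contain an @mention, and the mention is no longer stripped from the text handed to the agent. The gating rule can be sketched as below. The function name `should_process` and its parameters are hypothetical, and only the two mention patterns visible in the hunk are modeled; the adapter may define more.

```python
def should_process(channel_type: str, message_text: str,
                   bot_username: str, bot_user_id: str) -> bool:
    """Return True when a post should be forwarded to the agent.

    Hypothetical helper mirroring the mention-only rule in the hunk above.
    """
    # Direct messages (channel type "D") are always processed.
    if channel_type == "D":
        return True
    # Other channel types require a case-insensitive @mention of the bot.
    patterns = [f"@{bot_username}", f"@{bot_user_id}"]
    lowered = message_text.lower()
    return any(pattern.lower() in lowered for pattern in patterns)
```

Because the match is a plain substring test, a mention embedded in a longer word also passes the gate.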
@@ -22,7 +22,7 @@ import time
 from datetime import datetime, timezone
 from pathlib import Path
 from typing import Dict, List, Optional, Any
-from urllib.parse import quote, unquote
+from urllib.parse import unquote

 import httpx

@@ -184,8 +184,6 @@ class SignalAdapter(BasePlatformAdapter):
        self._recent_sent_timestamps: set = set()
        self._max_recent_timestamps = 50

-        self._phone_lock_identity: Optional[str] = None
-
        logger.info("Signal adapter initialized: url=%s account=%s groups=%s",
                     self.http_url, _redact_phone(self.account),
                     "enabled" if self.group_allow_from else "disabled")
@@ -200,29 +198,6 @@ class SignalAdapter(BasePlatformAdapter):
            logger.error("Signal: SIGNAL_HTTP_URL and SIGNAL_ACCOUNT are required")
            return False

-        # Acquire scoped lock to prevent duplicate Signal listeners for the same phone
-        try:
-            from gateway.status import acquire_scoped_lock
-
-            self._phone_lock_identity = self.account
-            acquired, existing = acquire_scoped_lock(
-                "signal-phone",
-                self._phone_lock_identity,
-                metadata={"platform": self.platform.value},
-            )
-            if not acquired:
-                owner_pid = existing.get("pid") if isinstance(existing, dict) else None
-                message = (
-                    "Another local Hermes gateway is already using this Signal account"
-                    + (f" (PID {owner_pid})." if owner_pid else ".")
-                    + " Stop the other gateway before starting a second Signal listener."
-                )
-                logger.error("Signal: %s", message)
-                self._set_fatal_error("signal_phone_lock", message, retryable=False)
-                return False
-        except Exception as e:
-            logger.warning("Signal: Could not acquire phone lock (non-fatal): %s", e)
-
        self.client = httpx.AsyncClient(timeout=30.0)

        # Health check — verify signal-cli daemon is reachable
@@ -270,14 +245,6 @@ class SignalAdapter(BasePlatformAdapter):
            await self.client.aclose()
            self.client = None

-        if self._phone_lock_identity:
-            try:
-                from gateway.status import release_scoped_lock
-                release_scoped_lock("signal-phone", self._phone_lock_identity)
-            except Exception as e:
-                logger.warning("Signal: Error releasing phone lock: %s", e, exc_info=True)
-            self._phone_lock_identity = None
-
        logger.info("Signal: disconnected")

    # ------------------------------------------------------------------
@@ -286,7 +253,7 @@ class SignalAdapter(BasePlatformAdapter):

    async def _sse_listener(self) -> None:
        """Listen for SSE events from signal-cli daemon."""
-        url = f"{self.http_url}/api/v1/events?account={quote(self.account, safe='')}"
+        url = f"{self.http_url}/api/v1/events?account={self.account}"
        backoff = SSE_RETRY_DELAY_INITIAL

        while self._running:
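Review note: Signal account identifiers are typically phone numbers with a leading `+`, and under form-style query decoding a bare `+` reads as a space. The sketch below uses only the Python standard library to show the difference between the two URL forms in this hunk. The host, port, and phone number are made up, and whether the signal-cli daemon decodes query strings this way is an assumption, not something the diff confirms.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Hypothetical values for illustration only.
http_url = "http://localhost:8080"
account = "+15551234567"

# Form introduced by this hunk: account interpolated as-is.
raw_url = f"{http_url}/api/v1/events?account={account}"
# Form removed by this hunk: account percent-encoded first.
quoted_url = f"{http_url}/api/v1/events?account={quote(account, safe='')}"


def decoded_account(url: str) -> str:
    """Decode the account parameter the way a form-style parser would."""
    return parse_qs(urlsplit(url).query)["account"][0]


raw_decoded = decoded_account(raw_url)        # leading "+" becomes a space
quoted_decoded = decoded_account(quoted_url)  # round-trips to the original
```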
@@ -312,12 +279,6 @@ class SignalAdapter(BasePlatformAdapter):
                            line = line.strip()
                            if not line:
                                continue
-                            # SSE keepalive comments (":") prove the connection
-                            # is alive — update activity so the health monitor
-                            # doesn't report false idle warnings.
-                            if line.startswith(":"):
-                                self._last_sse_activity = time.time()
-                                continue
                            # Parse SSE data lines
                            if line.startswith("data:"):
                                data_str = line[5:].strip()
@@ -554,7 +515,7 @@ class SignalAdapter(BasePlatformAdapter):
        """Fetch an attachment via JSON-RPC and cache it. Returns (path, ext)."""
        result = await self._rpc("getAttachment", {
            "account": self.account,
-            "id": attachment_id,
+            "attachmentId": attachment_id,
        })

        if not result:
@@ -93,17 +93,6 @@ class SlackAdapter(BasePlatformAdapter):
            return False

        try:
-            # Acquire scoped lock to prevent duplicate app token usage
-            from gateway.status import acquire_scoped_lock
-            self._token_lock_identity = app_token
-            acquired, existing = acquire_scoped_lock('slack-app-token', app_token, metadata={'platform': 'slack'})
-            if not acquired:
-                owner_pid = existing.get('pid') if isinstance(existing, dict) else None
-                message = f'Slack app token already in use' + (f' (PID {owner_pid})' if owner_pid else '') + '. Stop the other gateway first.'
-                logger.error('[%s] %s', self.name, message)
-                self._set_fatal_error('slack_token_lock', message, retryable=False)
-                return False
-
            self._app = AsyncApp(token=bot_token)

            # Get our own bot user ID for mention detection
@@ -149,16 +138,6 @@ class SlackAdapter(BasePlatformAdapter):
            except Exception as e:  # pragma: no cover - defensive logging
                logger.warning("[Slack] Error while closing Socket Mode handler: %s", e, exc_info=True)
        self._running = False
-
-        # Release the token lock (use stored identity, not re-read env)
-        try:
-            from gateway.status import release_scoped_lock
-            if getattr(self, '_token_lock_identity', None):
-                release_scoped_lock('slack-app-token', self._token_lock_identity)
-                self._token_lock_identity = None
-        except Exception:
-            pass
-
        logger.info("[Slack] Disconnected")

    async def send(
@@ -840,65 +819,33 @@ class SlackAdapter(BasePlatformAdapter):
        await self.handle_message(event)

    async def _download_slack_file(self, url: str, ext: str, audio: bool = False) -> str:
-        """Download a Slack file using the bot token for auth, with retry."""
-        import asyncio
+        """Download a Slack file using the bot token for auth."""
        import httpx

        bot_token = self.config.token
-        last_exc = None
-
        async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
-            for attempt in range(3):
-                try:
-                    response = await client.get(
-                        url,
-                        headers={"Authorization": f"Bearer {bot_token}"},
-                    )
-                    response.raise_for_status()
+            response = await client.get(
+                url,
+                headers={"Authorization": f"Bearer {bot_token}"},
+            )
+            response.raise_for_status()

-                    if audio:
-                        from gateway.platforms.base import cache_audio_from_bytes
-                        return cache_audio_from_bytes(response.content, ext)
-                    else:
-                        from gateway.platforms.base import cache_image_from_bytes
-                        return cache_image_from_bytes(response.content, ext)
-                except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
-                    last_exc = exc
-                    if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
-                        raise
-                    if attempt < 2:
-                        logger.debug("Slack file download retry %d/2 for %s: %s",
-                                     attempt + 1, url[:80], exc)
-                        await asyncio.sleep(1.5 * (attempt + 1))
-                        continue
-                    raise
-        raise last_exc
+        if audio:
+            from gateway.platforms.base import cache_audio_from_bytes
+            return cache_audio_from_bytes(response.content, ext)
+        else:
+            from gateway.platforms.base import cache_image_from_bytes
+            return cache_image_from_bytes(response.content, ext)

    async def _download_slack_file_bytes(self, url: str) -> bytes:
-        """Download a Slack file and return raw bytes, with retry."""
-        import asyncio
+        """Download a Slack file and return raw bytes."""
        import httpx

        bot_token = self.config.token
-        last_exc = None
-
        async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
-            for attempt in range(3):
-                try:
-                    response = await client.get(
-                        url,
-                        headers={"Authorization": f"Bearer {bot_token}"},
-                    )
-                    response.raise_for_status()
-                    return response.content
-                except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
-                    last_exc = exc
-                    if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
-                        raise
-                    if attempt < 2:
-                        logger.debug("Slack file download retry %d/2 for %s: %s",
-                                     attempt + 1, url[:80], exc)
-                        await asyncio.sleep(1.5 * (attempt + 1))
-                        continue
-                    raise
-        raise last_exc
+            response = await client.get(
+                url,
+                headers={"Authorization": f"Bearer {bot_token}"},
+            )
+            response.raise_for_status()
+        return response.content
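Review note: with this hunk both Slack download helpers make a single request, so a timeout or a 429/5xx response now propagates to the caller on the first failure. For reference, the removed policy (three attempts, a linear 1.5 s backoff) can be sketched independently of httpx as follows. `TransientError` and `with_retry` are hypothetical names, and the injectable `sleep` parameter exists only to keep the sketch testable.

```python
import asyncio


class TransientError(Exception):
    """Hypothetical stand-in for a timeout or a retryable HTTP status."""


async def with_retry(fetch, attempts: int = 3, base_delay: float = 1.5,
                     sleep=asyncio.sleep):
    """Await fetch() up to `attempts` times with linear backoff."""
    for attempt in range(attempts):
        try:
            return await fetch()
        except TransientError:
            # Out of attempts: surface the last error to the caller.
            if attempt == attempts - 1:
                raise
            # Wait 1.5 s, then 3.0 s, matching 1.5 * (attempt + 1).
            await sleep(base_delay * (attempt + 1))
```

Errors that are not `TransientError` pass straight through, which corresponds to the removed code re-raising HTTP statuses below 429 without retrying.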
@@ -11,7 +11,7 @@ import asyncio
 import logging
 import os
 import re
-from typing import Dict, List, Optional, Any
+from typing import Dict, Optional, Any

 logger = logging.getLogger(__name__)

@@ -25,7 +25,6 @@ try:
        filters,
    )
    from telegram.constants import ParseMode, ChatType
-    from telegram.request import HTTPXRequest
    TELEGRAM_AVAILABLE = True
 except ImportError:
    TELEGRAM_AVAILABLE = False
@@ -35,7 +34,6 @@ except ImportError:
    Application = Any
    CommandHandler = Any
    TelegramMessageHandler = Any
-    HTTPXRequest = Any
    filters = None
    ParseMode = None
    ChatType = None
@@ -61,11 +59,6 @@ from gateway.platforms.base import (
    cache_document_from_bytes,
    SUPPORTED_DOCUMENT_TYPES,
 )
-from gateway.platforms.telegram_network import (
-    TelegramFallbackTransport,
-    discover_fallback_ips,
-    parse_fallback_ip_env,
-)


 def check_telegram_requirements() -> bool:
@@ -145,13 +138,6 @@ class TelegramAdapter(BasePlatformAdapter):
        # DM Topics config from extra.dm_topics
        self._dm_topics_config: List[Dict[str, Any]] = self.config.extra.get("dm_topics", [])

-    def _fallback_ips(self) -> list[str]:
-        """Return validated fallback IPs from config (populated by _apply_env_overrides)."""
-        configured = self.config.extra.get("fallback_ips", []) if getattr(self.config, "extra", None) else []
-        if isinstance(configured, str):
-            configured = configured.split(",")
-        return parse_fallback_ip_env(",".join(str(v) for v in configured) if configured else None)
-
    @staticmethod
    def _looks_like_polling_conflict(error: Exception) -> bool:
        text = str(error).lower()
@@ -345,8 +331,7 @@ class TelegramAdapter(BasePlatformAdapter):
    def _persist_dm_topic_thread_id(self, chat_id: int, topic_name: str, thread_id: int) -> None:
        """Save a newly created thread_id back into config.yaml so it persists across restarts."""
        try:
-            from hermes_constants import get_hermes_home
-            config_path = get_hermes_home() / "config.yaml"
+            config_path = _Path.home() / ".hermes" / "config.yaml"
            if not config_path.exists():
                logger.warning("[%s] Config file not found at %s, cannot persist thread_id", self.name, config_path)
                return
@@ -489,26 +474,7 @@ class TelegramAdapter(BasePlatformAdapter):
                return False

            # Build the application
-            builder = Application.builder().token(self.config.token)
-            fallback_ips = self._fallback_ips()
-            if not fallback_ips:
-                fallback_ips = await discover_fallback_ips()
-                logger.info(
-                    "[%s] Auto-discovered Telegram fallback IPs: %s",
-                    self.name,
-                    ", ".join(fallback_ips),
-                )
-            if fallback_ips:
-                logger.warning(
-                    "[%s] Telegram fallback IPs active: %s",
-                    self.name,
-                    ", ".join(fallback_ips),
-                )
-                transport = TelegramFallbackTransport(fallback_ips)
-                request = HTTPXRequest(httpx_kwargs={"transport": transport})
-                get_updates_request = HTTPXRequest(httpx_kwargs={"transport": transport})
-                builder = builder.request(request).get_updates_request(get_updates_request)
-            self._app = builder.build()
+            self._app = Application.builder().token(self.config.token).build()
            self._bot = self._app.bot
            
            # Register handlers
@@ -708,15 +674,9 @@ class TelegramAdapter(BasePlatformAdapter):
            except ImportError:
                _NetErr = OSError  # type: ignore[misc,assignment]

-            try:
-                from telegram.error import BadRequest as _BadReq
-            except ImportError:
-                _BadReq = None  # type: ignore[assignment,misc]
-
            for i, chunk in enumerate(chunks):
                should_thread = self._should_thread_reply(reply_to, i)
                reply_to_id = int(reply_to) if should_thread else None
-                effective_thread_id = int(thread_id) if thread_id else None

                msg = None
                for _send_attempt in range(3):
@@ -728,7 +688,7 @@ class TelegramAdapter(BasePlatformAdapter):
                                text=chunk,
                                parse_mode=ParseMode.MARKDOWN_V2,
                                reply_to_message_id=reply_to_id,
-                                message_thread_id=effective_thread_id,
+                                message_thread_id=int(thread_id) if thread_id else None,
                            )
                        except Exception as md_error:
                            # Markdown parsing failed, try plain text
@@ -740,30 +700,12 @@ class TelegramAdapter(BasePlatformAdapter):
                                    text=plain_chunk,
                                    parse_mode=None,
                                    reply_to_message_id=reply_to_id,
-                                    message_thread_id=effective_thread_id,
+                                    message_thread_id=int(thread_id) if thread_id else None,
                                )
                            else:
                                raise
                        break  # success
                    except _NetErr as send_err:
-                        # BadRequest is a subclass of NetworkError in
-                        # python-telegram-bot but represents permanent errors
-                        # (not transient network issues). Detect and handle
-                        # specific cases instead of blindly retrying.
-                        if _BadReq and isinstance(send_err, _BadReq):
-                            err_lower = str(send_err).lower()
-                            if "thread not found" in err_lower and effective_thread_id is not None:
-                                # Thread doesn't exist — retry without
-                                # message_thread_id so the message still
-                                # reaches the chat.
-                                logger.warning(
-                                    "[%s] Thread %s not found, retrying without message_thread_id",
-                                    self.name, effective_thread_id,
-                                )
-                                effective_thread_id = None
-                                continue
-                            # Other BadRequest errors are permanent — don't retry
-                            raise
                        if _send_attempt < 2:
                            wait = 2 ** _send_attempt
                            logger.warning("[%s] Network error on send (attempt %d/3), retrying in %ds: %s",
@@ -1758,8 +1700,7 @@ class TelegramAdapter(BasePlatformAdapter):
        recognized without a gateway restart.
        """
        try:
-            from hermes_constants import get_hermes_home
-            config_path = get_hermes_home() / "config.yaml"
+            config_path = _Path.home() / ".hermes" / "config.yaml"
            if not config_path.exists():
                return

@@ -1,245 +0,0 @@
-"""Telegram-specific network helpers.
-
-Provides a hostname-preserving fallback transport for networks where
-api.telegram.org resolves to an endpoint that is unreachable from the current
-host. The transport keeps the logical request host and TLS SNI as
-api.telegram.org while retrying the TCP connection against one or more fallback
-IPv4 addresses.
-"""
-
-from __future__ import annotations
-
-import asyncio
-import ipaddress
-import logging
-import os
-import socket
-from typing import Iterable, Optional
-
-import httpx
-
-logger = logging.getLogger(__name__)
-
-_TELEGRAM_API_HOST = "api.telegram.org"
-
-# DNS-over-HTTPS providers used to discover Telegram API IPs that may differ
-# from the (potentially unreachable) IP returned by the local system resolver.
-_DOH_TIMEOUT = 4.0  # seconds — bounded so connect() isn't noticeably delayed
-
-_DOH_PROVIDERS: list[dict] = [
-    {
-        "url": "https://dns.google/resolve",
-        "params": {"name": _TELEGRAM_API_HOST, "type": "A"},
-        "headers": {},
-    },
-    {
-        "url": "https://cloudflare-dns.com/dns-query",
-        "params": {"name": _TELEGRAM_API_HOST, "type": "A"},
-        "headers": {"Accept": "application/dns-json"},
-    },
-]
-
-# Last-resort IPs when DoH is also blocked.  These are stable Telegram Bot API
-# endpoints in the 149.154.160.0/20 block (same seed used by OpenClaw).
-_SEED_FALLBACK_IPS: list[str] = ["149.154.167.220"]
-
-
-def _resolve_proxy_url() -> str | None:
-    for key in ("HTTPS_PROXY", "HTTP_PROXY", "ALL_PROXY", "https_proxy", "http_proxy", "all_proxy"):
-        value = (os.environ.get(key) or "").strip()
-        if value:
-            return value
-    return None
-
-
-class TelegramFallbackTransport(httpx.AsyncBaseTransport):
-    """Retry Telegram Bot API requests via fallback IPs while preserving TLS/SNI.
-
-    Requests continue to target https://api.telegram.org/... logically, but on
-    connect failures the underlying TCP connection is retried against a known
-    reachable IP. This is effectively the programmatic equivalent of
-    ``curl --resolve api.telegram.org:443:<ip>``.
-    """
-
-    def __init__(self, fallback_ips: Iterable[str], **transport_kwargs):
-        self._fallback_ips = [ip for ip in dict.fromkeys(_normalize_fallback_ips(fallback_ips))]
-        proxy_url = _resolve_proxy_url()
-        if proxy_url and "proxy" not in transport_kwargs:
-            transport_kwargs["proxy"] = proxy_url
-        self._primary = httpx.AsyncHTTPTransport(**transport_kwargs)
-        self._fallbacks = {
-            ip: httpx.AsyncHTTPTransport(**transport_kwargs) for ip in self._fallback_ips
-        }
-        self._sticky_ip: Optional[str] = None
-        self._sticky_lock = asyncio.Lock()
-
-    async def handle_async_request(self, request: httpx.Request) -> httpx.Response:
-        if request.url.host != _TELEGRAM_API_HOST or not self._fallback_ips:
-            return await self._primary.handle_async_request(request)
-
-        sticky_ip = self._sticky_ip
-        attempt_order: list[Optional[str]] = [sticky_ip] if sticky_ip else [None]
-        for ip in self._fallback_ips:
-            if ip != sticky_ip:
-                attempt_order.append(ip)
-
-        last_error: Exception | None = None
-        for ip in attempt_order:
-            candidate = request if ip is None else _rewrite_request_for_ip(request, ip)
-            transport = self._primary if ip is None else self._fallbacks[ip]
-            try:
-                response = await transport.handle_async_request(candidate)
-                if ip is not None and self._sticky_ip != ip:
-                    async with self._sticky_lock:
-                        if self._sticky_ip != ip:
-                            self._sticky_ip = ip
-                            logger.warning(
-                                "[Telegram] Primary api.telegram.org path unreachable; using sticky fallback IP %s",
-                                ip,
-                            )
-                return response
-            except Exception as exc:
-                last_error = exc
-                if not _is_retryable_connect_error(exc):
-                    raise
-                if ip is None:
-                    logger.warning(
-                        "[Telegram] Primary api.telegram.org connection failed (%s); trying fallback IPs %s",
-                        exc,
-                        ", ".join(self._fallback_ips),
-                    )
-                    continue
-                logger.warning("[Telegram] Fallback IP %s failed: %s", ip, exc)
-                continue
-
-        assert last_error is not None
-        raise last_error
-
-    async def aclose(self) -> None:
-        await self._primary.aclose()
-        for transport in self._fallbacks.values():
-            await transport.aclose()
-
-
-def _normalize_fallback_ips(values: Iterable[str]) -> list[str]:
-    normalized: list[str] = []
-    for value in values:
-        raw = str(value).strip()
-        if not raw:
-            continue
-        try:
-            addr = ipaddress.ip_address(raw)
-        except ValueError:
-            logger.warning("Ignoring invalid Telegram fallback IP: %r", raw)
-            continue
-        if addr.version != 4:
-            logger.warning("Ignoring non-IPv4 Telegram fallback IP: %s", raw)
-            continue
-        normalized.append(str(addr))
-    return normalized
-
-
-def parse_fallback_ip_env(value: str | None) -> list[str]:
-    if not value:
-        return []
-    parts = [part.strip() for part in value.split(",")]
-    return _normalize_fallback_ips(parts)
-
-
-def _resolve_system_dns() -> set[str]:
-    """Return the IPv4 addresses that the OS resolver gives for api.telegram.org."""
-    try:
-        results = socket.getaddrinfo(_TELEGRAM_API_HOST, 443, socket.AF_INET)
-        return {addr[4][0] for addr in results}
-    except Exception:
-        return set()
-
-
-async def _query_doh_provider(
-    client: httpx.AsyncClient, provider: dict
-) -> list[str]:
-    """Query one DoH provider and return A-record IPs."""
-    try:
-        resp = await client.get(
-            provider["url"], params=provider["params"], headers=provider["headers"]
-        )
-        resp.raise_for_status()
-        data = resp.json()
-        ips: list[str] = []
-        for answer in data.get("Answer", []):
-            if answer.get("type") != 1:  # A record
-                continue
-            raw = answer.get("data", "").strip()
-            try:
-                ipaddress.ip_address(raw)
-                ips.append(raw)
-            except ValueError:
-                continue
-        return ips
-    except Exception as exc:
-        logger.debug("DoH query to %s failed: %s", provider["url"], exc)
-        return []
-
-
-async def discover_fallback_ips() -> list[str]:
-    """Auto-discover Telegram API IPs via DNS-over-HTTPS.
-
-    Resolves api.telegram.org through Google and Cloudflare DoH, collects all
-    unique IPs, and excludes the system-DNS-resolved IP (which is presumably
-    unreachable on this network).  Falls back to a hardcoded seed list when DoH
-    is also unavailable.
-    """
-    async with httpx.AsyncClient(timeout=httpx.Timeout(_DOH_TIMEOUT)) as client:
-        doh_tasks = [_query_doh_provider(client, p) for p in _DOH_PROVIDERS]
-        system_dns_task = asyncio.to_thread(_resolve_system_dns)
-        results = await asyncio.gather(system_dns_task, *doh_tasks, return_exceptions=True)
-
-    # results[0] = system DNS IPs (set), results[1:] = DoH IP lists
-    system_ips: set[str] = results[0] if isinstance(results[0], set) else set()
-
-    doh_ips: list[str] = []
-    for r in results[1:]:
-        if isinstance(r, list):
-            doh_ips.extend(r)
-
-    # Deduplicate preserving order, exclude system-DNS IPs
-    seen: set[str] = set()
-    candidates: list[str] = []
-    for ip in doh_ips:
-        if ip not in seen and ip not in system_ips:
-            seen.add(ip)
-            candidates.append(ip)
-
-    # Validate through existing normalization
-    validated = _normalize_fallback_ips(candidates)
-
-    if validated:
-        logger.debug("Discovered Telegram fallback IPs via DoH: %s", ", ".join(validated))
-        return validated
-
-    logger.info(
-        "DoH discovery yielded no new IPs (system DNS: %s); using seed fallback IPs %s",
-        ", ".join(system_ips) or "unknown",
-        ", ".join(_SEED_FALLBACK_IPS),
-    )
-    return list(_SEED_FALLBACK_IPS)
-
-
-def _rewrite_request_for_ip(request: httpx.Request, ip: str) -> httpx.Request:
-    original_host = request.url.host or _TELEGRAM_API_HOST
-    url = request.url.copy_with(host=ip)
-    headers = request.headers.copy()
-    headers["host"] = original_host
-    extensions = dict(request.extensions)
-    extensions["sni_hostname"] = original_host
-    return httpx.Request(
-        method=request.method,
-        url=url,
-        headers=headers,
-        stream=request.stream,
-        extensions=extensions,
-    )
-
-
-def _is_retryable_connect_error(exc: Exception) -> bool:
-    return isinstance(exc, (httpx.ConnectTimeout, httpx.ConnectError))
@@ -27,7 +27,6 @@ import hashlib
 import hmac
 import json
 import logging
-import os
 import re
 import subprocess
 import time
@@ -54,7 +53,6 @@ logger = logging.getLogger(__name__)
 DEFAULT_HOST = "0.0.0.0"
 DEFAULT_PORT = 8644
 _INSECURE_NO_AUTH = "INSECURE_NO_AUTH"
-_DYNAMIC_ROUTES_FILENAME = "webhook_subscriptions.json"


 def check_webhook_requirements() -> bool:
@@ -70,10 +68,7 @@ class WebhookAdapter(BasePlatformAdapter):
        self._host: str = config.extra.get("host", DEFAULT_HOST)
        self._port: int = int(config.extra.get("port", DEFAULT_PORT))
        self._global_secret: str = config.extra.get("secret", "")
-        self._static_routes: Dict[str, dict] = config.extra.get("routes", {})
-        self._dynamic_routes: Dict[str, dict] = {}
-        self._dynamic_routes_mtime: float = 0.0
-        self._routes: Dict[str, dict] = dict(self._static_routes)
+        self._routes: Dict[str, dict] = config.extra.get("routes", {})
        self._runner = None

        # Delivery info keyed by session chat_id — consumed by send()
@@ -101,9 +96,6 @@ class WebhookAdapter(BasePlatformAdapter):
    # ------------------------------------------------------------------

    async def connect(self) -> bool:
-        # Load agent-created subscriptions before validating
-        self._reload_dynamic_routes()
-
        # Validate routes at startup — secret is required per route
        for name, route in self._routes.items():
            secret = route.get("secret", self._global_secret)
@@ -118,17 +110,6 @@ class WebhookAdapter(BasePlatformAdapter):
        app.router.add_get("/health", self._handle_health)
        app.router.add_post("/webhooks/{route_name}", self._handle_webhook)

-        # Port conflict detection — fail fast if port is already in use
-        import socket as _socket
-        try:
-            with _socket.socket(_socket.AF_INET, _socket.SOCK_STREAM) as _s:
-                _s.settimeout(1)
-                _s.connect(('127.0.0.1', self._port))
-            logger.error('[webhook] Port %d already in use. Set a different port in config.yaml: platforms.webhook.port', self._port)
-            return False
-        except (ConnectionRefusedError, OSError):
-            pass  # port is free
-
        self._runner = web.AppRunner(app)
        await self._runner.setup()
        site = web.TCPSite(self._runner, self._host, self._port)
@@ -201,46 +182,8 @@ class WebhookAdapter(BasePlatformAdapter):
        """GET /health — simple health check."""
        return web.json_response({"status": "ok", "platform": "webhook"})

-    def _reload_dynamic_routes(self) -> None:
-        """Reload agent-created subscriptions from disk if the file changed."""
-        from pathlib import Path as _Path
-        hermes_home = _Path(
-            os.getenv("HERMES_HOME", str(_Path.home() / ".hermes"))
-        ).expanduser()
-        subs_path = hermes_home / _DYNAMIC_ROUTES_FILENAME
-        if not subs_path.exists():
-            if self._dynamic_routes:
-                self._dynamic_routes = {}
-                self._routes = dict(self._static_routes)
-                logger.debug("[webhook] Dynamic subscriptions file removed, cleared dynamic routes")
-            return
-        try:
-            mtime = subs_path.stat().st_mtime
-            if mtime <= self._dynamic_routes_mtime:
-                return  # No change
-            data = json.loads(subs_path.read_text(encoding="utf-8"))
-            if not isinstance(data, dict):
-                return
-            # Merge: static routes take precedence over dynamic ones
-            self._dynamic_routes = {
-                k: v for k, v in data.items()
-                if k not in self._static_routes
-            }
-            self._routes = {**self._dynamic_routes, **self._static_routes}
-            self._dynamic_routes_mtime = mtime
-            logger.info(
-                "[webhook] Reloaded %d dynamic route(s): %s",
-                len(self._dynamic_routes),
-                ", ".join(self._dynamic_routes.keys()) or "(none)",
-            )
-        except Exception as e:
-            logger.warning("[webhook] Failed to reload dynamic routes: %s", e)
-
    async def _handle_webhook(self, request: "web.Request") -> "web.Response":
        """POST /webhooks/{route_name} — receive and process a webhook event."""
-        # Hot-reload dynamic subscriptions on each request (mtime-gated, cheap)
-        self._reload_dynamic_routes()
-
        route_name = request.match_info.get("route_name", "")
        route_config = self._routes.get(route_name)

@@ -26,7 +26,6 @@ from pathlib import Path
 from typing import Dict, Optional, Any

 from hermes_cli.config import get_hermes_home
-from hermes_constants import get_hermes_dir

 logger = logging.getLogger(__name__)

@@ -135,14 +134,13 @@ class WhatsAppAdapter(BasePlatformAdapter):
        )
        self._session_path: Path = Path(config.extra.get(
            "session_path",
-            get_hermes_dir("platforms/whatsapp/session", "whatsapp/session")
+            get_hermes_home() / "whatsapp" / "session"
        ))
        self._reply_prefix: Optional[str] = config.extra.get("reply_prefix")
        self._message_queue: asyncio.Queue = asyncio.Queue()
        self._bridge_log_fh = None
        self._bridge_log: Optional[Path] = None
        self._poll_task: Optional[asyncio.Task] = None
-        self._session_lock_identity: Optional[str] = None
    
    async def connect(self) -> bool:
        """
@@ -161,29 +159,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
        
        logger.info("[%s] Bridge found at %s", self.name, bridge_path)
        
-        # Acquire scoped lock to prevent duplicate sessions
-        try:
-            from gateway.status import acquire_scoped_lock
-
-            self._session_lock_identity = str(self._session_path)
-            acquired, existing = acquire_scoped_lock(
-                "whatsapp-session",
-                self._session_lock_identity,
-                metadata={"platform": self.platform.value},
-            )
-            if not acquired:
-                owner_pid = existing.get("pid") if isinstance(existing, dict) else None
-                message = (
-                    "Another local Hermes gateway is already using this WhatsApp session"
-                    + (f" (PID {owner_pid})." if owner_pid else ".")
-                    + " Stop the other gateway before starting a second WhatsApp bridge."
-                )
-                logger.error("[%s] %s", self.name, message)
-                self._set_fatal_error("whatsapp_session_lock", message, retryable=False)
-                return False
-        except Exception as e:
-            logger.warning("[%s] Could not acquire session lock (non-fatal): %s", self.name, e)
-
        # Auto-install npm dependencies if node_modules doesn't exist
        bridge_dir = bridge_path.parent
        if not (bridge_dir / "node_modules").exists():
@@ -337,12 +312,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
            return True
            
        except Exception as e:
-            if self._session_lock_identity:
-                try:
-                    from gateway.status import release_scoped_lock
-                    release_scoped_lock("whatsapp-session", self._session_lock_identity)
-                except Exception:
-                    pass
            logger.error("[%s] Failed to start bridge: %s", self.name, e, exc_info=True)
            self._close_bridge_log()
            return False
@@ -401,17 +370,9 @@ class WhatsAppAdapter(BasePlatformAdapter):
            # Bridge was not started by us, don't kill it
            print(f"[{self.name}] Disconnecting (external bridge left running)")
        
-        if self._session_lock_identity:
-            try:
-                from gateway.status import release_scoped_lock
-                release_scoped_lock("whatsapp-session", self._session_lock_identity)
-            except Exception as e:
-                logger.warning("[%s] Error releasing WhatsApp session lock: %s", self.name, e, exc_info=True)
-
        self._mark_disconnected()
        self._bridge_process = None
        self._close_bridge_log()
-        self._session_lock_identity = None
        print(f"[{self.name}] Disconnected")
    
    async def send(
@@ -565,7 +526,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
        image_path: str,
        caption: Optional[str] = None,
        reply_to: Optional[str] = None,
-        **kwargs,
    ) -> SendResult:
        """Send a local image file natively via bridge."""
        return await self._send_media_to_bridge(chat_id, image_path, "image", caption)
@@ -576,7 +536,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
        video_path: str,
        caption: Optional[str] = None,
        reply_to: Optional[str] = None,
-        **kwargs,
    ) -> SendResult:
        """Send a video natively via bridge — plays inline in WhatsApp."""
        return await self._send_media_to_bridge(chat_id, video_path, "video", caption)
@@ -588,7 +547,6 @@ class WhatsAppAdapter(BasePlatformAdapter):
        caption: Optional[str] = None,
        file_name: Optional[str] = None,
        reply_to: Optional[str] = None,
-        **kwargs,
    ) -> SendResult:
        """Send a document/file as a downloadable attachment via bridge."""
        return await self._send_media_to_bridge(
@@ -288,7 +288,7 @@ def _resolve_gateway_model(config: dict | None = None) -> str:
    if isinstance(model_cfg, str):
        model = model_cfg
    elif isinstance(model_cfg, dict):
-        model = model_cfg.get("default") or model_cfg.get("model") or model
+        model = model_cfg.get("default", model)
    return model


@@ -432,7 +432,7 @@ class GatewayRunner:
            from honcho_integration.session import HonchoSessionManager

            hcfg = HonchoClientConfig.from_global_config()
-            if not hcfg.enabled or not (hcfg.api_key or hcfg.base_url):
+            if not hcfg.enabled or not hcfg.api_key:
                return None, hcfg

            client = get_honcho_client(hcfg)
@@ -573,10 +573,6 @@ class GatewayRunner:
                session_id=old_session_id,
                honcho_session_key=honcho_session_key,
            )
-            # Fully silence the flush agent — quiet_mode only suppresses init
-            # messages; tool call output still leaks to the terminal through
-            # _safe_print → _print_fn.  Set a no-op to prevent that.
-            tmp_agent._print_fn = lambda *a, **kw: None

            # Build conversation history from transcript
            msgs = [
@@ -745,22 +741,10 @@ class GatewayRunner:
                logger.error("No connected messaging platforms remain. Shutting down gateway cleanly.")
            await self.stop()
        elif not self.adapters and self._failed_platforms:
-            # All platforms are down and queued for background reconnection.
-            # If the error is retryable, exit with failure so systemd Restart=on-failure
-            # can restart the process. Otherwise stay alive and keep retrying in background.
-            if adapter.fatal_error_retryable:
-                self._exit_reason = adapter.fatal_error_message or "All messaging platforms failed with retryable errors"
-                self._exit_with_failure = True
-                logger.error(
-                    "All messaging platforms failed with retryable errors. "
-                    "Shutting down gateway for service restart (systemd will retry)."
-                )
-                await self.stop()
-            else:
-                logger.warning(
-                    "No connected messaging platforms remain, but %d platform(s) queued for reconnection",
-                    len(self._failed_platforms),
-                )
+            logger.warning(
+                "No connected messaging platforms remain, but %d platform(s) queued for reconnection",
+                len(self._failed_platforms),
+            )

    def _request_clean_exit(self, reason: str) -> None:
        self._exit_cleanly = True
@@ -959,13 +943,6 @@ class GatewayRunner:
        """
        logger.info("Starting Hermes Gateway...")
        logger.info("Session storage: %s", self.config.sessions_dir)
-        try:
-            from hermes_cli.profiles import get_active_profile_name
-            _profile = get_active_profile_name()
-            if _profile and _profile != "default":
-                logger.info("Active profile: %s", _profile)
-        except Exception:
-            pass
        try:
            from gateway.status import write_runtime_status
            write_runtime_status(gateway_state="starting", exit_reason=None)
@@ -977,20 +954,12 @@ class GatewayRunner:
            os.getenv(v)
            for v in ("TELEGRAM_ALLOWED_USERS", "DISCORD_ALLOWED_USERS",
                       "WHATSAPP_ALLOWED_USERS", "SLACK_ALLOWED_USERS",
-                       "SIGNAL_ALLOWED_USERS", "SIGNAL_GROUP_ALLOWED_USERS",
-                       "EMAIL_ALLOWED_USERS",
+                       "SIGNAL_ALLOWED_USERS", "EMAIL_ALLOWED_USERS",
                       "SMS_ALLOWED_USERS", "MATTERMOST_ALLOWED_USERS",
                       "MATRIX_ALLOWED_USERS", "DINGTALK_ALLOWED_USERS",
                       "GATEWAY_ALLOWED_USERS")
        )
-        _allow_all = os.getenv("GATEWAY_ALLOW_ALL_USERS", "").lower() in ("true", "1", "yes") or any(
-            os.getenv(v, "").lower() in ("true", "1", "yes")
-            for v in ("TELEGRAM_ALLOW_ALL_USERS", "DISCORD_ALLOW_ALL_USERS",
-                       "WHATSAPP_ALLOW_ALL_USERS", "SLACK_ALLOW_ALL_USERS",
-                       "SIGNAL_ALLOW_ALL_USERS", "EMAIL_ALLOW_ALL_USERS",
-                       "SMS_ALLOW_ALL_USERS", "MATTERMOST_ALLOW_ALL_USERS",
-                       "MATRIX_ALLOW_ALL_USERS", "DINGTALK_ALLOW_ALL_USERS")
-        )
+        _allow_all = os.getenv("GATEWAY_ALLOW_ALL_USERS", "").lower() in ("true", "1", "yes")
        if not _any_allowlist and not _allow_all:
            logger.warning(
                "No user allowlists configured. All unauthorized users will be denied. "
@@ -2001,12 +1970,6 @@ class GatewayRunner:
                            f"Use /resume to browse and restore a previous session.\n"
                            f"Adjust reset timing in config.yaml under session_reset."
                        )
-                        try:
-                            session_info = self._format_session_info()
-                            if session_info:
-                                notice = f"{notice}\n\n{session_info}"
-                        except Exception:
-                            pass
                        await adapter.send(
                            source.chat_id, notice,
                            metadata=getattr(event, 'metadata', None),
@@ -2100,7 +2063,7 @@ class GatewayRunner:
                    if isinstance(_model_cfg, str):
                        _hyg_model = _model_cfg
                    elif isinstance(_model_cfg, dict):
-                        _hyg_model = _model_cfg.get("default") or _model_cfg.get("model") or _hyg_model
+                        _hyg_model = _model_cfg.get("default", _hyg_model)
                        # Read explicit context_length override from model config
                        # (same as run_agent.py lines 995-1005)
                        _raw_ctx = _model_cfg.get("context_length")
@@ -2212,7 +2175,6 @@ class GatewayRunner:
                                    enabled_toolsets=["memory"],
                                    session_id=session_entry.session_id,
                                )
-                                _hyg_agent._print_fn = lambda *a, **kw: None

                                loop = asyncio.get_event_loop()
                                _compressed, _ = await loop.run_in_executor(
@@ -2223,15 +2185,6 @@ class GatewayRunner:
                                    ),
                                )

-                                # _compress_context ends the old session and creates
-                                # a new session_id.  Write compressed messages into
-                                # the NEW session so the old transcript stays intact
-                                # and searchable via session_search.
-                                _hyg_new_sid = _hyg_agent.session_id
-                                if _hyg_new_sid != session_entry.session_id:
-                                    session_entry.session_id = _hyg_new_sid
-                                    self.session_store._save()
-
                                self.session_store.rewrite_transcript(
                                    session_entry.session_id, _compressed
                                )
@@ -2783,85 +2736,6 @@ class GatewayRunner:
            # Clear session env
            self._clear_session_env()
    
-    def _format_session_info(self) -> str:
-        """Resolve current model config and return a formatted info block.
-
-        Surfaces model, provider, context length, and endpoint so gateway
-        users can immediately see if context detection went wrong (e.g.
-        local models falling to the 128K default).
-        """
-        from agent.model_metadata import get_model_context_length, DEFAULT_FALLBACK_CONTEXT
-
-        model = _resolve_gateway_model()
-        config_context_length = None
-        provider = None
-        base_url = None
-        api_key = None
-
-        try:
-            cfg_path = _hermes_home / "config.yaml"
-            if cfg_path.exists():
-                import yaml as _info_yaml
-                with open(cfg_path, encoding="utf-8") as f:
-                    data = _info_yaml.safe_load(f) or {}
-                model_cfg = data.get("model", {})
-                if isinstance(model_cfg, dict):
-                    raw_ctx = model_cfg.get("context_length")
-                    if raw_ctx is not None:
-                        try:
-                            config_context_length = int(raw_ctx)
-                        except (TypeError, ValueError):
-                            pass
-                    provider = model_cfg.get("provider") or None
-                    base_url = model_cfg.get("base_url") or None
-        except Exception:
-            pass
-
-        # Resolve runtime credentials for probing
-        try:
-            runtime = _resolve_runtime_agent_kwargs()
-            provider = provider or runtime.get("provider")
-            base_url = base_url or runtime.get("base_url")
-            api_key = runtime.get("api_key")
-        except Exception:
-            pass
-
-        context_length = get_model_context_length(
-            model,
-            base_url=base_url or "",
-            api_key=api_key or "",
-            config_context_length=config_context_length,
-            provider=provider or "",
-        )
-
-        # Format context source hint
-        if config_context_length is not None:
-            ctx_source = "config"
-        elif context_length == DEFAULT_FALLBACK_CONTEXT:
-            ctx_source = "default — set model.context_length in config to override"
-        else:
-            ctx_source = "detected"
-
-        # Format context length for display
-        if context_length >= 1_000_000:
-            ctx_display = f"{context_length / 1_000_000:.1f}M"
-        elif context_length >= 1_000:
-            ctx_display = f"{context_length // 1_000}K"
-        else:
-            ctx_display = str(context_length)
-
-        lines = [
-            f"◆ Model: `{model}`",
-            f"◆ Provider: {provider or 'openrouter'}",
-            f"◆ Context: {ctx_display} tokens ({ctx_source})",
-        ]
-
-        # Show endpoint for local/custom setups
-        if base_url and ("localhost" in base_url or "127.0.0.1" in base_url or "0.0.0.0" in base_url):
-            lines.append(f"◆ Endpoint: {base_url}")
-
-        return "\n".join(lines)
-
    async def _handle_reset_command(self, event: MessageEvent) -> str:
        """Handle /new or /reset command."""
        source = event.source
@@ -2902,22 +2776,12 @@ class GatewayRunner:
            "session_key": session_key,
        })
        
-        # Resolve session config info to surface to the user
-        try:
-            session_info = self._format_session_info()
-        except Exception:
-            session_info = ""
-
        if new_entry:
-            header = "✨ Session reset! Starting fresh."
+            return "✨ Session reset! I've started fresh with no memory of our previous conversation."
        else:
            # No existing session, just create one
            self.session_store.get_or_create_session(source, force_new=True)
-            header = "✨ New session started!"
-
-        if session_info:
-            return f"{header}\n\n{session_info}"
-        return header
+            return "✨ New session started!"
    
    async def _handle_status_command(self, event: MessageEvent) -> str:
        """Handle /status command."""
@@ -4021,27 +3885,17 @@ class GatewayRunner:
                enabled_toolsets=["memory"],
                session_id=session_entry.session_id,
            )
-            tmp_agent._print_fn = lambda *a, **kw: None

            loop = asyncio.get_event_loop()
            compressed, _ = await loop.run_in_executor(
                None,
-                lambda: tmp_agent._compress_context(msgs, "", approx_tokens=approx_tokens)
+                lambda: tmp_agent._compress_context(msgs, "", approx_tokens=approx_tokens),
            )

-            # _compress_context already calls end_session() on the old session
-            # (preserving its full transcript in SQLite) and creates a new
-            # session_id for the continuation.  Write the compressed messages
-            # into the NEW session so the original history stays searchable.
-            new_session_id = tmp_agent.session_id
-            if new_session_id != session_entry.session_id:
-                session_entry.session_id = new_session_id
-                self.session_store._save()
-
-            self.session_store.rewrite_transcript(new_session_id, compressed)
+            self.session_store.rewrite_transcript(session_entry.session_id, compressed)
            # Reset stored token count — transcript changed, old value is stale
            self.session_store.update_session(
-                session_entry.session_key, last_prompt_tokens=0
+                session_entry.session_key, last_prompt_tokens=0,
            )
            new_count = len(compressed)
            new_tokens = estimate_messages_tokens_rough(compressed)
@@ -4197,7 +4051,7 @@ class GatewayRunner:
            ]
            ctx = agent.context_compressor
            if ctx.last_prompt_tokens:
-                pct = min(100, ctx.last_prompt_tokens / ctx.context_length * 100) if ctx.context_length else 0
+                pct = ctx.last_prompt_tokens / ctx.context_length * 100 if ctx.context_length else 0
                lines.append(f"Context: {ctx.last_prompt_tokens:,} / {ctx.context_length:,} ({pct:.0f}%)")
            if ctx.compression_count:
                lines.append(f"Compressions: {ctx.compression_count}")
@@ -4945,14 +4799,9 @@ class GatewayRunner:
        enabled_toolsets = sorted(_get_platform_tools(user_config, platform_key))

        # Tool progress mode from config.yaml: "all", "new", "verbose", "off"
-        # Falls back to env vars for backward compatibility.
-        # YAML 1.1 parses bare `off` as boolean False — normalise before
-        # the `or` chain so it doesn't silently fall through to "all".
-        _raw_tp = user_config.get("display", {}).get("tool_progress")
-        if _raw_tp is False:
-            _raw_tp = "off"
+        # Falls back to env vars for backward compatibility
        progress_mode = (
-            _raw_tp
+            user_config.get("display", {}).get("tool_progress")
            or os.getenv("HERMES_TOOL_PROGRESS_MODE")
            or "all"
        )
@@ -5011,17 +4860,12 @@ class GatewayRunner:
            progress_queue.put(msg)
        
        # Background task to send progress messages
-        # Accumulates tool lines into a single message that gets edited.
-        #
-        # Threading metadata is platform-specific:
-        # - Slack DM threading needs event_message_id fallback (reply thread)
-        # - Telegram uses message_thread_id only for forum topics; passing a
-        #   normal DM/group message id as thread_id causes send failures
-        # - Other platforms should use explicit source.thread_id only
-        if source.platform == Platform.SLACK:
-            _progress_thread_id = source.thread_id or event_message_id
-        else:
-            _progress_thread_id = source.thread_id
+        # Accumulates tool lines into a single message that gets edited.
+        # For DM top-level Slack messages, source.thread_id is None but the
+        # final reply will be threaded under the original message via reply_to.
+        # Use event_message_id as fallback so progress messages land in the
+        # same thread as the final response instead of going to the DM root.
+        _progress_thread_id = source.thread_id or event_message_id
        _progress_metadata = {"thread_id": _progress_thread_id} if _progress_thread_id else None

        async def send_progress_messages():
@@ -5284,25 +5128,7 @@ class GatewayRunner:
            agent.stream_delta_callback = _stream_delta_cb
            agent.status_callback = _status_callback_sync
            agent.reasoning_config = reasoning_config
-
-            # Background review delivery — send "💾 Memory updated" etc. to user
-            def _bg_review_send(message: str) -> None:
-                if not _status_adapter:
-                    return
-                try:
-                    asyncio.run_coroutine_threadsafe(
-                        _status_adapter.send(
-                            _status_chat_id,
-                            message,
-                            metadata=_status_thread_metadata,
-                        ),
-                        _loop_for_step,
-                    )
-                except Exception as _e:
-                    logger.debug("background_review_callback error: %s", _e)
-
-            agent.background_review_callback = _bg_review_send
-
+            
            # Store agent reference for interrupt support
            agent_holder[0] = agent
            # Capture the full tool definitions for transcript logging
@@ -762,16 +762,14 @@ class SessionStore:
            if session_key in self._entries:
                entry = self._entries[session_key]
                entry.updated_at = _now()
-                # Direct assignment — the gateway receives cumulative totals
-                # from the cached agent, not per-call deltas.
-                entry.input_tokens = input_tokens
-                entry.output_tokens = output_tokens
-                entry.cache_read_tokens = cache_read_tokens
-                entry.cache_write_tokens = cache_write_tokens
+                entry.input_tokens += input_tokens
+                entry.output_tokens += output_tokens
+                entry.cache_read_tokens += cache_read_tokens
+                entry.cache_write_tokens += cache_write_tokens
                if last_prompt_tokens is not None:
                    entry.last_prompt_tokens = last_prompt_tokens
                if estimated_cost_usd is not None:
-                    entry.estimated_cost_usd = estimated_cost_usd
+                    entry.estimated_cost_usd += estimated_cost_usd
                if cost_status:
                    entry.cost_status = cost_status
                entry.total_tokens = (
@@ -785,7 +783,7 @@ class SessionStore:

        if self._db and db_session_id:
            try:
-                self._db.set_token_counts(
+                self._db.update_token_counts(
                    db_session_id,
                    input_tokens=input_tokens,
                    output_tokens=output_tokens,
@@ -797,7 +795,6 @@ class SessionStore:
                    billing_provider=provider,
                    billing_base_url=base_url,
                    model=model,
-                    absolute=True,
                )
            except Exception as e:
                logger.debug("Session DB operation failed: %s", e)
@@ -958,17 +955,13 @@ class SessionStore:
            try:
                self._db.clear_messages(session_id)
                for msg in messages:
-                    role = msg.get("role", "unknown")
                    self._db.append_message(
                        session_id=session_id,
-                        role=role,
+                        role=msg.get("role", "unknown"),
                        content=msg.get("content"),
                        tool_name=msg.get("tool_name"),
                        tool_calls=msg.get("tool_calls"),
                        tool_call_id=msg.get("tool_call_id"),
-                        reasoning=msg.get("reasoning") if role == "assistant" else None,
-                        reasoning_details=msg.get("reasoning_details") if role == "assistant" else None,
-                        codex_reasoning_items=msg.get("codex_reasoning_items") if role == "assistant" else None,
                    )
            except Exception as e:
                logger.debug("Failed to rewrite transcript in DB: %s", e)
@@ -11,5 +11,5 @@ Provides subcommands for:
 - hermes cron          - Manage cron jobs
 """

-__version__ = "0.5.0"
-__release_date__ = "2026.3.28"
+__version__ = "0.4.0"
+__release_date__ = "2026.3.23"
@@ -160,7 +160,7 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
        id="alibaba",
        name="Alibaba Cloud (DashScope)",
        auth_type="api_key",
-        inference_base_url="https://coding-intl.dashscope.aliyuncs.com/v1",
+        inference_base_url="https://dashscope-intl.aliyuncs.com/apps/anthropic",
        api_key_env_vars=("DASHSCOPE_API_KEY",),
        base_url_env_var="DASHSCOPE_BASE_URL",
    ),
@@ -212,14 +212,6 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
        api_key_env_vars=("KILOCODE_API_KEY",),
        base_url_env_var="KILOCODE_BASE_URL",
    ),
-    "huggingface": ProviderConfig(
-        id="huggingface",
-        name="Hugging Face",
-        auth_type="api_key",
-        inference_base_url="https://router.huggingface.co/v1",
-        api_key_env_vars=("HF_TOKEN",),
-        base_url_env_var="HF_BASE_URL",
-    ),
 }


@@ -693,7 +685,6 @@ def resolve_provider(
        "github-copilot-acp": "copilot-acp", "copilot-acp-agent": "copilot-acp",
        "aigateway": "ai-gateway", "vercel": "ai-gateway", "vercel-ai-gateway": "ai-gateway",
        "opencode": "opencode-zen", "zen": "opencode-zen",
-        "hf": "huggingface", "hugging-face": "huggingface", "huggingface-hub": "huggingface",
        "go": "opencode-go", "opencode-go-sub": "opencode-go",
        "kilo": "kilocode", "kilo-code": "kilocode", "kilo-gateway": "kilocode",
    }
@@ -2021,8 +2012,7 @@ def _login_openai_codex(args, pconfig: ProviderConfig) -> None:
    config_path = _update_config_for_provider("openai-codex", creds.get("base_url", DEFAULT_CODEX_BASE_URL))
    print()
    print("Login successful!")
-    from hermes_constants import display_hermes_home as _dhh
-    print(f"  Auth state: {_dhh()}/auth.json")
+    print("  Auth state: ~/.hermes/auth.json")
    print(f"  Config updated: {config_path} (model.provider=openai-codex)")


@@ -403,15 +403,6 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
    if mcp_connected:
        summary_parts.append(f"{mcp_connected} MCP servers")
    summary_parts.append("/help for commands")
-    # Show active profile name when not 'default'
-    try:
-        from hermes_cli.profiles import get_active_profile_name
-        _profile_name = get_active_profile_name()
-        if _profile_name and _profile_name != "default":
-            right_lines.append(f"[bold {accent}]Profile:[/] [{text}]{_profile_name}[/]")
-    except Exception:
-        pass  # Never break the banner over a profiles.py bug
-
    right_lines.append(f"[dim {dim}]{' · '.join(summary_parts)}[/]")

    # Update check — use prefetched result if available
@@ -12,7 +12,6 @@ import getpass

 from hermes_cli.banner import cprint, _DIM, _RST
 from hermes_cli.config import save_env_value_secure
-from hermes_constants import display_hermes_home


 def clarify_callback(cli, question, choices):
@@ -132,8 +131,7 @@ def prompt_for_secret(cli, var_name: str, prompt: str, metadata=None) -> dict:
            }

        stored = save_env_value_secure(var_name, value)
-        _dhh = display_hermes_home()
-        cprint(f"\n{_DIM}  ✓ Stored secret in {_dhh}/.env as {var_name}{_RST}")
+        cprint(f"\n{_DIM}  ✓ Stored secret in ~/.hermes/.env as {var_name}{_RST}")
        return {
            **stored,
            "skipped": False,
@@ -185,8 +183,7 @@ def prompt_for_secret(cli, var_name: str, prompt: str, metadata=None) -> dict:
                }

            stored = save_env_value_secure(var_name, value)
-            _dhh = display_hermes_home()
-            cprint(f"\n{_DIM}  ✓ Stored secret in {_dhh}/.env as {var_name}{_RST}")
+            cprint(f"\n{_DIM}  ✓ Stored secret in ~/.hermes/.env as {var_name}{_RST}")
            return {
                **stored,
                "skipped": False,
@@ -138,12 +138,6 @@ DEFAULT_CONFIG = {
    "toolsets": ["hermes-cli"],
    "agent": {
        "max_turns": 90,
-        # Tool-use enforcement: injects system prompt guidance that tells the
-        # model to actually call tools instead of describing intended actions.
-        # Values: "auto" (default — applies to gpt/codex models), true/false
-        # (force on/off for all models), or a list of model-name substrings
-        # to match (e.g. ["gpt", "codex", "gemini", "qwen"]).
-        "tool_use_enforcement": "auto",
    },
    
    "terminal": {
@@ -227,49 +221,42 @@ DEFAULT_CONFIG = {
            "model": "",
            "base_url": "",
            "api_key": "",
-            "timeout": 30,         # seconds — increase for slow local models
        },
        "compression": {
            "provider": "auto",
            "model": "",
            "base_url": "",
            "api_key": "",
-            "timeout": 120,        # seconds — compression summarises large contexts; increase for local models
        },
        "session_search": {
            "provider": "auto",
            "model": "",
            "base_url": "",
            "api_key": "",
-            "timeout": 30,
        },
        "skills_hub": {
            "provider": "auto",
            "model": "",
            "base_url": "",
            "api_key": "",
-            "timeout": 30,
        },
        "approval": {
            "provider": "auto",
            "model": "",           # fast/cheap model recommended (e.g. gemini-flash, haiku)
            "base_url": "",
            "api_key": "",
-            "timeout": 30,
        },
        "mcp": {
            "provider": "auto",
            "model": "",
            "base_url": "",
            "api_key": "",
-            "timeout": 30,
        },
        "flush_memories": {
            "provider": "auto",
            "model": "",
            "base_url": "",
            "api_key": "",
-            "timeout": 30,
        },
    },
    
@@ -277,7 +264,6 @@ DEFAULT_CONFIG = {
        "compact": False,
        "personality": "kawaii",
        "resume_display": "full",
-        "busy_input_mode": "interrupt",
        "bell_on_complete": False,
        "show_reasoning": False,
        "streaming": False,
@@ -346,11 +332,6 @@ DEFAULT_CONFIG = {
        "user_profile_enabled": True,
        "memory_char_limit": 2200,   # ~800 tokens at 2.75 chars/token
        "user_char_limit": 1375,     # ~500 tokens at 2.75 chars/token
-        # External memory provider (plugin). At most one active at a time.
-        # Set to the provider name (e.g. "holographic", "hindsight", "mem0")
-        # or leave empty for built-in only. Auto-detected from plugins that
-        # call ctx.register_memory_provider().
-        "provider": "",
    },

    # Subagent delegation — override the provider:model used by delegate_task
@@ -371,13 +352,6 @@ DEFAULT_CONFIG = {
    # Never saved to sessions, logs, or trajectories.
    "prefill_messages_file": "",
    
-    # Skills — external skill directories for sharing skills across tools/agents.
-    # Each path is expanded (~, ${VAR}) and resolved.  Read-only — skill creation
-    # always goes to ~/.hermes/skills/.
-    "skills": {
-        "external_dirs": [],   # e.g. ["~/.agents/skills", "/shared/team-skills"]
-    },
-
    # Honcho AI-native memory -- reads ~/.honcho/config.json as single source of truth.
    # This section is only needed for hermes-specific overrides; everything else
    # (apiKey, workspace, peerName, sessions, enabled) comes from the global config.
@@ -572,14 +546,14 @@ OPTIONAL_ENV_VARS = {
        "category": "provider",
    },
    "DASHSCOPE_API_KEY": {
-        "description": "Alibaba Cloud DashScope API key (Qwen + multi-provider models)",
+        "description": "Alibaba Cloud DashScope API key for Qwen models",
        "prompt": "DashScope API Key",
        "url": "https://modelstudio.console.alibabacloud.com/",
        "password": True,
        "category": "provider",
    },
    "DASHSCOPE_BASE_URL": {
-        "description": "Custom DashScope base URL (default: coding-intl OpenAI-compat endpoint)",
+        "description": "Custom DashScope base URL (default: international endpoint)",
        "prompt": "DashScope Base URL",
        "url": "",
        "password": False,
@@ -618,31 +592,8 @@ OPTIONAL_ENV_VARS = {
        "category": "provider",
        "advanced": True,
    },
-    "HF_TOKEN": {
-        "description": "Hugging Face token for Inference Providers (20+ open models via router.huggingface.co)",
-        "prompt": "Hugging Face Token",
-        "url": "https://huggingface.co/settings/tokens",
-        "password": True,
-        "category": "provider",
-    },
-    "HF_BASE_URL": {
-        "description": "Hugging Face Inference Providers base URL override",
-        "prompt": "HF base URL (leave empty for default)",
-        "url": None,
-        "password": False,
-        "category": "provider",
-        "advanced": True,
-    },

    # ── Tool API keys ──
-    "EXA_API_KEY": {
-        "description": "Exa API key for AI-native web search and contents",
-        "prompt": "Exa API key",
-        "url": "https://exa.ai/",
-        "tools": ["web_search", "web_extract"],
-        "password": True,
-        "category": "tool",
-    },
    "PARALLEL_API_KEY": {
        "description": "Parallel API key for AI-native web search and extract",
        "prompt": "Parallel API key",
@@ -829,20 +780,6 @@ OPTIONAL_ENV_VARS = {
        "password": False,
        "category": "messaging",
    },
-    "MATTERMOST_REQUIRE_MENTION": {
-        "description": "Require @mention in Mattermost channels (default: true). Set to false to respond to all messages.",
-        "prompt": "Require @mention in channels",
-        "url": None,
-        "password": False,
-        "category": "messaging",
-    },
-    "MATTERMOST_FREE_RESPONSE_CHANNELS": {
-        "description": "Comma-separated Mattermost channel IDs where bot responds without @mention",
-        "prompt": "Free-response channel IDs (comma-separated)",
-        "url": None,
-        "password": False,
-        "category": "messaging",
-    },
    "MATRIX_HOMESERVER": {
        "description": "Matrix homeserver URL (e.g. https://matrix.example.org)",
        "prompt": "Matrix homeserver URL",
@@ -1713,7 +1650,6 @@ def show_config():
    keys = [
        ("OPENROUTER_API_KEY", "OpenRouter"),
        ("VOICE_TOOLS_OPENAI_KEY", "OpenAI (STT/TTS)"),
-        ("EXA_API_KEY", "Exa"),
        ("PARALLEL_API_KEY", "Parallel"),
        ("FIRECRAWL_API_KEY", "Firecrawl"),
        ("TAVILY_API_KEY", "Tavily"),
@@ -1873,7 +1809,7 @@ def set_config_value(key: str, value: str):
    # Check if it's an API key (goes to .env)
    api_keys = [
        'OPENROUTER_API_KEY', 'OPENAI_API_KEY', 'ANTHROPIC_API_KEY', 'VOICE_TOOLS_OPENAI_KEY',
-        'EXA_API_KEY', 'PARALLEL_API_KEY', 'FIRECRAWL_API_KEY', 'FIRECRAWL_API_URL', 'TAVILY_API_KEY',
+        'PARALLEL_API_KEY', 'FIRECRAWL_API_KEY', 'FIRECRAWL_API_URL', 'TAVILY_API_KEY',
        'BROWSERBASE_API_KEY', 'BROWSERBASE_PROJECT_ID', 'BROWSER_USE_API_KEY',
        'FAL_KEY', 'TELEGRAM_BOT_TOKEN', 'DISCORD_BOT_TOKEN',
        'TERMINAL_SSH_HOST', 'TERMINAL_SSH_USER', 'TERMINAL_SSH_KEY',
@@ -10,11 +10,9 @@ import subprocess
 import shutil

 from hermes_cli.config import get_project_root, get_hermes_home, get_env_path
-from hermes_constants import display_hermes_home

 PROJECT_ROOT = get_project_root()
 HERMES_HOME = get_hermes_home()
-_DHH = display_hermes_home()  # user-facing display path (e.g. ~/.hermes or ~/.hermes/profiles/coder)

 # Load environment variables from ~/.hermes/.env so API key checks work
 from dotenv import load_dotenv
@@ -58,7 +56,7 @@ def _honcho_is_configured_for_doctor() -> bool:
        from honcho_integration.client import HonchoClientConfig

        cfg = HonchoClientConfig.from_global_config()
-        return bool(cfg.enabled and (cfg.api_key or cfg.base_url))
+        return bool(cfg.enabled and cfg.api_key)
    except Exception:
        return False

@@ -211,14 +209,14 @@ def run_doctor(args):
    # Check ~/.hermes/.env (primary location for user config)
    env_path = HERMES_HOME / '.env'
    if env_path.exists():
-        check_ok(f"{_DHH}/.env file exists")
+        check_ok("~/.hermes/.env file exists")
        
        # Check for common issues
        content = env_path.read_text()
        if _has_provider_env_config(content):
            check_ok("API key or custom endpoint configured")
        else:
-            check_warn(f"No API key found in {_DHH}/.env")
+            check_warn("No API key found in ~/.hermes/.env")
            issues.append("Run 'hermes setup' to configure API keys")
    else:
        # Also check project root as fallback
@@ -226,11 +224,11 @@ def run_doctor(args):
        if fallback_env.exists():
            check_ok(".env file exists (in project directory)")
        else:
-            check_fail(f"{_DHH}/.env file missing")
+            check_fail("~/.hermes/.env file missing")
            if should_fix:
                env_path.parent.mkdir(parents=True, exist_ok=True)
                env_path.touch()
-                check_ok(f"Created empty {_DHH}/.env")
+                check_ok("Created empty ~/.hermes/.env")
                check_info("Run 'hermes setup' to configure API keys")
                fixed_count += 1
            else:
@@ -240,7 +238,7 @@ def run_doctor(args):
    # Check ~/.hermes/config.yaml (primary) or project cli-config.yaml (fallback)
    config_path = HERMES_HOME / 'config.yaml'
    if config_path.exists():
-        check_ok(f"{_DHH}/config.yaml exists")
+        check_ok("~/.hermes/config.yaml exists")
    else:
        fallback_config = PROJECT_ROOT / 'cli-config.yaml'
        if fallback_config.exists():
@@ -250,11 +248,11 @@ def run_doctor(args):
            if should_fix and example_config.exists():
                config_path.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(str(example_config), str(config_path))
-                check_ok(f"Created {_DHH}/config.yaml from cli-config.yaml.example")
+                check_ok("Created ~/.hermes/config.yaml from cli-config.yaml.example")
                fixed_count += 1
            elif should_fix:
                check_warn("config.yaml not found and no example to copy from")
-                manual_issues.append(f"Create {_DHH}/config.yaml manually")
+                manual_issues.append("Create ~/.hermes/config.yaml manually")
            else:
                check_warn("config.yaml not found", "(using defaults)")
    
@@ -296,28 +294,28 @@ def run_doctor(args):
    
    hermes_home = HERMES_HOME
    if hermes_home.exists():
-        check_ok(f"{_DHH} directory exists")
+        check_ok("~/.hermes directory exists")
    else:
        if should_fix:
            hermes_home.mkdir(parents=True, exist_ok=True)
-            check_ok(f"Created {_DHH} directory")
+            check_ok("Created ~/.hermes directory")
            fixed_count += 1
        else:
-            check_warn(f"{_DHH} not found", "(will be created on first use)")
+            check_warn("~/.hermes not found", "(will be created on first use)")
    
    # Check expected subdirectories
    expected_subdirs = ["cron", "sessions", "logs", "skills", "memories"]
    for subdir_name in expected_subdirs:
        subdir_path = hermes_home / subdir_name
        if subdir_path.exists():
-            check_ok(f"{_DHH}/{subdir_name}/ exists")
+            check_ok(f"~/.hermes/{subdir_name}/ exists")
        else:
            if should_fix:
                subdir_path.mkdir(parents=True, exist_ok=True)
-                check_ok(f"Created {_DHH}/{subdir_name}/")
+                check_ok(f"Created ~/.hermes/{subdir_name}/")
                fixed_count += 1
            else:
-                check_warn(f"{_DHH}/{subdir_name}/ not found", "(will be created on first use)")
+                check_warn(f"~/.hermes/{subdir_name}/ not found", "(will be created on first use)")
    
    # Check for SOUL.md persona file
    soul_path = hermes_home / "SOUL.md"
@@ -326,11 +324,11 @@ def run_doctor(args):
        # Check if it's just the template comments (no real content)
        lines = [l for l in content.splitlines() if l.strip() and not l.strip().startswith(("<!--", "-->", "#"))]
        if lines:
-            check_ok(f"{_DHH}/SOUL.md exists (persona configured)")
+            check_ok("~/.hermes/SOUL.md exists (persona configured)")
        else:
-            check_info(f"{_DHH}/SOUL.md exists but is empty — edit it to customize personality")
+            check_info("~/.hermes/SOUL.md exists but is empty — edit it to customize personality")
    else:
-        check_warn(f"{_DHH}/SOUL.md not found", "(create it to give Hermes a custom personality)")
+        check_warn("~/.hermes/SOUL.md not found", "(create it to give Hermes a custom personality)")
        if should_fix:
            soul_path.parent.mkdir(parents=True, exist_ok=True)
            soul_path.write_text(
@@ -339,13 +337,13 @@ def run_doctor(args):
                "You are Hermes, a helpful AI assistant.\n",
                encoding="utf-8",
            )
-            check_ok(f"Created {_DHH}/SOUL.md with basic template")
+            check_ok("Created ~/.hermes/SOUL.md with basic template")
            fixed_count += 1
    
    # Check memory directory
    memories_dir = hermes_home / "memories"
    if memories_dir.exists():
-        check_ok(f"{_DHH}/memories/ directory exists")
+        check_ok("~/.hermes/memories/ directory exists")
        memory_file = memories_dir / "MEMORY.md"
        user_file = memories_dir / "USER.md"
        if memory_file.exists():
@@ -359,10 +357,10 @@ def run_doctor(args):
        else:
            check_info("USER.md not created yet (will be created when the agent first writes a memory)")
    else:
-        check_warn(f"{_DHH}/memories/ not found", "(will be created on first use)")
+        check_warn("~/.hermes/memories/ not found", "(will be created on first use)")
        if should_fix:
            memories_dir.mkdir(parents=True, exist_ok=True)
-            check_ok(f"Created {_DHH}/memories/")
+            check_ok("Created ~/.hermes/memories/")
            fixed_count += 1
    
    # Check SQLite session store
@@ -374,11 +372,11 @@ def run_doctor(args):
            cursor = conn.execute("SELECT COUNT(*) FROM sessions")
            count = cursor.fetchone()[0]
            conn.close()
-            check_ok(f"{_DHH}/state.db exists ({count} sessions)")
+            check_ok(f"~/.hermes/state.db exists ({count} sessions)")
        except Exception as e:
-            check_warn(f"{_DHH}/state.db exists but has issues: {e}")
+            check_warn(f"~/.hermes/state.db exists but has issues: {e}")
    else:
-        check_info(f"{_DHH}/state.db not created yet (will be created on first session)")
+        check_info("~/.hermes/state.db not created yet (will be created on first session)")

    _check_gateway_service_linger(issues)
    
@@ -693,7 +691,7 @@ def run_doctor(args):
    if github_token:
        check_ok("GitHub token configured (authenticated API access)")
    else:
-        check_warn("No GITHUB_TOKEN", f"(60 req/hr rate limit — set in {_DHH}/.env for better rates)")
+        check_warn("No GITHUB_TOKEN", "(60 req/hr rate limit — set in ~/.hermes/.env for better rates)")

    # =========================================================================
    # Honcho memory
@@ -710,8 +708,8 @@ def run_doctor(args):
            check_warn("Honcho config not found", "run: hermes honcho setup")
        elif not hcfg.enabled:
            check_info(f"Honcho disabled (set enabled: true in {_honcho_cfg_path} to activate)")
-        elif not (hcfg.api_key or hcfg.base_url):
-            check_fail("Honcho API key or base URL not set", "run: hermes honcho setup")
+        elif not hcfg.api_key:
+            check_fail("Honcho API key not set", "run: hermes honcho setup")
            issues.append("No Honcho API key — run 'hermes honcho setup'")
        else:
            from honcho_integration.client import get_honcho_client, reset_honcho_client
@@ -730,53 +728,6 @@ def run_doctor(args):
    except Exception as _e:
        check_warn("Honcho check failed", str(_e))

-    # =========================================================================
-    # Profiles
-    # =========================================================================
-    try:
-        from hermes_cli.profiles import list_profiles, _get_wrapper_dir, profile_exists
-        import re as _re
-
-        named_profiles = [p for p in list_profiles() if not p.is_default]
-        if named_profiles:
-            print()
-            print(color("◆ Profiles", Colors.CYAN, Colors.BOLD))
-            check_ok(f"{len(named_profiles)} profile(s) found")
-            wrapper_dir = _get_wrapper_dir()
-            for p in named_profiles:
-                parts = []
-                if p.gateway_running:
-                    parts.append("gateway running")
-                if p.model:
-                    parts.append(p.model[:30])
-                if not (p.path / "config.yaml").exists():
-                    parts.append("⚠ missing config")
-                if not (p.path / ".env").exists():
-                    parts.append("no .env")
-                wrapper = wrapper_dir / p.name
-                if not wrapper.exists():
-                    parts.append("no alias")
-                status = ", ".join(parts) if parts else "configured"
-                check_ok(f"  {p.name}: {status}")
-
-            # Check for orphan wrappers
-            if wrapper_dir.is_dir():
-                for wrapper in wrapper_dir.iterdir():
-                    if not wrapper.is_file():
-                        continue
-                    try:
-                        content = wrapper.read_text()
-                        if "hermes -p" in content:
-                            _m = _re.search(r"hermes -p (\S+)", content)
-                            if _m and not profile_exists(_m.group(1)):
-                                check_warn(f"Orphan alias: {wrapper.name} → profile '{_m.group(1)}' no longer exists")
-                    except Exception:
-                        pass
-    except ImportError:
-        pass
-    except Exception as _e:
-        logger.debug("Profile health check failed: %s", _e)
-
    # =========================================================================
    # Summary
    # =========================================================================
@@ -15,8 +15,6 @@ from pathlib import Path
 PROJECT_ROOT = Path(__file__).parent.parent.resolve()

 from hermes_cli.config import get_env_value, get_hermes_home, save_env_value, is_managed, managed_error
-# display_hermes_home is imported lazily at call sites to avoid ImportError
-# when hermes_constants is cached from a pre-update version during `hermes update`.
 from hermes_cli.setup import (
    print_header, print_info, print_success, print_warning, print_error,
    prompt, prompt_choice, prompt_yes_no,
@@ -127,43 +125,20 @@ _SERVICE_BASE = "hermes-gateway"
 SERVICE_DESCRIPTION = "Hermes Agent Gateway - Messaging Platform Integration"


-def _profile_suffix() -> str:
-    """Derive a service-name suffix from the current HERMES_HOME.
-
-    Returns ``""`` for the default ``~/.hermes``, the profile name for
-    ``~/.hermes/profiles/<name>``, or a short hash for any other custom
-    HERMES_HOME path.
-    """
-    import hashlib
-    import re
-    from pathlib import Path as _Path
-    home = get_hermes_home().resolve()
-    default = (_Path.home() / ".hermes").resolve()
-    if home == default:
-        return ""
-    # Detect ~/.hermes/profiles/<name> pattern → use the profile name
-    profiles_root = (default / "profiles").resolve()
-    try:
-        rel = home.relative_to(profiles_root)
-        parts = rel.parts
-        if len(parts) == 1 and re.match(r"^[a-z0-9][a-z0-9_-]{0,63}$", parts[0]):
-            return parts[0]
-    except ValueError:
-        pass
-    # Fallback: short hash for arbitrary HERMES_HOME paths
-    return hashlib.sha256(str(home).encode()).hexdigest()[:8]
-
-
 def get_service_name() -> str:
    """Derive a systemd service name scoped to this HERMES_HOME.

    Default ``~/.hermes`` returns ``hermes-gateway`` (backward compatible).
-    Profile ``~/.hermes/profiles/coder`` returns ``hermes-gateway-coder``.
-    Any other HERMES_HOME appends a short hash for uniqueness.
+    Any other HERMES_HOME appends a short hash so multiple installations
+    can each have their own systemd service without conflicting.
    """
-    suffix = _profile_suffix()
-    if not suffix:
+    import hashlib
+    from pathlib import Path as _Path  # local import to avoid monkeypatch interference
+    home = get_hermes_home().resolve()
+    default = (_Path.home() / ".hermes").resolve()
+    if home == default:
        return _SERVICE_BASE
+    suffix = hashlib.sha256(str(home).encode()).hexdigest()[:8]
    return f"{_SERVICE_BASE}-{suffix}"
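
Reviewer note: the naming rule that `get_service_name()` ends up with after this change can be sketched in isolation. This is a minimal illustration, not the project's code: `service_name_for` and its `default_home` parameter are hypothetical names introduced here so the logic can be exercised without touching a real `HERMES_HOME`.

```python
import hashlib
from pathlib import Path

SERVICE_BASE = "hermes-gateway"  # mirrors _SERVICE_BASE from the diff


def service_name_for(home: Path, default_home: Path) -> str:
    """Hypothetical helper: derive a service name from a HERMES_HOME path."""
    home = home.resolve()
    # The default home keeps the bare name for backward compatibility.
    if home == default_home.resolve():
        return SERVICE_BASE
    # Any other home gets the first 8 hex chars of SHA-256 over its path,
    # so the same path always maps to the same service name.
    suffix = hashlib.sha256(str(home).encode()).hexdigest()[:8]
    return f"{SERVICE_BASE}-{suffix}"
```

Because the suffix depends only on the resolved path, two installations with different homes should get distinct service names, while a profile directory such as `~/.hermes/profiles/coder` no longer gets the readable `-coder` suffix that the removed `_profile_suffix()` produced.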


@@ -394,14 +369,7 @@ def print_systemd_linger_guidance() -> None:
        print("  sudo loginctl enable-linger $USER")

 def get_launchd_plist_path() -> Path:
-    """Return the launchd plist path, scoped per profile.
-
-    Default ``~/.hermes`` → ``ai.hermes.gateway.plist`` (backward compatible).
-    Profile ``~/.hermes/profiles/coder`` → ``ai.hermes.gateway-coder.plist``.
-    """
-    suffix = _profile_suffix()
-    name = f"ai.hermes.gateway-{suffix}" if suffix else "ai.hermes.gateway"
-    return Path.home() / "Library" / "LaunchAgents" / f"{name}.plist"
+    return Path.home() / "Library" / "LaunchAgents" / "ai.hermes.gateway.plist"

 def _detect_venv_dir() -> Path | None:
    """Detect the active virtualenv directory.
@@ -452,17 +420,6 @@ def get_hermes_cli_path() -> str:
 # Systemd (Linux)
 # =============================================================================

-def _build_user_local_paths(home: Path, path_entries: list[str]) -> list[str]:
-    """Return user-local bin dirs that exist and aren't already in *path_entries*."""
-    candidates = [
-        str(home / ".local" / "bin"),       # uv, uvx, pip-installed CLIs
-        str(home / ".cargo" / "bin"),        # Rust/cargo tools
-        str(home / "go" / "bin"),            # Go tools
-        str(home / ".npm-global" / "bin"),   # npm global packages
-    ]
-    return [p for p in candidates if p not in path_entries and Path(p).exists()]
-
-
 def generate_systemd_unit(system: bool = False, run_as_user: str | None = None) -> str:
    python_path = get_python_path()
    working_dir = str(PROJECT_ROOT)
@@ -477,16 +434,13 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
        resolved_node_dir = str(Path(resolved_node).resolve().parent)
        if resolved_node_dir not in path_entries:
            path_entries.append(resolved_node_dir)
+    path_entries.extend(["/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"])
+    sane_path = ":".join(path_entries)

    hermes_home = str(get_hermes_home().resolve())

-    common_bin_paths = ["/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"]
-
    if system:
        username, group_name, home_dir = _system_service_identity(run_as_user)
-        path_entries.extend(_build_user_local_paths(Path(home_dir), path_entries))
-        path_entries.extend(common_bin_paths)
-        sane_path = ":".join(path_entries)
        return f"""[Unit]
 Description={SERVICE_DESCRIPTION}
 After=network-online.target
@@ -518,9 +472,6 @@ StandardError=journal
 WantedBy=multi-user.target
 """

-    path_entries.extend(_build_user_local_paths(Path.home(), path_entries))
-    path_entries.extend(common_bin_paths)
-    sane_path = ":".join(path_entries)
    return f"""[Unit]
 Description={SERVICE_DESCRIPTION}
 After=network.target
@@ -801,46 +752,18 @@ def systemd_status(deep: bool = False, system: bool = False):
 # Launchd (macOS)
 # =============================================================================

-def get_launchd_label() -> str:
-    """Return the launchd service label, scoped per profile."""
-    suffix = _profile_suffix()
-    return f"ai.hermes.gateway-{suffix}" if suffix else "ai.hermes.gateway"
-
-
 def generate_launchd_plist() -> str:
    python_path = get_python_path()
    working_dir = str(PROJECT_ROOT)
-    hermes_home = str(get_hermes_home().resolve())
    log_dir = get_hermes_home() / "logs"
    log_dir.mkdir(parents=True, exist_ok=True)
-    label = get_launchd_label()
-    # Build a sane PATH for the launchd plist.  launchd provides only a
-    # minimal default (/usr/bin:/bin:/usr/sbin:/sbin) which misses Homebrew,
-    # nvm, cargo, etc.  We prepend venv/bin and node_modules/.bin (matching
-    # the systemd unit), then capture the user's full shell PATH so every
-    # user-installed tool (node, ffmpeg, …) is reachable.
-    detected_venv = _detect_venv_dir()
-    venv_bin = str(detected_venv / "bin") if detected_venv else str(PROJECT_ROOT / "venv" / "bin")
-    venv_dir = str(detected_venv) if detected_venv else str(PROJECT_ROOT / "venv")
-    node_bin = str(PROJECT_ROOT / "node_modules" / ".bin")
-    # Resolve the directory containing the node binary (e.g. Homebrew, nvm)
-    # so it's explicitly in PATH even if the user's shell PATH changes later.
-    priority_dirs = [venv_bin, node_bin]
-    resolved_node = shutil.which("node")
-    if resolved_node:
-        resolved_node_dir = str(Path(resolved_node).resolve().parent)
-        if resolved_node_dir not in priority_dirs:
-            priority_dirs.append(resolved_node_dir)
-    sane_path = ":".join(
-        dict.fromkeys(priority_dirs + [p for p in os.environ.get("PATH", "").split(":") if p])
-    )
-
+    
    return f"""<?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
 <plist version="1.0">
 <dict>
    <key>Label</key>
-    <string>{label}</string>
+    <string>ai.hermes.gateway</string>
    
    <key>ProgramArguments</key>
    <array>
@@ -855,16 +778,6 @@ def generate_launchd_plist() -> str:
    <key>WorkingDirectory</key>
    <string>{working_dir}</string>
    
-    <key>EnvironmentVariables</key>
-    <dict>
-        <key>PATH</key>
-        <string>{sane_path}</string>
-        <key>VIRTUAL_ENV</key>
-        <string>{venv_dir}</string>
-        <key>HERMES_HOME</key>
-        <string>{hermes_home}</string>
-    </dict>
-    
    <key>RunAtLoad</key>
    <true/>
    
@@ -937,8 +850,7 @@ def launchd_install(force: bool = False):
    print()
    print("Next steps:")
    print("  hermes gateway status             # Check status")
-    from hermes_constants import display_hermes_home as _dhh
-    print(f"  tail -f {_dhh()}/logs/gateway.log  # View logs")
+    print("  tail -f ~/.hermes/logs/gateway.log  # View logs")

 def launchd_uninstall():
    plist_path = get_launchd_plist_path()
@@ -951,33 +863,20 @@ def launchd_uninstall():
    print("✓ Service uninstalled")

 def launchd_start():
-    plist_path = get_launchd_plist_path()
-    label = get_launchd_label()
-
-    # Self-heal if the plist is missing entirely (e.g., manual cleanup, failed upgrade)
-    if not plist_path.exists():
-        print("↻ launchd plist missing; regenerating service definition")
-        plist_path.parent.mkdir(parents=True, exist_ok=True)
-        plist_path.write_text(generate_launchd_plist(), encoding="utf-8")
-        subprocess.run(["launchctl", "load", str(plist_path)], check=True)
-        subprocess.run(["launchctl", "start", label], check=True)
-        print("✓ Service started")
-        return
-
    refresh_launchd_plist_if_needed()
+    plist_path = get_launchd_plist_path()
    try:
-        subprocess.run(["launchctl", "start", label], check=True)
+        subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
    except subprocess.CalledProcessError as e:
-        if e.returncode != 3:
+        if e.returncode != 3 or not plist_path.exists():
            raise
        print("↻ launchd job was unloaded; reloading service definition")
        subprocess.run(["launchctl", "load", str(plist_path)], check=True)
-        subprocess.run(["launchctl", "start", label], check=True)
+        subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
    print("✓ Service started")

 def launchd_stop():
-    label = get_launchd_label()
-    subprocess.run(["launchctl", "stop", label], check=True)
+    subprocess.run(["launchctl", "stop", "ai.hermes.gateway"], check=True)
    print("✓ Service stopped")
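
Reviewer note: the control flow that `launchd_start()` has after this change (try `start`, and on exit code 3 with an existing plist, `load` then `start` again) can be sketched with an injectable runner. This is an assumption-laden illustration: `start_with_reload`, the `run` callback, and the `plist_exists` flag are hypothetical names, and the meaning of exit code 3 is taken from the diff's own "job was unloaded" message rather than from launchctl documentation.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical runner type: executes a command and raises
# subprocess.CalledProcessError on a non-zero exit, like run(..., check=True).
Runner = Callable[[Sequence[str]], None]


def start_with_reload(label: str, plist_path: str, plist_exists: bool, run: Runner) -> None:
    """Start a launchd job, reloading its plist once if the job was unloaded."""
    try:
        run(["launchctl", "start", label])
    except subprocess.CalledProcessError as e:
        # Re-raise anything that is not the "unloaded" case, or when there is
        # no plist on disk to reload from.
        if e.returncode != 3 or not plist_exists:
            raise
        run(["launchctl", "load", plist_path])
        run(["launchctl", "start", label])
```

Note that with the self-heal branch removed, a missing plist now surfaces as an exception from the first `start` call instead of being regenerated.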

 def _wait_for_gateway_exit(timeout: float = 10.0, force_after: float = 5.0):
@@ -1032,9 +931,8 @@ def launchd_restart():

 def launchd_status(deep: bool = False):
    plist_path = get_launchd_plist_path()
-    label = get_launchd_label()
    result = subprocess.run(
-        ["launchctl", "list", label],
+        ["launchctl", "list", "ai.hermes.gateway"],
        capture_output=True,
        text=True
    )
@@ -1539,7 +1437,7 @@ def _is_service_running() -> bool:
        return False
    elif is_macos() and get_launchd_plist_path().exists():
        result = subprocess.run(
-            ["launchctl", "list", get_launchd_label()],
+            ["launchctl", "list", "ai.hermes.gateway"],
            capture_output=True, text=True
        )
        return result.returncode == 0
@@ -54,71 +54,6 @@ from typing import Optional
 PROJECT_ROOT = Path(__file__).parent.parent.resolve()
 sys.path.insert(0, str(PROJECT_ROOT))

-# ---------------------------------------------------------------------------
-# Profile override — MUST happen before any hermes module import.
-#
-# Many modules cache HERMES_HOME at import time (module-level constants).
-# We intercept --profile/-p from sys.argv here and set the env var so that
-# every subsequent ``os.getenv("HERMES_HOME", ...)`` resolves correctly.
-# The flag is stripped from sys.argv so argparse never sees it.
-# Falls back to ~/.hermes/active_profile for sticky default.
-# ---------------------------------------------------------------------------
-def _apply_profile_override() -> None:
-    """Pre-parse --profile/-p and set HERMES_HOME before module imports."""
-    argv = sys.argv[1:]
-    profile_name = None
-    consume = 0
-
-    # 1. Check for explicit -p / --profile flag
-    for i, arg in enumerate(argv):
-        if arg in ("--profile", "-p") and i + 1 < len(argv):
-            profile_name = argv[i + 1]
-            consume = 2
-            break
-        elif arg.startswith("--profile="):
-            profile_name = arg.split("=", 1)[1]
-            consume = 1
-            break
-
-    # 2. If no flag, check ~/.hermes/active_profile
-    if profile_name is None:
-        try:
-            active_path = Path.home() / ".hermes" / "active_profile"
-            if active_path.exists():
-                name = active_path.read_text().strip()
-                if name and name != "default":
-                    profile_name = name
-                    consume = 0  # don't strip anything from argv
-        except (UnicodeDecodeError, OSError):
-            pass  # corrupted file, skip
-
-    # 3. If we found a profile, resolve and set HERMES_HOME
-    if profile_name is not None:
-        try:
-            from hermes_cli.profiles import resolve_profile_env
-            hermes_home = resolve_profile_env(profile_name)
-        except (ValueError, FileNotFoundError) as exc:
-            print(f"Error: {exc}", file=sys.stderr)
-            sys.exit(1)
-        except Exception as exc:
-            # A bug in profiles.py must NEVER prevent hermes from starting
-            print(f"Warning: profile override failed ({exc}), using default", file=sys.stderr)
-            return
-        os.environ["HERMES_HOME"] = hermes_home
-        # Strip the flag from argv so argparse doesn't choke
-        if consume > 0:
-            for i, arg in enumerate(argv):
-                if arg in ("--profile", "-p"):
-                    start = i + 1  # +1 because argv is sys.argv[1:]
-                    sys.argv = sys.argv[:start] + sys.argv[start + consume:]
-                    break
-                elif arg.startswith("--profile="):
-                    start = i + 1
-                    sys.argv = sys.argv[:start] + sys.argv[start + 1:]
-                    break
-
-_apply_profile_override()
-
 # Load .env from ~/.hermes/.env first, then project root as dev fallback.
 # User-managed env files should override stale shell exports on restart.
 from hermes_cli.config import get_hermes_home
@@ -860,7 +795,6 @@ def cmd_model(args):
        "ai-gateway": "AI Gateway",
        "kilocode": "Kilo Code",
        "alibaba": "Alibaba Cloud (DashScope)",
-        "huggingface": "Hugging Face",
        "custom": "Custom endpoint",
    }
    active_label = provider_labels.get(active, active)
@@ -886,8 +820,7 @@ def cmd_model(args):
        ("opencode-zen", "OpenCode Zen (35+ curated models, pay-as-you-go)"),
        ("opencode-go", "OpenCode Go (open models, $10/month subscription)"),
        ("ai-gateway", "AI Gateway (Vercel — 200+ models, pay-per-use)"),
-        ("alibaba", "Alibaba Cloud / DashScope Coding (Qwen + multi-provider)"),
-        ("huggingface", "Hugging Face Inference Providers (20+ open models)"),
+        ("alibaba", "Alibaba Cloud / DashScope (Qwen models, Anthropic-compatible)"),
    ]

    # Add user-defined custom providers from config.yaml
@@ -897,8 +830,8 @@ def cmd_model(args):
        for entry in custom_providers_cfg:
            if not isinstance(entry, dict):
                continue
-            name = (entry.get("name") or "").strip()
-            base_url = (entry.get("base_url") or "").strip()
+            name = entry.get("name", "").strip()
+            base_url = entry.get("base_url", "").strip()
            if not name or not base_url:
                continue
            # Generate a stable key from the name
@@ -960,7 +893,7 @@ def cmd_model(args):
        _model_flow_anthropic(config, current_model)
    elif selected_provider == "kimi-coding":
        _model_flow_kimi(config, current_model)
-    elif selected_provider in ("zai", "minimax", "minimax-cn", "kilocode", "opencode-zen", "opencode-go", "ai-gateway", "alibaba", "huggingface"):
+    elif selected_provider in ("zai", "minimax", "minimax-cn", "kilocode", "opencode-zen", "opencode-go", "ai-gateway", "alibaba"):
        _model_flow_api_key_provider(config, selected_provider, current_model)


@@ -1045,7 +978,6 @@ def _model_flow_openrouter(config, current_model=""):
            cfg["model"] = model
        model["provider"] = "openrouter"
        model["base_url"] = OPENROUTER_BASE_URL
-        model["api_mode"] = "chat_completions"
        save_config(cfg)
        deactivate_provider()
        print(f"Default model set to: {selected} (via OpenRouter)")
@@ -1269,7 +1201,6 @@ def _model_flow_custom(config):
            cfg["model"] = model
        model["provider"] = "custom"
        model["base_url"] = effective_url
-        model["api_mode"] = "chat_completions"
        save_config(cfg)
        deactivate_provider()

@@ -1571,18 +1502,6 @@ _PROVIDER_MODELS = {
        "google/gemini-3-pro-preview",
        "google/gemini-3-flash-preview",
    ],
-    # Curated HF model list — only agentic models that map to OpenRouter defaults.
-    # Format: HF model ID → OpenRouter equivalent noted in comment
-    "huggingface": [
-        "Qwen/Qwen3.5-397B-A17B",                  # ↔ qwen/qwen3.5-plus
-        "Qwen/Qwen3.5-35B-A3B",                     # ↔ qwen/qwen3.5-35b-a3b
-        "deepseek-ai/DeepSeek-V3.2",                # ↔ deepseek/deepseek-chat
-        "moonshotai/Kimi-K2.5",                      # ↔ moonshotai/kimi-k2.5
-        "MiniMaxAI/MiniMax-M2.5",                    # ↔ minimax/minimax-m2.5
-        "zai-org/GLM-5",                             # ↔ z-ai/glm-5
-        "XiaomiMiMo/MiMo-V2-Flash",                 # ↔ xiaomi/mimo-v2-pro
-        "moonshotai/Kimi-K2-Thinking",               # ↔ moonshotai/kimi-k2-thinking
-    ],
 }


@@ -2051,7 +1970,6 @@ def _model_flow_kimi(config, current_model=""):
            cfg["model"] = model
        model["provider"] = provider_id
        model["base_url"] = effective_base
-        model["api_mode"] = "chat_completions"
        save_config(cfg)
        deactivate_provider()

@@ -2113,25 +2031,19 @@ def _model_flow_api_key_provider(config, provider_id, current_model=""):
        save_env_value(base_url_env, override)
        effective_base = override

-    # Model selection — try live /models endpoint first, fall back to defaults.
-    # Providers with large live catalogs (100+ models) use a curated list instead
-    # so users see familiar model names rather than an overwhelming dump.
-    curated = _PROVIDER_MODELS.get(provider_id, [])
-    if curated and len(curated) >= 8:
-        # Curated list is substantial — use it directly, skip live probe
-        live_models = None
-    else:
-        from hermes_cli.models import fetch_api_models
-        api_key_for_probe = existing_key or (get_env_value(key_env) if key_env else "")
-        live_models = fetch_api_models(api_key_for_probe, effective_base)
+    # Model selection — try live /models endpoint first, fall back to defaults
+    from hermes_cli.models import fetch_api_models
+    api_key_for_probe = existing_key or (get_env_value(key_env) if key_env else "")
+    live_models = fetch_api_models(api_key_for_probe, effective_base)

    if live_models:
        model_list = live_models
        print(f"  Found {len(model_list)} model(s) from {pconfig.name} API")
    else:
-        model_list = curated
+        model_list = _PROVIDER_MODELS.get(provider_id, [])
        if model_list:
-            print(f"  Showing {len(model_list)} curated models — use \"Enter custom model name\" for others.")
+            print("  ⚠ Could not auto-detect models from API — showing defaults.")
+            print("    Use \"Enter custom model name\" if you don't see your model.")
        # else: no defaults either, will fall through to raw input

    if model_list:
@@ -2158,7 +2070,6 @@ def _model_flow_api_key_provider(config, provider_id, current_model=""):
            cfg["model"] = model
        model["provider"] = provider_id
        model["base_url"] = effective_base
-        model["api_mode"] = "chat_completions"
        save_config(cfg)
        deactivate_provider()

@@ -2190,8 +2101,7 @@ def _run_anthropic_oauth_flow(save_env_value):
        ):
            use_anthropic_claude_code_credentials(save_fn=save_env_value)
            print("  ✓ Claude Code credentials linked.")
-            from hermes_constants import display_hermes_home as _dhh_fn
-            print(f"    Hermes will use Claude's credential store directly instead of copying a setup-token into {_dhh_fn()}/.env.")
+            print("    Hermes will use Claude's credential store directly instead of copying a setup-token into ~/.hermes/.env.")
            return True
        return False

@@ -2409,12 +2319,6 @@ def cmd_cron(args):
    cron_command(args)


-def cmd_webhook(args):
-    """Webhook subscription management."""
-    from hermes_cli.webhook import webhook_command
-    webhook_command(args)
-
-
 def cmd_doctor(args):
    """Check configuration and dependencies."""
    from hermes_cli.doctor import run_doctor
@@ -2546,18 +2450,8 @@ def _update_via_zip(args):
            )
    else:
        # Use sys.executable to explicitly call the venv's pip module,
-        # avoiding PEP 668 'externally-managed-environment' errors on Debian/Ubuntu.
-        # Some environments lose pip inside the venv; bootstrap it back with
-        # ensurepip before trying the editable install.
+        # avoiding PEP 668 'externally-managed-environment' errors on Debian/Ubuntu
        pip_cmd = [sys.executable, "-m", "pip"]
-        try:
-            subprocess.run(pip_cmd + ["--version"], cwd=PROJECT_ROOT, check=True, capture_output=True)
-        except subprocess.CalledProcessError:
-            subprocess.run(
-                [sys.executable, "-m", "ensurepip", "--upgrade", "--default-pip"],
-                cwd=PROJECT_ROOT,
-                check=True,
-            )
        try:
            subprocess.run(pip_cmd + ["install", "-e", ".[all]", "--quiet"], cwd=PROJECT_ROOT, check=True)
        except subprocess.CalledProcessError:
@@ -2718,12 +2612,7 @@ def _restore_stashed_changes(
            print("Resolve conflicts manually, then run: git stash drop")

        print(f"Restore your changes with: git stash apply {stash_ref}")
-        # In non-interactive mode (gateway /update), don't abort — the code
-        # update itself succeeded, only the stash restore had conflicts.
-        # Aborting would report the entire update as failed.
-        if prompt_user:
-            sys.exit(1)
-        return False
+        sys.exit(1)

    stash_selector = _resolve_stash_selector(git_cmd, cwd, stash_ref)
    if stash_selector is None:
@@ -2797,60 +2686,30 @@ def cmd_update(args):

    # Fetch and pull
    try:
+        print("→ Fetching updates...")
        git_cmd = ["git"]
        if sys.platform == "win32":
            git_cmd = ["git", "-c", "windows.appendAtomically=false"]
-
-        print("→ Fetching updates...")
-        fetch_result = subprocess.run(
-            git_cmd + ["fetch", "origin"],
-            cwd=PROJECT_ROOT,
-            capture_output=True,
-            text=True,
-        )
-        if fetch_result.returncode != 0:
-            stderr = fetch_result.stderr.strip()
-            if "Could not resolve host" in stderr or "unable to access" in stderr:
-                print("✗ Network error — cannot reach the remote repository.")
-                print(f"  {stderr.splitlines()[0]}" if stderr else "")
-            elif "Authentication failed" in stderr or "could not read Username" in stderr:
-                print("✗ Authentication failed — check your git credentials or SSH key.")
-            else:
-                print(f"✗ Failed to fetch updates from origin.")
-                if stderr:
-                    print(f"  {stderr.splitlines()[0]}")
-            sys.exit(1)
-
-        # Get current branch (returns literal "HEAD" when detached)
+
+        subprocess.run(git_cmd + ["fetch", "origin"], cwd=PROJECT_ROOT, check=True)
+
+        # Get current branch
        result = subprocess.run(
            git_cmd + ["rev-parse", "--abbrev-ref", "HEAD"],
            cwd=PROJECT_ROOT,
            capture_output=True,
            text=True,
-            check=True,
+            check=True
        )
-        current_branch = result.stdout.strip()
+        branch = result.stdout.strip()

-        # Always update against main
-        branch = "main"
-
-        # If user is on a non-main branch or detached HEAD, switch to main
-        if current_branch != "main":
-            label = "detached HEAD" if current_branch == "HEAD" else f"branch '{current_branch}'"
-            print(f"  ⚠ Currently on {label} — switching to main for update...")
-            # Stash before checkout so uncommitted work isn't lost
-            auto_stash_ref = _stash_local_changes_if_needed(git_cmd, PROJECT_ROOT)
-            subprocess.run(
-                git_cmd + ["checkout", "main"],
-                cwd=PROJECT_ROOT,
-                capture_output=True,
-                text=True,
-                check=True,
-            )
-        else:
-            auto_stash_ref = _stash_local_changes_if_needed(git_cmd, PROJECT_ROOT)
-
-        prompt_for_restore = auto_stash_ref is not None and sys.stdin.isatty() and sys.stdout.isatty()
+        # Fall back to main if the current branch doesn't exist on the remote
+        verify = subprocess.run(
+            git_cmd + ["rev-parse", "--verify", f"origin/{branch}"],
+            cwd=PROJECT_ROOT, capture_output=True, text=True,
+        )
+        if verify.returncode != 0:
+            branch = "main"

        # Check if there are updates
        result = subprocess.run(
@@ -2858,69 +2717,31 @@ def cmd_update(args):
            cwd=PROJECT_ROOT,
            capture_output=True,
            text=True,
-            check=True,
+            check=True
        )
        commit_count = int(result.stdout.strip())
-
+        
        if commit_count == 0:
            _invalidate_update_cache()
-            # Restore stash and switch back to original branch if we moved
-            if auto_stash_ref is not None:
-                _restore_stashed_changes(
-                    git_cmd, PROJECT_ROOT, auto_stash_ref,
-                    prompt_user=prompt_for_restore,
-                )
-            if current_branch not in ("main", "HEAD"):
-                subprocess.run(
-                    git_cmd + ["checkout", current_branch],
-                    cwd=PROJECT_ROOT, capture_output=True, text=True, check=False,
-                )
            print("✓ Already up to date!")
            return
-
+        
        print(f"→ Found {commit_count} new commit(s)")

+        auto_stash_ref = _stash_local_changes_if_needed(git_cmd, PROJECT_ROOT)
+        prompt_for_restore = auto_stash_ref is not None and sys.stdin.isatty() and sys.stdout.isatty()
+
        print("→ Pulling updates...")
-        update_succeeded = False
        try:
-            pull_result = subprocess.run(
-                git_cmd + ["pull", "--ff-only", "origin", branch],
-                cwd=PROJECT_ROOT,
-                capture_output=True,
-                text=True,
-            )
-            if pull_result.returncode != 0:
-                # ff-only failed — local and remote have diverged (e.g. upstream
-                # force-pushed or rebase).  Since local changes are already
-                # stashed, reset to match the remote exactly.
-                print("  ⚠ Fast-forward not possible (history diverged), resetting to match remote...")
-                reset_result = subprocess.run(
-                    git_cmd + ["reset", "--hard", f"origin/{branch}"],
-                    cwd=PROJECT_ROOT,
-                    capture_output=True,
-                    text=True,
-                )
-                if reset_result.returncode != 0:
-                    print(f"✗ Failed to reset to origin/{branch}.")
-                    if reset_result.stderr.strip():
-                        print(f"  {reset_result.stderr.strip()}")
-                    print("  Try manually: git fetch origin && git reset --hard origin/main")
-                    sys.exit(1)
-            update_succeeded = True
+            subprocess.run(git_cmd + ["pull", "--ff-only", "origin", branch], cwd=PROJECT_ROOT, check=True)
        finally:
            if auto_stash_ref is not None:
-                # Don't attempt stash restore if the code update itself failed —
-                # working tree is in an unknown state.
-                if not update_succeeded:
-                    print(f"  ℹ️  Local changes preserved in stash (ref: {auto_stash_ref})")
-                    print(f"  Restore manually with: git stash apply")
-                else:
-                    _restore_stashed_changes(
-                        git_cmd,
-                        PROJECT_ROOT,
-                        auto_stash_ref,
-                        prompt_user=prompt_for_restore,
-                    )
+                _restore_stashed_changes(
+                    git_cmd,
+                    PROJECT_ROOT,
+                    auto_stash_ref,
+                    prompt_user=prompt_for_restore,
+                )
        
        _invalidate_update_cache()
        
@@ -2943,18 +2764,8 @@ def cmd_update(args):
                )
        else:
            # Use sys.executable to explicitly call the venv's pip module,
-            # avoiding PEP 668 'externally-managed-environment' errors on Debian/Ubuntu.
-            # Some environments lose pip inside the venv; bootstrap it back with
-            # ensurepip before trying the editable install.
+            # avoiding PEP 668 'externally-managed-environment' errors on Debian/Ubuntu
            pip_cmd = [sys.executable, "-m", "pip"]
-            try:
-                subprocess.run(pip_cmd + ["--version"], cwd=PROJECT_ROOT, check=True, capture_output=True)
-            except subprocess.CalledProcessError:
-                subprocess.run(
-                    [sys.executable, "-m", "ensurepip", "--upgrade", "--default-pip"],
-                    cwd=PROJECT_ROOT,
-                    check=True,
-                )
            try:
                subprocess.run(pip_cmd + ["install", "-e", ".[all]", "--quiet"], cwd=PROJECT_ROOT, check=True)
            except subprocess.CalledProcessError:
@@ -2971,17 +2782,6 @@ def cmd_update(args):
        print()
        print("✓ Code updated!")
        
-        # After git pull, source files on disk are newer than cached Python
-        # modules in this process.  Reload hermes_constants so that any lazy
-        # import executed below (skills sync, gateway restart) sees new
-        # attributes like display_hermes_home() added since the last release.
-        try:
-            import importlib
-            import hermes_constants as _hc
-            importlib.reload(_hc)
-        except Exception:
-            pass  # non-fatal — worst case a lazy import fails gracefully
-        
        # Sync bundled skills (copies new, updates changed, respects user deletions)
        try:
            from tools.skills_sync import sync_skills
@@ -3000,35 +2800,7 @@ def cmd_update(args):
                print("  ✓ Skills are up to date")
        except Exception as e:
            logger.debug("Skills sync during update failed: %s", e)
-
-        # Sync bundled skills to all other profiles
-        try:
-            from hermes_cli.profiles import list_profiles, get_active_profile_name, seed_profile_skills
-            active = get_active_profile_name()
-            other_profiles = [p for p in list_profiles() if not p.is_default and p.name != active]
-            if other_profiles:
-                print()
-                print("→ Syncing bundled skills to other profiles...")
-                for p in other_profiles:
-                    try:
-                        r = seed_profile_skills(p.path, quiet=True)
-                        if r:
-                            copied = len(r.get("copied", []))
-                            updated = len(r.get("updated", []))
-                            modified = len(r.get("user_modified", []))
-                            parts = []
-                            if copied: parts.append(f"+{copied} new")
-                            if updated: parts.append(f"↑{updated} updated")
-                            if modified: parts.append(f"~{modified} user-modified")
-                            status = ", ".join(parts) if parts else "up to date"
-                        else:
-                            status = "sync failed"
-                        print(f"  {p.name}: {status}")
-                    except Exception as pe:
-                        print(f"  {p.name}: error ({pe})")
-        except Exception:
-            pass  # profiles module not available or no profiles
-
+        
        # Check for config migrations
        print()
        print("→ Checking configuration for new options...")
@@ -3052,15 +2824,10 @@ def cmd_update(args):
                print(f"  ℹ️  {len(missing_config)} new config option(s) available")
            
            print()
-            if not (sys.stdin.isatty() and sys.stdout.isatty()):
-                print("  ℹ Non-interactive session — skipping config migration prompt.")
-                print("    Run 'hermes config migrate' later to apply any new config/env options.")
-                response = "n"
+            if sys.stdin.isatty():
+                response = input("Would you like to configure them now? [Y/n]: ").strip().lower()
            else:
-                try:
-                    response = input("Would you like to configure them now? [Y/n]: ").strip().lower()
-                except EOFError:
-                    response = "n"
+                response = "n"
            
            if response in ('', 'y', 'yes'):
                print()
@@ -3108,11 +2875,10 @@ def cmd_update(args):
            # Check for macOS launchd service
            if is_macos():
                try:
-                    from hermes_cli.gateway import get_launchd_label
                    plist_path = get_launchd_plist_path()
                    if plist_path.exists():
                        check = subprocess.run(
-                            ["launchctl", "list", get_launchd_label()],
+                            ["launchctl", "list", "ai.hermes.gateway"],
                            capture_output=True, text=True, timeout=5,
                        )
                        has_launchd_service = check.returncode == 0
@@ -3168,13 +2934,12 @@ def cmd_update(args):
                    # after a manual SIGTERM, which would race with the
                    # PID file cleanup.
                    print("→ Restarting gateway service...")
-                    _launchd_label = get_launchd_label()
                    stop = subprocess.run(
-                        ["launchctl", "stop", _launchd_label],
+                        ["launchctl", "stop", "ai.hermes.gateway"],
                        capture_output=True, text=True, timeout=10,
                    )
                    start = subprocess.run(
-                        ["launchctl", "start", _launchd_label],
+                        ["launchctl", "start", "ai.hermes.gateway"],
                        capture_output=True, text=True, timeout=10,
                    )
                    if start.returncode == 0:
@@ -3226,7 +2991,6 @@ def _coalesce_session_name_args(argv: list) -> list:
        "chat", "model", "gateway", "setup", "whatsapp", "login", "logout",
        "status", "cron", "doctor", "config", "pairing", "skills", "tools",
        "mcp", "sessions", "insights", "version", "update", "uninstall",
-        "profile",
    }
    _SESSION_FLAGS = {"-c", "--continue", "-r", "--resume"}

@@ -3250,253 +3014,6 @@ def _coalesce_session_name_args(argv: list) -> list:
    return result


-def cmd_profile(args):
-    """Profile management — create, delete, list, switch, alias."""
-    from hermes_cli.profiles import (
-        list_profiles, create_profile, delete_profile, seed_profile_skills,
-        get_active_profile, set_active_profile, get_active_profile_name,
-        check_alias_collision, create_wrapper_script, remove_wrapper_script,
-        _is_wrapper_dir_in_path, _get_wrapper_dir,
-    )
-    from hermes_constants import display_hermes_home
-
-    action = getattr(args, "profile_action", None)
-
-    if action is None:
-        # Bare `hermes profile` — show current profile status
-        profile_name = get_active_profile_name()
-        dhh = display_hermes_home()
-        print(f"\nActive profile: {profile_name}")
-        print(f"Path:           {dhh}")
-
-        profiles = list_profiles()
-        for p in profiles:
-            if p.name == profile_name or (profile_name == "default" and p.is_default):
-                if p.model:
-                    print(f"Model:          {p.model}" + (f" ({p.provider})" if p.provider else ""))
-                print(f"Gateway:        {'running' if p.gateway_running else 'stopped'}")
-                print(f"Skills:         {p.skill_count} installed")
-                if p.alias_path:
-                    print(f"Alias:          {p.name} → hermes -p {p.name}")
-                break
-        print()
-        return
-
-    if action == "list":
-        profiles = list_profiles()
-        active = get_active_profile_name()
-
-        if not profiles:
-            print("No profiles found.")
-            return
-
-        # Header
-        print(f"\n {'Profile':<16} {'Model':<28} {'Gateway':<12} {'Alias'}")
-        print(f" {'─' * 15}    {'─' * 27}    {'─' * 11}    {'─' * 12}")
-
-        for p in profiles:
-            marker = " ◆" if (p.name == active or (active == "default" and p.is_default)) else "  "
-            name = p.name
-            model = (p.model or "—")[:26]
-            gw = "running" if p.gateway_running else "stopped"
-            alias = p.name if p.alias_path else "—"
-            if p.is_default:
-                alias = "—"
-            print(f"{marker}{name:<15} {model:<28} {gw:<12} {alias}")
-        print()
-
-    elif action == "use":
-        name = args.profile_name
-        try:
-            set_active_profile(name)
-            if name == "default":
-                print(f"Switched to: default (~/.hermes)")
-            else:
-                print(f"Switched to: {name}")
-        except (ValueError, FileNotFoundError) as e:
-            print(f"Error: {e}")
-            sys.exit(1)
-
-    elif action == "create":
-        name = args.profile_name
-        clone = getattr(args, "clone", False)
-        clone_all = getattr(args, "clone_all", False)
-        no_alias = getattr(args, "no_alias", False)
-
-        try:
-            clone_from = getattr(args, "clone_from", None)
-
-            profile_dir = create_profile(
-                name=name,
-                clone_from=clone_from,
-                clone_all=clone_all,
-                clone_config=clone,
-                no_alias=no_alias,
-            )
-            print(f"\nProfile '{name}' created at {profile_dir}")
-
-            if clone or clone_all:
-                source_label = getattr(args, "clone_from", None) or get_active_profile_name()
-                if clone_all:
-                    print(f"Full copy from {source_label}.")
-                else:
-                    print(f"Cloned config, .env, SOUL.md from {source_label}.")
-
-            # Seed bundled skills (skip if --clone-all already copied them)
-            if not clone_all:
-                result = seed_profile_skills(profile_dir)
-                if result:
-                    copied = len(result.get("copied", []))
-                    print(f"{copied} bundled skills synced.")
-                else:
-                    print("⚠ Skills could not be seeded. Run `{} update` to retry.".format(name))
-
-            # Create wrapper alias
-            if not no_alias:
-                collision = check_alias_collision(name)
-                if collision:
-                    print(f"\n⚠ Cannot create alias '{name}' — {collision}")
-                    print(f"  Choose a custom alias:  hermes profile alias {name} --name <custom>")
-                    print(f"  Or access via flag:     hermes -p {name} chat")
-                else:
-                    wrapper_path = create_wrapper_script(name)
-                    if wrapper_path:
-                        print(f"Wrapper created: {wrapper_path}")
-                        if not _is_wrapper_dir_in_path():
-                            print(f"\n⚠ {_get_wrapper_dir()} is not in your PATH.")
-                            print(f'  Add to your shell config (~/.bashrc or ~/.zshrc):')
-                            print(f'    export PATH="$HOME/.local/bin:$PATH"')
-
-            # Next steps
-            print(f"\nNext steps:")
-            print(f"  {name} setup              Configure API keys and model")
-            print(f"  {name} chat               Start chatting")
-            print(f"  {name} gateway start      Start the messaging gateway")
-            if clone or clone_all:
-                from hermes_constants import get_hermes_home
-                profile_dir_display = f"~/.hermes/profiles/{name}"
-                print(f"\n  Edit {profile_dir_display}/.env for different API keys")
-                print(f"  Edit {profile_dir_display}/SOUL.md for different personality")
-            print()
-
-        except (ValueError, FileExistsError, FileNotFoundError) as e:
-            print(f"Error: {e}")
-            sys.exit(1)
-
-    elif action == "delete":
-        name = args.profile_name
-        yes = getattr(args, "yes", False)
-        try:
-            delete_profile(name, yes=yes)
-        except (ValueError, FileNotFoundError) as e:
-            print(f"Error: {e}")
-            sys.exit(1)
-
-    elif action == "show":
-        name = args.profile_name
-        from hermes_cli.profiles import get_profile_dir, profile_exists, _read_config_model, _check_gateway_running, _count_skills
-        if not profile_exists(name):
-            print(f"Error: Profile '{name}' does not exist.")
-            sys.exit(1)
-        profile_dir = get_profile_dir(name)
-        model, provider = _read_config_model(profile_dir)
-        gw = _check_gateway_running(profile_dir)
-        skills = _count_skills(profile_dir)
-        wrapper = _get_wrapper_dir() / name
-
-        print(f"\nProfile: {name}")
-        print(f"Path:    {profile_dir}")
-        if model:
-            print(f"Model:   {model}" + (f" ({provider})" if provider else ""))
-        print(f"Gateway: {'running' if gw else 'stopped'}")
-        print(f"Skills:  {skills}")
-        print(f".env:    {'exists' if (profile_dir / '.env').exists() else 'not configured'}")
-        print(f"SOUL.md: {'exists' if (profile_dir / 'SOUL.md').exists() else 'not configured'}")
-        if wrapper.exists():
-            print(f"Alias:   {wrapper}")
-        print()
-
-    elif action == "alias":
-        name = args.profile_name
-        remove = getattr(args, "remove", False)
-        custom_name = getattr(args, "alias_name", None)
-
-        from hermes_cli.profiles import profile_exists
-        if not profile_exists(name):
-            print(f"Error: Profile '{name}' does not exist.")
-            sys.exit(1)
-
-        alias_name = custom_name or name
-
-        if remove:
-            if remove_wrapper_script(alias_name):
-                print(f"✓ Removed alias '{alias_name}'")
-            else:
-                print(f"No alias '{alias_name}' found to remove.")
-        else:
-            collision = check_alias_collision(alias_name)
-            if collision:
-                print(f"Error: {collision}")
-                sys.exit(1)
-            wrapper_path = create_wrapper_script(alias_name)
-            if wrapper_path:
-                # If custom name, write the profile name into the wrapper
-                if custom_name:
-                    wrapper_path.write_text(f'#!/bin/sh\nexec hermes -p {name} "$@"\n')
-                print(f"✓ Alias created: {wrapper_path}")
-                if not _is_wrapper_dir_in_path():
-                    print(f"⚠ {_get_wrapper_dir()} is not in your PATH.")
-
-    elif action == "rename":
-        from hermes_cli.profiles import rename_profile
-        try:
-            new_dir = rename_profile(args.old_name, args.new_name)
-            print(f"\nProfile renamed: {args.old_name} → {args.new_name}")
-            print(f"Path: {new_dir}\n")
-        except (ValueError, FileExistsError, FileNotFoundError) as e:
-            print(f"Error: {e}")
-            sys.exit(1)
-
-    elif action == "export":
-        from hermes_cli.profiles import export_profile
-        name = args.profile_name
-        output = args.output or f"{name}.tar.gz"
-        try:
-            result_path = export_profile(name, output)
-            print(f"✓ Exported '{name}' to {result_path}")
-        except (ValueError, FileNotFoundError) as e:
-            print(f"Error: {e}")
-            sys.exit(1)
-
-    elif action == "import":
-        from hermes_cli.profiles import import_profile
-        try:
-            profile_dir = import_profile(args.archive, name=getattr(args, "import_name", None))
-            name = profile_dir.name
-            print(f"✓ Imported profile '{name}' at {profile_dir}")
-
-            # Offer to create alias
-            collision = check_alias_collision(name)
-            if not collision:
-                wrapper_path = create_wrapper_script(name)
-                if wrapper_path:
-                    print(f"  Wrapper created: {wrapper_path}")
-            print()
-        except (ValueError, FileExistsError, FileNotFoundError) as e:
-            print(f"Error: {e}")
-            sys.exit(1)
-
-
-def cmd_completion(args):
-    """Print shell completion script."""
-    from hermes_cli.profiles import generate_bash_completion, generate_zsh_completion
-    shell = getattr(args, "shell", "bash")
-    if shell == "zsh":
-        print(generate_zsh_completion())
-    else:
-        print(generate_bash_completion())
-
-
 def main():
    """Main entry point for hermes CLI."""
    parser = argparse.ArgumentParser(
@@ -3605,7 +3122,7 @@ For more help on a command:
    )
    chat_parser.add_argument(
        "--provider",
-        choices=["auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot", "anthropic", "huggingface", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode"],
+        choices=["auto", "openrouter", "nous", "openai-codex", "copilot-acp", "copilot", "anthropic", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode"],
        default=None,
        help="Inference provider (default: auto)"
    )
@@ -3906,38 +3423,7 @@ For more help on a command:
    cron_subparsers.add_parser("tick", help="Run due jobs once and exit")

    cron_parser.set_defaults(func=cmd_cron)
-
-    # =========================================================================
-    # webhook command
-    # =========================================================================
-    webhook_parser = subparsers.add_parser(
-        "webhook",
-        help="Manage dynamic webhook subscriptions",
-        description="Create, list, and remove webhook subscriptions for event-driven agent activation",
-    )
-    webhook_subparsers = webhook_parser.add_subparsers(dest="webhook_action")
-
-    wh_sub = webhook_subparsers.add_parser("subscribe", aliases=["add"], help="Create a webhook subscription")
-    wh_sub.add_argument("name", help="Route name (used in URL: /webhooks/<name>)")
-    wh_sub.add_argument("--prompt", default="", help="Prompt template with {dot.notation} payload refs")
-    wh_sub.add_argument("--events", default="", help="Comma-separated event types to accept")
-    wh_sub.add_argument("--description", default="", help="What this subscription does")
-    wh_sub.add_argument("--skills", default="", help="Comma-separated skill names to load")
-    wh_sub.add_argument("--deliver", default="log", help="Delivery target: log, telegram, discord, slack, etc.")
-    wh_sub.add_argument("--deliver-chat-id", default="", help="Target chat ID for cross-platform delivery")
-    wh_sub.add_argument("--secret", default="", help="HMAC secret (auto-generated if omitted)")
-
-    webhook_subparsers.add_parser("list", aliases=["ls"], help="List all dynamic subscriptions")
-
-    wh_rm = webhook_subparsers.add_parser("remove", aliases=["rm"], help="Remove a subscription")
-    wh_rm.add_argument("name", help="Subscription name to remove")
-
-    wh_test = webhook_subparsers.add_parser("test", help="Send a test POST to a webhook route")
-    wh_test.add_argument("name", help="Subscription name to test")
-    wh_test.add_argument("--payload", default="", help="JSON payload to send (default: test payload)")
-
-    webhook_parser.set_defaults(func=cmd_webhook)
-
+    
    # =========================================================================
    # doctor command
    # =========================================================================
@@ -4070,7 +3556,7 @@ For more help on a command:
    skills_snapshot = skills_subparsers.add_parser("snapshot", help="Export/import skill configurations")
    snapshot_subparsers = skills_snapshot.add_subparsers(dest="snapshot_action")
    snap_export = snapshot_subparsers.add_parser("export", help="Export installed skills to a file")
-    snap_export.add_argument("output", help="Output JSON file path (use - for stdout)")
+    snap_export.add_argument("output", help="Output JSON file path")
    snap_import = snapshot_subparsers.add_parser("import", help="Import and install skills from a file")
    snap_import.add_argument("input", help="Input JSON file path")
    snap_import.add_argument("--force", action="store_true", help="Force install despite caution verdict")
@@ -4131,46 +3617,12 @@ For more help on a command:

    plugins_subparsers.add_parser("list", aliases=["ls"], help="List installed plugins")

-    plugins_enable = plugins_subparsers.add_parser(
-        "enable", help="Enable a disabled plugin"
-    )
-    plugins_enable.add_argument("name", help="Plugin name to enable")
-
-    plugins_disable = plugins_subparsers.add_parser(
-        "disable", help="Disable a plugin without removing it"
-    )
-    plugins_disable.add_argument("name", help="Plugin name to disable")
-
    def cmd_plugins(args):
        from hermes_cli.plugins_cmd import plugins_command
        plugins_command(args)

    plugins_parser.set_defaults(func=cmd_plugins)

-    # =========================================================================
-    # memory command
-    # =========================================================================
-    memory_parser = subparsers.add_parser(
-        "memory",
-        help="Manage memory provider plugins",
-        description=(
-            "Configure which memory provider plugin is active.\n\n"
-            "Memory providers give the agent persistent recall across sessions.\n"
-            "Built-in memory (MEMORY.md / USER.md) is always active.\n"
-            "One external provider can be active at a time."
-        ),
-        formatter_class=__import__("argparse").RawDescriptionHelpFormatter,
-    )
-    memory_subparsers = memory_parser.add_subparsers(dest="memory_command")
-    memory_subparsers.add_parser("setup", help="Interactive setup wizard")
-    memory_subparsers.add_parser("status", help="Show current provider and config")
-
-    def cmd_memory(args):
-        from hermes_cli.memory_setup import memory_command
-        memory_command(args)
-
-    memory_parser.set_defaults(func=cmd_memory)
-
    # =========================================================================
    # honcho command
    # =========================================================================
@@ -4381,7 +3833,7 @@ For more help on a command:
    sessions_list.add_argument("--limit", type=int, default=20, help="Max sessions to show")

    sessions_export = sessions_subparsers.add_parser("export", help="Export sessions to a JSONL file")
-    sessions_export.add_argument("output", help="Output JSONL file path (use - for stdout)")
+    sessions_export.add_argument("output", help="Output JSONL file path")
    sessions_export.add_argument("--source", help="Filter by source")
    sessions_export.add_argument("--session-id", help="Export a specific session")

@@ -4462,25 +3914,15 @@ For more help on a command:
                if not data:
                    print(f"Session '{args.session_id}' not found.")
                    return
-                line = _json.dumps(data, ensure_ascii=False) + "\n"
-                if args.output == "-":
-                    import sys
-                    sys.stdout.write(line)
-                else:
-                    with open(args.output, "w", encoding="utf-8") as f:
-                        f.write(line)
-                    print(f"Exported 1 session to {args.output}")
+                with open(args.output, "w", encoding="utf-8") as f:
+                    f.write(_json.dumps(data, ensure_ascii=False) + "\n")
+                print(f"Exported 1 session to {args.output}")
            else:
                sessions = db.export_all(source=args.source)
-                if args.output == "-":
-                    import sys
+                with open(args.output, "w", encoding="utf-8") as f:
                    for s in sessions:
-                        sys.stdout.write(_json.dumps(s, ensure_ascii=False) + "\n")
-                else:
-                    with open(args.output, "w", encoding="utf-8") as f:
-                        for s in sessions:
-                            f.write(_json.dumps(s, ensure_ascii=False) + "\n")
-                    print(f"Exported {len(sessions)} sessions to {args.output}")
+                        f.write(_json.dumps(s, ensure_ascii=False) + "\n")
+                print(f"Exported {len(sessions)} sessions to {args.output}")

        elif action == "delete":
            resolved_session_id = db.resolve_session_id(args.session_id)
@@ -4718,75 +4160,7 @@ For more help on a command:
            sys.exit(1)

    acp_parser.set_defaults(func=cmd_acp)
-
-    # =========================================================================
-    # profile command
-    # =========================================================================
-    profile_parser = subparsers.add_parser(
-        "profile",
-        help="Manage profiles — multiple isolated Hermes instances",
-    )
-    profile_subparsers = profile_parser.add_subparsers(dest="profile_action")
-
-    profile_list = profile_subparsers.add_parser("list", help="List all profiles")
-    profile_use = profile_subparsers.add_parser("use", help="Set sticky default profile")
-    profile_use.add_argument("profile_name", help="Profile name (or 'default')")
-
-    profile_create = profile_subparsers.add_parser("create", help="Create a new profile")
-    profile_create.add_argument("profile_name", help="Profile name (lowercase, alphanumeric)")
-    profile_create.add_argument("--clone", action="store_true",
-                                help="Copy config.yaml, .env, SOUL.md from active profile")
-    profile_create.add_argument("--clone-all", action="store_true",
-                                help="Full copy of active profile (all state)")
-    profile_create.add_argument("--clone-from", metavar="SOURCE",
-                                help="Source profile to clone from (default: active)")
-    profile_create.add_argument("--no-alias", action="store_true",
-                                help="Skip wrapper script creation")
-
-    profile_delete = profile_subparsers.add_parser("delete", help="Delete a profile")
-    profile_delete.add_argument("profile_name", help="Profile to delete")
-    profile_delete.add_argument("-y", "--yes", action="store_true",
-                                help="Skip confirmation prompt")
-
-    profile_show = profile_subparsers.add_parser("show", help="Show profile details")
-    profile_show.add_argument("profile_name", help="Profile to show")
-
-    profile_alias = profile_subparsers.add_parser("alias", help="Manage wrapper scripts")
-    profile_alias.add_argument("profile_name", help="Profile name")
-    profile_alias.add_argument("--remove", action="store_true",
-                               help="Remove the wrapper script")
-    profile_alias.add_argument("--name", dest="alias_name", metavar="NAME",
-                               help="Custom alias name (default: profile name)")
-
-    profile_rename = profile_subparsers.add_parser("rename", help="Rename a profile")
-    profile_rename.add_argument("old_name", help="Current profile name")
-    profile_rename.add_argument("new_name", help="New profile name")
-
-    profile_export = profile_subparsers.add_parser("export", help="Export a profile to archive")
-    profile_export.add_argument("profile_name", help="Profile to export")
-    profile_export.add_argument("-o", "--output", default=None,
-                                help="Output file (default: <name>.tar.gz)")
-
-    profile_import = profile_subparsers.add_parser("import", help="Import a profile from archive")
-    profile_import.add_argument("archive", help="Path to .tar.gz archive")
-    profile_import.add_argument("--name", dest="import_name", metavar="NAME",
-                                help="Profile name (default: inferred from archive)")
-
-    profile_parser.set_defaults(func=cmd_profile)
-
-    # =========================================================================
-    # completion command
-    # =========================================================================
-    completion_parser = subparsers.add_parser(
-        "completion",
-        help="Print shell completion script (bash or zsh)",
-    )
-    completion_parser.add_argument(
-        "shell", nargs="?", default="bash", choices=["bash", "zsh"],
-        help="Shell type (default: bash)",
-    )
-    completion_parser.set_defaults(func=cmd_completion)
-
+    
    # =========================================================================
    # Parse and execute
    # =========================================================================
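
Review note: the `sessions export` hunks above drop the "use - for stdout" convention, so output is always written to a file path. As a point of comparison, the removed behavior can be sketched as below. This is an illustrative sketch only; `write_jsonl` is a hypothetical helper name and does not exist in the repository.

```python
import json
import sys
from typing import Iterable, Optional, TextIO


def write_jsonl(records: Iterable[dict], output: str,
                stdout: Optional[TextIO] = None) -> int:
    """Write one JSON object per line; return the number of records written.

    When `output` is "-", records go to `stdout` (defaults to sys.stdout)
    and no file is created. Otherwise `output` is treated as a file path.
    """
    count = 0
    if output == "-":
        stream = stdout if stdout is not None else sys.stdout
        for rec in records:
            stream.write(json.dumps(rec, ensure_ascii=False) + "\n")
            count += 1
    else:
        with open(output, "w", encoding="utf-8") as f:
            for rec in records:
                f.write(json.dumps(rec, ensure_ascii=False) + "\n")
                count += 1
    return count
```

Keeping the status message ("Exported N sessions to ...") out of the stdout branch, as the removed code did, matters because anything else printed there would corrupt the JSONL stream for a downstream consumer.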
@@ -24,7 +24,6 @@ from hermes_cli.config import (
    get_hermes_home,  # noqa: F401 — used by test mocks
 )
 from hermes_cli.colors import Colors, color
-from hermes_constants import display_hermes_home

 logger = logging.getLogger(__name__)

@@ -245,7 +244,7 @@ def cmd_mcp_add(args):
                    api_key = _prompt("API key / Bearer token", password=True)
                    if api_key:
                        save_env_value(env_key, api_key)
-                        _success(f"Saved to {display_hermes_home()}/.env as {env_key}")
+                        _success(f"Saved to ~/.hermes/.env as {env_key}")

                # Set header with env var interpolation
                if api_key or existing_key:
@@ -333,7 +332,7 @@ def cmd_mcp_add(args):
    _save_mcp_server(name, server_config)

    print()
-    _success(f"Saved '{name}' to {display_hermes_home()}/config.yaml ({tool_count}/{total} tools enabled)")
+    _success(f"Saved '{name}' to ~/.hermes/config.yaml ({tool_count}/{total} tools enabled)")
    _info("Start a new session to use these tools.")


@@ -1,357 +0,0 @@
-"""hermes memory setup|status — configure memory provider plugins.
-
-Auto-detects installed memory providers via the plugin system.
-Interactive curses-based UI for provider selection, then walks through
-the provider's config schema. Writes config to config.yaml + .env.
-"""
-
-from __future__ import annotations
-
-import getpass
-import os
-import sys
-from pathlib import Path
-
-
-# ---------------------------------------------------------------------------
-# Curses-based interactive picker (same pattern as hermes tools)
-# ---------------------------------------------------------------------------
-
-def _curses_select(title: str, items: list[tuple[str, str]], default: int = 0) -> int:
-    """Interactive single-select with arrow keys.
-
-    items: list of (label, description) tuples.
-    Returns selected index, or default on escape/quit.
-    """
-    try:
-        import curses
-        result = [default]
-
-        def _menu(stdscr):
-            curses.curs_set(0)
-            if curses.has_colors():
-                curses.start_color()
-                curses.use_default_colors()
-                curses.init_pair(1, curses.COLOR_GREEN, -1)
-                curses.init_pair(2, curses.COLOR_YELLOW, -1)
-                curses.init_pair(3, curses.COLOR_CYAN, -1)
-            cursor = default
-
-            while True:
-                stdscr.clear()
-                max_y, max_x = stdscr.getmaxyx()
-
-                # Title
-                try:
-                    stdscr.addnstr(0, 0, title, max_x - 1,
-                                   curses.A_BOLD | (curses.color_pair(2) if curses.has_colors() else 0))
-                    stdscr.addnstr(1, 0, "  ↑↓ navigate  ⏎ select  q quit", max_x - 1,
-                                   curses.color_pair(3) if curses.has_colors() else curses.A_DIM)
-                except curses.error:
-                    pass
-
-                for i, (label, desc) in enumerate(items):
-                    y = i + 3
-                    if y >= max_y - 1:
-                        break
-                    arrow = "→" if i == cursor else " "
-                    line = f" {arrow}  {label}"
-                    if desc:
-                        line += f"  {desc}"
-
-                    attr = curses.A_NORMAL
-                    if i == cursor:
-                        attr = curses.A_BOLD
-                        if curses.has_colors():
-                            attr |= curses.color_pair(1)
-                    try:
-                        stdscr.addnstr(y, 0, line[:max_x - 1], max_x - 1, attr)
-                    except curses.error:
-                        pass
-
-                stdscr.refresh()
-                key = stdscr.getch()
-
-                if key in (curses.KEY_UP, ord('k')):
-                    cursor = (cursor - 1) % len(items)
-                elif key in (curses.KEY_DOWN, ord('j')):
-                    cursor = (cursor + 1) % len(items)
-                elif key in (curses.KEY_ENTER, 10, 13):
-                    result[0] = cursor
-                    return
-                elif key in (27, ord('q')):
-                    return
-
-        curses.wrapper(_menu)
-        return result[0]
-
-    except Exception:
-        # Fallback: numbered input
-        print(f"\n  {title}\n")
-        for i, (label, desc) in enumerate(items):
-            marker = "→" if i == default else " "
-            d = f"  {desc}" if desc else ""
-            print(f"  {marker} {i + 1}. {label}{d}")
-        while True:
-            try:
-                val = input(f"\n  Select [1-{len(items)}] ({default + 1}): ")
-                if not val:
-                    return default
-                idx = int(val) - 1
-                if 0 <= idx < len(items):
-                    return idx
-            except (ValueError, EOFError):
-                return default
-
-
-def _prompt(label: str, default: str | None = None, secret: bool = False) -> str:
-    """Prompt for a value with optional default and secret masking."""
-    suffix = f" [{default}]" if default else ""
-    if secret:
-        sys.stdout.write(f"  {label}{suffix}: ")
-        sys.stdout.flush()
-        if sys.stdin.isatty():
-            val = getpass.getpass(prompt="")
-        else:
-            val = sys.stdin.readline().strip()
-    else:
-        sys.stdout.write(f"  {label}{suffix}: ")
-        sys.stdout.flush()
-        val = sys.stdin.readline().strip()
-    return val or (default or "")
-
-
-# ---------------------------------------------------------------------------
-# Provider discovery
-# ---------------------------------------------------------------------------
-
-def _get_available_providers() -> list:
-    """Discover memory providers from installed plugins.
-
-    Returns list of (name, description, provider_instance) tuples.
-    """
-    try:
-        from hermes_cli.plugins import get_plugin_memory_providers
-        providers = get_plugin_memory_providers()
-    except Exception:
-        providers = []
-
-    results = []
-    for p in providers:
-        name = getattr(p, "name", "unknown")
-        schema = p.get_config_schema() if hasattr(p, "get_config_schema") else []
-        has_secrets = any(f.get("secret") for f in schema)
-        if has_secrets:
-            desc = "requires API key"
-        elif not schema:
-            desc = "no setup needed"
-        else:
-            desc = "local"
-        results.append((name, desc, p))
-    return results
-
-
-# ---------------------------------------------------------------------------
-# Setup wizard
-# ---------------------------------------------------------------------------
-
-def cmd_setup(args) -> None:
-    """Interactive memory provider setup wizard."""
-    from hermes_cli.config import load_config, save_config
-
-    providers = _get_available_providers()
-
-    if not providers:
-        print("\n  No memory provider plugins detected.")
-        print("  Install a plugin to ~/.hermes/plugins/ and try again.\n")
-        return
-
-    # Build picker items
-    items = []
-    for name, desc, _ in providers:
-        items.append((name, f"— {desc}"))
-    items.append(("Built-in only", "— MEMORY.md / USER.md (default)"))
-
-    builtin_idx = len(items) - 1
-    selected = _curses_select("Memory provider setup", items, default=builtin_idx)
-
-    config = load_config()
-    if not isinstance(config.get("memory"), dict):
-        config["memory"] = {}
-
-    # Built-in only
-    if selected >= len(providers) or selected < 0:
-        config["memory"]["provider"] = ""
-        save_config(config)
-        print("\n  ✓ Memory provider: built-in only")
-        print("  Saved to config.yaml\n")
-        return
-
-    name, _, provider = providers[selected]
-    schema = provider.get_config_schema() if hasattr(provider, "get_config_schema") else []
-
-    # Provider config section
-    provider_config = config["memory"].get(name, {})
-    if not isinstance(provider_config, dict):
-        provider_config = {}
-
-    env_path = Path(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes"))) / ".env"
-    env_writes = {}
-
-    if schema:
-        print(f"\n  Configuring {name}:\n")
-
-        for field in schema:
-            key = field["key"]
-            desc = field.get("description", key)
-            default = field.get("default")
-            is_secret = field.get("secret", False)
-            choices = field.get("choices")
-            env_var = field.get("env_var")
-            url = field.get("url")
-
-            if choices and not is_secret:
-                # Use curses picker for choice fields
-                choice_items = [(c, "") for c in choices]
-                current = provider_config.get(key, default)
-                current_idx = 0
-                if current and current in choices:
-                    current_idx = choices.index(current)
-                sel = _curses_select(f"  {desc}", choice_items, default=current_idx)
-                provider_config[key] = choices[sel]
-            elif is_secret:
-                # Prompt for secret
-                existing = os.environ.get(env_var, "") if env_var else ""
-                if existing:
-                    masked = f"...{existing[-4:]}" if len(existing) > 4 else "set"
-                    val = _prompt(f"{desc} (current: {masked}, blank to keep)", secret=True)
-                else:
-                    hint = f"  Get yours at {url}" if url else ""
-                    if hint:
-                        print(hint)
-                    val = _prompt(desc, secret=True)
-                if val and env_var:
-                    env_writes[env_var] = val
-            else:
-                # Regular text prompt
-                current = provider_config.get(key)
-                effective_default = current or default
-                val = _prompt(desc, default=str(effective_default) if effective_default else None)
-                if val:
-                    provider_config[key] = val
-
-    # Write config
-    config["memory"]["provider"] = name
-    config["memory"][name] = provider_config
-    save_config(config)
-
-    # Write secrets to .env
-    if env_writes:
-        _write_env_vars(env_path, env_writes)
-
-    print(f"\n  ✓ Memory provider: {name}")
-    print(f"  ✓ Config saved to config.yaml")
-    if env_writes:
-        print(f"  ✓ API keys saved to .env")
-    print(f"\n  Start a new session to activate.\n")
-
-
-def _write_env_vars(env_path: Path, env_writes: dict) -> None:
-    """Append or update env vars in .env file."""
-    env_path.parent.mkdir(parents=True, exist_ok=True)
-
-    existing_lines = []
-    if env_path.exists():
-        existing_lines = env_path.read_text().splitlines()
-
-    updated_keys = set()
-    new_lines = []
-    for line in existing_lines:
-        key_match = line.split("=", 1)[0].strip() if "=" in line else ""
-        if key_match in env_writes:
-            new_lines.append(f"{key_match}={env_writes[key_match]}")
-            updated_keys.add(key_match)
-        else:
-            new_lines.append(line)
-
-    for key, val in env_writes.items():
-        if key not in updated_keys:
-            new_lines.append(f"{key}={val}")
-
-    env_path.write_text("\n".join(new_lines) + "\n")
-
-
-# ---------------------------------------------------------------------------
-# Status
-# ---------------------------------------------------------------------------
-
-def cmd_status(args) -> None:
-    """Show current memory provider config."""
-    from hermes_cli.config import load_config
-
-    config = load_config()
-    mem_config = config.get("memory", {})
-    provider_name = mem_config.get("provider", "")
-
-    print(f"\nMemory status\n" + "─" * 40)
-    print(f"  Built-in:  always active")
-    print(f"  Provider:  {provider_name or '(none — built-in only)'}")
-
-    if provider_name:
-        provider_config = mem_config.get(provider_name, {})
-        if provider_config:
-            print(f"\n  {provider_name} config:")
-            for key, val in provider_config.items():
-                print(f"    {key}: {val}")
-
-        providers = _get_available_providers()
-        found = any(name == provider_name for name, _, _ in providers)
-        if found:
-            print(f"\n  Plugin:    installed ✓")
-            for pname, _, p in providers:
-                if pname == provider_name:
-                    if p.is_available():
-                        print(f"  Status:    available ✓")
-                    else:
-                        print(f"  Status:    not available ✗")
-                        schema = p.get_config_schema() if hasattr(p, "get_config_schema") else []
-                        secrets = [f for f in schema if f.get("secret")]
-                        if secrets:
-                            print(f"  Missing:")
-                            for s in secrets:
-                                env_var = s.get("env_var", "")
-                                url = s.get("url", "")
-                                is_set = bool(os.environ.get(env_var))
-                                mark = "✓" if is_set else "✗"
-                                line = f"    {mark} {env_var}"
-                                if url and not is_set:
-                                    line += f"  → {url}"
-                                print(line)
-                    break
-        else:
-            print(f"\n  Plugin:    NOT installed ✗")
-            print(f"  Install the '{provider_name}' memory plugin to ~/.hermes/plugins/")
-
-    providers = _get_available_providers()
-    if providers:
-        print(f"\n  Installed plugins:")
-        for pname, desc, _ in providers:
-            active = " ← active" if pname == provider_name else ""
-            print(f"    • {pname}  ({desc}){active}")
-
-    print()
-
-
-# ---------------------------------------------------------------------------
-# Router
-# ---------------------------------------------------------------------------
-
-def memory_command(args) -> None:
-    """Route memory subcommands."""
-    sub = getattr(args, "memory_command", None)
-    if sub == "setup":
-        cmd_setup(args)
-    elif sub == "status":
-        cmd_status(args)
-    else:
-        cmd_status(args)
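
Review note: the removed `_write_env_vars` helper above updates matching `KEY=value` lines in place and appends any keys it did not find. The same logic, restated as a pure function over lines so it can be checked without touching the filesystem, might look like the sketch below. `upsert_env_lines` is a hypothetical name introduced here for illustration; it is not part of the codebase.

```python
from __future__ import annotations


def upsert_env_lines(existing_lines: list[str], updates: dict[str, str]) -> list[str]:
    """Return a new list of .env lines with `updates` applied.

    Lines whose key (text before the first "=") matches an update are
    rewritten in place; keys not seen are appended at the end. Comments
    and blank lines pass through unchanged.
    """
    updated_keys: set[str] = set()
    new_lines: list[str] = []
    for line in existing_lines:
        key = line.split("=", 1)[0].strip() if "=" in line else ""
        if key in updates:
            new_lines.append(f"{key}={updates[key]}")
            updated_keys.add(key)
        else:
            new_lines.append(line)

    # Append keys that did not already exist, preserving dict order.
    for key, val in updates.items():
        if key not in updated_keys:
            new_lines.append(f"{key}={val}")
    return new_lines
```

Note that a commented-out entry such as `# API_KEY=old` is left alone, because its parsed key is `# API_KEY` rather than `API_KEY`.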
@@ -208,31 +208,14 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
        "google/gemini-3-pro-preview",
        "google/gemini-3-flash-preview",
    ],
-    # Alibaba DashScope Coding platform (coding-intl) — default endpoint.
-    # Supports Qwen models + third-party providers (GLM, Kimi, MiniMax).
-    # Users with classic DashScope keys should override DASHSCOPE_BASE_URL
-    # to https://dashscope-intl.aliyuncs.com/compatible-mode/v1 (OpenAI-compat)
-    # or https://dashscope-intl.aliyuncs.com/apps/anthropic (Anthropic-compat).
    "alibaba": [
        "qwen3.5-plus",
+        "qwen3-max",
        "qwen3-coder-plus",
        "qwen3-coder-next",
-        # Third-party models available on coding-intl
-        "glm-5",
-        "glm-4.7",
-        "kimi-k2.5",
-        "MiniMax-M2.5",
-    ],
-    # Curated HF model list — only agentic models that map to OpenRouter defaults.
-    "huggingface": [
-        "Qwen/Qwen3.5-397B-A17B",
-        "Qwen/Qwen3.5-35B-A3B",
-        "deepseek-ai/DeepSeek-V3.2",
-        "moonshotai/Kimi-K2.5",
-        "MiniMaxAI/MiniMax-M2.5",
-        "zai-org/GLM-5",
-        "XiaomiMiMo/MiMo-V2-Flash",
-        "moonshotai/Kimi-K2-Thinking",
+        "qwen-plus-latest",
+        "qwen3.5-flash",
+        "qwen-vl-max",
    ],
 }

@@ -253,7 +236,6 @@ _PROVIDER_LABELS = {
    "ai-gateway": "AI Gateway",
    "kilocode": "Kilo Code",
    "alibaba": "Alibaba Cloud (DashScope)",
-    "huggingface": "Hugging Face",
    "custom": "Custom endpoint",
 }

@@ -289,9 +271,6 @@ _PROVIDER_ALIASES = {
    "aliyun": "alibaba",
    "qwen": "alibaba",
    "alibaba-cloud": "alibaba",
-    "hf": "huggingface",
-    "hugging-face": "huggingface",
-    "huggingface-hub": "huggingface",
 }


@@ -325,7 +304,7 @@ def list_available_providers() -> list[dict[str, str]]:
    # Canonical providers in display order
    _PROVIDER_ORDER = [
        "openrouter", "nous", "openai-codex", "copilot", "copilot-acp",
-        "huggingface", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "anthropic", "alibaba",
+        "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "anthropic", "alibaba",
        "opencode-zen", "opencode-go",
        "ai-gateway", "deepseek", "custom",
    ]
@@ -68,17 +68,6 @@ def _env_enabled(name: str) -> bool:
    return os.getenv(name, "").strip().lower() in {"1", "true", "yes", "on"}


-def _get_disabled_plugins() -> set:
-    """Read the disabled plugins list from config.yaml."""
-    try:
-        from hermes_cli.config import load_config
-        config = load_config()
-        disabled = config.get("plugins", {}).get("disabled", [])
-        return set(disabled) if isinstance(disabled, list) else set()
-    except Exception:
-        return set()
-
-
 # ---------------------------------------------------------------------------
 # Data classes
 # ---------------------------------------------------------------------------
@@ -152,28 +141,6 @@ class PluginContext:
        self._manager._plugin_tool_names.add(name)
        logger.debug("Plugin %s registered tool: %s", self.manifest.name, name)

-    # -- memory provider registration ----------------------------------------
-
-    def register_memory_provider(self, provider) -> None:
-        """Register a memory provider (must implement MemoryProvider ABC).
-
-        The provider will be added to the MemoryManager during agent init.
-        Providers registered this way are additive — they never disable
-        the built-in MEMORY.md/USER.md store.
-
-        Example plugin __init__.py::
-
-            from my_memory_backend import MyMemoryProvider
-
-            def register(ctx):
-                ctx.register_memory_provider(MyMemoryProvider())
-        """
-        self._manager._memory_providers.append(provider)
-        logger.debug(
-            "Plugin %s registered memory provider: %s",
-            self.manifest.name, getattr(provider, "name", "unknown"),
-        )
-
    # -- hook registration --------------------------------------------------

    def register_hook(self, hook_name: str, callback: Callable) -> None:
@@ -205,7 +172,6 @@ class PluginManager:
        self._plugins: Dict[str, LoadedPlugin] = {}
        self._hooks: Dict[str, List[Callable]] = {}
        self._plugin_tool_names: Set[str] = set()
-        self._memory_providers: List = []  # MemoryProvider instances from plugins
        self._discovered: bool = False

    # -----------------------------------------------------------------------
@@ -233,15 +199,8 @@ class PluginManager:
        # 3. Pip / entry-point plugins
        manifests.extend(self._scan_entry_points())

-        # Load each manifest (skip user-disabled plugins)
-        disabled = _get_disabled_plugins()
+        # Load each manifest
        for manifest in manifests:
-            if manifest.name in disabled:
-                loaded = LoadedPlugin(manifest=manifest, enabled=False)
-                loaded.error = "disabled via config"
-                self._plugins[manifest.name] = loaded
-                logger.debug("Skipping disabled plugin '%s'", manifest.name)
-                continue
            self._load_plugin(manifest)

        if manifests:
@@ -426,23 +385,16 @@ class PluginManager:
    # Hook invocation
    # -----------------------------------------------------------------------

-    def invoke_hook(self, hook_name: str, **kwargs: Any) -> List[Any]:
+    def invoke_hook(self, hook_name: str, **kwargs: Any) -> None:
        """Call all registered callbacks for *hook_name*.

        Each callback is wrapped in its own try/except so a misbehaving
        plugin cannot break the core agent loop.
-
-        Returns a list of non-``None`` return values from callbacks.
-        This allows hooks like ``pre_llm_call`` to contribute context
-        that the agent core can collect and inject.
        """
        callbacks = self._hooks.get(hook_name, [])
-        results: List[Any] = []
        for cb in callbacks:
            try:
-                ret = cb(**kwargs)
-                if ret is not None:
-                    results.append(ret)
+                cb(**kwargs)
            except Exception as exc:
                logger.warning(
                    "Hook '%s' callback %s raised: %s",
@@ -450,7 +402,6 @@ class PluginManager:
                    getattr(cb, "__name__", repr(cb)),
                    exc,
                )
-        return results

    # -----------------------------------------------------------------------
    # Introspection
@@ -495,12 +446,9 @@ def discover_plugins() -> None:
    get_plugin_manager().discover_and_load()


-def invoke_hook(hook_name: str, **kwargs: Any) -> List[Any]:
-    """Invoke a lifecycle hook on all loaded plugins.
-
-    Returns a list of non-``None`` return values from plugin callbacks.
-    """
-    return get_plugin_manager().invoke_hook(hook_name, **kwargs)
+def invoke_hook(hook_name: str, **kwargs: Any) -> None:
+    """Invoke a lifecycle hook on all loaded plugins."""
+    get_plugin_manager().invoke_hook(hook_name, **kwargs)


 def get_plugin_tool_names() -> Set[str]:
@@ -551,13 +499,3 @@ def get_plugin_toolsets() -> List[tuple]:
        result.append((ts_key, label, desc))

    return result
-
-
-def get_plugin_memory_providers() -> List:
-    """Return MemoryProvider instances registered by plugins.
-
-    Called during AIAgent init to add plugin memory providers to
-    the MemoryManager alongside built-in providers.
-    """
-    manager = get_plugin_manager()
-    return list(manager._memory_providers)
@@ -374,73 +374,6 @@ def cmd_remove(name: str) -> None:
    _display_removed(name, plugins_dir)


-def _get_disabled_set() -> set:
-    """Read the disabled plugins set from config.yaml."""
-    try:
-        from hermes_cli.config import load_config
-        config = load_config()
-        disabled = config.get("plugins", {}).get("disabled", [])
-        return set(disabled) if isinstance(disabled, list) else set()
-    except Exception:
-        return set()
-
-
-def _save_disabled_set(disabled: set) -> None:
-    """Write the disabled plugins list to config.yaml."""
-    from hermes_cli.config import load_config, save_config
-    config = load_config()
-    if "plugins" not in config:
-        config["plugins"] = {}
-    config["plugins"]["disabled"] = sorted(disabled)
-    save_config(config)
-
-
-def cmd_enable(name: str) -> None:
-    """Enable a previously disabled plugin."""
-    from rich.console import Console
-
-    console = Console()
-    plugins_dir = _plugins_dir()
-
-    # Verify the plugin exists
-    target = plugins_dir / name
-    if not target.is_dir():
-        console.print(f"[red]Plugin '{name}' is not installed.[/red]")
-        sys.exit(1)
-
-    disabled = _get_disabled_set()
-    if name not in disabled:
-        console.print(f"[dim]Plugin '{name}' is already enabled.[/dim]")
-        return
-
-    disabled.discard(name)
-    _save_disabled_set(disabled)
-    console.print(f"[green]✓[/green] Plugin [bold]{name}[/bold] enabled. Takes effect on next session.")
-
-
-def cmd_disable(name: str) -> None:
-    """Disable a plugin without removing it."""
-    from rich.console import Console
-
-    console = Console()
-    plugins_dir = _plugins_dir()
-
-    # Verify the plugin exists
-    target = plugins_dir / name
-    if not target.is_dir():
-        console.print(f"[red]Plugin '{name}' is not installed.[/red]")
-        sys.exit(1)
-
-    disabled = _get_disabled_set()
-    if name in disabled:
-        console.print(f"[dim]Plugin '{name}' is already disabled.[/dim]")
-        return
-
-    disabled.add(name)
-    _save_disabled_set(disabled)
-    console.print(f"[yellow]⊘[/yellow] Plugin [bold]{name}[/bold] disabled. Takes effect on next session.")
-
-
 def cmd_list() -> None:
    """List installed plugins."""
    from rich.console import Console
@@ -460,11 +393,8 @@ def cmd_list() -> None:
        console.print("[dim]Install with:[/dim] hermes plugins install owner/repo")
        return

-    disabled = _get_disabled_set()
-
    table = Table(title="Installed Plugins", show_lines=False)
    table.add_column("Name", style="bold")
-    table.add_column("Status")
    table.add_column("Version", style="dim")
    table.add_column("Description")
    table.add_column("Source", style="dim")
@@ -490,86 +420,11 @@ def cmd_list() -> None:
        if (d / ".git").exists():
            source = "git"

-        is_disabled = name in disabled or d.name in disabled
-        status = "[red]disabled[/red]" if is_disabled else "[green]enabled[/green]"
-        table.add_row(name, status, str(version), description, source)
+        table.add_row(name, str(version), description, source)

    console.print()
    console.print(table)
    console.print()
-    console.print("[dim]Interactive toggle:[/dim] hermes plugins")
-    console.print("[dim]Enable/disable:[/dim] hermes plugins enable/disable <name>")
-
-
-def cmd_toggle() -> None:
-    """Interactive curses checklist to enable/disable installed plugins."""
-    from rich.console import Console
-
-    try:
-        import yaml
-    except ImportError:
-        yaml = None
-
-    console = Console()
-    plugins_dir = _plugins_dir()
-
-    dirs = sorted(d for d in plugins_dir.iterdir() if d.is_dir())
-    if not dirs:
-        console.print("[dim]No plugins installed.[/dim]")
-        console.print("[dim]Install with:[/dim] hermes plugins install owner/repo")
-        return
-
-    disabled = _get_disabled_set()
-
-    # Build items list: "name — description" for display
-    names = []
-    labels = []
-    selected = set()
-
-    for i, d in enumerate(dirs):
-        manifest_file = d / "plugin.yaml"
-        name = d.name
-        description = ""
-
-        if manifest_file.exists() and yaml:
-            try:
-                with open(manifest_file) as f:
-                    manifest = yaml.safe_load(f) or {}
-                name = manifest.get("name", d.name)
-                description = manifest.get("description", "")
-            except Exception:
-                pass
-
-        names.append(name)
-        label = f"{name} — {description}" if description else name
-        labels.append(label)
-
-        if name not in disabled and d.name not in disabled:
-            selected.add(i)
-
-    from hermes_cli.curses_ui import curses_checklist
-
-    result = curses_checklist(
-        title="Plugins — toggle enabled/disabled",
-        items=labels,
-        selected=selected,
-    )
-
-    # Compute new disabled set from deselected items
-    new_disabled = set()
-    for i, name in enumerate(names):
-        if i not in result:
-            new_disabled.add(name)
-
-    if new_disabled != disabled:
-        _save_disabled_set(new_disabled)
-        enabled_count = len(names) - len(new_disabled)
-        console.print(
-            f"\n[green]✓[/green] {enabled_count} enabled, {len(new_disabled)} disabled. "
-            f"Takes effect on next session."
-        )
-    else:
-        console.print("\n[dim]No changes.[/dim]")


 def plugins_command(args) -> None:
@@ -582,14 +437,8 @@ def plugins_command(args) -> None:
        cmd_update(args.name)
    elif action in ("remove", "rm", "uninstall"):
        cmd_remove(args.name)
-    elif action == "enable":
-        cmd_enable(args.name)
-    elif action == "disable":
-        cmd_disable(args.name)
-    elif action in ("list", "ls"):
+    elif action in ("list", "ls") or action is None:
        cmd_list()
-    elif action is None:
-        cmd_toggle()
    else:
        from rich.console import Console

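
Review note: with the `enable`, `disable`, and interactive toggle branches removed, a bare `hermes plugins` now lists plugins, and `enable`/`disable` would reach the final `else` branch if the argument parser still lets them through. A minimal sketch of the resulting routing, using only the aliases visible in this hunk. `resolve_plugins_action` is a hypothetical helper for illustration and does not exist in the codebase.

```python
from typing import Optional


def resolve_plugins_action(action: Optional[str]) -> str:
    """Map a CLI action to the handler it would reach after this change.

    Only the branches visible in the hunk are modelled; other actions
    (for example the one that calls cmd_update) are omitted.
    """
    if action in ("remove", "rm", "uninstall"):
        return "cmd_remove"
    if action in ("list", "ls") or action is None:
        return "cmd_list"
    # Formerly handled actions such as "enable" and "disable" land here.
    return "unknown"


routes = {a: resolve_plugins_action(a) for a in (None, "ls", "rm", "enable", "disable")}
```

The point of the sketch is the fall-through: no action and `list`/`ls` share one branch, while the removed subcommands are no longer recognised.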
@@ -1,906 +0,0 @@
-"""
-Profile management for multiple isolated Hermes instances.
-
-Each profile is a fully independent HERMES_HOME directory with its own
-config.yaml, .env, memory, sessions, skills, gateway, cron, and logs.
-Profiles live under ``~/.hermes/profiles/<name>/`` by default.
-
-The "default" profile is ``~/.hermes`` itself — backward compatible,
-zero migration needed.
-
-Usage::
-
-    hermes profile create coder          # fresh profile + bundled skills
-    hermes profile create coder --clone  # also copy config, .env, SOUL.md
-    hermes profile create coder --clone-all  # full copy of source profile
-    coder chat                           # use via wrapper alias
-    hermes -p coder chat                 # or via flag
-    hermes profile use coder             # set as sticky default
-    hermes profile delete coder          # remove profile + alias + service
-"""
-
-import json
-import os
-import re
-import shutil
-import stat
-import subprocess
-import sys
-from dataclasses import dataclass, field
-from pathlib import Path
-from typing import List, Optional
-
-_PROFILE_ID_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")
-
-# Directories bootstrapped inside every new profile
-_PROFILE_DIRS = [
-    "memories",
-    "sessions",
-    "skills",
-    "skins",
-    "logs",
-    "plans",
-    "workspace",
-    "cron",
-]
-
-# Files copied during --clone (if they exist in the source)
-_CLONE_CONFIG_FILES = [
-    "config.yaml",
-    ".env",
-    "SOUL.md",
-]
-
-# Runtime files stripped after --clone-all (shouldn't carry over)
-_CLONE_ALL_STRIP = [
-    "gateway.pid",
-    "gateway_state.json",
-    "processes.json",
-]
-
-# Names that cannot be used as profile aliases
-_RESERVED_NAMES = frozenset({
-    "hermes", "default", "test", "tmp", "root", "sudo",
-})
-
-# Hermes subcommands that cannot be used as profile names/aliases
-_HERMES_SUBCOMMANDS = frozenset({
-    "chat", "model", "gateway", "setup", "whatsapp", "login", "logout",
-    "status", "cron", "doctor", "config", "pairing", "skills", "tools",
-    "mcp", "sessions", "insights", "version", "update", "uninstall",
-    "profile", "plugins", "honcho", "acp",
-})
-
-
-# ---------------------------------------------------------------------------
-# Path helpers
-# ---------------------------------------------------------------------------
-
-def _get_profiles_root() -> Path:
-    """Return the directory where named profiles are stored.
-
-    Always ``~/.hermes/profiles/`` — anchored to the user's home,
-    NOT to the current HERMES_HOME (which may itself be a profile).
-    This ensures ``coder profile list`` can see all profiles.
-    """
-    return Path.home() / ".hermes" / "profiles"
-
-
-def _get_default_hermes_home() -> Path:
-    """Return the default (pre-profile) HERMES_HOME path."""
-    return Path.home() / ".hermes"
-
-
-def _get_active_profile_path() -> Path:
-    """Return the path to the sticky active_profile file."""
-    return _get_default_hermes_home() / "active_profile"
-
-
-def _get_wrapper_dir() -> Path:
-    """Return the directory for wrapper scripts."""
-    return Path.home() / ".local" / "bin"
-
-
-# ---------------------------------------------------------------------------
-# Validation
-# ---------------------------------------------------------------------------
-
-def validate_profile_name(name: str) -> None:
-    """Raise ``ValueError`` if *name* is not a valid profile identifier."""
-    if name == "default":
-        return  # special alias for ~/.hermes
-    if not _PROFILE_ID_RE.match(name):
-        raise ValueError(
-            f"Invalid profile name {name!r}. Must match "
-            f"[a-z0-9][a-z0-9_-]{{0,63}}"
-        )
-
-
-def get_profile_dir(name: str) -> Path:
-    """Resolve a profile name to its HERMES_HOME directory."""
-    if name == "default":
-        return _get_default_hermes_home()
-    return _get_profiles_root() / name
-
-
-def profile_exists(name: str) -> bool:
-    """Check whether a profile directory exists."""
-    if name == "default":
-        return True
-    return get_profile_dir(name).is_dir()
-
-
-# ---------------------------------------------------------------------------
-# Alias / wrapper script management
-# ---------------------------------------------------------------------------
-
-def check_alias_collision(name: str) -> Optional[str]:
-    """Return a human-readable collision message, or None if the name is safe.
-
-    Checks: reserved names, hermes subcommands, existing binaries in PATH.
-    """
-    if name in _RESERVED_NAMES:
-        return f"'{name}' is a reserved name"
-    if name in _HERMES_SUBCOMMANDS:
-        return f"'{name}' conflicts with a hermes subcommand"
-
-    # Check existing commands in PATH
-    wrapper_dir = _get_wrapper_dir()
-    try:
-        result = subprocess.run(
-            ["which", name], capture_output=True, text=True, timeout=5,
-        )
-        if result.returncode == 0:
-            existing_path = result.stdout.strip()
-            # Allow overwriting our own wrappers
-            if existing_path == str(wrapper_dir / name):
-                try:
-                    content = (wrapper_dir / name).read_text()
-                    if "hermes -p" in content:
-                        return None  # it's our wrapper, safe to overwrite
-                except Exception:
-                    pass
-            return f"'{name}' conflicts with an existing command ({existing_path})"
-    except (FileNotFoundError, subprocess.TimeoutExpired):
-        pass
-
-    return None  # safe
-
-
-def _is_wrapper_dir_in_path() -> bool:
-    """Check if ~/.local/bin is in PATH."""
-    wrapper_dir = str(_get_wrapper_dir())
-    return wrapper_dir in os.environ.get("PATH", "").split(os.pathsep)
-
-
-def create_wrapper_script(name: str) -> Optional[Path]:
-    """Create a shell wrapper script at ~/.local/bin/<name>.
-
-    Returns the path to the created wrapper, or None if creation failed.
-    """
-    wrapper_dir = _get_wrapper_dir()
-    try:
-        wrapper_dir.mkdir(parents=True, exist_ok=True)
-    except OSError as e:
-        print(f"⚠ Could not create {wrapper_dir}: {e}")
-        return None
-
-    wrapper_path = wrapper_dir / name
-    try:
-        wrapper_path.write_text(f'#!/bin/sh\nexec hermes -p {name} "$@"\n')
-        wrapper_path.chmod(wrapper_path.stat().st_mode | stat.S_IEXEC | stat.S_IXGRP | stat.S_IXOTH)
-        return wrapper_path
-    except OSError as e:
-        print(f"⚠ Could not create wrapper at {wrapper_path}: {e}")
-        return None
-
-
-def remove_wrapper_script(name: str) -> bool:
-    """Remove the wrapper script for a profile. Returns True if removed."""
-    wrapper_path = _get_wrapper_dir() / name
-    if wrapper_path.exists():
-        try:
-            # Verify it's our wrapper before removing
-            content = wrapper_path.read_text()
-            if "hermes -p" in content:
-                wrapper_path.unlink()
-                return True
-        except Exception:
-            pass
-    return False
-
-
-# ---------------------------------------------------------------------------
-# ProfileInfo
-# ---------------------------------------------------------------------------
-
-@dataclass
-class ProfileInfo:
-    """Summary information about a profile."""
-    name: str
-    path: Path
-    is_default: bool
-    gateway_running: bool
-    model: Optional[str] = None
-    provider: Optional[str] = None
-    has_env: bool = False
-    skill_count: int = 0
-    alias_path: Optional[Path] = None
-
-
-def _read_config_model(profile_dir: Path) -> tuple:
-    """Read model/provider from a profile's config.yaml. Returns (model, provider)."""
-    config_path = profile_dir / "config.yaml"
-    if not config_path.exists():
-        return None, None
-    try:
-        import yaml
-        with open(config_path, "r") as f:
-            cfg = yaml.safe_load(f) or {}
-        model_cfg = cfg.get("model", {})
-        if isinstance(model_cfg, str):
-            return model_cfg, None
-        if isinstance(model_cfg, dict):
-            return model_cfg.get("model"), model_cfg.get("provider")
-        return None, None
-    except Exception:
-        return None, None
-
-
-def _check_gateway_running(profile_dir: Path) -> bool:
-    """Check if a gateway is running for a given profile directory."""
-    pid_file = profile_dir / "gateway.pid"
-    if not pid_file.exists():
-        return False
-    try:
-        raw = pid_file.read_text().strip()
-        if not raw:
-            return False
-        data = json.loads(raw) if raw.startswith("{") else {"pid": int(raw)}
-        pid = int(data["pid"])
-        os.kill(pid, 0)  # existence check
-        return True
-    except (json.JSONDecodeError, KeyError, ValueError, TypeError,
-            ProcessLookupError, PermissionError, OSError):
-        return False
-
-
-def _count_skills(profile_dir: Path) -> int:
-    """Count installed skills in a profile."""
-    skills_dir = profile_dir / "skills"
-    if not skills_dir.is_dir():
-        return 0
-    count = 0
-    for md in skills_dir.rglob("SKILL.md"):
-        if "/.hub/" not in str(md) and "/.git/" not in str(md):
-            count += 1
-    return count
-
-
-# ---------------------------------------------------------------------------
-# CRUD operations
-# ---------------------------------------------------------------------------
-
-def list_profiles() -> List[ProfileInfo]:
-    """Return info for all profiles, including the default."""
-    profiles = []
-    wrapper_dir = _get_wrapper_dir()
-
-    # Default profile
-    default_home = _get_default_hermes_home()
-    if default_home.is_dir():
-        model, provider = _read_config_model(default_home)
-        profiles.append(ProfileInfo(
-            name="default",
-            path=default_home,
-            is_default=True,
-            gateway_running=_check_gateway_running(default_home),
-            model=model,
-            provider=provider,
-            has_env=(default_home / ".env").exists(),
-            skill_count=_count_skills(default_home),
-        ))
-
-    # Named profiles
-    profiles_root = _get_profiles_root()
-    if profiles_root.is_dir():
-        for entry in sorted(profiles_root.iterdir()):
-            if not entry.is_dir():
-                continue
-            name = entry.name
-            if not _PROFILE_ID_RE.match(name):
-                continue
-            model, provider = _read_config_model(entry)
-            alias_path = wrapper_dir / name
-            profiles.append(ProfileInfo(
-                name=name,
-                path=entry,
-                is_default=False,
-                gateway_running=_check_gateway_running(entry),
-                model=model,
-                provider=provider,
-                has_env=(entry / ".env").exists(),
-                skill_count=_count_skills(entry),
-                alias_path=alias_path if alias_path.exists() else None,
-            ))
-
-    return profiles
-
-
-def create_profile(
-    name: str,
-    clone_from: Optional[str] = None,
-    clone_all: bool = False,
-    clone_config: bool = False,
-    no_alias: bool = False,
-) -> Path:
-    """Create a new profile directory.
-
-    Parameters
-    ----------
-    name:
-        Profile identifier (lowercase, alphanumeric, hyphens, underscores).
-    clone_from:
-        Source profile to clone from. If ``None`` and clone_config/clone_all
-        is True, defaults to the currently active profile.
-    clone_all:
-        If True, do a full copytree of the source (all state).
-    clone_config:
-        If True, copy only config files (config.yaml, .env, SOUL.md).
-    no_alias:
-        If True, skip wrapper script creation.
-
-    Returns
-    -------
-    Path
-        The newly created profile directory.
-    """
-    validate_profile_name(name)
-
-    if name == "default":
-        raise ValueError(
-            "Cannot create a profile named 'default' — it is the built-in profile (~/.hermes)."
-        )
-
-    profile_dir = get_profile_dir(name)
-    if profile_dir.exists():
-        raise FileExistsError(f"Profile '{name}' already exists at {profile_dir}")
-
-    # Resolve clone source
-    source_dir = None
-    if clone_from is not None or clone_all or clone_config:
-        if clone_from is None:
-            # Default: clone from active profile
-            from hermes_constants import get_hermes_home
-            source_dir = get_hermes_home()
-        else:
-            validate_profile_name(clone_from)
-            source_dir = get_profile_dir(clone_from)
-        if not source_dir.is_dir():
-            raise FileNotFoundError(
-                f"Source profile '{clone_from or 'active'}' does not exist at {source_dir}"
-            )
-
-    if clone_all and source_dir:
-        # Full copy of source profile
-        shutil.copytree(source_dir, profile_dir)
-        # Strip runtime files
-        for stale in _CLONE_ALL_STRIP:
-            (profile_dir / stale).unlink(missing_ok=True)
-    else:
-        # Bootstrap directory structure
-        profile_dir.mkdir(parents=True, exist_ok=True)
-        for subdir in _PROFILE_DIRS:
-            (profile_dir / subdir).mkdir(parents=True, exist_ok=True)
-
-        # Clone config files from source
-        if source_dir is not None:
-            for filename in _CLONE_CONFIG_FILES:
-                src = source_dir / filename
-                if src.exists():
-                    shutil.copy2(src, profile_dir / filename)
-
-    return profile_dir
-
-
-def seed_profile_skills(profile_dir: Path, quiet: bool = False) -> Optional[dict]:
-    """Seed bundled skills into a profile via subprocess.
-
-    Uses subprocess because sync_skills() caches HERMES_HOME at module level.
-    Returns the sync result dict, or None on failure.
-    """
-    project_root = Path(__file__).parent.parent.resolve()
-    try:
-        result = subprocess.run(
-            [sys.executable, "-c",
-             "import json; from tools.skills_sync import sync_skills; "
-             "r = sync_skills(quiet=True); print(json.dumps(r))"],
-            env={**os.environ, "HERMES_HOME": str(profile_dir)},
-            cwd=str(project_root),
-            capture_output=True, text=True, timeout=60,
-        )
-        if result.returncode == 0 and result.stdout.strip():
-            return json.loads(result.stdout.strip())
-        if not quiet:
-            print(f"⚠ Skill seeding returned exit code {result.returncode}")
-            if result.stderr.strip():
-                print(f"  {result.stderr.strip()[:200]}")
-        return None
-    except subprocess.TimeoutExpired:
-        if not quiet:
-            print("⚠ Skill seeding timed out (60s)")
-        return None
-    except Exception as e:
-        if not quiet:
-            print(f"⚠ Skill seeding failed: {e}")
-        return None
-
-
-def delete_profile(name: str, yes: bool = False) -> Path:
-    """Delete a profile, its wrapper script, and its gateway service.
-
-    Stops the gateway if running. Disables systemd/launchd service first
-    to prevent auto-restart.
-
-    Returns the path that was removed.
-    """
-    validate_profile_name(name)
-
-    if name == "default":
-        raise ValueError(
-            "Cannot delete the default profile (~/.hermes).\n"
-            "To remove everything, use: hermes uninstall"
-        )
-
-    profile_dir = get_profile_dir(name)
-    if not profile_dir.is_dir():
-        raise FileNotFoundError(f"Profile '{name}' does not exist.")
-
-    # Show what will be deleted
-    model, provider = _read_config_model(profile_dir)
-    gw_running = _check_gateway_running(profile_dir)
-    skill_count = _count_skills(profile_dir)
-
-    print(f"\nProfile: {name}")
-    print(f"Path:    {profile_dir}")
-    if model:
-        print(f"Model:   {model}" + (f" ({provider})" if provider else ""))
-    if skill_count:
-        print(f"Skills:  {skill_count}")
-
-    items = [
-        "All config, API keys, memories, sessions, skills, cron jobs",
-    ]
-
-    # Check for service
-    from hermes_cli.gateway import _profile_suffix, get_service_name
-    wrapper_path = _get_wrapper_dir() / name
-    has_wrapper = wrapper_path.exists()
-    if has_wrapper:
-        items.append(f"Command alias ({wrapper_path})")
-
-    print(f"\nThis will permanently delete:")
-    for item in items:
-        print(f"  • {item}")
-    if gw_running:
-        print(f"  ⚠ Gateway is running — it will be stopped.")
-
-    # Confirmation
-    if not yes:
-        print()
-        try:
-            confirm = input(f"Type '{name}' to confirm: ").strip()
-        except (KeyboardInterrupt, EOFError):
-            print("\nCancelled.")
-            return profile_dir
-        if confirm != name:
-            print("Cancelled.")
-            return profile_dir
-
-    # 1. Disable service (prevents auto-restart)
-    _cleanup_gateway_service(name, profile_dir)
-
-    # 2. Stop running gateway
-    if gw_running:
-        _stop_gateway_process(profile_dir)
-
-    # 3. Remove wrapper script
-    if has_wrapper:
-        if remove_wrapper_script(name):
-            print(f"✓ Removed {wrapper_path}")
-
-    # 4. Remove profile directory
-    try:
-        shutil.rmtree(profile_dir)
-        print(f"✓ Removed {profile_dir}")
-    except Exception as e:
-        print(f"⚠ Could not remove {profile_dir}: {e}")
-
-    # 5. Clear active_profile if it pointed to this profile
-    try:
-        active = get_active_profile()
-        if active == name:
-            set_active_profile("default")
-            print("✓ Active profile reset to default")
-    except Exception:
-        pass
-
-    print(f"\nProfile '{name}' deleted.")
-    return profile_dir
-
-
-def _cleanup_gateway_service(name: str, profile_dir: Path) -> None:
-    """Disable and remove systemd/launchd service for a profile."""
-    import platform as _platform
-
-    # Derive service name for this profile
-    # Temporarily set HERMES_HOME so _profile_suffix resolves correctly
-    old_home = os.environ.get("HERMES_HOME")
-    try:
-        os.environ["HERMES_HOME"] = str(profile_dir)
-        from hermes_cli.gateway import get_service_name, get_launchd_plist_path
-
-        if _platform.system() == "Linux":
-            svc_name = get_service_name()
-            svc_file = Path.home() / ".config" / "systemd" / "user" / f"{svc_name}.service"
-            if svc_file.exists():
-                subprocess.run(
-                    ["systemctl", "--user", "disable", svc_name],
-                    capture_output=True, check=False, timeout=10,
-                )
-                subprocess.run(
-                    ["systemctl", "--user", "stop", svc_name],
-                    capture_output=True, check=False, timeout=10,
-                )
-                svc_file.unlink(missing_ok=True)
-                subprocess.run(
-                    ["systemctl", "--user", "daemon-reload"],
-                    capture_output=True, check=False, timeout=10,
-                )
-                print(f"✓ Service {svc_name} removed")
-
-        elif _platform.system() == "Darwin":
-            plist_path = get_launchd_plist_path()
-            if plist_path.exists():
-                subprocess.run(
-                    ["launchctl", "unload", str(plist_path)],
-                    capture_output=True, check=False, timeout=10,
-                )
-                plist_path.unlink(missing_ok=True)
-                print(f"✓ Launchd service removed")
-    except Exception as e:
-        print(f"⚠ Service cleanup: {e}")
-    finally:
-        if old_home is not None:
-            os.environ["HERMES_HOME"] = old_home
-        elif "HERMES_HOME" in os.environ:
-            del os.environ["HERMES_HOME"]
-
-
-def _stop_gateway_process(profile_dir: Path) -> None:
-    """Stop a running gateway process via its PID file."""
-    import signal as _signal
-    import time as _time
-
-    pid_file = profile_dir / "gateway.pid"
-    if not pid_file.exists():
-        return
-
-    try:
-        raw = pid_file.read_text().strip()
-        data = json.loads(raw) if raw.startswith("{") else {"pid": int(raw)}
-        pid = int(data["pid"])
-        os.kill(pid, _signal.SIGTERM)
-        # Wait up to 10s for graceful shutdown
-        for _ in range(20):
-            _time.sleep(0.5)
-            try:
-                os.kill(pid, 0)
-            except ProcessLookupError:
-                print(f"✓ Gateway stopped (PID {pid})")
-                return
-        # Force kill
-        try:
-            os.kill(pid, _signal.SIGKILL)
-        except ProcessLookupError:
-            pass
-        print(f"✓ Gateway force-stopped (PID {pid})")
-    except (ProcessLookupError, PermissionError):
-        print("✓ Gateway already stopped")
-    except Exception as e:
-        print(f"⚠ Could not stop gateway: {e}")
-
-
-# ---------------------------------------------------------------------------
-# Active profile (sticky default)
-# ---------------------------------------------------------------------------
-
-def get_active_profile() -> str:
-    """Read the sticky active profile name.
-
-    Returns ``"default"`` if no active_profile file exists or it's empty.
-    """
-    path = _get_active_profile_path()
-    try:
-        name = path.read_text().strip()
-        if not name:
-            return "default"
-        return name
-    except (FileNotFoundError, UnicodeDecodeError, OSError):
-        return "default"
-
-
-def set_active_profile(name: str) -> None:
-    """Set the sticky active profile.
-
-    Writes to ``~/.hermes/active_profile``. Use ``"default"`` to clear.
-    """
-    validate_profile_name(name)
-    if name != "default" and not profile_exists(name):
-        raise FileNotFoundError(
-            f"Profile '{name}' does not exist. "
-            f"Create it with: hermes profile create {name}"
-        )
-
-    path = _get_active_profile_path()
-    path.parent.mkdir(parents=True, exist_ok=True)
-    if name == "default":
-        # Remove the file to indicate default
-        path.unlink(missing_ok=True)
-    else:
-        # Atomic write
-        tmp = path.with_suffix(".tmp")
-        tmp.write_text(name + "\n")
-        tmp.replace(path)
-
-
-def get_active_profile_name() -> str:
-    """Infer the current profile name from HERMES_HOME.
-
-    Returns ``"default"`` if HERMES_HOME is not set or points to ``~/.hermes``.
-    Returns the profile name if HERMES_HOME points into ``~/.hermes/profiles/<name>``.
-    Returns ``"custom"`` if HERMES_HOME is set to an unrecognized path.
-    """
-    from hermes_constants import get_hermes_home
-    hermes_home = get_hermes_home()
-    resolved = hermes_home.resolve()
-
-    default_resolved = _get_default_hermes_home().resolve()
-    if resolved == default_resolved:
-        return "default"
-
-    profiles_root = _get_profiles_root().resolve()
-    try:
-        rel = resolved.relative_to(profiles_root)
-        parts = rel.parts
-        if len(parts) == 1 and _PROFILE_ID_RE.match(parts[0]):
-            return parts[0]
-    except ValueError:
-        pass
-
-    return "custom"
-
-
-# ---------------------------------------------------------------------------
-# Export / Import
-# ---------------------------------------------------------------------------
-
-def export_profile(name: str, output_path: str) -> Path:
-    """Export a profile to a tar.gz archive.
-
-    Returns the output file path.
-    """
-    validate_profile_name(name)
-    profile_dir = get_profile_dir(name)
-    if not profile_dir.is_dir():
-        raise FileNotFoundError(f"Profile '{name}' does not exist.")
-
-    output = Path(output_path)
-    # shutil.make_archive wants the base name without extension
-    base = str(output).removesuffix(".tar.gz").removesuffix(".tgz")
-    result = shutil.make_archive(base, "gztar", str(profile_dir.parent), name)
-    return Path(result)
-
-
-def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
-    """Import a profile from a tar.gz archive.
-
-    If *name* is not given, infers it from the archive's top-level directory.
-    Returns the imported profile directory.
-    """
-    import tarfile
-
-    archive = Path(archive_path)
-    if not archive.exists():
-        raise FileNotFoundError(f"Archive not found: {archive}")
-
-    # Peek at the archive to find the top-level directory name
-    with tarfile.open(archive, "r:gz") as tf:
-        top_dirs = {m.name.split("/")[0] for m in tf.getmembers() if "/" in m.name}
-        if not top_dirs:
-            top_dirs = {m.name for m in tf.getmembers() if m.isdir()}
-
-    inferred_name = name or (top_dirs.pop() if len(top_dirs) == 1 else None)
-    if not inferred_name:
-        raise ValueError(
-            "Cannot determine profile name from archive. "
-            "Specify it explicitly: hermes profile import <archive> --name <name>"
-        )
-
-    validate_profile_name(inferred_name)
-    profile_dir = get_profile_dir(inferred_name)
-    if profile_dir.exists():
-        raise FileExistsError(f"Profile '{inferred_name}' already exists at {profile_dir}")
-
-    profiles_root = _get_profiles_root()
-    profiles_root.mkdir(parents=True, exist_ok=True)
-
-    shutil.unpack_archive(str(archive), str(profiles_root))
-
-    # If the archive extracted under a different name, rename
-    extracted = profiles_root / (top_dirs.pop() if top_dirs else inferred_name)
-    if extracted != profile_dir and extracted.exists():
-        extracted.rename(profile_dir)
-
-    return profile_dir
-
-
-# ---------------------------------------------------------------------------
-# Rename
-# ---------------------------------------------------------------------------
-
-def rename_profile(old_name: str, new_name: str) -> Path:
-    """Rename a profile: directory, wrapper script, service, active_profile.
-
-    Returns the new profile directory.
-    """
-    validate_profile_name(old_name)
-    validate_profile_name(new_name)
-
-    if old_name == "default":
-        raise ValueError("Cannot rename the default profile.")
-    if new_name == "default":
-        raise ValueError("Cannot rename to 'default' — it is reserved.")
-
-    old_dir = get_profile_dir(old_name)
-    new_dir = get_profile_dir(new_name)
-
-    if not old_dir.is_dir():
-        raise FileNotFoundError(f"Profile '{old_name}' does not exist.")
-    if new_dir.exists():
-        raise FileExistsError(f"Profile '{new_name}' already exists.")
-
-    # 1. Stop gateway if running
-    if _check_gateway_running(old_dir):
-        _cleanup_gateway_service(old_name, old_dir)
-        _stop_gateway_process(old_dir)
-
-    # 2. Rename directory
-    old_dir.rename(new_dir)
-    print(f"✓ Renamed {old_dir.name} → {new_dir.name}")
-
-    # 3. Update wrapper script
-    remove_wrapper_script(old_name)
-    collision = check_alias_collision(new_name)
-    if not collision:
-        create_wrapper_script(new_name)
-        print(f"✓ Alias updated: {new_name}")
-    else:
-        print(f"⚠ Cannot create alias '{new_name}' — {collision}")
-
-    # 4. Update active_profile if it pointed to old name
-    try:
-        if get_active_profile() == old_name:
-            set_active_profile(new_name)
-            print(f"✓ Active profile updated: {new_name}")
-    except Exception:
-        pass
-
-    return new_dir
-
-
-# ---------------------------------------------------------------------------
-# Tab completion
-# ---------------------------------------------------------------------------
-
-def generate_bash_completion() -> str:
-    """Generate a bash completion script for hermes profile names."""
-    return '''# Hermes Agent profile completion
-# Add to ~/.bashrc: eval "$(hermes completion bash)"
-
-_hermes_profiles() {
-    local profiles_dir="$HOME/.hermes/profiles"
-    local profiles="default"
-    if [ -d "$profiles_dir" ]; then
-        profiles="$profiles $(ls "$profiles_dir" 2>/dev/null)"
-    fi
-    echo "$profiles"
-}
-
-_hermes_completion() {
-    local cur prev
-    cur="${COMP_WORDS[COMP_CWORD]}"
-    prev="${COMP_WORDS[COMP_CWORD-1]}"
-
-    # Complete profile names after -p / --profile
-    if [[ "$prev" == "-p" || "$prev" == "--profile" ]]; then
-        COMPREPLY=($(compgen -W "$(_hermes_profiles)" -- "$cur"))
-        return
-    fi
-
-    # Complete profile subcommands
-    if [[ "${COMP_WORDS[1]}" == "profile" ]]; then
-        case "$prev" in
-            profile)
-                COMPREPLY=($(compgen -W "list use create delete show alias rename export import" -- "$cur"))
-                return
-                ;;
-            use|delete|show|alias|rename|export)
-                COMPREPLY=($(compgen -W "$(_hermes_profiles)" -- "$cur"))
-                return
-                ;;
-        esac
-    fi
-
-    # Top-level subcommands
-    if [[ "$COMP_CWORD" == 1 ]]; then
-        local commands="chat model gateway setup status cron doctor config skills tools mcp sessions profile update version"
-        COMPREPLY=($(compgen -W "$commands" -- "$cur"))
-    fi
-}
-
-complete -F _hermes_completion hermes
-'''
-
-
-def generate_zsh_completion() -> str:
-    """Generate a zsh completion script for hermes profile names."""
-    return '''#compdef hermes
-# Hermes Agent profile completion
-# Add to ~/.zshrc: eval "$(hermes completion zsh)"
-
-_hermes() {
-    local -a profiles
-    profiles=(default)
-    if [[ -d "$HOME/.hermes/profiles" ]]; then
-        profiles+=("${(@f)$(ls $HOME/.hermes/profiles 2>/dev/null)}")
-    fi
-
-    _arguments \\
-        '-p[Profile name]:profile:($profiles)' \\
-        '--profile[Profile name]:profile:($profiles)' \\
-        '1:command:(chat model gateway setup status cron doctor config skills tools mcp sessions profile update version)' \\
-        '*::arg:->args'
-
-    case $words[1] in
-        profile)
-            _arguments '1:action:(list use create delete show alias rename export import)' \\
-                        '2:profile:($profiles)'
-            ;;
-    esac
-}
-
-_hermes "$@"
-'''
-
-
-# ---------------------------------------------------------------------------
-# Profile env resolution (called from _apply_profile_override)
-# ---------------------------------------------------------------------------
-
-def resolve_profile_env(profile_name: str) -> str:
-    """Resolve a profile name to a HERMES_HOME path string.
-
-    Called early in the CLI entry point, before any hermes modules
-    are imported, to set the HERMES_HOME environment variable.
-    """
-    validate_profile_name(profile_name)
-    profile_dir = get_profile_dir(profile_name)
-
-    if profile_name != "default" and not profile_dir.is_dir():
-        raise FileNotFoundError(
-            f"Profile '{profile_name}' does not exist. "
-            f"Create it with: hermes profile create {profile_name}"
-        )
-
-    return str(profile_dir)
@@ -63,11 +63,8 @@ def _get_model_config() -> Dict[str, Any]:
    model_cfg = config.get("model")
    if isinstance(model_cfg, dict):
        cfg = dict(model_cfg)
-        # Accept "model" as alias for "default" (users intuitively write model.model)
-        if not cfg.get("default") and cfg.get("model"):
-            cfg["default"] = cfg["model"]
-        default = (cfg.get("default") or "").strip()
-        base_url = (cfg.get("base_url") or "").strip()
+        default = cfg.get("default", "").strip()
+        base_url = cfg.get("base_url", "").strip()
        is_local = "localhost" in base_url or "127.0.0.1" in base_url
        is_fallback = not default or default == "anthropic/claude-opus-4.6"
        if is_local and is_fallback and base_url:
@@ -206,7 +203,7 @@ def _resolve_named_custom_runtime(
        or _detect_api_mode_for_url(base_url)
        or "chat_completions",
        "base_url": base_url,
-        "api_key": api_key or "no-key-required",
+        "api_key": api_key,
        "source": f"custom_provider:{custom_provider.get('name', requested_provider)}",
    }

@@ -410,6 +407,12 @@ def resolve_runtime_provider(
            # (e.g. https://api.minimax.io/anthropic, https://dashscope.../anthropic)
            elif base_url.rstrip("/").endswith("/anthropic"):
                api_mode = "anthropic_messages"
+            # MiniMax providers always use Anthropic Messages API.
+            # Auto-correct stale /v1 URLs (from old .env or config) to /anthropic.
+            elif provider in ("minimax", "minimax-cn"):
+                api_mode = "anthropic_messages"
+                if base_url.rstrip("/").endswith("/v1"):
+                    base_url = base_url.rstrip("/")[:-3] + "/anthropic"
        return {
            "provider": provider,
            "api_mode": api_mode,
@@ -80,11 +80,6 @@ _DEFAULT_PROVIDER_MODELS = {
    "minimax-cn": ["MiniMax-M2.7", "MiniMax-M2.7-highspeed", "MiniMax-M2.5", "MiniMax-M2.5-highspeed", "MiniMax-M2.1"],
    "ai-gateway": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5", "google/gemini-3-flash"],
    "kilocode": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5.4", "google/gemini-3-pro-preview", "google/gemini-3-flash-preview"],
-    "huggingface": [
-        "Qwen/Qwen3.5-397B-A17B", "Qwen/Qwen3-235B-A22B-Thinking-2507",
-        "Qwen/Qwen3-Coder-480B-A35B-Instruct", "deepseek-ai/DeepSeek-R1-0528",
-        "deepseek-ai/DeepSeek-V3.2", "moonshotai/Kimi-K2.5",
-    ],
 }


@@ -289,7 +284,6 @@ from hermes_cli.config import (
    get_env_value,
    ensure_hermes_home,
 )
-# display_hermes_home imported lazily at call sites (stale-module safety during hermes update)

 from hermes_cli.colors import Colors, color

@@ -586,11 +580,11 @@ def _print_setup_summary(config: dict, hermes_home):
    else:
        tool_status.append(("Mixture of Agents", False, "OPENROUTER_API_KEY"))

-    # Web tools (Exa, Parallel, Firecrawl, or Tavily)
-    if get_env_value("EXA_API_KEY") or get_env_value("PARALLEL_API_KEY") or get_env_value("FIRECRAWL_API_KEY") or get_env_value("FIRECRAWL_API_URL") or get_env_value("TAVILY_API_KEY"):
+    # Web tools (Parallel, Firecrawl, or Tavily)
+    if get_env_value("PARALLEL_API_KEY") or get_env_value("FIRECRAWL_API_KEY") or get_env_value("FIRECRAWL_API_URL") or get_env_value("TAVILY_API_KEY"):
        tool_status.append(("Web Search & Extract", True, None))
    else:
-        tool_status.append(("Web Search & Extract", False, "EXA_API_KEY, PARALLEL_API_KEY, FIRECRAWL_API_KEY, or TAVILY_API_KEY"))
+        tool_status.append(("Web Search & Extract", False, "PARALLEL_API_KEY, FIRECRAWL_API_KEY, or TAVILY_API_KEY"))

    # Browser tools (local Chromium or Browserbase cloud)
    import shutil
@@ -684,8 +678,7 @@ def _print_setup_summary(config: dict, hermes_home):
        print_warning(
            "Some tools are disabled. Run 'hermes setup tools' to configure them,"
        )
-        from hermes_constants import display_hermes_home as _dhh
-        print_warning(f"or edit {_dhh()}/.env directly to add the missing API keys.")
+        print_warning("or edit ~/.hermes/.env directly to add the missing API keys.")
        print()

    # Done banner
@@ -708,8 +701,7 @@ def _print_setup_summary(config: dict, hermes_home):
    print()

    # Show file locations prominently
-    from hermes_constants import display_hermes_home as _dhh
-    print(color(f"📁 All your files are in {_dhh()}/:", Colors.CYAN, Colors.BOLD))
+    print(color("📁 All your files are in ~/.hermes/:", Colors.CYAN, Colors.BOLD))
    print()
    print(f"   {color('Settings:', Colors.YELLOW)}  {get_config_path()}")
    print(f"   {color('API Keys:', Colors.YELLOW)}  {get_env_path()}")
@@ -892,7 +884,6 @@ def setup_model_provider(config: dict):
        "OpenCode Go (open models, $10/month subscription)",
        "GitHub Copilot (uses GITHUB_TOKEN or gh auth token)",
        "GitHub Copilot ACP (spawns `copilot --acp --stdio`)",
-        "Hugging Face Inference Providers (20+ open models)",
    ]
    if keep_label:
        provider_choices.append(keep_label)
@@ -1537,26 +1528,7 @@ def setup_model_provider(config: dict):
        _set_model_provider(config, "copilot-acp", pconfig.inference_base_url)
        selected_base_url = pconfig.inference_base_url

-    elif provider_idx == 16:  # Hugging Face Inference Providers
-        selected_provider = "huggingface"
-        print()
-        print_header("Hugging Face API Token")
-        pconfig = PROVIDER_REGISTRY["huggingface"]
-        print_info(f"Provider: {pconfig.name}")
-        print_info("Get your token at: https://huggingface.co/settings/tokens")
-        print_info("Required permission: 'Make calls to Inference Providers'")
-        print()
-
-        api_key = prompt("  HF Token", password=True)
-        if api_key:
-            save_env_value("HF_TOKEN", api_key)
-            # Clear OpenRouter env vars to prevent routing confusion
-            save_env_value("OPENAI_BASE_URL", "")
-            save_env_value("OPENAI_API_KEY", "")
-        _set_model_provider(config, "huggingface", pconfig.inference_base_url)
-        selected_base_url = pconfig.inference_base_url
-
-    # else: provider_idx == 17 (Keep current) — only shown when a provider already exists
+    # else: provider_idx == 16 (Keep current) — only shown when a provider already exists
    # Normalize "keep current" to an explicit provider so downstream logic
    # doesn't fall back to the generic OpenRouter/static-model path.
    if selected_provider is None:
@@ -2095,11 +2067,11 @@ def setup_terminal_backend(config: dict):
        print_info("Serverless cloud sandboxes. Each session gets its own container.")
        print_info("Requires a Modal account: https://modal.com")

-        # Check if modal SDK is installed
+        # Check if swe-rex[modal] is installed
        try:
-            __import__("modal")
+            __import__("swe_rex")
        except ImportError:
-            print_info("Installing modal SDK...")
+            print_info("Installing swe-rex[modal]...")
            import subprocess

            uv_bin = shutil.which("uv")
@@ -2111,22 +2083,22 @@ def setup_terminal_backend(config: dict):
                        "install",
                        "--python",
                        sys.executable,
-                        "modal",
+                        "swe-rex[modal]",
                    ],
                    capture_output=True,
                    text=True,
                )
            else:
                result = subprocess.run(
-                    [sys.executable, "-m", "pip", "install", "modal"],
+                    [sys.executable, "-m", "pip", "install", "swe-rex[modal]"],
                    capture_output=True,
                    text=True,
                )
            if result.returncode == 0:
-                print_success("modal SDK installed")
+                print_success("swe-rex[modal] installed")
            else:
                print_warning(
-                    "Install failed — run manually: pip install modal"
+                    "Install failed — run manually: pip install 'swe-rex[modal]'"
                )

        # Modal token
@@ -2840,8 +2812,7 @@ def setup_gateway(config: dict):
        save_env_value("WEBHOOK_ENABLED", "true")
        print()
        print_success("Webhooks enabled! Next steps:")
-        from hermes_constants import display_hermes_home as _dhh
-        print_info(f"   1. Define webhook routes in {_dhh()}/config.yaml")
+        print_info("   1. Define webhook routes in ~/.hermes/config.yaml")
        print_info("   2. Point your service (GitHub, GitLab, etc.) at:")
        print_info("      http://your-server:8644/webhooks/<route-name>")
        print()
@@ -2997,95 +2968,6 @@ def setup_tools(config: dict, first_install: bool = False):
    tools_command(first_install=first_install, config=config)


-# =============================================================================
-# Post-Migration Section Skip Logic
-# =============================================================================
-
-
-def _get_section_config_summary(config: dict, section_key: str) -> Optional[str]:
-    """Return a short summary if a setup section is already configured, else None.
-
-    Used after OpenClaw migration to detect which sections can be skipped.
-    ``get_env_value`` is the module-level import from hermes_cli.config
-    so that test patches on ``setup_mod.get_env_value`` take effect.
-    """
-    if section_key == "model":
-        has_key = bool(
-            get_env_value("OPENROUTER_API_KEY")
-            or get_env_value("OPENAI_API_KEY")
-            or get_env_value("ANTHROPIC_API_KEY")
-        )
-        if not has_key:
-            # Check for OAuth providers
-            try:
-                from hermes_cli.auth import get_active_provider
-                if get_active_provider():
-                    has_key = True
-            except Exception:
-                pass
-        if not has_key:
-            return None
-        model = config.get("model")
-        if isinstance(model, str) and model.strip():
-            return model.strip()
-        if isinstance(model, dict):
-            return str(model.get("default") or model.get("model") or "configured")
-        return "configured"
-
-    elif section_key == "terminal":
-        backend = config.get("terminal", {}).get("backend", "local")
-        return f"backend: {backend}"
-
-    elif section_key == "agent":
-        max_turns = config.get("agent", {}).get("max_turns", 90)
-        return f"max turns: {max_turns}"
-
-    elif section_key == "gateway":
-        platforms = []
-        if get_env_value("TELEGRAM_BOT_TOKEN"):
-            platforms.append("Telegram")
-        if get_env_value("DISCORD_BOT_TOKEN"):
-            platforms.append("Discord")
-        if get_env_value("SLACK_BOT_TOKEN"):
-            platforms.append("Slack")
-        if get_env_value("WHATSAPP_PHONE_NUMBER_ID"):
-            platforms.append("WhatsApp")
-        if get_env_value("SIGNAL_ACCOUNT"):
-            platforms.append("Signal")
-        if platforms:
-            return ", ".join(platforms)
-        return None  # No platforms configured — section must run
-
-    elif section_key == "tools":
-        tools = []
-        if get_env_value("ELEVENLABS_API_KEY"):
-            tools.append("TTS/ElevenLabs")
-        if get_env_value("BROWSERBASE_API_KEY"):
-            tools.append("Browser")
-        if get_env_value("FIRECRAWL_API_KEY"):
-            tools.append("Firecrawl")
-        if tools:
-            return ", ".join(tools)
-        return None
-
-    return None
-
-
-def _skip_configured_section(
-    config: dict, section_key: str, label: str
-) -> bool:
-    """Show an already-configured section summary and offer to skip.
-
-    Returns True if the user chose to skip, False if the section should run.
-    """
-    summary = _get_section_config_summary(config, section_key)
-    if not summary:
-        return False
-    print()
-    print_success(f"  {label}: {summary}")
-    return not prompt_yes_no(f"  Reconfigure {label.lower()}?", default=False)
-
-
 # =============================================================================
 # OpenClaw Migration
 # =============================================================================
@@ -3157,7 +3039,7 @@ def _offer_openclaw_migration(hermes_home: Path) -> bool:
            target_root=hermes_home.resolve(),
            execute=True,
            workspace_target=None,
-            overwrite=True,
+            overwrite=False,
            migrate_secrets=True,
            output_dir=None,
            selected_options=selected,
@@ -3313,8 +3195,6 @@ def run_setup_wizard(args):
        )
    )

-    migration_ran = False
-
    if is_existing:
        # ── Returning User Menu ──
        print()
@@ -3384,8 +3264,7 @@ def run_setup_wizard(args):
            return

        # Offer OpenClaw migration before configuration begins
-        migration_ran = _offer_openclaw_migration(hermes_home)
-        if migration_ran:
+        if _offer_openclaw_migration(hermes_home):
            # Reload config in case migration wrote to it
            config = load_config()

@@ -3398,31 +3277,20 @@ def run_setup_wizard(args):
    print()
    print_info("You can edit these files directly or use 'hermes config edit'")

-    if migration_ran:
-        print()
-        print_info("Settings were imported from OpenClaw.")
-        print_info("Each section below will show what was imported — press Enter to keep,")
-        print_info("or choose to reconfigure if needed.")
-
    # Section 1: Model & Provider
-    if not (migration_ran and _skip_configured_section(config, "model", "Model & Provider")):
-        setup_model_provider(config)
+    setup_model_provider(config)

    # Section 2: Terminal Backend
-    if not (migration_ran and _skip_configured_section(config, "terminal", "Terminal Backend")):
-        setup_terminal_backend(config)
+    setup_terminal_backend(config)

    # Section 3: Agent Settings
-    if not (migration_ran and _skip_configured_section(config, "agent", "Agent Settings")):
-        setup_agent_settings(config)
+    setup_agent_settings(config)

    # Section 4: Messaging Platforms
-    if not (migration_ran and _skip_configured_section(config, "gateway", "Messaging Platforms")):
-        setup_gateway(config)
+    setup_gateway(config)

    # Section 5: Tools
-    if not (migration_ran and _skip_configured_section(config, "tools", "Tools")):
-        setup_tools(config, first_install=not is_existing)
+    setup_tools(config, first_install=not is_existing)

    # Save and show summary
    save_config(config)
@@ -24,10 +24,6 @@ PLATFORMS = {
    "whatsapp": "📱 WhatsApp",
    "signal":   "📡 Signal",
    "email":    "📧 Email",
-    "homeassistant": "🏠 Home Assistant",
-    "mattermost": "💬 Mattermost",
-    "matrix":   "💬 Matrix",
-    "dingtalk": "💬 DingTalk",
 }

 # ─── Config Helpers ───────────────────────────────────────────────────────────
@@ -21,7 +21,6 @@ from rich.table import Table

 # Lazy imports to avoid circular dependencies and slow startup.
 # tools.skills_hub and tools.skills_guard are imported inside functions.
-from hermes_constants import display_hermes_home

 _console = Console()

@@ -305,8 +304,7 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all",


 def do_install(identifier: str, category: str = "", force: bool = False,
-               console: Optional[Console] = None, skip_confirm: bool = False,
-               invalidate_cache: bool = True) -> None:
+               console: Optional[Console] = None, skip_confirm: bool = False) -> None:
    """Fetch, quarantine, scan, confirm, and install a skill."""
    from tools.skills_hub import (
        GitHubAuth, create_source_router, ensure_hub_dirs,
@@ -389,7 +387,7 @@ def do_install(identifier: str, category: str = "", force: bool = False,
                "[bold bright_cyan]This is an official optional skill maintained by Nous Research.[/]\n\n"
                "It ships with hermes-agent but is not activated by default.\n"
                "Installing will copy it to your skills directory where the agent can use it.\n\n"
-                f"Files will be at: [cyan]{display_hermes_home()}/skills/{category + '/' if category else ''}{bundle.name}/[/]",
+                f"Files will be at: [cyan]~/.hermes/skills/{category + '/' if category else ''}{bundle.name}/[/]",
                title="Official Skill",
                border_style="bright_cyan",
            ))
@@ -399,7 +397,7 @@ def do_install(identifier: str, category: str = "", force: bool = False,
                "External skills can contain instructions that influence agent behavior,\n"
                "shell commands, and scripts. Even after automated scanning, you should\n"
                "review the installed files before use.\n\n"
-                f"Files will be at: [cyan]{display_hermes_home()}/skills/{category + '/' if category else ''}{bundle.name}/[/]",
+                f"Files will be at: [cyan]~/.hermes/skills/{category + '/' if category else ''}{bundle.name}/[/]",
                title="Disclaimer",
                border_style="yellow",
            ))
@@ -419,17 +417,6 @@ def do_install(identifier: str, category: str = "", force: bool = False,
    c.print(f"[bold green]Installed:[/] {install_dir.relative_to(SKILLS_DIR)}")
    c.print(f"[dim]Files: {', '.join(bundle.files.keys())}[/]\n")

-    if invalidate_cache:
-        # Invalidate the skills prompt cache so the new skill appears immediately
-        try:
-            from agent.prompt_builder import clear_skills_system_prompt_cache
-            clear_skills_system_prompt_cache(clear_snapshot=True)
-        except Exception:
-            pass
-    else:
-        c.print("[dim]Skill will be available in your next session.[/]")
-        c.print("[dim]Use /reset to start a new session now, or --now to activate immediately (invalidates prompt cache).[/]\n")
-

 def do_inspect(identifier: str, console: Optional[Console] = None) -> None:
    """Preview a skill's SKILL.md content without installing."""
@@ -616,8 +603,7 @@ def do_audit(name: Optional[str] = None, console: Optional[Console] = None) -> N


 def do_uninstall(name: str, console: Optional[Console] = None,
-                 skip_confirm: bool = False,
-                 invalidate_cache: bool = True) -> None:
+                 skip_confirm: bool = False) -> None:
    """Remove a hub-installed skill with confirmation."""
    from tools.skills_hub import uninstall_skill

@@ -637,15 +623,6 @@ def do_uninstall(name: str, console: Optional[Console] = None,
    success, msg = uninstall_skill(name)
    if success:
        c.print(f"[bold green]{msg}[/]\n")
-        if invalidate_cache:
-            try:
-                from agent.prompt_builder import clear_skills_system_prompt_cache
-                clear_skills_system_prompt_cache(clear_snapshot=True)
-            except Exception:
-                pass
-        else:
-            c.print("[dim]Change will take effect in your next session.[/]")
-            c.print("[dim]Use /reset to start a new session now, or --now to apply immediately (invalidates prompt cache).[/]\n")
    else:
        c.print(f"[bold red]Error:[/] {msg}\n")

@@ -745,7 +722,7 @@ def do_publish(skill_path: str, target: str = "github", repo: str = "",
        auth = GitHubAuth()
        if not auth.is_authenticated():
            c.print("[bold red]Error:[/] GitHub authentication required.\n"
-                    f"Set GITHUB_TOKEN in {display_hermes_home()}/.env or run 'gh auth login'.\n")
+                    "Set GITHUB_TOKEN in ~/.hermes/.env or run 'gh auth login'.\n")
            return

        c.print(f"[bold]Publishing '{name}' to {repo}...[/]")
@@ -888,15 +865,10 @@ def do_snapshot_export(output_path: str, console: Optional[Console] = None) -> N
        "taps": tap_list,
    }

-    payload = json.dumps(snapshot, indent=2, ensure_ascii=False) + "\n"
-    if output_path == "-":
-        import sys
-        sys.stdout.write(payload)
-    else:
-        out = Path(output_path)
-        out.write_text(payload)
-        c.print(f"[bold green]Snapshot exported:[/] {out}")
-        c.print(f"[dim]{len(installed)} skill(s), {len(tap_list)} tap(s)[/]\n")
+    out = Path(output_path)
+    out.write_text(json.dumps(snapshot, indent=2, ensure_ascii=False) + "\n")
+    c.print(f"[bold green]Snapshot exported:[/] {out}")
+    c.print(f"[dim]{len(installed)} skill(s), {len(tap_list)} tap(s)[/]\n")


 def do_snapshot_import(input_path: str, force: bool = False,
@@ -1087,23 +1059,19 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:

    elif action == "install":
        if not args:
-            c.print("[bold red]Usage:[/] /skills install <identifier> [--category <cat>] [--force] [--now]\n")
+            c.print("[bold red]Usage:[/] /skills install <identifier> [--category <cat>] [--force|--yes]\n")
            return
        identifier = args[0]
        category = ""
-        # Slash commands run inside prompt_toolkit where input() hangs.
-        # Always skip confirmation — the user typing the command is implicit consent.
-        skip_confirm = True
+        # --yes / -y bypasses confirmation prompt (needed in TUI mode)
+        # --force handles reinstall override
+        skip_confirm = any(flag in args for flag in ("--yes", "-y"))
        force = "--force" in args
-        # --now invalidates prompt cache immediately (costs more money).
-        # Default: defer to next session to preserve cache.
-        invalidate_cache = "--now" in args
        for i, a in enumerate(args):
            if a == "--category" and i + 1 < len(args):
                category = args[i + 1]
        do_install(identifier, category=category, force=force,
-                   skip_confirm=skip_confirm, invalidate_cache=invalidate_cache,
-                   console=c)
+                   skip_confirm=skip_confirm, console=c)

    elif action == "inspect":
        if not args:
@@ -1133,13 +1101,10 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:

    elif action == "uninstall":
        if not args:
-            c.print("[bold red]Usage:[/] /skills uninstall <name> [--now]\n")
+            c.print("[bold red]Usage:[/] /skills uninstall <name> [--yes]\n")
            return
-        # Slash commands run inside prompt_toolkit where input() hangs.
-        skip_confirm = True
-        invalidate_cache = "--now" in args
-        do_uninstall(args[0], console=c, skip_confirm=skip_confirm,
-                     invalidate_cache=invalidate_cache)
+        skip_confirm = any(flag in args for flag in ("--yes", "-y"))
+        do_uninstall(args[0], console=c, skip_confirm=skip_confirm)

    elif action == "publish":
        if not args:
@@ -292,9 +292,8 @@ def show_status(args):
        print("  Manager:      systemd (user)")
        
    elif sys.platform == 'darwin':
-        from hermes_cli.gateway import get_launchd_label
        result = subprocess.run(
-            ["launchctl", "list", get_launchd_label()],
+            ["launchctl", "list", "ai.hermes.gateway"],
            capture_output=True,
            text=True
        )
@@ -108,8 +108,7 @@ def _get_effective_configurable_toolsets():
    """
    result = list(CONFIGURABLE_TOOLSETS)
    try:
-        from hermes_cli.plugins import discover_plugins, get_plugin_toolsets
-        discover_plugins()  # idempotent — ensures plugins are loaded
+        from hermes_cli.plugins import get_plugin_toolsets
        result.extend(get_plugin_toolsets())
    except Exception:
        pass
@@ -119,8 +118,7 @@ def _get_effective_configurable_toolsets():
 def _get_plugin_toolset_keys() -> set:
    """Return the set of toolset keys provided by plugins."""
    try:
-        from hermes_cli.plugins import discover_plugins, get_plugin_toolsets
-        discover_plugins()  # idempotent — ensures plugins are loaded
+        from hermes_cli.plugins import get_plugin_toolsets
        return {ts_key for ts_key, _, _ in get_plugin_toolsets()}
    except Exception:
        return set()
@@ -135,10 +133,8 @@ PLATFORMS = {
    "signal":   {"label": "📡 Signal",     "default_toolset": "hermes-signal"},
    "homeassistant": {"label": "🏠 Home Assistant", "default_toolset": "hermes-homeassistant"},
    "email":    {"label": "📧 Email",      "default_toolset": "hermes-email"},
-    "matrix":   {"label": "💬 Matrix",     "default_toolset": "hermes-matrix"},
    "dingtalk": {"label": "💬 DingTalk",   "default_toolset": "hermes-dingtalk"},
    "api_server": {"label": "🌐 API Server", "default_toolset": "hermes-api-server"},
-    "mattermost": {"label": "💬 Mattermost", "default_toolset": "hermes-mattermost"},
 }


@@ -190,14 +186,6 @@ TOOL_CATEGORIES = {
                    {"key": "FIRECRAWL_API_KEY", "prompt": "Firecrawl API key", "url": "https://firecrawl.dev"},
                ],
            },
-            {
-                "name": "Exa",
-                "tag": "AI-native search and contents",
-                "web_backend": "exa",
-                "env_vars": [
-                    {"key": "EXA_API_KEY", "prompt": "Exa API key", "url": "https://exa.ai"},
-                ],
-            },
            {
                "name": "Parallel",
                "tag": "AI-native search and extract",
@@ -326,8 +314,7 @@ def _run_post_setup(post_setup_key: str):
            if result.returncode == 0:
                _print_success("    Node.js dependencies installed")
            else:
-                from hermes_constants import display_hermes_home
-                _print_warning(f"    npm install failed - run manually: cd {display_hermes_home()}/hermes-agent && npm install")
+                _print_warning("    npm install failed - run manually: cd ~/.hermes/hermes-agent && npm install")
        elif not node_modules.exists():
            _print_warning("    Node.js not found - browser tools require: npm install (in hermes-agent directory)")

@@ -1265,8 +1252,7 @@ def tools_command(args=None, first_install: bool = False, config: dict = None):
        platform_choices[idx] = f"Configure {pinfo['label']}  ({new_count}/{total} enabled)"

    print()
-    from hermes_constants import display_hermes_home
-    print(color(f"  Tool configuration saved to {display_hermes_home()}/config.yaml", Colors.DIM))
+    print(color("  Tool configuration saved to ~/.hermes/config.yaml", Colors.DIM))
    print(color("  Changes take effect on next 'hermes' or gateway restart.", Colors.DIM))
    print()

@@ -1,260 +0,0 @@
-"""hermes webhook — manage dynamic webhook subscriptions from the CLI.
-
-Usage:
-    hermes webhook subscribe <name> [options]
-    hermes webhook list
-    hermes webhook remove <name>
-    hermes webhook test <name> [--payload '{"key": "value"}']
-
-Subscriptions persist to ~/.hermes/webhook_subscriptions.json and are
-hot-reloaded by the webhook adapter without a gateway restart.
-"""
-
-import json
-import os
-import re
-import secrets
-import time
-from pathlib import Path
-from typing import Dict, Optional
-
-from hermes_constants import display_hermes_home
-
-
-_SUBSCRIPTIONS_FILENAME = "webhook_subscriptions.json"
-
-
-def _hermes_home() -> Path:
-    return Path(
-        os.getenv("HERMES_HOME", str(Path.home() / ".hermes"))
-    ).expanduser()
-
-
-def _subscriptions_path() -> Path:
-    return _hermes_home() / _SUBSCRIPTIONS_FILENAME
-
-
-def _load_subscriptions() -> Dict[str, dict]:
-    path = _subscriptions_path()
-    if not path.exists():
-        return {}
-    try:
-        data = json.loads(path.read_text(encoding="utf-8"))
-        return data if isinstance(data, dict) else {}
-    except Exception:
-        return {}
-
-
-def _save_subscriptions(subs: Dict[str, dict]) -> None:
-    path = _subscriptions_path()
-    path.parent.mkdir(parents=True, exist_ok=True)
-    tmp_path = path.with_suffix(".tmp")
-    tmp_path.write_text(
-        json.dumps(subs, indent=2, ensure_ascii=False),
-        encoding="utf-8",
-    )
-    os.replace(str(tmp_path), str(path))
-
-
-def _get_webhook_config() -> dict:
-    """Load webhook platform config. Returns {} if not configured."""
-    try:
-        from hermes_cli.config import load_config
-        cfg = load_config()
-        return cfg.get("platforms", {}).get("webhook", {})
-    except Exception:
-        return {}
-
-
-def _is_webhook_enabled() -> bool:
-    return bool(_get_webhook_config().get("enabled"))
-
-
-def _get_webhook_base_url() -> str:
-    wh = _get_webhook_config().get("extra", {})
-    host = wh.get("host", "0.0.0.0")
-    port = wh.get("port", 8644)
-    display_host = "localhost" if host == "0.0.0.0" else host
-    return f"http://{display_host}:{port}"
-
-
-def _setup_hint() -> str:
-    _dhh = display_hermes_home()
-    return f"""
-  Webhook platform is not enabled. To set it up:
-
-  1. Run the gateway setup wizard:
-     hermes gateway setup
-
-  2. Or manually add to {_dhh}/config.yaml:
-     platforms:
-       webhook:
-         enabled: true
-         extra:
-           host: "0.0.0.0"
-           port: 8644
-           secret: "your-global-hmac-secret"
-
-  3. Or set environment variables in {_dhh}/.env:
-     WEBHOOK_ENABLED=true
-     WEBHOOK_PORT=8644
-     WEBHOOK_SECRET=your-global-secret
-
-  Then start the gateway: hermes gateway run
-"""
-
-
-def _require_webhook_enabled() -> bool:
-    """Check webhook is enabled. Print setup guide and return False if not."""
-    if _is_webhook_enabled():
-        return True
-    print(_setup_hint())
-    return False
-
-
-def webhook_command(args):
-    """Entry point for 'hermes webhook' subcommand."""
-    sub = getattr(args, "webhook_action", None)
-
-    if not sub:
-        print("Usage: hermes webhook {subscribe|list|remove|test}")
-        print("Run 'hermes webhook --help' for details.")
-        return
-
-    if not _require_webhook_enabled():
-        return
-
-    if sub in ("subscribe", "add"):
-        _cmd_subscribe(args)
-    elif sub in ("list", "ls"):
-        _cmd_list(args)
-    elif sub in ("remove", "rm"):
-        _cmd_remove(args)
-    elif sub == "test":
-        _cmd_test(args)
-
-
-def _cmd_subscribe(args):
-    name = args.name.strip().lower().replace(" ", "-")
-    if not re.match(r'^[a-z0-9][a-z0-9_-]*$', name):
-        print(f"Error: Invalid name '{name}'. Use lowercase alphanumeric with hyphens/underscores.")
-        return
-
-    subs = _load_subscriptions()
-    is_update = name in subs
-
-    secret = args.secret or secrets.token_urlsafe(32)
-    events = [e.strip() for e in args.events.split(",")] if args.events else []
-
-    route = {
-        "description": args.description or f"Agent-created subscription: {name}",
-        "events": events,
-        "secret": secret,
-        "prompt": args.prompt or "",
-        "skills": [s.strip() for s in args.skills.split(",")] if args.skills else [],
-        "deliver": args.deliver or "log",
-        "created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
-    }
-
-    if args.deliver_chat_id:
-        route["deliver_extra"] = {"chat_id": args.deliver_chat_id}
-
-    subs[name] = route
-    _save_subscriptions(subs)
-
-    base_url = _get_webhook_base_url()
-    status = "Updated" if is_update else "Created"
-
-    print(f"\n  {status} webhook subscription: {name}")
-    print(f"  URL:    {base_url}/webhooks/{name}")
-    print(f"  Secret: {secret}")
-    if events:
-        print(f"  Events: {', '.join(events)}")
-    else:
-        print("  Events: (all)")
-    print(f"  Deliver: {route['deliver']}")
-    if route.get("prompt"):
-        prompt_preview = route["prompt"][:80] + ("..." if len(route["prompt"]) > 80 else "")
-        print(f"  Prompt: {prompt_preview}")
-    print(f"\n  Configure your service to POST to the URL above.")
-    print(f"  Use the secret for HMAC-SHA256 signature validation.")
-    print(f"  The gateway must be running to receive events (hermes gateway run).\n")
-
-
-def _cmd_list(args):
-    subs = _load_subscriptions()
-    if not subs:
-        print("  No dynamic webhook subscriptions.")
-        print("  Create one with: hermes webhook subscribe <name>")
-        return
-
-    base_url = _get_webhook_base_url()
-    print(f"\n  {len(subs)} webhook subscription(s):\n")
-    for name, route in subs.items():
-        events = ", ".join(route.get("events", [])) or "(all)"
-        deliver = route.get("deliver", "log")
-        desc = route.get("description", "")
-        print(f"  ◆ {name}")
-        if desc:
-            print(f"    {desc}")
-        print(f"    URL:     {base_url}/webhooks/{name}")
-        print(f"    Events:  {events}")
-        print(f"    Deliver: {deliver}")
-        print()
-
-
-def _cmd_remove(args):
-    name = args.name.strip().lower()
-    subs = _load_subscriptions()
-
-    if name not in subs:
-        print(f"  No subscription named '{name}'.")
-        print("  Note: Static routes from config.yaml cannot be removed here.")
-        return
-
-    del subs[name]
-    _save_subscriptions(subs)
-    print(f"  Removed webhook subscription: {name}")
-
-
-def _cmd_test(args):
-    """Send a test POST to a webhook route."""
-    name = args.name.strip().lower()
-    subs = _load_subscriptions()
-
-    if name not in subs:
-        print(f"  No subscription named '{name}'.")
-        return
-
-    route = subs[name]
-    secret = route.get("secret", "")
-    base_url = _get_webhook_base_url()
-    url = f"{base_url}/webhooks/{name}"
-
-    payload = args.payload or '{"test": true, "event_type": "test", "message": "Hello from hermes webhook test"}'
-
-    import hmac
-    import hashlib
-    sig = "sha256=" + hmac.new(
-        secret.encode(), payload.encode(), hashlib.sha256
-    ).hexdigest()
-
-    print(f"  Sending test POST to {url}")
-    try:
-        import urllib.request
-        req = urllib.request.Request(
-            url,
-            data=payload.encode(),
-            headers={
-                "Content-Type": "application/json",
-                "X-Hub-Signature-256": sig,
-                "X-GitHub-Event": "test",
-            },
-            method="POST",
-        )
-        with urllib.request.urlopen(req, timeout=10) as resp:
-            body = resp.read().decode()
-            print(f"  Response ({resp.status}): {body}")
-    except Exception as e:
-        print(f"  Error: {e}")
-        print("  Is the gateway running? (hermes gateway run)")
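Note on the removed `_cmd_test` helper: it signs the payload with HMAC-SHA256 and sends the result in `X-Hub-Signature-256`. The receiving side is not part of this hunk, so the following is only a minimal sketch of how such a signature could be checked. The function names `sign_payload` and `verify_signature` are hypothetical and do not come from the repository.

```python
import hashlib
import hmac


def sign_payload(secret: str, payload: bytes) -> str:
    """Build a 'sha256=<hex>' value the same way the removed test command did."""
    digest = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(secret: str, payload: bytes, header_value: str) -> bool:
    """Recompute the signature and compare it in constant time.

    hmac.compare_digest avoids leaking how many leading characters matched.
    It expects ASCII-only strings, which holds for hex digests.
    """
    expected = sign_payload(secret, payload)
    return hmac.compare_digest(expected, header_value)
```

A receiver would call `verify_signature` on the raw request body before parsing the JSON, since re-serializing the body can change its bytes and invalidate the signature.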
@@ -17,47 +17,6 @@ def get_hermes_home() -> Path:
    return Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))


-def get_hermes_dir(new_subpath: str, old_name: str) -> Path:
-    """Resolve a Hermes subdirectory with backward compatibility.
-
-    New installs get the consolidated layout (e.g. ``cache/images``).
-    Existing installs that already have the old path (e.g. ``image_cache``)
-    keep using it — no migration required.
-
-    Args:
-        new_subpath: Preferred path relative to HERMES_HOME (e.g. ``"cache/images"``).
-        old_name: Legacy path relative to HERMES_HOME (e.g. ``"image_cache"``).
-
-    Returns:
-        Absolute ``Path`` — old location if it exists on disk, otherwise the new one.
-    """
-    home = get_hermes_home()
-    old_path = home / old_name
-    if old_path.exists():
-        return old_path
-    return home / new_subpath
-
-
-def display_hermes_home() -> str:
-    """Return a user-friendly display string for the current HERMES_HOME.
-
-    Uses ``~/`` shorthand for readability::
-
-        default:  ``~/.hermes``
-        profile:  ``~/.hermes/profiles/coder``
-        custom:   ``/opt/hermes-custom``
-
-    Use this in **user-facing** print/log messages instead of hardcoding
-    ``~/.hermes``.  For code that needs a real ``Path``, use
-    :func:`get_hermes_home` instead.
-    """
-    home = get_hermes_home()
-    try:
-        return "~/" + str(home.relative_to(Path.home()))
-    except ValueError:
-        return str(home)
-
-
 VALID_REASONING_EFFORTS = ("xhigh", "high", "medium", "low", "minimal")


@@ -15,20 +15,15 @@ Key design decisions:
 """

 import json
-import logging
 import os
-import random
 import re
 import sqlite3
 import threading
 import time
 from pathlib import Path
 from hermes_constants import get_hermes_home
-from typing import Any, Callable, Dict, List, Optional, TypeVar
+from typing import Dict, Any, List, Optional

-logger = logging.getLogger(__name__)
-
-T = TypeVar("T")

 DEFAULT_DB_PATH = get_hermes_home() / "state.db"

@@ -121,38 +116,18 @@ class SessionDB:
    single writer via WAL mode). Each method opens its own cursor.
    """

-    # ── Write-contention tuning ──
-    # With multiple hermes processes (gateway + CLI sessions + worktree agents)
-    # all sharing one state.db, WAL write-lock contention causes visible TUI
-    # freezes.  SQLite's built-in busy handler uses a deterministic sleep
-    # schedule that causes convoy effects under high concurrency.
-    #
-    # Instead, we keep the SQLite timeout short (1s) and handle retries at the
-    # application level with random jitter, which naturally staggers competing
-    # writers and avoids the convoy.
-    _WRITE_MAX_RETRIES = 15
-    _WRITE_RETRY_MIN_S = 0.020   # 20ms
-    _WRITE_RETRY_MAX_S = 0.150   # 150ms
-    # Attempt a PASSIVE WAL checkpoint every N successful writes.
-    _CHECKPOINT_EVERY_N_WRITES = 50
-
    def __init__(self, db_path: Path = None):
        self.db_path = db_path or DEFAULT_DB_PATH
        self.db_path.parent.mkdir(parents=True, exist_ok=True)

        self._lock = threading.Lock()
-        self._write_count = 0
        self._conn = sqlite3.connect(
            str(self.db_path),
            check_same_thread=False,
-            # Short timeout — application-level retry with random jitter
-            # handles contention instead of sitting in SQLite's internal
-            # busy handler for up to 30s.
-            timeout=1.0,
-            # Autocommit mode: Python's default isolation_level="" auto-starts
-            # transactions on DML, which conflicts with our explicit
-            # BEGIN IMMEDIATE.  None = we manage transactions ourselves.
-            isolation_level=None,
+            # 30s gives the WAL writer (CLI or gateway) time to finish a batch
+            # flush before the concurrent reader/writer gives up.  10s was too
+            # short when the CLI is doing frequent memory flushes.
+            timeout=30.0,
        )
        self._conn.row_factory = sqlite3.Row
        self._conn.execute("PRAGMA journal_mode=WAL")
@@ -160,96 +135,6 @@ class SessionDB:

        self._init_schema()

-    # ── Core write helper ──
-
-    def _execute_write(self, fn: Callable[[sqlite3.Connection], T]) -> T:
-        """Execute a write transaction with BEGIN IMMEDIATE and jitter retry.
-
-        *fn* receives the connection and should perform INSERT/UPDATE/DELETE
-        statements.  The caller must NOT call ``commit()`` — that's handled
-        here after *fn* returns.
-
-        BEGIN IMMEDIATE acquires the WAL write lock at transaction start
-        (not at commit time), so lock contention surfaces immediately.
-        On ``database is locked``, we release the Python lock, sleep a
-        random 20-150ms, and retry — breaking the convoy pattern that
-        SQLite's built-in deterministic backoff creates.
-
-        Returns whatever *fn* returns.
-        """
-        last_err: Optional[Exception] = None
-        for attempt in range(self._WRITE_MAX_RETRIES):
-            try:
-                with self._lock:
-                    self._conn.execute("BEGIN IMMEDIATE")
-                    try:
-                        result = fn(self._conn)
-                        self._conn.commit()
-                    except BaseException:
-                        try:
-                            self._conn.rollback()
-                        except Exception:
-                            pass
-                        raise
-                # Success — periodic best-effort checkpoint.
-                self._write_count += 1
-                if self._write_count % self._CHECKPOINT_EVERY_N_WRITES == 0:
-                    self._try_wal_checkpoint()
-                return result
-            except sqlite3.OperationalError as exc:
-                err_msg = str(exc).lower()
-                if "locked" in err_msg or "busy" in err_msg:
-                    last_err = exc
-                    if attempt < self._WRITE_MAX_RETRIES - 1:
-                        jitter = random.uniform(
-                            self._WRITE_RETRY_MIN_S,
-                            self._WRITE_RETRY_MAX_S,
-                        )
-                        time.sleep(jitter)
-                        continue
-                # Non-lock error or retries exhausted — propagate.
-                raise
-        # Retries exhausted (shouldn't normally reach here).
-        raise last_err or sqlite3.OperationalError(
-            "database is locked after max retries"
-        )
-
-    def _try_wal_checkpoint(self) -> None:
-        """Best-effort PASSIVE WAL checkpoint.  Never blocks, never raises.
-
-        Flushes committed WAL frames back into the main DB file for any
-        frames that no other connection currently needs.  Keeps the WAL
-        from growing unbounded when many processes hold persistent
-        connections.
-        """
-        try:
-            with self._lock:
-                result = self._conn.execute(
-                    "PRAGMA wal_checkpoint(PASSIVE)"
-                ).fetchone()
-                if result and result[1] > 0:
-                    logger.debug(
-                        "WAL checkpoint: %d/%d pages checkpointed",
-                        result[2], result[1],
-                    )
-        except Exception:
-            pass  # Best effort — never fatal.
-
-    def close(self):
-        """Close the database connection.
-
-        Attempts a PASSIVE WAL checkpoint first so that exiting processes
-        help keep the WAL file from growing unbounded.
-        """
-        with self._lock:
-            if self._conn:
-                try:
-                    self._conn.execute("PRAGMA wal_checkpoint(PASSIVE)")
-                except Exception:
-                    pass
-                self._conn.close()
-                self._conn = None
-
    def _init_schema(self):
        """Create tables and FTS if they don't exist, run migrations."""
        cursor = self._conn.cursor()
@@ -371,8 +256,8 @@ class SessionDB:
        parent_session_id: str = None,
    ) -> str:
        """Create a new session record. Returns the session_id."""
-        def _do(conn):
-            conn.execute(
+        with self._lock:
+            self._conn.execute(
                """INSERT OR IGNORE INTO sessions (id, source, user_id, model, model_config,
                   system_prompt, parent_session_id, started_at)
                   VALUES (?, ?, ?, ?, ?, ?, ?, ?)""",
@@ -387,35 +272,26 @@ class SessionDB:
                    time.time(),
                ),
            )
-        self._execute_write(_do)
+            self._conn.commit()
        return session_id

    def end_session(self, session_id: str, end_reason: str) -> None:
        """Mark a session as ended."""
-        def _do(conn):
-            conn.execute(
+        with self._lock:
+            self._conn.execute(
                "UPDATE sessions SET ended_at = ?, end_reason = ? WHERE id = ?",
                (time.time(), end_reason, session_id),
            )
-        self._execute_write(_do)
-
-    def reopen_session(self, session_id: str) -> None:
-        """Clear ended_at/end_reason so a session can be resumed."""
-        def _do(conn):
-            conn.execute(
-                "UPDATE sessions SET ended_at = NULL, end_reason = NULL WHERE id = ?",
-                (session_id,),
-            )
-        self._execute_write(_do)
+            self._conn.commit()

    def update_system_prompt(self, session_id: str, system_prompt: str) -> None:
        """Store the full assembled system prompt snapshot."""
-        def _do(conn):
-            conn.execute(
+        with self._lock:
+            self._conn.execute(
                "UPDATE sessions SET system_prompt = ? WHERE id = ?",
                (system_prompt, session_id),
            )
-        self._execute_write(_do)
+            self._conn.commit()

    def update_token_counts(
        self,
@@ -434,39 +310,11 @@ class SessionDB:
        billing_provider: Optional[str] = None,
        billing_base_url: Optional[str] = None,
        billing_mode: Optional[str] = None,
-        absolute: bool = False,
    ) -> None:
-        """Update token counters and backfill model if not already set.
-
-        When *absolute* is False (default), values are **incremented** — use
-        this for per-API-call deltas (CLI path).
-
-        When *absolute* is True, values are **set directly** — use this when
-        the caller already holds cumulative totals (gateway path, where the
-        cached agent accumulates across messages).
-        """
-        if absolute:
-            sql = """UPDATE sessions SET
-                   input_tokens = ?,
-                   output_tokens = ?,
-                   cache_read_tokens = ?,
-                   cache_write_tokens = ?,
-                   reasoning_tokens = ?,
-                   estimated_cost_usd = COALESCE(?, 0),
-                   actual_cost_usd = CASE
-                       WHEN ? IS NULL THEN actual_cost_usd
-                       ELSE ?
-                   END,
-                   cost_status = COALESCE(?, cost_status),
-                   cost_source = COALESCE(?, cost_source),
-                   pricing_version = COALESCE(?, pricing_version),
-                   billing_provider = COALESCE(billing_provider, ?),
-                   billing_base_url = COALESCE(billing_base_url, ?),
-                   billing_mode = COALESCE(billing_mode, ?),
-                   model = COALESCE(model, ?)
-                   WHERE id = ?"""
-        else:
-            sql = """UPDATE sessions SET
+        """Increment token counters and backfill model if not already set."""
+        with self._lock:
+            self._conn.execute(
+                """UPDATE sessions SET
                   input_tokens = input_tokens + ?,
                   output_tokens = output_tokens + ?,
                   cache_read_tokens = cache_read_tokens + ?,
@@ -484,94 +332,6 @@ class SessionDB:
                   billing_base_url = COALESCE(billing_base_url, ?),
                   billing_mode = COALESCE(billing_mode, ?),
                   model = COALESCE(model, ?)
-                   WHERE id = ?"""
-        params = (
-            input_tokens,
-            output_tokens,
-            cache_read_tokens,
-            cache_write_tokens,
-            reasoning_tokens,
-            estimated_cost_usd,
-            actual_cost_usd,
-            actual_cost_usd,
-            cost_status,
-            cost_source,
-            pricing_version,
-            billing_provider,
-            billing_base_url,
-            billing_mode,
-            model,
-            session_id,
-        )
-        def _do(conn):
-            conn.execute(sql, params)
-        self._execute_write(_do)
-
-    def ensure_session(
-        self,
-        session_id: str,
-        source: str = "unknown",
-        model: str = None,
-    ) -> None:
-        """Ensure a session row exists, creating it with minimal metadata if absent.
-
-        Used by _flush_messages_to_session_db to recover from a failed
-        create_session() call (e.g. transient SQLite lock at agent startup).
-        INSERT OR IGNORE is safe to call even when the row already exists.
-        """
-        def _do(conn):
-            conn.execute(
-                """INSERT OR IGNORE INTO sessions
-                   (id, source, model, started_at)
-                   VALUES (?, ?, ?, ?)""",
-                (session_id, source, model, time.time()),
-            )
-        self._execute_write(_do)
-
-    def set_token_counts(
-        self,
-        session_id: str,
-        input_tokens: int = 0,
-        output_tokens: int = 0,
-        model: str = None,
-        cache_read_tokens: int = 0,
-        cache_write_tokens: int = 0,
-        reasoning_tokens: int = 0,
-        estimated_cost_usd: Optional[float] = None,
-        actual_cost_usd: Optional[float] = None,
-        cost_status: Optional[str] = None,
-        cost_source: Optional[str] = None,
-        pricing_version: Optional[str] = None,
-        billing_provider: Optional[str] = None,
-        billing_base_url: Optional[str] = None,
-        billing_mode: Optional[str] = None,
-    ) -> None:
-        """Set token counters to absolute values (not increment).
-
-        Use this when the caller provides cumulative totals from a completed
-        conversation run (e.g. the gateway, where the cached agent's
-        session_prompt_tokens already reflects the running total).
-        """
-        def _do(conn):
-            conn.execute(
-                """UPDATE sessions SET
-                   input_tokens = ?,
-                   output_tokens = ?,
-                   cache_read_tokens = ?,
-                   cache_write_tokens = ?,
-                   reasoning_tokens = ?,
-                   estimated_cost_usd = ?,
-                   actual_cost_usd = CASE
-                       WHEN ? IS NULL THEN actual_cost_usd
-                       ELSE ?
-                   END,
-                   cost_status = COALESCE(?, cost_status),
-                   cost_source = COALESCE(?, cost_source),
-                   pricing_version = COALESCE(?, pricing_version),
-                   billing_provider = COALESCE(billing_provider, ?),
-                   billing_base_url = COALESCE(billing_base_url, ?),
-                   billing_mode = COALESCE(billing_mode, ?),
-                   model = COALESCE(model, ?)
                   WHERE id = ?""",
                (
                    input_tokens,
@@ -592,7 +352,28 @@ class SessionDB:
                    session_id,
                ),
            )
-        self._execute_write(_do)
+            self._conn.commit()
+
+    def ensure_session(
+        self,
+        session_id: str,
+        source: str = "unknown",
+        model: str = None,
+    ) -> None:
+        """Ensure a session row exists, creating it with minimal metadata if absent.
+
+        Used by _flush_messages_to_session_db to recover from a failed
+        create_session() call (e.g. transient SQLite lock at agent startup).
+        INSERT OR IGNORE is safe to call even when the row already exists.
+        """
+        with self._lock:
+            self._conn.execute(
+                """INSERT OR IGNORE INTO sessions
+                   (id, source, model, started_at)
+                   VALUES (?, ?, ?, ?)""",
+                (session_id, source, model, time.time()),
+            )
+            self._conn.commit()

    def get_session(self, session_id: str) -> Optional[Dict[str, Any]]:
        """Get a session by ID."""
@@ -686,10 +467,10 @@ class SessionDB:
        Empty/whitespace-only strings are normalized to None (clearing the title).
        """
        title = self.sanitize_title(title)
-        def _do(conn):
+        with self._lock:
            if title:
                # Check uniqueness (allow the same session to keep its own title)
-                cursor = conn.execute(
+                cursor = self._conn.execute(
                    "SELECT id FROM sessions WHERE title = ? AND id != ?",
                    (title, session_id),
                )
@@ -698,12 +479,12 @@ class SessionDB:
                    raise ValueError(
                        f"Title '{title}' is already in use by session {conflict['id']}"
                    )
-            cursor = conn.execute(
+            cursor = self._conn.execute(
                "UPDATE sessions SET title = ? WHERE id = ?",
                (title, session_id),
            )
-            return cursor.rowcount
-        rowcount = self._execute_write(_do)
+            self._conn.commit()
+            rowcount = cursor.rowcount
        return rowcount > 0

    def get_session_title(self, session_id: str) -> Optional[str]:
@@ -875,24 +656,17 @@ class SessionDB:
        Also increments the session's message_count (and tool_call_count
        if role is 'tool' or tool_calls is present).
        """
-        # Serialize structured fields to JSON before entering the write txn
-        reasoning_details_json = (
-            json.dumps(reasoning_details)
-            if reasoning_details else None
-        )
-        codex_items_json = (
-            json.dumps(codex_reasoning_items)
-            if codex_reasoning_items else None
-        )
-        tool_calls_json = json.dumps(tool_calls) if tool_calls else None
-
-        # Pre-compute tool call count
-        num_tool_calls = 0
-        if tool_calls is not None:
-            num_tool_calls = len(tool_calls) if isinstance(tool_calls, list) else 1
-
-        def _do(conn):
-            cursor = conn.execute(
+        with self._lock:
+            # Serialize structured fields to JSON for storage
+            reasoning_details_json = (
+                json.dumps(reasoning_details)
+                if reasoning_details else None
+            )
+            codex_items_json = (
+                json.dumps(codex_reasoning_items)
+                if codex_reasoning_items else None
+            )
+            cursor = self._conn.execute(
                """INSERT INTO messages (session_id, role, content, tool_call_id,
                   tool_calls, tool_name, timestamp, token_count, finish_reason,
                   reasoning, reasoning_details, codex_reasoning_items)
@@ -902,7 +676,7 @@ class SessionDB:
                    role,
                    content,
                    tool_call_id,
-                    tool_calls_json,
+                    json.dumps(tool_calls) if tool_calls else None,
                    tool_name,
                    time.time(),
                    token_count,
@@ -915,20 +689,25 @@ class SessionDB:
            msg_id = cursor.lastrowid

            # Update counters
+            # Count actual tool calls from the tool_calls list (not from tool responses).
+            # A single assistant message can contain multiple parallel tool calls.
+            num_tool_calls = 0
+            if tool_calls is not None:
+                num_tool_calls = len(tool_calls) if isinstance(tool_calls, list) else 1
            if num_tool_calls > 0:
-                conn.execute(
+                self._conn.execute(
                    """UPDATE sessions SET message_count = message_count + 1,
                       tool_call_count = tool_call_count + ? WHERE id = ?""",
                    (num_tool_calls, session_id),
                )
            else:
-                conn.execute(
+                self._conn.execute(
                    "UPDATE sessions SET message_count = message_count + 1 WHERE id = ?",
                    (session_id,),
                )
-            return msg_id

-        return self._execute_write(_do)
+            self._conn.commit()
+        return msg_id

    def get_messages(self, session_id: str) -> List[Dict[str, Any]]:
        """Load all messages for a session, ordered by timestamp."""
@@ -1222,53 +1001,54 @@ class SessionDB:

    def clear_messages(self, session_id: str) -> None:
        """Delete all messages for a session and reset its counters."""
-        def _do(conn):
-            conn.execute(
+        with self._lock:
+            self._conn.execute(
                "DELETE FROM messages WHERE session_id = ?", (session_id,)
            )
-            conn.execute(
+            self._conn.execute(
                "UPDATE sessions SET message_count = 0, tool_call_count = 0 WHERE id = ?",
                (session_id,),
            )
-        self._execute_write(_do)
+            self._conn.commit()

    def delete_session(self, session_id: str) -> bool:
        """Delete a session and all its messages. Returns True if found."""
-        def _do(conn):
-            cursor = conn.execute(
+        with self._lock:
+            cursor = self._conn.execute(
                "SELECT COUNT(*) FROM sessions WHERE id = ?", (session_id,)
            )
            if cursor.fetchone()[0] == 0:
                return False
-            conn.execute("DELETE FROM messages WHERE session_id = ?", (session_id,))
-            conn.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
+            self._conn.execute("DELETE FROM messages WHERE session_id = ?", (session_id,))
+            self._conn.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
+            self._conn.commit()
            return True
-        return self._execute_write(_do)

    def prune_sessions(self, older_than_days: int = 90, source: str = None) -> int:
        """
        Delete sessions older than N days. Returns count of deleted sessions.
        Only prunes ended sessions (not active ones).
        """
-        cutoff = time.time() - (older_than_days * 86400)
+        import time as _time
+        cutoff = _time.time() - (older_than_days * 86400)

-        def _do(conn):
+        with self._lock:
            if source:
-                cursor = conn.execute(
+                cursor = self._conn.execute(
                    """SELECT id FROM sessions
                       WHERE started_at < ? AND ended_at IS NOT NULL AND source = ?""",
                    (cutoff, source),
                )
            else:
-                cursor = conn.execute(
+                cursor = self._conn.execute(
                    "SELECT id FROM sessions WHERE started_at < ? AND ended_at IS NOT NULL",
                    (cutoff,),
                )
            session_ids = [row["id"] for row in cursor.fetchall()]

            for sid in session_ids:
-                conn.execute("DELETE FROM messages WHERE session_id = ?", (sid,))
-                conn.execute("DELETE FROM sessions WHERE id = ?", (sid,))
-            return len(session_ids)
+                self._conn.execute("DELETE FROM messages WHERE session_id = ?", (sid,))
+                self._conn.execute("DELETE FROM sessions WHERE id = ?", (sid,))

-        return self._execute_write(_do)
+            self._conn.commit()
+        return len(session_ids)
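Note on the removed `_execute_write` helper: the hunks above replace application-level retry with a plain `with self._lock` plus a 30s SQLite timeout. For reviewers comparing the two approaches, here is a simplified, standalone sketch of the jittered-retry idea described in the removed comments. It is not the removed implementation: `write_with_jitter` is a hypothetical name, it retries only when `BEGIN IMMEDIATE` fails, and it assumes the connection was opened with `isolation_level=None` so that transactions are managed explicitly.

```python
import random
import sqlite3
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def write_with_jitter(
    conn: sqlite3.Connection,
    fn: Callable[[sqlite3.Connection], T],
    max_retries: int = 15,
    min_sleep_s: float = 0.020,
    max_sleep_s: float = 0.150,
) -> T:
    """Run fn inside BEGIN IMMEDIATE, retrying lock errors with random sleeps."""
    for attempt in range(max_retries):
        try:
            # Take the write lock up front so contention shows up here.
            conn.execute("BEGIN IMMEDIATE")
        except sqlite3.OperationalError as exc:
            msg = str(exc).lower()
            is_lock_error = "locked" in msg or "busy" in msg
            if is_lock_error and attempt < max_retries - 1:
                # Random sleep staggers competing writers.
                time.sleep(random.uniform(min_sleep_s, max_sleep_s))
                continue
            raise
        try:
            result = fn(conn)
            conn.execute("COMMIT")
            return result
        except BaseException:
            # Roll back only if SQLite has not already ended the transaction.
            if conn.in_transaction:
                conn.execute("ROLLBACK")
            raise
    raise sqlite3.OperationalError("database is locked after max retries")
```

The defaults mirror the removed constants (15 attempts, 20 to 150 ms). Whether this or the longer built-in timeout behaves better under real contention is something the diff itself does not establish.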
@@ -270,7 +270,7 @@ def cmd_status(args) -> None:
            print(f"    {peer}: {mode}")
    print(f"  Write freq:     {hcfg.write_frequency}")

-    if hcfg.enabled and (hcfg.api_key or hcfg.base_url):
+    if hcfg.enabled and hcfg.api_key:
        print("\n  Connection... ", end="", flush=True)
        try:
            get_honcho_client(hcfg)
@@ -278,7 +278,7 @@ def cmd_status(args) -> None:
        except Exception as e:
            print(f"FAILED ({e})\n")
    else:
-        reason = "disabled" if not hcfg.enabled else "no API key or base URL"
+        reason = "disabled" if not hcfg.enabled else "no API key"
        print(f"\n  Not connected ({reason})\n")


@@ -417,18 +417,9 @@ def get_honcho_client(config: HonchoClientConfig | None = None) -> Honcho:
    else:
        logger.info("Initializing Honcho client (host: %s, workspace: %s)", config.host, config.workspace_id)

-    # Local Honcho instances don't require an API key, but the SDK
-    # expects a non-empty string.  Use a placeholder for local URLs.
-    _is_local = resolved_base_url and (
-        "localhost" in resolved_base_url
-        or "127.0.0.1" in resolved_base_url
-        or "::1" in resolved_base_url
-    )
-    effective_api_key = config.api_key or ("local" if _is_local else None)
-
    kwargs: dict = {
        "workspace_id": config.workspace_id,
-        "api_key": effective_api_key,
+        "api_key": config.api_key,
        "environment": config.environment,
    }
    if resolved_base_url:
@@ -10,12 +10,6 @@
 # container recreation. Environment variables are written to $HERMES_HOME/.env
 # and read by hermes at startup — no container recreation needed for env changes.
 #
-# Tool resolution: the hermes wrapper uses --suffix PATH for nix store tools,
-# so apt/uv-installed versions take priority. The container entrypoint provisions
-# extensible tools on first boot: nodejs/npm via apt, uv via curl, and a Python
-# 3.11 venv (bootstrapped entirely by uv) at ~/.venv with pip seeded. Agents get
-# writable tool prefixes for npm i -g, pip install, uv tool install, etc.
-#
 # Usage:
 #   services.hermes-agent = {
 #     enable = true;
@@ -111,52 +105,22 @@
      fi
      mkdir -p "$TARGET_HOME"
      chown "$HERMES_UID:$HERMES_GID" "$TARGET_HOME"
-      chmod 0750 "$TARGET_HOME"

      # Ensure HERMES_HOME is owned by the target user
      if [ -n "''${HERMES_HOME:-}" ] && [ -d "$HERMES_HOME" ]; then
        chown -R "$HERMES_UID:$HERMES_GID" "$HERMES_HOME"
      fi

-      # ── Provision apt packages (first boot only, cached in writable layer) ──
-      # sudo: agent self-modification
-      # nodejs/npm: writable node so npm i -g works (nix store copies are read-only)
-      # curl: needed for uv installer
-      if [ ! -f /var/lib/hermes-tools-provisioned ] && command -v apt-get >/dev/null 2>&1; then
-        echo "First boot: provisioning agent tools..."
-        apt-get update -qq
-        apt-get install -y -qq sudo nodejs npm curl
-        touch /var/lib/hermes-tools-provisioned
+      # Install sudo on Debian/Ubuntu if missing (first boot only, cached in writable layer)
+      if command -v apt-get >/dev/null 2>&1 && ! command -v sudo >/dev/null 2>&1; then
+        apt-get update -qq >/dev/null 2>&1 && apt-get install -y -qq sudo >/dev/null 2>&1 || true
      fi
-
      if command -v sudo >/dev/null 2>&1 && [ ! -f /etc/sudoers.d/hermes ]; then
        mkdir -p /etc/sudoers.d
        echo "$TARGET_USER ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/hermes
        chmod 0440 /etc/sudoers.d/hermes
      fi

-      # uv (Python manager) — not in Ubuntu repos, retry-safe outside the sentinel
-      if ! command -v uv >/dev/null 2>&1 && [ ! -x "$TARGET_HOME/.local/bin/uv" ] && command -v curl >/dev/null 2>&1; then
-        su -s /bin/sh "$TARGET_USER" -c 'curl -LsSf https://astral.sh/uv/install.sh | sh' || true
-      fi
-
-      # Python 3.11 venv — gives the agent a writable Python with pip.
-      # Uses uv to install Python 3.11 (Ubuntu 24.04 ships 3.12).
-      # --seed includes pip/setuptools so bare `pip install` works.
-      _UV_BIN="$TARGET_HOME/.local/bin/uv"
-      if [ ! -d "$TARGET_HOME/.venv" ] && [ -x "$_UV_BIN" ]; then
-        su -s /bin/sh "$TARGET_USER" -c "
-          export PATH=\"\$HOME/.local/bin:\$PATH\"
-          uv python install 3.11
-          uv venv --python 3.11 --seed \"\$HOME/.venv\"
-        " || true
-      fi
-
-      # Put the agent venv first on PATH so python/pip resolve to writable copies
-      if [ -d "$TARGET_HOME/.venv/bin" ]; then
-        export PATH="$TARGET_HOME/.venv/bin:$PATH"
-      fi
-
      if command -v setpriv >/dev/null 2>&1; then
        exec setpriv --reuid="$HERMES_UID" --regid="$HERMES_GID" --init-groups "$@"
      elif command -v su >/dev/null 2>&1; then
@@ -552,8 +516,8 @@
      # ── Directories ───────────────────────────────────────────────────
      {
        systemd.tmpfiles.rules = [
-          "d ${cfg.stateDir}                0750 ${cfg.user} ${cfg.group} - -"
-          "d ${cfg.stateDir}/.hermes        0750 ${cfg.user} ${cfg.group} - -"
+          "d ${cfg.stateDir}                0755 ${cfg.user} ${cfg.group} - -"
+          "d ${cfg.stateDir}/.hermes        0755 ${cfg.user} ${cfg.group} - -"
          "d ${cfg.stateDir}/home           0750 ${cfg.user} ${cfg.group} - -"
          "d ${cfg.workingDirectory}         0750 ${cfg.user} ${cfg.group} - -"
        ];
@@ -567,23 +531,21 @@
          mkdir -p ${cfg.stateDir}/home
          mkdir -p ${cfg.workingDirectory}
          chown ${cfg.user}:${cfg.group} ${cfg.stateDir} ${cfg.stateDir}/.hermes ${cfg.stateDir}/home ${cfg.workingDirectory}
-          chmod 0750 ${cfg.stateDir} ${cfg.stateDir}/.hermes ${cfg.stateDir}/home ${cfg.workingDirectory}

          # Merge Nix settings into existing config.yaml.
          # Preserves user-added keys (skills, streaming, etc.); Nix keys win.
          # If configFile is user-provided (not generated), overwrite instead of merge.
          ${if cfg.configFile != null then ''
-            install -o ${cfg.user} -g ${cfg.group} -m 0640 -D ${configFile} ${cfg.stateDir}/.hermes/config.yaml
+            install -o ${cfg.user} -g ${cfg.group} -m 0644 -D ${configFile} ${cfg.stateDir}/.hermes/config.yaml
          '' else ''
            ${configMergeScript} ${generatedConfigFile} ${cfg.stateDir}/.hermes/config.yaml
            chown ${cfg.user}:${cfg.group} ${cfg.stateDir}/.hermes/config.yaml
-            chmod 0640 ${cfg.stateDir}/.hermes/config.yaml
+            chmod 0644 ${cfg.stateDir}/.hermes/config.yaml
          ''}

          # Managed mode marker (so interactive shells also detect NixOS management)
          touch ${cfg.stateDir}/.hermes/.managed
          chown ${cfg.user}:${cfg.group} ${cfg.stateDir}/.hermes/.managed
-          chmod 0644 ${cfg.stateDir}/.hermes/.managed

          # Seed auth file if provided
          ${lib.optionalString (cfg.authFile != null) ''
@@ -615,7 +577,7 @@ HERMES_NIX_ENV_EOF

          # Link documents into workspace
          ${lib.concatStringsSep "\n" (lib.mapAttrsToList (name: _value: ''
-            install -o ${cfg.user} -g ${cfg.group} -m 0640 ${documentDerivation}/${name} ${cfg.workingDirectory}/${name}
+            install -o ${cfg.user} -g ${cfg.group} -m 0644 ${documentDerivation}/${name} ${cfg.workingDirectory}/${name}
          '') cfg.documents)}
        '';
      }
@@ -35,7 +35,7 @@

          ${pkgs.lib.concatMapStringsSep "\n" (name: ''
            makeWrapper ${hermesVenv}/bin/${name} $out/bin/${name} \
-              --suffix PATH : "${runtimePath}" \
+              --prefix PATH : "${runtimePath}" \
              --set HERMES_BUNDLED_SKILLS $out/share/hermes-agent/skills
          '') [ "hermes" "hermes-agent" "hermes-acp" ]}

@@ -1 +0,0 @@
-Communication and decision-making frameworks — structured response formats for proposals, trade-off analysis, and stakeholder-ready recommendations.
@@ -1,103 +0,0 @@
----
-name: one-three-one-rule
-description: >
-  Structured decision-making framework for technical proposals and trade-off analysis.
-  When the user faces a choice between multiple approaches (architecture decisions,
-  tool selection, refactoring strategies, migration paths), this skill produces a
-  1-3-1 format: one clear problem statement, three distinct options with pros/cons,
-  and one concrete recommendation with definition of done and implementation plan.
-  Use when the user asks for a "1-3-1", says "give me options", or needs help
-  choosing between competing approaches.
-version: 1.0.0
-author: Willard Moore
-license: MIT
-category: communication
-metadata:
-  hermes:
-    tags: [communication, decision-making, proposals, trade-offs]
----
-
-# 1-3-1 Communication Rule
-
-Structured decision-making format for when a task has multiple viable approaches and the user needs a clear recommendation. Produces a concise problem framing, three options with trade-offs, and an actionable plan for the recommended path.
-
-## When to Use
-
-- The user explicitly asks for a "1-3-1" response.
-- The user says "give me options" or "what are my choices" for a technical decision.
-- A task has multiple viable approaches with meaningful trade-offs (architecture, tooling, migration strategy).
-- The user needs a proposal they can forward to a team or stakeholder.
-
-Do NOT use for simple questions with one obvious answer, debugging sessions, or tasks where the user has already decided on an approach.
-
-## Procedure
-
-1. **Problem** (one sentence)
-   - State the core decision or desired outcome in a single concise sentence.
-   - Focus on the *what*, not the *how* — no implementation details, tool names, or specific technologies.
-   - Keep it tight. If you need "and", you're describing two problems.
-
-2. **Options** (exactly three)
-   - Present three distinct, viable approaches labeled A, B, C.
-   - Each option gets a brief description, pros, and cons.
-   - Options should represent genuinely different strategies, not minor variations of the same approach.
-
-3. **Recommendation** (one option)
-   - State which option you recommend and why, based on the user's context and priorities.
-   - Be direct — this is your professional judgment, not a hedge.
-
-4. **Definition of Done**
-   - List the specific success criteria for the recommended option.
-   - These are concrete, verifiable outcomes — not vague aspirations.
-   - If the user picks a different option, revise this section to match.
-
-5. **Implementation Plan**
-   - Concrete steps to execute the recommended option.
-   - Include specific commands, tools, or actions where applicable.
-   - If the user picks a different option, revise this section to match.
-
-## Verification
-
-- Response has exactly one Problem sentence.
-- Response has exactly three Options (A, B, C) with pros and cons for each.
-- Response has a single Recommendation that picks one option with reasoning.
-- Definition of Done and Implementation Plan align with the recommended option.
-- If the user selects a different option, Recommendation, DoD, and Implementation Plan update accordingly.
-
-## Example
-
-User: "Give me a 1-3-1 for adding retry logic to our API client."
-
-Agent:
-
-**Problem:** API calls to external services fail intermittently under load and need graceful retry handling.
-
-**Options:**
-
-- **Option A: Exponential backoff with jitter in a shared utility.**
-  Pros: Simple, well-understood pattern; single implementation shared across all callers.
-  Cons: No circuit-breaking; callers still block during retries.
-
-- **Option B: Retry library (tenacity) as a decorator on each endpoint call.**
-  Pros: Configurable per-endpoint; built-in support for circuit breakers and rate limits.
-  Cons: Adds a dependency; decorator pattern can obscure control flow.
-
-- **Option C: Queue-based async retry with dead letter handling.**
-  Pros: Non-blocking; failed calls are tracked and replayable; scales well.
-  Cons: Significant complexity increase; requires a queue backend; overkill for low-volume services.
-
-**Recommendation:** Option A. The call volume doesn't justify queue infrastructure, and a shared utility with exponential backoff + jitter covers the failure mode cleanly without new dependencies.
-
-**Definition of Done:**
-- `api_retry()` utility handles transient HTTP errors (429, 502, 503, 504).
-- Retries up to 3 times with exponential backoff and jitter.
-- All external API calls route through the retry utility.
-- Logs each retry attempt with endpoint and attempt number.
-- Tests cover retry exhaustion, immediate success, and transient-then-success scenarios.
-
-**Implementation Plan:**
-1. Create `utils/api_retry.py` with configurable max retries, base delay, and retryable status codes.
-2. Add jitter using `random.uniform(0, base_delay)` to prevent thundering herd.
-3. Wrap existing API calls in `api_client.py` with the retry utility.
-4. Add unit tests mocking HTTP responses for each retry scenario.
-5. Verify under load with a simple stress test against a flaky endpoint mock.
@@ -1,297 +0,0 @@
----
-name: siyuan
-description: SiYuan Note API for searching, reading, creating, and managing blocks and documents in a self-hosted knowledge base via curl.
-version: 1.0.0
-author: FEUAZUR
-license: MIT
-metadata:
-  hermes:
-    tags: [SiYuan, Notes, Knowledge Base, PKM, API]
-    related_skills: [obsidian, notion]
-    homepage: https://github.com/siyuan-note/siyuan
-prerequisites:
-  env_vars: [SIYUAN_TOKEN]
-  commands: [curl, jq]
-required_environment_variables:
-  - name: SIYUAN_TOKEN
-    prompt: SiYuan API token
-    help: "Settings > About in SiYuan desktop app"
-  - name: SIYUAN_URL
-    prompt: SiYuan instance URL (default http://127.0.0.1:6806)
-    required_for: remote instances
----
-
-# SiYuan Note API
-
-Use the [SiYuan](https://github.com/siyuan-note/siyuan) kernel API via curl to search, read, create, update, and delete blocks and documents in a self-hosted knowledge base. No extra tools needed -- just curl and an API token.
-
-## Prerequisites
-
-1. Install and run SiYuan (desktop or Docker)
-2. Get your API token: **Settings > About > API token**
-3. Store it in `~/.hermes/.env`:
-   ```
-   SIYUAN_TOKEN=your_token_here
-   SIYUAN_URL=http://127.0.0.1:6806
-   ```
-   `SIYUAN_URL` defaults to `http://127.0.0.1:6806` if not set.
-
-## API Basics
-
-All SiYuan API calls are **POST with JSON body**. Every request follows this pattern:
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/..." \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"param": "value"}'
-```
-
-Responses are JSON with this structure:
-```json
-{"code": 0, "msg": "", "data": { ... }}
-```
-`code: 0` means success. Any other value is an error -- check `msg` for details.
-
-**ID format:** SiYuan IDs look like `20210808180117-6v0mkxr` (14-digit timestamp + 7 alphanumeric chars).
-
-## Quick Reference
-
-| Operation | Endpoint |
-|-----------|----------|
-| Full-text search | `/api/search/fullTextSearchBlock` |
-| SQL query | `/api/query/sql` |
-| Read block | `/api/block/getBlockKramdown` |
-| Read children | `/api/block/getChildBlocks` |
-| Get path | `/api/filetree/getHPathByID` |
-| Get attributes | `/api/attr/getBlockAttrs` |
-| List notebooks | `/api/notebook/lsNotebooks` |
-| List documents | `/api/filetree/listDocsByPath` |
-| Create notebook | `/api/notebook/createNotebook` |
-| Create document | `/api/filetree/createDocWithMd` |
-| Append block | `/api/block/appendBlock` |
-| Update block | `/api/block/updateBlock` |
-| Rename document | `/api/filetree/renameDocByID` |
-| Set attributes | `/api/attr/setBlockAttrs` |
-| Delete block | `/api/block/deleteBlock` |
-| Delete document | `/api/filetree/removeDocByID` |
-| Export as Markdown | `/api/export/exportMdContent` |
-
-## Common Operations
-
-### Search (Full-Text)
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/search/fullTextSearchBlock" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"query": "meeting notes", "page": 0}' | jq '.data.blocks[:5]'
-```
-
-### Search (SQL)
-
-Query the blocks database directly. Only SELECT statements are safe.
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/query/sql" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"stmt": "SELECT id, content, type, box FROM blocks WHERE content LIKE '\''%keyword%'\'' AND type='\''p'\'' LIMIT 20"}' | jq '.data'
-```
-
-Useful columns: `id`, `parent_id`, `root_id`, `box` (notebook ID), `path`, `content`, `type`, `subtype`, `created`, `updated`.
-
-### Read Block Content
-
-Returns block content in Kramdown (Markdown-like) format.
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/block/getBlockKramdown" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"id": "20210808180117-6v0mkxr"}' | jq '.data.kramdown'
-```
-
-### Read Child Blocks
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/block/getChildBlocks" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"id": "20210808180117-6v0mkxr"}' | jq '.data'
-```
-
-### Get Human-Readable Path
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/filetree/getHPathByID" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"id": "20210808180117-6v0mkxr"}' | jq '.data'
-```
-
-### Get Block Attributes
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/attr/getBlockAttrs" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"id": "20210808180117-6v0mkxr"}' | jq '.data'
-```
-
-### List Notebooks
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/notebook/lsNotebooks" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{}' | jq '.data.notebooks[] | {id, name, closed}'
-```
-
-### List Documents in a Notebook
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/filetree/listDocsByPath" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"notebook": "NOTEBOOK_ID", "path": "/"}' | jq '.data.files[] | {id, name}'
-```
-
-### Create a Document
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/filetree/createDocWithMd" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "notebook": "NOTEBOOK_ID",
-    "path": "/Meeting Notes/2026-03-22",
-    "markdown": "# Meeting Notes\n\n- Discussed project timeline\n- Assigned tasks"
-  }' | jq '.data'
-```
-
-### Create a Notebook
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/notebook/createNotebook" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"name": "My New Notebook"}' | jq '.data.notebook.id'
-```
-
-### Append Block to Document
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/block/appendBlock" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "parentID": "DOCUMENT_OR_BLOCK_ID",
-    "data": "New paragraph added at the end.",
-    "dataType": "markdown"
-  }' | jq '.data'
-```
-
-Also available: `/api/block/prependBlock` (same params, inserts at the beginning) and `/api/block/insertBlock` (uses `previousID` instead of `parentID` to insert after a specific block).
-
-### Update Block Content
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/block/updateBlock" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "id": "BLOCK_ID",
-    "data": "Updated content here.",
-    "dataType": "markdown"
-  }' | jq '.data'
-```
-
-### Rename a Document
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/filetree/renameDocByID" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"id": "DOCUMENT_ID", "title": "New Title"}'
-```
-
-### Set Block Attributes
-
-Custom attributes must be prefixed with `custom-`:
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/attr/setBlockAttrs" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "id": "BLOCK_ID",
-    "attrs": {
-      "custom-status": "reviewed",
-      "custom-priority": "high"
-    }
-  }'
-```
-
-### Delete a Block
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/block/deleteBlock" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"id": "BLOCK_ID"}'
-```
-
-To delete a whole document: use `/api/filetree/removeDocByID` with `{"id": "DOC_ID"}`.
-To delete a notebook: use `/api/notebook/removeNotebook` with `{"notebook": "NOTEBOOK_ID"}`.
-
-### Export Document as Markdown
-
-```bash
-curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/export/exportMdContent" \
-  -H "Authorization: Token $SIYUAN_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"id": "DOCUMENT_ID"}' | jq -r '.data.content'
-```
-
-## Block Types
-
-Common `type` values in SQL queries:
-
-| Type | Description |
-|------|-------------|
-| `d` | Document (root block) |
-| `p` | Paragraph |
-| `h` | Heading |
-| `l` | List |
-| `i` | List item |
-| `c` | Code block |
-| `m` | Math block |
-| `t` | Table |
-| `b` | Blockquote |
-| `s` | Super block |
-| `html` | HTML block |
-
-## Pitfalls
-
-- **All endpoints are POST** -- even read-only operations. Do not use GET.
-- **SQL safety**: only use SELECT queries. INSERT/UPDATE/DELETE/DROP are dangerous and should never be sent.
-- **ID validation**: IDs match the pattern `YYYYMMDDHHmmss-xxxxxxx`. Reject anything else.
-- **Error responses**: always check `code != 0` in responses before processing `data`.
-- **Large documents**: block content and export results can be very large. Use `LIMIT` in SQL and pipe through `jq` to extract only what you need.
-- **Notebook IDs**: when working with a specific notebook, get its ID first via `lsNotebooks`.
-
-## Alternative: MCP Server
-
-If you prefer a native integration instead of curl, install the SiYuan MCP server:
-
-```yaml
-# In ~/.hermes/config.yaml under mcp_servers:
-mcp_servers:
-  siyuan:
-    command: npx
-    args: ["-y", "@porkll/siyuan-mcp"]
-    env:
-      SIYUAN_TOKEN: "your_token"
-      SIYUAN_URL: "http://127.0.0.1:6806"
-```
@@ -1,335 +0,0 @@
----
-name: scrapling
-description: Web scraping with Scrapling - HTTP fetching, stealth browser automation, Cloudflare bypass, and spider crawling via CLI and Python.
-version: 1.0.0
-author: FEUAZUR
-license: MIT
-metadata:
-  hermes:
-    tags: [Web Scraping, Browser, Cloudflare, Stealth, Crawling, Spider]
-    related_skills: [duckduckgo-search, domain-intel]
-    homepage: https://github.com/D4Vinci/Scrapling
-prerequisites:
-  commands: [scrapling, python]
----
-
-# Scrapling
-
-[Scrapling](https://github.com/D4Vinci/Scrapling) is a web scraping framework with anti-bot bypass, stealth browser automation, and a spider framework. It provides three fetching strategies (HTTP, dynamic JS, stealth/Cloudflare) and a full CLI.
-
-**This skill is for educational and research purposes only.** Users must comply with local/international data scraping laws and respect website Terms of Service.
-
-## When to Use
-
-- Scraping static HTML pages (faster than browser tools)
-- Scraping JS-rendered pages that need a real browser
-- Bypassing Cloudflare Turnstile or bot detection
-- Crawling multiple pages with a spider
-- When the built-in `web_extract` tool does not return the data you need
-
-## Installation
-
-```bash
-pip install "scrapling[all]"
-scrapling install
-```
-
-Minimal install (HTTP only, no browser):
-```bash
-pip install scrapling
-```
-
-With browser automation only:
-```bash
-pip install "scrapling[fetchers]"
-scrapling install
-```
-
-## Quick Reference
-
-| Approach | Class | Use When |
-|----------|-------|----------|
-| HTTP | `Fetcher` / `FetcherSession` | Static pages, APIs, fast bulk requests |
-| Dynamic | `DynamicFetcher` / `DynamicSession` | JS-rendered content, SPAs |
-| Stealth | `StealthyFetcher` / `StealthySession` | Cloudflare, anti-bot protected sites |
-| Spider | `Spider` | Multi-page crawling with link following |
-
-## CLI Usage
-
-### Extract Static Page
-
-```bash
-scrapling extract get 'https://example.com' output.md
-```
-
-With CSS selector and browser impersonation:
-
-```bash
-scrapling extract get 'https://example.com' output.md \
-  --css-selector '.content' \
-  --impersonate 'chrome'
-```
-
-### Extract JS-Rendered Page
-
-```bash
-scrapling extract fetch 'https://example.com' output.md \
-  --css-selector '.dynamic-content' \
-  --disable-resources \
-  --network-idle
-```
-
-### Extract Cloudflare-Protected Page
-
-```bash
-scrapling extract stealthy-fetch 'https://protected-site.com' output.html \
-  --solve-cloudflare \
-  --block-webrtc \
-  --hide-canvas
-```
-
-### POST Request
-
-```bash
-scrapling extract post 'https://example.com/api' output.json \
-  --json '{"query": "search term"}'
-```
-
-### Output Formats
-
-The output format is determined by the file extension:
-- `.html` -- raw HTML
-- `.md` -- converted to Markdown
-- `.txt` -- plain text
-- `.json` / `.jsonl` -- JSON
-
-## Python: HTTP Scraping
-
-### Single Request
-
-```python
-from scrapling.fetchers import Fetcher
-
-page = Fetcher.get('https://quotes.toscrape.com/')
-quotes = page.css('.quote .text::text').getall()
-for q in quotes:
-    print(q)
-```
-
-### Session (Persistent Cookies)
-
-```python
-from scrapling.fetchers import FetcherSession
-
-with FetcherSession(impersonate='chrome') as session:
-    page = session.get('https://example.com/', stealthy_headers=True)
-    links = page.css('a::attr(href)').getall()
-    for link in links[:5]:
-        sub = session.get(link)
-        print(sub.css('h1::text').get())
-```
-
-### POST / PUT / DELETE
-
-```python
-page = Fetcher.post('https://api.example.com/data', json={"key": "value"})
-page = Fetcher.put('https://api.example.com/item/1', data={"name": "updated"})
-page = Fetcher.delete('https://api.example.com/item/1')
-```
-
-### With Proxy
-
-```python
-page = Fetcher.get('https://example.com', proxy='http://user:pass@proxy:8080')
-```
-
-## Python: Dynamic Pages (JS-Rendered)
-
-For pages that require JavaScript execution (SPAs, lazy-loaded content):
-
-```python
-from scrapling.fetchers import DynamicFetcher
-
-page = DynamicFetcher.fetch('https://example.com', headless=True)
-data = page.css('.js-loaded-content::text').getall()
-```
-
-### Wait for Specific Element
-
-```python
-page = DynamicFetcher.fetch(
-    'https://example.com',
-    wait_selector=('.results', 'visible'),
-    network_idle=True,
-)
-```
-
-### Disable Resources for Speed
-
-Blocks fonts, images, media, stylesheets (~25% faster):
-
-```python
-from scrapling.fetchers import DynamicSession
-
-with DynamicSession(headless=True, disable_resources=True, network_idle=True) as session:
-    page = session.fetch('https://example.com')
-    items = page.css('.item::text').getall()
-```
-
-### Custom Page Automation
-
-```python
-from playwright.sync_api import Page
-from scrapling.fetchers import DynamicFetcher
-
-def scroll_and_click(page: Page):
-    page.mouse.wheel(0, 3000)
-    page.wait_for_timeout(1000)
-    page.click('button.load-more')
-    page.wait_for_selector('.extra-results')
-
-page = DynamicFetcher.fetch('https://example.com', page_action=scroll_and_click)
-results = page.css('.extra-results .item::text').getall()
-```
-
-## Python: Stealth Mode (Anti-Bot Bypass)
-
-For Cloudflare-protected or heavily fingerprinted sites:
-
-```python
-from scrapling.fetchers import StealthyFetcher
-
-page = StealthyFetcher.fetch(
-    'https://protected-site.com',
-    headless=True,
-    solve_cloudflare=True,
-    block_webrtc=True,
-    hide_canvas=True,
-)
-content = page.css('.protected-content::text').getall()
-```
-
-### Stealth Session
-
-```python
-from scrapling.fetchers import StealthySession
-
-with StealthySession(headless=True, solve_cloudflare=True) as session:
-    page1 = session.fetch('https://protected-site.com/page1')
-    page2 = session.fetch('https://protected-site.com/page2')
-```
-
-## Element Selection
-
-All fetchers return a `Selector` object with these methods:
-
-### CSS Selectors
-
-```python
-page.css('h1::text').get()              # First h1 text
-page.css('a::attr(href)').getall()      # All link hrefs
-page.css('.quote .text::text').getall() # Nested selection
-```
-
-### XPath
-
-```python
-page.xpath('//div[@class="content"]/text()').getall()
-page.xpath('//a/@href').getall()
-```
-
-### Find Methods
-
-```python
-page.find_all('div', class_='quote')       # By tag + attribute
-page.find_by_text('Read more', tag='a')    # By text content
-page.find_by_regex(r'\$\d+\.\d{2}')       # By regex pattern
-```
-
-### Similar Elements
-
-Find elements with similar structure (useful for product listings, etc.):
-
-```python
-first_product = page.css('.product')[0]
-all_similar = first_product.find_similar()
-```
-
-### Navigation
-
-```python
-el = page.css('.target')[0]
-el.parent                # Parent element
-el.children              # Child elements
-el.next_sibling          # Next sibling
-el.prev_sibling          # Previous sibling
-```
-
-## Python: Spider Framework
-
-For multi-page crawling with link following:
-
-```python
-from scrapling.spiders import Spider, Request, Response
-
-class QuotesSpider(Spider):
-    name = "quotes"
-    start_urls = ["https://quotes.toscrape.com/"]
-    concurrent_requests = 10
-    download_delay = 1
-
-    async def parse(self, response: Response):
-        for quote in response.css('.quote'):
-            yield {
-                "text": quote.css('.text::text').get(),
-                "author": quote.css('.author::text').get(),
-                "tags": quote.css('.tag::text').getall(),
-            }
-
-        next_page = response.css('.next a::attr(href)').get()
-        if next_page:
-            yield response.follow(next_page)
-
-result = QuotesSpider().start()
-print(f"Scraped {len(result.items)} quotes")
-result.items.to_json("quotes.json")
-```
-
-### Multi-Session Spider
-
-Route requests to different fetcher types:
-
-```python
-from scrapling.fetchers import FetcherSession, AsyncStealthySession
-
-class SmartSpider(Spider):
-    name = "smart"
-    start_urls = ["https://example.com/"]
-
-    def configure_sessions(self, manager):
-        manager.add("fast", FetcherSession(impersonate="chrome"))
-        manager.add("stealth", AsyncStealthySession(headless=True), lazy=True)
-
-    async def parse(self, response: Response):
-        for link in response.css('a::attr(href)').getall():
-            if "protected" in link:
-                yield Request(link, sid="stealth")
-            else:
-                yield Request(link, sid="fast", callback=self.parse)
-```
-
-### Pause/Resume Crawling
-
-```python
-spider = QuotesSpider(crawldir="./crawl_checkpoint")
-spider.start()  # Ctrl+C to pause, re-run to resume from checkpoint
-```
-
-## Pitfalls
-
-- **Browser install required**: run `scrapling install` after pip install -- without it, `DynamicFetcher` and `StealthyFetcher` will fail
-- **Timeouts**: DynamicFetcher/StealthyFetcher timeout is in **milliseconds** (default 30000), Fetcher timeout is in **seconds**
-- **Cloudflare bypass**: `solve_cloudflare=True` adds 5-15 seconds to fetch time -- only enable when needed
-- **Resource usage**: StealthyFetcher runs a real browser -- limit concurrent usage
-- **Legal**: always check robots.txt and website ToS before scraping. This library is for educational and research purposes
-- **Python version**: requires Python 3.10+
@@ -1,384 +0,0 @@
-"""Cognitive memory plugin — MemoryProvider interface.
-
-Semantic memory with vector embeddings (via litellm), auto-classification,
-contradiction detection, importance decay, and time-based forgetting.
-Local SQLite storage with binary-packed float32 embeddings.
-
-Original PR #727 by 0xbyt4, adapted to MemoryProvider ABC.
-
-Requires: litellm (for embeddings via any provider — OpenAI, Cohere, etc.)
-Config via environment: uses litellm's standard env vars (OPENAI_API_KEY, etc.)
-"""
-
-from __future__ import annotations
-
-import json
-import logging
-import math
-import os
-import re
-import sqlite3
-import struct
-import time
-from pathlib import Path
-from typing import Any, Dict, List, Optional
-
-from agent.memory_provider import MemoryProvider
-
-logger = logging.getLogger(__name__)
-
-_DB_DIR = Path(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes"))) / "cognitive_memory"
-_EMBEDDING_DIM = 1536  # text-embedding-3-small default
-_SIMILARITY_DEDUP_THRESHOLD = 0.95
-
-
-# ---------------------------------------------------------------------------
-# Embedding helper
-# ---------------------------------------------------------------------------
-
-def _get_embedding(text: str) -> Optional[List[float]]:
-    """Get embedding via litellm."""
-    try:
-        import litellm
-        resp = litellm.embedding(model="text-embedding-3-small", input=[text])
-        return resp.data[0]["embedding"]
-    except Exception as e:
-        logger.debug("Embedding failed: %s", e)
-        return None
-
-
-def _cosine_similarity(a: List[float], b: List[float]) -> float:
-    dot = sum(x * y for x, y in zip(a, b))
-    mag_a = math.sqrt(sum(x * x for x in a))
-    mag_b = math.sqrt(sum(x * x for x in b))
-    if mag_a == 0 or mag_b == 0:
-        return 0.0
-    return dot / (mag_a * mag_b)
-
-
-def _pack_embedding(emb: List[float]) -> bytes:
-    return struct.pack(f"{len(emb)}f", *emb)
-
-
-def _unpack_embedding(data: bytes) -> List[float]:
-    n = len(data) // 4
-    return list(struct.unpack(f"{n}f", data))
-
-
-# ---------------------------------------------------------------------------
-# Classification
-# ---------------------------------------------------------------------------
-
-_CATEGORY_PATTERNS = {
-    "preference": [r"\b(?:prefer|like|love|hate|dislike|favorite)\b"],
-    "correction": [r"\b(?:actually|wrong|incorrect|not right)\b|\bno,"],
-    "procedure": [r"\b(?:first|then|step|always|never|usually)\b"],
-    "environment": [r"\b(?:running|using|installed|version|os|platform)\b"],
-    "fact": [r"\b(?:is|are|was|were|has|have)\b"],  # broadest pattern, checked last
-}
-
-
-def _classify(content: str) -> str:
-    content_lower = content.lower()
-    for category, patterns in _CATEGORY_PATTERNS.items():
-        for pattern in patterns:
-            if re.search(pattern, content_lower):
-                return category
-    return "general"
-
-
-def _estimate_importance(content: str, category: str) -> float:
-    base = {"correction": 0.9, "preference": 0.7, "procedure": 0.6}.get(category, 0.5)
-    # Longer content slightly more important
-    length_bonus = min(len(content) / 500, 0.2)
-    return min(base + length_bonus, 1.0)
-
-
-# ---------------------------------------------------------------------------
-# Tool schema
-# ---------------------------------------------------------------------------
-
-COGNITIVE_RECALL_SCHEMA = {
-    "name": "cognitive_recall",
-    "description": (
-        "Semantic memory with automatic classification and importance scoring. "
-        "Actions: recall (search by meaning), store (add a fact), "
-        "forget (remove by ID), status (memory stats)."
-    ),
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "action": {
-                "type": "string",
-                "enum": ["recall", "store", "forget", "status"],
-                "description": "Action to perform.",
-            },
-            "query": {"type": "string", "description": "Search query (for 'recall')."},
-            "content": {"type": "string", "description": "Fact to store (for 'store')."},
-            "category": {
-                "type": "string",
-                "enum": ["preference", "fact", "procedure", "environment", "correction", "general"],
-                "description": "Category (auto-detected if omitted).",
-            },
-            "memory_id": {"type": "integer", "description": "Memory ID (for 'forget')."},
-        },
-        "required": ["action"],
-    },
-}
-
-
-# ---------------------------------------------------------------------------
-# MemoryProvider implementation
-# ---------------------------------------------------------------------------
-
-class CognitiveMemoryProvider(MemoryProvider):
-    """Semantic memory with embeddings, classification, and forgetting."""
-
-    def __init__(self):
-        self._conn = None
-        self._decay_half_life = 30  # days
-        self._last_decay = 0
-
-    @property
-    def name(self) -> str:
-        return "cognitive"
-
-    def get_config_schema(self):
-        return [
-            {"key": "embedding_model", "description": "Embedding model (litellm format)", "default": "text-embedding-3-small"},
-            {"key": "decay_half_life", "description": "Importance decay half-life in days (0=disabled)", "default": "30"},
-        ]
-
-    def is_available(self) -> bool:
-        try:
-            import litellm  # noqa: F401
-            return True
-        except ImportError:
-            return False
-
-    def initialize(self, session_id: str, **kwargs) -> None:
-        _DB_DIR.mkdir(parents=True, exist_ok=True)
-        db_path = _DB_DIR / "cognitive.db"
-        self._conn = sqlite3.connect(str(db_path))
-        self._conn.execute("PRAGMA journal_mode=WAL")
-        self._conn.executescript("""
-            CREATE TABLE IF NOT EXISTS memories (
-                id INTEGER PRIMARY KEY AUTOINCREMENT,
-                content TEXT NOT NULL,
-                category TEXT DEFAULT 'general',
-                importance REAL DEFAULT 0.5,
-                embedding BLOB,
-                retrieval_count INTEGER DEFAULT 0,
-                helpful_count INTEGER DEFAULT 0,
-                created_at REAL,
-                updated_at REAL,
-                deleted INTEGER DEFAULT 0
-            );
-            CREATE INDEX IF NOT EXISTS idx_mem_importance ON memories(importance DESC);
-            CREATE INDEX IF NOT EXISTS idx_mem_category ON memories(category);
-        """)
-        self._conn.commit()
-
-    def system_prompt_block(self) -> str:
-        if not self._conn:
-            return ""
-        try:
-            count = self._conn.execute(
-                "SELECT COUNT(*) FROM memories WHERE deleted = 0"
-            ).fetchone()[0]
-        except Exception:
-            count = 0
-        if count == 0:
-            return ""
-        return (
-            f"# Cognitive Memory\n"
-            f"Active. {count} memories with semantic recall and importance scoring.\n"
-            f"Use cognitive_recall to search, store facts, or check status.\n"
-            f"Memories decay over time — frequently used facts persist, unused ones fade."
-        )
-
-    def prefetch(self, query: str) -> str:
-        if not self._conn or not query:
-            return ""
-        emb = _get_embedding(query)
-        if not emb:
-            return ""
-        try:
-            rows = self._conn.execute(
-                "SELECT id, content, importance, embedding, created_at FROM memories "
-                "WHERE deleted = 0 AND embedding IS NOT NULL "
-                "ORDER BY importance DESC LIMIT 50"
-            ).fetchall()
-            scored = []
-            now = time.time()
-            for row in rows:
-                mem_emb = _unpack_embedding(row[3])
-                sim = _cosine_similarity(emb, mem_emb)
-                # Recency comes from created_at (row[4]), not the row id; 30-day window.
-                recency = max(0, 1 - (now - (row[4] or now)) / (30 * 86400))
-                score = 0.5 * sim + 0.3 * row[2] + 0.2 * recency
-                if sim > 0.3:
-                    scored.append((score, row[1]))
-            scored.sort(reverse=True)
-            if not scored:
-                return ""
-            lines = [f"- {content}" for _, content in scored[:5]]
-            return "## Cognitive Memory\n" + "\n".join(lines)
-        except Exception as e:
-            logger.debug("Cognitive prefetch failed: %s", e)
-            return ""
-
-    def sync_turn(self, user_content: str, assistant_content: str) -> None:
-        # Run decay cycle periodically
-        self._maybe_decay()
-
-    def get_tool_schemas(self) -> List[Dict[str, Any]]:
-        return [COGNITIVE_RECALL_SCHEMA]
-
-    def handle_tool_call(self, tool_name: str, args: dict, **kwargs) -> str:
-        if tool_name != "cognitive_recall":
-            return json.dumps({"error": f"Unknown tool: {tool_name}"})
-
-        action = args.get("action", "")
-
-        if action == "store":
-            return self._store(args)
-        elif action == "recall":
-            return self._recall(args)
-        elif action == "forget":
-            return self._forget(args)
-        elif action == "status":
-            return self._status()
-        return json.dumps({"error": f"Unknown action: {action}"})
-
-    def on_memory_write(self, action: str, target: str, content: str) -> None:
-        if action == "add" and self._conn and content:
-            category = "preference" if target == "user" else _classify(content)
-            importance = _estimate_importance(content, category)
-            emb = _get_embedding(content)
-            now = time.time()
-            self._conn.execute(
-                "INSERT INTO memories (content, category, importance, embedding, created_at, updated_at) "
-                "VALUES (?, ?, ?, ?, ?, ?)",
-                (content, category, importance, _pack_embedding(emb) if emb else None, now, now),
-            )
-            self._conn.commit()
-
-    def shutdown(self) -> None:
-        if self._conn:
-            self._conn.close()
-            self._conn = None
-
-    # -- Internal methods ----------------------------------------------------
-
-    def _store(self, args: dict) -> str:
-        content = args.get("content", "")
-        if not content:
-            return json.dumps({"error": "content is required"})
-
-        category = args.get("category") or _classify(content)
-        importance = _estimate_importance(content, category)
-        emb = _get_embedding(content)
-
-        # Dedup check
-        if emb:
-            rows = self._conn.execute(
-                "SELECT id, embedding FROM memories WHERE deleted = 0 AND embedding IS NOT NULL"
-            ).fetchall()
-            for row in rows:
-                existing_emb = _unpack_embedding(row[1])
-                if _cosine_similarity(emb, existing_emb) > _SIMILARITY_DEDUP_THRESHOLD:
-                    return json.dumps({"error": "Very similar memory already exists", "existing_id": row[0]})
-
-        now = time.time()
-        cur = self._conn.execute(
-            "INSERT INTO memories (content, category, importance, embedding, created_at, updated_at) "
-            "VALUES (?, ?, ?, ?, ?, ?)",
-            (content, category, importance, _pack_embedding(emb) if emb else None, now, now),
-        )
-        self._conn.commit()
-        return json.dumps({"id": cur.lastrowid, "category": category, "importance": round(importance, 2)})
-
-    def _recall(self, args: dict) -> str:
-        query = args.get("query", "")
-        if not query:
-            return json.dumps({"error": "query is required"})
-
-        emb = _get_embedding(query)
-        if not emb:
-            return json.dumps({"error": "Embedding generation failed"})
-
-        rows = self._conn.execute(
-            "SELECT id, content, category, importance, embedding, created_at FROM memories "
-            "WHERE deleted = 0 AND embedding IS NOT NULL "
-            "ORDER BY importance DESC LIMIT 50"
-        ).fetchall()
-
-        now = time.time()
-        results = []
-        for row in rows:
-            mem_emb = _unpack_embedding(row[4])
-            sim = _cosine_similarity(emb, mem_emb)
-            days_old = (now - (row[5] or now)) / 86400
-            recency = max(0, 1 - days_old / 90)
-            score = 0.5 * sim + 0.3 * row[3] + 0.2 * recency
-            if sim > 0.2:
-                results.append({
-                    "id": row[0], "content": row[1], "category": row[2],
-                    "score": round(score, 3), "similarity": round(sim, 3),
-                })
-
-        results.sort(key=lambda x: x["score"], reverse=True)
-        # Bump retrieval counts
-        for r in results[:10]:
-            self._conn.execute(
-                "UPDATE memories SET retrieval_count = retrieval_count + 1 WHERE id = ?",
-                (r["id"],),
-            )
-        self._conn.commit()
-        return json.dumps({"results": results[:10], "count": len(results[:10])})
-
-    def _forget(self, args: dict) -> str:
-        memory_id = args.get("memory_id")
-        if memory_id is None:
-            return json.dumps({"error": "memory_id is required"})
-        cur = self._conn.execute("UPDATE memories SET deleted = 1 WHERE id = ? AND deleted = 0", (int(memory_id),))
-        self._conn.commit()
-        return json.dumps({"forgotten": cur.rowcount > 0, "id": memory_id})
-
-    def _status(self) -> str:
-        total = self._conn.execute("SELECT COUNT(*) FROM memories WHERE deleted = 0").fetchone()[0]
-        by_cat = self._conn.execute(
-            "SELECT category, COUNT(*) FROM memories WHERE deleted = 0 GROUP BY category"
-        ).fetchall()
-        return json.dumps({
-            "total": total,
-            "by_category": {row[0]: row[1] for row in by_cat},
-            "decay_half_life_days": self._decay_half_life,
-        })
-
-    def _maybe_decay(self) -> None:
-        """Run importance decay every ~1 hour."""
-        now = time.time()
-        if now - self._last_decay < 3600:
-            return
-        self._last_decay = now
-        if not self._conn or self._decay_half_life <= 0:
-            return
-        try:
-            factor = 0.5 ** (1.0 / (self._decay_half_life * 24))  # half-life is in days; one cycle is ~1 hour
-            self._conn.execute(
-                "UPDATE memories SET importance = importance * ? WHERE deleted = 0",
-                (factor,),
-            )
-            # Prune very low importance
-            self._conn.execute(
-                "UPDATE memories SET deleted = 1 WHERE deleted = 0 AND importance < 0.05"
-            )
-            self._conn.commit()
-        except Exception as e:
-            logger.debug("Cognitive decay failed: %s", e)
-
-
-def register(ctx) -> None:
-    """Register cognitive memory as a memory provider plugin."""
-    ctx.register_memory_provider(CognitiveMemoryProvider())
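Review note on the decay step in `_maybe_decay`: it multiplies every importance by a fixed factor each time it runs, so the effective half-life depends on how often the cycle fires. One alternative, sketched here under assumptions, is to scale the factor by the time that has actually elapsed. `decay_factor` and `SECONDS_PER_DAY` are hypothetical names, not part of the plugin.

```python
SECONDS_PER_DAY = 86400


def decay_factor(elapsed_seconds: float, half_life_days: float) -> float:
    """Multiplier that halves importance once per half-life (hypothetical helper)."""
    if half_life_days <= 0 or elapsed_seconds <= 0:
        return 1.0  # decay disabled, or no time has passed
    return 0.5 ** (elapsed_seconds / (half_life_days * SECONDS_PER_DAY))


# One hourly cycle with a 30-day half-life barely moves the score.
hourly = decay_factor(3600, 30)
# 720 hourly cycles (30 days) compound to roughly one half.
after_30_days = hourly ** 720
```

With this shape, an irregular cadence (for example, a cycle that only fires when a turn is synced) still yields the configured half-life, provided the caller passes the real elapsed time.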
@@ -1,6 +0,0 @@
-name: cognitive-memory
-version: 1.0.0
-description: >
-  Semantic memory with vector embeddings, auto-classification, near-duplicate
-  detection, importance decay, and time-based forgetting. Local SQLite storage,
-  requires litellm for embeddings.
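Review note on the category regexes in the cognitive memory plugin: a trailing `\b` after a comma only matches when a word character follows immediately, so an alternative such as `no,` placed inside `\b(?:...)\b` does not match text like `no, that ...`. The sketch below illustrates the behavior with shortened patterns; `wrapped` and `unwrapped` are illustrative names only.

```python
import re

# Shape with every alternative wrapped in \b ... \b, including "no,".
wrapped = re.compile(r"\b(?:actually|no,|wrong)\b")
# Variant that keeps "no," outside the trailing word boundary.
unwrapped = re.compile(r"\b(?:actually|wrong)\b|\bno,")

text = "no, that is the old path"
wrapped_hit = wrapped.search(text) is not None
unwrapped_hit = unwrapped.search(text) is not None
```

The wrapped form matches only when a letter, digit, or underscore directly follows the comma (as in `no,thanks`), which is rarely how the phrase appears in prose.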
@@ -1,373 +0,0 @@
-"""hermes-memory-store — holographic memory plugin using MemoryProvider interface.
-
-Registers as a MemoryProvider plugin, giving the agent structured fact storage
-with entity resolution, trust scoring, and HRR-based compositional retrieval.
-
-Original plugin by dusterbloom (PR #2351), adapted to the MemoryProvider ABC.
-
-Config in ~/.hermes/config.yaml:
-  plugins:
-    hermes-memory-store:
-      db_path: ~/.hermes/memory_store.db
-      auto_extract: false
-      default_trust: 0.5
-      min_trust_threshold: 0.3
-      temporal_decay_half_life: 0
-"""
-
-from __future__ import annotations
-
-import json
-import logging
-import re
-from pathlib import Path
-from typing import Any, Dict, List
-
-from agent.memory_provider import MemoryProvider
-from .store import MemoryStore
-from .retrieval import FactRetriever
-
-logger = logging.getLogger(__name__)
-
-
-# ---------------------------------------------------------------------------
-# Tool schemas (unchanged from original PR)
-# ---------------------------------------------------------------------------
-
-FACT_STORE_SCHEMA = {
-    "name": "fact_store",
-    "description": (
-        "Deep structured memory with algebraic reasoning. "
-        "Use alongside the memory tool — memory for always-on context, "
-        "fact_store for deep recall and compositional queries.\n\n"
-        "ACTIONS (simple → powerful):\n"
-        "• add — Store a fact the user would expect you to remember.\n"
-        "• search — Keyword lookup ('editor config', 'deploy process').\n"
-        "• probe — Entity recall: ALL facts about a person/thing.\n"
-        "• related — What connects to an entity? Structural adjacency.\n"
-        "• reason — Compositional: facts connected to MULTIPLE entities simultaneously.\n"
-        "• contradict — Memory hygiene: find facts making conflicting claims.\n"
-        "• update/remove/list — CRUD operations.\n\n"
-        "IMPORTANT: Before answering questions about the user, ALWAYS probe or reason first."
-    ),
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "action": {
-                "type": "string",
-                "enum": ["add", "search", "probe", "related", "reason", "contradict", "update", "remove", "list"],
-            },
-            "content": {"type": "string", "description": "Fact content (required for 'add')."},
-            "query": {"type": "string", "description": "Search query (required for 'search')."},
-            "entity": {"type": "string", "description": "Entity name for 'probe'/'related'."},
-            "entities": {"type": "array", "items": {"type": "string"}, "description": "Entity names for 'reason'."},
-            "fact_id": {"type": "integer", "description": "Fact ID for 'update'/'remove'."},
-            "category": {"type": "string", "enum": ["user_pref", "project", "tool", "general"]},
-            "tags": {"type": "string", "description": "Comma-separated tags."},
-            "trust_delta": {"type": "number", "description": "Trust adjustment for 'update'."},
-            "min_trust": {"type": "number", "description": "Minimum trust filter (default: 0.3)."},
-            "limit": {"type": "integer", "description": "Max results (default: 10)."},
-        },
-        "required": ["action"],
-    },
-}
-
-FACT_FEEDBACK_SCHEMA = {
-    "name": "fact_feedback",
-    "description": (
-        "Rate a fact after using it. Mark 'helpful' if accurate, 'unhelpful' if outdated. "
-        "This trains the memory — good facts rise, bad facts sink."
-    ),
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "action": {"type": "string", "enum": ["helpful", "unhelpful"]},
-            "fact_id": {"type": "integer", "description": "The fact ID to rate."},
-        },
-        "required": ["action", "fact_id"],
-    },
-}
-
-
-# ---------------------------------------------------------------------------
-# Config
-# ---------------------------------------------------------------------------
-
-def _load_plugin_config() -> dict:
-    config_path = Path("~/.hermes/config.yaml").expanduser()
-    if not config_path.exists():
-        return {}
-    try:
-        import yaml
-        with open(config_path) as f:
-            all_config = yaml.safe_load(f) or {}
-        return all_config.get("plugins", {}).get("hermes-memory-store", {}) or {}
-    except Exception:
-        return {}
-
-
-# ---------------------------------------------------------------------------
-# MemoryProvider implementation
-# ---------------------------------------------------------------------------
-
-class HolographicMemoryProvider(MemoryProvider):
-    """Holographic memory with structured facts, entity resolution, and HRR retrieval."""
-
-    def __init__(self, config: dict | None = None):
-        self._config = config or _load_plugin_config()
-        self._store = None
-        self._retriever = None
-        self._min_trust = float(self._config.get("min_trust_threshold", 0.3))
-
-    @property
-    def name(self) -> str:
-        return "holographic"
-
-    def is_available(self) -> bool:
-        return True  # SQLite is always available, numpy is optional
-
-    def get_config_schema(self):
-        return [
-            {"key": "db_path", "description": "SQLite database path", "default": "~/.hermes/memory_store.db"},
-            {"key": "auto_extract", "description": "Auto-extract facts at session end", "default": "false", "choices": ["true", "false"]},
-            {"key": "default_trust", "description": "Default trust score for new facts", "default": "0.5"},
-            {"key": "hrr_dim", "description": "HRR vector dimensions", "default": "1024"},
-        ]
-
-    def initialize(self, session_id: str, **kwargs) -> None:
-        db_path = self._config.get("db_path", "~/.hermes/memory_store.db")
-        default_trust = float(self._config.get("default_trust", 0.5))
-        hrr_dim = int(self._config.get("hrr_dim", 1024))
-        hrr_weight = float(self._config.get("hrr_weight", 0.3))
-        temporal_decay = int(self._config.get("temporal_decay_half_life", 0))
-
-        self._store = MemoryStore(db_path=db_path, default_trust=default_trust, hrr_dim=hrr_dim)
-        self._retriever = FactRetriever(
-            store=self._store,
-            temporal_decay_half_life=temporal_decay,
-            hrr_weight=hrr_weight,
-            hrr_dim=hrr_dim,
-        )
-        self._session_id = session_id
-
-    def system_prompt_block(self) -> str:
-        if not self._store:
-            return ""
-        try:
-            total = self._store._conn.execute(
-                "SELECT COUNT(*) FROM facts"
-            ).fetchone()[0]
-        except Exception:
-            total = 0
-        if total == 0:
-            return ""
-        return (
-            f"# Holographic Memory\n"
-            f"Active. {total} facts stored with entity resolution and trust scoring.\n"
-            f"Use fact_store to search, probe entities, reason across entities, or add facts.\n"
-            f"Use fact_feedback to rate facts after using them (trains trust scores)."
-        )
-
-    def prefetch(self, query: str) -> str:
-        if not self._retriever or not query:
-            return ""
-        try:
-            results = self._retriever.search(query, min_trust=self._min_trust, limit=5)
-            if not results:
-                return ""
-            lines = []
-            for r in results:
-                trust = r.get("trust", 0)
-                lines.append(f"- [{trust:.1f}] {r.get('content', '')}")
-            return "## Holographic Memory\n" + "\n".join(lines)
-        except Exception as e:
-            logger.debug("Holographic prefetch failed: %s", e)
-            return ""
-
-    def sync_turn(self, user_content: str, assistant_content: str) -> None:
-        # Holographic memory stores explicit facts via tools, not auto-sync.
-        # The on_session_end hook handles auto-extraction if configured.
-        pass
-
-    def get_tool_schemas(self) -> List[Dict[str, Any]]:
-        return [FACT_STORE_SCHEMA, FACT_FEEDBACK_SCHEMA]
-
-    def handle_tool_call(self, tool_name: str, args: Dict[str, Any], **kwargs) -> str:
-        if tool_name == "fact_store":
-            return self._handle_fact_store(args)
-        elif tool_name == "fact_feedback":
-            return self._handle_fact_feedback(args)
-        return json.dumps({"error": f"Unknown tool: {tool_name}"})
-
-    def on_session_end(self, messages: List[Dict[str, Any]]) -> None:
-        if not self._config.get("auto_extract", False):
-            return
-        if not self._store or not messages:
-            return
-        self._auto_extract_facts(messages)
-
-    def on_memory_write(self, action: str, target: str, content: str) -> None:
-        """Mirror built-in memory writes as facts."""
-        if action == "add" and self._store and content:
-            try:
-                category = "user_pref" if target == "user" else "general"
-                self._store.add_fact(content, category=category)
-            except Exception as e:
-                logger.debug("Holographic memory_write mirror failed: %s", e)
-
-    def shutdown(self) -> None:
-        self._store = None
-        self._retriever = None
-
-    # -- Tool handlers -------------------------------------------------------
-
-    def _handle_fact_store(self, args: dict) -> str:
-        try:
-            action = args["action"]
-            store = self._store
-            retriever = self._retriever
-
-            if action == "add":
-                fact_id = store.add_fact(
-                    args["content"],
-                    category=args.get("category", "general"),
-                    tags=args.get("tags", ""),
-                )
-                return json.dumps({"fact_id": fact_id, "status": "added"})
-
-            elif action == "search":
-                results = retriever.search(
-                    args["query"],
-                    category=args.get("category"),
-                    min_trust=float(args.get("min_trust", self._min_trust)),
-                    limit=int(args.get("limit", 10)),
-                )
-                return json.dumps({"results": results, "count": len(results)})
-
-            elif action == "probe":
-                results = retriever.probe(
-                    args["entity"],
-                    category=args.get("category"),
-                    limit=int(args.get("limit", 10)),
-                )
-                return json.dumps({"results": results, "count": len(results)})
-
-            elif action == "related":
-                results = retriever.related(
-                    args["entity"],
-                    category=args.get("category"),
-                    limit=int(args.get("limit", 10)),
-                )
-                return json.dumps({"results": results, "count": len(results)})
-
-            elif action == "reason":
-                entities = args.get("entities", [])
-                if not entities:
-                    return json.dumps({"error": "reason requires 'entities' list"})
-                results = retriever.reason(
-                    entities,
-                    category=args.get("category"),
-                    limit=int(args.get("limit", 10)),
-                )
-                return json.dumps({"results": results, "count": len(results)})
-
-            elif action == "contradict":
-                results = retriever.contradict(
-                    category=args.get("category"),
-                    limit=int(args.get("limit", 10)),
-                )
-                return json.dumps({"results": results, "count": len(results)})
-
-            elif action == "update":
-                updated = store.update_fact(
-                    int(args["fact_id"]),
-                    content=args.get("content"),
-                    trust_delta=float(args["trust_delta"]) if "trust_delta" in args else None,
-                    tags=args.get("tags"),
-                    category=args.get("category"),
-                )
-                return json.dumps({"updated": updated})
-
-            elif action == "remove":
-                removed = store.remove_fact(int(args["fact_id"]))
-                return json.dumps({"removed": removed})
-
-            elif action == "list":
-                facts = store.list_facts(
-                    category=args.get("category"),
-                    min_trust=float(args.get("min_trust", 0.0)),
-                    limit=int(args.get("limit", 10)),
-                )
-                return json.dumps({"facts": facts, "count": len(facts)})
-
-            else:
-                return json.dumps({"error": f"Unknown action: {action}"})
-
-        except KeyError as exc:
-            return json.dumps({"error": f"Missing required argument: {exc}"})
-        except Exception as exc:
-            return json.dumps({"error": str(exc)})
-
-    def _handle_fact_feedback(self, args: dict) -> str:
-        try:
-            fact_id = int(args["fact_id"])
-            helpful = args["action"] == "helpful"
-            result = self._store.record_feedback(fact_id, helpful=helpful)
-            return json.dumps(result)
-        except KeyError as exc:
-            return json.dumps({"error": f"Missing required argument: {exc}"})
-        except Exception as exc:
-            return json.dumps({"error": str(exc)})
-
-    # -- Auto-extraction (on_session_end) ------------------------------------
-
-    def _auto_extract_facts(self, messages: list) -> None:
-        _PREF_PATTERNS = [
-            re.compile(r'\bI\s+(?:prefer|like|love|use|want|need)\s+(.+)', re.IGNORECASE),
-            re.compile(r'\bmy\s+(?:favorite|preferred|default)\s+\w+\s+is\s+(.+)', re.IGNORECASE),
-            re.compile(r'\bI\s+(?:always|never|usually)\s+(.+)', re.IGNORECASE),
-        ]
-        _DECISION_PATTERNS = [
-            re.compile(r'\bwe\s+(?:decided|agreed|chose)\s+(?:to\s+)?(.+)', re.IGNORECASE),
-            re.compile(r'\bthe\s+project\s+(?:uses|needs|requires)\s+(.+)', re.IGNORECASE),
-        ]
-
-        extracted = 0
-        for msg in messages:
-            if msg.get("role") != "user":
-                continue
-            content = msg.get("content", "")
-            if not isinstance(content, str) or len(content) < 10:
-                continue
-
-            for pattern in _PREF_PATTERNS:
-                if pattern.search(content):
-                    try:
-                        self._store.add_fact(content[:400], category="user_pref")
-                        extracted += 1
-                    except Exception:
-                        pass
-                    break
-
-            for pattern in _DECISION_PATTERNS:
-                if pattern.search(content):
-                    try:
-                        self._store.add_fact(content[:400], category="project")
-                        extracted += 1
-                    except Exception:
-                        pass
-                    break
-
-        if extracted:
-            logger.info("Auto-extracted %d facts from conversation", extracted)
-
-
-# ---------------------------------------------------------------------------
-# Plugin entry point
-# ---------------------------------------------------------------------------
-
-def register(ctx) -> None:
-    """Register the holographic memory provider with the plugin system."""
-    config = _load_plugin_config()
-    provider = HolographicMemoryProvider(config=config)
-    ctx.register_memory_provider(provider)
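Review note on `_auto_extract_facts`: the preference and decision pattern groups are checked independently, so a single user message can be stored under both `user_pref` and `project`. A minimal sketch of that matching logic, using one pattern from each group; `categories_for` is a hypothetical helper, and the length check and store calls from the plugin are left out.

```python
import re

PREF = [re.compile(r"\bI\s+(?:prefer|like|love|use|want|need)\s+(.+)", re.IGNORECASE)]
DECISION = [re.compile(r"\bwe\s+(?:decided|agreed|chose)\s+(?:to\s+)?(.+)", re.IGNORECASE)]


def categories_for(message: str) -> list[str]:
    """Return the categories a user message would be stored under."""
    found = []
    # Each group contributes at most one category, mirroring the `break` in the plugin.
    if any(p.search(message) for p in PREF):
        found.append("user_pref")
    if any(p.search(message) for p in DECISION):
        found.append("project")
    return found
```

If double storage is not wanted, one option is to stop after the first matching group; whether that is preferable depends on how the fact store treats duplicates, which this diff does not show.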
@@ -1,203 +0,0 @@
-"""Holographic Reduced Representations (HRR) with phase encoding.
-
-HRRs are a vector symbolic architecture for encoding compositional structure
-into fixed-width distributed representations. This module uses *phase vectors*:
-each concept is a vector of angles in [0, 2π). The algebraic operations are:
-
-  bind   — circular convolution (phase addition)  — associates two concepts
-  unbind — circular correlation (phase subtraction) — retrieves a bound value
-  bundle — superposition (circular mean)           — merges multiple concepts
-
-Phase encoding is numerically stable, avoids the magnitude collapse of
-traditional complex-number HRRs, and maps cleanly to cosine similarity.
-
-Atoms are generated deterministically from SHA-256 so representations are
-identical across processes, machines, and language versions.
-
-References:
-  Plate (1995) — Holographic Reduced Representations
-  Gayler (2004) — Vector Symbolic Architectures answer Jackendoff's challenges
-"""
-
-import hashlib
-import logging
-import struct
-import math
-
-try:
-    import numpy as np
-    _HAS_NUMPY = True
-except ImportError:
-    _HAS_NUMPY = False
-
-logger = logging.getLogger(__name__)
-
-_TWO_PI = 2.0 * math.pi
-
-
-def _require_numpy() -> None:
-    if not _HAS_NUMPY:
-        raise RuntimeError("numpy is required for holographic operations")
-
-
-def encode_atom(word: str, dim: int = 1024) -> "np.ndarray":
-    """Deterministic phase vector via SHA-256 counter blocks.
-
-    Uses hashlib (not numpy RNG) for cross-platform reproducibility.
-
-    Algorithm:
-    - Generate enough SHA-256 blocks by hashing f"{word}:{i}" for i=0,1,2,...
-    - Concatenate digests, interpret as uint16 values via struct.unpack
-    - Scale to [0, 2π): phases = values * (2π / 65536)
-    - Truncate to dim elements
-    - Returns np.float64 array of shape (dim,)
-    """
-    _require_numpy()
-
-    # Each SHA-256 digest is 32 bytes = 16 uint16 values.
-    values_per_block = 16
-    blocks_needed = math.ceil(dim / values_per_block)
-
-    uint16_values: list[int] = []
-    for i in range(blocks_needed):
-        digest = hashlib.sha256(f"{word}:{i}".encode()).digest()
-        uint16_values.extend(struct.unpack("<16H", digest))
-
-    phases = np.array(uint16_values[:dim], dtype=np.float64) * (_TWO_PI / 65536.0)
-    return phases
-
-
-def bind(a: "np.ndarray", b: "np.ndarray") -> "np.ndarray":
-    """Circular convolution = element-wise phase addition.
-
-    Binding associates two concepts into a single composite vector.
-    The result is dissimilar to both inputs (quasi-orthogonal).
-    """
-    _require_numpy()
-    return (a + b) % _TWO_PI
-
-
-def unbind(memory: "np.ndarray", key: "np.ndarray") -> "np.ndarray":
-    """Circular correlation = element-wise phase subtraction.
-
-    Unbinding retrieves the value associated with a key from a memory vector.
-    unbind(bind(a, b), a) ≈ b  (up to superposition noise)
-    """
-    _require_numpy()
-    return (memory - key) % _TWO_PI
-
-
-def bundle(*vectors: "np.ndarray") -> "np.ndarray":
-    """Superposition via circular mean of complex exponentials.
-
-    Bundling merges multiple vectors into one that is similar to each input.
-    Similarity to each input falls roughly as 1/sqrt(n) as n items are bundled (see snr_estimate).
-    """
-    _require_numpy()
-    complex_sum = np.sum([np.exp(1j * v) for v in vectors], axis=0)
-    return np.angle(complex_sum) % _TWO_PI
-
-
-def similarity(a: "np.ndarray", b: "np.ndarray") -> float:
-    """Phase cosine similarity. Range [-1, 1].
-
-    Returns 1.0 for identical vectors, near 0.0 for random (unrelated) vectors,
-    and -1.0 for perfectly anti-correlated vectors.
-    """
-    _require_numpy()
-    return float(np.mean(np.cos(a - b)))
-
-
-def encode_text(text: str, dim: int = 1024) -> "np.ndarray":
-    """Bag-of-words: bundle of atom vectors for each token.
-
-    Tokenizes by lowercasing, splitting on whitespace, and stripping
-    leading/trailing punctuation from each token.
-
-    Returns bundle of all token atom vectors.
-    If text is empty or produces no tokens, returns encode_atom("__hrr_empty__", dim).
-    """
-    _require_numpy()
-
-    tokens = [
-        token.strip(".,!?;:\"'()[]{}")
-        for token in text.lower().split()
-    ]
-    tokens = [t for t in tokens if t]
-
-    if not tokens:
-        return encode_atom("__hrr_empty__", dim)
-
-    atom_vectors = [encode_atom(token, dim) for token in tokens]
-    return bundle(*atom_vectors)
-
-
-def encode_fact(content: str, entities: list[str], dim: int = 1024) -> "np.ndarray":
-    """Structured encoding: content bound to ROLE_CONTENT, each entity bound to ROLE_ENTITY, all bundled.
-
-    Role vectors are reserved atoms: "__hrr_role_content__", "__hrr_role_entity__"
-
-    Components:
-    1. bind(encode_text(content, dim), encode_atom("__hrr_role_content__", dim))
-    2. For each entity: bind(encode_atom(entity.lower(), dim), encode_atom("__hrr_role_entity__", dim))
-    3. bundle all components together
-
-    This enables algebraic extraction:
-        unbind(fact, bind(entity, ROLE_ENTITY)) ≈ content_vector
-    """
-    _require_numpy()
-
-    role_content = encode_atom("__hrr_role_content__", dim)
-    role_entity = encode_atom("__hrr_role_entity__", dim)
-
-    components: list[np.ndarray] = [
-        bind(encode_text(content, dim), role_content)
-    ]
-
-    for entity in entities:
-        components.append(bind(encode_atom(entity.lower(), dim), role_entity))
-
-    return bundle(*components)
-
-
-def phases_to_bytes(phases: "np.ndarray") -> bytes:
-    """Serialize phase vector to bytes. float64 tobytes — 8 KB at dim=1024."""
-    _require_numpy()
-    return phases.tobytes()
-
-
-def bytes_to_phases(data: bytes) -> "np.ndarray":
-    """Deserialize bytes back to phase vector. Inverse of phases_to_bytes.
-
-    The .copy() call is required because frombuffer returns a read-only view
-    backed by the bytes object; callers expect a mutable array.
-    """
-    _require_numpy()
-    return np.frombuffer(data, dtype=np.float64).copy()
-
-
-def snr_estimate(dim: int, n_items: int) -> float:
-    """Signal-to-noise ratio estimate for holographic storage.
-
-    SNR = sqrt(dim / n_items) when n_items > 0, else inf.
-
-    The SNR falls below 2.0 when n_items > dim / 4, meaning retrieval
-    errors become likely. Logs a warning when this threshold is crossed.
-    """
-    _require_numpy()
-
-    if n_items <= 0:
-        return float("inf")
-
-    snr = math.sqrt(dim / n_items)
-
-    if snr < 2.0:
-        logger.warning(
-            "HRR storage near capacity: SNR=%.2f (dim=%d, n_items=%d). "
-            "Retrieval accuracy may degrade. Consider increasing dim or reducing stored items.",
-            snr,
-            dim,
-            n_items,
-        )
-
-    return snr
@@ -1,6 +0,0 @@
-name: hermes-memory-store
-version: 0.1.0
-description: Structured memory backend with SQLite storage, trust scoring, entity resolution, and hybrid keyword/BM25 retrieval.
-author: peppi
-hooks:
-  - on_session_end
@@ -1,597 +0,0 @@
-"""Hybrid keyword/BM25 retrieval for the memory store.
-
-Ported from KIK memory_agent.py — combines FTS5 full-text search with
-Jaccard similarity reranking and trust-weighted scoring.
-"""
-
-from __future__ import annotations
-
-import math
-from datetime import datetime, timezone
-from typing import TYPE_CHECKING
-
-if TYPE_CHECKING:
-    from .store import MemoryStore
-
-try:
-    from . import holographic as hrr
-except ImportError:
-    import holographic as hrr  # type: ignore[no-redef]
-
-
-class FactRetriever:
-    """Multi-strategy fact retrieval with trust-weighted scoring."""
-
-    def __init__(
-        self,
-        store: MemoryStore,
-        temporal_decay_half_life: int = 0,  # days, 0 = disabled
-        fts_weight: float = 0.4,
-        jaccard_weight: float = 0.3,
-        hrr_weight: float = 0.3,
-        hrr_dim: int = 1024,
-    ):
-        self.store = store
-        self.half_life = temporal_decay_half_life
-        self.hrr_dim = hrr_dim
-
-        # Auto-redistribute weights if numpy unavailable
-        if hrr_weight > 0 and not hrr._HAS_NUMPY:
-            fts_weight = 0.6
-            jaccard_weight = 0.4
-            hrr_weight = 0.0
-
-        self.fts_weight = fts_weight
-        self.jaccard_weight = jaccard_weight
-        self.hrr_weight = hrr_weight
-
-    def search(
-        self,
-        query: str,
-        category: str | None = None,
-        min_trust: float = 0.3,
-        limit: int = 10,
-    ) -> list[dict]:
-        """Hybrid search: FTS5 candidates → Jaccard rerank → trust weighting.
-
-        Pipeline:
-        1. FTS5 search: Get limit*3 candidates from SQLite full-text search
-        2. Rerank: weighted sum of FTS5 rank, Jaccard token overlap, and HRR similarity
-        3. Trust weighting: final_score = relevance * trust_score
-        4. Temporal decay (optional): decay = 0.5^(age_days / half_life)
-
-        Returns list of dicts with fact data + 'score' field, sorted by score desc.
-        """
-        # Stage 1: Get FTS5 candidates (more than limit for reranking headroom)
-        candidates = self._fts_candidates(query, category, min_trust, limit * 3)
-
-        if not candidates:
-            return []
-
-        # Stage 2: Rerank with Jaccard + HRR similarity + trust + optional decay
-        query_tokens = self._tokenize(query)
-        scored = []
-
-        for fact in candidates:
-            content_tokens = self._tokenize(fact["content"])
-            tag_tokens = self._tokenize(fact.get("tags", ""))
-            all_tokens = content_tokens | tag_tokens
-
-            jaccard = self._jaccard_similarity(query_tokens, all_tokens)
-            fts_score = fact.get("fts_rank", 0.0)
-
-            # HRR similarity
-            if self.hrr_weight > 0 and fact.get("hrr_vector"):
-                fact_vec = hrr.bytes_to_phases(fact["hrr_vector"])
-                query_vec = hrr.encode_text(query, self.hrr_dim)
-                hrr_sim = (hrr.similarity(query_vec, fact_vec) + 1.0) / 2.0  # shift to [0,1]
-            else:
-                hrr_sim = 0.5  # neutral
-
-            # Combine FTS5 + Jaccard + HRR
-            relevance = (self.fts_weight * fts_score
-                        + self.jaccard_weight * jaccard
-                        + self.hrr_weight * hrr_sim)
-
-            # Trust weighting
-            score = relevance * fact["trust_score"]
-
-            # Optional temporal decay
-            if self.half_life > 0:
-                score *= self._temporal_decay(fact.get("updated_at") or fact.get("created_at"))
-
-            fact["score"] = score
-            scored.append(fact)
-
-        # Sort by score descending, return top limit
-        scored.sort(key=lambda x: x["score"], reverse=True)
-        results = scored[:limit]
-        # Strip raw HRR bytes — callers expect JSON-serializable dicts
-        for fact in results:
-            fact.pop("hrr_vector", None)
-        return results
-
-    def probe(
-        self,
-        entity: str,
-        category: str | None = None,
-        limit: int = 10,
-    ) -> list[dict]:
-        """Compositional entity query using HRR algebra.
-
-        Unbinds entity from memory bank to extract associated content.
-        This is NOT keyword search — it uses algebraic structure to find facts
-        where the entity plays a structural role.
-
-        Falls back to FTS5 search if numpy unavailable.
-        """
-        if not hrr._HAS_NUMPY:
-            # Fallback to keyword search on entity name
-            return self.search(entity, category=category, limit=limit)
-
-        conn = self.store._conn
-
-        # Encode entity as role-bound vector
-        role_entity = hrr.encode_atom("__hrr_role_entity__", self.hrr_dim)
-        entity_vec = hrr.encode_atom(entity.lower(), self.hrr_dim)
-        probe_key = hrr.bind(entity_vec, role_entity)
-
-        # Try category-specific bank first, then all facts
-        if category:
-            bank_name = f"cat:{category}"
-            bank_row = conn.execute(
-                "SELECT vector FROM memory_banks WHERE bank_name = ?",
-                (bank_name,),
-            ).fetchone()
-            if bank_row:
-                bank_vec = hrr.bytes_to_phases(bank_row["vector"])
-                extracted = hrr.unbind(bank_vec, probe_key)
-                # Use extracted signal to score individual facts
-                return self._score_facts_by_vector(
-                    extracted, category=category, limit=limit
-                )
-
-        # Score against individual fact vectors directly
-        where = "WHERE hrr_vector IS NOT NULL"
-        params: list = []
-        if category:
-            where += " AND category = ?"
-            params.append(category)
-
-        rows = conn.execute(
-            f"""
-            SELECT fact_id, content, category, tags, trust_score,
-                   retrieval_count, helpful_count, created_at, updated_at,
-                   hrr_vector
-            FROM facts
-            {where}
-            """,
-            params,
-        ).fetchall()
-
-        if not rows:
-            # Final fallback: keyword search
-            return self.search(entity, category=category, limit=limit)
-
-        scored = []
-        for row in rows:
-            fact = dict(row)
-            fact_vec = hrr.bytes_to_phases(fact.pop("hrr_vector"))
-            # Unbind probe key from fact to see if entity is structurally present
-            residual = hrr.unbind(fact_vec, probe_key)
-            # Compare residual against content signal
-            role_content = hrr.encode_atom("__hrr_role_content__", self.hrr_dim)
-            content_vec = hrr.bind(hrr.encode_text(fact["content"], self.hrr_dim), role_content)
-            sim = hrr.similarity(residual, content_vec)
-            fact["score"] = (sim + 1.0) / 2.0 * fact["trust_score"]
-            scored.append(fact)
-
-        scored.sort(key=lambda x: x["score"], reverse=True)
-        return scored[:limit]
-
-    def related(
-        self,
-        entity: str,
-        category: str | None = None,
-        limit: int = 10,
-    ) -> list[dict]:
-        """Discover facts that share structural connections with an entity.
-
-        Unlike probe (which finds facts *about* an entity), related finds
-        facts that are connected through shared context — e.g., other entities
-        mentioned alongside this one, or content that overlaps structurally.
-
-        Falls back to FTS5 search if numpy unavailable.
-        """
-        if not hrr._HAS_NUMPY:
-            return self.search(entity, category=category, limit=limit)
-
-        conn = self.store._conn
-
-        # Encode entity as a bare atom (not role-bound — we want ANY structural match)
-        entity_vec = hrr.encode_atom(entity.lower(), self.hrr_dim)
-
-        # Get all facts with vectors
-        where = "WHERE hrr_vector IS NOT NULL"
-        params: list = []
-        if category:
-            where += " AND category = ?"
-            params.append(category)
-
-        rows = conn.execute(
-            f"""
-            SELECT fact_id, content, category, tags, trust_score,
-                   retrieval_count, helpful_count, created_at, updated_at,
-                   hrr_vector
-            FROM facts
-            {where}
-            """,
-            params,
-        ).fetchall()
-
-        if not rows:
-            return self.search(entity, category=category, limit=limit)
-
-        # Score each fact by how much the entity's atom appears in its vector
-        # This catches both role-bound entity matches AND content word matches
-        scored = []
-        for row in rows:
-            fact = dict(row)
-            fact_vec = hrr.bytes_to_phases(fact.pop("hrr_vector"))
-
-            # Check structural similarity: unbind entity from fact
-            residual = hrr.unbind(fact_vec, entity_vec)
-            # A high-similarity residual to ANY known role vector means this entity
-            # plays a structural role in the fact
-            role_entity = hrr.encode_atom("__hrr_role_entity__", self.hrr_dim)
-            role_content = hrr.encode_atom("__hrr_role_content__", self.hrr_dim)
-
-            entity_role_sim = hrr.similarity(residual, role_entity)
-            content_role_sim = hrr.similarity(residual, role_content)
-            # Take the max — entity could appear in either role
-            best_sim = max(entity_role_sim, content_role_sim)
-
-            fact["score"] = (best_sim + 1.0) / 2.0 * fact["trust_score"]
-            scored.append(fact)
-
-        scored.sort(key=lambda x: x["score"], reverse=True)
-        return scored[:limit]
-
-    def reason(
-        self,
-        entities: list[str],
-        category: str | None = None,
-        limit: int = 10,
-    ) -> list[dict]:
-        """Multi-entity compositional query — vector-space JOIN.
-
-        Given multiple entities, algebraically intersects their structural
-        connections to find facts related to ALL of them simultaneously.
-        This is compositional reasoning that no embedding DB can do.
-
-        Example: reason(["peppi", "backend"]) finds facts where peppi AND
-        backend both play structural roles — without keyword matching.
-
-        Falls back to FTS5 search if numpy unavailable.
-        """
-        if not hrr._HAS_NUMPY or not entities:
-            # Fallback: search with all entities as keywords
-            query = " ".join(entities)
-            return self.search(query, category=category, limit=limit)
-
-        conn = self.store._conn
-        role_entity = hrr.encode_atom("__hrr_role_entity__", self.hrr_dim)
-
-        # For each entity, build its probe key by binding the entity atom to the
-        # entity role; these keys are unbound from each fact vector further down
-        entity_residuals = []
-        for entity in entities:
-            entity_vec = hrr.encode_atom(entity.lower(), self.hrr_dim)
-            probe_key = hrr.bind(entity_vec, role_entity)
-            entity_residuals.append(probe_key)
-
-        # The intersection key: bundle all probe keys, then use it to find
-        # facts that are structurally connected to ALL entities
-        intersection_key = hrr.bundle(*entity_residuals) if len(entity_residuals) > 1 else entity_residuals[0]
-
-        # Get all facts with vectors
-        where = "WHERE hrr_vector IS NOT NULL"
-        params: list = []
-        if category:
-            where += " AND category = ?"
-            params.append(category)
-
-        rows = conn.execute(
-            f"""
-            SELECT fact_id, content, category, tags, trust_score,
-                   retrieval_count, helpful_count, created_at, updated_at,
-                   hrr_vector
-            FROM facts
-            {where}
-            """,
-            params,
-        ).fetchall()
-
-        if not rows:
-            query = " ".join(entities)
-            return self.search(query, category=category, limit=limit)
-
-        # Score each fact per entity: unbind that entity's probe key and compare
-        # the residual with the content role vector (higher = stronger presence)
-        scored = []
-        for row in rows:
-            fact = dict(row)
-            fact_vec = hrr.bytes_to_phases(fact.pop("hrr_vector"))
-
-            # Unbind intersection key from fact
-            residual = hrr.unbind(fact_vec, intersection_key)
-
-            # Score by how much EACH entity is present in this fact
-            # A fact scores high only if ALL entities have structural presence
-            entity_scores = []
-            for entity in entities:
-                entity_vec = hrr.encode_atom(entity.lower(), self.hrr_dim)
-                probe_key = hrr.bind(entity_vec, role_entity)
-                single_residual = hrr.unbind(fact_vec, probe_key)
-                # Check residual against content role (does this entity participate?)
-                role_content = hrr.encode_atom("__hrr_role_content__", self.hrr_dim)
-                sim = hrr.similarity(single_residual, role_content)
-                entity_scores.append(sim)
-
-            # Use minimum score — fact must match ALL entities, not just some
-            # This is the AND semantics (vs OR which would use mean/max)
-            min_sim = min(entity_scores)
-            fact["score"] = (min_sim + 1.0) / 2.0 * fact["trust_score"]
-            scored.append(fact)
-
-        scored.sort(key=lambda x: x["score"], reverse=True)
-        return scored[:limit]
-
-    def contradict(
-        self,
-        category: str | None = None,
-        threshold: float = 0.3,
-        limit: int = 10,
-    ) -> list[dict]:
-        """Find potentially contradictory facts via entity overlap + content divergence.
-
-        Two facts contradict when they share entities (same subject) but have
-        low content-vector similarity (different claims). This is automated
-        memory hygiene — no other memory system does this.
-
-        Returns pairs of facts with a contradiction score.
-        Falls back to empty list if numpy unavailable.
-        """
-        if not hrr._HAS_NUMPY:
-            return []
-
-        conn = self.store._conn
-
-        # Get all facts with vectors and their linked entities
-        where = "WHERE f.hrr_vector IS NOT NULL"
-        params: list = []
-        if category:
-            where += " AND f.category = ?"
-            params.append(category)
-
-        rows = conn.execute(
-            f"""
-            SELECT f.fact_id, f.content, f.category, f.tags, f.trust_score,
-                   f.created_at, f.updated_at, f.hrr_vector
-            FROM facts f
-            {where}
-            """,
-            params,
-        ).fetchall()
-
-        if len(rows) < 2:
-            return []
-
-        # Build entity sets per fact
-        fact_entities: dict[int, set[str]] = {}
-        for row in rows:
-            fid = row["fact_id"]
-            entity_rows = conn.execute(
-                """
-                SELECT e.name FROM entities e
-                JOIN fact_entities fe ON fe.entity_id = e.entity_id
-                WHERE fe.fact_id = ?
-                """,
-                (fid,),
-            ).fetchall()
-            fact_entities[fid] = {r["name"].lower() for r in entity_rows}
-
-        # Compare all pairs: high entity overlap + low content similarity = contradiction
-        facts = [dict(r) for r in rows]
-        contradictions = []
-
-        for i in range(len(facts)):
-            for j in range(i + 1, len(facts)):
-                f1, f2 = facts[i], facts[j]
-                ents1 = fact_entities.get(f1["fact_id"], set())
-                ents2 = fact_entities.get(f2["fact_id"], set())
-
-                if not ents1 or not ents2:
-                    continue
-
-                # Entity overlap (Jaccard)
-                entity_overlap = len(ents1 & ents2) / len(ents1 | ents2) if (ents1 | ents2) else 0.0
-
-                if entity_overlap < 0.3:
-                    continue  # Not enough entity overlap to be contradictory
-
-                # Content similarity via HRR vectors
-                v1 = hrr.bytes_to_phases(f1["hrr_vector"])
-                v2 = hrr.bytes_to_phases(f2["hrr_vector"])
-                content_sim = hrr.similarity(v1, v2)
-
-                # High entity overlap + low content similarity = potential contradiction
-                # contradiction_score: higher = more contradictory
-                contradiction_score = entity_overlap * (1.0 - (content_sim + 1.0) / 2.0)
-
-                if contradiction_score >= threshold:
-                    # Strip hrr_vector from output (not JSON serializable)
-                    f1_clean = {k: v for k, v in f1.items() if k != "hrr_vector"}
-                    f2_clean = {k: v for k, v in f2.items() if k != "hrr_vector"}
-                    contradictions.append({
-                        "fact_a": f1_clean,
-                        "fact_b": f2_clean,
-                        "entity_overlap": round(entity_overlap, 3),
-                        "content_similarity": round(content_sim, 3),
-                        "contradiction_score": round(contradiction_score, 3),
-                        "shared_entities": sorted(ents1 & ents2),
-                    })
-
-        contradictions.sort(key=lambda x: x["contradiction_score"], reverse=True)
-        return contradictions[:limit]
-
-    def _score_facts_by_vector(
-        self,
-        target_vec: "np.ndarray",
-        category: str | None = None,
-        limit: int = 10,
-    ) -> list[dict]:
-        """Score facts by similarity to a target vector."""
-        conn = self.store._conn
-
-        where = "WHERE hrr_vector IS NOT NULL"
-        params: list = []
-        if category:
-            where += " AND category = ?"
-            params.append(category)
-
-        rows = conn.execute(
-            f"""
-            SELECT fact_id, content, category, tags, trust_score,
-                   retrieval_count, helpful_count, created_at, updated_at,
-                   hrr_vector
-            FROM facts
-            {where}
-            """,
-            params,
-        ).fetchall()
-
-        scored = []
-        for row in rows:
-            fact = dict(row)
-            fact_vec = hrr.bytes_to_phases(fact.pop("hrr_vector"))
-            sim = hrr.similarity(target_vec, fact_vec)
-            fact["score"] = (sim + 1.0) / 2.0 * fact["trust_score"]
-            scored.append(fact)
-
-        scored.sort(key=lambda x: x["score"], reverse=True)
-        return scored[:limit]
-
-    def _fts_candidates(
-        self,
-        query: str,
-        category: str | None,
-        min_trust: float,
-        limit: int,
-    ) -> list[dict]:
-        """Get raw FTS5 candidates from the store.
-
-        Uses the store's database connection directly for FTS5 MATCH
-        with rank scoring. Normalizes FTS5 rank to [0, 1] range.
-        """
-        conn = self.store._conn
-
-        # Build query - FTS5 rank is negative (lower = better match)
-        # We need to join facts_fts with facts to get all columns
-        params: list = []
-        where_clauses = ["facts_fts MATCH ?"]
-        params.append(query)
-
-        if category:
-            where_clauses.append("f.category = ?")
-            params.append(category)
-
-        where_clauses.append("f.trust_score >= ?")
-        params.append(min_trust)
-
-        where_sql = " AND ".join(where_clauses)
-
-        sql = f"""
-            SELECT f.*, facts_fts.rank as fts_rank_raw
-            FROM facts_fts
-            JOIN facts f ON f.fact_id = facts_fts.rowid
-            WHERE {where_sql}
-            ORDER BY facts_fts.rank
-            LIMIT ?
-        """
-        params.append(limit)
-
-        try:
-            rows = conn.execute(sql, params).fetchall()
-        except Exception:
-            # FTS5 MATCH can fail on malformed queries — fall back to empty
-            return []
-
-        if not rows:
-            return []
-
-        # Normalize FTS5 rank: rank is negative, lower = better
-        # Convert to positive score in [0, 1] range
-        raw_ranks = [abs(row["fts_rank_raw"]) for row in rows]
-        max_rank = max(raw_ranks) if raw_ranks else 1.0
-        max_rank = max(max_rank, 1e-6)  # avoid div by zero
-
-        results = []
-        for row, raw_rank in zip(rows, raw_ranks):
-            fact = dict(row)
-            fact.pop("fts_rank_raw", None)
-            fact["fts_rank"] = raw_rank / max_rank  # normalize to [0, 1]
-            results.append(fact)
-
-        return results
-
-    @staticmethod
-    def _tokenize(text: str) -> set[str]:
-        """Simple whitespace tokenization with lowercasing.
-
-        Strips common punctuation. No stemming/lemmatization (Phase 1).
-        """
-        if not text:
-            return set()
-        # Split on whitespace, lowercase, strip punctuation
-        tokens = set()
-        for word in text.lower().split():
-            cleaned = word.strip(".,;:!?\"'()[]{}#@<>")
-            if cleaned:
-                tokens.add(cleaned)
-        return tokens
-
-    @staticmethod
-    def _jaccard_similarity(set_a: set, set_b: set) -> float:
-        """Jaccard similarity coefficient: |A ∩ B| / |A ∪ B|."""
-        if not set_a or not set_b:
-            return 0.0
-        intersection = len(set_a & set_b)
-        union = len(set_a | set_b)
-        return intersection / union if union > 0 else 0.0
-
-    def _temporal_decay(self, timestamp_str: str | None) -> float:
-        """Exponential decay: 0.5^(age_days / half_life_days).
-
-        Returns 1.0 if decay is disabled or timestamp is missing.
-        """
-        if not self.half_life or not timestamp_str:
-            return 1.0
-
-        try:
-            if isinstance(timestamp_str, str):
-                # Parse ISO format timestamp from SQLite
-                ts = datetime.fromisoformat(timestamp_str.replace("Z", "+00:00"))
-            else:
-                ts = timestamp_str
-
-            if ts.tzinfo is None:
-                ts = ts.replace(tzinfo=timezone.utc)
-
-            age_days = (datetime.now(timezone.utc) - ts).total_seconds() / 86400
-            if age_days < 0:
-                return 1.0
-
-            return math.pow(0.5, age_days / self.half_life)
-        except (ValueError, TypeError):
-            return 1.0
@@ -1,572 +0,0 @@
-"""
-SQLite-backed fact store with entity resolution and trust scoring.
-Single-user Hermes memory store plugin.
-"""
-
-import re
-import sqlite3
-import threading
-from datetime import datetime
-from pathlib import Path
-
-try:
-    from . import holographic as hrr
-except ImportError:
-    import holographic as hrr  # type: ignore[no-redef]
-
-_SCHEMA = """
-CREATE TABLE IF NOT EXISTS facts (
-    fact_id         INTEGER PRIMARY KEY AUTOINCREMENT,
-    content         TEXT NOT NULL UNIQUE,
-    category        TEXT DEFAULT 'general',
-    tags            TEXT DEFAULT '',
-    trust_score     REAL DEFAULT 0.5,
-    retrieval_count INTEGER DEFAULT 0,
-    helpful_count   INTEGER DEFAULT 0,
-    created_at      TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-    updated_at      TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-    hrr_vector      BLOB
-);
-
-CREATE TABLE IF NOT EXISTS entities (
-    entity_id   INTEGER PRIMARY KEY AUTOINCREMENT,
-    name        TEXT NOT NULL,
-    entity_type TEXT DEFAULT 'unknown',
-    aliases     TEXT DEFAULT '',
-    created_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP
-);
-
-CREATE TABLE IF NOT EXISTS fact_entities (
-    fact_id   INTEGER REFERENCES facts(fact_id),
-    entity_id INTEGER REFERENCES entities(entity_id),
-    PRIMARY KEY (fact_id, entity_id)
-);
-
-CREATE INDEX IF NOT EXISTS idx_facts_trust    ON facts(trust_score DESC);
-CREATE INDEX IF NOT EXISTS idx_facts_category ON facts(category);
-CREATE INDEX IF NOT EXISTS idx_entities_name  ON entities(name);
-
-CREATE VIRTUAL TABLE IF NOT EXISTS facts_fts
-    USING fts5(content, tags, content=facts, content_rowid=fact_id);
-
-CREATE TRIGGER IF NOT EXISTS facts_ai AFTER INSERT ON facts BEGIN
-    INSERT INTO facts_fts(rowid, content, tags)
-        VALUES (new.fact_id, new.content, new.tags);
-END;
-
-CREATE TRIGGER IF NOT EXISTS facts_ad AFTER DELETE ON facts BEGIN
-    INSERT INTO facts_fts(facts_fts, rowid, content, tags)
-        VALUES ('delete', old.fact_id, old.content, old.tags);
-END;
-
-CREATE TRIGGER IF NOT EXISTS facts_au AFTER UPDATE ON facts BEGIN
-    INSERT INTO facts_fts(facts_fts, rowid, content, tags)
-        VALUES ('delete', old.fact_id, old.content, old.tags);
-    INSERT INTO facts_fts(rowid, content, tags)
-        VALUES (new.fact_id, new.content, new.tags);
-END;
-
-CREATE TABLE IF NOT EXISTS memory_banks (
-    bank_id    INTEGER PRIMARY KEY AUTOINCREMENT,
-    bank_name  TEXT NOT NULL UNIQUE,
-    vector     BLOB NOT NULL,
-    dim        INTEGER NOT NULL,
-    fact_count INTEGER DEFAULT 0,
-    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
-);
-"""
-
-# Trust adjustment constants
-_HELPFUL_DELTA   =  0.05
-_UNHELPFUL_DELTA = -0.10
-_TRUST_MIN       =  0.0
-_TRUST_MAX       =  1.0
-
-# Entity extraction patterns
-_RE_CAPITALIZED  = re.compile(r'\b([A-Z][a-z]+(?:\s+[A-Z][a-z]+)+)\b')
-_RE_DOUBLE_QUOTE = re.compile(r'"([^"]+)"')
-_RE_SINGLE_QUOTE = re.compile(r"'([^']+)'")
-_RE_AKA          = re.compile(
-    r'(\w+(?:\s+\w+)*)\s+(?:aka|also known as)\s+(\w+(?:\s+\w+)*)',
-    re.IGNORECASE,
-)
-
-
-def _clamp_trust(value: float) -> float:
-    return max(_TRUST_MIN, min(_TRUST_MAX, value))
-
-
-class MemoryStore:
-    """SQLite-backed fact store with entity resolution and trust scoring."""
-
-    def __init__(
-        self,
-        db_path: "str | Path" = "~/.hermes/memory_store.db",
-        default_trust: float = 0.5,
-        hrr_dim: int = 1024,
-    ) -> None:
-        self.db_path = Path(db_path).expanduser()
-        self.db_path.parent.mkdir(parents=True, exist_ok=True)
-        self.default_trust = _clamp_trust(default_trust)
-        self.hrr_dim = hrr_dim
-        self._hrr_available = hrr._HAS_NUMPY
-        self._conn: sqlite3.Connection = sqlite3.connect(
-            str(self.db_path),
-            check_same_thread=False,
-            timeout=10.0,
-        )
-        self._lock = threading.RLock()
-        self._conn.row_factory = sqlite3.Row
-        self._init_db()
-
-    # ------------------------------------------------------------------
-    # Initialisation
-    # ------------------------------------------------------------------
-
-    def _init_db(self) -> None:
-        """Create tables, indexes, and triggers if they do not exist. Enable WAL mode."""
-        self._conn.execute("PRAGMA journal_mode=WAL")
-        self._conn.executescript(_SCHEMA)
-        # Migrate: add hrr_vector column if missing (safe for existing databases)
-        columns = {row[1] for row in self._conn.execute("PRAGMA table_info(facts)").fetchall()}
-        if "hrr_vector" not in columns:
-            self._conn.execute("ALTER TABLE facts ADD COLUMN hrr_vector BLOB")
-        self._conn.commit()
-
-    # ------------------------------------------------------------------
-    # Public API
-    # ------------------------------------------------------------------
-
-    def add_fact(
-        self,
-        content: str,
-        category: str = "general",
-        tags: str = "",
-    ) -> int:
-        """Insert a fact and return its fact_id.
-
-        Deduplicates by content (UNIQUE constraint). On duplicate, returns
-        the existing fact_id without modifying the row. Extracts entities from
-        the content and links them to the fact.
-        """
-        with self._lock:
-            content = content.strip()
-            if not content:
-                raise ValueError("content must not be empty")
-
-            try:
-                cur = self._conn.execute(
-                    """
-                    INSERT INTO facts (content, category, tags, trust_score)
-                    VALUES (?, ?, ?, ?)
-                    """,
-                    (content, category, tags, self.default_trust),
-                )
-                self._conn.commit()
-                fact_id: int = cur.lastrowid  # type: ignore[assignment]
-            except sqlite3.IntegrityError:
-                # Duplicate content — return existing id
-                row = self._conn.execute(
-                    "SELECT fact_id FROM facts WHERE content = ?", (content,)
-                ).fetchone()
-                return int(row["fact_id"])
-
-            # Entity extraction and linking
-            for name in self._extract_entities(content):
-                entity_id = self._resolve_entity(name)
-                self._link_fact_entity(fact_id, entity_id)
-
-            # Compute HRR vector after entity linking
-            self._compute_hrr_vector(fact_id, content)
-            self._rebuild_bank(category)
-
-            return fact_id
-
-    def search_facts(
-        self,
-        query: str,
-        category: str | None = None,
-        min_trust: float = 0.3,
-        limit: int = 10,
-    ) -> list[dict]:
-        """Full-text search over facts using FTS5.
-
-        Returns a list of fact dicts ordered by FTS5 rank, then trust_score
-        descending. Also increments retrieval_count for matched facts.
-        """
-        with self._lock:
-            query = query.strip()
-            if not query:
-                return []
-
-            params: list = [query, min_trust]
-            category_clause = ""
-            if category is not None:
-                category_clause = "AND f.category = ?"
-                params.append(category)
-            params.append(limit)
-
-            sql = f"""
-                SELECT f.fact_id, f.content, f.category, f.tags,
-                       f.trust_score, f.retrieval_count, f.helpful_count,
-                       f.created_at, f.updated_at
-                FROM facts f
-                JOIN facts_fts fts ON fts.rowid = f.fact_id
-                WHERE facts_fts MATCH ?
-                  AND f.trust_score >= ?
-                  {category_clause}
-                ORDER BY fts.rank, f.trust_score DESC
-                LIMIT ?
-            """
-
-            rows = self._conn.execute(sql, params).fetchall()
-            results = [self._row_to_dict(r) for r in rows]
-
-            if results:
-                ids = [r["fact_id"] for r in results]
-                placeholders = ",".join("?" * len(ids))
-                self._conn.execute(
-                    f"UPDATE facts SET retrieval_count = retrieval_count + 1 WHERE fact_id IN ({placeholders})",
-                    ids,
-                )
-                self._conn.commit()
-
-            return results
-
-    def update_fact(
-        self,
-        fact_id: int,
-        content: str | None = None,
-        trust_delta: float | None = None,
-        tags: str | None = None,
-        category: str | None = None,
-    ) -> bool:
-        """Partially update a fact. Trust is clamped to [0, 1].
-
-        Returns True if the row existed, False otherwise.
-        """
-        with self._lock:
-            row = self._conn.execute(
-                "SELECT fact_id, trust_score FROM facts WHERE fact_id = ?", (fact_id,)
-            ).fetchone()
-            if row is None:
-                return False
-
-            assignments: list[str] = ["updated_at = CURRENT_TIMESTAMP"]
-            params: list = []
-
-            if content is not None:
-                assignments.append("content = ?")
-                params.append(content.strip())
-            if tags is not None:
-                assignments.append("tags = ?")
-                params.append(tags)
-            if category is not None:
-                assignments.append("category = ?")
-                params.append(category)
-            if trust_delta is not None:
-                new_trust = _clamp_trust(row["trust_score"] + trust_delta)
-                assignments.append("trust_score = ?")
-                params.append(new_trust)
-
-            params.append(fact_id)
-            self._conn.execute(
-                f"UPDATE facts SET {', '.join(assignments)} WHERE fact_id = ?",
-                params,
-            )
-            self._conn.commit()
-
-            # If content changed, re-extract entities
-            if content is not None:
-                self._conn.execute(
-                    "DELETE FROM fact_entities WHERE fact_id = ?", (fact_id,)
-                )
-                for name in self._extract_entities(content):
-                    entity_id = self._resolve_entity(name)
-                    self._link_fact_entity(fact_id, entity_id)
-                self._conn.commit()
-
-            # Recompute HRR vector if content changed
-            if content is not None:
-                self._compute_hrr_vector(fact_id, content)
-            # Rebuild the bank for the fact's current category (a previous category's bank is left as-is)
-            cat = category or self._conn.execute(
-                "SELECT category FROM facts WHERE fact_id = ?", (fact_id,)
-            ).fetchone()["category"]
-            self._rebuild_bank(cat)
-
-            return True
-
-    def remove_fact(self, fact_id: int) -> bool:
-        """Delete a fact and its entity links. Returns True if the row existed."""
-        with self._lock:
-            row = self._conn.execute(
-                "SELECT fact_id, category FROM facts WHERE fact_id = ?", (fact_id,)
-            ).fetchone()
-            if row is None:
-                return False
-
-            self._conn.execute(
-                "DELETE FROM fact_entities WHERE fact_id = ?", (fact_id,)
-            )
-            self._conn.execute("DELETE FROM facts WHERE fact_id = ?", (fact_id,))
-            self._conn.commit()
-            self._rebuild_bank(row["category"])
-            return True
-
-    def list_facts(
-        self,
-        category: str | None = None,
-        min_trust: float = 0.0,
-        limit: int = 50,
-    ) -> list[dict]:
-        """Browse facts ordered by trust_score descending.
-
-        Optionally filter by category and minimum trust score.
-        """
-        with self._lock:
-            params: list = [min_trust]
-            category_clause = ""
-            if category is not None:
-                category_clause = "AND category = ?"
-                params.append(category)
-            params.append(limit)
-
-            sql = f"""
-                SELECT fact_id, content, category, tags, trust_score,
-                       retrieval_count, helpful_count, created_at, updated_at
-                FROM facts
-                WHERE trust_score >= ?
-                  {category_clause}
-                ORDER BY trust_score DESC
-                LIMIT ?
-            """
-            rows = self._conn.execute(sql, params).fetchall()
-            return [self._row_to_dict(r) for r in rows]
-
-    def record_feedback(self, fact_id: int, helpful: bool) -> dict:
-        """Record user feedback and adjust trust asymmetrically.
-
-        helpful=True  -> trust += 0.05, helpful_count += 1
-        helpful=False -> trust -= 0.10
-
-        Returns a dict with fact_id, old_trust, new_trust, helpful_count.
-        Raises KeyError if fact_id does not exist.
-        """
-        with self._lock:
-            row = self._conn.execute(
-                "SELECT fact_id, trust_score, helpful_count FROM facts WHERE fact_id = ?",
-                (fact_id,),
-            ).fetchone()
-            if row is None:
-                raise KeyError(f"fact_id {fact_id} not found")
-
-            old_trust: float = row["trust_score"]
-            delta = _HELPFUL_DELTA if helpful else _UNHELPFUL_DELTA
-            new_trust = _clamp_trust(old_trust + delta)
-
-            helpful_increment = 1 if helpful else 0
-            self._conn.execute(
-                """
-                UPDATE facts
-                SET trust_score    = ?,
-                    helpful_count  = helpful_count + ?,
-                    updated_at     = CURRENT_TIMESTAMP
-                WHERE fact_id = ?
-                """,
-                (new_trust, helpful_increment, fact_id),
-            )
-            self._conn.commit()
-
-            return {
-                "fact_id":      fact_id,
-                "old_trust":    old_trust,
-                "new_trust":    new_trust,
-                "helpful_count": row["helpful_count"] + helpful_increment,
-            }
-
-    # ------------------------------------------------------------------
-    # Entity helpers
-    # ------------------------------------------------------------------
-
-    def _extract_entities(self, text: str) -> list[str]:
-        """Extract entity candidates from text using simple regex rules.
-
-        Rules applied (in order):
-        1. Capitalized multi-word phrases  e.g. "John Doe"
-        2. Double-quoted terms             e.g. "Python"
-        3. Single-quoted terms             e.g. 'pytest'
-        4. AKA patterns                    e.g. "Guido aka BDFL" -> two entities
-
-        Returns a deduplicated list preserving first-seen order.
-        """
-        seen: set[str] = set()
-        candidates: list[str] = []
-
-        def _add(name: str) -> None:
-            stripped = name.strip()
-            if stripped and stripped.lower() not in seen:
-                seen.add(stripped.lower())
-                candidates.append(stripped)
-
-        for m in _RE_CAPITALIZED.finditer(text):
-            _add(m.group(1))
-
-        for m in _RE_DOUBLE_QUOTE.finditer(text):
-            _add(m.group(1))
-
-        for m in _RE_SINGLE_QUOTE.finditer(text):
-            _add(m.group(1))
-
-        for m in _RE_AKA.finditer(text):
-            _add(m.group(1))
-            _add(m.group(2))
-
-        return candidates
-
-    def _resolve_entity(self, name: str) -> int:
-        """Find an existing entity by name or alias (case-insensitive) or create one.
-
-        Returns the entity_id.
-        """
-        # Name match via LIKE: case-insensitive for ASCII, but % and _ in the name act as wildcards
-        row = self._conn.execute(
-            "SELECT entity_id FROM entities WHERE name LIKE ?", (name,)
-        ).fetchone()
-        if row is not None:
-            return int(row["entity_id"])
-
-        # Search aliases — aliases stored as comma-separated; use LIKE with % boundaries
-        alias_row = self._conn.execute(
-            """
-            SELECT entity_id FROM entities
-            WHERE ',' || aliases || ',' LIKE '%,' || ? || ',%'
-            """,
-            (name,),
-        ).fetchone()
-        if alias_row is not None:
-            return int(alias_row["entity_id"])
-
-        # Create new entity
-        cur = self._conn.execute(
-            "INSERT INTO entities (name) VALUES (?)", (name,)
-        )
-        self._conn.commit()
-        return int(cur.lastrowid)  # type: ignore[return-value]
-
-    def _link_fact_entity(self, fact_id: int, entity_id: int) -> None:
-        """Insert into fact_entities, silently ignore if the link already exists."""
-        self._conn.execute(
-            """
-            INSERT OR IGNORE INTO fact_entities (fact_id, entity_id)
-            VALUES (?, ?)
-            """,
-            (fact_id, entity_id),
-        )
-        self._conn.commit()
-
-    def _compute_hrr_vector(self, fact_id: int, content: str) -> None:
-        """Compute and store HRR vector for a fact. No-op if numpy unavailable."""
-        with self._lock:
-            if not self._hrr_available:
-                return
-
-            # Get entities linked to this fact
-            rows = self._conn.execute(
-                """
-                SELECT e.name FROM entities e
-                JOIN fact_entities fe ON fe.entity_id = e.entity_id
-                WHERE fe.fact_id = ?
-                """,
-                (fact_id,),
-            ).fetchall()
-            entities = [row["name"] for row in rows]
-
-            vector = hrr.encode_fact(content, entities, self.hrr_dim)
-            self._conn.execute(
-                "UPDATE facts SET hrr_vector = ? WHERE fact_id = ?",
-                (hrr.phases_to_bytes(vector), fact_id),
-            )
-            self._conn.commit()
-
-    def _rebuild_bank(self, category: str) -> None:
-        """Full rebuild of a category's memory bank from all its fact vectors."""
-        with self._lock:
-            if not self._hrr_available:
-                return
-
-            bank_name = f"cat:{category}"
-            rows = self._conn.execute(
-                "SELECT hrr_vector FROM facts WHERE category = ? AND hrr_vector IS NOT NULL",
-                (category,),
-            ).fetchall()
-
-            if not rows:
-                self._conn.execute("DELETE FROM memory_banks WHERE bank_name = ?", (bank_name,))
-                self._conn.commit()
-                return
-
-            vectors = [hrr.bytes_to_phases(row["hrr_vector"]) for row in rows]
-            bank_vector = hrr.bundle(*vectors)
-            fact_count = len(vectors)
-
-            # Estimate SNR for this bank size (the return value is not used here)
-            hrr.snr_estimate(self.hrr_dim, fact_count)
-
-            self._conn.execute(
-                """
-                INSERT INTO memory_banks (bank_name, vector, dim, fact_count, updated_at)
-                VALUES (?, ?, ?, ?, CURRENT_TIMESTAMP)
-                ON CONFLICT(bank_name) DO UPDATE SET
-                    vector = excluded.vector,
-                    dim = excluded.dim,
-                    fact_count = excluded.fact_count,
-                    updated_at = excluded.updated_at
-                """,
-                (bank_name, hrr.phases_to_bytes(bank_vector), self.hrr_dim, fact_count),
-            )
-            self._conn.commit()
-
-    def rebuild_all_vectors(self, dim: int | None = None) -> int:
-        """Recompute all HRR vectors + banks from text. For recovery/migration.
-
-        Returns the number of facts processed.
-        """
-        with self._lock:
-            if not self._hrr_available:
-                return 0
-
-            if dim is not None:
-                self.hrr_dim = dim
-
-            rows = self._conn.execute(
-                "SELECT fact_id, content, category FROM facts"
-            ).fetchall()
-
-            categories: set[str] = set()
-            for row in rows:
-                self._compute_hrr_vector(row["fact_id"], row["content"])
-                categories.add(row["category"])
-
-            for category in categories:
-                self._rebuild_bank(category)
-
-            return len(rows)
-
-    # ------------------------------------------------------------------
-    # Utilities
-    # ------------------------------------------------------------------
-
-    def _row_to_dict(self, row: sqlite3.Row) -> dict:
-        """Convert a sqlite3.Row to a plain dict."""
-        return dict(row)
-
-    def close(self) -> None:
-        """Close the database connection."""
-        self._conn.close()
-
-    def __enter__(self) -> "MemoryStore":
-        return self
-
-    def __exit__(self, *_: object) -> None:
-        self.close()
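Review note on the removed `_resolve_entity`: the name and alias lookups both rely on SQLite `LIKE`, which is case-insensitive for ASCII but also treats `%` and `_` in the bound value as wildcards. The following is a minimal sketch of that lookup against an in-memory database. The table layout (`aliases` as a comma-separated TEXT column) and the helper name `find_entity` are assumptions made for illustration, since the `entities` schema is not part of this hunk.

```python
import sqlite3

# Hypothetical stand-in for the entities table; the real schema is not shown in this diff.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entities ("
    " entity_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " name TEXT NOT NULL,"
    " aliases TEXT DEFAULT '')"
)
conn.execute(
    "INSERT INTO entities (name, aliases) VALUES (?, ?)",
    ("Guido van Rossum", "BDFL,Guido"),
)


def find_entity(name):
    """Return the entity_id matching name or alias, or None if nothing matches."""
    # Name lookup: LIKE without explicit wildcards behaves as a
    # case-insensitive comparison for ASCII text.
    row = conn.execute(
        "SELECT entity_id FROM entities WHERE name LIKE ?", (name,)
    ).fetchone()
    if row is not None:
        return row[0]

    # Alias lookup: wrap the comma-separated list in commas so that only
    # whole aliases match, not substrings of an alias.
    row = conn.execute(
        "SELECT entity_id FROM entities "
        "WHERE ',' || aliases || ',' LIKE '%,' || ? || ',%'",
        (name,),
    ).fetchone()
    return row[0] if row is not None else None
```

Because the bound value is used as a `LIKE` pattern, a name such as `Guido v_n Rossum` resolves to the same entity. A lookup that should be literal could compare with `=` together with `COLLATE NOCASE`, or escape the wildcards first.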
@@ -1,315 +0,0 @@
-"""Hindsight memory plugin — MemoryProvider interface.
-
-Long-term memory with knowledge graph, entity resolution, and multi-strategy
-retrieval. Supports cloud (API key) and local (embedded PostgreSQL) modes.
-
-Original PR #1811 by benfrank241, adapted to MemoryProvider ABC.
-
-Config via environment variables:
-  HINDSIGHT_API_KEY   — API key for Hindsight Cloud
-  HINDSIGHT_BANK_ID   — memory bank identifier (default: hermes)
-  HINDSIGHT_BUDGET    — recall budget: low/mid/high (default: mid)
-  HINDSIGHT_API_URL   — API endpoint
-  HINDSIGHT_MODE      — cloud or local (default: cloud)
-
-Or via ~/.hindsight/config.json (written by the original setup wizard).
-"""
-
-from __future__ import annotations
-
-import json
-import logging
-import os
-import queue
-import threading
-from typing import Any, Dict, List
-
-from agent.memory_provider import MemoryProvider
-
-logger = logging.getLogger(__name__)
-
-_DEFAULT_API_URL = "https://api.hindsight.vectorize.io"
-_VALID_BUDGETS = {"low", "mid", "high"}
-
-
-# ---------------------------------------------------------------------------
-# Thread helper (from original PR — avoids aiohttp event loop conflicts)
-# ---------------------------------------------------------------------------
-
-def _run_in_thread(fn, timeout: float = 30.0):
-    result_q: queue.Queue = queue.Queue(maxsize=1)
-
-    def _run():
-        import asyncio
-        asyncio.set_event_loop(None)
-        try:
-            result_q.put(("ok", fn()))
-        except Exception as exc:
-            result_q.put(("err", exc))
-
-    t = threading.Thread(target=_run, daemon=True, name="hindsight-call")
-    t.start()
-    kind, value = result_q.get(timeout=timeout)
-    if kind == "err":
-        raise value
-    return value
-
-
-# ---------------------------------------------------------------------------
-# Tool schemas
-# ---------------------------------------------------------------------------
-
-RETAIN_SCHEMA = {
-    "name": "hindsight_retain",
-    "description": (
-        "Store information to long-term memory. Hindsight automatically "
-        "extracts structured facts, resolves entities, and indexes for retrieval."
-    ),
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "content": {"type": "string", "description": "The information to store."},
-            "context": {"type": "string", "description": "Short label (e.g. 'user preference', 'project decision')."},
-        },
-        "required": ["content"],
-    },
-}
-
-RECALL_SCHEMA = {
-    "name": "hindsight_recall",
-    "description": (
-        "Search long-term memory. Returns memories ranked by relevance using "
-        "semantic search, keyword matching, entity graph traversal, and reranking."
-    ),
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "query": {"type": "string", "description": "What to search for."},
-        },
-        "required": ["query"],
-    },
-}
-
-REFLECT_SCHEMA = {
-    "name": "hindsight_reflect",
-    "description": (
-        "Synthesize a reasoned answer from long-term memories. Unlike recall, "
-        "this reasons across all stored memories to produce a coherent response."
-    ),
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "query": {"type": "string", "description": "The question to reflect on."},
-        },
-        "required": ["query"],
-    },
-}
-
-
-# ---------------------------------------------------------------------------
-# Config
-# ---------------------------------------------------------------------------
-
-def _load_config() -> dict:
-    """Load config from ~/.hindsight/config.json, falling back to env vars."""
-    from pathlib import Path
-    config_path = Path.home() / ".hindsight" / "config.json"
-
-    if config_path.exists():
-        try:
-            return json.loads(config_path.read_text(encoding="utf-8"))
-        except Exception:
-            pass
-
-    return {
-        "mode": os.environ.get("HINDSIGHT_MODE", "cloud"),
-        "apiKey": os.environ.get("HINDSIGHT_API_KEY", ""),
-        "banks": {
-            "hermes": {
-                "bankId": os.environ.get("HINDSIGHT_BANK_ID", "hermes"),
-                "budget": os.environ.get("HINDSIGHT_BUDGET", "mid"),
-                "enabled": True,
-            }
-        },
-    }
-
-
-# ---------------------------------------------------------------------------
-# MemoryProvider implementation
-# ---------------------------------------------------------------------------
-
-class HindsightMemoryProvider(MemoryProvider):
-    """Hindsight long-term memory with knowledge graph and multi-strategy retrieval."""
-
-    def __init__(self):
-        self._config = None
-        self._api_key = None
-        self._bank_id = "hermes"
-        self._budget = "mid"
-        self._mode = "cloud"
-        self._prefetch_result = ""
-        self._prefetch_lock = threading.Lock()
-        self._prefetch_thread = None
-
-    @property
-    def name(self) -> str:
-        return "hindsight"
-
-    def is_available(self) -> bool:
-        try:
-            cfg = _load_config()
-            mode = cfg.get("mode", "cloud")
-            if mode == "local":
-                embed = cfg.get("embed", {})
-                return bool(embed.get("llmApiKey") or os.environ.get("HINDSIGHT_LLM_API_KEY"))
-            api_key = cfg.get("apiKey") or os.environ.get("HINDSIGHT_API_KEY", "")
-            return bool(api_key)
-        except Exception:
-            return False
-
-    def get_config_schema(self):
-        return [
-            {"key": "mode", "description": "Cloud API or local embedded mode", "default": "cloud", "choices": ["cloud", "local"]},
-            {"key": "api_key", "description": "Hindsight Cloud API key", "secret": True, "env_var": "HINDSIGHT_API_KEY", "url": "https://app.hindsight.vectorize.io"},
-            {"key": "bank_id", "description": "Memory bank identifier", "default": "hermes"},
-            {"key": "budget", "description": "Recall thoroughness", "default": "mid", "choices": ["low", "mid", "high"]},
-            {"key": "llm_provider", "description": "LLM provider for local mode", "default": "anthropic", "choices": ["anthropic", "openai", "groq", "ollama"]},
-            {"key": "llm_api_key", "description": "LLM API key for local mode", "secret": True, "env_var": "HINDSIGHT_LLM_API_KEY"},
-            {"key": "llm_model", "description": "LLM model for local mode", "default": "claude-haiku-4-5-20251001"},
-        ]
-
-    def _make_client(self):
-        """Create a fresh Hindsight client (thread-safe)."""
-        if self._mode == "local":
-            from hindsight import HindsightEmbedded
-            embed = self._config.get("embed", {})
-            return HindsightEmbedded(
-                profile=embed.get("profile", "hermes"),
-                llm_provider=embed.get("llmProvider", ""),
-                llm_api_key=embed.get("llmApiKey", ""),
-                llm_model=embed.get("llmModel", ""),
-            )
-        from hindsight_client import Hindsight
-        return Hindsight(api_key=self._api_key, timeout=30.0)
-
-    def initialize(self, session_id: str, **kwargs) -> None:
-        self._config = _load_config()
-        self._mode = self._config.get("mode", "cloud")
-        self._api_key = self._config.get("apiKey") or os.environ.get("HINDSIGHT_API_KEY", "")
-
-        banks = self._config.get("banks", {}).get("hermes", {})
-        self._bank_id = banks.get("bankId", "hermes")
-        budget = banks.get("budget", "mid")
-        self._budget = budget if budget in _VALID_BUDGETS else "mid"
-
-        # Ensure bank exists
-        try:
-            client = _run_in_thread(self._make_client)
-            _run_in_thread(lambda: client.create_bank(bank_id=self._bank_id, name=self._bank_id))
-        except Exception:
-            pass  # Usually means the bank already exists; any other error is swallowed as well
-
-    def system_prompt_block(self) -> str:
-        return (
-            f"# Hindsight Memory\n"
-            f"Active. Bank: {self._bank_id}, budget: {self._budget}.\n"
-            f"Use hindsight_recall to search, hindsight_reflect for synthesis, "
-            f"hindsight_retain to store facts."
-        )
-
-    def prefetch(self, query: str) -> str:
-        if self._prefetch_thread and self._prefetch_thread.is_alive():
-            self._prefetch_thread.join(timeout=3.0)
-        with self._prefetch_lock:
-            result = self._prefetch_result
-            self._prefetch_result = ""
-        if not result:
-            return ""
-        return f"## Hindsight Memory\n{result}"
-
-    def queue_prefetch(self, query: str) -> None:
-        def _run():
-            try:
-                client = self._make_client()
-                resp = client.recall(bank_id=self._bank_id, query=query, budget=self._budget)
-                if resp.results:
-                    text = "\n".join(r.text for r in resp.results if r.text)
-                    with self._prefetch_lock:
-                        self._prefetch_result = text
-            except Exception as e:
-                logger.debug("Hindsight prefetch failed: %s", e)
-
-        self._prefetch_thread = threading.Thread(target=_run, daemon=True, name="hindsight-prefetch")
-        self._prefetch_thread.start()
-
-    def sync_turn(self, user_content: str, assistant_content: str) -> None:
-        combined = f"User: {user_content}\nAssistant: {assistant_content}"
-        try:
-            _run_in_thread(
-                lambda: self._make_client().retain(
-                    bank_id=self._bank_id, content=combined, context="conversation"
-                )
-            )
-        except Exception as e:
-            logger.warning("Hindsight sync failed: %s", e)
-
-    def get_tool_schemas(self) -> List[Dict[str, Any]]:
-        return [RETAIN_SCHEMA, RECALL_SCHEMA, REFLECT_SCHEMA]
-
-    def handle_tool_call(self, tool_name: str, args: dict, **kwargs) -> str:
-        if tool_name == "hindsight_retain":
-            content = args.get("content", "")
-            if not content:
-                return json.dumps({"error": "Missing required parameter: content"})
-            context = args.get("context")
-            try:
-                _run_in_thread(
-                    lambda: self._make_client().retain(
-                        bank_id=self._bank_id, content=content, context=context
-                    )
-                )
-                return json.dumps({"result": "Memory stored successfully."})
-            except Exception as e:
-                return json.dumps({"error": f"Failed to store memory: {e}"})
-
-        elif tool_name == "hindsight_recall":
-            query = args.get("query", "")
-            if not query:
-                return json.dumps({"error": "Missing required parameter: query"})
-            try:
-                resp = _run_in_thread(
-                    lambda: self._make_client().recall(
-                        bank_id=self._bank_id, query=query, budget=self._budget
-                    )
-                )
-                if not resp.results:
-                    return json.dumps({"result": "No relevant memories found."})
-                lines = [f"{i}. {r.text}" for i, r in enumerate(resp.results, 1)]
-                return json.dumps({"result": "\n".join(lines)})
-            except Exception as e:
-                return json.dumps({"error": f"Failed to search memory: {e}"})
-
-        elif tool_name == "hindsight_reflect":
-            query = args.get("query", "")
-            if not query:
-                return json.dumps({"error": "Missing required parameter: query"})
-            try:
-                resp = _run_in_thread(
-                    lambda: self._make_client().reflect(
-                        bank_id=self._bank_id, query=query, budget=self._budget
-                    )
-                )
-                return json.dumps({"result": resp.text or "No relevant memories found."})
-            except Exception as e:
-                return json.dumps({"error": f"Failed to reflect: {e}"})
-
-        return json.dumps({"error": f"Unknown tool: {tool_name}"})
-
-    def shutdown(self) -> None:
-        if self._prefetch_thread and self._prefetch_thread.is_alive():
-            self._prefetch_thread.join(timeout=5.0)
-
-
-def register(ctx) -> None:
-    """Register Hindsight as a memory provider plugin."""
-    ctx.register_memory_provider(HindsightMemoryProvider())
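Review note on the removed `_run_in_thread` helper: it runs a blocking call on a daemon thread and hands the result or the exception back through a one-slot queue. The sketch below reproduces that pattern in isolation. The name `run_in_thread` is illustrative, and the `asyncio.set_event_loop(None)` call from the original is left out to keep the example focused on the hand-off.

```python
import queue
import threading


def run_in_thread(fn, timeout=30.0):
    """Run fn() on a daemon thread and return its result or re-raise its exception."""
    result_q = queue.Queue(maxsize=1)

    def _worker():
        try:
            result_q.put(("ok", fn()))
        except Exception as exc:
            # Hand the exception to the caller instead of losing it in the thread.
            result_q.put(("err", exc))

    threading.Thread(target=_worker, daemon=True).start()

    # Queue.get raises queue.Empty if nothing arrives within the timeout;
    # the worker thread itself keeps running in that case.
    kind, value = result_q.get(timeout=timeout)
    if kind == "err":
        raise value
    return value
```

Two properties of this pattern are worth keeping in mind when reviewing callers: a timeout surfaces as `queue.Empty` rather than `TimeoutError`, and the timed-out call is not cancelled, only abandoned.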
@@ -1,8 +0,0 @@
-name: hindsight-memory
-version: 1.0.0
-description: >
-  Long-term memory via Hindsight — knowledge graph with entity resolution,
-  multi-strategy retrieval (semantic + BM25 + graph + temporal), and
-  cross-encoder reranking. Cloud or local mode.
-requires_env:
-  - HINDSIGHT_API_KEY
@@ -1,294 +0,0 @@
-"""Mem0 memory plugin — MemoryProvider interface.
-
-Server-side LLM fact extraction, semantic search with reranking, and
-automatic deduplication via the Mem0 Platform API.
-
-Original PR #2933 by kartik-mem0, adapted to MemoryProvider ABC.
-
-Config via environment variables:
-  MEM0_API_KEY       — Mem0 Platform API key (required)
-  MEM0_USER_ID       — User identifier (default: hermes-user)
-  MEM0_AGENT_ID      — Agent identifier (default: hermes)
-
-Or via $HERMES_HOME/mem0.json.
-"""
-
-from __future__ import annotations
-
-import json
-import logging
-import os
-import threading
-from pathlib import Path
-from typing import Any, Dict, List
-
-from agent.memory_provider import MemoryProvider
-
-logger = logging.getLogger(__name__)
-
-
-# ---------------------------------------------------------------------------
-# Config
-# ---------------------------------------------------------------------------
-
-def _load_config() -> dict:
-    """Load config from $HERMES_HOME/mem0.json or env vars."""
-    hermes_home = os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes"))
-    config_path = Path(hermes_home) / "mem0.json"
-
-    if config_path.exists():
-        try:
-            return json.loads(config_path.read_text(encoding="utf-8"))
-        except Exception:
-            pass
-
-    return {
-        "api_key": os.environ.get("MEM0_API_KEY", ""),
-        "user_id": os.environ.get("MEM0_USER_ID", "hermes-user"),
-        "agent_id": os.environ.get("MEM0_AGENT_ID", "hermes"),
-        "rerank": True,
-        "keyword_search": False,
-    }
-
-
-# ---------------------------------------------------------------------------
-# Tool schemas
-# ---------------------------------------------------------------------------
-
-PROFILE_SCHEMA = {
-    "name": "mem0_profile",
-    "description": (
-        "Retrieve all stored memories about the user — preferences, facts, "
-        "project context. Fast, no reranking. Use at conversation start."
-    ),
-    "parameters": {"type": "object", "properties": {}, "required": []},
-}
-
-SEARCH_SCHEMA = {
-    "name": "mem0_search",
-    "description": (
-        "Search memories by meaning. Returns relevant facts ranked by similarity. "
-        "Set rerank=true for higher accuracy (+150ms)."
-    ),
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "query": {"type": "string", "description": "What to search for."},
-            "rerank": {"type": "boolean", "description": "Enable reranking for precision (default: false)."},
-            "top_k": {"type": "integer", "description": "Max results (default: 10, max: 50)."},
-        },
-        "required": ["query"],
-    },
-}
-
-CONTEXT_SCHEMA = {
-    "name": "mem0_context",
-    "description": (
-        "Deep retrieval with forced reranking. Use when you need the most "
-        "relevant memories for a specific topic."
-    ),
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "query": {"type": "string", "description": "What to search for."},
-        },
-        "required": ["query"],
-    },
-}
-
-CONCLUDE_SCHEMA = {
-    "name": "mem0_conclude",
-    "description": (
-        "Store a durable fact about the user. Stored verbatim (no LLM extraction). "
-        "Use for explicit preferences, corrections, or decisions."
-    ),
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "conclusion": {"type": "string", "description": "The fact to store."},
-        },
-        "required": ["conclusion"],
-    },
-}
-
-
-# ---------------------------------------------------------------------------
-# MemoryProvider implementation
-# ---------------------------------------------------------------------------
-
-class Mem0MemoryProvider(MemoryProvider):
-    """Mem0 Platform memory with server-side extraction and semantic search."""
-
-    def __init__(self):
-        self._config = None
-        self._client = None
-        self._api_key = ""
-        self._user_id = "hermes-user"
-        self._agent_id = "hermes"
-        self._rerank = True
-        self._prefetch_result = ""
-        self._prefetch_lock = threading.Lock()
-        self._prefetch_thread = None
-
-    @property
-    def name(self) -> str:
-        return "mem0"
-
-    def is_available(self) -> bool:
-        cfg = _load_config()
-        return bool(cfg.get("api_key"))
-
-    def get_config_schema(self):
-        return [
-            {"key": "api_key", "description": "Mem0 Platform API key", "secret": True, "required": True, "env_var": "MEM0_API_KEY", "url": "https://app.mem0.ai"},
-            {"key": "user_id", "description": "User identifier", "default": "hermes-user"},
-            {"key": "agent_id", "description": "Agent identifier", "default": "hermes"},
-            {"key": "rerank", "description": "Enable reranking for recall", "default": "true", "choices": ["true", "false"]},
-        ]
-
-    def _get_client(self):
-        if self._client is not None:
-            return self._client
-        try:
-            from mem0 import MemoryClient
-            self._client = MemoryClient(api_key=self._api_key)
-            return self._client
-        except ImportError:
-            raise RuntimeError("mem0 package not installed. Run: pip install mem0ai")
-
-    def initialize(self, session_id: str, **kwargs) -> None:
-        self._config = _load_config()
-        self._api_key = self._config.get("api_key", "")
-        self._user_id = self._config.get("user_id", "hermes-user")
-        self._agent_id = self._config.get("agent_id", "hermes")
-        self._rerank = self._config.get("rerank", True)
-
-    def system_prompt_block(self) -> str:
-        return (
-            "# Mem0 Memory\n"
-            f"Active. User: {self._user_id}.\n"
-            "Use mem0_search to find memories, mem0_conclude to store facts, "
-            "mem0_profile for a full overview."
-        )
-
-    def prefetch(self, query: str) -> str:
-        if self._prefetch_thread and self._prefetch_thread.is_alive():
-            self._prefetch_thread.join(timeout=3.0)
-        with self._prefetch_lock:
-            result = self._prefetch_result
-            self._prefetch_result = ""
-        if not result:
-            return ""
-        return f"## Mem0 Memory\n{result}"
-
-    def queue_prefetch(self, query: str) -> None:
-        def _run():
-            try:
-                client = self._get_client()
-                results = client.search(
-                    query=query,
-                    user_id=self._user_id,
-                    rerank=self._rerank,
-                    top_k=5,
-                )
-                if results:
-                    lines = [r.get("memory", "") for r in results if r.get("memory")]
-                    with self._prefetch_lock:
-                        self._prefetch_result = "\n".join(f"- {l}" for l in lines)
-            except Exception as e:
-                logger.debug("Mem0 prefetch failed: %s", e)
-
-        self._prefetch_thread = threading.Thread(target=_run, daemon=True, name="mem0-prefetch")
-        self._prefetch_thread.start()
-
-    def sync_turn(self, user_content: str, assistant_content: str) -> None:
-        """Send the turn to Mem0 for server-side fact extraction."""
-        try:
-            client = self._get_client()
-            messages = [
-                {"role": "user", "content": user_content},
-                {"role": "assistant", "content": assistant_content},
-            ]
-            client.add(messages, user_id=self._user_id, agent_id=self._agent_id)
-        except Exception as e:
-            logger.warning("Mem0 sync failed: %s", e)
-
-    def get_tool_schemas(self) -> List[Dict[str, Any]]:
-        return [PROFILE_SCHEMA, SEARCH_SCHEMA, CONTEXT_SCHEMA, CONCLUDE_SCHEMA]
-
-    def handle_tool_call(self, tool_name: str, args: dict, **kwargs) -> str:
-        try:
-            client = self._get_client()
-        except Exception as e:
-            return json.dumps({"error": str(e)})
-
-        if tool_name == "mem0_profile":
-            try:
-                memories = client.get_all(user_id=self._user_id)
-                if not memories:
-                    return json.dumps({"result": "No memories stored yet."})
-                lines = [m.get("memory", "") for m in memories if m.get("memory")]
-                return json.dumps({"result": "\n".join(lines), "count": len(lines)})
-            except Exception as e:
-                return json.dumps({"error": f"Failed to fetch profile: {e}"})
-
-        elif tool_name == "mem0_search":
-            query = args.get("query", "")
-            if not query:
-                return json.dumps({"error": "Missing required parameter: query"})
-            rerank = args.get("rerank", False)
-            top_k = min(int(args.get("top_k", 10)), 50)
-            try:
-                results = client.search(
-                    query=query, user_id=self._user_id,
-                    rerank=rerank, top_k=top_k,
-                )
-                if not results:
-                    return json.dumps({"result": "No relevant memories found."})
-                items = [{"memory": r.get("memory", ""), "score": r.get("score", 0)} for r in results]
-                return json.dumps({"results": items, "count": len(items)})
-            except Exception as e:
-                return json.dumps({"error": f"Search failed: {e}"})
-
-        elif tool_name == "mem0_context":
-            query = args.get("query", "")
-            if not query:
-                return json.dumps({"error": "Missing required parameter: query"})
-            try:
-                results = client.search(
-                    query=query, user_id=self._user_id,
-                    rerank=True, top_k=5,
-                )
-                if not results:
-                    return json.dumps({"result": "No relevant memories found."})
-                items = [{"memory": r.get("memory", ""), "score": r.get("score", 0)} for r in results]
-                return json.dumps({"results": items, "count": len(items)})
-            except Exception as e:
-                return json.dumps({"error": f"Context retrieval failed: {e}"})
-
-        elif tool_name == "mem0_conclude":
-            conclusion = args.get("conclusion", "")
-            if not conclusion:
-                return json.dumps({"error": "Missing required parameter: conclusion"})
-            try:
-                client.add(
-                    [{"role": "user", "content": conclusion}],
-                    user_id=self._user_id,
-                    agent_id=self._agent_id,
-                    infer=False,
-                )
-                return json.dumps({"result": "Fact stored."})
-            except Exception as e:
-                return json.dumps({"error": f"Failed to store: {e}"})
-
-        return json.dumps({"error": f"Unknown tool: {tool_name}"})
-
-    def shutdown(self) -> None:
-        if self._prefetch_thread and self._prefetch_thread.is_alive():
-            self._prefetch_thread.join(timeout=5.0)
-        self._client = None
-
-
-def register(ctx) -> None:
-    """Register Mem0 as a memory provider plugin."""
-    ctx.register_memory_provider(Mem0MemoryProvider())
@@ -1,7 +0,0 @@
-name: mem0-memory
-version: 1.0.0
-description: >
-  Long-term memory via Mem0 Platform — server-side LLM fact extraction,
-  semantic search with reranking, and automatic deduplication.
-requires_env:
-  - MEM0_API_KEY
@@ -1,205 +0,0 @@
-"""OpenViking memory plugin — MemoryProvider interface.
-
-Read-only semantic search over a self-hosted OpenViking knowledge server.
-Supports search (fast/deep/auto), URI-based content reading, and
-filesystem-style browsing.
-
-Original PR #3369 by Mibayy, adapted to MemoryProvider ABC.
-
-Config via environment variables:
-  OPENVIKING_ENDPOINT  — Server URL (default: http://127.0.0.1:1933)
-  OPENVIKING_API_KEY   — Optional API key
-"""
-
-from __future__ import annotations
-
-import json
-import logging
-import os
-from typing import Any, Dict, List
-
-from agent.memory_provider import MemoryProvider
-
-logger = logging.getLogger(__name__)
-
-
-# ---------------------------------------------------------------------------
-# Tool schemas
-# ---------------------------------------------------------------------------
-
-SEARCH_SCHEMA = {
-    "name": "viking_search",
-    "description": (
-        "Semantic search over OpenViking knowledge base. "
-        "Returns ranked results with URIs for deeper reading."
-    ),
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "query": {"type": "string", "description": "Search query."},
-            "mode": {
-                "type": "string", "enum": ["auto", "fast", "deep"],
-                "description": "Search depth (default: auto).",
-            },
-            "scope": {"type": "string", "description": "URI prefix to scope search."},
-            "limit": {"type": "integer", "description": "Max results (default: 10)."},
-        },
-        "required": ["query"],
-    },
-}
-
-READ_SCHEMA = {
-    "name": "viking_read",
-    "description": (
-        "Read content at a viking:// URI. Supports three detail levels: "
-        "abstract (summary), overview (key points), read (full content)."
-    ),
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "uri": {"type": "string", "description": "viking:// URI to read."},
-            "level": {
-                "type": "string", "enum": ["abstract", "overview", "read"],
-                "description": "Detail level (default: overview).",
-            },
-        },
-        "required": ["uri"],
-    },
-}
-
-BROWSE_SCHEMA = {
-    "name": "viking_browse",
-    "description": (
-        "Browse the OpenViking knowledge store like a filesystem. "
-        "Supports tree (hierarchy), list (directory), and stat (metadata)."
-    ),
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "action": {
-                "type": "string", "enum": ["tree", "list", "stat"],
-                "description": "Browse action.",
-            },
-            "path": {"type": "string", "description": "Path to browse (default: root)."},
-        },
-        "required": ["action"],
-    },
-}
-
-
-# ---------------------------------------------------------------------------
-# MemoryProvider implementation
-# ---------------------------------------------------------------------------
-
-class OpenVikingMemoryProvider(MemoryProvider):
-    """Read-only memory via OpenViking self-hosted knowledge server."""
-
-    def __init__(self):
-        self._endpoint = ""
-        self._api_key = ""
-
-    @property
-    def name(self) -> str:
-        return "openviking"
-
-    def get_config_schema(self):
-        return [
-            {"key": "endpoint", "description": "OpenViking server URL", "required": True, "default": "http://127.0.0.1:1933"},
-            {"key": "api_key", "description": "OpenViking API key (if server requires auth)", "secret": True, "env_var": "OPENVIKING_API_KEY"},
-        ]
-
-    def is_available(self) -> bool:
-        endpoint = os.environ.get("OPENVIKING_ENDPOINT", "")
-        if not endpoint:
-            return False
-        # Quick health check
-        try:
-            import httpx
-            resp = httpx.get(f"{endpoint}/health", timeout=3.0)
-            return resp.status_code == 200
-        except Exception:
-            return False
-
-    def initialize(self, session_id: str, **kwargs) -> None:
-        self._endpoint = os.environ.get("OPENVIKING_ENDPOINT", "http://127.0.0.1:1933")
-        self._api_key = os.environ.get("OPENVIKING_API_KEY", "")
-
-    def _headers(self) -> dict:
-        h = {"Content-Type": "application/json"}
-        if self._api_key:
-            h["X-API-Key"] = self._api_key
-        return h
-
-    def system_prompt_block(self) -> str:
-        return (
-            "# OpenViking Knowledge Base\n"
-            f"Active. Endpoint: {self._endpoint}\n"
-            "Use viking_search to find information, viking_read for details, "
-            "viking_browse to explore the knowledge tree."
-        )
-
-    def prefetch(self, query: str) -> str:
-        """OpenViking is tool-driven, no automatic prefetch."""
-        return ""
-
-    def get_tool_schemas(self) -> List[Dict[str, Any]]:
-        return [SEARCH_SCHEMA, READ_SCHEMA, BROWSE_SCHEMA]
-
-    def handle_tool_call(self, tool_name: str, args: dict, **kwargs) -> str:
-        try:
-            import httpx
-        except ImportError:
-            return json.dumps({"error": "httpx not installed"})
-
-        try:
-            if tool_name == "viking_search":
-                return self._search(httpx, args)
-            elif tool_name == "viking_read":
-                return self._read(httpx, args)
-            elif tool_name == "viking_browse":
-                return self._browse(httpx, args)
-            return json.dumps({"error": f"Unknown tool: {tool_name}"})
-        except Exception as e:
-            return json.dumps({"error": str(e)})
-
-    def _search(self, httpx, args: dict) -> str:
-        query = args.get("query", "")
-        if not query:
-            return json.dumps({"error": "query is required"})
-        payload = {"query": query, "mode": args.get("mode", "auto")}
-        if args.get("scope"):
-            payload["scope"] = args["scope"]
-        if args.get("limit"):
-            payload["limit"] = args["limit"]
-        resp = httpx.post(
-            f"{self._endpoint}/v1/search",
-            json=payload, headers=self._headers(), timeout=30.0,
-        )
-        return resp.text
-
-    def _read(self, httpx, args: dict) -> str:
-        uri = args.get("uri", "")
-        if not uri:
-            return json.dumps({"error": "uri is required"})
-        level = args.get("level", "overview")
-        resp = httpx.post(
-            f"{self._endpoint}/v1/read",
-            json={"uri": uri, "level": level},
-            headers=self._headers(), timeout=30.0,
-        )
-        return resp.text
-
-    def _browse(self, httpx, args: dict) -> str:
-        action = args.get("action", "tree")
-        path = args.get("path", "/")
-        resp = httpx.post(
-            f"{self._endpoint}/v1/browse",
-            json={"action": action, "path": path},
-            headers=self._headers(), timeout=30.0,
-        )
-        return resp.text
-
-
-def register(ctx) -> None:
-    """Register OpenViking as a memory provider plugin."""
-    ctx.register_memory_provider(OpenVikingMemoryProvider())
@@ -1,7 +0,0 @@
-name: openviking-memory
-version: 1.0.0
-description: >
-  Read-only memory via OpenViking — semantic search, URI-based content
-  reading, and filesystem browsing over a self-hosted knowledge server.
-requires_env:
-  - OPENVIKING_ENDPOINT
@@ -1,280 +0,0 @@
-"""RetainDB memory plugin — MemoryProvider interface.
-
-Cross-session memory via RetainDB cloud API. Durable write-behind queue,
-semantic search with deduplication, and user profile retrieval.
-
-Original PR #2732 by Alinxus, adapted to MemoryProvider ABC.
-
-Config via environment variables:
-  RETAINDB_API_KEY    — API key (required)
-  RETAINDB_BASE_URL   — API endpoint (default: https://api.retaindb.com)
-  RETAINDB_PROJECT    — Project identifier (default: hermes)
-"""
-
-from __future__ import annotations
-
-import json
-import logging
-import os
-import threading
-from typing import Any, Dict, List
-
-from agent.memory_provider import MemoryProvider
-
-logger = logging.getLogger(__name__)
-
-_DEFAULT_BASE_URL = "https://api.retaindb.com"
-
-
-# ---------------------------------------------------------------------------
-# Tool schemas
-# ---------------------------------------------------------------------------
-
-PROFILE_SCHEMA = {
-    "name": "retaindb_profile",
-    "description": "Get the user's stable profile — preferences, facts, and patterns.",
-    "parameters": {"type": "object", "properties": {}, "required": []},
-}
-
-SEARCH_SCHEMA = {
-    "name": "retaindb_search",
-    "description": (
-        "Semantic search across stored memories. Returns ranked results "
-        "with relevance scores."
-    ),
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "query": {"type": "string", "description": "What to search for."},
-            "top_k": {"type": "integer", "description": "Max results (default: 8, max: 20)."},
-        },
-        "required": ["query"],
-    },
-}
-
-CONTEXT_SCHEMA = {
-    "name": "retaindb_context",
-    "description": "Synthesized 'what matters now' context block for the current task.",
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "query": {"type": "string", "description": "Current task or question."},
-        },
-        "required": ["query"],
-    },
-}
-
-REMEMBER_SCHEMA = {
-    "name": "retaindb_remember",
-    "description": "Persist an explicit fact or preference to long-term memory.",
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "content": {"type": "string", "description": "The fact to remember."},
-            "memory_type": {
-                "type": "string",
-                "enum": ["preference", "fact", "decision", "context"],
-                "description": "Category (default: fact).",
-            },
-            "importance": {
-                "type": "number",
-                "description": "Importance 0-1 (default: 0.5).",
-            },
-        },
-        "required": ["content"],
-    },
-}
-
-FORGET_SCHEMA = {
-    "name": "retaindb_forget",
-    "description": "Delete a specific memory by ID.",
-    "parameters": {
-        "type": "object",
-        "properties": {
-            "memory_id": {"type": "string", "description": "Memory ID to delete."},
-        },
-        "required": ["memory_id"],
-    },
-}
-
-
-# ---------------------------------------------------------------------------
-# MemoryProvider implementation
-# ---------------------------------------------------------------------------
-
-class RetainDBMemoryProvider(MemoryProvider):
-    """RetainDB cloud memory with write-behind queue and semantic search."""
-
-    def __init__(self):
-        self._api_key = ""
-        self._base_url = _DEFAULT_BASE_URL
-        self._project = "hermes"
-        self._user_id = ""
-        self._prefetch_result = ""
-        self._prefetch_lock = threading.Lock()
-        self._prefetch_thread = None
-
-    @property
-    def name(self) -> str:
-        return "retaindb"
-
-    def is_available(self) -> bool:
-        return bool(os.environ.get("RETAINDB_API_KEY"))
-
-    def get_config_schema(self):
-        return [
-            {"key": "api_key", "description": "RetainDB API key", "secret": True, "required": True, "env_var": "RETAINDB_API_KEY", "url": "https://retaindb.com"},
-            {"key": "base_url", "description": "API endpoint", "default": "https://api.retaindb.com"},
-            {"key": "project", "description": "Project identifier", "default": "hermes"},
-        ]
-
-    def _headers(self) -> dict:
-        return {
-            "Authorization": f"Bearer {self._api_key}",
-            "Content-Type": "application/json",
-        }
-
-    def _api(self, method: str, path: str, **kwargs):
-        """Make an API call to RetainDB."""
-        import requests
-        url = f"{self._base_url}{path}"
-        resp = requests.request(method, url, headers=self._headers(), timeout=30, **kwargs)
-        resp.raise_for_status()
-        return resp.json()
-
-    def initialize(self, session_id: str, **kwargs) -> None:
-        self._api_key = os.environ.get("RETAINDB_API_KEY", "")
-        self._base_url = os.environ.get("RETAINDB_BASE_URL", _DEFAULT_BASE_URL)
-        self._project = os.environ.get("RETAINDB_PROJECT", "hermes")
-        self._user_id = kwargs.get("user_id", "default")
-        self._session_id = session_id
-
-    def system_prompt_block(self) -> str:
-        return (
-            "# RetainDB Memory\n"
-            f"Active. Project: {self._project}.\n"
-            "Use retaindb_search to find memories, retaindb_remember to store facts, "
-            "retaindb_profile for a user overview, retaindb_context for task-relevant context."
-        )
-
-    def prefetch(self, query: str) -> str:
-        if self._prefetch_thread and self._prefetch_thread.is_alive():
-            self._prefetch_thread.join(timeout=3.0)
-        with self._prefetch_lock:
-            result = self._prefetch_result
-            self._prefetch_result = ""
-        if not result:
-            return ""
-        return f"## RetainDB Memory\n{result}"
-
-    def queue_prefetch(self, query: str) -> None:
-        def _run():
-            try:
-                data = self._api("POST", "/v1/recall", json={
-                    "project": self._project,
-                    "query": query,
-                    "user_id": self._user_id,
-                    "top_k": 5,
-                })
-                results = data.get("results", [])
-                if results:
-                    lines = [r.get("content", "") for r in results if r.get("content")]
-                    with self._prefetch_lock:
-                        self._prefetch_result = "\n".join(f"- {l}" for l in lines)
-            except Exception as e:
-                logger.debug("RetainDB prefetch failed: %s", e)
-
-        self._prefetch_thread = threading.Thread(target=_run, daemon=True, name="retaindb-prefetch")
-        self._prefetch_thread.start()
-
-    def sync_turn(self, user_content: str, assistant_content: str) -> None:
-        try:
-            self._api("POST", "/v1/ingest", json={
-                "project": self._project,
-                "user_id": self._user_id,
-                "session_id": self._session_id,
-                "messages": [
-                    {"role": "user", "content": user_content},
-                    {"role": "assistant", "content": assistant_content},
-                ],
-            })
-        except Exception as e:
-            logger.warning("RetainDB sync failed: %s", e)
-
-    def get_tool_schemas(self) -> List[Dict[str, Any]]:
-        return [PROFILE_SCHEMA, SEARCH_SCHEMA, CONTEXT_SCHEMA, REMEMBER_SCHEMA, FORGET_SCHEMA]
-
-    def handle_tool_call(self, tool_name: str, args: dict, **kwargs) -> str:
-        try:
-            if tool_name == "retaindb_profile":
-                data = self._api("GET", f"/v1/profile/{self._project}/{self._user_id}")
-                return json.dumps(data)
-
-            elif tool_name == "retaindb_search":
-                query = args.get("query", "")
-                if not query:
-                    return json.dumps({"error": "query is required"})
-                data = self._api("POST", "/v1/search", json={
-                    "project": self._project,
-                    "user_id": self._user_id,
-                    "query": query,
-                    "top_k": min(int(args.get("top_k", 8)), 20),
-                })
-                return json.dumps(data)
-
-            elif tool_name == "retaindb_context":
-                query = args.get("query", "")
-                if not query:
-                    return json.dumps({"error": "query is required"})
-                data = self._api("POST", "/v1/recall", json={
-                    "project": self._project,
-                    "user_id": self._user_id,
-                    "query": query,
-                    "top_k": 5,
-                })
-                return json.dumps(data)
-
-            elif tool_name == "retaindb_remember":
-                content = args.get("content", "")
-                if not content:
-                    return json.dumps({"error": "content is required"})
-                data = self._api("POST", "/v1/remember", json={
-                    "project": self._project,
-                    "user_id": self._user_id,
-                    "content": content,
-                    "memory_type": args.get("memory_type", "fact"),
-                    "importance": float(args.get("importance", 0.5)),
-                })
-                return json.dumps(data)
-
-            elif tool_name == "retaindb_forget":
-                memory_id = args.get("memory_id", "")
-                if not memory_id:
-                    return json.dumps({"error": "memory_id is required"})
-                data = self._api("DELETE", f"/v1/memory/{memory_id}")
-                return json.dumps(data)
-
-            return json.dumps({"error": f"Unknown tool: {tool_name}"})
-        except Exception as e:
-            return json.dumps({"error": str(e)})
-
-    def on_memory_write(self, action: str, target: str, content: str) -> None:
-        if action == "add":
-            try:
-                self._api("POST", "/v1/remember", json={
-                    "project": self._project,
-                    "user_id": self._user_id,
-                    "content": content,
-                    "memory_type": "preference" if target == "user" else "fact",
-                })
-            except Exception as e:
-                logger.debug("RetainDB memory bridge failed: %s", e)
-
-    def shutdown(self) -> None:
-        if self._prefetch_thread and self._prefetch_thread.is_alive():
-            self._prefetch_thread.join(timeout=5.0)
-
-
-def register(ctx) -> None:
-    """Register RetainDB as a memory provider plugin."""
-    ctx.register_memory_provider(RetainDBMemoryProvider())
@@ -1,7 +0,0 @@
-name: retaindb-memory
-version: 1.0.0
-description: >
-  Cross-session memory via RetainDB — durable write-behind queue, semantic
-  search with deduplication, user identity resolution, and profile retrieval.
-requires_env:
-  - RETAINDB_API_KEY
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "hermes-agent"
-version = "0.5.0"
+version = "0.4.0"
 description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -26,7 +26,6 @@ dependencies = [
  # Interactive CLI (prompt_toolkit is used directly by cli.py)
  "prompt_toolkit>=3.0.52,<4",
  # Tools
-  "exa-py>=2.9.0,<3",
  "firecrawl-py>=4.16.0,<5",
  "parallel-web>=0.4.2,<1",
  "fal-client>=0.13.1,<1",
@@ -38,7 +37,7 @@ dependencies = [
 ]

 [project.optional-dependencies]
-modal = ["modal>=1.0.0,<2"]
+modal = ["swe-rex[modal]>=1.4.0,<2"]
 daytona = ["daytona>=0.148.0,<1"]
 dev = ["pytest>=9.0.2,<10", "pytest-asyncio>=1.3.0,<2", "pytest-xdist>=3.0,<4", "mcp>=1.2.0,<2"]
 messaging = ["python-telegram-bot>=22.6,<23", "discord.py[voice]>=2.7.1,<3", "aiohttp>=3.13.3,<4", "slack-bolt>=1.18.0,<2", "slack-sdk>=3.27.0,<4"]
@@ -56,7 +55,7 @@ honcho = ["honcho-ai>=2.0.1,<3"]
 mcp = ["mcp>=1.2.0,<2"]
 homeassistant = ["aiohttp>=3.9.0,<4"]
 sms = ["aiohttp>=3.9.0,<4"]
-acp = ["agent-client-protocol>=0.8.1,<0.9"]
+acp = ["agent-client-protocol>=0.8.1,<1.0"]
 dingtalk = ["dingtalk-stream>=0.1.0,<1"]
 rl = [
  "atroposlib @ git+https://github.com/NousResearch/atropos.git",
@@ -2,7 +2,7 @@
 # Kill all running Modal apps (sandboxes, deployments, etc.)
 #
 # Usage:
-#   bash scripts/kill_modal.sh          # Stop hermes-agent sandboxes
+#   bash scripts/kill_modal.sh          # Stop swe-rex (the sandbox app)
 #   bash scripts/kill_modal.sh --all    # Stop ALL Modal apps

 set -uo pipefail
@@ -17,10 +17,10 @@ if [[ "${1:-}" == "--all" ]]; then
        modal app stop "$app_id" 2>/dev/null || true
    done
 else
-    echo "Stopping hermes-agent sandboxes..."
-    APPS=$(echo "$APP_LIST" | grep 'hermes-agent' | grep -oE 'ap-[A-Za-z0-9]+' || true)
+    echo "Stopping swe-rex sandboxes..."
+    APPS=$(echo "$APP_LIST" | grep 'swe-rex' | grep -oE 'ap-[A-Za-z0-9]+' || true)
    if [[ -z "$APPS" ]]; then
-        echo "  No hermes-agent apps found."
+        echo "  No swe-rex apps found."
    else
        echo "$APPS" | while read app_id; do
            echo "  Stopping $app_id"
@@ -30,5 +30,5 @@ else
 fi

 echo ""
-echo "Current hermes-agent status:"
-modal app list 2>/dev/null | grep -E 'State|hermes-agent' || echo "  (none)"
+echo "Current swe-rex status:"
+modal app list 2>/dev/null | grep -E 'State|swe-rex' || echo "  (none)"
@@ -1,180 +0,0 @@
----
-name: webhook-subscriptions
-description: Create and manage webhook subscriptions for event-driven agent activation. Use when the user wants external services to trigger agent runs automatically.
-version: 1.0.0
-metadata:
-  hermes:
-    tags: [webhook, events, automation, integrations]
----
-
-# Webhook Subscriptions
-
-Create dynamic webhook subscriptions so external services (GitHub, GitLab, Stripe, CI/CD, IoT sensors, monitoring tools) can trigger Hermes agent runs by POSTing events to a URL.
-
-## Setup (Required First)
-
-The webhook platform must be enabled before subscriptions can be created. Check with:
-```bash
-hermes webhook list
-```
-
-If it says "Webhook platform is not enabled", set it up:
-
-### Option 1: Setup wizard
-```bash
-hermes gateway setup
-```
-Follow the prompts to enable webhooks, set the port, and set a global HMAC secret.
-
-### Option 2: Manual config
-Add to `~/.hermes/config.yaml`:
-```yaml
-platforms:
-  webhook:
-    enabled: true
-    extra:
-      host: "0.0.0.0"
-      port: 8644
-      secret: "generate-a-strong-secret-here"
-```
-
-### Option 3: Environment variables
-Add to `~/.hermes/.env`:
-```bash
-WEBHOOK_ENABLED=true
-WEBHOOK_PORT=8644
-WEBHOOK_SECRET=generate-a-strong-secret-here
-```
-
-After configuration, start (or restart) the gateway:
-```bash
-hermes gateway run
-# Or if using systemd:
-systemctl --user restart hermes-gateway
-```
-
-Verify it's running:
-```bash
-curl http://localhost:8644/health
-```
-
-## Commands
-
-All management is via the `hermes webhook` CLI command:
-
-### Create a subscription
-```bash
-hermes webhook subscribe <name> \
-  --prompt "Prompt template with {payload.fields}" \
-  --events "event1,event2" \
-  --description "What this does" \
-  --skills "skill1,skill2" \
-  --deliver telegram \
-  --deliver-chat-id "12345" \
-  --secret "optional-custom-secret"
-```
-
-Returns the webhook URL and HMAC secret. The user configures their service to POST to that URL.
-
-### List subscriptions
-```bash
-hermes webhook list
-```
-
-### Remove a subscription
-```bash
-hermes webhook remove <name>
-```
-
-### Test a subscription
-```bash
-hermes webhook test <name>
-hermes webhook test <name> --payload '{"key": "value"}'
-```
-
-## Prompt Templates
-
-Prompts support `{dot.notation}` for accessing nested payload fields:
-
-- `{issue.title}` — GitHub issue title
-- `{pull_request.user.login}` — PR author
-- `{data.object.amount}` — Stripe payment amount
-- `{sensor.temperature}` — IoT sensor reading
-
-If no prompt is specified, the full JSON payload is dumped into the agent prompt.
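
Reviewer note: the dot-notation substitution described in the removed doc can be sketched as below. This is only an illustration of the technique, not the adapter's actual code; `render_prompt` is a hypothetical name, and leaving unknown placeholders untouched is an assumed behavior.

```python
import re
from typing import Any, Mapping

# Matches {a}, {a.b}, {a.b.c}: identifiers separated by dots.
_FIELD = re.compile(r"\{([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)\}")


def render_prompt(template: str, payload: Mapping[str, Any]) -> str:
    """Replace {dot.path} placeholders with values looked up in payload."""

    def lookup(match: "re.Match[str]") -> str:
        value: Any = payload
        for key in match.group(1).split("."):
            if not isinstance(value, Mapping) or key not in value:
                # Assumption: an unresolved placeholder is left as-is.
                return match.group(0)
            value = value[key]
        return str(value)

    return _FIELD.sub(lookup, template)


if __name__ == "__main__":
    event = {"action": "opened", "issue": {"number": 7, "title": "Crash on start"}}
    print(render_prompt("Issue #{issue.number} {action}: {issue.title}", event))
```

A regex with a callback is used here instead of `str.format` so that literal braces elsewhere in the prompt and missing fields do not raise.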
-
-## Common Patterns
-
-### GitHub: new issues
-```bash
-hermes webhook subscribe github-issues \
-  --events "issues" \
-  --prompt "New GitHub issue #{issue.number}: {issue.title}\n\nAction: {action}\nAuthor: {issue.user.login}\nBody:\n{issue.body}\n\nPlease triage this issue." \
-  --deliver telegram \
-  --deliver-chat-id "-100123456789"
-```
-
-Then in GitHub repo Settings → Webhooks → Add webhook:
-- Payload URL: the returned webhook_url
-- Content type: application/json
-- Secret: the returned secret
-- Events: "Issues"
-
-### GitHub: PR reviews
-```bash
-hermes webhook subscribe github-prs \
-  --events "pull_request" \
-  --prompt "PR #{pull_request.number} {action}: {pull_request.title}\nBy: {pull_request.user.login}\nBranch: {pull_request.head.ref}\n\n{pull_request.body}" \
-  --skills "github-code-review" \
-  --deliver github_comment
-```
-
-### Stripe: payment events
-```bash
-hermes webhook subscribe stripe-payments \
-  --events "payment_intent.succeeded,payment_intent.payment_failed" \
-  --prompt "Payment {data.object.status}: {data.object.amount} cents from {data.object.receipt_email}" \
-  --deliver telegram \
-  --deliver-chat-id "-100123456789"
-```
-
-### CI/CD: build notifications
-```bash
-hermes webhook subscribe ci-builds \
-  --events "pipeline" \
-  --prompt "Build {object_attributes.status} on {project.name} branch {object_attributes.ref}\nCommit: {commit.message}" \
-  --deliver discord \
-  --deliver-chat-id "1234567890"
-```
-
-### Generic monitoring alert
-```bash
-hermes webhook subscribe alerts \
-  --prompt "Alert: {alert.name}\nSeverity: {alert.severity}\nMessage: {alert.message}\n\nPlease investigate and suggest remediation." \
-  --deliver origin
-```
-
-## Security
-
-- Each subscription gets an auto-generated HMAC-SHA256 secret (or provide your own with `--secret`)
-- The webhook adapter validates signatures on every incoming POST
-- Static routes from config.yaml cannot be overwritten by dynamic subscriptions
-- Subscriptions persist to `~/.hermes/webhook_subscriptions.json`
-
-## How It Works
-
-1. `hermes webhook subscribe` writes to `~/.hermes/webhook_subscriptions.json`
-2. The webhook adapter hot-reloads this file on each incoming request (mtime-gated, negligible overhead)
-3. When a POST arrives matching a route, the adapter formats the prompt and triggers an agent run
-4. The agent's response is delivered to the configured target (Telegram, Discord, GitHub comment, etc.)
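
Reviewer note: step 2 above describes an mtime-gated hot reload. A minimal sketch of that pattern follows, assuming the subscriptions file is a single JSON object; `SubscriptionCache` is a hypothetical name and the real adapter may differ.

```python
import json
import os
from typing import Any, Dict, Optional


class SubscriptionCache:
    """Re-read a JSON file only when its modification time changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._mtime_ns: Optional[int] = None
        self._data: Dict[str, Any] = {}

    def get(self) -> Dict[str, Any]:
        try:
            mtime_ns = os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            # No file yet: treat as "no subscriptions" and forget the old mtime.
            self._mtime_ns, self._data = None, {}
            return self._data
        if mtime_ns != self._mtime_ns:
            # File changed since the last call: parse it again.
            with open(self.path, encoding="utf-8") as fh:
                self._data = json.load(fh)
            self._mtime_ns = mtime_ns
        return self._data
```

Calling `get()` on every request costs one `stat` call when nothing has changed, which is why the per-request reload is described as cheap.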
-
-## Troubleshooting
-
-If webhooks aren't working:
-
-1. **Is the gateway running?** Check with `systemctl --user status hermes-gateway` or `ps aux | grep gateway`
-2. **Is the webhook server listening?** `curl http://localhost:8644/health` should return `{"status": "ok"}`
-3. **Check gateway logs:** `grep webhook ~/.hermes/logs/gateway.log | tail -20`
-4. **Signature mismatch?** Verify the secret in your service matches the one from `hermes webhook list`. GitHub sends `X-Hub-Signature-256`, GitLab sends `X-Gitlab-Token`.
-5. **Firewall/NAT?** The webhook URL must be reachable from the service. For local development, use a tunnel (ngrok, cloudflared).
-6. **Wrong event type?** Check `--events` filter matches what the service sends. Use `hermes webhook test <name>` to verify the route works.
@@ -219,9 +219,6 @@ if command -v gh &>/dev/null && gh auth status &>/dev/null; then
  echo "AUTH_METHOD=gh"
 elif [ -n "$GITHUB_TOKEN" ]; then
  echo "AUTH_METHOD=curl"
-elif [ -f ~/.hermes/.env ] && grep -q "^GITHUB_TOKEN=" ~/.hermes/.env; then
-  export GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" ~/.hermes/.env | head -1 | cut -d= -f2 | tr -d '\n\r')
-  echo "AUTH_METHOD=curl"
 elif grep -q "github.com" ~/.git-credentials 2>/dev/null; then
  export GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
  echo "AUTH_METHOD=curl"
@@ -23,11 +23,6 @@ if command -v gh &>/dev/null && gh auth status &>/dev/null 2>&1; then
    GH_USER=$(gh api user --jq '.login' 2>/dev/null)
 elif [ -n "$GITHUB_TOKEN" ]; then
    GH_AUTH_METHOD="curl"
-elif [ -f "$HOME/.hermes/.env" ] && grep -q "^GITHUB_TOKEN=" "$HOME/.hermes/.env" 2>/dev/null; then
-    GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" "$HOME/.hermes/.env" | head -1 | cut -d= -f2 | tr -d '\n\r')
-    if [ -n "$GITHUB_TOKEN" ]; then
-        GH_AUTH_METHOD="curl"
-    fi
 elif [ -f "$HOME/.git-credentials" ] && grep -q "github.com" "$HOME/.git-credentials" 2>/dev/null; then
    GITHUB_TOKEN=$(grep "github.com" "$HOME/.git-credentials" | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
    if [ -n "$GITHUB_TOKEN" ]; then
@@ -1 +0,0 @@
-"""Built-in gateway hooks that are always registered."""
@@ -1 +0,0 @@
-Communication and decision-making frameworks — structured response formats for proposals, trade-off analysis, and stakeholder-ready recommendations.