feat(kanban): default_assignee fallback + per-profile concurrency cap (#27145, #21582) (#34244)

Two related dispatcher behaviors that have been missing for a while.

## kanban.default_assignee (#27145)

Reporter (@agarzon): dashboard creates a task without an assignee, task parks in 'ready' forever even though the operator's intent ('default') is perfectly clear. The dispatcher already had a 'skipped_unassigned' bucket but no fallback routing, so users had to manually type 'default' in the assignee field every time.

Behavior: when 'kanban.default_assignee' is set in config.yaml, the dispatcher applies that assignee to any unassigned ready task before deciding whether to spawn. The row is mutated (assignee column + an 'assigned' event with source='kanban.default_assignee' for the audit trail). Empty/whitespace config value = no fallback, preserving the existing skipped_unassigned behavior.

Dry-run mode reports what WOULD happen via the new 'auto_assigned_default' bucket on DispatchResult, but does NOT mutate the DB, so operators using 'hermes kanban dispatch --dry-run' see the routing decision before committing.

## kanban.max_in_progress_per_profile (#21582)

Reporter (@edwardchenchen, @simlu, 4 reactions): fan-out workloads saturate one profile's local model / API quota / browser pool while other profiles sit idle. The existing global 'max_in_progress' caps total workers but doesn't balance across profiles.

Behavior: when 'kanban.max_in_progress_per_profile' is set to a positive int, the dispatcher tracks per-assignee running counts (one query at tick start) and refuses to spawn for any assignee already at the cap. Tasks blocked this way go to a new 'skipped_per_profile_capped' bucket on DispatchResult as (task_id, assignee, current_running_count) tuples. This is NOT an operator-actionable failure, just 'try again next tick when the profile has capacity'. Pre-existing 'running' tasks count against the cap (verified via regression test).

The cap respects dry_run mode by incrementing its in-memory counter on each would-be spawn so dry_run reports the same balanced subset that a real tick would. Invalid cap values (0, negative, non-int, None) are treated as 'no cap', preserving the existing behavior. Backward-compatible for installs that don't set the config.

## Surfaces

- 'hermes kanban dispatch' CLI now prints 'Auto-assigned to kanban.default_assignee=X: ...' and 'Deferred (X at per-profile cap, N running): ...' lines, plus matching JSON keys in --json output.
- Gateway dispatcher logs the configured values at startup ('default_assignee=X', 'max_in_progress_per_profile=N').
- 'kanban.max_in_progress_per_profile' added to DEFAULT_CONFIG with inline docs.

## Validation

- tests/hermes_cli/test_kanban_default_assignee.py (6 cases): no-cap baseline, auto-assign + DB mutation, dry-run reports without mutating, whitespace treated as None, explicit assignees untouched, DispatchResult field schema.
- tests/hermes_cli/test_kanban_per_profile_cap.py (9 cases including 4 parametrized): no-cap baseline, balanced 2-profile fan-out, pre-existing running counts against cap, invalid cap values (0/-1/'abc'/None), capped tasks dispatched on next tick after running task completes, DispatchResult field schema.
- Broader kanban suite: 464/464 pass (was 449 baseline; +15 new regression tests across both features).

## Credit

- #27145: Jimmy Johansson reported the dispatcher skipped-unassigned gap; @agarzon scoped the simpler 'honor kanban.default_assignee' fix that matches the existing config knob.
- #21582: @edwardchenchen filed the per-profile cap ask after hitting model 429s on fan-out research projects; @simlu confirmed the same pain on local-model setups.
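
The two dispatcher behaviors described in this commit message can be sketched in a few lines. This is a minimal illustration written from the description only, not the actual dispatcher code: `Task`, `normalize_cap`, `dispatch_tick`, and the `spawned` bucket are hypothetical names, and storing bare task ids in `auto_assigned_default` is an assumption. Only the bucket names `skipped_unassigned`, `auto_assigned_default`, and `skipped_per_profile_capped`, and the tuple shape of the last one, come from the message.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:  # hypothetical stand-in for a kanban row
    id: str
    assignee: Optional[str] = None
    status: str = "ready"


@dataclass
class DispatchResult:  # bucket names from the commit message; 'spawned' is assumed
    spawned: list = field(default_factory=list)
    skipped_unassigned: list = field(default_factory=list)
    auto_assigned_default: list = field(default_factory=list)
    skipped_per_profile_capped: list = field(default_factory=list)


def normalize_cap(value):
    """Return a positive int, or None ('no cap') for 0, negatives, non-ints, None."""
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        return None
    return value


def dispatch_tick(tasks, default_assignee=None, per_profile_cap=None, dry_run=False):
    result = DispatchResult()
    # Empty or whitespace-only config value means "no fallback".
    fallback = (default_assignee or "").strip() or None
    cap = normalize_cap(per_profile_cap)
    # Counted once at tick start, so pre-existing running tasks use up the cap.
    running = Counter(t.assignee for t in tasks if t.status == "running")

    for task in tasks:
        if task.status != "ready":
            continue
        assignee = task.assignee
        if not assignee:
            if fallback is None:
                result.skipped_unassigned.append(task.id)
                continue
            assignee = fallback
            result.auto_assigned_default.append(task.id)
            if not dry_run:
                # The real dispatcher also records an 'assigned' audit event here.
                task.assignee = assignee
        if cap is not None and running[assignee] >= cap:
            result.skipped_per_profile_capped.append((task.id, assignee, running[assignee]))
            continue
        # Incremented in dry-run too, so the reported subset matches a real tick.
        running[assignee] += 1
        result.spawned.append(task.id)
        if not dry_run:
            task.status = "running"
    return result
```

In this sketch a dry run over one unassigned task and two ready tasks for a profile that already has one running task, with a cap of 2, would report the unassigned task as auto-assigned, spawn one of the two ready tasks, and defer the other, all without touching the task objects.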
docs(docker): refresh user-guide page for s6-overlay reality
2026-05-28 19:02:55 -07:00 · 2026-05-29 11:55:01 +10:00 · 2026-05-29 11:49:54 +10:00
396 changed files with 12700 additions and 72040 deletions
@@ -196,10 +196,26 @@ jobs:
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f  # v3

-      # Build once, load into the local daemon for smoke testing.  Cached
-      # to gha with a per-arch scope; the push step below reuses every
-      # layer from this build.
-      - name: Build image (arm64, smoke test)
+      # Build once, load into the local daemon for smoke testing. PR arm64
+      # builds deliberately avoid the gha cache: cold-cache arm64 builds can
+      # outlive GitHub's short-lived Azure cache SAS token, then fail while
+      # reading or writing cache blobs before the smoke test can run.
+      - name: Build image (arm64, smoke test, uncached PR)
+        if: github.event_name == 'pull_request'
+        uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f  # v7.1.0
+        with:
+          context: .
+          file: Dockerfile
+          load: true
+          platforms: linux/arm64
+          tags: ${{ env.IMAGE_NAME }}:test
+          build-args: |
+            HERMES_GIT_SHA=${{ github.sha }}
+
+      # Main/release builds still use the per-arch gha cache so the digest
+      # push below can reuse layers from this smoke-test build.
+      - name: Build image (arm64, smoke test, cached publish)
+        if: github.event_name != 'pull_request'
        uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f  # v7.1.0
        with:
          context: .
@@ -200,22 +200,3 @@ jobs:

      - name: Run footgun checker
        run: python scripts/check-windows-footguns.py --all
-
-  plugin-isolation:
-    # Enforce that core code and core tests never import from plugin packages.
-    # Core must interact with plugins exclusively through the registry layer.
-    # See scripts/check_no_plugin_imports_in_core.py for the rule list.
-    name: Plugin isolation (blocking)
-    runs-on: ubuntu-latest
-    timeout-minutes: 5
-    steps:
-      - name: Checkout code
-        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
-
-      - name: Set up Python
-        uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v5
-        with:
-          python-version: "3.11"
-
-      - name: Run plugin isolation checker
-        run: python scripts/check_no_plugin_imports_in_core.py
@@ -48,7 +48,7 @@ agent-browser/
 privvy*
 images/
 __pycache__/
-*.egg-info
+hermes_agent.egg-info/
 wandb/
 testlogs

@@ -87,7 +87,8 @@ website/static/api/skills-meta.json
 models-dev-upstream/
 hermes_cli/tui_dist/*
 hermes_cli/scripts/
-docs/superpowers/*# Working directory for the Hermes Agent's session state (~/.hermes/ at runtime;
+docs/superpowers/*
+# Working directory for the Hermes Agent's session state (~/.hermes/ at runtime;
 # also created in-repo when an agent operates in this checkout). Plans, audit
 # logs, and per-session caches are never artifacts of the codebase.
 .hermes/
@@ -29,9 +29,7 @@ hermes-agent/
 ├── hermes_constants.py   # get_hermes_home(), display_hermes_home() — profile-aware paths
 ├── hermes_logging.py     # setup_logging() — agent.log / errors.log / gateway.log (profile-aware)
 ├── batch_runner.py       # Parallel batch processing
-├── _build_backend.py     # Custom PEP 517 build backend — inlines plugin deps at wheel build time
 ├── agent/                # Agent internals (provider adapters, memory, caching, compression, etc.)
-│   └── plugin_registries.py  # Typed capability registries (auth, transport, platform, tool, model_metadata)
 ├── hermes_cli/           # CLI subcommands, setup wizard, plugins loader, skin engine
 ├── tools/                # Tool implementations — auto-discovered via tools/registry.py
 │   └── environments/     # Terminal backends (local, docker, ssh, modal, daytona, singularity)
@@ -41,20 +39,16 @@ hermes-agent/
 │   │                     #   dingtalk, wecom, weixin, feishu, qqbot, bluebubbles,
 │   │                     #   yuanbao, webhook, api_server, ...). See ADDING_A_PLATFORM.md.
 │   └── builtin_hooks/    # Extension point for always-registered gateway hooks (none shipped)
-├── plugins/              # Plugin packages — uv workspace members (see "Plugins" section)
-│   ├── model-providers/  # anthropic, bedrock, azure-foundry (own pyproject.toml each)
-│   ├── platforms/        # telegram, slack, discord, feishu, dingtalk, matrix
-│   ├── tts/              # Text-to-speech plugin
-│   ├── stt/              # Speech-to-text plugin
-│   ├── image_gen/        # FAL image generation
-│   ├── terminals/        # daytona, modal, vercel
-│   ├── web/              # exa, firecrawl, parallel
+├── plugins/              # Plugin system (see "Plugins" section below)
 │   ├── memory/           # Memory-provider plugins (honcho, mem0, supermemory, ...)
 │   ├── context_engine/   # Context-engine plugins
+│   ├── model-providers/  # Inference backend plugins (openrouter, anthropic, gmi, ...)
 │   ├── kanban/           # Multi-agent board dispatcher + worker plugin
 │   ├── hermes-achievements/  # Gamified achievement tracking
 │   ├── observability/    # Metrics / traces / logs plugin
-│   └── <others>/         # dashboard, google_meet, spotify, strike-freedom-cockpit, ...
+│   ├── image_gen/        # Image-generation providers
+│   └── <others>/         # disk-cleanup, example-dashboard, google_meet, platforms,
+│                         #   spotify, strike-freedom-cockpit, ...
 ├── optional-skills/      # Heavier/niche skills shipped but NOT active by default
 ├── skills/               # Built-in skills bundled with the repo
 ├── ui-tui/               # Ink (React) terminal UI — `hermes --tui`
@@ -492,102 +486,9 @@ Activate with `/skin cyberpunk` or `display.skin: cyberpunk` in config.yaml.

 ## Plugins

-Hermes uses a **plugin-first architecture**: every optional capability (model
-providers, platform adapters, TTS/STT, terminal backends, image generation)
-lives in its own installable Python package under `plugins/`. The core
-codebase (`agent/`, `hermes_cli/`, `gateway/`, `tools/`) **never** imports
-from a `hermes_agent_*` plugin package directly. Instead, plugins register
-their capabilities into typed registries during `register()`, and the core
-queries those registries at runtime.
-
-Full architecture doc: `website/docs/developer-guide/plugin-architecture.md`
-
-### Workspace layout
-
-All 21 builtin plugins are uv workspace members — each has its own
-`pyproject.toml` (single source of truth for deps), `plugin.yaml`
-(directory-scanner manifest for dev mode), and `hermes_agent_<name>/` package
-with `register(ctx)`:
-
-```
-plugins/
-├── model-providers/        # anthropic, bedrock, azure-foundry
-├── platforms/              # telegram, slack, discord, feishu, dingtalk, matrix
-├── tts/                    # text-to-speech (Edge TTS + ElevenLabs)
-├── stt/                    # speech-to-text
-├── image_gen/fal_pkg/      # FAL image generation
-├── terminals/              # daytona, modal, vercel
-├── web/                    # exa, firecrawl, parallel
-├── memory/                 # honcho, hindsight
-├── dashboard/              # streamlit dashboard
-└── hermes-achievements/    # gamified achievement tracking
-```
-
-### The hermetic core boundary
-
-Core code MUST NOT import from `hermes_agent_*` packages. Instead it queries
-typed registries in `agent/plugin_registries.py`:
-
-```python
-# ❌ BAD — core directly imports plugin
-from hermes_agent_bedrock import has_aws_credentials
-
-# ✅ GOOD — core queries the registry
-from agent.plugin_registries import registries
-bedrock_auth = registries.get_auth_provider("bedrock")
-```
-
-Registry types: `auth_providers`, `transport_builders`, `platform_adapters`,
-`tool_providers`, `model_metadata`, `credential_pools`.
-
-Each plugin's `register(ctx)` populates the registries via `ctx.register_*()`:
- `ctx.register_auth_provider(name, provider, ...)`
- `ctx.register_transport(name, builder, ...)`
- `ctx.register_platform(name, label, adapter_factory, check_fn, ...)`
- `ctx.register_tool_provider(entry, ...)`
- `ctx.register_model_metadata(entry, ...)`
- `ctx.register_credential_pool(entry, ...)`
- Plus existing: `register_tool()`, `register_hook()`, `register_cli_command()`,
-  `register_tts_provider()`, `register_transcription_provider()`,
-  `register_image_gen_provider()`, `register_video_gen_provider()`,
-  `register_context_engine()`
-
-### Plugin discovery
-
-Three discovery paths (same as before, now workspace-aware):
-1. **Directory scanner** — `plugins/`, `~/.hermes/plugins/`, `.hermes/plugins/`
-   (looks for `plugin.yaml`)
-2. **Entry points** — `[project.entry-points."hermes_agent.plugins"]`
-3. **uv workspace** — `uv sync --extra <name>` installs the plugin into venv
-
-### Dependency management
-
- Each plugin's `pyproject.toml` is the **only** place its deps are declared
- Root `pyproject.toml` maps extras to workspace members:
-  `telegram = ["hermes-agent-telegram"]`
- `uv.lock` resolves the whole workspace (236 packages)
- No `LAZY_DEPS`, no `ensure()`, no runtime `pip install`
- Custom PEP 517 build backend (`_build_backend.py`) inlines plugin deps
-  at wheel build time for PyPI publishing
-
-### NixOS
-
-`loadWorkspace` discovers all workspace members from `uv.lock` automatically.
-`mkVirtualEnv { hermes-agent = ["all"] }` installs all plugins. Select specific
-plugins with `extraDependencyGroups = ["telegram", "anthropic"]`.
-
-### Tests
-
-Plugin tests live in `plugins/<category>/<name>/tests/`. The test runner
-discovers both `tests/` and `plugins/`. Running plugin tests requires the
-plugin to be installed (`uv sync --extra <name>`).
-
-### The rule
-
-**If it can be a plugin, it must be a plugin.** Adding optional capabilities
-to core files is a code review rejection. If the plugin surface doesn't
-support what you need, extend the surface (new registry type, new hook, new
-`ctx` method) — don't inline the capability.
+Hermes has two plugin surfaces. Both live under `plugins/` in the repo so
+repo-shipped plugins can be discovered alongside user-installed ones in
+`~/.hermes/plugins/` and pip-installed entry points.

 ### General plugins (`hermes_cli/plugins.py` + `plugins/<name>/`)

@@ -630,14 +531,9 @@ providers don't clutter `hermes --help`.
 **Rule (Teknium, May 2026):** plugins MUST NOT modify core files
 (`run_agent.py`, `cli.py`, `gateway/run.py`, `hermes_cli/main.py`, etc.).
 If a plugin needs a capability the framework doesn't expose, expand the
-generic plugin surface (new hook, new ctx method, new registry type) — never
-hardcode plugin-specific logic into core. PR #5295 removed 95 lines of
-hardcoded honcho argparse from `main.py` for exactly this reason.
-
-**Hermetic core boundary (May 2026):** core code (`agent/`, `hermes_cli/`,
-`gateway/`, `tools/`) MUST NOT import from `hermes_agent_*` plugin packages.
-Use the typed registries in `agent/plugin_registries.py` instead. See the
-**Plugins** section below for the full list of registry types.
+generic plugin surface (new hook, new ctx method) — never hardcode
+plugin-specific logic into core. PR #5295 removed 95 lines of hardcoded
+honcho argparse from `main.py` for exactly this reason.

 **No new in-tree memory providers (policy, May 2026):** the set of
 built-in memory providers under `plugins/memory/` is closed. New memory
@@ -1115,41 +1011,40 @@ def profile_env(tmp_path, monkeypatch):

 ## Testing

-**ALWAYS use `scripts/run_tests.sh`** — do NOT call `pytest` directly on a directory.
-The script enforces hermetic environment parity with CI and provides per-file
-process isolation that prevents registry singleton / module-level state leakage
-between test files.
+**ALWAYS use `scripts/run_tests.sh`** — do not call `pytest` directly. The script enforces
+hermetic environment parity with CI (unset credential vars, TZ=UTC, LANG=C.UTF-8,
+`-n auto` xdist workers, in-tree subprocess-isolation plugin). Direct `pytest`
+on a 16+ core developer machine with API keys set diverges from CI in ways
+that have caused multiple "works locally, fails in CI" incidents (and the reverse).

 ```bash
 scripts/run_tests.sh                                  # full suite, CI-parity
 scripts/run_tests.sh tests/gateway/                   # one directory
-scripts/run_tests.sh tests/agent/test_foo.py          # one file
 scripts/run_tests.sh tests/agent/test_foo.py::test_x  # one test
 scripts/run_tests.sh -v --tb=long                     # pass-through pytest flags
+scripts/run_tests.sh --no-isolate tests/foo/          # disable subprocess isolation (faster, for debugging)
 ```

-For a **single test file or specific test**, bare `pytest` is fine:
+### Subprocess-per-test isolation

-```bash
-nix run nixpkgs#uv -- run python -m pytest tests/agent/test_foo.py -q
-nix run nixpkgs#uv -- run python -m pytest tests/agent/test_foo.py::test_x --tb=short
-```
+Every test runs in a freshly-spawned Python subprocess via the in-tree plugin
+at `tests/_isolate_plugin.py`. This means module-level dicts/sets and
+ContextVars from one test cannot leak into the next — the historic
+`_reset_module_state` autouse fixture is gone.

-Running bare `pytest` on a directory (e.g. `pytest tests/`) will print a warning
-from `conftest.py` telling you to use the script instead.
+Implementation notes:

-### Per-file process isolation
-
-`scripts/run_tests.sh` calls `scripts/run_tests_parallel.py`, which spawns one
-`python -m pytest <file>` subprocess per test **file** (not per test), giving each
-a fresh Python interpreter. This means module-level dicts/sets, ContextVars, and
-registry singletons from one test file cannot leak into another — no shared state
-between files, no xdist required.
-
-`HERMES_PARALLEL_RUNNER=1` is set in each subprocess so `conftest.py` knows tests
-are running under the managed runner. If you need to suppress the bare-pytest
-directory warning in a special case, set this variable yourself — but prefer the
-script.
+- The plugin uses `multiprocessing.get_context("spawn")`, which works on
+  Linux, macOS, and Windows alike (POSIX `fork` is not used).
+- Per-test overhead is ~0.5–1.0s (Python startup + pytest collection). xdist
+  parallelism amortizes this across cores; on a 20-core box the full suite
+  finishes in roughly the same wall time as before, but flake-free.
+- `isolate_timeout` (configured in `pyproject.toml`) caps each test at 30s.
+  Hangs are killed and surfaced as a failure report.
+- Pass `--no-isolate` to disable isolation — useful when debugging a single
+  test interactively, or when you specifically want to verify state leakage.
+- The plugin disables itself in child processes (sentinel envvar
+  `HERMES_ISOLATE_CHILD=1`), so there's no fork-bomb risk.

 ### Why the wrapper (and why the old "just call pytest" doesn't work)

@@ -1161,13 +1056,31 @@ Five real sources of local-vs-CI drift the script closes:
 | HOME / `~/.hermes/` | Your real config+auth.json | Temp dir per test |
 | Timezone | Local TZ (PDT etc.) | UTC |
 | Locale | Whatever is set | C.UTF-8 |
-| File isolation | Shared interpreter — state leaks between files | One subprocess per file |
+| xdist workers | `-n auto` = all cores | `-n auto` (safe — subprocess isolation prevents cross-worker flakes) |

-`tests/conftest.py` also enforces the credential/TZ/locale points as an autouse
-fixture so ANY pytest invocation (including IDE integrations) gets hermetic
-behavior — but the wrapper adds per-file process isolation on top.
+`tests/conftest.py` also enforces points 1-4 as an autouse fixture so ANY pytest
+invocation (including IDE integrations) gets hermetic behavior — but the wrapper
+is belt-and-suspenders.

-Always run the full suite via `scripts/run_tests.sh` before pushing changes.
+### Running without the wrapper (only if you must)
+
+If you can't use the wrapper (e.g. inside an IDE that shells pytest directly),
+at minimum activate the venv. The isolation plugin loads automatically from
+`addopts` in `pyproject.toml`, so you get the same per-test process isolation
+either way.
+
+```bash
+source .venv/bin/activate   # or: source venv/bin/activate
+python -m pytest tests/ -q
+```
+
+If you need to bypass isolation for fast feedback while debugging:
+
+```bash
+python -m pytest tests/agent/test_foo.py -q --no-isolate
+```
+
+Always run the full suite before pushing changes.

 ### Don't write change-detector tests

@@ -121,11 +121,12 @@ hermes chat -q "Hello"
 ### Run tests

 ```bash
-# Preferred — matches CI (hermetic env, per-file process isolation); see AGENTS.md
+# Preferred — matches CI (hermetic env, 4 xdist workers); see AGENTS.md
 scripts/run_tests.sh

-# For a single file or specific test, bare pytest is also fine:
-# python -m pytest tests/agent/test_foo.py -q
+# Alternative (activate the venv first). The wrapper is still recommended
+# for parity with GitHub Actions before you open a PR:
+pytest tests/ -v
 ```

 ---
@@ -856,7 +857,7 @@ refactor/description   # Code restructuring

 ### Before submitting

-1. **Run tests**: `scripts/run_tests.sh` (recommended; same as CI — hermetic env + per-file process isolation)
+1. **Run tests**: `scripts/run_tests.sh` (recommended; same as CI) or `pytest tests/ -v` with the project venv activated
 2. **Test manually**: Run `hermes` and exercise the code path you changed
 3. **Check cross-platform impact**: If you touch file I/O, process management, or terminal handling, consider macOS, Linux, and WSL2
 4. **Keep PRs focused**: One logical change per PR. Don't mix a bug fix with a refactor with a new feature.
@@ -179,7 +179,7 @@ curl -LsSf https://astral.sh/uv/install.sh | sh
 uv venv venv --python 3.11
 source venv/bin/activate
 uv pip install -e ".[all,dev]"
-scripts/run_tests.sh
+python -m pytest tests/ -q
 ```

 ---
@@ -0,0 +1,110 @@
+# Hermes Agent v0.15.1 (v2026.5.29)
+
+**Release Date:** May 29, 2026
+**Since v0.15.0:** 28 commits · 21 merged PRs · hotfix release · 9 contributors
+
+> **The Patch Release.** A same-day hotfix for v0.15.0. Headline fix: the dashboard infinite-reload loop that hit anyone running v0.15.0 in loopback mode (Docker, hosted Hermes, fresh installs). A handful of other v0.15.0 follow-ups go along for the ride — kanban worker SIGTERM, `/model` picker unification, `/yolo` session bypass, the full 19,932-entry skills.sh catalog, `.md` media delivery restoration, gateway probe-stepdown safety, web-URL redaction passthrough, kanban worker vision on referenced images, hindsight observation-default. Docker users get an explicit `--insecure` opt-in env var (no more bind-host inference), MCP server bare-command PATH resolution, and arm64 PR-build cache fixes.
+
+---
+
+## ✨ Highlights
+
+- **Dashboard 401 reload loop fixed** — In loopback mode the dashboard's identity probe (`/api/auth/me`) returns 401 by design, but v0.15.0's stale-token reload guard treated every 401 as a rotated session token and full-page-reloaded to pick up a fresh one. Every successful sibling call cleared the one-shot reload guard, so the page reload-looped forever (Firefox: "Navigated to /sessions" storm; Chrome: React re-render storm). Fix adds an `allowUnauthorized` opt-out to `fetchJSON` that skips only the loopback stale-token reload — 401 still throws so `AuthWidget` swallows it, gated-mode `login_url` redirects are unaffected. Closes [#34206](https://github.com/NousResearch/hermes-agent/issues/34206), [#34202](https://github.com/NousResearch/hermes-agent/issues/34202). ([#30698](https://github.com/NousResearch/hermes-agent/pull/30698) — @austinpickett)
+
+- **Docker dashboard `--insecure` is now an explicit env opt-in, never derived from bind host** — Previously the Docker entrypoint inferred `--insecure` when the dashboard bound to a non-loopback host. That conflated "I want LAN access" with "I want to disable the same-origin guard." The fix splits them: bind host is bind host, and disabling the dashboard's loopback auth requires an explicit `HERMES_DASHBOARD_INSECURE=1`. Existing setups that genuinely wanted insecure binding must now set the env var. ([#34188](https://github.com/NousResearch/hermes-agent/pull/34188), [#34204](https://github.com/NousResearch/hermes-agent/pull/34204) — @benbarclay)
+
+- **MCP bare command resolution under Docker** — MCP servers configured with bare commands (`npx`, `npm`, `node`) now resolve against `/usr/local/bin` so they actually launch inside the Docker image where those binaries live. v0.15.0 left these failing silently in containers when the agent's effective PATH didn't include the Node toolchain location. ([#34186](https://github.com/NousResearch/hermes-agent/pull/34186) — @benbarclay)
+
+- **Skills page sidebar / source pills restored** — A stale `useMemo` dependency in the new dashboard skills page collapsed the source pills and category sidebar to "All" only. Fixed; both surfaces now reflect the live catalog state. ([#34194](https://github.com/NousResearch/hermes-agent/pull/34194))
+
+- **Kanban worker can be killed again** — `SIGTERM` on a kanban worker was being absorbed by an intermediate process and the worker stayed running. Closes [#28181](https://github.com/NousResearch/hermes-agent/issues/28181). ([#34045](https://github.com/NousResearch/hermes-agent/pull/34045))
+
+- **Full skills.sh catalog (858 → 19,932 entries)** — The skills hub page was pulling a partial paginated catalog. The fetch now walks the sitemap, so all 19,932 skills.sh entries surface in the picker instead of just the first 858. ([#34025](https://github.com/NousResearch/hermes-agent/pull/34025))
+
+---
+
+## 🐛 Bug Fixes
+
+### Dashboard / Web
+
+- **`/api/auth/me` 401 no longer triggers reload loop** in loopback mode — ([#30698](https://github.com/NousResearch/hermes-agent/pull/30698) — @austinpickett)
+- **Skills page source pills + category sidebar restored** — stale `useMemo` dep ([#34194](https://github.com/NousResearch/hermes-agent/pull/34194))
+
+### Docker
+
+- **`--insecure` is now explicit opt-in via env var**, not derived from bind host ([#34188](https://github.com/NousResearch/hermes-agent/pull/34188) — @benbarclay)
+- **Dashboard test suite repaired** to match the insecure-opt-in fix ([#34204](https://github.com/NousResearch/hermes-agent/pull/34204) — @benbarclay)
+- **arm64 PR builds skip the GHA cache** to avoid cache-thrash on cross-arch builders ([#33704](https://github.com/NousResearch/hermes-agent/pull/33704) — @BROCCOLO1D)
+
+### MCP
+
+- **Bare `npx`/`npm`/`node` resolve against `/usr/local/bin`** for Docker compatibility ([#34186](https://github.com/NousResearch/hermes-agent/pull/34186) — @benbarclay)
+
+### Kanban
+
+- **Worker SIGTERM actually terminates the process** ([#34045](https://github.com/NousResearch/hermes-agent/pull/34045))
+- **Workers receive images referenced in task bodies** for vision-capable models ([#34210](https://github.com/NousResearch/hermes-agent/pull/34210))
+
+### Gateway
+
+- **`.md` files deliver again** — media-delivery validation defaults to denylist-only instead of an overly-narrow allowlist ([#34022](https://github.com/NousResearch/hermes-agent/pull/34022))
+- **Probe stepdown safety** — on a context-overflow without an explicit provider context limit, the agent no longer steps down to a smaller model based on an unknown ceiling (salvage of [#33673](https://github.com/NousResearch/hermes-agent/pull/33673)) ([#33826](https://github.com/NousResearch/hermes-agent/pull/33826))
+
+### CLI
+
+- **`/yolo` mid-session enables the per-session bypass** instead of just toggling the env var (which the running agent had already snapshotted) ([#33931](https://github.com/NousResearch/hermes-agent/pull/33931) — @kshitijk4poor)
+- **`/model` and `hermes model` show the same list**, plus disk cache for picker startup ([#33867](https://github.com/NousResearch/hermes-agent/pull/33867))
+
+### Skills
+
+- **Full skills.sh catalog via sitemap** — 858 → 19,932 entries ([#34025](https://github.com/NousResearch/hermes-agent/pull/34025))
+
+### Redaction
+
+- **Web URLs pass through unchanged** — the redactor was eating query parameters that looked credential-shaped ([#34029](https://github.com/NousResearch/hermes-agent/pull/34029))
+
+---
+
+## ✨ Small Features
+
+- **Hindsight default narrowed to observation-only** for `recall_types` — tool path is also narrowed ([#34079](https://github.com/NousResearch/hermes-agent/pull/34079) — @nicoloboschi, follow-up [#34091](https://github.com/NousResearch/hermes-agent/pull/4df62d239e38bf8c212a595721c9c01e176f6c3a) — @kshitijk4poor)
+- **Memory providers receive completed-turn message context** — salvage of [#28065](https://github.com/NousResearch/hermes-agent/pull/28065) ([#34097](https://github.com/NousResearch/hermes-agent/pull/34097) — @kshitijk4poor, credit to @devwdave)
+
+---
+
+## 📚 Documentation
+
+- **`--no-supervise` / `HERMES_GATEWAY_NO_SUPERVISE` documented** in the reference docs (follow-up to [#33583](https://github.com/NousResearch/hermes-agent/pull/33583)) ([#33751](https://github.com/NousResearch/hermes-agent/pull/33751) — @r266-tech)
+
+---
+
+## 🛠️ Infrastructure
+
+- **Vercel deploy workflow accepts `workflow_dispatch`** so docs deploys can be manually triggered ([#34081](https://github.com/NousResearch/hermes-agent/pull/34081))
+- **`@nous-research/ui` bumped to 0.18.2** (Nix `npmDepsHash` also updated to match) ([#34193](https://github.com/NousResearch/hermes-agent/pull/34193) follow-ups — @austinpickett)
+
+---
+
+## 👥 Contributors
+
+### Core
+- @teknium1
+
+### Community
+- @austinpickett — dashboard 401 reload-loop fix (the headline), `@nous-research/ui` bump, Nix `npmDepsHash` updates
+- @benbarclay — Docker `--insecure` opt-in, MCP bare-command resolution, dashboard test repair
+- @kshitijk4poor — `/yolo` session bypass, completed-turn memory context salvage, hindsight follow-up docs
+- @nicoloboschi — hindsight `recall_types` observation default
+- @BROCCOLO1D — arm64 PR build cache fix
+- @r266-tech — `--no-supervise` reference docs
+- @yangguangjin — probe stepdown safety (salvage of @yanghd's #33673)
+- @devwdave — completed-turn memory context (credited via salvage)
+- @andrewhosf — co-author
+
+### Issue Reporters (the 401 loop)
+- @routesmith ([#34206](https://github.com/NousResearch/hermes-agent/issues/34206))
+- @beeaton ([#34202](https://github.com/NousResearch/hermes-agent/issues/34202))
+
+---
+
+**Full Changelog**: [v2026.5.28...v2026.5.29](https://github.com/NousResearch/hermes-agent/compare/v2026.5.28...v2026.5.29)
@@ -1,183 +0,0 @@
-"""Custom PEP 517 build backend for hermes-agent.
-
-At wheel build time, rewrites [project.optional-dependencies] so that
-plugin extras (e.g. ``anthropic = ["hermes-agent-anthropic"]``) are
-inlined with the actual deps from each plugin's pyproject.toml.
-
-In the source repo (and on Nix), uv resolves workspace members natively
-so this backend is NOT used — it's only invoked when building a wheel
-for PyPI publication.
-
-Usage in pyproject.toml::
-
-    [build-system]
-    requires = ["setuptools>=61.0"]
-    build-backend = "_build_backend"
-    backend-path = ["."]
-
-How it works:
-1.  ``build_wheel`` intercepts the call before setuptools sees pyproject.toml.
-2.  It reads the workspace member dirs from [tool.uv.workspace].members.
-3.  For each member, it reads the member's pyproject.toml and extracts
-    ``project.dependencies`` (excluding the ``hermes-agent`` base dep).
-4.  It rewrites the main pyproject.toml's optional-dependencies to inline
-    those deps instead of the workspace member references.
-5.  It writes a temporary pyproject.toml, delegates to
-    ``setuptools.build_meta.build_wheel``, then restores the original.
-"""
-
-from __future__ import annotations
-
-import os
-import shutil
-import tempfile
-from pathlib import Path
-from typing import Any
-
-import tomllib
-
-# The original setuptools backend we delegate to.
-_BACKEND = "setuptools.build_meta"
-
-
-def _load_pyproject(path: Path) -> dict:
-    with path.open("rb") as f:
-        return tomllib.load(f)
-
-
-def _save_pyproject(path: Path, data: dict) -> None:
-    """Write a pyproject.toml. Uses a simple serializer since we only
-    need to preserve the structure enough for setuptools to parse."""
-    import tomli_w
-    with path.open("wb") as f:
-        tomli_w.dump(data, f)
-
-
-def _inline_plugin_deps(root: Path, data: dict) -> dict:
-    """Rewrite optional-dependencies to inline plugin deps.
-
-    Maps each plugin extra (e.g. ``anthropic = ["hermes-agent-anthropic"]``)
-    to the actual deps from that plugin's pyproject.toml, minus the
-    ``hermes-agent`` base dependency.
-    """
-    opt_deps = data.get("project", {}).get("optional-dependencies", {})
-    workspace = data.get("tool", {}).get("uv", {}).get("workspace", {})
-    members = workspace.get("members", [])
-
-    # Build a map: package name → (member_dir, pyproject_data)
-    pkg_to_deps: dict[str, list[str]] = {}
-    for member_glob in members:
-        for member_dir in sorted(root.glob(member_glob)):
-            pptoml = member_dir / "pyproject.toml"
-            if not pptoml.exists():
-                continue
-            member_data = _load_pyproject(pptoml)
-            pkg_name = member_data.get("project", {}).get("name", "")
-            if not pkg_name:
-                continue
-            # Extract deps, excluding the base hermes-agent dependency
-            raw_deps = member_data.get("project", {}).get("dependencies", [])
-            filtered = [
-                d for d in raw_deps
-                if not d.replace(" ", "").startswith("hermes-agent")
-            ]
-            pkg_to_deps[pkg_name] = filtered
-
-    # Rewrite optional-dependencies
-    new_opt_deps = {}
-    for extra_name, specs in opt_deps.items():
-        new_specs = []
-        for spec in specs:
-            # Check if this spec references a workspace member package
-            if spec in pkg_to_deps:
-                # Inline the plugin's deps
-                new_specs.extend(pkg_to_deps[spec])
-            else:
-                new_specs.append(spec)
-        new_opt_deps[extra_name] = new_specs
-
-    data["project"]["optional-dependencies"] = new_opt_deps
-
-    # Remove [tool.uv] section — it's not valid in a published wheel
-    if "uv" in data.get("tool", {}):
-        del data["tool"]["uv"]
-
-    return data
-
-
-# ---------------------------------------------------------------------------
-# PEP 517 hooks
-# ---------------------------------------------------------------------------
-
-def build_wheel(wheel_directory: str, config_settings: dict[str, Any] | None = None, metadata_directory: str | None = None) -> str:
-    """Build a wheel with inlined plugin deps."""
-    root = Path.cwd()
-    pyproject_path = root / "pyproject.toml"
-
-    # Read and rewrite
-    data = _load_pyproject(pyproject_path)
-    data = _inline_plugin_deps(root, data)
-
-    # Write a temporary pyproject.toml, build, then restore
-    backup = pyproject_path.with_suffix(".toml.bak")
-    shutil.copy2(pyproject_path, backup)
-    try:
-        _save_pyproject(pyproject_path, data)
-
-        # Delegate to setuptools
-        import importlib
-        backend = importlib.import_module(_BACKEND)
-        return backend.build_wheel(wheel_directory, config_settings)
-    finally:
-        shutil.copy2(backup, pyproject_path)
-        backup.unlink()
-
-
-def build_sdist(sdist_directory: str, config_settings: dict[str, Any] | None = None) -> str:
-    """Build an sdist — no rewriting needed."""
-    import importlib
-    backend = importlib.import_module(_BACKEND)
-    return backend.build_sdist(sdist_directory, config_settings)
-
-
-def get_requires_for_build_wheel(config_settings: dict[str, Any] | None = None) -> list[str]:
-    return ["setuptools>=61.0", "tomli_w"]
-
-
-def get_requires_for_build_sdist(config_settings: dict[str, Any] | None = None) -> list[str]:
-    return ["setuptools>=61.0"]
-
-
-def prepare_metadata_for_build_wheel(metadata_directory: str, config_settings: dict[str, Any] | None = None) -> str:
-    """Prepare metadata with inlined plugin deps."""
-    root = Path.cwd()
-    pyproject_path = root / "pyproject.toml"
-
-    data = _load_pyproject(pyproject_path)
-    data = _inline_plugin_deps(root, data)
-
-    backup = pyproject_path.with_suffix(".toml.bak")
-    shutil.copy2(pyproject_path, backup)
-    try:
-        _save_pyproject(pyproject_path, data)
-
-        import importlib
-        backend = importlib.import_module(_BACKEND)
-        return backend.prepare_metadata_for_build_wheel(metadata_directory, config_settings)
-    finally:
-        shutil.copy2(backup, pyproject_path)
-        backup.unlink()
-
-
-def build_editable(wheel_directory: str, config_settings: dict[str, Any] | None = None, metadata_directory: str | None = None) -> str:
-    """Build an editable install — no rewriting needed (dev mode)."""
-    import importlib
-    backend = importlib.import_module(_BACKEND)
-    kwargs: dict[str, Any] = {"config_settings": config_settings}
-    if metadata_directory is not None:
-        kwargs["metadata_directory"] = metadata_directory
-    return backend.build_editable(wheel_directory, **kwargs)
-
-
-def get_requires_for_build_editable(config_settings: dict[str, Any] | None = None) -> list[str]:
-    return ["setuptools>=61.0"]
@@ -1,7 +1,7 @@
 {
  "id": "hermes-agent",
  "name": "Hermes Agent",
-  "version": "0.15.0",
+  "version": "0.15.1",
  "description": "Self-improving open-source AI agent by Nous Research with ACP editor integration, persistent memory, skills, and rich tool support.",
  "repository": "https://github.com/NousResearch/hermes-agent",
  "website": "https://hermes-agent.nousresearch.com/docs/user-guide/features/acp",
@@ -9,7 +9,7 @@
  "license": "MIT",
  "distribution": {
    "uvx": {
-      "package": "hermes-agent[acp]==0.15.0",
+      "package": "hermes-agent[acp]==0.15.1",
      "args": ["hermes-acp"]
    }
  }
@@ -6,9 +6,7 @@ from typing import Any, Optional

 import httpx

-from agent.plugin_registries import registries
-_is_oauth_token = registries.get_provider_service("anthropic", "_is_oauth_token")
-resolve_anthropic_token = registries.get_provider_service("anthropic", "resolve_anthropic_token")
+from agent.anthropic_adapter import _is_oauth_token, resolve_anthropic_token
 from hermes_cli.auth import _read_codex_tokens, resolve_codex_runtime_credentials
 from hermes_cli.runtime_provider import resolve_runtime_provider

@@ -178,7 +176,7 @@ def _fetch_anthropic_account_usage() -> Optional[AccountUsageSnapshot]:
    token = (resolve_anthropic_token() or "").strip()
    if not token:
        return None
-    if _is_oauth_token is not None and not _is_oauth_token(token):
+    if not _is_oauth_token(token):
        return AccountUsageSnapshot(
            provider="anthropic",
            source="oauth_usage_api",
@@ -404,7 +404,7 @@ def init_agent(
    agent.status_callback = status_callback
    agent.tool_gen_callback = tool_gen_callback

-
+    
    # Tool execution state — allows _vprint during tool execution
    # even when stream consumers are registered (no tokens streaming then)
    agent._executing_tools = False
@@ -437,12 +437,12 @@ def init_agent(
    # their tids explicitly.
    agent._tool_worker_threads: set[int] = set()
    agent._tool_worker_threads_lock = threading.Lock()
-
+    
    # Subagent delegation state
    agent._delegate_depth = 0        # 0 = top-level agent, incremented for children
    agent._active_children = []      # Running child AIAgents (for interrupt propagation)
    agent._active_children_lock = threading.Lock()
-
+    
    # Store OpenRouter provider preferences
    agent.providers_allowed = providers_allowed
    agent.providers_ignored = providers_ignored
@@ -455,7 +455,7 @@ def init_agent(
    # Store toolset filtering options
    agent.enabled_toolsets = enabled_toolsets
    agent.disabled_toolsets = disabled_toolsets
-
+    
    # Model response configuration
    agent.max_tokens = max_tokens  # None = use model default
    agent.reasoning_config = reasoning_config  # None = use default (medium for OpenRouter)
@@ -463,7 +463,7 @@ def init_agent(
    agent.request_overrides = dict(request_overrides or {})
    agent.prefill_messages = prefill_messages or []  # Prefilled conversation turns
    agent._force_ascii_payload = False
-
+    
    # Anthropic prompt caching: auto-enabled for Claude models on native
    # Anthropic, OpenRouter, and third-party gateways that speak the
    # Anthropic protocol (``api_mode == 'anthropic_messages'``). Reduces
@@ -535,7 +535,7 @@ def init_agent(
        # console. Any future noise reduction belongs at the
        # handler level inside hermes_logging.py, not here.
        pass
-
+    
    # Internal stream callback (set during streaming TTS).
    # Initialized here so _vprint can reference it before run_conversation.
    agent._stream_callback = None
@@ -585,14 +585,12 @@ def init_agent(
    _provider_timeout = get_provider_request_timeout(agent.provider, agent.model)

    if agent.api_mode == "anthropic_messages":
-        from agent.plugin_registries import registries
-        build_anthropic_client = registries.get_provider_service("anthropic", "build_anthropic_client")
-        resolve_anthropic_token = registries.get_provider_service("anthropic", "resolve_anthropic_token")
+        from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token
        # Bedrock + Claude → use AnthropicBedrock SDK for full feature parity
        # (prompt caching, thinking budgets, adaptive thinking).
        _is_bedrock_anthropic = agent.provider == "bedrock"
        if _is_bedrock_anthropic:
-            build_anthropic_bedrock_client = registries.get_provider_service("anthropic", "build_anthropic_bedrock_client")
+            from agent.anthropic_adapter import build_anthropic_bedrock_client
            _region_match = re.search(r"bedrock-runtime\.([a-z0-9-]+)\.", base_url or "")
            _br_region = _region_match.group(1) if _region_match else "us-east-1"
            agent._bedrock_region = _br_region
@@ -646,8 +644,8 @@ def init_agent(
            # so injects Claude-Code identity headers and system prompts
            # that cause 401/403 on their endpoints.  Guards #1739 and
            # the third-party identity-injection bug.
-            _is_oauth_token = registries.get_provider_service("anthropic", "_is_oauth_token")
-            agent._is_anthropic_oauth = _is_oauth_token(effective_key) if (_is_oauth_token is not None and _is_native_anthropic and isinstance(effective_key, str)) else False
+            from agent.anthropic_adapter import _is_oauth_token as _is_oat
+            agent._is_anthropic_oauth = _is_oat(effective_key) if (_is_native_anthropic and isinstance(effective_key, str)) else False
            agent._anthropic_client = build_anthropic_client(effective_key, base_url, timeout=_provider_timeout)
            # No OpenAI client needed for Anthropic mode
            agent.client = None
@@ -659,10 +657,9 @@ def init_agent(
                # The Anthropic adapter installs an httpx event hook
                # that mints a fresh JWT per request — we never
                # invoke or inspect the callable in the banner.
-                from agent.plugin_registries import registries
-                is_token_provider = registries.get_provider_service("azure", "is_token_provider")
+                from agent.azure_identity_adapter import is_token_provider

-                if is_token_provider and is_token_provider(effective_key):
+                if is_token_provider(effective_key):
                    print("🔑 Using credentials: Microsoft Entra ID")
                elif isinstance(effective_key, str) and len(effective_key) > 12:
                    print(f"🔑 Using token: {effective_key[:8]}...{effective_key[-4:]}")
@@ -872,11 +869,10 @@ def init_agent(
                # provider (Azure Foundry). The OpenAI SDK mints a
                # fresh JWT per request internally — the banner
                # never invokes or inspects the callable.
-                from agent.plugin_registries import registries
-                is_token_provider = registries.get_provider_service("azure", "is_token_provider")
+                from agent.azure_identity_adapter import is_token_provider

                key_used = client_kwargs.get("api_key", "none")
-                if is_token_provider and is_token_provider(key_used):
+                if is_token_provider(key_used):
                    print("🔑 Using credentials: Microsoft Entra ID")
                elif isinstance(key_used, str) and key_used and key_used != "dummy-key" and len(key_used) > 12:
                    print(f"🔑 Using API key: {key_used[:8]}...{key_used[-4:]}")
@@ -884,7 +880,7 @@ def init_agent(
                    print("⚠️  Warning: API key appears invalid or missing")
        except Exception as e:
            raise RuntimeError(f"Failed to initialize OpenAI client: {e}")
-
+    
    # Provider fallback chain — ordered list of backup providers tried
    # when the primary is exhausted (rate-limit, overload, connection
    # failure).  Supports both legacy single-dict ``fallback_model`` and
@@ -916,7 +912,7 @@ def init_agent(
        disabled_toolsets=disabled_toolsets,
        quiet_mode=agent.quiet_mode,
    )
-
+    
    # Show tool configuration and store valid tool names for validation
    agent.valid_tool_names = set()
    if agent.tools:
@@ -949,16 +945,16 @@ def init_agent(
        missing_reqs = [name for name, available in requirements.items() if not available]
        if missing_reqs:
            print(f"⚠️  Some tools may not work due to missing requirements: {missing_reqs}")
-
+    
    # Show trajectory saving status
    if agent.save_trajectories and not agent.quiet_mode:
        print("📝 Trajectory saving enabled")
-
+    
    # Show ephemeral system prompt status
    if agent.ephemeral_system_prompt and not agent.quiet_mode:
        prompt_preview = agent.ephemeral_system_prompt[:60] + "..." if len(agent.ephemeral_system_prompt) > 60 else agent.ephemeral_system_prompt
        print(f"🔒 Ephemeral system prompt: '{prompt_preview}' (not saved to trajectories)")
-
+    
    # Show prompt caching status
    if agent._use_prompt_caching and not agent.quiet_mode:
        if agent._use_native_cache_layout and agent.provider == "anthropic":
@@ -968,7 +964,7 @@ def init_agent(
        else:
            source = "Claude via OpenRouter"
        print(f"💾 Prompt caching: ENABLED ({source}, {agent._cache_ttl} TTL)")
-
+    
    # Session logging setup - auto-save conversation trajectories for debugging
    agent.session_start = datetime.now()
    if session_id:
@@ -1008,7 +1004,7 @@ def init_agent(
        pass
    # logs_dir is retained unconditionally for request_dump_*.json (debug
    # breadcrumb path written by agent_runtime_helpers.dump_api_request_debug).
-
+    
    # Track conversation messages for session logging
    agent._session_messages: List[Dict[str, Any]] = []
    # Responses encrypted reasoning replay state.  Some OpenAI-compatible
@@ -1020,10 +1016,10 @@ def init_agent(
    agent._codex_reasoning_replay_enabled = True
    agent._memory_write_origin = "assistant_tool"
    agent._memory_write_context = "foreground"
-
+    
    # Cached system prompt -- built once per session, only rebuilt on compression
    agent._cached_system_prompt: Optional[str] = None
-
+    
    # Filesystem checkpoint manager (transparent — not a tool)
    from tools.checkpoint_manager import CheckpointManager
    agent._checkpoint_mgr = CheckpointManager(
@@ -1032,7 +1028,7 @@ def init_agent(
        max_total_size_mb=checkpoint_max_total_size_mb,
        max_file_size_mb=checkpoint_max_file_size_mb,
    )
-
+    
    # SQLite session store (optional -- provided by CLI or gateway)
    agent._session_db = session_db
    agent._parent_session_id = parent_session_id
@@ -1043,11 +1039,11 @@ def init_agent(
        "reasoning_config": reasoning_config,
        "max_tokens": max_tokens,
    }
-
+    
    # In-memory todo list for task planning (one per agent/session)
    from tools.todo_tool import TodoStore
    agent._todo_store = TodoStore()
-
+    
    # Load config once for memory, skills, and compression sections
    try:
        from hermes_cli.config import load_config as _load_agent_config
@@ -1089,7 +1085,7 @@ def init_agent(
                agent._memory_store.load_from_disk()
        except Exception:
            pass  # Memory is optional -- don't break agent init
-
+    


    # Memory provider plugin (external — one at a time, alongside built-in)
@@ -1549,7 +1545,7 @@ def init_agent(
    agent.session_estimated_cost_usd = 0.0
    agent.session_cost_status = "unknown"
    agent.session_cost_source = "none"
-
+    
    # ── Ollama num_ctx injection ──
    # Ollama defaults to 2048 context regardless of the model's capabilities.
    # When running against an Ollama server, detect the model's max context
@@ -766,8 +766,7 @@ def try_recover_primary_transport(
        agent.api_key = rt["api_key"]

        if agent.api_mode == "anthropic_messages":
-            from agent.plugin_registries import registries
-            build_anthropic_client = registries.get_provider_service("anthropic", "build_anthropic_client")
+            from agent.anthropic_adapter import build_anthropic_client
            agent._anthropic_api_key = rt["anthropic_api_key"]
            agent._anthropic_base_url = rt["anthropic_base_url"]
            agent._anthropic_client = build_anthropic_client(
@@ -931,8 +930,7 @@ def restore_primary_runtime(agent) -> bool:

        # ── Rebuild client for the primary provider ──
        if agent.api_mode == "anthropic_messages":
-            from agent.plugin_registries import registries
-            build_anthropic_client = registries.get_provider_service("anthropic", "build_anthropic_client")
+            from agent.anthropic_adapter import build_anthropic_client
            agent._anthropic_api_key = rt["anthropic_api_key"]
            agent._anthropic_base_url = rt["anthropic_base_url"]
            agent._anthropic_client = build_anthropic_client(
@@ -1438,10 +1436,11 @@ def switch_model(agent, new_model, new_provider, api_key='', base_url='', api_mo

        # ── Build new client ──
        if api_mode == "anthropic_messages":
-            from agent.plugin_registries import registries
-            build_anthropic_client = registries.get_provider_service("anthropic", "build_anthropic_client")
-            resolve_anthropic_token = registries.get_provider_service("anthropic", "resolve_anthropic_token")
-            _is_oauth_token = registries.get_provider_service("anthropic", "_is_oauth_token")
+            from agent.anthropic_adapter import (
+                build_anthropic_client,
+                resolve_anthropic_token,
+                _is_oauth_token,
+            )
            # Only fall back to ANTHROPIC_TOKEN when the provider is actually Anthropic.
            # Other anthropic_messages providers (MiniMax, Alibaba, etc.) must use their own
            # API key — falling back would send Anthropic credentials to third-party endpoints.
@@ -1,166 +0,0 @@
-"""Anthropic auxiliary client wrappers — core module, no SDK dependency.
-
-Provides OpenAI-client-compatible shims over native Anthropic SDK clients,
-so auxiliary tasks (compression, vision, web extract, etc.) can call
-``client.chat.completions.create()`` regardless of the underlying SDK.
-
-The wrapper classes themselves never import the anthropic SDK.  They delegate
-wire-format conversion to :mod:`agent.anthropic_format` and response
-normalization to the ``anthropic_messages`` transport registered in
-:mod:`agent.transports`.
-"""
-
-from __future__ import annotations
-
-import asyncio
-import logging
-from types import SimpleNamespace
-from typing import Any, Optional
-
-from agent.anthropic_format import (
-    build_anthropic_kwargs,
-    _forbids_sampling_params,
-)
-
-logger = logging.getLogger(__name__)
-
-
-# ---------------------------------------------------------------------------
-# Adapter: Anthropic SDK → OpenAI-compatible completions.create()
-# ---------------------------------------------------------------------------
-
-class _AnthropicCompletionsAdapter:
-    """OpenAI-client-compatible adapter for Anthropic Messages API."""
-
-    def __init__(self, real_client: Any, model: str, is_oauth: bool = False):
-        self._client = real_client
-        self._model = model
-        self._is_oauth = is_oauth
-
-    def create(self, **kwargs) -> Any:
-        from agent.transports import get_transport
-
-        messages = kwargs.get("messages", [])
-        model = kwargs.get("model", self._model)
-        tools = kwargs.get("tools")
-        tool_choice = kwargs.get("tool_choice")
-        # ZAI's Anthropic-compatible endpoint rejects max_tokens on vision
-        # models (glm-4v-flash etc.) with error code 1210.  When the caller
-        # signals this by setting _skip_zai_max_tokens in kwargs, omit it.
-        _skip_mt = kwargs.pop("_skip_zai_max_tokens", False)
-        if _skip_mt:
-            max_tokens = None
-        else:
-            max_tokens = kwargs.get("max_tokens") or kwargs.get("max_completion_tokens") or 2000
-        temperature = kwargs.get("temperature")
-
-        normalized_tool_choice = None
-        if isinstance(tool_choice, str):
-            normalized_tool_choice = tool_choice
-        elif isinstance(tool_choice, dict):
-            choice_type = str(tool_choice.get("type", "")).lower()
-            if choice_type == "function":
-                normalized_tool_choice = tool_choice.get("function", {}).get("name")
-            elif choice_type in {"auto", "required", "none"}:
-                normalized_tool_choice = choice_type
-
-        anthropic_kwargs = build_anthropic_kwargs(
-            model=model,
-            messages=messages,
-            tools=tools,
-            max_tokens=max_tokens,
-            reasoning_config=None,
-            tool_choice=normalized_tool_choice,
-            is_oauth=self._is_oauth,
-        )
-        # Opus 4.7+ rejects any non-default temperature/top_p/top_k; only set
-        # temperature for models that still accept it. build_anthropic_kwargs
-        # additionally strips these keys as a safety net — keep both layers.
-        if temperature is not None:
-            if not _forbids_sampling_params(model):
-                anthropic_kwargs["temperature"] = temperature
-
-        response = self._client.messages.create(**anthropic_kwargs)
-        _transport = get_transport("anthropic_messages")
-        _nr = _transport.normalize_response(
-            response, strip_tool_prefix=self._is_oauth
-        )
-
-        assistant_message = SimpleNamespace(
-            content=_nr.content,
-            tool_calls=_nr.tool_calls,
-            reasoning=_nr.reasoning,
-        )
-        finish_reason = _nr.finish_reason
-
-        usage = None
-        if hasattr(response, "usage") and response.usage:
-            prompt_tokens = getattr(response.usage, "input_tokens", 0) or 0
-            completion_tokens = getattr(response.usage, "output_tokens", 0) or 0
-            total_tokens = getattr(response.usage, "total_tokens", 0) or (prompt_tokens + completion_tokens)
-            usage = SimpleNamespace(
-                prompt_tokens=prompt_tokens,
-                completion_tokens=completion_tokens,
-                total_tokens=total_tokens,
-            )
-
-        choice = SimpleNamespace(
-            index=0,
-            message=assistant_message,
-            finish_reason=finish_reason,
-        )
-        return SimpleNamespace(
-            choices=[choice],
-            model=model,
-            usage=usage,
-        )
-
-
-class _AnthropicChatShim:
-    def __init__(self, adapter: _AnthropicCompletionsAdapter):
-        self.completions = adapter
-
-
-# ---------------------------------------------------------------------------
-# Public wrappers
-# ---------------------------------------------------------------------------
-
-class AnthropicAuxiliaryClient:
-    """OpenAI-client-compatible wrapper over a native Anthropic client."""
-
-    def __init__(self, real_client: Any, model: str, api_key: str, base_url: str, is_oauth: bool = False):
-        self._real_client = real_client
-        adapter = _AnthropicCompletionsAdapter(real_client, model, is_oauth=is_oauth)
-        self.chat = _AnthropicChatShim(adapter)
-        self.api_key = api_key
-        self.base_url = base_url
-
-    def close(self):
-        close_fn = getattr(self._real_client, "close", None)
-        if callable(close_fn):
-            close_fn()
-
-
-class _AsyncAnthropicCompletionsAdapter:
-    def __init__(self, sync_adapter: _AnthropicCompletionsAdapter):
-        self._sync = sync_adapter
-
-    async def create(self, **kwargs) -> Any:
-        return await asyncio.to_thread(self._sync.create, **kwargs)
-
-
-class _AsyncAnthropicChatShim:
-    def __init__(self, adapter: _AsyncAnthropicCompletionsAdapter):
-        self.completions = adapter
-
-
-class AsyncAnthropicAuxiliaryClient:
-    def __init__(self, sync_wrapper: AnthropicAuxiliaryClient):
-        sync_adapter = sync_wrapper.chat.completions
-        async_adapter = _AsyncAnthropicCompletionsAdapter(sync_adapter)
-        self.chat = _AsyncAnthropicChatShim(async_adapter)
-        self.api_key = sync_wrapper.api_key
-        self.base_url = sync_wrapper.base_url
-        # Mirror _real_client so cache eviction on a poisoned underlying
-        # client also drops this entry.
-        self._real_client = sync_wrapper._real_client
@@ -106,41 +106,6 @@ from utils import base_url_host_matches, base_url_hostname, normalize_proxy_env_

 logger = logging.getLogger(__name__)

-# ---------------------------------------------------------------------------
-# Core anthropic wire-format modules (no SDK dependency)
-# ---------------------------------------------------------------------------
-
-from agent.anthropic_aux import (  # noqa: F401
-    AnthropicAuxiliaryClient,
-    AsyncAnthropicAuxiliaryClient,
-)
-
-# ---------------------------------------------------------------------------
-# Plugin-registry helper — access *plugin-provided* anthropic services
-# (resolve.py functions: maybe_wrap_anthropic, is_anthropic_compat_endpoint, etc.)
-# Wire-format code (message conversion, aux client wrappers) lives in core
-# and is imported directly above.
-# ---------------------------------------------------------------------------
-
-def _anthropic_plugin_service(name: str):
-    """Lazy accessor for anthropic plugin resolve services.
-
-    Only the SDK-dependent orchestration (maybe_wrap_anthropic,
-    is_anthropic_compat_endpoint, convert_openai_images_to_anthropic) lives
-    in the plugin.  Core accesses it through
-    ``registries.get_provider_service("anthropic", name)`` so that:
-      - Core never imports from a plugin package directly.
-      - The plugin need only be installed when the user actually uses it.
-    """
-    from agent.plugin_registries import registries
-    svc = registries.get_provider_service("anthropic", name)
-    if svc is None:
-        raise ImportError(
-            f"anthropic plugin service {name!r} not available — "
-            f"the hermes_agent_anthropic package may not be installed"
-        )
-    return svc
-

 def _safe_isinstance(obj: Any, maybe_type: Any) -> bool:
    """Return False instead of raising when a patched symbol is not a type."""
@@ -452,6 +417,7 @@ auxiliary_is_nous: bool = False
 _OPENROUTER_MODEL = "google/gemini-3-flash-preview"
 _NOUS_MODEL = "google/gemini-3-flash-preview"
 _NOUS_DEFAULT_BASE_URL = "https://inference-api.nousresearch.com/v1"
+_ANTHROPIC_DEFAULT_BASE_URL = "https://api.anthropic.com"
 _AUTH_JSON_PATH = get_hermes_home() / "auth.json"

 # Codex OAuth endpoint used when a caller explicitly requests
@@ -982,6 +948,253 @@ class AsyncCodexAuxiliaryClient:
        self._real_client = sync_wrapper._real_client


+class _AnthropicCompletionsAdapter:
+    """OpenAI-client-compatible adapter for Anthropic Messages API."""
+
+    def __init__(self, real_client: Any, model: str, is_oauth: bool = False):
+        self._client = real_client
+        self._model = model
+        self._is_oauth = is_oauth
+
+    def create(self, **kwargs) -> Any:
+        from agent.anthropic_adapter import build_anthropic_kwargs
+        from agent.transports import get_transport
+
+        messages = kwargs.get("messages", [])
+        model = kwargs.get("model", self._model)
+        tools = kwargs.get("tools")
+        tool_choice = kwargs.get("tool_choice")
+        # ZAI's Anthropic-compatible endpoint rejects max_tokens on vision
+        # models (glm-4v-flash etc.) with error code 1210.  When the caller
+        # signals this by setting _skip_zai_max_tokens in kwargs, omit it.
+        _skip_mt = kwargs.pop("_skip_zai_max_tokens", False)
+        if _skip_mt:
+            max_tokens = None
+        else:
+            max_tokens = kwargs.get("max_tokens") or kwargs.get("max_completion_tokens") or 2000
+        temperature = kwargs.get("temperature")
+
+        normalized_tool_choice = None
+        if isinstance(tool_choice, str):
+            normalized_tool_choice = tool_choice
+        elif isinstance(tool_choice, dict):
+            choice_type = str(tool_choice.get("type", "")).lower()
+            if choice_type == "function":
+                normalized_tool_choice = tool_choice.get("function", {}).get("name")
+            elif choice_type in {"auto", "required", "none"}:
+                normalized_tool_choice = choice_type
+
+        anthropic_kwargs = build_anthropic_kwargs(
+            model=model,
+            messages=messages,
+            tools=tools,
+            max_tokens=max_tokens,
+            reasoning_config=None,
+            tool_choice=normalized_tool_choice,
+            is_oauth=self._is_oauth,
+        )
+        # Opus 4.7+ rejects any non-default temperature/top_p/top_k; only set
+        # temperature for models that still accept it. build_anthropic_kwargs
+        # additionally strips these keys as a safety net — keep both layers.
+        if temperature is not None:
+            from agent.anthropic_adapter import _forbids_sampling_params
+            if not _forbids_sampling_params(model):
+                anthropic_kwargs["temperature"] = temperature
+
+        response = self._client.messages.create(**anthropic_kwargs)
+        _transport = get_transport("anthropic_messages")
+        _nr = _transport.normalize_response(
+            response, strip_tool_prefix=self._is_oauth
+        )
+
+        # ToolCall already duck-types as OpenAI shape (.type, .function.name,
+        # .function.arguments) via properties, so no wrapping needed.
+        assistant_message = SimpleNamespace(
+            content=_nr.content,
+            tool_calls=_nr.tool_calls,
+            reasoning=_nr.reasoning,
+        )
+        finish_reason = _nr.finish_reason
+
+        usage = None
+        if hasattr(response, "usage") and response.usage:
+            prompt_tokens = getattr(response.usage, "input_tokens", 0) or 0
+            completion_tokens = getattr(response.usage, "output_tokens", 0) or 0
+            total_tokens = getattr(response.usage, "total_tokens", 0) or (prompt_tokens + completion_tokens)
+            usage = SimpleNamespace(
+                prompt_tokens=prompt_tokens,
+                completion_tokens=completion_tokens,
+                total_tokens=total_tokens,
+            )
+
+        choice = SimpleNamespace(
+            index=0,
+            message=assistant_message,
+            finish_reason=finish_reason,
+        )
+        return SimpleNamespace(
+            choices=[choice],
+            model=model,
+            usage=usage,
+        )
+
+
+class _AnthropicChatShim:
+    def __init__(self, adapter: _AnthropicCompletionsAdapter):
+        self.completions = adapter
+
+
+class AnthropicAuxiliaryClient:
+    """OpenAI-client-compatible wrapper over a native Anthropic client."""
+
+    def __init__(self, real_client: Any, model: str, api_key: str, base_url: str, is_oauth: bool = False):
+        self._real_client = real_client
+        adapter = _AnthropicCompletionsAdapter(real_client, model, is_oauth=is_oauth)
+        self.chat = _AnthropicChatShim(adapter)
+        self.api_key = api_key
+        self.base_url = base_url
+
+    def close(self):
+        close_fn = getattr(self._real_client, "close", None)
+        if callable(close_fn):
+            close_fn()
+
+
+class _AsyncAnthropicCompletionsAdapter:
+    def __init__(self, sync_adapter: _AnthropicCompletionsAdapter):
+        self._sync = sync_adapter
+
+    async def create(self, **kwargs) -> Any:
+        import asyncio
+        return await asyncio.to_thread(self._sync.create, **kwargs)
+
+
+class _AsyncAnthropicChatShim:
+    def __init__(self, adapter: _AsyncAnthropicCompletionsAdapter):
+        self.completions = adapter
+
+
+class AsyncAnthropicAuxiliaryClient:
+    def __init__(self, sync_wrapper: "AnthropicAuxiliaryClient"):
+        sync_adapter = sync_wrapper.chat.completions
+        async_adapter = _AsyncAnthropicCompletionsAdapter(sync_adapter)
+        self.chat = _AsyncAnthropicChatShim(async_adapter)
+        self.api_key = sync_wrapper.api_key
+        self.base_url = sync_wrapper.base_url
+        # See AsyncCodexAuxiliaryClient: mirror _real_client so cache
+        # eviction on a poisoned underlying client also drops this entry.
+        self._real_client = sync_wrapper._real_client
+
+
+def _endpoint_speaks_anthropic_messages(base_url: str) -> bool:
+    """True if the endpoint at ``base_url`` speaks the Anthropic Messages
+    protocol instead of OpenAI chat.completions.
+
+    Mirrors ``hermes_cli.runtime_provider._detect_api_mode_for_url`` so the
+    auxiliary client and the main agent stay in sync on transport selection.
+    Covers:
+
+    - Any URL ending in ``/anthropic`` (MiniMax, Zhipu GLM, LiteLLM proxies,
+      Anthropic-compatible gateways).
+    - ``api.kimi.com/coding`` (Kimi Coding Plan — the /coding route only
+      speaks Claude-Code's native Anthropic shape; ``chat.completions``
+      returns 404 on Anthropic-only model aliases like ``kimi-for-coding``).
+    - ``api.anthropic.com`` (native Anthropic).
+    """
+    normalized = (base_url or "").strip().lower().rstrip("/")
+    if not normalized:
+        return False
+    if normalized.endswith("/anthropic"):
+        return True
+    hostname = base_url_hostname(normalized)
+    if hostname == "api.anthropic.com":
+        return True
+    if hostname == "api.kimi.com" and "/coding" in normalized:
+        return True
+    return False
+
+
+def _maybe_wrap_anthropic(
+    client_obj: Any,
+    model: str,
+    api_key: str,
+    base_url: str,
+    api_mode: Optional[str] = None,
+) -> Any:
+    """Rewrap a plain OpenAI client in ``AnthropicAuxiliaryClient`` when
+    the endpoint actually speaks Anthropic Messages.
+
+    This is the single chokepoint for aux-client transport correction.
+    Runs at the end of every ``resolve_provider_client`` branch so that
+    api_key providers (Kimi Coding Plan), the ``custom`` endpoint, and
+    future /anthropic gateways all land on the right wire format
+    regardless of which branch built the client.
+
+    Returns ``client_obj`` unchanged when:
+
+    - It's already an Anthropic/Codex/Gemini/CopilotACP wrapper.
+    - The endpoint is an OpenAI-wire endpoint.
+    - ``api_mode`` is explicitly set to a non-Anthropic transport.
+    - The ``anthropic`` SDK is not installed (falls back to OpenAI wire).
+    """
+    # Already wrapped — don't double-wrap.
+    if _safe_isinstance(client_obj, AnthropicAuxiliaryClient):
+        return client_obj
+    # Other specialized adapters we should never re-dispatch.
+    if _safe_isinstance(client_obj, CodexAuxiliaryClient):
+        return client_obj
+    try:
+        from agent.gemini_native_adapter import GeminiNativeClient
+        if _safe_isinstance(client_obj, GeminiNativeClient):
+            return client_obj
+    except ImportError:
+        pass
+    try:
+        from agent.copilot_acp_client import CopilotACPClient
+        if _safe_isinstance(client_obj, CopilotACPClient):
+            return client_obj
+    except ImportError:
+        pass
+
+    # Explicit non-anthropic api_mode wins over URL heuristics.
+    if api_mode and api_mode != "anthropic_messages":
+        return client_obj
+
+    should_wrap = (
+        api_mode == "anthropic_messages"
+        or _endpoint_speaks_anthropic_messages(base_url)
+    )
+    if not should_wrap:
+        return client_obj
+
+    try:
+        from agent.anthropic_adapter import build_anthropic_client
+    except ImportError:
+        logger.warning(
+            "Endpoint %s speaks Anthropic Messages but the anthropic SDK is "
+            "not installed — falling back to OpenAI-wire (will likely 404).",
+            base_url,
+        )
+        return client_obj
+
+    try:
+        real_client = build_anthropic_client(api_key, base_url)
+    except Exception as exc:
+        logger.warning(
+            "Failed to build Anthropic client for %s (%s) — falling back to "
+            "OpenAI-wire client.", base_url, exc,
+        )
+        return client_obj
+
+    logger.debug(
+        "Auxiliary transport: wrapping client in AnthropicAuxiliaryClient "
+        "(model=%s, base_url=%s, api_mode=%s)",
+        model, base_url[:60] if base_url else "", api_mode or "auto-detected",
+    )
+    return AnthropicAuxiliaryClient(
+        real_client, model, api_key, base_url, is_oauth=False,
+    )
+

 def _read_nous_auth() -> Optional[dict]:
    """Read and validate ~/.hermes/auth.json for an active Nous provider.
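Reviewer note: the URL heuristic added in `_endpoint_speaks_anthropic_messages` can be exercised in isolation. The following is a minimal sketch, not the module's code. It assumes `urllib.parse.urlsplit` as a stand-in for the `base_url_hostname` helper, whose implementation is not shown in this diff, and the function name `speaks_anthropic_messages` is hypothetical.

```python
from urllib.parse import urlsplit


def speaks_anthropic_messages(base_url: str) -> bool:
    """Stand-alone approximation of the transport heuristic (hypothetical name)."""
    normalized = (base_url or "").strip().lower().rstrip("/")
    if not normalized:
        return False
    # Any gateway exposing an /anthropic suffix (MiniMax, LiteLLM proxies, ...).
    if normalized.endswith("/anthropic"):
        return True
    # Assumption: urlsplit stands in for base_url_hostname, which this diff does not show.
    hostname = urlsplit(normalized).hostname or ""
    if hostname == "api.anthropic.com":
        return True
    # Kimi only speaks the Anthropic shape on its /coding route.
    return hostname == "api.kimi.com" and "/coding" in normalized
```

Under these assumptions, a trailing slash or mixed case in the configured URL does not change the result, because the input is normalized before any comparison.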
@@ -1192,14 +1405,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
                    continue
            except ImportError:
                pass
-            # Delegate to the anthropic plugin resolver via the registry
-            from agent.plugin_registries import registries as _ar
-            _anthro_resolver = _ar.get_provider_resolver("anthropic")
-            if _anthro_resolver is not None:
-                _ac, _am = _anthro_resolver()
-                if _ac is not None:
-                    return _ac, _am
-            continue
+            return _try_anthropic()

        pool_present, entry = _select_pool_entry(provider_id)
        if pool_present:
@@ -1236,7 +1442,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
                except Exception:
                    pass
            _client = OpenAI(api_key=api_key, base_url=base_url, **extra)
-            _client = _anthropic_plugin_service("maybe_wrap_anthropic")(_client, model, api_key, raw_base_url)
+            _client = _maybe_wrap_anthropic(_client, model, api_key, raw_base_url)
            return _client, model

        creds = resolve_api_key_provider_credentials(provider_id)
@@ -1273,7 +1479,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
            except Exception:
                pass
        _client = OpenAI(api_key=api_key, base_url=base_url, **extra)
-        _client = _anthropic_plugin_service("maybe_wrap_anthropic")(_client, model, api_key, raw_base_url)
+        _client = _maybe_wrap_anthropic(_client, model, api_key, raw_base_url)
        return _client, model

    return None, None
@@ -1282,6 +1488,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
 # ── Provider resolution helpers ─────────────────────────────────────────────


+
 def _try_openrouter(explicit_api_key: str = None, model: str = None) -> Tuple[Optional[OpenAI], Optional[str]]:
    pool_present, entry = _select_pool_entry("openrouter")
    if pool_present:
@@ -1603,11 +1810,7 @@ def _try_custom_endpoint() -> Tuple[Optional[Any], Optional[str]]:
        # LiteLLM proxies, etc.).  Must NEVER be treated as OAuth —
        # Anthropic OAuth claims only apply to api.anthropic.com.
        try:
-            from agent.plugin_registries import registries
-            _anthropic = registries.get_provider_namespace("anthropic")
-            build_anthropic_client = _anthropic.get("build_anthropic_client")
-            if build_anthropic_client is None:
-                raise ImportError("anthropic provider not registered")
+            from agent.anthropic_adapter import build_anthropic_client
            real_client = build_anthropic_client(custom_key, custom_base)
        except ImportError:
            logger.warning(
@@ -1622,7 +1825,7 @@ def _try_custom_endpoint() -> Tuple[Optional[Any], Optional[str]]:
    # URL-based anthropic detection for custom endpoints that didn't set
    # api_mode explicitly (e.g. kimi.com/coding reached via custom config).
    _fallback_client = OpenAI(api_key=custom_key, base_url=_clean_base, **_extra)
-    _fallback_client = _anthropic_plugin_service("maybe_wrap_anthropic")(
+    _fallback_client = _maybe_wrap_anthropic(
        _fallback_client, model, custom_key, custom_base, custom_mode,
    )
    return _fallback_client, model
@@ -1800,7 +2003,7 @@ def _try_azure_foundry(
        # for Entra ID it's a callable. ``_maybe_wrap_anthropic`` →
        # ``build_anthropic_client`` detects the callable and installs
        # the bearer-injecting httpx hook.
-        return _anthropic_plugin_service("maybe_wrap_anthropic")(
+        return _maybe_wrap_anthropic(
            client, final_model, api_key,
            base_url, runtime_api_mode,
        ), final_model
@@ -1809,6 +2012,54 @@ def _try_azure_foundry(
    return client, final_model


+def _try_anthropic(explicit_api_key: Optional[str] = None) -> Tuple[Optional[Any], Optional[str]]:
+    try:
+        from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token
+    except ImportError:
+        return None, None
+
+    pool_present, entry = _select_pool_entry("anthropic")
+    if pool_present:
+        if entry is None:
+            return None, None
+        token = explicit_api_key or _pool_runtime_api_key(entry)
+    else:
+        entry = None
+        token = explicit_api_key or resolve_anthropic_token()
+    if not token:
+        return None, None
+
+    # Allow base URL override from config.yaml model.base_url, but only
+    # when the configured provider is anthropic — otherwise a non-Anthropic
+    # base_url (e.g. Codex endpoint) would leak into Anthropic requests.
+    base_url = _pool_runtime_base_url(entry, _ANTHROPIC_DEFAULT_BASE_URL) if pool_present else _ANTHROPIC_DEFAULT_BASE_URL
+    try:
+        from hermes_cli.config import load_config
+        cfg = load_config()
+        model_cfg = cfg.get("model")
+        if isinstance(model_cfg, dict):
+            cfg_provider = str(model_cfg.get("provider") or "").strip().lower()
+            if cfg_provider == "anthropic":
+                cfg_base_url = (model_cfg.get("base_url") or "").strip().rstrip("/")
+                if cfg_base_url:
+                    base_url = cfg_base_url
+    except Exception:
+        pass
+
+    from agent.anthropic_adapter import _is_oauth_token
+    is_oauth = _is_oauth_token(token)
+    model = _get_aux_model_for_provider("anthropic") or "claude-haiku-4-5-20251001"
+    logger.debug("Auxiliary client: Anthropic native (%s) at %s (oauth=%s)", model, base_url, is_oauth)
+    try:
+        real_client = build_anthropic_client(token, base_url)
+    except ImportError:
+        # The anthropic_adapter module imports fine but the SDK itself is
+        # missing — build_anthropic_client raises ImportError at call time
+        # when _anthropic_sdk is None.  Treat as unavailable.
+        return None, None
+    return AnthropicAuxiliaryClient(real_client, model, token, base_url, is_oauth=is_oauth), model
+
+
 _AUTO_PROVIDER_LABELS = {
    "_try_openrouter": "openrouter",
    "_try_nous": "nous",
@@ -2378,8 +2629,8 @@ def _retry_same_provider_sync(
        extra_body=effective_extra_body,
        base_url=retry_base or resolved_base_url,
    )
-    if _anthropic_plugin_service("is_anthropic_compat_endpoint")(resolved_provider, retry_base):
-        retry_kwargs["messages"] = _anthropic_plugin_service("convert_openai_images_to_anthropic")(retry_kwargs["messages"])
+    if _is_anthropic_compat_endpoint(resolved_provider, retry_base):
+        retry_kwargs["messages"] = _convert_openai_images_to_anthropic(retry_kwargs["messages"])
    return _validate_llm_response(
        retry_client.chat.completions.create(**retry_kwargs), task,
    )
@@ -2435,8 +2686,8 @@ async def _retry_same_provider_async(
        extra_body=effective_extra_body,
        base_url=retry_base or resolved_base_url,
    )
-    if _anthropic_plugin_service("is_anthropic_compat_endpoint")(resolved_provider, retry_base):
-        retry_kwargs["messages"] = _anthropic_plugin_service("convert_openai_images_to_anthropic")(retry_kwargs["messages"])
+    if _is_anthropic_compat_endpoint(resolved_provider, retry_base):
+        retry_kwargs["messages"] = _convert_openai_images_to_anthropic(retry_kwargs["messages"])
    return _validate_llm_response(
        await retry_client.chat.completions.create(**retry_kwargs), task,
    )
@@ -2470,19 +2721,12 @@ def _refresh_provider_credentials(provider: str) -> bool:
            _evict_cached_clients(normalized)
            return True
        if normalized == "anthropic":
-            from agent.plugin_registries import registries
-            _anthropic = registries.get_provider_namespace("anthropic")
-            read_claude_code_credentials = _anthropic.get("read_claude_code_credentials")
-            _refresh_oauth_token = _anthropic.get("_refresh_oauth_token")
-            resolve_anthropic_token = _anthropic.get("resolve_anthropic_token")
-            if read_claude_code_credentials is None:
-                return False
+            from agent.anthropic_adapter import read_claude_code_credentials, _refresh_oauth_token, resolve_anthropic_token

            creds = read_claude_code_credentials()
-            token = _refresh_oauth_token(creds) if isinstance(creds, dict) and creds.get("refreshToken") and _refresh_oauth_token else None
+            token = _refresh_oauth_token(creds) if isinstance(creds, dict) and creds.get("refreshToken") else None
            if not str(token or "").strip():
-                if resolve_anthropic_token is not None:
-                    token = resolve_anthropic_token()
+                token = resolve_anthropic_token()
            if not str(token or "").strip():
                return False
            _evict_cached_clients(normalized)
@@ -2803,7 +3047,7 @@ def _to_async_client(sync_client, model: str, is_vision: bool = False):

    if isinstance(sync_client, CodexAuxiliaryClient):
        return AsyncCodexAuxiliaryClient(sync_client), model
-    if _safe_isinstance(sync_client, AnthropicAuxiliaryClient):
+    if isinstance(sync_client, AnthropicAuxiliaryClient):
        return AsyncAnthropicAuxiliaryClient(sync_client), model
    try:
        from agent.gemini_native_adapter import GeminiNativeClient, AsyncGeminiNativeClient
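Reviewer note: `_to_async_client` hands Anthropic clients to `AsyncAnthropicAuxiliaryClient`, which reuses the blocking adapter through `asyncio.to_thread` (Python 3.9+). The sketch below illustrates that pattern on its own. `SyncCompletions`, `AsyncCompletions`, and `demo` are hypothetical names, and the stub `create` only echoes its input instead of calling any API.

```python
import asyncio
from types import SimpleNamespace


class SyncCompletions:
    """Hypothetical blocking adapter standing in for _AnthropicCompletionsAdapter."""

    def create(self, **kwargs):
        # A real adapter would perform a blocking HTTP call here.
        return SimpleNamespace(model=kwargs.get("model"), choices=[])


class AsyncCompletions:
    """Runs the blocking create() in a worker thread so the event loop stays free."""

    def __init__(self, sync_adapter: SyncCompletions):
        self._sync = sync_adapter

    async def create(self, **kwargs):
        return await asyncio.to_thread(self._sync.create, **kwargs)


def demo() -> str:
    # Drive the async wrapper from synchronous code for illustration.
    result = asyncio.run(AsyncCompletions(SyncCompletions()).create(model="example-model"))
    return result.model
```

The trade-off of this design is that each in-flight call occupies a thread from the default executor, which is usually acceptable for low-volume auxiliary tasks.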
@@ -2989,7 +3233,7 @@ def resolve_provider_client(
            return CodexAuxiliaryClient(client_obj, final_model_str)
        # Anthropic-wire endpoints: rewrap plain OpenAI clients so
        # chat.completions.create() is translated to /v1/messages.
-        return _anthropic_plugin_service("maybe_wrap_anthropic")(
+        return _maybe_wrap_anthropic(
            client_obj, final_model_str, api_key_str, base_url_str, api_mode,
        )

@@ -3221,11 +3465,7 @@ def resolve_provider_client(
                # branch in _try_custom_endpoint(). See #15033.
                if entry_api_mode == "anthropic_messages":
                    try:
-                        from agent.plugin_registries import registries
-                        _anthropic = registries.get_provider_namespace("anthropic")
-                        build_anthropic_client = _anthropic.get("build_anthropic_client")
-                        if build_anthropic_client is None:
-                            raise ImportError("anthropic provider not registered")
+                        from agent.anthropic_adapter import build_anthropic_client
                        real_client = build_anthropic_client(custom_key, custom_base)
                    except ImportError:
                        logger.warning(
@@ -3268,32 +3508,39 @@ def resolve_provider_client(
    except ImportError:
        pass

-    # ── Plugin-registered resolvers (azure-foundry, etc.) ──────────────
-    # Providers with complex auth (Entra ID, OAuth, etc.) register a
-    # resolver callable so core doesn't need per-provider if/elif branches.
-    from agent.plugin_registries import registries as _reg_early
-    _early_resolver = _reg_early.get_provider_resolver(provider)
-    if _early_resolver is not None:
-        client, default_model = _early_resolver(
+    # ── Azure Foundry (delegates to runtime resolver for auth_mode-aware routing) ─
+    #
+    # The generic PROVIDER_REGISTRY path below uses
+    # ``resolve_api_key_provider_credentials`` which only knows about the
+    # static ``AZURE_FOUNDRY_API_KEY`` env var. That misses two important
+    # cases for the ``azure-foundry`` provider:
+    #
+    #   1. ``model.auth_mode: entra_id`` — no static key exists; we need
+    #      a callable bearer-token provider from ``azure_identity_adapter``.
+    #   2. Non-default ``model.base_url`` (Foundry projects path) — the
+    #      env-var-only resolver doesn't apply config-yaml-driven URL
+    #      overrides.
+    #
+    # Delegate to the same runtime resolver the main agent uses so
+    # auxiliary tasks (title generation, compression, vision, embedding,
+    # session search) inherit the user's full Azure config.
+    if provider == "azure-foundry":
+        client, default_model = _try_azure_foundry(
            model=model,
            explicit_api_key=explicit_api_key,
            explicit_base_url=explicit_base_url,
-            async_mode=async_mode,
-            is_vision=is_vision,
-            main_runtime=main_runtime,
            api_mode=api_mode,
        )
-        if client is not None:
-            final_model = _normalize_resolved_model(model or default_model, provider)
-            return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
-                    else (client, final_model))
-        # Resolver returned None — provider unavailable
-        logger.warning(
-            "resolve_provider_client: %s requested but resolver returned "
-            "no client (run: hermes doctor for diagnostics)",
-            provider,
-        )
-        return None, None
+        if client is None:
+            logger.warning(
+                "resolve_provider_client: azure-foundry requested but "
+                "runtime resolution failed (run: hermes doctor for "
+                "diagnostics)"
+            )
+            return None, None
+        final_model = _normalize_resolved_model(model or default_model, provider)
+        return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
+                else (client, final_model))

    # ── API-key providers from PROVIDER_REGISTRY ─────────────────────
    try:
@@ -3312,6 +3559,14 @@ def resolve_provider_client(
        return None, None

    if pconfig.auth_type == "api_key":
+        if provider == "anthropic":
+            client, default_model = _try_anthropic(explicit_api_key=explicit_api_key)
+            if client is None:
+                logger.warning("resolve_provider_client: anthropic requested but no Anthropic credentials found")
+                return None, None
+            final_model = _normalize_resolved_model(model or default_model, provider)
+            return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode else (client, final_model))
+
        creds = resolve_api_key_provider_credentials(provider)
        api_key = str(creds.get("api_key", "")).strip()
        # Honour an explicit api_key override (e.g. from a fallback_model entry
@@ -3444,14 +3699,37 @@ def resolve_provider_client(
        return None, None

    elif pconfig.auth_type == "aws_sdk":
-        # AWS SDK providers (e.g. Bedrock) — handled by the early resolver
-        # catch above when a plugin registers one.  If we reach here, no
-        # resolver was registered.
-        logger.warning(
-            "resolve_provider_client: aws_sdk provider %s has no "
-            "registered resolver (plugin not loaded?)", provider,
+        # AWS SDK providers (Bedrock) — use the Anthropic Bedrock client via
+        # boto3's credential chain (IAM roles, SSO, env vars, instance metadata).
+        try:
+            from agent.bedrock_adapter import has_aws_credentials, resolve_bedrock_region
+            from agent.anthropic_adapter import build_anthropic_bedrock_client
+        except ImportError:
+            logger.warning("resolve_provider_client: bedrock requested but "
+                           "boto3 or anthropic SDK not installed")
+            return None, None
+
+        if not has_aws_credentials():
+            logger.debug("resolve_provider_client: bedrock requested but "
+                         "no AWS credentials found")
+            return None, None
+
+        region = resolve_bedrock_region()
+        default_model = "anthropic.claude-haiku-4-5-20251001-v1:0"
+        final_model = _normalize_resolved_model(model or default_model, provider)
+        try:
+            real_client = build_anthropic_bedrock_client(region)
+        except ImportError as exc:
+            logger.warning("resolve_provider_client: cannot create Bedrock "
+                           "client: %s", exc)
+            return None, None
+        client = AnthropicAuxiliaryClient(
+            real_client, final_model, api_key="aws-sdk",
+            base_url=f"https://bedrock-runtime.{region}.amazonaws.com",
        )
-        return None, None
+        logger.debug("resolve_provider_client: bedrock (%s, %s)", final_model, region)
+        return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
+                else (client, final_model))

    elif pconfig.auth_type in {"oauth_device_code", "oauth_external"}:
        # OAuth providers — route through their specific try functions
@@ -3575,12 +3853,7 @@ def _resolve_strict_vision_backend(
        # allow-list); callers must specify via auxiliary.<task>.model.
        return resolve_provider_client("openai-codex", model, is_vision=True)
    if provider == "anthropic":
-        from agent.plugin_registries import registries as _reg
-        _resolver = _reg.get_provider_resolver("anthropic")
-        if _resolver is not None:
-            return _resolver(model=model)
-        # Fallback: no resolver registered (plugin not loaded)
-        return None, None
+        return _try_anthropic()
    if provider == "custom":
        return _try_custom_endpoint()
    return None, None
@@ -4310,6 +4583,69 @@ def _get_task_extra_body(task: str) -> Dict[str, Any]:

 # Providers that use Anthropic-compatible endpoints (via OpenAI SDK wrapper).
 # Their image content blocks must use Anthropic format, not OpenAI format.
+_ANTHROPIC_COMPAT_PROVIDERS = frozenset({"minimax", "minimax-oauth", "minimax-cn"})
+
+
+def _is_anthropic_compat_endpoint(provider: str, base_url: str) -> bool:
+    """Detect if an endpoint expects Anthropic-format content blocks.
+
+    Returns True for known Anthropic-compatible providers (MiniMax) and
+    any endpoint whose URL contains ``/anthropic`` in the path.
+    """
+    if provider in _ANTHROPIC_COMPAT_PROVIDERS:
+        return True
+    url_lower = (base_url or "").lower()
+    return "/anthropic" in url_lower
+
+
+def _convert_openai_images_to_anthropic(messages: list) -> list:
+    """Convert OpenAI ``image_url`` content blocks to Anthropic ``image`` blocks.
+
+    Only touches messages that have list-type content with ``image_url`` blocks;
+    plain text messages pass through unchanged.
+    """
+    converted = []
+    for msg in messages:
+        content = msg.get("content")
+        if not isinstance(content, list):
+            converted.append(msg)
+            continue
+        new_content = []
+        changed = False
+        for block in content:
+            if isinstance(block, dict) and block.get("type") == "image_url":
+                image_url_val = (block.get("image_url") or {}).get("url", "")
+                if image_url_val.startswith("data:"):
+                    # Parse data URI: data:<media_type>;base64,<data>
+                    header, _, b64data = image_url_val.partition(",")
+                    media_type = "image/png"
+                    if ":" in header and ";" in header:
+                        media_type = header.split(":", 1)[1].split(";", 1)[0]
+                    new_content.append({
+                        "type": "image",
+                        "source": {
+                            "type": "base64",
+                            "media_type": media_type,
+                            "data": b64data,
+                        },
+                    })
+                else:
+                    # URL-based image
+                    new_content.append({
+                        "type": "image",
+                        "source": {
+                            "type": "url",
+                            "url": image_url_val,
+                        },
+                    })
+                changed = True
+            else:
+                new_content.append(block)
+        converted.append({**msg, "content": new_content} if changed else msg)
+    return converted
+
+
+
 def _build_call_kwargs(
    provider: str,
    model: str,
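Reviewer note: to make the block conversion in `_convert_openai_images_to_anthropic` concrete, here is a condensed sketch of the per-block logic, re-implemented so it runs on its own. The function name `image_url_to_anthropic_block` is hypothetical; the parsing steps follow the ones in the diff.

```python
def image_url_to_anthropic_block(url: str) -> dict:
    """Turn an OpenAI image_url value into an Anthropic image block (illustrative)."""
    if not url.startswith("data:"):
        # Remote images are passed through by reference.
        return {"type": "image", "source": {"type": "url", "url": url}}
    # Data URI layout: data:<media_type>;base64,<data>
    header, _, b64data = url.partition(",")
    media_type = "image/png"  # same default as the function in the diff
    if ":" in header and ";" in header:
        media_type = header.split(":", 1)[1].split(";", 1)[0]
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": b64data},
    }
```

For example, `image_url_to_anthropic_block("data:image/jpeg;base64,AAAA")` yields a `base64` source with `media_type` set to `image/jpeg` and `data` set to `AAAA`.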
@@ -4339,10 +4675,8 @@ def _build_call_kwargs(
    # structured-JSON extraction) don't 400 the moment
    # the aux model is flipped to 4.7.
    if temperature is not None:
-        from agent.plugin_registries import registries
-        _anthropic = registries.get_provider_namespace("anthropic")
-        _forbids_sampling_params = _anthropic.get("_forbids_sampling_params")
-        if _forbids_sampling_params is not None and _forbids_sampling_params(model):
+        from agent.anthropic_adapter import _forbids_sampling_params
+        if _forbids_sampling_params(model):
            temperature = None

    if temperature is not None:
@@ -4554,8 +4888,8 @@ def call_llm(

    # Convert image blocks for Anthropic-compatible endpoints (e.g. MiniMax)
    _client_base = str(getattr(client, "base_url", "") or "")
-    if _anthropic_plugin_service("is_anthropic_compat_endpoint")(resolved_provider, _client_base):
-        kwargs["messages"] = _anthropic_plugin_service("convert_openai_images_to_anthropic")(kwargs["messages"])
+    if _is_anthropic_compat_endpoint(resolved_provider, _client_base):
+        kwargs["messages"] = _convert_openai_images_to_anthropic(kwargs["messages"])

    # Handle unsupported temperature, max_tokens vs max_completion_tokens retry,
    # then payment fallback.
@@ -4997,8 +5331,8 @@ async def async_call_llm(
        base_url=_client_base or resolved_base_url)

    # Convert image blocks for Anthropic-compatible endpoints (e.g. MiniMax)
-    if _anthropic_plugin_service("is_anthropic_compat_endpoint")(resolved_provider, _client_base):
-        kwargs["messages"] = _anthropic_plugin_service("convert_openai_images_to_anthropic")(kwargs["messages"])
+    if _is_anthropic_compat_endpoint(resolved_provider, _client_base):
+        kwargs["messages"] = _convert_openai_images_to_anthropic(kwargs["messages"])

    try:
        return _validate_llm_response(
@@ -54,6 +54,8 @@ SCOPE_AI_AZURE_DEFAULT = "https://ai.azure.com/.default"
 # Lazy SDK import — only loaded when the Entra path is actually used.
 # ---------------------------------------------------------------------------

+_AZURE_IDENTITY_FEATURE = "provider.azure_identity"
+

 def has_azure_identity_installed() -> bool:
    """Return True if `azure-identity` can be imported right now.
@@ -68,20 +70,35 @@ def has_azure_identity_installed() -> bool:


 def _require_azure_identity():
-    """Import ``azure.identity``.
+    """Import ``azure.identity``, lazy-installing it if allowed.

    Raises ``ImportError`` with a clear actionable message when the
-    package is missing.
+    package is missing and lazy installs are disabled.
    """
    try:
        import azure.identity as _ai
        return _ai
    except ImportError:
-        raise ImportError(
-            "The 'azure-identity' package is required for Azure AI "
-            "Foundry Entra ID authentication. Install it with: "
-            "pip install azure-identity"
-        )
+        try:
+            from tools.lazy_deps import ensure, FeatureUnavailable
+        except ImportError as exc:
+            raise ImportError(
+                "The 'azure-identity' package is required for Azure AI "
+                "Foundry Entra ID authentication. Install it with: "
+                "pip install azure-identity"
+            ) from exc
+
+        try:
+            ensure(_AZURE_IDENTITY_FEATURE, prompt=False)
+        except FeatureUnavailable as exc:
+            raise ImportError(
+                "The 'azure-identity' package is required for Azure AI "
+                "Foundry Entra ID authentication. " + str(exc)
+            ) from exc
+
+        # Retry import after lazy install.
+        import azure.identity as _ai  # noqa: WPS440
+        return _ai


 def reset_credential_cache() -> None:
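Reviewer note: `_require_azure_identity` now follows an import, install, retry sequence. A generic sketch of that sequence is shown below. It is not the project's helper: `import_with_lazy_install` and its `install` callback are hypothetical, with the callback standing in for `tools.lazy_deps.ensure`, whose behavior is not shown in this diff.

```python
import importlib
from typing import Callable


def import_with_lazy_install(module_name: str, install: Callable[[], None]):
    """Import module_name, calling install() once and retrying if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        try:
            install()  # hypothetical hook; the diff calls tools.lazy_deps.ensure here
        except Exception as exc:
            # Surface installer failures as ImportError so callers handle one type.
            raise ImportError(f"{module_name} is required: {exc}") from exc
        # Make freshly installed packages visible, then retry the import once.
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
```

Converting installer failures to `ImportError` keeps existing `except ImportError` call sites working without changes, which appears to be the intent of the diff as well.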
@@ -36,6 +36,19 @@ from typing import Any, Dict, List, Optional, Tuple

 logger = logging.getLogger(__name__)

+# ---------------------------------------------------------------------------
+# Ensure boto3/botocore are installed before any code in this module runs.
+# Upstream removed boto3 from [all] extras (PRs #24220, #24515); lazy_deps
+# handles on-demand installation so the Bedrock provider still works in the
+# EKS deployment without baking boto3 into the base image.
+# ---------------------------------------------------------------------------
+try:
+    from tools.lazy_deps import ensure
+    ensure("provider.bedrock", prompt=False)
+except Exception:
+    pass  # lazy_deps unavailable or install failed — let downstream imports surface the real error
+
+
 # ---------------------------------------------------------------------------
 # Lazy boto3 import — only loaded when the Bedrock provider is actually used.
 # This keeps startup fast for users who don't use Bedrock.
@@ -235,14 +235,12 @@ def interruptible_api_call(agent, api_kwargs: dict):
                # normalize_converse_response produces an OpenAI-compatible
                # SimpleNamespace so the rest of the agent loop can treat
                # bedrock responses like chat_completions responses.
-                from agent.plugin_registries import registries
-                _bedrock = registries.get_provider_namespace("bedrock")
-                _get_bedrock_runtime_client = _bedrock.get("_get_bedrock_runtime_client")
-                invalidate_runtime_client = _bedrock.get("invalidate_runtime_client")
-                is_stale_connection_error = _bedrock.get("is_stale_connection_error")
-                normalize_converse_response = _bedrock.get("normalize_converse_response")
-                if _get_bedrock_runtime_client is None or normalize_converse_response is None:
-                    raise ImportError("bedrock provider not registered")
+                from agent.bedrock_adapter import (
+                    _get_bedrock_runtime_client,
+                    invalidate_runtime_client,
+                    is_stale_connection_error,
+                    normalize_converse_response,
+                )
                region = api_kwargs.pop("__bedrock_region__", "us-east-1")
                api_kwargs.pop("__bedrock_converse__", None)
                client = _get_bedrock_runtime_client(region)
@@ -698,11 +696,8 @@ def build_api_kwargs(agent, api_messages: list) -> dict:
    _ant_max = None
    if (_is_or or _is_nous) and "claude" in (agent.model or "").lower():
        try:
-            from agent.plugin_registries import registries
-            _anthropic = registries.get_provider_namespace("anthropic")
-            _get_anthropic_max_output = _anthropic.get("_get_anthropic_max_output")
-            if _get_anthropic_max_output is not None:
-                _ant_max = _get_anthropic_max_output(agent.model)
+            from agent.anthropic_adapter import _get_anthropic_max_output
+            _ant_max = _get_anthropic_max_output(agent.model)
        except Exception:
            pass

@@ -1187,20 +1182,15 @@ def try_activate_fallback(agent, reason: "FailoverReason | None" = None) -> bool

        if fb_api_mode == "anthropic_messages":
            # Build native Anthropic client instead of using OpenAI client
-            from agent.plugin_registries import registries
-            _anthropic = registries.get_provider_namespace("anthropic")
-            build_anthropic_client = _anthropic.get("build_anthropic_client")
-            resolve_anthropic_token = _anthropic.get("resolve_anthropic_token")
-            _is_oauth_token = _anthropic.get("_is_oauth_token")
-            effective_key = (fb_client.api_key or (resolve_anthropic_token() if resolve_anthropic_token else "") or "") if fb_provider == "anthropic" else (fb_client.api_key or "")
+            from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token, _is_oauth_token
+            effective_key = (fb_client.api_key or resolve_anthropic_token() or "") if fb_provider == "anthropic" else (fb_client.api_key or "")
            agent.api_key = effective_key
            agent._anthropic_api_key = effective_key
            agent._anthropic_base_url = fb_base_url
-            if build_anthropic_client is not None:
-                agent._anthropic_client = build_anthropic_client(
-                    effective_key, agent._anthropic_base_url, timeout=_fb_timeout,
-                )
-            agent._is_anthropic_oauth = _is_oauth_token(effective_key) if fb_provider == "anthropic" and _is_oauth_token else False
+            agent._anthropic_client = build_anthropic_client(
+                effective_key, agent._anthropic_base_url, timeout=_fb_timeout,
+            )
+            agent._is_anthropic_oauth = _is_oauth_token(effective_key) if fb_provider == "anthropic" else False
            agent.client = None
            agent._client_kwargs = {}
        else:
@@ -1584,14 +1574,12 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=

        def _bedrock_call():
            try:
-                from agent.plugin_registries import registries
-                _bedrock = registries.get_provider_namespace("bedrock")
-                _get_bedrock_runtime_client = _bedrock.get("_get_bedrock_runtime_client")
-                invalidate_runtime_client = _bedrock.get("invalidate_runtime_client")
-                is_stale_connection_error = _bedrock.get("is_stale_connection_error")
-                stream_converse_with_callbacks = _bedrock.get("stream_converse_with_callbacks")
-                if _get_bedrock_runtime_client is None or stream_converse_with_callbacks is None:
-                    raise ImportError("bedrock provider not registered")
+                from agent.bedrock_adapter import (
+                    _get_bedrock_runtime_client,
+                    invalidate_runtime_client,
+                    is_stale_connection_error,
+                    stream_converse_with_callbacks,
+                )
                region = api_kwargs.pop("__bedrock_region__", "us-east-1")
                api_kwargs.pop("__bedrock_converse__", None)
                client = _get_bedrock_runtime_client(region)
@@ -27,7 +27,7 @@ import time
 import uuid
 from typing import Any, Dict, List, Optional

-from agent.plugin_registries import registries as _registries
+from agent.anthropic_adapter import _is_oauth_token
 from agent.auxiliary_client import set_runtime_main
 from agent.codex_responses_adapter import _summarize_user_message_for_log
 from agent.display import KawaiiSpinner
@@ -2383,8 +2383,8 @@ def run_conversation(
                    and not anthropic_auth_retry_attempted
                ):
                    anthropic_auth_retry_attempted = True
-                    _is_oauth_token = _registries.get_provider_service("anthropic", "_is_oauth_token")
-                    is_token_provider = _registries.get_provider_service("azure", "is_token_provider")
+                    from agent.anthropic_adapter import _is_oauth_token
+                    from agent.azure_identity_adapter import is_token_provider
                    if agent._try_refresh_anthropic_client_credentials():
                        print(f"{agent.log_prefix}🔐 Anthropic credentials refreshed after 401. Retrying request...")
                        continue
@@ -2401,7 +2401,7 @@ def run_conversation(
                        print(f"{agent.log_prefix}   Run `hermes doctor` for credential-chain diagnostics, or")
                        print(f"{agent.log_prefix}   `az login` if your developer session expired.")
                    else:
-                        auth_method = "Bearer (OAuth/setup-token)" if (_is_oauth_token is not None and _is_oauth_token(key)) else "x-api-key (API key)"
+                        auth_method = "Bearer (OAuth/setup-token)" if _is_oauth_token(key) else "x-api-key (API key)"
                        print(f"{agent.log_prefix}   Auth method: {auth_method}")
                        print(f"{agent.log_prefix}   Token prefix: {key[:12]}..." if isinstance(key, str) and len(key) > 12 else f"{agent.log_prefix}   Token: (empty or short)")
                    print(f"{agent.log_prefix}   Troubleshooting:")
@@ -458,6 +458,43 @@ class CredentialPool:
        self._persist()
        return updated

+    def _sync_anthropic_entry_from_credentials_file(self, entry: PooledCredential) -> PooledCredential:
+        """Sync a claude_code pool entry from ~/.claude/.credentials.json if tokens differ.
+
+        OAuth refresh tokens are single-use. When something external (e.g.
+        Claude Code CLI, or another profile's pool) refreshes the token, it
+        writes the new pair to ~/.claude/.credentials.json. The pool entry's
+        refresh token becomes stale. This method detects that and syncs.
+        """
+        if self.provider != "anthropic" or entry.source != "claude_code":
+            return entry
+        try:
+            from agent.anthropic_adapter import read_claude_code_credentials
+            creds = read_claude_code_credentials()
+            if not creds:
+                return entry
+            file_refresh = creds.get("refreshToken", "")
+            file_access = creds.get("accessToken", "")
+            file_expires = creds.get("expiresAt", 0)
+            # If the credentials file has a different token pair, sync it
+            if file_refresh and file_refresh != entry.refresh_token:
+                logger.debug("Pool entry %s: syncing tokens from credentials file (refresh token changed)", entry.id)
+                updated = replace(
+                    entry,
+                    access_token=file_access,
+                    refresh_token=file_refresh,
+                    expires_at_ms=file_expires,
+                    last_status=None,
+                    last_status_at=None,
+                    last_error_code=None,
+                )
+                self._replace_entry(entry, updated)
+                self._persist()
+                return updated
+        except Exception as exc:
+            logger.debug("Failed to sync from credentials file: %s", exc)
+        return entry
+
    def _sync_codex_entry_from_auth_store(self, entry: PooledCredential) -> PooledCredential:
        """Sync a Codex device_code pool entry from auth.json if tokens differ.

@@ -747,11 +784,32 @@ class CredentialPool:
            return None

        try:
-            # ── Plugin-registered credential pool hooks ──
-            from agent.plugin_registries import registries as _cph_reg2
-            _hook = _cph_reg2.get_credential_pool_hook(self.provider)
-            if _hook is not None and _hook.refresh_oauth is not None:
-                updated = _hook.refresh_oauth(entry, pool=self)
+            if self.provider == "anthropic":
+                from agent.anthropic_adapter import refresh_anthropic_oauth_pure
+
+                refreshed = refresh_anthropic_oauth_pure(
+                    entry.refresh_token,
+                    use_json=entry.source.endswith("hermes_pkce"),
+                )
+                updated = replace(
+                    entry,
+                    access_token=refreshed["access_token"],
+                    refresh_token=refreshed["refresh_token"],
+                    expires_at_ms=refreshed["expires_at_ms"],
+                )
+                # Keep ~/.claude/.credentials.json in sync so that the
+                # fallback path (resolve_anthropic_token) and other profiles
+                # see the latest tokens.
+                if entry.source == "claude_code":
+                    try:
+                        from agent.anthropic_adapter import _write_claude_code_credentials
+                        _write_claude_code_credentials(
+                            refreshed["access_token"],
+                            refreshed["refresh_token"],
+                            refreshed["expires_at_ms"],
+                        )
+                    except Exception as wexc:
+                        logger.debug("Failed to write refreshed token to credentials file: %s", wexc)
            elif self.provider == "openai-codex":
                # Adopt fresher tokens from auth.json before spending the
                # refresh_token — single-use tokens consumed by another Hermes
@@ -806,18 +864,46 @@ class CredentialPool:
                return entry
        except Exception as exc:
            logger.debug("Credential refresh failed for %s/%s: %s", self.provider, entry.id, exc)
-            # ── Plugin-registered credential pool hooks ──
-            # The hook's refresh_oauth already handles retry-with-sync internally,
-            # so if we got here it means a non-hook provider failed.
-            from agent.plugin_registries import registries as _cph_reg3
-            _hook = _cph_reg3.get_credential_pool_hook(self.provider)
-            if _hook is not None and _hook.sync_from_credentials_file is not None:
-                # Give the hook a chance to sync from external file
-                synced = _hook.sync_from_credentials_file(entry)
-                if synced is not entry:
-                    entry = synced
-                    self._replace_entry(entry, synced)
-                    self._persist()
+            # For anthropic claude_code entries: the refresh token may have been
+            # consumed by another process. Check if ~/.claude/.credentials.json
+            # has a newer token pair and retry once.
+            if self.provider == "anthropic" and entry.source == "claude_code":
+                synced = self._sync_anthropic_entry_from_credentials_file(entry)
+                if synced.refresh_token != entry.refresh_token:
+                    logger.debug("Retrying refresh with synced token from credentials file")
+                    try:
+                        from agent.anthropic_adapter import refresh_anthropic_oauth_pure
+                        refreshed = refresh_anthropic_oauth_pure(
+                            synced.refresh_token,
+                            use_json=synced.source.endswith("hermes_pkce"),
+                        )
+                        updated = replace(
+                            synced,
+                            access_token=refreshed["access_token"],
+                            refresh_token=refreshed["refresh_token"],
+                            expires_at_ms=refreshed["expires_at_ms"],
+                            last_status=STATUS_OK,
+                            last_status_at=None,
+                            last_error_code=None,
+                        )
+                        self._replace_entry(synced, updated)
+                        self._persist()
+                        try:
+                            from agent.anthropic_adapter import _write_claude_code_credentials
+                            _write_claude_code_credentials(
+                                refreshed["access_token"],
+                                refreshed["refresh_token"],
+                                refreshed["expires_at_ms"],
+                            )
+                        except Exception as wexc:
+                            logger.debug("Failed to write refreshed token to credentials file (retry path): %s", wexc)
+                        return updated
+                    except Exception as retry_exc:
+                        logger.debug("Retry refresh also failed: %s", retry_exc)
+                elif not self._entry_needs_refresh(synced):
+                    # Credentials file had a valid (non-expired) token — use it directly
+                    logger.debug("Credentials file has valid token, using without refresh")
+                    return synced
            # For xai-oauth: same race as nous — another process may have
            # consumed the refresh token between our proactive sync and the
            # HTTP call.  Re-check auth.json and adopt the fresh tokens if
@@ -1038,11 +1124,10 @@ class CredentialPool:
    def _entry_needs_refresh(self, entry: PooledCredential) -> bool:
        if entry.auth_type != AUTH_TYPE_OAUTH:
            return False
-        # ── Plugin-registered credential pool hooks ──
-        from agent.plugin_registries import registries as _cph_reg
-        _hook = _cph_reg.get_credential_pool_hook(self.provider)
-        if _hook is not None and _hook.needs_refresh is not None:
-            return _hook.needs_refresh(entry)
+        if self.provider == "anthropic":
+            if entry.expires_at_ms is None:
+                return False
+            return int(entry.expires_at_ms) <= int(time.time() * 1000) + 120_000  # refresh 2 min before expiry
        if self.provider == "openai-codex":
            return _codex_access_token_is_expiring(
                entry.access_token,
@@ -1075,16 +1160,12 @@ class CredentialPool:
        cleared_any = False
        available: List[PooledCredential] = []
        for entry in self._entries:
-            # ── Plugin-registered credential pool hooks ──
-            # Sync exhausted entries from external credentials files before
-            # status/refresh checks. This picks up tokens refreshed by other
-            # processes (e.g. Claude Code CLI, other Hermes profiles).
-            from agent.plugin_registries import registries as _cph_reg4
-            _avail_hook = _cph_reg4.get_credential_pool_hook(self.provider)
-            if (_avail_hook is not None
-                    and _avail_hook.sync_from_credentials_file is not None
+            # For anthropic claude_code entries, sync from the credentials file
+            # before any status/refresh checks. This picks up tokens refreshed
+            # by other processes (Claude Code CLI, other Hermes profiles).
+            if (self.provider == "anthropic" and entry.source == "claude_code"
                    and entry.last_status == STATUS_EXHAUSTED):
-                synced = _avail_hook.sync_from_credentials_file(entry)
+                synced = self._sync_anthropic_entry_from_credentials_file(entry)
                if synced is not entry:
                    entry = synced
                    cleared_any = True
@@ -1434,15 +1515,84 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
        def _is_suppressed(_p, _s):  # type: ignore[misc]
            return False

-    # ── Plugin-registered credential pool hooks ──
-    from agent.plugin_registries import registries as _cp_reg
-    _cp_hook = _cp_reg.get_credential_pool_hook(provider)
-    if _cp_hook is not None and _cp_hook.discover_credentials is not None:
-        hook_changed, hook_sources = _cp_hook.discover_credentials(
-            entries, provider, _is_suppressed,
+    if provider == "anthropic":
+        # Only auto-discover external credentials (Claude Code, Hermes PKCE)
+        # when the user has explicitly configured anthropic as their provider.
+        # Without this gate, auxiliary client fallback chains silently read
+        # ~/.claude/.credentials.json without user consent.  See PR #4210.
+        try:
+            from hermes_cli.auth import is_provider_explicitly_configured
+            if not is_provider_explicitly_configured("anthropic"):
+                return changed, active_sources
+        except ImportError:
+            pass
+
+        # API-key vs OAuth is a user-visible choice at `hermes setup` ("Claude
+        # Pro/Max subscription" vs "Anthropic API key").  The signal that the
+        # user picked the API-key path is: ANTHROPIC_API_KEY set in the env,
+        # AND no OAuth env vars set — `save_anthropic_api_key()` writes the
+        # API key and zeros ANTHROPIC_TOKEN; `save_anthropic_oauth_token()`
+        # does the inverse.  When that signal is present we MUST NOT seed
+        # autodiscovered OAuth tokens (~/.claude/.credentials.json from the
+        # Claude Code CLI, hermes_pkce creds from a previous OAuth login)
+        # into the anthropic pool — otherwise rotation on a 401/429 silently
+        # flips the session onto an OAuth credential, which forces the Claude
+        # Code identity injection, `mcp_` tool-name rewrite, and claude-cli
+        # User-Agent header (`agent/anthropic_adapter.py:2128`).  Users who
+        # explicitly opted into the API-key path are explicitly opting OUT of
+        # that masquerade.  Prefer ~/.hermes/.env over os.environ for the
+        # same reason `_seed_from_env` does — that's the authoritative file
+        # that `hermes setup` writes.
+        _env_file = load_env()
+
+        def _env_val(key: str) -> str:
+            return (_env_file.get(key) or os.environ.get(key) or "").strip()
+
+        anthropic_api_key = _env_val("ANTHROPIC_API_KEY")
+        anthropic_oauth_env = (
+            _env_val("ANTHROPIC_TOKEN") or _env_val("CLAUDE_CODE_OAUTH_TOKEN")
        )
-        changed |= hook_changed
-        active_sources |= hook_sources
+        api_key_path_explicit = bool(anthropic_api_key and not anthropic_oauth_env)
+
+        if api_key_path_explicit:
+            # Prune any stale autodiscovered OAuth entries that may have been
+            # seeded into the on-disk pool during a previous OAuth session.
+            # Without this, switching OAuth -> API key at setup leaves the
+            # OAuth entries dormant in auth.json forever and rotation on a
+            # transient 401 could revive them.
+            retained = [
+                entry for entry in entries
+                if entry.source not in {"hermes_pkce", "claude_code"}
+            ]
+            if len(retained) != len(entries):
+                entries[:] = retained
+                changed = True
+            return changed, active_sources
+
+        from agent.anthropic_adapter import read_claude_code_credentials, read_hermes_oauth_credentials
+
+        for source_name, creds in (
+            ("hermes_pkce", read_hermes_oauth_credentials()),
+            ("claude_code", read_claude_code_credentials()),
+        ):
+            if creds and creds.get("accessToken"):
+                if _is_suppressed(provider, source_name):
+                    continue
+                active_sources.add(source_name)
+                changed |= _upsert_entry(
+                    entries,
+                    provider,
+                    source_name,
+                    {
+                        "source": source_name,
+                        "auth_type": AUTH_TYPE_OAUTH,
+                        "access_token": creds.get("accessToken", ""),
+                        "refresh_token": creds.get("refreshToken"),
+                        "expires_at_ms": creds.get("expiresAt"),
+                        "label": label_from_token(creds.get("accessToken", ""), source_name),
+                    },
+                )
+
    elif provider == "nous":
        state = _load_provider_state(auth_store, "nous")
        has_runtime_material = bool(
@@ -1753,11 +1903,12 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
        env_url = _get_env_prefer_dotenv(pconfig.base_url_env_var).rstrip("/")

    env_vars = list(pconfig.api_key_env_vars)
-    # ── Plugin-registered credential pool hooks: env var order override ──
-    from agent.plugin_registries import registries as _env_reg
-    _env_hook = _env_reg.get_credential_pool_hook(provider)
-    if _env_hook is not None and _env_hook.env_var_order is not None:
-        env_vars = _env_hook.env_var_order
+    if provider == "anthropic":
+        env_vars = [
+            "ANTHROPIC_TOKEN",
+            "CLAUDE_CODE_OAUTH_TOKEN",
+            "ANTHROPIC_API_KEY",
+        ]

    for env_var in env_vars:
        # Prefer ~/.hermes/.env over os.environ
@@ -1768,11 +1919,7 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
        if _is_source_suppressed(provider, source):
            continue
        active_sources.add(source)
-        # ── Plugin-registered credential pool hooks: auth type detection ──
-        if _env_hook is not None and _env_hook.detect_auth_type is not None:
-            auth_type = _env_hook.detect_auth_type(token)
-        else:
-            auth_type = AUTH_TYPE_API_KEY
+        auth_type = AUTH_TYPE_OAUTH if provider == "anthropic" and not token.startswith("sk-ant-api") else AUTH_TYPE_API_KEY
        base_url = env_url or pconfig.inference_base_url
        if provider == "kimi-coding":
            base_url = _resolve_kimi_base_url(token, pconfig.inference_base_url, env_url)
@@ -37,6 +37,8 @@ from __future__ import annotations
 import base64
 import logging
 import mimetypes
+import os
+import re
 from pathlib import Path
 from typing import Any, Dict, List, Optional, Tuple

@@ -46,6 +48,102 @@ logger = logging.getLogger(__name__)
 _VALID_MODES = frozenset({"auto", "native", "text"})


+# Image extensions used by extract_image_refs(). Kept tight on purpose: we
+# only auto-attach things the model can actually see. Documents and archives
+# are excluded because the gateway's broader extract_local_files() routes
+# them separately (send_document), and we don't want to attach a PDF as a
+# vision part.
+_IMAGE_EXTS = (
+    ".png", ".jpg", ".jpeg", ".gif", ".webp", ".bmp", ".tiff", ".tif", ".heic",
+)
+_IMAGE_EXT_PATTERN = "|".join(e.lstrip(".") for e in _IMAGE_EXTS)
+
+# Absolute / home-relative local image path. Same shape as the gateway's
+# extract_local_files(): anchors to ``~/`` or ``/``, ignores matches inside
+# URLs (the ``(?<![/:\w.])`` lookbehind), and is case-insensitive on the extension.
+_LOCAL_IMAGE_PATH_RE = re.compile(
+    r"(?<![/:\w.])(?:~/|/)(?:[\w.\-]+/)*[\w.\-]+\.(?:" + _IMAGE_EXT_PATTERN + r")\b",
+    re.IGNORECASE,
+)
+
+# http(s) URL ending in an image extension (optionally followed by a
+# query string). Case-insensitive on the extension. Strict ``http(s)://``
+# scheme so we don't accidentally grab ``file://`` URLs or other shapes.
+_IMAGE_URL_RE = re.compile(
+    r"https?://[^\s<>\"']+?\.(?:" + _IMAGE_EXT_PATTERN + r")(?:\?[^\s<>\"']*)?",
+    re.IGNORECASE,
+)
+
+
+def extract_image_refs(text: str) -> Tuple[List[str], List[str]]:
+    """Scan free-form text for image references the model should see.
+
+    Returns ``(local_paths, urls)``:
+
+      * ``local_paths`` — absolute (``/``) or home-relative (``~/``) paths
+        whose suffix is an image extension AND whose expanded form exists
+        on disk as a file. Order-preserving, deduplicated.
+      * ``urls`` — ``http(s)://…`` URLs whose path ends in an image
+        extension (a ``?query`` is allowed after the extension).
+        Order-preserving, deduplicated.
+
+    Matches inside fenced code blocks (``` ``` ```) and inline backticks
+    (`` `…` ``) are skipped so that snippets pasted into a task body for
+    reference aren't mistaken for live attachments. This mirrors the
+    behaviour of ``gateway.platforms.base.BaseAdapter.extract_local_files``.
+
+    Local paths are validated against the filesystem; URLs are not
+    (the provider fetches them at request time).
+    """
+    if not isinstance(text, str) or not text:
+        return [], []
+
+    # Build spans covered by fenced code blocks and inline code so we can
+    # ignore references the author embedded purely as example text.
+    code_spans: list[tuple[int, int]] = []
+    for m in re.finditer(r"```[^\n]*\n.*?```", text, re.DOTALL):
+        code_spans.append((m.start(), m.end()))
+    for m in re.finditer(r"`[^`\n]+`", text):
+        code_spans.append((m.start(), m.end()))
+
+    def _in_code(pos: int) -> bool:
+        return any(s <= pos < e for s, e in code_spans)
+
+    local_paths: list[str] = []
+    seen_paths: set[str] = set()
+    for match in _LOCAL_IMAGE_PATH_RE.finditer(text):
+        if _in_code(match.start()):
+            continue
+        raw = match.group(0)
+        expanded = os.path.expanduser(raw)
+        try:
+            if not os.path.isfile(expanded):
+                continue
+        except OSError:
+            # ENAMETOOLONG / EINVAL on pathological inputs — skip rather than crash.
+            continue
+        if expanded in seen_paths:
+            continue
+        seen_paths.add(expanded)
+        local_paths.append(expanded)
+
+    urls: list[str] = []
+    seen_urls: set[str] = set()
+    for match in _IMAGE_URL_RE.finditer(text):
+        if _in_code(match.start()):
+            continue
+        url = match.group(0)
+        # Strip trailing prose punctuation. Only the optional query tail can
+        # capture it (e.g. "see https://x.com/a.png?w=1)."), never the extension.
+        url = url.rstrip(".,;:!?)]>")
+        if url in seen_urls:
+            continue
+        seen_urls.add(url)
+        urls.append(url)
+
+    return local_paths, urls
+
+
 # Strict YAML/JSON boolean coercion for capability overrides.
 #
 # ``bool("false")`` is True in Python because non-empty strings are truthy, so
@@ -320,20 +418,29 @@ def _file_to_data_url(path: Path) -> Optional[str]:
 def build_native_content_parts(
    user_text: str,
    image_paths: List[str],
+    image_urls: Optional[List[str]] = None,
 ) -> Tuple[List[Dict[str, Any]], List[str]]:
    """Build an OpenAI-style ``content`` list for a user turn.

    Shape:
      [{"type": "text", "text": "...\\n\\n[Image attached at: /local/path]"},
       {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}},
+       {"type": "image_url", "image_url": {"url": "https://example.com/a.png"}},
       ...]

-    The local path of each successfully attached image is appended to the
-    text part as ``[Image attached at: <path>]``. The model still sees the
-    pixels via the ``image_url`` part (full native vision); the path note
-    just gives it a string handle so MCP/skill tools that take an image
-    path or URL argument can be invoked on the same image without an
-    extra round-trip. This parallels the text-mode hint produced by
+    Local paths are read from disk and embedded as base64 ``data:`` URLs.
+    Remote URLs (``http(s)://``) are passed through verbatim — the provider
+    fetches them server-side. The model still sees the pixels either way.
+
+    For each successfully attached image, a hint is appended to the text
+    part:
+
+      * local path → ``[Image attached at: <path>]``
+      * URL        → ``[Image attached: <url>]``
+
+    The hint gives the model a string handle so MCP/skill tools that take
+    an image path or URL argument can be invoked on the same image without
+    an extra round-trip. This parallels the text-mode hint produced by
    ``Runner._enrich_message_with_vision`` (``vision_analyze using image_url:
    <path>``) so behaviour is consistent across both image input modes.

@@ -342,12 +449,14 @@ def build_native_content_parts(
    ceiling), the agent's retry loop transparently shrinks and retries
    once — see ``run_agent._try_shrink_image_parts_in_messages``.

-    Returns (content_parts, skipped_paths). Skipped paths are files that
-    couldn't be read from disk and are NOT advertised in the path hints.
+    Returns (content_parts, skipped). Skipped entries are local paths
+    that couldn't be read from disk; URLs are never skipped (they're
+    not validated here).
    """
    skipped: List[str] = []
    image_parts: List[Dict[str, Any]] = []
    attached_paths: List[str] = []
+    attached_urls: List[str] = []

    for raw_path in image_paths:
        p = Path(raw_path)
@@ -364,16 +473,26 @@ def build_native_content_parts(
        })
        attached_paths.append(str(raw_path))

+    for url in image_urls or []:
+        url = (url or "").strip()
+        if not url:
+            continue
+        image_parts.append({
+            "type": "image_url",
+            "image_url": {"url": url},
+        })
+        attached_urls.append(url)
+
    text = (user_text or "").strip()

    # If at least one image attached, build a single text part that combines
-    # the user's caption (or a neutral default) with one path hint per image.
-    if attached_paths:
+    # the user's caption (or a neutral default) with one hint per image.
+    if attached_paths or attached_urls:
        base_text = text or "What do you see in this image?"
-        path_hints = "\n".join(
-            f"[Image attached at: {p}]" for p in attached_paths
-        )
-        combined_text = f"{base_text}\n\n{path_hints}"
+        hint_lines: List[str] = []
+        hint_lines.extend(f"[Image attached at: {p}]" for p in attached_paths)
+        hint_lines.extend(f"[Image attached: {u}]" for u in attached_urls)
+        combined_text = f"{base_text}\n\n" + "\n".join(hint_lines)
        parts: List[Dict[str, Any]] = [{"type": "text", "text": combined_text}]
        parts.extend(image_parts)
        return parts, skipped
@@ -388,4 +507,5 @@ def build_native_content_parts(
 __all__ = [
    "decide_image_input_mode",
    "build_native_content_parts",
+    "extract_image_refs",
 ]
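The URL half of `extract_image_refs` can be tried in isolation. The following is a minimal sketch, not the patch itself: it copies `_IMAGE_EXTS`, `_IMAGE_EXT_PATTERN`, and `_IMAGE_URL_RE` from the hunk above, while the function name `sketch_extract_image_urls` and the sample text are hypothetical and exist only for illustration. It leaves out the local-path branch, since that depends on files existing on disk.

```python
import re

# Copied from the patch: extension alternation and the http(s) image-URL pattern.
_IMAGE_EXTS = (
    ".png", ".jpg", ".jpeg", ".gif", ".webp", ".bmp", ".tiff", ".tif", ".heic",
)
_IMAGE_EXT_PATTERN = "|".join(e.lstrip(".") for e in _IMAGE_EXTS)
_IMAGE_URL_RE = re.compile(
    r"https?://[^\s<>\"']+?\.(?:" + _IMAGE_EXT_PATTERN + r")(?:\?[^\s<>\"']*)?",
    re.IGNORECASE,
)


def sketch_extract_image_urls(text: str) -> list[str]:
    """Hypothetical helper: URL branch only, same skip/dedup rules as the patch."""
    # Spans covered by fenced code blocks and inline backticks are ignored.
    code_spans = [
        (m.start(), m.end())
        for m in re.finditer(r"```[^\n]*\n.*?```", text, re.DOTALL)
    ]
    code_spans += [(m.start(), m.end()) for m in re.finditer(r"`[^`\n]+`", text)]

    urls: list[str] = []
    seen: set[str] = set()
    for m in _IMAGE_URL_RE.finditer(text):
        if any(s <= m.start() < e for s, e in code_spans):
            continue
        url = m.group(0).rstrip(".,;:!?)]>")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


sample = (
    "See https://example.com/a.png and again https://example.com/a.png, "
    "but not `https://example.com/b.jpg`. "
    "Query strings survive: https://example.com/c.JPEG?w=100"
)
print(sketch_extract_image_urls(sample))
```

The duplicate `a.png` is reported once, the backticked `b.jpg` is skipped as example text, and the upper-case `.JPEG` URL keeps its query string.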
@@ -1567,11 +1567,8 @@ def get_model_context_length(
        and base_url_host_matches(base_url, "amazonaws.com")
    ):
        try:
-            from agent.plugin_registries import registries
-            _bedrock = registries.get_provider_namespace("bedrock")
-            get_bedrock_context_length = _bedrock.get("get_bedrock_context_length")
-            if get_bedrock_context_length is not None:
-                return get_bedrock_context_length(model)
+            from agent.bedrock_adapter import get_bedrock_context_length
+            return get_bedrock_context_length(model)
        except ImportError:
            pass  # boto3 not installed — fall through to generic resolution

@@ -1,586 +0,0 @@
-"""Plugin capability registries.
-
-Each plugin's ``register(ctx)`` function populates these registries via
-``ctx.register_<capability>()``.  The core codebase then queries the
-registries instead of importing from plugin packages directly.
-
-This is the **only** coupling point between the core and plugins: the core
-imports from ``agent.plugin_registries``, never from ``hermes_agent_*``.
-"""
-
-from __future__ import annotations
-
-from dataclasses import dataclass, field
-from typing import (
-    Any,
-    Callable,
-    Dict,
-    List,
-    Optional,
-    Protocol,
-    Sequence,
-    Tuple,
-    Type,
-    runtime_checkable,
-)
-
-
-# ---------------------------------------------------------------------------
-# Auth providers
-# ---------------------------------------------------------------------------
-
-@runtime_checkable
-class AuthProvider(Protocol):
-    """A plugin that can provide or check authentication credentials.
-
-    Registered via ``ctx.register_auth_provider(name, provider)``.
-    Queried by ``hermes_cli/auth_commands.py``, ``doctor.py``, etc.
-    """
-
-    @property
-    def name(self) -> str: ...
-
-    def has_credentials(self) -> bool:
-        """Return True if the required credentials are present in env/config."""
-        ...
-
-    def check_env_vars(self) -> Dict[str, str | None]:
-        """Return a dict of env-var-name → current-value (or None if unset).
-
-        Used by ``hermes doctor`` to display credential status.
-        """
-        ...
-
-    def resolve_token(self, **kwargs: Any) -> Any:
-        """Resolve and return an auth token/credential for the provider.
-
-        The return type is provider-specific (string, tuple, object, etc.).
-        """
-        ...
-
-    def refresh_token(self, **kwargs: Any) -> Any:
-        """Refresh an existing token.  Raises if refresh is not supported."""
-        ...
-
-
-@dataclass
-class AuthProviderEntry:
-    provider: AuthProvider
-    """The auth provider instance."""
-
-    cli_group: str = ""
-    """CLI argument group name (e.g. 'Anthropic', 'AWS / Bedrock')."""
-
-    setup_subcommands: bool = False
-    """Whether this provider adds CLI auth subcommands (login, logout, etc.)."""
-
-
-# ---------------------------------------------------------------------------
-# Transport builders
-# ---------------------------------------------------------------------------
-
-@runtime_checkable
-class TransportBuilder(Protocol):
-    """A plugin that builds clients and converts messages for a model transport.
-
-    Registered via ``ctx.register_transport(name, builder)``.
-    Queried by ``agent/transports/`` and ``agent/auxiliary_client.py``.
-    """
-
-    def build_client(self, **kwargs: Any) -> Any:
-        """Build and return a provider-specific API client."""
-        ...
-
-    def build_kwargs(self, **kwargs: Any) -> Dict[str, Any]:
-        """Build the kwargs dict for a provider-specific API call."""
-        ...
-
-    def convert_messages(self, messages: Sequence[Any], **kwargs: Any) -> Any:
-        """Convert internal message format to provider-specific format."""
-        ...
-
-    def convert_tools(self, tools: Sequence[Any], **kwargs: Any) -> Any:
-        """Convert internal tool format to provider-specific format."""
-        ...
-
-    def normalize_response(self, response: Any, **kwargs: Any) -> Any:
-        """Normalize a provider-specific response into the internal format."""
-        ...
-
-
-# ---------------------------------------------------------------------------
-# Platform adapters
-# ---------------------------------------------------------------------------
-
-@dataclass
-class PlatformAdapterEntry:
-    """A registered platform adapter.
-
-    Registered via ``ctx.register_platform(name, entry)``.
-    Queried by ``gateway/run.py`` and ``tools/send_message_tool.py``.
-    """
-    name: str
-    """Platform identifier (e.g. 'telegram', 'slack')."""
-
-    adapter_class: Type
-    """The adapter class (e.g. TelegramAdapter)."""
-
-    check_requirements: Callable[[], bool]
-    """Check if the platform's dependencies are installed and configured."""
-
-    available_flag: str = ""
-    """Name of the module-level AVAILABLE boolean, if any."""
-
-    constants: Dict[str, Any] = field(default_factory=dict)
-    """Platform-specific constants (e.g. FEISHU_DOMAIN, LARK_DOMAIN)."""
-
-    helper_functions: Dict[str, Callable] = field(default_factory=dict)
-    """Platform-specific helper functions (e.g. probe_bot, qr_register)."""
-
-
-# ---------------------------------------------------------------------------
-# Tool providers
-# ---------------------------------------------------------------------------
-
-@dataclass
-class ToolProviderEntry:
-    """A registered tool provider.
-
-    Registered via ``ctx.register_tool_provider(name, entry)``.
-    Queried by ``tools/`` modules.
-    """
-    name: str
-    """Tool identifier (e.g. 'tts', 'stt', 'fal', 'daytona')."""
-
-    tool_functions: Dict[str, Callable] = field(default_factory=dict)
-    """Tool functions keyed by name (e.g. 'text_to_speech_tool', 'transcribe_audio')."""
-
-    check_fn: Optional[Callable] = None
-    """Check if the tool's dependencies are available."""
-
-    constants: Dict[str, Any] = field(default_factory=dict)
-    """Tool-specific constants (e.g. MAX_FILE_SIZE)."""
-
-    config_functions: Dict[str, Callable] = field(default_factory=dict)
-    """Config/utility functions (e.g. _get_provider, _load_stt_config)."""
-
-    environment_classes: Dict[str, Type] = field(default_factory=dict)
-    """Environment classes for terminal backends (e.g. DaytonaEnvironment)."""
-
-
-# ---------------------------------------------------------------------------
-# Model metadata providers
-# ---------------------------------------------------------------------------
-
-@dataclass
-class ModelMetadataEntry:
-    """A registered model metadata provider.
-
-    Registered via ``ctx.register_model_metadata(name, entry)``.
-    Queried by ``agent/model_metadata.py`` and CLI model commands.
-    """
-    name: str
-    """Provider identifier (e.g. 'anthropic', 'bedrock')."""
-
-    get_context_length: Optional[Callable[[str], int | None]] = None
-    """Return the context length for a model name, or None if unknown."""
-
-    list_models: Optional[Callable[[], List[str]]] = None
-    """Return a list of known model IDs for this provider."""
-
-    constants: Dict[str, Any] = field(default_factory=dict)
-    """Provider-specific constants (e.g. _COMMON_BETAS, betas lists)."""
-
-
-# ---------------------------------------------------------------------------
-# Credential pool entries
-# ---------------------------------------------------------------------------
-
-@dataclass
-class CredentialPoolEntry:
-    """A registered credential pool provider.
-
-    Registered via ``ctx.register_credential_pool(name, entry)``.
-    Queried by ``agent/credential_pool.py``.
-    """
-    name: str
-    """Provider identifier (e.g. 'anthropic')."""
-
-    read_credentials: Optional[Callable] = None
-    """Read stored credentials."""
-
-    write_credentials: Optional[Callable] = None
-    """Write/store credentials."""
-
-    refresh_credentials: Optional[Callable] = None
-    """Refresh stored credentials."""
-
-    read_oauth: Optional[Callable] = None
-    """Read OAuth credentials."""
-
-
-# ---------------------------------------------------------------------------
-# Provider resolvers
-# ---------------------------------------------------------------------------
-
-@runtime_checkable
-class ProviderResolver(Protocol):
-    """A plugin that resolves an auxiliary client for a specific provider.
-
-    Registered via ``ctx.register_provider_resolver(provider_name, resolver)``.
-    Queried by ``agent/auxiliary_client.py`` in ``resolve_provider_client()``.
-    """
-
-    def __call__(
-        self,
-        *,
-        model: str | None = None,
-        explicit_api_key: str | None = None,
-        explicit_base_url: str | None = None,
-        async_mode: bool = False,
-        is_vision: bool = False,
-        main_runtime: dict | None = None,
-        api_mode: str | None = None,
-    ) -> tuple[Any, str] | tuple[None, None]:
-        """Return ``(client, default_model)`` or ``(None, None)`` if unavailable."""
-        ...
-
-
-# ---------------------------------------------------------------------------
-# Credential pool hooks
-# ---------------------------------------------------------------------------
-
-@dataclass
-class CredentialPoolHook:
-    """Provider-specific credential pool operations.
-
-    Registered via ``ctx.register_credential_pool_hook(provider_name, hook)``.
-    Queried by ``agent/credential_pool.py``.
-    """
-
-    sync_from_credentials_file: Optional[Callable] = None
-    """Sync a pool entry from an external credentials file (e.g. ~/.claude/.credentials.json)."""
-
-    refresh_oauth: Optional[Callable] = None
-    """Refresh an OAuth token for a pool entry."""
-
-    should_include_in_pool: Optional[Callable] = None
-    """Return True if this provider's credentials should be included in the pool."""
-
-    needs_refresh: Optional[Callable] = None
-    """Return True if an OAuth entry needs a token refresh."""
-
-    source_priority: Optional[Callable] = None
-    """Return integer priority for a credential source (lower = preferred)."""
-
-    discover_credentials: Optional[Callable] = None
-    """Discover external credentials and upsert into the pool entries.
-
-    Signature: (entries: list, provider: str, is_suppressed: Callable) -> (changed: bool, active_sources: set)
-    """
-
-    env_var_order: Optional[list] = None
-    """Override env var scan order for this provider (e.g. ['ANTHROPIC_TOKEN', 'CLAUDE_CODE_OAUTH_TOKEN', 'ANTHROPIC_API_KEY'])."""
-
-    detect_auth_type: Optional[Callable] = None
-    """Given a token string, return the auth type for this provider.
-
-    Signature: (token: str) -> str  (e.g. AUTH_TYPE_OAUTH or AUTH_TYPE_API_KEY)
-    """
-
-
-# ---------------------------------------------------------------------------
-# Pricing providers
-# ---------------------------------------------------------------------------
-
-# Re-export PricingEntry from usage_pricing — that's the canonical definition
-# with Decimal fields. The registry stores these directly keyed by (provider, model).
-# Lazy import to avoid circular dependency (usage_pricing imports registries at runtime).
-def _get_pricing_entry_class():
-    from agent.usage_pricing import PricingEntry
-    return PricingEntry
-
-
-# ---------------------------------------------------------------------------
-# Provider overlays
-# ---------------------------------------------------------------------------
-
-@dataclass
-class ProviderOverlayEntry:
-    """A provider overlay registered by a plugin.
-
-    Registered via ``ctx.register_provider_overlay(provider_name, entry)``.
-    Queried by ``hermes_cli/providers.py``.
-
-    This mirrors the fields of ``HermesOverlay`` so that providers.py
-    can merge plugin-registered overlays seamlessly.
-    """
-
-    provider_name: str
-    """Primary provider name (e.g. 'anthropic', 'bedrock')."""
-
-    transport: str = "openai_chat"
-    """Transport type: openai_chat | anthropic_messages | codex_responses | bedrock_converse"""
-
-    is_aggregator: bool = False
-    """Whether this provider aggregates multiple model providers."""
-
-    auth_type: str = "api_key"
-    """Auth type: api_key | oauth_device_code | oauth_external | aws_sdk | external_process"""
-
-    extra_env_vars: Tuple[str, ...] = ()
-    """Environment variable names that indicate this provider is configured."""
-
-    base_url_override: str = ""
-    """Override if models.dev URL is wrong/missing."""
-
-    base_url_env_var: str = ""
-    """Env var for user-custom base URL."""
-
-    display_name: str = ""
-    """Human-readable name for the provider (e.g. 'Anthropic', 'AWS Bedrock')."""
-
-    aliases: List[str] = field(default_factory=list)
-    """Alternative names that resolve to this provider."""
-
-
-# ---------------------------------------------------------------------------
-# The global registries (singleton)
-# ---------------------------------------------------------------------------
-
-class PluginRegistries:
-    """Central store for all plugin-registered capabilities.
-
-    A single instance is created at import time and shared across the
-    process.  Plugins populate it during ``register()``; the core
-    queries it at runtime.
-    """
-
-    def __init__(self) -> None:
-        self.auth_providers: Dict[str, AuthProviderEntry] = {}
-        self.transport_builders: Dict[str, TransportBuilder] = {}
-        self._transports: Dict[str, type] = {}
-        self.platform_adapters: Dict[str, PlatformAdapterEntry] = {}
-        self.tool_providers: Dict[str, ToolProviderEntry] = {}
-        self.model_metadata: Dict[str, ModelMetadataEntry] = {}
-        self.credential_pools: Dict[str, CredentialPoolEntry] = {}
-        self._provider_services: Dict[str, Dict[str, Any]] = {}
-        self._provider_resolvers: Dict[str, Callable] = {}
-        self._credential_pool_hooks: Dict[str, CredentialPoolHook] = {}
-        self._pricing_providers: Dict[tuple, Any] = {}
-        self._provider_overlays: Dict[str, ProviderOverlayEntry] = {}
-
-    # -- registration methods (called from PluginContext) --------------------
-
-    def register_auth_provider(
-        self,
-        name: str,
-        provider: AuthProvider,
-        *,
-        cli_group: str = "",
-        setup_subcommands: bool = False,
-    ) -> None:
-        self.auth_providers[name] = AuthProviderEntry(
-            provider=provider,
-            cli_group=cli_group,
-            setup_subcommands=setup_subcommands,
-        )
-
-    def register_transport(self, name: str, builder: TransportBuilder) -> None:
-        self.transport_builders[name] = builder
-
-    def register_platform(self, entry: PlatformAdapterEntry) -> None:
-        self.platform_adapters[entry.name] = entry
-
-    def register_tool_provider(self, entry: ToolProviderEntry) -> None:
-        self.tool_providers[entry.name] = entry
-
-    def register_model_metadata(self, entry: ModelMetadataEntry) -> None:
-        self.model_metadata[entry.name] = entry
-
-    def register_credential_pool(self, entry: CredentialPoolEntry) -> None:
-        self.credential_pools[entry.name] = entry
-
-    def register_provider_resolver(self, name: str, resolver: Callable) -> None:
-        """Register a provider resolver callable.
-
-        The resolver is called by ``resolve_provider_client()`` to create an
-        auxiliary client for a specific provider.  Signature::
-
-            def resolver(
-                *,
-                model: str | None,
-                explicit_api_key: str | None,
-                explicit_base_url: str | None,
-                async_mode: bool,
-                is_vision: bool,
-                main_runtime: dict | None,
-                api_mode: str | None,
-            ) -> tuple[Any, str] | tuple[None, None]:
-                ...
-
-        Returns ``(client, default_model)`` or ``(None, None)``.
-        """
-        self._provider_resolvers[name] = resolver
-
-    def register_credential_pool_hook(self, name: str, hook: CredentialPoolHook) -> None:
-        """Register a credential pool hook for provider-specific pool operations."""
-        self._credential_pool_hooks[name] = hook
-
-    def register_pricing_provider(self, name: str, entries: List[tuple]) -> None:
-        """Register pricing entries for a provider.
-
-        Each entry is a (provider, model, PricingEntry) tuple so the
-        lookup key matches the (provider, model) pattern used by
-        _OFFICIAL_DOCS_PRICING.
-        """
-        for prov, model, entry in entries:
-            self._pricing_providers[(prov, model)] = entry
-
-    def register_provider_overlay(self, entry: ProviderOverlayEntry) -> None:
-        """Register a provider overlay entry from a plugin."""
-        self._provider_overlays[entry.provider_name] = entry
-
-    # -- query helpers -------------------------------------------------------
-
-    def get_auth_provider(self, name: str) -> AuthProviderEntry | None:
-        return self.auth_providers.get(name)
-
-    def get_transport(self, name: str) -> TransportBuilder | None:
-        return self.transport_builders.get(name)
-
-    def get_platform(self, name: str) -> PlatformAdapterEntry | None:
-        return self.platform_adapters.get(name)
-
-    def get_tool_provider(self, name: str) -> ToolProviderEntry | None:
-        return self.tool_providers.get(name)
-
-    def get_model_metadata(self, name: str) -> ModelMetadataEntry | None:
-        return self.model_metadata.get(name)
-
-    def get_credential_pool(self, name: str) -> CredentialPoolEntry | None:
-        return self.credential_pools.get(name)
-
-    def get_provider_resolver(self, name: str) -> Callable | None:
-        """Return the registered resolver for a provider, or None."""
-        return self._provider_resolvers.get(name)
-
-    def get_credential_pool_hook(self, name: str) -> CredentialPoolHook | None:
-        """Return the registered credential pool hook for a provider, or None."""
-        return self._credential_pool_hooks.get(name)
-
-    def get_pricing_entry(self, provider: str, model: str) -> Any:
-        """Return a registered pricing entry for (provider, model), or None."""
-        return self._pricing_providers.get((provider, model))
-
-    def all_pricing_entries(self) -> Dict[tuple, Any]:
-        """Return all registered pricing entries (keyed by (provider, model))."""
-        return dict(self._pricing_providers)
-
-    def get_provider_overlay(self, name: str) -> ProviderOverlayEntry | None:
-        """Return a registered provider overlay, or None."""
-        return self._provider_overlays.get(name)
-
-    def all_provider_overlays(self) -> Dict[str, ProviderOverlayEntry]:
-        """Return all registered provider overlays."""
-        return dict(self._provider_overlays)
-
-    def all_auth_providers(self) -> List[AuthProviderEntry]:
-        return list(self.auth_providers.values())
-
-    def all_platforms(self) -> List[PlatformAdapterEntry]:
-        return list(self.platform_adapters.values())
-
-    def all_tool_providers(self) -> List[ToolProviderEntry]:
-        return list(self.tool_providers.values())
-
-    # -- provider services (model-provider namespace) -----------------------
-
-    def register_provider_services(self, name: str, services: Dict[str, Any]) -> None:
-        """Register a namespace dict of provider-specific services.
-
-        This is the escape hatch for model-provider plugins that expose many
-        symbols (anthropic has 50+).  Each plugin registers its public surface
-        as a flat dict of ``{symbol_name: callable_or_value}``.  Core code
-        looks up specific symbols instead of importing from the plugin
-        package directly.
-
-        Each callable value is stored as a *lazy module-attribute reference*
-        so that ``unittest.mock.patch("pkg.mod.fn")`` works correctly in
-        tests — the registry re-reads ``mod.fn`` on every lookup instead of
-        capturing the function object at register time.
-
-        Example::
-
-            registries.register_provider_services("anthropic", {
-                "build_anthropic_client": build_anthropic_client,
-                "resolve_anthropic_token": resolve_anthropic_token,
-                "_is_oauth_token": _is_oauth_token,
-                ...
-            })
-        """
-        import sys
-
-        def _make_lazy(fn: Any) -> Any:
-            """Return a lazy wrapper that re-reads fn from its module each call.
-
-            This makes mock.patch() on the module attribute work transparently —
-            the registry never caches the function object, just the reference path.
-            """
-            if not callable(fn):
-                return fn
-            module = getattr(fn, "__module__", None)
-            qualname = getattr(fn, "__qualname__", None)
-            if not module or not qualname or "." in qualname:
-                # non-simple attribute (lambda, nested fn, class method) — store directly
-                return fn
-
-            class _LazyRef:
-                __slots__ = ("_mod", "_attr", "_fallback")
-
-                def __init__(self, mod: str, attr: str, fallback: Any) -> None:
-                    self._mod = mod
-                    self._attr = attr
-                    self._fallback = fallback
-
-                def _resolve(self) -> Any:
-                    mod = sys.modules.get(self._mod)
-                    return getattr(mod, self._attr, self._fallback) if mod else self._fallback
-
-                def __call__(self, *args: Any, **kwargs: Any) -> Any:
-                    return self._resolve()(*args, **kwargs)
-
-                def __getattr__(self, name: str) -> Any:
-                    if name.startswith("_"):
-                        raise AttributeError(name)
-                    return getattr(self._resolve(), name)
-
-                def __repr__(self) -> str:  # pragma: no cover
-                    return f"<LazyRef {self._mod}.{self._attr}>"
-
-                # Allow isinstance checks and hasattr to pass through
-                def __bool__(self) -> bool:
-                    return True
-
-            return _LazyRef(module, qualname, fn)
-
-        self._provider_services[name] = {k: _make_lazy(v) for k, v in services.items()}
-
-    def get_provider_service(self, provider: str, name: str) -> Any:
-        """Look up a single symbol from a provider's service namespace.
-
-        Returns ``None`` if the provider is not registered or the symbol
-        doesn't exist.
-        """
-        ns = self._provider_services.get(provider)
-        if ns is None:
-            return None
-        return ns.get(name)
-
-    def get_provider_namespace(self, provider: str) -> Dict[str, Any]:
-        """Return the full service namespace dict for a provider (empty dict if unregistered)."""
-        return self._provider_services.get(provider, {})
-
-
-# Module-level singleton — the one and only instance.
-registries = PluginRegistries()
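
For reviewers weighing what the removed module provided: the docstring of `register_provider_services` describes a lazy module-attribute reference that keeps `unittest.mock.patch` working. The sketch below illustrates that idea in isolation. It is not the deleted implementation; `LazyRef`, the `demo_provider` module, and `greet` are hypothetical stand-ins.

```python
import sys
import types
from typing import Any
from unittest import mock


class LazyRef:
    """Re-read ``module.attr`` on every call instead of caching the function."""

    def __init__(self, mod: str, attr: str, fallback: Any) -> None:
        self._mod = mod
        self._attr = attr
        self._fallback = fallback

    def _resolve(self) -> Any:
        module = sys.modules.get(self._mod)
        if module is None:
            return self._fallback
        return getattr(module, self._attr, self._fallback)

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return self._resolve()(*args, **kwargs)


# Hypothetical stand-in for a provider module.
demo = types.ModuleType("demo_provider")


def greet() -> str:
    return "real"


demo.greet = greet
sys.modules["demo_provider"] = demo

eager = demo.greet  # captures the function object once
lazy = LazyRef("demo_provider", "greet", greet)  # stores only the lookup path

with mock.patch("demo_provider.greet", return_value="patched"):
    eager_result = eager()  # still the original function
    lazy_result = lazy()    # sees the patched attribute

after_patch = lazy()        # patch undone, original function again
```

A reference captured eagerly bypasses the patch, while the lazy reference follows whatever the module attribute currently points to. With the direct `from agent.bedrock_adapter import ...` calls that this change restores, the indirection is no longer needed because the import happens inside the function at call time.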
@@ -47,16 +47,9 @@ def get_transport(api_mode: str):


 def _discover_transports() -> None:
-    """Import all transport modules to trigger auto-registration.
-
-    Also checks the plugin registry for transports registered by plugins
-    (e.g. anthropic_messages from the anthropic plugin, bedrock_converse
-    from the bedrock plugin).  Plugin-registered transports take priority
-    over core fallbacks when both exist.
-    """
+    """Import all transport modules to trigger auto-registration."""
    global _discovered
    _discovered = True
-    # Core transport modules (registered automatically — no plugin needed)
    try:
        import agent.transports.anthropic  # noqa: F401
    except ImportError:
@@ -69,10 +62,7 @@ def _discover_transports() -> None:
        import agent.transports.chat_completions  # noqa: F401
    except ImportError:
        pass
-    # Plugin-registered transports (override core fallbacks)
    try:
-        from agent.plugin_registries import registries
-        for api_mode, transport_cls in registries._transports.items():
-            _REGISTRY.setdefault(api_mode, transport_cls)
+        import agent.transports.bedrock  # noqa: F401
    except ImportError:
        pass
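
The discovery function above relies on a simple pattern: import each transport module for its registration side effect and skip any module whose import fails. A minimal sketch of that pattern, assuming a hypothetical helper `discover` and a deliberately missing module name; it does not reproduce the real `_REGISTRY` or transport modules.

```python
import importlib
from typing import List


def discover(module_names: List[str]) -> List[str]:
    """Import each module by name; skip those that are not installed."""
    loaded: List[str] = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Optional dependency missing: leave this transport unregistered.
            continue
        loaded.append(name)
    return loaded


# "json" is always importable; the second name is hypothetical and absent.
loaded = discover(["json", "hypothetical_missing_transport_module"])
```

Because `ModuleNotFoundError` is a subclass of `ImportError`, a single `except ImportError` covers both a missing transport module and a missing third-party dependency inside it.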
@@ -1,53 +1,41 @@
-"""Anthropic Messages API transport — core module.
+"""Anthropic Messages API transport.

-Owns format conversion and response normalization for the ``anthropic_messages``
-wire format.  No SDK dependency; all wire-format logic lives in
-:mod:`agent.anthropic_format`.
+Delegates to the existing adapter functions in agent/anthropic_adapter.py.
+This transport owns format conversion and normalization — NOT client lifecycle.
 """

-import json
 from typing import Any, Dict, List, Optional

-from agent.anthropic_format import (
-    build_anthropic_kwargs,
-    convert_messages_to_anthropic,
-    convert_tools_to_anthropic,
-    _to_plain_data,
-)
 from agent.transports.base import ProviderTransport
-from agent.transports.types import NormalizedResponse, ToolCall
+from agent.transports.types import NormalizedResponse


 class AnthropicTransport(ProviderTransport):
    """Transport for api_mode='anthropic_messages'.

-    Uses core functions directly from :mod:`agent.anthropic_format` — no
-    plugin registry lookups needed.  This means core tests, bedrock tests,
-    and any other consumer of the anthropic wire format work without the
-    anthropic plugin being registered.
+    Wraps the existing functions in anthropic_adapter.py behind the
+    ProviderTransport ABC.  Each method delegates — no logic is duplicated.
    """

-    _STOP_REASON_MAP = {
-        "end_turn": "stop",
-        "tool_use": "tool_calls",
-        "max_tokens": "length",
-        "stop_sequence": "stop",
-        "refusal": "content_filter",
-        "model_context_window_exceeded": "length",
-    }
-
    @property
    def api_mode(self) -> str:
        return "anthropic_messages"

    def convert_messages(self, messages: List[Dict[str, Any]], **kwargs) -> Any:
-        """Convert OpenAI messages to Anthropic (system, messages) tuple."""
+        """Convert OpenAI messages to Anthropic (system, messages) tuple.
+
+        kwargs:
+            base_url: Optional[str] — affects thinking signature handling.
+        """
+        from agent.anthropic_adapter import convert_messages_to_anthropic
+
        base_url = kwargs.get("base_url")
-        return convert_messages_to_anthropic(messages, base_url=base_url,
-                                             model=kwargs.get("model"))
+        return convert_messages_to_anthropic(messages, base_url=base_url)

    def convert_tools(self, tools: List[Dict[str, Any]]) -> Any:
        """Convert OpenAI tool schemas to Anthropic input_schema format."""
+        from agent.anthropic_adapter import convert_tools_to_anthropic
+
        return convert_tools_to_anthropic(tools)

    def build_kwargs(
@@ -57,7 +45,23 @@ class AnthropicTransport(ProviderTransport):
        tools: Optional[List[Dict[str, Any]]] = None,
        **params,
    ) -> Dict[str, Any]:
-        """Build Anthropic messages.create() kwargs."""
+        """Build Anthropic messages.create() kwargs.
+
+        Calls convert_messages and convert_tools internally.
+
+        params (all optional):
+            max_tokens: int
+            reasoning_config: dict | None
+            tool_choice: str | None
+            is_oauth: bool
+            preserve_dots: bool
+            context_length: int | None
+            base_url: str | None
+            fast_mode: bool
+            drop_context_1m_beta: bool
+        """
+        from agent.anthropic_adapter import build_anthropic_kwargs
+
        return build_anthropic_kwargs(
            model=model,
            messages=messages,
@@ -74,7 +78,15 @@ class AnthropicTransport(ProviderTransport):
        )

    def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:
-        """Normalize Anthropic response to NormalizedResponse."""
+        """Normalize Anthropic response to NormalizedResponse.
+
+        Parses content blocks (text, thinking, tool_use), maps stop_reason
+        to OpenAI finish_reason, and collects reasoning_details in provider_data.
+        """
+        import json
+        from agent.anthropic_adapter import _to_plain_data
+        from agent.transports.types import ToolCall
+
        strip_tool_prefix = kwargs.get("strip_tool_prefix", False)
        _MCP_PREFIX = "mcp_"

@@ -95,6 +107,12 @@ class AnthropicTransport(ProviderTransport):
                name = block.name
                if strip_tool_prefix and name.startswith(_MCP_PREFIX):
                    stripped = name[len(_MCP_PREFIX):]
+                    # Only strip the mcp_ prefix for OAuth-injected tools
+                    # (where Hermes adds the prefix when sending to Anthropic
+                    # and must remove it on the way back).  Native MCP server
+                    # tools (from mcp_servers: in config.yaml) are registered
+                    # in the tool registry under their FULL mcp_<server>_<tool>
+                    # name and must NOT be stripped.  GH-25255.
                    from tools.registry import registry as _tool_registry
                    if (_tool_registry.get_entry(stripped)
                            and not _tool_registry.get_entry(name)):
@@ -123,7 +141,13 @@ class AnthropicTransport(ProviderTransport):
        )

    def validate_response(self, response: Any) -> bool:
-        """Check Anthropic response structure is valid."""
+        """Check Anthropic response structure is valid.
+
+        An empty content list is legitimate when ``stop_reason == "end_turn"``
+        — the model's canonical way of signalling "nothing more to add" after
+        a tool turn that already delivered the user-facing text. Treating it
+        as invalid falsely retries a completed response.
+        """
        if response is None:
            return False
        content_blocks = getattr(response, "content", None)
@@ -144,6 +168,16 @@ class AnthropicTransport(ProviderTransport):
            return {"cached_tokens": cached, "creation_tokens": written}
        return None

+    # Canonical stop_reason -> finish_reason mapping, kept as a class attribute so instances share it
+    _STOP_REASON_MAP = {
+        "end_turn": "stop",
+        "tool_use": "tool_calls",
+        "max_tokens": "length",
+        "stop_sequence": "stop",
+        "refusal": "content_filter",
+        "model_context_window_exceeded": "length",
+    }
+
    def map_finish_reason(self, raw_reason: str) -> str:
        """Map Anthropic stop_reason to OpenAI finish_reason."""
        return self._STOP_REASON_MAP.get(raw_reason, "stop")
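
The mapping and its fallback can be exercised on their own. The sketch below copies the table from the hunk into a module-level function; the standalone function form is an assumption for illustration, since the diff keeps it as a method on `AnthropicTransport`.

```python
# Table copied from the hunk above.
_STOP_REASON_MAP = {
    "end_turn": "stop",
    "tool_use": "tool_calls",
    "max_tokens": "length",
    "stop_sequence": "stop",
    "refusal": "content_filter",
    "model_context_window_exceeded": "length",
}


def map_finish_reason(raw_reason: str) -> str:
    """Map an Anthropic stop_reason to an OpenAI finish_reason."""
    # Unknown reasons fall back to "stop" rather than raising.
    return _STOP_REASON_MAP.get(raw_reason, "stop")


tool_reason = map_finish_reason("tool_use")
unknown_reason = map_finish_reason("some_future_reason")
```

The `"stop"` default means a stop reason added by the API later is treated as a normal completion instead of an error.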
@@ -1,6 +1,6 @@
 """AWS Bedrock Converse API transport.

-Delegates to the existing adapter functions in hermes_agent_bedrock.
+Delegates to the existing adapter functions in agent/bedrock_adapter.py.
 Bedrock uses its own boto3 client (not the OpenAI SDK), so the transport
 owns format conversion and normalization, while client construction and
 boto3 calls stay on AIAgent.
@@ -21,19 +21,13 @@ class BedrockTransport(ProviderTransport):

    def convert_messages(self, messages: List[Dict[str, Any]], **kwargs) -> Any:
        """Convert OpenAI messages to Bedrock Converse format."""
-        from agent.plugin_registries import registries
-        _fn = registries.get_provider_service("bedrock", "convert_messages_to_converse")
-        if _fn is None:
-            raise ImportError("bedrock plugin not registered")
-        return _fn(messages)
+        from agent.bedrock_adapter import convert_messages_to_converse
+        return convert_messages_to_converse(messages)

    def convert_tools(self, tools: List[Dict[str, Any]]) -> Any:
        """Convert OpenAI tool schemas to Bedrock Converse toolConfig."""
-        from agent.plugin_registries import registries
-        _fn = registries.get_provider_service("bedrock", "convert_tools_to_converse")
-        if _fn is None:
-            raise ImportError("bedrock plugin not registered")
-        return _fn(tools)
+        from agent.bedrock_adapter import convert_tools_to_converse
+        return convert_tools_to_converse(tools)

    def build_kwargs(
        self,
@@ -42,16 +36,22 @@ class BedrockTransport(ProviderTransport):
        tools: Optional[List[Dict[str, Any]]] = None,
        **params,
    ) -> Dict[str, Any]:
-        """Build Bedrock converse() kwargs."""
-        from agent.plugin_registries import registries
-        _fn = registries.get_provider_service("bedrock", "build_converse_kwargs")
-        if _fn is None:
-            raise ImportError("bedrock plugin not registered")
+        """Build Bedrock converse() kwargs.
+
+        Calls convert_messages and convert_tools internally.
+
+        params:
+            max_tokens: int — output token limit (default 4096)
+            temperature: float | None
+            guardrail_config: dict | None — Bedrock guardrails
+            region: str — AWS region (default 'us-east-1')
+        """
+        from agent.bedrock_adapter import build_converse_kwargs

        region = params.get("region", "us-east-1")
        guardrail = params.get("guardrail_config")

-        kwargs = _fn(
+        kwargs = build_converse_kwargs(
            model=model,
            messages=messages,
            tools=tools,
@@ -65,15 +65,20 @@ class BedrockTransport(ProviderTransport):
        return kwargs

    def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:
-        """Normalize Bedrock response to NormalizedResponse."""
-        from agent.plugin_registries import registries
-        normalize_converse_response = registries.get_provider_service("bedrock", "normalize_converse_response")
-        if normalize_converse_response is None:
-            raise ImportError("bedrock plugin not registered")
+        """Normalize Bedrock response to NormalizedResponse.

+        Handles two shapes:
+        1. Raw boto3 dict (from direct converse() calls)
+        2. Already-normalized SimpleNamespace with .choices (from dispatch site)
+        """
+        from agent.bedrock_adapter import normalize_converse_response
+
+        # Normalize to OpenAI-compatible SimpleNamespace
        if hasattr(response, "choices") and response.choices:
+            # Already normalized at dispatch site
            ns = response
        else:
+            # Raw boto3 dict
            ns = normalize_converse_response(response)

        choice = ns.choices[0]
@@ -111,15 +116,27 @@ class BedrockTransport(ProviderTransport):
        )

    def validate_response(self, response: Any) -> bool:
+        """Check Bedrock response structure.
+
+        After normalize_converse_response, the response has OpenAI-compatible
+        .choices — same check as chat_completions.
+        """
        if response is None:
            return False
+        # Raw Bedrock dict response — check for 'output' key
        if isinstance(response, dict):
            return "output" in response
+        # Already-normalized SimpleNamespace
        if hasattr(response, "choices"):
            return bool(response.choices)
        return False

    def map_finish_reason(self, raw_reason: str) -> str:
+        """Map Bedrock stop reason to OpenAI finish_reason.
+
+        The adapter already does this mapping inside normalize_converse_response,
+        so this is only used for direct access to raw responses.
+        """
        _MAP = {
            "end_turn": "stop",
            "tool_use": "tool_calls",
@@ -129,3 +146,9 @@ class BedrockTransport(ProviderTransport):
            "content_filtered": "content_filter",
        }
        return _MAP.get(raw_reason, "stop")
+
+
+# Auto-register on import
+from agent.transports import register_transport  # noqa: E402
+
+register_transport("bedrock_converse", BedrockTransport)
@@ -115,8 +115,6 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
    # Opus 4.5/4.6/4.7 share $5/$25 pricing (new tokenizer, up to 35% more
    # tokens for the same text).
    # Source: https://platform.claude.com/docs/en/about-claude/pricing
-    # NOTE: The anthropic plugin also registers these — plugin takes priority
-    # at runtime, but these static entries ensure costs work without the plugin.
    (
        "anthropic",
        "claude-opus-4-7",
@@ -141,6 +139,7 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
        source_url="https://platform.claude.com/docs/en/about-claude/pricing",
        pricing_version="anthropic-pricing-2026-05",
    ),
+    # ── Anthropic Claude 4.6 ─────────────────────────────────────────────
    (
        "anthropic",
        "claude-opus-4-6",
@@ -189,6 +188,7 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
        source_url="https://platform.claude.com/docs/en/about-claude/pricing",
        pricing_version="anthropic-pricing-2026-05",
    ),
+    # ── Anthropic Claude 4.5 ─────────────────────────────────────────────
    (
        "anthropic",
        "claude-opus-4-5",
@@ -225,6 +225,7 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
        source_url="https://platform.claude.com/docs/en/about-claude/pricing",
        pricing_version="anthropic-pricing-2026-05",
    ),
+    # ── Anthropic Claude 4 / 4.1 ─────────────────────────────────────────
    (
        "anthropic",
        "claude-opus-4-20250514",
@@ -249,56 +250,7 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
        source_url="https://platform.claude.com/docs/en/about-claude/pricing",
        pricing_version="anthropic-pricing-2026-05",
    ),
-    # ── Anthropic older models (pre-4.5 generation) ────────────────────────
-    (
-        "anthropic",
-        "claude-3-5-sonnet-20241022",
-    ): PricingEntry(
-        input_cost_per_million=Decimal("3.00"),
-        output_cost_per_million=Decimal("15.00"),
-        cache_read_cost_per_million=Decimal("0.30"),
-        cache_write_cost_per_million=Decimal("3.75"),
-        source="official_docs_snapshot",
-        source_url="https://platform.claude.com/docs/en/about-claude/pricing",
-        pricing_version="anthropic-pricing-2026-05",
-    ),
-    (
-        "anthropic",
-        "claude-3-5-haiku-20241022",
-    ): PricingEntry(
-        input_cost_per_million=Decimal("0.80"),
-        output_cost_per_million=Decimal("4.00"),
-        cache_read_cost_per_million=Decimal("0.08"),
-        cache_write_cost_per_million=Decimal("1.00"),
-        source="official_docs_snapshot",
-        source_url="https://platform.claude.com/docs/en/about-claude/pricing",
-        pricing_version="anthropic-pricing-2026-05",
-    ),
-    (
-        "anthropic",
-        "claude-3-opus-20240229",
-    ): PricingEntry(
-        input_cost_per_million=Decimal("15.00"),
-        output_cost_per_million=Decimal("75.00"),
-        cache_read_cost_per_million=Decimal("1.50"),
-        cache_write_cost_per_million=Decimal("18.75"),
-        source="official_docs_snapshot",
-        source_url="https://platform.claude.com/docs/en/about-claude/pricing",
-        pricing_version="anthropic-pricing-2026-05",
-    ),
-    (
-        "anthropic",
-        "claude-3-haiku-20240307",
-    ): PricingEntry(
-        input_cost_per_million=Decimal("0.25"),
-        output_cost_per_million=Decimal("1.25"),
-        cache_read_cost_per_million=Decimal("0.03"),
-        cache_write_cost_per_million=Decimal("0.30"),
-        source="official_docs_snapshot",
-        source_url="https://platform.claude.com/docs/en/about-claude/pricing",
-        pricing_version="anthropic-pricing-2026-05",
-    ),
-    # ── OpenAI ────────────────────────────────────────────────────────────
+    # OpenAI
    (
        "openai",
        "gpt-4o",
@@ -376,6 +328,55 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
        source_url="https://openai.com/api/pricing/",
        pricing_version="openai-pricing-2026-03-16",
    ),
+    # ── Anthropic older models (pre-4.5 generation) ────────────────────────
+    (
+        "anthropic",
+        "claude-3-5-sonnet-20241022",
+    ): PricingEntry(
+        input_cost_per_million=Decimal("3.00"),
+        output_cost_per_million=Decimal("15.00"),
+        cache_read_cost_per_million=Decimal("0.30"),
+        cache_write_cost_per_million=Decimal("3.75"),
+        source="official_docs_snapshot",
+        source_url="https://platform.claude.com/docs/en/about-claude/pricing",
+        pricing_version="anthropic-pricing-2026-05",
+    ),
+    (
+        "anthropic",
+        "claude-3-5-haiku-20241022",
+    ): PricingEntry(
+        input_cost_per_million=Decimal("0.80"),
+        output_cost_per_million=Decimal("4.00"),
+        cache_read_cost_per_million=Decimal("0.08"),
+        cache_write_cost_per_million=Decimal("1.00"),
+        source="official_docs_snapshot",
+        source_url="https://platform.claude.com/docs/en/about-claude/pricing",
+        pricing_version="anthropic-pricing-2026-05",
+    ),
+    (
+        "anthropic",
+        "claude-3-opus-20240229",
+    ): PricingEntry(
+        input_cost_per_million=Decimal("15.00"),
+        output_cost_per_million=Decimal("75.00"),
+        cache_read_cost_per_million=Decimal("1.50"),
+        cache_write_cost_per_million=Decimal("18.75"),
+        source="official_docs_snapshot",
+        source_url="https://platform.claude.com/docs/en/about-claude/pricing",
+        pricing_version="anthropic-pricing-2026-05",
+    ),
+    (
+        "anthropic",
+        "claude-3-haiku-20240307",
+    ): PricingEntry(
+        input_cost_per_million=Decimal("0.25"),
+        output_cost_per_million=Decimal("1.25"),
+        cache_read_cost_per_million=Decimal("0.03"),
+        cache_write_cost_per_million=Decimal("0.30"),
+        source="official_docs_snapshot",
+        source_url="https://platform.claude.com/docs/en/about-claude/pricing",
+        pricing_version="anthropic-pricing-2026-05",
+    ),
    # DeepSeek
    (
        "deepseek",
@@ -439,6 +440,80 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
        source_url="https://ai.google.dev/pricing",
        pricing_version="google-pricing-2026-03-16",
    ),
+    # AWS Bedrock — pricing per the Bedrock pricing page.
+    # Bedrock charges the same per-token rates as the model provider but
+    # through AWS billing.  These are the on-demand prices (no commitment).
+    # Source: https://aws.amazon.com/bedrock/pricing/
+    (
+        "bedrock",
+        "anthropic.claude-opus-4-6",
+    ): PricingEntry(
+        input_cost_per_million=Decimal("5.00"),
+        output_cost_per_million=Decimal("25.00"),
+        source="official_docs_snapshot",
+        source_url="https://aws.amazon.com/bedrock/pricing/",
+        pricing_version="bedrock-pricing-2026-04",
+    ),
+    (
+        "bedrock",
+        "anthropic.claude-sonnet-4-6",
+    ): PricingEntry(
+        input_cost_per_million=Decimal("3.00"),
+        output_cost_per_million=Decimal("15.00"),
+        source="official_docs_snapshot",
+        source_url="https://aws.amazon.com/bedrock/pricing/",
+        pricing_version="bedrock-pricing-2026-04",
+    ),
+    (
+        "bedrock",
+        "anthropic.claude-sonnet-4-5",
+    ): PricingEntry(
+        input_cost_per_million=Decimal("3.00"),
+        output_cost_per_million=Decimal("15.00"),
+        source="official_docs_snapshot",
+        source_url="https://aws.amazon.com/bedrock/pricing/",
+        pricing_version="bedrock-pricing-2026-04",
+    ),
+    (
+        "bedrock",
+        "anthropic.claude-haiku-4-5",
+    ): PricingEntry(
+        input_cost_per_million=Decimal("0.80"),
+        output_cost_per_million=Decimal("4.00"),
+        source="official_docs_snapshot",
+        source_url="https://aws.amazon.com/bedrock/pricing/",
+        pricing_version="bedrock-pricing-2026-04",
+    ),
+    (
+        "bedrock",
+        "amazon.nova-pro",
+    ): PricingEntry(
+        input_cost_per_million=Decimal("0.80"),
+        output_cost_per_million=Decimal("3.20"),
+        source="official_docs_snapshot",
+        source_url="https://aws.amazon.com/bedrock/pricing/",
+        pricing_version="bedrock-pricing-2026-04",
+    ),
+    (
+        "bedrock",
+        "amazon.nova-lite",
+    ): PricingEntry(
+        input_cost_per_million=Decimal("0.06"),
+        output_cost_per_million=Decimal("0.24"),
+        source="official_docs_snapshot",
+        source_url="https://aws.amazon.com/bedrock/pricing/",
+        pricing_version="bedrock-pricing-2026-04",
+    ),
+    (
+        "bedrock",
+        "amazon.nova-micro",
+    ): PricingEntry(
+        input_cost_per_million=Decimal("0.035"),
+        output_cost_per_million=Decimal("0.14"),
+        source="official_docs_snapshot",
+        source_url="https://aws.amazon.com/bedrock/pricing/",
+        pricing_version="bedrock-pricing-2026-04",
+    ),
    # MiniMax
    (
        "minimax",
@@ -506,27 +581,36 @@ def resolve_billing_route(
    return BillingRoute(provider=provider_name or "unknown", model=model.split("/")[-1] if model else "", base_url=base_url or "", billing_mode="unknown")


+def _normalize_anthropic_model_name(model: str) -> str:
+    """Normalize Anthropic model name variants to canonical form.
+
+    Handles:
+      - Dot notation: claude-opus-4.7 → claude-opus-4-7
+      - Mixed case / surrounding whitespace: " Claude-Opus-4.7 " → claude-opus-4-7
+      - Strips anthropic/ prefix if present
+    """
+    name = model.lower().strip()
+    if name.startswith("anthropic/"):
+        name = name[len("anthropic/"):]
+    # Normalize dots to dashes in version numbers (e.g. 4.7 → 4-7, 4.6 → 4-6)
+    # But preserve the rest of the name structure
+    name = re.sub(r"(\d+)\.(\d+)", r"\1-\2", name)
+    return name
+
+
 def _lookup_official_docs_pricing(route: BillingRoute) -> Optional[PricingEntry]:
    model = route.model.lower()
-
-    # ── Plugin-registered pricing entries take priority ──
-    from agent.plugin_registries import registries as _preg
-    plugin_entry = _preg.get_pricing_entry(route.provider, model)
-    if plugin_entry:
-        return plugin_entry
-    # Try provider-specific name normalization via registry
-    _norm = _preg.get_provider_service(route.provider, "normalize_model_name")
-    if _norm is not None:
-        normalized = _norm(model)
-        if normalized != model:
-            plugin_entry = _preg.get_pricing_entry(route.provider, normalized)
-            if plugin_entry:
-                return plugin_entry
-
-    # Fall back to static dict
+    # Direct lookup first
    entry = _OFFICIAL_DOCS_PRICING.get((route.provider, model))
    if entry:
        return entry
+    # Try normalized name for Anthropic (handles dot-notation like opus-4.7)
+    if route.provider == "anthropic":
+        normalized = _normalize_anthropic_model_name(model)
+        if normalized != model:
+            entry = _OFFICIAL_DOCS_PRICING.get((route.provider, normalized))
+            if entry:
+                return entry
    return None


@@ -576,6 +576,8 @@ def load_cli_config() -> Dict[str, Any]:
        "docker_env": "TERMINAL_DOCKER_ENV",
        "docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
        "docker_run_as_host_user": "TERMINAL_DOCKER_RUN_AS_HOST_USER",
+        "docker_persist_across_processes": "TERMINAL_DOCKER_PERSIST_ACROSS_PROCESSES",
+        "docker_orphan_reaper": "TERMINAL_DOCKER_ORPHAN_REAPER",
        "sandbox_dir": "TERMINAL_SANDBOX_DIR",
        # Persistent shell (non-local backends)
        "persistent_shell": "TERMINAL_PERSISTENT_SHELL",
@@ -6234,10 +6236,8 @@ class HermesCLI:
        
        # ``self.api_key`` may be a callable (Azure Foundry Entra ID bearer
        # provider). Never invoke it; just identify the auth surface.
-        from agent.plugin_registries import registries
-        _azure_ns = registries.get_provider_namespace("azure")
-        is_token_provider = _azure_ns.get("is_token_provider")
-        if is_token_provider and is_token_provider(self.api_key):
+        from agent.azure_identity_adapter import is_token_provider
+        if is_token_provider(self.api_key):
            api_key_display = "Microsoft Entra ID"
        elif isinstance(self.api_key, str) and len(self.api_key) > 12:
            api_key_display = f"{self.api_key[:8]}...{self.api_key[-4:]}"
@@ -10968,14 +10968,7 @@ class HermesCLI:
            return
        self._voice_tts_done.clear()
        try:
-            from agent.plugin_registries import registries
-            _tts_provider = registries.get_tool_provider("tts")
-            if _tts_provider is None:
-                raise ImportError("tts tool provider not registered")
-            text_to_speech_tool = _tts_provider.tool_functions.get("text_to_speech_tool")
-            check_tts_requirements = _tts_provider.check_fn
-            if text_to_speech_tool is None:
-                raise ImportError("text_to_speech_tool not found in tts provider")
+            from tools.tts_tool import text_to_speech_tool
            from tools.voice_mode import play_audio_file

            # Strip markdown and non-speech content for cleaner TTS
@@ -11158,10 +11151,8 @@ class HermesCLI:
        status = "enabled" if self._voice_tts else "disabled"

        if self._voice_tts:
-            from agent.plugin_registries import registries
-            _tts_provider = registries.get_tool_provider("tts")
-            check_tts_requirements = _tts_provider.check_fn if _tts_provider else None
-            if check_tts_requirements and not check_tts_requirements():
+            from tools.tts_tool import check_tts_requirements
+            if not check_tts_requirements():
                _cprint(f"{_DIM}Warning: No TTS provider available. Install edge-tts or set API keys.{_RST}")

        _cprint(f"{_ACCENT}Voice TTS {status}.{_RST}")
@@ -11783,17 +11774,13 @@ class HermesCLI:

            if self._voice_tts:
                try:
-                    from agent.plugin_registries import registries
-                    _tts_provider = registries.get_tool_provider("tts")
-                    if _tts_provider is None:
-                        raise ImportError("tts tool provider not registered")
-                    _load_tts_cfg = _tts_provider.config_functions.get("_load_tts_config")
-                    _get_prov = _tts_provider.config_functions.get("_get_provider")
-                    _import_elevenlabs = _tts_provider.config_functions.get("_import_elevenlabs")
-                    _import_sounddevice = _tts_provider.config_functions.get("_import_sounddevice")
-                    stream_tts_to_speaker = _tts_provider.tool_functions.get("stream_tts_to_speaker")
-                    if not all([_load_tts_cfg, _get_prov, stream_tts_to_speaker]):
-                        raise ImportError("streaming TTS functions not found in tts provider")
+                    from tools.tts_tool import (
+                        _load_tts_config as _load_tts_cfg,
+                        _get_provider as _get_prov,
+                        _import_elevenlabs,
+                        _import_sounddevice,
+                        stream_tts_to_speaker,
+                    )
                    _tts_cfg = _load_tts_cfg()
                    if _get_prov(_tts_cfg) == "elevenlabs":
                        # Verify both ElevenLabs SDK and audio output are available
@@ -15140,13 +15127,50 @@ def main(
    # Handle single query mode
    if query or image:
        query, single_query_images = _collect_query_images(query, image)
+        # Kanban workers spawn with ``hermes chat -q "work kanban task <id>"``;
+        # the actual task description lives in the task body. Mirror the
+        # gateway/CLI behaviour for inbound images by scanning the body for
+        # local image paths and http(s) image URLs and attaching them to the
+        # worker's first turn. Without this, users who paste a screenshot
+        # path or URL into a kanban task body never get it routed to the
+        # model's vision input.
+        single_query_image_urls: list[str] = []
+        _kanban_task_id = os.environ.get("HERMES_KANBAN_TASK", "").strip()
+        if _kanban_task_id:
+            try:
+                from hermes_cli import kanban_db as _kb
+                from agent.image_routing import extract_image_refs as _extract_refs
+
+                _conn = _kb.connect()
+                try:
+                    _task = _kb.get_task(_conn, _kanban_task_id)
+                finally:
+                    try:
+                        _conn.close()
+                    except Exception:
+                        pass
+                _body = getattr(_task, "body", "") if _task is not None else ""
+                if _body:
+                    _kb_paths, _kb_urls = _extract_refs(_body)
+                    if _kb_paths:
+                        # Dedupe against any --image the user already passed.
+                        _seen = {str(p) for p in single_query_images}
+                        for _p in _kb_paths:
+                            if _p not in _seen:
+                                _seen.add(_p)
+                                single_query_images.append(Path(_p))
+                    if _kb_urls:
+                        single_query_image_urls.extend(_kb_urls)
+            except Exception as _exc:
+                # Best-effort enrichment; never block worker startup on it.
+                logger.debug("kanban image-ref extraction failed: %s", _exc)
        if quiet:
            # Quiet mode: suppress banner, spinner, tool previews.
            # Only print the final response and parseable session info.
            cli.tool_progress_mode = "off"
            if cli._ensure_runtime_credentials():
                effective_query: Any = query
-                if single_query_images:
+                if single_query_images or single_query_image_urls:
                    # Honour the same image-routing decision used by the
                    # interactive path. With a vision-capable model (incl.
                    # custom-provider models declared via
@@ -15175,19 +15199,26 @@ def main(
                            _parts, _skipped = _build_parts(
                                query if isinstance(query, str) else "",
                                [str(p) for p in single_query_images],
+                                image_urls=list(single_query_image_urls) or None,
                            )
                            if any(p.get("type") == "image_url" for p in _parts):
                                effective_query = _parts
                            else:
                                # All images unreadable — text fallback.
+                                # ``_preprocess_images_with_vision`` only knows
+                                # about local files; URLs would be lost there,
+                                # so keep the original query text intact when
+                                # only URLs were supplied.
+                                if single_query_images:
+                                    effective_query = cli._preprocess_images_with_vision(
+                                        query, single_query_images, announce=False,
+                                    )
+                        except Exception:
+                            if single_query_images:
                                effective_query = cli._preprocess_images_with_vision(
                                    query, single_query_images, announce=False,
                                )
-                        except Exception:
-                            effective_query = cli._preprocess_images_with_vision(
-                                query, single_query_images, announce=False,
-                            )
-                    else:
+                    elif single_query_images:
                        effective_query = cli._preprocess_images_with_vision(
                            query,
                            single_query_images,
@@ -30,13 +30,21 @@ cd /opt/data
 dash_host="${HERMES_DASHBOARD_HOST:-0.0.0.0}"
 dash_port="${HERMES_DASHBOARD_PORT:-9119}"

-# Binding to anything other than localhost requires --insecure — the
-# dashboard refuses otherwise because it exposes API keys. Inside a
-# container this is the expected deployment.
+# `--insecure` is opt-in via HERMES_DASHBOARD_INSECURE. The dashboard's
+# OAuth auth gate engages automatically on non-loopback binds when a
+# DashboardAuthProvider is registered (e.g. the bundled dashboard_auth/nous
+# provider, which auto-registers when HERMES_DASHBOARD_OAUTH_CLIENT_ID is
+# set). If no provider is registered, start_server fails closed with a
+# specific operator-facing error.
+#
+# This used to derive --insecure from the bind host ("anything non-loopback
+# implies insecure"), but that predates the OAuth gate and silently
+# disabled it on every container-deployed dashboard. The gate is now the
+# authority; operators on trusted LANs / behind a reverse proxy without
+# the OAuth contract opt in explicitly.
 insecure=""
-case "$dash_host" in
-    127.0.0.1|localhost) ;;
-    *) insecure="--insecure" ;;
+case "${HERMES_DASHBOARD_INSECURE:-}" in
+    1|true|TRUE|True|yes|YES|Yes) insecure="--insecure" ;;
 esac

 # shellcheck disable=SC2086  # word-splitting of $insecure is intentional
@@ -3654,11 +3654,8 @@ class BasePlatformAdapter(ABC):
                        and text_content
                        and not media_files):
                    try:
-                        from agent.plugin_registries import registries
-                        _tts = registries.get_tool_provider("tts")
-                        text_to_speech_tool = _tts.tool_functions.get("text_to_speech_tool") if _tts else None
-                        check_tts_requirements = _tts.check_fn if _tts else None
-                        if check_tts_requirements and text_to_speech_tool and check_tts_requirements():
+                        from tools.tts_tool import text_to_speech_tool, check_tts_requirements
+                        if check_tts_requirements():
                            import json as _json
                            speech_text = self.prepare_tts_text(text_content)
                            if not speech_text:
@@ -113,12 +113,17 @@ DINGTALK_TYPE_MAPPING = {
 def check_dingtalk_requirements() -> bool:
    """Check if DingTalk dependencies are available and configured.

-    Since this is a separate package, deps are guaranteed by the package
-    manager.  Just verify the SDK can be imported and env vars are set.
+    Lazy-installs dingtalk-stream via ``tools.lazy_deps.ensure("platform.dingtalk")``
+    on first call if not present.
    """
    global DINGTALK_STREAM_AVAILABLE, dingtalk_stream, ChatbotMessage, CallbackMessage, AckMessage
    global HTTPX_AVAILABLE, httpx
    if not DINGTALK_STREAM_AVAILABLE or not HTTPX_AVAILABLE:
+        try:
+            from tools.lazy_deps import ensure as _lazy_ensure
+            _lazy_ensure("platform.dingtalk", prompt=False)
+        except Exception:
+            return False
        try:
            import dingtalk_stream as _ds
            from dingtalk_stream import ChatbotMessage as _CM
@@ -1345,64 +1345,63 @@ def _run_official_feishu_ws_client(ws_client: Any, adapter: Any) -> None:
 def check_feishu_requirements() -> bool:
    """Check if Feishu/Lark dependencies are available.

-    Since this is a separate package, deps are guaranteed by the package
-    manager.  Just verify the SDK can be imported.
+    Lazy-installs lark-oapi via ``tools.lazy_deps.ensure("platform.feishu")``
+    on first call if not present. Rebinds all module-level globals on success.
    """
    if FEISHU_AVAILABLE:
        return True

-    try:
-        import lark_oapi as _lark
-        from lark_oapi.api.application.v6 import GetApplicationRequest as _GAR
+    def _import():
+        import lark_oapi as lark
+        from lark_oapi.api.application.v6 import GetApplicationRequest
        from lark_oapi.api.im.v1 import (
-            CreateFileRequest as _CFR, CreateFileRequestBody as _CFRB,
-            CreateImageRequest as _CIR, CreateImageRequestBody as _CIRB,
-            CreateMessageRequest as _CMR, CreateMessageRequestBody as _CMRB,
-            GetChatRequest as _GCR, GetMessageRequest as _GMR, GetMessageResourceRequest as _GMRR,
-            P2ImMessageMessageReadV1 as _P2,
-            ReplyMessageRequest as _RMR, ReplyMessageRequestBody as _RMRB,
-            UpdateMessageRequest as _UMR, UpdateMessageRequestBody as _UMRB,
+            CreateFileRequest, CreateFileRequestBody,
+            CreateImageRequest, CreateImageRequestBody,
+            CreateMessageRequest, CreateMessageRequestBody,
+            GetChatRequest, GetMessageRequest, GetMessageResourceRequest,
+            P2ImMessageMessageReadV1,
+            ReplyMessageRequest, ReplyMessageRequestBody,
+            UpdateMessageRequest, UpdateMessageRequestBody,
        )
-        from lark_oapi.core import AccessTokenType as _AT, HttpMethod as _HM
-        from lark_oapi.core.const import FEISHU_DOMAIN as _FD, LARK_DOMAIN as _LD
-        from lark_oapi.core.model import BaseRequest as _BR
+        from lark_oapi.core import AccessTokenType, HttpMethod
+        from lark_oapi.core.const import FEISHU_DOMAIN, LARK_DOMAIN
+        from lark_oapi.core.model import BaseRequest
        from lark_oapi.event.callback.model.p2_card_action_trigger import (
-            CallBackCard as _CBC, P2CardActionTriggerResponse as _P2R,
+            CallBackCard, P2CardActionTriggerResponse,
        )
-        from lark_oapi.event.dispatcher_handler import EventDispatcherHandler as _EDH
-        from lark_oapi.ws import Client as _FWSC
-    except ImportError:
-        return False
+        from lark_oapi.event.dispatcher_handler import EventDispatcherHandler
+        from lark_oapi.ws import Client as FeishuWSClient
+        return {
+            "lark": lark,
+            "GetApplicationRequest": GetApplicationRequest,
+            "CreateFileRequest": CreateFileRequest,
+            "CreateFileRequestBody": CreateFileRequestBody,
+            "CreateImageRequest": CreateImageRequest,
+            "CreateImageRequestBody": CreateImageRequestBody,
+            "CreateMessageRequest": CreateMessageRequest,
+            "CreateMessageRequestBody": CreateMessageRequestBody,
+            "GetChatRequest": GetChatRequest,
+            "GetMessageRequest": GetMessageRequest,
+            "GetMessageResourceRequest": GetMessageResourceRequest,
+            "P2ImMessageMessageReadV1": P2ImMessageMessageReadV1,
+            "ReplyMessageRequest": ReplyMessageRequest,
+            "ReplyMessageRequestBody": ReplyMessageRequestBody,
+            "UpdateMessageRequest": UpdateMessageRequest,
+            "UpdateMessageRequestBody": UpdateMessageRequestBody,
+            "AccessTokenType": AccessTokenType,
+            "HttpMethod": HttpMethod,
+            "FEISHU_DOMAIN": FEISHU_DOMAIN,
+            "LARK_DOMAIN": LARK_DOMAIN,
+            "BaseRequest": BaseRequest,
+            "CallBackCard": CallBackCard,
+            "P2CardActionTriggerResponse": P2CardActionTriggerResponse,
+            "EventDispatcherHandler": EventDispatcherHandler,
+            "FeishuWSClient": FeishuWSClient,
+            "FEISHU_AVAILABLE": True,
+        }

-    globals().update({
-        "lark": _lark,
-        "GetApplicationRequest": _GAR,
-        "CreateFileRequest": _CFR,
-        "CreateFileRequestBody": _CFRB,
-        "CreateImageRequest": _CIR,
-        "CreateImageRequestBody": _CIRB,
-        "CreateMessageRequest": _CMR,
-        "CreateMessageRequestBody": _CMRB,
-        "GetChatRequest": _GCR,
-        "GetMessageRequest": _GMR,
-        "GetMessageResourceRequest": _GMRR,
-        "P2ImMessageMessageReadV1": _P2,
-        "ReplyMessageRequest": _RMR,
-        "ReplyMessageRequestBody": _RMRB,
-        "UpdateMessageRequest": _UMR,
-        "UpdateMessageRequestBody": _UMRB,
-        "AccessTokenType": _AT,
-        "HttpMethod": _HM,
-        "FEISHU_DOMAIN": _FD,
-        "LARK_DOMAIN": _LD,
-        "BaseRequest": _BR,
-        "CallBackCard": _CBC,
-        "P2CardActionTriggerResponse": _P2R,
-        "EventDispatcherHandler": _EDH,
-        "FeishuWSClient": _FWSC,
-        "FEISHU_AVAILABLE": True,
-    })
-    return True
+    from tools.lazy_deps import ensure_and_bind
+    return ensure_and_bind("platform.feishu", _import, globals(), prompt=False)


 class FeishuAdapter(BasePlatformAdapter):
@@ -2460,7 +2459,7 @@ class FeishuAdapter(BasePlatformAdapter):
        logging, and reaction.  Scheduling follows the same
        ``run_coroutine_threadsafe`` pattern used by ``_on_message_event``.
        """
-        from .feishu_comment import handle_drive_comment_event
+        from gateway.platforms.feishu_comment import handle_drive_comment_event

        loop = self._loop
        if not self._loop_accepts_callbacks(loop):
@@ -1164,7 +1164,7 @@ async def handle_drive_comment_event(
    )

    # Access control
-    from .feishu_comment_rules import load_config, resolve_rule, is_user_allowed, has_wiki_keys
+    from gateway.platforms.feishu_comment_rules import load_config, resolve_rule, is_user_allowed, has_wiki_keys

    comments_cfg = load_config()
    rule = resolve_rule(comments_cfg, file_type, file_token)
@@ -240,8 +240,13 @@ def _check_e2ee_deps() -> bool:
 def check_matrix_requirements() -> bool:
    """Return True if the Matrix adapter can be used.

-    Since this is a separate package, deps are guaranteed by the package
-    manager.  Just verify the SDK can be imported and env vars are set.
+    Lazy-installs the full ``platform.matrix`` feature group via
+    ``tools.lazy_deps.ensure_and_bind`` whenever any of the declared
+    packages (mautrix, Markdown, aiosqlite, asyncpg, aiohttp-socks) is
+    missing — not just mautrix itself.  Previously this short-circuited on
+    ``import mautrix``, which left the other four packages uninstalled
+    forever and broke E2EE connect with ``No module named 'asyncpg'``
+    (#31116).  Rebinds module-level type globals on success.
    """
    token = os.getenv("MATRIX_ACCESS_TOKEN", "")
    password = os.getenv("MATRIX_PASSWORD", "")
@@ -254,15 +259,48 @@ def check_matrix_requirements() -> bool:
        logger.warning("Matrix: MATRIX_HOMESERVER not set")
        return False

-    # Try importing the mautrix types to verify the SDK is present.
+    # Check whether any package in the platform.matrix feature group is
+    # missing.  ``feature_missing`` is cheap (per-spec importlib.metadata
+    # lookups) and correctly handles ``mautrix[encryption]`` by stripping
+    # the extras marker before checking the bare package.
    try:
-        from mautrix.types import (  # noqa: F401
-            ContentURI, EventID, EventType, PaginationDirection,
-            PresenceState, RoomCreatePreset, RoomID, SyncToken,
-            TrustState, UserID,
-        )
-    except ImportError:
-        return False
+        from tools.lazy_deps import feature_missing, ensure_and_bind
+        missing = feature_missing("platform.matrix")
+    except Exception as exc:  # pragma: no cover — defensive
+        logger.debug("Matrix: lazy_deps lookup failed: %s", exc)
+        missing = ()
+        ensure_and_bind = None  # type: ignore[assignment]
+
+    if missing or ensure_and_bind is None:
+        def _import():
+            from mautrix.types import (
+                ContentURI, EventID, EventType, PaginationDirection,
+                PresenceState, RoomCreatePreset, RoomID, SyncToken,
+                TrustState, UserID,
+            )
+            return {
+                "ContentURI": ContentURI,
+                "EventID": EventID,
+                "EventType": EventType,
+                "PaginationDirection": PaginationDirection,
+                "PresenceState": PresenceState,
+                "RoomCreatePreset": RoomCreatePreset,
+                "RoomID": RoomID,
+                "SyncToken": SyncToken,
+                "TrustState": TrustState,
+                "UserID": UserID,
+            }
+
+        if ensure_and_bind is None:
+            return False
+        if not ensure_and_bind("platform.matrix", _import, globals(), prompt=False):
+            logger.warning(
+                "Matrix: required packages not installed (%s). "
+                "Run: pip install 'mautrix[encryption]' asyncpg aiosqlite "
+                "Markdown aiohttp-socks",
+                ", ".join(missing) if missing else "platform.matrix",
+            )
+            return False

    # If encryption is requested, verify E2EE deps are available at startup
    # rather than silently degrading to plaintext-only at connect time.
@@ -30,6 +30,10 @@ except ImportError:
    AsyncSocketModeHandler = Any
    AsyncWebClient = Any

+import sys
+from pathlib import Path as _Path
+sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
+
 from gateway.config import Platform, PlatformConfig
 from gateway.platforms.helpers import MessageDeduplicator
 from gateway.platforms.base import (
@@ -71,28 +75,27 @@ class _ThreadContextCache:
 def check_slack_requirements() -> bool:
    """Check if Slack dependencies are available.

-    Since this is a separate package, deps are guaranteed by the package
-    manager.  Just verify the SDK can be imported.
+    Lazy-installs slack-bolt/slack-sdk via ``tools.lazy_deps.ensure_and_bind``
+    on first call if not present. Rebinds all module-level globals on success.
    """
    if SLACK_AVAILABLE:
        return True

-    try:
-        from slack_bolt.async_app import AsyncApp as _AsyncApp
-        from slack_bolt.adapter.socket_mode.async_handler import AsyncSocketModeHandler as _ASMH
-        from slack_sdk.web.async_client import AsyncWebClient as _AWC
-        import aiohttp as _aiohttp
-    except ImportError:
-        return False
+    def _import():
+        from slack_bolt.async_app import AsyncApp
+        from slack_bolt.adapter.socket_mode.async_handler import AsyncSocketModeHandler
+        from slack_sdk.web.async_client import AsyncWebClient
+        import aiohttp
+        return {
+            "AsyncApp": AsyncApp,
+            "AsyncSocketModeHandler": AsyncSocketModeHandler,
+            "AsyncWebClient": AsyncWebClient,
+            "aiohttp": aiohttp,
+            "SLACK_AVAILABLE": True,
+        }

-    globals().update({
-        "AsyncApp": _AsyncApp,
-        "AsyncSocketModeHandler": _ASMH,
-        "AsyncWebClient": _AWC,
-        "aiohttp": _aiohttp,
-        "SLACK_AVAILABLE": True,
-    })
-    return True
+    from tools.lazy_deps import ensure_and_bind
+    return ensure_and_bind("platform.slack", _import, globals(), prompt=False)


 def _extract_text_from_slack_blocks(blocks: list) -> str:
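The `_import` closures in the Feishu, Matrix, and Slack hunks all follow the same shape: import inside a function, return a name-to-object dict, and let a helper copy that dict into the module globals. The sketch below shows only that bind step. `bind_if_importable` is a hypothetical stand-in, not the real `tools.lazy_deps.ensure_and_bind`, whose install logic is not shown in this diff and is left out here.

```python
from typing import Any, Callable, Dict, MutableMapping


def bind_if_importable(
    import_fn: Callable[[], Dict[str, Any]],
    namespace: MutableMapping[str, Any],
) -> bool:
    """Hypothetical helper: run import_fn and bind its symbols into namespace.

    Returns False (leaving namespace untouched) when the import fails.
    The real helper presumably attempts an install first; that step is
    an assumption and is deliberately omitted from this sketch.
    """
    try:
        symbols = import_fn()
    except ImportError:
        return False
    namespace.update(symbols)
    return True


def _import_json() -> Dict[str, Any]:
    # Stand-in for an optional SDK import; json is used so the sketch runs anywhere.
    import json
    return {"json_dumps": json.dumps, "JSON_AVAILABLE": True}


def _import_missing() -> Dict[str, Any]:
    # Simulates a package that is not installed.
    import definitely_not_a_real_package_xyz  # noqa: F401
    return {"MISSING_AVAILABLE": True}


namespace: Dict[str, Any] = {"JSON_AVAILABLE": False}
ok = bind_if_importable(_import_json, namespace)
missing_ok = bind_if_importable(_import_missing, namespace)
```

Returning the dict from the closure, rather than calling `globals().update(...)` inline, keeps the availability flag and the rebound names in one place, so a failed import cannot leave the module half-bound.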
@@ -60,6 +60,10 @@ except ImportError:
        DEFAULT_TYPE = Any
    ContextTypes = _MockContextTypes

+import sys
+from pathlib import Path as _Path
+sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
+
 from gateway.config import Platform, PlatformConfig
 from gateway.platforms.base import (
    BasePlatformAdapter,
@@ -107,8 +111,10 @@ MAX_COMMANDS_PER_SCOPE = 30
 def check_telegram_requirements() -> bool:
    """Check if Telegram dependencies are available.

-    Since this is a separate package, deps are guaranteed by the package
-    manager.  Just verify the SDK can be imported.
+    If python-telegram-bot is missing, attempts to lazy-install it via
+    ``tools.lazy_deps.ensure("platform.telegram")``. After a successful
+    install, re-imports the SDK and flips ``TELEGRAM_AVAILABLE`` to True
+    so the adapter's class-level type aliases get rebound.
    """
    global TELEGRAM_AVAILABLE, Update, Bot, Message, InlineKeyboardButton
    global InlineKeyboardMarkup, LinkPreviewOptions, Application
@@ -116,6 +122,11 @@ def check_telegram_requirements() -> bool:
    global ContextTypes, filters, ParseMode, ChatType, HTTPXRequest
    if TELEGRAM_AVAILABLE:
        return True
+    try:
+        from tools.lazy_deps import ensure as _lazy_ensure
+        _lazy_ensure("platform.telegram", prompt=False)
+    except Exception:
+        return False
    try:
        from telegram import Update as _Update, Bot as _Bot, Message as _Message
        from telegram import InlineKeyboardButton as _IKB, InlineKeyboardMarkup as _IKM
@@ -831,6 +831,8 @@ if _config_path.exists():
                "docker_env": "TERMINAL_DOCKER_ENV",
                "docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
                "docker_run_as_host_user": "TERMINAL_DOCKER_RUN_AS_HOST_USER",
+                "docker_persist_across_processes": "TERMINAL_DOCKER_PERSIST_ACROSS_PROCESSES",
+                "docker_orphan_reaper": "TERMINAL_DOCKER_ORPHAN_REAPER",
                "sandbox_dir": "TERMINAL_SANDBOX_DIR",
                "persistent_shell": "TERMINAL_PERSISTENT_SHELL",
            }
@@ -5418,6 +5420,49 @@ class GatewayRunner:
            )
            stale_timeout_seconds = 0

+        # Read kanban.default_assignee — fallback profile for tasks
+        # created without an explicit assignee (e.g. via the dashboard).
+        # When set, the dispatcher applies it to unassigned ready tasks
+        # instead of skipping them indefinitely (#27145). Empty string
+        # (the schema default) means "no fallback, keep skipping" —
+        # backward-compatible with existing installs.
+        default_assignee = (kanban_cfg.get("default_assignee") or "").strip() or None
+        if default_assignee:
+            logger.info(
+                "kanban dispatcher: default_assignee=%r (unassigned ready tasks "
+                "will route to this profile)",
+                default_assignee,
+            )
+
+        # Read kanban.max_in_progress_per_profile — per-profile concurrency
+        # cap (#21582). When set, no single profile gets more than N
+        # workers running at once, even if the global max_in_progress
+        # would allow it. Prevents one profile's local model / API quota
+        # / browser pool from being overwhelmed by a fan-out.
+        raw_per_profile = kanban_cfg.get("max_in_progress_per_profile", None)
+        max_in_progress_per_profile = None
+        if raw_per_profile is not None:
+            try:
+                max_in_progress_per_profile = int(raw_per_profile)
+            except (TypeError, ValueError):
+                logger.warning(
+                    "kanban dispatcher: invalid kanban.max_in_progress_per_profile=%r; ignoring",
+                    raw_per_profile,
+                )
+                max_in_progress_per_profile = None
+            else:
+                if max_in_progress_per_profile < 1:
+                    logger.warning(
+                        "kanban dispatcher: kanban.max_in_progress_per_profile=%r is below 1; ignoring",
+                        raw_per_profile,
+                    )
+                    max_in_progress_per_profile = None
+                else:
+                    logger.info(
+                        "kanban dispatcher: max_in_progress_per_profile=%d",
+                        max_in_progress_per_profile,
+                    )
+
        # Initial delay so the gateway finishes wiring adapters before the
        # dispatcher spawns workers (those workers may hit gateway notify
        # subscriptions etc.). Matches the notifier watcher's delay.
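The two config reads above can be exercised in isolation. The following is a minimal sketch that mirrors the normalization in this hunk; the function name `parse_kanban_dispatch_options` and the logger name are assumptions for illustration and do not exist in the codebase.

```python
import logging
from typing import Any, Mapping, Optional, Tuple

logger = logging.getLogger("kanban.sketch")  # hypothetical logger name


def parse_kanban_dispatch_options(
    kanban_cfg: Mapping[str, Any],
) -> Tuple[Optional[str], Optional[int]]:
    """Return (default_assignee, max_in_progress_per_profile).

    Each element is None when the option is unset or invalid.
    """
    # Empty or whitespace-only values mean "no fallback".
    default_assignee = (kanban_cfg.get("default_assignee") or "").strip() or None

    raw_cap = kanban_cfg.get("max_in_progress_per_profile")
    cap: Optional[int] = None
    if raw_cap is not None:
        try:
            cap = int(raw_cap)
        except (TypeError, ValueError):
            logger.warning("invalid max_in_progress_per_profile=%r; ignoring", raw_cap)
            cap = None
        else:
            if cap < 1:
                # Zero and negative values are treated as "no cap".
                logger.warning("max_in_progress_per_profile=%r is below 1; ignoring", raw_cap)
                cap = None
    return default_assignee, cap
```

Under these assumptions, `{"default_assignee": " default ", "max_in_progress_per_profile": "3"}` yields `("default", 3)`, while `0`, `-1`, `"abc"`, and a missing key all yield a cap of `None`.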
@@ -5509,6 +5554,8 @@ class GatewayRunner:
                    max_in_progress=max_in_progress,
                    failure_limit=failure_limit,
                    stale_timeout_seconds=stale_timeout_seconds,
+                    default_assignee=default_assignee,
+                    max_in_progress_per_profile=max_in_progress_per_profile,
                )
            except sqlite3.DatabaseError as exc:
                if _is_corrupt_board_db_error(exc):
@@ -6263,29 +6310,6 @@ class GatewayRunner:
                    # plugin adapters don't need a custom factory signature.
                    if hasattr(adapter, "gateway_runner"):
                        adapter.gateway_runner = self
-                    # ── Telegram: notification mode from config ──
-                    # Applied here (not in the adapter factory) because it
-                    # reads gateway-local config that only the gateway runner
-                    # has access to.
-                    if platform.value == "telegram":
-                        _notify_mode = os.getenv("HERMES_TELEGRAM_NOTIFICATIONS", "")
-                        if not _notify_mode:
-                            try:
-                                _gw_cfg = _load_gateway_config()
-                                _raw = cfg_get(_gw_cfg, "display", "platforms", "telegram", "notifications")
-                                if _raw not in {None, ""}:
-                                    _notify_mode = str(_raw).strip().lower()
-                            except Exception:
-                                pass
-                        _notify_mode = _notify_mode or "important"
-                        if _notify_mode not in {"all", "important"}:
-                            logger.warning(
-                                "Unknown telegram notifications mode '%s', "
-                                "defaulting to 'important' (valid: all, important)",
-                                _notify_mode,
-                            )
-                            _notify_mode = "important"
-                        adapter._notifications_mode = _notify_mode
                    return adapter
                # Registered but failed to instantiate — don't silently fall
                # through to built-ins (there are none for plugin platforms).
@@ -6299,13 +6323,49 @@ class GatewayRunner:
            logger.debug("Platform registry lookup for '%s' failed: %s", platform.value, e)
        # Fall through to built-in adapters below

-        if platform == Platform.WHATSAPP:
+        if platform == Platform.TELEGRAM:
+            from gateway.platforms.telegram import TelegramAdapter, check_telegram_requirements
+            if not check_telegram_requirements():
+                logger.warning("Telegram: python-telegram-bot not installed")
+                return None
+            adapter = TelegramAdapter(config)
+            # Apply Telegram notification mode from config.  Controls whether
+            # intermediate messages (tool progress, streaming, status) trigger
+            # push notifications.  Supports ENV override for quick testing.
+            _notify_mode = os.getenv("HERMES_TELEGRAM_NOTIFICATIONS", "")
+            if not _notify_mode:
+                try:
+                    _gw_cfg = _load_gateway_config()
+                    _raw = cfg_get(_gw_cfg, "display", "platforms", "telegram", "notifications")
+                    if _raw not in {None, ""}:
+                        _notify_mode = str(_raw).strip().lower()
+                except Exception:
+                    pass
+            _notify_mode = _notify_mode or "important"
+            if _notify_mode not in {"all", "important"}:
+                logger.warning(
+                    "Unknown telegram notifications mode '%s', "
+                    "defaulting to 'important' (valid: all, important)",
+                    _notify_mode,
+                )
+                _notify_mode = "important"
+            adapter._notifications_mode = _notify_mode
+            return adapter
+
+        elif platform == Platform.WHATSAPP:
            from gateway.platforms.whatsapp import WhatsAppAdapter, check_whatsapp_requirements
            if not check_whatsapp_requirements():
                logger.warning("WhatsApp: Node.js not installed or bridge not configured")
                return None
            return WhatsAppAdapter(config)
        
+        elif platform == Platform.SLACK:
+            from gateway.platforms.slack import SlackAdapter, check_slack_requirements
+            if not check_slack_requirements():
+                logger.warning("Slack: slack-bolt not installed. Run: pip install 'hermes-agent[slack]'")
+                return None
+            return SlackAdapter(config)
+
        elif platform == Platform.SIGNAL:
            from gateway.platforms.signal import SignalAdapter, check_signal_requirements
            if not check_signal_requirements():
@@ -6334,6 +6394,20 @@ class GatewayRunner:
                return None
            return SmsAdapter(config)

+        elif platform == Platform.DINGTALK:
+            from gateway.platforms.dingtalk import DingTalkAdapter, check_dingtalk_requirements
+            if not check_dingtalk_requirements():
+                logger.warning("DingTalk: dingtalk-stream not installed or DINGTALK_CLIENT_ID/SECRET not set")
+                return None
+            return DingTalkAdapter(config)
+
+        elif platform == Platform.FEISHU:
+            from gateway.platforms.feishu import FeishuAdapter, check_feishu_requirements
+            if not check_feishu_requirements():
+                logger.warning("Feishu: lark-oapi not installed or FEISHU_APP_ID/SECRET not set")
+                return None
+            return FeishuAdapter(config)
+
        elif platform == Platform.WECOM_CALLBACK:
            from gateway.platforms.wecom_callback import (
                WecomCallbackAdapter,
@@ -6358,6 +6432,13 @@ class GatewayRunner:
                return None
            return WeixinAdapter(config)

+        elif platform == Platform.MATRIX:
+            from gateway.platforms.matrix import MatrixAdapter, check_matrix_requirements
+            if not check_matrix_requirements():
+                logger.warning("Matrix: required packages not installed or credentials not set. Run: pip install 'mautrix[encryption]' asyncpg aiosqlite Markdown aiohttp-socks")
+                return None
+            return MatrixAdapter(config)
+
        elif platform == Platform.API_SERVER:
            from gateway.platforms.api_server import APIServerAdapter, check_api_server_requirements
            if not check_api_server_requirements():
@@ -11428,12 +11509,7 @@ class GatewayRunner:
        audio_path = None
        actual_path = None
        try:
-            from agent.plugin_registries import registries
-            _tts_entry = registries.get_tool_provider("tts")
-            if _tts_entry is None:
-                return
-            text_to_speech_tool = _tts_entry.tool_functions["text_to_speech_tool"]
-            _strip_markdown_for_tts = _tts_entry.tool_functions["_strip_markdown_for_tts"]
+            from tools.tts_tool import text_to_speech_tool, _strip_markdown_for_tts

            tts_text = _strip_markdown_for_tts(text[:4000])
            if not tts_text:
@@ -14721,32 +14797,9 @@ class GatewayRunner:
                return f"{prefix}\n\n{user_text}"
            return prefix

-        from agent.plugin_registries import registries
-        _stt_entry = registries.get_tool_provider("stt")
-        enriched_parts = []
-        if _stt_entry is None or "transcribe_audio" not in _stt_entry.tool_functions:
-            # No STT plugin registered — treat each audio path the same way
-            # as a "No STT provider" transcription failure.
-            for path in audio_paths:
-                abs_path = os.path.abspath(path)
-                duration_str = await _probe_audio_duration(abs_path)
-                if duration_str:
-                    enriched_parts.append(
-                        f"[The user sent a voice message: {abs_path} (duration: {duration_str})]"
-                    )
-                else:
-                    enriched_parts.append(f"[The user sent a voice message: {abs_path}]")
-            if not enriched_parts:
-                return user_text
-            prefix = "\n\n".join(enriched_parts)
-            _placeholder = "(The user sent a message with no text content)"
-            if user_text and user_text.strip() == _placeholder:
-                return prefix
-            if user_text:
-                return f"{prefix}\n\n{user_text}"
-            return prefix
-        transcribe_audio = _stt_entry.tool_functions["transcribe_audio"]
+        from tools.transcription_tools import transcribe_audio

+        enriched_parts = []
        for path in audio_paths:
            try:
                logger.debug("Transcribing user voice: %s", path)
@@ -14,8 +14,8 @@ Provides subcommands for:
 import os
 import sys

-__version__ = "0.15.0"
-__release_date__ = "2026.5.28"
+__version__ = "0.15.1"
+__release_date__ = "2026.5.29"


 def _ensure_utf8():
@@ -1597,10 +1597,8 @@ def resolve_provider(
    # AWS Bedrock — detect via boto3 credential chain (IAM roles, SSO, env vars).
    # This runs after API-key providers so explicit keys always win.
    try:
-        from agent.plugin_registries import registries
-        _bedrock_ns = registries.get_provider_namespace("bedrock")
-        has_aws_credentials = _bedrock_ns.get("has_aws_credentials")
-        if has_aws_credentials and has_aws_credentials():
+        from agent.bedrock_adapter import has_aws_credentials
+        if has_aws_credentials():
            return "bedrock"
    except ImportError:
        pass  # boto3 not installed — skip Bedrock auto-detection
@@ -6046,13 +6044,8 @@ def get_auth_status(provider_id: Optional[str] = None) -> Dict[str, Any]:
    # AWS SDK providers (Bedrock) — check via boto3 credential chain
    if pconfig and pconfig.auth_type == "aws_sdk":
        try:
-            from agent.plugin_registries import registries
-            _bedrock_ns = registries.get_provider_namespace("bedrock")
-            has_aws_credentials = _bedrock_ns.get("has_aws_credentials")
-            if has_aws_credentials:
-                return {"logged_in": has_aws_credentials(), "provider": target}
-            else:
-                return {"logged_in": False, "provider": target, "error": "boto3 not installed"}
+            from agent.bedrock_adapter import has_aws_credentials
+            return {"logged_in": has_aws_credentials(), "provider": target}
        except ImportError:
            return {"logged_in": False, "provider": target, "error": "boto3 not installed"}
    return {"logged_in": False}
@@ -6091,13 +6084,11 @@ def _get_azure_foundry_auth_status() -> Dict[str, Any]:

    if auth_mode == "entra_id":
        try:
-            from agent.plugin_registries import registries
-            _azure_ns = registries.get_provider_namespace("azure")
-            EntraIdentityConfig = _azure_ns.get("EntraIdentityConfig")
-            SCOPE_AI_AZURE_DEFAULT = _azure_ns.get("SCOPE_AI_AZURE_DEFAULT")
-            has_azure_identity_installed = _azure_ns.get("has_azure_identity_installed")
-            if not all([EntraIdentityConfig, SCOPE_AI_AZURE_DEFAULT, has_azure_identity_installed]):
-                raise ImportError("azure provider services not fully registered")
+            from agent.azure_identity_adapter import (
+                EntraIdentityConfig,
+                SCOPE_AI_AZURE_DEFAULT,
+                has_azure_identity_installed,
+            )
            installed = has_azure_identity_installed()
            entra_cfg = {}
            if isinstance(model_cfg, dict) and isinstance(model_cfg.get("entra"), dict):
@@ -221,12 +221,9 @@ def auth_add_command(args) -> None:
        return

    if provider == "anthropic":
-        from agent.plugin_registries import registries
-        _anthropic_ns = registries.get_provider_namespace("anthropic")
-        run_hermes_oauth_login_pure = _anthropic_ns.get("run_hermes_oauth_login_pure")
-        if not run_hermes_oauth_login_pure:
-            raise SystemExit("Anthropic plugin not loaded — cannot run OAuth login.")
-        creds = run_hermes_oauth_login_pure()
+        from agent import anthropic_adapter as anthropic_mod
+
+        creds = anthropic_mod.run_hermes_oauth_login_pure()
        if not creds:
            raise SystemExit("Anthropic OAuth login did not return credentials.")
        label = (getattr(args, "label", None) or "").strip() or label_from_token(
@@ -552,12 +549,8 @@ def _interactive_auth() -> None:

    # Show AWS Bedrock credential status (not in the pool — uses boto3 chain)
    try:
-        from agent.plugin_registries import registries
-        _bedrock = registries.get_provider_namespace("bedrock")
-        has_aws_credentials = _bedrock.get("has_aws_credentials")
-        resolve_aws_auth_env_var = _bedrock.get("resolve_aws_auth_env_var")
-        resolve_bedrock_region = _bedrock.get("resolve_bedrock_region")
-        if has_aws_credentials and has_aws_credentials():
+        from agent.bedrock_adapter import has_aws_credentials, resolve_aws_auth_env_var, resolve_bedrock_region
+        if has_aws_credentials():
            auth_source = resolve_aws_auth_env_var() or "unknown"
            region = resolve_bedrock_region()
            print(f"bedrock (AWS SDK credential chain):")
@@ -584,12 +577,12 @@ def _interactive_auth() -> None:
            _cfg_provider = str(_model_cfg.get("provider") or "").strip().lower()
            _cfg_auth_mode = str(_model_cfg.get("auth_mode") or "").strip().lower()
            if _cfg_provider == "azure-foundry" and _cfg_auth_mode == "entra_id":
-                from agent.plugin_registries import registries
-                _azure = registries.get_provider_namespace("azure")
-                EntraIdentityConfig = _azure.get("EntraIdentityConfig")
-                SCOPE_AI_AZURE_DEFAULT = _azure.get("SCOPE_AI_AZURE_DEFAULT")
-                describe_active_credential = _azure.get("describe_active_credential")
-                has_azure_identity_installed = _azure.get("has_azure_identity_installed")
+                from agent.azure_identity_adapter import (
+                    EntraIdentityConfig,
+                    SCOPE_AI_AZURE_DEFAULT,
+                    describe_active_credential,
+                    has_azure_identity_installed,
+                )
                _base_url = str(_model_cfg.get("base_url") or "").strip()
                _entra = _model_cfg.get("entra") or {}
                if not isinstance(_entra, dict):
@@ -1726,6 +1726,15 @@ DEFAULT_CONFIG = {
        # assignee to any installed profile. When unset, falls back to the
        # default profile. A task never ends up with assignee=None.
        "default_assignee": "",
+        # Per-profile concurrency cap (#21582). When set to a positive int,
+        # no single profile can have more than N workers running at once,
+        # even if the global max_in_progress / max_spawn caps would allow
+        # it. Tasks blocked this way defer to the next dispatcher tick.
+        # Unset (None) means "no per-profile cap" — backward-compatible
+        # with existing installs. Useful for fan-out workflows that would
+        # otherwise saturate one profile's local model / API quota /
+        # browser pool while leaving other profiles idle.
+        "max_in_progress_per_profile": None,
        # When true, the kanban dispatcher auto-runs the decomposer on
        # tasks that land in Triage (every dispatcher tick). When false,
        # decomposition is manual via `hermes kanban decompose <id>` or
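How the per-profile cap and the `default_assignee` fallback interact within one dispatcher tick can be sketched as below. This is an illustrative model based on the behavior described in the commit message, not the dispatcher's actual code: `plan_spawns`, its parameters, and its return shape are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Tuple


def plan_spawns(
    ready: Iterable[Tuple[str, Optional[str]]],
    running_by_assignee: Dict[str, int],
    default_assignee: Optional[str] = None,
    per_profile_cap: Optional[int] = None,
) -> Tuple[List[Tuple[str, str]], List[str], List[Tuple[str, str, int]]]:
    """Decide which ready tasks to spawn in one tick (hypothetical model).

    ready: (task_id, assignee-or-None) pairs in dispatch order.
    running_by_assignee: counts of tasks already running, taken at tick start.
    Returns (spawn, skipped_unassigned, skipped_per_profile_capped).
    """
    counts = Counter(running_by_assignee)
    spawn: List[Tuple[str, str]] = []
    skipped_unassigned: List[str] = []
    capped: List[Tuple[str, str, int]] = []

    for task_id, assignee in ready:
        # Fall back to the configured default when the task has no assignee.
        assignee = assignee or default_assignee
        if not assignee:
            skipped_unassigned.append(task_id)
            continue
        if per_profile_cap is not None and counts[assignee] >= per_profile_cap:
            # Deferred, not failed: retried on a later tick.
            capped.append((task_id, assignee, counts[assignee]))
            continue
        # Count the would-be spawn so later tasks in this tick see it.
        counts[assignee] += 1
        spawn.append((task_id, assignee))
    return spawn, skipped_unassigned, capped


ready_tasks = [("t1", "a"), ("t2", "a"), ("t3", None), ("t4", "b")]
result_no_default = plan_spawns(ready_tasks, {"a": 1}, per_profile_cap=2)
result_with_default = plan_spawns(ready_tasks, {"a": 1}, default_assignee="a", per_profile_cap=2)
```

Incrementing the in-memory counter on each planned spawn is what lets a dry run report the same balanced subset a real tick would produce, since no database write is needed to see the updated count.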
@@ -5551,6 +5560,8 @@ def set_config_value(key: str, value: str):
        "terminal.daytona_image": "TERMINAL_DAYTONA_IMAGE",
        "terminal.docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
        "terminal.docker_run_as_host_user": "TERMINAL_DOCKER_RUN_AS_HOST_USER",
+        "terminal.docker_persist_across_processes": "TERMINAL_DOCKER_PERSIST_ACROSS_PROCESSES",
+        "terminal.docker_orphan_reaper": "TERMINAL_DOCKER_ORPHAN_REAPER",
        "terminal.docker_env": "TERMINAL_DOCKER_ENV",
        # terminal.cwd intentionally excluded — CLI resolves at runtime,
        # gateway bridges it in gateway/run.py. Persisting to .env causes
@@ -28,6 +28,7 @@ from hermes_cli.models import _HERMES_USER_AGENT
 from hermes_constants import OPENROUTER_MODELS_URL
 from utils import base_url_host_matches

+
 _PROVIDER_ENV_HINTS = (
    "OPENROUTER_API_KEY",
    "OPENAI_API_KEY",
@@ -53,11 +54,14 @@ _PROVIDER_ENV_HINTS = (
    "TOKENHUB_API_KEY",
 )

+
 from hermes_constants import is_termux as _is_termux

+
 def _python_install_cmd() -> str:
    return "python -m pip install" if _is_termux() else "uv pip install"

+
 def _system_package_install_cmd(pkg: str) -> str:
    if _is_termux():
        return f"pkg install {pkg}"
@@ -65,6 +69,7 @@ def _system_package_install_cmd(pkg: str) -> str:
        return f"brew install {pkg}"
    return f"sudo apt install {pkg}"

+
 def _safe_which(cmd: str) -> str | None:
    """shutil.which wrapper resilient to platform monkeypatching in tests."""
    try:
@@ -72,6 +77,7 @@ def _safe_which(cmd: str) -> str | None:
    except Exception:
        return None

+
 def _termux_browser_setup_steps(node_installed: bool) -> list[str]:
    steps: list[str] = []
    step = 1
@@ -82,6 +88,7 @@ def _termux_browser_setup_steps(node_installed: bool) -> list[str]:
    steps.append(f"{step + 1}) agent-browser install")
    return steps

+
 def _termux_install_all_fallback_notes() -> list[str]:
    return [
        "Termux install profile: use .[termux-all] for broad compatibility (installer default on Termux).",
@@ -90,10 +97,12 @@ def _termux_install_all_fallback_notes() -> list[str]:
        "STT fallback: use Groq Whisper (set GROQ_API_KEY) or OpenAI Whisper (set VOICE_TOOLS_OPENAI_KEY).",
    ]

+
 def _has_provider_env_config(content: str) -> bool:
    """Return True when ~/.hermes/.env contains provider auth/base URL settings."""
    return any(key in content for key in _PROVIDER_ENV_HINTS)

+
 def _honcho_is_configured_for_doctor() -> bool:
    """Return True when Honcho is configured, even if this process has no active session."""
    try:
@@ -104,6 +113,7 @@ def _honcho_is_configured_for_doctor() -> bool:
    except Exception:
        return False

+
 def _is_kanban_worker_env_gate(item: dict) -> bool:
    """Return True when Kanban is unavailable only because this is not a worker process."""
    if item.get("name") != "kanban":
@@ -114,12 +124,14 @@ def _is_kanban_worker_env_gate(item: dict) -> bool:
    tools = item.get("tools") or []
    return bool(tools) and all(str(tool).startswith("kanban_") for tool in tools)

+
 def _doctor_tool_availability_detail(toolset: str) -> str:
    """Optional explanatory suffix for toolsets whose doctor status needs context."""
    if toolset == "kanban" and not os.environ.get("HERMES_KANBAN_TASK"):
        return "(runtime-gated; loaded only for dispatcher-spawned workers)"
    return ""

+
 def _apply_doctor_tool_availability_overrides(available: list[str], unavailable: list[dict]) -> tuple[list[str], list[dict]]:
    """Adjust runtime-gated tool availability for doctor diagnostics."""
    updated_available = list(available)
@@ -137,6 +149,7 @@ def _apply_doctor_tool_availability_overrides(available: list[str], unavailable:
        updated_unavailable.append(item)
    return updated_available, updated_unavailable

+
 def _has_healthy_oauth_fallback_for_apikey_provider(provider_label: str) -> bool:
    """Return True when a direct API-key probe failure is non-blocking.

@@ -166,6 +179,7 @@ def _has_healthy_oauth_fallback_for_apikey_provider(provider_label: str) -> bool
            return False
    return False

+
 def check_ok(text: str, detail: str = ""):
    print(f"  {color('✓', Colors.GREEN)} {text}" + (f" {color(detail, Colors.DIM)}" if detail else ""))

@@ -178,16 +192,19 @@ def check_fail(text: str, detail: str = ""):
 def check_info(text: str):
    print(f"    {color('→', Colors.CYAN)} {text}")

+
 def _section(title: str) -> None:
    """Print a doctor section banner: blank line + bold cyan ◆ title."""
    print()
    print(color(f"◆ {title}", Colors.CYAN, Colors.BOLD))

+
 def _fail_and_issue(text: str, detail: str, fix: str, issues: list[str]) -> None:
    """Emit a check_fail and append the corresponding fix instruction."""
    check_fail(text, detail)
    issues.append(fix)

+
 def _check_s6_supervision(issues: list[str]) -> None:
    """Inside a container under our s6 /init, surface what s6 sees.

@@ -235,6 +252,7 @@ def _check_s6_supervision(issues: list[str]) -> None:
        + (f" ({', '.join(sorted(profiles))})" if len(profiles) <= 8 else "")
    )

+
 def _check_gateway_service_linger(issues: list[str]) -> None:
    """Warn when a systemd user gateway service will stop after logout.

@@ -278,8 +296,10 @@ def _check_gateway_service_linger(issues: list[str]) -> None:
    else:
        check_warn("Could not verify systemd linger", f"({linger_detail})")

+
 _APIKEY_PROVIDERS_CACHE: list | None = None

+
 def _build_apikey_providers_list() -> list:
    """Build the API-key provider health-check list once and cache it.

@@ -371,6 +391,7 @@ def _build_apikey_providers_list() -> list:
        pass
    return _static

+
 def run_doctor(args):
    """Run diagnostic checks."""
    should_fix = getattr(args, 'fix', False)
@@ -1454,15 +1475,12 @@ def run_doctor(args):
            return _ConnectivityResult("Anthropic API", [], [])
        try:
            import httpx
-            from agent.plugin_registries import registries
-            _anthropic_ns = registries.get_provider_namespace("anthropic")
-            _is_oauth_token = _anthropic_ns.get("_is_oauth_token")
-            # _COMMON_BETAS and _CONTEXT_1M_BETA are now in core
-            from agent.anthropic_format import _COMMON_BETAS, _CONTEXT_1M_BETA
-            _OAUTH_ONLY_BETAS = _anthropic_ns.get("_OAUTH_ONLY_BETAS")
-
-            if not all([_is_oauth_token, _OAUTH_ONLY_BETAS]):
-                raise ImportError("anthropic provider services not fully registered")
+            from agent.anthropic_adapter import (
+                _is_oauth_token,
+                _COMMON_BETAS,
+                _OAUTH_ONLY_BETAS,
+                _CONTEXT_1M_BETA,
+            )
            headers = {"anthropic-version": "2023-06-01"}
            is_oauth = _is_oauth_token(key)
            if is_oauth:
@@ -1606,13 +1624,11 @@ def run_doctor(args):

    def _probe_bedrock() -> _ConnectivityResult:
        try:
-            from agent.plugin_registries import registries
-            _bedrock_ns = registries.get_provider_namespace("bedrock")
-            has_aws_credentials = _bedrock_ns.get("has_aws_credentials")
-            resolve_aws_auth_env_var = _bedrock_ns.get("resolve_aws_auth_env_var")
-            resolve_bedrock_region = _bedrock_ns.get("resolve_bedrock_region")
-            if not all([has_aws_credentials, resolve_aws_auth_env_var, resolve_bedrock_region]):
-                raise ImportError("bedrock provider services not fully registered")
+            from agent.bedrock_adapter import (
+                has_aws_credentials,
+                resolve_aws_auth_env_var,
+                resolve_bedrock_region,
+            )
        except ImportError:
            return _ConnectivityResult("AWS Bedrock", [], [])
        if not has_aws_credentials():
@@ -1683,14 +1699,12 @@ def run_doctor(args):
            return _ConnectivityResult("Azure Foundry (Entra ID)", [], [])

        try:
-            from agent.plugin_registries import registries
-            _azure_ns = registries.get_provider_namespace("azure")
-            EntraIdentityConfig = _azure_ns.get("EntraIdentityConfig")
-            SCOPE_AI_AZURE_DEFAULT = _azure_ns.get("SCOPE_AI_AZURE_DEFAULT")
-            describe_active_credential = _azure_ns.get("describe_active_credential")
-            has_azure_identity_installed = _azure_ns.get("has_azure_identity_installed")
-            if not all([EntraIdentityConfig, SCOPE_AI_AZURE_DEFAULT, describe_active_credential, has_azure_identity_installed]):
-                raise ImportError("azure provider services not fully registered")
+            from agent.azure_identity_adapter import (
+                EntraIdentityConfig,
+                SCOPE_AI_AZURE_DEFAULT,
+                describe_active_credential,
+                has_azure_identity_installed,
+            )
        except Exception as exc:
            return _ConnectivityResult(
                "Azure Foundry (Entra ID)",
@@ -4370,9 +4370,7 @@ def _setup_feishu():
    if method_idx == 0:
        # ── QR scan-to-create ──
        try:
-            from agent.plugin_registries import registries
-            _feishu_entry = registries.get_platform("feishu")
-            qr_register = _feishu_entry.helper_functions.get("qr_register") if _feishu_entry else None
+            from gateway.platforms.feishu import qr_register
        except Exception as exc:
            print_error(f"  Feishu / Lark onboard import failed: {exc}")
            qr_register = None
@@ -4413,13 +4411,8 @@ def _setup_feishu():
        # Try to probe the bot with manual credentials
        bot_name = None
        try:
-            from agent.plugin_registries import registries
-            _feishu_entry = registries.get_platform("feishu")
-            probe_bot = _feishu_entry.helper_functions.get("probe_bot") if _feishu_entry else None
-            if probe_bot:
-                bot_info = probe_bot(app_id, app_secret, domain)
-            else:
-                bot_info = None
+            from gateway.platforms.feishu import probe_bot
+            bot_info = probe_bot(app_id, app_secret, domain)
            if bot_info:
                bot_name = bot_info.get("bot_name")
                print_success(f"  Credentials verified — bot: {bot_name or 'unnamed'}")
@@ -2087,12 +2087,35 @@ def _cmd_tail(args: argparse.Namespace) -> int:


 def _cmd_dispatch(args: argparse.Namespace) -> int:
+    # Honour kanban.default_assignee as the fallback for unassigned ready
+    # tasks (#27145) and kanban.max_in_progress_per_profile as the
+    # per-profile concurrency cap (#21582). Same semantics as the
+    # gateway dispatch path.
+    try:
+        from hermes_cli.config import load_config
+        _cfg = load_config()
+        _kanban_cfg = _cfg.get("kanban", {}) if isinstance(_cfg, dict) else {}
+        default_assignee = (_kanban_cfg.get("default_assignee") or "").strip() or None
+        _raw_per_profile = _kanban_cfg.get("max_in_progress_per_profile", None)
+        try:
+            max_in_progress_per_profile = (
+                int(_raw_per_profile) if _raw_per_profile is not None else None
+            )
+            if max_in_progress_per_profile is not None and max_in_progress_per_profile < 1:
+                max_in_progress_per_profile = None
+        except (TypeError, ValueError):
+            max_in_progress_per_profile = None
+    except Exception:
+        default_assignee = None
+        max_in_progress_per_profile = None
    with kb.connect_closing() as conn:
        res = kb.dispatch_once(
            conn,
            dry_run=args.dry_run,
            max_spawn=args.max,
            failure_limit=getattr(args, "failure_limit", kb.DEFAULT_SPAWN_FAILURE_LIMIT),
+            default_assignee=default_assignee,
+            max_in_progress_per_profile=max_in_progress_per_profile,
        )
    if getattr(args, "json", False):
        print(json.dumps({
@@ -2108,6 +2131,11 @@ def _cmd_dispatch(args: argparse.Namespace) -> int:
            ],
            "skipped_unassigned": res.skipped_unassigned,
            "skipped_nonspawnable": res.skipped_nonspawnable,
+            "skipped_per_profile_capped": [
+                {"task_id": tid, "assignee": who, "current": current}
+                for (tid, who, current) in res.skipped_per_profile_capped
+            ],
+            "auto_assigned_default": res.auto_assigned_default,
        }, indent=2))
        return 0
    print(f"Reclaimed:    {res.reclaimed}")
@@ -2128,8 +2156,18 @@ def _cmd_dispatch(args: argparse.Namespace) -> int:
    for tid, who, ws in res.spawned:
        tag = " (dry)" if args.dry_run else ""
        print(f"  - {tid}  ->  {who}  @ {ws or '-'}{tag}")
+    if res.auto_assigned_default:
+        print(
+            f"Auto-assigned to kanban.default_assignee={default_assignee!r}: "
+            f"{', '.join(res.auto_assigned_default)}"
+        )
    if res.skipped_unassigned:
        print(f"Skipped (unassigned): {', '.join(res.skipped_unassigned)}")
+    if res.skipped_per_profile_capped:
+        for tid, who, current in res.skipped_per_profile_capped:
+            print(
+                f"Deferred ({who} at per-profile cap, {current} running): {tid}"
+            )
    if res.skipped_nonspawnable:
        print(
            f"Skipped (non-spawnable assignee — terminal lane, OK): "
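The config handling at the top of `_cmd_dispatch` can be read as two small normalizers. The sketch below is illustrative only: `normalize_default_assignee` and `normalize_per_profile_cap` are hypothetical names, not functions in this codebase. It assumes the `kanban` config section is a plain dict and, unlike the inline code, treats a non-string assignee as unset instead of falling through to the outer `except`.

```python
from typing import Any, Optional


def normalize_default_assignee(raw: Any) -> Optional[str]:
    """Hypothetical helper: empty or whitespace-only values mean 'no fallback'."""
    if not isinstance(raw, str):
        return None  # assumption: non-string values are treated as unset
    return raw.strip() or None


def normalize_per_profile_cap(raw: Any) -> Optional[int]:
    """Hypothetical helper: anything that is not a positive int means 'no cap'."""
    if raw is None:
        return None
    try:
        cap = int(raw)
    except (TypeError, ValueError):
        return None
    return cap if cap >= 1 else None


kanban_cfg = {"default_assignee": "  default ", "max_in_progress_per_profile": "3"}
print(normalize_default_assignee(kanban_cfg.get("default_assignee")))
print(normalize_per_profile_cap(kanban_cfg.get("max_in_progress_per_profile")))
```

Collapsing every invalid value to `None` lets the dispatcher keep a single "feature off" check per setting, which is what preserves the pre-existing behaviour for installs that never set either key.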
@@ -4289,6 +4289,12 @@ class DispatchResult:
    skipped_unassigned: list[str] = field(default_factory=list)
    """Ready task ids skipped because they have no assignee at all.
    Operator-actionable — usually a misfiled task waiting for routing."""
+    auto_assigned_default: list[str] = field(default_factory=list)
+    """Task ids that were unassigned in the DB and had
+    ``kanban.default_assignee`` applied this tick before spawning (#27145).
+    Surfaces the auto-assignment to telemetry / CLI / dashboard so the
+    operator can see when the dispatcher is acting on the fallback rule
+    rather than on explicit per-task assignments."""
    skipped_nonspawnable: list[str] = field(default_factory=list)
    """Ready task ids skipped because their assignee names a control-plane
    lane (a Claude Code terminal like ``orion-cc``) rather than a Hermes
@@ -4296,6 +4302,14 @@ class DispatchResult:
    operator-actionable failure. Tracked separately so health telemetry
    can distinguish "real stuck" (nothing spawned but spawnable work
    available) from "correctly idle" (nothing spawnable in the queue)."""
+    skipped_per_profile_capped: list[tuple[str, str, int]] = field(default_factory=list)
+    """Tasks deferred this tick because their assignee is already at
+    ``kanban.max_in_progress_per_profile`` (#21582). Each entry is
+    ``(task_id, assignee, current_running_count)``. NOT an
+    operator-actionable failure — the task will be picked up on a
+    subsequent tick when the assignee has capacity. Separate bucket so
+    telemetry / dashboards can show "this profile is busy" vs
+    "task is genuinely stuck"."""
    crashed: list[str] = field(default_factory=list)
    """Task ids reclaimed because their worker PID disappeared."""
    auto_blocked: list[str] = field(default_factory=list)
@@ -5342,6 +5356,8 @@ def dispatch_once(
    failure_limit: int = DEFAULT_SPAWN_FAILURE_LIMIT,
    stale_timeout_seconds: int = 0,
    board: Optional[str] = None,
+    default_assignee: Optional[str] = None,
+    max_in_progress_per_profile: Optional[int] = None,
 ) -> DispatchResult:
    """Run one dispatcher tick.

@@ -5427,12 +5443,89 @@ def dispatch_once(
        if max_spawn is None or max_spawn > remaining:
            max_spawn = remaining
    spawned = 0
+    # Per-profile concurrency cap (#21582): when set, track how many
+    # workers each assignee already has in flight, and refuse to spawn
+    # when this would push that assignee past the cap. Prevents
+    # fan-out workloads from melting a single profile's local model /
+    # API quota / browser pool while leaving other profiles idle.
+    # Tasks blocked this way go to skipped_per_profile_capped, not
+    # skipped_unassigned, because the signal differs: "this profile is
+    # busy, try again later" rather than "this needs routing".
+    _per_profile_cap = max_in_progress_per_profile if (
+        isinstance(max_in_progress_per_profile, int)
+        and max_in_progress_per_profile > 0
+    ) else None
+    _per_profile_running: dict[str, int] = {}
+    if _per_profile_cap is not None:
+        for prow in conn.execute(
+            "SELECT assignee, COUNT(*) AS n FROM tasks "
+            "WHERE status = 'running' AND assignee IS NOT NULL "
+            "GROUP BY assignee"
+        ):
+            _per_profile_running[prow["assignee"]] = int(prow["n"])
+    # Normalize default_assignee once: empty/whitespace string → None so the
+    # rest of the loop can use ``if default_assignee:`` as a single check.
+    # profile_exists for the fallback is likewise checked once here, not per row.
+    _default_assignee = (default_assignee or "").strip() or None
+    _default_assignee_resolved = False
+    if _default_assignee:
+        try:
+            from hermes_cli.profiles import profile_exists as _pe
+            _default_assignee_resolved = bool(_pe(_default_assignee))
+        except Exception:
+            # Profiles module not importable (test stubs, exotic envs).
+            # Trust the operator's config and try the assignment; the
+            # downstream profile_exists check on the assigned row will
+            # bucket it as nonspawnable if the profile genuinely isn't
+            # there, with the existing diagnostic.
+            _default_assignee_resolved = True
    for row in ready_rows:
        if max_spawn is not None and running_count + spawned >= max_spawn:
            break
-        if not row["assignee"]:
-            result.skipped_unassigned.append(row["id"])
-            continue
+        row_assignee = row["assignee"]
+        if not row_assignee:
+            # Honour kanban.default_assignee: when the dispatcher hits an
+            # unassigned ready task and an operator-configured fallback
+            # exists, persist the assignment and proceed. This removes the
+            # dashboard footgun where a task created without an assignee
+            # parks in 'ready' forever even though the operator's intent
+            # ("default") was perfectly clear (#27145). Mutating the row
+            # (not just the in-memory view) keeps diagnostics and the
+            # board state consistent: the task is now legitimately owned
+            # by ``kanban.default_assignee``, not "unassigned but secretly
+            # routed".
+            if _default_assignee and _default_assignee_resolved:
+                # Dry-run: show what WOULD happen (auto-assign + spawn) without
+                # mutating the DB. Real run: mutate the row + emit the
+                # 'assigned' event so the board state matches what just happened.
+                if not dry_run:
+                    try:
+                        with write_txn(conn):
+                            conn.execute(
+                                "UPDATE tasks SET assignee = ? WHERE id = ? "
+                                "AND (assignee IS NULL OR assignee = '')",
+                                (_default_assignee, row["id"]),
+                            )
+                            _append_event(
+                                conn, row["id"], "assigned",
+                                {
+                                    "assignee": _default_assignee,
+                                    "source": "kanban.default_assignee",
+                                },
+                            )
+                    except Exception:
+                        _log.debug(
+                            "kanban dispatch: failed to apply default_assignee=%r "
+                            "to task %s",
+                            _default_assignee, row["id"], exc_info=True,
+                        )
+                        result.skipped_unassigned.append(row["id"])
+                        continue
+                row_assignee = _default_assignee
+                result.auto_assigned_default.append(row["id"])
+            else:
+                result.skipped_unassigned.append(row["id"])
+                continue
        # Skip ready tasks whose assignee is not a real Hermes profile.
        # `_default_spawn` invokes ``hermes -p <assignee>`` which fails
        # with "Profile 'X' does not exist" when the assignee names a
@@ -5447,7 +5540,7 @@ def dispatch_once(
            from hermes_cli.profiles import profile_exists  # local import: avoids cycle
        except Exception:
            profile_exists = None  # type: ignore[assignment]
-        if profile_exists is not None and not profile_exists(row["assignee"]):
+        if profile_exists is not None and not profile_exists(row_assignee):
            # Bucket separately from skipped_unassigned: the operator
            # cannot fix this by assigning a profile (the assignee IS the
            # intended owner — a terminal lane). Health telemetry uses
@@ -5456,6 +5549,19 @@ def dispatch_once(
            # of human-pulled work.
            result.skipped_nonspawnable.append(row["id"])
            continue
+        # Per-profile concurrency cap (#21582): even if there's global
+        # headroom, refuse to spawn for an assignee that's already at
+        # its in-flight cap. Prevents one profile's local model / API
+        # quota / browser pool from being overwhelmed by a fan-out
+        # while the global max_in_progress / max_spawn caps still allow
+        # work on OTHER profiles.
+        if _per_profile_cap is not None:
+            current = _per_profile_running.get(row_assignee, 0)
+            if current >= _per_profile_cap:
+                result.skipped_per_profile_capped.append(
+                    (row["id"], row_assignee, current)
+                )
+                continue
        # Respawn guard: refuse to re-spawn when useful work is already
        # in-flight/recent, or when the last failure is a deterministic
        # blocker (quota / auth). The guard defers the spawn this tick so
@@ -5478,7 +5584,15 @@ def dispatch_once(
                    )
            continue
        if dry_run:
-            result.spawned.append((row["id"], row["assignee"], ""))
+            result.spawned.append((row["id"], row_assignee, ""))
+            # Increment per-profile counter even in dry_run so the cap
+            # check sees the would-be spawn on subsequent iterations.
+            # Without this, dry_run reports every task as spawnable and
+            # under-reports the capped subset (#21582).
+            if _per_profile_cap is not None and row_assignee:
+                _per_profile_running[row_assignee] = (
+                    _per_profile_running.get(row_assignee, 0) + 1
+                )
            continue
        claimed = claim_task(conn, row["id"], ttl_seconds=ttl_seconds)
        if claimed is None:
@@ -5521,6 +5635,13 @@ def dispatch_once(
            # complete_task).
            result.spawned.append((claimed.id, claimed.assignee or "", str(workspace)))
            spawned += 1
+            # Track the new in-flight count for this profile so later
+            # iterations in this same tick respect the per-profile cap
+            # (#21582). Subsequent ticks re-query from the DB.
+            if _per_profile_cap is not None and claimed.assignee:
+                _per_profile_running[claimed.assignee] = (
+                    _per_profile_running.get(claimed.assignee, 0) + 1
+                )
        except Exception as exc:
            auto = _record_spawn_failure(
                conn, claimed.id, str(exc),
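The per-profile cap bookkeeping in `dispatch_once` can be sketched in isolation. This is a simplified model, not the dispatcher itself: `plan_spawns` is a hypothetical name, the database snapshot is replaced by a plain dict of running counts, and the claim, respawn-guard, and global `max_spawn` steps are left out.

```python
from typing import Optional


def plan_spawns(
    ready: list[tuple[str, str]],
    running: dict[str, int],
    cap: Optional[int],
) -> tuple[list[tuple[str, str]], list[tuple[str, str, int]]]:
    """Split ready (task_id, assignee) pairs into spawnable and deferred."""
    # Invalid caps (non-int, zero, negative) disable the check entirely.
    if not (isinstance(cap, int) and cap > 0):
        cap = None
    counts = dict(running)  # copy so the caller's snapshot is untouched
    spawned: list[tuple[str, str]] = []
    deferred: list[tuple[str, str, int]] = []
    for task_id, assignee in ready:
        current = counts.get(assignee, 0)
        if cap is not None and current >= cap:
            # Same tuple shape as skipped_per_profile_capped.
            deferred.append((task_id, assignee, current))
            continue
        spawned.append((task_id, assignee))
        # Count the new worker so later rows in this tick see it.
        counts[assignee] = current + 1
    return spawned, deferred


ready = [("t1", "alice"), ("t2", "alice"), ("t3", "bob"), ("t4", "alice")]
spawned, deferred = plan_spawns(ready, {"alice": 1}, cap=2)
print(spawned)   # [('t1', 'alice'), ('t3', 'bob')]
print(deferred)  # [('t2', 'alice', 2), ('t4', 'alice', 2)]
```

Because the counter is bumped on every would-be spawn, the same loop gives identical answers for a dry run and a real tick. That is the property the dry-run branch in the patch is protecting.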
@@ -622,12 +622,10 @@ def _has_any_provider_configured() -> bool:
    # being installed doesn't mean the user wants Hermes to use their tokens.
    if _has_hermes_config:
        try:
-            from agent.plugin_registries import registries
-            _anthropic = registries.get_provider_namespace("anthropic")
-            read_claude_code_credentials = _anthropic.get("read_claude_code_credentials")
-            is_claude_code_token_valid = _anthropic.get("is_claude_code_token_valid")
-            if read_claude_code_credentials is None or is_claude_code_token_valid is None:
-                raise ImportError("anthropic plugin not registered")
+            from agent.anthropic_adapter import (
+                read_claude_code_credentials,
+                is_claude_code_token_valid,
+            )

            creds = read_claude_code_credentials()
            if creds and (
@@ -4106,15 +4104,13 @@ def _model_flow_azure_foundry(config, current_model=""):

    if use_entra:
        try:
-            from agent.plugin_registries import registries
-            _azure = registries.get_provider_namespace("azure")
-            EntraIdentityConfig = _azure.get("EntraIdentityConfig")
-            SCOPE_AI_AZURE_DEFAULT = _azure.get("SCOPE_AI_AZURE_DEFAULT")
-            build_token_provider = _azure.get("build_token_provider")
-            describe_active_credential = _azure.get("describe_active_credential")
-            has_azure_identity_installed = _azure.get("has_azure_identity_installed")
-            if any(v is None for v in [EntraIdentityConfig, SCOPE_AI_AZURE_DEFAULT, build_token_provider, describe_active_credential, has_azure_identity_installed]):
-                raise ImportError("azure plugin not registered")
+            from agent.azure_identity_adapter import (
+                EntraIdentityConfig,
+                SCOPE_AI_AZURE_DEFAULT,
+                build_token_provider,
+                describe_active_credential,
+                has_azure_identity_installed,
+            )
        except ImportError as exc:
            print()
            print(f"⚠ Could not import azure-identity adapter: {exc}")
@@ -5428,14 +5424,12 @@ def _model_flow_bedrock(config, current_model=""):

    # 1. Check for AWS credentials
    try:
-        from agent.plugin_registries import registries
-        _bedrock = registries.get_provider_namespace("bedrock")
-        has_aws_credentials = _bedrock.get("has_aws_credentials")
-        resolve_aws_auth_env_var = _bedrock.get("resolve_aws_auth_env_var")
-        resolve_bedrock_region = _bedrock.get("resolve_bedrock_region")
-        discover_bedrock_models = _bedrock.get("discover_bedrock_models")
-        if any(v is None for v in [has_aws_credentials, resolve_aws_auth_env_var, resolve_bedrock_region, discover_bedrock_models]):
-            raise ImportError("bedrock plugin not registered")
+        from agent.bedrock_adapter import (
+            has_aws_credentials,
+            resolve_aws_auth_env_var,
+            resolve_bedrock_region,
+            discover_bedrock_models,
+        )
    except ImportError:
        print("  ✗ boto3 is not installed. Install it with:")
        print("    pip install boto3")
@@ -5883,13 +5877,11 @@ def _model_flow_api_key_provider(config, provider_id, current_model=""):

 def _run_anthropic_oauth_flow(save_env_value):
    """Run the Claude OAuth setup-token flow. Returns True if credentials were saved."""
-    from agent.plugin_registries import registries
-    _anthropic = registries.get_provider_namespace("anthropic")
-    run_oauth_setup_token = _anthropic.get("run_oauth_setup_token")
-    read_claude_code_credentials = _anthropic.get("read_claude_code_credentials")
-    is_claude_code_token_valid = _anthropic.get("is_claude_code_token_valid")
-    if run_oauth_setup_token is None:
-        raise ImportError("anthropic plugin not registered")
+    from agent.anthropic_adapter import (
+        run_oauth_setup_token,
+        read_claude_code_credentials,
+        is_claude_code_token_valid,
+    )
    from hermes_cli.config import (
        save_anthropic_oauth_token,
        use_anthropic_claude_code_credentials,
@@ -5997,13 +5989,11 @@ def _model_flow_anthropic(config, current_model=""):
    existing_key = get_anthropic_key()
    cc_available = False
    try:
-        from agent.plugin_registries import registries
-        _anthropic = registries.get_provider_namespace("anthropic")
-        read_claude_code_credentials = _anthropic.get("read_claude_code_credentials")
-        is_claude_code_token_valid = _anthropic.get("is_claude_code_token_valid")
-        _is_oauth_token = _anthropic.get("_is_oauth_token")
-        if any(v is None for v in [read_claude_code_credentials, is_claude_code_token_valid, _is_oauth_token]):
-            raise ImportError("anthropic plugin not registered")
+        from agent.anthropic_adapter import (
+            read_claude_code_credentials,
+            is_claude_code_token_valid,
+            _is_oauth_token,
+        )

        cc_creds = read_claude_code_credentials()
        if cc_creds and is_claude_code_token_valid(cc_creds):
@@ -8120,13 +8110,71 @@ def _cleanup_quarantined_exes(scripts_dir: Path | None = None) -> None:


 def _refresh_active_lazy_features() -> None:
-    """No-op — lazy deps removed.
+    """Refresh lazy-installed backends after a code update.

-    Optional backends are now proper plugin packages (hermes-agent-anthropic,
-    hermes-agent-telegram, etc.) installed via extras. ``hermes update``
-    refreshes them through ``uv pip install -e .[all]`` like any other dep.
+    When pyproject.toml's ``[all]`` extra was slimmed down (May 2026), most
+    optional backends moved to ``tools/lazy_deps.py`` and only install on
+    first use. ``hermes update`` runs ``uv pip install -e .[all]`` which
+    leaves those packages untouched — so if we bump a pin in
+    :data:`LAZY_DEPS` (CVE response, transitive bug fix), users who already
+    activated the backend keep the stale version forever.
+
+    This function asks lazy_deps which features the user has previously
+    activated and reinstalls them under the current pins. Features the
+    user never enabled stay quiet — no churn for cold backends.
+
+    Never raises. A failure here must not block the rest of the update.
    """
-    pass
+    try:
+        from tools import lazy_deps
+    except Exception as exc:
+        logger.debug("Lazy refresh skipped (import failed): %s", exc)
+        return
+
+    try:
+        active = lazy_deps.active_features()
+    except Exception as exc:
+        logger.debug("Lazy refresh skipped (active_features failed): %s", exc)
+        return
+
+    if not active:
+        return
+
+    print()
+    print(f"→ Refreshing {len(active)} active lazy backend(s)...")
+
+    try:
+        results = lazy_deps.refresh_active_features(prompt=False)
+    except Exception as exc:
+        # refresh_active_features is documented as never-raise, but defend
+        # the update flow against future regressions.
+        print(f"  ⚠ Lazy refresh failed unexpectedly: {exc}")
+        return
+
+    refreshed = [f for f, s in results.items() if s == "refreshed"]
+    current = [f for f, s in results.items() if s == "current"]
+    failed = [(f, s) for f, s in results.items() if s.startswith("failed:")]
+    skipped = [(f, s) for f, s in results.items() if s.startswith("skipped:")]
+
+    if refreshed:
+        print(f"  ↑ {len(refreshed)} refreshed: {', '.join(refreshed)}")
+    if current:
+        print(f"  ✓ {len(current)} already current")
+    if skipped:
+        # Most common reason: security.allow_lazy_installs=false. Show one
+        # line so the user knows why; not an error.
+        names = ", ".join(f for f, _ in skipped)
+        reason = skipped[0][1].split(": ", 1)[-1]
+        print(f"  · {len(skipped)} skipped ({reason}): {names}")
+    if failed:
+        for feature, status in failed:
+            reason = status.split(": ", 1)[-1]
+            # Clip noisy pip stderr to keep update output legible.
+            if len(reason) > 200:
+                reason = reason[:200] + "..."
+            print(f"  ⚠ {feature} failed to refresh: {reason}")
+        print("  Backends keep their previously-installed version; rerun")
+        print("  `hermes update` once the upstream issue is resolved.")


 def _install_python_dependencies_with_optional_fallback(
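The status handling in `_refresh_active_lazy_features` amounts to bucketing a feature-to-status mapping and clipping long failure reasons. A minimal sketch follows, assuming the status strings are exactly those the function checks for (`"refreshed"`, `"current"`, `"failed: ..."`, `"skipped: ..."`). `bucket_statuses`, `clip_reason`, and the feature names are hypothetical and used only for illustration.

```python
def clip_reason(status: str, limit: int = 200) -> str:
    """Drop the 'failed: ' / 'skipped: ' prefix and clip long reasons."""
    reason = status.split(": ", 1)[-1]
    return reason[:limit] + "..." if len(reason) > limit else reason


def bucket_statuses(results: dict[str, str]) -> dict[str, list[str]]:
    """Group feature names by the kind of status they reported."""
    buckets: dict[str, list[str]] = {
        "refreshed": [], "current": [], "failed": [], "skipped": [],
    }
    for feature, status in results.items():
        if status in ("refreshed", "current"):
            buckets[status].append(feature)
        elif status.startswith("failed:"):
            buckets["failed"].append(feature)
        elif status.startswith("skipped:"):
            buckets["skipped"].append(feature)
    return buckets


# Feature names below are made up for the example.
results = {
    "feature_a": "refreshed",
    "feature_b": "current",
    "feature_c": "failed: " + "x" * 300,
    "feature_d": "skipped: lazy installs disabled",
}
buckets = bucket_statuses(results)
print(buckets)
print(len(clip_reason(results["feature_c"])))
```

Statuses that match none of the four kinds are dropped silently in this sketch, which mirrors how the original list comprehensions ignore anything they do not recognise.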
@@ -1159,12 +1159,8 @@ def list_authenticated_providers(
        if slug_norm != current_norm:
            return False
        try:
-            from agent.plugin_registries import registries
-            _bedrock_ns = registries.get_provider_namespace("bedrock")
-            has_aws_credentials = _bedrock_ns.get("has_aws_credentials")
-            if has_aws_credentials:
-                return bool(has_aws_credentials())
-            return False
+            from agent.bedrock_adapter import has_aws_credentials
+            return bool(has_aws_credentials())
        except Exception:
            return False

@@ -1346,12 +1342,10 @@ def list_authenticated_providers(
        # configured.
        if not has_creds and hermes_slug == "anthropic":
            try:
-                from agent.plugin_registries import registries
-                _anthropic_ns = registries.get_provider_namespace("anthropic")
-                read_claude_code_credentials = _anthropic_ns.get("read_claude_code_credentials")
-                read_hermes_oauth_credentials = _anthropic_ns.get("read_hermes_oauth_credentials")
-                if read_claude_code_credentials is None or read_hermes_oauth_credentials is None:
-                    raise ImportError("anthropic credential readers not registered")
+                from agent.anthropic_adapter import (
+                    read_claude_code_credentials,
+                    read_hermes_oauth_credentials,
+                )
                hermes_creds = read_hermes_oauth_credentials()
                cc_creds = read_claude_code_credentials()
                if (hermes_creds and hermes_creds.get("accessToken")) or \
@@ -2116,11 +2116,7 @@ def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False)
    # below — bedrock is not expected to appear in that table.
    if normalized == "bedrock":
        try:
-            from agent.plugin_registries import registries
-            _bedrock_ns = registries.get_provider_namespace("bedrock")
-            bedrock_model_ids_or_none = _bedrock_ns.get("bedrock_model_ids_or_none")
-            if bedrock_model_ids_or_none is None:
-                raise ImportError("bedrock_model_ids_or_none not found in bedrock provider")
+            from agent.bedrock_adapter import bedrock_model_ids_or_none
            ids = bedrock_model_ids_or_none()
            if ids is not None:
                return ids
@@ -2367,14 +2363,7 @@ def _fetch_anthropic_models(timeout: float = 5.0) -> Optional[list[str]]:
    Claude Code auto-discovery).  Returns sorted model IDs or None.
    """
    try:
-        from agent.plugin_registries import registries
-        _anthropic_ns = registries.get_provider_namespace("anthropic")
-        resolve_anthropic_token = _anthropic_ns.get("resolve_anthropic_token")
-        _is_oauth_token = _anthropic_ns.get("_is_oauth_token")
-        # Beta header constants live in core agent.anthropic_format.
-        from agent.anthropic_format import _COMMON_BETAS, _OAUTH_ONLY_BETAS, _CONTEXT_1M_BETA
-        if resolve_anthropic_token is None or _is_oauth_token is None:
-            raise ImportError("anthropic provider services not registered")
+        from agent.anthropic_adapter import resolve_anthropic_token, _is_oauth_token
    except ImportError:
        return None

@@ -2386,6 +2375,7 @@ def _fetch_anthropic_models(timeout: float = 5.0) -> Optional[list[str]]:
    is_oauth = _is_oauth_token(token)
    if is_oauth:
        headers["Authorization"] = f"Bearer {token}"
+        from agent.anthropic_adapter import _COMMON_BETAS, _OAUTH_ONLY_BETAS, _CONTEXT_1M_BETA
        headers["anthropic-beta"] = ",".join(_COMMON_BETAS + _OAUTH_ONLY_BETAS)
    else:
        headers["x-api-key"] = token
@@ -3717,12 +3707,7 @@ def validate_requested_model(
    # AWS SDK control plane (ListFoundationModels + ListInferenceProfiles).
    if normalized == "bedrock":
        try:
-            from agent.plugin_registries import registries
-            _bedrock_ns = registries.get_provider_namespace("bedrock")
-            discover_bedrock_models = _bedrock_ns.get("discover_bedrock_models")
-            resolve_bedrock_region = _bedrock_ns.get("resolve_bedrock_region")
-            if discover_bedrock_models is None or resolve_bedrock_region is None:
-                raise ImportError("bedrock discovery functions not registered")
+            from agent.bedrock_adapter import discover_bedrock_models, resolve_bedrock_region
            region = resolve_bedrock_region()
            discovered = discover_bedrock_models(region)
            discovered_ids = {m["id"] for m in discovered}
@@ -818,270 +818,6 @@ class PluginContext:
            name,
        )

-    # -- auth provider registration -------------------------------------------
-
-    def register_platform_entry(
-        self,
-        name: str,
-        adapter_class: type,
-        check_requirements: Callable,
-        available_flag: str = "",
-        constants: dict | None = None,
-        helper_functions: dict | None = None,
-    ) -> None:
-        """Register a platform adapter entry in the capability registries.
-
-        This populates ``agent.plugin_registries.registries.platform_adapters``
-        so core code can look up adapter classes, constants, and helper
-        functions without importing from ``hermes_agent_*`` packages directly.
-
-        Call this **in addition to** :meth:`register_platform` — the two
-        registries serve different consumers:
-
-        * ``register_platform``  → ``gateway.platform_registry``  (gateway
-          adapter creation, setup wizard, status)
-        * ``register_platform_entry`` → ``agent.plugin_registries`` (adapter
-          class access, constants, helpers for send_message_tool, etc.)
-
-        Args:
-            name: Platform identifier (e.g. ``"telegram"``).
-            adapter_class: The adapter class itself (e.g. ``TelegramAdapter``).
-            check_requirements: Callable returning ``bool`` — are deps installed?
-            available_flag: Name of the module-level AVAILABLE boolean, if any.
-            constants: Platform-specific constants (e.g.
-                ``{"FEISHU_DOMAIN": ..., "LARK_DOMAIN": ...}``).
-            helper_functions: Platform-specific helpers (e.g.
-                ``{"_strip_mdv2": _strip_mdv2, "qr_register": qr_register}``).
-        """
-        from agent.plugin_registries import registries, PlatformAdapterEntry
-
-        entry = PlatformAdapterEntry(
-            name=name,
-            adapter_class=adapter_class,
-            check_requirements=check_requirements,
-            available_flag=available_flag,
-            constants=constants or {},
-            helper_functions=helper_functions or {},
-        )
-        registries.register_platform(entry)
-        logger.debug(
-            "Plugin %s registered platform entry: %s",
-            self.manifest.name,
-            name,
-        )
-
-    def register_tool_provider_entry(
-        self,
-        name: str,
-        tool_functions: dict | None = None,
-        check_fn: Callable | None = None,
-        constants: dict | None = None,
-        config_functions: dict | None = None,
-        environment_classes: dict | None = None,
-    ) -> None:
-        """Register a tool provider entry in the capability registries.
-
-        This populates ``agent.plugin_registries.registries.tool_providers``
-        so core code can look up tool functions, constants, and config
-        helpers without importing from ``hermes_agent_*`` packages directly.
-
-        Args:
-            name: Tool identifier (e.g. ``"tts"``, ``"stt"``).
-            tool_functions: Dict of function name → callable
-                (e.g. ``{"text_to_speech_tool": text_to_speech_tool}``).
-            check_fn: Optional callable returning ``bool`` — are deps
-                installed and configured?
-            constants: Tool-specific constants
-                (e.g. ``{"MAX_FILE_SIZE": 25 * 1024 * 1024}``).
-            config_functions: Config/utility functions
-                (e.g. ``{"is_stt_enabled": is_stt_enabled}``).
-            environment_classes: Environment classes for terminal backends
-                (e.g. ``{"DaytonaEnvironment": DaytonaEnvironment}``).
-        """
-        from agent.plugin_registries import registries, ToolProviderEntry
-
-        entry = ToolProviderEntry(
-            name=name,
-            tool_functions=tool_functions or {},
-            check_fn=check_fn,
-            constants=constants or {},
-            config_functions=config_functions or {},
-            environment_classes=environment_classes or {},
-        )
-        registries.register_tool_provider(entry)
-        logger.debug(
-            "Plugin %s registered tool provider entry: %s",
-            self.manifest.name,
-            name,
-        )
-
-    def register_provider_services(
-        self,
-        name: str,
-        services: dict,
-    ) -> None:
-        """Register a namespace dict of provider-specific services.
-
-        This is the escape hatch for model-provider plugins that expose many
-        symbols (anthropic has 50+).  Each plugin registers its public surface
-        as a flat dict of ``{symbol_name: callable_or_value}``.  Core code
-        looks up specific symbols instead of importing from the plugin
-        package directly.
-
-        Args:
-            name: Provider identifier (e.g. ``"anthropic"``, ``"bedrock"``).
-            services: Dict of symbol name → callable or value.
-        """
-        from agent.plugin_registries import registries
-
-        registries.register_provider_services(name, services)
-        logger.debug(
-            "Plugin %s registered provider services: %s (%d symbols)",
-            self.manifest.name,
-            name,
-            len(services),
-        )
-
-    def register_auth_provider(
-        self,
-        name: str,
-        provider: Any,
-        *,
-        cli_group: str = "",
-        setup_subcommands: bool = False,
-    ) -> None:
-        """Register an authentication provider.
-
-        ``provider`` must implement the :class:`agent.plugin_registries.AuthProvider`
-        protocol (``name``, ``has_credentials``, ``check_env_vars``,
-        ``resolve_token``, ``refresh_token``).  It may also expose
-        provider-specific attributes (``_is_oauth_token``,
-        ``_HERMES_OAUTH_FILE``, ``read_claude_code_credentials``, etc.)
-        that core code accesses via the registry.
-
-        Registered providers are queried by core code via
-        ``registries.get_auth_provider(name)`` instead of importing
-        directly from ``hermes_agent_*`` packages.
-        """
-        from agent.plugin_registries import registries
-
-        registries.register_auth_provider(
-            name, provider,
-            cli_group=cli_group,
-            setup_subcommands=setup_subcommands,
-        )
-        logger.debug(
-            "Plugin %s registered auth provider: %s",
-            self.manifest.name, name,
-        )
-
-    def register_provider_resolver(
-        self,
-        name: str,
-        resolver: Any,
-    ) -> None:
-        """Register a provider resolver callable.
-
-        The resolver handles ALL provider-specific client construction
-        logic for auxiliary tasks.  Core's ``resolve_provider_client()``
-        dispatches to it instead of using per-provider if/elif branches.
-
-        Signature::
-
-            def resolver(
-                *,
-                model: str | None,
-                explicit_api_key: str | None,
-                explicit_base_url: str | None,
-                async_mode: bool,
-                is_vision: bool,
-                main_runtime: dict | None,
-                api_mode: str | None,
-            ) -> tuple[Any, str] | tuple[None, None]:
-                ...
-
-        Returns ``(client, default_model)`` or ``(None, None)``.
-        """
-        from agent.plugin_registries import registries
-
-        registries.register_provider_resolver(name, resolver)
-        logger.debug(
-            "Plugin %s registered provider resolver: %s",
-            self.manifest.name, name,
-        )
-
-    def register_transport(
-        self,
-        api_mode: str,
-        transport_cls: type,
-    ) -> None:
-        """Register a ProviderTransport class for an api_mode string.
-
-        This lets the transport registry discover provider transports
-        from plugins without core needing to import the plugin package.
-        """
-        from agent.plugin_registries import registries
-
-        registries._transports[api_mode] = transport_cls
-        logger.debug(
-            "Plugin %s registered transport: %s → %s",
-            self.manifest.name, api_mode, transport_cls.__name__,
-        )
-
-    def register_credential_pool_hook(
-        self,
-        name: str,
-        hook: Any,
-    ) -> None:
-        """Register a credential pool hook for provider-specific pool operations.
-
-        The hook should be a :class:`agent.plugin_registries.CredentialPoolHook`
-        instance with optional ``sync_from_credentials_file``,
-        ``refresh_oauth``, and ``should_include_in_pool`` callables.
-        """
-        from agent.plugin_registries import registries
-
-        registries.register_credential_pool_hook(name, hook)
-        logger.debug(
-            "Plugin %s registered credential pool hook: %s",
-            self.manifest.name, name,
-        )
-
-    def register_pricing_provider(
-        self,
-        name: str,
-        entries: list,
-    ) -> None:
-        """Register pricing entries for a provider.
-
-        ``entries`` should be a list of
-        :class:`agent.plugin_registries.PricingEntry` instances.
-        """
-        from agent.plugin_registries import registries
-
-        registries.register_pricing_provider(name, entries)
-        logger.debug(
-            "Plugin %s registered pricing provider: %s (%d entries)",
-            self.manifest.name, name, len(entries),
-        )
-
-    def register_provider_overlay(
-        self,
-        entry: Any,
-    ) -> None:
-        """Register a provider overlay entry.
-
-        ``entry`` should be a :class:`agent.plugin_registries.ProviderOverlayEntry`
-        instance.
-        """
-        from agent.plugin_registries import registries
-
-        registries.register_provider_overlay(entry)
-        logger.debug(
-            "Plugin %s registered provider overlay: %s",
-            self.manifest.name, entry.provider_name,
-        )
-
    # -- hook registration --------------------------------------------------

    # -- auxiliary task registration ---------------------------------------
@@ -1338,11 +1074,6 @@ class PluginManager:
        )
        logger.debug("  bundled/platforms: %d manifest(s)", len(bundled_platforms))
        manifests.extend(bundled_platforms)
-        bundled_providers = self._scan_directory(
-            repo_plugins / "model-providers", source="bundled"
-        )
-        logger.debug("  bundled/model-providers: %d manifest(s)", len(bundled_providers))
-        manifests.extend(bundled_providers)

        # 2. User plugins (~/.hermes/plugins/)
        user_dir = get_hermes_home() / "plugins"
@@ -1379,16 +1110,7 @@ class PluginManager:
        enabled = _get_enabled_plugins()  # None = opt-in default (nothing enabled)
        winners: Dict[str, PluginManifest] = {}
        for manifest in manifests:
-            key = manifest.key or manifest.name
-            existing = winners.get(key)
-            # Bundled/workspace plugins take precedence over entry-points
-            # for the same key — the local source is the one we're
-            # actively developing; the entry-point is the published
-            # version.  Only let entry-points fill gaps where no bundled
-            # version exists.
-            if existing is not None and existing.source == "bundled" and manifest.source != "bundled":
-                continue
-            winners[key] = manifest
+            winners[manifest.key or manifest.name] = manifest
        for manifest in winners.values():
            lookup_key = manifest.key or manifest.name

@@ -1416,12 +1138,30 @@ class PluginManager:
                )
                continue

-            # Model provider plugins auto-load just like backends and
-            # platforms. They register their provider services (auth,
-            # transport, metadata) via ctx.register_provider_services()
-            # in their register() function, which populates the
-            # capability registries that core code queries.
-            if manifest.source == "bundled" and manifest.kind in {"backend", "platform", "model-provider"}:
+            # Model provider plugins are loaded by providers/__init__.py
+            # (its own lazy discovery keyed off first get_provider_profile()
+            # call). We record the manifest here for introspection but do
+            # not import the module — a second import would create two
+            # ProviderProfile instances and break the "last writer wins"
+            # override semantics between bundled and user plugins.
+            if manifest.kind == "model-provider":
+                loaded = LoadedPlugin(manifest=manifest, enabled=True)
+                self._plugins[lookup_key] = loaded
+                logger.debug(
+                    "Skipping '%s' (model-provider, handled by providers/ discovery)",
+                    lookup_key,
+                )
+                continue
+
+            # Built-in backends auto-load — they ship with hermes and must
+            # just work. Selection among them (e.g. which image_gen backend
+            # handles calls) is driven by ``<category>.provider`` config,
+            # enforced by the tool wrapper.
+            #
+            # Bundled platform plugins (gateway adapters like IRC) auto-load
+            # for the same reason: every platform Hermes ships must be
+            # available out of the box without the user having to opt in.
+            if manifest.source == "bundled" and manifest.kind in {"backend", "platform"}:
                self._load_plugin(manifest)
                continue

@@ -99,8 +99,10 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
        transport="openai_chat",
        extra_env_vars=("COPILOT_GITHUB_TOKEN", "GH_TOKEN"),
    ),
-    # "anthropic" overlay moved to plugin: hermes_agent_anthropic register()
-    # Plugin registers via ctx.register_provider_overlay() and core merges lazily.
+    "anthropic": HermesOverlay(
+        transport="anthropic_messages",
+        extra_env_vars=("ANTHROPIC_TOKEN", "CLAUDE_CODE_OAUTH_TOKEN"),
+    ),
    "zai": HermesOverlay(
        transport="openai_chat",
        extra_env_vars=("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"),
@@ -202,45 +204,17 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
    ),
    # Azure Foundry: supports both OpenAI-style and Anthropic-style endpoints.
    # The transport is determined at runtime from config.yaml model.api_mode.
-    # "azure-foundry" overlay moved to plugin: hermes_agent_azure register()
-    # "bedrock" overlay moved to plugin: hermes_agent_bedrock register()
-    # Plugins register via ctx.register_provider_overlay() and core merges lazily.
+    "azure-foundry": HermesOverlay(
+        transport="openai_chat",  # default; overridden by api_mode in config
+        base_url_env_var="AZURE_FOUNDRY_BASE_URL",
+    ),
+    "bedrock": HermesOverlay(
+        transport="bedrock_converse",
+        auth_type="aws_sdk",
+    ),
 }


-def _merge_plugin_overlays() -> None:
-    """Merge plugin-registered provider overlays into HERMES_OVERLAYS.
-
-    Called lazily from ``resolve_provider`` so that plugins have had a
-    chance to register by the time we need the overlay data.
-    """
-    global _plugin_overlays_merged
-    if _plugin_overlays_merged:
-        return
-    _plugin_overlays_merged = True
-    try:
-        from agent.plugin_registries import registries
-        for _name, _entry in registries.all_provider_overlays().items():
-            if _name not in HERMES_OVERLAYS:
-                HERMES_OVERLAYS[_name] = HermesOverlay(
-                    transport=_entry.transport,
-                    is_aggregator=_entry.is_aggregator,
-                    auth_type=_entry.auth_type,
-                    extra_env_vars=_entry.extra_env_vars,
-                    base_url_override=_entry.base_url_override,
-                    base_url_env_var=_entry.base_url_env_var,
-                )
-            # Also merge aliases from the plugin overlay entry
-            for _alias in _entry.aliases:
-                if _alias not in ALIASES:
-                    ALIASES[_alias] = _name
-    except Exception:
-        pass
-
-
-_plugin_overlays_merged = False
-
-
 # -- Resolved provider -------------------------------------------------------
 # The merged result of models.dev + overlay + user config.

@@ -361,7 +335,11 @@ ALIASES: Dict[str, str] = {
    "tencent-cloud": "tencent-tokenhub",
    "tencentmaas": "tencent-tokenhub",

-    # bedrock aliases moved to plugin: hermes_agent_bedrock register()
+    # bedrock
+    "aws": "bedrock",
+    "aws-bedrock": "bedrock",
+    "amazon-bedrock": "bedrock",
+    "amazon": "bedrock",

    # arcee
    "arcee-ai": "arcee",
@@ -448,7 +426,6 @@ def get_provider(name: str) -> Optional[ProviderDef]:
    except Exception:
        mdev_info = None

-    _merge_plugin_overlays()
    overlay = HERMES_OVERLAYS.get(canonical)

    if mdev_info is not None:
@@ -976,13 +976,11 @@ def _resolve_azure_foundry_runtime(
            auth_mode = "api_key"
        else:
            try:
-                from agent.plugin_registries import registries
-                _azure_ns = registries.get_provider_namespace("azure")
-                EntraIdentityConfig = _azure_ns.get("EntraIdentityConfig")
-                SCOPE_AI_AZURE_DEFAULT = _azure_ns.get("SCOPE_AI_AZURE_DEFAULT")
-                build_token_provider = _azure_ns.get("build_token_provider")
-                if not all([EntraIdentityConfig, SCOPE_AI_AZURE_DEFAULT, build_token_provider]):
-                    raise ImportError("azure provider services not fully registered")
+                from agent.azure_identity_adapter import (
+                    EntraIdentityConfig,
+                    SCOPE_AI_AZURE_DEFAULT,
+                    build_token_provider,
+                )
            except Exception as exc:
                raise AuthError(
                    "Azure Foundry Entra ID auth requires the 'azure-identity' "
@@ -1074,11 +1072,7 @@ def _resolve_explicit_runtime(
        base_url = explicit_base_url or cfg_base_url or "https://api.anthropic.com"
        api_key = explicit_api_key
        if not api_key:
-            from agent.plugin_registries import registries
-            _anthropic_ns = registries.get_provider_namespace("anthropic")
-            resolve_anthropic_token = _anthropic_ns.get("resolve_anthropic_token")
-            if resolve_anthropic_token is None:
-                raise ImportError("anthropic provider services not registered")
+            from agent.anthropic_adapter import resolve_anthropic_token

            api_key = resolve_anthropic_token()
            if not api_key:
@@ -1518,11 +1512,7 @@ def resolve_runtime_provider(
                    "config.yaml model section at a custom env var."
                )
        else:
-            from agent.plugin_registries import registries
-            _anthropic_ns = registries.get_provider_namespace("anthropic")
-            resolve_anthropic_token = _anthropic_ns.get("resolve_anthropic_token")
-            if resolve_anthropic_token is None:
-                raise ImportError("anthropic provider services not registered")
+            from agent.anthropic_adapter import resolve_anthropic_token
            token = resolve_anthropic_token()
            if not token:
                raise AuthError(
@@ -1540,14 +1530,12 @@ def resolve_runtime_provider(

    # AWS Bedrock (native Converse API via boto3)
    if provider == "bedrock":
-        from agent.plugin_registries import registries
-        _bedrock_ns = registries.get_provider_namespace("bedrock")
-        has_aws_credentials = _bedrock_ns.get("has_aws_credentials")
-        resolve_aws_auth_env_var = _bedrock_ns.get("resolve_aws_auth_env_var")
-        resolve_bedrock_region = _bedrock_ns.get("resolve_bedrock_region")
-        is_anthropic_bedrock_model = _bedrock_ns.get("is_anthropic_bedrock_model")
-        if not all([has_aws_credentials, resolve_aws_auth_env_var, resolve_bedrock_region, is_anthropic_bedrock_model]):
-            raise ImportError("bedrock provider services not fully registered")
+        from agent.bedrock_adapter import (
+            has_aws_credentials,
+            resolve_aws_auth_env_var,
+            resolve_bedrock_region,
+            is_anthropic_bedrock_model,
+        )
        # When the user explicitly selected bedrock (not auto-detected),
        # trust boto3's credential chain — it handles IMDS, ECS task roles,
        # Lambda execution roles, SSO, and other implicit sources that our
@@ -2052,32 +2052,59 @@ def _setup_matrix():
            save_env_value("MATRIX_ENCRYPTION", "true")
            print_success("E2EE enabled")

-        matrix_pkg = "hermes-agent[matrix]"
-        # Matrix deps are now a proper plugin package. Install it the normal way.
+        matrix_pkg = "mautrix[encryption]" if want_e2ee else "mautrix"
+        # Use the central lazy-deps feature group so we install ALL of
+        # platform.matrix's dependencies (mautrix, Markdown, aiosqlite,
+        # asyncpg, aiohttp-socks) — not just mautrix itself.  The previous
+        # hand-rolled ``pip install mautrix[encryption]`` left asyncpg /
+        # aiosqlite uninstalled and broke E2EE connect with
+        # ``No module named 'asyncpg'`` on every fresh install (#31116).
        try:
-            __import__("hermes_agent_matrix")
+            from tools.lazy_deps import ensure as _lazy_ensure, feature_missing
+            _missing_before = feature_missing("platform.matrix")
+            if _missing_before:
+                print_info(
+                    f"Installing {matrix_pkg} (+ {len(_missing_before)} runtime deps)..."
+                )
+                try:
+                    _lazy_ensure("platform.matrix", prompt=False)
+                    print_success(f"{matrix_pkg} installed")
+                except Exception as exc:
+                    print_warning(
+                        "Install failed — run manually: pip install "
+                        "'mautrix[encryption]' asyncpg aiosqlite Markdown "
+                        "aiohttp-socks"
+                    )
+                    print_info(f"  Error: {exc}")
        except ImportError:
-            print_info(f"Installing {matrix_pkg}...")
-            import subprocess
-            uv_bin = shutil.which("uv")
-            if uv_bin:
-                result = subprocess.run(
-                    [uv_bin, "pip", "install", "--python", sys.executable, matrix_pkg],
-                    capture_output=True, text=True,
-                )
-            else:
-                result = subprocess.run(
-                    [sys.executable, "-m", "pip", "install", matrix_pkg],
-                    capture_output=True, text=True,
-                )
-            if result.returncode == 0:
-                print_success(f"{matrix_pkg} installed")
-            else:
-                print_warning(
-                    f"Install failed — run manually: pip install '{matrix_pkg}'"
-                )
-                if result.stderr:
-                    print_info(f"  Error: {result.stderr.strip().splitlines()[-1]}")
+            # tools.lazy_deps unavailable (extreme edge case — partial
+            # install).  Fall back to the legacy single-package install
+            # path so the wizard still does *something*.
+            try:
+                __import__("mautrix")
+            except ImportError:
+                print_info(f"Installing {matrix_pkg}...")
+                import subprocess
+                uv_bin = shutil.which("uv")
+                if uv_bin:
+                    result = subprocess.run(
+                        [uv_bin, "pip", "install", "--python", sys.executable, matrix_pkg],
+                        capture_output=True, text=True,
+                    )
+                else:
+                    result = subprocess.run(
+                        [sys.executable, "-m", "pip", "install", matrix_pkg],
+                        capture_output=True, text=True,
+                    )
+                if result.returncode == 0:
+                    print_success(f"{matrix_pkg} installed")
+                else:
+                    print_warning(
+                        "Install failed — run manually: pip install "
+                        f"'{matrix_pkg}' asyncpg aiosqlite Markdown aiohttp-socks"
+                    )
+                    if result.stderr:
+                        print_info(f"  Error: {result.stderr.strip().splitlines()[-1]}")

        print()
        print_info("🔒 Security: Restrict who can use your bot")
@@ -779,9 +779,7 @@ def speak_text(text: str) -> None:
    _debug(f"speak_text: TTS begin (paused_recording={paused_recording})")

    try:
-        from agent.plugin_registries import registries
-        _tts = registries.get_tool_provider("tts")
-        text_to_speech_tool = _tts.tool_functions.get("text_to_speech_tool") if _tts else None
+        from tools.tts_tool import text_to_speech_tool

        tts_text = text[:4000] if len(text) > 4000 else text
        tts_text = re.sub(r'```[\s\S]*?```', ' ', tts_text)             # fenced code blocks
@@ -808,8 +806,6 @@ def speak_text(text: str) -> None:
            f"tts_{time.strftime('%Y%m%d_%H%M%S')}.mp3",
        )

-        if text_to_speech_tool is None:
-            raise ImportError("TTS plugin not registered")
        _debug(f"speak_text: synthesizing {len(tts_text)} chars -> {mp3_path}")
        text_to_speech_tool(text=tts_text, output_path=mp3_path)

@@ -58,10 +58,22 @@ try:
    from fastapi.staticfiles import StaticFiles
    from pydantic import BaseModel
 except ImportError:
-    raise SystemExit(
-        "Web UI requires fastapi and uvicorn.\n"
-        "Install with: pip install 'hermes-agent[dashboard]'"
-    )
+    # First try lazy-installing the dashboard extras. Only the user actually
+    # running `hermes dashboard` needs fastapi+uvicorn; lazy install keeps
+    # them out of every other install path. After install, re-import.
+    try:
+        from tools.lazy_deps import ensure as _lazy_ensure
+        _lazy_ensure("tool.dashboard", prompt=False)
+        from fastapi import FastAPI, HTTPException, Request, WebSocket, WebSocketDisconnect
+        from fastapi.middleware.cors import CORSMiddleware
+        from fastapi.responses import FileResponse, HTMLResponse, JSONResponse, Response
+        from fastapi.staticfiles import StaticFiles
+        from pydantic import BaseModel
+    except Exception:
+        raise SystemExit(
+            "Web UI requires fastapi and uvicorn.\n"
+            f"Install with: {sys.executable} -m pip install 'fastapi' 'uvicorn[standard]'"
+        )

 WEB_DIST = Path(os.environ["HERMES_WEB_DIST"]) if "HERMES_WEB_DIST" in os.environ else Path(__file__).parent / "web_dist"
 _log = logging.getLogger(__name__)
@@ -1359,13 +1371,11 @@ def _anthropic_oauth_status() -> Dict[str, Any]:
    The dashboard reports the highest-priority source that's actually present.
    """
    try:
-        from agent.plugin_registries import registries
-        _anthropic = registries.get_provider_namespace("anthropic")
-        read_hermes_oauth_credentials = _anthropic.get("read_hermes_oauth_credentials")
-        read_claude_code_credentials = _anthropic.get("read_claude_code_credentials")
-        _HERMES_OAUTH_FILE = _anthropic.get("_HERMES_OAUTH_FILE")
-        if read_hermes_oauth_credentials is None:
-            raise ImportError("anthropic plugin not registered")
+        from agent.anthropic_adapter import (
+            read_hermes_oauth_credentials,
+            read_claude_code_credentials,
+            _HERMES_OAUTH_FILE,
+        )
    except ImportError:
        read_claude_code_credentials = None  # type: ignore
        read_hermes_oauth_credentials = None  # type: ignore
@@ -1424,11 +1434,7 @@ def _claude_code_only_status() -> Dict[str, Any]:
    when they also have a separate Hermes-managed PKCE login.
    """
    try:
-        from agent.plugin_registries import registries
-        _anthropic = registries.get_provider_namespace("anthropic")
-        read_claude_code_credentials = _anthropic.get("read_claude_code_credentials")
-        if read_claude_code_credentials is None:
-            raise ImportError("anthropic plugin not registered")
+        from agent.anthropic_adapter import read_claude_code_credentials
        creds = read_claude_code_credentials()
    except Exception:
        creds = None
@@ -1614,10 +1620,8 @@ async def disconnect_oauth_provider(provider_id: str, request: Request):
    # want to undo a disconnect.
    if provider_id in {"anthropic", "claude-code"}:
        try:
-            from agent.plugin_registries import registries
-            _anthropic = registries.get_provider_namespace("anthropic")
-            _HERMES_OAUTH_FILE = _anthropic.get("_HERMES_OAUTH_FILE")
-            if _HERMES_OAUTH_FILE is not None and _HERMES_OAUTH_FILE.exists():
+            from agent.anthropic_adapter import _HERMES_OAUTH_FILE
+            if _HERMES_OAUTH_FILE.exists():
                _HERMES_OAUTH_FILE.unlink()
        except Exception:
            pass
@@ -1684,15 +1688,13 @@ _oauth_sessions_lock = threading.Lock()
 # Guarded so hermes web still starts if anthropic_adapter is unavailable;
 # Phase 2 endpoints will return 501 in that case.
 try:
-    from agent.plugin_registries import registries
-    _anthropic = registries.get_provider_namespace("anthropic")
-    _ANTHROPIC_OAUTH_CLIENT_ID = _anthropic.get("_OAUTH_CLIENT_ID")
-    _ANTHROPIC_OAUTH_TOKEN_URL = _anthropic.get("_OAUTH_TOKEN_URL")
-    _ANTHROPIC_OAUTH_REDIRECT_URI = _anthropic.get("_OAUTH_REDIRECT_URI")
-    _ANTHROPIC_OAUTH_SCOPES = _anthropic.get("_OAUTH_SCOPES")
-    _generate_pkce_pair = _anthropic.get("_generate_pkce")
-    if any(v is None for v in [_ANTHROPIC_OAUTH_CLIENT_ID, _ANTHROPIC_OAUTH_TOKEN_URL, _ANTHROPIC_OAUTH_REDIRECT_URI, _ANTHROPIC_OAUTH_SCOPES, _generate_pkce_pair]):
-        raise ImportError("anthropic plugin not registered")
+    from agent.anthropic_adapter import (
+        _OAUTH_CLIENT_ID as _ANTHROPIC_OAUTH_CLIENT_ID,
+        _OAUTH_TOKEN_URL as _ANTHROPIC_OAUTH_TOKEN_URL,
+        _OAUTH_REDIRECT_URI as _ANTHROPIC_OAUTH_REDIRECT_URI,
+        _OAUTH_SCOPES as _ANTHROPIC_OAUTH_SCOPES,
+        _generate_pkce as _generate_pkce_pair,
+    )
    _ANTHROPIC_OAUTH_AVAILABLE = True
 except ImportError:
    _ANTHROPIC_OAUTH_AVAILABLE = False
@@ -1730,11 +1732,7 @@ def _save_anthropic_oauth_creds(access_token: str, refresh_token: str, expires_a
    Mirrors what auth_commands.add_command does so the dashboard flow leaves
    the system in the same state as ``hermes auth add anthropic``.
    """
-    from agent.plugin_registries import registries
-    _anthropic = registries.get_provider_namespace("anthropic")
-    _HERMES_OAUTH_FILE = _anthropic.get("_HERMES_OAUTH_FILE")
-    if _HERMES_OAUTH_FILE is None:
-        raise ImportError("anthropic plugin not registered")
+    from agent.anthropic_adapter import _HERMES_OAUTH_FILE
    payload = {
        "accessToken": access_token,
        "refreshToken": refresh_token,
@@ -147,15 +147,8 @@ def create_environment(
        return DockerEnvironment(image=image, cwd=cwd, timeout=timeout, **kwargs)
    
    elif env_type == "modal":
-        from agent.plugin_registries import registries
-        _modal = registries.get_tool_provider("modal")
-        _ModalEnvironment = _modal.environment_classes.get("ModalEnvironment") if _modal else None
-        if _ModalEnvironment is None:
-            raise ValueError(
-                "Modal backend selected but the hermes_agent_modal plugin is not loaded. "
-                "Ensure the modal plugin is installed and enabled."
-            )
-        return _ModalEnvironment(image=image, cwd=cwd, timeout=timeout, **kwargs)
+        from tools.environments.modal import ModalEnvironment
+        return ModalEnvironment(image=image, cwd=cwd, timeout=timeout, **kwargs)
    
    else:
        raise ValueError(f"Unknown environment type: {env_type}. Use 'local', 'docker', or 'modal'")
@@ -260,239 +260,6 @@ json.dump(sorted(leaf_paths(DEFAULT_CONFIG)), sys.stdout, indent=2)
          echo "ok" > $out/result
        '';

-        # ── Plugin architecture (hermetic core boundary) ───────────────────
-        #
-        # These checks prove that under NixOS (sealed venv, no pip install),
-        # the plugin system works correctly:
-        #   1. Core never imports from hermes_agent_* packages directly
-        #   2. Plugin registries are populated after discovery
-        #   3. Provider service namespaces are queryable
-        #   4. Optional plugins degrade gracefully (None returns, no crash)
-        #   5. No ensure() / lazy_deps / pip-install at runtime
-
-        # Check 1: Zero direct hermes_agent_* imports in core code
-        plugin-hermetic-boundary = pkgs.runCommand "hermes-plugin-hermetic-boundary" { } ''
-          set -e
-          echo "=== Checking core never imports from plugin packages ==="
-
-          # Search for direct imports from hermes_agent_* in core code
-          # (excluding plugins/, tests/, website/, and comments)
-          VIOLATIONS=$(${hermesVenv}/bin/python3 -c '
-          import subprocess, re, sys
-          result = subprocess.run(
-            ["grep", "-rn",
-             "from hermes_agent_\\|import hermes_agent_",
-             "${hermes-agent}/share/hermes-agent"],
-            capture_output=True, text=True
-          )
-          lines = result.stdout.strip().split("\n") if result.stdout.strip() else []
-          # Filter: only .py files, not in plugins/ or tests/, not comments
-          violations = []
-          for line in lines:
-            if not line.endswith(".py"):
-              continue
-            parts = line.split(":", 2)
-            if len(parts) < 3:
-              continue
-            filepath, lineno, content = parts
-            # Skip plugin directories
-            if "/plugins/" in filepath:
-              continue
-            # Skip test directories
-            if "/tests/" in filepath or "/test_" in filepath:
-              continue
-            # Skip comments
-            stripped = content.lstrip()
-            if stripped.startswith("#"):
-              continue
-            violations.append(line)
-          for v in violations:
-            print(v)
-          sys.exit(1 if violations else 0)
-          ' 2>&1 || true)
-
-          if [ -n "$VIOLATIONS" ]; then
-            echo "FAIL: Core code imports directly from plugin packages:"
-            echo "$VIOLATIONS"
-            exit 1
-          fi
-          echo "PASS: Zero direct hermes_agent_* imports in core"
-
-          echo "=== Checking no ensure() / LAZY_DEPS in core ==="
-          ENSURE_VIOLATIONS=$(grep -rn 'ensure(' ${hermes-agent}/share/hermes-agent/agent/ ${hermes-agent}/share/hermes-agent/hermes_cli/ --include='*.py' 2>/dev/null | grep -v '__pycache__' | grep -v '# ' || true)
-          if [ -n "$ENSURE_VIOLATIONS" ]; then
-            echo "FAIL: ensure() still used in core:"
-            echo "$ENSURE_VIOLATIONS"
-            exit 1
-          fi
-          echo "PASS: No ensure() calls in core code"
-
-          mkdir -p $out
-          echo "ok" > $out/result
-        '';
-
-        # Check 2: Plugin registries populate after discovery
-        plugin-registries-populate = pkgs.runCommand "hermes-plugin-registries-populate" { } ''
-          set -e
-          echo "=== Checking plugin registries populate after discovery ==="
-          export HOME=$(mktemp -d)
-
-          RESULT=$(${hermesVenv}/bin/python3 -c '
-          import json, sys
-          from hermes_cli.plugins import PluginManager
-          from agent.plugin_registries import registries
-
-          pm = PluginManager()
-          pm.discover_and_load(force=True)
-
-          out = {
-            "provider_services": list(registries._provider_services.keys()),
-            "platform_adapters": list(registries.platform_adapters.keys()),
-            "tool_providers": list(registries.tool_providers.keys()),
-          }
-          json.dump(out, sys.stdout)
-          ' 2>/dev/null)
-
-          echo "Registry state: $RESULT"
-
-          # Verify provider services populated
-          PROV_COUNT=$(echo "$RESULT" | ${pkgs.jq}/bin/jq '.provider_services | length')
-          if [ "$PROV_COUNT" -lt 1 ]; then
-            echo "FAIL: No provider services registered (expected >= 1)"
-            exit 1
-          fi
-          echo "PASS: $PROV_COUNT provider service(s) registered"
-
-          # Verify platform adapters populated
-          PLAT_COUNT=$(echo "$RESULT" | ${pkgs.jq}/bin/jq '.platform_adapters | length')
-          if [ "$PLAT_COUNT" -lt 1 ]; then
-            echo "FAIL: No platform adapters registered (expected >= 1)"
-            exit 1
-          fi
-          echo "PASS: $PLAT_COUNT platform adapter(s) registered"
-
-          # Verify tool providers populated
-          TOOL_COUNT=$(echo "$RESULT" | ${pkgs.jq}/bin/jq '.tool_providers | length')
-          if [ "$TOOL_COUNT" -lt 1 ]; then
-            echo "FAIL: No tool providers registered (expected >= 1)"
-            exit 1
-          fi
-          echo "PASS: $TOOL_COUNT tool provider(s) registered"
-
-          mkdir -p $out
-          echo "ok" > $out/result
-        '';
-
-        # Check 3: Specific provider service lookups work
-        plugin-provider-lookups = pkgs.runCommand "hermes-plugin-provider-lookups" { } ''
-          set -e
-          echo "=== Checking provider service lookups ==="
-          export HOME=$(mktemp -d)
-
-          RESULT=$(${hermesVenv}/bin/python3 -c '
-          import json, sys
-          from hermes_cli.plugins import PluginManager
-          from agent.plugin_registries import registries
-
-          pm = PluginManager()
-          pm.discover_and_load(force=True)
-
-          checks = {
-            "anthropic.resolve_anthropic_token": registries.get_provider_service("anthropic", "resolve_anthropic_token") is not None,
-            "bedrock.has_aws_credentials": registries.get_provider_service("bedrock", "has_aws_credentials") is not None,
-            "azure.is_token_provider": registries.get_provider_service("azure", "is_token_provider") is not None,
-          }
-          json.dump(checks, sys.stdout)
-          ' 2>/dev/null)
-
-          echo "Lookup results: $RESULT"
-
-          for key in anthropic.resolve_anthropic_token bedrock.has_aws_credentials azure.is_token_provider; do
-            VALUE=$(echo "$RESULT" | ${pkgs.jq}/bin/jq --arg k "$key" '.[$k]')
-            if [ "$VALUE" != "true" ]; then
-              echo "FAIL: $key lookup returned $VALUE (expected true)"
-              exit 1
-            fi
-            echo "PASS: $key lookup works"
-          done
-
-          mkdir -p $out
-          echo "ok" > $out/result
-        '';
-
-        # Check 4: Missing plugins degrade gracefully (no crash)
-        plugin-missing-graceful = pkgs.runCommand "hermes-plugin-missing-graceful" { } ''
-          set -e
-          echo "=== Checking missing plugins degrade gracefully ==="
-          export HOME=$(mktemp -d)
-
-          ${hermesVenv}/bin/python3 -c '
-          from agent.plugin_registries import registries
-
-          # Lookup from non-existent provider — should return None, not crash
-          result = registries.get_provider_service("nonexistent-provider", "some_function")
-          assert result is None, f"Expected None for missing provider, got {result}"
-
-          # Lookup from empty registry — should return None
-          result2 = registries.get_provider_namespace("no-such-provider")
-          assert result2 == {}, f"Expected empty dict for missing namespace, got {result2}"
-
-          # Lookup specific tool provider that does not exist
-          result3 = registries.get_tool_provider("nonexistent-tool")
-          assert result3 is None, f"Expected None for missing tool provider, got {result3}"
-
-          print("PASS: All missing-plugin lookups return None gracefully")
-          ' 2>&1
-
-          echo "PASS: Missing plugins degrade gracefully (no crash)"
-
-          mkdir -p $out
-          echo "ok" > $out/result
-        '';
-
-        # Check 5: No runtime pip install / ensure in gateway/run.py
-        plugin-no-runtime-install = pkgs.runCommand "hermes-plugin-no-runtime-install" { } ''
-          set -e
-          echo "=== Checking no runtime pip install / ensure in core ==="
-
-          # Check gateway/run.py has no ensure() or pip install
-          GATEWAY=${hermes-agent}/share/hermes-agent/gateway/run.py
-          if [ -f "$GATEWAY" ]; then
-            if grep -q 'ensure(' "$GATEWAY" || grep -q 'pip install' "$GATEWAY"; then
-              echo "FAIL: gateway/run.py contains ensure() or pip install"
-              grep -n 'ensure(\|pip install' "$GATEWAY"
-              exit 1
-            fi
-            echo "PASS: gateway/run.py has no ensure()/pip install"
-          else
-            echo "SKIP: gateway/run.py not found in package"
-          fi
-
-          # Check run_agent.py has no ensure() or pip install
-          RUN_AGENT=${hermes-agent}/share/hermes-agent/run_agent.py
-          if [ -f "$RUN_AGENT" ]; then
-            if grep -q 'ensure(' "$RUN_AGENT" || grep -q 'pip install' "$RUN_AGENT"; then
-              echo "FAIL: run_agent.py contains ensure() or pip install"
-              grep -n 'ensure(\|pip install' "$RUN_AGENT"
-              exit 1
-            fi
-            echo "PASS: run_agent.py has no ensure()/pip install"
-          else
-            echo "SKIP: run_agent.py not found in package"
-          fi
-
-          # Check tools/lazy_deps.py is gone
-          LAZY_DEPS=${hermes-agent}/share/hermes-agent/tools/lazy_deps.py
-          if [ -f "$LAZY_DEPS" ]; then
-            echo "FAIL: tools/lazy_deps.py still exists — should be removed"
-            exit 1
-          fi
-          echo "PASS: tools/lazy_deps.py removed"
-
-          mkdir -p $out
-          echo "ok" > $out/result
-        '';
-
        # Regression guard: messaging deps live outside [all], so the
        # #messaging variant must actually ship discord.py — otherwise
        # `nix profile install .#messaging` regresses to the broken default.
@@ -4,7 +4,7 @@ let
  src = ../web;
  npmDeps = pkgs.fetchNpmDeps {
    inherit src;
-    hash = "sha256-RPPWPM0nEkwsaQHrkdEP+UMTZ2aF7JHUNfsIEnKt1l8=";
+    hash = "sha256-HV0aISBVjwbGqDj8qQynSxGFrrZDzuYAW3D3lB/x3zo=";
  };

  npm = hermesNpmLib.mkNpmPassthru { folder = "web"; attr = "web"; pname = "hermes-web"; };
@@ -1,7 +0,0 @@
-"""Bridge module — delegates plugin registration to hermes_agent_dashboard."""
-
-
-def register(ctx):
-    """Plugin entry point — delegates to the inner hermes_agent_dashboard package."""
-    from hermes_agent_dashboard import register as _inner_register
-    _inner_register(ctx)
@@ -1,6 +0,0 @@
-"""Hermes Agent web dashboard."""
-
-
-def register(ctx):
-    """Plugin entry point — dashboard registers via ctx.register_tool_provider_entry()."""
-    pass
@@ -1,6 +0,0 @@
-name: dashboard
-version: 0.1.0
-description: Web dashboard (FastAPI + uvicorn)
-kind: backend
-provides_tools: ["dashboard"]
-provides_hooks: []
@@ -1,20 +0,0 @@
-[build-system]
-requires = ["setuptools>=61.0"]
-build-backend = "setuptools.build_meta"
-
-[project]
-name = "hermes-agent-dashboard"
-version = "0.1.0"
-description = "Hermes Agent web dashboard (FastAPI + Uvicorn)"
-requires-python = ">=3.11"
-dependencies = [
-    "hermes-agent",
-    "fastapi==0.133.1",
-    "uvicorn[standard]==0.41.0",
-]
-
-[project.entry-points."hermes_agent.plugins"]
-dashboard = "hermes_agent_dashboard:register"
-
-[tool.setuptools.packages.find]
-include = ["hermes_agent_dashboard*"]
@@ -1,7 +0,0 @@
-"""Bridge module — delegates plugin registration to hermes_agent_fal."""
-
-
-def register(ctx):
-    """Plugin entry point — delegates to the inner hermes_agent_fal package."""
-    from hermes_agent_fal import register as _inner_register
-    _inner_register(ctx)
@@ -1,36 +0,0 @@
-"""hermes-agent-fal: FAL.ai SDK plumbing plugin for Hermes Agent."""
-
-from hermes_agent_fal.fal_common import (  # noqa: F401
-    import_fal_client,
-    _ManagedFalSyncClient,
-    _extract_http_status,
-    _normalize_fal_queue_url_format,
-)
-
-
-def register(ctx):
-    """Entry point for the hermes_agent.plugins entry point group.
-
-    Registers FAL SDK plumbing (import_fal_client, _ManagedFalSyncClient,
-    etc.) in the plugin capability registry so core code can look them
-    up without importing from ``hermes_agent_fal`` directly.
-    """
-    from hermes_agent_fal.fal_common import (
-        import_fal_client,
-        _ManagedFalSyncClient,
-        _extract_http_status,
-        _normalize_fal_queue_url_format,
-    )
-    ctx.register_tool_provider_entry(
-        name="fal",
-        tool_functions={
-            "import_fal_client": import_fal_client,
-        },
-        constants={
-            "_normalize_fal_queue_url_format": _normalize_fal_queue_url_format,
-        },
-        config_functions={
-            "_ManagedFalSyncClient": _ManagedFalSyncClient,
-            "_extract_http_status": _extract_http_status,
-        },
-    )
@@ -1,6 +0,0 @@
-name: fal
-version: 0.1.0
-description: FAL.ai image generation backend
-kind: backend
-provides_tools: ["image_gen"]
-provides_hooks: []
@@ -1,19 +0,0 @@
-[build-system]
-requires = ["setuptools>=61.0"]
-build-backend = "setuptools.build_meta"
-
-[project]
-name = "hermes-agent-fal"
-version = "0.1.0"
-description = "FAL.ai SDK plumbing plugin for Hermes Agent"
-requires-python = ">=3.11"
-dependencies = [
-    "hermes-agent",
-    "fal-client==0.13.1",
-]
-
-[project.entry-points."hermes_agent.plugins"]
-fal = "hermes_agent_fal:register"
-
-[tool.setuptools.packages.find]
-include = ["hermes_agent_fal*"]
@@ -888,7 +888,8 @@ class HindsightMemoryProvider(MemoryProvider):
                        + (f": {reason}" if reason else "")
                    )
                try:
-                    from hindsight import HindsightEmbedded  # noqa: F401 — side-effect import
+                    from tools.lazy_deps import ensure as _lazy_ensure
+                    _lazy_ensure("memory.hindsight", prompt=False)
                except ImportError:
                    pass
                except Exception as _e:
@@ -1,19 +0,0 @@
-[build-system]
-requires = ["setuptools>=61.0"]
-build-backend = "setuptools.build_meta"
-
-[project]
-name = "hermes-agent-hindsight"
-version = "1.0.0"
-description = "Hindsight long-term memory with knowledge graph for Hermes Agent"
-requires-python = ">=3.11"
-dependencies = [
-    "hermes-agent",
-    "hindsight-client==0.6.1",
-]
-
-[project.entry-points."hermes_agent.plugins"]
-hindsight = "hermes_agent_hindsight:register"
-
-[tool.setuptools.packages.find]
-include = ["hermes_agent_hindsight*"]
@@ -745,12 +745,23 @@ def get_honcho_client(config: HonchoClientConfig | None = None) -> Honcho:
            "For local instances, set HONCHO_BASE_URL instead."
        )

-    # Import the honcho SDK (installed via hermes-agent-honcho package).
+    # Lazy-install the honcho SDK on demand. ensure() honors
    # security.allow_lazy_installs (default true). On failure we surface
    # the original ImportError-shape message so existing callers still get
    # the "go run hermes honcho setup" hint they used to.
    try:
-        from honcho import Honcho  # noqa: F401 — imported for side-effects
+        from tools.lazy_deps import FeatureUnavailable, ensure as _lazy_ensure
+        _lazy_ensure("memory.honcho", prompt=False)
+    except ImportError:
+        # lazy_deps module missing — fall through to the raw import below.
+        pass
+    except Exception:
+        # FeatureUnavailable or unexpected error. Don't crash here; let the
+        # actual import attempt produce the canonical error message.
+        pass
+
+    try:
+        from honcho import Honcho
    except ImportError:
        raise ImportError(
            "honcho-ai is required for Honcho integration. "
@@ -1,19 +0,0 @@
-[build-system]
-requires = ["setuptools>=61.0"]
-build-backend = "setuptools.build_meta"
-
-[project]
-name = "hermes-agent-honcho"
-version = "1.0.0"
-description = "Honcho AI-native memory for Hermes Agent"
-requires-python = ">=3.11"
-dependencies = [
-    "hermes-agent",
-    "honcho-ai==2.0.1",
-]
-
-[project.entry-points."hermes_agent.plugins"]
-honcho = "hermes_agent_honcho:register"
-
-[tool.setuptools.packages.find]
-include = ["hermes_agent_honcho*"]
@@ -19,9 +19,3 @@ alibaba_coding_plan = ProviderProfile(
 )

 register_provider(alibaba_coding_plan)
-
-
-def register(ctx):
-    """No-op — this provider has no workspace package yet."""
-    pass
-
@@ -11,9 +11,3 @@ alibaba = ProviderProfile(
 )

 register_provider(alibaba)
-
-
-def register(ctx):
-    """No-op — this provider has no workspace package yet."""
-    pass
-
@@ -50,9 +50,3 @@ anthropic = AnthropicProfile(
 )

 register_provider(anthropic)
-
-
-def register(ctx):
-    """Plugin entry point — delegates to the inner hermes_agent_anthropic package."""
-    from hermes_agent_anthropic import register as _inner_register
-    _inner_register(ctx)
@@ -1,174 +0,0 @@
-"""hermes-agent-anthropic: Anthropic Messages API adapter for Hermes Agent."""
-
-# -----------------------------------------------------------------------
-# Re-exports from adapter.py — SDK-dependent orchestration only.
-# Wire-format code (message conversion, aux client wrappers, transport)
-# has moved to core and is no longer re-exported here.
-# -----------------------------------------------------------------------
-from hermes_agent_anthropic.adapter import (  # noqa: F401
-    _CLAUDE_CODE_VERSION_FALLBACK,
-    _HERMES_OAUTH_FILE,
-    _OAUTH_CLIENT_ID,
-    _OAUTH_REDIRECT_URI,
-    _OAUTH_SCOPES,
-    _OAUTH_TOKEN_URL,
-    _build_anthropic_client_with_bearer_hook,
-    _detect_claude_code_version,
-    _generate_pkce,
-    _get_anthropic_sdk,
-    _get_claude_code_version,
-    _is_azure_anthropic_endpoint,
-    _is_oauth_token,
-    _prefer_refreshable_claude_code_token,
-    _read_claude_code_credentials_from_keychain,
-    _refresh_oauth_token,
-    _requires_bearer_auth,
-    _resolve_claude_code_token_from_credentials,
-    _write_claude_code_credentials,
-    build_anthropic_bedrock_client,
-    build_anthropic_client,
-    is_claude_code_token_valid,
-    read_claude_code_credentials,
-    read_claude_managed_key,
-    read_hermes_oauth_credentials,
-    refresh_anthropic_oauth_pure,
-    resolve_anthropic_token,
-    run_hermes_oauth_login_pure,
-    run_oauth_setup_token,
-)
-
-# Re-exports from resolve.py — client resolution & endpoint detection
-from hermes_agent_anthropic.resolve import (  # noqa: F401
-    _ANTHROPIC_DEFAULT_BASE_URL as ANTHROPIC_DEFAULT_BASE_URL,
-    convert_openai_images_to_anthropic,
-    endpoint_speaks_anthropic_messages,
-    is_anthropic_compat_endpoint,
-    maybe_wrap_anthropic,
-    resolve_auxiliary_client,
-)
-
-
-def register(ctx):
-    """Entry point for the hermes_agent.plugins entry point group."""
-    from hermes_agent_anthropic import adapter
-
-    # -----------------------------------------------------------------------
-    # Plugin-only symbols — SDK-dependent orchestration that stays in the
-    # plugin package.  Wire-format code (message conversion, aux client
-    # wrappers, transport) has moved to core (agent.anthropic_format,
-    # agent.anthropic_aux, agent.transports.anthropic) and is no longer
-    # registered here.
-    # -----------------------------------------------------------------------
-    _symbols = [
-        # OAuth / auth constants
-        "_CLAUDE_CODE_VERSION_FALLBACK",
-        "_HERMES_OAUTH_FILE",
-        "_OAUTH_CLIENT_ID",
-        "_OAUTH_REDIRECT_URI",
-        "_OAUTH_SCOPES",
-        "_OAUTH_TOKEN_URL",
-        # SDK-dependent functions
-        "_build_anthropic_client_with_bearer_hook",
-        "_detect_claude_code_version",
-        "_generate_pkce",
-        "_get_anthropic_sdk",
-        "_get_claude_code_version",
-        "_is_azure_anthropic_endpoint",
-        "_is_oauth_token",
-        "_prefer_refreshable_claude_code_token",
-        "_read_claude_code_credentials_from_keychain",
-        "_refresh_oauth_token",
-        "_requires_bearer_auth",
-        "_resolve_claude_code_token_from_credentials",
-        "_write_claude_code_credentials",
-        "build_anthropic_bedrock_client",
-        "build_anthropic_client",
-        "is_claude_code_token_valid",
-        "read_claude_code_credentials",
-        "read_claude_managed_key",
-        "read_hermes_oauth_credentials",
-        "refresh_anthropic_oauth_pure",
-        "resolve_anthropic_token",
-        "run_hermes_oauth_login_pure",
-        "run_oauth_setup_token",
-    ]
-
-    # resolve.py symbols — client resolution & endpoint detection
-    _resolve_symbols = [
-        "_ANTHROPIC_DEFAULT_BASE_URL",
-        "_ANTHROPIC_COMPAT_PROVIDERS",
-        "convert_openai_images_to_anthropic",
-        "endpoint_speaks_anthropic_messages",
-        "is_anthropic_compat_endpoint",
-        "maybe_wrap_anthropic",
-        "resolve_auxiliary_client",
-    ]
-    _all_symbols = _symbols + _resolve_symbols
-    _services = {}
-    for name in _symbols:
-        _services[name] = getattr(adapter, name)
-    for name in _resolve_symbols:
-        from hermes_agent_anthropic import resolve as _resolve_mod
-        _services[name] = getattr(_resolve_mod, name)
-    # Also expose ANTHROPIC_DEFAULT_BASE_URL under the public (no-underscore) name
-    _services["ANTHROPIC_DEFAULT_BASE_URL"] = _services.get("_ANTHROPIC_DEFAULT_BASE_URL", "")
-
-    # Also expose the model name normalizer as a provider service
-    from hermes_agent_anthropic.pricing import normalize_anthropic_model_name
-    _services["normalize_model_name"] = normalize_anthropic_model_name
-
-    ctx.register_provider_services("anthropic", _services)
-
-    # Register the provider resolver — core dispatches to this instead of
-    # having per-anthropic if/elif branches in resolve_provider_client().
-    ctx.register_provider_resolver("anthropic", resolve_auxiliary_client)
-
-    # Register the anthropic transport so core doesn't need to import it.
-    from agent.transports.anthropic import AnthropicTransport
-    ctx.register_transport("anthropic_messages", AnthropicTransport)
-
-    # Register the credential pool hook — core dispatches to this instead of
-    # having per-anthropic if/elif branches in credential_pool.py.
-    from agent.plugin_registries import CredentialPoolHook
-    from hermes_agent_anthropic.credential_pool_hook import (
-        sync_from_credentials_file,
-        refresh_oauth,
-        needs_refresh,
-        should_include_in_pool,
-        source_priority,
-        discover_credentials,
-        ANTHROPIC_ENV_VAR_ORDER,
-        detect_auth_type,
-    )
-    ctx.register_credential_pool_hook("anthropic", CredentialPoolHook(
-        sync_from_credentials_file=sync_from_credentials_file,
-        refresh_oauth=refresh_oauth,
-        needs_refresh=needs_refresh,
-        should_include_in_pool=should_include_in_pool,
-        source_priority=source_priority,
-        discover_credentials=discover_credentials,
-        env_var_order=ANTHROPIC_ENV_VAR_ORDER,
-        detect_auth_type=detect_auth_type,
-    ))
-
-    # Register pricing entries — core looks these up via the registry
-    # instead of hardcoding them in _OFFICIAL_DOCS_PRICING.
-    from hermes_agent_anthropic.pricing import (
-        get_anthropic_pricing_entries,
-        ANTHROPIC_PRICING_KEYS,
-    )
-    _entries = get_anthropic_pricing_entries()
-    _keyed = []
-    for (prov, model), entry in zip(ANTHROPIC_PRICING_KEYS, _entries):
-        _keyed.append((prov, model, entry))
-    ctx.register_pricing_provider("anthropic", _keyed)
-
-    # Register the provider overlay — core merges this into HERMES_OVERLAYS
-    from agent.plugin_registries import ProviderOverlayEntry
-    ctx.register_provider_overlay(ProviderOverlayEntry(
-        provider_name="anthropic",
-        transport="anthropic_messages",
-        extra_env_vars=("ANTHROPIC_TOKEN", "CLAUDE_CODE_OAUTH_TOKEN"),
-        display_name="Anthropic",
-        aliases=[],
-    ))
@@ -1,274 +0,0 @@
-"""Anthropic credential pool hook.
-
-Handles provider-specific pool operations: syncing from ~/.claude/.credentials.json,
-refreshing OAuth tokens, and deciding which sources to include in the pool.
-"""
-
-from __future__ import annotations
-
-import logging
-import os
-import time
-from dataclasses import replace
-from typing import Any, Optional
-
-logger = logging.getLogger(__name__)
-
-
-def sync_from_credentials_file(entry: Any) -> Any:
-    """Sync a claude_code pool entry from ~/.claude/.credentials.json if tokens differ.
-
-    OAuth refresh tokens are single-use. When something external (e.g.
-    Claude Code CLI, or another profile's pool) refreshes the token, it
-    writes the new pair to ~/.claude/.credentials.json. The pool entry's
-    refresh token becomes stale. This method detects that and syncs.
-
-    Returns the (possibly updated) entry.
-    """
-    if entry.source != "claude_code":
-        return entry
-    try:
-        from agent.plugin_registries import registries
-        read_claude_code_credentials = registries.get_provider_service("anthropic", "read_claude_code_credentials")
-        if read_claude_code_credentials is None:
-            return entry
-        creds = read_claude_code_credentials()
-        if not creds:
-            return entry
-        file_refresh = creds.get("refreshToken", "")
-        file_access = creds.get("accessToken", "")
-        file_expires = creds.get("expiresAt", 0)
-        if file_refresh and file_refresh != entry.refresh_token:
-            logger.debug("Pool entry %s: syncing tokens from credentials file (refresh token changed)", entry.id)
-            return replace(
-                entry,
-                access_token=file_access,
-                refresh_token=file_refresh,
-                expires_at_ms=file_expires,
-                last_status=None,
-                last_status_at=None,
-                last_error_code=None,
-            )
-    except Exception as exc:
-        logger.debug("Failed to sync from credentials file: %s", exc)
-    return entry
-
-
-def refresh_oauth(entry: Any, pool: Any) -> Any:
-    """Refresh an anthropic OAuth token and return the updated entry.
-
-    Handles:
-    - Standard OAuth refresh via ``refresh_anthropic_oauth_pure``
-    - Writing back to ~/.claude/.credentials.json for claude_code entries
-    - Retry with synced token from credentials file on refresh failure
-
-    Returns the updated entry, or the original entry on failure.
-    """
-    from agent.plugin_registries import registries
-
-    refresh_anthropic_oauth_pure = registries.get_provider_service("anthropic", "refresh_anthropic_oauth_pure")
-    if refresh_anthropic_oauth_pure is None:
-        return entry
-
-    try:
-        refreshed = refresh_anthropic_oauth_pure(
-            entry.refresh_token,
-            use_json=entry.source.endswith("hermes_pkce"),
-        )
-        updated = replace(
-            entry,
-            access_token=refreshed["access_token"],
-            refresh_token=refreshed["refresh_token"],
-            expires_at_ms=refreshed["expires_at_ms"],
-        )
-        # Keep ~/.claude/.credentials.json in sync
-        if entry.source == "claude_code":
-            try:
-                _write_claude_code_credentials = registries.get_provider_service("anthropic", "_write_claude_code_credentials")
-                if _write_claude_code_credentials is not None:
-                    _write_claude_code_credentials(
-                        refreshed["access_token"],
-                        refreshed["refresh_token"],
-                        refreshed["expires_at_ms"],
-                    )
-            except Exception as wexc:
-                logger.debug("Failed to write refreshed token to credentials file: %s", wexc)
-        return updated
-    except Exception as exc:
-        logger.debug("Credential refresh failed for anthropic/%s: %s", entry.id, exc)
-        # The refresh token may have been consumed by another process.
-        # Check if ~/.claude/.credentials.json has a newer token pair.
-        if entry.source == "claude_code":
-            synced = sync_from_credentials_file(entry)
-            if synced.refresh_token != entry.refresh_token:
-                logger.debug("Retrying refresh with synced token from credentials file")
-                try:
-                    refreshed = refresh_anthropic_oauth_pure(
-                        synced.refresh_token,
-                        use_json=synced.source.endswith("hermes_pkce"),
-                    )
-                    updated = replace(
-                        synced,
-                        access_token=refreshed["access_token"],
-                        refresh_token=refreshed["refresh_token"],
-                        expires_at_ms=refreshed["expires_at_ms"],
-                        last_status="OK",
-                        last_status_at=None,
-                        last_error_code=None,
-                    )
-                    try:
-                        _write_claude_code_credentials = registries.get_provider_service("anthropic", "_write_claude_code_credentials")
-                        if _write_claude_code_credentials is not None:
-                            _write_claude_code_credentials(
-                                refreshed["access_token"],
-                                refreshed["refresh_token"],
-                                refreshed["expires_at_ms"],
-                            )
-                    except Exception:
-                        pass
-                    return updated
-                except Exception:
-                    pass
-        return entry
-
-
-def needs_refresh(entry: Any) -> bool:
-    """Check if an anthropic OAuth entry needs a token refresh."""
-    if entry.expires_at_ms is None:
-        return False
-    return int(entry.expires_at_ms) <= int(time.time() * 1000) + 120_000
-
-
-def should_include_in_pool(source: str) -> bool:
-    """Which anthropic credential sources should be pooled."""
-    return source in {"claude_code", "hermes_pkce"}
-
-
-def source_priority(source: str) -> int:
-    """Priority ordering for anthropic credential sources (lower = preferred)."""
-    _PRIORITIES = {
-        "claude_code": 3,
-        "hermes_pkce": 2,
-    }
-    return _PRIORITIES.get(source, 99)
-
-
-def discover_credentials(entries: list, provider: str, is_suppressed: Any) -> tuple:
-    """Discover external anthropic credentials and upsert into pool entries.
-
-    Returns (changed: bool, active_sources: set).
-    """
-    from agent.plugin_registries import registries
-
-    changed = False
-    active_sources = set()
-
-    # Only auto-discover external credentials (Claude Code, Hermes PKCE)
-    # when the user has explicitly configured anthropic as their provider.
-    # Without this gate, auxiliary client fallback chains silently read
-    # ~/.claude/.credentials.json without user consent.  See PR #4210.
-    try:
-        from hermes_cli.auth import is_provider_explicitly_configured
-        if not is_provider_explicitly_configured("anthropic"):
-            return changed, active_sources
-    except ImportError:
-        pass
-
-    # API-key vs OAuth is a user-visible choice at `hermes setup` ("Claude
-    # Pro/Max subscription" vs "Anthropic API key").  The signal that the
-    # user picked the API-key path is: ANTHROPIC_API_KEY set in the env,
-    # AND no OAuth env vars set — `save_anthropic_api_key()` writes the
-    # API key and zeros ANTHROPIC_TOKEN; `save_anthropic_oauth_token()`
-    # does the inverse.  When that signal is present we MUST NOT seed
-    # autodiscovered OAuth tokens (~/.claude/.credentials.json from the
-    # Claude Code CLI, hermes_pkce creds from a previous OAuth login)
-    # into the anthropic pool — otherwise rotation on a 401/429 silently
-    # flips the session onto an OAuth credential, which forces the Claude
-    # Code identity injection, `mcp_` tool-name rewrite, and claude-cli
-    # User-Agent header.  Users who explicitly opted into the API-key path
-    # are explicitly opting OUT of that masquerade.  Prefer ~/.hermes/.env
-    # over os.environ for the same reason `_seed_from_env` does — that's
-    # the authoritative file that `hermes setup` writes.
-    try:
-        from hermes_cli.config import load_env
-    except ImportError:
-        load_env = None  # type: ignore[assignment]
-
-    _env_file = load_env() if load_env is not None else {}
-
-    def _env_val(key: str) -> str:
-        return (_env_file.get(key) or os.environ.get(key) or "").strip()
-
-    anthropic_api_key = _env_val("ANTHROPIC_API_KEY")
-    anthropic_oauth_env = (
-        _env_val("ANTHROPIC_TOKEN") or _env_val("CLAUDE_CODE_OAUTH_TOKEN")
-    )
-    api_key_path_explicit = bool(anthropic_api_key and not anthropic_oauth_env)
-
-    if api_key_path_explicit:
-        # Prune any stale autodiscovered OAuth entries that may have been
-        # seeded into the on-disk pool during a previous OAuth session.
-        # Without this, switching OAuth -> API key at setup leaves the
-        # OAuth entries dormant in auth.json forever and rotation on a
-        # transient 401 could revive them.
-        retained = [
-            entry for entry in entries
-            if entry.source not in {"hermes_pkce", "claude_code"}
-        ]
-        if len(retained) != len(entries):
-            entries[:] = retained
-            changed = True
-        return changed, active_sources
-
-    read_claude_code_credentials = registries.get_provider_service("anthropic", "read_claude_code_credentials")
-    read_hermes_oauth_credentials = registries.get_provider_service("anthropic", "read_hermes_oauth_credentials")
-    if read_claude_code_credentials is None or read_hermes_oauth_credentials is None:
-        return changed, active_sources
-
-    # Import pool helpers
-    try:
-        from agent.credential_pool import _upsert_entry, label_from_token, AUTH_TYPE_OAUTH
-    except ImportError:
-        return changed, active_sources
-
-    for source_name, creds in (
-        ("hermes_pkce", read_hermes_oauth_credentials()),
-        ("claude_code", read_claude_code_credentials()),
-    ):
-        if creds and creds.get("accessToken"):
-            if is_suppressed(provider, source_name):
-                continue
-            active_sources.add(source_name)
-            changed |= _upsert_entry(
-                entries,
-                provider,
-                source_name,
-                {
-                    "source": source_name,
-                    "auth_type": AUTH_TYPE_OAUTH,
-                    "access_token": creds.get("accessToken", ""),
-                    "refresh_token": creds.get("refreshToken"),
-                    "expires_at_ms": creds.get("expiresAt"),
-                    "label": label_from_token(creds.get("accessToken", ""), source_name),
-                },
-            )
-    return changed, active_sources
-
-
-# Env var scan order for anthropic — prefer OAuth tokens over API keys
-ANTHROPIC_ENV_VAR_ORDER = [
-    "ANTHROPIC_TOKEN",
-    "CLAUDE_CODE_OAUTH_TOKEN",
-    "ANTHROPIC_API_KEY",
-]
-
-
-def detect_auth_type(token: str) -> str:
-    """Determine auth type for an anthropic token.
-
-    OAuth tokens don't start with 'sk-ant-api'; API keys do.
-    """
-    from agent.credential_pool import AUTH_TYPE_OAUTH, AUTH_TYPE_API_KEY
-    if not token.startswith("sk-ant-api"):
-        return AUTH_TYPE_OAUTH
-    return AUTH_TYPE_API_KEY
@@ -1,184 +0,0 @@
-"""Anthropic model pricing data.
-
-Official docs snapshot entries for Anthropic Claude models.
-Source: https://platform.claude.com/docs/en/about-claude/pricing
-"""
-
-from __future__ import annotations
-
-from datetime import datetime, timezone
-from decimal import Decimal
-from typing import List
-
-
-def get_anthropic_pricing_entries() -> list:
-    """Return official docs pricing entries for Anthropic Claude models."""
-    from agent.usage_pricing import PricingEntry
-
-    _ANTHROPIC_PRICING_URL = "https://platform.claude.com/docs/en/about-claude/pricing"
-    _ANTHROPIC_PRICING_VER = "anthropic-pricing-2026-05"
-
-    return [
-        PricingEntry(
-            input_cost_per_million=Decimal("5.00"),
-            output_cost_per_million=Decimal("25.00"),
-            cache_read_cost_per_million=Decimal("0.50"),
-            cache_write_cost_per_million=Decimal("6.25"),
-            source="official_docs_snapshot",
-            source_url=_ANTHROPIC_PRICING_URL,
-            pricing_version=_ANTHROPIC_PRICING_VER,
-        ),  # key: ("anthropic", "claude-opus-4-7")
-        PricingEntry(
-            input_cost_per_million=Decimal("5.00"),
-            output_cost_per_million=Decimal("25.00"),
-            cache_read_cost_per_million=Decimal("0.50"),
-            cache_write_cost_per_million=Decimal("6.25"),
-            source="official_docs_snapshot",
-            source_url=_ANTHROPIC_PRICING_URL,
-            pricing_version=_ANTHROPIC_PRICING_VER,
-        ),  # key: ("anthropic", "claude-opus-4-6")
-        PricingEntry(
-            input_cost_per_million=Decimal("5.00"),
-            output_cost_per_million=Decimal("25.00"),
-            cache_read_cost_per_million=Decimal("0.50"),
-            cache_write_cost_per_million=Decimal("6.25"),
-            source="official_docs_snapshot",
-            source_url=_ANTHROPIC_PRICING_URL,
-            pricing_version=_ANTHROPIC_PRICING_VER,
-        ),  # key: ("anthropic", "claude-opus-4-5")
-        PricingEntry(
-            input_cost_per_million=Decimal("3.00"),
-            output_cost_per_million=Decimal("15.00"),
-            cache_read_cost_per_million=Decimal("0.30"),
-            cache_write_cost_per_million=Decimal("3.75"),
-            source="official_docs_snapshot",
-            source_url=_ANTHROPIC_PRICING_URL,
-            pricing_version=_ANTHROPIC_PRICING_VER,
-        ),  # key: ("anthropic", "claude-sonnet-4-7")
-        PricingEntry(
-            input_cost_per_million=Decimal("3.00"),
-            output_cost_per_million=Decimal("15.00"),
-            cache_read_cost_per_million=Decimal("0.30"),
-            cache_write_cost_per_million=Decimal("3.75"),
-            source="official_docs_snapshot",
-            source_url=_ANTHROPIC_PRICING_URL,
-            pricing_version=_ANTHROPIC_PRICING_VER,
-        ),  # key: ("anthropic", "claude-sonnet-4-6")
-        PricingEntry(
-            input_cost_per_million=Decimal("3.00"),
-            output_cost_per_million=Decimal("15.00"),
-            cache_read_cost_per_million=Decimal("0.30"),
-            cache_write_cost_per_million=Decimal("3.75"),
-            source="official_docs_snapshot",
-            source_url=_ANTHROPIC_PRICING_URL,
-            pricing_version=_ANTHROPIC_PRICING_VER,
-        ),  # key: ("anthropic", "claude-sonnet-4-5")
-        PricingEntry(
-            input_cost_per_million=Decimal("0.80"),
-            output_cost_per_million=Decimal("4.00"),
-            cache_read_cost_per_million=Decimal("0.08"),
-            cache_write_cost_per_million=Decimal("1.00"),
-            source="official_docs_snapshot",
-            source_url=_ANTHROPIC_PRICING_URL,
-            pricing_version=_ANTHROPIC_PRICING_VER,
-        ),  # key: ("anthropic", "claude-haiku-4-5")
-        PricingEntry(
-            input_cost_per_million=Decimal("1.00"),
-            output_cost_per_million=Decimal("5.00"),
-            cache_read_cost_per_million=Decimal("0.10"),
-            cache_write_cost_per_million=Decimal("1.25"),
-            source="official_docs_snapshot",
-            source_url=_ANTHROPIC_PRICING_URL,
-            pricing_version=_ANTHROPIC_PRICING_VER,
-        ),  # key: ("anthropic", "claude-4-7-sonnet")
-        PricingEntry(
-            input_cost_per_million=Decimal("1.00"),
-            output_cost_per_million=Decimal("5.00"),
-            cache_read_cost_per_million=Decimal("0.10"),
-            cache_write_cost_per_million=Decimal("1.25"),
-            source="official_docs_snapshot",
-            source_url=_ANTHROPIC_PRICING_URL,
-            pricing_version=_ANTHROPIC_PRICING_VER,
-        ),  # key: ("anthropic", "claude-4-6-sonnet")
-        PricingEntry(
-            input_cost_per_million=Decimal("3.00"),
-            output_cost_per_million=Decimal("15.00"),
-            cache_read_cost_per_million=Decimal("0.30"),
-            cache_write_cost_per_million=Decimal("3.75"),
-            source="official_docs_snapshot",
-            source_url=_ANTHROPIC_PRICING_URL,
-            pricing_version=_ANTHROPIC_PRICING_VER,
-        ),  # key: ("anthropic", "claude-4-5-sonnet")
-        PricingEntry(
-            input_cost_per_million=Decimal("5.00"),
-            output_cost_per_million=Decimal("25.00"),
-            cache_read_cost_per_million=Decimal("0.50"),
-            cache_write_cost_per_million=Decimal("6.25"),
-            source="official_docs_snapshot",
-            source_url=_ANTHROPIC_PRICING_URL,
-            pricing_version=_ANTHROPIC_PRICING_VER,
-        ),  # key: ("anthropic", "claude-4-7-opus")
-        PricingEntry(
-            input_cost_per_million=Decimal("5.00"),
-            output_cost_per_million=Decimal("25.00"),
-            cache_read_cost_per_million=Decimal("0.50"),
-            cache_write_cost_per_million=Decimal("6.25"),
-            source="official_docs_snapshot",
-            source_url=_ANTHROPIC_PRICING_URL,
-            pricing_version=_ANTHROPIC_PRICING_VER,
-        ),  # key: ("anthropic", "claude-4-6-opus")
-        PricingEntry(
-            input_cost_per_million=Decimal("5.00"),
-            output_cost_per_million=Decimal("25.00"),
-            cache_read_cost_per_million=Decimal("0.50"),
-            cache_write_cost_per_million=Decimal("6.25"),
-            source="official_docs_snapshot",
-            source_url=_ANTHROPIC_PRICING_URL,
-            pricing_version=_ANTHROPIC_PRICING_VER,
-        ),  # key: ("anthropic", "claude-4-5-opus")
-        PricingEntry(
-            input_cost_per_million=Decimal("0.80"),
-            output_cost_per_million=Decimal("4.00"),
-            cache_read_cost_per_million=Decimal("0.08"),
-            cache_write_cost_per_million=Decimal("1.00"),
-            source="official_docs_snapshot",
-            source_url=_ANTHROPIC_PRICING_URL,
-            pricing_version=_ANTHROPIC_PRICING_VER,
-        ),  # key: ("anthropic", "claude-4-5-haiku")
-    ]
-
-
-# Model name keys for the pricing entries — must match the order above
-ANTHROPIC_PRICING_KEYS = [
-    ("anthropic", "claude-opus-4-7"),
-    ("anthropic", "claude-opus-4-6"),
-    ("anthropic", "claude-opus-4-5"),
-    ("anthropic", "claude-sonnet-4-7"),
-    ("anthropic", "claude-sonnet-4-6"),
-    ("anthropic", "claude-sonnet-4-5"),
-    ("anthropic", "claude-haiku-4-5"),
-    ("anthropic", "claude-4-7-sonnet"),
-    ("anthropic", "claude-4-6-sonnet"),
-    ("anthropic", "claude-4-5-sonnet"),
-    ("anthropic", "claude-4-7-opus"),
-    ("anthropic", "claude-4-6-opus"),
-    ("anthropic", "claude-4-5-opus"),
-    ("anthropic", "claude-4-5-haiku"),
-]
-
-
-def normalize_anthropic_model_name(model: str) -> str:
-    """Normalize Anthropic model name variants to canonical form.
-
-    Handles:
-      - Case and whitespace: lowercases and strips the input
-      - Dot notation: claude-opus-4.7 → claude-opus-4-7
-      - Strips anthropic/ prefix if present
-    """
-    import re
-    name = model.lower().strip()
-    if name.startswith("anthropic/"):
-        name = name[len("anthropic/"):]
-    # Normalize dots to dashes in version numbers
-    name = re.sub(r"(\d+)\.(\d+)", r"\1-\2", name)
-    return name
@@ -1,312 +0,0 @@
-"""Anthropic provider resolver for auxiliary client construction.
-
-Handles ALL provider-specific logic for building auxiliary clients:
-credential resolution (pool, env var, OAuth), client construction,
-base URL detection, and transport wrapping.
-"""
-
-from __future__ import annotations
-
-import logging
-from typing import Any, Optional, Tuple
-
-from utils import base_url_hostname
-
-logger = logging.getLogger(__name__)
-
-_ANTHROPIC_DEFAULT_BASE_URL = "https://api.anthropic.com"
-
-_ANTHROPIC_COMPAT_PROVIDERS = frozenset({"minimax", "minimax-oauth", "minimax-cn"})
-
-
-# ---------------------------------------------------------------------------
-# Endpoint detection helpers
-# ---------------------------------------------------------------------------
-
-def endpoint_speaks_anthropic_messages(base_url: str) -> bool:
-    """True if the endpoint at ``base_url`` speaks Anthropic Messages protocol.
-
-    Covers:
-    - Any URL ending in ``/anthropic``
-    - ``api.kimi.com/coding`` (Kimi Coding Plan)
-    - ``api.anthropic.com`` (native Anthropic)
-    """
-    normalized = (base_url or "").strip().lower().rstrip("/")
-    if not normalized:
-        return False
-    if normalized.endswith("/anthropic"):
-        return True
-    hostname = base_url_hostname(normalized)
-    if hostname == "api.anthropic.com":
-        return True
-    if hostname == "api.kimi.com" and "/coding" in normalized:
-        return True
-    return False
-
-
-def is_anthropic_compat_endpoint(provider: str, base_url: str) -> bool:
-    """Detect if an endpoint expects Anthropic-format content blocks."""
-    if provider in _ANTHROPIC_COMPAT_PROVIDERS:
-        return True
-    url_lower = (base_url or "").lower()
-    return "/anthropic" in url_lower
-
-
-def convert_openai_images_to_anthropic(messages: list) -> list:
-    """Convert OpenAI ``image_url`` content blocks to Anthropic ``image`` blocks."""
-    converted = []
-    for msg in messages:
-        content = msg.get("content")
-        if not isinstance(content, list):
-            converted.append(msg)
-            continue
-        new_content = []
-        changed = False
-        for block in content:
-            if block.get("type") == "image_url":
-                image_url_val = (block.get("image_url") or {}).get("url", "")
-                if image_url_val.startswith("data:"):
-                    header, _, b64data = image_url_val.partition(",")
-                    media_type = "image/png"
-                    if ":" in header and ";" in header:
-                        media_type = header.split(":", 1)[1].split(";", 1)[0]
-                    new_content.append({
-                        "type": "image",
-                        "source": {
-                            "type": "base64",
-                            "media_type": media_type,
-                            "data": b64data,
-                        },
-                    })
-                else:
-                    new_content.append({
-                        "type": "image",
-                        "source": {
-                            "type": "url",
-                            "url": image_url_val,
-                        },
-                    })
-                changed = True
-            else:
-                new_content.append(block)
-        converted.append({**msg, "content": new_content} if changed else msg)
-    return converted
-
-
-# ---------------------------------------------------------------------------
-# Transport wrapping
-# ---------------------------------------------------------------------------
-
-def _safe_isinstance(obj: Any, maybe_type: Any) -> bool:
-    """Return False instead of raising when a patched symbol is not a type."""
-    try:
-        return isinstance(obj, maybe_type)
-    except TypeError:
-        return False
-
-
-def maybe_wrap_anthropic(
-    client_obj: Any,
-    model: str,
-    api_key: str,
-    base_url: str,
-    api_mode: Optional[str] = None,
-) -> Any:
-    """Rewrap a plain OpenAI client in ``AnthropicAuxiliaryClient`` when
-    the endpoint actually speaks Anthropic Messages.
-
-    Returns ``client_obj`` unchanged when it's already a specialized adapter
-    or the endpoint is OpenAI-wire.
-    """
-    from agent.anthropic_aux import AnthropicAuxiliaryClient
-
-    # Already wrapped — don't double-wrap.
-    if _safe_isinstance(client_obj, AnthropicAuxiliaryClient):
-        return client_obj
-
-    # Check for other specialized adapters we should never re-dispatch.
-    try:
-        from agent.auxiliary_client import CodexAuxiliaryClient
-        if _safe_isinstance(client_obj, CodexAuxiliaryClient):
-            return client_obj
-    except ImportError:
-        pass
-    try:
-        from agent.gemini_native_adapter import GeminiNativeClient
-        if _safe_isinstance(client_obj, GeminiNativeClient):
-            return client_obj
-    except ImportError:
-        pass
-    try:
-        from agent.copilot_acp_client import CopilotACPClient
-        if _safe_isinstance(client_obj, CopilotACPClient):
-            return client_obj
-    except ImportError:
-        pass
-
-    # Explicit non-anthropic api_mode wins over URL heuristics.
-    if api_mode and api_mode != "anthropic_messages":
-        return client_obj
-
-    should_wrap = (
-        api_mode == "anthropic_messages"
-        or endpoint_speaks_anthropic_messages(base_url)
-    )
-    if not should_wrap:
-        return client_obj
-
-    from agent.plugin_registries import registries
-    build_anthropic_client = registries.get_provider_service("anthropic", "build_anthropic_client")
-    if build_anthropic_client is None:
-        logger.warning(
-            "Endpoint %s speaks Anthropic Messages but the anthropic SDK is "
-            "not installed — falling back to OpenAI-wire (will likely 404).",
-            base_url,
-        )
-        return client_obj
-
-    try:
-        real_client = build_anthropic_client(api_key, base_url)
-    except Exception as exc:
-        logger.warning(
-            "Failed to build Anthropic client for %s (%s) — falling back to "
-            "OpenAI-wire client.", base_url, exc,
-        )
-        return client_obj
-
-    logger.debug(
-        "Auxiliary transport: wrapping client in AnthropicAuxiliaryClient "
-        "(model=%s, base_url=%s, api_mode=%s)",
-        model, base_url[:60] if base_url else "", api_mode or "auto-detected",
-    )
-    return AnthropicAuxiliaryClient(
-        real_client, model, api_key, base_url, is_oauth=False,
-    )
-
-
-# ---------------------------------------------------------------------------
-# Pool helpers (thin wrappers over core pool functions)
-# ---------------------------------------------------------------------------
-
-def _select_pool_entry(provider: str) -> Tuple[bool, Optional[Any]]:
-    """Return (pool_exists_for_provider, selected_entry)."""
-    try:
-        from agent.credential_pool import load_pool
-        pool = load_pool(provider)
-    except Exception as exc:
-        logger.debug("Auxiliary client: could not load pool for %s: %s", provider, exc)
-        return False, None
-    if not pool or not pool.has_credentials():
-        return False, None
-    try:
-        return True, pool.select()
-    except Exception as exc:
-        logger.debug("Auxiliary client: could not select pool entry for %s: %s", provider, exc)
-        return True, None
-
-
-def _pool_runtime_api_key(entry: Any) -> str:
-    if entry is None:
-        return ""
-    key = getattr(entry, "runtime_api_key", None) or getattr(entry, "access_token", "")
-    return str(key or "").strip()
-
-
-def _pool_runtime_base_url(entry: Any, fallback: str = "") -> str:
-    if entry is None:
-        return str(fallback or "").strip().rstrip("/")
-    url = (
-        getattr(entry, "runtime_base_url", None)
-        or getattr(entry, "inference_base_url", None)
-        or getattr(entry, "base_url", None)
-        or fallback
-    )
-    return str(url or "").strip().rstrip("/")
-
-
-def _get_aux_model_for_provider(provider_id: str) -> str:
-    """Return the cheap auxiliary model for a provider."""
-    try:
-        from providers import get_provider_profile
-        _p = get_provider_profile(provider_id)
-        if _p and _p.default_aux_model:
-            return _p.default_aux_model
-    except Exception:
-        pass
-    return ""
-
-
-# ---------------------------------------------------------------------------
-# The resolver: called by core's resolve_provider_client()
-# ---------------------------------------------------------------------------
-
-def resolve_auxiliary_client(
-    *,
-    model: str | None = None,
-    explicit_api_key: str | None = None,
-    explicit_base_url: str | None = None,
-    async_mode: bool = False,
-    is_vision: bool = False,
-    main_runtime: dict | None = None,
-    api_mode: str | None = None,
-) -> tuple[Any, str] | tuple[None, None]:
-    """Resolve an auxiliary client for the Anthropic provider.
-
-    Returns ``(client, default_model)`` or ``(None, None)`` if unavailable.
-    """
-    from agent.plugin_registries import registries
-    from agent.anthropic_aux import (
-        AnthropicAuxiliaryClient,
-        AsyncAnthropicAuxiliaryClient,
-    )
-
-    _anthropic = registries.get_provider_namespace("anthropic")
-    build_anthropic_client = _anthropic.get("build_anthropic_client")
-    resolve_anthropic_token = _anthropic.get("resolve_anthropic_token")
-    if build_anthropic_client is None or resolve_anthropic_token is None:
-        return None, None
-
-    pool_present, entry = _select_pool_entry("anthropic")
-    if pool_present:
-        if entry is None:
-            return None, None
-        token = explicit_api_key or _pool_runtime_api_key(entry)
-    else:
-        entry = None
-        token = explicit_api_key or resolve_anthropic_token()
-    if not token:
-        return None, None
-
-    # Allow base URL override from config.yaml model.base_url, but only
-    # when the configured provider is anthropic.
-    base_url = _pool_runtime_base_url(entry, _ANTHROPIC_DEFAULT_BASE_URL) if pool_present else _ANTHROPIC_DEFAULT_BASE_URL
-    if explicit_base_url:
-        base_url = explicit_base_url.strip().rstrip("/")
-    try:
-        from hermes_cli.config import load_config
-        cfg = load_config()
-        model_cfg = cfg.get("model")
-        if isinstance(model_cfg, dict):
-            cfg_provider = str(model_cfg.get("provider") or "").strip().lower()
-            if cfg_provider == "anthropic":
-                cfg_base_url = (model_cfg.get("base_url") or "").strip().rstrip("/")
-                if cfg_base_url:
-                    base_url = cfg_base_url
-    except Exception:
-        pass
-
-    _is_oauth_token = _anthropic.get("_is_oauth_token")
-    is_oauth = _is_oauth_token(token) if _is_oauth_token else False
-    default_model = model or _get_aux_model_for_provider("anthropic") or "claude-haiku-4-5-20251001"
-    logger.debug("Auxiliary client: Anthropic native (%s) at %s (oauth=%s)", default_model, base_url, is_oauth)
-    try:
-        real_client = build_anthropic_client(token, base_url)
-    except ImportError:
-        return None, None
-
-    client = AnthropicAuxiliaryClient(real_client, default_model, token, base_url, is_oauth=is_oauth)
-
-    if async_mode:
-        client = AsyncAnthropicAuxiliaryClient(client)
-
-    return client, default_model
@@ -1,20 +0,0 @@
-[build-system]
-requires = ["setuptools>=61.0"]
-build-backend = "setuptools.build_meta"
-
-[project]
-name = "hermes-agent-anthropic"
-version = "0.1.0"
-description = "Anthropic Messages API adapter for Hermes Agent"
-requires-python = ">=3.11"
-dependencies = [
-    "hermes-agent",
-    "anthropic==0.87.0",
-    "hermes-agent-azure",
-]
-
-[project.entry-points."hermes_agent.plugins"]
-anthropic = "hermes_agent_anthropic:register"
-
-[tool.setuptools.packages.find]
-include = ["hermes_agent_anthropic*"]
@@ -1,180 +0,0 @@
-"""Shared fixtures for anthropic plugin tests.
-
-Registers the anthropic plugin in the singleton registry before each test
-and provides the ``agent`` fixture used by integration tests.
-"""
-
-import sys
-from pathlib import Path
-
-from unittest.mock import MagicMock, patch
-
-import pytest
-
-
-def pytest_configure(config):
-    """Remove sys.path entries that would shadow the real ``anthropic`` SDK.
-
-    pytest adds ``plugins/model-providers/`` to ``sys.path`` because
-    ``plugins/model-providers/anthropic/__init__.py`` (a provider profile)
-    exists.  This makes ``import anthropic`` find the plugin directory
-    instead of the installed SDK package, causing ``AttributeError:
-    module 'anthropic' has no attribute 'Anthropic'``.
-
-    We remove the conflicting entry, evict any wrong cached import, and
-    force-import the real SDK so sys.modules["anthropic"] is correct even
-    after pytest re-adds the conflicting path during collection.
-    """
-    import importlib
-    _repo_root = Path(__file__).resolve().parent.parent.parent.parent  # main/
-    _bad = str(_repo_root / "plugins" / "model-providers")
-    while _bad in sys.path:
-        sys.path.remove(_bad)
-    # Evict wrong import
-    if "anthropic" in sys.modules and not hasattr(sys.modules["anthropic"], "Anthropic"):
-        del sys.modules["anthropic"]
-    # Force-import the real SDK now (before pytest re-adds the bad path)
-    # so sys.modules["anthropic"] points to the real package.
-    try:
-        import anthropic as _real_anthropic  # noqa: F401
-        if not hasattr(_real_anthropic, "Anthropic"):
-            raise ImportError("wrong anthropic module loaded")
-    except ImportError:
-        # Try explicit import from venv
-        import importlib.util as _ilu
-        for _p in sys.path:
-            _candidate = Path(_p) / "anthropic" / "__init__.py"
-            if _candidate.exists() and (_candidate.parent / "_client.py").exists():
-                _spec = _ilu.spec_from_file_location("anthropic", _candidate)
-                if _spec and _spec.loader:
-                    _mod = _ilu.module_from_spec(_spec)
-                    sys.modules["anthropic"] = _mod
-                    _spec.loader.exec_module(_mod)
-                    break
-
-
-
-class _FullCtx:
-    """Plugin context that wires up all registry hooks the anthropic plugin uses.
-
-    Uses the real registries for provider_services, provider_resolver,
-    credential_pool_hook, transport, pricing, and provider_overlay so plugin
-    internals work.  Everything else is a no-op so the fixture doesn't depend
-    on parts of the system (platform, TTS, etc.) that aren't under test.
-    """
-
-    def register_provider_services(self, name, services):
-        from agent.plugin_registries import registries
-        registries.register_provider_services(name, services)
-
-    def register_provider_resolver(self, name, resolver):
-        from agent.plugin_registries import registries
-        registries.register_provider_resolver(name, resolver)
-
-    def register_credential_pool_hook(self, name, hook):
-        from agent.plugin_registries import registries
-        registries.register_credential_pool_hook(name, hook)
-
-    def register_transport(self, api_mode, transport_cls):
-        from agent.plugin_registries import registries
-        registries._transports[api_mode] = transport_cls
-
-    def register_pricing_provider(self, name, fn):
-        from agent.plugin_registries import registries
-        registries.register_pricing_provider(name, fn)
-
-    def register_provider_overlay(self, entry):
-        from agent.plugin_registries import registries
-        registries.register_provider_overlay(entry)
-
-    # Catch-all no-op for every other register_* method (platform, TTS,
-    # tools, hooks, skills, etc.) so the fixture never crashes when the
-    # plugin calls something we don't need to wire up for unit tests.
-    def __getattr__(self, name):
-        if name.startswith("register_"):
-            return lambda *a, **kw: None
-        raise AttributeError(name)
-
-
-@pytest.fixture(autouse=True)
-def _register_anthropic_plugin():
-    """Register the real anthropic plugin for the duration of each test,
-    then restore the registry to its prior state afterwards.
-
-    Calls the plugin's ``register()`` against a full context so that all
-    registry hooks (services, resolver, transport, pricing, etc.) are
-    populated.  Each affected registry dict is snapshotted up front and
-    restored after the test, so teardown is clean across conftest scopes.
-    """
-    from agent.plugin_registries import registries
-
-    # Snapshot current state so we can restore after the test.
-    _prev_services = dict(registries._provider_services)
-    _prev_resolvers = dict(registries._provider_resolvers)
-    _prev_cph = dict(registries._credential_pool_hooks)
-    _prev_transports = dict(registries._transports) if hasattr(registries, "_transports") else {}
-    _prev_pricing = dict(registries._pricing_providers) if hasattr(registries, "_pricing_providers") else {}
-    _prev_overlays = dict(registries._provider_overlays) if hasattr(registries, "_provider_overlays") else {}
-
-    ctx = _FullCtx()
-    try:
-        from hermes_agent_anthropic import register as _reg  # type: ignore[import]
-        _reg(ctx)
-    except ImportError:
-        pass
-
-    yield
-
-    # Restore — remove keys the plugin added, put back what was there before.
-    for d, prev in [
-        (registries._provider_services, _prev_services),
-        (registries._provider_resolvers, _prev_resolvers),
-        (registries._credential_pool_hooks, _prev_cph),
-    ]:
-        d.clear()
-        d.update(prev)
-    for attr, prev in [
-        ("_transports", _prev_transports),
-        ("_pricing_providers", _prev_pricing),
-        ("_provider_overlays", _prev_overlays),
-    ]:
-        if hasattr(registries, attr):
-            getattr(registries, attr).clear()
-            getattr(registries, attr).update(prev)
-
-
-def _make_tool_defs(*names: str) -> list:
-    """Build minimal tool definition list accepted by AIAgent.__init__."""
-    return [
-        {
-            "type": "function",
-            "function": {
-                "name": n,
-                "description": f"{n} tool",
-                "parameters": {"type": "object", "properties": {}},
-            },
-        }
-        for n in names
-    ]
-
-
-@pytest.fixture()
-def agent():
-    """Minimal AIAgent with mocked OpenAI client and tool loading."""
-    from run_agent import AIAgent
-    with (
-        patch(
-            "run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")
-        ),
-        patch("run_agent.check_toolset_requirements", return_value={}),
-        patch("run_agent.OpenAI"),
-    ):
-        a = AIAgent(
-            api_key="test-key-1234567890",
-            base_url="https://openrouter.ai/api/v1",
-            quiet_mode=True,
-            skip_context_files=True,
-            skip_memory=True,
-        )
-        a.client = MagicMock()
-        return a
@@ -1,420 +0,0 @@
-"""Integration tests for Anthropic-specific AIAgent behaviour.
-
-Tests that exercise the interaction between AIAgent and the Anthropic
-provider plugin — covering max_tokens passthrough, image fallback,
-provider fallback routing, base-url passthrough, credential refresh,
-and OAuth flag setting.
-"""
-
-import json
-from types import SimpleNamespace
-from unittest.mock import AsyncMock, MagicMock, patch
-
-import pytest
-
-from hermes_agent_anthropic.adapter import build_anthropic_client, resolve_anthropic_token, _is_oauth_token
-
-import run_agent
-from run_agent import AIAgent
-
-
-def _make_tool_defs(*names: str) -> list:
-    """Build minimal tool definition list accepted by AIAgent.__init__."""
-    return [
-        {
-            "type": "function",
-            "function": {
-                "name": n,
-                "description": f"{n} tool",
-                "parameters": {"type": "object", "properties": {}},
-            },
-        }
-        for n in names
-    ]
-
-
-class TestBuildApiKwargsAnthropicMaxTokens:
-    """Bug fix: max_tokens was always None for Anthropic mode, ignoring user config."""
-
-    def test_max_tokens_passed_to_anthropic(self, agent):
-        agent.api_mode = "anthropic_messages"
-        agent.max_tokens = 4096
-        agent.reasoning_config = None
-
-        with patch("agent.transports.anthropic.build_anthropic_kwargs") as mock_build:
-            mock_build.return_value = {"model": "claude-sonnet-4-20250514", "messages": [], "max_tokens": 4096}
-            agent._build_api_kwargs([{"role": "user", "content": "test"}])
-            _, kwargs = mock_build.call_args
-            if not kwargs:
-                kwargs = dict(zip(
-                    ["model", "messages", "tools", "max_tokens", "reasoning_config"],
-                    mock_build.call_args[0],
-                ))
-            assert kwargs.get("max_tokens") == 4096
-
-    def test_max_tokens_none_when_unset(self, agent):
-        agent.api_mode = "anthropic_messages"
-        agent.max_tokens = None
-        agent.reasoning_config = None
-
-        with patch("agent.transports.anthropic.build_anthropic_kwargs") as mock_build:
-            mock_build.return_value = {"model": "claude-sonnet-4-20250514", "messages": [], "max_tokens": 16384}
-            agent._build_api_kwargs([{"role": "user", "content": "test"}])
-            call_args = mock_build.call_args
-            # max_tokens should be None (let adapter use its default)
-            if call_args[1]:
-                assert call_args[1].get("max_tokens") is None
-            else:
-                assert call_args[0][3] is None
-
-
-class TestAnthropicImageFallback:
-    def test_build_api_kwargs_converts_multimodal_user_image_to_text(self, agent):
-        agent.api_mode = "anthropic_messages"
-        agent.reasoning_config = None
-
-        api_messages = [{
-            "role": "user",
-            "content": [
-                {"type": "text", "text": "Can you see this now?"},
-                {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
-            ],
-        }]
-
-        with (
-            patch("tools.vision_tools.vision_analyze_tool", new=AsyncMock(return_value=json.dumps({"success": True, "analysis": "A cat sitting on a chair."}))),
-            patch("agent.transports.anthropic.build_anthropic_kwargs") as mock_build,
-        ):
-            mock_build.return_value = {"model": "claude-sonnet-4-20250514", "messages": [], "max_tokens": 4096}
-            agent._build_api_kwargs(api_messages)
-
-        kwargs = mock_build.call_args.kwargs or dict(zip(
-            ["model", "messages", "tools", "max_tokens", "reasoning_config"],
-            mock_build.call_args.args,
-        ))
-        transformed = kwargs["messages"]
-        assert isinstance(transformed[0]["content"], str)
-        assert "A cat sitting on a chair." in transformed[0]["content"]
-        assert "Can you see this now?" in transformed[0]["content"]
-        assert "vision_analyze with image_url: https://example.com/cat.png" in transformed[0]["content"]
-
-    def test_build_api_kwargs_reuses_cached_image_analysis_for_duplicate_images(self, agent):
-        agent.api_mode = "anthropic_messages"
-        agent.reasoning_config = None
-        data_url = "data:image/png;base64,QUFBQQ=="
-
-        api_messages = [
-            {
-                "role": "user",
-                "content": [
-                    {"type": "text", "text": "first"},
-                    {"type": "input_image", "image_url": data_url},
-                ],
-            },
-            {
-                "role": "user",
-                "content": [
-                    {"type": "text", "text": "second"},
-                    {"type": "input_image", "image_url": data_url},
-                ],
-            },
-        ]
-
-        mock_vision = AsyncMock(return_value=json.dumps({"success": True, "analysis": "A small test image."}))
-        with (
-            patch("tools.vision_tools.vision_analyze_tool", new=mock_vision),
-            patch("agent.transports.anthropic.build_anthropic_kwargs") as mock_build,
-        ):
-            mock_build.return_value = {"model": "claude-sonnet-4-20250514", "messages": [], "max_tokens": 4096}
-            agent._build_api_kwargs(api_messages)
-
-        assert mock_vision.await_count == 1
-
-
-class TestFallbackAnthropicProvider:
-    """Bug fix: _try_activate_fallback had no case for anthropic provider."""
-
-    def test_fallback_to_anthropic_sets_api_mode(self, agent):
-        agent._fallback_activated = False
-        agent._fallback_model = {"provider": "anthropic", "model": "claude-sonnet-4-20250514"}
-        agent._fallback_chain = [agent._fallback_model]
-        agent._fallback_index = 0
-
-        mock_client = MagicMock()
-        mock_client.base_url = "https://api.anthropic.com/v1"
-        mock_client.api_key = "***"
-
-        with (
-            patch("agent.auxiliary_client.resolve_provider_client", return_value=(mock_client, None)),
-            patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build,
-            patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value=None),
-        ):
-            mock_build.return_value = MagicMock()
-            result = agent._try_activate_fallback()
-
-        assert result is True
-        assert agent.api_mode == "anthropic_messages"
-        assert agent._anthropic_client is not None
-        assert agent.client is None
-
-    def test_fallback_to_anthropic_enables_prompt_caching(self, agent):
-        agent._fallback_activated = False
-        agent._fallback_model = {"provider": "anthropic", "model": "claude-sonnet-4-20250514"}
-        agent._fallback_chain = [agent._fallback_model]
-        agent._fallback_index = 0
-
-        mock_client = MagicMock()
-        mock_client.base_url = "https://api.anthropic.com/v1"
-        mock_client.api_key = "***"
-
-        with (
-            patch("agent.auxiliary_client.resolve_provider_client", return_value=(mock_client, None)),
-            patch("hermes_agent_anthropic.adapter.build_anthropic_client", return_value=MagicMock()),
-            patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value=None),
-        ):
-            agent._try_activate_fallback()
-
-        assert agent._use_prompt_caching is True
-
-    def test_fallback_to_openrouter_uses_openai_client(self, agent):
-        agent._fallback_activated = False
-        agent._fallback_model = {"provider": "openrouter", "model": "anthropic/claude-sonnet-4"}
-        agent._fallback_chain = [agent._fallback_model]
-        agent._fallback_index = 0
-
-        mock_client = MagicMock()
-        mock_client.base_url = "https://openrouter.ai/api/v1"
-        mock_client.api_key = "sk-or-test"
-
-        with patch("agent.auxiliary_client.resolve_provider_client", return_value=(mock_client, None)):
-            result = agent._try_activate_fallback()
-
-        assert result is True
-        assert agent.api_mode == "chat_completions"
-        assert agent.client is mock_client
-
-
-class TestAnthropicBaseUrlPassthrough:
-    """Bug fix: base_url was filtered with 'anthropic in base_url', blocking proxies."""
-
-    def test_custom_proxy_base_url_passed_through(self):
-        with (
-            patch("run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")),
-            patch("run_agent.check_toolset_requirements", return_value={}),
-            patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build,
-        ):
-            mock_build.return_value = MagicMock()
-            a = AIAgent(
-                api_key="sk-ant...7890",
-                base_url="https://llm-proxy.company.com/v1",
-                api_mode="anthropic_messages",
-                quiet_mode=True,
-                skip_context_files=True,
-                skip_memory=True,
-            )
-            call_args = mock_build.call_args
-            # base_url should be passed through, not filtered out
-            assert call_args[0][1] == "https://llm-proxy.company.com/v1"
-
-    def test_none_base_url_passed_as_none(self):
-        with (
-            patch("run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")),
-            patch("run_agent.check_toolset_requirements", return_value={}),
-            patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build,
-        ):
-            mock_build.return_value = MagicMock()
-            a = AIAgent(
-                api_key="sk-ant...7890",
-                api_mode="anthropic_messages",
-                quiet_mode=True,
-                skip_context_files=True,
-                skip_memory=True,
-            )
-            call_args = mock_build.call_args
-            # No base_url provided, should be default empty string or None
-            passed_url = call_args[0][1]
-            assert not passed_url
-
-
-class TestAnthropicCredentialRefresh:
-    def test_try_refresh_anthropic_client_credentials_rebuilds_client(self):
-        with (
-            patch("run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")),
-            patch("run_agent.check_toolset_requirements", return_value={}),
-            patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build,
-        ):
-            old_client = MagicMock()
-            new_client = MagicMock()
-            mock_build.side_effect = [old_client, new_client]
-            agent = AIAgent(
-                api_key="sk-ant...oken",
-                base_url="https://openrouter.ai/api/v1",
-                api_mode="anthropic_messages",
-                quiet_mode=True,
-                skip_context_files=True,
-                skip_memory=True,
-            )
-
-        agent._anthropic_client = old_client
-        agent._anthropic_api_key = "sk-ant...old-token"   # differs from what resolve returns
-        agent._anthropic_base_url = "https://api.anthropic.com"
-        agent.provider = "anthropic"
-
-        with (
-            patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="sk-ant...oken"),
-            patch("hermes_agent_anthropic.adapter.build_anthropic_client", return_value=new_client) as rebuild,
-        ):
-            assert agent._try_refresh_anthropic_client_credentials() is True
-
-        old_client.close.assert_called_once()
-        rebuild.assert_called_once_with(
-            "sk-ant...oken", "https://api.anthropic.com", timeout=None,
-        )
-        assert agent._anthropic_client is new_client
-        assert agent._anthropic_api_key == "sk-ant...oken"
-
-    def test_try_refresh_anthropic_client_credentials_returns_false_when_token_unchanged(self):
-        with (
-            patch("run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")),
-            patch("run_agent.check_toolset_requirements", return_value={}),
-            patch("hermes_agent_anthropic.adapter.build_anthropic_client", return_value=MagicMock()),
-        ):
-            agent = AIAgent(
-                api_key="sk-ant...oken",
-                base_url="https://openrouter.ai/api/v1",
-                api_mode="anthropic_messages",
-                quiet_mode=True,
-                skip_context_files=True,
-                skip_memory=True,
-            )
-
-        old_client = MagicMock()
-        agent._anthropic_client = old_client
-        agent._anthropic_api_key = "sk-ant...oken"
-
-        with (
-            patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="sk-ant...oken"),
-            patch("hermes_agent_anthropic.adapter.build_anthropic_client") as rebuild,
-        ):
-            assert agent._try_refresh_anthropic_client_credentials() is False
-
-        old_client.close.assert_not_called()
-        rebuild.assert_not_called()
-
-    def test_anthropic_messages_create_preflights_refresh(self):
-        with (
-            patch("run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")),
-            patch("run_agent.check_toolset_requirements", return_value={}),
-            patch("hermes_agent_anthropic.adapter.build_anthropic_client", return_value=MagicMock()),
-        ):
-            agent = AIAgent(
-                api_key="sk-ant...oken",
-                base_url="https://openrouter.ai/api/v1",
-                api_mode="anthropic_messages",
-                quiet_mode=True,
-                skip_context_files=True,
-                skip_memory=True,
-            )
-
-        response = SimpleNamespace(content=[])
-        agent._anthropic_client = MagicMock()
-        agent._anthropic_client.messages.create.return_value = response
-
-        with patch.object(agent, "_try_refresh_anthropic_client_credentials", return_value=True) as refresh:
-            result = agent._anthropic_messages_create({"model": "claude-sonnet-4-20250514"})
-
-        refresh.assert_called_once_with()
-        agent._anthropic_client.messages.create.assert_called_once_with(model="claude-sonnet-4-20250514")
-        assert result is response
-
-
-class TestFallbackSetsOAuthFlag:
-    """_try_activate_fallback must set _is_anthropic_oauth for Anthropic fallbacks."""
-
-    def test_fallback_to_anthropic_oauth_sets_flag(self, agent):
-        agent._fallback_activated = False
-        agent._fallback_model = {"provider": "anthropic", "model": "claude-sonnet-4-6"}
-        agent._fallback_chain = [agent._fallback_model]
-        agent._fallback_index = 0
-
-        mock_client = MagicMock()
-        mock_client.base_url = "https://api.anthropic.com/v1"
-        mock_client.api_key = "sk-ant-setup-oauth-token"
-
-        with (
-            patch("agent.auxiliary_client.resolve_provider_client",
-                  return_value=(mock_client, None)),
-            patch("hermes_agent_anthropic.adapter.build_anthropic_client",
-                  return_value=MagicMock()),
-            patch("hermes_agent_anthropic.adapter.resolve_anthropic_token",
-                  return_value=None),
-        ):
-            result = agent._try_activate_fallback()
-
-        assert result is True
-        assert agent._is_anthropic_oauth is True
-
-    def test_fallback_to_anthropic_api_key_clears_flag(self, agent):
-        agent._fallback_activated = False
-        agent._fallback_model = {"provider": "anthropic", "model": "claude-sonnet-4-6"}
-        agent._fallback_chain = [agent._fallback_model]
-        agent._fallback_index = 0
-
-        mock_client = MagicMock()
-        mock_client.base_url = "https://api.anthropic.com/v1"
-        mock_client.api_key = "sk-ant-api03-regular-key"
-
-        with (
-            patch("agent.auxiliary_client.resolve_provider_client",
-                  return_value=(mock_client, None)),
-            patch("hermes_agent_anthropic.adapter.build_anthropic_client",
-                  return_value=MagicMock()),
-            patch("hermes_agent_anthropic.adapter.resolve_anthropic_token",
-                  return_value=None),
-        ):
-            result = agent._try_activate_fallback()
-
-        assert result is True
-        assert agent._is_anthropic_oauth is False
-
-
-class TestOAuthFlagAfterCredentialRefresh:
-    """_is_anthropic_oauth must update when token type changes during refresh."""
-
-    def test_oauth_flag_updates_api_key_to_oauth(self, agent):
-        """Refreshing from API key to OAuth token must set flag to True."""
-        from agent.plugin_registries import registries
-        agent.api_mode = "anthropic_messages"
-        agent.provider = "anthropic"
-        agent._anthropic_api_key = "***"
-        agent._anthropic_client = MagicMock()
-        agent._is_anthropic_oauth = False
-
-        with patch.dict(registries._provider_services, {"anthropic": {
-            "resolve_anthropic_token": MagicMock(return_value="sk-ant...oken"),
-            "build_anthropic_client": MagicMock(return_value=MagicMock()),
-            "_is_oauth_token": MagicMock(return_value=True),
-        }}):
-            result = agent._try_refresh_anthropic_client_credentials()
-
-        assert result is True
-        assert agent._is_anthropic_oauth is True
-
-    def test_oauth_flag_updates_oauth_to_api_key(self, agent):
-        """Refreshing from OAuth to API key must set flag to False."""
-        from agent.plugin_registries import registries
-        agent.api_mode = "anthropic_messages"
-        agent.provider = "anthropic"
-        agent._anthropic_api_key = "***"
-        agent._anthropic_client = MagicMock()
-        agent._is_anthropic_oauth = True
-
-        with patch.dict(registries._provider_services, {"anthropic": {
-            "resolve_anthropic_token": MagicMock(return_value="sk-ant...-key"),
-            "build_anthropic_client": MagicMock(return_value=MagicMock()),
-            "_is_oauth_token": MagicMock(return_value=False),
-        }}):
-            result = agent._try_refresh_anthropic_client_credentials()
-
-        assert result is True
-        assert agent._is_anthropic_oauth is False
@@ -1,98 +0,0 @@
-"""Anthropic-specific auth command tests moved from tests/hermes_cli/test_auth_commands.py."""
-
-from __future__ import annotations
-
-import base64
-import json
-
-import pytest
-
-
-def _write_auth_store(tmp_path, payload: dict) -> None:
-    hermes_home = tmp_path / "hermes"
-    hermes_home.mkdir(parents=True, exist_ok=True)
-    (hermes_home / "auth.json").write_text(json.dumps(payload, indent=2))
-
-
-def _jwt_with_email(email: str) -> str:
-    header = base64.urlsafe_b64encode(b'{"alg":"RS256","typ":"JWT"}').rstrip(b"=").decode()
-    payload = base64.urlsafe_b64encode(
-        json.dumps({"email": email}).encode()
-    ).rstrip(b"=").decode()
-    return f"{header}.{payload}.signature"
-
-
-@pytest.fixture(autouse=True)
-def _clear_provider_env(monkeypatch):
-    for key in (
-        "OPENROUTER_API_KEY",
-        "OPENAI_API_KEY",
-        "ANTHROPIC_API_KEY",
-        "ANTHROPIC_TOKEN",
-        "CLAUDE_CODE_OAUTH_TOKEN",
-    ):
-        monkeypatch.delenv(key, raising=False)
-
-
-def test_auth_add_anthropic_oauth_persists_pool_entry(tmp_path, monkeypatch):
-    monkeypatch.setenv("HERMES_HOME", str(tmp_path / "hermes"))
-    monkeypatch.delenv("ANTHROPIC_API_KEY", raising=False)
-    monkeypatch.delenv("ANTHROPIC_TOKEN", raising=False)
-    monkeypatch.delenv("CLAUDE_CODE_OAUTH_TOKEN", raising=False)
-    _write_auth_store(tmp_path, {"version": 1, "providers": {}})
-    token = _jwt_with_email("claude@example.com")
-    monkeypatch.setattr(
-        "hermes_agent_anthropic.adapter.run_hermes_oauth_login_pure",
-        lambda: {
-            "access_token": token,
-            "refresh_token": "refresh-token",
-            "expires_at_ms": 1711234567000,
-        },
-    )
-
-    from hermes_cli.auth_commands import auth_add_command
-
-    class _Args:
-        provider = "anthropic"
-        auth_type = "oauth"
-        api_key = None
-        label = None
-
-    auth_add_command(_Args())
-
-    payload = json.loads((tmp_path / "hermes" / "auth.json").read_text())
-    entries = payload["credential_pool"]["anthropic"]
-    entry = next(item for item in entries if item["source"] == "manual:hermes_pkce")
-    assert entry["label"] == "claude@example.com"
-    assert entry["source"] == "manual:hermes_pkce"
-    assert entry["refresh_token"] == "refresh-token"
-    assert entry["expires_at_ms"] == 1711234567000
-
-
-def test_seed_from_singletons_respects_hermes_pkce_suppression(tmp_path, monkeypatch):
-    """anthropic hermes_pkce must not re-seed from ~/.hermes/.anthropic_oauth.json when suppressed."""
-    hermes_home = tmp_path / "hermes"
-    hermes_home.mkdir(parents=True, exist_ok=True)
-    monkeypatch.setenv("HERMES_HOME", str(hermes_home))
-
-    import yaml
-    (hermes_home / "config.yaml").write_text(yaml.dump({"model": {"provider": "anthropic", "model": "claude"}}))
-    (hermes_home / "auth.json").write_text(json.dumps({
-        "version": 1,
-        "providers": {},
-        "suppressed_sources": {"anthropic": ["hermes_pkce"]},
-    }))
-
-    # Stub the readers so only hermes_pkce is "available"; claude_code returns None
-    import hermes_agent_anthropic as aa
-    monkeypatch.setattr(aa, "read_hermes_oauth_credentials", lambda: {
-        "accessToken": "tok", "refreshToken": "r", "expiresAt": 9999999999000,
-    })
-    monkeypatch.setattr(aa, "read_claude_code_credentials", lambda: None)
-
-    from agent.credential_pool import _seed_from_singletons
-    entries = []
-    changed, active = _seed_from_singletons("anthropic", entries)
-    # hermes_pkce suppressed, claude_code returns None → nothing should be seeded
-    assert entries == []
-    assert "hermes_pkce" not in active
@@ -1,535 +0,0 @@
-"""Tests for Anthropic-specific auxiliary client behaviour.
-
-Covers:
- OAuth vs API-key flag propagation (_try_anthropic → AnthropicAuxiliaryClient)
- explicit_api_key propagation through resolve_provider_client → _try_anthropic
- Expired Codex token fallback to Anthropic
- Vision client fallback with Anthropic
- Auth refresh retry for Anthropic clients
-"""
-
-import json
-from unittest.mock import MagicMock, AsyncMock, patch
-
-import pytest
-
-from agent.auxiliary_client import (
-    resolve_provider_client,
-    _read_codex_access_token,
-    _resolve_auto,
-    get_available_vision_backends,
-    call_llm,
-    async_call_llm,
-)
-from hermes_agent_anthropic.resolve import resolve_auxiliary_client as _try_anthropic
-from agent.anthropic_aux import AnthropicAuxiliaryClient
-
-
-class TestAnthropicOAuthFlag:
-    """Test that OAuth tokens get is_oauth=True in auxiliary Anthropic client."""
-
-    def test_oauth_token_sets_flag(self, monkeypatch):
-        """OAuth tokens (sk-ant-oat01-*) should create client with is_oauth=True."""
-        monkeypatch.setenv("ANTHROPIC_TOKEN", "sk-ant-oat01-test-token")
-        with patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build:
-            mock_build.return_value = MagicMock()
-            from hermes_agent_anthropic.resolve import resolve_auxiliary_client as _try_anthropic
-            from agent.anthropic_aux import AnthropicAuxiliaryClient
-            client, model = _try_anthropic()
-            assert client is not None
-            assert isinstance(client, AnthropicAuxiliaryClient)
-            # The adapter inside should have is_oauth=True
-            adapter = client.chat.completions
-            assert adapter._is_oauth is True
-
-    def test_api_key_no_oauth_flag(self, monkeypatch):
-        """Regular API keys (sk-ant-api-*) should create client with is_oauth=False."""
-        with patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="sk-ant-api03-testkey1234"), \
-             patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build, \
-             patch("hermes_agent_anthropic.resolve._select_pool_entry", return_value=(False, None)):
-            mock_build.return_value = MagicMock()
-            from hermes_agent_anthropic.resolve import resolve_auxiliary_client as _try_anthropic
-            from agent.anthropic_aux import AnthropicAuxiliaryClient
-            client, model = _try_anthropic()
-            assert client is not None
-            assert isinstance(client, AnthropicAuxiliaryClient)
-            adapter = client.chat.completions
-            assert adapter._is_oauth is False
-
-    def test_pool_entry_takes_priority_over_legacy_resolution(self):
-        class _Entry:
-            access_token = "sk-ant-oat01-pooled"
-            base_url = "https://api.anthropic.com"
-
-        class _Pool:
-            def has_credentials(self):
-                return True
-
-            def select(self):
-                return _Entry()
-
-        with (
-            patch("agent.credential_pool.load_pool", return_value=_Pool()),
-            patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", side_effect=AssertionError("legacy path should not run")),
-            patch("hermes_agent_anthropic.adapter.build_anthropic_client", return_value=MagicMock()) as mock_build,
-        ):
-            from hermes_agent_anthropic.resolve import resolve_auxiliary_client as _try_anthropic
-
-            client, model = _try_anthropic()
-
-        assert client is not None
-        assert model == "claude-haiku-4-5-20251001"
-        assert mock_build.call_args.args[0] == "sk-ant-oat01-pooled"
-
-
-class TestAnthropicExplicitApiKey:
-    """Test that explicit_api_key is correctly propagated to _try_anthropic().
-
-    Parity with the OpenRouter fix in #18768: resolve_provider_client() passes
-    explicit_api_key to _try_openrouter(), but the anthropic branch was not
-    updated — _try_anthropic() always fell back to resolve_anthropic_token()
-    even when an explicit key was supplied (e.g. from a fallback_model entry).
-    """
-
-    def test_try_anthropic_uses_explicit_api_key_over_env(self):
-        """_try_anthropic(explicit_api_key) must use the supplied key, not the env fallback."""
-        with patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="env-fallback-key"), \
-             patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build, \
-             patch("hermes_agent_anthropic.resolve._select_pool_entry", return_value=(False, None)):
-            mock_build.return_value = MagicMock()
-            from hermes_agent_anthropic.resolve import resolve_auxiliary_client as _try_anthropic
-            client, model = _try_anthropic(explicit_api_key="explicit-pool-key")
-        assert client is not None
-        assert mock_build.call_args.args[0] == "explicit-pool-key", (
-            f"Expected explicit_api_key to be passed, got: {mock_build.call_args.args[0]}"
-        )
-        assert mock_build.call_args.args[0] != "env-fallback-key"
-
-    def test_try_anthropic_without_explicit_key_falls_back_to_resolve(self):
-        """Without explicit_api_key, _try_anthropic falls back to resolve_anthropic_token."""
-        with patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="env-fallback-key"), \
-             patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build, \
-             patch("hermes_agent_anthropic.resolve._select_pool_entry", return_value=(False, None)):
-            mock_build.return_value = MagicMock()
-            from hermes_agent_anthropic.resolve import resolve_auxiliary_client as _try_anthropic
-            client, model = _try_anthropic()
-        assert client is not None
-        assert mock_build.call_args.args[0] == "env-fallback-key"
-
-    def test_resolve_provider_client_passes_explicit_api_key_to_anthropic(self):
-        """resolve_provider_client(provider='anthropic', explicit_api_key=...) must propagate the key."""
-        with patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="env-key"), \
-             patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build, \
-             patch("hermes_agent_anthropic.resolve._select_pool_entry", return_value=(False, None)):
-            mock_build.return_value = MagicMock()
-            client, model = resolve_provider_client(
-                provider="anthropic",
-                explicit_api_key="explicit-fallback-key",
-            )
-        assert client is not None
-        assert mock_build.call_args.args[0] == "explicit-fallback-key", (
-            "resolve_provider_client must forward explicit_api_key to _try_anthropic()"
-        )
-
-
-class TestExpiredCodexFallback:
-    """Test that expired Codex tokens don't block the auto chain."""
-
-    def test_expired_codex_falls_through_to_next(self, tmp_path, monkeypatch):
-        """When Codex token is expired, auto chain should skip it and try next provider."""
-        import base64
-        import time as _time
-
-        # Expired Codex JWT
-        header = base64.urlsafe_b64encode(b'{"alg":"RS256","typ":"JWT"}').rstrip(b"=").decode()
-        payload_data = json.dumps({"exp": int(_time.time()) - 3600}).encode()
-        payload = base64.urlsafe_b64encode(payload_data).rstrip(b"=").decode()
-        expired_jwt = f"{header}.{payload}.fakesig"
-
-        hermes_home = tmp_path / "hermes"
-        hermes_home.mkdir(parents=True, exist_ok=True)
-        (hermes_home / "auth.json").write_text(json.dumps({
-            "version": 1,
-            "providers": {
-                "openai-codex": {
-                    "tokens": {"access_token": expired_jwt, "refresh_token": "***"},
-                },
-            },
-        }))
-        monkeypatch.setenv("HERMES_HOME", str(hermes_home))
-
-        # Set up Anthropic as fallback
-        monkeypatch.setenv("ANTHROPIC_TOKEN", "sk-ant...back")
-        with patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build:
-            mock_build.return_value = MagicMock()
-            client, model = _resolve_auto()
-            # Should NOT be Codex, should be Anthropic (or another available provider)
-            assert client is not None, "Should find a provider after expired Codex"
-
-
-    def test_expired_codex_openrouter_wins(self, tmp_path, monkeypatch):
-        """With expired Codex + OpenRouter key, OpenRouter should win (1st in chain)."""
-        import base64
-        import time as _time
-
-        # Belt-and-suspenders: _try_openrouter marks openrouter unhealthy
-        # when OPENROUTER_API_KEY is absent (which the preceding test in
-        # this class exercises).  The file-level _clean_env autouse fixture
-        # clears the cache, but fixture ordering with the conftest
-        # _hermetic_environment autouse can leave a narrow window where
-        # the mark reappears.  Explicitly clear here so this test is
-        # independent of run order.
-        import agent.auxiliary_client as _aux_mod
-        _aux_mod._aux_unhealthy_until.clear()
-        _aux_mod._aux_unhealthy_logged_at.clear()
-
-        header = base64.urlsafe_b64encode(b'{"alg":"RS256","typ":"JWT"}').rstrip(b"=").decode()
-        payload_data = json.dumps({"exp": int(_time.time()) - 3600}).encode()
-        payload = base64.urlsafe_b64encode(payload_data).rstrip(b"=").decode()
-        expired_jwt = f"{header}.{payload}.fakesig"
-
-        hermes_home = tmp_path / "hermes"
-        hermes_home.mkdir(parents=True, exist_ok=True)
-        (hermes_home / "auth.json").write_text(json.dumps({
-            "version": 1,
-            "providers": {
-                "openai-codex": {
-                    "tokens": {"access_token": expired_jwt, "refresh_token": "***"},
-                },
-            },
-        }))
-        monkeypatch.setenv("HERMES_HOME", str(hermes_home))
-        monkeypatch.setenv("OPENROUTER_API_KEY", "or-test-key")
-
-        with patch("agent.auxiliary_client.OpenAI") as mock_openai:
-            mock_openai.return_value = MagicMock()
-            client, model = _resolve_auto()
-            assert client is not None
-            # OpenRouter is 1st in chain, should win
-            mock_openai.assert_called()
-
-    def test_expired_codex_custom_endpoint_wins(self, tmp_path, monkeypatch):
-        """With expired Codex + custom endpoint (Ollama), custom should win (3rd in chain)."""
-        import base64
-        import time as _time
-
-        header = base64.urlsafe_b64encode(b'{"alg":"RS256","typ":"JWT"}').rstrip(b"=").decode()
-        payload_data = json.dumps({"exp": int(_time.time()) - 3600}).encode()
-        payload = base64.urlsafe_b64encode(payload_data).rstrip(b"=").decode()
-        expired_jwt = f"{header}.{payload}.fakesig"
-
-        hermes_home = tmp_path / "hermes"
-        hermes_home.mkdir(parents=True, exist_ok=True)
-        (hermes_home / "auth.json").write_text(json.dumps({
-            "version": 1,
-            "providers": {
-                "openai-codex": {
-                    "tokens": {"access_token": expired_jwt, "refresh_token": "***"},
-                },
-            },
-        }))
-        monkeypatch.setenv("HERMES_HOME", str(hermes_home))
-
-        # Simulate Ollama or custom endpoint
-        with patch("agent.auxiliary_client._resolve_custom_runtime",
-                   return_value=("http://localhost:11434/v1", "sk-dummy")):
-            with patch("agent.auxiliary_client.OpenAI") as mock_openai:
-                mock_openai.return_value = MagicMock()
-                client, model = _resolve_auto()
-                assert client is not None
-
-
-    def test_hermes_oauth_file_sets_oauth_flag(self, monkeypatch):
-        """OAuth-style tokens should get is_oauth=True (token is not sk-ant-api-*)."""
-        # Mock resolve_anthropic_token to return an OAuth-style token
-        with patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJ0ZXN0In0.sig"), \
-             patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build, \
-             patch("hermes_agent_anthropic.resolve._select_pool_entry", return_value=(False, None)):
-            mock_build.return_value = MagicMock()
-            client, model = _try_anthropic()
-            assert client is not None, "Should resolve token"
-            adapter = client.chat.completions
-            assert adapter._is_oauth is True, "Non-sk-ant-api token should set is_oauth=True"
-
-    def test_jwt_missing_exp_passes_through(self, tmp_path, monkeypatch):
-        """JWT with valid JSON but no exp claim should pass through."""
-        import base64
-        header = base64.urlsafe_b64encode(b'{"alg":"RS256","typ":"JWT"}').rstrip(b"=").decode()
-        payload_data = json.dumps({"sub": "user123"}).encode()  # no exp
-        payload = base64.urlsafe_b64encode(payload_data).rstrip(b"=").decode()
-        no_exp_jwt = f"{header}.{payload}.fakesig"
-
-        hermes_home = tmp_path / "hermes"
-        hermes_home.mkdir(parents=True, exist_ok=True)
-        (hermes_home / "auth.json").write_text(json.dumps({
-            "version": 1,
-            "providers": {
-                "openai-codex": {
-                    "tokens": {"access_token": no_exp_jwt, "refresh_token": "***"},
-                },
-            },
-        }))
-        monkeypatch.setenv("HERMES_HOME", str(hermes_home))
-        result = _read_codex_access_token()
-        assert result == no_exp_jwt, "JWT without exp should pass through"
-
-    def test_jwt_invalid_json_payload_passes_through(self, tmp_path, monkeypatch):
-        """JWT with valid base64 but invalid JSON payload should pass through."""
-        import base64
-        header = base64.urlsafe_b64encode(b'{"alg":"RS256"}').rstrip(b"=").decode()
-        payload = base64.urlsafe_b64encode(b"not-json-content").rstrip(b"=").decode()
-        bad_jwt = f"{header}.{payload}.fakesig"
-
-        hermes_home = tmp_path / "hermes"
-        hermes_home.mkdir(parents=True, exist_ok=True)
-        (hermes_home / "auth.json").write_text(json.dumps({
-            "version": 1,
-            "providers": {
-                "openai-codex": {
-                    "tokens": {"access_token": bad_jwt, "refresh_token": "***"},
-                },
-            },
-        }))
-        monkeypatch.setenv("HERMES_HOME", str(hermes_home))
-        result = _read_codex_access_token()
-        assert result == bad_jwt, "JWT with invalid JSON payload should pass through"
-
-    def test_claude_code_oauth_env_sets_flag(self, monkeypatch):
-        """CLAUDE_CODE_OAUTH_TOKEN env var should get is_oauth=True."""
-        monkeypatch.setenv("CLAUDE_CODE_OAUTH_TOKEN", "eyJhbG...test.sig")  # JWT → is_oauth=True
-        monkeypatch.delenv("ANTHROPIC_TOKEN", raising=False)
-        with patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build:
-            mock_build.return_value = MagicMock()
-            client, model = _try_anthropic()
-            assert client is not None
-            adapter = client.chat.completions
-            assert adapter._is_oauth is True
-
-
-class TestVisionClientFallback:
-    """Vision client auto mode resolves known-good multimodal backends."""
-
-    def test_vision_auto_includes_active_provider_when_configured(self, monkeypatch):
-        """Active provider appears in available backends when credentials exist."""
-        monkeypatch.setenv("ANTHROPIC_API_KEY", "***")
-        with (
-            patch("agent.auxiliary_client._read_nous_auth", return_value=None),
-            patch("agent.auxiliary_client._read_main_provider", return_value="anthropic"),
-            patch("agent.auxiliary_client._read_main_model", return_value="claude-sonnet-4"),
-            patch("hermes_agent_anthropic.adapter.build_anthropic_client", return_value=MagicMock()),
-            patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJ0ZXN0In0.sig"),
-        ):
-            backends = get_available_vision_backends()
-
-        assert "anthropic" in backends
-
-    def test_resolve_provider_client_returns_native_anthropic_wrapper(self, monkeypatch):
-        monkeypatch.setenv("ANTHROPIC_API_KEY", "***")
-        with (
-            patch("agent.auxiliary_client._read_nous_auth", return_value=None),
-            patch("hermes_agent_anthropic.adapter.build_anthropic_client", return_value=MagicMock()),
-            patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJ0ZXN0In0.sig"),
-        ):
-            client, model = resolve_provider_client("anthropic")
-
-        assert client is not None
-        assert client.__class__.__name__ == "AnthropicAuxiliaryClient"
-        assert model == "claude-haiku-4-5-20251001"
-
-
-class _AuxAuth401(Exception):
-    status_code = 401
-
-    def __init__(self, message="Provided authentication token is expired"):
-        super().__init__(message)
-
-
-class _DummyResponse:
-    def __init__(self, text="ok"):
-        self.choices = [MagicMock(message=MagicMock(content=text))]
-
-
-class _FailingThenSuccessCompletions:
-    def __init__(self):
-        self.calls = 0
-
-    def create(self, **kwargs):
-        self.calls += 1
-        if self.calls == 1:
-            raise _AuxAuth401()
-        return _DummyResponse("sync-ok")
-
-
-class _AsyncFailingThenSuccessCompletions:
-    def __init__(self):
-        self.calls = 0
-
-    async def create(self, **kwargs):
-        self.calls += 1
-        if self.calls == 1:
-            raise _AuxAuth401()
-        return _DummyResponse("async-ok")
-
-
-class TestAuxiliaryAuthRefreshRetry:
-    def test_call_llm_refreshes_codex_on_401_for_vision(self):
-        failing_client = MagicMock()
-        failing_client.base_url = "https://chatgpt.com/backend-api/codex"
-        failing_client.chat.completions = _FailingThenSuccessCompletions()
-
-        fresh_client = MagicMock()
-        fresh_client.base_url = "https://chatgpt.com/backend-api/codex"
-        fresh_client.chat.completions.create.return_value = _DummyResponse("fresh-sync")
-
-        with (
-            patch(
-                "agent.auxiliary_client.resolve_vision_provider_client",
-                side_effect=[("openai-codex", failing_client, "gpt-5.4"), ("openai-codex", fresh_client, "gpt-5.4")],
-            ),
-            patch("agent.auxiliary_client._refresh_provider_credentials", return_value=True) as mock_refresh,
-        ):
-            resp = call_llm(
-                task="vision",
-                provider="openai-codex",
-                model="gpt-5.4",
-                messages=[{"role": "user", "content": "hi"}],
-            )
-
-        assert resp.choices[0].message.content == "fresh-sync"
-        mock_refresh.assert_called_once_with("openai-codex")
-
-    def test_call_llm_refreshes_codex_on_401_for_non_vision(self):
-        stale_client = MagicMock()
-        stale_client.base_url = "https://chatgpt.com/backend-api/codex"
-        stale_client.chat.completions.create.side_effect = _AuxAuth401("stale codex token")
-
-        fresh_client = MagicMock()
-        fresh_client.base_url = "https://chatgpt.com/backend-api/codex"
-        fresh_client.chat.completions.create.return_value = _DummyResponse("fresh-non-vision")
-
-        with (
-            patch("agent.auxiliary_client._resolve_task_provider_model", return_value=("openai-codex", "gpt-5.4", None, None, None)),
-            patch("agent.auxiliary_client._get_cached_client", side_effect=[(stale_client, "gpt-5.4"), (fresh_client, "gpt-5.4")]),
-            patch("agent.auxiliary_client._refresh_provider_credentials", return_value=True) as mock_refresh,
-        ):
-            resp = call_llm(
-                task="compression",
-                provider="openai-codex",
-                model="gpt-5.4",
-                messages=[{"role": "user", "content": "hi"}],
-            )
-
-        assert resp.choices[0].message.content == "fresh-non-vision"
-        mock_refresh.assert_called_once_with("openai-codex")
-        assert stale_client.chat.completions.create.call_count == 1
-        assert fresh_client.chat.completions.create.call_count == 1
-
-    def test_call_llm_refreshes_anthropic_on_401_for_non_vision(self):
-        stale_client = MagicMock()
-        stale_client.base_url = "https://api.anthropic.com"
-        stale_client.chat.completions.create.side_effect = _AuxAuth401("anthropic token expired")
-
-        fresh_client = MagicMock()
-        fresh_client.base_url = "https://api.anthropic.com"
-        fresh_client.chat.completions.create.return_value = _DummyResponse("fresh-anthropic")
-
-        with (
-            patch("agent.auxiliary_client._resolve_task_provider_model", return_value=("anthropic", "claude-haiku-4-5-20251001", None, None, None)),
-            patch("agent.auxiliary_client._get_cached_client", side_effect=[(stale_client, "claude-haiku-4-5-20251001"), (fresh_client, "claude-haiku-4-5-20251001")]),
-            patch("agent.auxiliary_client._refresh_provider_credentials", return_value=True) as mock_refresh,
-        ):
-            resp = call_llm(
-                task="compression",
-                provider="anthropic",
-                model="claude-haiku-4-5-20251001",
-                messages=[{"role": "user", "content": "hi"}],
-            )
-
-        assert resp.choices[0].message.content == "fresh-anthropic"
-        mock_refresh.assert_called_once_with("anthropic")
-        assert stale_client.chat.completions.create.call_count == 1
-        assert fresh_client.chat.completions.create.call_count == 1
-
-    @pytest.mark.asyncio
-    async def test_async_call_llm_refreshes_codex_on_401_for_vision(self):
-        failing_client = MagicMock()
-        failing_client.base_url = "https://chatgpt.com/backend-api/codex"
-        failing_client.chat.completions = _AsyncFailingThenSuccessCompletions()
-
-        fresh_client = MagicMock()
-        fresh_client.base_url = "https://chatgpt.com/backend-api/codex"
-        fresh_client.chat.completions.create = AsyncMock(return_value=_DummyResponse("fresh-async"))
-
-        with (
-            patch(
-                "agent.auxiliary_client.resolve_vision_provider_client",
-                side_effect=[("openai-codex", failing_client, "gpt-5.4"), ("openai-codex", fresh_client, "gpt-5.4")],
-            ),
-            patch("agent.auxiliary_client._refresh_provider_credentials", return_value=True) as mock_refresh,
-        ):
-            resp = await async_call_llm(
-                task="vision",
-                provider="openai-codex",
-                model="gpt-5.4",
-                messages=[{"role": "user", "content": "hi"}],
-            )
-
-        assert resp.choices[0].message.content == "fresh-async"
-        mock_refresh.assert_called_once_with("openai-codex")
-
-    def test_refresh_provider_credentials_force_refreshes_anthropic_oauth_and_evicts_cache(self, monkeypatch):
-        stale_client = MagicMock()
-        cache_key = ("anthropic", False, None, None, None)
-
-        monkeypatch.setenv("ANTHROPIC_TOKEN", "")
-        monkeypatch.setenv("CLAUDE_CODE_OAUTH_TOKEN", "")
-        monkeypatch.setenv("ANTHROPIC_API_KEY", "")
-
-        with (
-            patch("agent.auxiliary_client._client_cache", {cache_key: (stale_client, "claude-haiku-4-5-20251001", None)}),
-            patch("hermes_agent_anthropic.adapter.read_claude_code_credentials", return_value={
-                "accessToken": "expired-token",
-                "refreshToken": "refresh-token",
-                "expiresAt": 0,
-            }),
-            patch("hermes_agent_anthropic.adapter.refresh_anthropic_oauth_pure", return_value={
-                "access_token": "fresh-token",
-                "refresh_token": "refresh-token-2",
-                "expires_at_ms": 9999999999999,
-            }) as mock_refresh_oauth,
-            patch("hermes_agent_anthropic.adapter._write_claude_code_credentials") as mock_write,
-        ):
-            from agent.auxiliary_client import _refresh_provider_credentials
-
-            assert _refresh_provider_credentials("anthropic") is True
-
-        mock_refresh_oauth.assert_called_once_with("refresh-token", use_json=False)
-        mock_write.assert_called_once_with("fresh-token", "refresh-token-2", 9999999999999)
-        stale_client.close.assert_called_once()
-
-    @pytest.mark.asyncio
-    async def test_async_call_llm_refreshes_anthropic_on_401_for_non_vision(self):
-        stale_client = MagicMock()
-        stale_client.base_url = "https://api.anthropic.com"
-        stale_client.chat.completions.create = AsyncMock(side_effect=_AuxAuth401("anthropic token expired"))
-
-        fresh_client = MagicMock()
-        fresh_client.base_url = "https://api.anthropic.com"
-        fresh_client.chat.completions.create = AsyncMock(return_value=_DummyResponse("fresh-async-anthropic"))
-
-        with (
-            patch("agent.auxiliary_client._resolve_task_provider_model", return_value=("anthropic", "claude-haiku-4-5-20251001", None, None, None)),
-            patch("agent.auxiliary_client._get_cached_client", side_effect=[(stale_client, "claude-haiku-4-5-20251001"), (fresh_client, "claude-haiku-4-5-20251001")]),
-            patch("agent.auxiliary_client._refresh_provider_credentials", return_value=True) as mock_refresh,
-        ):
-            resp = await async_call_llm(
-                task="compression",
-                provider="anthropic",
-                model="claude-haiku-4-5-20251001",
-                messages=[{"role": "user", "content": "hi"}],
-            )
-
-        assert resp.choices[0].message.content == "fresh-async-anthropic"
-        mock_refresh.assert_called_once_with("anthropic")
-        assert stale_client.chat.completions.create.await_count == 1
-        assert fresh_client.chat.completions.create.await_count == 1
@@ -1,129 +0,0 @@
-"""Anthropic-specific computer use tests moved from tests/tools/test_computer_use.py."""
-
-from __future__ import annotations
-
-from typing import Any, Dict, List
-
-
-# ---------------------------------------------------------------------------
-# Anthropic adapter: multimodal tool-result conversion
-# ---------------------------------------------------------------------------
-
-class TestAnthropicAdapterMultimodal:
-    def test_multimodal_envelope_becomes_tool_result_with_image_block(self):
-        from agent.anthropic_format import convert_messages_to_anthropic
-
-        fake_png = "iVBORw0KGgo="
-        messages = [
-            {"role": "user", "content": "take a screenshot"},
-            {
-                "role": "assistant",
-                "content": "",
-                "tool_calls": [{
-                    "id": "call_1",
-                    "type": "function",
-                    "function": {"name": "computer_use", "arguments": "{}"},
-                }],
-            },
-            {
-                "role": "tool",
-                "tool_call_id": "call_1",
-                "content": {
-                    "_multimodal": True,
-                    "content": [
-                        {"type": "text", "text": "1 element"},
-                        {"type": "image_url",
-                         "image_url": {"url": f"data:image/png;base64,{fake_png}"}},
-                    ],
-                    "text_summary": "1 element",
-                },
-            },
-        ]
-        _, anthropic_msgs = convert_messages_to_anthropic(messages)
-        tool_result_msgs = [m for m in anthropic_msgs if m["role"] == "user"
-                            and isinstance(m["content"], list)
-                            and any(b.get("type") == "tool_result" for b in m["content"])]
-        assert tool_result_msgs, "expected a tool_result user message"
-        tr = next(b for b in tool_result_msgs[-1]["content"] if b.get("type") == "tool_result")
-        inner = tr["content"]
-        assert any(b.get("type") == "image" for b in inner)
-        assert any(b.get("type") == "text" for b in inner)
-
-    def test_old_screenshots_are_evicted_beyond_max_keep(self):
-        """Image blocks in old tool_results get replaced with placeholders."""
-        from agent.anthropic_format import convert_messages_to_anthropic
-
-        fake_png = "iVBORw0KGgo="
-
-        def _mm_tool(call_id: str) -> Dict[str, Any]:
-            return {
-                "role": "tool",
-                "tool_call_id": call_id,
-                "content": {
-                    "_multimodal": True,
-                    "content": [
-                        {"type": "text", "text": "cap"},
-                        {"type": "image_url",
-                         "image_url": {"url": f"data:image/png;base64,{fake_png}"}},
-                    ],
-                    "text_summary": "cap",
-                },
-            }
-
-        # Build 5 screenshots interleaved with assistant messages.
-        messages: List[Dict[str, Any]] = [{"role": "user", "content": "start"}]
-        for i in range(5):
-            messages.append({
-                "role": "assistant", "content": "",
-                "tool_calls": [{
-                    "id": f"call_{i}",
-                    "type": "function",
-                    "function": {"name": "computer_use", "arguments": "{}"},
-                }],
-            })
-            messages.append(_mm_tool(f"call_{i}"))
-        messages.append({"role": "assistant", "content": "done"})
-
-        _, anthropic_msgs = convert_messages_to_anthropic(messages)
-
-        # Walk tool_result blocks in order; the OLDEST (5 - 3) = 2 should be
-        # text-only placeholders, newest 3 should still carry image blocks.
-        tool_results = []
-        for m in anthropic_msgs:
-            if m["role"] != "user" or not isinstance(m["content"], list):
-                continue
-            for b in m["content"]:
-                if b.get("type") == "tool_result":
-                    tool_results.append(b)
-
-        assert len(tool_results) == 5
-        with_images = [
-            b for b in tool_results
-            if isinstance(b.get("content"), list)
-            and any(x.get("type") == "image" for x in b["content"])
-        ]
-        placeholders = [
-            b for b in tool_results
-            if isinstance(b.get("content"), list)
-            and any(
-                x.get("type") == "text"
-                and "screenshot removed" in x.get("text", "")
-                for x in b["content"]
-            )
-        ]
-        assert len(with_images) == 3
-        assert len(placeholders) == 2
-
-    def test_content_parts_helper_filters_to_text_and_image(self):
-        from agent.anthropic_format import _content_parts_to_anthropic_blocks
-
-        fake_png = "iVBORw0KGgo="
-        blocks = _content_parts_to_anthropic_blocks([
-            {"type": "text", "text": "hi"},
-            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{fake_png}"}},
-            {"type": "unsupported", "data": "ignored"},
-        ])
-        types = [b["type"] for b in blocks]
-        assert "text" in types
-        assert "image" in types
-        assert len(blocks) == 2
@@ -1,47 +0,0 @@
-"""Anthropic-specific ctx halving tests moved from tests/test_ctx_halving_fix.py."""
-
-
-# ---------------------------------------------------------------------------
-# build_anthropic_kwargs — output cap clamping
-# ---------------------------------------------------------------------------
-
-class TestBuildAnthropicKwargsClamping:
-    """The context_length clamp only fires when output ceiling > window.
-    For standard Anthropic models (output ceiling < window) it must not fire.
-    """
-
-    def _build(self, model, max_tokens=None, context_length=None):
-        from agent.anthropic_format import build_anthropic_kwargs
-        return build_anthropic_kwargs(
-            model=model,
-            messages=[{"role": "user", "content": "hi"}],
-            tools=None,
-            max_tokens=max_tokens,
-            reasoning_config=None,
-            context_length=context_length,
-        )
-
-    def test_no_clamping_when_output_ceiling_fits_in_window(self):
-        """Opus 4.6 native output (128K) < context window (200K) — no clamping."""
-        kwargs = self._build("claude-opus-4-6", context_length=200_000)
-        assert kwargs["max_tokens"] == 128_000
-
-    def test_clamping_fires_for_tiny_custom_window(self):
-        """When context_length is 8K (local model), output cap is clamped to 7999."""
-        kwargs = self._build("claude-opus-4-6", context_length=8_000)
-        assert kwargs["max_tokens"] == 7_999
-
-    def test_explicit_max_tokens_respected_when_within_window(self):
-        """Explicit max_tokens smaller than window passes through unchanged."""
-        kwargs = self._build("claude-opus-4-6", max_tokens=4096, context_length=200_000)
-        assert kwargs["max_tokens"] == 4096
-
-    def test_explicit_max_tokens_clamped_when_exceeds_window(self):
-        """Explicit max_tokens larger than a small window is clamped."""
-        kwargs = self._build("claude-opus-4-6", max_tokens=32_768, context_length=16_000)
-        assert kwargs["max_tokens"] == 15_999
-
-    def test_no_context_length_uses_native_ceiling(self):
-        """Without context_length the native output ceiling is used directly."""
-        kwargs = self._build("claude-sonnet-4-6")
-        assert kwargs["max_tokens"] == 64_000
@@ -1,231 +0,0 @@
-"""Anthropic-specific fast mode tests moved from tests/cli/test_fast_command.py."""
-
-import unittest
-from types import SimpleNamespace
-
-
-def _import_cli():
-    import hermes_cli.config as config_mod
-
-    if not hasattr(config_mod, "save_env_value_secure"):
-        config_mod.save_env_value_secure = lambda key, value: {
-            "success": True,
-            "stored_as": key,
-            "validated": False,
-        }
-
-    import cli as cli_mod
-
-    return cli_mod
-
-
-class TestAnthropicFastMode(unittest.TestCase):
-    """Verify Anthropic Fast Mode model support and override resolution."""
-
-    def test_anthropic_opus_supported(self):
-        from hermes_cli.models import model_supports_fast_mode
-
-        # Native Anthropic format (hyphens)
-        assert model_supports_fast_mode("claude-opus-4-6") is True
-        # OpenRouter format (dots)
-        assert model_supports_fast_mode("claude-opus-4.6") is True
-        # With vendor prefix
-        assert model_supports_fast_mode("anthropic/claude-opus-4-6") is True
-        assert model_supports_fast_mode("anthropic/claude-opus-4.6") is True
-
-    def test_anthropic_non_opus46_models_excluded(self):
-        """Anthropic restricts fast mode to Opus 4.6 — others must be excluded.
-
-        Per https://platform.claude.com/docs/en/build-with-claude/fast-mode,
-        sending speed=fast to Opus 4.7, Sonnet, or Haiku returns HTTP 400.
-        """
-        from hermes_cli.models import model_supports_fast_mode
-
-        assert model_supports_fast_mode("claude-sonnet-4-6") is False
-        assert model_supports_fast_mode("claude-sonnet-4.6") is False
-        assert model_supports_fast_mode("claude-haiku-4-5") is False
-        assert model_supports_fast_mode("claude-opus-4-7") is False
-        assert model_supports_fast_mode("anthropic/claude-sonnet-4.6") is False
-        assert model_supports_fast_mode("anthropic/claude-opus-4-7") is False
-
-    def test_non_claude_models_not_anthropic_fast(self):
-        """Non-Claude models should not be treated as Anthropic fast-mode."""
-        from hermes_cli.models import _is_anthropic_fast_model
-
-        assert _is_anthropic_fast_model("gpt-5.4") is False
-        assert _is_anthropic_fast_model("gemini-3-pro") is False
-        assert _is_anthropic_fast_model("kimi-k2-thinking") is False
-
-    def test_anthropic_variant_tags_stripped(self):
-        from hermes_cli.models import model_supports_fast_mode
-
-        # OpenRouter variant tags after colon should be stripped
-        assert model_supports_fast_mode("claude-opus-4.6:fast") is True
-        assert model_supports_fast_mode("claude-opus-4.6:beta") is True
-
-    def test_resolve_overrides_returns_speed_for_anthropic(self):
-        from hermes_cli.models import resolve_fast_mode_overrides
-
-        result = resolve_fast_mode_overrides("claude-opus-4-6")
-        assert result == {"speed": "fast"}
-
-        result = resolve_fast_mode_overrides("anthropic/claude-opus-4.6")
-        assert result == {"speed": "fast"}
-
-    def test_resolve_overrides_returns_none_for_unsupported_claude(self):
-        """Opus 4.7 and other Claude models don't support fast mode (API 400s).
-
-        Per Anthropic docs, fast mode is currently Opus 4.6 only.
-        """
-        from hermes_cli.models import resolve_fast_mode_overrides
-
-        assert resolve_fast_mode_overrides("claude-opus-4-7") is None
-        assert resolve_fast_mode_overrides("claude-sonnet-4-6") is None
-        assert resolve_fast_mode_overrides("claude-haiku-4-5") is None
-
-    def test_resolve_overrides_returns_service_tier_for_openai(self):
-        """OpenAI models should still get service_tier, not speed."""
-        from hermes_cli.models import resolve_fast_mode_overrides
-
-        result = resolve_fast_mode_overrides("gpt-5.4")
-        assert result == {"service_tier": "priority"}
-
-    def test_is_anthropic_fast_model(self):
-        """Fast mode is currently Opus 4.6 only — other Claude variants must be excluded."""
-        from hermes_cli.models import _is_anthropic_fast_model
-
-        # Supported: Opus 4.6 in any form
-        assert _is_anthropic_fast_model("claude-opus-4-6") is True
-        assert _is_anthropic_fast_model("claude-opus-4.6") is True
-        assert _is_anthropic_fast_model("anthropic/claude-opus-4-6") is True
-        assert _is_anthropic_fast_model("claude-opus-4.6:fast") is True
-
-        # Unsupported per Anthropic API contract — would 400 if we sent speed=fast
-        assert _is_anthropic_fast_model("claude-opus-4-7") is False
-        assert _is_anthropic_fast_model("claude-sonnet-4-6") is False
-        assert _is_anthropic_fast_model("claude-haiku-4-5") is False
-
-        # Non-Claude
-        assert _is_anthropic_fast_model("gpt-5.4") is False
-        assert _is_anthropic_fast_model("") is False
-
-    def test_fast_command_exposed_for_anthropic_model(self):
-        cli_mod = _import_cli()
-        stub = SimpleNamespace(
-            provider="anthropic", requested_provider="anthropic",
-            model="claude-opus-4-6", agent=None,
-        )
-        assert cli_mod.HermesCLI._fast_command_available(stub) is True
-
-    def test_fast_command_hidden_for_anthropic_sonnet(self):
-        """Sonnet doesn't support fast mode (Opus 4.6 only) — /fast must be hidden."""
-        cli_mod = _import_cli()
-        stub = SimpleNamespace(
-            provider="anthropic", requested_provider="anthropic",
-            model="claude-sonnet-4-6", agent=None,
-        )
-        assert cli_mod.HermesCLI._fast_command_available(stub) is False
-
-    def test_fast_command_hidden_for_anthropic_opus_47(self):
-        """Opus 4.7 doesn't support fast mode — /fast must be hidden."""
-        cli_mod = _import_cli()
-        stub = SimpleNamespace(
-            provider="anthropic", requested_provider="anthropic",
-            model="claude-opus-4-7", agent=None,
-        )
-        assert cli_mod.HermesCLI._fast_command_available(stub) is False
-
-    def test_fast_command_hidden_for_non_claude_non_openai(self):
-        """Non-Claude, non-OpenAI models should not expose /fast."""
-        cli_mod = _import_cli()
-        stub = SimpleNamespace(
-            provider="gemini", requested_provider="gemini",
-            model="gemini-3-pro-preview", agent=None,
-        )
-        assert cli_mod.HermesCLI._fast_command_available(stub) is False
-
-    def test_turn_route_injects_speed_for_anthropic(self):
-        """Anthropic models should get speed:'fast' override, not service_tier."""
-        cli_mod = _import_cli()
-        stub = SimpleNamespace(
-            model="claude-opus-4-6",
-            api_key="sk-ant-test",
-            base_url="https://api.anthropic.com",
-            provider="anthropic",
-            api_mode="anthropic_messages",
-            acp_command=None,
-            acp_args=[],
-            _credential_pool=None,
-            service_tier="priority",
-        )
-
-        route = cli_mod.HermesCLI._resolve_turn_agent_config(stub, "hi")
-
-        assert route["runtime"]["provider"] == "anthropic"
-        assert route["request_overrides"] == {"speed": "fast"}
-
-
-class TestAnthropicFastModeAdapter(unittest.TestCase):
-    """Verify build_anthropic_kwargs handles fast_mode parameter."""
-
-    def test_fast_mode_adds_speed_and_beta(self):
-        from agent.anthropic_format import build_anthropic_kwargs, _FAST_MODE_BETA
-
-        kwargs = build_anthropic_kwargs(
-            model="claude-opus-4-6",
-            messages=[{"role": "user", "content": [{"type": "text", "text": "hi"}]}],
-            tools=None,
-            max_tokens=None,
-            reasoning_config=None,
-            fast_mode=True,
-        )
-        assert kwargs.get("extra_body", {}).get("speed") == "fast"
-        assert "speed" not in kwargs
-        assert "extra_headers" in kwargs
-        assert _FAST_MODE_BETA in kwargs["extra_headers"].get("anthropic-beta", "")
-
-    def test_fast_mode_off_no_speed(self):
-        from agent.anthropic_format import build_anthropic_kwargs
-
-        kwargs = build_anthropic_kwargs(
-            model="claude-opus-4-6",
-            messages=[{"role": "user", "content": [{"type": "text", "text": "hi"}]}],
-            tools=None,
-            max_tokens=None,
-            reasoning_config=None,
-            fast_mode=False,
-        )
-        assert kwargs.get("extra_body", {}).get("speed") is None
-        assert "speed" not in kwargs
-        assert "extra_headers" not in kwargs
-
-    def test_fast_mode_skipped_for_third_party_endpoint(self):
-        from agent.anthropic_format import build_anthropic_kwargs
-
-        kwargs = build_anthropic_kwargs(
-            model="claude-opus-4-6",
-            messages=[{"role": "user", "content": [{"type": "text", "text": "hi"}]}],
-            tools=None,
-            max_tokens=None,
-            reasoning_config=None,
-            fast_mode=True,
-            base_url="https://api.minimax.io/anthropic/v1",
-        )
-        # Third-party endpoints should NOT get speed or fast-mode beta
-        assert kwargs.get("extra_body", {}).get("speed") is None
-        assert "speed" not in kwargs
-        assert "extra_headers" not in kwargs
-
-    def test_fast_mode_kwargs_are_safe_for_sdk_unpacking(self):
-        from agent.anthropic_format import build_anthropic_kwargs
-
-        kwargs = build_anthropic_kwargs(
-            model="claude-opus-4-6",
-            messages=[{"role": "user", "content": [{"type": "text", "text": "hi"}]}],
-            tools=None,
-            max_tokens=None,
-            reasoning_config=None,
-            fast_mode=True,
-        )
-        assert "speed" not in kwargs
-        assert kwargs.get("extra_body", {}).get("speed") == "fast"
@@ -1,22 +0,0 @@
-"""Anthropic-specific timeout tests moved from tests/hermes_cli/test_timeouts.py."""
-
-from __future__ import annotations
-
-
-def test_anthropic_adapter_honors_timeout_kwarg():
-    """build_anthropic_client(timeout=X) overrides the 900s default read timeout."""
-    pytest = __import__("pytest")
-    anthropic = pytest.importorskip("anthropic")  # skip if optional SDK missing
-    from hermes_agent_anthropic import build_anthropic_client
-
-    c_default = build_anthropic_client("sk-ant-dummy", None)
-    c_custom = build_anthropic_client("sk-ant-dummy", None, timeout=45.0)
-    c_invalid = build_anthropic_client("sk-ant-dummy", None, timeout=-1)
-
-    # Default stays at 900s; custom overrides; invalid falls back to default
-    assert c_default.timeout.read == 900.0
-    assert c_custom.timeout.read == 45.0
-    assert c_invalid.timeout.read == 900.0
-    # Connect timeout always stays at 10s regardless
-    assert c_default.timeout.connect == 10.0
-    assert c_custom.timeout.connect == 10.0
@@ -1,183 +0,0 @@
-"""Tests for the AnthropicMessagesTransport.
-
-Behavioral tests that require the real anthropic transport implementation.
-"""
-
-import json
-import pytest
-from types import SimpleNamespace
-
-from agent.transports import get_transport
-from agent.transports.types import NormalizedResponse
-
-
-@pytest.fixture
-def transport():
-    """Load the real Anthropic transport by registering the plugin."""
-    from hermes_agent_anthropic import register as _anthro_register
-    from agent.plugin_registries import registries
-
-    class _Ctx:
-        def register_transport(self, api_mode, obj):
-            from agent.transports import register_transport
-            register_transport(api_mode, obj)
-        def register_provider_resolver(self, name, fn):
-            registries.register_provider_resolver(name, fn)
-        def register_provider_services(self, name, services):
-            registries.register_provider_services(name, services)
-        def register_credential_pool_hook(self, name, hook):
-            registries.register_credential_pool_hook(name, hook)
-        def register_pricing_provider(self, name, entries):
-            registries.register_pricing_provider(name, entries)
-        def register_provider_overlay(self, entry):
-            registries.register_provider_overlay(entry)
-        def __getattr__(self, name):
-            if name.startswith("register_"):
-                return lambda *a, **kw: None
-            raise AttributeError(name)
-
-    _anthro_register(_Ctx())
-    return get_transport("anthropic_messages")
-
-
-
-class TestAnthropicTransportBehavioral:
-
-    # (fixture defined at module level above)
-
-    def test_api_mode(self, transport):
-        assert transport.api_mode == "anthropic_messages"
-
-    def test_convert_tools_simple(self, transport):
-        tools = [{
-            "type": "function",
-            "function": {
-                "name": "test_tool",
-                "description": "A test",
-                "parameters": {"type": "object", "properties": {}},
-            }
-        }]
-        result = transport.convert_tools(tools)
-        assert len(result) == 1
-        assert result[0]["name"] == "test_tool"
-        assert "input_schema" in result[0]
-
-    def test_validate_response_none(self, transport):
-        assert transport.validate_response(None) is False
-
-    def test_validate_response_empty_content(self, transport):
-        r = SimpleNamespace(content=[])
-        assert transport.validate_response(r) is False
-
-    def test_validate_response_empty_content_with_end_turn_is_valid(self, transport):
-        r = SimpleNamespace(content=[], stop_reason="end_turn")
-        assert transport.validate_response(r) is True
-
-    def test_validate_response_empty_content_with_tool_use_is_invalid(self, transport):
-        r = SimpleNamespace(content=[], stop_reason="tool_use")
-        assert transport.validate_response(r) is False
-
-    def test_validate_response_valid(self, transport):
-        r = SimpleNamespace(content=[SimpleNamespace(type="text", text="hello")])
-        assert transport.validate_response(r) is True
-
-    def test_map_finish_reason(self, transport):
-        assert transport.map_finish_reason("end_turn") == "stop"
-        assert transport.map_finish_reason("tool_use") == "tool_calls"
-        assert transport.map_finish_reason("max_tokens") == "length"
-        assert transport.map_finish_reason("stop_sequence") == "stop"
-        assert transport.map_finish_reason("refusal") == "content_filter"
-        assert transport.map_finish_reason("model_context_window_exceeded") == "length"
-        assert transport.map_finish_reason("unknown") == "stop"
-
-    def test_extract_cache_stats_none_usage(self, transport):
-        r = SimpleNamespace(usage=None)
-        assert transport.extract_cache_stats(r) is None
-
-    def test_extract_cache_stats_with_cache(self, transport):
-        usage = SimpleNamespace(cache_read_input_tokens=100, cache_creation_input_tokens=50)
-        r = SimpleNamespace(usage=usage)
-        result = transport.extract_cache_stats(r)
-        assert result == {"cached_tokens": 100, "creation_tokens": 50}
-
-    def test_extract_cache_stats_zero(self, transport):
-        usage = SimpleNamespace(cache_read_input_tokens=0, cache_creation_input_tokens=0)
-        r = SimpleNamespace(usage=usage)
-        assert transport.extract_cache_stats(r) is None
-
-    def test_normalize_response_text(self, transport):
-        """Test normalization of a simple text response."""
-        r = SimpleNamespace(
-            content=[SimpleNamespace(type="text", text="Hello world")],
-            stop_reason="end_turn",
-            usage=SimpleNamespace(input_tokens=10, output_tokens=5),
-            model="claude-sonnet-4-6",
-        )
-        nr = transport.normalize_response(r)
-        assert isinstance(nr, NormalizedResponse)
-        assert nr.content == "Hello world"
-        assert nr.tool_calls is None or nr.tool_calls == []
-        assert nr.finish_reason == "stop"
-
-    def test_normalize_response_tool_calls(self, transport):
-        """Test normalization of a tool-use response."""
-        r = SimpleNamespace(
-            content=[
-                SimpleNamespace(
-                    type="tool_use",
-                    id="toolu_123",
-                    name="terminal",
-                    input={"command": "ls"},
-                ),
-            ],
-            stop_reason="tool_use",
-            usage=SimpleNamespace(input_tokens=10, output_tokens=20),
-            model="claude-sonnet-4-6",
-        )
-        nr = transport.normalize_response(r)
-        assert nr.finish_reason == "tool_calls"
-        assert len(nr.tool_calls) == 1
-        tc = nr.tool_calls[0]
-        assert tc.name == "terminal"
-        assert tc.id == "toolu_123"
-        assert '"command"' in tc.arguments
-
-    def test_normalize_response_thinking(self, transport):
-        """Test normalization preserves thinking content."""
-        r = SimpleNamespace(
-            content=[
-                SimpleNamespace(type="thinking", thinking="Let me think..."),
-                SimpleNamespace(type="text", text="The answer is 42"),
-            ],
-            stop_reason="end_turn",
-            usage=SimpleNamespace(input_tokens=10, output_tokens=15),
-            model="claude-sonnet-4-6",
-        )
-        nr = transport.normalize_response(r)
-        assert nr.content == "The answer is 42"
-        assert nr.reasoning == "Let me think..."
-
-    def test_build_kwargs_returns_dict(self, transport):
-        """Test build_kwargs produces a usable kwargs dict."""
-        messages = [{"role": "user", "content": "Hello"}]
-        kw = transport.build_kwargs(
-            model="claude-sonnet-4-6",
-            messages=messages,
-            max_tokens=1024,
-        )
-        assert isinstance(kw, dict)
-        assert "model" in kw
-        assert "max_tokens" in kw
-        assert "messages" in kw
-
-    def test_convert_messages_extracts_system(self, transport):
-        """Test convert_messages separates system from messages."""
-        messages = [
-            {"role": "system", "content": "You are helpful."},
-            {"role": "user", "content": "Hi"},
-        ]
-        system, msgs = transport.convert_messages(messages)
-        # System should be extracted
-        assert system is not None
-        # Messages should only have user
-        assert len(msgs) >= 1
@@ -11,9 +11,3 @@ arcee = ProviderProfile(
 )

 register_provider(arcee)
-
-
-def register(ctx):
-    """No-op — this provider has no workspace package yet."""
-    pass
-
@@ -19,9 +19,3 @@ azure_foundry = ProviderProfile(
 )

 register_provider(azure_foundry)
-
-
-def register(ctx):
-    """Plugin entry point — delegates to the inner hermes_agent_azure package."""
-    from hermes_agent_azure import register as _inner_register
-    _inner_register(ctx)
@@ -1,57 +0,0 @@
-"""hermes-agent-azure: Microsoft Entra ID / Azure Identity adapter for Hermes Agent."""
-
-from hermes_agent_azure.adapter import (  # noqa: F401
-    SCOPE_AI_AZURE_DEFAULT,
-    EntraIdentityConfig,
-    _build_default_credential,
-    _require_azure_identity,
-    build_bearer_http_client,
-    build_credential,
-    build_token_provider,
-    describe_active_credential,
-    has_azure_identity_credentials,
-    has_azure_identity_installed,
-    is_token_provider,
-    materialize_bearer_for_http,
-    reset_credential_cache,
-)
-
-
-def register(ctx):
-    """Entry point for the hermes_agent.plugins entry point group."""
-    from hermes_agent_azure import adapter
-
-    ctx.register_provider_services("azure", {
-        # Auth / credentials
-        "is_token_provider": adapter.is_token_provider,
-        "has_azure_identity_credentials": adapter.has_azure_identity_credentials,
-        "has_azure_identity_installed": adapter.has_azure_identity_installed,
-        # Client building
-        "build_bearer_http_client": adapter.build_bearer_http_client,
-        "build_credential": adapter.build_credential,
-        "build_token_provider": adapter.build_token_provider,
-        "materialize_bearer_for_http": adapter.materialize_bearer_for_http,
-        "reset_credential_cache": adapter.reset_credential_cache,
-        # Constants / config
-        "SCOPE_AI_AZURE_DEFAULT": adapter.SCOPE_AI_AZURE_DEFAULT,
-        "EntraIdentityConfig": adapter.EntraIdentityConfig,
-        # Internal helpers
-        "_build_default_credential": adapter._build_default_credential,
-        "_require_azure_identity": adapter._require_azure_identity,
-        "describe_active_credential": adapter.describe_active_credential,
-    })
-
-    # Register the provider resolver — core dispatches to this instead of
-    # having a per-azure-foundry if/elif branch in resolve_provider_client().
-    from hermes_agent_azure.resolve import resolve_auxiliary_client as _azure_resolver
-    ctx.register_provider_resolver("azure-foundry", _azure_resolver)
-
-    # Register the provider overlay — core merges this into HERMES_OVERLAYS
-    from agent.plugin_registries import ProviderOverlayEntry
-    ctx.register_provider_overlay(ProviderOverlayEntry(
-        provider_name="azure-foundry",
-        transport="openai_chat",  # default; overridden by api_mode in config
-        base_url_env_var="AZURE_FOUNDRY_BASE_URL",
-        display_name="Azure AI Foundry",
-        aliases=[],
-    ))
@@ -1,131 +0,0 @@
-"""Azure Foundry provider resolver for auxiliary client construction.
-
-Handles ALL provider-specific logic for building auxiliary clients:
-Entra ID auth, static API key, base URL resolution, api_mode routing
-(chat_completions, codex_responses, anthropic_messages).
-"""
-
-from __future__ import annotations
-
-import logging
-from typing import Any, Optional
-from urllib.parse import parse_qs, urlparse, urlunparse
-
-logger = logging.getLogger(__name__)
-
-
-def _extract_url_query_params(url: str):
-    """Extract query params from URL, return (clean_url, default_query dict or None)."""
-    parsed = urlparse(url)
-    if parsed.query:
-        clean = urlunparse(parsed._replace(query=""))
-        params = {k: v[0] for k, v in parse_qs(parsed.query).items()}
-        return clean, params
-    return url, None
-
-
-def _normalize_resolved_model(model: str, provider: str) -> str:
-    """Normalize model name for a given provider."""
-    return str(model or "").strip()
-
-
-def resolve_auxiliary_client(
-    *,
-    model: str | None = None,
-    explicit_api_key: str | None = None,
-    explicit_base_url: str | None = None,
-    async_mode: bool = False,
-    is_vision: bool = False,
-    main_runtime: dict | None = None,
-    api_mode: str | None = None,
-) -> tuple[Any, str] | tuple[None, None]:
-    """Resolve an Azure Foundry auxiliary client via the runtime resolver.
-
-    Mirrors the anthropic/bedrock resolver shape but delegates to
-    ``hermes_cli.runtime_provider._resolve_azure_foundry_runtime`` —
-    the same resolver the main agent uses — so:
-
-    * ``auth_mode: api_key`` (default) gets the static
-      ``AZURE_FOUNDRY_API_KEY`` string.
-    * ``auth_mode: entra_id`` gets a callable bearer-token provider
-      (``Callable[[], str]`` from the azure identity adapter).
-    * Per-model ``api_mode`` auto-routing for GPT-5.x / o-series /
-      codex models works.
-    * ``model.entra.{tenant_id,client_id,authority,scope}`` config
-      fields propagate.
-    * Non-default ``model.base_url`` overrides are honored.
-
-    Returns ``(client, model)`` or ``(None, None)`` on failure.
-    """
-    from openai import OpenAI
-
-    try:
-        from hermes_cli.runtime_provider import _resolve_azure_foundry_runtime
-        from hermes_cli.auth import AuthError
-        from hermes_cli.config import load_config
-    except ImportError:
-        return None, None
-
-    try:
-        cfg = load_config()
-        model_cfg = cfg.get("model") if isinstance(cfg, dict) else {}
-        if not isinstance(model_cfg, dict):
-            model_cfg = {}
-    except Exception:
-        model_cfg = {}
-
-    try:
-        runtime = _resolve_azure_foundry_runtime(
-            requested_provider="azure-foundry",
-            model_cfg=model_cfg,
-            explicit_api_key=explicit_api_key,
-            explicit_base_url=explicit_base_url,
-            target_model=model,
-        )
-    except AuthError as exc:
-        logger.debug("Auxiliary azure-foundry: %s", exc)
-        return None, None
-    except Exception as exc:
-        logger.debug("Auxiliary azure-foundry runtime error: %s", exc)
-        return None, None
-
-    api_key = runtime.get("api_key")
-    base_url = str(runtime.get("base_url", "") or "")
-    runtime_api_mode = api_mode or runtime.get("api_mode") or "chat_completions"
-
-    _has_key = bool(api_key) if not callable(api_key) else True
-    if not _has_key or not base_url:
-        return None, None
-
-    final_model = _normalize_resolved_model(
-        model or str(model_cfg.get("default") or ""),
-        "azure-foundry",
-    )
-    if not final_model:
-        logger.debug(
-            "Auxiliary azure-foundry: no model resolved (model=%r, default=%r)",
-            model, model_cfg.get("default"),
-        )
-        return None, None
-
-    extra: dict[str, Any] = {}
-    _clean_base, _dq = _extract_url_query_params(base_url)
-    if _dq:
-        extra["default_query"] = _dq
-
-    client = OpenAI(api_key=api_key, base_url=_clean_base, **extra)
-
-    if runtime_api_mode == "codex_responses":
-        from agent.auxiliary_client import CodexAuxiliaryClient
-        return CodexAuxiliaryClient(client, final_model), final_model
-
-    if runtime_api_mode == "anthropic_messages":
-        from agent.plugin_registries import registries
-        maybe_wrap = registries.get_provider_service("anthropic", "maybe_wrap_anthropic")
-        if maybe_wrap is not None:
-            return maybe_wrap(
-                client, final_model, api_key,
-                base_url, runtime_api_mode,
-            ), final_model
-
-    return client, final_model
@@ -1,19 +0,0 @@
-[build-system]
-requires = ["setuptools>=61.0"]
-build-backend = "setuptools.build_meta"
-
-[project]
-name = "hermes-agent-azure"
-version = "0.1.0"
-description = "Microsoft Entra ID / Azure Identity adapter for Hermes Agent"
-requires-python = ">=3.11"
-dependencies = [
-    "hermes-agent",
-    "azure-identity==1.25.3",
-]
-
-[project.entry-points."hermes_agent.plugins"]
-azure = "hermes_agent_azure:register"
-
-[tool.setuptools.packages.find]
-include = ["hermes_agent_azure*"]
@@ -1,71 +0,0 @@
-"""Shared fixtures for azure-foundry plugin tests.
-
-Registers the azure plugin in the singleton registry before each test.
-"""
-import pytest
-
-
-class _FullCtx:
-    """Plugin context that wires up all registry hooks."""
-
-    def register_provider_services(self, name, services):
-        from agent.plugin_registries import registries
-        registries.register_provider_services(name, services)
-
-    def register_provider_resolver(self, name, resolver):
-        from agent.plugin_registries import registries
-        registries.register_provider_resolver(name, resolver)
-
-    def register_credential_pool_hook(self, name, hook):
-        from agent.plugin_registries import registries
-        registries.register_credential_pool_hook(name, hook)
-
-    def register_transport(self, api_mode, transport_cls):
-        from agent.plugin_registries import registries
-        registries._transports[api_mode] = transport_cls
-
-    def register_pricing_provider(self, name, entries):
-        from agent.plugin_registries import registries
-        registries.register_pricing_provider(name, entries)
-
-    def register_provider_overlay(self, entry):
-        from agent.plugin_registries import registries
-        registries.register_provider_overlay(entry)
-
-    def __getattr__(self, name):
-        if name.startswith("register_"):
-            return lambda *a, **kw: None
-        raise AttributeError(name)
-
-
-@pytest.fixture(autouse=True)
-def _register_azure_plugin():
-    """Register the real azure plugin for the duration of each test."""
-    from agent.plugin_registries import registries
-
-    _prev_services = dict(registries._provider_services)
-    _prev_resolvers = dict(registries._provider_resolvers)
-    _prev_cph = dict(registries._credential_pool_hooks)
-
-    ctx = _FullCtx()
-    try:
-        from hermes_agent_azure import register as _reg
-        _reg(ctx)
-    except ImportError:
-        pass
-    # azure-foundry tests for Anthropic Messages mode need the anthropic plugin too
-    try:
-        from hermes_agent_anthropic import register as _anthro_reg
-        _anthro_reg(ctx)
-    except ImportError:
-        pass
-
-    yield
-
-    for d, prev in [
-        (registries._provider_services, _prev_services),
-        (registries._provider_resolvers, _prev_resolvers),
-        (registries._credential_pool_hooks, _prev_cph),
-    ]:
-        d.clear()
-        d.update(prev)
@@ -27,9 +27,3 @@ bedrock = BedrockProfile(
 )

 register_provider(bedrock)
-
-
-def register(ctx):
-    """Plugin entry point — delegates to the inner hermes_agent_bedrock package."""
-    from hermes_agent_bedrock import register as _inner_register
-    _inner_register(ctx)
@@ -1,125 +0,0 @@
-"""hermes-agent-bedrock: AWS Bedrock Converse API adapter for Hermes Agent."""
-
-from hermes_agent_bedrock.adapter import (  # noqa: F401
-    BEDROCK_DEFAULT_CONTEXT_LENGTH,
-    CONTEXT_OVERFLOW_PATTERNS,
-    OVERLOAD_PATTERNS,
-    THROTTLE_PATTERNS,
-    _AWS_CREDENTIAL_ENV_VARS,
-    _DISCOVERY_CACHE_TTL_SECONDS,
-    _NON_TOOL_CALLING_PATTERNS,
-    _STALE_LIB_MODULE_PREFIXES,
-    _convert_content_to_converse,
-    _converse_stop_reason_to_openai,
-    _extract_provider_from_arn,
-    _get_bedrock_control_client,
-    _get_bedrock_runtime_client,
-    _model_supports_tool_use,
-    _require_boto3,
-    _traceback_frames_modules,
-    bedrock_model_ids_or_none,
-    build_converse_kwargs,
-    call_converse,
-    call_converse_stream,
-    classify_bedrock_error,
-    convert_messages_to_converse,
-    convert_tools_to_converse,
-    discover_bedrock_models,
-    get_bedrock_context_length,
-    get_bedrock_model_ids,
-    has_aws_credentials,
-    invalidate_runtime_client,
-    is_anthropic_bedrock_model,
-    is_context_overflow_error,
-    is_stale_connection_error,
-    normalize_converse_response,
-    normalize_converse_stream_events,
-    reset_client_cache,
-    reset_discovery_cache,
-    resolve_aws_auth_env_var,
-    resolve_bedrock_region,
-    stream_converse_with_callbacks,
-)
-
-
-def register(ctx):
-    """Entry point for the hermes_agent.plugins entry point group."""
-    from hermes_agent_bedrock import adapter
-
-    ctx.register_provider_services("bedrock", {
-        # Auth / credentials
-        "has_aws_credentials": adapter.has_aws_credentials,
-        "resolve_aws_auth_env_var": adapter.resolve_aws_auth_env_var,
-        "resolve_bedrock_region": adapter.resolve_bedrock_region,
-        "_AWS_CREDENTIAL_ENV_VARS": adapter._AWS_CREDENTIAL_ENV_VARS,
-        # Transport
-        "build_converse_kwargs": adapter.build_converse_kwargs,
-        "convert_messages_to_converse": adapter.convert_messages_to_converse,
-        "convert_tools_to_converse": adapter.convert_tools_to_converse,
-        "normalize_converse_response": adapter.normalize_converse_response,
-        "normalize_converse_stream_events": adapter.normalize_converse_stream_events,
-        "call_converse": adapter.call_converse,
-        "call_converse_stream": adapter.call_converse_stream,
-        "stream_converse_with_callbacks": adapter.stream_converse_with_callbacks,
-        # Model metadata
-        "bedrock_model_ids_or_none": adapter.bedrock_model_ids_or_none,
-        "discover_bedrock_models": adapter.discover_bedrock_models,
-        "get_bedrock_context_length": adapter.get_bedrock_context_length,
-        "get_bedrock_model_ids": adapter.get_bedrock_model_ids,
-        "BEDROCK_DEFAULT_CONTEXT_LENGTH": adapter.BEDROCK_DEFAULT_CONTEXT_LENGTH,
-        # Client management
-        "_get_bedrock_control_client": adapter._get_bedrock_control_client,
-        "_get_bedrock_runtime_client": adapter._get_bedrock_runtime_client,
-        "invalidate_runtime_client": adapter.invalidate_runtime_client,
-        "reset_client_cache": adapter.reset_client_cache,
-        "reset_discovery_cache": adapter.reset_discovery_cache,
-        # Error handling
-        "classify_bedrock_error": adapter.classify_bedrock_error,
-        "is_context_overflow_error": adapter.is_context_overflow_error,
-        "is_stale_connection_error": adapter.is_stale_connection_error,
-        "CONTEXT_OVERFLOW_PATTERNS": adapter.CONTEXT_OVERFLOW_PATTERNS,
-        "OVERLOAD_PATTERNS": adapter.OVERLOAD_PATTERNS,
-        "THROTTLE_PATTERNS": adapter.THROTTLE_PATTERNS,
-        "_NON_TOOL_CALLING_PATTERNS": adapter._NON_TOOL_CALLING_PATTERNS,
-        "_STALE_LIB_MODULE_PREFIXES": adapter._STALE_LIB_MODULE_PREFIXES,
-        "_DISCOVERY_CACHE_TTL_SECONDS": adapter._DISCOVERY_CACHE_TTL_SECONDS,
-        # Internal helpers
-        "_require_boto3": adapter._require_boto3,
-        "_model_supports_tool_use": adapter._model_supports_tool_use,
-        "is_anthropic_bedrock_model": adapter.is_anthropic_bedrock_model,
-        "_convert_content_to_converse": adapter._convert_content_to_converse,
-        "_converse_stop_reason_to_openai": adapter._converse_stop_reason_to_openai,
-        "_extract_provider_from_arn": adapter._extract_provider_from_arn,
-        "_traceback_frames_modules": adapter._traceback_frames_modules,
-    })
-
-    # Register the provider resolver — core dispatches to this instead of
-    # having per-bedrock if/elif branches in resolve_provider_client().
-    from hermes_agent_bedrock.resolve import resolve_auxiliary_client as _bedrock_resolver
-    ctx.register_provider_resolver("bedrock", _bedrock_resolver)
-
-    # Register the bedrock transport so core doesn't need to import it.
-    from hermes_agent_bedrock.transport import BedrockTransport
-    ctx.register_transport("bedrock_converse", BedrockTransport)
-
-    # Register pricing entries — core looks these up via the registry
-    # instead of hardcoding them in _OFFICIAL_DOCS_PRICING.
-    from hermes_agent_bedrock.pricing import (
-        get_bedrock_pricing_entries,
-        BEDROCK_PRICING_KEYS,
-    )
-    _entries = get_bedrock_pricing_entries()
-    _keyed = []
-    for (prov, model), entry in zip(BEDROCK_PRICING_KEYS, _entries):
-        _keyed.append((prov, model, entry))
-    ctx.register_pricing_provider("bedrock", _keyed)
-
-    # Register the provider overlay — core merges this into HERMES_OVERLAYS
-    from agent.plugin_registries import ProviderOverlayEntry
-    ctx.register_provider_overlay(ProviderOverlayEntry(
-        provider_name="bedrock",
-        transport="bedrock_converse",
-        auth_type="aws_sdk",
-        display_name="AWS Bedrock",
-        aliases=["aws", "aws-bedrock", "amazon-bedrock", "amazon"],
-    ))
@@ -1,80 +0,0 @@
-"""Bedrock model pricing data.
-
-Official docs snapshot entries for AWS Bedrock models.
-Source: https://aws.amazon.com/bedrock/pricing/
-"""
-
-from __future__ import annotations
-
-from decimal import Decimal
-
-
-def get_bedrock_pricing_entries() -> list:
-    """Return official docs pricing entries for Bedrock models."""
-    from agent.usage_pricing import PricingEntry
-
-    _BEDROCK_PRICING_URL = "https://aws.amazon.com/bedrock/pricing/"
-    _BEDROCK_PRICING_VER = "bedrock-pricing-2026-04"
-
-    return [
-        PricingEntry(
-            input_cost_per_million=Decimal("15.00"),
-            output_cost_per_million=Decimal("75.00"),
-            source="official_docs_snapshot",
-            source_url=_BEDROCK_PRICING_URL,
-            pricing_version=_BEDROCK_PRICING_VER,
-        ),  # ("bedrock", "anthropic.claude-opus-4-6")
-        PricingEntry(
-            input_cost_per_million=Decimal("3.00"),
-            output_cost_per_million=Decimal("15.00"),
-            source="official_docs_snapshot",
-            source_url=_BEDROCK_PRICING_URL,
-            pricing_version=_BEDROCK_PRICING_VER,
-        ),  # ("bedrock", "anthropic.claude-sonnet-4-6")
-        PricingEntry(
-            input_cost_per_million=Decimal("3.00"),
-            output_cost_per_million=Decimal("15.00"),
-            source="official_docs_snapshot",
-            source_url=_BEDROCK_PRICING_URL,
-            pricing_version=_BEDROCK_PRICING_VER,
-        ),  # ("bedrock", "anthropic.claude-sonnet-4-5")
-        PricingEntry(
-            input_cost_per_million=Decimal("0.80"),
-            output_cost_per_million=Decimal("4.00"),
-            source="official_docs_snapshot",
-            source_url=_BEDROCK_PRICING_URL,
-            pricing_version=_BEDROCK_PRICING_VER,
-        ),  # ("bedrock", "anthropic.claude-haiku-4-5")
-        PricingEntry(
-            input_cost_per_million=Decimal("0.80"),
-            output_cost_per_million=Decimal("3.20"),
-            source="official_docs_snapshot",
-            source_url=_BEDROCK_PRICING_URL,
-            pricing_version=_BEDROCK_PRICING_VER,
-        ),  # ("bedrock", "amazon.nova-pro")
-        PricingEntry(
-            input_cost_per_million=Decimal("0.06"),
-            output_cost_per_million=Decimal("0.24"),
-            source="official_docs_snapshot",
-            source_url=_BEDROCK_PRICING_URL,
-            pricing_version=_BEDROCK_PRICING_VER,
-        ),  # ("bedrock", "amazon.nova-lite")
-        PricingEntry(
-            input_cost_per_million=Decimal("0.035"),
-            output_cost_per_million=Decimal("0.14"),
-            source="official_docs_snapshot",
-            source_url=_BEDROCK_PRICING_URL,
-            pricing_version=_BEDROCK_PRICING_VER,
-        ),  # ("bedrock", "amazon.nova-micro")
-    ]
-
-
-BEDROCK_PRICING_KEYS = [
-    ("bedrock", "anthropic.claude-opus-4-6"),
-    ("bedrock", "anthropic.claude-sonnet-4-6"),
-    ("bedrock", "anthropic.claude-sonnet-4-5"),
-    ("bedrock", "anthropic.claude-haiku-4-5"),
-    ("bedrock", "amazon.nova-pro"),
-    ("bedrock", "amazon.nova-lite"),
-    ("bedrock", "amazon.nova-micro"),
-]
@@ -1,66 +0,0 @@
-"""Bedrock provider resolver for auxiliary client construction.
-
-Handles ALL provider-specific logic for building auxiliary clients:
-AWS credential detection, region resolution, and Bedrock client construction.
-"""
-
-from __future__ import annotations
-
-import logging
-from typing import Any, Optional
-
-logger = logging.getLogger(__name__)
-
-
-def resolve_auxiliary_client(
-    *,
-    model: str | None = None,
-    explicit_api_key: str | None = None,
-    explicit_base_url: str | None = None,
-    async_mode: bool = False,
-    is_vision: bool = False,
-    main_runtime: dict | None = None,
-    api_mode: str | None = None,
-) -> tuple[Any, str] | tuple[None, None]:
-    """Resolve an auxiliary client for the Bedrock provider.
-
-    Returns ``(client, default_model)`` or ``(None, None)`` if unavailable.
-    """
-    from agent.plugin_registries import registries
-    from agent.anthropic_aux import (
-        AnthropicAuxiliaryClient,
-        AsyncAnthropicAuxiliaryClient,
-    )
-
-    _bedrock = registries.get_provider_namespace("bedrock")
-    _anthropic = registries.get_provider_namespace("anthropic")
-    has_aws_credentials = _bedrock.get("has_aws_credentials")
-    resolve_bedrock_region = _bedrock.get("resolve_bedrock_region")
-    build_anthropic_bedrock_client = _anthropic.get("build_anthropic_bedrock_client")
-    if has_aws_credentials is None or resolve_bedrock_region is None or build_anthropic_bedrock_client is None:
-        return None, None
-
-    if not has_aws_credentials():
-        logger.debug("resolve_provider_client: bedrock requested but "
-                     "no AWS credentials found")
-        return None, None
-
-    region = resolve_bedrock_region()
-    default_model = "anthropic.claude-haiku-4-5-20251001-v1:0"
-    final_model = model or default_model
-    try:
-        real_client = build_anthropic_bedrock_client(region)
-    except ImportError as exc:
-        logger.warning("resolve_provider_client: cannot create Bedrock "
-                       "client: %s", exc)
-        return None, None
-    client = AnthropicAuxiliaryClient(
-        real_client, final_model, api_key="aws-sdk",
-        base_url=f"https://bedrock-runtime.{region}.amazonaws.com",
-    )
-    logger.debug("resolve_provider_client: bedrock (%s, %s)", final_model, region)
-
-    if async_mode:
-        client = AsyncAnthropicAuxiliaryClient(client)
-
-    return client, final_model
@@ -1,19 +0,0 @@
-[build-system]
-requires = ["setuptools>=61.0"]
-build-backend = "setuptools.build_meta"
-
-[project]
-name = "hermes-agent-bedrock"
-version = "0.1.0"
-description = "AWS Bedrock Converse API adapter for Hermes Agent"
-requires-python = ">=3.11"
-dependencies = [
-    "hermes-agent",
-    "boto3==1.42.89",
-]
-
-[project.entry-points."hermes_agent.plugins"]
-bedrock = "hermes_agent_bedrock:register"
-
-[tool.setuptools.packages.find]
-include = ["hermes_agent_bedrock*"]