Compare commits

24 Commits

Author SHA1 Message Date
Teknium 3b6347af15 feat(kanban): default_assignee fallback + per-profile concurrency cap (#27145, #21582) (#34244)
Two related dispatcher behaviors that have been missing for a while.

## kanban.default_assignee (#27145)

Reporter (@agarzon): dashboard creates a task without an assignee, task
parks in 'ready' forever even though the operator's intent ('default')
is perfectly clear. The dispatcher already had a 'skipped_unassigned'
bucket but no fallback routing — users had to manually type 'default'
in the assignee field every time.

Behavior: when 'kanban.default_assignee' is set in config.yaml, the
dispatcher applies that assignee to any unassigned ready task before
deciding whether to spawn. The row is mutated (assignee column + an
'assigned' event with source='kanban.default_assignee' for the audit
trail). Empty/whitespace config value = no fallback, preserving the
existing skipped_unassigned behavior.

Dry-run mode reports what WOULD happen via the new
'auto_assigned_default' bucket on DispatchResult, but does NOT mutate
the DB — operators using 'hermes kanban dispatch --dry-run' see the
routing decision before committing.
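
The fallback and dry-run rules above can be sketched as follows. This is a minimal illustration, not the dispatcher's real code: `Task`, `route_unassigned`, and `normalize_default_assignee` are hypothetical names, the in-memory `assignee` write stands in for the DB row mutation, and the element type of the buckets (task ids) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class Task:
    task_id: str
    assignee: Optional[str] = None


@dataclass
class DispatchResult:
    # Bucket names come from the commit message; storing task ids is assumed.
    skipped_unassigned: List[str] = field(default_factory=list)
    auto_assigned_default: List[str] = field(default_factory=list)


def normalize_default_assignee(value: object) -> Optional[str]:
    """Treat empty or whitespace-only config values as 'no fallback'."""
    if not isinstance(value, str):
        return None
    return value.strip() or None


def route_unassigned(
    tasks: Iterable[Task], default_assignee: object, dry_run: bool = False
) -> DispatchResult:
    result = DispatchResult()
    fallback = normalize_default_assignee(default_assignee)
    for task in tasks:
        if task.assignee:
            continue  # explicit assignees are left untouched
        if fallback is None:
            result.skipped_unassigned.append(task.task_id)
            continue
        result.auto_assigned_default.append(task.task_id)
        if not dry_run:
            task.assignee = fallback  # stand-in for the DB mutation
    return result
```

In this sketch a dry run fills `auto_assigned_default` exactly as a real tick would, but leaves every task's `assignee` unchanged.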

## kanban.max_in_progress_per_profile (#21582)

Reporters (@edwardchenchen, @simlu, 4 reactions): fan-out workloads
saturate one profile's local model / API quota / browser pool while
other profiles sit idle. The existing global 'max_in_progress' caps
total workers but doesn't balance across profiles.

Behavior: when 'kanban.max_in_progress_per_profile' is set to a
positive int, the dispatcher tracks per-assignee running counts (one
query at tick start) and refuses to spawn for any assignee already at
the cap. Tasks blocked this way go to a new
'skipped_per_profile_capped' bucket on DispatchResult as
(task_id, assignee, current_running_count) tuples — NOT an
operator-actionable failure, just 'try again next tick when the
profile has capacity'.

Pre-existing 'running' tasks count against the cap (verified via
regression test). The cap respects dry_run mode by incrementing
its in-memory counter on each would-be spawn so dry_run reports
the same balanced subset that a real tick would.

Invalid cap values (0, negative, non-int, None) are treated as 'no
cap', preserving the existing behavior. Backward-compatible for
installs that don't set the config.
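
A rough sketch of the cap logic described above, assuming ready tasks arrive as `(task_id, assignee)` pairs and the per-assignee running counts were already fetched at tick start. The function names are hypothetical, and rejecting `bool` as a cap value is an assumption the commit message does not address.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def normalize_cap(value: object) -> Optional[int]:
    """Return a positive int cap, or None ('no cap') for invalid values."""
    # Assumption: bool is rejected even though it is an int subclass.
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        return None
    return value


def apply_per_profile_cap(
    ready: Iterable[Tuple[str, str]],
    running_counts: Dict[str, int],
    cap: object,
) -> Tuple[List[str], List[Tuple[str, str, int]]]:
    limit = normalize_cap(cap)
    counts = dict(running_counts)  # in-memory copy; never written back
    to_spawn: List[str] = []
    capped: List[Tuple[str, str, int]] = []
    for task_id, assignee in ready:
        current = counts.get(assignee, 0)
        if limit is not None and current >= limit:
            # (task_id, assignee, current_running_count), as in the commit text
            capped.append((task_id, assignee, current))
            continue
        # Incrementing on each would-be spawn keeps dry-run and real ticks
        # in agreement about which subset gets dispatched.
        counts[assignee] = current + 1
        to_spawn.append(task_id)
    return to_spawn, capped
```

Because the counter is bumped for every planned spawn rather than only for real ones, the same function can serve both dry-run and live ticks in this sketch.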

## Surfaces

- 'hermes kanban dispatch' CLI now prints 'Auto-assigned to
  kanban.default_assignee=X: ...' and 'Deferred (X at per-profile cap,
  N running): ...' lines, plus matching JSON keys in --json output.
- Gateway dispatcher logs the configured values at startup
  ('default_assignee=X', 'max_in_progress_per_profile=N').
- 'kanban.max_in_progress_per_profile' added to DEFAULT_CONFIG with
  inline docs.

## Validation

- tests/hermes_cli/test_kanban_default_assignee.py (6 cases): no-cap
  baseline, auto-assign + DB mutation, dry-run reports without
  mutating, whitespace treated as None, explicit assignees untouched,
  DispatchResult field schema.
- tests/hermes_cli/test_kanban_per_profile_cap.py (9 cases including
  4 parametrized): no-cap baseline, balanced 2-profile fan-out,
  pre-existing running counts against cap, invalid cap values
  (0/-1/'abc'/None), capped tasks dispatched on next tick after
  running task completes, DispatchResult field schema.
- Broader kanban suite: 464/464 pass (was 449 baseline; +15 new
  regression tests across both features).

## Credit

#27145 — Jimmy Johansson reported the dispatcher skipped-unassigned
gap; @agarzon scoped the simpler 'honor kanban.default_assignee' fix
that matches the existing config knob.
#21582 — @edwardchenchen filed the per-profile cap ask after hitting
model 429s on fan-out research projects; @simlu confirmed the same
pain on local-model setups.
2026-05-28 19:02:55 -07:00
Ben 42612aa350 docs(docker): refresh user-guide page for s6-overlay reality
The page was last meaningfully rewritten in the pre-s6 (tini) era and had
drifted on five points that no longer matched the image:

1. "Running the dashboard" claimed the entrypoint backgrounds
   `hermes dashboard` and prefixes its output with `[dashboard]`. That
   was the pre-s6 entrypoint.sh path; under s6 the dashboard is a
   supervised s6-rc service (`docker/s6-rc.d/dashboard/run`) with no
   sed-prefix pipeline. Rewrote the section accordingly.

2. The default for `HERMES_DASHBOARD_HOST` was documented as
   `127.0.0.1`. The s6 run script defaults it to `0.0.0.0`
   (`dash_host="${HERMES_DASHBOARD_HOST:-0.0.0.0}"`). Fixed the table
   and the surrounding prose.

3. Multi-profile was documented as "not recommended in Docker — run
   one container per profile." That advice was load-bearing when
   there was no in-container supervisor, but the s6 architecture
   explicitly adds per-profile gateway supervision: each profile
   created via `hermes profile create <name>` gets a slot under
   `/run/service/gateway-<name>/`, the `02-reconcile-profiles`
   cont-init script restores them across `docker restart` from
   `gateway_state.json`, and `hermes gateway start/stop/restart` is
   intercepted by `_dispatch_via_service_manager_if_s6` to route
   through `s6-svc`. Pivoted the section to "one container, many
   supervised profile gateways" as the default, with a comparison
   table and a "When you DO want a separate container" escape
   hatch for the genuine resource-isolation / network-segmentation
   cases.

4. The Compose example trailer also claimed `[dashboard]` log
   prefixing. Replaced with the actual log routing.

5. Added a new "Where the logs go" section covering all four log
   surfaces: per-profile gateways (tee'd to `docker logs` AND
   `${HERMES_HOME}/logs/gateways/<profile>/current` since PR
   b34532319), dashboard (`docker logs`, no prefix), boot reconciler
   (`container-boot.log`), and `hermes logs`. The gateway-mode and
   Compose sections cross-reference this rather than each carrying
   their own routing prose.

Added a new "docker exec automatically drops to the hermes user"
subsection under "What the Dockerfile does", next to the existing
Privilege model warning. Documents the `/opt/hermes/bin/hermes` shim
(landed via the docker-exec privilege-drop work) — operators don't
need to remember `--user hermes` for `docker exec hermes login`,
`docker exec hermes profile create …`, etc. The historical footgun
(`auth.json` written as `root:root`, supervised gateway then can't
read its own auth file) is mentioned only as context for what the
fail-loud `exit 126` is protecting against, not as a problem the
reader needs to solve. The `HERMES_DOCKER_EXEC_AS_ROOT=1` opt-out is
documented for diagnostic sessions.

The "Permission denied" troubleshooting subsection now carries a
single-line pointer to the new section instead of duplicating it.

The `--insecure` framing reflects commit fb5125362 (opt-in via
`HERMES_DASHBOARD_INSECURE`, not derived from bind host): the OAuth
gate is the authority, the bind host alone never implies
`--insecure`, and opting out is an explicit security trade-off.
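
As a rough Python rendering of that rule (the real logic lives in the s6 run script, which is shell): the host default and the truthy list are taken from the commit messages in this range, while `dashboard_args` and the `--host` flag are hypothetical and used only for illustration.

```python
from typing import Dict, List

# Truthy spellings as listed in the fb51253620 commit message.
TRUTHY = {"1", "true", "TRUE", "True", "yes", "YES", "Yes"}


def dashboard_args(env: Dict[str, str]) -> List[str]:
    # Mirrors ${HERMES_DASHBOARD_HOST:-0.0.0.0}: unset or empty falls back.
    host = env.get("HERMES_DASHBOARD_HOST") or "0.0.0.0"
    args = ["hermes", "dashboard", "--host", host]  # --host is assumed
    # The bind host is never consulted here; only the explicit opt-in counts.
    if env.get("HERMES_DASHBOARD_INSECURE", "") in TRUTHY:
        args.append("--insecure")
    return args
```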

Anchors verified to resolve. i18n zh-Hans mirror left for the
translation flow to catch up.
2026-05-29 11:55:01 +10:00
Ben 3c6e70aef1 docs(docker): document new persist-across-processes contract and orphan reaper (#20561)
Updates the Docker Backend section of the user-guide configuration page
to match the actual behavior shipped in PR #33645. Pre-PR the docs
claimed "container is stopped and removed on shutdown," which was
never quite true for the documented happy path and is now actively
wrong: in default mode the container survives across Hermes processes
so background processes (npm watchers, dev servers, long-running
pytest) carry over the way the "ONE long-lived container shared
across sessions" promise requires.

Changes to `website/docs/user-guide/configuration.md`:

* Reworked the intro paragraph at the top of the Docker Backend
  section to describe the actual cross-process reuse contract.
* Expanded the YAML example with the new keys
  `docker_persist_across_processes` and `docker_orphan_reaper`, plus
  the pre-existing-but-undocumented `docker_env`, `timeout`, and
  `lifetime_seconds`.  Clarified the `container_persistent` comment
  to disambiguate from `docker_persist_across_processes`.
* Added a `docker_env` vs `docker_forward_env` explainer (one
  injects literal KEY=value, the other forwards values from the
  host/.env — easy to confuse).
* Replaced the one-line "Container lifecycle" paragraph with a full
  subsection covering:
    - the three labels Hermes tags every container with
      (hermes-agent, hermes-task-id, hermes-profile)
    - the label-probe reuse mechanism on startup
    - a teardown-trigger table with four rows for every situation
      that destroys the container in default mode
    - edge cases (OOM kill, profile switching)
* Added an "Environment variable overrides" table covering all
  TERMINAL_* env vars relevant to the Docker backend, including the
  previously-undocumented `TERMINAL_DOCKER_ENV` and
  `HERMES_DOCKER_BINARY`.

Changes to `website/docs/user-guide/docker.md`:

* Extended the cross-link admonition (around l.227) so the
  Hermes-in-Docker page points at the new terminal-backend keys
  (`docker_env`, `docker_persist_across_processes`,
  `docker_orphan_reaper`) alongside the ones already mentioned.

No code changes.  Behavior already covered by tests added in earlier
commits on this branch (#33645 commits 1-5).

Refs #20561
2026-05-29 11:49:54 +10:00
Ben 2f0f03c40d fix(docker): cleanup_vm() default honors persist mode (don't kill container on session close)
Commit 4 (5c2170a7c6) made cleanup_vm() default to force_remove=True, which was wrong:
cleanup_vm() is called from AIAgent.close() (TUI session close at
tui_gateway/server.py:2991, gateway session teardown at gateway/run.py:3569)
and from per-turn cleanup (agent/chat_completion_helpers.py:1517). All
three are session-lifecycle events that should honor persist mode, not
explicit user-initiated teardown.

Ben reported the symptom: container shared between multiple TUI sessions
(good) but killed as soon as any session closed (bad). With force_remove=True
as the default, every `session.close` JSON-RPC tore down the container.

The fix is to flip cleanup_vm()'s force_remove default back to False.
The kwarg still exists for future explicit-teardown paths (`/reset`-style
flows, "destroy my sandbox" commands) that haven't been wired up yet.

Two new unit tests pin the behavior:

* `test_cleanup_vm_default_honors_persist_mode` — asserts
  `cleanup_vm(task_id)` does neither docker stop nor docker rm on a
  persist-mode container (the regression Ben caught).
* `test_cleanup_vm_force_remove_tears_down_persist_container` —
  asserts the kwarg still flows through the runtime-signature-inspection
  plumbing to the backend's cleanup().

E2E verified against real Docker (in addition to all 17 existing checks):

  ✓ Default cleanup_vm() leaves persist-mode container running
  ✓ cleanup_vm(force_remove=True) removed the container

Refs #20561
2026-05-29 11:49:54 +10:00
Ben 5c2170a7c6 fix(docker): persist-mode cleanup is no-op; add force_remove kwarg (#20561)
The first iteration of this PR did docker stop on every cleanup in
persist mode (only skipping docker rm). Ben caught this as
contradicting the documented "ONE long-lived container shared across
sessions" semantics: stopping the container on every Hermes /quit kills
any background processes inside (npm watchers, pytest watchers,
long-running scripts) — exactly the case persist mode is supposed to
protect.

This commit splits the cleanup paths cleanly:

* **Persist mode (default)** — cleanup() is a NO-OP for the
  container. Container stays running, processes survive, next Hermes
  process attaches via the existing label probe in ~ms instead of
  waiting for docker start. Resource reclamation happens via the
  orphan reaper at next startup (2 × lifetime_seconds threshold), which
  covers the SIGKILL / OOM / abandoned-laptop cases.
* **Opt-out mode (persist_across_processes=False)** — unchanged:
  docker stop + docker rm -f on cleanup as before.
* **Explicit teardown** — new cleanup(force_remove=True) kwarg
  overrides persist mode and tears the container down unconditionally.
  cleanup_vm(task_id) now defaults to force_remove=True since
  it's the user-driven reset path (called from AIAgent.close(),
  /reset-style flows, and the idle reaper's per-turn cleanup).

The idle reaper in _cleanup_inactive_envs calls env.cleanup()
directly with no kwargs, so idle persist-mode envs are no-op'd — the
container survives the in-process pop and the next tool call re-probes
via labels. No state leak: _container_id is still cleared on the
in-process handle.

E2E verified against real Docker:

  ✓ Container is still running after cleanup()
  ✓ Background process (sleep loop) survived cleanup()
  ✓ Filesystem state preserved across cleanup()
  ✓ In-process container_id cleared (next __init__ will re-probe)
  ✓ Background process visible from reused env (no docker start happened)
  ✓ force_remove=True removed the container even in persist mode
  ✓ cleanup_vm() removed the container (defaults to force_remove=True)

Test changes:

* Replaces `test_cleanup_with_persist_only_stops_no_rm` with
  `test_cleanup_with_persist_is_noop_for_container` — asserts neither
  stop nor rm runs in persist mode, and the in-process handle is
  cleared so re-probe works.
* Adds `test_cleanup_force_remove_stops_and_rms_even_in_persist_mode`
  — covers the new kwarg.
* Updates `test_cleanup_uses_subprocess_run_not_detached_shell` and
  `test_wait_for_cleanup_after_cleanup_returns_true` to pass
  `force_remove=True` so they actually exercise the docker code path
  (default no-op would trivially pass).

cleanup_vm() forwards `force_remove` only to backends whose cleanup()
accepts the kwarg (currently just DockerEnvironment) via runtime
signature inspection — Modal/Daytona/SSH `cleanup()` signatures are
unchanged.
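
The signature-inspection forwarding might look roughly like this. `PersistentDockerEnv`, `SSHEnv`, and `cleanup_env` are stand-ins invented for the sketch; only the idea of checking whether `cleanup()` accepts `force_remove` comes from the commit message. The flag is keyword-only here so the sketch takes no position on its default.

```python
import inspect


class PersistentDockerEnv:
    """Stand-in for a persist-mode Docker backend."""

    def __init__(self) -> None:
        self.container_id = "abc123"
        self.removed = False

    def cleanup(self, force_remove: bool = False) -> None:
        if force_remove:
            self.removed = True  # stand-in for docker stop + docker rm -f
        # The in-process handle is cleared either way so a later init re-probes.
        self.container_id = None


class SSHEnv:
    """Stand-in for a backend whose cleanup() takes no kwargs."""

    def __init__(self) -> None:
        self.closed = False

    def cleanup(self) -> None:
        self.closed = True


def cleanup_env(env, *, force_remove: bool) -> None:
    """Forward force_remove only when the backend's cleanup() accepts it."""
    params = inspect.signature(env.cleanup).parameters
    if "force_remove" in params:
        env.cleanup(force_remove=force_remove)
    else:
        env.cleanup()
```

Checking the signature at call time means backends that never learn about `force_remove` keep working without any change.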

Refs #20561
2026-05-29 11:49:54 +10:00
Ben d77d877665 fix(docker): startup orphan reaper for crashed-process containers
The cleanup-fix in the previous commit handles the graceful-exit leak: a
Hermes process that runs ``atexit`` will now actually wait on the docker
stop/rm worker thread, so containers either survive (persist mode) or are
fully removed (opt-out mode) by the time the interpreter exits.

But ``atexit`` doesn't fire on SIGKILL, OOM-kill, or terminal-window
close. Containers from those exits stay parked with no surviving Python
process to reuse or remove them, so they accumulate until the operator
intervenes with ``docker rm -f``. The cleanup-fix doesn't help this class
— there's no live cleanup() to fix.

This commit adds the safety net: a startup orphan reaper that runs once
per Hermes process and removes long-Exited hermes-labeled containers
that the prior commit couldn't reach.

Implementation:

* New ``reap_orphan_containers()`` in ``tools/environments/docker.py``.
  Filters: ``label=hermes-agent=1`` + ``status=exited`` + (optional)
  ``label=hermes-profile=<current>``. Per-container ``docker inspect``
  parses ``State.FinishedAt`` (with nanosecond-precision trimming for
  Python's microsecond-bound ``fromisoformat``); containers older than
  the threshold get ``docker rm -f``'d. The ``status=exited`` filter is
  load-bearing — a running container may belong to a sibling Hermes
  process whose reuse path will pick it up; killing it would crash the
  sibling mid-command. Single-container failures are logged and the
  sweep continues to the next candidate.

* New ``_maybe_reap_docker_orphans()`` helper in
  ``tools/terminal_tool.py``. Wired into ``_create_environment()`` for
  ``env_type == "docker"``. Gated by:

    - ``terminal.docker_orphan_reaper: true`` (default; opt-out for
      operators running multiple Hermes processes in the same profile
      who don't trust the conservative defaults)
    - ``_docker_orphan_reaper_ran`` module flag with double-checked
      locking — parallel subagents and RL rollouts don't trigger N
      concurrent docker ps storms
    - Age threshold = ``2 × TERMINAL_LIFETIME_SECONDS`` with a 60s floor
      (so ``TERMINAL_LIFETIME_SECONDS=0`` doesn't race the user's own
      setup)
    - Profile scoping — a research profile NEVER reaps the default
      profile's stragglers
    - Exception swallow — a janitor failure must never block container
      creation

* New config ``terminal.docker_orphan_reaper`` wired through all four
  config-bridge sites (cli.py, gateway/run.py, hermes_cli/config.py,
  tests/conftest.py) and pinned by
  ``test_docker_orphan_reaper_is_bridged_everywhere``.
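
The timestamp handling and age threshold can be sketched as below. This is an illustrative reimplementation, not the contents of `reap_orphan_containers()`: the function names are hypothetical, and the regex-based trimming is one possible way to fit Docker's nanosecond `State.FinishedAt` into `fromisoformat`.

```python
import re
from datetime import datetime, timezone
from typing import Optional

_TS = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_finished_at(raw: str) -> Optional[datetime]:
    """Parse a Docker-style timestamp; return None when it is unusable."""
    match = _TS.match(raw.strip())
    if not match:
        return None
    base, frac, tz = match.groups()
    micros = ((frac or "") + "000000")[:6]  # pad or trim to microseconds
    tz = "+00:00" if tz == "Z" else tz
    try:
        parsed = datetime.fromisoformat(f"{base}.{micros}{tz}")
    except ValueError:
        return None
    if parsed.year <= 1:
        return None  # zero-value FinishedAt (0001-01-01T00:00:00Z)
    return parsed


def reap_threshold_seconds(lifetime_seconds: int) -> int:
    """Twice the terminal lifetime, with a 60s floor."""
    return max(60, 2 * lifetime_seconds)


def is_reapable(raw_finished_at: str, now: datetime, lifetime_seconds: int) -> bool:
    finished = parse_finished_at(raw_finished_at)
    if finished is None:
        return False  # unparseable timestamps are never reaped
    age = (now - finished).total_seconds()
    return age > reap_threshold_seconds(lifetime_seconds)
```

Returning `False` for anything unparseable keeps the sweep conservative: a container is only removed when its exit time is known and old enough.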

Coverage:

* 9 new unit tests in test_docker_environment.py — happy path, recent-
  container sparing, profile scoping, unparseable-timestamp safety,
  docker-ps-failure handling, partial-failure continuation, nanosecond
  timestamp parsing, zero-value FinishedAt rejection.
* 6 new integration tests in test_docker_orphan_reaper_integration.py
  — once-per-process gate, disable-flag respected, lifetime doubling
  with 60s floor, current-profile filter wiring, exception swallow.
* 1 new bridge-invariant regression test.
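
The once-per-process gate and exception swallow exercised by those integration tests can be illustrated with a small class. `OncePerProcess` is a hypothetical name; the commit describes a module-level flag rather than an object, so this is only a sketch of the double-checked locking pattern itself.

```python
import threading
from typing import Callable


class OncePerProcess:
    """Run a janitor callable at most once, even under concurrent callers."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._ran = False

    def run(self, janitor: Callable[[], None], enabled: bool = True) -> bool:
        """Return True when this call was the one that ran the janitor."""
        if not enabled or self._ran:  # cheap check without the lock
            return False
        with self._lock:
            if self._ran:  # re-check now that the lock is held
                return False
            self._ran = True
        try:
            janitor()
        except Exception:
            # A janitor failure must never block the caller.
            pass
        return True
```

The flag is set before the janitor runs, so a failing sweep is not retried by the next caller in the same process.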

Closes #20561 (combined with the two prior commits on this branch).
2026-05-29 11:49:54 +10:00
Ben ac8e238bc8 fix(docker): reuse containers across processes + fix cleanup leaks
The Docker backend docs claim "Single persistent container — ONE long-
lived container shared across sessions, /new, /reset, and delegate_task
subagents. Stopped/removed on shutdown." In practice the code only
honored that contract within a single Python process via the in-memory
`_active_environments[task_id]` cache. Every `hermes chat` invocation
spawned a fresh `hermes-<hex>` container; older containers piled up in
`Exited` state and accumulated until manual `docker rm` (issue #20561).

Three root causes, all addressed by this commit:

1. No cross-process container discovery.
2. `cleanup()` used fire-and-forget `subprocess.Popen("... &", shell=True)`
   which raced with parent-process exit — when Python exited promptly the
   detached shell child got killed mid-`docker stop`, leaving stopped
   containers behind.
3. The `docker rm` step in cleanup was gated on `not self._persistent`
   (the bind-mount-persistence flag). Default config sets
   `container_persistent: true`, so the default happy path skipped `rm`
   entirely — even when the user explicitly didn't want cross-process
   reuse, containers leaked.

Fix:

* Add `DockerEnvironment.__init__(persist_across_processes=True)`. When
  true, init probes
  `docker ps -a --filter label=hermes-agent=1
                --filter label=hermes-task-id=<task>
                --filter label=hermes-profile=<profile>`
  and reuses a matching container (running → attach; stopped →
  `docker start` → attach; `docker start` failure → fall through to a
  fresh `docker run`). Multiple matches prefer the running one, with the
  stragglers left for the orphan reaper (next commit) to clean up.

* Rewrite `cleanup()`. Uses `subprocess.run(..., timeout=30)` on a
  daemon `threading.Thread`, not the racy `Popen(... &)`. The
  `_persistent` guard is dropped on the `rm` step — `rm` now runs
  whenever `persist_across_processes` is false, regardless of the
  bind-mount-persistence setting. The leak class is gone in all
  combinations.

* Add `wait_for_cleanup(timeout)`. `tools/terminal_tool.py`'s atexit
  hook calls this on every active env, blocking up to 15s for the
  cleanup thread before interpreter exit. Without this, `hermes /quit`
  raced the daemon-thread teardown and dropped the stop/rm work.

* New config `terminal.docker_persist_across_processes` (default
  `true` — restores the documented contract). Set `false` for hard
  per-process isolation. Wired through all four config-bridge sites
  (cli.py env_mappings, gateway/run.py _terminal_env_map,
  hermes_cli/config.py _config_to_env_sync, tests/conftest.py env-strip
  list); regression-pinned by
  `test_docker_persist_across_processes_is_bridged_everywhere` matching
  the existing pattern for docker_run_as_host_user / docker_env.
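
The thread-plus-wait shape of the rewritten cleanup can be sketched in isolation. `BackgroundCleanup` is a hypothetical helper, not the `DockerEnvironment` API; the injectable `run` parameter exists only so the sketch can be exercised without a Docker daemon.

```python
import subprocess
import threading
from typing import Callable, List, Optional, Sequence


class BackgroundCleanup:
    """Run teardown commands on a daemon thread that callers can wait on."""

    def __init__(
        self,
        commands: Sequence[List[str]],
        run: Callable[..., object] = subprocess.run,
    ) -> None:
        self._commands = list(commands)
        self._run = run
        self._thread: Optional[threading.Thread] = None

    def start(self) -> None:
        self._thread = threading.Thread(target=self._work, daemon=True)
        self._thread.start()

    def _work(self) -> None:
        for cmd in self._commands:
            try:
                # A bounded subprocess.run replaces the detached `... &` shell.
                self._run(cmd, timeout=30, capture_output=True)
            except Exception:
                continue  # one failed step should not skip the next

    def wait(self, timeout: float) -> bool:
        """Block up to `timeout` seconds; True when no work is left running."""
        if self._thread is None:
            return True
        self._thread.join(timeout)
        return not self._thread.is_alive()
```

An atexit hook would call `wait(15.0)` on each active handle so the interpreter does not exit while `docker stop` or `docker rm` is still in flight.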

Reuse intentionally does NOT compare image / mounts / resources — only
the labels. Operators changing those settings should set
`docker_persist_across_processes: false` (or `docker rm -f` the
labeled container) to force a fresh start. This keeps the probe cheap
and the failure mode obvious.
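
A label-only probe along these lines could be sketched as follows. The three filters are the ones quoted in the commit message; the `--format` template, the tab-separated parsing, and both function names are assumptions made for the example. The fall-through to a fresh `docker run` when `docker start` fails is left out.

```python
from typing import List, Optional, Tuple


def build_probe_command(
    task_id: str, profile: str, docker_binary: str = "docker"
) -> List[str]:
    return [
        docker_binary, "ps", "-a",
        "--filter", "label=hermes-agent=1",
        "--filter", f"label=hermes-task-id={task_id}",
        "--filter", f"label=hermes-profile={profile}",
        # Assumed output shape: one "<id><TAB><state>" row per container.
        "--format", "{{.ID}}\t{{.State}}",
    ]


def choose_container(ps_output: str) -> Optional[Tuple[str, str]]:
    """Pick (container_id, action); a running match wins over stopped ones."""
    rows = [line.split("\t", 1) for line in ps_output.splitlines() if "\t" in line]
    for container_id, state in rows:
        if state.strip() == "running":
            return container_id, "attach"
    if rows:
        return rows[0][0], "start"  # stopped: docker start, then attach
    return None  # no match: caller falls through to a fresh docker run
```

Nothing about image, mounts, or resources is compared in this sketch, which matches the label-only contract described above.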

Coverage: 12 new unit tests in tests/tools/test_docker_environment.py
covering reuse paths (running, stopped, fallback, opt-out, duplicate
preference) and cleanup behavior (persist-mode no-rm, opt-out always-rm,
no-Popen, wait_for_cleanup semantics, partial-init safety). Plus one
config-bridge regression pin.

Refs #20561
2026-05-29 11:49:54 +10:00
Ben 8d129d013b fix(docker): tag containers with hermes-agent labels for identification
Issue #20561 (Docker containers accumulate) needs a way to identify
hermes-created containers from the outside — both for the orphan reaper
(a follow-up commit) and for operators triaging `docker ps -a | grep
hermes-` after a SIGKILL leaves stragglers. The previous `hermes-<hex>`
name prefix was the only signal, which broke down under cross-process
reuse (planned) and against any custom `--name` someone might pass via
`docker_extra_args`.

This commit adds three labels at `docker run` time:

  --label hermes-agent=1                # global sweep target
  --label hermes-task-id=<sanitized>    # per-task reuse key
  --label hermes-profile=<sanitized>    # per-profile isolation key

Values are sanitized to `[A-Za-z0-9_.-]` and truncated to 63 chars so the
label round-trips cleanly through `docker ps --filter label=key=value`.
Empty or non-string inputs collapse to "unknown" rather than producing
an unqueryable empty value.
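
One way the sanitizing rule could be written is shown below. The allowed character set, the 63-character limit, and the "unknown" fallback come from the text above; replacing disallowed characters with `_` (rather than dropping them) and the function names are assumptions.

```python
import re
from typing import List

_UNSAFE = re.compile(r"[^A-Za-z0-9_.-]")


def sanitize_label_value(value: object) -> str:
    """Reduce a value to [A-Za-z0-9_.-], at most 63 chars, never empty."""
    if not isinstance(value, str) or not value:
        return "unknown"
    cleaned = _UNSAFE.sub("_", value)[:63]  # replacement char is assumed
    return cleaned or "unknown"


def label_args(task_id: object, profile: object) -> List[str]:
    """Build the three --label arguments passed at docker run time."""
    return [
        "--label", "hermes-agent=1",
        "--label", f"hermes-task-id={sanitize_label_value(task_id)}",
        "--label", f"hermes-profile={sanitize_label_value(profile)}",
    ]
```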

No behavior change: the labels are pure metadata. The follow-up commits
in this PR (cleanup-fix + orphan reaper) are what use them.

Refs #20561
2026-05-29 11:49:54 +10:00
Teknium 300140e006 test(tui_gateway): stop reloading server module in fixture teardown (#34217)
tui_gateway.server registers two atexit hooks at module load time:
ThreadPoolExecutor shutdown (line 170) and _shutdown_sessions (line 336).
Three test files reloaded the module on each fixture teardown to reset
per-test state. Each reload re-runs module-level code, including the
atexit registrations — duplicates accumulate across the test session.

At pytest interpreter shutdown the duplicated atexit hooks race the
stderr buffer flush:

    Fatal Python error: _enter_buffered_busy: could not acquire lock
    for <_io.BufferedWriter name='<stderr>'> at interpreter shutdown,
    possibly due to daemon threads

pytest reports 'tests passed but the slice exited non-zero', and the
shard turns red on CI. Surfaced today on PR #34193's test slice 1
(204 files, 3572 tests passed, then Fatal Python error during exit).

Fix: drop importlib.reload(mod) from the three fixtures that have it.
Per-test reset is handled by clearing the mutable session dicts
(_sessions, _pending, _answers). _methods is also no longer cleared —
it's populated at module import time and would only be re-populated by
a reload, so clearing it without reload broke session.resume /
command.dispatch / slash.exec method registration across tests.
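
The reset that replaces the reload can be pictured as a small helper. `reset_session_state` is a hypothetical name; the attribute names are the ones listed above, and the `server` argument stands for the already-imported `tui_gateway.server` module.

```python
def reset_session_state(server) -> None:
    """Clear per-test mutable state without re-running module-level code."""
    for name in ("_sessions", "_pending", "_answers"):
        getattr(server, name).clear()
    # server._methods is deliberately left alone: it is populated at import
    # time, and only a reload would repopulate it.
```

A fixture would call this after its `yield` instead of `importlib.reload(mod)`, so the atexit hooks registered at import time are never registered a second time.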

Affected fixtures:
- tests/tui_gateway/test_goal_command.py
- tests/tui_gateway/test_protocol.py
- tests/tui_gateway/test_review_summary_callback.py

The second reload in test_protocol.py at line 211 (reload of
tui_gateway.transport) is preserved — transport.py has no atexit hooks
or threads, so reload is safe there.

Tests: 84/84 in tests/tui_gateway/ pass cleanly with exit code 0; no
Fatal Python error at interpreter shutdown.
2026-05-28 18:16:54 -07:00
Teknium e71a2bd11b chore: release v0.15.1 (2026.5.29) (#34222)
2026-05-28 18:11:49 -07:00
Teknium 769ee86cd2 feat(kanban): attach images referenced in task bodies to worker vision (#34210)
Kanban workers now scan the task body for local image paths and
http(s) image URLs and attach them to the worker's first user turn —
matching the CLI/gateway behaviour for inbound images. Before, a
user pasting `/home/me/screenshot.png` or `https://example.com/img.png`
into a kanban task description had it sent to the model as plain
text and the pixels were never seen.

How it works:
* agent/image_routing.py gains extract_image_refs(text) → (paths, urls)
  that mirrors gateway/platforms/base.py:extract_local_files (absolute /
  ~-relative paths, image extensions only, ignores fenced/inline code).
* build_native_content_parts() accepts an optional image_urls= kwarg
  and emits passthrough image_url parts for remote URLs alongside the
  base64 data: URLs used for local paths.
* cli.py (single-query/quiet branch — the path every dispatcher-spawned
  worker takes) detects HERMES_KANBAN_TASK, reads the task body via
  kanban_db.get_task, runs extract_image_refs, and threads the results
  into the existing image-routing decision (native vs text). Best-effort:
  enrichment failures never block worker startup.
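
A simplified sketch of the extraction step, assuming regex-based matching and a fixed extension list. It is not the real `extract_image_refs`: the function name, the patterns, and the set of extensions are all illustrative choices, and the actual helper may differ in which paths and URLs it accepts.

```python
import re
from typing import List, Tuple

_IMAGE_EXT = r"(?:png|jpe?g|gif|webp)"  # assumed extension list
_FENCED = re.compile(r"```.*?```", re.DOTALL)
_INLINE = re.compile(r"`[^`\n]*`")
_URL = re.compile(rf"https?://[^\s<>]+\.{_IMAGE_EXT}\b", re.IGNORECASE)
# Absolute or ~-relative paths only; the lookbehind rejects relative paths.
_PATH = re.compile(rf"(?<![\w/.~:])~?/[^\s<>]+\.{_IMAGE_EXT}\b", re.IGNORECASE)


def extract_image_refs_sketch(text: str) -> Tuple[List[str], List[str]]:
    """Return (local_paths, urls) found outside fenced and inline code."""
    visible = _INLINE.sub(" ", _FENCED.sub(" ", text))
    urls = _URL.findall(visible)
    # Remove URLs first so their path component is not counted as a local file.
    paths = _PATH.findall(_URL.sub(" ", visible))
    return paths, urls
```

Stripping code spans before matching is what keeps a path quoted as an example inside backticks from being attached as an image.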

Tested:
* tests/agent/test_image_routing.py — 22 new tests for extract_image_refs
  and URL pass-through in build_native_content_parts.
* tests/hermes_cli/test_kanban_worker_image_extraction.py — 10 new tests
  driving real kanban_db round-trip (create task → read body → extract
  refs → build parts).
* E2E: created a fake kanban task with a body referencing both a local
  PNG and an https URL; verified the worker pipeline produces a
  multimodal user turn with 1 text part + 2 image_url parts (data URL
  for the local file, passthrough URL for the remote).
2026-05-28 17:50:42 -07:00
Ben 1b1e30510a test(docker): repair dashboard tests broken by the insecure-opt-in fix
The Docker integration test job started failing on main after
fb5125362 ("docker: opt in to dashboard --insecure via env var").
Two distinct failures, both fallout from that change being more
behaviour-changing than the existing test harness anticipated.

Failure 1 — test_dashboard_port_override (silent regression in an
already-existing test)
The test starts the container with just HERMES_DASHBOARD=1: the host
defaults to 0.0.0.0, and neither HERMES_DASHBOARD_OAUTH_CLIENT_ID nor
HERMES_DASHBOARD_INSECURE is set. Pre-fix, that combination got --insecure
auto-injected by the s6 run script (anything non-loopback was
implicitly insecure), so the OAuth gate stayed off and start_server
bound the port. Post-fix the gate engages, no provider is
registered, and start_server raises SystemExit before binding —
under s6 the dashboard goes into a restart loop and the test's
/proc/net/tcp poll finds nothing.

The same regression had gone unnoticed in three sibling tests
(test_dashboard_slot_reports_up_when_enabled, test_dashboard_opt_in_starts,
test_dashboard_restarts_after_crash): they only sample pgrep or
s6-svstat, so they caught the supervised process mid-restart loop and
appeared to pass while the dashboard never actually reached a healthy
state.

Fix: pin HERMES_DASHBOARD_INSECURE=1 on every test that enables
the dashboard but doesn't itself exercise the auth gate. Each
pinned site carries an inline comment pointing back to
test_dashboard_slot_reports_up_when_enabled for the full
rationale.

Failure 2 — test_dashboard_oauth_gate_engages_on_non_loopback_bind
(bug in the test I added in fb5125362)
The probe used urllib.request.urlopen() against /api/status. Under
the now-engaged OAuth gate /api/status no longer answers
unauthenticated callers (the gate middleware runs upstream of the
legacy _SESSION_TOKEN allowlist and 401s anything without a valid
session cookie). urlopen() raises HTTPError on the 401, the wrapper
treated that as "not ready yet", and the poll loop hit
timeout.

Fix: split the probe into a generic _http_probe() helper that
returns (status_code, body) for any HTTP response — including 401,
which IS the gate-engaged success signal. The helper feeds a
multi-line Python program over stdin via a POSIX heredoc so the
try/except branch reads naturally; far less fragile than the
earlier semicolon-laden -c one-liner.
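
The core of such a probe, written as a plain host-side function rather than the heredoc-fed program the test actually uses, might look like this. `http_probe` is a hypothetical stand-in for `_http_probe()`; only the idea of returning `(status_code, body)` for every HTTP response is taken from the text.

```python
import urllib.error
import urllib.request
from typing import Tuple


def http_probe(url: str, timeout: float = 5.0) -> Tuple[int, bytes]:
    """Return (status_code, body) for any HTTP response, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # urlopen raises on 401; for the gate test that IS the success signal.
        return err.code, err.read()
```

Connection-level failures (`urllib.error.URLError`) still propagate, so a polling loop can keep treating those, and only those, as "not ready yet".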

The OAuth-gate test now verifies two independent observable
consequences of the gate being on:

  1. GET /api/auth/providers (publicly reachable through the gate
     so the login page can bootstrap) returns 200 with `nous` in
     the provider list — proves the bundled provider registered.
  2. GET /api/status returns 401 — proves the OAuth gate runs
     upstream of the legacy public-paths allowlist and is
     actively intercepting unauthenticated callers.

The insecure-opt-out test still hits /api/status, but now
asserts status_code == 200 first (proves the gate is bypassed)
before parsing the JSON for auth_required: false (proves the
gate-state flag is also correctly off).

Verified locally end-to-end against a fresh image build on a
real Docker daemon: all 41 tests under tests/docker/ pass in
2m38s, including the two formerly-failing dashboard tests and
the three sibling tests that were passing by accident.
2026-05-29 10:30:52 +10:00
Teknium f3acdd94fe Merge pull request #30698 from NousResearch/refactor/use-ds-primitives
refactor(web): consume DS primitives, remove local component copies
2026-05-28 17:29:28 -07:00
Teknium 78a54d2c00 fix(skills-page): source pills and category sidebar collapsed to All only (#34194)
Regression from PR #33809 (lazy-fetch refactor). The `sources` and
`categoryEntries` useMemo blocks were derived from `allSkillsLocal`
but had empty/incomplete deps arrays — so they computed once at mount
when the catalog was still `[]`, then never recomputed when the fetch
resolved.

Symptom: live site shows only the "All 87,639" source button and
"All Skills 87,639" category — no per-source pills (ClawHub, skills.sh,
LobeHub, etc.) and no category breakdown. Filtering by source/category
is unusable.

Fix: add `allSkillsLocal` to both deps arrays so they recompute when
data arrives. Local build green on en + zh-Hans.
2026-05-28 17:11:40 -07:00
Ben e7c99651fb fix(mcp): resolve bare npx/npm/node against /usr/local/bin
When the Hermes Docker image runs an stdio MCP server configured with an
explicit env.PATH that omits /usr/local/bin (a common pattern when users
hand-author PATH for sandboxing), the MCP env-filter passes that narrow
PATH straight through to the subprocess. _resolve_stdio_command's
fallback for bare 'npx' / 'npm' / 'node' commands only checked
$HERMES_HOME/node/bin/ and ~/.local/bin/, so execvp() failed with
'[Errno 2] No such file or directory: npx' on every Node-based stdio
MCP server (Railway, Anthropic, GitHub Copilot, etc.).

The naive workaround — symlink /usr/local/bin/npx into the user's PATH —
fails one layer deeper because npx's shebang re-execs /usr/bin/env node
and node also lives at /usr/local/bin/node.

Fix: add /usr/local/bin/<cmd> as a third candidate in the fallback list.
This is the canonical install location for Node on:
  - Linux from-source builds
  - the upstream node:bookworm-slim image, which the Hermes Docker
    image copies node + npm + corepack from since #4977 (the Node 22 LTS
    refactor that exposed this)
  - macOS Homebrew on Intel

Because the resolver already calls _prepend_path(resolved_env, command_dir)
after locating the command, /usr/local/bin gets prepended to the env's
PATH automatically, which also fixes the second-layer shebang failure
(npx-cli.js can now find node).

Scope is intentionally narrow: the fix activates only when the bare
command isn't otherwise locatable through the user's PATH. Users who
explicitly narrowed PATH for a non-Node MCP server see no change in
behavior.
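
The fallback order and PATH prepend can be sketched like this. It is an illustration under assumptions, not `_resolve_stdio_command` itself: the function name is hypothetical, and the candidate directories are passed in so the example does not depend on the real layout (`$HERMES_HOME/node/bin`, `~/.local/bin`, `/usr/local/bin`).

```python
import os
import shutil
from typing import Dict, Iterable, Tuple


def resolve_stdio_command_sketch(
    command: str, env: Dict[str, str], fallback_dirs: Iterable[str]
) -> Tuple[str, Dict[str, str]]:
    """Resolve a bare command, consulting fallback dirs only as a last resort."""
    resolved_env = dict(env)
    if os.sep in command:
        return command, resolved_env  # not a bare command; leave it alone
    found = shutil.which(command, path=resolved_env.get("PATH", ""))
    if found:
        return found, resolved_env  # the user's PATH already works
    for directory in fallback_dirs:
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            old_path = resolved_env.get("PATH", "")
            # Prepending the directory lets a `#!/usr/bin/env node` shebang
            # find node next to npx.
            resolved_env["PATH"] = directory + (os.pathsep + old_path if old_path else "")
            return candidate, resolved_env
    return command, resolved_env
```

Because the fallback list is only walked after the PATH lookup fails, an environment where the command is already reachable comes back unchanged.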

Tested:
  - tests/tools/test_mcp_tool_issue_948.py: new test
    test_resolve_stdio_command_falls_back_to_usr_local_bin (mirrors the
    existing hermes-node-bin fallback test)
  - Full MCP test suite: 254/254 pass across 7 test files
  - E2E against a freshly-built Docker image: reproduced the original
    failure mode (env.PATH=/opt/data/bin:/usr/bin:/bin), confirmed the
    resolver returns /usr/local/bin/npx and prepends /usr/local/bin to
    PATH; subprocess.run of the resolved command prints '10.9.8' and
    exits 0 with empty stderr
  - Negative E2E on the host (where Node is already on PATH via mise):
    resolver still hits the mise install dir, /usr/local/bin candidate
    is not consulted, PATH is unchanged
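
The fallback order described above can be sketched in Python. This is a minimal illustration, not the code of `_resolve_stdio_command`: the names `resolve_bare_command` and `prepend_path` and the `candidate_dirs` parameter are hypothetical, and the sketch assumes a POSIX host.

```python
import os
import shutil
from pathlib import Path


def prepend_path(env: dict, directory: str) -> None:
    """Put directory at the front of env['PATH'] unless it is already listed."""
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        env["PATH"] = os.pathsep.join([directory, *parts])


def resolve_bare_command(command: str, env: dict, candidate_dirs: list) -> str:
    """Resolve a bare command name (hypothetical helper, not the Hermes API).

    The PATH configured by the user wins. Only when the command cannot be
    found there are the fallback directories consulted, in order.
    """
    if os.sep in command:
        return command  # already a path, nothing to resolve
    found = shutil.which(command, path=env.get("PATH", ""))
    if found:
        return found  # PATH is left untouched in this case
    for directory in candidate_dirs:
        candidate = Path(directory) / command
        if candidate.is_file() and os.access(candidate, os.X_OK):
            # Prepending the directory lets a shebang such as
            # '/usr/bin/env node' find its interpreter as well.
            prepend_path(env, str(candidate.parent))
            return str(candidate)
    return command  # give up; the caller sees the original ENOENT
```

Under these assumptions the fix amounts to appending `/usr/local/bin` to `candidate_dirs`; the negative case falls out of the early return when the command is already locatable on PATH.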
2026-05-29 10:05:42 +10:00
Ben fb51253620 docker: opt in to dashboard --insecure via env var, never derive from bind host
The s6 dashboard run script flipped `--insecure` on whenever
`HERMES_DASHBOARD_HOST` was anything other than 127.0.0.1 / localhost.
That comment ("the dashboard refuses otherwise") predates the OAuth
auth gate: back when it was written, `start_server` would SystemExit
on any non-loopback bind, so the run script's `--insecure` was the
only way to make in-container deployments work at all.

The gate has since been replaced by `should_require_auth(host,
allow_public)`, which engages the OAuth flow when a
`DashboardAuthProvider` is registered (the bundled `dashboard_auth/nous`
provider auto-registers on `HERMES_DASHBOARD_OAUTH_CLIENT_ID`) and
fails closed with a specific operator-facing error when none is. The
host-derived `--insecure` ran upstream of all that and silently
disabled the gate on every container-deployed dashboard.

Most visible under the portal's wildcard-subdomain rollout: every Fly
machine binds 0.0.0.0 so the edge can reach Flycast, every machine
boots with the correct `HERMES_DASHBOARD_OAUTH_CLIENT_ID`, the nous
provider registers — and `/api/status` still returns
`{"auth_required": false, "auth_providers": ["nous"]}` because the
run script disabled the gate before `start_server` ever saw the
request. The dashboard SPA was served to anyone, no `/login` redirect,
no OAuth challenge.

Fix: derive `--insecure` from an explicit opt-in env var,
`HERMES_DASHBOARD_INSECURE` (truthy values matching the rest of the
s6 boolean envs: 1, true, TRUE, True, yes, YES, Yes). Operators on
trusted LANs behind a reverse proxy without the OAuth contract
(the existing `docker-compose.windows.yml` use case) opt in
explicitly; portal-managed agent deployments leave it unset and let
the gate engage.
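
To make the difference between the two rules concrete, here is a small Python rendering of both decisions. The real logic lives in the s6 run script (shell); the function names below are hypothetical, and only the env var names and the truthy values are taken from this message.

```python
# Truthy spellings listed in the commit message for the s6 boolean envs.
TRUTHY = frozenset({"1", "true", "TRUE", "True", "yes", "YES", "Yes"})


def legacy_host_derived_insecure(env: dict) -> bool:
    """Old rule: any non-loopback bind host switched --insecure on."""
    host = env.get("HERMES_DASHBOARD_HOST", "127.0.0.1")  # default is an assumption
    return host not in ("127.0.0.1", "localhost")


def explicit_insecure_opt_in(env: dict) -> bool:
    """New rule: only the explicit env var matters; the bind host is ignored."""
    return env.get("HERMES_DASHBOARD_INSECURE", "") in TRUTHY


fly_machine = {"HERMES_DASHBOARD_HOST": "0.0.0.0"}
print(legacy_host_derived_insecure(fly_machine))  # old rule disables the gate
print(explicit_insecure_opt_in(fly_machine))      # new rule leaves it engaged
```

The membership test is deliberately exact rather than case-insensitive, mirroring the enumerated list above; a value such as `tRuE` would not count as an opt-in in this sketch.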

`docker-compose.windows.yml` already passes `--insecure` on the
`command:` array directly (line 38), so it doesn't depend on the s6
auto-injection. No compose-file change required.

Tests:
* `tests/test_docker_home_override_scripts.py` — extends the existing
  static-text guard with a regression assertion that the legacy
  host-derived case-statement is gone and the new env-var opt-in is
  present (locks against accidental revert).
* `tests/docker/test_dashboard.py` — adds two Docker-in-Docker tests
  exercising the actual `/api/status` round-trip:
  - 0.0.0.0 bind + `HERMES_DASHBOARD_OAUTH_CLIENT_ID` → gate engaged
  - 0.0.0.0 bind + `HERMES_DASHBOARD_INSECURE=1` → gate disabled

Docs:
* `website/docs/user-guide/docker.md` + zh-Hans i18n — adds the new
  env var to the table, replaces the stale prose ("the entrypoint
  no longer auto-enables insecure mode" — which until this PR was
  flat-out wrong) with an accurate description of the gate's
  trigger conditions and the explicit opt-out.

shellcheck clean. Python static-text test passes locally. Behavioural
test will run against any future image build (CI's Docker harness).
2026-05-29 09:56:40 +10:00
Evo ef009a987a docs(reference): document --no-supervise / HERMES_GATEWAY_NO_SUPERVISE from #33583 (#33751)
* docs(reference): document --no-supervise / HERMES_GATEWAY_NO_SUPERVISE (en)

* docs(reference): document --no-supervise / HERMES_GATEWAY_NO_SUPERVISE (en)

* docs(reference): document --no-supervise / HERMES_GATEWAY_NO_SUPERVISE (zh)

* docs(reference): document --no-supervise / HERMES_GATEWAY_NO_SUPERVISE (zh)
2026-05-29 09:44:53 +10:00
BROCCOLO1D 130396c658 ci(docker): avoid gha cache on arm64 PR builds
2026-05-29 09:43:48 +10:00
Austin Pickett a5c1f925b5 fix(web): stop /api/auth/me 401 from triggering a reload loop
In loopback mode the dashboard's identity probe (/api/auth/me) returns
401 by design — AuthWidget swallows it and renders nothing. But the
probe routed through fetchJSON, whose loopback 401 handler treats a 401
as a rotated session token and full-page-reloads to pick up a fresh one.
That reload is guarded by a one-shot sessionStorage flag which every
*successful* request clears, so with auth/me reliably 401ing and the
other dashboard calls (status/config/sessions) reliably succeeding, the
guard never sticks and the page reload-loops indefinitely (the "boot
flash").

Add an allowUnauthorized option to fetchJSON that skips only the loopback
stale-token reload (the 401 still throws so AuthWidget can catch it, and
the gated-mode login_url envelope redirect is unaffected), and use it for
getAuthMe.
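
The interaction is easier to see in a toy model. The Python sketch below is not the dashboard code (which is TypeScript); the guard key name, the request list, and the assumption that a successful response arrives before the probe's 401 on every page load are all illustrative.

```python
GUARD_KEY = "reload-guard"  # hypothetical name for the one-shot sessionStorage flag


def page_loads_until_settled(allow_unauthorized: bool, max_loads: int = 5) -> int:
    """Count page loads before the page stops reloading (capped at max_loads)."""
    session_storage = {}  # survives reloads, like the browser's sessionStorage
    # Assumed ordering: a successful call lands before the identity probe's 401.
    responses = [("/api/status", 200), ("/api/auth/me", 401)]
    for load in range(1, max_loads + 1):
        reloading = False
        for path, status in responses:
            if status == 200:
                # Every successful request clears the one-shot guard.
                session_storage.pop(GUARD_KEY, None)
            elif allow_unauthorized and path == "/api/auth/me":
                # The 401 still propagates to the caller; only the reload is skipped.
                continue
            elif GUARD_KEY not in session_storage:
                session_storage[GUARD_KEY] = "1"
                reloading = True
                break
        if not reloading:
            return load
    return max_loads


print(page_loads_until_settled(allow_unauthorized=False))  # hits the cap: reload loop
print(page_loads_until_settled(allow_unauthorized=True))   # settles on the first load
```

In this model the guard is always cleared before the probe fails, so it never suppresses the next reload, which matches the "guard never sticks" description.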

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-28 16:58:42 -04:00
Austin Pickett 0acb7f4583 fix(nix): update hermes-web npmDepsHash for @nous-research/ui 0.18.2
The web/package-lock.json changed when bumping @nous-research/ui to
0.18.2, so the fetchNpmDeps fixed-output hash in nix/web.nix was stale.
Update it to the hash prefetch-npm-deps computes for the new lockfile.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-28 16:24:01 -04:00
Austin Pickett a3cd974ee7 chore(web): bump @nous-research/ui to 0.18.2
Picks up the deferred GPU-tier detection fix (design-language) that
stops the synchronous WebGL probe from blocking first paint, which was
causing a boot-time flash in the dashboard backdrop.

nix/web.nix npmDepsHash is a placeholder here and is corrected in the
follow-up commit using the hash reported by the Nix CI job.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-28 16:20:14 -04:00
Austin Pickett 102eb4adc0 fix(nix): update hermes-web npmDepsHash for bumped @nous-research/ui
The web/package-lock.json changed when bumping @nous-research/ui to 0.18.0,
so the fetchNpmDeps fixed-output hash in nix/web.nix was stale and the nix
build failed. Update it to the hash prefetch-npm-deps computes for the new
lockfile.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-28 14:27:08 -04:00
Austin Pickett c661fefa08 Merge remote-tracking branch 'origin/main' into refactor/use-ds-primitives
Co-authored-by: Cursor <cursoragent@cursor.com>

# Conflicts:
#	web/src/components/BottomPickSheet.tsx
#	web/src/components/SidebarFooter.tsx
#	web/src/components/ui/card.tsx
#	web/src/components/ui/confirm-dialog.tsx
#	web/src/pages/ChatPage.tsx
2026-05-28 14:20:49 -04:00
Austin Pickett c9e5a9bb08 refactor(web): consume DS primitives, remove local component copies
Replace locally-forked UI components and hooks with their newly
promoted counterparts from @nous-research/ui:

Deleted local components (now in DS):
- components/ui/input.tsx, label.tsx, separator.tsx, card.tsx,
  confirm-dialog.tsx
- components/Toast.tsx, BottomPickSheet.tsx, NouiTypography.tsx
- hooks/useToast.ts, useModalBehavior.ts, useBelowBreakpoint.ts,
  useConfirmDelete.ts

Import updates across 25 files to use DS deep imports:
- @nous-research/ui/ui/components/{input,label,separator,card,
  confirm-dialog,toast,bottom-sheet}
- @nous-research/ui/ui/components/typography (replaces NouiTypography)
- @nous-research/ui/hooks/{use-toast,use-modal-behavior,
  use-below-breakpoint,use-confirm-delete}

Requires design-language >= feat/promote-hermes-web-primitives.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-22 21:57:59 -04:00
396 changed files with 12700 additions and 72040 deletions
+20 -4
@@ -196,10 +196,26 @@ jobs:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3
# Build once, load into the local daemon for smoke testing. Cached
# to gha with a per-arch scope; the push step below reuses every
# layer from this build.
- name: Build image (arm64, smoke test)
# Build once, load into the local daemon for smoke testing. PR arm64
# builds deliberately avoid the gha cache: cold-cache arm64 builds can
# outlive GitHub's short-lived Azure cache SAS token, then fail while
# reading or writing cache blobs before the smoke test can run.
- name: Build image (arm64, smoke test, uncached PR)
if: github.event_name == 'pull_request'
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
with:
context: .
file: Dockerfile
load: true
platforms: linux/arm64
tags: ${{ env.IMAGE_NAME }}:test
build-args: |
HERMES_GIT_SHA=${{ github.sha }}
# Main/release builds still use the per-arch gha cache so the digest
# push below can reuse layers from this smoke-test build.
- name: Build image (arm64, smoke test, cached publish)
if: github.event_name != 'pull_request'
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
with:
context: .
-19
@@ -200,22 +200,3 @@ jobs:
- name: Run footgun checker
run: python scripts/check-windows-footguns.py --all
plugin-isolation:
# Enforce that core code and core tests never import from plugin packages.
# Core must interact with plugins exclusively through the registry layer.
# See scripts/check_no_plugin_imports_in_core.py for the rule list.
name: Plugin isolation (blocking)
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
- name: Checkout code
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Set up Python
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v5
with:
python-version: "3.11"
- name: Run plugin isolation checker
run: python scripts/check_no_plugin_imports_in_core.py
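
A checker of this kind can be approximated with the standard `ast` module. The sketch below is an assumption about the general approach, not the contents of `scripts/check_no_plugin_imports_in_core.py`; the function name is hypothetical, and the `hermes_agent_` prefix is the package naming used elsewhere in this diff.

```python
import ast

PLUGIN_PREFIX = "hermes_agent_"  # plugin packages are named hermes_agent_<name>


def find_plugin_imports(source: str) -> list:
    """Return sorted (line, module) pairs for imports of plugin packages."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]  # absolute 'from X import Y' only
        else:
            continue
        for name in names:
            # Compare the top-level package so submodule imports are caught too.
            if name.split(".")[0].startswith(PLUGIN_PREFIX):
                violations.append((node.lineno, name))
    return sorted(violations)


core_module = (
    "from hermes_agent_bedrock import has_aws_credentials\n"
    "from agent.plugin_registries import registries\n"
)
print(find_plugin_imports(core_module))
```

A static pass like this would not see dynamic imports via `importlib`, so a real checker may need additional rules.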
+3 -2
@@ -48,7 +48,7 @@ agent-browser/
privvy*
images/
__pycache__/
*.egg-info
hermes_agent.egg-info/
wandb/
testlogs
@@ -87,7 +87,8 @@ website/static/api/skills-meta.json
models-dev-upstream/
hermes_cli/tui_dist/*
hermes_cli/scripts/
docs/superpowers/*# Working directory for the Hermes Agent's session state (~/.hermes/ at runtime;
docs/superpowers/*
# Working directory for the Hermes Agent's session state (~/.hermes/ at runtime;
# also created in-repo when an agent operates in this checkout). Plans, audit
# logs, and per-session caches are never artifacts of the codebase.
.hermes/
+57 -144
@@ -29,9 +29,7 @@ hermes-agent/
├── hermes_constants.py # get_hermes_home(), display_hermes_home() — profile-aware paths
├── hermes_logging.py # setup_logging() — agent.log / errors.log / gateway.log (profile-aware)
├── batch_runner.py # Parallel batch processing
├── _build_backend.py # Custom PEP 517 build backend — inlines plugin deps at wheel build time
├── agent/ # Agent internals (provider adapters, memory, caching, compression, etc.)
│ └── plugin_registries.py # Typed capability registries (auth, transport, platform, tool, model_metadata)
├── hermes_cli/ # CLI subcommands, setup wizard, plugins loader, skin engine
├── tools/ # Tool implementations — auto-discovered via tools/registry.py
│ └── environments/ # Terminal backends (local, docker, ssh, modal, daytona, singularity)
@@ -41,20 +39,16 @@ hermes-agent/
│ │ # dingtalk, wecom, weixin, feishu, qqbot, bluebubbles,
│ │ # yuanbao, webhook, api_server, ...). See ADDING_A_PLATFORM.md.
│ └── builtin_hooks/ # Extension point for always-registered gateway hooks (none shipped)
├── plugins/ # Plugin packages — uv workspace members (see "Plugins" section)
│ ├── model-providers/ # anthropic, bedrock, azure-foundry (own pyproject.toml each)
│ ├── platforms/ # telegram, slack, discord, feishu, dingtalk, matrix
│ ├── tts/ # Text-to-speech plugin
│ ├── stt/ # Speech-to-text plugin
│ ├── image_gen/ # FAL image generation
│ ├── terminals/ # daytona, modal, vercel
│ ├── web/ # exa, firecrawl, parallel
├── plugins/ # Plugin system (see "Plugins" section below)
│ ├── memory/ # Memory-provider plugins (honcho, mem0, supermemory, ...)
│ ├── context_engine/ # Context-engine plugins
│ ├── model-providers/ # Inference backend plugins (openrouter, anthropic, gmi, ...)
│ ├── kanban/ # Multi-agent board dispatcher + worker plugin
│ ├── hermes-achievements/ # Gamified achievement tracking
│ ├── observability/ # Metrics / traces / logs plugin
│ └── <others>/ # dashboard, google_meet, spotify, strike-freedom-cockpit, ...
│ ├── image_gen/ # Image-generation providers
│ └── <others>/ # disk-cleanup, example-dashboard, google_meet, platforms,
│ # spotify, strike-freedom-cockpit, ...
├── optional-skills/ # Heavier/niche skills shipped but NOT active by default
├── skills/ # Built-in skills bundled with the repo
├── ui-tui/ # Ink (React) terminal UI — `hermes --tui`
@@ -492,102 +486,9 @@ Activate with `/skin cyberpunk` or `display.skin: cyberpunk` in config.yaml.
## Plugins
Hermes uses a **plugin-first architecture**: every optional capability (model
providers, platform adapters, TTS/STT, terminal backends, image generation)
lives in its own installable Python package under `plugins/`. The core
codebase (`agent/`, `hermes_cli/`, `gateway/`, `tools/`) **never** imports
from a `hermes_agent_*` plugin package directly. Instead, plugins register
their capabilities into typed registries during `register()`, and the core
queries those registries at runtime.
Full architecture doc: `website/docs/developer-guide/plugin-architecture.md`
### Workspace layout
All 21 builtin plugins are uv workspace members — each has its own
`pyproject.toml` (single source of truth for deps), `plugin.yaml`
(directory-scanner manifest for dev mode), and `hermes_agent_<name>/` package
with `register(ctx)`:
```
plugins/
├── model-providers/ # anthropic, bedrock, azure-foundry
├── platforms/ # telegram, slack, discord, feishu, dingtalk, matrix
├── tts/ # text-to-speech (Edge TTS + ElevenLabs)
├── stt/ # speech-to-text
├── image_gen/fal_pkg/ # FAL image generation
├── terminals/ # daytona, modal, vercel
├── web/ # exa, firecrawl, parallel
├── memory/ # honcho, hindsight
├── dashboard/ # streamlit dashboard
└── hermes-achievements/ # gamified achievement tracking
```
### The hermetic core boundary
Core code MUST NOT import from `hermes_agent_*` packages. Instead it queries
typed registries in `agent/plugin_registries.py`:
```python
# ❌ BAD — core directly imports plugin
from hermes_agent_bedrock import has_aws_credentials
# ✅ GOOD — core queries the registry
from agent.plugin_registries import registries
bedrock_auth = registries.get_auth_provider("bedrock")
```
Registry types: `auth_providers`, `transport_builders`, `platform_adapters`,
`tool_providers`, `model_metadata`, `credential_pools`.
Each plugin's `register(ctx)` populates the registries via `ctx.register_*()`:
- `ctx.register_auth_provider(name, provider, ...)`
- `ctx.register_transport(name, builder, ...)`
- `ctx.register_platform(name, label, adapter_factory, check_fn, ...)`
- `ctx.register_tool_provider(entry, ...)`
- `ctx.register_model_metadata(entry, ...)`
- `ctx.register_credential_pool(entry, ...)`
- Plus existing: `register_tool()`, `register_hook()`, `register_cli_command()`,
`register_tts_provider()`, `register_transcription_provider()`,
`register_image_gen_provider()`, `register_video_gen_provider()`,
`register_context_engine()`
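
The register-then-query flow can be sketched with a toy registry. Everything below is a simplified stand-in, not the contents of `agent/plugin_registries.py`: the `Registries` and `PluginContext` classes are hypothetical and only the auth-provider registry is modeled, using the method names that appear above.

```python
from typing import Any, Optional


class Registries:
    """Toy registry holding one capability type (auth providers)."""

    def __init__(self) -> None:
        self._auth_providers: dict = {}

    def add_auth_provider(self, name: str, provider: Any) -> None:
        self._auth_providers[name] = provider

    def get_auth_provider(self, name: str) -> Optional[Any]:
        # Core code asks by name and must cope with a missing plugin.
        return self._auth_providers.get(name)


class PluginContext:
    """Hypothetical ctx object handed to each plugin's register(ctx)."""

    def __init__(self, registries: Registries) -> None:
        self._registries = registries

    def register_auth_provider(self, name: str, provider: Any) -> None:
        self._registries.add_auth_provider(name, provider)


def register(ctx: PluginContext) -> None:
    """What a plugin package might do at load time (illustrative only)."""
    ctx.register_auth_provider("bedrock", {"has_credentials": lambda: False})


registries = Registries()
register(PluginContext(registries))
print(registries.get_auth_provider("bedrock") is not None)
print(registries.get_auth_provider("not-installed"))
```

The point of the indirection is that core code depends only on the registry lookup, so an uninstalled plugin shows up as a missing entry rather than an `ImportError`.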
### Plugin discovery
Three discovery paths (same as before, now workspace-aware):
1. **Directory scanner**: `plugins/`, `~/.hermes/plugins/`, `.hermes/plugins/`
(looks for `plugin.yaml`)
2. **Entry points**: `[project.entry-points."hermes_agent.plugins"]`
3. **uv workspace**: `uv sync --extra <name>` installs the plugin into venv
### Dependency management
- Each plugin's `pyproject.toml` is the **only** place its deps are declared
- Root `pyproject.toml` maps extras to workspace members:
`telegram = ["hermes-agent-telegram"]`
- `uv.lock` resolves the whole workspace (236 packages)
- No `LAZY_DEPS`, no `ensure()`, no runtime `pip install`
- Custom PEP 517 build backend (`_build_backend.py`) inlines plugin deps
at wheel build time for PyPI publishing
### NixOS
`loadWorkspace` discovers all workspace members from `uv.lock` automatically.
`mkVirtualEnv { hermes-agent = ["all"] }` installs all plugins. Select specific
plugins with `extraDependencyGroups = ["telegram", "anthropic"]`.
### Tests
Plugin tests live in `plugins/<category>/<name>/tests/`. The test runner
discovers both `tests/` and `plugins/`. Running plugin tests requires the
plugin to be installed (`uv sync --extra <name>`).
### The rule
**If it can be a plugin, it must be a plugin.** Adding optional capabilities
to core files is a code review rejection. If the plugin surface doesn't
support what you need, extend the surface (new registry type, new hook, new
`ctx` method) — don't inline the capability.
Hermes has two plugin surfaces. Both live under `plugins/` in the repo so
repo-shipped plugins can be discovered alongside user-installed ones in
`~/.hermes/plugins/` and pip-installed entry points.
### General plugins (`hermes_cli/plugins.py` + `plugins/<name>/`)
@@ -630,14 +531,9 @@ providers don't clutter `hermes --help`.
**Rule (Teknium, May 2026):** plugins MUST NOT modify core files
(`run_agent.py`, `cli.py`, `gateway/run.py`, `hermes_cli/main.py`, etc.).
If a plugin needs a capability the framework doesn't expose, expand the
generic plugin surface (new hook, new ctx method, new registry type) — never
hardcode plugin-specific logic into core. PR #5295 removed 95 lines of
hardcoded honcho argparse from `main.py` for exactly this reason.
**Hermetic core boundary (May 2026):** core code (`agent/`, `hermes_cli/`,
`gateway/`, `tools/`) MUST NOT import from `hermes_agent_*` plugin packages.
Use the typed registries in `agent/plugin_registries.py` instead. See the
**Plugins** section below for the full list of registry types.
generic plugin surface (new hook, new ctx method) — never hardcode
plugin-specific logic into core. PR #5295 removed 95 lines of hardcoded
honcho argparse from `main.py` for exactly this reason.
**No new in-tree memory providers (policy, May 2026):** the set of
built-in memory providers under `plugins/memory/` is closed. New memory
@@ -1115,41 +1011,40 @@ def profile_env(tmp_path, monkeypatch):
## Testing
**ALWAYS use `scripts/run_tests.sh`** — do NOT call `pytest` directly on a directory.
The script enforces hermetic environment parity with CI and provides per-file
process isolation that prevents registry singleton / module-level state leakage
between test files.
**ALWAYS use `scripts/run_tests.sh`** — do not call `pytest` directly. The script enforces
hermetic environment parity with CI (unset credential vars, TZ=UTC, LANG=C.UTF-8,
`-n auto` xdist workers, in-tree subprocess-isolation plugin). Direct `pytest`
on a 16+ core developer machine with API keys set diverges from CI in ways
that have caused multiple "works locally, fails in CI" incidents (and the reverse).
```bash
scripts/run_tests.sh # full suite, CI-parity
scripts/run_tests.sh tests/gateway/ # one directory
scripts/run_tests.sh tests/agent/test_foo.py # one file
scripts/run_tests.sh tests/agent/test_foo.py::test_x # one test
scripts/run_tests.sh -v --tb=long # pass-through pytest flags
scripts/run_tests.sh --no-isolate tests/foo/ # disable subprocess isolation (faster, for debugging)
```
For a **single test file or specific test**, bare `pytest` is fine:
### Subprocess-per-test isolation
```bash
nix run nixpkgs#uv -- run python -m pytest tests/agent/test_foo.py -q
nix run nixpkgs#uv -- run python -m pytest tests/agent/test_foo.py::test_x --tb=short
```
Every test runs in a freshly-spawned Python subprocess via the in-tree plugin
at `tests/_isolate_plugin.py`. This means module-level dicts/sets and
ContextVars from one test cannot leak into the next — the historic
`_reset_module_state` autouse fixture is gone.
Running bare `pytest` on a directory (e.g. `pytest tests/`) will print a warning
from `conftest.py` telling you to use the script instead.
Implementation notes:
### Per-file process isolation
`scripts/run_tests.sh` calls `scripts/run_tests_parallel.py`, which spawns one
`python -m pytest <file>` subprocess per test **file** (not per test), giving each
a fresh Python interpreter. This means module-level dicts/sets, ContextVars, and
registry singletons from one test file cannot leak into another — no shared state
between files, no xdist required.
`HERMES_PARALLEL_RUNNER=1` is set in each subprocess so `conftest.py` knows tests
are running under the managed runner. If you need to suppress the bare-pytest
directory warning in a special case, set this variable yourself — but prefer the
script.
- The plugin uses `multiprocessing.get_context("spawn")`, which works on
Linux, macOS, and Windows alike (POSIX `fork` is not used).
- Per-test overhead is ~0.5-1.0s (Python startup + pytest collection). xdist
parallelism amortizes this across cores; on a 20-core box the full suite
finishes in roughly the same wall time as before, but flake-free.
- `isolate_timeout` (configured in `pyproject.toml`) caps each test at 30s.
Hangs are killed and surfaced as a failure report.
- Pass `--no-isolate` to disable isolation — useful when debugging a single
test interactively, or when you specifically want to verify state leakage.
- The plugin disables itself in child processes (sentinel envvar
`HERMES_ISOLATE_CHILD=1`), so there's no fork-bomb risk.
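
For the per-file variant (one `python -m pytest <file>` subprocess per test file, with `HERMES_PARALLEL_RUNNER=1` set in each child), a minimal sketch might look like the following. It is not the code of `scripts/run_tests_parallel.py`; the function names are hypothetical, and for brevity the jobs run sequentially rather than in parallel.

```python
import os
import subprocess
import sys


def build_jobs(test_files, base_env=None):
    """Return one (argv, env) pair per test file."""
    source_env = os.environ if base_env is None else base_env
    jobs = []
    for test_file in test_files:
        env = dict(source_env)  # each child gets its own copy
        env["HERMES_PARALLEL_RUNNER"] = "1"  # tells conftest.py the runner is in charge
        jobs.append(([sys.executable, "-m", "pytest", str(test_file)], env))
    return jobs


def run_jobs(jobs):
    """Run each job in a fresh interpreter; return the worst exit code."""
    worst = 0
    for argv, env in jobs:
        worst = max(worst, subprocess.run(argv, env=env).returncode)
    return worst
```

Because every file gets a new interpreter, module-level state cannot carry over between files; the cost is one Python startup and one pytest collection per file.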
### Why the wrapper (and why the old "just call pytest" doesn't work)
@@ -1161,13 +1056,31 @@ Five real sources of local-vs-CI drift the script closes:
| HOME / `~/.hermes/` | Your real config+auth.json | Temp dir per test |
| Timezone | Local TZ (PDT etc.) | UTC |
| Locale | Whatever is set | C.UTF-8 |
| File isolation | Shared interpreter — state leaks between files | One subprocess per file |
| xdist workers | `-n auto` = all cores | `-n auto` (safe — subprocess isolation prevents cross-worker flakes) |
`tests/conftest.py` also enforces the credential/TZ/locale points as an autouse
fixture so ANY pytest invocation (including IDE integrations) gets hermetic
behavior — but the wrapper adds per-file process isolation on top.
`tests/conftest.py` also enforces points 1-4 as an autouse fixture so ANY pytest
invocation (including IDE integrations) gets hermetic behavior — but the wrapper
is belt-and-suspenders.
Always run the full suite via `scripts/run_tests.sh` before pushing changes.
### Running without the wrapper (only if you must)
If you can't use the wrapper (e.g. inside an IDE that shells pytest directly),
at minimum activate the venv. The isolation plugin loads automatically from
`addopts` in `pyproject.toml`, so you get the same per-test process isolation
either way.
```bash
source .venv/bin/activate # or: source venv/bin/activate
python -m pytest tests/ -q
```
If you need to bypass isolation for fast feedback while debugging:
```bash
python -m pytest tests/agent/test_foo.py -q --no-isolate
```
Always run the full suite before pushing changes.
### Don't write change-detector tests
+5 -4
@@ -121,11 +121,12 @@ hermes chat -q "Hello"
### Run tests
```bash
# Preferred — matches CI (hermetic env, per-file process isolation); see AGENTS.md
# Preferred — matches CI (hermetic env, 4 xdist workers); see AGENTS.md
scripts/run_tests.sh
# For a single file or specific test, bare pytest is also fine:
# python -m pytest tests/agent/test_foo.py -q
# Alternative (activate the venv first). The wrapper is still recommended
# for parity with GitHub Actions before you open a PR:
pytest tests/ -v
```
---
@@ -856,7 +857,7 @@ refactor/description # Code restructuring
### Before submitting
1. **Run tests**: `scripts/run_tests.sh` (recommended; same as CI — hermetic env + per-file process isolation)
1. **Run tests**: `scripts/run_tests.sh` (recommended; same as CI) or `pytest tests/ -v` with the project venv activated
2. **Test manually**: Run `hermes` and exercise the code path you changed
3. **Check cross-platform impact**: If you touch file I/O, process management, or terminal handling, consider macOS, Linux, and WSL2
4. **Keep PRs focused**: One logical change per PR. Don't mix a bug fix with a refactor with a new feature.
+1 -1
@@ -179,7 +179,7 @@ curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv venv --python 3.11
source venv/bin/activate
uv pip install -e ".[all,dev]"
scripts/run_tests.sh
python -m pytest tests/ -q
```
---
+110
@@ -0,0 +1,110 @@
# Hermes Agent v0.15.1 (v2026.5.29)
**Release Date:** May 29, 2026
**Since v0.15.0:** 28 commits · 21 merged PRs · hotfix release · 9 contributors
> **The Patch Release.** A same-day hotfix for v0.15.0. Headline fix: the dashboard infinite-reload loop that hit anyone running v0.15.0 in loopback mode (Docker, hosted Hermes, fresh installs). A handful of other v0.15.0 follow-ups go along for the ride — kanban worker SIGTERM, `/model` picker unification, `/yolo` session bypass, the full 19,932-entry skills.sh catalog, `.md` media delivery restoration, gateway probe-stepdown safety, web-URL redaction passthrough, kanban worker vision on referenced images, hindsight observation-default. Docker users get an explicit `--insecure` opt-in env var (no more bind-host inference), MCP server bare-command PATH resolution, and arm64 PR-build cache fixes.
---
## ✨ Highlights
- **Dashboard 401 reload loop fixed** — In loopback mode the dashboard's identity probe (`/api/auth/me`) returns 401 by design, but v0.15.0's stale-token reload guard treated every 401 as a rotated session token and full-page-reloaded to pick up a fresh one. Every successful sibling call cleared the one-shot reload guard, so the page reload-looped forever (Firefox: "Navigated to /sessions" storm; Chrome: React re-render storm). Fix adds an `allowUnauthorized` opt-out to `fetchJSON` that skips only the loopback stale-token reload — 401 still throws so `AuthWidget` swallows it, gated-mode `login_url` redirects are unaffected. Closes [#34206](https://github.com/NousResearch/hermes-agent/issues/34206), [#34202](https://github.com/NousResearch/hermes-agent/issues/34202). ([#30698](https://github.com/NousResearch/hermes-agent/pull/30698) — @austinpickett)
- **Docker dashboard `--insecure` is now an explicit env opt-in, never derived from bind host** — Previously the Docker entrypoint inferred `--insecure` when the dashboard bound to a non-loopback host. That conflated "I want LAN access" with "I want to disable the same-origin guard." The fix splits them: bind host is bind host, and disabling the dashboard's loopback auth requires an explicit `HERMES_DASHBOARD_INSECURE=1`. Existing setups that genuinely wanted insecure binding must now set the env var. ([#34188](https://github.com/NousResearch/hermes-agent/pull/34188), [#34204](https://github.com/NousResearch/hermes-agent/pull/34204) — @benbarclay)
- **MCP bare command resolution under Docker** — MCP servers configured with bare commands (`npx`, `npm`, `node`) now resolve against `/usr/local/bin` so they actually launch inside the Docker image where those binaries live. v0.15.0 left these failing silently in containers when the agent's effective PATH didn't include the Node toolchain location. ([#34186](https://github.com/NousResearch/hermes-agent/pull/34186) — @benbarclay)
- **Skills page sidebar / source pills restored** — A stale `useMemo` dependency in the new dashboard skills page collapsed the source pills and category sidebar to "All" only. Fixed; both surfaces now reflect the live catalog state. ([#34194](https://github.com/NousResearch/hermes-agent/pull/34194))
- **Kanban worker can be killed again** — `SIGTERM` on a kanban worker was being absorbed by an intermediate process and the worker stayed running. Closes [#28181](https://github.com/NousResearch/hermes-agent/issues/28181). ([#34045](https://github.com/NousResearch/hermes-agent/pull/34045))
- **Full skills.sh catalog (858 → 19,932 entries)** — The skills hub page was pulling a partial paginated catalog. The fetch now walks the sitemap, so all 19,932 skills.sh entries surface in the picker instead of just the first 858. ([#34025](https://github.com/NousResearch/hermes-agent/pull/34025))
---
## 🐛 Bug Fixes
### Dashboard / Web
- **`/api/auth/me` 401 no longer triggers reload loop** in loopback mode — ([#30698](https://github.com/NousResearch/hermes-agent/pull/30698) — @austinpickett)
- **Skills page source pills + category sidebar restored** — stale `useMemo` dep ([#34194](https://github.com/NousResearch/hermes-agent/pull/34194))
### Docker
- **`--insecure` is now explicit opt-in via env var**, not derived from bind host ([#34188](https://github.com/NousResearch/hermes-agent/pull/34188) — @benbarclay)
- **Dashboard test suite repaired** to match the insecure-opt-in fix ([#34204](https://github.com/NousResearch/hermes-agent/pull/34204) — @benbarclay)
- **arm64 PR builds skip the GHA cache** to avoid cache-thrash on cross-arch builders ([#33704](https://github.com/NousResearch/hermes-agent/pull/33704) — @BROCCOLO1D)
### MCP
- **Bare `npx`/`npm`/`node` resolve against `/usr/local/bin`** for Docker compatibility ([#34186](https://github.com/NousResearch/hermes-agent/pull/34186) — @benbarclay)
### Kanban
- **Worker SIGTERM actually terminates the process** ([#34045](https://github.com/NousResearch/hermes-agent/pull/34045))
- **Workers receive images referenced in task bodies** for vision-capable models ([#34210](https://github.com/NousResearch/hermes-agent/pull/34210))
### Gateway
- **`.md` files deliver again** — media-delivery validation defaults to denylist-only instead of an overly-narrow allowlist ([#34022](https://github.com/NousResearch/hermes-agent/pull/34022))
- **Probe stepdown safety** — on a context-overflow without an explicit provider context limit, the agent no longer steps down to a smaller model based on an unknown ceiling (salvage of [#33673](https://github.com/NousResearch/hermes-agent/pull/33673)) ([#33826](https://github.com/NousResearch/hermes-agent/pull/33826))
### CLI
- **`/yolo` mid-session enables the per-session bypass** instead of just toggling the env var (which the running agent had already snapshotted) ([#33931](https://github.com/NousResearch/hermes-agent/pull/33931) — @kshitijk4poor)
- **`/model` and `hermes model` show the same list**, plus disk cache for picker startup ([#33867](https://github.com/NousResearch/hermes-agent/pull/33867))
### Skills
- **Full skills.sh catalog via sitemap** — 858 → 19,932 entries ([#34025](https://github.com/NousResearch/hermes-agent/pull/34025))
### Redaction
- **Web URLs pass through unchanged** — the redactor was eating query parameters that looked credential-shaped ([#34029](https://github.com/NousResearch/hermes-agent/pull/34029))
---
## ✨ Small Features
- **Hindsight default narrowed to observation-only** for `recall_types` — tool path is also narrowed ([#34079](https://github.com/NousResearch/hermes-agent/pull/34079) — @nicoloboschi, follow-up [#34091](https://github.com/NousResearch/hermes-agent/pull/4df62d239e38bf8c212a595721c9c01e176f6c3a) — @kshitijk4poor)
- **Memory providers receive completed-turn message context** — salvage of [#28065](https://github.com/NousResearch/hermes-agent/pull/28065) ([#34097](https://github.com/NousResearch/hermes-agent/pull/34097) — @kshitijk4poor, credit to @devwdave)
---
## 📚 Documentation
- **`--no-supervise` / `HERMES_GATEWAY_NO_SUPERVISE` documented** in the reference docs (follow-up to [#33583](https://github.com/NousResearch/hermes-agent/pull/33583)) ([#33751](https://github.com/NousResearch/hermes-agent/pull/33751) — @r266-tech)
---
## 🛠️ Infrastructure
- **Vercel deploy workflow accepts `workflow_dispatch`** so docs deploys can be manually triggered ([#34081](https://github.com/NousResearch/hermes-agent/pull/34081))
- **`@nous-research/ui` bumped to 0.18.2** (Nix `npmDepsHash` also updated to match) ([#34193](https://github.com/NousResearch/hermes-agent/pull/34193) follow-ups — @austinpickett)
---
## 👥 Contributors
### Core
- @teknium1
### Community
- @austinpickett — dashboard 401 reload-loop fix (the headline), `@nous-research/ui` bump, Nix `npmDepsHash` updates
- @benbarclay — Docker `--insecure` opt-in, MCP bare-command resolution, dashboard test repair
- @kshitijk4poor: `/yolo` session bypass, completed-turn memory context salvage, hindsight follow-up docs
- @nicoloboschi — hindsight `recall_types` observation default
- @BROCCOLO1D — arm64 PR build cache fix
- @r266-tech — `--no-supervise` reference docs
- @yangguangjin — probe stepdown safety (salvage of @yanghd's #33673)
- @devwdave — completed-turn memory context (credited via salvage)
- @andrewhosf — co-author
### Issue Reporters (the 401 loop)
- @routesmith ([#34206](https://github.com/NousResearch/hermes-agent/issues/34206))
- @beeaton ([#34202](https://github.com/NousResearch/hermes-agent/issues/34202))
---
**Full Changelog**: [v2026.5.28...v2026.5.29](https://github.com/NousResearch/hermes-agent/compare/v2026.5.28...v2026.5.29)
-183
@@ -1,183 +0,0 @@
"""Custom PEP 517 build backend for hermes-agent.

At wheel build time, rewrites [project.optional-dependencies] so that
plugin extras (e.g. ``anthropic = ["hermes-agent-anthropic"]``) are
inlined with the actual deps from each plugin's pyproject.toml.

In the source repo (and on Nix), uv resolves workspace members natively
so this backend is NOT used — it's only invoked when building a wheel
for PyPI publication.

Usage in pyproject.toml::

    [build-system]
    requires = ["setuptools>=61.0"]
    build-backend = "_build_backend"
    backend-path = ["."]

How it works:

1. ``build_wheel`` intercepts the call before setuptools sees pyproject.toml.
2. It reads the workspace member dirs from [tool.uv.workspace].members.
3. For each member, it reads the member's pyproject.toml and extracts
   ``project.dependencies`` (excluding the ``hermes-agent`` base dep).
4. It rewrites the main pyproject.toml's optional-dependencies to inline
   those deps instead of the workspace member references.
5. It writes a temporary pyproject.toml, delegates to
   ``setuptools.build_meta.build_wheel``, then restores the original.
"""

from __future__ import annotations

import os
import shutil
import tempfile
from pathlib import Path
from typing import Any

import tomllib

# The original setuptools backend we delegate to.
_BACKEND = "setuptools.build_meta"


def _load_pyproject(path: Path) -> dict:
    with path.open("rb") as f:
        return tomllib.load(f)


def _save_pyproject(path: Path, data: dict) -> None:
    """Write a pyproject.toml. Uses a simple serializer since we only
    need to preserve the structure enough for setuptools to parse."""
    import tomli_w

    with path.open("wb") as f:
        tomli_w.dump(data, f)
def _inline_plugin_deps(root: Path, data: dict) -> dict:
    """Rewrite optional-dependencies to inline plugin deps.

    Maps each plugin extra (e.g. ``anthropic = ["hermes-agent-anthropic"]``)
    to the actual deps from that plugin's pyproject.toml, minus the
    ``hermes-agent`` base dependency.
    """
    opt_deps = data.get("project", {}).get("optional-dependencies", {})
    workspace = data.get("tool", {}).get("uv", {}).get("workspace", {})
    members = workspace.get("members", [])

    # Build a map: package name → (member_dir, pyproject_data)
    pkg_to_deps: dict[str, list[str]] = {}
    for member_glob in members:
        for member_dir in sorted(root.glob(member_glob)):
            pptoml = member_dir / "pyproject.toml"
            if not pptoml.exists():
                continue
            member_data = _load_pyproject(pptoml)
            pkg_name = member_data.get("project", {}).get("name", "")
            if not pkg_name:
                continue
            # Extract deps, excluding the base hermes-agent dependency
            raw_deps = member_data.get("project", {}).get("dependencies", [])
            filtered = [
                d for d in raw_deps
                if not d.replace(" ", "").startswith("hermes-agent")
            ]
            pkg_to_deps[pkg_name] = filtered

    # Rewrite optional-dependencies
    new_opt_deps = {}
    for extra_name, specs in opt_deps.items():
        new_specs = []
        for spec in specs:
            # Check if this spec references a workspace member package
            if spec in pkg_to_deps:
                # Inline the plugin's deps
                new_specs.extend(pkg_to_deps[spec])
            else:
                new_specs.append(spec)
        new_opt_deps[extra_name] = new_specs

    data["project"]["optional-dependencies"] = new_opt_deps

    # Remove [tool.uv] section — it's not valid in a published wheel
    if "uv" in data.get("tool", {}):
        del data["tool"]["uv"]

    return data
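The rewrite performed by `_inline_plugin_deps` can be illustrated on plain dictionaries, without touching the filesystem. This is a minimal sketch, not the removed backend itself: the function name `inline_extras`, the sample extras, and the version pins are hypothetical and chosen only for illustration.

```python
def inline_extras(optional_deps, member_deps, base="hermes-agent"):
    """Replace workspace-member references in extras with the members' own deps."""
    inlined = {}
    for extra, specs in optional_deps.items():
        expanded = []
        for spec in specs:
            if spec in member_deps:
                # Inline the member's dependencies, dropping the base package.
                expanded.extend(
                    d for d in member_deps[spec]
                    if not d.replace(" ", "").startswith(base)
                )
            else:
                # Ordinary requirement: keep it untouched.
                expanded.append(spec)
        inlined[extra] = expanded
    return inlined


# Hypothetical inputs: one extra pointing at a workspace member, one regular extra.
extras = {
    "anthropic": ["hermes-agent-anthropic"],
    "dev": ["pytest>=8"],
}
members = {
    "hermes-agent-anthropic": ["hermes-agent", "anthropic>=0.39"],
}
result = inline_extras(extras, members)
print(result)
```

One property worth noticing in this kind of prefix filter: any dependency whose name merely begins with `hermes-agent` is dropped as well, not only the base package.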
# ---------------------------------------------------------------------------
# PEP 517 hooks
# ---------------------------------------------------------------------------

def build_wheel(wheel_directory: str, config_settings: dict[str, Any] | None = None, metadata_directory: str | None = None) -> str:
    """Build a wheel with inlined plugin deps."""
    root = Path.cwd()
    pyproject_path = root / "pyproject.toml"

    # Read and rewrite
    data = _load_pyproject(pyproject_path)
    data = _inline_plugin_deps(root, data)

    # Write a temporary pyproject.toml, build, then restore
    backup = pyproject_path.with_suffix(".toml.bak")
    shutil.copy2(pyproject_path, backup)
    try:
        _save_pyproject(pyproject_path, data)
        # Delegate to setuptools
        import importlib
        backend = importlib.import_module(_BACKEND)
        return backend.build_wheel(wheel_directory, config_settings)
    finally:
        shutil.copy2(backup, pyproject_path)
        backup.unlink()
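The backup, rewrite, build, restore sequence in `build_wheel` is a general pattern that could also be expressed as a context manager. The sketch below is an assumption-laden illustration rather than the project's code: `temporarily_rewritten` is a hypothetical helper, and a raised exception stands in for a failing build.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporarily_rewritten(path: Path, new_text: str):
    """Swap in new_text for the duration of the block, then restore the file."""
    backup = path.with_suffix(path.suffix + ".bak")
    shutil.copy2(path, backup)
    try:
        path.write_text(new_text, encoding="utf-8")
        yield path
    finally:
        # Runs on success and on failure, so the original always comes back.
        shutil.copy2(backup, path)
        backup.unlink()


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "pyproject.toml"
    target.write_text("original\n", encoding="utf-8")
    try:
        with temporarily_rewritten(target, "rewritten\n"):
            seen_inside = target.read_text(encoding="utf-8")
            raise RuntimeError("simulated build failure")
    except RuntimeError:
        pass
    restored = target.read_text(encoding="utf-8")
    backup_left = (Path(tmp) / "pyproject.toml.bak").exists()
```

Putting the restore in `finally` is what keeps the working tree clean when the delegated build raises.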
def build_sdist(sdist_directory: str, config_settings: dict[str, Any] | None = None) -> str:
    """Build an sdist — no rewriting needed."""
    import importlib
    backend = importlib.import_module(_BACKEND)
    return backend.build_sdist(sdist_directory, config_settings)


def get_requires_for_build_wheel(config_settings: dict[str, Any] | None = None) -> list[str]:
    return ["setuptools>=61.0", "tomli_w"]


def get_requires_for_build_sdist(config_settings: dict[str, Any] | None = None) -> list[str]:
    return ["setuptools>=61.0"]


def prepare_metadata_for_build_wheel(metadata_directory: str, config_settings: dict[str, Any] | None = None) -> str:
    """Prepare metadata with inlined plugin deps."""
    root = Path.cwd()
    pyproject_path = root / "pyproject.toml"

    data = _load_pyproject(pyproject_path)
    data = _inline_plugin_deps(root, data)

    backup = pyproject_path.with_suffix(".toml.bak")
    shutil.copy2(pyproject_path, backup)
    try:
        _save_pyproject(pyproject_path, data)
        import importlib
        backend = importlib.import_module(_BACKEND)
        return backend.prepare_metadata_for_build_wheel(metadata_directory, config_settings)
    finally:
        shutil.copy2(backup, pyproject_path)
        backup.unlink()


def build_editable(wheel_directory: str, config_settings: dict[str, Any] | None = None, metadata_directory: str | None = None) -> str:
    """Build an editable install — no rewriting needed (dev mode)."""
    import importlib
    backend = importlib.import_module(_BACKEND)
    kwargs: dict[str, Any] = {"config_settings": config_settings}
    if metadata_directory is not None:
        kwargs["metadata_directory"] = metadata_directory
    return backend.build_editable(wheel_directory, **kwargs)


def get_requires_for_build_editable(config_settings: dict[str, Any] | None = None) -> list[str]:
    return ["setuptools>=61.0"]
+2 -2
@@ -1,7 +1,7 @@
{
"id": "hermes-agent",
"name": "Hermes Agent",
-"version": "0.15.0",
+"version": "0.15.1",
"description": "Self-improving open-source AI agent by Nous Research with ACP editor integration, persistent memory, skills, and rich tool support.",
"repository": "https://github.com/NousResearch/hermes-agent",
"website": "https://hermes-agent.nousresearch.com/docs/user-guide/features/acp",
@@ -9,7 +9,7 @@
"license": "MIT",
"distribution": {
"uvx": {
-"package": "hermes-agent[acp]==0.15.0",
+"package": "hermes-agent[acp]==0.15.1",
"args": ["hermes-acp"]
}
}
+2 -4
@@ -6,9 +6,7 @@ from typing import Any, Optional
import httpx
-from agent.plugin_registries import registries
-_is_oauth_token = registries.get_provider_service("anthropic", "_is_oauth_token")
-resolve_anthropic_token = registries.get_provider_service("anthropic", "resolve_anthropic_token")
+from agent.anthropic_adapter import _is_oauth_token, resolve_anthropic_token
from hermes_cli.auth import _read_codex_tokens, resolve_codex_runtime_credentials
from hermes_cli.runtime_provider import resolve_runtime_provider
@@ -178,7 +176,7 @@ def _fetch_anthropic_account_usage() -> Optional[AccountUsageSnapshot]:
token = (resolve_anthropic_token() or "").strip()
if not token:
return None
-if _is_oauth_token is not None and not _is_oauth_token(token):
+if not _is_oauth_token(token):
return AccountUsageSnapshot(
provider="anthropic",
source="oauth_usage_api",
+28 -32
@@ -404,7 +404,7 @@ def init_agent(
agent.status_callback = status_callback
agent.tool_gen_callback = tool_gen_callback
# Tool execution state — allows _vprint during tool execution
# even when stream consumers are registered (no tokens streaming then)
agent._executing_tools = False
@@ -437,12 +437,12 @@ def init_agent(
# their tids explicitly.
agent._tool_worker_threads: set[int] = set()
agent._tool_worker_threads_lock = threading.Lock()
# Subagent delegation state
agent._delegate_depth = 0 # 0 = top-level agent, incremented for children
agent._active_children = [] # Running child AIAgents (for interrupt propagation)
agent._active_children_lock = threading.Lock()
# Store OpenRouter provider preferences
agent.providers_allowed = providers_allowed
agent.providers_ignored = providers_ignored
@@ -455,7 +455,7 @@ def init_agent(
# Store toolset filtering options
agent.enabled_toolsets = enabled_toolsets
agent.disabled_toolsets = disabled_toolsets
# Model response configuration
agent.max_tokens = max_tokens # None = use model default
agent.reasoning_config = reasoning_config # None = use default (medium for OpenRouter)
@@ -463,7 +463,7 @@ def init_agent(
agent.request_overrides = dict(request_overrides or {})
agent.prefill_messages = prefill_messages or [] # Prefilled conversation turns
agent._force_ascii_payload = False
# Anthropic prompt caching: auto-enabled for Claude models on native
# Anthropic, OpenRouter, and third-party gateways that speak the
# Anthropic protocol (``api_mode == 'anthropic_messages'``). Reduces
@@ -535,7 +535,7 @@ def init_agent(
# console. Any future noise reduction belongs at the
# handler level inside hermes_logging.py, not here.
pass
# Internal stream callback (set during streaming TTS).
# Initialized here so _vprint can reference it before run_conversation.
agent._stream_callback = None
@@ -585,14 +585,12 @@ def init_agent(
_provider_timeout = get_provider_request_timeout(agent.provider, agent.model)
if agent.api_mode == "anthropic_messages":
-from agent.plugin_registries import registries
-build_anthropic_client = registries.get_provider_service("anthropic", "build_anthropic_client")
-resolve_anthropic_token = registries.get_provider_service("anthropic", "resolve_anthropic_token")
+from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token
# Bedrock + Claude → use AnthropicBedrock SDK for full feature parity
# (prompt caching, thinking budgets, adaptive thinking).
_is_bedrock_anthropic = agent.provider == "bedrock"
if _is_bedrock_anthropic:
-build_anthropic_bedrock_client = registries.get_provider_service("anthropic", "build_anthropic_bedrock_client")
+from agent.anthropic_adapter import build_anthropic_bedrock_client
_region_match = re.search(r"bedrock-runtime\.([a-z0-9-]+)\.", base_url or "")
_br_region = _region_match.group(1) if _region_match else "us-east-1"
agent._bedrock_region = _br_region
@@ -646,8 +644,8 @@ def init_agent(
# so injects Claude-Code identity headers and system prompts
# that cause 401/403 on their endpoints. Guards #1739 and
# the third-party identity-injection bug.
-_is_oauth_token = registries.get_provider_service("anthropic", "_is_oauth_token")
-agent._is_anthropic_oauth = _is_oauth_token(effective_key) if (_is_oauth_token is not None and _is_native_anthropic and isinstance(effective_key, str)) else False
+from agent.anthropic_adapter import _is_oauth_token as _is_oat
+agent._is_anthropic_oauth = _is_oat(effective_key) if (_is_native_anthropic and isinstance(effective_key, str)) else False
agent._anthropic_client = build_anthropic_client(effective_key, base_url, timeout=_provider_timeout)
# No OpenAI client needed for Anthropic mode
agent.client = None
@@ -659,10 +657,9 @@ def init_agent(
# The Anthropic adapter installs an httpx event hook
# that mints a fresh JWT per request — we never
# invoke or inspect the callable in the banner.
-from agent.plugin_registries import registries
-is_token_provider = registries.get_provider_service("azure", "is_token_provider")
+from agent.azure_identity_adapter import is_token_provider
-if is_token_provider and is_token_provider(effective_key):
+if is_token_provider(effective_key):
print("🔑 Using credentials: Microsoft Entra ID")
elif isinstance(effective_key, str) and len(effective_key) > 12:
print(f"🔑 Using token: {effective_key[:8]}...{effective_key[-4:]}")
@@ -872,11 +869,10 @@ def init_agent(
# provider (Azure Foundry). The OpenAI SDK mints a
# fresh JWT per request internally — the banner
# never invokes or inspects the callable.
-from agent.plugin_registries import registries
-is_token_provider = registries.get_provider_service("azure", "is_token_provider")
+from agent.azure_identity_adapter import is_token_provider
key_used = client_kwargs.get("api_key", "none")
-if is_token_provider and is_token_provider(key_used):
+if is_token_provider(key_used):
print("🔑 Using credentials: Microsoft Entra ID")
elif isinstance(key_used, str) and key_used and key_used != "dummy-key" and len(key_used) > 12:
print(f"🔑 Using API key: {key_used[:8]}...{key_used[-4:]}")
@@ -884,7 +880,7 @@ def init_agent(
print("⚠️ Warning: API key appears invalid or missing")
except Exception as e:
raise RuntimeError(f"Failed to initialize OpenAI client: {e}")
# Provider fallback chain — ordered list of backup providers tried
# when the primary is exhausted (rate-limit, overload, connection
# failure). Supports both legacy single-dict ``fallback_model`` and
@@ -916,7 +912,7 @@ def init_agent(
disabled_toolsets=disabled_toolsets,
quiet_mode=agent.quiet_mode,
)
# Show tool configuration and store valid tool names for validation
agent.valid_tool_names = set()
if agent.tools:
@@ -949,16 +945,16 @@ def init_agent(
missing_reqs = [name for name, available in requirements.items() if not available]
if missing_reqs:
print(f"⚠️ Some tools may not work due to missing requirements: {missing_reqs}")
# Show trajectory saving status
if agent.save_trajectories and not agent.quiet_mode:
print("📝 Trajectory saving enabled")
# Show ephemeral system prompt status
if agent.ephemeral_system_prompt and not agent.quiet_mode:
prompt_preview = agent.ephemeral_system_prompt[:60] + "..." if len(agent.ephemeral_system_prompt) > 60 else agent.ephemeral_system_prompt
print(f"🔒 Ephemeral system prompt: '{prompt_preview}' (not saved to trajectories)")
# Show prompt caching status
if agent._use_prompt_caching and not agent.quiet_mode:
if agent._use_native_cache_layout and agent.provider == "anthropic":
@@ -968,7 +964,7 @@ def init_agent(
else:
source = "Claude via OpenRouter"
print(f"💾 Prompt caching: ENABLED ({source}, {agent._cache_ttl} TTL)")
# Session logging setup - auto-save conversation trajectories for debugging
agent.session_start = datetime.now()
if session_id:
@@ -1008,7 +1004,7 @@ def init_agent(
pass
# logs_dir is retained unconditionally for request_dump_*.json (debug
# breadcrumb path written by agent_runtime_helpers.dump_api_request_debug).
# Track conversation messages for session logging
agent._session_messages: List[Dict[str, Any]] = []
# Responses encrypted reasoning replay state. Some OpenAI-compatible
@@ -1020,10 +1016,10 @@ def init_agent(
agent._codex_reasoning_replay_enabled = True
agent._memory_write_origin = "assistant_tool"
agent._memory_write_context = "foreground"
# Cached system prompt -- built once per session, only rebuilt on compression
agent._cached_system_prompt: Optional[str] = None
# Filesystem checkpoint manager (transparent — not a tool)
from tools.checkpoint_manager import CheckpointManager
agent._checkpoint_mgr = CheckpointManager(
@@ -1032,7 +1028,7 @@ def init_agent(
max_total_size_mb=checkpoint_max_total_size_mb,
max_file_size_mb=checkpoint_max_file_size_mb,
)
# SQLite session store (optional -- provided by CLI or gateway)
agent._session_db = session_db
agent._parent_session_id = parent_session_id
@@ -1043,11 +1039,11 @@ def init_agent(
"reasoning_config": reasoning_config,
"max_tokens": max_tokens,
}
# In-memory todo list for task planning (one per agent/session)
from tools.todo_tool import TodoStore
agent._todo_store = TodoStore()
# Load config once for memory, skills, and compression sections
try:
from hermes_cli.config import load_config as _load_agent_config
@@ -1089,7 +1085,7 @@ def init_agent(
agent._memory_store.load_from_disk()
except Exception:
pass # Memory is optional -- don't break agent init
# Memory provider plugin (external — one at a time, alongside built-in)
@@ -1549,7 +1545,7 @@ def init_agent(
agent.session_estimated_cost_usd = 0.0
agent.session_cost_status = "unknown"
agent.session_cost_source = "none"
# ── Ollama num_ctx injection ──
# Ollama defaults to 2048 context regardless of the model's capabilities.
# When running against an Ollama server, detect the model's max context
+7 -8
@@ -766,8 +766,7 @@ def try_recover_primary_transport(
agent.api_key = rt["api_key"]
if agent.api_mode == "anthropic_messages":
-from agent.plugin_registries import registries
-build_anthropic_client = registries.get_provider_service("anthropic", "build_anthropic_client")
+from agent.anthropic_adapter import build_anthropic_client
agent._anthropic_api_key = rt["anthropic_api_key"]
agent._anthropic_base_url = rt["anthropic_base_url"]
agent._anthropic_client = build_anthropic_client(
@@ -931,8 +930,7 @@ def restore_primary_runtime(agent) -> bool:
# ── Rebuild client for the primary provider ──
if agent.api_mode == "anthropic_messages":
-from agent.plugin_registries import registries
-build_anthropic_client = registries.get_provider_service("anthropic", "build_anthropic_client")
+from agent.anthropic_adapter import build_anthropic_client
agent._anthropic_api_key = rt["anthropic_api_key"]
agent._anthropic_base_url = rt["anthropic_base_url"]
agent._anthropic_client = build_anthropic_client(
@@ -1438,10 +1436,11 @@ def switch_model(agent, new_model, new_provider, api_key='', base_url='', api_mo
# ── Build new client ──
if api_mode == "anthropic_messages":
-from agent.plugin_registries import registries
-build_anthropic_client = registries.get_provider_service("anthropic", "build_anthropic_client")
-resolve_anthropic_token = registries.get_provider_service("anthropic", "resolve_anthropic_token")
-_is_oauth_token = registries.get_provider_service("anthropic", "_is_oauth_token")
+from agent.anthropic_adapter import (
+    build_anthropic_client,
+    resolve_anthropic_token,
+    _is_oauth_token,
+)
# Only fall back to ANTHROPIC_TOKEN when the provider is actually Anthropic.
# Other anthropic_messages providers (MiniMax, Alibaba, etc.) must use their own
# API key — falling back would send Anthropic credentials to third-party endpoints.
File diff suppressed because it is too large
-166
@@ -1,166 +0,0 @@
"""Anthropic auxiliary client wrappers — core module, no SDK dependency.

Provides OpenAI-client-compatible shims over native Anthropic SDK clients,
so auxiliary tasks (compression, vision, web extract, etc.) can call
``client.chat.completions.create()`` regardless of the underlying SDK.

The wrapper classes themselves never import the anthropic SDK. They delegate
wire-format conversion to :mod:`agent.anthropic_format` and response
normalization to the ``anthropic_messages`` transport registered in
:mod:`agent.transports`.
"""

from __future__ import annotations

import asyncio
import logging
from types import SimpleNamespace
from typing import Any, Optional

from agent.anthropic_format import (
    build_anthropic_kwargs,
    _forbids_sampling_params,
)

logger = logging.getLogger(__name__)


# ---------------------------------------------------------------------------
# Adapter: Anthropic SDK → OpenAI-compatible completions.create()
# ---------------------------------------------------------------------------

class _AnthropicCompletionsAdapter:
    """OpenAI-client-compatible adapter for Anthropic Messages API."""

    def __init__(self, real_client: Any, model: str, is_oauth: bool = False):
        self._client = real_client
        self._model = model
        self._is_oauth = is_oauth

    def create(self, **kwargs) -> Any:
        from agent.transports import get_transport

        messages = kwargs.get("messages", [])
        model = kwargs.get("model", self._model)
        tools = kwargs.get("tools")
        tool_choice = kwargs.get("tool_choice")
        # ZAI's Anthropic-compatible endpoint rejects max_tokens on vision
        # models (glm-4v-flash etc.) with error code 1210. When the caller
        # signals this by setting _skip_zai_max_tokens in kwargs, omit it.
        _skip_mt = kwargs.pop("_skip_zai_max_tokens", False)
        if _skip_mt:
            max_tokens = None
        else:
            max_tokens = kwargs.get("max_tokens") or kwargs.get("max_completion_tokens") or 2000
        temperature = kwargs.get("temperature")

        normalized_tool_choice = None
        if isinstance(tool_choice, str):
            normalized_tool_choice = tool_choice
        elif isinstance(tool_choice, dict):
            choice_type = str(tool_choice.get("type", "")).lower()
            if choice_type == "function":
                normalized_tool_choice = tool_choice.get("function", {}).get("name")
            elif choice_type in {"auto", "required", "none"}:
                normalized_tool_choice = choice_type

        anthropic_kwargs = build_anthropic_kwargs(
            model=model,
            messages=messages,
            tools=tools,
            max_tokens=max_tokens,
            reasoning_config=None,
            tool_choice=normalized_tool_choice,
            is_oauth=self._is_oauth,
        )
        # Opus 4.7+ rejects any non-default temperature/top_p/top_k; only set
        # temperature for models that still accept it. build_anthropic_kwargs
        # additionally strips these keys as a safety net — keep both layers.
        if temperature is not None:
            if not _forbids_sampling_params(model):
                anthropic_kwargs["temperature"] = temperature

        response = self._client.messages.create(**anthropic_kwargs)
        _transport = get_transport("anthropic_messages")
        _nr = _transport.normalize_response(
            response, strip_tool_prefix=self._is_oauth
        )
        assistant_message = SimpleNamespace(
            content=_nr.content,
            tool_calls=_nr.tool_calls,
            reasoning=_nr.reasoning,
        )
        finish_reason = _nr.finish_reason

        usage = None
        if hasattr(response, "usage") and response.usage:
            prompt_tokens = getattr(response.usage, "input_tokens", 0) or 0
            completion_tokens = getattr(response.usage, "output_tokens", 0) or 0
            total_tokens = getattr(response.usage, "total_tokens", 0) or (prompt_tokens + completion_tokens)
            usage = SimpleNamespace(
                prompt_tokens=prompt_tokens,
                completion_tokens=completion_tokens,
                total_tokens=total_tokens,
            )

        choice = SimpleNamespace(
            index=0,
            message=assistant_message,
            finish_reason=finish_reason,
        )
        return SimpleNamespace(
            choices=[choice],
            model=model,
            usage=usage,
        )
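The `tool_choice` handling inside `create` accepts either a bare string or an OpenAI-style dict and reduces it to a single string or `None`. A standalone sketch of that reduction follows; the function name `normalize_tool_choice` and the sample tool name `get_weather` are hypothetical, and the logic mirrors the branches shown in the adapter rather than any documented API.

```python
from typing import Any, Optional


def normalize_tool_choice(tool_choice: Any) -> Optional[str]:
    """Collapse an OpenAI-style tool_choice into a plain string, or None."""
    if isinstance(tool_choice, str):
        return tool_choice
    if isinstance(tool_choice, dict):
        choice_type = str(tool_choice.get("type", "")).lower()
        if choice_type == "function":
            # A forced function call is represented by the function's name.
            return tool_choice.get("function", {}).get("name")
        if choice_type in {"auto", "required", "none"}:
            return choice_type
    # Anything unrecognised is treated as "no preference".
    return None


forced = normalize_tool_choice({"type": "function", "function": {"name": "get_weather"}})
print(forced)
```

Unknown dict shapes fall through to `None`, so the downstream builder sees no tool preference instead of an invalid value.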
class _AnthropicChatShim:
    def __init__(self, adapter: _AnthropicCompletionsAdapter):
        self.completions = adapter


# ---------------------------------------------------------------------------
# Public wrappers
# ---------------------------------------------------------------------------

class AnthropicAuxiliaryClient:
    """OpenAI-client-compatible wrapper over a native Anthropic client."""

    def __init__(self, real_client: Any, model: str, api_key: str, base_url: str, is_oauth: bool = False):
        self._real_client = real_client
        adapter = _AnthropicCompletionsAdapter(real_client, model, is_oauth=is_oauth)
        self.chat = _AnthropicChatShim(adapter)
        self.api_key = api_key
        self.base_url = base_url

    def close(self):
        close_fn = getattr(self._real_client, "close", None)
        if callable(close_fn):
            close_fn()


class _AsyncAnthropicCompletionsAdapter:
    def __init__(self, sync_adapter: _AnthropicCompletionsAdapter):
        self._sync = sync_adapter

    async def create(self, **kwargs) -> Any:
        return await asyncio.to_thread(self._sync.create, **kwargs)


class _AsyncAnthropicChatShim:
    def __init__(self, adapter: _AsyncAnthropicCompletionsAdapter):
        self.completions = adapter


class AsyncAnthropicAuxiliaryClient:
    def __init__(self, sync_wrapper: AnthropicAuxiliaryClient):
        sync_adapter = sync_wrapper.chat.completions
        async_adapter = _AsyncAnthropicCompletionsAdapter(sync_adapter)
        self.chat = _AsyncAnthropicChatShim(async_adapter)
        self.api_key = sync_wrapper.api_key
        self.base_url = sync_wrapper.base_url
        # Mirror _real_client so cache eviction on a poisoned underlying
        # client also drops this entry.
        self._real_client = sync_wrapper._real_client
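The async wrapper above does not reimplement the request path; it runs the synchronous `create` in a worker thread via `asyncio.to_thread`. The sketch below shows that delegation in isolation. `BlockingCompletions`, `AsyncCompletions`, and the model name are hypothetical stand-ins, and Python 3.9 or later is assumed for `asyncio.to_thread`.

```python
import asyncio
import threading
from types import SimpleNamespace


class BlockingCompletions:
    """Hypothetical stand-in for a synchronous completions adapter."""

    def create(self, **kwargs):
        # Record which thread served the call so the hand-off is observable.
        return SimpleNamespace(
            model=kwargs.get("model"),
            thread=threading.current_thread().name,
        )


class AsyncCompletions:
    """Async facade that pushes the blocking call onto a worker thread."""

    def __init__(self, sync_adapter):
        self._sync = sync_adapter

    async def create(self, **kwargs):
        return await asyncio.to_thread(self._sync.create, **kwargs)


async def main():
    client = AsyncCompletions(BlockingCompletions())
    return await client.create(model="example-model")


response = asyncio.run(main())
print(response.model)
```

The trade-off of this design is simplicity over throughput: each in-flight call occupies a thread from the default executor, but the sync and async paths share one implementation.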
+449 -115
@@ -106,41 +106,6 @@ from utils import base_url_host_matches, base_url_hostname, normalize_proxy_env_
logger = logging.getLogger(__name__)
-# ---------------------------------------------------------------------------
-# Core anthropic wire-format modules (no SDK dependency)
-# ---------------------------------------------------------------------------
-from agent.anthropic_aux import (  # noqa: F401
-    AnthropicAuxiliaryClient,
-    AsyncAnthropicAuxiliaryClient,
-)
-
-# ---------------------------------------------------------------------------
-# Plugin-registry helper — access *plugin-provided* anthropic services
-# (resolve.py functions: maybe_wrap_anthropic, is_anthropic_compat_endpoint, etc.)
-# Wire-format code (message conversion, aux client wrappers) lives in core
-# and is imported directly above.
-# ---------------------------------------------------------------------------
-
-def _anthropic_plugin_service(name: str):
-    """Lazy accessor for anthropic plugin resolve services.
-
-    Only the SDK-dependent orchestration (maybe_wrap_anthropic,
-    is_anthropic_compat_endpoint, convert_openai_images_to_anthropic) lives
-    in the plugin. Core accesses it through
-    ``registries.get_provider_service("anthropic", name)`` so that:
-
-    - Core never imports from a plugin package directly.
-    - The plugin need only be installed when the user actually uses it.
-    """
-
-    from agent.plugin_registries import registries
-    svc = registries.get_provider_service("anthropic", name)
-    if svc is None:
-        raise ImportError(
-            f"anthropic plugin service {name!r} not available — "
-            f"the hermes_agent_anthropic package may not be installed"
-        )
-    return svc
def _safe_isinstance(obj: Any, maybe_type: Any) -> bool:
"""Return False instead of raising when a patched symbol is not a type."""
@@ -452,6 +417,7 @@ auxiliary_is_nous: bool = False
_OPENROUTER_MODEL = "google/gemini-3-flash-preview"
_NOUS_MODEL = "google/gemini-3-flash-preview"
_NOUS_DEFAULT_BASE_URL = "https://inference-api.nousresearch.com/v1"
+_ANTHROPIC_DEFAULT_BASE_URL = "https://api.anthropic.com"
_AUTH_JSON_PATH = get_hermes_home() / "auth.json"
# Codex OAuth endpoint used when a caller explicitly requests
@@ -982,6 +948,253 @@ class AsyncCodexAuxiliaryClient:
self._real_client = sync_wrapper._real_client
+class _AnthropicCompletionsAdapter:
+    """OpenAI-client-compatible adapter for Anthropic Messages API."""
+
+    def __init__(self, real_client: Any, model: str, is_oauth: bool = False):
+        self._client = real_client
+        self._model = model
+        self._is_oauth = is_oauth
+
+    def create(self, **kwargs) -> Any:
+        from agent.anthropic_adapter import build_anthropic_kwargs
+        from agent.transports import get_transport
+
+        messages = kwargs.get("messages", [])
+        model = kwargs.get("model", self._model)
+        tools = kwargs.get("tools")
+        tool_choice = kwargs.get("tool_choice")
+        # ZAI's Anthropic-compatible endpoint rejects max_tokens on vision
+        # models (glm-4v-flash etc.) with error code 1210. When the caller
+        # signals this by setting _skip_zai_max_tokens in kwargs, omit it.
+        _skip_mt = kwargs.pop("_skip_zai_max_tokens", False)
+        if _skip_mt:
+            max_tokens = None
+        else:
+            max_tokens = kwargs.get("max_tokens") or kwargs.get("max_completion_tokens") or 2000
+        temperature = kwargs.get("temperature")
+
+        normalized_tool_choice = None
+        if isinstance(tool_choice, str):
+            normalized_tool_choice = tool_choice
+        elif isinstance(tool_choice, dict):
+            choice_type = str(tool_choice.get("type", "")).lower()
+            if choice_type == "function":
+                normalized_tool_choice = tool_choice.get("function", {}).get("name")
+            elif choice_type in {"auto", "required", "none"}:
+                normalized_tool_choice = choice_type
+
+        anthropic_kwargs = build_anthropic_kwargs(
+            model=model,
+            messages=messages,
+            tools=tools,
+            max_tokens=max_tokens,
+            reasoning_config=None,
+            tool_choice=normalized_tool_choice,
+            is_oauth=self._is_oauth,
+        )
+        # Opus 4.7+ rejects any non-default temperature/top_p/top_k; only set
+        # temperature for models that still accept it. build_anthropic_kwargs
+        # additionally strips these keys as a safety net — keep both layers.
+        if temperature is not None:
+            from agent.anthropic_adapter import _forbids_sampling_params
+            if not _forbids_sampling_params(model):
+                anthropic_kwargs["temperature"] = temperature
+
+        response = self._client.messages.create(**anthropic_kwargs)
+        _transport = get_transport("anthropic_messages")
+        _nr = _transport.normalize_response(
+            response, strip_tool_prefix=self._is_oauth
+        )
+        # ToolCall already duck-types as OpenAI shape (.type, .function.name,
+        # .function.arguments) via properties, so no wrapping needed.
+        assistant_message = SimpleNamespace(
+            content=_nr.content,
+            tool_calls=_nr.tool_calls,
+            reasoning=_nr.reasoning,
+        )
+        finish_reason = _nr.finish_reason
+
+        usage = None
+        if hasattr(response, "usage") and response.usage:
+            prompt_tokens = getattr(response.usage, "input_tokens", 0) or 0
+            completion_tokens = getattr(response.usage, "output_tokens", 0) or 0
+            total_tokens = getattr(response.usage, "total_tokens", 0) or (prompt_tokens + completion_tokens)
+            usage = SimpleNamespace(
+                prompt_tokens=prompt_tokens,
+                completion_tokens=completion_tokens,
+                total_tokens=total_tokens,
+            )
+
+        choice = SimpleNamespace(
+            index=0,
+            message=assistant_message,
+            finish_reason=finish_reason,
+        )
+        return SimpleNamespace(
+            choices=[choice],
+            model=model,
+            usage=usage,
+        )
+
+
+class _AnthropicChatShim:
+    def __init__(self, adapter: _AnthropicCompletionsAdapter):
+        self.completions = adapter
+
+
+class AnthropicAuxiliaryClient:
+    """OpenAI-client-compatible wrapper over a native Anthropic client."""
+
+    def __init__(self, real_client: Any, model: str, api_key: str, base_url: str, is_oauth: bool = False):
+        self._real_client = real_client
+        adapter = _AnthropicCompletionsAdapter(real_client, model, is_oauth=is_oauth)
+        self.chat = _AnthropicChatShim(adapter)
+        self.api_key = api_key
+        self.base_url = base_url
+
+    def close(self):
+        close_fn = getattr(self._real_client, "close", None)
+        if callable(close_fn):
+            close_fn()
+
+
+class _AsyncAnthropicCompletionsAdapter:
+    def __init__(self, sync_adapter: _AnthropicCompletionsAdapter):
+        self._sync = sync_adapter
+
+    async def create(self, **kwargs) -> Any:
+        import asyncio
+        return await asyncio.to_thread(self._sync.create, **kwargs)
+
+
+class _AsyncAnthropicChatShim:
+    def __init__(self, adapter: _AsyncAnthropicCompletionsAdapter):
+        self.completions = adapter
+
+
+class AsyncAnthropicAuxiliaryClient:
+    def __init__(self, sync_wrapper: "AnthropicAuxiliaryClient"):
+        sync_adapter = sync_wrapper.chat.completions
+        async_adapter = _AsyncAnthropicCompletionsAdapter(sync_adapter)
+        self.chat = _AsyncAnthropicChatShim(async_adapter)
+        self.api_key = sync_wrapper.api_key
+        self.base_url = sync_wrapper.base_url
+        # See AsyncCodexAuxiliaryClient: mirror _real_client so cache
+        # eviction on a poisoned underlying client also drops this entry.
+        self._real_client = sync_wrapper._real_client
+
+
+def _endpoint_speaks_anthropic_messages(base_url: str) -> bool:
+    """True if the endpoint at ``base_url`` speaks the Anthropic Messages
+    protocol instead of OpenAI chat.completions.
+
+    Mirrors ``hermes_cli.runtime_provider._detect_api_mode_for_url`` so the
+    auxiliary client and the main agent stay in sync on transport selection.
+    Covers:
+    - Any URL ending in ``/anthropic`` (MiniMax, Zhipu GLM, LiteLLM proxies,
+      Anthropic-compatible gateways).
+    - ``api.kimi.com/coding`` (Kimi Coding Plan — the /coding route only
+      speaks Claude-Code's native Anthropic shape; ``chat.completions``
+      returns 404 on Anthropic-only model aliases like ``kimi-for-coding``).
+    - ``api.anthropic.com`` (native Anthropic).
+    """
+    normalized = (base_url or "").strip().lower().rstrip("/")
+    if not normalized:
+        return False
+    if normalized.endswith("/anthropic"):
+        return True
+    hostname = base_url_hostname(normalized)
+    if hostname == "api.anthropic.com":
+        return True
+    if hostname == "api.kimi.com" and "/coding" in normalized:
+        return True
+    return False
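The URL heuristic in `_endpoint_speaks_anthropic_messages` can be exercised on its own. In this sketch `urllib.parse.urlparse` stands in for the project's `base_url_hostname` helper, whose exact behavior is not shown here, so results may differ for inputs without a scheme; the function name `speaks_anthropic_messages` and the `example.com` URL are hypothetical.

```python
from urllib.parse import urlparse


def speaks_anthropic_messages(base_url: str) -> bool:
    """Guess from the URL alone whether an endpoint expects Anthropic Messages."""
    normalized = (base_url or "").strip().lower().rstrip("/")
    if not normalized:
        return False
    # Gateways that expose the protocol under a trailing /anthropic path.
    if normalized.endswith("/anthropic"):
        return True
    # Assumption: urlparse replaces the project's own hostname helper.
    hostname = urlparse(normalized).hostname or ""
    if hostname == "api.anthropic.com":
        return True
    if hostname == "api.kimi.com" and "/coding" in normalized:
        return True
    return False


checks = {
    url: speaks_anthropic_messages(url)
    for url in (
        "https://api.anthropic.com",
        "https://example.com/anthropic/",
        "https://api.kimi.com/coding/v1",
        "https://api.kimi.com/v1",
    )
}
print(checks)
```

Matching on the exact hostname rather than a substring keeps lookalike domains from being classified as native Anthropic endpoints.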
def _maybe_wrap_anthropic(
    client_obj: Any,
    model: str,
    api_key: str,
    base_url: str,
    api_mode: Optional[str] = None,
) -> Any:
    """Rewrap a plain OpenAI client in ``AnthropicAuxiliaryClient`` when
    the endpoint actually speaks Anthropic Messages.

    This is the single chokepoint for aux-client transport correction.
    Runs at the end of every ``resolve_provider_client`` branch so that
    api_key providers (Kimi Coding Plan), the ``custom`` endpoint, and
    future /anthropic gateways all land on the right wire format
    regardless of which branch built the client.

    Returns ``client_obj`` unchanged when:
    - It's already an Anthropic/Codex/Gemini/CopilotACP wrapper.
    - The endpoint is an OpenAI-wire endpoint.
    - ``api_mode`` is explicitly set to a non-Anthropic transport.
    - The ``anthropic`` SDK is not installed (falls back to OpenAI wire).
    """
    # Already wrapped — don't double-wrap.
    if _safe_isinstance(client_obj, AnthropicAuxiliaryClient):
        return client_obj
    # Other specialized adapters we should never re-dispatch.
    if _safe_isinstance(client_obj, CodexAuxiliaryClient):
        return client_obj
    try:
        from agent.gemini_native_adapter import GeminiNativeClient
        if _safe_isinstance(client_obj, GeminiNativeClient):
            return client_obj
    except ImportError:
        pass
    try:
        from agent.copilot_acp_client import CopilotACPClient
        if _safe_isinstance(client_obj, CopilotACPClient):
            return client_obj
    except ImportError:
        pass
    # Explicit non-anthropic api_mode wins over URL heuristics.
    if api_mode and api_mode != "anthropic_messages":
        return client_obj
    should_wrap = (
        api_mode == "anthropic_messages"
        or _endpoint_speaks_anthropic_messages(base_url)
    )
    if not should_wrap:
        return client_obj
    try:
        from agent.anthropic_adapter import build_anthropic_client
    except ImportError:
        logger.warning(
            "Endpoint %s speaks Anthropic Messages but the anthropic SDK is "
            "not installed — falling back to OpenAI-wire (will likely 404).",
            base_url,
        )
        return client_obj
    try:
        real_client = build_anthropic_client(api_key, base_url)
    except Exception as exc:
        logger.warning(
            "Failed to build Anthropic client for %s (%s) — falling back to "
            "OpenAI-wire client.", base_url, exc,
        )
        return client_obj
    logger.debug(
        "Auxiliary transport: wrapping client in AnthropicAuxiliaryClient "
        "(model=%s, base_url=%s, api_mode=%s)",
        model, base_url[:60] if base_url else "", api_mode or "auto-detected",
    )
    return AnthropicAuxiliaryClient(
        real_client, model, api_key, base_url, is_oauth=False,
    )
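Reviewer sketch (not part of the change): the precedence between an explicit `api_mode` and the URL heuristic in `_maybe_wrap_anthropic` can be reduced to a small pure function. `should_wrap` below is a hypothetical helper, and `"chat_completions"` is used only as an example of a non-Anthropic mode value.

```python
from typing import Optional


def should_wrap(api_mode: Optional[str], url_speaks_anthropic: bool) -> bool:
    """Return True when the client should be rewrapped for Anthropic wire."""
    # An explicit non-Anthropic api_mode wins over the URL heuristic.
    if api_mode and api_mode != "anthropic_messages":
        return False
    # Otherwise wrap if the mode asks for it or the URL looks Anthropic.
    return api_mode == "anthropic_messages" or url_speaks_anthropic
```

An empty or missing `api_mode` falls through to the URL check, which corresponds to the "auto-detected" label in the debug log above.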
def _read_nous_auth() -> Optional[dict]:
"""Read and validate ~/.hermes/auth.json for an active Nous provider.
@@ -1192,14 +1405,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
continue
except ImportError:
pass
# Delegate to the anthropic plugin resolver via the registry
from agent.plugin_registries import registries as _ar
_anthro_resolver = _ar.get_provider_resolver("anthropic")
if _anthro_resolver is not None:
_ac, _am = _anthro_resolver()
if _ac is not None:
return _ac, _am
continue
return _try_anthropic()
pool_present, entry = _select_pool_entry(provider_id)
if pool_present:
@@ -1236,7 +1442,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
except Exception:
pass
_client = OpenAI(api_key=api_key, base_url=base_url, **extra)
_client = _anthropic_plugin_service("maybe_wrap_anthropic")(_client, model, api_key, raw_base_url)
_client = _maybe_wrap_anthropic(_client, model, api_key, raw_base_url)
return _client, model
creds = resolve_api_key_provider_credentials(provider_id)
@@ -1273,7 +1479,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
except Exception:
pass
_client = OpenAI(api_key=api_key, base_url=base_url, **extra)
_client = _anthropic_plugin_service("maybe_wrap_anthropic")(_client, model, api_key, raw_base_url)
_client = _maybe_wrap_anthropic(_client, model, api_key, raw_base_url)
return _client, model
return None, None
@@ -1282,6 +1488,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
# ── Provider resolution helpers ─────────────────────────────────────────────
def _try_openrouter(explicit_api_key: str = None, model: str = None) -> Tuple[Optional[OpenAI], Optional[str]]:
pool_present, entry = _select_pool_entry("openrouter")
if pool_present:
@@ -1603,11 +1810,7 @@ def _try_custom_endpoint() -> Tuple[Optional[Any], Optional[str]]:
# LiteLLM proxies, etc.). Must NEVER be treated as OAuth —
# Anthropic OAuth claims only apply to api.anthropic.com.
try:
from agent.plugin_registries import registries
_anthropic = registries.get_provider_namespace("anthropic")
build_anthropic_client = _anthropic.get("build_anthropic_client")
if build_anthropic_client is None:
raise ImportError("anthropic provider not registered")
from agent.anthropic_adapter import build_anthropic_client
real_client = build_anthropic_client(custom_key, custom_base)
except ImportError:
logger.warning(
@@ -1622,7 +1825,7 @@ def _try_custom_endpoint() -> Tuple[Optional[Any], Optional[str]]:
# URL-based anthropic detection for custom endpoints that didn't set
# api_mode explicitly (e.g. kimi.com/coding reached via custom config).
_fallback_client = OpenAI(api_key=custom_key, base_url=_clean_base, **_extra)
_fallback_client = _anthropic_plugin_service("maybe_wrap_anthropic")(
_fallback_client = _maybe_wrap_anthropic(
_fallback_client, model, custom_key, custom_base, custom_mode,
)
return _fallback_client, model
@@ -1800,7 +2003,7 @@ def _try_azure_foundry(
# for Entra ID it's a callable. ``_maybe_wrap_anthropic`` →
# ``build_anthropic_client`` detects the callable and installs
# the bearer-injecting httpx hook.
return _anthropic_plugin_service("maybe_wrap_anthropic")(
return _maybe_wrap_anthropic(
client, final_model, api_key,
base_url, runtime_api_mode,
), final_model
@@ -1809,6 +2012,54 @@ def _try_azure_foundry(
return client, final_model
def _try_anthropic(explicit_api_key: str = None) -> Tuple[Optional[Any], Optional[str]]:
    try:
        from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token
    except ImportError:
        return None, None
    pool_present, entry = _select_pool_entry("anthropic")
    if pool_present:
        if entry is None:
            return None, None
        token = explicit_api_key or _pool_runtime_api_key(entry)
    else:
        entry = None
        token = explicit_api_key or resolve_anthropic_token()
    if not token:
        return None, None
    # Allow base URL override from config.yaml model.base_url, but only
    # when the configured provider is anthropic — otherwise a non-Anthropic
    # base_url (e.g. Codex endpoint) would leak into Anthropic requests.
    base_url = _pool_runtime_base_url(entry, _ANTHROPIC_DEFAULT_BASE_URL) if pool_present else _ANTHROPIC_DEFAULT_BASE_URL
    try:
        from hermes_cli.config import load_config
        cfg = load_config()
        model_cfg = cfg.get("model")
        if isinstance(model_cfg, dict):
            cfg_provider = str(model_cfg.get("provider") or "").strip().lower()
            if cfg_provider == "anthropic":
                cfg_base_url = (model_cfg.get("base_url") or "").strip().rstrip("/")
                if cfg_base_url:
                    base_url = cfg_base_url
    except Exception:
        pass
    from agent.anthropic_adapter import _is_oauth_token
    is_oauth = _is_oauth_token(token)
    model = _get_aux_model_for_provider("anthropic") or "claude-haiku-4-5-20251001"
    logger.debug("Auxiliary client: Anthropic native (%s) at %s (oauth=%s)", model, base_url, is_oauth)
    try:
        real_client = build_anthropic_client(token, base_url)
    except ImportError:
        # The anthropic_adapter module imports fine but the SDK itself is
        # missing — build_anthropic_client raises ImportError at call time
        # when _anthropic_sdk is None. Treat as unavailable.
        return None, None
    return AnthropicAuxiliaryClient(real_client, model, token, base_url, is_oauth=is_oauth), model
_AUTO_PROVIDER_LABELS = {
"_try_openrouter": "openrouter",
"_try_nous": "nous",
@@ -2378,8 +2629,8 @@ def _retry_same_provider_sync(
extra_body=effective_extra_body,
base_url=retry_base or resolved_base_url,
)
if _anthropic_plugin_service("is_anthropic_compat_endpoint")(resolved_provider, retry_base):
retry_kwargs["messages"] = _anthropic_plugin_service("convert_openai_images_to_anthropic")(retry_kwargs["messages"])
if _is_anthropic_compat_endpoint(resolved_provider, retry_base):
retry_kwargs["messages"] = _convert_openai_images_to_anthropic(retry_kwargs["messages"])
return _validate_llm_response(
retry_client.chat.completions.create(**retry_kwargs), task,
)
@@ -2435,8 +2686,8 @@ async def _retry_same_provider_async(
extra_body=effective_extra_body,
base_url=retry_base or resolved_base_url,
)
if _anthropic_plugin_service("is_anthropic_compat_endpoint")(resolved_provider, retry_base):
retry_kwargs["messages"] = _anthropic_plugin_service("convert_openai_images_to_anthropic")(retry_kwargs["messages"])
if _is_anthropic_compat_endpoint(resolved_provider, retry_base):
retry_kwargs["messages"] = _convert_openai_images_to_anthropic(retry_kwargs["messages"])
return _validate_llm_response(
await retry_client.chat.completions.create(**retry_kwargs), task,
)
@@ -2470,19 +2721,12 @@ def _refresh_provider_credentials(provider: str) -> bool:
_evict_cached_clients(normalized)
return True
if normalized == "anthropic":
from agent.plugin_registries import registries
_anthropic = registries.get_provider_namespace("anthropic")
read_claude_code_credentials = _anthropic.get("read_claude_code_credentials")
_refresh_oauth_token = _anthropic.get("_refresh_oauth_token")
resolve_anthropic_token = _anthropic.get("resolve_anthropic_token")
if read_claude_code_credentials is None:
return False
from agent.anthropic_adapter import read_claude_code_credentials, _refresh_oauth_token, resolve_anthropic_token
creds = read_claude_code_credentials()
token = _refresh_oauth_token(creds) if isinstance(creds, dict) and creds.get("refreshToken") and _refresh_oauth_token else None
token = _refresh_oauth_token(creds) if isinstance(creds, dict) and creds.get("refreshToken") else None
if not str(token or "").strip():
if resolve_anthropic_token is not None:
token = resolve_anthropic_token()
token = resolve_anthropic_token()
if not str(token or "").strip():
return False
_evict_cached_clients(normalized)
@@ -2803,7 +3047,7 @@ def _to_async_client(sync_client, model: str, is_vision: bool = False):
if isinstance(sync_client, CodexAuxiliaryClient):
return AsyncCodexAuxiliaryClient(sync_client), model
if _safe_isinstance(sync_client, AnthropicAuxiliaryClient):
if isinstance(sync_client, AnthropicAuxiliaryClient):
return AsyncAnthropicAuxiliaryClient(sync_client), model
try:
from agent.gemini_native_adapter import GeminiNativeClient, AsyncGeminiNativeClient
@@ -2989,7 +3233,7 @@ def resolve_provider_client(
return CodexAuxiliaryClient(client_obj, final_model_str)
# Anthropic-wire endpoints: rewrap plain OpenAI clients so
# chat.completions.create() is translated to /v1/messages.
return _anthropic_plugin_service("maybe_wrap_anthropic")(
return _maybe_wrap_anthropic(
client_obj, final_model_str, api_key_str, base_url_str, api_mode,
)
@@ -3221,11 +3465,7 @@ def resolve_provider_client(
# branch in _try_custom_endpoint(). See #15033.
if entry_api_mode == "anthropic_messages":
try:
from agent.plugin_registries import registries
_anthropic = registries.get_provider_namespace("anthropic")
build_anthropic_client = _anthropic.get("build_anthropic_client")
if build_anthropic_client is None:
raise ImportError("anthropic provider not registered")
from agent.anthropic_adapter import build_anthropic_client
real_client = build_anthropic_client(custom_key, custom_base)
except ImportError:
logger.warning(
@@ -3268,32 +3508,39 @@ def resolve_provider_client(
except ImportError:
pass
# ── Plugin-registered resolvers (azure-foundry, etc.) ─────────────
# Providers with complex auth (Entra ID, OAuth, etc.) register a
# resolver callable so core doesn't need per-provider if/elif branches.
from agent.plugin_registries import registries as _reg_early
_early_resolver = _reg_early.get_provider_resolver(provider)
if _early_resolver is not None:
client, default_model = _early_resolver(
# ── Azure Foundry (delegates to runtime resolver for auth_mode-aware routing)
#
# The generic PROVIDER_REGISTRY path below uses
# ``resolve_api_key_provider_credentials`` which only knows about the
# static ``AZURE_FOUNDRY_API_KEY`` env var. That misses two important
# cases for the ``azure-foundry`` provider:
#
# 1. ``model.auth_mode: entra_id`` — no static key exists; we need
# a callable bearer-token provider from ``azure_identity_adapter``.
# 2. Non-default ``model.base_url`` (Foundry projects path) — the
# env-var-only resolver doesn't apply config-yaml-driven URL
# overrides.
#
# Delegate to the same runtime resolver the main agent uses so
# auxiliary tasks (title generation, compression, vision, embedding,
# session search) inherit the user's full Azure config.
if provider == "azure-foundry":
client, default_model = _try_azure_foundry(
model=model,
explicit_api_key=explicit_api_key,
explicit_base_url=explicit_base_url,
async_mode=async_mode,
is_vision=is_vision,
main_runtime=main_runtime,
api_mode=api_mode,
)
if client is not None:
final_model = _normalize_resolved_model(model or default_model, provider)
return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
else (client, final_model))
# Resolver returned None — provider unavailable
logger.warning(
"resolve_provider_client: %s requested but resolver returned "
"no client (run: hermes doctor for diagnostics)",
provider,
)
return None, None
if client is None:
logger.warning(
"resolve_provider_client: azure-foundry requested but "
"runtime resolution failed (run: hermes doctor for "
"diagnostics)"
)
return None, None
final_model = _normalize_resolved_model(model or default_model, provider)
return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
else (client, final_model))
# ── API-key providers from PROVIDER_REGISTRY ─────────────────────
try:
@@ -3312,6 +3559,14 @@ def resolve_provider_client(
return None, None
if pconfig.auth_type == "api_key":
if provider == "anthropic":
client, default_model = _try_anthropic(explicit_api_key=explicit_api_key)
if client is None:
logger.warning("resolve_provider_client: anthropic requested but no Anthropic credentials found")
return None, None
final_model = _normalize_resolved_model(model or default_model, provider)
return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode else (client, final_model))
creds = resolve_api_key_provider_credentials(provider)
api_key = str(creds.get("api_key", "")).strip()
# Honour an explicit api_key override (e.g. from a fallback_model entry
@@ -3444,14 +3699,37 @@ def resolve_provider_client(
return None, None
elif pconfig.auth_type == "aws_sdk":
# AWS SDK providers (e.g. Bedrock) — handled by the early resolver
# catch above when a plugin registers one. If we reach here, no
# resolver was registered.
logger.warning(
"resolve_provider_client: aws_sdk provider %s has no "
"registered resolver (plugin not loaded?)", provider,
# AWS SDK providers (Bedrock) — use the Anthropic Bedrock client via
# boto3's credential chain (IAM roles, SSO, env vars, instance metadata).
try:
from agent.bedrock_adapter import has_aws_credentials, resolve_bedrock_region
from agent.anthropic_adapter import build_anthropic_bedrock_client
except ImportError:
logger.warning("resolve_provider_client: bedrock requested but "
"boto3 or anthropic SDK not installed")
return None, None
if not has_aws_credentials():
logger.debug("resolve_provider_client: bedrock requested but "
"no AWS credentials found")
return None, None
region = resolve_bedrock_region()
default_model = "anthropic.claude-haiku-4-5-20251001-v1:0"
final_model = _normalize_resolved_model(model or default_model, provider)
try:
real_client = build_anthropic_bedrock_client(region)
except ImportError as exc:
logger.warning("resolve_provider_client: cannot create Bedrock "
"client: %s", exc)
return None, None
client = AnthropicAuxiliaryClient(
real_client, final_model, api_key="aws-sdk",
base_url=f"https://bedrock-runtime.{region}.amazonaws.com",
)
return None, None
logger.debug("resolve_provider_client: bedrock (%s, %s)", final_model, region)
return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
else (client, final_model))
elif pconfig.auth_type in {"oauth_device_code", "oauth_external"}:
# OAuth providers — route through their specific try functions
@@ -3575,12 +3853,7 @@ def _resolve_strict_vision_backend(
# allow-list); callers must specify via auxiliary.<task>.model.
return resolve_provider_client("openai-codex", model, is_vision=True)
if provider == "anthropic":
from agent.plugin_registries import registries as _reg
_resolver = _reg.get_provider_resolver("anthropic")
if _resolver is not None:
return _resolver(model=model)
# Fallback: no resolver registered (plugin not loaded)
return None, None
return _try_anthropic()
if provider == "custom":
return _try_custom_endpoint()
return None, None
@@ -4310,6 +4583,69 @@ def _get_task_extra_body(task: str) -> Dict[str, Any]:
# Providers that use Anthropic-compatible endpoints (via OpenAI SDK wrapper).
# Their image content blocks must use Anthropic format, not OpenAI format.
_ANTHROPIC_COMPAT_PROVIDERS = frozenset({"minimax", "minimax-oauth", "minimax-cn"})
def _is_anthropic_compat_endpoint(provider: str, base_url: str) -> bool:
    """Detect if an endpoint expects Anthropic-format content blocks.

    Returns True for known Anthropic-compatible providers (MiniMax) and
    any endpoint whose URL contains ``/anthropic`` in the path.
    """
    if provider in _ANTHROPIC_COMPAT_PROVIDERS:
        return True
    url_lower = (base_url or "").lower()
    return "/anthropic" in url_lower


def _convert_openai_images_to_anthropic(messages: list) -> list:
    """Convert OpenAI ``image_url`` content blocks to Anthropic ``image`` blocks.

    Only touches messages that have list-type content with ``image_url`` blocks;
    plain text messages pass through unchanged.
    """
    converted = []
    for msg in messages:
        content = msg.get("content")
        if not isinstance(content, list):
            converted.append(msg)
            continue
        new_content = []
        changed = False
        for block in content:
            if block.get("type") == "image_url":
                image_url_val = (block.get("image_url") or {}).get("url", "")
                if image_url_val.startswith("data:"):
                    # Parse data URI: data:<media_type>;base64,<data>
                    header, _, b64data = image_url_val.partition(",")
                    media_type = "image/png"
                    if ":" in header and ";" in header:
                        media_type = header.split(":", 1)[1].split(";", 1)[0]
                    new_content.append({
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": media_type,
                            "data": b64data,
                        },
                    })
                else:
                    # URL-based image
                    new_content.append({
                        "type": "image",
                        "source": {
                            "type": "url",
                            "url": image_url_val,
                        },
                    })
                changed = True
            else:
                new_content.append(block)
        converted.append({**msg, "content": new_content} if changed else msg)
    return converted
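Reviewer sketch (not part of the change): the per-block conversion above, pulled out into a hypothetical helper `to_anthropic_image` so the data-URI parsing can be exercised on its own. The sample URLs and payloads are made up.

```python
def to_anthropic_image(block: dict) -> dict:
    """Convert one OpenAI image_url block to an Anthropic image block."""
    url = (block.get("image_url") or {}).get("url", "")
    if url.startswith("data:"):
        # Data URI shape: data:<media_type>;base64,<data>
        header, _, b64data = url.partition(",")
        media_type = "image/png"  # default when the header cannot be parsed
        if ":" in header and ";" in header:
            media_type = header.split(":", 1)[1].split(";", 1)[0]
        return {
            "type": "image",
            "source": {"type": "base64", "media_type": media_type, "data": b64data},
        }
    # Anything else is passed through as a URL-sourced image.
    return {"type": "image", "source": {"type": "url", "url": url}}
```

One edge case worth a look: with this parsing, a header that omits the media type (for example `data:;base64,...`) yields an empty `media_type` string rather than the `image/png` default.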
def _build_call_kwargs(
provider: str,
model: str,
@@ -4339,10 +4675,8 @@ def _build_call_kwargs(
# structured-JSON extraction) don't 400 the moment
# the aux model is flipped to 4.7.
if temperature is not None:
from agent.plugin_registries import registries
_anthropic = registries.get_provider_namespace("anthropic")
_forbids_sampling_params = _anthropic.get("_forbids_sampling_params")
if _forbids_sampling_params is not None and _forbids_sampling_params(model):
from agent.anthropic_adapter import _forbids_sampling_params
if _forbids_sampling_params(model):
temperature = None
if temperature is not None:
@@ -4554,8 +4888,8 @@ def call_llm(
# Convert image blocks for Anthropic-compatible endpoints (e.g. MiniMax)
_client_base = str(getattr(client, "base_url", "") or "")
if _anthropic_plugin_service("is_anthropic_compat_endpoint")(resolved_provider, _client_base):
kwargs["messages"] = _anthropic_plugin_service("convert_openai_images_to_anthropic")(kwargs["messages"])
if _is_anthropic_compat_endpoint(resolved_provider, _client_base):
kwargs["messages"] = _convert_openai_images_to_anthropic(kwargs["messages"])
# Handle unsupported temperature, max_tokens vs max_completion_tokens retry,
# then payment fallback.
@@ -4997,8 +5331,8 @@ async def async_call_llm(
base_url=_client_base or resolved_base_url)
# Convert image blocks for Anthropic-compatible endpoints (e.g. MiniMax)
if _anthropic_plugin_service("is_anthropic_compat_endpoint")(resolved_provider, _client_base):
kwargs["messages"] = _anthropic_plugin_service("convert_openai_images_to_anthropic")(kwargs["messages"])
if _is_anthropic_compat_endpoint(resolved_provider, _client_base):
kwargs["messages"] = _convert_openai_images_to_anthropic(kwargs["messages"])
try:
return _validate_llm_response(
@@ -54,6 +54,8 @@ SCOPE_AI_AZURE_DEFAULT = "https://ai.azure.com/.default"
# Lazy SDK import — only loaded when the Entra path is actually used.
# ---------------------------------------------------------------------------
_AZURE_IDENTITY_FEATURE = "provider.azure_identity"
def has_azure_identity_installed() -> bool:
"""Return True if `azure-identity` can be imported right now.
@@ -68,20 +70,35 @@ def has_azure_identity_installed() -> bool:
def _require_azure_identity():
"""Import ``azure.identity``.
"""Import ``azure.identity``, lazy-installing it if allowed.
Raises ``ImportError`` with a clear actionable message when the
package is missing.
package is missing and lazy installs are disabled.
"""
try:
import azure.identity as _ai
return _ai
except ImportError:
raise ImportError(
"The 'azure-identity' package is required for Azure AI "
"Foundry Entra ID authentication. Install it with: "
"pip install azure-identity"
)
try:
from tools.lazy_deps import ensure, FeatureUnavailable
except ImportError as exc:
raise ImportError(
"The 'azure-identity' package is required for Azure AI "
"Foundry Entra ID authentication. Install it with: "
"pip install azure-identity"
) from exc
try:
ensure(_AZURE_IDENTITY_FEATURE, prompt=False)
except FeatureUnavailable as exc:
raise ImportError(
"The 'azure-identity' package is required for Azure AI "
"Foundry Entra ID authentication. " + str(exc)
) from exc
# Retry import after lazy install.
import azure.identity as _ai # noqa: WPS440
return _ai
def reset_credential_cache() -> None:
@@ -36,6 +36,19 @@ from typing import Any, Dict, List, Optional, Tuple
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Ensure boto3/botocore are installed before any code in this module runs.
# Upstream removed boto3 from [all] extras (PRs #24220, #24515); lazy_deps
# handles on-demand installation so the Bedrock provider still works in the
# EKS deployment without baking boto3 into the base image.
# ---------------------------------------------------------------------------
try:
from tools.lazy_deps import ensure
ensure("provider.bedrock", prompt=False)
except Exception:
pass # lazy_deps unavailable or install failed — let downstream imports surface the real error
# ---------------------------------------------------------------------------
# Lazy boto3 import — only loaded when the Bedrock provider is actually used.
# This keeps startup fast for users who don't use Bedrock.
+20 -32
@@ -235,14 +235,12 @@ def interruptible_api_call(agent, api_kwargs: dict):
# normalize_converse_response produces an OpenAI-compatible
# SimpleNamespace so the rest of the agent loop can treat
# bedrock responses like chat_completions responses.
from agent.plugin_registries import registries
_bedrock = registries.get_provider_namespace("bedrock")
_get_bedrock_runtime_client = _bedrock.get("_get_bedrock_runtime_client")
invalidate_runtime_client = _bedrock.get("invalidate_runtime_client")
is_stale_connection_error = _bedrock.get("is_stale_connection_error")
normalize_converse_response = _bedrock.get("normalize_converse_response")
if _get_bedrock_runtime_client is None or normalize_converse_response is None:
raise ImportError("bedrock provider not registered")
from agent.bedrock_adapter import (
_get_bedrock_runtime_client,
invalidate_runtime_client,
is_stale_connection_error,
normalize_converse_response,
)
region = api_kwargs.pop("__bedrock_region__", "us-east-1")
api_kwargs.pop("__bedrock_converse__", None)
client = _get_bedrock_runtime_client(region)
@@ -698,11 +696,8 @@ def build_api_kwargs(agent, api_messages: list) -> dict:
_ant_max = None
if (_is_or or _is_nous) and "claude" in (agent.model or "").lower():
try:
from agent.plugin_registries import registries
_anthropic = registries.get_provider_namespace("anthropic")
_get_anthropic_max_output = _anthropic.get("_get_anthropic_max_output")
if _get_anthropic_max_output is not None:
_ant_max = _get_anthropic_max_output(agent.model)
from agent.anthropic_adapter import _get_anthropic_max_output
_ant_max = _get_anthropic_max_output(agent.model)
except Exception:
pass
@@ -1187,20 +1182,15 @@ def try_activate_fallback(agent, reason: "FailoverReason | None" = None) -> bool
if fb_api_mode == "anthropic_messages":
# Build native Anthropic client instead of using OpenAI client
from agent.plugin_registries import registries
_anthropic = registries.get_provider_namespace("anthropic")
build_anthropic_client = _anthropic.get("build_anthropic_client")
resolve_anthropic_token = _anthropic.get("resolve_anthropic_token")
_is_oauth_token = _anthropic.get("_is_oauth_token")
effective_key = (fb_client.api_key or (resolve_anthropic_token() if resolve_anthropic_token else "") or "") if fb_provider == "anthropic" else (fb_client.api_key or "")
from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token, _is_oauth_token
effective_key = (fb_client.api_key or resolve_anthropic_token() or "") if fb_provider == "anthropic" else (fb_client.api_key or "")
agent.api_key = effective_key
agent._anthropic_api_key = effective_key
agent._anthropic_base_url = fb_base_url
if build_anthropic_client is not None:
agent._anthropic_client = build_anthropic_client(
effective_key, agent._anthropic_base_url, timeout=_fb_timeout,
)
agent._is_anthropic_oauth = _is_oauth_token(effective_key) if fb_provider == "anthropic" and _is_oauth_token else False
agent._anthropic_client = build_anthropic_client(
effective_key, agent._anthropic_base_url, timeout=_fb_timeout,
)
agent._is_anthropic_oauth = _is_oauth_token(effective_key) if fb_provider == "anthropic" else False
agent.client = None
agent._client_kwargs = {}
else:
@@ -1584,14 +1574,12 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
def _bedrock_call():
try:
from agent.plugin_registries import registries
_bedrock = registries.get_provider_namespace("bedrock")
_get_bedrock_runtime_client = _bedrock.get("_get_bedrock_runtime_client")
invalidate_runtime_client = _bedrock.get("invalidate_runtime_client")
is_stale_connection_error = _bedrock.get("is_stale_connection_error")
stream_converse_with_callbacks = _bedrock.get("stream_converse_with_callbacks")
if _get_bedrock_runtime_client is None or stream_converse_with_callbacks is None:
raise ImportError("bedrock provider not registered")
from agent.bedrock_adapter import (
_get_bedrock_runtime_client,
invalidate_runtime_client,
is_stale_connection_error,
stream_converse_with_callbacks,
)
region = api_kwargs.pop("__bedrock_region__", "us-east-1")
api_kwargs.pop("__bedrock_converse__", None)
client = _get_bedrock_runtime_client(region)
+4 -4
@@ -27,7 +27,7 @@ import time
import uuid
from typing import Any, Dict, List, Optional
from agent.plugin_registries import registries as _registries
from agent.anthropic_adapter import _is_oauth_token
from agent.auxiliary_client import set_runtime_main
from agent.codex_responses_adapter import _summarize_user_message_for_log
from agent.display import KawaiiSpinner
@@ -2383,8 +2383,8 @@ def run_conversation(
and not anthropic_auth_retry_attempted
):
anthropic_auth_retry_attempted = True
_is_oauth_token = _registries.get_provider_service("anthropic", "_is_oauth_token")
is_token_provider = _registries.get_provider_service("azure", "is_token_provider")
from agent.anthropic_adapter import _is_oauth_token
from agent.azure_identity_adapter import is_token_provider
if agent._try_refresh_anthropic_client_credentials():
print(f"{agent.log_prefix}🔐 Anthropic credentials refreshed after 401. Retrying request...")
continue
@@ -2401,7 +2401,7 @@ def run_conversation(
print(f"{agent.log_prefix} Run `hermes doctor` for credential-chain diagnostics, or")
print(f"{agent.log_prefix} `az login` if your developer session expired.")
else:
auth_method = "Bearer (OAuth/setup-token)" if (_is_oauth_token is not None and _is_oauth_token(key)) else "x-api-key (API key)"
auth_method = "Bearer (OAuth/setup-token)" if _is_oauth_token(key) else "x-api-key (API key)"
print(f"{agent.log_prefix} Auth method: {auth_method}")
print(f"{agent.log_prefix} Token prefix: {key[:12]}..." if isinstance(key, str) and len(key) > 12 else f"{agent.log_prefix} Token: (empty or short)")
print(f"{agent.log_prefix} Troubleshooting:")
+196 -49
@@ -458,6 +458,43 @@ class CredentialPool:
self._persist()
return updated
    def _sync_anthropic_entry_from_credentials_file(self, entry: PooledCredential) -> PooledCredential:
        """Sync a claude_code pool entry from ~/.claude/.credentials.json if tokens differ.

        OAuth refresh tokens are single-use. When something external (e.g.
        Claude Code CLI, or another profile's pool) refreshes the token, it
        writes the new pair to ~/.claude/.credentials.json. The pool entry's
        refresh token becomes stale. This method detects that and syncs.
        """
        if self.provider != "anthropic" or entry.source != "claude_code":
            return entry
        try:
            from agent.anthropic_adapter import read_claude_code_credentials
            creds = read_claude_code_credentials()
            if not creds:
                return entry
            file_refresh = creds.get("refreshToken", "")
            file_access = creds.get("accessToken", "")
            file_expires = creds.get("expiresAt", 0)
            # If the credentials file has a different token pair, sync it
            if file_refresh and file_refresh != entry.refresh_token:
                logger.debug("Pool entry %s: syncing tokens from credentials file (refresh token changed)", entry.id)
                updated = replace(
                    entry,
                    access_token=file_access,
                    refresh_token=file_refresh,
                    expires_at_ms=file_expires,
                    last_status=None,
                    last_status_at=None,
                    last_error_code=None,
                )
                self._replace_entry(entry, updated)
                self._persist()
                return updated
        except Exception as exc:
            logger.debug("Failed to sync from credentials file: %s", exc)
        return entry
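Reviewer sketch (not part of the change): the sync rule above, reduced to a pure function over a simplified record. `Entry` is a hypothetical stand-in for `PooledCredential` with only the fields needed here, and the token values are placeholders.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical, simplified stand-in for PooledCredential."""
    access_token: str
    refresh_token: str
    expires_at_ms: int
    last_status: Optional[str] = None


def sync_from_file(entry: Entry, creds: Optional[dict]) -> Entry:
    """Adopt the file's token pair when its refresh token differs."""
    if not creds:
        return entry
    file_refresh = creds.get("refreshToken", "")
    if file_refresh and file_refresh != entry.refresh_token:
        # A different refresh token means another process already rotated
        # the single-use token, so the pool copy is stale.
        return replace(
            entry,
            access_token=creds.get("accessToken", ""),
            refresh_token=file_refresh,
            expires_at_ms=creds.get("expiresAt", 0),
            last_status=None,  # clear any earlier failure marker
        )
    return entry
```

Returning the original object when nothing changed lets a caller detect a sync by comparing refresh tokens, which is how the retry path later in this file decides whether to attempt the refresh again.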
def _sync_codex_entry_from_auth_store(self, entry: PooledCredential) -> PooledCredential:
"""Sync a Codex device_code pool entry from auth.json if tokens differ.
@@ -747,11 +784,32 @@ class CredentialPool:
return None
try:
# ── Plugin-registered credential pool hooks ──
from agent.plugin_registries import registries as _cph_reg2
_hook = _cph_reg2.get_credential_pool_hook(self.provider)
if _hook is not None and _hook.refresh_oauth is not None:
updated = _hook.refresh_oauth(entry, pool=self)
if self.provider == "anthropic":
from agent.anthropic_adapter import refresh_anthropic_oauth_pure
refreshed = refresh_anthropic_oauth_pure(
entry.refresh_token,
use_json=entry.source.endswith("hermes_pkce"),
)
updated = replace(
entry,
access_token=refreshed["access_token"],
refresh_token=refreshed["refresh_token"],
expires_at_ms=refreshed["expires_at_ms"],
)
# Keep ~/.claude/.credentials.json in sync so that the
# fallback path (resolve_anthropic_token) and other profiles
# see the latest tokens.
if entry.source == "claude_code":
try:
from agent.anthropic_adapter import _write_claude_code_credentials
_write_claude_code_credentials(
refreshed["access_token"],
refreshed["refresh_token"],
refreshed["expires_at_ms"],
)
except Exception as wexc:
logger.debug("Failed to write refreshed token to credentials file: %s", wexc)
elif self.provider == "openai-codex":
# Adopt fresher tokens from auth.json before spending the
# refresh_token — single-use tokens consumed by another Hermes
@@ -806,18 +864,46 @@ class CredentialPool:
return entry
except Exception as exc:
logger.debug("Credential refresh failed for %s/%s: %s", self.provider, entry.id, exc)
# ── Plugin-registered credential pool hooks ──
# The hook's refresh_oauth already handles retry-with-sync internally,
# so if we got here it means a non-hook provider failed.
from agent.plugin_registries import registries as _cph_reg3
_hook = _cph_reg3.get_credential_pool_hook(self.provider)
if _hook is not None and _hook.sync_from_credentials_file is not None:
# Give the hook a chance to sync from external file
synced = _hook.sync_from_credentials_file(entry)
if synced is not entry:
self._replace_entry(entry, synced)
entry = synced
self._persist()
# For anthropic claude_code entries: the refresh token may have been
# consumed by another process. Check if ~/.claude/.credentials.json
# has a newer token pair and retry once.
if self.provider == "anthropic" and entry.source == "claude_code":
synced = self._sync_anthropic_entry_from_credentials_file(entry)
if synced.refresh_token != entry.refresh_token:
logger.debug("Retrying refresh with synced token from credentials file")
try:
from agent.anthropic_adapter import refresh_anthropic_oauth_pure
refreshed = refresh_anthropic_oauth_pure(
synced.refresh_token,
use_json=synced.source.endswith("hermes_pkce"),
)
updated = replace(
synced,
access_token=refreshed["access_token"],
refresh_token=refreshed["refresh_token"],
expires_at_ms=refreshed["expires_at_ms"],
last_status=STATUS_OK,
last_status_at=None,
last_error_code=None,
)
self._replace_entry(synced, updated)
self._persist()
try:
from agent.anthropic_adapter import _write_claude_code_credentials
_write_claude_code_credentials(
refreshed["access_token"],
refreshed["refresh_token"],
refreshed["expires_at_ms"],
)
except Exception as wexc:
logger.debug("Failed to write refreshed token to credentials file (retry path): %s", wexc)
return updated
except Exception as retry_exc:
logger.debug("Retry refresh also failed: %s", retry_exc)
elif not self._entry_needs_refresh(synced):
# Credentials file had a valid (non-expired) token — use it directly
logger.debug("Credentials file has valid token, using without refresh")
return synced
# For xai-oauth: same race as nous — another process may have
# consumed the refresh token between our proactive sync and the
# HTTP call. Re-check auth.json and adopt the fresh tokens if
@@ -1038,11 +1124,10 @@ class CredentialPool:
def _entry_needs_refresh(self, entry: PooledCredential) -> bool:
if entry.auth_type != AUTH_TYPE_OAUTH:
return False
# ── Plugin-registered credential pool hooks ──
from agent.plugin_registries import registries as _cph_reg
_hook = _cph_reg.get_credential_pool_hook(self.provider)
if _hook is not None and _hook.needs_refresh is not None:
return _hook.needs_refresh(entry)
if self.provider == "anthropic":
if entry.expires_at_ms is None:
return False
return int(entry.expires_at_ms) <= int(time.time() * 1000) + 120_000
if self.provider == "openai-codex":
return _codex_access_token_is_expiring(
entry.access_token,
@@ -1075,16 +1160,12 @@ class CredentialPool:
cleared_any = False
available: List[PooledCredential] = []
for entry in self._entries:
# ── Plugin-registered credential pool hooks ──
# Sync exhausted entries from external credentials files before
# status/refresh checks. This picks up tokens refreshed by other
# processes (e.g. Claude Code CLI, other Hermes profiles).
from agent.plugin_registries import registries as _cph_reg4
_avail_hook = _cph_reg4.get_credential_pool_hook(self.provider)
if (_avail_hook is not None
and _avail_hook.sync_from_credentials_file is not None
# For anthropic claude_code entries, sync from the credentials file
# before any status/refresh checks. This picks up tokens refreshed
# by other processes (Claude Code CLI, other Hermes profiles).
if (self.provider == "anthropic" and entry.source == "claude_code"
and entry.last_status == STATUS_EXHAUSTED):
synced = _avail_hook.sync_from_credentials_file(entry)
synced = self._sync_anthropic_entry_from_credentials_file(entry)
if synced is not entry:
entry = synced
cleared_any = True
@@ -1434,15 +1515,84 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
def _is_suppressed(_p, _s): # type: ignore[misc]
return False
# ── Plugin-registered credential pool hooks ──
from agent.plugin_registries import registries as _cp_reg
_cp_hook = _cp_reg.get_credential_pool_hook(provider)
if _cp_hook is not None and _cp_hook.discover_credentials is not None:
hook_changed, hook_sources = _cp_hook.discover_credentials(
entries, provider, _is_suppressed,
if provider == "anthropic":
# Only auto-discover external credentials (Claude Code, Hermes PKCE)
# when the user has explicitly configured anthropic as their provider.
# Without this gate, auxiliary client fallback chains silently read
# ~/.claude/.credentials.json without user consent. See PR #4210.
try:
from hermes_cli.auth import is_provider_explicitly_configured
if not is_provider_explicitly_configured("anthropic"):
return changed, active_sources
except ImportError:
pass
# API-key vs OAuth is a user-visible choice at `hermes setup` ("Claude
# Pro/Max subscription" vs "Anthropic API key"). The signal that the
# user picked the API-key path is: ANTHROPIC_API_KEY set in the env,
# AND no OAuth env vars set — `save_anthropic_api_key()` writes the
# API key and zeros ANTHROPIC_TOKEN; `save_anthropic_oauth_token()`
# does the inverse. When that signal is present we MUST NOT seed
# autodiscovered OAuth tokens (~/.claude/.credentials.json from the
# Claude Code CLI, hermes_pkce creds from a previous OAuth login)
# into the anthropic pool — otherwise rotation on a 401/429 silently
# flips the session onto an OAuth credential, which forces the Claude
# Code identity injection, `mcp_` tool-name rewrite, and claude-cli
# User-Agent header (`agent/anthropic_adapter.py:2128`). Users who
# explicitly opted into the API-key path are explicitly opting OUT of
# that masquerade. Prefer ~/.hermes/.env over os.environ for the
# same reason `_seed_from_env` does — that's the authoritative file
# that `hermes setup` writes.
_env_file = load_env()
def _env_val(key: str) -> str:
return (_env_file.get(key) or os.environ.get(key) or "").strip()
anthropic_api_key = _env_val("ANTHROPIC_API_KEY")
anthropic_oauth_env = (
_env_val("ANTHROPIC_TOKEN") or _env_val("CLAUDE_CODE_OAUTH_TOKEN")
)
changed |= hook_changed
active_sources |= hook_sources
api_key_path_explicit = bool(anthropic_api_key and not anthropic_oauth_env)
if api_key_path_explicit:
# Prune any stale autodiscovered OAuth entries that may have been
# seeded into the on-disk pool during a previous OAuth session.
# Without this, switching OAuth -> API key at setup leaves the
# OAuth entries dormant in auth.json forever and rotation on a
# transient 401 could revive them.
retained = [
entry for entry in entries
if entry.source not in {"hermes_pkce", "claude_code"}
]
if len(retained) != len(entries):
entries[:] = retained
changed = True
return changed, active_sources
from agent.anthropic_adapter import read_claude_code_credentials, read_hermes_oauth_credentials
for source_name, creds in (
("hermes_pkce", read_hermes_oauth_credentials()),
("claude_code", read_claude_code_credentials()),
):
if creds and creds.get("accessToken"):
if _is_suppressed(provider, source_name):
continue
active_sources.add(source_name)
changed |= _upsert_entry(
entries,
provider,
source_name,
{
"source": source_name,
"auth_type": AUTH_TYPE_OAUTH,
"access_token": creds.get("accessToken", ""),
"refresh_token": creds.get("refreshToken"),
"expires_at_ms": creds.get("expiresAt"),
"label": label_from_token(creds.get("accessToken", ""), source_name),
},
)
elif provider == "nous":
state = _load_provider_state(auth_store, "nous")
has_runtime_material = bool(
@@ -1753,11 +1903,12 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
env_url = _get_env_prefer_dotenv(pconfig.base_url_env_var).rstrip("/")
env_vars = list(pconfig.api_key_env_vars)
# ── Plugin-registered credential pool hooks: env var order override ──
from agent.plugin_registries import registries as _env_reg
_env_hook = _env_reg.get_credential_pool_hook(provider)
if _env_hook is not None and _env_hook.env_var_order is not None:
env_vars = _env_hook.env_var_order
if provider == "anthropic":
env_vars = [
"ANTHROPIC_TOKEN",
"CLAUDE_CODE_OAUTH_TOKEN",
"ANTHROPIC_API_KEY",
]
for env_var in env_vars:
# Prefer ~/.hermes/.env over os.environ
@@ -1768,11 +1919,7 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
if _is_source_suppressed(provider, source):
continue
active_sources.add(source)
# ── Plugin-registered credential pool hooks: auth type detection ──
if _env_hook is not None and _env_hook.detect_auth_type is not None:
auth_type = _env_hook.detect_auth_type(token)
else:
auth_type = AUTH_TYPE_API_KEY
auth_type = AUTH_TYPE_OAUTH if provider == "anthropic" and not token.startswith("sk-ant-api") else AUTH_TYPE_API_KEY
base_url = env_url or pconfig.inference_base_url
if provider == "kimi-coding":
base_url = _resolve_kimi_base_url(token, pconfig.inference_base_url, env_url)
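The inline rule in the hunk above treats any Anthropic token that does not start with `sk-ant-api` as an OAuth credential, and every other provider's token as an API key. A minimal sketch of that rule in isolation follows. The constant values and the helper name `detect_auth_type_sketch` are assumptions made for illustration; the diff does not show the real constant values.

```python
# Assumed placeholder values; the real AUTH_TYPE_* constants are not shown in this diff.
AUTH_TYPE_OAUTH = "oauth"
AUTH_TYPE_API_KEY = "api_key"


def detect_auth_type_sketch(provider: str, token: str) -> str:
    """Hypothetical helper mirroring the one-line rule in _seed_from_env."""
    # Only Anthropic env tokens can be classified as OAuth here; an
    # "sk-ant-api" prefix marks a regular API key.
    if provider == "anthropic" and not token.startswith("sk-ant-api"):
        return AUTH_TYPE_OAUTH
    return AUTH_TYPE_API_KEY
```

Because the check is a prefix test, any Anthropic value without that prefix is classified as OAuth, including a malformed key.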
+134 -14
@@ -37,6 +37,8 @@ from __future__ import annotations
import base64
import logging
import mimetypes
import os
import re
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
@@ -46,6 +48,102 @@ logger = logging.getLogger(__name__)
_VALID_MODES = frozenset({"auto", "native", "text"})
# Image extensions used by extract_image_refs(). Kept tight on purpose — we
# only auto-attach things the model can actually see. Documents/archives are
# excluded because the gateway's broader extract_local_files() also routes
# them differently (send_document), and we don't want to attach a PDF as a
# vision part.
_IMAGE_EXTS = (
".png", ".jpg", ".jpeg", ".gif", ".webp", ".bmp", ".tiff", ".tif", ".heic",
)
_IMAGE_EXT_PATTERN = "|".join(e.lstrip(".") for e in _IMAGE_EXTS)
# Absolute / home-relative local image path. Matches the same shape gateway's
# extract_local_files() uses: anchors to ``~/`` or ``/``, ignores matches inside
# URLs (the ``(?<![/:\w.])`` lookbehind), and case-insensitive on the extension.
_LOCAL_IMAGE_PATH_RE = re.compile(
r"(?<![/:\w.])(?:~/|/)(?:[\w.\-]+/)*[\w.\-]+\.(?:" + _IMAGE_EXT_PATTERN + r")\b",
re.IGNORECASE,
)
# http(s) URL ending in an image extension (optionally followed by a
# query string). Case-insensitive on the extension. Strict ``http(s)://``
# scheme so we don't accidentally grab ``file://`` URLs or other shapes.
_IMAGE_URL_RE = re.compile(
r"https?://[^\s<>\"']+?\.(?:" + _IMAGE_EXT_PATTERN + r")(?:\?[^\s<>\"']*)?",
re.IGNORECASE,
)
def extract_image_refs(text: str) -> Tuple[List[str], List[str]]:
"""Scan free-form text for image references the model should see.
Returns ``(local_paths, urls)``:
* ``local_paths``: absolute (``/``) or home-relative (``~/``) paths
whose suffix is an image extension AND whose expanded form exists
on disk as a file. Order-preserving, deduplicated.
* ``urls``: ``http(s)://`` URLs whose path ends in an image
extension (a ``?query`` is allowed after the extension).
Order-preserving, deduplicated.
Matches inside fenced code blocks (``` ``` ```) and inline backticks
(`` `` ``) are skipped so that snippets pasted into a task body for
reference aren't mistaken for live attachments. This mirrors the
behaviour of ``gateway.platforms.base.BaseAdapter.extract_local_files``.
Local paths are validated against the filesystem; URLs are not
(the provider fetches them at request time).
"""
if not isinstance(text, str) or not text:
return [], []
# Build spans covered by fenced code blocks and inline code so we can
# ignore references the author embedded purely as example text.
code_spans: list[tuple[int, int]] = []
for m in re.finditer(r"```[^\n]*\n.*?```", text, re.DOTALL):
code_spans.append((m.start(), m.end()))
for m in re.finditer(r"`[^`\n]+`", text):
code_spans.append((m.start(), m.end()))
def _in_code(pos: int) -> bool:
return any(s <= pos < e for s, e in code_spans)
local_paths: list[str] = []
seen_paths: set[str] = set()
for match in _LOCAL_IMAGE_PATH_RE.finditer(text):
if _in_code(match.start()):
continue
raw = match.group(0)
expanded = os.path.expanduser(raw)
try:
if not os.path.isfile(expanded):
continue
except OSError:
# ENAMETOOLONG / EINVAL on pathological inputs — skip rather than crash.
continue
if expanded in seen_paths:
continue
seen_paths.add(expanded)
local_paths.append(expanded)
urls: list[str] = []
seen_urls: set[str] = set()
for match in _IMAGE_URL_RE.finditer(text):
if _in_code(match.start()):
continue
url = match.group(0)
# Strip trailing punctuation that's almost certainly prose, not part
# of the URL (e.g. "see https://x.com/a.png." or "/a.png)").
url = url.rstrip(".,;:!?)]>")
if url in seen_urls:
continue
seen_urls.add(url)
urls.append(url)
return local_paths, urls
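The URL half of the scan above can be exercised on its own. The sketch below is a simplified, self-contained approximation: it reuses the same regex shape and code-span skipping, but with a shortened extension list and a hypothetical function name `find_image_urls_sketch`. It leaves out the local-path branch, which depends on the filesystem.

```python
import re
from typing import List

# Shortened extension list for illustration; the module above uses a longer tuple.
_EXT_PATTERN = "png|jpg|jpeg|gif|webp"
_URL_RE = re.compile(
    r"https?://[^\s<>\"']+?\.(?:" + _EXT_PATTERN + r")(?:\?[^\s<>\"']*)?",
    re.IGNORECASE,
)


def find_image_urls_sketch(text: str) -> List[str]:
    """Return image URLs found in prose, skipping code spans and duplicates."""
    # Collect spans covered by fenced blocks and inline code.
    code_spans = [(m.start(), m.end())
                  for m in re.finditer(r"```[^\n]*\n.*?```", text, re.DOTALL)]
    code_spans += [(m.start(), m.end())
                   for m in re.finditer(r"`[^`\n]+`", text)]

    urls: List[str] = []
    seen = set()
    for match in _URL_RE.finditer(text):
        if any(s <= match.start() < e for s, e in code_spans):
            continue  # reference sits inside example code
        url = match.group(0).rstrip(".,;:!?)]>")  # drop trailing prose punctuation
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

With this sketch, a URL written in prose is returned once, while the same kind of URL wrapped in inline backticks is ignored.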
# Strict YAML/JSON boolean coercion for capability overrides.
#
# ``bool("false")`` is True in Python because non-empty strings are truthy, so
@@ -320,20 +418,29 @@ def _file_to_data_url(path: Path) -> Optional[str]:
def build_native_content_parts(
user_text: str,
image_paths: List[str],
image_urls: Optional[List[str]] = None,
) -> Tuple[List[Dict[str, Any]], List[str]]:
"""Build an OpenAI-style ``content`` list for a user turn.
Shape:
[{"type": "text", "text": "...\\n\\n[Image attached at: /local/path]"},
{"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}},
{"type": "image_url", "image_url": {"url": "https://example.com/a.png"}},
...]
The local path of each successfully attached image is appended to the
text part as ``[Image attached at: <path>]``. The model still sees the
pixels via the ``image_url`` part (full native vision); the path note
just gives it a string handle so MCP/skill tools that take an image
path or URL argument can be invoked on the same image without an
extra round-trip. This parallels the text-mode hint produced by
Local paths are read from disk and embedded as base64 ``data:`` URLs.
Remote URLs (``http(s)://``) are passed through verbatim; the provider
fetches them server-side. The model still sees the pixels either way.
For each successfully attached image, a hint is appended to the text
part:
* local path: ``[Image attached at: <path>]``
* URL: ``[Image attached: <url>]``
The hint gives the model a string handle so MCP/skill tools that take
an image path or URL argument can be invoked on the same image without
an extra round-trip. This parallels the text-mode hint produced by
``Runner._enrich_message_with_vision`` (``vision_analyze using image_url:
<path>``) so behaviour is consistent across both image input modes.
@@ -342,12 +449,14 @@ def build_native_content_parts(
ceiling), the agent's retry loop transparently shrinks and retries
once; see ``run_agent._try_shrink_image_parts_in_messages``.
Returns (content_parts, skipped_paths). Skipped paths are files that
couldn't be read from disk and are NOT advertised in the path hints.
Returns (content_parts, skipped). Skipped entries are local paths
that couldn't be read from disk; URLs are never skipped (they're
not validated here).
"""
skipped: List[str] = []
image_parts: List[Dict[str, Any]] = []
attached_paths: List[str] = []
attached_urls: List[str] = []
for raw_path in image_paths:
p = Path(raw_path)
@@ -364,16 +473,26 @@ def build_native_content_parts(
})
attached_paths.append(str(raw_path))
for url in image_urls or []:
url = (url or "").strip()
if not url:
continue
image_parts.append({
"type": "image_url",
"image_url": {"url": url},
})
attached_urls.append(url)
text = (user_text or "").strip()
# If at least one image attached, build a single text part that combines
# the user's caption (or a neutral default) with one path hint per image.
if attached_paths:
# the user's caption (or a neutral default) with one hint per image.
if attached_paths or attached_urls:
base_text = text or "What do you see in this image?"
path_hints = "\n".join(
f"[Image attached at: {p}]" for p in attached_paths
)
combined_text = f"{base_text}\n\n{path_hints}"
hint_lines: List[str] = []
hint_lines.extend(f"[Image attached at: {p}]" for p in attached_paths)
hint_lines.extend(f"[Image attached: {u}]" for u in attached_urls)
combined_text = f"{base_text}\n\n" + "\n".join(hint_lines)
parts: List[Dict[str, Any]] = [{"type": "text", "text": combined_text}]
parts.extend(image_parts)
return parts, skipped
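To make the resulting shape concrete, here is a reduced sketch of the hint-building step. It assumes the local images were already attached elsewhere, so paths only contribute hint lines and no base64 encoding happens here. The function name `build_parts_sketch` and the text-only fallback for the no-image case are assumptions for illustration; that branch is not visible in this hunk.

```python
from typing import Any, Dict, List


def build_parts_sketch(
    user_text: str,
    attached_paths: List[str],
    image_urls: List[str],
) -> List[Dict[str, Any]]:
    """Hypothetical reduction of the URL pass-through and hint logic above."""
    image_parts: List[Dict[str, Any]] = []
    attached_urls: List[str] = []
    for url in image_urls:
        url = (url or "").strip()
        if not url:
            continue  # blank entries are ignored, not reported as skipped
        image_parts.append({"type": "image_url", "image_url": {"url": url}})
        attached_urls.append(url)

    text = (user_text or "").strip()
    if not (attached_paths or attached_urls):
        # Assumed fallback: plain text part only.
        return [{"type": "text", "text": text}]

    base_text = text or "What do you see in this image?"
    hint_lines = [f"[Image attached at: {p}]" for p in attached_paths]
    hint_lines += [f"[Image attached: {u}]" for u in attached_urls]
    parts: List[Dict[str, Any]] = [
        {"type": "text", "text": base_text + "\n\n" + "\n".join(hint_lines)}
    ]
    parts.extend(image_parts)
    return parts
```

Path hints come before URL hints, matching the order in which the two lists are extended in the hunk.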
@@ -388,4 +507,5 @@ def build_native_content_parts(
__all__ = [
"decide_image_input_mode",
"build_native_content_parts",
"extract_image_refs",
]
+2 -5
@@ -1567,11 +1567,8 @@ def get_model_context_length(
and base_url_host_matches(base_url, "amazonaws.com")
):
try:
from agent.plugin_registries import registries
_bedrock = registries.get_provider_namespace("bedrock")
get_bedrock_context_length = _bedrock.get("get_bedrock_context_length")
if get_bedrock_context_length is not None:
return get_bedrock_context_length(model)
from agent.bedrock_adapter import get_bedrock_context_length
return get_bedrock_context_length(model)
except ImportError:
pass # boto3 not installed — fall through to generic resolution
-586
@@ -1,586 +0,0 @@
"""Plugin capability registries.
Each plugin's ``register(ctx)`` function populates these registries via
``ctx.register_<capability>()``. The core codebase then queries the
registries instead of importing from plugin packages directly.
This is the **only** coupling point between the core and plugins: the core
imports from ``agent.plugin_registries``, never from ``hermes_agent_*``.
"""
from __future__ import annotations
from dataclasses import dataclass, field
from typing import (
Any,
Callable,
Dict,
List,
Optional,
Protocol,
Sequence,
Tuple,
Type,
runtime_checkable,
)
# ---------------------------------------------------------------------------
# Auth providers
# ---------------------------------------------------------------------------
@runtime_checkable
class AuthProvider(Protocol):
"""A plugin that can provide or check authentication credentials.
Registered via ``ctx.register_auth_provider(name, provider)``.
Queried by ``hermes_cli/auth_commands.py``, ``doctor.py``, etc.
"""
@property
def name(self) -> str: ...
def has_credentials(self) -> bool:
"""Return True if the required credentials are present in env/config."""
...
def check_env_vars(self) -> Dict[str, str | None]:
"""Return a dict of env-var-name → current-value (or None if unset).
Used by ``hermes doctor`` to display credential status.
"""
...
def resolve_token(self, **kwargs: Any) -> Any:
"""Resolve and return an auth token/credential for the provider.
The return type is provider-specific (string, tuple, object, etc.).
"""
...
def refresh_token(self, **kwargs: Any) -> Any:
"""Refresh an existing token. Raises if refresh is not supported."""
...
@dataclass
class AuthProviderEntry:
provider: AuthProvider
"""The auth provider instance."""
cli_group: str = ""
"""CLI argument group name (e.g. 'Anthropic', 'AWS / Bedrock')."""
setup_subcommands: bool = False
"""Whether this provider adds CLI auth subcommands (login, logout, etc.)."""
# ---------------------------------------------------------------------------
# Transport builders
# ---------------------------------------------------------------------------
@runtime_checkable
class TransportBuilder(Protocol):
"""A plugin that builds clients and converts messages for a model transport.
Registered via ``ctx.register_transport(name, builder)``.
Queried by ``agent/transports/`` and ``agent/auxiliary_client.py``.
"""
def build_client(self, **kwargs: Any) -> Any:
"""Build and return a provider-specific API client."""
...
def build_kwargs(self, **kwargs: Any) -> Dict[str, Any]:
"""Build the kwargs dict for a provider-specific API call."""
...
def convert_messages(self, messages: Sequence[Any], **kwargs: Any) -> Any:
"""Convert internal message format to provider-specific format."""
...
def convert_tools(self, tools: Sequence[Any], **kwargs: Any) -> Any:
"""Convert internal tool format to provider-specific format."""
...
def normalize_response(self, response: Any, **kwargs: Any) -> Any:
"""Normalize a provider-specific response into the internal format."""
...
# ---------------------------------------------------------------------------
# Platform adapters
# ---------------------------------------------------------------------------
@dataclass
class PlatformAdapterEntry:
"""A registered platform adapter.
Registered via ``ctx.register_platform(name, entry)``.
Queried by ``gateway/run.py`` and ``tools/send_message_tool.py``.
"""
name: str
"""Platform identifier (e.g. 'telegram', 'slack')."""
adapter_class: Type
"""The adapter class (e.g. TelegramAdapter)."""
check_requirements: Callable[[], bool]
"""Check if the platform's dependencies are installed and configured."""
available_flag: str = ""
"""Name of the module-level AVAILABLE boolean, if any."""
constants: Dict[str, Any] = field(default_factory=dict)
"""Platform-specific constants (e.g. FEISHU_DOMAIN, LARK_DOMAIN)."""
helper_functions: Dict[str, Callable] = field(default_factory=dict)
"""Platform-specific helper functions (e.g. probe_bot, qr_register)."""
# ---------------------------------------------------------------------------
# Tool providers
# ---------------------------------------------------------------------------
@dataclass
class ToolProviderEntry:
"""A registered tool provider.
Registered via ``ctx.register_tool_provider(name, entry)``.
Queried by ``tools/`` modules.
"""
name: str
"""Tool identifier (e.g. 'tts', 'stt', 'fal', 'daytona')."""
tool_functions: Dict[str, Callable] = field(default_factory=dict)
"""Tool functions keyed by name (e.g. 'text_to_speech_tool', 'transcribe_audio')."""
check_fn: Optional[Callable] = None
"""Check if the tool's dependencies are available."""
constants: Dict[str, Any] = field(default_factory=dict)
"""Tool-specific constants (e.g. MAX_FILE_SIZE)."""
config_functions: Dict[str, Callable] = field(default_factory=dict)
"""Config/utility functions (e.g. _get_provider, _load_stt_config)."""
environment_classes: Dict[str, Type] = field(default_factory=dict)
"""Environment classes for terminal backends (e.g. DaytonaEnvironment)."""
# ---------------------------------------------------------------------------
# Model metadata providers
# ---------------------------------------------------------------------------
@dataclass
class ModelMetadataEntry:
"""A registered model metadata provider.
Registered via ``ctx.register_model_metadata(name, entry)``.
Queried by ``agent/model_metadata.py`` and CLI model commands.
"""
name: str
"""Provider identifier (e.g. 'anthropic', 'bedrock')."""
get_context_length: Optional[Callable[[str], int | None]] = None
"""Return the context length for a model name, or None if unknown."""
list_models: Optional[Callable[[], List[str]]] = None
"""Return a list of known model IDs for this provider."""
constants: Dict[str, Any] = field(default_factory=dict)
"""Provider-specific constants (e.g. _COMMON_BETAS, betas lists)."""
# ---------------------------------------------------------------------------
# Credential pool entries
# ---------------------------------------------------------------------------
@dataclass
class CredentialPoolEntry:
"""A registered credential pool provider.
Registered via ``ctx.register_credential_pool(name, entry)``.
Queried by ``agent/credential_pool.py``.
"""
name: str
"""Provider identifier (e.g. 'anthropic')."""
read_credentials: Optional[Callable] = None
"""Read stored credentials."""
write_credentials: Optional[Callable] = None
"""Write/store credentials."""
refresh_credentials: Optional[Callable] = None
"""Refresh stored credentials."""
read_oauth: Optional[Callable] = None
"""Read OAuth credentials."""
# ---------------------------------------------------------------------------
# Provider resolvers
# ---------------------------------------------------------------------------
@runtime_checkable
class ProviderResolver(Protocol):
"""A plugin that resolves an auxiliary client for a specific provider.
Registered via ``ctx.register_provider_resolver(provider_name, resolver)``.
Queried by ``agent/auxiliary_client.py`` in ``resolve_provider_client()``.
"""
def __call__(
self,
*,
model: str | None = None,
explicit_api_key: str | None = None,
explicit_base_url: str | None = None,
async_mode: bool = False,
is_vision: bool = False,
main_runtime: dict | None = None,
api_mode: str | None = None,
) -> tuple[Any, str] | tuple[None, None]:
"""Return ``(client, default_model)`` or ``(None, None)`` if unavailable."""
...
# ---------------------------------------------------------------------------
# Credential pool hooks
# ---------------------------------------------------------------------------
@dataclass
class CredentialPoolHook:
"""Provider-specific credential pool operations.
Registered via ``ctx.register_credential_pool_hook(provider_name, hook)``.
Queried by ``agent/credential_pool.py``.
"""
sync_from_credentials_file: Optional[Callable] = None
"""Sync a pool entry from an external credentials file (e.g. ~/.claude/.credentials.json)."""
refresh_oauth: Optional[Callable] = None
"""Refresh an OAuth token for a pool entry."""
should_include_in_pool: Optional[Callable] = None
"""Return True if this provider's credentials should be included in the pool."""
needs_refresh: Optional[Callable] = None
"""Return True if an OAuth entry needs a token refresh."""
source_priority: Optional[Callable] = None
"""Return integer priority for a credential source (lower = preferred)."""
discover_credentials: Optional[Callable] = None
"""Discover external credentials and upsert into the pool entries.
Signature: (entries: list, provider: str, is_suppressed: Callable) -> (changed: bool, active_sources: set)
"""
env_var_order: Optional[list] = None
"""Override env var scan order for this provider (e.g. ['ANTHROPIC_TOKEN', 'CLAUDE_CODE_OAUTH_TOKEN', 'ANTHROPIC_API_KEY'])."""
detect_auth_type: Optional[Callable] = None
"""Given a token string, return the auth type for this provider.
Signature: (token: str) -> str (e.g. AUTH_TYPE_OAUTH or AUTH_TYPE_API_KEY)
"""
# ---------------------------------------------------------------------------
# Pricing providers
# ---------------------------------------------------------------------------
# Re-export PricingEntry from usage_pricing — that's the canonical definition
# with Decimal fields. The registry stores these directly keyed by (provider, model).
# Lazy import to avoid circular dependency (usage_pricing imports registries at runtime).
def _get_pricing_entry_class():
from agent.usage_pricing import PricingEntry
return PricingEntry
# ---------------------------------------------------------------------------
# Provider overlays
# ---------------------------------------------------------------------------
@dataclass
class ProviderOverlayEntry:
"""A provider overlay registered by a plugin.
Registered via ``ctx.register_provider_overlay(provider_name, entry)``.
Queried by ``hermes_cli/providers.py``.
This mirrors the fields of ``HermesOverlay`` so that providers.py
can merge plugin-registered overlays seamlessly.
"""
provider_name: str
"""Primary provider name (e.g. 'anthropic', 'bedrock')."""
transport: str = "openai_chat"
"""Transport type: openai_chat | anthropic_messages | codex_responses | bedrock_converse"""
is_aggregator: bool = False
"""Whether this provider aggregates multiple model providers."""
auth_type: str = "api_key"
"""Auth type: api_key | oauth_device_code | oauth_external | aws_sdk | external_process"""
extra_env_vars: Tuple[str, ...] = ()
"""Environment variable names that indicate this provider is configured."""
base_url_override: str = ""
"""Override if models.dev URL is wrong/missing."""
base_url_env_var: str = ""
"""Env var for user-custom base URL."""
display_name: str = ""
"""Human-readable name for the provider (e.g. 'Anthropic', 'AWS Bedrock')."""
aliases: List[str] = field(default_factory=list)
"""Alternative names that resolve to this provider."""
# ---------------------------------------------------------------------------
# The global registries (singleton)
# ---------------------------------------------------------------------------
class PluginRegistries:
"""Central store for all plugin-registered capabilities.
A single instance is created at import time and shared across the
process. Plugins populate it during ``register()``; the core
queries it at runtime.
"""
def __init__(self) -> None:
self.auth_providers: Dict[str, AuthProviderEntry] = {}
self.transport_builders: Dict[str, TransportBuilder] = {}
self._transports: Dict[str, type] = {}
self.platform_adapters: Dict[str, PlatformAdapterEntry] = {}
self.tool_providers: Dict[str, ToolProviderEntry] = {}
self.model_metadata: Dict[str, ModelMetadataEntry] = {}
self.credential_pools: Dict[str, CredentialPoolEntry] = {}
self._provider_services: Dict[str, Dict[str, Any]] = {}
self._provider_resolvers: Dict[str, Callable] = {}
self._credential_pool_hooks: Dict[str, CredentialPoolHook] = {}
self._pricing_providers: Dict[tuple, Any] = {}
self._provider_overlays: Dict[str, ProviderOverlayEntry] = {}
# -- registration methods (called from PluginContext) --------------------
def register_auth_provider(
self,
name: str,
provider: AuthProvider,
*,
cli_group: str = "",
setup_subcommands: bool = False,
) -> None:
self.auth_providers[name] = AuthProviderEntry(
provider=provider,
cli_group=cli_group,
setup_subcommands=setup_subcommands,
)
def register_transport(self, name: str, builder: TransportBuilder) -> None:
self.transport_builders[name] = builder
def register_platform(self, entry: PlatformAdapterEntry) -> None:
self.platform_adapters[entry.name] = entry
def register_tool_provider(self, entry: ToolProviderEntry) -> None:
self.tool_providers[entry.name] = entry
def register_model_metadata(self, entry: ModelMetadataEntry) -> None:
self.model_metadata[entry.name] = entry
def register_credential_pool(self, entry: CredentialPoolEntry) -> None:
self.credential_pools[entry.name] = entry
def register_provider_resolver(self, name: str, resolver: Callable) -> None:
"""Register a provider resolver callable.
The resolver is called by ``resolve_provider_client()`` to create an
auxiliary client for a specific provider. Signature::
def resolver(
*,
model: str | None,
explicit_api_key: str | None,
explicit_base_url: str | None,
async_mode: bool,
is_vision: bool,
main_runtime: dict | None,
api_mode: str | None,
) -> tuple[Any, str] | tuple[None, None]:
...
Returns ``(client, default_model)`` or ``(None, None)``.
"""
self._provider_resolvers[name] = resolver
def register_credential_pool_hook(self, name: str, hook: CredentialPoolHook) -> None:
"""Register a credential pool hook for provider-specific pool operations."""
self._credential_pool_hooks[name] = hook
def register_pricing_provider(self, name: str, entries: List[tuple]) -> None:
"""Register pricing entries for a provider.
Each entry is a (provider, model, PricingEntry) tuple so the
lookup key matches the (provider, model) pattern used by
_OFFICIAL_DOCS_PRICING.
"""
for prov, model, entry in entries:
self._pricing_providers[(prov, model)] = entry
def register_provider_overlay(self, entry: ProviderOverlayEntry) -> None:
"""Register a provider overlay entry from a plugin."""
self._provider_overlays[entry.provider_name] = entry
# -- query helpers -------------------------------------------------------
def get_auth_provider(self, name: str) -> AuthProviderEntry | None:
return self.auth_providers.get(name)
def get_transport(self, name: str) -> TransportBuilder | None:
return self.transport_builders.get(name)
def get_platform(self, name: str) -> PlatformAdapterEntry | None:
return self.platform_adapters.get(name)
def get_tool_provider(self, name: str) -> ToolProviderEntry | None:
return self.tool_providers.get(name)
def get_model_metadata(self, name: str) -> ModelMetadataEntry | None:
return self.model_metadata.get(name)
def get_credential_pool(self, name: str) -> CredentialPoolEntry | None:
return self.credential_pools.get(name)
def get_provider_resolver(self, name: str) -> Callable | None:
"""Return the registered resolver for a provider, or None."""
return self._provider_resolvers.get(name)
def get_credential_pool_hook(self, name: str) -> CredentialPoolHook | None:
"""Return the registered credential pool hook for a provider, or None."""
return self._credential_pool_hooks.get(name)
def get_pricing_entry(self, provider: str, model: str) -> Any:
"""Return a registered pricing entry for (provider, model), or None."""
return self._pricing_providers.get((provider, model))
def all_pricing_entries(self) -> Dict[tuple, Any]:
"""Return all registered pricing entries (keyed by (provider, model))."""
return dict(self._pricing_providers)
def get_provider_overlay(self, name: str) -> ProviderOverlayEntry | None:
"""Return a registered provider overlay, or None."""
return self._provider_overlays.get(name)
def all_provider_overlays(self) -> Dict[str, ProviderOverlayEntry]:
"""Return all registered provider overlays."""
return dict(self._provider_overlays)
def all_auth_providers(self) -> List[AuthProviderEntry]:
return list(self.auth_providers.values())
def all_platforms(self) -> List[PlatformAdapterEntry]:
return list(self.platform_adapters.values())
def all_tool_providers(self) -> List[ToolProviderEntry]:
return list(self.tool_providers.values())
# -- provider services (model-provider namespace) -----------------------
def register_provider_services(self, name: str, services: Dict[str, Any]) -> None:
"""Register a namespace dict of provider-specific services.
This is the escape hatch for model-provider plugins that expose many
symbols (anthropic has 50+). Each plugin registers its public surface
as a flat dict of ``{symbol_name: callable_or_value}``. Core code
looks up specific symbols instead of importing from the plugin
package directly.
Each callable value is stored as a *lazy module-attribute reference*
so that ``unittest.mock.patch("pkg.mod.fn")`` works correctly in
tests: the registry re-reads ``mod.fn`` on every lookup instead of
capturing the function object at register time.
Example::
registries.register_provider_services("anthropic", {
"build_anthropic_client": build_anthropic_client,
"resolve_anthropic_token": resolve_anthropic_token,
"_is_oauth_token": _is_oauth_token,
...
})
"""
import sys
def _make_lazy(fn: Any) -> Any:
"""Return a lazy wrapper that re-reads fn from its module each call.
This makes mock.patch() on the module attribute work transparently:
the registry never caches the function object, just the reference path.
"""
if not callable(fn):
return fn
module = getattr(fn, "__module__", None)
qualname = getattr(fn, "__qualname__", None)
if not module or not qualname or "." in qualname:
# non-simple attribute (lambda, nested fn, class method) — store directly
return fn
class _LazyRef:
__slots__ = ("_mod", "_attr", "_fallback")
def __init__(self, mod: str, attr: str, fallback: Any) -> None:
self._mod = mod
self._attr = attr
self._fallback = fallback
def _resolve(self) -> Any:
mod = sys.modules.get(self._mod)
return getattr(mod, self._attr, self._fallback) if mod else self._fallback
def __call__(self, *args: Any, **kwargs: Any) -> Any:
return self._resolve()(*args, **kwargs)
def __getattr__(self, name: str) -> Any:
if name.startswith("_"):
raise AttributeError(name)
return getattr(self._resolve(), name)
def __repr__(self) -> str: # pragma: no cover
return f"<LazyRef {self._mod}.{self._attr}>"
# Allow isinstance checks and hasattr to pass through
def __bool__(self) -> bool:
return True
return _LazyRef(module, qualname, fn)
self._provider_services[name] = {k: _make_lazy(v) for k, v in services.items()}
def get_provider_service(self, provider: str, name: str) -> Any:
"""Look up a single symbol from a provider's service namespace.
Returns ``None`` if the provider is not registered or the symbol
doesn't exist.
"""
ns = self._provider_services.get(provider)
if ns is None:
return None
return ns.get(name)
def get_provider_namespace(self, provider: str) -> Dict[str, Any]:
"""Return the full service namespace dict for a provider (empty dict if unregistered)."""
return self._provider_services.get(provider, {})
# Module-level singleton — the one and only instance.
registries = PluginRegistries()
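The lazy-reference idea described in the ``register_provider_services`` docstring can be illustrated in isolation. The following is a minimal sketch, not the registry's actual code: the ``demo_plugin`` module and its ``greet`` function are hypothetical and built in memory so the example is self-contained, and reassigning the module attribute stands in for what ``unittest.mock.patch`` would do.

```python
import sys
import types
from typing import Any


class LazyRef:
    """Re-read ``module.attr`` on every call instead of caching the function."""

    def __init__(self, mod: str, attr: str, fallback: Any) -> None:
        self._mod = mod
        self._attr = attr
        self._fallback = fallback

    def _resolve(self) -> Any:
        # Look the module up by name each time; fall back to the original
        # object if the module or attribute has disappeared.
        mod = sys.modules.get(self._mod)
        return getattr(mod, self._attr, self._fallback) if mod else self._fallback

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return self._resolve()(*args, **kwargs)


# Hypothetical plugin module (an assumption for this sketch only).
demo = types.ModuleType("demo_plugin")
demo.greet = lambda: "real"
sys.modules["demo_plugin"] = demo

ref = LazyRef("demo_plugin", "greet", demo.greet)
before = ref()                      # resolves demo_plugin.greet at call time
demo.greet = lambda: "patched"      # simulates patching the module attribute
after = ref()                       # picks up the replacement without re-registering
```

Because the wrapper stores only the module name and attribute name, a test that patches the attribute sees its replacement on the next call, which is the behaviour the docstring is after.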
+2 -12
@@ -47,16 +47,9 @@ def get_transport(api_mode: str):
def _discover_transports() -> None:
"""Import all transport modules to trigger auto-registration.
Also checks the plugin registry for transports registered by plugins
(e.g. anthropic_messages from the anthropic plugin, bedrock_converse
from the bedrock plugin). Plugin-registered transports take priority
over core fallbacks when both exist.
"""
"""Import all transport modules to trigger auto-registration."""
global _discovered
_discovered = True
# Core transport modules (registered automatically — no plugin needed)
try:
import agent.transports.anthropic # noqa: F401
except ImportError:
@@ -69,10 +62,7 @@ def _discover_transports() -> None:
import agent.transports.chat_completions # noqa: F401
except ImportError:
pass
# Plugin-registered transports (override core fallbacks)
try:
from agent.plugin_registries import registries
for api_mode, transport_cls in registries._transports.items():
_REGISTRY.setdefault(api_mode, transport_cls)
import agent.transports.bedrock # noqa: F401
except ImportError:
pass
+65 -31
@@ -1,53 +1,41 @@
"""Anthropic Messages API transport — core module.
"""Anthropic Messages API transport.
Owns format conversion and response normalization for the ``anthropic_messages``
wire format. No SDK dependency; all wire-format logic lives in
:mod:`agent.anthropic_format`.
Delegates to the existing adapter functions in agent/anthropic_adapter.py.
This transport owns format conversion and normalization, NOT client lifecycle.
"""
import json
from typing import Any, Dict, List, Optional
from agent.anthropic_format import (
build_anthropic_kwargs,
convert_messages_to_anthropic,
convert_tools_to_anthropic,
_to_plain_data,
)
from agent.transports.base import ProviderTransport
from agent.transports.types import NormalizedResponse, ToolCall
from agent.transports.types import NormalizedResponse
class AnthropicTransport(ProviderTransport):
"""Transport for api_mode='anthropic_messages'.
Uses core functions directly from :mod:`agent.anthropic_format`; no
plugin registry lookups needed. This means core tests, bedrock tests,
and any other consumer of the anthropic wire format work without the
anthropic plugin being registered.
Wraps the existing functions in anthropic_adapter.py behind the
ProviderTransport ABC. Each method delegates; no logic is duplicated.
"""
_STOP_REASON_MAP = {
"end_turn": "stop",
"tool_use": "tool_calls",
"max_tokens": "length",
"stop_sequence": "stop",
"refusal": "content_filter",
"model_context_window_exceeded": "length",
}
@property
def api_mode(self) -> str:
return "anthropic_messages"
def convert_messages(self, messages: List[Dict[str, Any]], **kwargs) -> Any:
"""Convert OpenAI messages to Anthropic (system, messages) tuple."""
"""Convert OpenAI messages to Anthropic (system, messages) tuple.
kwargs:
base_url: Optional[str]; affects thinking signature handling.
"""
from agent.anthropic_adapter import convert_messages_to_anthropic
base_url = kwargs.get("base_url")
return convert_messages_to_anthropic(messages, base_url=base_url,
model=kwargs.get("model"))
return convert_messages_to_anthropic(messages, base_url=base_url)
def convert_tools(self, tools: List[Dict[str, Any]]) -> Any:
"""Convert OpenAI tool schemas to Anthropic input_schema format."""
from agent.anthropic_adapter import convert_tools_to_anthropic
return convert_tools_to_anthropic(tools)
def build_kwargs(
@@ -57,7 +45,23 @@ class AnthropicTransport(ProviderTransport):
tools: Optional[List[Dict[str, Any]]] = None,
**params,
) -> Dict[str, Any]:
"""Build Anthropic messages.create() kwargs."""
"""Build Anthropic messages.create() kwargs.
Calls convert_messages and convert_tools internally.
params (all optional):
max_tokens: int
reasoning_config: dict | None
tool_choice: str | None
is_oauth: bool
preserve_dots: bool
context_length: int | None
base_url: str | None
fast_mode: bool
drop_context_1m_beta: bool
"""
from agent.anthropic_adapter import build_anthropic_kwargs
return build_anthropic_kwargs(
model=model,
messages=messages,
@@ -74,7 +78,15 @@ class AnthropicTransport(ProviderTransport):
)
def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:
"""Normalize Anthropic response to NormalizedResponse."""
"""Normalize Anthropic response to NormalizedResponse.
Parses content blocks (text, thinking, tool_use), maps stop_reason
to OpenAI finish_reason, and collects reasoning_details in provider_data.
"""
import json
from agent.anthropic_adapter import _to_plain_data
from agent.transports.types import ToolCall
strip_tool_prefix = kwargs.get("strip_tool_prefix", False)
_MCP_PREFIX = "mcp_"
@@ -95,6 +107,12 @@ class AnthropicTransport(ProviderTransport):
name = block.name
if strip_tool_prefix and name.startswith(_MCP_PREFIX):
stripped = name[len(_MCP_PREFIX):]
# Only strip the mcp_ prefix for OAuth-injected tools
# (where Hermes adds the prefix when sending to Anthropic
# and must remove it on the way back). Native MCP server
# tools (from mcp_servers: in config.yaml) are registered
# in the tool registry under their FULL mcp_<server>_<tool>
# name and must NOT be stripped. GH-25255.
from tools.registry import registry as _tool_registry
if (_tool_registry.get_entry(stripped)
and not _tool_registry.get_entry(name)):
@@ -123,7 +141,13 @@ class AnthropicTransport(ProviderTransport):
)
def validate_response(self, response: Any) -> bool:
"""Check Anthropic response structure is valid."""
"""Check Anthropic response structure is valid.
An empty content list is legitimate when ``stop_reason == "end_turn"``,
which is the model's canonical way of signalling "nothing more to add" after
a tool turn that already delivered the user-facing text. Treating it
as invalid falsely retries a completed response.
"""
if response is None:
return False
content_blocks = getattr(response, "content", None)
@@ -144,6 +168,16 @@ class AnthropicTransport(ProviderTransport):
return {"cached_tokens": cached, "creation_tokens": written}
return None
# Promote the adapter's canonical mapping to module level so it's shared
_STOP_REASON_MAP = {
"end_turn": "stop",
"tool_use": "tool_calls",
"max_tokens": "length",
"stop_sequence": "stop",
"refusal": "content_filter",
"model_context_window_exceeded": "length",
}
def map_finish_reason(self, raw_reason: str) -> str:
"""Map Anthropic stop_reason to OpenAI finish_reason."""
return self._STOP_REASON_MAP.get(raw_reason, "stop")
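As a quick illustration of the stop-reason mapping shown in this diff, here is a standalone sketch. The dictionary contents are copied from the diff; the free function ``map_finish_reason`` is a simplification of the transport method, written here only to show the default-to-``"stop"`` behaviour for unrecognised reasons.

```python
# Mapping copied from the diff: Anthropic stop_reason -> OpenAI finish_reason.
STOP_REASON_MAP = {
    "end_turn": "stop",
    "tool_use": "tool_calls",
    "max_tokens": "length",
    "stop_sequence": "stop",
    "refusal": "content_filter",
    "model_context_window_exceeded": "length",
}


def map_finish_reason(raw_reason: str) -> str:
    """Return the OpenAI-style finish reason, defaulting to "stop"."""
    return STOP_REASON_MAP.get(raw_reason, "stop")


tool_reason = map_finish_reason("tool_use")
unknown_reason = map_finish_reason("some_future_reason")
```

The fallback means a stop reason the map does not know about is treated as a normal completion rather than raising.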
@@ -1,6 +1,6 @@
"""AWS Bedrock Converse API transport.
Delegates to the existing adapter functions in hermes_agent_bedrock.
Delegates to the existing adapter functions in agent/bedrock_adapter.py.
Bedrock uses its own boto3 client (not the OpenAI SDK), so the transport
owns format conversion and normalization, while client construction and
boto3 calls stay on AIAgent.
@@ -21,19 +21,13 @@ class BedrockTransport(ProviderTransport):
def convert_messages(self, messages: List[Dict[str, Any]], **kwargs) -> Any:
"""Convert OpenAI messages to Bedrock Converse format."""
from agent.plugin_registries import registries
_fn = registries.get_provider_service("bedrock", "convert_messages_to_converse")
if _fn is None:
raise ImportError("bedrock plugin not registered")
return _fn(messages)
from agent.bedrock_adapter import convert_messages_to_converse
return convert_messages_to_converse(messages)
def convert_tools(self, tools: List[Dict[str, Any]]) -> Any:
"""Convert OpenAI tool schemas to Bedrock Converse toolConfig."""
from agent.plugin_registries import registries
_fn = registries.get_provider_service("bedrock", "convert_tools_to_converse")
if _fn is None:
raise ImportError("bedrock plugin not registered")
return _fn(tools)
from agent.bedrock_adapter import convert_tools_to_converse
return convert_tools_to_converse(tools)
def build_kwargs(
self,
@@ -42,16 +36,22 @@ class BedrockTransport(ProviderTransport):
tools: Optional[List[Dict[str, Any]]] = None,
**params,
) -> Dict[str, Any]:
"""Build Bedrock converse() kwargs."""
from agent.plugin_registries import registries
_fn = registries.get_provider_service("bedrock", "build_converse_kwargs")
if _fn is None:
raise ImportError("bedrock plugin not registered")
"""Build Bedrock converse() kwargs.
Calls convert_messages and convert_tools internally.
params:
max_tokens: int, output token limit (default 4096)
temperature: float | None
guardrail_config: dict | None, Bedrock guardrails
region: str, AWS region (default 'us-east-1')
"""
from agent.bedrock_adapter import build_converse_kwargs
region = params.get("region", "us-east-1")
guardrail = params.get("guardrail_config")
kwargs = _fn(
kwargs = build_converse_kwargs(
model=model,
messages=messages,
tools=tools,
@@ -65,15 +65,20 @@ class BedrockTransport(ProviderTransport):
return kwargs
def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:
"""Normalize Bedrock response to NormalizedResponse."""
from agent.plugin_registries import registries
normalize_converse_response = registries.get_provider_service("bedrock", "normalize_converse_response")
if normalize_converse_response is None:
raise ImportError("bedrock plugin not registered")
"""Normalize Bedrock response to NormalizedResponse.
Handles two shapes:
1. Raw boto3 dict (from direct converse() calls)
2. Already-normalized SimpleNamespace with .choices (from dispatch site)
"""
from agent.bedrock_adapter import normalize_converse_response
# Normalize to OpenAI-compatible SimpleNamespace
if hasattr(response, "choices") and response.choices:
# Already normalized at dispatch site
ns = response
else:
# Raw boto3 dict
ns = normalize_converse_response(response)
choice = ns.choices[0]
@@ -111,15 +116,27 @@ class BedrockTransport(ProviderTransport):
)
def validate_response(self, response: Any) -> bool:
"""Check Bedrock response structure.
After normalize_converse_response, the response has OpenAI-compatible
.choices, so this is the same check as chat_completions.
"""
if response is None:
return False
# Raw Bedrock dict response — check for 'output' key
if isinstance(response, dict):
return "output" in response
# Already-normalized SimpleNamespace
if hasattr(response, "choices"):
return bool(response.choices)
return False
def map_finish_reason(self, raw_reason: str) -> str:
"""Map Bedrock stop reason to OpenAI finish_reason.
The adapter already does this mapping inside normalize_converse_response,
so this is only used for direct access to raw responses.
"""
_MAP = {
"end_turn": "stop",
"tool_use": "tool_calls",
@@ -129,3 +146,9 @@ class BedrockTransport(ProviderTransport):
"content_filtered": "content_filter",
}
return _MAP.get(raw_reason, "stop")
# Auto-register on import
from agent.transports import register_transport # noqa: E402
register_transport("bedrock_converse", BedrockTransport)
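The register-on-import pattern used at the bottom of the Bedrock transport module can be sketched as follows. This is an assumption-laden illustration: the diff shows ``register_transport``, ``get_transport`` and ``_REGISTRY`` by name only, so the bodies below (including returning ``None`` for an unknown mode and instantiating the class on lookup) are guesses at a plausible shape, and ``ProviderTransport`` here is a stand-in for the real ABC.

```python
from typing import Dict, Optional, Type


class ProviderTransport:
    """Stand-in base class; the real ABC lives in agent.transports.base."""

    api_mode = ""


# api_mode -> transport class; filled in as transport modules are imported.
_REGISTRY: Dict[str, Type[ProviderTransport]] = {}


def register_transport(api_mode: str, transport_cls: Type[ProviderTransport]) -> None:
    """Record a transport class under its api_mode key."""
    _REGISTRY[api_mode] = transport_cls


def get_transport(api_mode: str) -> Optional[ProviderTransport]:
    """Return a transport instance, or None if nothing is registered (assumed)."""
    cls = _REGISTRY.get(api_mode)
    return cls() if cls is not None else None


class BedrockTransport(ProviderTransport):
    api_mode = "bedrock_converse"


# In the real module this call runs as a side effect of the import.
register_transport("bedrock_converse", BedrockTransport)
```

With this shape, ``_discover_transports`` only needs to import each transport module; the registration itself happens at import time.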
+152 -68
@@ -115,8 +115,6 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
# Opus 4.5/4.6/4.7 share $5/$25 pricing (new tokenizer, up to 35% more
# tokens for the same text).
# Source: https://platform.claude.com/docs/en/about-claude/pricing
# NOTE: The anthropic plugin also registers these — plugin takes priority
# at runtime, but these static entries ensure costs work without the plugin.
(
"anthropic",
"claude-opus-4-7",
@@ -141,6 +139,7 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# ── Anthropic Claude 4.6 ─────────────────────────────────────────────
(
"anthropic",
"claude-opus-4-6",
@@ -189,6 +188,7 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# ── Anthropic Claude 4.5 ─────────────────────────────────────────────
(
"anthropic",
"claude-opus-4-5",
@@ -225,6 +225,7 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# ── Anthropic Claude 4 / 4.1 ─────────────────────────────────────────
(
"anthropic",
"claude-opus-4-20250514",
@@ -249,56 +250,7 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# ── Anthropic older models (pre-4.5 generation) ────────────────────────
(
"anthropic",
"claude-3-5-sonnet-20241022",
): PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-3-5-haiku-20241022",
): PricingEntry(
input_cost_per_million=Decimal("0.80"),
output_cost_per_million=Decimal("4.00"),
cache_read_cost_per_million=Decimal("0.08"),
cache_write_cost_per_million=Decimal("1.00"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-3-opus-20240229",
): PricingEntry(
input_cost_per_million=Decimal("15.00"),
output_cost_per_million=Decimal("75.00"),
cache_read_cost_per_million=Decimal("1.50"),
cache_write_cost_per_million=Decimal("18.75"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-3-haiku-20240307",
): PricingEntry(
input_cost_per_million=Decimal("0.25"),
output_cost_per_million=Decimal("1.25"),
cache_read_cost_per_million=Decimal("0.03"),
cache_write_cost_per_million=Decimal("0.30"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# ── OpenAI ────────────────────────────────────────────────────────────
# OpenAI
(
"openai",
"gpt-4o",
@@ -376,6 +328,55 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
source_url="https://openai.com/api/pricing/",
pricing_version="openai-pricing-2026-03-16",
),
# ── Anthropic older models (pre-4.5 generation) ────────────────────────
(
"anthropic",
"claude-3-5-sonnet-20241022",
): PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-3-5-haiku-20241022",
): PricingEntry(
input_cost_per_million=Decimal("0.80"),
output_cost_per_million=Decimal("4.00"),
cache_read_cost_per_million=Decimal("0.08"),
cache_write_cost_per_million=Decimal("1.00"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-3-opus-20240229",
): PricingEntry(
input_cost_per_million=Decimal("15.00"),
output_cost_per_million=Decimal("75.00"),
cache_read_cost_per_million=Decimal("1.50"),
cache_write_cost_per_million=Decimal("18.75"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
(
"anthropic",
"claude-3-haiku-20240307",
): PricingEntry(
input_cost_per_million=Decimal("0.25"),
output_cost_per_million=Decimal("1.25"),
cache_read_cost_per_million=Decimal("0.03"),
cache_write_cost_per_million=Decimal("0.30"),
source="official_docs_snapshot",
source_url="https://platform.claude.com/docs/en/about-claude/pricing",
pricing_version="anthropic-pricing-2026-05",
),
# DeepSeek
(
"deepseek",
@@ -439,6 +440,80 @@ _OFFICIAL_DOCS_PRICING: Dict[tuple[str, str], PricingEntry] = {
source_url="https://ai.google.dev/pricing",
pricing_version="google-pricing-2026-03-16",
),
# AWS Bedrock — pricing per the Bedrock pricing page.
# Bedrock charges the same per-token rates as the model provider but
# through AWS billing. These are the on-demand prices (no commitment).
# Source: https://aws.amazon.com/bedrock/pricing/
(
"bedrock",
"anthropic.claude-opus-4-6",
): PricingEntry(
input_cost_per_million=Decimal("15.00"),
output_cost_per_million=Decimal("75.00"),
source="official_docs_snapshot",
source_url="https://aws.amazon.com/bedrock/pricing/",
pricing_version="bedrock-pricing-2026-04",
),
(
"bedrock",
"anthropic.claude-sonnet-4-6",
): PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
source="official_docs_snapshot",
source_url="https://aws.amazon.com/bedrock/pricing/",
pricing_version="bedrock-pricing-2026-04",
),
(
"bedrock",
"anthropic.claude-sonnet-4-5",
): PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
source="official_docs_snapshot",
source_url="https://aws.amazon.com/bedrock/pricing/",
pricing_version="bedrock-pricing-2026-04",
),
(
"bedrock",
"anthropic.claude-haiku-4-5",
): PricingEntry(
input_cost_per_million=Decimal("0.80"),
output_cost_per_million=Decimal("4.00"),
source="official_docs_snapshot",
source_url="https://aws.amazon.com/bedrock/pricing/",
pricing_version="bedrock-pricing-2026-04",
),
(
"bedrock",
"amazon.nova-pro",
): PricingEntry(
input_cost_per_million=Decimal("0.80"),
output_cost_per_million=Decimal("3.20"),
source="official_docs_snapshot",
source_url="https://aws.amazon.com/bedrock/pricing/",
pricing_version="bedrock-pricing-2026-04",
),
(
"bedrock",
"amazon.nova-lite",
): PricingEntry(
input_cost_per_million=Decimal("0.06"),
output_cost_per_million=Decimal("0.24"),
source="official_docs_snapshot",
source_url="https://aws.amazon.com/bedrock/pricing/",
pricing_version="bedrock-pricing-2026-04",
),
(
"bedrock",
"amazon.nova-micro",
): PricingEntry(
input_cost_per_million=Decimal("0.035"),
output_cost_per_million=Decimal("0.14"),
source="official_docs_snapshot",
source_url="https://aws.amazon.com/bedrock/pricing/",
pricing_version="bedrock-pricing-2026-04",
),
# MiniMax
(
"minimax",
@@ -506,27 +581,36 @@ def resolve_billing_route(
return BillingRoute(provider=provider_name or "unknown", model=model.split("/")[-1] if model else "", base_url=base_url or "", billing_mode="unknown")
def _normalize_anthropic_model_name(model: str) -> str:
"""Normalize Anthropic model name variants to canonical form.
Handles:
- Dot notation: claude-opus-4.7 -> claude-opus-4-7
- Short aliases: claude-opus-4.7 -> claude-opus-4-7
- Strips anthropic/ prefix if present
"""
name = model.lower().strip()
if name.startswith("anthropic/"):
name = name[len("anthropic/"):]
# Normalize dots to dashes in version numbers (e.g. 4.7 → 4-7, 4.6 → 4-6)
# But preserve the rest of the name structure
name = re.sub(r"(\d+)\.(\d+)", r"\1-\2", name)
return name
def _lookup_official_docs_pricing(route: BillingRoute) -> Optional[PricingEntry]:
model = route.model.lower()
# ── Plugin-registered pricing entries take priority ──
from agent.plugin_registries import registries as _preg
plugin_entry = _preg.get_pricing_entry(route.provider, model)
if plugin_entry:
return plugin_entry
# Try provider-specific name normalization via registry
_norm = _preg.get_provider_service(route.provider, "normalize_model_name")
if _norm is not None:
normalized = _norm(model)
if normalized != model:
plugin_entry = _preg.get_pricing_entry(route.provider, normalized)
if plugin_entry:
return plugin_entry
# Fall back to static dict
# Direct lookup first
entry = _OFFICIAL_DOCS_PRICING.get((route.provider, model))
if entry:
return entry
# Try normalized name for Anthropic (handles dot-notation like opus-4.7)
if route.provider == "anthropic":
normalized = _normalize_anthropic_model_name(model)
if normalized != model:
entry = _OFFICIAL_DOCS_PRICING.get((route.provider, normalized))
if entry:
return entry
return None
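The direct-lookup-then-normalize flow above can be exercised with a small standalone sketch. The normalization function mirrors the one in the diff; the ``PRICING`` table and the ``lookup_pricing`` helper are simplified stand-ins (a plain tuple of input/output prices instead of ``PricingEntry``), with the single entry taken from the "$5/$25" comment on the Opus 4.5/4.6/4.7 pricing.

```python
import re
from decimal import Decimal
from typing import Dict, Optional, Tuple

Price = Tuple[Decimal, Decimal]  # (input, output) cost per million tokens

# Simplified stand-in for _OFFICIAL_DOCS_PRICING (an assumption for this sketch).
PRICING: Dict[Tuple[str, str], Price] = {
    ("anthropic", "claude-opus-4-7"): (Decimal("5.00"), Decimal("25.00")),
}


def normalize_anthropic_model_name(model: str) -> str:
    """Lowercase, drop an ``anthropic/`` prefix, and turn 4.7 into 4-7."""
    name = model.lower().strip()
    if name.startswith("anthropic/"):
        name = name[len("anthropic/"):]
    return re.sub(r"(\d+)\.(\d+)", r"\1-\2", name)


def lookup_pricing(provider: str, model: str) -> Optional[Price]:
    model = model.lower()
    # Direct lookup first.
    entry = PRICING.get((provider, model))
    if entry:
        return entry
    # Retry with the normalized name for Anthropic only.
    if provider == "anthropic":
        normalized = normalize_anthropic_model_name(model)
        if normalized != model:
            return PRICING.get((provider, normalized))
    return None


dotted = lookup_pricing("anthropic", "claude-opus-4.7")
other = lookup_pricing("openai", "claude-opus-4.7")
```

Note that the regular expression rewrites every ``digits.digits`` pair, so a name such as ``claude-3.5-sonnet`` would also become ``claude-3-5-sonnet``.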
+64 -33
@@ -576,6 +576,8 @@ def load_cli_config() -> Dict[str, Any]:
"docker_env": "TERMINAL_DOCKER_ENV",
"docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
"docker_run_as_host_user": "TERMINAL_DOCKER_RUN_AS_HOST_USER",
"docker_persist_across_processes": "TERMINAL_DOCKER_PERSIST_ACROSS_PROCESSES",
"docker_orphan_reaper": "TERMINAL_DOCKER_ORPHAN_REAPER",
"sandbox_dir": "TERMINAL_SANDBOX_DIR",
# Persistent shell (non-local backends)
"persistent_shell": "TERMINAL_PERSISTENT_SHELL",
@@ -6234,10 +6236,8 @@ class HermesCLI:
# ``self.api_key`` may be a callable (Azure Foundry Entra ID bearer
# provider). Never invoke it; just identify the auth surface.
from agent.plugin_registries import registries
_azure_ns = registries.get_provider_namespace("azure")
is_token_provider = _azure_ns.get("is_token_provider")
if is_token_provider and is_token_provider(self.api_key):
from agent.azure_identity_adapter import is_token_provider
if is_token_provider(self.api_key):
api_key_display = "Microsoft Entra ID"
elif isinstance(self.api_key, str) and len(self.api_key) > 12:
api_key_display = f"{self.api_key[:8]}...{self.api_key[-4:]}"
@@ -10968,14 +10968,7 @@ class HermesCLI:
return
self._voice_tts_done.clear()
try:
from agent.plugin_registries import registries
_tts_provider = registries.get_tool_provider("tts")
if _tts_provider is None:
raise ImportError("tts tool provider not registered")
text_to_speech_tool = _tts_provider.tool_functions.get("text_to_speech_tool")
check_tts_requirements = _tts_provider.check_fn
if text_to_speech_tool is None:
raise ImportError("text_to_speech_tool not found in tts provider")
from tools.tts_tool import text_to_speech_tool
from tools.voice_mode import play_audio_file
# Strip markdown and non-speech content for cleaner TTS
@@ -11158,10 +11151,8 @@ class HermesCLI:
status = "enabled" if self._voice_tts else "disabled"
if self._voice_tts:
from agent.plugin_registries import registries
_tts_provider = registries.get_tool_provider("tts")
check_tts_requirements = _tts_provider.check_fn if _tts_provider else None
if check_tts_requirements and not check_tts_requirements():
from tools.tts_tool import check_tts_requirements
if not check_tts_requirements():
_cprint(f"{_DIM}Warning: No TTS provider available. Install edge-tts or set API keys.{_RST}")
_cprint(f"{_ACCENT}Voice TTS {status}.{_RST}")
@@ -11783,17 +11774,13 @@ class HermesCLI:
if self._voice_tts:
try:
from agent.plugin_registries import registries
_tts_provider = registries.get_tool_provider("tts")
if _tts_provider is None:
raise ImportError("tts tool provider not registered")
_load_tts_cfg = _tts_provider.config_functions.get("_load_tts_config")
_get_prov = _tts_provider.config_functions.get("_get_provider")
_import_elevenlabs = _tts_provider.config_functions.get("_import_elevenlabs")
_import_sounddevice = _tts_provider.config_functions.get("_import_sounddevice")
stream_tts_to_speaker = _tts_provider.tool_functions.get("stream_tts_to_speaker")
if not all([_load_tts_cfg, _get_prov, stream_tts_to_speaker]):
raise ImportError("streaming TTS functions not found in tts provider")
from tools.tts_tool import (
_load_tts_config as _load_tts_cfg,
_get_provider as _get_prov,
_import_elevenlabs,
_import_sounddevice,
stream_tts_to_speaker,
)
_tts_cfg = _load_tts_cfg()
if _get_prov(_tts_cfg) == "elevenlabs":
# Verify both ElevenLabs SDK and audio output are available
@@ -15140,13 +15127,50 @@ def main(
# Handle single query mode
if query or image:
query, single_query_images = _collect_query_images(query, image)
# Kanban workers spawn with ``hermes chat -q "work kanban task <id>"``;
# the actual task description lives in the task body. Mirror the
# gateway/CLI behaviour for inbound images by scanning the body for
# local image paths and http(s) image URLs and attaching them to the
# worker's first turn. Without this, users who paste a screenshot
# path or URL into a kanban task body never get it routed to the
# model's vision input.
single_query_image_urls: list[str] = []
_kanban_task_id = os.environ.get("HERMES_KANBAN_TASK", "").strip()
if _kanban_task_id:
try:
from hermes_cli import kanban_db as _kb
from agent.image_routing import extract_image_refs as _extract_refs
_conn = _kb.connect()
try:
_task = _kb.get_task(_conn, _kanban_task_id)
finally:
try:
_conn.close()
except Exception:
pass
_body = getattr(_task, "body", "") if _task is not None else ""
if _body:
_kb_paths, _kb_urls = _extract_refs(_body)
if _kb_paths:
# Dedupe against any --image the user already passed.
_seen = {str(p) for p in single_query_images}
for _p in _kb_paths:
if _p not in _seen:
_seen.add(_p)
single_query_images.append(Path(_p))
if _kb_urls:
single_query_image_urls.extend(_kb_urls)
except Exception as _exc:
# Best-effort enrichment; never block worker startup on it.
logger.debug("kanban image-ref extraction failed: %s", _exc)
if quiet:
# Quiet mode: suppress banner, spinner, tool previews.
# Only print the final response and parseable session info.
cli.tool_progress_mode = "off"
if cli._ensure_runtime_credentials():
effective_query: Any = query
if single_query_images:
if single_query_images or single_query_image_urls:
# Honour the same image-routing decision used by the
# interactive path. With a vision-capable model (incl.
# custom-provider models declared via
@@ -15175,19 +15199,26 @@ def main(
_parts, _skipped = _build_parts(
query if isinstance(query, str) else "",
[str(p) for p in single_query_images],
image_urls=list(single_query_image_urls) or None,
)
if any(p.get("type") == "image_url" for p in _parts):
effective_query = _parts
else:
# All images unreadable — text fallback.
# ``_preprocess_images_with_vision`` only knows
# about local files; URLs would be lost there,
# so keep the original query text intact when
# only URLs were supplied.
if single_query_images:
effective_query = cli._preprocess_images_with_vision(
query, single_query_images, announce=False,
)
except Exception:
if single_query_images:
effective_query = cli._preprocess_images_with_vision(
query, single_query_images, announce=False,
)
except Exception:
effective_query = cli._preprocess_images_with_vision(
query, single_query_images, announce=False,
)
else:
elif single_query_images:
effective_query = cli._preprocess_images_with_vision(
query,
single_query_images,
+14 -6
View File
@@ -30,13 +30,21 @@ cd /opt/data
dash_host="${HERMES_DASHBOARD_HOST:-0.0.0.0}"
dash_port="${HERMES_DASHBOARD_PORT:-9119}"
# Binding to anything other than localhost requires --insecure — the
# dashboard refuses otherwise because it exposes API keys. Inside a
# container this is the expected deployment.
# `--insecure` is opt-in via HERMES_DASHBOARD_INSECURE. The dashboard's
# OAuth auth gate engages automatically on non-loopback binds when a
# DashboardAuthProvider is registered (e.g. the bundled dashboard_auth/nous
# provider, which auto-registers when HERMES_DASHBOARD_OAUTH_CLIENT_ID is
# set). If no provider is registered, start_server fails closed with a
# specific operator-facing error.
#
# This used to derive --insecure from the bind host ("anything non-loopback
# implies insecure"), but that predates the OAuth gate and silently
# disabled it on every container-deployed dashboard. The gate is now the
# authority; operators on trusted LANs / behind a reverse proxy without
# the OAuth contract opt in explicitly.
insecure=""
case "$dash_host" in
127.0.0.1|localhost) ;;
*) insecure="--insecure" ;;
case "${HERMES_DASHBOARD_INSECURE:-}" in
1|true|TRUE|True|yes|YES|Yes) insecure="--insecure" ;;
esac
# shellcheck disable=SC2086 # word-splitting of $insecure is intentional
+2 -5
@@ -3654,11 +3654,8 @@ class BasePlatformAdapter(ABC):
and text_content
and not media_files):
try:
from agent.plugin_registries import registries
_tts = registries.get_tool_provider("tts")
text_to_speech_tool = _tts.tool_functions.get("text_to_speech_tool") if _tts else None
check_tts_requirements = _tts.check_fn if _tts else None
if check_tts_requirements and text_to_speech_tool and check_tts_requirements():
from tools.tts_tool import text_to_speech_tool, check_tts_requirements
if check_tts_requirements():
import json as _json
speech_text = self.prepare_tts_text(text_content)
if not speech_text:
@@ -113,12 +113,17 @@ DINGTALK_TYPE_MAPPING = {
def check_dingtalk_requirements() -> bool:
"""Check if DingTalk dependencies are available and configured.
Since this is a separate package, deps are guaranteed by the package
manager. Just verify the SDK can be imported and env vars are set.
Lazy-installs dingtalk-stream via ``tools.lazy_deps.ensure("platform.dingtalk")``
on first call if not present.
"""
global DINGTALK_STREAM_AVAILABLE, dingtalk_stream, ChatbotMessage, CallbackMessage, AckMessage
global HTTPX_AVAILABLE, httpx
if not DINGTALK_STREAM_AVAILABLE or not HTTPX_AVAILABLE:
try:
from tools.lazy_deps import ensure as _lazy_ensure
_lazy_ensure("platform.dingtalk", prompt=False)
except Exception:
return False
try:
import dingtalk_stream as _ds
from dingtalk_stream import ChatbotMessage as _CM
@@ -1345,64 +1345,63 @@ def _run_official_feishu_ws_client(ws_client: Any, adapter: Any) -> None:
def check_feishu_requirements() -> bool:
"""Check if Feishu/Lark dependencies are available.
Since this is a separate package, deps are guaranteed by the package
manager. Just verify the SDK can be imported.
Lazy-installs lark-oapi via ``tools.lazy_deps.ensure("platform.feishu")``
on first call if not present. Rebinds all module-level globals on success.
"""
if FEISHU_AVAILABLE:
return True
try:
import lark_oapi as _lark
from lark_oapi.api.application.v6 import GetApplicationRequest as _GAR
def _import():
import lark_oapi as lark
from lark_oapi.api.application.v6 import GetApplicationRequest
from lark_oapi.api.im.v1 import (
CreateFileRequest as _CFR, CreateFileRequestBody as _CFRB,
CreateImageRequest as _CIR, CreateImageRequestBody as _CIRB,
CreateMessageRequest as _CMR, CreateMessageRequestBody as _CMRB,
GetChatRequest as _GCR, GetMessageRequest as _GMR, GetMessageResourceRequest as _GMRR,
P2ImMessageMessageReadV1 as _P2,
ReplyMessageRequest as _RMR, ReplyMessageRequestBody as _RMRB,
UpdateMessageRequest as _UMR, UpdateMessageRequestBody as _UMRB,
CreateFileRequest, CreateFileRequestBody,
CreateImageRequest, CreateImageRequestBody,
CreateMessageRequest, CreateMessageRequestBody,
GetChatRequest, GetMessageRequest, GetMessageResourceRequest,
P2ImMessageMessageReadV1,
ReplyMessageRequest, ReplyMessageRequestBody,
UpdateMessageRequest, UpdateMessageRequestBody,
)
from lark_oapi.core import AccessTokenType as _AT, HttpMethod as _HM
from lark_oapi.core.const import FEISHU_DOMAIN as _FD, LARK_DOMAIN as _LD
from lark_oapi.core.model import BaseRequest as _BR
from lark_oapi.core import AccessTokenType, HttpMethod
from lark_oapi.core.const import FEISHU_DOMAIN, LARK_DOMAIN
from lark_oapi.core.model import BaseRequest
from lark_oapi.event.callback.model.p2_card_action_trigger import (
CallBackCard as _CBC, P2CardActionTriggerResponse as _P2R,
CallBackCard, P2CardActionTriggerResponse,
)
from lark_oapi.event.dispatcher_handler import EventDispatcherHandler as _EDH
from lark_oapi.ws import Client as _FWSC
except ImportError:
return False
from lark_oapi.event.dispatcher_handler import EventDispatcherHandler
from lark_oapi.ws import Client as FeishuWSClient
return {
"lark": lark,
"GetApplicationRequest": GetApplicationRequest,
"CreateFileRequest": CreateFileRequest,
"CreateFileRequestBody": CreateFileRequestBody,
"CreateImageRequest": CreateImageRequest,
"CreateImageRequestBody": CreateImageRequestBody,
"CreateMessageRequest": CreateMessageRequest,
"CreateMessageRequestBody": CreateMessageRequestBody,
"GetChatRequest": GetChatRequest,
"GetMessageRequest": GetMessageRequest,
"GetMessageResourceRequest": GetMessageResourceRequest,
"P2ImMessageMessageReadV1": P2ImMessageMessageReadV1,
"ReplyMessageRequest": ReplyMessageRequest,
"ReplyMessageRequestBody": ReplyMessageRequestBody,
"UpdateMessageRequest": UpdateMessageRequest,
"UpdateMessageRequestBody": UpdateMessageRequestBody,
"AccessTokenType": AccessTokenType,
"HttpMethod": HttpMethod,
"FEISHU_DOMAIN": FEISHU_DOMAIN,
"LARK_DOMAIN": LARK_DOMAIN,
"BaseRequest": BaseRequest,
"CallBackCard": CallBackCard,
"P2CardActionTriggerResponse": P2CardActionTriggerResponse,
"EventDispatcherHandler": EventDispatcherHandler,
"FeishuWSClient": FeishuWSClient,
"FEISHU_AVAILABLE": True,
}
globals().update({
"lark": _lark,
"GetApplicationRequest": _GAR,
"CreateFileRequest": _CFR,
"CreateFileRequestBody": _CFRB,
"CreateImageRequest": _CIR,
"CreateImageRequestBody": _CIRB,
"CreateMessageRequest": _CMR,
"CreateMessageRequestBody": _CMRB,
"GetChatRequest": _GCR,
"GetMessageRequest": _GMR,
"GetMessageResourceRequest": _GMRR,
"P2ImMessageMessageReadV1": _P2,
"ReplyMessageRequest": _RMR,
"ReplyMessageRequestBody": _RMRB,
"UpdateMessageRequest": _UMR,
"UpdateMessageRequestBody": _UMRB,
"AccessTokenType": _AT,
"HttpMethod": _HM,
"FEISHU_DOMAIN": _FD,
"LARK_DOMAIN": _LD,
"BaseRequest": _BR,
"CallBackCard": _CBC,
"P2CardActionTriggerResponse": _P2R,
"EventDispatcherHandler": _EDH,
"FeishuWSClient": _FWSC,
"FEISHU_AVAILABLE": True,
})
return True
from tools.lazy_deps import ensure_and_bind
return ensure_and_bind("platform.feishu", _import, globals(), prompt=False)
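The body of `tools.lazy_deps.ensure_and_bind` is not part of this diff. Going only by the docstrings above (lazy-install if missing, then rebind module-level globals on success), a minimal sketch of the pattern might look like this. The name `ensure_and_bind_sketch` and the `install` callback are assumptions made for illustration, not the project's API.

```python
from typing import Any, Callable, Dict, MutableMapping


def ensure_and_bind_sketch(
    importer: Callable[[], Dict[str, Any]],
    namespace: MutableMapping[str, Any],
    install: Callable[[], None] = lambda: None,
) -> bool:
    """Try the import, run one install attempt on failure, then rebind.

    `importer` plays the role of the `_import()` closures above: it performs
    the imports and returns a name -> object mapping. `install` is a
    hypothetical stand-in for whatever the real helper does to install the
    feature group.
    """
    for attempt in range(2):
        try:
            bindings = importer()
        except ImportError:
            if attempt == 1:
                return False  # still missing after the install attempt
            try:
                install()
            except Exception:
                return False  # install failed; report "unavailable"
            continue
        # Only touch the namespace once every import has succeeded.
        namespace.update(bindings)
        return True
    return False
```

Returning the bindings as a dictionary keeps the import list in one place, so the adapter no longer needs a parallel set of underscore-prefixed aliases and a separate `globals().update(...)` call.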
class FeishuAdapter(BasePlatformAdapter):
@@ -2460,7 +2459,7 @@ class FeishuAdapter(BasePlatformAdapter):
logging, and reaction. Scheduling follows the same
``run_coroutine_threadsafe`` pattern used by ``_on_message_event``.
"""
from .feishu_comment import handle_drive_comment_event
from gateway.platforms.feishu_comment import handle_drive_comment_event
loop = self._loop
if not self._loop_accepts_callbacks(loop):
@@ -1164,7 +1164,7 @@ async def handle_drive_comment_event(
)
# Access control
from .feishu_comment_rules import load_config, resolve_rule, is_user_allowed, has_wiki_keys
from gateway.platforms.feishu_comment_rules import load_config, resolve_rule, is_user_allowed, has_wiki_keys
comments_cfg = load_config()
rule = resolve_rule(comments_cfg, file_type, file_token)
@@ -240,8 +240,13 @@ def _check_e2ee_deps() -> bool:
def check_matrix_requirements() -> bool:
"""Return True if the Matrix adapter can be used.
Since this is a separate package, deps are guaranteed by the package
manager. Just verify the SDK can be imported and env vars are set.
Lazy-installs the full ``platform.matrix`` feature group via
``tools.lazy_deps.ensure_and_bind`` whenever any of the declared
packages (mautrix, Markdown, aiosqlite, asyncpg, aiohttp-socks) is
missing, not just mautrix itself. Previously this short-circuited on
``import mautrix``, which left the other four packages uninstalled
forever and broke E2EE connect with ``No module named 'asyncpg'``
(#31116). Rebinds module-level type globals on success.
"""
token = os.getenv("MATRIX_ACCESS_TOKEN", "")
password = os.getenv("MATRIX_PASSWORD", "")
@@ -254,15 +259,48 @@ def check_matrix_requirements() -> bool:
logger.warning("Matrix: MATRIX_HOMESERVER not set")
return False
# Try importing the mautrix types to verify the SDK is present.
# Check whether any package in the platform.matrix feature group is
# missing. ``feature_missing`` is cheap (per-spec importlib.metadata
# lookups) and correctly handles ``mautrix[encryption]`` by stripping
# the extras marker before checking the bare package.
try:
from mautrix.types import ( # noqa: F401
ContentURI, EventID, EventType, PaginationDirection,
PresenceState, RoomCreatePreset, RoomID, SyncToken,
TrustState, UserID,
)
except ImportError:
return False
from tools.lazy_deps import feature_missing, ensure_and_bind
missing = feature_missing("platform.matrix")
except Exception as exc: # pragma: no cover — defensive
logger.debug("Matrix: lazy_deps lookup failed: %s", exc)
missing = ()
ensure_and_bind = None # type: ignore[assignment]
if missing or ensure_and_bind is None:
def _import():
from mautrix.types import (
ContentURI, EventID, EventType, PaginationDirection,
PresenceState, RoomCreatePreset, RoomID, SyncToken,
TrustState, UserID,
)
return {
"ContentURI": ContentURI,
"EventID": EventID,
"EventType": EventType,
"PaginationDirection": PaginationDirection,
"PresenceState": PresenceState,
"RoomCreatePreset": RoomCreatePreset,
"RoomID": RoomID,
"SyncToken": SyncToken,
"TrustState": TrustState,
"UserID": UserID,
}
if ensure_and_bind is None:
return False
if not ensure_and_bind("platform.matrix", _import, globals(), prompt=False):
logger.warning(
"Matrix: required packages not installed (%s). "
"Run: pip install 'mautrix[encryption]' asyncpg aiosqlite "
"Markdown aiohttp-socks",
", ".join(missing) if missing else "platform.matrix",
)
return False
# If encryption is requested, verify E2EE deps are available at startup
# rather than silently degrading to plaintext-only at connect time.
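The comment above says `feature_missing` does per-spec `importlib.metadata` lookups and strips the extras marker from specs such as `mautrix[encryption]`. Its implementation is not shown here; the following is a rough sketch of that idea under those stated assumptions, with the hypothetical names `bare_name` and `missing_specs`.

```python
import re
from importlib import metadata


def bare_name(spec: str) -> str:
    """Strip extras and version markers: 'mautrix[encryption]' -> 'mautrix'."""
    return re.split(r"[\[<>=!~; ]", spec.strip(), maxsplit=1)[0]


def missing_specs(specs):
    """Return the specs whose bare distribution is not installed."""
    missing = []
    for spec in specs:
        try:
            # Looks up installed distribution metadata; does not import the package.
            metadata.version(bare_name(spec))
        except metadata.PackageNotFoundError:
            missing.append(spec)
    return tuple(missing)
```

Checking every spec in the group, rather than importing one representative package, is what avoids the failure described in the docstring, where `mautrix` was present but `asyncpg` was not.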
@@ -30,6 +30,10 @@ except ImportError:
AsyncSocketModeHandler = Any
AsyncWebClient = Any
import sys
from pathlib import Path as _Path
sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
from gateway.config import Platform, PlatformConfig
from gateway.platforms.helpers import MessageDeduplicator
from gateway.platforms.base import (
@@ -71,28 +75,27 @@ class _ThreadContextCache:
def check_slack_requirements() -> bool:
"""Check if Slack dependencies are available.
Since this is a separate package, deps are guaranteed by the package
manager. Just verify the SDK can be imported.
Lazy-installs slack-bolt/slack-sdk via ``tools.lazy_deps.ensure("platform.slack")``
on first call if not present. Rebinds all module-level globals on success.
"""
if SLACK_AVAILABLE:
return True
try:
from slack_bolt.async_app import AsyncApp as _AsyncApp
from slack_bolt.adapter.socket_mode.async_handler import AsyncSocketModeHandler as _ASMH
from slack_sdk.web.async_client import AsyncWebClient as _AWC
import aiohttp as _aiohttp
except ImportError:
return False
def _import():
from slack_bolt.async_app import AsyncApp
from slack_bolt.adapter.socket_mode.async_handler import AsyncSocketModeHandler
from slack_sdk.web.async_client import AsyncWebClient
import aiohttp
return {
"AsyncApp": AsyncApp,
"AsyncSocketModeHandler": AsyncSocketModeHandler,
"AsyncWebClient": AsyncWebClient,
"aiohttp": aiohttp,
"SLACK_AVAILABLE": True,
}
globals().update({
"AsyncApp": _AsyncApp,
"AsyncSocketModeHandler": _ASMH,
"AsyncWebClient": _AWC,
"aiohttp": _aiohttp,
"SLACK_AVAILABLE": True,
})
return True
from tools.lazy_deps import ensure_and_bind
return ensure_and_bind("platform.slack", _import, globals(), prompt=False)
def _extract_text_from_slack_blocks(blocks: list) -> str:
@@ -60,6 +60,10 @@ except ImportError:
DEFAULT_TYPE = Any
ContextTypes = _MockContextTypes
import sys
from pathlib import Path as _Path
sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
@@ -107,8 +111,10 @@ MAX_COMMANDS_PER_SCOPE = 30
def check_telegram_requirements() -> bool:
"""Check if Telegram dependencies are available.
Since this is a separate package, deps are guaranteed by the package
manager. Just verify the SDK can be imported.
If python-telegram-bot is missing, attempts to lazy-install it via
``tools.lazy_deps.ensure("platform.telegram")``. After a successful
install, re-imports the SDK and flips ``TELEGRAM_AVAILABLE`` to True
so the adapter's class-level type aliases get rebound.
"""
global TELEGRAM_AVAILABLE, Update, Bot, Message, InlineKeyboardButton
global InlineKeyboardMarkup, LinkPreviewOptions, Application
@@ -116,6 +122,11 @@ def check_telegram_requirements() -> bool:
global ContextTypes, filters, ParseMode, ChatType, HTTPXRequest
if TELEGRAM_AVAILABLE:
return True
try:
from tools.lazy_deps import ensure as _lazy_ensure
_lazy_ensure("platform.telegram", prompt=False)
except Exception:
return False
try:
from telegram import Update as _Update, Bot as _Bot, Message as _Message
from telegram import InlineKeyboardButton as _IKB, InlineKeyboardMarkup as _IKM
+108 -55
@@ -831,6 +831,8 @@ if _config_path.exists():
"docker_env": "TERMINAL_DOCKER_ENV",
"docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
"docker_run_as_host_user": "TERMINAL_DOCKER_RUN_AS_HOST_USER",
"docker_persist_across_processes": "TERMINAL_DOCKER_PERSIST_ACROSS_PROCESSES",
"docker_orphan_reaper": "TERMINAL_DOCKER_ORPHAN_REAPER",
"sandbox_dir": "TERMINAL_SANDBOX_DIR",
"persistent_shell": "TERMINAL_PERSISTENT_SHELL",
}
@@ -5418,6 +5420,49 @@ class GatewayRunner:
)
stale_timeout_seconds = 0
# Read kanban.default_assignee — fallback profile for tasks
# created without an explicit assignee (e.g. via the dashboard).
# When set, the dispatcher applies it to unassigned ready tasks
# instead of skipping them indefinitely (#27145). Empty string
# (the schema default) means "no fallback, keep skipping" —
# backward-compatible with existing installs.
default_assignee = (kanban_cfg.get("default_assignee") or "").strip() or None
if default_assignee:
logger.info(
"kanban dispatcher: default_assignee=%r (unassigned ready tasks "
"will route to this profile)",
default_assignee,
)
# Read kanban.max_in_progress_per_profile — per-profile concurrency
# cap (#21582). When set, no single profile gets more than N
# workers running at once, even if the global max_in_progress
# would allow it. Prevents one profile's local model / API quota
# / browser pool from being overwhelmed by a fan-out.
raw_per_profile = kanban_cfg.get("max_in_progress_per_profile", None)
max_in_progress_per_profile = None
if raw_per_profile is not None:
try:
max_in_progress_per_profile = int(raw_per_profile)
except (TypeError, ValueError):
logger.warning(
"kanban dispatcher: invalid kanban.max_in_progress_per_profile=%r; ignoring",
raw_per_profile,
)
max_in_progress_per_profile = None
else:
if max_in_progress_per_profile < 1:
logger.warning(
"kanban dispatcher: kanban.max_in_progress_per_profile=%r is below 1; ignoring",
raw_per_profile,
)
max_in_progress_per_profile = None
else:
logger.info(
"kanban dispatcher: max_in_progress_per_profile=%d",
max_in_progress_per_profile,
)
# Initial delay so the gateway finishes wiring adapters before the
# dispatcher spawns workers (those workers may hit gateway notify
# subscriptions etc.). Matches the notifier watcher's delay.
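The two config reads above, together with the dispatcher behaviour described in the commit message, can be sketched in isolation as follows. The function names are hypothetical, and `plan_spawns` is a simplified model rather than the dispatcher's actual code; in particular, counting spawns made earlier in the same tick against the cap is an assumption.

```python
from collections import Counter


def parse_default_assignee(raw):
    """Empty or whitespace-only values mean "no fallback" (None)."""
    return (raw or "").strip() or None


def parse_per_profile_cap(raw):
    """Return a positive int cap, or None when unset, invalid, or below 1."""
    if raw is None:
        return None
    try:
        cap = int(raw)
    except (TypeError, ValueError):
        return None
    return cap if cap >= 1 else None


def plan_spawns(ready, running, cap=None, default_assignee=None):
    """Decide which ready tasks to spawn in one tick.

    ready:   list of (task_id, assignee_or_None)
    running: list of assignees, one entry per already-running task
    Returns (spawn, capped, unassigned), where capped entries are
    (task_id, assignee, current_running_count) tuples.
    """
    counts = Counter(running)  # pre-existing running tasks count against the cap
    spawn, capped, unassigned = [], [], []
    for task_id, assignee in ready:
        assignee = assignee or default_assignee
        if not assignee:
            unassigned.append(task_id)
        elif cap is not None and counts[assignee] >= cap:
            capped.append((task_id, assignee, counts[assignee]))
        else:
            counts[assignee] += 1
            spawn.append((task_id, assignee))
    return spawn, capped, unassigned
```

Capped tasks are deferred rather than failed: they stay ready and are reconsidered on the next tick once the profile has capacity.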
@@ -5509,6 +5554,8 @@ class GatewayRunner:
max_in_progress=max_in_progress,
failure_limit=failure_limit,
stale_timeout_seconds=stale_timeout_seconds,
default_assignee=default_assignee,
max_in_progress_per_profile=max_in_progress_per_profile,
)
except sqlite3.DatabaseError as exc:
if _is_corrupt_board_db_error(exc):
@@ -6263,29 +6310,6 @@ class GatewayRunner:
# plugin adapters don't need a custom factory signature.
if hasattr(adapter, "gateway_runner"):
adapter.gateway_runner = self
# ── Telegram: notification mode from config ──
# Applied here (not in the adapter factory) because it
# reads gateway-local config that only the gateway runner
# has access to.
if platform.value == "telegram":
_notify_mode = os.getenv("HERMES_TELEGRAM_NOTIFICATIONS", "")
if not _notify_mode:
try:
_gw_cfg = _load_gateway_config()
_raw = cfg_get(_gw_cfg, "display", "platforms", "telegram", "notifications")
if _raw not in {None, ""}:
_notify_mode = str(_raw).strip().lower()
except Exception:
pass
_notify_mode = _notify_mode or "important"
if _notify_mode not in {"all", "important"}:
logger.warning(
"Unknown telegram notifications mode '%s', "
"defaulting to 'important' (valid: all, important)",
_notify_mode,
)
_notify_mode = "important"
adapter._notifications_mode = _notify_mode
return adapter
# Registered but failed to instantiate — don't silently fall
# through to built-ins (there are none for plugin platforms).
@@ -6299,13 +6323,49 @@ class GatewayRunner:
logger.debug("Platform registry lookup for '%s' failed: %s", platform.value, e)
# Fall through to built-in adapters below
if platform == Platform.WHATSAPP:
if platform == Platform.TELEGRAM:
from gateway.platforms.telegram import TelegramAdapter, check_telegram_requirements
if not check_telegram_requirements():
logger.warning("Telegram: python-telegram-bot not installed")
return None
adapter = TelegramAdapter(config)
# Apply Telegram notification mode from config. Controls whether
# intermediate messages (tool progress, streaming, status) trigger
# push notifications. Supports ENV override for quick testing.
_notify_mode = os.getenv("HERMES_TELEGRAM_NOTIFICATIONS", "")
if not _notify_mode:
try:
_gw_cfg = _load_gateway_config()
_raw = cfg_get(_gw_cfg, "display", "platforms", "telegram", "notifications")
if _raw not in {None, ""}:
_notify_mode = str(_raw).strip().lower()
except Exception:
pass
_notify_mode = _notify_mode or "important"
if _notify_mode not in {"all", "important"}:
logger.warning(
"Unknown telegram notifications mode '%s', "
"defaulting to 'important' (valid: all, important)",
_notify_mode,
)
_notify_mode = "important"
adapter._notifications_mode = _notify_mode
return adapter
elif platform == Platform.WHATSAPP:
from gateway.platforms.whatsapp import WhatsAppAdapter, check_whatsapp_requirements
if not check_whatsapp_requirements():
logger.warning("WhatsApp: Node.js not installed or bridge not configured")
return None
return WhatsAppAdapter(config)
elif platform == Platform.SLACK:
from gateway.platforms.slack import SlackAdapter, check_slack_requirements
if not check_slack_requirements():
logger.warning("Slack: slack-bolt not installed. Run: pip install 'hermes-agent[slack]'")
return None
return SlackAdapter(config)
elif platform == Platform.SIGNAL:
from gateway.platforms.signal import SignalAdapter, check_signal_requirements
if not check_signal_requirements():
@@ -6334,6 +6394,20 @@ class GatewayRunner:
return None
return SmsAdapter(config)
elif platform == Platform.DINGTALK:
from gateway.platforms.dingtalk import DingTalkAdapter, check_dingtalk_requirements
if not check_dingtalk_requirements():
logger.warning("DingTalk: dingtalk-stream not installed or DINGTALK_CLIENT_ID/SECRET not set")
return None
return DingTalkAdapter(config)
elif platform == Platform.FEISHU:
from gateway.platforms.feishu import FeishuAdapter, check_feishu_requirements
if not check_feishu_requirements():
logger.warning("Feishu: lark-oapi not installed or FEISHU_APP_ID/SECRET not set")
return None
return FeishuAdapter(config)
elif platform == Platform.WECOM_CALLBACK:
from gateway.platforms.wecom_callback import (
WecomCallbackAdapter,
@@ -6358,6 +6432,13 @@ class GatewayRunner:
return None
return WeixinAdapter(config)
elif platform == Platform.MATRIX:
from gateway.platforms.matrix import MatrixAdapter, check_matrix_requirements
if not check_matrix_requirements():
logger.warning("Matrix: mautrix not installed or credentials not set. Run: pip install 'mautrix[encryption]'")
return None
return MatrixAdapter(config)
elif platform == Platform.API_SERVER:
from gateway.platforms.api_server import APIServerAdapter, check_api_server_requirements
if not check_api_server_requirements():
@@ -11428,12 +11509,7 @@ class GatewayRunner:
audio_path = None
actual_path = None
try:
from agent.plugin_registries import registries
_tts_entry = registries.get_tool_provider("tts")
if _tts_entry is None:
return
text_to_speech_tool = _tts_entry.tool_functions["text_to_speech_tool"]
_strip_markdown_for_tts = _tts_entry.tool_functions["_strip_markdown_for_tts"]
from tools.tts_tool import text_to_speech_tool, _strip_markdown_for_tts
tts_text = _strip_markdown_for_tts(text[:4000])
if not tts_text:
@@ -14721,32 +14797,9 @@ class GatewayRunner:
return f"{prefix}\n\n{user_text}"
return prefix
from agent.plugin_registries import registries
_stt_entry = registries.get_tool_provider("stt")
enriched_parts = []
if _stt_entry is None or "transcribe_audio" not in _stt_entry.tool_functions:
# No STT plugin registered — treat each audio path the same way
# as a "No STT provider" transcription failure.
for path in audio_paths:
abs_path = os.path.abspath(path)
duration_str = await _probe_audio_duration(abs_path)
if duration_str:
enriched_parts.append(
f"[The user sent a voice message: {abs_path} (duration: {duration_str})]"
)
else:
enriched_parts.append(f"[The user sent a voice message: {abs_path}]")
if not enriched_parts:
return user_text
prefix = "\n\n".join(enriched_parts)
_placeholder = "(The user sent a message with no text content)"
if user_text and user_text.strip() == _placeholder:
return prefix
if user_text:
return f"{prefix}\n\n{user_text}"
return prefix
transcribe_audio = _stt_entry.tool_functions["transcribe_audio"]
from tools.transcription_tools import transcribe_audio
enriched_parts = []
for path in audio_paths:
try:
logger.debug("Transcribing user voice: %s", path)
+2 -2
@@ -14,8 +14,8 @@ Provides subcommands for:
import os
import sys
__version__ = "0.15.0"
__release_date__ = "2026.5.28"
__version__ = "0.15.1"
__release_date__ = "2026.5.29"
def _ensure_utf8():
+9 -18
@@ -1597,10 +1597,8 @@ def resolve_provider(
# AWS Bedrock — detect via boto3 credential chain (IAM roles, SSO, env vars).
# This runs after API-key providers so explicit keys always win.
try:
from agent.plugin_registries import registries
_bedrock_ns = registries.get_provider_namespace("bedrock")
has_aws_credentials = _bedrock_ns.get("has_aws_credentials")
if has_aws_credentials and has_aws_credentials():
from agent.bedrock_adapter import has_aws_credentials
if has_aws_credentials():
return "bedrock"
except ImportError:
pass # boto3 not installed — skip Bedrock auto-detection
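The direct-import form relies on `ImportError` to signal that an optional dependency is absent. A generic sketch of that pattern is shown below; `detect_optional_provider` is a hypothetical helper written for illustration and is not part of `resolve_provider`.

```python
import importlib


def detect_optional_provider(module_name, check_name, provider_id):
    """Return provider_id when an optional module imports and its check passes.

    A missing module (or a missing dependency raised while importing it) is
    treated as "provider not available" rather than as an error.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None  # optional dependency not installed; skip auto-detection
    check = getattr(module, check_name, None)
    return provider_id if callable(check) and check() else None
```

With a direct import there is a single failure mode to handle, whereas the registry lookup it replaces also had to guard against a namespace that was registered without the expected function.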
@@ -6046,13 +6044,8 @@ def get_auth_status(provider_id: Optional[str] = None) -> Dict[str, Any]:
# AWS SDK providers (Bedrock) — check via boto3 credential chain
if pconfig and pconfig.auth_type == "aws_sdk":
try:
from agent.plugin_registries import registries
_bedrock_ns = registries.get_provider_namespace("bedrock")
has_aws_credentials = _bedrock_ns.get("has_aws_credentials")
if has_aws_credentials:
return {"logged_in": has_aws_credentials(), "provider": target}
else:
return {"logged_in": False, "provider": target, "error": "boto3 not installed"}
from agent.bedrock_adapter import has_aws_credentials
return {"logged_in": has_aws_credentials(), "provider": target}
except ImportError:
return {"logged_in": False, "provider": target, "error": "boto3 not installed"}
return {"logged_in": False}
@@ -6091,13 +6084,11 @@ def _get_azure_foundry_auth_status() -> Dict[str, Any]:
if auth_mode == "entra_id":
try:
from agent.plugin_registries import registries
_azure_ns = registries.get_provider_namespace("azure")
EntraIdentityConfig = _azure_ns.get("EntraIdentityConfig")
SCOPE_AI_AZURE_DEFAULT = _azure_ns.get("SCOPE_AI_AZURE_DEFAULT")
has_azure_identity_installed = _azure_ns.get("has_azure_identity_installed")
if not all([EntraIdentityConfig, SCOPE_AI_AZURE_DEFAULT, has_azure_identity_installed]):
raise ImportError("azure provider services not fully registered")
from agent.azure_identity_adapter import (
EntraIdentityConfig,
SCOPE_AI_AZURE_DEFAULT,
has_azure_identity_installed,
)
installed = has_azure_identity_installed()
entra_cfg = {}
if isinstance(model_cfg, dict) and isinstance(model_cfg.get("entra"), dict):
+11 -18
@@ -221,12 +221,9 @@ def auth_add_command(args) -> None:
return
if provider == "anthropic":
from agent.plugin_registries import registries
_anthropic_ns = registries.get_provider_namespace("anthropic")
run_hermes_oauth_login_pure = _anthropic_ns.get("run_hermes_oauth_login_pure")
if not run_hermes_oauth_login_pure:
raise SystemExit("Anthropic plugin not loaded — cannot run OAuth login.")
creds = run_hermes_oauth_login_pure()
from agent import anthropic_adapter as anthropic_mod
creds = anthropic_mod.run_hermes_oauth_login_pure()
if not creds:
raise SystemExit("Anthropic OAuth login did not return credentials.")
label = (getattr(args, "label", None) or "").strip() or label_from_token(
@@ -552,12 +549,8 @@ def _interactive_auth() -> None:
# Show AWS Bedrock credential status (not in the pool — uses boto3 chain)
try:
from agent.plugin_registries import registries
_bedrock = registries.get_provider_namespace("bedrock")
has_aws_credentials = _bedrock.get("has_aws_credentials")
resolve_aws_auth_env_var = _bedrock.get("resolve_aws_auth_env_var")
resolve_bedrock_region = _bedrock.get("resolve_bedrock_region")
if has_aws_credentials and has_aws_credentials():
from agent.bedrock_adapter import has_aws_credentials, resolve_aws_auth_env_var, resolve_bedrock_region
if has_aws_credentials():
auth_source = resolve_aws_auth_env_var() or "unknown"
region = resolve_bedrock_region()
print(f"bedrock (AWS SDK credential chain):")
@@ -584,12 +577,12 @@ def _interactive_auth() -> None:
_cfg_provider = str(_model_cfg.get("provider") or "").strip().lower()
_cfg_auth_mode = str(_model_cfg.get("auth_mode") or "").strip().lower()
if _cfg_provider == "azure-foundry" and _cfg_auth_mode == "entra_id":
from agent.plugin_registries import registries
_azure = registries.get_provider_namespace("azure")
EntraIdentityConfig = _azure.get("EntraIdentityConfig")
SCOPE_AI_AZURE_DEFAULT = _azure.get("SCOPE_AI_AZURE_DEFAULT")
describe_active_credential = _azure.get("describe_active_credential")
has_azure_identity_installed = _azure.get("has_azure_identity_installed")
from agent.azure_identity_adapter import (
EntraIdentityConfig,
SCOPE_AI_AZURE_DEFAULT,
describe_active_credential,
has_azure_identity_installed,
)
_base_url = str(_model_cfg.get("base_url") or "").strip()
_entra = _model_cfg.get("entra") or {}
if not isinstance(_entra, dict):
+11
@@ -1726,6 +1726,15 @@ DEFAULT_CONFIG = {
# assignee to any installed profile. When unset, falls back to the
# default profile. A task never ends up with assignee=None.
"default_assignee": "",
# Per-profile concurrency cap (#21582). When set to a positive int,
# no single profile can have more than N workers running at once,
# even if the global max_in_progress / max_spawn caps would allow
# it. Tasks blocked this way defer to the next dispatcher tick.
# Unset (None) means "no per-profile cap" — backward-compatible
# with existing installs. Useful for fan-out workflows that would
# otherwise saturate one profile's local model / API quota /
# browser pool while leaving other profiles idle.
"max_in_progress_per_profile": None,
# When true, the kanban dispatcher auto-runs the decomposer on
# tasks that land in Triage (every dispatcher tick). When false,
# decomposition is manual via `hermes kanban decompose <id>` or
@@ -5551,6 +5560,8 @@ def set_config_value(key: str, value: str):
"terminal.daytona_image": "TERMINAL_DAYTONA_IMAGE",
"terminal.docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
"terminal.docker_run_as_host_user": "TERMINAL_DOCKER_RUN_AS_HOST_USER",
"terminal.docker_persist_across_processes": "TERMINAL_DOCKER_PERSIST_ACROSS_PROCESSES",
"terminal.docker_orphan_reaper": "TERMINAL_DOCKER_ORPHAN_REAPER",
"terminal.docker_env": "TERMINAL_DOCKER_ENV",
# terminal.cwd intentionally excluded — CLI resolves at runtime,
# gateway bridges it in gateway/run.py. Persisting to .env causes
+38 -24
@@ -28,6 +28,7 @@ from hermes_cli.models import _HERMES_USER_AGENT
from hermes_constants import OPENROUTER_MODELS_URL
from utils import base_url_host_matches
_PROVIDER_ENV_HINTS = (
"OPENROUTER_API_KEY",
"OPENAI_API_KEY",
@@ -53,11 +54,14 @@ _PROVIDER_ENV_HINTS = (
"TOKENHUB_API_KEY",
)
from hermes_constants import is_termux as _is_termux
def _python_install_cmd() -> str:
return "python -m pip install" if _is_termux() else "uv pip install"
def _system_package_install_cmd(pkg: str) -> str:
if _is_termux():
return f"pkg install {pkg}"
@@ -65,6 +69,7 @@ def _system_package_install_cmd(pkg: str) -> str:
return f"brew install {pkg}"
return f"sudo apt install {pkg}"
def _safe_which(cmd: str) -> str | None:
"""shutil.which wrapper resilient to platform monkeypatching in tests."""
try:
@@ -72,6 +77,7 @@ def _safe_which(cmd: str) -> str | None:
except Exception:
return None
def _termux_browser_setup_steps(node_installed: bool) -> list[str]:
steps: list[str] = []
step = 1
@@ -82,6 +88,7 @@ def _termux_browser_setup_steps(node_installed: bool) -> list[str]:
steps.append(f"{step + 1}) agent-browser install")
return steps
def _termux_install_all_fallback_notes() -> list[str]:
return [
"Termux install profile: use .[termux-all] for broad compatibility (installer default on Termux).",
@@ -90,10 +97,12 @@ def _termux_install_all_fallback_notes() -> list[str]:
"STT fallback: use Groq Whisper (set GROQ_API_KEY) or OpenAI Whisper (set VOICE_TOOLS_OPENAI_KEY).",
]
def _has_provider_env_config(content: str) -> bool:
"""Return True when ~/.hermes/.env contains provider auth/base URL settings."""
return any(key in content for key in _PROVIDER_ENV_HINTS)
def _honcho_is_configured_for_doctor() -> bool:
"""Return True when Honcho is configured, even if this process has no active session."""
try:
@@ -104,6 +113,7 @@ def _honcho_is_configured_for_doctor() -> bool:
except Exception:
return False
def _is_kanban_worker_env_gate(item: dict) -> bool:
"""Return True when Kanban is unavailable only because this is not a worker process."""
if item.get("name") != "kanban":
@@ -114,12 +124,14 @@ def _is_kanban_worker_env_gate(item: dict) -> bool:
tools = item.get("tools") or []
return bool(tools) and all(str(tool).startswith("kanban_") for tool in tools)
def _doctor_tool_availability_detail(toolset: str) -> str:
"""Optional explanatory suffix for toolsets whose doctor status needs context."""
if toolset == "kanban" and not os.environ.get("HERMES_KANBAN_TASK"):
return "(runtime-gated; loaded only for dispatcher-spawned workers)"
return ""
def _apply_doctor_tool_availability_overrides(available: list[str], unavailable: list[dict]) -> tuple[list[str], list[dict]]:
"""Adjust runtime-gated tool availability for doctor diagnostics."""
updated_available = list(available)
@@ -137,6 +149,7 @@ def _apply_doctor_tool_availability_overrides(available: list[str], unavailable:
updated_unavailable.append(item)
return updated_available, updated_unavailable
def _has_healthy_oauth_fallback_for_apikey_provider(provider_label: str) -> bool:
"""Return True when a direct API-key probe failure is non-blocking.
@@ -166,6 +179,7 @@ def _has_healthy_oauth_fallback_for_apikey_provider(provider_label: str) -> bool
return False
return False
def check_ok(text: str, detail: str = ""):
print(f" {color('', Colors.GREEN)} {text}" + (f" {color(detail, Colors.DIM)}" if detail else ""))
@@ -178,16 +192,19 @@ def check_fail(text: str, detail: str = ""):
def check_info(text: str):
print(f" {color('', Colors.CYAN)} {text}")
def _section(title: str) -> None:
"""Print a doctor section banner: blank line + bold cyan ◆ title."""
print()
print(color(f"◆ {title}", Colors.CYAN, Colors.BOLD))
def _fail_and_issue(text: str, detail: str, fix: str, issues: list[str]) -> None:
"""Emit a check_fail and append the corresponding fix instruction."""
check_fail(text, detail)
issues.append(fix)
def _check_s6_supervision(issues: list[str]) -> None:
"""Inside a container under our s6 /init, surface what s6 sees.
@@ -235,6 +252,7 @@ def _check_s6_supervision(issues: list[str]) -> None:
+ (f" ({', '.join(sorted(profiles))})" if len(profiles) <= 8 else "")
)
def _check_gateway_service_linger(issues: list[str]) -> None:
"""Warn when a systemd user gateway service will stop after logout.
@@ -278,8 +296,10 @@ def _check_gateway_service_linger(issues: list[str]) -> None:
else:
check_warn("Could not verify systemd linger", f"({linger_detail})")
_APIKEY_PROVIDERS_CACHE: list | None = None
def _build_apikey_providers_list() -> list:
"""Build the API-key provider health-check list once and cache it.
@@ -371,6 +391,7 @@ def _build_apikey_providers_list() -> list:
pass
return _static
def run_doctor(args):
"""Run diagnostic checks."""
should_fix = getattr(args, 'fix', False)
@@ -1454,15 +1475,12 @@ def run_doctor(args):
return _ConnectivityResult("Anthropic API", [], [])
try:
import httpx
from agent.plugin_registries import registries
_anthropic_ns = registries.get_provider_namespace("anthropic")
_is_oauth_token = _anthropic_ns.get("_is_oauth_token")
# _COMMON_BETAS and _CONTEXT_1M_BETA are now in core
from agent.anthropic_format import _COMMON_BETAS, _CONTEXT_1M_BETA
_OAUTH_ONLY_BETAS = _anthropic_ns.get("_OAUTH_ONLY_BETAS")
if not all([_is_oauth_token, _OAUTH_ONLY_BETAS]):
raise ImportError("anthropic provider services not fully registered")
from agent.anthropic_adapter import (
_is_oauth_token,
_COMMON_BETAS,
_OAUTH_ONLY_BETAS,
_CONTEXT_1M_BETA,
)
headers = {"anthropic-version": "2023-06-01"}
is_oauth = _is_oauth_token(key)
if is_oauth:
@@ -1606,13 +1624,11 @@ def run_doctor(args):
def _probe_bedrock() -> _ConnectivityResult:
try:
from agent.plugin_registries import registries
_bedrock_ns = registries.get_provider_namespace("bedrock")
has_aws_credentials = _bedrock_ns.get("has_aws_credentials")
resolve_aws_auth_env_var = _bedrock_ns.get("resolve_aws_auth_env_var")
resolve_bedrock_region = _bedrock_ns.get("resolve_bedrock_region")
if not all([has_aws_credentials, resolve_aws_auth_env_var, resolve_bedrock_region]):
raise ImportError("bedrock provider services not fully registered")
from agent.bedrock_adapter import (
has_aws_credentials,
resolve_aws_auth_env_var,
resolve_bedrock_region,
)
except ImportError:
return _ConnectivityResult("AWS Bedrock", [], [])
if not has_aws_credentials():
@@ -1683,14 +1699,12 @@ def run_doctor(args):
return _ConnectivityResult("Azure Foundry (Entra ID)", [], [])
try:
from agent.plugin_registries import registries
_azure_ns = registries.get_provider_namespace("azure")
EntraIdentityConfig = _azure_ns.get("EntraIdentityConfig")
SCOPE_AI_AZURE_DEFAULT = _azure_ns.get("SCOPE_AI_AZURE_DEFAULT")
describe_active_credential = _azure_ns.get("describe_active_credential")
has_azure_identity_installed = _azure_ns.get("has_azure_identity_installed")
if not all([EntraIdentityConfig, SCOPE_AI_AZURE_DEFAULT, describe_active_credential, has_azure_identity_installed]):
raise ImportError("azure provider services not fully registered")
from agent.azure_identity_adapter import (
EntraIdentityConfig,
SCOPE_AI_AZURE_DEFAULT,
describe_active_credential,
has_azure_identity_installed,
)
except Exception as exc:
return _ConnectivityResult(
"Azure Foundry (Entra ID)",
+3 -10
@@ -4370,9 +4370,7 @@ def _setup_feishu():
if method_idx == 0:
# ── QR scan-to-create ──
try:
from agent.plugin_registries import registries
_feishu_entry = registries.get_platform("feishu")
qr_register = _feishu_entry.helper_functions.get("qr_register") if _feishu_entry else None
from gateway.platforms.feishu import qr_register
except Exception as exc:
print_error(f" Feishu / Lark onboard import failed: {exc}")
qr_register = None
@@ -4413,13 +4411,8 @@ def _setup_feishu():
# Try to probe the bot with manual credentials
bot_name = None
try:
from agent.plugin_registries import registries
_feishu_entry = registries.get_platform("feishu")
probe_bot = _feishu_entry.helper_functions.get("probe_bot") if _feishu_entry else None
if probe_bot:
bot_info = probe_bot(app_id, app_secret, domain)
else:
bot_info = None
from gateway.platforms.feishu import probe_bot
bot_info = probe_bot(app_id, app_secret, domain)
if bot_info:
bot_name = bot_info.get("bot_name")
print_success(f" Credentials verified — bot: {bot_name or 'unnamed'}")
+38
@@ -2087,12 +2087,35 @@ def _cmd_tail(args: argparse.Namespace) -> int:
def _cmd_dispatch(args: argparse.Namespace) -> int:
# Honour kanban.default_assignee as the fallback for unassigned ready
# tasks (#27145) and kanban.max_in_progress_per_profile as the
# per-profile concurrency cap (#21582). Same semantics as the
# gateway dispatch path.
try:
from hermes_cli.config import load_config
_cfg = load_config()
_kanban_cfg = _cfg.get("kanban", {}) if isinstance(_cfg, dict) else {}
default_assignee = (_kanban_cfg.get("default_assignee") or "").strip() or None
_raw_per_profile = _kanban_cfg.get("max_in_progress_per_profile", None)
try:
max_in_progress_per_profile = (
int(_raw_per_profile) if _raw_per_profile is not None else None
)
if max_in_progress_per_profile is not None and max_in_progress_per_profile < 1:
max_in_progress_per_profile = None
except (TypeError, ValueError):
max_in_progress_per_profile = None
except Exception:
default_assignee = None
max_in_progress_per_profile = None
with kb.connect_closing() as conn:
res = kb.dispatch_once(
conn,
dry_run=args.dry_run,
max_spawn=args.max,
failure_limit=getattr(args, "failure_limit", kb.DEFAULT_SPAWN_FAILURE_LIMIT),
default_assignee=default_assignee,
max_in_progress_per_profile=max_in_progress_per_profile,
)
if getattr(args, "json", False):
print(json.dumps({
@@ -2108,6 +2131,11 @@ def _cmd_dispatch(args: argparse.Namespace) -> int:
],
"skipped_unassigned": res.skipped_unassigned,
"skipped_nonspawnable": res.skipped_nonspawnable,
"skipped_per_profile_capped": [
{"task_id": tid, "assignee": who, "current": current}
for (tid, who, current) in res.skipped_per_profile_capped
],
"auto_assigned_default": res.auto_assigned_default,
}, indent=2))
return 0
print(f"Reclaimed: {res.reclaimed}")
@@ -2128,8 +2156,18 @@ def _cmd_dispatch(args: argparse.Namespace) -> int:
for tid, who, ws in res.spawned:
tag = " (dry)" if args.dry_run else ""
print(f" - {tid} -> {who} @ {ws or '-'}{tag}")
if res.auto_assigned_default:
print(
f"Auto-assigned to kanban.default_assignee={default_assignee!r}: "
f"{', '.join(res.auto_assigned_default)}"
)
if res.skipped_unassigned:
print(f"Skipped (unassigned): {', '.join(res.skipped_unassigned)}")
if res.skipped_per_profile_capped:
for tid, who, current in res.skipped_per_profile_capped:
print(
f"Deferred ({who} at per-profile cap, {current} running): {tid}"
)
if res.skipped_nonspawnable:
print(
f"Skipped (non-spawnable assignee — terminal lane, OK): "
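The config handling at the top of `_cmd_dispatch` can be read in isolation. The following is a minimal sketch of that normalization, assuming a plain dict config; the helper name `parse_kanban_dispatch_config` is hypothetical and does not exist in the codebase. Unlike the original, which relies on a broad `except Exception`, this sketch guards non-string values explicitly.

```python
from typing import Any, Optional, Tuple


def parse_kanban_dispatch_config(cfg: Any) -> Tuple[Optional[str], Optional[int]]:
    """Hypothetical helper mirroring the normalization done in _cmd_dispatch."""
    kanban_cfg = cfg.get("kanban", {}) if isinstance(cfg, dict) else {}
    if not isinstance(kanban_cfg, dict):
        kanban_cfg = {}

    # Empty or whitespace-only values mean "no fallback assignee".
    raw_assignee = kanban_cfg.get("default_assignee")
    default_assignee = None
    if isinstance(raw_assignee, str):
        default_assignee = raw_assignee.strip() or None

    # Unparseable values disable the per-profile cap instead of raising.
    raw_cap = kanban_cfg.get("max_in_progress_per_profile")
    try:
        cap = int(raw_cap) if raw_cap is not None else None
    except (TypeError, ValueError):
        cap = None
    # Zero or negative caps are treated as "cap disabled".
    if cap is not None and cap < 1:
        cap = None
    return default_assignee, cap
```

Both values fall back to `None` on any malformed input, so a broken config.yaml degrades to the pre-existing dispatcher behavior rather than failing the command.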
+126 -5
@@ -4289,6 +4289,12 @@ class DispatchResult:
skipped_unassigned: list[str] = field(default_factory=list)
"""Ready task ids skipped because they have no assignee at all.
Operator-actionable: usually a misfiled task waiting for routing."""
auto_assigned_default: list[str] = field(default_factory=list)
"""Task ids that were unassigned in the DB and had
``kanban.default_assignee`` applied this tick before spawning (#27145).
Surfaces the auto-assignment to telemetry / CLI / dashboard so the
operator can see when the dispatcher is acting on the fallback rule
rather than on explicit per-task assignments."""
skipped_nonspawnable: list[str] = field(default_factory=list)
"""Ready task ids skipped because their assignee names a control-plane
lane (a Claude Code terminal like ``orion-cc``) rather than a Hermes
@@ -4296,6 +4302,14 @@ class DispatchResult:
operator-actionable failure. Tracked separately so health telemetry
can distinguish "real stuck" (nothing spawned but spawnable work
available) from "correctly idle" (nothing spawnable in the queue)."""
skipped_per_profile_capped: list[tuple[str, str, int]] = field(default_factory=list)
"""Tasks deferred this tick because their assignee is already at
``kanban.max_in_progress_per_profile`` (#21582). Each entry is
``(task_id, assignee, current_running_count)``. NOT an
operator-actionable failure: the task will be picked up on a
subsequent tick when the assignee has capacity. Separate bucket so
telemetry / dashboards can show "this profile is busy" vs
"task is genuinely stuck"."""
crashed: list[str] = field(default_factory=list)
"""Task ids reclaimed because their worker PID disappeared."""
auto_blocked: list[str] = field(default_factory=list)
@@ -5342,6 +5356,8 @@ def dispatch_once(
failure_limit: int = DEFAULT_SPAWN_FAILURE_LIMIT,
stale_timeout_seconds: int = 0,
board: Optional[str] = None,
default_assignee: Optional[str] = None,
max_in_progress_per_profile: Optional[int] = None,
) -> DispatchResult:
"""Run one dispatcher tick.
@@ -5427,12 +5443,89 @@ def dispatch_once(
if max_spawn is None or max_spawn > remaining:
max_spawn = remaining
spawned = 0
# Per-profile concurrency cap (#21582): when set, track how many
# workers each assignee already has in flight, and refuse to spawn
# when this would push that assignee past the cap. Prevents
# fan-out workloads from melting a single profile's local model /
# API quota / browser pool while leaving other profiles idle.
# Tasks blocked this way go to skipped_per_profile_capped (not
# skipped_unassigned — the operator-actionable signal is different:
# "this profile is busy, try again later" not "this needs routing").
_per_profile_cap = max_in_progress_per_profile if (
isinstance(max_in_progress_per_profile, int)
and max_in_progress_per_profile > 0
) else None
_per_profile_running: dict[str, int] = {}
if _per_profile_cap is not None:
for prow in conn.execute(
"SELECT assignee, COUNT(*) AS n FROM tasks "
"WHERE status = 'running' AND assignee IS NOT NULL "
"GROUP BY assignee"
):
_per_profile_running[prow["assignee"]] = int(prow["n"])
# Normalize default_assignee once: empty/whitespace string → None so the
# rest of the loop can use ``if default_assignee:`` as a single check.
# We also resolve profile_exists once here for the same reason.
_default_assignee = (default_assignee or "").strip() or None
_default_assignee_resolved = False
if _default_assignee:
try:
from hermes_cli.profiles import profile_exists as _pe
_default_assignee_resolved = bool(_pe(_default_assignee))
except Exception:
# Profiles module not importable (test stubs, exotic envs).
# Trust the operator's config and try the assignment; the
# downstream profile_exists check on the assigned row will
# bucket it as nonspawnable if the profile genuinely isn't
# there, with the existing diagnostic.
_default_assignee_resolved = True
for row in ready_rows:
if max_spawn is not None and running_count + spawned >= max_spawn:
break
if not row["assignee"]:
result.skipped_unassigned.append(row["id"])
continue
row_assignee = row["assignee"]
if not row_assignee:
# Honour kanban.default_assignee: when the dispatcher hits an
# unassigned ready task and an operator-configured fallback
# exists, persist the assignment and proceed. This removes the
# dashboard footgun where a task created without an assignee
# parks in 'ready' forever even though the operator's intent
# ("default") was perfectly clear (#27145). Mutating the row
# (not just the in-memory view) keeps diagnostics and the
# board state consistent: the task is now legitimately owned
# by ``kanban.default_assignee``, not "unassigned but secretly
# routed".
if _default_assignee and _default_assignee_resolved:
# Dry-run: show what WOULD happen (auto-assign + spawn) without
# mutating the DB. Real run: mutate the row + emit the
# 'assigned' event so the board state matches what just happened.
if not dry_run:
try:
with write_txn(conn):
conn.execute(
"UPDATE tasks SET assignee = ? WHERE id = ? "
"AND (assignee IS NULL OR assignee = '')",
(_default_assignee, row["id"]),
)
_append_event(
conn, row["id"], "assigned",
{
"assignee": _default_assignee,
"source": "kanban.default_assignee",
},
)
except Exception:
_log.debug(
"kanban dispatch: failed to apply default_assignee=%r "
"to task %s",
_default_assignee, row["id"], exc_info=True,
)
result.skipped_unassigned.append(row["id"])
continue
row_assignee = _default_assignee
result.auto_assigned_default.append(row["id"])
else:
result.skipped_unassigned.append(row["id"])
continue
# Skip ready tasks whose assignee is not a real Hermes profile.
# `_default_spawn` invokes ``hermes -p <assignee>`` which fails
# with "Profile 'X' does not exist" when the assignee names a
@@ -5447,7 +5540,7 @@ def dispatch_once(
from hermes_cli.profiles import profile_exists # local import: avoids cycle
except Exception:
profile_exists = None # type: ignore[assignment]
if profile_exists is not None and not profile_exists(row["assignee"]):
if profile_exists is not None and not profile_exists(row_assignee):
# Bucket separately from skipped_unassigned: the operator
# cannot fix this by assigning a profile (the assignee IS the
# intended owner — a terminal lane). Health telemetry uses
@@ -5456,6 +5549,19 @@ def dispatch_once(
# of human-pulled work.
result.skipped_nonspawnable.append(row["id"])
continue
# Per-profile concurrency cap (#21582): even if there's global
# headroom, refuse to spawn for an assignee that's already at
# its in-flight cap. Prevents one profile's local model / API
# quota / browser pool from being overwhelmed by a fan-out
# while the global max_in_progress / max_spawn caps still allow
# work on OTHER profiles.
if _per_profile_cap is not None:
current = _per_profile_running.get(row_assignee, 0)
if current >= _per_profile_cap:
result.skipped_per_profile_capped.append(
(row["id"], row_assignee, current)
)
continue
# Respawn guard: refuse to re-spawn when useful work is already
# in-flight/recent, or when the last failure is a deterministic
# blocker (quota / auth). The guard defers the spawn this tick so
@@ -5478,7 +5584,15 @@ def dispatch_once(
)
continue
if dry_run:
result.spawned.append((row["id"], row["assignee"], ""))
result.spawned.append((row["id"], row_assignee, ""))
# Increment per-profile counter even in dry_run so the cap
# check sees the would-be spawn on subsequent iterations.
# Without this, dry_run reports every task as spawnable and
# under-reports the capped subset (#21582).
if _per_profile_cap is not None and row_assignee:
_per_profile_running[row_assignee] = (
_per_profile_running.get(row_assignee, 0) + 1
)
continue
claimed = claim_task(conn, row["id"], ttl_seconds=ttl_seconds)
if claimed is None:
@@ -5521,6 +5635,13 @@ def dispatch_once(
# complete_task).
result.spawned.append((claimed.id, claimed.assignee or "", str(workspace)))
spawned += 1
# Track the new in-flight count for this profile so later
# iterations in this same tick respect the per-profile cap
# (#21582). Subsequent ticks re-query from the DB.
if _per_profile_cap is not None and claimed.assignee:
_per_profile_running[claimed.assignee] = (
_per_profile_running.get(claimed.assignee, 0) + 1
)
except Exception as exc:
auto = _record_spawn_failure(
conn, claimed.id, str(exc),
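The interaction between the default-assignee fallback and the per-profile cap in `dispatch_once` is easier to follow without the database and spawn machinery. Below is a toy in-memory model of a dry-run tick, offered as a sketch only: `plan_dispatch` and its argument shapes are hypothetical, and the profile-existence check, respawn guard, and global `max_spawn` limit are deliberately left out.

```python
from typing import Dict, List, Optional, Tuple


def plan_dispatch(
    ready: List[Tuple[str, Optional[str]]],
    running: Dict[str, int],
    default_assignee: Optional[str] = None,
    per_profile_cap: Optional[int] = None,
) -> Dict[str, list]:
    """Toy model of one dry-run tick: (task_id, assignee) pairs in, buckets out."""
    counts = dict(running)  # copy so the caller's snapshot is not mutated
    fallback = (default_assignee or "").strip() or None
    out: Dict[str, list] = {
        "spawned": [],
        "auto_assigned_default": [],
        "skipped_unassigned": [],
        "skipped_per_profile_capped": [],
    }
    for task_id, assignee in ready:
        if not assignee:
            if fallback is None:
                out["skipped_unassigned"].append(task_id)
                continue
            # Fallback routing: the task is treated as owned by the default.
            assignee = fallback
            out["auto_assigned_default"].append(task_id)
        current = counts.get(assignee, 0)
        if per_profile_cap is not None and current >= per_profile_cap:
            # Deferred, not failed: retried on a later tick.
            out["skipped_per_profile_capped"].append((task_id, assignee, current))
            continue
        out["spawned"].append((task_id, assignee))
        # Count the would-be spawn so later tasks in this tick see it.
        counts[assignee] = current + 1
    return out
```

As in the diff, a task can appear in both `auto_assigned_default` and `skipped_per_profile_capped` when the fallback profile is itself at capacity.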
+90 -42
@@ -622,12 +622,10 @@ def _has_any_provider_configured() -> bool:
# being installed doesn't mean the user wants Hermes to use their tokens.
if _has_hermes_config:
try:
from agent.plugin_registries import registries
_anthropic = registries.get_provider_namespace("anthropic")
read_claude_code_credentials = _anthropic.get("read_claude_code_credentials")
is_claude_code_token_valid = _anthropic.get("is_claude_code_token_valid")
if read_claude_code_credentials is None or is_claude_code_token_valid is None:
raise ImportError("anthropic plugin not registered")
from agent.anthropic_adapter import (
read_claude_code_credentials,
is_claude_code_token_valid,
)
creds = read_claude_code_credentials()
if creds and (
@@ -4106,15 +4104,13 @@ def _model_flow_azure_foundry(config, current_model=""):
if use_entra:
try:
from agent.plugin_registries import registries
_azure = registries.get_provider_namespace("azure")
EntraIdentityConfig = _azure.get("EntraIdentityConfig")
SCOPE_AI_AZURE_DEFAULT = _azure.get("SCOPE_AI_AZURE_DEFAULT")
build_token_provider = _azure.get("build_token_provider")
describe_active_credential = _azure.get("describe_active_credential")
has_azure_identity_installed = _azure.get("has_azure_identity_installed")
if any(v is None for v in [EntraIdentityConfig, SCOPE_AI_AZURE_DEFAULT, build_token_provider, describe_active_credential, has_azure_identity_installed]):
raise ImportError("azure plugin not registered")
from agent.azure_identity_adapter import (
EntraIdentityConfig,
SCOPE_AI_AZURE_DEFAULT,
build_token_provider,
describe_active_credential,
has_azure_identity_installed,
)
except ImportError as exc:
print()
print(f"⚠ Could not import azure-identity adapter: {exc}")
@@ -5428,14 +5424,12 @@ def _model_flow_bedrock(config, current_model=""):
# 1. Check for AWS credentials
try:
from agent.plugin_registries import registries
_bedrock = registries.get_provider_namespace("bedrock")
has_aws_credentials = _bedrock.get("has_aws_credentials")
resolve_aws_auth_env_var = _bedrock.get("resolve_aws_auth_env_var")
resolve_bedrock_region = _bedrock.get("resolve_bedrock_region")
discover_bedrock_models = _bedrock.get("discover_bedrock_models")
if any(v is None for v in [has_aws_credentials, resolve_aws_auth_env_var, resolve_bedrock_region, discover_bedrock_models]):
raise ImportError("bedrock plugin not registered")
from agent.bedrock_adapter import (
has_aws_credentials,
resolve_aws_auth_env_var,
resolve_bedrock_region,
discover_bedrock_models,
)
except ImportError:
print(" ✗ boto3 is not installed. Install it with:")
print(" pip install boto3")
@@ -5883,13 +5877,11 @@ def _model_flow_api_key_provider(config, provider_id, current_model=""):
def _run_anthropic_oauth_flow(save_env_value):
"""Run the Claude OAuth setup-token flow. Returns True if credentials were saved."""
from agent.plugin_registries import registries
_anthropic = registries.get_provider_namespace("anthropic")
run_oauth_setup_token = _anthropic.get("run_oauth_setup_token")
read_claude_code_credentials = _anthropic.get("read_claude_code_credentials")
is_claude_code_token_valid = _anthropic.get("is_claude_code_token_valid")
if run_oauth_setup_token is None:
raise ImportError("anthropic plugin not registered")
from agent.anthropic_adapter import (
run_oauth_setup_token,
read_claude_code_credentials,
is_claude_code_token_valid,
)
from hermes_cli.config import (
save_anthropic_oauth_token,
use_anthropic_claude_code_credentials,
@@ -5997,13 +5989,11 @@ def _model_flow_anthropic(config, current_model=""):
existing_key = get_anthropic_key()
cc_available = False
try:
from agent.plugin_registries import registries
_anthropic = registries.get_provider_namespace("anthropic")
read_claude_code_credentials = _anthropic.get("read_claude_code_credentials")
is_claude_code_token_valid = _anthropic.get("is_claude_code_token_valid")
_is_oauth_token = _anthropic.get("_is_oauth_token")
if any(v is None for v in [read_claude_code_credentials, is_claude_code_token_valid, _is_oauth_token]):
raise ImportError("anthropic plugin not registered")
from agent.anthropic_adapter import (
read_claude_code_credentials,
is_claude_code_token_valid,
_is_oauth_token,
)
cc_creds = read_claude_code_credentials()
if cc_creds and is_claude_code_token_valid(cc_creds):
@@ -8120,13 +8110,71 @@ def _cleanup_quarantined_exes(scripts_dir: Path | None = None) -> None:
def _refresh_active_lazy_features() -> None:
"""No-op — lazy deps removed.
"""Refresh lazy-installed backends after a code update.
Optional backends are now proper plugin packages (hermes-agent-anthropic,
hermes-agent-telegram, etc.) installed via extras. ``hermes update``
refreshes them through ``uv pip install -e .[all]`` like any other dep.
When pyproject.toml's ``[all]`` extra was slimmed down (May 2026), most
optional backends moved to ``tools/lazy_deps.py`` and only install on
first use. ``hermes update`` runs ``uv pip install -e .[all]`` which
leaves those packages untouched, so if we bump a pin in
:data:`LAZY_DEPS` (CVE response, transitive bug fix), users who already
activated the backend keep the stale version forever.
This function asks lazy_deps which features the user has previously
activated and reinstalls them under the current pins. Features the
user never enabled stay quiet: no churn for cold backends.
Never raises. A failure here must not block the rest of the update.
"""
pass
try:
from tools import lazy_deps
except Exception as exc:
logger.debug("Lazy refresh skipped (import failed): %s", exc)
return
try:
active = lazy_deps.active_features()
except Exception as exc:
logger.debug("Lazy refresh skipped (active_features failed): %s", exc)
return
if not active:
return
print()
print(f"→ Refreshing {len(active)} active lazy backend(s)...")
try:
results = lazy_deps.refresh_active_features(prompt=False)
except Exception as exc:
# refresh_active_features is documented as never-raise, but defend
# the update flow against future regressions.
print(f" ⚠ Lazy refresh failed unexpectedly: {exc}")
return
refreshed = [f for f, s in results.items() if s == "refreshed"]
current = [f for f, s in results.items() if s == "current"]
failed = [(f, s) for f, s in results.items() if s.startswith("failed:")]
skipped = [(f, s) for f, s in results.items() if s.startswith("skipped:")]
if refreshed:
print(f"{len(refreshed)} refreshed: {', '.join(refreshed)}")
if current:
print(f"{len(current)} already current")
if skipped:
# Most common reason: security.allow_lazy_installs=false. Show one
# line so the user knows why; not an error.
names = ", ".join(f for f, _ in skipped)
reason = skipped[0][1].split(": ", 1)[-1]
print(f" · {len(skipped)} skipped ({reason}): {names}")
if failed:
for feature, status in failed:
reason = status.split(": ", 1)[-1]
# Clip noisy pip stderr to keep update output legible.
if len(reason) > 200:
reason = reason[:200] + "..."
print(f"{feature} failed to refresh: {reason}")
print(" Backends keep their previously-installed version; rerun")
print(" `hermes update` once the upstream issue is resolved.")
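The status handling in `_refresh_active_lazy_features` amounts to bucketing a feature-to-status mapping by prefix. The sketch below isolates that step; `bucket_refresh_results` and `_reason` are hypothetical names, and the status strings (`"refreshed"`, `"current"`, `"failed: ..."`, `"skipped: ..."`) are assumed to match what `refresh_active_features` returns, as implied by the code above.

```python
from typing import Dict

MAX_REASON_CHARS = 200  # same clip length used for failure output above


def _reason(status: str, clip: bool = False) -> str:
    """Extract the text after the first ': ' and optionally clip it."""
    reason = status.split(": ", 1)[-1]
    if clip and len(reason) > MAX_REASON_CHARS:
        reason = reason[:MAX_REASON_CHARS] + "..."
    return reason


def bucket_refresh_results(results: Dict[str, str]) -> Dict[str, list]:
    """Group features by refresh outcome; unknown statuses are ignored."""
    buckets: Dict[str, list] = {
        "refreshed": [], "current": [], "failed": [], "skipped": [],
    }
    for feature, status in results.items():
        if status in ("refreshed", "current"):
            buckets[status].append(feature)
        elif status.startswith("failed:"):
            # Failure reasons may carry noisy installer output, so clip them.
            buckets["failed"].append((feature, _reason(status, clip=True)))
        elif status.startswith("skipped:"):
            buckets["skipped"].append((feature, _reason(status)))
    return buckets
```

Keeping the parsing separate from the printing would make the summary lines testable without capturing stdout.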
def _install_python_dependencies_with_optional_fallback(
+6 -12
@@ -1159,12 +1159,8 @@ def list_authenticated_providers(
if slug_norm != current_norm:
return False
try:
from agent.plugin_registries import registries
_bedrock_ns = registries.get_provider_namespace("bedrock")
has_aws_credentials = _bedrock_ns.get("has_aws_credentials")
if has_aws_credentials:
return bool(has_aws_credentials())
return False
from agent.bedrock_adapter import has_aws_credentials
return bool(has_aws_credentials())
except Exception:
return False
@@ -1346,12 +1342,10 @@ def list_authenticated_providers(
# configured.
if not has_creds and hermes_slug == "anthropic":
try:
from agent.plugin_registries import registries
_anthropic_ns = registries.get_provider_namespace("anthropic")
read_claude_code_credentials = _anthropic_ns.get("read_claude_code_credentials")
read_hermes_oauth_credentials = _anthropic_ns.get("read_hermes_oauth_credentials")
if read_claude_code_credentials is None or read_hermes_oauth_credentials is None:
raise ImportError("anthropic credential readers not registered")
from agent.anthropic_adapter import (
read_claude_code_credentials,
read_hermes_oauth_credentials,
)
hermes_creds = read_hermes_oauth_credentials()
cc_creds = read_claude_code_credentials()
if (hermes_creds and hermes_creds.get("accessToken")) or \
+4 -19
@@ -2116,11 +2116,7 @@ def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False)
# below — bedrock is not expected to appear in that table.
if normalized == "bedrock":
try:
from agent.plugin_registries import registries
_bedrock_ns = registries.get_provider_namespace("bedrock")
bedrock_model_ids_or_none = _bedrock_ns.get("bedrock_model_ids_or_none")
if bedrock_model_ids_or_none is None:
raise ImportError("bedrock_model_ids_or_none not found in bedrock provider")
from agent.bedrock_adapter import bedrock_model_ids_or_none
ids = bedrock_model_ids_or_none()
if ids is not None:
return ids
@@ -2367,14 +2363,7 @@ def _fetch_anthropic_models(timeout: float = 5.0) -> Optional[list[str]]:
Claude Code auto-discovery). Returns sorted model IDs or None.
"""
try:
from agent.plugin_registries import registries
_anthropic_ns = registries.get_provider_namespace("anthropic")
resolve_anthropic_token = _anthropic_ns.get("resolve_anthropic_token")
_is_oauth_token = _anthropic_ns.get("_is_oauth_token")
# Beta header constants live in core agent.anthropic_format.
from agent.anthropic_format import _COMMON_BETAS, _OAUTH_ONLY_BETAS, _CONTEXT_1M_BETA
if resolve_anthropic_token is None or _is_oauth_token is None:
raise ImportError("anthropic provider services not registered")
from agent.anthropic_adapter import resolve_anthropic_token, _is_oauth_token
except ImportError:
return None
@@ -2386,6 +2375,7 @@ def _fetch_anthropic_models(timeout: float = 5.0) -> Optional[list[str]]:
is_oauth = _is_oauth_token(token)
if is_oauth:
headers["Authorization"] = f"Bearer {token}"
from agent.anthropic_adapter import _COMMON_BETAS, _OAUTH_ONLY_BETAS, _CONTEXT_1M_BETA
headers["anthropic-beta"] = ",".join(_COMMON_BETAS + _OAUTH_ONLY_BETAS)
else:
headers["x-api-key"] = token
@@ -3717,12 +3707,7 @@ def validate_requested_model(
# AWS SDK control plane (ListFoundationModels + ListInferenceProfiles).
if normalized == "bedrock":
try:
from agent.plugin_registries import registries
_bedrock_ns = registries.get_provider_namespace("bedrock")
discover_bedrock_models = _bedrock_ns.get("discover_bedrock_models")
resolve_bedrock_region = _bedrock_ns.get("resolve_bedrock_region")
if discover_bedrock_models is None or resolve_bedrock_region is None:
raise ImportError("bedrock discovery functions not registered")
from agent.bedrock_adapter import discover_bedrock_models, resolve_bedrock_region
region = resolve_bedrock_region()
discovered = discover_bedrock_models(region)
discovered_ids = {m["id"] for m in discovered}
+25 -285
@@ -818,270 +818,6 @@ class PluginContext:
name,
)
# -- auth provider registration -------------------------------------------
def register_platform_entry(
self,
name: str,
adapter_class: type,
check_requirements: Callable,
available_flag: str = "",
constants: dict | None = None,
helper_functions: dict | None = None,
) -> None:
"""Register a platform adapter entry in the capability registries.
This populates ``agent.plugin_registries.registries.platform_adapters``
so core code can look up adapter classes, constants, and helper
functions without importing from ``hermes_agent_*`` packages directly.
Call this **in addition to** :meth:`register_platform`; the two
registries serve different consumers:
* ``register_platform`` → ``gateway.platform_registry`` (gateway
adapter creation, setup wizard, status)
* ``register_platform_entry`` → ``agent.plugin_registries`` (adapter
class access, constants, helpers for send_message_tool, etc.)
Args:
name: Platform identifier (e.g. ``"telegram"``).
adapter_class: The adapter class itself (e.g. ``TelegramAdapter``).
check_requirements: Callable returning ``bool``: are deps installed?
available_flag: Name of the module-level AVAILABLE boolean, if any.
constants: Platform-specific constants (e.g.
``{"FEISHU_DOMAIN": ..., "LARK_DOMAIN": ...}``).
helper_functions: Platform-specific helpers (e.g.
``{"_strip_mdv2": _strip_mdv2, "qr_register": qr_register}``).
"""
from agent.plugin_registries import registries, PlatformAdapterEntry
entry = PlatformAdapterEntry(
name=name,
adapter_class=adapter_class,
check_requirements=check_requirements,
available_flag=available_flag,
constants=constants or {},
helper_functions=helper_functions or {},
)
registries.register_platform(entry)
logger.debug(
"Plugin %s registered platform entry: %s",
self.manifest.name,
name,
)
def register_tool_provider_entry(
self,
name: str,
tool_functions: dict | None = None,
check_fn: Callable | None = None,
constants: dict | None = None,
config_functions: dict | None = None,
environment_classes: dict | None = None,
) -> None:
"""Register a tool provider entry in the capability registries.
This populates ``agent.plugin_registries.registries.tool_providers``
so core code can look up tool functions, constants, and config
helpers without importing from ``hermes_agent_*`` packages directly.
Args:
name: Tool identifier (e.g. ``"tts"``, ``"stt"``).
tool_functions: Dict of function name → callable
(e.g. ``{"text_to_speech_tool": text_to_speech_tool}``).
check_fn: Optional callable returning ``bool``: are deps
installed and configured?
constants: Tool-specific constants
(e.g. ``{"MAX_FILE_SIZE": 25 * 1024 * 1024}``).
config_functions: Config/utility functions
(e.g. ``{"is_stt_enabled": is_stt_enabled}``).
environment_classes: Environment classes for terminal backends
(e.g. ``{"DaytonaEnvironment": DaytonaEnvironment}``).
"""
from agent.plugin_registries import registries, ToolProviderEntry
entry = ToolProviderEntry(
name=name,
tool_functions=tool_functions or {},
check_fn=check_fn,
constants=constants or {},
config_functions=config_functions or {},
environment_classes=environment_classes or {},
)
registries.register_tool_provider(entry)
logger.debug(
"Plugin %s registered tool provider entry: %s",
self.manifest.name,
name,
)
def register_provider_services(
self,
name: str,
services: dict,
) -> None:
"""Register a namespace dict of provider-specific services.
This is the escape hatch for model-provider plugins that expose many
symbols (anthropic has 50+). Each plugin registers its public surface
as a flat dict of ``{symbol_name: callable_or_value}``. Core code
looks up specific symbols instead of importing from the plugin
package directly.
Args:
name: Provider identifier (e.g. ``"anthropic"``, ``"bedrock"``).
services: Dict of symbol name → callable or value.
"""
from agent.plugin_registries import registries
registries.register_provider_services(name, services)
logger.debug(
"Plugin %s registered provider services: %s (%d symbols)",
self.manifest.name,
name,
len(services),
)
def register_auth_provider(
self,
name: str,
provider: Any,
*,
cli_group: str = "",
setup_subcommands: bool = False,
) -> None:
"""Register an authentication provider.
``provider`` must implement the :class:`agent.plugin_registries.AuthProvider`
protocol (``name``, ``has_credentials``, ``check_env_vars``,
``resolve_token``, ``refresh_token``). It may also expose
provider-specific attributes (``_is_oauth_token``,
``_HERMES_OAUTH_FILE``, ``read_claude_code_credentials``, etc.)
that core code accesses via the registry.
Registered providers are queried by core code via
``registries.get_auth_provider(name)`` instead of importing
directly from ``hermes_agent_*`` packages.
"""
from agent.plugin_registries import registries
registries.register_auth_provider(
name, provider,
cli_group=cli_group,
setup_subcommands=setup_subcommands,
)
logger.debug(
"Plugin %s registered auth provider: %s",
self.manifest.name, name,
)
def register_provider_resolver(
self,
name: str,
resolver: Any,
) -> None:
"""Register a provider resolver callable.
The resolver handles ALL provider-specific client construction
logic for auxiliary tasks. Core's ``resolve_provider_client()``
dispatches to it instead of using per-provider if/elif branches.
Signature::
def resolver(
*,
model: str | None,
explicit_api_key: str | None,
explicit_base_url: str | None,
async_mode: bool,
is_vision: bool,
main_runtime: dict | None,
api_mode: str | None,
) -> tuple[Any, str] | tuple[None, None]:
...
Returns ``(client, default_model)`` or ``(None, None)``.
"""
from agent.plugin_registries import registries
registries.register_provider_resolver(name, resolver)
logger.debug(
"Plugin %s registered provider resolver: %s",
self.manifest.name, name,
)
def register_transport(
self,
api_mode: str,
transport_cls: type,
) -> None:
"""Register a ProviderTransport class for an api_mode string.
This lets the transport registry discover provider transports
from plugins without core needing to import the plugin package.
"""
from agent.plugin_registries import registries
registries._transports[api_mode] = transport_cls
logger.debug(
"Plugin %s registered transport: %s -> %s",
self.manifest.name, api_mode, transport_cls.__name__,
)
def register_credential_pool_hook(
self,
name: str,
hook: Any,
) -> None:
"""Register a credential pool hook for provider-specific pool operations.
The hook should be a :class:`agent.plugin_registries.CredentialPoolHook`
instance with optional ``sync_from_credentials_file``,
``refresh_oauth``, and ``should_include_in_pool`` callables.
"""
from agent.plugin_registries import registries
registries.register_credential_pool_hook(name, hook)
logger.debug(
"Plugin %s registered credential pool hook: %s",
self.manifest.name, name,
)
def register_pricing_provider(
self,
name: str,
entries: list,
) -> None:
"""Register pricing entries for a provider.
``entries`` should be a list of
:class:`agent.plugin_registries.PricingEntry` instances.
"""
from agent.plugin_registries import registries
registries.register_pricing_provider(name, entries)
logger.debug(
"Plugin %s registered pricing provider: %s (%d entries)",
self.manifest.name, name, len(entries),
)
def register_provider_overlay(
self,
entry: Any,
) -> None:
"""Register a provider overlay entry.
``entry`` should be a :class:`agent.plugin_registries.ProviderOverlayEntry`
instance.
"""
from agent.plugin_registries import registries
registries.register_provider_overlay(entry)
logger.debug(
"Plugin %s registered provider overlay: %s",
self.manifest.name, entry.provider_name,
)
# -- hook registration --------------------------------------------------
# -- auxiliary task registration ---------------------------------------
@@ -1338,11 +1074,6 @@ class PluginManager:
)
logger.debug(" bundled/platforms: %d manifest(s)", len(bundled_platforms))
manifests.extend(bundled_platforms)
bundled_providers = self._scan_directory(
repo_plugins / "model-providers", source="bundled"
)
logger.debug(" bundled/model-providers: %d manifest(s)", len(bundled_providers))
manifests.extend(bundled_providers)
# 2. User plugins (~/.hermes/plugins/)
user_dir = get_hermes_home() / "plugins"
@@ -1379,16 +1110,7 @@ class PluginManager:
enabled = _get_enabled_plugins() # None = opt-in default (nothing enabled)
winners: Dict[str, PluginManifest] = {}
for manifest in manifests:
key = manifest.key or manifest.name
existing = winners.get(key)
# Bundled/workspace plugins take precedence over entry-points
# for the same key — the local source is the one we're
# actively developing; the entry-point is the published
# version. Only let entry-points fill gaps where no bundled
# version exists.
if existing is not None and existing.source == "bundled" and manifest.source != "bundled":
continue
winners[key] = manifest
winners[manifest.key or manifest.name] = manifest
for manifest in winners.values():
lookup_key = manifest.key or manifest.name
@@ -1416,12 +1138,30 @@ class PluginManager:
)
continue
# Model provider plugins auto-load just like backends and
# platforms. They register their provider services (auth,
# transport, metadata) via ctx.register_provider_services()
# in their register() function, which populates the
# capability registries that core code queries.
if manifest.source == "bundled" and manifest.kind in {"backend", "platform", "model-provider"}:
# Model provider plugins are loaded by providers/__init__.py
# (its own lazy discovery keyed off first get_provider_profile()
# call). We record the manifest here for introspection but do
# not import the module — a second import would create two
# ProviderProfile instances and break the "last writer wins"
# override semantics between bundled and user plugins.
if manifest.kind == "model-provider":
loaded = LoadedPlugin(manifest=manifest, enabled=True)
self._plugins[lookup_key] = loaded
logger.debug(
"Skipping '%s' (model-provider, handled by providers/ discovery)",
lookup_key,
)
continue
# Built-in backends auto-load — they ship with hermes and must
# just work. Selection among them (e.g. which image_gen backend
# services calls) is driven by ``<category>.provider`` config,
# enforced by the tool wrapper.
#
# Bundled platform plugins (gateway adapters like IRC) auto-load
# for the same reason: every platform Hermes ships must be
# available out of the box without the user having to opt in.
if manifest.source == "bundled" and manifest.kind in {"backend", "platform"}:
self._load_plugin(manifest)
continue
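The winner-selection hunk above switches between two de-duplication policies for manifests that share a key. The following is a minimal sketch of both policies, not the project's code: `Manifest` is a hypothetical stand-in for `PluginManifest` that carries only the three fields the loop reads, and both function names are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Manifest:
    """Hypothetical stand-in for PluginManifest (assumption, not the real class)."""
    name: str
    source: str
    key: Optional[str] = None


def pick_last_writer(manifests: Iterable[Manifest]) -> Dict[str, Manifest]:
    """Later manifests overwrite earlier ones that share a key."""
    winners: Dict[str, Manifest] = {}
    for manifest in manifests:
        winners[manifest.key or manifest.name] = manifest
    return winners


def pick_bundled_first(manifests: Iterable[Manifest]) -> Dict[str, Manifest]:
    """A bundled manifest is never displaced by a non-bundled one."""
    winners: Dict[str, Manifest] = {}
    for manifest in manifests:
        key = manifest.key or manifest.name
        existing = winners.get(key)
        if (
            existing is not None
            and existing.source == "bundled"
            and manifest.source != "bundled"
        ):
            continue  # keep the bundled copy, skip the other source
        winners[key] = manifest
    return winners
```

With the same input order, the two policies differ only when a non-bundled manifest arrives after a bundled one for the same key, so scan order matters under the last-writer policy.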
+17 -40
@@ -99,8 +99,10 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
transport="openai_chat",
extra_env_vars=("COPILOT_GITHUB_TOKEN", "GH_TOKEN"),
),
# "anthropic" overlay moved to plugin: hermes_agent_anthropic register()
# Plugin registers via ctx.register_provider_overlay() and core merges lazily.
"anthropic": HermesOverlay(
transport="anthropic_messages",
extra_env_vars=("ANTHROPIC_TOKEN", "CLAUDE_CODE_OAUTH_TOKEN"),
),
"zai": HermesOverlay(
transport="openai_chat",
extra_env_vars=("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"),
@@ -202,45 +204,17 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
),
# Azure Foundry: supports both OpenAI-style and Anthropic-style endpoints.
# The transport is determined at runtime from config.yaml model.api_mode.
# "azure-foundry" overlay moved to plugin: hermes_agent_azure register()
# "bedrock" overlay moved to plugin: hermes_agent_bedrock register()
# Plugins register via ctx.register_provider_overlay() and core merges lazily.
"azure-foundry": HermesOverlay(
transport="openai_chat", # default; overridden by api_mode in config
base_url_env_var="AZURE_FOUNDRY_BASE_URL",
),
"bedrock": HermesOverlay(
transport="bedrock_converse",
auth_type="aws_sdk",
),
}
def _merge_plugin_overlays() -> None:
"""Merge plugin-registered provider overlays into HERMES_OVERLAYS.
Called lazily from ``resolve_provider`` so that plugins have had a
chance to register by the time we need the overlay data.
"""
global _plugin_overlays_merged
if _plugin_overlays_merged:
return
_plugin_overlays_merged = True
try:
from agent.plugin_registries import registries
for _name, _entry in registries.all_provider_overlays().items():
if _name not in HERMES_OVERLAYS:
HERMES_OVERLAYS[_name] = HermesOverlay(
transport=_entry.transport,
is_aggregator=_entry.is_aggregator,
auth_type=_entry.auth_type,
extra_env_vars=_entry.extra_env_vars,
base_url_override=_entry.base_url_override,
base_url_env_var=_entry.base_url_env_var,
)
# Also merge aliases from the plugin overlay entry
for _alias in _entry.aliases:
if _alias not in ALIASES:
ALIASES[_alias] = _name
except Exception:
pass
_plugin_overlays_merged = False
# -- Resolved provider -------------------------------------------------------
# The merged result of models.dev + overlay + user config.
@@ -361,7 +335,11 @@ ALIASES: Dict[str, str] = {
"tencent-cloud": "tencent-tokenhub",
"tencentmaas": "tencent-tokenhub",
# bedrock aliases moved to plugin: hermes_agent_bedrock register()
# bedrock
"aws": "bedrock",
"aws-bedrock": "bedrock",
"amazon-bedrock": "bedrock",
"amazon": "bedrock",
# arcee
"arcee-ai": "arcee",
@@ -448,7 +426,6 @@ def get_provider(name: str) -> Optional[ProviderDef]:
except Exception:
mdev_info = None
_merge_plugin_overlays()
overlay = HERMES_OVERLAYS.get(canonical)
if mdev_info is not None:
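The `_merge_plugin_overlays` helper in this file follows a one-shot, core-wins merge pattern. Here is a reduced sketch of that pattern under stated assumptions: `OverlayEntry`, `CORE_OVERLAYS`, and `CORE_ALIASES` are hypothetical simplifications of `HermesOverlay`, `HERMES_OVERLAYS`, and `ALIASES`, and because indentation is lost in this view, the sketch assumes aliases are merged for every plugin entry whether or not its overlay was already present.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class OverlayEntry:
    """Hypothetical, reduced overlay record (assumption)."""
    transport: str
    aliases: Tuple[str, ...] = ()


# Hypothetical core tables; the real ones hold many more providers.
CORE_OVERLAYS: Dict[str, OverlayEntry] = {"zai": OverlayEntry("openai_chat")}
CORE_ALIASES: Dict[str, str] = {"aws": "bedrock"}
_merged = False


def merge_plugin_overlays(plugin_overlays: Dict[str, OverlayEntry]) -> None:
    """Merge plugin overlays once; entries already in core are kept."""
    global _merged
    if _merged:
        return
    _merged = True  # set before the loop, so a failed merge is not retried
    for name, entry in plugin_overlays.items():
        CORE_OVERLAYS.setdefault(name, entry)
        for alias in entry.aliases:
            CORE_ALIASES.setdefault(alias, name)
```

Setting the flag before the loop, as the hunk does, means an exception during the first merge permanently skips plugin overlays for that process; that may be intentional, but it is worth knowing when debugging a missing provider.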
+13 -25
@@ -976,13 +976,11 @@ def _resolve_azure_foundry_runtime(
auth_mode = "api_key"
else:
try:
from agent.plugin_registries import registries
_azure_ns = registries.get_provider_namespace("azure")
EntraIdentityConfig = _azure_ns.get("EntraIdentityConfig")
SCOPE_AI_AZURE_DEFAULT = _azure_ns.get("SCOPE_AI_AZURE_DEFAULT")
build_token_provider = _azure_ns.get("build_token_provider")
if not all([EntraIdentityConfig, SCOPE_AI_AZURE_DEFAULT, build_token_provider]):
raise ImportError("azure provider services not fully registered")
from agent.azure_identity_adapter import (
EntraIdentityConfig,
SCOPE_AI_AZURE_DEFAULT,
build_token_provider,
)
except Exception as exc:
raise AuthError(
"Azure Foundry Entra ID auth requires the 'azure-identity' "
@@ -1074,11 +1072,7 @@ def _resolve_explicit_runtime(
base_url = explicit_base_url or cfg_base_url or "https://api.anthropic.com"
api_key = explicit_api_key
if not api_key:
from agent.plugin_registries import registries
_anthropic_ns = registries.get_provider_namespace("anthropic")
resolve_anthropic_token = _anthropic_ns.get("resolve_anthropic_token")
if resolve_anthropic_token is None:
raise ImportError("anthropic provider services not registered")
from agent.anthropic_adapter import resolve_anthropic_token
api_key = resolve_anthropic_token()
if not api_key:
@@ -1518,11 +1512,7 @@ def resolve_runtime_provider(
"config.yaml model section at a custom env var."
)
else:
from agent.plugin_registries import registries
_anthropic_ns = registries.get_provider_namespace("anthropic")
resolve_anthropic_token = _anthropic_ns.get("resolve_anthropic_token")
if resolve_anthropic_token is None:
raise ImportError("anthropic provider services not registered")
from agent.anthropic_adapter import resolve_anthropic_token
token = resolve_anthropic_token()
if not token:
raise AuthError(
@@ -1540,14 +1530,12 @@ def resolve_runtime_provider(
# AWS Bedrock (native Converse API via boto3)
if provider == "bedrock":
from agent.plugin_registries import registries
_bedrock_ns = registries.get_provider_namespace("bedrock")
has_aws_credentials = _bedrock_ns.get("has_aws_credentials")
resolve_aws_auth_env_var = _bedrock_ns.get("resolve_aws_auth_env_var")
resolve_bedrock_region = _bedrock_ns.get("resolve_bedrock_region")
is_anthropic_bedrock_model = _bedrock_ns.get("is_anthropic_bedrock_model")
if not all([has_aws_credentials, resolve_aws_auth_env_var, resolve_bedrock_region, is_anthropic_bedrock_model]):
raise ImportError("bedrock provider services not fully registered")
from agent.bedrock_adapter import (
has_aws_credentials,
resolve_aws_auth_env_var,
resolve_bedrock_region,
is_anthropic_bedrock_model,
)
# When the user explicitly selected bedrock (not auto-detected),
# trust boto3's credential chain — it handles IMDS, ECS task roles,
# Lambda execution roles, SSO, and other implicit sources that our
+51 -24
@@ -2052,32 +2052,59 @@ def _setup_matrix():
save_env_value("MATRIX_ENCRYPTION", "true")
print_success("E2EE enabled")
matrix_pkg = "hermes-agent[matrix]"
# Matrix deps are now a proper plugin package. Install it the normal way.
matrix_pkg = "mautrix[encryption]" if want_e2ee else "mautrix"
# Use the central lazy-deps feature group so we install ALL of
# platform.matrix's dependencies (mautrix, Markdown, aiosqlite,
# asyncpg, aiohttp-socks) — not just mautrix itself. The previous
# hand-rolled ``pip install mautrix[encryption]`` left asyncpg /
# aiosqlite uninstalled and broke E2EE connect with
# ``No module named 'asyncpg'`` on every fresh install (#31116).
try:
__import__("hermes_agent_matrix")
from tools.lazy_deps import ensure as _lazy_ensure, feature_missing
_missing_before = feature_missing("platform.matrix")
if _missing_before:
print_info(
f"Installing {matrix_pkg} (+ {len(_missing_before)} runtime deps)..."
)
try:
_lazy_ensure("platform.matrix", prompt=False)
print_success(f"{matrix_pkg} installed")
except Exception as exc:
print_warning(
f"Install failed — run manually: pip install "
f"'mautrix[encryption]' asyncpg aiosqlite Markdown "
f"aiohttp-socks"
)
print_info(f" Error: {exc}")
except ImportError:
print_info(f"Installing {matrix_pkg}...")
import subprocess
uv_bin = shutil.which("uv")
if uv_bin:
result = subprocess.run(
[uv_bin, "pip", "install", "--python", sys.executable, matrix_pkg],
capture_output=True, text=True,
)
else:
result = subprocess.run(
[sys.executable, "-m", "pip", "install", matrix_pkg],
capture_output=True, text=True,
)
if result.returncode == 0:
print_success(f"{matrix_pkg} installed")
else:
print_warning(
f"Install failed — run manually: pip install '{matrix_pkg}'"
)
if result.stderr:
print_info(f" Error: {result.stderr.strip().splitlines()[-1]}")
# tools.lazy_deps unavailable (extreme edge case — partial
# install). Fall back to the legacy single-package install
# path so the wizard still does *something*.
try:
__import__("mautrix")
except ImportError:
print_info(f"Installing {matrix_pkg}...")
import subprocess
uv_bin = shutil.which("uv")
if uv_bin:
result = subprocess.run(
[uv_bin, "pip", "install", "--python", sys.executable, matrix_pkg],
capture_output=True, text=True,
)
else:
result = subprocess.run(
[sys.executable, "-m", "pip", "install", matrix_pkg],
capture_output=True, text=True,
)
if result.returncode == 0:
print_success(f"{matrix_pkg} installed")
else:
print_warning(
f"Install failed — run manually: pip install "
f"'{matrix_pkg}' asyncpg aiosqlite Markdown aiohttp-socks"
)
if result.stderr:
print_info(f" Error: {result.stderr.strip().splitlines()[-1]}")
print()
print_info("🔒 Security: Restrict who can use your bot")
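Both install paths in the Matrix hunk above choose between `uv pip install` and `python -m pip install`. A small sketch of that choice as a pure function follows; `build_install_command` is a hypothetical helper name, not something defined in this repository, and the caller is assumed to pass the result of `shutil.which("uv")`.

```python
import shutil
import sys
from typing import List, Optional


def build_install_command(package: str, uv_bin: Optional[str]) -> List[str]:
    """Return the argv for installing `package` into the running interpreter.

    Prefers uv when a path to it is supplied, otherwise falls back to pip.
    """
    if uv_bin:
        # --python pins uv to the interpreter that is running this process.
        return [uv_bin, "pip", "install", "--python", sys.executable, package]
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    # shutil.which returns None when uv is not on PATH.
    print(build_install_command("mautrix", shutil.which("uv")))
```

Keeping the command construction separate from `subprocess.run` makes the branch easy to unit test without touching the network.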
+1 -5
@@ -779,9 +779,7 @@ def speak_text(text: str) -> None:
_debug(f"speak_text: TTS begin (paused_recording={paused_recording})")
try:
from agent.plugin_registries import registries
_tts = registries.get_tool_provider("tts")
text_to_speech_tool = _tts.tool_functions.get("text_to_speech_tool") if _tts else None
from tools.tts_tool import text_to_speech_tool
tts_text = text[:4000] if len(text) > 4000 else text
tts_text = re.sub(r'```[\s\S]*?```', ' ', tts_text) # fenced code blocks
@@ -808,8 +806,6 @@ def speak_text(text: str) -> None:
f"tts_{time.strftime('%Y%m%d_%H%M%S')}.mp3",
)
if text_to_speech_tool is None:
raise ImportError("TTS plugin not registered")
_debug(f"speak_text: synthesizing {len(tts_text)} chars -> {mp3_path}")
text_to_speech_tool(text=tts_text, output_path=mp3_path)
+32 -34
@@ -58,10 +58,22 @@ try:
from fastapi.staticfiles import StaticFiles
from pydantic import BaseModel
except ImportError:
raise SystemExit(
"Web UI requires fastapi and uvicorn.\n"
"Install with: pip install 'hermes-agent[dashboard]'"
)
# First try lazy-installing the dashboard extras. Only the user actually
# running `hermes dashboard` needs fastapi+uvicorn; lazy install keeps
# them out of every other install path. After install, re-import.
try:
from tools.lazy_deps import ensure as _lazy_ensure
_lazy_ensure("tool.dashboard", prompt=False)
from fastapi import FastAPI, HTTPException, Request, WebSocket, WebSocketDisconnect
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import FileResponse, HTMLResponse, JSONResponse, Response
from fastapi.staticfiles import StaticFiles
from pydantic import BaseModel
except Exception:
raise SystemExit(
"Web UI requires fastapi and uvicorn.\n"
f"Install with: {sys.executable} -m pip install 'fastapi' 'uvicorn[standard]'"
)
WEB_DIST = Path(os.environ["HERMES_WEB_DIST"]) if "HERMES_WEB_DIST" in os.environ else Path(__file__).parent / "web_dist"
_log = logging.getLogger(__name__)
@@ -1359,13 +1371,11 @@ def _anthropic_oauth_status() -> Dict[str, Any]:
The dashboard reports the highest-priority source that's actually present.
"""
try:
from agent.plugin_registries import registries
_anthropic = registries.get_provider_namespace("anthropic")
read_hermes_oauth_credentials = _anthropic.get("read_hermes_oauth_credentials")
read_claude_code_credentials = _anthropic.get("read_claude_code_credentials")
_HERMES_OAUTH_FILE = _anthropic.get("_HERMES_OAUTH_FILE")
if read_hermes_oauth_credentials is None:
raise ImportError("anthropic plugin not registered")
from agent.anthropic_adapter import (
read_hermes_oauth_credentials,
read_claude_code_credentials,
_HERMES_OAUTH_FILE,
)
except ImportError:
read_claude_code_credentials = None # type: ignore
read_hermes_oauth_credentials = None # type: ignore
@@ -1424,11 +1434,7 @@ def _claude_code_only_status() -> Dict[str, Any]:
when they also have a separate Hermes-managed PKCE login.
"""
try:
from agent.plugin_registries import registries
_anthropic = registries.get_provider_namespace("anthropic")
read_claude_code_credentials = _anthropic.get("read_claude_code_credentials")
if read_claude_code_credentials is None:
raise ImportError("anthropic plugin not registered")
from agent.anthropic_adapter import read_claude_code_credentials
creds = read_claude_code_credentials()
except Exception:
creds = None
@@ -1614,10 +1620,8 @@ async def disconnect_oauth_provider(provider_id: str, request: Request):
# want to undo a disconnect.
if provider_id in {"anthropic", "claude-code"}:
try:
from agent.plugin_registries import registries
_anthropic = registries.get_provider_namespace("anthropic")
_HERMES_OAUTH_FILE = _anthropic.get("_HERMES_OAUTH_FILE")
if _HERMES_OAUTH_FILE is not None and _HERMES_OAUTH_FILE.exists():
from agent.anthropic_adapter import _HERMES_OAUTH_FILE
if _HERMES_OAUTH_FILE.exists():
_HERMES_OAUTH_FILE.unlink()
except Exception:
pass
@@ -1684,15 +1688,13 @@ _oauth_sessions_lock = threading.Lock()
# Guarded so hermes web still starts if anthropic_adapter is unavailable;
# Phase 2 endpoints will return 501 in that case.
try:
from agent.plugin_registries import registries
_anthropic = registries.get_provider_namespace("anthropic")
_ANTHROPIC_OAUTH_CLIENT_ID = _anthropic.get("_OAUTH_CLIENT_ID")
_ANTHROPIC_OAUTH_TOKEN_URL = _anthropic.get("_OAUTH_TOKEN_URL")
_ANTHROPIC_OAUTH_REDIRECT_URI = _anthropic.get("_OAUTH_REDIRECT_URI")
_ANTHROPIC_OAUTH_SCOPES = _anthropic.get("_OAUTH_SCOPES")
_generate_pkce_pair = _anthropic.get("_generate_pkce")
if any(v is None for v in [_ANTHROPIC_OAUTH_CLIENT_ID, _ANTHROPIC_OAUTH_TOKEN_URL, _ANTHROPIC_OAUTH_REDIRECT_URI, _ANTHROPIC_OAUTH_SCOPES, _generate_pkce_pair]):
raise ImportError("anthropic plugin not registered")
from agent.anthropic_adapter import (
_OAUTH_CLIENT_ID as _ANTHROPIC_OAUTH_CLIENT_ID,
_OAUTH_TOKEN_URL as _ANTHROPIC_OAUTH_TOKEN_URL,
_OAUTH_REDIRECT_URI as _ANTHROPIC_OAUTH_REDIRECT_URI,
_OAUTH_SCOPES as _ANTHROPIC_OAUTH_SCOPES,
_generate_pkce as _generate_pkce_pair,
)
_ANTHROPIC_OAUTH_AVAILABLE = True
except ImportError:
_ANTHROPIC_OAUTH_AVAILABLE = False
@@ -1730,11 +1732,7 @@ def _save_anthropic_oauth_creds(access_token: str, refresh_token: str, expires_a
Mirrors what auth_commands.add_command does so the dashboard flow leaves
the system in the same state as ``hermes auth add anthropic``.
"""
from agent.plugin_registries import registries
_anthropic = registries.get_provider_namespace("anthropic")
_HERMES_OAUTH_FILE = _anthropic.get("_HERMES_OAUTH_FILE")
if _HERMES_OAUTH_FILE is None:
raise ImportError("anthropic plugin not registered")
from agent.anthropic_adapter import _HERMES_OAUTH_FILE
payload = {
"accessToken": access_token,
"refreshToken": refresh_token,
+2 -9
@@ -147,15 +147,8 @@ def create_environment(
return DockerEnvironment(image=image, cwd=cwd, timeout=timeout, **kwargs)
elif env_type == "modal":
from agent.plugin_registries import registries
_modal = registries.get_tool_provider("modal")
_ModalEnvironment = _modal.environment_classes.get("ModalEnvironment") if _modal else None
if _ModalEnvironment is None:
raise ValueError(
"Modal backend selected but the hermes_agent_modal plugin is not loaded. "
"Ensure the modal plugin is installed and enabled."
)
return _ModalEnvironment(image=image, cwd=cwd, timeout=timeout, **kwargs)
from tools.environments.modal import ModalEnvironment
return ModalEnvironment(image=image, cwd=cwd, timeout=timeout, **kwargs)
else:
raise ValueError(f"Unknown environment type: {env_type}. Use 'local', 'docker', or 'modal'")
-233
@@ -260,239 +260,6 @@ json.dump(sorted(leaf_paths(DEFAULT_CONFIG)), sys.stdout, indent=2)
echo "ok" > $out/result
'';
# ── Plugin architecture (hermetic core boundary) ───────────────────
#
# These checks prove that under NixOS (sealed venv, no pip install),
# the plugin system works correctly:
# 1. Core never imports from hermes_agent_* packages directly
# 2. Plugin registries are populated after discovery
# 3. Provider service namespaces are queryable
# 4. Optional plugins degrade gracefully (None returns, no crash)
# 5. No ensure() / lazy_deps / pip-install at runtime
# Check 1: Zero direct hermes_agent_* imports in core code
plugin-hermetic-boundary = pkgs.runCommand "hermes-plugin-hermetic-boundary" { } ''
set -e
echo "=== Checking core never imports from plugin packages ==="
# Search for direct imports from hermes_agent_* in core code
# (excluding plugins/, tests/, and comment lines)
VIOLATIONS=$(${hermesVenv}/bin/python3 -c '
import subprocess, re, sys
result = subprocess.run(
["grep", "-rn",
"from hermes_agent_\\|import hermes_agent_",
"${hermes-agent}/share/hermes-agent"],
capture_output=True, text=True
)
lines = result.stdout.strip().split("\n") if result.stdout.strip() else []
# Filter: only .py files, not in plugins/ or tests/, not comments
violations = []
for line in lines:
parts = line.split(":", 2)
if len(parts) < 3:
continue
filepath, lineno, content = parts
# Only .py files: test the path, since a grep -n line ends with the matched content
if not filepath.endswith(".py"):
continue
# Skip plugin directories
if "/plugins/" in filepath:
continue
# Skip test directories
if "/tests/" in filepath or "/test_" in filepath:
continue
# Skip comments
stripped = content.lstrip()
if stripped.startswith("#"):
continue
violations.append(line)
for v in violations:
print(v)
sys.exit(1 if violations else 0)
' 2>&1 || true)
if [ -n "$VIOLATIONS" ]; then
echo "FAIL: Core code imports directly from plugin packages:"
echo "$VIOLATIONS"
exit 1
fi
echo "PASS: Zero direct hermes_agent_* imports in core"
echo "=== Checking no ensure() / LAZY_DEPS in core ==="
ENSURE_VIOLATIONS=$(grep -rn 'ensure(' ${hermes-agent}/share/hermes-agent/agent/ ${hermes-agent}/share/hermes-agent/hermes_cli/ --include='*.py' 2>/dev/null | grep -v '__pycache__' | grep -v '# ' || true)
if [ -n "$ENSURE_VIOLATIONS" ]; then
echo "FAIL: ensure() still used in core:"
echo "$ENSURE_VIOLATIONS"
exit 1
fi
echo "PASS: No ensure() calls in core code"
mkdir -p $out
echo "ok" > $out/result
'';
# Check 2: Plugin registries populate after discovery
plugin-registries-populate = pkgs.runCommand "hermes-plugin-registries-populate" { } ''
set -e
echo "=== Checking plugin registries populate after discovery ==="
export HOME=$(mktemp -d)
RESULT=$(${hermesVenv}/bin/python3 -c '
import json, sys
from hermes_cli.plugins import PluginManager
from agent.plugin_registries import registries
pm = PluginManager()
pm.discover_and_load(force=True)
out = {
"provider_services": list(registries._provider_services.keys()),
"platform_adapters": list(registries.platform_adapters.keys()),
"tool_providers": list(registries.tool_providers.keys()),
}
json.dump(out, sys.stdout)
' 2>/dev/null)
echo "Registry state: $RESULT"
# Verify provider services populated
PROV_COUNT=$(echo "$RESULT" | ${pkgs.jq}/bin/jq '.provider_services | length')
if [ "$PROV_COUNT" -lt 1 ]; then
echo "FAIL: No provider services registered (expected >= 1)"
exit 1
fi
echo "PASS: $PROV_COUNT provider service(s) registered"
# Verify platform adapters populated
PLAT_COUNT=$(echo "$RESULT" | ${pkgs.jq}/bin/jq '.platform_adapters | length')
if [ "$PLAT_COUNT" -lt 1 ]; then
echo "FAIL: No platform adapters registered (expected >= 1)"
exit 1
fi
echo "PASS: $PLAT_COUNT platform adapter(s) registered"
# Verify tool providers populated
TOOL_COUNT=$(echo "$RESULT" | ${pkgs.jq}/bin/jq '.tool_providers | length')
if [ "$TOOL_COUNT" -lt 1 ]; then
echo "FAIL: No tool providers registered (expected >= 1)"
exit 1
fi
echo "PASS: $TOOL_COUNT tool provider(s) registered"
mkdir -p $out
echo "ok" > $out/result
'';
# Check 3: Specific provider service lookups work
plugin-provider-lookups = pkgs.runCommand "hermes-plugin-provider-lookups" { } ''
set -e
echo "=== Checking provider service lookups ==="
export HOME=$(mktemp -d)
RESULT=$(${hermesVenv}/bin/python3 -c '
import json, sys
from hermes_cli.plugins import PluginManager
from agent.plugin_registries import registries
pm = PluginManager()
pm.discover_and_load(force=True)
checks = {
"anthropic.resolve_anthropic_token": registries.get_provider_service("anthropic", "resolve_anthropic_token") is not None,
"bedrock.has_aws_credentials": registries.get_provider_service("bedrock", "has_aws_credentials") is not None,
"azure.is_token_provider": registries.get_provider_service("azure", "is_token_provider") is not None,
}
json.dump(checks, sys.stdout)
' 2>/dev/null)
echo "Lookup results: $RESULT"
for key in anthropic.resolve_anthropic_token bedrock.has_aws_credentials azure.is_token_provider; do
VALUE=$(echo "$RESULT" | ${pkgs.jq}/bin/jq --arg k "$key" '.[$k]')
if [ "$VALUE" != "true" ]; then
echo "FAIL: $key lookup returned $VALUE (expected true)"
exit 1
fi
echo "PASS: $key lookup works"
done
mkdir -p $out
echo "ok" > $out/result
'';
# Check 4: Missing plugins degrade gracefully (no crash)
plugin-missing-graceful = pkgs.runCommand "hermes-plugin-missing-graceful" { } ''
set -e
echo "=== Checking missing plugins degrade gracefully ==="
export HOME=$(mktemp -d)
${hermesVenv}/bin/python3 -c '
from agent.plugin_registries import registries
# Lookup from non-existent provider — should return None, not crash
result = registries.get_provider_service("nonexistent-provider", "some_function")
assert result is None, f"Expected None for missing provider, got {result}"
# Namespace lookup for a missing provider should return an empty dict
result2 = registries.get_provider_namespace("no-such-provider")
assert result2 == {}, f"Expected empty dict for missing namespace, got {result2}"
# Lookup specific tool provider that does not exist
result3 = registries.get_tool_provider("nonexistent-tool")
assert result3 is None, f"Expected None for missing tool provider, got {result3}"
print("PASS: All missing-plugin lookups return None or an empty dict")
' 2>&1
echo "PASS: Missing plugins degrade gracefully (no crash)"
mkdir -p $out
echo "ok" > $out/result
'';
# Check 5: No runtime pip install / ensure() in gateway/run.py or run_agent.py, and tools/lazy_deps.py is removed
plugin-no-runtime-install = pkgs.runCommand "hermes-plugin-no-runtime-install" { } ''
set -e
echo "=== Checking no runtime pip install / ensure in core ==="
# Check gateway/run.py has no ensure() or pip install
GATEWAY=${hermes-agent}/share/hermes-agent/gateway/run.py
if [ -f "$GATEWAY" ]; then
if grep -q 'ensure(' "$GATEWAY" || grep -q 'pip install' "$GATEWAY"; then
echo "FAIL: gateway/run.py contains ensure() or pip install"
grep -n 'ensure(\|pip install' "$GATEWAY"
exit 1
fi
echo "PASS: gateway/run.py has no ensure()/pip install"
else
echo "SKIP: gateway/run.py not found in package"
fi
# Check run_agent.py has no ensure() or pip install
RUN_AGENT=${hermes-agent}/share/hermes-agent/run_agent.py
if [ -f "$RUN_AGENT" ]; then
if grep -q 'ensure(' "$RUN_AGENT" || grep -q 'pip install' "$RUN_AGENT"; then
echo "FAIL: run_agent.py contains ensure() or pip install"
grep -n 'ensure(\|pip install' "$RUN_AGENT"
exit 1
fi
echo "PASS: run_agent.py has no ensure()/pip install"
else
echo "SKIP: run_agent.py not found in package"
fi
# Check tools/lazy_deps.py is gone
LAZY_DEPS=${hermes-agent}/share/hermes-agent/tools/lazy_deps.py
if [ -f "$LAZY_DEPS" ]; then
echo "FAIL: tools/lazy_deps.py still exists; it should be removed"
exit 1
fi
echo "PASS: tools/lazy_deps.py removed"
mkdir -p $out
echo "ok" > $out/result
'';
# Regression guard: messaging deps live outside [all], so the
# #messaging variant must actually ship discord.py — otherwise
# `nix profile install .#messaging` regresses to the broken default.
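The hermetic-boundary check in this file looks for direct `hermes_agent_*` imports outside plugin and test directories. The same filter can be sketched in pure Python, which avoids parsing `grep` output; this is an illustrative variant under stated assumptions, not the check the flake runs. `find_direct_plugin_imports` is a hypothetical name, and the input is assumed to be `(path, file_text)` pairs collected by the caller.

```python
import re
from typing import Iterable, List, Tuple

# Matches "from hermes_agent_x ..." or "import hermes_agent_x" at the start
# of a statement. Comment lines begin with "#" and therefore never match.
_PLUGIN_IMPORT = re.compile(r"^\s*(?:from|import)\s+hermes_agent_\w+")


def find_direct_plugin_imports(
    files: Iterable[Tuple[str, str]],
) -> List[Tuple[str, int, str]]:
    """Return (path, line number, stripped line) for each violating import."""
    violations: List[Tuple[str, int, str]] = []
    for path, text in files:
        if not path.endswith(".py"):
            continue
        # Plugin and test trees are allowed to import plugin packages.
        if "/plugins/" in path or "/tests/" in path or "/test_" in path:
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            if _PLUGIN_IMPORT.match(line):
                violations.append((path, lineno, line.strip()))
    return violations
```

An empty result corresponds to the PASS branch of the shell check; any entry would be reported as a boundary violation.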
+1 -1
@@ -4,7 +4,7 @@ let
src = ../web;
npmDeps = pkgs.fetchNpmDeps {
inherit src;
hash = "sha256-RPPWPM0nEkwsaQHrkdEP+UMTZ2aF7JHUNfsIEnKt1l8=";
hash = "sha256-HV0aISBVjwbGqDj8qQynSxGFrrZDzuYAW3D3lB/x3zo=";
};
npm = hermesNpmLib.mkNpmPassthru { folder = "web"; attr = "web"; pname = "hermes-web"; };
-58693
File diff suppressed because it is too large
-7
@@ -1,7 +0,0 @@
"""Bridge module — delegates plugin registration to hermes_agent_dashboard."""
def register(ctx):
"""Plugin entry point — delegates to the inner hermes_agent_dashboard package."""
from hermes_agent_dashboard import register as _inner_register
_inner_register(ctx)
@@ -1,6 +0,0 @@
"""Hermes Agent web dashboard."""
def register(ctx):
"""Plugin entry point — dashboard registers via ctx.register_tool_provider_entry()."""
pass
-6
@@ -1,6 +0,0 @@
name: dashboard
version: 0.1.0
description: Web dashboard (FastAPI + uvicorn)
kind: backend
provides_tools: ["dashboard"]
provides_hooks: []
-20
@@ -1,20 +0,0 @@
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
[project]
name = "hermes-agent-dashboard"
version = "0.1.0"
description = "Hermes Agent web dashboard (FastAPI + Uvicorn)"
requires-python = ">=3.11"
dependencies = [
"hermes-agent",
"fastapi==0.133.1",
"uvicorn[standard]==0.41.0",
]
[project.entry-points."hermes_agent.plugins"]
dashboard = "hermes_agent_dashboard:register"
[tool.setuptools.packages.find]
include = ["hermes_agent_dashboard*"]
-7
@@ -1,7 +0,0 @@
"""Bridge module — delegates plugin registration to hermes_agent_fal."""
def register(ctx):
"""Plugin entry point — delegates to the inner hermes_agent_fal package."""
from hermes_agent_fal import register as _inner_register
_inner_register(ctx)
@@ -1,36 +0,0 @@
"""hermes-agent-fal: FAL.ai SDK plumbing plugin for Hermes Agent."""
from hermes_agent_fal.fal_common import ( # noqa: F401
import_fal_client,
_ManagedFalSyncClient,
_extract_http_status,
_normalize_fal_queue_url_format,
)
def register(ctx):
"""Entry point for the hermes_agent.plugins entry point group.
Registers FAL SDK plumbing (import_fal_client, _ManagedFalSyncClient,
etc.) in the plugin capability registry so core code can look them
up without importing from ``hermes_agent_fal`` directly.
"""
from hermes_agent_fal.fal_common import (
import_fal_client,
_ManagedFalSyncClient,
_extract_http_status,
_normalize_fal_queue_url_format,
)
ctx.register_tool_provider_entry(
name="fal",
tool_functions={
"import_fal_client": import_fal_client,
},
constants={
"_normalize_fal_queue_url_format": _normalize_fal_queue_url_format,
},
config_functions={
"_ManagedFalSyncClient": _ManagedFalSyncClient,
"_extract_http_status": _extract_http_status,
},
)
-6
@@ -1,6 +0,0 @@
name: fal
version: 0.1.0
description: FAL.ai image generation backend
kind: backend
provides_tools: ["image_gen"]
provides_hooks: []
-19
@@ -1,19 +0,0 @@
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
[project]
name = "hermes-agent-fal"
version = "0.1.0"
description = "FAL.ai SDK plumbing plugin for Hermes Agent"
requires-python = ">=3.11"
dependencies = [
"hermes-agent",
"fal-client==0.13.1",
]
[project.entry-points."hermes_agent.plugins"]
fal = "hermes_agent_fal:register"
[tool.setuptools.packages.find]
include = ["hermes_agent_fal*"]
+2 -1
@@ -888,7 +888,8 @@ class HindsightMemoryProvider(MemoryProvider):
+ (f": {reason}" if reason else "")
)
try:
from hindsight import HindsightEmbedded # noqa: F401 — side-effect import
from tools.lazy_deps import ensure as _lazy_ensure
_lazy_ensure("memory.hindsight", prompt=False)
except ImportError:
pass
except Exception as _e:
-19
@@ -1,19 +0,0 @@
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
[project]
name = "hermes-agent-hindsight"
version = "1.0.0"
description = "Hindsight long-term memory with knowledge graph for Hermes Agent"
requires-python = ">=3.11"
dependencies = [
"hermes-agent",
"hindsight-client==0.6.1",
]
[project.entry-points."hermes_agent.plugins"]
hindsight = "hermes_agent_hindsight:register"
[tool.setuptools.packages.find]
include = ["hermes_agent_hindsight*"]
+13 -2
@@ -745,12 +745,23 @@ def get_honcho_client(config: HonchoClientConfig | None = None) -> Honcho:
"For local instances, set HONCHO_BASE_URL instead."
)
# Import the honcho SDK (installed via hermes-agent-honcho package).
# Lazy-install the honcho SDK on demand. ensure() honors
# security.allow_lazy_installs (default true). On failure we surface
# the original ImportError-shape message so existing callers still get
# the "go run hermes honcho setup" hint they used to.
try:
from honcho import Honcho # noqa: F401 — imported for side-effects
from tools.lazy_deps import FeatureUnavailable, ensure as _lazy_ensure
_lazy_ensure("memory.honcho", prompt=False)
except ImportError:
# lazy_deps module missing — fall through to the raw import below.
pass
except Exception:
# FeatureUnavailable or unexpected error. Don't crash here; let the
# actual import attempt produce the canonical error message.
pass
try:
from honcho import Honcho
except ImportError:
raise ImportError(
"honcho-ai is required for Honcho integration. "
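The Honcho hunk above uses a two-step pattern: attempt an on-demand install while swallowing any failure, then let the real import produce the user-facing error. A generic sketch of that pattern follows, assuming a caller-supplied `installer` callable in place of `tools.lazy_deps.ensure`; the function name and its parameters are hypothetical.

```python
import importlib
from types import ModuleType
from typing import Callable, Optional


def import_with_best_effort_install(
    module_name: str,
    installer: Optional[Callable[[], None]] = None,
    hint: str = "",
) -> ModuleType:
    """Try an optional install step, then import; raise one canonical error."""
    if installer is not None:
        try:
            installer()
        except Exception:
            # Install problems are deliberately ignored here; the import
            # below decides whether the feature is actually usable.
            pass
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        message = f"{module_name} is required. {hint}".strip()
        raise ImportError(message) from exc
```

The benefit of this shape is that callers see the same `ImportError` message regardless of whether the installer was missing, disabled, or failed.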
-19
@@ -1,19 +0,0 @@
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
[project]
name = "hermes-agent-honcho"
version = "1.0.0"
description = "Honcho AI-native memory for Hermes Agent"
requires-python = ">=3.11"
dependencies = [
"hermes-agent",
"honcho-ai==2.0.1",
]
[project.entry-points."hermes_agent.plugins"]
honcho = "hermes_agent_honcho:register"
[tool.setuptools.packages.find]
include = ["hermes_agent_honcho*"]
@@ -19,9 +19,3 @@ alibaba_coding_plan = ProviderProfile(
)
register_provider(alibaba_coding_plan)
def register(ctx):
"""No-op — this provider has no workspace package yet."""
pass
@@ -11,9 +11,3 @@ alibaba = ProviderProfile(
)
register_provider(alibaba)
def register(ctx):
"""No-op — this provider has no workspace package yet."""
pass
@@ -50,9 +50,3 @@ anthropic = AnthropicProfile(
)
register_provider(anthropic)
def register(ctx):
"""Plugin entry point — delegates to the inner hermes_agent_anthropic package."""
from hermes_agent_anthropic import register as _inner_register
_inner_register(ctx)
@@ -1,174 +0,0 @@
"""hermes-agent-anthropic: Anthropic Messages API adapter for Hermes Agent."""
# -----------------------------------------------------------------------
# Re-exports from adapter.py — SDK-dependent orchestration only.
# Wire-format code (message conversion, aux client wrappers, transport)
# has moved to core and is no longer re-exported here.
# -----------------------------------------------------------------------
from hermes_agent_anthropic.adapter import ( # noqa: F401
_CLAUDE_CODE_VERSION_FALLBACK,
_HERMES_OAUTH_FILE,
_OAUTH_CLIENT_ID,
_OAUTH_REDIRECT_URI,
_OAUTH_SCOPES,
_OAUTH_TOKEN_URL,
_build_anthropic_client_with_bearer_hook,
_detect_claude_code_version,
_generate_pkce,
_get_anthropic_sdk,
_get_claude_code_version,
_is_azure_anthropic_endpoint,
_is_oauth_token,
_prefer_refreshable_claude_code_token,
_read_claude_code_credentials_from_keychain,
_refresh_oauth_token,
_requires_bearer_auth,
_resolve_claude_code_token_from_credentials,
_write_claude_code_credentials,
build_anthropic_bedrock_client,
build_anthropic_client,
is_claude_code_token_valid,
read_claude_code_credentials,
read_claude_managed_key,
read_hermes_oauth_credentials,
refresh_anthropic_oauth_pure,
resolve_anthropic_token,
run_hermes_oauth_login_pure,
run_oauth_setup_token,
)
# Re-exports from resolve.py — client resolution & endpoint detection
from hermes_agent_anthropic.resolve import ( # noqa: F401
_ANTHROPIC_DEFAULT_BASE_URL as ANTHROPIC_DEFAULT_BASE_URL,
convert_openai_images_to_anthropic,
endpoint_speaks_anthropic_messages,
is_anthropic_compat_endpoint,
maybe_wrap_anthropic,
resolve_auxiliary_client,
)
def register(ctx):
"""Entry point for the hermes_agent.plugins entry point group."""
from hermes_agent_anthropic import adapter
# -----------------------------------------------------------------------
# Plugin-only symbols — SDK-dependent orchestration that stays in the
# plugin package. Wire-format code (message conversion, aux client
# wrappers, transport) has moved to core (agent.anthropic_format,
# agent.anthropic_aux, agent.transports.anthropic) and is no longer
# registered here.
# -----------------------------------------------------------------------
_symbols = [
# OAuth / auth constants
"_CLAUDE_CODE_VERSION_FALLBACK",
"_HERMES_OAUTH_FILE",
"_OAUTH_CLIENT_ID",
"_OAUTH_REDIRECT_URI",
"_OAUTH_SCOPES",
"_OAUTH_TOKEN_URL",
# SDK-dependent functions
"_build_anthropic_client_with_bearer_hook",
"_detect_claude_code_version",
"_generate_pkce",
"_get_anthropic_sdk",
"_get_claude_code_version",
"_is_azure_anthropic_endpoint",
"_is_oauth_token",
"_prefer_refreshable_claude_code_token",
"_read_claude_code_credentials_from_keychain",
"_refresh_oauth_token",
"_requires_bearer_auth",
"_resolve_claude_code_token_from_credentials",
"_write_claude_code_credentials",
"build_anthropic_bedrock_client",
"build_anthropic_client",
"is_claude_code_token_valid",
"read_claude_code_credentials",
"read_claude_managed_key",
"read_hermes_oauth_credentials",
"refresh_anthropic_oauth_pure",
"resolve_anthropic_token",
"run_hermes_oauth_login_pure",
"run_oauth_setup_token",
]
# resolve.py symbols — client resolution & endpoint detection
_resolve_symbols = [
"_ANTHROPIC_DEFAULT_BASE_URL",
"_ANTHROPIC_COMPAT_PROVIDERS",
"convert_openai_images_to_anthropic",
"endpoint_speaks_anthropic_messages",
"is_anthropic_compat_endpoint",
"maybe_wrap_anthropic",
"resolve_auxiliary_client",
]
_all_symbols = _symbols + _resolve_symbols
_services = {}
for name in _symbols:
_services[name] = getattr(adapter, name)
for name in _resolve_symbols:
from hermes_agent_anthropic import resolve as _resolve_mod
_services[name] = getattr(_resolve_mod, name)
# Also expose ANTHROPIC_DEFAULT_BASE_URL under the public (no-underscore) name
_services["ANTHROPIC_DEFAULT_BASE_URL"] = _services.get("_ANTHROPIC_DEFAULT_BASE_URL", "")
# Also expose the model name normalizer as a provider service
from hermes_agent_anthropic.pricing import normalize_anthropic_model_name
_services["normalize_model_name"] = normalize_anthropic_model_name
ctx.register_provider_services("anthropic", _services)
# Register the provider resolver — core dispatches to this instead of
# having per-anthropic if/elif branches in resolve_provider_client().
ctx.register_provider_resolver("anthropic", resolve_auxiliary_client)
# Register the anthropic transport so core doesn't need to import it.
from agent.transports.anthropic import AnthropicTransport
ctx.register_transport("anthropic_messages", AnthropicTransport)
# Register the credential pool hook — core dispatches to this instead of
# having per-anthropic if/elif branches in credential_pool.py.
from agent.plugin_registries import CredentialPoolHook
from hermes_agent_anthropic.credential_pool_hook import (
sync_from_credentials_file,
refresh_oauth,
needs_refresh,
should_include_in_pool,
source_priority,
discover_credentials,
ANTHROPIC_ENV_VAR_ORDER,
detect_auth_type,
)
ctx.register_credential_pool_hook("anthropic", CredentialPoolHook(
sync_from_credentials_file=sync_from_credentials_file,
refresh_oauth=refresh_oauth,
needs_refresh=needs_refresh,
should_include_in_pool=should_include_in_pool,
source_priority=source_priority,
discover_credentials=discover_credentials,
env_var_order=ANTHROPIC_ENV_VAR_ORDER,
detect_auth_type=detect_auth_type,
))
# Register pricing entries — core looks these up via the registry
# instead of hardcoding them in _OFFICIAL_DOCS_PRICING.
from hermes_agent_anthropic.pricing import (
get_anthropic_pricing_entries,
ANTHROPIC_PRICING_KEYS,
)
_entries = get_anthropic_pricing_entries()
_keyed = []
for (prov, model), entry in zip(ANTHROPIC_PRICING_KEYS, _entries):
_keyed.append((prov, model, entry))
ctx.register_pricing_provider("anthropic", _keyed)
# Register the provider overlay — core merges this into HERMES_OVERLAYS
from agent.plugin_registries import ProviderOverlayEntry
ctx.register_provider_overlay(ProviderOverlayEntry(
provider_name="anthropic",
transport="anthropic_messages",
extra_env_vars=("ANTHROPIC_TOKEN", "CLAUDE_CODE_OAUTH_TOKEN"),
display_name="Anthropic",
aliases=[],
))
File diff suppressed because it is too large
@@ -1,274 +0,0 @@
"""Anthropic credential pool hook.
Handles provider-specific pool operations: syncing from ~/.claude/.credentials.json,
refreshing OAuth tokens, and deciding which sources to include in the pool.
"""
from __future__ import annotations
import logging
import os
import time
from dataclasses import replace
from typing import Any, Optional
logger = logging.getLogger(__name__)
def sync_from_credentials_file(entry: Any) -> Any:
"""Sync a claude_code pool entry from ~/.claude/.credentials.json if tokens differ.
OAuth refresh tokens are single-use. When something external (e.g.
Claude Code CLI, or another profile's pool) refreshes the token, it
writes the new pair to ~/.claude/.credentials.json. The pool entry's
refresh token becomes stale. This function detects that and syncs.
Returns the (possibly updated) entry.
"""
if entry.source != "claude_code":
return entry
try:
from agent.plugin_registries import registries
read_claude_code_credentials = registries.get_provider_service("anthropic", "read_claude_code_credentials")
if read_claude_code_credentials is None:
return entry
creds = read_claude_code_credentials()
if not creds:
return entry
file_refresh = creds.get("refreshToken", "")
file_access = creds.get("accessToken", "")
file_expires = creds.get("expiresAt", 0)
if file_refresh and file_refresh != entry.refresh_token:
logger.debug("Pool entry %s: syncing tokens from credentials file (refresh token changed)", entry.id)
return replace(
entry,
access_token=file_access,
refresh_token=file_refresh,
expires_at_ms=file_expires,
last_status=None,
last_status_at=None,
last_error_code=None,
)
except Exception as exc:
logger.debug("Failed to sync from credentials file: %s", exc)
return entry
def refresh_oauth(entry: Any, pool: Any) -> Any:
"""Refresh an anthropic OAuth token and return the updated entry.
Handles:
- Standard OAuth refresh via ``refresh_anthropic_oauth_pure``
- Writing back to ~/.claude/.credentials.json for claude_code entries
- Retry with synced token from credentials file on refresh failure
Returns the updated entry, or the original entry on failure.
"""
from agent.plugin_registries import registries
refresh_anthropic_oauth_pure = registries.get_provider_service("anthropic", "refresh_anthropic_oauth_pure")
if refresh_anthropic_oauth_pure is None:
return entry
try:
refreshed = refresh_anthropic_oauth_pure(
entry.refresh_token,
use_json=entry.source.endswith("hermes_pkce"),
)
updated = replace(
entry,
access_token=refreshed["access_token"],
refresh_token=refreshed["refresh_token"],
expires_at_ms=refreshed["expires_at_ms"],
)
# Keep ~/.claude/.credentials.json in sync
if entry.source == "claude_code":
try:
_write_claude_code_credentials = registries.get_provider_service("anthropic", "_write_claude_code_credentials")
if _write_claude_code_credentials is not None:
_write_claude_code_credentials(
refreshed["access_token"],
refreshed["refresh_token"],
refreshed["expires_at_ms"],
)
except Exception as wexc:
logger.debug("Failed to write refreshed token to credentials file: %s", wexc)
return updated
except Exception as exc:
logger.debug("Credential refresh failed for anthropic/%s: %s", entry.id, exc)
# The refresh token may have been consumed by another process.
# Check if ~/.claude/.credentials.json has a newer token pair.
if entry.source == "claude_code":
synced = sync_from_credentials_file(entry)
if synced.refresh_token != entry.refresh_token:
logger.debug("Retrying refresh with synced token from credentials file")
try:
refreshed = refresh_anthropic_oauth_pure(
synced.refresh_token,
use_json=synced.source.endswith("hermes_pkce"),
)
updated = replace(
synced,
access_token=refreshed["access_token"],
refresh_token=refreshed["refresh_token"],
expires_at_ms=refreshed["expires_at_ms"],
last_status="OK",
last_status_at=None,
last_error_code=None,
)
try:
_write_claude_code_credentials = registries.get_provider_service("anthropic", "_write_claude_code_credentials")
if _write_claude_code_credentials is not None:
_write_claude_code_credentials(
refreshed["access_token"],
refreshed["refresh_token"],
refreshed["expires_at_ms"],
)
except Exception:
pass
return updated
except Exception:
pass
return entry
def needs_refresh(entry: Any) -> bool:
"""Check if an anthropic OAuth entry needs a token refresh."""
if entry.expires_at_ms is None:
return False
return int(entry.expires_at_ms) <= int(time.time() * 1000) + 120_000
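The expiry check above treats a token as due for refresh once it is within 120 seconds of expiring. A minimal sketch of that arithmetic follows, assuming a hypothetical `REFRESH_SKEW_MS` constant and an injectable `now_ms` parameter (neither is in the removed file, which inlines the value and reads the clock directly):

```python
import time

# Hypothetical name; the removed file inlines 120_000 in the comparison.
REFRESH_SKEW_MS = 120_000


def needs_refresh(expires_at_ms, now_ms=None):
    """Return True when the token expires within the skew window."""
    if expires_at_ms is None:
        # No expiry recorded, so there is nothing to refresh.
        return False
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return int(expires_at_ms) <= now_ms + REFRESH_SKEW_MS


# Expires 100 s from "now": inside the 120 s window, so refresh.
print(needs_refresh(1_000_000, now_ms=900_000))
# Expires 200 s from "now": outside the window, so keep the token.
print(needs_refresh(1_200_000, now_ms=1_000_000))
```

Passing the clock in as a parameter is only a convenience for testing the boundary; the comparison itself is the same as in the original function.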
def should_include_in_pool(source: str) -> bool:
"""Which anthropic credential sources should be pooled."""
return source in {"claude_code", "hermes_pkce"}
def source_priority(source: str) -> int:
"""Priority ordering for anthropic credential sources (lower = preferred)."""
_PRIORITIES = {
"claude_code": 3,
"hermes_pkce": 2,
}
return _PRIORITIES.get(source, 99)
def discover_credentials(entries: list, provider: str, is_suppressed: Any) -> tuple:
"""Discover external anthropic credentials and upsert into pool entries.
Returns (changed: bool, active_sources: set).
"""
from agent.plugin_registries import registries
changed = False
active_sources = set()
# Only auto-discover external credentials (Claude Code, Hermes PKCE)
# when the user has explicitly configured anthropic as their provider.
# Without this gate, auxiliary client fallback chains silently read
# ~/.claude/.credentials.json without user consent. See PR #4210.
try:
from hermes_cli.auth import is_provider_explicitly_configured
if not is_provider_explicitly_configured("anthropic"):
return changed, active_sources
except ImportError:
pass
# API-key vs OAuth is a user-visible choice at `hermes setup` ("Claude
# Pro/Max subscription" vs "Anthropic API key"). The signal that the
# user picked the API-key path is: ANTHROPIC_API_KEY set in the env,
# AND no OAuth env vars set — `save_anthropic_api_key()` writes the
# API key and zeros ANTHROPIC_TOKEN; `save_anthropic_oauth_token()`
# does the inverse. When that signal is present we MUST NOT seed
# autodiscovered OAuth tokens (~/.claude/.credentials.json from the
# Claude Code CLI, hermes_pkce creds from a previous OAuth login)
# into the anthropic pool — otherwise rotation on a 401/429 silently
# flips the session onto an OAuth credential, which forces the Claude
# Code identity injection, `mcp_` tool-name rewrite, and claude-cli
# User-Agent header. Users who explicitly opted into the API-key path
# are explicitly opting OUT of that masquerade. Prefer ~/.hermes/.env
# over os.environ for the same reason `_seed_from_env` does — that's
# the authoritative file that `hermes setup` writes.
try:
from hermes_cli.config import load_env
except ImportError:
load_env = None # type: ignore[assignment]
_env_file = load_env() if load_env is not None else {}
def _env_val(key: str) -> str:
return (_env_file.get(key) or os.environ.get(key) or "").strip()
anthropic_api_key = _env_val("ANTHROPIC_API_KEY")
anthropic_oauth_env = (
_env_val("ANTHROPIC_TOKEN") or _env_val("CLAUDE_CODE_OAUTH_TOKEN")
)
api_key_path_explicit = bool(anthropic_api_key and not anthropic_oauth_env)
if api_key_path_explicit:
# Prune any stale autodiscovered OAuth entries that may have been
# seeded into the on-disk pool during a previous OAuth session.
# Without this, switching OAuth -> API key at setup leaves the
# OAuth entries dormant in auth.json forever and rotation on a
# transient 401 could revive them.
retained = [
entry for entry in entries
if entry.source not in {"hermes_pkce", "claude_code"}
]
if len(retained) != len(entries):
entries[:] = retained
changed = True
return changed, active_sources
read_claude_code_credentials = registries.get_provider_service("anthropic", "read_claude_code_credentials")
read_hermes_oauth_credentials = registries.get_provider_service("anthropic", "read_hermes_oauth_credentials")
if read_claude_code_credentials is None or read_hermes_oauth_credentials is None:
return changed, active_sources
# Import pool helpers
try:
from agent.credential_pool import _upsert_entry, label_from_token, AUTH_TYPE_OAUTH
except ImportError:
return changed, active_sources
for source_name, creds in (
("hermes_pkce", read_hermes_oauth_credentials()),
("claude_code", read_claude_code_credentials()),
):
if creds and creds.get("accessToken"):
if is_suppressed(provider, source_name):
continue
active_sources.add(source_name)
changed |= _upsert_entry(
entries,
provider,
source_name,
{
"source": source_name,
"auth_type": AUTH_TYPE_OAUTH,
"access_token": creds.get("accessToken", ""),
"refresh_token": creds.get("refreshToken"),
"expires_at_ms": creds.get("expiresAt"),
"label": label_from_token(creds.get("accessToken", ""), source_name),
},
)
return changed, active_sources
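The API-key gate in `discover_credentials` can be sketched in isolation as below. This is an illustration under stated assumptions, not the plugin's code: `Entry` is a hypothetical stand-in for the real pool entry type, and `api_key_path_explicit` and `prune_oauth_entries` are helper names invented here to separate the two steps that the original performs inline.

```python
import os
from dataclasses import dataclass

# Sources the original treats as autodiscovered OAuth credentials.
OAUTH_SOURCES = {"hermes_pkce", "claude_code"}


@dataclass
class Entry:  # hypothetical stand-in for the real pool entry type
    id: str
    source: str


def api_key_path_explicit(env_file, environ=None):
    """True when an API key is set and no OAuth env var is set."""
    environ = os.environ if environ is None else environ

    def val(key):
        # The env file wins over the process environment, as in the original.
        return (env_file.get(key) or environ.get(key) or "").strip()

    api_key = val("ANTHROPIC_API_KEY")
    oauth = val("ANTHROPIC_TOKEN") or val("CLAUDE_CODE_OAUTH_TOKEN")
    return bool(api_key and not oauth)


def prune_oauth_entries(entries):
    """Drop OAuth entries in place; return True if anything was removed."""
    retained = [e for e in entries if e.source not in OAUTH_SOURCES]
    changed = len(retained) != len(entries)
    if changed:
        entries[:] = retained
    return changed


pool = [Entry("a", "env"), Entry("b", "claude_code")]
if api_key_path_explicit({"ANTHROPIC_API_KEY": "sk-ant-api03-x"}, environ={}):
    prune_oauth_entries(pool)
print([e.id for e in pool])
```

The slice assignment `entries[:] = retained` matters because the caller holds a reference to the same list; rebinding the local name would leave the caller's pool unchanged.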
# Env var scan order for anthropic — prefer OAuth tokens over API keys
ANTHROPIC_ENV_VAR_ORDER = [
"ANTHROPIC_TOKEN",
"CLAUDE_CODE_OAUTH_TOKEN",
"ANTHROPIC_API_KEY",
]
def detect_auth_type(token: str) -> str:
"""Determine auth type for an anthropic token.
OAuth tokens don't start with 'sk-ant-api'; API keys do.
"""
from agent.credential_pool import AUTH_TYPE_OAUTH, AUTH_TYPE_API_KEY
if not token.startswith("sk-ant-api"):
return AUTH_TYPE_OAUTH
return AUTH_TYPE_API_KEY
@@ -1,184 +0,0 @@
"""Anthropic model pricing data.
Official docs snapshot entries for Anthropic Claude models.
Source: https://platform.claude.com/docs/en/about-claude/pricing
"""
from __future__ import annotations
from datetime import datetime, timezone
from decimal import Decimal
from typing import List
def get_anthropic_pricing_entries() -> list:
"""Return official docs pricing entries for Anthropic Claude models."""
from agent.usage_pricing import PricingEntry
_ANTHROPIC_PRICING_URL = "https://platform.claude.com/docs/en/about-claude/pricing"
_ANTHROPIC_PRICING_VER = "anthropic-pricing-2026-05"
return [
PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url=_ANTHROPIC_PRICING_URL,
pricing_version=_ANTHROPIC_PRICING_VER,
), # key: ("anthropic", "claude-opus-4-7")
PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url=_ANTHROPIC_PRICING_URL,
pricing_version=_ANTHROPIC_PRICING_VER,
), # key: ("anthropic", "claude-opus-4-6")
PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url=_ANTHROPIC_PRICING_URL,
pricing_version=_ANTHROPIC_PRICING_VER,
), # key: ("anthropic", "claude-opus-4-5")
PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url=_ANTHROPIC_PRICING_URL,
pricing_version=_ANTHROPIC_PRICING_VER,
), # key: ("anthropic", "claude-sonnet-4-7")
PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url=_ANTHROPIC_PRICING_URL,
pricing_version=_ANTHROPIC_PRICING_VER,
), # key: ("anthropic", "claude-sonnet-4-6")
PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url=_ANTHROPIC_PRICING_URL,
pricing_version=_ANTHROPIC_PRICING_VER,
), # key: ("anthropic", "claude-sonnet-4-5")
PricingEntry(
input_cost_per_million=Decimal("0.80"),
output_cost_per_million=Decimal("4.00"),
cache_read_cost_per_million=Decimal("0.08"),
cache_write_cost_per_million=Decimal("1.00"),
source="official_docs_snapshot",
source_url=_ANTHROPIC_PRICING_URL,
pricing_version=_ANTHROPIC_PRICING_VER,
), # key: ("anthropic", "claude-haiku-4-5")
PricingEntry(
input_cost_per_million=Decimal("1.00"),
output_cost_per_million=Decimal("5.00"),
cache_read_cost_per_million=Decimal("0.10"),
cache_write_cost_per_million=Decimal("1.25"),
source="official_docs_snapshot",
source_url=_ANTHROPIC_PRICING_URL,
pricing_version=_ANTHROPIC_PRICING_VER,
), # key: ("anthropic", "claude-4-7-sonnet")
PricingEntry(
input_cost_per_million=Decimal("1.00"),
output_cost_per_million=Decimal("5.00"),
cache_read_cost_per_million=Decimal("0.10"),
cache_write_cost_per_million=Decimal("1.25"),
source="official_docs_snapshot",
source_url=_ANTHROPIC_PRICING_URL,
pricing_version=_ANTHROPIC_PRICING_VER,
), # key: ("anthropic", "claude-4-6-sonnet")
PricingEntry(
input_cost_per_million=Decimal("3.00"),
output_cost_per_million=Decimal("15.00"),
cache_read_cost_per_million=Decimal("0.30"),
cache_write_cost_per_million=Decimal("3.75"),
source="official_docs_snapshot",
source_url=_ANTHROPIC_PRICING_URL,
pricing_version=_ANTHROPIC_PRICING_VER,
), # key: ("anthropic", "claude-4-5-sonnet")
PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url=_ANTHROPIC_PRICING_URL,
pricing_version=_ANTHROPIC_PRICING_VER,
), # key: ("anthropic", "claude-4-7-opus")
PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url=_ANTHROPIC_PRICING_URL,
pricing_version=_ANTHROPIC_PRICING_VER,
), # key: ("anthropic", "claude-4-6-opus")
PricingEntry(
input_cost_per_million=Decimal("5.00"),
output_cost_per_million=Decimal("25.00"),
cache_read_cost_per_million=Decimal("0.50"),
cache_write_cost_per_million=Decimal("6.25"),
source="official_docs_snapshot",
source_url=_ANTHROPIC_PRICING_URL,
pricing_version=_ANTHROPIC_PRICING_VER,
), # key: ("anthropic", "claude-4-5-opus")
PricingEntry(
input_cost_per_million=Decimal("0.80"),
output_cost_per_million=Decimal("4.00"),
cache_read_cost_per_million=Decimal("0.08"),
cache_write_cost_per_million=Decimal("1.00"),
source="official_docs_snapshot",
source_url=_ANTHROPIC_PRICING_URL,
pricing_version=_ANTHROPIC_PRICING_VER,
), # key: ("anthropic", "claude-4-5-haiku")
]
# Model name keys for the pricing entries — must match the order above
ANTHROPIC_PRICING_KEYS = [
("anthropic", "claude-opus-4-7"),
("anthropic", "claude-opus-4-6"),
("anthropic", "claude-opus-4-5"),
("anthropic", "claude-sonnet-4-7"),
("anthropic", "claude-sonnet-4-6"),
("anthropic", "claude-sonnet-4-5"),
("anthropic", "claude-haiku-4-5"),
("anthropic", "claude-4-7-sonnet"),
("anthropic", "claude-4-6-sonnet"),
("anthropic", "claude-4-5-sonnet"),
("anthropic", "claude-4-7-opus"),
("anthropic", "claude-4-6-opus"),
("anthropic", "claude-4-5-opus"),
("anthropic", "claude-4-5-haiku"),
]
def normalize_anthropic_model_name(model: str) -> str:
"""Normalize Anthropic model name variants to canonical form.
Handles:
- Dot notation: claude-opus-4.7 -> claude-opus-4-7
- Short aliases: claude-opus-4.7 -> claude-opus-4-7
- Strips anthropic/ prefix if present
"""
import re
name = model.lower().strip()
if name.startswith("anthropic/"):
name = name[len("anthropic/"):]
# Normalize dots to dashes in version numbers
name = re.sub(r"(\d+)\.(\d+)", r"\1-\2", name)
return name
@@ -1,312 +0,0 @@
"""Anthropic provider resolver for auxiliary client construction.
Handles ALL provider-specific logic for building auxiliary clients:
credential resolution (pool, env var, OAuth), client construction,
base URL detection, and transport wrapping.
"""
from __future__ import annotations
import logging
from typing import Any, Optional, Tuple
from utils import base_url_hostname
logger = logging.getLogger(__name__)
_ANTHROPIC_DEFAULT_BASE_URL = "https://api.anthropic.com"
_ANTHROPIC_COMPAT_PROVIDERS = frozenset({"minimax", "minimax-oauth", "minimax-cn"})
# ---------------------------------------------------------------------------
# Endpoint detection helpers
# ---------------------------------------------------------------------------
def endpoint_speaks_anthropic_messages(base_url: str) -> bool:
"""True if the endpoint at ``base_url`` speaks Anthropic Messages protocol.
Covers:
- Any URL ending in ``/anthropic``
- ``api.kimi.com/coding`` (Kimi Coding Plan)
- ``api.anthropic.com`` (native Anthropic)
"""
normalized = (base_url or "").strip().lower().rstrip("/")
if not normalized:
return False
if normalized.endswith("/anthropic"):
return True
hostname = base_url_hostname(normalized)
if hostname == "api.anthropic.com":
return True
if hostname == "api.kimi.com" and "/coding" in normalized:
return True
return False
def is_anthropic_compat_endpoint(provider: str, base_url: str) -> bool:
"""Detect if an endpoint expects Anthropic-format content blocks."""
if provider in _ANTHROPIC_COMPAT_PROVIDERS:
return True
url_lower = (base_url or "").lower()
return "/anthropic" in url_lower
def convert_openai_images_to_anthropic(messages: list) -> list:
"""Convert OpenAI ``image_url`` content blocks to Anthropic ``image`` blocks."""
converted = []
for msg in messages:
content = msg.get("content")
if not isinstance(content, list):
converted.append(msg)
continue
new_content = []
changed = False
for block in content:
if block.get("type") == "image_url":
image_url_val = (block.get("image_url") or {}).get("url", "")
if image_url_val.startswith("data:"):
header, _, b64data = image_url_val.partition(",")
media_type = "image/png"
if ":" in header and ";" in header:
media_type = header.split(":", 1)[1].split(";", 1)[0]
new_content.append({
"type": "image",
"source": {
"type": "base64",
"media_type": media_type,
"data": b64data,
},
})
else:
new_content.append({
"type": "image",
"source": {
"type": "url",
"url": image_url_val,
},
})
changed = True
else:
new_content.append(block)
converted.append({**msg, "content": new_content} if changed else msg)
return converted
# ---------------------------------------------------------------------------
# Transport wrapping
# ---------------------------------------------------------------------------
def _safe_isinstance(obj: Any, maybe_type: Any) -> bool:
"""Return False instead of raising when a patched symbol is not a type."""
try:
return isinstance(obj, maybe_type)
except TypeError:
return False
def maybe_wrap_anthropic(
client_obj: Any,
model: str,
api_key: str,
base_url: str,
api_mode: Optional[str] = None,
) -> Any:
"""Rewrap a plain OpenAI client in ``AnthropicAuxiliaryClient`` when
the endpoint actually speaks Anthropic Messages.
Returns ``client_obj`` unchanged when it's already a specialized adapter
or the endpoint is OpenAI-wire.
"""
from agent.anthropic_aux import AnthropicAuxiliaryClient
# Already wrapped — don't double-wrap.
if _safe_isinstance(client_obj, AnthropicAuxiliaryClient):
return client_obj
# Check for other specialized adapters we should never re-dispatch.
try:
from agent.auxiliary_client import CodexAuxiliaryClient
if _safe_isinstance(client_obj, CodexAuxiliaryClient):
return client_obj
except ImportError:
pass
try:
from agent.gemini_native_adapter import GeminiNativeClient
if _safe_isinstance(client_obj, GeminiNativeClient):
return client_obj
except ImportError:
pass
try:
from agent.copilot_acp_client import CopilotACPClient
if _safe_isinstance(client_obj, CopilotACPClient):
return client_obj
except ImportError:
pass
# Explicit non-anthropic api_mode wins over URL heuristics.
if api_mode and api_mode != "anthropic_messages":
return client_obj
should_wrap = (
api_mode == "anthropic_messages"
or endpoint_speaks_anthropic_messages(base_url)
)
if not should_wrap:
return client_obj
from agent.plugin_registries import registries
build_anthropic_client = registries.get_provider_service("anthropic", "build_anthropic_client")
if build_anthropic_client is None:
logger.warning(
"Endpoint %s speaks Anthropic Messages but the anthropic SDK is "
"not installed — falling back to OpenAI-wire (will likely 404).",
base_url,
)
return client_obj
try:
real_client = build_anthropic_client(api_key, base_url)
except Exception as exc:
logger.warning(
"Failed to build Anthropic client for %s (%s) — falling back to "
"OpenAI-wire client.", base_url, exc,
)
return client_obj
logger.debug(
"Auxiliary transport: wrapping client in AnthropicAuxiliaryClient "
"(model=%s, base_url=%s, api_mode=%s)",
model, base_url[:60] if base_url else "", api_mode or "auto-detected",
)
return AnthropicAuxiliaryClient(
real_client, model, api_key, base_url, is_oauth=False,
)
# ---------------------------------------------------------------------------
# Pool helpers (thin wrappers over core pool functions)
# ---------------------------------------------------------------------------
def _select_pool_entry(provider: str) -> Tuple[bool, Optional[Any]]:
"""Return (pool_exists_for_provider, selected_entry)."""
try:
from agent.credential_pool import load_pool
pool = load_pool(provider)
except Exception as exc:
logger.debug("Auxiliary client: could not load pool for %s: %s", provider, exc)
return False, None
if not pool or not pool.has_credentials():
return False, None
try:
return True, pool.select()
except Exception as exc:
logger.debug("Auxiliary client: could not select pool entry for %s: %s", provider, exc)
return True, None
def _pool_runtime_api_key(entry: Any) -> str:
if entry is None:
return ""
key = getattr(entry, "runtime_api_key", None) or getattr(entry, "access_token", "")
return str(key or "").strip()
def _pool_runtime_base_url(entry: Any, fallback: str = "") -> str:
if entry is None:
return str(fallback or "").strip().rstrip("/")
url = (
getattr(entry, "runtime_base_url", None)
or getattr(entry, "inference_base_url", None)
or getattr(entry, "base_url", None)
or fallback
)
return str(url or "").strip().rstrip("/")
def _get_aux_model_for_provider(provider_id: str) -> str:
"""Return the cheap auxiliary model for a provider."""
try:
from providers import get_provider_profile
_p = get_provider_profile(provider_id)
if _p and _p.default_aux_model:
return _p.default_aux_model
except Exception:
pass
return ""
# ---------------------------------------------------------------------------
# The resolver: called by core's resolve_provider_client()
# ---------------------------------------------------------------------------
def resolve_auxiliary_client(
*,
model: str | None = None,
explicit_api_key: str | None = None,
explicit_base_url: str | None = None,
async_mode: bool = False,
is_vision: bool = False,
main_runtime: dict | None = None,
api_mode: str | None = None,
) -> tuple[Any, str] | tuple[None, None]:
"""Resolve an auxiliary client for the Anthropic provider.
Returns ``(client, default_model)`` or ``(None, None)`` if unavailable.
"""
from agent.plugin_registries import registries
from agent.anthropic_aux import (
AnthropicAuxiliaryClient,
AsyncAnthropicAuxiliaryClient,
)
_anthropic = registries.get_provider_namespace("anthropic")
build_anthropic_client = _anthropic.get("build_anthropic_client")
resolve_anthropic_token = _anthropic.get("resolve_anthropic_token")
if build_anthropic_client is None or resolve_anthropic_token is None:
return None, None
pool_present, entry = _select_pool_entry("anthropic")
if pool_present:
if entry is None:
return None, None
token = explicit_api_key or _pool_runtime_api_key(entry)
else:
entry = None
token = explicit_api_key or resolve_anthropic_token()
if not token:
return None, None
# Allow base URL override from config.yaml model.base_url, but only
# when the configured provider is anthropic.
base_url = _pool_runtime_base_url(entry, _ANTHROPIC_DEFAULT_BASE_URL) if pool_present else _ANTHROPIC_DEFAULT_BASE_URL
if explicit_base_url:
base_url = explicit_base_url.strip().rstrip("/")
try:
from hermes_cli.config import load_config
cfg = load_config()
model_cfg = cfg.get("model")
if isinstance(model_cfg, dict):
cfg_provider = str(model_cfg.get("provider") or "").strip().lower()
if cfg_provider == "anthropic":
cfg_base_url = (model_cfg.get("base_url") or "").strip().rstrip("/")
if cfg_base_url:
base_url = cfg_base_url
except Exception:
pass
_is_oauth_token = _anthropic.get("_is_oauth_token")
is_oauth = _is_oauth_token(token) if _is_oauth_token else False
default_model = model or _get_aux_model_for_provider("anthropic") or "claude-haiku-4-5-20251001"
logger.debug("Auxiliary client: Anthropic native (%s) at %s (oauth=%s)", default_model, base_url, is_oauth)
try:
real_client = build_anthropic_client(token, base_url)
except ImportError:
return None, None
client = AnthropicAuxiliaryClient(real_client, default_model, token, base_url, is_oauth=is_oauth)
if async_mode:
client = AsyncAnthropicAuxiliaryClient(client)
return client, default_model
@@ -1,20 +0,0 @@
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
[project]
name = "hermes-agent-anthropic"
version = "0.1.0"
description = "Anthropic Messages API adapter for Hermes Agent"
requires-python = ">=3.11"
dependencies = [
"hermes-agent",
"anthropic==0.87.0",
"hermes-agent-azure",
]
[project.entry-points."hermes_agent.plugins"]
anthropic = "hermes_agent_anthropic:register"
[tool.setuptools.packages.find]
include = ["hermes_agent_anthropic*"]
@@ -1,180 +0,0 @@
"""Shared fixtures for anthropic plugin tests.
Registers the anthropic plugin in the singleton registry before each test
and provides the ``agent`` fixture used by integration tests.
"""
import sys
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
def pytest_configure(config):
"""Remove sys.path entries that would shadow the real ``anthropic`` SDK.
pytest adds ``plugins/model-providers/`` to ``sys.path`` because
``plugins/model-providers/anthropic/__init__.py`` (a provider profile)
exists. This makes ``import anthropic`` find the plugin directory
instead of the installed SDK package, causing ``AttributeError:
module 'anthropic' has no attribute 'Anthropic'``.
We remove the conflicting entry, evict any wrong cached import, and
force-import the real SDK so sys.modules["anthropic"] is correct even
after pytest re-adds the conflicting path during collection.
"""
import importlib
_repo_root = Path(__file__).resolve().parent.parent.parent.parent # main/
_bad = str(_repo_root / "plugins" / "model-providers")
while _bad in sys.path:
sys.path.remove(_bad)
# Evict wrong import
if "anthropic" in sys.modules and not hasattr(sys.modules["anthropic"], "Anthropic"):
del sys.modules["anthropic"]
# Force-import the real SDK now (before pytest re-adds the bad path)
# so sys.modules["anthropic"] points to the real package.
try:
import anthropic as _real_anthropic # noqa: F401
if not hasattr(_real_anthropic, "Anthropic"):
raise ImportError("wrong anthropic module loaded")
except ImportError:
# Try explicit import from venv
import importlib.util as _ilu
for _p in sys.path:
_candidate = Path(_p) / "anthropic" / "__init__.py"
if _candidate.exists() and (_candidate.parent / "_client.py").exists():
_spec = _ilu.spec_from_file_location("anthropic", _candidate)
if _spec and _spec.loader:
_mod = _ilu.module_from_spec(_spec)
sys.modules["anthropic"] = _mod
_spec.loader.exec_module(_mod)
break
class _FullCtx:
"""Plugin context that wires up all registry hooks the anthropic plugin uses.
Uses the real registries for provider_services, provider_resolver,
credential_pool_hook, transport, and pricing so plugin internals work
correctly. Everything else is a no-op so the fixture doesn't depend on
parts of the system (platform, TTS, etc.) that aren't under test.
"""
def register_provider_services(self, name, services):
from agent.plugin_registries import registries
registries.register_provider_services(name, services)
def register_provider_resolver(self, name, resolver):
from agent.plugin_registries import registries
registries.register_provider_resolver(name, resolver)
def register_credential_pool_hook(self, name, hook):
from agent.plugin_registries import registries
registries.register_credential_pool_hook(name, hook)
def register_transport(self, api_mode, transport_cls):
from agent.plugin_registries import registries
registries._transports[api_mode] = transport_cls
def register_pricing_provider(self, name, fn):
from agent.plugin_registries import registries
registries.register_pricing_provider(name, fn)
def register_provider_overlay(self, entry):
from agent.plugin_registries import registries
registries.register_provider_overlay(entry)
# Catch-all no-op for every other register_* method (platform, TTS,
# tools, hooks, skills, etc.) so the fixture never crashes when the
# plugin calls something we don't need to wire up for unit tests.
def __getattr__(self, name):
if name.startswith("register_"):
return lambda *a, **kw: None
raise AttributeError(name)
@pytest.fixture(autouse=True)
def _register_anthropic_plugin():
"""Register the real anthropic plugin for the duration of each test,
then restore the registry to its prior state afterwards.
Calls the plugin's ``register()`` against a full context so that all
registry hooks (services, resolver, transport, pricing, etc.) are
populated. Each affected registry dict is snapshotted before the test and
restored in place afterwards, so teardown is clean even across conftest scopes.
"""
from agent.plugin_registries import registries
# Snapshot current state so we can restore after the test.
_prev_services = dict(registries._provider_services)
_prev_resolvers = dict(registries._provider_resolvers)
_prev_cph = dict(registries._credential_pool_hooks)
_prev_transports = dict(registries._transports) if hasattr(registries, "_transports") else {}
_prev_pricing = dict(registries._pricing_providers) if hasattr(registries, "_pricing_providers") else {}
_prev_overlays = dict(registries._provider_overlays) if hasattr(registries, "_provider_overlays") else {}
ctx = _FullCtx()
try:
from hermes_agent_anthropic import register as _reg # type: ignore[import]
_reg(ctx)
except ImportError:
pass
yield
# Restore — remove keys the plugin added, put back what was there before.
for d, prev in [
(registries._provider_services, _prev_services),
(registries._provider_resolvers, _prev_resolvers),
(registries._credential_pool_hooks, _prev_cph),
]:
d.clear()
d.update(prev)
for attr, prev in [
("_transports", _prev_transports),
("_pricing_providers", _prev_pricing),
("_provider_overlays", _prev_overlays),
]:
if hasattr(registries, attr):
getattr(registries, attr).clear()
getattr(registries, attr).update(prev)
def _make_tool_defs(*names: str) -> list:
"""Build minimal tool definition list accepted by AIAgent.__init__."""
return [
{
"type": "function",
"function": {
"name": n,
"description": f"{n} tool",
"parameters": {"type": "object", "properties": {}},
},
}
for n in names
]
@pytest.fixture()
def agent():
"""Minimal AIAgent with mocked OpenAI client and tool loading."""
from run_agent import AIAgent
with (
patch(
"run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")
),
patch("run_agent.check_toolset_requirements", return_value={}),
patch("run_agent.OpenAI"),
):
a = AIAgent(
api_key="test-key-1234567890",
base_url="https://openrouter.ai/api/v1",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
)
a.client = MagicMock()
return a
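The conftest above relies on two small patterns: restoring shared registry dicts in place after each test, and a context object that silently accepts any `register_*` call it does not care about. A self-contained sketch of both is shown here. `preserved` and `NoOpCtx` are hypothetical names, and `services` is a stand-in for a real registry dict.

```python
from contextlib import contextmanager


@contextmanager
def preserved(*dicts):
    """Snapshot each dict on entry and restore it in place on exit."""
    snapshots = [dict(d) for d in dicts]
    try:
        yield
    finally:
        for d, snap in zip(dicts, snapshots):
            # Mutate in place so other modules holding a reference
            # to the same dict object see the restored contents.
            d.clear()
            d.update(snap)


class NoOpCtx:
    """Accept and ignore any register_* call; reject everything else."""

    def __getattr__(self, name):
        if name.startswith("register_"):
            return lambda *args, **kwargs: None
        raise AttributeError(name)


services = {"openai": object()}
with preserved(services):
    services["anthropic"] = object()
    services.pop("openai")
restored_keys = sorted(services)
```

Restoring with `clear()` and `update()` rather than rebinding the attribute matters when other code has already imported the dict by reference.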
@@ -1,420 +0,0 @@
"""Integration tests for Anthropic-specific AIAgent behaviour.
Tests that exercise the interaction between AIAgent and the Anthropic
provider plugin, covering max_tokens passthrough, image fallback,
provider fallback routing, base-url passthrough, credential refresh,
and OAuth flag setting.
"""
import json
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from hermes_agent_anthropic.adapter import build_anthropic_client, resolve_anthropic_token, _is_oauth_token
import run_agent
from run_agent import AIAgent
def _make_tool_defs(*names: str) -> list:
"""Build minimal tool definition list accepted by AIAgent.__init__."""
return [
{
"type": "function",
"function": {
"name": n,
"description": f"{n} tool",
"parameters": {"type": "object", "properties": {}},
},
}
for n in names
]
class TestBuildApiKwargsAnthropicMaxTokens:
"""Bug fix: max_tokens was always None for Anthropic mode, ignoring user config."""
def test_max_tokens_passed_to_anthropic(self, agent):
agent.api_mode = "anthropic_messages"
agent.max_tokens = 4096
agent.reasoning_config = None
with patch("agent.transports.anthropic.build_anthropic_kwargs") as mock_build:
mock_build.return_value = {"model": "claude-sonnet-4-20250514", "messages": [], "max_tokens": 4096}
agent._build_api_kwargs([{"role": "user", "content": "test"}])
_, kwargs = mock_build.call_args
if not kwargs:
kwargs = dict(zip(
["model", "messages", "tools", "max_tokens", "reasoning_config"],
mock_build.call_args[0],
))
assert kwargs.get("max_tokens") == 4096 or mock_build.call_args[1].get("max_tokens") == 4096
def test_max_tokens_none_when_unset(self, agent):
agent.api_mode = "anthropic_messages"
agent.max_tokens = None
agent.reasoning_config = None
with patch("agent.transports.anthropic.build_anthropic_kwargs") as mock_build:
mock_build.return_value = {"model": "claude-sonnet-4-20250514", "messages": [], "max_tokens": 16384}
agent._build_api_kwargs([{"role": "user", "content": "test"}])
call_args = mock_build.call_args
# max_tokens should be None (let adapter use its default)
if call_args[1]:
assert call_args[1].get("max_tokens") is None
else:
assert call_args[0][3] is None
class TestAnthropicImageFallback:
def test_build_api_kwargs_converts_multimodal_user_image_to_text(self, agent):
agent.api_mode = "anthropic_messages"
agent.reasoning_config = None
api_messages = [{
"role": "user",
"content": [
{"type": "text", "text": "Can you see this now?"},
{"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
],
}]
with (
patch("tools.vision_tools.vision_analyze_tool", new=AsyncMock(return_value=json.dumps({"success": True, "analysis": "A cat sitting on a chair."}))),
patch("agent.transports.anthropic.build_anthropic_kwargs") as mock_build,
):
mock_build.return_value = {"model": "claude-sonnet-4-20250514", "messages": [], "max_tokens": 4096}
agent._build_api_kwargs(api_messages)
kwargs = mock_build.call_args.kwargs or dict(zip(
["model", "messages", "tools", "max_tokens", "reasoning_config"],
mock_build.call_args.args,
))
transformed = kwargs["messages"]
assert isinstance(transformed[0]["content"], str)
assert "A cat sitting on a chair." in transformed[0]["content"]
assert "Can you see this now?" in transformed[0]["content"]
assert "vision_analyze with image_url: https://example.com/cat.png" in transformed[0]["content"]
def test_build_api_kwargs_reuses_cached_image_analysis_for_duplicate_images(self, agent):
agent.api_mode = "anthropic_messages"
agent.reasoning_config = None
data_url = "data:image/png;base64,QUFBQQ=="
api_messages = [
{
"role": "user",
"content": [
{"type": "text", "text": "first"},
{"type": "input_image", "image_url": data_url},
],
},
{
"role": "user",
"content": [
{"type": "text", "text": "second"},
{"type": "input_image", "image_url": data_url},
],
},
]
mock_vision = AsyncMock(return_value=json.dumps({"success": True, "analysis": "A small test image."}))
with (
patch("tools.vision_tools.vision_analyze_tool", new=mock_vision),
patch("agent.transports.anthropic.build_anthropic_kwargs") as mock_build,
):
mock_build.return_value = {"model": "claude-sonnet-4-20250514", "messages": [], "max_tokens": 4096}
agent._build_api_kwargs(api_messages)
assert mock_vision.await_count == 1
class TestFallbackAnthropicProvider:
"""Bug fix: _try_activate_fallback had no case for anthropic provider."""
def test_fallback_to_anthropic_sets_api_mode(self, agent):
agent._fallback_activated = False
agent._fallback_model = {"provider": "anthropic", "model": "claude-sonnet-4-20250514"}
agent._fallback_chain = [agent._fallback_model]
agent._fallback_index = 0
mock_client = MagicMock()
mock_client.base_url = "https://api.anthropic.com/v1"
mock_client.api_key = "***"
with (
patch("agent.auxiliary_client.resolve_provider_client", return_value=(mock_client, None)),
patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build,
patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value=None),
):
mock_build.return_value = MagicMock()
result = agent._try_activate_fallback()
assert result is True
assert agent.api_mode == "anthropic_messages"
assert agent._anthropic_client is not None
assert agent.client is None
def test_fallback_to_anthropic_enables_prompt_caching(self, agent):
agent._fallback_activated = False
agent._fallback_model = {"provider": "anthropic", "model": "claude-sonnet-4-20250514"}
agent._fallback_chain = [agent._fallback_model]
agent._fallback_index = 0
mock_client = MagicMock()
mock_client.base_url = "https://api.anthropic.com/v1"
mock_client.api_key = "***"
with (
patch("agent.auxiliary_client.resolve_provider_client", return_value=(mock_client, None)),
patch("hermes_agent_anthropic.adapter.build_anthropic_client", return_value=MagicMock()),
patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value=None),
):
agent._try_activate_fallback()
assert agent._use_prompt_caching is True
def test_fallback_to_openrouter_uses_openai_client(self, agent):
agent._fallback_activated = False
agent._fallback_model = {"provider": "openrouter", "model": "anthropic/claude-sonnet-4"}
agent._fallback_chain = [agent._fallback_model]
agent._fallback_index = 0
mock_client = MagicMock()
mock_client.base_url = "https://openrouter.ai/api/v1"
mock_client.api_key = "sk-or-test"
with patch("agent.auxiliary_client.resolve_provider_client", return_value=(mock_client, None)):
result = agent._try_activate_fallback()
assert result is True
assert agent.api_mode == "chat_completions"
assert agent.client is mock_client
class TestAnthropicBaseUrlPassthrough:
"""Bug fix: base_url was filtered with 'anthropic in base_url', blocking proxies."""
def test_custom_proxy_base_url_passed_through(self):
with (
patch("run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")),
patch("run_agent.check_toolset_requirements", return_value={}),
patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build,
):
mock_build.return_value = MagicMock()
a = AIAgent(
api_key="sk-ant...7890",
base_url="https://llm-proxy.company.com/v1",
api_mode="anthropic_messages",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
)
call_args = mock_build.call_args
# base_url should be passed through, not filtered out
assert call_args[0][1] == "https://llm-proxy.company.com/v1"
def test_none_base_url_passed_as_none(self):
with (
patch("run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")),
patch("run_agent.check_toolset_requirements", return_value={}),
patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build,
):
mock_build.return_value = MagicMock()
a = AIAgent(
api_key="sk-ant...7890",
api_mode="anthropic_messages",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
)
call_args = mock_build.call_args
# No base_url provided, should be default empty string or None
passed_url = call_args[0][1]
assert not passed_url
class TestAnthropicCredentialRefresh:
def test_try_refresh_anthropic_client_credentials_rebuilds_client(self):
with (
patch("run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")),
patch("run_agent.check_toolset_requirements", return_value={}),
patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build,
):
old_client = MagicMock()
new_client = MagicMock()
mock_build.side_effect = [old_client, new_client]
agent = AIAgent(
api_key="sk-ant...oken",
base_url="https://openrouter.ai/api/v1",
api_mode="anthropic_messages",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
)
agent._anthropic_client = old_client
agent._anthropic_api_key = "sk-ant...old-token" # differs from what resolve returns
agent._anthropic_base_url = "https://api.anthropic.com"
agent.provider = "anthropic"
with (
patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="sk-ant...oken"),
patch("hermes_agent_anthropic.adapter.build_anthropic_client", return_value=new_client) as rebuild,
):
assert agent._try_refresh_anthropic_client_credentials() is True
old_client.close.assert_called_once()
rebuild.assert_called_once_with(
"sk-ant...oken", "https://api.anthropic.com", timeout=None,
)
assert agent._anthropic_client is new_client
assert agent._anthropic_api_key == "sk-ant...oken"
def test_try_refresh_anthropic_client_credentials_returns_false_when_token_unchanged(self):
with (
patch("run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")),
patch("run_agent.check_toolset_requirements", return_value={}),
patch("hermes_agent_anthropic.adapter.build_anthropic_client", return_value=MagicMock()),
):
agent = AIAgent(
api_key="sk-ant...oken",
base_url="https://openrouter.ai/api/v1",
api_mode="anthropic_messages",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
)
old_client = MagicMock()
agent._anthropic_client = old_client
agent._anthropic_api_key = "sk-ant...oken"
with (
patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="sk-ant...oken"),
patch("hermes_agent_anthropic.adapter.build_anthropic_client") as rebuild,
):
assert agent._try_refresh_anthropic_client_credentials() is False
old_client.close.assert_not_called()
rebuild.assert_not_called()
def test_anthropic_messages_create_preflights_refresh(self):
with (
patch("run_agent.get_tool_definitions", return_value=_make_tool_defs("web_search")),
patch("run_agent.check_toolset_requirements", return_value={}),
patch("hermes_agent_anthropic.adapter.build_anthropic_client", return_value=MagicMock()),
):
agent = AIAgent(
api_key="sk-ant...oken",
base_url="https://openrouter.ai/api/v1",
api_mode="anthropic_messages",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
)
response = SimpleNamespace(content=[])
agent._anthropic_client = MagicMock()
agent._anthropic_client.messages.create.return_value = response
with patch.object(agent, "_try_refresh_anthropic_client_credentials", return_value=True) as refresh:
result = agent._anthropic_messages_create({"model": "claude-sonnet-4-20250514"})
refresh.assert_called_once_with()
agent._anthropic_client.messages.create.assert_called_once_with(model="claude-sonnet-4-20250514")
assert result is response
class TestFallbackSetsOAuthFlag:
"""_try_activate_fallback must set _is_anthropic_oauth for Anthropic fallbacks."""
def test_fallback_to_anthropic_oauth_sets_flag(self, agent):
agent._fallback_activated = False
agent._fallback_model = {"provider": "anthropic", "model": "claude-sonnet-4-6"}
agent._fallback_chain = [agent._fallback_model]
agent._fallback_index = 0
mock_client = MagicMock()
mock_client.base_url = "https://api.anthropic.com/v1"
mock_client.api_key = "sk-ant-setup-oauth-token"
with (
patch("agent.auxiliary_client.resolve_provider_client",
return_value=(mock_client, None)),
patch("hermes_agent_anthropic.adapter.build_anthropic_client",
return_value=MagicMock()),
patch("hermes_agent_anthropic.adapter.resolve_anthropic_token",
return_value=None),
):
result = agent._try_activate_fallback()
assert result is True
assert agent._is_anthropic_oauth is True
def test_fallback_to_anthropic_api_key_clears_flag(self, agent):
agent._fallback_activated = False
agent._fallback_model = {"provider": "anthropic", "model": "claude-sonnet-4-6"}
agent._fallback_chain = [agent._fallback_model]
agent._fallback_index = 0
mock_client = MagicMock()
mock_client.base_url = "https://api.anthropic.com/v1"
mock_client.api_key = "sk-ant-api03-regular-key"
with (
patch("agent.auxiliary_client.resolve_provider_client",
return_value=(mock_client, None)),
patch("hermes_agent_anthropic.adapter.build_anthropic_client",
return_value=MagicMock()),
patch("hermes_agent_anthropic.adapter.resolve_anthropic_token",
return_value=None),
):
result = agent._try_activate_fallback()
assert result is True
assert agent._is_anthropic_oauth is False
class TestOAuthFlagAfterCredentialRefresh:
"""_is_anthropic_oauth must update when token type changes during refresh."""
def test_oauth_flag_updates_api_key_to_oauth(self, agent):
"""Refreshing from API key to OAuth token must set flag to True."""
from agent.plugin_registries import registries
agent.api_mode = "anthropic_messages"
agent.provider = "anthropic"
agent._anthropic_api_key = "***"
agent._anthropic_client = MagicMock()
agent._is_anthropic_oauth = False
with patch.dict(registries._provider_services, {"anthropic": {
"resolve_anthropic_token": MagicMock(return_value="sk-ant...oken"),
"build_anthropic_client": MagicMock(return_value=MagicMock()),
"_is_oauth_token": MagicMock(return_value=True),
}}):
result = agent._try_refresh_anthropic_client_credentials()
assert result is True
assert agent._is_anthropic_oauth is True
def test_oauth_flag_updates_oauth_to_api_key(self, agent):
"""Refreshing from OAuth to API key must set flag to False."""
from agent.plugin_registries import registries
agent.api_mode = "anthropic_messages"
agent.provider = "anthropic"
agent._anthropic_api_key = "***"
agent._anthropic_client = MagicMock()
agent._is_anthropic_oauth = True
with patch.dict(registries._provider_services, {"anthropic": {
"resolve_anthropic_token": MagicMock(return_value="sk-ant...-key"),
"build_anthropic_client": MagicMock(return_value=MagicMock()),
"_is_oauth_token": MagicMock(return_value=False),
}}):
result = agent._try_refresh_anthropic_client_credentials()
assert result is True
assert agent._is_anthropic_oauth is False
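Several tests above recover the arguments passed to a patched `build_anthropic_kwargs` by zipping positional arguments against a fixed list of names. One way to factor that out is a small helper that merges positional and keyword arguments into a single dict. `call_as_kwargs` and `PARAM_ORDER` are hypothetical; the order is copied from the list used in the tests and is assumed, not confirmed, to match the real signature.

```python
from unittest.mock import MagicMock

# Assumed positional order, copied from the tests above.
PARAM_ORDER = ["model", "messages", "tools", "max_tokens", "reasoning_config"]


def call_as_kwargs(mock, param_order):
    """Return the last call's arguments as one name -> value dict."""
    call = mock.call_args
    merged = dict(zip(param_order, call.args))  # positional args by name
    merged.update(call.kwargs)                  # keyword args as passed
    return merged


mock_build = MagicMock()
mock_build("claude", [], None, max_tokens=4096)
seen = call_as_kwargs(mock_build, PARAM_ORDER)
```

With this shape a test can assert on `seen["max_tokens"]` without branching on whether the caller used positional or keyword arguments.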
@@ -1,98 +0,0 @@
"""Anthropic-specific auth command tests moved from tests/hermes_cli/test_auth_commands.py."""
from __future__ import annotations
import base64
import json
import pytest
def _write_auth_store(tmp_path, payload: dict) -> None:
hermes_home = tmp_path / "hermes"
hermes_home.mkdir(parents=True, exist_ok=True)
(hermes_home / "auth.json").write_text(json.dumps(payload, indent=2))
def _jwt_with_email(email: str) -> str:
header = base64.urlsafe_b64encode(b'{"alg":"RS256","typ":"JWT"}').rstrip(b"=").decode()
payload = base64.urlsafe_b64encode(
json.dumps({"email": email}).encode()
).rstrip(b"=").decode()
return f"{header}.{payload}.signature"
@pytest.fixture(autouse=True)
def _clear_provider_env(monkeypatch):
for key in (
"OPENROUTER_API_KEY",
"OPENAI_API_KEY",
"ANTHROPIC_API_KEY",
"ANTHROPIC_TOKEN",
"CLAUDE_CODE_OAUTH_TOKEN",
):
monkeypatch.delenv(key, raising=False)
def test_auth_add_anthropic_oauth_persists_pool_entry(tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path / "hermes"))
monkeypatch.delenv("ANTHROPIC_API_KEY", raising=False)
monkeypatch.delenv("ANTHROPIC_TOKEN", raising=False)
monkeypatch.delenv("CLAUDE_CODE_OAUTH_TOKEN", raising=False)
_write_auth_store(tmp_path, {"version": 1, "providers": {}})
token = _jwt_with_email("claude@example.com")
monkeypatch.setattr(
"hermes_agent_anthropic.adapter.run_hermes_oauth_login_pure",
lambda: {
"access_token": token,
"refresh_token": "refresh-token",
"expires_at_ms": 1711234567000,
},
)
from hermes_cli.auth_commands import auth_add_command
class _Args:
provider = "anthropic"
auth_type = "oauth"
api_key = None
label = None
auth_add_command(_Args())
payload = json.loads((tmp_path / "hermes" / "auth.json").read_text())
entries = payload["credential_pool"]["anthropic"]
entry = next(item for item in entries if item["source"] == "manual:hermes_pkce")
assert entry["label"] == "claude@example.com"
assert entry["source"] == "manual:hermes_pkce"
assert entry["refresh_token"] == "refresh-token"
assert entry["expires_at_ms"] == 1711234567000
def test_seed_from_singletons_respects_hermes_pkce_suppression(tmp_path, monkeypatch):
"""anthropic hermes_pkce must not re-seed from ~/.hermes/.anthropic_oauth.json when suppressed."""
hermes_home = tmp_path / "hermes"
hermes_home.mkdir(parents=True, exist_ok=True)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
import yaml
(hermes_home / "config.yaml").write_text(yaml.dump({"model": {"provider": "anthropic", "model": "claude"}}))
(hermes_home / "auth.json").write_text(json.dumps({
"version": 1,
"providers": {},
"suppressed_sources": {"anthropic": ["hermes_pkce"]},
}))
# Stub the readers so only hermes_pkce is "available"; claude_code returns None
import hermes_agent_anthropic as aa
monkeypatch.setattr(aa, "read_hermes_oauth_credentials", lambda: {
"accessToken": "tok", "refreshToken": "r", "expiresAt": 9999999999000,
})
monkeypatch.setattr(aa, "read_claude_code_credentials", lambda: None)
from agent.credential_pool import _seed_from_singletons
entries = []
changed, active = _seed_from_singletons("anthropic", entries)
# hermes_pkce suppressed, claude_code returns None → nothing should be seeded
assert entries == []
assert "hermes_pkce" not in active
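The `_jwt_with_email` helper above builds an unsigned, JWT-shaped token with the base64 padding stripped. Reading the claims back requires restoring that padding first. The sketch below shows the round trip; `make_fake_jwt` and `read_claims` are hypothetical names, and how Hermes itself derives the pool entry label from the token is not shown in this diff.

```python
import base64
import json


def make_fake_jwt(claims):
    """Build an unsigned JWT-shaped string with unpadded base64url segments."""
    header = base64.urlsafe_b64encode(b'{"alg":"RS256","typ":"JWT"}').rstrip(b"=").decode()
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).rstrip(b"=").decode()
    return f"{header}.{payload}.signature"


def read_claims(token):
    """Decode the payload segment without verifying the signature."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


claims = read_claims(make_fake_jwt({"email": "claude@example.com"}))
```

No signature is checked here, which is acceptable for test fixtures but not for tokens received from an untrusted source.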
@@ -1,535 +0,0 @@
"""Tests for Anthropic-specific auxiliary client behaviour.
Covers:
- OAuth vs API-key flag propagation (_try_anthropic -> AnthropicAuxiliaryClient)
- explicit_api_key propagation through resolve_provider_client -> _try_anthropic
- Expired Codex token fallback to Anthropic
- Vision client fallback with Anthropic
- Auth refresh retry for Anthropic clients
"""
import json
from unittest.mock import MagicMock, AsyncMock, patch
import pytest
from agent.auxiliary_client import (
resolve_provider_client,
_read_codex_access_token,
_resolve_auto,
get_available_vision_backends,
call_llm,
async_call_llm,
)
from hermes_agent_anthropic.resolve import resolve_auxiliary_client as _try_anthropic
from agent.anthropic_aux import AnthropicAuxiliaryClient
class TestAnthropicOAuthFlag:
"""Test that OAuth tokens get is_oauth=True in auxiliary Anthropic client."""
def test_oauth_token_sets_flag(self, monkeypatch):
"""OAuth tokens (sk-ant-oat01-*) should create client with is_oauth=True."""
monkeypatch.setenv("ANTHROPIC_TOKEN", "sk-ant-oat01-test-token")
with patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build:
mock_build.return_value = MagicMock()
from hermes_agent_anthropic.resolve import resolve_auxiliary_client as _try_anthropic
from agent.anthropic_aux import AnthropicAuxiliaryClient
client, model = _try_anthropic()
assert client is not None
assert isinstance(client, AnthropicAuxiliaryClient)
# The adapter inside should have is_oauth=True
adapter = client.chat.completions
assert adapter._is_oauth is True
def test_api_key_no_oauth_flag(self, monkeypatch):
"""Regular API keys (sk-ant-api-*) should create client with is_oauth=False."""
with patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="sk-ant-api03-testkey1234"), \
patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build, \
patch("hermes_agent_anthropic.resolve._select_pool_entry", return_value=(False, None)):
mock_build.return_value = MagicMock()
from hermes_agent_anthropic.resolve import resolve_auxiliary_client as _try_anthropic
from agent.anthropic_aux import AnthropicAuxiliaryClient
client, model = _try_anthropic()
assert client is not None
assert isinstance(client, AnthropicAuxiliaryClient)
adapter = client.chat.completions
assert adapter._is_oauth is False
def test_pool_entry_takes_priority_over_legacy_resolution(self):
class _Entry:
access_token = "sk-ant-oat01-pooled"
base_url = "https://api.anthropic.com"
class _Pool:
def has_credentials(self):
return True
def select(self):
return _Entry()
with (
patch("agent.credential_pool.load_pool", return_value=_Pool()),
patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", side_effect=AssertionError("legacy path should not run")),
patch("hermes_agent_anthropic.adapter.build_anthropic_client", return_value=MagicMock()) as mock_build,
):
from hermes_agent_anthropic.resolve import resolve_auxiliary_client as _try_anthropic
client, model = _try_anthropic()
assert client is not None
assert model == "claude-haiku-4-5-20251001"
assert mock_build.call_args.args[0] == "sk-ant-oat01-pooled"
class TestAnthropicExplicitApiKey:
"""Test that explicit_api_key is correctly propagated to _try_anthropic().
Parity with the OpenRouter fix in #18768: resolve_provider_client() passes
explicit_api_key to _try_openrouter(), but the anthropic branch was not
updated: _try_anthropic() always fell back to resolve_anthropic_token()
even when an explicit key was supplied (e.g. from a fallback_model entry).
"""
def test_try_anthropic_uses_explicit_api_key_over_env(self):
"""_try_anthropic(explicit_api_key) must use the supplied key, not the env fallback."""
with patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="env-fallback-key"), \
patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build, \
patch("hermes_agent_anthropic.resolve._select_pool_entry", return_value=(False, None)):
mock_build.return_value = MagicMock()
from hermes_agent_anthropic.resolve import resolve_auxiliary_client as _try_anthropic
client, model = _try_anthropic(explicit_api_key="explicit-pool-key")
assert client is not None
assert mock_build.call_args.args[0] == "explicit-pool-key", (
f"Expected explicit_api_key to be passed, got: {mock_build.call_args.args[0]}"
)
assert mock_build.call_args.args[0] != "env-fallback-key"
def test_try_anthropic_without_explicit_key_falls_back_to_resolve(self):
"""Without explicit_api_key, _try_anthropic falls back to resolve_anthropic_token."""
with patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="env-fallback-key"), \
patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build, \
patch("hermes_agent_anthropic.resolve._select_pool_entry", return_value=(False, None)):
mock_build.return_value = MagicMock()
from hermes_agent_anthropic.resolve import resolve_auxiliary_client as _try_anthropic
client, model = _try_anthropic()
assert client is not None
assert mock_build.call_args.args[0] == "env-fallback-key"
def test_resolve_provider_client_passes_explicit_api_key_to_anthropic(self):
"""resolve_provider_client(provider='anthropic', explicit_api_key=...) must propagate the key."""
with patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="env-key"), \
patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build, \
patch("hermes_agent_anthropic.resolve._select_pool_entry", return_value=(False, None)):
mock_build.return_value = MagicMock()
client, model = resolve_provider_client(
provider="anthropic",
explicit_api_key="explicit-fallback-key",
)
assert client is not None
assert mock_build.call_args.args[0] == "explicit-fallback-key", (
"resolve_provider_client must forward explicit_api_key to _try_anthropic()"
)
class TestExpiredCodexFallback:
"""Test that expired Codex tokens don't block the auto chain."""
def test_expired_codex_falls_through_to_next(self, tmp_path, monkeypatch):
"""When Codex token is expired, auto chain should skip it and try next provider."""
import base64
import time as _time
# Expired Codex JWT
header = base64.urlsafe_b64encode(b'{"alg":"RS256","typ":"JWT"}').rstrip(b"=").decode()
payload_data = json.dumps({"exp": int(_time.time()) - 3600}).encode()
payload = base64.urlsafe_b64encode(payload_data).rstrip(b"=").decode()
expired_jwt = f"{header}.{payload}.fakesig"
hermes_home = tmp_path / "hermes"
hermes_home.mkdir(parents=True, exist_ok=True)
(hermes_home / "auth.json").write_text(json.dumps({
"version": 1,
"providers": {
"openai-codex": {
"tokens": {"access_token": expired_jwt, "refresh_token": "***"},
},
},
}))
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
# Set up Anthropic as fallback
monkeypatch.setenv("ANTHROPIC_TOKEN", "sk-ant...back")
with patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build:
mock_build.return_value = MagicMock()
client, model = _resolve_auto()
# Should NOT be Codex, should be Anthropic (or another available provider)
assert client is not None, "Should find a provider after expired Codex"
def test_expired_codex_openrouter_wins(self, tmp_path, monkeypatch):
"""With expired Codex + OpenRouter key, OpenRouter should win (1st in chain)."""
import base64
import time as _time
# Belt-and-suspenders: _try_openrouter marks openrouter unhealthy
# when OPENROUTER_API_KEY is absent (which the preceding test in
# this class exercises). The file-level _clean_env autouse fixture
# clears the cache, but fixture ordering with the conftest
# _hermetic_environment autouse can leave a narrow window where
# the mark reappears. Explicitly clear here so this test is
# independent of run order.
import agent.auxiliary_client as _aux_mod
_aux_mod._aux_unhealthy_until.clear()
_aux_mod._aux_unhealthy_logged_at.clear()
header = base64.urlsafe_b64encode(b'{"alg":"RS256","typ":"JWT"}').rstrip(b"=").decode()
payload_data = json.dumps({"exp": int(_time.time()) - 3600}).encode()
payload = base64.urlsafe_b64encode(payload_data).rstrip(b"=").decode()
expired_jwt = f"{header}.{payload}.fakesig"
hermes_home = tmp_path / "hermes"
hermes_home.mkdir(parents=True, exist_ok=True)
(hermes_home / "auth.json").write_text(json.dumps({
"version": 1,
"providers": {
"openai-codex": {
"tokens": {"access_token": expired_jwt, "refresh_token": "***"},
},
},
}))
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
monkeypatch.setenv("OPENROUTER_API_KEY", "or-test-key")
with patch("agent.auxiliary_client.OpenAI") as mock_openai:
mock_openai.return_value = MagicMock()
client, model = _resolve_auto()
assert client is not None
# OpenRouter is 1st in chain, should win
mock_openai.assert_called()
def test_expired_codex_custom_endpoint_wins(self, tmp_path, monkeypatch):
"""With expired Codex + custom endpoint (Ollama), custom should win (3rd in chain)."""
import base64
import time as _time
header = base64.urlsafe_b64encode(b'{"alg":"RS256","typ":"JWT"}').rstrip(b"=").decode()
payload_data = json.dumps({"exp": int(_time.time()) - 3600}).encode()
payload = base64.urlsafe_b64encode(payload_data).rstrip(b"=").decode()
expired_jwt = f"{header}.{payload}.fakesig"
hermes_home = tmp_path / "hermes"
hermes_home.mkdir(parents=True, exist_ok=True)
(hermes_home / "auth.json").write_text(json.dumps({
"version": 1,
"providers": {
"openai-codex": {
"tokens": {"access_token": expired_jwt, "refresh_token": "***"},
},
},
}))
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
# Simulate Ollama or custom endpoint
with patch("agent.auxiliary_client._resolve_custom_runtime",
return_value=("http://localhost:11434/v1", "sk-dummy")):
with patch("agent.auxiliary_client.OpenAI") as mock_openai:
mock_openai.return_value = MagicMock()
client, model = _resolve_auto()
assert client is not None
def test_hermes_oauth_file_sets_oauth_flag(self, monkeypatch):
"""OAuth-style tokens should get is_oauth=True (token is not sk-ant-api-*)."""
# Mock resolve_anthropic_token to return an OAuth-style token
with patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJ0ZXN0In0.sig"), \
patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build, \
patch("hermes_agent_anthropic.resolve._select_pool_entry", return_value=(False, None)):
mock_build.return_value = MagicMock()
client, model = _try_anthropic()
assert client is not None, "Should resolve token"
adapter = client.chat.completions
assert adapter._is_oauth is True, "Non-sk-ant-api token should set is_oauth=True"
def test_jwt_missing_exp_passes_through(self, tmp_path, monkeypatch):
"""JWT with valid JSON but no exp claim should pass through."""
import base64
header = base64.urlsafe_b64encode(b'{"alg":"RS256","typ":"JWT"}').rstrip(b"=").decode()
payload_data = json.dumps({"sub": "user123"}).encode() # no exp
payload = base64.urlsafe_b64encode(payload_data).rstrip(b"=").decode()
no_exp_jwt = f"{header}.{payload}.fakesig"
hermes_home = tmp_path / "hermes"
hermes_home.mkdir(parents=True, exist_ok=True)
(hermes_home / "auth.json").write_text(json.dumps({
"version": 1,
"providers": {
"openai-codex": {
"tokens": {"access_token": no_exp_jwt, "refresh_token": "***"},
},
},
}))
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
result = _read_codex_access_token()
assert result == no_exp_jwt, "JWT without exp should pass through"
def test_jwt_invalid_json_payload_passes_through(self, tmp_path, monkeypatch):
"""JWT with valid base64 but invalid JSON payload should pass through."""
import base64
header = base64.urlsafe_b64encode(b'{"alg":"RS256"}').rstrip(b"=").decode()
payload = base64.urlsafe_b64encode(b"not-json-content").rstrip(b"=").decode()
bad_jwt = f"{header}.{payload}.fakesig"
hermes_home = tmp_path / "hermes"
hermes_home.mkdir(parents=True, exist_ok=True)
(hermes_home / "auth.json").write_text(json.dumps({
"version": 1,
"providers": {
"openai-codex": {
"tokens": {"access_token": bad_jwt, "refresh_token": "***"},
},
},
}))
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
result = _read_codex_access_token()
assert result == bad_jwt, "JWT with invalid JSON payload should pass through"
def test_claude_code_oauth_env_sets_flag(self, monkeypatch):
"""CLAUDE_CODE_OAUTH_TOKEN env var should get is_oauth=True."""
monkeypatch.setenv("CLAUDE_CODE_OAUTH_TOKEN", "eyJhbG...test.sig") # JWT → is_oauth=True
monkeypatch.delenv("ANTHROPIC_TOKEN", raising=False)
with patch("hermes_agent_anthropic.adapter.build_anthropic_client") as mock_build:
mock_build.return_value = MagicMock()
client, model = _try_anthropic()
assert client is not None
adapter = client.chat.completions
assert adapter._is_oauth is True
class TestVisionClientFallback:
"""Vision client auto mode resolves known-good multimodal backends."""
def test_vision_auto_includes_active_provider_when_configured(self, monkeypatch):
"""Active provider appears in available backends when credentials exist."""
monkeypatch.setenv("ANTHROPIC_API_KEY", "***")
with (
patch("agent.auxiliary_client._read_nous_auth", return_value=None),
patch("agent.auxiliary_client._read_main_provider", return_value="anthropic"),
patch("agent.auxiliary_client._read_main_model", return_value="claude-sonnet-4"),
patch("hermes_agent_anthropic.adapter.build_anthropic_client", return_value=MagicMock()),
patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJ0ZXN0In0.sig"),
):
backends = get_available_vision_backends()
assert "anthropic" in backends
def test_resolve_provider_client_returns_native_anthropic_wrapper(self, monkeypatch):
monkeypatch.setenv("ANTHROPIC_API_KEY", "***")
with (
patch("agent.auxiliary_client._read_nous_auth", return_value=None),
patch("hermes_agent_anthropic.adapter.build_anthropic_client", return_value=MagicMock()),
patch("hermes_agent_anthropic.adapter.resolve_anthropic_token", return_value="eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJ0ZXN0In0.sig"),
):
client, model = resolve_provider_client("anthropic")
assert client is not None
assert client.__class__.__name__ == "AnthropicAuxiliaryClient"
assert model == "claude-haiku-4-5-20251001"
class _AuxAuth401(Exception):
status_code = 401
def __init__(self, message="Provided authentication token is expired"):
super().__init__(message)
class _DummyResponse:
def __init__(self, text="ok"):
self.choices = [MagicMock(message=MagicMock(content=text))]
class _FailingThenSuccessCompletions:
def __init__(self):
self.calls = 0
def create(self, **kwargs):
self.calls += 1
if self.calls == 1:
raise _AuxAuth401()
return _DummyResponse("sync-ok")
class _AsyncFailingThenSuccessCompletions:
def __init__(self):
self.calls = 0
async def create(self, **kwargs):
self.calls += 1
if self.calls == 1:
raise _AuxAuth401()
return _DummyResponse("async-ok")
class TestAuxiliaryAuthRefreshRetry:
def test_call_llm_refreshes_codex_on_401_for_vision(self):
failing_client = MagicMock()
failing_client.base_url = "https://chatgpt.com/backend-api/codex"
failing_client.chat.completions = _FailingThenSuccessCompletions()
fresh_client = MagicMock()
fresh_client.base_url = "https://chatgpt.com/backend-api/codex"
fresh_client.chat.completions.create.return_value = _DummyResponse("fresh-sync")
with (
patch(
"agent.auxiliary_client.resolve_vision_provider_client",
side_effect=[("openai-codex", failing_client, "gpt-5.4"), ("openai-codex", fresh_client, "gpt-5.4")],
),
patch("agent.auxiliary_client._refresh_provider_credentials", return_value=True) as mock_refresh,
):
resp = call_llm(
task="vision",
provider="openai-codex",
model="gpt-5.4",
messages=[{"role": "user", "content": "hi"}],
)
assert resp.choices[0].message.content == "fresh-sync"
mock_refresh.assert_called_once_with("openai-codex")
def test_call_llm_refreshes_codex_on_401_for_non_vision(self):
stale_client = MagicMock()
stale_client.base_url = "https://chatgpt.com/backend-api/codex"
stale_client.chat.completions.create.side_effect = _AuxAuth401("stale codex token")
fresh_client = MagicMock()
fresh_client.base_url = "https://chatgpt.com/backend-api/codex"
fresh_client.chat.completions.create.return_value = _DummyResponse("fresh-non-vision")
with (
patch("agent.auxiliary_client._resolve_task_provider_model", return_value=("openai-codex", "gpt-5.4", None, None, None)),
patch("agent.auxiliary_client._get_cached_client", side_effect=[(stale_client, "gpt-5.4"), (fresh_client, "gpt-5.4")]),
patch("agent.auxiliary_client._refresh_provider_credentials", return_value=True) as mock_refresh,
):
resp = call_llm(
task="compression",
provider="openai-codex",
model="gpt-5.4",
messages=[{"role": "user", "content": "hi"}],
)
assert resp.choices[0].message.content == "fresh-non-vision"
mock_refresh.assert_called_once_with("openai-codex")
assert stale_client.chat.completions.create.call_count == 1
assert fresh_client.chat.completions.create.call_count == 1
def test_call_llm_refreshes_anthropic_on_401_for_non_vision(self):
stale_client = MagicMock()
stale_client.base_url = "https://api.anthropic.com"
stale_client.chat.completions.create.side_effect = _AuxAuth401("anthropic token expired")
fresh_client = MagicMock()
fresh_client.base_url = "https://api.anthropic.com"
fresh_client.chat.completions.create.return_value = _DummyResponse("fresh-anthropic")
with (
patch("agent.auxiliary_client._resolve_task_provider_model", return_value=("anthropic", "claude-haiku-4-5-20251001", None, None, None)),
patch("agent.auxiliary_client._get_cached_client", side_effect=[(stale_client, "claude-haiku-4-5-20251001"), (fresh_client, "claude-haiku-4-5-20251001")]),
patch("agent.auxiliary_client._refresh_provider_credentials", return_value=True) as mock_refresh,
):
resp = call_llm(
task="compression",
provider="anthropic",
model="claude-haiku-4-5-20251001",
messages=[{"role": "user", "content": "hi"}],
)
assert resp.choices[0].message.content == "fresh-anthropic"
mock_refresh.assert_called_once_with("anthropic")
assert stale_client.chat.completions.create.call_count == 1
assert fresh_client.chat.completions.create.call_count == 1
@pytest.mark.asyncio
async def test_async_call_llm_refreshes_codex_on_401_for_vision(self):
failing_client = MagicMock()
failing_client.base_url = "https://chatgpt.com/backend-api/codex"
failing_client.chat.completions = _AsyncFailingThenSuccessCompletions()
fresh_client = MagicMock()
fresh_client.base_url = "https://chatgpt.com/backend-api/codex"
fresh_client.chat.completions.create = AsyncMock(return_value=_DummyResponse("fresh-async"))
with (
patch(
"agent.auxiliary_client.resolve_vision_provider_client",
side_effect=[("openai-codex", failing_client, "gpt-5.4"), ("openai-codex", fresh_client, "gpt-5.4")],
),
patch("agent.auxiliary_client._refresh_provider_credentials", return_value=True) as mock_refresh,
):
resp = await async_call_llm(
task="vision",
provider="openai-codex",
model="gpt-5.4",
messages=[{"role": "user", "content": "hi"}],
)
assert resp.choices[0].message.content == "fresh-async"
mock_refresh.assert_called_once_with("openai-codex")
def test_refresh_provider_credentials_force_refreshes_anthropic_oauth_and_evicts_cache(self, monkeypatch):
stale_client = MagicMock()
cache_key = ("anthropic", False, None, None, None)
monkeypatch.setenv("ANTHROPIC_TOKEN", "")
monkeypatch.setenv("CLAUDE_CODE_OAUTH_TOKEN", "")
monkeypatch.setenv("ANTHROPIC_API_KEY", "")
with (
patch("agent.auxiliary_client._client_cache", {cache_key: (stale_client, "claude-haiku-4-5-20251001", None)}),
patch("hermes_agent_anthropic.adapter.read_claude_code_credentials", return_value={
"accessToken": "expired-token",
"refreshToken": "refresh-token",
"expiresAt": 0,
}),
patch("hermes_agent_anthropic.adapter.refresh_anthropic_oauth_pure", return_value={
"access_token": "fresh-token",
"refresh_token": "refresh-token-2",
"expires_at_ms": 9999999999999,
}) as mock_refresh_oauth,
patch("hermes_agent_anthropic.adapter._write_claude_code_credentials") as mock_write,
):
from agent.auxiliary_client import _refresh_provider_credentials
assert _refresh_provider_credentials("anthropic") is True
mock_refresh_oauth.assert_called_once_with("refresh-token", use_json=False)
mock_write.assert_called_once_with("fresh-token", "refresh-token-2", 9999999999999)
stale_client.close.assert_called_once()
@pytest.mark.asyncio
async def test_async_call_llm_refreshes_anthropic_on_401_for_non_vision(self):
stale_client = MagicMock()
stale_client.base_url = "https://api.anthropic.com"
stale_client.chat.completions.create = AsyncMock(side_effect=_AuxAuth401("anthropic token expired"))
fresh_client = MagicMock()
fresh_client.base_url = "https://api.anthropic.com"
fresh_client.chat.completions.create = AsyncMock(return_value=_DummyResponse("fresh-async-anthropic"))
with (
patch("agent.auxiliary_client._resolve_task_provider_model", return_value=("anthropic", "claude-haiku-4-5-20251001", None, None, None)),
patch("agent.auxiliary_client._get_cached_client", side_effect=[(stale_client, "claude-haiku-4-5-20251001"), (fresh_client, "claude-haiku-4-5-20251001")]),
patch("agent.auxiliary_client._refresh_provider_credentials", return_value=True) as mock_refresh,
):
resp = await async_call_llm(
task="compression",
provider="anthropic",
model="claude-haiku-4-5-20251001",
messages=[{"role": "user", "content": "hi"}],
)
assert resp.choices[0].message.content == "fresh-async-anthropic"
mock_refresh.assert_called_once_with("anthropic")
assert stale_client.chat.completions.create.await_count == 1
assert fresh_client.chat.completions.create.await_count == 1
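The refresh tests above all follow one pattern: the first call raises an error carrying `status_code = 401`, credentials are refreshed once, a new client is fetched, and the request is retried once. A minimal sketch of that control flow, assuming hypothetical names (`call_with_auth_refresh`, `get_client`, `refresh_credentials`) rather than the actual `call_llm` internals:

```python
from typing import Any, Callable


class AuthExpired(Exception):
    """Stand-in for a provider error that carries an HTTP 401 status."""
    status_code = 401


def call_with_auth_refresh(
    get_client: Callable[[], Any],
    refresh_credentials: Callable[[], bool],
    **request: Any,
) -> Any:
    """Call `client.create(**request)`, retrying once after a 401.

    `get_client` is invoked again after a successful refresh so that a
    stale cached client is not reused.
    """
    client = get_client()
    try:
        return client.create(**request)
    except Exception as exc:
        # Only authentication failures are retried; everything else propagates.
        if getattr(exc, "status_code", None) != 401:
            raise
        if not refresh_credentials():
            raise
        return get_client().create(**request)
```

Retrying exactly once keeps a permanently invalid credential from looping; a second 401 from the fresh client simply propagates to the caller.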
@@ -1,129 +0,0 @@
"""Anthropic-specific computer use tests moved from tests/tools/test_computer_use.py."""
from __future__ import annotations
from typing import Any, Dict, List
# ---------------------------------------------------------------------------
# Anthropic adapter: multimodal tool-result conversion
# ---------------------------------------------------------------------------
class TestAnthropicAdapterMultimodal:
def test_multimodal_envelope_becomes_tool_result_with_image_block(self):
from agent.anthropic_format import convert_messages_to_anthropic
fake_png = "iVBORw0KGgo="
messages = [
{"role": "user", "content": "take a screenshot"},
{
"role": "assistant",
"content": "",
"tool_calls": [{
"id": "call_1",
"type": "function",
"function": {"name": "computer_use", "arguments": "{}"},
}],
},
{
"role": "tool",
"tool_call_id": "call_1",
"content": {
"_multimodal": True,
"content": [
{"type": "text", "text": "1 element"},
{"type": "image_url",
"image_url": {"url": f"data:image/png;base64,{fake_png}"}},
],
"text_summary": "1 element",
},
},
]
_, anthropic_msgs = convert_messages_to_anthropic(messages)
tool_result_msgs = [m for m in anthropic_msgs if m["role"] == "user"
and isinstance(m["content"], list)
and any(b.get("type") == "tool_result" for b in m["content"])]
assert tool_result_msgs, "expected a tool_result user message"
tr = next(b for b in tool_result_msgs[-1]["content"] if b.get("type") == "tool_result")
inner = tr["content"]
assert any(b.get("type") == "image" for b in inner)
assert any(b.get("type") == "text" for b in inner)
def test_old_screenshots_are_evicted_beyond_max_keep(self):
"""Image blocks in old tool_results get replaced with placeholders."""
from agent.anthropic_format import convert_messages_to_anthropic
fake_png = "iVBORw0KGgo="
def _mm_tool(call_id: str) -> Dict[str, Any]:
return {
"role": "tool",
"tool_call_id": call_id,
"content": {
"_multimodal": True,
"content": [
{"type": "text", "text": "cap"},
{"type": "image_url",
"image_url": {"url": f"data:image/png;base64,{fake_png}"}},
],
"text_summary": "cap",
},
}
# Build 5 screenshots interleaved with assistant messages.
messages: List[Dict[str, Any]] = [{"role": "user", "content": "start"}]
for i in range(5):
messages.append({
"role": "assistant", "content": "",
"tool_calls": [{
"id": f"call_{i}",
"type": "function",
"function": {"name": "computer_use", "arguments": "{}"},
}],
})
messages.append(_mm_tool(f"call_{i}"))
messages.append({"role": "assistant", "content": "done"})
_, anthropic_msgs = convert_messages_to_anthropic(messages)
# Walk tool_result blocks in order; the OLDEST (5 - 3) = 2 should be
# text-only placeholders, newest 3 should still carry image blocks.
tool_results = []
for m in anthropic_msgs:
if m["role"] != "user" or not isinstance(m["content"], list):
continue
for b in m["content"]:
if b.get("type") == "tool_result":
tool_results.append(b)
assert len(tool_results) == 5
with_images = [
b for b in tool_results
if isinstance(b.get("content"), list)
and any(x.get("type") == "image" for x in b["content"])
]
placeholders = [
b for b in tool_results
if isinstance(b.get("content"), list)
and any(
x.get("type") == "text"
and "screenshot removed" in x.get("text", "")
for x in b["content"]
)
]
assert len(with_images) == 3
assert len(placeholders) == 2
def test_content_parts_helper_filters_to_text_and_image(self):
from agent.anthropic_format import _content_parts_to_anthropic_blocks
fake_png = "iVBORw0KGgo="
blocks = _content_parts_to_anthropic_blocks([
{"type": "text", "text": "hi"},
{"type": "image_url", "image_url": {"url": f"data:image/png;base64,{fake_png}"}},
{"type": "unsupported", "data": "ignored"},
])
types = [b["type"] for b in blocks]
assert "text" in types
assert "image" in types
assert len(blocks) == 2
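The eviction test above expects the newest three screenshots to keep their image blocks while older ones become text placeholders containing "screenshot removed". A minimal sketch of that policy, assuming a hypothetical helper `evict_old_images` that operates on already-converted `tool_result` blocks (the real logic lives inside `convert_messages_to_anthropic` and may differ):

```python
from typing import Any, Dict, List

PLACEHOLDER = {"type": "text", "text": "[screenshot removed]"}


def evict_old_images(
    tool_results: List[Dict[str, Any]], max_keep: int = 3
) -> List[Dict[str, Any]]:
    """Keep image blocks only in the newest `max_keep` image-bearing results."""
    # Positions of results that carry at least one image, oldest first.
    with_images = [
        i for i, tr in enumerate(tool_results)
        if any(b.get("type") == "image" for b in tr["content"])
    ]
    # Note: a plain [:-0] slice would be empty, so max_keep == 0 is special-cased.
    evict = set(with_images[:-max_keep]) if max_keep > 0 else set(with_images)

    out = []
    for i, tr in enumerate(tool_results):
        if i not in evict:
            out.append(tr)
            continue
        content = [
            dict(PLACEHOLDER) if b.get("type") == "image" else b
            for b in tr["content"]
        ]
        out.append({**tr, "content": content})
    return out
```

The input list is not mutated; evicted results are shallow copies with the image blocks swapped out.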
@@ -1,47 +0,0 @@
"""Anthropic-specific ctx halving tests moved from tests/test_ctx_halving_fix.py."""
# ---------------------------------------------------------------------------
# build_anthropic_kwargs — output cap clamping
# ---------------------------------------------------------------------------
class TestBuildAnthropicKwargsClamping:
"""The context_length clamp only fires when output ceiling > window.
For standard Anthropic models (output ceiling < window) it must not fire.
"""
def _build(self, model, max_tokens=None, context_length=None):
from agent.anthropic_format import build_anthropic_kwargs
return build_anthropic_kwargs(
model=model,
messages=[{"role": "user", "content": "hi"}],
tools=None,
max_tokens=max_tokens,
reasoning_config=None,
context_length=context_length,
)
def test_no_clamping_when_output_ceiling_fits_in_window(self):
"""Opus 4.6 native output (128K) < context window (200K) — no clamping."""
kwargs = self._build("claude-opus-4-6", context_length=200_000)
assert kwargs["max_tokens"] == 128_000
def test_clamping_fires_for_tiny_custom_window(self):
"""When context_length is 8K (local model), output cap is clamped to 7999."""
kwargs = self._build("claude-opus-4-6", context_length=8_000)
assert kwargs["max_tokens"] == 7_999
def test_explicit_max_tokens_respected_when_within_window(self):
"""Explicit max_tokens smaller than window passes through unchanged."""
kwargs = self._build("claude-opus-4-6", max_tokens=4096, context_length=200_000)
assert kwargs["max_tokens"] == 4096
def test_explicit_max_tokens_clamped_when_exceeds_window(self):
"""Explicit max_tokens larger than a small window is clamped."""
kwargs = self._build("claude-opus-4-6", max_tokens=32_768, context_length=16_000)
assert kwargs["max_tokens"] == 15_999
def test_no_context_length_uses_native_ceiling(self):
"""Without context_length the native output ceiling is used directly."""
kwargs = self._build("claude-sonnet-4-6")
assert kwargs["max_tokens"] == 64_000
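The clamping tests above can be summarised as one rule: use the explicit `max_tokens` if given, otherwise the model's native output ceiling, and only when that value exceeds the context window reduce it to `context_length - 1`. A minimal sketch under that reading, with a hypothetical function name and the ceilings (128K, 64K) taken from the test docstrings; behaviour when the value exactly equals the window is an assumption:

```python
from typing import Optional


def clamp_max_tokens(
    native_ceiling: int,
    requested: Optional[int] = None,
    context_length: Optional[int] = None,
) -> int:
    """Pick an output cap that never exceeds the context window."""
    effective = requested if requested is not None else native_ceiling
    if context_length is not None and effective > context_length:
        # Leave at least one token of room for input.
        return max(context_length - 1, 1)
    return effective
```

For example, `clamp_max_tokens(128_000, context_length=8_000)` yields 7999, matching the tiny-window case, while a 200K window leaves the 128K ceiling untouched.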
@@ -1,231 +0,0 @@
"""Anthropic-specific fast mode tests moved from tests/cli/test_fast_command.py."""
import unittest
from types import SimpleNamespace
def _import_cli():
import hermes_cli.config as config_mod
if not hasattr(config_mod, "save_env_value_secure"):
config_mod.save_env_value_secure = lambda key, value: {
"success": True,
"stored_as": key,
"validated": False,
}
import cli as cli_mod
return cli_mod
class TestAnthropicFastMode(unittest.TestCase):
"""Verify Anthropic Fast Mode model support and override resolution."""
def test_anthropic_opus_supported(self):
from hermes_cli.models import model_supports_fast_mode
# Native Anthropic format (hyphens)
assert model_supports_fast_mode("claude-opus-4-6") is True
# OpenRouter format (dots)
assert model_supports_fast_mode("claude-opus-4.6") is True
# With vendor prefix
assert model_supports_fast_mode("anthropic/claude-opus-4-6") is True
assert model_supports_fast_mode("anthropic/claude-opus-4.6") is True
def test_anthropic_non_opus46_models_excluded(self):
"""Anthropic restricts fast mode to Opus 4.6 — others must be excluded.
Per https://platform.claude.com/docs/en/build-with-claude/fast-mode,
sending speed=fast to Opus 4.7, Sonnet, or Haiku returns HTTP 400.
"""
from hermes_cli.models import model_supports_fast_mode
assert model_supports_fast_mode("claude-sonnet-4-6") is False
assert model_supports_fast_mode("claude-sonnet-4.6") is False
assert model_supports_fast_mode("claude-haiku-4-5") is False
assert model_supports_fast_mode("claude-opus-4-7") is False
assert model_supports_fast_mode("anthropic/claude-sonnet-4.6") is False
assert model_supports_fast_mode("anthropic/claude-opus-4-7") is False
def test_non_claude_models_not_anthropic_fast(self):
"""Non-Claude models should not be treated as Anthropic fast-mode."""
from hermes_cli.models import _is_anthropic_fast_model
assert _is_anthropic_fast_model("gpt-5.4") is False
assert _is_anthropic_fast_model("gemini-3-pro") is False
assert _is_anthropic_fast_model("kimi-k2-thinking") is False
def test_anthropic_variant_tags_stripped(self):
from hermes_cli.models import model_supports_fast_mode
# OpenRouter variant tags after colon should be stripped
assert model_supports_fast_mode("claude-opus-4.6:fast") is True
assert model_supports_fast_mode("claude-opus-4.6:beta") is True
def test_resolve_overrides_returns_speed_for_anthropic(self):
from hermes_cli.models import resolve_fast_mode_overrides
result = resolve_fast_mode_overrides("claude-opus-4-6")
assert result == {"speed": "fast"}
result = resolve_fast_mode_overrides("anthropic/claude-opus-4.6")
assert result == {"speed": "fast"}
def test_resolve_overrides_returns_none_for_unsupported_claude(self):
"""Opus 4.7 and other Claude models don't support fast mode (API 400s).
Per Anthropic docs, fast mode is currently Opus 4.6 only.
"""
from hermes_cli.models import resolve_fast_mode_overrides
assert resolve_fast_mode_overrides("claude-opus-4-7") is None
assert resolve_fast_mode_overrides("claude-sonnet-4-6") is None
assert resolve_fast_mode_overrides("claude-haiku-4-5") is None
def test_resolve_overrides_returns_service_tier_for_openai(self):
"""OpenAI models should still get service_tier, not speed."""
from hermes_cli.models import resolve_fast_mode_overrides
result = resolve_fast_mode_overrides("gpt-5.4")
assert result == {"service_tier": "priority"}
def test_is_anthropic_fast_model(self):
"""Fast mode is currently Opus 4.6 only — other Claude variants must be excluded."""
from hermes_cli.models import _is_anthropic_fast_model
# Supported: Opus 4.6 in any form
assert _is_anthropic_fast_model("claude-opus-4-6") is True
assert _is_anthropic_fast_model("claude-opus-4.6") is True
assert _is_anthropic_fast_model("anthropic/claude-opus-4-6") is True
assert _is_anthropic_fast_model("claude-opus-4.6:fast") is True
# Unsupported per Anthropic API contract — would 400 if we sent speed=fast
assert _is_anthropic_fast_model("claude-opus-4-7") is False
assert _is_anthropic_fast_model("claude-sonnet-4-6") is False
assert _is_anthropic_fast_model("claude-haiku-4-5") is False
# Non-Claude
assert _is_anthropic_fast_model("gpt-5.4") is False
assert _is_anthropic_fast_model("") is False
def test_fast_command_exposed_for_anthropic_model(self):
cli_mod = _import_cli()
stub = SimpleNamespace(
provider="anthropic", requested_provider="anthropic",
model="claude-opus-4-6", agent=None,
)
assert cli_mod.HermesCLI._fast_command_available(stub) is True
def test_fast_command_hidden_for_anthropic_sonnet(self):
"""Sonnet doesn't support fast mode (Opus 4.6 only) — /fast must be hidden."""
cli_mod = _import_cli()
stub = SimpleNamespace(
provider="anthropic", requested_provider="anthropic",
model="claude-sonnet-4-6", agent=None,
)
assert cli_mod.HermesCLI._fast_command_available(stub) is False
def test_fast_command_hidden_for_anthropic_opus_47(self):
"""Opus 4.7 doesn't support fast mode — /fast must be hidden."""
cli_mod = _import_cli()
stub = SimpleNamespace(
provider="anthropic", requested_provider="anthropic",
model="claude-opus-4-7", agent=None,
)
assert cli_mod.HermesCLI._fast_command_available(stub) is False
def test_fast_command_hidden_for_non_claude_non_openai(self):
"""Non-Claude, non-OpenAI models should not expose /fast."""
cli_mod = _import_cli()
stub = SimpleNamespace(
provider="gemini", requested_provider="gemini",
model="gemini-3-pro-preview", agent=None,
)
assert cli_mod.HermesCLI._fast_command_available(stub) is False
def test_turn_route_injects_speed_for_anthropic(self):
"""Anthropic models should get speed:'fast' override, not service_tier."""
cli_mod = _import_cli()
stub = SimpleNamespace(
model="claude-opus-4-6",
api_key="sk-ant-test",
base_url="https://api.anthropic.com",
provider="anthropic",
api_mode="anthropic_messages",
acp_command=None,
acp_args=[],
_credential_pool=None,
service_tier="priority",
)
route = cli_mod.HermesCLI._resolve_turn_agent_config(stub, "hi")
assert route["runtime"]["provider"] == "anthropic"
assert route["request_overrides"] == {"speed": "fast"}
class TestAnthropicFastModeAdapter(unittest.TestCase):
"""Verify build_anthropic_kwargs handles fast_mode parameter."""
def test_fast_mode_adds_speed_and_beta(self):
from agent.anthropic_format import build_anthropic_kwargs, _FAST_MODE_BETA
kwargs = build_anthropic_kwargs(
model="claude-opus-4-6",
messages=[{"role": "user", "content": [{"type": "text", "text": "hi"}]}],
tools=None,
max_tokens=None,
reasoning_config=None,
fast_mode=True,
)
assert kwargs.get("extra_body", {}).get("speed") == "fast"
assert "speed" not in kwargs
assert "extra_headers" in kwargs
assert _FAST_MODE_BETA in kwargs["extra_headers"].get("anthropic-beta", "")
def test_fast_mode_off_no_speed(self):
from agent.anthropic_format import build_anthropic_kwargs
kwargs = build_anthropic_kwargs(
model="claude-opus-4-6",
messages=[{"role": "user", "content": [{"type": "text", "text": "hi"}]}],
tools=None,
max_tokens=None,
reasoning_config=None,
fast_mode=False,
)
assert kwargs.get("extra_body", {}).get("speed") is None
assert "speed" not in kwargs
assert "extra_headers" not in kwargs
def test_fast_mode_skipped_for_third_party_endpoint(self):
from agent.anthropic_format import build_anthropic_kwargs
kwargs = build_anthropic_kwargs(
model="claude-opus-4-6",
messages=[{"role": "user", "content": [{"type": "text", "text": "hi"}]}],
tools=None,
max_tokens=None,
reasoning_config=None,
fast_mode=True,
base_url="https://api.minimax.io/anthropic/v1",
)
# Third-party endpoints should NOT get speed or fast-mode beta
assert kwargs.get("extra_body", {}).get("speed") is None
assert "speed" not in kwargs
assert "extra_headers" not in kwargs
def test_fast_mode_kwargs_are_safe_for_sdk_unpacking(self):
from agent.anthropic_format import build_anthropic_kwargs
kwargs = build_anthropic_kwargs(
model="claude-opus-4-6",
messages=[{"role": "user", "content": [{"type": "text", "text": "hi"}]}],
tools=None,
max_tokens=None,
reasoning_config=None,
fast_mode=True,
)
assert "speed" not in kwargs
assert kwargs.get("extra_body", {}).get("speed") == "fast"
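The model-matching tests above accept Opus 4.6 with hyphens or dots, with or without a vendor prefix, and with an optional `:variant` tag, and reject every other model. A minimal sketch of one way to normalise such identifiers follows. The names `normalize_model_id`, `supports_anthropic_fast_mode`, and `fast_mode_overrides` are hypothetical, and exact-match comparison is an assumption; the project's `_is_anthropic_fast_model` may match differently.

```python
from typing import Dict, Optional

# Assumption: only this one normalised id is eligible, per the test docstrings.
_FAST_MODELS = {"claude-opus-4-6"}


def normalize_model_id(model: str) -> str:
    """Reduce 'anthropic/claude-opus-4.6:fast' to 'claude-opus-4-6'."""
    name = model.strip().lower()
    name = name.split("/", 1)[-1]  # drop a vendor prefix such as 'anthropic/'
    name = name.split(":", 1)[0]   # drop a variant tag such as ':fast'
    return name.replace(".", "-")  # dots (OpenRouter style) -> hyphens


def supports_anthropic_fast_mode(model: str) -> bool:
    return normalize_model_id(model) in _FAST_MODELS


def fast_mode_overrides(model: str) -> Optional[Dict[str, str]]:
    """Return the request override for eligible models, else None."""
    return {"speed": "fast"} if supports_anthropic_fast_mode(model) else None
```

Returning `None` for ineligible models lets the caller skip the override entirely, which avoids sending `speed=fast` to a model the tests say would answer with HTTP 400.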
@@ -1,22 +0,0 @@
"""Anthropic-specific timeout tests moved from tests/hermes_cli/test_timeouts.py."""
from __future__ import annotations
def test_anthropic_adapter_honors_timeout_kwarg():
"""build_anthropic_client(timeout=X) overrides the 900s default read timeout."""
pytest = __import__("pytest")
anthropic = pytest.importorskip("anthropic") # skip if optional SDK missing
from hermes_agent_anthropic import build_anthropic_client
c_default = build_anthropic_client("sk-ant-dummy", None)
c_custom = build_anthropic_client("sk-ant-dummy", None, timeout=45.0)
c_invalid = build_anthropic_client("sk-ant-dummy", None, timeout=-1)
# Default stays at 900s; custom overrides; invalid falls back to default
assert c_default.timeout.read == 900.0
assert c_custom.timeout.read == 45.0
assert c_invalid.timeout.read == 900.0
# Connect timeout always stays at 10s regardless
assert c_default.timeout.connect == 10.0
assert c_custom.timeout.connect == 10.0
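The timeout test above implies a simple validation rule: a positive `timeout` overrides the 900s read default, and a missing or invalid value falls back to it. A minimal sketch of that rule in isolation, using a hypothetical helper name (how `build_anthropic_client` actually validates its argument is not shown in this diff):

```python
from typing import Optional

DEFAULT_READ_TIMEOUT = 900.0  # seconds, per the test's docstring


def resolve_read_timeout(timeout: Optional[float]) -> float:
    """Return a usable read timeout, falling back to the default."""
    is_number = isinstance(timeout, (int, float)) and not isinstance(timeout, bool)
    if is_number and timeout > 0:
        return float(timeout)
    return DEFAULT_READ_TIMEOUT
```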
@@ -1,183 +0,0 @@
"""Tests for the AnthropicMessagesTransport.
Behavioral tests that require the real anthropic transport implementation.
"""
import json
import pytest
from types import SimpleNamespace
from agent.transports import get_transport
from agent.transports.types import NormalizedResponse
@pytest.fixture
def transport():
"""Load the real Anthropic transport by registering the plugin."""
from hermes_agent_anthropic import register as _anthro_register
from agent.plugin_registries import registries
class _Ctx:
def register_transport(self, api_mode, obj):
from agent.transports import register_transport
register_transport(api_mode, obj)
def register_provider_resolver(self, name, fn):
registries.register_provider_resolver(name, fn)
def register_provider_services(self, name, services):
registries.register_provider_services(name, services)
def register_credential_pool_hook(self, name, hook):
registries.register_credential_pool_hook(name, hook)
def register_pricing_provider(self, name, entries):
registries.register_pricing_provider(name, entries)
def register_provider_overlay(self, entry):
registries.register_provider_overlay(entry)
def __getattr__(self, name):
if name.startswith("register_"):
return lambda *a, **kw: None
raise AttributeError(name)
_anthro_register(_Ctx())
return get_transport("anthropic_messages")
class TestAnthropicTransportBehavioral:
# (fixture defined at module level above)
def test_api_mode(self, transport):
assert transport.api_mode == "anthropic_messages"
def test_convert_tools_simple(self, transport):
tools = [{
"type": "function",
"function": {
"name": "test_tool",
"description": "A test",
"parameters": {"type": "object", "properties": {}},
}
}]
result = transport.convert_tools(tools)
assert len(result) == 1
assert result[0]["name"] == "test_tool"
assert "input_schema" in result[0]
def test_validate_response_none(self, transport):
assert transport.validate_response(None) is False
def test_validate_response_empty_content(self, transport):
r = SimpleNamespace(content=[])
assert transport.validate_response(r) is False
def test_validate_response_empty_content_with_end_turn_is_valid(self, transport):
r = SimpleNamespace(content=[], stop_reason="end_turn")
assert transport.validate_response(r) is True
def test_validate_response_empty_content_with_tool_use_is_invalid(self, transport):
r = SimpleNamespace(content=[], stop_reason="tool_use")
assert transport.validate_response(r) is False
def test_validate_response_valid(self, transport):
r = SimpleNamespace(content=[SimpleNamespace(type="text", text="hello")])
assert transport.validate_response(r) is True
def test_map_finish_reason(self, transport):
assert transport.map_finish_reason("end_turn") == "stop"
assert transport.map_finish_reason("tool_use") == "tool_calls"
assert transport.map_finish_reason("max_tokens") == "length"
        assert transport.map_finish_reason("stop_sequence") == "stop"
        assert transport.map_finish_reason("refusal") == "content_filter"
        assert transport.map_finish_reason("model_context_window_exceeded") == "length"
        assert transport.map_finish_reason("unknown") == "stop"
    def test_extract_cache_stats_none_usage(self, transport):
        r = SimpleNamespace(usage=None)
        assert transport.extract_cache_stats(r) is None
    def test_extract_cache_stats_with_cache(self, transport):
        usage = SimpleNamespace(cache_read_input_tokens=100, cache_creation_input_tokens=50)
        r = SimpleNamespace(usage=usage)
        result = transport.extract_cache_stats(r)
        assert result == {"cached_tokens": 100, "creation_tokens": 50}
    def test_extract_cache_stats_zero(self, transport):
        usage = SimpleNamespace(cache_read_input_tokens=0, cache_creation_input_tokens=0)
        r = SimpleNamespace(usage=usage)
        assert transport.extract_cache_stats(r) is None
    def test_normalize_response_text(self, transport):
        """Test normalization of a simple text response."""
        r = SimpleNamespace(
            content=[SimpleNamespace(type="text", text="Hello world")],
            stop_reason="end_turn",
            usage=SimpleNamespace(input_tokens=10, output_tokens=5),
            model="claude-sonnet-4-6",
        )
        nr = transport.normalize_response(r)
        assert isinstance(nr, NormalizedResponse)
        assert nr.content == "Hello world"
        assert nr.tool_calls is None or nr.tool_calls == []
        assert nr.finish_reason == "stop"
    def test_normalize_response_tool_calls(self, transport):
        """Test normalization of a tool-use response."""
        r = SimpleNamespace(
            content=[
                SimpleNamespace(
                    type="tool_use",
                    id="toolu_123",
                    name="terminal",
                    input={"command": "ls"},
                ),
            ],
            stop_reason="tool_use",
            usage=SimpleNamespace(input_tokens=10, output_tokens=20),
            model="claude-sonnet-4-6",
        )
        nr = transport.normalize_response(r)
        assert nr.finish_reason == "tool_calls"
        assert len(nr.tool_calls) == 1
        tc = nr.tool_calls[0]
        assert tc.name == "terminal"
        assert tc.id == "toolu_123"
        assert '"command"' in tc.arguments
    def test_normalize_response_thinking(self, transport):
        """Test normalization preserves thinking content."""
        r = SimpleNamespace(
            content=[
                SimpleNamespace(type="thinking", thinking="Let me think..."),
                SimpleNamespace(type="text", text="The answer is 42"),
            ],
            stop_reason="end_turn",
            usage=SimpleNamespace(input_tokens=10, output_tokens=15),
            model="claude-sonnet-4-6",
        )
        nr = transport.normalize_response(r)
        assert nr.content == "The answer is 42"
        assert nr.reasoning == "Let me think..."
    def test_build_kwargs_returns_dict(self, transport):
        """Test build_kwargs produces a usable kwargs dict."""
        messages = [{"role": "user", "content": "Hello"}]
        kw = transport.build_kwargs(
            model="claude-sonnet-4-6",
            messages=messages,
            max_tokens=1024,
        )
        assert isinstance(kw, dict)
        assert "model" in kw
        assert "max_tokens" in kw
        assert "messages" in kw
    def test_convert_messages_extracts_system(self, transport):
        """Test convert_messages separates system from messages."""
        messages = [
            {"role": "system", "content": "You are helpful."},
            {"role": "user", "content": "Hi"},
        ]
        system, msgs = transport.convert_messages(messages)
        # System should be extracted
        assert system is not None
        # Messages should only have user
        assert len(msgs) >= 1
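The assertions in this test file pin down two small transport behaviors: how provider stop reasons map onto OpenAI-style finish reasons, and when cache statistics are reported at all. Below is a minimal sketch that would satisfy them, assuming a plain dict lookup with a "stop" fallback. It is not the transport's actual code: the free functions and the `_FINISH_REASONS` table are illustrative, and the table lists only the pairs visible in these tests.

```python
from types import SimpleNamespace

# Hypothetical table: only the pairs asserted in the tests above.
# Anything unrecognized falls back to "stop".
_FINISH_REASONS = {
    "end_turn": "stop",
    "stop_sequence": "stop",
    "tool_use": "tool_calls",
    "refusal": "content_filter",
    "model_context_window_exceeded": "length",
}


def map_finish_reason(stop_reason):
    """Translate a provider stop reason into an OpenAI-style finish reason."""
    return _FINISH_REASONS.get(stop_reason, "stop")


def extract_cache_stats(response):
    """Return cache token counts, or None when there is nothing to report."""
    usage = getattr(response, "usage", None)
    if usage is None:
        return None
    cached = getattr(usage, "cache_read_input_tokens", 0) or 0
    created = getattr(usage, "cache_creation_input_tokens", 0) or 0
    if not cached and not created:
        # All-zero counters are treated the same as "no cache activity".
        return None
    return {"cached_tokens": cached, "creation_tokens": created}


# Usage with the same stand-in objects the tests build.
usage = SimpleNamespace(cache_read_input_tokens=100, cache_creation_input_tokens=50)
stats = extract_cache_stats(SimpleNamespace(usage=usage))
```

Returning `None` for both a missing `usage` and all-zero counters lets callers treat "no cache activity" as a single case.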
@@ -11,9 +11,3 @@ arcee = ProviderProfile(
)
register_provider(arcee)
def register(ctx):
    """No-op: this provider has no workspace package yet."""
    pass
@@ -19,9 +19,3 @@ azure_foundry = ProviderProfile(
)
register_provider(azure_foundry)
def register(ctx):
    """Plugin entry point: delegates to the inner hermes_agent_azure package."""
    from hermes_agent_azure import register as _inner_register
    _inner_register(ctx)
@@ -1,57 +0,0 @@
"""hermes-agent-azure: Microsoft Entra ID / Azure Identity adapter for Hermes Agent."""
from hermes_agent_azure.adapter import (  # noqa: F401
    SCOPE_AI_AZURE_DEFAULT,
    EntraIdentityConfig,
    _build_default_credential,
    _require_azure_identity,
    build_bearer_http_client,
    build_credential,
    build_token_provider,
    describe_active_credential,
    has_azure_identity_credentials,
    has_azure_identity_installed,
    is_token_provider,
    materialize_bearer_for_http,
    reset_credential_cache,
)
def register(ctx):
    """Entry point for the hermes_agent.plugins entry point group."""
    from hermes_agent_azure import adapter
    ctx.register_provider_services("azure", {
        # Auth / credentials
        "is_token_provider": adapter.is_token_provider,
        "has_azure_identity_credentials": adapter.has_azure_identity_credentials,
        "has_azure_identity_installed": adapter.has_azure_identity_installed,
        # Client building
        "build_bearer_http_client": adapter.build_bearer_http_client,
        "build_credential": adapter.build_credential,
        "build_token_provider": adapter.build_token_provider,
        "materialize_bearer_for_http": adapter.materialize_bearer_for_http,
        "reset_credential_cache": adapter.reset_credential_cache,
        # Constants / config
        "SCOPE_AI_AZURE_DEFAULT": adapter.SCOPE_AI_AZURE_DEFAULT,
        "EntraIdentityConfig": adapter.EntraIdentityConfig,
        # Internal helpers
        "_build_default_credential": adapter._build_default_credential,
        "_require_azure_identity": adapter._require_azure_identity,
        "describe_active_credential": adapter.describe_active_credential,
    })
    # Register the provider resolver: core dispatches to this instead of
    # having a per-azure-foundry if/elif branch in resolve_provider_client().
    from hermes_agent_azure.resolve import resolve_auxiliary_client as _azure_resolver
    ctx.register_provider_resolver("azure-foundry", _azure_resolver)
    # Register the provider overlay: core merges this into HERMES_OVERLAYS
    from agent.plugin_registries import ProviderOverlayEntry
    ctx.register_provider_overlay(ProviderOverlayEntry(
        provider_name="azure-foundry",
        transport="openai_chat",  # default; overridden by api_mode in config
        base_url_env_var="AZURE_FOUNDRY_BASE_URL",
        display_name="Azure AI Foundry",
        aliases=[],
    ))
@@ -1,131 +0,0 @@
"""Azure Foundry provider resolver for auxiliary client construction.
Handles ALL provider-specific logic for building auxiliary clients:
Entra ID auth, static API key, base URL resolution, api_mode routing
(chat_completions, codex_responses, anthropic_messages).
"""
from __future__ import annotations
import logging
from typing import Any, Optional
from urllib.parse import parse_qs, urlparse, urlunparse
logger = logging.getLogger(__name__)
def _extract_url_query_params(url: str):
    """Extract query params from URL, return (clean_url, default_query dict or None)."""
    parsed = urlparse(url)
    if parsed.query:
        clean = urlunparse(parsed._replace(query=""))
        params = {k: v[0] for k, v in parse_qs(parsed.query).items()}
        return clean, params
    return url, None
def _normalize_resolved_model(model: str, provider: str) -> str:
    """Normalize model name for a given provider."""
    return str(model or "").strip()
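To make the behavior of `_extract_url_query_params` concrete, here is a stand-alone copy of the same logic run against a sample URL. The function name `split_query` and the `example.invalid` host are placeholders rather than anything from the plugin. Two details follow from `parse_qs`: a repeated key keeps only its first value, and blank values are dropped by default, so a URL like `...?flag=` yields an empty dict rather than `None`.

```python
from urllib.parse import parse_qs, urlparse, urlunparse


def split_query(url: str):
    """Same logic as _extract_url_query_params, copied for illustration."""
    parsed = urlparse(url)
    if parsed.query:
        clean = urlunparse(parsed._replace(query=""))
        # parse_qs maps each key to a list of values; keep the first one.
        params = {k: v[0] for k, v in parse_qs(parsed.query).items()}
        return clean, params
    return url, None


# Placeholder URL with a repeated key to show first-value-wins.
clean, params = split_query(
    "https://example.invalid/openai/v1?api-version=2024-10-21&a=1&a=2"
)
# clean  -> "https://example.invalid/openai/v1"
# params -> {"api-version": "2024-10-21", "a": "1"}
```

In the resolver, the second element is passed to the OpenAI client as `default_query` only when it is non-empty.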
def resolve_auxiliary_client(
    *,
    model: str | None = None,
    explicit_api_key: str | None = None,
    explicit_base_url: str | None = None,
    async_mode: bool = False,
    is_vision: bool = False,
    main_runtime: dict | None = None,
    api_mode: str | None = None,
) -> tuple[Any, str] | tuple[None, None]:
    """Resolve an Azure Foundry auxiliary client via the runtime resolver.
    Mirrors the anthropic/bedrock resolver shape but delegates to
    ``hermes_cli.runtime_provider._resolve_azure_foundry_runtime``
    (the same resolver the main agent uses), so that:
    * ``auth_mode: api_key`` (default) gets the static
      ``AZURE_FOUNDRY_API_KEY`` string.
    * ``auth_mode: entra_id`` gets a callable bearer-token provider
      (``Callable[[], str]`` from the azure identity adapter).
    * Per-model ``api_mode`` auto-routing for GPT-5.x / o-series /
      codex models works.
    * ``model.entra.{tenant_id,client_id,authority,scope}`` config
      fields propagate.
    * Non-default ``model.base_url`` overrides are honored.
    Returns ``(client, model)`` or ``(None, None)`` on failure.
    """
    from openai import OpenAI
    try:
        from hermes_cli.runtime_provider import _resolve_azure_foundry_runtime
        from hermes_cli.auth import AuthError
        from hermes_cli.config import load_config
    except ImportError:
        return None, None
    try:
        cfg = load_config()
        model_cfg = cfg.get("model") if isinstance(cfg, dict) else {}
        if not isinstance(model_cfg, dict):
            model_cfg = {}
    except Exception:
        model_cfg = {}
    try:
        runtime = _resolve_azure_foundry_runtime(
            requested_provider="azure-foundry",
            model_cfg=model_cfg,
            explicit_api_key=explicit_api_key,
            explicit_base_url=explicit_base_url,
            target_model=model,
        )
    except AuthError as exc:
        logger.debug("Auxiliary azure-foundry: %s", exc)
        return None, None
    except Exception as exc:
        logger.debug("Auxiliary azure-foundry runtime error: %s", exc)
        return None, None
    api_key = runtime.get("api_key")
    base_url = str(runtime.get("base_url", "") or "")
    runtime_api_mode = api_mode or runtime.get("api_mode") or "chat_completions"
    _has_key = bool(api_key) if not callable(api_key) else True
    if not _has_key or not base_url:
        return None, None
    final_model = _normalize_resolved_model(
        model or str(model_cfg.get("default") or ""),
        "azure-foundry",
    )
    if not final_model:
        logger.debug(
            "Auxiliary azure-foundry: no model resolved (model=%r, default=%r)",
            model, model_cfg.get("default"),
        )
        return None, None
    extra: dict[str, Any] = {}
    _clean_base, _dq = _extract_url_query_params(base_url)
    if _dq:
        extra["default_query"] = _dq
    client = OpenAI(api_key=api_key, base_url=_clean_base, **extra)
    if runtime_api_mode == "codex_responses":
        from agent.auxiliary_client import CodexAuxiliaryClient
        return CodexAuxiliaryClient(client, final_model), final_model
    if runtime_api_mode == "anthropic_messages":
        from agent.plugin_registries import registries
        maybe_wrap = registries.get_provider_service("anthropic", "maybe_wrap_anthropic")
        if maybe_wrap is not None:
            return maybe_wrap(
                client, final_model, api_key,
                base_url, runtime_api_mode,
            ), final_model
    return client, final_model
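The `_has_key` check in the resolver above treats a callable `api_key` (the Entra ID bearer-token provider) as present without invoking it, while a static key must be a non-empty string. A small stand-alone sketch of that rule follows; `has_usable_key` and the `ApiKey` alias are hypothetical names introduced only for illustration.

```python
from typing import Callable, Union

# Hypothetical alias: a static key, a zero-argument token provider, or nothing.
ApiKey = Union[str, Callable[[], str], None]


def has_usable_key(api_key: ApiKey) -> bool:
    """Mirror of the resolver's check: callables count as present, uncalled."""
    if callable(api_key):
        # Not invoked here, so no token request happens during resolution.
        return True
    return bool(api_key)


static_ok = has_usable_key("sk-placeholder")
provider_ok = has_usable_key(lambda: "bearer-token-placeholder")
missing = has_usable_key(None)
```

Deferring the call means a broken token provider only surfaces on the first real request, not while the client is being built.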
@@ -1,19 +0,0 @@
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
[project]
name = "hermes-agent-azure"
version = "0.1.0"
description = "Microsoft Entra ID / Azure Identity adapter for Hermes Agent"
requires-python = ">=3.11"
dependencies = [
"hermes-agent",
"azure-identity==1.25.3",
]
[project.entry-points."hermes_agent.plugins"]
azure = "hermes_agent_azure:register"
[tool.setuptools.packages.find]
include = ["hermes_agent_azure*"]
@@ -1,71 +0,0 @@
"""Shared fixtures for azure-foundry plugin tests.
Registers the azure plugin in the singleton registry before each test.
"""
import pytest
class _FullCtx:
    """Plugin context that wires up all registry hooks."""
    def register_provider_services(self, name, services):
        from agent.plugin_registries import registries
        registries.register_provider_services(name, services)
    def register_provider_resolver(self, name, resolver):
        from agent.plugin_registries import registries
        registries.register_provider_resolver(name, resolver)
    def register_credential_pool_hook(self, name, hook):
        from agent.plugin_registries import registries
        registries.register_credential_pool_hook(name, hook)
    def register_transport(self, api_mode, transport_cls):
        from agent.plugin_registries import registries
        registries._transports[api_mode] = transport_cls
    def register_pricing_provider(self, name, entries):
        from agent.plugin_registries import registries
        registries.register_pricing_provider(name, entries)
    def register_provider_overlay(self, entry):
        from agent.plugin_registries import registries
        registries.register_provider_overlay(entry)
    def __getattr__(self, name):
        if name.startswith("register_"):
            return lambda *a, **kw: None
        raise AttributeError(name)
@pytest.fixture(autouse=True)
def _register_azure_plugin():
    """Register the real azure plugin for the duration of each test."""
    from agent.plugin_registries import registries
    _prev_services = dict(registries._provider_services)
    _prev_resolvers = dict(registries._provider_resolvers)
    _prev_cph = dict(registries._credential_pool_hooks)
    ctx = _FullCtx()
    try:
        from hermes_agent_azure import register as _reg
        _reg(ctx)
    except ImportError:
        pass
    # azure-foundry tests for Anthropic Messages mode need the anthropic plugin too
    try:
        from hermes_agent_anthropic import register as _anthro_reg
        _anthro_reg(ctx)
    except ImportError:
        pass
    yield
    for d, prev in [
        (registries._provider_services, _prev_services),
        (registries._provider_resolvers, _prev_resolvers),
        (registries._credential_pool_hooks, _prev_cph),
    ]:
        d.clear()
        d.update(prev)
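The fixture above snapshots three registry dicts before registering plugins and restores them in place afterwards. The same pattern can be sketched as a reusable context manager; `preserved` is a hypothetical helper and the registries here are plain dicts, not the real `agent.plugin_registries` object. `_FullCtx` also writes to the transport, pricing, and overlay registries, which the fixture does not snapshot. Whether that leaks state between tests depends on registry internals not shown in this diff.

```python
from contextlib import contextmanager


@contextmanager
def preserved(*registries: dict):
    """Restore each dict to its entry-time contents on exit, in place."""
    snapshots = [dict(r) for r in registries]  # shallow copies
    try:
        yield
    finally:
        for reg, snap in zip(registries, snapshots):
            # clear() + update() keeps the same dict object, so other modules
            # holding a reference to it see the restored contents.
            reg.clear()
            reg.update(snap)


services = {"azure": {"is_token_provider": None}}
resolvers = {}
with preserved(services, resolvers):
    services["bedrock"] = {}
    resolvers["azure-foundry"] = object()
```

Restoring in place rather than rebinding the attribute matters when the registry dicts are shared by reference.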
@@ -27,9 +27,3 @@ bedrock = BedrockProfile(
)
register_provider(bedrock)
def register(ctx):
    """Plugin entry point: delegates to the inner hermes_agent_bedrock package."""
    from hermes_agent_bedrock import register as _inner_register
    _inner_register(ctx)
@@ -1,125 +0,0 @@
"""hermes-agent-bedrock: AWS Bedrock Converse API adapter for Hermes Agent."""
from hermes_agent_bedrock.adapter import (  # noqa: F401
    BEDROCK_DEFAULT_CONTEXT_LENGTH,
    CONTEXT_OVERFLOW_PATTERNS,
    OVERLOAD_PATTERNS,
    THROTTLE_PATTERNS,
    _AWS_CREDENTIAL_ENV_VARS,
    _DISCOVERY_CACHE_TTL_SECONDS,
    _NON_TOOL_CALLING_PATTERNS,
    _STALE_LIB_MODULE_PREFIXES,
    _convert_content_to_converse,
    _converse_stop_reason_to_openai,
    _extract_provider_from_arn,
    _get_bedrock_control_client,
    _get_bedrock_runtime_client,
    _model_supports_tool_use,
    _require_boto3,
    _traceback_frames_modules,
    bedrock_model_ids_or_none,
    build_converse_kwargs,
    call_converse,
    call_converse_stream,
    classify_bedrock_error,
    convert_messages_to_converse,
    convert_tools_to_converse,
    discover_bedrock_models,
    get_bedrock_context_length,
    get_bedrock_model_ids,
    has_aws_credentials,
    invalidate_runtime_client,
    is_anthropic_bedrock_model,
    is_context_overflow_error,
    is_stale_connection_error,
    normalize_converse_response,
    normalize_converse_stream_events,
    reset_client_cache,
    reset_discovery_cache,
    resolve_aws_auth_env_var,
    resolve_bedrock_region,
    stream_converse_with_callbacks,
)
def register(ctx):
    """Entry point for the hermes_agent.plugins entry point group."""
    from hermes_agent_bedrock import adapter
    ctx.register_provider_services("bedrock", {
        # Auth / credentials
        "has_aws_credentials": adapter.has_aws_credentials,
        "resolve_aws_auth_env_var": adapter.resolve_aws_auth_env_var,
        "resolve_bedrock_region": adapter.resolve_bedrock_region,
        "_AWS_CREDENTIAL_ENV_VARS": adapter._AWS_CREDENTIAL_ENV_VARS,
        # Transport
        "build_converse_kwargs": adapter.build_converse_kwargs,
        "convert_messages_to_converse": adapter.convert_messages_to_converse,
        "convert_tools_to_converse": adapter.convert_tools_to_converse,
        "normalize_converse_response": adapter.normalize_converse_response,
        "normalize_converse_stream_events": adapter.normalize_converse_stream_events,
        "call_converse": adapter.call_converse,
        "call_converse_stream": adapter.call_converse_stream,
        "stream_converse_with_callbacks": adapter.stream_converse_with_callbacks,
        # Model metadata
        "bedrock_model_ids_or_none": adapter.bedrock_model_ids_or_none,
        "discover_bedrock_models": adapter.discover_bedrock_models,
        "get_bedrock_context_length": adapter.get_bedrock_context_length,
        "get_bedrock_model_ids": adapter.get_bedrock_model_ids,
        "BEDROCK_DEFAULT_CONTEXT_LENGTH": adapter.BEDROCK_DEFAULT_CONTEXT_LENGTH,
        # Client management
        "_get_bedrock_control_client": adapter._get_bedrock_control_client,
        "_get_bedrock_runtime_client": adapter._get_bedrock_runtime_client,
        "invalidate_runtime_client": adapter.invalidate_runtime_client,
        "reset_client_cache": adapter.reset_client_cache,
        "reset_discovery_cache": adapter.reset_discovery_cache,
        # Error handling
        "classify_bedrock_error": adapter.classify_bedrock_error,
        "is_context_overflow_error": adapter.is_context_overflow_error,
        "is_stale_connection_error": adapter.is_stale_connection_error,
        "CONTEXT_OVERFLOW_PATTERNS": adapter.CONTEXT_OVERFLOW_PATTERNS,
        "OVERLOAD_PATTERNS": adapter.OVERLOAD_PATTERNS,
        "THROTTLE_PATTERNS": adapter.THROTTLE_PATTERNS,
        "_NON_TOOL_CALLING_PATTERNS": adapter._NON_TOOL_CALLING_PATTERNS,
        "_STALE_LIB_MODULE_PREFIXES": adapter._STALE_LIB_MODULE_PREFIXES,
        "_DISCOVERY_CACHE_TTL_SECONDS": adapter._DISCOVERY_CACHE_TTL_SECONDS,
        # Internal helpers
        "_require_boto3": adapter._require_boto3,
        "_model_supports_tool_use": adapter._model_supports_tool_use,
        "is_anthropic_bedrock_model": adapter.is_anthropic_bedrock_model,
        "_convert_content_to_converse": adapter._convert_content_to_converse,
        "_converse_stop_reason_to_openai": adapter._converse_stop_reason_to_openai,
        "_extract_provider_from_arn": adapter._extract_provider_from_arn,
        "_traceback_frames_modules": adapter._traceback_frames_modules,
    })
    # Register the provider resolver: core dispatches to this instead of
    # having per-bedrock if/elif branches in resolve_provider_client().
    from hermes_agent_bedrock.resolve import resolve_auxiliary_client as _bedrock_resolver
    ctx.register_provider_resolver("bedrock", _bedrock_resolver)
    # Register the bedrock transport so core doesn't need to import it.
    from hermes_agent_bedrock.transport import BedrockTransport
    ctx.register_transport("bedrock_converse", BedrockTransport)
    # Register pricing entries: core looks these up via the registry
    # instead of hardcoding them in _OFFICIAL_DOCS_PRICING.
    from hermes_agent_bedrock.pricing import (
        get_bedrock_pricing_entries,
        BEDROCK_PRICING_KEYS,
    )
    _entries = get_bedrock_pricing_entries()
    _keyed = []
    for (prov, model), entry in zip(BEDROCK_PRICING_KEYS, _entries):
        _keyed.append((prov, model, entry))
    ctx.register_pricing_provider("bedrock", _keyed)
    # Register the provider overlay: core merges this into HERMES_OVERLAYS
    from agent.plugin_registries import ProviderOverlayEntry
    ctx.register_provider_overlay(ProviderOverlayEntry(
        provider_name="bedrock",
        transport="bedrock_converse",
        auth_type="aws_sdk",
        display_name="AWS Bedrock",
        aliases=["aws", "aws-bedrock", "amazon-bedrock", "amazon"],
    ))
@@ -1,80 +0,0 @@
"""Bedrock model pricing data.
Official docs snapshot entries for AWS Bedrock models.
Source: https://aws.amazon.com/bedrock/pricing/
"""
from __future__ import annotations
from decimal import Decimal
def get_bedrock_pricing_entries() -> list:
    """Return official docs pricing entries for Bedrock models."""
    from agent.usage_pricing import PricingEntry
    _BEDROCK_PRICING_URL = "https://aws.amazon.com/bedrock/pricing/"
    _BEDROCK_PRICING_VER = "bedrock-pricing-2026-04"
    return [
        PricingEntry(
            input_cost_per_million=Decimal("15.00"),
            output_cost_per_million=Decimal("75.00"),
            source="official_docs_snapshot",
            source_url=_BEDROCK_PRICING_URL,
            pricing_version=_BEDROCK_PRICING_VER,
        ),  # ("bedrock", "anthropic.claude-opus-4-6")
        PricingEntry(
            input_cost_per_million=Decimal("3.00"),
            output_cost_per_million=Decimal("15.00"),
            source="official_docs_snapshot",
            source_url=_BEDROCK_PRICING_URL,
            pricing_version=_BEDROCK_PRICING_VER,
        ),  # ("bedrock", "anthropic.claude-sonnet-4-6")
        PricingEntry(
            input_cost_per_million=Decimal("3.00"),
            output_cost_per_million=Decimal("15.00"),
            source="official_docs_snapshot",
            source_url=_BEDROCK_PRICING_URL,
            pricing_version=_BEDROCK_PRICING_VER,
        ),  # ("bedrock", "anthropic.claude-sonnet-4-5")
        PricingEntry(
            input_cost_per_million=Decimal("0.80"),
            output_cost_per_million=Decimal("4.00"),
            source="official_docs_snapshot",
            source_url=_BEDROCK_PRICING_URL,
            pricing_version=_BEDROCK_PRICING_VER,
        ),  # ("bedrock", "anthropic.claude-haiku-4-5")
        PricingEntry(
            input_cost_per_million=Decimal("0.80"),
            output_cost_per_million=Decimal("3.20"),
            source="official_docs_snapshot",
            source_url=_BEDROCK_PRICING_URL,
            pricing_version=_BEDROCK_PRICING_VER,
        ),  # ("bedrock", "amazon.nova-pro")
        PricingEntry(
            input_cost_per_million=Decimal("0.06"),
            output_cost_per_million=Decimal("0.24"),
            source="official_docs_snapshot",
            source_url=_BEDROCK_PRICING_URL,
            pricing_version=_BEDROCK_PRICING_VER,
        ),  # ("bedrock", "amazon.nova-lite")
        PricingEntry(
            input_cost_per_million=Decimal("0.035"),
            output_cost_per_million=Decimal("0.14"),
            source="official_docs_snapshot",
            source_url=_BEDROCK_PRICING_URL,
            pricing_version=_BEDROCK_PRICING_VER,
        ),  # ("bedrock", "amazon.nova-micro")
    ]
BEDROCK_PRICING_KEYS = [
    ("bedrock", "anthropic.claude-opus-4-6"),
    ("bedrock", "anthropic.claude-sonnet-4-6"),
    ("bedrock", "anthropic.claude-sonnet-4-5"),
    ("bedrock", "anthropic.claude-haiku-4-5"),
    ("bedrock", "amazon.nova-pro"),
    ("bedrock", "amazon.nova-lite"),
    ("bedrock", "amazon.nova-micro"),
]
@@ -1,66 +0,0 @@
"""Bedrock provider resolver for auxiliary client construction.
Handles ALL provider-specific logic for building auxiliary clients:
AWS credential detection, region resolution, and Bedrock client construction.
"""
from __future__ import annotations
import logging
from typing import Any, Optional
logger = logging.getLogger(__name__)
def resolve_auxiliary_client(
    *,
    model: str | None = None,
    explicit_api_key: str | None = None,
    explicit_base_url: str | None = None,
    async_mode: bool = False,
    is_vision: bool = False,
    main_runtime: dict | None = None,
    api_mode: str | None = None,
) -> tuple[Any, str] | tuple[None, None]:
    """Resolve an auxiliary client for the Bedrock provider.
    Returns ``(client, default_model)`` or ``(None, None)`` if unavailable.
    """
    from agent.plugin_registries import registries
    from agent.anthropic_aux import (
        AnthropicAuxiliaryClient,
        AsyncAnthropicAuxiliaryClient,
    )
    _bedrock = registries.get_provider_namespace("bedrock")
    _anthropic = registries.get_provider_namespace("anthropic")
    has_aws_credentials = _bedrock.get("has_aws_credentials")
    resolve_bedrock_region = _bedrock.get("resolve_bedrock_region")
    build_anthropic_bedrock_client = _anthropic.get("build_anthropic_bedrock_client")
    if has_aws_credentials is None or resolve_bedrock_region is None or build_anthropic_bedrock_client is None:
        return None, None
    if not has_aws_credentials():
        logger.debug("resolve_provider_client: bedrock requested but "
                     "no AWS credentials found")
        return None, None
    region = resolve_bedrock_region()
    default_model = "anthropic.claude-haiku-4-5-20251001-v1:0"
    final_model = model or default_model
    try:
        real_client = build_anthropic_bedrock_client(region)
    except ImportError as exc:
        logger.warning("resolve_provider_client: cannot create Bedrock "
                       "client: %s", exc)
        return None, None
    client = AnthropicAuxiliaryClient(
        real_client, final_model, api_key="aws-sdk",
        base_url=f"https://bedrock-runtime.{region}.amazonaws.com",
    )
    logger.debug("resolve_provider_client: bedrock (%s, %s)", final_model, region)
    if async_mode:
        client = AsyncAnthropicAuxiliaryClient(client)
    return client, final_model
@@ -1,19 +0,0 @@
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
[project]
name = "hermes-agent-bedrock"
version = "0.1.0"
description = "AWS Bedrock Converse API adapter for Hermes Agent"
requires-python = ">=3.11"
dependencies = [
"hermes-agent",
"boto3==1.42.89",
]
[project.entry-points."hermes_agent.plugins"]
bedrock = "hermes_agent_bedrock:register"
[tool.setuptools.packages.find]
include = ["hermes_agent_bedrock*"]

Some files were not shown because too many files have changed in this diff