Compare commits

..

36 Commits

Author SHA1 Message Date
Ben 4c481860ce ci(docker): run tests/docker/ in build-amd64 against the freshly-built image
The new tests/docker/ suite (added by this PR) was being picked up by the
sharded pytest matrix in tests.yml, where its session-scoped `built_image`
fixture issued a 3-7min `docker build` under tests/docker/conftest.py's
180s pytest-timeout cap. Every test in the directory failed in fixture
setup across all 6 shards.

Fix the suite so it actually runs (not skips):

1. Wire the docker tests into docker-publish.yml's build-amd64 job, right
   after the existing smoke test. The image is already loaded into the
   local daemon as `nousresearch/hermes-agent:test`; set
   HERMES_TEST_IMAGE to that and the fixture's pre-built-image branch
   short-circuits the rebuild. 21 tests run in ~90s locally against a
   prebuilt image, no rebuild cost on top of the existing build step.

2. Exclude tests/docker/ from scripts/run_tests_parallel.py's default
   discovery so the sharded matrix in tests.yml stops trying to build
   the image. Explicit positional paths (`pytest tests/docker/` or
   `scripts/run_tests.sh tests/docker/`) still pick the suite up — the
   skip rule honors directory-level user intent, matching the existing
   per-file override pattern.

The dedicated docker-tests step runs on every PR that touches docker
code (the existing path filters on docker-publish.yml already cover
`tests/docker/**` via `**/*.py`), so the suite gates real changes.
2026-05-25 11:55:03 +10:00
Ben 1150639fa9 chore(ty): suppress unresolved-import inside tests/ to keep lint-diff PR comment useful
The lint-diff CI job runs ty as a bare uv tool without installing the
project's venv, so test files trip ty with unresolved-import on
pytest itself and on local test-only deps. The PR #30136 github-
actions lint-summary bot reported 7 new such warnings, even though
ty itself flags them as non-blocking and the imports demonstrably
work at runtime (the full pytest suite in a sibling CI job exercises
them).

Installing the full venv just to please ty would balloon the lint
job runtime; the override below tells ty to ignore unresolved-import
strictly inside tests/. The diagnostic class continues to be active
for hermes_cli/, agent/, plugins/, etc. — anywhere those imports
might really break.
2026-05-25 11:22:06 +10:00
Ben 02c933aedc test(docker): fix svstat 'want up' assertion in profile-gateway lifecycle test
After the supervise-perms fix lands, the s6 lifecycle actually works
for the hermes user — hermes -p <profile> gateway start now genuinely
brings the supervised gateway up rather than silently no-op'ing on
EACCES. That exposes a latent bug in this test's assertion: it
expected 'want up' to appear literally in s6-svstat output, but
s6-svstat elides redundancies — when the slot is currently up AND
s6 wants it up, the output is just 'up (pid N pgid N) X seconds';
the explicit 'want up' token only appears when current ≠ wanted
(e.g. 'down (exitcode 1) … , want up' on a crash-loop).

Add a small helper _svstat_wants_up() that reads the want-state
correctly across both spellings:
  * 'up …'                       → wanted up (unless explicit 'want down')
  * 'down …, want up'            → wanted up explicitly
  * 'down …'                     → wanted down

Both stop and start assertions now use the helper. Also rewords
the module docstring to acknowledge that the supervised process
may succeed OR crash-loop depending on environment, but the want-
state contract holds either way.
2026-05-25 11:21:47 +10:00
Ben c41f908ad4 fix(docker): make s6 lifecycle work for the unprivileged hermes user
Resolves the explicit "Known follow-up" left by commit 2f8ceeab9 and
the resulting CI failures in tests/docker/test_dashboard.py and
tests/docker/test_s6_profile_gateway_integration.py.

The product gap
---------------
Every hermes runtime operation inside the container runs as the
hermes user (UID 10000) via s6-setuidgid. But s6-supervise — spawned
by s6-svscan running as PID 1 — creates each service's supervise/
and top-level event/ directories with mode 0700 owned by its
effective UID (root). That left every s6-svc / s6-svstat / s6-svwait
call from hermes hitting EACCES on the supervise/control FIFO and
supervise/status — i.e. the entire S6ServiceManager lifecycle
(register, start, stop, unregister) was inert in production.

The 2f8ceeab9 commit message called this out and deferred the fix.
The audit changes that landed alongside it (defaulting docker_exec
to -u hermes) made the integration tests reproduce the bug
deterministically; the fix below resolves it.

The fix: pre-create the supervise/ skeleton hermes-owned
----------------------------------------------------------
Reading s6's source (src/supervision/s6-supervise.c::trymkdir +
control_init), the mkdir and mkfifo calls that build the supervise
tree are EEXIST-safe: if the directory or FIFO is already present,
s6-supervise reuses it and skips the chown/chmod fix-up that would
normally make event/ 03730 root:root. So if we lay the skeleton
down with hermes ownership before triggering s6-svscanctl -a,
s6-supervise inherits our layout and never touches it. The
death_tally / lock / status regular files written later by
s6-supervise (still as root) land mode 0644 — world-readable —
which is all s6-svstat needs.

New module-level helper _seed_supervise_skeleton(svc_dir) in
hermes_cli/service_manager.py lays down:
  svc_dir/event/                       hermes:hermes 03730
  svc_dir/supervise/                   hermes:hermes 0755
  svc_dir/supervise/event/             hermes:hermes 03730
  svc_dir/supervise/control            hermes:hermes 0660 (FIFO)
  svc_dir/log/event/                   hermes:hermes 03730  (if log/ present)
  svc_dir/log/supervise/               hermes:hermes 0755
  svc_dir/log/supervise/event/         hermes:hermes 03730
  svc_dir/log/supervise/control        hermes:hermes 0660 (FIFO)

The log/ branch matters because the logger is a second
s6-supervise instance — without it, unregister rmtree races on
the logger's root-owned supervise dir even after the parent
slot's supervise/ is hermes-owned. The helper is idempotent and
swallows PermissionError on chown so it works equally well when
called from root (cont-init.d) or hermes (runtime register).

Wiring
------
1. S6ServiceManager.register_profile_gateway calls
   _seed_supervise_skeleton(tmp_dir) just before publishing the
   slot via Path.replace. Runtime-registered profile gateways are
   set up by hermes.

2. container_boot._register_service does the same in the cont-init.d
   reconciliation path so boot-time-restored profile slots inherit
   the same layout.

3. New cont-init.d/015-supervise-perms script chowns the supervise/
   and event/ trees for STATIC s6-rc services (dashboard,
   main-hermes). These are spawned by s6-rc before cont-init.d
   gets to run, so the EEXIST-trick doesn't apply; we chown the
   already-existing tree instead. s6-supervise keeps using the
   same files; it never re-asserts ownership on a running service.
   The script skips s6-overlay internal services (s6rc-*,
   s6-linux-*) so the supervision tree itself stays root-only.
   015- slot is intentional: lex-sorts between 01-hermes-setup
   and 02-reconcile-profiles in the container's C-locale, so
   the chown finishes before the reconciler walks the scandir.

Unregister teardown reordering
------------------------------
S6ServiceManager.unregister_profile_gateway now fires
s6-svscanctl -an BEFORE rmtree (with a 200ms grace), so
s6-svscan reaps the supervise child and releases its file
handles on supervise/lock + supervise/status before we try to
remove the directory. Previously rmtree raced s6-supervise on a
set of files inside the supervise dir, and even with the parent
supervise/ now hermes-owned, the contained files (death_tally,
lock, status, written by root) could still be in use.

Dashboard down-state redesign
-----------------------------
The original PR #30136 review fix wrote a 'down' marker file
into /run/service/dashboard/ via cont-init.d/03-dashboard-toggle.
That approach was broken in two ways:

  (a) /run/service/dashboard is a symlink to a TRANSIENT
      /run/s6-rc:s6-rc-init:<tmpdir>/ directory while s6-rc is
      mid-transaction; the touch landed in a soon-to-be-discarded
      tmp.

  (b) Even when written to the final /run/s6-rc/servicedirs/
      location, the 'down' file is only consulted by s6-supervise
      at slot startup. s6-rc's user-bundle explicitly transitions
      'dashboard' to 'up' on every boot, overriding any down
      marker.

The right fix is the canonical s6 pattern: when HERMES_DASHBOARD
is unset, the dashboard run script exits 0 and a companion
finish script exits 125. Per s6-supervise(8), exit code 125 from
the finish script is the 'permanent failure, do not restart'
marker — equivalent to s6-svc -O. The slot reports as 'down' to
s6-svstat, matching the reality that no dashboard process is
running. When HERMES_DASHBOARD IS truthy, finish exits 0 and
restart-on-crash semantics apply.

03-dashboard-toggle is removed (its function is now subsumed by
the run/finish pair).

Tests
-----
Adds four unit tests for _seed_supervise_skeleton covering the
produced layout, the log/ subservice case, the skip-when-no-log
case, and idempotency. The live-container verification continues
to live in tests/docker/test_s6_profile_gateway_integration.py and
tests/docker/test_dashboard.py — both now pass against the
rebuilt image.

References
----------
* Skarnet skaware mailing list 2020-02-02 (Laurent Bercot
  + Guillermo Diaz Hartusch) on unprivileged s6 tool semantics:
  http://skarnet.org/lists/skaware/1424.html
* just-containers/s6-overlay#130 — same EEXIST-preseed pattern,
  community-validated 2016 onward
* https://skarnet.org/software/s6/servicedir.html — exit-code 125
  semantics in finish scripts
2026-05-25 11:21:31 +10:00
Ben ffc1bb6393 test(dockerfile): recognize s6-overlay/init as a valid PID-1; harden against historical-comment masquerade
PR #30136 CI: test_dockerfile_entrypoint_routes_through_the_init failed
because the test hardcoded known_inits = ('tini', 'dumb-init',
'catatonit'). The PR replaced tini with s6-overlay's /init (which execs
s6-svscan as PID 1) — same SIGCHLD-reaping contract, different name,
so the substring scan against ENTRYPOINT missed it.

Two-part fix:

1. Extend the accepted token list to include 's6-overlay', 's6-svscan',
   and '/init'. The contract these tests enforce is behavioural ('some
   PID-1 init reaps SIGCHLD'), so the names list is purely a recognition
   table and any reaper-capable family should qualify.

2. Harden test_dockerfile_installs_an_init_for_zombie_reaping (the
   sibling check) against comment-only matches. It was scanning the full
   Dockerfile text and only passed because the word 'tini' is still in
   a historical comment explaining why we used to use it. The next
   person to clean up that comment would have silently broken the test.
   New _instruction_text() helper joins only the parsed, non-comment
   Dockerfile instructions so stale comments can't satisfy the check.
2026-05-25 10:32:51 +10:00
Ben 472be1247d fix(service_manager): pass encoding to Path.read_text in _s6_running
PR #30136 CI: ruff PLW1514 (preview rule unspecified-encoding) failed on
`Path('/proc/1/comm').read_text().strip()` introduced by commit
2f8ceeab9 (the daimon-nous critical-bug fix that switched s6 detection
off /proc/1/exe to /proc/1/comm so it works for the unprivileged hermes
user).

Add explicit encoding='utf-8'. /proc/1/comm is always plain ASCII (the
kernel's PR_GET_NAME / TASK_COMM_LEN buffer), so utf-8 is correct and
locale-independent.
2026-05-25 10:32:36 +10:00
Ben Barclay 59da190512 Merge branch 'main' into docker_s6 2026-05-25 09:39:27 +10:00
Ben 0988ab83b7 docs(plans): trim s6-overlay plan to a post-implementation reference
PR #30136 review item O7: the plan doc was 3,191 lines — 5x the
size of any other plan in docs/plans/ and the largest reference
document in the repo. With the implementation shipped, most of
that content is either:

* The phase-by-phase TDD walkthrough (~2,800 lines): now canonical
  in the PR commit log (`git log a957ef083..a6f7171a5`).
* The v2/v3 re-validation preambles: artifacts of the planning
  process, no longer load-bearing.
* The full Open Questions deliberations with options A/B/C laid
  out: collapsed into the Decision Log.
* The Rollout Plan and Estimated Timeline: history.

Trim to ~430 lines covering what readers actually need going
forward: the goal, architecture, scope, key design decisions
(D1–D9), risk register (now including the three risks surfaced
in PR review — `_s6_running` detection, svscanctl FIFO perms,
supervise control FIFO perms), the decision log including the
post-merge additions, and the verification checklist (now all
boxes ticked).

Header now reads 'Status: shipped' and points at the PR. The git
history preserves the full v3 plan for anyone who needs it.
2026-05-23 16:24:33 +10:00
Ben 3b69bdb74e test(docker): poll for boot-log signal instead of fixed sleeps
PR #30136 review item O6: test_container_restart.py used fixed
`time.sleep(8)` calls after `docker restart` to wait for the
cont-init reconciler to finish. Fixed sleeps are slow when the
event happens fast and false-fail when the event happens slow.

Replace with two polling helpers:

* `_wait_for_path(container, path, kind='f' | 'd', deadline_s=...)`
  — generic `test -f/-d` poller. Returns True on success, False on
  timeout; callers assert with a clear message.
* `_wait_for_reconcile_log_mention(container, profile, ...)` — the
  reconciler's per-profile log line is the canonical signal that
  the cont-init reconcile has finished for that profile. Poll on
  it instead of a sleep that hopes 8 seconds is enough.

The fixture-level setup wait is similarly migrated: it now polls
for `profile=default` in the boot log (every container always
gets a default-slot entry per item I1) and raises a clear timeout
error from the fixture if the container never finishes cont-init —
much better diagnostics than a mid-test KeyError.

The remaining `time.sleep()` calls are all internal interval_s
between probe attempts; no fixed wait points left.
2026-05-23 16:21:00 +10:00
Ben e3050657aa docs(docker): deprecation warning in entrypoint.sh shim
PR #30136 review item O5: docker/entrypoint.sh is now a thin shim
that forwards to stage2-hook.sh — the real ENTRYPOINT is /init plus
main-wrapper.sh. External scripts that hard-coded entrypoint.sh as
the container's ENTRYPOINT will see the cont-init bootstrap happen
but the CMD will not be exec'd (because stage2-hook only handles
bootstrap; main-wrapper.sh handles the CMD passthrough).

Add a stderr warning explaining the new contract and pointing
callers at the migration path (drop the --entrypoint override).
The shim itself stays in place for one release cycle so the
deprecation isn't a hard break — anyone still invoking it sees
the warning in their logs and has time to migrate.
2026-05-23 16:18:59 +10:00
Ben 541b40532a fix(container_boot): publish reconciled service dirs atomically
PR #30136 review noted the asymmetry: `register_profile_gateway`
used tmp_dir + rename to publish a new service slot atomically,
but the boot-time reconciler wrote files into the slot directly.
Same underlying concern (a concurrent s6-svscan rescan could
observe a half-populated directory), different code path.

Rewrite `container_boot._register_service` to mirror the manager:
build everything in `<scandir>/gateway-<profile>.tmp/`, then
`Path.replace` into place. If a previous interrupted run left a
`.tmp` sibling, it's cleaned up before the new build starts. If
the target already exists, it's removed before the rename so
`Path.replace` doesn't error on a non-empty target (Linux `rename`
overwrites empty targets only).

Three new tests: atomic publication leaves no .tmp leftovers,
overwriting an existing slot still leaves no .tmp leftovers, and
a stale .tmp from an interrupted run is cleaned up automatically.
2026-05-23 15:34:51 +10:00
Ben 5b1fcdd16b fix(container_boot): rotate container-boot.log when it exceeds 256 KiB
PR #30136 review noted: container-boot.log was append-only with no
rotation. On a long-lived container with frequent restarts and
many profiles it would grow unboundedly (~80 B per profile per
reconcile pass).

Add a soft cap: when the file size hits 256 KiB (`_LOG_ROTATE_BYTES`,
≈3000 reconcile lines, ≈1 year of daily reboots × 5 profiles), the
current file is renamed to `container-boot.log.1` (replacing any
existing one) before new entries are appended. Worst case is two
files at ~512 KiB — well within visibility limits for grep/cat.

Rotation is intentionally simple (no logrotate or s6-log machinery
for one append-only file). Failures during rotation are logged via
the module logger and treated as non-fatal — we keep appending to
the existing file rather than dropping the reconcile entry. Three
new unit tests cover above-threshold rotation, below-threshold
non-rotation, and overwrite of an existing .1 file.
2026-05-23 15:33:11 +10:00
Ben f83b9b96d1 docker: drop sh -c wrappers from stage2-hook.sh
PR #30136 review caught: three `s6-setuidgid hermes sh -c "..."`
invocations in stage2-hook.sh interpolated $HERMES_HOME into a
nested shell context. Practically low-risk (a malicious HERMES_HOME
already requires container-launch privileges) but the cleaner
pattern is to invoke commands directly so the shell isn't a second
interpreter.

* `mkdir -p` of the data subdirs now runs directly via s6-setuidgid,
  one path per arg.
* The .install_method stamp is written via `printf | tee` — also no
  shell wrapper.
* The skills_sync invocation uses the venv's python by absolute path
  instead of sourcing activate inside a shell. skills_sync.py doesn't
  need anything from activate beyond sys.path, which the bin-stub
  python already provides.

No behavior change. Just a smaller attack surface and a script
that's easier to read.
2026-05-23 15:31:46 +10:00
Ben 8b6733ebe2 fix(service_manager): rip out dead port parameter
PR #30136 review caught: `_allocate_gateway_port()` in profiles.py
computed a SHA-256-derived port that was threaded through
`register_profile_gateway(profile, port=N)` →
`_render_run_script(profile, port, extra_env)` → and then **ignored**.
The rendered run script picked the bind port from the profile's
config.yaml (`[gateway] port = …`), never from the allocator. So
the entire allocator + parameter chain was dead code.

Remove:

* `hermes_cli.profiles._allocate_gateway_port` (deterministic
  SHA-256 → [9200, 9800) — never used).
* `port` kwarg from `ServiceManager.register_profile_gateway`
  (Protocol + Mixin + S6 implementation).
* `port` positional arg from `_render_run_script(profile, port,
  extra_env)` — now `_render_run_script(profile, extra_env)`.
* The pass-through call in `profiles._maybe_register_gateway_service`.

config.yaml is now the single source of truth for gateway port
selection — matches reality and reduces the API surface. Three
explanatory comments in service_manager.py / profiles.py document
the retirement so future readers don't reach for the allocator and
find a ghost.

Tests: drop the three `_allocate_gateway_port` tests; update
fakes' signatures throughout test_service_manager.py and
test_profiles_s6_hooks.py to match the new no-port API.
2026-05-23 15:30:15 +10:00
Ben 7b16e4448a docs(compose): update entrypoint comment for s6-overlay
PR #30136 review caught: docker-compose.yml still said "If you
override entrypoint, keep /opt/hermes/docker/entrypoint.sh in the
command chain." That was true under tini; under s6-overlay the
entrypoint is /init plus main-wrapper.sh, and entrypoint.sh is now
only a backward-compat shim.

Replace with an accurate description: /init must remain first in the
chain because it's PID 1 and runs the cont-init.d scripts (chown,
profile reconcile, dashboard toggle) before any service starts.
2026-05-23 15:24:46 +10:00
Ben 9ba349b6e9 fix(docker): dashboard slot stays 'down' when HERMES_DASHBOARD unset
PR #30136 review caught a false positive: when HERMES_DASHBOARD was
unset, the dashboard run script did `exec sleep infinity`, so
`s6-svstat /run/service/dashboard` reported the slot as 'up'.
`hermes doctor` and any other s6-svstat-based health check saw the
dashboard as supervised-running even though no dashboard process
existed.

Add cont-init.d/03-dashboard-toggle: writes a `down` marker file
into `/run/service/dashboard/` when HERMES_DASHBOARD is falsy,
removes any leftover marker when it's truthy. s6-supervise honors
`down` by not starting the service, so s6-svstat reports 'down' —
matching reality.

The run script's HERMES_DASHBOARD case-statement stays in place as
a belt-and-suspenders guard, so the two layers can never disagree.

Two new integration tests lock the behavior: slot reports down
when unset; slot reports up when set to 1.
2026-05-23 15:24:17 +10:00
Ben 1759c0f090 fix(service_manager): friendly errors for missing slots and s6-svc failures
PR #30136 review caught: `S6ServiceManager.start/stop/restart` called
`subprocess.run(check=True)` on `s6-svc`, so any failure surfaced as
a raw `CalledProcessError` traceback. The two cases operators
actually hit are:

  1. The service slot doesn't exist — most commonly because the user
     typed a profile name wrong (`hermes -p typo gateway start`).
  2. s6-svc itself fails — most commonly EACCES on the supervise
     control FIFO when running unprivileged.

Both deserve named errors with actionable messages, not stacktraces.

Changes:

* Add `S6Error` base + two concrete errors in `hermes_cli.service_manager`:
    - `GatewayNotRegisteredError(profile)` — carries the unprefixed
      profile name; message: `no such gateway 'typo': register it
      with `hermes profile create typo` first, or pass an existing
      profile name via `-p <name>``.
    - `S6CommandError(service, action, returncode, stderr)` — carries
      the s6-svc rc and stderr; message: `s6-svc start on
      'gateway-coder' failed (rc=111): <stderr>`.

* Factor lifecycle dispatch through `_run_svc(flag, label, name)`:
  pre-checks that the service directory exists (raises
  GatewayNotRegisteredError before invoking s6-svc), then runs
  s6-svc and translates any CalledProcessError into S6CommandError.

* `_dispatch_via_service_manager_if_s6` in `hermes_cli.gateway`
  catches both errors and prints `✗ <message>` + `sys.exit(1)`
  instead of letting the exception bubble. The dispatch path that
  used to dump a traceback at the user now gives an actionable
  one-liner.

Tests: 6 new tests for the error types and their CLI rendering;
existing lifecycle test pre-seeds the slot directory before calling
`mgr.start` etc.
2026-05-23 15:20:41 +10:00
Ben 367c15b1dc fix(container_boot): always register gateway-default slot
PR #30136 review caught: `hermes gateway start` (no `-p`) inside
the container resolves `_profile_suffix() == ""` → service name
`gateway-default`, but no such slot was ever registered. The Phase 4
profile-create hook only fired on `hermes profile create <name>`,
and the root profile (which lives at the top of $HERMES_HOME, not
under `profiles/`) was never one of those. So bare `hermes gateway
start` landed on `s6-svc -u /run/service/gateway-default` →
uncaught `CalledProcessError` → traceback to the user.

Changes:

1. `reconcile_profile_gateways` now always registers a
   `gateway-default` slot before iterating named profiles. Its
   prior state is read from `$HERMES_HOME/gateway_state.json`
   (sibling to the profile root, not under `profiles/`); stale
   runtime files there are swept the same way. Auto-up only if the
   prior state was `running` — same rule as named profiles.

2. `S6ServiceManager._render_run_script` special-cases
   `profile == "default"` to emit `hermes gateway run` with NO
   `-p` flag. Passing `-p default` would resolve to
   `$HERMES_HOME/profiles/default/` — a different profile that
   almost certainly doesn't exist. The empty profile-suffix
   convention is the dispatcher's contract and the run script has
   to match.

3. A user-created `profiles/default/` collides with the reserved
   root-profile slot; the reconciler now skips it with a warning
   rather than producing two registrations of the same service name.

Action-list ordering is stable: `default` first, then named
profiles in directory order. Boot-log readers can rely on this.

Tests: 8 new dedicated default-slot tests plus updates to every
existing test that asserted against the action list (via the new
`_named_actions` helper that drops the always-present default
entry).
2026-05-23 15:16:35 +10:00
Ben 04d1894f36 docs(docker): dashboard IS supervised — update note that contradicted the PR
PR #30136 review caught that website/docs/user-guide/docker.md still
said "The dashboard side-process is **not supervised** — if it
crashes, it stays down until the container restarts." That was true
under tini but is the opposite of the s6 behavior this PR ships and
`test_dashboard_restarts_after_crash` proves.

Replace with a description of what users actually see now: automatic
restart by s6-overlay, new PID after a short backoff, logs via
`docker logs`. The standalone-container caveat carries forward
unchanged.
2026-05-23 15:08:48 +10:00
Ben efd3569739 fix(gateway): route --all stop/restart through s6 under container
PR #30136 review caught that `hermes gateway stop --all` and
`... restart --all` were broken under s6. The Phase 4 dispatcher was
gated on `not stop_all` (and the symmetric restart_all), so `--all`
fell through to `kill_gateway_processes(all_profiles=True)`. pkill
SIGTERMed every gateway, s6-supervise observed the crashes, and
restarted every gateway ~1s later — net effect: `--all` *kicked*
gateways instead of *stopping* them.

Add `_dispatch_all_via_service_manager_if_s6(action)` that iterates
`mgr.list_profile_gateways()` and routes stop/restart through each
service slot. s6's `want up`/`want down` flips correctly, so a
stop persists. Partial failures are surfaced per-profile with a
running success count; the host pkill path is only reached when s6
isn't in play.

`start --all` isn't a CLI surface — the helper rejects it and
returns False (host code path can take over).
2026-05-23 15:08:17 +10:00
Ben 8ae959adb6 fix(ci): drop --entrypoint override in hermes-smoke-test action
PR #30136 review caught a silent regression: the smoke-test action
overrode ENTRYPOINT to `/opt/hermes/docker/entrypoint.sh`, which the
s6-overlay migration reduced to a shim that just `exec`s the stage2
hook. stage2-hook ignores its CMD args, prints "Setup complete", and
exits 0 — so `hermes --help` and `hermes dashboard --help` never
ran. The #9153 regression guard was a green-always no-op.

Drop the override so the smoke test uses the image's real ENTRYPOINT
chain (`/init` + `main-wrapper.sh`), which is the actual production
startup path. `hermes --help` and `hermes dashboard --help` now run
through the full supervision tree and exercise the real argv routing.
2026-05-23 15:00:43 +10:00
Ben eb59d6f774 fix(docker): SHA256-verify s6-overlay tarballs
PR #30136 review flagged the s6-overlay install as a supply-chain
regression vs the gosu source it replaced — `tianon/gosu` was
digest-pinned via `FROM ...@sha256:...`, but the three new
ADD/curl downloads had no integrity check at all.

Pin all three tarballs (noarch, symlinks-noarch, per-arch) to
upstream-published SHA256s via ARGs. Verification happens via
`sha256sum -c` against a single checksum file (avoids a piped-shell
hadolint DL4006 warning under dash). To bump S6_OVERLAY_VERSION,
fetch the four `.sha256` files from the new release and update
the ARGs — documented inline.

If upstream artifacts are tampered with mid-build, the build now
fails loudly at the verification step instead of silently
producing a tainted image.
2026-05-23 14:59:42 +10:00
Ben 928e52e574 fix(docker): support multi-arch s6-overlay install (amd64 + arm64)
The Dockerfile only ADD'd `s6-overlay-x86_64.tar.xz`, so the
`build-arm64` job in docker-publish.yml — which runs on
`ubuntu-24.04-arm` and publishes by digest — produced an image whose
`/init` couldn't exec on actual arm64 hosts. Apple Silicon and ARM
server users were getting a broken container.

Map BuildKit's `TARGETARCH` (`amd64` / `arm64`) to s6's kernel-arch
naming (`x86_64` / `aarch64`) inside the RUN step and fetch the
correct tarball via `curl` (`ADD`'s URL is evaluated at parse time,
before TARGETARCH substitution, so dynamic arch selection requires
RUN). The noarch + symlinks tarballs are architecture-independent
and stay as ADDs.

The audit case is now explicit: unsupported architectures fail loudly
at build time rather than producing a silently-broken image.
2026-05-23 14:58:06 +10:00
Ben 2f8ceeab9a fix(service_manager): s6 detection works for unprivileged hermes user
PR #30136 review surfaced two issues, both rooted in the same audit gap:
docker integration tests were running as root, not the unprivileged
`hermes` user (UID 10000) that the runtime actually uses via
`s6-setuidgid hermes`. Anything that probed PID-1 state or wrote to
the s6 control surface worked as root in the tests but was inert in
production.

Fixes:

1. `_s6_running()` previously called `Path("/proc/1/exe").resolve()`,
   which is root-only readable. For UID 10000 the symlink yields
   PermissionError, `resolve()` silently returns the unresolved path,
   and `exe.name == "exe"` — so detection always returned False, the
   service-manager runtime-registration path was inert, and every
   `hermes profile create` / `hermes -p X gateway start` silently
   skipped the s6 hook. Replace with `/proc/1/comm` (world-readable)
   + `/run/s6/basedir` (s6-overlay-specific) — both required, fail
   closed.

2. `02-reconcile-profiles` now also chowns `/run/service/.s6-svscan/`
   {control,lock} to hermes so `s6-svscanctl -a/-an` works without
   root. Previously the directory chown stopped at `/run/service`
   and the FIFO inside stayed root-owned, so `register_profile_gateway`
   from hermes failed at the rescan-trigger step with EACCES — the
   wrapper in profiles.py caught the exception and printed a swallowed
   warning, so profile creation appeared to succeed while the slot
   was rolled back.

Audit changes to flush this class of bug next time:

- Add `docker_exec` / `docker_exec_sh` helpers to `tests/docker/conftest.py`
  that default to `-u hermes`. The module docstring explains why and
  flags `user="root"` as opt-in only for tests that explicitly need
  root (none currently do).
- Refactor every `docker exec` call in tests/docker/ through the new
  helpers (test_dashboard.py, test_zombie_reaping.py, test_profile_gateway.py,
  test_container_restart.py, test_s6_profile_gateway_integration.py).
- Add 5 unit tests covering `_s6_running` under various probe states
  (both signals present; comm wrong; basedir missing; PermissionError
  on /proc/1/comm; missing /proc — non-Linux). The PermissionError
  test is the explicit regression guard for the original bug.

Known follow-up: the per-service `supervise/control` FIFO inside each
`/run/service/gateway-<profile>/supervise/` is created root-owned by
s6-supervise (which runs as root because s6-svscan is PID 1). `s6-svc
-u/-d/-t` from the hermes user will get EACCES on those. The audit
under `-u hermes` will reveal this in lifecycle tests — surfacing the
issue cleanly so it can be fixed in a focused follow-up (likely via a
small SUID helper or a polling chown loop in cont-init.d). The
detection + svscanctl fixes here are independent and complete on
their own.
2026-05-23 14:56:39 +10:00
Ben a6f7171a5e feat(docker): remove gosu from bundled image; s6-setuidgid handles privilege drop
The s6-overlay migration replaced every runtime use of gosu with
s6-setuidgid (in stage2-hook.sh, main-wrapper.sh, per-service run
scripts, and cont-init.d hooks), but the gosu binary itself was still
being copied into the image from tianon/gosu, and several comments
across the repo still pointed to it.

Image changes:
- Drop the FROM tianon/gosu:1.19-trixie AS gosu_source stage
- Drop the COPY --from=gosu_source /gosu /usr/local/bin/ layer
- Net: one fewer base-image pull, ~12-15 MB layer eliminated

Documentation/comment refresh (no behavior change):
- Dockerfile: update root-user rationale comment + cont-init.d comment
- docker/main-wrapper.sh: drop "pre-s6 contract (gosu drop)" reference
- docker-compose.yml: update UID/GID remap comment
- .hadolint.yaml: update DL3002 ignore rationale
- website/docs/user-guide/docker.md: privilege-drop helper is s6-setuidgid now
- hermes_cli/config.py: docker_run_as_host_user docstring

tools/environments/docker.py runs *arbitrary user images* via the
terminal backend, not the bundled Hermes image. It still needs SETUID/
SETGID caps so user images that use gosu/su/s6-setuidgid all work.
Renamed the cap-list constant _GOSU_CAP_ARGS → _PRIVDROP_CAP_ARGS and
updated comments to list s6-setuidgid alongside the others as examples.
The matching test (test_security_args_include_setuid_setgid_for_gosu_drop
→ test_security_args_include_setuid_setgid_for_privdrop) was renamed
and its docstring updated; behavior is unchanged.

Verification:
- hadolint clean against .hadolint.yaml
- shellcheck clean against all docker/ shell scripts
- Image rebuilt successfully (sha 1a090924ccea)
- Docker harness: 19 passed in 41.87s (every Phase 0 test + Phase 4
  per-profile-gateway lifecycle + container-restart reconciliation)
- tests/tools/test_docker_environment.py: 23 passed (rename did not
  break test discovery; pre-existing unrelated mock warning)

The plan document (docs/plans/2026-05-07-s6-overlay-dynamic-subagent-gateways.md)
intentionally retains its historical references to gosu — it describes
the pre-s6 entrypoint as background for understanding the migration.
2026-05-22 11:47:42 +10:00
Ben 7d07dd60a8 docs(s6): document container supervision; doctor + skill + user-guide updates
Phase 5 of the s6-overlay supervision plan. Documentation + small
diagnostic cleanups; no behavior changes.

website/docs/user-guide/docker.md:
  - Replace the old 'entrypoint script does the bootstrap' section
    with the s6-overlay boot flow (cont-init.d/01-hermes-setup,
    cont-init.d/02-reconcile-profiles, static main-hermes + dashboard
    services, ENTRYPOINT-as-main-program pattern).
  - Add a 'Per-profile gateway supervision' subsection covering the
    new lifecycle commands, restart semantics, log persistence, and
    'Manager: s6 (container supervisor)' status reporting.
  - Add 'Breaking change vs. pre-s6 images' callout naming the
    /init ENTRYPOINT and pointing affected wrappers at the pin
    workaround.

website/docs/user-guide/profiles.md:
  - Add a note under 'Persistent services' pointing container users
    at the docker.md section explaining s6 supervision inside the
    image. Host-side systemd/launchd documentation is unchanged.

skills/software-development/hermes-s6-container-supervision/SKILL.md:
  - New maintainer skill covering the supervision-tree map, file
    layout, the Architecture B rationale (cont-init.d args + halt
    exit-code propagation), quick recipes, and the 8 pitfalls we hit
    while implementing the plan (PATH-without-/command, root-owned
    profile dirs, SOUL.md as marker, the '143' anti-pattern, etc.).

hermes_cli/doctor.py:
  - _check_gateway_service_linger skips on s6 (the linger concept
    doesn't apply inside the container).
  - New _check_s6_supervision section reports main-hermes/dashboard
    state and per-profile-gateway count (registered vs supervised
    up), only inside the s6 container. Host doctor output unchanged.
  - External Tools / Docker check no longer emits a 'docker not
    found' warning inside the container; prints an explanatory
    info line instead. Still respects an explicit TERMINAL_ENV=docker
    (in case the user mounted /var/run/docker.sock).

hermes_cli/gateway.py:
  - Document _container_systemd_operational more precisely: it's
    NOT for our Hermes Docker image (s6-overlay handles that via
    detect_service_manager() == 's6'). It still covers
    systemd-nspawn / k8s-with-systemd-init cases, so leaving it in
    place is correct; the docstring just makes that explicit.

Test harness (verification, no test changes in this commit):
  19 passed, 0 xfailed. 66 service-manager / container-boot /
  profiles-s6-hooks / gateway-s6-dispatch unit tests still green.
  61 doctor tests still green. Hadolint + shellcheck clean.

Refs: docs/plans/2026-05-07-s6-overlay-dynamic-subagent-gateways.md
2026-05-22 11:47:42 +10:00
Ben 57c6e29666 feat(docker): per-profile s6 supervision + container-restart reconciliation
Phase 4 of the s6-overlay supervision plan. Activates the Phase 3
S6ServiceManager by hooking it into the profile lifecycle and the
`hermes gateway start/stop/restart` dispatcher, and adds a cont-
init.d-time reconciliation pass that survives `docker restart`.

Task 4.0 — container-boot reconciliation:
  /run/service/ is tmpfs, so every `docker restart` wipes every
  per-profile gateway slot. /etc/cont-init.d/02-reconcile-profiles
  invokes hermes_cli.container_boot.reconcile_profile_gateways() on
  every boot, which walks $HERMES_HOME/profiles/<name>/, reads each
  gateway_state.json, recreates the s6 service slot, and auto-starts
  only those whose last state was 'running'. Other states
  (stopped, starting, startup_failed, missing) register the slot
  in the down state — avoiding crash-loops across restarts for a
  gateway that was broken last boot. Per-profile outcome is recorded
  to $HERMES_HOME/logs/container-boot.log.

  Implementation: hermes_cli/container_boot.py + 12 unit tests.
  Profile-marker is SOUL.md, not config.yaml, because `hermes profile
  create` only seeds SOUL.md by default (config.yaml comes from
  `hermes setup`).

Task 4.1 / 4.2 — profile create/delete hooks:
  hermes_cli/profiles.py::create_profile now calls
  _maybe_register_gateway_service(<canon>) at the end, which routes
  through ServiceManager.register_profile_gateway when running on s6
  and no-ops on host backends. delete_profile mirrors with
  _maybe_unregister_gateway_service. _allocate_gateway_port produces
  a deterministic SHA-256-derived port in [9200, 9800).

Task 4.3 — gateway dispatch + remove rejection arms:
  _dispatch_via_service_manager_if_s6(action) intercepts
  start/stop/restart at the top of each subcommand and routes them
  through S6ServiceManager.{start,stop,restart}. The pre-Phase-4
  `elif is_container():` rejection arms are kept as fallback for
  pre-s6 containers / unsupported runtimes, but only ever fire when
  detect_service_manager() != 's6'. install/uninstall under s6
  print informational guidance pointing users at profile create/delete.

  Removed the two xfail(strict=True) markers from
  tests/docker/test_profile_gateway.py — both tests now pass strictly.

Task 4.4 — status reporting:
  get_gateway_runtime_snapshot() reports
  Manager: 's6 (container supervisor)' inside an s6 container instead
  of 'docker (foreground)'.

Plan-vs-reality drift fixed in this commit:
  - Plan's S6ServiceManager._render_run_script used
    `gateway start --foreground --port {port}` — invented args; the
    real CLI is `gateway run`. Switched accordingly. port arg
    retained for API parity but now documented as 'currently ignored'.
  - Plan's reconciler keyed on config.yaml; switched to SOUL.md
    (config.yaml is created by hermes setup, not by hermes profile
    create, so the original gate caught nothing).
  - The plan's _dispatch helper used _profile_arg() which returns
    '--profile <name>' (i.e. with the flag prefix). Switched to
    _profile_suffix() which returns the bare name.
  - Architecture B's docker exec doesn't get /command on PATH or
    the venv on PATH; Dockerfile's runtime PATH now includes
    /opt/hermes/.venv/bin so 'docker exec <c> hermes ...' works
    without sourcing the venv.
  - stage2-hook now chowns $HERMES_HOME/profiles to hermes on every
    boot, not just on the UID-remap path. Without this, files created
    by docker-exec-as-root accumulate and the next reconciler run
    fails with PermissionError reading SOUL.md.

Test harness:
  19 passed, 0 xfailed (the two pre-Phase-4 xfail targets flip to
  passing). 78 unit tests across service_manager + container_boot +
  profiles_s6_hooks + gateway_s6_dispatch. Hadolint + shellcheck
  pass cleanly.

Refs: docs/plans/2026-05-07-s6-overlay-dynamic-subagent-gateways.md
2026-05-22 11:47:42 +10:00
Ben ad5fdab092 feat(service_manager): add S6ServiceManager for runtime gateway supervision
Phase 3 of the s6-overlay supervision plan. Implements the runtime-
registration surface from D4 — only the s6 backend supports
register_profile_gateway / unregister_profile_gateway /
list_profile_gateways; host backends continue to raise
NotImplementedError. No caller yet (Phase 4 wires in the profile
create/delete hooks).

Key implementation notes:

  - Service directory shape: /run/service/gateway-<profile>/{type,run,log/run}.
    Atomic register: write to gateway-<profile>.tmp, fsync via
    os.rename. Cleanup on rescan failure.

  - Run script uses #!/command/with-contenv sh so HERMES_HOME and any
    extra_env arrive at exec time. The hermes -p <profile> gateway
    start --foreground --port <port> command is wrapped in
    s6-setuidgid hermes for the per-service privilege drop (OQ2-A).

  - Log script (OQ8-C): persists via s6-log to
    ${HERMES_HOME}/logs/gateways/<profile>/. CRITICAL — HERMES_HOME is
    a runtime env-var expansion in the rendered script, NOT a Python
    f-string substitution. Negative-asserted in
    test_s6_register_creates_service_dir_and_triggers_scan so
    regressions are caught.

  - PATH gotcha: /command/ is only on PATH for processes spawned by
    the supervision tree (services, cont-init.d). `docker exec` and
    profile-create hooks don't get it. S6ServiceManager calls all
    s6-* binaries via absolute path through the new _S6_BIN_DIR
    constant so callers don't have to fix up env vars.

  - validate_profile_name rejects path-traversal, leading-dash (s6
    would parse as a flag), uppercase, whitespace, and names >251
    chars (s6-svscan default name_max).

Test coverage:
  - 13 new unit tests in tests/hermes_cli/test_service_manager.py
    (kind detection, run-script content, env quoting, register
    rollback on rescan failure, unregister idempotence, list filter,
    lifecycle dispatch, svstat parsing). Total: 36 passing.
  - 2 new in-container integration tests in
    tests/docker/test_s6_profile_gateway_integration.py validating
    end-to-end registration against a real s6 supervision tree.

Docker harness: 14 passed, 2 xfailed (Phase 4 target unchanged).

Refs: docs/plans/2026-05-07-s6-overlay-dynamic-subagent-gateways.md
2026-05-22 11:47:41 +10:00
Ben 4826ea7b41 feat(docker)!: replace tini with s6-overlay as PID 1
BREAKING CHANGE: the container ENTRYPOINT is now /init (s6-overlay)
instead of /usr/bin/tini. Main hermes runs as the container CMD with
TTY inherited (preserving --tui), dashboard runs as a supervised s6-rc
service (HERMES_DASHBOARD=1 starts it; crashes auto-restart), and the
ground is laid for per-profile gateway supervision (Phase 3+4).

All five pre-s6 docker run invocation patterns continue to work
identically — verified by the Phase 0 docker harness:

  docker run <image>                  → `hermes` with no args
  docker run <image> chat -q "..."    → `hermes chat -q ...` passthrough
  docker run <image> sleep infinity   → `sleep infinity` direct
  docker run <image> bash             → interactive bash
  docker run -it <image> --tui        → interactive Ink TUI

Phase 2 harness result: 12 passed, 2 xfailed (Phase 4 target). Hadolint
+ shellcheck pass cleanly.

Architecture pivot from plan v3 (documented in main-hermes/run header):
the plan called for main hermes to be an s6-supervised service, but
two real s6-overlay v3 mechanics blocked that — cont-init.d scripts
receive no arguments (CMD args are not visible to stage2-hook), and
`/run/s6/basedir/bin/halt` after writing the exit code did not
propagate the desired exit code (container exits 143). We use the
s6-overlay-native CMD pattern instead: main-wrapper.sh is the
container's main program (ENTRYPOINT prepends it so leading-dash
args like --version aren't intercepted by /init), exec's the final
program with stdin/stdout/stderr inherited, and the program's exit
code becomes the container exit code. main-hermes is now a no-op
`sleep infinity` slot kept for future supervised-gateway-container
modes. This trades "supervised restart of main hermes" for arg-
parity with the pre-s6 contract — main hermes was already unsupervised
under tini, so we lose nothing functional. Dashboard supervision is
the only new guarantee added by this phase.

Files added:
  docker/main-wrapper.sh           # arg routing + s6-setuidgid drop
  docker/stage2-hook.sh            # gosu-equivalent + chown + seed
  docker/s6-rc.d/main-hermes/{type,run,dependencies.d/base}
  docker/s6-rc.d/dashboard/{type,run,dependencies.d/base}
  docker/s6-rc.d/user/contents.d/{main-hermes,dashboard}

Files changed:
  Dockerfile: tini → s6-overlay install + ENTRYPOINT flip + service wiring
  docker/entrypoint.sh: thin shim to stage2-hook.sh for back-compat
  tests/docker/test_dashboard.py: add test_dashboard_restarts_after_crash

Refs: docs/plans/2026-05-07-s6-overlay-dynamic-subagent-gateways.md
2026-05-22 11:47:41 +10:00
Ben cf6133495c feat(service_manager): add ServiceManager protocol + host wrappers
Phase 1 of the s6-overlay supervision plan. Pure-refactor addition:
introduces the abstract interface (with runtime_checkable Protocol),
detect_service_manager(), validate_profile_name(), and thin
SystemdServiceManager / LaunchdServiceManager / WindowsServiceManager
wrappers around the existing systemd_* / launchd_* / gateway_windows.*
module-level functions. No host call site was modified — host code
continues to use the existing functions directly; the protocol is for
new backend-agnostic code (Phase 4 profile create/delete hooks and the
Phase 4 s6 dispatch path in 'hermes gateway start/stop/restart').

WindowsServiceManager.install() forwards the v3 kwargs (start_now,
start_on_login, elevated_handoff) added in PRs #28169-adjacent so
non-Windows callers — there aren't any today — can opt in.

The s6 backend lands in Phase 3; until then get_service_manager()
raises a clear error if invoked on a host that detects as 's6'.
2026-05-22 11:47:41 +10:00
Ben c6febe3765 ci(docker): add hadolint + shellcheck for container build inputs
Phase 0.5 of the s6-overlay supervision plan. Catches Dockerfile and
shell-script regressions that the behavioral docker-publish smoke test
can't surface — unquoted variable expansions, silently-failing RUN
commands, missing apt-get clean, etc.

Both lint clean against the current (tini) Dockerfile + entrypoint.sh
at the configured thresholds (hadolint: warning, shellcheck: error).
Each ignore in .hadolint.yaml carries a one-line justification; the
shellcheck severity floor is documented in the workflow file.

Refs: docs/plans/2026-05-07-s6-overlay-dynamic-subagent-gateways.md
2026-05-22 11:47:41 +10:00
Ben a957ef0834 test(docker): stabilize Phase 0 baseline harness
Two pre-existing baseline issues found while running the Phase 0 harness
against the tini image that need fixing before later phases can use the
harness as a behavior-parity oracle:

1. The autouse `_enforce_test_timeout` fixture in tests/conftest.py
   hard-coded a 30s SIGALRM, which preempted any `pytest.mark.timeout`
   marker (already honored by pytest-timeout). Honor the marker if
   present; fall back to 30s otherwise. Docker harness tests carry a
   180s marker applied at collection time in tests/docker/conftest.py.

2. test_dashboard_port_override polled via `ss -tlnp` / `netstat -tln`
   — neither is installed in the Hermes image, so the probe trivially
   failed even when the dashboard was bound. The dashboard also takes
   8-15s to bind on cold image; the 5s sleep was insufficient. Replace
   with a poll loop reading /proc/net/tcp directly (port 9120 = 0x23A0,
   state 0A = LISTEN). Bump probe deadline to 60s and switch
   test_dashboard_opt_in_starts to a similar poll for pgrep so we don't
   regress to the same race.

Result: 11 passed, 2 xfailed (Phase 4 target) on tini image. Harness
now ready to serve as Phase 2's behavior-parity oracle.
2026-05-22 11:47:41 +10:00
Ben 60d8e07ded test(docker): apply 180s timeout to docker harness tests
The agent-test suite default is 30s; docker test_no_args (the dashboard
spin-up, the container restart) routinely take 60-90s. Without this
they intermittently fail in CI with TimeoutError.
2026-05-22 11:46:52 +10:00
Ben 244d62ded3 test(docker): lock baseline behavior for Phase 0 harness
Tasks 0.2-0.6 of the s6-overlay supervision plan. Locks the
user-visible behavior we must preserve through the Phase 2 init-
system swap:

- test_main_invocation.py (Task 0.2): docker run <image> with no
  args, chat subcommand passthrough, bare executable passthrough,
  bash pattern, exit-code propagation
- test_tui_passthrough.py (Task 0.3): TTY allocation via docker -t
  using the host's script(1) for a PTY
- test_dashboard.py (Task 0.4): HERMES_DASHBOARD=1 opt-in,
  HERMES_DASHBOARD_PORT override
- test_profile_gateway.py (Task 0.5): per-profile gateway
  start/stop and profile-delete-stops-gateway. Both marked
  xfail(strict=True) because the current tini image refuses
  gateway lifecycle commands inside the container; Phase 4
  Task 4.3 flips them to passing.
- test_zombie_reaping.py (Task 0.6): PID 1 reaps orphaned
  zombies. tini does this today; s6-overlay's /init must
  continue to.

Refs: docs/plans/2026-05-07-s6-overlay-dynamic-subagent-gateways.md
2026-05-22 11:46:52 +10:00
Ben 705256aaa6 test(docker): add conftest fixtures for docker harness
Task 0.1 of the s6-overlay supervision plan. Establishes the test
infrastructure for tests/docker/: skip-on-missing-Docker collection
hook, session-scoped image-build fixture (overridable via the
HERMES_TEST_IMAGE env var for faster local iteration), and a
container_name fixture that ensures cleanup on test exit.

Refs: docs/plans/2026-05-07-s6-overlay-dynamic-subagent-gateways.md
2026-05-22 11:46:52 +10:00
Ben ef536880a3 docs(plans): add s6-overlay supervision plan (v3)
Replace tini with s6-overlay as PID 1 in the Hermes Docker image so that
main hermes, the dashboard, and dynamically-created per-profile gateways
all run as supervised services. Includes container-boot reconciliation
(Task 4.0) so per-profile gateways survive docker restart.

Plan history:
- v1: 2026-05-07 — original design (subagent gateways scope)
- v2: 2026-05-18 — re-validated, scope narrowed to per-profile gateways,
  WindowsServiceManager added to protocol
- v3: 2026-05-21 — re-validated in docker_s6 worktree, install-method
  stamp preservation noted in Task 2.3, Task 4.0 added for container
  restart survival

12.5 engineering days estimated across 7 phases.
2026-05-22 11:46:52 +10:00
699 changed files with 2820 additions and 126306 deletions
+1 -1
View File
@@ -51,7 +51,7 @@ jobs:
steps:
- name: Generate GitHub App token
id: app-token
uses: actions/create-github-app-token@bcd2ba49218906704ab6c1aa796996da409d3eb1 # v3.2.0
uses: actions/create-github-app-token@7bfa3a4717ef143a604ee0a99d859b8886a96d00 # v1.9.3
with:
app-id: ${{ secrets.APP_ID }}
private-key: ${{ secrets.APP_PRIVATE_KEY }}
+1 -6
View File
@@ -100,12 +100,7 @@ jobs:
# --- Install-hook files (setup.py/sitecustomize/usercustomize/__init__.pth) ---
# These execute during pip install or interpreter startup.
# Anchored at repo root: only the top-level setup.py/setup.cfg run during
# `pip install`, and only top-level sitecustomize.py/usercustomize.py are
# auto-loaded by the interpreter via site.py. Any nested file with the
# same name (e.g. hermes_cli/setup.py — the CLI setup wizard) is unrelated
# and produced false positives that trained reviewers to ignore the scanner.
SETUP_HITS=$(git diff --name-only "$BASE"..."$HEAD" | grep -E '^(setup\.py|setup\.cfg|sitecustomize\.py|usercustomize\.py|__init__\.pth)$' || true)
SETUP_HITS=$(git diff --name-only "$BASE"..."$HEAD" | grep -E '(^|/)(setup\.py|setup\.cfg|sitecustomize\.py|usercustomize\.py|__init__\.pth)$' || true)
if [ -n "$SETUP_HITS" ]; then
FINDINGS="${FINDINGS}
### 🚨 CRITICAL: Install-hook file added or modified
+5 -45
View File
@@ -41,7 +41,6 @@ from agent.message_sanitization import (
)
from agent.tool_dispatch_helpers import _trajectory_normalize_msg, make_tool_result_message
from agent.trajectory import convert_scratchpad_to_think
from agent.credential_pool import STATUS_EXHAUSTED
from agent.error_classifier import classify_api_error, FailoverReason
from utils import base_url_host_matches, base_url_hostname, env_var_enabled, atomic_json_write
@@ -583,37 +582,12 @@ def recover_with_credential_pool(
return False, has_retried_429
if effective_reason == FailoverReason.rate_limit:
# If current credential is already marked exhausted, skip retry and
# rotate immediately. This prevents the "cancel-between-429s" trap
# where has_retried_429 (a local var) gets reset on each new prompt,
# causing the pool to retry the same exhausted credential forever.
current_entry = pool.current()
current_last_status = getattr(current_entry, "last_status", None) if current_entry else None
if current_last_status == STATUS_EXHAUSTED:
_ra().logger.info(
"Credential already exhausted (last_status=%s) — rotating immediately instead of retrying",
current_last_status,
)
rotate_status = status_code if status_code is not None else 429
next_entry = pool.mark_exhausted_and_rotate(status_code=rotate_status, error_context=error_context)
if next_entry is not None:
_ra().logger.info(
"Credential %s (rate limit, pre-exhausted) — rotated to pool entry %s",
rotate_status,
getattr(next_entry, "id", "?"),
)
agent._swap_credential(next_entry)
return True, False
return False, True
usage_limit_reached = False
if error_context:
context_reason = str(error_context.get("reason") or "").lower()
context_message = str(error_context.get("message") or "").lower()
usage_limit_reached = (
"usage_limit_reached" in context_reason
or "gousagelimit" in context_reason
or "usage limit reached" in context_message
or "usage limit has been reached" in context_message
)
if not has_retried_429 and not usage_limit_reached:
@@ -2092,33 +2066,19 @@ def extract_api_error_context(error: Exception) -> Dict[str, Any]:
if "reset_at" not in context:
message = context.get("message") or ""
if isinstance(message, str):
delay_match = re.search(r"quotaResetDelay[:\s\"]+(\d+(?:\.\d+)?)(ms|s)", message, re.IGNORECASE)
delay_match = re.search(r"quotaResetDelay[:\s\"]+(\\d+(?:\\.\\d+)?)(ms|s)", message, re.IGNORECASE)
if delay_match:
value = float(delay_match.group(1))
seconds = value / 1000.0 if delay_match.group(2).lower() == "ms" else value
context["reset_at"] = time.time() + seconds
else:
resets_in_match = re.search(
r"resets?\s+in\s+"
r"(?:(\d+(?:\.\d+)?)\s*(?:h|hr|hrs|hour|hours)\b\s*)?"
r"(?:(\d+(?:\.\d+)?)\s*(?:m|min|mins|minute|minutes)\b\s*)?"
r"(?:(\d+(?:\.\d+)?)\s*(?:s|sec|secs|second|seconds)\b)?",
sec_match = re.search(
r"retry\s+(?:after\s+)?(\d+(?:\.\d+)?)\s*(?:sec|secs|seconds|s\b)",
message,
re.IGNORECASE,
)
if resets_in_match and any(resets_in_match.groups()):
hours = float(resets_in_match.group(1) or 0)
minutes = float(resets_in_match.group(2) or 0)
seconds = float(resets_in_match.group(3) or 0)
context["reset_at"] = time.time() + (hours * 3600) + (minutes * 60) + seconds
else:
sec_match = re.search(
r"retry\s+(?:after\s+)?(\d+(?:\.\d+)?)\s*(?:sec|secs|seconds|s\b)",
message,
re.IGNORECASE,
)
if sec_match:
context["reset_at"] = time.time() + float(sec_match.group(1))
if sec_match:
context["reset_at"] = time.time() + float(sec_match.group(1))
return context
+5 -30
View File
@@ -15,8 +15,6 @@ import json
import logging
import os
import platform
import secrets
import stat
import subprocess
from pathlib import Path
from urllib.parse import urlparse
@@ -1042,34 +1040,11 @@ def _write_claude_code_credentials(
existing["claudeAiOauth"] = oauth_data
cred_path.parent.mkdir(parents=True, exist_ok=True)
# Per-process random suffix avoids collisions between concurrent
# writers and stale leftovers from a prior crashed write.
_tmp_cred = cred_path.with_suffix(f".tmp.{os.getpid()}.{secrets.token_hex(4)}")
try:
# Create the temp file atomically at 0o600. The previous
# write_text + post-replace chmod opened a TOCTOU window where
# both the temp file and the destination briefly inherited the
# process umask (commonly 0o644 = world-readable), exposing
# Claude Code OAuth tokens to other local users between create
# and chmod. Mirrors agent/google_oauth.py (#19673) and
# tools/mcp_oauth.py (#21148). Parent dir (~/.claude/) is
# owned by Claude Code itself, so we leave its mode alone.
fd = os.open(
str(_tmp_cred),
os.O_WRONLY | os.O_CREAT | os.O_EXCL,
stat.S_IRUSR | stat.S_IWUSR,
)
with os.fdopen(fd, "w", encoding="utf-8") as fh:
json.dump(existing, fh, indent=2)
fh.flush()
os.fsync(fh.fileno())
os.replace(_tmp_cred, cred_path)
except OSError:
try:
_tmp_cred.unlink(missing_ok=True)
except OSError:
pass
raise
_tmp_cred = cred_path.with_suffix(".tmp")
_tmp_cred.write_text(json.dumps(existing, indent=2), encoding="utf-8")
_tmp_cred.replace(cred_path)
# Restrict permissions (credentials file)
cred_path.chmod(0o600)
except (OSError, IOError) as e:
logger.debug("Failed to write refreshed credentials: %s", e)
+45 -155
View File
@@ -1406,9 +1406,6 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
for provider_id, pconfig in PROVIDER_REGISTRY.items():
if pconfig.auth_type != "api_key":
continue
if _is_provider_unhealthy(provider_id):
logger.debug("Auxiliary api-key chain: %s is unhealthy, skipping", provider_id)
continue
if provider_id == "anthropic":
# Only try anthropic when the user has explicitly configured it.
# Without this gate, Claude Code credentials get silently used
@@ -2263,12 +2260,11 @@ def _is_payment_error(exc: Exception) -> bool:
"credits", "insufficient funds",
"can only afford", "billing",
"payment required",
# Daily / monthly / weekly quota exhaustion keywords
# Daily / monthly quota exhaustion keywords
"quota exceeded", "quota_exceeded",
"too many tokens per day", "daily limit",
"tokens per day", "daily quota",
"resource exhausted", # Vertex AI / gRPC quota errors
"weekly usage limit", "weekly limit", # OpenCode Go weekly subscription cap
)):
return True
return False
@@ -2482,11 +2478,7 @@ def _pool_error_context(exc: Exception) -> Dict[str, Any]:
return payload
def _recoverable_pool_provider(
resolved_provider: str,
client: Any,
main_runtime: Optional[Dict[str, Any]] = None,
) -> Optional[str]:
def _recoverable_pool_provider(resolved_provider: str, client: Any) -> Optional[str]:
"""Infer which provider pool can recover the current auxiliary client."""
normalized = _normalize_aux_provider(resolved_provider)
if normalized not in {"", "auto", "custom"}:
@@ -2504,33 +2496,11 @@ def _recoverable_pool_provider(
return "copilot"
if base_url_host_matches(base, "api.kimi.com"):
return "kimi-coding"
# For api_key providers not in the hardcoded list (e.g. opencode-go), match
# the client base URL against all registered api_key providers so that
# credential-pool rotation works for any provider the user configured.
if main_runtime:
rt = _normalize_main_runtime(main_runtime)
rt_provider = rt.get("provider", "")
if rt_provider and rt_provider not in {"", "auto", "custom"}:
try:
from hermes_cli.auth import PROVIDER_REGISTRY
pconfig = PROVIDER_REGISTRY.get(rt_provider)
if pconfig and getattr(pconfig, "auth_type", None) == "api_key":
rt_base = str(getattr(pconfig, "inference_base_url", "") or "").rstrip("/")
if rt_base and base_url_host_matches(base, base_url_hostname(rt_base)):
return rt_provider
except Exception:
pass
return None
def _recover_provider_pool(provider: str, exc: Exception, *, failed_api_key: str = "") -> bool:
"""Try same-provider credential-pool recovery for auxiliary calls.
``failed_api_key`` is the API key that was actually used for the failing
request. Passing it lets mark_exhausted_and_rotate identify the correct
pool entry even when another process has already rotated the pool (which
would leave current() as None, causing the wrong entry to be marked).
"""
def _recover_provider_pool(provider: str, exc: Exception) -> bool:
"""Try same-provider credential-pool recovery for auxiliary calls."""
normalized = _normalize_aux_provider(provider)
try:
pool = load_pool(normalized)
@@ -2542,7 +2512,6 @@ def _recover_provider_pool(provider: str, exc: Exception, *, failed_api_key: str
status_code = getattr(exc, "status_code", None)
error_context = _pool_error_context(exc)
hint = failed_api_key or None
if _is_auth_error(exc):
refreshed = pool.try_refresh_current()
@@ -2552,7 +2521,6 @@ def _recover_provider_pool(provider: str, exc: Exception, *, failed_api_key: str
next_entry = pool.mark_exhausted_and_rotate(
status_code=status_code if status_code is not None else 401,
error_context=error_context,
api_key_hint=hint,
)
if next_entry is not None:
_evict_cached_clients(normalized)
@@ -2564,7 +2532,6 @@ def _recover_provider_pool(provider: str, exc: Exception, *, failed_api_key: str
next_entry = pool.mark_exhausted_and_rotate(
status_code=status_code if status_code is not None else fallback_status,
error_context=error_context,
api_key_hint=hint,
)
if next_entry is not None:
_evict_cached_clients(normalized)
@@ -2969,11 +2936,6 @@ def _resolve_auto(main_runtime: Optional[Dict[str, Any]] = None) -> Tuple[Option
resolved_provider = "custom"
explicit_base_url = runtime_base_url
explicit_api_key = runtime_api_key or None
elif runtime_api_key:
# Pin auxiliary to the same api_key as the active main chat session
# so that a working key is reused instead of re-selecting from the pool
# (which might pick a different, potentially exhausted key).
explicit_api_key = runtime_api_key
# Skip Step-1 if the main provider was recently 402'd. The unhealthy
# cache TTL bounds how long we bypass it, so a topped-up account
# recovers automatically. If we tried Step-1 anyway, every aux call
@@ -3154,34 +3116,6 @@ def resolve_provider_client(
# Normalise aliases
provider = _normalize_aux_provider(provider)
# Universal model-resolution fallback chain. Callers (notably title
# generation, vision, session search, and other auxiliary tasks) can
# reach this function without an explicit model — the user picked their
# main provider, didn't bother configuring a per-task ``auxiliary.<task>.model``,
# and just expects "use my main model for side tasks too." Resolve in
# this order, stopping at the first non-empty answer:
#
# 1. ``model`` argument (caller knew what they wanted)
# 2. Provider's catalog default — cheap/fast model the provider
# registered via ``ProviderProfile.default_aux_model`` or the
# legacy ``_API_KEY_PROVIDER_AUX_MODELS_FALLBACK`` dict. Empty
# string for OAuth-gated providers (openai-codex, xai-oauth)
# whose accepted-model lists drift on the backend, so we don't
# pin a default that can silently rot.
# 3. User's main model from ``model.model`` in config.yaml. This is
# the load-bearing step for OAuth providers: an xai-oauth user
# with grok-4.3 configured gets grok-4.3 for title generation
# instead of silently dropping to whatever Step-2 fallback (#31845).
#
# Each provider branch below sees a non-empty ``model`` whenever the
# user has *anything* configured — no provider-specific empty-model
# guards needed. When the user has NOTHING configured (fresh install,
# main_model also empty), the branches still hit their own
# missing-credentials returns and ``_resolve_auto`` falls through to
# the Step-2 chain as before.
if not model:
model = _get_aux_model_for_provider(provider) or _read_main_model() or model
def _needs_codex_wrap(client_obj, base_url_str: str, model_str: str) -> bool:
"""Decide if a plain OpenAI client should be wrapped for Responses API.
@@ -3326,7 +3260,7 @@ def resolve_provider_client(
if client is None:
logger.warning(
"resolve_provider_client: xai-oauth requested but no xAI "
"OAuth token found (run: hermes model -> xAI Grok OAuth — SuperGrok / Premium+)"
"OAuth token found (run: hermes model -> xAI Grok OAuth — SuperGrok Subscription)"
)
return None, None
final_model = _normalize_resolved_model(model or default, provider)
@@ -4366,25 +4300,13 @@ def _get_cached_client(
else:
effective = _compat_model(cached_client, model, cached_default)
return cached_client, effective
# Build outside the lock.
# For pool-backed api_key providers, derive the active API key from the
# pool entry rather than from env vars. resolve_api_key_provider_credentials
# always prefers env vars (first-entry bias), which bypasses pool rotation:
# after key #1 is marked exhausted the retry would still get key #1 from
# the env var and fail again, causing the retry2_err handler to mark key #2.
effective_api_key = api_key
if not effective_api_key:
_pe = _peek_pool_entry(_normalize_aux_provider(provider))
if _pe is not None:
_pk = _pool_runtime_api_key(_pe)
if _pk:
effective_api_key = _pk
# Build outside the lock
client, default_model = resolve_provider_client(
provider,
model,
async_mode,
explicit_base_url=base_url,
explicit_api_key=effective_api_key,
explicit_api_key=api_key,
api_mode=api_mode,
main_runtime=runtime,
is_vision=is_vision,
@@ -4998,17 +4920,10 @@ def call_llm(
)
# ── Same-provider credential-pool recovery ─────────────────────
pool_provider = _recoverable_pool_provider(resolved_provider, client, main_runtime=main_runtime)
# Capture the exact API key used so mark_exhausted_and_rotate can find
# the correct pool entry even when another process rotated the pool
# between this call and recovery (which leaves current()=None and makes
# _select_unlocked() return the NEXT key by mistake).
_client_api_key = str(getattr(client, "api_key", "") or "")
pool_provider = _recoverable_pool_provider(resolved_provider, client)
if pool_provider and (_is_auth_error(first_err) or _is_payment_error(first_err) or _is_rate_limit_error(first_err)):
recovery_err = first_err
# Skip the extra retry for clear payment/quota errors — the endpoint
# won't accept another request with the same exhausted key.
if _is_rate_limit_error(first_err) and not _is_payment_error(first_err):
if _is_rate_limit_error(first_err):
try:
return _validate_llm_response(
client.chat.completions.create(**kwargs), task)
@@ -5016,40 +4931,27 @@ def call_llm(
if not (_is_auth_error(retry_err) or _is_payment_error(retry_err) or _is_rate_limit_error(retry_err)):
raise
recovery_err = retry_err
if _recover_provider_pool(pool_provider, recovery_err, failed_api_key=_client_api_key):
if _recover_provider_pool(pool_provider, recovery_err):
logger.info(
"Auxiliary %s: recovered %s via credential-pool rotation after %s",
task or "call", pool_provider, type(recovery_err).__name__,
)
try:
return _retry_same_provider_sync(
task=task,
resolved_provider=resolved_provider,
resolved_model=resolved_model,
resolved_base_url=resolved_base_url,
resolved_api_key=resolved_api_key,
resolved_api_mode=resolved_api_mode,
main_runtime=main_runtime,
final_model=final_model,
messages=messages,
temperature=temperature,
max_tokens=max_tokens,
tools=tools,
effective_timeout=effective_timeout,
effective_extra_body=effective_extra_body,
)
except Exception as retry2_err:
# The rotated key also hit a quota/auth wall. Mark it
# immediately so concurrent processes don't make a
# redundant API call to discover it's exhausted too.
# Then fall through to the payment fallback below so
# alternative providers can still serve the request.
if (_is_payment_error(retry2_err) or _is_auth_error(retry2_err)
or _is_rate_limit_error(retry2_err)):
_recover_provider_pool(pool_provider, retry2_err)
first_err = retry2_err
else:
raise
return _retry_same_provider_sync(
task=task,
resolved_provider=resolved_provider,
resolved_model=resolved_model,
resolved_base_url=resolved_base_url,
resolved_api_key=resolved_api_key,
resolved_api_mode=resolved_api_mode,
main_runtime=main_runtime,
final_model=final_model,
messages=messages,
temperature=temperature,
max_tokens=max_tokens,
tools=tools,
effective_timeout=effective_timeout,
effective_extra_body=effective_extra_body,
)
# ── Payment / credit exhaustion fallback ──────────────────────
# When the resolved provider returns 402 or a credit-related error,
@@ -5091,7 +4993,7 @@ def call_llm(
# 402). Mark THAT label unhealthy so subsequent aux calls
# skip it instead of paying another doomed RTT.
_mark_provider_unhealthy(
_recoverable_pool_provider(resolved_provider, client, main_runtime=main_runtime) or resolved_provider
_recoverable_pool_provider(resolved_provider, client) or resolved_provider
)
elif _is_rate_limit_error(first_err):
reason = "rate limit"
@@ -5211,7 +5113,6 @@ async def async_call_llm(
model: str = None,
base_url: str = None,
api_key: str = None,
main_runtime: Optional[Dict[str, Any]] = None,
messages: list,
temperature: float = None,
max_tokens: int = None,
@@ -5398,13 +5299,10 @@ async def async_call_llm(
)
# ── Same-provider credential-pool recovery (mirrors sync) ─────
pool_provider = _recoverable_pool_provider(resolved_provider, client, main_runtime=main_runtime)
_client_api_key = str(getattr(client, "api_key", "") or "")
pool_provider = _recoverable_pool_provider(resolved_provider, client)
if pool_provider and (_is_auth_error(first_err) or _is_payment_error(first_err) or _is_rate_limit_error(first_err)):
recovery_err = first_err
# Skip the extra retry for clear payment/quota errors — the endpoint
# won't accept another request with the same exhausted key.
if _is_rate_limit_error(first_err) and not _is_payment_error(first_err):
if _is_rate_limit_error(first_err):
try:
return _validate_llm_response(
await client.chat.completions.create(**kwargs), task)
@@ -5412,34 +5310,26 @@ async def async_call_llm(
if not (_is_auth_error(retry_err) or _is_payment_error(retry_err) or _is_rate_limit_error(retry_err)):
raise
recovery_err = retry_err
if _recover_provider_pool(pool_provider, recovery_err, failed_api_key=_client_api_key):
if _recover_provider_pool(pool_provider, recovery_err):
logger.info(
"Auxiliary %s (async): recovered %s via credential-pool rotation after %s",
task or "call", pool_provider, type(recovery_err).__name__,
)
try:
return await _retry_same_provider_async(
task=task,
resolved_provider=resolved_provider,
resolved_model=resolved_model,
resolved_base_url=resolved_base_url,
resolved_api_key=resolved_api_key,
resolved_api_mode=resolved_api_mode,
final_model=final_model,
messages=messages,
temperature=temperature,
max_tokens=max_tokens,
tools=tools,
effective_timeout=effective_timeout,
effective_extra_body=effective_extra_body,
)
except Exception as retry2_err:
if (_is_payment_error(retry2_err) or _is_auth_error(retry2_err)
or _is_rate_limit_error(retry2_err)):
_recover_provider_pool(pool_provider, retry2_err)
first_err = retry2_err
else:
raise
return await _retry_same_provider_async(
task=task,
resolved_provider=resolved_provider,
resolved_model=resolved_model,
resolved_base_url=resolved_base_url,
resolved_api_key=resolved_api_key,
resolved_api_mode=resolved_api_mode,
final_model=final_model,
messages=messages,
temperature=temperature,
max_tokens=max_tokens,
tools=tools,
effective_timeout=effective_timeout,
effective_extra_body=effective_extra_body,
)
# ── Payment / connection / rate-limit fallback (mirrors sync call_llm) ──
should_fallback = (
+48 -189
View File
@@ -34,7 +34,6 @@ from typing import Any, Dict, List, Optional, Tuple
from urllib.parse import urlparse, parse_qs, urlunparse
from hermes_cli.timeouts import get_provider_request_timeout, get_provider_stale_timeout
from hermes_constants import PARTIAL_STREAM_STUB_ID, FINISH_REASON_LENGTH
from agent.error_classifier import classify_api_error, FailoverReason
from agent.model_metadata import is_local_endpoint
from agent.message_sanitization import (
@@ -76,59 +75,6 @@ def _ra():
return run_agent
def estimate_request_context_tokens(api_payload: Any) -> int:
"""Estimate context/load tokens from an API payload, dict or messages list.
The stale-call detectors historically assumed a Chat Completions request:
they pulled ``api_kwargs["messages"]`` and ran a cheap char/4 estimate.
Codex / Responses API requests carry the conversational payload in
``input`` (with additional load in ``instructions`` and ``tools``), so the
legacy estimator reported ~0 tokens for every Codex turn and the
context-tier scaling never fired.
This helper handles both shapes:
- bare list -> treat as Chat Completions ``messages``
- dict with ``messages`` -> Chat Completions (+ ``tools`` if present)
- dict with ``input`` -> Responses API (+ ``instructions``/``tools``)
- any other dict -> fall back to summing string values
"""
def _chars(value: Any) -> int:
if value is None:
return 0
if isinstance(value, str):
return len(value)
return len(str(value))
def _message_chars(messages: Any) -> int:
if not isinstance(messages, list):
return _chars(messages)
return sum(_chars(item) for item in messages)
if isinstance(api_payload, list):
return _message_chars(api_payload) // 4
if isinstance(api_payload, dict):
messages = api_payload.get("messages")
if isinstance(messages, list):
total_chars = _message_chars(messages)
if "tools" in api_payload:
total_chars += _chars(api_payload.get("tools"))
return total_chars // 4
if "input" in api_payload:
total_chars = (
_chars(api_payload.get("input"))
+ _chars(api_payload.get("instructions"))
+ _chars(api_payload.get("tools"))
)
return total_chars // 4
return sum(_chars(value) for value in api_payload.values()) // 4
return _chars(api_payload) // 4
def interruptible_api_call(agent, api_kwargs: dict):
"""
@@ -254,34 +200,9 @@ def interruptible_api_call(agent, api_kwargs: dict):
# httpx timeout (default 1800s) with zero feedback. The stale
# detector kills the connection early so the main retry loop can
# apply richer recovery (credential rotation, provider fallback).
_stale_timeout = agent._compute_non_stream_stale_timeout(api_kwargs)
# ── Time-to-first-byte (TTFB) watchdog for the Codex Responses stream ──
# The chatgpt.com/backend-api/codex endpoint has an intermittent failure
# mode where it accepts the connection but never emits a single stream
# event (observed directly: 0 events, no HTTP status, the socket just
# hangs). A fresh reconnect succeeds in ~2s, but the wall-clock stale
# timeout (often 180900s) makes us wait minutes before retrying. While no
# stream event has arrived yet we apply a much shorter TTFB cutoff so the
# main retry loop can reconnect promptly. Once the first event arrives the
# stream is healthy, so we fall back to the wall-clock stale timeout and
# never interrupt a legitimate long generation. Gated to codex_responses:
# only that path streams events incrementally (the chat_completions
# non-stream, anthropic and bedrock branches here have no first-event
# signal). The marker advances on *any* event (see codex_runtime), so
# reasoning-only / tool-call-only turns are not mistaken for a stall.
# Operators can tune via HERMES_CODEX_TTFB_TIMEOUT_SECONDS (0 disables).
_ttfb_enabled = agent.api_mode == "codex_responses"
try:
_ttfb_timeout = float(os.getenv("HERMES_CODEX_TTFB_TIMEOUT_SECONDS", "45"))
except (TypeError, ValueError):
_ttfb_timeout = 45.0
if _ttfb_timeout <= 0:
_ttfb_enabled = False
if _ttfb_enabled:
# Reset before the worker starts so a marker left over from a previous
# call on this agent can't be misread as first-byte for this one.
agent._codex_stream_last_event_ts = None
_stale_timeout = agent._compute_non_stream_stale_timeout(
api_kwargs.get("messages", [])
)
_call_start = time.time()
agent._touch_activity("waiting for non-streaming API response")
@@ -301,75 +222,22 @@ def interruptible_api_call(agent, api_kwargs: dict):
f"waiting for non-streaming response ({int(_elapsed)}s elapsed)"
)
_elapsed = time.time() - _call_start
# TTFB detector: the Codex stream has produced no event at all and
# we're past the first-byte cutoff → the backend opened the
# connection but isn't responding. Kill it so the retry loop can
# reconnect (a fresh connection typically succeeds in seconds),
# instead of waiting out the much longer wall-clock stale timeout.
if (
_ttfb_enabled
and _elapsed > _ttfb_timeout
and getattr(agent, "_codex_stream_last_event_ts", None) is None
):
logger.warning(
"Codex stream produced no bytes within TTFB cutoff "
"(%.0fs > %.0fs, model=%s). Backend accepted the connection "
"but sent no stream events. Killing connection so the retry "
"loop can reconnect.",
_elapsed, _ttfb_timeout, api_kwargs.get("model", "unknown"),
)
agent._emit_status(
f"⚠️ No first byte from provider in {int(_elapsed)}s "
f"(codex stream, model: {api_kwargs.get('model', 'unknown')}). "
f"Reconnecting."
)
try:
_close_request_client_once("codex_ttfb_kill")
except Exception:
pass
agent._touch_activity(
f"codex stream killed after {int(_elapsed)}s with no first byte"
)
# Wait briefly for the worker to notice the closed connection.
t.join(timeout=2.0)
if result["error"] is None and result["response"] is None:
result["error"] = TimeoutError(
f"Codex stream produced no bytes within {int(_elapsed)}s "
f"(TTFB threshold: {int(_ttfb_timeout)}s)"
)
break
# Stale-call detector: kill the connection if no response
# arrives within the configured timeout.
_elapsed = time.time() - _call_start
if _elapsed > _stale_timeout:
_est_ctx = estimate_request_context_tokens(api_kwargs)
_silent_hint: Optional[str] = None
_hint_fn = getattr(agent, "_codex_silent_hang_hint", None)
if callable(_hint_fn):
try:
_silent_hint = _hint_fn(model=api_kwargs.get("model"))
except Exception:
_silent_hint = None
_est_ctx = sum(len(str(v)) for v in api_kwargs.get("messages", [])) // 4
logger.warning(
"Non-streaming API call stale for %.0fs (threshold %.0fs). "
"model=%s context=~%s tokens. Killing connection.",
_elapsed, _stale_timeout,
api_kwargs.get("model", "unknown"), f"{_est_ctx:,}",
)
if _silent_hint:
agent._emit_status(
f"⚠️ No response from provider for {int(_elapsed)}s "
f"(non-streaming, model: {api_kwargs.get('model', 'unknown')}). "
f"{_silent_hint}"
)
else:
agent._emit_status(
f"⚠️ No response from provider for {int(_elapsed)}s "
f"(non-streaming, model: {api_kwargs.get('model', 'unknown')}). "
f"Aborting call."
)
agent._emit_status(
f"⚠️ No response from provider for {int(_elapsed)}s "
f"(non-streaming, model: {api_kwargs.get('model', 'unknown')}). "
f"Aborting call."
)
try:
if agent.api_mode == "anthropic_messages":
agent._anthropic_client.close()
@@ -384,17 +252,10 @@ def interruptible_api_call(agent, api_kwargs: dict):
# Wait briefly for the thread to notice the closed connection.
t.join(timeout=2.0)
if result["error"] is None and result["response"] is None:
if _silent_hint:
result["error"] = TimeoutError(
f"Non-streaming API call timed out after {int(_elapsed)}s "
f"with no response (threshold: {int(_stale_timeout)}s). "
f"{_silent_hint}"
)
else:
result["error"] = TimeoutError(
f"Non-streaming API call timed out after {int(_elapsed)}s "
f"with no response (threshold: {int(_stale_timeout)}s)"
)
result["error"] = TimeoutError(
f"Non-streaming API call timed out after {int(_elapsed)}s "
f"with no response (threshold: {int(_stale_timeout)}s)"
)
break
if agent._interrupt_requested:
@@ -501,7 +362,6 @@ def build_api_kwargs(agent, api_messages: list) -> dict:
reasoning_config=agent.reasoning_config,
session_id=getattr(agent, "session_id", None),
max_tokens=agent.max_tokens,
timeout=agent._resolved_api_call_timeout(),
request_overrides=agent.request_overrides,
is_github_responses=is_github_responses,
is_codex_backend=is_codex_backend,
@@ -721,17 +581,6 @@ def build_assistant_message(agent, assistant_message, finish_reason: str) -> dic
if isinstance(_san_content, str) and _san_content:
_san_content = agent._strip_think_blocks(_san_content).strip()
# Defence-in-depth: redact credentials (PATs, API keys, Bearer tokens)
# from assistant content BEFORE the message enters conversation history.
# If the model accidentally inlines a secret in its natural-language
# response, catch it here at the persistence boundary so it never
# reaches state.db, session_*.json, gateway delivery, or compression.
# Respects HERMES_REDACT_SECRETS via redact_sensitive_text — no-op
# when disabled. (#19798)
if isinstance(_san_content, str) and _san_content:
from agent.redact import redact_sensitive_text
_san_content = redact_sensitive_text(_san_content)
msg = {
"role": "assistant",
"content": _san_content,
@@ -853,18 +702,6 @@ def build_assistant_message(agent, assistant_message, finish_reason: str) -> dic
"arguments": tool_call.function.arguments
},
}
# Defence-in-depth: redact credentials from tool call arguments
# before they enter conversation history. Tool execution uses the
# raw API response object, not this dict, so redacting the
# persisted shape is safe and only affects storage. Catches the
# case where a model accidentally inlines a secret into a tool
# call (e.g. `terminal(command="curl -H 'Authorization: Bearer
# sk-...'")`). (#19798)
if isinstance(tc_dict["function"]["arguments"], str):
from agent.redact import redact_sensitive_text
tc_dict["function"]["arguments"] = redact_sensitive_text(
tc_dict["function"]["arguments"]
)
# Preserve extra_content (e.g. Gemini thought_signature) so it
# is sent back on subsequent API calls. Without this, Gemini 3
# thinking models reject the request with a 400 error.
@@ -2159,7 +1996,7 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
# when the context is large. Without this, the stale detector kills
# healthy connections during the model's thinking phase, producing
# spurious RemoteProtocolError ("peer closed connection").
_est_tokens = estimate_request_context_tokens(api_kwargs)
_est_tokens = sum(len(str(v)) for v in api_kwargs.get("messages", [])) // 4
if _est_tokens > 100_000:
_stream_stale_timeout = max(_stream_stale_timeout_base, 300.0)
elif _est_tokens > 50_000:
@@ -2195,7 +2032,7 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
# inner retry loop can start a fresh connection.
_stale_elapsed = time.time() - last_chunk_time["t"]
if _stale_elapsed > _stream_stale_timeout:
_est_ctx = estimate_request_context_tokens(api_kwargs)
_est_ctx = sum(len(str(v)) for v in api_kwargs.get("messages", [])) // 4
logger.warning(
"Stream stale for %.0fs (threshold %.0fs) — no chunks received. "
"model=%s context=~%s tokens. Killing connection.",
@@ -2239,15 +2076,37 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
if deltas_were_sent["yes"]:
# Streaming failed AFTER some tokens were already delivered to
# the platform. Re-raising would let the outer retry loop make
# Return a partial response stub with finish_reason="length"
# so the conversation loop's continuation machinery fires.
# tool_calls=None prevents auto-execution of incomplete calls.
# a new API call, creating a duplicate message. Return a
# partial response stub instead and let the outer loop decide:
#
# - text-only partials → finish_reason="length" so the
# conversation loop persists the partial assistant content
# and asks the model to continue from where the stream
# died (issue #30963: partial stop misclassified as a
# clean completion was exiting the loop with budget
# remaining and an unfinished goal).
#
# - partial mid-tool-call → finish_reason="stop" stays.
# The user-visible warning we append says "Ask me to
# retry if you want to continue", so the agent should
# hand control back rather than auto-retry a tool call
# that may have side-effects.
#
# Recover whatever content was already streamed to the user.
# _current_streamed_assistant_text accumulates text fired
# through _fire_stream_delta, so it has exactly what the
# user saw before the connection died.
_partial_text = (
getattr(agent, "_current_streamed_assistant_text", "") or ""
).strip() or None
# Append a user-visible warning if tool calls were dropped so
# the user and model both know what was attempted.
# If the stream died while the model was emitting a tool call,
# the stub below will silently set `tool_calls=None` and the
# agent loop will treat the turn as complete — the attempted
# action is lost with no user-facing signal. Append a
# human-visible warning to the stub content so (a) the user
# knows something failed, and (b) the next turn's model sees
# in conversation history what was attempted and can retry.
_partial_names = list(result.get("partial_tool_names") or [])
if _partial_names:
_name_str = ", ".join(_partial_names[:3])
@@ -2259,7 +2118,8 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
f"Ask me to retry if you want to continue."
)
_partial_text = (_partial_text or "") + _warn
# Fire as streaming delta so the user sees it immediately.
# Also fire as a streaming delta so the user sees it now
# instead of only in the persisted transcript.
try:
agent._fire_stream_delta(_warn)
except Exception:
@@ -2269,7 +2129,7 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
"of text; surfaced warning to user: %s",
_partial_names, len(_partial_text or ""), result["error"],
)
_stub_finish_reason = FINISH_REASON_LENGTH
_stub_finish_reason = "stop"
else:
logger.warning(
"Partial stream delivered before error; returning "
@@ -2279,19 +2139,18 @@ def interruptible_streaming_api_call(agent, api_kwargs: dict, *, on_first_delta=
len(_partial_text or ""),
result["error"],
)
_stub_finish_reason = FINISH_REASON_LENGTH
_stub_finish_reason = "length"
_stub_msg = SimpleNamespace(
role="assistant", content=_partial_text, tool_calls=None,
reasoning_content=None,
)
return SimpleNamespace(
id=PARTIAL_STREAM_STUB_ID,
id="partial-stream-stub",
model=getattr(agent, "model", "unknown"),
choices=[SimpleNamespace(
index=0, message=_stub_msg, finish_reason=_stub_finish_reason,
)],
usage=None,
_dropped_tool_names=_partial_names or None,
)
raise result["error"]
return result["response"]
+1 -8
View File
@@ -745,7 +745,7 @@ def _preflight_codex_api_kwargs(
"model", "instructions", "input", "tools", "store",
"reasoning", "include", "max_output_tokens", "temperature",
"tool_choice", "parallel_tool_calls", "prompt_cache_key", "service_tier",
"extra_headers", "extra_body", "timeout",
"extra_headers", "extra_body",
}
normalized: Dict[str, Any] = {
"model": model,
@@ -771,13 +771,6 @@ def _preflight_codex_api_kwargs(
max_output_tokens = api_kwargs.get("max_output_tokens")
if isinstance(max_output_tokens, (int, float)) and max_output_tokens > 0:
normalized["max_output_tokens"] = int(max_output_tokens)
timeout = api_kwargs.get("timeout")
if (
isinstance(timeout, (int, float))
and not isinstance(timeout, bool)
and 0 < float(timeout) < float("inf")
):
normalized["timeout"] = float(timeout)
temperature = api_kwargs.get("temperature")
if isinstance(temperature, (int, float)):
normalized["temperature"] = float(temperature)
-6
View File
@@ -19,7 +19,6 @@ from __future__ import annotations
import json
import logging
import os
import time
from types import SimpleNamespace
from typing import Any, Dict, List
@@ -195,11 +194,6 @@ def run_codex_stream(agent, api_kwargs: dict, client: Any = None, on_first_delta
try:
with active_client.responses.stream(**api_kwargs) as stream:
for event in stream:
# Mark stream activity for the TTFB watchdog in
# interruptible_api_call. The Codex backend can accept the
# connection but never emit a single event; this timestamp
# staying None tells the watchdog no bytes are flowing.
agent._codex_stream_last_event_ts = time.time()
agent._touch_activity("receiving stream response")
if agent._interrupt_requested:
break
+25 -67
View File
@@ -65,7 +65,7 @@ from agent.prompt_caching import apply_anthropic_cache_control
from agent.retry_utils import jittered_backoff
from agent.trajectory import has_incomplete_scratchpad
from agent.usage_pricing import estimate_usage_cost, normalize_usage
from hermes_constants import display_hermes_home as _dhh_fn, PARTIAL_STREAM_STUB_ID
from hermes_constants import display_hermes_home as _dhh_fn
from hermes_logging import set_session_context
from tools.schema_sanitizer import strip_pattern_and_format
from tools.skill_provenance import set_current_write_origin
@@ -229,37 +229,6 @@ def _restore_or_build_system_prompt(agent, system_message, conversation_history)
)
def _get_continuation_prompt(is_partial_stub: bool, dropped_tools: Optional[List[str]] = None) -> str:
if is_partial_stub and dropped_tools:
tool_list = ", ".join(dropped_tools[:3])
return (
"[System: Your previous tool call "
f"({tool_list}) was too large and "
"the stream timed out before it "
"could be delivered. Do NOT retry "
"the same tool call with the same "
"large content. Instead, break the "
"content into multiple smaller tool "
"calls (e.g. use multiple patch calls "
"or write smaller files). Each tool "
"call's arguments must be under ~8K "
"tokens to avoid stream timeouts.]"
)
elif is_partial_stub:
return (
"[System: The previous response was cut off by a "
"network error mid-stream. Continue exactly where "
"you left off. Do not restart or repeat prior text. "
"Finish the answer directly.]"
)
else:
return (
"[System: Your previous response was truncated by the output "
"length limit. Continue exactly where you left off. Do not "
"restart or repeat prior text. Finish the answer directly.]"
)
def run_conversation(
agent,
user_message: str,
@@ -515,7 +484,7 @@ def run_conversation(
tools=agent.tools or None,
)
if agent.context_compressor.should_compress(_preflight_tokens):
if _preflight_tokens >= agent.context_compressor.threshold_tokens:
logger.info(
"Preflight compression: ~%s tokens >= %s threshold (model %s, ctx %s)",
f"{_preflight_tokens:,}",
@@ -1445,7 +1414,7 @@ def run_conversation(
finish_reason = "length"
if finish_reason == "length":
if getattr(response, "id", "") == PARTIAL_STREAM_STUB_ID:
if getattr(response, "id", "") == "partial-stream-stub":
agent._vprint(
f"{agent.log_prefix}⚠️ Stream interrupted by network error "
f"(finish_reason='length' on partial-stream-stub)",
@@ -1549,36 +1518,37 @@ def run_conversation(
truncated_response_parts.append(assistant_message.content)
if length_continue_retries < 3:
# Distinguish a real output-token truncation
# from a partial-stream-stub network error
# (#30963). Same continuation machinery,
# but the prompt has to tell the truth or
# the model goes off rails ("I wasn't
# truncated, I'm done").
_is_partial_stream_stub = (
getattr(response, "id", "") == PARTIAL_STREAM_STUB_ID
getattr(response, "id", "") == "partial-stream-stub"
)
_dropped_tools = getattr(
response, "_dropped_tool_names", None
)
if _is_partial_stream_stub and _dropped_tools:
_tool_list = ", ".join(_dropped_tools[:3])
agent._vprint(
f"{agent.log_prefix}↻ Stream interrupted mid "
f"tool-call ({_tool_list}) — requesting "
f"chunked retry "
f"({length_continue_retries}/3)..."
)
elif _is_partial_stream_stub:
if _is_partial_stream_stub:
agent._vprint(
f"{agent.log_prefix}↻ Stream interrupted — "
f"requesting continuation "
f"({length_continue_retries}/3)..."
)
_continue_content = (
"[System: The previous response was cut off by a "
"network error mid-stream. Continue exactly where "
"you left off. Do not restart or repeat prior text. "
"Finish the answer directly.]"
)
else:
agent._vprint(
f"{agent.log_prefix}↻ Requesting continuation "
f"({length_continue_retries}/3)..."
)
_continue_content = _get_continuation_prompt(
_is_partial_stream_stub, _dropped_tools
)
_continue_content = (
"[System: Your previous response was truncated by the output "
"length limit. Continue exactly where you left off. Do not "
"restart or repeat prior text. Finish the answer directly.]"
)
continue_msg = {
"role": "user",
"content": _continue_content,
@@ -2889,26 +2859,15 @@ def run_conversation(
agent._vprint(f"{agent.log_prefix} 🌐 Endpoint: {_base}", force=True)
# Actionable guidance for common auth errors
if classified.is_auth or classified.reason == FailoverReason.billing:
if _provider in {"openai-codex", "xai-oauth", "nous"} and status_code == 401:
if _provider in {"openai-codex", "xai-oauth"} and status_code == 401:
if _provider == "openai-codex":
agent._vprint(f"{agent.log_prefix} 💡 Codex OAuth token was rejected (HTTP 401). Your token may have been", force=True)
agent._vprint(f"{agent.log_prefix} refreshed by another client (Codex CLI, VS Code). To fix:", force=True)
agent._vprint(f"{agent.log_prefix} 1. Run `codex` in your terminal to generate fresh tokens.", force=True)
agent._vprint(f"{agent.log_prefix} 2. Then run `hermes auth` to re-authenticate.", force=True)
elif _provider == "xai-oauth":
else:
agent._vprint(f"{agent.log_prefix} 💡 xAI OAuth token was rejected (HTTP 401). To fix:", force=True)
agent._vprint(f"{agent.log_prefix} re-authenticate with xAI Grok OAuth (SuperGrok / Premium+) from `hermes model`.", force=True)
else: # nous
agent._vprint(f"{agent.log_prefix} 💡 Nous Portal OAuth token was rejected (HTTP 401). Your token may be", force=True)
agent._vprint(f"{agent.log_prefix} expired, revoked, or your account may be out of credits. To fix:", force=True)
agent._vprint(f"{agent.log_prefix} 1. Re-authenticate: hermes auth add nous --type oauth", force=True)
agent._vprint(f"{agent.log_prefix} 2. Check your portal account: https://portal.nousresearch.com", force=True)
# ``:free`` is OpenRouter slug syntax; Nous Portal will reject
# the model name even after a successful re-auth.
if isinstance(_model, str) and _model.endswith(":free"):
agent._vprint(f"{agent.log_prefix} ⚠️ Note: `{_model}` looks like an OpenRouter slug (`:free` suffix).", force=True)
agent._vprint(f"{agent.log_prefix} Nous Portal won't recognize that model name. Either switch to a", force=True)
agent._vprint(f"{agent.log_prefix} Nous catalog model, or run `/model openrouter:{_model}` to use OpenRouter.", force=True)
agent._vprint(f"{agent.log_prefix} re-authenticate with xAI Grok OAuth (SuperGrok Subscription) from `hermes model`.", force=True)
else:
agent._vprint(f"{agent.log_prefix} 💡 Your API key was rejected by the provider. Check:", force=True)
agent._vprint(f"{agent.log_prefix} • Is the key valid? Run: hermes setup", force=True)
@@ -4221,7 +4180,6 @@ def run_conversation(
"estimated_cost_usd": agent.session_estimated_cost_usd,
"cost_status": agent.session_cost_status,
"cost_source": agent.session_cost_source,
"session_id": agent.session_id,
}
if agent._tool_guardrail_halt_decision is not None:
result["guardrail"] = agent._tool_guardrail_halt_decision.to_metadata()
-174
View File
@@ -1,174 +0,0 @@
"""Credential-pool disk-boundary sanitization helpers.
These helpers define which credential-pool entries are references to borrowed
runtime secrets and strip raw values before those entries are written to
``auth.json``. They intentionally have no dependency on ``hermes_cli.auth`` so
both the pool model and the final auth-store write boundary can share the same
policy without import cycles.
"""
from __future__ import annotations
import hashlib
import re
from typing import Any, Dict, Mapping
# Sources Hermes owns and can intentionally persist in auth.json. Everything
# else with a non-empty source is treated as borrowed/reference-only by default
# so future external secret providers fail closed at the disk boundary.
_PERSISTABLE_PROVIDER_SOURCES = frozenset({
("anthropic", "hermes_pkce"),
("minimax-oauth", "oauth"),
("nous", "device_code"),
("openai-codex", "device_code"),
("xai-oauth", "loopback_pkce"),
})
_SAFE_SECRETISH_METADATA_KEYS = frozenset({
"secret_fingerprint",
"secret_source",
"token_type",
"scope",
"client_id",
"agent_key_id",
"agent_key_expires_at",
"agent_key_expires_in",
"agent_key_reused",
"agent_key_obtained_at",
"expires_at",
"expires_at_ms",
"expires_in",
"last_refresh",
"last_status",
"last_status_at",
"last_error_code",
"last_error_reason",
"last_error_message",
"last_error_reset_at",
})
_SECRET_VALUE_KEYS = frozenset({
"access_token",
"refresh_token",
"agent_key",
"api_key",
"apikey",
"api_token",
"auth_token",
"authorization",
"bearer_token",
"client_secret",
"credential",
"credentials",
"id_token",
"oauth_token",
"private_key",
"secret_key",
"session_token",
"password",
"secret",
"token",
"tokens",
})
_SECRET_VALUE_SUFFIXES = (
"_api_key",
"_api_token",
"_access_token",
"_auth_token",
"_refresh_token",
"_bearer_token",
"_client_secret",
"_id_token",
"_oauth_token",
"_private_key",
"_session_token",
"_secret_key",
"_password",
"_secret",
"_token",
"_key",
)
_CAMEL_CASE_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")
def _normalize_key(key: Any) -> str:
raw = str(key or "").strip()
raw = _CAMEL_CASE_BOUNDARY.sub("_", raw)
return raw.lower().replace("-", "_").replace(".", "_")
def is_borrowed_credential_source(source: Any, provider_id: Any = None) -> bool:
"""Return True when ``source`` points at a borrowed/reference-only secret."""
normalized_source = str(source or "").strip().lower()
if not normalized_source:
return False
if normalized_source == "manual" or normalized_source.startswith("manual:"):
return False
normalized_provider = str(provider_id or "").strip().lower()
return (normalized_provider, normalized_source) not in _PERSISTABLE_PROVIDER_SOURCES
def _is_secret_payload_key(key: Any) -> bool:
normalized = _normalize_key(key)
if not normalized or normalized in _SAFE_SECRETISH_METADATA_KEYS:
return False
if normalized in _SECRET_VALUE_KEYS:
return True
return normalized.endswith(_SECRET_VALUE_SUFFIXES)
def _fingerprint_value(value: Any) -> str | None:
if value is None:
return None
text = str(value)
if not text:
return None
digest = hashlib.sha256(text.encode("utf-8", errors="surrogatepass")).hexdigest()
return f"sha256:{digest[:16]}"
def _credential_secret_fingerprint(payload: Mapping[str, Any]) -> str | None:
for key in ("agent_key", "access_token", "refresh_token", "api_key", "token", "secret"):
fingerprint = _fingerprint_value(payload.get(key))
if fingerprint:
return fingerprint
for key, value in payload.items():
if _is_secret_payload_key(key):
fingerprint = _fingerprint_value(value)
if fingerprint:
return fingerprint
existing = payload.get("secret_fingerprint")
if isinstance(existing, str) and existing.startswith("sha256:"):
return existing
return None
def sanitize_borrowed_credential_payload(
payload: Mapping[str, Any],
provider_id: Any = None,
) -> Dict[str, Any]:
"""Return a disk-safe credential-pool payload.
Owned sources (manual entries and Hermes-owned OAuth/device-code state)
pass through unchanged. Borrowed/reference-only sources keep labels,
source refs, status/cooldown metadata, counters, and a non-reversible
fingerprint, but raw secret value fields are removed.
"""
result = dict(payload)
if not is_borrowed_credential_source(result.get("source"), provider_id):
return result
fingerprint = _credential_secret_fingerprint(result)
sanitized = {
key: value
for key, value in result.items()
if not _is_secret_payload_key(key)
}
if fingerprint:
sanitized["secret_fingerprint"] = fingerprint
return sanitized
+23 -89
View File
@@ -15,10 +15,6 @@ from typing import Any, Dict, List, Optional, Set, Tuple
from hermes_constants import OPENROUTER_BASE_URL
from hermes_cli.config import get_env_value, load_env
from agent.credential_persistence import (
is_borrowed_credential_source,
sanitize_borrowed_credential_payload,
)
import hermes_cli.auth as auth_mod
from hermes_cli.auth import (
CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
@@ -90,7 +86,7 @@ CUSTOM_POOL_PREFIX = "custom:"
_EXTRA_KEYS = frozenset({
"token_type", "scope", "client_id", "portal_base_url", "obtained_at",
"expires_in", "agent_key_id", "agent_key_expires_in", "agent_key_reused",
"agent_key_obtained_at", "tls", "secret_source", "secret_fingerprint",
"agent_key_obtained_at", "tls",
})
@@ -165,7 +161,7 @@ class PooledCredential:
for k, v in self.extra.items():
if v is not None:
result[k] = v
return sanitize_borrowed_credential_payload(result, self.provider)
return result
@property
def runtime_api_key(self) -> str:
@@ -249,16 +245,6 @@ def _extract_retry_delay_seconds(message: str) -> Optional[float]:
sec_match = re.search(r"retry\s+(?:after\s+)?(\d+(?:\.\d+)?)\s*(?:sec|secs|seconds|s\b)", message, re.IGNORECASE)
if sec_match:
return float(sec_match.group(1))
# "Resets in 4hr 5min" format used by OpenCode Go weekly usage limits
hr_min_match = re.search(r"resets?\s+in\s+(\d+)\s*hr\s+(\d+)\s*min", message, re.IGNORECASE)
if hr_min_match:
return int(hr_min_match.group(1)) * 3600 + int(hr_min_match.group(2)) * 60
hr_only_match = re.search(r"resets?\s+in\s+(\d+)\s*hr\b", message, re.IGNORECASE)
if hr_only_match:
return int(hr_only_match.group(1)) * 3600
min_only_match = re.search(r"resets?\s+in\s+(\d+)\s*min\b", message, re.IGNORECASE)
if min_only_match:
return int(min_only_match.group(1)) * 60
return None
@@ -1275,21 +1261,9 @@ class CredentialPool:
*,
status_code: Optional[int],
error_context: Optional[Dict[str, Any]] = None,
api_key_hint: Optional[str] = None,
) -> Optional[PooledCredential]:
with self._lock:
entry = None
if api_key_hint:
# Prefer the specific entry whose API key matches the one that
# actually failed. When this pool was freshly loaded from disk
# (another process already rotated), current() is None and
# _select_unlocked() would return the NEXT key — the wrong one.
entry = next(
(e for e in self._entries if e.runtime_api_key == api_key_hint),
None,
)
if entry is None:
entry = self.current() or self._select_unlocked()
entry = self.current() or self._select_unlocked()
if entry is None:
return None
_label = entry.label or entry.id[:8]
@@ -1459,12 +1433,8 @@ def _upsert_entry(entries: List[PooledCredential], provider: str, source: str, p
if field_updates or extra_updates:
if extra_updates:
field_updates["extra"] = {**existing.extra, **extra_updates}
updated = replace(existing, **field_updates)
entries[existing_idx] = updated
# Runtime-only borrowed secret updates should refresh the in-memory
# entry without forcing auth.json churn when the disk-safe payload is
# unchanged (for example env keys with the same fingerprint).
return existing.to_dict() != updated.to_dict()
entries[existing_idx] = replace(existing, **field_updates)
return True
return False
@@ -1802,35 +1772,6 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
except ImportError:
def _is_source_suppressed(_p, _s): # type: ignore[misc]
return False
def _secret_source_for_env(env_var: str) -> Optional[str]:
try:
from hermes_cli.env_loader import get_secret_source
source_label = get_secret_source(env_var)
except Exception:
source_label = None
return str(source_label).strip() if source_label else None
def _env_payload(
*,
source: str,
env_var: str,
token: str,
base_url: str,
auth_type: str = AUTH_TYPE_API_KEY,
) -> Dict[str, Any]:
payload: Dict[str, Any] = {
"source": source,
"auth_type": auth_type,
"access_token": token,
"base_url": base_url,
"label": env_var,
}
secret_source = _secret_source_for_env(env_var)
if secret_source:
payload["secret_source"] = secret_source
return payload
if provider == "openrouter":
# Prefer ~/.hermes/.env over os.environ
token = _get_env_prefer_dotenv("OPENROUTER_API_KEY")
@@ -1843,12 +1784,13 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
entries,
provider,
source,
_env_payload(
source=source,
env_var="OPENROUTER_API_KEY",
token=token,
base_url=OPENROUTER_BASE_URL,
),
{
"source": source,
"auth_type": AUTH_TYPE_API_KEY,
"access_token": token,
"base_url": OPENROUTER_BASE_URL,
"label": "OPENROUTER_API_KEY",
},
)
return changed, active_sources
@@ -1887,13 +1829,13 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
entries,
provider,
source,
_env_payload(
source=source,
env_var=env_var,
token=token,
base_url=base_url,
auth_type=auth_type,
),
{
"source": source,
"auth_type": auth_type,
"access_token": token,
"base_url": base_url,
"label": env_var,
},
)
return changed, active_sources
@@ -1905,11 +1847,8 @@ def _prune_stale_seeded_entries(entries: List[PooledCredential], active_sources:
if _is_manual_source(entry.source)
or entry.source in active_sources
or not (
is_borrowed_credential_source(entry.source, entry.provider)
# Hermes PKCE is Hermes-owned/persistable while present, but it is
# still a file-backed singleton and should disappear from the pool
# when the backing OAuth file is gone.
or entry.source == "hermes_pkce"
entry.source.startswith("env:")
or entry.source in {"claude_code", "hermes_pkce"}
)
]
if len(retained) == len(entries):
@@ -1994,22 +1933,17 @@ def _seed_custom_pool(pool_key: str, entries: List[PooledCredential]) -> Tuple[b
def load_pool(provider: str) -> CredentialPool:
provider = (provider or "").strip().lower()
raw_entries = read_credential_pool(provider)
raw_needs_sanitization = any(
isinstance(payload, dict)
and sanitize_borrowed_credential_payload(payload, provider) != payload
for payload in raw_entries
)
entries = [PooledCredential.from_dict(provider, payload) for payload in raw_entries]
if provider.startswith(CUSTOM_POOL_PREFIX):
# Custom endpoint pool — seed from custom_providers config and model config
custom_changed, custom_sources = _seed_custom_pool(provider, entries)
changed = raw_needs_sanitization or custom_changed
changed = custom_changed
changed |= _prune_stale_seeded_entries(entries, custom_sources)
else:
singleton_changed, singleton_sources = _seed_from_singletons(provider, entries)
env_changed, env_sources = _seed_from_env(provider, entries)
changed = raw_needs_sanitization or singleton_changed or env_changed
changed = singleton_changed or env_changed
changed |= _prune_stale_seeded_entries(entries, singleton_sources | env_sources)
changed |= _normalize_pool_priorities(provider, entries)
+1 -1
View File
@@ -285,7 +285,7 @@ def _remove_xai_oauth_loopback_pkce(provider: str, removed) -> RemovalResult:
if _clear_auth_store_provider(provider):
result.cleaned.append(f"Cleared {provider} OAuth tokens from auth store")
result.hints.append(
"Run `hermes model` → xAI Grok OAuth (SuperGrok / Premium+) to re-authenticate if needed."
"Run `hermes model` → xAI Grok OAuth (SuperGrok Subscription) to re-authenticate if needed."
)
return result
+6 -48
View File
@@ -41,11 +41,6 @@ def build_write_denied_paths(home: str) -> set[str]:
# Top-level .env, even when running under a profile — overwriting it
# leaks credentials across every profile that inherits from root (#15981).
str(hermes_root / ".env"),
# Active profile Anthropic PKCE credential store.
str(hermes_home / ".anthropic_oauth.json"),
# Top-level Anthropic PKCE credential store remains sensitive even
# when a profile is active; default/non-profile sessions still read it.
str(hermes_root / ".anthropic_oauth.json"),
os.path.join(home, ".bashrc"),
os.path.join(home, ".zshrc"),
os.path.join(home, ".profile"),
@@ -55,7 +50,6 @@ def build_write_denied_paths(home: str) -> set[str]:
os.path.join(home, ".pgpass"),
os.path.join(home, ".npmrc"),
os.path.join(home, ".pypirc"),
os.path.join(home, ".git-credentials"),
"/etc/sudoers",
"/etc/passwd",
"/etc/shadow",
@@ -77,7 +71,6 @@ def build_write_denied_prefixes(home: str) -> list[str]:
os.path.join(home, ".docker"),
os.path.join(home, ".azure"),
os.path.join(home, ".config", "gh"),
os.path.join(home, ".config", "gcloud"),
]
]
@@ -148,42 +141,21 @@ def is_write_denied(path: str) -> bool:
return False
# Common secret-bearing project-local environment file basenames.
# These are blocked because .env files routinely contain API keys,
# database passwords, and other credentials.
_BLOCKED_PROJECT_ENV_BASENAMES: set[str] = {
".env",
".env.local",
".env.development",
".env.production",
".env.test",
".env.staging",
".envrc",
}
def get_read_block_error(path: str) -> Optional[str]:
"""Return an error message when a read targets a denied Hermes path.
Three categories are blocked:
Two categories are blocked:
* Internal Hermes cache files under ``HERMES_HOME/skills/.hub``
readable metadata that an attacker could use as a prompt-injection
carrier.
* Credential / secret stores under HERMES_HOME and the global Hermes
root: ``auth.json``, ``auth.lock``, ``.anthropic_oauth.json``,
``.env``, ``webhook_subscriptions.json``, ``auth/google_oauth.json``,
and anything under ``mcp-tokens/``. These hold plaintext provider keys,
OAuth tokens, and HMAC secrets that the agent never needs to read
directly provider tools / gateway adapters consume them through
internal channels.
* Project-local environment files anywhere on disk: ``.env``,
``.env.local``, ``.env.development``, ``.env.production``,
``.env.test``, ``.env.staging``, ``.envrc``. These routinely hold
API keys, database passwords, and other credentials for the user's
own projects. The agent helping debug a project shouldn't normally
need to read these ``.env.example`` is the documented-shape
substitute.
``.env``, ``webhook_subscriptions.json``, and anything under
``mcp-tokens/``. These hold plaintext provider keys, OAuth tokens,
and HMAC secrets that the agent never needs to read directly
provider tools / gateway adapters consume them through internal
channels.
**This is NOT a security boundary.** The terminal tool runs as the
same OS user with shell access; the agent can still ``cat auth.json``
@@ -248,7 +220,6 @@ def get_read_block_error(path: str) -> Optional[str]:
".anthropic_oauth.json",
".env",
"webhook_subscriptions.json",
os.path.join("auth", "google_oauth.json"),
)
for hd in hermes_dirs:
for name in credential_file_names:
@@ -288,19 +259,6 @@ def get_read_block_error(path: str) -> Optional[str]:
"security boundary; the terminal tool can still bypass.)"
)
# Block common secret-bearing project-local .env files anywhere on disk.
# The agent helping a user with their project rarely needs to read raw
# .env contents — .env.example is the documented-shape substitute. The
# terminal tool can still ``cat .env``; this is defense-in-depth, not a
# boundary (see module docstring).
if resolved.name in _BLOCKED_PROJECT_ENV_BASENAMES:
return (
f"Access denied: {path} is a secret-bearing environment file "
"and cannot be read to prevent credential leakage. "
"If you need to check the file structure, read .env.example instead. "
"(Defense-in-depth — not a security boundary; the terminal tool can still bypass.)"
)
return None
-82
View File
@@ -191,88 +191,6 @@ def save_b64_image(
return path
# Extension inference for save_url_image — keep small and explicit. We don't
# want to import mimetypes for a handful of formats every image_gen provider
# actually returns, and we never want to inherit a content-type that points
# at HTML or JSON when the API gives us a degenerate response.
_URL_IMAGE_CONTENT_TYPES = {
"image/png": "png",
"image/jpeg": "jpg",
"image/jpg": "jpg",
"image/webp": "webp",
"image/gif": "gif",
}
def save_url_image(
url: str,
*,
prefix: str = "image",
timeout: float = 60.0,
max_bytes: int = 25 * 1024 * 1024,
) -> Path:
"""Download an image URL and write it under ``$HERMES_HOME/cache/images/``.
Used by providers (xAI, fallback OpenAI) whose API returns an *ephemeral*
URL instead of inline base64 those URLs frequently expire before a
downstream consumer (Telegram ``send_photo``, browser fetch) can resolve
them, so we materialise the bytes locally at tool-completion time.
Mirrors :func:`save_b64_image`'s shape so providers can swap in one line.
Returns the absolute :class:`Path` to the saved file. Raises on any
network / HTTP / oversize / non-image-content-type error so callers can
fall back to returning the bare URL with a clear error message.
"""
import requests
response = requests.get(url, timeout=timeout, stream=True)
response.raise_for_status()
# Infer extension from the response content-type, falling back to the
# URL suffix when xAI / OpenAI omit a precise type (some CDNs return
# ``application/octet-stream``). Defaults to ``png``.
content_type = (response.headers.get("Content-Type") or "").split(";", 1)[0].strip().lower()
extension = _URL_IMAGE_CONTENT_TYPES.get(content_type)
if extension is None:
url_path = url.split("?", 1)[0].lower()
for ext in ("png", "jpg", "jpeg", "webp", "gif"):
if url_path.endswith(f".{ext}"):
extension = "jpg" if ext == "jpeg" else ext
break
if extension is None:
extension = "png"
ts = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
short = uuid.uuid4().hex[:8]
path = _images_cache_dir() / f"{prefix}_{ts}_{short}.{extension}"
bytes_written = 0
with path.open("wb") as fh:
for chunk in response.iter_content(chunk_size=64 * 1024):
if not chunk:
continue
bytes_written += len(chunk)
if bytes_written > max_bytes:
fh.close()
try:
path.unlink()
except OSError:
pass
raise ValueError(
f"Image at {url} exceeds {max_bytes // (1024 * 1024)}MB cap; refusing to cache."
)
fh.write(chunk)
if bytes_written == 0:
try:
path.unlink()
except OSError:
pass
raise ValueError(f"Image at {url} returned 0 bytes; refusing to cache.")
return path
def success_response(
*,
image: str,
+2 -128
View File
@@ -73,102 +73,6 @@ _BWS_RUN_TIMEOUT = 30
_CacheKey = Tuple[str, str, str] # (access_token_fingerprint, project_id, server_url)
_CACHE: Dict[_CacheKey, "_CachedFetch"] = {}
# Disk-persisted cache so back-to-back CLI invocations (e.g. `hermes chat -q ...`
# called from scripts, cron, the gateway forking new agents) don't each pay the
# ~380ms `bws secret list` tax. The in-process _CACHE above only saves repeated
# fetches WITHIN one process; this saves repeated fetches ACROSS processes.
#
# Layout: one JSON object per cache key, written atomically with mode 0600 in
# <hermes_home>/cache/bws_cache.json. The file holds only the secret VALUES,
# never the access token. It's plaintext-equivalent to ~/.hermes/.env (which
# we already accept) but kept out of the .env file so users editing it won't
# accidentally commit BSM-sourced secrets.
_DISK_CACHE_BASENAME = "bws_cache.json"
def _disk_cache_path(home_path: Optional[Path] = None) -> Path:
"""Return the disk cache path under hermes_home/cache/.
`home_path` is what `load_hermes_dotenv()` already resolved; falling back
to `$HERMES_HOME` / `~/.hermes` keeps direct callers working too.
"""
if home_path is None:
home_path = Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
return home_path / "cache" / _DISK_CACHE_BASENAME
def _cache_key_str(cache_key: _CacheKey) -> str:
"""Serialize a cache key to a stable string for JSON storage."""
token_fp, project_id, server_url = cache_key
return f"{token_fp}|{project_id}|{server_url}"
def _read_disk_cache(cache_key: _CacheKey, ttl_seconds: float,
home_path: Optional[Path] = None) -> Optional["_CachedFetch"]:
"""Return a cached entry from disk if fresh, else None.
Best-effort: any I/O or parse error returns None and we re-fetch.
"""
if ttl_seconds <= 0:
return None
path = _disk_cache_path(home_path)
try:
with open(path, "r", encoding="utf-8") as f:
payload = json.load(f)
except (OSError, json.JSONDecodeError):
return None
if not isinstance(payload, dict):
return None
if payload.get("key") != _cache_key_str(cache_key):
return None
secrets = payload.get("secrets")
fetched_at = payload.get("fetched_at")
if not isinstance(secrets, dict) or not isinstance(fetched_at, (int, float)):
return None
# Coerce all values to strings — JSON allows numbers but env vars need strings
typed_secrets: Dict[str, str] = {
k: v for k, v in secrets.items() if isinstance(k, str) and isinstance(v, str)
}
entry = _CachedFetch(secrets=typed_secrets, fetched_at=float(fetched_at))
if not entry.is_fresh(ttl_seconds):
return None
return entry
def _write_disk_cache(cache_key: _CacheKey, entry: "_CachedFetch",
home_path: Optional[Path] = None) -> None:
"""Persist a cache entry to disk atomically with mode 0600.
Best-effort: any I/O error is swallowed (the next invocation will just
re-fetch). We never want disk cache failures to break startup.
"""
path = _disk_cache_path(home_path)
try:
path.parent.mkdir(parents=True, exist_ok=True)
payload = {
"key": _cache_key_str(cache_key),
"secrets": entry.secrets,
"fetched_at": entry.fetched_at,
}
# Write to a temp file in the same directory and atomic-rename.
# tempfile honors os.umask, so we explicitly chmod 0600 before rename.
fd, tmp = tempfile.mkstemp(
prefix=".bws_cache_", suffix=".tmp", dir=str(path.parent)
)
try:
with os.fdopen(fd, "w", encoding="utf-8") as f:
json.dump(payload, f)
os.chmod(tmp, 0o600)
os.replace(tmp, path)
except BaseException:
try:
os.unlink(tmp)
except OSError:
pass
raise
except OSError:
pass # best-effort — disk cache miss on next invocation is fine
@dataclass
class _CachedFetch:
@@ -414,7 +318,6 @@ def fetch_bitwarden_secrets(
cache_ttl_seconds: float = 300,
use_cache: bool = True,
server_url: str = "",
home_path: Optional[Path] = None,
) -> Tuple[Dict[str, str], List[str]]:
"""Pull the secrets for ``project_id`` from Bitwarden Secrets Manager.
@@ -426,13 +329,6 @@ def fetch_bitwarden_secrets(
(``https://vault.bitwarden.com``, US Cloud). This is plumbed into
the subprocess as ``BWS_SERVER_URL``.
Caching is a two-layer LRU: an in-process dict (for hot-reload paths
inside one process) and a disk-persisted JSON file under
``<hermes_home>/cache/bws_cache.json`` (for back-to-back CLI invocations).
Both share the same TTL. Pass ``home_path`` so disk cache lookups find
the right directory in tests / non-standard installs; otherwise we fall
back to ``$HERMES_HOME`` / ``~/.hermes``.
Raises :class:`RuntimeError` for fatal conditions (missing binary,
auth failure, unparseable output). Callers in the env_loader path
catch this and emit a single warning; callers in the user-facing
@@ -448,13 +344,6 @@ def fetch_bitwarden_secrets(
cached = _CACHE.get(cache_key)
if cached and cached.is_fresh(cache_ttl_seconds):
return cached.secrets, []
# L2: disk cache. ~5ms on cache hit vs ~380ms for `bws secret list`.
disk_cached = _read_disk_cache(cache_key, cache_ttl_seconds, home_path)
if disk_cached is not None:
# Promote into in-process cache so subsequent fetches in the
# same process skip the disk read too.
_CACHE[cache_key] = disk_cached
return disk_cached.secrets, []
bws = binary or find_bws(install_if_missing=True)
if bws is None:
@@ -466,10 +355,7 @@ def fetch_bitwarden_secrets(
)
secrets, warnings = _run_bws_list(bws, access_token, project_id, server_url)
entry = _CachedFetch(secrets=secrets, fetched_at=time.time())
_CACHE[cache_key] = entry
if use_cache:
_write_disk_cache(cache_key, entry, home_path)
_CACHE[cache_key] = _CachedFetch(secrets=secrets, fetched_at=time.time())
return secrets, warnings
@@ -566,7 +452,6 @@ def apply_bitwarden_secrets(
cache_ttl_seconds: float = 300,
auto_install: bool = True,
server_url: str = "",
home_path: Optional[Path] = None,
) -> FetchResult:
"""Pull secrets from BSM and set them on ``os.environ``.
@@ -617,7 +502,6 @@ def apply_bitwarden_secrets(
binary=binary,
cache_ttl_seconds=cache_ttl_seconds,
server_url=server_url,
home_path=home_path,
)
except RuntimeError as exc:
result.error = str(exc)
@@ -647,15 +531,5 @@ def apply_bitwarden_secrets(
# ---------------------------------------------------------------------------
def _reset_cache_for_tests(home_path: Optional[Path] = None) -> None:
"""Clear in-process AND disk caches.
Tests can pass ``home_path`` to scope the disk cleanup to a tmpdir.
Without it we fall back to the same default resolution as the cache
writer itself.
"""
def _reset_cache_for_tests() -> None:
_CACHE.clear()
try:
_disk_cache_path(home_path).unlink()
except (FileNotFoundError, OSError):
pass
-193
View File
@@ -1,193 +0,0 @@
"""
Transcription Provider ABC
==========================
Defines the pluggable-backend interface for speech-to-text. Providers
register instances via
:meth:`PluginContext.register_transcription_provider`; the active one
(selected via ``stt.provider`` in ``config.yaml``) services every
:func:`tools.transcription_tools.transcribe_audio` call **when the
configured name is neither a built-in (``local``, ``local_command``,
``groq``, ``openai``, ``mistral``, ``xai``) nor disabled**.
Two coexisting STT extension surfaces in resolution order:
1. **Built-in providers** (``BUILTIN_STT_PROVIDERS`` in
:mod:`tools.transcription_tools`) native Python implementations
for the 6 backends shipped today (faster-whisper, local_command,
Groq, OpenAI, Mistral, xAI). **Always win** plugins cannot
shadow them. The single-env-var shell escape hatch
``HERMES_LOCAL_STT_COMMAND`` is preserved via the built-in
``local_command`` path.
2. **Plugin-registered providers** (this ABC). For new STT backends
OpenRouter, SenseAudio, Gemini-STT, custom proprietary engines
that need a Python implementation without modifying
``tools/transcription_tools.py``.
Built-ins-always-win is enforced at registration time
(:func:`agent.transcription_registry.register_provider` rejects names
in ``BUILTIN_STT_PROVIDERS`` with a warning) AND at dispatch time
(:func:`tools.transcription_tools._dispatch_to_plugin_provider`
re-checks defensively).
Providers live in ``<repo>/plugins/transcription/<name>/`` (built-in
plugins, none shipped today) or
``~/.hermes/plugins/transcription/<name>/`` (user-installed).
Response contract
-----------------
:meth:`TranscriptionProvider.transcribe` returns a dict with keys::
success bool
transcript str transcribed text (empty when success=False)
provider str provider name (for diagnostics)
error str only when success=False
"""
from __future__ import annotations
import abc
import logging
from typing import Any, Dict, List, Optional
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# ABC
# ---------------------------------------------------------------------------
class TranscriptionProvider(abc.ABC):
"""Abstract base class for a speech-to-text backend.
Subclasses must implement :attr:`name` and :meth:`transcribe`.
Everything else has sane defaults override only what your provider
needs.
"""
@property
@abc.abstractmethod
def name(self) -> str:
"""Stable short identifier used in ``stt.provider`` config.
Lowercase, no spaces. Examples: ``openrouter``, ``sensaudio``,
``gemini``, ``deepgram``. Names that collide with a built-in STT
provider (``local``, ``local_command``, ``groq``, ``openai``,
``mistral``, ``xai``) are rejected at registration time.
"""
@property
def display_name(self) -> str:
"""Human-readable label shown in ``hermes tools``.
Defaults to ``name.title()``.
"""
return self.name.title()
def is_available(self) -> bool:
"""Return True when this provider can service calls.
Typically checks for a required API key + that the SDK is
importable. Default: True (providers with no external
dependencies are always available).
Must NOT raise used by the picker and ``hermes setup`` for
availability displays and should fail gracefully.
"""
return True
def list_models(self) -> List[Dict[str, Any]]:
"""Return model catalog entries.
Each entry::
{
"id": "whisper-large-v3-turbo", # required
"display": "Whisper Large v3 Turbo", # optional
"languages": ["en", "es", "fr"], # optional
"max_audio_seconds": 1500, # optional
}
Default: empty list (provider has a single fixed model or
doesn't expose model selection).
"""
return []
def default_model(self) -> Optional[str]:
"""Return the default model id, or None if not applicable."""
models = self.list_models()
if models:
return models[0].get("id")
return None
def get_setup_schema(self) -> Dict[str, Any]:
"""Return provider metadata for the ``hermes tools`` picker.
Used by ``tools_config.py`` to inject this provider as a row in
the Speech-to-Text provider list. Shape::
{
"name": "OpenRouter STT", # picker label
"badge": "paid", # optional short tag
"tag": "Whisper via OpenRouter API", # optional subtitle
"env_vars": [ # keys to prompt for
{"key": "OPENROUTER_API_KEY",
"prompt": "OpenRouter API key",
"url": "https://openrouter.ai/keys"},
],
}
Default: minimal entry derived from ``display_name`` with no
env vars. Override to expose API key prompts and custom badges.
"""
return {
"name": self.display_name,
"badge": "",
"tag": "",
"env_vars": [],
}
@abc.abstractmethod
def transcribe(
self,
file_path: str,
*,
model: Optional[str] = None,
language: Optional[str] = None,
**extra: Any,
) -> Dict[str, Any]:
"""Transcribe the audio file at ``file_path``.
Returns a dict with the standard envelope::
{
"success": True,
"transcript": "the transcribed text",
"provider": "<this provider's name>",
}
or on failure::
{
"success": False,
"transcript": "",
"error": "human-readable error message",
"provider": "<this provider's name>",
}
Implementations should NOT raise convert exceptions to the
error envelope so the dispatcher can deliver a consistent shape
to the gateway/CLI caller.
Args:
file_path: Absolute path to the audio file. The dispatcher
has already validated existence + size before calling.
model: Model identifier from :meth:`list_models`, or None
to use :meth:`default_model`.
language: Optional BCP-47 language hint (e.g. ``"en"``,
``"ja"``) providers without language hints should
ignore this argument.
**extra: Forward-compat parameters future schema versions
may expose. Implementations should ignore unknown keys.
"""
-122
View File
@@ -1,122 +0,0 @@
"""
Transcription Provider Registry
================================
Central map of registered STT providers. Populated by plugins at
import-time via :meth:`PluginContext.register_transcription_provider`;
consumed by :mod:`tools.transcription_tools` to dispatch
:func:`transcribe_audio` calls to the active plugin backend **when**
the configured ``stt.provider`` name is not a built-in.
Built-ins-always-win
--------------------
Plugin names that collide with a built-in STT provider (``local``,
``local_command``, ``groq``, ``openai``, ``mistral``, ``xai``) are
rejected at registration with a warning. This invariant is also
re-checked at dispatch time in
:func:`tools.transcription_tools._dispatch_to_plugin_provider`.
"""
from __future__ import annotations
import logging
import threading
from typing import Dict, List, Optional
from agent.transcription_provider import TranscriptionProvider
logger = logging.getLogger(__name__)
# Names reserved for native built-in STT handlers. Plugins cannot
# register a name in this set — the registration call is rejected with
# a warning. **Kept in sync with ``BUILTIN_STT_PROVIDERS`` in
# :mod:`tools.transcription_tools`** — a regression test in
# ``tests/agent/test_transcription_registry.py::TestBuiltinSync``
# fails if the two lists drift. Importing from
# ``tools.transcription_tools`` directly would create a circular
# dependency (``tools.transcription_tools`` imports
# ``agent.transcription_registry`` for dispatch).
_BUILTIN_NAMES = frozenset({
"local",
"local_command",
"groq",
"openai",
"mistral",
"xai",
})
_providers: Dict[str, TranscriptionProvider] = {}
_lock = threading.Lock()
def register_provider(provider: TranscriptionProvider) -> None:
"""Register a transcription provider.
Rejects:
- Non-:class:`TranscriptionProvider` instances (raises :class:`TypeError`).
- Empty/whitespace ``.name`` (raises :class:`ValueError`).
- Names colliding with a built-in (logs a warning, silently
ignores built-ins-always-win invariant).
Re-registration (same ``name``) overwrites the previous entry and
logs a debug message makes hot-reload scenarios (tests, dev
loops) behave predictably.
"""
if not isinstance(provider, TranscriptionProvider):
raise TypeError(
f"register_provider() expects a TranscriptionProvider instance, "
f"got {type(provider).__name__}"
)
name = provider.name
if not isinstance(name, str) or not name.strip():
raise ValueError("Transcription provider .name must be a non-empty string")
key = name.strip().lower()
if key in _BUILTIN_NAMES:
logger.warning(
"Transcription provider '%s' shadows a built-in name; registration "
"ignored. Built-in STT providers (%s) always win — pick a different "
"name.",
key, ", ".join(sorted(_BUILTIN_NAMES)),
)
return
with _lock:
existing = _providers.get(key)
_providers[key] = provider
if existing is not None:
logger.debug(
"Transcription provider '%s' re-registered (was %r)",
key, type(existing).__name__,
)
else:
logger.debug(
"Registered transcription provider '%s' (%s)",
key, type(provider).__name__,
)
def list_providers() -> List[TranscriptionProvider]:
"""Return all registered providers, sorted by name."""
with _lock:
items = list(_providers.values())
return sorted(items, key=lambda p: p.name)
def get_provider(name: str) -> Optional[TranscriptionProvider]:
"""Return the provider registered under *name*, or None.
Name matching is case-insensitive and whitespace-tolerant mirrors
how ``tools.transcription_tools._get_provider`` normalizes the
configured ``stt.provider`` value.
"""
if not isinstance(name, str):
return None
return _providers.get(name.strip().lower())
def _reset_for_tests() -> None:
"""Clear the registry. **Test-only.**"""
with _lock:
_providers.clear()
-15
View File
@@ -50,7 +50,6 @@ class ResponsesApiTransport(ProviderTransport):
reasoning_config: dict | None {effort, enabled}
session_id: str | None used for prompt_cache_key + xAI conv header
max_tokens: int | None max_output_tokens
timeout: float | None per-request timeout forwarded to the SDK
request_overrides: dict | None extra kwargs merged in
provider: str | None provider name for backend-specific logic
base_url: str | None endpoint URL
@@ -144,20 +143,6 @@ class ResponsesApiTransport(ProviderTransport):
if request_overrides:
kwargs.update(request_overrides)
# Forward per-request timeout to the SDK so OpenAI/Anthropic clients
# honor it. Without this, ``providers.<id>.request_timeout_seconds``
# is silently dropped on the main agent Codex path while the
# chat_completions path and auxiliary Codex adapter both forward it.
timeout = kwargs.get("timeout", params.get("timeout"))
if (
isinstance(timeout, (int, float))
and not isinstance(timeout, bool)
and 0 < float(timeout) < float("inf")
):
kwargs["timeout"] = float(timeout)
else:
kwargs.pop("timeout", None)
if is_codex_backend:
prompt_cache_key = kwargs.get("prompt_cache_key")
cache_scope_id = str(prompt_cache_key or session_id or "").strip()
-274
View File
@@ -1,274 +0,0 @@
"""
Text-to-Speech Provider ABC
============================
Defines the pluggable-backend interface for text-to-speech synthesis.
Providers register instances via
``PluginContext.register_tts_provider()``; the active one (selected via
``tts.provider`` in ``config.yaml``) services every ``text_to_speech``
tool call **only when the configured name is neither a built-in nor a
command-type provider declared under ``tts.providers.<name>``**.
Three coexisting TTS extension surfaces in resolution order:
1. **Built-in providers** (``BUILTIN_TTS_PROVIDERS`` in
:mod:`tools.tts_tool`) native Python implementations (edge, openai,
elevenlabs, ). **Always win** plugins cannot shadow them.
2. **Command-type providers** declared under ``tts.providers.<name>:
type: command`` (PR #17843, commit ``2facea7f7``). Wire any local
CLI into Hermes with shell-template placeholders. **Wins over a
same-name plugin** config is more local than plugin install.
3. **Plugin-registered providers** (this ABC). For backends that need a
Python SDK, streaming bytes, OAuth refresh, or voice-listing APIs
the shell-template grammar can't reasonably express.
Built-ins-always-win is enforced at registration time
(:func:`agent.tts_registry.register_provider` rejects names in
``BUILTIN_TTS_PROVIDERS`` with a warning) AND at dispatch time
(:func:`tools.tts_tool._dispatch_to_plugin_provider` re-checks
defensively). The dispatcher also rejects plugin dispatch when a same-
name command provider is configured.
Providers live in ``<repo>/plugins/tts/<name>/`` (built-in plugins, no
shipped today) or ``~/.hermes/plugins/tts/<name>/`` (user-installed).
None ship in-tree as of issue #30398 — the hook is additive
infrastructure waiting for a real consumer (Cartesia, Fish Audio, ).
Response contract
-----------------
:meth:`TTSProvider.synthesize` writes the audio bytes to ``output_path``
and returns the path as a string. Implementations should raise on
failure the dispatcher converts exceptions into the standard
``{success: False, error: }`` JSON envelope the rest of Hermes
expects.
"""
from __future__ import annotations
import abc
import logging
from typing import Any, Dict, Iterator, List, Optional
logger = logging.getLogger(__name__)
DEFAULT_OUTPUT_FORMAT = "mp3"
VALID_OUTPUT_FORMATS = frozenset({"mp3", "wav", "ogg", "opus", "flac"})
# ---------------------------------------------------------------------------
# ABC
# ---------------------------------------------------------------------------
class TTSProvider(abc.ABC):
"""Abstract base class for a text-to-speech backend.
Subclasses must implement :attr:`name` and :meth:`synthesize`.
Everything else has sane defaults override only what your provider
needs.
"""
@property
@abc.abstractmethod
def name(self) -> str:
"""Stable short identifier used in ``tts.provider`` config.
Lowercase, no spaces. Examples: ``cartesia``, ``fishaudio``,
``deepgram``. Names that collide with a built-in TTS provider
(``edge``, ``openai``, ``elevenlabs``, ``minimax``, ``gemini``,
``mistral``, ``xai``, ``piper``, ``kittentts``, ``neutts``) are
rejected at registration time.
"""
@property
def display_name(self) -> str:
"""Human-readable label shown in ``hermes tools``.
Defaults to ``name.title()`` (e.g. ``Cartesia`` for ``cartesia``).
"""
return self.name.title()
def is_available(self) -> bool:
"""Return True when this provider can service calls.
Typically checks for a required API key + that the SDK is
importable. Default: True (providers with no external
dependencies are always available).
Must NOT raise used by the picker and ``hermes setup`` for
availability displays and should fail gracefully.
"""
return True
def list_voices(self) -> List[Dict[str, Any]]:
"""Return voice catalog entries.
Each entry::
{
"id": "voice-abc-123", # required
"display": "Aria — neutral female", # optional; defaults to id
"language": "en-US", # optional
"gender": "female", # optional
"preview_url": "https://...mp3", # optional
}
Default: empty list (provider has no enumerable voices or
doesn't surface them via API).
"""
return []
def list_models(self) -> List[Dict[str, Any]]:
"""Return model catalog entries.
Each entry::
{
"id": "sonic-2", # required
"display": "Sonic 2", # optional
"languages": ["en", "es", "fr"], # optional
"max_text_length": 5000, # optional
}
Default: empty list (provider has a single fixed model or
doesn't expose model selection).
"""
return []
def get_setup_schema(self) -> Dict[str, Any]:
"""Return provider metadata for the ``hermes tools`` picker.
Used by ``tools_config.py`` to inject this provider as a row in
the Text-to-Speech provider list. Shape::
{
"name": "Cartesia", # picker label
"badge": "paid", # optional short tag
"tag": "Ultra-low-latency streaming", # optional subtitle
"env_vars": [ # keys to prompt for
{"key": "CARTESIA_API_KEY",
"prompt": "Cartesia API key",
"url": "https://play.cartesia.ai/console"},
],
}
Default: minimal entry derived from ``display_name`` with no
env vars. Override to expose API key prompts and custom badges.
"""
return {
"name": self.display_name,
"badge": "",
"tag": "",
"env_vars": [],
}
def default_model(self) -> Optional[str]:
"""Return the default model id, or None if not applicable."""
models = self.list_models()
if models:
return models[0].get("id")
return None
def default_voice(self) -> Optional[str]:
"""Return the default voice id, or None if not applicable."""
voices = self.list_voices()
if voices:
return voices[0].get("id")
return None
@abc.abstractmethod
def synthesize(
self,
text: str,
output_path: str,
*,
voice: Optional[str] = None,
model: Optional[str] = None,
speed: Optional[float] = None,
format: str = DEFAULT_OUTPUT_FORMAT,
**extra: Any,
) -> str:
"""Synthesize ``text`` and write audio bytes to ``output_path``.
Returns the absolute path to the written file as a string
(typically just echoes ``output_path``). Raises on failure
the dispatcher converts exceptions to the standard
``{success: False, error: ...}`` JSON envelope.
Args:
text: The text to synthesize. Already truncated to the
provider's max length by the dispatcher.
output_path: Absolute path where the audio file should be
written. Parent directory is guaranteed to exist.
voice: Voice identifier from :meth:`list_voices`, or None
to use :meth:`default_voice`.
model: Model identifier from :meth:`list_models`, or None
to use :meth:`default_model`.
speed: Optional speech-rate multiplier (1.0 = normal).
Providers that don't support speed control should
ignore this argument.
format: Output audio format. Implementations should match
the requested format when possible; if unsupported,
pick the closest equivalent and ensure ``output_path``
ends with the correct extension.
**extra: Forward-compat parameters future schema versions
may expose. Implementations should ignore unknown keys.
"""
def stream(
self,
text: str,
*,
voice: Optional[str] = None,
model: Optional[str] = None,
format: str = "opus",
**extra: Any,
) -> Iterator[bytes]:
"""Stream synthesized audio bytes.
Optional. Providers that don't support streaming raise
:class:`NotImplementedError` (the default) and the dispatcher
falls back to :meth:`synthesize` + read-whole-file.
Args mirror :meth:`synthesize`. Default ``format`` is ``opus``
because the primary streaming use case is voice-bubble
delivery (Telegram et al.) which requires Opus.
"""
raise NotImplementedError(
f"TTS provider {self.name!r} does not implement streaming "
"synthesis. Use synthesize() instead, or implement stream() "
"if your backend supports it."
)
@property
def voice_compatible(self) -> bool:
"""Whether output is suitable for voice-bubble delivery.
Mirrors the ``tts.providers.<name>.voice_compatible`` field
from PR #17843. When True, the gateway's voice-message
delivery pipeline runs ffmpeg conversion to Opus if needed.
When False, output is delivered as a regular audio attachment.
Default: False (safe providers opt in explicitly).
"""
return False
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def resolve_output_format(value: Optional[str]) -> str:
"""Clamp an output_format value to the valid set.
Invalid values are coerced to :data:`DEFAULT_OUTPUT_FORMAT` rather
than rejected so the tool surface is forgiving of agent mistakes.
"""
if not isinstance(value, str):
return DEFAULT_OUTPUT_FORMAT
v = value.strip().lower()
if v in VALID_OUTPUT_FORMATS:
return v
return DEFAULT_OUTPUT_FORMAT
-133
View File
@@ -1,133 +0,0 @@
"""
TTS Provider Registry
=====================
Central map of registered TTS providers. Populated by plugins at
import-time via :meth:`PluginContext.register_tts_provider`; consumed
by :mod:`tools.tts_tool` to dispatch ``text_to_speech`` tool calls to
the active plugin backend **when** the configured ``tts.provider``
name is neither a built-in nor a command-type provider.
Built-ins-always-win
--------------------
Plugin names that collide with a built-in TTS provider (``edge``,
``openai``, ``elevenlabs``, ``minimax``, ``gemini``, ``mistral``,
``xai``, ``piper``, ``kittentts``, ``neutts``) are rejected at
registration with a warning. This invariant is also re-checked at
dispatch time in :func:`tools.tts_tool._dispatch_to_plugin_provider`.
Command-providers-win-over-plugins
----------------------------------
This registry doesn't enforce the command-vs-plugin precedence — that
lives in the dispatcher, which checks for a same-name
``tts.providers.<name>: type: command`` entry before consulting the
registry. The rationale is locality: a name declared in the user's
``config.yaml`` is more specific to their setup than a plugin that
happens to be installed.
"""
from __future__ import annotations
import logging
import threading
from typing import Dict, List, Optional
from agent.tts_provider import TTSProvider
logger = logging.getLogger(__name__)
# Names reserved for native built-in TTS handlers. Plugins cannot
# register a name in this set — the registration call is rejected with
# a warning. **Kept in sync with ``BUILTIN_TTS_PROVIDERS`` in
# :mod:`tools.tts_tool`** — a regression test in
# ``tests/agent/test_tts_registry.py::TestBuiltinSync`` fails if the
# two lists drift. Importing from ``tools.tts_tool`` directly would
# create a circular dependency (``tools.tts_tool`` imports
# ``agent.tts_registry`` for dispatch).
_BUILTIN_NAMES = frozenset({
"edge",
"elevenlabs",
"openai",
"minimax",
"xai",
"mistral",
"gemini",
"neutts",
"kittentts",
"piper",
})
_providers: Dict[str, TTSProvider] = {}
_lock = threading.Lock()
def register_provider(provider: TTSProvider) -> None:
"""Register a TTS provider.
Rejects:
- Non-:class:`TTSProvider` instances (raises :class:`TypeError`).
- Empty/whitespace ``.name`` (raises :class:`ValueError`).
- Names colliding with a built-in (logs a warning, silently
ignores built-ins-always-win invariant).
Re-registration (same ``name``) overwrites the previous entry and
logs a debug message makes hot-reload scenarios (tests, dev
loops) behave predictably.
"""
if not isinstance(provider, TTSProvider):
raise TypeError(
f"register_provider() expects a TTSProvider instance, "
f"got {type(provider).__name__}"
)
name = provider.name
if not isinstance(name, str) or not name.strip():
raise ValueError("TTS provider .name must be a non-empty string")
key = name.strip().lower()
if key in _BUILTIN_NAMES:
logger.warning(
"TTS provider '%s' shadows a built-in name; registration ignored. "
"Built-in TTS providers (%s) always win — pick a different name.",
key, ", ".join(sorted(_BUILTIN_NAMES)),
)
return
with _lock:
existing = _providers.get(key)
_providers[key] = provider
if existing is not None:
logger.debug(
"TTS provider '%s' re-registered (was %r)",
key, type(existing).__name__,
)
else:
logger.debug(
"Registered TTS provider '%s' (%s)",
key, type(provider).__name__,
)
def list_providers() -> List[TTSProvider]:
"""Return all registered providers, sorted by name."""
with _lock:
items = list(_providers.values())
return sorted(items, key=lambda p: p.name)
def get_provider(name: str) -> Optional[TTSProvider]:
"""Return the provider registered under *name*, or None.
Name matching is case-insensitive and whitespace-tolerant mirrors
how ``tools.tts_tool._get_provider`` normalizes the configured
``tts.provider`` value.
"""
if not isinstance(name, str):
return None
return _providers.get(name.strip().lower())
def _reset_for_tests() -> None:
"""Clear the registry. **Test-only.**"""
with _lock:
_providers.clear()
+24 -234
View File
@@ -2360,89 +2360,6 @@ def _strip_leaked_bracketed_paste_wrappers(text: str) -> str:
return text
def _apply_bracketed_paste_timeout_patch() -> None:
"""Patch prompt_toolkit to recover from torn bracketed-paste sequences.
prompt_toolkit's ``Vt100Parser.feed()`` buffers all input while waiting
for the ESC[201~ end mark. If a terminal drops that end mark (terminal
race, torn write, SSH glitch, macOS sleep/wake), input appears frozen
forever the only recovery used to be killing the tab.
This patch wraps ``Vt100Parser.feed`` so that bracketed-paste mode
flushes buffered content as a normal ``BracketedPaste`` event after
``_BP_TIMEOUT_S`` seconds without an end marker, then resumes normal
parsing. See upstream issue #16263.
The patch is idempotent repeated calls are no-ops via the
``_hermes_bp_timeout_patched`` sentinel on the module.
"""
try:
import prompt_toolkit.input.vt100_parser as _vt100_mod
from prompt_toolkit.keys import Keys as _PtKeys
from prompt_toolkit.key_binding.key_processor import KeyPress as _PtKeyPress
if getattr(_vt100_mod, "_hermes_bp_timeout_patched", False):
return
_BP_TIMEOUT_S = 2.0 # max time to wait for ESC[201~ before flushing
def _patched_vt100_feed(self_parser, data: str) -> None:
if self_parser._in_bracketed_paste:
self_parser._paste_buffer += data
end_mark = "\x1b[201~"
if end_mark in self_parser._paste_buffer:
end_index = self_parser._paste_buffer.index(end_mark)
paste_content = self_parser._paste_buffer[:end_index]
self_parser.feed_key_callback(
_PtKeyPress(_PtKeys.BracketedPaste, paste_content)
)
self_parser._in_bracketed_paste = False
remaining = self_parser._paste_buffer[
end_index + len(end_mark):
]
self_parser._paste_buffer = ""
self_parser._hermes_bp_start = None
if remaining:
_patched_vt100_feed(self_parser, remaining)
else:
bp_start = getattr(self_parser, "_hermes_bp_start", None)
now = time.monotonic()
if bp_start is None:
self_parser._hermes_bp_start = now
elif now - bp_start > _BP_TIMEOUT_S:
paste_content = self_parser._paste_buffer
self_parser._in_bracketed_paste = False
self_parser._paste_buffer = ""
self_parser._hermes_bp_start = None
if paste_content:
self_parser.feed_key_callback(
_PtKeyPress(_PtKeys.BracketedPaste, paste_content)
)
logger.warning(
"Bracketed-paste timeout (%.1fs) — flushed %d bytes "
"without end mark. Terminal may have dropped ESC[201~ "
"(see #16263).",
now - bp_start,
len(paste_content),
)
else:
# Normal mode — re-inline prompt_toolkit's normal feed path.
# Calling the original feed here would double-buffer after the
# bracketed-paste entry transition.
for i, c in enumerate(data):
if self_parser._in_bracketed_paste:
_patched_vt100_feed(self_parser, data[i:])
break
self_parser._input_parser.send(c)
_vt100_mod.Vt100Parser.feed = _patched_vt100_feed
_vt100_mod._hermes_bp_timeout_patched = True
logger.debug("Applied Vt100Parser bracketed-paste timeout patch (#16263)")
except Exception as exc: # noqa: BLE001 — defensive: never break startup
logger.debug("Bracketed-paste timeout patch skipped: %s", exc)
# Cursor Position Report (CPR / DSR) response, format ``ESC[<row>;<col>R``.
# prompt_toolkit's _on_resize() + renderer send ``ESC[6n`` queries to the
# terminal; under resize storms or tab switches the terminal's reply can
@@ -3503,7 +3420,6 @@ class HermesCLI:
"session_api_calls": 0,
"compressions": 0,
"active_background_tasks": 0,
"active_background_processes": 0,
}
# Count live /background tasks. The dict entry is removed in the
@@ -3516,14 +3432,6 @@ class HermesCLI:
except Exception:
pass
# Count live background terminal processes (terminal tool background
# sessions tracked by tools.process_registry). Cheap O(1) read.
try:
from tools.process_registry import process_registry
snapshot["active_background_processes"] = process_registry.count_running()
except Exception:
pass
if not agent:
return snapshot
@@ -3762,9 +3670,6 @@ class HermesCLI:
bg_count = snapshot.get("active_background_tasks", 0)
if bg_count:
parts.append(f"{bg_count}")
bg_proc_count = snapshot.get("active_background_processes", 0)
if bg_proc_count:
parts.append(f"{bg_proc_count}")
parts.append(duration_label)
if yolo_active:
parts.append("⚠ YOLO")
@@ -3784,9 +3689,6 @@ class HermesCLI:
bg_count = snapshot.get("active_background_tasks", 0)
if bg_count:
parts.append(f"{bg_count}")
bg_proc_count = snapshot.get("active_background_processes", 0)
if bg_proc_count:
parts.append(f"{bg_proc_count}")
parts.append(duration_label)
prompt_elapsed = snapshot.get("prompt_elapsed")
if prompt_elapsed:
@@ -3828,7 +3730,6 @@ class HermesCLI:
if width < 76:
compressions = snapshot.get("compressions", 0)
bg_count = snapshot.get("active_background_tasks", 0)
bg_proc_count = snapshot.get("active_background_processes", 0)
frags = [
("class:status-bar", ""),
("class:status-bar-strong", snapshot["model_short"]),
@@ -3841,9 +3742,6 @@ class HermesCLI:
if bg_count:
frags.append(("class:status-bar-dim", " · "))
frags.append(("class:status-bar-strong", f"{bg_count}"))
if bg_proc_count:
frags.append(("class:status-bar-dim", " · "))
frags.append(("class:status-bar-strong", f"{bg_proc_count}"))
frags.extend([
("class:status-bar-dim", " · "),
("class:status-bar-dim", duration_label),
@@ -3863,7 +3761,6 @@ class HermesCLI:
bar_style = self._status_bar_context_style(percent)
compressions = snapshot.get("compressions", 0)
bg_count = snapshot.get("active_background_tasks", 0)
bg_proc_count = snapshot.get("active_background_processes", 0)
frags = [
("class:status-bar", ""),
("class:status-bar-strong", snapshot["model_short"]),
@@ -3880,9 +3777,6 @@ class HermesCLI:
if bg_count:
frags.append(("class:status-bar-dim", ""))
frags.append(("class:status-bar-strong", f"{bg_count}"))
if bg_proc_count:
frags.append(("class:status-bar-dim", ""))
frags.append(("class:status-bar-strong", f"{bg_proc_count}"))
frags.extend([
("class:status-bar-dim", ""),
("class:status-bar-dim", duration_label),
@@ -4862,22 +4756,9 @@ class HermesCLI:
# is non-empty and we skip the DB round-trip.
if self._resumed and self._session_db and not self.conversation_history:
session_meta = self._session_db.get_session(self.session_id)
# In quiet mode (`hermes chat -Q` / --quiet, surfaced via
# tool_progress_mode == "off"), resume status lines go to stderr
# so stdout stays machine-readable for automation wrappers that
# do `$(hermes chat -Q --resume <id> -q "...")`. Without this,
# the resume banner pollutes captured stdout. See #11793.
_quiet_mode = getattr(self, "tool_progress_mode", "full") == "off"
if not session_meta:
if _quiet_mode:
print(f"Session not found: {self.session_id}", file=sys.stderr)
print(
"Use a session ID from a previous CLI run (hermes sessions list).",
file=sys.stderr,
)
else:
_cprint(f"\033[1;31mSession not found: {self.session_id}{_RST}")
_cprint(f"{_DIM}Use a session ID from a previous CLI run (hermes sessions list).{_RST}")
_cprint(f"\033[1;31mSession not found: {self.session_id}{_RST}")
_cprint(f"{_DIM}Use a session ID from a previous CLI run (hermes sessions list).{_RST}")
return False
# If the requested session is the (empty) head of a compression
# chain, walk to the descendant that actually holds the messages.
@@ -4904,30 +4785,16 @@ class HermesCLI:
title_part = ""
if session_meta.get("title"):
title_part = f" \"{session_meta['title']}\""
if _quiet_mode:
print(
f"↻ Resumed session {self.session_id}{title_part} "
f"({msg_count} user message{'s' if msg_count != 1 else ''}, "
f"{len(restored)} total messages)",
file=sys.stderr,
)
else:
ChatConsole().print(
f"[bold {_accent_hex()}]↻ Resumed session[/] "
f"[bold]{_escape(self.session_id)}[/]"
f"[bold {_accent_hex()}]{_escape(title_part)}[/] "
f"({msg_count} user message{'s' if msg_count != 1 else ''}, {len(restored)} total messages)"
)
ChatConsole().print(
f"[bold {_accent_hex()}]↻ Resumed session[/] "
f"[bold]{_escape(self.session_id)}[/]"
f"[bold {_accent_hex()}]{_escape(title_part)}[/] "
f"({msg_count} user message{'s' if msg_count != 1 else ''}, {len(restored)} total messages)"
)
else:
if _quiet_mode:
print(
f"Session {self.session_id} found but has no messages. Starting fresh.",
file=sys.stderr,
)
else:
ChatConsole().print(
f"[bold {_accent_hex()}]Session {_escape(self.session_id)} found but has no messages. Starting fresh.[/]"
)
ChatConsole().print(
f"[bold {_accent_hex()}]Session {_escape(self.session_id)} found but has no messages. Starting fresh.[/]"
)
# Re-open the session (clear ended_at so it's active again)
try:
self._session_db._conn.execute(
@@ -5091,22 +4958,20 @@ class HermesCLI:
if os.environ.get("HERMES_DEFER_AGENT_STARTUP") != "1":
self._show_tool_availability_warnings()
# Warn about low context lengths (common with local servers). Keep
# this tied to the runtime guard so guidance cannot drift again.
from agent.model_metadata import MINIMUM_CONTEXT_LENGTH
if ctx_len and ctx_len < MINIMUM_CONTEXT_LENGTH:
# Warn about very low context lengths (common with local servers)
if ctx_len and ctx_len <= 8192:
self._console_print()
self._console_print(
f"[yellow]⚠️ Context length is only {ctx_len:,} tokens — "
f"this is likely too low for agent use with tools.[/]"
)
self._console_print(
f"[dim] Hermes needs at least {MINIMUM_CONTEXT_LENGTH:,} tokens. Tool schemas + system prompt use a large fixed prefix.[/]"
"[dim] Hermes needs 16k32k minimum. Tool schemas + system prompt alone use ~4k8k.[/]"
)
base_url = getattr(self, "base_url", "") or ""
if "11434" in base_url or "ollama" in base_url.lower():
self._console_print(
f"[dim] Ollama fix: OLLAMA_CONTEXT_LENGTH={MINIMUM_CONTEXT_LENGTH} ollama serve[/]"
"[dim] Ollama fix: OLLAMA_CONTEXT_LENGTH=32768 ollama serve[/]"
)
elif "1234" in base_url:
self._console_print(
@@ -6660,19 +6525,6 @@ class HermesCLI:
parts = cmd_original.split(None, 1)
target = parts[1].strip() if len(parts) > 1 else ""
# Strip common outer brackets/quotes users may type literally from the
# usage hint (e.g. ``/resume <abc123>`` or ``/resume [abc123]``). The
# `/resume` help text shows angle brackets as a placeholder and a few
# users copy them through verbatim. Stripping them keeps the lookup
# working without changing the help string.
if len(target) >= 2 and (
(target[0] == "<" and target[-1] == ">")
or (target[0] == "[" and target[-1] == "]")
or (target[0] == '"' and target[-1] == '"')
or (target[0] == "'" and target[-1] == "'")
):
target = target[1:-1].strip()
if not target:
_cprint(" Usage: /resume <number|session_id_or_title>")
if self._show_recent_sessions(reason="resume"):
@@ -7140,28 +6992,7 @@ class HermesCLI:
could be interpreted as EOF/exit. A first-class modal state keeps the
choices visible and lets the normal Enter key binding submit the typed
or highlighted choice.
**Platform note (Windows dead-lock issue #30768):**
The queue-based modal relies on prompt_toolkit key bindings receiving
keyboard events and calling ``_submit_slash_confirm_response``. On
Windows (PowerShell / Windows Terminal) the prompt_toolkit input
channel can become unresponsive when the modal is entered from the
``process_loop`` daemon thread, causing a dead-lock: the user sees the
confirmation panel but keystrokes never reach the key bindings and the
``response_queue.get()`` blocks until the 120-second timeout expires.
To avoid this, we fall back to ``_prompt_text_input`` (a simple
``input()``-based prompt) when any of these conditions hold:
* ``sys.platform == "win32"`` native Windows console (ConPTY /
win32_input) does not support the modal reliably.
* Called from a non-main thread the prompt_toolkit event loop only
runs on the main thread; key bindings can't fire from a daemon
thread (same rationale as the ``_prompt_text_input`` thread guard
in PR #23454).
* ``self._app`` is not set unit tests / non-interactive contexts.
"""
import threading
import time as _time
if not choices:
@@ -7172,20 +7003,6 @@ class HermesCLI:
if not getattr(self, "_app", None):
return self._prompt_text_input("Choice [1/2/3]: ")
# On Windows the prompt_toolkit input channel can deadlock when the
# modal is entered from the process_loop daemon thread — keystrokes
# never reach the key bindings, so response_queue.get() blocks for
# the full timeout (issue #30768). Fall back to the simpler
# stdin-based prompt which works reliably on Windows.
if sys.platform == "win32":
return self._prompt_text_input("Choice [1/2/3]: ")
# Mirror the thread-aware guard from _prompt_text_input (PR #23454):
# run_in_terminal and the modal queue both depend on the main-thread
# event loop. From a daemon thread the modal key bindings never fire.
if threading.current_thread() is not threading.main_thread():
return self._prompt_text_input("Choice [1/2/3]: ")
response_queue = queue.Queue()
self._capture_modal_input_snapshot()
self._slash_confirm_state = {
@@ -12122,22 +11939,9 @@ class HermesCLI:
pass
print("Resume this session with:")
# Session IDs are profile-constrained, so the resume hint must
# include `-p <profile>` for non-default profiles. Without this,
# copying the hint from a non-default profile fails to find the
# session on the next invocation. The "default" and "custom"
# profile names use the standard HERMES_HOME, so no -p needed.
try:
from hermes_cli.profiles import get_active_profile_name
_active_profile = get_active_profile_name()
except Exception:
_active_profile = "default"
profile_flag = (
"" if _active_profile in ("default", "custom") else f" -p {_active_profile}"
)
print(f" hermes --resume {self.session_id}{profile_flag}")
print(f" hermes --resume {self.session_id}")
if session_title:
print(f" hermes -c \"{session_title}\"{profile_flag}")
print(f" hermes -c \"{session_title}\"")
print()
print(f"Session: {self.session_id}")
if session_title:
@@ -13351,8 +13155,7 @@ class HermesCLI:
pasted_text = _sanitize_surrogates(pasted_text)
line_count = pasted_text.count('\n')
buf = event.current_buffer
threshold = self.config.get("paste_collapse_threshold", 5)
if threshold > 0 and line_count >= threshold and not buf.text.strip().startswith('/'):
if line_count >= 5 and not buf.text.strip().startswith('/'):
_paste_counter[0] += 1
paste_dir = _hermes_home / "pastes"
paste_dir.mkdir(parents=True, exist_ok=True)
@@ -13521,8 +13324,7 @@ class HermesCLI:
newlines_added = line_count - _prev_newline_count[0]
_prev_newline_count[0] = line_count
is_paste = chars_added > 1 or newlines_added >= 4
threshold = self.config.get("paste_collapse_threshold_fallback", 0)
if threshold > 0 and line_count >= threshold and is_paste and not text.startswith('/'):
if line_count >= 5 and is_paste and not text.startswith('/'):
_paste_counter[0] += 1
paste_dir = _hermes_home / "pastes"
paste_dir.mkdir(parents=True, exist_ok=True)
@@ -14259,10 +14061,6 @@ class HermesCLI:
except Exception:
pass
# Apply bracketed-paste timeout recovery so torn ESC[201~ end marks
# don't permanently freeze the input (issue #16263). Idempotent.
_apply_bracketed_paste_timeout_patch()
_original_on_resize = app._on_resize
def _resize_clear_ghosts():
@@ -14347,19 +14145,11 @@ class HermesCLI:
if not _file_drop and isinstance(user_input, str) and _looks_like_slash_command(user_input):
_cprint(f"\n⚙️ {user_input}")
try:
if not self.process_command(user_input):
self._should_exit = True
# Schedule app exit
if app.is_running:
app.exit()
except KeyboardInterrupt:
# Ctrl+C during a slow slash command (e.g. /skills browse,
# /sessions list with a large DB) should interrupt the
# command and return to the prompt, NOT exit the entire
# session. Without this guard a KeyboardInterrupt unwinds
# to the outer prompt_toolkit loop and the session dies.
_cprint("\n[dim]Command interrupted.[/dim]")
if not self.process_command(user_input):
self._should_exit = True
# Schedule app exit
if app.is_running:
app.exit()
continue
# Expand paste references back to full content
+2 -36
View File
@@ -45,28 +45,6 @@ _jobs_file_lock = threading.Lock()
OUTPUT_DIR = CRON_DIR / "output"
ONESHOT_GRACE_SECONDS = 120
# Fields on a cron job that must never change after creation. ``id`` is used
# as a filesystem path component under ``OUTPUT_DIR``; allowing it to be
# updated lets an unsafe value (``../escape``, absolute path, nested) leak
# into output writes/deletes.
_IMMUTABLE_JOB_FIELDS = frozenset({"id"})
def _job_output_dir(job_id: str) -> Path:
"""Resolve a job's output directory, rejecting any path-escape attempt.
Job IDs are filesystem path components under ``OUTPUT_DIR``. A legacy or
crafted ID containing ``..``, absolute paths, or nested separators would
allow output writes/deletes to escape the cron output sandbox. Reject
anything that isn't a single safe path component.
"""
text = str(job_id or "").strip()
if not text or text in {".", ".."} or "/" in text or "\\" in text:
raise ValueError(f"Invalid cron job id for output path: {job_id!r}")
if Path(text).is_absolute() or Path(text).drive:
raise ValueError(f"Invalid cron job id for output path: {job_id!r}")
return OUTPUT_DIR / text
def _normalize_skill_list(skill: Optional[str] = None, skills: Optional[Any] = None) -> List[str]:
"""Normalize legacy/single-skill and multi-skill inputs into a unique ordered list."""
@@ -750,15 +728,6 @@ def list_jobs(include_disabled: bool = False) -> List[Dict[str, Any]]:
def update_job(job_id: str, updates: Dict[str, Any]) -> Optional[Dict[str, Any]]:
"""Update a job by ID, refreshing derived schedule fields when needed."""
# Block mutation of immutable fields. ``id`` in particular is a filesystem
# path component under OUTPUT_DIR — letting an update change it leaks
# path-escape values into output writes/deletes.
bad_fields = _IMMUTABLE_JOB_FIELDS.intersection(updates or {})
if bad_fields:
raise ValueError(
f"Cron job field(s) cannot be updated: {', '.join(sorted(bad_fields))}"
)
jobs = load_jobs()
for i, job in enumerate(jobs):
if job["id"] != job_id:
@@ -876,12 +845,9 @@ def remove_job(job_id: str) -> bool:
original_len = len(jobs)
jobs = [j for j in jobs if j["id"] != canonical_id]
if len(jobs) < original_len:
# Resolve the output dir BEFORE saving so a legacy unsafe ID (e.g.
# left over from before the create-time guard) fails closed without
# half-applying the removal.
job_output_dir = _job_output_dir(canonical_id)
save_jobs(jobs)
# Clean up output directory to prevent orphaned dirs accumulating
job_output_dir = OUTPUT_DIR / canonical_id
if job_output_dir.exists():
shutil.rmtree(job_output_dir)
return True
@@ -1095,7 +1061,7 @@ def _get_due_jobs_locked() -> List[Dict[str, Any]]:
def save_job_output(job_id: str, output: str):
"""Save job output to file."""
ensure_dirs()
job_output_dir = _job_output_dir(job_id)
job_output_dir = OUTPUT_DIR / job_id
job_output_dir.mkdir(parents=True, exist_ok=True)
_secure_dir(job_output_dir)
+2 -55
View File
@@ -57,29 +57,6 @@ class CronPromptInjectionBlocked(Exception):
"""
def _resolve_cron_disabled_toolsets(cfg: dict) -> list[str]:
"""Toolsets a cron-spawned agent must never receive.
Three protected toolsets are always disabled in cron context:
- ``cronjob`` would let a cron-spawned agent schedule more cron jobs
- ``messaging`` interactive, needs a live gateway session
- ``clarify`` interactive, blocks waiting for user input
User-level ``agent.disabled_toolsets`` from config.yaml is layered on top
so per-job ``enabled_toolsets`` cannot bypass policy that applies to
ordinary agent runs (#25752 — LLM-supplied enabled_toolsets was widening
past config.yaml's denylist).
"""
disabled = ["cronjob", "messaging", "clarify"]
agent_cfg = (cfg or {}).get("agent") or {}
user_disabled = agent_cfg.get("disabled_toolsets") or []
for name in user_disabled:
name = str(name).strip()
if name and name not in disabled:
disabled.append(name)
return disabled
def _resolve_cron_enabled_toolsets(job: dict, cfg: dict) -> list[str] | None:
"""Resolve the toolset list for a cron job.
@@ -257,30 +234,6 @@ def _resolve_origin(job: dict) -> Optional[dict]:
return None
def _cron_job_origin_log_suffix(job: dict) -> str:
"""Return safe provenance details for security warnings about a cron job.
The scheduler normally has no live HTTP request object when it detects a
bad stored ``context_from`` reference. Including the job's saved origin
makes future probe logs actionable without exposing secrets: platform/chat
metadata for gateway-created jobs, and optional source-IP fields for API
surfaces that persist them in origin metadata.
"""
origin = job.get("origin")
if not isinstance(origin, dict):
return ""
fields = []
for key in ("platform", "chat_id", "thread_id", "source_ip", "remote", "forwarded_for"):
value = origin.get(key)
if value is None:
continue
text = str(value).replace("\r", " ").replace("\n", " ").strip()
if text:
fields.append(f"origin_{key}={text[:200]!r}")
return " " + " ".join(fields) if fields else ""
def _plugin_cron_env_var(platform_name: str) -> str:
"""Return the cron home-channel env var registered by a plugin platform.
@@ -1051,13 +1004,7 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
for source_job_id in context_from:
# Guard against path traversal — valid job IDs are 12-char hex strings
if not source_job_id or not all(c in "0123456789abcdef" for c in source_job_id):
logger.warning(
"context_from: skipping invalid job_id %r for job_id=%r name=%r%s",
source_job_id,
job.get("id"),
job.get("name"),
_cron_job_origin_log_suffix(job),
)
logger.warning("context_from: skipping invalid job_id %r", source_job_id)
continue
try:
job_output_dir = OUTPUT_DIR / source_job_id
@@ -1627,7 +1574,7 @@ def _run_job_impl(job: dict) -> tuple[bool, str, str, Optional[str]]:
provider_sort=pr.get("sort"),
openrouter_min_coding_score=(_cfg.get("openrouter") or {}).get("min_coding_score"),
enabled_toolsets=_resolve_cron_enabled_toolsets(job, _cfg),
disabled_toolsets=_resolve_cron_disabled_toolsets(_cfg),
disabled_toolsets=["cronjob", "messaging", "clarify"],
quiet_mode=True,
# Cron jobs should always inherit the user's SOUL.md identity from
# HERMES_HOME. When a workdir is configured, also inject project
-8
View File
@@ -111,14 +111,6 @@ seed_one ".env" ".env.example"
seed_one "config.yaml" "cli-config.yaml.example"
seed_one "SOUL.md" "docker/SOUL.md"
# .env holds API keys and secrets — restrict to owner-only access. Applied
# unconditionally (not only on first-seed) so a host-mounted .env that was
# created with a permissive umask gets tightened on every container start.
if [ -f "$HERMES_HOME/.env" ]; then
chown hermes:hermes "$HERMES_HOME/.env" 2>/dev/null || true
chmod 600 "$HERMES_HOME/.env" 2>/dev/null || true
fi
# auth.json: bootstrap from env on first boot only. Same semantics as the
# pre-s6 entrypoint — the [ ! -f ] guard is critical to avoid clobbering
# rotated refresh tokens on container restart.
+16 -2
View File
@@ -1089,8 +1089,22 @@ def load_gateway_config() -> GatewayConfig:
allowed = ",".join(str(v) for v in allowed)
os.environ["DINGTALK_ALLOWED_USERS"] = str(allowed)
# Mattermost config bridge moved into plugins/platforms/mattermost/
# adapter.py::_apply_yaml_config — see #25443 (apply_yaml_config_fn).
# Mattermost settings → env vars (env vars take precedence)
mattermost_cfg = yaml_cfg.get("mattermost", {})
if isinstance(mattermost_cfg, dict):
if "require_mention" in mattermost_cfg and not os.getenv("MATTERMOST_REQUIRE_MENTION"):
os.environ["MATTERMOST_REQUIRE_MENTION"] = str(mattermost_cfg["require_mention"]).lower()
frc = mattermost_cfg.get("free_response_channels")
if frc is not None and not os.getenv("MATTERMOST_FREE_RESPONSE_CHANNELS"):
if isinstance(frc, list):
frc = ",".join(str(v) for v in frc)
os.environ["MATTERMOST_FREE_RESPONSE_CHANNELS"] = str(frc)
# allowed_channels: if set, bot ONLY responds in these channels (whitelist)
ac = mattermost_cfg.get("allowed_channels")
if ac is not None and not os.getenv("MATTERMOST_ALLOWED_CHANNELS"):
if isinstance(ac, list):
ac = ",".join(str(v) for v in ac)
os.environ["MATTERMOST_ALLOWED_CHANNELS"] = str(ac)
# Matrix settings → env vars (env vars take precedence)
matrix_cfg = yaml_cfg.get("matrix", {})
-62
View File
@@ -763,58 +763,6 @@ class APIServerAdapter(BasePlatformAdapter):
return "*" in self._cors_origins or origin in self._cors_origins
@staticmethod
def _clean_log_value(value: Any, *, max_len: int = 200) -> str:
"""Sanitize request metadata before it reaches security logs."""
if value is None:
return ""
text = str(value).replace("\r", " ").replace("\n", " ").strip()
return text[:max_len]
def _request_audit_context(self, request: "web.Request") -> Dict[str, str]:
"""Return non-secret source metadata for security/audit warnings."""
peer_ip = ""
try:
peer = request.transport.get_extra_info("peername") if request.transport else None
if isinstance(peer, (tuple, list)) and peer:
peer_ip = str(peer[0])
except Exception:
peer_ip = ""
return {
"remote": self._clean_log_value(getattr(request, "remote", "") or peer_ip),
"peer_ip": self._clean_log_value(peer_ip),
"forwarded_for": self._clean_log_value(request.headers.get("X-Forwarded-For", "")),
"real_ip": self._clean_log_value(request.headers.get("X-Real-IP", "")),
"method": self._clean_log_value(request.method, max_len=16),
"path": self._clean_log_value(request.path_qs, max_len=500),
"user_agent": self._clean_log_value(request.headers.get("User-Agent", ""), max_len=300),
}
def _request_audit_log_suffix(self, request: "web.Request") -> str:
ctx = self._request_audit_context(request)
fields = [f"{key}={value!r}" for key, value in ctx.items() if value]
return " ".join(fields) if fields else "source='unknown'"
def _cron_origin_from_request(self, request: "web.Request") -> Dict[str, str]:
"""Persist safe API source metadata on cron jobs created over HTTP."""
ctx = self._request_audit_context(request)
origin = {
"platform": "api_server",
"chat_id": "api",
}
if ctx.get("remote"):
origin["source_ip"] = ctx["remote"]
if ctx.get("peer_ip"):
origin["peer_ip"] = ctx["peer_ip"]
if ctx.get("forwarded_for"):
origin["forwarded_for"] = ctx["forwarded_for"]
if ctx.get("real_ip"):
origin["real_ip"] = ctx["real_ip"]
if ctx.get("user_agent"):
origin["user_agent"] = ctx["user_agent"]
return origin
# ------------------------------------------------------------------
# Auth helper
# ------------------------------------------------------------------
@@ -836,10 +784,6 @@ class APIServerAdapter(BasePlatformAdapter):
if hmac.compare_digest(token, self._api_key):
return None # Auth OK
logger.warning(
"API server rejected invalid API key: %s",
self._request_audit_log_suffix(request),
)
return web.json_response(
{"error": {"message": "Invalid API key", "type": "invalid_request_error", "code": "invalid_api_key"}},
status=401,
@@ -2510,11 +2454,6 @@ class APIServerAdapter(BasePlatformAdapter):
"""Validate and extract job_id. Returns (job_id, error_response)."""
job_id = request.match_info["job_id"]
if not self._JOB_ID_RE.fullmatch(job_id):
logger.warning(
"Cron jobs API rejected invalid job_id %r: %s",
job_id,
self._request_audit_log_suffix(request),
)
return job_id, web.json_response(
{"error": "Invalid job ID format"}, status=400,
)
@@ -2572,7 +2511,6 @@ class APIServerAdapter(BasePlatformAdapter):
"schedule": schedule,
"name": name,
"deliver": deliver,
"origin": self._cron_origin_from_request(request),
}
if skills:
kwargs["skills"] = skills
-115
View File
@@ -827,8 +827,6 @@ DOCUMENT_CACHE_DIR = get_hermes_dir("cache/documents", "document_cache")
SCREENSHOT_CACHE_DIR = get_hermes_dir("cache/screenshots", "browser_screenshots")
_HERMES_HOME = get_hermes_home()
MEDIA_DELIVERY_ALLOW_DIRS_ENV = "HERMES_MEDIA_ALLOW_DIRS"
MEDIA_DELIVERY_TRUST_RECENT_ENV = "HERMES_MEDIA_TRUST_RECENT_FILES"
MEDIA_DELIVERY_TRUST_RECENT_SECONDS_ENV = "HERMES_MEDIA_TRUST_RECENT_SECONDS"
MEDIA_DELIVERY_SAFE_ROOTS = (
IMAGE_CACHE_DIR,
AUDIO_CACHE_DIR,
@@ -842,48 +840,6 @@ MEDIA_DELIVERY_SAFE_ROOTS = (
_HERMES_HOME / "browser_screenshots",
)
# Default recency window for trusting freshly-produced files (seconds).
# The agent's actual work generally completes well inside 10 minutes; legitimate
# build artifacts (PDFs from pandoc, plots from matplotlib, etc.) almost always
# land seconds before delivery. Old system files (/etc/passwd, ~/.ssh/id_rsa,
# stray credentials) have mtimes measured in days or months — well outside this
# window — so prompt-injection paths pointing at pre-existing host files are
# still rejected.
_MEDIA_DELIVERY_TRUST_RECENT_DEFAULT_SECONDS = 600
# Hard denylist applied even when a path would otherwise pass recency trust.
# These prefixes hold credentials, system state, or process introspection that
# should never be uploaded as a gateway attachment, regardless of how new the
# file looks. The cache-dir allowlist still beats this — an operator-configured
# allowed root can intentionally live under one of these prefixes (rare, but
# their choice).
_MEDIA_DELIVERY_DENIED_PREFIXES = (
"/etc",
"/proc",
"/sys",
"/dev",
"/root",
"/boot",
"/var/log",
"/var/lib",
"/var/run",
)
# Within $HOME we additionally deny common credential / config directories.
# Resolved at check time against the live $HOME so containers and alt-home
# setups work correctly.
_MEDIA_DELIVERY_DENIED_HOME_SUBPATHS = (
".ssh",
".aws",
".gnupg",
".kube",
".docker",
".config",
".azure",
".gcloud",
"Library/Keychains", # macOS
)
def _media_delivery_allowed_roots() -> List[Path]:
"""Return roots from which model-emitted local media may be delivered."""
@@ -900,67 +856,6 @@ def _media_delivery_allowed_roots() -> List[Path]:
return roots
def _media_delivery_recency_seconds() -> float:
"""Return the recency window for trusting freshly-produced files.
0 disables recency-based trust entirely (pure-allowlist mode).
"""
raw = os.environ.get(MEDIA_DELIVERY_TRUST_RECENT_ENV, "1").strip().lower()
if raw in ("0", "false", "no", "off", ""):
return 0.0
try:
custom = os.environ.get(MEDIA_DELIVERY_TRUST_RECENT_SECONDS_ENV, "").strip()
if custom:
seconds = float(custom)
return max(0.0, seconds)
except (TypeError, ValueError):
pass
return float(_MEDIA_DELIVERY_TRUST_RECENT_DEFAULT_SECONDS)
def _media_delivery_denied_paths() -> List[Path]:
"""Return absolute denylist paths under which delivery is never allowed."""
denied = [Path(p) for p in _MEDIA_DELIVERY_DENIED_PREFIXES]
home = Path(os.path.expanduser("~"))
for sub in _MEDIA_DELIVERY_DENIED_HOME_SUBPATHS:
denied.append(home / sub)
# The Hermes home itself contains credentials (auth.json, .env) — only the
# cache subdirectories under it are explicitly allowlisted above.
denied.append(_HERMES_HOME / ".env")
denied.append(_HERMES_HOME / "auth.json")
denied.append(_HERMES_HOME / "credentials")
return denied
def _path_under_denied_prefix(resolved: Path) -> bool:
"""Return True if ``resolved`` lives under a deny-listed system path."""
for denied in _media_delivery_denied_paths():
try:
resolved_denied = denied.expanduser().resolve(strict=False)
except (OSError, RuntimeError, ValueError):
continue
if _path_is_within(resolved, resolved_denied) or resolved == resolved_denied:
return True
return False
def _file_is_recently_produced(resolved: Path, window_seconds: float) -> bool:
"""Return True if the file's mtime is within ``window_seconds`` of now.
Used as a session-scoped trust signal: agents almost always produce
delivery artifacts within seconds of asking to send them, while
prompt-injection paths pointing at pre-existing host files (/etc/passwd,
~/.ssh/id_rsa) have mtimes measured in days or months.
"""
if window_seconds <= 0:
return False
try:
mtime = resolved.stat().st_mtime
except OSError:
return False
return (time.time() - mtime) <= window_seconds
def _path_is_within(path: Path, root: Path) -> bool:
try:
path.relative_to(root)
@@ -1007,16 +902,6 @@ def validate_media_delivery_path(path: str) -> Optional[str]:
if _path_is_within(resolved, resolved_root):
return str(resolved)
# Outside the cache/operator allowlist: fall back to recency-based trust
# for files the agent has just produced (e.g. ``pandoc -o /tmp/report.pdf``
# or ``write_file("/home/user/report.pdf", ...)``). System paths and
# credential locations remain blocked even when "recent" — see
# ``_MEDIA_DELIVERY_DENIED_PREFIXES`` for the denylist.
window = _media_delivery_recency_seconds()
if window > 0 and not _path_under_denied_prefix(resolved):
if _file_is_recently_produced(resolved, window):
return str(resolved)
return None
@@ -871,322 +871,3 @@ class MattermostAdapter(BasePlatformAdapter):
await self.handle_message(msg_event)
# ---------------------------------------------------------------------------
# Plugin standalone-send (out-of-process cron delivery via Mattermost REST)
# ---------------------------------------------------------------------------
async def _standalone_send(
pconfig,
chat_id: str,
message: str,
*,
thread_id: Optional[str] = None,
media_files: Optional[list] = None,
force_document: bool = False,
) -> Dict[str, Any]:
"""Send via the Mattermost v4 REST API without a live gateway adapter.
Used by ``tools/send_message_tool._send_via_adapter`` when the gateway
runner is not in this process (typical for cron jobs running out-of-process).
Reads ``MATTERMOST_TOKEN`` from ``pconfig.token`` (set by the gateway
config loader from env) and falls back to the ``MATTERMOST_TOKEN`` env
var. Server URL comes from ``pconfig.extra["url"]`` (set by the YAML
bridge / env loader) or the ``MATTERMOST_URL`` env var.
Thread replies (Mattermost CRT) are supported via the ``root_id`` field
on the ``POST /posts`` payload pass ``thread_id`` when threading is
desired. ``media_files`` are uploaded via ``POST /files``
(multipart/form-data), then their returned ``file_id`` values are
attached to the post.
``force_document`` is accepted for signature parity with other
standalone senders but unused Mattermost stores every uploaded file
as a generic attachment regardless.
"""
try:
import aiohttp
except ImportError:
return {"error": "aiohttp not installed. Run: pip install aiohttp"}
base_url = (
(getattr(pconfig, "extra", {}) or {}).get("url")
or os.getenv("MATTERMOST_URL", "")
).rstrip("/")
token = (getattr(pconfig, "token", None) or os.getenv("MATTERMOST_TOKEN", "")).strip()
if not base_url or not token:
return {
"error": (
"Mattermost standalone send: MATTERMOST_URL and "
"MATTERMOST_TOKEN must both be set"
)
}
headers = {
"Authorization": f"Bearer {token}",
"Content-Type": "application/json",
}
upload_headers = {"Authorization": f"Bearer {token}"}
media_files = media_files or []
try:
# Resolve proxy + session kwargs once so a single ClientSession can
# cover the optional file uploads + final post.
from gateway.platforms.base import resolve_proxy_url, proxy_kwargs_for_aiohttp
_proxy = resolve_proxy_url(platform_env_var="MATTERMOST_PROXY")
_sess_kw, _req_kw = proxy_kwargs_for_aiohttp(_proxy)
async with aiohttp.ClientSession(
timeout=aiohttp.ClientTimeout(total=60),
**_sess_kw,
) as session:
# 1. Upload media (if any) and collect file_ids.
file_ids: List[str] = []
for media in media_files:
file_path = media.get("path") if isinstance(media, dict) else media
if not file_path or not os.path.exists(file_path):
continue
form = aiohttp.FormData()
# Mattermost requires channel_id on file uploads so the
# server can attribute them.
form.add_field("channel_id", chat_id)
with open(file_path, "rb") as fh:
form.add_field(
"files",
fh.read(),
filename=os.path.basename(file_path),
)
async with session.post(
f"{base_url}/api/v4/files",
data=form,
headers=upload_headers,
**_req_kw,
) as upload_resp:
if upload_resp.status not in {200, 201}:
body = await upload_resp.text()
return {
"error": (
f"Mattermost file upload failed "
f"({upload_resp.status}): {body[:400]}"
)
}
upload_data = await upload_resp.json()
for info in upload_data.get("file_infos", []):
if info.get("id"):
file_ids.append(info["id"])
# 2. Post the message (with thread root + attached file_ids).
payload: Dict[str, Any] = {
"channel_id": chat_id,
"message": message,
}
if thread_id:
payload["root_id"] = thread_id
if file_ids:
payload["file_ids"] = file_ids
async with session.post(
f"{base_url}/api/v4/posts",
headers=headers,
json=payload,
**_req_kw,
) as resp:
if resp.status not in {200, 201}:
body = await resp.text()
return {
"error": (
f"Mattermost API error ({resp.status}): "
f"{body[:400]}"
)
}
data = await resp.json()
return {
"success": True,
"platform": "mattermost",
"chat_id": chat_id,
"message_id": data.get("id"),
}
except aiohttp.ClientError as exc:
return {"error": f"Mattermost send failed (network): {exc}"}
except Exception as exc: # noqa: BLE001
return {"error": f"Mattermost send failed: {exc}"}
# ---------------------------------------------------------------------------
# Interactive setup wizard
# ---------------------------------------------------------------------------
def interactive_setup() -> None:
"""Guide the user through Mattermost bot setup.
Mirrors Discord/Teams' ``interactive_setup`` shape: lazy-imports CLI
helpers so the plugin's import surface stays small, prompts for the
server URL + bot token, captures an allowlist, and offers to set a
home channel. Replaces the central
``hermes_cli/setup.py::_setup_mattermost`` function this migration
removes.
"""
from hermes_cli.config import get_env_value, save_env_value
from hermes_cli.cli_output import (
prompt,
prompt_yes_no,
print_header,
print_info,
print_success,
)
print_header("Mattermost")
existing = get_env_value("MATTERMOST_TOKEN")
if existing:
print_info("Mattermost: already configured")
if not prompt_yes_no("Reconfigure Mattermost?", False):
return
print_info("Works with any self-hosted Mattermost instance.")
print_info(" 1. In Mattermost: Integrations → Bot Accounts → Add Bot Account")
print_info(" 2. Copy the bot token")
print()
mm_url = prompt("Mattermost server URL (e.g. https://mm.example.com)")
if mm_url:
save_env_value("MATTERMOST_URL", mm_url.rstrip("/"))
token = prompt("Bot token", password=True)
if not token:
return
save_env_value("MATTERMOST_TOKEN", token)
print_success("Mattermost token saved")
print()
print_info("🔒 Security: Restrict who can use your bot")
print_info(" To find your user ID: click your avatar → Profile")
print_info(" or use the API: GET /api/v4/users/me")
print()
allowed_users = prompt("Allowed user IDs (comma-separated, leave empty for open access)")
if allowed_users:
save_env_value("MATTERMOST_ALLOWED_USERS", allowed_users.replace(" ", ""))
print_success("Mattermost allowlist configured")
else:
print_info("⚠️ No allowlist set - anyone who can message the bot can use it!")
print()
print_info("📬 Home Channel: where Hermes delivers cron job results and notifications.")
print_info(" To get a channel ID: click channel name → View Info → copy the ID")
print_info(" You can also set this later by typing /set-home in a Mattermost channel.")
home_channel = prompt("Home channel ID (leave empty to set later with /set-home)")
if home_channel:
save_env_value("MATTERMOST_HOME_CHANNEL", home_channel)
print_info(" Open config in your editor: hermes config edit")
# ---------------------------------------------------------------------------
# YAML → env config bridge (apply_yaml_config_fn, #25443)
# ---------------------------------------------------------------------------
def _apply_yaml_config(yaml_cfg: dict, mattermost_cfg: dict) -> dict | None:
"""Translate ``config.yaml`` ``mattermost:`` keys into env vars.
Implements the ``apply_yaml_config_fn`` contract (#24836 / #25443).
Mirrors the legacy ``mattermost_cfg`` block that used to live in
``gateway/config.py::load_gateway_config()`` before this migration.
The MattermostAdapter reads its runtime configuration via
``os.getenv()`` for ``MATTERMOST_REQUIRE_MENTION``,
``MATTERMOST_FREE_RESPONSE_CHANNELS``, and
``MATTERMOST_ALLOWED_CHANNELS``. Rather than rewrite those call sites
to read from ``PlatformConfig.extra``, this hook keeps the env-driven
model and merely owns the YAMLenv translation here, next to the
adapter that consumes it.
Env vars take precedence over YAML every assignment is guarded
by ``not os.getenv(...)`` so an explicit env var survives a config.yaml
update. Returns ``None`` because no extras are seeded into
``PlatformConfig.extra`` directly (everything flows through env).
"""
if "require_mention" in mattermost_cfg and not os.getenv("MATTERMOST_REQUIRE_MENTION"):
os.environ["MATTERMOST_REQUIRE_MENTION"] = str(mattermost_cfg["require_mention"]).lower()
frc = mattermost_cfg.get("free_response_channels")
if frc is not None and not os.getenv("MATTERMOST_FREE_RESPONSE_CHANNELS"):
if isinstance(frc, list):
frc = ",".join(str(v) for v in frc)
os.environ["MATTERMOST_FREE_RESPONSE_CHANNELS"] = str(frc)
# allowed_channels: if set, bot ONLY responds in these channels (whitelist)
ac = mattermost_cfg.get("allowed_channels")
if ac is not None and not os.getenv("MATTERMOST_ALLOWED_CHANNELS"):
if isinstance(ac, list):
ac = ",".join(str(v) for v in ac)
os.environ["MATTERMOST_ALLOWED_CHANNELS"] = str(ac)
return None # all settings flow through env; nothing to merge into extras
# ---------------------------------------------------------------------------
# is_connected probe
# ---------------------------------------------------------------------------
def _is_connected(config) -> bool:
"""Mattermost is considered connected when BOTH MATTERMOST_TOKEN and
MATTERMOST_URL are set.
Looks up via ``hermes_cli.gateway.get_env_value`` at call time (not via
the plugin's own bound import) so tests that patch
``gateway_mod.get_env_value`` can suppress ambient env vars. Matches
what the legacy connected-platforms check did before this migration.
"""
import hermes_cli.gateway as gateway_mod
return bool(
(gateway_mod.get_env_value("MATTERMOST_TOKEN") or "").strip()
and (gateway_mod.get_env_value("MATTERMOST_URL") or "").strip()
)
# ---------------------------------------------------------------------------
# Plugin registration entry point
# ---------------------------------------------------------------------------
def _build_adapter(config):
"""Factory wrapper that constructs MattermostAdapter from a PlatformConfig."""
return MattermostAdapter(config)
def register(ctx) -> None:
"""Plugin entry point — called by the Hermes plugin system."""
ctx.register_platform(
name="mattermost",
label="Mattermost",
adapter_factory=_build_adapter,
check_fn=check_mattermost_requirements,
is_connected=_is_connected,
required_env=["MATTERMOST_URL", "MATTERMOST_TOKEN"],
install_hint="pip install aiohttp",
# Interactive setup wizard — replaces the central
# hermes_cli/setup.py::_setup_mattermost function.
setup_fn=interactive_setup,
# YAML→env config bridge — owns the translation of
# ``config.yaml`` ``mattermost:`` keys (require_mention,
# free_response_channels, allowed_channels) into ``MATTERMOST_*``
# env vars that the adapter reads via ``os.getenv()``. Replaces
# the hardcoded block that used to live in ``gateway/config.py``.
# Hook contract: #24836 / #25443.
apply_yaml_config_fn=_apply_yaml_config,
# Auth env vars for _is_user_authorized() integration.
allowed_users_env="MATTERMOST_ALLOWED_USERS",
allow_all_env="MATTERMOST_ALLOW_ALL_USERS",
# Cron home-channel delivery.
cron_deliver_env_var="MATTERMOST_HOME_CHANNEL",
# Out-of-process cron delivery via Mattermost REST API. Without
# this hook, ``deliver=mattermost`` cron jobs fail with "No live
# adapter" when cron runs separately from the gateway. Mirrors
# the Discord / Teams pattern.
standalone_sender_fn=_standalone_send,
# Mattermost practical post-length limit (server default is 16383
# but 4000 is the readable threshold the adapter has used since
# day one).
max_message_length=MAX_POST_LENGTH,
# Display
emoji="💬",
allow_update_command=True,
)
+8 -120
View File
@@ -932,27 +932,6 @@ if _config_path.exists():
_redact = _security_cfg.get("redact_secrets")
if _redact is not None:
os.environ["HERMES_REDACT_SECRETS"] = str(_redact).lower()
# Gateway settings (media delivery allowlist + recency trust)
_gateway_cfg = _cfg.get("gateway", {})
if isinstance(_gateway_cfg, dict):
_allow_dirs = _gateway_cfg.get("media_delivery_allow_dirs")
if _allow_dirs:
if isinstance(_allow_dirs, str):
_allow_dirs_str = _allow_dirs
elif isinstance(_allow_dirs, (list, tuple)):
_allow_dirs_str = os.pathsep.join(str(p) for p in _allow_dirs if p)
else:
_allow_dirs_str = ""
if _allow_dirs_str:
os.environ["HERMES_MEDIA_ALLOW_DIRS"] = _allow_dirs_str
_trust_recent = _gateway_cfg.get("trust_recent_files")
if _trust_recent is not None:
os.environ["HERMES_MEDIA_TRUST_RECENT_FILES"] = (
"1" if _trust_recent else "0"
)
_trust_recent_seconds = _gateway_cfg.get("trust_recent_files_seconds")
if _trust_recent_seconds is not None:
os.environ["HERMES_MEDIA_TRUST_RECENT_SECONDS"] = str(_trust_recent_seconds)
except Exception as _bridge_err:
# Previously this was silent (`except Exception: pass`), which
# hid partial bridge failures and let .env defaults shadow
@@ -3034,44 +3013,6 @@ class GatewayRunner:
if agent is not _AGENT_PENDING_SENTINEL
}
@staticmethod
def _agent_has_active_subagents(running_agent: Any) -> bool:
"""Return True when *running_agent* is currently driving subagents
via the ``delegate_task`` tool.
Background (#30170): ``AIAgent.interrupt()`` cascades through the
parent's ``_active_children`` list and calls ``interrupt()`` on
every child synchronously, which aborts in-flight subagent work
and produces a fallback cascade with no actionable signal.
Demoting ``busy_input_mode='interrupt'`` to ``queue`` semantics
whenever this helper returns True protects subagent work from
conversational follow-ups while leaving the explicit ``/stop``
path (which goes through ``_interrupt_and_clear_session``)
untouched. Safe-by-default: returns False on any attribute or
lock error so a missing/broken parent never blocks the existing
interrupt path.
"""
if running_agent is None or running_agent is _AGENT_PENDING_SENTINEL:
return False
children = getattr(running_agent, "_active_children", None)
# AIAgent always initialises this as a concrete list (see
# agent/agent_init.py). Reject anything that isn't a real
# collection — this guards against ``MagicMock()._active_children``
# auto-creating a truthy stub in tests and triggering the demotion
# against an agent that doesn't actually have subagents.
if not isinstance(children, (list, tuple, set)):
return False
if not children:
return False
lock = getattr(running_agent, "_active_children_lock", None)
try:
if lock is not None:
with lock:
return bool(children)
return bool(children)
except Exception:
return False
def _queue_or_replace_pending_event(self, session_key: str, event: MessageEvent) -> None:
adapter = self.adapters.get(event.source.platform)
if not adapter:
@@ -3143,25 +3084,6 @@ class GatewayRunner:
# queueing + interrupting. If the agent isn't running yet
# (sentinel) or lacks steer(), or the payload is empty, fall back
# to queue semantics so nothing is lost.
# #30170 — Subagent protection. ``AIAgent.interrupt()`` cascades
# to every entry in the parent's ``_active_children`` list and
# aborts in-flight ``delegate_task`` work. Demote ``interrupt``
# to ``queue`` when the parent is currently driving subagents so
# a conversational follow-up doesn't destroy minutes of subagent
# work. Explicit ``/stop`` and ``/new`` slash commands go through
# ``_interrupt_and_clear_session`` and are unaffected — the
# operator still has a way to force-cancel everything.
demoted_for_subagents = (
effective_mode == "interrupt"
and self._agent_has_active_subagents(running_agent)
)
if demoted_for_subagents:
logger.info(
"Demoting busy_input_mode 'interrupt' to 'queue' for session %s "
"because the running agent has active subagents (#30170)",
session_key,
)
effective_mode = "queue"
steered = False
if effective_mode == "steer":
steer_text = (event.text or "").strip()
@@ -3249,14 +3171,6 @@ class GatewayRunner:
f"⏩ Steered into current run{status_detail}. "
f"Your message arrives after the next tool call."
)
elif is_queue_mode and demoted_for_subagents:
# #30170 — explain the demotion so the user knows their
# follow-up didn't accidentally kill the subagent and
# discovers `/stop` as the explicit escape hatch.
message = (
f"⏳ Subagent working{status_detail} — your message is queued for "
f"when it finishes (use /stop to cancel everything)."
)
elif is_queue_mode:
message = (
f"⏳ Queued for the next turn{status_detail}. "
@@ -6312,6 +6226,13 @@ class GatewayRunner:
return None
return WeixinAdapter(config)
elif platform == Platform.MATTERMOST:
from gateway.platforms.mattermost import MattermostAdapter, check_mattermost_requirements
if not check_mattermost_requirements():
logger.warning("Mattermost: MATTERMOST_TOKEN or MATTERMOST_URL not set, or aiohttp missing")
return None
return MattermostAdapter(config)
elif platform == Platform.MATRIX:
from gateway.platforms.matrix import MatrixAdapter, check_matrix_requirements
if not check_matrix_requirements():
@@ -7311,22 +7232,6 @@ class GatewayRunner:
logger.debug("PRIORITY steer-fallback-to-queue for session %s", _quick_key)
self._queue_or_replace_pending_event(_quick_key, event)
return None
# #30170 — Subagent protection (PRIORITY path). Same rationale
# as ``_handle_active_session_busy_message``: an interrupt
# cascades through ``_active_children`` and aborts in-flight
# delegate_task work. Demote to queue semantics when the
# parent is currently driving subagents so a conversational
# follow-up doesn't destroy minutes of subagent progress.
# /stop reaches its dedicated handler above, so the operator
# still has a clean escape hatch.
if self._agent_has_active_subagents(running_agent):
logger.info(
"PRIORITY interrupt demoted to queue for session %s "
"because the running agent has active subagents (#30170)",
_quick_key,
)
self._queue_or_replace_pending_event(_quick_key, event)
return None
logger.debug("PRIORITY interrupt for session %s", _quick_key)
running_agent.interrupt(event.text)
# NOTE: self._pending_messages was write-only (never consumed).
@@ -8794,7 +8699,6 @@ class GatewayRunner:
# session_entry so transcript writes below go to the right session.
if agent_result.get("session_id") and agent_result["session_id"] != session_entry.session_id:
session_entry.session_id = agent_result["session_id"]
self.session_store._save()
# Prepend reasoning/thinking if display is enabled (per-platform)
try:
@@ -12846,16 +12750,6 @@ class GatewayRunner:
session_key = self._session_key_for_source(source)
name = event.get_command_args().strip()
# Strip common outer brackets/quotes users may type literally from the
# usage hint (e.g. ``/resume <abc123>``). Mirrors the CLI behavior.
if len(name) >= 2 and (
(name[0] == "<" and name[-1] == ">")
or (name[0] == "[" and name[-1] == "]")
or (name[0] == '"' and name[-1] == '"')
or (name[0] == "'" and name[-1] == "'")
):
name = name[1:-1].strip()
def _list_titled_sessions() -> list[dict]:
user_source = source.platform.value if source.platform else None
sessions = self._session_db.list_sessions_rich(source=user_source, limit=10)
@@ -12893,13 +12787,7 @@ class GatewayRunner:
target_id = target.get("id")
name = target.get("title") or name
else:
# Try direct session ID lookup first (so `/resume <session_id>`
# works in the gateway, not just `/resume <title>`).
session = self._session_db.get_session(name)
if session:
target_id = session["id"]
else:
target_id = self._session_db.resolve_session_by_title(name)
target_id = self._session_db.resolve_session_by_title(name)
if not target_id:
return t("gateway.resume.not_found", name=name)
# Compression creates child continuations that hold the live transcript.
+5 -60
View File
@@ -49,7 +49,6 @@ import yaml
from hermes_cli.config import get_hermes_home, get_config_path, read_raw_config
from hermes_constants import OPENROUTER_BASE_URL, secure_parent_dir
from agent.credential_persistence import sanitize_borrowed_credential_payload
from utils import atomic_replace, atomic_yaml_write, is_truthy_value
logger = logging.getLogger(__name__)
@@ -197,17 +196,9 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
auth_type="oauth_external",
inference_base_url=DEFAULT_CODEX_BASE_URL,
),
"openai-api": ProviderConfig(
id="openai-api",
name="OpenAI API",
auth_type="api_key",
inference_base_url="https://api.openai.com/v1",
api_key_env_vars=("OPENAI_API_KEY",),
base_url_env_var="OPENAI_BASE_URL",
),
"xai-oauth": ProviderConfig(
id="xai-oauth",
name="xAI Grok OAuth (SuperGrok / Premium+)",
name="xAI Grok OAuth (SuperGrok Subscription)",
auth_type="oauth_external",
inference_base_url=DEFAULT_XAI_OAUTH_BASE_URL,
),
@@ -1177,23 +1168,14 @@ def read_credential_pool(provider_id: Optional[str] = None) -> Dict[str, Any]:
def write_credential_pool(provider_id: str, entries: List[Dict[str, Any]]) -> Path:
"""Persist one provider's credential pool under auth.json.
This is the final disk-boundary guard for borrowed/reference-only
credentials. Callers may pass raw dictionaries, so sanitize here even when
``PooledCredential.to_dict()`` already did the same work upstream.
"""
"""Persist one provider's credential pool under auth.json."""
with _auth_store_lock():
auth_store = _load_auth_store()
pool = auth_store.get("credential_pool")
if not isinstance(pool, dict):
pool = {}
auth_store["credential_pool"] = pool
pool[provider_id] = [
sanitize_borrowed_credential_payload(entry, provider_id)
if isinstance(entry, dict) else entry
for entry in entries
]
pool[provider_id] = list(entries)
return _save_auth_store(auth_store)
@@ -2488,32 +2470,6 @@ def _make_xai_callback_handler(expected_path: str) -> tuple[type[BaseHTTPRequest
"error_description": params.get("error_description", [None])[0],
}
# Diagnostic logging — emits at INFO so reporters of loopback bugs
# (#27385 — "callback received but Hermes times out") can produce
# actionable evidence without a code change. Logged values are
# fingerprints / booleans only; no actual code/state strings leak
# into the log file. Run with ``HERMES_LOG_LEVEL=INFO`` (or check
# ``~/.hermes/logs/agent.log`` which captures INFO+ unconditionally).
try:
logger.info(
"xAI loopback callback received: path=%s has_code=%s has_state=%s has_error=%s "
"ua=%s",
parsed.path,
incoming["code"] is not None,
incoming["state"] is not None,
incoming["error"] is not None,
(self.headers.get("User-Agent") or "")[:80],
)
if incoming["error"]:
logger.info(
"xAI loopback callback carries error=%s error_description=%s",
incoming["error"],
(incoming["error_description"] or "")[:200],
)
except Exception:
# Logging must never break the OAuth flow.
pass
# Treat a hit on the callback path with neither `code` nor `error`
# as a missing OAuth callback (e.g. xAI's auth backend failed to
# redirect and the user navigated to the bare loopback URL by hand).
@@ -2618,17 +2574,6 @@ def _xai_wait_for_callback(
server.shutdown()
server.server_close()
thread.join(timeout=1.0)
# Diagnostic: distinguish "no callback ever arrived" from "callback
# arrived but result wasn't populated" (#27385). The per-hit handler
# also logs at INFO; if neither line appears, xAI's IDP never reached
# the loopback at all (firewall, port-binding, IPv6/IPv4 mismatch).
logger.info(
"xAI loopback wait timed out after %.0fs with no usable callback "
"(result.code=%s result.error=%s)",
max(5.0, timeout_seconds),
result["code"] is not None,
result["error"] is not None,
)
raise AuthError(
"xAI authorization timed out waiting for the local callback.",
provider="xai-oauth",
@@ -3462,7 +3407,7 @@ def _read_xai_oauth_tokens(*, _lock: bool = True) -> Dict[str, Any]:
state = _load_provider_state(auth_store, "xai-oauth")
if not state:
raise AuthError(
"No xAI OAuth credentials stored. Select xAI Grok OAuth (SuperGrok / Premium+) in `hermes model`.",
"No xAI OAuth credentials stored. Select xAI Grok OAuth (SuperGrok Subscription) in `hermes model`.",
provider="xai-oauth",
code="xai_auth_missing",
relogin_required=True,
@@ -6393,7 +6338,7 @@ def _login_xai_oauth(
pass
print()
print("Signing in to xAI Grok OAuth (SuperGrok / Premium+)...")
print("Signing in to xAI Grok OAuth (SuperGrok Subscription)...")
print("(Hermes creates its own local OAuth session)")
print()
+2 -2
View File
@@ -2,6 +2,7 @@
from __future__ import annotations
from getpass import getpass
import math
import sys
import time
@@ -29,7 +30,6 @@ from agent.credential_pool import (
import hermes_cli.auth as auth_mod
from hermes_cli.auth import PROVIDER_REGISTRY
from hermes_constants import OPENROUTER_BASE_URL
from hermes_cli.secret_prompt import masked_secret_prompt
# Providers that support OAuth login in addition to API keys.
@@ -196,7 +196,7 @@ def auth_add_command(args) -> None:
if requested_type == AUTH_TYPE_API_KEY:
token = (getattr(args, "api_key", None) or "").strip()
if not token:
token = masked_secret_prompt("Paste your API key: ").strip()
token = getpass("Paste your API key: ").strip()
if not token:
raise SystemExit("No API key provided.")
default_label = _api_key_default_label(len(pool.entries()) + 1)
+16 -18
View File
@@ -85,22 +85,6 @@ def _should_exclude(rel_path: Path) -> bool:
return False
def _should_skip_backup_file(abs_path: Path, rel_path: Path, out_path: Path) -> bool:
"""Return True when a candidate file should not be written to a backup zip."""
if _should_exclude(rel_path):
return True
# zipfile.write() follows file symlinks, so skip links before any archive
# write can copy data from outside HERMES_HOME.
if abs_path.is_symlink():
return True
try:
return abs_path.resolve() == out_path.resolve()
except (OSError, ValueError):
return False
# ---------------------------------------------------------------------------
# SQLite safe copy
# ---------------------------------------------------------------------------
@@ -189,9 +173,16 @@ def run_backup(args) -> None:
fpath = dp / fname
rel = fpath.relative_to(hermes_root)
if _should_skip_backup_file(fpath, rel, out_path):
if _should_exclude(rel):
continue
# Skip the output zip itself if it happens to be inside hermes root
try:
if fpath.resolve() == out_path.resolve():
continue
except (OSError, ValueError):
pass
files_to_add.append((fpath, rel))
if not files_to_add:
@@ -735,9 +726,16 @@ def _write_full_zip_backup(out_path: Path, hermes_root: Path) -> Optional[Path]:
except ValueError:
continue
if _should_skip_backup_file(fpath, rel, out_path):
if _should_exclude(rel):
continue
# Skip the output zip itself if it already exists inside root.
try:
if fpath.resolve() == out_path.resolve():
continue
except (OSError, ValueError):
pass
files_to_add.append((fpath, rel))
except OSError as exc:
logger.warning("Full-zip backup: walk failed: %s", exc)
+2 -2
View File
@@ -8,10 +8,10 @@ with the TUI.
import queue
import time as _time
import getpass
from hermes_cli.banner import cprint, _DIM, _RST
from hermes_cli.config import save_env_value_secure
from hermes_cli.secret_prompt import masked_secret_prompt
from hermes_constants import display_hermes_home
@@ -75,7 +75,7 @@ def prompt_for_secret(cli, var_name: str, prompt: str, metadata=None) -> dict:
if not hasattr(cli, "_secret_deadline"):
cli._secret_deadline = 0
try:
value = masked_secret_prompt(f"{prompt} (hidden, ESC or empty Enter to skip): ")
value = getpass.getpass(f"{prompt} (hidden, ESC or empty Enter to skip): ")
except (EOFError, KeyboardInterrupt):
value = ""
+3 -2
View File
@@ -5,8 +5,9 @@ functions previously duplicated across setup.py, tools_config.py,
mcp_config.py, and memory_setup.py.
"""
import getpass
from hermes_cli.colors import Colors, color
from hermes_cli.secret_prompt import masked_secret_prompt
# ─── Print Helpers ────────────────────────────────────────────────────────────
@@ -58,7 +59,7 @@ def prompt(
try:
if password:
value = masked_secret_prompt(display)
value = getpass.getpass(display)
else:
value = input(display)
value = value.strip()
+5 -43
View File
@@ -26,8 +26,6 @@ from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Any, Optional, List, Tuple
from hermes_cli.secret_prompt import masked_secret_prompt
logger = logging.getLogger(__name__)
# Track which (config_path, mtime_ns, size) tuples we've already warned about
@@ -1638,31 +1636,6 @@ DEFAULT_CONFIG = {
"force_ipv4": False,
},
# Gateway settings — control how messaging platforms (Telegram, Discord,
# Slack, etc.) deliver agent-produced files as native attachments.
"gateway": {
# Extra directories from which model-emitted bare file paths may be
# uploaded as native gateway attachments. Files inside the Hermes
# cache (~/.hermes/cache/{documents,images,audio,video,screenshots})
# are always trusted; this list adds operator-controlled roots
# (project dirs, scratch dirs, mounted shares). Accepts a list of
# absolute paths or a single os.pathsep-separated string. Bridged
# to HERMES_MEDIA_ALLOW_DIRS at gateway startup. Tilde paths are
# expanded.
"media_delivery_allow_dirs": [],
# When true, files whose mtime is within ``trust_recent_files_seconds``
# of "now" are trusted for native delivery even outside the cache /
# operator allowlist — useful for ``pandoc -o /tmp/report.pdf`` or
# PDFs the agent writes into a working directory. System paths
# (/etc, /proc, ~/.ssh, ~/.aws, etc.) remain blocked regardless.
# Disable to fall back to pure-allowlist mode. Bridged to
# HERMES_MEDIA_TRUST_RECENT_FILES.
"trust_recent_files": True,
# Recency window in seconds. 600 (10 min) comfortably covers a
# multi-tool agent turn. Bridged to HERMES_MEDIA_TRUST_RECENT_SECONDS.
"trust_recent_files_seconds": 600,
},
# Session storage — controls automatic cleanup of ~/.hermes/state.db.
# state.db accumulates every session, message, tool call, and FTS5 index
# entry forever. Without auto-pruning, a heavy user (gateway + cron)
@@ -1771,7 +1744,6 @@ DEFAULT_CONFIG = {
"servers": {},
},
# X (Twitter) Search via xAI's built-in x_search Responses tool.
# The tool registers when xAI credentials are available (SuperGrok
# OAuth or XAI_API_KEY) AND the x_search toolset is enabled in
@@ -1828,18 +1800,8 @@ DEFAULT_CONFIG = {
},
},
# Paste collapse thresholds (TUI + CLI).
# collapse_threshold: paste collapses to a file reference when line count
# exceeds this value (bracketed paste, safe: appends to existing text).
# collapse_threshold_fallback: same but for the fallback heuristic used
# by terminals without bracketed paste support (destructive: replaces
# entire buffer). 0 = disabled.
"paste_collapse_threshold": 5,
"paste_collapse_threshold_fallback": 0,
# Config schema version - bump this when adding new required fields
"_config_version": 24,
"_config_version": 23,
}
# =============================================================================
@@ -4042,7 +4004,8 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
print(f" Get your key at: {var['url']}")
if var.get("password"):
value = masked_secret_prompt(f" {var['prompt']}: ")
import getpass
value = getpass.getpass(f" {var['prompt']}: ")
else:
value = input(f" {var['prompt']}: ").strip()
@@ -4093,9 +4056,8 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
else:
print(f" {info.get('description', name)}")
if info.get("password"):
value = masked_secret_prompt(
f" {info.get('prompt', name)} (Enter to skip): "
)
import getpass
value = getpass.getpass(f" {info.get('prompt', name)} (Enter to skip): ")
else:
value = input(f" {info.get('prompt', name)} (Enter to skip): ").strip()
if value:
-7
View File
@@ -569,13 +569,6 @@ def run_doctor(args):
if should_fix:
env_path.parent.mkdir(parents=True, exist_ok=True)
env_path.touch()
# .env holds API keys — restrict to owner-only access from
# creation. touch() obeys umask which is commonly 0o022,
# leaving the file world-readable; tighten explicitly.
try:
os.chmod(str(env_path), 0o600)
except OSError:
pass
check_ok(f"Created empty {_DHH}/.env")
check_info("Run 'hermes setup' to configure API keys")
fixed_count += 1
+1 -4
View File
@@ -36,9 +36,7 @@ def get_secret_source(env_var: str) -> str | None:
Returns ``"bitwarden"`` for keys pulled from Bitwarden Secrets Manager
during the current process's ``load_hermes_dotenv()`` call. Returns
``None`` for keys that came from ``.env``, the shell environment, or
aren't tracked. The returned label is metadata only: credential-pool
persistence may store it to explain the origin of a borrowed secret, but
must never treat it as authorization to persist the raw value.
aren't tracked.
"""
return _SECRET_SOURCES.get(env_var)
@@ -255,7 +253,6 @@ def _apply_external_secret_sources(home_path: Path) -> None:
cache_ttl_seconds=float(bw_cfg.get("cache_ttl_seconds", 300)),
auto_install=bool(bw_cfg.get("auto_install", True)),
server_url=str(bw_cfg.get("server_url", "") or "").strip(),
home_path=home_path,
)
if result.applied:
+1 -3
View File
@@ -4750,9 +4750,7 @@ def _builtin_setup_fn(key: str):
# via the plugin path in _configure_platform().
"slack": _s._setup_slack,
"matrix": _s._setup_matrix,
# mattermost moved into the plugin: setup_fn is registered by
# plugins/platforms/mattermost/adapter.py::register() and dispatched
# via the plugin path in _configure_platform().
"mattermost": _s._setup_mattermost,
"bluebubbles": _s._setup_bluebubbles,
"webhooks": _s._setup_webhooks,
"signal": _setup_signal,
+57 -125
View File
@@ -280,29 +280,20 @@ load_hermes_dotenv(project_env=PROJECT_ROOT / ".env")
# module-import time). Without this, config.yaml's toggle is ignored because
# the setup_logging() call below imports agent.redact, which reads the env var
# exactly once. Env var in .env still wins — this is config.yaml fallback only.
#
# We also read network.force_ipv4 from the same yaml load to avoid two
# separate config.yaml reads (saves ~17ms on every CLI startup — the second
# `load_config()` was doing a full deep-merge for one boolean lookup).
_FORCE_IPV4_EARLY = False
try:
import yaml as _yaml_early
if "HERMES_REDACT_SECRETS" not in os.environ:
import yaml as _yaml_early
_cfg_path = get_hermes_home() / "config.yaml"
if _cfg_path.exists():
with open(_cfg_path, encoding="utf-8") as _f:
_early_cfg_raw = _yaml_early.safe_load(_f) or {}
if "HERMES_REDACT_SECRETS" not in os.environ:
_early_sec_cfg = _early_cfg_raw.get("security", {})
_cfg_path = get_hermes_home() / "config.yaml"
if _cfg_path.exists():
with open(_cfg_path, encoding="utf-8") as _f:
_early_sec_cfg = (_yaml_early.safe_load(_f) or {}).get("security", {})
if isinstance(_early_sec_cfg, dict):
_early_redact = _early_sec_cfg.get("redact_secrets")
if _early_redact is not None:
os.environ["HERMES_REDACT_SECRETS"] = str(_early_redact).lower()
_early_net_cfg = _early_cfg_raw.get("network", {})
if isinstance(_early_net_cfg, dict) and _early_net_cfg.get("force_ipv4"):
_FORCE_IPV4_EARLY = True
del _early_cfg_raw
del _cfg_path
del _early_sec_cfg
del _cfg_path
except Exception:
pass # best-effort — redaction stays at default (enabled) on config errors
@@ -316,15 +307,17 @@ except Exception:
pass # best-effort — don't crash the CLI if logging setup fails
# Apply IPv4 preference early, before any HTTP clients are created.
# We already determined whether to force IPv4 from the raw yaml read above —
# this just calls the toggle without a redundant load_config() round trip.
if _FORCE_IPV4_EARLY:
try:
from hermes_constants import apply_ipv4_preference as _apply_ipv4
try:
from hermes_cli.config import load_config as _load_config_early
from hermes_constants import apply_ipv4_preference as _apply_ipv4
_early_cfg = _load_config_early()
_net = _early_cfg.get("network", {})
if isinstance(_net, dict) and _net.get("force_ipv4"):
_apply_ipv4(force=True)
except Exception:
pass # best-effort — don't crash if hermes_constants not importable yet
del _early_cfg, _net
except Exception:
pass # best-effort — don't crash if config isn't available yet
import logging
import threading
@@ -2419,7 +2412,6 @@ def select_provider_and_model(args=None):
elif selected_provider == "azure-foundry":
_model_flow_azure_foundry(config, current_model)
elif selected_provider in {
"openai-api",
"gemini",
"deepseek",
"xai",
@@ -2810,7 +2802,7 @@ def _aux_flow_provider_model(
def _aux_flow_custom_endpoint(task: str, task_cfg: dict) -> None:
"""Prompt for a direct OpenAI-compatible base_url + optional api_key/model."""
from hermes_cli.secret_prompt import masked_secret_prompt
import getpass
display_name = next((name for key, name, _ in _all_aux_tasks() if key == task), task)
current_base_url = str(task_cfg.get("base_url") or "").strip()
@@ -2844,7 +2836,7 @@ def _aux_flow_custom_endpoint(task: str, task_cfg: dict) -> None:
return
model = model or current_model
try:
api_key = masked_secret_prompt(
api_key = getpass.getpass(
"API key (optional, blank = use OPENAI_API_KEY): "
).strip()
except (KeyboardInterrupt, EOFError):
@@ -3295,7 +3287,7 @@ def _model_flow_openai_codex(config, current_model=""):
def _model_flow_xai_oauth(_config, current_model="", *, args=None):
"""xAI Grok OAuth (SuperGrok / Premium+) provider: ensure logged in, then pick model."""
"""xAI Grok OAuth (SuperGrok Subscription) provider: ensure logged in, then pick model."""
from hermes_cli.auth import (
get_xai_oauth_auth_status,
_prompt_model_selection,
@@ -3310,7 +3302,7 @@ def _model_flow_xai_oauth(_config, current_model="", *, args=None):
status = get_xai_oauth_auth_status()
if status.get("logged_in"):
print(" xAI Grok OAuth (SuperGrok / Premium+) credentials: ✓")
print(" xAI Grok OAuth (SuperGrok Subscription) credentials: ✓")
print()
print(" 1. Use existing credentials")
print(" 2. Reauthenticate (new OAuth login)")
@@ -3348,7 +3340,7 @@ def _model_flow_xai_oauth(_config, current_model="", *, args=None):
elif choice == "3":
return
else:
print("Not logged into xAI Grok OAuth (SuperGrok / Premium+). Starting login...")
print("Not logged into xAI Grok OAuth (SuperGrok Subscription). Starting login...")
print()
try:
mock_args = argparse.Namespace(
@@ -3382,7 +3374,7 @@ def _model_flow_xai_oauth(_config, current_model="", *, args=None):
if selected:
_save_model_choice(selected)
_update_config_for_provider("xai-oauth", base_url)
print(f"Default model set to: {selected} (via xAI Grok OAuth — SuperGrok / Premium+)")
print(f"Default model set to: {selected} (via xAI Grok OAuth — SuperGrok Subscription)")
else:
print("No change.")
@@ -3568,7 +3560,6 @@ def _model_flow_custom(config):
"""
from hermes_cli.auth import _save_model_choice, deactivate_provider
from hermes_cli.config import get_env_value, load_config, save_config
from hermes_cli.secret_prompt import masked_secret_prompt
current_url = get_env_value("OPENAI_BASE_URL") or ""
current_key = get_env_value("OPENAI_API_KEY") or ""
@@ -3584,7 +3575,9 @@ def _model_flow_custom(config):
base_url = input(
f"API base URL [{current_url or 'e.g. https://api.example.com/v1'}]: "
).strip()
api_key = masked_secret_prompt(
import getpass
api_key = getpass.getpass(
f"API key [{current_key[:8] + '...' if current_key else 'optional'}]: "
).strip()
except (KeyboardInterrupt, EOFError):
@@ -3996,6 +3989,7 @@ def _model_flow_azure_foundry(config, current_model=""):
save_config,
)
from hermes_cli import azure_detect
import getpass
# ── Load current Azure Foundry configuration ─────────────────────
model_cfg = config.get("model", {})
@@ -4158,10 +4152,8 @@ def _model_flow_azure_foundry(config, current_model=""):
token_provider = None
else:
print()
from hermes_cli.secret_prompt import masked_secret_prompt
try:
api_key = masked_secret_prompt(
api_key = getpass.getpass(
f"API key [{current_api_key[:8] + '...' if current_api_key else 'required'}]: "
).strip()
except (KeyboardInterrupt, EOFError):
@@ -4558,27 +4550,11 @@ def _model_flow_named_custom(config, provider_info):
print(f" Provider: {name} ({base_url})")
# Lazy-export the model catalog at module level. Tests and a handful of
# downstream call sites read `hermes_cli.main._PROVIDER_MODELS` directly,
# so the symbol needs to be reachable as a module attribute. But importing
# the catalog eagerly costs ~55ms on every `hermes` invocation — including
# fast paths like `hermes --version` and slash-command dispatch that never
# touch the catalog. PEP 562 module-level __getattr__ defers the import
# until first attribute access, so the cost is only paid by callers that
# actually look up the catalog. Termux already defers via the same
# mechanism (its model-selection handlers do their own function-local
# imports), so the explicit termux branch from before is no longer needed.
_LAZY_MODEL_EXPORTS = ("_PROVIDER_MODELS",)
def __getattr__(name):
"""Defer the model-catalog import until something actually reads it."""
if name in _LAZY_MODEL_EXPORTS:
from hermes_cli.models import _PROVIDER_MODELS
# Cache on the module so subsequent accesses skip the import machinery.
globals()[name] = _PROVIDER_MODELS
return _PROVIDER_MODELS
raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
# Keep the historical eager model catalog import on desktop/CI. Termux defers
# it to the model-selection handlers so plain `hermes --tui` does not pay for
# requests/models.dev catalog imports before the Node TUI starts.
if not _is_termux_startup_environment():
from hermes_cli.models import _PROVIDER_MODELS
def _current_reasoning_effort(config) -> str:
@@ -4748,10 +4724,10 @@ def _model_flow_copilot(config, current_model=""):
print(f" Login failed: {exc}")
return
elif choice == "2":
from hermes_cli.secret_prompt import masked_secret_prompt
try:
new_key = masked_secret_prompt(" Token (COPILOT_GITHUB_TOKEN): ").strip()
import getpass
new_key = getpass.getpass(" Token (COPILOT_GITHUB_TOKEN): ").strip()
except (KeyboardInterrupt, EOFError):
print()
return
@@ -5003,9 +4979,10 @@ def _prompt_api_key(pconfig, existing_key: str, provider_id: str = "") -> tuple:
``return`` immediately the user cancelled entry, declined to replace, or
cleared the key and is now unconfigured.
"""
import getpass
from hermes_cli.auth import LMSTUDIO_NOAUTH_PLACEHOLDER
from hermes_cli.config import save_env_value
from hermes_cli.secret_prompt import masked_secret_prompt
key_env = pconfig.api_key_env_vars[0] if pconfig.api_key_env_vars else ""
@@ -5015,7 +4992,7 @@ def _prompt_api_key(pconfig, existing_key: str, provider_id: str = "") -> tuple:
else:
prompt = f"{key_env} (or Enter to cancel): "
try:
entered = masked_secret_prompt(prompt).strip()
entered = getpass.getpass(prompt).strip()
except (KeyboardInterrupt, EOFError):
print()
return ""
@@ -5330,10 +5307,10 @@ def _model_flow_bedrock_api_key(config, region, current_model=""):
else:
print(f" Endpoint: {mantle_base_url}")
print()
from hermes_cli.secret_prompt import masked_secret_prompt
try:
api_key = masked_secret_prompt(" Bedrock API Key: ").strip()
import getpass
api_key = getpass.getpass(" Bedrock API Key: ").strip()
except (KeyboardInterrupt, EOFError):
print()
return
@@ -5905,10 +5882,10 @@ def _run_anthropic_oauth_flow(save_env_value):
print()
print(" If the setup-token was displayed above, paste it here:")
print()
from hermes_cli.secret_prompt import masked_secret_prompt
try:
manual_token = masked_secret_prompt(
import getpass
manual_token = getpass.getpass(
" Paste setup-token (or Enter to cancel): "
).strip()
except (KeyboardInterrupt, EOFError):
@@ -5936,10 +5913,10 @@ def _run_anthropic_oauth_flow(save_env_value):
print()
print(" Or paste an existing setup-token now (sk-ant-oat-...):")
print()
from hermes_cli.secret_prompt import masked_secret_prompt
try:
token = masked_secret_prompt(" Setup-token (or Enter to cancel): ").strip()
import getpass
token = getpass.getpass(" Setup-token (or Enter to cancel): ").strip()
except (KeyboardInterrupt, EOFError):
print()
return False
@@ -6054,10 +6031,10 @@ def _model_flow_anthropic(config, current_model=""):
print()
print(" Get an API key at: https://platform.claude.com/settings/keys")
print()
from hermes_cli.secret_prompt import masked_secret_prompt
try:
api_key = masked_secret_prompt(" API key (sk-ant-...): ").strip()
import getpass
api_key = getpass.getpass(" API key (sk-ant-...): ").strip()
except (KeyboardInterrupt, EOFError):
print()
return
@@ -7000,13 +6977,8 @@ def _update_via_zip(args):
urlretrieve(zip_url, zip_path)
print("→ Extracting...")
import stat as _stat
with zipfile.ZipFile(zip_path, "r") as zf:
# Validate paths to prevent zip-slip (path traversal) AND reject
# symlink members. A GitHub source ZIP for hermes-agent itself
# should never contain symlinks — they'd point outside the
# extracted tree and let an attacker who can compromise the
# update mirror plant arbitrary files via the update path.
# Validate paths to prevent zip-slip (path traversal)
tmp_dir_real = os.path.realpath(tmp_dir)
for member in zf.infolist():
member_path = os.path.realpath(os.path.join(tmp_dir, member.filename))
@@ -7017,13 +6989,6 @@ def _update_via_zip(args):
raise ValueError(
f"Zip-slip detected: {member.filename} escapes extraction directory"
)
# Unix mode lives in the upper 16 bits of external_attr;
# mask to the file-type bits.
mode = (member.external_attr >> 16) & 0o170000
if _stat.S_ISLNK(mode):
raise ValueError(
f"ZIP contains unsupported symlink member: {member.filename}"
)
zf.extractall(tmp_dir)
# GitHub ZIPs extract to hermes-agent-<branch>/
@@ -7700,11 +7665,8 @@ def _detect_concurrent_hermes_instances(
This helper enumerates processes whose ``exe`` matches one of the venv's
shims (``hermes.exe`` / ``hermes-gateway.exe``) and returns ``(pid,
process_name)`` pairs. The caller's own PID and its entire ancestor
chain are excluded so the running ``hermes update`` invocation never
reports itself this matters on Windows where the setuptools .exe
launcher (``hermes.exe``) is a separate process from the Python
interpreter it loads (``python.exe``).
process_name)`` pairs. The caller's own PID is excluded so the running
``hermes update`` invocation never reports itself.
Returns an empty list off-Windows, on missing psutil, or when no other
instances exist. Never raises process enumeration is best-effort.
@@ -7717,38 +7679,8 @@ def _detect_concurrent_hermes_instances(
except Exception:
return []
# Build a set of PIDs to exclude: the Python process itself plus its
# entire parent chain. On Windows the setuptools-generated hermes.exe
# launcher is a separate native process that spawns python.exe (the
# interpreter that runs our code). os.getpid() returns the Python PID,
# but the launcher (which holds the file lock) is the parent. Without
# walking the parent chain, every ``hermes update`` reports its own
# launcher as a concurrent instance — a false positive.
if exclude_pid is not None:
exclude_pids: set[int] = {exclude_pid}
else:
exclude_pids = {os.getpid()}
# The parent-walk is best-effort: if psutil rejects a PID (NoSuchProcess /
# AccessDenied) we stop walking and use whatever we've collected so far.
# Broader Exception catch on the outer block guards against partially-
# stubbed psutil in unit tests (e.g. a SimpleNamespace lacking Process /
# NoSuchProcess) — the surrounding update flow documents this helper as
# "never raises".
try:
current = psutil.Process(next(iter(exclude_pids)))
while True:
try:
parent = current.parent()
except Exception:
break
if parent is None or parent.pid <= 0:
break
if parent.pid in exclude_pids:
break # loop detected
exclude_pids.add(parent.pid)
current = parent
except Exception:
pass
if exclude_pid is None:
exclude_pid = os.getpid()
# Resolve every shim path to its canonical form once for cheap comparison.
shim_paths: set[str] = set()
@@ -7773,7 +7705,7 @@ def _detect_concurrent_hermes_instances(
continue
pid = info.get("pid")
exe = info.get("exe")
if not exe or pid is None or pid in exclude_pids:
if not exe or pid is None or pid == exclude_pid:
continue
try:
exe_norm = str(Path(exe).resolve()).lower()
+7 -2
View File
@@ -7,13 +7,13 @@ the provider's config schema. Writes config to config.yaml + .env.
from __future__ import annotations
import getpass
import os
import sys
import shlex
from pathlib import Path
from hermes_constants import get_hermes_home
from hermes_cli.secret_prompt import masked_secret_prompt
# ---------------------------------------------------------------------------
@@ -39,7 +39,12 @@ def _prompt(label: str, default: str | None = None, secret: bool = False) -> str
"""Prompt for a value with optional default and secret masking."""
suffix = f" [{default}]" if default else ""
if secret:
val = masked_secret_prompt(f" {label}{suffix}: ")
sys.stdout.write(f" {label}{suffix}: ")
sys.stdout.flush()
if sys.stdin.isatty():
val = getpass.getpass(prompt="")
else:
val = sys.stdin.readline().strip()
else:
sys.stdout.write(f" {label}{suffix}: ")
sys.stdout.flush()
+3 -16
View File
@@ -199,18 +199,6 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"gpt-4o",
"gpt-4o-mini",
],
"openai-api": [
"gpt-5.5",
"gpt-5.5-pro",
"gpt-5.4",
"gpt-5.4-mini",
"gpt-5.4-nano",
"gpt-5-mini",
"gpt-5.3-codex",
"gpt-4.1",
"gpt-4o",
"gpt-4o-mini",
],
"openai-codex": _codex_curated_models(),
"xai-oauth": _xai_curated_models(),
"copilot-acp": [
@@ -940,9 +928,8 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("lmstudio", "LM Studio", "LM Studio (local desktop app with built-in model server)"),
ProviderEntry("anthropic", "Anthropic", "Anthropic (Claude models — API key or Claude Code)"),
ProviderEntry("openai-codex", "OpenAI Codex", "OpenAI Codex"),
ProviderEntry("openai-api", "OpenAI API", "OpenAI API (api.openai.com, API key)"),
ProviderEntry("alibaba", "Qwen Cloud", "Qwen Cloud / DashScope Coding (Qwen + multi-provider)"),
ProviderEntry("xai-oauth", "xAI Grok OAuth (SuperGrok / Premium+)", "xAI Grok OAuth (SuperGrok / Premium+)"),
ProviderEntry("xai-oauth", "xAI Grok OAuth (SuperGrok Subscription)", "xAI Grok OAuth (SuperGrok Subscription)"),
ProviderEntry("xiaomi", "Xiaomi MiMo", "Xiaomi MiMo (MiMo-V2.5 and V2 models — pro, omni, flash)"),
ProviderEntry("tencent-tokenhub", "Tencent TokenHub", "Tencent TokenHub (Hy3 Preview — direct API via tokenhub.tencentmaas.com)"),
ProviderEntry("nvidia", "NVIDIA NIM", "NVIDIA NIM (Nemotron models — build.nvidia.com or local NIM)"),
@@ -2242,7 +2229,7 @@ def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False)
live = fetch_ollama_cloud_models(force_refresh=force_refresh)
if live:
return live
if normalized in ("openai", "openai-api"):
if normalized == "openai":
api_key = os.getenv("OPENAI_API_KEY", "").strip()
if api_key:
base_raw = os.getenv("OPENAI_BASE_URL", "").strip().rstrip("/")
@@ -3504,7 +3491,7 @@ def validate_requested_model(
suggestion_text = ""
if suggestions:
suggestion_text = "\n Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
provider_label = "OpenAI Codex" if normalized == "openai-codex" else "xAI Grok OAuth (SuperGrok / Premium+)"
provider_label = "OpenAI Codex" if normalized == "openai-codex" else "xAI Grok OAuth (SuperGrok Subscription)"
return {
"accepted": True,
"persist": True,
-82
View File
@@ -640,88 +640,6 @@ class PluginContext:
self.manifest.name, provider.name,
)
# -- TTS provider registration -------------------------------------------
def register_tts_provider(self, provider) -> None:
"""Register a text-to-speech backend.
``provider`` must be an instance of
:class:`agent.tts_provider.TTSProvider`. The ``provider.name``
attribute is what ``tts.provider`` in ``config.yaml`` matches
against when routing ``text_to_speech`` tool calls **but
only when**:
1. ``provider.name`` is NOT a built-in TTS provider name
(``edge``, ``openai``, ``elevenlabs``, ). Built-ins always
win the registry rejects shadowing names with a warning.
2. There is NO ``tts.providers.<name>: type: command`` entry
with the same name. Command-providers (PR #17843) win on
name collision because config is more local than plugin
install.
Coexists with the command-provider registry rather than
replacing it see issue #30398 for the full design rationale.
"""
from agent.tts_provider import TTSProvider
from agent.tts_registry import register_provider as _register_tts_provider
if not isinstance(provider, TTSProvider):
logger.warning(
"Plugin '%s' tried to register a TTS provider that does "
"not inherit from TTSProvider. Ignoring.",
self.manifest.name,
)
return
_register_tts_provider(provider)
logger.info(
"Plugin '%s' registered TTS provider: %s",
self.manifest.name, provider.name,
)
# -- transcription (STT) provider registration ---------------------------
def register_transcription_provider(self, provider) -> None:
"""Register a speech-to-text backend.
``provider`` must be an instance of
:class:`agent.transcription_provider.TranscriptionProvider`.
The ``provider.name`` attribute is what ``stt.provider`` in
``config.yaml`` matches against when routing
:func:`tools.transcription_tools.transcribe_audio` calls
**but only when**:
1. ``provider.name`` is NOT a built-in STT provider name
(``local``, ``local_command``, ``groq``, ``openai``,
``mistral``, ``xai``). Built-ins always win the registry
rejects shadowing names with a warning.
2. There is NO ``stt.providers.<name>: type: command`` entry
with the same name. Command-providers win on name
collision because config is more local than plugin install
same precedence rule as TTS.
Coexists with the in-tree dispatcher and the STT
command-provider registry rather than replacing them. The 6
built-in STT backends keep their native implementations in
``tools/transcription_tools.py``; this hook is for *new* Python
engines (OpenRouter, SenseAudio, Gemini-STT, custom proprietary
backends).
"""
from agent.transcription_provider import TranscriptionProvider
from agent.transcription_registry import register_provider as _register_stt_provider
if not isinstance(provider, TranscriptionProvider):
logger.warning(
"Plugin '%s' tried to register a transcription provider that "
"does not inherit from TranscriptionProvider. Ignoring.",
self.manifest.name,
)
return
_register_stt_provider(provider)
logger.info(
"Plugin '%s' registered transcription provider: %s",
self.manifest.name, provider.name,
)
# -- platform adapter registration ---------------------------------------
def register_platform(
+2 -2
View File
@@ -20,7 +20,6 @@ from typing import Any, Optional
from hermes_constants import get_hermes_home
from hermes_cli.config import cfg_get
from hermes_cli.secret_prompt import masked_secret_prompt
logger = logging.getLogger(__name__)
@@ -288,7 +287,8 @@ def _prompt_plugin_env_vars(manifest: dict, console) -> None:
try:
if secret:
value = masked_secret_prompt(f" {name}: ").strip()
import getpass
value = getpass.getpass(f" {name}: ").strip()
else:
value = input(f" {name}: ").strip()
except (EOFError, KeyboardInterrupt):
-15
View File
@@ -432,20 +432,6 @@ def _stage_source(source: str, workdir: Path) -> Tuple[Path, str]:
)
def _reject_distribution_symlinks(staged: Path) -> None:
"""Reject symlinks before reading or copying distribution files."""
for entry in staged.rglob("*"):
if not entry.is_symlink():
continue
try:
rel = entry.relative_to(staged)
except ValueError:
rel = entry
raise DistributionError(
f"Profile distributions cannot contain symlinks: {rel}"
)
# ---------------------------------------------------------------------------
# Install
# ---------------------------------------------------------------------------
@@ -498,7 +484,6 @@ def plan_install(
from hermes_cli import __version__ as hermes_version
staged, provenance = _stage_source(source, workdir)
_reject_distribution_symlinks(staged)
manifest = read_manifest(staged)
if manifest is None:
raise DistributionError(
+1 -37
View File
@@ -723,17 +723,7 @@ def create_profile(
for filename in _CLONE_CONFIG_FILES:
src = source_dir / filename
if src.exists():
dst = profile_dir / filename
shutil.copy2(src, dst)
# Tighten .env to owner-only after copy. shutil.copy2
# preserves source mode bits, but if the source's .env
# was loose (host umask 0o022 leaving 0o644), tighten
# explicitly so the clone doesn't inherit weak perms.
if filename == ".env":
try:
os.chmod(str(dst), 0o600)
except OSError:
pass
shutil.copy2(src, profile_dir / filename)
# Clone installed skills from the source profile. The dashboard's
# "clone from default" flow is expected to preserve both bundled
@@ -1004,30 +994,12 @@ def _maybe_register_gateway_service(profile_name: str) -> None:
(``[gateway] port = ``) there is no Python-side allocator
(PR #30136 review item I5 retired the SHA-256-derived range
[9200, 9800) because it was dead code through the entire stack).
Host short-circuit: check ``detect_service_manager()`` first and
return immediately if it isn't ``"s6"``. This keeps host
(systemd/launchd/windows) profile creation completely silent
no ``get_service_manager()`` call, no exception path, no chance
of the `` Could not register s6 gateway service`` warning ever
rendering on a non-container machine. The earlier
``supports_runtime_registration()`` check still catches the case
where detection somehow returns ``"s6"`` but the backend isn't
actually the S6 one.
"""
try:
from hermes_cli.service_manager import detect_service_manager
if detect_service_manager() != "s6":
return # host path — silent, no registration needed
from hermes_cli.service_manager import get_service_manager
mgr = get_service_manager()
except RuntimeError:
return # no backend on this host — nothing to do
except Exception:
# Defensive: detect_service_manager failed for some other
# reason. Stay silent on host rather than printing a confusing
# s6 warning to users who have never touched the container.
return
if not mgr.supports_runtime_registration():
return # host backend; no-op
try:
@@ -1046,20 +1018,12 @@ def _maybe_unregister_gateway_service(profile_name: str) -> None:
No-op on host. Idempotent: absent services are silently skipped
by ``unregister_profile_gateway``.
Same host short-circuit as :func:`_maybe_register_gateway_service`
see that docstring.
"""
try:
from hermes_cli.service_manager import detect_service_manager
if detect_service_manager() != "s6":
return # host path — silent
from hermes_cli.service_manager import get_service_manager
mgr = get_service_manager()
except RuntimeError:
return
except Exception:
return
if not mgr.supports_runtime_registration():
return
try:
-6
View File
@@ -60,11 +60,6 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
auth_type="oauth_external",
base_url_override="https://chatgpt.com/backend-api/codex",
),
"openai-api": HermesOverlay(
transport="codex_responses",
base_url_override="https://api.openai.com/v1",
base_url_env_var="OPENAI_BASE_URL",
),
"xai-oauth": HermesOverlay(
transport="codex_responses",
auth_type="oauth_external",
@@ -386,7 +381,6 @@ _LABEL_OVERRIDES: Dict[str, str] = {
"local": "Local endpoint",
"bedrock": "AWS Bedrock",
"ollama-cloud": "Ollama Cloud",
"xai-oauth": "xAI Grok OAuth (SuperGrok / Premium+)",
}
-126
View File
@@ -1,126 +0,0 @@
"""Secret input prompts with masked typing feedback."""
from __future__ import annotations
import getpass
import os
import sys
from collections.abc import Callable
_BACKSPACE_CHARS = {"\b", "\x7f"}
_ENTER_CHARS = {"\r", "\n"}
_EOF_CHARS = {"\x04", "\x1a"}
def _collect_masked_input(
read_char: Callable[[], str],
write: Callable[[str], object],
prompt: str,
*,
mask: str = "*",
) -> str:
"""Read one secret line while writing a mask character per typed char."""
value: list[str] = []
write(prompt)
while True:
ch = read_char()
if ch == "":
write("\n")
raise EOFError
if ch in _ENTER_CHARS:
write("\n")
return "".join(value)
if ch == "\x03":
write("\n")
raise KeyboardInterrupt
if ch in _EOF_CHARS:
write("\n")
raise EOFError
if ch in _BACKSPACE_CHARS:
if value:
value.pop()
write("\b \b")
continue
if ch == "\x1b":
# Ignore escape itself. Terminals commonly send escape-prefixed
# navigation/delete sequences; they should not become secret text.
continue
value.append(ch)
if mask:
write(mask)
def masked_secret_prompt(prompt: str, *, mask: str = "*") -> str:
"""Prompt for a secret while showing masked typing feedback.
Falls back to ``getpass.getpass`` when stdin/stdout are not interactive or
when raw terminal handling is unavailable.
"""
stdin = sys.stdin
stdout = sys.stdout
if not _stream_is_tty(stdin) or not _stream_is_tty(stdout):
return getpass.getpass(prompt)
if os.name == "nt":
try:
return _masked_secret_prompt_windows(prompt, mask=mask)
except (KeyboardInterrupt, EOFError):
raise
except Exception:
return getpass.getpass(prompt)
try:
return _masked_secret_prompt_posix(prompt, mask=mask)
except (KeyboardInterrupt, EOFError):
raise
except Exception:
return getpass.getpass(prompt)
def _stream_is_tty(stream) -> bool:
try:
return bool(stream.isatty())
except Exception:
return False
def _masked_secret_prompt_windows(prompt: str, *, mask: str) -> str:
import msvcrt
def read_char() -> str:
ch = msvcrt.getwch()
if ch in {"\x00", "\xe0"}:
msvcrt.getwch()
return "\x1b"
return ch
def write(text: str) -> None:
sys.stdout.write(text)
sys.stdout.flush()
return _collect_masked_input(read_char, write, prompt, mask=mask)
def _masked_secret_prompt_posix(prompt: str, *, mask: str) -> str:
import termios
import tty
fd = sys.stdin.fileno()
old_attrs = termios.tcgetattr(fd)
def read_char() -> str:
return sys.stdin.read(1)
def write(text: str) -> None:
sys.stdout.write(text)
sys.stdout.flush()
try:
tty.setraw(fd)
return _collect_masked_input(read_char, write, prompt, mask=mask)
finally:
termios.tcsetattr(fd, termios.TCSADRAIN, old_attrs)
+2 -2
View File
@@ -11,6 +11,7 @@ Subcommands:
from __future__ import annotations
import argparse
import getpass
import json
import os
import subprocess
@@ -29,7 +30,6 @@ from hermes_cli.config import (
save_config,
save_env_value,
)
from hermes_cli.secret_prompt import masked_secret_prompt
# ---------------------------------------------------------------------------
@@ -140,7 +140,7 @@ def cmd_setup(args: argparse.Namespace) -> int:
token = (args.access_token or "").strip()
if not token:
token = masked_secret_prompt(f" Paste access token ({token_env}): ").strip()
token = getpass.getpass(f" Paste access token ({token_env}): ").strip()
if not token:
console.print(" [red]Empty token, aborting.[/red]")
return 1
+51 -6
View File
@@ -161,7 +161,6 @@ from hermes_cli.cli_output import ( # noqa: E402
print_success,
print_warning,
)
from hermes_cli.secret_prompt import masked_secret_prompt # noqa: E402
def is_interactive_stdin() -> bool:
@@ -203,7 +202,9 @@ def prompt(question: str, default: str = None, password: bool = False) -> str:
try:
if password:
value = masked_secret_prompt(color(display, Colors.YELLOW))
import getpass
value = getpass.getpass(color(display, Colors.YELLOW))
else:
value = input(color(display, Colors.YELLOW))
@@ -1093,7 +1094,7 @@ def _xai_oauth_logged_in_for_setup() -> bool:
"""True iff xAI Grok OAuth credentials are already stored locally.
Lets TTS / STT setup skip the API-key prompt for users who logged in
through ``hermes model`` -> xAI Grok OAuth (SuperGrok / Premium+).
through ``hermes model`` -> xAI Grok OAuth (SuperGrok Subscription).
"""
try:
from hermes_cli.auth import get_xai_oauth_auth_status
@@ -1123,7 +1124,7 @@ def _run_xai_oauth_login_from_setup() -> bool:
open_browser = not _is_remote_session()
print()
print_info("Signing in to xAI Grok OAuth (SuperGrok / Premium+)...")
print_info("Signing in to xAI Grok OAuth (SuperGrok Subscription)...")
try:
creds = _xai_oauth_loopback_login(open_browser=open_browser)
_save_xai_oauth_tokens(
@@ -1258,7 +1259,7 @@ def _setup_tts_provider(config: dict):
if oauth_logged_in:
print_success(
"xAI TTS will use your xAI Grok OAuth (SuperGrok / Premium+) "
"xAI TTS will use your xAI Grok OAuth (SuperGrok Subscription) "
"credentials"
)
elif existing_api_key:
@@ -1268,7 +1269,7 @@ def _setup_tts_provider(config: dict):
choice_idx = prompt_choice(
"How do you want xAI TTS to authenticate?",
choices=[
"Sign in with xAI Grok OAuth (SuperGrok / Premium+) — browser login",
"Sign in with xAI Grok OAuth (SuperGrok Subscription) — browser login",
"Paste an xAI API key (console.x.ai)",
"Skip → fallback to Edge TTS",
],
@@ -2260,6 +2261,50 @@ def _setup_matrix():
save_env_value("MATRIX_HOME_ROOM", home_room)
def _setup_mattermost():
"""Configure Mattermost bot credentials."""
print_header("Mattermost")
existing = get_env_value("MATTERMOST_TOKEN")
if existing:
print_info("Mattermost: already configured")
if not prompt_yes_no("Reconfigure Mattermost?", False):
return
print_info("Works with any self-hosted Mattermost instance.")
print_info(" 1. In Mattermost: Integrations → Bot Accounts → Add Bot Account")
print_info(" 2. Copy the bot token")
print()
mm_url = prompt("Mattermost server URL (e.g. https://mm.example.com)")
if mm_url:
save_env_value("MATTERMOST_URL", mm_url.rstrip("/"))
token = prompt("Bot token", password=True)
if not token:
return
save_env_value("MATTERMOST_TOKEN", token)
print_success("Mattermost token saved")
print()
print_info("🔒 Security: Restrict who can use your bot")
print_info(" To find your user ID: click your avatar → Profile")
print_info(" or use the API: GET /api/v4/users/me")
print()
allowed_users = prompt("Allowed user IDs (comma-separated, leave empty for open access)")
if allowed_users:
save_env_value("MATTERMOST_ALLOWED_USERS", allowed_users.replace(" ", ""))
print_success("Mattermost allowlist configured")
else:
print_info("⚠️ No allowlist set - anyone who can message the bot can use it!")
print()
print_info("📬 Home Channel: where Hermes delivers cron job results and notifications.")
print_info(" To get a channel ID: click channel name → View Info → copy the ID")
print_info(" You can also set this later by typing /set-home in a Mattermost channel.")
home_channel = prompt("Home channel ID (leave empty to set later with /set-home)")
if home_channel:
save_env_value("MATTERMOST_HOME_CHANNEL", home_channel)
print_info(" Open config in your editor: hermes config edit")
def _setup_bluebubbles():
"""Configure BlueBubbles iMessage gateway."""
print_header("BlueBubbles (iMessage)")
+1 -8
View File
@@ -550,14 +550,7 @@ def do_install(identifier: str, category: str = "", force: bool = False,
# Scan
c.print("[bold]Running security scan...[/]")
if bundle.source == "official":
scan_source = "official"
else:
scan_source = (
getattr(bundle, "identifier", "")
or getattr(meta, "identifier", "")
or identifier
)
scan_source = getattr(bundle, "identifier", "") or getattr(meta, "identifier", "") or identifier
result = scan_skill(q_path, source=scan_source)
c.print(format_scan_report(result))
+4 -66
View File
@@ -101,7 +101,7 @@ def _xai_credentials_present() -> bool:
"""Cheap, side-effect-free check for usable xAI credentials.
Used to auto-enable the ``x_search`` toolset when the user has either
completed xAI Grok OAuth (SuperGrok / Premium+) or set
completed xAI Grok OAuth (SuperGrok subscription) or set
``XAI_API_KEY``. Does NOT hit the network only inspects the local
auth store and environment. The tool's runtime ``check_fn`` still
gates schema registration if creds later expire or get revoked.
@@ -356,7 +356,7 @@ TOOL_CATEGORIES = {
"icon": "🐦",
"providers": [
{
"name": "xAI Grok OAuth (SuperGrok / Premium+)",
"name": "xAI Grok OAuth (SuperGrok Subscription)",
"badge": "subscription",
"tag": "Browser login at accounts.x.ai — no API key required",
"env_vars": [],
@@ -1008,7 +1008,7 @@ def _run_post_setup(post_setup_key: str):
if oauth_logged_in:
_print_success(
" xAI will use your xAI Grok OAuth (SuperGrok / Premium+) credentials"
" xAI will use your xAI Grok OAuth (SuperGrok Subscription) credentials"
)
return
if existing_api_key:
@@ -1031,7 +1031,7 @@ def _run_post_setup(post_setup_key: str):
idx = prompt_choice(
" How do you want xAI to authenticate?",
choices=[
"Sign in with xAI Grok OAuth (SuperGrok / Premium+) — browser login",
"Sign in with xAI Grok OAuth (SuperGrok Subscription) — browser login",
"Paste an xAI API key (console.x.ai)",
"Skip — configure later via `hermes auth add xai-oauth`",
],
@@ -1753,62 +1753,6 @@ def _plugin_browser_providers() -> list[dict]:
return rows
def _plugin_tts_providers() -> list[dict]:
"""Build picker-row dicts from plugin-registered TTS providers.
Issue #30398 — the ``register_tts_provider()`` plugin hook
coexists alongside the 10 built-in TTS providers
(``edge``/``openai``/``elevenlabs``/) and the
``tts.providers.<name>: type: command`` registry from PR #17843.
Built-in rows stay hardcoded in ``TOOL_CATEGORIES["tts"]``; this
function only injects PLUGIN-registered providers.
Defensive: plugins whose name collides with a built-in TTS provider
are filtered out even though the registry already rejects them
at registration time, a future code path that registers directly
via :func:`agent.tts_registry.register_provider` could slip
through. Filtering here keeps the picker invariant.
"""
try:
from agent.tts_registry import _BUILTIN_NAMES, list_providers
from hermes_cli.plugins import _ensure_plugins_discovered
_ensure_plugins_discovered()
providers = list_providers()
except Exception:
return []
rows: list[dict] = []
for provider in providers:
name = getattr(provider, "name", None)
if not name:
continue
# Defensive: reject built-in shadowing at the picker layer too.
if name.lower().strip() in _BUILTIN_NAMES:
continue
try:
schema = provider.get_setup_schema()
except Exception:
continue
if not isinstance(schema, dict):
continue
row = {
"name": schema.get("name", provider.display_name),
"badge": schema.get("badge", ""),
"tag": schema.get("tag", ""),
"env_vars": schema.get("env_vars", []),
# Selecting this row writes ``tts.provider: <name>`` — the
# same write-path used by hardcoded rows. The plugin
# dispatcher picks it up automatically from there.
"tts_provider": name,
"tts_plugin_name": name,
}
if schema.get("post_setup"):
row["post_setup"] = schema["post_setup"]
rows.append(row)
return rows
def _visible_providers(cat: dict, config: dict) -> list[dict]:
"""Return provider entries visible for the current auth/config state."""
features = get_nous_subscription_features(config)
@@ -1846,12 +1790,6 @@ def _visible_providers(cat: dict, config: dict) -> list[dict]:
if cat.get("name") == "Browser Automation":
visible.extend(_plugin_browser_providers())
# Inject plugin-registered TTS backends (issue #30398). Plugin rows
# render BELOW the 10 hardcoded built-in rows. Built-in shadowing
# is filtered out by ``_plugin_tts_providers`` defensively.
if cat.get("name") == "Text-to-Speech":
visible.extend(_plugin_tts_providers())
return visible
+3 -29
View File
@@ -16,7 +16,6 @@ import json
import logging
import os
import secrets
import stat
import subprocess
import sys
import threading
@@ -1687,25 +1686,7 @@ def _save_anthropic_oauth_creds(access_token: str, refresh_token: str, expires_a
"expiresAt": expires_at_ms,
}
_HERMES_OAUTH_FILE.parent.mkdir(parents=True, exist_ok=True)
tmp_path = _HERMES_OAUTH_FILE.with_name(
f"{_HERMES_OAUTH_FILE.name}.tmp.{os.getpid()}.{secrets.token_hex(8)}"
)
try:
with tmp_path.open("w", encoding="utf-8") as handle:
handle.write(json.dumps(payload, indent=2))
handle.flush()
os.fsync(handle.fileno())
os.replace(tmp_path, _HERMES_OAUTH_FILE)
try:
_HERMES_OAUTH_FILE.chmod(stat.S_IRUSR | stat.S_IWUSR)
except OSError:
pass
finally:
try:
if tmp_path.exists():
tmp_path.unlink()
except OSError:
pass
_HERMES_OAUTH_FILE.write_text(json.dumps(payload, indent=2), encoding="utf-8")
# Best-effort credential-pool insert. Failure here doesn't invalidate
# the file write — pool registration only matters for the rotation
# strategy, not for runtime credential resolution.
@@ -2711,10 +2692,7 @@ async def update_cron_job(job_id: str, body: CronJobUpdate, profile: Optional[st
selected = profile or _find_cron_job_profile(job_id)
if not selected:
raise HTTPException(status_code=404, detail="Job not found")
try:
job = _call_cron_for_profile(selected, "update_job", job_id, body.updates)
except ValueError as exc:
raise HTTPException(status_code=400, detail=str(exc)) from exc
job = _call_cron_for_profile(selected, "update_job", job_id, body.updates)
if not job:
raise HTTPException(status_code=404, detail="Job not found")
return job
@@ -2758,11 +2736,7 @@ async def delete_cron_job(job_id: str, profile: Optional[str] = None):
selected = profile or _find_cron_job_profile(job_id)
if not selected:
raise HTTPException(status_code=404, detail="Job not found")
try:
removed = _call_cron_for_profile(selected, "remove_job", job_id)
except ValueError as exc:
raise HTTPException(status_code=400, detail=str(exc)) from exc
if not removed:
if not _call_cron_for_profile(selected, "remove_job", job_id):
raise HTTPException(status_code=404, detail="Job not found")
return {"ok": True}
-8
View File
@@ -432,14 +432,6 @@ def apply_ipv4_preference(force: bool = False) -> None:
socket.getaddrinfo = _ipv4_getaddrinfo # type: ignore[assignment]
# ─── Streaming Response Constants ────────────────────────────────────────────
# Response ID for partial stream stubs used during error recovery
PARTIAL_STREAM_STUB_ID = "partial-stream-stub"
FINISH_REASON_LENGTH = "length"
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
OPENROUTER_MODELS_URL = f"{OPENROUTER_BASE_URL}/models"
@@ -25,41 +25,18 @@ def main() -> int:
help="Organism attribute to display. Defaults to the first str field found.",
)
ap.add_argument("--top", type=int, default=None, help="Show only top N by score.")
ap.add_argument(
"--i-trust-this-file",
action="store_true",
help=(
"Required acknowledgement that the snapshot is from a trusted source. "
"pickle.loads executes arbitrary code embedded in the file (RCE) and "
"must NEVER be run on snapshots received from untrusted parties."
),
)
args = ap.parse_args()
if not args.snapshot.exists():
sys.exit(f"snapshot not found: {args.snapshot}")
if not args.i_trust_this_file:
sys.exit(
"refusing to unpickle: pickle.loads is equivalent to executing arbitrary "
"code from the snapshot file. Only proceed if you created/control this "
"file, then re-run with --i-trust-this-file.\n"
f" file: {args.snapshot}"
)
print(
f"WARNING: unpickling {args.snapshot} — this executes code embedded in the "
"file. Only safe for snapshots you produced yourself.",
file=sys.stderr,
)
# The outer pickle wraps a dict; the inner pickle contains the actual organism
# objects, which must be importable under their original dotted path. If you
# ran a custom driver, make sure its module is on sys.path before calling this.
outer = pickle.loads(args.snapshot.read_bytes()) # noqa: S301 — gated by --i-trust-this-file
outer = pickle.loads(args.snapshot.read_bytes())
if not isinstance(outer, dict) or "population_snapshot" not in outer:
sys.exit("not a darwinian-evolver snapshot (no population_snapshot key)")
inner = pickle.loads(outer["population_snapshot"]) # noqa: S301 — gated by --i-trust-this-file
inner = pickle.loads(outer["population_snapshot"])
pairs = inner["organisms"] # list of (Organism, EvaluationResult)
print(f"# organisms: {len(pairs)}\n")
+3 -16
View File
@@ -33,7 +33,6 @@ from agent.image_gen_provider import (
error_response,
resolve_aspect_ratio,
save_b64_image,
save_url_image,
success_response,
)
@@ -267,21 +266,9 @@ class OpenAIImageGenProvider(ImageGenProvider):
)
image_ref = str(saved_path)
elif url:
# Defensive — gpt-image-2 returns b64 today, but OpenAI's API
# has previously returned URLs. Cache the bytes locally so the
# gateway never tries to fetch an ephemeral / signed URL after
# it expires — same rationale as the xAI provider (#26942).
try:
saved_path = save_url_image(url, prefix=f"openai_{tier_id}")
except Exception as exc:
logger.warning(
"OpenAI image URL %s could not be cached (%s); falling back to bare URL.",
url,
exc,
)
image_ref = url
else:
image_ref = str(saved_path)
# Defensive — gpt-image-2 returns b64 today, but fall back
# gracefully if the API ever changes.
image_ref = url
else:
return error_response(
error="OpenAI response contained neither b64_json nor URL",
+1 -19
View File
@@ -29,7 +29,6 @@ from agent.image_gen_provider import (
error_response,
resolve_aspect_ratio,
save_b64_image,
save_url_image,
success_response,
)
from tools.xai_http import hermes_xai_user_agent, resolve_xai_http_credentials
@@ -282,24 +281,7 @@ class XAIImageGenProvider(ImageGenProvider):
)
image_ref = str(saved_path)
elif url:
# xAI's grok-imagine-image returns ephemeral ``imgen.x.ai/xai-tmp-*``
# URLs that 404 within minutes — by the time Telegram's
# ``send_photo`` or any downstream consumer fetches them, the
# asset is gone (#26942). Materialise the bytes locally at
# tool-completion time so the gateway has a stable file path to
# upload, mirroring the b64 branch above and the audio_cache
# pattern used by text_to_speech.
try:
saved_path = save_url_image(url, prefix=f"xai_{model_id}")
except Exception as exc:
logger.warning(
"xAI image URL %s could not be cached (%s); falling back to bare URL.",
url,
exc,
)
image_ref = url
else:
image_ref = str(saved_path)
image_ref = url
else:
return error_response(
error="xAI response contained neither b64_json nor URL",
+5 -5
View File
@@ -629,13 +629,13 @@ class HindsightMemoryProvider(MemoryProvider):
def post_setup(self, hermes_home: str, config: dict) -> None:
"""Custom setup wizard — installs only the deps needed for the selected mode."""
import getpass
import subprocess
import shutil
import sys
from pathlib import Path
from hermes_cli.config import save_config
from hermes_cli.secret_prompt import masked_secret_prompt
from hermes_cli.memory_setup import _curses_select
@@ -696,11 +696,11 @@ class HindsightMemoryProvider(MemoryProvider):
masked = f"...{existing_key[-4:]}" if len(existing_key) > 4 else "set"
sys.stdout.write(f" API key (current: {masked}, blank to keep): ")
sys.stdout.flush()
api_key = masked_secret_prompt("") if sys.stdin.isatty() else sys.stdin.readline().strip()
api_key = getpass.getpass(prompt="") if sys.stdin.isatty() else sys.stdin.readline().strip()
else:
sys.stdout.write(" API key: ")
sys.stdout.flush()
api_key = masked_secret_prompt("") if sys.stdin.isatty() else sys.stdin.readline().strip()
api_key = getpass.getpass(prompt="") if sys.stdin.isatty() else sys.stdin.readline().strip()
if api_key:
env_writes["HINDSIGHT_API_KEY"] = api_key
@@ -714,7 +714,7 @@ class HindsightMemoryProvider(MemoryProvider):
sys.stdout.write(" API key (optional, blank to skip): ")
sys.stdout.flush()
api_key = masked_secret_prompt("") if sys.stdin.isatty() else sys.stdin.readline().strip()
api_key = getpass.getpass(prompt="") if sys.stdin.isatty() else sys.stdin.readline().strip()
if api_key:
env_writes["HINDSIGHT_API_KEY"] = api_key
@@ -750,7 +750,7 @@ class HindsightMemoryProvider(MemoryProvider):
sys.stdout.write(" LLM API key: ")
sys.stdout.flush()
llm_key = masked_secret_prompt("") if sys.stdin.isatty() else sys.stdin.readline().strip()
llm_key = getpass.getpass(prompt="") if sys.stdin.isatty() else sys.stdin.readline().strip()
if llm_key:
env_writes["HINDSIGHT_LLM_API_KEY"] = llm_key
else:
+2 -2
View File
@@ -314,8 +314,8 @@ def _prompt(label: str, default: str | None = None, secret: bool = False) -> str
sys.stdout.flush()
if secret:
if sys.stdin.isatty():
from hermes_cli.secret_prompt import masked_secret_prompt
val = masked_secret_prompt("")
import getpass
val = getpass.getpass(prompt="")
else:
# Non-TTY (piped input, test runners) — read plaintext
val = sys.stdin.readline().strip()
+22 -50
View File
@@ -61,8 +61,6 @@ import json
import logging
import os
import re
import secrets
import stat
import subprocess
import sys
from pathlib import Path
@@ -91,8 +89,6 @@ except (ModuleNotFoundError, ImportError):
except ValueError:
return str(home)
from utils import atomic_replace
def _hermes_home() -> Path:
"""Resolve HERMES_HOME at call time (NOT module import).
@@ -300,11 +296,14 @@ def list_authorized_emails() -> List[str]:
def _persist_credentials(creds: Any, token_path: Path) -> None:
"""Persist refreshed credentials atomically with private permissions."""
"""Atomic-ish JSON write of refreshed credentials."""
try:
_write_private_json(
token_path,
_normalize_authorized_user_payload(json.loads(creds.to_json())),
token_path.parent.mkdir(parents=True, exist_ok=True)
token_path.write_text(
json.dumps(
_normalize_authorized_user_payload(json.loads(creds.to_json())),
indent=2,
)
)
except Exception:
logger.debug(
@@ -326,38 +325,6 @@ def _normalize_authorized_user_payload(payload: dict) -> dict:
return normalized
def _write_private_json(path: Path, data: Any) -> None:
"""Atomically write JSON with 0o600 permissions where supported."""
path.parent.mkdir(parents=True, exist_ok=True)
try:
os.chmod(path.parent, 0o700)
except OSError:
pass
tmp_path = path.with_suffix(f".tmp.{os.getpid()}.{secrets.token_hex(4)}")
try:
fd = os.open(
str(tmp_path),
os.O_WRONLY | os.O_CREAT | os.O_EXCL,
stat.S_IRUSR | stat.S_IWUSR,
)
with os.fdopen(fd, "w", encoding="utf-8") as fh:
json.dump(data, fh, indent=2, ensure_ascii=False)
fh.flush()
os.fsync(fh.fileno())
atomic_replace(tmp_path, path)
try:
os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
except OSError:
pass
finally:
try:
if tmp_path.exists():
tmp_path.unlink()
except OSError:
pass
def _ensure_deps() -> None:
"""Check deps available; install if not; exit on failure."""
try:
@@ -435,21 +402,25 @@ def store_client_secret(path: str) -> None:
sys.exit(1)
target = _client_secret_path()
_write_private_json(target, data)
target.parent.mkdir(parents=True, exist_ok=True)
target.write_text(json.dumps(data, indent=2))
print(f"OK: Client secret saved to {target}")
def _save_pending_auth(*, state: str, code_verifier: str,
email: Optional[str] = None) -> None:
pending = _pending_auth_path(email)
_write_private_json(
pending,
{
"state": state,
"code_verifier": code_verifier,
"redirect_uri": _REDIRECT_URI,
"email": email or "",
},
pending.parent.mkdir(parents=True, exist_ok=True)
pending.write_text(
json.dumps(
{
"state": state,
"code_verifier": code_verifier,
"redirect_uri": _REDIRECT_URI,
"email": email or "",
},
indent=2,
)
)
@@ -577,7 +548,8 @@ def exchange_auth_code(code: str, email: Optional[str] = None) -> None:
token_payload["scopes"] = granted_scopes
token_path = _token_path(email)
_write_private_json(token_path, token_payload)
token_path.parent.mkdir(parents=True, exist_ok=True)
token_path.write_text(json.dumps(token_payload, indent=2))
_pending_auth_path(email).unlink(missing_ok=True)
print(f"OK: Authenticated. Token saved to {token_path}")
+2 -2
View File
@@ -1585,8 +1585,8 @@ def interactive_setup() -> None:
suffix = " [keep current]" if existing else ""
try:
if secret:
from hermes_cli.secret_prompt import masked_secret_prompt
value = masked_secret_prompt(f"{prompt}{suffix}: ")
import getpass
value = getpass.getpass(f"{prompt}{suffix}: ")
else:
value = input(f"{prompt}{suffix}: ").strip()
except (EOFError, KeyboardInterrupt):
-3
View File
@@ -1,3 +0,0 @@
from .adapter import register
__all__ = ["register"]
-49
View File
@@ -1,49 +0,0 @@
name: mattermost-platform
label: Mattermost
kind: platform
version: 1.0.0
description: >
Mattermost gateway adapter for Hermes Agent.
Connects to a self-hosted or cloud Mattermost instance via the v4 REST
API + WebSocket event stream and relays messages between Mattermost
channels/DMs and the Hermes agent. Supports thread-mode replies, native
file uploads, channel-scoped allowlists, and home-channel cron delivery.
author: NousResearch
requires_env:
- name: MATTERMOST_URL
description: "Mattermost server URL (e.g. https://mm.example.com)"
prompt: "Mattermost server URL"
password: false
- name: MATTERMOST_TOKEN
description: "Bot account token or personal-access token"
prompt: "Mattermost bot token"
password: true
optional_env:
- name: MATTERMOST_ALLOWED_USERS
description: "Comma-separated Mattermost user IDs allowed to talk to the bot"
prompt: "Allowed users (comma-separated)"
password: false
- name: MATTERMOST_ALLOW_ALL_USERS
description: "Allow any Mattermost user to trigger the bot (dev only)"
prompt: "Allow all users? (true/false)"
password: false
- name: MATTERMOST_HOME_CHANNEL
description: "Default channel ID for cron / notification delivery"
prompt: "Home channel ID"
password: false
- name: MATTERMOST_REPLY_MODE
description: "How replies are sent: 'thread' (nested) or 'off' (flat). Default: off."
prompt: "Reply mode (thread|off)"
password: false
- name: MATTERMOST_REQUIRE_MENTION
description: "Require @bot mention in channels (default true). Set false for free-response everywhere."
prompt: "Require @mention? (true/false)"
password: false
- name: MATTERMOST_FREE_RESPONSE_CHANNELS
description: "Comma-separated channel IDs where @mention is not required."
prompt: "Free-response channel IDs (comma-separated)"
password: false
- name: MATTERMOST_ALLOWED_CHANNELS
description: "If set, the bot only responds in these channels (whitelist)."
prompt: "Allowed channel IDs (comma-separated)"
password: false
+2 -2
View File
@@ -685,8 +685,8 @@ def interactive_setup() -> None:
suffix = " [keep current]" if existing else ""
try:
if secret:
from hermes_cli.secret_prompt import masked_secret_prompt
value = masked_secret_prompt(f"{prompt}{suffix}: ")
import getpass
value = getpass.getpass(f"{prompt}{suffix}: ")
else:
value = input(f"{prompt}{suffix}: ").strip()
except (EOFError, KeyboardInterrupt):
+3 -3
View File
@@ -11,7 +11,7 @@ Originally salvaged from PR #10600 by @Jaaneek; reshaped into the
generate-only surface.
Authentication: xAI Grok OAuth tokens (preferred billed against the
user's SuperGrok or X Premium+ subscription) or ``XAI_API_KEY``. Both routes are
user's SuperGrok subscription) or ``XAI_API_KEY``. Both routes are
resolved through ``tools.xai_http.resolve_xai_http_credentials`` so a
single login covers chat + TTS + image gen + video gen + transcription.
Output is an HTTPS URL from xAI's CDN; the gateway downloads and
@@ -216,7 +216,7 @@ class XAIVideoGenProvider(VideoGenProvider):
# Auth resolution lives entirely in the shared ``xai_grok`` post_setup
# hook (``hermes_cli/tools_config.py``) so the picker doesn't blindly
# prompt for an API key when the user is already signed in via xAI
# Grok OAuth (SuperGrok / Premium+) — TTS / image gen / video gen
# Grok OAuth (SuperGrok Subscription) — TTS / image gen / video gen
# all share the same credential resolver. The hook offers an
# OAuth-vs-API-key choice when neither is configured.
return {
@@ -295,7 +295,7 @@ class XAIVideoGenProvider(VideoGenProvider):
return error_response(
error=(
"No xAI credentials found. Sign in via `hermes auth add xai-oauth` "
"(SuperGrok / Premium+) or set XAI_API_KEY from "
"(SuperGrok subscription) or set XAI_API_KEY from "
"https://console.x.ai/."
),
error_type="auth_required",
+15
View File
@@ -246,6 +246,21 @@ python-version = "3.13"
unknown-argument = "warn"
redundant-cast = "ignore"
# Per-file rule overrides — see [tool.ty.overrides] below.
#
# Tests can't resolve their own third-party dev deps (pytest, etc.)
# under the lint-diff CI job because that job installs ``ty`` as a
# bare uv tool without the project's venv. Installing the full venv
# just to please the type checker would balloon the lint job; the
# diagnostics aren't actionable inside tests anyway because the
# imports demonstrably work at runtime (the same CI runs the full
# pytest suite in a different job). Suppress unresolved-import
# inside tests/ so the lint-diff PR comment stays useful.
[[tool.ty.overrides]]
include = ["tests/**"]
rules = { unresolved-import = "ignore" }
[tool.ruff]
preview = true # required for PLW1514 (unspecified-encoding) — preview rule
+8 -109
View File
@@ -124,7 +124,6 @@ from agent.memory_manager import StreamingContextScrubber, build_memory_context_
from agent.think_scrubber import StreamingThinkScrubber
from agent.retry_utils import jittered_backoff
from agent.error_classifier import classify_api_error, FailoverReason
from agent.redact import redact_sensitive_text
from agent.prompt_builder import (
DEFAULT_AGENT_IDENTITY, PLATFORM_HINTS,
MEMORY_GUIDANCE, SESSION_SEARCH_GUIDANCE, SKILLS_GUIDANCE,
@@ -885,11 +884,7 @@ class AIAgent:
1. ``providers.<id>.models.<model>.stale_timeout_seconds``
2. ``providers.<id>.stale_timeout_seconds``
3. ``HERMES_API_CALL_STALE_TIMEOUT`` env var
4. 90.0s default (time-to-first-byte for non-streaming / Codex
internal-streaming requests; lowered from 300s in May 2026 so
fallback providers kick in faster when upstream providers
stall). The detector still scales up for large contexts in
``_compute_non_stream_stale_timeout``.
4. 300.0s default
Returns ``(timeout_seconds, uses_implicit_default)`` so the caller can
preserve legacy behaviors that only apply when the user has *not*
@@ -904,80 +899,22 @@ class AIAgent:
if env_timeout is not None:
return float(env_timeout), False
return 90.0, True
return 300.0, True
def _compute_non_stream_stale_timeout(self, api_payload: Any) -> float:
"""Compute the effective non-stream stale timeout for this request.
Accepts either the full ``api_kwargs`` dict (Chat Completions or
Responses API) or a legacy ``messages`` list. Context-size scaling
applies the same way to both shapes via
:func:`agent.chat_completion_helpers.estimate_request_context_tokens`.
"""
def _compute_non_stream_stale_timeout(self, messages: list[dict[str, Any]]) -> float:
"""Compute the effective non-stream stale timeout for this request."""
stale_base, uses_implicit_default = self._resolved_api_call_stale_timeout_base()
base_url = getattr(self, "_base_url", None) or self.base_url or ""
if uses_implicit_default and base_url and is_local_endpoint(base_url):
return float("inf")
from agent.chat_completion_helpers import estimate_request_context_tokens
est_tokens = estimate_request_context_tokens(api_payload)
est_tokens = sum(len(str(v)) for v in messages) // 4
if est_tokens > 100_000:
return max(stale_base, 240.0)
return max(stale_base, 600.0)
if est_tokens > 50_000:
return max(stale_base, 150.0)
return max(stale_base, 450.0)
return stale_base
def _codex_silent_hang_hint(self, model: Optional[str] = None) -> Optional[str]:
"""Return an actionable hint when this request matches a known
Codex silent-reject configuration, else ``None``.
The ChatGPT Codex backend (``chatgpt.com/backend-api/codex``) has
historically silently dropped certain model requests: the connection
is accepted but no stream events are emitted and no error is raised.
The stale-call detector ends the hang, but a generic "timed out"
message gives the user no path forward.
This helper substitutes an actionable hint into the stale-timeout
warning when the request matches a known silent-reject pattern.
Currently flagged: ``gpt-5.5`` family on the Codex backend. See
hermes-agent #21444 for the symptom history. The upstream backend
behavior has historically come and gone with ChatGPT entitlement
changes the heuristic stays in place as future-proofing even when
the symptom is dormant.
Does NOT fix the backend issue. Only converts an opaque stale-timeout
into actionable text so users learn the workaround in seconds rather
than digging through logs.
"""
if self.api_mode != "codex_responses":
return None
is_codex_backend = (
self.provider == "openai-codex"
or (
getattr(self, "_base_url_hostname", "") == "chatgpt.com"
and "/backend-api/codex" in (getattr(self, "_base_url_lower", "") or "")
)
)
if not is_codex_backend:
return None
eff_model = (model if model is not None else self.model) or ""
model_lower = eff_model.lower()
# Match the gpt-5.5 family — bare ``gpt-5.5``, ``gpt-5.5-codex``,
# vendor-prefixed variants like ``openai/gpt-5.5``, and any future
# ``gpt-5.5-*`` SKU. Anchor at a word boundary on either side so
# unrelated tokens like ``gpt-5.50`` do not match.
if not re.search(r"(?:^|[/\-_])gpt-5\.5(?:$|[\-_])", model_lower):
return None
return (
f"Codex backend appears to be silently rejecting {eff_model!r} "
"on chatgpt.com/backend-api/codex (no stream events, no error). "
"This is a known backend-side pattern that has affected ChatGPT "
"Plus accounts intermittently. "
"Workaround: try `gpt-5.4-codex` on the same OAuth profile, "
"or switch to a different model/provider in your fallback chain. "
"See hermes-agent#21444 for symptom history."
)
def _is_openrouter_url(self) -> bool:
"""Return True when the base URL targets OpenRouter."""
return base_url_host_matches(self._base_url_lower, "openrouter.ai")
@@ -1609,36 +1546,6 @@ class AIAgent:
content = re.sub(r'(</think>)\n+', r'\1\n', content)
return content.strip()
@staticmethod
def _redact_message_content(content):
"""Apply secret redaction to message content (str or list-of-parts).
Handles both plain-string content and the OpenAI/Anthropic multimodal
shape where ``content`` is a list of ``{"type": "text", "text": ...}``
/ ``{"type": "image_url", ...}`` / ``{"type": "input_text", "content": ...}``
parts. Image / binary parts are left untouched; only text fields are
passed through ``redact_sensitive_text``.
Respects ``HERMES_REDACT_SECRETS`` via ``redact_sensitive_text``
when disabled the helper is effectively a no-op.
"""
if content is None:
return content
if isinstance(content, str):
return redact_sensitive_text(content)
if isinstance(content, list):
redacted = []
for part in content:
if isinstance(part, dict):
part = dict(part)
if isinstance(part.get("text"), str):
part["text"] = redact_sensitive_text(part["text"])
if isinstance(part.get("content"), str):
part["content"] = redact_sensitive_text(part["content"])
redacted.append(part)
return redacted
return content
def _save_session_log(self, messages: List[Dict[str, Any]] = None):
"""Optional per-session JSON snapshot writer.
@@ -1674,14 +1581,6 @@ class AIAgent:
if msg.get("role") == "assistant" and msg.get("content"):
msg = dict(msg)
msg["content"] = self._clean_session_content(msg["content"])
# Defence-in-depth: redact credentials from every message
# content before persistence. Catches PATs / API keys / Bearer
# tokens that may have leaked into assistant responses, tool
# output, or user paste. Respects HERMES_REDACT_SECRETS via
# redact_sensitive_text — no-op when disabled. (#19798, #19845)
if "content" in msg:
msg = dict(msg)
msg["content"] = self._redact_message_content(msg.get("content"))
cleaned.append(msg)
# Guard: never overwrite a larger session log with fewer messages.
@@ -1707,7 +1606,7 @@ class AIAgent:
"platform": self.platform,
"session_start": self.session_start.isoformat(),
"last_updated": datetime.now().isoformat(),
"system_prompt": redact_sensitive_text(self._cached_system_prompt or ""),
"system_prompt": self._cached_system_prompt or "",
"tools": self.tools or [],
"message_count": len(cleaned),
"messages": cleaned,
-28
View File
@@ -45,19 +45,15 @@ ACP_REGISTRY_MANIFEST = REPO_ROOT / "acp_registry" / "agent.json"
# Auto-extracted from noreply emails + manual overrides
AUTHOR_MAP = {
"9592417+adam91holt@users.noreply.github.com": "adam91holt",
# teknium (multiple emails)
"teknium1@gmail.com": "teknium1",
"kenyon1977@gmail.com": "kenyonxu",
"cipherframe@users.noreply.github.com": "CipherFrame",
"121752779+jacevys@users.noreply.github.com": "jacevys",
"me@promplate.dev": "CNSeniorious000",
"yichengqiao21@gmail.com": "YarrowQiao",
"erhanyasarx@gmail.com": "erhnysr",
"30366221+WorldWriter@users.noreply.github.com": "WorldWriter",
"dafeng@DafengdeMacBook-Pro.local": "WorldWriter",
"schepers.zander1@gmail.com": "Strontvod",
"ed@bebop.crew": "someaka",
"anadi.jaggia@gmail.com": "Jaggia",
"32201324+simpolism@users.noreply.github.com": "simpolism",
"simpolism@gmail.com": "simpolism",
@@ -80,22 +76,6 @@ AUTHOR_MAP = {
"189280367+Lempkey@users.noreply.github.com": "Lempkey",
"34853915+m0n3r0@users.noreply.github.com": "m0n3r0",
"leeseoki@makestar.com": "leeseoki0",
"kronexoi13@gmail.com": "kronexoi",
"hua.zhong@kingsmith.com": "vgocoder",
"hermes@marian.local": "Schrotti77",
"1920071390@campus.ouj.ac.jp": "zapabob",
"gaia@gaia.local": "jfuenmayor",
"jiahuigu@users.noreply.github.com": "Jiahui-Gu",
"openhands@all-hands.dev": "YLChen-007",
"AdamPlatin123@outlook.com": "AdamPlatin123",
"32711803+waefrebeorn@users.noreply.github.com": "waefrebeorn",
"32869278+dusterbloom@users.noreply.github.com": "dusterbloom",
"liuhao1024@users.noreply.github.com": "liuhao1024",
"kylekahraman@users.noreply.github.com": "kylekahraman",
"130975919+kylekahraman@users.noreply.github.com": "kylekahraman",
"dsr-restyn@users.noreply.github.com": "dsr-restyn",
"210765158+WuKongAI-CMU@users.noreply.github.com": "WuKongAI-CMU",
"lichriszhang@gmail.com": "codeblackhole1024",
"leovillalbajr@gmail.com": "Lempkey",
"nidhi2894@gmail.com": "nidhi-singh02",
"30312689+aashizpoudel@users.noreply.github.com": "aashizpoudel",
@@ -601,7 +581,6 @@ AUTHOR_MAP = {
"mgparkprint@gmail.com": "vlwkaos",
"1317078257maroon@gmail.com": "Oxidane-bot",
"tranquil_flow@protonmail.com": "Tranquil-Flow",
"66773372+Tranquil-Flow@users.noreply.github.com": "Tranquil-Flow",
"LyleLengyel@gmail.com": "mcndjxlefnd",
"wangshengyang2004@163.com": "Wangshengyang2004",
"hasan.ali13381@gmail.com": "H-Ali13381",
@@ -1254,8 +1233,6 @@ AUTHOR_MAP = {
"165905879+davidcampbelldc@users.noreply.github.com": "davidcampbelldc",
"hoangv.pham0803@gmail.com": "hehehe0803", # PR #26212 salvage (codex kanban writable root)
"26063003+hehehe0803@users.noreply.github.com": "hehehe0803",
"kasunvinod@users.noreply.github.com": "kasunvinod", # PR #24126 salvage (codex timeout propagation)
"15059870+kasunvinod@users.noreply.github.com": "kasunvinod",
"38348871+vaddisrinivas@users.noreply.github.com": "vaddisrinivas", # PR #26394 salvage (Docker messaging extra)
# batch salvage (May 2026 LHF run, group 7)
"198679067+02356abc@users.noreply.github.com": "02356abc", # PR #28286 salvage (wecom CLOSING)
@@ -1307,11 +1284,6 @@ AUTHOR_MAP = {
"edison@mcclean.codes": "McClean-Edison", # PR #29817 (register_auxiliary_task plugin API)
"zhangsamuel12@gmail.com": "SamuelZ12", # PR #7480 (show recap after in-session resume)
"490408354@qq.com": "daizhonggeng", # PR #9020 (numbered /resume selection)
"claw@openclaw.ai": "wanwan2qq", # PR #10215 (strip brackets/quotes from /resume; gateway session-ID lookup)
"simo.kiihamaki@gmail.com": "SimoKiihamaki", # PR #30773 (Windows /reset+/new freeze; stdin fallback for modal)
"66773372+Tranquil-Flow@users.noreply.github.com": "Tranquil-Flow", # PR #27518 (bracketed-paste timeout)
"8bit64k@pm.me": "8bit64k", # PR #14681 (TUI /q alias from quit to queue)
"chenglunhu@gmail.com": "hclsys", # PR #31985 (TUI /q alias regression test)
}
-6
View File
@@ -329,15 +329,9 @@ fi
if [ ! -f ".env" ]; then
if [ -f ".env.example" ]; then
cp .env.example .env
# .env holds API keys — restrict to owner-only access (matches
# scripts/install.sh which already chmods 600 after creation).
chmod 600 .env 2>/dev/null || true
echo -e "${GREEN}${NC} Created .env from template"
fi
else
# Tighten an existing .env's perms in case it was created elsewhere
# under a permissive umask.
chmod 600 .env 2>/dev/null || true
echo -e "${GREEN}${NC} .env exists"
fi
+1 -8
View File
@@ -1621,14 +1621,7 @@ class TestSlashCommands:
assert "Provider: anthropic" in result
assert state.agent.provider == "anthropic"
assert state.agent.base_url == "https://anthropic.example/v1"
# ``state.agent.provider == "anthropic"`` plus the base_url check above
# already prove ``fake_resolve_runtime_provider`` was called with
# ``requested="anthropic"`` for the model-switch step — the agent's
# provider/base_url come from that fake's return value. The legacy
# ``runtime_calls[-1] == "anthropic"`` assertion was flaky in CI
# under specific xdist-slice scheduling (saw ``'custom' == 'anthropic'``
# repeatedly) and was redundant with those checks, so it's gone.
assert "anthropic" in runtime_calls
assert runtime_calls[-1] == "anthropic"
# ---------------------------------------------------------------------------
-19
View File
@@ -1,7 +1,6 @@
"""Tests for agent/anthropic_adapter.py — Anthropic Messages API adapter."""
import json
import sys
import time
from types import SimpleNamespace
from unittest.mock import patch, MagicMock
@@ -421,24 +420,6 @@ class TestWriteClaudeCodeCredentials:
assert data["otherField"] == "keep-me"
assert data["claudeAiOauth"]["accessToken"] == "new-tok"
@pytest.mark.skipif(sys.platform.startswith("win"), reason="POSIX mode bits not enforced on Windows")
def test_credentials_file_created_with_0o600(self, tmp_path, monkeypatch):
"""Refreshed Claude Code credentials must land on disk at 0o600.
Regression for the TOCTOU race where ``write_text`` + ``replace``
+ post-write ``chmod`` left both the temp file and the destination
briefly readable at the process umask (commonly 0o644). Mirrors
the fix shipped in #19673 (google_oauth) and #21148 (mcp_oauth).
"""
import stat as _stat
monkeypatch.setattr("agent.anthropic_adapter.Path.home", lambda: tmp_path)
_write_claude_code_credentials("tok", "ref", 12345)
cred_file = tmp_path / ".claude" / ".credentials.json"
assert cred_file.exists()
mode = _stat.S_IMODE(cred_file.stat().st_mode)
assert mode == 0o600, f"creds file mode {oct(mode)} != 0o600 — TOCTOU race regressed"
class TestResolveWithRefresh:
def test_auto_refresh_on_expired_creds(self, monkeypatch, tmp_path):
-149
View File
@@ -430,155 +430,6 @@ class TestBuildCodexClient:
assert mock_openai.call_count == 2
class TestResolveProviderClientUniversalModelFallback:
"""resolve_provider_client() picks a sensible model when callers pass none (#31845).
Aux tasks (title generation, vision, session search, etc.) routinely
reach this function without an explicit model the user's main
provider was picked via ``hermes model``, no per-task override is
set, and the expectation is "just use my main model for side tasks
too." The resolver fills in ``model`` from a 3-step universal
fallback before any provider branch runs:
1. ``model`` argument (caller knew what they wanted)
2. provider's catalog default (cheap aux model, if registered)
3. user's main model (``model.model`` in config.yaml)
Pre-fix the OAuth providers (xai-oauth, openai-codex) returned
``(None, None)`` on an empty model both lack a catalog default
because their accepted-model lists drift on the backend. That
silent failure caused ``_resolve_auto`` to drop to its Step-2
fallback chain (OpenRouter / Nous / etc.), so aux tasks billed
against the wrong subscription.
"""
def test_empty_model_for_oauth_provider_falls_back_to_main_model(self):
"""xai-oauth: no catalog default → uses main model."""
from agent.auxiliary_client import resolve_provider_client
with (
patch(
"agent.auxiliary_client._read_main_model",
return_value="grok-4.3",
),
patch(
"agent.auxiliary_client._get_aux_model_for_provider",
return_value="", # xai-oauth has no catalog default
),
patch(
"agent.auxiliary_client._build_xai_oauth_aux_client",
return_value=(MagicMock(), "grok-4.3"),
) as mock_build,
):
client, model = resolve_provider_client("xai-oauth", "")
assert client is not None, (
"should not fall through when main model is set"
)
assert model == "grok-4.3"
# The builder receives the main-model fallback, never the empty
# string the caller passed.
assert mock_build.call_args.args[0] == "grok-4.3"
def test_empty_model_for_codex_also_uses_main_model(self):
"""openai-codex: symmetric with xai-oauth — same universal fallback."""
from agent.auxiliary_client import resolve_provider_client
with (
patch(
"agent.auxiliary_client._read_main_model",
return_value="gpt-5.4",
),
patch(
"agent.auxiliary_client._get_aux_model_for_provider",
return_value="", # openai-codex has no catalog default either
),
patch(
"agent.auxiliary_client._build_codex_client",
return_value=(MagicMock(), "gpt-5.4"),
) as mock_build,
patch(
"agent.auxiliary_client._select_pool_entry",
return_value=(True, None),
),
):
client, model = resolve_provider_client("openai-codex", "")
assert client is not None
assert model == "gpt-5.4"
assert mock_build.call_args.args[0] == "gpt-5.4"
def test_empty_model_for_catalog_provider_uses_catalog_default(self):
"""anthropic / nous / openrouter / etc.: catalog default wins
over main model when no explicit model is passed.
This preserves the original \"cheap aux model for direct API
providers\" behaviour — users on anthropic for their main chat
still get claude-haiku-4-5 for title generation, NOT their
expensive chat model. Step 2 of the universal fallback chain.
"""
from agent.auxiliary_client import resolve_provider_client
with (
patch(
"agent.auxiliary_client._read_main_model",
# Main model is the expensive opus; if this leaks into
# aux it costs real money.
return_value="claude-opus-4-6",
) as mock_read_main,
patch(
"agent.auxiliary_client._get_aux_model_for_provider",
return_value="claude-haiku-4-5-20251001",
),
patch(
"agent.anthropic_adapter.build_anthropic_client",
return_value=MagicMock(),
),
patch(
"agent.anthropic_adapter.resolve_anthropic_token",
return_value="sk-ant-***",
),
patch(
"agent.auxiliary_client._read_nous_auth", return_value=None
),
):
client, model = resolve_provider_client("anthropic", "")
# Catalog default takes precedence — main_model was a no-op
# because step 2 of the fallback chain already produced a model.
assert client is not None
assert model == "claude-haiku-4-5-20251001"
mock_read_main.assert_not_called()
def test_explicit_model_takes_precedence_over_fallbacks(self):
"""Step 1: caller-passed model wins. Per-task config
(``auxiliary.<task>.model``) routes here when the user
explicitly picks gemini-3-flash for title generation, that's
what runs, not their main model.
"""
from agent.auxiliary_client import resolve_provider_client
with (
patch("agent.auxiliary_client._read_main_model") as mock_read_main,
patch(
"agent.auxiliary_client._get_aux_model_for_provider",
return_value="catalog-default-should-not-be-used",
),
patch(
"agent.auxiliary_client._build_xai_oauth_aux_client",
return_value=(MagicMock(), "grok-4.20-multi-agent"),
) as mock_build,
):
client, model = resolve_provider_client(
"xai-oauth", "grok-4.20-multi-agent",
)
assert client is not None
assert model == "grok-4.20-multi-agent"
mock_read_main.assert_not_called()
assert mock_build.call_args.args[0] == "grok-4.20-multi-agent"
class TestExpiredCodexFallback:
"""Test that expired Codex tokens don't block the auto chain."""
-175
View File
@@ -1,175 +0,0 @@
"""Regression tests for the Codex time-to-first-byte (TTFB) watchdog.
The chatgpt.com/backend-api/codex endpoint has an intermittent failure mode
where it accepts the connection but never emits a single stream event. The
watchdog in ``interruptible_api_call`` kills such a connection at a short TTFB
cutoff (instead of waiting out the much longer wall-clock stale timeout) so the
retry loop can reconnect promptly. Once any stream event arrives, the stream is
considered healthy and only the wall-clock stale timeout applies long
generations must never be interrupted by the TTFB cutoff.
The "bytes flowing" signal is ``agent._codex_stream_last_event_ts``, set on
*any* event by ``codex_runtime.run_codex_stream`` so reasoning-only or
tool-call-only turns (which emit no output-text deltas) are not mistaken for a
stall.
"""
from __future__ import annotations
import sys
import time
import types
from types import SimpleNamespace
import pytest
# Stub optional heavy imports so run_agent imports cleanly in isolation.
sys.modules.setdefault("fire", types.SimpleNamespace(Fire=lambda *a, **k: None))
sys.modules.setdefault("firecrawl", types.SimpleNamespace(Firecrawl=object))
sys.modules.setdefault("fal_client", types.SimpleNamespace())
def _make_codex_agent(tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
(tmp_path / ".env").write_text("", encoding="utf-8")
(tmp_path / "config.yaml").write_text("{}\n", encoding="utf-8")
from run_agent import AIAgent
agent = AIAgent(
model="gpt-5.5",
provider="openai-codex",
api_key="sk-dummy",
base_url="https://chatgpt.com/backend-api/codex",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
platform="cli",
)
# The watchdog is gated on the codex_responses api_mode; assert/force it so
# the test is robust to detection-logic changes elsewhere.
agent.api_mode = "codex_responses"
monkeypatch.setattr(agent, "_emit_status", lambda *a, **k: None)
# Keep the wall-clock stale timeout high so any early kill is unambiguously
# the TTFB path, not the stale-call path.
monkeypatch.setattr(
agent, "_compute_non_stream_stale_timeout", lambda *a, **k: 60.0
)
return agent
def test_ttfb_kills_when_no_stream_event(tmp_path, monkeypatch):
"""Backend accepts the connection but emits no event -> killed at the TTFB
cutoff, well before the 60s wall-clock stale timeout, with a retryable
TimeoutError and a ``codex_ttfb_kill`` close reason."""
from agent import chat_completion_helpers as h
agent = _make_codex_agent(tmp_path, monkeypatch)
monkeypatch.setenv("HERMES_CODEX_TTFB_TIMEOUT_SECONDS", "1")
closes: list = []
dummy_client = SimpleNamespace()
monkeypatch.setattr(agent, "_create_request_openai_client", lambda **k: dummy_client)
monkeypatch.setattr(
agent, "_abort_request_openai_client",
lambda c, reason=None: closes.append(reason),
)
monkeypatch.setattr(
agent, "_close_request_openai_client",
lambda c, reason=None: closes.append(reason),
)
stop = {"flag": False}
def fake_hang(api_kwargs, client=None, on_first_delta=None):
# Never set _codex_stream_last_event_ts: simulate zero events arriving.
deadline = time.time() + 30
while time.time() < deadline and not stop["flag"] and not agent._interrupt_requested:
time.sleep(0.02)
raise RuntimeError("connection closed")
monkeypatch.setattr(agent, "_run_codex_stream", fake_hang)
t0 = time.time()
try:
with pytest.raises(TimeoutError) as excinfo:
h.interruptible_api_call(agent, {"model": "gpt-5.5", "input": "hi"})
elapsed = time.time() - t0
assert "TTFB" in str(excinfo.value)
assert "codex_ttfb_kill" in closes
# ~1s cutoff + 2s join grace; must be far under the 60s stale timeout.
assert elapsed < 15, f"TTFB watchdog took {elapsed:.1f}s"
finally:
stop["flag"] = True
def test_ttfb_does_not_kill_when_events_flow(tmp_path, monkeypatch):
"""Once a stream event has arrived, a generation that runs past the TTFB
cutoff is NOT killed by the watchdog it completes normally."""
from agent import chat_completion_helpers as h
agent = _make_codex_agent(tmp_path, monkeypatch)
monkeypatch.setenv("HERMES_CODEX_TTFB_TIMEOUT_SECONDS", "1")
closes: list = []
dummy_client = SimpleNamespace()
monkeypatch.setattr(agent, "_create_request_openai_client", lambda **k: dummy_client)
monkeypatch.setattr(
agent, "_abort_request_openai_client",
lambda c, reason=None: closes.append(reason),
)
monkeypatch.setattr(
agent, "_close_request_openai_client",
lambda c, reason=None: closes.append(reason),
)
sentinel = SimpleNamespace(ok=True)
def fake_stream(api_kwargs, client=None, on_first_delta=None):
# Bytes flowing: mark stream activity right away, then keep generating
# past the 1s TTFB cutoff before returning a real response.
agent._codex_stream_last_event_ts = time.time()
if on_first_delta:
on_first_delta()
time.sleep(2.0)
return sentinel
monkeypatch.setattr(agent, "_run_codex_stream", fake_stream)
resp = h.interruptible_api_call(agent, {"model": "gpt-5.5", "input": "hi"})
assert resp is sentinel
assert "codex_ttfb_kill" not in closes
def test_ttfb_disabled_via_env_zero(tmp_path, monkeypatch):
"""Setting HERMES_CODEX_TTFB_TIMEOUT_SECONDS=0 disables the TTFB watchdog;
a no-event stall then falls through to the (here, 60s) stale timeout, so a
short hang is NOT killed by TTFB."""
from agent import chat_completion_helpers as h
agent = _make_codex_agent(tmp_path, monkeypatch)
monkeypatch.setenv("HERMES_CODEX_TTFB_TIMEOUT_SECONDS", "0")
closes: list = []
dummy_client = SimpleNamespace()
monkeypatch.setattr(agent, "_create_request_openai_client", lambda **k: dummy_client)
monkeypatch.setattr(
agent, "_abort_request_openai_client",
lambda c, reason=None: closes.append(reason),
)
monkeypatch.setattr(
agent, "_close_request_openai_client",
lambda c, reason=None: closes.append(reason),
)
sentinel = SimpleNamespace(ok=True)
def fake_stream(api_kwargs, client=None, on_first_delta=None):
# No event marker, but only briefly — well under the 60s stale timeout.
time.sleep(2.0)
return sentinel
monkeypatch.setattr(agent, "_run_codex_stream", fake_stream)
resp = h.interruptible_api_call(agent, {"model": "gpt-5.5", "input": "hi"})
assert resp is sentinel
assert "codex_ttfb_kill" not in closes
-318
View File
@@ -395,324 +395,6 @@ def test_load_pool_seeds_env_api_key(tmp_path, monkeypatch):
def test_load_pool_does_not_persist_env_seeded_secret_value(tmp_path, monkeypatch):
"""Runtime env keys may be used in memory but must not land in auth.json."""
sentinel = "S3NTINEL_DO_NOT_PERSIST_OPENROUTER"
monkeypatch.setenv("HERMES_HOME", str(tmp_path / "hermes"))
monkeypatch.setenv("OPENROUTER_API_KEY", sentinel)
_write_auth_store(tmp_path, {"version": 1, "providers": {}})
from agent.credential_pool import load_pool
pool = load_pool("openrouter")
entry = pool.select()
assert entry is not None
assert entry.source == "env:OPENROUTER_API_KEY"
assert entry.access_token == sentinel
auth_text = (tmp_path / "hermes" / "auth.json").read_text()
assert sentinel not in auth_text
persisted = json.loads(auth_text)["credential_pool"]["openrouter"][0]
assert persisted["source"] == "env:OPENROUTER_API_KEY"
assert persisted["label"] == "OPENROUTER_API_KEY"
assert persisted["auth_type"] == "api_key"
assert persisted["priority"] == 0
assert "access_token" not in persisted
assert persisted["secret_fingerprint"].startswith("sha256:")
def test_load_pool_persists_bitwarden_origin_metadata_without_secret(tmp_path, monkeypatch):
"""Bitwarden-injected env vars retain source metadata but not raw values."""
sentinel = "S3NTINEL_DO_NOT_PERSIST_BITWARDEN"
monkeypatch.setenv("HERMES_HOME", str(tmp_path / "hermes"))
monkeypatch.setenv("OPENROUTER_API_KEY", sentinel)
monkeypatch.setattr(
"hermes_cli.env_loader.get_secret_source",
lambda env_var: "bitwarden" if env_var == "OPENROUTER_API_KEY" else None,
)
_write_auth_store(tmp_path, {"version": 1, "providers": {}})
from agent.credential_pool import load_pool
pool = load_pool("openrouter")
entry = pool.select()
assert entry is not None
assert entry.access_token == sentinel
assert entry.source == "env:OPENROUTER_API_KEY"
auth_text = (tmp_path / "hermes" / "auth.json").read_text()
assert sentinel not in auth_text
persisted = json.loads(auth_text)["credential_pool"]["openrouter"][0]
assert persisted["source"] == "env:OPENROUTER_API_KEY"
assert persisted["secret_source"] == "bitwarden"
assert "access_token" not in persisted
def test_load_pool_sanitizes_legacy_raw_borrowed_entry_when_value_unchanged(tmp_path, monkeypatch):
"""Existing raw env-seeded pool entries are rewritten even if the env value matches."""
sentinel = "S3NTINEL_DO_NOT_PERSIST_LEGACY_RAW"
monkeypatch.setenv("HERMES_HOME", str(tmp_path / "hermes"))
monkeypatch.setenv("OPENROUTER_API_KEY", sentinel)
_write_auth_store(
tmp_path,
{
"version": 1,
"credential_pool": {
"openrouter": [
{
"id": "legacy-env",
"label": "OPENROUTER_API_KEY",
"auth_type": "api_key",
"priority": 0,
"source": "env:OPENROUTER_API_KEY",
"access_token": sentinel,
"base_url": "https://openrouter.ai/api/v1",
}
]
},
},
)
from agent.credential_pool import load_pool
pool = load_pool("openrouter")
entry = pool.select()
assert entry is not None
assert entry.access_token == sentinel
auth_text = (tmp_path / "hermes" / "auth.json").read_text()
assert sentinel not in auth_text
persisted = json.loads(auth_text)["credential_pool"]["openrouter"][0]
assert persisted["id"] == "legacy-env"
assert "access_token" not in persisted
assert persisted["secret_fingerprint"].startswith("sha256:")
def test_pooled_credential_to_dict_strips_borrowed_secret_fields():
from agent.credential_pool import PooledCredential
sentinel = "S3NTINEL_DO_NOT_PERSIST_TO_DICT"
credential = PooledCredential(
provider="openrouter",
id="borrowed-1",
label="vault-ref",
auth_type="api_key",
priority=3,
source="vault:openrouter/api-key",
access_token=sentinel,
refresh_token=f"refresh-{sentinel}",
agent_key=f"agent-{sentinel}",
request_count=7,
last_status="ok",
extra={
"api_key": f"extra-{sentinel}",
"client_secret": f"client-{sentinel}",
"secret_key": f"secret-key-{sentinel}",
"authToken": f"auth-token-{sentinel}",
"refreshToken": f"camel-refresh-{sentinel}",
"authorization": f"Bearer {sentinel}",
"tokens": {"access_token": f"nested-{sentinel}"},
"token_type": "Bearer",
"scope": "inference",
},
)
payload = credential.to_dict()
serialized = json.dumps(payload)
assert sentinel not in serialized
assert "access_token" not in payload
assert "refresh_token" not in payload
assert "agent_key" not in payload
assert "api_key" not in payload
assert "client_secret" not in payload
assert "secret_key" not in payload
assert "authToken" not in payload
assert "refreshToken" not in payload
assert "authorization" not in payload
assert "tokens" not in payload
assert payload["source"] == "vault:openrouter/api-key"
assert payload["label"] == "vault-ref"
assert payload["request_count"] == 7
assert payload["token_type"] == "Bearer"
assert payload["scope"] == "inference"
assert payload["secret_fingerprint"].startswith("sha256:")
@pytest.mark.parametrize("source", [
"age://openrouter/api-key",
"systemd",
"keyring",
"1password",
"pass",
"sops",
"future_secret_store:openrouter",
])
def test_borrowed_source_variants_strip_secret_fields(source):
from agent.credential_pool import PooledCredential
sentinel = f"S3NTINEL_DO_NOT_PERSIST_{source.replace(':', '_').replace('/', '_')}"
credential = PooledCredential(
provider="openrouter",
id="borrowed-variant",
label="borrowed",
auth_type="api_key",
priority=0,
source=source,
access_token=sentinel,
refresh_token=f"refresh-{sentinel}",
)
payload = credential.to_dict()
serialized = json.dumps(payload)
assert sentinel not in serialized
assert "access_token" not in payload
assert "refresh_token" not in payload
assert payload["source"] == source
assert payload["secret_fingerprint"].startswith("sha256:")
def test_load_pool_prunes_stale_borrowed_custom_config_entry(tmp_path, monkeypatch):
sentinel = "S3NTINEL_DO_NOT_PERSIST_STALE_CUSTOM"
monkeypatch.setenv("HERMES_HOME", str(tmp_path / "hermes"))
_write_auth_store(
tmp_path,
{
"version": 1,
"credential_pool": {
"custom:foo": [
{
"id": "stale-custom",
"label": "Foo",
"auth_type": "api_key",
"priority": 0,
"source": "config:Foo",
"access_token": sentinel,
"base_url": "https://foo.example/v1",
}
]
},
},
)
from agent.credential_pool import load_pool
pool = load_pool("custom:foo")
assert pool.entries() == []
auth_text = (tmp_path / "hermes" / "auth.json").read_text()
assert sentinel not in auth_text
assert json.loads(auth_text)["credential_pool"]["custom:foo"] == []
def test_write_credential_pool_sanitizes_borrowed_payload_at_disk_boundary(tmp_path, monkeypatch):
"""Direct dictionary callers cannot bypass the borrowed-secret guard."""
sentinel = "S3NTINEL_DO_NOT_PERSIST_DIRECT_WRITE"
manual_secret = "MANUAL_SECRET_STAYS_PERSISTABLE"
monkeypatch.setenv("HERMES_HOME", str(tmp_path / "hermes"))
from hermes_cli.auth import write_credential_pool
write_credential_pool("openrouter", [
{
"id": "borrowed-1",
"label": "systemd-ref",
"auth_type": "api_key",
"priority": 0,
"source": "systemd://hermes/openrouter",
"access_token": sentinel,
"refresh_token": f"refresh-{sentinel}",
"agent_key": f"agent-{sentinel}",
"api_key": f"extra-{sentinel}",
},
{
"id": "manual-1",
"label": "manual",
"auth_type": "api_key",
"priority": 1,
"source": "manual",
"access_token": manual_secret,
},
])
auth_text = (tmp_path / "hermes" / "auth.json").read_text()
assert sentinel not in auth_text
assert manual_secret in auth_text
entries = json.loads(auth_text)["credential_pool"]["openrouter"]
borrowed, manual = entries
assert borrowed["source"] == "systemd://hermes/openrouter"
assert "access_token" not in borrowed
assert "refresh_token" not in borrowed
assert "agent_key" not in borrowed
assert "api_key" not in borrowed
assert borrowed["secret_fingerprint"].startswith("sha256:")
assert manual["access_token"] == manual_secret
def test_write_credential_pool_treats_unowned_oauth_source_as_borrowed(tmp_path, monkeypatch):
sentinel = "S3NTINEL_DO_NOT_PERSIST_UNOWNED_OAUTH"
monkeypatch.setenv("HERMES_HOME", str(tmp_path / "hermes"))
from hermes_cli.auth import write_credential_pool
write_credential_pool("openrouter", [
{
"id": "unowned-oauth",
"label": "unowned-oauth",
"auth_type": "oauth",
"priority": 0,
"source": "oauth",
"access_token": sentinel,
"refresh_token": f"refresh-{sentinel}",
}
])
auth_text = (tmp_path / "hermes" / "auth.json").read_text()
assert sentinel not in auth_text
persisted = json.loads(auth_text)["credential_pool"]["openrouter"][0]
assert persisted["source"] == "oauth"
assert "access_token" not in persisted
assert "refresh_token" not in persisted
assert persisted["secret_fingerprint"].startswith("sha256:")
def test_write_credential_pool_preserves_known_provider_owned_oauth_state(tmp_path, monkeypatch):
sentinel = "PROVIDER_OWNED_DEVICE_CODE_STAYS_PERSISTABLE"
monkeypatch.setenv("HERMES_HOME", str(tmp_path / "hermes"))
from hermes_cli.auth import write_credential_pool
write_credential_pool("nous", [
{
"id": "nous-device",
"label": "device-code",
"auth_type": "oauth",
"priority": 0,
"source": "device_code",
"access_token": sentinel,
"refresh_token": f"refresh-{sentinel}",
"agent_key": f"agent-{sentinel}",
}
])
persisted = json.loads((tmp_path / "hermes" / "auth.json").read_text())["credential_pool"]["nous"][0]
assert persisted["access_token"] == sentinel
assert persisted["refresh_token"] == f"refresh-{sentinel}"
assert persisted["agent_key"] == f"agent-{sentinel}"
def test_load_pool_prefers_dotenv_over_stale_os_environ(tmp_path, monkeypatch):
"""Regression for #18254: stale OPENROUTER_API_KEY in os.environ (inherited
from a parent shell) must NOT shadow the fresh key in ~/.hermes/.env when
-150
View File
@@ -1,150 +0,0 @@
"""Tests for agent/file_safety.py read guards — env file blocking.
Run with: python -m pytest tests/agent/test_file_safety.py -v
"""
import os
import tempfile
from pathlib import Path
from unittest.mock import patch
import pytest
from agent.file_safety import (
_BLOCKED_PROJECT_ENV_BASENAMES,
get_read_block_error,
)
# ---------------------------------------------------------------------------
# Project-local .env file blocking (issue #20734)
# ---------------------------------------------------------------------------
class TestEnvFileReadBlocking:
"""Secret-bearing .env files must be blocked by get_read_block_error."""
@pytest.mark.parametrize("basename", [
".env",
".env.local",
".env.development",
".env.production",
".env.test",
".env.staging",
".envrc",
])
def test_blocked_env_basenames(self, basename):
"""All secret-bearing .env basenames are blocked regardless of directory."""
path = f"/tmp/project/{basename}"
error = get_read_block_error(path)
assert error is not None, f"{basename} should be blocked"
assert "Access denied" in error
assert "secret-bearing" in error.lower() or "environment file" in error.lower()
def test_blocked_env_in_subdirectory(self):
"""Nested .env files are also blocked."""
error = get_read_block_error("/home/user/app/services/api/.env.production")
assert error is not None
def test_blocked_env_absolute_path(self):
"""Absolute paths to .env files are blocked."""
error = get_read_block_error("/opt/myapp/.env")
assert error is not None
def test_allowed_env_example(self):
""""The .env.example file is explicitly allowed — it's documentation, not a secret."""
error = get_read_block_error("/tmp/project/.env.example")
assert error is None
def test_allowed_env_sample(self):
"""Other .env variants like .env.sample are allowed."""
error = get_read_block_error("/tmp/project/.env.sample")
assert error is None
def test_allowed_non_env_files(self):
"""Regular files are not affected by the env guard."""
for path in ["/tmp/project/config.yaml", "/tmp/project/main.py",
"/tmp/project/README.md", "/tmp/project/.gitignore"]:
error = get_read_block_error(path)
assert error is None, f"{path} should be allowed"
def test_allowed_hermes_env(self):
"""Hermes' own .env inside HERMES_HOME is NOT blocked by this rule
(it's handled by other mechanisms). Only project-local .env is blocked."""
# Note: hermes internal .env is in ~/.hermes/.env which is NOT a project-local
# path, but the basename check applies to ANY .env. This is intentional —
# even ~/.hermes/.env should not be readable via read_file.
error = get_read_block_error(os.path.expanduser("~/.hermes/.env"))
assert error is not None
def test_blocked_set_is_lowercase(self):
"""All entries in the blocked set are lowercase for case-insensitive matching."""
for name in _BLOCKED_PROJECT_ENV_BASENAMES:
assert name == name.lower(), f"{name} should be lowercase"
# ---------------------------------------------------------------------------
# Existing cache-file blocking (regression — must still work)
# ---------------------------------------------------------------------------
class TestCacheFileReadBlocking:
"""Internal Hermes cache files must remain blocked."""
def test_hub_index_cache_blocked(self, tmp_path):
"""Hub index-cache reads are blocked."""
hermes_home = tmp_path / ".hermes"
cache = hermes_home / "skills" / ".hub" / "index-cache" / "data.json"
cache.parent.mkdir(parents=True)
cache.write_text("{}")
with patch("agent.file_safety._hermes_home_path", return_value=hermes_home):
error = get_read_block_error(str(cache))
assert error is not None
assert "internal Hermes cache" in error
def test_hub_directory_blocked(self, tmp_path):
"""Hub directory reads are blocked."""
hermes_home = tmp_path / ".hermes"
hub = hermes_home / "skills" / ".hub" / "metadata.json"
hub.parent.mkdir(parents=True)
hub.write_text("{}")
with patch("agent.file_safety._hermes_home_path", return_value=hermes_home):
error = get_read_block_error(str(hub))
assert error is not None
# ---------------------------------------------------------------------------
# Combined: env guard + cache guard don't interfere
# ---------------------------------------------------------------------------
class TestCombinedGuards:
"""Both guards should work independently without interference."""
def test_env_guard_works_regardless_of_hermes_home(self, tmp_path):
"""The env basename guard does not depend on HERMES_HOME resolution."""
hermes_home = tmp_path / ".hermes"
hermes_home.mkdir()
with patch("agent.file_safety._hermes_home_path", return_value=hermes_home):
# Regular project .env should still be blocked
error = get_read_block_error("/workspace/.env")
assert error is not None
# .env.example should still be allowed
error = get_read_block_error("/workspace/.env.example")
assert error is None
def test_cache_guard_still_works_with_env_guard(self, tmp_path):
"""Cache file blocking still works when env guard is active."""
hermes_home = tmp_path / ".hermes"
cache = hermes_home / "skills" / ".hub" / "index-cache" / "x"
cache.parent.mkdir(parents=True)
cache.write_text("")
with patch("agent.file_safety._hermes_home_path", return_value=hermes_home):
error = get_read_block_error(str(cache))
assert error is not None
assert "internal Hermes cache" in error
+10 -74
View File
@@ -66,16 +66,6 @@ def test_anthropic_oauth_json_blocked(fake_home):
assert "credential store" in err
def test_google_oauth_json_blocked(fake_home):
"""Gemini OAuth tokens live under auth/google_oauth.json — blocked."""
from agent.file_safety import get_read_block_error
oauth = _create(fake_home, Path("auth") / "google_oauth.json")
err = get_read_block_error(str(oauth))
assert err is not None
assert "credential store" in err
def test_arbitrary_hermes_home_file_not_blocked(fake_home):
"""Non-credential files inside HERMES_HOME stay readable."""
from agent.file_safety import get_read_block_error
@@ -159,37 +149,6 @@ def test_read_file_tool_blocks_relative_path_under_terminal_cwd(
assert "credential store" in out["error"]
def test_read_file_tool_blocks_nested_google_oauth_path(
fake_home, tmp_path, monkeypatch
):
"""The real read_file tool must not return Gemini OAuth token material."""
import json
import tools.file_tools as ft
oauth = _create(fake_home, Path("auth") / "google_oauth.json")
oauth.write_text(
json.dumps(
{
"refresh": "REFRESH_TOKEN_MARKER",
"access": "ACCESS_TOKEN_MARKER",
"email": "user@example.com",
}
),
encoding="utf-8",
)
monkeypatch.chdir(tmp_path)
monkeypatch.setattr(
ft, "_get_live_tracking_cwd", lambda task_id="default": None
)
out = json.loads(ft.read_file_tool(str(oauth), task_id="google-oauth-test"))
assert "error" in out
assert "credential store" in out["error"]
assert "REFRESH_TOKEN_MARKER" not in json.dumps(out)
assert "ACCESS_TOKEN_MARKER" not in json.dumps(out)
# ---------------------------------------------------------------------------
# Widening: .env, webhook_subscriptions.json, mcp-tokens/
# ---------------------------------------------------------------------------
@@ -246,29 +205,22 @@ def test_mcp_tokens_dir_itself_blocked(fake_home):
assert "MCP token" in err
def test_identically_named_hermes_files_outside_home_not_blocked(
def test_identically_named_files_outside_hermes_home_not_blocked(
fake_home, tmp_path
):
"""Hermes-specific filenames (``auth.json``, ``mcp-tokens/``, ``google_oauth.json``)
outside HERMES_HOME must remain readable the gate is per-location for
those, not per-filename. ``.env`` is the exception: it's blocked anywhere
on disk (see test_project_local_env_blocked) because the basename always
means \"secret-bearing environment file\" regardless of directory."""
"""A project's ``.env``, ``auth.json``, or ``mcp-tokens/`` outside
HERMES_HOME must remain readable the gate is per-location, not
per-filename."""
from agent.file_safety import get_read_block_error
project = tmp_path / "myproject"
project.mkdir()
# auth.json outside HERMES_HOME — readable (per-location gate).
p = project / "auth.json"
p.write_text("not secret here", encoding="utf-8")
assert get_read_block_error(str(p)) is None, (
"auth.json outside HERMES_HOME should NOT be blocked"
)
google_oauth = project / "auth" / "google_oauth.json"
google_oauth.parent.mkdir()
google_oauth.write_text("not really a token", encoding="utf-8")
assert get_read_block_error(str(google_oauth)) is None
for rel in (".env", "auth.json"):
p = project / rel
p.write_text("not secret here", encoding="utf-8")
assert get_read_block_error(str(p)) is None, (
f"{rel} outside HERMES_HOME should NOT be blocked"
)
tokens = project / "mcp-tokens"
tokens.mkdir()
@@ -277,14 +229,6 @@ def test_identically_named_hermes_files_outside_home_not_blocked(
assert get_read_block_error(str(tok_file)) is None
def test_non_secret_auth_subtree_file_not_blocked(fake_home):
"""Only the known Google OAuth token path is blocked, not all auth/*."""
from agent.file_safety import get_read_block_error
note = _create(fake_home, Path("auth") / "notes.json")
assert get_read_block_error(str(note)) is None
def test_config_yaml_not_blocked(fake_home):
"""config.yaml is NOT a credential file — agent should still be
able to read it for debugging. (Writes are denied separately by
@@ -324,14 +268,6 @@ def test_profile_mode_blocks_root_credentials(tmp_path, monkeypatch):
root_env.write_text("x")
assert "credential store" in (get_read_block_error(str(root_env)) or "")
# Root-level Google OAuth token store: blocked too
root_google_oauth = root / "auth" / "google_oauth.json"
root_google_oauth.parent.mkdir(parents=True, exist_ok=True)
root_google_oauth.write_text("x")
assert "credential store" in (
get_read_block_error(str(root_google_oauth)) or ""
)
# Root-level mcp-tokens: blocked
root_tok = root / "mcp-tokens" / "gh.json"
root_tok.parent.mkdir(parents=True, exist_ok=True)
@@ -1,192 +0,0 @@
"""Tests for the non-stream stale-call detector context estimator.
Covers:
- ``estimate_request_context_tokens`` for Chat Completions, Responses API,
bare lists, and mixed-shape dicts.
- ``AIAgent._compute_non_stream_stale_timeout`` with both legacy ``messages``
list and full ``api_kwargs`` dicts.
- The May 2026 default-base change (300s -> 90s) and the lowered
context-tier ceilings (450/600 -> 150/240).
"""
from __future__ import annotations
import os
from pathlib import Path
import pytest
def _write_config(tmp_path: Path, body: str) -> None:
hermes_home = tmp_path
(hermes_home / "config.yaml").write_text(body or "{}\n", encoding="utf-8")
def _make_agent(tmp_path: Path, **overrides):
from run_agent import AIAgent
kwargs = dict(
model="gpt-5.5",
provider="openai-codex",
api_key="sk-dummy",
base_url="https://chatgpt.com/backend-api/codex",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
platform="cli",
)
kwargs.update(overrides)
return AIAgent(**kwargs)
# ── estimator ──────────────────────────────────────────────────────────────
def test_estimator_chat_completions_messages():
from agent.chat_completion_helpers import estimate_request_context_tokens
payload = {
"model": "gpt-5.4",
"messages": [
{"role": "user", "content": "x" * 400},
{"role": "assistant", "content": "y" * 400},
],
}
# 800+ chars from messages -> ~200 tokens (char/4 estimate)
assert estimate_request_context_tokens(payload) >= 200
def test_estimator_responses_api_input():
from agent.chat_completion_helpers import estimate_request_context_tokens
payload = {
"model": "gpt-5.5",
"instructions": "i" * 1000,
"input": "x" * 4000,
"tools": [{"name": "t", "description": "d" * 200}],
}
# input(4000) + instructions(1000) + tools (~stringified) -> well over 1000 tokens
tokens = estimate_request_context_tokens(payload)
assert tokens >= 1200, f"Responses API estimator returned {tokens}"
def test_estimator_responses_api_long_session_triggers_tier():
"""A real long Codex session (large ``input``) should clear the 50k boundary."""
from agent.chat_completion_helpers import estimate_request_context_tokens
payload = {
"model": "gpt-5.5",
"input": "x" * 240_000, # ~60k tokens (240k chars / 4)
"instructions": "s" * 4000,
}
assert estimate_request_context_tokens(payload) > 50_000
def test_estimator_bare_list_back_compat():
from agent.chat_completion_helpers import estimate_request_context_tokens
messages = [
{"role": "user", "content": "x" * 800},
]
assert estimate_request_context_tokens(messages) >= 200
def test_estimator_empty_inputs():
from agent.chat_completion_helpers import estimate_request_context_tokens
assert estimate_request_context_tokens({}) == 0
assert estimate_request_context_tokens([]) == 0
assert estimate_request_context_tokens(None) == 0
def test_estimator_unknown_dict_fallback():
from agent.chat_completion_helpers import estimate_request_context_tokens
payload = {"random_field": "z" * 400}
assert estimate_request_context_tokens(payload) > 50
# ── default base + tier scaling ────────────────────────────────────────────
def test_default_base_is_90s(monkeypatch, tmp_path):
"""Default base stale timeout dropped from 300s to 90s (May 2026)."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
(tmp_path / ".env").write_text("", encoding="utf-8")
monkeypatch.delenv("HERMES_API_CALL_STALE_TIMEOUT", raising=False)
_write_config(tmp_path, "")
agent = _make_agent(tmp_path)
base, implicit = agent._resolved_api_call_stale_timeout_base()
assert base == 90.0
assert implicit is True
def test_short_codex_request_uses_base_only(monkeypatch, tmp_path):
"""Codex payload below 50k tokens -> default 90s base."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
(tmp_path / ".env").write_text("", encoding="utf-8")
monkeypatch.delenv("HERMES_API_CALL_STALE_TIMEOUT", raising=False)
_write_config(tmp_path, "")
agent = _make_agent(tmp_path)
payload = {"model": "gpt-5.5", "input": "hi", "instructions": ""}
assert agent._compute_non_stream_stale_timeout(payload) == 90.0
def test_long_codex_request_bumps_to_50k_tier(monkeypatch, tmp_path):
"""Codex payload > 50k tokens -> at least 150s."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
(tmp_path / ".env").write_text("", encoding="utf-8")
monkeypatch.delenv("HERMES_API_CALL_STALE_TIMEOUT", raising=False)
_write_config(tmp_path, "")
agent = _make_agent(tmp_path)
payload = {"model": "gpt-5.5", "input": "x" * 240_000, "instructions": ""}
timeout = agent._compute_non_stream_stale_timeout(payload)
assert timeout >= 150.0
assert timeout < 240.0
def test_very_long_codex_request_bumps_to_100k_tier(monkeypatch, tmp_path):
"""Codex payload > 100k tokens -> at least 240s."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
(tmp_path / ".env").write_text("", encoding="utf-8")
monkeypatch.delenv("HERMES_API_CALL_STALE_TIMEOUT", raising=False)
_write_config(tmp_path, "")
agent = _make_agent(tmp_path)
payload = {"model": "gpt-5.5", "input": "x" * 500_000, "instructions": ""}
assert agent._compute_non_stream_stale_timeout(payload) >= 240.0
def test_chat_completions_long_messages_bumps_tier(monkeypatch, tmp_path):
"""Chat Completions estimator still works for the legacy messages path."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
(tmp_path / ".env").write_text("", encoding="utf-8")
monkeypatch.delenv("HERMES_API_CALL_STALE_TIMEOUT", raising=False)
_write_config(tmp_path, "")
agent = _make_agent(
tmp_path,
provider="openai",
base_url="https://api.openai.com/v1",
model="gpt-5.4",
)
payload = {
"model": "gpt-5.4",
"messages": [{"role": "user", "content": "x" * 240_000}],
}
assert agent._compute_non_stream_stale_timeout(payload) >= 150.0
def test_explicit_user_config_overrides_default(monkeypatch, tmp_path):
"""If the user explicitly sets a stale_timeout, the new defaults don't apply."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
(tmp_path / ".env").write_text("", encoding="utf-8")
_write_config(tmp_path, """\
providers:
openai-codex:
stale_timeout_seconds: 1800
""")
monkeypatch.delenv("HERMES_API_CALL_STALE_TIMEOUT", raising=False)
import importlib
from hermes_cli import timeouts as to_mod
importlib.reload(to_mod)
agent = _make_agent(tmp_path)
assert agent._compute_non_stream_stale_timeout({"input": "hi"}) == 1800.0
@@ -1,71 +0,0 @@
"""Tests for the Nous OAuth 401 actionable-guidance branch in
``agent.conversation_loop.run_conversation``.
Source-inspection style (matches ``test_gemini_fast_fallback.py``): we assert
that the guidance strings exist in the function body so that the user-facing
hint cannot be silently removed by a future refactor.
Regression context: ashh hit a Nous 401 (OAuth token expired / portal said
account out of credits) plus a model slug ``deepseek/deepseek-v4-flash:free``
that's OpenRouter syntax, not a Nous catalog name. The previous guidance
branch only covered ``openai-codex`` and ``xai-oauth``; ``nous`` fell through
to a generic "Your API key was rejected... run hermes setup" message, which is
the wrong advice for a pure-OAuth provider.
"""
from __future__ import annotations
import inspect
from agent import conversation_loop
def test_nous_provider_is_in_oauth_401_set():
"""The provider-set gate that selects OAuth-specific guidance must
include ``nous`` alongside ``openai-codex`` and ``xai-oauth``.
"""
source = inspect.getsource(conversation_loop.run_conversation)
# Be flexible about set element ordering — assert all three are listed
# near each other in the gating expression.
assert "\"openai-codex\"" in source
assert "\"xai-oauth\"" in source
assert "\"nous\"" in source
# And the gate string itself must mention all three so future refactors
# that split nous off into its own gate still get caught.
needle = "_provider in {\"openai-codex\", \"xai-oauth\", \"nous\"}"
assert needle in source, (
"Expected nous to be co-gated with the other OAuth providers in the "
"actionable-401-guidance branch of run_conversation."
)
def test_nous_401_guidance_strings_present():
"""User-facing remediation strings for Nous OAuth 401s must exist."""
source = inspect.getsource(conversation_loop.run_conversation)
# Must tell the user it's an OAuth token problem, NOT an API key problem
# (Nous Portal has no API key path — auth_type=oauth_device_code only).
assert "Nous Portal OAuth token was rejected" in source
# Must give the exact re-auth command, not a generic "hermes setup".
assert "hermes auth add nous --type oauth" in source
# Must point at the portal so users can check account/credit status.
assert "portal.nousresearch.com" in source
def test_free_slug_hint_for_nous_provider():
"""When the failing model slug ends with ``:free`` and the provider is
``nous``, the guidance must flag that ``:free`` is OpenRouter syntax and
suggest switching providers via ``/model openrouter:<slug>``.
Without this hint, users re-OAuth successfully and then hit the same 401
on the next message because Nous Portal doesn't carry the OpenRouter
free-tier slug.
"""
source = inspect.getsource(conversation_loop.run_conversation)
assert "endswith(\":free\")" in source
assert "OpenRouter slug" in source
assert "/model openrouter:" in source
-168
View File
@@ -1,168 +0,0 @@
"""Direct tests for ``agent.image_gen_provider.save_url_image`` (#26942).
These exercise the helper against a real in-process HTTP server no
``requests.get`` mocking so we catch the kinds of issues a mocked
unit test won't: content-type parsing, partial-write cleanup, the
oversize cap, the empty-body refusal, and the cache directory it
actually writes to.
Pre-fix the helper didn't exist; xAI URL responses were returned bare
and the gateway 404'd at ``send_photo`` time.
"""
from __future__ import annotations
import http.server
import socketserver
import threading
import pytest
PNG_1PX = bytes.fromhex(
"89504e470d0a1a0a0000000d49484452000000010000000108020000009077"
"53de00000010494441547801635c0e000000feff03000006000557bfabd400"
"00000049454e44ae426082"
)
class _TinyImageHandler(http.server.BaseHTTPRequestHandler):
"""Tiny HTTP server that mimics the shapes save_url_image must handle."""
def do_GET(self): # noqa: N802
if self.path == "/image.png":
self.send_response(200)
self.send_header("Content-Type", "image/png")
self.send_header("Content-Length", str(len(PNG_1PX)))
self.end_headers()
self.wfile.write(PNG_1PX)
elif self.path == "/image.jpg":
self.send_response(200)
self.send_header("Content-Type", "image/jpeg")
self.end_headers()
self.wfile.write(PNG_1PX) # bytes don't have to be a real jpeg
elif self.path == "/oversize":
self.send_response(200)
self.send_header("Content-Type", "image/png")
self.end_headers()
chunk = b"\x00" * 65536
for _ in range(64): # 4 MiB
self.wfile.write(chunk)
elif self.path == "/empty":
self.send_response(200)
self.send_header("Content-Type", "image/png")
self.send_header("Content-Length", "0")
self.end_headers()
elif self.path == "/404":
self.send_response(404)
self.end_headers()
elif self.path == "/no-type-with-url-ext.jpg":
self.send_response(200)
self.send_header("Content-Type", "application/octet-stream")
self.end_headers()
self.wfile.write(PNG_1PX)
elif self.path == "/no-type-no-ext":
self.send_response(200)
self.end_headers()
self.wfile.write(PNG_1PX)
else:
self.send_response(404)
self.end_headers()
def log_message(self, *args, **kw): # noqa: D401
return
@pytest.fixture
def http_server(tmp_path, monkeypatch):
"""Spin up a localhost HTTP server and isolate HERMES_HOME under tmp_path."""
monkeypatch.setenv("HERMES_HOME", str(tmp_path / ".hermes"))
(tmp_path / ".hermes").mkdir()
# Force the constants/image cache helpers to re-read HERMES_HOME.
import sys
for mod in list(sys.modules):
if mod.startswith("hermes_constants") or mod.startswith("agent.image_gen_provider"):
sys.modules.pop(mod, None)
httpd = socketserver.TCPServer(("127.0.0.1", 0), _TinyImageHandler)
port = httpd.server_address[1]
thread = threading.Thread(target=httpd.serve_forever, daemon=True)
thread.start()
yield f"http://127.0.0.1:{port}", httpd
httpd.shutdown()
class TestSaveUrlImage:
def test_writes_real_bytes_to_hermes_home_cache(self, http_server):
base, _ = http_server
from agent.image_gen_provider import save_url_image
path = save_url_image(f"{base}/image.png", prefix="xai_test")
assert path.exists()
assert path.read_bytes() == PNG_1PX
# The cache directory must be under HERMES_HOME — gateway cleanup
# relies on this being the canonical location.
assert "cache/images" in str(path)
assert path.suffix == ".png"
def test_extension_inferred_from_content_type(self, http_server):
base, _ = http_server
from agent.image_gen_provider import save_url_image
path = save_url_image(f"{base}/image.jpg", prefix="xai_test")
assert path.suffix == ".jpg", "image/jpeg → .jpg"
def test_extension_falls_back_to_url_suffix(self, http_server):
"""Some CDNs send ``application/octet-stream`` — the URL suffix wins then."""
base, _ = http_server
from agent.image_gen_provider import save_url_image
path = save_url_image(f"{base}/no-type-with-url-ext.jpg", prefix="xai_test")
assert path.suffix == ".jpg"
def test_extension_defaults_to_png_when_unknowable(self, http_server):
base, _ = http_server
from agent.image_gen_provider import save_url_image
path = save_url_image(f"{base}/no-type-no-ext", prefix="xai_test")
assert path.suffix == ".png"
def test_404_raises(self, http_server):
"""HTTP errors must propagate — caller decides whether to fall back."""
base, _ = http_server
from agent.image_gen_provider import save_url_image
import requests as req_lib
with pytest.raises(req_lib.HTTPError):
save_url_image(f"{base}/404")
def test_empty_body_raises_without_writing_file(self, http_server):
"""0-byte responses are not images — refuse to cache."""
base, _ = http_server
from agent.image_gen_provider import save_url_image
with pytest.raises(ValueError, match="0 bytes"):
save_url_image(f"{base}/empty")
def test_oversize_raises_and_cleans_up(self, http_server, tmp_path):
"""Oversize downloads must NOT leak a partial file into the cache."""
base, _ = http_server
from agent.image_gen_provider import save_url_image, _images_cache_dir
cache_dir = _images_cache_dir()
before = set(cache_dir.glob("*"))
with pytest.raises(ValueError, match="exceeds"):
save_url_image(f"{base}/oversize", max_bytes=1024 * 1024)
after = set(cache_dir.glob("*"))
assert after == before, "partial file leaked into cache after oversize cap"
def test_unique_filenames_avoid_collision(self, http_server):
"""Two back-to-back saves of the same URL must produce different paths."""
base, _ = http_server
from agent.image_gen_provider import save_url_image
path1 = save_url_image(f"{base}/image.png", prefix="xai_collision")
path2 = save_url_image(f"{base}/image.png", prefix="xai_collision")
assert path1 != path2, "filename collision — uuid suffix isn't doing its job"
-243
View File
@@ -1,243 +0,0 @@
"""Tests for agent/transcription_registry.py and agent/transcription_provider.py.
Covers:
- Registration happy path
- Registration rejection: non-TranscriptionProvider type
- Registration rejection: empty/whitespace name
- Built-in name shadowing: warning + silent ignore (no exception)
- Re-registration: overwrites + logs at debug
- Case + whitespace insensitivity on lookup
- ABC contract: default implementations work
- ABC contract: transcribe() must be implemented
- Sync invariant: registry built-ins match tools/transcription_tools.py
"""
from __future__ import annotations
import logging
from typing import Any, Optional
import pytest
from agent import transcription_registry
from agent.transcription_provider import TranscriptionProvider
class _FakeProvider(TranscriptionProvider):
def __init__(
self,
name: str = "fake",
display: Optional[str] = None,
available: bool = True,
transcribe_impl: Optional[Any] = None,
):
self._name = name
self._display = display
self._available = available
self._transcribe_impl = transcribe_impl
@property
def name(self) -> str:
return self._name
@property
def display_name(self) -> str:
return self._display if self._display is not None else super().display_name
def is_available(self) -> bool:
return self._available
def transcribe(self, file_path: str, **kw):
if self._transcribe_impl is not None:
return self._transcribe_impl(file_path, **kw)
return {"success": True, "transcript": f"fake({file_path})", "provider": self._name}
@pytest.fixture(autouse=True)
def _reset_registry():
transcription_registry._reset_for_tests()
yield
transcription_registry._reset_for_tests()
# ---------------------------------------------------------------------------
# Registration
# ---------------------------------------------------------------------------
class TestRegistration:
def test_happy_path(self):
p = _FakeProvider(name="openrouter")
transcription_registry.register_provider(p)
assert transcription_registry.get_provider("openrouter") is p
assert [r.name for r in transcription_registry.list_providers()] == ["openrouter"]
def test_rejects_non_provider_type(self):
with pytest.raises(TypeError, match="expects a TranscriptionProvider instance"):
transcription_registry.register_provider("not a provider") # type: ignore[arg-type]
assert transcription_registry.list_providers() == []
def test_rejects_empty_name(self):
p = _FakeProvider(name="")
with pytest.raises(ValueError, match="non-empty string"):
transcription_registry.register_provider(p)
assert transcription_registry.list_providers() == []
def test_rejects_whitespace_name(self):
p = _FakeProvider(name=" ")
with pytest.raises(ValueError, match="non-empty string"):
transcription_registry.register_provider(p)
assert transcription_registry.list_providers() == []
@pytest.mark.parametrize(
"builtin",
["local", "local_command", "groq", "openai", "mistral", "xai"],
)
def test_rejects_builtin_shadow_with_warning(self, builtin, caplog):
p = _FakeProvider(name=builtin)
with caplog.at_level(logging.WARNING, logger="agent.transcription_registry"):
transcription_registry.register_provider(p)
assert "shadows a built-in name" in caplog.text
assert builtin in caplog.text
assert transcription_registry.get_provider(builtin) is None
assert transcription_registry.list_providers() == []
def test_builtin_shadow_case_insensitive(self, caplog):
for variant in ("OPENAI", "OpenAi", " openai ", "oPeNaI"):
transcription_registry._reset_for_tests()
with caplog.at_level(logging.WARNING, logger="agent.transcription_registry"):
transcription_registry.register_provider(_FakeProvider(name=variant))
assert transcription_registry.list_providers() == [], (
f"variant {variant!r} should have been rejected as a built-in shadow"
)
def test_reregistration_overwrites(self, caplog):
p1 = _FakeProvider(name="openrouter")
p2 = _FakeProvider(name="openrouter")
transcription_registry.register_provider(p1)
with caplog.at_level(logging.DEBUG, logger="agent.transcription_registry"):
transcription_registry.register_provider(p2)
assert transcription_registry.get_provider("openrouter") is p2
assert "re-registered" in caplog.text
# ---------------------------------------------------------------------------
# Lookup
# ---------------------------------------------------------------------------
class TestLookup:
def test_get_provider_missing_returns_none(self):
assert transcription_registry.get_provider("nonexistent") is None
def test_get_provider_non_string_returns_none(self):
assert transcription_registry.get_provider(None) is None # type: ignore[arg-type]
assert transcription_registry.get_provider(123) is None # type: ignore[arg-type]
def test_get_provider_case_insensitive(self):
p = _FakeProvider(name="openrouter")
transcription_registry.register_provider(p)
assert transcription_registry.get_provider("OPENROUTER") is p
assert transcription_registry.get_provider("OpenRouter") is p
def test_get_provider_whitespace_tolerant(self):
p = _FakeProvider(name="openrouter")
transcription_registry.register_provider(p)
assert transcription_registry.get_provider(" openrouter ") is p
def test_list_providers_sorted(self):
transcription_registry.register_provider(_FakeProvider(name="zylo"))
transcription_registry.register_provider(_FakeProvider(name="alpha"))
transcription_registry.register_provider(_FakeProvider(name="middle"))
names = [p.name for p in transcription_registry.list_providers()]
assert names == ["alpha", "middle", "zylo"]
# ---------------------------------------------------------------------------
# ABC contract
# ---------------------------------------------------------------------------
class TestABCContract:
def test_must_implement_transcribe(self):
class Incomplete(TranscriptionProvider):
@property
def name(self) -> str:
return "incomplete"
# transcribe NOT implemented
with pytest.raises(TypeError, match="abstract"):
Incomplete() # type: ignore[abstract]
def test_must_implement_name(self):
class Incomplete(TranscriptionProvider):
def transcribe(self, file_path, **kw):
return {"success": True, "transcript": "", "provider": "incomplete"}
# name NOT implemented
with pytest.raises(TypeError, match="abstract"):
Incomplete() # type: ignore[abstract]
def test_display_name_defaults_to_title(self):
p = _FakeProvider(name="openrouter")
assert p.display_name == "Openrouter"
def test_display_name_override_respected(self):
p = _FakeProvider(name="openrouter", display="OpenRouter STT")
assert p.display_name == "OpenRouter STT"
def test_is_available_default_true(self):
p = _FakeProvider(name="openrouter")
assert p.is_available() is True
def test_list_models_default_empty(self):
p = _FakeProvider(name="openrouter")
assert p.list_models() == []
def test_default_model_none_when_no_models(self):
p = _FakeProvider(name="openrouter")
assert p.default_model() is None
def test_default_model_first_listed(self):
class WithModels(_FakeProvider):
def list_models(self):
return [{"id": "whisper-large-v3-turbo"}, {"id": "whisper-large-v3"}]
p = WithModels(name="openrouter")
assert p.default_model() == "whisper-large-v3-turbo"
def test_get_setup_schema_default_minimal(self):
p = _FakeProvider(name="openrouter")
schema = p.get_setup_schema()
assert schema["name"] == "Openrouter"
assert schema["env_vars"] == []
# ---------------------------------------------------------------------------
# Sync invariant: registry built-ins vs dispatcher built-ins
# ---------------------------------------------------------------------------
class TestBuiltinSync:
"""``_BUILTIN_NAMES`` in agent/transcription_registry.py is duplicated
from ``BUILTIN_STT_PROVIDERS`` in tools/transcription_tools.py
(importing directly would create a circular dependency). This test
fails loudly if the two lists drift a new built-in added to
transcription_tools.py MUST also be added to
transcription_registry.py's ``_BUILTIN_NAMES`` or the registry will
accept a name the dispatcher will silently route to the wrong
handler.
"""
def test_registry_builtins_match_dispatcher_builtins(self):
from tools.transcription_tools import BUILTIN_STT_PROVIDERS
assert transcription_registry._BUILTIN_NAMES == BUILTIN_STT_PROVIDERS, (
"agent.transcription_registry._BUILTIN_NAMES and "
"tools.transcription_tools.BUILTIN_STT_PROVIDERS have drifted!\n"
f" Registry only: {sorted(transcription_registry._BUILTIN_NAMES - BUILTIN_STT_PROVIDERS)}\n"
f" Dispatcher only: {sorted(BUILTIN_STT_PROVIDERS - transcription_registry._BUILTIN_NAMES)}\n"
"Add the missing names to whichever list is incomplete. "
"These two lists exist as a circular-import workaround and "
"MUST be kept in sync manually."
)
-312
View File
@@ -1,312 +0,0 @@
"""Tests for agent/tts_registry.py and agent/tts_provider.py.
Covers:
- Registration happy path
- Registration rejection: non-TTSProvider type
- Registration rejection: empty/whitespace name
- Built-in name shadowing: warning + silent ignore (no exception)
- Re-registration: overwrites + logs at debug
- Case + whitespace insensitivity on lookup
- ABC contract: default implementations work
- ABC contract: synthesize() must be implemented
- ABC contract: stream() raises NotImplementedError by default
- resolve_output_format helper coerces invalid input
"""
from __future__ import annotations
import logging
from typing import Any, Optional
import pytest
from agent import tts_registry
from agent.tts_provider import (
DEFAULT_OUTPUT_FORMAT,
VALID_OUTPUT_FORMATS,
TTSProvider,
resolve_output_format,
)
class _FakeProvider(TTSProvider):
def __init__(
self,
name: str = "fake",
display: Optional[str] = None,
voice_compat: bool = False,
synthesize_impl: Optional[Any] = None,
):
self._name = name
self._display = display
self._voice_compat = voice_compat
self._synthesize_impl = synthesize_impl
@property
def name(self) -> str:
return self._name
@property
def display_name(self) -> str:
return self._display if self._display is not None else super().display_name
@property
def voice_compatible(self) -> bool:
return self._voice_compat
def synthesize(self, text: str, output_path: str, **kw):
if self._synthesize_impl is not None:
return self._synthesize_impl(text, output_path, **kw)
return output_path
@pytest.fixture(autouse=True)
def _reset_registry():
tts_registry._reset_for_tests()
yield
tts_registry._reset_for_tests()
# ---------------------------------------------------------------------------
# Registration
# ---------------------------------------------------------------------------
class TestRegistration:
def test_happy_path(self):
p = _FakeProvider(name="cartesia")
tts_registry.register_provider(p)
assert tts_registry.get_provider("cartesia") is p
assert [r.name for r in tts_registry.list_providers()] == ["cartesia"]
def test_rejects_non_provider_type(self):
with pytest.raises(TypeError, match="expects a TTSProvider instance"):
tts_registry.register_provider("not a provider") # type: ignore[arg-type]
assert tts_registry.list_providers() == []
def test_rejects_empty_name(self):
p = _FakeProvider(name="")
with pytest.raises(ValueError, match="non-empty string"):
tts_registry.register_provider(p)
assert tts_registry.list_providers() == []
def test_rejects_whitespace_name(self):
p = _FakeProvider(name=" ")
with pytest.raises(ValueError, match="non-empty string"):
tts_registry.register_provider(p)
assert tts_registry.list_providers() == []
@pytest.mark.parametrize(
"builtin",
["edge", "openai", "elevenlabs", "minimax", "gemini",
"mistral", "xai", "piper", "kittentts", "neutts"],
)
def test_rejects_builtin_shadow_with_warning(self, builtin, caplog):
"""Built-in names always win — plugin registration is silently ignored
but a warning is logged so the operator can see what happened.
"""
p = _FakeProvider(name=builtin)
with caplog.at_level(logging.WARNING, logger="agent.tts_registry"):
tts_registry.register_provider(p)
assert "shadows a built-in name" in caplog.text
assert builtin in caplog.text
assert tts_registry.get_provider(builtin) is None
assert tts_registry.list_providers() == []
def test_builtin_shadow_case_insensitive(self, caplog):
"""``EDGE``/``Edge``/`` edge `` all collide with the ``edge`` built-in."""
for variant in ("EDGE", "Edge", " edge ", "eDgE"):
tts_registry._reset_for_tests()
with caplog.at_level(logging.WARNING, logger="agent.tts_registry"):
tts_registry.register_provider(_FakeProvider(name=variant))
assert tts_registry.list_providers() == [], (
f"variant {variant!r} should have been rejected as a built-in shadow"
)
def test_reregistration_overwrites(self, caplog):
p1 = _FakeProvider(name="cartesia")
p2 = _FakeProvider(name="cartesia")
tts_registry.register_provider(p1)
with caplog.at_level(logging.DEBUG, logger="agent.tts_registry"):
tts_registry.register_provider(p2)
assert tts_registry.get_provider("cartesia") is p2
assert "re-registered" in caplog.text
# ---------------------------------------------------------------------------
# Lookup
# ---------------------------------------------------------------------------
class TestLookup:
def test_get_provider_missing_returns_none(self):
assert tts_registry.get_provider("nonexistent") is None
def test_get_provider_non_string_returns_none(self):
assert tts_registry.get_provider(None) is None # type: ignore[arg-type]
assert tts_registry.get_provider(123) is None # type: ignore[arg-type]
def test_get_provider_case_insensitive(self):
p = _FakeProvider(name="cartesia")
tts_registry.register_provider(p)
assert tts_registry.get_provider("CARTESIA") is p
assert tts_registry.get_provider("Cartesia") is p
def test_get_provider_whitespace_tolerant(self):
p = _FakeProvider(name="cartesia")
tts_registry.register_provider(p)
assert tts_registry.get_provider(" cartesia ") is p
def test_list_providers_sorted(self):
tts_registry.register_provider(_FakeProvider(name="zylo"))
tts_registry.register_provider(_FakeProvider(name="alpha"))
tts_registry.register_provider(_FakeProvider(name="middle"))
names = [p.name for p in tts_registry.list_providers()]
assert names == ["alpha", "middle", "zylo"]
# ---------------------------------------------------------------------------
# ABC contract
# ---------------------------------------------------------------------------
class TestABCContract:
def test_must_implement_synthesize(self):
class Incomplete(TTSProvider):
@property
def name(self) -> str:
return "incomplete"
# synthesize NOT implemented
with pytest.raises(TypeError, match="abstract"):
Incomplete() # type: ignore[abstract]
def test_must_implement_name(self):
class Incomplete(TTSProvider):
def synthesize(self, text, output_path, **kw):
return output_path
# name NOT implemented
with pytest.raises(TypeError, match="abstract"):
Incomplete() # type: ignore[abstract]
def test_display_name_defaults_to_title(self):
p = _FakeProvider(name="cartesia")
assert p.display_name == "Cartesia"
def test_display_name_override_respected(self):
p = _FakeProvider(name="cartesia", display="Cartesia AI")
assert p.display_name == "Cartesia AI"
def test_is_available_default_true(self):
p = _FakeProvider(name="cartesia")
assert p.is_available() is True
def test_list_voices_default_empty(self):
p = _FakeProvider(name="cartesia")
assert p.list_voices() == []
def test_list_models_default_empty(self):
p = _FakeProvider(name="cartesia")
assert p.list_models() == []
def test_default_model_none_when_no_models(self):
p = _FakeProvider(name="cartesia")
assert p.default_model() is None
def test_default_voice_none_when_no_voices(self):
p = _FakeProvider(name="cartesia")
assert p.default_voice() is None
def test_default_model_first_listed(self):
class WithModels(_FakeProvider):
def list_models(self):
return [{"id": "sonic-2"}, {"id": "sonic-1"}]
p = WithModels(name="cartesia")
assert p.default_model() == "sonic-2"
def test_default_voice_first_listed(self):
class WithVoices(_FakeProvider):
def list_voices(self):
return [{"id": "voice-aria"}, {"id": "voice-jasper"}]
p = WithVoices(name="cartesia")
assert p.default_voice() == "voice-aria"
def test_get_setup_schema_default_minimal(self):
p = _FakeProvider(name="cartesia")
schema = p.get_setup_schema()
assert schema["name"] == "Cartesia"
assert schema["env_vars"] == []
def test_stream_raises_not_implemented_by_default(self):
p = _FakeProvider(name="cartesia")
with pytest.raises(NotImplementedError, match="does not implement streaming"):
next(p.stream("hello"))
def test_voice_compatible_default_false(self):
p = _FakeProvider(name="cartesia")
assert p.voice_compatible is False
def test_voice_compatible_override(self):
p = _FakeProvider(name="cartesia", voice_compat=True)
assert p.voice_compatible is True
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
class TestResolveOutputFormat:
@pytest.mark.parametrize("valid", sorted(VALID_OUTPUT_FORMATS))
def test_valid_passes_through(self, valid):
assert resolve_output_format(valid) == valid
def test_uppercase_normalized(self):
assert resolve_output_format("MP3") == "mp3"
assert resolve_output_format("Opus") == "opus"
def test_whitespace_stripped(self):
assert resolve_output_format(" wav ") == "wav"
def test_invalid_returns_default(self):
assert resolve_output_format("aiff") == DEFAULT_OUTPUT_FORMAT
assert resolve_output_format("") == DEFAULT_OUTPUT_FORMAT
def test_none_returns_default(self):
assert resolve_output_format(None) == DEFAULT_OUTPUT_FORMAT
def test_non_string_returns_default(self):
assert resolve_output_format(123) == DEFAULT_OUTPUT_FORMAT # type: ignore[arg-type]
assert resolve_output_format([]) == DEFAULT_OUTPUT_FORMAT # type: ignore[arg-type]
# ---------------------------------------------------------------------------
# Sync invariant: registry's built-in list vs dispatcher's built-in list
# ---------------------------------------------------------------------------
class TestBuiltinSync:
"""``_BUILTIN_NAMES`` in agent/tts_registry.py is duplicated from
``BUILTIN_TTS_PROVIDERS`` in tools/tts_tool.py (importing directly
would create a circular dependency). This test fails loudly if the
two lists drift a new built-in added to tts_tool.py MUST also be
added to tts_registry.py's _BUILTIN_NAMES or the registry will
accept a name the dispatcher will silently route to the wrong
handler.
"""
def test_registry_builtins_match_dispatcher_builtins(self):
from tools.tts_tool import BUILTIN_TTS_PROVIDERS
assert tts_registry._BUILTIN_NAMES == BUILTIN_TTS_PROVIDERS, (
"agent.tts_registry._BUILTIN_NAMES and "
"tools.tts_tool.BUILTIN_TTS_PROVIDERS have drifted!\n"
f" Registry only: {sorted(tts_registry._BUILTIN_NAMES - BUILTIN_TTS_PROVIDERS)}\n"
f" Dispatcher only: {sorted(BUILTIN_TTS_PROVIDERS - tts_registry._BUILTIN_NAMES)}\n"
"Add the missing names to whichever list is incomplete. "
"These two lists exist as a circular-import workaround and "
"MUST be kept in sync manually."
)
@@ -452,64 +452,3 @@ class TestCodexNormalizeResponse:
tc = nr.tool_calls[0]
assert tc.name == "terminal"
assert '"command"' in tc.arguments
class TestCodexTransportTimeout:
"""Forward per-request timeout from build_kwargs to the SDK kwargs."""
def test_positive_timeout_preserved(self, transport):
kw = transport.build_kwargs(
model="gpt-5.5",
messages=[{"role": "user", "content": "hi"}],
tools=[],
timeout=600.0,
)
assert kw.get("timeout") == 600.0
def test_zero_timeout_dropped(self, transport):
kw = transport.build_kwargs(
model="gpt-5.5",
messages=[{"role": "user", "content": "hi"}],
tools=[],
timeout=0,
)
assert "timeout" not in kw
def test_none_timeout_omitted(self, transport):
kw = transport.build_kwargs(
model="gpt-5.5",
messages=[{"role": "user", "content": "hi"}],
tools=[],
timeout=None,
)
assert "timeout" not in kw
def test_inf_timeout_dropped(self, transport):
kw = transport.build_kwargs(
model="gpt-5.5",
messages=[{"role": "user", "content": "hi"}],
tools=[],
timeout=float("inf"),
)
assert "timeout" not in kw
def test_bool_timeout_dropped(self, transport):
"""``True`` is technically int but must not survive — caller bug guard."""
kw = transport.build_kwargs(
model="gpt-5.5",
messages=[{"role": "user", "content": "hi"}],
tools=[],
timeout=True,
)
assert "timeout" not in kw
def test_request_overrides_can_supply_timeout(self, transport):
"""request_overrides["timeout"] is honored when no explicit kwarg passed."""
kw = transport.build_kwargs(
model="gpt-5.5",
messages=[{"role": "user", "content": "hi"}],
tools=[],
request_overrides={"timeout": 450.0},
)
assert kw.get("timeout") == 450.0
-157
View File
@@ -1,157 +0,0 @@
"""Tests for bracketed-paste timeout safety valve (#16263).
Verifies the production helper in cli.py monkey-patches prompt_toolkit's
Vt100Parser.feed() so the parser auto-escapes from bracketed-paste mode when
the ESC[201~ end mark is never received.
"""
import ast
import importlib
import logging
import time
from pathlib import Path
from unittest.mock import MagicMock
from prompt_toolkit.keys import Keys
ROOT = Path(__file__).resolve().parents[2]
CLI_PATH = ROOT / "cli.py"
def _load_production_patch_helper():
"""Load cli._apply_bracketed_paste_timeout_patch without importing cli.
Importing cli.py pulls optional runtime deps that aren't required for this
parser-level regression. AST-loading the exact helper keeps the test tied
to production code while avoiding unrelated import side effects. If the
production helper is removed, this test fails.
"""
source = CLI_PATH.read_text(encoding="utf-8")
tree = ast.parse(source)
helper_node = next(
(
node
for node in tree.body
if isinstance(node, ast.FunctionDef)
and node.name == "_apply_bracketed_paste_timeout_patch"
),
None,
)
assert helper_node is not None, (
"cli.py must define _apply_bracketed_paste_timeout_patch()"
)
helper_source = ast.get_source_segment(source, helper_node)
namespace = {"time": time, "logger": logging.getLogger("test.cli")}
exec(helper_source, namespace)
return namespace["_apply_bracketed_paste_timeout_patch"]
def _reset_and_apply_production_patch():
"""Reload prompt_toolkit's parser and apply Hermes' production patch."""
import prompt_toolkit.input.vt100_parser as vt100_mod
vt100_mod = importlib.reload(vt100_mod)
# importlib.reload() preserves module dict entries that the reloaded source
# does not redefine, so clear Hermes' sentinel before re-applying.
if hasattr(vt100_mod, "_hermes_bp_timeout_patched"):
delattr(vt100_mod, "_hermes_bp_timeout_patched")
_load_production_patch_helper()()
assert getattr(vt100_mod, "_hermes_bp_timeout_patched", False)
return vt100_mod
class TestBracketedPasteTimeout:
"""Verify the Vt100Parser monkey-patch prevents frozen bracketed-paste."""
def _make_parser(self):
"""Create a Vt100Parser after applying the production patch."""
vt100_mod = _reset_and_apply_production_patch()
callback = MagicMock()
parser = vt100_mod.Vt100Parser(callback)
return parser, callback
def test_normal_bracketed_paste_works(self):
"""A complete bracketed-paste sequence should work normally."""
parser, callback = self._make_parser()
parser.feed("\x1b[200~hello world\x1b[201~")
callback.assert_called_once()
call_args = callback.call_args[0][0]
assert call_args.data == "hello world"
def test_incomplete_paste_times_out(self):
"""If ESC[201~ is never received, parser should recover after timeout."""
parser, callback = self._make_parser()
parser.feed("\x1b[200~some pasted text")
assert parser._in_bracketed_paste
parser._hermes_bp_start = time.monotonic() - 3.0
parser.feed("more data")
assert not parser._in_bracketed_paste
assert callback.called
def test_timeout_preserves_buffered_content(self):
"""Auto-escape should flush buffered content, not lose it."""
parser, callback = self._make_parser()
content = "line1\nline2\nline3"
parser.feed(f"\x1b[200~{content}")
parser._hermes_bp_start = time.monotonic() - 3.0
parser.feed("")
paste_events = [
c[0][0]
for c in callback.call_args_list
if hasattr(c[0][0], "key") and c[0][0].key == Keys.BracketedPaste
]
assert len(paste_events) >= 1
assert content in paste_events[0].data
def test_normal_keys_after_timeout_recovery(self):
"""After timeout recovery, normal key processing should resume."""
parser, callback = self._make_parser()
parser.feed("\x1b[200~stuck")
parser._hermes_bp_start = time.monotonic() - 3.0
parser.feed("")
assert not parser._in_bracketed_paste
callback.reset_mock()
parser.feed("a")
assert not parser._in_bracketed_paste
def test_no_timeout_when_end_mark_arrives_quickly(self):
"""No timeout should fire if end mark arrives within the window."""
parser, callback = self._make_parser()
parser.feed("\x1b[200~quick paste\x1b[201~")
assert not parser._in_bracketed_paste
callback.assert_called_once()
def test_subsequent_data_after_incomplete_paste(self):
"""Data arriving after a stuck paste should be processable."""
parser, callback = self._make_parser()
parser.feed("\x1b[200~content")
parser._hermes_bp_start = time.monotonic() - 5.0
parser.feed("x")
assert not parser._in_bracketed_paste
assert callback.call_count >= 1
def test_torn_end_mark_recovers(self):
"""If end mark arrives split across feeds within timeout, it still works."""
parser, callback = self._make_parser()
parser.feed("\x1b[200~some content\x1b[20")
assert parser._in_bracketed_paste
parser.feed("1~")
assert not parser._in_bracketed_paste
callback.assert_called_once()
assert callback.call_args[0][0].data == "some content"
def test_no_timeout_under_threshold(self):
"""Bracketed-paste mode should not timeout within the 2s window."""
parser, callback = self._make_parser()
parser.feed("\x1b[200~waiting")
parser._hermes_bp_start = time.monotonic() - 0.5
parser.feed("more waiting")
assert parser._in_bracketed_paste
assert not callback.called
@@ -102,90 +102,3 @@ def test_fragments_omit_bg_segment_when_idle():
frags = cli_obj._get_status_bar_fragments()
rendered = "".join(text for _style, text in frags)
assert "" not in rendered
# ── Background terminal-process indicator (⚙ N) ───────────────────────────
# Source of truth is tools.process_registry.process_registry._running (a dict
# of currently-running shell processes spawned by terminal(background=true)).
# Distinct from /background tasks above: ▶ counts agent threads, ⚙ counts
# shell processes. Both can be active simultaneously.
class _FakeRunningRegistry:
"""Minimal stand-in for process_registry; exposes count_running()."""
def __init__(self, count: int) -> None:
self._count = count
def count_running(self) -> int:
return self._count
def _patch_process_registry(monkeypatch, count: int) -> None:
import tools.process_registry as pr_mod
monkeypatch.setattr(pr_mod, "process_registry", _FakeRunningRegistry(count))
def test_snapshot_reports_zero_when_no_background_processes(monkeypatch):
cli_obj = _make_cli()
_patch_process_registry(monkeypatch, 0)
snap = cli_obj._get_status_bar_snapshot()
assert snap["active_background_processes"] == 0
def test_snapshot_counts_live_background_processes(monkeypatch):
cli_obj = _make_cli()
_patch_process_registry(monkeypatch, 3)
snap = cli_obj._get_status_bar_snapshot()
assert snap["active_background_processes"] == 3
def test_snapshot_safe_when_process_registry_raises(monkeypatch):
"""If count_running() raises the snapshot stays at 0; no propagate."""
cli_obj = _make_cli()
import tools.process_registry as pr_mod
class _BoomRegistry:
def count_running(self):
raise RuntimeError("boom")
monkeypatch.setattr(pr_mod, "process_registry", _BoomRegistry())
snap = cli_obj._get_status_bar_snapshot()
assert snap["active_background_processes"] == 0
def test_plain_text_status_shows_proc_indicator_when_active(monkeypatch):
cli_obj = _make_cli()
_patch_process_registry(monkeypatch, 2)
text = cli_obj._build_status_bar_text(width=80)
assert "⚙ 2" in text
def test_plain_text_status_omits_proc_indicator_when_idle(monkeypatch):
cli_obj = _make_cli()
_patch_process_registry(monkeypatch, 0)
text = cli_obj._build_status_bar_text(width=80)
assert "" not in text
def test_fragments_include_proc_segment_when_active(monkeypatch):
cli_obj = _make_cli()
_patch_process_registry(monkeypatch, 1)
cli_obj._status_bar_visible = True
cli_obj._get_tui_terminal_width = lambda: 120 # type: ignore[method-assign]
frags = cli_obj._get_status_bar_fragments()
rendered = "".join(text for _style, text in frags)
assert "⚙ 1" in rendered
def test_indicators_independent_agents_and_processes(monkeypatch):
"""▶ (agent tasks) and ⚙ (shell processes) render side-by-side."""
cli_obj = _make_cli()
cli_obj._background_tasks = {"bg_a": _stub_thread()}
_patch_process_registry(monkeypatch, 2)
cli_obj._status_bar_visible = True
cli_obj._get_tui_terminal_width = lambda: 120 # type: ignore[method-assign]
frags = cli_obj._get_status_bar_fragments()
rendered = "".join(text for _style, text in frags)
assert "▶ 1" in rendered
assert "⚙ 2" in rendered
+9 -13
View File
@@ -6,8 +6,6 @@ from unittest.mock import MagicMock, patch
import pytest
from agent.model_metadata import MINIMUM_CONTEXT_LENGTH
@pytest.fixture
def _isolate(tmp_path, monkeypatch):
@@ -46,18 +44,17 @@ def cli_obj(_isolate):
class TestLowContextWarning:
"""Tests that the CLI warns about low context lengths."""
def test_warning_for_below_minimum_context(self, cli_obj):
"""Warning shown when context is below Hermes' minimum."""
def test_no_warning_for_normal_context(self, cli_obj):
"""No warning when context is 32k+."""
cli_obj.agent.context_compressor.context_length = 32768
with patch("cli.get_tool_definitions", return_value=[]), \
patch("cli.build_welcome_banner"):
cli_obj.show_banner()
# Check that no yellow warning was printed
calls = [str(c) for c in cli_obj.console.print.call_args_list]
warning_calls = [c for c in calls if "too low" in c]
assert len(warning_calls) == 1
minimum_calls = [c for c in calls if f"{MINIMUM_CONTEXT_LENGTH:,}" in c]
assert minimum_calls
assert len(warning_calls) == 0
def test_warning_for_low_context(self, cli_obj):
"""Warning shown when context is 4096 (Ollama default)."""
@@ -83,19 +80,19 @@ class TestLowContextWarning:
assert len(warning_calls) == 1
def test_no_warning_at_boundary(self, cli_obj):
"""No warning at exactly Hermes' minimum context length."""
cli_obj.agent.context_compressor.context_length = MINIMUM_CONTEXT_LENGTH
"""No warning at exactly 8192 — 8192 is borderline but included in warning."""
cli_obj.agent.context_compressor.context_length = 8192
with patch("cli.get_tool_definitions", return_value=[]), \
patch("cli.build_welcome_banner"):
cli_obj.show_banner()
calls = [str(c) for c in cli_obj.console.print.call_args_list]
warning_calls = [c for c in calls if "too low" in c]
assert len(warning_calls) == 0
assert len(warning_calls) == 1 # 8192 is still warned about
def test_no_warning_above_boundary(self, cli_obj):
"""No warning above Hermes' minimum context length."""
cli_obj.agent.context_compressor.context_length = MINIMUM_CONTEXT_LENGTH + 1
"""No warning at 16384."""
cli_obj.agent.context_compressor.context_length = 16384
with patch("cli.get_tool_definitions", return_value=[]), \
patch("cli.build_welcome_banner"):
cli_obj.show_banner()
@@ -115,7 +112,6 @@ class TestLowContextWarning:
calls = [str(c) for c in cli_obj.console.print.call_args_list]
ollama_hints = [c for c in calls if "OLLAMA_CONTEXT_LENGTH" in c]
assert len(ollama_hints) == 1
assert str(MINIMUM_CONTEXT_LENGTH) in ollama_hints[0]
def test_lm_studio_specific_hint(self, cli_obj):
"""LM Studio-specific fix shown when port 1234 detected."""
+2 -2
View File
@@ -534,7 +534,7 @@ def test_model_flow_custom_saves_verified_v1_base_url(monkeypatch, capsys):
# then display name. The api_mode prompt also runs before model selection.
answers = iter(["http://localhost:8000", "local-key", "", "", "", "", ""])
monkeypatch.setattr("builtins.input", lambda _prompt="": next(answers))
monkeypatch.setattr("hermes_cli.secret_prompt.masked_secret_prompt", lambda _prompt="": next(answers))
monkeypatch.setattr("getpass.getpass", lambda _prompt="": next(answers))
hermes_main._model_flow_custom({})
output = capsys.readouterr().out
@@ -592,7 +592,7 @@ def test_model_flow_custom_persists_selected_api_mode(monkeypatch):
]
)
monkeypatch.setattr("builtins.input", lambda _prompt="": next(answers))
monkeypatch.setattr("hermes_cli.secret_prompt.masked_secret_prompt", lambda _prompt="": "test-key")
monkeypatch.setattr("getpass.getpass", lambda _prompt="": "test-key")
hermes_main._model_flow_custom({"model": {"provider": "custom"}})
-41
View File
@@ -75,44 +75,3 @@ class TestCliResumeCommand:
assert "out of range" in printed.lower()
assert "/resume" in printed
assert cli_obj.session_id == "current_session"
def test_handle_resume_strips_outer_brackets(self):
"""Users copy `<session_id>` from the usage hint literally.
Strip outer ``<>``, ``[]``, ``""``, and ``''`` before lookup so
``/resume <abc123>`` works the same as ``/resume abc123``.
"""
cli_obj = _make_cli()
cli_obj._session_db.get_session.return_value = {"id": "sess_alpha", "title": "Alpha"}
cli_obj._session_db.get_messages_as_conversation.return_value = []
cli_obj._session_db.resolve_resume_session_id.return_value = "sess_alpha"
for raw in ("<sess_alpha>", "[sess_alpha]", '"sess_alpha"', "'sess_alpha'"):
cli_obj.session_id = "current_session"
with (
patch("hermes_cli.main._resolve_session_by_name_or_id", return_value="sess_alpha"),
patch("cli._cprint"),
):
cli_obj._handle_resume_command(f"/resume {raw}")
assert cli_obj.session_id == "sess_alpha", (
f"bracket-stripping failed for {raw!r}: session_id stayed {cli_obj.session_id}"
)
def test_handle_resume_does_not_strip_partial_brackets(self):
"""Mismatched or single brackets must pass through unmodified.
``"<half`` (just an open angle) is not a wrapping pair, so the
lookup should treat it verbatim preserving the existing
not-found error path instead of mangling the input.
"""
cli_obj = _make_cli()
cli_obj._session_db.get_session.return_value = None
with (
patch("hermes_cli.main._resolve_session_by_name_or_id", return_value=None),
patch("cli._cprint") as mock_cprint,
):
cli_obj._handle_resume_command("/resume <half")
printed = " ".join(str(call) for call in mock_cprint.call_args_list)
assert "<half" in printed
+2 -2
View File
@@ -83,10 +83,10 @@ def test_cancel_secret_capture_marks_setup_skipped():
assert cli._secret_deadline == 0
def test_secret_capture_uses_masked_prompt_without_tui():
def test_secret_capture_uses_getpass_without_tui():
cli = _make_cli_stub()
with patch("hermes_cli.callbacks.masked_secret_prompt", return_value="secret-value"), patch(
with patch("hermes_cli.callbacks.getpass.getpass", return_value="secret-value"), patch(
"hermes_cli.callbacks.save_env_value_secure"
) as save_secret:
save_secret.return_value = {
@@ -1,83 +0,0 @@
"""Tests for the CLI exit summary's resume hint, including profile-flag support."""
from datetime import datetime
from unittest.mock import MagicMock, patch
from cli import HermesCLI
def _make_cli(session_id="20260524_000001_abc123"):
cli_obj = HermesCLI.__new__(HermesCLI)
cli_obj.session_id = session_id
# _print_exit_summary requires a populated conversation history (msg_count > 0)
# to print the resume hint at all. One synthetic user turn is enough.
cli_obj.conversation_history = [{"role": "user", "content": "hi"}]
cli_obj.agent = None
cli_obj._session_db = None
cli_obj.session_start = datetime.now()
return cli_obj
class TestExitSummaryResumeHint:
"""The exit-line ``Resume this session with:`` hint must include the
active profile (`-p <name>`) so session IDs round-trip across
profile boundaries sessions live under `~/.hermes-profiles/<profile>/`,
so a hint copied without `-p` from a non-default profile won't find
the session.
"""
def test_resume_hint_no_profile_flag_on_default(self, capsys):
cli_obj = _make_cli()
with patch("hermes_cli.profiles.get_active_profile_name", return_value="default"):
cli_obj._print_exit_summary()
out = capsys.readouterr().out
# No `-p` for the default profile.
assert "hermes --resume 20260524_000001_abc123" in out
assert " -p " not in out
def test_resume_hint_no_profile_flag_on_custom(self, capsys):
cli_obj = _make_cli()
with patch("hermes_cli.profiles.get_active_profile_name", return_value="custom"):
cli_obj._print_exit_summary()
out = capsys.readouterr().out
# "custom" is the standard HERMES_HOME indicator — no -p needed.
assert "hermes --resume 20260524_000001_abc123" in out
assert " -p " not in out
def test_resume_hint_includes_profile_flag_for_named_profile(self, capsys):
cli_obj = _make_cli()
with patch("hermes_cli.profiles.get_active_profile_name", return_value="dev"):
cli_obj._print_exit_summary()
out = capsys.readouterr().out
assert "hermes --resume 20260524_000001_abc123 -p dev" in out
def test_resume_hint_includes_profile_flag_on_title_hint_too(self, capsys, tmp_path):
"""When a session title is available, the `hermes -c "title"` hint
must also include the `-p` flag for non-default profiles.
"""
cli_obj = _make_cli()
fake_db = MagicMock()
fake_db.get_session_title.return_value = "My Cool Session"
cli_obj._session_db = fake_db
with patch("hermes_cli.profiles.get_active_profile_name", return_value="dev"):
cli_obj._print_exit_summary()
out = capsys.readouterr().out
assert 'hermes -c "My Cool Session" -p dev' in out
assert "hermes --resume 20260524_000001_abc123 -p dev" in out
def test_resume_hint_falls_back_when_profile_lookup_fails(self, capsys):
"""If `get_active_profile_name` raises (e.g. profiles module
missing during ``hermes update`` mid-flight), fall back to no
flag rather than crashing the exit summary.
"""
cli_obj = _make_cli()
with patch(
"hermes_cli.profiles.get_active_profile_name",
side_effect=RuntimeError("profiles unavailable"),
):
cli_obj._print_exit_summary()
out = capsys.readouterr().out
# Resume hint still printed without -p.
assert "hermes --resume 20260524_000001_abc123" in out
assert " -p " not in out
-121
View File
@@ -1,121 +0,0 @@
"""Tests for /resume status lines going to stderr in quiet mode (#11793).
The fix in cli._init_agent routes three messages to stderr when
``tool_progress_mode == "off"`` (set by ``hermes chat --quiet``):
* "Session not found: ..."
* "↻ Resumed session ... (N user messages, M total messages)"
* "Session ... found but has no messages. Starting fresh."
Interactive mode (tool_progress_mode == "full") still uses ChatConsole.
"""
from datetime import datetime
from unittest.mock import MagicMock, patch
import pytest
from cli import HermesCLI
def _make_cli(quiet=False, session_id="20260524_111111_xyz", db=None):
"""Build a minimal HermesCLI bound to only what _init_agent needs for
the resume code path: _resumed, _session_db, conversation_history,
session_id, and tool_progress_mode."""
cli = HermesCLI.__new__(HermesCLI)
cli.session_id = session_id
cli._resumed = True
cli.conversation_history = []
cli._session_db = db
cli.tool_progress_mode = "off" if quiet else "full"
cli.session_start = datetime.now()
cli.agent = None
# We need _init_agent to reach the resume block (line ~4757) but not
# proceed into actual AIAgent construction. _ensure_runtime_credentials
# must return True (False returns early at line 4743). _install_tool_callbacks,
# _ensure_tirith_security are stubbed; the resume block will either return
# False (session-not-found) or reach the eventual AIAgent() call which
# we'll let raise — we only check stdout/stderr printed BEFORE that.
cli._install_tool_callbacks = lambda: None
cli._ensure_tirith_security = lambda: None
cli._ensure_runtime_credentials = lambda: True
return cli
class TestResumeQuietStderr:
def test_session_not_found_goes_to_stderr_in_quiet_mode(self, capsys):
db = MagicMock()
db.get_session.return_value = None
cli = _make_cli(quiet=True, db=db)
with patch("cli._prepare_deferred_agent_startup"):
result = cli._init_agent()
captured = capsys.readouterr()
assert result is False
# stdout must stay clean
assert "Session not found" not in captured.out
# the resume status goes to stderr
assert "Session not found" in captured.err
assert "hermes sessions list" in captured.err
def test_session_not_found_goes_to_stdout_in_full_mode(self, capsys):
db = MagicMock()
db.get_session.return_value = None
cli = _make_cli(quiet=False, db=db)
with patch("cli._prepare_deferred_agent_startup"):
result = cli._init_agent()
captured = capsys.readouterr()
assert result is False
# Interactive mode keeps the existing _cprint path → stdout.
assert "Session not found" in captured.out
def test_resumed_banner_goes_to_stderr_in_quiet_mode(self, capsys):
db = MagicMock()
db.get_session.return_value = {"id": "20260524_111111_xyz", "title": "demo"}
db.resolve_resume_session_id.return_value = "20260524_111111_xyz"
db.get_messages_as_conversation.return_value = [
{"role": "user", "content": "hi"},
{"role": "assistant", "content": "hey"},
]
db._conn = MagicMock() # for the reopen execute() call
cli = _make_cli(quiet=True, db=db)
# Stop _init_agent right after the resume banner: prevent it from
# constructing a real AIAgent (the next code path).
with patch("cli._prepare_deferred_agent_startup"):
try:
cli._init_agent()
except Exception:
# The post-resume agent-init machinery may fail in this
# stubbed context (no API key, no real config) — we only
# care about the printed banner that comes earlier.
pass
captured = capsys.readouterr()
# Banner on stderr — stdout stays clean for automation.
assert "↻ Resumed session" not in captured.out
assert "↻ Resumed session" in captured.err
assert "20260524_111111_xyz" in captured.err
assert "demo" in captured.err
def test_no_messages_goes_to_stderr_in_quiet_mode(self, capsys):
db = MagicMock()
db.get_session.return_value = {"id": "20260524_111111_xyz"}
db.resolve_resume_session_id.return_value = "20260524_111111_xyz"
db.get_messages_as_conversation.return_value = []
db._conn = MagicMock()
cli = _make_cli(quiet=True, db=db)
with patch("cli._prepare_deferred_agent_startup"):
try:
cli._init_agent()
except Exception:
pass
captured = capsys.readouterr()
assert "has no messages" not in captured.out
assert "has no messages" in captured.err
assert "Starting fresh" in captured.err
-113
View File
@@ -1,113 +0,0 @@
"""Tests for the KeyboardInterrupt guard around slash command dispatch.
A Ctrl+C during a slow slash command (e.g. /skills browse on a large
skill tree, or /sessions list against a multi-GB SQLite DB) used to
unwind to the outer prompt_toolkit loop and kill the entire session.
The fix wraps `self.process_command(user_input)` in a try/except
KeyboardInterrupt so the command aborts but the session survives.
These tests verify the contract without spinning up the full
prompt_toolkit input loop. We exercise the same try/except by calling
through a thin wrapper that mirrors the real dispatch shape.
"""
from unittest.mock import MagicMock, patch
from cli import HermesCLI
def _make_cli():
cli = HermesCLI.__new__(HermesCLI)
cli._should_exit = False
cli.conversation_history = []
cli.agent = None
cli._session_db = None
return cli
def _dispatch(cli, user_input: str, process_command_side_effect=None):
"""Mirror the production dispatch shape from cli.py around line 14236.
Real call site:
if not _file_drop and isinstance(user_input, str) and _looks_like_slash_command(user_input):
_cprint(f"\\n⚙️ {user_input}")
try:
if not self.process_command(user_input):
self._should_exit = True
if app.is_running:
app.exit()
except KeyboardInterrupt:
_cprint("\\n[dim]Command interrupted.[/dim]")
continue
"""
if process_command_side_effect is not None:
with patch.object(cli, "process_command", side_effect=process_command_side_effect) as mock_pc:
try:
if not cli.process_command(user_input):
cli._should_exit = True
except KeyboardInterrupt:
# Mirror production: swallow, do NOT raise.
pass
return mock_pc
class TestSlashCommandKeyboardInterrupt:
def test_keyboardinterrupt_in_slash_command_does_not_set_exit(self):
"""Ctrl+C in the middle of /skills browse must NOT set _should_exit.
Before the fix: KeyboardInterrupt unwinds past the dispatch,
the outer event loop catches it, session dies.
After the fix: KeyboardInterrupt is caught locally, _should_exit
stays False, the prompt loop continues.
"""
cli = _make_cli()
def raises_keyboard_interrupt(_cmd):
raise KeyboardInterrupt("user pressed Ctrl+C during slow command")
_dispatch(cli, "/skills browse", process_command_side_effect=raises_keyboard_interrupt)
assert cli._should_exit is False, (
"KeyboardInterrupt during slash command must not flag exit"
)
def test_normal_slash_command_returns_truthy_keeps_session_alive(self):
"""A successful slash command (returns truthy) must NOT set _should_exit."""
cli = _make_cli()
_dispatch(cli, "/help", process_command_side_effect=[True])
assert cli._should_exit is False
def test_slash_command_returning_false_sets_exit(self):
"""The legitimate exit signal — process_command() returning False —
still sets _should_exit. This is the path /exit / /quit use."""
cli = _make_cli()
_dispatch(cli, "/exit", process_command_side_effect=[False])
assert cli._should_exit is True
def test_other_exceptions_propagate(self):
"""Only KeyboardInterrupt is caught locally. Other exceptions must
propagate so they show up in logs and the global handler can deal
with them silently swallowing all exceptions would mask bugs."""
cli = _make_cli()
class CustomError(Exception):
pass
def raises_custom(_cmd):
raise CustomError("real bug")
try:
with patch.object(cli, "process_command", side_effect=raises_custom):
try:
if not cli.process_command("/something"):
cli._should_exit = True
except KeyboardInterrupt:
pass # would NOT catch CustomError
except CustomError:
return # expected — non-KBI exceptions propagate
raise AssertionError("CustomError should have propagated")
-259
View File
@@ -1,259 +0,0 @@
"""Regression tests for issue #30768: /reset and /new freeze on Windows.
``_prompt_text_input_modal`` uses a queue-based modal that relies on
prompt_toolkit key bindings receiving keyboard events. On Windows the
prompt_toolkit input channel can deadlock when the modal is entered from
the ``process_loop`` daemon thread. The fix falls back to the simpler
``_prompt_text_input`` (stdin-based) prompt on Windows and non-main threads.
These tests verify:
1. Windows detection triggers the stdin fallback
2. Non-main thread detection triggers the stdin fallback
3. macOS/Linux main-thread path still uses the modal (no regression)
4. No-app path still uses the stdin fallback (existing behavior)
5. Empty choices returns None (existing behavior)
"""
import queue
import sys
import threading
import time
from unittest.mock import MagicMock, patch
import pytest
def _make_cli():
"""Minimal HermesCLI shell exposing prompt/modal helpers."""
import cli as cli_mod
obj = object.__new__(cli_mod.HermesCLI)
obj._app = MagicMock()
obj._status_bar_visible = True
obj._last_invalidate = 0.0
obj._modal_input_snapshot = None
obj._slash_confirm_state = None
obj._slash_confirm_deadline = 0
return obj
# ---------------------------------------------------------------------------
# Sample choices used across tests
# ---------------------------------------------------------------------------
_SAMPLE_CHOICES = [
("once", "Approve Once", "proceed this time only"),
("always", "Always Approve", "proceed and silence this prompt permanently"),
("cancel", "Cancel", "keep current conversation"),
]
class TestModalWindowsFallback:
"""Windows dead-lock regression tests for _prompt_text_input_modal."""
def test_windows_falls_back_to_stdin(self):
"""On Windows, _prompt_text_input_modal should use _prompt_text_input."""
cli = _make_cli()
with patch.object(sys, "platform", "win32"), \
patch.object(cli, "_prompt_text_input", return_value="1") as mock_stdin:
result = cli._prompt_text_input_modal(
title="⚠️ /new — destroys conversation state",
detail="This starts a fresh session.",
choices=_SAMPLE_CHOICES,
)
# The stdin-based fallback was used, not the modal queue path.
mock_stdin.assert_called_once_with("Choice [1/2/3]: ")
assert result == "1"
def test_non_main_thread_falls_back_to_stdin(self):
"""Off the main thread, _prompt_text_input_modal should use stdin fallback."""
cli = _make_cli()
result_holder = {}
def run_on_daemon():
# Patch platform to "linux" so the Windows check doesn't short-circuit.
with patch.object(sys, "platform", "linux"), \
patch.object(cli, "_prompt_text_input", return_value="2") as mock_stdin:
result_holder["result"] = cli._prompt_text_input_modal(
title="⚠️ /reset",
detail="This starts a fresh session.",
choices=_SAMPLE_CHOICES,
)
result_holder["stdin_called"] = mock_stdin.called
t = threading.Thread(target=run_on_daemon, daemon=True)
t.start()
t.join(timeout=2.0)
assert not t.is_alive(), "daemon thread hung — modal deadlocked"
assert result_holder["stdin_called"] is True
assert result_holder["result"] == "2"
def test_main_thread_non_windows_uses_modal(self):
"""On macOS/Linux main thread, the queue-based modal is still used."""
cli = _make_cli()
# We need to simulate the modal receiving a response. We'll patch
# the response_queue to immediately return a value.
with patch.object(sys, "platform", "darwin"), \
patch.object(cli, "_capture_modal_input_snapshot"), \
patch.object(cli, "_restore_modal_input_snapshot"), \
patch.object(cli, "_invalidate"):
# Start the modal in a way that it will receive a response
# immediately via the queue.
original_queue = queue.Queue
original_time = time.monotonic
def _fake_modal_flow(*args, **kwargs):
"""Simulate the modal flow: set state, put response, return."""
# We'll directly test that the modal path is entered by
# checking that _slash_confirm_state was set.
pass
# Since we can't easily mock the internal queue, let's test
# that the modal path is entered by checking that
# _prompt_text_input was NOT called.
with patch.object(cli, "_prompt_text_input") as mock_stdin:
# Set up a response that will be put into the queue
# after the modal starts waiting.
def _submit_after_delay():
time.sleep(0.2)
state = cli._slash_confirm_state
if state and "response_queue" in state:
state["response_queue"].put("once")
submitter = threading.Thread(target=_submit_after_delay, daemon=True)
submitter.start()
result = cli._prompt_text_input_modal(
title="⚠️ /new",
detail="This starts a fresh session.",
choices=_SAMPLE_CHOICES,
timeout=5,
)
submitter.join(timeout=2.0)
# The stdin fallback should NOT have been called.
mock_stdin.assert_not_called()
# The result should be "once" from the simulated modal response.
assert result == "once"
def test_no_app_falls_back_to_stdin(self):
"""Without a prompt_toolkit app, always use stdin fallback."""
cli = _make_cli()
cli._app = None
with patch.object(cli, "_prompt_text_input", return_value="3") as mock_stdin:
result = cli._prompt_text_input_modal(
title="⚠️ /clear",
detail="This clears the screen.",
choices=_SAMPLE_CHOICES,
)
mock_stdin.assert_called_once_with("Choice [1/2/3]: ")
assert result == "3"
def test_empty_choices_returns_none(self):
"""Empty choices list should return None without prompting."""
cli = _make_cli()
with patch.object(cli, "_prompt_text_input") as mock_stdin:
result = cli._prompt_text_input_modal(
title="Test",
detail="Test",
choices=[],
)
mock_stdin.assert_not_called()
assert result is None
def test_windows_fallback_does_not_set_modal_state(self):
"""Verify Windows fallback doesn't leave _slash_confirm_state set."""
cli = _make_cli()
with patch.object(sys, "platform", "win32"), \
patch.object(cli, "_prompt_text_input", return_value="1"):
cli._prompt_text_input_modal(
title="⚠️ /reset",
detail="This starts a fresh session.",
choices=_SAMPLE_CHOICES,
)
assert cli._slash_confirm_state is None
def test_non_main_thread_fallback_does_not_set_modal_state(self):
"""Verify daemon-thread fallback doesn't leave modal state set."""
cli = _make_cli()
errors = []
def run_on_daemon():
try:
with patch.object(sys, "platform", "linux"), \
patch.object(cli, "_prompt_text_input", return_value="1"):
cli._prompt_text_input_modal(
title="⚠️ /new",
detail="This starts a fresh session.",
choices=_SAMPLE_CHOICES,
)
if cli._slash_confirm_state is not None:
errors.append("_slash_confirm_state should be None")
except Exception as exc:
errors.append(str(exc))
t = threading.Thread(target=run_on_daemon, daemon=True)
t.start()
t.join(timeout=2.0)
assert not errors, f"unexpected errors: {errors}"
assert cli._slash_confirm_state is None
class TestConfirmDestructiveSlashWindows:
"""Integration-level tests for _confirm_destructive_slash on Windows."""
def test_confirm_destructive_slash_bypasses_modal_on_windows(self):
"""_confirm_destructive_slash should work on Windows via stdin fallback."""
cli = _make_cli()
cli.model = "test-model"
cli._agent_running = False
cli._spinner_text = ""
cli._should_exit = False
cli._command_running = False
cli.session_id = "test-session"
cli._pending_tool_info = {}
cli._tool_start_time = 0.0
cli._last_scrollback_tool = ""
with patch.object(sys, "platform", "win32"), \
patch.object(cli, "_prompt_text_input", return_value="1"), \
patch("cli.load_cli_config", return_value={"approvals": {"destructive_slash_confirm": True}}):
result = cli._confirm_destructive_slash(
"new",
"This starts a fresh session.\nThe current conversation history will be discarded.",
)
assert result == "once"
def test_confirm_destructive_slash_cancelled_on_windows(self):
"""Cancellation via stdin fallback works on Windows."""
cli = _make_cli()
cli.model = "test-model"
cli._agent_running = False
cli._spinner_text = ""
cli._should_exit = False
cli._command_running = False
cli.session_id = "test-session"
cli._pending_tool_info = {}
cli._tool_start_time = 0.0
cli._last_scrollback_tool = ""
with patch.object(sys, "platform", "win32"), \
patch.object(cli, "_prompt_text_input", return_value="3"), \
patch("cli.load_cli_config", return_value={"approvals": {"destructive_slash_confirm": True}}):
result = cli._confirm_destructive_slash(
"reset",
"This starts a fresh session.\nThe current conversation history will be discarded.",
)
# Choice "3" normalizes to "cancel", which returns None.
assert result is None
-30
View File
@@ -1,6 +1,5 @@
"""Tests for cron job context_from feature (issue #5439 Option C)."""
import logging
import sys
from pathlib import Path
@@ -268,35 +267,6 @@ class TestBuildJobPromptContextFrom:
assert "Process" in prompt
assert "etc/passwd" not in prompt
def test_invalid_job_id_log_includes_job_origin(self, cron_env, caplog):
"""Invalid stored context_from refs log job/source provenance."""
from cron.jobs import create_job
from cron.scheduler import _build_job_prompt
job = create_job(
prompt="Process",
schedule="every 2h",
name="suspicious-chain",
origin={
"platform": "api_server",
"chat_id": "api",
"source_ip": "203.0.113.10",
"forwarded_for": "198.51.100.7",
},
)
job["context_from"] = ["../../../etc/passwd"]
caplog.set_level(logging.WARNING, logger="cron.scheduler")
prompt = _build_job_prompt(job)
assert "Process" in prompt
message = caplog.text
assert "context_from: skipping invalid job_id" in message
assert job["id"] in message
assert "suspicious-chain" in message
assert "203.0.113.10" in message
assert "198.51.100.7" in message
class TestUpdateContextFrom:
-41
View File
@@ -232,23 +232,6 @@ class TestJobCRUD:
assert remove_job(job["id"]) is True
assert get_job(job["id"]) is None
def test_remove_job_rejects_unsafe_legacy_id_before_output_cleanup(self, tmp_cron_dir):
"""Legacy unsafe IDs left over from before the create-time guard
must fail closed without half-applying the removal."""
job = create_job(prompt="Legacy unsafe", schedule="every 1h")
job["id"] = "../escape"
save_jobs([job])
outside = tmp_cron_dir / "escape"
outside.mkdir()
(outside / "keep.txt").write_text("keep", encoding="utf-8")
with pytest.raises(ValueError, match="output path"):
remove_job("../escape")
# Job should still be in the store and the escape dir untouched.
assert load_jobs()[0]["id"] == "../escape"
assert (outside / "keep.txt").exists()
def test_remove_nonexistent_returns_false(self, tmp_cron_dir):
assert remove_job("nonexistent") is False
@@ -317,17 +300,6 @@ class TestUpdateJob:
result = update_job("nonexistent_id", {"name": "X"})
assert result is None
def test_update_rejects_id_change(self, tmp_cron_dir):
"""Job IDs are filesystem path components — must be immutable."""
job = create_job(prompt="Original", schedule="every 1h")
with pytest.raises(ValueError, match="id"):
update_job(job["id"], {"id": "../escape"})
# Original job still resolvable, no rename happened.
assert get_job(job["id"]) is not None
assert get_job("../escape") is None
class TestPauseResumeJob:
def test_pause_sets_state(self, tmp_cron_dir):
@@ -981,16 +953,3 @@ class TestSaveJobOutput:
assert output_file.exists()
assert output_file.read_text() == "# Results\nEverything ok."
assert "test123" in str(output_file)
@pytest.mark.parametrize("bad_job_id", ["../escape", "nested/escape", ".", "..", ""])
def test_rejects_unsafe_job_id(self, tmp_cron_dir, bad_job_id):
"""Path-escape attempts must fail closed and never create dirs."""
with pytest.raises(ValueError, match="output path"):
save_job_output(bad_job_id, "# Results")
assert not (tmp_cron_dir / "escape").exists()
def test_rejects_absolute_job_id(self, tmp_cron_dir):
"""Absolute paths as job IDs must fail closed."""
with pytest.raises(ValueError, match="output path"):
save_job_output(str(tmp_cron_dir / "outside"), "# Results")
assert not (tmp_cron_dir / "outside").exists()
-36
View File
@@ -1021,42 +1021,6 @@ class TestRunJobSessionPersistence:
kwargs = mock_agent_cls.call_args.kwargs
assert kwargs["enabled_toolsets"] == ["web", "terminal", "file"]
def test_run_job_disabled_toolsets_layer_user_config_on_baseline(self, tmp_path):
"""agent.disabled_toolsets must be honoured in cron — issue #25752.
The bug: per-job enabled_toolsets was returned verbatim, letting an
LLM-supplied cronjob() call re-enable tools the operator had globally
disabled. The fix: ALWAYS include agent.disabled_toolsets in the
disabled_toolsets passed to AIAgent, on top of the cron baseline
(cronjob/messaging/clarify). AIAgent's disabled_toolsets takes
precedence over enabled_toolsets, so this stops the bypass.
"""
(tmp_path / "config.yaml").write_text(
"agent:\n"
" disabled_toolsets:\n"
" - terminal\n"
" - file\n",
encoding="utf-8",
)
job = {
"id": "policy-job",
"name": "test",
"prompt": "hello",
"enabled_toolsets": ["web", "terminal", "file"],
}
fake_db, patches = self._make_run_job_patches(tmp_path)
with patches[0], patches[1], patches[2], patches[3], patches[4], \
patch("run_agent.AIAgent") as mock_agent_cls:
mock_agent = MagicMock()
mock_agent.run_conversation.return_value = {"final_response": "ok"}
mock_agent_cls.return_value = mock_agent
run_job(job)
kwargs = mock_agent_cls.call_args.kwargs
assert set(kwargs["disabled_toolsets"]) >= {
"cronjob", "messaging", "clarify", "terminal", "file",
}
def test_run_job_enabled_toolsets_resolves_from_platform_config_when_not_set(self, tmp_path):
"""When a job has no explicit enabled_toolsets, the scheduler now
resolves them from ``hermes tools`` platform config for ``cron``
-31
View File
@@ -11,7 +11,6 @@ Covers:
"""
import json
import logging
from unittest.mock import MagicMock, patch
import pytest
@@ -152,9 +151,6 @@ class TestCreateJob:
"name": "test-job",
"schedule": "*/5 * * * *",
"prompt": "do something",
}, headers={
"X-Forwarded-For": "203.0.113.11",
"User-Agent": "cron-client",
})
assert resp.status == 200
data = await resp.json()
@@ -164,10 +160,6 @@ class TestCreateJob:
assert call_kwargs["name"] == "test-job"
assert call_kwargs["schedule"] == "*/5 * * * *"
assert call_kwargs["prompt"] == "do something"
assert call_kwargs["origin"]["platform"] == "api_server"
assert call_kwargs["origin"]["chat_id"] == "api"
assert call_kwargs["origin"]["forwarded_for"] == "203.0.113.11"
assert call_kwargs["origin"]["user_agent"] == "cron-client"
@pytest.mark.asyncio
async def test_create_job_missing_name(self, adapter):
@@ -288,29 +280,6 @@ class TestGetJob:
data = await resp.json()
assert "Invalid" in data["error"]
@pytest.mark.asyncio
async def test_invalid_job_id_logs_source_context(self, adapter, caplog):
"""Invalid job-id probes log source metadata for later investigation."""
app = _create_app(adapter)
caplog.set_level(logging.WARNING, logger="gateway.platforms.api_server")
async with TestClient(TestServer(app)) as cli:
with patch(f"{_MOD}._CRON_AVAILABLE", True):
resp = await cli.get(
"/api/jobs/..%2F..%2F..%2Fetc%2Fpasswd",
headers={
"X-Forwarded-For": "203.0.113.9",
"User-Agent": "probe scanner",
},
)
assert resp.status == 400
message = caplog.text
assert "Cron jobs API rejected invalid job_id" in message
assert "203.0.113.9" in message
assert "GET" in message
assert "/api/jobs/" in message
assert "probe scanner" in message
# ---------------------------------------------------------------------------
# 11-12. test_update_job
@@ -1,111 +0,0 @@
"""Regression tests for #29335 — gateway must persist ``session_entry.session_id``
after the agent's compression path mutates it.
When ``_compress_context()`` rolls the agent forward into a new session, the
agent now returns the new ``session_id`` in its result dict. The gateway
updates ``session_entry.session_id`` in memory AND must call
``session_store._save()`` so the new mapping survives a gateway restart.
Without ``_save()``, the next turn loads the OLD session's transcript and
re-triggers compression forever.
Three sites in ``gateway/run.py`` mutate ``session_entry.session_id`` after
a compression-induced session split. All three MUST be followed by a
``_save()`` call. This test pins that invariant.
"""
from __future__ import annotations
import ast
import inspect
import textwrap
from gateway import run as gateway_run
def _session_id_assignments_followed_by_save(source: str) -> list[tuple[int, bool]]:
"""For each ``session_entry.session_id = ...`` assignment in *source*,
return ``(lineno, saved_within_5_stmts)`` True iff a
``self.session_store._save()`` call appears in the same block within the
next 5 statements (covers normal control flow without false-flagging
cleanup that lives 200 lines away).
"""
tree = ast.parse(textwrap.dedent(source))
results: list[tuple[int, bool]] = []
class _Visitor(ast.NodeVisitor):
def _is_session_id_assign(self, node: ast.AST) -> bool:
if not isinstance(node, ast.Assign):
return False
for target in node.targets:
if (
isinstance(target, ast.Attribute)
and target.attr == "session_id"
and isinstance(target.value, ast.Name)
and target.value.id == "session_entry"
):
return True
return False
def _block_has_save_after(self, body: list[ast.stmt], idx: int) -> bool:
for stmt in body[idx : idx + 6]:
for sub in ast.walk(stmt):
if (
isinstance(sub, ast.Call)
and isinstance(sub.func, ast.Attribute)
and sub.func.attr == "_save"
):
return True
return False
def _walk_body(self, body: list[ast.stmt]) -> None:
for i, stmt in enumerate(body):
if self._is_session_id_assign(stmt):
results.append((stmt.lineno, self._block_has_save_after(body, i)))
for child in ast.iter_child_nodes(stmt):
if isinstance(child, (ast.If, ast.For, ast.While, ast.With,
ast.Try, ast.AsyncWith, ast.AsyncFor)):
self._walk_node(child)
def _walk_node(self, node: ast.AST) -> None:
for attr in ("body", "orelse", "finalbody"):
inner = getattr(node, attr, None)
if isinstance(inner, list):
self._walk_body(inner)
if hasattr(node, "handlers"):
for handler in node.handlers:
self._walk_body(handler.body)
def visit(self, node: ast.AST) -> None:
if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
self._walk_body(node.body)
for child in ast.iter_child_nodes(node):
self.visit(child)
_Visitor().visit(tree)
return results
def test_every_post_compression_session_id_assignment_persists():
"""Every ``session_entry.session_id = ...`` in gateway/run.py must be
followed by a ``session_store._save()`` call within the same block.
Regression for #29335 — the assignment at the end of
``_handle_message_with_agent`` used to skip ``_save()`` while two sibling
sites (hygiene rewrite, manual /compress) already persisted. The agent
would compress correctly, the gateway would update its in-memory
session_id, then drop it on next gateway restart.
"""
source = inspect.getsource(gateway_run)
assignments = _session_id_assignments_followed_by_save(source)
assert assignments, (
"No ``session_entry.session_id = ...`` assignments found in gateway/run.py — "
"either the structure changed or the AST walker is broken."
)
missing = [lineno for lineno, saved in assignments if not saved]
assert not missing, (
f"{len(missing)} ``session_entry.session_id = ...`` site(s) in gateway/run.py "
f"are not followed by ``session_store._save()`` within the same block "
f"(lines: {missing}). Every post-compression session_id update must persist "
f"or the next turn loads the pre-compression transcript and triggers an "
f"infinite compression loop. See issue #29335."
)
-79
View File
@@ -1516,13 +1516,6 @@ class TestSetupFilesSlashCommand:
class TestUserOAuthHelper:
@staticmethod
def _assert_private_json_file(path, expected):
assert json.loads(path.read_text(encoding="utf-8")) == expected
assert list(path.parent.glob(f"{path.stem}.tmp.*")) == []
if os.name != "nt":
assert (path.stat().st_mode & 0o777) == 0o600
def test_load_user_credentials_returns_none_when_no_token(self, tmp_path, monkeypatch):
"""Missing token file is the expected no-op case (user hasn't
run /setup-files yet). Must NOT raise."""
@@ -1617,78 +1610,6 @@ class TestUserOAuthHelper:
assert a != legacy
assert "google_chat_user_oauth_pending" in str(a.parent)
def test_persist_credentials_writes_private_json(self, tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
from plugins.platforms.google_chat.oauth import _persist_credentials, _token_path
creds = type(
"Creds",
(),
{
"to_json": lambda self: json.dumps(
{
"client_id": "cid",
"client_secret": "secret",
"refresh_token": "rtok",
"token": "atok",
}
)
},
)()
path = _token_path("alice@example.com")
_persist_credentials(creds, path)
self._assert_private_json_file(
path,
{
"client_id": "cid",
"client_secret": "secret",
"refresh_token": "rtok",
"token": "atok",
"type": "authorized_user",
},
)
def test_store_client_secret_writes_private_json(self, tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
src = tmp_path / "client_secret.json"
payload = {"installed": {"client_id": "cid", "client_secret": "secret"}}
src.write_text(json.dumps(payload), encoding="utf-8")
from plugins.platforms.google_chat.oauth import (
_client_secret_path,
store_client_secret,
)
store_client_secret(str(src))
self._assert_private_json_file(_client_secret_path(), payload)
def test_save_pending_auth_writes_private_json(self, tmp_path, monkeypatch):
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
from plugins.platforms.google_chat.oauth import (
_REDIRECT_URI,
_pending_auth_path,
_save_pending_auth,
)
_save_pending_auth(
state="state-123",
code_verifier="verifier-abc",
email="alice@example.com",
)
self._assert_private_json_file(
_pending_auth_path("alice@example.com"),
{
"state": "state-123",
"code_verifier": "verifier-abc",
"redirect_uri": _REDIRECT_URI,
"email": "alice@example.com",
},
)
class TestPerUserAttachmentRouting:
"""The bot must use the *requesting user's* OAuth token when sending
+4 -4
View File
@@ -71,7 +71,7 @@ class TestMattermostConfigLoading:
def _make_adapter():
"""Create a MattermostAdapter with mocked config."""
from plugins.platforms.mattermost.adapter import MattermostAdapter
from gateway.platforms.mattermost import MattermostAdapter
config = PlatformConfig(
enabled=True,
token="test-token",
@@ -637,19 +637,19 @@ class TestMattermostRequirements:
def test_check_requirements_with_token_and_url(self, monkeypatch):
monkeypatch.setenv("MATTERMOST_TOKEN", "test-token")
monkeypatch.setenv("MATTERMOST_URL", "https://mm.example.com")
from plugins.platforms.mattermost.adapter import check_mattermost_requirements
from gateway.platforms.mattermost import check_mattermost_requirements
assert check_mattermost_requirements() is True
def test_check_requirements_without_token(self, monkeypatch):
monkeypatch.delenv("MATTERMOST_TOKEN", raising=False)
monkeypatch.delenv("MATTERMOST_URL", raising=False)
from plugins.platforms.mattermost.adapter import check_mattermost_requirements
from gateway.platforms.mattermost import check_mattermost_requirements
assert check_mattermost_requirements() is False
def test_check_requirements_without_url(self, monkeypatch):
monkeypatch.setenv("MATTERMOST_TOKEN", "test-token")
monkeypatch.delenv("MATTERMOST_URL", raising=False)
from plugins.platforms.mattermost.adapter import check_mattermost_requirements
from gateway.platforms.mattermost import check_mattermost_requirements
assert check_mattermost_requirements() is False
+1 -1
View File
@@ -829,7 +829,7 @@ class TestSlackDownloadSlackFileBytes:
def _make_mm_adapter():
"""Build a minimal MattermostAdapter with mocked internals."""
from plugins.platforms.mattermost.adapter import MattermostAdapter
from gateway.platforms.mattermost import MattermostAdapter
config = PlatformConfig(
enabled=True, token="mm-token-fake",
extra={"url": "https://mm.example.com"},
-109
View File
@@ -1,7 +1,6 @@
"""Tests for gateway/platforms/base.py — MessageEvent, media extraction, message truncation."""
import os
import time
from unittest.mock import patch
import pytest
@@ -368,10 +367,6 @@ class TestMediaDeliveryPathValidation:
"gateway.platforms.base.MEDIA_DELIVERY_SAFE_ROOTS",
tuple(roots),
)
# Disable recency-based trust by default so the original allowlist
# tests continue to exercise the strict-allowlist path. Tests that
# specifically cover recency trust re-enable it themselves.
monkeypatch.setenv("HERMES_MEDIA_TRUST_RECENT_FILES", "0")
def test_allows_existing_file_inside_safe_root(self, tmp_path, monkeypatch):
root = tmp_path / "media-cache"
@@ -431,110 +426,6 @@ class TestMediaDeliveryPathValidation:
assert BasePlatformAdapter.validate_media_delivery_path(str(media_file)) == str(media_file.resolve())
def test_recency_trust_allows_freshly_produced_file(self, tmp_path, monkeypatch):
"""A PDF the agent just wrote to /tmp should be deliverable.
Covers the natural case: agent runs ``pandoc -o /tmp/report.pdf`` or
``write_file('/home/user/report.pdf', ...)`` and asks the gateway to
send the result. With recency trust on, fresh files outside the cache
allowlist are accepted because the file's mtime is within the window.
"""
self._patch_roots(monkeypatch) # zero cache allowlist
monkeypatch.delenv("HERMES_MEDIA_ALLOW_DIRS", raising=False)
monkeypatch.setenv("HERMES_MEDIA_TRUST_RECENT_FILES", "1")
monkeypatch.setenv("HERMES_MEDIA_TRUST_RECENT_SECONDS", "600")
fresh = tmp_path / "scratch" / "report.pdf"
fresh.parent.mkdir(parents=True)
fresh.write_bytes(b"%PDF-1.4")
assert BasePlatformAdapter.validate_media_delivery_path(str(fresh)) == str(fresh.resolve())
def test_recency_trust_rejects_old_file(self, tmp_path, monkeypatch):
"""A pre-existing host file (~/.bashrc, /etc/passwd shape) is rejected.
Recency trust is the load-bearing anti-injection signal: prompt-injected
paths point at files that have existed for days or months, well outside
the trust window.
"""
self._patch_roots(monkeypatch)
monkeypatch.delenv("HERMES_MEDIA_ALLOW_DIRS", raising=False)
monkeypatch.setenv("HERMES_MEDIA_TRUST_RECENT_FILES", "1")
monkeypatch.setenv("HERMES_MEDIA_TRUST_RECENT_SECONDS", "60")
stale = tmp_path / "stale.pdf"
stale.write_bytes(b"%PDF-1.4")
old_mtime = time.time() - 7200 # 2 hours ago
os.utime(stale, (old_mtime, old_mtime))
assert BasePlatformAdapter.validate_media_delivery_path(str(stale)) is None
def test_recency_trust_disabled_falls_back_to_pure_allowlist(self, tmp_path, monkeypatch):
"""Setting trust_recent_files=false reverts to pre-existing strict behavior."""
self._patch_roots(monkeypatch)
monkeypatch.delenv("HERMES_MEDIA_ALLOW_DIRS", raising=False)
monkeypatch.setenv("HERMES_MEDIA_TRUST_RECENT_FILES", "0")
fresh = tmp_path / "report.pdf"
fresh.write_bytes(b"%PDF-1.4") # mtime = now
assert BasePlatformAdapter.validate_media_delivery_path(str(fresh)) is None
def test_recency_trust_denies_system_paths_even_when_fresh(self, tmp_path, monkeypatch):
"""A freshly-touched file under /etc must NOT be uploaded.
Belt-and-braces: even if an attacker rewrites the file's mtime
(e.g. via a separately compromised tool result that touches a system
file), the denylist refuses to deliver paths under /etc, /proc, /sys,
~/.ssh, ~/.aws, etc.
"""
self._patch_roots(monkeypatch)
monkeypatch.delenv("HERMES_MEDIA_ALLOW_DIRS", raising=False)
monkeypatch.setenv("HERMES_MEDIA_TRUST_RECENT_FILES", "1")
monkeypatch.setenv("HERMES_MEDIA_TRUST_RECENT_SECONDS", "600")
# Simulate $HOME so ~/.ssh resolves into our tmp dir.
fake_home = tmp_path / "home"
ssh_dir = fake_home / ".ssh"
ssh_dir.mkdir(parents=True)
secret = ssh_dir / "id_rsa.txt"
secret.write_bytes(b"-----BEGIN ...") # mtime = now
monkeypatch.setenv("HOME", str(fake_home))
assert BasePlatformAdapter.validate_media_delivery_path(str(secret)) is None
def test_recency_trust_allows_pdf_in_project_dir(self, tmp_path, monkeypatch):
"""The motivating case: agent produces a PDF in a project directory.
Reproduces the Discord-PDF-not-delivered bug. Before recency trust,
files outside ~/.hermes/cache/* were silently dropped, leaving the
user with a raw filepath in chat instead of an attachment.
"""
self._patch_roots(monkeypatch)
monkeypatch.delenv("HERMES_MEDIA_ALLOW_DIRS", raising=False)
monkeypatch.setenv("HERMES_MEDIA_TRUST_RECENT_FILES", "1")
monkeypatch.setenv("HERMES_MEDIA_TRUST_RECENT_SECONDS", "600")
project = tmp_path / "my-project"
report = project / "build" / "weekly-report.pdf"
report.parent.mkdir(parents=True)
report.write_bytes(b"%PDF-1.4")
assert BasePlatformAdapter.validate_media_delivery_path(str(report)) == str(report.resolve())
def test_filter_keeps_recently_produced_files(self, tmp_path, monkeypatch):
"""End-to-end: filter_local_delivery_paths routes a fresh PDF through."""
self._patch_roots(monkeypatch)
monkeypatch.delenv("HERMES_MEDIA_ALLOW_DIRS", raising=False)
monkeypatch.setenv("HERMES_MEDIA_TRUST_RECENT_FILES", "1")
monkeypatch.setenv("HERMES_MEDIA_TRUST_RECENT_SECONDS", "600")
fresh = tmp_path / "report.pdf"
fresh.write_bytes(b"%PDF-1.4")
out = BasePlatformAdapter.filter_local_delivery_paths([str(fresh)])
assert out == [str(fresh.resolve())]
# ---------------------------------------------------------------------------
# should_send_media_as_audio

Some files were not shown because too many files have changed in this diff Show More