Compare commits

113 Commits

Author SHA1 Message Date
alt-glitch 2b47b40c10 docs(lsp): add feature page — setup, CLI, supported languages, troubleshooting
Covers: enable flow, server installation (detect-only default vs
hermes lsp install), how diagnostics reach the model, config knobs,
all 26 supported languages, and troubleshooting common issues.
2026-05-12 15:23:47 +00:00
alt-glitch b1a609fba3 chore: remove plan from PR (working document, not shipped) 2026-05-12 15:18:20 +00:00
alt-glitch 6d80aa80eb refactor(lsp): simplify __init__.py per /simplify review
- Remove dead _post_tool_call (body was only comments)
- Remove _on_session_start (redundant — _ensure_service lazy-inits)
- Remove _atexit_cleanup (duplicate of _on_session_end)
- Switch _baselines from dict to set (presence sentinel only)
- Remove redundant enabled_for recheck in transform_tool_result
- Remove V4A guard (path-empty check already covers it)
- Use modern type syntax (X | None, dict[], set[])
- Reduce from 322 → 217 lines, same behavior

77/77 tests pass.
2026-05-12 15:01:44 +00:00
alt-glitch e0a1778028 fix(lsp): address review findings — TOCTOU, None guard, JSON safety
Fixes from Claude Code adversarial review:
- Snapshot _service to local var before .is_active() (TOCTOU fix)
- Guard session_id against None with 'or ""'
- Remove text-append fallback — only inject when result is dict JSON
- Add ValueError to json.dumps except clause
- Guard result=None with 'or ""' and isinstance check

Non-dict JSON results and non-JSON results are now left unmodified
(return None = no injection) rather than risking format corruption.
2026-05-12 13:13:53 +00:00
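The JSON-safety rule from e0a1778028 (inject only when the tool result is a JSON object, otherwise return None) might look roughly like the sketch below. The function name `inject_diagnostics` and its signature are illustrative assumptions, not the plugin's actual code; only the `lsp_diagnostics` field name and the `or ""` / `isinstance` / `ValueError` guards come from the commit message.

```python
import json


def inject_diagnostics(result, diagnostics):
    """Return the result JSON with an 'lsp_diagnostics' field added.

    Returns None (meaning "leave the tool result unmodified") whenever the
    result is missing, is not a string, is not JSON, or is JSON but not an
    object. Hypothetical helper, not the plugin's real function.
    """
    text = result or ""  # guard result=None
    if not isinstance(text, str) or not diagnostics:
        return None
    try:
        payload = json.loads(text)
    except (TypeError, ValueError):  # JSONDecodeError subclasses ValueError
        return None
    if not isinstance(payload, dict):
        return None  # lists, numbers, strings: never append text to these
    payload["lsp_diagnostics"] = diagnostics
    try:
        return json.dumps(payload)
    except (TypeError, ValueError):  # unserialisable or circular values
        return None
```

Returning None rather than appending text keeps every non-object result byte-for-byte identical to what the tool produced.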
alt-glitch 40a9327248 fix(lsp): wire CLI subcommands via setup_lsp_parser for plugin registration
register_cli_command's setup_fn receives an already-created parser,
not the parent's SubParsersAction. Added setup_lsp_parser() that adds
subcommands (status, list, install, restart, which) to the provided
parser.

Verified: 'hermes lsp status' works from cold shell when plugin is
enabled in plugins.enabled config.
2026-05-12 13:08:49 +00:00
alt-glitch 23344a9a3c feat(lsp): plugin-based LSP diagnostics with zero core changes
Ship LSP semantic diagnostics as a bundled plugin (plugins/lsp/) using
existing hook system.  Zero lines of core code modified.

Plugin wiring:
- pre_tool_call: capture LSP baseline before write_file/patch
- transform_tool_result: inject diagnostics into tool result JSON
- on_session_start/on_session_end + atexit: lifecycle management

Key design:
- Baselines keyed by (session_id, abs_path) for concurrent safety
- Diagnostics added as 'lsp_diagnostics' JSON field (preserves shape)
- Per-file workspace detection (no static session-start gate)
- V4A multi-file patch skipped for MVP
- Short timeout (3s) — cold start degrades gracefully
- os.path.exists heuristic for Docker/SSH backend skip
- First relevant write with no server → INFO log with install hint

Tests: 77/77 pass including:
- Protocol framing, reporter formatting, workspace resolution
- Client E2E against mock LSP server (live_system_guard_bypass)
- Eventlog steady-state silence contract
- Backend-gate heuristic (local vs non-local paths)
- Full hook flow integration (pre→write→transform with diagnostics)

Source: PR #24168 by @teknium1, PR #24155 by @OutThisLife
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
2026-05-12 13:01:13 +00:00
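As a rough illustration of the "baselines keyed by (session_id, abs_path)" design in 23344a9a3c, the sketch below stores the diagnostics seen before a write and reports only the ones that are new afterwards. The class name, the method names, and the idea of diffing plain strings are all assumptions made for the example; the commit does not say how the plugin compares diagnostics.

```python
import os


class BaselineStore:
    """Hypothetical per-session, per-file baseline store."""

    def __init__(self):
        self._baselines = {}

    @staticmethod
    def _key(session_id, path):
        # Keying on both values keeps concurrent sessions that edit the
        # same file from overwriting each other's baseline.
        return (session_id or "", os.path.abspath(path))

    def capture(self, session_id, path, diagnostics):
        """Record diagnostics as they were before the write (pre_tool_call)."""
        self._baselines[self._key(session_id, path)] = frozenset(diagnostics)

    def new_since_baseline(self, session_id, path, diagnostics):
        """Return diagnostics absent from the baseline, consuming the baseline."""
        baseline = self._baselines.pop(self._key(session_id, path), frozenset())
        return [d for d in diagnostics if d not in baseline]
```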
Teknium dd0923bb89 docs: remove public advisory page (handle community comms separately) (#24253) 2026-05-12 01:09:58 -07:00
Teknium c1eb2dcda7 feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220)
* feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback

Three coordinated mitigations for the Mini Shai-Hulud worm hitting
mistralai 2.4.6 on PyPI (2026-05-12) and for the next single-package
compromise that follows.

# What this PR makes true

1. Users with the poisoned mistralai 2.4.6 in their venv get a loud
   detection banner with copy-pasteable remediation steps the moment
   they run hermes (and on every gateway startup).
2. One quarantined / yanked PyPI package can no longer silently demote
   a fresh install to 'core only' — the installer keeps every other
   extra and tells the user which tier landed.
3. Future opt-in backends (Mistral, ElevenLabs, Honcho, etc.) can
   lazy-install on first use under a strict allowlist, instead of
   eagerly pulling everything at install time.

# Detection: hermes_cli/security_advisories.py

- ADVISORIES catalog (one entry currently: shai-hulud-2026-05 for
  mistralai==2.4.6). Adding the next one is a single dataclass.
- detect_compromised() uses importlib.metadata.version() — no pip
  dependency, works in uv venvs that lack pip.
- Banner cache (~/.hermes/cache/advisory_banner_seen) rate-limits
  the startup banner to once per 24h per advisory.
- Acks persisted to security.acked_advisories in config.yaml; never
  re-banner after ack.
- Wired into:
  * hermes doctor — runs first, prints full remediation block
  * hermes doctor --ack <id> — dismisses an advisory
  * cli.py interactive run() and single-query branches — short
    stderr banner pointing at hermes doctor
  * gateway/run.py startup — operator-visible warning in gateway.log
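
A minimal sketch of the detection idea, assuming a dataclass-based catalog as described above. The `Advisory` field names and the injectable `version_of` parameter are illustrative assumptions; the real module's layout may differ. The only library call relied on is `importlib.metadata.version()`, which needs no pip.

```python
from dataclasses import dataclass
from importlib import metadata


@dataclass(frozen=True)
class Advisory:
    # Field names are hypothetical; the commit only says "a single dataclass".
    advisory_id: str
    package: str
    bad_versions: frozenset


ADVISORIES = (
    Advisory("shai-hulud-2026-05", "mistralai", frozenset({"2.4.6"})),
)


def detect_compromised(advisories=ADVISORIES, version_of=metadata.version):
    """Return the advisories whose package is installed at a bad version."""
    hits = []
    for advisory in advisories:
        try:
            installed = version_of(advisory.package)
        except metadata.PackageNotFoundError:
            continue  # package not installed, nothing to report
        if installed in advisory.bad_versions:
            hits.append(advisory)
    return hits
```

Passing `version_of` as a parameter is only a convenience for testing with a mocked installed version.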

# Lazy-install framework: tools/lazy_deps.py

- LAZY_DEPS allowlist maps namespaced feature keys (tts.elevenlabs,
  memory.honcho, provider.bedrock, etc.) to pip specs.
- ensure(feature) installs missing deps in the active venv via the
  uv → pip → ensurepip ladder (matches tools_config._pip_install).
- Strict spec safety regex rejects URLs, file paths, shell metas,
  pip flag injection, control chars — only PyPI-by-name accepted.
- Gated on security.allow_lazy_installs (default true) plus the
  HERMES_DISABLE_LAZY_INSTALLS env var for restricted/audited envs.
- Migrated three backends as proof of pattern:
  * tools/tts_tool.py — _import_elevenlabs() calls ensure first
  * plugins/memory/honcho/client.py — get_honcho_client lazy-installs
  * tts.mistral / stt.mistral entries pre-registered for when PyPI
    restores mistralai
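
To make the "only PyPI-by-name accepted" rule concrete, here is one way such a safety check could be written. The pattern below is an assumption for illustration and is not the regex shipped in tools/lazy_deps.py; it accepts a project name, optional extras, and simple version clauses, and rejects everything else.

```python
import re

# Illustrative pattern, not the shipped one.
_VERSION_CLAUSE = r"(?:==|>=|<=|~=|!=|<|>)[A-Za-z0-9.*+!]+"
_SAFE_SPEC = re.compile(
    r"[A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?"        # project name
    r"(?:\[[A-Za-z0-9._,-]+\])?"                         # optional [extras]
    rf"(?:{_VERSION_CLAUSE}(?:,{_VERSION_CLAUSE})*)?"    # optional version clauses
)


def is_safe_spec(spec):
    """True only for plain 'name[extras]<op>version' specs.

    fullmatch() means a leading '-', a URL scheme, a path separator,
    whitespace, ';' or a trailing newline all cause rejection.
    """
    return isinstance(spec, str) and _SAFE_SPEC.fullmatch(spec) is not None
```

An allowlist-style pattern like this fails closed: anything it does not recognise is rejected instead of being passed to the installer.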

# Installer fallback tiers

scripts/install.sh, scripts/install.ps1, setup-hermes.sh:

- Centralised _BROKEN_EXTRAS list (currently: mistral). Edit one
  array when a transitive breaks; users keep every other extra.
- New 'all minus known-broken' tier between [all] and the existing
  PyPI-only-extras tier. Only kicks in when [all] fails resolve.
- All three tiers explicit: every fallback announces which tier
  landed and prints a re-run hint when not on Tier 1.
- install.ps1 and install.sh both regenerate their tier specs from
  the same _BROKEN_EXTRAS array so updates stay in sync.

Side effect: install.ps1 Tier 2 spec previously hardcoded 'mistral'
in its extra list — bug fixed by the refactor (mistral is filtered
out).

# Config

hermes_cli/config.py — DEFAULT_CONFIG.security gains:
- acked_advisories: []  (advisory IDs the user has dismissed)
- allow_lazy_installs: True  (security gate for ensure())

No config version bump needed — both keys nest under existing
security: block, and load_config's deep-merge picks up DEFAULT_CONFIG
defaults for users with older configs.
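
The reason no version bump is needed can be shown with a generic deep-merge sketch. This is not `load_config`'s actual code; `deep_merge` and the `existing_key` entry are hypothetical, and only the two new default keys come from the commit.

```python
import copy


def deep_merge(defaults, user):
    """Return defaults overlaid with user values, recursing into nested dicts."""
    merged = copy.deepcopy(defaults)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


DEFAULTS = {"security": {"acked_advisories": [], "allow_lazy_installs": True}}
# An older user config that predates the new keys ("existing_key" is made up).
old_user_config = {"security": {"existing_key": "custom"}}
merged = deep_merge(DEFAULTS, old_user_config)
```

Because the merge recurses into the `security` block, the older config keeps its own value and still gains both new defaults.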

# Tests

tests/hermes_cli/test_security_advisories.py — 23 tests covering:
- detect_compromised matches/non-matches, wildcard frozenset
- ack persistence, idempotence, blank rejection, config-failure path
- banner cache rate limiting + 24h re-banner + ack-stops-banner
- short_banner_lines / full_remediation_text / render_doctor_section /
  gateway_log_message
- shipped catalog well-formedness invariant

tests/tools/test_lazy_deps.py — 40 tests covering:
- spec safety: 11 safe parametrized + 18 unsafe parametrized
- allowlist: unknown-feature rejection, namespace.name shape,
  every shipped spec passes the safety regex
- security gating: config flag, env var, default, fail-open
- ensure() happy/sad paths: already-satisfied, install success,
  pip stderr surfaced on failure, install-succeeds-but-still-missing
- is_available, feature_install_command

Combined: 63 new tests, all passing under scripts/run_tests.sh.

# Validation

- scripts/run_tests.sh tests/hermes_cli/test_security_advisories.py
  tests/tools/test_lazy_deps.py → 63/63 passing
- scripts/run_tests.sh tests/hermes_cli/test_doctor.py
  tests/hermes_cli/test_doctor_command_install.py
  tests/tools/test_tts_mistral.py tests/tools/test_transcription_tools.py
  tests/tools/test_transcription_dotenv_fallback.py → 165/165 passing
- scripts/run_tests.sh tests/hermes_cli/ tests/tools/ →
  9191 passed, 8 pre-existing failures (verified on origin/main
  before this change)
- bash -n on install.sh and setup-hermes.sh → OK
- py_compile on all modified .py files → OK
- End-to-end smoke test of detect_compromised + render_doctor_section
  + gateway_log_message with mocked installed version → produces
  copy-pasteable remediation output

# Community

Full advisory + remediation steps:
website/docs/community/security-advisories/shai-hulud-mistralai-2026-05.md

Short-form post drafts (Discord, GitHub pinned issue, README banner):
scripts/community-announcement-shai-hulud.md

Refs: PR #24205 (mistral disabled), Socket Security advisory
<https://socket.dev/blog/mini-shai-hulud-worm-pypi>

* build(deps): pin every direct dep to ==X.Y.Z (no ranges)

Companion to the supply-chain advisory work: replace every >=/</~= range
in pyproject.toml's [project.dependencies] and [project.optional-dependencies]
with an exact ==X.Y.Z pin sourced from uv.lock.

Why: ranges allow PyPI to ship a fresh version of any direct dep at any
time without a code review on our side. With ranges, the malicious
mistralai 2.4.6 release would have been pulled by every fresh
'pip install -e .[all]' for the hours between upload and PyPI's
quarantine — exactly the install window we got hit on. Exact pins close
that window: the only way a new package version reaches a user is via
an intentional update on our end.

What the user-facing change is: nothing, behavior-wise. Every package
resolves to the same version it was already resolving to via uv.lock —
the pins just remove the resolver's freedom to pick a different one.

Cost: any user installing Hermes alongside another package that requires
a newer pin gets a resolver conflict. Acceptable for our isolated-venv
install path; documented in the new comment block.

Build-system requires line (setuptools>=61.0) is intentionally left
as a range — pinning the build backend would block fresh pip from
bootstrapping the build on architectures where that exact wheel isn't
available.

mistral extra (mistralai==2.3.0) is pinned but stays out of [all]
(per PR #24205). 'uv lock' regeneration will fail until PyPI restores
mistralai; lockfile regeneration is gated behind that, NOT on every PR.

LAZY_DEPS in tools/lazy_deps.py also moved to exact pins so the lazy-
install pathway can never resolve a different version than the one
declared in pyproject.toml.

Validation:

- Cross-checked all 77 pinned direct deps in pyproject.toml against
  uv.lock — every pin matches the resolved version exactly.
- Cross-checked all LAZY_DEPS specs against uv.lock — same.
- 'uv pip install -e .[all] --dry-run' resolves 205 packages cleanly.
- tests/tools/test_lazy_deps.py + tests/hermes_cli/test_security_advisories.py
  → 63/63 passing (every shipped spec passes the safety regex).
- Doctor + TTS + transcription targeted suite → 146/146 passing.

* build(deps): hash-verify transitives via uv.lock; remove unresolvable [mistral] extra

You asked: 'what about the dependencies the dependencies rely on?' —
correctly noting that exact-pinning direct deps in pyproject.toml does
NOT cover the transitive graph. `pip install` and `uv pip install` both
re-resolve transitives fresh from PyPI at install time, so a compromised
transitive (e.g. `httpcore` if it got worm-poisoned tomorrow) would
still hit our users even with every direct dep exact-pinned.

# What this commit fixes

1. **Both real installer scripts now prefer `uv sync --locked` as Tier 0.**
   uv.lock records SHA256 hashes for every transitive — a compromised
   package with a different hash gets REJECTED. Falls through to the
   existing `uv pip install` cascade if the lockfile is missing or
   stale, with a loud warning that the fallback path does NOT
   hash-verify transitives. Previously only `setup-hermes.sh` (the dev
   path) used the lockfile; `scripts/install.sh` and `scripts/install.ps1`
   (the paths fresh users actually run) skipped it.

2. **Removed the `[mistral]` extra entirely.** The `mistralai` PyPI
   project is fully quarantined right now — every version returns 404,
   so any pin we wrote was unresolvable, which broke `uv lock --check`
   in CI. Restoration is documented in pyproject.toml as a 5-step
   checklist (verify, re-add extra, re-enable in 4 modules, regenerate
   lock, optionally re-add to [all]).

3. **Regenerated uv.lock.** 262 packages, mistralai/eval-type-backport/
   jsonpath-python pruned. `uv lock --check` now passes.

# Defense-in-depth view

| Layer                               | Where             | Protects against                            |
|-------------------------------------|-------------------|---------------------------------------------|
| Exact pins in pyproject             | direct deps       | new mistralai 2.4.6-style direct compromise |
| uv.lock + `--locked` install        | transitive graph  | transitive worm injection                   |
| Tier-0 hash-verified path           | install.sh / .ps1 | actually USE the lockfile in fresh installs |
| `uv lock --check` CI gate           | every PR          | drift between pyproject and lockfile        |
| `hermes_cli/security_advisories.py` | runtime           | cleanup for users who already got hit       |

The exact pinning + hash verification together close the supply-chain
gap. Without the lockfile path, exact pins alone are theater.

# Validation

- `uv lock --check` → passes (262 packages resolved, no drift).
- `bash -n` on install.sh + setup-hermes.sh → OK.
- 209/209 tests passing across new + adjacent test files
  (test_lazy_deps.py, test_security_advisories.py, test_doctor.py,
  test_tts_mistral.py, test_transcription_tools.py).
- TOML parse OK.

* chore: remove community announcement drafts (PR body covers it)

* build(deps): lazy-install every opt-in backend (anthropic, search, terminal, platforms, dashboard)

Extends the lazy-install framework to cover everything that's not used by
every hermes session. Base install drops from ~60 packages to 45.

Moved out of core dependencies = []:
- anthropic   (only when provider=anthropic native, not via aggregators)
- exa-py, firecrawl-py, parallel-web (search backends; only when picked)
- fal-client  (image gen; only when picked)
- edge-tts    (default TTS but still optional)

New extras in pyproject.toml: [anthropic] [exa] [firecrawl] [parallel-web]
[fal] [edge-tts]. All added to [all].

New LAZY_DEPS entries: provider.anthropic, search.{exa,firecrawl,parallel},
tts.edge, image.fal, memory.hindsight, platform.{telegram,discord,matrix},
terminal.{modal,daytona,vercel}, tool.dashboard.

Each import site now calls ensure() before importing the SDK. Where the
module had a top-level try/except (telegram, discord, fastapi), the
graceful-fallback pattern was extended to lazy-install on first
check_*_requirements() call and re-bind module globals.

Updated test_windows_native_support.py tzdata check from snapshot
(>=2023.3 literal) to invariant (any version + win32 marker).

Validation:
- Base install: 45 packages (was ~60); 6 newly-extracted packages absent
- uv lock --check: passes (262 packages, no drift)
- 209/209 lazy_deps + advisory + doctor + tts/transcription tests passing
- py_compile clean on all 12 modified modules
2026-05-12 01:02:25 -07:00
Teknium 99ad2d1372 fix(deps): unbreak [all] install — drop mistralai while PyPI quarantined (#24205)
The `mistralai` PyPI package was quarantined on 2026-05-12 after a
malicious 2.4.6 release. Every fresh resolve (AUR makepkg, Docker build,
CI run, install.sh first-run) currently fails on
`mistralai>=2.3.0,<3` because PyPI returns zero candidates.

Existing users running `hermes update` mostly didn't notice — `hermes
update` falls back from `.[all]` to per-extra retries and silently
skips mistral with a warning that scrolls past. But fresh installs
hard-fail or lose every other extra.

Changes:
- pyproject.toml: drop `hermes-agent[mistral]` from `[all]` and
  `[termux-all]`. The `mistral` extra itself is preserved so users
  can opt back in once PyPI un-quarantines.
- hermes_cli/tools_config.py: hide Mistral Voxtral TTS from the
  `hermes tools` provider picker until restored.
- hermes_cli/web_server.py: drop "mistral" from dashboard STT options.
- tools/transcription_tools.py: explicit `provider: mistral` returns
  "none" with a clear status message; auto-detect skips mistral.
- tools/tts_tool.py: dispatcher returns a clear "temporarily disabled"
  error before any SDK import attempt (avoids cached-stale-package
  surprises).
- tests/tools/: update three test files to assert the new disabled
  behavior. Each test docstring records why and points at the rollback
  trigger (PyPI un-quarantines mistralai).

Restore plan: revert this commit once the package is available on PyPI
again. The behavior change is intentional and documented in code
comments + test docstrings to make the rollback trivial.

Validation:
- scripts/run_tests.sh tests/tools/ -k 'mistral or stt or tts' →
  425/425 passing.

Refs: https://pypi.org/simple/mistralai/ (currently
"pypi:project-status: quarantined").
2026-05-11 23:02:15 -07:00
nightcityblade 407683b72d fix(docs): repair Voice & TTS provider table
Fixes NousResearch/hermes-agent#24101
2026-05-11 22:42:00 -07:00
Robin Fernandes 94d9db72ba add client marker tag on aux inference requests 2026-05-11 22:30:42 -07:00
Austin Pickett 58e2109f10 fix(minimax): harden OAuth dashboard and runtime
Handle MiniMax OAuth expiry values consistently across CLI and dashboard
flows, fix CLI status/add behavior, and force pooled OAuth runtime
requests through Anthropic Messages.

- web_server._minimax_poller: parse expired_in via the shared resolver
  so unix-ms absolute timestamps stop landing as TTL seconds and crashing
  with 'year 583911 is out of range' when a user connects MiniMax OAuth
  from the dashboard.
- auth._minimax_oauth_login / _refresh_minimax_oauth_state: same fix on
  the CLI login + refresh paths.
- auth.get_auth_status: dispatch minimax-oauth to its dedicated status
  function instead of falling through.
- auth_commands.auth_add_command: 'hermes auth add minimax-oauth' now
  starts the device-code login flow and persists a pool entry with the
  access + refresh tokens, instead of requiring credentials to already
  exist.
- runtime_provider._resolve_runtime_from_pool_entry: pin pooled
  minimax-oauth credentials to anthropic_messages so a stale
  model.api_mode: chat_completions can't send requests to
  /anthropic/chat/completions and trigger MiniMax nginx 404s.
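
The `expired_in` ambiguity (absolute unix-ms timestamp versus seconds-from-now) can be handled with a magnitude check, sketched below. The function name and the threshold are assumptions for illustration; the shared resolver in the codebase may use a different rule.

```python
import time

# Assumed cut-off: 10**11 ms is early 1973 as a timestamp, while 10**11
# seconds would be a TTL of roughly 3000 years, so real values fall
# clearly on one side or the other.
_ABSOLUTE_MS_THRESHOLD = 10**11


def resolve_expires_at(expired_in, now=None):
    """Return the absolute expiry time in unix seconds."""
    now = time.time() if now is None else now
    value = float(expired_in)
    if value >= _ABSOLUTE_MS_THRESHOLD:
        return value / 1000.0  # already absolute, just convert ms to s
    return now + value         # relative TTL in seconds
```

Treating a unix-ms timestamp as a TTL adds a huge number of seconds to the current time, which is the kind of out-of-range date the commit describes.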

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 22:15:16 -07:00
rob-maron 32abe742fa fix comment 2026-05-11 21:30:29 -07:00
rob-maron f0c2964f0b remove comments 2026-05-11 21:30:29 -07:00
rob-maron 057fc7b073 fix guard 2026-05-11 21:30:29 -07:00
rob-maron 528bba6734 fix kimi 2026-05-11 21:30:29 -07:00
Teknium 7993e03c06 fix(cache): route Nous Portal Qwen through Portal-Claude cache pathway (#24151)
Qwen models on Nous Portal (e.g. qwen3.6-plus) now get the same envelope-layout
cache_control markers and long-lived (1h cross-session) cache treatment as
Portal Claude. Portal proxies to OpenRouter with identical wire-format and
cache_control semantics, but the prior policy left Portal Qwen falling through
to the alibaba-family branch (which only matches provider=opencode/alibaba),
serving 0% cache hits and re-billing the full prompt every turn.

Scope is narrow: Portal Claude OR Portal Qwen. Other models on Portal keep
their existing behavior.

- _anthropic_prompt_cache_policy: add (is_nous_portal and qwen) -> (True, False)
- _supports_long_lived_anthropic_cache: drop Claude-only gate for Portal so
  Qwen also gets the validated 1h cross-session layout
- tests cover both functions, both bare and vendored qwen slug forms, and
  the rejection of non-Claude non-Qwen Portal traffic
2026-05-11 21:04:55 -07:00
Ben Barclay 3c23b15f81 fix(tui-clipboard): skip native safety net on OSC52-capable terminals (#20954)
* fix(tui-clipboard): skip native safety net on OSC52-capable terminals

On terminals with first-class OSC 52 support (Ghostty, kitty, WezTerm,
Windows Terminal, VS Code), setClipboard() currently fires both OSC 52
AND a parallel native-tool write (wl-copy / xclip / pbcopy). On Wayland
+ wl-copy this corrupts the clipboard: probeLinuxCopy() runs wl-copy
with empty stdin as an existence check (destructive — wipes clipboard
to empty string), and the subsequent real wl-copy invocation races
OSC 52 plus its own daemon's previous SIGTERM.

Symptom: user on Arch + Ghostty + wl-copy (Wayland, no tmux, no SSH)
had to press Ctrl+Shift+C three times before a selection landed.
env -u WAYLAND_DISPLAY -u DISPLAY HERMES_TUI_FORCE_OSC52=1 (which
short-circuits copyNative via the DISPLAY-absent early-return) made
every copy work instantly — proving OSC 52 alone is sufficient on
Ghostty and that copyNative() is actively destructive there.

Add OSC52_CAPABLE_TERMINALS allowlist to terminal.ts (same pattern as
the existing EXTENDED_KEYS_TERMINALS), and gate copyNative() on the
terminal NOT being on it. The native safety net continues to fire on
unrecognised terminals (xterm, GNOME Terminal, Konsole, Terminal.app,
etc.) where OSC 52 is less reliable.

* fix(tui-clipboard): address Copilot review feedback

- Move OSC52_CAPABLE_TERMINALS + supportsOsc52Clipboard() from
  ink/terminal.ts to utils/env.ts. ink/terminal.ts already imports
  link from ink/termio/osc.ts; importing back into termio/osc.ts
  introduced a circular dependency. utils/env.ts has no deps on
  either file and already owns terminal detection (detectTerminal()),
  so the helper sits naturally next to it.

- Replace the inline gating (!SSH_CONNECTION && !supportsOsc52Clipboard())
  with a pure shouldUseNativeClipboard(env, terminal) helper. The old
  expression skipped native on allowlisted terminals even when
  setClipboard() wouldn't actually emit OSC 52 (e.g. inside
  TMUX/STY where we use tmux load-buffer instead, or when the user
  has set HERMES_TUI_FORCE_OSC52=0). That made the clipboard write
  a no-op in those configurations. The new helper:
    1. SSH_CONNECTION set -> false (existing behaviour)
    2. TMUX or STY set -> true (we go through load-buffer, no race)
    3. shouldEmitClipboardSequence() false -> true (native is the
       only path left when OSC 52 is suppressed)
    4. Otherwise: skip native iff terminal is allowlisted.

- Add 11 tests for shouldUseNativeClipboard covering the SSH guard,
  TMUX/STY tmux-inside-Ghostty case, HERMES_TUI_FORCE_OSC52=0
  override, allowlisted vs non-allowlisted terminals, precedence,
  and default-args smoke. Tests follow the package's existing
  parameterised-helper style (no vi.mock; helpers accept env and
  terminal as arguments).

- Update test imports to the new utils/env.js path.
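
The four-step precedence of `shouldUseNativeClipboard` is transliterated into Python below purely to make the ordering easy to read; the real helper is TypeScript in utils/env.ts. The terminal identifiers in the allowlist and the `osc52_enabled` flag (standing in for `shouldEmitClipboardSequence()`) are assumptions.

```python
# Hypothetical identifiers; the real allowlist lives in utils/env.ts.
OSC52_CAPABLE_TERMINALS = frozenset(
    {"ghostty", "kitty", "wezterm", "windows-terminal", "vscode"}
)


def should_use_native_clipboard(env, terminal, osc52_enabled=True):
    """Decide whether to also write the clipboard via wl-copy/xclip/pbcopy."""
    if env.get("SSH_CONNECTION"):
        return False  # 1. remote session: a native tool would hit the wrong host
    if env.get("TMUX") or env.get("STY"):
        return True   # 2. multiplexer path uses load-buffer, so no OSC 52 race
    if not osc52_enabled:
        return True   # 3. OSC 52 suppressed: native is the only path left
    return terminal not in OSC52_CAPABLE_TERMINALS  # 4. skip only if allowlisted
```

Taking `env` and `terminal` as arguments mirrors the pure-helper style the commit describes, so each rule can be tested without mocking.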

* fix(tui-clipboard): address Copilot round 2 feedback

* fix(tui-clipboard): address Copilot round 3 feedback

* fix(tui-clipboard): address Copilot round 4 feedback
2026-05-11 19:40:07 -07:00
Teknium e85592591e fix(nous): surface Portal-flagged free models in picker even when curated list is stale (#24082)
Free-tier users were seeing 'No free models currently available.' in the
`hermes model` and post-login pickers even though qwen/qwen3.6-plus is
free on the Portal right now. Three independent breakages compounded:

1. The docs-hosted catalog manifest at website/static/api/model-catalog.json
   was not regenerated when _PROVIDER_MODELS['nous'] was updated, so users
   fetching the manifest got a list that didn't include qwen/qwen3.6-plus.
2. _resolve_nous_pricing_credentials() returned ('', '') on any auth blip,
   collapsing get_pricing_for_provider('nous') to {} and making every
   curated model fall through the free-tier filter as 'paid'.
3. Even with healthy pricing, the picker only ever showed models from the
   in-repo curated list intersected with live pricing — a Portal-flagged
   free model not yet in the curated list could never appear.

Changes:
- hermes_cli/models.py: new union_with_portal_free_recommendations() that
  augments the curated list with Portal freeRecommendedModels entries
  (with synthetic free pricing so partition keeps them). The Portal's
  /api/nous/recommended-models endpoint is now the source of truth for
  free-tier surfacing — old Hermes builds will see new free models
  without a CLI release.
- hermes_cli/models.py: _resolve_nous_pricing_credentials() falls back to
  the public inference base URL when runtime cred resolution fails.
  The /v1/models endpoint exposes pricing without auth, so silently
  returning {} just because a refresh token expired was wrong.
- hermes_cli/auth.py + hermes_cli/main.py: both free-tier picker call
  sites call union_with_portal_free_recommendations() before partition.
- tests/hermes_cli/test_models.py: 7 tests covering union behaviour
  (prepend, dedup, end-to-end with stale pricing, empty/missing/error
  payloads, invalid entries).
- tests/hermes_cli/test_model_catalog.py: drift guard
  TestManifestMatchesInRepoLists fails CI when _PROVIDER_MODELS['nous']
  or OPENROUTER_MODELS is edited without re-running
  scripts/build_model_catalog.py. Verified empirically that removing a
  manifest entry triggers an assertion with an actionable error message.
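
The prepend-and-dedup behaviour of the union step could be sketched as follows. The function here is a simplified stand-in: it assumes `freeRecommendedModels` is a list of plain model-id strings and omits the synthetic free pricing, neither of which is confirmed by the commit message.

```python
def union_with_free_recommendations(curated, payload):
    """Prepend Portal-recommended free models to the curated list, deduplicated.

    Missing, malformed, or empty payloads leave the curated list unchanged.
    """
    recommended = payload.get("freeRecommendedModels") if isinstance(payload, dict) else None
    if not isinstance(recommended, list):
        return list(curated)
    free_ids = []
    for entry in recommended:
        # Assumed entry shape: a non-empty string id; anything else is skipped.
        if isinstance(entry, str) and entry and entry not in free_ids:
            free_ids.append(entry)
    return free_ids + [model for model in curated if model not in free_ids]
```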

Validation:
- 133/133 targeted tests pass (test_models, test_model_catalog,
  test_auth_nous_provider).
- Live E2E against the real Portal:
  - Stale curated list ['claude-opus','claude-sonnet','gpt-5.4'] (no
    qwen) → after union: ['qwen/qwen3.6-plus', ...] →
    partition(free_tier=True): selectable=['qwen/qwen3.6-plus'].
  - Simulated expired refresh token → anonymous fetch returns 403 pricing
    entries (an entry count, not an HTTP status), including qwen/qwen3.6-plus -> {prompt:0, completion:0}.
- ruff: clean.
2026-05-11 18:08:16 -07:00
Teknium ced1990c1c feat(computer-use): refresh cua-driver on hermes update + add install --upgrade (#24063)
cua-driver was only installed once on toolset enable: `_run_post_setup` early-returns when the binary is already on PATH, so upstream fixes (e.g. v0.1.6 Safari window-focus fix) never reached existing users without manual reinstall.

Two refresh points now:
- `hermes update` re-runs the upstream installer at the end of the update if cua-driver is on PATH (macOS-only, no-op otherwise). Ties driver freshness to the user-controlled update cadence — no startup latency, no per-launch GitHub API call.
- `hermes computer-use install --upgrade` for manual force-refresh.

The upstream `install.sh` always pulls the latest release, so re-running is the canonical upgrade path. No version-comparison logic needed.

`hermes computer-use status` now shows the installed version, and points at `--upgrade` for refreshing.
2026-05-11 17:10:58 -07:00
Teknium1 97a0e69df0 chore(release): add AUTHOR_MAP entry for ahmedbadr3 2026-05-11 16:51:09 -07:00
Ahmed Badr 05bad7b1e7 fix(dashboard): MiniMax 'Login' button launched Claude OAuth (#22832)
Fixes #22832.

## Root cause

`hermes_cli/web_server.py:start_oauth_login` dispatched OAuth flows by
the catalog's `flow` field rather than provider id:

    if catalog_entry["flow"] == "pkce":
        return _start_anthropic_pkce()

The catalog had two `flow: "pkce"` entries — `anthropic` and
`minimax-oauth` — so clicking "Login" on MiniMax in the dashboard's
Keys tab unconditionally launched the Anthropic/Claude PKCE flow.

## Fix

Three changes in `hermes_cli/web_server.py`:

1. Catalog entry for `minimax-oauth` changed from `flow: "pkce"` to
   `flow: "device_code"`. From a UX perspective MiniMax is a
   verification-URI + user-code flow (open URL, enter code, backend
   polls) — same shape as Nous's device-code flow. The PKCE bit
   (verifier + challenge from `_minimax_pkce_pair`) is a security
   extension that doesn't change the operator experience; the existing
   dashboard modal already renders `device_code` correctly for this UX.

2. New MiniMax branch in `_start_device_code_flow`, mirroring the
   existing Nous branch but calling MiniMax-specific helpers
   (`_minimax_request_user_code`, `_minimax_pkce_pair`). Stashes
   verifier + state in the session for the poller to consume. Handles
   the overloaded `expired_in` field (could be unix-ms timestamp OR
   seconds-from-now duration) the same way `_minimax_poll_token` does.

3. New `_minimax_poller` background thread mirroring `_nous_poller`.
   Calls `_minimax_poll_token` → on success builds the same
   `auth_state` dict the CLI flow (`_minimax_oauth_login`) builds, and
   persists via `_minimax_save_auth_state` so the dashboard path leaves
   the system in the same state as `hermes auth add minimax-oauth`.

Plus a dispatcher tightening to prevent regression: the `pkce` branch
now requires `provider_id == "anthropic"`, so any future PKCE provider
added without a proper start function gets a clean
`400 Unsupported flow` rather than silently launching Anthropic OAuth.
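
A compact way to picture the tightened dispatcher is a lookup keyed on both flow and provider id, so a shared `flow` value can never route one provider to another's login. This is a sketch under assumptions: the function signature, the `starters` table, and the 404 for unknown providers are illustrative, while the 400 `Unsupported flow` response comes from the text above.

```python
def start_oauth_login(provider_id, catalog, starters):
    """Dispatch on (flow, provider_id) rather than on flow alone."""
    entry = catalog.get(provider_id)
    if entry is None:
        return 404, "Unknown provider"  # assumed behaviour for the sketch
    starter = starters.get((entry["flow"], provider_id))
    if starter is None:
        return 400, "Unsupported flow"  # no explicit branch for this provider
    return 200, starter()


catalog = {
    "anthropic": {"flow": "pkce"},
    "minimax-oauth": {"flow": "device_code"},
    "future-provider": {"flow": "pkce"},  # hypothetical, has no starter yet
}
starters = {
    ("pkce", "anthropic"): lambda: "anthropic-pkce-started",
    ("device_code", "minimax-oauth"): lambda: "minimax-device-code-started",
}
```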

## Test

New `tests/hermes_cli/test_web_oauth_dispatch.py`:

- Regression test asserting MiniMax start does NOT return claude.ai
- Sanity test that Anthropic PKCE still works after the dispatcher
  tightening
- Forward-looking test: a hypothetical pkce-flagged provider without
  an explicit branch is rejected cleanly rather than misrouted

## Limitations

- The dashboard MiniMax path defaults to `region="global"`. CN-region
  operators can still use the CLI flow which supports `--region cn`.
  Adding a region toggle to the dashboard UI is a follow-up.
2026-05-11 16:51:09 -07:00
Teknium ea1d0462cf fix(cli): vertical fallback for markdown tables wider than terminal (#23948)
Follow-up to #23863 (CJK table alignment). The realigner was
correctly padding pipes to identical column offsets, but when a
table's natural width exceeded the terminal width in cells, it
produced lines that the terminal soft-wrapped mid-cell, visually
destroying column alignment even though the bytes were perfectly
padded. Users reported this as 'columns are not aligned' on tables
containing one long row alongside several short rows.

Approach mirrors Claude Code's MarkdownTable.tsx narrow-terminal
fallback: when realign_markdown_tables is given an available_width
budget and the rebuilt horizontal table exceeds it, render each body
row as 'Header: value' lines separated by a thin ─ rule. Word-wraps
oversize values at the budget with a 2-space continuation indent.

- agent/markdown_tables.py: realign_markdown_tables(text, available_width=None);
  threshold check at the top of _render_block flips into a new
  _render_vertical fallback. Includes _wrap_to_width with hard-break
  for tokens longer than the budget.
- cli.py: helper _terminal_width_for_streaming() returns
  shutil.get_terminal_size().columns minus _STREAM_PAD and a 2-cell
  safety margin; passed to all three realign call sites
  (_render_final_assistant_content for strip+render Panel paths, and
  the streaming flushers in _emit_stream_text / _flush_stream).
- tests/agent/test_markdown_tables.py: 4 new tests covering the
  overflow-vertical fallback for ASCII + CJK content, the
  'fits → keep horizontal' case, and the long-cell wrap with indent.
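
The vertical fallback can be approximated with the standard library, as in the sketch below. It is not `_render_vertical` itself: `render_vertical` and its signature are assumptions, and `textwrap` counts characters rather than display cells, so this version does not handle double-width CJK text the way the real code has to.

```python
import textwrap


def render_vertical(headers, rows, available_width):
    """Render each body row as 'Header: value' lines separated by a thin rule."""
    rule = "─" * available_width
    lines = []
    for index, row in enumerate(rows):
        if index:
            lines.append(rule)
        for header, value in zip(headers, row):
            # Continuation lines get a 2-space indent; textwrap hard-breaks
            # tokens longer than the width by default.
            lines.extend(
                textwrap.wrap(
                    f"{header}: {value}",
                    width=available_width,
                    subsequent_indent="  ",
                )
            )
    return "\n".join(lines)
```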

Live-verified: with COLUMNS=100, the user's reported 'long row in
ASCII table' case now renders as vertical key-value rows that all fit
the panel; the 6-column CJK comparison table still renders as an
aligned horizontal table because it fits inside 100 cols.
2026-05-11 16:49:13 -07:00
ethernet 825bd50e6b Merge pull request #18036 from NousResearch/fix/bundle-size
ui-tui: bundle with esbuild, drop runtime node_modules
2026-05-11 17:46:19 -04:00
brooklyn! 75b428c852 feat(ui-tui): resolve markdown links to readable page titles (#24013)
* feat(ui-tui): resolve links to readable page titles

Mirror desktop pretty-link behavior in the TUI by resolving HTTP links to page titles with shared caching and safe fetch filters, plus slug-based fallbacks so chat links stay readable even when title fetch fails.

* refactor(ui-tui): tighten link-title fallback handling

Clean up the link-title resolver by hardening in-flight cleanup and clarifying title length limits, while adding focused coverage for HTML entity decoding and markdown-label fallback behavior.

* fix(ui-tui): block private-network targets in title fetches

Prevent automatic link-title resolution from requesting local or private hosts by rejecting RFC1918, link-local, ULA, and intranet-style hostnames before fetch, and add regression coverage for blocked host patterns.
2026-05-11 14:16:31 -07:00
ethernet c6ca11618a refactor(tui): simplify TUI build logic, remove stale staleness checks
The old mtime-tracking staleness machinery (_tui_build_needed,
_hermes_ink_bundle_stale, _find_bundled_tui) tried to avoid rebuilding
by comparing source timestamps to dist/entry.js. This was fragile and
added ~100 lines of code. Replace with three clear paths:

1. HERMES_TUI_DIR set (prebuilt/nix): just node dist/entry.js, no build
2. --dev mode: tsx src/entry.tsx, no build, hot reload
3. Normal: always npm run build (esbuild is ~1s, correctness > caching)

Also error when HERMES_TUI_DIR is set with --dev (footgun: prebuilt
bundle has no source code to hot-reload).
2026-05-11 17:04:34 -04:00
kshitijk4poor 9a63b5f16c chore: add nicoechaniz to AUTHOR_MAP 2026-05-11 13:16:07 -07:00
nicoechaniz e2b713cced fix(model-metadata): skip OpenRouter for known providers, add kimi/moonshot to PROVIDER_TO_MODELS_DEV
Based on PR #23950 by @nicoechaniz.

- Add "kimi" and "moonshot" to PROVIDER_TO_MODELS_DEV → kimi-for-coding
- Gate OpenRouter metadata step behind "if not effective_provider":
  known providers should not be overridden by community-maintained OR data
- Keep the targeted Kimi-family 32k guard as a secondary safety net
  inside the OR gate (for unknown providers with Kimi models)

Co-authored-by: nicoechaniz <nicoechaniz@altermundi.net>
2026-05-11 13:16:07 -07:00
kshitijk4poor 91eef6255e fix: correct context-length resolution for kimi-k2.6 on Ollama Cloud and Kimi Coding
Kimi-k2.6 (which supports 262K context) was incorrectly resolved as 32K,
tripping the 64K minimum-context guard and preventing use of the model on
Ollama Cloud and Kimi Coding / Moonshot providers.

Three fixes in the context-length resolution chain:

1. Ollama Cloud native /api/show query: new _query_ollama_api_show()
   queries the Ollama native API for authoritative GGUF model_info
   context_length.  For hosted Ollama, prefers model_info over num_ctx
   since users can't set their own num_ctx on Cloud.  Added at step 5e
   in get_model_context_length(), before the models.dev fallback.

2. models.dev :cloud/-cloud suffix fallback: lookup_models_dev_context()
   now also tries appending :cloud and -cloud suffixes when the bare
   model name doesn't match.  models.dev stores 'kimi-k2.6:cloud' but
   users and the live API use bare 'kimi-k2.6'.

3. Kimi-family 32K guard: after the OpenRouter metadata step, reject
   exactly 32768 for Kimi-named models (kimi-*, moonshot*) and fall
   through to hardcoded defaults ('kimi': 262144).  OpenRouter reports
   32768 for moonshotai/kimi-k2.6 but the model actually supports 262K.
   Narrow filter — only 32768, only Kimi-family — becomes dead code
   when OpenRouter updates its metadata.

2026-05-11 13:16:07 -07:00
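Fixes 2 and 3 from 91eef6255e could look something like the sketch below. Both function names and the catalog shape (a plain dict of model name to context length) are assumptions for illustration; the real lookup_models_dev_context() and the guard inside get_model_context_length() may be structured differently.

```python
def lookup_context(model, catalog):
    """Try the bare model name, then the ':cloud' and '-cloud' variants."""
    for candidate in (model, f"{model}:cloud", f"{model}-cloud"):
        if candidate in catalog:
            return catalog[candidate]
    return None


def reject_stale_kimi_32k(model, reported):
    """Return None for exactly 32768 on Kimi-family names so the caller falls through.

    Stripping a 'vendor/' prefix is an assumption made for this sketch.
    """
    name = model.lower().split("/")[-1]
    is_kimi = name.startswith("kimi-") or name.startswith("moonshot")
    if is_kimi and reported == 32768:
        return None
    return reported
```

The guard is deliberately narrow: any other value, or any non-Kimi model, passes through unchanged.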
ethernet 3197b4de6d Merge remote-tracking branch 'origin/main' into fix/bundle-size 2026-05-11 16:01:04 -04:00
Siddharth Balyan 271883447e feat: expose HERMES_SESSION_ID to agent tools via ContextVar + env (#23847)
Set HERMES_SESSION_ID using the existing session_context.py ContextVar
system for concurrency safety (multiple gateway sessions in one process
won't cross-talk). Also writes os.environ as fallback for CLI mode.

Touchpoints:
- gateway/session_context.py: Add _SESSION_ID ContextVar + _VAR_MAP entry
- run_agent.py: Set both ContextVar and os.environ at init and on
  context-compression rotation
- tools/environments/local.py: Bridge ContextVars into subprocess env
  in _make_run_env() (ContextVars don't propagate to child processes)
- tests/run_agent/test_session_id_env.py: 3 tests covering env, provided
  ID, and ContextVar paths

execute_code subprocess already passes HERMES_* prefixed vars through
_scrub_child_env (line 82: _SAFE_ENV_PREFIXES includes 'HERMES_').

Primary use case: webhook-triggered agents that need to include a
`--resume <session_id>` takeover command in their output.
2026-05-12 00:16:45 +05:30
kshitij ce0f529cde chore: ruff auto-fix C401, C416, C408, PLR1722 (#23940)
C401:   set(x for x in y) -> {x for x in y}      (set comprehension)
C416:   [(k,v) for k,v in d] -> list(d.items())  (unnecessary listcomp)
C408:   tuple()/dict() -> ()/{}                   (unnecessary collection call)
PLR1722: exit() -> sys.exit()                     (adds import sys where needed)

21 instances fixed, 0 remaining. 19 files, +40/-36.
2026-05-11 11:20:58 -07:00
Teknium 7b76366552 feat(prompt-cache): cross-session 1h prefix cache for Claude on Anthropic / OpenRouter / Nous Portal (#23828)
Cuts input cost for first-turn Claude requests by ~85-90% on subsequent
sessions within an hour. Tools array (~13k tokens for default toolset) +
stable system prefix (~5-8k tokens) get a 1h cache_control marker; the
volatile suffix (memory, USER profile, timestamp, session id) sits in a
separate non-cached block at the end so it doesn't poison the cross-session
prefix when it changes.

Provider gate: Claude on native Anthropic (incl. OAuth subscription),
OpenRouter, and Nous Portal (which proxies to OpenRouter). All other
providers keep today's system_and_3 layout unchanged.

Layout (4 cache_control breakpoints, Anthropic max):
  1. tools[-1]              -> 1h (cross-session)
  2. system content[0]      -> 1h (cross-session, stable prefix)
  3. messages[-2]           -> 5m (within-session rolling)
  4. messages[-1]           -> 5m (within-session rolling)

Within-session rolling shrinks from 3 messages to 2 to free the breakpoint
budget. On Claude with realistic tool loadouts the long-lived tier carries
the bulk of cross-session value anyway.

System prompt is now always assembled cache-friendly: stable identity /
guidance / skills / platform hints first, then session-stable context
files (AGENTS.md, .cursorrules), then per-call volatile content. Old
single-string callers see the same logical content (same join order),
just reordered so volatile lives at the end.

Config knobs (defaults shown):
  prompt_caching:
    cache_ttl: "5m"           # rolling-window TTL (unchanged)
    long_lived_prefix: true    # opt-out switch
    long_lived_ttl: "1h"       # cross-session prefix TTL

Live E2E (tests/agent/test_prompt_caching_live.py, gated on
OPENROUTER_API_KEY) on anthropic/claude-haiku-4.5 with default toolset:
  Call 1 (cold):              cache_write=13,415  cache_read=0
  Call 2 (NEW agent + msg):   cache_write=391     cache_read=13,025
  Cross-session reuse:        97.09%

Implementation:
* agent/prompt_caching.py: new apply_anthropic_cache_control_long_lived()
  + mark_tools_for_long_lived_cache(); existing apply_anthropic_cache_control()
  preserved verbatim for the fallback path.
* agent/anthropic_adapter.py: convert_tools_to_anthropic() now forwards
  cache_control onto each Anthropic-format tool dict.
* run_agent.py: _build_system_prompt_parts() returns the 3-tier dict;
  _build_system_prompt() joins them (backward compatible).
  _supports_long_lived_anthropic_cache() policy added next to the existing
  _anthropic_prompt_cache_policy() (which now also recognises Nous Portal
  Claude — pre-existing gap fixed in passing).
  _build_api_kwargs() resolves tools_for_api once and propagates the
  marker through all four build paths (anthropic_messages, bedrock,
  codex_responses, profile/legacy chat completions).
  Long-lived flag plumbed into the runtime snapshot/restore + model-switch
  + fallback-promotion paths.

Tests:
* tests/agent/test_prompt_caching.py: +8 tests (TestMarkToolsForLongLivedCache,
  TestApplyAnthropicCacheControlLongLived).
* tests/run_agent/test_anthropic_prompt_cache_policy.py: +9 tests
  (TestSupportsLongLivedAnthropicCache matrix across 8 endpoint classes
  + a fallback-target case).
* tests/agent/test_prompt_caching_live.py: new live E2E (skipif when
  OPENROUTER_API_KEY is unset; runs outside the hermetic suite).
* Targeted suites: 327/327 pass (caching/adapter/policy/builder).
* tests/agent/ + tests/run_agent/: 3992 pass, 17 skip, 1 pre-existing
  flake (test_async_httpx_del_neuter::test_same_key_replaces_stale_loop_entry,
  verified failing on pristine origin/main).
2026-05-11 11:14:56 -07:00
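The four-breakpoint layout in 7b76366552 can be sketched as follows. This is an illustration under assumptions, not apply_anthropic_cache_control_long_lived() itself: the function name is hypothetical, and it assumes system content and message content are already lists of content-block dicts.

```python
import copy


def apply_long_lived_layout(tools, system_blocks, messages,
                            long_ttl="1h", rolling_ttl="5m"):
    """Place up to four cache_control markers; inputs are left unmodified."""
    tools, system_blocks, messages = (
        copy.deepcopy(x) for x in (tools, system_blocks, messages)
    )

    def marker(ttl):
        return {"type": "ephemeral", "ttl": ttl}

    if tools:
        tools[-1]["cache_control"] = marker(long_ttl)          # 1. tools[-1]
    if system_blocks:
        system_blocks[0]["cache_control"] = marker(long_ttl)   # 2. stable prefix
    for message in messages[-2:]:                              # 3 + 4. rolling window
        content = message.get("content")
        if isinstance(content, list) and content:
            content[-1]["cache_control"] = marker(rolling_ttl)
    return tools, system_blocks, messages
```

Keeping the volatile system block unmarked is what lets it change between sessions without invalidating the long-lived prefix ahead of it.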
kshitij 2ec8d2b42f chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)
Replace tuple literals with set literals in all literal-tuple
membership tests. Set lookup is O(1) vs O(n) for a tuple, so this is a
consistent micro-optimization across the codebase.

608 instances fixed via `ruff --fix --unsafe-fixes`, 0 remaining.
133 files, +626/-626 (net zero).
2026-05-11 11:13:25 -07:00
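For the PLR6201 rewrite in 2ec8d2b42f, the two forms agree for hashable values but differ when the tested value is unhashable, which is one likely reason the fix needs `--unsafe-fixes`. The function names below are purely illustrative.

```python
def in_tuple(value):
    # Tuple membership compares element by element and never hashes.
    return value in ("read", "write")


def in_set(value):
    # Set membership hashes the tested value first.
    return value in {"read", "write"}


def raises_type_error(func, value):
    """Helper for the demonstration: report whether func(value) raises TypeError."""
    try:
        func(value)
    except TypeError:
        return True
    return False
```

With a list as the tested value, `in_tuple(["read"])` returns `False` while `in_set(["read"])` raises `TypeError`, so the rewrite is only behavior-preserving where the left-hand side is always hashable.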
Teknium1 8c11710314 chore(release): add AUTHOR_MAP entry for wuli666 2026-05-11 11:13:20 -07:00
wuli666 111b859e49 fix(auxiliary): evict async wrappers on poisoned client (follow-up to #23482)
#23482 fixed cache poisoning in the sync path: when a Codex auxiliary
timeout closes the underlying OpenAI client, _evict_cached_client_instance
walks CodexAuxiliaryClient wrappers via their _real_client attribute and
drops the cache entry so the next aux call rebuilds.

The cache key includes async_mode (see _client_cache_key), so the sync and
async clients for the same provider live in two distinct entries pointing
at the same underlying transport. The fix walked the sync wrapper's
_real_client correctly but the async wrappers
(AsyncCodexAuxiliaryClient, AsyncAnthropicAuxiliaryClient,
AsyncGeminiNativeClient) never exposed _real_client at all, so the async
entry survived eviction and kept handing out the poisoned client.

Effect on async aux callers: one timeout now poisons every subsequent
async aux call (compression, vision, session_search, title_generation)
with 'Connection error' until gateway restart -- even while the sync
route recovered as designed in #23482.

Mirror the sync wrapper's _real_client onto each async wrapper so the
existing eviction helper finds them. Three changes, one per wrapper:

- AsyncCodexAuxiliaryClient: self._real_client = sync_wrapper._real_client
  (the underlying OpenAI client)
- AsyncAnthropicAuxiliaryClient: same shape
- AsyncGeminiNativeClient: self._real_client = sync_client (Gemini's
  native facade is itself the leaf; no OpenAI client beneath it)

Update _evict_cached_client_instance docstring to reflect that it now
covers both sync and async wrappers via the same attribute walk.

Test: TestAuxiliaryClientPoisonedCacheEviction.test_evict_cached_client_instance_walks_async_wrapper
seeds both sync and async cache entries pointing at the same leaf and
asserts both are dropped on a single eviction call. Verified the test
fails without the wrapper changes ("async cache entry survived
eviction -- wrapper is missing _real_client") and passes with them.

Refs #23482, #23432
2026-05-11 11:13:20 -07:00
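The attribute walk that 111b859e49 relies on can be modelled in a few lines. The classes and the cache layout below are simplified stand-ins (assumptions for illustration); only the `_real_client` attribute name and the idea of a cache key that includes the async mode come from the commit message.

```python
class Leaf:
    """Stand-in for the underlying client whose transport was closed."""


class SyncWrapper:
    def __init__(self, real_client):
        self._real_client = real_client


class AsyncWrapper:
    def __init__(self, sync_wrapper):
        # Mirror the sync wrapper's leaf so the eviction walk can find it.
        self._real_client = sync_wrapper._real_client


def evict_cached_client_instance(cache, poisoned):
    """Drop every cache entry whose wrapper chain reaches `poisoned`."""
    def reaches(client):
        seen = set()
        while client is not None and id(client) not in seen:
            if client is poisoned:
                return True
            seen.add(id(client))
            client = getattr(client, "_real_client", None)
        return False

    stale = [key for key, client in cache.items() if reaches(client)]
    for key in stale:
        del cache[key]
    return len(stale)
```

Without the `_real_client` attribute on the async wrapper, the walk would stop at the wrapper itself and the async entry would survive, which is the failure the commit describes.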
Teknium 1d00716754 fix(cli,tui): align CJK / wide-char markdown tables (#23863)
CJK and emoji glyphs render as two terminal cells but JS String#length
and the model's own padding count them as one, so any markdown table
with Chinese / Japanese / Korean cells drifts right per row when a
real terminal renders it. Both surfaces fix this with a display-cell
width measurement (wcswidth on the Python side, stringWidth on the
TUI side).

Changes:
- agent/markdown_tables.py: new helper. realign_markdown_tables(text)
  detects markdown table blocks (header + |---| divider) and
  rewrites the row padding using wcwidth.wcswidth so every pipe and
  dash lines up across rows. No-op on text without tables.
- cli.py: hook the helper into _render_final_assistant_content for
  strip / render modes (raw passes through untouched), and into the
  streaming line emitter so live token-by-token rendering also
  produces aligned tables. A small two-buffer state machine in
  _emit_stream_text holds table rows until the block ends, then
  flushes them through the realigner so all rows pad to a single
  per-column width.
- ui-tui/src/components/markdown.tsx: renderTable now uses
  stringWidth (Bun.stringWidth fast path + East-Asian-width-aware
  fallback, already memoised in @hermes/ink) instead of UTF-16
  String#length for both column-width measurement and per-cell
  padding. Drops the comment that documented the bug as a deliberate
  limitation.

Validation:
- New tests/agent/test_markdown_tables.py (11): every rebuilt block
  shares pipe column offsets across rows for pure CJK, mixed
  CJK+emoji, ragged-row, and multi-table inputs.
- Updated tests/cli/test_cli_markdown_rendering.py: the existing
  strip-mode test asserted exact whitespace; rewritten to assert the
  alignment contract (cell content survives + every rendered row
  shares pipe offsets).
- New ui-tui markdown.test.ts case (1): rendered column-2 start
  offset is identical for the header + every body row, including
  the CJK row that drifted before the fix.
- Live: hermes chat -q with the user-reported screenshot prompt now
  produces a perfectly aligned table on the wire (header, divider,
  4 body rows including '通义千问', all pipes at identical columns).
2026-05-11 11:13:06 -07:00
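The display-cell measurement behind 1d00716754 can be approximated with the standard library. Note the substitution: the commit uses wcwidth.wcswidth, whereas this sketch uses unicodedata.east_asian_width so that it has no third-party dependency; it will disagree with wcswidth on some code points. Both function names are hypothetical.

```python
import unicodedata


def display_width(text):
    """Approximate the number of terminal cells `text` occupies."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks occupy no cell of their own
        # Wide (W) and Fullwidth (F) characters take two cells.
        width += 2 if unicodedata.east_asian_width(ch) in {"W", "F"} else 1
    return width


def pad_cell(text, target):
    """Right-pad `text` with spaces until it occupies `target` cells."""
    return text + " " * max(0, target - display_width(text))
```

Padding by display width rather than `len()` is what keeps the pipe characters at identical column offsets when a row mixes ASCII and CJK cells.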
kshitij 657874460f chore: ruff auto-fixes — collapsible-else-if, if-stmt-min-max, dict.fromkeys (#23926)
PLR5501 (collapsible-else-if): 28 instances — else: if: → elif:
PLR1730 (if-stmt-min-max):   15 instances — if x<y: x=y → x=max(x,y)
C420   (dict.fromkeys):       2 instances — dictcomp → dict.fromkeys
PLR1704 (redefined-argument): 1 instance — reason → err_msg (shadow fix)
C414   (unnecessary-list):    1 instance — sorted(list(x)) → sorted(x)

28 files, -44 net lines. All mechanical, zero logic changes.
17,211 tests pass, zero regressions.
2026-05-11 11:03:29 -07:00
Teknium 8e2eb4b511 fix(/model): surface Nous Portal models from remote catalog manifest (#23912)
The /model picker for Nous Portal users was returning the in-repo
_PROVIDER_MODELS["nous"] snapshot — which only updates on Hermes
releases — instead of the remote manifest published at
https://hermes-agent.nousresearch.com/docs/api/model-catalog.json.

OpenRouter already pulled from the manifest via fetch_openrouter_models;
"nous" was the only curated provider where the existing manifest
plumbing (get_curated_nous_model_ids → get_curated_nous_models) was
defined but not wired into the picker pipeline. Switch the curated
build in list_authenticated_providers to use it, with the same
graceful fallback to the in-repo snapshot when the manifest is
unreachable.

Test: tests/hermes_cli/test_model_catalog.py exercises the picker with
a patched manifest and asserts the manifest's nous list reaches
list_picker_providers. The fall-back-to-static path was already covered
by test_curated_nous_ids_falls_back_to_hardcoded_on_empty_catalog.
2026-05-11 10:15:30 -07:00
Teknium1 cc9e788c14 fix(cli): defensive _slash_confirm_state access + AUTHOR_MAP
- getattr(self, '_slash_confirm_state', None) at the two read sites that
  trip object.__new__(HermesCLI) test fixtures (test_cli_external_editor,
  test_cli_skin_integration)
- _build_tui_layout_children: make slash_confirm_widget keyword-only with
  default None to avoid breaking subclassing extension hook for wrapper
  CLIs (test_cli_extension_hooks)
- AUTHOR_MAP entry for zhengyn0001

Follow-up to the salvaged commit ca1d4375a.
2026-05-11 10:02:03 -07:00
zhengyuna 054f568578 fix: use TUI modal for slash confirmations 2026-05-11 10:02:03 -07:00
rob-maron e155f2aca9 rebuild model catalog 2026-05-11 09:54:31 -07:00
Teknium1 283381b1ce fix(dashboard): validate dist exists when --skip-build is set
Follow-up to PR #23824. Adds two correctness fixes on top of the
contributor's salvaged commit:

1. Stale-dist fallback no longer gated on `fatal=False`. `cmd_dashboard`
   passes `fatal=True` and is the primary scenario this fallback is for
   (issue #23817 — Windows Scheduled Task at logon). The previous gate
   meant the fallback never fired in the case it was designed for.

2. `--skip-build` now verifies the dist actually exists before starting
   the server. Without this, a misconfigured pre-build would launch the
   dashboard pointing at a missing dist and silently serve 404s. We now
   exit 1 with a clear "pre-build first: cd web && npm run build"
   message, and on success print which dist directory is being used.

Verified end-to-end on Linux:
- build fails + stale dist (fatal=True)  -> fallback fires
- build fails + no dist (fatal=True)     -> exit 1 with stderr surfaced
- build fails + stale dist (fatal=False) -> fallback fires
- --skip-build + missing dist            -> exit 1 with clear guidance
- --skip-build + valid dist              -> 'Skipping web UI build...'
2026-05-11 09:27:05 -07:00
ygd58 7085f4e238 fix(dashboard): fallback to stale dist, retry build, add --skip-build flag
Four improvements for non-interactive contexts (Windows Scheduled
Tasks, CI/CD) where the web UI build may fail (issue #23817):

1. Retry build once after 3s — covers boot-time races (antivirus
   scanning Node.js, npm cache not ready, transient disk I/O)
2. Fall back to existing dist when build fails (non-fatal mode) —
   a stale UI is far better than no UI at all
3. Add --skip-build flag — lets callers pre-build in their wrapper
   script and start the dashboard without internal build attempt
4. Surface npm stderr in build failure output for easier debugging

Fixes #23817
2026-05-11 09:27:05 -07:00
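The retry-then-fallback flow from 7085f4e238 might be sketched like this. Everything here is hypothetical scaffolding: the build step and the dist check are injected as callables so the control flow can be shown without invoking npm, and the return values are illustrative labels.

```python
import time


def build_or_fallback(build, dist_exists, retry_delay=3.0, sleep=time.sleep):
    """Try the build twice, then fall back to an existing dist if there is one.

    `build` returns True on success; `dist_exists` reports whether a
    previously built dist directory is present.
    """
    for attempt in range(2):
        if build():
            return "built"
        if attempt == 0:
            sleep(retry_delay)  # give boot-time races a moment to settle
    return "stale" if dist_exists() else "failed"
```

A caller would treat "stale" as a warning and "failed" as the case where the dashboard cannot start.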
Teknium1 88a2ce4ae5 chore: AUTHOR_MAP entry for VinceZcrikl noreply (#23647) 2026-05-11 08:14:03 -07:00
文森.Z a479ec01ed fix: make web UI build output decoding robust on Windows
On Windows systems using a Chinese GBK locale, `hermes update` could misreport the Web UI build as failed even when `npm run build` actually succeeded. The failure was caused by Python decoding captured npm output with the process locale inside a background subprocess reader thread. When npm emitted bytes such as `0x85`, decoding under GBK raised `UnicodeDecodeError`, and Hermes then surfaced a misleading "Web UI build failed" warning.

This change makes the npm install/npm ci path and the Web UI build step decode captured output explicitly as UTF-8 with `errors="replace"`. That keeps unexpected bytes from crashing output collection, preserves successful builds, and prevents false negatives during update on Windows.

The patch also adds regression tests that verify these subprocess calls always use explicit UTF-8 decoding with replacement semantics.
2026-05-11 08:14:03 -07:00
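The decoding change in a479ec01ed comes down to passing an explicit encoding and error handler to subprocess. A minimal sketch, where `run_captured` is a hypothetical wrapper name rather than a function from the patch:

```python
import subprocess
import sys


def run_captured(cmd):
    """Run `cmd` and decode its output as UTF-8, replacing undecodable bytes."""
    return subprocess.run(
        cmd,
        capture_output=True,
        text=True,
        encoding="utf-8",   # do not depend on the process locale (e.g. GBK)
        errors="replace",   # a stray byte such as 0x85 becomes U+FFFD
    )
```

With these arguments, a child process that emits an invalid byte still yields a usable string and its real exit code, instead of raising UnicodeDecodeError in the reader.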
Teknium 7026af4e23 fix(agent): catch ChatGPT-account Codex data-URL rejection so images are stripped instead of cascading to compression (#23602)
When the user's main provider is openai-codex on the ChatGPT-account
backend (https://chatgpt.com/backend-api/codex), sending a native image
attachment encodes it as data:image/...base64,... in the input_image
field. The OpenAI Responses API on the public endpoint accepts that, but
the ChatGPT-account variant rejects it with HTTP 400:

  Invalid 'input[N].content[K].image_url'. Expected a valid URL, but got
  a value with an invalid format.

Hermes' image-rejection phrase list didn't include this wording, so the
error escaped the strip-and-retry branch and fell through to the generic
recovery path: model fallback → context-too-large → compression cascade
→ auxiliary OpenRouter 402 spam (issue #23570).

Add a NARROW phrase keyed on the field-path apostrophe used by the Codex
Responses error format: "image_url'. expected". This matches the actual
error format without false-tripping on generic 'Expected a valid URL'
errors from unrelated tools (webhooks, redirect_uri, etc.). Once matched,
the existing branch strips images from history, sets _vision_supported=
False for the session, and retries text-only.

Refs #23570 (1 of 3 image-replay improvements; persistence rewrite to
store image PATHS instead of inlined base64 is a separate follow-up)
2026-05-11 07:37:22 -07:00
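The narrow phrase match from 7026af4e23 can be shown in isolation. The phrase string is taken from the commit message; the function and constant names are hypothetical, and the real check lives alongside other phrases in the image-rejection list.

```python
# Keyed on the closing apostrophe of the field path in the Codex error format.
_CODEX_IMAGE_REJECTION_PHRASE = "image_url'. expected"


def is_codex_image_rejection(error_message):
    """Return True only for the Codex Responses data-URL rejection wording."""
    return _CODEX_IMAGE_REJECTION_PHRASE in error_message.lower()
```

Matching on the apostrophe and period together is what keeps a generic "Expected a valid URL" error from an unrelated tool from triggering the strip-and-retry branch.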
Teknium 3e7145e0bb revert: roll back /goal checklist + /subgoal feature stack (#23813)
* Revert "fix(goals): force judge to use tool calls instead of JSON-text replies (#23547)"

This reverts commit a63a2b7c78.

* Revert "fix(goals): forward standing /goal state on auto-compression session rotation (#23530)"

This reverts commit 4a080b1d5a.

* Revert "feat(goals): /goal checklist + /subgoal user controls (#23456)"

This reverts commit 404640a2b7.
2026-05-11 07:06:27 -07:00
kshitijk4poor 1d4a4997b1 chore: AUTHOR_MAP entries for sudo-hardening salvage contributors
- openclaw@agent.local → 29206394 (PR #22194)
- freedemon@gmail.com  → fr33d3m0n (PR #21128)
2026-05-11 06:56:30 -07:00
fr33d3m0n 976d8e27ad fix(approval): catch sudo with stdin/askpass/shell privilege flags
Adds the only #17873 category not covered by the in-flight PRs #17962
(briandevans, reverse shell + download-execute) and #7993 (SHL0MS,
credential reads + curl/wget exfiltration): sudo invocations that an
LLM-driven agent can drive without TTY interaction.

The agent has no TTY, so the sudo forms that succeed without human
involvement are those reading the password from stdin (`-S` / `--stdin`)
or via an askpass helper (`-A` / `--askpass`). The shell-launch (`-s`)
and list-privileges (`-a`) flags are also gated since they are
privilege-relevant invocations the agent can chain after acquiring the
password (e.g. read SUDO_PASSWORD from .env -> sudo -S -s -> root shell).
Plain `sudo cmd` (no flag) is TTY-bound and excluded.

Two patterns:

  1. Direct flag: `\bsudo\b[^;|&\n]*?\s+(?:-s\b|--stdin\b|-a\b|--askpass\b)`
     The lazy `[^;|&\n]*?` consumes flag-arguments without spanning
     command separators, so `sudo -u root -S whoami` matches (a textbook
     offensive form that a strict `(?:\s+-[^\s]+)*` "leading flags only"
     pattern would have missed because `root` is a flag-value not a flag).

  2. Combined short flags: `\bsudo\b[^;|&\n]*?\s+-[a-z]*[sa][a-z]*\b`
     Catches packed forms like `sudo -nS id` where multiple flags share
     a single `-X` token.

`_normalize_command_for_detection` lowercases input before pattern
matching (tools/approval.py:340), so case variants of S/s and A/a
collapse — both letter-pairs are gated since each is a privilege-
relevant invocation.

Tests: 21 new cases in TestDetectSudoStdin (12 positive covering all
flag-order permutations including herestring source and printf-piped
forms; 9 negative including TTY-bound `sudo whoami`, interactive
`sudo -i`, env-var reference `$SUDO_USER`, doc lookup `man sudo`,
package install, and the `pseudosudo` word-boundary edge case).

Empirical coverage: 11/11 attacks matched, 0/10 false positives.

Refs: #17873 category 4. Adjacent: #17962 (reverse shell + download-
execute), #7993 (credential reads + curl/wget exfiltration).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 06:56:30 -07:00
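The two patterns quoted in 976d8e27ad can be exercised directly. The regexes below are copied from the commit message; the wrapper function is hypothetical, and lowercasing the input stands in for what the commit says `_normalize_command_for_detection` does.

```python
import re

# Pattern 1: a standalone privilege flag anywhere after `sudo` in the same command.
_DIRECT_FLAG = re.compile(
    r"\bsudo\b[^;|&\n]*?\s+(?:-s\b|--stdin\b|-a\b|--askpass\b)"
)
# Pattern 2: packed short flags such as `-nS` sharing one token.
_COMBINED_FLAGS = re.compile(r"\bsudo\b[^;|&\n]*?\s+-[a-z]*[sa][a-z]*\b")


def flags_noninteractive_sudo(command):
    """Return True if the lowercased command matches either pattern."""
    normalized = command.lower()
    return bool(
        _DIRECT_FLAG.search(normalized) or _COMBINED_FLAGS.search(normalized)
    )
```

For example, `sudo -u root -S whoami` and `sudo -nS id` are flagged, while `sudo whoami`, `sudo -i`, and `pseudosudo -S id` are not.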
OpenClaw Agent 9520a1ccdf fix(terminal): block sudo -S password guessing when SUDO_PASSWORD is not set
Fixes #9590: Block explicit sudo -S (stdin password mode) commands
when the SUDO_PASSWORD environment variable is not configured.

The attack vector: the LLM constructs 'echo guessedpass | sudo -S cmd'
to brute-force sudo passwords, then iterates based on sudo's error
output ('Sorry, try again').  The existing _transform_sudo_command only
injects -S when SUDO_PASSWORD exists; without it, the LLM's explicit
sudo -S must be treated as a guessing attempt.

Changes:
- Add _check_sudo_stdin_guard() in approval.py: detects sudo -S when
  SUDO_PASSWORD is absent, anchored to command-start positions
  (^ ; && || | etc.) to avoid false positives on literal text
- Integrate into check_all_command_guards() above yolo/mode=off so
  the block is unconditional (like the hardline floor)
- Add 6 tests covering: detection, allow-list, SUDO_PASSWORD bypass,
  integration with check_all_command_guards, yolo non-bypass,
  container backend bypass
2026-05-11 06:56:30 -07:00
kshitijk4poor 494824fb11 chore: remove unused sentinel in test_send_message_tool 2026-05-11 06:44:58 -07:00
kshitijk4poor 5712483487 fix: guard resolve_profile_env against missing profile dirs
The _default_spawn HERMES_HOME injection (PR #23356) calls
resolve_profile_env which raises FileNotFoundError when the profile
dir doesn't exist. In production the profile always exists (workers are
only dispatched for live profiles), but tests with isolated HERMES_HOME
never create profile dirs. Catch FileNotFoundError and fall through —
HERMES_PROFILE is still set below, so the worker CLI resolves the
profile at startup.
2026-05-11 06:44:58 -07:00
kshitijk4poor 7087702210 chore: add salvage contributors to AUTHOR_MAP
For PRs #23206 (Frowtek), #23252 (Sylw3ster), #23358 (dmnkhorvath),
#23659 (smwbev), and #23356 (TurgutKural) — all part of the kanban
bug-fix batch salvage.
2026-05-11 06:44:58 -07:00
Ninso112 a1854ac07c fix(kanban): treat archived parent tasks as terminal for dependency resolution
When a parent task is archived, dependent child tasks were stuck in
todo forever because recompute_ready and claim_task only checked for
status == 'done'. Now both functions also treat 'archived' as a
terminal status, allowing children to proceed when their parent is
archived.

Fixes #23180.
2026-05-11 06:44:58 -07:00
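The dependency rule changed in a1854ac07c reduces to a membership test against a set of terminal statuses. The task representation below (dicts with "status" and "parents" keys) and the function name are assumptions for illustration; the real recompute_ready and claim_task operate on the kanban store.

```python
# 'archived' joins 'done' as a status that unblocks dependent tasks.
TERMINAL_STATUSES = {"done", "archived"}


def is_ready(task, tasks_by_id):
    """A task is ready when every parent has reached a terminal status."""
    return all(
        tasks_by_id[parent_id]["status"] in TERMINAL_STATUSES
        for parent_id in task["parents"]
    )
```

Before the fix, the equivalent check compared against 'done' only, so a child of an archived parent could never become ready.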
Evgenii 27cfe72543 fix(kanban): use localized column label in select-all aria label 2026-05-11 06:44:58 -07:00
Dominikh 379e7dd014 test(send_message): cover _check_send_message gating paths
Adds a TestCheckSendMessage class with 7 focused tests pinning the
four passing conditions and the failure modes:

  - HERMES_KANBAN_TASK grants access (the new branch)
  - HERMES_KANBAN_TASK short-circuits before consulting
    session_context or gateway.status (so workers don't depend on
    those import paths being healthy)
  - HERMES_SESSION_PLATFORM=telegram grants access
  - HERMES_SESSION_PLATFORM=local falls through to gateway check
  - is_gateway_running()=True grants access
  - All signals absent → False
  - gateway.status ImportError is swallowed → False

Pinning the short-circuit (test #2) is the load-bearing one — it
documents the contract that worker-side availability cannot regress
to depending on gateway-side state lookups.
2026-05-11 06:44:58 -07:00
Dominikh 8ac998cb0c fix(send_message): allow kanban workers to call send_message
The kanban dispatcher sets HERMES_KANBAN_TASK on every spawned worker
but launches it with the assignee profile's HERMES_HOME (e.g.
~/.hermes/profiles/<name>/), which has no gateway.pid file. The
existing _check_send_message therefore returned False from the
is_gateway_running() fallback, even though the parent gateway is
alive and reachable.

Net effect: workers could call kanban_* tools (gated on
HERMES_KANBAN_TASK in _check_kanban_mode) but not send_message. This
breaks the natural pattern of "worker does the job, calls
send_message to deliver rich content to the originating chat, then
calls kanban_complete with a one-line summary" because the kanban
notifier's payload_summary is hard-truncated to the first line
(~200 chars) at gateway/run.py:3963 — anything richer has to ship
via send_message.

Honoring HERMES_KANBAN_TASK in _check_send_message — symmetric with
_check_kanban_mode in kanban_tools.py:42 — closes the gap. No new
state, no new env var, no profile-config changes required.
2026-05-11 06:44:58 -07:00
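The gating order described in 8ac998cb0c and pinned by the tests in 379e7dd014 can be sketched as below. This is a simplified model, not the real `_check_send_message`: the environment and the gateway probe are passed in as parameters here, and the handling of the platform value is inferred from the test descriptions.

```python
def check_send_message(env, is_gateway_running):
    """Decide whether send_message should be available.

    `env` is a mapping of environment variables; `is_gateway_running` is a
    zero-argument callable standing in for the gateway.status lookup.
    """
    # Kanban workers short-circuit before any gateway-side lookup.
    if env.get("HERMES_KANBAN_TASK"):
        return True
    platform = env.get("HERMES_SESSION_PLATFORM", "")
    if platform and platform != "local":
        return True
    try:
        return bool(is_gateway_running())
    except ImportError:
        return False  # a missing gateway.status module means no access
```

Placing the HERMES_KANBAN_TASK check first is the contract the short-circuit test protects: worker availability does not depend on the gateway lookup succeeding.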
TurgutKural 5af315c4cc fix(kanban): inject HERMES_HOME into worker subprocess env
Default spawn did not propagate HERMES_HOME when forking kanban workers.
The worker's env is copied from the parent via dict(os.environ), so
HERMES_HOME is absent. When the child then starts hermes -p <profile>,
the CLI's _apply_profile_override() runs before hermes_constants is
imported and get_hermes_home() falls back to ~/.hermes (the default
profile root), silently ignoring the profile's config.yaml.  Profile-
scoped fallback_providers, toolsets, and agent settings are therefore
never applied to kanban workers.

The fix injects HERMES_HOME into the worker's env using
resolve_profile_env(profile_arg) so the child reads the correct profile
directory instead of the default root.
2026-05-11 06:44:58 -07:00
Sylw3ster 641e40c4bd fix(kanban): restore HERMES_KANBAN_BOARD after scoped slash override 2026-05-11 06:44:58 -07:00
liuhao1024 2b3bf17dfa fix(kanban): call kanban_block on iteration-budget exhaustion to prevent protocol violation
When a kanban worker subprocess hits the iteration budget, the agent
loop strips tools and asks the model for a summary.  The model cannot
call kanban_block itself at that point, so the process exits rc=0
without calling kanban_complete or kanban_block — a protocol violation
that the dispatcher detects as a fatal error, giving up after 1 failure
and stranding downstream tasks.

Fix: after _handle_max_iterations() returns, check HERMES_KANBAN_TASK
and call kanban_block with a reason describing the exhaustion.  The
dispatcher then sees a clean block transition instead of a protocol
violation, and the task can be retried or escalated by a human.

Fixes #23216 ([Bug] kanban-worker exits cleanly (rc=0) on iteration-budget
exhaustion without calling kanban_complete or kanban_block)
2026-05-11 06:44:58 -07:00
Frowtek f6d4f3c37d fix(kanban): route gateway create auto-subscribe to explicit board 2026-05-11 06:44:58 -07:00
Siddharth Balyan 64145a1996 fix(nix): replace chown -R with targeted find in container entrypoint (#23633)
The container entrypoint ran `chown -R` on $HERMES_HOME every start.
`chown` strips the setgid bit (kernel security behavior), destroying
the 2770 permissions the NixOS activation script sets for group access
by hostUsers. This caused PermissionError for interactive CLI users
even though they were in the hermes group.

Replace with `find ... ! -user $UID -exec chown` which only touches
files with wrong ownership, leaving correctly-owned directories and
their permission bits intact.

Affects: container.enable + container.hostUsers + addToSystemPackages

Related: #19795, #19788, #9383
2026-05-11 12:59:57 +05:30
Siddharth Balyan 5606258855 feat(nix): add extraDependencyGroups for sealed venv extras (#21817)
Expose the dependency-groups parameter from python.nix through
hermes-agent.nix and the NixOS module, allowing users to opt into
pyproject.toml optional extras (e.g. hindsight, voice, matrix) that
are resolved by uv inside the sealed venv.

Unlike extraPythonPackages (which appends to PYTHONPATH and requires
collision checking), extraDependencyGroups resolves the full dependency
graph in a single uv pass — no PYTHONPATH patching, no version
conflicts, no collision risk.

When to use which:
- extraDependencyGroups: enable a pyproject.toml optional extra
- extraPythonPackages: add an external Python plugin not in pyproject.toml

Usage:
  services.hermes-agent.extraDependencyGroups = [ "hindsight" ];

Or via overlay:
  pkgs.hermes-agent.override { extraDependencyGroups = [ "hindsight" ]; }

Refs: #8873, #9194
2026-05-11 12:23:48 +05:30
Siddharth Balyan d992fd9aaf feat(deps): add hindsight-client as optional dependency (#21818)
Declares hindsight-client as an optional dependency group [hindsight]
in pyproject.toml. This allows build-time inclusion for environments
where runtime pip install is not possible (NixOS sealed venvs, Docker,
Kubernetes).

Not included in [all] — memory providers are plugins and should be
opted into explicitly.

Install via:
  uv sync --extra hindsight
  pip install hermes-agent[hindsight]

NixOS (with extraDependencyGroups):
  services.hermes-agent.extraDependencyGroups = [ "hindsight" ];

Closes #8873
2026-05-11 12:22:02 +05:30
Mibayy ebf2ea584a feat(terminal,cli): docker_extra_args + display.timestamps
Two independent opt-in QoL toggles, both off by default.

terminal.docker_extra_args:
- List of extra flags appended verbatim to docker run after security
  defaults. Useful for adding capabilities (e.g. --cap-add SETUID) or
  other docker run options not exposed by existing config keys.
- Non-string entries are logged and skipped.
- Also available via TERMINAL_DOCKER_EXTRA_ARGS='[...]' env var.

display.timestamps:
- Appends [HH:MM] to user input bullet and the assistant response box
  header. Single hub in _format_submitted_user_message_preview()
  covers both single-line and multi-line user previews; assistant
  response label gets the timestamp at box-open time.

Closes #1569 (timestamps).

Co-authored-by: Mibayy <Mibayy@users.noreply.github.com>
2026-05-10 22:43:39 -07:00
Teknium 228b7d27bd fix(auxiliary): cache 402'd providers as unhealthy with TTL to stop per-call retry storms (#23597)
When an auxiliary provider returns HTTP 402 (credit / payment), every
subsequent compression / title-gen / session-search / vision call still
re-tried it as the FIRST entry in the chain — burning ~1 RTT to hit 402
again, then falling back. On a long Discord/LCM session that meant dozens
of doomed 402s per minute (issue #23570).

Add a per-process unhealthy-provider cache with a 10 min TTL. When any
caller observes a payment error against a provider, the label is marked
unhealthy and skipped by:
  * _resolve_auto Step-1 (main provider use-as-aux path)
  * _resolve_auto Step-2 (aggregator/fallback chain)
  * _try_payment_fallback (used by call_llm/acall_llm on first 402)

Skip-logs are throttled to once per minute per label so a bursty session
doesn't spam agent.log. Entries auto-expire so a topped-up account
recovers without manual intervention. The cache is in-process only by
design — multi-profile users with different keys per profile must each
hit the 402 once.

Refs #23570
2026-05-10 22:43:14 -07:00
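A minimal sketch of the TTL cache described in 228b7d27bd, assuming a per-process dict keyed by provider label. The class and method names are hypothetical and the clock is injectable only to make the sketch testable; the 600 s TTL and 60 s log throttle come from the commit message.

```python
import time


class UnhealthyProviderCache:
    """Hypothetical in-process cache of providers that returned HTTP 402."""

    def __init__(self, ttl=600.0, log_interval=60.0, clock=time.monotonic):
        self._ttl = ttl
        self._log_interval = log_interval
        self._clock = clock
        self._expires = {}      # label -> time at which the entry expires
        self._last_logged = {}  # label -> time of the last skip log

    def mark_unhealthy(self, label):
        self._expires[label] = self._clock() + self._ttl

    def is_unhealthy(self, label):
        expiry = self._expires.get(label)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            # Entry expired: a topped-up account recovers automatically.
            del self._expires[label]
            self._last_logged.pop(label, None)
            return False
        return True

    def should_log_skip(self, label):
        """Return True at most once per log_interval for a given label."""
        now = self._clock()
        last = self._last_logged.get(label)
        if last is not None and now - last < self._log_interval:
            return False
        self._last_logged[label] = now
        return True
```

A resolver would call `is_unhealthy(label)` before trying a provider and `mark_unhealthy(label)` when it observes a payment error.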
0xbyt4 ace1c4ea8c fix(discord): typing indicator task not cleaned up after API error
When the Discord typing API call fails (rate limit, network error, 403),
_typing_loop returns early but the stale task remains in _typing_tasks.
Subsequent send_typing calls see the stale entry and skip, leaving no
typing indicator for the rest of the agent invocation.

Add finally block to _typing_loop to always remove the task from
_typing_tasks on exit, whether from cancellation, error, or normal
completion. This allows send_typing to create a fresh task.

3 new tests in test_discord_send.py:
- Task removed after API error
- Typing restartable after failure
- stop_typing cleans up
2026-05-10 22:41:26 -07:00
teknium1 0458d99f22 chore(release): AUTHOR_MAP entry for Mibayy clawhub email 2026-05-10 22:37:42 -07:00
teknium1 9526040700 chore(skills/stocks): tighten SKILL.md to modern format 2026-05-10 22:37:42 -07:00
teknium1 2ea957fc41 chore(skills/stocks): relocate to optional-skills/finance/stocks/ 2026-05-10 22:37:42 -07:00
Mibayy 896a7ce261 feat: add stocks & finance skill (Yahoo Finance, no API key)
5 commands: quote, search, history, compare, crypto
Zero dependencies, Python stdlib only.
Supports multi-symbol queries and crypto prices.
2026-05-10 22:37:42 -07:00
Jeffrey Quesnelle bf2cc8b31c Merge pull request #20317 from NousResearch/meta/security-policy
docs(security): rewrite policy around OS-level isolation as the boundary
2026-05-11 01:36:32 -04:00
Teknium 228a4d11ae fix(config): warn loudly on YAML parse failure instead of silent default fallback (#23585)
A YAML parse error in ~/.hermes/config.yaml caused load_config() to print
one line to stdout (Warning: Failed to load config: ...) and silently fall
back to DEFAULT_CONFIG, dropping every user override (auxiliary providers,
fallback chain, model settings). Users only noticed when something
downstream misbehaved; see issue #23570, where a tab-indent error in the
auxiliary section caused aux fallback to use OpenRouter (depleted) instead
of the configured Codex/MiniMax chain.

Now: log at WARNING (so 'hermes logs' surfaces it), write a prominent line
to stderr, dedup on (path, mtime_ns, size) so concurrent loads don't spam,
and re-warn after the user edits the file. Both call sites (raw read +
merged load) route through the same helper.

Refs #23570
2026-05-10 22:36:19 -07:00
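The dedup rule in 228a4d11ae (warn once per (path, mtime_ns, size), warn again after an edit) might look roughly like this. The helper name and message wording are hypothetical; only the dedup key and the WARNING-plus-stderr behaviour are taken from the commit message.

```python
import logging
import os
import sys

logger = logging.getLogger(__name__)
_warned = set()  # keys of config states we have already warned about


def warn_config_parse_failure(path, error):
    """Warn about a parse failure once per distinct on-disk file state."""
    try:
        st = os.stat(path)
        key = (str(path), st.st_mtime_ns, st.st_size)
    except OSError:
        key = (str(path), None, None)
    if key in _warned:
        return False  # same file state: stay quiet
    _warned.add(key)
    msg = f"Failed to parse {path}: {error}; falling back to defaults"
    logger.warning(msg)
    print(f"WARNING: {msg}", file=sys.stderr)
    return True
```

Because the key includes mtime and size, saving the file again (even with another broken edit) produces a new key and therefore a new warning.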
Gutslabs 3af3c4eb8c fix(misc): three small defensive fixes from PR #1974
Salvages the three substantive low-severity fixes from Gutslabs' #1974
"misc bug fixes" bundle.  The other 8 claims in that PR were either
already fixed on main with superior implementations (state lock,
firecrawl lazy import, fcntl/msvcrt guard, path normalization, schema
migrations) or did not survive review.

- run_agent: `_materialize_data_url_for_vision` uses
  `NamedTemporaryFile(delete=False)`; if `base64.b64decode` raises on a
  corrupt data URL the temp file would persist forever.  Wrap the
  write in try/except and `os.unlink` the temp on failure.

- gateway/session: `append_to_transcript` JSONL write had no error
  handling, so disk-full / read-only-fs / permission errors crashed the
  message handler.  The SQLite write above is the primary store, so
  swallow OSError on the JSONL fallback with a debug log.

- gateway/status: `_read_pid_record` reads `pid_path.read_text()` after
  an `exists()` check; if the PID file is deleted between the two
  calls (concurrent gateway restart) we hit an unhandled OSError.
  Catch it and return None.

Adds a regression test for the tempfile cleanup; the other two paths
are defensive try/excepts on infrequent OSError that don't warrant
dedicated tests.

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
2026-05-10 22:28:01 -07:00
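The tempfile cleanup from the first bullet of 3af3c4eb8c can be sketched as below. This is not the body of `_materialize_data_url_for_vision`; the function name, the `dir` parameter and the use of `validate=True` are assumptions chosen so the failure path is easy to exercise.

```python
import base64
import os
import tempfile


def materialize_data_url(data_url, suffix=".bin", dir=None):
    """Decode a base64 data URL into a temp file and return its path.

    If decoding or writing fails, the temp file is removed before the
    exception propagates, so nothing is left behind on disk.
    """
    _header, _, payload = data_url.partition(",")
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix=suffix, dir=dir)
    try:
        with tmp:  # closes the handle whether or not the write succeeds
            tmp.write(base64.b64decode(payload, validate=True))
    except Exception:
        os.unlink(tmp.name)
        raise
    return tmp.name
```

With `delete=False` the file outlives the handle by design, which is why the error path has to unlink it explicitly.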
teknium1 482d49cf90 chore: AUTHOR_MAP entry for wilsen0 2026-05-10 22:22:25 -07:00
teknium1 edb4a2bda5 test(telegram): cover env-clamped helper + adaptive text-batch tiers
- New tests/gateway/test_telegram_text_batch_perf.py:
  TestEnvFloatClamped — 7 tests covering default-when-unset, valid
  parse, garbage fallback, NaN rejection, Inf rejection, min-clamp,
  max-clamp.  Asserts asyncio.sleep() always gets a finite number.

  TestAdaptiveTextBatchTiers — 4 tests covering the tier-constant
  invariants and the min(cap, tier_delay) composition rule.

- tests/gateway/test_display_config.py: update assertions for
  Telegram's new tool_progress='new' default.
2026-05-10 22:22:25 -07:00
wilsen0 ac95b8cdbe perf(gateway): tune Telegram cadence + adaptive fast-path for short replies
Re-authored against current main from PR #10388 by @wilsen0.  The
original branch is 3800+ commits stale and could not be cherry-picked
without reverting unrelated work; this change carries only the perf
intent forward.

Tuning summary
==============

Text-batch ingress (gateway/platforms/telegram.py):
  - HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS default 0.6 -> 0.3
  - HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS default 2.0 -> 1.0
  - Adaptive fast-path tiers in _flush_text_batch:
      total <= 320 cp -> min(cap, 0.18)
      total <= 1024 cp -> min(cap, 0.24)
      else            -> cap
    A single short reply now reaches the agent in ~180ms instead of
    600ms.  Tier constants compose with the configured cap via min()
    so an operator who tightens HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS
    below 0.18 still wins on every tier.
  - _env_float_clamped helper replaces bare float(os.getenv()).
    Rejects NaN / Inf, applies optional min/max bounds.  Used for
    text-batch + media-batch knobs.  Prevents asyncio.sleep(NaN)
    crashes when an operator typos an env var.

Stream cadence (gateway/config.py + stream_consumer.py):
  - StreamingConfig.edit_interval default 1.0s -> 0.8s
  - StreamingConfig.buffer_threshold default 40 -> 24 chars
  - DEFAULT_STREAMING_EDIT_INTERVAL / BUFFER_THRESHOLD / CURSOR are now
    a single source of truth.  StreamConsumerConfig imports them
    instead of duplicating the literals; the prior dual-source drift
    is fixed.

Tool progress (gateway/display_config.py):
  - Telegram default tool_progress 'all' -> 'new'.  Inside
    Telegram's ~1 edit/s flood envelope the 'all' default would
    accumulate edit pressure on busy chats; 'new' shows only the
    leading bubble per tool batch and feels less spammy.
  - Slack tier_low override (tool_progress='off') is preserved.

Composition with native draft streaming (#23512)
================================================

The mid-stream cadence (edit_interval, buffer_threshold) gates BOTH
the draft path (send_draft) and the edit path (edit_message), so the
tighter cadence helps native draft as much as edit-based.  The
text-batch fast-path applies before the consumer starts, so it speeds
up the first-token latency on every transport.  No conflict.

Stale-base avoidance
====================

Re-authored from scratch rather than cherry-picked.  Dropped from the
original branch:
  - Unrelated d2f043f9c 'fix(anthropic): preserve third-party thinking
    continuity' commit
  - boot_md.py builtin gateway hook (unrelated)
  - Reverted Slack tool_progress='off' (#14663) restoration
  - Reverted Platform plugin discovery, MSGRAPH_WEBHOOK, YUANBAO
    members deletion
  - 2300+ lines of run.py base-skew noise

Tests
=====

New tests/gateway/test_telegram_text_batch_perf.py:
  - 7 tests for _env_float_clamped (NaN, Inf, garbage, bounds).
  - 4 tests for the adaptive-tier composition rules.

Updated tests/gateway/test_display_config.py:
  - test_platform_default_when_no_user_config: 'all' -> 'new' for
    Telegram, with comment.
  - test_high_tier_platforms: split into Telegram-overrides-to-new
    and Discord-stays-all assertions.

Closes #10388.

Co-authored-by: wilsen0 <132184373+wilsen0@users.noreply.github.com>
2026-05-10 22:22:25 -07:00
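The two pieces of ac95b8cdbe that carry the most logic, the clamped env parser and the adaptive tiers, can be sketched as follows. The tier thresholds and delays are the ones listed in the commit message; the function signatures are assumptions, not the actual helpers in gateway/platforms/telegram.py.

```python
import math
import os


def env_float_clamped(name, default, min_value=None, max_value=None):
    """Read a float from the environment, rejecting garbage, NaN and Inf."""
    raw = os.getenv(name)
    if raw is None:
        return default
    try:
        value = float(raw)
    except ValueError:
        return default
    if not math.isfinite(value):  # float() happily parses "nan" and "inf"
        return default
    if min_value is not None:
        value = max(value, min_value)
    if max_value is not None:
        value = min(value, max_value)
    return value


def adaptive_batch_delay(total_codepoints, cap):
    """Pick the flush delay for a text batch; cap is the configured delay."""
    if total_codepoints <= 320:
        return min(cap, 0.18)
    if total_codepoints <= 1024:
        return min(cap, 0.24)
    return cap
```

For example, with the new default cap of 0.3 a 100-codepoint message waits 0.18 s, while an operator-set cap of 0.1 wins on every tier because of the `min()`.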
Teknium e3b88a8fe2 rename(skills): api-testing -> rest-graphql-debug (#23589)
More specific name. The skill is REST + GraphQL debugging end-to-end,
not generic 'api testing' (a smoke-test pytest scaffold is one short
section out of ~500 lines). Renames directory + frontmatter name +
self-reference in the delegate_task example body.
2026-05-10 22:22:19 -07:00
teknium1 5f767879e6 chore(release): AUTHOR_MAP entry for Hugo-SEQUIER 2026-05-10 22:15:04 -07:00
teknium1 1f899393dc chore(skills/hyperliquid): tighten SKILL.md to modern format
- description shortened to <=60 chars
- platforms gated to [linux, macos, windows] (stdlib-only, all OK)
- author credits Hugo Sequier
- collapse redundant prerequisites/setup blocks
- terminal-tool-oriented procedure section
2026-05-10 22:15:04 -07:00
Hugo Sqr f2e8ed2405 Add unit tests for hyperliquid skill functionality
- Implement tests for normalizing perpetual markets and DEXs.
- Validate JSON output for main commands including markets, candles, and review.
- Ensure environment variable resolution and dotenv file reading are covered.
- Test export functionality for market data with expected output structure.
2026-05-10 22:15:04 -07:00
Teknium1 28b4fe6007 test: stabilize quick-command redaction test against xdist ordering
agent.redact._REDACT_ENABLED is snapshotted at import time from
HERMES_REDACT_SECRETS env. Under xdist a prior test in the same worker
can flip it, so test_exec_command_output_is_redacted was order-dependent.
Pin it via monkeypatch like test_terminal_output_transform_still_runs_strip_and_redact does.
2026-05-10 22:12:23 -07:00
0xbyt4 f6736ced81 fix(security): sanitize env and redact output in quick commands + remove write-only _pending_messages
1. Quick command exec ran in the gateway process's full environment
   without env sanitization or output redaction. A quick command like
   "env" or "printenv" would leak all API keys, OAuth tokens, and
   bot credentials to the messaging user.

   Fix: apply _sanitize_subprocess_env() before exec and
   redact_sensitive_text() on output before returning.

2. GatewayRunner._pending_messages was written on every interrupt
   (lines 1331-1334) but never read or consumed anywhere. The actual
   interrupt delivery uses adapter._pending_messages (a separate dict).
   Removed the write-only accumulation to prevent unbounded growth.
2026-05-10 22:12:23 -07:00
Muhammet Eren Karakuş 4c57a5b318 feat(skills): add api-testing optional skill (#1800)
Adds optional-skills/software-development/api-testing/SKILL.md — a single-file
runbook for systematic REST/GraphQL API debugging via Hermes tools (terminal,
execute_code, web_extract, delegate_task).

- 60-char description; gated to platforms: [linux, macos]
- Layered debug flow (connectivity → TLS → auth → format → parse → semantics)
- HTTP status playbook (401/403/404/409/422/429/5xx)
- Pagination, idempotency, contract validation, correlation IDs
- pytest smoke template, token-redaction patterns, leak checklist
- Hermes tool patterns replace generic curl/python examples

Lands in optional-skills/ (not always-active skills/) so it's installed via
hermes skills install official/software-development/api-testing.

scripts/release.py: AUTHOR_MAP entry for erenkar950@gmail.com → eren-karakus0.

Closes #1800.

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
2026-05-10 22:11:31 -07:00
teknium1 6c1af45b78 chore: AUTHOR_MAP entry for kjames2001 (James Huang) 2026-05-10 22:02:56 -07:00
teknium1 82352e54c4 test(telegram): regression coverage for edit overflow split-and-deliver
Two new tests:

- tests/gateway/test_telegram_format.py
  test_message_too_long_splits_into_continuations_not_silent_truncation:
  asserts edit_message returns success=True with continuation_message_ids
  populated and message_id pointing at the last continuation when
  content exceeds MAX_MESSAGE_LENGTH (#19537). Replaces the original
  fail-on-overflow assertion with the split-and-deliver contract.

- tests/gateway/test_stream_consumer.py
  TestEditOverflowSplitAndDeliver.test_consumer_advances_message_id_on_split_and_deliver:
  asserts the consumer side updates _message_id to the latest
  continuation, clears _last_sent_text, and fires on_new_message when
  the adapter reports a split-and-deliver result.
2026-05-10 22:02:56 -07:00
kjames2001 bf1f40996f fix(telegram): split-and-deliver oversized edits instead of silent truncation
When edit_message_text exceeded Telegram's 4096 UTF-16 code-unit limit,
the adapter caught the BadRequest, best-effort truncated the content
with '…', and returned SendResult(success=True). The stream consumer
believed the full edit was delivered and never recovered, silently
dropping everything past the truncation boundary on long replies.

Returning failure isn't safe either — the consumer's existing fallback
path can race against the next streaming tick, producing duplicate
sends or gaps. Instead, the adapter now SPLITS the oversized payload
across the existing message + new continuation messages, so the user
always gets the full reply in correct order.

How it works:

1. Pre-flight: if utf16_len(content) already exceeds MAX_MESSAGE_LENGTH,
   call the new _edit_overflow_split helper directly — saves a doomed
   round-trip + a Telegram error.

2. Reactive: if Telegram still returns 'message_too_long' after the
   pre-flight (e.g. parse_mode formatting inflated the payload past
   the limit via MarkdownV2 escapes), the same helper handles it.

3. _edit_overflow_split:
   - Splits via truncate_message(len_fn=utf16_len) — same chunking the
     non-streaming send() path uses; chunks get '(1/N)' suffixes.
   - Edits the original message_id with chunk 1 (with parse_mode +
     plain-fallback when finalize=True, mirroring the main edit path).
   - Sends each remaining chunk via self._bot.send_message threaded as
     a reply to the previous chunk so the user sees them as a
     contiguous block. MarkdownV2-with-plain-fallback per chunk on
     finalize.
   - Returns SendResult(success=True, message_id=<last_chunk_id>,
     continuation_message_ids=(<chunk2_id>, <chunk3_id>, ...)) so the
     stream consumer can keep editing the most recent visible message
     and the gateway has full visibility into every message id.

SendResult contract extension:

  Added optional continuation_message_ids: tuple = () field. When
  empty (the common case), behavior is unchanged. When populated, the
  caller knows the adapter delivered across multiple platform messages.

Stream consumer integration:

  GatewayStreamConsumer._send_or_edit advances _message_id to the
  last-continuation id when it sees continuation_message_ids on a
  successful edit result, resets _last_sent_text (the new visible
  message holds only the final chunk's text), and fires
  on_new_message so tool-progress bubbles linearize below the new
  continuation rather than the original. Mirrors the openclaw #32535
  inter-tool-leak guard.

Composes with what just landed:

  - PR #23455 (UTF-16 length-aware splitting in stream consumer)
    prevents most overflows upstream by measuring text in UTF-16
    code units before deciding to split. This PR is the safety net at
    the adapter boundary.
  - PR #23512 (native draft streaming, default for DM Telegram) routes
    DM streaming through send_draft, which has its own contract
    unaffected by this change. So this fix narrows in scope to the
    edit-based path: groups, supergroups, forum topics, every
    non-Telegram platform, and the per-response fallback after a
    draft failure.

Salvage notes:

  - Cherry-picked from PR #19537 by @kjames2001. Original PR returned
    failure on overflow; this evolves to split-and-deliver so users
    never lose content and the consumer state stays consistent.
  - Dropped an unrelated model-picker hunk (line 2114-2117) that
    silently killed the 'X more available — type /model <name>
    directly' hint by hardcoding total=len(models). Not in scope.
  - Restored the timeout-aware retryable=not is_timeout signal in
    send()'s fallthrough catch block.

Closes #19537.
2026-05-10 22:02:56 -07:00
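To make the UTF-16 measurement in bf1f40996f concrete, here is a simplified sketch. `utf16_len` is named in the commit message, but this body is an assumption, and `split_by_utf16` is a hypothetical stand-in for `truncate_message`: it omits the '(1/N)' suffixes and any word-boundary handling.

```python
def utf16_len(text):
    """Length of text in UTF-16 code units (astral characters count as 2)."""
    return len(text.encode("utf-16-le")) // 2


def split_by_utf16(text, limit):
    """Split text into chunks of at most `limit` UTF-16 code units.

    Splits only between Python characters, so a surrogate pair is never
    cut in half. Assumes limit >= 2.
    """
    chunks, current, used = [], [], 0
    for ch in text:
        width = 2 if ord(ch) > 0xFFFF else 1
        if current and used + width > limit:
            chunks.append("".join(current))
            current, used = [], 0
        current.append(ch)
        used += width
    if current:
        chunks.append("".join(current))
    return chunks
```

Measuring with `len()` alone would undercount any text containing emoji or other characters outside the Basic Multilingual Plane, which is how a payload that looks short enough can still be rejected as too long.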
Teknium 3b122cc1ac feat(kanban): stranded_in_ready diagnostic for unclaimed tasks (#23578)
Surface ready tasks that nobody claims within a threshold (default
30 min) regardless of why. One identity-agnostic signal that catches:

- Operator typo'd the assignee
- Profile was deleted, leaving its tasks stranded
- External worker pool (Codex CLI lane, custom daemon) is down
- Dispatcher misconfigured (wrong board / wrong HERMES_HOME)

Today the dispatcher correctly skips these (no respawn loop, good)
but nothing surfaces the fact that operator-actionable work is
accumulating. The new `stranded_in_ready` rule does that without
requiring a manual lane registry — it reads the most recent ready-
transition event (`created` / `promoted` / `reclaimed` / `unblocked`)
and fires when (now - last_ready_ts) > threshold.

Severity escalates with age: warning at threshold, error at 2x,
critical at 6x. The cli_hint and reassign actions point operators
at the right next step.

Out of scope deliberately:
- Lane registry (#20157 closed) — this signal supersedes it.
- Pushing the diagnostic into messaging gateways — diagnostics
  are pull-only via 'hermes kanban diagnostics' for now; gateway
  push is a separate UX decision.

Tests: 10 new + 461 existing kanban tests pass. Verified end-to-end
via 'hermes kanban diagnostics --json' against a 2h-old
stranded task — surfaces as error severity with correct actions.
2026-05-10 21:58:44 -07:00
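The severity escalation for `stranded_in_ready` in 3b122cc1ac, as a small sketch. The multipliers (1x, 2x, 6x) and the 30 minute default are from the commit message; the function name and the exact boundary handling (strictly greater than the threshold to fire, greater-or-equal for the escalation steps) are assumptions.

```python
def stranded_severity(age_seconds, threshold_seconds=30 * 60):
    """Map the time a task has sat in 'ready' to a diagnostic severity.

    Returns None while the task is still within the threshold.
    """
    if age_seconds <= threshold_seconds:
        return None
    if age_seconds >= 6 * threshold_seconds:
        return "critical"
    if age_seconds >= 2 * threshold_seconds:
        return "error"
    return "warning"
```

A task that has been ready for 2 hours is at 4x the default threshold, which lands in the error band and matches the verification described in the commit.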
Teknium1 bf5b8a7d61 chore(release): map @eloklam tailnet email 2026-05-10 21:44:37 -07:00
Teknium1 b8bf2f817d fix(kanban): merge dashboard batch QOL with i18n + collapse + assignee-casing
PR #23240 was branched before main landed:
- c39168453 i18n localization (16 locales)
- a91e5a875 native <details> collapse + skip empty metadata
- 0e0ddaac8 tone down completed-run metadata panel
- b308dd7d7 preserve assignee casing in dashboard

The cherry-pick took PR's dist/index.js wholesale via -X theirs,
which dropped those features. This commit re-applies them by
hand-merging the 7 conflict regions:

1. bulk-action catch handler: keep PR's failedIds + loadBoard,
   keep main's t-in-deps for tx() i18n calls
2. Refresh button: keep main's tx(t, 'refresh', ...), add PR's
   Clear filters button with tx(t, 'clearFilters', ...)
3. Archive button: keep main's tx(t, 'archive', ...), add PR's
   priority setter with tx(t, 'priority'/'setPriority', ...)
4. Column header: keep main's colHelp i18n var, add PR's
   column-select-all checkbox
5/6. lane.tasks/column.tasks .map: keep main's t->tk rename
   (avoids shadowing the i18n t), apply tk to PR's failed/
   draggingSource props
7. Card checkbox label-wrap: keep PR's <label> structure
   (larger hit target), keep main's tx(i18n, 'selectForBulk', ...)

Adds three new i18n keys (clearFilters, priority, setPriority)
that fall back to English via tx() until translators add them
to the kanban catalog, matching the existing pattern.
2026-05-10 21:44:37 -07:00
eloklam b60462a205 test(kanban): remove stale t.summary assertion from search test
Task.summary was never a real field; latest_summary already covers it.
Matches the haystack cleanup in commit f3015e6ab.
2026-05-10 21:44:37 -07:00
Yi Lok Enoch Lam 3df7e30244 kanban dashboard: fix shift-click range selection, column select-all toggle, and bulk action optimistic UI
- Bug 1: shift-click now always adds the target card and sets it as the
  last-selected anchor, so range selection works even when 0 or 1 cards
  are selected.
- Bug 2: column select-all checkbox now toggles: if every card in the
  column is already selected, clicking unselects them all.
- Bug 3: applyBulk now mirrors moveSelected with optimistic UI updates
  for status moves and calls loadBoard() on catch for consistency.
2026-05-10 21:44:37 -07:00
Yi Lok Enoch Lam 69053832e3 kanban dashboard: remove redundant t.summary from search haystack
The Task dataclass has no `summary` field; only Run carries summary.
The dashboard already searches `latest_summary` (derived from the
latest run), so `t.summary` in the client-side haystack was always
undefined and therefore redundant.

Verdict from task t_4bcac44f:
- Before batch QOL (6c7ec94d9): search only covered id, title,
  assignee, tenant.
- Batch QOL (7fd187102) correctly added body, result, latest_summary.
- `t.summary` was included but is a misleading no-op because tasks
  never expose a `summary` key — `latest_summary` already covers it.

Removes the redundant field from the haystack only.
2026-05-10 21:44:37 -07:00
Yi Lok Enoch Lam a88f201cd4 kanban dashboard: multi-card drag visual feedback
- When dragging a selected card while multiple cards are selected, the
  browser ghost image now shows a 'N cards' badge instead of a single card.
- All selected cards in the original column are dimmed (opacity 0.45 +
  grayscale) during the drag so the user sees the whole set is in-flight.
- Uses React state for the dragged task id; event delegation on the board
  columns container to avoid deep prop threading.
2026-05-10 21:44:37 -07:00
Yi Lok Enoch Lam 98c499b235 kanban dashboard: fix batch QOL oracle blockers
- Preserve failedIds partial-failure highlighting after moveSelected/
  applyBulk by clearing only selectedIds/lastSelectedId instead of
  calling clearSelected() (which also wiped failedIds).
- Fix touch/native multi-drag drop stale closure by adding
  props.selectedIds and props.onMoveSelected to the hermes-kanban:drop
  useEffect dependency array.

Fixes t_5bfafb73.
2026-05-10 21:44:37 -07:00
Yi Lok Enoch Lam 0ea234e093 feat(kanban): dashboard batch QOL upgrade
- Shift-click range selection, column select-all, select-all-visible
- Multi-card drag/drop via selectedIds + /tasks/bulk
- Expanded bulk actions: todo/ready/blocked/unblock/complete/archive,
  priority setter, reassign with reclaim_first checkbox
- Partial failure card highlight (failedIds + hermes-kanban-card--failed)
- Search expanded to body, result, latest_summary, summary
- Clear filters button + reset all filters on board switch
- Accessibility: larger checkbox hit target, tabIndex/role/aria-label,
  Enter/Space/Esc keyboard handlers
- Fix temporal-dead-zone bug: move clearSelected before moveSelected
2026-05-10 21:44:37 -07:00
Yi Lok Enoch Lam 518d37f6af feat(kanban): add reclaim_first support to bulk reassign endpoint
- Extend BulkTaskBody with reclaim_first: bool = False
- In bulk_update, use kanban_db.reassign_task(..., reclaim_first=True)
  when payload.reclaim_first is set and assignee is present
- Falls back to existing assign_task behavior when reclaim_first is false

This enables the dashboard to bulk-reassign running tasks by
reclaiming their claims first, matching the single-task
/tasks/{id}/reassign endpoint behavior.
2026-05-10 21:44:37 -07:00
Teknium a63a2b7c78 fix(goals): force judge to use tool calls instead of JSON-text replies (#23547)
Live-tested on gemini-3-flash-preview the judge kept returning empty
or non-JSON content, tripping the consecutive-parse-failures auto-
pause. Free-form JSON output is best-effort; tool-call schemas are
enforced server-side by virtually every modern provider.

Two new tools the judge calls:

  - submit_checklist(items)  — Phase A, decompose
  - update_checklist(updates, new_items, reason) — Phase B, evaluate

Both phases now call the auxiliary client with tool_choice forcing
the right tool. read_file remains for Phase B history inspection,
with the loop exiting only when update_checklist is called or the
read budget is exhausted (at which point read_file is dropped from
the toolbox and update_checklist is forced).

Robustness:
- _call_judge_with_tool_choice falls back through tool_choice forced→required→
  auto if the provider rejects a particular shape.
- If a fully-broken provider still returns content instead of a tool
  call, the legacy JSON-text parsers stay around as a last-ditch
  backstop so we never silently lose a checklist.
- _normalize_update_args replaces the JSON parser for the apply
  layer; same 1-based→0-based conversion + terminal-status filter.

Live verification: same fizzbuzz goal that was hitting 'judge model
returned unparseable output 3 turns in a row' before now terminates
in 2 turns, all 11 items marked completed with item-specific
evidence, no auto-pause. Agent log shows
'produced 11 checklist items via tool call' instead of the JSON-
parse path.

Tests: 7 new cases for the tool-call path (Phase A success, Phase B
update only, Phase B read_file→update, JSON-content backstop,
empty-text item dropping, non-terminal status filter).
2026-05-10 20:51:40 -07:00
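The forced→required→auto fallback in a63a2b7c78 could be sketched like this. It is not the body of `_call_judge_with_tool_choice`: the `call` parameter is a hypothetical callable standing in for the auxiliary client, and treating a provider rejection as `ValueError` is an assumption. The forced shape follows the OpenAI-style chat-completions `tool_choice` format.

```python
def call_with_tool_choice_fallback(call, tool_name):
    """Try progressively weaker tool_choice shapes until one is accepted.

    `call` is a hypothetical callable taking tool_choice=... and raising
    ValueError when the provider rejects that shape.
    """
    shapes = [
        {"type": "function", "function": {"name": tool_name}},  # forced
        "required",
        "auto",
    ]
    last_error = None
    for shape in shapes:
        try:
            return call(tool_choice=shape)
        except ValueError as exc:
            last_error = exc  # remember the error and try the next shape
    raise last_error
```

If every shape is rejected, the last provider error is re-raised so the caller can fall through to the JSON-text backstop.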
Teknium 4a080b1d5a fix(goals): forward standing /goal state on auto-compression session rotation (#23530)
When run_agent's _compress_context fires mid-turn it ends the parent
session in SessionDB and creates a new continuation session with a
fresh session_id. The /goal state is keyed on session_id in
state_meta ("goal:<sid>"), so without forwarding the goal silently
disappears: _get_goal_manager() rebinds for the new session_id,
load_goal() returns None, mgr.is_active() is False, and the
continuation loop dies with no user-visible signal.

Fix: in the same SessionDB transaction block that creates the
continuation session, copy state_meta[goal:<old>] →
state_meta[goal:<new>] when present. No-op when the user has no
active goal. Logged at INFO so a stuck loop is debuggable.

Tests cover the round-trip via SessionDB and the no-op path.

Affects all three run-conversation surfaces (CLI, gateway, TUI
gateway) because _compress_context is the single rotation site.
2026-05-10 20:41:53 -07:00
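The key copy described in 4a080b1d5a reduces to a few lines. In this sketch a plain dict stands in for the state_meta table in SessionDB, and the function name is hypothetical; only the "goal:<sid>" key format comes from the commit message.

```python
def forward_goal_state(state_meta, old_session_id, new_session_id):
    """Copy goal:<old> to goal:<new> if a goal exists; otherwise do nothing."""
    old_key = f"goal:{old_session_id}"
    if old_key not in state_meta:
        return False  # no active goal: no-op
    state_meta[f"goal:{new_session_id}"] = state_meta[old_key]
    return True
```

The old key is left in place here; whether the real change also removes it is not stated in the commit message.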
teknium1 68d081f570 fix(kanban): keep '--created-by' default as 'user'
Out-of-scope behavior change in #23521 — the kanban notifier-routing fix
also flipped the 'kanban create --created-by' default from 'user' to the
active profile name. Revert to keep PR scope focused on the notifier
ownership fix; the profile-aware author default can be its own change.
2026-05-10 20:04:53 -07:00
Mike Nguyen ba5640fa11 fix(gateway): route kanban notifications to creator profile 2026-05-10 20:04:53 -07:00
teknium1 9e005d6779 chore: AUTHOR_MAP entry for NivOO5 2026-05-10 20:02:50 -07:00
teknium1 7f90141c63 test(telegram): native-draft transport coverage + docs
Added tests/gateway/test_stream_consumer_draft.py with 11 tests
covering:
- Transport selection: auto+dm-supported -> draft; auto+group -> edit;
  explicit edit; explicit draft on unsupported adapter -> edit;
  MagicMock adapter -> edit (back-compat for the existing test suite).
- Happy path: DM stream animates draft frames with a single shared
  draft_id, then finalizes via a regular adapter.send.
- Group fallback: drafts entirely skipped in non-DM chats.
- Failure fallback: send_draft returning success=False disables drafts
  for the rest of the response.
- Draft_id lifecycle: consecutive responses use distinct ids; tool
  boundaries bump the id so post-tool text animates fresh below the
  tool-progress bubble (the openclaw #32535 leak guard).
- _already_sent contract: drafts must NOT set the flag so the gateway's
  fallback final-send still fires (drafts have no message_id).

Updated website/docs/user-guide/messaging/telegram.md with a
'Streaming transport' section explaining auto|draft|edit|off, the
DM-only constraint, and the per-response fallback behaviour.
2026-05-10 20:02:50 -07:00
NivOO5 4ed293b38e feat(telegram): native draft streaming via sendMessageDraft (Bot API 9.5+)
Adds Telegram's native streaming-draft API as a streaming transport so DM
replies render with smooth animated previews as tokens arrive, dropping
the per-edit jitter of the legacy editMessageText polling path.

Adapter contract (gateway/platforms/base.py):
  - supports_draft_streaming(chat_type, metadata) -> bool. Default False.
    Telegram returns True only for DMs and only when the bound python-
    telegram-bot version exposes Bot.send_message_draft (PTB 22.6+).
  - send_draft(chat_id, draft_id, content, metadata) -> SendResult.
    Default raises NotImplementedError. Telegram delegates to PTB's
    send_message_draft. Drafts have no message_id (Bot API contract);
    SendResult.message_id is None on success.

Telegram adapter (gateway/platforms/telegram.py):
  - supports_draft_streaming gates on chat_type='dm' AND PTB capability.
  - send_draft trims to MAX_MESSAGE_LENGTH using utf16_len, threads
    message_thread_id through metadata, and routes failures back as
    SendResult(success=False, error=...) so the consumer can fall back.

Stream consumer (gateway/stream_consumer.py):
  - StreamConsumerConfig gains transport ('auto'|'draft'|'edit'|'off')
    and chat_type fields.
  - run() resolves _use_draft_streaming once via a probe at the top of
    the run, allocating a fresh class-wide draft_id_counter so each
    response animates as its own preview (no animation collision across
    consecutive responses to the same chat).
  - _send_or_edit gains a pre-edit branch: when drafts are active AND
    not finalizing AND no edit-path message_id is established, the
    frame routes through _send_draft_frame instead of edit_message.
    Drafts intentionally do NOT set _already_sent so the gateway's
    final sendMessage path still fires — drafts have no message_id and
    the user needs a real message in their chat history.
  - _reset_segment_state bumps the draft_id when the consumer is in
    draft mode so each text block after a tool boundary animates as a
    fresh preview below the tool-progress bubble (avoids the inter-
    tool-call leak openclaw documented in their #32535).
  - Per-response fallback: any send_draft failure (transient network,
    server reject, capability gap) flips _use_draft_streaming to False
    for the rest of the run, gracefully returning to the edit path.

Gateway config (gateway/config.py):
  - StreamingConfig.transport default flips edit -> auto. The auto path
    is identical to edit on every chat type that doesn't currently
    support drafts (groups, supergroups, forum topics, every non-
    Telegram platform), so the default is backwards-compatible for
    non-DM users.

Lifecycle model (Telegram Bot API 9.5):
  1. sendMessageDraft(chat_id, draft_id, text='') opens the bubble.
  2. Repeated sendMessageDraft calls with the SAME draft_id animate
     the preview as text grows.
  3. Drafts have no message_id and cannot be edited or deleted.
  4. When the response finishes the gateway's normal sendMessage path
     delivers the final answer; the draft preview clears naturally on
     the client and the user sees a real message in their history.
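
The draft lifecycle and the per-response fallback described above can be sketched roughly as follows. This is a minimal illustration, not the gateway's code: `DraftStreamer`, `push`, and `new_segment` are hypothetical names, the `SendResult` here is a simplified stand-in, and `send_draft` / `edit_message` are injected callables rather than the real adapter methods.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SendResult:
    # Simplified stand-in; drafts never carry a message_id.
    success: bool
    message_id: Optional[int] = None
    error: Optional[str] = None


# Process-wide monotonic counter (assumption: stands in for the
# class-wide draft_id_counter mentioned in the commit message).
_draft_ids = itertools.count(1)


class DraftStreamer:
    """Hypothetical helper: streams frames as drafts, falls back to edits."""

    def __init__(
        self,
        send_draft: Callable[[int, str], SendResult],
        edit_message: Callable[[str], None],
    ) -> None:
        self.send_draft = send_draft
        self.edit_message = edit_message
        self.use_draft = True
        # A fresh id per response, so consecutive responses never share one.
        self.draft_id = next(_draft_ids)

    def push(self, text: str) -> str:
        """Send one frame and report which path handled it."""
        if self.use_draft:
            result = self.send_draft(self.draft_id, text)
            if result.success:
                return "draft"
            # Any draft failure disables drafts for the rest of this response.
            self.use_draft = False
        self.edit_message(text)
        return "edit"

    def new_segment(self) -> None:
        """Bump the draft id at a tool boundary, only while in draft mode."""
        if self.use_draft:
            self.draft_id = next(_draft_ids)
```

Reusing the same `draft_id` across `push` calls is what animates a single preview; the final `sendMessage` delivery is deliberately left out of the sketch because drafts never replace it.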

Inspired by PR #3412 by @NivOO5. Re-authored against current main
(stream_consumer.py is now ~4x larger than at #3412's branch base, with
new _NEW_SEGMENT/_COMMENTARY/finalize/_on_new_message machinery the
original PR didn't account for) but the design call (DM-only, edit-
fallback, transport=auto|draft|edit|off) is faithful to the original
proposal, with two improvements baked in:

  1. Per-response draft_id (monotonic counter, not a time hash) — no
     collision risk across consecutive responses on the same chat.
  2. Tool-boundary draft_id bump — prevents the inter-tool-call leak
     openclaw hit during their rollout (their #32535).

Closes #21439 (duplicate feature request).
2026-05-10 20:02:50 -07:00
emozilla 0d1cbc2dda changes from feedback 2026-05-05 22:45:12 -04:00
emozilla 401aadb5b8 docs(security): rewrite policy around OS-level isolation as the boundary
Restate the trust model from first principles: the OS is the only
load-bearing boundary against an adversarial LLM. Distinguish
terminal-backend isolation (sandboxes the shell tool) from
whole-process wrapping (sandboxes the agent itself, reference
deployment NVIDIA OpenShell). Name in-process components (approval
gate, output redaction, Skills Guard) as heuristics, and the class
of reports that defeat them as out of scope under this policy —
while explicitly welcoming them as regular issues or PRs.

Introduce 'agent-loaded content' as the narrow, honest commitment:
attacker-influenced input must not chain into a write the agent
later loads on its own initiative.

Strip implementation-detail enumerations (backend names, adapter
names, config keys, env vars, internal symbols) so the doc stays
evergreen as code evolves.
2026-05-05 12:46:51 -04:00
ethernet 9d645d98c4 fix(tui): update README 2026-04-30 18:23:28 -04:00
ethernet 242659f5af fix(tui): don't hardcode /home/bb 2026-04-30 18:23:28 -04:00
ethernet 42df7ec597 fix(tui): update comments 2026-04-30 18:23:28 -04:00
ethernet 42e166c7ea refactor(docker): drop manual @hermes/ink build, rely on esbuild bundle
the esbuild pipeline (scripts/build.mjs) already bundles ink into a
single self-contained dist/entry.js.

remove the Dockerfile steps that manually copied packages/hermes-ink
into node_modules/@hermes/ink and ran a nested
npm install there.

- Dockerfile: simplify TUI build step to just 'npm run build'
- hermes_cli/main.py: _tui_build_needed now checks dist/entry.js
staleness against source files before falling back to the old
ink-bundle.js logic
- tests: update TUI npm install tests and drop the Dockerfile contract
test for the removed ink materialization step
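
A staleness check of the kind described for `_tui_build_needed` might look like the sketch below. It is an assumption-laden illustration: `bundle_is_stale` is a hypothetical name, and comparing modification times is one plausible way to detect a stale bundle, not necessarily what `hermes_cli/main.py` does.

```python
from pathlib import Path


def bundle_is_stale(entry: Path, src_dir: Path) -> bool:
    """Return True when the bundle is missing or older than any source file.

    Hypothetical helper: uses mtimes as the staleness signal.
    """
    if not entry.exists():
        return True
    built_at = entry.stat().st_mtime
    return any(
        path.stat().st_mtime > built_at
        for path in src_dir.rglob("*")
        if path.is_file()
    )
```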
2026-04-30 17:32:55 -04:00
github-actions[bot] 279504d5b8 fix(nix): refresh npm lockfile hashes 2026-04-30 19:49:01 +00:00
ethernet 42627b4eaf refactor(tui): bundle with esbuild, drop runtime node_modules
Replace the tsc + babel pipeline with a single esbuild invocation that
produces a self-contained dist/entry.js. The nix TUI derivation no
longer copies node_modules — only dist/ + package.json ship, shrinking
the output from hundreds of MB to ~2.9 MB.

- ui-tui/scripts/build.mjs: new esbuild bundler. Aliases @hermes/ink
  to source (esbuild's __esm helper doesn't await nested async init,
  which breaks lazy-assigned exports like 'render' when re-exporting
  through a prebuilt submodule). Stubs react-devtools-core (dev-only).
  Injects a createRequire shim for transitive CJS deps. Strips the
  shebang from src/entry.tsx because Nix patchShebangs mangles
  '/usr/bin/env -S node --max-old-space-size=8192 --expose-gc' — it
  drops the 'node' token. The Python launcher always invokes node
  explicitly, so the shebang is redundant.
- nix/tui.nix: installPhase no longer copies node_modules or the
  @hermes/ink packages dir.
- nix/checks.nix: drop the 'node_modules present' assertion.
- hermes_cli/main.py: _tui_need_npm_install short-circuits when
  dist/entry.js exists and no package-lock.json is present. That is
  the prebuilt-bundle layout (nix / packaged release) and there is
  nothing to install. Without this, the launcher tried to npm install
  in a non-existent site-packages/ui-tui path.
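
The short-circuit described for `_tui_need_npm_install` can be illustrated as below. This is a sketch under stated assumptions: `tui_needs_npm_install` is a hypothetical name, and the fallback branch (install when `node_modules` is absent) is a guess added only to make the example complete.

```python
from pathlib import Path


def tui_needs_npm_install(tui_dir: Path) -> bool:
    """Decide whether npm install should run for the TUI directory."""
    entry = tui_dir / "dist" / "entry.js"
    lockfile = tui_dir / "package-lock.json"
    # Prebuilt-bundle layout (nix / packaged release): nothing to install.
    if entry.exists() and not lockfile.exists():
        return False
    # Assumed fallback, not taken from the commit: install when
    # node_modules has not been materialized yet.
    return not (tui_dir / "node_modules").is_dir()
```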
2026-04-30 15:38:50 -04:00
268 changed files with 21515 additions and 3960 deletions
+12
@@ -143,6 +143,18 @@
# Also requires ~/.honcho/config.json with enabled=true (see README).
# HONCHO_API_KEY=
# =============================================================================
# HYPERLIQUID OPTIONAL SKILL
# =============================================================================
# Optional defaults for the Hyperliquid skill in optional-skills/blockchain/hyperliquid
#
# Hyperliquid API base URL override
# Default: https://api.hyperliquid.xyz
# HYPERLIQUID_API_URL=https://api.hyperliquid-testnet.xyz
#
# Default address for account-level commands like state, fills, orders, and review
# HYPERLIQUID_USER_ADDRESS=0x0000000000000000000000000000000000000000
# =============================================================================
# TERMINAL TOOL CONFIGURATION
# =============================================================================
+302 -55
@@ -1,84 +1,331 @@
# Hermes Agent Security Policy
This document outlines the security protocols, trust model, and deployment hardening guidelines for the **Hermes Agent** project.
This document describes Hermes Agent's trust model, names the one
security boundary the project treats as load-bearing, and defines the
scope for vulnerability reports.
## 1. Vulnerability Reporting
## 1. Reporting a Vulnerability
Hermes Agent does **not** operate a bug bounty program. Security issues should be reported via [GitHub Security Advisories (GHSA)](https://github.com/NousResearch/hermes-agent/security/advisories/new) or by emailing **security@nousresearch.com**. Do not open public issues for security vulnerabilities.
Report privately via [GitHub Security Advisories](https://github.com/NousResearch/hermes-agent/security/advisories/new)
or **security@nousresearch.com**. Do not open public issues for
security vulnerabilities. **Hermes Agent does not operate a bug
bounty program.**
### Required Submission Details
- **Title & Severity:** Concise description and CVSS score/rating.
- **Affected Component:** Exact file path and line range (e.g., `tools/approval.py:120-145`).
- **Environment:** Output of `hermes version`, commit SHA, OS, and Python version.
- **Reproduction:** Step-by-step Proof-of-Concept (PoC) against `main` or the latest release.
- **Impact:** Explanation of what trust boundary was crossed.
A useful report includes:
- A concise description and severity assessment.
- The affected component, identified by file path and line range
(e.g. `path/to/file.py:120-145`).
- Environment details (`hermes version`, commit SHA, OS, Python
version).
- A reproduction against `main` or the latest release.
- A statement of which trust boundary in §2 is crossed.
Please read §2 and §3 before submitting. Reports that demonstrate
limits of an in-process heuristic this policy does not treat as a
boundary will be closed as out-of-scope under §3 — but see §3.2:
they are still welcome as regular issues or pull requests, just not
through the private security channel.
---
## 2. Trust Model
The core assumption is that Hermes is a **personal agent** with one trusted operator.
Hermes Agent is a single-tenant personal agent. Its posture is
layered, and the layers are not equally load-bearing. Reporters and
operators should reason about them in the same terms.
### Operator & Session Trust
- **Single Tenant:** The system protects the operator from LLM actions, not from malicious co-tenants. Multi-user isolation must happen at the OS/host level.
- **Gateway Security:** Authorized callers (Telegram, Discord, Slack, etc.) receive equal trust. Session keys are used for routing, not as authorization boundaries.
- **Execution:** Defaults to `terminal.backend: local` (direct host execution). Container isolation (Docker, Modal, Daytona) is opt-in for sandboxing.
### 2.1 Definitions
### Dangerous Command Approval
The approval system (`tools/approval.py`) is a core security boundary. Terminal commands, file operations, and other potentially destructive actions are gated behind explicit user confirmation before execution. The approval mode is configurable via `approvals.mode` in `config.yaml`:
- `"on"` (default) — prompts the user to approve dangerous commands.
- `"auto"` — auto-approves after a configurable delay.
- `"off"` — disables the gate entirely (break-glass; see Section 3).
- **Agent process.** The Python interpreter running Hermes Agent,
including any Python modules it has loaded (skills, plugins,
hook handlers).
- **Terminal backend.** A pluggable execution target for the
`terminal()` tool. The default runs commands directly on the host.
Other backends run commands inside a container, cloud sandbox, or
remote host.
- **Input surface.** Any channel through which content enters the
agent's context: operator input, web fetches, email, gateway
messages, file reads, MCP server responses, tool results.
- **Trust envelope.** The set of resources an operator has implicitly
granted Hermes Agent access to by running it — typically, whatever
the operator's own user account can reach on the host.
- **Stance.** An explicit statement in Hermes Agent's documentation
or code about how a consuming layer (adapter, UI, file writer,
shell) should treat agent output — e.g. "the dashboard renders
agent output as inert HTML."
### Output Redaction
`agent/redact.py` strips secret-like patterns (API keys, tokens, credentials) from all display output before it reaches the terminal or gateway platform. This prevents accidental credential leakage in chat logs, tool previews, and response text. Redaction operates on the display layer only — underlying values remain intact for internal agent operations.
### 2.2 The Boundary: OS-Level Isolation
### Skills vs. MCP Servers
- **Installed Skills:** High trust. Equivalent to local host code; skills can read environment variables and run arbitrary commands.
- **MCP Servers:** Lower trust. MCP subprocesses receive a filtered environment (`_build_safe_env()` in `tools/mcp_tool.py`) — only safe baseline variables (`PATH`, `HOME`, `XDG_*`) plus variables explicitly declared in the server's `env` config block are passed through. Host credentials are stripped by default. Additionally, packages invoked via `npx`/`uvx` are checked against the OSV malware database before spawning.
**The only security boundary against an adversarial LLM is the
operating system.** Nothing inside the agent process constitutes
containment — not the approval gate, not output redaction, not any
pattern scanner, not any tool allowlist. Any in-process component
that screens LLM output is a heuristic operating on an
attacker-influenced string, and this policy treats it as such.
### Code Execution Sandbox
The `execute_code` tool (`tools/code_execution_tool.py`) runs LLM-generated Python scripts in a child process with API keys and tokens stripped from the environment to prevent credential exfiltration. Only environment variables explicitly declared by loaded skills (via `env_passthrough`) or by the user in `config.yaml` (`terminal.env_passthrough`) are passed through. The child accesses Hermes tools via RPC, not direct API calls.
Hermes Agent supports two OS-level isolation postures. They address
different threats and an operator should choose deliberately.
### Subagents
- **No recursive delegation:** The `delegate_task` tool is disabled for child agents.
- **Depth limit:** `MAX_DEPTH = 2` — parent (depth 0) can spawn a child (depth 1); grandchildren are rejected.
- **Memory isolation:** Subagents run with `skip_memory=True` and do not have access to the parent's persistent memory provider. The parent receives only the task prompt and final response as an observation.
#### Terminal-backend isolation
A non-default terminal backend runs LLM-emitted shell commands
inside a container, remote host, or cloud sandbox. The file tools
(`read_file`, `write_file`, `patch`) also run through this backend,
since they are implemented on top of the shell contract — they
cannot reach paths the backend doesn't expose.
What this confines: anything the agent does by issuing shell or
file operations. What this does **not** confine: everything the
agent does in its own Python process. That includes the
code-execution tool (spawned as a host subprocess), MCP subprocesses
(spawned from the agent's environment), plugin loading, hook
dispatch, and skill loading (all imported into the agent
interpreter).
Terminal-backend isolation is the right posture when the concern is
LLM-emitted destructive shell or unwanted file-tool writes, and the
operator is otherwise trusted.
#### Whole-process wrapping
Whole-process wrapping runs the entire agent process tree inside a
sandbox. Every code path — shell, code-execution, MCP, file tools,
plugins, hooks, skill loading — is subject to the same filesystem,
network, process, and (where applicable) inference policy.
Hermes Agent supports this in two ways:
- **Hermes Agent's own Docker image and Compose setup.** Lighter-
weight; the agent runs in a standard container with operator-
configured mounts and network policy.
- **[NVIDIA OpenShell](https://github.com/NVIDIA/OpenShell)**.
OpenShell provides per-session sandboxes with declarative policy
across filesystem, network (L7 egress), process/syscall, and
inference-routing layers. Network and inference policies are
hot-reloadable. Credentials are injected from a Provider store
and never touch the sandbox filesystem.
Under a whole-process wrapper, Hermes Agent's in-process heuristics
(§2.4) function as accident-prevention layered on top of a real
boundary. This is the supported posture when the agent ingests
content from surfaces the operator does not control — the open web,
inbound email, multi-user channels, untrusted MCP servers — and for
production or shared deployments.
Operators running the default local backend with untrusted input
surfaces, or running a terminal-backend sandbox and expecting it to
contain code paths that don't go through the shell, are operating
outside the supported security posture.
### 2.3 Credential Scoping
Hermes Agent filters the environment it passes to its lower-trust
in-process components: shell subprocesses, MCP subprocesses, and
the code-execution child. Credentials like provider API keys and
gateway tokens are stripped by default; variables explicitly
declared by the operator or by a loaded skill are passed through.
This reduces casual exfiltration. It is not containment. Any
component running inside the agent process (skills, plugins, hook
handlers) can read whatever the agent itself can read, including
in-memory credentials. The mitigation against a compromised
in-process component is operator review before install (§2.4,
§2.5), not environment scrubbing.
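
As a rough illustration of the environment filtering described in this section, the sketch below keeps a small baseline plus explicitly declared variables. The names `SAFE_BASELINE` and `build_child_env` are hypothetical, and the baseline is an assumption modeled on the `PATH`, `HOME`, `XDG_*` set named in the previous version of this policy.

```python
from typing import Iterable, Mapping

# Assumed baseline; the real list may differ.
SAFE_BASELINE = ("PATH", "HOME")


def build_child_env(
    parent: Mapping[str, str],
    passthrough: Iterable[str] = (),
) -> dict[str, str]:
    """Return a filtered copy of the parent environment for a child process.

    Only baseline variables, XDG_* variables, and names explicitly declared
    in `passthrough` survive; everything else, including API keys and
    gateway tokens, is dropped.
    """
    allowed = set(SAFE_BASELINE) | set(passthrough)
    return {
        name: value
        for name, value in parent.items()
        if name in allowed or name.startswith("XDG_")
    }
```

A filter like this narrows what a subprocess inherits; as the section states, it does nothing about code already running inside the agent process.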
### 2.4 In-Process Heuristics
The following components screen or warn about LLM behavior. They
are useful. They are not boundaries.
- The **approval gate** detects common destructive shell patterns
and prompts the operator before execution. Shell is Turing-
complete; a denylist over shell strings is structurally
incomplete. The gate catches cooperative-mode mistakes, not
adversarial output.
- **Output redaction** strips secret-like patterns from display.
A motivated output producer will defeat it.
- **Skills Guard** scans installable skill content for injection
patterns. It is a review aid; the boundary for third-party skills
is operator review before install. Reviewing a skill means
reading its Python code and scripts, not just its SKILL.md
description — skills execute arbitrary Python at import time.
### 2.5 Plugin Trust Model
Plugins load into the agent process and run with full agent
privileges: they can read the same credentials, call the same
tools, register the same hooks, and import the same modules as
anything shipped in-tree. The boundary for third-party plugins is
operator review before install — the same rule as skills (§2.4),
called out separately because plugins are architecturally heavier
and often ship their own background services, network listeners,
and dependencies.
A malicious or buggy plugin is not a vulnerability in Hermes Agent
itself. Bugs in Hermes Agent's plugin-install or plugin-discovery
path that prevent the operator from seeing what they're installing
are in scope under §3.1.
### 2.6 External Surfaces
An **external surface** is any channel outside the local agent
process through which a caller can dispatch agent work, resolve
approvals, or receive agent output. Each surface has its own
authorization model, but the rules below apply uniformly.
**Surfaces in Hermes Agent:**
- **Gateway platform adapters.** Messaging integrations in
`gateway/platforms/` (Telegram, Discord, Slack, email, SMS, etc.)
and analogous adapters shipped as plugins.
- **Network-exposed HTTP surfaces.** The API server adapter, the
dashboard plugin, the kanban plugin's HTTP endpoints, and any
other plugin that binds a listening socket.
- **Editor / IDE adapters.** The ACP adapter (`acp_adapter/`) and
equivalent integrations that accept requests from a local client
process.
- **The TUI gateway (`tui_gateway/`).** JSON-RPC backend for the
Ink terminal UI, reached over local IPC.
**Uniform rules:**
1. **Authorization is required at every surface that crosses a
trust boundary.** For messaging and network HTTP surfaces, the
boundary is the network: authorization means an operator-
configured caller allowlist. For editor and local-IPC surfaces
(ACP, TUI gateway), the boundary is the host's user account:
authorization means relying on OS-level access control (file
permissions, loopback-only binds) and not exposing the surface
beyond the local user without an explicit network auth layer.
2. **An allowlist is required for every enabled network-exposed
adapter.** Adapters must refuse to dispatch agent work, resolve
approvals, or relay output until an allowlist is set. Code paths
that fail open when no allowlist is configured are code bugs in
scope under §3.1.
3. **Session identifiers are routing handles, not authorization
boundaries.** Knowing another caller's session ID does not grant
access to their approvals or output; authorization is always
re-checked against the allowlist (or OS-level equivalent).
4. **Within the authorized set, all callers are equally trusted.**
Hermes Agent does not model per-caller capabilities inside a
single adapter. Operators who need capability separation should
run separate agent instances with separate allowlists.
5. **Binding a local-only surface to a non-loopback interface is a
break-glass operator decision (§3.2).** The dashboard and other
plugin HTTP servers default to loopback; exposing them via
`--host 0.0.0.0` or equivalent makes public-exposure hardening
(§4) the operator's responsibility.
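
Rules 2 and 3 above can be illustrated with a fail-closed check. This is a minimal sketch, not adapter code: `may_dispatch` and `AllowlistNotConfigured` are hypothetical names chosen for the example.

```python
from typing import Collection, Optional


class AllowlistNotConfigured(RuntimeError):
    """Raised when a network-exposed adapter has no allowlist set."""


def may_dispatch(
    caller_id: str,
    session_id: str,
    allowlist: Optional[Collection[str]],
) -> bool:
    """Authorize a caller against the operator-configured allowlist."""
    # Fail closed: a missing or empty allowlist refuses all work
    # instead of letting every caller through.
    if not allowlist:
        raise AllowlistNotConfigured(
            "refusing to dispatch: no caller allowlist configured"
        )
    # session_id is a routing handle only and is deliberately not
    # consulted; authorization depends on caller identity alone.
    return caller_id in allowlist
```

Raising rather than returning `False` when no allowlist exists is a design choice in this sketch: it makes a misconfigured adapter visibly different from a rejected caller.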
---
## 3. Out of Scope (Non-Vulnerabilities)
## 3. Scope
The following scenarios are **not** considered security breaches:
- **Prompt Injection:** Unless it results in a concrete bypass of the approval system, toolset restrictions, or container sandbox.
- **Public Exposure:** Deploying the gateway to the public internet without external authentication or network protection.
- **Trusted State Access:** Reports that require pre-existing write access to `~/.hermes/`, `.env`, or `config.yaml` (these are operator-owned files).
- **Default Behavior:** Host-level command execution when `terminal.backend` is set to `local` — this is the documented default, not a vulnerability.
- **Configuration Trade-offs:** Intentional break-glass settings such as `approvals.mode: "off"` or `terminal.backend: local` in production.
- **Tool-level read/access restrictions:** The agent has unrestricted shell access via the `terminal` tool by design. Reports that a specific tool (e.g., `read_file`) can access a resource are not vulnerabilities if the same access is available through `terminal`. Tool-level deny lists only constitute a meaningful security boundary when paired with equivalent restrictions on the terminal side (as with write operations, where `WRITE_DENIED_PATHS` is paired with the dangerous command approval system).
### 3.1 In Scope
- Escape from a declared OS-level isolation posture (§2.2): an
attacker-controlled code path reaching state that the posture
claimed to confine.
- Unauthorized external-surface access: a caller outside the
configured authorization set (allowlist, or OS-level equivalent
for local-IPC surfaces) dispatching work, receiving output, or
resolving approvals (§2.6).
- Credential exfiltration: leakage of operator credentials or
session authorization material to a destination outside the
trust envelope, via a mechanism that should have prevented it
(environment scrubbing bug, adapter logging, transport error
that flushes credentials to an upstream, etc.).
- Trust-model documentation violations: code behaving contrary to
what this policy, Hermes Agent's own documentation, or reasonable
operator expectations would predict — including cases where
Hermes Agent has documented a stance about how its output should
be rendered by a consuming layer (dashboard, gateway adapter,
file writer, shell) and a code path breaks that stance.
### 3.2 Out of Scope
"Out of scope" here means "not a security vulnerability under this
policy." It does not mean "not worth reporting." Improvements to the
in-process heuristics, hardening ideas, and UX fixes are welcome as
regular issues or pull requests — the approval gate can always catch
more patterns, redaction can always get smarter, adapter behavior
can always be tightened. These items just don't go through the
private-disclosure channel and don't receive advisories.
- **Bypasses of in-process heuristics (§2.4)** — approval-gate regex
bypasses, redaction bypasses, Skills Guard pattern bypasses, and
analogous reports against future heuristics. These components are
not boundaries; defeating them is not a vulnerability under this
policy.
- **Prompt injection per se.** Getting the LLM to emit unusual
output — via injected content, hallucination, training artifacts,
or any other cause — is not itself a vulnerability. "I achieved
prompt injection" without a chained §3.1 outcome is not an
actionable report under this policy.
- **Consequences of a chosen isolation posture.** Reports that a
code path operating within its posture's scope can do what that
posture permits are not vulnerabilities. Examples: shell or file
tools reaching host state under the local backend; code-execution
or MCP subprocesses reaching host state under terminal-backend
isolation that only sandboxes shell; reports whose preconditions
require pre-existing write access to operator-owned configuration
or credential files (those are already inside the trust envelope).
- **Documented break-glass settings.** Operator-selected trade-offs
that explicitly disable protections: `--insecure` and equivalent
flags on the dashboard or other components, disabled approvals,
local backend in production, development profiles that bypass
hermes-home security, and similar. Reports against those
configurations are not vulnerabilities — that's the flag's job.
- **Community-contributed skills and plugins.** Third-party skills
(including the community skills repository) and third-party
plugins are in the operator's review surface, not Hermes Agent's
trust surface (§2.4, §2.5). A skill or plugin doing something
malicious is the expected failure mode of one that wasn't
reviewed, not a vulnerability in Hermes Agent. Bugs in Hermes
Agent's skill-install or plugin-install path that prevent the
operator from seeing what they're installing are in scope under
§3.1.
- **Public exposure without external controls.** Exposing the
gateway or API to the public internet without authentication,
VPN, or firewall.
- **Tool-level read/write restrictions on a posture where shell is
permitted.** If a path is reachable via the terminal tool, reports
that other file tools can reach it add nothing.
---
## 4. Deployment Hardening & Best Practices
## 4. Deployment Hardening
### Filesystem & Network
- **Production sandboxing:** Use container backends (`docker`, `modal`, `daytona`) instead of `local` for untrusted workloads.
- **File permissions:** Run as non-root (the Docker image uses UID 10000); protect credentials with `chmod 600 ~/.hermes/.env` on local installs.
- **Network exposure:** Do not expose the gateway or API server to the public internet without VPN, Tailscale, or firewall protection. SSRF protection is enabled by default across all gateway platform adapters (Telegram, Discord, Slack, Matrix, Mattermost, etc.) with redirect validation. Note: the local terminal backend does not apply SSRF filtering, as it operates within the trusted operator's environment.
The single most important hardening decision is matching isolation
(§2.2) to the trust of the content the agent will ingest. Beyond
that:
### Skills & Supply Chain
- **Skill installation:** Review Skills Guard reports (`tools/skills_guard.py`) before installing third-party skills. The audit log at `~/.hermes/skills/.hub/audit.log` tracks every install and removal.
- **MCP safety:** OSV malware checking runs automatically for `npx`/`uvx` packages before MCP server processes are spawned.
- **CI/CD:** GitHub Actions are pinned to full commit SHAs. The `supply-chain-audit.yml` workflow blocks PRs containing `.pth` files or suspicious `base64`+`exec` patterns.
### Credential Storage
- API keys and tokens belong exclusively in `~/.hermes/.env` — never in `config.yaml` or checked into version control.
- The credential pool system (`agent/credential_pool.py`) handles key rotation and fallback. Credentials are resolved from environment variables, not stored in plaintext databases.
- Run the agent as a non-root user. The supplied container image
does this by default.
- Keep credentials in the operator credential file with tight
permissions, never in the main config, never in version control.
Under OpenShell, use the Provider store rather than an on-disk
credential file.
- Do not expose the gateway or API to the public internet without
VPN, Tailscale, or firewall protection. Under OpenShell, use the
network policy layer to restrict egress.
- Configure a caller allowlist for every network-exposed adapter
you enable (§2.6).
- Review third-party skills and plugins before install (§2.4,
§2.5). For skills, this means reading the Python and scripts,
not just SKILL.md. Skills Guard reports and the install audit
log are the review surface.
- Hermes Agent includes supply-chain guards for MCP server
launches and for dependency / bundled-package changes in CI; see
`CONTRIBUTING.md` for specifics.
---
## 5. Disclosure Process
## 5. Disclosure
- **Coordinated Disclosure:** 90-day window or until a fix is released, whichever comes first.
- **Communication:** All updates occur via the GHSA thread or email correspondence with security@nousresearch.com.
- **Credits:** Reporters are credited in release notes unless anonymity is requested.
- **Coordinated disclosure window:** 90 days from report, or until a
fix is released, whichever comes first.
- **Channel:** the GHSA thread or email correspondence with
security@nousresearch.com.
- **Credit:** reporters are credited in release notes unless
anonymity is requested.
+2 -2
@@ -769,8 +769,8 @@ def _build_patch_mode_content(patch_text: str) -> List[Any]:
old_chunks: list[str] = []
new_chunks: list[str] = []
for hunk in op.hunks:
-old_lines = [line.content for line in hunk.lines if line.prefix in (" ", "-")]
-new_lines = [line.content for line in hunk.lines if line.prefix in (" ", "+")]
+old_lines = [line.content for line in hunk.lines if line.prefix in {" ", "-"}]
+new_lines = [line.content for line in hunk.lines if line.prefix in {" ", "+"}]
if old_lines or new_lines:
old_chunks.append("\n".join(old_lines))
new_chunks.append("\n".join(new_lines))
+1 -1
@@ -47,7 +47,7 @@ def _title_case_slug(value: Optional[str]) -> Optional[str]:
def _parse_dt(value: Any) -> Optional[datetime]:
-if value in (None, ""):
+if value in {None, ""}:
return None
if isinstance(value, (int, float)):
return datetime.fromtimestamp(float(value), tz=timezone.utc)
+20 -4
@@ -35,6 +35,14 @@ def _get_anthropic_sdk():
"""Return the ``anthropic`` SDK module, importing lazily. None if not installed."""
global _anthropic_sdk
if _anthropic_sdk is ...:
try:
from tools.lazy_deps import ensure as _lazy_ensure
_lazy_ensure("provider.anthropic", prompt=False)
except ImportError:
pass
except Exception:
# FeatureUnavailable — fall through to ImportError handling below
pass
try:
import anthropic as _sdk
_anthropic_sdk = _sdk
@@ -1289,13 +1297,21 @@ def convert_tools_to_anthropic(tools: List[Dict]) -> List[Dict]:
continue
if name:
seen_names.add(name)
result.append({
anthropic_tool: Dict[str, Any] = {
"name": name,
"description": fn.get("description", ""),
"input_schema": _normalize_tool_input_schema(
fn.get("parameters", {"type": "object", "properties": {}})
),
})
}
# Forward cache_control marker when present on the OpenAI-format
# tool dict (set by ``mark_tools_for_long_lived_cache``). Anthropic's
# tools array supports cache_control on the last tool to cache the
# entire schema cross-session.
cache_control = t.get("cache_control")
if isinstance(cache_control, dict):
anthropic_tool["cache_control"] = dict(cache_control)
result.append(anthropic_tool)
return result
@@ -1537,7 +1553,7 @@ def convert_messages_to_anthropic(
# downgraded to a spurious text block on the last assistant message.
reasoning_content = m.get("reasoning_content")
_already_has_thinking = any(
-isinstance(b, dict) and b.get("type") in ("thinking", "redacted_thinking")
+isinstance(b, dict) and b.get("type") in {"thinking", "redacted_thinking"}
for b in blocks
)
if isinstance(reasoning_content, str) and not _already_has_thinking:
@@ -1688,7 +1704,7 @@ def convert_messages_to_anthropic(
if isinstance(m["content"], list):
m["content"] = [
b for b in m["content"]
-if not (isinstance(b, dict) and b.get("type") in ("thinking", "redacted_thinking"))
+if not (isinstance(b, dict) and b.get("type") in {"thinking", "redacted_thinking"})
]
prev_blocks = fixed[-1]["content"]
curr_blocks = m["content"]
+182 -33
@@ -175,7 +175,7 @@ def _normalize_aux_provider(provider: Optional[str]) -> str:
# Resolve to the user's actual main provider so named custom providers
# and non-aggregator providers (DeepSeek, Alibaba, etc.) work correctly.
main_prov = (_read_main_provider() or "").strip().lower()
-if main_prov and main_prov not in ("auto", "main", ""):
+if main_prov and main_prov not in {"auto", "main", ""}:
normalized = main_prov
else:
return "custom"
@@ -382,7 +382,7 @@ _AI_GATEWAY_HEADERS = {
# Nous Portal extra_body for product attribution.
# Callers should pass this as extra_body in chat.completions.create()
# when the auxiliary client is backed by Nous Portal.
-NOUS_EXTRA_BODY = {"tags": ["product=hermes-agent"]}
+NOUS_EXTRA_BODY = {"tags": ["product=hermes-agent", "client=aux"]}
# Set at resolve time — True if the auxiliary client points to Nous Portal
auxiliary_is_nous: bool = False
@@ -578,7 +578,7 @@ def _convert_content_for_responses(content: Any) -> Any:
if detail:
entry["detail"] = detail
converted.append(entry)
-elif ptype in ("input_text", "input_image"):
+elif ptype in {"input_text", "input_image"}:
# Already in Responses format — pass through
converted.append(part)
else:
@@ -798,7 +798,7 @@ class _CodexCompletionsAdapter:
if item_type == "message":
for part in (_item_get(item, "content") or []):
ptype = _item_get(part, "type")
if ptype in ("output_text", "text"):
if ptype in {"output_text", "text"}:
text_parts.append(_item_get(part, "text", ""))
elif item_type == "function_call":
tool_calls_raw.append(SimpleNamespace(
@@ -900,6 +900,14 @@ class AsyncCodexAuxiliaryClient:
self.chat = _AsyncCodexChatShim(async_adapter)
self.api_key = sync_wrapper.api_key
self.base_url = sync_wrapper.base_url
# Mirror the sync wrapper's _real_client so cache eviction by leaf
# OpenAI client (e.g. _close_client_on_timeout in #23482) drops
# this async entry too. Without this, sync and async cache entries
# diverge on poisoning: the sync entry is evicted but the async
# entry keeps reusing the closed transport, failing every
# subsequent async aux call with 'Connection error' until the
# gateway restarts.
self._real_client = sync_wrapper._real_client
class _AnthropicCompletionsAdapter:
@@ -1035,6 +1043,9 @@ class AsyncAnthropicAuxiliaryClient:
self.chat = _AsyncAnthropicChatShim(async_adapter)
self.api_key = sync_wrapper.api_key
self.base_url = sync_wrapper.base_url
# See AsyncCodexAuxiliaryClient: mirror _real_client so cache
# eviction on a poisoned underlying client also drops this entry.
self._real_client = sync_wrapper._real_client
def _endpoint_speaks_anthropic_messages(base_url: str) -> bool:
@@ -1830,6 +1841,113 @@ def _get_provider_chain() -> List[tuple]:
]
# ── Auxiliary "recently 402'd" unhealthy-provider cache ────────────────────
#
# When an auxiliary provider returns HTTP 402 (Payment Required / credit
# exhaustion), retrying it on every subsequent aux call is wasteful — the
# provider stays depleted for hours or days, but the chain re-tries it as
# the FIRST entry on every compression/title-gen/session-search call,
# burns ~1 RTT, gets 402 again, then falls back. On a long Discord/LCM
# session that adds up to dozens of doomed 402s.
#
# Solution: when ANY caller observes a payment error against a provider,
# mark it unhealthy for ``_AUX_UNHEALTHY_TTL_SECONDS``. ``_resolve_auto``
# Step-2 and ``_try_payment_fallback`` both consult this cache and skip
# unhealthy entries (logging at most once per minute per provider so the
# user sees what happened). Entries auto-expire so a topped-up account recovers without
# manual intervention.
#
# Failure isolation: the cache is in-process only. A second hermes
# process won't inherit the unhealthy mark — that's intentional, since
# the user might be running two profiles with different OpenRouter keys.
_AUX_UNHEALTHY_TTL_SECONDS = 600 # 10 minutes
_aux_unhealthy_until: Dict[str, float] = {}
_aux_unhealthy_logged_at: Dict[str, float] = {}
# Map provider names that show up in resolved_provider / explicit-config
# back to the chain labels used by _get_provider_chain(). Keep in sync
# with the alias map in _try_payment_fallback below.
_AUX_UNHEALTHY_LABEL_ALIASES = {
"openrouter": "openrouter",
"nous": "nous",
"custom": "local/custom",
"local/custom": "local/custom",
"openai-codex": "openai-codex",
"codex": "openai-codex",
}
def _normalize_chain_label(provider: str) -> str:
"""Normalize a resolved_provider value to a chain label used by
``_get_provider_chain()``. Falls back to the lowercased input for
direct API-key providers (deepseek, alibaba, minimax, etc.) which
each report their own provider name from the api-key chain.
"""
if not provider:
return ""
p = str(provider).strip().lower()
return _AUX_UNHEALTHY_LABEL_ALIASES.get(p, p)
def _mark_provider_unhealthy(provider: str, ttl: Optional[float] = None) -> None:
"""Mark ``provider`` as recently-402'd, hidden from chain iteration
until the TTL expires. Called from the payment-fallback branches in
``call_llm`` and ``async_call_llm`` after a confirmed payment error.
"""
label = _normalize_chain_label(provider)
if not label:
return
expires_at = time.time() + (ttl if ttl is not None else _AUX_UNHEALTHY_TTL_SECONDS)
_aux_unhealthy_until[label] = expires_at
logger.warning(
"Auxiliary: marking %s unhealthy for %ds (payment / credit error). "
"Subsequent auxiliary calls will skip it until %s.",
label,
int(ttl if ttl is not None else _AUX_UNHEALTHY_TTL_SECONDS),
time.strftime("%H:%M:%S", time.localtime(expires_at)),
)
def _is_provider_unhealthy(label: str) -> bool:
"""True iff ``label`` is in the unhealthy cache and the TTL hasn't expired.
Lazily evicts expired entries so the cache stays small.
"""
if not label:
return False
expires_at = _aux_unhealthy_until.get(label)
if expires_at is None:
return False
if time.time() >= expires_at:
_aux_unhealthy_until.pop(label, None)
_aux_unhealthy_logged_at.pop(label, None)
return False
return True
def _log_skip_unhealthy(label: str, task: Optional[str] = None) -> None:
"""Emit a single info-level log per minute when we skip an unhealthy
provider. Avoids spamming the log on bursty sessions while still
giving the user a trail.
"""
now = time.time()
last = _aux_unhealthy_logged_at.get(label, 0.0)
if now - last >= 60:
_aux_unhealthy_logged_at[label] = now
expires_at = _aux_unhealthy_until.get(label, now)
logger.info(
"Auxiliary %s: skipping %s (recently returned payment error, retry in %ds)",
task or "call", label, max(0, int(expires_at - now)),
)
def _reset_aux_unhealthy_cache() -> None:
"""Clear the unhealthy cache. Used by tests and by a future explicit
user trigger (e.g. ``hermes config aux reset``)."""
_aux_unhealthy_until.clear()
_aux_unhealthy_logged_at.clear()
def _is_payment_error(exc: Exception) -> bool:
"""Detect payment/credit/quota exhaustion errors.
@@ -1842,7 +1960,7 @@ def _is_payment_error(exc: Exception) -> bool:
err_lower = str(exc).lower()
# OpenRouter and other providers include "credits" or "afford" in 402 bodies,
# but sometimes wrap them in 429 or other codes.
if status in (402, 429, None):
if status in {402, 429, None}:
if any(kw in err_lower for kw in ("credits", "insufficient funds",
"can only afford", "billing",
"payment required")):
@@ -2001,9 +2119,13 @@ def _evict_cached_client_instance(target: Any) -> bool:
transport after a timeout, broken streaming session, etc.) so the next
auxiliary call rebuilds rather than reusing the dead instance.
Walks ``CodexAuxiliaryClient`` wrappers via their ``_real_client`` so a
timeout that closes the underlying ``OpenAI`` client also evicts the
Codex shim that exposed it.
Walks both sync and async wrappers (``CodexAuxiliaryClient``,
``AnthropicAuxiliaryClient``, ``AsyncCodexAuxiliaryClient``, etc.) via
their ``_real_client`` attribute so a timeout that closes the underlying
``OpenAI`` (or native provider) client evicts every cached shim that
exposed it. Async wrappers must mirror their sync sibling's
``_real_client`` for this to work — otherwise the sync entry is evicted
but the async entry survives and keeps reusing the dead transport.
Returns True when at least one entry was evicted.
"""
@@ -2035,7 +2157,7 @@ def _pool_cache_hint(
if normalized == "auto":
runtime = _normalize_main_runtime(main_runtime)
normalized = _normalize_aux_provider(runtime.get("provider") or _read_main_provider())
if normalized in ("", "auto", "custom"):
if normalized in {"", "auto", "custom"}:
return ""
entry = _peek_pool_entry(normalized)
if entry is None:
@@ -2057,7 +2179,7 @@ def _pool_error_context(exc: Exception) -> Dict[str, Any]:
def _recoverable_pool_provider(resolved_provider: str, client: Any) -> Optional[str]:
"""Infer which provider pool can recover the current auxiliary client."""
normalized = _normalize_aux_provider(resolved_provider)
if normalized not in ("", "auto", "custom"):
if normalized not in {"", "auto", "custom"}:
return normalized
base = str(getattr(client, "base_url", "") or "")
if base_url_host_matches(base, "chatgpt.com"):
@@ -2302,6 +2424,10 @@ def _try_payment_fallback(
for label, try_fn in _get_provider_chain():
if label in skip_chain_labels:
continue
if _is_provider_unhealthy(label):
_log_skip_unhealthy(label, task)
tried.append(f"{label} (unhealthy)")
continue
client, model = try_fn()
if client is not None:
logger.info(
@@ -2370,7 +2496,7 @@ def _resolve_auto(main_runtime: Optional[Dict[str, Any]] = None) -> Tuple[Option
main_provider = runtime_provider or _read_main_provider()
main_model = runtime_model or _read_main_model()
if (main_provider and main_model
and main_provider not in ("auto", "")):
and main_provider not in {"auto", ""}):
resolved_provider = main_provider
explicit_base_url = None
explicit_api_key = None
@@ -2378,21 +2504,34 @@ def _resolve_auto(main_runtime: Optional[Dict[str, Any]] = None) -> Tuple[Option
resolved_provider = "custom"
explicit_base_url = runtime_base_url
explicit_api_key = runtime_api_key or None
client, resolved = resolve_provider_client(
resolved_provider,
main_model,
explicit_base_url=explicit_base_url,
explicit_api_key=explicit_api_key,
api_mode=runtime_api_mode or None,
)
if client is not None:
logger.info("Auxiliary auto-detect: using main provider %s (%s)",
main_provider, resolved or main_model)
return client, resolved or main_model
# Skip Step-1 if the main provider was recently 402'd. The unhealthy
# cache TTL bounds how long we bypass it, so a topped-up account
# recovers automatically. If we tried Step-1 anyway, every aux call
# on a depleted main provider would pay one doomed 402 RTT before
# falling to Step-2.
main_chain_label = _normalize_chain_label(resolved_provider)
if main_chain_label and _is_provider_unhealthy(main_chain_label):
_log_skip_unhealthy(main_chain_label)
else:
client, resolved = resolve_provider_client(
resolved_provider,
main_model,
explicit_base_url=explicit_base_url,
explicit_api_key=explicit_api_key,
api_mode=runtime_api_mode or None,
)
if client is not None:
logger.info("Auxiliary auto-detect: using main provider %s (%s)",
main_provider, resolved or main_model)
return client, resolved or main_model
# ── Step 2: aggregator / fallback chain ──────────────────────────────
tried = []
for label, try_fn in _get_provider_chain():
if _is_provider_unhealthy(label):
_log_skip_unhealthy(label)
tried.append(f"{label} (unhealthy)")
continue
client, model = try_fn()
if client is not None:
if tried:
@@ -3018,7 +3157,7 @@ def resolve_provider_client(
return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
else (client, final_model))
elif pconfig.auth_type in ("oauth_device_code", "oauth_external"):
elif pconfig.auth_type in {"oauth_device_code", "oauth_external"}:
# OAuth providers — route through their specific try functions
if provider == "nous":
return resolve_provider_client("nous", model, async_mode)
@@ -3127,7 +3266,7 @@ def get_available_vision_backends() -> List[str]:
available: List[str] = []
# 1. Active provider — if the user configured a provider, try it first.
main_provider = _read_main_provider()
if main_provider and main_provider not in ("auto", ""):
if main_provider and main_provider not in {"auto", ""}:
if main_provider in _VISION_AUTO_PROVIDER_ORDER:
if _strict_vision_backend_available(main_provider):
available.append(main_provider)
@@ -3173,7 +3312,7 @@ def resolve_vision_provider_client(
if resolved_base_url:
provider_for_base_override = (
requested if requested and requested not in ("", "auto") else "custom"
requested if requested and requested not in {"", "auto"} else "custom"
)
client, final_model = resolve_provider_client(
provider_for_base_override,
@@ -3201,7 +3340,7 @@ def resolve_vision_provider_client(
# 4. Stop
main_provider = _read_main_provider()
main_model = _read_main_model()
if main_provider and main_provider not in ("auto", ""):
if main_provider and main_provider not in {"auto", ""}:
vision_model = _PROVIDER_VISION_MODELS.get(main_provider, main_model)
if main_provider == "nous":
sync_client, default_model = _resolve_strict_vision_backend(
@@ -3887,7 +4026,7 @@ def _build_call_kwargs(
# Provider-specific extra_body
merged_extra = dict(extra_body or {})
if provider == "nous" or auxiliary_is_nous:
merged_extra.setdefault("tags", []).extend(["product=hermes-agent"])
merged_extra.setdefault("tags", []).extend(NOUS_EXTRA_BODY["tags"])
if merged_extra:
kwargs["extra_body"] = merged_extra
@@ -4007,7 +4146,7 @@ def call_llm(
# credentials were found, fail fast instead of silently routing
# through OpenRouter (which causes confusing 404s).
_explicit = (resolved_provider or "").strip().lower()
if _explicit and _explicit not in ("auto", "openrouter", "custom"):
if _explicit and _explicit not in {"auto", "openrouter", "custom"}:
raise RuntimeError(
f"Provider '{_explicit}' is set in config.yaml but no API key "
f"was found. Set the {_explicit.upper()}_API_KEY environment "
@@ -4137,7 +4276,7 @@ def call_llm(
# ── Auth refresh retry ───────────────────────────────────────
if (_is_auth_error(first_err)
and resolved_provider not in ("auto", "", None)
and resolved_provider not in {"auto", "", None}
and not client_is_nous):
if _refresh_provider_credentials(resolved_provider):
logger.info(
@@ -4220,10 +4359,17 @@ def call_llm(
# Only try alternative providers when the user didn't explicitly
# configure this task's provider. Explicit provider = hard constraint;
# auto (the default) = best-effort fallback chain. (#7559)
is_auto = resolved_provider in ("auto", "", None)
is_auto = resolved_provider in {"auto", "", None}
if should_fallback and is_auto:
if _is_payment_error(first_err):
reason = "payment error"
# Resolve the actual provider label (resolved_provider may be
# "auto"; the client's base_url tells us which backend got the
# 402). Mark THAT label unhealthy so subsequent aux calls
# skip it instead of paying another doomed RTT.
_mark_provider_unhealthy(
_recoverable_pool_provider(resolved_provider, client) or resolved_provider
)
elif _is_rate_limit_error(first_err):
reason = "rate limit"
else:
@@ -4369,7 +4515,7 @@ async def async_call_llm(
)
if client is None:
_explicit = (resolved_provider or "").strip().lower()
if _explicit and _explicit not in ("auto", "openrouter", "custom"):
if _explicit and _explicit not in {"auto", "openrouter", "custom"}:
raise RuntimeError(
f"Provider '{_explicit}' is set in config.yaml but no API key "
f"was found. Set the {_explicit.upper()}_API_KEY environment "
@@ -4480,7 +4626,7 @@ async def async_call_llm(
# ── Auth refresh retry (mirrors sync call_llm) ───────────────
if (_is_auth_error(first_err)
and resolved_provider not in ("auto", "", None)
and resolved_provider not in {"auto", "", None}
and not client_is_nous):
if _refresh_provider_credentials(resolved_provider):
logger.info(
@@ -4542,10 +4688,13 @@ async def async_call_llm(
or _is_connection_error(first_err)
or _is_rate_limit_error(first_err)
)
is_auto = resolved_provider in ("auto", "", None)
is_auto = resolved_provider in {"auto", "", None}
if should_fallback and is_auto:
if _is_payment_error(first_err):
reason = "payment error"
_mark_provider_unhealthy(
_recoverable_pool_provider(resolved_provider, client) or resolved_provider
)
elif _is_rate_limit_error(first_err):
reason = "rate limit"
else:
+8 -9
@@ -167,7 +167,7 @@ def _strip_image_parts_from_parts(parts: Any) -> Any:
out.append(part)
continue
ptype = part.get("type")
if ptype in ("image", "image_url", "input_image"):
if ptype in {"image", "image_url", "input_image"}:
had_image = True
out.append({"type": "text", "text": "[screenshot removed to save context]"})
else:
@@ -274,8 +274,8 @@ def _summarize_tool_result(tool_name: str, tool_args: str, tool_content: str) ->
mode = args.get("mode", "replace")
return f"[patch] {mode} in {path} ({content_len:,} chars result)"
if tool_name in ("browser_navigate", "browser_click", "browser_snapshot",
"browser_type", "browser_scroll", "browser_vision"):
if tool_name in {"browser_navigate", "browser_click", "browser_snapshot",
"browser_type", "browser_scroll", "browser_vision"}:
url = args.get("url", "")
ref = args.get("ref", "")
detail = f" {url}" if url else (f" ref={ref}" if ref else "")
@@ -304,7 +304,7 @@ def _summarize_tool_result(tool_name: str, tool_args: str, tool_content: str) ->
code_preview += "..."
return f"[execute_code] `{code_preview}` ({line_count} lines output)"
if tool_name in ("skill_view", "skills_list", "skill_manage"):
if tool_name in {"skill_view", "skills_list", "skill_manage"}:
name = args.get("name", "?")
return f"[{tool_name}] name={name} ({content_len:,} chars)"
@@ -979,13 +979,13 @@ The user has requested that this compaction PRIORITISE preserving all informatio
_status = getattr(e, "status_code", None) or getattr(getattr(e, "response", None), "status_code", None)
_err_str = str(e).lower()
_is_model_not_found = (
_status in (404, 503)
_status in {404, 503}
or "model_not_found" in _err_str
or "does not exist" in _err_str
or "no available channel" in _err_str
)
_is_timeout = (
_status in (408, 429, 502, 504)
_status in {408, 429, 502, 504}
or "timeout" in _err_str
)
# Non-JSON / malformed-body responses from misconfigured providers
@@ -1316,8 +1316,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
# Ensure we protect at least min_tail messages
fallback_cut = n - min_tail
if cut_idx > fallback_cut:
cut_idx = fallback_cut
cut_idx = min(cut_idx, fallback_cut)
# If the token budget would protect everything (small conversations),
# force a cut after the head so compression can still remove middle turns.
@@ -1480,7 +1479,7 @@ The user has requested that this compaction PRIORITISE preserving all informatio
first_tail_role = messages[compress_end].get("role", "user") if compress_end < n_messages else "user"
# Pick a role that avoids consecutive same-role with both neighbors.
# Priority: avoid colliding with head (already committed), then tail.
if last_head_role in ("assistant", "tool"):
if last_head_role in {"assistant", "tool"}:
summary_role = "user"
else:
summary_role = "assistant"
+1 -1
@@ -149,7 +149,7 @@ class PooledCredential:
}
result: Dict[str, Any] = {}
for field_def in fields(self):
if field_def.name in ("provider", "extra"):
if field_def.name in {"provider", "extra"}:
continue
value = getattr(self, field_def.name)
if value is not None or field_def.name in _ALWAYS_EMIT:
+8 -8
@@ -83,7 +83,7 @@ class ClassifiedError:
@property
def is_auth(self) -> bool:
return self.reason in (FailoverReason.auth, FailoverReason.auth_permanent)
return self.reason in {FailoverReason.auth, FailoverReason.auth_permanent}
@@ -688,10 +688,10 @@ def _classify_by_status(
result_fn=result_fn,
)
if status_code in (500, 502):
if status_code in {500, 502}:
return result_fn(FailoverReason.server_error, retryable=True)
if status_code in (503, 529):
if status_code in {503, 529}:
return result_fn(FailoverReason.overloaded, retryable=True)
# Other 4xx — non-retryable
@@ -810,7 +810,7 @@ def _classify_400(
# Responses API (and some providers) use flat body: {"message": "..."}
if not err_body_msg:
err_body_msg = str(body.get("message") or "").strip().lower()
is_generic = len(err_body_msg) < 30 or err_body_msg in ("error", "")
is_generic = len(err_body_msg) < 30 or err_body_msg in {"error", ""}
# Absolute token/message-count thresholds are only a proxy for smaller
# context windows. Large-context sessions can have many messages while
# still being far below their actual token budget.
@@ -841,14 +841,14 @@ def _classify_by_error_code(
"""Classify by structured error codes from the response body."""
code_lower = error_code.lower()
if code_lower in ("resource_exhausted", "throttled", "rate_limit_exceeded"):
if code_lower in {"resource_exhausted", "throttled", "rate_limit_exceeded"}:
return result_fn(
FailoverReason.rate_limit,
retryable=True,
should_rotate_credential=True,
)
if code_lower in ("insufficient_quota", "billing_not_active", "payment_required"):
if code_lower in {"insufficient_quota", "billing_not_active", "payment_required"}:
return result_fn(
FailoverReason.billing,
retryable=False,
@@ -856,14 +856,14 @@ def _classify_by_error_code(
should_fallback=True,
)
if code_lower in ("model_not_found", "model_not_available", "invalid_model"):
if code_lower in {"model_not_found", "model_not_available", "invalid_model"}:
return result_fn(
FailoverReason.model_not_found,
retryable=False,
should_fallback=True,
)
if code_lower in ("context_length_exceeded", "max_tokens_exceeded"):
if code_lower in {"context_length_exceeded", "max_tokens_exceeded"}:
return result_fn(
FailoverReason.context_overflow,
retryable=True,
+1 -1
@@ -77,7 +77,7 @@ def _coerce_content_to_text(content: Any) -> str:
if p.get("type") == "text" and isinstance(p.get("text"), str):
pieces.append(p["text"])
# Multimodal (image_url, etc.) — stub for now; log and skip
elif p.get("type") in ("image_url", "input_audio"):
elif p.get("type") in {"image_url", "input_audio"}:
logger.debug("Dropping multimodal part (not yet supported): %s", p.get("type"))
return "\n".join(pieces)
return str(content)
+6
@@ -945,6 +945,12 @@ class AsyncGeminiNativeClient:
self.api_key = sync_client.api_key
self.base_url = sync_client.base_url
self.chat = _AsyncGeminiChatNamespace(self)
# Expose the underlying sync client as _real_client so the auxiliary
# cache's eviction-by-leaf-client helper (#23482) can find and drop
# this async entry when the sync GeminiNativeClient is poisoned.
# GeminiNativeClient is itself the leaf (no OpenAI client beneath
# it), so we point at the sync_client directly.
self._real_client = sync_client
async def _create_chat_completion(self, **kwargs: Any) -> Any:
stream = bool(kwargs.get("stream"))
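The `_real_client` mirroring in the async wrappers only matters because of how eviction looks entries up. A minimal sketch of that idea follows, assuming a plain dict as the cache; `LeafClient`, `SyncWrapper`, `AsyncWrapper`, and `evict_by_leaf` are hypothetical names for illustration and are not taken from the diff.

```python
from typing import Any, Dict


class LeafClient:
    """Hypothetical stand-in for the underlying provider client."""


class SyncWrapper:
    def __init__(self, leaf: LeafClient) -> None:
        self._real_client = leaf


class AsyncWrapper:
    def __init__(self, sync_wrapper: SyncWrapper) -> None:
        # Mirror the sync sibling's leaf so both entries are found together.
        self._real_client = sync_wrapper._real_client


def evict_by_leaf(cache: Dict[str, Any], target: Any) -> bool:
    """Drop every cache entry that is ``target`` or wraps it via ``_real_client``."""
    doomed = [
        key for key, client in cache.items()
        if client is target or getattr(client, "_real_client", None) is target
    ]
    for key in doomed:
        del cache[key]
    return bool(doomed)


leaf = LeafClient()
sync_wrapper = SyncWrapper(leaf)
cache = {
    "codex:sync": sync_wrapper,
    "codex:async": AsyncWrapper(sync_wrapper),
    "other": SyncWrapper(LeafClient()),
}
evicted = evict_by_leaf(cache, leaf)
```

If `AsyncWrapper` did not copy `_real_client`, the identity check would miss the async entry and it would keep serving the closed transport, which is the failure the comments in the diff describe.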
+4 -4
@@ -76,7 +76,7 @@ def _explicit_aux_vision_override(cfg: Optional[Dict[str, Any]]) -> bool:
base_url = str(vision.get("base_url") or "").strip()
# "auto" / "" / blank = not explicit
if provider in ("", "auto") and not model and not base_url:
if provider in {"", "auto"} and not model and not base_url:
return False
return True
@@ -163,7 +163,7 @@ def _sniff_mime_from_bytes(raw: bytes) -> Optional[str]:
if raw.startswith(b"\xff\xd8\xff"):
return "image/jpeg"
# GIF87a / GIF89a
if raw[:6] in (b"GIF87a", b"GIF89a"):
if raw[:6] in {b"GIF87a", b"GIF89a"}:
return "image/gif"
# WEBP: "RIFF" .... "WEBP"
if len(raw) >= 12 and raw[:4] == b"RIFF" and raw[8:12] == b"WEBP":
@@ -172,9 +172,9 @@ def _sniff_mime_from_bytes(raw: bytes) -> Optional[str]:
if raw.startswith(b"BM"):
return "image/bmp"
# HEIC/HEIF: ftypheic / ftypheix / ftypmif1 / ftypmsf1 etc.
if len(raw) >= 12 and raw[4:8] == b"ftyp" and raw[8:12] in (
if len(raw) >= 12 and raw[4:8] == b"ftyp" and raw[8:12] in {
b"heic", b"heix", b"hevc", b"hevx", b"mif1", b"msf1", b"heim", b"heis",
):
}:
return "image/heic"
return None
+309
@@ -0,0 +1,309 @@
"""CJK/wide-character-aware re-alignment of model-emitted markdown tables.
Models pad markdown tables assuming each character occupies one terminal
cell. CJK glyphs and most emoji render as two cells, so the model's
spacing collapses into drift the moment a table reaches a real terminal:
header pipes line up, but every body row drifts right by N cells per CJK
char.
This module rebuilds row padding using ``wcwidth.wcswidth`` (display
columns), preserving the table's pipes and dashes so it still reads as a
plain-text table in ``strip`` / unrendered display modes. Standard Rich
markdown rendering already aligns CJK correctly inside a wide enough
panel; this helper is for the paths that print the model's text more or
less verbatim.
The helper is deliberately conservative:
* Only contiguous ``| ... |`` blocks with a divider line are rewritten.
* Anything that does not look like a table is passed through unchanged.
* Single-line / mid-stream fragments are left alone; callers buffer
table rows and flush them once the block is complete.
There is a small, intentional caveat: ``wcwidth`` returns ``-1`` for some
emoji-with-variation-selector sequences; we clamp those to
0 so they do not corrupt the column width math. The 1-cell drift on
those specific glyphs is preferable to silently widening every table
that contains one.
"""
from __future__ import annotations
import re
from typing import List
from wcwidth import wcswidth
__all__ = [
"is_table_divider",
"looks_like_table_row",
"realign_markdown_tables",
"split_table_row",
]
_DIVIDER_CELL_RE = re.compile(r"^\s*:?-{3,}:?\s*$")
_MIN_COL_WIDTH = 3 # matches the divider's minimum dash run.
def _disp_width(s: str) -> int:
"""``wcswidth`` clamped to a non-negative integer.
``wcswidth`` returns ``-1`` when it encounters a control char or an
unknown sequence; treat those as zero-width rather than letting a
negative number flow into ``max`` and break the column-width math.
"""
w = wcswidth(s)
return w if w > 0 else 0
def _pad_to_width(s: str, target: int) -> str:
return s + " " * max(0, target - _disp_width(s))
def split_table_row(row: str) -> List[str]:
"""Split ``| a | b | c |`` into ``["a", "b", "c"]`` with trims."""
s = row.strip()
if s.startswith("|"):
s = s[1:]
if s.endswith("|"):
s = s[:-1]
return [c.strip() for c in s.split("|")]
def is_table_divider(row: str) -> bool:
"""True when ``row`` is a markdown table separator line."""
cells = split_table_row(row)
return len(cells) > 1 and all(_DIVIDER_CELL_RE.match(c) for c in cells)
def looks_like_table_row(row: str) -> bool:
"""True when ``row`` could plausibly be a markdown table row.
Used by streaming callers to decide whether to buffer an in-flight
line. We are intentionally permissive here: the realigner itself
only rewrites blocks that are accompanied by a divider, so a false
positive here at most delays the print of one line.
"""
if "|" not in row:
return False
stripped = row.strip()
if not stripped:
return False
# A leading pipe is the strongest signal; without it we still allow
# rows with at least two pipes so models that omit the leading pipe
# don't slip past us.
if stripped.startswith("|"):
return True
return stripped.count("|") >= 2
def _render_block(rows: List[List[str]], available_width: int | None = None) -> List[str]:
"""Render ``rows`` (header + body, divider implied) at uniform widths.
If ``available_width`` is given and the rebuilt horizontal table
would exceed it, fall back to a vertical key-value rendering so
rows do not soft-wrap mid-cell. Terminal soft-wrap destroys
column alignment visually even when the underlying bytes are
perfectly padded, which is exactly the "tables look broken"
user report this code path is meant to address.
"""
ncols = max(len(r) for r in rows)
rows = [r + [""] * (ncols - len(r)) for r in rows]
widths = [
max(_MIN_COL_WIDTH, *(_disp_width(r[c]) for r in rows))
for c in range(ncols)
]
# Total horizontal width for the rendered row:
# `| ` + cell + ` ` for each column, plus the final closing `|`.
horizontal_width = sum(widths) + 3 * ncols + 1
if available_width is not None and horizontal_width > max(available_width, 20):
return _render_vertical(rows, ncols, available_width)
def _row(cells: List[str]) -> str:
return (
"| "
+ " | ".join(_pad_to_width(c, widths[k]) for k, c in enumerate(cells))
+ " |"
)
out = [_row(rows[0])]
out.append("|" + "|".join("-" * (w + 2) for w in widths) + "|")
for r in rows[1:]:
out.append(_row(r))
return out
def _wrap_to_width(text: str, width: int) -> List[str]:
"""Soft-wrap ``text`` at word boundaries to fit ``width`` display cells.
Falls back to hard-breaking the longest word if a single token is
wider than ``width``. Empty input yields a single empty string so
the caller's row count stays predictable.
"""
if width <= 0 or not text:
return [text]
words = text.split()
if not words:
return [""]
lines: List[str] = []
current = ""
current_w = 0
def _hard_break(word: str, w: int) -> List[str]:
out: List[str] = []
buf = ""
bw = 0
for ch in word:
cw = _disp_width(ch) or 1
if bw + cw > w and buf:
out.append(buf)
buf = ch
bw = cw
else:
buf += ch
bw += cw
if buf:
out.append(buf)
return out
for word in words:
ww = _disp_width(word)
if not current:
if ww <= width:
current = word
current_w = ww
else:
pieces = _hard_break(word, width)
lines.extend(pieces[:-1])
current = pieces[-1] if pieces else ""
current_w = _disp_width(current)
continue
if current_w + 1 + ww <= width:
current += " " + word
current_w += 1 + ww
else:
lines.append(current)
if ww <= width:
current = word
current_w = ww
else:
pieces = _hard_break(word, width)
lines.extend(pieces[:-1])
current = pieces[-1] if pieces else ""
current_w = _disp_width(current)
if current:
lines.append(current)
return lines or [""]
def _render_vertical(
rows: List[List[str]], ncols: int, available_width: int
) -> List[str]:
"""Render a too-wide table as vertical ``Header: value`` rows.
Mirrors Claude Code's narrow-terminal fallback in
``MarkdownTable.tsx``: each body row becomes a small block of
``Header: cell-value`` lines (continuation lines indented two
spaces) separated by a thin ``─`` divider between rows. Keeps
every line narrower than ``available_width`` so the terminal does
not soft-wrap mid-cell.
"""
if not rows:
return []
headers = rows[0] + [""] * (ncols - len(rows[0]))
body = rows[1:]
labels = [h or f"Column {i + 1}" for i, h in enumerate(headers)]
sep_width = max(20, min(40, available_width - 2)) if available_width else 30
separator = "─" * sep_width
indent = " "
indent_w = _disp_width(indent)
out: List[str] = []
for ri, row in enumerate(body):
if ri > 0:
out.append(separator)
for ci in range(ncols):
label = labels[ci]
value = row[ci] if ci < len(row) else ""
label_w = _disp_width(label)
first_budget = max(10, available_width - label_w - 2)
cont_budget = max(10, available_width - indent_w)
if not value:
out.append(f"{label}:")
continue
wrapped = _wrap_to_width(value, first_budget)
out.append(f"{label}: {wrapped[0]}")
if len(wrapped) > 1:
# Re-flow continuation text at the wider continuation
# budget — words split across the narrower first-line
# budget should re-pack greedily for the rest.
cont_text = " ".join(wrapped[1:])
for cl in _wrap_to_width(cont_text, cont_budget):
if cl.strip():
out.append(f"{indent}{cl}")
return out
def realign_markdown_tables(text: str, available_width: int | None = None) -> str:
"""Rewrite every ``| ... |`` + divider block with wcwidth-aware padding.
Lines that are not part of a recognised table are returned verbatim,
so this is safe to apply to arbitrary assistant prose.
If ``available_width`` is given (terminal cells available for the
rendered table), tables wider than that are rendered as vertical
key-value pairs instead of a horizontal pipe-bordered grid. This
avoids the terminal soft-wrapping mid-cell, which destroys column
alignment visually even when the bytes are perfectly padded.
"""
if "|" not in text:
return text
lines = text.split("\n")
out: List[str] = []
i = 0
n = len(lines)
while i < n:
line = lines[i]
# A table starts with a header row whose next line is a divider.
if (
"|" in line
and i + 1 < n
and is_table_divider(lines[i + 1])
):
header = split_table_row(line)
body: List[List[str]] = []
j = i + 2
while j < n and "|" in lines[j] and lines[j].strip():
if is_table_divider(lines[j]):
j += 1
continue
body.append(split_table_row(lines[j]))
j += 1
if any(c for c in header) or body:
out.extend(_render_block([header] + body, available_width))
i = j
continue
out.append(line)
i += 1
return "\n".join(out)
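The padding idea behind this module can be shown in a few lines. This is a minimal sketch that swaps `wcwidth.wcswidth` for the standard library's `unicodedata.east_asian_width` as an approximation, so it runs without third-party packages; `display_width` and `pad_row` are hypothetical helpers, not functions from the new file.

```python
import unicodedata
from typing import List


def display_width(text: str) -> int:
    """Approximate terminal cells: wide/fullwidth chars count 2, combining marks 0."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def pad_row(cells: List[str], widths: List[int]) -> str:
    """Pad each cell to its column's display width and join with pipes."""
    padded = [c + " " * (w - display_width(c)) for c, w in zip(cells, widths)]
    return "| " + " | ".join(padded) + " |"


rows = [["Name", "City"], ["Li", "北京"], ["Anna", "Oslo"]]
# Column width is the widest cell in display cells, with a floor of 3
# to match the minimum dash run of a divider cell.
widths = [max(3, *(display_width(r[c]) for r in rows)) for c in range(2)]
table = [pad_row(rows[0], widths)]
table.append("|" + "|".join("-" * (w + 2) for w in widths) + "|")
table.extend(pad_row(r, widths) for r in rows[1:])
```

Padding by `len()` instead of display width would give the CJK row two extra spaces, which is the rightward drift the module docstring describes.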
+2 -2
@@ -470,11 +470,11 @@ class MemoryManager:
accepted = [
p for p in params
if p.kind in (
if p.kind in {
inspect.Parameter.POSITIONAL_ONLY,
inspect.Parameter.POSITIONAL_OR_KEYWORD,
inspect.Parameter.KEYWORD_ONLY,
)
}
]
if len(accepted) >= 4:
return "positional"
+160 -19
@@ -571,7 +571,7 @@ def _extract_pricing(payload: Dict[str, Any]) -> Dict[str, Any]:
pricing: Dict[str, Any] = {}
for target, aliases in alias_map.items():
for alias in aliases:
if alias in normalized and normalized[alias] not in (None, ""):
if alias in normalized and normalized[alias] not in {None, ""}:
pricing[target] = normalized[alias]
break
if pricing:
@@ -1006,6 +1006,79 @@ def query_ollama_num_ctx(model: str, base_url: str, api_key: str = "") -> Option
return None
def _query_ollama_api_show(model: str, base_url: str, api_key: str = "") -> Optional[int]:
"""Query an Ollama server's native ``/api/show`` for context length.
Provider-agnostic: works against ANY Ollama-compatible server regardless
of hostname: local Ollama, Ollama Cloud (``ollama.com``), custom Ollama
hosting behind a reverse proxy, etc. For non-Ollama servers the POST
returns 404/405 quickly; the function handles errors gracefully.
For hosted servers the GGUF ``model_info.*.context_length`` is the
authoritative source: the user can't set their own ``num_ctx``, and the
OpenAI-compat ``/v1/models`` endpoint correctly omits ``context_length``
per the OpenAI schema.
Resolution order for hosted Ollama:
1. ``model_info.*.context_length``: GGUF training max (authoritative)
2. ``parameters`` -> ``num_ctx``: server-side Modelfile override
The order is flipped vs ``query_ollama_num_ctx()`` because local users
control ``num_ctx`` themselves; hosted users can't.
"""
import httpx
server_url = base_url.rstrip("/")
if server_url.endswith("/v1"):
server_url = server_url[:-3]
headers = _auth_headers(api_key)
try:
with httpx.Client(timeout=5.0, headers=headers) as client:
resp = client.post(f"{server_url}/api/show", json={"name": model})
if resp.status_code != 200:
return None
data = resp.json()
# Hosted Ollama: GGUF model_info is the real max — prefer it over
# num_ctx which the Cloud operator may have capped arbitrarily.
model_info = data.get("model_info", {})
for key, value in model_info.items():
if "context_length" in key and isinstance(value, (int, float)):
ctx = int(value)
if ctx >= 1024:
return ctx
# Fall back to num_ctx from Modelfile parameters (rare on Cloud)
params = data.get("parameters", "")
if "num_ctx" in params:
for line in params.split("\n"):
if "num_ctx" in line:
parts = line.strip().split()
if len(parts) >= 2:
try:
ctx = int(parts[-1])
if ctx >= 1024:
return ctx
except ValueError:
pass
except Exception:
pass
return None
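The resolution order in `_query_ollama_api_show` can be exercised without a network call by separating the parsing from the HTTP request. The sketch below is a hypothetical pure helper (`context_from_show_payload` is not in the diff), and the sample payload shape is an assumption based on the keys the function reads. It matches `num_ctx` as the first token of a line, which is slightly stricter than the substring check in the diff.

```python
from typing import Any, Dict, Optional

MIN_PLAUSIBLE_CTX = 1024  # same floor the diff uses to reject junk values


def context_from_show_payload(data: Dict[str, Any]) -> Optional[int]:
    """Hypothetical helper mirroring the hosted-Ollama resolution order."""
    # 1. GGUF metadata: any model_info key containing "context_length".
    for key, value in (data.get("model_info") or {}).items():
        if "context_length" in key and isinstance(value, (int, float)):
            ctx = int(value)
            if ctx >= MIN_PLAUSIBLE_CTX:
                return ctx
    # 2. Modelfile parameters: lines such as "num_ctx 8192".
    params = data.get("parameters") or ""
    for line in params.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "num_ctx":
            try:
                ctx = int(parts[-1])
            except ValueError:
                continue
            if ctx >= MIN_PLAUSIBLE_CTX:
                return ctx
    return None


# Illustrative payload; the exact key names depend on the model architecture.
payload = {
    "model_info": {"llama.context_length": 131072},
    "parameters": "num_ctx 8192\ntemperature 0.7",
}
hosted_ctx = context_from_show_payload(payload)
```

Keeping the parser pure makes it straightforward to unit-test both branches, including the case where `model_info` is absent and only the Modelfile override is available.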
def _model_name_suggests_kimi(model: str) -> bool:
"""Return True if the model name looks like a Kimi-family model.
Catches ``kimi-k2.6``, ``kimi-k2.5``, ``kimi-k2-thinking``,
``moonshotai/Kimi-K2.6``, and similar variants. Used as a guard
against stale OpenRouter metadata that underreports these models
as 32K context when they actually support 262K+.
"""
lower = model.lower()
return lower.startswith("kimi") or "moonshot" in lower
def _query_local_context_length(model: str, base_url: str, api_key: str = "") -> Optional[int]:
"""Query a local server for the model's context length."""
import httpx
@@ -1265,16 +1338,35 @@ def _resolve_nous_context_length(model: str) -> Optional[int]:
with version normalization (dot↔dash).
"""
metadata = fetch_model_metadata() # OpenRouter cache
def _safe_ctx(or_id: str, entry: dict) -> Optional[int]:
"""Return context length, but reject stale 32k values for Kimi models.
Apply the same guard used for the generic OpenRouter path (step 6 in
resolve_context_length) so the Nous portal path does not short-circuit it.
"""
ctx = entry.get("context_length")
if ctx is None:
return None
if ctx <= 32768 and _model_name_suggests_kimi(or_id):
logger.info(
"Rejecting OpenRouter metadata context=%s for %r "
"(Kimi-family underreport, Nous path); falling through to hardcoded defaults",
ctx, or_id,
)
return None
return ctx
# Exact match first
if model in metadata:
return metadata[model].get("context_length")
return _safe_ctx(model, metadata[model])
normalized = _normalize_model_version(model).lower()
for or_id, entry in metadata.items():
bare = or_id.split("/", 1)[1] if "/" in or_id else or_id
if bare.lower() == model.lower() or _normalize_model_version(bare).lower() == normalized:
return entry.get("context_length")
return _safe_ctx(or_id, entry)
# Partial prefix match for cases like gemini-3-flash → gemini-3-flash-preview
# Require match to be at a word boundary (followed by -, :, or end of string)
@@ -1285,7 +1377,7 @@ def _resolve_nous_context_length(model: str) -> Optional[int]:
if candidate.startswith(query) and (
len(candidate) == len(query) or candidate[len(query)] in "-:."
):
return entry.get("context_length")
return _safe_ctx(or_id, entry)
return None
@@ -1307,12 +1399,17 @@ def get_model_context_length(
2. Active endpoint metadata (/models for explicit custom endpoints)
3. Local server query (for local endpoints)
4. Anthropic /v1/models API (API-key users only, not OAuth)
5. OpenRouter live API metadata
6. Nous suffix-match via OpenRouter cache
7. models.dev registry lookup (provider-aware)
8. Thin hardcoded defaults (broad family patterns)
9. Default fallback (256K)
"""
5. Provider-aware lookups (before generic OpenRouter cache):
a. Copilot live /models API
b. Nous suffix-match via OpenRouter cache
c. Codex OAuth /models probe
d. GMI /models endpoint
e. Ollama native /api/show probe (any base_url, provider-agnostic)
f. models.dev registry lookup (with :cloud/-cloud suffix fallback)
6. OpenRouter live API metadata (Kimi-family 32k guard)
7. Hardcoded defaults (broad family patterns, longest-key-first)
8. Local server query (last resort)
9. Default fallback (256K)"""
# 0. Explicit config override — user knows best
if config_context_length is not None and isinstance(config_context_length, int) and config_context_length > 0:
return config_context_length
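The numbered resolution order in the docstring amounts to a "first usable answer wins" chain. The sketch below shows that pattern in isolation; `resolve_first`, the `Resolver` alias, and the sample resolvers are hypothetical, and the numeric fallback is an assumed value since the docstring only says "256K".

```python
from typing import Callable, Iterable, Optional

DEFAULT_FALLBACK_CONTEXT = 256_000  # assumption: the docstring only says "256K"

Resolver = Callable[[str], Optional[int]]


def resolve_first(model: str, resolvers: Iterable[Resolver]) -> int:
    """Return the first truthy context length, else the default fallback."""
    for resolve in resolvers:
        ctx = resolve(model)
        if ctx:
            return ctx
    return DEFAULT_FALLBACK_CONTEXT


# Hypothetical resolvers standing in for cache, endpoint, and registry lookups.
def from_cache(model: str) -> Optional[int]:
    return None


def from_registry(model: str) -> Optional[int]:
    return {"example-model": 131072}.get(model)


known = resolve_first("example-model", [from_cache, from_registry])
unknown = resolve_first("other-model", [from_cache, from_registry])
```

Ordering the list is where the policy lives: moving the provider-aware lookups ahead of the generic OpenRouter metadata, as this change does, only requires reordering the resolvers.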
@@ -1359,6 +1456,14 @@ def get_model_context_length(
model, base_url, f"{cached:,}",
)
_invalidate_cached_context_length(model, base_url)
# Invalidate stale 32k cache entries for Kimi-family models.
elif cached <= 32768 and _model_name_suggests_kimi(model):
logger.info(
"Dropping stale Kimi cache entry %s@%s -> %s (OpenRouter underreport); "
"re-resolving via hardcoded defaults",
model, base_url, f"{cached:,}",
)
_invalidate_cached_context_length(model, base_url)
else:
return cached
@@ -1392,6 +1497,13 @@ def get_model_context_length(
if context_length is not None:
return context_length
if not _is_known_provider_base_url(base_url):
# 2b. Ollama native /api/show — any URL might be an Ollama server
# (local, cloud, or custom hosting). Non-Ollama servers return
# 404/405 quickly. Fall through on failure.
ctx = _query_ollama_api_show(model, base_url, api_key=api_key)
if ctx is not None:
save_context_length(model, base_url, ctx)
return ctx
# 3. Try querying local server directly
if is_local_endpoint(base_url):
local_ctx = _query_local_context_length(model, base_url, api_key=api_key)
@@ -1423,7 +1535,7 @@ def get_model_context_length(
# (e.g. claude-opus-4.6 is 1M on Anthropic but 128K on GitHub Copilot).
# If provider is generic (openrouter/custom/empty), try to infer from URL.
effective_provider = provider
if not effective_provider or effective_provider in ("openrouter", "custom"):
if not effective_provider or effective_provider in {"openrouter", "custom"}:
if base_url:
inferred = _infer_provider_from_url(base_url)
if inferred:
@@ -1433,7 +1545,7 @@ def get_model_context_length(
# This catches account-specific models (e.g. claude-opus-4.6-1m) that
# don't exist in models.dev. For models that ARE in models.dev, this
# returns the provider-enforced limit which is what users can actually use.
if effective_provider in ("copilot", "copilot-acp", "github-copilot"):
if effective_provider in {"copilot", "copilot-acp", "github-copilot"}:
try:
from hermes_cli.models import get_copilot_model_context
ctx = get_copilot_model_context(model, api_key=api_key)
@@ -1461,16 +1573,45 @@ def get_model_context_length(
ctx = _resolve_endpoint_context_length(model, base_url, api_key=api_key)
if ctx is not None:
return ctx
# 5e. Ollama native /api/show probe — runs for ANY provider with a
# base_url, not just ollama-cloud. Ollama-compatible servers expose
# this endpoint regardless of hostname (local Ollama, Ollama Cloud,
# custom Ollama hosting). The OpenAI-compat /v1/models endpoint
# correctly omits context_length per the OpenAI schema, but /api/show
# returns the authoritative GGUF model_info.context_length.
# For non-Ollama servers (OpenAI, Anthropic, etc.), the POST returns
# 404/405 quickly. Results are cached, so the hit is per-model+URL,
# once per hour.
if base_url:
ctx = _query_ollama_api_show(model, base_url, api_key=api_key)
if ctx is not None:
save_context_length(model, base_url, ctx)
return ctx
if effective_provider:
from agent.models_dev import lookup_models_dev_context
ctx = lookup_models_dev_context(effective_provider, model)
if ctx:
return ctx
# 6. OpenRouter live API metadata (provider-unaware fallback)
metadata = fetch_model_metadata()
if model in metadata:
return metadata[model].get("context_length", DEFAULT_FALLBACK_CONTEXT)
# 6. OpenRouter live API metadata: provider-unaware fallback.
# Only consulted when the provider is unknown (no effective_provider),
# because OpenRouter data is community-maintained and can be incorrect
# for models that belong to known providers with curated defaults.
if not effective_provider:
metadata = fetch_model_metadata()
if model in metadata:
or_ctx = metadata[model].get("context_length", DEFAULT_FALLBACK_CONTEXT)
# Guard against stale OpenRouter metadata for Kimi-family models.
if or_ctx == 32768 and _model_name_suggests_kimi(model):
logger.info(
"Rejecting OpenRouter metadata context=%s for %r "
"(Kimi-family underreport); falling through to hardcoded defaults",
or_ctx, model,
)
else:
return or_ctx
# 7. (reserved)
# 8. Hardcoded defaults (fuzzy match — longest key first for specificity)
# Only check `default_model in model` (is the key a substring of the input).
@@ -1533,7 +1674,7 @@ def _count_image_tokens(msg: Dict[str, Any], cost_per_image: int) -> int:
if not isinstance(part, dict):
continue
ptype = part.get("type")
if ptype in ("image", "image_url", "input_image"):
if ptype in {"image", "image_url", "input_image"}:
count += 1
stashed = msg.get("_anthropic_content_blocks") if isinstance(msg, dict) else None
if isinstance(stashed, list):
@@ -1545,7 +1686,7 @@ def _count_image_tokens(msg: Dict[str, Any], cost_per_image: int) -> int:
inner = content.get("content")
if isinstance(inner, list):
for part in inner:
if isinstance(part, dict) and part.get("type") in ("image", "image_url"):
if isinstance(part, dict) and part.get("type") in {"image", "image_url"}:
count += 1
return count * cost_per_image
@@ -1567,7 +1708,7 @@ def _estimate_message_chars(msg: Dict[str, Any]) -> int:
cleaned = []
for part in v:
if isinstance(part, dict):
if part.get("type") in ("image", "image_url", "input_image"):
if part.get("type") in {"image", "image_url", "input_image"}:
cleaned.append({"type": part.get("type"), "image": "[stripped]"})
else:
cleaned.append(part)
+24
@@ -145,7 +145,9 @@ PROVIDER_TO_MODELS_DEV: Dict[str, str] = {
"openai": "openai",
"openai-codex": "openai",
"zai": "zai",
"kimi": "kimi-for-coding",
"kimi-coding": "kimi-for-coding",
"moonshot": "kimi-for-coding",
"stepfun": "stepfun",
"kimi-coding-cn": "kimi-for-coding",
"minimax": "minimax",
@@ -347,6 +349,28 @@ def lookup_models_dev_context(provider: str, model: str) -> Optional[int]:
if ctx:
return ctx
# Suffix-aware fallback: some providers (e.g. ollama-cloud) store
# model IDs with :cloud / -cloud suffixes in models.dev while the
# live API returns bare names. Without this, kimi-k2.6 misses the
# kimi-k2.6:cloud entry and falls through to stale OpenRouter metadata
# reporting 32768 — tripping the 64k minimum-context guard.
# The suffix-stripping in fetch_ollama_cloud_models() handles the
# model-picker UX; this handles the context-length lookup path.
for suffix in (":cloud", "-cloud"):
suffixed_key = model + suffix
entry = models.get(suffixed_key)
if entry:
ctx = _extract_context(entry)
if ctx:
return ctx
# Also try case-insensitive
suffixed_lower = model_lower + suffix
for mid, mdata in models.items():
if mid.lower() == suffixed_lower:
ctx = _extract_context(mdata)
if ctx:
return ctx
return None
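The suffix-aware fallback can be summarised as "try the bare ID, then the ID with each known suffix, case-insensitively". A minimal sketch follows, assuming a flat registry that maps model IDs directly to context lengths; the real `models` mapping holds richer entries read through `_extract_context`, and `lookup_with_suffix` is a hypothetical name.

```python
from typing import Dict, Optional

SUFFIXES = (":cloud", "-cloud")


def lookup_with_suffix(models: Dict[str, int], model: str) -> Optional[int]:
    """Hypothetical stand-in: registry maps model ID -> context length."""
    lowered = {mid.lower(): ctx for mid, ctx in models.items()}
    candidates = (model, *(model + suffix for suffix in SUFFIXES))
    for candidate in candidates:
        # Exact match first, then a case-insensitive match on the same key.
        ctx = models.get(candidate) or lowered.get(candidate.lower())
        if ctx:
            return ctx
    return None


registry = {"kimi-k2.6:cloud": 262144}
bare_hit = lookup_with_suffix(registry, "kimi-k2.6")
```

Building the lowercase index once keeps the case-insensitive pass to a dictionary lookup rather than a scan per suffix.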
+2 -2
@@ -122,7 +122,7 @@ def _repair_schema(node: Any, is_schema: bool = True) -> Any:
# empty, drop it entirely.
if "enum" in repaired and isinstance(repaired["enum"], list):
node_type = repaired.get("type")
if node_type in ("string", "integer", "number", "boolean"):
if node_type in {"string", "integer", "number", "boolean"}:
cleaned = [v for v in repaired["enum"]
if v is not None and v != ""]
if cleaned:
@@ -135,7 +135,7 @@ def _repair_schema(node: Any, is_schema: bool = True) -> Any:
def _fill_missing_type(node: Dict[str, Any]) -> Dict[str, Any]:
"""Infer a reasonable ``type`` if this schema node has none."""
if "type" in node and node["type"] not in (None, ""):
if "type" in node and node["type"] not in {None, ""}:
return node
# Heuristic: presence of ``properties`` → object, ``items`` → array, ``enum``
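One behavioural difference between the tuple and set forms may be worth checking at the two `_repair_schema` call sites: set membership hashes the left-hand operand, so an unhashable value raises `TypeError` where the tuple form simply evaluated to `False`. JSON Schema permits `type` to be a list such as `["string", "null"]`. Whether such nodes can reach these lines is not visible in this hunk, so the sketch below only demonstrates the language behaviour.

```python
node_type = ["string", "null"]  # a list-valued JSON Schema "type"

# Tuple membership compares with ==, so an unhashable operand is fine.
in_tuple = node_type in ("string", "integer", "number", "boolean")

# Set membership hashes the operand first and fails for lists and dicts.
try:
    in_set = node_type in {"string", "integer", "number", "boolean"}
except TypeError as exc:
    in_set = None
    error = str(exc)
```

If list-valued or dict-valued operands are possible, an `isinstance(node_type, str)` guard before the membership test would keep the set form safe.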
+139 -10
@@ -1,15 +1,25 @@
"""Anthropic prompt caching (system_and_3 strategy).
"""Anthropic prompt caching strategies.
Reduces input token costs by ~75% on multi-turn conversations by caching
the conversation prefix. Uses 4 cache_control breakpoints (Anthropic max):
1. System prompt (stable across all turns)
2-4. Last 3 non-system messages (rolling window)
Two layouts:
* ``system_and_3`` (default, used everywhere except the long-lived path):
4 cache_control breakpoints: system prompt + last 3 non-system messages.
All at the same TTL (5m or 1h). Reduces input token costs by ~75% on
multi-turn conversations within a single session.
* ``prefix_and_2`` (Claude on Anthropic / OpenRouter / Nous Portal):
4 breakpoints split across two TTL tiers: tools[-1] (1h) +
stable system prefix (1h) + last 2 non-system messages (5m). The
long-lived prefix is byte-stable across sessions for a given user
config, so every fresh session reads the cached system+tools instead
of re-paying for them. Within-session rolling window shrinks from 3
messages to 2 to free the breakpoint budget.
Pure functions -- no class state, no AIAgent dependency.
"""
import copy
from typing import Any, Dict, List
from typing import Any, Dict, List, Optional
def _apply_cache_marker(msg: dict, cache_marker: dict, native_anthropic: bool = False) -> None:
@@ -38,6 +48,14 @@ def _apply_cache_marker(msg: dict, cache_marker: dict, native_anthropic: bool =
last["cache_control"] = cache_marker
def _build_marker(ttl: str) -> Dict[str, str]:
"""Build a cache_control marker dict for the given TTL ('5m' or '1h')."""
marker: Dict[str, str] = {"type": "ephemeral"}
if ttl == "1h":
marker["ttl"] = "1h"
return marker
def apply_anthropic_cache_control(
api_messages: List[Dict[str, Any]],
cache_ttl: str = "5m",
@@ -45,7 +63,8 @@ def apply_anthropic_cache_control(
) -> List[Dict[str, Any]]:
"""Apply system_and_3 caching strategy to messages for Anthropic models.
Places up to 4 cache_control breakpoints: system prompt + last 3 non-system messages.
Places up to 4 cache_control breakpoints: system prompt + last 3 non-system
messages, all at the same TTL.
Returns:
Deep copy of messages with cache_control breakpoints injected.
@@ -54,9 +73,7 @@ def apply_anthropic_cache_control(
if not messages:
return messages
marker = {"type": "ephemeral"}
if cache_ttl == "1h":
marker["ttl"] = "1h"
marker = _build_marker(cache_ttl)
breakpoints_used = 0
@@ -70,3 +87,115 @@ def apply_anthropic_cache_control(
_apply_cache_marker(messages[idx], marker, native_anthropic=native_anthropic)
return messages
def _mark_system_stable_block(
messages: List[Dict[str, Any]],
long_lived_marker: Dict[str, str],
) -> bool:
"""Mark the *first* content block of the system message with the 1h marker.
The system message is expected to have been split into multiple content
blocks beforehand by the caller: block[0] is the cross-session-stable
prefix, subsequent blocks carry context files + volatile suffix.
Falls back to marking the whole system message as a single block when
the message hasn't been split (preserves correctness on the fallback path).
Returns True when a marker was placed.
"""
if not messages or messages[0].get("role") != "system":
return False
sys_msg = messages[0]
content = sys_msg.get("content")
# Already a list of blocks → mark the first block.
if isinstance(content, list) and content:
first = content[0]
if isinstance(first, dict):
first["cache_control"] = long_lived_marker
return True
return False
# String content (no split) → cannot place a stable-prefix breakpoint
# without changing the byte content. Caller is responsible for
# splitting; if they didn't, fall through to envelope marker so we still
# cache *something* for this turn.
if isinstance(content, str) and content:
sys_msg["content"] = [
{"type": "text", "text": content, "cache_control": long_lived_marker}
]
return True
return False
def apply_anthropic_cache_control_long_lived(
api_messages: List[Dict[str, Any]],
long_lived_ttl: str = "1h",
rolling_ttl: str = "5m",
native_anthropic: bool = False,
) -> List[Dict[str, Any]]:
"""Apply prefix_and_2 caching: long-lived stable prefix + rolling window.
Layout (4 breakpoints total):
* Stable system prefix (block[0]): ``long_lived_ttl`` TTL
* Last 2 non-system messages: ``rolling_ttl`` TTL each
NOTE: this function does NOT mark the tools array. Tools cache_control
is attached separately (see ``mark_tools_for_long_lived_cache``) because
tools live outside the messages list in the API payload.
The caller MUST have split the system message into ordered content
blocks where block[0] is the cross-session-stable portion. If the system
message is still a single string, it is wrapped into a single block and
marked; this is correct, just less effective (the volatile suffix is
not isolated, so the prefix invalidates per-session).
Returns:
Deep copy of messages with cache_control breakpoints injected.
"""
messages = copy.deepcopy(api_messages)
if not messages:
return messages
long_marker = _build_marker(long_lived_ttl)
rolling_marker = _build_marker(rolling_ttl)
placed_prefix = _mark_system_stable_block(messages, long_marker)
# Anthropic allows 4 breakpoints in total. When the system prefix is
# placed it takes one, and tools[-1] (marked separately) takes another,
# leaving 2 for the rolling tail. Without a prefix the tail gets 3.
rolling_budget = 2 if placed_prefix else 3
non_sys = [i for i in range(len(messages)) if messages[i].get("role") != "system"]
for idx in non_sys[-rolling_budget:]:
_apply_cache_marker(messages[idx], rolling_marker, native_anthropic=native_anthropic)
return messages
def mark_tools_for_long_lived_cache(
tools: Optional[List[Dict[str, Any]]],
long_lived_ttl: str = "1h",
) -> Optional[List[Dict[str, Any]]]:
"""Attach cache_control to the last tool in the OpenAI-format tools list.
Anthropic prefix-cache order is ``tools -> system -> messages``. Marking
the last tool dict caches the entire tools array (Anthropic's docs:
"the marker is placed on the last block you want included in the cached
prefix"). Marker is preserved across the OpenAI-wire boundary on
OpenRouter and Nous Portal (which proxies to OpenRouter); on native
Anthropic the marker is forwarded by ``convert_tools_to_anthropic``.
Returns a deep copy of the tools list with the marker attached, or the
input unchanged when tools is empty/None. Pure function: does not
mutate the input.
"""
if not tools:
return tools
out = copy.deepcopy(tools)
last = out[-1]
if isinstance(last, dict):
last["cache_control"] = _build_marker(long_lived_ttl)
return out
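To make the `prefix_and_2` budget concrete, the sketch below places all four breakpoints on a toy payload and counts them. It is a simplified illustration, not the module's code: `sketch_prefix_and_2` and `count_markers` are hypothetical, the system message is assumed to be pre-split into blocks, every message is assumed to carry list content, and the rolling marker is assumed to go on the last content block.

```python
import copy
from typing import Any, Dict, List, Tuple


def build_marker(ttl: str) -> Dict[str, str]:
    """Same shape as _build_marker in the diff."""
    marker = {"type": "ephemeral"}
    if ttl == "1h":
        marker["ttl"] = "1h"
    return marker


def sketch_prefix_and_2(
    tools: List[Dict[str, Any]], messages: List[Dict[str, Any]]
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Simplified layout: 2 long-lived breakpoints + 2 rolling ones."""
    tools, messages = copy.deepcopy(tools), copy.deepcopy(messages)
    tools[-1]["cache_control"] = build_marker("1h")  # breakpoint 1
    messages[0]["content"][0]["cache_control"] = build_marker("1h")  # breakpoint 2
    non_sys = [m for m in messages if m.get("role") != "system"]
    for msg in non_sys[-2:]:  # breakpoints 3 and 4
        msg["content"][-1]["cache_control"] = build_marker("5m")
    return tools, messages


def count_markers(obj: Any) -> int:
    """Count cache_control keys anywhere in a nested payload."""
    if isinstance(obj, dict):
        return int("cache_control" in obj) + sum(count_markers(v) for v in obj.values())
    if isinstance(obj, list):
        return sum(count_markers(v) for v in obj)
    return 0


tools = [{"type": "function", "function": {"name": "a"}},
         {"type": "function", "function": {"name": "b"}}]
messages = [
    {"role": "system", "content": [{"type": "text", "text": "stable"},
                                   {"type": "text", "text": "volatile"}]},
    {"role": "user", "content": [{"type": "text", "text": "q1"}]},
    {"role": "assistant", "content": [{"type": "text", "text": "a1"}]},
    {"role": "user", "content": [{"type": "text", "text": "q2"}]},
]
marked_tools, marked_messages = sketch_prefix_and_2(tools, messages)
total = count_markers(marked_tools) + count_markers(marked_messages)
```

Counting markers across both structures is a cheap invariant to assert in tests, since exceeding the four-breakpoint limit is easy to do once tools and messages are marked by separate functions.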
+1 -1
@@ -64,7 +64,7 @@ _SENSITIVE_BODY_KEYS = frozenset({
# cli.py) or `HERMES_REDACT_SECRETS=false` in ~/.hermes/.env. An opt-out
# warning is logged at gateway and CLI startup so operators see the
# downgrade — see `_log_redaction_status()` in gateway/run.py and cli.py.
_REDACT_ENABLED = os.getenv("HERMES_REDACT_SECRETS", "true").lower() in ("1", "true", "yes", "on")
_REDACT_ENABLED = os.getenv("HERMES_REDACT_SECRETS", "true").lower() in {"1", "true", "yes", "on"}
# Known API key prefixes -- match the prefix + contiguous token chars
_PREFIX_PATTERNS = [
+5 -5
@@ -312,7 +312,7 @@ def _parse_single_entry(
)
matcher = None
if matcher is not None and event not in ("pre_tool_call", "post_tool_call"):
if matcher is not None and event not in {"pre_tool_call", "post_tool_call"}:
logger.warning(
"hooks.%s[%d].matcher=%r will be ignored at runtime — the "
"matcher field is only honored for pre_tool_call / "
@@ -423,7 +423,7 @@ def _make_callback(spec: ShellHookSpec) -> Callable[..., Optional[Dict[str, Any]
def _callback(**kwargs: Any) -> Optional[Dict[str, Any]]:
# Matcher gate — only meaningful for tool-scoped events.
if spec.event in ("pre_tool_call", "post_tool_call"):
if spec.event in {"pre_tool_call", "post_tool_call"}:
if not spec.matches_tool(kwargs.get("tool_name")):
return None
@@ -658,7 +658,7 @@ def _prompt_and_record(
print() # keep the terminal tidy after ^C
return False
if answer in ("y", "yes"):
if answer in {"y", "yes"}:
_record_approval(event, command)
return True
@@ -752,13 +752,13 @@ def _resolve_effective_accept(
if accept_hooks_arg:
return True
env = os.environ.get("HERMES_ACCEPT_HOOKS", "").strip().lower()
if env in ("1", "true", "yes", "on"):
if env in {"1", "true", "yes", "on"}:
return True
cfg_val = cfg.get("hooks_auto_accept", False)
if isinstance(cfg_val, bool):
return cfg_val
if isinstance(cfg_val, str):
return cfg_val.strip().lower() in ("1", "true", "yes", "on")
return cfg_val.strip().lower() in {"1", "true", "yes", "on"}
return False
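The precedence implemented by `_resolve_effective_accept` is CLI flag, then environment variable, then config value. A self-contained sketch of that ordering follows; `resolve_accept` is a hypothetical name, and the environment is passed in as a mapping so the logic can be tested without touching `os.environ`.

```python
import os
from typing import Any, Mapping

_TRUTHY = {"1", "true", "yes", "on"}


def resolve_accept(
    cli_flag: bool,
    cfg: Mapping[str, Any],
    env: Mapping[str, str] = os.environ,
) -> bool:
    """Hypothetical re-statement of the flag > env > config precedence."""
    if cli_flag:
        return True
    if env.get("HERMES_ACCEPT_HOOKS", "").strip().lower() in _TRUTHY:
        return True
    value = cfg.get("hooks_auto_accept", False)
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        return value.strip().lower() in _TRUTHY
    return False  # any other type (for example an int) is not honoured
```

Because each layer can only turn acceptance on, a falsy environment value such as `HERMES_ACCEPT_HOOKS=false` does not override a truthy config entry.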
+1 -1
@@ -261,7 +261,7 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
for scan_dir in dirs_to_scan:
for skill_md in iter_skill_index_files(scan_dir, "SKILL.md"):
if any(part in ('.git', '.github', '.hub', '.archive') for part in skill_md.parts):
if any(part in {'.git', '.github', '.hub', '.archive'} for part in skill_md.parts):
continue
try:
content = skill_md.read_text(encoding='utf-8')
+2 -2
@@ -279,7 +279,7 @@ class ChatCompletionsTransport(ProviderTransport):
_kimi_effort = "medium"
if reasoning_config and isinstance(reasoning_config, dict):
_e = (reasoning_config.get("effort") or "").strip().lower()
if _e in ("low", "medium", "high"):
if _e in {"low", "medium", "high"}:
_kimi_effort = _e
api_kwargs["reasoning_effort"] = _kimi_effort
@@ -294,7 +294,7 @@ class ChatCompletionsTransport(ProviderTransport):
_tokenhub_effort = "high"
if reasoning_config and isinstance(reasoning_config, dict):
_e = (reasoning_config.get("effort") or "").strip().lower()
if _e in ("low", "medium", "high"):
if _e in {"low", "medium", "high"}:
_tokenhub_effort = _e
api_kwargs["reasoning_effort"] = _tokenhub_effort
+1 -1
@@ -795,7 +795,7 @@ class BatchRunner:
conversations = entry.get("conversations", [])
for msg in conversations:
role = msg.get("role") or msg.get("from")
if role in ("user", "human"):
if role in {"user", "human"}:
prompt_text = (msg.get("content") or msg.get("value", "")).strip()
break
+9
@@ -203,6 +203,12 @@ terminal:
# docker_forward_env:
# - "GITHUB_TOKEN"
# - "NPM_TOKEN"
# # Optional: extra flags passed verbatim to docker run (appended after security defaults).
# # Useful for adding capabilities (e.g. apt installs needing SETUID) or custom options.
# # Example: add a Linux capability not included by default
# # docker_extra_args:
# # - "--cap-add"
# # - "SETUID"
# -----------------------------------------------------------------------------
# OPTION 4: Singularity/Apptainer container
@@ -947,6 +953,9 @@ display:
# false: Wait for the full response before rendering
streaming: true
# Show [HH:MM] timestamps on user input and assistant response labels.
# timestamps: false
# ───────────────────────────────────────────────────────────────────────────
# Skin / Theme
# ───────────────────────────────────────────────────────────────────────────
+533 -212
File diff suppressed because it is too large
+7 -8
@@ -664,7 +664,7 @@ def update_job(job_id: str, updates: Dict[str, Any]) -> Optional[Dict[str, Any]]
# None both mean "clear the field" (restore old behaviour).
if "workdir" in updates:
_wd = updates["workdir"]
if _wd in (None, "", False):
if _wd in {None, "", False}:
updates["workdir"] = None
else:
updates["workdir"] = _normalize_workdir(_wd)
@@ -811,7 +811,7 @@ def mark_job_run(job_id: str, success: bool, error: Optional[str] = None,
# schedule quietly goes off. See issue #16265.
if job["next_run_at"] is None:
kind = job.get("schedule", {}).get("kind")
if kind in ("cron", "interval"):
if kind in {"cron", "interval"}:
job["state"] = "error"
if not job.get("last_error"):
job["last_error"] = (
@@ -855,7 +855,7 @@ def advance_next_run(job_id: str) -> bool:
for job in jobs:
if job["id"] == job_id:
kind = job.get("schedule", {}).get("kind")
if kind not in ("cron", "interval"):
if kind not in {"cron", "interval"}:
return False
now = _hermes_now().isoformat()
new_next = compute_next_run(job["schedule"], now)
@@ -909,7 +909,7 @@ def _get_due_jobs_locked() -> List[Dict[str, Any]]:
# next_run_at unset. Without this branch, such jobs are
# silently skipped forever; recompute next_run_at from the
# schedule so they pick up at their next scheduled tick.
if not recovered_next and kind in ("cron", "interval"):
if not recovered_next and kind in {"cron", "interval"}:
recovered_next = compute_next_run(schedule, now.isoformat())
if recovered_next:
recovery_kind = kind
@@ -940,7 +940,7 @@ def _get_due_jobs_locked() -> List[Dict[str, Any]]:
# (gateway was down and missed the window). Fast-forward to
# the next future occurrence instead of firing a stale run.
grace = _compute_grace_seconds(schedule)
if kind in ("cron", "interval") and (now - next_run_dt).total_seconds() > grace:
if kind in {"cron", "interval"} and (now - next_run_dt).total_seconds() > grace:
# Job is past its catch-up grace window — this is a stale missed run.
# Grace scales with schedule period: daily=2h, hourly=30m, 10min=5m.
new_next = compute_next_run(schedule, now.isoformat())
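The comment gives three sample points for the catch-up grace window (daily=2h, hourly=30m, 10min=5m). The body of `_compute_grace_seconds` is not shown in this diff, so the formula below is only an assumption: it is one simple rule, half the period capped at two hours, that happens to reproduce those three figures. `grace_seconds` is a hypothetical name.

```python
TWO_HOURS = 2 * 60 * 60


def grace_seconds(period_seconds: float) -> float:
    """Hypothetical rule: half the schedule period, capped at two hours."""
    return min(period_seconds / 2, TWO_HOURS)


daily = grace_seconds(24 * 60 * 60)
hourly = grace_seconds(60 * 60)
ten_minutes = grace_seconds(10 * 60)
```

Under a rule like this, a job whose `next_run_at` is older than the grace window is fast-forwarded to its next occurrence rather than fired late.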
@@ -1082,9 +1082,8 @@ def rewrite_skill_refs(
new_skills.append(target)
elif name in pruned_set:
dropped.append(name)
else:
if name not in new_skills:
new_skills.append(name)
elif name not in new_skills:
new_skills.append(name)
if not mapped and not dropped:
continue
+1 -1
@@ -754,7 +754,7 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
# shebang: the scripts dir is trusted, but keeping the interpreter
# choice explicit here keeps the allowed surface small and auditable.
suffix = path.suffix.lower()
if suffix in (".sh", ".bash"):
if suffix in {".sh", ".bash"}:
# Resolve bash dynamically so Windows (Git Bash) and Linux/macOS
# all work. On native Windows without Git for Windows installed
# shutil.which returns None — fall back to a clear error rather
+1 -1
@@ -264,7 +264,7 @@ def _parse_hint_result(text: str) -> tuple[int | None, str]:
"""Parse the judge's boxed decision and hint text."""
boxed = _BOXED_RE.findall(text)
score = int(boxed[-1]) if boxed else None
if score not in (1, -1):
if score not in {1, -1}:
score = None
hint_matches = _HINT_RE.findall(text)
hint = hint_matches[-1].strip() if hint_matches else ""
@@ -162,7 +162,7 @@ def _normalize_tar_member_parts(member_name: str) -> list:
):
raise ValueError(f"Unsafe archive member path: {member_name}")
parts = [part for part in posix_path.parts if part not in ("", ".")]
parts = [part for part in posix_path.parts if part not in {"", "."}]
if not parts or any(part == ".." for part in parts):
raise ValueError(f"Unsafe archive member path: {member_name}")
return parts
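The hunk above shows only the tail of the archive-member check. A minimal sketch of the whole idea follows, assuming the elided condition rejects absolute paths; `normalize_member_parts` is a hypothetical name, and the backslash normalisation is an added assumption rather than something visible in the diff.

```python
from pathlib import PurePosixPath
from typing import List


def normalize_member_parts(member_name: str) -> List[str]:
    """Return safe relative path components or raise ValueError."""
    # Assumption: treat backslashes as separators before parsing.
    posix_path = PurePosixPath(member_name.replace("\\", "/"))
    if posix_path.is_absolute():
        raise ValueError(f"Unsafe archive member path: {member_name}")
    parts = [part for part in posix_path.parts if part not in ("", ".")]
    # Reject empty results and any parent-directory traversal.
    if not parts or any(part == ".." for part in parts):
        raise ValueError(f"Unsafe archive member path: {member_name}")
    return parts
```

Returning the validated components, rather than a joined string, lets the caller rebuild the destination path under the extraction root without re-parsing.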
@@ -561,7 +561,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
# --- 5. Verify -- run test suite in the agent's sandbox ---
# Skip verification if the agent produced no meaningful output
only_system_and_user = all(
msg.get("role") in ("system", "user") for msg in result.messages
msg.get("role") in {"system", "user"} for msg in result.messages
)
if result.turns_used == 0 or only_system_and_user:
logger.warning(
@@ -919,7 +919,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
eval_metrics[f"eval/pass_rate_{cat_key}"] = cat_pass_rate
# Store metrics for wandb_log
self.eval_metrics = [(k, v) for k, v in eval_metrics.items()]
self.eval_metrics = list(eval_metrics.items())
# ---- Print summary ----
print(f"\n{'='*60}")
@@ -759,7 +759,7 @@ class YCBenchEvalEnv(HermesAgentBaseEnv):
eval_metrics[f"eval/survival_rate_{key}"] = ps / pt if pt else 0
eval_metrics[f"eval/avg_score_{key}"] = pa
self.eval_metrics = [(k, v) for k, v in eval_metrics.items()]
self.eval_metrics = list(eval_metrics.items())
# --- Print summary ---
print(f"\n{'='*60}")
+1 -1
@@ -571,7 +571,7 @@ class HermesAgentBaseEnv(BaseEnv):
# (e.g., API call failed on turn 1). No point spinning up a Modal sandbox
# just to verify files that were never created.
only_system_and_user = all(
msg.get("role") in ("system", "user") for msg in result.messages
msg.get("role") in {"system", "user"} for msg in result.messages
)
if result.turns_used == 0 or only_system_and_user:
logger.warning(
+1 -1
@@ -179,7 +179,7 @@ class ToolContext:
# Ensure parent directory exists in the sandbox
parent = str(_Path(remote_path).parent)
if parent not in (".", "/"):
if parent not in {".", "/"}:
self.terminal(f"mkdir -p {parent}", timeout=10)
# For small files, single command is fine
+45 -24
@@ -28,9 +28,9 @@ def _coerce_bool(value: Any, default: bool = True) -> bool:
return default
if isinstance(value, str):
lowered = value.strip().lower()
if lowered in ("true", "1", "yes", "on"):
if lowered in {"true", "1", "yes", "on"}:
return True
if lowered in ("false", "0", "no", "off"):
if lowered in {"false", "0", "no", "off"}:
return False
return default
return is_truthy_value(value, default=default)
@@ -317,14 +317,32 @@ class PlatformConfig:
)
# Streaming defaults — single source of truth so both StreamingConfig and
# StreamConsumerConfig agree on the out-of-the-box edit rhythm. Tuned for
# Telegram's ~1 edit/s flood envelope: a touch under 1s lets the cadence
# breathe without bumping into rate limits, and a smaller buffer threshold
# makes short replies feel near-instant in DMs.
DEFAULT_STREAMING_EDIT_INTERVAL: float = 0.8
DEFAULT_STREAMING_BUFFER_THRESHOLD: int = 24
DEFAULT_STREAMING_CURSOR: str = ""
@dataclass
class StreamingConfig:
"""Configuration for real-time token streaming to messaging platforms."""
enabled: bool = False
transport: str = "edit" # "edit" (progressive editMessageText) or "off"
edit_interval: float = 1.0 # Seconds between message edits (Telegram rate-limits at ~1/s)
buffer_threshold: int = 40 # Chars before forcing an edit
cursor: str = "" # Cursor shown during streaming
# Transport selection:
# "auto" — prefer native streaming-draft updates when the platform
# supports them (Telegram sendMessageDraft, Bot API 9.5+);
# fall back to edit-based when not. Recommended.
# "draft" — explicitly request native drafts; falls back to edit when
# the platform/chat doesn't support them.
# "edit" — progressive editMessageText only (legacy behaviour).
# "off" — disable streaming entirely.
transport: str = "auto"
edit_interval: float = DEFAULT_STREAMING_EDIT_INTERVAL
buffer_threshold: int = DEFAULT_STREAMING_BUFFER_THRESHOLD
cursor: str = DEFAULT_STREAMING_CURSOR
# Ported from openclaw/openclaw#72038. When >0, the final edit for
# a long-running streamed response is delivered as a fresh message
# if the original preview has been visible for at least this many
@@ -350,10 +368,14 @@ class StreamingConfig:
return cls()
return cls(
enabled=_coerce_bool(data.get("enabled"), False),
transport=data.get("transport", "edit"),
edit_interval=_coerce_float(data.get("edit_interval"), 1.0),
buffer_threshold=_coerce_int(data.get("buffer_threshold"), 40),
cursor=data.get("cursor", ""),
transport=data.get("transport", "auto"),
edit_interval=_coerce_float(
data.get("edit_interval"), DEFAULT_STREAMING_EDIT_INTERVAL,
),
buffer_threshold=_coerce_int(
data.get("buffer_threshold"), DEFAULT_STREAMING_BUFFER_THRESHOLD,
),
cursor=data.get("cursor", DEFAULT_STREAMING_CURSOR),
fresh_final_after_seconds=_coerce_float(
data.get("fresh_final_after_seconds"), 60.0
),
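Sharing the defaults between the dataclass fields and `from_dict` means a malformed or missing value falls back to the same number in both places. The sketch below illustrates that pattern with a reduced field set; `StreamingSketch` and the two `_coerce_*` helpers are hypothetical stand-ins, since the real helpers are defined outside this hunk.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

DEFAULT_STREAMING_EDIT_INTERVAL: float = 0.8
DEFAULT_STREAMING_BUFFER_THRESHOLD: int = 24


def _coerce_float(value: Any, default: float) -> float:
    """Hypothetical helper: fall back to default on None or bad input."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


def _coerce_int(value: Any, default: int) -> int:
    """Hypothetical helper: fall back to default on None or bad input."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


@dataclass
class StreamingSketch:
    transport: str = "auto"
    edit_interval: float = DEFAULT_STREAMING_EDIT_INTERVAL
    buffer_threshold: int = DEFAULT_STREAMING_BUFFER_THRESHOLD

    @classmethod
    def from_dict(cls, data: Optional[Dict[str, Any]]) -> "StreamingSketch":
        if not data:
            return cls()
        return cls(
            transport=data.get("transport", "auto"),
            edit_interval=_coerce_float(
                data.get("edit_interval"), DEFAULT_STREAMING_EDIT_INTERVAL
            ),
            buffer_threshold=_coerce_int(
                data.get("buffer_threshold"), DEFAULT_STREAMING_BUFFER_THRESHOLD
            ),
        )
```

With module-level constants, changing the out-of-the-box cadence is a one-line edit, and an empty config produces an object equal to the plain constructor call.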
@@ -588,8 +610,7 @@ class GatewayConfig:
try:
session_store_max_age_days = int(data.get("session_store_max_age_days", 90))
if session_store_max_age_days < 0:
session_store_max_age_days = 0
session_store_max_age_days = max(session_store_max_age_days, 0)
except (TypeError, ValueError):
session_store_max_age_days = 90
@@ -778,7 +799,7 @@ def load_gateway_config() -> GatewayConfig:
bridged["group_allow_admin_from"] = platform_cfg["group_allow_admin_from"]
if "group_user_allowed_commands" in platform_cfg:
bridged["group_user_allowed_commands"] = platform_cfg["group_user_allowed_commands"]
if plat in (Platform.DISCORD, Platform.SLACK) and "channel_skill_bindings" in platform_cfg:
if plat in {Platform.DISCORD, Platform.SLACK} and "channel_skill_bindings" in platform_cfg:
bridged["channel_skill_bindings"] = platform_cfg["channel_skill_bindings"]
if "channel_prompts" in platform_cfg:
channel_prompts = platform_cfg["channel_prompts"]
@@ -1158,7 +1179,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
# Reply threading mode for Telegram (off/first/all)
telegram_reply_mode = os.getenv("TELEGRAM_REPLY_TO_MODE", "").lower()
if telegram_reply_mode in ("off", "first", "all"):
if telegram_reply_mode in {"off", "first", "all"}:
if Platform.TELEGRAM not in config.platforms:
config.platforms[Platform.TELEGRAM] = PlatformConfig()
config.platforms[Platform.TELEGRAM].reply_to_mode = telegram_reply_mode
@@ -1199,14 +1220,14 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
# Reply threading mode for Discord (off/first/all)
discord_reply_mode = os.getenv("DISCORD_REPLY_TO_MODE", "").lower()
if discord_reply_mode in ("off", "first", "all"):
if discord_reply_mode in {"off", "first", "all"}:
if Platform.DISCORD not in config.platforms:
config.platforms[Platform.DISCORD] = PlatformConfig()
config.platforms[Platform.DISCORD].reply_to_mode = discord_reply_mode
# WhatsApp (typically uses different auth mechanism)
whatsapp_enabled = os.getenv("WHATSAPP_ENABLED", "").lower() in ("true", "1", "yes")
whatsapp_disabled_explicitly = os.getenv("WHATSAPP_ENABLED", "").lower() in ("false", "0", "no")
whatsapp_enabled = os.getenv("WHATSAPP_ENABLED", "").lower() in {"true", "1", "yes"}
whatsapp_disabled_explicitly = os.getenv("WHATSAPP_ENABLED", "").lower() in {"false", "0", "no"}
if Platform.WHATSAPP in config.platforms:
# YAML config exists — respect explicit disable
wa_cfg = config.platforms[Platform.WHATSAPP]
@@ -1264,7 +1285,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.SIGNAL].extra.update({
"http_url": signal_url,
"account": signal_account,
"ignore_stories": os.getenv("SIGNAL_IGNORE_STORIES", "true").lower() in ("true", "1", "yes"),
"ignore_stories": os.getenv("SIGNAL_IGNORE_STORIES", "true").lower() in {"true", "1", "yes"},
})
signal_home = os.getenv("SIGNAL_HOME_CHANNEL")
if signal_home and Platform.SIGNAL in config.platforms:
@@ -1313,7 +1334,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
matrix_password = os.getenv("MATRIX_PASSWORD", "")
if matrix_password:
config.platforms[Platform.MATRIX].extra["password"] = matrix_password
matrix_e2ee = os.getenv("MATRIX_ENCRYPTION", "").lower() in ("true", "1", "yes")
matrix_e2ee = os.getenv("MATRIX_ENCRYPTION", "").lower() in {"true", "1", "yes"}
config.platforms[Platform.MATRIX].extra["encryption"] = matrix_e2ee
matrix_device_id = os.getenv("MATRIX_DEVICE_ID", "")
if matrix_device_id:
@@ -1378,7 +1399,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
)
# API Server
api_server_enabled = os.getenv("API_SERVER_ENABLED", "").lower() in ("true", "1", "yes")
api_server_enabled = os.getenv("API_SERVER_ENABLED", "").lower() in {"true", "1", "yes"}
api_server_key = os.getenv("API_SERVER_KEY", "")
api_server_cors_origins = os.getenv("API_SERVER_CORS_ORIGINS", "")
api_server_port = os.getenv("API_SERVER_PORT")
@@ -1405,7 +1426,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.API_SERVER].extra["model_name"] = api_server_model_name
# Webhook platform
webhook_enabled = os.getenv("WEBHOOK_ENABLED", "").lower() in ("true", "1", "yes")
webhook_enabled = os.getenv("WEBHOOK_ENABLED", "").lower() in {"true", "1", "yes"}
webhook_port = os.getenv("WEBHOOK_PORT")
webhook_secret = os.getenv("WEBHOOK_SECRET", "")
if webhook_enabled:
@@ -1421,11 +1442,11 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.WEBHOOK].extra["secret"] = webhook_secret
# Microsoft Graph webhook platform
msgraph_webhook_enabled = os.getenv("MSGRAPH_WEBHOOK_ENABLED", "").lower() in (
msgraph_webhook_enabled = os.getenv("MSGRAPH_WEBHOOK_ENABLED", "").lower() in {
"true",
"1",
"yes",
)
}
msgraph_webhook_port = os.getenv("MSGRAPH_WEBHOOK_PORT")
msgraph_webhook_client_state = os.getenv("MSGRAPH_WEBHOOK_CLIENT_STATE", "")
msgraph_webhook_resources = os.getenv("MSGRAPH_WEBHOOK_ACCEPTED_RESOURCES", "")
@@ -1619,7 +1640,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
"webhook_host": os.getenv("BLUEBUBBLES_WEBHOOK_HOST", "127.0.0.1"),
"webhook_port": int(os.getenv("BLUEBUBBLES_WEBHOOK_PORT", "8645")),
"webhook_path": os.getenv("BLUEBUBBLES_WEBHOOK_PATH", "/bluebubbles-webhook"),
"send_read_receipts": os.getenv("BLUEBUBBLES_SEND_READ_RECEIPTS", "true").lower() in ("true", "1", "yes"),
"send_read_receipts": os.getenv("BLUEBUBBLES_SEND_READ_RECEIPTS", "true").lower() in {"true", "1", "yes"},
})
bluebubbles_home = os.getenv("BLUEBUBBLES_HOME_CHANNEL")
if bluebubbles_home and Platform.BLUEBUBBLES in config.platforms:
+4 -4
@@ -81,7 +81,7 @@ _TIER_MINIMAL = {
_PLATFORM_DEFAULTS: dict[str, dict[str, Any]] = {
# Tier 1 — full edit support, personal/team use
"telegram": _TIER_HIGH,
"telegram": {**_TIER_HIGH, "tool_progress": "new"},
"discord": _TIER_HIGH,
# Tier 2 — edit support, often customer/workspace channels
@@ -190,13 +190,13 @@ def _normalise(setting: str, value: Any) -> Any:
if value is True:
return "all"
return str(value).lower()
if setting in ("show_reasoning", "streaming"):
if setting in {"show_reasoning", "streaming"}:
if isinstance(value, str):
return value.lower() in ("true", "1", "yes", "on")
return value.lower() in {"true", "1", "yes", "on"}
return bool(value)
if setting == "cleanup_progress":
if isinstance(value, str):
return value.lower() in ("true", "1", "yes", "on")
return value.lower() in {"true", "1", "yes", "on"}
return bool(value)
if setting == "tool_preview_length":
try:
+4 -4
@@ -449,7 +449,7 @@ if AIOHTTP_AVAILABLE:
@web.middleware
async def body_limit_middleware(request, handler):
"""Reject overly large request bodies early based on Content-Length."""
if request.method in ("POST", "PUT", "PATCH"):
if request.method in {"POST", "PUT", "PATCH"}:
cl = request.headers.get("Content-Length")
if cl is not None:
try:
@@ -646,7 +646,7 @@ class APIServerAdapter(BasePlatformAdapter):
try:
from hermes_cli.profiles import get_active_profile_name
profile = get_active_profile_name()
if profile and profile not in ("default", "custom"):
if profile and profile not in {"default", "custom"}:
return profile
except Exception:
pass
@@ -1003,7 +1003,7 @@ class APIServerAdapter(BasePlatformAdapter):
system_prompt = content
else:
system_prompt = system_prompt + "\n" + content
elif role in ("user", "assistant"):
elif role in {"user", "assistant"}:
try:
content = _normalize_multimodal_content(raw_content)
except ValueError as exc:
@@ -2381,7 +2381,7 @@ class APIServerAdapter(BasePlatformAdapter):
if cron_err:
return cron_err
try:
include_disabled = request.query.get("include_disabled", "").lower() in ("true", "1")
include_disabled = request.query.get("include_disabled", "").lower() in {"true", "1"}
jobs = _cron_list(include_disabled=include_disabled)
return web.json_response({"jobs": jobs})
except Exception as e:
+56 -3
@@ -560,7 +560,7 @@ def _looks_like_image(data: bytes) -> bool:
return True
if data[:3] == b"\xff\xd8\xff":
return True
if data[:6] in (b"GIF87a", b"GIF89a"):
if data[:6] in {b"GIF87a", b"GIF89a"}:
return True
if data[:2] == b"BM":
return True
@@ -859,7 +859,7 @@ def cache_document_from_bytes(data: bytes, filename: str) -> str:
# Sanitize: strip directory components, null bytes, and control characters
safe_name = Path(filename).name if filename else "document"
safe_name = safe_name.replace("\x00", "").strip()
if not safe_name or safe_name in (".", ".."):
if not safe_name or safe_name in {".", ".."}:
safe_name = "document"
cached_name = f"doc_{uuid.uuid4().hex[:12]}_{safe_name}"
filepath = cache_dir / cached_name
@@ -1035,6 +1035,13 @@ class SendResult:
error: Optional[str] = None
raw_response: Any = None
retryable: bool = False # True for transient connection errors — base will retry automatically
# When the adapter had to split an oversized payload across multiple
# platform messages (e.g. Telegram edit_message overflow split-and-deliver),
# ``message_id`` is the LAST visible message id (so subsequent edits target
# the most recent chunk) and these are the additional message ids that
# made up the full payload, in send order. Empty tuple for the common
# single-message case.
continuation_message_ids: tuple = ()
class EphemeralReply(str):
@@ -1320,6 +1327,52 @@ class BasePlatformAdapter(ABC):
"""
return len
def supports_draft_streaming(
self,
chat_type: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> bool:
"""Whether this adapter supports native streaming-draft updates.
Telegram Bot API 9.5 introduced ``sendMessageDraft``, which renders an
animated streaming preview as the bot calls it repeatedly with the
same ``draft_id`` and growing text. Adapters that implement
``send_draft`` should return True here for the chat types where the
platform supports it (Telegram restricts drafts to private DMs).
Default implementation returns False. Stream consumers fall back to
the edit-based path (``send`` + ``edit_message``) when this returns
False or when ``send_draft`` raises.
"""
return False
async def send_draft(
self,
chat_id: str,
draft_id: int,
content: str,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Send or update an animated streaming-draft preview.
Reuse the same ``draft_id`` (any non-zero int) across consecutive
calls within a single response so the platform animates the preview
rather than re-creating it. Different responses must use different
``draft_id`` values within the same chat to avoid animating over a
prior bubble.
Drafts have no message_id and cannot be edited, replied to, or
deleted via normal message APIs. When the response finishes, the
caller delivers the final answer as a regular ``send`` and the
draft preview clears naturally on the client.
Default implementation raises NotImplementedError; adapters that
also return True from :meth:`supports_draft_streaming` must override.
"""
raise NotImplementedError(
f"{type(self).__name__} does not implement send_draft"
)
@property
def has_fatal_error(self) -> bool:
return self._fatal_error_message is not None
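Note on the two docstrings above: they describe a contract in which a stream consumer tries `send_draft` when `supports_draft_streaming` returns True and falls back to the edit-based path when it returns False or when `send_draft` raises. A rough sketch of that fallback follows. `FakeAdapter`, `edit_preview`, and `push_preview` are hypothetical stand-ins for illustration, not the real adapter or consumer code.

```python
import asyncio
from typing import Optional


class FakeAdapter:
    """Hypothetical stand-in for a platform adapter (not the real class)."""

    def __init__(self, drafts: bool, draft_fails: bool = False) -> None:
        self.drafts = drafts
        self.draft_fails = draft_fails
        self.calls: list[str] = []

    def supports_draft_streaming(self, chat_type: Optional[str] = None) -> bool:
        return self.drafts

    async def send_draft(self, chat_id: str, draft_id: int, content: str) -> None:
        if self.draft_fails:
            raise RuntimeError("draft rejected")
        self.calls.append(f"draft:{content}")

    async def edit_preview(self, chat_id: str, content: str) -> None:
        # Assumed placeholder for the send + edit_message path.
        self.calls.append(f"edit:{content}")


async def push_preview(
    adapter: FakeAdapter,
    chat_id: str,
    draft_id: int,
    content: str,
    chat_type: Optional[str] = None,
) -> str:
    """Try a native draft first; fall back to the edit path on any failure."""
    if adapter.supports_draft_streaming(chat_type):
        try:
            await adapter.send_draft(chat_id, draft_id, content)
            return "draft"
        except Exception:
            pass  # fall through to the edit-based path
    await adapter.edit_preview(chat_id, content)
    return "edit"
```

The same `draft_id` would be reused for every preview update within one response, as the `send_draft` docstring requires.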
@@ -2740,7 +2793,7 @@ class BasePlatformAdapter(ABC):
# and preserve ordering of queued follow-ups. Route those
# through the dedicated handoff path that serializes
# cancellation + runner response + pending drain.
if cmd in ("stop", "new", "reset"):
if cmd in {"stop", "new", "reset"}:
try:
await self._dispatch_active_session_command(event, session_key, cmd)
except Exception as e:
+1 -1
@@ -223,7 +223,7 @@ class BlueBubblesAdapter(BasePlatformAdapter):
def _webhook_url(self) -> str:
"""Compute the external webhook URL for BlueBubbles registration."""
host = self.webhook_host
if host in ("0.0.0.0", "127.0.0.1", "localhost", "::"):
if host in {"0.0.0.0", "127.0.0.1", "localhost", "::"}:
host = "localhost"
return f"http://{host}:{self.webhook_port}{self.webhook_path}"
+2 -2
@@ -353,9 +353,9 @@ class DingTalkAdapter(BasePlatformAdapter):
configured = self.config.extra.get("require_mention")
if configured is not None:
if isinstance(configured, str):
return configured.lower() in ("true", "1", "yes", "on")
return configured.lower() in {"true", "1", "yes", "on"}
return bool(configured)
return os.getenv("DINGTALK_REQUIRE_MENTION", "false").lower() in ("true", "1", "yes", "on")
return os.getenv("DINGTALK_REQUIRE_MENTION", "false").lower() in {"true", "1", "yes", "on"}
def _dingtalk_free_response_chats(self) -> Set[str]:
raw = self.config.extra.get("free_response_chats")
+41 -15
@@ -86,8 +86,32 @@ def _clean_discord_id(entry: str) -> str:
def check_discord_requirements() -> bool:
"""Check if Discord dependencies are available."""
return DISCORD_AVAILABLE
"""Check if Discord dependencies are available.
Lazy-installs discord.py via ``tools.lazy_deps.ensure("platform.discord")``
on first call if not present. After successful install, re-binds module
globals so ``DISCORD_AVAILABLE`` becomes True.
"""
global DISCORD_AVAILABLE, discord, DiscordMessage, Intents, commands
if DISCORD_AVAILABLE:
return True
try:
from tools.lazy_deps import ensure as _lazy_ensure
_lazy_ensure("platform.discord", prompt=False)
except Exception:
return False
try:
import discord as _discord
from discord import Message as _DM, Intents as _Intents
from discord.ext import commands as _commands
except ImportError:
return False
discord = _discord
DiscordMessage = _DM
Intents = _Intents
commands = _commands
DISCORD_AVAILABLE = True
return True
def _build_allowed_mentions():
@@ -115,7 +139,7 @@ def _build_allowed_mentions():
raw = os.getenv(name, "").strip().lower()
if not raw:
return default
return raw in ("true", "1", "yes", "on")
return raw in {"true", "1", "yes", "on"}
return discord.AllowedMentions(
everyone=_b("DISCORD_ALLOW_MENTION_EVERYONE", False),
@@ -708,7 +732,7 @@ class DiscordAdapter(BasePlatformAdapter):
# Ignore Discord system messages (thread renames, pins, member joins, etc.)
# Allow both default and reply types — replies have a distinct MessageType.
if message.type not in (discord.MessageType.default, discord.MessageType.reply):
if message.type not in {discord.MessageType.default, discord.MessageType.reply}:
return
# Bot message filtering (DISCORD_ALLOW_BOTS):
@@ -769,7 +793,7 @@ class DiscordAdapter(BasePlatformAdapter):
# answer regardless of who is mentioned.
_ignore_no_mention = os.getenv(
"DISCORD_IGNORE_NO_MENTION", "true"
).lower() in ("true", "1", "yes")
).lower() in {"true", "1", "yes"}
if _ignore_no_mention and not _self_mentioned and not _other_bots_mentioned:
_channel_id = str(message.channel.id)
_parent_id = None
@@ -1317,7 +1341,7 @@ class DiscordAdapter(BasePlatformAdapter):
def _reactions_enabled(self) -> bool:
"""Check if message reactions are enabled via config/env."""
return os.getenv("DISCORD_REACTIONS", "true").lower() not in ("false", "0", "no")
return os.getenv("DISCORD_REACTIONS", "true").lower() not in {"false", "0", "no"}
async def on_processing_start(self, event: MessageEvent) -> None:
"""Add an in-progress reaction for normal Discord message events."""
@@ -2697,6 +2721,8 @@ class DiscordAdapter(BasePlatformAdapter):
await asyncio.sleep(8)
except asyncio.CancelledError:
pass
finally:
self._typing_tasks.pop(chat_id, None)
self._typing_tasks[chat_id] = asyncio.create_task(_typing_loop())
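Note on the `finally` clause added above: it makes the typing loop remove its own entry from the task registry when it exits, including on cancellation. A small self-contained sketch of that cleanup follows. The module-level `typing_tasks` dict, the short sleep interval, and `demo` are illustrative assumptions rather than the adapter's actual code.

```python
import asyncio

typing_tasks: dict[str, asyncio.Task] = {}


async def typing_loop(chat_id: str) -> None:
    try:
        while True:
            # The real loop would send a typing indicator here.
            await asyncio.sleep(0.01)
    except asyncio.CancelledError:
        pass  # cancellation is the normal way to stop the loop
    finally:
        # Self-cleanup: drop the registry entry however the loop ends.
        typing_tasks.pop(chat_id, None)


async def demo() -> bool:
    typing_tasks["c1"] = asyncio.create_task(typing_loop("c1"))
    await asyncio.sleep(0)  # let the task start so its try block is entered
    task = typing_tasks["c1"]
    task.cancel()
    await task  # the loop swallows the cancellation and returns normally
    return "c1" in typing_tasks
```

In this sketch `demo()` returns False, since the entry is gone once the cancelled task finishes. A task cancelled before it ever starts would not run its `finally` block, which is why the sketch yields once before cancelling.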
@@ -3135,9 +3161,9 @@ class DiscordAdapter(BasePlatformAdapter):
# UX so users don't see commands they can't invoke. Off by default
# to preserve the slash UX for deployments that intentionally allow
# everyone in the guild.
if os.getenv("DISCORD_HIDE_SLASH_COMMANDS", "false").strip().lower() in (
if os.getenv("DISCORD_HIDE_SLASH_COMMANDS", "false").strip().lower() in {
"true", "1", "yes", "on",
):
}:
self._apply_owner_only_visibility(tree)
def _apply_owner_only_visibility(self, tree) -> None:
@@ -3524,9 +3550,9 @@ class DiscordAdapter(BasePlatformAdapter):
configured = self.config.extra.get("require_mention")
if configured is not None:
if isinstance(configured, str):
return configured.lower() not in ("false", "0", "no", "off")
return configured.lower() not in {"false", "0", "no", "off"}
return bool(configured)
return os.getenv("DISCORD_REQUIRE_MENTION", "true").lower() not in ("false", "0", "no", "off")
return os.getenv("DISCORD_REQUIRE_MENTION", "true").lower() not in {"false", "0", "no", "off"}
def _discord_free_response_channels(self) -> set:
"""Return Discord channel IDs where no bot mention is required.
@@ -3722,7 +3748,7 @@ class DiscordAdapter(BasePlatformAdapter):
return None
# DMs, voice channels, and existing threads can't host child threads.
if isinstance(parent, getattr(discord, "DMChannel", tuple())):
if isinstance(parent, getattr(discord, "DMChannel", ())):
logger.info(
"[%s] Handoff thread: parent %s is a DM; threads not supported here",
self.name, parent_chat_id,
@@ -4198,7 +4224,7 @@ class DiscordAdapter(BasePlatformAdapter):
no_thread_channels_raw = os.getenv("DISCORD_NO_THREAD_CHANNELS", "")
no_thread_channels = {ch.strip() for ch in no_thread_channels_raw.split(",") if ch.strip()}
skip_thread = bool(channel_ids & no_thread_channels)
auto_thread = os.getenv("DISCORD_AUTO_THREAD", "true").lower() in ("true", "1", "yes")
auto_thread = os.getenv("DISCORD_AUTO_THREAD", "true").lower() in {"true", "1", "yes"}
is_reply_message = getattr(message, "type", None) == discord.MessageType.reply
if auto_thread and not skip_thread and not is_voice_linked_channel and not is_reply_message:
thread = await self._auto_create_thread(message)
@@ -4280,7 +4306,7 @@ class DiscordAdapter(BasePlatformAdapter):
try:
# Determine extension from content type (image/png -> .png)
ext = "." + content_type.split("/")[-1].split(";")[0]
if ext not in (".jpg", ".jpeg", ".png", ".gif", ".webp"):
if ext not in {".jpg", ".jpeg", ".png", ".gif", ".webp"}:
ext = ".jpg"
cached_path = await self._cache_discord_image(att, ext)
media_urls.append(cached_path)
@@ -4294,7 +4320,7 @@ class DiscordAdapter(BasePlatformAdapter):
elif content_type.startswith("audio/"):
try:
ext = "." + content_type.split("/")[-1].split(";")[0]
if ext not in (".ogg", ".mp3", ".wav", ".webm", ".m4a"):
if ext not in {".ogg", ".mp3", ".wav", ".webm", ".m4a"}:
ext = ".ogg"
cached_path = await self._cache_discord_audio(att, ext)
media_urls.append(cached_path)
@@ -4337,7 +4363,7 @@ class DiscordAdapter(BasePlatformAdapter):
logger.info("[Discord] Cached user document: %s", cached_path)
# Inject text content for plain-text documents (capped at 100 KB)
MAX_TEXT_INJECT_BYTES = 100 * 1024
if ext in (".md", ".txt", ".log") and len(raw_bytes) <= MAX_TEXT_INJECT_BYTES:
if ext in {".md", ".txt", ".log"} and len(raw_bytes) <= MAX_TEXT_INJECT_BYTES:
try:
text_content = raw_bytes.decode("utf-8")
display_name = att.filename or f"document{ext}"
+2 -2
@@ -54,7 +54,7 @@ _NOREPLY_PATTERNS = (
# RFC headers that indicate bulk/automated mail
_AUTOMATED_HEADERS = {
"Auto-Submitted": lambda v: v.lower() != "no",
"Precedence": lambda v: v.lower() in ("bulk", "list", "junk"),
"Precedence": lambda v: v.lower() in {"bulk", "list", "junk"},
"X-Auto-Response-Suppress": lambda v: bool(v),
"List-Unsubscribe": lambda v: bool(v),
}
@@ -203,7 +203,7 @@ def _extract_attachments(
continue
# Skip text/plain and text/html body parts
content_type = part.get_content_type()
if content_type in ("text/plain", "text/html") and "attachment" not in disposition:
if content_type in {"text/plain", "text/html"} and "attachment" not in disposition:
continue
filename = part.get_filename()
+7 -7
@@ -428,7 +428,7 @@ RejectReason = Literal[
def _is_bot_sender(sender: Any) -> bool:
# receive_v1 docs say {user, bot}; accept "app" defensively.
return getattr(sender, "sender_type", "") in ("bot", "app")
return getattr(sender, "sender_type", "") in {"bot", "app"}
def _sender_identity(sender: Any) -> frozenset:
@@ -1428,8 +1428,8 @@ class FeishuAdapter(BasePlatformAdapter):
per_chat_require_mention = _to_boolean(rule_cfg.get("require_mention"))
group_rules[str(chat_id)] = FeishuGroupRule(
policy=str(rule_cfg.get("policy", "open")).strip().lower(),
allowlist=set(str(u).strip() for u in rule_cfg.get("allowlist", []) if str(u).strip()),
blacklist=set(str(u).strip() for u in rule_cfg.get("blacklist", []) if str(u).strip()),
allowlist={str(u).strip() for u in rule_cfg.get("allowlist", []) if str(u).strip()},
blacklist={str(u).strip() for u in rule_cfg.get("blacklist", []) if str(u).strip()},
require_mention=per_chat_require_mention,
)
@@ -1443,7 +1443,7 @@ class FeishuAdapter(BasePlatformAdapter):
# Env-only so adapter and gateway auth bypass share one source; yaml
# feishu.allow_bots is bridged to this env var at config load.
allow_bots = os.getenv("FEISHU_ALLOW_BOTS", "none").strip().lower()
if allow_bots not in ("none", "mentions", "all"):
if allow_bots not in {"none", "mentions", "all"}:
logger.warning(
"[Feishu] Unknown allow_bots=%r, falling back to 'none'. Valid: none, mentions, all.",
allow_bots,
@@ -2752,7 +2752,7 @@ class FeishuAdapter(BasePlatformAdapter):
# =========================================================================
def _reactions_enabled(self) -> bool:
return os.getenv("FEISHU_REACTIONS", "true").strip().lower() not in ("false", "0", "no")
return os.getenv("FEISHU_REACTIONS", "true").strip().lower() not in {"false", "0", "no"}
async def _add_reaction(self, message_id: str, emoji_type: str) -> Optional[str]:
"""Return the reaction_id on success, else None. The id is needed later for deletion."""
@@ -3219,7 +3219,7 @@ class FeishuAdapter(BasePlatformAdapter):
self._on_bot_added_to_chat(data)
elif event_type == "im.chat.member.bot.deleted_v1":
self._on_bot_removed_from_chat(data)
elif event_type in ("im.message.reaction.created_v1", "im.message.reaction.deleted_v1"):
elif event_type in {"im.message.reaction.created_v1", "im.message.reaction.deleted_v1"}:
self._on_reaction_event(event_type, data)
elif event_type == "card.action.trigger":
self._on_card_action_trigger(data)
@@ -4815,7 +4815,7 @@ def _poll_registration(
# Terminal errors
error = res.get("error", "")
if error in ("access_denied", "expired_token"):
if error in {"access_denied", "expired_token"}:
if poll_count > 0:
print()
logger.warning("[Feishu onboard] Registration %s", error)
+3 -3
@@ -690,7 +690,7 @@ def _extract_docs_links(replies: List[Dict[str, Any]]) -> List[Dict[str, str]]:
except (json.JSONDecodeError, TypeError):
continue
for elem in content.get("elements", []):
if elem.get("type") not in ("docs_link", "link"):
if elem.get("type") not in {"docs_link", "link"}:
continue
link_data = elem.get("docs_link") or elem.get("link") or {}
url = link_data.get("url", "")
@@ -1031,7 +1031,7 @@ def _save_session_history(key: str, messages: List[Dict[str, Any]]) -> None:
# Only keep user/assistant messages (strip system messages and tool internals)
cleaned = [
m for m in messages
if m.get("role") in ("user", "assistant") and m.get("content")
if m.get("role") in {"user", "assistant"} and m.get("content")
]
# Keep last N
if len(cleaned) > _SESSION_MAX_MESSAGES:
@@ -1170,7 +1170,7 @@ async def handle_drive_comment_event(
rule = resolve_rule(comments_cfg, file_type, file_token)
# If no exact match and config has wiki keys, try reverse-lookup
if rule.match_source in ("wildcard", "top") and has_wiki_keys(comments_cfg):
if rule.match_source in {"wildcard", "top"} and has_wiki_keys(comments_cfg):
wiki_token = await _reverse_lookup_wiki_token(client, file_type, file_token)
if wiki_token:
rule = resolve_rule(comments_cfg, file_type, file_token, wiki_token=wiki_token)
+1 -1
@@ -228,7 +228,7 @@ def _load_pairing_approved() -> set:
if isinstance(approved, dict):
return set(approved.keys())
if isinstance(approved, list):
return set(str(u) for u in approved if u)
return {str(u) for u in approved if u}
return set()
+1 -1
@@ -246,7 +246,7 @@ class ThreadParticipationTracker:
thread_list = list(self._threads)
if len(thread_list) > self._max_tracked:
thread_list = thread_list[-self._max_tracked:]
self._threads = {thread_id: None for thread_id in thread_list}
self._threads = dict.fromkeys(thread_list)
atomic_json_write(path, thread_list, indent=None)
def mark(self, thread_id: str) -> None:
+2 -2
@@ -256,7 +256,7 @@ class HomeAssistantAdapter(BasePlatformAdapter):
await self._handle_ha_event(data.get("event", {}))
except json.JSONDecodeError:
logger.debug("Invalid JSON from HA WS: %s", ws_msg.data[:200])
elif ws_msg.type in (aiohttp.WSMsgType.CLOSED, aiohttp.WSMsgType.ERROR):
elif ws_msg.type in {aiohttp.WSMsgType.CLOSED, aiohttp.WSMsgType.ERROR}:
break
async def _handle_ha_event(self, event: Dict[str, Any]) -> None:
@@ -361,7 +361,7 @@ class HomeAssistantAdapter(BasePlatformAdapter):
f"(was {'triggered' if old_val == 'on' else 'cleared'})"
)
if domain in ("light", "switch", "fan"):
if domain in {"light", "switch", "fan"}:
return (
f"[Home Assistant] {friendly_name}: turned "
f"{'on' if new_val == 'on' else 'off'}"
+13 -13
@@ -245,11 +245,11 @@ def check_matrix_requirements() -> bool:
# If encryption is requested, verify E2EE deps are available at startup
# rather than silently degrading to plaintext-only at connect time.
encryption_requested = os.getenv("MATRIX_ENCRYPTION", "").lower() in (
encryption_requested = os.getenv("MATRIX_ENCRYPTION", "").lower() in {
"true",
"1",
"yes",
)
}
if encryption_requested and not _check_e2ee_deps():
logger.error(
"Matrix: MATRIX_ENCRYPTION=true but E2EE dependencies are missing. %s. "
@@ -312,7 +312,7 @@ class MatrixAdapter(BasePlatformAdapter):
)
self._encryption: bool = config.extra.get(
"encryption",
os.getenv("MATRIX_ENCRYPTION", "").lower() in ("true", "1", "yes"),
os.getenv("MATRIX_ENCRYPTION", "").lower() in {"true", "1", "yes"},
)
self._device_id: str = config.extra.get("device_id", "") or os.getenv(
"MATRIX_DEVICE_ID", ""
@@ -343,7 +343,7 @@ class MatrixAdapter(BasePlatformAdapter):
# Mention/thread gating — parsed once from env vars.
self._require_mention: bool = os.getenv(
"MATRIX_REQUIRE_MENTION", "true"
).lower() not in ("false", "0", "no")
).lower() not in {"false", "0", "no"}
free_rooms_raw = config.extra.get("free_response_rooms")
if free_rooms_raw is None:
free_rooms_raw = os.getenv("MATRIX_FREE_RESPONSE_ROOMS", "")
@@ -367,22 +367,22 @@ class MatrixAdapter(BasePlatformAdapter):
self._allowed_rooms: Set[str] = {
r.strip() for r in str(allowed_rooms_raw).split(",") if r.strip()
}
self._auto_thread: bool = os.getenv("MATRIX_AUTO_THREAD", "true").lower() in (
self._auto_thread: bool = os.getenv("MATRIX_AUTO_THREAD", "true").lower() in {
"true",
"1",
"yes",
)
}
self._dm_auto_thread: bool = os.getenv(
"MATRIX_DM_AUTO_THREAD", "false"
).lower() in ("true", "1", "yes")
).lower() in {"true", "1", "yes"}
self._dm_mention_threads: bool = os.getenv(
"MATRIX_DM_MENTION_THREADS", "false"
).lower() in ("true", "1", "yes")
).lower() in {"true", "1", "yes"}
# Reactions: configurable via MATRIX_REACTIONS (default: true).
self._reactions_enabled: bool = os.getenv(
"MATRIX_REACTIONS", "true"
).lower() not in ("false", "0", "no")
).lower() not in {"false", "0", "no"}
self._pending_reactions: dict[tuple[str, str], str] = {}
# Delay before redacting reactions so Matrix homeservers have time to
# deliver the final message event without tripping "missing event"
@@ -1771,9 +1771,9 @@ class MatrixAdapter(BasePlatformAdapter):
# Cache media locally when downstream tools need a real file path.
cached_path = None
should_cache_locally = msg_type in (
should_cache_locally = msg_type in {
MessageType.PHOTO, MessageType.AUDIO, MessageType.VIDEO, MessageType.DOCUMENT,
) or is_voice_message or is_encrypted_media
} or is_voice_message or is_encrypted_media
if should_cache_locally and url:
try:
file_bytes = await self._client.download_media(ContentURI(url))
@@ -1834,7 +1834,7 @@ class MatrixAdapter(BasePlatformAdapter):
ext = ext_map.get(media_type, ".jpg")
cached_path = cache_image_from_bytes(file_bytes, ext=ext)
logger.info("[Matrix] Cached user image at %s", cached_path)
elif msg_type in (MessageType.AUDIO, MessageType.VOICE):
elif msg_type in {MessageType.AUDIO, MessageType.VOICE}:
ext = (
Path(
body
@@ -2602,7 +2602,7 @@ class MatrixAdapter(BasePlatformAdapter):
"""Sanitize a URL for use in an href attribute."""
stripped = url.strip()
scheme = stripped.split(":", 1)[0].lower().strip() if ":" in stripped else ""
if scheme in ("javascript", "data", "vbscript"):
if scheme in {"javascript", "data", "vbscript"}:
return ""
return stripped.replace('"', "&quot;")
+6 -6
@@ -611,7 +611,7 @@ class MattermostAdapter(BasePlatformAdapter):
# succeed on retry — stop reconnecting instead of looping forever.
import aiohttp
err_str = str(exc).lower()
if isinstance(exc, aiohttp.WSServerHandshakeError) and exc.status in (401, 403):
if isinstance(exc, aiohttp.WSServerHandshakeError) and exc.status in {401, 403}:
logger.error("Mattermost WS auth failed (HTTP %d) — stopping reconnect", exc.status)
return
if "401" in err_str or "403" in err_str or "unauthorized" in err_str:
@@ -649,21 +649,21 @@ class MattermostAdapter(BasePlatformAdapter):
if self._closing:
return
if raw_msg.type in (
if raw_msg.type in {
raw_msg.type.TEXT,
raw_msg.type.BINARY,
):
}:
try:
event = json.loads(raw_msg.data)
except (json.JSONDecodeError, TypeError):
continue
await self._handle_ws_event(event)
elif raw_msg.type in (
elif raw_msg.type in {
raw_msg.type.ERROR,
raw_msg.type.CLOSE,
raw_msg.type.CLOSING,
raw_msg.type.CLOSED,
):
}:
logger.info("Mattermost: WebSocket closed (%s)", raw_msg.type)
break
@@ -732,7 +732,7 @@ class MattermostAdapter(BasePlatformAdapter):
require_mention = os.getenv(
"MATTERMOST_REQUIRE_MENTION", "true"
).lower() not in ("false", "0", "no")
).lower() not in {"false", "0", "no"}
free_channels_raw = os.getenv("MATTERMOST_FREE_RESPONSE_CHANNELS", "")
free_channels = {ch.strip() for ch in free_channels_raw.split(",") if ch.strip()}
+16 -16
@@ -513,7 +513,7 @@ class QQAdapter(BasePlatformAdapter):
self._fail_pending("Connection closed")
# Stop reconnecting for fatal codes
if code in (4914, 4915):
if code in {4914, 4915}:
desc = "offline/sandbox-only" if code == 4914 else "banned"
logger.error(
"[%s] Bot is %s. Check QQ Open Platform.", self._log_tag, desc
@@ -550,7 +550,7 @@ class QQAdapter(BasePlatformAdapter):
self._token_expires_at = 0.0
# Session invalid → clear session, will re-identify on next Hello
if code in (
if code in {
4006,
4007,
4009,
@@ -568,7 +568,7 @@ class QQAdapter(BasePlatformAdapter):
4911,
4912,
4913,
):
}:
logger.info(
"[%s] Session error (%d), clearing session for re-identify",
self._log_tag,
@@ -637,12 +637,12 @@ class QQAdapter(BasePlatformAdapter):
payload = self._parse_json(msg.data)
if payload:
self._dispatch_payload(payload)
elif msg.type in (aiohttp.WSMsgType.PING,):
elif msg.type in {aiohttp.WSMsgType.PING,}:
# aiohttp auto-replies with PONG
pass
elif msg.type == aiohttp.WSMsgType.CLOSE:
raise QQCloseError(msg.data, msg.extra)
elif msg.type in (aiohttp.WSMsgType.CLOSED, aiohttp.WSMsgType.ERROR):
elif msg.type in {aiohttp.WSMsgType.CLOSED, aiohttp.WSMsgType.ERROR}:
raise RuntimeError("WebSocket closed")
async def _heartbeat_loop(self) -> None:
@@ -783,13 +783,13 @@ class QQAdapter(BasePlatformAdapter):
self._handle_ready(d)
elif t == "RESUMED":
logger.info("[%s] Session resumed", self._log_tag)
elif t in (
elif t in {
"C2C_MESSAGE_CREATE",
"GROUP_AT_MESSAGE_CREATE",
"DIRECT_MESSAGE_CREATE",
"GUILD_MESSAGE_CREATE",
"GUILD_AT_MESSAGE_CREATE",
):
}:
asyncio.create_task(self._on_message(t, d))
elif t == "INTERACTION_CREATE":
self._create_task(self._on_interaction(d))
@@ -859,9 +859,9 @@ class QQAdapter(BasePlatformAdapter):
# Route by event type
if event_type == "C2C_MESSAGE_CREATE":
await self._handle_c2c_message(d, msg_id, content, author, timestamp)
elif event_type in ("GROUP_AT_MESSAGE_CREATE",):
elif event_type in {"GROUP_AT_MESSAGE_CREATE",}:
await self._handle_group_message(d, msg_id, content, author, timestamp)
elif event_type in ("GUILD_MESSAGE_CREATE", "GUILD_AT_MESSAGE_CREATE"):
elif event_type in {"GUILD_MESSAGE_CREATE", "GUILD_AT_MESSAGE_CREATE"}:
await self._handle_guild_message(d, msg_id, content, author, timestamp)
elif event_type == "DIRECT_MESSAGE_CREATE":
await self._handle_dm_message(d, msg_id, content, author, timestamp)
@@ -1864,7 +1864,7 @@ class QQAdapter(BasePlatformAdapter):
return ".wav"
if data[:4] == b"fLaC":
return ".flac"
if data[:2] in (b"\xff\xfb", b"\xff\xf3", b"\xff\xf2"):
if data[:2] in {b"\xff\xfb", b"\xff\xf3", b"\xff\xf2"}:
return ".mp3"
if data[:4] == b"\x30\x26\xb2\x75" or data[:4] == b"\x4f\x67\x67\x53":
return ".ogg"
@@ -2033,7 +2033,7 @@ class QQAdapter(BasePlatformAdapter):
"base_url": base_url,
"api_key": api_key,
"model": model
or ("glm-asr" if provider in ("zai", "glm") else "whisper-1"),
or ("glm-asr" if provider in {"zai", "glm"} else "whisper-1"),
}
# 2. QQ-specific env vars (set by `hermes setup gateway` / `hermes gateway`)
@@ -2115,7 +2115,7 @@ class QQAdapter(BasePlatformAdapter):
if urlparse(source_url).path
else ""
)
if not ext or ext not in (
if not ext or ext not in {
".silk",
".amr",
".mp3",
@@ -2124,7 +2124,7 @@ class QQAdapter(BasePlatformAdapter):
".m4a",
".aac",
".flac",
):
}:
ext = self._guess_ext_from_data(audio_data)
with tempfile.NamedTemporaryFile(suffix=ext, delete=False) as tmp_src:
@@ -2870,7 +2870,7 @@ class QQAdapter(BasePlatformAdapter):
raise ValueError("Media source is required")
parsed = urlparse(source)
if parsed.scheme in ("http", "https"):
if parsed.scheme in {"http", "https"}:
# For URLs, pass through directly to the upload API
content_type = mimetypes.guess_type(source)[0] or "application/octet-stream"
resolved_name = file_name or Path(parsed.path).name or "media"
@@ -2966,7 +2966,7 @@ class QQAdapter(BasePlatformAdapter):
chat_type = self._guess_chat_type(chat_id)
return {
"name": chat_id,
"type": "group" if chat_type in ("group", "guild") else "dm",
"type": "group" if chat_type in {"group", "guild"} else "dm",
}
# ------------------------------------------------------------------
@@ -2975,7 +2975,7 @@ class QQAdapter(BasePlatformAdapter):
@staticmethod
def _is_url(source: str) -> bool:
return urlparse(str(source)).scheme in ("http", "https")
return urlparse(str(source)).scheme in {"http", "https"}
def _guess_chat_type(self, chat_id: str) -> str:
"""Determine chat type from stored inbound metadata, fallback to 'c2c'."""
+2 -3
@@ -239,7 +239,7 @@ class ChunkedUploader:
:raises UploadFileTooLargeError: When the file exceeds the platform limit.
:raises RuntimeError: On other API or I/O failures.
"""
if chat_type not in ("c2c", "group"):
if chat_type not in {"c2c", "group"}:
raise ValueError(
f"ChunkedUploader: unsupported chat_type {chat_type!r}"
)
@@ -592,8 +592,7 @@ async def _run_with_concurrency(
concurrency: int,
) -> None:
"""Run a list of thunks with a bounded number in flight at once."""
if concurrency < 1:
concurrency = 1
concurrency = max(concurrency, 1)
sem = asyncio.Semaphore(concurrency)
async def _wrap(thunk: Callable[[], Awaitable[None]]) -> None:
+3 -3
@@ -99,11 +99,11 @@ def _guess_extension(data: bytes) -> str:
def _is_image_ext(ext: str) -> bool:
return ext.lower() in (".jpg", ".jpeg", ".png", ".gif", ".webp")
return ext.lower() in {".jpg", ".jpeg", ".png", ".gif", ".webp"}
def _is_audio_ext(ext: str) -> bool:
return ext.lower() in (".mp3", ".wav", ".ogg", ".m4a", ".aac")
return ext.lower() in {".mp3", ".wav", ".ogg", ".m4a", ".aac"}
_EXT_TO_MIME = {
@@ -1449,7 +1449,7 @@ class SignalAdapter(BasePlatformAdapter):
contacts from seeing the 👀 reaction (which fires before run.py's
auth gate and would otherwise reveal that a bot is listening).
"""
if os.getenv("SIGNAL_REACTIONS", "true").lower() in ("false", "0", "no"):
if os.getenv("SIGNAL_REACTIONS", "true").lower() in {"false", "0", "no"}:
return False
if event is not None:
sender = getattr(getattr(event, "source", None), "user_id", None)
+11 -11
@@ -935,7 +935,7 @@ class SlackAdapter(BasePlatformAdapter):
raw = self.config.extra.get("dm_top_level_threads_as_sessions")
if raw is None:
return True # default: each DM thread is its own session
return str(raw).strip().lower() in ("1", "true", "yes", "on")
return str(raw).strip().lower() in {"1", "true", "yes", "on"}
def _resolve_thread_ts(
self,
@@ -1300,7 +1300,7 @@ class SlackAdapter(BasePlatformAdapter):
def _reactions_enabled(self) -> bool:
"""Check if message reactions are enabled via config/env."""
return os.getenv("SLACK_REACTIONS", "true").lower() not in ("false", "0", "no")
return os.getenv("SLACK_REACTIONS", "true").lower() not in {"false", "0", "no"}
async def on_processing_start(self, event: MessageEvent) -> None:
"""Add an in-progress reaction when message processing begins."""
@@ -1773,7 +1773,7 @@ class SlackAdapter(BasePlatformAdapter):
# Ignore message edits and deletions
subtype = event.get("subtype")
if subtype in ("message_changed", "message_deleted"):
if subtype in {"message_changed", "message_deleted"}:
return
original_text = event.get("text", "")
@@ -1892,7 +1892,7 @@ class SlackAdapter(BasePlatformAdapter):
channel_type = event.get("channel_type", "")
if not channel_type and channel_id.startswith("D"):
channel_type = "im"
is_dm = channel_type in ("im", "mpim") # Both 1:1 and group DMs
is_dm = channel_type in {"im", "mpim"} # Both 1:1 and group DMs
# Build thread_ts for session keying.
# In channels: fall back to ts so each top-level @mention starts a
@@ -2033,7 +2033,7 @@ class SlackAdapter(BasePlatformAdapter):
if mimetype.startswith("image/") and url:
try:
ext = "." + mimetype.split("/")[-1].split(";")[0]
if ext not in (".jpg", ".jpeg", ".png", ".gif", ".webp"):
if ext not in {".jpg", ".jpeg", ".png", ".gif", ".webp"}:
ext = ".jpg"
# Slack private URLs require the bot token as auth header
cached = await self._download_slack_file(url, ext, team_id=team_id)
@@ -2049,7 +2049,7 @@ class SlackAdapter(BasePlatformAdapter):
elif mimetype.startswith("audio/") and url:
try:
ext = "." + mimetype.split("/")[-1].split(";")[0]
if ext not in (".ogg", ".mp3", ".wav", ".webm", ".m4a"):
if ext not in {".ogg", ".mp3", ".wav", ".webm", ".m4a"}:
ext = ".ogg"
cached = await self._download_slack_file(url, ext, audio=True, team_id=team_id)
media_urls.append(cached)
@@ -2737,7 +2737,7 @@ class SlackAdapter(BasePlatformAdapter):
if team_id and channel_id:
self._channel_team[channel_id] = team_id
if slash_name in ("hermes", ""):
if slash_name in {"hermes", ""}:
# Legacy /hermes <subcommand> [args] routing + free-form questions.
# Empty slash_name falls into this branch for backward compat
# with any caller that didn't populate command["command"].
@@ -2932,9 +2932,9 @@ class SlackAdapter(BasePlatformAdapter):
configured = self.config.extra.get("require_mention")
if configured is not None:
if isinstance(configured, str):
return configured.lower() not in ("false", "0", "no", "off")
return configured.lower() not in {"false", "0", "no", "off"}
return bool(configured)
return os.getenv("SLACK_REQUIRE_MENTION", "true").lower() not in ("false", "0", "no", "off")
return os.getenv("SLACK_REQUIRE_MENTION", "true").lower() not in {"false", "0", "no", "off"}
def _slack_strict_mention(self) -> bool:
"""When true, channel threads require an explicit @-mention on every
@@ -2944,9 +2944,9 @@ class SlackAdapter(BasePlatformAdapter):
configured = self.config.extra.get("strict_mention")
if configured is not None:
if isinstance(configured, str):
return configured.lower() in ("true", "1", "yes", "on")
return configured.lower() in {"true", "1", "yes", "on"}
return bool(configured)
return os.getenv("SLACK_STRICT_MENTION", "false").lower() in ("true", "1", "yes", "on")
return os.getenv("SLACK_STRICT_MENTION", "false").lower() in {"true", "1", "yes", "on"}
def _slack_free_response_channels(self) -> set:
"""Return channel IDs where no @mention is required."""
+389 -39
@@ -77,7 +77,6 @@ from gateway.platforms.base import (
SUPPORTED_VIDEO_TYPES,
SUPPORTED_DOCUMENT_TYPES,
utf16_len,
_prefix_within_utf16_limit,
)
from gateway.platforms.telegram_network import (
TelegramFallbackTransport,
@@ -104,8 +103,58 @@ _TELEGRAM_IMAGE_EXT_TO_MIME = {
def check_telegram_requirements() -> bool:
"""Check if Telegram dependencies are available."""
return TELEGRAM_AVAILABLE
"""Check if Telegram dependencies are available.
If python-telegram-bot is missing, attempts to lazy-install it via
``tools.lazy_deps.ensure("platform.telegram")``. After a successful
install, re-imports the SDK and flips ``TELEGRAM_AVAILABLE`` to True
so the adapter's class-level type aliases get rebound.
"""
global TELEGRAM_AVAILABLE, Update, Bot, Message, InlineKeyboardButton
global InlineKeyboardMarkup, LinkPreviewOptions, Application
global CommandHandler, CallbackQueryHandler, TelegramMessageHandler
global ContextTypes, filters, ParseMode, ChatType, HTTPXRequest
if TELEGRAM_AVAILABLE:
return True
try:
from tools.lazy_deps import ensure as _lazy_ensure
_lazy_ensure("platform.telegram", prompt=False)
except Exception:
return False
try:
from telegram import Update as _Update, Bot as _Bot, Message as _Message
from telegram import InlineKeyboardButton as _IKB, InlineKeyboardMarkup as _IKM
try:
from telegram import LinkPreviewOptions as _LPO
except ImportError:
_LPO = None
from telegram.ext import (
Application as _App, CommandHandler as _CH,
CallbackQueryHandler as _CQH,
MessageHandler as _MH,
ContextTypes as _CT, filters as _filters,
)
from telegram.constants import ParseMode as _PM, ChatType as _CtT
from telegram.request import HTTPXRequest as _HR
except ImportError:
return False
Update = _Update
Bot = _Bot
Message = _Message
InlineKeyboardButton = _IKB
InlineKeyboardMarkup = _IKM
LinkPreviewOptions = _LPO
Application = _App
CommandHandler = _CH
CallbackQueryHandler = _CQH
TelegramMessageHandler = _MH
ContextTypes = _CT
filters = _filters
ParseMode = _PM
ChatType = _CtT
HTTPXRequest = _HR
TELEGRAM_AVAILABLE = True
return True
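Review note: the lazy-install flow in the new `check_telegram_requirements` (short-circuit if already available, run the installer, re-import, then rebind) can be tried in isolation with the sketch below. It is a simplified stand-in, not the adapter's code: `LazyDep` and its `installer` argument are hypothetical names, and the real function rebinds module-level globals rather than instance attributes.

```python
import importlib
from types import ModuleType
from typing import Callable, Optional


class LazyDep:
    """Hypothetical holder for an optional dependency that may be installed late."""

    def __init__(self, module_name: str, installer: Callable[[], None]) -> None:
        self.module_name = module_name
        self.installer = installer  # stand-in for the project's lazy installer
        self.available = False
        self.module: Optional[ModuleType] = None

    def check(self) -> bool:
        # Fast path: a previous call already imported the module.
        if self.available:
            return True
        # Step 1: try to install; any installer failure means "not available".
        try:
            self.installer()
        except Exception:
            return False
        # Step 2: re-import after the install attempt.
        importlib.invalidate_caches()
        try:
            module = importlib.import_module(self.module_name)
        except ImportError:
            return False
        # Step 3: rebind the handle and flip the flag only after everything worked.
        self.module = module
        self.available = True
        return True


# "json" is used here only because it always imports; the no-op installer is a placeholder.
dep = LazyDep("json", installer=lambda: None)
print(dep.check(), dep.module is not None)
```

Flipping the availability flag last means a failed install or import leaves the state untouched, so a later call can try again.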
# Matches every character that MarkdownV2 requires to be backslash-escaped
@@ -283,6 +332,45 @@ class TelegramAdapter(BasePlatformAdapter):
MEDIA_GROUP_WAIT_SECONDS = 0.8
_GENERAL_TOPIC_THREAD_ID = "1"
# Adaptive text-batch ingress: short messages need a tighter delay so the
# first token reaches the agent fast. Numbers tuned for "feels instant":
# ≤320 codepoints (one short paragraph) settles in ~180ms; ≤1024
# (a normal paragraph) in ~240ms; longer waits the configured cap.
# Always clamped to ``_text_batch_delay_seconds`` so an operator can lower
# the cap further via env var.
_TEXT_BATCH_FAST_LEN = 320
_TEXT_BATCH_FAST_DELAY_S = 0.18
_TEXT_BATCH_SHORT_LEN = 1024
_TEXT_BATCH_SHORT_DELAY_S = 0.24
@staticmethod
def _env_float_clamped(
name: str,
default: float,
*,
min_value: Optional[float] = None,
max_value: Optional[float] = None,
) -> float:
"""Read a float env var, reject non-finite values, and clamp to bounds.
Guarantees the returned value is a finite number usable directly in
``asyncio.sleep()`` and similar APIs that reject NaN / Inf.
"""
import math
raw = os.getenv(name)
try:
value = float(raw) if raw is not None else float(default)
except (TypeError, ValueError):
value = float(default)
if not math.isfinite(value):
value = float(default)
if min_value is not None:
value = max(value, min_value)
if max_value is not None:
value = min(value, max_value)
return value
@property
def message_len_fn(self):
"""Telegram measures message length in UTF-16 code units."""
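Review note: the behaviour of `_env_float_clamped` is easiest to see with concrete inputs. The sketch below copies the helper's logic into a free function so it can run outside the adapter; `DEMO_DELAY` is a hypothetical variable name used only for the demonstration.

```python
import math
import os
from typing import Optional


def env_float_clamped(
    name: str,
    default: float,
    *,
    min_value: Optional[float] = None,
    max_value: Optional[float] = None,
) -> float:
    """Read a float env var, fall back on bad input, then clamp to bounds."""
    raw = os.getenv(name)
    try:
        value = float(raw) if raw is not None else float(default)
    except (TypeError, ValueError):
        # Unparseable text such as "abc" falls back to the default.
        value = float(default)
    if not math.isfinite(value):
        # float("nan") and float("inf") parse fine but are rejected here.
        value = float(default)
    if min_value is not None:
        value = max(value, min_value)
    if max_value is not None:
        value = min(value, max_value)
    return value


# Exercise the same bounds the diff uses for the text-batch delay.
for raw in ("nan", "inf", "abc", "60", "0.01", "0.5"):
    os.environ["DEMO_DELAY"] = raw
    print(raw, env_float_clamped("DEMO_DELAY", 0.3, min_value=0.08, max_value=2.0))
os.environ.pop("DEMO_DELAY", None)
```

Because the non-finite check runs before clamping, a value such as `inf` becomes the default rather than being clamped to `max_value`.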
@@ -304,9 +392,24 @@ class TelegramAdapter(BasePlatformAdapter):
self._media_group_events: Dict[str, MessageEvent] = {}
self._media_group_tasks: Dict[str, asyncio.Task] = {}
# Buffer rapid text messages so Telegram client-side splits of long
# messages are aggregated into a single MessageEvent.
self._text_batch_delay_seconds = float(os.getenv("HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS", "0.6"))
self._text_batch_split_delay_seconds = float(os.getenv("HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS", "2.0"))
# messages are aggregated into a single MessageEvent. Lower defaults
# (0.3s / 1.0s instead of 0.6s / 2.0s) let short replies stream
# without a noticeable wait — combined with the adaptive fast-path
# in ``_calc_text_batch_delay`` below, ≤320-codepoint replies settle
# in ~180ms. All bounds are conservative for Telegram's
# ~1 edit/s flood envelope.
self._text_batch_delay_seconds = self._env_float_clamped(
"HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS",
0.3,
min_value=0.08,
max_value=2.0,
)
self._text_batch_split_delay_seconds = self._env_float_clamped(
"HERMES_TELEGRAM_TEXT_BATCH_SPLIT_DELAY_SECONDS",
1.0,
min_value=self._text_batch_delay_seconds,
max_value=4.0,
)
self._pending_text_batches: Dict[str, MessageEvent] = {}
self._pending_text_batch_tasks: Dict[str, asyncio.Task] = {}
self._polling_error_task: Optional[asyncio.Task] = None
@@ -563,7 +666,7 @@ class TelegramAdapter(BasePlatformAdapter):
def _looks_like_network_error(error: Exception) -> bool:
"""Return True for transient network errors that warrant a reconnect attempt."""
name = error.__class__.__name__.lower()
if name in ("networkerror", "timedout", "connectionerror"):
if name in {"networkerror", "timedout", "connectionerror"}:
return True
try:
from telegram.error import NetworkError, TimedOut
@@ -579,9 +682,9 @@ class TelegramAdapter(BasePlatformAdapter):
return default
if isinstance(value, str):
lowered = value.strip().lower()
if lowered in ("true", "1", "yes", "on"):
if lowered in {"true", "1", "yes", "on"}:
return True
if lowered in ("false", "0", "no", "off"):
if lowered in {"false", "0", "no", "off"}:
return False
return default
return bool(value)
@@ -1118,7 +1221,7 @@ class TelegramAdapter(BasePlatformAdapter):
"write_timeout": _env_float("HERMES_TELEGRAM_HTTP_WRITE_TIMEOUT", 20.0),
}
disable_fallback = (os.getenv("HERMES_TELEGRAM_DISABLE_FALLBACK_IPS", "").strip().lower() in ("1", "true", "yes", "on"))
disable_fallback = (os.getenv("HERMES_TELEGRAM_DISABLE_FALLBACK_IPS", "").strip().lower() in {"1", "true", "yes", "on"})
fallback_ips = self._fallback_ips()
if not fallback_ips:
fallback_ips = await discover_fallback_ips()
@@ -1559,10 +1662,18 @@ class TelegramAdapter(BasePlatformAdapter):
except Exception as e:
logger.error("[%s] Failed to send Telegram message: %s", self.name, e, exc_info=True)
err_str = str(e).lower()
# Message too long — content exceeded 4096 chars. Return failure so
# stream consumer enters fallback mode and sends the remainder.
if "message_too_long" in err_str or "too long" in err_str:
logger.debug(
"[%s] send() content too long, falling back to new-message continuation",
self.name,
)
return SendResult(success=False, error="message_too_long")
# TimedOut means the request may have reached Telegram —
# mark as non-retryable so _send_with_retry() doesn't re-send.
_to = locals().get("_TimedOut")
err_str = str(e).lower()
is_timeout = (_to and isinstance(e, _to)) or "timed out" in err_str
return SendResult(success=False, error=str(e), retryable=not is_timeout)
@@ -1574,9 +1685,26 @@ class TelegramAdapter(BasePlatformAdapter):
*,
finalize: bool = False,
) -> SendResult:
"""Edit a previously sent Telegram message."""
"""Edit a previously sent Telegram message.
Telegram caps single-message text at 4096 UTF-16 codeunits. Streaming
replies that grow past this limit must NOT be silently truncated and
must NOT return failure (the consumer would re-send and create a
duplicate). Instead this method split-and-delivers: edit the
existing message with the first chunk and send the rest as
continuation messages, returning the final chunk's id so subsequent
edits target the most recent visible message.
"""
if not self._bot:
return SendResult(success=False, error="Not connected")
# Pre-flight: if content already exceeds the limit, split-and-deliver
# without round-tripping a doomed edit.
if utf16_len(content) > self.MAX_MESSAGE_LENGTH:
return await self._edit_overflow_split(
chat_id, message_id, content, finalize=finalize,
)
try:
if not finalize:
await self._bot.edit_message_text(
@@ -1610,22 +1738,17 @@ class TelegramAdapter(BasePlatformAdapter):
# "Message is not modified" — content identical, treat as success
if "not modified" in err_str:
return SendResult(success=True, message_id=message_id)
# Message too long — content exceeded 4096 chars (e.g. during
# streaming). Truncate and succeed so the stream consumer can
# split the overflow into a new message instead of dying.
# Reactive split-and-deliver: parse_mode formatting can inflate
# the payload past the limit even when the raw text was under
# (e.g. MarkdownV2 escapes). Same fix as the pre-flight path.
if "message_too_long" in err_str or "too long" in err_str:
truncated = _prefix_within_utf16_limit(
content, self.MAX_MESSAGE_LENGTH - 20
) + ""
try:
await self._bot.edit_message_text(
chat_id=int(chat_id),
message_id=int(message_id),
text=truncated,
)
except Exception:
pass # best-effort truncation
return SendResult(success=True, message_id=message_id)
logger.debug(
"[%s] edit_message overflow (%d UTF-16 > %d), splitting",
self.name, utf16_len(content), self.MAX_MESSAGE_LENGTH,
)
return await self._edit_overflow_split(
chat_id, message_id, content, finalize=finalize,
)
# Flood control / RetryAfter — short waits are retried inline,
# long waits return a failure immediately so streaming can fall back
# to a normal final send instead of leaving a truncated partial.
@@ -1661,6 +1784,147 @@ class TelegramAdapter(BasePlatformAdapter):
)
return SendResult(success=False, error=str(e))
async def _edit_overflow_split(
self,
chat_id: str,
message_id: str,
content: str,
*,
finalize: bool,
) -> SendResult:
"""Split an oversized edit across the existing message + continuations.
Edit the original ``message_id`` with chunk 1 (with the platform's
usual ``(1/N)`` suffix preserved), then send the remaining chunks as
new messages threaded as replies to the previous chunk so the user
sees them grouped. Returns ``SendResult(success=True,
message_id=<last-chunk-id>, continuation_message_ids=(...))`` so the
stream consumer can keep editing the most recent visible message
and the gateway has full visibility into every message id we put on
screen.
Falls back to ``SendResult(success=False)`` only if even the first-chunk
edit fails; that's a real adapter problem, not an overflow.
"""
chunks = self.truncate_message(
content, self.MAX_MESSAGE_LENGTH, len_fn=utf16_len,
)
if len(chunks) <= 1:
# Defensive: shouldn't happen given the caller's pre-flight, but
# if truncate_message returned a single chunk just edit normally.
chunks = [content]
# Step 1 — edit the existing message with the first chunk.
first_chunk = chunks[0]
try:
if finalize:
# Use format_message + parse_mode for the final chunk;
# mirror edit_message's main happy-path.
formatted = self.format_message(first_chunk)
try:
await self._bot.edit_message_text(
chat_id=int(chat_id),
message_id=int(message_id),
text=formatted,
parse_mode=ParseMode.MARKDOWN_V2,
)
except Exception as fmt_err:
if "not modified" not in str(fmt_err).lower():
await self._bot.edit_message_text(
chat_id=int(chat_id),
message_id=int(message_id),
text=first_chunk,
)
else:
await self._bot.edit_message_text(
chat_id=int(chat_id),
message_id=int(message_id),
text=first_chunk,
)
except Exception as e:
err_str = str(e).lower()
if "not modified" in err_str:
# First chunk identical to current text — fall through to
# send continuations.
pass
else:
logger.error(
"[%s] Overflow split: first-chunk edit failed: %s",
self.name, e, exc_info=True,
)
return SendResult(success=False, error=str(e))
# Step 2 — send each remaining chunk as a continuation message,
# threaded as a reply to the previous so the user sees them as a
# contiguous block. We call self._bot.send_message directly so the
# continuation skips ``self.send``'s own pre-chunking pass (chunks
# are already correctly sized). Best-effort MarkdownV2 with plain
# fallback, mirroring send().
continuation_ids: list[str] = []
prev_id = message_id
for chunk in chunks[1:]:
sent_msg = None
for use_markdown in (True, False) if finalize else (False,):
try:
text = self.format_message(chunk) if use_markdown else chunk
sent_msg = await self._bot.send_message(
chat_id=int(chat_id),
text=text,
parse_mode=ParseMode.MARKDOWN_V2 if use_markdown else None,
reply_to_message_id=int(prev_id) if prev_id else None,
)
break
except Exception as send_err:
if "reply message not found" in str(send_err).lower():
# Drop the reply anchor and try again.
try:
sent_msg = await self._bot.send_message(
chat_id=int(chat_id),
text=chunk,
)
break
except Exception as _retry_err:
logger.warning(
"[%s] Overflow continuation no-reply retry failed: %s",
self.name, _retry_err,
)
sent_msg = None
break
if use_markdown:
# try plain text on next loop iteration
continue
logger.warning(
"[%s] Overflow continuation send failed: %s",
self.name, send_err,
)
sent_msg = None
break
if sent_msg is None:
# Continuation failed — the user has chunk 1 + however many
# continuations succeeded. Report success with what we got
# so the stream consumer knows the edit landed; the
# remaining tail is lost on this attempt and the next
# streaming tick may retry.
logger.warning(
"[%s] Overflow split: stopped at %d/%d chunks delivered",
self.name, 1 + len(continuation_ids), len(chunks),
)
break
new_id = str(getattr(sent_msg, "message_id", "")) or prev_id
continuation_ids.append(new_id)
prev_id = new_id
last_id = continuation_ids[-1] if continuation_ids else message_id
logger.debug(
"[%s] Overflow split delivered %d chunks; last_id=%s",
self.name, 1 + len(continuation_ids), last_id,
)
return SendResult(
success=True,
message_id=last_id,
continuation_message_ids=tuple(continuation_ids),
)
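Review note: the split-and-deliver path depends on measuring text in UTF-16 code units rather than Python code points. The sketch below illustrates that distinction with a deliberately simple greedy splitter. Both function names are hypothetical, the real `utf16_len` and `truncate_message` are not shown in this diff, and the real splitter also adds `(1/N)` suffixes, which this sketch omits.

```python
def utf16_units(text: str) -> int:
    """Count the UTF-16 code units needed to encode ``text``."""
    return len(text.encode("utf-16-le")) // 2


def split_by_utf16(text: str, limit: int) -> list[str]:
    """Greedily split ``text`` so each chunk fits in ``limit`` UTF-16 units.

    Iterates over whole code points, so a character outside the Basic
    Multilingual Plane (two code units) is never cut in half.
    """
    if limit < 2:
        raise ValueError("limit must be at least 2 to fit any character")
    chunks: list[str] = []
    current: list[str] = []
    used = 0
    for ch in text:
        units = 2 if ord(ch) > 0xFFFF else 1
        if used + units > limit:
            # Current chunk is full: close it and start a new one.
            chunks.append("".join(current))
            current, used = [], 0
        current.append(ch)
        used += units
    if current:
        chunks.append("".join(current))
    return chunks


sample = "a" + "\U0001F600" * 5  # one ASCII letter plus five emoji
print(len(sample), utf16_units(sample))
print([utf16_units(c) for c in split_by_utf16(sample, 4)])
```

A string can be under the limit by `len()` and still over it in UTF-16 units, which is why the pre-flight check in `edit_message` compares `utf16_len(content)` against `MAX_MESSAGE_LENGTH`.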
async def delete_message(self, chat_id: str, message_id: str) -> bool:
"""Delete a previously sent Telegram message.
@@ -1686,6 +1950,77 @@ class TelegramAdapter(BasePlatformAdapter):
)
return False
def supports_draft_streaming(
self,
chat_type: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> bool:
"""Telegram supports sendMessageDraft for private chats only.
Bot API 9.5 (March 2026) opened ``sendMessageDraft`` to all bots
unconditionally for private (DM) chats. Groups, supergroups, and
channels still rely on the edit-based path.
We additionally require ``self._bot`` to expose ``send_message_draft``
(added to python-telegram-bot in 22.6); older PTB installs gracefully
fall back to the edit path even on DMs.
"""
if not self._bot or not hasattr(self._bot, "send_message_draft"):
return False
return (chat_type or "").lower() in {"dm", "private"}
async def send_draft(
self,
chat_id: str,
draft_id: int,
content: str,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
"""Stream a partial message via Telegram's native sendMessageDraft.
The Bot API animates the preview when the same ``draft_id`` is reused
across consecutive calls in the same chat. When the response
finishes, the caller sends the final text via the normal ``send``
path; the draft preview clears naturally on the client (Telegram has
no Bot API to "promote" a draft to a real message; the final
``sendMessage`` is what the user receives in their history).
"""
if not self._bot:
return SendResult(success=False, error="not_connected")
if not hasattr(self._bot, "send_message_draft"):
return SendResult(success=False, error="api_unavailable")
# Trim to the same UTF-16 budget the platform enforces on regular
# sends. Drafts have the same length contract as messages.
text = content if utf16_len(content) <= self.MAX_MESSAGE_LENGTH else \
self.truncate_message(content, self.MAX_MESSAGE_LENGTH, len_fn=utf16_len)[0]
kwargs: Dict[str, Any] = {
"chat_id": int(chat_id),
"draft_id": int(draft_id),
"text": text,
}
thread_id = self._metadata_thread_id(metadata)
if thread_id is not None:
kwargs["message_thread_id"] = thread_id
try:
ok = await self._bot.send_message_draft(**kwargs)
if ok:
# Drafts have no message_id; we report success without one
# so the caller knows the animation frame landed.
return SendResult(success=True, message_id=None)
return SendResult(success=False, error="draft_rejected")
except Exception as e:
# Most likely: BadRequest because this bot/chat doesn't allow
# drafts, or a transient server hiccup. The caller treats any
# failure as "fall back to edit-based for this response".
logger.debug(
"[%s] sendMessageDraft failed (chat=%s draft_id=%s): %s",
self.name, chat_id, draft_id, e,
)
return SendResult(success=False, error=str(e))
async def _send_message_with_thread_fallback(self, **kwargs):
"""Send a Telegram message, retrying once without message_thread_id
if Telegram returns 'Message thread not found'.
@@ -2438,7 +2773,7 @@ class TelegramAdapter(BasePlatformAdapter):
with open(audio_path, "rb") as audio_file:
ext = os.path.splitext(audio_path)[1].lower()
# .ogg / .opus files -> send as voice (round playable bubble)
if ext in (".ogg", ".opus"):
if ext in {".ogg", ".opus"}:
_voice_thread = self._metadata_thread_id(metadata)
reply_to_id = self._reply_to_message_id_for_send(reply_to, metadata)
voice_thread_kwargs = self._thread_kwargs_for_send(
@@ -2462,7 +2797,7 @@ class TelegramAdapter(BasePlatformAdapter):
"voice",
reset_media=lambda: audio_file.seek(0),
)
elif ext in (".mp3", ".m4a"):
elif ext in {".mp3", ".m4a"}:
# Telegram's Bot API sendAudio only accepts MP3 / M4A.
_audio_thread = self._metadata_thread_id(metadata)
reply_to_id = self._reply_to_message_id_for_send(reply_to, metadata)
@@ -3213,18 +3548,18 @@ class TelegramAdapter(BasePlatformAdapter):
configured = self.config.extra.get("require_mention")
if configured is not None:
if isinstance(configured, str):
return configured.lower() in ("true", "1", "yes", "on")
return configured.lower() in {"true", "1", "yes", "on"}
return bool(configured)
return os.getenv("TELEGRAM_REQUIRE_MENTION", "false").lower() in ("true", "1", "yes", "on")
return os.getenv("TELEGRAM_REQUIRE_MENTION", "false").lower() in {"true", "1", "yes", "on"}
def _telegram_guest_mode(self) -> bool:
"""Return whether non-allowlisted groups may trigger via direct @mention."""
configured = self.config.extra.get("guest_mode")
if configured is not None:
if isinstance(configured, str):
return configured.lower() in ("true", "1", "yes", "on")
return configured.lower() in {"true", "1", "yes", "on"}
return bool(configured)
return os.getenv("TELEGRAM_GUEST_MODE", "false").lower() in ("true", "1", "yes", "on")
return os.getenv("TELEGRAM_GUEST_MODE", "false").lower() in {"true", "1", "yes", "on"}
def _telegram_free_response_chats(self) -> set[str]:
raw = self.config.extra.get("free_response_chats")
@@ -3313,7 +3648,7 @@ class TelegramAdapter(BasePlatformAdapter):
if not chat:
return False
chat_type = str(getattr(chat, "type", "")).split(".")[-1].lower()
return chat_type in ("group", "supergroup")
return chat_type in {"group", "supergroup"}
def _is_reply_to_bot(self, message: Message) -> bool:
if not self._bot or not getattr(message, "reply_to_message", None):
@@ -3577,12 +3912,27 @@ class TelegramAdapter(BasePlatformAdapter):
"""
current_task = asyncio.current_task()
try:
# Adaptive delay: if the latest chunk is near Telegram's 4096-char
# split point, a continuation is almost certain — wait longer.
# Adaptive delay tiers:
# - last chunk ≥ _SPLIT_THRESHOLD: a continuation is almost
# certain → wait the longer split delay.
# - total accumulated text ≤ _TEXT_BATCH_FAST_LEN (~320 cp):
# short message → cap delay at _TEXT_BATCH_FAST_DELAY_S
# so the agent sees the text near-instantly.
# - total ≤ _TEXT_BATCH_SHORT_LEN (~1024 cp):
# medium → cap at _TEXT_BATCH_SHORT_DELAY_S.
# - otherwise: use the configured cap.
# Tiers compose with operator overrides via the env-var-driven
# ``_text_batch_delay_seconds`` (e.g. an operator who sets the
# cap below 0.18s gets that lower number on every tier).
pending = self._pending_text_batches.get(key)
last_len = getattr(pending, "_last_chunk_len", 0) if pending else 0
total_len = len(getattr(pending, "text", "") or "") if pending else 0
if last_len >= self._SPLIT_THRESHOLD:
delay = self._text_batch_split_delay_seconds
elif total_len <= self._TEXT_BATCH_FAST_LEN:
delay = min(self._text_batch_delay_seconds, self._TEXT_BATCH_FAST_DELAY_S)
elif total_len <= self._TEXT_BATCH_SHORT_LEN:
delay = min(self._text_batch_delay_seconds, self._TEXT_BATCH_SHORT_DELAY_S)
else:
delay = self._text_batch_delay_seconds
await asyncio.sleep(delay)
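Review note: the tier selection above can be checked as a pure function. The sketch below mirrors the branch order in the hunk; the function name is hypothetical, and `split_threshold=4000` is an assumed placeholder because the value of `_SPLIT_THRESHOLD` does not appear in this diff.

```python
def pick_batch_delay(
    last_len: int,
    total_len: int,
    *,
    base_delay: float = 0.3,      # configured cap (new default in this diff)
    split_delay: float = 1.0,     # longer wait when a continuation is likely
    split_threshold: int = 4000,  # ASSUMPTION: real _SPLIT_THRESHOLD not shown
    fast_len: int = 320,
    fast_delay: float = 0.18,
    short_len: int = 1024,
    short_delay: float = 0.24,
) -> float:
    """Return how long to wait before flushing a pending text batch."""
    if last_len >= split_threshold:
        # Last chunk sits near the client-side split point: expect more.
        return split_delay
    if total_len <= fast_len:
        return min(base_delay, fast_delay)
    if total_len <= short_len:
        return min(base_delay, short_delay)
    return base_delay


for last, total in ((10, 10), (500, 500), (2000, 2000), (4050, 4050)):
    print(last, total, pick_batch_delay(last, total))
```

The `min()` calls are what let an operator-configured cap below 0.18 s win on every tier, as the comment in the hunk describes.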
@@ -3857,7 +4207,7 @@ class TelegramAdapter(BasePlatformAdapter):
# For text files, inject content into event.text (capped at 100 KB)
MAX_TEXT_INJECT_BYTES = 100 * 1024
if ext in (".md", ".txt") and len(raw_bytes) <= MAX_TEXT_INJECT_BYTES:
if ext in {".md", ".txt"} and len(raw_bytes) <= MAX_TEXT_INJECT_BYTES:
try:
text_content = raw_bytes.decode("utf-8")
display_name = original_filename or f"document{ext}"
@@ -4096,7 +4446,7 @@ class TelegramAdapter(BasePlatformAdapter):
# Determine chat type
chat_type = "dm"
if chat.type in (ChatType.GROUP, ChatType.SUPERGROUP):
if chat.type in {ChatType.GROUP, ChatType.SUPERGROUP}:
chat_type = "group"
elif chat.type == ChatType.CHANNEL:
chat_type = "channel"
@@ -4212,7 +4562,7 @@ class TelegramAdapter(BasePlatformAdapter):
def _reactions_enabled(self) -> bool:
"""Check if message reactions are enabled via config/env."""
return os.getenv("TELEGRAM_REACTIONS", "false").lower() not in ("false", "0", "no")
return os.getenv("TELEGRAM_REACTIONS", "false").lower() not in {"false", "0", "no"}
async def _set_reaction(self, chat_id: str, message_id: str, emoji: str) -> bool:
"""Set a single emoji reaction on a Telegram message."""
+1 -1
@@ -59,7 +59,7 @@ class TelegramFallbackTransport(httpx.AsyncBaseTransport):
"""
def __init__(self, fallback_ips: Iterable[str], **transport_kwargs):
self._fallback_ips = [ip for ip in dict.fromkeys(_normalize_fallback_ips(fallback_ips))]
self._fallback_ips = list(dict.fromkeys(_normalize_fallback_ips(fallback_ips)))
proxy_url = _resolve_proxy_url(target_hosts=[_TELEGRAM_API_HOST, *self._fallback_ips])
if proxy_url and "proxy" not in transport_kwargs:
transport_kwargs["proxy"] = proxy_url
+4 -4
@@ -295,7 +295,7 @@ class WeComAdapter(BasePlatformAdapter):
auth_payload = await self._wait_for_handshake(req_id)
errcode = auth_payload.get("errcode", 0)
if errcode not in (0, None):
if errcode not in {0, None}:
errmsg = auth_payload.get("errmsg", "authentication failed")
raise RuntimeError(f"{errmsg} (errcode={errcode})")
@@ -320,7 +320,7 @@ class WeComAdapter(BasePlatformAdapter):
if self._payload_req_id(payload) == req_id:
return payload
logger.debug("[%s] Ignoring pre-auth payload: %s", self.name, payload.get("cmd"))
elif msg.type in (aiohttp.WSMsgType.CLOSED, aiohttp.WSMsgType.CLOSE, aiohttp.WSMsgType.ERROR):
elif msg.type in {aiohttp.WSMsgType.CLOSED, aiohttp.WSMsgType.CLOSE, aiohttp.WSMsgType.ERROR}:
raise RuntimeError("WeCom websocket closed during authentication")
async def _listen_loop(self) -> None:
@@ -360,7 +360,7 @@ class WeComAdapter(BasePlatformAdapter):
payload = self._parse_json(msg.data)
if payload:
await self._dispatch_payload(payload)
elif msg.type in (aiohttp.WSMsgType.CLOSE, aiohttp.WSMsgType.CLOSED, aiohttp.WSMsgType.ERROR):
elif msg.type in {aiohttp.WSMsgType.CLOSE, aiohttp.WSMsgType.CLOSED, aiohttp.WSMsgType.ERROR}:
raise RuntimeError("WeCom websocket closed")
async def _heartbeat_loop(self) -> None:
@@ -998,7 +998,7 @@ class WeComAdapter(BasePlatformAdapter):
@staticmethod
def _response_error(response: Dict[str, Any]) -> Optional[str]:
errcode = response.get("errcode", 0)
if errcode in (0, None):
if errcode in {0, None}:
return None
errmsg = str(response.get("errmsg") or "unknown error")
return f"WeCom errcode {errcode}: {errmsg}"
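Review note: the tuple-to-set rewrite is behaviour-preserving for hashable operands, but the two forms differ when the left-hand value is unhashable. Whether `errcode` can ever be a list or dict depends on the server payload, which this diff does not show, so the sketch below is only a caution. All three function names are hypothetical.

```python
def ok_with_tuple(errcode) -> bool:
    """Old form: tuple membership compares by identity/equality only."""
    return errcode in (0, None)


def ok_with_set(errcode) -> bool:
    """New form: set membership hashes the operand first."""
    return errcode in {0, None}


def ok_defensive(errcode) -> bool:
    """Set membership that treats an unhashable operand as 'not OK'."""
    try:
        return errcode in {0, None}
    except TypeError:
        return False


# Hashable values behave identically in both forms.
for value in (0, None, 1, "0", 0.0, False):
    assert ok_with_tuple(value) == ok_with_set(value)

# An unhashable value (for example a JSON array) is where they diverge.
print(ok_with_tuple([]))
try:
    ok_with_set([])
except TypeError as exc:
    print("set membership raised:", exc)
print(ok_defensive([]))
```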
+4 -4
@@ -605,7 +605,7 @@ def _assert_weixin_cdn_url(url: str) -> None:
except Exception as exc: # noqa: BLE001
raise ValueError(f"Unparseable media URL: {url!r}") from exc
if scheme not in ("http", "https"):
if scheme not in {"http", "https"}:
raise ValueError(
f"Media URL has disallowed scheme {scheme!r}; only http/https are permitted."
)
@@ -983,7 +983,7 @@ def _extract_text(item_list: List[Dict[str, Any]]) -> str:
ref = item.get("ref_msg") or {}
ref_item = ref.get("message_item") or {}
ref_type = ref_item.get("type")
if ref_type in (ITEM_IMAGE, ITEM_VIDEO, ITEM_FILE, ITEM_VOICE):
if ref_type in {ITEM_IMAGE, ITEM_VIDEO, ITEM_FILE, ITEM_VOICE}:
title = ref.get("title") or ""
prefix = f"[引用媒体: {title}]\n" if title else "[引用媒体]\n"
return f"{prefix}{text}".strip()
@@ -1331,7 +1331,7 @@ class WeixinAdapter(BasePlatformAdapter):
ret = response.get("ret", 0)
errcode = response.get("errcode", 0)
if ret not in (0, None) or errcode not in (0, None):
if ret not in {0, None} or errcode not in {0, None}:
if (ret == SESSION_EXPIRED_ERRCODE or errcode == SESSION_EXPIRED_ERRCODE
or _is_stale_session_ret(ret, errcode, response.get("errmsg"))):
logger.error("[%s] Session expired; pausing for 10 minutes", self.name)
@@ -1601,7 +1601,7 @@ class WeixinAdapter(BasePlatformAdapter):
if resp and isinstance(resp, dict):
ret = resp.get("ret")
errcode = resp.get("errcode")
if (ret is not None and ret not in (0,)) or (errcode is not None and errcode not in (0,)):
if (ret is not None and ret not in {0,}) or (errcode is not None and errcode not in {0,}):
is_session_expired = (
ret == SESSION_EXPIRED_ERRCODE
or errcode == SESSION_EXPIRED_ERRCODE
+4 -4
@@ -301,9 +301,9 @@ class WhatsAppAdapter(BasePlatformAdapter):
configured = self.config.extra.get("require_mention")
if configured is not None:
if isinstance(configured, str):
return configured.lower() in ("true", "1", "yes", "on")
return configured.lower() in {"true", "1", "yes", "on"}
return bool(configured)
return os.getenv("WHATSAPP_REQUIRE_MENTION", "false").lower() in ("true", "1", "yes", "on")
return os.getenv("WHATSAPP_REQUIRE_MENTION", "false").lower() in {"true", "1", "yes", "on"}
def _whatsapp_free_response_chats(self) -> set[str]:
raw = self.config.extra.get("free_response_chats")
@@ -679,7 +679,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
# getattr-with-default keeps tests that construct the adapter via
# ``WhatsAppAdapter.__new__`` (bypassing __init__) working without
# every _make_adapter() helper having to seed the attribute.
if getattr(self, "_shutting_down", False) and returncode in (0, -2, -15):
if getattr(self, "_shutting_down", False) and returncode in {0, -2, -15}:
logger.info(
"[%s] Bridge exited during shutdown (code %d).",
self.name,
@@ -1183,7 +1183,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
if msg_type == MessageType.DOCUMENT and cached_urls:
for doc_path in cached_urls:
ext = Path(doc_path).suffix.lower()
if ext in (".txt", ".md", ".csv", ".json", ".xml", ".yaml", ".yml", ".log", ".py", ".js", ".ts", ".html", ".css"):
if ext in {".txt", ".md", ".csv", ".json", ".xml", ".yaml", ".yml", ".log", ".py", ".js", ".ts", ".html", ".css"}:
try:
file_size = Path(doc_path).stat().st_size
if file_size > MAX_TEXT_INJECT_BYTES:
+5 -5
@@ -2228,7 +2228,7 @@ class MediaResolveMiddleware(InboundMiddleware):
resp.raise_for_status()
payload = resp.json()
code = payload.get("code")
if code not in (None, 0):
if code not in {None, 0}:
raise RuntimeError(
f"resource/v1/download failed: code={code}, msg={payload.get('msg', '')}"
)
@@ -2391,7 +2391,7 @@ class MediaResolveMiddleware(InboundMiddleware):
rid = m.group(2)
kind, _, filename = head.partition(":")
kind = kind.strip()
if kind not in ("image", "file"):
if kind not in {"image", "file"}:
continue
if rid in seen:
continue
@@ -2993,10 +2993,10 @@ class ConnectionManager:
# Fire-and-forget heartbeat ACKs — server always responds but callers don't
# wait on these; silently discard to avoid "Unmatched Response" noise.
if cmd_type == CMD_TYPE["Response"] and cmd in (
if cmd_type == CMD_TYPE["Response"] and cmd in {
"send_group_heartbeat",
"send_private_heartbeat",
):
}:
logger.debug("[%s] Heartbeat ACK received: cmd=%s msg_id=%s", adapter.name, cmd, msg_id)
return
@@ -3369,7 +3369,7 @@ class MediaSendHandler(ABC):
# Remove keys already passed explicitly to avoid "multiple values" TypeError
fwd_kwargs = {
k: v for k, v in kwargs.items()
if k not in ("file_uuid", "filename", "content_type")
if k not in {"file_uuid", "filename", "content_type"}
}
msg_body = self.build_msg_body(
upload_result,
+2 -2
@@ -150,7 +150,7 @@ def _parse_jpeg_size(buf: bytes) -> Optional[dict[str, int]]:
i += 1
continue
marker = buf[i + 1]
if marker in (0xC0, 0xC2):
if marker in {0xC0, 0xC2}:
h = struct.unpack(">H", buf[i + 5: i + 7])[0]
w = struct.unpack(">H", buf[i + 7: i + 9])[0]
return {"width": w, "height": h}
@@ -165,7 +165,7 @@ def _parse_gif_size(buf: bytes) -> Optional[dict[str, int]]:
if len(buf) < 10:
return None
sig = buf[:6].decode("ascii", errors="replace")
if sig not in ("GIF87a", "GIF89a"):
if sig not in {"GIF87a", "GIF89a"}:
return None
w = struct.unpack("<H", buf[6:8])[0]
h = struct.unpack("<H", buf[8:10])[0]
+1 -1
@@ -702,7 +702,7 @@ def decode_inbound_push(data: bytes) -> Optional[dict]:
"trace_id": trace_id,
}
# Filter out empty values (keeps the API clean)
return {k: v for k, v in result.items() if v or k in ("msg_body", "msg_seq")}
return {k: v for k, v in result.items() if v or k in {"msg_body", "msg_seq"}}
except Exception as e:
if DEBUG_MODE:
logger.debug("[yuanbao_proto] decode_inbound_push failed: %s", e)
+178 -203
@@ -268,9 +268,8 @@ def _build_replay_entry(role: str, content: Any, msg: Dict[str, Any]) -> Dict[st
# Preserve empty-string sentinel for thinking-mode replay.
if _rval is None:
continue
else:
if not _rval:
continue
elif not _rval:
continue
entry[_rkey] = _rval
return entry
@@ -289,7 +288,7 @@ def _last_transcript_timestamp(history: Optional[List[Dict[str, Any]]]) -> Any:
if not isinstance(msg, dict):
continue
role = msg.get("role")
if not role or role in ("session_meta", "system"):
if not role or role in {"session_meta", "system"}:
continue
ts = msg.get("timestamp")
if ts is not None:
@@ -473,7 +472,7 @@ if _config_path.exists():
# gateway resolves these to Path.home() later (line ~255).
# Writing the raw placeholder here would just be noise.
# Only bridge explicit absolute paths from config.yaml.
if _cfg_key == "cwd" and str(_val) in (".", "auto", "cwd"):
if _cfg_key == "cwd" and str(_val) in {".", "auto", "cwd"}:
continue
# Expand shell tilde in cwd so subprocess.Popen never
# receives a literal "~/" which the kernel rejects.
@@ -617,7 +616,7 @@ os.environ["HERMES_EXEC_ASK"] = "1"
# to home directory. MESSAGING_CWD is accepted as a backward-compat
# fallback (deprecated — the warning above tells users to migrate).
_configured_cwd = os.environ.get("TERMINAL_CWD", "")
if not _configured_cwd or _configured_cwd in (".", "auto", "cwd"):
if not _configured_cwd or _configured_cwd in {".", "auto", "cwd"}:
_fallback = os.getenv("MESSAGING_CWD") or str(Path.home())
os.environ["TERMINAL_CWD"] = _fallback
@@ -850,7 +849,7 @@ def _skill_slug_from_frontmatter(skill_md: Path) -> tuple[str | None, str | None
if line.startswith("name:"):
raw = line.split(":", 1)[1].strip()
# Strip YAML quote wrappers if present
if len(raw) >= 2 and raw[0] == raw[-1] and raw[0] in ('"', "'"):
if len(raw) >= 2 and raw[0] == raw[-1] and raw[0] in {'"', "'"}:
raw = raw[1:-1]
declared_name = raw.strip()
break
@@ -892,7 +891,7 @@ def _check_unavailable_skill(command_name: str) -> str | None:
if not skills_dir.exists():
continue
for skill_md in skills_dir.rglob("SKILL.md"):
if any(part in ('.git', '.github', '.hub', '.archive') for part in skill_md.parts):
if any(part in {'.git', '.github', '.hub', '.archive'} for part in skill_md.parts):
continue
slug, declared_name = _skill_slug_from_frontmatter(skill_md)
if not slug or not declared_name:
@@ -1034,7 +1033,7 @@ def _parse_session_key(session_key: str) -> "dict | None":
"chat_type": parts[3],
"chat_id": parts[4],
}
if len(parts) > 5 and parts[3] in ("dm", "thread"):
if len(parts) > 5 and parts[3] in {"dm", "thread"}:
result["thread_id"] = parts[5]
return result
return None
@@ -1249,6 +1248,7 @@ class GatewayRunner:
# Per-session reasoning effort overrides from /reasoning.
# Key: session_key, Value: parsed reasoning config dict.
self._session_reasoning_overrides: Dict[str, Dict[str, Any]] = {}
self._kanban_notifier_profile = self._active_profile_name()
# Teams meeting pipeline runtime (bound later when msgraph_webhook adapter exists).
self._teams_pipeline_runtime = None
self._teams_pipeline_runtime_error: Optional[str] = None
@@ -1561,7 +1561,7 @@ class GatewayRunner:
enabled_chats.clear()
enabled_chats.update(
key[len(prefix):] for key, mode in self._voice_mode.items()
if mode in ("voice_only", "all") and key.startswith(prefix)
if mode in {"voice_only", "all"} and key.startswith(prefix)
)
async def _safe_adapter_disconnect(self, adapter, platform) -> None:
@@ -1991,7 +1991,7 @@ class GatewayRunner:
# Both "queue" and "steer" modes imply the user doesn't want messages
# to be lost during restart — queue them for the newly-spawned gateway
# process to pick up. "interrupt" mode drops them (current behaviour).
return self._restart_requested and self._busy_input_mode in ("queue", "steer")
return self._restart_requested and self._busy_input_mode in {"queue", "steer"}
# -------- /queue FIFO helpers --------------------------------------
# /queue must produce one full agent turn per invocation, in FIFO
@@ -2401,7 +2401,7 @@ class GatewayRunner:
raw = cfg_get(cfg, "display", "background_process_notifications")
if raw is False:
mode = "off"
elif raw not in (None, ""):
elif raw not in {None, ""}:
mode = str(raw)
except Exception:
pass
@@ -3247,7 +3247,7 @@ class GatewayRunner:
# for this process's lifetime.
try:
_redact_raw = os.getenv("HERMES_REDACT_SECRETS", "true")
_redact_on = _redact_raw.lower() in ("1", "true", "yes", "on")
_redact_on = _redact_raw.lower() in {"1", "true", "yes", "on"}
if _redact_on:
logger.info(
"Secret redaction: ENABLED (tool output, logs, and chat "
@@ -3275,6 +3275,30 @@ class GatewayRunner:
write_runtime_status(gateway_state="starting", exit_reason=None)
except Exception:
pass
# Log any active supply-chain security advisories. Operators see this
# in gateway.log and `hermes status` surfaces it; we do NOT block
# startup or surface it inline to user messages, since the gateway
# operator is the one who can act on it (uninstall the package,
# rotate credentials). See hermes_cli/security_advisories.py.
try:
from hermes_cli.security_advisories import (
detect_compromised,
gateway_log_message,
)
_adv_hits = detect_compromised()
_adv_msg = gateway_log_message(_adv_hits)
if _adv_msg:
logger.warning("%s", _adv_msg)
logger.warning(
"Run `hermes doctor` on the gateway host for full "
"remediation steps."
)
except Exception:
logger.debug(
"security advisory check failed at gateway startup",
exc_info=True,
)
# Warn if no user allowlists are configured and open access is not opted in
_builtin_allowed_vars = (
@@ -3329,8 +3353,8 @@ class GatewayRunner:
_any_allowlist = any(
os.getenv(v) for v in _builtin_allowed_vars + _plugin_allowed_vars
)
_allow_all = os.getenv("GATEWAY_ALLOW_ALL_USERS", "").lower() in ("true", "1", "yes") or any(
os.getenv(v, "").lower() in ("true", "1", "yes")
_allow_all = os.getenv("GATEWAY_ALLOW_ALL_USERS", "").lower() in {"true", "1", "yes"} or any(
os.getenv(v, "").lower() in {"true", "1", "yes"}
for v in _builtin_allow_all_vars + _plugin_allow_all_vars
)
if not _any_allowlist and not _allow_all:
@@ -4071,6 +4095,14 @@ class GatewayRunner:
break
await asyncio.sleep(1)
def _active_profile_name(self) -> str:
"""Return the profile name this gateway represents."""
try:
from hermes_cli.profiles import get_active_profile_name
return get_active_profile_name() or "default"
except Exception:
return "default"
async def _kanban_notifier_watcher(self, interval: float = 5.0) -> None:
"""Poll ``kanban_notify_subs`` and deliver terminal events to users.
@@ -4119,6 +4151,10 @@ class GatewayRunner:
self, "_kanban_sub_fail_counts", {}
)
self._kanban_sub_fail_counts = sub_fail_counts
notifier_profile = getattr(self, "_kanban_notifier_profile", None)
if not notifier_profile:
notifier_profile = self._active_profile_name()
self._kanban_notifier_profile = notifier_profile
# Initial delay so the gateway can finish wiring adapters.
await asyncio.sleep(5)
@@ -4181,6 +4217,13 @@ class GatewayRunner:
if not subs:
logger.debug("kanban notifier: board %s has no subscriptions", slug)
for sub in subs:
owner_profile = sub.get("notifier_profile") or None
if owner_profile and owner_profile != notifier_profile:
logger.debug(
"kanban notifier: subscription for %s owned by profile %s; current profile %s skipping",
sub.get("task_id"), owner_profile, notifier_profile,
)
continue
platform = (sub.get("platform") or "").lower()
if platform not in active_platforms:
logger.debug(
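Note on the notifier ownership check added in this hunk: a subscription with no recorded `notifier_profile` is still delivered by whichever gateway sees it, while a subscription that names a different profile is skipped. A minimal sketch of that rule as a standalone predicate, assuming subscriptions are plain dicts; the function name and the sample rows are hypothetical.

```python
def owned_by_current_profile(sub: dict, notifier_profile: str) -> bool:
    """Return True when this gateway profile should deliver the subscription."""
    owner = sub.get("notifier_profile") or None  # empty string counts as unowned
    return owner is None or owner == notifier_profile


# Hypothetical subscription rows for illustration only.
subs = [
    {"task_id": "t1", "notifier_profile": "work"},
    {"task_id": "t2", "notifier_profile": "home"},
    {"task_id": "t3", "notifier_profile": ""},
    {"task_id": "t4"},
]
deliverable = [s["task_id"] for s in subs if owned_by_current_profile(s, "work")]
```

Treating unowned rows as deliverable keeps subscriptions created before the column existed working, at the cost that two gateways on different profiles may both pick them up.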
@@ -4360,7 +4403,7 @@ class GatewayRunner:
# dispatcher respawns the task and it cycles into the
# same state. See the longer comment on TERMINAL_KINDS
# above for the failure mode this prevents.
task_terminal = task and task.status in ("done", "archived")
task_terminal = task and task.status in {"done", "archived"}
if task_terminal:
await asyncio.to_thread(
self._kanban_unsub, sub, board_slug,
@@ -4460,7 +4503,7 @@ class GatewayRunner:
logger.warning("kanban dispatcher: config loader unavailable; disabled")
return
env_override = os.environ.get("HERMES_KANBAN_DISPATCH_IN_GATEWAY", "").strip().lower()
if env_override in ("0", "false", "no", "off"):
if env_override in {"0", "false", "no", "off"}:
logger.info("kanban dispatcher: disabled via HERMES_KANBAN_DISPATCH_IN_GATEWAY env")
return
@@ -4483,8 +4526,7 @@ class GatewayRunner:
return
interval = float(kanban_cfg.get("dispatch_interval_seconds", 60) or 60)
if interval < 1.0:
interval = 1.0 # sanity floor — tighter than this is a footgun
interval = max(interval, 1.0) # sanity floor — tighter than this is a footgun
# Read max_spawn config to limit concurrent kanban tasks
max_spawn = kanban_cfg.get("max_spawn", None)
@@ -4736,34 +4778,33 @@ class GatewayRunner:
await build_channel_directory(self.adapters)
except Exception:
pass
# Check if the failure is non-retryable
elif adapter.has_fatal_error and not adapter.fatal_error_retryable:
self._update_platform_runtime_status(
platform.value,
platform_state="fatal",
error_code=adapter.fatal_error_code,
error_message=adapter.fatal_error_message,
)
logger.warning(
"Reconnect %s: non-retryable error (%s), removing from retry queue",
platform.value, adapter.fatal_error_message,
)
del self._failed_platforms[platform]
else:
# Check if the failure is non-retryable
if adapter.has_fatal_error and not adapter.fatal_error_retryable:
self._update_platform_runtime_status(
platform.value,
platform_state="fatal",
error_code=adapter.fatal_error_code,
error_message=adapter.fatal_error_message,
)
logger.warning(
"Reconnect %s: non-retryable error (%s), removing from retry queue",
platform.value, adapter.fatal_error_message,
)
del self._failed_platforms[platform]
else:
self._update_platform_runtime_status(
platform.value,
platform_state="retrying",
error_code=adapter.fatal_error_code,
error_message=adapter.fatal_error_message or "failed to reconnect",
)
backoff = min(30 * (2 ** (attempt - 1)), _BACKOFF_CAP)
info["attempts"] = attempt
info["next_retry"] = time.monotonic() + backoff
logger.info(
"Reconnect %s failed, next retry in %ds",
platform.value, backoff,
)
self._update_platform_runtime_status(
platform.value,
platform_state="retrying",
error_code=adapter.fatal_error_code,
error_message=adapter.fatal_error_message or "failed to reconnect",
)
backoff = min(30 * (2 ** (attempt - 1)), _BACKOFF_CAP)
info["attempts"] = attempt
info["next_retry"] = time.monotonic() + backoff
logger.info(
"Reconnect %s failed, next retry in %ds",
platform.value, backoff,
)
except Exception as e:
self._update_platform_runtime_status(
platform.value,
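Note on the reconnect delay in this hunk: the expression `min(30 * (2 ** (attempt - 1)), _BACKOFF_CAP)` is a capped exponential backoff. A minimal sketch of the schedule it produces, assuming a cap of 300 seconds; the real value of `_BACKOFF_CAP` is not shown in this diff, and the function name is hypothetical.

```python
BACKOFF_CAP = 300  # assumed cap in seconds; the real _BACKOFF_CAP is not visible here


def reconnect_backoff(attempt: int, base: int = 30, cap: int = BACKOFF_CAP) -> int:
    """Return the delay before the next retry: base doubles per attempt, up to cap."""
    return min(base * (2 ** (attempt - 1)), cap)


# Delays for attempts 1 through 6 under the assumed cap.
delays = [reconnect_backoff(n) for n in range(1, 7)]
```

With these assumed numbers the delay doubles from 30 seconds and stays at the cap from the fifth attempt onward.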
@@ -5139,12 +5180,12 @@ class GatewayRunner:
try:
_gw_cfg = _load_gateway_config()
_raw = cfg_get(_gw_cfg, "display", "platforms", "telegram", "notifications")
if _raw not in (None, ""):
if _raw not in {None, ""}:
_notify_mode = str(_raw).strip().lower()
except Exception:
pass
_notify_mode = _notify_mode or "important"
if _notify_mode not in ("all", "important"):
if _notify_mode not in {"all", "important"}:
logger.warning(
"Unknown telegram notifications mode '%s', "
"defaulting to 'important' (valid: all, important)",
@@ -5321,7 +5362,7 @@ class GatewayRunner:
# connection, so HA events are always authorized.
# Webhook events are authenticated via HMAC signature validation in
# the adapter itself — no user allowlist applies.
if source.platform in (Platform.HOMEASSISTANT, Platform.WEBHOOK):
if source.platform in {Platform.HOMEASSISTANT, Platform.WEBHOOK}:
return True
user_id = source.user_id
@@ -5394,12 +5435,12 @@ class GatewayRunner:
# Per-platform allow-all flag (e.g., DISCORD_ALLOW_ALL_USERS=true)
platform_allow_all_var = platform_allow_all_map.get(source.platform, "")
if platform_allow_all_var and os.getenv(platform_allow_all_var, "").lower() in ("true", "1", "yes"):
if platform_allow_all_var and os.getenv(platform_allow_all_var, "").lower() in {"true", "1", "yes"}:
return True
if getattr(source, "is_bot", False):
allow_bots_var = platform_allow_bots_map.get(source.platform)
if allow_bots_var and os.getenv(allow_bots_var, "none").lower().strip() in ("mentions", "all"):
if allow_bots_var and os.getenv(allow_bots_var, "none").lower().strip() in {"mentions", "all"}:
return True
# Discord role-based access (DISCORD_ALLOWED_ROLES): the adapter's
@@ -5430,7 +5471,7 @@ class GatewayRunner:
if not platform_allowlist and not group_user_allowlist and not group_chat_allowlist and not global_allowlist:
# No allowlists configured -- check global allow-all flag
return os.getenv("GATEWAY_ALLOW_ALL_USERS", "").lower() in ("true", "1", "yes")
return os.getenv("GATEWAY_ALLOW_ALL_USERS", "").lower() in {"true", "1", "yes"}
# Telegram can optionally authorize group traffic by chat ID.
# Keep this separate from TELEGRAM_GROUP_ALLOWED_USERS, which gates
@@ -5725,9 +5766,9 @@ class GatewayRunner:
raw = (event.text or "").strip()
# Accept /approve and /deny as shorthand for yes/no
cmd = event.get_command()
if cmd in ("approve", "yes"):
if cmd in {"approve", "yes"}:
response_text = "y"
elif cmd in ("deny", "no"):
elif cmd in {"deny", "no"}:
response_text = "n"
else:
_recognized_cmd = None
@@ -5809,17 +5850,17 @@ class GatewayRunner:
_raw_reply = (event.text or "").strip()
_cmd_reply = event.get_command()
_confirm_choice = None
if _cmd_reply in ("approve", "yes", "ok", "confirm"):
if _cmd_reply in {"approve", "yes", "ok", "confirm"}:
_confirm_choice = "once"
elif _cmd_reply in ("always", "remember"):
elif _cmd_reply in {"always", "remember"}:
_confirm_choice = "always"
elif _cmd_reply in ("cancel", "no", "deny", "nevermind"):
elif _cmd_reply in {"cancel", "no", "deny", "nevermind"}:
_confirm_choice = "cancel"
elif _raw_reply.lower() in ("approve", "approve once", "once"):
elif _raw_reply.lower() in {"approve", "approve once", "once"}:
_confirm_choice = "once"
elif _raw_reply.lower() in ("always", "always approve"):
elif _raw_reply.lower() in {"always", "always approve"}:
_confirm_choice = "always"
elif _raw_reply.lower() in ("cancel", "nevermind", "no"):
elif _raw_reply.lower() in {"cancel", "nevermind", "no"}:
_confirm_choice = "cancel"
if _confirm_choice is not None:
_resolved = await _slash_confirm_mod.resolve(
@@ -5955,7 +5996,7 @@ class GatewayRunner:
# Semantics: each /queue invocation produces its own full agent
# turn, processed in FIFO order after the current run (and any
# earlier /queue items) finishes. Messages are NOT merged.
if event.get_command() in ("queue", "q"):
if event.get_command() in {"queue", "q"}:
queued_text = event.get_command_args().strip()
if not queued_text:
return "Usage: /queue <prompt>"
@@ -6028,7 +6069,7 @@ class GatewayRunner:
# The agent thread is blocked on a threading.Event inside
# tools/approval.py — sending an interrupt won't unblock it.
# Route directly to the approval handler so the event is signalled.
if _cmd_def_inner and _cmd_def_inner.name in ("approve", "deny"):
if _cmd_def_inner and _cmd_def_inner.name in {"approve", "deny"}:
if _cmd_def_inner.name == "approve":
return await self._handle_approve_command(event)
return await self._handle_deny_command(event)
@@ -6059,16 +6100,10 @@ class GatewayRunner:
# continuation prompt against the current turn.
if _cmd_def_inner and _cmd_def_inner.name == "goal":
_goal_arg = (event.get_command_args() or "").strip().lower()
if not _goal_arg or _goal_arg in ("status", "pause", "resume", "clear", "stop", "done"):
if not _goal_arg or _goal_arg in {"status", "pause", "resume", "clear", "stop", "done"}:
return await self._handle_goal_command(event)
return "Agent is running — use /goal status / pause / clear mid-run, or /stop before setting a new goal."
# /subgoal is safe mid-run — it only modifies the active goal's
# checklist, which the judge consults at turn boundaries. There
# is no race with the running turn.
if _cmd_def_inner and _cmd_def_inner.name == "subgoal":
return await self._handle_subgoal_command(event)
# Session-level toggles that are safe to run mid-agent —
# /yolo can unblock a pending approval prompt, /verbose cycles
# the tool-progress display mode for the ongoing stream.
@@ -6077,7 +6112,7 @@ class GatewayRunner:
# /fast and /reasoning are config-only and take effect next
# message, so they fall through to the catch-all busy response
# below — users should wait and set them between turns.
if _cmd_def_inner and _cmd_def_inner.name in ("yolo", "verbose"):
if _cmd_def_inner and _cmd_def_inner.name in {"yolo", "verbose"}:
if _cmd_def_inner.name == "yolo":
return await self._handle_yolo_command(event)
if _cmd_def_inner.name == "verbose":
@@ -6196,10 +6231,9 @@ class GatewayRunner:
return None
logger.debug("PRIORITY interrupt for session %s", _quick_key)
running_agent.interrupt(event.text)
if _quick_key in self._pending_messages:
self._pending_messages[_quick_key] += "\n" + event.text
else:
self._pending_messages[_quick_key] = event.text
# NOTE: self._pending_messages was write-only (never consumed).
# The actual interrupt message is delivered via adapter._pending_messages
# which is read by _run_agent. Removed to prevent unbounded growth.
return None
# Check for commands
@@ -6448,9 +6482,6 @@ class GatewayRunner:
if canonical == "goal":
return await self._handle_goal_command(event)
if canonical == "subgoal":
return await self._handle_subgoal_command(event)
if canonical == "voice":
return await self._handle_voice_command(event)
@@ -6471,13 +6502,23 @@ class GatewayRunner:
exec_cmd = qcmd.get("command", "")
if exec_cmd:
try:
# Sanitize env to prevent credential leakage —
# quick commands run in the gateway process which
# has all API keys in os.environ.
from tools.environments.local import _sanitize_subprocess_env
sanitized_env = _sanitize_subprocess_env(os.environ.copy())
proc = await asyncio.create_subprocess_shell(
exec_cmd,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
env=sanitized_env,
)
stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=30)
output = (stdout or stderr).decode().strip()
# Redact any remaining sensitive patterns in output
if output:
from agent.redact import redact_sensitive_text
output = redact_sensitive_text(output)
return output if output else "Command returned no output."
except asyncio.TimeoutError:
return "Quick command timed out (30s)."
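Note on the quick-command hardening in this hunk: the subprocess now receives a filtered copy of the environment instead of inheriting the gateway's. A minimal sketch of that idea, assuming a simple name-based filter; the rules inside the project's `_sanitize_subprocess_env` are not shown in this diff, so the pattern and the function name below are hypothetical.

```python
import re

# Hypothetical pattern; not the project's actual sanitization rules.
_SECRET_NAME = re.compile(r"(KEY|TOKEN|SECRET|PASSWORD)", re.IGNORECASE)


def scrub_env(env: dict[str, str]) -> dict[str, str]:
    """Return a copy of env without variables whose names look like credentials."""
    return {k: v for k, v in env.items() if not _SECRET_NAME.search(k)}


# Illustrative environment, not real values.
sample = {
    "PATH": "/usr/bin",
    "EXAMPLE_API_KEY": "sk-test",
    "HOME": "/home/me",
    "BOT_TOKEN": "abc",
}
clean = scrub_env(sample)
```

The resulting dict would be passed as the `env=` argument of `asyncio.create_subprocess_shell`, as the hunk does with `sanitized_env`. A name-based filter cannot catch secrets stored under innocuous names, which is presumably why the hunk also redacts the command output.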
@@ -6625,18 +6666,10 @@ class GatewayRunner:
except Exception:
session_entry = None
if session_entry is not None:
# Pull the agent's full messages list from the result
# so the judge can dump it for its read_file tool.
_agent_messages: list = []
if isinstance(_agent_result, dict):
_msgs = _agent_result.get("messages")
if isinstance(_msgs, list):
_agent_messages = _msgs
await self._post_turn_goal_continuation(
session_entry=session_entry,
source=source,
final_response=_final_text,
agent_messages=_agent_messages,
)
except Exception as _goal_exc:
logger.debug("goal continuation hook failed: %s", _goal_exc)
@@ -6702,7 +6735,7 @@ class GatewayRunner:
mtype = event.media_types[i] if i < len(event.media_types) else ""
if mtype.startswith("image/") or event.message_type == MessageType.PHOTO:
image_paths.append(path)
if mtype.startswith("audio/") or event.message_type in (MessageType.VOICE, MessageType.AUDIO):
if mtype.startswith("audio/") or event.message_type in {MessageType.VOICE, MessageType.AUDIO}:
audio_paths.append(path)
if image_paths:
@@ -6771,7 +6804,7 @@ class GatewayRunner:
_TEXT_EXTENSIONS = {".txt", ".md", ".csv", ".log", ".json", ".xml", ".yaml", ".yml", ".toml", ".ini", ".cfg"}
for i, path in enumerate(event.media_urls):
mtype = event.media_types[i] if i < len(event.media_types) else ""
if mtype in ("", "application/octet-stream"):
if mtype in {"", "application/octet-stream"}:
_ext = os.path.splitext(path)[1].lower()
if _ext in _TEXT_EXTENSIONS:
mtype = "text/plain"
@@ -7155,7 +7188,7 @@ class GatewayRunner:
if isinstance(_comp_cfg, dict):
_hyg_compression_enabled = str(
_comp_cfg.get("enabled", True)
).lower() in ("true", "1", "yes")
).lower() in {"true", "1", "yes"}
_raw_hard_limit = _comp_cfg.get("hygiene_hard_message_limit")
if _raw_hard_limit is not None:
try:
@@ -7278,7 +7311,7 @@ class GatewayRunner:
_hyg_msgs = [
{"role": m.get("role"), "content": m.get("content")}
for m in history
if m.get("role") in ("user", "assistant")
if m.get("role") in {"user", "assistant"}
and m.get("content")
]
@@ -7642,7 +7675,7 @@ class GatewayRunner:
while not _pr.completion_queue.empty():
evt = _pr.completion_queue.get_nowait()
evt_type = evt.get("type", "completion")
if evt_type in ("watch_match", "watch_disabled"):
if evt_type in {"watch_match", "watch_disabled"}:
_watch_events.append(evt)
# else: completion events are handled by the watcher task
for evt in _watch_events:
@@ -7884,7 +7917,7 @@ class GatewayRunner:
status_hint = " You are being rate-limited. Please wait a moment and try again."
elif status_code == 529:
status_hint = " The API is temporarily overloaded. Please try again shortly."
elif status_code in (400, 500):
elif status_code in {400, 500}:
# 400 with a large session is context overflow.
# 500 with a large session often means the payload is too large
# for the API to process — treat it the same way.
@@ -8246,7 +8279,7 @@ class GatewayRunner:
policy = _policy_for_source(self.config, source)
platform = source.platform.value if source and source.platform else "?"
chat_type = (source.chat_type if source else "") or "dm"
scope = "DM" if chat_type.lower() in ("dm", "direct", "private", "") else "group/channel"
scope = "DM" if chat_type.lower() in {"dm", "direct", "private", ""} else "group/channel"
user_id = (source.user_id if source else None) or "?"
if not policy.enabled:
@@ -8300,6 +8333,7 @@ class GatewayRunner:
"""
import asyncio
import re
import shlex
from hermes_cli.kanban import run_slash
text = (event.text or "").strip()
@@ -8309,7 +8343,26 @@ class GatewayRunner:
if text.startswith("kanban"):
text = text[len("kanban"):].lstrip()
is_create = text.split(None, 1)[:1] == ["create"]
tokens = shlex.split(text) if text else []
requested_board = None
action = None
i = 0
while i < len(tokens):
tok = tokens[i]
if tok == "--board":
if i + 1 >= len(tokens):
break
requested_board = tokens[i + 1]
i += 2
continue
if tok.startswith("--board="):
requested_board = tok.split("=", 1)[1]
i += 1
continue
action = tok
break
is_create = action == "create"
try:
output = await asyncio.to_thread(run_slash, text)
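Note on the `--board` handling added in this hunk: leading `--board NAME` or `--board=NAME` flags are consumed, and the first remaining token is taken as the action. A minimal sketch of the same loop as a standalone function so it can be exercised in isolation; the function name is hypothetical and the sample commands are illustrative.

```python
import shlex


def parse_board_and_action(text: str):
    """Return (requested_board, action) from a /kanban argument string."""
    tokens = shlex.split(text) if text else []
    requested_board = None
    action = None
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "--board":
            if i + 1 >= len(tokens):
                break  # dangling --board with no value
            requested_board = tokens[i + 1]
            i += 2
            continue
        if tok.startswith("--board="):
            requested_board = tok.split("=", 1)[1]
            i += 1
            continue
        action = tok  # first non-flag token is the subcommand
        break
    return requested_board, action


spaced = parse_board_and_action('--board ops create "Fix login"')
equals = parse_board_and_action("--board=ops list")
plain = parse_board_and_action("create something")
```

`shlex.split` raises `ValueError` on unbalanced quotes. In the hunk the call sits before the `try:` line, and whether an enclosing handler catches that error is not visible here.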
@@ -8336,13 +8389,14 @@ class GatewayRunner:
if platform_str and chat_id:
def _sub():
from hermes_cli import kanban_db as _kb
conn = _kb.connect()
conn = _kb.connect(board=requested_board)
try:
_kb.add_notify_sub(
conn, task_id=task_id,
platform=platform_str, chat_id=chat_id,
thread_id=thread_id or None,
user_id=user_id,
notifier_profile=getattr(self, "_kanban_notifier_profile", None) or self._active_profile_name(),
)
finally:
conn.close()
@@ -9163,7 +9217,7 @@ class GatewayRunner:
return "\n".join(p for p in parts if p)
return str(value)
if args in ("none", "default", "neutral"):
if args in {"none", "default", "neutral"}:
try:
if "agent" not in config or not isinstance(config.get("agent"), dict):
config["agent"] = {}
@@ -9315,7 +9369,7 @@ class GatewayRunner:
return t("gateway.goal.no_resume")
return t("gateway.goal.resumed", goal=state.goal)
if lower in ("clear", "stop", "done"):
if lower in {"clear", "stop", "done"}:
had = mgr.has_goal()
mgr.clear()
try:
@@ -9352,83 +9406,6 @@ class GatewayRunner:
return t("gateway.goal.set", budget=state.max_turns, goal=state.goal)
async def _handle_subgoal_command(self, event: "MessageEvent") -> str:
"""Handle /subgoal for gateway platforms.
Forms (mirror of CLI):
/subgoal show the checklist
/subgoal <text> append a user item
/subgoal complete <n> | done <n> mark item n completed
/subgoal impossible <n> mark item n impossible
/subgoal undo <n> revert item n to pending
/subgoal remove <n> delete item n
/subgoal clear wipe the checklist
"""
args = (event.get_command_args() or "").strip()
mgr, _session_entry = self._get_goal_manager_for_event(event)
if mgr is None:
return t("gateway.goal.unavailable")
if not mgr.has_goal():
return "No active goal. Set one with /goal <text>."
if not args:
return f"{mgr.status_line()}\n{mgr.render_checklist()}"
tokens = args.split(None, 1)
verb = tokens[0].lower()
rest = tokens[1].strip() if len(tokens) > 1 else ""
action_status_map = {
"complete": "completed",
"completed": "completed",
"done": "completed",
"impossible": "impossible",
"imp": "impossible",
"skip": "impossible",
"undo": "pending",
"pending": "pending",
"reset": "pending",
}
if verb in action_status_map:
if not rest:
return f"Usage: /subgoal {verb} <n>"
try:
idx = int(rest.split()[0])
except ValueError:
return f"/subgoal {verb}: <n> must be an integer (1-based index)."
try:
item = mgr.mark_subgoal(idx, action_status_map[verb])
except (IndexError, ValueError, RuntimeError) as exc:
return f"/subgoal {verb}: {exc}"
return f"✓ Item {idx}{item.status}: {item.text}"
if verb == "remove":
if not rest:
return "Usage: /subgoal remove <n>"
try:
idx = int(rest.split()[0])
except ValueError:
return "/subgoal remove: <n> must be an integer (1-based index)."
try:
removed = mgr.remove_subgoal(idx)
except (IndexError, RuntimeError) as exc:
return f"/subgoal remove: {exc}"
return f"✓ Removed item {idx}: {removed.text}"
if verb == "clear":
mgr.clear_checklist()
return "✓ Checklist cleared. The judge will re-decompose on the next turn."
# Otherwise — append `args` as a new user-authored checklist item.
try:
item = mgr.add_subgoal(args)
except (ValueError, RuntimeError) as exc:
return f"/subgoal: {exc}"
idx = len(mgr.state.checklist) if mgr.state else 0
return f"✓ Added subgoal {idx}: {item.text}"
async def _send_goal_status_notice(self, source: Any, message: str) -> None:
"""Send a /goal judge status line back to the originating chat/thread."""
adapter = self.adapters.get(source.platform)
@@ -9497,7 +9474,6 @@ class GatewayRunner:
session_entry: Any,
source: Any,
final_response: str,
agent_messages: Optional[list] = None,
) -> None:
"""Run the goal judge after a gateway turn and, if still active,
enqueue a continuation prompt for the same session.
@@ -9525,11 +9501,7 @@ class GatewayRunner:
if not mgr.is_active():
return
decision = mgr.evaluate_after_turn(
final_response or "",
user_initiated=True,
messages=agent_messages or [],
)
decision = mgr.evaluate_after_turn(final_response or "", user_initiated=True)
msg = decision.get("message") or ""
# Defer the status line until after the adapter has delivered the
@@ -9650,13 +9622,13 @@ class GatewayRunner:
adapter = self.adapters.get(platform)
if args in ("on", "enable"):
if args in {"on", "enable"}:
self._voice_mode[voice_key] = "voice_only"
self._save_voice_modes()
if adapter:
self._set_adapter_auto_tts_enabled(adapter, chat_id, enabled=True)
return t("gateway.voice.enabled_voice_only")
elif args in ("off", "disable"):
elif args in {"off", "disable"}:
self._voice_mode[voice_key] = "off"
self._save_voice_modes()
if adapter:
@@ -9668,7 +9640,7 @@ class GatewayRunner:
if adapter:
self._set_adapter_auto_tts_enabled(adapter, chat_id, enabled=True)
return t("gateway.voice.tts_enabled")
elif args in ("channel", "join"):
elif args in {"channel", "join"}:
return await self._handle_voice_channel_join(event)
elif args == "leave":
return await self._handle_voice_channel_leave(event)
@@ -10442,12 +10414,12 @@ class GatewayRunner:
# Display toggle (per-platform)
platform_key = _platform_config_key(event.source.platform)
if args in ("show", "on"):
if args in {"show", "on"}:
self._show_reasoning = True
_save_config_key(f"display.platforms.{platform_key}.show_reasoning", True)
return t("gateway.reasoning.display_set_on", platform=platform_key)
if args in ("hide", "off"):
if args in {"hide", "off"}:
self._show_reasoning = False
_save_config_key(f"display.platforms.{platform_key}.show_reasoning", False)
return t("gateway.reasoning.display_set_off", platform=platform_key)
@@ -10463,7 +10435,7 @@ class GatewayRunner:
return t("gateway.reasoning.reset_done")
if effort == "none":
parsed = {"enabled": False}
elif effort in ("minimal", "low", "medium", "high", "xhigh"):
elif effort in {"minimal", "low", "medium", "high", "xhigh"}:
parsed = {"enabled": True, "effort": effort}
else:
return t(
@@ -10655,7 +10627,7 @@ class GatewayRunner:
effective = resolve_footer_config(user_config, platform_key)
if arg in ("status", "?"):
if arg in {"status", "?"}:
state = t("gateway.footer.state_on") if effective["enabled"] else t("gateway.footer.state_off")
fields = ", ".join(effective.get("fields") or [])
return t(
@@ -10665,9 +10637,9 @@ class GatewayRunner:
platform=platform_key,
)
if arg in ("on", "enable", "true", "1"):
if arg in {"on", "enable", "true", "1"}:
new_state = True
elif arg in ("off", "disable", "false", "0"):
elif arg in {"off", "disable", "false", "0"}:
new_state = False
elif arg == "":
new_state = not effective["enabled"]
@@ -10735,7 +10707,7 @@ class GatewayRunner:
msgs = [
{"role": m.get("role"), "content": m.get("content")}
for m in history
if m.get("role") in ("user", "assistant") and m.get("content")
if m.get("role") in {"user", "assistant"} and m.get("content")
]
tmp_agent = AIAgent(
@@ -11649,7 +11621,7 @@ class GatewayRunner:
history = self.session_store.load_transcript(session_entry.session_id)
if history:
from agent.model_metadata import estimate_messages_tokens_rough
msgs = [m for m in history if m.get("role") in ("user", "assistant") and m.get("content")]
msgs = [m for m in history if m.get("role") in {"user", "assistant"} and m.get("content")]
approx = estimate_messages_tokens_rough(msgs)
lines = [
t("gateway.usage.header_session_info"),
@@ -12203,9 +12175,9 @@ class GatewayRunner:
resolve_all = "all" in args
remaining = [a for a in args if a != "all"]
if any(a in ("always", "permanent", "permanently") for a in remaining):
if any(a in {"always", "permanent", "permanently"} for a in remaining):
choice = "always"
elif any(a in ("session", "ses") for a in remaining):
elif any(a in {"session", "ses"} for a in remaining):
choice = "session"
else:
choice = "once"
@@ -12748,11 +12720,10 @@ class GatewayRunner:
msg = f"✅ Hermes update finished.\n\n```\n{output}\n```"
else:
msg = f"❌ Hermes update failed.\n\n```\n{output}\n```"
elif exit_code == 0:
msg = "✅ Hermes update finished successfully."
else:
if exit_code == 0:
msg = "✅ Hermes update finished successfully."
else:
msg = "❌ Hermes update failed. Check the gateway logs or run `hermes update` manually for details."
msg = "❌ Hermes update failed. Check the gateway logs or run `hermes update` manually for details."
await adapter.send(chat_id, msg, metadata=metadata)
logger.info(
"Sent post-update notification to %s:%s (exit=%s)",
@@ -13323,8 +13294,8 @@ class GatewayRunner:
# --- Normal text-only notification ---
# Decide whether to notify based on mode
should_notify = (
notify_mode in ("all", "result")
or (notify_mode == "error" and session.exit_code not in (0, None))
notify_mode in {"all", "result"}
or (notify_mode == "error" and session.exit_code not in {0, None})
)
if should_notify:
new_output = session.output_buffer[-1000:] if session.output_buffer else ""
@@ -13919,7 +13890,7 @@ class GatewayRunner:
for msg in history:
role = msg.get("role")
content = msg.get("content")
if role in ("user", "assistant") and content:
if role in {"user", "assistant"} and content:
api_messages.append({"role": role, "content": content})
api_messages.append({"role": "user", "content": message})
@@ -13984,6 +13955,8 @@ class GatewayRunner:
cursor=_effective_cursor,
buffer_only=_buffer_only,
fresh_final_after_seconds=_fresh_final_secs,
transport=_scfg.transport or "auto",
chat_type=getattr(source, "chat_type", "") or "",
)
_stream_consumer = GatewayStreamConsumer(
adapter=_adapter,
@@ -14308,7 +14281,7 @@ class GatewayRunner:
# Only act on tool.started events (ignore tool.completed, reasoning.available, etc.)
if event_type not in ("tool.started",):
if event_type not in {"tool.started",}:
return
# Suppress tool-progress bubbles once the user has sent `stop`.
@@ -14805,6 +14778,8 @@ class GatewayRunner:
cursor=_effective_cursor,
buffer_only=_buffer_only,
fresh_final_after_seconds=_fresh_final_secs,
transport=_scfg.transport or "auto",
chat_type=getattr(source, "chat_type", "") or "",
)
_stream_consumer = GatewayStreamConsumer(
adapter=_adapter,
@@ -15003,7 +14978,7 @@ class GatewayRunner:
# Skip metadata entries (tool definitions, session info)
# -- these are for transcript logging, not for the LLM
if role in ("session_meta",):
if role in {"session_meta",}:
continue
# Skip system messages -- the agent rebuilds its own system prompt
@@ -15040,7 +15015,7 @@ class GatewayRunner:
# even if the message list shrinks, we know which paths are old.
_history_media_paths: set = set()
for _hm in agent_history:
if _hm.get("role") in ("tool", "function"):
if _hm.get("role") in {"tool", "function"}:
_hc = _hm.get("content", "")
if "MEDIA:" in _hc:
for _match in re.finditer(r'MEDIA:(\S+)', _hc):
@@ -15312,7 +15287,7 @@ class GatewayRunner:
media_tags = []
has_voice_directive = False
for msg in result.get("messages", []):
if msg.get("role") in ("tool", "function"):
if msg.get("role") in {"tool", "function"}:
content = msg.get("content", "")
if "MEDIA:" in content:
for match in re.finditer(r'MEDIA:(\S+)', content):
+12 -7
@@ -764,12 +764,12 @@ class SessionStore:
now = _now()
if policy.mode in ("idle", "both"):
if policy.mode in {"idle", "both"}:
idle_deadline = entry.updated_at + timedelta(minutes=policy.idle_minutes)
if now > idle_deadline:
return True
if policy.mode in ("daily", "both"):
if policy.mode in {"daily", "both"}:
today_reset = now.replace(
hour=policy.at_hour,
minute=0, second=0, microsecond=0,
@@ -805,12 +805,12 @@ class SessionStore:
now = _now()
if policy.mode in ("idle", "both"):
if policy.mode in {"idle", "both"}:
idle_deadline = entry.updated_at + timedelta(minutes=policy.idle_minutes)
if now > idle_deadline:
return "idle"
if policy.mode in ("daily", "both"):
if policy.mode in {"daily", "both"}:
today_reset = now.replace(
hour=policy.at_hour,
minute=0,
@@ -1276,9 +1276,14 @@ class SessionStore:
# Also write legacy JSONL (keeps existing tooling working during transition)
transcript_path = self.get_transcript_path(session_id)
with self._lock:
with open(transcript_path, "a", encoding="utf-8") as f:
f.write(json.dumps(message, ensure_ascii=False) + "\n")
try:
with self._lock:
with open(transcript_path, "a", encoding="utf-8") as f:
f.write(json.dumps(message, ensure_ascii=False) + "\n")
except OSError as e:
# Disk full / read-only fs / permission errors must not crash the
# message handler — the SQLite write above is the primary store.
logger.debug("Failed to write JSONL transcript for %s: %s", session_id, e)
def rewrite_transcript(self, session_id: str, messages: List[Dict[str, Any]]) -> None:
"""Replace the entire transcript for a session with new messages.
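Review note on the JSONL hunk above: the legacy transcript append is now best-effort, with the SQLite write treated as the primary store. A minimal sketch of that pattern follows. The helper name `append_jsonl_best_effort` and the module-level `_lock` are hypothetical stand-ins for the method body and `self._lock` shown in the diff.

```python
import json
import logging
import threading
from pathlib import Path

logger = logging.getLogger(__name__)
_lock = threading.Lock()  # hypothetical stand-in for SessionStore._lock


def append_jsonl_best_effort(path: Path, message: dict) -> bool:
    """Append one JSON line; return False instead of raising on I/O errors."""
    try:
        with _lock:
            with open(path, "a", encoding="utf-8") as f:
                f.write(json.dumps(message, ensure_ascii=False) + "\n")
        return True
    except OSError as e:
        # Disk full, read-only filesystem, missing directory, permissions.
        logger.debug("Failed to write JSONL transcript %s: %s", path, e)
        return False
```

Only `OSError` is swallowed here, as in the diff, so a non-serializable message would still raise `TypeError` from `json.dumps`. Because the failure is logged at debug level, it is invisible at default log levels. Whether that is acceptable depends on how much the remaining tooling relies on the JSONL file.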
+2
@@ -55,6 +55,7 @@ _SESSION_THREAD_ID: ContextVar = ContextVar("HERMES_SESSION_THREAD_ID", default=
_SESSION_USER_ID: ContextVar = ContextVar("HERMES_SESSION_USER_ID", default=_UNSET)
_SESSION_USER_NAME: ContextVar = ContextVar("HERMES_SESSION_USER_NAME", default=_UNSET)
_SESSION_KEY: ContextVar = ContextVar("HERMES_SESSION_KEY", default=_UNSET)
_SESSION_ID: ContextVar = ContextVar("HERMES_SESSION_ID", default=_UNSET)
# Cron auto-delivery vars — set per-job in run_job() so concurrent jobs
# don't clobber each other's delivery targets.
@@ -70,6 +71,7 @@ _VAR_MAP = {
"HERMES_SESSION_USER_ID": _SESSION_USER_ID,
"HERMES_SESSION_USER_NAME": _SESSION_USER_NAME,
"HERMES_SESSION_KEY": _SESSION_KEY,
"HERMES_SESSION_ID": _SESSION_ID,
"HERMES_CRON_AUTO_DELIVER_PLATFORM": _CRON_AUTO_DELIVER_PLATFORM,
"HERMES_CRON_AUTO_DELIVER_CHAT_ID": _CRON_AUTO_DELIVER_CHAT_ID,
"HERMES_CRON_AUTO_DELIVER_THREAD_ID": _CRON_AUTO_DELIVER_THREAD_ID,
+17 -18
@@ -442,22 +442,21 @@ def _parse_systemd_duration_to_us(raw: str) -> Optional[int]:
digits += ch
elif ch.isalpha():
token += ch
else:
if digits and token:
multiplier = units.get(token.lower())
if multiplier is None:
return None
try:
total_us += int(float(digits) * multiplier)
except ValueError:
return None
digits = ""
token = ""
elif digits and not token:
# Bare number = seconds (rare but valid)
try:
total_us += int(float(digits) * 1_000_000)
except ValueError:
return None
digits = ""
elif digits and token:
multiplier = units.get(token.lower())
if multiplier is None:
return None
try:
total_us += int(float(digits) * multiplier)
except ValueError:
return None
digits = ""
token = ""
elif digits and not token:
# Bare number = seconds (rare but valid)
try:
total_us += int(float(digits) * 1_000_000)
except ValueError:
return None
digits = ""
return total_us if total_us > 0 else None
+6 -2
@@ -218,7 +218,11 @@ def _read_pid_record(pid_path: Optional[Path] = None) -> Optional[dict]:
if not pid_path.exists():
return None
raw = pid_path.read_text().strip()
try:
raw = pid_path.read_text().strip()
except OSError:
# File was deleted between exists() and read_text(), or permission flipped.
return None
if not raw:
return None
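Review note on the `_read_pid_record` hunk above: wrapping `read_text()` closes the gap between the `exists()` check and the read. The sketch below isolates that idea. The function name `read_text_if_present` is hypothetical, and it deliberately skips the `exists()` call because `read_text()` already raises `FileNotFoundError` (a subclass of `OSError`) for a missing file.

```python
from pathlib import Path
from typing import Optional


def read_text_if_present(path: Path) -> Optional[str]:
    """Return stripped file contents, or None if missing, unreadable, or empty."""
    try:
        raw = path.read_text().strip()
    except OSError:
        # Covers FileNotFoundError and PermissionError, including the case
        # where the file disappears after an earlier exists() check.
        return None
    return raw or None
```

Keeping the `exists()` guard, as the diff does, is harmless. The `try`/`except` is what makes the function safe against a concurrent delete.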
@@ -600,7 +604,7 @@ def acquire_scoped_lock(scope: str, identity: str, metadata: Optional[dict[str,
for _line in _proc_status.read_text(encoding="utf-8").splitlines():
if _line.startswith("State:"):
_state = _line.split()[1]
if _state in ("T", "t"): # stopped or tracing stop
if _state in {"T", "t"}: # stopped or tracing stop
stale = True
break
except (OSError, PermissionError):
+198 -4
@@ -25,6 +25,11 @@ from typing import Any, Callable, Optional
from gateway.platforms.base import BasePlatformAdapter as _BasePlatformAdapter
from gateway.platforms.base import _custom_unit_to_cp
from gateway.config import (
DEFAULT_STREAMING_EDIT_INTERVAL as _DEFAULT_STREAMING_EDIT_INTERVAL,
DEFAULT_STREAMING_BUFFER_THRESHOLD as _DEFAULT_STREAMING_BUFFER_THRESHOLD,
DEFAULT_STREAMING_CURSOR as _DEFAULT_STREAMING_CURSOR,
)
logger = logging.getLogger("gateway.stream_consumer")
@@ -43,9 +48,9 @@ _COMMENTARY = object()
@dataclass
class StreamConsumerConfig:
"""Runtime config for a single stream consumer instance."""
edit_interval: float = 1.0
buffer_threshold: int = 40
cursor: str = ""
edit_interval: float = _DEFAULT_STREAMING_EDIT_INTERVAL
buffer_threshold: int = _DEFAULT_STREAMING_BUFFER_THRESHOLD
cursor: str = _DEFAULT_STREAMING_CURSOR
buffer_only: bool = False
# When >0, the final edit for a streamed response is delivered as a
# fresh message if the original preview has been visible for at least
@@ -55,6 +60,18 @@ class StreamConsumerConfig:
# openclaw/openclaw#72038. Default 0 = always edit in place (legacy
# behavior). The gateway enables this selectively per-platform.
fresh_final_after_seconds: float = 0.0
# Streaming transport selection:
# "auto" — prefer native draft streaming (e.g. Telegram sendMessageDraft)
# when the adapter + chat supports it; fall back to edit.
# "draft" — explicitly request native draft streaming; fall back to
# edit when unsupported.
# "edit" — progressive editMessageText (legacy behavior).
# "off" — handled by the gateway before the consumer is even built.
transport: str = "auto"
# Hint for the consumer about the originating chat type (e.g. "dm",
# "group", "supergroup", "forum"). Used to gate native draft streaming,
# which is platform-specific (Telegram drafts are DM-only).
chat_type: str = ""
class GatewayStreamConsumer:
@@ -88,6 +105,11 @@ class GatewayStreamConsumer:
"</THINKING>", "</thinking>", "</thought>",
)
# Class-wide monotonic counter for native-streaming draft ids. Telegram
# animates a draft when the same draft_id is reused across consecutive
# calls in the same chat, so we need a fresh non-zero id per response.
_draft_id_counter: int = 0
def __init__(
self,
adapter: Any,
@@ -141,6 +163,20 @@ class GatewayStreamConsumer:
self._in_think_block = False
self._think_buffer = ""
# Native draft-streaming state. Resolved at the start of run() based
# on cfg.transport, cfg.chat_type, and the adapter's
# supports_draft_streaming() probe. When True, the consumer emits
# animated draft frames via adapter.send_draft instead of progressive
# edits via adapter.edit_message. The final answer still goes
# through the normal first-send path so the user gets a real message
# in their chat history (drafts have no message_id).
self._use_draft_streaming = False
self._draft_id: Optional[int] = None
# Cumulative draft-frame failure count for this consumer. After the
# first failure we permanently disable drafts for the remainder of
# this response and route through edit-based for graceful degradation.
self._draft_failures = 0
@property
def already_sent(self) -> bool:
"""True if at least one message was sent or edited during the run."""
@@ -179,6 +215,16 @@ class GatewayStreamConsumer:
self._last_sent_text = ""
self._fallback_final_send = False
self._fallback_prefix = ""
# Native draft streaming: bump the draft_id so the next text segment
# animates as a fresh preview below the tool-progress bubbles, not
# over the prior segment's already-finalized draft. This is how
# we avoid the "inter-tool-call text leak" failure mode openclaw
# documented in their issue #32535 — each text block becomes its
# own visible message via the finalize, then a new draft animates
# for the next one.
if self._use_draft_streaming:
type(self)._draft_id_counter += 1
self._draft_id = type(self)._draft_id_counter
def on_delta(self, text: str) -> None:
"""Thread-safe callback — called from the agent's worker thread.
@@ -317,6 +363,20 @@ class GatewayStreamConsumer:
_raw_limit = getattr(self.adapter, "MAX_MESSAGE_LENGTH", 4096)
_safe_limit = max(500, _raw_limit - _len_fn(self.cfg.cursor) - 100)
# Resolve native draft streaming once per run. When enabled the
# consumer routes mid-stream frames through adapter.send_draft and
# leaves _message_id=None so the existing got_done path delivers the
# final answer as a regular sendMessage (drafts have no message_id
# to edit).
self._use_draft_streaming = self._resolve_draft_streaming()
if self._use_draft_streaming:
type(self)._draft_id_counter += 1
self._draft_id = type(self)._draft_id_counter
logger.debug(
"Stream consumer using native-draft transport (chat=%s draft_id=%s)",
self.chat_id, self._draft_id,
)
try:
while True:
# Drain all available items from the queue
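Review note on the draft-id hunks above: both bump sites use `type(self)._draft_id_counter += 1` rather than `self._draft_id_counter += 1`. The sketch below shows why that spelling matters for a class-wide counter. The `Consumer` class and its method names are illustrative only and are not part of the diff.

```python
class Consumer:
    # Shared across all instances, like _draft_id_counter in the diff.
    _counter: int = 0

    def __init__(self) -> None:
        self.draft_id = None

    def bump_shared(self) -> int:
        # Rebinds the class attribute, so every instance sees the new value.
        type(self)._counter += 1
        self.draft_id = type(self)._counter
        return self.draft_id

    def bump_shadowed(self) -> int:
        # Reads the class attribute but writes an instance attribute,
        # so the shared counter never advances.
        self._counter += 1
        return self._counter


a, b = Consumer(), Consumer()
first = a.bump_shared()
second = b.bump_shared()
```

With the shared form, two consumers get distinct ids. With the shadowed form, each new instance would start again from the class value. One caveat: if `GatewayStreamConsumer` were ever subclassed, `type(self)` would rebind the counter on the subclass rather than on the base class.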
@@ -754,6 +814,89 @@ class GatewayStreamConsumer:
err_lower = err.lower()
return "flood" in err_lower or "retry after" in err_lower or "rate" in err_lower
def _resolve_draft_streaming(self) -> bool:
"""Decide whether this run should use native draft streaming.
Honors ``cfg.transport``:
* ``"edit"`` never use drafts (legacy progressive-edit path).
* ``"draft"`` require draft support; gracefully fall back to edit
when the adapter declines. Logs the downgrade at debug.
* ``"auto"`` use drafts when the adapter supports them for this
chat type; otherwise edit.
Adapter eligibility is checked via
:meth:`BasePlatformAdapter.supports_draft_streaming`, which considers
the chat type (e.g. Telegram drafts are DM-only) and platform-version
gates (e.g. python-telegram-bot 22.6+).
"""
transport = (self.cfg.transport or "auto").lower()
if transport == "edit":
return False
# "off" is filtered upstream by the gateway; treat as edit defensively.
if transport == "off":
return False
# Test adapters are MagicMocks that don't subclass BasePlatformAdapter;
# default them to edit so existing test behaviour is preserved.
if not isinstance(self.adapter, _BasePlatformAdapter):
return False
try:
supported = self.adapter.supports_draft_streaming(
chat_type=self.cfg.chat_type or None,
metadata=self.metadata,
)
except Exception:
logger.debug("supports_draft_streaming probe raised", exc_info=True)
supported = False
if not supported:
if transport == "draft":
logger.debug(
"Draft streaming requested but unsupported (chat=%s, type=%r) — "
"falling back to edit",
self.chat_id, self.cfg.chat_type,
)
return False
return True
async def _send_draft_frame(self, text: str) -> bool:
"""Emit a single animated draft frame for the current accumulated text.
Returns True when the frame landed. On any failure, permanently
disables drafts for the remainder of this run so subsequent frames
flow through the edit-based path (which can adapt with flood-control
backoff, etc.). Drafts have no message_id and clear naturally on
the client when the response finalizes via a regular sendMessage.
"""
if self._draft_id is None:
# Defensive: should never happen — _use_draft_streaming gate is
# set in tandem with _draft_id in run(). Disable to be safe.
self._use_draft_streaming = False
return False
try:
result = await self.adapter.send_draft(
chat_id=self.chat_id,
draft_id=self._draft_id,
content=text,
metadata=self.metadata,
)
except Exception as e:
logger.debug(
"send_draft raised, disabling draft transport for this run: %s", e,
)
self._draft_failures += 1
self._use_draft_streaming = False
return False
if not getattr(result, "success", False):
logger.debug(
"send_draft returned success=False, disabling draft transport: %s",
getattr(result, "error", "unknown"),
)
self._draft_failures += 1
self._use_draft_streaming = False
return False
# Frame delivered. Track text for parity with edit-based no-op skip.
self._last_sent_text = text
return True
async def _flush_segment_tail_on_edit_failure(self) -> None:
"""Deliver un-sent tail content before a segment-break reset.
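Review note on `_resolve_draft_streaming` above: the decision reduces to a small truth table over `cfg.transport` and the adapter probe. The sketch below restates it as a pure function for readability. The name `resolve_use_drafts` and the boolean `adapter_supports_drafts` parameter are hypothetical, and the sketch leaves out the `isinstance` check and the probe's exception handling from the diff.

```python
def resolve_use_drafts(transport: str, adapter_supports_drafts: bool) -> bool:
    """Hypothetical pure-function version of the transport decision."""
    mode = (transport or "auto").lower()
    if mode in {"edit", "off"}:
        return False
    # "auto" and "draft" both defer to the adapter probe; in the diff,
    # "draft" differs only in that the downgrade is logged at debug.
    return adapter_supports_drafts
```

As written in the diff, an unrecognized transport string also falls through to the probe, so it behaves like `"auto"`. The sketch preserves that.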
@@ -948,6 +1091,35 @@ class GatewayStreamConsumer:
and self.cfg.cursor in text
and len(_visible_stripped) < _MIN_NEW_MSG_CHARS):
return True # too short for a standalone message — accumulate more
# Native draft streaming: route mid-stream frames through send_draft.
# The final answer is delivered via the regular sendMessage path
# below — drafts have no message_id so we can't finalize them
# in-place; the regular sendMessage clears the draft naturally on
# the client and gives the user a real message in their history.
# Skip when:
# * finalize=True (this is the final answer; needs to be a real message)
# * an edit path is already established (message_id is set, e.g. after
# a tool-boundary segment break where the prior text was finalized
# as a real sendMessage and the next text segment continues editing
# that one — staying on edit-based for that segment is correct).
if (
self._use_draft_streaming
and not finalize
and self._message_id is None
):
# No-op skip: identical to the last frame we sent.
if text == self._last_sent_text:
return True
ok = await self._send_draft_frame(text)
if ok:
# Drafts mark "we put something on screen" but DO NOT set
# _already_sent — that flag gates the gateway's fallback
# final-send path and we still need that to fire so the
# user gets a real message (drafts have no message_id).
return True
# Failure already disabled drafts for this run; fall through to
# the regular edit/send path below.
try:
if self._message_id is not None:
if self._edit_supported:
@@ -986,7 +1158,29 @@ class GatewayStreamConsumer:
)
if result.success:
self._already_sent = True
self._last_sent_text = text
# Adapter may have split-and-delivered an oversized
# edit across the original message + N continuations.
# When that happens, ``message_id`` is the LAST visible
# continuation and ``_last_sent_text`` no longer reflects
# the on-screen content (the new message only holds the
# final chunk's text), so subsequent edits must target
# the new id and skip-if-same comparisons must reset.
# Fire on_new_message so tool-progress bubbles linearize
# below the new continuation, not the original.
# ``getattr`` with default keeps backwards compat with
# SimpleNamespace mocks in tests that pre-date the field.
_continuation_ids = getattr(result, "continuation_message_ids", ()) or ()
if (
_continuation_ids
and result.message_id
and result.message_id != self._message_id
):
self._message_id = str(result.message_id)
self._message_created_ts = time.monotonic()
self._last_sent_text = ""
self._notify_new_message()
else:
self._last_sent_text = text
# Successful edit — reset flood strike counter
self._flood_strikes = 0
return True
+49 -20
@@ -1450,7 +1450,7 @@ def resolve_provider(
# whose availability isn't implied by LM_API_KEY presence (it may be
# offline, and the no-auth setup uses a placeholder value), so it
# also requires explicit selection.
if pid in ("copilot", "lmstudio"):
if pid in {"copilot", "lmstudio"}:
continue
for env_var in pconfig.api_key_env_vars:
if has_usable_secret(os.getenv(env_var, "")):
@@ -2541,7 +2541,7 @@ def refresh_codex_oauth_pure(
# A 401/403 from the token endpoint always means the refresh token
# is invalid/expired — force relogin even if the body error code
# wasn't one of the known strings above.
if response.status_code in (401, 403) and not relogin_required:
if response.status_code in {401, 403} and not relogin_required:
relogin_required = True
raise AuthError(
message,
@@ -2947,7 +2947,7 @@ def _merge_shared_nous_oauth_state(state: Dict[str, Any]) -> bool:
"expires_at",
):
value = shared.get(key)
if value not in (None, ""):
if value not in {None, ""}:
state[key] = value
return True
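Review note on the tuple-to-set membership rewrites: the two forms are not fully equivalent. A tuple test only compares values, while a set test hashes the left operand first, so an unhashable value raises `TypeError` instead of returning `False`. This hunk does not show whether `shared.get(key)` can ever return a list or dict, so treat the following as a sketch of the hazard, not a confirmed bug. The function names are illustrative.

```python
def is_blank_tuple(value) -> bool:
    # Tuple membership compares by identity/equality only; no hashing needed.
    return value in (None, "")


def is_blank_set(value) -> bool:
    # Set membership hashes the value first, so unhashable inputs raise.
    return value in {None, ""}


def set_check_raises(value) -> bool:
    """Return True if the set-based check raises TypeError for this value."""
    try:
        is_blank_set(value)
    except TypeError:
        return True
    return False
```

For call sites where the operand is always a string or an int (command arguments, HTTP status codes, role names), the rewrite is behavior-preserving. It only matters where the operand comes from loosely typed data such as parsed JSON.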
@@ -3986,7 +3986,7 @@ def get_api_key_provider_status(provider_id: str) -> Dict[str, Any]:
if pconfig.base_url_env_var:
env_url = os.getenv(pconfig.base_url_env_var, "").strip()
if provider_id in ("kimi-coding", "kimi-coding-cn"):
if provider_id in {"kimi-coding", "kimi-coding-cn"}:
base_url = _resolve_kimi_base_url(api_key, pconfig.inference_base_url, env_url)
elif env_url:
base_url = env_url
@@ -4046,6 +4046,8 @@ def get_auth_status(provider_id: Optional[str] = None) -> Dict[str, Any]:
return get_qwen_auth_status()
if target == "google-gemini-cli":
return get_gemini_oauth_auth_status()
if target == "minimax-oauth":
return get_minimax_oauth_auth_status()
if target == "copilot-acp":
return get_external_process_provider_status(target)
# API-key providers
@@ -4090,7 +4092,7 @@ def resolve_api_key_provider_credentials(provider_id: str) -> Dict[str, Any]:
if pconfig.base_url_env_var:
env_url = os.getenv(pconfig.base_url_env_var, "").strip()
if provider_id in ("kimi-coding", "kimi-coding-cn"):
if provider_id in {"kimi-coding", "kimi-coding-cn"}:
base_url = _resolve_kimi_base_url(api_key, pconfig.inference_base_url, env_url)
elif provider_id == "zai":
base_url = _resolve_zai_base_url(api_key, pconfig.inference_base_url, env_url)
@@ -4510,7 +4512,7 @@ def _login_openai_codex(
reuse = input("Use existing credentials? [Y/n]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
reuse = "y"
if reuse in ("", "y", "yes"):
if reuse in {"", "y", "yes"}:
config_path = _update_config_for_provider("openai-codex", existing.get("base_url", DEFAULT_CODEX_BASE_URL))
print()
print("Login successful!")
@@ -4531,7 +4533,7 @@ def _login_openai_codex(
do_import = input("Import these credentials? (a separate login is recommended) [y/N]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
do_import = "n"
if do_import in ("y", "yes"):
if do_import in {"y", "yes"}:
_save_codex_tokens(cli_tokens)
base_url = os.getenv("HERMES_CODEX_BASE_URL", "").strip().rstrip("/") or DEFAULT_CODEX_BASE_URL
config_path = _update_config_for_provider("openai-codex", base_url)
@@ -4623,7 +4625,7 @@ def _codex_device_code_login() -> Dict[str, Any]:
if poll_resp.status_code == 200:
code_resp = poll_resp.json()
break
elif poll_resp.status_code in (403, 404):
elif poll_resp.status_code in {403, 404}:
continue # User hasn't completed login yet
else:
raise AuthError(
@@ -4757,6 +4759,20 @@ def _minimax_request_user_code(
return payload
def _minimax_expired_in_looks_like_unix_ms(expired_in: int, *, now_ms: int) -> bool:
"""True if ``expired_in`` is plausibly a unix-ms absolute time (vs TTL seconds)."""
return int(expired_in) > (now_ms // 2)
def _minimax_resolve_token_expiry_unix(expired_in: int, *, now: datetime) -> float:
"""Return access-token expiry as unix seconds (MiniMax uses ms epoch or TTL seconds)."""
raw = int(expired_in)
now_ms = int(now.timestamp() * 1000)
if _minimax_expired_in_looks_like_unix_ms(raw, now_ms=now_ms):
return raw / 1000.0
return now.timestamp() + max(1, raw)
def _minimax_poll_token(
client: httpx.Client, *, portal_base_url: str, client_id: str,
user_code: str, code_verifier: str, expired_in: int, interval_ms: Optional[int],
@@ -4765,12 +4781,11 @@ def _minimax_poll_token(
# Defensive parsing: if it's small enough to be a duration, treat as seconds.
import time as _time
now_ms = int(_time.time() * 1000)
if expired_in > now_ms // 2:
# Looks like a unix-ms timestamp.
deadline = expired_in / 1000.0
raw = int(expired_in)
if _minimax_expired_in_looks_like_unix_ms(raw, now_ms=now_ms):
deadline = raw / 1000.0
else:
# Treat as duration in seconds from now.
deadline = _time.time() + max(1, expired_in)
deadline = _time.time() + max(1, raw)
interval = max(2.0, (interval_ms or 2000) / 1000.0)
while _time.time() < deadline:
@@ -4884,8 +4899,10 @@ def _minimax_oauth_login(
)
now = datetime.now(timezone.utc)
expires_in_s = int(token_data["expired_in"])
expires_at = now.timestamp() + expires_in_s
expires_at_unix = _minimax_resolve_token_expiry_unix(
int(token_data["expired_in"]), now=now,
)
expires_in_s = max(0, int(expires_at_unix - now.timestamp()))
auth_state = {
"provider": "minimax-oauth",
@@ -4899,7 +4916,7 @@ def _minimax_oauth_login(
"refresh_token": token_data["refresh_token"],
"resource_url": token_data.get("resource_url"),
"obtained_at": now.isoformat(),
"expires_at": datetime.fromtimestamp(expires_at, tz=timezone.utc).isoformat(),
"expires_at": datetime.fromtimestamp(expires_at_unix, tz=timezone.utc).isoformat(),
"expires_in": expires_in_s,
}
@@ -4960,14 +4977,16 @@ def _refresh_minimax_oauth_state(
relogin_required=True,
)
now_dt = datetime.now(timezone.utc)
expires_in_s = int(payload["expired_in"])
expires_at_unix = _minimax_resolve_token_expiry_unix(
int(payload["expired_in"]), now=now_dt,
)
expires_in_s = max(0, int(expires_at_unix - now_dt.timestamp()))
new_state = dict(state)
new_state.update({
"access_token": payload["access_token"],
"refresh_token": payload.get("refresh_token", state["refresh_token"]),
"obtained_at": now_dt.isoformat(),
"expires_at": datetime.fromtimestamp(now_dt.timestamp() + expires_in_s,
tz=timezone.utc).isoformat(),
"expires_at": datetime.fromtimestamp(expires_at_unix, tz=timezone.utc).isoformat(),
"expires_in": expires_in_s,
})
_minimax_save_auth_state(new_state)
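Review note on the MiniMax expiry hunks above: `expired_in` may arrive either as a TTL in seconds or as an absolute unix-ms timestamp, and the new helpers normalize both to unix seconds. The worked example below copies the logic from the diff under shortened, illustrative names and feeds it both shapes for a fixed `now`, to show that they resolve to the same expiry.

```python
from datetime import datetime, timezone


def looks_like_unix_ms(expired_in: int, *, now_ms: int) -> bool:
    # Same threshold as the diff: anything above half of "now in ms"
    # is treated as an absolute timestamp rather than a TTL.
    return int(expired_in) > (now_ms // 2)


def resolve_expiry_unix(expired_in: int, *, now: datetime) -> float:
    raw = int(expired_in)
    now_ms = int(now.timestamp() * 1000)
    if looks_like_unix_ms(raw, now_ms=now_ms):
        return raw / 1000.0
    # TTL in seconds, clamped to at least one second as in the diff.
    return now.timestamp() + max(1, raw)


now = datetime(2026, 5, 12, tzinfo=timezone.utc)
ttl_style = resolve_expiry_unix(3600, now=now)
ms_style = resolve_expiry_unix(int(now.timestamp() * 1000) + 3_600_000, now=now)
```

Both calls describe "one hour from now". The heuristic would only misclassify a TTL larger than roughly 8.9e11 seconds at the time of this commit, which is far beyond any realistic token lifetime. The previous code in `_minimax_oauth_login` added `expired_in` to `now` unconditionally, which would have produced a far-future expiry whenever the server returned the ms-epoch form.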
@@ -5188,7 +5207,7 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
do_import = input("Import these credentials? [Y/n]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
do_import = "y"
if do_import in ("", "y", "yes"):
if do_import in {"", "y", "yes"}:
print("Rehydrating Nous session from shared credentials...")
auth_state = _try_import_shared_nous_state(
timeout_seconds=timeout_seconds,
@@ -5251,6 +5270,7 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
from hermes_cli.models import (
get_curated_nous_model_ids, get_pricing_for_provider,
check_nous_free_tier, partition_nous_models_by_tier,
union_with_portal_free_recommendations,
)
model_ids = get_curated_nous_model_ids()
@@ -5260,6 +5280,15 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
pricing = get_pricing_for_provider("nous")
free_tier = check_nous_free_tier()
if free_tier:
# The Portal's freeRecommendedModels endpoint is the
# source of truth for what's free *right now*. Augment
# the curated list with anything new the Portal flags
# as free so users on older Hermes builds still see
# newly-launched free models without a CLI release.
_portal_for_recs = auth_state.get("portal_base_url", "")
model_ids, pricing = union_with_portal_free_recommendations(
model_ids, pricing, _portal_for_recs,
)
model_ids, unavailable_models = partition_nous_models_by_tier(
model_ids, pricing, free_tier=True,
)
+9 -6
@@ -266,7 +266,7 @@ def auth_add_command(args) -> None:
do_import = input("Import these credentials? [Y/n]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
do_import = "y"
if do_import in ("", "y", "yes"):
if do_import in {"", "y", "yes"}:
print("Rehydrating Nous session from shared credentials...")
rehydrated = auth_mod._try_import_shared_nous_state(
timeout_seconds=getattr(args, "timeout", None) or 15.0,
@@ -375,10 +375,12 @@ def auth_add_command(args) -> None:
return
if provider == "minimax-oauth":
from hermes_cli.auth import resolve_minimax_oauth_runtime_credentials
creds = resolve_minimax_oauth_runtime_credentials()
creds = auth_mod._minimax_oauth_login(
open_browser=not getattr(args, "no_browser", False),
timeout_seconds=getattr(args, "timeout", None) or 15.0,
)
label = (getattr(args, "label", None) or "").strip() or label_from_token(
creds["api_key"],
creds["access_token"],
_oauth_default_label(provider, len(pool.entries()) + 1),
)
entry = PooledCredential(
@@ -388,8 +390,9 @@ def auth_add_command(args) -> None:
auth_type=AUTH_TYPE_OAUTH,
priority=0,
source=f"{SOURCE_MANUAL}:minimax_oauth",
access_token=creds["api_key"],
base_url=creds.get("base_url"),
access_token=creds["access_token"],
refresh_token=creds.get("refresh_token"),
base_url=creds.get("inference_base_url"),
)
pool.add_entry(entry)
print(f'Added {provider} OAuth credential #{len(pool.entries())}: "{entry.label}"')
+4 -6
@@ -298,7 +298,7 @@ def _detect_prefix(zf: zipfile.ZipFile) -> str:
if len(first_parts) == 1:
prefix = first_parts.pop()
# Only strip if it looks like a hermes dir name
if prefix in (".hermes", "hermes"):
if prefix in {".hermes", "hermes"}:
return prefix + "/"
return ""
@@ -349,7 +349,7 @@ def run_import(args) -> None:
except (EOFError, KeyboardInterrupt):
print("\nAborted.")
sys.exit(1)
if answer not in ("y", "yes"):
if answer not in {"y", "yes"}:
print("Aborted.")
return
@@ -802,8 +802,7 @@ def _prune_pre_update_backups(backup_dir: Path, keep: int) -> int:
Operators who genuinely don't want a backup should set
``updates.pre_update_backup: false`` in config that gates creation.
"""
if keep < 1:
keep = 1
keep = max(keep, 1)
if not backup_dir.exists():
return 0
@@ -875,8 +874,7 @@ def _prune_pre_migration_backups(backup_dir: Path, keep: int) -> int:
Only touches files matching ``pre-migration-*.zip`` so other backups in
the same directory are never touched.
"""
if keep < 0:
keep = 0
keep = max(keep, 0)
if not backup_dir.exists():
return 0
+1 -1
@@ -139,7 +139,7 @@ def _confirm(prompt: str) -> bool:
except (EOFError, KeyboardInterrupt):
print()
return False
return resp in ("y", "yes")
return resp in {"y", "yes"}
def cmd_clear(args: argparse.Namespace) -> int:
+10 -11
@@ -298,7 +298,7 @@ def claw_command(args):
if action == "migrate":
_cmd_migrate(args)
elif action in ("cleanup", "clean"):
elif action in {"cleanup", "clean"}:
_cmd_cleanup(args)
else:
print("Usage: hermes claw <command> [options]")
@@ -670,17 +670,16 @@ def _cmd_cleanup(args):
elif not auto_yes and not sys.stdin.isatty():
print_info(f"Non-interactive session — would archive: {source_dir}")
print_info("To execute, re-run with: hermes claw cleanup --yes")
elif auto_yes or prompt_yes_no(f"Archive {source_dir}?", default=True):
try:
archive_path = _archive_directory(source_dir)
print_success(f"Archived: {source_dir}{archive_path}")
total_archived += 1
except OSError as e:
print_error(f"Could not archive: {e}")
print_info(f"Try manually: mv {source_dir} {source_dir}.pre-migration")
else:
if auto_yes or prompt_yes_no(f"Archive {source_dir}?", default=True):
try:
archive_path = _archive_directory(source_dir)
print_success(f"Archived: {source_dir}{archive_path}")
total_archived += 1
except OSError as e:
print_error(f"Could not archive: {e}")
print_info(f"Try manually: mv {source_dir} {source_dir}.pre-migration")
else:
print_info("Skipped.")
print_info("Skipped.")
# Summary
print()
+2 -2
@@ -101,7 +101,7 @@ def _fetch_models_from_api(access_token: str) -> List[str]:
# Some valid Codex CLI models (for example gpt-5.3-codex-spark) are
# marked false here but are still accepted by the Codex route.
visibility = item.get("visibility", "")
if isinstance(visibility, str) and visibility.strip().lower() in ("hide", "hidden"):
if isinstance(visibility, str) and visibility.strip().lower() in {"hide", "hidden"}:
continue
priority = item.get("priority")
rank = int(priority) if isinstance(priority, (int, float)) else 10_000
@@ -152,7 +152,7 @@ def _read_cache_models(codex_home: Path) -> List[str]:
# public OpenAI API, while Hermes openai-codex talks to the same
# OAuth-backed Codex backend as Codex CLI.
visibility = item.get("visibility")
if isinstance(visibility, str) and visibility.strip().lower() in ("hide", "hidden"):
if isinstance(visibility, str) and visibility.strip().lower() in {"hide", "hidden"}:
continue
priority = item.get("priority")
rank = int(priority) if isinstance(priority, (int, float)) else 10_000
+1 -3
@@ -104,8 +104,6 @@ COMMAND_REGISTRY: list[CommandDef] = [
args_hint="<prompt>"),
CommandDef("goal", "Set a standing goal Hermes works on across turns until achieved", "Session",
args_hint="[text | pause | resume | clear | status]"),
CommandDef("subgoal", "Add or manage checklist items on the active goal", "Session",
args_hint="[text | complete N | impossible N | undo N | remove N | clear]"),
CommandDef("status", "Show session info", "Session"),
CommandDef("whoami", "Show your slash command access (admin / user)", "Info"),
CommandDef("profile", "Show active profile name and home directory", "Info"),
@@ -813,7 +811,7 @@ def discord_skill_commands_by_category(
# names are marked with a sentinel so the warning distinguishes
# "skill collided with a reserved command" from "two skills collided
# on the 32-char clamp" — the latter is the rename-worthy case.
_names_used: dict[str, str] = {n: "<reserved>" for n in reserved_names}
_names_used: dict[str, str] = dict.fromkeys(reserved_names, "<reserved>")
hidden = 0
try:
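The `_names_used` change above replaces a dict comprehension with `dict.fromkeys`. The sketch below checks that the two spellings build the same mapping and shows the one case where they would not be interchangeable. The sample names in `reserved_names` are made up for illustration.

```python
reserved_names = ["status", "help", "goal"]  # hypothetical sample values

# Old form: dict comprehension mapping each name to the sentinel.
by_comprehension = {n: "<reserved>" for n in reserved_names}

# New form: dict.fromkeys builds the same mapping in one call.
by_fromkeys = dict.fromkeys(reserved_names, "<reserved>")

assert by_comprehension == by_fromkeys
assert list(by_comprehension) == list(by_fromkeys)  # same insertion order

# Caveat: fromkeys reuses one value object for every key, so it is only a
# drop-in replacement when the value is immutable, as the string sentinel is.
shared = dict.fromkeys(["a", "b"], [])
shared["a"].append(1)  # mutates the single list shared by both keys
```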
+80 -13
@@ -28,6 +28,48 @@ from typing import Dict, Any, Optional, List, Tuple
logger = logging.getLogger(__name__)
# Track which (config_path, mtime_ns, size) tuples we've already warned about
# so concurrent CLI/gateway loads of a broken config.yaml don't spam stderr
# every time. Cleared automatically when the file changes (different mtime).
_CONFIG_PARSE_WARNED: set = set()
def _warn_config_parse_failure(config_path: Path, exc: Exception) -> None:
"""Surface a config.yaml parse failure to user, log, and stderr.
A YAML parse error in ``~/.hermes/config.yaml`` causes ``load_config()``
to silently fall back to ``DEFAULT_CONFIG``, which means every user
override (auxiliary providers, fallback chain, model overrides, etc.)
is dropped. Before this helper that was a one-line ``print(...)`` that
scrolled off-screen on the first invocation and was never seen again.
Now: warn once per (path, mtime_ns, size) on stderr **and** in
``agent.log`` / ``errors.log`` at WARNING level so ``hermes logs``
surfaces it. Re-warns automatically if the file changes (different
mtime/size), so users editing the config see the next failure.
"""
try:
st = config_path.stat()
key = (str(config_path), st.st_mtime_ns, st.st_size)
except OSError:
key = (str(config_path), 0, 0)
if key in _CONFIG_PARSE_WARNED:
return
_CONFIG_PARSE_WARNED.add(key)
msg = (
f"Failed to parse {config_path}: {exc}. "
f"Falling back to default config — every user override "
f"(auxiliary providers, fallback chain, model settings) is being IGNORED. "
f"Fix the YAML and restart."
)
logger.warning(msg)
try:
sys.stderr.write(f"⚠️ hermes config: {msg}\n")
sys.stderr.flush()
except Exception:
pass
_IS_WINDOWS = platform.system() == "Windows"
_ENV_VAR_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_LAST_EXPANDED_CONFIG_BY_PATH: Dict[str, Any] = {}
@@ -537,6 +579,7 @@ DEFAULT_CONFIG = {
# Explicit opt-in: mount the host cwd into /workspace for Docker sessions.
# Default off because passing host directories into a sandbox weakens isolation.
"docker_mount_cwd_to_workspace": False,
"docker_extra_args": [], # Extra flags passed verbatim to docker run
# Explicit opt-in: run the Docker container as the host user's uid:gid
# (via `--user`). When enabled, files written into bind-mounted dirs
# (docker_volumes, the persistent workspace, or the auto-mounted cwd)
@@ -680,8 +723,15 @@ DEFAULT_CONFIG = {
# Anthropic prompt caching (Claude via OpenRouter or native Anthropic API).
# cache_ttl must be "5m" or "1h" (Anthropic-supported tiers); other values are ignored.
# long_lived_prefix: when true (default), Claude on Anthropic / OpenRouter / Nous
# Portal uses a split layout: tools[-1] + stable system prefix at long_lived_ttl
# (cross-session cache), last 2 messages at cache_ttl (within-session rolling).
# Set false to keep the legacy "system + last 3 messages" single-tier layout.
# long_lived_ttl: TTL for the cross-session prefix tier ("5m" or "1h"; default "1h").
"prompt_caching": {
"cache_ttl": "5m",
"long_lived_prefix": True,
"long_lived_ttl": "1h",
},
# OpenRouter-specific settings.
@@ -859,6 +909,7 @@ DEFAULT_CONFIG = {
"bell_on_complete": False,
"show_reasoning": False,
"streaming": False,
"timestamps": False, # Show [HH:MM] on user and assistant labels
"final_response_markdown": "strip", # render | strip | raw
# Preserve recent classic CLI output across Ctrl+L, /redraw, and
# terminal resize full-screen clears. Disable if a terminal emulator
@@ -1281,6 +1332,21 @@ DEFAULT_CONFIG = {
"domains": [],
"shared_files": [],
},
# Acknowledged supply-chain security advisories. Each entry is the
# ID of an advisory the user has read and acted on (uninstalled the
# compromised package, rotated credentials). Acked advisories no
# longer trigger the startup banner. Add via `hermes doctor --ack
# <id>`; remove by editing the list directly. See
# ``hermes_cli/security_advisories.py`` for the catalog.
"acked_advisories": [],
# Allow Hermes to lazy-install opt-in backend packages from PyPI
# the first time the user enables a backend that needs them
# (e.g. installing ``elevenlabs`` when the user picks ElevenLabs as
# their TTS provider). Set to false to require explicit
# ``pip install`` for everything beyond the base set — appropriate
# for restricted networks, audited environments, or air-gapped
# systems where any runtime install is unacceptable.
"allow_lazy_installs": True,
},
"cron": {
@@ -3158,7 +3224,7 @@ def warn_deprecated_cwd_env_vars(config: Optional[Dict[str, Any]] = None) -> Non
terminal_cfg = config.get("terminal", {})
config_cwd = terminal_cfg.get("cwd", ".") if isinstance(terminal_cfg, dict) else "."
# Only warn if config.yaml doesn't have an explicit path
config_has_explicit_cwd = config_cwd not in (".", "auto", "cwd", "")
config_has_explicit_cwd = config_cwd not in {".", "auto", "cwd", ""}
lines: list[str] = []
if messaging_cwd:
@@ -3218,10 +3284,10 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
if "tool_progress" not in display:
old_enabled = get_env_value("HERMES_TOOL_PROGRESS")
old_mode = get_env_value("HERMES_TOOL_PROGRESS_MODE")
if old_enabled and old_enabled.lower() in ("false", "0", "no"):
if old_enabled and old_enabled.lower() in {"false", "0", "no"}:
display["tool_progress"] = "off"
results["config_added"].append("display.tool_progress=off (from HERMES_TOOL_PROGRESS=false)")
elif old_mode and old_mode.lower() in ("new", "all"):
elif old_mode and old_mode.lower() in {"new", "all"}:
display["tool_progress"] = old_mode.lower()
results["config_added"].append(f"display.tool_progress={old_mode.lower()} (from HERMES_TOOL_PROGRESS_MODE)")
else:
@@ -3300,7 +3366,7 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
new_entry = {"api": old_url}
if old_name:
new_entry["name"] = old_name
if old_key and old_key not in ("no-key", "no-key-required", ""):
if old_key and old_key not in {"no-key", "no-key-required", ""}:
new_entry["api_key"] = old_key
# Carry over model and api_mode if present
@@ -3358,7 +3424,7 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
stt.pop("model", None)
# Place it in the appropriate provider section only if the
# user didn't already set a model there
if provider in ("local", "local_command"):
if provider in {"local", "local_command"}:
# Don't migrate an OpenAI model name into the local section
_local_models = {
"tiny.en", "tiny", "base.en", "base", "small.en", "small",
@@ -3442,7 +3508,7 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
if not aux_comp.get("model"):
aux_comp["model"] = str(s_model).strip()
migrated_keys.append(f"model={s_model}")
if s_provider and str(s_provider).strip() not in ("", "auto"):
if s_provider and str(s_provider).strip() not in {"", "auto"}:
aux = config.setdefault("auxiliary", {})
aux_comp = aux.setdefault("compression", {})
if not aux_comp.get("provider") or aux_comp.get("provider") == "auto":
@@ -3673,7 +3739,7 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
except (EOFError, KeyboardInterrupt):
answer = "n"
if answer in ("y", "yes"):
if answer in {"y", "yes"}:
print()
for name, info in new_and_unset:
if info.get("url"):
@@ -3734,7 +3800,7 @@ def migrate_config(interactive: bool = True, quiet: bool = False) -> Dict[str, A
except (EOFError, KeyboardInterrupt):
answer = "n"
if answer in ("y", "yes"):
if answer in {"y", "yes"}:
print()
config = load_config()
try:
@@ -4004,7 +4070,8 @@ def read_raw_config() -> Dict[str, Any]:
try:
with open(config_path, encoding="utf-8") as f:
data = yaml.safe_load(f) or {}
except Exception:
except Exception as e:
_warn_config_parse_failure(config_path, e)
return {}
if not isinstance(data, dict):
@@ -4054,7 +4121,7 @@ def load_config() -> Dict[str, Any]:
config = _deep_merge(config, user_config)
except Exception as e:
print(f"Warning: Failed to load config: {e}")
_warn_config_parse_failure(config_path, e)
normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
expanded = _expand_env_vars(normalized)
@@ -4815,9 +4882,9 @@ def set_config_value(key: str, value: str):
# inline navigation here silently overwrote lists with dicts.
# Convert value to appropriate type
if value.lower() in ('true', 'yes', 'on'):
if value.lower() in {'true', 'yes', 'on'}:
value = True
elif value.lower() in ('false', 'no', 'off'):
elif value.lower() in {'false', 'no', 'off'}:
value = False
elif value.isdigit():
value = int(value)
@@ -5022,7 +5089,7 @@ def _inject_profile_env_vars() -> None:
try:
from providers import list_providers
for _pp in list_providers():
if _pp.auth_type not in ("api_key",):
if _pp.auth_type not in {"api_key",}:
continue
for _var in _pp.env_vars:
if _var in OPTIONAL_ENV_VARS:
+1 -1
@@ -128,7 +128,7 @@ def _try_gh_cli_token() -> Optional[str]:
# Build a clean env so gh doesn't short-circuit on GITHUB_TOKEN / GH_TOKEN
clean_env = {k: v for k, v in os.environ.items()
if k not in ("GITHUB_TOKEN", "GH_TOKEN")}
if k not in {"GITHUB_TOKEN", "GH_TOKEN"}}
for gh_path in _gh_cli_candidates():
cmd = [gh_path, "auth", "token"]
+2 -2
@@ -347,7 +347,7 @@ def _cmd_prune(args) -> int:
except (EOFError, KeyboardInterrupt):
print("\ncurator: aborted")
return 1
if reply not in ("y", "yes"):
if reply not in {"y", "yes"}:
print("curator: aborted")
return 1
@@ -449,7 +449,7 @@ def _cmd_rollback(args) -> int:
except (EOFError, KeyboardInterrupt):
print("\ncancelled")
return 1
if ans not in ("y", "yes"):
if ans not in {"y", "yes"}:
print("cancelled")
return 1
+12 -12
@@ -139,16 +139,16 @@ def curses_checklist(
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord("k")):
if key in {curses.KEY_UP, ord("k")}:
cursor = (cursor - 1) % len(items)
elif key in (curses.KEY_DOWN, ord("j")):
elif key in {curses.KEY_DOWN, ord("j")}:
cursor = (cursor + 1) % len(items)
elif key == ord(" "):
chosen.symmetric_difference_update({cursor})
elif key in (curses.KEY_ENTER, 10, 13):
elif key in {curses.KEY_ENTER, 10, 13}:
result_holder[0] = set(chosen)
return
elif key in (27, ord("q")):
elif key in {27, ord("q")}:
result_holder[0] = cancel_returns
return
@@ -265,14 +265,14 @@ def curses_radiolist(
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord("k")):
if key in {curses.KEY_UP, ord("k")}:
cursor = (cursor - 1) % len(items)
elif key in (curses.KEY_DOWN, ord("j")):
elif key in {curses.KEY_DOWN, ord("j")}:
cursor = (cursor + 1) % len(items)
elif key in (ord(" "), curses.KEY_ENTER, 10, 13):
elif key in {ord(" "), curses.KEY_ENTER, 10, 13}:
result_holder[0] = cursor
return
elif key in (27, ord("q")):
elif key in {27, ord("q")}:
result_holder[0] = cancel_returns
return
@@ -388,14 +388,14 @@ def curses_single_select(
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord("k")):
if key in {curses.KEY_UP, ord("k")}:
cursor = (cursor - 1) % len(all_items)
elif key in (curses.KEY_DOWN, ord("j")):
elif key in {curses.KEY_DOWN, ord("j")}:
cursor = (cursor + 1) % len(all_items)
elif key in (curses.KEY_ENTER, 10, 13):
elif key in {curses.KEY_ENTER, 10, 13}:
result_holder[0] = cursor
return
elif key in (27, ord("q")):
elif key in {27, ord("q")}:
result_holder[0] = None
return
+1 -1
@@ -93,7 +93,7 @@ def poll_registration(device_code: str) -> dict:
"""
data = _api_post("/app/registration/poll", {"device_code": device_code})
status_raw = str(data.get("status", "")).strip().upper()
if status_raw not in ("WAITING", "SUCCESS", "FAIL", "EXPIRED"):
if status_raw not in {"WAITING", "SUCCESS", "FAIL", "EXPIRED"}:
status_raw = "UNKNOWN"
return {
"status": status_raw,
+118 -42
@@ -296,19 +296,101 @@ def _build_apikey_providers_list() -> list:
def run_doctor(args):
"""Run diagnostic checks."""
should_fix = getattr(args, 'fix', False)
ack_target = getattr(args, 'ack', None)
# Doctor runs from the interactive CLI, so CLI-gated tool availability
# checks (like cronjob management) should see the same context as `hermes`.
os.environ.setdefault("HERMES_INTERACTIVE", "1")
# Handle `hermes doctor --ack <id>` as a fast path. Persist the ack and
# return without running the rest of the diagnostics — the user has
# already seen the advisory and just wants to silence it.
if ack_target:
from hermes_cli.security_advisories import (
ADVISORIES,
ack_advisory,
)
valid_ids = {a.id for a in ADVISORIES}
if ack_target not in valid_ids:
print(color(
f"Unknown advisory ID: {ack_target!r}. Known IDs: "
f"{', '.join(sorted(valid_ids)) or '(none)'}",
Colors.RED,
))
sys.exit(2)
if ack_advisory(ack_target):
print(color(
f" ✓ Acknowledged advisory {ack_target}. "
f"It will no longer trigger startup banners.",
Colors.GREEN,
))
else:
print(color(
f" ✗ Failed to persist ack for {ack_target}. "
f"Check ~/.hermes/config.yaml is writable.",
Colors.RED,
))
sys.exit(1)
return
issues = []
manual_issues = [] # issues that can't be auto-fixed
fixed_count = 0
print()
print(color("┌─────────────────────────────────────────────────────────┐", Colors.CYAN))
print(color("│ 🩺 Hermes Doctor │", Colors.CYAN))
print(color("└─────────────────────────────────────────────────────────┘", Colors.CYAN))
# =========================================================================
# Check: Security advisories (RUNS FIRST — these are the most urgent)
# =========================================================================
print()
print(color("◆ Security Advisories", Colors.CYAN, Colors.BOLD))
try:
from hermes_cli.security_advisories import (
detect_compromised,
filter_unacked,
full_remediation_text,
get_acked_ids,
)
all_hits = detect_compromised()
fresh_hits = filter_unacked(all_hits)
if fresh_hits:
for hit in fresh_hits:
check_fail(
f"{hit.advisory.title}",
f"({hit.package}=={hit.installed_version})",
)
# Print the full remediation block, indented under the
# check_fail header so it reads as a single section.
for line in full_remediation_text(hit):
if line:
print(f" {color(line, Colors.YELLOW)}")
else:
print()
# Funnel into the action list so the summary block surfaces it
# for users who scroll past the section.
manual_issues.append(
f"Resolve security advisory {hit.advisory.id}: "
f"uninstall {hit.package}=={hit.installed_version} and "
f"rotate credentials, then run "
f"`hermes doctor --ack {hit.advisory.id}`."
)
# Acked-but-still-installed: show as informational so the user
# knows the package is still on disk after the ack.
acked_ids = get_acked_ids()
for h in all_hits:
if h.advisory.id in acked_ids:
check_warn(
f"{h.package}=={h.installed_version} still installed "
f"(advisory {h.advisory.id} acknowledged)",
)
else:
check_ok("No active security advisories")
except Exception as e:
# Never let a bug in the advisory check block the rest of doctor.
check_warn(f"Security advisory check failed: {e}")
# =========================================================================
# Check: Python version
@@ -473,7 +555,7 @@ def run_doctor(args):
if (
provider
and _resolve_auth_provider is not None
and provider not in ("auto", "custom")
and provider not in {"auto", "custom"}
):
try:
runtime_provider = _resolve_auth_provider(provider)
@@ -485,7 +567,7 @@ def run_doctor(args):
if (
provider
and _resolve_provider_full is not None
and provider not in ("auto", "custom")
and provider not in {"auto", "custom"}
):
provider_def = _resolve_provider_full(provider, user_providers, custom_providers)
catalog_provider = provider_def.id if provider_def is not None else None
@@ -542,7 +624,7 @@ def run_doctor(args):
# own env-var checks elsewhere in doctor, and get_auth_status()
# returns a bare {logged_in: False} for anything it doesn't
# explicitly dispatch, which would produce false positives.
if runtime_provider and runtime_provider not in ("auto", "custom", "openrouter"):
if runtime_provider and runtime_provider not in {"auto", "custom", "openrouter"}:
try:
from hermes_cli.auth import PROVIDER_REGISTRY, get_auth_status
pconfig = PROVIDER_REGISTRY.get(runtime_provider)
@@ -729,13 +811,12 @@ def run_doctor(args):
hermes_home = HERMES_HOME
if hermes_home.exists():
check_ok(f"{_DHH} directory exists")
elif should_fix:
hermes_home.mkdir(parents=True, exist_ok=True)
check_ok(f"Created {_DHH} directory")
fixed_count += 1
else:
if should_fix:
hermes_home.mkdir(parents=True, exist_ok=True)
check_ok(f"Created {_DHH} directory")
fixed_count += 1
else:
check_warn(f"{_DHH} not found", "(will be created on first use)")
check_warn(f"{_DHH} not found", "(will be created on first use)")
# Check expected subdirectories
expected_subdirs = ["cron", "sessions", "logs", "skills", "memories"]
@@ -743,13 +824,12 @@ def run_doctor(args):
subdir_path = hermes_home / subdir_name
if subdir_path.exists():
check_ok(f"{_DHH}/{subdir_name}/ exists")
elif should_fix:
subdir_path.mkdir(parents=True, exist_ok=True)
check_ok(f"Created {_DHH}/{subdir_name}/")
fixed_count += 1
else:
if should_fix:
subdir_path.mkdir(parents=True, exist_ok=True)
check_ok(f"Created {_DHH}/{subdir_name}/")
fixed_count += 1
else:
check_warn(f"{_DHH}/{subdir_name}/ not found", "(will be created on first use)")
check_warn(f"{_DHH}/{subdir_name}/ not found", "(will be created on first use)")
# Check for SOUL.md persona file
soul_path = hermes_home / "SOUL.md"
@@ -955,14 +1035,12 @@ def run_doctor(args):
else:
check_fail("docker not found", "(required for TERMINAL_ENV=docker)")
issues.append("Install Docker or change TERMINAL_ENV")
elif _safe_which("docker"):
check_ok("docker", "(optional)")
elif _is_termux():
check_info("Docker backend is not available inside Termux (expected on Android)")
else:
if _safe_which("docker"):
check_ok("docker", "(optional)")
else:
if _is_termux():
check_info("Docker backend is not available inside Termux (expected on Android)")
else:
check_warn("docker not found", "(optional)")
check_warn("docker not found", "(optional)")
# SSH (if using ssh backend)
if terminal_env == "ssh":
@@ -1014,7 +1092,7 @@ def run_doctor(args):
issues.append(f"Set TERMINAL_VERCEL_RUNTIME to one of: {supported}")
disk = os.getenv("TERMINAL_CONTAINER_DISK", "51200").strip()
if disk in ("", "0", "51200"):
if disk in {"", "0", "51200"}:
check_ok("Vercel disk setting", "(uses platform default)")
else:
check_fail("Vercel custom disk unsupported", "(reset terminal.container_disk to 51200)")
@@ -1040,7 +1118,7 @@ def run_doctor(args):
for line in auth_status.detail_lines:
check_info(f"Vercel auth {line}")
persistent = os.getenv("TERMINAL_CONTAINER_PERSISTENT", "true").lower() in ("1", "true", "yes", "on")
persistent = os.getenv("TERMINAL_CONTAINER_PERSISTENT", "true").lower() in {"1", "true", "yes", "on"}
if persistent:
check_info("Vercel persistence: snapshot filesystem only; live processes do not survive sandbox recreation")
else:
@@ -1058,15 +1136,14 @@ def run_doctor(args):
elif shutil.which("agent-browser"):
check_ok("agent-browser", "(browser automation)")
agent_browser_ok = True
elif _is_termux():
check_info("agent-browser is not installed (expected in the tested Termux path)")
check_info("Install it manually later with: npm install -g agent-browser && agent-browser install")
check_info("Termux browser setup:")
for step in _termux_browser_setup_steps(node_installed=True):
check_info(step)
else:
if _is_termux():
check_info("agent-browser is not installed (expected in the tested Termux path)")
check_info("Install it manually later with: npm install -g agent-browser && agent-browser install")
check_info("Termux browser setup:")
for step in _termux_browser_setup_steps(node_installed=True):
check_info(step)
else:
check_warn("agent-browser not installed", "(run: npm install)")
check_warn("agent-browser not installed", "(run: npm install)")
# Chromium presence — the browser tools silently fail to register when
# agent-browser is found but no Playwright-managed Chromium is on disk
@@ -1117,15 +1194,14 @@ def run_doctor(args):
f"Install with: cd {PROJECT_ROOT} && "
"npx playwright install --with-deps chromium"
)
elif _is_termux():
check_info("Node.js not found (browser tools are optional in the tested Termux path)")
check_info("Install Node.js on Termux with: pkg install nodejs")
check_info("Termux browser setup:")
for step in _termux_browser_setup_steps(node_installed=False):
check_info(step)
else:
if _is_termux():
check_info("Node.js not found (browser tools are optional in the tested Termux path)")
check_info("Install Node.js on Termux with: pkg install nodejs")
check_info("Termux browser setup:")
for step in _termux_browser_setup_steps(node_installed=False):
check_info(step)
else:
check_warn("Node.js not found", "(optional, needed for browser tools)")
check_warn("Node.js not found", "(optional, needed for browser tools)")
# npm audit for all Node.js packages
_npm_bin = _safe_which("npm")
+3 -3
@@ -307,7 +307,7 @@ def cmd_fallback_clear(args) -> None: # noqa: ARG001
print()
print(" Cancelled.")
return
if resp not in ("y", "yes"):
if resp not in {"y", "yes"}:
print(" Cancelled — no change.")
return
@@ -347,11 +347,11 @@ def _numbered_pick(question: str, choices: List[str]) -> Optional[int]:
def cmd_fallback(args) -> None:
"""Top-level dispatcher for ``hermes fallback [subcommand]``."""
sub = getattr(args, "fallback_command", None)
if sub in (None, "", "list", "ls"):
if sub in {None, "", "list", "ls"}:
cmd_fallback_list(args)
elif sub == "add":
cmd_fallback_add(args)
elif sub in ("remove", "rm"):
elif sub in {"remove", "rm"}:
cmd_fallback_remove(args)
elif sub == "clear":
cmd_fallback_clear(args)
+12 -13
@@ -1194,7 +1194,7 @@ def _systemd_operational(system: bool = False) -> bool:
)
# "running", "degraded", "starting" all mean systemd is PID 1
status = result.stdout.strip().lower()
return status in ("running", "degraded", "starting", "initializing")
return status in {"running", "degraded", "starting", "initializing"}
except (RuntimeError, subprocess.TimeoutExpired, OSError):
return False
@@ -2915,7 +2915,7 @@ def launchd_start():
try:
subprocess.run(["launchctl", "kickstart", f"{_launchd_domain()}/{label}"], check=True, timeout=30)
except subprocess.CalledProcessError as e:
if e.returncode not in (3, 113):
if e.returncode not in {3, 113}:
raise
print("↻ launchd job was unloaded; reloading service definition")
subprocess.run(["launchctl", "bootstrap", _launchd_domain(), str(plist_path)], check=True, timeout=30)
@@ -2939,7 +2939,7 @@ def launchd_stop():
try:
subprocess.run(["launchctl", "bootout", target], check=True, timeout=90)
except subprocess.CalledProcessError as e:
if e.returncode in (3, 113):
if e.returncode in {3, 113}:
pass # Already unloaded — nothing to stop.
else:
raise
@@ -3011,7 +3011,7 @@ def launchd_restart():
subprocess.run(["launchctl", "kickstart", "-k", target], check=True, timeout=90)
print("✓ Service restarted")
except subprocess.CalledProcessError as e:
if e.returncode not in (3, 113):
if e.returncode not in {3, 113}:
raise
# Job not loaded — bootstrap and start fresh
print("↻ launchd job was unloaded; reloading")
@@ -3749,7 +3749,7 @@ def _platform_status(platform: dict) -> str:
password = get_env_value("MATRIX_PASSWORD")
if (val or password) and homeserver:
e2ee = get_env_value("MATRIX_ENCRYPTION")
suffix = " + E2EE" if e2ee and e2ee.lower() in ("true", "1", "yes") else ""
suffix = " + E2EE" if e2ee and e2ee.lower() in {"true", "1", "yes"} else ""
return f"configured{suffix}"
if val or password or homeserver:
return "partially configured"
@@ -4947,15 +4947,14 @@ def gateway_setup():
print_info(" Run in foreground: hermes gateway run")
print_info(" For persistence: tmux new -s hermes 'hermes gateway run'")
print_info(" To enable systemd: add systemd=true to /etc/wsl.conf, then 'wsl --shutdown'")
elif is_termux():
from hermes_constants import display_hermes_home as _dhh
print_info(" Termux does not use systemd/launchd services.")
print_info(" Run in foreground: hermes gateway run")
print_info(f" Or start it manually in the background (best effort): nohup hermes gateway run >{_dhh()}/logs/gateway.log 2>&1 &")
else:
if is_termux():
from hermes_constants import display_hermes_home as _dhh
print_info(" Termux does not use systemd/launchd services.")
print_info(" Run in foreground: hermes gateway run")
print_info(f" Or start it manually in the background (best effort): nohup hermes gateway run >{_dhh()}/logs/gateway.log 2>&1 &")
else:
print_info(" Service install not supported on this platform.")
print_info(" Run in foreground: hermes gateway run")
print_info(" Service install not supported on this platform.")
print_info(" Run in foreground: hermes gateway run")
else:
print()
print_info("No platforms configured. Run 'hermes gateway setup' when ready.")
+85 -1059
File diff suppressed because it is too large
+3 -3
@@ -32,11 +32,11 @@ def hooks_command(args) -> None:
print("Run 'hermes hooks --help' for details.")
return
if sub in ("list", "ls"):
if sub in {"list", "ls"}:
_cmd_list(args)
elif sub == "test":
_cmd_test(args)
elif sub in ("revoke", "remove", "rm"):
elif sub in {"revoke", "remove", "rm"}:
_cmd_revoke(args)
elif sub == "doctor":
_cmd_doctor(args)
@@ -220,7 +220,7 @@ def _cmd_test(args) -> None:
if getattr(args, "for_tool", None):
specs = [
s for s in specs
if s.event not in ("pre_tool_call", "post_tool_call")
if s.event not in {"pre_tool_call", "post_tool_call"}
or s.matches_tool(args.for_tool)
]
+47 -23
@@ -82,7 +82,7 @@ def _parse_workspace_flag(value: str) -> tuple[str, Optional[str]]:
if not value:
return ("scratch", None)
v = value.strip()
if v in ("scratch", "worktree"):
if v in {"scratch", "worktree"}:
return (v, None)
if v.startswith("dir:"):
path = v[len("dir:"):].strip()
@@ -510,6 +510,10 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
p_nsub.add_argument("--chat-id", required=True)
p_nsub.add_argument("--thread-id", default=None)
p_nsub.add_argument("--user-id", default=None)
p_nsub.add_argument(
"--notifier-profile", default=None,
help="Profile gateway that owns/delivers this subscription (default: active profile)",
)
p_nlist = sub.add_parser(
"notify-list",
@@ -648,6 +652,16 @@ def kanban_command(args: argparse.Namespace) -> int:
# keeps the patch small and inherits the exact same resolution the
# dispatcher uses for workers — consistency is a feature here.
board_override = getattr(args, "board", None)
prev_board_env = os.environ.get("HERMES_KANBAN_BOARD")
restore_board_env = False
def _restore_board_env() -> None:
if not restore_board_env:
return
if prev_board_env is None:
os.environ.pop("HERMES_KANBAN_BOARD", None)
else:
os.environ["HERMES_KANBAN_BOARD"] = prev_board_env
if board_override:
try:
normed = kb._normalize_board_slug(board_override)
@@ -667,12 +681,16 @@ def kanban_command(args: argparse.Namespace) -> int:
)
return 1
os.environ["HERMES_KANBAN_BOARD"] = normed
restore_board_env = True
# Boards management doesn't touch the DB at all — dispatch early so
# fresh installs that haven't initialized any DB can still use
# `hermes kanban boards create …`.
if action == "boards":
return _dispatch_boards(args)
try:
return _dispatch_boards(args)
finally:
_restore_board_env()
# Auto-initialize the DB before dispatching any subcommand. init_db
# is idempotent, so running it every invocation is cheap (one
@@ -685,6 +703,7 @@ def kanban_command(args: argparse.Namespace) -> int:
kb.init_db()
except Exception as exc:
print(f"kanban: could not initialize database: {exc}", file=sys.stderr)
_restore_board_env()
return 1
handlers = {
@@ -726,12 +745,16 @@ def kanban_command(args: argparse.Namespace) -> int:
handler = handlers.get(action)
if not handler:
print(f"kanban: unknown action {action!r}", file=sys.stderr)
_restore_board_env()
return 2
try:
return int(handler(args) or 0)
except (ValueError, RuntimeError) as exc:
print(f"kanban: {exc}", file=sys.stderr)
_restore_board_env()
return 1
finally:
_restore_board_env()
# ---------------------------------------------------------------------------
@@ -765,15 +788,15 @@ def _dispatch_boards(args: argparse.Namespace) -> int:
can still run ``boards create`` / ``boards list``.
"""
sub = getattr(args, "boards_action", None) or "list"
if sub in ("list", "ls"):
if sub in {"list", "ls"}:
return _cmd_boards_list(args)
if sub in ("create", "new"):
if sub in {"create", "new"}:
return _cmd_boards_create(args)
if sub in ("rm", "remove", "delete"):
if sub in {"rm", "remove", "delete"}:
return _cmd_boards_rm(args)
if sub in ("switch", "use"):
if sub in {"switch", "use"}:
return _cmd_boards_switch(args)
if sub in ("show", "current"):
if sub in {"show", "current"}:
return _cmd_boards_show(args)
if sub == "rename":
return _cmd_boards_rename(args)
@@ -1278,7 +1301,7 @@ def _cmd_show(args: argparse.Namespace) -> int:
def _cmd_assign(args: argparse.Namespace) -> int:
profile = None if args.profile.lower() in ("none", "-", "null") else args.profile
profile = None if args.profile.lower() in {"none", "-", "null"} else args.profile
with kb.connect() as conn:
ok = kb.assign_task(conn, args.task_id, profile)
if not ok:
@@ -1305,7 +1328,7 @@ def _cmd_reclaim(args: argparse.Namespace) -> int:
def _cmd_reassign(args: argparse.Namespace) -> int:
profile = None if args.profile.lower() in ("none", "-", "null") else args.profile
profile = None if args.profile.lower() in {"none", "-", "null"} else args.profile
with kb.connect() as conn:
ok = kb.reassign_task(
conn, args.task_id, profile,
@@ -1921,6 +1944,7 @@ def _cmd_notify_subscribe(args: argparse.Namespace) -> int:
conn, task_id=args.task_id,
platform=args.platform, chat_id=args.chat_id,
thread_id=args.thread_id, user_id=args.user_id,
notifier_profile=args.notifier_profile or _profile_author(),
)
print(f"Subscribed {args.platform}:{args.chat_id}"
+ (f":{args.thread_id}" if args.thread_id else "")
@@ -1939,8 +1963,9 @@ def _cmd_notify_list(args: argparse.Namespace) -> int:
return 0
for s in subs:
thr = f":{s['thread_id']}" if s.get("thread_id") else ""
owner = f" owner={s['notifier_profile']}" if s.get("notifier_profile") else ""
print(f" {s['task_id']:10s} {s['platform']}:{s['chat_id']}{thr}"
f" (since event {s['last_event_id']})")
f" (since event {s['last_event_id']}){owner}")
return 0
@@ -2071,19 +2096,18 @@ def _cmd_specify(args: argparse.Namespace) -> int:
"reason": outcome.reason,
"new_title": outcome.new_title,
}))
elif outcome.ok:
title_suffix = (
f" — retitled: {outcome.new_title!r}"
if outcome.new_title
else ""
)
print(f"Specified {outcome.task_id} → todo{title_suffix}")
else:
if outcome.ok:
title_suffix = (
f" — retitled: {outcome.new_title!r}"
if outcome.new_title
else ""
)
print(f"Specified {outcome.task_id} → todo{title_suffix}")
else:
print(
f"kanban: specify {outcome.task_id}: {outcome.reason}",
file=sys.stderr,
)
print(
f"kanban: specify {outcome.task_id}: {outcome.reason}",
file=sys.stderr,
)
if not all_flag:
return 0 if ok_count == 1 else 1
# --all: succeed if at least one promotion landed; exit 1 only when
@@ -2206,7 +2230,7 @@ def run_slash(rest: str) -> str:
out = buf_out.getvalue().rstrip()
err = buf_err.getvalue().rstrip()
# Help dump (exit 0) → return the captured help text directly.
if exc.code in (0, None) and out:
if exc.code in {0, None} and out:
return out
body = err or out
return f"⚠ /kanban usage error\n{body}" if body else "⚠ /kanban usage error"
+39 -6
@@ -861,6 +861,7 @@ CREATE TABLE IF NOT EXISTS kanban_notify_subs (
chat_id TEXT NOT NULL,
thread_id TEXT NOT NULL DEFAULT '',
user_id TEXT,
notifier_profile TEXT,
created_at INTEGER NOT NULL,
last_event_id INTEGER NOT NULL DEFAULT 0,
PRIMARY KEY (task_id, platform, chat_id, thread_id)
@@ -1085,6 +1086,18 @@ def _migrate_add_optional_columns(conn: sqlite3.Connection) -> None:
"ON task_events(run_id, id)"
)
notify_table_exists = conn.execute(
"SELECT name FROM sqlite_master WHERE type='table' AND name='kanban_notify_subs'"
).fetchone() is not None
if notify_table_exists:
notify_cols = {
row["name"] for row in conn.execute("PRAGMA table_info(kanban_notify_subs)")
}
if "notifier_profile" not in notify_cols:
_add_column_if_missing(
conn, "kanban_notify_subs", "notifier_profile", "notifier_profile TEXT"
)
# One-shot backfill: any task that is 'running' before runs existed
# had its claim_lock / claim_expires / worker_pid on the task row.
# Synthesize a matching task_runs row so subsequent end-run / heartbeat
@@ -1813,7 +1826,7 @@ def _synthesize_ended_run(
# ---------------------------------------------------------------------------
def recompute_ready(conn: sqlite3.Connection) -> int:
"""Promote ``todo`` tasks to ``ready`` when all parents are ``done``.
"""Promote ``todo`` tasks to ``ready`` when all parents are ``done`` or ``archived``.
Returns the number of tasks promoted. Safe to call inside or outside
an existing transaction; it opens its own IMMEDIATE txn.
@@ -1831,7 +1844,7 @@ def recompute_ready(conn: sqlite3.Connection) -> int:
"WHERE l.child_id = ?",
(task_id,),
).fetchall()
if all(p["status"] == "done" for p in parents):
if all(p["status"] in {"done", "archived"} for p in parents):
conn.execute(
"UPDATE tasks SET status = 'ready' WHERE id = ? AND status = 'todo'",
(task_id,),
@@ -1872,7 +1885,7 @@ def claim_task(
undone = conn.execute(
"SELECT 1 FROM task_links l "
"JOIN tasks p ON p.id = l.parent_id "
"WHERE l.child_id = ? AND p.status != 'done' LIMIT 1",
"WHERE l.child_id = ? AND p.status NOT IN ('done', 'archived') LIMIT 1",
(task_id,),
).fetchone()
if undone:
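Review note: the two hunks above change dependency gating so that an `archived` parent no longer blocks its children. A minimal sketch of the resulting check follows. It reuses the query from `claim_task`, but the two-table schema and the helper name `has_unfinished_parent` are assumptions made for illustration, not the project's real schema or API.

```python
import sqlite3


def has_unfinished_parent(conn: sqlite3.Connection, task_id: str) -> bool:
    """Return True when at least one parent is neither done nor archived."""
    row = conn.execute(
        "SELECT 1 FROM task_links l "
        "JOIN tasks p ON p.id = l.parent_id "
        "WHERE l.child_id = ? AND p.status NOT IN ('done', 'archived') LIMIT 1",
        (task_id,),
    ).fetchone()
    return row is not None


def demo() -> dict:
    # Hypothetical minimal schema: only the columns the query touches.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE task_links (parent_id TEXT NOT NULL, child_id TEXT NOT NULL);
        INSERT INTO tasks VALUES
            ('p1', 'done'), ('p2', 'archived'), ('p3', 'running'),
            ('c1', 'todo'), ('c2', 'todo');
        -- c1 depends on a done and an archived parent; c2 on a running one.
        INSERT INTO task_links VALUES
            ('p1', 'c1'), ('p2', 'c1'), ('p1', 'c2'), ('p3', 'c2');
        """
    )
    return {task: has_unfinished_parent(conn, task) for task in ("c1", "c2")}


if __name__ == "__main__":
    print(demo())
```

With the old `p.status != 'done'` predicate, `c1` would have been reported as blocked by its archived parent.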
@@ -3917,6 +3930,25 @@ def _default_spawn(
prompt = f"work kanban task {task.id}"
env = dict(os.environ)
# Inject HERMES_HOME so the worker reads the profile-scoped config.yaml
# (fallback_providers, toolsets, agent settings, etc.) instead of the root
# config. Without this, `env = dict(os.environ)` copies only the parent's
# env, and when the child process starts `hermes -p <name>` the
# _apply_profile_override() runs *before* hermes_constants is imported.
# If HERMES_HOME is absent from the child's env, get_hermes_home() falls
# back to Path.home() / ".hermes" (the DEFAULT profile root), ignoring the
# profile-specific config entirely. Fixes profile-scoped fallback_providers
# being invisible to kanban workers.
from hermes_cli.profiles import resolve_profile_env
try:
env["HERMES_HOME"] = resolve_profile_env(profile_arg)
except FileNotFoundError:
# Profile dir doesn't exist — defer resolution to the CLI's
# _apply_profile_override() via HERMES_PROFILE (set below).
# This only happens in test fixtures where the isolated
# HERMES_HOME never had profiles created.
pass
if task.tenant:
env["HERMES_TENANT"] = task.tenant
env["HERMES_KANBAN_TASK"] = task.id
@@ -4341,6 +4373,7 @@ def add_notify_sub(
chat_id: str,
thread_id: Optional[str] = None,
user_id: Optional[str] = None,
notifier_profile: Optional[str] = None,
) -> None:
"""Register a gateway source that wants terminal-state notifications
for ``task_id``. Idempotent on (task, platform, chat, thread)."""
@@ -4349,10 +4382,10 @@ def add_notify_sub(
conn.execute(
"""
INSERT OR IGNORE INTO kanban_notify_subs
(task_id, platform, chat_id, thread_id, user_id, created_at)
VALUES (?, ?, ?, ?, ?, ?)
(task_id, platform, chat_id, thread_id, user_id, notifier_profile, created_at)
VALUES (?, ?, ?, ?, ?, ?, ?)
""",
(task_id, platform, chat_id, thread_id or "", user_id, now),
(task_id, platform, chat_id, thread_id or "", user_id, notifier_profile, now),
)
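Review note: the `notifier_profile` column is added both to the `CREATE TABLE` statement and through a guarded migration, so old databases pick it up without failing on a second run. The sketch below shows one way such a guard can work. `add_column_if_missing` here is a stand-in written for this example; the behavior of the project's own `_add_column_if_missing` is assumed, not confirmed.

```python
import sqlite3


def add_column_if_missing(
    conn: sqlite3.Connection, table: str, column: str, ddl: str
) -> bool:
    """Add a column only when PRAGMA table_info does not list it yet."""
    # table_info rows are (cid, name, type, notnull, dflt_value, pk).
    cols = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in cols:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {ddl}")
    return True


def demo():
    conn = sqlite3.connect(":memory:")
    # Pre-migration shape of the table, without notifier_profile.
    conn.execute(
        """
        CREATE TABLE kanban_notify_subs (
            task_id TEXT NOT NULL,
            platform TEXT NOT NULL,
            chat_id TEXT NOT NULL,
            thread_id TEXT NOT NULL DEFAULT '',
            user_id TEXT,
            created_at INTEGER NOT NULL,
            last_event_id INTEGER NOT NULL DEFAULT 0,
            PRIMARY KEY (task_id, platform, chat_id, thread_id)
        )
        """
    )
    args = ("kanban_notify_subs", "notifier_profile", "notifier_profile TEXT")
    first = add_column_if_missing(conn, *args)
    second = add_column_if_missing(conn, *args)  # no-op on the second run

    insert = (
        "INSERT OR IGNORE INTO kanban_notify_subs "
        "(task_id, platform, chat_id, thread_id, user_id, notifier_profile, created_at) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)"
    )
    conn.execute(insert, ("t1", "slack", "C1", "", None, "ops", 100))
    # Same primary key: ignored, so the original owner is kept.
    conn.execute(insert, ("t1", "slack", "C1", "", None, "other", 200))
    rows = conn.execute("SELECT notifier_profile FROM kanban_notify_subs").fetchall()
    return first, second, rows


if __name__ == "__main__":
    print(demo())
```

One consequence worth noting: because the insert is `INSERT OR IGNORE` on the existing primary key, re-subscribing from a different profile does not change the stored `notifier_profile`.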
+137 -10
@@ -177,7 +177,7 @@ def _active_hallucination_events(
active: list[Any] = []
for ev in events:
k = _event_kind(ev)
if k in ("completed", "edited"):
if k in {"completed", "edited"}:
active.clear()
elif k == kind:
active.append(ev)
@@ -193,10 +193,9 @@ def _latest_clean_event_ts(events: Iterable[Any]) -> int:
"""
latest = 0
for ev in events:
if _event_kind(ev) in ("completed", "edited"):
if _event_kind(ev) in {"completed", "edited"}:
t = _event_ts(ev)
if t > latest:
latest = t
latest = max(latest, t)
return latest
@@ -356,7 +355,7 @@ def _rule_repeated_failures(task, events, runs, now, cfg) -> list[Diagnostic]:
most_recent_outcome = None
for r in reversed(ordered_runs):
oc = _task_field(r, "outcome")
if oc in ("spawn_failed", "timed_out", "crashed"):
if oc in {"spawn_failed", "timed_out", "crashed"}:
most_recent_outcome = oc
break
@@ -374,7 +373,7 @@ def _rule_repeated_failures(task, events, runs, now, cfg) -> list[Diagnostic]:
label=f"Fix profile auth: hermes -p {assignee} auth",
payload={"command": f"hermes -p {assignee} auth"},
))
elif most_recent_outcome in ("timed_out", "crashed"):
elif most_recent_outcome in {"timed_out", "crashed"}:
# Worker got off the ground but died. Logs are the right place
# to diagnose; reclaim/reassign are the recovery levers.
task_id = _task_field(task, "id")
@@ -467,7 +466,7 @@ def _rule_repeated_crashes(task, events, runs, now, cfg) -> list[Diagnostic]:
consecutive += 1
if last_err is None:
last_err = _task_field(r, "error")
elif outcome in ("completed", "reclaimed"):
elif outcome in {"completed", "reclaimed"}:
# A success (or manual reclaim) breaks the streak.
break
else:
@@ -534,8 +533,7 @@ def _rule_stuck_in_blocked(task, events, runs, now, cfg) -> list[Diagnostic]:
for ev in events:
if _event_kind(ev) == "blocked":
t = _event_ts(ev)
if t > last_blocked_ts:
last_blocked_ts = t
last_blocked_ts = max(last_blocked_ts, t)
if last_blocked_ts == 0:
return []
age_hours = (now - last_blocked_ts) / 3600.0
@@ -543,7 +541,7 @@ def _rule_stuck_in_blocked(task, events, runs, now, cfg) -> list[Diagnostic]:
return []
# Any comment / unblock after the block breaks the "stale" signal.
for ev in events:
if _event_kind(ev) in ("commented", "unblocked") and _event_ts(ev) > last_blocked_ts:
if _event_kind(ev) in {"commented", "unblocked"} and _event_ts(ev) > last_blocked_ts:
return []
actions: list[DiagnosticAction] = [
DiagnosticAction(
@@ -570,6 +568,129 @@ def _rule_stuck_in_blocked(task, events, runs, now, cfg) -> list[Diagnostic]:
)]
def _rule_stranded_in_ready(task, events, runs, now, cfg) -> list[Diagnostic]:
"""Task has been in ``ready`` status for too long without any worker
claiming it.
Threshold: cfg["stranded_threshold_seconds"] (default 1800 = 30 min).
Catches every "task waiting for a worker that never comes" case
without caring WHY:
* Operator typo'd the assignee — no profile or external worker matches.
* Profile was deleted, leaving its tasks stranded.
* External worker pool (Codex CLI, Claude Code lane, custom daemon)
is down, hung, or wasn't started.
* Dispatcher is misconfigured (wrong board, wrong HERMES_HOME).
Pre-rule, all of these silently rotted in ``skipped_nonspawnable``:
the dispatcher correctly skipped them (good, no respawn loop) but
nobody surfaced the fact that operator-actionable work was
accumulating. The rule fires when a ready task's promoted-to-ready
timestamp is older than the threshold AND the assignee is non-empty
(truly unassigned tasks have their own ``skipped_unassigned`` signal
on the dispatcher and a different operator response).
The signal is age-based on purpose: it's identity-agnostic, so it
works for Hermes profiles, registered lanes, external workers, and
typos uniformly. No registry to curate, no per-board allowlist.
"""
threshold_seconds = float(
cfg.get("stranded_threshold_seconds", 30 * 60)
)
status = _task_field(task, "status")
if status != "ready":
return []
# Skip tasks with a live claim — they're being worked on, even if
# the worker hasn't reported progress yet (run-level liveness
# extends the claim TTL; we don't want to second-guess that here).
if _task_field(task, "claim_lock"):
return []
assignee = _task_field(task, "assignee") or ""
if not assignee.strip():
# Unassigned tasks: the dispatcher's ``skipped_unassigned`` is
# already the right signal. A separate diagnostic here would
# double-flag the same condition.
return []
# Find the most recent event that put this task into ready.
# ``created`` covers tasks born ready; ``promoted`` covers parent-
# done auto-promotion; ``reclaimed`` covers TTL/crash recovery;
# ``unblocked`` covers human-driven resumes.
READY_TRANSITION_KINDS = {
"created", "promoted", "reclaimed", "unblocked",
}
last_ready_ts = 0
for ev in events:
if _event_kind(ev) in READY_TRANSITION_KINDS:
t = _event_ts(ev)
last_ready_ts = max(last_ready_ts, t)
# Fallback: if no qualifying event exists (very old task or events
# truncated), fall back to ``created_at`` on the task row. Better
# to occasionally over-flag an ancient task than miss a stranded one.
if last_ready_ts == 0:
last_ready_ts = int(_task_field(task, "created_at", default=0) or 0)
if last_ready_ts == 0:
return []
age_seconds = now - last_ready_ts
if age_seconds < threshold_seconds:
return []
# Format the age in the largest sensible unit.
if age_seconds >= 3600:
age_str = f"{age_seconds / 3600:.1f}h"
else:
age_str = f"{int(age_seconds / 60)}m"
# Severity escalates with age. Below 2x threshold = warning;
# 2x to 6x = error; beyond 6x = critical (something is clearly
# broken, not just slow).
if age_seconds >= threshold_seconds * 6:
severity = "critical"
elif age_seconds >= threshold_seconds * 2:
severity = "error"
else:
severity = "warning"
actions = [
DiagnosticAction(
kind="reassign",
label="Reassign to a different worker",
payload={"current_assignee": assignee},
),
DiagnosticAction(
kind="cli_hint",
label="Check dispatcher status",
payload={"command": "hermes kanban diagnostics"},
),
]
return [Diagnostic(
kind="stranded_in_ready",
severity=severity,
title=f"Ready for {age_str} with no worker",
detail=(
f"This task has been ready for {age_str} but nothing has "
f"claimed it. Common causes: assignee {assignee!r} is "
f"misspelled, the profile was deleted, or the external "
f"worker pool for this lane is down. Confirm the assignee "
f"is correct and that a worker is actually polling for it."
),
actions=actions,
first_seen_at=last_ready_ts,
last_seen_at=last_ready_ts,
count=1,
data={
"ready_since": last_ready_ts,
"age_seconds": int(age_seconds),
"assignee": assignee,
"threshold_seconds": int(threshold_seconds),
},
)]
# Registry — order matters: rules higher on the list render first when
# severity ties. Add new rules here.
_RULES: list[RuleFn] = [
@@ -578,6 +699,7 @@ _RULES: list[RuleFn] = [
_rule_repeated_failures,
_rule_repeated_crashes,
_rule_stuck_in_blocked,
_rule_stranded_in_ready,
]
@@ -589,6 +711,7 @@ DIAGNOSTIC_KINDS = (
"repeated_failures",
"repeated_crashes",
"stuck_in_blocked",
"stranded_in_ready",
)
@@ -598,6 +721,10 @@ DEFAULT_CONFIG = {
"spawn_failure_threshold": 3,
"crash_threshold": 2,
"blocked_stale_hours": 24,
# Stranded-task threshold. 30 min by default — below that, the
# signal is dominated by tasks that are about to be claimed on the
# next dispatcher tick (default 60s) and would just be noise.
"stranded_threshold_seconds": 30 * 60,
}
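Review note: the severity tiers and age formatting in `_rule_stranded_in_ready` are easy to check in isolation. The sketch below pulls that arithmetic into two standalone functions. The names `stranded_severity` and `format_age` are made up for this example; the thresholds and format strings are copied from the diff.

```python
from typing import Optional


def stranded_severity(
    age_seconds: float, threshold_seconds: float = 30 * 60
) -> Optional[str]:
    """Map time spent in ``ready`` to a severity, or None below the threshold."""
    if age_seconds < threshold_seconds:
        return None
    if age_seconds >= threshold_seconds * 6:
        return "critical"
    if age_seconds >= threshold_seconds * 2:
        return "error"
    return "warning"


def format_age(age_seconds: float) -> str:
    """Render the age in hours once it reaches one hour, else whole minutes."""
    if age_seconds >= 3600:
        return f"{age_seconds / 3600:.1f}h"
    return f"{int(age_seconds / 60)}m"


if __name__ == "__main__":
    for age in (1799, 1800, 3600, 10800):
        print(age, stranded_severity(age), format_age(age))
```

With the default 30 minute threshold, the boundaries fall at 30 minutes (warning), 1 hour (error), and 3 hours (critical).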
+221 -157
@@ -124,7 +124,7 @@ def _apply_profile_override() -> None:
# 1. Check for explicit -p / --profile flag
for i, arg in enumerate(argv):
if arg in ("--profile", "-p") and i + 1 < len(argv):
if arg in {"--profile", "-p"} and i + 1 < len(argv):
profile_name = argv[i + 1]
consume = 2
break
@@ -192,7 +192,7 @@ def _apply_profile_override() -> None:
# Strip the flag from argv so argparse doesn't choke
if consume > 0:
for i, arg in enumerate(argv):
if arg in ("--profile", "-p"):
if arg in {"--profile", "-p"}:
start = i + 1 # +1 because argv is sys.argv[1:]
sys.argv = sys.argv[:start] + sys.argv[start + consume :]
break
@@ -505,8 +505,7 @@ def _session_browse_picker(sessions: list) -> Optional[str]:
# Compute visible area
visible_rows = max_y - 4 # header + col header + blank + footer
if visible_rows < 1:
visible_rows = 1
visible_rows = max(visible_rows, 1)
# Clamp cursor and scroll
if not filtered:
@@ -518,8 +517,7 @@ def _session_browse_picker(sessions: list) -> Optional[str]:
else:
if cursor >= len(filtered):
cursor = len(filtered) - 1
if cursor < 0:
cursor = 0
cursor = max(cursor, 0)
if cursor < scroll_offset:
scroll_offset = cursor
elif cursor >= scroll_offset + visible_rows:
@@ -569,13 +567,13 @@ def _session_browse_picker(sessions: list) -> Optional[str]:
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP,):
if key in {curses.KEY_UP,}:
if filtered:
cursor = (cursor - 1) % len(filtered)
elif key in (curses.KEY_DOWN,):
elif key in {curses.KEY_DOWN,}:
if filtered:
cursor = (cursor + 1) % len(filtered)
elif key in (curses.KEY_ENTER, 10, 13):
elif key in {curses.KEY_ENTER, 10, 13}:
if filtered:
result_holder[0] = filtered[cursor]["id"]
return
@@ -589,7 +587,7 @@ def _session_browse_picker(sessions: list) -> Optional[str]:
else:
# Second Esc exits
return
elif key in (curses.KEY_BACKSPACE, 127, 8):
elif key in {curses.KEY_BACKSPACE, 127, 8}:
if search_text:
search_text = search_text[:-1]
if search_text:
@@ -628,7 +626,7 @@ def _session_browse_picker(sessions: list) -> Optional[str]:
while True:
try:
val = input(f"\n Select [1-{len(sessions)}]: ").strip()
if not val or val.lower() in ("q", "quit", "exit"):
if not val or val.lower() in {"q", "quit", "exit"}:
return None
idx = int(val) - 1
if 0 <= idx < len(sessions):
@@ -899,6 +897,11 @@ to avoid false-positive reinstalls on every launch.
def _tui_need_npm_install(root: Path) -> bool:
"""True when @hermes/ink is missing or node_modules is behind package-lock.json.
Prebuilt bundle mode: when ``dist/entry.js`` exists and there is no
``package-lock.json`` (nix install layout only ships ``dist/`` +
``package.json``), skip reinstall entirely: the bundle is self-contained
and there is nothing to install.
Compares ``package-lock.json`` against ``node_modules/.package-lock.json``
(npm's hidden lockfile) by **content**, not mtime: git checkouts and npm
rewrites can bump the root lockfile's timestamp even when installed deps
@@ -916,10 +919,16 @@ def _tui_need_npm_install(root: Path) -> bool:
we'd rather not force a reinstall for them. Falls back to mtime
comparison if either lockfile is unparseable.
"""
lock = root / "package-lock.json"
entry = root / "dist" / "entry.js"
# Prebuilt self-contained bundle (nix / packaged release): no lockfile
# shipped, dist/entry.js is the single runtime artefact.
if entry.is_file() and not lock.is_file():
return False
ink = root / "node_modules" / "@hermes" / "ink" / "package.json"
if not ink.is_file():
return True
lock = root / "package-lock.json"
if not lock.is_file():
return False
marker = root / "node_modules" / ".package-lock.json"
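Review note: the new early return in `_tui_need_npm_install` treats "bundle present, no lockfile" as a prebuilt install. The sketch below isolates that predicate and exercises it against a temporary directory. `is_prebuilt_bundle` is a hypothetical helper name; the two path checks mirror the diff.

```python
import tempfile
from pathlib import Path


def is_prebuilt_bundle(root: Path) -> bool:
    """True when dist/entry.js exists and no package-lock.json is shipped."""
    entry = root / "dist" / "entry.js"
    lock = root / "package-lock.json"
    return entry.is_file() and not lock.is_file()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        empty = is_prebuilt_bundle(root)  # nothing built yet

        (root / "dist").mkdir()
        (root / "dist" / "entry.js").write_text("// bundle\n")
        bundle_only = is_prebuilt_bundle(root)  # packaged layout

        (root / "package-lock.json").write_text("{}")
        with_lock = is_prebuilt_bundle(root)  # looks like a source checkout
    return empty, bundle_only, with_lock


if __name__ == "__main__":
    print(demo())
```

A source checkout that has been built once still has a lockfile, so it falls through to the regular `node_modules` comparison.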
@@ -958,63 +967,6 @@ def _tui_need_npm_install(root: Path) -> bool:
return False
def _find_bundled_tui(tui_dir: Path) -> Optional[Path]:
"""Directory whose dist/entry.js we should run: HERMES_TUI_DIR first, else repo ui-tui."""
env = os.environ.get("HERMES_TUI_DIR")
if env:
p = Path(env)
if (p / "dist" / "entry.js").exists() and not _tui_need_npm_install(p):
return p
if (tui_dir / "dist" / "entry.js").exists() and not _tui_need_npm_install(tui_dir):
return tui_dir
return None
def _tui_build_needed(tui_dir: Path) -> bool:
if _hermes_ink_bundle_stale(tui_dir):
return True
entry = tui_dir / "dist" / "entry.js"
if not entry.exists():
return True
dist_m = entry.stat().st_mtime
skip = frozenset({"node_modules", "dist"})
for dirpath, dirnames, filenames in os.walk(tui_dir, topdown=True):
dirnames[:] = [d for d in dirnames if d not in skip]
for fn in filenames:
if fn.endswith((".ts", ".tsx")):
if os.path.getmtime(os.path.join(dirpath, fn)) > dist_m:
return True
for meta in (
"package.json",
"package-lock.json",
"tsconfig.json",
"tsconfig.build.json",
):
mp = tui_dir / meta
if mp.exists() and mp.stat().st_mtime > dist_m:
return True
return False
def _hermes_ink_bundle_stale(tui_dir: Path) -> bool:
ink_root = tui_dir / "packages" / "hermes-ink"
bundle = ink_root / "dist" / "ink-bundle.js"
if not bundle.exists():
return True
bm = bundle.stat().st_mtime
skip = frozenset({"node_modules", "dist"})
for dirpath, dirnames, filenames in os.walk(ink_root, topdown=True):
dirnames[:] = [d for d in dirnames if d not in skip]
for fn in filenames:
if fn.endswith((".ts", ".tsx")):
if os.path.getmtime(os.path.join(dirpath, fn)) > bm:
return True
mp = ink_root / "package.json"
if mp.exists() and mp.stat().st_mtime > bm:
return True
return False
def _ensure_tui_node() -> None:
"""Make sure `node` + `npm` are on PATH for the TUI.
@@ -1073,7 +1025,7 @@ def _ensure_tui_node() -> None:
def _make_tui_argv(tui_dir: Path, tui_dev: bool) -> tuple[list[str], Path]:
"""TUI: --dev → tsx src; else node dist (HERMES_TUI_DIR or ui-tui, build when stale)."""
"""TUI: --dev → tsx src; else node dist (HERMES_TUI_DIR prebuilt or esbuild)."""
_ensure_tui_node()
def _node_bin(bin: str) -> str:
@@ -1087,23 +1039,31 @@ def _make_tui_argv(tui_dir: Path, tui_dev: bool) -> tuple[list[str], Path]:
sys.exit(1)
return path
# pre-built dist + node_modules (nix / full HERMES_TUI_DIR) skips npm.
# Footgun: --dev against a prebuilt bundle that has no source/node_modules.
ext_dir = os.environ.get("HERMES_TUI_DIR")
if tui_dev and ext_dir:
print(
f"Error: --dev is incompatible with HERMES_TUI_DIR={ext_dir}\n"
f"The prebuilt TUI has no source code to hot-reload.\n"
f"Unset HERMES_TUI_DIR (e.g. `unset HERMES_TUI_DIR`) to use --dev from a checkout.",
file=sys.stderr,
)
sys.exit(1)
# 1. Prebuilt bundle (nix / packaged release): just run it.
if not tui_dev:
ext_dir = os.environ.get("HERMES_TUI_DIR")
if ext_dir:
p = Path(ext_dir)
if (p / "dist" / "entry.js").exists() and not _tui_need_npm_install(p):
if (p / "dist" / "entry.js").is_file():
node = _node_bin("node")
return [node, str(p / "dist" / "entry.js")], p
npm = _node_bin("npm")
# 2. Normal flow: npm install if needed, always esbuild, then node dist/entry.js.
# --dev flow: npm install if needed, then tsx src/entry.tsx (no build).
if _tui_need_npm_install(tui_dir):
npm = _node_bin("npm")
if not os.environ.get("HERMES_QUIET"):
print("Installing TUI dependencies…")
# Capture stdout as well as stderr — some npm errors (notably EACCES on a
# root-owned node_modules in containers) are emitted on stdout, and a
# bare "npm install failed." with no preview defeats debugging. We keep
# the failure-only print path so a successful install stays silent.
result = subprocess.run(
[npm, "install", "--silent", "--no-fund", "--no-audit", "--progress=false"],
cwd=str(tui_dir),
@@ -1121,47 +1081,30 @@ def _make_tui_argv(tui_dir: Path, tui_dev: bool) -> tuple[list[str], Path]:
sys.exit(1)
if tui_dev:
if _hermes_ink_bundle_stale(tui_dir):
result = subprocess.run(
[npm, "run", "build", "--prefix", "packages/hermes-ink"],
cwd=str(tui_dir),
capture_output=True,
text=True,
)
if result.returncode != 0:
combined = f"{result.stdout or ''}{result.stderr or ''}".strip()
preview = "\n".join(combined.splitlines()[-30:])
print("@hermes/ink build failed.")
if preview:
print(preview)
sys.exit(1)
tsx = tui_dir / "node_modules" / ".bin" / "tsx"
if tsx.exists():
return [str(tsx), "src/entry.tsx"], tui_dir
npm = _node_bin("npm")
return [npm, "start"], tui_dir
if _tui_build_needed(tui_dir):
result = subprocess.run(
[npm, "run", "build"],
cwd=str(tui_dir),
capture_output=True,
text=True,
)
if result.returncode != 0:
combined = f"{result.stdout or ''}{result.stderr or ''}".strip()
preview = "\n".join(combined.splitlines()[-30:])
print("TUI build failed.")
if preview:
print(preview)
sys.exit(1)
root = _find_bundled_tui(tui_dir)
if not root:
print("TUI build did not produce dist/entry.js")
# Always rebuild — esbuild is fast and this avoids staleness-edge-case bugs.
npm = _node_bin("npm")
result = subprocess.run(
[npm, "run", "build"],
cwd=str(tui_dir),
capture_output=True,
text=True,
)
if result.returncode != 0:
combined = f"{result.stdout or ''}{result.stderr or ''}".strip()
preview = "\n".join(combined.splitlines()[-30:])
print("TUI build failed.")
if preview:
print(preview)
sys.exit(1)
node = _node_bin("node")
return [node, str(root / "dist" / "entry.js")], root
return [node, str(tui_dir / "dist" / "entry.js")], tui_dir
def _normalize_tui_toolsets(toolsets: object) -> list[str]:
@@ -1305,7 +1248,7 @@ def _launch_tui(
except KeyboardInterrupt:
code = 130
if code in (0, 130):
if code in {0, 130}:
_print_tui_exit_summary(resume_session_id, active_session_file)
finally:
try:
@@ -1405,7 +1348,7 @@ def cmd_chat(args):
reply = input("Run setup now? [Y/n] ").strip().lower()
except (EOFError, KeyboardInterrupt):
reply = "n"
if reply in ("", "y", "yes"):
if reply in {"", "y", "yes"}:
cmd_setup(args)
return
print()
@@ -1585,7 +1528,7 @@ def cmd_whatsapp(args):
response = input("\n Update allowed users? [y/N] ").strip()
except (EOFError, KeyboardInterrupt):
response = "n"
if response.lower() in ("y", "yes"):
if response.lower() in {"y", "yes"}:
if wa_mode == "bot":
phone = input(
" Phone numbers that can message the bot (comma-separated): "
@@ -1660,7 +1603,7 @@ def cmd_whatsapp(args):
).strip()
except (EOFError, KeyboardInterrupt):
response = "n"
if response.lower() in ("y", "yes"):
if response.lower() in {"y", "yes"}:
shutil.rmtree(session_dir, ignore_errors=True)
session_dir.mkdir(parents=True, exist_ok=True)
print(" ✓ Session cleared")
@@ -2014,7 +1957,7 @@ def select_provider_and_model(args=None):
_model_flow_bedrock(config, current_model)
elif selected_provider == "azure-foundry":
_model_flow_azure_foundry(config, current_model)
elif selected_provider in (
elif selected_provider in {
"gemini",
"deepseek",
"xai",
@@ -2034,18 +1977,18 @@ def select_provider_and_model(args=None):
"ollama-cloud",
"tencent-tokenhub",
"lmstudio",
) or _is_profile_api_key_provider(selected_provider):
} or _is_profile_api_key_provider(selected_provider):
_model_flow_api_key_provider(config, selected_provider, current_model)
# ── Post-switch cleanup: clear stale OPENAI_BASE_URL ──────────────
# When the user switches to a named provider (anything except "custom"),
# a leftover OPENAI_BASE_URL in ~/.hermes/.env can poison auxiliary
# clients that use provider:auto. Clear it proactively. (#5161)
if selected_provider not in (
if selected_provider not in {
"custom",
"cancel",
"remove-custom",
) and not selected_provider.startswith("custom:"):
} and not selected_provider.startswith("custom:"):
_clear_stale_openai_base_url()
@@ -2171,7 +2114,7 @@ def _reset_aux_to_auto() -> int:
entry = {}
aux[task] = entry
changed = False
if entry.get("provider") not in (None, "", "auto"):
if entry.get("provider") not in {None, "", "auto"}:
entry["provider"] = "auto"
changed = True
for field in ("model", "base_url", "api_key"):
@@ -2646,6 +2589,7 @@ def _model_flow_nous(config, current_model="", args=None):
get_pricing_for_provider,
check_nous_free_tier,
partition_nous_models_by_tier,
union_with_portal_free_recommendations,
)
model_ids = get_curated_nous_model_ids()
@@ -2686,19 +2630,8 @@ def _model_flow_nous(config, current_model="", args=None):
# Check if user is on free tier
free_tier = check_nous_free_tier()
# For free users: partition models into selectable/unavailable based on
# whether they are free per the Portal-reported pricing.
unavailable_models: list[str] = []
if free_tier:
model_ids, unavailable_models = partition_nous_models_by_tier(
model_ids, pricing, free_tier=True
)
if not model_ids and not unavailable_models:
print("No models available for Nous Portal after filtering.")
return
# Resolve portal URL for upgrade links (may differ on staging)
# Resolve portal URL early — needed both for upgrade links and for the
# freeRecommendedModels endpoint below.
_nous_portal_url = ""
try:
_nous_state = get_provider_auth_state("nous")
@@ -2707,6 +2640,24 @@ def _model_flow_nous(config, current_model="", args=None):
except Exception:
pass
# For free users: partition models into selectable/unavailable based on
# whether they are free per the Portal-reported pricing. First augment
# with the Portal's freeRecommendedModels list so newly-launched free
# models show up even if this CLI build's hardcoded curated list and
# docs-hosted manifest haven't caught up yet.
unavailable_models: list[str] = []
if free_tier:
model_ids, pricing = union_with_portal_free_recommendations(
model_ids, pricing, _nous_portal_url,
)
model_ids, unavailable_models = partition_nous_models_by_tier(
model_ids, pricing, free_tier=True
)
if not model_ids and not unavailable_models:
print("No models available for Nous Portal after filtering.")
return
if free_tier and not model_ids:
print("No free models currently available.")
if unavailable_models:
@@ -3082,7 +3033,7 @@ def _model_flow_custom(config):
_add_v1 = input(" Add /v1? [Y/n]: ").strip().lower()
except (KeyboardInterrupt, EOFError):
_add_v1 = "n"
if _add_v1 in ("", "y", "yes"):
if _add_v1 in {"", "y", "yes"}:
effective_url = effective_url.rstrip("/") + "/v1"
if base_url:
base_url = effective_url
@@ -3126,7 +3077,7 @@ def _model_flow_custom(config):
if len(detected_models) == 1:
print(f" Detected model: {detected_models[0]}")
confirm = input(" Use this model? [Y/n]: ").strip().lower()
if confirm in ("", "y", "yes"):
if confirm in {"", "y", "yes"}:
model_name = detected_models[0]
else:
model_name = input("Model name (e.g. gpt-4, llama-3-70b): ").strip()
@@ -3959,7 +3910,7 @@ def _model_flow_copilot(config, current_model=""):
api_key = creds.get("api_key", "")
source = creds.get("source", "")
else:
if source in ("GITHUB_TOKEN", "GH_TOKEN"):
if source in {"GITHUB_TOKEN", "GH_TOKEN"}:
print(f" GitHub token: {api_key[:8]}... ✓ ({source})")
elif source == "gh auth token":
print(" GitHub token: ✓ (from `gh auth token`)")
@@ -5279,7 +5230,7 @@ def cmd_slack(args):
command registered as a first-class slash.
"""
sub = getattr(args, "slack_command", None)
if sub in (None, ""):
if sub in {None, ""}:
# No subcommand — print usage hint.
print(
"usage: hermes slack <subcommand>\n"
@@ -5426,7 +5377,7 @@ def _clear_bytecode_cache(root: Path) -> int:
dirnames[:] = [
d
for d in dirnames
if d not in ("venv", ".venv", "node_modules", ".git", ".worktrees")
if d not in {"venv", ".venv", "node_modules", ".git", ".worktrees"}
]
if os.path.basename(dirpath) == "__pycache__":
try:
@@ -5491,7 +5442,6 @@ def _gateway_prompt(prompt_text: str, default: str = "", timeout: float = 300.0)
def _web_ui_build_needed(web_dir: Path) -> bool:
"""Return True if the web UI dist is missing or stale.
Mirrors the staleness logic used by ``_tui_build_needed()`` for the TUI.
The Vite build outputs to ``hermes_cli/web_dist/`` (per vite.config.ts
outDir: "../hermes_cli/web_dist"), NOT to ``web/dist/``. Uses the Vite
manifest as the sentinel because it is written last and therefore has the
@@ -5549,6 +5499,8 @@ def _run_npm_install_deterministic(
cwd=cwd,
capture_output=capture_output,
text=True,
encoding="utf-8",
errors="replace",
check=False,
)
if ci_result.returncode == 0:
@@ -5561,6 +5513,8 @@ def _run_npm_install_deterministic(
cwd=cwd,
capture_output=capture_output,
text=True,
encoding="utf-8",
errors="replace",
check=False,
)
@@ -5597,12 +5551,50 @@ def _build_web_ui(web_dir: Path, *, fatal: bool = False) -> bool:
if fatal:
print(" Run manually: cd web && npm install && npm run build")
return False
r2 = subprocess.run([npm, "run", "build"], cwd=web_dir, capture_output=True)
# First attempt
r2 = subprocess.run(
[npm, "run", "build"],
cwd=web_dir,
capture_output=True,
text=True,
encoding="utf-8",
errors="replace",
)
if r2.returncode != 0:
# Retry once after a short delay — covers boot-time races on Windows
# (antivirus scanning Node.js binaries, npm cache not ready, transient
# I/O when launched via Scheduled Task at logon). See issue #23817.
_time.sleep(3)
r2 = subprocess.run(
[npm, "run", "build"],
cwd=web_dir,
capture_output=True,
text=True,
encoding="utf-8",
errors="replace",
)
if r2.returncode != 0:
stderr_preview = (r2.stderr or "").strip()
stderr_tail = "\n ".join(stderr_preview.splitlines()[-10:]) if stderr_preview else ""
dist_dir = web_dir.parent / "hermes_cli" / "web_dist"
dist_index = dist_dir / "index.html"
# If a stale dist exists, serve it as a fallback instead of failing.
# A stale UI is far better than no UI for non-interactive callers
# (Windows Scheduled Tasks, CI) — issue #23817.
if dist_index.exists():
print(" ⚠ Web UI build failed — serving stale dist as fallback")
if stderr_tail:
print(f" Build error:\n {stderr_tail}")
return True
print(
f" {'' if fatal else ''} Web UI build failed"
+ ("" if fatal else " (hermes web will not be available)")
)
if stderr_tail:
print(f" Build error:\n {stderr_tail}")
if fatal:
print(" Run manually: cd web && npm install && npm run build")
return False
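Review note: the `_build_web_ui` change boils down to "try, wait, try once more, then prefer a stale dist over failing". The sketch below captures that control flow with the build and the sleep injected so it can run without npm. `build_with_fallback` and its return strings are inventions for this example; the real function returns a bool and prints diagnostics.

```python
import tempfile
import time
from pathlib import Path
from typing import Callable


def build_with_fallback(
    run_build: Callable[[], int],
    dist_index: Path,
    retry_delay: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Run the build up to twice; fall back to an existing dist on failure."""
    for attempt in range(2):
        if run_build() == 0:
            return "built"
        if attempt == 0:
            sleep(retry_delay)  # give transient boot-time races time to clear
    return "stale" if dist_index.exists() else "failed"


def demo():
    sleeps = []
    outcomes = iter([1, 0])  # first attempt fails, retry succeeds
    with tempfile.TemporaryDirectory() as tmp:
        index = Path(tmp) / "index.html"
        flaky = build_with_fallback(lambda: next(outcomes), index, sleep=sleeps.append)
        no_dist = build_with_fallback(lambda: 1, index, sleep=sleeps.append)
        index.write_text("<html></html>")
        stale = build_with_fallback(lambda: 1, index, sleep=sleeps.append)
    return flaky, no_dist, stale, sleeps


if __name__ == "__main__":
    print(demo())
```

Injecting `sleep` keeps the example fast under test. Whether the production code should do the same is a design choice for the authors.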
@@ -5921,8 +5913,8 @@ def _kill_stale_dashboard_processes(
for pid in killed:
print(f" ✓ stopped PID {pid}")
for pid, reason in failed:
print(f" ✗ failed to stop PID {pid}: {reason}")
for pid, err_msg in failed:
print(f" ✗ failed to stop PID {pid}: {err_msg}")
if killed:
print(" Restart the dashboard when you're ready:")
@@ -6179,7 +6171,7 @@ def _restore_stashed_changes(
response = input_fn("Restore local changes now? [Y/n]", "y")
else:
response = input().strip().lower()
if response not in ("", "y", "yes"):
if response not in {"", "y", "yes"}:
print("Skipped restoring local changes.")
print("Your changes are still preserved in git stash.")
print(f"Restore manually with: git stash apply {stash_ref}")
@@ -6422,7 +6414,7 @@ def _sync_with_upstream_if_needed(git_cmd: list[str], cwd: Path) -> None:
print()
response = "n"
if response in ("", "y", "yes"):
if response in {"", "y", "yes"}:
print("→ Adding upstream remote...")
if _add_upstream_remote(git_cmd, cwd):
print(
@@ -7481,7 +7473,7 @@ def _cmd_update_impl(args, gateway_mode: bool):
prompt_user=prompt_for_restore,
input_fn=gw_input_fn,
)
if current_branch not in ("main", "HEAD"):
if current_branch not in {"main", "HEAD"}:
subprocess.run(
git_cmd + ["checkout", current_branch],
cwd=PROJECT_ROOT,
@@ -7765,7 +7757,7 @@ def _cmd_update_impl(args, gateway_mode: bool):
except EOFError:
response = "n"
if response in ("", "y", "yes", "auto"):
if response in {"", "y", "yes", "auto"}:
print()
# Gateway mode, --yes, and non-interactive update contexts
# (dashboard / web server actions) cannot prompt for API keys.
@@ -7817,6 +7809,22 @@ def _cmd_update_impl(args, gateway_mode: bool):
except Exception as e:
logger.debug("FHS PATH guard check failed: %s", e)
# Refresh the cua-driver binary used by the Computer Use toolset.
# The upstream installer is gated on macOS and on the binary already
# being on PATH, so this is a no-op for users who don't have it.
# Tying the refresh to ``hermes update`` gives users a predictable
# cadence (matches when they pull new agent code) without adding
# startup latency or a per-launch GitHub API call.
try:
if sys.platform == "darwin" and shutil.which("cua-driver"):
from hermes_cli.tools_config import install_cua_driver
print()
print("→ Refreshing cua-driver (Computer Use)...")
install_cua_driver(upgrade=True)
except Exception as e:
logger.debug("cua-driver refresh failed: %s", e)
# Write exit code *before* the gateway restart attempt.
# When running as ``hermes update --gateway`` (spawned by the gateway's
# /update command), this process lives inside the gateway's systemd
@@ -8826,7 +8834,7 @@ def cmd_profile(args):
answer = input("\nProceed with install? [y/N] ").strip().lower()
except (EOFError, KeyboardInterrupt):
answer = ""
if answer not in ("y", "yes"):
if answer not in {"y", "yes"}:
print("Install cancelled.")
return
@@ -8885,7 +8893,7 @@ def cmd_profile(args):
answer = input("\nProceed? [y/N] ").strip().lower()
except (EOFError, KeyboardInterrupt):
answer = ""
if answer not in ("y", "yes"):
if answer not in {"y", "yes"}:
print("Update cancelled.")
return
@@ -9075,9 +9083,24 @@ def cmd_dashboard(args):
print(f"Import error: {e}")
sys.exit(1)
if "HERMES_WEB_DIST" not in os.environ:
if "HERMES_WEB_DIST" not in os.environ and not getattr(args, "skip_build", False):
if not _build_web_ui(PROJECT_ROOT / "web", fatal=True):
sys.exit(1)
elif getattr(args, "skip_build", False):
# --skip-build trusts the caller to have pre-built the web UI.
# Verify the dist actually exists; otherwise the server will start
# and serve 404s with no obvious cause (issue #23817).
_dist_root = (
Path(os.environ["HERMES_WEB_DIST"])
if "HERMES_WEB_DIST" in os.environ
else PROJECT_ROOT / "hermes_cli" / "web_dist"
)
if not (_dist_root / "index.html").exists():
print(f"✗ --skip-build was passed but no web dist found at: {_dist_root}")
print(" Pre-build first: cd web && npm install && npm run build")
print(" Or drop --skip-build to build automatically.")
sys.exit(1)
print(f"→ Skipping web UI build (--skip-build); using dist at {_dist_root}")
from hermes_cli.web_server import start_server
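The `--skip-build` guard above can be read as two small steps: resolve the dist directory, then check for `index.html`. A minimal sketch under that reading follows; `resolve_dist_root` and `dist_is_servable` are hypothetical helper names, and the environment is passed in as a plain dict only to keep the example testable.

```python
import os
from pathlib import Path


def resolve_dist_root(project_root: Path, env: dict[str, str]) -> Path:
    # HERMES_WEB_DIST wins when set; otherwise fall back to the in-repo dist.
    if "HERMES_WEB_DIST" in env:
        return Path(env["HERMES_WEB_DIST"])
    return project_root / "hermes_cli" / "web_dist"


def dist_is_servable(dist_root: Path) -> bool:
    # The diff treats index.html as the marker of a completed build.
    return (dist_root / "index.html").exists()


dist = resolve_dist_root(Path.cwd(), dict(os.environ))
print(f"dist root: {dist}, servable: {dist_is_servable(dist)}")
```

Failing early here, rather than letting the server start and serve 404s, matches the intent stated in the comment referencing issue #23817.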
@@ -10063,6 +10086,16 @@ def main():
doctor_parser.add_argument(
"--fix", action="store_true", help="Attempt to fix issues automatically"
)
doctor_parser.add_argument(
"--ack",
metavar="ADVISORY_ID",
default=None,
help=(
"Acknowledge a security advisory by ID and exit. After ack, the "
"advisory will no longer trigger startup banners. Run `hermes "
"doctor` first to see active advisories and their IDs."
),
)
doctor_parser.set_defaults(func=cmd_doctor)
# =========================================================================
@@ -10658,9 +10691,9 @@ Examples:
mem_dir = get_hermes_home() / "memories"
target = getattr(args, "target", "all")
files_to_reset = []
if target in ("all", "memory"):
if target in {"all", "memory"}:
files_to_reset.append(("MEMORY.md", "agent notes"))
if target in ("all", "user"):
if target in {"all", "user"}:
files_to_reset.append(("USER.md", "user profile"))
# Check what exists
@@ -10771,7 +10804,7 @@ Examples:
def cmd_tools(args):
action = getattr(args, "tools_action", None)
if action in ("list", "disable", "enable"):
if action in {"list", "disable", "enable"}:
from hermes_cli.tools_config import tools_disable_enable_command
tools_disable_enable_command(args)
@@ -10802,10 +10835,19 @@ Examples:
)
computer_use_sub = computer_use_parser.add_subparsers(dest="computer_use_action")
computer_use_sub.add_parser(
computer_use_install = computer_use_sub.add_parser(
"install",
help="Install or repair the cua-driver binary (macOS)",
)
computer_use_install.add_argument(
"--upgrade",
action="store_true",
help=(
"Re-run the upstream installer even if cua-driver is already on "
"PATH. The upstream install.sh always pulls the latest release, "
"so this performs an in-place upgrade."
),
)
computer_use_sub.add_parser(
"status",
help="Print whether cua-driver is installed and on PATH",
@@ -10814,14 +10856,27 @@ Examples:
def cmd_computer_use(args):
action = getattr(args, "computer_use_action", None)
if action == "install":
from hermes_cli.tools_config import _run_post_setup
_run_post_setup("cua_driver")
from hermes_cli.tools_config import install_cua_driver
install_cua_driver(upgrade=bool(getattr(args, "upgrade", False)))
return
if action == "status":
import shutil
import subprocess
path = shutil.which("cua-driver")
if path:
print(f"cua-driver: installed at {path}")
version = ""
try:
version = subprocess.run(
["cua-driver", "--version"],
capture_output=True, text=True, timeout=5,
).stdout.strip()
except Exception:
pass
if version:
print(f"cua-driver: installed at {path} ({version})")
else:
print(f"cua-driver: installed at {path}")
print(" Refresh to latest: hermes computer-use install --upgrade")
return
print("cua-driver: not installed")
print(" Run: hermes computer-use install")
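The version probe in the `status` branch is best-effort: any failure leaves the version empty and the shorter line is printed. A minimal sketch of that pattern, with `probe_version` and `status_line` as hypothetical helpers (not part of `hermes_cli`):

```python
import subprocess
import sys


def probe_version(argv: list[str], timeout: float = 5.0) -> str:
    # Return the trimmed stdout of a version command, or "" on any failure
    # (missing binary, timeout, decoding error).
    try:
        return subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        ).stdout.strip()
    except Exception:
        return ""


def status_line(path: str, version: str) -> str:
    # Append the version only when the probe produced one.
    if version:
        return f"cua-driver: installed at {path} ({version})"
    return f"cua-driver: installed at {path}"


# The interpreter stands in for cua-driver so the example runs anywhere.
print(status_line(sys.executable, probe_version([sys.executable, "--version"])))
```

Like the code in the diff, this sketch does not inspect the return code, so a command that exits non-zero but still writes to stdout would have that output reported as the version.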
@@ -10980,7 +11035,7 @@ Examples:
def _confirm_prompt(prompt: str) -> bool:
"""Prompt for y/N confirmation, safe against non-TTY environments."""
try:
return input(prompt).strip().lower() in ("y", "yes")
return input(prompt).strip().lower() in {"y", "yes"}
except (EOFError, KeyboardInterrupt):
return False
@@ -11554,6 +11609,15 @@ Examples:
"Alternatively set HERMES_DASHBOARD_TUI=1."
),
)
dashboard_parser.add_argument(
"--skip-build",
action="store_true",
help=(
"Skip the web UI build step and serve the existing dist directly. "
"Useful for non-interactive contexts (Windows Scheduled Tasks, CI) "
"where npm may not be available. Pre-build with: cd web && npm run build"
),
)
# Lifecycle flags — mutually exclusive with each other and with the
# start-a-server flags above (if both are passed, --stop / --status win
# because they exit before the server is started). The dashboard has
+4 -4
@@ -63,7 +63,7 @@ def _confirm(question: str, default: bool = True) -> bool:
return default
if not val:
return default
return val in ("y", "yes")
return val in {"y", "yes"}
def _prompt(question: str, *, password: bool = False, default: str = "") -> str:
@@ -375,11 +375,11 @@ def cmd_mcp_add(args):
_info("Cancelled.")
return
if choice in ("n", "no"):
if choice in {"n", "no"}:
_info("Cancelled — server not saved.")
return
if choice in ("s", "select"):
if choice in {"s", "select"}:
# Interactive tool selection
from hermes_cli.curses_ui import curses_checklist
@@ -509,7 +509,7 @@ def cmd_mcp_list(args=None):
# Enabled status
enabled = cfg.get("enabled", True)
if isinstance(enabled, str):
enabled = enabled.lower() in ("true", "1", "yes")
enabled = enabled.lower() in {"true", "1", "yes"}
status = color("✓ enabled", Colors.GREEN) if enabled else color("✗ disabled", Colors.DIM)
print(f" {name:<16} {transport:<30} {tools_str:<12} {status}")
+9 -5
@@ -825,7 +825,7 @@ def switch_model(
# --- Step e: detect_provider_for_model() as last resort ---
_base = current_base_url or ""
is_custom = current_provider in ("custom", "local") or (
is_custom = current_provider in {"custom", "local"} or (
"localhost" in _base or "127.0.0.1" in _base
)
@@ -1079,6 +1079,7 @@ def list_authenticated_providers(
from hermes_cli.models import (
OPENROUTER_MODELS, _PROVIDER_MODELS,
_MODELS_DEV_PREFERRED, _merge_with_models_dev, provider_model_ids,
get_curated_nous_model_ids,
)
results: List[dict] = []
@@ -1160,9 +1161,12 @@ def list_authenticated_providers(
# Build curated model lists keyed by hermes provider ID
curated: dict[str, list[str]] = dict(_PROVIDER_MODELS)
curated["openrouter"] = [mid for mid, _ in OPENROUTER_MODELS]
# "nous" shares OpenRouter's curated list if not separately defined
if "nous" not in curated:
curated["nous"] = curated["openrouter"]
# "nous" pulls from the remote model-catalog manifest published at
# https://hermes-agent.nousresearch.com/docs/api/model-catalog.json so
# newly added Portal models surface in the /model picker without
# requiring a Hermes release. Falls back to the in-repo
# _PROVIDER_MODELS["nous"] snapshot when the manifest is unreachable.
curated["nous"] = get_curated_nous_model_ids()
# Ollama Cloud uses dynamic discovery (no static curated list)
if "ollama-cloud" not in curated:
from hermes_cli.models import fetch_ollama_cloud_models
@@ -1521,7 +1525,7 @@ def list_authenticated_providers(
api_key = os.environ.get(key_env, "").strip() if key_env else ""
discover = ep_cfg.get("discover_models", True)
if isinstance(discover, str):
discover = discover.lower() not in ("false", "no", "0")
discover = discover.lower() not in {"false", "no", "0"}
if api_url and api_key and discover:
try:
from hermes_cli.models import fetch_api_models
+83 -5
@@ -556,6 +556,71 @@ def partition_nous_models_by_tier(
return (selectable, unavailable)
def union_with_portal_free_recommendations(
curated_ids: list[str],
pricing: dict[str, dict[str, str]],
portal_base_url: str = "",
*,
force_refresh: bool = False,
) -> tuple[list[str], dict[str, dict[str, str]]]:
"""Augment curated list + pricing with the Portal's ``freeRecommendedModels``.
The Portal's ``/api/nous/recommended-models`` endpoint advertises which
models are free *right now* independent of what the in-repo
``_PROVIDER_MODELS["nous"]`` list happens to contain or whether the
docs-hosted catalog manifest has been rebuilt since the last release.
For free-tier users this is the source of truth: any model the Portal
flags as free should be selectable, even if the user is running an
older Hermes that doesn't ship that model in its hardcoded curated
list. This function returns an augmented ``(model_ids, pricing)``
pair where:
* Portal free recommendations missing from ``curated_ids`` are
appended at the front (so the picker shows them first).
* ``pricing`` gets a synthetic ``{"prompt": "0", "completion": "0"}``
entry for any free recommendation missing from the live pricing
map, so :func:`partition_nous_models_by_tier` keeps it.
Failures (network, parse, missing field) are silent and degrade to
returning the inputs unchanged.
"""
try:
payload = fetch_nous_recommended_models(
portal_base_url, force_refresh=force_refresh
)
except Exception:
return (list(curated_ids), dict(pricing))
free_block = payload.get("freeRecommendedModels") if isinstance(payload, dict) else None
if not isinstance(free_block, list) or not free_block:
return (list(curated_ids), dict(pricing))
portal_free_ids: list[str] = []
for entry in free_block:
name = _extract_model_name(entry)
if name:
portal_free_ids.append(name)
if not portal_free_ids:
return (list(curated_ids), dict(pricing))
augmented_pricing = dict(pricing)
free_synthetic = {"prompt": "0", "completion": "0"}
for mid in portal_free_ids:
if mid not in augmented_pricing:
augmented_pricing[mid] = dict(free_synthetic)
augmented_ids = list(curated_ids)
seen = set(augmented_ids)
# Prepend Portal free recommendations that aren't already curated, so
# they appear first in the picker.
new_ones = [mid for mid in portal_free_ids if mid not in seen]
if new_ones:
augmented_ids = new_ones + augmented_ids
return (augmented_ids, augmented_pricing)
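The merge step of `union_with_portal_free_recommendations` can be exercised without the network call. The sketch below reproduces only that step; `merge_free_recommendations` is a hypothetical name and the model IDs are made up for illustration.

```python
def merge_free_recommendations(curated_ids, pricing, portal_free_ids):
    # Give every Portal-free model a zero-price entry unless live pricing
    # already has one.
    augmented_pricing = dict(pricing)
    for mid in portal_free_ids:
        augmented_pricing.setdefault(mid, {"prompt": "0", "completion": "0"})

    # Prepend free models that are not already curated, preserving Portal order.
    seen = set(curated_ids)
    new_ones = [mid for mid in portal_free_ids if mid not in seen]
    return new_ones + list(curated_ids), augmented_pricing


curated = ["vendor/paid-model", "vendor/free-model"]
live_pricing = {"vendor/paid-model": {"prompt": "0.000001", "completion": "0.000002"}}
portal_free = ["vendor/new-free-model", "vendor/free-model"]

ids, prices = merge_free_recommendations(curated, live_pricing, portal_free)
print(ids)
print(prices["vendor/new-free-model"])
```

Because the inputs are copied before being modified, callers can keep using their original list and pricing map after the call.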
# ---------------------------------------------------------------------------
# TTL cache for free-tier detection — avoids repeated API calls within a
# session while still picking up upgrades quickly.
@@ -818,7 +883,7 @@ try:
for _pp in _list_providers_for_canonical():
if _pp.name in _canonical_slugs:
continue
if _pp.auth_type in ("oauth_device_code", "oauth_external", "external_process", "aws_sdk", "copilot"):
if _pp.auth_type in {"oauth_device_code", "oauth_external", "external_process", "aws_sdk", "copilot"}:
continue # non-api-key flows need bespoke picker UX; skip auto-inject
_label = _pp.display_name or _pp.name
_desc = _pp.description or f"{_label} (direct API)"
@@ -1338,8 +1403,21 @@ def _resolve_openrouter_api_key() -> str:
return os.getenv("OPENROUTER_API_KEY", "").strip()
_DEFAULT_NOUS_INFERENCE_BASE = "https://inference-api.nousresearch.com"
def _resolve_nous_pricing_credentials() -> tuple[str, str]:
"""Return ``(api_key, base_url)`` for Nous Portal pricing, or empty strings."""
"""Return ``(api_key, base_url)`` for Nous Portal pricing.
The Nous inference ``/v1/models`` endpoint exposes pricing without
authentication, so the api_key is best-effort: when runtime credential
resolution fails (expired refresh token, missing auth.json, etc.) we
still return the default inference base URL so the picker keeps
working with anonymous pricing data. Free-tier users in particular
need this: pricing drives the free/paid partition, and silently
returning empty pricing because of an auth blip makes the picker
look broken ("No free models currently available").
"""
try:
from hermes_cli.auth import resolve_nous_runtime_credentials
creds = resolve_nous_runtime_credentials()
@@ -1347,7 +1425,7 @@ def _resolve_nous_pricing_credentials() -> tuple[str, str]:
return (creds.get("api_key", ""), creds.get("base_url", ""))
except Exception:
pass
return ("", "")
return ("", _DEFAULT_NOUS_INFERENCE_BASE)
def get_pricing_for_provider(provider: str, *, force_refresh: bool = False) -> dict[str, dict[str, str]]:
@@ -2335,7 +2413,7 @@ def _lmstudio_fetch_raw_models(
with urllib.request.urlopen(request, timeout=timeout) as resp:
payload = json.loads(resp.read().decode())
except urllib.error.HTTPError as exc:
if exc.code in (401, 403):
if exc.code in {401, 403}:
from hermes_cli.auth import AuthError
raise AuthError(
f"LM Studio rejected the request with HTTP {exc.code}.",
@@ -3270,7 +3348,7 @@ def validate_requested_model(
# MiniMax providers don't expose a /models endpoint — validate against
# the static catalog instead, similar to openai-codex.
if normalized in ("minimax", "minimax-cn"):
if normalized in {"minimax", "minimax-cn"}:
try:
catalog_models = provider_model_ids(normalized)
except Exception:
+6 -6
@@ -86,9 +86,9 @@ logger = logging.getLogger(__name__)
# The env var is read once at import time; tests that need to flip it
# mid-process can call ``_install_plugin_debug_handler(force=True)``.
_PLUGINS_DEBUG = os.getenv("HERMES_PLUGINS_DEBUG", "").strip().lower() in (
_PLUGINS_DEBUG = os.getenv("HERMES_PLUGINS_DEBUG", "").strip().lower() in {
"1", "true", "yes", "on",
)
}
_DEBUG_HANDLER_INSTALLED = False
@@ -100,9 +100,9 @@ def _install_plugin_debug_handler(force: bool = False) -> None:
"""
global _DEBUG_HANDLER_INSTALLED, _PLUGINS_DEBUG
if force:
_PLUGINS_DEBUG = os.getenv("HERMES_PLUGINS_DEBUG", "").strip().lower() in (
_PLUGINS_DEBUG = os.getenv("HERMES_PLUGINS_DEBUG", "").strip().lower() in {
"1", "true", "yes", "on",
)
}
if not _PLUGINS_DEBUG or _DEBUG_HANDLER_INSTALLED:
return
handler = logging.StreamHandler(sys.stderr)
@@ -824,7 +824,7 @@ class PluginManager:
# Bundled platform plugins (gateway adapters like IRC) auto-load
# for the same reason: every platform Hermes ships must be
# available out of the box without the user having to opt in.
if manifest.source == "bundled" and manifest.kind in ("backend", "platform"):
if manifest.source == "bundled" and manifest.kind in {"backend", "platform"}:
self._load_plugin(manifest)
continue
@@ -1075,7 +1075,7 @@ class PluginManager:
)
try:
if manifest.source in ("user", "project", "bundled"):
if manifest.source in {"user", "project", "bundled"}:
module = self._load_directory_module(manifest)
else:
module = self._load_entrypoint_module(manifest)
+12 -13
@@ -85,7 +85,7 @@ def _sanitize_plugin_name(name: str, plugins_dir: Path) -> Path:
if not name:
raise ValueError("Plugin name must not be empty.")
if name in (".", ".."):
if name in {".", ".."}:
raise ValueError(
f"Invalid plugin name '{name}': must not reference the plugins directory itself."
)
@@ -491,7 +491,7 @@ def cmd_install(
answer = input(
f" Enable '{installed_name}' now? [y/N]: ",
).strip().lower()
should_enable = answer in ("y", "yes")
should_enable = answer in {"y", "yes"}
except (EOFError, KeyboardInterrupt):
should_enable = False
else:
@@ -731,7 +731,7 @@ def _discover_all_plugins() -> list:
for d in sorted(base.iterdir()):
if not d.is_dir():
continue
if source == "bundled" and d.name in ("memory", "context_engine"):
if source == "bundled" and d.name in {"memory", "context_engine"}:
continue
manifest_file = d / "plugin.yaml"
if not manifest_file.exists():
@@ -1129,10 +1129,10 @@ def _run_composite_ui(curses, plugin_names, plugin_labels, plugin_selected,
stdscr.refresh()
key = stdscr.getch()
if key in (curses.KEY_UP, ord("k")):
if key in {curses.KEY_UP, ord("k")}:
if total_items > 0:
cursor = (cursor - 1) % total_items
elif key in (curses.KEY_DOWN, ord("j")):
elif key in {curses.KEY_DOWN, ord("j")}:
if total_items > 0:
cursor = (cursor + 1) % total_items
elif key == ord(" "):
@@ -1168,7 +1168,7 @@ def _run_composite_ui(curses, plugin_names, plugin_labels, plugin_selected,
curses.init_pair(3, curses.COLOR_CYAN, -1)
curses.init_pair(4, 8, -1)
curses.curs_set(0)
elif key in (curses.KEY_ENTER, 10, 13):
elif key in {curses.KEY_ENTER, 10, 13}:
if cursor < n_plugins:
# ENTER on a plugin checkbox — confirm and exit
result_holder["plugins_changed"] = True
@@ -1200,7 +1200,7 @@ def _run_composite_ui(curses, plugin_names, plugin_labels, plugin_selected,
curses.init_pair(3, curses.COLOR_CYAN, -1)
curses.init_pair(4, 8, -1)
curses.curs_set(0)
elif key in (27, ord("q")):
elif key in {27, ord("q")}:
# Save plugin changes on exit
result_holder["plugins_changed"] = True
return
@@ -1428,10 +1428,9 @@ def _toggle_plugin_toolset(name: str, *, enable: bool) -> None:
if toolset_key not in ts_list:
ts_list.append(toolset_key)
changed = True
else:
if toolset_key in ts_list:
ts_list.remove(toolset_key)
changed = True
elif toolset_key in ts_list:
ts_list.remove(toolset_key)
changed = True
# If enabling and no platforms have toolset lists yet, add to "cli" at minimum
if enable and not changed and not platform_toolsets:
@@ -1570,13 +1569,13 @@ def plugins_command(args) -> None:
)
elif action == "update":
cmd_update(args.name)
elif action in ("remove", "rm", "uninstall"):
elif action in {"remove", "rm", "uninstall"}:
cmd_remove(args.name)
elif action == "enable":
cmd_enable(args.name)
elif action == "disable":
cmd_disable(args.name)
elif action in ("list", "ls"):
elif action in {"list", "ls"}:
cmd_list()
elif action is None:
cmd_toggle()
+2 -2
@@ -989,7 +989,7 @@ def _default_export_ignore(root_dir: Path):
if entry == "__pycache__" or entry.endswith((".sock", ".tmp")):
ignored.add(entry)
# npm lockfiles can appear at root
elif entry in ("package.json", "package-lock.json"):
elif entry in {"package.json", "package-lock.json"}:
ignored.add(entry)
# Root-level exclusions
if Path(directory) == root_dir:
@@ -1057,7 +1057,7 @@ def _normalize_profile_archive_parts(member_name: str) -> List[str]:
):
raise ValueError(f"Unsafe archive member path: {member_name}")
parts = [part for part in posix_path.parts if part not in ("", ".")]
parts = [part for part in posix_path.parts if part not in {"", "."}]
if not parts or any(part == ".." for part in parts):
raise ValueError(f"Unsafe archive member path: {member_name}")
return parts
+2 -2
@@ -164,7 +164,7 @@ class PtyBridge:
data = os.read(self._fd, 65536)
except OSError as exc:
# EIO on Linux = slave side closed. EBADF = already closed.
if exc.errno in (errno.EIO, errno.EBADF):
if exc.errno in {errno.EIO, errno.EBADF}:
return None
raise
if not data:
@@ -181,7 +181,7 @@ class PtyBridge:
try:
n = os.write(self._fd, view)
except OSError as exc:
if exc.errno in (errno.EIO, errno.EBADF, errno.EPIPE):
if exc.errno in {errno.EIO, errno.EBADF, errno.EPIPE}:
return
raise
if n <= 0:
+14 -6
@@ -205,6 +205,14 @@ def _resolve_runtime_from_pool_entry(
elif provider == "google-gemini-cli":
api_mode = "chat_completions"
base_url = base_url or "cloudcode-pa://google"
elif provider == "minimax-oauth":
# MiniMax OAuth tokens are valid only against the Anthropic Messages
# compatible endpoint. Do not honor stale model.api_mode values from a
# prior OpenAI-compatible provider, or the client will hit
# /chat/completions under /anthropic and receive a bare nginx 404.
api_mode = "anthropic_messages"
pconfig = PROVIDER_REGISTRY.get(provider)
base_url = base_url or (pconfig.inference_base_url if pconfig else "")
elif provider == "anthropic":
api_mode = "anthropic_messages"
cfg_provider = str(model_cfg.get("provider") or "").strip().lower()
@@ -260,7 +268,7 @@ def _resolve_runtime_from_pool_entry(
if cfg_base_url:
base_url = cfg_base_url
configured_mode = _parse_api_mode(model_cfg.get("api_mode"))
if provider in ("opencode-zen", "opencode-go"):
if provider in {"opencode-zen", "opencode-go"}:
# Re-derive api_mode from the effective model rather than the
# persisted api_mode: the opencode providers serve both
# anthropic_messages and chat_completions models, so the previous
@@ -282,7 +290,7 @@ def _resolve_runtime_from_pool_entry(
# Anthropic SDK prepends its own /v1/messages to the base_url. Strip the
# trailing /v1 so the SDK constructs the correct path (e.g.
# https://opencode.ai/zen/go/v1/messages instead of .../v1/v1/messages).
if api_mode == "anthropic_messages" and provider in ("opencode-zen", "opencode-go"):
if api_mode == "anthropic_messages" and provider in {"opencode-zen", "opencode-go"}:
base_url = re.sub(r"/v1/?$", "", base_url)
return {
@@ -859,7 +867,7 @@ def _resolve_explicit_runtime(
base_url = explicit_base_url
if not base_url:
if provider in ("kimi-coding", "kimi-coding-cn"):
if provider in {"kimi-coding", "kimi-coding-cn"}:
creds = resolve_api_key_provider_credentials(provider)
base_url = creds.get("base_url", "").rstrip("/")
else:
@@ -1223,7 +1231,7 @@ def resolve_runtime_provider(
# trust boto3's credential chain — it handles IMDS, ECS task roles,
# Lambda execution roles, SSO, and other implicit sources that our
# env-var check can't detect.
is_explicit = requested_provider in ("bedrock", "aws", "aws-bedrock", "amazon-bedrock", "amazon")
is_explicit = requested_provider in {"bedrock", "aws", "aws-bedrock", "amazon-bedrock", "amazon"}
if not is_explicit and not has_aws_credentials():
raise AuthError(
"No AWS credentials found for Bedrock. Configure one of:\n"
@@ -1303,7 +1311,7 @@ def resolve_runtime_provider(
configured_provider = str(model_cfg.get("provider") or "").strip().lower()
# Only honor persisted api_mode when it belongs to the same provider family.
configured_mode = _parse_api_mode(model_cfg.get("api_mode"))
if provider in ("opencode-zen", "opencode-go"):
if provider in {"opencode-zen", "opencode-go"}:
# opencode-zen/go must always re-derive api_mode from the
# target model (not the stale persisted api_mode), because
# the same provider serves both anthropic_messages
@@ -1325,7 +1333,7 @@ def resolve_runtime_provider(
if detected:
api_mode = detected
# Strip trailing /v1 for OpenCode Anthropic models (see comment above).
if api_mode == "anthropic_messages" and provider in ("opencode-zen", "opencode-go"):
if api_mode == "anthropic_messages" and provider in {"opencode-zen", "opencode-go"}:
base_url = re.sub(r"/v1/?$", "", base_url)
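For reference, the `/v1` stripping used for OpenCode Anthropic models only removes a trailing `/v1` (with or without a final slash). A minimal sketch; `strip_trailing_v1` is a hypothetical wrapper around the same `re.sub` call, and `example.invalid` is a placeholder host.

```python
import re


def strip_trailing_v1(base_url: str) -> str:
    # Remove a final "/v1" or "/v1/" so an SDK that appends "/v1/messages"
    # does not produce ".../v1/v1/messages".
    return re.sub(r"/v1/?$", "", base_url)


for url in (
    "https://opencode.ai/zen/go/v1",
    "https://opencode.ai/zen/go/v1/",
    "https://example.invalid/v1/models",
    "https://example.invalid/api/v10",
):
    print(url, "->", strip_trailing_v1(url))
```

URLs where `/v1` is not the last path segment pass through unchanged, because the pattern is anchored with `$`.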
return {
"provider": provider,
+451
@@ -0,0 +1,451 @@
"""
Security advisory checker for Hermes Agent.
Detects known-compromised Python packages installed in the active venv
(supply-chain attacks like the Mini Shai-Hulud worm of May 2026 that
poisoned ``mistralai 2.4.6`` on PyPI) and surfaces remediation guidance to
the user.
Design goals:
- **Cheap.** A single ``importlib.metadata.version()`` call per advisory
package. Safe to run on every CLI startup.
- **Loud when it matters, silent otherwise.** If no compromised package is
installed, the user sees nothing.
- **Acknowledgeable.** Once the user has read and acted on an advisory they
can dismiss it via ``hermes doctor --ack <id>``; the ack is persisted to
``config.security.acked_advisories`` and survives restart.
- **Extensible.** Adding a new advisory is one entry in ``ADVISORIES``;
adding a new compromised version is a one-line edit. No code changes
needed when the next worm hits.
The check is invoked from three places:
1. ``hermes doctor`` (and ``hermes doctor --ack <id>``)
2. CLI startup banner (one short line, then full guidance via
``hermes doctor``)
3. Gateway startup (logged to gateway.log; first interactive message gets
a one-line operator banner)
This module is intentionally dependency-free beyond the stdlib so it can
run in environments where the rest of Hermes failed to import.
"""
from __future__ import annotations
import logging
import os
import sys
from dataclasses import dataclass, field
from pathlib import Path
from typing import Iterable, Optional
logger = logging.getLogger(__name__)
# =============================================================================
# Advisory catalog
#
# Each advisory is a community-facing security warning about one or more
# specific package versions that are known to be compromised. To add a new
# advisory:
#
# 1. Append a new ``Advisory`` to ``ADVISORIES`` below
# 2. Set ``compromised`` to a tuple of ``(pkg_name, frozenset_of_versions)``
# — version strings must match what ``importlib.metadata.version()``
# returns. Use an empty frozenset to flag *any installed version*
# (rare; only when the maintainer namespace itself is compromised).
# 3. Write 2-4 short ``remediation`` lines a non-expert can copy/paste.
#
# Do NOT remove old advisories. Once an advisory ships, leave it in place so
# users running an older release with the compromised package still get
# warned. Mark superseded ones via ``superseded_by`` if needed.
# =============================================================================
@dataclass(frozen=True)
class Advisory:
"""One security advisory entry.
Attributes:
id: stable identifier used for acks (e.g. ``shai-hulud-2026-05``).
Lowercase-hyphen, never reused.
title: one-line headline shown in banners.
summary: 1-3 sentence description of what was compromised and how.
url: reference URL (Socket advisory, GitHub advisory, PyPI page).
compromised: tuple of ``(package_name, frozenset_of_versions)``
pairs. Empty frozenset means "any version of this package is
considered suspect" — use sparingly.
remediation: ordered list of steps the user should take. First step
should be the uninstall command; subsequent steps the credential
audit / rotation guidance.
published: ISO date string for sort order.
severity: one of ``low`` / ``medium`` / ``high`` / ``critical``.
"""
id: str
title: str
summary: str
url: str
compromised: tuple[tuple[str, frozenset[str]], ...]
remediation: tuple[str, ...]
published: str = ""
severity: str = "high" # low / medium / high / critical
ADVISORIES: tuple[Advisory, ...] = (
Advisory(
id="shai-hulud-2026-05",
title="Mini Shai-Hulud worm — mistralai 2.4.6 compromised on PyPI",
summary=(
"PyPI quarantined the mistralai package on 2026-05-12 after a "
"malicious 2.4.6 release. The worm steals credentials from "
"environment variables and credential files (~/.npmrc, ~/.pypirc, "
"~/.aws/credentials, GitHub PATs, cloud SDK tokens) and exfiltrates "
"them to a hardcoded webhook. If you ran any Python process that "
"imported mistralai 2.4.6 — including hermes when configured "
"with provider=mistral for TTS or STT — assume those credentials "
"are exposed."
),
url="https://socket.dev/blog/mini-shai-hulud-worm-pypi",
compromised=(
("mistralai", frozenset({"2.4.6"})),
),
remediation=(
"Run: pip uninstall -y mistralai (or: uv pip uninstall mistralai)",
"Rotate API keys in ~/.hermes/.env (OpenRouter, Anthropic, OpenAI, "
"Nous, GitHub, AWS, Google, Mistral, etc.).",
"Audit ~/.npmrc, ~/.pypirc, ~/.aws/credentials, ~/.config/gh/hosts.yml, "
"and any other credential files for tokens that may have been read.",
"Check GitHub for unexpected new SSH keys, deploy keys, or webhook "
"additions on repos you have admin on.",
"After cleanup: hermes doctor --ack shai-hulud-2026-05 to dismiss "
"this warning.",
),
published="2026-05-12",
severity="critical",
),
)
# =============================================================================
# Detection
# =============================================================================
@dataclass(frozen=True)
class AdvisoryHit:
"""One package-version match against an advisory."""
advisory: Advisory
package: str
installed_version: str
def _installed_version(pkg_name: str) -> Optional[str]:
"""Return the installed version of ``pkg_name``, or None if not installed.
Uses ``importlib.metadata`` so we don't depend on pip being importable
inside the active venv (uv-created venvs may lack pip).
"""
try:
from importlib.metadata import PackageNotFoundError, version
except ImportError: # py<3.8 — Hermes requires 3.10+ but defensive.
return None
try:
return version(pkg_name)
except PackageNotFoundError:
return None
except Exception:
# Some metadata corruption modes raise ValueError or OSError. Don't
# let advisory checking crash the CLI startup path.
logger.debug("importlib.metadata.version(%s) raised", pkg_name, exc_info=True)
return None
def detect_compromised(
advisories: Iterable[Advisory] = ADVISORIES,
) -> list[AdvisoryHit]:
"""Scan installed packages and return all advisory hits.
A "hit" means an advisory's listed package is installed AND the version
is in the compromised set (or the compromised set is empty, meaning
*any* version is suspect).
"""
hits: list[AdvisoryHit] = []
for advisory in advisories:
for pkg_name, bad_versions in advisory.compromised:
installed = _installed_version(pkg_name)
if installed is None:
continue
if not bad_versions or installed in bad_versions:
hits.append(AdvisoryHit(
advisory=advisory,
package=pkg_name,
installed_version=installed,
))
return hits
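The matching rule in `detect_compromised` can be checked without touching the real environment by swapping the metadata lookup for a dictionary. In this sketch `match_versions` and the `fake_installed` table are hypothetical, and `example-pkg` / `absent-pkg` are made-up package names.

```python
from typing import Callable, Optional


def match_versions(
    compromised: tuple[tuple[str, frozenset[str]], ...],
    lookup: Callable[[str], Optional[str]],
) -> list[tuple[str, str]]:
    hits = []
    for pkg_name, bad_versions in compromised:
        installed = lookup(pkg_name)
        if installed is None:
            continue  # package not installed: never a hit
        # An empty frozenset flags any installed version.
        if not bad_versions or installed in bad_versions:
            hits.append((pkg_name, installed))
    return hits


fake_installed = {"mistralai": "2.4.6", "example-pkg": "1.0.0"}
compromised = (
    ("mistralai", frozenset({"2.4.6"})),    # exact bad version installed
    ("example-pkg", frozenset({"9.9.9"})),  # installed, but a safe version
    ("absent-pkg", frozenset()),            # any version, but not installed
)
print(match_versions(compromised, fake_installed.get))
```

Version strings are compared verbatim, so entries in the advisory catalog need to match what `importlib.metadata.version()` returns, as the catalog comment already notes.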
# =============================================================================
# Acknowledgement persistence
#
# Acks live under ``security.acked_advisories`` in config.yaml as a list of
# advisory IDs. The list is the only state — no per-host data, no
# timestamps, no fingerprints. Users sharing a config.yaml across machines
# (rare but possible) get the same dismissal everywhere, which is the
# correct behavior for a global advisory.
# =============================================================================
def get_acked_ids() -> set[str]:
"""Return the set of advisory IDs the user has dismissed.
Returns an empty set if config can't be loaded (don't block startup
just because config is broken; the advisory will keep firing until
config is repaired, which is fine).
"""
try:
from hermes_cli.config import load_config
cfg = load_config()
except Exception:
logger.debug("Could not load config for advisory acks", exc_info=True)
return set()
sec = cfg.get("security") or {}
raw = sec.get("acked_advisories") or []
if not isinstance(raw, list):
return set()
return {str(x).strip() for x in raw if str(x).strip()}
def ack_advisory(advisory_id: str) -> bool:
"""Persist an ack for ``advisory_id``. Returns True on success.
Idempotent: acking an already-acked ID is a no-op.
"""
advisory_id = advisory_id.strip()
if not advisory_id:
return False
try:
from hermes_cli.config import load_config, save_config
except Exception:
logger.warning("Could not import config module to persist ack")
return False
try:
cfg = load_config()
sec = cfg.setdefault("security", {})
existing = sec.get("acked_advisories") or []
if not isinstance(existing, list):
existing = []
if advisory_id not in existing:
existing.append(advisory_id)
sec["acked_advisories"] = existing
save_config(cfg)
return True
except Exception:
logger.exception("Failed to persist advisory ack for %s", advisory_id)
return False
def filter_unacked(hits: list[AdvisoryHit]) -> list[AdvisoryHit]:
"""Return only hits whose advisories the user has not dismissed."""
if not hits:
return []
acked = get_acked_ids()
return [h for h in hits if h.advisory.id not in acked]
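The ack round trip is easiest to see against an in-memory config dict. This is a sketch only: `add_ack` and `acked_ids` are hypothetical stand-ins for `ack_advisory` and `get_acked_ids` with the `load_config` / `save_config` calls and the exception handling removed.

```python
def add_ack(cfg: dict, advisory_id: str) -> bool:
    advisory_id = advisory_id.strip()
    if not advisory_id:
        return False
    sec = cfg.setdefault("security", {})
    existing = sec.get("acked_advisories") or []
    if not isinstance(existing, list):
        existing = []  # tolerate a malformed value by starting over
    if advisory_id not in existing:
        existing.append(advisory_id)
    sec["acked_advisories"] = existing
    return True


def acked_ids(cfg: dict) -> set[str]:
    sec = cfg.get("security") or {}
    raw = sec.get("acked_advisories") or []
    if not isinstance(raw, list):
        return set()
    return {str(x).strip() for x in raw if str(x).strip()}


cfg: dict = {}
add_ack(cfg, "shai-hulud-2026-05")
add_ack(cfg, "shai-hulud-2026-05")  # second ack is a no-op
print(cfg)
```

Storing only the ID list keeps the ack portable across machines that share a config file, which is the behavior the comment block above describes.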
# =============================================================================
# Rendering helpers
# =============================================================================
def _term_supports_color() -> bool:
if os.environ.get("NO_COLOR"):
return False
if not sys.stdout.isatty():
return False
return True
def short_banner_lines(hits: list[AdvisoryHit]) -> list[str]:
"""Return 3-4 short lines suitable for a startup banner.
Caller is responsible for color/styling. Always names the worst hit
explicitly so the user knows what's wrong without running doctor.
"""
if not hits:
return []
primary = hits[0]
lines = [
f"SECURITY ADVISORY [{primary.advisory.id}]: {primary.advisory.title}",
f" Detected: {primary.package}=={primary.installed_version}",
" Run 'hermes doctor' for remediation steps.",
]
if len(hits) > 1:
lines.insert(1, f" ({len(hits) - 1} additional advisor"
f"{'ies' if len(hits) > 2 else 'y'} also active.)")
return lines
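The shape of the banner can be illustrated with a minimal, standalone sketch. The `Advisory` and `Hit` dataclasses, the `banner_lines` name, and the sample IDs and package names below are hypothetical stand-ins for illustration; only the line layout and the singular/plural rule follow the function above.

```python
from dataclasses import dataclass


@dataclass
class Advisory:  # hypothetical stand-in for the real advisory record
    id: str
    title: str


@dataclass
class Hit:  # hypothetical stand-in for AdvisoryHit
    advisory: Advisory
    package: str
    installed_version: str


def banner_lines(hits):
    """Build the short banner: three lines, plus one more when several hits exist."""
    if not hits:
        return []
    primary = hits[0]
    lines = [
        f"SECURITY ADVISORY [{primary.advisory.id}]: {primary.advisory.title}",
        f"  Detected: {primary.package}=={primary.installed_version}",
        "  Run 'hermes doctor' for remediation steps.",
    ]
    if len(hits) > 1:
        extra = len(hits) - 1
        noun = "advisory" if extra == 1 else "advisories"
        # Inserted right under the headline so the count is seen early.
        lines.insert(1, f"  ({extra} additional {noun} also active.)")
    return lines


# Made-up sample data, used only to exercise the sketch.
sample = [
    Hit(Advisory("ADV-0001", "Example malicious release"), "examplepkg", "1.2.3"),
    Hit(Advisory("ADV-0002", "Another example"), "otherpkg", "0.9.0"),
    Hit(Advisory("ADV-0003", "Third example"), "thirdpkg", "4.0.0"),
]
one = banner_lines(sample[:1])
two = banner_lines(sample[:2])
three = banner_lines(sample)
```

With one hit the result has three lines; with two or more it has four, and the inserted second line switches from "advisory" to "advisories" once more than one additional hit is active.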
def full_remediation_text(hit: AdvisoryHit) -> list[str]:
"""Return a multi-line block describing the advisory + remediation."""
a = hit.advisory
lines = [
f"=== {a.title} ===",
f"ID: {a.id} Severity: {a.severity} Published: {a.published}",
f"Detected: {hit.package}=={hit.installed_version}",
f"Reference: {a.url}",
"",
a.summary,
"",
"Remediation:",
]
for i, step in enumerate(a.remediation, 1):
lines.append(f" {i}. {step}")
return lines
# =============================================================================
# Startup-banner gating
#
# We do NOT want to hammer the user with the banner on every command. Once
# they've seen it inside a 24h window we cache that fact in
# ``~/.hermes/cache/advisory_banner_seen`` (a single line per advisory ID:
# ``<id> <unix_epoch_seconds>``).
#
# Acked advisories never re-banner. Cached-but-not-acked advisories
# re-banner after 24h so the user doesn't fully forget.
# =============================================================================
_BANNER_CACHE_FILE = "advisory_banner_seen"
_BANNER_REPEAT_HOURS = 24
def _banner_cache_path() -> Optional[Path]:
try:
from hermes_constants import get_hermes_home
cache_dir = Path(get_hermes_home()) / "cache"
cache_dir.mkdir(parents=True, exist_ok=True)
return cache_dir / _BANNER_CACHE_FILE
except Exception:
return None
def _read_banner_cache() -> dict[str, float]:
p = _banner_cache_path()
if p is None or not p.exists():
return {}
out: dict[str, float] = {}
try:
for line in p.read_text(encoding="utf-8").splitlines():
line = line.strip()
if not line:
continue
parts = line.split(None, 1)
if len(parts) != 2:
continue
advisory_id, ts = parts
try:
out[advisory_id] = float(ts)
except ValueError:
continue
except Exception:
return {}
return out
def _write_banner_cache(seen: dict[str, float]) -> None:
p = _banner_cache_path()
if p is None:
return
try:
lines = [f"{aid} {ts}" for aid, ts in seen.items()]
p.write_text("\n".join(lines) + "\n", encoding="utf-8")
except Exception:
logger.debug("Could not write advisory banner cache", exc_info=True)
def hits_due_for_banner(
hits: list[AdvisoryHit],
*,
repeat_hours: int = _BANNER_REPEAT_HOURS,
) -> list[AdvisoryHit]:
"""Return only hits whose banner is due (not acked, not recently shown).
Side effect: stamps the banner cache for any hit that's about to be
shown. Callers should subsequently render the result.
"""
import time
fresh = filter_unacked(hits)
if not fresh:
return []
now = time.time()
cache = _read_banner_cache()
cutoff = now - (repeat_hours * 3600)
due: list[AdvisoryHit] = []
for hit in fresh:
last = cache.get(hit.advisory.id, 0.0)
if last < cutoff:
due.append(hit)
cache[hit.advisory.id] = now
if due:
_write_banner_cache(cache)
return due
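The gating rule in `hits_due_for_banner` can be sketched in isolation. This is a minimal sketch, assuming an in-memory dict in place of the cache file and plain string IDs in place of `AdvisoryHit` objects; `due_ids` and its parameters are hypothetical names, not part of the module.

```python
import time


def due_ids(ids, cache, acked, now=None, repeat_hours=24):
    """Return the IDs whose banner is due, stamping each one in ``cache``.

    ``cache`` maps advisory ID -> last-shown time in epoch seconds.
    ``acked`` is the set of IDs the user has dismissed.
    """
    now = time.time() if now is None else now
    cutoff = now - repeat_hours * 3600
    due = []
    for aid in ids:
        if aid in acked:
            continue  # acked advisories never re-banner
        if cache.get(aid, 0.0) < cutoff:
            due.append(aid)
            cache[aid] = now  # side effect: remember that we showed it
    return due


cache = {}
t0 = 1_700_000_000.0  # arbitrary fixed clock so the example is deterministic
first = due_ids(["ADV-1", "ADV-2"], cache, acked={"ADV-2"}, now=t0)
second = due_ids(["ADV-1"], cache, acked=set(), now=t0 + 3600)
third = due_ids(["ADV-1"], cache, acked=set(), now=t0 + 25 * 3600)
```

Here `first` contains only `ADV-1` (the other ID is acked), `second` is empty because only an hour has passed, and `third` contains `ADV-1` again once the 24-hour window has elapsed.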
# =============================================================================
# Public entry points used by doctor / CLI / gateway
# =============================================================================
def render_doctor_section(hits: list[AdvisoryHit]) -> tuple[bool, list[str]]:
"""Render the security-advisory section for ``hermes doctor``.
Returns ``(has_problems, lines)``. Caller is responsible for printing
with whatever color scheme it uses.
"""
fresh = filter_unacked(hits)
if not fresh:
return False, ["No active security advisories. ✓"]
lines: list[str] = []
for i, hit in enumerate(fresh):
if i:
lines.append("")
lines.extend(full_remediation_text(hit))
return True, lines
def startup_banner(hits: list[AdvisoryHit]) -> Optional[str]:
"""Return a printable startup banner, or None if nothing is due.
Updates the banner cache as a side effect (so the next call within
24h returns None for the same hit).
"""
due = hits_due_for_banner(hits)
if not due:
return None
lines = short_banner_lines(due)
if _term_supports_color():
red = "\x1b[1;31m"
reset = "\x1b[0m"
return red + "\n".join(lines) + reset
return "\n".join(lines)
def gateway_log_message(hits: list[AdvisoryHit]) -> Optional[str]:
"""Return a one-line log message for gateway operators, or None."""
fresh = filter_unacked(hits)
if not fresh:
return None
if len(fresh) == 1:
h = fresh[0]
return (f"Security advisory [{h.advisory.id}] active: "
f"{h.package}=={h.installed_version} matches {h.advisory.title}. "
f"See {h.advisory.url}")
return (f"{len(fresh)} security advisories active "
f"(IDs: {', '.join(h.advisory.id for h in fresh)}). "
f"Run `hermes doctor` on the gateway host for details.")
+13 -14
@@ -292,9 +292,9 @@ def prompt_yes_no(question: str, default: bool = True) -> bool:
if not value:
return default
if value in ("y", "yes"):
if value in {"y", "yes"}:
return True
if value in ("n", "no"):
if value in {"n", "no"}:
return False
print_error("Please enter 'y' or 'n'")
@@ -641,7 +641,7 @@ def _prompt_container_resources(config: dict):
persist_str = prompt(
" Persist filesystem across sessions? (yes/no)", persist_label
)
terminal["container_persistent"] = persist_str.lower() in ("yes", "true", "y", "1")
terminal["container_persistent"] = persist_str.lower() in {"yes", "true", "y", "1"}
# CPU
current_cpu = terminal.get("container_cpu", 1)
@@ -692,7 +692,7 @@ def _prompt_vercel_sandbox_settings(config: dict):
persist_label = "yes" if current_persist else "no"
terminal["container_persistent"] = prompt(
" Persist filesystem with snapshots? (yes/no)", persist_label
).lower() in ("yes", "true", "y", "1")
).lower() in {"yes", "true", "y", "1"}
current_cpu = terminal.get("container_cpu", 1)
cpu_str = prompt(" CPU cores", str(current_cpu))
@@ -708,7 +708,7 @@ def _prompt_vercel_sandbox_settings(config: dict):
except ValueError:
pass
if terminal.get("container_disk", 51200) not in (0, 51200):
if terminal.get("container_disk", 51200) not in {0, 51200}:
print_warning("Vercel Sandbox does not support custom disk sizing; resetting container_disk to 51200.")
terminal["container_disk"] = 51200
@@ -1355,14 +1355,13 @@ def setup_terminal_backend(config: dict):
existing_sudo = get_env_value("SUDO_PASSWORD")
if existing_sudo:
print_info("Sudo password: configured")
else:
if prompt_yes_no(
"Enable sudo support? (stores password for apt install, etc.)", False
):
sudo_pass = prompt(" Sudo password", password=True)
if sudo_pass:
save_env_value("SUDO_PASSWORD", sudo_pass)
print_success("Sudo password saved")
elif prompt_yes_no(
"Enable sudo support? (stores password for apt install, etc.)", False
):
sudo_pass = prompt(" Sudo password", password=True)
if sudo_pass:
save_env_value("SUDO_PASSWORD", sudo_pass)
print_success("Sudo password saved")
elif selected_backend == "docker":
print_success("Terminal backend: Docker")
@@ -1730,7 +1729,7 @@ def setup_agent_settings(config: dict):
current_mode = cfg_get(config, "display", "tool_progress", default="all")
mode = prompt("Tool progress mode", current_mode)
if mode.lower() in ("off", "new", "all", "verbose"):
if mode.lower() in {"off", "new", "all", "verbose"}:
if "display" not in config:
config["display"] = {}
config["display"]["tool_progress"] = mode.lower()
+5 -5
@@ -593,7 +593,7 @@ def do_install(identifier: str, category: str = "", force: bool = False,
answer = input("Confirm [y/N]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
answer = "n"
if answer not in ("y", "yes"):
if answer not in {"y", "yes"}:
c.print("[dim]Installation cancelled.[/]\n")
shutil.rmtree(q_path, ignore_errors=True)
return
@@ -948,7 +948,7 @@ def do_uninstall(name: str, console: Optional[Console] = None,
answer = input("Confirm [y/N]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
answer = "n"
if answer not in ("y", "yes"):
if answer not in {"y", "yes"}:
c.print("[dim]Cancelled.[/]\n")
return
@@ -984,7 +984,7 @@ def do_reset(name: str, restore: bool = False,
answer = input("Confirm [y/N]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
answer = "n"
if answer not in ("y", "yes"):
if answer not in {"y", "yes"}:
c.print("[dim]Cancelled.[/]\n")
return
@@ -1138,7 +1138,7 @@ def _github_publish(skill_path: Path, skill_name: str, target_repo: str,
f"https://api.github.com/repos/{target_repo}/forks",
headers=headers, timeout=30,
)
if resp.status_code in (200, 202):
if resp.status_code in {200, 202}:
fork = resp.json()
fork_repo = fork["full_name"]
elif resp.status_code == 403:
@@ -1564,7 +1564,7 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
repo = args[1] if len(args) > 1 else ""
do_tap(tap_action, repo=repo, console=c)
elif action in ("help", "--help", "-h"):
elif action in {"help", "--help", "-h"}:
_print_skills_help(c)
else:
+1 -1
@@ -367,7 +367,7 @@ def show_status(args):
if persist is None:
persist_enabled = bool(terminal_cfg.get("container_persistent", True))
else:
persist_enabled = persist.lower() in ("1", "true", "yes", "on")
persist_enabled = persist.lower() in {"1", "true", "yes", "on"}
auth_status = describe_vercel_auth()
sdk_ok = importlib.util.find_spec("vercel") is not None
sdk_label = "installed" if sdk_ok else "missing (install: pip install 'hermes-agent[vercel]')"
+1 -1
@@ -105,7 +105,7 @@ def configure_windows_stdio() -> bool:
_CONFIGURED = True
return False
if os.environ.get("HERMES_DISABLE_WINDOWS_UTF8") in ("1", "true", "True", "yes"):
if os.environ.get("HERMES_DISABLE_WINDOWS_UTF8") in {"1", "true", "True", "yes"}:
_CONFIGURED = True
return False
+138 -62
@@ -205,15 +205,9 @@ TOOL_CATEGORIES = {
],
"tts_provider": "elevenlabs",
},
{
"name": "Mistral (Voxtral TTS)",
"badge": "paid",
"tag": "Multilingual, native Opus",
"env_vars": [
{"key": "MISTRAL_API_KEY", "prompt": "Mistral API key", "url": "https://console.mistral.ai/"},
],
"tts_provider": "mistral",
},
# Mistral (Voxtral TTS) temporarily hidden — `mistralai` PyPI
# package is currently quarantined (malicious 2.4.6 release on
# 2026-05-12). Restore this entry once PyPI un-quarantines.
{
"name": "Google Gemini TTS",
"badge": "preview",
@@ -591,10 +585,136 @@ def _pip_install(
)
def install_cua_driver(upgrade: bool = False) -> bool:
"""Install or refresh the cua-driver binary used by Computer Use.
The upstream installer always pulls the latest release tag, so re-running
it is the canonical way to upgrade. We expose two modes:
* ``upgrade=False``: the original post-setup behaviour. Skip if already
installed, install otherwise. Used by the toolset enable flow where
we don't want to surprise the user with a network fetch.
* ``upgrade=True``: always re-run the installer (or call ``cua-driver
update`` if the binary supports it). Used by ``hermes update`` and
by ``hermes computer-use install --upgrade``.
Returns True iff cua-driver is installed (or successfully refreshed)
when the function returns. macOS-only: returns False on other
platforms, silently when ``upgrade=True`` and with a warning otherwise.
"""
import platform as _plat
import shutil
import subprocess
if _plat.system() != "Darwin":
if upgrade:
# Silent on non-macOS — `hermes update` calls this for every
# user; only macOS users with cua-driver care.
return False
_print_warning(" Computer Use (cua-driver) is macOS-only; skipping.")
return False
binary = shutil.which("cua-driver")
# Not installed → fresh install path (only when caller asked for it).
if not binary and not upgrade:
if not shutil.which("curl"):
_print_warning(" curl not found — install manually:")
_print_info(" https://github.com/trycua/cua/blob/main/libs/cua-driver/README.md")
return False
return _run_cua_driver_installer(label="Installing")
# Already installed and caller didn't ask to upgrade → just confirm.
if binary and not upgrade:
try:
version = subprocess.run(
["cua-driver", "--version"],
capture_output=True, text=True, timeout=5,
).stdout.strip()
_print_success(f" cua-driver already installed: {version or 'unknown version'}")
except Exception:
_print_success(" cua-driver already installed.")
_print_info(" Grant macOS permissions if not done yet:")
_print_info(" System Settings > Privacy & Security > Accessibility")
_print_info(" System Settings > Privacy & Security > Screen Recording")
return True
# upgrade=True path — refresh to the latest upstream release.
if not shutil.which("curl"):
_print_warning(" curl not found — cannot refresh cua-driver.")
return bool(binary)
if binary:
# Show before/after version when we have a baseline. Best-effort.
try:
before = subprocess.run(
["cua-driver", "--version"],
capture_output=True, text=True, timeout=5,
).stdout.strip()
except Exception:
before = ""
else:
before = ""
ok = _run_cua_driver_installer(label="Refreshing", verbose=False)
if ok and before:
try:
after = subprocess.run(
["cua-driver", "--version"],
capture_output=True, text=True, timeout=5,
).stdout.strip()
if after and after != before:
_print_success(f" cua-driver upgraded: {before} → {after}")
elif after:
_print_info(f" cua-driver up to date: {after}")
except Exception:
pass
return ok
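The branching in `install_cua_driver` can be summarised as a small decision table. The sketch below is an illustration only: `plan_cua_driver_action` and its string labels are hypothetical, and the real function performs the install and prints messages instead of returning a label.

```python
def plan_cua_driver_action(system, has_binary, has_curl, upgrade):
    """Return a label for the branch install_cua_driver would take.

    system     -- value of platform.system(), e.g. "Darwin" or "Linux"
    has_binary -- whether cua-driver is already on PATH
    has_curl   -- whether curl is on PATH
    upgrade    -- the caller's upgrade flag
    """
    if system != "Darwin":
        return "skip"  # macOS-only; a warning is printed only when upgrade is False
    if not upgrade:
        if has_binary:
            return "confirm"  # report the installed version, no network fetch
        return "install" if has_curl else "manual"
    # upgrade=True: refresh when possible, otherwise keep whatever is installed
    return "refresh" if has_curl else "keep"


cases = {
    ("Linux", False, True, False): "skip",
    ("Darwin", True, True, False): "confirm",
    ("Darwin", False, True, False): "install",
    ("Darwin", False, False, False): "manual",
    ("Darwin", True, True, True): "refresh",
    ("Darwin", True, False, True): "keep",
}
results = {args: plan_cua_driver_action(*args) for args in cases}
```

Separating the decision from the side effects like this is one way to unit-test the branches without touching the network; whether the project wants that split is a design choice left open here.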
def _run_cua_driver_installer(label: str = "Installing", verbose: bool = True) -> bool:
"""Run the upstream cua-driver install.sh. Returns True on success.
The script is idempotent: it always downloads the latest release, so
re-running it on an already-installed system performs an upgrade.
"""
import shutil
import subprocess
install_cmd = (
"/bin/bash -c \"$(curl -fsSL "
"https://raw.githubusercontent.com/trycua/cua/main/"
"libs/cua-driver/scripts/install.sh)\""
)
if verbose:
_print_info(f" {label} cua-driver (macOS background computer-use)...")
else:
_print_info(f" {label} cua-driver...")
try:
result = subprocess.run(install_cmd, shell=True, timeout=300)
if result.returncode == 0 and shutil.which("cua-driver"):
if verbose:
_print_success(" cua-driver installed.")
_print_info(" IMPORTANT — grant macOS permissions now:")
_print_info(" System Settings > Privacy & Security > Accessibility")
_print_info(" System Settings > Privacy & Security > Screen Recording")
_print_info(" Both must allow the terminal / Hermes process.")
return True
_print_warning(f" cua-driver {label.lower()} did not complete. Re-run manually:")
_print_info(f" {install_cmd}")
return False
except subprocess.TimeoutExpired:
_print_warning(f" cua-driver {label.lower()} timed out. Re-run manually.")
return False
except Exception as e:
_print_warning(f" cua-driver {label.lower()} failed: {e}")
return False
def _run_post_setup(post_setup_key: str):
"""Run post-setup hooks for tools that need extra installation steps."""
import shutil
if post_setup_key in ("agent_browser", "browserbase"):
if post_setup_key in {"agent_browser", "browserbase"}:
node_modules = PROJECT_ROOT / "node_modules" / "agent-browser"
npm_bin = shutil.which("npm")
npx_bin = shutil.which("npx")
@@ -729,51 +849,7 @@ def _run_post_setup(post_setup_key: str):
_print_info(" docker run -p 9377:9377 -e CAMOFOX_PORT=9377 jo-inc/camofox-browser")
elif post_setup_key == "cua_driver":
# cua-driver provides macOS background computer-use (SkyLight SPIs).
# Install via upstream curl script if the binary isn't on $PATH yet.
import platform as _plat
import subprocess
if _plat.system() != "Darwin":
_print_warning(" Computer Use (cua-driver) is macOS-only; skipping.")
return
if shutil.which("cua-driver"):
try:
version = subprocess.run(
["cua-driver", "--version"],
capture_output=True, text=True, timeout=5,
).stdout.strip()
_print_success(f" cua-driver already installed: {version or 'unknown version'}")
except Exception:
_print_success(" cua-driver already installed.")
_print_info(" Grant macOS permissions if not done yet:")
_print_info(" System Settings > Privacy & Security > Accessibility")
_print_info(" System Settings > Privacy & Security > Screen Recording")
return
if not shutil.which("curl"):
_print_warning(" curl not found — install manually:")
_print_info(" https://github.com/trycua/cua/blob/main/libs/cua-driver/README.md")
return
_print_info(" Installing cua-driver (macOS background computer-use)...")
try:
install_cmd = (
"/bin/bash -c \"$(curl -fsSL "
"https://raw.githubusercontent.com/trycua/cua/main/"
"libs/cua-driver/scripts/install.sh)\""
)
result = subprocess.run(install_cmd, shell=True, timeout=300)
if result.returncode == 0 and shutil.which("cua-driver"):
_print_success(" cua-driver installed.")
_print_info(" IMPORTANT — grant macOS permissions now:")
_print_info(" System Settings > Privacy & Security > Accessibility")
_print_info(" System Settings > Privacy & Security > Screen Recording")
_print_info(" Both must allow the terminal / Hermes process.")
else:
_print_warning(" cua-driver install did not complete. Re-run manually:")
_print_info(f" {install_cmd}")
except subprocess.TimeoutExpired:
_print_warning(" cua-driver install timed out. Re-run manually.")
except Exception as e:
_print_warning(f" cua-driver install failed: {e}")
install_cua_driver(upgrade=False)
elif post_setup_key == "kittentts":
try:
@@ -1631,7 +1707,7 @@ def _is_provider_active(provider: dict, config: dict) -> bool:
image_cfg = config.get("image_gen", {})
if isinstance(image_cfg, dict):
configured_provider = image_cfg.get("provider")
if configured_provider not in (None, "", "fal"):
if configured_provider not in {None, "", "fal"}:
return False
if image_cfg.get("use_gateway") is not None and not is_truthy_value(image_cfg.get("use_gateway"), default=False):
return False
@@ -1664,7 +1740,7 @@ def _is_provider_active(provider: dict, config: dict) -> bool:
configured_provider = image_cfg.get("provider")
return (
provider["imagegen_backend"] == "fal"
and configured_provider in (None, "", "fal")
and configured_provider in {None, "", "fal"}
and not is_truthy_value(image_cfg.get("use_gateway"), default=False)
)
return False
@@ -1914,7 +1990,7 @@ def _configure_provider(provider: dict, config: dict):
# For tools without a specific config key (e.g. image_gen), still
# track use_gateway so the runtime knows the user's intent.
if managed_feature and managed_feature not in ("web", "tts", "browser"):
if managed_feature and managed_feature not in {"web", "tts", "browser"}:
config.setdefault(managed_feature, {})["use_gateway"] = True
elif not managed_feature:
# User picked a non-gateway provider — find which category this
@@ -1946,7 +2022,7 @@ def _configure_provider(provider: dict, config: dict):
# image_gen.provider clear so the dispatch shim falls through
# to the legacy FAL path.
img_cfg = config.setdefault("image_gen", {})
if isinstance(img_cfg, dict) and img_cfg.get("provider") not in (None, "", "fal"):
if isinstance(img_cfg, dict) and img_cfg.get("provider") not in {None, "", "fal"}:
img_cfg["provider"] = "fal"
return
@@ -1991,7 +2067,7 @@ def _configure_provider(provider: dict, config: dict):
if backend:
_configure_imagegen_model(backend, config)
img_cfg = config.setdefault("image_gen", {})
if isinstance(img_cfg, dict) and img_cfg.get("provider") not in (None, "", "fal"):
if isinstance(img_cfg, dict) and img_cfg.get("provider") not in {None, "", "fal"}:
img_cfg["provider"] = "fal"
@@ -2186,7 +2262,7 @@ def _reconfigure_provider(provider: dict, config: dict):
web_cfg["use_gateway"] = bool(managed_feature)
_print_success(f" Web backend set to: {provider['web_backend']}")
if managed_feature and managed_feature not in ("web", "tts", "browser"):
if managed_feature and managed_feature not in {"web", "tts", "browser"}:
section = config.setdefault(managed_feature, {})
if not isinstance(section, dict):
section = {}
@@ -2535,7 +2611,7 @@ def _configure_mcp_tools_interactive(config: dict):
# Count enabled servers
enabled_names = [
k for k, v in mcp_servers.items()
if v.get("enabled", True) not in (False, "false", "0", "no", "off")
if v.get("enabled", True) not in {False, "false", "0", "no", "off"}
]
if not enabled_names:
_print_info("All MCP servers are disabled.")
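One behavioural difference between the tuple and set forms used throughout this diff is worth keeping in mind: a membership test against a set hashes the left operand, so an unhashable value (for example a list read from a malformed config) raises `TypeError`, where the tuple form would simply compare and return a result. The sketch below demonstrates this with two hypothetical helper names; it does not claim that such values occur in practice in this codebase.

```python
def enabled_tuple(value):
    """Pre-change form: membership test against a tuple."""
    return value not in (False, "false", "0", "no", "off")


def enabled_set(value):
    """Post-change form: membership test against a set literal."""
    return value not in {False, "false", "0", "no", "off"}


# For hashable values the two forms agree.
hashable_samples = [True, False, "no", "yes", 0, 1, None]
agree = all(enabled_tuple(v) == enabled_set(v) for v in hashable_samples)

# For an unhashable value only the tuple form returns a result.
tuple_result = enabled_tuple(["unexpected"])
try:
    enabled_set(["unexpected"])
    set_raised = False
except TypeError:
    set_raised = True
```

If config values can be lists or dicts at any of these call sites, an `isinstance` guard before the test (or keeping the tuple) would preserve the old behaviour.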

Some files were not shown because too many files have changed in this diff.