Compare commits

...

293 Commits

Author SHA1 Message Date
Teknium f2332d4f13 revert(gateway): remove stale-code self-check and auto-restart
Removes the _detect_stale_code / _trigger_stale_code_restart mechanism
introduced in #17648 and iterated in #19740. On every incoming message
the gateway compared the boot-time git HEAD SHA to the current SHA on
disk, and if they differed it would reply with

    Gateway code was updated in the background --
    restarting this gateway so your next message runs
    on the new code. Please retry in a moment.

and then kick off a graceful restart. This is unwanted behaviour:
users who run a long-lived gateway and do their own ad-hoc git
operations on the checkout end up with their chat interrupted and
the current message dropped every time HEAD moves, with no way to
opt out.

If an operator really needs the old protection against stale
sys.modules after "hermes update", the SIGKILL-survivor sweep in
hermes update (hermes_cli/main.py, also tagged #17648) already
handles the supervisor-respawn case on its own.

Removed:
  gateway/run.py:
    - _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS
    - _read_git_head_sha(), _compute_repo_mtime() module helpers
    - class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha /
      _stale_code_restart_triggered defaults
    - __init__ boot-snapshot block (_boot_*, _cached_current_sha*,
      _repo_root_for_staleness, _stale_code_notified)
    - _current_git_sha_cached(), _detect_stale_code(),
      _trigger_stale_code_restart() methods
    - stale-code check + user-facing restart notice at the top of
      _handle_message()
  tests/gateway/test_stale_code_self_check.py (deleted, 412 lines)

No new logic added. Zero remaining references to any removed
symbol. Gateway test suite passes the same 4589 tests it passed
before; the 3 pre-existing unrelated failures (discord free-channel,
feishu bot admission, teams typing) are unchanged by this commit.
2026-05-05 03:44:10 -07:00
Siddharth Balyan 13a7cbcd64 fix(nix): refresh stale tui npmDepsHash + fix cache-blind detection (#20144)
The fix-lockfiles script used 'nix build .#tui.npmDeps' to detect stale
hashes. This always succeeds when the OLD derivation is cached in Cachix
or cache.nixos.org — even when the source package-lock.json has changed.

Fix: use prefetch-npm-deps to compute the hash directly from the lockfile
and compare against what's in the nix file. Falls back to nix build only
if prefetch-npm-deps fails.
2026-05-05 15:32:20 +05:30
teknium1 601e5f1d57 fix(teams): log reply() fallback for diagnostics
The previous bare except swallowed every exception from app.reply()
silently. Log at debug so real failures (auth, chat gone) leave a
trace while keeping the group-chat 400 fallback working. Also fix
the Teams entry's indentation in the messaging flowchart.
2026-05-04 20:59:18 -07:00
Aamir Jawaid 2333b7a7ec fix(tests): patch TypingActivityInput after mock on Python <3.12
The SDK requires Python >=3.12 so CI (3.11) falls to the except
ImportError branch, leaving TypingActivityInput=None. After loading
the adapter module, explicitly restore it from the mock so
test_send_typing doesn't silently no-op.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Aamir Jawaid 3f023450dd fix(teams): fall back to flat send when threading returns 400
Group chats return 400 for threaded sends. Catch the error and
fall back to a flat send so messages always get delivered.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Aamir Jawaid 69aeba0df7 feat(teams): implement threading via app.reply()
Wire reply_to into send() using App.reply(conv_id, msg_id, content)
which constructs the threaded conversation ID internally.
Threads supported in channels and group chats.

Update comparison table: Threads 

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Aamir Jawaid 10f89d7b72 docs(teams): add Teams to messaging/index.md
- Add to platform description and intro paragraph
- Add row to platform comparison table (images + typing)
- Add node to architecture mermaid diagram
- Add TEAMS_ALLOWED_USERS to security examples
- Add to platform-specific toolsets table
- Add to Next Steps links

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Aamir Jawaid 93869b48ab docs: add Microsoft Teams to platform lists across docs
Update all platform enumeration lists to include Teams:
index.md, quickstart.md, integrations/index.md, sessions.md,
slash-commands.md, updating.md, hooks.md, hermes-agent skill.

Skipped PII redaction docs — Teams uses AAD object IDs, not
phone numbers, so redaction doesn't apply there.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Aamir Jawaid ef94aa201f docs(teams): add Teams to sidebar
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Teknium c77a6e3faa chore(security): add OSV-Scanner CI + Dependabot for github-actions only (#20037)
Adds two supply-chain controls that complement our existing pinning
strategy (full-SHA action pins, exact-version source dep pins via
uv.lock / package-lock.json) without undermining it.

.github/workflows/osv-scanner.yml
  Detection-only scan of uv.lock and the ui-tui/website package-locks
  against the OSV vulnerability database. Runs on PRs that touch
  lockfiles, on push to main, and weekly against main so CVEs
  published after merge still surface. Uses Google's officially-
  recommended reusable workflow pinned by full SHA (v2.3.5).
  Findings upload to the Security tab; fail-on-vuln is disabled so
  pre-existing vulns in pinned deps do not block merges — we move
  pins deliberately, not under CI pressure.

.github/dependabot.yml
  Scoped to github-actions only. Action pins must be moved when
  upstream publishes patches (often themselves security fixes);
  Dependabot opens a PR with the new SHA + release notes for normal
  review. Source-dependency ecosystems (pip, npm) are deliberately
  NOT enabled — automatic version-bump PRs against uv.lock /
  package-lock.json would fight our pinning strategy. CVE-driven
  security updates for source deps are enabled separately via the
  repo's Dependabot security updates setting (GitHub UI), which
  fires only when a pinned version becomes known-vulnerable.
2026-05-04 20:58:21 -07:00
Stephen Schoettler 1d938832a7 test(kanban): patch dashboard websocket token stub 2026-05-04 20:50:24 -07:00
Stephen Schoettler f7918c9349 test(teams): mock ClientOptions in adapter tests 2026-05-04 20:50:24 -07:00
Teknium a1bed18194 docs: clarify that the Docker terminal backend is a single persistent container (#20003)
The docs were ambiguous about whether the Docker terminal backend spins up
a fresh container per command or reuses a long-lived one. It's the latter
— Hermes starts one container on first use and routes every terminal,
file, and execute_code call through docker exec into that same container
for the life of the process (across /new, /reset, and delegate_task
subagents). Working-directory changes, installed packages, and files in
/workspace persist from one tool call to the next, like a local shell.

- configuration.md: lead the Docker Backend section with the persistence
  model before the YAML example; sharpen the Backend Overview table row.
- features/tools.md: expand the Docker Backend block (previously just a
  2-line YAML stub) with a clear statement of the persistent-container
  semantics and a pointer to the full lifecycle section.
- docker.md: tighten the 'Docker as a terminal backend' bullet and the
  'Skills and credential files' paragraph to call out the single-container
  model explicitly.
2026-05-04 20:09:31 -07:00
Jeffrey Quesnelle d12f59aa53 Merge pull request #19866 from NousResearch/fix/clarify-placeholder-credential
clarify placeholder telegram credential in tests
2026-05-04 22:24:52 -04:00
helix4u b816fd4e26 fix(tui): complete absolute paths as paths 2026-05-04 16:14:40 -07:00
helix4u b632290166 fix(gateway): handle planned service stops 2026-05-04 16:00:49 -07:00
brooklyn! 20428f5e60 fix(tui): respect voice.record_key config (supersedes #19028, #19339) (#19835)
* fix(tui): respect voice.record_key config instead of hardcoded Ctrl+B

Classic CLI loaded ``voice.record_key`` from config.yaml and bound the
prompt-toolkit handler dynamically (``cli.py`` paths). The new TUI hard-
coded ``Ctrl+B`` everywhere — ``isVoiceToggleKey`` (input handler),
``/voice status`` ("Record key: Ctrl+B"), and ``/voice on`` ("Ctrl+B to
start/stop recording"). A user who set ``voice.record_key: ctrl+o``
(or any other key) saw the documented config silently ignored — only
Ctrl+B worked, the displayed shortcut lied about it.

Wire the configured key end to end through the existing channels:

* **Backend** (``tui_gateway/server.py``): ``voice.toggle`` action=status
  AND action=on/off responses now include ``record_key``, sourced from
  ``config.get('voice', {}).get('record_key', 'ctrl+b')``.
* **Backend types** (``ui-tui/src/gatewayTypes.ts``): ``ConfigFullResponse``
  now exposes ``config.voice.record_key`` and ``VoiceToggleResponse``
  carries ``record_key`` so the TUI can both bind and display it.
* **Frontend parser/formatter** (``ui-tui/src/lib/platform.ts``):
  ``parseVoiceRecordKey()`` accepts ``ctrl+b`` / ``alt+r`` / ``cmd+space``
  and the common aliases (``option``, ``cmd``, ``win``, …); falls back to
  the documented Ctrl+B for empty / multi-character / malformed input so
  a typo never silently disables the shortcut. ``formatVoiceRecordKey()``
  renders for status text. ``isVoiceToggleKey`` now takes a parsed
  ``ParsedVoiceRecordKey`` argument; the hardcoded ``ch === 'b'`` is
  gone. Default arg keeps existing call sites back-compat.
* **Hydration** (``ui-tui/src/app/useConfigSync.ts``,
  ``useMainApp.ts``): startup ``config.get full`` already runs; extract
  ``cfg.voice.record_key`` from it, parse, push into a new
  ``voiceRecordKey`` state, and forward to the input handler ctx
  (``InputHandlerContext.voice.recordKey``). Mtime-poll path also
  re-applies the parsed key so a hand-edit of config.yaml takes effect
  the next tick — matches existing behaviour for display options.
* **Input handler** (``ui-tui/src/app/useInputHandlers.ts``):
  ``isVoiceToggleKey(key, ch, voice.recordKey)`` so the configured
  binding fires.
* **Slash command** (``ui-tui/src/app/slash/commands/session.ts``):
  ``/voice status`` and ``/voice on`` use ``formatVoiceRecordKey`` on
  the response's ``record_key`` instead of the hardcoded label.

Tests:
* ``parseVoiceRecordKey`` covers ctrl/alt/cmd/super aliases, multi-char
  rejection, and empty fallback.
* ``formatVoiceRecordKey`` covers the doc examples (``Ctrl+B``,
  ``Ctrl+O``, ``Alt+R``, ``Cmd+B``).
* ``isVoiceToggleKey`` regression: ``ctrl+o`` configured → only ``o``
  matches, not ``b``; ``alt+r`` matches both alt-bit and meta-bit
  encodings (terminal protocol parity); omitted-arg call still binds
  Ctrl+B for back-compat.

Full TUI suite (555 tests) passes; ``tsc --noEmit`` clean.

Fixes #18994

Co-authored-by: asheriif <ahmedsherif95@gmail.com>

* fix(tui): support named-key tokens in voice.record_key (space, enter, …)

Reviewer caught that the round-1 parser in #18994 rejected every
multi-character token, so a config value like ``ctrl+space`` (which the
CLI happily binds via prompt_toolkit's ``c-space`` rewrite in
``cli.py``) silently fell back to the documented Ctrl+B default —
re-introducing the same false-shortcut bug the PR was meant to fix,
just at a different surface.

Add explicit named-key support that mirrors what the CLI accepts:

* ``space``         (alias: ``spc``)        → matches ``ch === ' '``
* ``enter``         (alias: ``return``, ``ret``) → matches ``key.return``
* ``tab``                                   → matches ``key.tab``
* ``escape``        (alias: ``esc``)        → matches ``key.escape``
* ``backspace``     (alias: ``bs``)         → matches ``key.backspace``
* ``delete``        (alias: ``del``)        → matches ``key.delete``

``ParsedVoiceRecordKey`` gains an optional ``named`` field; ``ch``
holds either a single char (back-compat) or the canonical named token,
and the runtime matcher dispatches on ``named`` before checking the
modifier shape. Aliases collapse to one canonical name so
``ctrl+esc`` and ``ctrl+escape`` behave identically.

Unrecognised multi-character tokens (e.g. ``ctrl+spcae`` typo, or
unsupported keys like ``ctrl+f5``) still fall back to the Ctrl+B
default rather than silently disabling the binding — keeps the "typo
never silently kills the shortcut" guarantee.

Tests:

* ``parseVoiceRecordKey`` parametrised over every named token + each
  alias variant.
* New ``isVoiceToggleKey`` cases for space (ch-based match), enter
  (``key.return``), tab, escape, backspace, delete, including
  modifier-mismatch negatives.
* ``formatVoiceRecordKey`` renders named keys in title case
  (``Ctrl+Space``, ``Ctrl+Enter``).
* Existing fall-back-to-Ctrl+B contract preserved for empty input
  AND unrecognised multi-char tokens.

Full TUI suite: 559/559 pass; ``tsc --noEmit`` clean.

Refs #18994 (round-1 review feedback)

Co-authored-by: asheriif <ahmedsherif95@gmail.com>

* test(tui): assert voice.toggle returns configured record_key

Salvage the backend regression from #19339 — asserts ``voice.toggle``
action=on AND action=status responses carry the configured
``voice.record_key`` end-to-end through ``_load_cfg()``. Keeps the
CLI→TUI parity contract visible in the Python test suite alongside
the existing frontend parser/matcher/formatter coverage from #19028.

* fix(tui): address Copilot review on #19835 voice.record_key wiring

Five tightenings on the parser + matcher + hydration surface, all
caught by the Copilot review on the PR — each one turns a silent
false-fire or display/binding skew into a deterministic behaviour.

* **isVoiceToggleKey ctrl branch was too permissive for named keys.**
  The doc-default macOS Cmd+B muscle-memory fallback
  (``isActionMod(key)`` on top of ``key.ctrl``) fired for every
  configured key, so bare Esc — which hermes-ink reports with
  ``key.meta`` on some macOS terminals — triggered ``ctrl+escape``,
  and Alt+Space / Alt+Tab triggered ``ctrl+space`` / ``ctrl+tab``.
  Gate the fallback to the literal ``ctrl+b`` binding so any custom
  chord requires the real Ctrl bit.
* **Alt branch guarded against Ctrl/Cmd co-press.** Without this,
  Ctrl+Alt+<letter> and Cmd+Alt+<letter> also fired ``alt+<letter>``.
* **Dropped the ``meta`` modifier variant and its alias.** In
  hermes-ink ``key.meta`` is Alt on xterm-style terminals and Cmd on
  legacy macOS ones, so a literal ``meta+b`` config displayed as
  ``Cmd+B`` while matching Alt+B — exactly the kind of false
  shortcut the PR was meant to remove. ``cmd`` / ``command`` now
  collapse onto ``super`` (kitty-style ``key.super``, with a macOS
  ``key.meta`` fallback) and render as ``Cmd+B``. Unknown modifier
  tokens fall back to the documented Ctrl+B default rather than
  silently coercing to Ctrl.
* **Slash-command display/binding skew.** ``/voice status`` and
  ``/voice on`` rendered from the fresh gateway ``record_key``
  response, but ``useInputHandlers()`` still bound the old key
  until the next 5s mtime poll. Thread ``setVoiceRecordKey``
  through ``SlashHandlerContext.voice`` and push the parsed spec
  into frontend state on every response so text and binding stay
  consistent.
* **Test coverage for the two paths Copilot flagged.** Added
  vitest coverage for (a) the three-case ``/voice`` slash output
  in ``createSlashHandler.test.ts`` and (b) the
  ``applyDisplay → voice.record_key`` hydration + omit-setter
  back-compat paths in ``useConfigSync.test.ts``. Plus regression
  cases for every false-fire scenario above.

Suite: 575/575 green, tsc --noEmit clean.

* fix(tui): address Copilot round-2 review on #19835

Three tightenings on the surface introduced in the round-1 fix:

* **``/voice tts`` reset custom bindings to Ctrl+B.** The ``tts`` branch
  of ``voice.toggle`` omitted ``record_key`` from its response, so the
  frontend's ``r.record_key ?? 'ctrl+b'`` coerced a user's custom
  binding back to the default on every TTS toggle. Two-sided fix:
  the backend now includes ``record_key`` on the ``tts`` branch (parity
  with ``status``/``on``/``off``), and the slash handler only pushes
  frontend state when the response actually carries ``record_key`` —
  belt-and-suspenders against any future branch forgetting to include
  it.

* **``super+b`` / ``win+b`` / ``cmd+b`` displayed "Cmd+B" on Linux and
  Windows.** ``formatVoiceRecordKey`` rendered ``mod === 'super'`` as
  ``Cmd`` universally, which told non-mac users the wrong modifier to
  press even though ``isVoiceToggleKey`` matched the right event bits.
  Gate the label to ``isMac`` so non-mac renders ``Super+B``.

* **``control+b`` / ``ctrl + b`` lost the macOS Cmd+B fallback.**
  ``_isDefaultVoiceKey`` keyed off ``parsed.raw`` — so
  semantically-equal aliases of the documented default dropped into
  the strict branch even though they bind Ctrl+B. Compare on the
  parsed spec (mod + ch + named) instead.

Coverage added: Linux ``Super+B`` rendering (and macOS ``Cmd+B``),
``control+b`` / ``ctrl + b`` accepting the Cmd+B fallback on darwin,
``/voice tts`` without ``record_key`` not clobbering cached binding,
and a backend regression asserting every ``voice.toggle`` branch
carries the configured key.

Suite: 579/579 TUI vitest green, 2/2 backend voice tests green,
tsc --noEmit clean.

* fix(tui): address Copilot round-3 review on #19835

Three classes of robustness issue caught on the second pass — all
revolve around malformed YAML tipping ``parseVoiceRecordKey`` or
``_voice_record_key`` into a crash instead of the documented
fallback.

* **Parser crashed on non-string YAML scalars.** ``config.get full``
  returns raw ``yaml.safe_load`` output, so ``voice.record_key: 1``
  or ``voice.record_key: true`` in a hand-edited config would hit
  ``.trim()`` on a number/bool and throw, breaking startup and
  every mtime re-apply. Accept ``unknown`` at the signature, guard
  with ``typeof raw !== 'string'``, and fall back to the default.

* **Backend blew up on non-dict ``voice:``.** Same YAML hazard on
  the gateway side: ``voice: true`` / ``voice: cmd+b`` left
  ``_load_cfg().get("voice")`` as a bool/str, so ``.get("record_key")``
  raised AttributeError and took every ``voice.toggle`` branch down
  with it. Centralised the lookup in a single
  ``_voice_record_key()`` helper that ``isinstance``-guards both
  ``voice`` and ``record_key`` and falls back to ``ctrl+b``.

* **Multi-modifier chords silently dropped extras.** The previous
  validator only checked the first modifier token, so ``ctrl+alt+r``
  silently parsed as ``ctrl+r`` and ``cmd+ctrl+b`` as ``super+b`` —
  a typo bound a different shortcut than the user configured.
  Reject multi-modifier spellings outright; the classic CLI only
  supports single-modifier bindings via prompt_toolkit's ``c-x`` /
  ``a-x`` rewrite, so this matches CLI parity.

Coverage added:

* ``parseVoiceRecordKey`` fallback on ``1`` / ``true`` / ``null`` /
  ``undefined`` / ``{}``.
* ``parseVoiceRecordKey`` fallback on ``ctrl+alt+r`` /
  ``cmd+ctrl+b`` / ``alt+ctrl+space``.
* ``test_voice_toggle_handles_non_dict_voice_cfg`` exercises
  every non-dict ``voice:`` shape (bool, str, None, int, list) and
  asserts each falls back to ``record_key: 'ctrl+b'``.

Suite: 581/581 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.

* fix(tui): address Copilot round-4 review on #19835

Four final corners of the voice.record_key surface:

* **Bare-char configs silently coerced to ``ctrl+<key>``.** A config
  like ``voice.record_key: o`` / ``space`` / ``escape`` fell through
  to the default ``mod = 'ctrl'`` and silently bound Ctrl+O, while
  the classic CLI's prompt_toolkit would bind the raw key (no
  rewrite) — so the two runtimes silently disagreed on what "o"
  means. Require an explicit modifier; bare-char configs fall back
  to the documented Ctrl+B default.

* **Reserved ctrl+<letter> bindings would never fire.**
  ``useInputHandlers()`` intercepts ``ctrl+c`` (interrupt),
  ``ctrl+d`` (quit), and ``ctrl+l`` (clear screen) before the voice
  check runs, so those configs would be advertised in /voice
  status but the advertised shortcut never actually triggers
  push-to-talk. Added ``_RESERVED_CTRL_CHARS`` at parse time so
  the user gets the documented default instead of a dead shortcut.
  (``alt+c``, ``cmd+l``, etc. are not intercepted and stay usable.)

* **``_load_cfg()`` root itself may be a non-dict.**
  ``_voice_record_key()`` isinstance-guarded the ``voice`` subkey
  but not the root — a malformed config.yaml that collapsed to a
  scalar/list at the top level (``config.yaml: true`` or ``[]``)
  would still raise on ``.get("voice")``. Added the top-level
  guard too so every malformed shape falls back to ``ctrl+b``.

* **Stale header comment on ``isVoiceToggleKey``.** The doc-comment
  still claimed "On macOS we additionally accept the platform
  action modifier (Cmd) for the configured letter" even though the
  implementation gates the Cmd fallback to the documented default
  only. Rewrote to match.

Coverage added:

* ``parseVoiceRecordKey`` fallback on bare chars (``o``, ``b``,
  ``space``, ``escape``).
* ``parseVoiceRecordKey`` fallback on ``ctrl+c`` / ``ctrl+d`` /
  ``ctrl+l``; positive case for ``alt+c`` / ``cmd+l`` still usable.
* Backend ``test_voice_toggle_handles_non_dict_voice_cfg`` now
  exercises 5 non-dict shapes at the YAML root too.

Suite: 583/583 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.

* fix(tui): address Copilot round-5 review on #19835

Three follow-ups on the voice matcher's modifier + shift discipline:

* **``super`` branch falsely fired on Alt+<key> / bare Esc on macOS.**
  ``isVoiceToggleKey`` accepted ``isMac && key.meta`` as a Cmd
  fallback for the ``super`` modifier — but hermes-ink sets
  ``key.meta`` for plain Alt/Option AND for bare Escape on some
  macOS terminals. A ``cmd+b`` config silently fired on Alt+B;
  ``cmd+space`` on Alt+Space; ``cmd+escape`` on bare Esc. Drop the
  fallback and require the literal ``key.super`` bit. Legacy-
  terminal users who need Cmd should upgrade to a kitty-protocol
  terminal or bind ``alt+X`` explicitly.

* **Shift bit was never checked.** The parser rejects multi-
  modifier configs like ``ctrl+shift+tab``, but the runtime
  matcher didn't check ``key.shift`` — so ``ctrl+tab`` also fired
  on Ctrl+Shift+Tab and ``alt+enter`` on Alt+Shift+Enter.
  Early-return on ``key.shift === true`` so the runtime only fires
  the exact chord the user configured.

* **Test leaked ``HERMES_VOICE=1`` into later tests.**
  ``voice.toggle`` action=on writes to ``os.environ`` directly
  (CLI parity, runtime-only flag); ``test_voice_toggle_returns_
  configured_record_key`` dispatched action=on without letting
  monkeypatch take ownership of the var first. Any later test
  that read voice mode in the same Python process could inherit a
  stale enabled state. Added ``monkeypatch.setenv("HERMES_VOICE",
  "0")`` up front so monkeypatch restores the original value at
  teardown.

Coverage added:

* ``cmd+b`` / ``cmd+space`` / ``cmd+escape`` do NOT fire on
  ``key.meta``-only events on darwin.
* ``ctrl+tab`` / ``alt+enter`` / ``ctrl+o`` reject matches when
  ``key.shift`` is held; sanity cases without Shift still fire.

Suite: 585/585 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.

* fix(tui): address Copilot round-6 review on #19835

Three classes of modifier-discipline tightening + one config-surface
honesty fix:

* **Default ``ctrl+b`` Cmd fallback leaked Alt+B.** The default's
  macOS Cmd+B muscle-memory path used ``isActionMod(key)``, which
  returns ``key.meta || key.super`` on darwin. hermes-ink also
  reports plain Alt as ``key.meta``, so Alt+B silently fired the
  default binding. Replaced with strict ``isMac && key.super ===
  true`` — kitty-style Cmd+B still works, Alt+B correctly
  rejected. Legacy-terminal mac users (Terminal.app without
  CSI-u) now get raw Ctrl+B only; the documented default still
  works everywhere.

* **ctrl / super branches accepted extra modifier bits.** The
  parser rejects multi-modifier configs like ``ctrl+alt+o``, but
  the runtime matcher was permissive — ``ctrl+o`` fired on
  Ctrl+Alt+O / Ctrl+Cmd+O, and ``super+b`` fired on Cmd+Alt+B /
  Ctrl+Cmd+B. Added strict ``!key.alt && !key.meta && key.super
  !== true`` on ctrl, and ``!key.ctrl && !key.alt && !key.meta``
  on super, so the runtime only fires the exact chord the parser
  would let you configure.

* **Dropped ``cmd`` / ``command`` aliases.** They parsed to
  ``super`` and rendered as ``Cmd+X``, but legacy macOS terminals
  report Cmd as ``key.meta`` (same signal as Alt), so a
  ``cmd+o`` config was advertised as working but never actually
  fired on Terminal.app-without-CSI-u. That recreated the
  "displayed shortcut does not work" problem this PR was meant to
  remove. Users who want the platform action modifier spell it
  ``super`` / ``win`` — that matches the unambiguous ``key.super``
  bit, and kitty-style macOS terminals render it as ``Cmd+X`` via
  platform-aware formatter.

Coverage updated:

* Default ctrl+b no longer fires on Alt+B via ``key.meta`` leak;
  raw Ctrl+B and kitty-style Cmd+B still fire.
* ``ctrl+o`` rejects Ctrl+Alt+O / Ctrl+Cmd+O / Ctrl+Meta+O chords.
* ``super+b`` rejects Cmd+Alt+B / Cmd+Meta+B / Ctrl+Cmd+B chords.
* ``cmd+b`` / ``command+b`` / ``meta+b`` all fall back to the
  documented default at parse time (joined the ambiguous-mac-mod
  rejection class).
* Round-2 expectations that asserted ``cmd+b`` parsed as super
  and accepted ``key.meta`` on darwin updated to reflect the new
  stricter contract.

Suite: 588/588 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.

* fix(tui): address Copilot follow-up on wire typing + escape precedence

Two follow-ups from the latest Copilot pass:

* **Config wire typing honesty (`gatewayTypes.ts`)**
  `config.get full` forwards raw `yaml.safe_load()` output, so
  `voice.record_key` can be any scalar/container when hand-edited.
  Typing it as `string` suggests a normalized contract that the
  backend does not guarantee and makes unsafe callers more likely.
  Change `ConfigVoiceConfig.record_key` to `unknown` with an
  explicit comment that callers must normalize at runtime.

* **Escape-based voice bindings were swallowed before voice check**
  `useInputHandlers()` handled `key.escape` for queue-edit cancel and
  selection clear before `isVoiceToggleKey(...)`, so configured
  `ctrl+escape` / `alt+escape` / `super+escape` chords were advertised
  but never toggled recording in those UI states.
  Add an early escape+voice check before generic Esc handlers so
  escape-based voice bindings win when configured, while plain Esc
  behavior remains unchanged.

Also updated PR #19835 description text to remove stale cmd/command
alias claims and match the current parser contract.

* fix(tui): pass configured voice shortcut through TextInput layer

Thread the live parsed voiceRecordKey into TextInput so configured voice.record_key chords bubble to useInputHandlers instead of being consumed as editor input. This removes the last hardcoded Ctrl+B pass-through in the composer path while preserving existing global control chord behavior.

* fix(tui): require explicit alt bit for escape-based alt chords

Hermes-ink reports bare Escape as meta=true+escape=true on some terminals, so a configured alt+escape binding was firing on bare Esc. Require an explicit key.alt bit when the configured named key is escape so plain Esc stays plain Esc; kitty-style alt+escape still fires.

* fix(tui): harden voice.record + TextInput paste + super-mod reserved list

Three round-7 Copilot follow-ups on #19835:

- voice.record start handler used _load_cfg().get('voice', {}).get(...) without
  shape checks, so malformed YAML (bool/scalar/list) returned 5025 instead of
  using VAD defaults. Centralized _voice_cfg_dict() helper and type-guarded
  silence_threshold/silence_duration with numeric fallbacks.
- TextInput pass-through check moved above paste/copy handling so configured
  voice chords (ctrl+v / alt+v / cmd+v) beat the composer's paste/copy
  defaults.
- parser now also rejects super+{c,d,l,v} — on macOS those are
  copy/exit/clear/paste and would be advertised in /voice status but never
  actually toggle recording.

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* fix(tui): round-8 Copilot review — allow ctrl+x, gate super reservations to macOS, preserve voice key on transient RPC failure

Three round-8 Copilot follow-ups on #19835:

- Revert ctrl+x addition to _RESERVED_CTRL_CHARS (landed via Copilot Autofix
  commit 731ec86): ctrl+x is only claimed during queue-edit
  (queueEditIdx !== null), so voice works the rest of the session and
  matches CLI ctrl+<letter> parity.
- Gate super+{c,d,l,v} reservation to isMac. Linux/Windows TUI globals key
  off Ctrl, so kitty/CSI-u super+<letter> configs don't collide on non-mac
  and should stay usable.
- applyDisplay() now skips setVoiceRecordKey when cfg is null so one
  transient quietRpc() failure after a config edit doesn't clobber the
  cached binding back to Ctrl+B until the next successful poll.

New coverage:
- parseVoiceRecordKey preserves ctrl+x on linux
- super+{c,d,l,v} rejected on darwin, allowed on linux
- applyDisplay(null, ...) leaves voiceRecordKey untouched

* fix(cli,tui): normalize voice.record_key aliases across CLI + TUI for parity

Round-9 Copilot review on #19835: TUI accepted control+/option+/opt+/super+/win+ aliases but the classic CLI only rewrote literal ctrl+/alt+ before handing to prompt_toolkit, so a TUI-valid config silently bound a different (or no) shortcut in the CLI.

- Added normalize_voice_record_key_for_prompt_toolkit() in hermes_cli/voice.py with a single alias table (ctrl/control/alt/option/opt → c-/a-).
- Wired it into all three cli.py sites (_enable_voice_mode hint, _show_voice_status display, and the prompt_toolkit binding in _register_voice_handler).
- /voice status display now renders control+x as Ctrl+X and option+x as Alt+X (canonical casing) to match TUI formatVoiceRecordKey.
- super/win/windows are intentionally left unchanged: prompt_toolkit has no super modifier, so the CLI will reject them loudly at startup rather than silently binding Ctrl+B. Documented this split at both the TUI _MOD_ALIASES comment and the CLI normalizer docstring.
- Added tests covering ctrl/control/alt/option/opt mapping, case-insensitivity, non-string fallback, empty-string fallback, and super/win pass-through.

* fix(cli): port TUI parser contract into CLI voice.record_key normalizer

Round-10 Copilot review on #19835.

hermes_cli/voice.py's normalize_voice_record_key_for_prompt_toolkit() previously did blind substring replacement with no trim/validate step, so the CLI diverged from the TUI parser on:
- whitespace ('ctrl + b' -> 'c- b' instead of 'c-b')
- typoed named keys ('ctrl+spcae' passed through as 'c-spcae' and prompt_toolkit would reject at startup)
- bare-char configs ('o' should fall back, not pass through as 'o')
- multi-modifier chords ('ctrl+alt+r')
- reserved ctrl chars ('ctrl+c/d/l')
- unknown modifiers ('meta+b' / 'shift+b')
- named-key aliases ('return'/'esc'/'bs'/'del' not collapsed to prompt_toolkit canonicals)

Port the TUI parser contract into Python (_VOICE_MOD_ALIASES, _VOICE_NAMED_KEYS, _VOICE_RESERVED_CTRL_CHARS) so one config value binds the same shortcut in both runtimes.

Also added format_voice_record_key_for_status() shared between the PTT hint and /voice status display. Non-string scalars (voice.record_key: true / 1) now surface as 'Ctrl+B' instead of the raw scalar — /voice status no longer advertises a shortcut that can never bind.

Tests: 29/29 in test_voice_wrapper.py, including 11 new regressions covering whitespace, named-key aliases, typos, bare-char, multi-modifier, reserved ctrl, unknown mods, non-string fallback, and formatter contract.

* fix(cli): shape-safe voice config read + graceful super/win fallback

Round-11 Copilot review on #19835.

Two remaining cross-runtime gaps:

1. load_config().get('voice', {}) still assumed voice was a dict, so a hand-edited voice: true / voice: cmd+b at the top level raised AttributeError before the voice UI could start. Added voice_record_key_from_config(cfg) to hermes_cli/voice.py that isinstance-guards both the root and the voice subkey. All three cli.py read sites (_enable_voice_mode hint, _show_voice_status, PTT binding) now use it.

2. The CLI normalizer previously passed super+/win+/windows+ through unrewritten so prompt_toolkit would reject them loudly at startup — but that crash was a worse UX than a silent fallback. Normalizer now returns c-b for those spellings, and the PTT binding site logs a warning so users see why their TUI-only shortcut isn't binding in the CLI.

Coverage: 34/34 in tests/hermes_cli/test_voice_wrapper.py (5 new cases for voice_record_key_from_config + malformed-root + malformed-voice + extractor/normalizer composition).

* fix(cli): self-audit cleanup — remaining voice-config shape safety + doc drift

Self-review of the voice.record_key change set turned up four remaining items Copilot would very likely flag next round:

1. cli.py _voice_start_continuous still read load_config().get('voice', {}).get('silence_threshold') without an isinstance guard, so a hand-edited voice: true / voice: cmd+b (non-dict) raised AttributeError on VAD recording start. Shape-safe coerce the voice dict and numeric-guard silence_threshold/silence_duration.

2. cli.py _enable_voice_mode's auto_tts check had the same bug — fixed with the same isinstance guard.

3. hermes_cli/voice.py module comment on _VOICE_MOD_ALIASES still said super/win/windows 'pass through unchanged and prompt_toolkit's add() call loudly rejects them at startup'. Round 11 changed the normalizer to silently fall back to c-b with a warning at the binding site; updated the comment to match.

4. ui-tui/src/lib/platform.ts header comment had the same stale 'CLI will loudly reject them at startup' claim; updated to 'falls back to the documented default and logs a warning'.

No behavior change on the code paths already covered by test_voice_wrapper.py; the two cli.py fixes are defensive against malformed YAML that previous rounds already hardened in tui_gateway/server.py but missed in the classic CLI.

* fix(cli,tui): round-12 Copilot review — alt-collide on mac, bool-in-int guards, voice UI hardcodes, mtime-reload test

Five round-12 Copilot review items on #19835:

1. platform.ts: hermes-ink reports Alt as key.meta on many terminals; isActionMod on darwin accepts key.meta as the action modifier. So alt+c/d/l get claimed by isCopyShortcut / isAction('d')/'l') before the voice check. Reject those configs at parse time on macOS only (non-mac keeps them usable).

2. cli.py: four remaining hardcoded 'Ctrl+B' sites in voice-facing UI (_get_voice_status_fragments status bar, _voice_start_recording hints, _get_placeholder composer text) were still lying about non-default configs. Added self._voice_record_key_label() shared helper and wired it into all three sites.

3. server.py + cli.py: bool is a subclass of int, so isinstance(silence_threshold, (int, float)) accepted True/False from malformed YAML and forwarded 1/0 to the VAD engine. Exclude bool explicitly so boolean typos fall back to the documented 200 / 3.0 defaults.

4. useConfigSync.ts: extracted the config.get-full fetch+apply body into a shared hydrateFullConfig() helper. Both the initial hydration and mtime-reload paths now use it, so the polling/RPC wiring is exercised by direct unit tests (4 new cases: fresh apply, reapply on new value, transient RPC failure preserves cache, back-compat without voice setter).

5. Added alt+{c,d,l} rejection regressions on darwin + allow on linux, and bool-leak regressions for both silence_threshold and silence_duration in tests/test_tui_gateway_server.py.

Suite: 602/602 TUI vitest, 38/38 backend voice tests, typecheck + lints clean.

* fix(cli): cache voice record-key label at binding time + status-bar coverage

Round-13 Copilot review on #19835.

_voice_record_key_label() was reading live config on every render, which caused two problems:

1. prompt_toolkit registers the push-to-talk binding once at session start (@kb.add(_voice_key)); the binding does NOT re-read config. Editing voice.record_key mid-session would switch the status-bar / placeholder / recording-hint label to the new shortcut while the actual keybinding stayed on the startup chord — reintroducing the display/binding drift this whole PR is fighting.

2. Hot render path: during recording the UI is invalidated every 150ms, so re-loading + deep-merging config on every call added avoidable UI overhead.

Fix: cache the label at the same site that registers the prompt_toolkit binding via new set_voice_record_key_cache(raw_key). _voice_record_key_label() now just returns the cached value (falls back to 'Ctrl+B' before startup). Status/placeholder/hint are always in sync with the live binding; no config reload per render.

Also added 4 regression cases to tests/cli/test_cli_status_bar.py: configured ctrl+<letter> renders in both wide and compact status bars, configured named key (ctrl+space) renders in the recording hint, pre-startup absent cache falls back to Ctrl+B, and malformed configs (bool True) fall through the formatter to Ctrl+B.

Suite: 60/60 test_cli_status_bar + test_voice_wrapper, typecheck + lints clean.

* fix(cli): route /voice on + /voice status through startup-pinned label; mac alt+cdl parity

Round-14 Copilot review on #19835. All three comments legit:

1. _enable_voice_mode still formatted label from live load_config() — mid-session config edit would make /voice on announce the new shortcut while the prompt_toolkit binding stayed the startup chord. Use self._voice_record_key_label() (cached at binding time, round-13) so /voice on cannot drift from the live binding.

2. _show_voice_status had the same bug — /voice status reported live config instead of the pinned startup binding. Fixed the same way.

3. CLI normalizer accepted alt+c/alt+d/alt+l even though the TUI parser rejects them on macOS (Copilot round-12 — hermes-ink reports Alt as key.meta, isActionMod on darwin accepts it, collides with isCopyShortcut / isAction). Added _VOICE_RESERVED_ALT_CHARS_MAC = {c,d,l} gated to sys.platform == 'darwin' so a shared config like option+c falls back to c-b on both runtimes on macOS; non-mac still binds a-c.

Coverage: 4 new tests in test_voice_wrapper.py covering mac alt+cdl rejection, linux alt+cdl allowed, option/opt alias forms, and mac-specific exclusions for other alt letters. 62/62 in voice wrapper + status bar suites.

---------

Co-authored-by: Tranquil-Flow <tranquil_flow@protonmail.com>
Co-authored-by: asheriif <ahmedsherif95@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-05-04 15:49:28 -07:00
kshitij 109c3e468c fix(terminal): guard background process spawn against deleted cwd (#19933)
Follow-up to #19928 which fixed the foreground path in _run_bash.
The background process spawn in process_registry.py had the same
vulnerability: Popen(cwd=session.cwd) and PtyProcess.spawn(cwd=...)
would raise FileNotFoundError if the directory was deleted.

Apply _resolve_safe_cwd() at session creation time so both the PTY
and pipe-mode Popen paths receive a validated cwd.
2026-05-04 15:35:34 -07:00
briandevans 9fa3a093f2 fix(local): test root as ancestor candidate; use real pipe for fake stdout
Address Copilot review on PR #17569:

1. _resolve_safe_cwd never tested the filesystem root because the loop
   exited when `os.path.dirname(parent) == parent`, which is true once
   `parent == '/'`. Restructure so the root is checked before the
   self-equal exit. Adds `test_returns_root_when_only_root_exists` —
   regression-guarded by reverting the loop and watching it fail.

2. The fake `Popen.stdout` was a `MagicMock`; `BaseEnvironment._wait_for_process`
   calls `proc.stdout.fileno()` then `select.select`/`os.read` against it,
   which raised `TypeError: fileno() returned a non-integer` (visible as a
   thread exception in test output) and could in theory read from an
   unrelated real fd. Hand `fake_popen` a real `os.pipe()` with the write
   end pre-closed so the drain loop sees EOF immediately. Helper records
   each fd so the test cleans up after itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:31:47 -07:00
briandevans 9644b8ae67 fix(local): recover when persistent_shell cwd is deleted (#17558)
When a tool call deletes its own working directory (`cd /tmp/foo &&
rm -rf /tmp/foo`), the next `subprocess.Popen(args, cwd=self.cwd)` raised
`FileNotFoundError: [Errno 2]` before bash even started — every subsequent
terminal/file-tool call hit the same wedge until the gateway restarted.

Fix in `LocalEnvironment._run_bash`: before handing `self.cwd` to Popen,
resolve a safe alternative when the path is gone (walk up to the nearest
existing ancestor, falling back to `tempfile.gettempdir()` only as a last
resort). Log a warning so the recovery is visible — not silent — and
update `self.cwd` so the next call doesn't repeat the message.

Defense in depth in `LocalEnvironment._update_cwd`: only adopt the new
cwd when it still exists as a directory. `pwd -P` from a deleted cwd can
leave a stale value in the marker file; refusing to store a missing path
keeps `self.cwd` valid by construction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:31:47 -07:00
Teknium b8fb9270c4 refactor(cli): drop dead c-S-c key binding (follow-up to #19895) (#19919)
#19884 added a prompt_toolkit key binding for Ctrl+Shift+C to
"prevent Hermes from intercepting the keystroke as an interrupt
signal." #19895 then wrapped the binding in try/except after
discovering it crashed startup with ValueError on every platform.

Both PRs were based on a misreading of how terminal key events
propagate:

1. Terminal emulators (GNOME Terminal, iTerm2, kitty, Windows Terminal,
   etc.) intercept Ctrl+Shift+C before the keystroke reaches the
   application's stdin. prompt_toolkit never sees it. The binding
   could never have intercepted anything.

2. prompt_toolkit's key spec parser doesn't recognise 'c-S-c' on any
   platform — the Shift modifier is meaningless on control-sequence
   keys. Verified: every prompt_toolkit version raises 'Invalid key:
   c-S-c' at registration time.

The handler is dead code. Delete it and leave a comment explaining
why no binding is needed here. Ctrl+Q alias (#19884's other addition)
stays — that's a real prompt_toolkit key and a legitimate interrupt
shortcut.

Verified the CLI starts cleanly — key binding phase no longer raises
and the subsequent chat flow reaches the provider setup check without
error.
2026-05-04 14:49:38 -07:00
Teknium 56a78e74b2 feat(kanban-dashboard): sharper home-channel toggle contrast, drop → running action (#19916)
Follow-up polish to the kanban dashboard from #19864 and #19705.

**Home-channel toggle contrast.** The `.hermes-kanban-home-sub--on`
class previously used `color-mix(var(--color-ring) 14%, transparent)`
which was nearly invisible on both the default teal and NERV themes —
the on/off distinction relied almost entirely on the ✓ prefix glyph.
Bump to 32% fill + full-opacity ring border + inner ring shadow +
font-weight 600. Still theme-scoped (no hardcoded colors), but reads
at a glance on both tested themes.

**Drop the → running status action.** Since #19705, `PATCH /tasks/:id`
rejects `status=running` with HTTP 400 — only the dispatcher's
`claim_task` path legitimately enters that state (so the run row,
claim lock, and worker PID are created atomically). The UI button was
still present and produced a 400 on click, which is a confusing dead
affordance. Remove it from `StatusActions`; add a comment pointing to
#19535 so future editors know why it's missing.

Live-tested on the default Hermes Teal theme. 53/53 kanban dashboard
plugin tests still pass.
2026-05-04 14:48:19 -07:00
nftpoetrist 429b8eceb4 fix(cli): guard c-S-c key binding with try/except to prevent startup crash (#19895)
PR #19884 added @kb.add('c-S-c') unconditionally. prompt_toolkit raises
ValueError("Invalid key: c-S-c") during HermesCLI.__init__ on platforms
where this key spec is not recognised — the process exits before reaching
the prompt loop. Reported on macOS (#19894) and Linux (#19896) immediately
after #19884 landed.

Fix: wrap the registration in try/except ValueError so that startup
continues cleanly on any platform/version that rejects the spec. Where
the spec is accepted the binding is registered normally as a no-op,
allowing the terminal to handle Ctrl+Shift+C natively as before.

Fixes #19894
Fixes #19896
2026-05-04 14:45:01 -07:00
Rames Jusso e493b1c482 docs(skill): add hyperframes inspect command to cli.md + SKILL.md
- references/cli.md: add Inspect step (5/7) to Workflow + dedicated `## inspect` section between validate and preview, covering --json/--samples/--at flags and the legacy `hyperframes layout` alias
- SKILL.md: rename procedure step 7 to "Lint, validate, inspect, preview, render" with the full pipeline; explain inspect as the layout-side companion to validate (catches overflow / off-frame / occluded text issues that static lint can't see)
- SKILL.md verification: lint + validate + inspect as a single combined pass
- SKILL.md References list: include `inspect` in the cli.md command list

Brings the optional skill in sync with hyperframes-oss main as of 2026-05-03 — `inspect` was added in heygen-com/hyperframes#480 (2026-04-25) and is documented as a real workflow step in skills/hyperframes-cli/SKILL.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:13:17 -07:00
James 20859cc408 docs(skill): sync hyperframes skill with upstream changes
Pulls the hyperframes skill up to the latest state of heygen-com/hyperframes
skill content. Opened 2026-04-17; upstream has shipped CLI, layout, and path
changes since.

- SKILL.md: promote the visual-style check to a proper HARD-GATE
  (DESIGN.md > named style > ask 3 questions, with the #333/#3b82f6/Roboto
  tells); expand Step 6 to cover audio-reactive (mandatory per-frame
  tl.call() sampling loop — a single long tween does NOT react to audio),
  caption exit guarantee (hard tl.set kill after group.end), marker
  highlighting, and scene transitions; add the animation-map script to
  Verification; link the new features.md.

- references/cli.md: add capture and validate (both shipped commands, both
  referenced from the workflow but missing from the reference). Add
  --lang to tts with the voice-prefix auto-inference table and espeak-ng
  dependency note (heygen-com/hyperframes#351, 2026-04-20 — after this
  PR opened).

- references/website-to-video.md: update all paths to the capture/
  subfolder layout introduced in heygen-com/hyperframes#345
  (capture/screenshots/, capture/assets/, capture/extracted/tokens.json).
  Old captured/ prefix was broken — agents following the skill were
  looking for files in wrong locations.

- references/features.md (new): distilled coverage for captions (language
  rule, tone table, word grouping, fitTextFontSize, exit guarantee), TTS
  (multilingual phonemization, speed tuning), audio-reactive (data
  format, mapping table, sampling pattern), marker highlighting
  (highlight/circle/burst/scribble/sketchout), and transitions (energy/
  mood tables, presets, shader-compatible CSS rules). Five topics the
  original PR didn't cover.
2026-05-04 14:13:17 -07:00
James 50aabb9eb2 feat(skill): add hyperframes optional creative skill
Adds an optional creative skill that integrates HyperFrames, an
HTML-based video rendering framework, as a sibling to manim-video.
Complements manim's math-focused animation with motion-graphics,
captioned narration, audio-reactive visuals, shader transitions, and
website-to-video production.

Scope:
- optional-skills/creative/hyperframes/SKILL.md      — entry point
- references/composition.md                          — data-attr schema, timeline contract
- references/cli.md                                  — every npx hyperframes command
- references/gsap.md                                 — GSAP core API for compositions
- references/website-to-video.md                     — 7-step capture-to-video workflow
- references/troubleshooting.md                      — OpenClaw / Chromium 147 fix
- scripts/setup.sh                                   — idempotent one-time setup

OpenClaw / Chromium 147 fix (hyperframes#294):
Pinning hyperframes@>=0.4.2 (commit 4c72ba4 ships the
HeadlessExperimental.beginFrame auto-detect + screenshot fallback).
setup.sh pre-caches chrome-headless-shell so the fast BeginFrame path
is preferred over system Chrome. The PRODUCER_FORCE_SCREENSHOT=true
escape hatch is documented in troubleshooting.md and in SKILL.md
Pitfalls.

Placed under optional-skills/ (not bundled) per CONTRIBUTING.md
guidance for heavyweight deps: requires Node.js >= 22, FFmpeg, and
~300 MB chrome-headless-shell download.
2026-05-04 14:13:17 -07:00
Teknium 8fabef9d35 fix(docs): register cron-script-only guide in sidebar (#19893)
PR #19709 added website/docs/guides/cron-script-only.md but never added the entry to website/sidebars.ts, which is explicitly enumerated (not autogenerated). Two consequences:

1. The guide didn't show up in the left-nav "Guides & Tutorials" list — users could only reach it via cross-links from other pages.
2. Landing on the guide page directly made the sidebar disappear entirely (Docusaurus treats unregistered docs as orphaned and renders them without their parent sidebar).

Added 'guides/cron-script-only' next to 'guides/automate-with-cron' so it slots in alongside the other cron content. Verified with `npm run build`: no orphan warnings, no broken links, page builds with sidebar intact.

No content change, docs only.
2026-05-04 12:57:01 -07:00
briandevans 81cd678291 fix(google-workspace): restore required_credential_files in SKILL.md (#16452)
PR #9931 ("feat(google-workspace): add --from flag for custom sender display name")
accidentally removed the required_credential_files frontmatter block that tells
hermes to bind-mount google_token.json and google_client_secret.json into Docker
and Modal remote terminals before running setup.py.

Without this header the credential files are never registered in the session-scoped
ContextVar, so get_credential_file_mounts() returns an empty list at container
creation time and the OAuth files are invisible inside the sandbox.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:43:14 -07:00
briandevans 60b143e9df fix(tui_gateway): guard sys.path against local package shadowing (#15989)
When the TUI backend (tui_gateway/entry.py) is spawned by Node.js with the
user's CWD containing a local utils/ directory, that directory shadows the
installed utils module, causing ImportError in run_agent and hermes_cli.

Strip '' and '.' from sys.path and prepend HERMES_PYTHON_SRC_ROOT (already
set by hermes_cli before spawning the subprocess) so installed packages
always win over CWD artifacts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 12:42:43 -07:00
Harry Riddle 645a2f482d fix(cli): fix shortcut config conflict in hermes_cli 2026-05-04 12:41:05 -07:00
Steven Chanin a919269eb5 fix(skills/email/himalaya): document v1.2.0 folder.aliases syntax
The bundled himalaya skill documented folder aliases using a stale
TOML schema (`[accounts.NAME.folder.alias]`, singular) that himalaya
v1.2.0 silently ignores. The TOML parses without error, but the
alias resolver never reads the sub-section — every lookup then falls
through to the canonical folder name.

Source: in `pimalaya/core` (the `email-lib` crate himalaya v1.2.0
depends on, currently v0.27.0), `email/src/folder/config.rs` defines
`FolderConfig { aliases: Option<HashMap<String, String>>, ... }`
(plural, no `#[serde(rename)]`/`alias` aliases, no
`deny_unknown_fields`), and `account/config/mod.rs::get_folder_alias`
returns the input verbatim when no alias is found. So the singular
`alias` key deserializes to nothing and lookups silently fall
through.

On Gmail (where `sent` resolves to `[Gmail]/Sent Mail`, not `Sent`)
this means save-to-Sent fails *after* SMTP delivery already
succeeded, and `himalaya message send` exits non-zero. Any caller
(agent, script, user) that retries on that exit code will re-run
the entire send — including SMTP — producing duplicate emails to
recipients. Silent ignore + caller-level retry is significantly
worse than a config that just doesn't work.

This commit updates SKILL.md and references/configuration.md to the
v1.2.0 `folder.aliases.X` syntax (plural, dotted keys, directly
under the account section), adds a Gmail-specific block with the
`[Gmail]/Sent Mail`-style mapping, and adds notes on the failure
mode so future readers don't hit the same trap. SKILL.md version
bumped 1.0.0 → 1.1.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:39:49 -07:00
Teknium 9cda237bb1 docs(cron): lead with agent-driven setup for no-agent mode (#19871)
The shipped no-agent docs introduced the feature via CLI first and
mentioned the chat path as a two-line afterthought. That buries the
actual value prop: the cronjob tool exposes no_agent directly to the
agent, so a user can describe a watchdog in plain language and Hermes
wires up the script + schedule + delivery without anyone opening an
editor.

Changes:

* cron-script-only.md: promote 'Create One from Chat' above
  'Create One from the CLI', flesh it out with a worked transcript
  (the actual tool calls the agent makes), add subsections covering
  'what the agent decides for you' (when to pick no_agent=True vs
  LLM mode) and 'managing watchdogs from chat' (pause/resume/edit/
  remove all agent-accessible).

* user-guide/features/cron.md:
  - Add 'no-agent mode' to the top-level feature list with a cross-
    link, plus a sentence up top making it clear everything is
    agent-accessible through the cronjob tool.
  - Add 'The agent sets these up for you' subsection to the no-agent
    section showing the exact tool call shape.

* automate-with-cron.md: tighten the existing tip box to mention the
  agent-driven path, not just CLI scheduling.

No behavior change — docs only.
2026-05-04 12:39:19 -07:00
briandevans eadf34633e fix(models): strip :cloud/-cloud suffix from models.dev Ollama Cloud IDs
models.dev appends :cloud and -cloud suffixes to Ollama Cloud model IDs
(e.g. kimi-k2.6:cloud, qwen3-coder:480b-cloud) that the live Ollama Cloud
API does not use. Without normalisation, these suffixed IDs bypass the
dedup check and appear alongside the correct clean IDs, causing 400/404
errors when users select them in /model or hermes model.

Add _strip_ollama_cloud_suffix() and apply it to mdev entries before the
dedup merge in fetch_ollama_cloud_models() so all model IDs stored in the
disk cache use the canonical form the API accepts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:38:15 -07:00
Yoimex c050ee6573 fix(file_ops): resolve search_files path/line collision for hyphenated numeric filenames 2026-05-04 12:37:47 -07:00
Ricardo-M-L fbc477df71 fix(run_agent): acquire lock in IterationBudget.used property
The `used` property was reading `self._used` without holding the lock,
while `consume()`, `refund()`, and `remaining` all properly acquire
`self._lock` before accessing `_used`. This means a concurrent call to
`used` during `consume()` or `refund()` could observe a partially-
updated value, leading to incorrect iteration budget metrics reported
to the gateway, or in extreme cases a ValueError from CPython's list
implementation when the internal array resizes during iteration.

Fix: acquire the lock in `used` just like `remaining` does.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-04 12:37:28 -07:00
ClawdIA 64ad7dec0d fix(file-ops): allow file search in hidden roots 2026-05-04 12:37:09 -07:00
briandevans 9e2628ee7c test(discord): annotate make_attachment content_type as Optional[str]
Copilot review: the helper accepted None in one test but was annotated str.
Matches actual usage where no-content-type attachments are a tested scenario.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:36:47 -07:00
Ioodu 1c7f47a58c fix(cron): add concurrency regression test for parallel job state writes
get_due_jobs() called load_jobs() and save_jobs() without holding
_jobs_file_lock, creating a race with the locked mark_job_run() and
advance_next_run(). Wrap get_due_jobs() with the lock (delegating to a
new _get_due_jobs_locked() inner function) so all load→modify→save
cycles are serialised. Add two regression tests: one verifying 3
concurrent mark_job_run() calls each land their correct last_status and
last_run_at without overwrites, and a stress test confirming 10 parallel
calls each increment their job's completed count to exactly 1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:36:29 -07:00
lhysdl 6875471916 fix(tts): update MiniMax API endpoint to v1/text_to_speech
MiniMax deprecated the old v1/t2a_v2 endpoint (api.minimax.io) and
moved to v1/text_to_speech (api.minimax.chat). The new API:

- Uses a flat payload: {model, text, voice_id} instead of nested
  voice_setting / audio_setting objects
- Returns raw audio bytes (Content-Type: audio/mpeg) instead of
  JSON with hex-encoded audio
- Uses model 'speech-01' instead of 'speech-2.8-hd'
- Updated default voice_id to 'female-shaonv' for Chinese TTS

The implementation detects Content-Type to handle both old and new
API responses, maintaining backward compatibility for any users who
manually configured the legacy base_url.
2026-05-04 12:36:09 -07:00
briandevans 75bce317a3 fix(cron): expand \${VAR} refs in config.yaml during job execution (#15890)
The cron scheduler's run_job() loaded config.yaml with yaml.safe_load()
but never called _expand_env_vars(), so ${HERMES_MODEL} and similar
references in model:, fallback_providers:, and other config.yaml fields
were forwarded to the LLM API as literal strings, causing HTTP 400 errors.

The normal CLI path has always called _expand_env_vars() via load_config(),
so this was a cron-only gap. The .env load at the top of run_job() already
populates os.environ before config.yaml is read, so the expansion sees the
correct values.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:35:46 -07:00
Albert.Zhou fd9c32c0f2 fix(email): drop non-allowlisted senders before dispatch to prevent mail loops
Add EMAIL_ALLOWED_USERS check in EmailAdapter._dispatch_message()
to silently discard emails from senders not in the allowlist.  This
prevents the adapter from creating thread context and dispatching a
MessageEvent for unauthorized senders, which could race with the
gateway authorization check and result in SMTP replies being sent
despite the handler returning None.

Test: tests/gateway/test_email.py::TestDispatchMessage::test_non_allowlisted_sender_dropped
Test: tests/gateway/test_email.py::TestDispatchMessage::test_allowlisted_sender_proceeds
Test: tests/gateway/test_email.py::TestDispatchMessage::test_empty_allowlist_allows_all
2026-05-04 12:35:22 -07:00
briandevans 20edca75e9 fix(update): sync bundled skills to all profiles, including active (#16176)
`hermes update` iterated only non-active profiles when seeding bundled
skills. `seed_profile_skills()` uses a subprocess with an explicit
HERMES_HOME so it correctly targets any profile path; the `p.name !=
active` filter was the only thing preventing the active profile from
being included, leaving it silently on stale skill content after every
update.

Drop the filter and update the header line from "other profiles" to
"all profiles". The active profile is now seeded on the same path as
every other profile. The earlier `sync_skills()` call (module-level
HERMES_HOME) remains for backward compatibility; the subprocess-based
loop is reliable regardless of which HERMES_HOME the CLI was invoked
with.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:34:53 -07:00
jjjojoj 103f51ad34 fix(doctor): check gh auth status when GITHUB_TOKEN absent
hermes doctor showed 'No GITHUB_TOKEN (60 req/hr)' warning even when
users had authenticated via gh auth login. Now falls back to
gh auth status --json authenticated when GITHUB_TOKEN and GH_TOKEN
are both unset.

Fixes #16115
2026-05-04 12:34:31 -07:00
fiver 8ab9f61dcf fix(gateway): preserve WSL interop PATH in systemd units 2026-05-04 12:34:06 -07:00
Teknium d90f73bcec fix(gateway): use git HEAD SHA, not file mtimes, for stale-code check (#19740)
The stale-code self-check (Issue #17648) used sentinel-file mtimes to
decide whether the gateway survived a `hermes update` with stale
`sys.modules`. That signal false-positives on any write to the
sentinel files — including agent-driven edits during Hermes-on-Hermes
dev sessions. Telling the agent to patch `run_agent.py` would flip
the check to True on the next user message and force a gateway
restart even though no update happened.

Switch the signal to `git rev-parse HEAD`. Agent file edits don't
move HEAD; `hermes update` (git pull) always does. Reading .git/HEAD
directly (no subprocess) with a 5s cache keeps the overhead negligible
on bursty chats. Non-git installs short-circuit to False — the
stale-modules class can't occur without a git-backed update path, so
there's nothing to detect.

The legacy `_compute_repo_mtime` helper is kept but unused by
detection, reserved as a fallback hook for future pip-install update
paths.

- _read_git_head_sha(): resolves HEAD across main checkout, worktree
  (follows `gitdir:` + `commondir` pointers), and packed-refs layouts.
- _current_git_sha_cached(): per-runner 5s SHA cache.
- _detect_stale_code(): boot SHA vs current SHA, returns False when
  either is unavailable.
- Tests cover all four layouts, the agent-edits-don't-trigger
  regression, and cache behavior.

Refs #17648.
2026-05-04 12:33:21 -07:00
Teknium a21f364ad7 chore(release): AUTHOR_MAP entries for Tier 1g salvage batch 2026-05-04 12:32:10 -07:00
Teknium 1c7c7c3c5f feat(kanban-dashboard): per-platform home-channel notification toggles (#19864)
* revert: auto-subscribe gateway chat on tool-driven kanban_create (#19718)

Reverts ff3d2773e2. Teknium reviewed the merged PR and decided this
behavior isn't wanted — tool-driven kanban_create should not mirror
the slash-command path's auto-subscribe. Orchestrators that want
their originating chat notified can call kanban_notify-subscribe
explicitly; we're not going to make it implicit.

* feat(kanban-dashboard): per-platform home-channel notification toggles

Adds a "Notify home channels" section to the task drawer in the kanban
dashboard plugin. Each platform where the user has set a home channel
(/sethome, TELEGRAM_HOME_CHANNEL env var, gateway.platforms.<p>.home_channel
in config.yaml) gets a toggle pill. Toggling on writes a kanban_notify_subs
row keyed to that platform's home (chat_id + thread_id); toggling off
removes it. The existing gateway notifier watcher delivers completed /
blocked / gave_up events without any new plumbing — this is purely a GUI
surface over existing machinery.

Replaces the reverted auto-subscribe behavior from #19718 with an explicit,
per-task, per-platform, user-controlled opt-in. No implicit subscription
on tool-driven kanban_create; no CLI commands; no slash commands. Just a
toggle in the drawer.

Backend (plugins/kanban/dashboard/plugin_api.py):
- GET  /api/plugins/kanban/home-channels[?task_id=X]
  Returns every platform with a configured home, plus a per-entry
  subscribed: bool relative to task_id (false when task_id omitted).
  Reads the live GatewayConfig via load_gateway_config() so env-var
  overlays stay honored.
- POST /api/plugins/kanban/tasks/:id/home-subscribe/:platform
  Idempotent add_notify_sub keyed to the platform's home.
- DELETE /api/plugins/kanban/tasks/:id/home-subscribe/:platform
  remove_notify_sub for the same tuple.
- 404 when the platform has no home configured, or task_id doesn't
  exist (POST only).

Frontend (plugins/kanban/dashboard/dist/index.js):
- TaskDrawer fetches /home-channels on open, keyed on task_id.
- HomeSubsSection renders nothing when zero platforms have a home (so
  users who haven't set one up don't see an empty UI block).
- Optimistic toggle with busy flag + revert-on-failure. One pill per
  platform; ✓ prefix and --on class indicate the subscribed state.

CSS (plugins/kanban/dashboard/dist/style.css):
- .hermes-kanban-home-subs flex row + .hermes-kanban-home-sub pill
  style + --on subscribed variant (subtle ring-colored background).

Live-tested against a dashboard with TELEGRAM + DISCORD_BOT_TOKEN /
HOME_CHANNEL env vars set: drawer shows both pills, toggling each
flips its visual state AND writes/removes the correct kanban_notify_subs
row (verified via direct DB read).

Tests (tests/plugins/test_kanban_dashboard_plugin.py, 11 new, 53/53
pass total):
- home-channels lists only platforms with a home (slack with a
  token but no home is excluded)
- no task_id -> all subscribed=false
- subscribe creates notify_sub row with correct chat/thread/platform
- subscribed=true reflected in subsequent GET
- idempotent re-subscribe
- unknown platform -> 404
- unknown task -> 404
- unsubscribe removes the row
- telegram + discord subscribe/unsubscribe independent
- zero homes -> empty list
2026-05-04 12:31:21 -07:00
emozilla 2bc82bb504 clarify placeholder telegram credential in tests 2026-05-04 15:31:15 -04:00
Teknium 3db6b9cc87 feat(cron): add no_agent mode for script-only cron jobs (watchdog pattern) (#19709)
* feat(cron): add no_agent mode for script-only cron jobs (watchdog pattern)

Adds a no_agent=True option to the cronjob system. When enabled, the
scheduler runs the attached script on schedule and delivers its stdout
directly to the job's target — no LLM, no agent loop, no token spend.
This is the classic bash-watchdog pattern (memory alert every 5 min,
disk alert every 15 min, CI ping) reimplemented as a first-class Hermes
primitive instead of a systemd timer + curl + bot token triplet living
outside the system.

## What

  hermes cron create "every 5m" \
    --no-agent \
    --script memory-watchdog.sh \
    --deliver telegram \
    --name memory-watchdog

Agent tool:

  cronjob(action='create',
          schedule='every 5m',
          script='memory-watchdog.sh',
          no_agent=True,
          deliver='telegram')

Semantics:
- Script stdout (trimmed) → delivered verbatim as the message
- Empty stdout          → silent tick (no delivery; watchdog pattern)
- wakeAgent=false gate  → silent tick (same gate LLM jobs use)
- Non-zero exit/timeout → delivered as an error alert
                          (broken watchdogs shouldn't fail silently)
- No LLM ever invoked; no tokens spent; no provider fallback applied

## Implementation

cron/jobs.py
  * create_job gains no_agent: bool = False
  * prompt becomes Optional (no_agent jobs don't need one)
  * Validation: no_agent=True requires a script at create time
  * Field roundtrips via load_jobs / save_jobs / update_job

cron/scheduler.py
  * run_job: new short-circuit branch at the top that runs the script,
    wraps its output into the (success, doc, final_response, error)
    tuple downstream delivery already expects, and returns before any
    AIAgent import or construction
  * _run_job_script: picks interpreter by extension — .sh/.bash run
    under /bin/bash, anything else under sys.executable (Python).
    Shell support unlocks the bash-watchdog pattern without wrapping
    scripts in Python. Extension is explicit; we deliberately do NOT
    trust the file's own shebang. Path-containment guard (scripts dir)
    unchanged.

tools/cronjob_tools.py
  * Schema: new no_agent boolean property with clear trigger guidance
  * cronjob() accepts no_agent and validates mode-specific shape:
    - no_agent=True requires script; prompt/skills optional
    - no_agent=False keeps the existing 'prompt or skill required' rule
  * update path rejects flipping no_agent=True on a job without a script
  * _format_job surfaces no_agent in list output
  * Handler lambda forwards no_agent from tool args

hermes_cli/main.py, hermes_cli/cron.py
  * 'hermes cron create --no-agent' and edit's --no-agent / --agent
    pair for toggling at CLI parity with the agent tool
  * Existing --script help text updated to describe both modes
  * List / create / edit output now shows 'Mode: no-agent (...)' when set

## Tests

tests/cron/test_cron_no_agent.py — 18 tests covering:
  * create_job: no_agent shape, validation, field persistence
  * update_job: flag roundtrip across reload
  * cronjob tool: schema validation, update toggling, mode-specific
    requirements, prompt-relaxation rule
  * run_job short-circuit:
    - success path delivers stdout verbatim
    - empty stdout → SILENT_MARKER (no delivery downstream)
    - wakeAgent=false gate → silent
    - script failure → error alert
    - run_job does NOT import AIAgent (verified via mock)
  * _run_job_script:
    - .sh executes via bash (no shebang required)
    - .bash executes via bash
    - .py still runs via sys.executable (regression)
    - path-traversal still blocked (security regression)

All 18 new tests pass. 341/342 pre-existing cron tests still pass; the
one failure (test_script_empty_output_noted) was already broken on main
and is unrelated to this change.

## Docs

website/docs/guides/cron-script-only.md — new dedicated guide covering
the watchdog pattern, interpreter rules, delivery mapping, worked
examples (memory / disk alerts), and the comparison table vs hermes send,
regular LLM cron jobs, and OS-level cron.

website/docs/user-guide/features/cron.md — new 'No-agent mode' section
in the cron feature reference, cross-linked to the guide.

website/docs/guides/automate-with-cron.md — new tip box pointing users
to no-agent mode when they don't need LLM reasoning.

## Compatibility

- Existing jobs: unchanged. no_agent defaults to False, existing code
  paths untouched until the flag is set.
- Schema additive only; older jobs.json without the field load fine
  via .get() with False default.
- New CLI flags are opt-in and don't alter existing flag behavior.

* fix(cron): lazy-import AIAgent + SessionDB so no_agent ticks pay zero

The unconditional `from run_agent import AIAgent` + SessionDB() init at
the top of run_job() meant every no_agent tick still paid the full agent
module load cost (~300ms + transitive imports + DB open) even though it
never touched any of that machinery.

Move both to live under the default (LLM) path, after the no_agent
short-circuit has returned. Now a no_agent tick's sys.modules stays
clean — verified end-to-end:

    assert 'run_agent' not in sys.modules  # before
    run_job(no_agent_job)
    assert 'run_agent' not in sys.modules  # after

The existing mock-based unit test (test_run_job_no_agent_never_invokes_aiagent)
kept passing because patch() replaces the class AFTER import; the leak
was only visible via real subprocess-style verification. End-to-end
demo confirmed: agent calls cronjob(no_agent=True) → script runs →
stdout delivered → no LLM machinery loaded.

* docs(cron): tighten no_agent tool schema — defaults, silent semantics, pick rule

Previous description buried the important bits in one long sentence.
Agents could plausibly miss three things an LLM-facing schema should
make unmissable:

1. What the default is — now first sentence + JSON Schema `default: false`
2. What 'silent run' actually means for the user — now spelled out:
   'nothing is sent to the user and they won't see anything happened'
3. When to pick True vs False — now a concrete decision rule with
   examples on both sides (watchdogs/metrics/pollers → True;
   summarize/draft/pick/rephrase → False)

Also adds explicit 'prompt and skills are ignored when True' since the
agent could otherwise still pass them out of habit.

No behavior change — schema text only.
2026-05-04 12:31:01 -07:00
teknium1 d35efb9898 feat(telegram): /topic off + help + auth gate + screenshot debounce
Four production-readiness additions to topic mode:

1. /topic off — clean disable path. Flips telegram_dm_topic_mode.enabled
   to 0 and clears telegram_dm_topic_bindings for this chat. Previously
   users had to edit state.db with sqlite3 to turn the feature off.
   Idempotent: calling /topic off when the chat was never enabled
   returns a friendly no-op message.

2. /topic help — inline usage printed in the DM so users don't have to
   visit docs to discover /topic off, /topic <session-id>, etc.

3. Authorization gate. /topic mutates SQLite side tables and flips the
   root DM into a lobby, so the action must be authorized. Now calls
   self._is_user_authorized(source); unauthorized DMs get a refusal
   instead of activation. Defense in depth on top of the gateway's
   existing pre-route auth.

4. BotFather screenshot debounce. A user repeatedly running /topic
   while Threads Settings is still disabled would previously re-upload
   the same screenshot every time. Now rate-limited to one send per
   5 minutes per chat. /topic off resets the counter so re-enabling
   starts fresh.

Command-def args hint updated: /topic [off|help|session-id].

Docs:
- New /topic subcommands table at the top of the multi-session section
- Disable instructions updated to recommend /topic off first, with the
  raw SQL fallback kept for bulk cleanup
- Under-the-hood list extended with the capability-hint debounce and
  the authorization gate

Tests (6 new):
- /topic help returns usage and doesn't create topic tables
- /topic off disables mode AND clears bindings
- /topic off is idempotent when never enabled
- Unauthorized users get refusal, no tables created
- Capability-hint debounce is per-chat
- /topic off resets both lobby and capability debounce counters

All 402 targeted tests pass. Full gateway sweep: 4809/4810
(pre-existing test_teams::test_send_typing unrelated).
2026-05-04 12:07:17 -07:00
teknium1 1381c89e56 fix(telegram): polish topic mode — CASCADE, General-topic handling, rename guard, debounce
Five follow-ups to topic mode based on integration audit:

1. ON DELETE CASCADE on telegram_dm_topic_bindings.session_id. Session
   pruning (manual /delete, auto-cleanup, any future prune job) would
   have thrown 'FOREIGN KEY constraint failed' for sessions bound to a
   topic. Migration bumped to v2, rebuilds the bindings table in place
   if FK lacks CASCADE. Idempotent; only runs once per DB.

2. Never auto-rename operator-declared topics. If an operator has
   extra.dm_topics configured AND a user runs /topic, messages in those
   pre-declared topics would previously trigger auto-rename and silently
   mutate operator config. _rename_telegram_topic_for_session_title now
   early-returns when _get_dm_topic_info returns a dict for this
   (chat_id, thread_id). Uses class-based lookup (not hasattr) so
   MagicMock test fixtures don't accidentally trip the guard.

3. General topic handling. Telegram's General (pinned top) topic in a
   forum-enabled private chat may send messages with message_thread_id=1
   or omit thread_id entirely depending on client. Both are now treated
   as the root lobby, not a topic lane. Prevents users from
   accidentally burning a session on the General topic.

4. Debounce the root-lobby reminder. 30-second cooldown per chat so a
   user who forgets topic mode is enabled and types ten messages in the
   root gets one reminder, not ten. Explicit command replies
   (/new-in-lobby, /topic <session-id>) still land every time.

5. Docs: added under-the-hood invariants for the above, plus a
   Downgrade section explaining that rolling back to a pre-/topic
   Hermes build leaves the DB tables orphaned but harmless — DMs just
   revert to native per-thread isolation.

Tests:
- test_operator_declared_topic_is_not_auto_renamed
- test_general_topic_is_treated_as_root_lobby
- test_lobby_reminder_is_debounced_per_chat
- test_binding_survives_session_deletion_via_cascade
- test_migration_rebuilds_v1_binding_table_with_cascade_fk

Validated: 4803/4804 tests pass (tests/gateway/ + tests/test_hermes_state.py).
Sole failure is a pre-existing test_teams::test_send_typing flake
unrelated to this PR.
2026-05-04 12:07:17 -07:00
teknium1 1a9542cf75 docs(telegram): document /topic multi-session DM mode
Adds a new section 'Multi-session DM mode (/topic)' to the Telegram
messaging docs, covering:

- Comparison table vs the existing config-driven extra.dm_topics
- BotFather prerequisites (Threads Settings, user-create permission)
- Activation flow and root-DM lobby behavior
- End-user flow for creating topics via the + button / All Messages
- Auto-renaming when Hermes generates session titles
- /new semantics inside a topic
- /topic <session-id> restore of previous sessions
- Persistence layout (SQLite side tables)
- How to disable the feature

Also:
- New /topic row in the messaging slash-commands reference
- Updated Bot API 9.4 summary to point at both topic features
2026-05-04 12:07:17 -07:00
teknium1 a7683d04a9 fix(telegram): harden DM topic binding — persist through switch_session, rebind on /new
Follow-up on @EmelyanenkoK's feat: add Telegram DM topic-mode sessions.

Three issues:

1. Split-brain session state. After get_or_create_session() returned a
   SessionEntry for a topic lane, the handler was mutating
   .session_id in place to the binding's target, but never persisting
   the switch through SessionStore. The sessions.json session_key →
   session_id map kept pointing at the lane's natural id; any reader
   that reloaded from disk saw the wrong id. Fixed by routing through
   SessionStore.switch_session(), which _save()s the mapping and ends
   the old session in SQLite like /resume does.

2. /new inside a topic was a one-message no-op. Reset created a new
   session but left the telegram_dm_topic_bindings row pointing at the
   old session_id, so the next message's binding lookup switched right
   back. Now _handle_reset_command rebinds the topic to the new
   session_id after reset.

3. is_telegram_session_linked_to_topic and
   list_unlinked_telegram_sessions_for_user both called
   apply_telegram_topic_migration() on read, contradicting the PR's
   own invariant that migration only runs on explicit /topic opt-in.
   They now tolerate missing topic tables and return empty/False.

Also: _telegram_topic_mode_enabled() now only treats True as enabled
(not any truthy return), so test fixtures with MagicMock session_db
don't accidentally flip every DM into lobby mode — this was breaking
4 pre-existing test_status_command tests.

Tests:
- New regression: /new inside a topic must update the binding row
  (test_new_inside_telegram_topic_rewrites_binding_to_new_session).
- _make_runner now stubs switch_session so existing restore tests
  still exercise the new code path.

Validated end-to-end with real SessionDB + SessionStore:
readers on fresh DB don't create topic tables; enable creates them;
binding override persists across SessionStore restart; /new rebinds
and the new id survives a restart.

Co-authored-by: EmelyanenkoK <emelyanenko.kirill@gmail.com>
2026-05-04 12:07:17 -07:00
EmelyanenkoK 25065283b3 fix: improve telegram topic mode setup 2026-05-04 12:07:17 -07:00
EmelyanenkoK d6615d8ec7 feat: add Telegram DM topic-mode sessions 2026-05-04 12:07:17 -07:00
asheriif 0ce1b9fe20 fix(tui): preserve prompt separator width (#19340)
* fix(tui): preserve prompt separator width

* fix(tui): align transcript height estimates with prompt width
2026-05-04 09:58:40 -07:00
brooklyn! d9c090fe36 Merge pull request #19338 from asheriif/fix/tui-plugin-slash-exec-live
fix(tui): run plugin slash commands live
2026-05-04 09:57:45 -07:00
kshitijk4poor 54e78cadb2 test: add regression test for Teams interactive_setup import fix
Adapted from PR #19188 by @LeonSGP43 — mocks cli_output helpers and
verifies interactive_setup persists credentials to .env without
crashing. Also adds megastary to AUTHOR_MAP.
2026-05-04 06:54:27 -07:00
megastary 38adfebe78 fix(teams): import prompt/print helpers from cli_output, not config
The Teams adapter's interactive_setup() tried to import prompt,
prompt_yes_no, print_info, print_success, and print_warning from
hermes_cli.config, but those helpers live in hermes_cli.cli_output.
Only get_env_value/save_env_value live in hermes_cli.config.

This caused 'hermes setup' to crash with ImportError as soon as the
user picked Teams in the messaging-platforms wizard.

Split the import accordingly.
2026-05-04 06:54:27 -07:00
kshitijk4poor cfd86dcdb8 chore: add bobashopcashier noreply email to AUTHOR_MAP 2026-05-04 06:23:52 -07:00
bobashopcashier d89e7a3cd4 fix(anthropic): restrict fast mode to Opus 4.6 (Anthropic API contract)
Per https://platform.claude.com/docs/en/build-with-claude/fast-mode:
"Fast mode is currently supported on Opus 4.6 only. Sending speed: fast
with an unsupported model returns an error."

Pre-fix, _is_anthropic_fast_model() returned True for any claude-* model,
so /fast on Opus 4.7 (or Sonnet/Haiku) would persist agent.service_tier=fast
in config.yaml and the adapter would inject extra_body["speed"] = "fast"
on every subsequent request. Opus 4.7 returns:

  HTTP 400: 'claude-opus-4-7' does not support the `speed` parameter.

This wedged sessions across model upgrades (a user who ran /fast on Opus 4.6
and later switched the default model to 4.7 hit a hard 400 on every turn
until they manually edited config.yaml).

Changes:
- _is_anthropic_fast_model: gate on "opus-4-6" / "opus-4.6" only
- anthropic_adapter: add _supports_fast_mode predicate as defensive guard
  so stale request_overrides on an unsupported model are dropped silently
  instead of 400'ing
- Tests: flip the assertions that mirrored the bug (Sonnet/Haiku/Opus 4.7
  asserting fast-mode support) to match the documented API contract
2026-05-04 06:23:52 -07:00
JasonOA888 a7417f8a4a fix(compressor): skip non-string tool content in summarization pass to prevent AttributeError
Commit 408dd8aa added a non-string guard for Pass 1 (dedup), but the same
pattern exists in Pass 2 (summarization/pruning) where content.startswith()
and len() are called on potentially non-string tool content.

When a provider returns tool results with non-string content (e.g. dict or
int from llama.cpp or similar), the pruning pass crashes with AttributeError.

Add the same isinstance(content, str) guard to Pass 2 for consistency.
2026-05-04 06:23:52 -07:00
helix4u eeb05cf556 docs: default custom tool creation to plugins
Steers custom tool creation toward the plugin route by default.
The adding-tools.md guide is now explicitly for built-in core Hermes
tools only.

Key fixes:
- Plugin quickstart: ctx.register_tool() now uses correct keyword-arg
  API (name=, toolset=, schema=, handler=) instead of broken 3-arg call
- Handler signature: (params, **kwargs) instead of (params)
- Handler return: json.dumps({...}) instead of plain string
- AGENTS.md: mentions plugin route before built-in tool instructions
- learning-path.md: plugins listed before core tool development
- contributing.md: separates plugin vs core tool paths

Based on PR #13138 by @helix4u.
2026-05-04 05:53:16 -07:00
ygd58 74c1b946e0 fix(browser): inject --no-sandbox for root and AppArmor userns restrictions
On VPS/Docker and some Ubuntu 23.10+ hosts, Chromium refuses to start
without --no-sandbox:
  - uid=0 (root): hard requirement (VPS/Docker deployments)
  - AppArmor apparmor_restrict_unprivileged_userns=1 (Ubuntu 23.10+):
    non-root too, under systemd or unprivileged containers

Detect both conditions and inject AGENT_BROWSER_CHROME_FLAGS with
--no-sandbox --disable-dev-shm-usage when the user hasn't already
set the flags themselves.

Salvage of #15771 — only the browser_tool.py fix is cherry-picked.
The PR's accompanying MCP preset addition (new feature surface)
was dropped so the bug fix can land independently.

Co-authored-by: ygd58 <buraysandro9@gmail.com>
2026-05-04 05:27:23 -07:00
briandevans ce22301dc6 test(sms): use clear=True in test_missing_phone_number_is_non_retryable
Prevents pre-existing TWILIO_PHONE_NUMBER or SMS_WEBHOOK_URL values in
the outer test environment from leaking into the assertion context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 05:25:09 -07:00
0668001438 83080772f2 fix(delegation): honor provider override for subagents
Clear inherited provider preference filters when delegation.provider is set so delegated children do not route back to the parent provider. Add a regression test for cross-provider delegation with parent OpenRouter filters.

Closes #10653
2026-05-04 05:22:35 -07:00
Pratik Rai 7a8ee8b29d fix(gateway): deduplicate Weixin messages by content fingerprint 2026-05-04 05:20:13 -07:00
briandevans 0b5fd40a01 fix(delegate): correct _spawn_child → _build_child_agent in comments
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 05:18:45 -07:00
briandevans 42d72b5922 fix(status): add missing popular provider API keys to hermes status display
Closes #16082.

`hermes status` silently omitted four widely-used LLM providers
(Google/Gemini, DeepSeek, xAI/Grok, NVIDIA NIM) from the API Keys
and API-Key Providers sections. Add them, along with tuple-valued
env var support (first found wins) so Google can accept either
GOOGLE_API_KEY or GEMINI_API_KEY.

Also deduplicates the "NVIDIA" and "NVIDIA NIM" rows that were
both pointing at NVIDIA_API_KEY.

Salvage of #16159 (core behavior preserved + NVIDIA dedup fixup
on top of the tuple-support refactor).

Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com>
2026-05-04 05:14:13 -07:00
VinVC 5d6431c114 fix(doctor): resolve merge conflicts, add kimi-coding-cn test
- Rebased on upstream/main to resolve conflicts
- Added test_run_doctor_accepts_kimi_coding_cn_provider test
- All 30 tests pass
2026-05-04 05:12:42 -07:00
阿泥豆 0e9416036a test: add unit tests for heartbeat stale threshold increase 2026-05-04 05:08:51 -07:00
阿泥豆 0cc63043e0 fix(delegation): increase heartbeat stale thresholds
The heartbeat stale detection was too aggressive:
- idle: 5 * 30s = 150s — LLM inference on slow providers (Zhipu/GLM)
  frequently exceeds 150s, causing heartbeat to stop prematurely
- in-tool: 20 * 30s = 600s — borderline for long tool calls

When heartbeat stops, parent._last_activity_ts freezes, eventually
triggering gateway timeout and killing the entire delegation.

New thresholds:
- idle: 15 * 30s = 450s — accommodates slow LLM inference
- in-tool: 40 * 30s = 1200s — accommodates long-running tool calls

child_timeout_seconds (config: delegation.child_timeout_seconds) remains
the hard cap for total delegation duration.
2026-05-04 05:08:51 -07:00
briandevans 6b4ccb9b14 fix(session-search): report source from resolved parent, not FTS5 child session (#15909)
When a delegation child session (e.g. source='telegram') contains the
FTS5 hit but _resolve_to_parent() maps it to a different root session
(source='api_server'), the result entry was still reporting the child's
source because the loop discarded session_meta as `_` and fell back to
match_info.get('source'), which carries the child session's value.

Use the resolved parent's session_meta for source, model, and started_at
with match_info as a fallback, so the output accurately reflects the
session the user actually interacted with.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 05:07:40 -07:00
briandevans b46b0c9888 fix(backup): floor pre-update backup_keep to 1 so the new backup survives
`updates.backup_keep: 0` (or any negative value) wiped the freshly-
created pre-update zip:

  _prune_pre_update_backups(backup_dir, keep=0):
      backups = sorted(..., reverse=True)   # newest first, includes
                                            # the zip we just wrote
      for p in backups[0:]:                 # = all of them
          p.unlink()

The wrapper in `main.py` then printed `Saved: <path>` for a file that
no longer existed (the size lookup is wrapped in `try/except OSError`
which silently degrades to "0 B"), leaving operators believing they had
a recovery point when they had none.

This is a real footgun because some config systems treat 0 as "keep
unlimited"; here it does the opposite — every backup is destroyed
right after creation.

Fix: clamp `keep` to a minimum of 1 inside `_prune_pre_update_backups`
since that helper is only invoked immediately after a fresh backup
is written.  Operators who genuinely want no backups should set
`updates.pre_update_backup: false` (which gates creation entirely)
rather than relying on `backup_keep: 0`.

Also extends the `backup_keep` config docstring to spell out the floor
and point at `pre_update_backup: false` as the off-switch.

## Tests

Three regression tests added in `TestPreUpdateBackup`:

  - `test_keep_zero_does_not_delete_freshly_created_backup` —
    asserts the file persists after `keep=0`
  - `test_keep_negative_does_not_delete_freshly_created_backup` —
    same for negative values
  - `test_keep_zero_still_prunes_older_backups` — proves the floor
    only protects the new backup; older ones are still rotated out

Verified the new tests fail on origin/main (without the floor) and
pass with it; full `tests/hermes_cli/test_backup.py` suite green
(84 tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 05:07:13 -07:00
Sanhu Li ef8c213e88 fix(model-switch): soft-accept unlisted openai-codex models 2026-05-04 05:06:53 -07:00
0xsir0000 52882dade6 fix(agent): include name field on every role:tool message for Gemini compatibility (#16478)
Gemini's OpenAI-compatibility endpoint strictly requires the `name` field
on `role: tool` messages — it returns HTTP 400 ("Request contains an
invalid argument") when the function name is missing. OpenAI/Anthropic/
ollama tolerate the absence, so the gap stays invisible until the
conversation accumulates a tool turn and the user routes it through Gemini
(direct API or via ollama-cloud proxy).

Fix: add a `_get_tool_call_name_static()` helper alongside the existing
`_get_tool_call_id_static()`, and populate `name` at every site that
constructs a `role: tool` message — the pre-call sanitizer stub, the
tool-call args repair marker, both interrupt-skip paths, both
result-append paths (parallel + sequential), the invalid-tool-name
recovery, the invalid-JSON-args recovery, and the exception fallback.

Each call site was already in scope of the function name (`function_name`,
`skipped_name`, `name`, or a dict tool_call), so the change is local —
no new lookups, no behavior change for providers that already worked.

Fixes #16478
2026-05-04 05:06:33 -07:00
OpenClaw Bot 0443484115 fix(qqbot): honor proxy env vars for websocket 2026-05-04 05:06:09 -07:00
陈运波0668001438 6cf7a9e330 fix(vision): preserve explicit provider auth with custom base_url
Keep the configured vision provider when base_url is overridden so credential-pool lookup still resolves provider-specific API keys (e.g. ZAI_API_KEY), and add a regression test for this path.
2026-05-04 05:05:43 -07:00
swithek b7bbc62503 fix(compressor): _prune_old_tool_results boundary direction 2026-05-04 05:05:18 -07:00
Dejie Guo d29f90e89d fix(error_classifier): avoid large-context false overflow heuristics
Generic 400 and server-disconnect heuristics used absolute token/message-count fallbacks that are too aggressive for 1M context sessions. Gate those absolute fallbacks to smaller context windows while preserving relative pressure checks.

Fixes #16351
2026-05-04 05:04:56 -07:00
giwaov 026a5e47df fix(cli): preserve Windows hidden-dir paths in markdown 2026-05-04 05:04:36 -07:00
Teknium 3fb35520c6 revert: auto-subscribe gateway chat on tool-driven kanban_create (#19718) (#19721)
Reverts ff3d2773e2. Teknium reviewed the merged PR and decided this
behavior isn't wanted — tool-driven kanban_create should not mirror
the slash-command path's auto-subscribe. Orchestrators that want
their originating chat notified can call kanban_notify-subscribe
explicitly; we're not going to make it implicit.
2026-05-04 05:04:01 -07:00
Teknium 25b7b0f8e6 chore(release): AUTHOR_MAP entries for Tier 1f salvage batch 2026-05-04 05:03:10 -07:00
Teknium ff3d2773e2 feat(kanban): auto-subscribe gateway chat on tool-driven kanban_create (#19718)
Closes #19479.

When an orchestrator agent calls kanban_create from a gateway session
(e.g. a Telegram user delegating to an orchestrator profile), auto-
subscribe the originating (platform, chat, thread, user) to the new
task's terminal events. Mirrors the behavior of the /kanban create
slash command in gateway/run.py so tool-driven creation is at parity
with human-driven creation.

Without this, a user who interacts with an orchestrator exclusively
via the gateway never receives blocked / completed / gave_up
notifications for tasks the orchestrator created on their behalf —
silently breaking the gateway-first multi-agent flow the reporter
describes.

Reads the context-local HERMES_SESSION_* vars via get_session_env()
(not os.environ — those are contextvars for asyncio concurrency
safety). Falls through cleanly in CLI / cron contexts with no
session active (subscribed=False in the response). Best-effort: if
the gateway module isn't importable (test rigs stubbing gateway.*),
the task still creates, we just skip the subscription.

Response gains a 'subscribed' bool so the orchestrator knows whether
terminal events will land back in the originating chat or whether it
needs to poll / unblock manually.

Tests: 4 new in tests/tools/test_kanban_tools.py covering
CLI/no-subscribe, telegram/gateway-auto-subscribe, discord-DM/no-
thread subscribe, and partial-ctx/no-chat_id no-subscribe. 40/40
kanban tool tests pass.
2026-05-04 05:02:23 -07:00
Nikolay Gusev fdf9343c51 fix(tools): wrap bare scalars in single-element list for array-typed args
Open-weight models (DeepSeek, Qwen, GLM) sometimes emit tool calls like
`{"urls": "https://a.com"}` when the tool schema declares
`type: array`.  The call was JSON-valid but semantically wrong, and
`coerce_tool_args` would pass the bare string through — the tool then
failed with a confusing type error.

`coerce_tool_args` now wraps non-list, non-null values in a
single-element list when the schema declares `array`.  Strings still go
through `_coerce_value` first so JSON-encoded arrays
(`'["a","b"]'`) parse correctly and nullable `"null"` still
becomes `None`.  `None` itself is preserved — tools with sensible
defaults already handle it, and we don't want to silently mask a
deliberate null.

Salvaged from #19652 (NikolayGusev-astra) — the broader validate-then-
repair layer had several issues (duplicated existing coercion,
mis-classified `old_string` as a path field, prepended non-JSON
prefixes to tool results that break downstream JSON parsing, hardcoded
offset/limit defaults unsuitable for non-read_file tools).  The one
genuinely new capability is wrapping bare scalars, which is implemented
here directly inside the existing coercion path.

Co-authored-by: Nikolay Gusev <ngusev@astralinux.ru>
2026-05-04 05:00:37 -07:00
ms-alan 6f864f8f94 fix(redact): add code_file param to skip false-positive ENV/JSON patterns
ENV-assignment and JSON-field regex patterns in redact_sensitive_text()
cause false positives when reading source code files:
- MAX_TOKENS=*** triggers the ENV assignment pattern
- "apiKey": "test" in test fixtures triggers the JSON field pattern

Add code_file=False parameter. When code_file=True, skip only the
ENV-assignment and JSON-field regex passes; all other patterns (prefixes,
auth headers, private keys, DB connstrings, JWTs, URL secrets) are
still applied.

Update file_tools.py (read_file and search_files) to pass code_file=True
so agent code analysis is not polluted by false-positive redactions.

Closes #15934
2026-05-04 04:56:28 -07:00
Teknium a175f39577 feat(nous): persist Nous OAuth across profiles via shared token store (#19712)
Mirrors the Codex auto-import UX. On successful Nous login (either
`hermes auth add nous --type oauth` or `hermes login nous`), tokens are
mirrored to `$HERMES_SHARED_AUTH_DIR/nous_auth.json` (default
`~/.hermes/shared/nous_auth.json`, outside any named profile's
HERMES_HOME). On next login in a new profile, the flow offers to import
those credentials ("Import these credentials? [Y/n]") and rehydrates via
a forced refresh+mint instead of running the full device-code flow.

Runtime refresh in any profile syncs the rotated refresh_token back to
the shared store so sibling profiles don't hit stale-token fallback
after rotation.

The volatile 24h agent_key is NOT persisted to the shared store —
only the long-lived OAuth tokens are cross-profile useful.

- `HERMES_SHARED_AUTH_DIR` env var for tests + custom layouts
- Pytest seat belt mirrors the existing `_auth_file_path` guard so
  forgetting to redirect the store in a test fails loudly
- File mode 0600 where platform supports it
- Runtime credential resolution is unchanged — shared store is only
  consulted during the login flow, so profile isolation at runtime is
  preserved
- Stale refresh_token + portal-down cases gracefully fall back to
  device-code

Addresses a user report from Mike Nguyen: running
`hermes --profile <name> auth add nous --type oauth` for every new
profile is unnecessary friction now that Codex has a shared-import
flow via `~/.codex/auth.json`.
2026-05-04 04:54:55 -07:00
QifengKuang 69fc6d9c1e fix(telegram): fall back to document on any send_photo failure, not just dim errors
Broadens the existing fallback (previously only fired for
Photo_invalid_dimensions) to cover every send_photo exception class:
rate limits, corrupt file markers, format edge cases. The expected
dimension case still logs at INFO (document is the right path); all
other cases log at WARNING with exc_info so they're visible in logs.

If send_document itself fails, we still fall back to the base adapter's
text-only 'Image: /path' rendering as a last resort.

Salvage of #15837 — original PR author QifengKuang proposed the broader
try/except-style fallback. Adapted to keep the existing INFO-vs-WARNING
log split for dimension errors (the expected case).

Co-authored-by: QifengKuang <k2767567815@gmail.com>
2026-05-04 04:54:54 -07:00
Teknium d3b22b76d8 fix(kanban): enforce worker task-ownership on destructive tool calls (#19713)
Closes #19534 (security).

A worker spawned by the kanban dispatcher has HERMES_KANBAN_TASK set
to its own task id. The destructive tools (kanban_complete,
kanban_block, kanban_heartbeat) resolved task_id via
_default_task_id() which preferred an explicit arg over the env var,
with no ownership check — so a buggy or prompt-injected worker could
complete / block / heartbeat any OTHER task (sibling, cross-tenant,
anything) by supplying its id. Reporter's repro: worker for t_A
passed task_id=t_B to kanban_complete and got {"ok": true}.

Fix: add _enforce_worker_task_ownership(tid). If HERMES_KANBAN_TASK
is set and tid doesn't match, return a structured tool error with
guidance to use kanban_comment (for information handoff across tasks)
or kanban_create (for follow-up work). Orchestrator profiles (no env
var, but kanban toolset enabled per #18968) are exempt — their job
is routing and sometimes includes closing out child tasks.

Kept unrestricted (deliberately):
- kanban_show — workers legitimately read parent/sibling handoff context
- kanban_comment — cross-task comments are the handoff mechanism
- kanban_create — orchestrator fan-out, worker follow-up spawning
- kanban_link — parent/child linking

Tests: 5 new regression tests in tests/tools/test_kanban_tools.py
covering the grid (worker-attacks-foreign ×3 tools, worker-own-task
preserved, orchestrator-unrestricted). 36/36 pass.
2026-05-04 04:54:02 -07:00
Teknium 1bd5ac7f2f fix(self-improvement-loop): bump background-review budget to 16 and suppress status leaks (#19710)
The background memory/skill review fork had two user-visible issues:

1. max_iterations=8 was too tight for multi-step reviews. A review that
   needs to skill_view one or two candidate skills, add a memory entry,
   and patch a skill routinely blew the budget — surfacing an 'Iteration
   budget exhausted (8/8)' warning to the user and leaving the review
   half-finished.

2. Mid-review lifecycle messages leaked into the user's terminal past the
   existing quiet_mode + redirect_stdout/stderr guards. _emit_status and
   _emit_warning route through _vprint(force=True) -> _print_fn /
   status_callback, which bypass sys.stdout entirely. The stdout redirect
   only catches raw print() calls.

Changes:
- Bump the review fork's max_iterations from 8 to 16.
- Set review_agent.suppress_status_output = True on the fork. This
  short-circuits _vprint unconditionally so _emit_status/_emit_warning
  emissions (iteration-budget warnings, rate-limit retries, compression
  messages) never reach the user. The only user-visible output remains
  the compact final summary line ('💾 Self-improvement review: ...')
  which is printed via self._safe_print on the *main* agent (outside
  the fork's redirect/suppress scope).

Summarizer filter is already correct — _summarize_background_review_actions
only surfaces tool calls with data.get('success') is truthy, so failed
attempts and reasoning text never reach the summary line.
2026-05-04 04:53:44 -07:00
Kathy a79b0ec461 fix: keep Feishu topic replies from falling back to new threads (local patch)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-04 04:53:28 -07:00
cong 3ccf723bf9 fix(gateway): read context_length from custom_providers in session info header 2026-05-04 04:51:13 -07:00
h0tp-ftw 8c8f95bc8e fix(gateway): show friendly error when service is not installed
Instead of an unhelpful CalledProcessError traceback when running
`hermes gateway start/stop/restart` without first installing the service,
check for the unit file and exit with an actionable install hint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-04 04:49:51 -07:00
Teknium c5789f4309 feat(achievements): share card render on unlocked badges (#19657)
* feat(achievements): share card render on unlocked badges

Adds a Share button to each unlocked achievement card that opens a
modal and renders a 1200x630 PNG share card client-side via Canvas2D
(no backend, no network, no new deps). Two actions: Download PNG and
Copy image to clipboard.

Card layout mirrors the in-dashboard visual language: tier-colored
glow, icon from the existing LUCIDE sprite set, achievement name,
tier badge pill, description, progress stat line, and a Hermes Agent
watermark. Sized for X/Twitter, Discord, LinkedIn, Bluesky link
previews.

Vendored on top of the upstream @PCinkusz bundle; the 'in-progress
scan banner' precedent already established this divergence pattern.
Manifest bumped 0.3.1 -> 0.4.0.

* feat(achievements): share-on-X as primary action on share dialog

Adds a 'Share on X' button as the primary action in the share dialog.
Opens https://x.com/intent/post with a pre-filled tweet referencing
the achievement name, tier, @NousResearch, and the Hermes docs URL.
Copy image and Download PNG become secondary actions: users who want
the badge attached can Copy image, paste into the X composer, post.

Primary button styled as X's signature black-on-white fill so the
action is unambiguous.
2026-05-04 04:47:53 -07:00
ygd58 297eaa3533 fix(api_server): emit run.failed when run_conversation returns failed=True
When run_conversation encounters a non-retryable client error (401, 400,
etc.), it returns a dict with failed=True instead of raising. The gateway's
_run_and_close only branched on exceptions, so it always emitted run.completed
even for failed runs — clients could not distinguish success from failure.

Inspect the result dict before emitting: if failed=True, emit run.failed
with the error message; otherwise emit run.completed as before. The existing
except Exception path is unchanged for genuine programming errors.

Fixes #15561
2026-05-04 04:47:36 -07:00
Teknium b2b479b40e docs(kanban): backfill multi-board refs in reference docs (#19704)
Followup to #19653. The feature PR updated the Kanban user guide but
missed four other pages that document the same surface. Caught when
Teknium asked 'did you add docs to the guide and any other kanban
related docs around this?'.

- reference/cli-commands.md: rewrite the `hermes kanban` section to
  document the `--board <slug>` global flag, the `boards`
  subcommand group (list/create/switch/show/rename/rm), board
  resolution order, and worked examples. Also fills in the
  `create` / `complete` flag lists that had drifted from the
  current CLI (`--summary`, `--metadata`, `--triage`,
  `--idempotency-key`, `--max-runtime`, `--skill`).
- reference/environment-variables.md: add `HERMES_KANBAN_BOARD`
  row, update `HERMES_KANBAN_DB` precedence note.
- reference/slash-commands.md: add `/kanban boards ...` and
  `/kanban --board <slug> ...` to the two `/kanban` rows (CLI
  table + gateway table).
- features/kanban-tutorial.md: the walkthrough uses the `default`
  board, so just a note pointing readers at the overview's Boards
  section if they want multiple queues, plus the corrected per-board
  DB path.

Skill docs (devops-kanban-orchestrator, -worker) intentionally not
updated: those are agent-facing lifecycle playbooks and boards are
transparent to workers (HERMES_KANBAN_BOARD env var pins the DB
automatically), so there's nothing new for a worker to know.
2026-05-04 04:47:19 -07:00
Teknium a8b689f0c2 test(kanban): regression for status=running rejection at dashboard PATCH
Reporter of #19535 explicitly asked for a regression test — covers it
here so a future refactor of _set_status_direct can't silently re-enable
the direct ready/todo -> running bypass.

Asserts both: (a) HTTP 400 with 'running' in the detail message, and
(b) the task's status is unchanged after the rejected PATCH (pre-request
status preserved, no partial mutation).
2026-05-04 04:46:47 -07:00
luyao618 6b3efcee49 fix(kanban): reject direct status transition to 'running' via dashboard API
The PATCH /tasks/:id endpoint allows setting status='running' via
_set_status_direct(), bypassing the dispatcher/claim path that creates
run rows, claim locks, expiry, and worker process metadata. This can
leave tasks stuck in 'running' with no active worker.

Fix: reject status='running' with HTTP 400, requiring all transitions
to 'running' to go through the canonical claim_task() path.

Closes #19535
2026-05-04 04:46:47 -07:00
vominh1919 652f8e6f3e fix(test): correct _coerce_number inf/nan test assertions
The test 'test_inf_stays_string_for_integer_only' incorrectly asserted
that _coerce_number('inf') returns float('inf'), but the function
correctly returns the original string 'inf' because infinity is not
JSON-serializable.

Fixed the assertion to expect the string 'inf', and added two new tests
for negative infinity and NaN edge cases to improve coverage of the
non-JSON-serializable number guard in _coerce_number().
2026-05-04 04:45:55 -07:00
Yoimex edf9c75621 fix(env): pass -- to cd for hyphen-prefixed workdirs 2026-05-04 04:45:03 -07:00
Teknium ae40fca955 fix(profiles): keep validate_profile_name strict; callers normalize first
Follow-up to @changchun989's cherry-pick: reverts the validate-via-
normalize change so validate_profile_name remains a strict regex check
on the input AS-GIVEN. Callers that accept mixed-case user input
(dashboard UI, CLI args, import flows) call normalize_profile_name()
first, then validate the result. This keeps validate honest about
what the on-disk directory name must look like — e.g. '  jules '
(trailing whitespace) is now rejected instead of silently trimmed
and accepted.

- validate_profile_name: strict lowercase/regex check again, 'UPPER'
  back in the invalid-names parametrize
- 8 call sites in profiles.py (create_profile, delete_profile,
  set_active_profile, export_profile, import_profile, rename_profile,
  resolve_profile_env, plus the clone_from branch): swap the
  normalize-then-validate order
- scripts/release.py: add changchun989@proton.me -> changchun989 to
  AUTHOR_MAP so CI doesn't block on the unmapped contributor email

All kanban + profile tests pass (268 across test_profiles.py +
test_kanban_db.py + test_kanban_core_functionality.py, plus 73 in
test_kanban_tools.py + test_kanban_dashboard_plugin.py).

Closes #18498.
2026-05-04 04:44:37 -07:00
changchun989 a31477dabb fix(profiles): normalize profile IDs for Kanban assignees and lookups
- Add normalize_profile_name() for lowercase canonical IDs and Default alias
- Use canonical names in create/delete/rename/export/import/set_active paths
- Canonicalize Kanban assignee on create/assign, list filter, and worker spawn
- Tests for mixed-case assignees and profile resolution (fixes #18498)
2026-05-04 04:44:37 -07:00
Yuyang Xu 60c4bc96fd fix(security): restore .env/auth.json/state.db with 0600 perms
`hermes import` was creating secret files with the process umask
(typically 0644) instead of 0600. zipfile.open() does not honor the
Unix mode bits stored in zip member external_attr; the restore loop
used open(target, "wb") which always falls back to umask.

Threat: silent privilege downgrade after a routine restore on
multi-user systems (shared dev boxes, CI runners, jump hosts) — any
local user could read API keys and OAuth tokens from ~/.hermes/.

Fix mirrors the convention already used at file creation
(hermes_cli/auth.py: stat.S_IRUSR | stat.S_IWUSR for auth.json).
The quick-snapshot restore path (restore_quick_snapshot) is
unaffected — it uses shutil.copy2 which preserves perms via
copystat().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 04:43:53 -07:00
MichaelWDanko da8654bb41 fix(dashboard): show custom theme palette swatches 2026-05-04 04:43:27 -07:00
Cameron Aragon 239ea1bdea fix(image-gen): preserve xAI API error status 2026-05-04 04:43:07 -07:00
atongrun 75b4a34670 fix(cli): check updates against upstream/main for fork users 2026-05-04 04:42:44 -07:00
Teknium 5ec6baa400 feat(kanban): multi-project boards — one install, many kanbans (#19653)
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.

Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
  its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
  keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
  installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
  their env alongside the existing `HERMES_KANBAN_DB` /
  `HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
  other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
  per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
  own SQLite DB, same WAL+IMMEDIATE machinery as before).

CLI
---
  hermes kanban boards list|create|switch|show|rename|rm
  hermes kanban --board <slug> <any-subcommand>

Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.

Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.

Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
  with all boards + task counts, `+ New board` button, `Archive`
  button (non-default only). Hidden entirely when only `default` exists
  and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
  + "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
  shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
  new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
  `PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
  `POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
  opens a fresh WS against the new board.

Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.

Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.

Regression surface: all 264 pre-existing kanban tests continue to pass.

Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
2026-05-04 04:42:38 -07:00
vominh1919 135b4c8b35 fix(mcp): decouple AnyUrl import from mcp dependency
AnyUrl was imported inside the same try block as mcp.client.auth, so
when the mcp package was not installed, AnyUrl was undefined and
_build_client_metadata raised NameError at runtime.

Moved the AnyUrl import to its own try/except block so it's available
whenever pydantic is installed (which is a core dependency), regardless
of whether the mcp SDK is present.

Also added pytest.importorskip('mcp') to the three
test_build_client_metadata tests that exercise _build_client_metadata,
since that function depends on OAuthClientMetadata from the mcp package.
2026-05-04 04:42:18 -07:00
vominh1919 0d563621fb fix(test): skip bedrock adapter tests when botocore is not installed
Six tests in test_bedrock_adapter.py import botocore.exceptions
directly (ConnectionClosedError, EndpointConnectionError,
ReadTimeoutError, ClientError) without guarding the import. When
botocore is not installed (it's an optional dependency), these tests
fail with ModuleNotFoundError instead of being gracefully skipped.

Added pytest.importorskip('botocore') to each affected test function,
following the same pattern used elsewhere in the test suite (e.g.
test_voice_mode.py for numpy, test_mcp_oauth.py for mcp).

Tests affected:
- TestIsStaleConnectionError: 3 tests
- TestCallConverseInvalidatesOnStaleError: 3 tests

Before: 6 FAIL with ModuleNotFoundError
After:  6 SKIP with reason message
2026-05-04 04:41:55 -07:00
vominh1919 d1d2d43387 fix(test): add skip marker for transcription tests requiring faster_whisper
TestTranscribeLocalExtended patches faster_whisper.WhisperModel, which
triggers an ImportError when the faster_whisper package is not installed.
Added a pytest.mark.skipif marker using importlib.util.find_spec so
these tests are gracefully skipped instead of failing with
ModuleNotFoundError.
2026-05-04 04:41:36 -07:00
Teknium 844d4a32ce chore(release): AUTHOR_MAP entries for Tier 1e salvage batch 2026-05-04 04:40:34 -07:00
Teknium 110387d149 docs(open-webui): fill gaps in quick setup — verify curls, ollama flag, restart note (#19654)
Reported by @neopabo — the Open WebUI page was missing several steps users
hit in practice:

- Use hermes config set instead of hand-editing .env (matches current UX)
- Restart-gateway note after enabling API_SERVER_ENABLED
- curl /health + /v1/models verification step before jumping to Docker
- ENABLE_OLLAMA_API=false in both docker run and compose snippets to
  suppress the empty Ollama backend that otherwise clutters the picker
- 15-30s startup wait note for first-run embedding model download
- Troubleshooting entry for the empty-Ollama-shadowing case
- /v1/models troubleshoot command now includes the Authorization header
2026-05-04 04:36:18 -07:00
Siddharth Balyan af6f9bc2a1 fix: refresh systemd unit on gateway boot (not just start/restart) (#19684)
The resilient restart settings from PR #18639 only took effect when
the gateway was started via `hermes gateway start` or `hermes gateway
restart` — both of which call refresh_systemd_unit_if_needed() which
writes the new unit and runs daemon-reload.

However, when the gateway self-restarts via exit-code-75 (stale-code
detection after `hermes update`, or the /restart command), systemd
respawns the process directly without going through any CLI function.
The unit file on disk stays stale, and systemd keeps using the old
cached settings (StartLimitBurst=5, RestartSec=30) until someone
manually runs `hermes gateway restart`.

This meant that after PR #18639 was deployed, users who never ran
`hermes gateway restart` manually were still vulnerable to the
permanent-death-on-network-outage bug.

Fix: call refresh_systemd_unit_if_needed() at the top of run_gateway()
(the foreground entry point that systemd's ExecStart invokes). This
ensures that on every boot — whether triggered by systemd restart,
exit-75 respawn, or manual foreground run — the unit definition and
daemon state are current. The call is best-effort (exceptions caught)
and a no-op when the unit is already current (one stat + string compare).
2026-05-04 16:27:51 +05:30
Teknium 33f554d83c feat(kanban-dashboard): workspace kind + path inputs in inline create form (#19679)
Closes #18718. Exposes the existing `workspace_kind` + `workspace_path`
fields (already accepted by POST /api/plugins/kanban/tasks) in the
dashboard's per-column inline-create form so users can create tasks
targeting a git worktree or an explicit directory without dropping
back to the CLI.

- Add a workspace-kind Select (scratch / worktree / dir) to
  InlineCreate in plugins/kanban/dashboard/dist/index.js.
- Conditionally render a workspace_path Input next to the select when
  kind != scratch; placeholder tells the user whether the path is
  required (dir) or optional (worktree — derived from assignee when
  blank).
- Submit wires `workspace_kind` / `workspace_path` into the POST body
  only when they're non-default, keeping the request shape small and
  interoperable with older dispatcher versions.

E2E verified in a dashboard pointed at the worktree: selecting dir +
typing /tmp/test-18718 produces a POST body with
{workspace_kind: 'dir', workspace_path: '/tmp/test-18718'} and the
task lands in sqlite with those fields set. 42/42 kanban dashboard
plugin tests pass.
2026-05-04 03:40:39 -07:00
Grey0202 a219a0a4df fix(anthropic): strip top-level oneOf/allOf/anyOf from tool input_schema
Extends the existing _normalize_tool_input_schema to also drop top-level
union keywords that Anthropic's tool schema validator rejects with HTTP 400.

Several upstream and plugin tools ship schemas with a top-level oneOf/
allOf/anyOf (common for Pydantic discriminated unions). The existing
strip_nullable_unions pass only handles anyOf-with-null patterns; a
non-null top-level union keyword sails through and hits the API.

Salvage of #16471 — approach folded into the existing normalize helper
rather than introducing a parallel _sanitize_input_schema function, to
avoid two schema-munging code paths running against the same input.

Co-authored-by: Grey0202 <grey0202@users.noreply.github.com>
2026-05-04 03:17:35 -07:00
charliekerfoot 412f2389f1 fix(google_oauth): close TOCTOU window when saving credentials 2026-05-04 03:16:19 -07:00
Ioodu e50809b771 fix(file-tools): cap read_file result size to prevent context window overflow
Set max_result_size_chars=100_000 on the read_file registry entry (was
float('inf')), closing the Layer 2 defense-in-depth gap in
tool_result_storage.py. The existing Layer 1 guard inside
_handle_read_file already returns a JSON error for oversized reads;
this aligns the registry cap with every other tool.

Update test_read_file_never_persisted → test_read_file_result_size_cap
to assert 100_000, and add test_read_file_registry_cap_is_100k as an
explicit regression guard against re-introducing float('inf').

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 03:14:59 -07:00
Teknium 5b6d413476 fix(cli,gateway): surface title errors from /new <name>
The contributor's PR silently swallowed ValueError from
SessionDB.set_session_title() with bare except Exception: pass.
Users typing /new <title> with an already-in-use title got an
untitled session and no feedback.

Changes:
- cli.py: catch ValueError from both sanitize_title() and
  set_session_title(); print the error and mark the session
  untitled in the banner (never echo the rejected title back).
- gateway/run.py: append a warning note to the reset reply on
  title rejection; reflect the accepted title in the header.
- Add regression tests for the duplicate-title path in CLI and
  gateway.

Also map exx@example.com -> @exxmen in scripts/release.py.
2026-05-04 03:14:50 -07:00
Exx f720751d79 feat(cli,gateway): /new accepts optional session name argument
Allow users to start a fresh session and immediately set its title by
passing a name to /new (or /reset):

    /new Refactor auth module

Changes:
- hermes_cli/commands.py: add args_hint='[name]' to /new command
- cli.py: parse title argument in process_command(), pass to new_session()
- cli.py: new_session() accepts title=None, sets title via SessionDB
- gateway/run.py: _handle_reset_command() parses title, sets on new entry
- gateway/session.py: reset_session() accepts optional display_name
- tests: add test_new_session_with_title, test_reset_command_with_title,
  test_new_command_in_help_output

All 36 affected tests pass.
2026-05-04 03:14:50 -07:00
ms-alan 055fde40e0 fix(doctor): check global agent-browser when local install not found
When agent-browser is globally installed via 'npm install -g agent-browser'
but not present in the local node_modules, doctor falsely warns that it's
not installed. Add shutil.which('agent-browser') as a fallback check after
the local path check.

Closes #15951
2026-05-04 03:13:22 -07:00
xyiy001 e69d11d30c fix(browser): allow CDP override to pass requirement checks
Treat explicit CDP override mode as a valid browser backend even when agent-browser is absent, and add a regression test to prevent false-negative availability gating.
2026-05-04 03:12:30 -07:00
kshitijk4poor 46072425fe fix(model-picker): exclude providers with empty credential pool entries
The auth check in list_authenticated_providers used mere key presence in
credential_pool to conclude a provider is authenticated.  An empty entry
(pool_store key with no actual credentials) caused providers like
ollama-cloud to appear as authenticated in the model picker even when no
OLLAMA_API_KEY was set.

The user's picker then offered nemotron-3-super under Ollama Cloud;
selecting it routed every subsequent turn to https://ollama.com/v1, which
rejected the requests with HTTP 400.

Fix: drop the pool_store key-existence check from both section 2
(HERMES_OVERLAYS) and section 2b (CANONICAL_PROVIDERS).  The following
load_pool().has_credentials() call already handles the legitimate pooled-
credential case; checking for an empty key just ahead of it was redundant
and actively harmful.
2026-05-04 03:12:12 -07:00
briandevans c8ecb56f27 fix(cli): reject invalid argv values from -p/--profile before resolving
`_apply_profile_override()` scans `sys.argv` for `-p / --profile` at
module import time. When `hermes_cli.main` is imported inside pytest
with `-p no:xdist` on the command line, it picks up `'no:xdist'` as a
profile name candidate, then passes it to `resolve_profile_env()` which
raises `ValueError` (invalid format), and the function calls
`sys.exit(1)` — aborting test collection with an INTERNALERROR before
any test runs.

The same conflict affects any tool or wrapper that uses `-p` for its
own flag and then imports `hermes_cli.main`.

Fix: add a format guard immediately after step 1 (explicit flag scan).
If `consume == 2` (the value came from `-p <value>`, not
`--profile=value`) and the candidate doesn't match the canonical
profile-name pattern `[a-z0-9][a-z0-9_-]{0,63}` (mirrored from
`hermes_cli.profiles._PROFILE_ID_RE`), discard it and continue as if
no `-p` flag was found. The `active_profile` file-based fallback
(step 2) only reads a file written by hermes itself, so it always
produces valid names and needs no guard.

Regression guard: with the guard reverted, importing
`hermes_cli.main` with `sys.argv = ['pytest', '-p', 'no:xdist', ...]`
raises `SystemExit(1)`. With the guard in place, the import succeeds
and `sys.argv` is left intact for pytest. Legitimate `-p coder` still
flows through to `resolve_profile_env()` unchanged.

Rebased onto current `origin/main` (`e5dad4ac5`) — the prior branch
base (`4fade39c9`) was 824 commits behind and the PR was DIRTY /
CONFLICTING. The 1.5 HERMES_HOME-set early-return block has since
landed between the original insertion point and step 2; the new guard
is positioned correctly before the early return so a bogus `-p` value
no longer prevents the early return from kicking in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 03:11:47 -07:00
ChanlerDev e3461e0b2a fix(cli): remove dead 'q' check from quit command resolution
The 'q' alias is defined for 'queue' command in commands.py:93.
The hardcoded 'q' in cli.py:5910 was dead code - resolve_command('q')
returns the queue CommandDef, so canonical would never be 'q'.

Removes the misleading check without changing any behavior:
- /quit and /exit still exit (defined aliases)
- /q still maps to queue (as intended)
2026-05-04 03:11:30 -07:00
YAMAGUCHI Seiji cba86b7303 fix(cronjob): treat bare 'custom' provider as unspecified in override
`_resolve_model_override` treated any non-empty `provider` string from
the LLM as user-specified and skipped the pin-to-current-provider
fallback. When the LLM wrote bare `'custom'` (instead of the canonical
`'custom:<name>'` referring to a custom_providers entry), the value
serialized into jobs.json as `"provider": "custom"` and the scheduler
could never resolve a provider from it — the cron job failed silently
at run time.

Treat bare `'custom'` as "no provider supplied" so the current main
provider gets pinned instead, matching behaviour for the omitted case.

Defence-in-depth complement to a schema-description fix (#15477) that
discourages the LLM from emitting bare `'custom'` in the first place.
2026-05-04 03:11:11 -07:00
pander 6b88f46c54 fix(compressor): trigger fallback on timeout errors alongside model-not-found
Previously only HTTP 404/503 and specific error strings triggered a fallback
to the main model when the summary model was unavailable. Timeout errors
(HTTP 408/429/502/504, or error strings containing 'timeout') entered a
short cooldown instead, leaving context to grow unbounded for the rest of
the session.

Add _is_timeout detection alongside _is_model_not_found so that transient
timeout errors on the summary model also trigger immediate fallback to the
main model, preventing compression failure from cascading.

Closes #15935
2026-05-04 03:10:53 -07:00
DaniuXie a45bd28598 fix(wecom): set SUPPORTS_MESSAGE_EDITING=False to prevent broken streaming 2026-05-04 03:10:36 -07:00
zng8418 d2ea959fe9 fix(doctor): skip /models health check for MiniMax CN (returns 404)
MiniMax China (api.minimaxi.com) does not expose a /v1/models endpoint.
The doctor command was probing it and reporting HTTP 404 as a warning,
even though the API works correctly for chat completions.

Set supports_health_check=False for MiniMax CN so doctor shows
"(key configured)" instead of the false 404 warning.

Refs #12768, #13757
2026-05-04 03:10:17 -07:00
ideathinklab01-source d17eff29d5 fix(delegate): guard _load_config() against delegation: null in config.yaml
YAML parses `delegation: null` as Python None. `dict.get(key, {})`
only uses the default when the key is *missing*, not when it exists with
a None value, so `cfg.get("max_concurrent_children")` crashes with
`'NoneType' object has no attribute 'get'`.

Same pattern as fd9b692d (fix(tui): tolerate null top-level sections).
Use `dict.get(key) or {}` to handle both missing and None-valued keys.

Closes: delegation null config crash (same class as #7215, #7346)
2026-05-04 03:09:59 -07:00
ygd58 2d3d1d9736 fix(tui): use --outdir instead of --outfile in hermes-ink build script
esbuild raises 'Must use outdir when there are multiple input files'
on Android/Termux ARM64 with esbuild >=0.25. The build script used
--outfile=dist/ink-bundle.js which is only valid for a single entry
point with no code splitting. Switching to --outdir=dist fixes the
error and names the output file dist/entry-exports.js (matching the
input file name). Update index.js to import from the new path.

Fixes #16072
2026-05-04 03:09:41 -07:00
LLing486 145a38a875 fix(agent): preserve dots in model names for Xiaomi MiMo provider
Add 'xiaomi' to the _anthropic_preserve_dots() provider whitelist and
'xiaomimimo.com' to the URL-based fallback check. Without this,
normalize_model_name() converts mimo-v2.5 to mimo-v2-5, which the
Xiaomi API rejects with HTTP 400.

Fixes #16156
2026-05-04 03:09:24 -07:00
YAMAGUCHI Seiji 0896944382 fix(cronjob): advertise 'custom:<name>' provider format in tool schema
The `provider` field in CRONJOB_SCHEMA only showed examples like
'openrouter' and 'anthropic', with no mention of the canonical
'custom:<name>' form required for custom_providers entries. When the
user has custom providers configured, LLMs tend to write the bare type
name ('custom') because the schema does not advertise the ':<name>'
suffix. The bare value then serializes into jobs.json and causes the
cron job to fail silently at run time — `_resolve_model_override`
treats it as a user-specified provider and skips the pin-to-current
fallback, but no provider ever resolves from the bare 'custom' string.

Clarifying the schema so the canonical form is discoverable addresses
the root cause at the tool-definition boundary.
2026-05-04 03:09:07 -07:00
jjjojoj 9c64d09610 fix(status): show NVIDIA NIM api key status
hermes status was missing NVIDIA API key from its API keys display.
Now shows NVIDIA NIM ✓/✗ with key hash like other providers.

Fixes #16082
2026-05-04 03:08:50 -07:00
Teknium 64b39d835e chore(release): AUTHOR_MAP entries for Tier 1d salvage batch 2026-05-04 03:07:30 -07:00
taeng0204 20a06c586f fix(dashboard): render null instead of flashing spinner during plugin load 2026-05-04 03:06:45 -07:00
taeng0204 06a6d6967a fix(dashboard): defer unknown-route redirect while dashboard plugins load 2026-05-04 03:06:45 -07:00
Teknium 986ec04048 docs: document /kanban slash command (#19584)
* docs: document /kanban slash command

The kanban user guide and slash-commands reference only mentioned the
/kanban slash command in passing. Add a proper section covering:

- CLI and gateway both expose the full hermes kanban surface via
  hermes_cli.kanban.run_slash (identical argument surface)
- Mid-run usage: /kanban bypasses the running-agent guard, so reads
  and writes land immediately while an agent is still in a turn
- Auto-subscribe on /kanban create from the gateway — originating
  chat is subscribed to terminal events, with a worked example
- Output truncation (~3800 chars) in messaging
- Autocomplete hint list vs full subcommand surface

Also adds /kanban rows to both slash-command tables (CLI + messaging)
in reference/slash-commands.md and moves it into the 'works in both'
notes bucket.

* docs(kanban): frame the model's tool surface as primary, CLI as the human surface

The kanban user guide and CLI reference read as if you drive the board
by running `hermes kanban` commands everywhere. In practice:

- **You** (human, scripts, cron, dashboard) use the `hermes kanban …`
  CLI, the `/kanban …` slash command, or the REST/dashboard.
- **Workers** spawned by the dispatcher use a dedicated `kanban_*`
  toolset (`kanban_show`, `kanban_complete`, `kanban_block`,
  `kanban_heartbeat`, `kanban_comment`, `kanban_create`,
  `kanban_link`) and never shell out to the CLI.

Changes to `user-guide/features/kanban.md`:

- New 'Two surfaces' intro distinguishes the two front doors up front.
- Quick-start section re-labelled so each step says who is running it
  (you vs. orchestrator vs. worker).
- 'How workers interact with the board' rewritten:
  - Lead with "Workers do not shell out to `hermes kanban`."
  - Tool table extended with required params.
  - Concrete worker-turn example (`kanban_show` → `kanban_heartbeat`
    → `kanban_complete`) and an orchestrator fan-out example
    (`kanban_create` x N with `parents=[...]`).
  - Moved 'Why tools not CLI' from a defensive aside to a clean
    follow-up section.
- 'Worker skill' section explicitly says the lifecycle is taught
  in tool calls, not CLI commands.
- 'Pinning extra skills' reordered — orchestrator tool form first
  (the usual case), human/CLI second, dashboard third.
- 'Orchestrator skill' now shows a canonical `kanban_create` /
  `kanban_link` / `kanban_complete` tool-call sequence instead of
  only describing what the skill teaches.
- CLI-command-reference heading now clarifies this is the human
  surface, with a cross-link to the tool-surface section.
- 'Runs — one row per attempt' structured-handoff example replaced:
  the primary example is now `kanban_complete(summary=..., metadata=...)`
  (what a worker actually does), with the CLI form retained as
  "when you, the human, need to close a task a worker can't."

Changes to `reference/cli-commands.md`:

- `hermes kanban` intro marks itself as the human / scripting surface
  and links out to the worker tool surface.
- Corrected `comment <id>` description — the next worker reads it via
  `kanban_show()`, not by running `hermes kanban show`.

* docs(kanban-tutorial): reframe worker actions as tool calls

Honest answer to Teknium's follow-up: no, the first pass missed the
tutorial. The four stories all showed `hermes kanban claim /
complete / block / unblock` as if the backend-dev, pm, and reviewer
personas were humans running CLI commands. In a real hermes kanban
run those agents are dispatcher-spawned workers driving the board
through the `kanban_*` tool surface.

Changes:

- Setup intro now distinguishes the three surfaces up front
  (dashboard / CLI for you, `kanban_*` tools for workers) and
  establishes the convention: `bash` blocks are commands *you* run,
  `# worker tool calls` blocks are what the agent emits.
- Story 1 (solo dev schema): 'Claim the schema task, do the work,
  hand off' block replaced with the dispatcher spawning the
  backend-dev worker and a `kanban_show → kanban_heartbeat →
  kanban_complete` tool-call sequence. The 'On the CLI' `hermes
  kanban show / runs` block re-labelled as 'you peeking at the board'
  to keep it correct as a human inspection step.
- Story 2 (fleet farming): note about structured handoff updated
  from `--summary` / `--metadata` CLI flags to
  `kanban_complete(summary=..., metadata=...)` tool form.
- Story 3 (role pipeline): the big PM/engineer/reviewer block fully
  rewritten as three worker tool-call sequences — PM worker
  completes spec, engineer worker blocks, human/reviewer
  `hermes kanban unblock` (or `/kanban unblock`), engineer worker
  respawns and completes. The respawn-as-new-run mechanic is now
  explicit.
- Reviewer paragraph: `build_worker_context` replaced with
  `kanban_show()` — that's the tool that delivers the parent
  handoff to the model.
- Structured handoff section heading and body updated:
  `--summary`/`--metadata` → `summary`/`metadata` (tool params),
  with a note that the tool surface doesn't expose a bulk variant
  for the same reason the CLI refuses multi-task `complete`.

Story 4 (circuit breaker) unchanged — its workers fail to spawn,
so there are no tool calls to show; the `hermes kanban create` and
`hermes kanban runs` commands in it are correctly human-driven.
2026-05-04 03:05:34 -07:00
Teknium 0628004709 docs(model-catalog): rename x-ai/grok-4.20-beta to x-ai/grok-4.20 (#19640)
OpenRouter and Nous Portal dropped the -beta suffix from the Grok 4.20 slug.
The OpenRouter section already used the new slug; this updates the Nous
Portal section and bumps updated_at.
2026-05-04 02:48:30 -07:00
ms-alan c659a16899 fix(cli): detect quoted relative paths in _detect_file_drop
Closes #15197
2026-05-04 02:48:20 -07:00
ms-alan 08b8465ca9 fix(email): add required Date header to send_message_tool._send_email
Adds RFC 5322 Date header to the _send_email tool path in tools/send_message_tool.py.

Issue #15160 noted that both gateway/platforms/email.py and tools/send_message_tool.py
construct MIMEMultipart/MIMEText messages without setting a Date header. RFC 5322
requires the Date header; mail filters reject messages that lack it.

PR #15207 fixed the gateway/platforms/email.py path but did not cover
tools/send_message_tool._send_email, which is used by the send_message tool
for cross-channel messaging.

This change adds msg["Date"] = formatdate(localtime=True) to _send_email,
mirroring the fix applied to the gateway email adapter.

Closes #15160
2026-05-04 02:48:20 -07:00
thchen 51dc98d314 fix(agent): detect Qwen3/Ollama inline thinking after tool calls
Ollama serves Qwen3 thinking inside the content field as <think>...</think>
blocks rather than in the API-level reasoning_content field.  This means
_has_structured was False for these responses, so an empty-looking reply
after a tool call triggered the nudge instead of the prefill continuation,
causing a double-response loop.

Fix: detect <think>/<thinking>/<reasoning> in final_response and:
  1. Skip the nudge when thinking is present (model is still reasoning)
  2. Include _has_inline_thinking in _has_structured so prefill kicks in
2026-05-04 02:47:29 -07:00
LeonSGP43 0df7e61d2c fix(cli): omit empty api_mode when probing custom models 2026-05-04 02:46:41 -07:00
QifengKuang 52c539d53a fix(agent): disable SDK retries on per-request OpenAI clients
Per-request OpenAI-wire clients (used by both non-streaming and
streaming chat-completions paths in _interruptible_api_call) should
not run the SDK's built-in retry loop: the agent's outer loop owns
retries with credential rotation, provider fallback, and backoff that
the SDK can't see.

Leaving SDK retries on (default 2) compounds with our outer retries
and lets a single hung provider request stretch to ~3x the per-call
timeout before our stale detector reports it.

Shared/primary clients and Anthropic / Bedrock paths are unaffected
(they don't go through here).

Salvage of #15811 core improvement — the timeout push-down in the
original PR required scaffolding that has since been refactored on
main, so only the max_retries=0 change is preserved.

Co-authored-by: QifengKuang <k2767567815@gmail.com>
2026-05-04 02:43:20 -07:00
Teknium 3c070f9f9d fix(curator): only mark agent-created for background-review sediment (#19621)
Tighten the provenance semantics added in #19618: skills a user asks a
foreground agent to write via skill_manage(create) now stay invisible to
the curator. Only skills the background self-improvement review fork
sediments through skill_manage get the created_by=agent marker.

- tools/skill_provenance.py — new ContextVar module mirroring the
  _approval_session_key pattern: set_current_write_origin / reset /
  get / is_background_review. Default origin is 'foreground'; the
  review fork sets 'background_review'.
- run_agent.py — run_conversation() binds the ContextVar from
  self._memory_write_origin at the top of each call. The review fork
  runs on its own thread (fresh context), so foreground and review
  contexts never cross-contaminate.
- tools/skill_manager_tool.py — skill_manage(action='create') now
  only calls mark_agent_created() when is_background_review(). All
  other cases (foreground create, patch, edit, write_file, delete)
  continue as before.
- tests: test_skill_provenance.py (6 tests covering the ContextVar
  surface), split test_full_create_via_dispatcher into foreground
  vs. review-fork variants, curator status tests now mark-first.

Why: the agent routinely edits existing user skills on the user's
behalf; those writes must never flip provenance. And when a user
explicitly asks the foreground agent to create a skill, that skill
belongs to the user. The curator should only be cleaning up after
its own autonomous sediment from the review nudge loop.
2026-05-04 02:42:16 -07:00
Teknium bff484a51b fix(kanban-dashboard): widen drawer, bump body fonts, fix code-block contrast (#19638)
Closes #18576. Addresses three of four complaints from the readability
report; live-verified in a dashboard against a seeded task with body,
comments, and run history.

- Drawer default width 480px → 640px, exposed as the CSS var
  `--hermes-kanban-drawer-width` so deployments / user themes can
  override without forking the plugin.
- Bump body/meta/pre/log/run-history font sizes from the 0.65-0.75rem
  cluster to the 0.78-0.85rem cluster. Long paths and code snippets in
  task bodies, run metadata, and worker logs are legible again instead
  of requiring a squint.
- Fix the black-text-on-dark-theme regression in fenced markdown code
  blocks. Root cause: themes that don't define `--color-foreground`
  (NERV, at least) leave `color: var(--color-foreground)` resolving
  empty on <code>, which then falls back to the UA default (near-black)
  instead of inheriting from the drawer's <body>. Fix: force
  `color: inherit` on both inline and fenced code, and give the fenced
  block background via `currentColor` instead of `--color-foreground`
  so there's a visible card even when the theme var is absent.

Out of scope for this PR (comments added to #18576):
- Draggable resize handle (structural JS work; plugin ships built-only,
  no src/ in-tree).
- Live worker-log viewer for running tasks (backend WS + component).
- Sibling fix: themes like NERV should define --color-foreground. The
  current changes make the drawer robust against that gap, but the
  root fix belongs in the theme layer.
2026-05-04 02:41:51 -07:00
alt-glitch 2a52e28568 fix(setup): skip AUXILIARY_VISION_MODEL write when input is blank
Guard the save_env_value('AUXILIARY_VISION_MODEL', ...) call with
'if _selected_vision_model:' so blank input at the non-OpenAI vision
model prompt doesn't nuke existing values in .env.

save_env_value has no internal guard against empty strings — it
faithfully writes whatever it receives, including empty values that
shadow the previously-configured model.

Salvage of #15504 (core hunk). Contributor's test was dropped because
it collided with subsequent test refactors; the fix stands on its own.

Co-authored-by: alt-glitch <balyan.sid@gmail.com>
2026-05-04 02:41:47 -07:00
LeonSGP43 7d36533aeb fix(pty): default TERM for resize probes
Preserve explicit caller overrides, but backfill a sensible default
TERM=xterm-256color when missing or blank in the spawn env. CI often
runs without TERM in the parent process, which makes terminal probes
like 'tput cols' fail before winsize reads.

Salvage of #15278's core code fix only — the test changes conflict
with subsequent test refactors on main that now exercise TIOCGWINSZ
directly instead of via 'tput'.

Co-authored-by: LeonSGP43 <154585401+LeonSGP43@users.noreply.github.com>
2026-05-04 02:38:54 -07:00
Bart 99faac212e fix(tui): prevent trailing space in picker-command completions
Commands that open pickers (/model, /skin, /personality) previously
received a trailing space in their completions to keep the dropdown
visible in the classic CLI. However, the TUI's submit handler applies
the completion when Enter is pressed and the result differs from the
input — so '/model' + space became '/model ' and the command was never
executed.

Picker commands now omit the trailing space for exact matches, allowing
Enter to submit and open the picker. Non-picker commands (/help, etc.)
are unaffected.
2026-05-04 02:35:33 -07:00
analista 6da970f15d fix(tui): close AIAgent on session teardown to prevent FD leak
session.close only closed the slash_worker subprocess but never called
agent.close() on the AIAgent instance.  In the long-lived TUI gateway
process, this left httpx clients for GC to finalize.  When the OS
recycled a closed FD number for a new active connection, the stale
finalizer would close the live socket, causing intermittent
[Errno 9] Bad file descriptor on subsequent LLM API calls.

Call agent.close() (which properly shuts down the httpx transport pool
and TCP sockets) before closing the slash_worker.
2026-05-04 02:34:53 -07:00
nftpoetrist 4e2b20b705 fix(cli): sync use_gateway in _reconfigure_provider for tts, browser, and web
_reconfigure_provider() updates cloud_provider/backend/tts.provider when
switching tool providers via "hermes setup tools → Reconfigure", but did
not update the matching use_gateway flag. _configure_provider() (the
initial-setup path) sets use_gateway on all three tool categories. The
omission in _reconfigure_provider leaves a stale value in config.yaml:
switching from a Nous-managed provider (use_gateway=True) to a self-hosted
one keeps use_gateway=True, continuing to route requests through the Nous
gateway; switching the other way leaves use_gateway unset so the managed
feature does not activate.

Fix: mirror _configure_provider's use_gateway = bool(managed_feature)
assignment in the tts, browser, and web blocks of _reconfigure_provider.
Symmetric across all three tool categories. No behavior change for any
provider that does not set tts_provider, browser_provider, or web_backend.

Fixes #15229
2026-05-04 02:33:55 -07:00
flobo3 ba8337464d fix(gemini): extract usageMetadata from streaming chunks for token tracking 2026-05-04 02:33:30 -07:00
ee-blog f6aa1965d7 fix(telegram): fallback to document when photo dimensions exceed limits
Telegram's send_photo has dimension limits (sum of width+height <= 10000px).
When sending large screenshots or tall images, the API returns
'Photo_invalid_dimensions' error.

Fix: Catch this specific error in send_image_file() and automatically
fallback to send_document() which has no dimension limits (only 50MB size).

This is similar to the existing 5MB URL fallback (commit 542faf22) but
handles local files with dimension issues instead of URL size issues.
2026-05-04 02:33:09 -07:00
barteq ad4542bf6d fix(gateway): allow free_response_channels to override DISCORD_IGNORE_NO_MENTION
When DISCORD_IGNORE_NO_MENTION is true (default), the bot ignores
messages without @mention. However, this check ran before evaluating
free_response_channels, so messages in free-response channels were
wrongly dropped unless they contained a mention.

This change adds a carve-out: if the message lands in a channel that
is configured as a free response channel (or its parent category is),
the ignore-no-mention rule is skipped.

Also removes the unconditional skip_thread for free response channels
so that auto_thread still creates threads there unless explicitly
disabled via DISCORD_NO_THREAD_CHANNELS.
2026-05-04 02:32:39 -07:00
hex-clawd 54cd633366 fix(cron): skip AI call when script produces no output
When a cron job has a pre-run script that runs successfully but produces
no output (e.g. email checker with no new mail), the scheduler previously
injected "[Script ran successfully but produced no output.]" into the
prompt and still called the AI model. This wastes tokens on every cycle.

Now _build_job_prompt() returns None when script output is empty, and
run_job() short-circuits with a SILENT response - zero API calls when
there is nothing to report.
2026-05-04 02:32:18 -07:00
dpaluy e2248045f5 fix(cron): drop stale env-var override of persisted provider
Cron jobs were passing os.getenv("HERMES_INFERENCE_PROVIDER") as the
"requested" arg to resolve_runtime_provider(), which short-circuited
the resolver's own precedence (explicit arg → persisted config → env)
and let stale shell/.env values outrank the user's saved provider.

Long-lived cron daemons inherit env from the shell that launched them,
so a since-changed provider (e.g. DeepSeek) could keep firing for jobs
that don't pin provider/model. Same bug class as f0b763c74 fixed for
the TUI /model switch.

Pass only job.get("provider") and let resolve_requested_provider fall
through to persisted config and env in the documented order.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 02:31:57 -07:00
flobo3 d7663c7808 fix(docker): exclude compose/profile runtime state from build context 2026-05-04 02:31:39 -07:00
helix4u f236cbfec3 fix(tui): declare nanostores dependency 2026-05-04 02:31:22 -07:00
B1GGersnow dc63ad0ad2 fix(anthropic): cap max_tokens at 65536 for Qwen models via DashScope
DashScope's Anthropic-compatible endpoint enforces max_tokens ∈ [1, 65536].
Adding "qwen3" to _ANTHROPIC_OUTPUT_LIMITS prevents 400 errors that were
misclassified as context overflow, triggering premature compression.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-04 02:31:05 -07:00
Emilien Domenge 83bbe9b458 fix(delegation): pass target_model to resolve_runtime_provider in _resolve_delegation_credentials
When delegation.model differs from model.default and the provider is
opencode-go or opencode-zen, the wrong api_mode is computed because
resolve_runtime_provider falls back to model_cfg.get('default') — the
main model — instead of the configured delegation model.

For example, with model.default=minimax-m2.7 (anthropic_messages) and
delegation.model=glm-5.1 (chat_completions), subagents get
anthropic_messages, which strips /v1 from the base URL and causes a 404.

resolve_runtime_provider already accepts target_model for exactly this
purpose; _resolve_delegation_credentials just wasn't passing it.

Fixes #15319
Related: #13678
2026-05-04 02:30:48 -07:00
nftpoetrist e2211b2683 fix(compressor): reset _summary_failure_cooldown_until in on_session_reset()
on_session_reset() cleared _previous_summary, _last_summary_error, and
_ineffective_compression_count but left _summary_failure_cooldown_until
intact. When a transient summary error sets a 60 s cooldown (or 600 s
for a missing-provider RuntimeError) and the user immediately runs /reset
or /new, the cooldown carries into the new session. If the new session
reaches the compression threshold before the cooldown expires,
_generate_summary() returns None early, middle turns are silently dropped
without a summary, and the agent continues with no indication that
compaction was skipped.

Fix: set _summary_failure_cooldown_until = 0.0 in on_session_reset(),
matching the value assigned in __init__ and symmetric with the other
per-session fields already cleared there.

Fixes #15547
2026-05-04 02:30:31 -07:00
Teknium 3e1559b910 chore(release): AUTHOR_MAP entries for Tier 1c salvage batch
Pre-adds author-email mappings for upcoming Tier 1c salvage PRs
(small Apr 24-25 fixes).
2026-05-04 02:29:18 -07:00
Teknium baf834cc0f chore(release): map cine.dreamer.one@gmail.com to @LeonSGP43 2026-05-04 02:19:28 -07:00
LeonSGP43 abcaf05229 fix(skills): keep manual skills out of curator 2026-05-04 02:19:28 -07:00
asheriif 21c7c9f0ca fix(tui): harden plugin slash exec errors 2026-05-04 09:07:37 +00:00
Teknium cac4f2c0e6 test(kanban): update worker-prompt header assertion to match #19427
PR #19427 dropped the 'You are a Kanban worker' identity line from
KANBAN_GUIDANCE so SOUL.md stays authoritative for profile identity.
This test assertion was stale against that change; update it to the
new protocol-only header.
2026-05-04 02:00:42 -07:00
pdonizete deb59eab72 fix: allow kanban tools for orchestrator profiles with kanban toolset
The _check_kanban_mode() gating function only checked for
HERMES_KANBAN_TASK env var, which is only set by the dispatcher
when spawning workers. This prevented orchestrator profiles (like
techlead) from using kanban_create, kanban_link, etc. even when
they had 'kanban' explicitly in their toolsets config.

Now uses load_config() from hermes_cli.config (which has mtime-based
caching) to check if 'kanban' is in the profile's toolsets list.
This enables orchestrators to route work via Kanban while workers
continue using the dispatcher env var.

Fixes #18968
2026-05-04 02:00:42 -07:00
nftpoetrist 9faaa292b4 fix(delegate): inherit parent fallback_chain in _build_child_agent
_build_child_agent constructed child AIAgents without passing
fallback_model, leaving _fallback_chain=[] for every subagent.
When a subagent hit a rate-limit or credential exhaustion the
runtime fallback check (run_agent.py:7486 / 12267) found an empty
chain and failed immediately — even though the parent agent was
configured with fallback_providers and would have recovered.

The cron scheduler already propagates fallback_model correctly
(scheduler.py:1038). Fix closes the parity gap by reading the
parent's _fallback_chain (the normalised list form accepted by
AIAgent's fallback_model parameter) and threading it through.

Empty chains coerce to None so AIAgent initialises _fallback_chain=[]
as usual rather than iterating an empty list.
2026-05-04 01:48:56 -07:00
molvikar cb33c73418 fix(run_agent): gate iteration-limit provider routing to OpenRouter 2026-05-04 01:45:59 -07:00
Asunfly 8a364df2c8 fix: inherit reasoning config in API server runs 2026-05-04 01:44:16 -07:00
SHL0MS aede94e757 fix: back up config.yaml before hermes setup modifies it
Create a timestamped backup (~/.hermes/config.yaml.bak.YYYYMMDD_HHMMSS)
before the setup wizard runs any configuration sections. After setup
completes, show the backup path and a restore command.

This protects user-customized values (compression thresholds, provider
routing, PII redaction, auxiliary model configs) from being silently
overwritten by setup defaults.

Addresses #3522
2026-05-04 01:43:17 -07:00
memosr 2c7d7a9b2f fix(security): bind Meet node server to localhost and restrict token file to owner read 2026-05-04 01:42:59 -07:00
yuehei cdde0c8411 fix(feishu): enable MEDIA attachment delivery in send_message tool
The _send_feishu() function already supports media_files (images, video,
audio, documents) via the adapter's send_image_file/send_video/send_voice
/send_document methods, but _send_to_platform() never routed Feishu into
the early media-handling branch — media attachments were silently dropped
with a "not supported" warning.

Add a Feishu-specific media branch (matching the existing Yuanbao/Signal
pattern) so that MEDIA:<path> tags in send_message calls are correctly
delivered as native Feishu attachments. Also update the two error/warning
message strings to include feishu in the supported platform list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-04 01:42:40 -07:00
WanderWang 45fd45103d fix: _chromium_installed() now checks AGENT_BROWSER_EXECUTABLE_PATH and system Chrome
Before this fix, _chromium_installed() only searched Playwright-style
chromium-* / chromium_headless_shell-* directories, which meant users
with system Chrome or AGENT_BROWSER_EXECUTABLE_PATH configured still
had all browser_* tools gated.

Now checks three sources in priority order:
1. AGENT_BROWSER_EXECUTABLE_PATH env var (if set and points to a real binary)
2. System Chrome/Chromium via shutil.which() (google-chrome, chromium-browser, chrome)
3. Playwright browser cache (existing logic, kept as fallback)

Closes #19294
2026-05-04 01:42:23 -07:00
Yanzhong Su c653f5dc3f Clarify session_search auxiliary model docs 2026-05-04 01:42:07 -07:00
ai-ag2026 8bdec80882 fix(agent): surface preflight compression status
Preflight compression can run synchronously before the first model call when a loaded session exceeds the active context threshold. Gateway users saw no visible progress while the compression LLM call was in flight, which can look like a dropped message during long compactions.\n\nEmit the existing lifecycle status through _emit_status before starting preflight compression so CLI, gateway, and WebUI status callbacks all get immediate feedback.\n\nAdds a regression assertion for the preflight path.
2026-05-04 01:41:51 -07:00
qiqufang d8be50d772 fix(web): add missing icons for config page category sidebar
Add icon mappings for 9 categories that fell back to FileQuestion:
- bedrock (Cloud), curator (Sparkles), kanban (LayoutDashboard)
- model_catalog (BookOpen), openrouter (Route), sessions (History)
- tool_loop_guardrails (Shield), tool_output (FileOutput), updates (RefreshCw)
2026-05-04 01:41:27 -07:00
Teknium 06031229e8 fix(tests): tolerate ps ancestor-walk in find_gateway_pids fallback test (#19590)
Follow-up to #19586 (@cixuuz salvage): _get_ancestor_pids walks ps -o ppid=
up the process tree, which the pre-existing mock in
test_find_gateway_pids_falls_back_to_pid_file_when_process_scan_fails didn't
expect. Return empty stdout so the ancestor loop terminates cleanly and the
original fallback assertion still passes.
2026-05-04 01:40:39 -07:00
liuhao1024 9c93fc5775 fix(tui): call process.exit(0) after Ink exit to trigger terminal cleanup
Ink's exit() calls unmount() which resets terminal modes (kitty keyboard,
mouse, etc.) but does NOT call process.exit().  The Node process stays
alive because stdin is still open (Ink listens on it), so the
process.on('exit') handler in entry.tsx — which sends the final
resetTerminalModes() — never fires.

This left kitty keyboard protocol and other terminal modes enabled in the
parent shell after /quit, Ctrl+C, or Ctrl+D, breaking arrow keys and
other input in subsequent programs.

Add explicit process.exit(0) after exit() in die() so the process
actually terminates and the exit handler runs.

Fixes #19194
2026-05-04 01:39:39 -07:00
Hermes Agent 74c997d985 fix(gateway): move quick-command dispatch before built-in handlers
Quick commands of type "alias" that target built-in slash commands
(e.g. /h -> /model) were processed too late in _handle_message — after
the if-canonical=="model" checks. This meant alias expansion never
reached the target handler and fell through to the LLM as raw text.

Two fixes:
1. Move the quick_commands block before built-in dispatch so alias
   targets (like /model) hit the correct handler after expansion.
2. Extract bare command name from target_command via .split()[0] to
   feed _resolve_cmd() correctly (was using the full arg-string).
2026-05-04 01:39:23 -07:00
holynn c857592558 fix(cli): allow custom:* provider slugs in model validation
Two related fixes for custom_providers model switching:

1. validate_requested_model() now recognizes custom:<name> slugs
   (e.g. custom:volcengine) as custom endpoints, not generic providers.
   Previously only the bare 'custom' slug matched the relaxed validation
   branch, causing model validation to fail with 'not found in provider
   listing' for all named custom providers.

2. switch_model() now consults the custom_providers list when deciding
   whether to override a validation rejection. If the requested model
   matches the entry's 'model' field or any key in its 'models' dict,
   the switch is accepted even when the remote /v1/models endpoint does
   not list it.

Both changes are covered by existing tests (86 passed).
2026-05-04 01:39:06 -07:00
Byrn Tong e8cdcf5328 fix: exclude ancestor PIDs from gateway process scan (#13242)
_scan_gateway_pids() uses ps-based pattern matching to find running
gateways. When invoked from the CLI (e.g. `hermes gateway status`),
the calling process itself matches gateway patterns, causing false
positives — the CLI is mistakenly counted as a running gateway.

Add _get_ancestor_pids() that walks the process tree from the current
PID up to init (PID 1). Merge this set into exclude_pids at the top
of _scan_gateway_pids() so the entire ancestor chain is filtered out.

This complements the existing os.getpid() exclusion in
_append_unique_pid() by also covering parent/grandparent processes
(e.g. when hermes is invoked via a wrapper script or shell).

Closes #13242
2026-05-04 01:38:41 -07:00
Aleksandr Pasevin 8a4fe80f8d fix(signal): skip reactions for unauthorized senders
The on_processing_start hook fired a reaction emoji (👀) on every
inbound Signal message before run.py's _is_user_authorized check.
This meant contacts not in SIGNAL_ALLOWED_USERS would see the bot
react to their messages even though Hermes silently dropped them —
leaking the presence of the bot and causing confusing UX.

Two changes to gateway/platforms/signal.py:

1. Read SIGNAL_ALLOWED_USERS into self.dm_allow_from in __init__
   (mirrors the group_allow_from pattern already in place).

2. Add _reactions_enabled(event) — two-gate check:
   - SIGNAL_REACTIONS=false/0/no disables reactions globally
   - If SIGNAL_ALLOWED_USERS is set, only react to senders in
     the allowlist (skips unauthorized contacts)

Both on_processing_start and on_processing_complete now call this
guard before sending any reaction.

Telegram already has an equivalent _reactions_enabled() guard
(controlled by TELEGRAM_REACTIONS). This brings Signal to parity.
2026-05-04 01:38:21 -07:00
nftpoetrist e89376d66f fix(setup): add missing SLACK_HOME_CHANNEL prompt to _setup_slack()
_setup_slack() was the only platform setup function that did not prompt
for a home channel. All four sibling setups (_setup_telegram,
_setup_discord, _setup_mattermost, _setup_bluebubbles) close with an
identical home-channel block, and setup_gateway() already checks for
SLACK_HOME_CHANNEL presence at the end of the wizard — but the value
was never collected, leaving cron delivery and cross-platform
notifications silently broken for Slack after a fresh hermes setup run.

Add the standard home-channel prompt at the end of _setup_slack(),
symmetric with the Discord implementation. Add two unit tests that
verify the prompt is saved when provided and skipped when left blank.
2026-05-04 01:37:18 -07:00
Byrn Tong 81ce945450 fix(gateway): show other profiles in gateway status to prevent confusion
When multiple gateway profiles are running (e.g. default and wx1),
`hermes gateway status` can be misleading — stopping one profile's
gateway and checking status may still show the other profile's process
without indicating which profile it belongs to.

Add `_print_other_profiles_gateway_status()` which displays running
gateways from other profiles at the bottom of the status output:

    Other profiles:
      ✓ wx1              — PID 166893

This uses the existing `find_profile_gateway_processes()` and
`get_active_profile_name()` — no new dependencies.

Closes #19113
Related: #4402, #4587
2026-05-04 01:37:02 -07:00
wanazhar df88375f0d fix: treat ctrl-c as curses cancel 2026-05-04 01:36:44 -07:00
leavr ccb5d87076 test: cover max-iterations summary message sanitization 2026-05-04 01:36:27 -07:00
tmdgusya a1cb811cb8 fix(cli): avoid voice TTS restart race 2026-05-04 01:36:07 -07:00
Teknium 314fe9f827 chore(release): add AUTHOR_MAP entries for upcoming salvage batch
Pre-adds author-email mappings for the 21 Tier 1b salvage PRs so
their cherry-picked commits land with mapped GitHub logins in the
release notes.
2026-05-04 01:34:32 -07:00
ethan 645b99aadd test(cron): cover null next_run_at recovery and non-dict origin tolerance
Adds four regression tests guarding the bugfix in the previous commit:
- TestGetDueJobs::test_broken_cron_without_next_run_is_recovered exercises
  cron schedules whose next_run_at was lost; expects compute_next_run to
  repopulate it within get_due_jobs() rather than silently skipping the job.
- TestGetDueJobs::test_broken_interval_without_next_run_is_recovered does
  the same for interval schedules.
- TestResolveOrigin::test_string_origin_is_tolerated and
  test_non_dict_origin_is_tolerated confirm _resolve_origin() returns None
  for legacy/hand-edited origins (string, list, int) instead of raising.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-05-04 01:32:58 -07:00
ethan 78b635ee3c fix(cron): recover null next_run_at jobs and tolerate non-dict origin
Fixes #18722

get_due_jobs() now recomputes next_run_at via compute_next_run() for
cron/interval jobs that arrived with null next_run_at (e.g. via direct
jobs.json edits) instead of silently skipping them. _resolve_origin()
guards with isinstance(origin, dict), and _deliver_result() now routes
through _resolve_origin() so string/non-dict origins no longer crash
the ticker.

References: references #18735 (open competing fix from automated bulk PR touching 79 files); this PR is a focused single-issue contribution and adds the missing interval-recovery test variant

Co-Authored-By: Claude <noreply@anthropic.com>
2026-05-04 01:32:58 -07:00
Teknium 91ea3ae4b2 test(skills): add bytes-vs-str equivalence and on-disk hash parity tests
Follow-up on #9925 cherry-pick adding two additional tests:
- bytes content hashes identically to its str-decoded form
- mixed bytes+str bundle hash equals the on-disk content_hash from
  skills_guard (the production invariant used to detect drift)

Also map dodofun@126.com and 1615063567@qq.com in AUTHOR_MAP so the
CI contributor check passes for the cherry-picked commit.

Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com>
Co-authored-by: zhao0112 <1615063567@qq.com>
2026-05-04 01:28:12 -07:00
dh 3072e5543b skills-hub: hash binary skill bundle files correctly 2026-05-04 01:28:12 -07:00
Teknium c90f25dd1f chore(release): map daixin1204@gmail.com to @SimbaKingjoe 2026-05-04 01:21:23 -07:00
daixin1204 744079ffe6 fix(curator): prevent false-positive consolidation from substring matching
_classify_removed_skills used naive 'in' substring matching to detect
whether a removed skill's name appeared in skill_manage arguments.
Short/common skill names (api, git, test, foo, etc.) matched
incorrectly when they appeared as substrings of longer words in file
paths (references/api-design.md) or content (latest, testing).

Replace with field-aware matching:
- file_path: needle must match a complete filename stem or directory
  name, with -/_ normalised for variant tolerance
- content fields: word-boundary regex (\b) prevents embedding in
  longer words

Also add 3 regression tests covering the false-positive scenarios.
2026-05-04 01:21:23 -07:00
Clooooode c0300575c1 fix(kanban): use get_default_hermes_root() in list_profiles_on_disk
Path.home() / ".hermes" / "profiles" breaks custom-root deployments
(e.g. HERMES_HOME=/opt/data). Switch to get_default_hermes_root() so
profile discovery is consistent with kanban_db_path() and
workspaces_root() fixed in #18985.

Fixes #19017.
Related to #18442, #18985.
2026-05-04 01:21:14 -07:00
Clooooode 1964b0565b test(kanban): add failing test for list_profiles_on_disk with custom HERMES_HOME
list_profiles_on_disk() hardcodes Path.home() / ".hermes" / "profiles",
ignoring HERMES_HOME when set to a custom root (e.g. /opt/data).

Add test_list_profiles_on_disk_custom_root to cover this case.

Related to #18442, #18985.
2026-05-04 01:21:14 -07:00
Siddharth Balyan 8163d37192 fix(skill): reference built-in video_analyze/vision_analyze tools in kanban-video-orchestrator (#19562)
The tool-matrix.md had a vague 'Gemini multimodal / Claude vision' entry
in the external tools table that didn't point to the actual built-in
Hermes tools. Now that video_analyze exists (merged in #19301), update
the skill to reference it properly:

- Add 'Built-in Hermes tools for media review' section with proper
  toolset names, enablement instructions, and capability details
- Add video + vision toolsets to cinematographer, editor, and reviewer
  profile configs
- Update role-archetypes.md to reference tools by name
- Update API key table to explain video_analyze routing
2026-05-04 12:54:50 +05:30
Siddharth Balyan a11aed1acc fix(cli): local backend CLI always uses launch directory, stops .env sync of TERMINAL_CWD (#19334)
The old CWD heuristic was fooled by:
1. TERMINAL_CWD persisted to .env by `hermes config set terminal.cwd`
2. Inherited TERMINAL_CWD from parent hermes processes
3. Only resolved when config had a placeholder value (not explicit paths)

Fix:
- load_cli_config() unconditionally uses os.getcwd() for local backend
- TERMINAL_CWD always force-exported in CLI mode (overrides stale values)
- Gateway sets _HERMES_GATEWAY=1 marker so lazy cli.py imports don't clobber
- Remove terminal.cwd from config-set .env sync map (prevents re-poisoning)
- Clarify setup wizard label as 'Gateway working directory'

Closes #19214
2026-05-04 11:36:19 +05:30
Ben Barclay 434d70d8bc Merge pull request #19540 from NousResearch/single_container_for_all
feat(docker): launch dashboard as side-process via HERMES_DASHBOARD=1
2026-05-04 15:38:19 +10:00
Ben 5671059f62 feat(docker): launch dashboard as side-process via HERMES_DASHBOARD=1
Adds an optional dashboard side-process to the container entrypoint,
toggled by `HERMES_DASHBOARD=1` (also accepts `true` / `yes`).  When set,
the entrypoint backgrounds `hermes dashboard` before `exec`-ing the main
command so the user's chosen foreground process (gateway, chat, `sleep
infinity`, …) remains PID-of-interest for the container runtime.
  docker run -d \
    -v ~/.hermes:/opt/data \
    -p 8642:8642 -p 9119:9119 \
    -e HERMES_DASHBOARD=1 \
    nousresearch/hermes-agent gateway run
Defaults chosen for the container case:
 - Host: 0.0.0.0 (reachable through published port; can override to
   127.0.0.1 via HERMES_DASHBOARD_HOST for sidecar/reverse-proxy setups)
 - Port: 9119 (matches `hermes dashboard`)
 - Auto-adds `--insecure` when binding to non-localhost, matching the
   dashboard's own safety gate for exposing API keys
 - HERMES_DASHBOARD_TUI is read by `hermes dashboard` directly — no
   entrypoint plumbing needed
Dashboard output is prefixed with `[dashboard]` via `stdbuf`+`sed -u` so
it's easy to separate from gateway logs in `docker logs`.  No supervision:
if the dashboard crashes it stays down until the container restarts
(documented in the `:::note` panel).
Other changes bundled in:
 - Deprecate GATEWAY_HEALTH_URL / GATEWAY_HEALTH_TIMEOUT env vars in
   hermes_cli/web_server.py with a DEPRECATED block comment and a
   `.. deprecated::` note on _probe_gateway_health.  The feature still
   works for this release; it'll be removed alongside the move to a
   first-class dashboard config key.
 - Rewrite the "Running the dashboard" doc section around the new
   single-container pattern.  Drops the previously-documented
   dashboard-as-its-own-container setup — that pattern relied on the
   deprecated env vars for cross-container gateway-liveness detection,
   and without them the dashboard would permanently report the gateway
   as "not running".
 - Collapse the two-service Compose example (gateway + dashboard
   container) into a single service with HERMES_DASHBOARD=1.  Removes
   the now-unnecessary bridge network and `depends_on`.
 - Drop the ":::warning" caveat about "Running a dashboard container
   alongside the gateway is safe" — that case no longer exists.
2026-05-04 15:37:27 +10:00
Ben Barclay 95f395027f Merge pull request #19520 from NousResearch/fix_docker_tui
fix(docker/tui): tolerate npm's peer-flag drop in lockfile comparison
2026-05-04 14:29:43 +10:00
Ben 2f2998bb1b fix(tui): tolerate npm's peer-flag drop in lockfile comparison
`_tui_need_npm_install()` compares the canonical `package-lock.json` against
the hidden `node_modules/.package-lock.json` to decide whether `npm install`
needs to re-run. npm 9 drops the `"peer": true` field from the hidden lock
on dev-deps that are *also* declared as peers (the canonical lock preserves
the dual annotation). That made the check flag 16 packages (`@babel/core`,
`@types/node`, `@types/react`, `@typescript-eslint/*`, `react`, `vite`,
`tsx`, `typescript`, …) as mismatched on every launch, triggering a runtime
`npm install`.
Inside the Docker image, that runtime install then fails with EACCES because
`/opt/hermes/ui-tui/node_modules/` is root-owned from build time, so
`docker run … hermes-agent --tui` prints:
    Installing TUI dependencies…
    npm install failed.
…and exits 1, with no preview. The empty preview is a second bug: the
launcher captured only stderr, but npm 9 writes EACCES to stdout, which
was DEVNULL'd.
Fixes:
 - Add `"peer"` to `_NPM_LOCK_RUNTIME_KEYS` so the comparison ignores the
   non-deterministic field, alongside the existing `"ideallyInert"`.
 - Capture stdout as well as stderr in the install subprocess so future
   failures surface a useful preview instead of a bare "failed." line.
Regression tests:
 - `test_no_install_when_only_peer_annotation_differs` — the exact scenario
 - `test_install_when_version_differs_even_with_peer_drop` — guards against
   the peer-drop tolerance masking a real version skew
On-host impact: the same false-positive was firing on every `hermes --tui`
invocation from a normal checkout, silently running a no-op `npm install`
each time (it converged because the host's `node_modules/` is writable).
Startup time on the TUI should drop noticeably.
2026-05-04 14:13:38 +10:00
Chris Danis 363cc93674 fix(cron): bump skill usage when cron jobs load skills
Cron jobs that reference skills via their skills: config never bumped
the usage counters in .usage.json, so the curator could auto-archive
skills actively used by cron jobs based on stale timestamps.

Now _build_job_prompt() calls bump_use(skill_name) for each
successfully loaded skill so the curator sees them as active.
2026-05-03 17:06:48 -07:00
nftpoetrist 808fee151d fix(auxiliary): propagate explicit_api_key to _try_anthropic()
_try_anthropic() lacked the explicit_api_key parameter added to
_try_openrouter() in #18768. When resolve_provider_client() is called
with provider="anthropic" and an explicit key (e.g. from a fallback_model
entry with api_key set), the key was silently ignored — _try_anthropic()
always fell back to resolve_anthropic_token(), so the fallback returned
None,None for users without a default Anthropic credential configured.

Fix: add explicit_api_key: str = None to _try_anthropic() and use
explicit_api_key or <pool/env fallback> in both the pool-present and
no-pool paths. Pass explicit_api_key=explicit_api_key at the call site
in resolve_provider_client(). Symmetric with the _try_openrouter() fix.
No behavior change when explicit_api_key is None.
2026-05-03 17:00:55 -07:00
molvikar 74636f9c4a fix(gateway): clear queued reload-skills notes on new/resume/branch 2026-05-03 17:00:31 -07:00
Kenny Wang 222767e5e8 fix: sanitize Telegram help command mentions 2026-05-03 17:00:09 -07:00
konsisumer 6fda92aa7f fix(gateway): bridge top-level require_mention to Telegram config
Users commonly place `require_mention: true` at the top level of
config.yaml alongside `group_sessions_per_user`, expecting it to gate
Telegram group messages. The key was silently ignored because the
config loader only checked `yaml_cfg["telegram"]["require_mention"]`.

When `require_mention` is found at the top level and no telegram-specific
value is set, the fix now:
- adds it to platforms_data["telegram"]["extra"] so _telegram_require_mention()
  picks it up via the primary config.extra path
- sets TELEGRAM_REQUIRE_MENTION env var for the secondary fallback path

A telegram-specific value (telegram.require_mention) still takes
precedence over the top-level shorthand.

Also corrects telegram.md: bare /cmd without @botname is rejected when
require_mention is enabled; only /cmd@botname (bot-menu form) passes.

Fixes #3979
2026-05-03 16:59:46 -07:00
clawbot 1bd975c0ba fix(gateway): suppress duplicate voice transcripts
Deduplicate exact and near-exact Discord voice STT transcripts per guild/user over a short window to avoid duplicate delayed agent replies.

Adds regression tests for exact and near-duplicate voice transcript suppression.
2026-05-03 16:59:21 -07:00
Teknium b58db237e4 fix(kanban): drop worker identity claim from KANBAN_GUIDANCE (#19427)
KANBAN_GUIDANCE layer 3 of the system prompt started with 'You are a
Kanban worker', overriding the profile's SOUL.md identity at layer 1.
Profiles with strict role boundaries (e.g. a reviewer profile that
never writes code) still executed implementation tasks because the
kanban identity claim diluted SOUL's.

Drop the identity line. Layer 3 now describes the task-execution
protocol only; SOUL.md remains the sole identity slot.

Fixes #19351
2026-05-03 16:59:00 -07:00
LeonSGP43 6713274a42 fix(file): strip leaked terminal fences from reads 2026-05-03 16:58:50 -07:00
Alan Chen 2d7543c61f fix(windows): enforce UTF-8 stdout/stderr to prevent UnicodeEncodeError crash
On Windows, services and terminals default to cp1252 encoding. The CLI
uses box-drawing characters (┌│├└─) in banners, doctor output, and
status displays. When print() tries to encode these under cp1252, an
unhandled UnicodeEncodeError crashes the gateway on startup.

This fix adds early UTF-8 enforcement in hermes_cli/__init__.py:
- Sets PYTHONUTF8=1 and PYTHONIOENCODING=utf-8
- Re-opens stdout/stderr with UTF-8 encoding if not already UTF-8

Runs at import time so it protects all CLI subcommands. No effect on
Unix (gated on sys.platform == "win32"). Backwards-compatible: on
systems already using UTF-8, the function is a no-op.

Fixes #10956
2026-05-03 16:58:25 -07:00
Teknium 2ababfe6ed chore(release): map 0xKingBack noreply email 2026-05-03 16:55:16 -07:00
0xKingBack 3c42024539 fix(curator): pass auxiliary curator api_key/base_url into runtime resolution
Curator review fork now forwards per-slot credentials from auxiliary.curator
and legacy curator.auxiliary to resolve_runtime_provider, matching the
canonical aux task schema. Add regression tests for binding and main fallback.
2026-05-03 16:55:16 -07:00
Kiala 3792b77bd1 fix(send_message): support QQBot C2C and group chats
The _send_qqbot function was hardcoded to use the guild channel
endpoint (/channels/{id}/messages), which fails for C2C private
chats and QQ groups with 'channel does not exist' (code 11263).

This change tries the appropriate endpoints in order:
1. /channels/{id}/messages     (guild channels)
2. /v2/users/{id}/messages     (C2C private chats)
3. /v2/groups/{id}/messages    (QQ groups)

Fixes active sending to QQBot C2C and group recipients.
2026-05-03 16:54:39 -07:00
MrBob 86e64c1d3b fix(gateway): hide required-arg commands from Telegram menu 2026-05-03 15:29:06 -07:00
sprmn24 408dd8aa28 fix(compressor): skip non-string tool content in dedup pass to prevent AttributeError 2026-05-03 15:28:30 -07:00
sprmn24 5bd937533c fix(vision): guard user_prompt type in video_analyze_tool before debug_call_data construction 2026-05-03 15:28:04 -07:00
sprmn24 6c4aca7adc fix(vision): guard user_prompt type before debug_call_data construction 2026-05-03 15:27:40 -07:00
Zyproth a5cae16496 fix(api_server): fall back to default port on malformed API_SERVER_PORT 2026-05-03 15:27:03 -07:00
Amit Gaur 65bebb9b80 fix(cli): follow 307 redirects in MiniMax OAuth httpx clients
The MiniMax OAuth API endpoints have moved from api.minimax.io to
account.minimax.io and the old paths now respond with HTTP 307.
httpx defaults to follow_redirects=False (unlike requests), so the
device-code and token-refresh flows fail with "Temporary Redirect".

Adds follow_redirects=True to the two httpx.Client instances in
hermes_cli/auth.py used by the MiniMax OAuth flow. This is forward-
compatible -- if endpoints move again, the redirect chain is
followed automatically.

Repro before patch:
  curl -i -X POST https://api.minimax.io/oauth/code  # -> 307
  curl -i -X POST https://api.minimax.io/oauth/token # -> 307

Verified end-to-end against a real MiniMax Plus account on macOS;
the existing tests/test_minimax_oauth.py suite (15 tests) still
passes.
2026-05-03 15:26:33 -07:00
Zyproth dfdd7b6e6f fix(codex-transport): preserve request override headers for xai responses 2026-05-03 15:25:45 -07:00
LeonSGP43 4a2f822137 fix(mcp): reconnect on terminated sessions 2026-05-03 15:23:33 -07:00
teknium1 2658494e81 fix(kanban): add per-path env overrides + dispatcher env injection
Layers defense-in-depth on top of the shared-root anchoring (base commit).

Changes in hermes_cli/kanban_db.py:
- kanban_db_path() now honours HERMES_KANBAN_DB first, then falls through
  to kanban_home()/kanban.db.
- workspaces_root() now honours HERMES_KANBAN_WORKSPACES_ROOT first, then
  falls through to kanban_home()/kanban/workspaces.
- All three overrides (HERMES_KANBAN_HOME, HERMES_KANBAN_DB,
  HERMES_KANBAN_WORKSPACES_ROOT) now call .expanduser() for consistency.
- _default_spawn() injects HERMES_KANBAN_DB and
  HERMES_KANBAN_WORKSPACES_ROOT into the worker subprocess env. Even
  when the worker's get_default_hermes_root() resolution somehow
  disagrees with the dispatcher's (symlinks, unusual Docker layouts),
  the two processes still open the same SQLite file.

Module docstring updated to describe all three overrides and the
dispatcher env-injection contract.

Tests (tests/hermes_cli/test_kanban_db.py, TestSharedBoardPaths):
- test_hermes_kanban_db_pin_beats_kanban_home
- test_hermes_kanban_workspaces_root_pin_beats_kanban_home
- test_empty_per_path_overrides_fall_through
- test_dispatcher_spawn_injects_kanban_db_and_workspaces_root
  (monkeypatches subprocess.Popen, asserts both env vars reach the
  child even after HERMES_HOME is rewritten by `hermes -p <profile>`.)

Docs: website/docs/reference/environment-variables.md gets entries
for the three kanban env vars.

This fusion is built on the cleanest of the seven competing PRs that
targeted issue #18442:

* Base commit (from PR #19350 by @GodsBoy): add `kanban_home()` helper
  anchored at `get_default_hermes_root()`, reroute all 5 kanban path
  sites through it (including the 3 sibling log-dir sites that the
  other six PRs missed), 8-test regression class.
* Dispatcher env-var injection approach drawn from PRs #18300
  (@quocanh261997) and #19100 (@cg2aigc).
* Per-path env overrides drawn from PR #19100 (@cg2aigc).
* get_default_hermes_root() resolution direction first proposed in
  PR #18503 (@beibi9966) and PR #18985 (@Gosuj).

Closes the duplicate/competing PRs: #18300, #18503, #18670, #18985,
#19037, #19056, #19100. Fixes #18442 and #19348.

Co-authored-by: quocanh261997 <17986614+quocanh261997@users.noreply.github.com>
Co-authored-by: cg2aigc <232694053+cg2aigc@users.noreply.github.com>
Co-authored-by: beibi9966 <beibei1988@proton.me>
Co-authored-by: Gosuj <123411271+Gosuj@users.noreply.github.com>
Co-authored-by: LeonSGP43 <154585401+LeonSGP43@users.noreply.github.com>
2026-05-03 15:13:39 -07:00
GodsBoy f5bd77b3e1 fix(kanban): anchor board, workspaces, and worker logs at the shared Hermes root
The Kanban board is documented as shared across all Hermes profiles, but
`kanban_db_path()` and `workspaces_root()` resolved through `get_hermes_home()`,
which returns the active profile's HERMES_HOME. When the dispatcher spawned a
worker with `hermes -p <profile> --skills kanban-worker chat -q "work kanban
task <id>"`, the worker rewrote HERMES_HOME to the profile subdirectory before
kanban_db.py imported, opening a profile-local `kanban.db` that did not contain
the dispatcher's task. `kanban_show` and `kanban_complete` failed; the
dispatcher's row stayed `running` and was retried/crashed. The same defect
applied to `_default_spawn`'s log directory and `worker_log_path`, so
`hermes kanban tail` did not see the worker's output.

Add `kanban_home()` in `hermes_cli/kanban_db.py` that resolves through
`HERMES_KANBAN_HOME` (explicit override) then `get_default_hermes_root()`,
which already understands the `<root>/profiles/<name>` and Docker / custom
HERMES_HOME shapes. Reroute `kanban_db_path`, `workspaces_root`, the
`_default_spawn` log directory, `gc_worker_logs`, and `worker_log_path`
through it. Profile-specific config, `.env`, memory, and sessions stay
isolated as before; only the kanban surface is shared.

Add a `TestSharedBoardPaths` regression class to `tests/hermes_cli/test_kanban_db.py`
covering: default install, profile-worker convergence, Docker custom HERMES_HOME,
Docker profile layout, explicit `HERMES_KANBAN_HOME` override, and a real
SQLite round-trip across dispatcher and worker HERMES_HOME perspectives.
The dispatcher/worker convergence tests fail on origin/main and pass after
the fix.

Update the `kanban.md` user-guide page and the misleading docstrings in
`kanban_db.py` to describe the shared-root behavior.

Fixes #19348
2026-05-03 15:13:39 -07:00
asheriif 7e780f4832 fix(tui): run plugin slash commands live 2026-05-03 19:42:16 +00:00
Siddharth Balyan 167b5648ea Revert "fix(cli): CLI/TUI on local backend always uses launch directory, ignores terminal.cwd (#19242)" (#19329)
This reverts commit 9eaddfafa3.
2026-05-04 00:43:58 +05:30
Siddharth Balyan 9eaddfafa3 fix(cli): CLI/TUI on local backend always uses launch directory, ignores terminal.cwd (#19242)
CLI/TUI sessions on the local backend now unconditionally use
os.getcwd() as the working directory. The terminal.cwd config value is
only consumed by gateway/cron/delegation modes (where there's no shell
to cd from).

Previously, 'hermes setup' would write an absolute path (e.g. $HOME)
into terminal.cwd which then pinned the CLI to that directory regardless
of where the user launched hermes from. This was a silent foot-gun —
the user's 'cd' was being ignored.

Changes:

1. cli.py: Restructured CWD resolution — if TERMINAL_CWD is not already
   set by the gateway, and the backend is local, always use os.getcwd().
   Config terminal.cwd is irrelevant for interactive CLI/TUI sessions.

2. setup.py: Moved the cwd prompt from setup_terminal_backend() to
   setup_gateway(). It now only appears when configuring messaging
   platforms and is labeled 'Gateway working directory'.

3. Tests: Rewrote test_cwd_env_respect.py to validate the new behavior:
   explicit config paths are ignored for CLI, gateway pre-set values are
   preserved, non-local backends keep their config paths.

4. Docs: Updated configuration.md, profiles.md, and
   environment-variables.md to clarify that terminal.cwd only affects
   gateway/cron mode on local backend.

Closes #19214
2026-05-04 00:14:36 +05:30
GodsBoy b8ae8cc801 fix(debug): redact log content at upload time in hermes debug share
Apply agent.redact.redact_sensitive_text with force=True to log content
captured by _capture_log_snapshot before it reaches upload_to_pastebin.
On-disk logs are untouched. Compatible with the off-by-default local
redaction policy from #16794: this is upload-time-only and applies
regardless of security.redact_secrets because the public paste service
is the leak surface. A visible banner is prepended to each uploaded log
paste so reviewers know redaction was applied. --no-redact preserves
deliberate unredacted sharing for maintainer-coordinated cases.

The bug-report, setup-help, and feature-request issue templates direct
users to run hermes debug share and paste the resulting public URLs.
With redaction off by default per #16794, those uploads have been
carrying credentials onto paste.rs and dpaste.com.

force=True is non-negotiable: without it, redact_sensitive_text
short-circuits at agent/redact.py:322 when the env var is unset, so the
fix would silently be a no-op for its target audience. A regression
test pins this down.

Fixes #19316
2026-05-03 11:42:20 -07:00
Siddharth Balyan c9a3f36f56 feat: add video_analyze tool for native video understanding (#19301)
* feat: add video_analyze tool for native video understanding

Adds a video_analyze tool that sends video files to multimodal LLMs
(e.g. Gemini) for analysis via the OpenRouter-compatible video_url
content type. Mirrors vision_analyze in structure, error handling,
and registration pattern.

Key design:
- Base64 encodes entire video (no frame extraction, no ffmpeg dep)
- Uses 'video_url' content block type (OpenRouter standard)
- Supports mp4, webm, mov, avi, mkv, mpeg formats
- 50 MB hard cap, 20 MB warning threshold
- 180s minimum timeout (videos take longer than images)
- AUXILIARY_VIDEO_MODEL env override, falls back to AUXILIARY_VISION_MODEL
- Same SSRF protection, retry logic, and cleanup as vision_analyze

Default disabled: registered in 'video' toolset (not in _HERMES_CORE_TOOLS).
Users opt in via: hermes tools enable video, or enabled_toolsets=['video'].

* feat(video): add models.dev capability pre-check + CONFIGURABLE_TOOLSETS entry

- Pre-checks model video capability via models.dev modalities.input
  before expensive base64 encoding. Fails early with helpful message
  suggesting video-capable alternatives (gemini, mimo-v2.5-pro).
- Passes optimistically if model unknown or lookup fails.
- Adds ModelInfo.supports_video_input() helper.
- Adds 'video' to CONFIGURABLE_TOOLSETS and _DEFAULT_OFF_TOOLSETS
  so 'hermes tools enable video' works from CLI.
- 8 new tests for the capability check (37 total).

* refactor(video): remove models.dev capability pre-check

Removes _check_video_model_capability and ModelInfo.supports_video_input.
The vision_analyze tool doesn't pre-check image capability either — both
tools rely on the same pattern: send request, handle API errors gracefully
with categorized user-facing messages. The pre-check was inconsistent
(only worked for some providers/models) so drop it for parity.

* cleanup: compress comments, fix fragile timeout coupling

- Replace _VISION_DOWNLOAD_TIMEOUT * 2 with hardcoded 60s (no silent
  breakage if vision timeout changes independently)
- Strip verbose comments and redundant log lines throughout
- No behavioral changes
2026-05-04 00:04:36 +05:30
SHL0MS 0dd8e3f8d8 rename: video-orchestrator → kanban-video-orchestrator
The kanban prefix makes the skill discoverable alongside `kanban-orchestrator`
and `kanban-worker`, and signals up front that this skill drives the kanban
plugin rather than being a generic video tool.

Updated:
- directory rename
- SKILL.md frontmatter `name:` and H1
- setup.sh.tmpl header
2026-05-03 10:26:54 -07:00
SHL0MS 511add7249 feat(skill): add video-orchestrator optional creative skill
Meta-pipeline that wraps any video request — narrative film, product /
marketing, music video, explainer, ASCII, generative, comic, 3D,
real-time/installation — in a Hermes Kanban pipeline. Performs adaptive
discovery, designs an appropriate team for the requested style, generates
the setup script that creates Hermes profiles + initial kanban task, and
helps monitor execution.

Routes scenes to whichever existing Hermes skill fits each beat
(`ascii-video`, `manim-video`, `p5js`, `comfyui`, `touchdesigner-mcp`,
`blender-mcp`, `pixel-art`, `baoyu-comic`, `claude-design`, `excalidraw`,
`songsee`, `heartmula`, …) plus external APIs for TTS, image-gen, and
image-to-video. Kanban orchestration uses the `kanban-orchestrator` and
`kanban-worker` skills.

The single-project workspace layout, profile-config patching pattern,
SOUL.md-per-profile model, and `--workspace dir:<path>` discipline are
adapted from alt-glitch's original kanban-video-pipeline at
https://github.com/NousResearch/kanban-video-pipeline. This skill
generalizes those patterns across video styles and replaces the original
string-replacement config patcher with a PyYAML-based one that touches
only `toolsets` and `skills.always_load` (preserving security-sensitive
fields like `approvals.mode`).

Includes:
- SKILL.md — workflow + critical rules
- references/ — intake, role archetypes, tool matrix, kanban setup,
  monitoring, six worked examples
- assets/ — brief / setup.sh / soul.md templates
- scripts/ — bootstrap_pipeline.py (plan.json -> setup.sh) and
  monitor.py (poll + issue detection)

Co-authored-by: alt-glitch <balyan.sid@gmail.com>
2026-05-03 10:26:54 -07:00
brooklyn! e97a9993b9 Merge pull request #19307 from NousResearch/bb/fix-terminal-resize-jumble
fix(tui): clear Apple Terminal resize artifacts
2026-05-03 10:17:15 -07:00
Brooklyn Nicholson 279b656adc fix(tui): clear Apple Terminal resize artifacts
Use a deeper alt-screen clear for Apple Terminal resize repaints so host reflow artifacts do not survive the recovery frame.
2026-05-03 12:11:24 -05:00
Bartok9 e527240b27 fix(tools): write_file handler now rejects missing 'content'/'path' args instead of silently writing zero-byte files (#19096)
Under context pressure, frontier models sometimes emit tool calls with
required fields dropped. Previously _handle_write_file() used
args.get('content', '') which substituted an empty string for the missing
key, returned success with bytes_written=0, and created a zero-byte file
on disk. The model had no way to detect the failure.

Changes:
- Reject calls where 'path' is absent or not a non-empty string
- Reject calls where 'content' key is entirely absent (key-presence check,
  not truthiness) — distinguishing a legitimately empty file from a dropped arg
- Reject calls where 'content' is a non-string type
- All error messages include guidance to re-emit the tool call or switch
  to execute_code with hermes_tools.write_file() for large payloads
- Explicit empty string content (file truncation) continues to work

Regression tests added for all four cases: missing path, missing content,
explicit-empty content, and wrong content type.

Fixes #19096
2026-05-03 08:52:41 -07:00
Tranquil-Flow 6b4fb9f878 fix(cron): treat non-dict origin as missing instead of crashing tick
``_resolve_origin`` called ``origin.get('platform')`` on whatever
``job.get('origin')`` returned. The leading ``if not origin: return None``
short-circuited the falsy cases (None, empty dict, "") but a non-empty
string passed that guard and then crashed with
``AttributeError: 'str' object has no attribute 'get'`` on every fire
attempt. Observed in the wild after a migration script tagged jobs with
free-form provenance strings (e.g.
``"combined-digest-replaces-x-and-y-20260503"``).

``mark_job_run`` did record ``last_status: error,
last_error: "'str' object has no attribute 'get'"`` once, but the next
tick re-loaded the same poisoned origin and crashed identically. The
job stayed enabled, fired every tick, and accumulated cascading errors
in the log until ``origin`` was patched manually.

Replace the falsy guard with ``isinstance(origin, dict)``. Non-dict
origins (string, int, list, tuple, float — anything that survived a
hand-edit, JSON-script write, or migration) are now treated the same
as a missing origin: the job continues with ``deliver`` falling back
through its normal home-channel path instead of crashing the scheduler
loop.

Test parametrises the non-dict shapes that can appear in jobs.json
through external writers and asserts ``_resolve_origin`` returns None
for each.

Note: this fix scope is the non-dict-``origin`` crash only. The
``next_run_at: null`` recurring-job recovery (the second sub-bug in
#18722) is independently addressed by the in-flight #18825, which
extends the never-silently-disable defense from #16265 to
``get_due_jobs()`` — that approach is well-aligned with the existing
recovery pattern and ships fine without a competing change here.

Fixes #18722 (non-dict origin crash; recurring-job recovery covered by #18825)
2026-05-03 08:51:50 -07:00
JasonOA888 69dd0f7cf1 fix(approval): extend sensitive write target to cover shell RC and credential files
Terminal commands can write to shell RC files (~/.bashrc, ~/.zshrc,
~/.profile) and credential files (~/.netrc, ~/.pgpass, ~/.npmrc,
~/.pypirc) via redirection or tee without triggering approval, even
though write_file already blocks these paths in file_safety.py.

This creates an inconsistency: write_file protects these paths but
terminal shell redirections bypass the same protection. An agent
prompted via indirect injection could install persistent backdoors
(e.g. PATH manipulation, alias overrides) or write credential entries
without user approval.

Extend _SENSITIVE_WRITE_TARGET with two new regex groups matching the
same paths that file_safety.py's WRITE_DENIED_PATHS already covers:
  _SHELL_RC_FILES  — ~/.bashrc, ~/.zshrc, ~/.profile, ~/.bash_profile,
                     ~/.zprofile
  _CREDENTIAL_FILES — ~/.netrc, ~/.pgpass, ~/.npmrc, ~/.pypirc

All 130 existing tests pass.
2026-05-03 08:49:13 -07:00
teknium1 3c59566cc5 chore(release): map leprincep35700 email for PR #18440 salvage 2026-05-03 08:47:49 -07:00
leprincep35700 b59bb4e351 fix(gateway): preserve home-channel thread targets across restart notifications 2026-05-03 08:47:49 -07:00
Teknium d87fd9f039 fix(goals): make /goal work in TUI and fix gateway verdict delivery (#19209)
/goal was silently broken outside the classic CLI.

TUI: /goal was routed through the HermesCLI slash-worker subprocess,
which set the goal row in SessionDB but then called
_pending_input.put(state.goal) — the subprocess has no reader for that
queue, so the kickoff message was discarded. No post-turn judge was
wired into prompt.submit either, so even a manual kickoff would not
continue the goal loop. Intercept /goal in command.dispatch instead,
drive GoalManager directly, and return {type: send, notice, message}
so the TUI client renders the Goal-set notice and fires the kickoff.
Run the judge in _run_prompt_submit after message.complete, surface
the verdict via status.update {kind: goal}, and chain the continuation
turn after the running guard is released.

Gateway: _post_turn_goal_continuation was gated on
hasattr(adapter, 'send_message'), but adapters only expose send().
That branch was dead on every platform — users never saw
'✓ Goal achieved', 'Continuing toward goal', or budget-exhausted
messages. Replace the dead call with adapter.send(chat_id, content,
metadata) and drop a broken reference to self._loop.

Tests:
- tests/tui_gateway/test_goal_command.py — full /goal dispatch matrix
  (set / status / pause / resume / clear / stop / done / whitespace)
  plus regressions for slash.exec → 4018 and 'goal' staying in
  _PENDING_INPUT_COMMANDS.
- tests/gateway/test_goal_verdict_send.py — locks in the adapter.send
  path for done / continue / budget-exhausted and verifies the hook
  no-ops when no goal is set or the adapter lacks send().
2026-05-03 05:49:12 -07:00
Teknium 55647a5813 fix(whatsapp): pin protobufjs >=7.5.5 via npm overrides to clear 3 critical vulns (#19204)
The whatsapp-bridge pulls @whiskeysockets/baileys at a pinned git
commit whose transitive dep tree ships protobufjs <7.5.5, triggering
GHSA-xq3m-2v4x-88gg (critical, arbitrary code execution). npm audit
reported 3 cascading criticals: protobufjs, @whiskeysockets/libsignal-node
(pulls protobufjs), and baileys itself (effect rollup).

Fix: add npm overrides block pinning protobufjs to ^7.5.5. Deduplicates
to a single 7.5.6 copy at node_modules/protobufjs that both libsignal-node
and any other consumers resolve through normal module resolution.

Why not bump baileys: npm-published baileys@6.17.16 is deprecated by the
maintainers (wrong version), 7.0.0-rc.* still pulls the same vulnerable
libsignal-node, and upstream Baileys HEAD adds a 4th vuln (music-metadata).
The override is the minimal, behavior-preserving fix.

Validation:
- npm audit: 3 critical -> 0 vulnerabilities
- node -e "import('@whiskeysockets/baileys')" -> all 5 named exports
  (makeWASocket, useMultiFileAuthState, DisconnectReason,
  fetchLatestBaileysVersion, downloadMediaMessage) resolve
- node bridge.js loads all modules and reaches Express bind
  (exits only on EADDRINUSE because the live gateway owns :3000)
- Single deduped protobufjs@7.5.6 in the tree
2026-05-03 05:22:30 -07:00
kshitijk4poor 6f2dab248a fix: update tests for resume_pending semantics + add AUTHOR_MAP entries
Tests updated to reflect suspend_recently_active now setting
resume_pending=True (preserves session) instead of suspended=True
(wipes session history).

AUTHOR_MAP entries: millerc79 (#19033), shellybotmoyer (#18915)
2026-05-03 03:54:03 -07:00
charliekerfoot 1148c46241 fix(gateway): correct ws scheme conversion for https urls 2026-05-03 03:54:03 -07:00
kshitijk4poor 7a22c639dc chore: add shellybotmoyer to AUTHOR_MAP 2026-05-03 03:54:03 -07:00
Hermes Agent 934103476f fix(gateway): send /new response before cancel_session_processing to avoid race (#18912)
When /new is issued while an agent is actively processing, the confirmation response was never sent to the user because cancel_session_processing() was called before _send_with_retry(). Task cancellation side effects could silently drop the response.

Fix: reorder to send the response BEFORE cancelling the old task. Add logging at the send point (matching the pattern at line 2800 in _process_message_background) so future failures are visible.

Closes: #18912
2026-05-03 03:54:03 -07:00
kshitijk4poor bf3239472f chore: add millerc79 to AUTHOR_MAP 2026-05-03 03:54:03 -07:00
millerc79 f1e0292517 fix(gateway): resume sessions after crash/restart instead of blanket suspend
suspend_recently_active() was unconditionally setting suspended=True on
startup, causing get_or_create_session() to wipe conversation history on
every restart. Change to set resume_pending=True instead, so sessions
auto-resume while still allowing stuck-loop escalation after 3 failures.
2026-05-03 03:54:03 -07:00
kshitijk4poor 0a97ce6bff chore: add nftpoetrist to AUTHOR_MAP 2026-05-03 03:47:49 -07:00
nftpoetrist 6c1322b997 fix(slack): close previous handler in connect() to prevent zombie Socket Mode connections
SlackAdapter.connect() overwrote self._handler, self._app, and
self._socket_mode_task without closing the prior AsyncSocketModeHandler
first. If connect() was called a second time on the same adapter (e.g.
during a gateway restart or in-process reconnect attempt), the old Socket
Mode websocket stayed alive. Both the old and new connections received
every Slack event and dispatched it twice — producing double responses
with different wording, the same bug that affected DiscordAdapter (#18187,
fixed in #18758).

Fix: add a close-before-reassign guard at the start of the connection
setup path, mirroring the guard DiscordAdapter.connect() already has.
When self._handler is None (fresh adapter, first connect()) the block is
a harmless no-op. Scoped to the handler/app fields only — no behavior
change for any path that does not call connect() twice.

Fixes #18980
2026-05-03 03:47:49 -07:00
kshitijk4poor c14bf441a3 chore: add 0xyg3n noreply email to AUTHOR_MAP 2026-05-03 03:44:55 -07:00
0xyg3n 19ba9e43b6 fix(gateway/discord): require allowlist auth on slash commands
Slash commands (_run_simple_slash, _handle_thread_create_slash) bypassed
every DISCORD_ALLOWED_* gate enforced by on_message. Any guild member
could invoke /background (RCE via terminal), /restart, /model, /skill,
etc. CVSS 9.8 Critical.

- _evaluate_slash_authorization mirrors on_message gates (user, role,
  channel, ignored channel) with fail-closed semantics
- _check_slash_authorization sends ephemeral reject + logs + admin alert
- Auth gate runs before defer() so rejections are ephemeral
- /skill autocomplete returns [] for unauthorized users (no catalog leak)
- Component views (ExecApproval, SlashConfirm, UpdatePrompt, ModelPicker)
  now honor role allowlists via shared _component_check_auth helper
- Optional DISCORD_HIDE_SLASH_COMMANDS defense-in-depth
- Cross-platform admin alert (Telegram/Slack fallback) on unauthorized attempts

Based on PR #18125 by @0xyg3n.
2026-05-03 03:44:55 -07:00
kshitijk4poor 5d5b8912be test: add tests for cmd_key preservation through name clamping
- TestClampCommandNamesTriples: unit tests for 3-tuple support in
  _clamp_command_names (short names, long names, collisions, multiple
  entries, backward compat with 2-tuples)
- TestDiscordSkillCmdKeyDispatch: integration test through the full
  discord_skill_commands pipeline verifying long skill names retain
  their original cmd_key after clamping
- Add contributor CharlieKerfoot to AUTHOR_MAP
2026-05-03 03:25:45 -07:00
charliekerfoot c4c0e5abc2 fix: After _clamp_command_names truncates skill names to fit the 32-cha… 2026-05-03 03:25:45 -07:00
kshitij 457c7b76cd feat(openrouter): add response caching support (#19132)
Enable OpenRouter's response caching feature (beta) via X-OpenRouter-Cache
headers. When enabled, identical API requests return cached responses for
free (zero billing), reducing both latency and cost.

Configuration via config.yaml:
  openrouter:
    response_cache: true       # default: on
    response_cache_ttl: 300    # 1-86400 seconds

Changes:
- Add openrouter config section to DEFAULT_CONFIG (response_cache + TTL)
- Add build_or_headers() in auxiliary_client.py that builds attribution
  headers plus optional cache headers based on config
- Replace inline _OR_HEADERS dicts with build_or_headers() at all 5 sites:
  run_agent.py __init__, _apply_client_headers_for_base_url(), and
  auxiliary_client.py _try_openrouter() + _to_async_client()
- Add _check_openrouter_cache_status() method to AIAgent that reads
  X-OpenRouter-Cache-Status from streaming response headers and logs
  HIT/MISS status
- Document in cli-config.yaml.example
- Add 28 tests (22 unit + 6 integration)

Ref: https://openrouter.ai/docs/guides/features/response-caching
2026-05-03 01:54:24 -07:00
Teknium 9b5b88b5e0 chore: add MottledShadow to AUTHOR_MAP 2026-05-03 01:51:33 -07:00
MottledShadow a22465e07a fix(weixin): send_weixin_direct cross-loop session check
When send_message tool is called from inside a running gateway, the
_run_async bridge spawns a worker thread with a separate event loop.
send_weixin_direct then reuses the live adapter's aiohttp session
which was created on the gateway's main loop.  aiohttp's TimerContext
checks asyncio.current_task(loop=session._loop) and sees None because
we're executing on the worker thread's loop → raises 'Timeout context
manager should be used inside a task'.

Fix: skip the live-adapter shortcut when the session belongs to a
different event loop, falling through to the fresh-session path.
2026-05-03 01:51:33 -07:00
Henkey 9987f3d824 fix(acp): compact Zed tool replay rendering 2026-05-03 01:44:23 -07:00
Henkey 19854c7cd2 Schedule ACP history replay and fence file output 2026-05-03 01:44:23 -07:00
Henkey eb612f5574 fix(acp): keep web extract rendering compact 2026-05-03 01:44:23 -07:00
Henkey b294d1d022 fix(acp): keep read-file starts compact 2026-05-03 01:44:23 -07:00
Henkey 72c8037a24 fix(acp): polish common tool rendering 2026-05-03 01:44:23 -07:00
Henkey ef9a08a872 fix(acp): polish Zed context and tool rendering 2026-05-03 01:44:23 -07:00
Henkey e26f9b2070 fix(acp): route Zed thoughts to reasoning callbacks 2026-05-03 01:44:23 -07:00
helix4u 4f37669170 fix(tools): reconfigure enabled unconfigured toolsets 2026-05-03 00:33:02 -07:00
helix4u d409a4409c fix(model): avoid bedrock credential probe in provider picker 2026-05-03 00:32:55 -07:00
Siddharth Balyan 5d3be898a8 docs(tts): mention xAI custom voice support (#18776)
Point users to xAI's custom voices feature — clone your voice in the
console, paste the voice_id into tts.xai.voice_id. No code changes
needed; the existing TTS pipeline already handles arbitrary voice IDs.

- config.py: link to xAI custom voices docs in voice_id comment
- setup.py: prompt accepts custom voice IDs during xAI TTS setup
- tts.md: short section linking to xAI console and docs
2026-05-02 16:08:01 +05:30
liuhao1024 af98122793 fix(auxiliary): propagate explicit_api_key to _try_openrouter()
When resolve_provider_client() passes explicit_api_key for OpenRouter auxiliary
tasks, _try_openrouter() now accepts and honors this parameter instead of
silently ignoring it and falling back to OPENROUTER_API_KEY env var.

Root cause: _try_openrouter() had no explicit_api_key parameter, so even
when callers wanted to pass a runtime credential pool key, it could not be used.

Fix:
- Add explicit_api_key: str = None parameter to _try_openrouter()
- Prioritize explicit_api_key over pool key and env var
- Update resolve_provider_client() call site to pass explicit_api_key

Regression coverage:
- Test that explicit_api_key is passed to OpenAI client when provided
- Test that fallback to OPENROUTER_API_KEY still works when explicit_api_key is None

Closes #18338
2026-05-02 02:27:49 -07:00
teknium1 73bcd83dba chore(release): map beibi9966 email for AUTHOR_MAP
Follow-up for PR #18502 salvage.
2026-05-02 02:23:37 -07:00
teknium1 762eb79f1e fix(gateway): tighten httpx keepalive and close whatsapp typing-response leak (#18451)
Two mitigations for the CLOSE_WAIT accumulation reported against QQ Bot
+ Feishu on macOS behind Cloudflare Warp.

1. Shared httpx.Limits helper (gateway/platforms/_http_client_limits.py).
   Every long-lived platform adapter now constructs httpx.AsyncClient
   with max_keepalive_connections=10 and keepalive_expiry=2.0, vs httpx's
   default of unbounded keepalive pool and 5.0s expiry. On macOS/Warp the
   default 5s window let idle keepalive sockets sit in CLOSE_WAIT long
   enough for seven persistent adapters (QQ Bot, WeCom, DingTalk, Signal,
   BlueBubbles, WeCom-callback, plus the transient Feishu helper) to
   compound to the 256-fd ulimit. Tunable via
   HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY and
   HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE env vars.

2. whatsapp.send_typing aiohttp leak. The call was
   'await self._http_session.post(...)' with no 'async with' and no
   variable capture — the ClientResponse went out of scope unclosed,
   holding its TCP socket in CLOSE_WAIT until GC. Fixed by wrapping in
   'async with'. This was the only bare-await aiohttp leak in the
   gateway/tools/plugins tree per audit; all other aiohttp sites use
   the context-manager pattern correctly.

The underlying reporter also saw Feishu SDK (lark-oapi) connections in
CLOSE_WAIT — those are inside the SDK and out of our direct control, but
tightening httpx keepalive across adapters reduces the aggregate pool
pressure regardless of which individual adapter leaks.
2026-05-02 02:23:37 -07:00
beibi9966 38dd057e91 fix(feishu): finalize remote document downloads inside httpx.AsyncClient context (#18502)
Snapshot Content-Type and body while the client context is still
active so pooled connections fully release on exit. Previously the
read happened after `async with httpx.AsyncClient(...)` returned —
which works today only because httpx eagerly buffers non-streaming
responses; a future refactor to `.stream()` would silently read-
after-close.

Part of the #18451 connection-hygiene audit. Salvage of #18502.
2026-05-02 02:23:37 -07:00
Teknium e444d8f29c fix(gateway): config.yaml wins over .env for agent/display/timezone settings (#18764)
Regression from the silent config→env bridge. The bridge at module import
time is correct for max_turns (unconditional overwrite), but every other
agent.*, display.*, timezone, and security bridge key was guarded by
'if X not in os.environ' — so a stale .env entry from an old 'hermes setup'
run would shadow the user's current config.yaml indefinitely.

Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60
in .env from an old setup, and the gateway silently capped at 60
iterations per turn. Gateway logs confirmed api_calls never exceeded 60.

Three changes:

1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*,
   display.*, timezone, and security.* bridge keys. config.yaml is now
   authoritative for these settings — same semantics already in place
   for max_turns, terminal.*, and auxiliary.*. Also surface the bridge
   failure (previously 'except Exception: pass') to stderr so operators
   see bridge errors instead of silently falling back to .env.

2. gateway/run.py: INFO-log the resolved max_iterations at gateway
   start so operators can verify the config→env bridge did the right
   thing instead of chasing a phantom budget ceiling.

3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in
   the setup wizard. config.yaml is the single source of truth. Also
   clean up any stale .env entry left behind by pre-fix setups.

Regression tests in tests/gateway/test_config_env_bridge_authority.py
guard each config→env key against the 'stale .env shadows config' bug.
2026-05-02 02:14:35 -07:00
luyao618 13f344c5ce fix(agent): try fallback providers at init when primary credential pool is exhausted (#17929)
When a provider's credential pool has a single entry in 429-cooldown,
resolve_provider_client returns None and AIAgent.__init__ raises a
misleading RuntimeError suggesting the API key is missing — even when
valid fallback_providers are configured.

This patch makes __init__ iterate the fallback chain before raising,
mirroring the existing in-flight fallback logic in the request loop.
If a fallback resolves, the agent initializes against it and sets
_fallback_activated=True so _restore_primary_runtime can pick the
primary back up after cooldown.

Closes #17929
2026-05-02 02:09:46 -07:00
Teknium 1dce908930 fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log) (#18761)
* fix(gateway): config.yaml wins over .env for agent/display/timezone settings

Regression from the silent config→env bridge. The bridge at module import
time is correct for max_turns (unconditional overwrite), but every other
agent.*, display.*, timezone, and security bridge key was guarded by
'if X not in os.environ' — so a stale .env entry from an old 'hermes setup'
run would shadow the user's current config.yaml indefinitely.

Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60
in .env from an old setup, and the gateway silently capped at 60
iterations per turn. Gateway logs confirmed api_calls never exceeded 60.

Three changes:

1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*,
   display.*, timezone, and security.* bridge keys. config.yaml is now
   authoritative for these settings — same semantics already in place
   for max_turns, terminal.*, and auxiliary.*. Also surface the bridge
   failure (previously 'except Exception: pass') to stderr so operators
   see bridge errors instead of silently falling back to .env.

2. gateway/run.py: INFO-log the resolved max_iterations at gateway
   start so operators can verify the config→env bridge did the right
   thing instead of chasing a phantom budget ceiling.

3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in
   the setup wizard. config.yaml is the single source of truth. Also
   clean up any stale .env entry left behind by pre-fix setups.

Regression tests in tests/gateway/test_config_env_bridge_authority.py
guard each config→env key against the 'stale .env shadows config' bug.

* fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log)

Three issues observed in production gateway.log during a rapid restart
chain on 2026-05-02, all fixed here.

1. _send_restart_notification logged unconditional success
   adapter.send() catches provider errors (e.g. Telegram 'Chat not found')
   and returns SendResult(success=False); it never raises. The caller
   ignored the return value and always logged 'Sent restart notification
   to <chat>' at INFO, producing a misleading success line directly
   below the 'Failed to send Telegram message' traceback on every boot.
   Now inspects result.success and logs WARNING with the error otherwise.

2. WhatsApp bridge SIGTERM on shutdown classified as fatal error
   _check_managed_bridge_exit() saw the bridge's returncode -15 (our own
   SIGTERM from disconnect()) and fired the full fatal-error path,
   producing 'ERROR ... WhatsApp bridge process exited unexpectedly' plus
   'Fatal whatsapp adapter error (whatsapp_bridge_exited)' on every
   planned shutdown, immediately before the normal '✓ whatsapp
   disconnected'. Adds a _shutting_down flag that disconnect() sets
   before the terminate, and _check_managed_bridge_exit() returns None
   for returncode in {0, -2, -15} while shutting down. OOM-kill (137)
   and other non-signal exits still hit the fatal path.

3. restart_drain_timeout default 60s → 180s
   On 2026-05-02 01:43:27 a user /restart fired while three agents were
   mid-API-call (82s, 112s, 154s into their turns). The 60s drain budget
   expired and all three were force-interrupted. 180s covers realistic
   in-flight agent turns; users on very-long-reasoning models can still
   raise it further via agent.restart_drain_timeout in config.yaml.
   Existing explicit user values are preserved by deep-merge.

Tests
- tests/gateway/test_restart_notification.py: two new tests assert INFO
  is only logged on SendResult(success=True) and WARNING with the error
  string is logged on SendResult(success=False).
- tests/gateway/test_whatsapp_connect.py: parametrized test for
  returncode in {0, -2, -15} proves shutdown-time exits are suppressed;
  separate test proves returncode 137 (SIGKILL/OOM) still surfaces as
  fatal even when _shutting_down is set.
- _check_managed_bridge_exit() reads _shutting_down via getattr-with-
  default so existing _make_adapter() test helpers that bypass __init__
  (pitfall #17 in AGENTS.md) keep working unmodified.
2026-05-02 02:08:06 -07:00
teknium1 50f9f389ec chore(release): map ambition0802 email for AUTHOR_MAP
Follow-up for PR #17939 salvage.
2026-05-02 02:07:14 -07:00
ambition0802 7696ddc59e fix(cli): robust paste file expansion and process_loop error handling (#17666)
Two narrow fixes for long pasted messages silently disappearing:

1. _expand_paste_references: replace path.exists() + read_text() with
   try/except (OSError, IOError). Closes the TOCTOU window where a paste
   file deleted between check and read raised FileNotFoundError, bubbled
   up through process_loop's outer except, and silently dropped the
   user's input. Failures now return the placeholder text and log a
   warning.

2. process_loop outer except: logger.warning() instead of print().
   prompt_toolkit's TUI swallows stdout, so 'Error: …' was invisible
   to the user. Logged errors are discoverable via hermes logs.

Dropped the larger interrupt_queue→pending_input drain that was part of
the original PR — that's a separate class of input-drop (in-progress
interrupt handling) unrelated to the paste-file TOCTOU reported in the
issue, and worth its own review.

Salvage of #17939.
2026-05-02 02:07:14 -07:00
Teknium 5eac6084bc fix(discord): warn on 32-char clamp collisions in the /skill collector (#18759)
Discord's per-command name limit is 32 chars. When two skill slugs
share the same first 32 chars (or a skill slug clamps onto a reserved
gateway command name), only the first seen wins — the second is
dropped from the /skill autocomplete. The old behavior incremented a
``hidden`` counter silently, so skill authors had no way to discover
the drop short of noticing their skill was missing from the picker.

Not an actively-biting bug today (no collisions on the default catalog
as of 2026-05), but a landmine the moment someone ships a skill with a
long name. The earlier series in #18745 / #18753 / #18754 dropped the
other silent data-loss paths in the Discord /skill collector; this one
lights up the last remaining one.

Fix: promote ``_names_used`` from a set to a dict keyed by the clamped
name, mapping to the source cmd_key (or a ``"<reserved>"`` sentinel
for names inherited via ``reserved_names``). On collision, log a
WARNING naming both sides — the winner, the loser, the clamped name,
and what to rename.

Two phrasings:

* skill-vs-skill — "both clamp to X on Discord's 32-char command-name
  limit; only the winner appears in /skill. Rename one skill's
  frontmatter ``name:`` to differ in its first 32 chars."
* skill-vs-reserved — "collides with a reserved gateway command name;
  the skill will not appear in /skill. Rename the skill's frontmatter
  ``name:``."

Tests: three cases in
``tests/hermes_cli/test_discord_skill_clamp_warning.py`` —
skill-vs-skill collision (warning names both cmd_keys + clamped prefix),
skill-vs-reserved collision (warning uses the distinct phrasing), and a
no-collision negative (zero warnings emitted).
2026-05-02 02:05:01 -07:00
teknium1 e363ced3c3 test(discord): regression coverage for zombie-websocket guard in connect()
Covers PR #18224 fix for issue #18187 — when DiscordAdapter.connect() is
called a second time without an intervening disconnect(), the previous
commands.Bot must be closed before a new one is created. Otherwise both
websockets stay connected to Discord's gateway and both fire on_message,
producing double responses with different wording.
2026-05-02 02:04:14 -07:00
luyao618 292d2fb42f fix(discord): close old client before reconnect to prevent zombie websockets (#18187)
When DiscordAdapter.connect() is called during reconnect, it creates a new
commands.Bot client without closing the previous one. The old client's
websocket remains connected to Discord's gateway, causing both to fire
on_message for every incoming event — resulting in double responses.

Fix: before creating a new Bot instance, check if a previous client exists
and close it. This ensures only one websocket connection is active at any
time.

Closes #18187
2026-05-02 02:04:14 -07:00
teknium1 0a6865b328 test(credential_pool): regression coverage for .env vs os.environ precedence
Covers PR #18256 fix for issue #18254 — when OPENROUTER_API_KEY is set in
BOTH os.environ (stale from parent shell) and ~/.hermes/.env (fresh),
_seed_from_env must prefer the .env value. Also guards the fallback case
where .env omits the key entirely (Docker/K8s/systemd deployments that
only inject via runtime env).
2026-05-02 02:00:32 -07:00
teknium1 9c626ef8ea chore(release): map franksong2702 email for AUTHOR_MAP
Follow-up for PR #18256 salvage.
2026-05-02 02:00:32 -07:00
Frank Song 2ef1ad280b fix: prefer ~/.hermes/.env over os.environ when seeding credential pool
When _seed_from_env() reads API keys to populate the credential pool, it
should treat ~/.hermes/.env as the authoritative source — not os.environ.
Stale env vars inherited from parent shell processes (Codex CLI, test
scripts, etc.) can shadow deliberate changes to the .env file, causing
auth.json to cache an outdated key that leads to silent 401 errors.

This is especially visible with OpenRouter: if a parent process exported
OPENROUTER_API_KEY=test-key-fresh and the user later updates .env with a
valid key, restarting Hermes still picks up the stale os.environ value,
writes it back to auth.json, and all API calls fail with 401.

Fixes #18254
2026-05-02 02:00:32 -07:00
Teknium 10297fa23c fix(discord): /reload-skills now refreshes the /skill autocomplete live (#18754)
`_register_skill_group` captured the skill catalog in closure variables
(`entries` and `skill_lookup`) so the single `tree.add_command` call at
startup owned the only live copy. The closure is never re-entered after
startup, so `/reload-skills` — which rescans the on-disk skills dir and
refreshes the in-process `_skill_commands` registry — had no way to
propagate results into the `/skill` autocomplete on Discord. New skills
stayed invisible in the dropdown, and deleted skills returned
"Unknown skill" when the stale autocomplete entry was clicked.

The fix is purely a dataflow change: promote `entries` and `skill_lookup`
to instance attributes (`_skill_entries`, `_skill_lookup`), split the
collector-driven rebuild into a helper (`_refresh_skill_catalog_state`),
and add a public `refresh_skill_group()` method that re-runs the helper
and is safe to call at any point after the initial registration.

The gateway's `_handle_reload_skills_command` then iterates
`self.adapters` and calls `refresh_skill_group()` on any adapter that
exposes it (currently only Discord). Both sync and async implementations
are supported; adapters that don't override the method (Telegram's
BotCommand menu, Slack subcommand map, etc.) are silently skipped — the
in-process `reload_skills()` call covers them.

No `tree.sync()` is required because Discord fetches autocomplete
options dynamically on every keystroke — mutating the instance state the
callbacks already read from is sufficient. That sidesteps the per-app
command-bucket rate limit (~5 writes / 20 s) that made the previous
bulk-sync-on-reload approach unusable (#16713 context).

Tests: tests/gateway/test_reload_skills_discord_resync.py — five cases
covering (1) refresh replaces entries, (2) entries stay sorted after
refresh, (3) collector exception leaves cached state intact, (4)
`_refresh_skill_catalog_state` populates the instance attrs, (5)
orchestrator calls `refresh_skill_group()` on sync + async adapters and
skips adapters that don't expose it.
2026-05-02 02:00:11 -07:00
Teknium 6ec74aec07 fix(gateway): match disabled/optional skills by frontmatter slug, not dir name (#18753)
_check_unavailable_skill is meant to turn a typed "/foo" command that
doesn't resolve into a specific hint — "disabled, enable with hermes
skills config" or "available but not installed, install with hermes
skills install …" — instead of the generic "unknown command" reply.

It was doing the match with `skill_md.parent.name.lower().replace("_", "-")`,
comparing that to the typed command. For every skill whose directory name
drifted from its declared frontmatter `name:`, that comparison failed and
the user got the unhelpful generic path. On a standard install today 19
skills have this drift, e.g.:

  dir: mlops/stable-diffusion
  frontmatter: name: Stable Diffusion Image Generation
  registered slug (what the user types): /stable-diffusion-image-generation

  dir: mlops/qdrant
  frontmatter: name: Qdrant Vector Search
  registered slug: /qdrant-vector-search

  dir: mlops/flash-attention
  frontmatter: name: Optimizing Attention Flash
  registered slug: /optimizing-attention-flash

In every case, _check_unavailable_skill would fall through because
"stable-diffusion" != "stable-diffusion-image-generation", even with the
skill sitting right there on disk.

Fix: extract a small `_skill_slug_from_frontmatter` helper that reads the
SKILL.md frontmatter and normalizes exactly like scan_skill_commands
(lower, spaces/underscores → hyphens, strip non-[a-z0-9-], collapse
runs of hyphens, strip edges). Use it in both the
disabled-skills branch and the optional-skills branch. The disabled-set
membership check now uses the declared frontmatter name (which is what
`hermes skills config` writes into skills.disabled / platform_disabled),
not the slug.

Tests: five cases in tests/gateway/test_unavailable_skill_hint.py —
the drift case for the disabled branch, unknown-command negative,
matched-but-not-disabled negative, non-alnum stripping, and the drift
case for the optional-skills branch. All five fail against main and
pass with the fix.
2026-05-02 02:00:09 -07:00
Teknium 8825e9044c fix(discord): complete #18741 for /skill autocomplete and drop legacy 25x25 caps (#18745)
``discord_skill_commands_by_category`` was lagging the flat
``discord_skill_commands`` collector on two counts. Both were actively
dropping skills from Discord's ``/skill`` autocomplete dropdown.

1. External-dir skills were filtered out. #18741 widened the flat
   collector to accept ``SKILLS_DIR + skills.external_dirs`` but left
   this sibling collector — the one ``_register_skill_group`` actually
   uses on Discord — still matching ``SKILLS_DIR`` only. External
   skills were visible in ``hermes skills list`` and the agent's
   ``/skill-name`` dispatch but silently absent from Discord's
   ``/skill`` picker. Widen the accepted roots to match, and derive
   categories from whichever root the skill lives under so
   ``<ext>/mlops/foo/SKILL.md`` still lands in the ``mlops`` group.

2. 25-group × 25-subcommand caps were still applied. PR #11580
   refactored ``/skill`` to a flat autocomplete (whose options Discord
   fetches dynamically — no per-command payload concern) and its
   docstring promises "no hidden skills." The collector kept the old
   nested-layout caps anyway, silently dropping anything past the 25th
   alphabetical category. On installs with 29 category dirs today (real
   example: tail categories ``social-media``, ``software-development``,
   ``yuanbao`` going missing) this was biting immediately. Remove the
   caps; ``hidden`` now reports only 32-char name-clamp collisions
   against reserved names.

Tests: guard both behaviors. ``test_no_legacy_25x25_cap`` builds 30
categories × 30 skills each and asserts all 900 are returned.
``test_external_dirs_skills_included`` monkeypatches
``get_external_skills_dirs`` and asserts an external-dir skill makes
it into the result grouped under its own top-level directory.
2026-05-02 02:00:06 -07:00
Jacob Lizarraga 2470434d60 fix(telegram): probe polling liveness after reconnect to detect wedged Updater
After a transient Telegram 502, _handle_polling_network_error's
stop()+start_polling() cycle can leave PTB's Updater with `running=True`
but a wedged consumer task that never makes progress. No error_callback
fires in that state, so the reconnect ladder never advances past attempt
1, the MAX_NETWORK_RETRIES fatal-error path is never reached, and the
gateway sits silent indefinitely.

Schedule a heartbeat probe (60s after a successful reconnect) that
verifies Updater.running is still True and bot.get_me() responds within
a tight asyncio.wait_for timeout. Either failure feeds back into the
reconnect ladder so the existing escalation path fires.

No PTB-internal coupling, no Application rebuild — minimal additive
defense inside the existing reconnect abstraction.

Tests cover healthy / Updater non-running / probe timeout / probe
network error / already-fatal cases, plus an integration check that the
probe is actually scheduled after a successful start_polling().

Closes the silent-wedge case observed in the wild after a transient
Telegram 502; existing reconnect tests updated to mock bot.get_me() now
that the success path schedules a heartbeat probe.
2026-05-02 01:55:04 -07:00
liuhao1024 9bf260472b fix(tools): deduplicate tool names at API boundary for Vertex/Azure/Bedrock
Providers like Google Vertex, Azure, and Amazon Bedrock reject API
requests with duplicate tool names (HTTP 400: 'Tool names must be
unique').  The upstream injection paths in run_agent.py already dedup
after PR #17335, but two API-boundary functions pass tools through
without checking:

- agent/auxiliary_client.py: _build_call_kwargs() (all non-Anthropic
  providers in chat_completions mode)
- agent/anthropic_adapter.py: convert_tools_to_anthropic() (Anthropic
  Messages API path)

Add defensive dedup guards at both sites.  Duplicates are dropped with
a warning log, converting a hard 400 failure into a recoverable
condition.  This is intentionally conservative — the root-cause dedup
in run_agent.py is the primary defense; these guards add resilience
against future injection-path regressions.

Includes 8 new tests covering unique passthrough, duplicate removal,
empty/None edge cases.

Closes #18478
2026-05-02 01:51:51 -07:00
Teknium 699b3679bc fix(constants): warn once when get_hermes_home() falls back under an active profile (#18746)
When HERMES_HOME is unset but ~/.hermes/active_profile names a non-default
profile, any data this process writes lands in the default profile — not the
one the operator expects. Before this change the fallback was silent, so
cross-profile contamination (#18594) was invisible until a user noticed
their memory/state ended up in the wrong place.

Now we emit a one-shot warning to stderr the first time this happens in
a process. No raise — there are 30+ module-level callers of get_hermes_home()
and raising from any of them would brick import. Behavior is otherwise
unchanged; subprocess spawners (systemd template, kanban dispatcher, docker
entrypoint) already propagate HERMES_HOME correctly.

Bypasses logging.getLogger() because this runs before logging is configured
in a significant fraction of callers (module import time).

Refs #18594. Credit to @liuhao1024 for surfacing the silent-fallback case
in PR #18600; we kept the diagnostic signal without the import-time raise.
2026-05-02 01:49:55 -07:00
teknium1 98c98821ff chore(release): map CoreyNoDream email for AUTHOR_MAP
Follow-up for PR #18721 salvage.
2026-05-02 01:40:31 -07:00
CoreyNoDream c5e3a6fb5b fix(cli): decode .env as UTF-8 to avoid GBK crash on Windows
Path.read_text() uses the system locale by default. On Windows CN/JP/KR
locales (GBK/CP932/CP949), reading a UTF-8 .env raises UnicodeDecodeError
as soon as it contains any non-ASCII byte (e.g. an em dash).

Pin encoding="utf-8" on every .env read in hermes_cli to match how the
rest of the codebase (load_dotenv at doctor.py:26) already decodes it.

Adds a regression test that monkeypatches Path.read_text to simulate a
GBK locale and asserts 'hermes doctor' no longer raises.

Refs #18637
2026-05-02 01:40:31 -07:00
Teknium e2cea6eeba fix(gateway): include external_dirs skills in Telegram/Discord slash commands (#18741)
Skills configured through `skills.external_dirs` in config.yaml were
visible via `hermes skills list`, `get_skill_commands()`, and the
agent's `/skill-name` dispatch, but silently excluded from the
Telegram and Discord slash-command menus. The filter in
`_collect_gateway_skill_entries` only accepted skills whose
`skill_md_path` started with `SKILLS_DIR`, so anything under an
external directory fell through.

Widen the accepted-prefix set to include all configured external
dirs alongside the local skills dir. Every prefix is now
slash-terminated so `/my-skills` cannot also admit
`/my-skills-extra`. Also guard against empty `skill_md_path`
values so they can't accidentally match.

Fixes #8110

Salvages #8790 by luyao618.

Co-authored-by: Yao <34041715+luyao618@users.noreply.github.com>
2026-05-02 01:36:57 -07:00
Teknium c73594fe41 fix(skills): rescan skill_commands cache when platform scope changes (#18739)
The process-global `_skill_commands` dict in agent/skill_commands.py
was seeded by whichever platform scanned first, and
`get_skill_commands()` only rescanned when the cache was empty. In a
long-lived gateway process serving multiple platforms (Telegram +
Discord + Slack), the first platform's
`skills.platform_disabled` view was silently inherited by the
others — so a skill disabled for Telegram would also disappear from
Discord's slash menu, and vice versa.

Track the platform scope the cache was populated for
(`_skill_commands_platform`) and rescan in `get_skill_commands()`
when the currently-active platform no longer matches. Platform
resolution uses the same precedence as `_is_skill_disabled`:
`HERMES_PLATFORM` env var then `HERMES_SESSION_PLATFORM` from the
gateway session context.

Fixes #14536

Salvages #14570 by LeonSGP43.

Co-authored-by: LeonSGP <leon@sgp43.com>
2026-05-02 01:36:53 -07:00
Teknium 97acd66b4c fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671) (#18731)
* fix(curator): authoritative absorbed_into declarations on skill delete

Closes #18671. The classification pipeline that feeds cron-ref rewriting
used to infer consolidation vs pruning from two brittle signals: the
curator model's post-hoc YAML summary block, and a substring heuristic
scanning other tool calls for the removed skill's name. Both miss in
real consolidations — the model forgets the YAML under reasoning
pressure, and the heuristic misses when the umbrella's patch content
describes the absorbed behavior abstractly instead of naming the old
slug. When both miss, the skill falls through to 'no-evidence fallback'
pruned, and #18253's cron rewriter drops the cron ref entirely instead
of mapping it to the umbrella. Same observable symptom as pre-#18253:
'Skill(s) not found and skipped' at the next cron run.

The fix makes the model declare intent at the moment of deletion.
skill_manage(action='delete') now accepts absorbed_into:
  - absorbed_into='<umbrella>'  -> consolidated, target must exist on disk
  - absorbed_into=''            -> explicit prune, no forwarding target
  - missing                     -> legacy path, falls through to heuristic/YAML

The curator reconciler reads these declarations off llm_meta.tool_calls
BEFORE either the YAML block or the substring heuristic. Declaration
wins. Fallback logic stays intact for backward compat with any caller
(human or older curator conversation) that doesn't populate the arg.

Changes
- tools/skill_manager_tool.py: add absorbed_into param to skill_manage
  + _delete_skill. Validate target exists when non-empty. Reject
  absorbed_into=<self>. Wire through dispatcher + registry + schema.
- agent/curator.py: new _extract_absorbed_into_declarations() walks
  tool calls for skill_manage(delete) with the arg. _reconcile_classification
  accepts absorbed_declarations= and treats them as authoritative. Curator
  prompt updated to require the arg on every delete.
- Tests: 7 new skill_manager tests covering the tool contract (valid
  target, empty string, nonexistent target, self-reference, whitespace,
  backward compat, dispatcher plumbing). 11 new curator tests covering
  the extractor + authoritative reconciler path + mixed-legacy-and-
  declared runs.

Validation
- 307/307 targeted tests pass (curator + cron + skill_manager suites).
- E2E #18671 repro: 3 narrow skills, 1 umbrella, cron job referencing
  all 3. Model emits NO YAML block. Heuristic misses (patch prose
  doesn't name old slugs). Delete calls carry absorbed_into. Result:
  both PR skills correctly classified 'consolidated' + cron rewritten
  ['pr-review-format', 'pr-review-checklist', 'stale-junk'] ->
  ['hermes-agent-dev']; stale-junk pruned via absorbed_into=''.
- E2E backward-compat: delete without absorbed_into, model emits YAML
  -> routed via existing 'model' source, cron still rewritten correctly.

* feat(curator): capture + restore cron skill links across snapshot/rollback

Before this, rolling back a curator run restored the skills tree but cron
jobs still pointed at the umbrella skills the curator had rewritten them
to. The user would see their old narrow skills back on disk but their
cron jobs still configured with the merged umbrella — not actually 'back
to how it was'.

Snapshot side: snapshot_skills() now captures ~/.hermes/cron/jobs.json
alongside the skills tarball, as cron-jobs.json. The manifest gets a new
'cron_jobs' block with {backed_up, jobs_count} so rollback (and the CLI
confirm dialog) can surface what's in the snapshot. If jobs.json is
missing/unreadable/malformed, snapshot proceeds without cron data — the
skills backup is the core guarantee; cron is additive.

Rollback side: after the skills extract succeeds, the new
_restore_cron_skill_links() reconciles the backed-up jobs into the live
jobs.json SURGICALLY. Only 'skills' and 'skill' fields are restored, and
only on jobs matched by id. Everything else about a cron job — schedule,
last_run_at, next_run_at, enabled, prompt, workdir, hooks — is live
state the user or scheduler has modified since the snapshot; overwriting
it would regress unrelated activity.

Reconciliation rules:
- Job in backup AND live, skills differ  → skills restored.
- Job in backup AND live, skills match   → no-op.
- Job in backup, NOT in live             → skipped (user deleted it
                                              after snapshot; their choice
                                              is later than the snapshot).
- Job in live, NOT in backup             → untouched (user created it
                                              after snapshot).
- Snapshot missing cron-jobs.json at all → rollback still succeeds,
                                              reports 'not captured'
                                              (older pre-feature snapshots
                                              keep working).

Writes go through cron.jobs.save_jobs under the same _jobs_file_lock the
scheduler uses, so rollback doesn't race tick().

Also:
- hermes_cli/curator.py: rollback confirm dialog now shows
  'cron jobs: N (will be restored for skill-link fields only)' when the
  snapshot has cron data, or 'not in snapshot (<reason>)' otherwise.
- rollback()'s message string includes a 'cron links: ...' clause
  summarizing the reconciliation outcome.

Tests
- 9 new cases: snapshot-with-cron, snapshot-without-cron, malformed-json
  captured-as-raw, full rollback-restores-skills-and-cron, rollback
  touches only skill fields, rollback skips user-deleted jobs, rollback
  leaves user-created jobs untouched, rollback still works with
  pre-feature snapshot that has no cron-jobs.json, standalone unit test
  on _restore_cron_skill_links exercising the full report shape.

Validation
- 484/484 targeted tests pass (curator + cron + skill_manager suites).
- E2E: real snapshot_skills, real cron rewrite, real rollback. Before:
  ['pr-review-format', 'pr-review-checklist', 'pr-triage-salvage'].
  After curator: ['hermes-agent-dev']. After rollback: ['pr-review-format',
  'pr-review-checklist', 'pr-triage-salvage']. Non-skill fields (id,
  name, prompt) preserved across the round trip.
2026-05-02 01:29:57 -07:00
Siddharth Balyan f98b5d00a4 fix: gateway systemd unit now retries indefinitely with backoff (#18639)
The old defaults (StartLimitIntervalSec=600, StartLimitBurst=5,
RestartSec=30) meant any network outage over ~5 minutes would
permanently kill the gateway until manual intervention.

Changes:
- StartLimitIntervalSec=0 (never give up)
- Restart=always (not just on-failure)
- RestartSec=60 with RestartMaxDelaySec=300, RestartSteps=5
  (exponential backoff: 60 → 120 → 180 → 240 → 300s cap)
- After=network-online.target + Wants= (both units now wait for
  actual connectivity, not just network.target)

Power outage → internet down → internet back = auto-recovery.
2026-05-02 08:51:30 +05:30
Siddharth Balyan 585d6778da fix: allow WebSocket connections from non-loopback IPs in --insecure mode (#18633)
When the dashboard is bound to 0.0.0.0 with --insecure (e.g. behind
Tailscale Serve), WebSocket endpoints (/api/pty, /api/ws, /api/pub,
/api/events) rejected connections from non-loopback client IPs with
code 4403 — causing 'events feed disconnected' in the UI.

Extract the repeated loopback check into _ws_client_is_allowed() which
respects the public bind flag. Session token auth still guards all
endpoints regardless of bind mode.
2026-05-02 08:17:45 +05:30
327 changed files with 31658 additions and 2186 deletions
+4
View File
@@ -25,3 +25,7 @@ ui-tui/packages/hermes-ink/dist/
# Runtime data (bind-mounted at /opt/data; must not leak into build context)
data/
# Compose/profile runtime state (bind-mounted; avoid ownership/secret issues)
hermes-config/
runtime/
+44
View File
@@ -0,0 +1,44 @@
# Dependabot configuration for hermes-agent.
#
# Deliberately scoped to github-actions only.
#
# We do NOT enable Dependabot for pip / npm / any source-dependency ecosystem
# because we pin source dependencies exactly (uv.lock, package-lock.json) as
# part of our supply-chain posture. Automatic version-bump PRs against those
# pins would undermine the strategy — pins are moved deliberately, after
# review, not on a schedule.
#
# github-actions is the exception: action pins (we use full commit SHAs per
# supply-chain policy) must be updated when upstream actions publish
# patches — usually themselves security fixes. Dependabot opens a PR with
# the new SHA and release notes; we review and merge like any other PR.
#
# Security-update PRs for source dependencies (opened ONLY when a CVE is
# published affecting a currently-pinned version) are enabled separately
# via the repo's Dependabot security updates setting
# (Settings → Code security → Dependabot → Dependabot security updates).
# Those are CVE-only, not schedule-driven, and do not conflict with our
# pinning strategy — they fire when a pinned version becomes known-bad,
# which is exactly when we want to move the pin.
version: 2
updates:
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "weekly"
day: "monday"
open-pull-requests-limit: 5
labels:
- "dependencies"
- "github-actions"
commit-message:
prefix: "chore(actions)"
include: "scope"
groups:
# Batch routine action bumps into one PR per week to reduce noise.
# Security updates still open individually and bypass grouping.
actions-minor-patch:
update-types:
- "minor"
- "patch"
+67
View File
@@ -0,0 +1,67 @@
name: OSV-Scanner
# Scans lockfiles (uv.lock, package-lock.json) against the OSV vulnerability
# database. Runs on every PR that touches a lockfile and on a weekly schedule
# against main.
#
# This is detection-only — OSV-Scanner does NOT open PRs or modify pins.
# It reports known CVEs in currently-pinned dependency versions so we can
# decide when and how to patch on our own schedule. Our pinning strategy
# (full SHA / exact version) is preserved; only the notification signal
# is added.
#
# Complements the existing supply-chain-audit.yml workflow (which scans
# for malicious code patterns in PR diffs) by covering the orthogonal
# "currently-pinned dep became known-vulnerable" case.
#
# Uses Google's officially-recommended reusable workflow, pinned by SHA.
# Findings land in the repo's Security tab (Code Scanning > OSV-Scanner).
# fail-on-vuln is disabled so the job does not block merges on pre-existing
# vulnerabilities in pinned deps that we may need to patch deliberately.
on:
pull_request:
branches: [main]
paths:
- 'uv.lock'
- 'pyproject.toml'
- 'package.json'
- 'package-lock.json'
- 'ui-tui/package.json'
- 'ui-tui/package-lock.json'
- 'website/package.json'
- 'website/package-lock.json'
- '.github/workflows/osv-scanner.yml'
push:
branches: [main]
paths:
- 'uv.lock'
- 'pyproject.toml'
- 'package.json'
- 'package-lock.json'
- 'ui-tui/package-lock.json'
- 'website/package-lock.json'
schedule:
# Weekly scan against main — catches CVEs published after merge for
# deps that haven't changed since.
- cron: '0 9 * * 1'
workflow_dispatch:
permissions:
# Required by the reusable workflow to upload SARIF to the Security tab.
actions: read
contents: read
security-events: write
jobs:
scan:
name: Scan lockfiles
uses: google/osv-scanner-action/.github/workflows/osv-scanner-reusable.yml@c51854704019a247608d928f370c98740469d4b5 # v2.3.5
with:
# Scan explicit lockfiles rather than recursing, so we only look at
# the three sources of truth and skip vendored / test / worktree dirs.
scan-args: |-
--lockfile=uv.lock
--lockfile=ui-tui/package-lock.json
--lockfile=website/package-lock.json
fail-on-vuln: false
+10 -1
View File
@@ -257,7 +257,16 @@ The dashboard embeds the real `hermes --tui` — **not** a rewrite. See `hermes
## Adding New Tools
Requires changes in **2 files**:
For most custom or local-only tools, do **not** edit Hermes core. Use the plugin
route instead: create `~/.hermes/plugins/<name>/plugin.yaml` and
`~/.hermes/plugins/<name>/__init__.py`, then register tools with
`ctx.register_tool(...)`. Plugin toolsets are discovered automatically and can be
enabled or disabled without touching `tools/` or `toolsets.py`.
Use the built-in route below only when the user is explicitly contributing a new
core Hermes tool that should ship in the base system.
Built-in/core tools require changes in **2 files**:
**1. Create `tools/your_tool.py`:**
```python
+245 -27
View File
@@ -4,6 +4,7 @@ from __future__ import annotations
import asyncio
import contextvars
import json
import logging
import os
from collections import defaultdict, deque
@@ -47,6 +48,7 @@ from acp.schema import (
TextContentBlock,
UnstructuredCommandInput,
Usage,
UsageUpdate,
UserMessageChunk,
)
@@ -65,6 +67,7 @@ from acp_adapter.events import (
)
from acp_adapter.permissions import make_approval_callback
from acp_adapter.session import SessionManager, SessionState, _expand_acp_enabled_toolsets
from acp_adapter.tools import build_tool_complete, build_tool_start
logger = logging.getLogger(__name__)
@@ -315,6 +318,66 @@ class HermesACPAgent(acp.Agent):
return target_provider, new_model
@staticmethod
def _build_usage_update(state: SessionState) -> UsageUpdate | None:
"""Build ACP native context-usage data for clients like Zed.
Zed's circular context indicator is driven by ACP ``usage_update``
session updates: ``size`` is the model context window and ``used`` is
the current request pressure. Hermes estimates ``used`` from the same
buckets it sends to providers: system prompt, conversation history, and
tool schemas.
"""
agent = state.agent
compressor = getattr(agent, "context_compressor", None)
size = int(getattr(compressor, "context_length", 0) or 0)
if size <= 0:
return None
try:
from agent.model_metadata import estimate_request_tokens_rough
used = estimate_request_tokens_rough(
state.history,
system_prompt=getattr(agent, "_cached_system_prompt", "") or "",
tools=getattr(agent, "tools", None) or None,
)
except Exception:
logger.debug("Could not estimate ACP native context usage", exc_info=True)
used = int(getattr(compressor, "last_prompt_tokens", 0) or 0)
return UsageUpdate(
session_update="usage_update",
size=max(size, 0),
used=max(used, 0),
)
async def _send_usage_update(self, state: SessionState) -> None:
"""Send ACP native context usage to the connected client."""
if not self._conn:
return
update = self._build_usage_update(state)
if update is None:
return
try:
await self._conn.session_update(
session_id=state.session_id,
update=update,
)
except Exception:
logger.warning(
"Failed to send ACP usage update for session %s",
state.session_id,
exc_info=True,
)
def _schedule_usage_update(self, state: SessionState) -> None:
"""Schedule native context indicator refresh after ACP responses."""
if not self._conn:
return
loop = asyncio.get_running_loop()
loop.call_soon(asyncio.create_task, self._send_usage_update(state))
async def _register_session_mcp_servers(
self,
state: SessionState,
@@ -485,37 +548,99 @@ class HermesACPAgent(acp.Agent):
)
return None
@staticmethod
def _history_tool_call_name_args(tool_call: dict[str, Any]) -> tuple[str, dict[str, Any]]:
"""Extract function name/arguments from an OpenAI-style tool_call."""
function = tool_call.get("function") if isinstance(tool_call.get("function"), dict) else {}
name = str(function.get("name") or tool_call.get("name") or "unknown_tool")
raw_args = function.get("arguments") or tool_call.get("arguments") or tool_call.get("args") or {}
if isinstance(raw_args, str):
try:
parsed = json.loads(raw_args)
except Exception:
parsed = {"raw": raw_args}
raw_args = parsed
if not isinstance(raw_args, dict):
raw_args = {}
return name, raw_args
@staticmethod
def _history_tool_call_id(tool_call: dict[str, Any]) -> str:
"""Return the stable provider tool call id for ACP history replay."""
return str(
tool_call.get("id")
or tool_call.get("call_id")
or tool_call.get("tool_call_id")
or ""
).strip()
async def _replay_session_history(self, state: SessionState) -> None:
"""Send persisted user/assistant history to clients during session/load.
Zed's ACP history UI calls ``session/load`` after the user picks an item
from the Agents sidebar. The agent must then replay the full conversation
as ``user_message_chunk`` / ``agent_message_chunk`` notifications; merely
restoring server-side state makes Hermes remember context, but leaves the
editor looking like a clean thread.
as user/assistant chunks plus reconstructed tool-call start/completion
notifications; merely restoring server-side state makes Hermes remember
context, but leaves the editor looking like a clean thread.
"""
if not self._conn or not state.history:
return
for message in state.history:
role = str(message.get("role") or "")
if role not in {"user", "assistant"}:
continue
text = self._history_message_text(message)
if not text:
continue
update = self._history_message_update(role=role, text=text)
if update is None:
continue
active_tool_calls: dict[str, tuple[str, dict[str, Any]]] = {}
async def _send(update: Any) -> bool:
try:
await self._conn.session_update(session_id=state.session_id, update=update)
return True
except Exception:
logger.warning(
"Failed to replay ACP history for session %s",
state.session_id,
exc_info=True,
)
return
return False
for message in state.history:
role = str(message.get("role") or "")
if role in {"user", "assistant"}:
text = self._history_message_text(message)
if text:
update = self._history_message_update(role=role, text=text)
if update is not None and not await _send(update):
return
if role == "assistant" and isinstance(message.get("tool_calls"), list):
for tool_call in message["tool_calls"]:
if not isinstance(tool_call, dict):
continue
tool_call_id = self._history_tool_call_id(tool_call)
if not tool_call_id:
continue
tool_name, args = self._history_tool_call_name_args(tool_call)
active_tool_calls[tool_call_id] = (tool_name, args)
if not await _send(build_tool_start(tool_call_id, tool_name, args)):
return
continue
if role == "tool":
tool_call_id = str(message.get("tool_call_id") or "").strip()
tool_name = str(message.get("tool_name") or "").strip()
function_args: dict[str, Any] | None = None
if tool_call_id in active_tool_calls:
tool_name, function_args = active_tool_calls.pop(tool_call_id)
if not tool_call_id or not tool_name:
continue
result = message.get("content")
if not await _send(
build_tool_complete(
tool_call_id,
tool_name,
result=result if isinstance(result, str) else None,
function_args=function_args,
)
):
return
async def new_session(
self,
@@ -527,11 +652,24 @@ class HermesACPAgent(acp.Agent):
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("New session %s (cwd=%s)", state.session_id, cwd)
self._schedule_available_commands_update(state.session_id)
self._schedule_usage_update(state)
return NewSessionResponse(
session_id=state.session_id,
models=self._build_model_state(state),
)
def _schedule_history_replay(self, state: SessionState) -> None:
"""Replay persisted history after session/load or session/resume returns.
Zed only attaches streamed transcript/tool updates once the load/resume
response has completed. Sending replay notifications while the request is
still in-flight can make the server look correct in logs while the editor
drops or fails to attach the tool-call history.
"""
loop = asyncio.get_running_loop()
replay_coro = self._replay_session_history(state)
loop.call_soon(asyncio.create_task, replay_coro)
async def load_session(
self,
cwd: str,
@@ -545,8 +683,9 @@ class HermesACPAgent(acp.Agent):
return None
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Loaded session %s", session_id)
await self._replay_session_history(state)
self._schedule_history_replay(state)
self._schedule_available_commands_update(session_id)
self._schedule_usage_update(state)
return LoadSessionResponse(models=self._build_model_state(state))
async def resume_session(
@@ -562,8 +701,9 @@ class HermesACPAgent(acp.Agent):
state = self.session_manager.create_session(cwd=cwd)
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Resumed session %s", state.session_id)
await self._replay_session_history(state)
self._schedule_history_replay(state)
self._schedule_available_commands_update(state.session_id)
self._schedule_usage_update(state)
return ResumeSessionResponse(models=self._build_model_state(state))
async def cancel(self, session_id: str, **kwargs: Any) -> None:
@@ -712,6 +852,7 @@ class HermesACPAgent(acp.Agent):
if self._conn:
update = acp.update_agent_message_text(response_text)
await self._conn.session_update(session_id, update)
await self._send_usage_update(state)
return PromptResponse(stop_reason="end_turn")
# If Zed sends another regular prompt while the same ACP session is
@@ -744,24 +885,37 @@ class HermesACPAgent(acp.Agent):
tool_call_meta: dict[str, dict[str, Any]] = {}
previous_approval_cb = None
streamed_message = False
if conn:
tool_progress_cb = make_tool_progress_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
thinking_cb = make_thinking_cb(conn, session_id, loop)
reasoning_cb = make_thinking_cb(conn, session_id, loop)
step_cb = make_step_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
message_cb = make_message_cb(conn, session_id, loop)
def stream_delta_cb(text: str) -> None:
nonlocal streamed_message
if text:
streamed_message = True
message_cb(text)
approval_cb = make_approval_callback(conn.request_permission, loop, session_id)
else:
tool_progress_cb = None
thinking_cb = None
reasoning_cb = None
step_cb = None
message_cb = None
stream_delta_cb = None
approval_cb = None
agent = state.agent
agent.tool_progress_callback = tool_progress_cb
agent.thinking_callback = thinking_cb
# ACP thought panes should not receive Hermes' local kawaii waiting/status
# updates. Route provider/model reasoning deltas instead; if the provider
# emits no reasoning, Zed should not get a fake "thinking" accordion.
agent.thinking_callback = None
agent.reasoning_callback = reasoning_cb
agent.step_callback = step_cb
agent.message_callback = message_cb
agent.stream_delta_callback = stream_delta_cb
# Approval callback is per-thread (thread-local, GHSA-qg5c-hvr5-hjgr).
# Set it INSIDE _run_agent so the TLS write happens in the executor
@@ -867,7 +1021,7 @@ class HermesACPAgent(acp.Agent):
)
except Exception:
logger.debug("Failed to auto-title ACP session %s", session_id, exc_info=True)
if final_response and conn:
if final_response and conn and not streamed_message:
update = acp.update_agent_message_text(final_response)
await conn.session_update(session_id, update)
@@ -903,6 +1057,8 @@ class HermesACPAgent(acp.Agent):
cached_read_tokens=result.get("cache_read_tokens"),
)
await self._send_usage_update(state)
stop_reason = "cancelled" if state.cancel_event and state.cancel_event.is_set() else "end_turn"
return PromptResponse(stop_reason=stop_reason, usage=usage)
@@ -1035,22 +1191,84 @@ class HermesACPAgent(acp.Agent):
return f"Could not list tools: {e}"
def _cmd_context(self, args: str, state: SessionState) -> str:
"""Show ACP session context pressure and compression guidance."""
n_messages = len(state.history)
if n_messages == 0:
return "Conversation is empty (no messages yet)."
# Count by role
# Count by role.
roles: dict[str, int] = {}
for msg in state.history:
role = msg.get("role", "unknown")
roles[role] = roles.get(role, 0) + 1
agent = state.agent
model = state.model or getattr(agent, "model", "")
provider = getattr(agent, "provider", None) or "auto"
compressor = getattr(agent, "context_compressor", None)
context_length = int(getattr(compressor, "context_length", 0) or 0)
threshold_tokens = int(getattr(compressor, "threshold_tokens", 0) or 0)
try:
from agent.model_metadata import estimate_request_tokens_rough
system_prompt = getattr(agent, "_cached_system_prompt", "") or ""
tools = getattr(agent, "tools", None) or None
approx_tokens = estimate_request_tokens_rough(
state.history,
system_prompt=system_prompt,
tools=tools,
)
except Exception:
logger.debug("Could not estimate ACP context usage", exc_info=True)
approx_tokens = 0
if threshold_tokens <= 0 and context_length > 0:
threshold_tokens = int(context_length * 0.80)
lines = [
f"Conversation: {n_messages} messages",
f"Conversation: {n_messages} messages"
if n_messages
else "Conversation is empty (no messages yet).",
f" user: {roles.get('user', 0)}, assistant: {roles.get('assistant', 0)}, "
f"tool: {roles.get('tool', 0)}, system: {roles.get('system', 0)}",
]
model = state.model or getattr(state.agent, "model", "")
if model:
lines.append(f"Model: {model}")
lines.append(f"Provider: {provider}")
if approx_tokens > 0:
if context_length > 0:
usage_pct = (approx_tokens / context_length) * 100
lines.append(
f"Context usage: ~{approx_tokens:,} / {context_length:,} tokens ({usage_pct:.1f}%)"
)
else:
lines.append(f"Context usage: ~{approx_tokens:,} tokens")
if threshold_tokens > 0:
if approx_tokens > 0:
threshold_pct = (threshold_tokens / context_length) * 100 if context_length > 0 else 0
remaining = max(threshold_tokens - approx_tokens, 0)
if approx_tokens >= threshold_tokens:
lines.append(
f"Compression: due now (threshold ~{threshold_tokens:,}"
+ (f", {threshold_pct:.0f}%" if threshold_pct else "")
+ "). Run /compact."
)
else:
lines.append(
f"Compression: ~{remaining:,} tokens until threshold "
f"(~{threshold_tokens:,}"
+ (f", {threshold_pct:.0f}%" if threshold_pct else "")
+ ")."
)
else:
lines.append(f"Compression threshold: ~{threshold_tokens:,} tokens")
if getattr(agent, "compression_enabled", True) is False:
lines.append("Compression is disabled for this agent.")
else:
lines.append("Tip: run /compact to compress manually before the threshold.")
return "\n".join(lines)
def _cmd_reset(self, args: str, state: SessionState) -> str:
+822 -21
View File
@@ -28,6 +28,11 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
"terminal": "execute",
"process": "execute",
"execute_code": "execute",
# Session/meta tools
"todo": "other",
"skill_view": "read",
"skills_list": "read",
"skill_manage": "edit",
# Web / fetch
"web_search": "fetch",
"web_extract": "fetch",
@@ -51,6 +56,28 @@ TOOL_KIND_MAP: Dict[str, ToolKind] = {
}
_POLISHED_TOOLS = {
# Core operator loop
"todo", "memory", "session_search", "delegate_task",
# Files / execution
"read_file", "write_file", "patch", "search_files", "terminal", "process", "execute_code",
# Skills / web / browser / media
"skill_view", "skills_list", "skill_manage", "web_search", "web_extract",
"browser_navigate", "browser_click", "browser_type", "browser_press", "browser_scroll",
"browser_back", "browser_snapshot", "browser_console", "browser_get_images", "browser_vision",
"vision_analyze", "image_generate", "text_to_speech",
# Schedulers / platform integrations
"cronjob", "send_message", "clarify", "discord", "discord_admin",
"ha_list_entities", "ha_get_state", "ha_list_services", "ha_call_service",
"feishu_doc_read", "feishu_drive_list_comments", "feishu_drive_list_comment_replies",
"feishu_drive_reply_comment", "feishu_drive_add_comment",
"kanban_create", "kanban_show", "kanban_comment", "kanban_complete",
"kanban_block", "kanban_link", "kanban_heartbeat",
"yb_query_group_info", "yb_query_group_members", "yb_search_sticker",
"yb_send_dm", "yb_send_sticker", "mixture_of_agents",
}
def get_tool_kind(tool_name: str) -> ToolKind:
"""Return the ACP ToolKind for a hermes tool, defaulting to 'other'."""
return TOOL_KIND_MAP.get(tool_name, "other")
@@ -85,18 +112,645 @@ def build_tool_title(tool_name: str, args: Dict[str, Any]) -> str:
if urls:
return f"extract: {urls[0]}" + (f" (+{len(urls)-1})" if len(urls) > 1 else "")
return "web extract"
if tool_name == "process":
action = str(args.get("action") or "").strip() or "manage"
sid = str(args.get("session_id") or "").strip()
return f"process {action}: {sid}" if sid else f"process {action}"
if tool_name == "delegate_task":
tasks = args.get("tasks")
if isinstance(tasks, list) and tasks:
return f"delegate batch ({len(tasks)} tasks)"
goal = args.get("goal", "")
if goal and len(goal) > 60:
goal = goal[:57] + "..."
return f"delegate: {goal}" if goal else "delegate task"
if tool_name == "session_search":
query = str(args.get("query") or "").strip()
return f"session search: {query}" if query else "recent sessions"
if tool_name == "memory":
action = str(args.get("action") or "manage").strip() or "manage"
target = str(args.get("target") or "memory").strip() or "memory"
return f"memory {action}: {target}"
if tool_name == "execute_code":
return "execute code"
code = str(args.get("code") or "").strip()
first_line = next((line.strip() for line in code.splitlines() if line.strip()), "")
if first_line:
if len(first_line) > 70:
first_line = first_line[:67] + "..."
return f"python: {first_line}"
return "python code"
if tool_name == "todo":
items = args.get("todos")
if isinstance(items, list):
return f"todo ({len(items)} item{'s' if len(items) != 1 else ''})"
return "todo"
if tool_name == "skill_view":
name = str(args.get("name") or "?").strip() or "?"
file_path = str(args.get("file_path") or "").strip()
suffix = f"/{file_path}" if file_path else ""
return f"skill view ({name}{suffix})"
if tool_name == "skills_list":
category = str(args.get("category") or "").strip()
return f"skills list ({category})" if category else "skills list"
if tool_name == "skill_manage":
action = str(args.get("action") or "manage").strip() or "manage"
name = str(args.get("name") or "?").strip() or "?"
file_path = str(args.get("file_path") or "").strip()
target = f"{name}/{file_path}" if file_path else name
if len(target) > 64:
target = target[:61] + "..."
return f"skill {action}: {target}"
if tool_name == "browser_navigate":
return f"navigate: {args.get('url', '?')}"
if tool_name == "browser_snapshot":
return "browser snapshot"
if tool_name == "browser_vision":
return f"browser vision: {str(args.get('question', '?'))[:50]}"
if tool_name == "browser_get_images":
return "browser images"
if tool_name == "vision_analyze":
return f"analyze image: {args.get('question', '?')[:50]}"
return f"analyze image: {str(args.get('question', '?'))[:50]}"
if tool_name == "image_generate":
prompt = str(args.get("prompt") or args.get("description") or "").strip()
return f"generate image: {prompt[:50]}" if prompt else "generate image"
if tool_name == "cronjob":
action = str(args.get("action") or "manage").strip() or "manage"
job_id = str(args.get("job_id") or args.get("id") or "").strip()
return f"cron {action}: {job_id}" if job_id else f"cron {action}"
return tool_name
def _text(content: str) -> Any:
return acp.tool_content(acp.text_block(content))
def _json_loads_maybe(value: Optional[str]) -> Any:
if not isinstance(value, str):
return value
try:
return json.loads(value)
except Exception:
pass
# Some Hermes tools append a human hint after a JSON payload, e.g.
# ``{...}\n\n[Hint: Results truncated...]``. Keep the structured rendering path
# by decoding the first JSON value instead of falling back to raw text.
try:
decoded, _ = json.JSONDecoder().raw_decode(value.lstrip())
return decoded
except Exception:
return None
def _truncate_text(text: str, limit: int = 5000) -> str:
if len(text) <= limit:
return text
return text[: max(0, limit - 100)] + f"\n... ({len(text)} chars total, truncated)"
def _fenced_text(text: str, language: str = "") -> str:
"""Return a Markdown fence that cannot be broken by backticks in text."""
longest = max((len(run) for run in text.split("`")[1::2]), default=0)
fence = "`" * max(3, longest + 1)
return f"{fence}{language}\n{text}\n{fence}"
def _format_todo_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict) or not isinstance(data.get("todos"), list):
return None
summary = data.get("summary") if isinstance(data.get("summary"), dict) else {}
icon = {
"completed": "",
"in_progress": "🔄",
"pending": "",
"cancelled": "",
}
lines = ["**Todo list**", ""]
for item in data["todos"]:
if not isinstance(item, dict):
continue
status = str(item.get("status") or "pending")
content = str(item.get("content") or item.get("id") or "").strip()
if content:
lines.append(f"- {icon.get(status, '')} {content}")
if summary:
cancelled = summary.get("cancelled", 0)
lines.extend([
"",
"**Progress:** "
f"{summary.get('completed', 0)} completed, "
f"{summary.get('in_progress', 0)} in progress, "
f"{summary.get('pending', 0)} pending"
+ (f", {cancelled} cancelled" if cancelled else ""),
])
return "\n".join(lines)
def _format_read_file_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("error") and not data.get("content"):
return f"Read failed: {data.get('error')}"
content = data.get("content")
if not isinstance(content, str):
return None
path = str((args or {}).get("path") or data.get("path") or "file").strip()
offset = (args or {}).get("offset")
limit = (args or {}).get("limit")
range_bits = []
if offset:
range_bits.append(f"from line {offset}")
if limit:
range_bits.append(f"limit {limit}")
suffix = f" ({', '.join(range_bits)})" if range_bits else ""
header = f"Read {path}{suffix}"
if data.get("total_lines") is not None:
header += f"{data.get('total_lines')} total lines"
# Hermes read_file output is line-numbered with `|`. If we send it as raw
# Markdown, Zed can interpret pipes as tables and collapse the layout.
# Fence the payload so file lines stay readable and literal.
return _truncate_text(f"{header}\n\n{_fenced_text(content)}")
def _format_search_files_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
matches = data.get("matches")
if not isinstance(matches, list):
return None
total = data.get("total_count", len(matches))
shown = min(len(matches), 12)
truncated = bool(data.get("truncated")) or len(matches) > shown
lines = [
"Search results",
f"Found {total} match{'es' if total != 1 else ''}; showing {shown}.",
"",
]
for match in matches[:shown]:
if not isinstance(match, dict):
lines.append(f"- {match}")
continue
path = str(match.get("path") or match.get("file") or match.get("filename") or "?")
line = match.get("line") or match.get("line_number")
content = str(match.get("content") or match.get("text") or "").strip()
loc = f"{path}:{line}" if line else path
lines.append(f"- {loc}")
if content:
snippet = _truncate_text(" ".join(content.split()), 300)
lines.append(f" {snippet}")
if truncated:
lines.extend([
"",
"Results truncated. Narrow the search, add file_glob, or use offset to page.",
])
return _truncate_text("\n".join(lines), limit=7000)
def _format_execute_code_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
output = str(data.get("output") or "")
error = str(data.get("error") or "")
exit_code = data.get("exit_code")
parts = [f"Exit code: {exit_code}" if exit_code is not None else "Execution complete"]
if output:
parts.extend(["", "Output:", output])
if error:
parts.extend(["", "Error:", error])
return _truncate_text("\n".join(parts))
def _extract_markdown_headings(content: str, limit: int = 8) -> list[str]:
headings: list[str] = []
for line in content.splitlines():
stripped = line.strip()
if stripped.startswith("#"):
heading = stripped.lstrip("#").strip()
if heading:
headings.append(heading)
if len(headings) >= limit:
break
return headings
def _format_skill_view_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("success") is False:
return f"Skill view failed: {data.get('error', 'unknown error')}"
name = str(data.get("name") or "skill")
file_path = str(data.get("file") or data.get("path") or "SKILL.md")
description = str(data.get("description") or "").strip()
content = str(data.get("content") or "")
linked = data.get("linked_files") if isinstance(data.get("linked_files"), dict) else None
lines = ["**Skill loaded**", "", f"- **Name:** `{name}`", f"- **File:** `{file_path}`"]
if description:
lines.append(f"- **Description:** {description}")
if content:
lines.append(f"- **Content:** {len(content):,} chars loaded into agent context")
if linked:
linked_count = sum(len(v) for v in linked.values() if isinstance(v, list))
lines.append(f"- **Linked files:** {linked_count}")
headings = _extract_markdown_headings(content)
if headings:
lines.extend(["", "**Sections**"])
lines.extend(f"- {heading}" for heading in headings)
lines.extend([
"",
"_Full skill content is available to the agent but hidden here to keep ACP readable._",
])
return "\n".join(lines)
def _format_skill_manage_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
action = str((args or {}).get("action") or "manage").strip() or "manage"
name = str((args or {}).get("name") or data.get("name") or "skill").strip() or "skill"
file_path = str((args or {}).get("file_path") or data.get("file_path") or "SKILL.md").strip() or "SKILL.md"
success = data.get("success")
status = "✅ Skill updated" if success is not False else "✗ Skill update failed"
lines = [f"**{status}**", "", f"- **Action:** `{action}`", f"- **Skill:** `{name}`"]
if action not in {"delete"}:
lines.append(f"- **File:** `{file_path}`")
message = str(data.get("message") or data.get("error") or "").strip()
if message:
lines.append(f"- **Result:** {message}")
replacements = data.get("replacements") or data.get("replacement_count")
if replacements is not None:
lines.append(f"- **Replacements:** {replacements}")
path = str(data.get("path") or "").strip()
if path:
lines.append(f"- **Path:** `{path}`")
return "\n".join(lines)
def _format_web_search_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
web = data.get("data", {}).get("web") if isinstance(data.get("data"), dict) else data.get("web")
if not isinstance(web, list):
return None
lines = [f"Web results: {len(web)}"]
for item in web[:10]:
if not isinstance(item, dict):
continue
title = str(item.get("title") or item.get("url") or "result").strip()
url = str(item.get("url") or "").strip()
desc = str(item.get("description") or "").strip()
lines.append(f"{title}" + (f"{url}" if url else ""))
if desc:
lines.append(f" {desc}")
return _truncate_text("\n".join(lines))
def _format_web_extract_result(result: Optional[str]) -> Optional[str]:
"""Return only web_extract errors for ACP; success stays compact via title."""
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("success") is False and data.get("error"):
return f"Web extract failed: {data.get('error')}"
results = data.get("results")
if not isinstance(results, list):
return None
failures: list[str] = []
for item in results[:10]:
if not isinstance(item, dict):
continue
error = str(item.get("error") or "").strip()
if not error or error in {"None", "null"}:
continue
url = str(item.get("url") or "").strip()
title = str(item.get("title") or url or "Untitled").strip()
failures.append(
f"- {title}" + (f"{url}" if url and url != title else "") + f"\n Error: {_truncate_text(error, limit=500)}"
)
if not failures:
return None
lines = [f"Web extract failed for {len(failures)} URL{'s' if len(failures) != 1 else ''}"]
lines.extend(failures)
return "\n".join(lines)
def _format_process_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
if data.get("success") is False and data.get("error"):
return f"Process error: {data.get('error')}"
action = str((args or {}).get("action") or "process").strip() or "process"
if isinstance(data.get("processes"), list):
processes = data["processes"]
lines = [f"Processes: {len(processes)}"]
for proc in processes[:20]:
if not isinstance(proc, dict):
lines.append(f"- {proc}")
continue
sid = str(proc.get("session_id") or proc.get("id") or "?")
status = str(proc.get("status") or ("exited" if proc.get("exited") else "running"))
cmd = str(proc.get("command") or "").strip()
pid = proc.get("pid")
code = proc.get("exit_code")
bits = [status]
if pid is not None:
bits.append(f"pid {pid}")
if code is not None:
bits.append(f"exit {code}")
lines.append(f"- `{sid}` — {', '.join(bits)}" + (f"{cmd[:120]}" if cmd else ""))
if len(processes) > 20:
lines.append(f"... {len(processes) - 20} more process(es)")
return "\n".join(lines)
status = str(data.get("status") or data.get("state") or action).strip()
sid = str(data.get("session_id") or (args or {}).get("session_id") or "").strip()
lines = [f"Process {action}: {status}" + (f" (`{sid}`)" if sid else "")]
for key, label in (("command", "Command"), ("pid", "PID"), ("exit_code", "Exit code"), ("returncode", "Exit code"), ("lines", "Lines")):
if data.get(key) is not None:
lines.append(f"- **{label}:** {data.get(key)}")
output = data.get("output") or data.get("new_output") or data.get("log") or data.get("stdout")
error = data.get("error") or data.get("stderr")
if output:
lines.extend(["", "Output:", _truncate_text(str(output), limit=5000)])
if error:
lines.extend(["", "Error:", _truncate_text(str(error), limit=2000)])
msg = data.get("message")
if msg and not output and not error:
lines.append(str(msg))
return _truncate_text("\n".join(lines), limit=7000)
def _format_delegate_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("error") and not isinstance(data.get("results"), list):
return f"Delegation failed: {data.get('error')}"
results = data.get("results")
if not isinstance(results, list):
return None
total = data.get("total_duration_seconds")
lines = [f"Delegation results: {len(results)} task{'s' if len(results) != 1 else ''}" + (f" in {total}s" if total is not None else "")]
icon = {"completed": "", "failed": "", "error": "", "timeout": "", "interrupted": ""}
for item in results:
if not isinstance(item, dict):
lines.append(f"- {item}")
continue
idx = item.get("task_index")
status = str(item.get("status") or "unknown")
model = item.get("model")
dur = item.get("duration_seconds")
role = item.get("_child_role")
header = f"{icon.get(status, '')} Task {idx + 1 if isinstance(idx, int) else '?'}: {status}"
bits = []
if model:
bits.append(str(model))
if role:
bits.append(f"role={role}")
if dur is not None:
bits.append(f"{dur}s")
if bits:
header += " (" + ", ".join(bits) + ")"
lines.extend(["", header])
summary = str(item.get("summary") or "").strip()
error = str(item.get("error") or "").strip()
if summary:
lines.append(_truncate_text(summary, limit=1200))
if error:
lines.append("Error: " + _truncate_text(error, limit=800))
trace = item.get("tool_trace")
if isinstance(trace, list) and trace:
names = [str(t.get("tool") or "?") for t in trace if isinstance(t, dict)]
if names:
lines.append("Tools: " + ", ".join(names[:12]) + (f" (+{len(names)-12})" if len(names) > 12 else ""))
return _truncate_text("\n".join(lines), limit=8000)
def _format_session_search_result(result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
if data.get("success") is False:
return f"Session search failed: {data.get('error', 'unknown error')}"
results = data.get("results")
if not isinstance(results, list):
return None
mode = data.get("mode") or "search"
query = data.get("query")
lines = ["Recent sessions" if mode == "recent" else f"Session search results" + (f" for `{query}`" if query else "")]
if not results:
lines.append(str(data.get("message") or "No matching sessions found."))
return "\n".join(lines)
for item in results:
if not isinstance(item, dict):
continue
sid = str(item.get("session_id") or "?")
title = str(item.get("title") or item.get("when") or "Untitled session").strip()
when = str(item.get("last_active") or item.get("started_at") or item.get("when") or "").strip()
count = item.get("message_count")
source = str(item.get("source") or "").strip()
meta = ", ".join(str(x) for x in [when, source, f"{count} msgs" if count is not None else ""] if x)
lines.append(f"- **{title}** (`{sid}`)" + (f"{meta}" if meta else ""))
summary = str(item.get("summary") or item.get("preview") or "").strip()
if summary:
lines.append(" " + _truncate_text(" ".join(summary.split()), limit=500))
return _truncate_text("\n".join(lines), limit=7000)
def _format_memory_result(result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return None
action = str((args or {}).get("action") or "memory").strip() or "memory"
target = str(data.get("target") or (args or {}).get("target") or "memory")
if data.get("success") is False:
lines = [f"✗ Memory {action} failed ({target})", str(data.get("error") or "unknown error")]
matches = data.get("matches")
if isinstance(matches, list) and matches:
lines.append("Matches:")
lines.extend(f"- {_truncate_text(str(m), 160)}" for m in matches[:5])
return "\n".join(lines)
lines = [f"✅ Memory {action} saved ({target})"]
if data.get("message"):
lines.append(str(data.get("message")))
if data.get("entry_count") is not None:
lines.append(f"Entries: {data.get('entry_count')}")
if data.get("usage"):
lines.append(f"Usage: {data.get('usage')}")
# Avoid dumping all memory entries into ACP UI; show only the explicit new value preview.
preview = str((args or {}).get("content") or (args or {}).get("old_text") or "").strip()
if preview:
lines.append("Preview: " + _truncate_text(preview, limit=300))
return "\n".join(lines)
def _format_edit_result(tool_name: str, result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
path = str((args or {}).get("path") or "file").strip()
if isinstance(data, dict):
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed for {path}: {data.get('error', 'unknown error')}"
message = str(data.get("message") or "").strip()
replacements = data.get("replacements") or data.get("replacement_count")
lines = [f"{tool_name} completed" + (f" for `{path}`" if path else "")]
if message:
lines.append(message)
if replacements is not None:
lines.append(f"Replacements: {replacements}")
if data.get("files_modified"):
files = data.get("files_modified")
if isinstance(files, list):
lines.append("Files: " + ", ".join(f"`{f}`" for f in files[:8]))
return "\n".join(lines)
if isinstance(result, str) and result.strip():
return _truncate_text(result, limit=3000)
return f"{tool_name} completed" + (f" for `{path}`" if path else "")
def _format_browser_result(tool_name: str, result: Optional[str], args: Optional[Dict[str, Any]]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed: {data.get('error', 'unknown error')}"
if tool_name == "browser_get_images":
images = data.get("images") or data.get("data")
if isinstance(images, list):
lines = [f"Images found: {len(images)}"]
for img in images[:12]:
if isinstance(img, dict):
alt = str(img.get("alt") or "").strip()
url = str(img.get("url") or img.get("src") or "").strip()
lines.append(f"- {alt or 'image'}" + (f"{url}" if url else ""))
return _truncate_text("\n".join(lines), limit=5000)
title = str(data.get("title") or data.get("url") or data.get("status") or tool_name)
text = str(data.get("text") or data.get("content") or data.get("snapshot") or data.get("analysis") or data.get("message") or "").strip()
lines = [title]
if data.get("url") and data.get("url") != title:
lines.append(str(data.get("url")))
if text:
lines.extend(["", _truncate_text(text, limit=5000)])
return _truncate_text("\n".join(lines), limit=7000)
def _format_media_or_cron_result(tool_name: str, result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, dict):
return result if isinstance(result, str) and result.strip() else None
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed: {data.get('error', 'unknown error')}"
lines = [f"{tool_name} completed"]
for key in ("file_path", "path", "url", "image_url", "job_id", "id", "status", "message", "next_run"):
if data.get(key):
lines.append(f"- **{key}:** {data.get(key)}")
return "\n".join(lines)
def _format_generic_structured_result(tool_name: str, result: Optional[str]) -> Optional[str]:
data = _json_loads_maybe(result)
if not isinstance(data, (dict, list)):
return result if isinstance(result, str) and result.strip() else None
if isinstance(data, list):
lines = [f"{tool_name}: {len(data)} item{'s' if len(data) != 1 else ''}"]
for item in data[:12]:
lines.append(f"- {_truncate_text(str(item), limit=240)}")
return _truncate_text("\n".join(lines), limit=5000)
if data.get("success") is False or data.get("error"):
return f"{tool_name} failed: {data.get('error', 'unknown error')}"
lines = [f"{tool_name} completed" if data.get("success") is True else f"{tool_name} result"]
priority_keys = (
"message", "status", "id", "task_id", "issue_id", "title", "name", "entity_id",
"state", "service", "url", "path", "file_path", "count", "total", "next_run",
)
seen = set()
for key in priority_keys:
value = data.get(key)
if value in (None, "", [], {}):
continue
seen.add(key)
lines.append(f"- **{key}:** {_truncate_text(str(value), limit=500)}")
for key, value in data.items():
if key in seen or key in {"success", "raw", "content", "entries"}:
continue
if value in (None, "", [], {}):
continue
if isinstance(value, (dict, list)):
preview = json.dumps(value, ensure_ascii=False, default=str)
else:
preview = str(value)
lines.append(f"- **{key}:** {_truncate_text(preview, limit=500)}")
if len(lines) >= 14:
break
content = data.get("content")
if isinstance(content, str) and content.strip():
lines.extend(["", _truncate_text(content.strip(), limit=1500)])
return _truncate_text("\n".join(lines), limit=7000)
def _build_polished_completion_content(
tool_name: str,
result: Optional[str],
function_args: Optional[Dict[str, Any]],
) -> Optional[List[Any]]:
formatter = {
"todo": lambda: _format_todo_result(result),
"read_file": lambda: _format_read_file_result(result, function_args),
"write_file": lambda: _format_edit_result(tool_name, result, function_args),
"patch": lambda: _format_edit_result(tool_name, result, function_args),
"search_files": lambda: _format_search_files_result(result),
"execute_code": lambda: _format_execute_code_result(result),
"process": lambda: _format_process_result(result, function_args),
"delegate_task": lambda: _format_delegate_result(result),
"session_search": lambda: _format_session_search_result(result),
"memory": lambda: _format_memory_result(result, function_args),
"skill_view": lambda: _format_skill_view_result(result),
"skill_manage": lambda: _format_skill_manage_result(result, function_args),
"web_search": lambda: _format_web_search_result(result),
"web_extract": lambda: _format_web_extract_result(result),
"browser_navigate": lambda: _format_browser_result(tool_name, result, function_args),
"browser_snapshot": lambda: _format_browser_result(tool_name, result, function_args),
"browser_vision": lambda: _format_browser_result(tool_name, result, function_args),
"browser_get_images": lambda: _format_browser_result(tool_name, result, function_args),
"vision_analyze": lambda: _format_media_or_cron_result(tool_name, result),
"image_generate": lambda: _format_media_or_cron_result(tool_name, result),
"cronjob": lambda: _format_media_or_cron_result(tool_name, result),
}.get(tool_name)
if formatter is None and tool_name in _POLISHED_TOOLS:
formatter = lambda: _format_generic_structured_result(tool_name, result)
if formatter is None:
return None
text = formatter()
if not text:
return None
return [_text(text)]
def _build_patch_mode_content(patch_text: str) -> List[Any]:
"""Parse V4A patch mode input into ACP diff blocks when possible."""
if not patch_text:
@@ -258,7 +912,11 @@ def _build_tool_complete_content(
except Exception:
pass
return [acp.tool_content(acp.text_block(display_result))]
polished_content = _build_polished_completion_content(tool_name, result, function_args)
if polished_content:
return polished_content
return [_text(display_result)]
# ---------------------------------------------------------------------------
@@ -288,7 +946,6 @@ def build_tool_start(
content = _build_patch_mode_content(patch_text)
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "write_file":
@@ -297,32 +954,172 @@ def build_tool_start(
content = [acp.tool_diff_content(path=path, new_text=file_content)]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "terminal":
command = arguments.get("command", "")
content = [acp.tool_content(acp.text_block(f"$ {command}"))]
content = [_text(f"$ {command}")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
if tool_name == "read_file":
path = arguments.get("path", "")
content = [acp.tool_content(acp.text_block(f"Reading {path}"))]
# The title and location already identify the file. Sending a synthetic
# "Reading ..." content block makes Zed render an unhelpful Output
# section before the real file contents arrive on completion.
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
tool_call_id, title, kind=kind, content=None, locations=locations,
)
if tool_name == "search_files":
pattern = arguments.get("pattern", "")
target = arguments.get("target", "content")
content = [acp.tool_content(acp.text_block(f"Searching for '{pattern}' ({target})"))]
search_path = arguments.get("path")
where = f" in {search_path}" if search_path else ""
content = [_text(f"Searching for '{pattern}' ({target}){where}")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "todo":
items = arguments.get("todos")
if isinstance(items, list):
preview_lines = ["Updating todo list", ""]
for item in items[:8]:
if isinstance(item, dict):
preview_lines.append(f"- {item.get('status', 'pending')}: {item.get('content', item.get('id', ''))}")
if len(items) > 8:
preview_lines.append(f"... {len(items) - 8} more")
content = [_text("\n".join(preview_lines))]
else:
content = [_text("Reading todo list")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "skill_view":
name = str(arguments.get("name") or "?").strip() or "?"
file_path = str(arguments.get("file_path") or "SKILL.md").strip() or "SKILL.md"
content = [_text(f"Loading skill '{name}' ({file_path})")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "skill_manage":
action = str(arguments.get("action") or "manage").strip() or "manage"
name = str(arguments.get("name") or "?").strip() or "?"
file_path = str(arguments.get("file_path") or "SKILL.md").strip() or "SKILL.md"
path = f"skills/{name}/{file_path}" if file_path else f"skills/{name}"
if action == "patch":
old = str(arguments.get("old_string") or "")
new = str(arguments.get("new_string") or "")
content = [acp.tool_diff_content(path=path, old_text=old or None, new_text=new)]
elif action in {"edit", "create"}:
content = [
acp.tool_diff_content(
path=path,
new_text=str(arguments.get("content") or ""),
)
]
elif action == "write_file":
target = str(arguments.get("file_path") or "file")
content = [
acp.tool_diff_content(
path=f"skills/{name}/{target}",
new_text=str(arguments.get("file_content") or ""),
)
]
elif action in {"delete", "remove_file"}:
target = str(arguments.get("file_path") or file_path or name)
content = [_text(f"Removing {target} from skill '{name}'")]
else:
content = [_text(f"Running skill_manage action '{action}' on skill '{name}' ({file_path})")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "execute_code":
code = str(arguments.get("code") or "").strip()
preview = code[:1200] + (f"\n... ({len(code)} chars total, truncated)" if len(code) > 1200 else "")
content = [_text(f"Running Python helper script:\n\n```python\n{preview}\n```" if preview else "Running Python helper script")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "web_search":
query = str(arguments.get("query") or "").strip()
content = [_text(f"Searching the web for: {query}" if query else "Searching the web")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "web_extract":
# The title identifies the URL(s). Avoid a duplicate content block so
# Zed renders this like read_file: compact start, concise completion.
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=None, locations=locations,
)
if tool_name == "process":
action = str(arguments.get("action") or "").strip() or "manage"
sid = str(arguments.get("session_id") or "").strip()
data_preview = str(arguments.get("data") or "").strip()
text = f"Process action: {action}" + (f"\nSession: {sid}" if sid else "")
if data_preview:
text += "\nInput: " + _truncate_text(data_preview, limit=500)
content = [_text(text)]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "delegate_task":
tasks = arguments.get("tasks")
if isinstance(tasks, list) and tasks:
lines = [f"Delegating {len(tasks)} tasks", ""]
for i, task in enumerate(tasks[:8], 1):
if isinstance(task, dict):
goal = str(task.get("goal") or "").strip()
role = str(task.get("role") or "").strip()
lines.append(f"{i}. " + _truncate_text(goal, limit=160) + (f" ({role})" if role else ""))
if len(tasks) > 8:
lines.append(f"... {len(tasks) - 8} more")
content = [_text("\n".join(lines))]
else:
goal = str(arguments.get("goal") or "").strip()
content = [_text("Delegating task" + (f":\n{_truncate_text(goal, limit=800)}" if goal else ""))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "session_search":
query = str(arguments.get("query") or "").strip()
content = [_text(f"Searching past sessions for: {query}" if query else "Loading recent sessions")]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name == "memory":
action = str(arguments.get("action") or "manage").strip() or "manage"
target = str(arguments.get("target") or "memory").strip() or "memory"
preview = str(arguments.get("content") or arguments.get("old_text") or "").strip()
text = f"Memory {action} ({target})"
if preview:
text += "\nPreview: " + _truncate_text(preview, limit=500)
content = [_text(text)]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
)
if tool_name in _POLISHED_TOOLS:
try:
args_text = json.dumps(arguments, indent=2, default=str)
except (TypeError, ValueError):
args_text = str(arguments)
content = [_text(_truncate_text(args_text, limit=1200))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
)
# Generic fallback
@@ -334,7 +1131,7 @@ def build_tool_start(
content = [acp.tool_content(acp.text_block(args_text))]
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
raw_input=None if tool_name in _POLISHED_TOOLS else arguments,
)
@@ -347,18 +1144,22 @@ def build_tool_complete(
) -> ToolCallProgress:
"""Create a ToolCallUpdate (progress) event for a completed tool call."""
kind = get_tool_kind(tool_name)
content = _build_tool_complete_content(
tool_name,
result,
function_args=function_args,
snapshot=snapshot,
)
if tool_name == "web_extract":
error_text = _format_web_extract_result(result)
content = [_text(error_text)] if error_text else None
else:
content = _build_tool_complete_content(
tool_name,
result,
function_args=function_args,
snapshot=snapshot,
)
return acp.update_tool_call(
tool_call_id,
kind=kind,
status="completed",
content=content,
raw_output=result,
raw_output=None if tool_name in _POLISHED_TOOLS else result,
)
+53 -4
View File
@@ -76,6 +76,7 @@ _ADAPTIVE_THINKING_SUBSTRINGS = ("4-6", "4.6", "4-7", "4.7")
# Models where temperature/top_p/top_k return 400 if set to non-default values.
# This is the Opus 4.7 contract; future 4.x+ models are expected to follow it.
_NO_SAMPLING_PARAMS_SUBSTRINGS = ("4-7", "4.7")
_FAST_MODE_SUPPORTED_SUBSTRINGS = ("opus-4-6", "opus-4.6")
# ── Max output token limits per Anthropic model ───────────────────────
# Source: Anthropic docs + Cline model catalog. Anthropic's API requires
@@ -105,6 +106,9 @@ _ANTHROPIC_OUTPUT_LIMITS = {
"claude-3-haiku": 4_096,
# Third-party Anthropic-compatible providers
"minimax": 131_072,
# Qwen models via DashScope Anthropic-compatible endpoint
# DashScope enforces max_tokens ∈ [1, 65536]
"qwen3": 65_536,
}
# For any model not in the table, assume the highest current limit.
@@ -216,6 +220,17 @@ def _forbids_sampling_params(model: str) -> bool:
return any(v in model for v in _NO_SAMPLING_PARAMS_SUBSTRINGS)
def _supports_fast_mode(model: str) -> bool:
"""Return True for models that support Anthropic Fast Mode (speed=fast).
Per Anthropic docs, fast mode is currently supported on Opus 4.6 only.
Sending ``speed: "fast"`` to any other Claude model (including Opus 4.7)
returns HTTP 400. This guard prevents silently 400'ing when stale config
or older callers leave fast mode enabled across a model upgrade.
"""
return any(v in model for v in _FAST_MODE_SUPPORTED_SUBSTRINGS)
# Beta headers for enhanced features (sent with ALL auth types).
# As of Opus 4.7 (2026-04-16), the first two are GA on Claude 4.6+ — the
# beta headers are still accepted (harmless no-op) but not required. Kept
@@ -1222,6 +1237,14 @@ def _normalize_tool_input_schema(schema: Any) -> Dict[str, Any]:
``keep_nullable_hint=False`` because the Anthropic validator does not
recognize the OpenAPI-style ``nullable: true`` extension and strict
schema-to-grammar converters may reject unknown keywords.
Top-level ``oneOf``/``allOf``/``anyOf`` are also stripped here: the
Anthropic API rejects union keywords at the schema root with a generic
HTTP 400. Several upstream and plugin tools ship schemas with one of
these keywords at the top level (commonly for Pydantic discriminated
unions). If we land here with those keywords still present after
nullable-union stripping, drop them and fall back to a plain object
schema so the tool still validates at the Anthropic boundary.
"""
if not schema:
return {"type": "object", "properties": {}}
@@ -1231,6 +1254,12 @@ def _normalize_tool_input_schema(schema: Any) -> Dict[str, Any]:
normalized = strip_nullable_unions(schema, keep_nullable_hint=False)
if not isinstance(normalized, dict):
return {"type": "object", "properties": {}}
# Strip top-level union keywords that Anthropic's validator rejects.
banned = {"oneOf", "allOf", "anyOf"}
if banned & normalized.keys():
normalized = {k: v for k, v in normalized.items() if k not in banned}
if "type" not in normalized:
normalized["type"] = "object"
if normalized.get("type") == "object" and not isinstance(normalized.get("properties"), dict):
normalized = {**normalized, "properties": {}}
return normalized
@@ -1241,10 +1270,24 @@ def convert_tools_to_anthropic(tools: List[Dict]) -> List[Dict]:
if not tools:
return []
result = []
seen_names: set = set()
for t in tools:
fn = t.get("function", {})
name = fn.get("name", "")
# Defensive dedup: Anthropic rejects requests with duplicate tool
# names. Upstream injection paths already dedup, but this guard
# converts a hard API failure into a warning. See: #18478
if name and name in seen_names:
logger.warning(
"convert_tools_to_anthropic: duplicate tool name '%s' "
"— dropping second occurrence",
name,
)
continue
if name:
seen_names.add(name)
result.append({
"name": fn.get("name", ""),
"name": name,
"description": fn.get("description", ""),
"input_schema": _normalize_tool_input_schema(
fn.get("parameters", {"type": "object", "properties": {}})
@@ -1901,9 +1944,15 @@ def build_anthropic_kwargs(
# ── Fast mode (Opus 4.6 only) ────────────────────────────────────
# Adds extra_body.speed="fast" + the fast-mode beta header for ~2.5x
# output speed. Only for native Anthropic endpoints — third-party
# providers would reject the unknown beta header and speed parameter.
if fast_mode and not _is_third_party_anthropic_endpoint(base_url):
# output speed. Per Anthropic docs, fast mode is only supported on
# Opus 4.6 — Opus 4.7 and other models 400 on the speed parameter.
# Only for native Anthropic endpoints — third-party providers would
# reject the unknown beta header and speed parameter.
if (
fast_mode
and not _is_third_party_anthropic_endpoint(base_url)
and _supports_fast_mode(model)
):
kwargs.setdefault("extra_body", {})["speed"] = "fast"
# Build extra_headers with ALL applicable betas (the per-request
# extra_headers override the client-level anthropic-beta header).
+95 -18
View File
@@ -259,13 +259,68 @@ _PROVIDERS_WITHOUT_VISION: frozenset = frozenset({
"kimi-coding-cn",
})
# OpenRouter app attribution headers
_OR_HEADERS = {
# OpenRouter app attribution headers (base — always sent)
_OR_HEADERS_BASE = {
"HTTP-Referer": "https://hermes-agent.nousresearch.com",
"X-OpenRouter-Title": "Hermes Agent",
"X-OpenRouter-Categories": "productivity,cli-agent",
}
# Truthy values for boolean env-var parsing.
_TRUTHY_ENV_VALUES = frozenset({"1", "true", "yes", "on"})
def build_or_headers(or_config: dict | None = None) -> dict:
"""Build OpenRouter headers, optionally including response-cache headers.
Precedence for response cache: env var > config.yaml > default (enabled).
Environment variables:
``HERMES_OPENROUTER_CACHE`` — truthy (``1``/``true``/``yes``/``on``)
enables caching; ``0``/``false``/``no``/``off`` disables.
Overrides ``openrouter.response_cache`` in config.yaml.
``HERMES_OPENROUTER_CACHE_TTL`` — integer seconds (1-86400).
Overrides ``openrouter.response_cache_ttl`` in config.yaml.
*or_config* is the ``openrouter`` section from config.yaml. When *None*,
falls back to reading config from disk via ``load_config()``.
"""
headers = dict(_OR_HEADERS_BASE)
# Resolve config from disk if not provided.
if or_config is None:
try:
from hermes_cli.config import load_config
or_config = load_config().get("openrouter", {})
except Exception:
or_config = {}
# Determine cache enabled: env var overrides config.
env_cache = os.environ.get("HERMES_OPENROUTER_CACHE", "").strip().lower()
if env_cache:
cache_enabled = env_cache in _TRUTHY_ENV_VALUES
else:
cache_enabled = or_config.get("response_cache", False)
if not cache_enabled:
return headers
headers["X-OpenRouter-Cache"] = "true"
# Determine TTL: env var overrides config.
env_ttl = os.environ.get("HERMES_OPENROUTER_CACHE_TTL", "").strip()
if env_ttl:
if env_ttl.isdigit():
ttl = int(env_ttl)
if 1 <= ttl <= 86400:
headers["X-OpenRouter-Cache-TTL"] = str(ttl)
else:
ttl = or_config.get("response_cache_ttl", 300)
if isinstance(ttl, (int, float)) and 1 <= ttl <= 86400:
headers["X-OpenRouter-Cache-TTL"] = str(int(ttl))
return headers
# Vercel AI Gateway app attribution headers. HTTP-Referer maps to
# referrerUrl and X-Title maps to appName in the gateway's analytics.
from hermes_cli import __version__ as _HERMES_VERSION
@@ -1149,23 +1204,23 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
def _try_openrouter() -> Tuple[Optional[OpenAI], Optional[str]]:
def _try_openrouter(explicit_api_key: str = None) -> Tuple[Optional[OpenAI], Optional[str]]:
pool_present, entry = _select_pool_entry("openrouter")
if pool_present:
or_key = _pool_runtime_api_key(entry)
or_key = explicit_api_key or _pool_runtime_api_key(entry)
if not or_key:
return None, None
base_url = _pool_runtime_base_url(entry, OPENROUTER_BASE_URL) or OPENROUTER_BASE_URL
logger.debug("Auxiliary client: OpenRouter via pool")
return OpenAI(api_key=or_key, base_url=base_url,
default_headers=_OR_HEADERS), _OPENROUTER_MODEL
default_headers=build_or_headers()), _OPENROUTER_MODEL
or_key = os.getenv("OPENROUTER_API_KEY")
or_key = explicit_api_key or os.getenv("OPENROUTER_API_KEY")
if not or_key:
return None, None
logger.debug("Auxiliary client: OpenRouter")
return OpenAI(api_key=or_key, base_url=OPENROUTER_BASE_URL,
default_headers=_OR_HEADERS), _OPENROUTER_MODEL
default_headers=build_or_headers()), _OPENROUTER_MODEL
def _describe_openrouter_unavailable() -> str:
@@ -1474,7 +1529,7 @@ def _build_codex_client(model: str) -> Tuple[Optional[Any], Optional[str]]:
return CodexAuxiliaryClient(real_client, model), model
def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
def _try_anthropic(explicit_api_key: str = None) -> Tuple[Optional[Any], Optional[str]]:
try:
from agent.anthropic_adapter import build_anthropic_client, resolve_anthropic_token
except ImportError:
@@ -1484,10 +1539,10 @@ def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
if pool_present:
if entry is None:
return None, None
token = _pool_runtime_api_key(entry)
token = explicit_api_key or _pool_runtime_api_key(entry)
else:
entry = None
token = resolve_anthropic_token()
token = explicit_api_key or resolve_anthropic_token()
if not token:
return None, None
@@ -1911,7 +1966,7 @@ def _to_async_client(sync_client, model: str, is_vision: bool = False):
}
sync_base_url = str(sync_client.base_url)
if base_url_host_matches(sync_base_url, "openrouter.ai"):
async_kwargs["default_headers"] = dict(_OR_HEADERS)
async_kwargs["default_headers"] = build_or_headers()
elif base_url_host_matches(sync_base_url, "api.githubcopilot.com"):
from hermes_cli.copilot_auth import copilot_request_headers
@@ -2053,9 +2108,9 @@ def resolve_provider_client(
return (_to_async_client(client, final_model, is_vision=is_vision) if async_mode
else (client, final_model))
# ── OpenRouter ───────────────────────────────────────────────────
# ── OpenRouter ───────────────────────────────────────────
if provider == "openrouter":
client, default = _try_openrouter()
client, default = _try_openrouter(explicit_api_key=explicit_api_key)
if client is None:
logger.warning(
"resolve_provider_client: openrouter requested but %s",
@@ -2281,7 +2336,7 @@ def resolve_provider_client(
if pconfig.auth_type == "api_key":
if provider == "anthropic":
client, default_model = _try_anthropic()
client, default_model = _try_anthropic(explicit_api_key=explicit_api_key)
if client is None:
logger.warning("resolve_provider_client: anthropic requested but no Anthropic credentials found")
return None, None
@@ -2593,8 +2648,11 @@ def resolve_vision_provider_client(
return resolved_provider, sync_client, final_model
if resolved_base_url:
provider_for_base_override = (
requested if requested and requested not in ("", "auto") else "custom"
)
client, final_model = resolve_provider_client(
"custom",
provider_for_base_override,
model=resolved_model,
async_mode=async_mode,
explicit_base_url=resolved_base_url,
@@ -2602,8 +2660,8 @@ def resolve_vision_provider_client(
api_mode=resolved_api_mode,
)
if client is None:
return "custom", None, None
return "custom", client, final_model
return provider_for_base_override, None, None
return provider_for_base_override, client, final_model
if requested == "auto":
# Vision auto-detection order:
@@ -3237,7 +3295,26 @@ def _build_call_kwargs(
kwargs["max_tokens"] = max_tokens
if tools:
kwargs["tools"] = tools
# Defensive dedup: providers like Google Vertex, Azure, and Bedrock
# reject requests with duplicate tool names (HTTP 400). The upstream
# injection paths (run_agent.py) already dedup, but this guard
# converts a hard API failure into a warning if an upstream regression
# reintroduces duplicates. See: #18478
_seen: set = set()
_deduped: list = []
for _t in tools:
_tname = (_t.get("function") or {}).get("name", "")
if _tname and _tname in _seen:
logger.warning(
"_build_call_kwargs: duplicate tool name '%s' removed "
"(provider=%s model=%s)",
_tname, provider, model,
)
continue
if _tname:
_seen.add(_tname)
_deduped.append(_t)
kwargs["tools"] = _deduped
# Provider-specific extra_body
merged_extra = dict(extra_body or {})
+21 -3
View File
@@ -344,6 +344,7 @@ class ContextCompressor(ContextEngine):
self._last_aux_model_failure_model = None
self._last_compression_savings_pct = 100.0
self._ineffective_compression_count = 0
self._summary_failure_cooldown_until = 0.0 # transient errors must not block a fresh session
def update_model(
self,
@@ -553,7 +554,16 @@ class ContextCompressor(ContextEngine):
break
accumulated += msg_tokens
boundary = i
prune_boundary = max(boundary, len(result) - min_protect)
# Translate the budget walk into a "protected count", apply the
# floor in count-space (where `max` reads naturally: protect at
# least `min_protect` messages or whatever the budget reserved,
# whichever is more), then convert back to a prune boundary.
# Doing this in index-space with `max` would invert the direction
# (smaller index = MORE protected), so a generous budget would
# silently get truncated back down to `min_protect`.
budget_protect_count = len(result) - boundary
protected_count = max(budget_protect_count, min_protect)
prune_boundary = len(result) - protected_count
else:
prune_boundary = len(result) - protect_tail_count
@@ -569,6 +579,8 @@ class ContextCompressor(ContextEngine):
# Skip multimodal content (list of content blocks)
if isinstance(content, list):
continue
if not isinstance(content, str):
continue
if len(content) < 200:
continue
h = hashlib.md5(content.encode("utf-8", errors="replace")).hexdigest()[:12]
@@ -588,6 +600,8 @@ class ContextCompressor(ContextEngine):
# Skip multimodal content (list of content blocks)
if isinstance(content, list):
continue
if not isinstance(content, str):
continue
if not content or content == _PRUNED_TOOL_PLACEHOLDER:
continue
# Skip already-deduplicated or previously-summarized results
@@ -903,15 +917,19 @@ The user has requested that this compaction PRIORITISE preserving all informatio
or "does not exist" in _err_str
or "no available channel" in _err_str
)
_is_timeout = (
_status in (408, 429, 502, 504)
or "timeout" in _err_str
)
if (
_is_model_not_found
(_is_model_not_found or _is_timeout)
and self.summary_model
and self.summary_model != self.model
and not getattr(self, "_summary_model_fallen_back", False)
):
self._summary_model_fallen_back = True
logging.warning(
"Summary model '%s' not available (%s). "
"Summary model '%s' unavailable (%s). "
"Falling back to main model '%s' for compression.",
self.summary_model, e, self.model,
)
+17 -6
View File
@@ -3,6 +3,7 @@
from __future__ import annotations
import logging
import os
import random
import threading
import time
@@ -13,7 +14,7 @@ from datetime import datetime
from typing import Any, Dict, List, Optional, Set, Tuple
from hermes_constants import OPENROUTER_BASE_URL
from hermes_cli.config import get_env_value
from hermes_cli.config import get_env_value, load_env
import hermes_cli.auth as auth_mod
from hermes_cli.auth import (
CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
@@ -1380,6 +1381,16 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool, Set[str]]:
changed = False
active_sources: Set[str] = set()
# Prefer ~/.hermes/.env over os.environ — the user's config file is the
# authoritative source for Hermes credentials. Stale env vars from parent
# processes (Codex CLI, test scripts, etc.) should not override deliberate
# changes to the .env file.
def _get_env_prefer_dotenv(key: str) -> str:
env_file = load_env()
val = env_file.get(key) or os.environ.get(key) or ""
return val.strip()
# Honour user suppression — `hermes auth remove <provider> <N>` for an
# env-seeded credential marks the env:<VAR> source as suppressed so it
# won't be re-seeded from the user's shell environment or ~/.hermes/.env.
@@ -1391,8 +1402,8 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
def _is_source_suppressed(_p, _s): # type: ignore[misc]
return False
if provider == "openrouter":
# Check both os.environ and ~/.hermes/.env file
token = (get_env_value("OPENROUTER_API_KEY") or "").strip()
# Prefer ~/.hermes/.env over os.environ
token = _get_env_prefer_dotenv("OPENROUTER_API_KEY")
if token:
source = "env:OPENROUTER_API_KEY"
if _is_source_suppressed(provider, source):
@@ -1418,7 +1429,7 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
env_url = ""
if pconfig.base_url_env_var:
env_url = (get_env_value(pconfig.base_url_env_var) or "").strip().rstrip("/")
env_url = _get_env_prefer_dotenv(pconfig.base_url_env_var).rstrip("/")
env_vars = list(pconfig.api_key_env_vars)
if provider == "anthropic":
@@ -1429,8 +1440,8 @@ def _seed_from_env(provider: str, entries: List[PooledCredential]) -> Tuple[bool
]
for env_var in env_vars:
# Check both os.environ and ~/.hermes/.env file
token = (get_env_value(env_var) or "").strip()
# Prefer ~/.hermes/.env over os.environ
token = _get_env_prefer_dotenv(env_var)
if not token:
continue
source = f"env:{env_var}"
+218 -36
View File
@@ -24,11 +24,12 @@ from __future__ import annotations
import json
import logging
import os
import re
import tempfile
import threading
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Any, Callable, Dict, List, Optional, Set
from typing import Any, Callable, Dict, List, NamedTuple, Optional, Set
from hermes_constants import get_hermes_home
from tools import skill_usage
@@ -36,6 +37,22 @@ from tools import skill_usage
logger = logging.getLogger(__name__)
def _strip_aux_credential(value: Any) -> Optional[str]:
if value is None:
return None
text = str(value).strip()
return text or None
class _ReviewRuntimeBinding(NamedTuple):
"""Provider/model for the curator review fork plus optional per-slot overrides."""
provider: str
model: str
explicit_api_key: Optional[str]
explicit_base_url: Optional[str]
DEFAULT_INTERVAL_HOURS = 24 * 7 # 7 days
DEFAULT_MIN_IDLE_HOURS = 2
DEFAULT_STALE_AFTER_DAYS = 30
@@ -387,6 +404,11 @@ CURATOR_REVIEW_PROMPT = (
" - skill_manage action=write_file — add a references/, templates/, "
"or scripts/ file under an existing skill (the skill must already "
"exist)\n"
" - skill_manage action=delete — archive a skill. MUST pass "
"`absorbed_into=<umbrella>` when you've merged its content into another "
"skill, or `absorbed_into=\"\"` when you're truly pruning with no "
"forwarding target. This drives cron-job skill-reference migration — "
"guessing from your YAML summary after the fact is fragile.\n"
" - terminal — mv a sibling into the archive "
"OR move its content into a support subfile\n\n"
"'keep' is a legitimate decision ONLY when the skill is already a "
@@ -448,6 +470,24 @@ def _reports_root() -> Path:
return root
def _needle_in_path_component(needle: str, path: str) -> bool:
"""Check if *needle* is a complete filename stem or directory name in *path*.
Unlike simple substring matching, this avoids false positives where short
skill names are embedded in longer filenames (e.g. "api" matching
"references/api-design.md"). Hyphens and underscores are normalised so
"open-webui-setup" matches "open_webui_setup.md".
"""
norm_needle = needle.replace("-", "_")
for part in path.replace("\\", "/").split("/"):
if not part:
continue
stem = part.rsplit(".", 1)[0] if "." in part else part
if stem.replace("-", "_") == norm_needle:
return True
return False
def _classify_removed_skills(
removed: List[str],
added: List[str],
@@ -526,15 +566,29 @@ def _classify_removed_skills(
continue
# Look for the removed skill's name in file_path / content / raw.
haystacks: List[str] = []
# Matching strategy differs by field type:
# file_path — needle must be a complete path component
# (filename stem or directory name), so "api" does NOT
# falsely match "references/api-design.md".
# content fields — word-boundary regex so "test" does NOT
# falsely match "latest" or "testing".
haystacks: List[tuple[str, str]] = []
for key in ("file_path", "file_content", "content", "new_string", "_raw"):
v = args.get(key)
if isinstance(v, str):
haystacks.append(v)
haystacks.append((key, v))
hit = False
for hay in haystacks:
for key, hay in haystacks:
for needle in needles:
if needle and needle in hay:
if not needle:
continue
if key == "file_path":
matched = _needle_in_path_component(needle, hay)
else:
matched = bool(
re.search(rf'\b{re.escape(needle)}\b', hay)
)
if matched:
hit = True
evidence = (
f"skill_manage action={args.get('action', '?')} "
@@ -637,15 +691,76 @@ def _parse_structured_summary(
return out
def _extract_absorbed_into_declarations(
tool_calls: List[Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
"""Walk this run's tool calls and extract model-declared absorption targets.
The curator prompt requires every ``skill_manage(action='delete')`` call
to pass ``absorbed_into=<umbrella>`` when consolidating, or
``absorbed_into=""`` when truly pruning. This is the single authoritative
signal for classification — the model's own declaration at the moment of
deletion, which beats both post-hoc YAML summary parsing and substring
heuristics on other tool calls.
Returns ``{skill_name: {"into": "<umbrella>" | "", "declared": True}}``.
Entries with ``into == ""`` are explicit prunings.
Skills without a ``skill_manage(delete)`` call, or with one that omitted
``absorbed_into``, are not in the returned dict — caller falls back to
the existing heuristic/YAML logic for those (backward compat with older
curator runs and any callers that don't populate the arg).
"""
out: Dict[str, Dict[str, Any]] = {}
for tc in tool_calls or []:
if not isinstance(tc, dict):
continue
if tc.get("name") != "skill_manage":
continue
raw = tc.get("arguments") or ""
args: Dict[str, Any] = {}
if isinstance(raw, dict):
args = raw
elif isinstance(raw, str):
try:
args = json.loads(raw)
except Exception:
continue
if not isinstance(args, dict):
continue
if args.get("action") != "delete":
continue
name = args.get("name")
if not isinstance(name, str) or not name.strip():
continue
# absorbed_into must be present (even empty string is meaningful);
# missing key means the model didn't declare intent.
if "absorbed_into" not in args:
continue
target = args.get("absorbed_into")
if target is None:
continue
if not isinstance(target, str):
continue
out[name.strip()] = {"into": target.strip(), "declared": True}
return out
def _reconcile_classification(
removed: List[str],
heuristic: Dict[str, List[Dict[str, Any]]],
model_block: Dict[str, List[Dict[str, str]]],
destinations: Set[str],
absorbed_declarations: Optional[Dict[str, Dict[str, Any]]] = None,
) -> Dict[str, List[Dict[str, Any]]]:
"""Merge heuristic (tool-call evidence) with the model's structured block.
Rules:
Rules (evaluated in order; first match wins):
- **Model-declared `absorbed_into` at delete time is authoritative.** Any
entry in ``absorbed_declarations`` beats every other signal. This is
the model telling us directly, at the moment of deletion, what it did.
``into != ""`` and target exists → consolidated. ``into == ""`` →
pruned. ``into != ""`` but target doesn't exist → hallucination; fall
through to the usual signals.
- Model-declared consolidation wins when its ``into`` target exists
in ``destinations`` (survived or newly-created). This gives the
model authority over intent + rationale.
@@ -666,6 +781,8 @@ def _reconcile_classification(
model_cons = {e["from"]: e for e in model_block.get("consolidations", [])}
model_pruned = {e["name"]: e for e in model_block.get("prunings", [])}
declared = absorbed_declarations or {}
consolidated: List[Dict[str, Any]] = []
pruned: List[Dict[str, Any]] = []
@@ -673,6 +790,36 @@ def _reconcile_classification(
mc = model_cons.get(name)
mp = model_pruned.get(name)
hc = heur_cons.get(name)
dec = declared.get(name)
# Authoritative: model declared `absorbed_into` at the delete call.
if dec is not None:
into_claim = dec.get("into", "")
if into_claim and into_claim in destinations:
entry: Dict[str, Any] = {
"name": name,
"into": into_claim,
"source": "absorbed_into (model-declared at delete)",
"reason": (mc.get("reason") or "") if mc else "",
}
if hc and hc.get("evidence"):
entry["evidence"] = hc["evidence"]
consolidated.append(entry)
continue
if into_claim == "":
# Explicit prune declaration
pruned.append({
"name": name,
"source": "absorbed_into=\"\" (model-declared prune)",
"reason": (mp.get("reason") or "") if mp else "",
})
continue
# into_claim is non-empty but target doesn't exist: the model
# named a nonexistent umbrella at delete time. The tool already
# rejects this at the skill_manage layer, so we shouldn't see it
# in practice — but if it slips through (e.g. the umbrella was
# deleted LATER in the same run), fall through to the usual
# signals rather than trusting a broken reference.
# Model says consolidated — trust it if the destination is real.
if mc and mc.get("into") in destinations:
@@ -808,11 +955,20 @@ def _write_run_report(
)
model_block = _parse_structured_summary(llm_meta.get("final", "") or "")
destinations = set(after_names) | set(added or [])
# Authoritative signal: extract per-delete `absorbed_into` declarations
# from this run's tool calls. These beat both the YAML summary block and
# the substring heuristic — the model is telling us directly, at the
# moment of deletion, whether each archived skill was consolidated
# (into=<umbrella>) or pruned (into="").
absorbed_declarations = _extract_absorbed_into_declarations(
llm_meta.get("tool_calls", []) or []
)
classification = _reconcile_classification(
removed=removed,
heuristic=heuristic,
model_block=model_block,
destinations=destinations,
absorbed_declarations=absorbed_declarations,
)
consolidated = classification["consolidated"]
pruned = classification["pruned"]
@@ -1291,6 +1447,52 @@ def run_curator_review(
}
def _resolve_review_runtime(cfg: Dict[str, Any]) -> _ReviewRuntimeBinding:
"""Resolve provider/model and per-slot credentials for the curator review fork.
Same precedence as `_resolve_review_model()`. Non-empty ``api_key`` /
``base_url`` from the active slot are returned as explicit overrides so
``resolve_runtime_provider`` does not silently reuse the main chat
credential chain for a routed auxiliary model.
"""
_main = cfg.get("model", {}) if isinstance(cfg.get("model"), dict) else {}
_main_provider = _main.get("provider") or "auto"
_main_model = _main.get("default") or _main.get("model") or ""
# 1. Canonical aux task slot
_aux = cfg.get("auxiliary", {}) if isinstance(cfg.get("auxiliary"), dict) else {}
_cur_task = _aux.get("curator", {}) if isinstance(_aux.get("curator"), dict) else {}
_task_provider = (_cur_task.get("provider") or "").strip() or None
_task_model = (_cur_task.get("model") or "").strip() or None
if _task_provider and _task_provider != "auto" and _task_model:
return _ReviewRuntimeBinding(
_task_provider,
_task_model,
_strip_aux_credential(_cur_task.get("api_key")),
_strip_aux_credential(_cur_task.get("base_url")),
)
# 2. Legacy curator.auxiliary.{provider,model} (deprecated, pre-unification)
_cur = cfg.get("curator", {}) if isinstance(cfg.get("curator"), dict) else {}
_legacy = _cur.get("auxiliary", {}) if isinstance(_cur.get("auxiliary"), dict) else {}
_legacy_provider = _legacy.get("provider") or None
_legacy_model = _legacy.get("model") or None
if _legacy_provider and _legacy_model:
logger.info(
"curator: using deprecated curator.auxiliary.{provider,model} "
"config — please migrate to auxiliary.curator.{provider,model}"
)
return _ReviewRuntimeBinding(
str(_legacy_provider),
str(_legacy_model),
_strip_aux_credential(_legacy.get("api_key")),
_strip_aux_credential(_legacy.get("base_url")),
)
# 3. Fall through to the main chat model
return _ReviewRuntimeBinding(_main_provider, _main_model, None, None)
def _resolve_review_model(cfg: Dict[str, Any]) -> tuple[str, str]:
"""Pick (provider, model) for the curator review fork.
@@ -1306,32 +1508,8 @@ def _resolve_review_model(cfg: Dict[str, Any]) -> tuple[str, str]:
2. Legacy ``curator.auxiliary.{provider,model}`` when both are set
3. Main ``model.{provider,default/model}`` pair
"""
_main = cfg.get("model", {}) if isinstance(cfg.get("model"), dict) else {}
_main_provider = _main.get("provider") or "auto"
_main_model = _main.get("default") or _main.get("model") or ""
# 1. Canonical aux task slot
_aux = cfg.get("auxiliary", {}) if isinstance(cfg.get("auxiliary"), dict) else {}
_cur_task = _aux.get("curator", {}) if isinstance(_aux.get("curator"), dict) else {}
_task_provider = (_cur_task.get("provider") or "").strip() or None
_task_model = (_cur_task.get("model") or "").strip() or None
if _task_provider and _task_provider != "auto" and _task_model:
return _task_provider, _task_model
# 2. Legacy curator.auxiliary.{provider,model} (deprecated, pre-unification)
_cur = cfg.get("curator", {}) if isinstance(cfg.get("curator"), dict) else {}
_legacy = _cur.get("auxiliary", {}) if isinstance(_cur.get("auxiliary"), dict) else {}
_legacy_provider = _legacy.get("provider") or None
_legacy_model = _legacy.get("model") or None
if _legacy_provider and _legacy_model:
logger.info(
"curator: using deprecated curator.auxiliary.{provider,model} "
"config — please migrate to auxiliary.curator.{provider,model}"
)
return _legacy_provider, _legacy_model
# 3. Fall through to the main chat model
return _main_provider, _main_model
b = _resolve_review_runtime(cfg)
return b.provider, b.model
def _run_llm_review(prompt: str) -> Dict[str, Any]:
@@ -1370,10 +1548,10 @@ def _run_llm_review(prompt: str) -> Dict[str, Any]:
# arguments hits an auto-resolution path that fails for OAuth-only
# providers and for pool-backed credentials.
#
# `_resolve_review_model()` honors `auxiliary.curator.{provider,model}`
# `_resolve_review_runtime()` honors `auxiliary.curator.{provider,model,...}`
# (canonical aux-task slot, wired through `hermes model` → auxiliary
# picker and the dashboard Models tab), with a legacy fallback to
# `curator.auxiliary.{provider,model}`. See docs/user-guide/features/curator.md.
# `curator.auxiliary.{provider,model,...}`. See docs/user-guide/features/curator.md.
_api_key = None
_base_url = None
_api_mode = None
@@ -1383,9 +1561,13 @@ def _run_llm_review(prompt: str) -> Dict[str, Any]:
from hermes_cli.config import load_config
from hermes_cli.runtime_provider import resolve_runtime_provider
_cfg = load_config()
_provider, _model_name = _resolve_review_model(_cfg)
_binding = _resolve_review_runtime(_cfg)
_provider, _model_name = _binding.provider, _binding.model
_rp = resolve_runtime_provider(
requested=_provider, target_model=_model_name
requested=_provider,
target_model=_model_name,
explicit_api_key=_binding.explicit_api_key,
explicit_base_url=_binding.explicit_base_url,
)
_api_key = _rp.get("api_key")
_base_url = _rp.get("base_url")
+257 -4
View File
@@ -21,6 +21,18 @@ It DOES include:
pointer otherwise the curator would immediately re-fire on the next
tick)
- ``.bundled_manifest`` (so protection markers stay consistent)
Alongside the skills tarball, each snapshot also captures a copy of
``~/.hermes/cron/jobs.json`` as ``cron-jobs.json`` when it exists. Cron
jobs reference skills by name in their ``skills``/``skill`` fields; the
curator's consolidation pass rewrites those in place via
``cron.jobs.rewrite_skill_refs()``. Without capturing the pre-run state,
rolling back the skills tree would leave cron jobs pointing at the
umbrella skills even though the narrow skills they were originally
configured with have been restored. We store the whole jobs.json for
fidelity but rollback only touches the ``skills``/``skill`` fields the
rest (schedule, next_run_at, enabled, prompt, etc.) is live state and
we leave it alone.
"""
from __future__ import annotations
@@ -63,6 +75,60 @@ def _skills_dir() -> Path:
return get_hermes_home() / "skills"
def _cron_jobs_file() -> Path:
"""Source path for the live cron jobs store (``~/.hermes/cron/jobs.json``)."""
return get_hermes_home() / "cron" / "jobs.json"
CRON_JOBS_FILENAME = "cron-jobs.json"
def _backup_cron_jobs_into(dest: Path) -> Dict[str, Any]:
"""Copy the live cron jobs.json into ``dest`` as ``cron-jobs.json``.
Returns a small dict describing what was captured so the caller can
fold it into the manifest. Never raises if the cron file is missing
or unreadable, the return dict has ``backed_up=False`` and the reason,
and the snapshot proceeds without cron data (the snapshot is still
useful for rolling back skills).
"""
src = _cron_jobs_file()
info: Dict[str, Any] = {"backed_up": False, "jobs_count": 0}
if not src.exists():
info["reason"] = "no cron/jobs.json present"
return info
try:
raw = src.read_text(encoding="utf-8")
except OSError as e:
logger.debug("Failed to read cron/jobs.json for backup: %s", e)
info["reason"] = f"read error: {e}"
return info
# Count jobs as a nice diagnostic — but don't fail the snapshot if the
# file is unparseable; just store the raw text and let rollback deal
# with it (or not, if it's corrupted). jobs.json wraps the list as
# `{"jobs": [...], "updated_at": ...}` — we count via that shape, and
# fall back to bare-list shape just in case the format ever changes.
try:
parsed = json.loads(raw)
if isinstance(parsed, dict):
inner = parsed.get("jobs")
if isinstance(inner, list):
info["jobs_count"] = len(inner)
elif isinstance(parsed, list):
info["jobs_count"] = len(parsed)
except (json.JSONDecodeError, TypeError):
info["jobs_count"] = 0
info["parse_warning"] = "jobs.json was not valid JSON at snapshot time"
try:
(dest / CRON_JOBS_FILENAME).write_text(raw, encoding="utf-8")
except OSError as e:
logger.debug("Failed to write cron backup file: %s", e)
info["reason"] = f"write error: {e}"
return info
info["backed_up"] = True
return info
def _utc_id(now: Optional[datetime] = None) -> str:
"""UTC ISO-ish filesystem-safe timestamp: ``2026-05-01T13-05-42Z``."""
if now is None:
@@ -116,7 +182,8 @@ def _count_skill_files(base: Path) -> int:
def _write_manifest(dest: Path, reason: str, archive_path: Path,
skills_counted: int) -> None:
skills_counted: int,
cron_info: Optional[Dict[str, Any]] = None) -> None:
manifest = {
"id": dest.name,
"reason": reason,
@@ -125,6 +192,15 @@ def _write_manifest(dest: Path, reason: str, archive_path: Path,
"archive_bytes": archive_path.stat().st_size,
"skill_files": skills_counted,
}
if cron_info is not None:
manifest["cron_jobs"] = {
"backed_up": bool(cron_info.get("backed_up", False)),
"jobs_count": int(cron_info.get("jobs_count", 0)),
}
if not cron_info.get("backed_up"):
manifest["cron_jobs"]["reason"] = cron_info.get("reason", "not captured")
if cron_info.get("parse_warning"):
manifest["cron_jobs"]["parse_warning"] = cron_info["parse_warning"]
(dest / "manifest.json").write_text(
json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8"
)
@@ -181,7 +257,14 @@ def snapshot_skills(reason: str = "manual") -> Optional[Path]:
# arcname: store paths relative to skills/ so extraction
# drops cleanly back into the skills dir.
tf.add(str(entry), arcname=entry.name, recursive=True)
_write_manifest(dest, reason, archive, _count_skill_files(skills))
# Capture cron/jobs.json alongside the tarball. Never fails the
# snapshot — the skills side is the core guarantee; cron is
# additive. We still record in the manifest whether it was
# captured so rollback can surface "no cron data in this snapshot".
cron_info = _backup_cron_jobs_into(dest)
_write_manifest(dest, reason, archive,
_count_skill_files(skills),
cron_info=cron_info)
except (OSError, tarfile.TarError) as e:
logger.debug("Curator snapshot failed: %s", e, exc_info=True)
# Clean up partial snapshot
@@ -298,6 +381,149 @@ def _resolve_backup(backup_id: Optional[str]) -> Optional[Path]:
return candidates[0] if candidates else None
def _restore_cron_skill_links(snapshot_dir: Path) -> Dict[str, Any]:
"""Reconcile backed-up cron skill links into the live ``cron/jobs.json``.
We do NOT overwrite the whole cron file. Only the ``skills`` and
``skill`` fields are restored, and only on jobs that still exist in the
current file (matched by ``id``). Everything else about the job
schedule, next_run_at, last_run_at, enabled, prompt, workdir, hooks
is live state that the user/scheduler has modified since the snapshot;
overwriting it would regress unrelated cron activity.
Rules:
- Jobs present in backup AND live, with differing skills skills restored.
- Jobs present in backup AND live, with matching skills no-op.
- Jobs present in backup but gone from live (user deleted the job
after the snapshot) skipped, noted in the return report.
- Jobs present in live but not in backup (user created a new cron
job after the snapshot) left untouched.
Never raises; failures are captured in the return dict. Writes through
``cron.jobs`` to pick up the same lock + atomic-write path that tick()
uses, so we don't race the scheduler.
"""
report: Dict[str, Any] = {
"attempted": False,
"restored": [],
"skipped_missing": [],
"unchanged": 0,
"error": None,
}
backup_file = snapshot_dir / CRON_JOBS_FILENAME
if not backup_file.exists():
report["error"] = f"snapshot has no {CRON_JOBS_FILENAME}"
return report
try:
backup_text = backup_file.read_text(encoding="utf-8")
backup_parsed = json.loads(backup_text)
except (OSError, json.JSONDecodeError) as e:
report["error"] = f"failed to load backed-up jobs: {e}"
return report
# jobs.json on disk is `{"jobs": [...], "updated_at": ...}`; accept both
# that shape and a bare list for forward compat.
if isinstance(backup_parsed, dict):
backup_jobs = backup_parsed.get("jobs")
elif isinstance(backup_parsed, list):
backup_jobs = backup_parsed
else:
backup_jobs = None
if not isinstance(backup_jobs, list):
report["error"] = "backed-up cron-jobs.json has no jobs list"
return report
# Build a lookup of the backed-up skill state keyed by job id.
# We only need the two skill-ish fields (legacy single and modern list).
backup_by_id: Dict[str, Dict[str, Any]] = {}
for job in backup_jobs:
if not isinstance(job, dict):
continue
jid = job.get("id")
if not isinstance(jid, str) or not jid:
continue
backup_by_id[jid] = {
"skills": job.get("skills"),
"skill": job.get("skill"),
"name": job.get("name") or jid,
}
if not backup_by_id:
report["attempted"] = True # we tried but there was nothing to do
return report
# Load and rewrite the live jobs under the scheduler's lock.
try:
from cron.jobs import load_jobs, save_jobs, _jobs_file_lock
except ImportError as e:
report["error"] = f"cron module unavailable: {e}"
return report
report["attempted"] = True
try:
with _jobs_file_lock:
live_jobs = load_jobs()
changed = False
live_ids = set()
for live in live_jobs:
if not isinstance(live, dict):
continue
jid = live.get("id")
if not isinstance(jid, str) or not jid:
continue
live_ids.add(jid)
backup = backup_by_id.get(jid)
if backup is None:
continue # live job didn't exist at snapshot time
cur_skills = live.get("skills")
cur_skill = live.get("skill")
bkp_skills = backup.get("skills")
bkp_skill = backup.get("skill")
if cur_skills == bkp_skills and cur_skill == bkp_skill:
report["unchanged"] += 1
continue
# Restore. Preserve absence (don't force the key to appear
# if the backup didn't have it either).
if bkp_skills is None:
live.pop("skills", None)
else:
live["skills"] = bkp_skills
if bkp_skill is None:
live.pop("skill", None)
else:
live["skill"] = bkp_skill
report["restored"].append({
"job_id": jid,
"job_name": backup.get("name") or jid,
"from": {"skills": cur_skills, "skill": cur_skill},
"to": {"skills": bkp_skills, "skill": bkp_skill},
})
changed = True
# Jobs in backup but not in live = user deleted them after snapshot
for jid, backup in backup_by_id.items():
if jid not in live_ids:
report["skipped_missing"].append({
"job_id": jid,
"job_name": backup.get("name") or jid,
})
if changed:
save_jobs(live_jobs)
except Exception as e: # noqa: BLE001 — rollback must not die mid-restore
logger.debug("Cron skill-link restore failed: %s", e, exc_info=True)
report["error"] = f"restore failed mid-flight: {e}"
return report
def rollback(backup_id: Optional[str] = None) -> Tuple[bool, str, Optional[Path]]:
"""Restore ``~/.hermes/skills/`` from a snapshot.
@@ -408,8 +634,35 @@ def rollback(backup_id: Optional[str] = None) -> Tuple[bool, str, Optional[Path]
except OSError:
pass
logger.info("Curator rollback: restored from %s", target.name)
return (True, f"restored from snapshot {target.name}", target)
# Reconcile cron skill-links. Surgical: only the skills/skill fields
# on jobs matched by id. Everything else in jobs.json is live state
# (schedule, next_run_at, enabled, prompt, etc.) and we leave it
# alone. Failures here don't fail the overall rollback — the skills
# tree is already restored, which is the main guarantee.
cron_report = _restore_cron_skill_links(target)
summary_bits = [f"restored from snapshot {target.name}"]
if cron_report.get("attempted"):
restored_n = len(cron_report.get("restored") or [])
skipped_n = len(cron_report.get("skipped_missing") or [])
if cron_report.get("error"):
summary_bits.append(f"cron links: error — {cron_report['error']}")
elif restored_n == 0 and skipped_n == 0 and cron_report.get("unchanged", 0) == 0:
# Attempted but nothing matched — empty snapshot or no overlapping ids.
pass
else:
parts = []
if restored_n:
parts.append(f"{restored_n} job(s) had skill links restored")
if skipped_n:
parts.append(f"{skipped_n} backed-up job(s) no longer exist (skipped)")
if cron_report.get("unchanged"):
parts.append(f"{cron_report['unchanged']} already matched")
summary_bits.append("cron links: " + ", ".join(parts))
logger.info("Curator rollback: restored from %s (cron_report=%s)",
target.name, cron_report)
return (True, "; ".join(summary_bits), target)
# ---------------------------------------------------------------------------
+12 -2
View File
@@ -520,7 +520,12 @@ def classify_api_error(
is_disconnect = any(p in error_msg for p in _SERVER_DISCONNECT_PATTERNS)
if is_disconnect and not status_code:
is_large = approx_tokens > context_length * 0.6 or approx_tokens > 120000 or num_messages > 200
# Absolute token/message-count thresholds are only a proxy for smaller
# context windows. Large-context sessions can have hundreds of
# messages while still being far below their actual token budget.
is_large = approx_tokens > context_length * 0.6 or (
context_length <= 256000 and (approx_tokens > 120000 or num_messages > 200)
)
if is_large:
return _result(
FailoverReason.context_overflow,
@@ -766,7 +771,12 @@ def _classify_400(
if not err_body_msg:
err_body_msg = str(body.get("message") or "").strip().lower()
is_generic = len(err_body_msg) < 30 or err_body_msg in ("error", "")
is_large = approx_tokens > context_length * 0.4 or approx_tokens > 80000 or num_messages > 80
# Absolute token/message-count thresholds are only a proxy for smaller
# context windows. Large-context sessions can have many messages while
# still being far below their actual token budget.
is_large = approx_tokens > context_length * 0.4 or (
context_length <= 256000 and (approx_tokens > 80000 or num_messages > 80)
)
if is_generic and is_large:
return result_fn(
+15 -1
View File
@@ -679,7 +679,21 @@ def translate_stream_event(event: Dict[str, Any], model: str, tool_call_indices:
finish_reason_raw = str(cand.get("finishReason") or "")
if finish_reason_raw:
mapped = "tool_calls" if tool_call_indices else _map_gemini_finish_reason(finish_reason_raw)
chunks.append(_make_stream_chunk(model=model, finish_reason=mapped))
finish_chunk = _make_stream_chunk(model=model, finish_reason=mapped)
# Attach usage from this event's usageMetadata so the streaming
# loop in run_agent.py can record token counts (mirrors the
# non-streaming path in translate_gemini_response).
usage_meta = event.get("usageMetadata") or {}
if usage_meta:
finish_chunk.usage = SimpleNamespace(
prompt_tokens=int(usage_meta.get("promptTokenCount") or 0),
completion_tokens=int(usage_meta.get("candidatesTokenCount") or 0),
total_tokens=int(usage_meta.get("totalTokenCount") or 0),
prompt_tokens_details=SimpleNamespace(
cached_tokens=int(usage_meta.get("cachedContentTokenCount") or 0),
),
)
chunks.append(finish_chunk)
return chunks
+15 -2
View File
@@ -489,16 +489,29 @@ def save_credentials(creds: GoogleCredentials) -> Path:
"""Atomically write creds to disk with 0o600 permissions."""
path = _credentials_path()
path.parent.mkdir(parents=True, exist_ok=True)
# Tighten parent dir to 0o700 so siblings can't traverse to the creds file.
# On Windows this is a no-op (POSIX mode bits aren't enforced); ignore failures.
try:
os.chmod(path.parent, 0o700)
except OSError:
pass
payload = json.dumps(creds.to_dict(), indent=2, sort_keys=True) + "\n"
with _credentials_lock():
tmp_path = path.with_suffix(f".tmp.{os.getpid()}.{secrets.token_hex(4)}")
try:
with open(tmp_path, "w", encoding="utf-8") as fh:
# Create with 0o600 atomically to close the TOCTOU window where the
# default umask (often 0o644) would briefly expose tokens to other
# local users between open() and chmod().
fd = os.open(
str(tmp_path),
os.O_WRONLY | os.O_CREAT | os.O_EXCL,
stat.S_IRUSR | stat.S_IWUSR,
)
with os.fdopen(fd, "w", encoding="utf-8") as fh:
fh.write(payload)
fh.flush()
os.fsync(fh.fileno())
os.chmod(tmp_path, stat.S_IRUSR | stat.S_IWUSR)
atomic_replace(tmp_path, path)
finally:
try:
+2 -2
View File
@@ -183,8 +183,8 @@ SKILLS_GUIDANCE = (
)
KANBAN_GUIDANCE = (
"# You are a Kanban worker\n"
"You were spawned by the Hermes Kanban dispatcher to execute ONE task from "
"# Kanban task execution protocol\n"
"You have been assigned ONE task from "
"the shared board at `~/.hermes/kanban.db`. Your task id is in "
"`$HERMES_KANBAN_TASK`; your workspace is `$HERMES_KANBAN_WORKSPACE`. "
"The `kanban_*` tools in your schema are your primary coordination surface — "
+17 -11
View File
@@ -305,13 +305,18 @@ def _redact_form_body(text: str) -> str:
return _redact_query_string(text.strip())
def redact_sensitive_text(text: str, *, force: bool = False) -> str:
def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = False) -> str:
"""Apply all redaction patterns to a block of text.
Safe to call on any string -- non-matching text passes through unchanged.
Disabled by default enable via security.redact_secrets: true in config.yaml.
Set force=True for safety boundaries that must never return raw secrets
regardless of the user's global logging redaction preference.
Set code_file=True to skip the ENV-assignment and JSON-field regex
patterns when the text is known to be source code (e.g. MAX_TOKENS=***
constants, "apiKey": "test" fixtures). Prefix patterns, auth headers,
private keys, DB connstrings, JWTs, and URL secrets are still redacted.
"""
if text is None:
return None
@@ -325,17 +330,18 @@ def redact_sensitive_text(text: str, *, force: bool = False) -> str:
# Known prefixes (sk-, ghp_, etc.)
text = _PREFIX_RE.sub(lambda m: _mask_token(m.group(1)), text)
# ENV assignments: OPENAI_API_KEY=sk-abc...
def _redact_env(m):
name, quote, value = m.group(1), m.group(2), m.group(3)
return f"{name}={quote}{_mask_token(value)}{quote}"
text = _ENV_ASSIGN_RE.sub(_redact_env, text)
# ENV assignments: OPENAI_API_KEY=*** (skip for code files — false positives)
if not code_file:
def _redact_env(m):
name, quote, value = m.group(1), m.group(2), m.group(3)
return f"{name}={quote}{_mask_token(value)}{quote}"
text = _ENV_ASSIGN_RE.sub(_redact_env, text)
# JSON fields: "apiKey": "value"
def _redact_json(m):
key, value = m.group(1), m.group(2)
return f'{key}: "{_mask_token(value)}"'
text = _JSON_FIELD_RE.sub(_redact_json, text)
# JSON fields: "apiKey": "***" (skip for code files — false positives)
def _redact_json(m):
key, value = m.group(1), m.group(2)
return f'{key}: "{_mask_token(value)}"'
text = _JSON_FIELD_RE.sub(_redact_json, text)
# Authorization headers
text = _AUTH_HEADER_RE.sub(
+38 -3
View File
@@ -6,6 +6,7 @@ can invoke skills via /skill-name commands.
import json
import logging
import os
import re
from pathlib import Path
from typing import Any, Dict, Optional
@@ -20,10 +21,35 @@ from agent.skill_preprocessing import (
logger = logging.getLogger(__name__)
_skill_commands: Dict[str, Dict[str, Any]] = {}
_skill_commands_platform: Optional[str] = None
# Patterns for sanitizing skill names into clean hyphen-separated slugs.
_SKILL_INVALID_CHARS = re.compile(r"[^a-z0-9-]")
_SKILL_MULTI_HYPHEN = re.compile(r"-{2,}")
def _resolve_skill_commands_platform() -> Optional[str]:
"""Return the current platform scope used for disabled-skill filtering.
Used to detect when the active platform has shifted so
:func:`get_skill_commands` can drop a stale cache that was populated
for a different platform's ``skills.platform_disabled`` view (#14536).
Resolves from (in order) ``HERMES_PLATFORM`` env var and
``HERMES_SESSION_PLATFORM`` from the gateway session context. Returns
``None`` when no platform scope is active (e.g. classic CLI, RL
rollouts, standalone scripts).
"""
try:
from gateway.session_context import get_session_env
resolved_platform = (
os.getenv("HERMES_PLATFORM")
or get_session_env("HERMES_SESSION_PLATFORM")
)
except Exception:
resolved_platform = os.getenv("HERMES_PLATFORM")
return resolved_platform or None
def _load_skill_payload(skill_identifier: str, task_id: str | None = None) -> tuple[dict[str, Any], Path | None, str] | None:
"""Load a skill by name/path and return (loaded_payload, skill_dir, display_name)."""
raw_identifier = (skill_identifier or "").strip()
@@ -218,7 +244,8 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
Returns:
Dict mapping "/skill-name" to {name, description, skill_md_path, skill_dir}.
"""
global _skill_commands
global _skill_commands, _skill_commands_platform
_skill_commands_platform = _resolve_skill_commands_platform()
_skill_commands = {}
try:
from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform, _get_disabled_skill_names
@@ -278,8 +305,16 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
def get_skill_commands() -> Dict[str, Dict[str, Any]]:
"""Return the current skill commands mapping (scan first if empty)."""
if not _skill_commands:
"""Return the current skill commands mapping (scan first if empty).
Rescans when the active platform scope changes (e.g. a gateway
process serving Telegram and Discord concurrently) so each platform
sees its own ``skills.platform_disabled`` view (#14536).
"""
if (
not _skill_commands
or _skill_commands_platform != _resolve_skill_commands_platform()
):
scan_skill_commands()
return _skill_commands
+13 -1
View File
@@ -17,6 +17,7 @@ logger = logging.getLogger(__name__)
# so silent-drops (e.g. OpenRouter 402 exhausting the fallback chain)
# become visible instead of piling up as NULL session titles.
FailureCallback = Callable[[str, BaseException], None]
TitleCallback = Callable[[str], None]
_TITLE_PROMPT = (
"Generate a short, descriptive title (3-7 words) for a conversation that starts with the "
@@ -90,6 +91,7 @@ def auto_title_session(
assistant_response: str,
failure_callback: Optional[FailureCallback] = None,
main_runtime: dict = None,
title_callback: Optional[TitleCallback] = None,
) -> None:
"""Generate and set a session title if one doesn't already exist.
@@ -119,6 +121,11 @@ def auto_title_session(
try:
session_db.set_session_title(session_id, title)
logger.debug("Auto-generated session title: %s", title)
if title_callback is not None:
try:
title_callback(title)
except Exception:
logger.debug("Auto-title callback failed", exc_info=True)
except Exception as e:
logger.debug("Failed to set auto-generated title: %s", e)
@@ -131,6 +138,7 @@ def maybe_auto_title(
conversation_history: list,
failure_callback: Optional[FailureCallback] = None,
main_runtime: dict = None,
title_callback: Optional[TitleCallback] = None,
) -> None:
"""Fire-and-forget title generation after the first exchange.
@@ -152,7 +160,11 @@ def maybe_auto_title(
thread = threading.Thread(
target=auto_title_session,
args=(session_db, session_id, user_message, assistant_response),
kwargs={"failure_callback": failure_callback, "main_runtime": main_runtime},
kwargs={
"failure_callback": failure_callback,
"main_runtime": main_runtime,
"title_callback": title_callback,
},
daemon=True,
name="auto-title",
)
+12 -1
View File
@@ -143,7 +143,18 @@ class ResponsesApiTransport(ProviderTransport):
kwargs["max_output_tokens"] = max_tokens
if is_xai_responses and session_id:
kwargs["extra_headers"] = {"x-grok-conv-id": session_id}
existing_extra_headers = kwargs.get("extra_headers")
merged_extra_headers: Dict[str, str] = {}
if isinstance(existing_extra_headers, dict):
merged_extra_headers.update(
{
str(key): str(value)
for key, value in existing_extra_headers.items()
if key and value is not None
}
)
merged_extra_headers["x-grok-conv-id"] = session_id
kwargs["extra_headers"] = merged_extra_headers
return kwargs
+12
View File
@@ -121,6 +121,18 @@ model:
# # Data policy: "allow" (default) or "deny" to exclude providers that may store data
# # data_collection: "deny"
# =============================================================================
# OpenRouter Response Caching (only applies when using OpenRouter)
# =============================================================================
# Cache identical API responses at the OpenRouter edge for free instant replays.
# When enabled, identical requests (same model, messages, parameters) return
# cached responses with zero billing. Separate from Anthropic prompt caching.
# See: https://openrouter.ai/docs/guides/features/response-caching
#
# openrouter:
# response_cache: true # Enable response caching (default: true)
# response_cache_ttl: 300 # Cache TTL in seconds, 1-86400 (default: 300)
# =============================================================================
# Git Worktree Isolation
# =============================================================================
+305 -73
View File
@@ -459,32 +459,19 @@ def load_cli_config() -> Dict[str, Any]:
if "backend" in terminal_config:
terminal_config["env_type"] = terminal_config["backend"]
# Handle special cwd values: "." or "auto" means use current working directory.
# Only resolve to the host's CWD for the local backend where the host
# filesystem is directly accessible. For ALL remote/container backends
# (ssh, docker, modal, singularity), the host path doesn't exist on the
# target -- remove the key so terminal_tool.py uses its per-backend default.
#
# GUARD: If TERMINAL_CWD is already set to a real absolute path (by the
# gateway's config bridge earlier in the process), don't clobber it.
# This prevents a lazy import of cli.py during gateway runtime from
# rewriting TERMINAL_CWD to the service's working directory.
# See issue #10817.
# CWD resolution for CLI/TUI. The gateway has its own config bridge in
# gateway/run.py but may lazily import cli.py (triggering this code).
# Local backend: always os.getcwd(). Use `cd /dir && hermes` to control it.
# Non-local with placeholder: pop so terminal_tool uses its per-backend default.
# Non-local with explicit path: keep as-is.
_CWD_PLACEHOLDERS = (".", "auto", "cwd")
if terminal_config.get("cwd") in _CWD_PLACEHOLDERS:
_existing_cwd = os.environ.get("TERMINAL_CWD", "")
if _existing_cwd and _existing_cwd not in _CWD_PLACEHOLDERS and os.path.isabs(_existing_cwd):
# Gateway (or earlier startup) already resolved a real path — keep it
terminal_config["cwd"] = _existing_cwd
defaults["terminal"]["cwd"] = _existing_cwd
else:
effective_backend = terminal_config.get("env_type", "local")
if effective_backend == "local":
terminal_config["cwd"] = os.getcwd()
defaults["terminal"]["cwd"] = terminal_config["cwd"]
else:
# Remove so TERMINAL_CWD stays unset → tool picks backend default
terminal_config.pop("cwd", None)
effective_backend = terminal_config.get("env_type", "local")
if effective_backend == "local":
terminal_config["cwd"] = os.getcwd()
defaults["terminal"]["cwd"] = terminal_config["cwd"]
elif terminal_config.get("cwd") in _CWD_PLACEHOLDERS:
terminal_config.pop("cwd", None)
env_mappings = {
"env_type": "TERMINAL_ENV",
@@ -517,13 +504,18 @@ def load_cli_config() -> Dict[str, Any]:
"sudo_password": "SUDO_PASSWORD",
}
# Apply config values to env vars so terminal_tool picks them up.
# If the config file explicitly has a [terminal] section, those values are
# authoritative and override any .env settings. When using defaults only
# (no config file or no terminal section), don't overwrite env vars that
# were already set by .env -- the user's .env is the fallback source.
# Bridge config env vars for terminal_tool. TERMINAL_CWD is force-exported
# UNLESS we're inside a gateway process (detected by _HERMES_GATEWAY marker)
# where it was already set correctly by gateway/run.py's config bridge.
_is_gateway = os.environ.get("_HERMES_GATEWAY") == "1"
for config_key, env_var in env_mappings.items():
if config_key in terminal_config:
if env_var == "TERMINAL_CWD":
if _is_gateway:
continue
# CLI: always export (overrides stale .env or inherited values)
os.environ[env_var] = str(terminal_config[config_key])
continue
if _file_has_terminal_config or env_var not in os.environ:
val = terminal_config[config_key]
if isinstance(val, list):
@@ -1234,6 +1226,28 @@ def _strip_markdown_syntax(text: str) -> str:
return plain.strip("\n")
_WINDOWS_PATH_WITH_DOT_SEGMENT_RE = re.compile(
r"(?i)(?:\b[a-z]:\\|\\\\)[^\s`]*\\\.[^\s`]*"
)
def _preserve_windows_dot_segments_for_markdown(text: str) -> str:
r"""Keep Windows path separators before hidden directories in Markdown.
CommonMark treats ``\.`` as an escaped literal dot, so Rich Markdown would
render ``D:\repo\.ai`` as ``D:\repo.ai``. Doubling only that separator
inside Windows path-looking tokens preserves the path without changing
ordinary markdown escapes like ``1\. not a list``.
"""
if "\\." not in text:
return text
def _protect(match: re.Match[str]) -> str:
return re.sub(r"(?<!\\)\\(?=\.)", r"\\\\", match.group(0))
return _WINDOWS_PATH_WITH_DOT_SEGMENT_RE.sub(_protect, text)
def _render_final_assistant_content(text: str, mode: str = "render"):
"""Render final assistant content as markdown, stripped text, or raw text."""
from rich.markdown import Markdown
@@ -1245,6 +1259,7 @@ def _render_final_assistant_content(text: str, mode: str = "render"):
return _rich_text_from_ansi(text or "")
plain = _rich_text_from_ansi(text or "").plain
plain = _preserve_windows_dot_segments_for_markdown(plain)
return Markdown(plain)
@@ -1513,6 +1528,10 @@ def _detect_file_drop(user_input: str) -> "dict | None":
or stripped.startswith('"~')
or stripped.startswith("'/")
or stripped.startswith("'~")
or stripped.startswith('"./')
or stripped.startswith('"../')
or stripped.startswith("'./")
or stripped.startswith("'../")
or (len(stripped) >= 4 and stripped[0] in ("'", '"') and stripped[2] == ":" and stripped[3] in ("\\", "/") and stripped[1].isalpha())
)
if not starts_like_path:
@@ -2560,23 +2579,59 @@ class HermesCLI:
return f" {txt} ({elapsed_str})"
return f" {txt}"
def _voice_record_key_label(self) -> str:
"""Return the configured voice push-to-talk key formatted for UI.
Shared helper so every voice-facing status line / placeholder /
recording hint advertises the SAME label as the registered
prompt_toolkit binding.
Cached at startup (see ``set_voice_record_key_cache``) rather
than re-read per render. Two reasons (Copilot round-13 on
#19835):
* The prompt_toolkit binding is registered once at session
start via ``@kb.add(_voice_key)``; re-reading config per
render meant the status bar could advertise a new shortcut
after a config edit while the actual binding was still the
startup chord exactly the display/binding drift this PR
is trying to eliminate.
* The label is on the hot render path (status bar + composer
placeholder invalidated every 150ms during recording), so
reading config on every call added avoidable UI overhead.
"""
return getattr(self, "_voice_record_key_display_cache", None) or "Ctrl+B"
def set_voice_record_key_cache(self, raw_key: object) -> None:
"""Populate the voice label cache from a raw ``voice.record_key``.
Called at CLI startup after the prompt_toolkit binding is
registered so the cached label always matches the live binding.
"""
try:
from hermes_cli.voice import format_voice_record_key_for_status
self._voice_record_key_display_cache = format_voice_record_key_for_status(raw_key)
except Exception:
self._voice_record_key_display_cache = "Ctrl+B"
def _get_voice_status_fragments(self, width: Optional[int] = None):
"""Return the voice status bar fragments for the interactive TUI."""
width = width or self._get_tui_terminal_width()
compact = self._use_minimal_tui_chrome(width=width)
label = self._voice_record_key_label()
if self._voice_recording:
if compact:
return [("class:voice-status-recording", " ● REC ")]
return [("class:voice-status-recording", " ● REC Ctrl+B to stop ")]
return [("class:voice-status-recording", f" ● REC {label} to stop ")]
if self._voice_processing:
if compact:
return [("class:voice-status", " ◉ STT ")]
return [("class:voice-status", " ◉ Transcribing... ")]
if compact:
return [("class:voice-status", " 🎤 Ctrl+B ")]
return [("class:voice-status", f" 🎤 {label} ")]
tts = " | TTS on" if self._voice_tts else ""
cont = " | Continuous" if self._voice_continuous else ""
return [("class:voice-status", f" 🎤 Voice mode{tts}{cont}Ctrl+B to record ")]
return [("class:voice-status", f" 🎤 Voice mode{tts}{cont}{label} to record ")]
def _build_status_bar_text(self, width: Optional[int] = None) -> str:
"""Return a compact one-line session status string for the TUI footer."""
@@ -2928,7 +2983,14 @@ class HermesCLI:
def _expand_ref(match):
path = Path(match.group(1))
return path.read_text(encoding="utf-8") if path.exists() else match.group(0)
# Use try/except instead of path.exists() to avoid TOCTOU race:
# the paste file may be deleted between check and read, causing
# the input to be silently dropped (#17666).
try:
return path.read_text(encoding="utf-8")
except (OSError, IOError):
logger.warning("Paste file gone or unreadable, returning placeholder: %s", path)
return match.group(0)
return paste_ref_re.sub(_expand_ref, text)
@@ -4929,7 +4991,7 @@ class HermesCLI:
except Exception:
pass
def new_session(self, silent=False):
def new_session(self, silent=False, title=None):
"""Start a fresh session with a new session ID and cleared agent state."""
if self.agent and self.conversation_history:
# Trigger memory extraction on the old session before session_id rotates.
@@ -4984,6 +5046,28 @@ class HermesCLI:
self.agent._session_db_created = True
except Exception:
pass
if title and self._session_db:
from hermes_state import SessionDB
try:
sanitized = SessionDB.sanitize_title(title)
except ValueError as e:
_cprint(f" Title rejected: {e}")
sanitized = None
title = None
if sanitized:
try:
self._session_db.set_session_title(self.session_id, sanitized)
self._pending_title = None
title = sanitized
except ValueError as e:
_cprint(f" {e} — session started untitled.")
title = None
except Exception:
title = None
elif title is not None:
# sanitize_title returned empty (whitespace-only / unprintable)
_cprint(" Title is empty after cleanup — session started untitled.")
title = None
# Notify memory providers that session_id rotated to a fresh
# conversation. reset=True signals providers to flush accumulated
# per-session state (_session_turns, _turn_counter, _document_id).
@@ -5003,7 +5087,10 @@ class HermesCLI:
self._notify_session_boundary("on_session_reset")
if not silent:
print("(^_^)v New session started!")
if title:
print(f"(^_^)v New session started: {title}")
else:
print("(^_^)v New session started!")
def _handle_resume_command(self, cmd_original: str) -> None:
"""Handle /resume <session_id_or_title> — switch to a previous session mid-conversation."""
@@ -6279,7 +6366,7 @@ class HermesCLI:
_cmd_def = _resolve_cmd(_base_word)
canonical = _cmd_def.name if _cmd_def else _base_word
if canonical in ("quit", "exit", "q"):
if canonical in ("quit", "exit"):
return False
elif canonical == "help":
self.show_help()
@@ -6415,7 +6502,9 @@ class HermesCLI:
else:
_cprint(" Session database not available.")
elif canonical == "new":
self.new_session()
parts = cmd_original.split(maxsplit=1)
title = parts[1].strip() if len(parts) > 1 else None
self.new_session(title=title)
elif canonical == "resume":
self._handle_resume_command(cmd_original)
elif canonical == "model":
@@ -8217,20 +8306,38 @@ class HermesCLI:
return
self._voice_recording = True
# Load silence detection params from config
voice_cfg = {}
# Load silence detection params from config. Shape-safe: a
# hand-edited ``voice: true`` / ``voice: cmd+b`` leaves
# ``load_config()['voice']`` as a non-dict; coerce to {} so
# continuous recording falls back to the documented defaults
# instead of crashing on ``.get()``.
voice_cfg: dict = {}
try:
from hermes_cli.config import load_config
voice_cfg = load_config().get("voice", {})
_cfg = load_config().get("voice")
voice_cfg = _cfg if isinstance(_cfg, dict) else {}
except Exception:
pass
if self._voice_recorder is None:
self._voice_recorder = create_audio_recorder()
# Apply config-driven silence params
self._voice_recorder._silence_threshold = voice_cfg.get("silence_threshold", 200)
self._voice_recorder._silence_duration = voice_cfg.get("silence_duration", 3.0)
# Apply config-driven silence params (numeric-guarded so YAML
# scalar corruption doesn't break recording start-up).
#
# ``bool`` is explicitly excluded from the numeric check — in
# Python bool is a subclass of int, so a hand-edited
# ``silence_threshold: true`` would otherwise be forwarded as
# ``1`` instead of falling back to the 200 default (Copilot
# round-12 on #19835).
_threshold = voice_cfg.get("silence_threshold")
_duration = voice_cfg.get("silence_duration")
self._voice_recorder._silence_threshold = (
_threshold if isinstance(_threshold, (int, float)) and not isinstance(_threshold, bool) else 200
)
self._voice_recorder._silence_duration = (
_duration if isinstance(_duration, (int, float)) and not isinstance(_duration, bool) else 3.0
)
def _on_silence():
"""Called by AudioRecorder when silence is detected after speech."""
@@ -8256,12 +8363,13 @@ class HermesCLI:
with self._voice_lock:
self._voice_recording = False
raise
_label = self._voice_record_key_label()
if getattr(self._voice_recorder, "supports_silence_autostop", True):
_recording_hint = "auto-stops on silence | Ctrl+B to stop & exit continuous"
_recording_hint = f"auto-stops on silence | {_label} to stop & exit continuous"
elif _is_termux_environment():
_recording_hint = "Termux:API capture | Ctrl+B to stop"
_recording_hint = f"Termux:API capture | {_label} to stop"
else:
_recording_hint = "Ctrl+B to stop"
_recording_hint = f"{_label} to stop"
_cprint(f"\n{_ACCENT}● Recording...{_RST} {_DIM}({_recording_hint}){_RST}")
# Periodically refresh prompt to update audio level indicator
@@ -8376,6 +8484,17 @@ class HermesCLI:
_cprint(f"{_DIM}Voice auto-restart failed: {e}{_RST}")
threading.Thread(target=_restart_recording, daemon=True).start()
def _voice_speak_response_async(self, text: str) -> None:
"""Schedule TTS and mark it pending before continuous recording can restart."""
if not self._voice_tts or not text:
return
self._voice_tts_done.clear()
threading.Thread(
target=self._voice_speak_response,
args=(text,),
daemon=True,
).start()
def _voice_speak_response(self, text: str):
"""Speak the agent's response aloud using TTS (runs in background thread)."""
if not self._voice_tts:
@@ -8495,10 +8614,12 @@ class HermesCLI:
with self._voice_lock:
self._voice_mode = True
# Check config for auto_tts
# Check config for auto_tts (shape-safe — malformed ``voice:`` YAML
# leaves ``voice_config`` as a non-dict, so guard before .get()).
try:
from hermes_cli.config import load_config
voice_config = load_config().get("voice", {})
_raw_voice = load_config().get("voice")
voice_config = _raw_voice if isinstance(_raw_voice, dict) else {}
if voice_config.get("auto_tts", False):
with self._voice_lock:
self._voice_tts = True
@@ -8510,13 +8631,11 @@ class HermesCLI:
# _voice_message_prefix property and its usage in _process_message().
tts_status = " (TTS enabled)" if self._voice_tts else ""
try:
from hermes_cli.config import load_config
_raw_ptt = load_config().get("voice", {}).get("record_key", "ctrl+b")
_ptt_key = _raw_ptt.lower().replace("ctrl+", "c-").replace("alt+", "a-")
except Exception:
_ptt_key = "c-b"
_ptt_display = _ptt_key.replace("c-", "Ctrl+").upper()
# Use the startup-pinned cache so the advertised shortcut always
# matches the live prompt_toolkit binding — reading live config
# here would drift after a mid-session config edit (Copilot
# round-14 on #19835, same class as round-13).
_ptt_display = self._voice_record_key_label()
_cprint(f"\n{_ACCENT}Voice mode enabled{tts_status}{_RST}")
_cprint(f" {_DIM}{_ptt_display} to start/stop recording{_RST}")
_cprint(f" {_DIM}/voice tts to toggle speech output{_RST}")
@@ -8573,7 +8692,6 @@ class HermesCLI:
def _show_voice_status(self):
"""Show current voice mode status."""
from hermes_cli.config import load_config
from tools.voice_mode import check_voice_requirements
reqs = check_voice_requirements()
@@ -8582,9 +8700,11 @@ class HermesCLI:
_cprint(f" Mode: {'ON' if self._voice_mode else 'OFF'}")
_cprint(f" TTS: {'ON' if self._voice_tts else 'OFF'}")
_cprint(f" Recording: {'YES' if self._voice_recording else 'no'}")
_raw_key = load_config().get("voice", {}).get("record_key", "ctrl+b")
_display_key = _raw_key.replace("ctrl+", "Ctrl+").upper() if "ctrl+" in _raw_key.lower() else _raw_key
_cprint(f" Record key: {_display_key}")
# Display the startup-pinned label so /voice status always
# matches the live prompt_toolkit binding (Copilot round-14 on
# #19835, same class as round-13). Reading live config here
# would drift after a mid-session config edit.
_cprint(f" Record key: {self._voice_record_key_label()}")
_cprint(f"\n {_BOLD}Requirements:{_RST}")
for line in reqs["details"].split("\n"):
_cprint(f" {line}")
@@ -9536,11 +9656,7 @@ class HermesCLI:
# Speak response aloud if voice TTS is enabled
# Skip batch TTS when streaming TTS already handled it
if self._voice_tts and response and not use_streaming_tts:
threading.Thread(
target=self._voice_speak_response,
args=(response,),
daemon=True,
).start()
self._voice_speak_response_async(response)
# Re-queue the interrupt message (and any that arrived while we were
@@ -10423,7 +10539,92 @@ class HermesCLI:
else:
self._should_exit = True
event.app.exit()
# Ctrl+Shift+C: no binding needed. Terminal emulators (GNOME Terminal,
# iTerm2, kitty, Windows Terminal, etc.) intercept Ctrl+Shift+C before
# the keystroke reaches the application's stdin — prompt_toolkit never
# sees it, and prompt_toolkit's key spec parser doesn't even recognise
# 'c-S-c' anyway (the Shift modifier is meaningless on control-sequence
# keys). #19884 added a handler for this; #19895 patched the resulting
# startup crash with try/except. Both were based on a misreading of how
# terminal key events propagate. Deleting the dead handler outright.
@kb.add('c-q') # Ctrl+Q
def handle_ctrl_q(event):
"""Alternative interrupt/exit shortcut (Ctrl+Q).
Behaves like Ctrl+C: cancels active prompts, interrupts the
running agent, or clears the input buffer. Does not support
the double-press 'force exit' feature of Ctrl+C.
"""
# Cancel active voice recording.
_should_cancel_voice = False
_recorder_ref = None
with cli_ref._voice_lock:
if cli_ref._voice_recording and cli_ref._voice_recorder:
_recorder_ref = cli_ref._voice_recorder
cli_ref._voice_recording = False
cli_ref._voice_continuous = False
_should_cancel_voice = True
if _should_cancel_voice:
_cprint(f"\n{_DIM}Recording cancelled.{_RST}")
threading.Thread(
target=_recorder_ref.cancel, daemon=True
).start()
event.app.invalidate()
return
# Cancel sudo prompt
if self._sudo_state:
self._sudo_state["response_queue"].put("")
self._sudo_state = None
event.app.invalidate()
return
# Cancel secret prompt
if self._secret_state:
self._cancel_secret_capture()
event.app.current_buffer.reset()
event.app.invalidate()
return
# Cancel approval prompt (deny)
if self._approval_state:
self._approval_state["response_queue"].put("deny")
self._approval_state = None
event.app.invalidate()
return
# Cancel /model picker
if self._model_picker_state:
self._close_model_picker()
event.app.current_buffer.reset()
event.app.invalidate()
return
# Cancel clarify prompt
if self._clarify_state:
self._clarify_state["response_queue"].put(
"The user cancelled. Use your best judgement to proceed."
)
self._clarify_state = None
self._clarify_freetext = False
event.app.current_buffer.reset()
event.app.invalidate()
return
if self._agent_running and self.agent:
print("\n⚡ Interrupting agent...")
self.agent.interrupt()
else:
if event.app.current_buffer.text or self._attached_images:
event.app.current_buffer.reset()
self._attached_images.clear()
event.app.invalidate()
else:
self._should_exit = True
event.app.exit()
@kb.add('c-d')
def handle_ctrl_d(event):
"""Ctrl+D: delete char under cursor (standard readline behaviour).
@@ -10477,15 +10678,44 @@ class HermesCLI:
run_in_terminal(_suspend)
# Voice push-to-talk key: configurable via config.yaml (voice.record_key)
# Default: Ctrl+B (avoids conflict with Ctrl+R readline reverse-search)
# Config uses "ctrl+b" format; prompt_toolkit expects "c-b" format.
# Default: Ctrl+B (avoids conflict with Ctrl+R readline reverse-search).
# Config spellings (ctrl/control/alt/option/opt) are normalized to
# prompt_toolkit's c-x / a-x format via ``normalize_voice_record_key_for_prompt_toolkit``
# so the same config value binds identically in the TUI and CLI
# (Copilot round-9 review on #19835). ``super``/``win``/``windows``
# configs silently fall back to the default here since prompt_toolkit
# has no super modifier — log a warning so users notice the
# TUI/CLI split instead of a silent mismatch (round-11).
_raw_key: object = "ctrl+b"
try:
from hermes_cli.config import load_config
_raw_key = load_config().get("voice", {}).get("record_key", "ctrl+b")
_voice_key = _raw_key.lower().replace("ctrl+", "c-").replace("alt+", "a-")
from hermes_cli.voice import (
normalize_voice_record_key_for_prompt_toolkit,
voice_record_key_from_config,
)
_raw_key = voice_record_key_from_config(load_config())
_voice_key = normalize_voice_record_key_for_prompt_toolkit(_raw_key)
if (
isinstance(_raw_key, str)
and _raw_key.strip().lower().split("+", 1)[0].strip() in {"super", "win", "windows"}
and _voice_key == "c-b"
):
logger.warning(
"voice.record_key %r uses a TUI-only modifier (super/win); "
"CLI fell back to Ctrl+B. Use ctrl+<key> or alt+<key> for "
"cross-runtime parity.",
_raw_key,
)
except Exception:
_voice_key = "c-b"
# Cache the UI label here — same ``_raw_key`` that drives the
# prompt_toolkit binding below. Every status / placeholder /
# recording-hint render reads this cached value so display can
# never drift from the live keybinding even if the user edits
# voice.record_key mid-session (Copilot round-13 on #19835).
self.set_voice_record_key_cache(_raw_key)
@kb.add(_voice_key)
def handle_voice_record(event):
"""Toggle voice recording when voice mode is active.
@@ -10788,7 +11018,8 @@ class HermesCLI:
def _get_placeholder():
if cli_ref._voice_recording:
return "recording... Ctrl+B to stop, Ctrl+C to cancel"
_label = cli_ref._voice_record_key_label()
return f"recording... {_label} to stop, Ctrl+C to cancel"
if cli_ref._voice_processing:
return "transcribing..."
if cli_ref._sudo_state:
@@ -10808,7 +11039,8 @@ class HermesCLI:
if cli_ref._agent_running:
return "msg=interrupt · /queue · /bg · /steer · Ctrl+C cancel"
if cli_ref._voice_mode:
return "type or Ctrl+B to record"
_label = cli_ref._voice_record_key_label()
return f"type or {_label} to record"
return ""
input_area.control.input_processors.append(_PlaceholderProcessor(_get_placeholder))
@@ -11584,7 +11816,7 @@ class HermesCLI:
pass # Non-fatal — don't break the main loop
except Exception as e:
print(f"Error: {e}")
logger.warning("process_loop unhandled error (msg may be lost): %s", e)
# Start processing thread
process_thread = threading.Thread(target=process_loop, daemon=True)
+56 -8
View File
@@ -420,7 +420,7 @@ def _normalize_workdir(workdir: Optional[str]) -> Optional[str]:
def create_job(
prompt: str,
prompt: Optional[str],
schedule: str,
name: Optional[str] = None,
repeat: Optional[int] = None,
@@ -435,12 +435,14 @@ def create_job(
context_from: Optional[Union[str, List[str]]] = None,
enabled_toolsets: Optional[List[str]] = None,
workdir: Optional[str] = None,
no_agent: bool = False,
) -> Dict[str, Any]:
"""
Create a new cron job.
Args:
prompt: The prompt to run (must be self-contained, or a task instruction when skill is set)
prompt: The prompt to run (must be self-contained, or a task instruction when skill is set).
Ignored when ``no_agent=True`` except as an optional name hint.
schedule: Schedule string (see parse_schedule)
name: Optional friendly name
repeat: How many times to run (None = forever, 1 = once)
@@ -451,21 +453,33 @@ def create_job(
model: Optional per-job model override
provider: Optional per-job provider override
base_url: Optional per-job base URL override
script: Optional path to a Python script whose stdout is injected into the
prompt each run. The script runs before the agent turn, and its output
is prepended as context. Useful for data collection / change detection.
script: Optional path to a script whose stdout feeds the job. With
``no_agent=True`` the script IS the job its stdout is
delivered verbatim. Without ``no_agent``, its stdout is
injected into the agent's prompt as context (data-collection /
change-detection pattern). Paths resolve under
~/.hermes/scripts/; ``.sh`` / ``.bash`` files run via bash,
anything else via Python.
context_from: Optional job ID (or list of job IDs) whose most recent output
is injected into the prompt as context before each run.
Useful for chaining cron jobs: job A finds data, job B processes it.
enabled_toolsets: Optional list of toolset names to restrict the agent to.
When set, only tools from these toolsets are loaded, reducing
token overhead. When omitted, all default tools are loaded.
Ignored when ``no_agent=True``.
workdir: Optional absolute path. When set, the job runs as if launched
from that directory: AGENTS.md / CLAUDE.md / .cursorrules from
that directory are injected into the system prompt, and the
terminal/file/code_exec tools use it as their working directory
(via TERMINAL_CWD). When unset, the old behaviour is preserved
(no context files injected, tools use the scheduler's cwd).
With ``no_agent=True``, ``workdir`` is still applied as the
script's cwd so relative paths inside the script behave
predictably.
no_agent: When True, skip the agent entirely run ``script`` on schedule
and deliver its stdout directly. Empty stdout = silent (no
delivery). Requires ``script`` to be set. Ideal for classic
watchdogs and periodic alerts that don't need LLM reasoning.
Returns:
The created job dict
@@ -499,6 +513,16 @@ def create_job(
normalized_toolsets = [str(t).strip() for t in enabled_toolsets if str(t).strip()] if enabled_toolsets else None
normalized_toolsets = normalized_toolsets or None
normalized_workdir = _normalize_workdir(workdir)
normalized_no_agent = bool(no_agent)
# no_agent jobs are meaningless without a script — the script IS the job.
# Surface this as a clear ValueError at create time so bad configs never
# reach the scheduler.
if normalized_no_agent and not normalized_script:
raise ValueError(
"no_agent=True requires a script — with no agent and no script "
"there is nothing for the job to run."
)
# Normalize context_from: accept str or list of str, store as list or None
if isinstance(context_from, str):
@@ -508,7 +532,7 @@ def create_job(
else:
context_from = None
label_source = (prompt or (normalized_skills[0] if normalized_skills else None)) or "cron job"
label_source = (prompt or (normalized_skills[0] if normalized_skills else None) or (normalized_script if normalized_no_agent else None)) or "cron job"
job = {
"id": job_id,
"name": name or label_source[:50].strip(),
@@ -519,6 +543,7 @@ def create_job(
"provider": normalized_provider,
"base_url": normalized_base_url,
"script": normalized_script,
"no_agent": normalized_no_agent,
"context_from": context_from,
"schedule": parsed_schedule,
"schedule_display": parsed_schedule.get("display", schedule),
@@ -785,6 +810,12 @@ def get_due_jobs() -> List[Dict[str, Any]]:
the job is fast-forwarded to the next future run instead of firing
immediately. This prevents a burst of missed jobs on gateway restart.
"""
with _jobs_file_lock:
return _get_due_jobs_locked()
def _get_due_jobs_locked() -> List[Dict[str, Any]]:
"""Inner implementation of get_due_jobs(); must be called with _jobs_file_lock held."""
now = _hermes_now()
raw_jobs = load_jobs()
jobs = [_apply_skill_fields(j) for j in copy.deepcopy(raw_jobs)]
@@ -797,19 +828,36 @@ def get_due_jobs() -> List[Dict[str, Any]]:
next_run = job.get("next_run_at")
if not next_run:
schedule = job.get("schedule", {})
kind = schedule.get("kind")
# One-shot jobs use a small grace window via the dedicated helper.
recovered_next = _recoverable_oneshot_run_at(
job.get("schedule", {}),
schedule,
now,
last_run_at=job.get("last_run_at"),
)
recovery_kind = "one-shot" if recovered_next else None
# Recurring jobs reach here only when something — typically a
# direct jobs.json edit that bypassed add_job() — left
# next_run_at unset. Without this branch, such jobs are
# silently skipped forever; recompute next_run_at from the
# schedule so they pick up at their next scheduled tick.
if not recovered_next and kind in ("cron", "interval"):
recovered_next = compute_next_run(schedule, now.isoformat())
if recovered_next:
recovery_kind = kind
if not recovered_next:
continue
job["next_run_at"] = recovered_next
next_run = recovered_next
logger.info(
"Job '%s' had no next_run_at; recovering one-shot run at %s",
"Job '%s' had no next_run_at; recovering %s run at %s",
job.get("name", job["id"]),
recovery_kind,
recovered_next,
)
for rj in raw_jobs:
+183 -17
View File
@@ -35,7 +35,7 @@ from typing import List, Optional
sys.path.insert(0, str(Path(__file__).parent.parent))
from hermes_constants import get_hermes_home
from hermes_cli.config import load_config
from hermes_cli.config import load_config, _expand_env_vars
from hermes_time import now as _hermes_now
logger = logging.getLogger(__name__)
@@ -123,9 +123,19 @@ _LOCK_FILE = _LOCK_DIR / ".tick.lock"
def _resolve_origin(job: dict) -> Optional[dict]:
"""Extract origin info from a job, preserving any extra routing metadata."""
"""Extract origin info from a job, preserving any extra routing metadata.
Treats non-dict origins (free-form provenance strings, ints, lists from
migration scripts or hand-edited jobs.json) as missing instead of
crashing with ``AttributeError`` on ``origin.get(...)``. Without this
guard, a job tagged with e.g. ``"combined-digest-replaces-x-and-y"``
crashed every fire attempt with
``'str' object has no attribute 'get'`` ``mark_job_run`` recorded the
failure, but the next tick re-loaded the same poisoned origin and
crashed identically until the field was patched manually (#18722).
"""
origin = job.get("origin")
if not origin:
if not isinstance(origin, dict):
return None
platform = origin.get("platform")
chat_id = origin.get("chat_id")
@@ -147,6 +157,19 @@ def _get_home_target_chat_id(platform_name: str) -> str:
return value
def _get_home_target_thread_id(platform_name: str) -> Optional[str]:
"""Return the optional thread/topic ID for a platform home target."""
env_var = _HOME_TARGET_ENV_VARS.get(platform_name.lower())
if not env_var:
return None
value = os.getenv(f"{env_var}_THREAD_ID", "").strip()
if not value:
legacy = _LEGACY_HOME_TARGET_ENV_VARS.get(env_var)
if legacy:
value = os.getenv(f"{legacy}_THREAD_ID", "").strip()
return value or None
def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[dict]:
"""Resolve one concrete auto-delivery target for a cron job."""
@@ -175,7 +198,7 @@ def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[d
return {
"platform": platform_name,
"chat_id": chat_id,
"thread_id": None,
"thread_id": _get_home_target_thread_id(platform_name),
}
return None
@@ -229,7 +252,7 @@ def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[d
return {
"platform": platform_name,
"chat_id": chat_id,
"thread_id": None,
"thread_id": _get_home_target_thread_id(platform_name),
}
@@ -394,7 +417,7 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
thread_id = target.get("thread_id")
# Diagnostic: log thread_id for topic-aware delivery debugging
origin = job.get("origin") or {}
origin = _resolve_origin(job) or {}
origin_thread = origin.get("thread_id")
if origin_thread and not thread_id:
logger.warning(
@@ -553,8 +576,18 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
prevent arbitrary script execution via path traversal or absolute
path injection.
Supported interpreters (chosen by file extension):
* ``.sh`` / ``.bash`` run with ``/bin/bash``
* anything else run with the current Python interpreter
(``sys.executable``), preserving the original behaviour for
Python-based pre-check and data-collection scripts.
Shell support lets ``no_agent=True`` jobs ship classic bash watchdogs
(the `memory-watchdog.sh` pattern) without wrapping them in Python.
Args:
script_path: Path to a Python script. Relative paths are resolved
script_path: Path to the script. Relative paths are resolved
against HERMES_HOME/scripts/. Absolute and ~-prefixed paths
are also validated to ensure they stay within the scripts dir.
@@ -591,9 +624,19 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
script_timeout = _get_script_timeout()
# Pick an interpreter by extension. Bash for .sh/.bash, Python for
# everything else. We deliberately do NOT honour the file's own
# shebang: the scripts dir is trusted, but keeping the interpreter
# choice explicit here keeps the allowed surface small and auditable.
suffix = path.suffix.lower()
if suffix in (".sh", ".bash"):
argv = ["/bin/bash", str(path)]
else:
argv = [sys.executable, str(path)]
try:
result = subprocess.run(
[sys.executable, str(path)],
argv,
capture_output=True,
text=True,
timeout=script_timeout,
@@ -683,10 +726,8 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
f"{prompt}"
)
else:
prompt = (
"[Script ran successfully but produced no output.]\n\n"
f"{prompt}"
)
# Script produced no output — nothing to report, skip AI call.
return None
else:
prompt = (
"## Script Error\n"
@@ -759,6 +800,7 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
return prompt
from tools.skills_tool import skill_view
from tools.skill_usage import bump_use
parts = []
skipped: list[str] = []
@@ -770,6 +812,12 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
skipped.append(skill_name)
continue
# Bump usage so the curator sees this skill as actively used.
try:
bump_use(skill_name)
except Exception:
logger.debug("Cron job: failed to bump skill usage for '%s'", skill_name, exc_info=True)
content = str(loaded.get("content") or "").strip()
if parts:
parts.append("")
@@ -802,8 +850,120 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
Returns:
Tuple of (success, full_output_doc, final_response, error_message)
"""
job_id = job["id"]
job_name = job["name"]
# ---------------------------------------------------------------
# no_agent short-circuit — the script IS the job, no LLM involvement.
# ---------------------------------------------------------------
# This mirrors the classic "run a bash script on a timer, send its
# stdout to telegram" watchdog pattern. The agent path is skipped
# entirely: no AIAgent, no prompt, no tool loop, no token spend.
#
# We check this BEFORE importing run_agent / constructing SessionDB so
# a pure-script tick never pays for the agent machinery it isn't going
# to use. Keep this block self-contained.
#
# Semantics:
# - script stdout (trimmed) → delivered verbatim as the final message
# - empty stdout → silent run (no delivery, success=True)
# - non-zero exit / timeout → delivered as an error alert, success=False
# - wakeAgent=false gate → treated like empty stdout (silent), since
# the whole point of no_agent is that there
# is no agent to wake
if job.get("no_agent"):
script_path = job.get("script")
if not script_path:
err = "no_agent=True but no script is set for this job"
logger.error("Job '%s': %s", job_id, err)
return False, "", "", err
# Apply workdir if configured — lets scripts use predictable relative
# paths. For no_agent jobs this is just the subprocess cwd (not an
# agent TERMINAL_CWD bridge).
_job_workdir = (job.get("workdir") or "").strip() or None
_prior_cwd = None
if _job_workdir and Path(_job_workdir).is_dir():
_prior_cwd = os.getcwd()
try:
os.chdir(_job_workdir)
except OSError:
_prior_cwd = None
try:
ok, output = _run_job_script(script_path)
finally:
if _prior_cwd is not None:
try:
os.chdir(_prior_cwd)
except OSError:
pass
now_iso = _hermes_now().strftime("%Y-%m-%d %H:%M:%S")
if not ok:
# Script crashed / timed out / exited non-zero. Deliver the
# error so the user knows the watchdog itself broke — silent
# failure for an alerting job is the worst-case outcome.
alert = (
f"⚠ Cron watchdog '{job_name}' script failed\n\n"
f"{output}\n\n"
f"Time: {now_iso}"
)
doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n"
f"**Status:** script failed\n\n"
f"{output}\n"
)
return False, doc, alert, output
# Honour the wakeAgent gate as a silent signal — `wakeAgent: false`
# means "nothing to report this tick", same as empty stdout.
if not _parse_wake_gate(output):
logger.info(
"Job '%s' (no_agent): wakeAgent=false gate — silent run", job_id
)
silent_doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n"
f"**Status:** silent (wakeAgent=false)\n"
)
return True, silent_doc, SILENT_MARKER, None
if not output.strip():
logger.info("Job '%s' (no_agent): empty stdout — silent run", job_id)
silent_doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n"
f"**Status:** silent (empty output)\n"
)
return True, silent_doc, SILENT_MARKER, None
doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n\n"
f"---\n\n"
f"{output}\n"
)
return True, doc, output, None
# ---------------------------------------------------------------
# Default (LLM) path — import and construct the agent machinery now
# that we know we actually need it. Doing these imports here instead of
# at module top keeps no_agent ticks from paying for AIAgent / SessionDB
# construction costs.
# ---------------------------------------------------------------
from run_agent import AIAgent
# Initialize SQLite session store so cron job messages are persisted
# and discoverable via session_search (same pattern as gateway/run.py).
_session_db = None
@@ -812,9 +972,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
_session_db = SessionDB()
except Exception as e:
logger.debug("Job '%s': SQLite session store not available: %s", job.get("id", "?"), e)
job_id = job["id"]
job_name = job["name"]
# Wake-gate: if this job has a pre-check script, run it BEFORE building
# the prompt so a ``{"wakeAgent": false}`` response can short-circuit
@@ -839,6 +996,9 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
return True, silent_doc, SILENT_MARKER, None
prompt = _build_job_prompt(job, prerun_script=prerun_script)
if prompt is None:
logger.info("Job '%s': script produced no output, skipping AI call.", job_name)
return True, "", SILENT_MARKER, None
origin = _resolve_origin(job)
_cron_session_id = f"cron_{job_id}_{_hermes_now().strftime('%Y%m%d_%H%M%S')}"
@@ -922,6 +1082,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
if os.path.exists(_cfg_path):
with open(_cfg_path) as _f:
_cfg = yaml.safe_load(_f) or {}
_cfg = _expand_env_vars(_cfg)
_model_cfg = _cfg.get("model", {})
if not job.get("model"):
if isinstance(_model_cfg, str):
@@ -974,8 +1135,13 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
)
from hermes_cli.auth import AuthError
try:
# Do not inject HERMES_INFERENCE_PROVIDER here. resolve_runtime_provider()
# already prefers persisted config over stale shell/env overrides when
# no explicit provider is requested. Passing the env var here short-
# circuits that precedence and can resurrect old providers (for
# example DeepSeek) for cron jobs that do not pin provider/model.
runtime_kwargs = {
"requested": job.get("provider") or os.getenv("HERMES_INFERENCE_PROVIDER"),
"requested": job.get("provider"),
}
if job.get("base_url"):
runtime_kwargs["explicit_base_url"] = job.get("base_url")
+35
View File
@@ -86,6 +86,41 @@ if [ -d "$INSTALL_DIR/skills" ]; then
python3 "$INSTALL_DIR/tools/skills_sync.py"
fi
# Optionally start `hermes dashboard` as a side-process.
#
# Toggled by HERMES_DASHBOARD=1 (also accepts "true"/"yes", case-insensitive).
# Host/port/TUI can be overridden via:
# HERMES_DASHBOARD_HOST (default 0.0.0.0 — exposed outside the container)
# HERMES_DASHBOARD_PORT (default 9119, matches `hermes dashboard` default)
# HERMES_DASHBOARD_TUI (already honored by `hermes dashboard` itself)
#
# The dashboard is a long-lived server. We background it *before* the final
# `exec hermes "$@"` so the user's chosen foreground command (chat, gateway,
# sleep infinity, …) remains PID-of-interest for the container runtime. When
# the container stops the whole process tree is torn down, so no explicit
# cleanup is needed.
case "${HERMES_DASHBOARD:-}" in
1|true|TRUE|True|yes|YES|Yes)
dash_host="${HERMES_DASHBOARD_HOST:-0.0.0.0}"
dash_port="${HERMES_DASHBOARD_PORT:-9119}"
dash_args=(--host "$dash_host" --port "$dash_port" --no-open)
# Binding to anything other than localhost requires --insecure — the
# dashboard refuses otherwise because it exposes API keys. Inside a
# container this is the expected deployment (host reaches it via
# published port), so opt in automatically.
if [ "$dash_host" != "127.0.0.1" ] && [ "$dash_host" != "localhost" ]; then
dash_args+=(--insecure)
fi
echo "Starting hermes dashboard on ${dash_host}:${dash_port} (background)"
# Prefix dashboard output so it's distinguishable from the main
# process in `docker logs`. stdbuf keeps the pipe line-buffered.
(
stdbuf -oL -eL hermes dashboard "${dash_args[@]}" 2>&1 \
| sed -u 's/^/[dashboard] /'
) &
;;
esac
# Final exec: two supported invocation patterns.
#
# docker run <image> -> exec `hermes` with no args (legacy default)
@@ -0,0 +1,473 @@
# Telegram DM User-Managed Multi-Session Topics Implementation Plan
> **For Hermes:** Use test-driven-development for implementation. Use subagent-driven-development only after this plan is split into small reviewed tasks.
**Goal:** Add an opt-in Telegram DM multi-session mode where Telegram user-created private-chat topics become independent Hermes session lanes, while the root DM becomes a system lobby.
**Architecture:** Rely on Telegram's native private-chat topic UI. Users create new topics with the `+` button; Hermes maps each `message_thread_id` to a separate session lane. Hermes does not create topics for normal `/new` flow and does not try to manage topic lifecycle beyond activation/status, root-lobby behavior, and restoring legacy sessions into a user-created topic.
**Tech Stack:** Hermes gateway, Telegram Bot API 9.4+, python-telegram-bot adapter, SQLite SessionDB / side tables, pytest.
---
## 1. Product decisions
### Accepted
- PR-quality implementation: migrations, tests, docs, backwards compatibility.
- Use SQLite persistence, not JSON sidecars.
- Live status suffixes in topic titles are out of MVP.
- Topic title sync/editing is out of MVP except future-compatible storage if cheap.
- User creates Telegram topics manually through the Telegram bot interface.
- `/new` does **not** create Telegram topics.
- Root/main DM becomes a system lobby after activation.
- Existing Telegram behavior remains unchanged until the feature is activated/enabled.
- Migration of old sessions is supported through `/topic` listing and `/topic <session_id>` restore inside a user-created topic.
### Telegram API assumptions verified from Bot API docs
- `getMe` returns bot `User` fields:
- `has_topics_enabled`: forum/topic mode enabled in private chats.
- `allows_users_to_create_topics`: users may create/delete topics in private chats.
- `createForumTopic` works for private chats with a user, but MVP does not rely on it for normal flow.
- `Message.message_thread_id` identifies a topic in private chats.
- `sendMessage` supports `message_thread_id` for private-chat topics.
- `pinChatMessage` is allowed in private chats.
---
## 2. Target UX
### 2.1 Activation from root/main DM
User sends:
```text
/topic
```
Hermes:
1. calls Telegram `getMe`;
2. verifies `has_topics_enabled` and `allows_users_to_create_topics`;
3. enables multi-session topic mode for this Telegram DM user/chat;
4. sends an onboarding message;
5. pins the onboarding message if configured;
6. shows old/unlinked sessions that can be restored into topics.
Suggested onboarding text:
```text
Multi-session mode is enabled.
Create new Hermes chats with the + button in this bot interface. Each Telegram topic is an independent Hermes session, so you can work on different tasks in parallel.
This main chat is reserved for system commands, status, and session management.
To restore an old session:
1. Use /topic here to see unlinked sessions.
2. Create a new topic with the + button.
3. Send /topic <session_id> inside that topic.
```
### 2.2 Root/main DM after activation
Root DM is a system lobby.
Allowed/system commands include at least:
- `/topic`
- `/status`
- `/sessions` if available
- `/usage`
- `/help`
- `/platforms`
Normal user prompts in root DM do not enter the agent loop. Reply:
```text
This main chat is reserved for system commands.
To chat with Hermes, create a new topic using the + button in this bot interface. Each topic works as an independent Hermes session.
```
`/new` in root DM does not create a session/topic. Reply:
```text
To start a new parallel Hermes chat, create a new topic with the + button in this bot interface.
Each topic is an independent Hermes session. Use /new inside a topic only if you want to replace that topic's current session.
```
### 2.3 First message in a user-created topic
When a user creates a Telegram topic and sends the first message there:
1. Hermes receives a Telegram DM message with `message_thread_id`.
2. Hermes derives the existing thread-aware `session_key` from `(platform=telegram, chat_type=dm, chat_id, thread_id)`.
3. If no binding exists, Hermes creates a fresh Hermes session for this topic lane and persists the binding.
4. The message runs through the normal agent loop for that lane.
### 2.4 `/new` inside a non-main topic
`/new` remains supported but replaces the session attached to the current topic lane.
Hermes should warn:
```text
Started a new Hermes session in this topic.
Tip: for parallel work, create a new topic with the + button instead of using /new here. /new replaces the session attached to the current topic.
```
### 2.5 `/topic` in root/main DM after activation
Shows:
- mode enabled/disabled;
- last capability check result;
- whether intro message is pinned if known;
- count of known topic bindings;
- list of old/unlinked sessions.
Example:
```text
Telegram multi-session topics are enabled.
Create new Hermes chats with the + button in this bot interface.
Unlinked previous sessions:
1. 2026-05-01 Research notes — id: abc123
2. 2026-04-30 Deploy debugging — id: def456
3. Untitled session — id: ghi789
To restore one:
1. Create a new topic with the + button.
2. Open that topic.
3. Send /topic <id>
```
### 2.6 `/topic` inside a non-main topic
Without args, show the current topic binding:
```text
This topic is linked to:
Session: Research notes
ID: abc123
Use /new to replace this topic with a fresh session.
For parallel work, create another topic with the + button.
```
### 2.7 `/topic <session_id>` inside a non-main topic
Restore an old/unlinked session into the current user-created topic.
Behavior:
1. reject if not in Telegram DM topic;
2. verify session belongs to the same Telegram user/chat or is a safe legacy root DM session for this user;
3. reject if session is already linked to another active topic in MVP;
4. `SessionStore.switch_session(current_topic_session_key, target_session_id)`;
5. upsert binding with `managed_mode = restored`;
6. send two messages into the topic:
- session restored confirmation;
- last Hermes assistant message if available.
Example:
```text
Session restored: Research notes
Last Hermes message:
...
```
---
## 3. Persistence model
Use SQLite, but topic-mode schema changes are **explicit opt-in migrations**, not automatic startup reconciliation.
Important rollback-safety rule:
- upgrading Hermes and starting the gateway must not create Telegram topic-mode tables or columns;
- old/default Telegram behavior must keep working on the existing `state.db`;
- the first `/topic` activation path calls an idempotent explicit migration, then enables topic mode for that chat;
- if activation fails before the migration is needed, the database remains in the pre-topic-mode shape.
### 3.1 No eager `sessions` table mutation for MVP
Do **not** add `chat_id`, `chat_type`, `thread_id`, or `session_key` columns to `sessions` as part of ordinary `SessionDB()` startup. The existing declarative `_reconcile_columns()` mechanism would add them eagerly on every process start, which violates the managed-migration requirement.
For MVP, keep origin/session-lane data in topic-specific side tables created only by the explicit `/topic` migration. Legacy unlinked sessions can be discovered conservatively from existing data (`source = telegram`, `user_id = current Telegram user`) plus absence from topic bindings.
If future PRs need richer origin metadata for all gateway sessions, introduce it behind a separate explicit migration/command or a compatibility-reviewed schema bump.
### 3.2 Explicit `/topic` migration API
Add an idempotent method such as:
```python
def apply_telegram_topic_migration(self) -> None: ...
```
It creates only topic-mode side tables/indexes and records:
```text
state_meta.telegram_dm_topic_schema_version = 1
```
This method is called from `/topic` activation/status paths before reading or writing topic-mode state. It is not called from generic `SessionDB.__init__`, gateway startup, CLI startup, or auto-maintenance.
### 3.3 `telegram_dm_topic_mode`
Stores per-user/chat activation state. Created only by `apply_telegram_topic_migration()`.
Suggested fields:
- `chat_id` primary key
- `user_id`
- `enabled`
- `activated_at`
- `updated_at`
- `has_topics_enabled`
- `allows_users_to_create_topics`
- `capability_checked_at`
- `intro_message_id`
- `pinned_message_id`
### 3.4 `telegram_dm_topic_bindings`
Stores Telegram topic/thread to Hermes session binding. Created only by `apply_telegram_topic_migration()`.
Suggested fields:
- `chat_id`
- `thread_id`
- `user_id`
- `session_key`
- `session_id`
- `managed_mode`
- `auto`
- `restored`
- `new_replaced`
- `linked_at`
- `updated_at`
Recommended constraints:
- primary key `(chat_id, thread_id)`;
- unique index on `session_id` for MVP to prevent one session linked to multiple topics;
- index `(user_id, chat_id)` for status/listing.
### 3.5 Unlinked session semantics
For MVP, a session is unlinked if:
- `source = telegram`;
- `user_id = current Telegram user`;
- no row in `telegram_dm_topic_bindings` has `session_id = session_id`.
This is intentionally conservative until a future explicit migration adds richer cross-platform origin metadata.
Never dedupe by title.
---
## 4. Config
Suggested config block:
```yaml
platforms:
telegram:
extra:
multisession_topics:
enabled: false
mode: user_managed_topics
root_chat_behavior: system_lobby
pin_intro_message: true
```
Notes:
- `enabled: false` means existing Telegram behavior is unchanged.
- Activation via `/topic` may create per-chat enabled state only if global config permits it.
- `root_chat_behavior: system_lobby` is the MVP behavior for activated chats.
---
## 5. Command behavior summary
### `/topic` root/main DM
- If not activated: capability check, activate, send/pin onboarding, list unlinked sessions.
- If activated: show status and unlinked sessions.
### `/topic` non-main topic
- Show current binding.
### `/topic <session_id>` root/main DM
Reject with instructions:
```text
Create a new topic with the + button, open it, then send /topic <session_id> there to restore this session.
```
### `/topic <session_id>` non-main topic
Restore that session into this topic if ownership/linking checks pass.
### `/new` root/main DM when activated
Reply with instructions to use the `+` button. Do not enter agent loop.
### `/new` non-main topic
Create a new session in the current topic lane, persist/update binding, warn that `+` is preferred for parallel work.
### Normal text root/main DM when activated
Reply with system-lobby instruction. Do not enter agent loop.
### Normal text non-main topic
Normal Hermes agent flow for that topic's session lane.
---
## 6. PR breakdown
### PR 1 — Explicit topic-mode schema migration
**Goal:** Add rollback-safe SQLite support for Telegram topic mode without mutating `state.db` on ordinary upgrade/startup.
**Files likely touched:**
- `hermes_state.py`
- tests under `tests/`
**Tests first:**
1. opening an old/current DB with `SessionDB()` does not create topic-mode tables or `sessions` origin columns;
2. calling `apply_telegram_topic_migration()` creates `telegram_dm_topic_mode` and `telegram_dm_topic_bindings` idempotently;
3. migration records `state_meta.telegram_dm_topic_schema_version = 1`.
### PR 2 — Topic mode activation and binding APIs
**Goal:** Add SQLite persistence for activation and topic bindings.
**Tests first:**
1. enable/check mode row round-trips;
2. binding upsert and lookup by `(chat_id, user_id, thread_id)`;
3. linked sessions are excluded from unlinked list.
### PR 3 — `/topic` activation/status command
**Goal:** Implement root activation/status/listing behavior.
**Tests first:**
1. `/topic` in root checks `getMe` capabilities and records activation;
2. capability failure returns readable instructions;
3. activated root `/topic` lists unlinked sessions.
### PR 4 — System lobby behavior
**Goal:** Prevent root chat from entering agent loop after activation.
**Tests first:**
1. normal text in activated root returns lobby instruction;
2. `/new` in activated root returns `+` button instruction;
3. non-activated root behavior is unchanged.
### PR 5 — Auto-bind user-created topics
**Goal:** First message in non-main topic creates/uses an independent session lane.
**Tests first:**
1. new topic message creates binding with `auto_created`;
2. repeated topic message reuses same binding/lane;
3. two topics in same DM do not share sessions.
### PR 6 — Restore legacy sessions into a topic
**Goal:** Implement `/topic <session_id>` in non-main topics.
**Tests first:**
1. root `/topic <id>` rejects with instructions;
2. topic `/topic <id>` switches current topic lane to target session;
3. restore rejects sessions from other users/chats;
4. restore rejects already-linked sessions;
5. restore emits confirmation and last Hermes assistant message.
### PR 7 — `/new` inside topic updates binding
**Goal:** Keep existing `/new` semantics but persist topic binding replacement.
**Tests first:**
1. `/new` in topic creates a new session for same topic lane;
2. binding updates to `managed_mode = new_replaced`;
3. response includes guidance to use `+` for parallel work.
### PR 8 — Docs and polish
**Goal:** Document the feature and Telegram setup.
**Files likely touched:**
- `website/docs/user-guide/messaging/telegram.md`
- maybe `website/docs/user-guide/sessions.md`
Docs must explain:
- BotFather/Telegram settings for topic mode and user-created topics;
- `/topic` activation;
- root system lobby;
- using `+` for new parallel chats;
- restoring old sessions with `/topic <id>` inside a topic;
- limitations.
---
## 7. Testing / quality gates
Run targeted tests after each TDD cycle, then broader tests before completion.
Suggested commands after inspection confirms test paths:
```bash
python -m pytest tests/test_hermes_state.py -q
python -m pytest tests/gateway/ -q
python -m pytest tests/ -o 'addopts=' -q
```
Do not ship without verifying disabled-feature backwards compatibility.
---
## 8. Definition of done for MVP
- `/topic` activates/checks Telegram DM multi-session mode.
- Root DM becomes a system lobby after activation.
- Onboarding message tells users to create new chats with the Telegram `+` button.
- Onboarding message can be pinned in private chat.
- User-created topics automatically become independent Hermes session lanes.
- `/new` in root gives instructions, not a new agent run.
- `/new` in a topic creates a new session in that topic and warns that `+` is preferred for parallel work.
- `/topic` in root lists unlinked old sessions.
- `/topic <session_id>` inside a topic restores that session and sends confirmation + last Hermes assistant message.
- Ownership checks prevent restoring other users' sessions.
- Already-linked sessions are not restored into a second topic in MVP.
- Existing Telegram behavior is unchanged when the feature is disabled.
- Tests and docs are included.
Binary file not shown.

After

Width:  |  Height:  |  Size: 115 KiB

+45 -4
View File
@@ -186,18 +186,24 @@ class HomeChannel:
Default destination for a platform.
When a cron job specifies deliver="telegram" without a specific chat ID,
messages are sent to this home channel.
messages are sent to this home channel. Thread-aware platforms may also
store a thread/topic ID so the bare platform target routes to the exact
conversation where /sethome was run.
"""
platform: Platform
chat_id: str
name: str # Human-readable name for display
thread_id: Optional[str] = None
def to_dict(self) -> Dict[str, Any]:
return {
result = {
"platform": self.platform.value,
"chat_id": self.chat_id,
"name": self.name,
}
if self.thread_id:
result["thread_id"] = self.thread_id
return result
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "HomeChannel":
@@ -205,6 +211,7 @@ class HomeChannel:
platform=Platform(data["platform"]),
chat_id=str(data["chat_id"]),
name=data.get("name", "Home"),
thread_id=str(data["thread_id"]) if data.get("thread_id") else None,
)
@@ -839,11 +846,25 @@ def load_gateway_config() -> GatewayConfig:
if yaml_key in allow_mentions_cfg and not os.getenv(env_key):
os.environ[env_key] = str(allow_mentions_cfg[yaml_key]).lower()
# Bridge top-level require_mention to Telegram when the telegram: section
# does not already provide one. Users often write "require_mention: true"
# at the top level alongside group_sessions_per_user, expecting it to work
# the same way (#3979).
_tl_require_mention = yaml_cfg.get("require_mention")
if _tl_require_mention is not None:
_tg_section = yaml_cfg.get("telegram") or {}
if "require_mention" not in _tg_section:
_tg_plat = platforms_data.setdefault(Platform.TELEGRAM.value, {})
_tg_extra = _tg_plat.setdefault("extra", {})
_tg_extra.setdefault("require_mention", _tl_require_mention)
# Telegram settings → env vars (env vars take precedence)
telegram_cfg = yaml_cfg.get("telegram", {})
if isinstance(telegram_cfg, dict):
if "require_mention" in telegram_cfg and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
os.environ["TELEGRAM_REQUIRE_MENTION"] = str(telegram_cfg["require_mention"]).lower()
# Prefer telegram.require_mention; fall back to the top-level shorthand.
_effective_rm = telegram_cfg.get("require_mention", yaml_cfg.get("require_mention"))
if _effective_rm is not None and not os.getenv("TELEGRAM_REQUIRE_MENTION"):
os.environ["TELEGRAM_REQUIRE_MENTION"] = str(_effective_rm).lower()
if "mention_patterns" in telegram_cfg and not os.getenv("TELEGRAM_MENTION_PATTERNS"):
os.environ["TELEGRAM_MENTION_PATTERNS"] = json.dumps(telegram_cfg["mention_patterns"])
frc = telegram_cfg.get("free_response_chats")
@@ -1071,6 +1092,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.TELEGRAM,
chat_id=telegram_home,
name=os.getenv("TELEGRAM_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("TELEGRAM_HOME_CHANNEL_THREAD_ID") or None,
)
# Discord
@@ -1087,6 +1109,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.DISCORD,
chat_id=discord_home,
name=os.getenv("DISCORD_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("DISCORD_HOME_CHANNEL_THREAD_ID") or None,
)
# Reply threading mode for Discord (off/first/all)
@@ -1108,6 +1131,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.WHATSAPP,
chat_id=whatsapp_home,
name=os.getenv("WHATSAPP_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("WHATSAPP_HOME_CHANNEL_THREAD_ID") or None,
)
# Slack
@@ -1135,6 +1159,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.SLACK,
chat_id=slack_home,
name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
thread_id=os.getenv("SLACK_HOME_CHANNEL_THREAD_ID") or None,
)
# Signal
@@ -1155,6 +1180,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.SIGNAL,
chat_id=signal_home,
name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("SIGNAL_HOME_CHANNEL_THREAD_ID") or None,
)
# Mattermost
@@ -1174,6 +1200,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.MATTERMOST,
chat_id=mattermost_home,
name=os.getenv("MATTERMOST_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("MATTERMOST_HOME_CHANNEL_THREAD_ID") or None,
)
# Matrix
@@ -1205,6 +1232,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.MATRIX,
chat_id=matrix_home,
name=os.getenv("MATRIX_HOME_ROOM_NAME", "Home"),
thread_id=os.getenv("MATRIX_HOME_ROOM_THREAD_ID") or None,
)
# Home Assistant
@@ -1238,6 +1266,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.EMAIL,
chat_id=email_home,
name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
thread_id=os.getenv("EMAIL_HOME_ADDRESS_THREAD_ID") or None,
)
# SMS (Twilio)
@@ -1253,6 +1282,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.SMS,
chat_id=sms_home,
name=os.getenv("SMS_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("SMS_HOME_CHANNEL_THREAD_ID") or None,
)
# API Server
@@ -1315,6 +1345,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.DINGTALK,
chat_id=dingtalk_home,
name=os.getenv("DINGTALK_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("DINGTALK_HOME_CHANNEL_THREAD_ID") or None,
)
# Feishu / Lark
@@ -1342,6 +1373,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.FEISHU,
chat_id=feishu_home,
name=os.getenv("FEISHU_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("FEISHU_HOME_CHANNEL_THREAD_ID") or None,
)
# WeCom (Enterprise WeChat)
@@ -1364,6 +1396,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.WECOM,
chat_id=wecom_home,
name=os.getenv("WECOM_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("WECOM_HOME_CHANNEL_THREAD_ID") or None,
)
# WeCom callback mode (self-built apps)
@@ -1422,6 +1455,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.WEIXIN,
chat_id=weixin_home,
name=os.getenv("WEIXIN_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("WEIXIN_HOME_CHANNEL_THREAD_ID") or None,
)
# BlueBubbles (iMessage)
@@ -1445,6 +1479,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.BLUEBUBBLES,
chat_id=bluebubbles_home,
name=os.getenv("BLUEBUBBLES_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("BLUEBUBBLES_HOME_CHANNEL_THREAD_ID") or None,
)
# QQ (Official Bot API v2)
@@ -1482,6 +1517,11 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.QQBOT,
chat_id=qq_home,
name=os.getenv("QQBOT_HOME_CHANNEL_NAME") or os.getenv(qq_home_name_env, "Home"),
thread_id=(
os.getenv("QQBOT_HOME_CHANNEL_THREAD_ID")
or os.getenv("QQ_HOME_CHANNEL_THREAD_ID")
or None
),
)
# Yuanbao — YUANBAO_APP_ID preferred
@@ -1512,6 +1552,7 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
platform=Platform.YUANBAO,
chat_id=yuanbao_home,
name=os.getenv("YUANBAO_HOME_CHANNEL_NAME", "Home"),
thread_id=os.getenv("YUANBAO_HOME_CHANNEL_THREAD_ID") or None,
)
yuanbao_dm_policy = os.getenv("YUANBAO_DM_POLICY")
if yuanbao_dm_policy:
+84
View File
@@ -0,0 +1,84 @@
"""Shared HTTP client factory for long-lived platform adapters.
Gateway messaging platforms (QQ Bot, Feishu, WeCom, DingTalk, Signal,
BlueBubbles, WeCom-callback) keep a persistent ``httpx.AsyncClient``
alive for the adapter's lifetime. That amortises TLS/connection setup
across many API calls, but it also means the process's file-descriptor
pressure is sensitive to how aggressively the pool recycles idle keep-
alive connections.
httpx's default ``keepalive_expiry`` is 5 seconds. On macOS behind
Cloudflare Warp (and other transparent proxies), peer-initiated FIN can
sit in ``CLOSE_WAIT`` longer than that before the local socket actually
drains which, multiplied across 7 long-lived adapters plus the LLM
client and MCP clients, walks straight into the default 256 fd limit.
See #18451.
``platform_httpx_limits()`` returns a tighter ``httpx.Limits`` the
adapter factories use instead of the httpx default. The values chosen:
* ``max_keepalive_connections=10`` plenty for any single adapter;
platform APIs rarely parallelise beyond this.
* ``keepalive_expiry=2.0`` close idle sockets aggressively so a
proxy's lingering CLOSE_WAIT window can't starve the process.
Override via ``HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY`` /
``HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE`` env vars when tuning under load.
"""
from __future__ import annotations
import os
try:
import httpx
except ImportError: # pragma: no cover — optional dep
httpx = None # type: ignore[assignment]
_DEFAULT_KEEPALIVE_EXPIRY_S = 2.0
_DEFAULT_MAX_KEEPALIVE = 10
def platform_httpx_limits() -> "httpx.Limits | None":
"""Return ``httpx.Limits`` tuned for persistent platform-adapter clients.
Returns ``None`` when httpx isn't importable, so callers can fall
back to httpx's built-in default without a hard dependency on this
helper being reachable.
"""
if httpx is None:
return None
def _env_float(name: str, default: float) -> float:
raw = os.environ.get(name, "").strip()
if not raw:
return default
try:
val = float(raw)
except (TypeError, ValueError):
return default
return val if val > 0 else default
def _env_int(name: str, default: int) -> int:
raw = os.environ.get(name, "").strip()
if not raw:
return default
try:
val = int(raw)
except (TypeError, ValueError):
return default
return val if val > 0 else default
keepalive_expiry = _env_float(
"HERMES_GATEWAY_HTTPX_KEEPALIVE_EXPIRY", _DEFAULT_KEEPALIVE_EXPIRY_S
)
max_keepalive = _env_int(
"HERMES_GATEWAY_HTTPX_MAX_KEEPALIVE", _DEFAULT_MAX_KEEPALIVE
)
return httpx.Limits(
max_keepalive_connections=max_keepalive,
# Leave max_connections at httpx default (100) — plenty of headroom.
keepalive_expiry=keepalive_expiry,
)
+48 -18
View File
@@ -62,6 +62,14 @@ MAX_NORMALIZED_TEXT_LENGTH = 65_536 # 64 KB cap for normalized content parts
MAX_CONTENT_LIST_SIZE = 1_000 # Max items when content is an array
def _coerce_port(value: Any, default: int = DEFAULT_PORT) -> int:
"""Parse a listen port without letting malformed env/config values crash startup."""
try:
return int(value)
except (TypeError, ValueError):
return default
def _normalize_chat_content(
content: Any, *, _max_depth: int = 10, _depth: int = 0,
) -> str:
@@ -573,7 +581,10 @@ class APIServerAdapter(BasePlatformAdapter):
super().__init__(config, Platform.API_SERVER)
extra = config.extra or {}
self._host: str = extra.get("host", os.getenv("API_SERVER_HOST", DEFAULT_HOST))
self._port: int = int(extra.get("port", os.getenv("API_SERVER_PORT", str(DEFAULT_PORT))))
raw_port = extra.get("port")
if raw_port is None:
raw_port = os.getenv("API_SERVER_PORT", str(DEFAULT_PORT))
self._port: int = _coerce_port(raw_port, DEFAULT_PORT)
self._api_key: str = extra.get("key", os.getenv("API_SERVER_KEY", ""))
self._cors_origins: tuple[str, ...] = self._parse_cors_origins(
extra.get("cors_origins", os.getenv("API_SERVER_CORS_ORIGINS", "")),
@@ -727,10 +738,11 @@ class APIServerAdapter(BasePlatformAdapter):
gateway platforms), falling back to the hermes-api-server default.
"""
from run_agent import AIAgent
from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config
from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config, GatewayRunner
from hermes_cli.tools_config import _get_platform_tools
runtime_kwargs = _resolve_runtime_agent_kwargs()
reasoning_config = GatewayRunner._load_reasoning_config()
model = _resolve_gateway_model()
user_config = _load_gateway_config()
@@ -740,7 +752,6 @@ class APIServerAdapter(BasePlatformAdapter):
# Load fallback provider chain so the API server platform has the
# same fallback behaviour as Telegram/Discord/Slack (fixes #4954).
from gateway.run import GatewayRunner
fallback_model = GatewayRunner._load_fallback_model()
agent = AIAgent(
@@ -759,6 +770,7 @@ class APIServerAdapter(BasePlatformAdapter):
tool_complete_callback=tool_complete_callback,
session_db=self._ensure_session_db(),
fallback_model=fallback_model,
reasoning_config=reasoning_config,
)
return agent
@@ -2566,21 +2578,39 @@ class APIServerAdapter(BasePlatformAdapter):
return r, u
result, usage = await asyncio.get_running_loop().run_in_executor(None, _run_sync)
final_response = result.get("final_response", "") if isinstance(result, dict) else ""
q.put_nowait({
"event": "run.completed",
"run_id": run_id,
"timestamp": time.time(),
"output": final_response,
"usage": usage,
})
self._set_run_status(
run_id,
"completed",
output=final_response,
usage=usage,
last_event="run.completed",
)
# Check for structured failure (non-retryable client errors like
# 401/400 return failed=True instead of raising, so the except
# block below never fires — issue #15561).
if isinstance(result, dict) and result.get("failed"):
error_msg = result.get("error") or "agent run failed"
q.put_nowait({
"event": "run.failed",
"run_id": run_id,
"timestamp": time.time(),
"error": error_msg,
})
self._set_run_status(
run_id,
"failed",
error=error_msg,
last_event="run.failed",
)
else:
final_response = result.get("final_response", "") if isinstance(result, dict) else ""
q.put_nowait({
"event": "run.completed",
"run_id": run_id,
"timestamp": time.time(),
"output": final_response,
"usage": usage,
})
self._set_run_status(
run_id,
"completed",
output=final_response,
usage=usage,
last_event="run.completed",
)
except asyncio.CancelledError:
self._set_run_status(
run_id,
+39 -10
View File
@@ -2489,19 +2489,30 @@ class BasePlatformAdapter(ABC):
try:
response = await self._message_handler(event)
# Old adapter task (if any) is cancelled AFTER the runner has
# fully handled the command — keeps ordering deterministic.
await self.cancel_session_processing(
session_key,
release_guard=False,
discard_pending=False,
)
_text, _eph_ttl = self._unwrap_ephemeral(response)
# Send the response BEFORE cancelling the old task so the send
# cannot be affected by task-cancellation side effects (race
# condition fix — issue #18912). Previously the send happened
# after cancel_session_processing, which could silently drop the
# "/new" confirmation when an agent was actively running.
if _text:
logger.info(
"[%s] Sending command '/%s' response (%d chars) to %s",
self.name,
cmd,
len(_text),
event.source.chat_id,
)
_r = await self._send_with_retry(
chat_id=event.source.chat_id,
content=_text,
reply_to=event.message_id,
reply_to=(
event.reply_to_message_id
if event.source.platform == Platform.FEISHU
and event.source.thread_id
and event.reply_to_message_id
else event.message_id
),
metadata=thread_meta,
)
if _eph_ttl > 0 and _r.success and _r.message_id:
@@ -2510,6 +2521,13 @@ class BasePlatformAdapter(ABC):
message_id=_r.message_id,
ttl_seconds=_eph_ttl,
)
# Old adapter task (if any) is cancelled AFTER the response has
# been sent — keeps ordering deterministic and avoids the race.
await self.cancel_session_processing(
session_key,
release_guard=False,
discard_pending=False,
)
except Exception:
# On failure, restore the original guard if one still exists so
# we don't leave the session in a half-reset state.
@@ -2594,7 +2612,13 @@ class BasePlatformAdapter(ABC):
_r = await self._send_with_retry(
chat_id=event.source.chat_id,
content=_text,
reply_to=event.message_id,
reply_to=(
event.reply_to_message_id
if event.source.platform == Platform.FEISHU
and event.source.thread_id
and event.reply_to_message_id
else event.message_id
),
metadata=_thread_meta,
)
if _eph_ttl > 0 and _r.success and _r.message_id:
@@ -2798,10 +2822,15 @@ class BasePlatformAdapter(ABC):
# Send the text portion
if text_content:
logger.info("[%s] Sending response (%d chars) to %s", self.name, len(text_content), event.source.chat_id)
_reply_anchor = (
event.reply_to_message_id
if event.source.platform == Platform.FEISHU and event.source.thread_id and event.reply_to_message_id
else event.message_id
)
result = await self._send_with_retry(
chat_id=event.source.chat_id,
content=text_content,
reply_to=event.message_id,
reply_to=_reply_anchor,
metadata=_thread_metadata,
)
_record_delivery(result)
+3 -1
View File
@@ -162,7 +162,9 @@ class BlueBubblesAdapter(BasePlatformAdapter):
return False
from aiohttp import web
self.client = httpx.AsyncClient(timeout=30.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self.client = httpx.AsyncClient(timeout=30.0, limits=platform_httpx_limits())
try:
await self._api_get("/api/v1/ping")
info = await self._api_get("/api/v1/server/info")
+5 -1
View File
@@ -228,7 +228,11 @@ class DingTalkAdapter(BasePlatformAdapter):
return False
try:
self._http_client = httpx.AsyncClient(timeout=30.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(
timeout=30.0, limits=platform_httpx_limits(),
)
credential = dingtalk_stream.Credential(
self._client_id, self._client_secret
+515 -46
View File
@@ -497,6 +497,7 @@ class DiscordAdapter(BasePlatformAdapter):
self._ready_event = asyncio.Event()
self._allowed_user_ids: set = set() # For button approval authorization
self._allowed_role_ids: set = set() # For DISCORD_ALLOWED_ROLES filtering
self.gateway_runner = None # Set by gateway/run.py for cross-platform delivery
# Voice channel state (per-guild)
self._voice_clients: Dict[int, Any] = {} # guild_id -> VoiceClient
self._voice_locks: Dict[int, asyncio.Lock] = {} # guild_id -> serialize join/leave
@@ -613,6 +614,21 @@ class DiscordAdapter(BasePlatformAdapter):
# so LLM output or echoed user content can't ping the whole
# server; override per DISCORD_ALLOW_MENTION_* env vars or the
# discord.allow_mentions.* block in config.yaml.
# Close any existing client to prevent zombie websocket connections
# on reconnect (see #18187). Without this, the old client remains
# connected to Discord gateway and both fire on_message, causing
# double responses.
if self._client is not None:
try:
if not self._client.is_closed():
await self._client.close()
except Exception:
logger.debug("[%s] Failed to close previous Discord client", self.name)
finally:
self._client = None
self._ready_event.clear()
self._client = commands.Bot(
command_prefix="!", # Not really used, we handle raw messages
intents=intents,
@@ -704,11 +720,22 @@ class DiscordAdapter(BasePlatformAdapter):
return
# If humans are mentioned but we're not → not for us
# (preserves old DISCORD_IGNORE_NO_MENTION=true behavior)
# EXCEPT in free-response channels where the bot should
# answer regardless of who is mentioned.
_ignore_no_mention = os.getenv(
"DISCORD_IGNORE_NO_MENTION", "true"
).lower() in ("true", "1", "yes")
if _ignore_no_mention and not _self_mentioned and not _other_bots_mentioned:
return
_channel_id = str(message.channel.id)
_parent_id = None
if hasattr(message.channel, "parent_id") and message.channel.parent_id:
_parent_id = str(message.channel.parent_id)
_free_channels = adapter_self._discord_free_response_channels()
_channel_ids = {_channel_id}
if _parent_id:
_channel_ids.add(_parent_id)
if "*" not in _free_channels and not (_channel_ids & _free_channels):
return
await self._handle_message(message)
@@ -1914,6 +1941,225 @@ class DiscordAdapter(BasePlatformAdapter):
return True
return False
# ── Slash command authorization ─────────────────────────────────────
# Slash commands (``_run_simple_slash`` and ``_handle_thread_create_slash``)
# are a separate Discord interaction surface from regular messages and
# historically ran with NO authorization check — bypassing every gate
# ``on_message`` enforces (DISCORD_ALLOWED_USERS, DISCORD_ALLOWED_ROLES,
# DISCORD_ALLOWED_CHANNELS, DISCORD_IGNORED_CHANNELS). Any guild member
# could invoke ``/background``, ``/restart``, ``/sethome``, etc. as the
# operator. ``_check_slash_authorization`` mirrors the on_message gates
# one-for-one so the slash surface honors the same trust boundary.
#
# By design, this is a no-op for deployments with no allowlist env vars
# set — ``_is_allowed_user`` returns True and the channel checks early-out
# — preserving the existing "single-tenant, all guild members trusted"
# default. Deployments that DO set any DISCORD_ALLOWED_* var get slash
# parity with on_message.
def _evaluate_slash_authorization(
self, interaction: "discord.Interaction",
) -> Tuple[bool, Optional[str]]:
"""Evaluate slash authorization without producing any response.
Returns ``(allowed, reason)``. ``reason`` is populated only when
``allowed`` is False. This is the shared core used by both the
responding wrapper (``_check_slash_authorization``) and side-effect-
free callers like the ``/skill`` autocomplete callback, which must
return an empty list for unauthorized users instead of leaking an
ephemeral rejection per-keystroke.
Fail-closed semantics for malformed payloads: when an allowlist is
configured but the interaction is missing the data needed to
evaluate it (no channel id with channel policy active, no user
with user/role policy active), the gate REJECTS rather than
falling through. Without these guards a guild interaction that
happens to deserialize without a channel id would silently bypass
``DISCORD_ALLOWED_CHANNELS`` and a payload missing ``user`` would
raise ``AttributeError`` in the user check below, surfacing as
an opaque interaction failure rather than a clean rejection.
"""
chan_obj = getattr(interaction, "channel", None)
in_dm = isinstance(chan_obj, discord.DMChannel) if chan_obj is not None else False
# ── Channel scope (mirrors on_message lines 3374-3388) ──
# DMs aren't channel-gated — DMs follow on_message's DM lockdown
# path which has its own user-allowlist enforcement.
if not in_dm:
chan_id_raw = getattr(interaction, "channel_id", None) or getattr(
chan_obj, "id", None,
)
channel_ids: set = set()
if chan_id_raw is not None:
channel_ids.add(str(chan_id_raw))
# Mirror on_message: also test the parent channel for threads
# so per-channel allow/deny lists work consistently.
if isinstance(chan_obj, discord.Thread):
parent_id = self._get_parent_channel_id(chan_obj)
if parent_id:
channel_ids.add(str(parent_id))
allowed_raw = os.getenv("DISCORD_ALLOWED_CHANNELS", "")
if allowed_raw:
allowed = {c.strip() for c in allowed_raw.split(",") if c.strip()}
if "*" not in allowed:
if not channel_ids:
# Channel policy is configured but the interaction
# has no resolvable channel id. Fail closed.
return (
False,
"channel id missing with DISCORD_ALLOWED_CHANNELS configured",
)
if not (channel_ids & allowed):
return (False, "channel not in DISCORD_ALLOWED_CHANNELS")
# Ignored beats allowed: even when a thread's parent channel
# is on the allowlist, an explicit DISCORD_IGNORED_CHANNELS
# entry on the thread or its parent rejects the interaction.
ignored_raw = os.getenv("DISCORD_IGNORED_CHANNELS", "")
if ignored_raw and channel_ids:
ignored = {c.strip() for c in ignored_raw.split(",") if c.strip()}
if "*" in ignored or (channel_ids & ignored):
return (False, "channel in DISCORD_IGNORED_CHANNELS")
# ── User / role allowlist (mirrors on_message line 681) ──
user = getattr(interaction, "user", None)
allowed_users = getattr(self, "_allowed_user_ids", set()) or set()
allowed_roles = getattr(self, "_allowed_role_ids", set()) or set()
if user is None or getattr(user, "id", None) is None:
# No identifiable user. With any user/role allowlist
# configured, fail closed rather than raise AttributeError
# on ``interaction.user.id`` below. With no allowlist this
# is the existing "no allowlist = everyone" backwards-compat.
if allowed_users or allowed_roles:
return (False, "missing interaction.user with allowlist configured")
return (True, None)
user_id = str(user.id)
if not self._is_allowed_user(user_id, author=user):
return (
False,
"user not in DISCORD_ALLOWED_USERS / DISCORD_ALLOWED_ROLES",
)
return (True, None)
async def _check_slash_authorization(
self, interaction: "discord.Interaction", command_text: str,
) -> bool:
"""Mirror on_message's user/role/channel gates onto a slash invocation.
Returns True to proceed. Returns False *after* sending an ephemeral
rejection, logging a warning, and scheduling a cross-platform admin
alert the caller must stop on False (the interaction has already
been responded to).
"""
allowed, reason = self._evaluate_slash_authorization(interaction)
if allowed:
return True
return await self._reject_slash(
interaction, command_text, reason=reason or "unauthorized",
)
async def _reject_slash(
self, interaction: "discord.Interaction", command_text: str, *, reason: str,
) -> bool:
"""Send ephemeral reject + log warning + schedule admin alert. Returns False.
Tolerates a missing ``interaction.user`` -- the fail-closed branch
in ``_evaluate_slash_authorization`` deliberately routes here for
malformed payloads (no user) when an allowlist is configured, and
``str(interaction.user.id)`` would raise AttributeError before the
ephemeral rejection could be sent.
"""
user = getattr(interaction, "user", None)
if user is not None:
user_id = str(getattr(user, "id", "?"))
user_name = getattr(user, "name", "?")
else:
user_id = "?"
user_name = "?"
chan_id = getattr(interaction, "channel_id", None) or getattr(
getattr(interaction, "channel", None), "id", None,
)
guild_id = getattr(interaction, "guild_id", None)
logger.warning(
"[Discord] Unauthorized slash attempt: user=%s id=%s channel=%s "
"guild=%s cmd=%r reason=%r",
user_name, user_id, chan_id, guild_id, command_text, reason,
)
try:
await interaction.response.send_message(
"You're not authorized to use this command.",
ephemeral=True,
)
except Exception as e:
# Interaction may already be responded to (e.g. caller deferred
# before the auth check, or Discord retried). Best-effort only.
logger.debug("[Discord] Could not send unauthorized ephemeral: %s", e)
# Fire-and-forget: don't block the interaction handler on Telegram I/O.
try:
asyncio.create_task(self._notify_unauthorized_slash(
user_name, user_id, chan_id, guild_id, command_text, reason,
))
except Exception as e:
logger.debug("[Discord] Could not schedule admin notify task: %s", e)
return False
async def _notify_unauthorized_slash(
self, user_name: str, user_id: str, chan_id, guild_id,
command_text: str, reason: str,
) -> None:
"""Best-effort cross-platform alert to the gateway operator.
Tries TELEGRAM first (most operators set TELEGRAM_HOME_CHANNEL),
then SLACK. Silently no-ops if no other platform is configured
with a home channel.
A soft send failure -- adapter.send() returning a result with
``success=False`` rather than raising -- continues the fallback
chain. Treating a SendResult(success=False) as delivered would
mean a Telegram outage that the adapter politely surfaces (e.g.
rate-limit, auth failure) silently swallows the alert without
attempting Slack. Hard exceptions still take the same path via
the except branch below.
"""
runner = getattr(self, "gateway_runner", None)
if not runner:
return
for target in (Platform.TELEGRAM, Platform.SLACK):
try:
adapter = runner.adapters.get(target)
if not adapter:
continue
home = runner.config.get_home_channel(target)
if not home or not getattr(home, "chat_id", None):
continue
msg = (
"⚠️ Unauthorized Discord slash attempt\n"
f"User: {user_name} ({user_id})\n"
f"Channel: {chan_id} (guild {guild_id})\n"
f"Command: {command_text}\n"
f"Reason: {reason}"
)
result = await adapter.send(str(home.chat_id), msg)
# Only return on confirmed delivery. SendResult(success=False)
# -> continue to the next platform.
if getattr(result, "success", None) is False:
logger.debug(
"[Discord] Admin notify via %s returned success=False"
" (error=%r); falling through",
target, getattr(result, "error", None),
)
continue
return
except Exception as e:
logger.debug("[Discord] Admin notify via %s failed: %s", target, e)
async def send_image_file(
self,
chat_id: str,
@@ -2301,6 +2547,11 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception:
pass # logging must never block command dispatch
# Auth gate — must run before defer() so an ephemeral rejection can
# be delivered on the still-unresponded interaction.
if not await self._check_slash_authorization(interaction, command_text):
return
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, command_text)
await self.handle_message(event)
@@ -2445,7 +2696,8 @@ class DiscordAdapter(BasePlatformAdapter):
message: str = "",
auto_archive_duration: int = 1440,
):
await interaction.response.defer(ephemeral=True)
# defer() is performed inside the handler *after* the auth gate
# so a rejected invoker can receive an ephemeral rejection.
await self._handle_thread_create_slash(interaction, name, message, auto_archive_duration)
@tree.command(name="queue", description="Queue a prompt for the next turn (doesn't interrupt)")
@@ -2566,6 +2818,54 @@ class DiscordAdapter(BasePlatformAdapter):
# supporting up to 25 categories × 25 skills = 625 skills.
self._register_skill_group(tree)
# Optional defense-in-depth: hide every slash command from non-admin
# guild members in Discord's slash picker. Server-side authorization
# (``_check_slash_authorization``) is the actual gate; this is purely
# UX so users don't see commands they can't invoke. Off by default
# to preserve the slash UX for deployments that intentionally allow
# everyone in the guild.
if os.getenv("DISCORD_HIDE_SLASH_COMMANDS", "false").strip().lower() in (
"true", "1", "yes", "on",
):
self._apply_owner_only_visibility(tree)
def _apply_owner_only_visibility(self, tree) -> None:
"""Set default_member_permissions=0 on every registered slash command.
Discord interprets ``Permissions(0)`` as "requires no permissions",
which paradoxically means the command is hidden from every guild
member except those with the Administrator permission. Server admins
can re-grant per user/role via Server Settings Integrations
<bot> Permissions.
Authoritative gate is ``_check_slash_authorization`` on every
invocation, which catches stale clients, role grants made by
mistake, and direct API calls bypassing Discord's UI hide.
"""
try:
no_perms = discord.Permissions(0)
except Exception as e:
logger.warning(
"[Discord] _apply_owner_only_visibility: cannot build Permissions(0): %s",
e,
)
return
applied = 0
for cmd in tree.get_commands():
try:
cmd.default_permissions = no_perms
applied += 1
except Exception as e:
logger.debug(
"[Discord] Could not set default_permissions on %r: %s",
getattr(cmd, "name", "?"), e,
)
logger.info(
"[Discord] Hid %d slash command(s) from non-admin guild members "
"(opt-in defense in depth via DISCORD_HIDE_SLASH_COMMANDS).",
applied,
)
def _register_skill_group(self, tree) -> None:
"""Register a single ``/skill`` command with autocomplete on the name.
@@ -2584,40 +2884,32 @@ class DiscordAdapter(BasePlatformAdapter):
hidden skills. The slash picker also becomes more discoverable
Discord live-filters by the user's typed prefix against both the
skill name and its description.
The entries list and lookup dict are stored on ``self`` rather
than captured in closure variables so :meth:`refresh_skill_group`
can repopulate them when the user runs ``/reload-skills`` without
needing to touch the Discord slash-command tree or trigger a
``tree.sync()`` call.
"""
try:
from hermes_cli.commands import discord_skill_commands_by_category
existing_names = set()
try:
existing_names = {cmd.name for cmd in tree.get_commands()}
except Exception:
pass
# Reuse the existing collector for consistent filtering
# (per-platform disabled, hub-excluded, name clamping), then
# flatten — the category grouping was only useful for the
# nested layout.
categories, uncategorized, hidden = discord_skill_commands_by_category(
reserved_names=existing_names,
)
entries: list[tuple[str, str, str]] = list(uncategorized)
for cat_skills in categories.values():
entries.extend(cat_skills)
# Populate the instance-level entries/lookup so the
# autocomplete + handler callbacks below always read the
# freshest state. refresh_skill_group() re-runs the same
# collector and mutates these two attributes in place.
self._skill_entries: list[tuple[str, str, str]] = []
self._skill_lookup: dict[str, tuple[str, str]] = {}
self._skill_group_reserved_names: set[str] = set(existing_names)
self._refresh_skill_catalog_state()
if not entries:
if not self._skill_entries:
return
# Stable alphabetical order so the autocomplete suggestion
# list is predictable across restarts.
entries.sort(key=lambda t: t[0])
# name -> (description, cmd_key) — used by both the autocomplete
# callback and the handler for O(1) dispatch.
skill_lookup: dict[str, tuple[str, str]] = {
n: (d, k) for n, d, k in entries
}
async def _autocomplete_name(
interaction: "discord.Interaction", current: str,
) -> list:
@@ -2627,10 +2919,29 @@ class DiscordAdapter(BasePlatformAdapter):
"/skill pdf" surfaces skills whose description mentions
PDFs even if the name doesn't. Discord caps this list at
25 entries per query.
Authorization: a quiet pre-check evaluates the slash
allowlists and returns ``[]`` for unauthorized users so
the installed skill catalog is not leaked to anyone who
can see the command in the picker. Returning a generic
empty list here is intentional sending a per-keystroke
ephemeral rejection would produce a barrage of error
popups during typing.
Reads ``self._skill_entries`` so a ``/reload-skills`` run
since process start shows up on the very next keystroke.
"""
try:
allowed, _reason = self._evaluate_slash_authorization(interaction)
except Exception:
# Defensive: never raise from autocomplete. Fail
# closed by returning an empty suggestion list.
return []
if not allowed:
return []
q = (current or "").strip().lower()
choices: list = []
for name, desc, _key in entries:
for name, desc, _key in self._skill_entries:
if not q or q in name.lower() or (desc and q in desc.lower()):
if desc:
label = f"{name}{desc}"
@@ -2654,7 +2965,13 @@ class DiscordAdapter(BasePlatformAdapter):
async def _skill_handler(
interaction: "discord.Interaction", name: str, args: str = "",
):
entry = skill_lookup.get(name)
# Authorize BEFORE any skill lookup so that known and
# unknown skill names produce identical rejections for
# unauthorized users (no probing the installed catalog
# via "Unknown skill: <name>" responses).
if not await self._check_slash_authorization(interaction, "/skill"):
return
entry = self._skill_lookup.get(name)
if not entry:
await interaction.response.send_message(
f"Unknown skill: `{name}`. Start typing for "
@@ -2676,16 +2993,74 @@ class DiscordAdapter(BasePlatformAdapter):
logger.info(
"[%s] Registered /skill command with %d skill(s) via autocomplete",
self.name, len(entries),
self.name, len(self._skill_entries),
)
if hidden:
if self._skill_group_hidden_count:
logger.info(
"[%s] %d skill(s) filtered out of /skill (name clamp / reserved)",
self.name, hidden,
self.name, self._skill_group_hidden_count,
)
except Exception as exc:
logger.warning("[%s] Failed to register /skill command: %s", self.name, exc)
def _refresh_skill_catalog_state(self) -> None:
"""Re-scan disk for skills and repopulate ``self._skill_entries``.
Called once from :meth:`_register_skill_group` at startup and
again from :meth:`refresh_skill_group` whenever the user runs
``/reload-skills``. No Discord API calls are made autocomplete
and the handler both read from these instance attributes
directly, so an in-place mutation is sufficient.
"""
from hermes_cli.commands import discord_skill_commands_by_category
reserved = getattr(self, "_skill_group_reserved_names", set())
categories, uncategorized, hidden = discord_skill_commands_by_category(
reserved_names=set(reserved),
)
entries: list[tuple[str, str, str]] = list(uncategorized)
for cat_skills in categories.values():
entries.extend(cat_skills)
# Stable alphabetical order so the autocomplete suggestion
# list is predictable across restarts.
entries.sort(key=lambda t: t[0])
self._skill_entries = entries
self._skill_lookup = {n: (d, k) for n, d, k in entries}
self._skill_group_hidden_count = hidden
def refresh_skill_group(self) -> tuple[int, int]:
"""Rescan skills and update the live ``/skill`` autocomplete state.
Invoked by :meth:`gateway.run.GatewayOrchestrator._handle_reload_skills_command`
after :func:`agent.skill_commands.reload_skills` has refreshed
the in-process skill-command registry. Without this call, the
``/skill`` autocomplete dropdown keeps showing the list captured
at process start new skills stay invisible and deleted skills
return an "Unknown skill" error when clicked.
Because autocomplete options are fetched dynamically by Discord,
we only need to mutate the entries/lookup attributes read by the
callbacks no ``tree.sync()`` is required.
Returns ``(new_count, hidden_count)``.
"""
try:
self._refresh_skill_catalog_state()
except Exception as exc:
logger.warning(
"[%s] Failed to refresh /skill autocomplete after reload: %s",
self.name, exc,
)
return (len(getattr(self, "_skill_entries", [])), 0)
logger.info(
"[%s] Refreshed /skill autocomplete: %d skill(s) available (%d filtered)",
self.name,
len(self._skill_entries),
self._skill_group_hidden_count,
)
return (len(self._skill_entries), self._skill_group_hidden_count)
def _build_slash_event(self, interaction: discord.Interaction, text: str) -> MessageEvent:
"""Build a MessageEvent from a Discord slash command interaction."""
is_dm = isinstance(interaction.channel, discord.DMChannel)
@@ -2743,6 +3118,9 @@ class DiscordAdapter(BasePlatformAdapter):
auto_archive_duration: int = 1440,
) -> None:
"""Create a Discord thread from a slash command and start a session in it."""
if not await self._check_slash_authorization(interaction, "/thread"):
return
await interaction.response.defer(ephemeral=True)
result = await self._create_thread(
interaction,
name=name,
@@ -3037,6 +3415,7 @@ class DiscordAdapter(BasePlatformAdapter):
view = ExecApprovalView(
session_key=session_key,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
@@ -3075,6 +3454,7 @@ class DiscordAdapter(BasePlatformAdapter):
session_key=session_key,
confirm_id=confirm_id,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
@@ -3109,6 +3489,7 @@ class DiscordAdapter(BasePlatformAdapter):
view = UpdatePromptView(
session_key=session_key,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
return SendResult(success=True, message_id=str(msg.id))
@@ -3166,6 +3547,7 @@ class DiscordAdapter(BasePlatformAdapter):
session_key=session_key,
on_model_selected=on_model_selected,
allowed_user_ids=self._allowed_user_ids,
allowed_role_ids=self._allowed_role_ids,
)
msg = await channel.send(embed=embed, view=view)
@@ -3426,7 +3808,7 @@ class DiscordAdapter(BasePlatformAdapter):
if not is_thread and not isinstance(message.channel, discord.DMChannel):
no_thread_channels_raw = os.getenv("DISCORD_NO_THREAD_CHANNELS", "")
no_thread_channels = {ch.strip() for ch in no_thread_channels_raw.split(",") if ch.strip()}
skip_thread = bool(channel_ids & no_thread_channels) or is_free_channel
skip_thread = bool(channel_ids & no_thread_channels)
auto_thread = os.getenv("DISCORD_AUTO_THREAD", "true").lower() in ("true", "1", "yes")
is_reply_message = getattr(message, "type", None) == discord.MessageType.reply
if auto_thread and not skip_thread and not is_voice_linked_channel and not is_reply_message:
@@ -3721,6 +4103,72 @@ class DiscordAdapter(BasePlatformAdapter):
# Discord UI Components (outside the adapter class)
# ---------------------------------------------------------------------------
def _component_check_auth(
interaction,
allowed_user_ids: Optional[set],
allowed_role_ids: Optional[set],
) -> bool:
"""Shared user-or-role OR semantics for component view button clicks.
Mirrors ``DiscordAdapter._is_allowed_user`` / the slash and on_message
gates so every Discord interaction surface honors the same trust
boundary. Component views (ExecApprovalView, SlashConfirmView,
UpdatePromptView, ModelPickerView) used to receive only
``allowed_user_ids``: in role-only deployments
(DISCORD_ALLOWED_ROLES set, DISCORD_ALLOWED_USERS empty) the user
set was empty and the legacy "no allowlist = allow everyone" branch
let any guild member click the buttons -- approving exec commands,
cancelling slash confirmations, switching the model.
Behavior:
- both allowlists empty -> allow (preserves existing no-allowlist
deployments, no regression)
- user is in user allowlist -> allow
- role allowlist set + user has a role in it -> allow
- role allowlist set + interaction.user has no resolvable
``roles`` attribute (e.g. DM context with a role policy active)
-> reject (fail closed)
- otherwise -> reject
"""
user_set = allowed_user_ids or set()
role_set = allowed_role_ids or set()
has_users = bool(user_set)
has_roles = bool(role_set)
if not has_users and not has_roles:
return True
user = getattr(interaction, "user", None)
if user is None:
return False
if has_users:
try:
uid = str(user.id)
except AttributeError:
uid = ""
if uid and uid in user_set:
return True
if has_roles:
roles_attr = getattr(user, "roles", None)
if roles_attr is None:
# Role policy is configured but the interaction doesn't
# carry role data (DM-context Member, raw User payload).
# Fail closed: a user without a resolvable role list cannot
# satisfy a role allowlist.
return False
try:
user_role_ids = {getattr(r, "id", None) for r in roles_attr}
except TypeError:
return False
if user_role_ids & role_set:
return True
return False
if DISCORD_AVAILABLE:
class ExecApprovalView(discord.ui.View):
@@ -3733,17 +4181,23 @@ if DISCORD_AVAILABLE:
Only users in the allowed list can click. Times out after 5 minutes.
"""
def __init__(self, session_key: str, allowed_user_ids: set):
def __init__(
self,
session_key: str,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=300) # 5-minute timeout
self.session_key = session_key
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
def _check_auth(self, interaction: discord.Interaction) -> bool:
"""Verify the user clicking is authorized."""
if not self.allowed_user_ids:
return True # No allowlist = anyone can approve
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
async def _resolve(
self, interaction: discord.Interaction, choice: str,
@@ -3835,17 +4289,24 @@ if DISCORD_AVAILABLE:
5 minutes (matches the gateway primitive's timeout).
"""
def __init__(self, session_key: str, confirm_id: str, allowed_user_ids: set):
def __init__(
self,
session_key: str,
confirm_id: str,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=300)
self.session_key = session_key
self.confirm_id = confirm_id
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
def _check_auth(self, interaction: discord.Interaction) -> bool:
if not self.allowed_user_ids:
return True
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
async def _resolve(
self, interaction: discord.Interaction, choice: str,
@@ -3923,16 +4384,22 @@ if DISCORD_AVAILABLE:
5-minute timeout on its side).
"""
def __init__(self, session_key: str, allowed_user_ids: set):
def __init__(
self,
session_key: str,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=300)
self.session_key = session_key
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
def _check_auth(self, interaction: discord.Interaction) -> bool:
if not self.allowed_user_ids:
return True
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
async def _respond(
self, interaction: discord.Interaction, answer: str,
@@ -4009,6 +4476,7 @@ if DISCORD_AVAILABLE:
session_key: str,
on_model_selected,
allowed_user_ids: set,
allowed_role_ids: Optional[set] = None,
):
super().__init__(timeout=120)
self.providers = providers
@@ -4017,15 +4485,16 @@ if DISCORD_AVAILABLE:
self.session_key = session_key
self.on_model_selected = on_model_selected
self.allowed_user_ids = allowed_user_ids
self.allowed_role_ids = allowed_role_ids or set()
self.resolved = False
self._selected_provider: str = ""
self._build_provider_select()
def _check_auth(self, interaction: discord.Interaction) -> bool:
if not self.allowed_user_ids:
return True
return str(interaction.user.id) in self.allowed_user_ids
return _component_check_auth(
interaction, self.allowed_user_ids, self.allowed_role_ids,
)
def _build_provider_select(self):
"""Build the provider dropdown menu."""
+12
View File
@@ -416,6 +416,18 @@ class EmailAdapter(BasePlatformAdapter):
logger.debug("[Email] Dropping automated sender at dispatch: %s", sender_addr)
return
# Skip senders not in EMAIL_ALLOWED_USERS — prevents the adapter
# from creating a MessageEvent (and thus thread context) for senders
# that the gateway will never authorize. Without this early guard,
# a race between dispatch and authorization can result in the adapter
# sending a reply even though the handler returned None.
allowed_raw = os.getenv("EMAIL_ALLOWED_USERS", "").strip()
if allowed_raw:
allowed = {addr.strip().lower() for addr in allowed_raw.split(",") if addr.strip()}
if sender_addr.lower() not in allowed:
logger.debug("[Email] Dropping non-allowlisted sender at dispatch: %s", sender_addr)
return
subject = msg_data["subject"]
body = msg_data["body"].strip()
attachments = msg_data["attachments"]
+19 -3
View File
@@ -2757,9 +2757,11 @@ class FeishuAdapter(BasePlatformAdapter):
if hint:
text = f"{hint}\n\n{text}" if text else hint
thread_id = getattr(message, "thread_id", None) or getattr(message, "root_id", None) or None
reply_to_message_id = (
getattr(message, "parent_id", None)
or getattr(message, "upper_message_id", None)
or getattr(message, "root_id", None)
or None
)
reply_to_text = await self._fetch_message_text(reply_to_message_id) if reply_to_message_id else None
@@ -2791,7 +2793,7 @@ class FeishuAdapter(BasePlatformAdapter):
chat_type=self._resolve_source_chat_type(chat_info=chat_info, event_chat_type=chat_type),
user_id=sender_profile["user_id"],
user_name=sender_profile["user_name"],
thread_id=getattr(message, "thread_id", None) or None,
thread_id=thread_id,
user_id_alt=sender_profile["user_id_alt"],
is_bot=is_bot,
)
@@ -2922,13 +2924,18 @@ class FeishuAdapter(BasePlatformAdapter):
},
)
response.raise_for_status()
# Snapshot Content-Type and body while the client context is
# still active so pooled connections fully release on exit.
# See #18451.
content_type_hdr = str(response.headers.get("Content-Type", ""))
body = response.content
filename = self._derive_remote_filename(
file_url,
content_type=str(response.headers.get("Content-Type", "")),
content_type=content_type_hdr,
default_name=preferred_name,
default_ext=default_ext,
)
cached_path = cache_document_from_bytes(response.content, filename)
cached_path = cache_document_from_bytes(body, filename)
return cached_path, filename
@staticmethod
@@ -4222,6 +4229,15 @@ class FeishuAdapter(BasePlatformAdapter):
if active_reply_to and not self._response_succeeded(response):
code = getattr(response, "code", None)
if code in _FEISHU_REPLY_FALLBACK_CODES:
if (metadata or {}).get("thread_id"):
logger.warning(
"[Feishu] Reply to %s failed in thread %s (code %s — message withdrawn/missing); "
"skipping top-level fallback to avoid creating a new topic",
active_reply_to,
(metadata or {}).get("thread_id"),
code,
)
return response
logger.warning(
"[Feishu] Reply to %s failed (code %s — message withdrawn/missing); "
"falling back to new message in chat %s",
+1 -1
View File
@@ -139,7 +139,7 @@ class HomeAssistantAdapter(BasePlatformAdapter):
async def _ws_connect(self) -> bool:
"""Establish WebSocket connection and authenticate."""
ws_url = self._hass_url.replace("http://", "ws://").replace("https://", "wss://")
ws_url = self._hass_url.replace("https://", "wss://").replace("http://", "ws://")
ws_url = f"{ws_url}/api/websocket"
self._session = aiohttp.ClientSession(
+16 -1
View File
@@ -243,10 +243,14 @@ class QQAdapter(BasePlatformAdapter):
return False
try:
# Tighter keepalive pool so idle CLOSE_WAIT sockets drain
# faster behind proxies like Cloudflare Warp (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(
timeout=30.0,
follow_redirects=True,
event_hooks={"response": [_ssrf_redirect_guard]},
limits=platform_httpx_limits(),
)
# 1. Get access token
@@ -393,13 +397,24 @@ class QQAdapter(BasePlatformAdapter):
await self._session.close()
self._session = None
self._session = aiohttp.ClientSession()
# Honor WSL proxy env for QQ WebSocket. Hermes upgrades overwrite this
# local patch, so QQ can regress to direct-connect timeouts after update.
self._session = aiohttp.ClientSession(trust_env=True)
ws_proxy = (
os.getenv("WSS_PROXY")
or os.getenv("wss_proxy")
or os.getenv("HTTPS_PROXY")
or os.getenv("https_proxy")
or os.getenv("ALL_PROXY")
or os.getenv("all_proxy")
)
self._ws = await self._session.ws_connect(
gateway_url,
headers={
"User-Agent": build_user_agent(),
},
timeout=CONNECT_TIMEOUT_SECONDS,
proxy=ws_proxy,
)
logger.info("[%s] WebSocket connected to %s", self._log_tag, gateway_url)
+34 -1
View File
@@ -192,6 +192,15 @@ class SignalAdapter(BasePlatformAdapter):
group_allowed_str = os.getenv("SIGNAL_GROUP_ALLOWED_USERS", "")
self.group_allow_from = set(_parse_comma_list(group_allowed_str))
# DM allowlist — mirrors SIGNAL_ALLOWED_USERS checked by run.py.
# Stored here so the reaction hooks can skip unauthorized senders
# (reactions fire before run.py's auth gate, so without this check
# every inbound DM from any contact gets a 👀 reaction).
# "*" means all users allowed (open mode); empty means no restriction
# recorded at adapter level (run.py still enforces auth separately).
dm_allowed_str = os.getenv("SIGNAL_ALLOWED_USERS", "*")
self.dm_allow_from = set(_parse_comma_list(dm_allowed_str))
# HTTP client
self.client: Optional[httpx.AsyncClient] = None
@@ -248,7 +257,9 @@ class SignalAdapter(BasePlatformAdapter):
except Exception as e:
logger.warning("Signal: Could not acquire phone lock (non-fatal): %s", e)
self.client = httpx.AsyncClient(timeout=30.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self.client = httpx.AsyncClient(timeout=30.0, limits=platform_httpx_limits())
try:
# Health check — verify signal-cli daemon is reachable
try:
@@ -1428,8 +1439,28 @@ class SignalAdapter(BasePlatformAdapter):
return None
return (author, ts)
def _reactions_enabled(self, event: "MessageEvent" = None) -> bool:
"""Check if message reactions are enabled for this event.
Two gates:
1. SIGNAL_REACTIONS env var set to false/0/no to disable globally.
2. DM allowlist if SIGNAL_ALLOWED_USERS is set, only react to
messages from senders in that list. This prevents unauthorized
contacts from seeing the 👀 reaction (which fires before run.py's
auth gate and would otherwise reveal that a bot is listening).
"""
if os.getenv("SIGNAL_REACTIONS", "true").lower() in ("false", "0", "no"):
return False
if event is not None:
sender = getattr(getattr(event, "source", None), "user_id", None)
if sender and "*" not in self.dm_allow_from and sender not in self.dm_allow_from:
return False
return True
async def on_processing_start(self, event: MessageEvent) -> None:
"""React with 👀 when processing begins."""
if not self._reactions_enabled(event):
return
target = self._extract_reaction_target(event)
if target:
await self.send_reaction(event.source.chat_id, "👀", *target)
@@ -1440,6 +1471,8 @@ class SignalAdapter(BasePlatformAdapter):
On CANCELLED we leave the 👀 in place no terminal outcome means
the reaction should keep reflecting "in progress" (matches Telegram).
"""
if not self._reactions_enabled(event):
return
if outcome == ProcessingOutcome.CANCELLED:
return
target = self._extract_reaction_target(event)
+15
View File
@@ -528,6 +528,21 @@ class SlackAdapter(BasePlatformAdapter):
return False
lock_acquired = True
# Close any previous handler before creating a new one so that
# calling connect() a second time (e.g. during a gateway restart or
# in-process reconnect attempt) does not leave a zombie Socket Mode
# connection alive. Both the old and new connections would otherwise
# receive every Slack event and dispatch it twice, producing double
# responses — the same bug that affected DiscordAdapter (#18187).
if self._handler is not None:
try:
await self._handler.close_async()
except Exception:
logger.debug("[%s] Failed to close previous Slack handler", self.name)
finally:
self._handler = None
self._app = None
# First token is the primary — used for AsyncApp / Socket Mode
primary_token = bot_tokens[0]
self._app = AsyncApp(token=primary_token)
+9 -5
View File
@@ -10,7 +10,7 @@ Shares credentials with the optional telephony skill — same env vars:
Gateway-specific env vars:
- SMS_WEBHOOK_PORT (default 8080)
- SMS_WEBHOOK_HOST (default 0.0.0.0)
- SMS_WEBHOOK_HOST (default 127.0.0.1)
- SMS_WEBHOOK_URL (public URL for Twilio signature validation required)
- SMS_INSECURE_NO_SIGNATURE (true to disable signature validation dev only)
- SMS_ALLOWED_USERS (comma-separated E.164 phone numbers)
@@ -41,7 +41,7 @@ logger = logging.getLogger(__name__)
TWILIO_API_BASE = "https://api.twilio.com/2010-04-01/Accounts"
MAX_SMS_LENGTH = 1600 # ~10 SMS segments
DEFAULT_WEBHOOK_PORT = 8080
DEFAULT_WEBHOOK_HOST = "0.0.0.0"
DEFAULT_WEBHOOK_HOST = "127.0.0.1"
def check_sms_requirements() -> bool:
@@ -91,19 +91,23 @@ class SmsAdapter(BasePlatformAdapter):
from aiohttp import web
if not self._from_number:
logger.error("[sms] TWILIO_PHONE_NUMBER not set — cannot send replies")
msg = "[sms] TWILIO_PHONE_NUMBER not set — cannot send replies"
logger.error(msg)
self._set_fatal_error("sms_missing_phone_number", msg, retryable=False)
return False
insecure_no_sig = os.getenv("SMS_INSECURE_NO_SIGNATURE", "").lower() == "true"
if not self._webhook_url and not insecure_no_sig:
logger.error(
msg = (
"[sms] Refusing to start: SMS_WEBHOOK_URL is required for Twilio "
"signature validation. Set it to the public URL configured in your "
"Twilio console (e.g. https://example.com/webhooks/twilio). "
"For local development without validation, set "
"SMS_INSECURE_NO_SIGNATURE=true (NOT recommended for production).",
"SMS_INSECURE_NO_SIGNATURE=true (NOT recommended for production)."
)
logger.error(msg)
self._set_fatal_error("sms_missing_webhook_url", msg, retryable=False)
return False
if insecure_no_sig and not self._webhook_url:
+125 -6
View File
@@ -512,6 +512,17 @@ class TelegramAdapter(BasePlatformAdapter):
self.name, attempt,
)
self._polling_network_error_count = 0
# start_polling() returning is necessary but not sufficient:
# PTB's Updater can be left in a state where `running` is True
# but the underlying long-poll task is wedged on a stale httpx
# connection and never makes progress. No error_callback fires
# in that state, so the reconnect ladder won't advance on its
# own. Schedule a deferred probe to detect the wedge and
# re-enter the ladder if needed.
if not self.has_fatal_error:
probe = asyncio.ensure_future(self._verify_polling_after_reconnect())
self._background_tasks.add(probe)
probe.add_done_callback(self._background_tasks.discard)
except Exception as retry_err:
logger.warning("[%s] Telegram polling reconnect failed: %s", self.name, retry_err)
# start_polling failed — polling is dead and no further error
@@ -523,6 +534,50 @@ class TelegramAdapter(BasePlatformAdapter):
self._background_tasks.add(task)
task.add_done_callback(self._background_tasks.discard)
async def _verify_polling_after_reconnect(self) -> None:
"""Heartbeat probe scheduled after a successful reconnect.
PTB's Updater can survive a botched stop()+start_polling() cycle
with `running=True` but a wedged consumer task. No error callback
fires, so the reconnect ladder doesn't advance on its own. This
probe detects the wedge by:
1. Sleeping HEARTBEAT_PROBE_DELAY so a healthy long-poll has time
to complete at least one cycle.
2. Verifying `Updater.running` is still True.
3. Probing the bot endpoint with a tight asyncio timeout. A
wedged httpx pool fails this probe; a healthy one returns
well under the timeout.
On any failure, re-enter the reconnect ladder so the existing
MAX_NETWORK_RETRIES path can ultimately escalate to fatal-error.
"""
HEARTBEAT_PROBE_DELAY = 60
PROBE_TIMEOUT = 10
await asyncio.sleep(HEARTBEAT_PROBE_DELAY)
if self.has_fatal_error:
return
if not (self._app and self._app.updater and self._app.updater.running):
logger.warning(
"[%s] Updater not running %ds after reconnect — treating as wedged",
self.name, HEARTBEAT_PROBE_DELAY,
)
await self._handle_polling_network_error(
RuntimeError("Updater not running after reconnect heartbeat")
)
return
try:
await asyncio.wait_for(self._app.bot.get_me(), PROBE_TIMEOUT)
except Exception as probe_err:
logger.warning(
"[%s] Polling heartbeat probe failed %ds after reconnect: %s",
self.name, HEARTBEAT_PROBE_DELAY, probe_err,
)
await self._handle_polling_network_error(probe_err)
async def _handle_polling_conflict(self, error: Exception) -> None:
if self.has_fatal_error and self.fatal_error_code == "telegram_polling_conflict":
return
@@ -633,6 +688,29 @@ class TelegramAdapter(BasePlatformAdapter):
)
return None
async def rename_dm_topic(
self,
chat_id: int,
thread_id: int,
name: str,
) -> None:
"""Rename a forum topic in a private (DM) chat."""
if not self._bot:
return
try:
chat_id_arg = int(chat_id)
except (TypeError, ValueError):
chat_id_arg = chat_id
await self._bot.edit_forum_topic(
chat_id=chat_id_arg,
message_thread_id=int(thread_id),
name=name,
)
logger.info(
"[%s] Renamed DM topic in chat %s thread_id=%s -> '%s'",
self.name, chat_id, thread_id, name,
)
def _persist_dm_topic_thread_id(self, chat_id: int, topic_name: str, thread_id: int) -> None:
"""Save a newly created thread_id back into config.yaml so it persists across restarts."""
try:
@@ -2212,13 +2290,54 @@ class TelegramAdapter(BasePlatformAdapter):
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
logger.error(
"[%s] Failed to send Telegram local image, falling back to base adapter: %s",
self.name,
e,
exc_info=True,
error_str = str(e)
# Dimension-related errors are the expected case for valid image
# files that Telegram just refuses as photos (screenshots, extreme
# aspect ratios). Log at INFO because the document fallback is
# the correct path. Any other send_photo failure also falls back
# to document (rate limits, corrupt file markers, format edge
# cases), but at WARNING because it's unexpected and worth
# surfacing in logs.
is_dim_error = (
"Photo_invalid_dimensions" in error_str
or "PHOTO_INVALID_DIMENSIONS" in error_str
)
return await super().send_image_file(chat_id, image_path, caption, reply_to)
if is_dim_error:
logger.info(
"[%s] Image dimensions exceed Telegram photo limits, "
"sending as document: %s",
self.name,
image_path,
)
else:
logger.warning(
"[%s] Failed to send Telegram local image as photo, "
"trying document fallback: %s",
self.name,
e,
exc_info=True,
)
# Fallback to sending as document (file) — no dimension limit,
# only 50MB size limit. If even that fails, fall back to the
# base adapter's text-only "Image: /path" rendering.
try:
return await self.send_document(
chat_id=chat_id,
file_path=image_path,
caption=caption,
file_name=os.path.basename(image_path),
reply_to=reply_to,
metadata=metadata,
)
except Exception as doc_err:
logger.error(
"[%s] Failed to send Telegram local image as document, "
"falling back to base adapter: %s",
self.name,
doc_err,
exc_info=True,
)
return await super().send_image_file(chat_id, image_path, caption, reply_to)
async def send_document(
self,
+6 -1
View File
@@ -142,6 +142,7 @@ class WeComAdapter(BasePlatformAdapter):
"""WeCom AI Bot adapter backed by a persistent WebSocket connection."""
MAX_MESSAGE_LENGTH = MAX_MESSAGE_LENGTH
SUPPORTS_MESSAGE_EDITING = False
# Threshold for detecting WeCom client-side message splits.
# When a chunk is near the 4000-char limit, a continuation is almost certain.
_SPLIT_THRESHOLD = 3900
@@ -206,7 +207,11 @@ class WeComAdapter(BasePlatformAdapter):
return False
try:
self._http_client = httpx.AsyncClient(timeout=30.0, follow_redirects=True)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(
timeout=30.0, follow_redirects=True, limits=platform_httpx_limits(),
)
await self._open_connection()
self._mark_connected()
self._listen_task = asyncio.create_task(self._listen_loop())
+3 -1
View File
@@ -119,7 +119,9 @@ class WecomCallbackAdapter(BasePlatformAdapter):
pass
try:
self._http_client = httpx.AsyncClient(timeout=20.0)
# Tighter keepalive so idle CLOSE_WAIT drains promptly (#18451).
from gateway.platforms._http_client_limits import platform_httpx_limits
self._http_client = httpx.AsyncClient(timeout=20.0, limits=platform_httpx_limits())
self._app = web.Application()
self._app.router.add_get("/health", self._handle_health)
self._app.router.add_get(self._path, self._handle_verify)
+12 -3
View File
@@ -1333,6 +1333,15 @@ class WeixinAdapter(BasePlatformAdapter):
if message_id and self._dedup.is_duplicate(message_id):
return
# Secondary content-fingerprint dedup for text messages
item_list = message.get("item_list") or []
text = _extract_text(item_list)
if text:
content_key = f"content:{sender_id}:{hashlib.md5(text.encode()).hexdigest()}"
if self._dedup.is_duplicate(content_key):
logger.debug("[%s] Content-dedup: skipping duplicate message from %s", self.name, sender_id)
return
chat_type, effective_chat_id = _guess_chat_type(message, self._account_id)
if chat_type == "group":
if self._group_policy == "disabled":
@@ -1347,8 +1356,6 @@ class WeixinAdapter(BasePlatformAdapter):
self._token_store.set(self._account_id, sender_id, context_token)
asyncio.create_task(self._maybe_fetch_typing_ticket(sender_id, context_token or None))
item_list = message.get("item_list") or []
text = _extract_text(item_list)
media_paths: List[str] = []
media_types: List[str] = []
@@ -2030,7 +2037,9 @@ async def send_weixin_direct(
live_adapter = _LIVE_ADAPTERS.get(resolved_token)
send_session = getattr(live_adapter, '_send_session', None)
if live_adapter is not None and send_session is not None and not send_session.closed:
if (live_adapter is not None and send_session is not None
and not send_session.closed
and send_session._loop is asyncio.get_running_loop()):
last_result: Optional[SendResult] = None
cleaned = live_adapter.format_message(message)
if cleaned:
+32 -2
View File
@@ -185,6 +185,13 @@ class WhatsAppAdapter(BasePlatformAdapter):
self._bridge_log: Optional[Path] = None
self._poll_task: Optional[asyncio.Task] = None
self._http_session: Optional["aiohttp.ClientSession"] = None
# Set to True by disconnect() before we SIGTERM our child bridge so
# _check_managed_bridge_exit() can distinguish an intentional
# shutdown-time exit (returncode -15 / -2 / 0) from a real crash.
# Without this, every graceful gateway shutdown/restart would log
# "Fatal whatsapp adapter error" plus dispatch a fatal-error
# notification before the normal "✓ whatsapp disconnected" fires.
self._shutting_down: bool = False
def _whatsapp_require_mention(self) -> bool:
configured = self.config.extra.get("require_mention")
@@ -555,6 +562,21 @@ class WhatsAppAdapter(BasePlatformAdapter):
if returncode is None:
return None
# Planned shutdown: disconnect() sets _shutting_down before it sends
# SIGTERM to the bridge, so a returncode of -15 (SIGTERM), -2 (SIGINT),
# or 0 (clean exit) at that point is expected, not a crash. Treat it
# as informational and skip the fatal-error path.
# getattr-with-default keeps tests that construct the adapter via
# ``WhatsAppAdapter.__new__`` (bypassing __init__) working without
# every _make_adapter() helper having to seed the attribute.
if getattr(self, "_shutting_down", False) and returncode in (0, -2, -15):
logger.info(
"[%s] Bridge exited during shutdown (code %d).",
self.name,
returncode,
)
return None
message = f"WhatsApp bridge process exited unexpectedly (code {returncode})."
if not self.has_fatal_error:
logger.error("[%s] %s", self.name, message)
@@ -565,6 +587,10 @@ class WhatsAppAdapter(BasePlatformAdapter):
async def disconnect(self) -> None:
"""Stop the WhatsApp bridge and clean up any orphaned processes."""
# Flip the shutdown flag BEFORE signalling the child so the exit-check
# path (which runs from other tasks like send() and the poll loop)
# doesn't race us and report the intentional termination as fatal.
self._shutting_down = True
if self._bridge_process:
try:
try:
@@ -876,11 +902,15 @@ class WhatsAppAdapter(BasePlatformAdapter):
try:
import aiohttp
await self._http_session.post(
# Must wrap in `async with` — a bare `await session.post(...)`
# leaves the response object alive until GC, holding its TCP
# socket in CLOSE_WAIT. See #18451.
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/typing",
json={"chatId": chat_id},
timeout=aiohttp.ClientTimeout(total=5)
)
):
pass
except Exception:
pass # Ignore typing indicator failures
+1366 -292
View File
File diff suppressed because it is too large Load Diff
+19 -14
View File
@@ -1086,19 +1086,22 @@ class SessionStore:
return len(removed_keys)
def suspend_recently_active(self, max_age_seconds: int = 120) -> int:
"""Mark recently-active sessions as suspended.
"""Mark recently-active sessions as resumable after an unexpected exit.
Called on gateway startup to prevent sessions that were likely
in-flight when the gateway last exited from being blindly resumed
(#7536). Only suspends sessions updated within *max_age_seconds*
to avoid resetting long-idle sessions that are harmless to resume.
Returns the number of sessions that were suspended.
Called on gateway startup after a crash or fast restart to preserve
in-flight sessions instead of destroying their conversation history
(#7536). Only marks sessions updated within *max_age_seconds* to
avoid touching long-idle sessions. Sets ``resume_pending=True`` so
the next incoming message on the same session_key auto-resumes from
the existing transcript.
Entries flagged ``resume_pending=True`` are skipped those were
marked intentionally by the drain-timeout path as recoverable.
Terminal escalation for genuinely stuck ``resume_pending`` sessions
is handled by the existing ``.restart_failure_counts`` stuck-loop
counter, which runs after this method on startup.
Entries already flagged ``resume_pending=True`` are skipped. Entries
explicitly ``suspended=True`` (from /stop or stuck-loop escalation)
are also skipped. Terminal escalation for genuinely stuck sessions
is still handled by the existing ``.restart_failure_counts`` counter
(threshold 3), which runs after this method and sets ``suspended=True``.
Returns the number of sessions marked resumable.
"""
from datetime import timedelta
@@ -1110,13 +1113,15 @@ class SessionStore:
if entry.resume_pending:
continue
if not entry.suspended and entry.updated_at >= cutoff:
entry.suspended = True
entry.resume_pending = True
entry.resume_reason = "restart_interrupted"
entry.last_resume_marked_at = _now()
count += 1
if count:
self._save()
return count
def reset_session(self, session_key: str) -> Optional[SessionEntry]:
def reset_session(self, session_key: str, display_name: Optional[str] = None) -> Optional[SessionEntry]:
"""Force reset a session, creating a new session ID."""
db_end_session_id = None
db_create_kwargs = None
@@ -1140,7 +1145,7 @@ class SessionStore:
created_at=now,
updated_at=now,
origin=old_entry.origin,
display_name=old_entry.display_name,
display_name=display_name if display_name is not None else old_entry.display_name,
platform=old_entry.platform,
chat_type=old_entry.chat_type,
is_fresh_reset=True,
+107 -51
View File
@@ -637,6 +637,8 @@ def release_all_scoped_locks(
_TAKEOVER_MARKER_FILENAME = ".gateway-takeover.json"
_TAKEOVER_MARKER_TTL_S = 60 # Marker older than this is treated as stale
_PLANNED_STOP_MARKER_FILENAME = ".gateway-planned-stop.json"
_PLANNED_STOP_MARKER_TTL_S = 60
def _get_takeover_marker_path() -> Path:
@@ -645,6 +647,67 @@ def _get_takeover_marker_path() -> Path:
return home / _TAKEOVER_MARKER_FILENAME
def _get_planned_stop_marker_path() -> Path:
"""Return the path to the intentional gateway stop marker file."""
home = get_hermes_home()
return home / _PLANNED_STOP_MARKER_FILENAME
def _marker_is_stale(written_at: str, ttl_s: int) -> bool:
try:
written_dt = datetime.fromisoformat(written_at)
age = (datetime.now(timezone.utc) - written_dt).total_seconds()
return age > ttl_s
except (TypeError, ValueError):
return True
def _consume_pid_marker_for_self(
path: Path,
*,
pid_field: str,
start_time_field: str,
ttl_s: int,
) -> bool:
record = _read_json_file(path)
if not record:
return False
try:
target_pid = int(record[pid_field])
target_start_time = record.get(start_time_field)
written_at = record.get("written_at") or ""
except (KeyError, TypeError, ValueError):
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
if _marker_is_stale(written_at, ttl_s):
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
our_pid = os.getpid()
our_start_time = _get_process_start_time(our_pid)
matches = (
target_pid == our_pid
and target_start_time is not None
and our_start_time is not None
and target_start_time == our_start_time
)
try:
path.unlink(missing_ok=True)
except OSError:
pass
return matches
def write_takeover_marker(target_pid: int) -> bool:
"""Record that ``target_pid`` is being replaced by the current process.
@@ -681,59 +744,13 @@ def consume_takeover_marker_for_self() -> bool:
Always unlinks the marker on match (and on detected staleness) so
subsequent unrelated signals don't re-trigger.
"""
path = _get_takeover_marker_path()
record = _read_json_file(path)
if not record:
return False
# Any malformed or stale marker → drop it and return False
try:
target_pid = int(record["target_pid"])
target_start_time = record.get("target_start_time")
written_at = record.get("written_at") or ""
except (KeyError, TypeError, ValueError):
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
# TTL guard: a stale marker older than _TAKEOVER_MARKER_TTL_S is ignored.
stale = False
try:
written_dt = datetime.fromisoformat(written_at)
age = (datetime.now(timezone.utc) - written_dt).total_seconds()
if age > _TAKEOVER_MARKER_TTL_S:
stale = True
except (TypeError, ValueError):
stale = True # Unparseable timestamp — treat as stale
if stale:
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
# Does the marker name THIS process?
our_pid = os.getpid()
our_start_time = _get_process_start_time(our_pid)
matches = (
target_pid == our_pid
and target_start_time is not None
and our_start_time is not None
and target_start_time == our_start_time
return _consume_pid_marker_for_self(
_get_takeover_marker_path(),
pid_field="target_pid",
start_time_field="target_start_time",
ttl_s=_TAKEOVER_MARKER_TTL_S,
)
# Consume the marker whether it matched or not — a marker that doesn't
# match our identity is stale-for-us anyway.
try:
path.unlink(missing_ok=True)
except OSError:
pass
return matches
def clear_takeover_marker() -> None:
"""Remove the takeover marker unconditionally. Safe to call repeatedly."""
@@ -743,6 +760,45 @@ def clear_takeover_marker() -> None:
pass
def write_planned_stop_marker(target_pid: int) -> bool:
"""Record that ``target_pid`` is being stopped intentionally.
The gateway exits non-zero for unexpected SIGTERM so service managers can
revive it. Service stop commands send the same SIGTERM, so the CLI writes
this short-lived marker first to let the target process exit cleanly.
"""
try:
target_start_time = _get_process_start_time(target_pid)
record = {
"target_pid": target_pid,
"target_start_time": target_start_time,
"stopper_pid": os.getpid(),
"written_at": _utc_now_iso(),
}
_write_json_file(_get_planned_stop_marker_path(), record)
return True
except (OSError, PermissionError):
return False
def consume_planned_stop_marker_for_self() -> bool:
"""Return True when the current process is being intentionally stopped."""
return _consume_pid_marker_for_self(
_get_planned_stop_marker_path(),
pid_field="target_pid",
start_time_field="target_start_time",
ttl_s=_PLANNED_STOP_MARKER_TTL_S,
)
def clear_planned_stop_marker() -> None:
"""Remove the planned-stop marker unconditionally."""
try:
_get_planned_stop_marker_path().unlink(missing_ok=True)
except OSError:
pass
def get_running_pid(
pid_path: Optional[Path] = None,
*,
+33 -1
View File
@@ -5,11 +5,43 @@ Provides subcommands for:
- hermes chat - Interactive chat (same as ./hermes)
- hermes gateway - Run gateway in foreground
- hermes gateway start - Start gateway service
- hermes gateway stop - Stop gateway service
- hermes gateway stop - Stop gateway service
- hermes setup - Interactive setup wizard
- hermes status - Show status of all components
- hermes cron - Manage cron jobs
"""
import os
import sys
__version__ = "0.12.0"
__release_date__ = "2026.4.30"
def _ensure_utf8():
"""Force UTF-8 stdout/stderr on Windows to prevent UnicodeEncodeError.
Windows services and terminals default to cp1252, which cannot encode
box-drawing characters used in CLI output. This causes unhandled
UnicodeEncodeError crashes on gateway startup.
"""
if sys.platform != "win32":
return
os.environ.setdefault("PYTHONUTF8", "1")
os.environ.setdefault("PYTHONIOENCODING", "utf-8")
for stream_name in ("stdout", "stderr"):
stream = getattr(sys, stream_name, None)
if stream is None:
continue
try:
if getattr(stream, "encoding", "").lower().replace("-", "") != "utf8":
new_stream = open(
stream.fileno(), "w", encoding="utf-8",
buffering=1, closefd=False,
)
setattr(sys, stream_name, new_stream)
except (AttributeError, OSError):
pass
_ensure_utf8()
+263 -13
View File
@@ -2589,6 +2589,208 @@ def _poll_for_token(
# Nous Portal — token refresh, agent key minting, model discovery
# =============================================================================
# -----------------------------------------------------------------------------
# Shared Nous token store — lets OAuth credentials persist across profiles
# so a new `hermes --profile <name> auth add nous --type oauth` can one-tap
# import instead of running the full device-code flow every time.
#
# File lives at ${HERMES_SHARED_AUTH_DIR}/nous_auth.json, defaulting to
# ~/.hermes/shared/nous_auth.json. It is OUTSIDE any named profile's
# HERMES_HOME so named profiles (which typically live under
# ~/.hermes/profiles/<name>/) all see the same file.
#
# Written on successful login and on every runtime refresh so the stored
# refresh_token stays current even if one profile refreshes and rotates it.
# If ever the stored refresh_token does go stale server-side, import fails
# gracefully and the user falls back to the normal device-code flow.
# -----------------------------------------------------------------------------
NOUS_SHARED_STORE_FILENAME = "nous_auth.json"
def _nous_shared_auth_dir() -> Path:
"""Resolve the directory that holds the shared Nous token store.
Honors ``HERMES_SHARED_AUTH_DIR`` so tests can redirect it to a tmp
path without touching the real user's home. Defaults to
``~/.hermes/shared/``.
"""
override = os.getenv("HERMES_SHARED_AUTH_DIR", "").strip()
if override:
return Path(override).expanduser()
return Path.home() / ".hermes" / "shared"
def _nous_shared_store_path() -> Path:
path = _nous_shared_auth_dir() / NOUS_SHARED_STORE_FILENAME
# Seat belt: if pytest is running and this resolves to a path under the
# real user's home, refuse rather than silently corrupt cross-profile
# state. Tests must set HERMES_SHARED_AUTH_DIR to a tmp_path (conftest
# does not do this automatically — mirror the _auth_file_path() guard
# so forgetting to set it fails loudly instead of writing to the real
# shared store).
if os.environ.get("PYTEST_CURRENT_TEST"):
real_home_shared = (
Path.home() / ".hermes" / "shared" / NOUS_SHARED_STORE_FILENAME
).resolve(strict=False)
try:
resolved = path.resolve(strict=False)
except Exception:
resolved = path
if resolved == real_home_shared:
raise RuntimeError(
f"Refusing to touch real user shared Nous auth store during test run: "
f"{path}. Set HERMES_SHARED_AUTH_DIR to a tmp_path in your test fixture."
)
return path
def _write_shared_nous_state(state: Dict[str, Any]) -> None:
"""Persist a minimal copy of the Nous OAuth state to the shared store.
Best-effort: any failure is swallowed after logging. The shared store
is a convenience layer; the per-profile auth.json remains the source
of truth.
We deliberately omit the short-lived ``agent_key`` (24h TTL, profile-
specific) only the long-lived OAuth tokens are cross-profile useful.
"""
refresh_token = state.get("refresh_token")
access_token = state.get("access_token")
if not (isinstance(refresh_token, str) and refresh_token.strip()):
# No refresh_token = nothing worth sharing across profiles
return
if not (isinstance(access_token, str) and access_token.strip()):
return
shared = {
"_schema": 1,
"access_token": access_token,
"refresh_token": refresh_token,
"token_type": state.get("token_type") or "Bearer",
"scope": state.get("scope") or DEFAULT_NOUS_SCOPE,
"client_id": state.get("client_id") or DEFAULT_NOUS_CLIENT_ID,
"portal_base_url": state.get("portal_base_url") or DEFAULT_NOUS_PORTAL_URL,
"inference_base_url": state.get("inference_base_url") or DEFAULT_NOUS_INFERENCE_URL,
"obtained_at": state.get("obtained_at"),
"expires_at": state.get("expires_at"),
"updated_at": datetime.now(timezone.utc).isoformat(),
}
try:
path = _nous_shared_store_path()
path.parent.mkdir(parents=True, exist_ok=True)
tmp = path.with_suffix(path.suffix + ".tmp")
tmp.write_text(json.dumps(shared, indent=2, sort_keys=True))
try:
os.chmod(tmp, 0o600)
except OSError:
pass
os.replace(tmp, path)
_oauth_trace(
"nous_shared_store_written",
path=str(path),
refresh_token_fp=_token_fingerprint(refresh_token),
)
except Exception as exc:
logger.debug("Failed to write shared Nous auth store: %s", exc)
def _read_shared_nous_state() -> Optional[Dict[str, Any]]:
"""Return the shared Nous OAuth state if present and well-formed.
Returns ``None`` when the file is missing, unreadable, malformed, or
lacks required fields. Callers should treat ``None`` as "no shared
credentials available fall through to device-code".
"""
try:
path = _nous_shared_store_path()
except RuntimeError:
# Test seat belt tripped — treat as missing
return None
if not path.is_file():
return None
try:
payload = json.loads(path.read_text())
except (OSError, ValueError) as exc:
logger.debug("Shared Nous auth store at %s is unreadable: %s", path, exc)
return None
if not isinstance(payload, dict):
return None
refresh_token = payload.get("refresh_token")
access_token = payload.get("access_token")
if not (isinstance(refresh_token, str) and refresh_token.strip()):
return None
if not (isinstance(access_token, str) and access_token.strip()):
return None
return payload
def _try_import_shared_nous_state(
*,
timeout_seconds: float = 15.0,
min_key_ttl_seconds: int = 5 * 60,
) -> Optional[Dict[str, Any]]:
"""Attempt to rehydrate Nous OAuth state from the shared store.
Reads the shared file (if present), runs a forced refresh+mint using
the stored refresh_token to produce a fresh access_token + agent_key
scoped to this profile, and returns the full auth_state dict ready
for ``persist_nous_credentials()``.
Returns ``None`` when no shared state is available or the rehydrate
fails for any reason (expired refresh_token, portal unreachable,
etc.) caller should then fall through to the normal device-code
flow.
"""
shared = _read_shared_nous_state()
if not shared:
return None
# Build a full state dict so refresh_nous_oauth_from_state has every
# field it needs. force_refresh=True gets us a fresh access_token
# for this profile; force_mint=True gets us a fresh agent_key.
state: Dict[str, Any] = {
"access_token": shared.get("access_token"),
"refresh_token": shared.get("refresh_token"),
"client_id": shared.get("client_id") or DEFAULT_NOUS_CLIENT_ID,
"portal_base_url": shared.get("portal_base_url") or DEFAULT_NOUS_PORTAL_URL,
"inference_base_url": shared.get("inference_base_url") or DEFAULT_NOUS_INFERENCE_URL,
"token_type": shared.get("token_type") or "Bearer",
"scope": shared.get("scope") or DEFAULT_NOUS_SCOPE,
"obtained_at": shared.get("obtained_at"),
"expires_at": shared.get("expires_at"),
"agent_key": None,
"agent_key_expires_at": None,
"tls": {"insecure": False, "ca_bundle": None},
}
try:
refreshed = refresh_nous_oauth_from_state(
state,
min_key_ttl_seconds=min_key_ttl_seconds,
timeout_seconds=timeout_seconds,
force_refresh=True,
force_mint=True,
)
except AuthError as exc:
_oauth_trace(
"nous_shared_import_failed",
error_type=type(exc).__name__,
error_code=getattr(exc, "code", None),
)
logger.debug("Shared Nous import failed: %s", exc)
return None
except Exception as exc:
_oauth_trace(
"nous_shared_import_failed",
error_type=type(exc).__name__,
)
logger.debug("Shared Nous import failed: %s", exc)
return None
return refreshed
def _refresh_access_token(
*,
client: httpx.Client,
@@ -2991,6 +3193,12 @@ def persist_nous_credentials(
_save_provider_state(auth_store, "nous", state)
_save_auth_store(auth_store)
# Mirror to the shared store so a new profile can one-tap import
# these credentials via `hermes auth add nous --type oauth`. Best-
# effort: any I/O failure is logged and swallowed (the per-profile
# auth.json is still the source of truth).
_write_shared_nous_state(state)
pool = load_pool("nous")
return next(
(e for e in pool.entries() if e.source == NOUS_DEVICE_CODE_SOURCE),
@@ -3059,6 +3267,11 @@ def resolve_nous_runtime_credentials(
refresh_token_fp=_token_fingerprint(state.get("refresh_token")),
access_token_fp=_token_fingerprint(state.get("access_token")),
)
# Mirror post-refresh state to the shared store so sibling
# profiles don't hold stale refresh_tokens after rotation.
# Best-effort — any failure is logged and swallowed inside
# _write_shared_nous_state.
_write_shared_nous_state(state)
verify = _resolve_verify(insecure=insecure, ca_bundle=ca_bundle, auth_state=state)
timeout = httpx.Timeout(timeout_seconds if timeout_seconds else 15.0)
@@ -4283,7 +4496,8 @@ def _minimax_oauth_login(
print(f"Portal: {portal_base_url}")
with httpx.Client(timeout=httpx.Timeout(timeout_seconds),
headers={"Accept": "application/json"}) as client:
headers={"Accept": "application/json"},
follow_redirects=True) as client:
code_data = _minimax_request_user_code(
client, portal_base_url=portal_base_url,
client_id=pconfig.client_id,
@@ -4360,7 +4574,8 @@ def _refresh_minimax_oauth_state(
return state
portal_base_url = state["portal_base_url"]
with httpx.Client(timeout=httpx.Timeout(timeout_seconds)) as client:
with httpx.Client(timeout=httpx.Timeout(timeout_seconds),
follow_redirects=True) as client:
response = client.post(
f"{portal_base_url}/oauth/token",
data={
@@ -4598,17 +4813,47 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
)
try:
auth_state = _nous_device_code_login(
portal_base_url=getattr(args, "portal_url", None),
inference_base_url=getattr(args, "inference_url", None),
client_id=getattr(args, "client_id", None) or pconfig.client_id,
scope=getattr(args, "scope", None) or pconfig.scope,
open_browser=not getattr(args, "no_browser", False),
timeout_seconds=timeout_seconds,
insecure=insecure,
ca_bundle=ca_bundle,
min_key_ttl_seconds=5 * 60,
)
auth_state = None
# Codex-style auto-import: before launching a fresh device-code
# flow, check the shared store for an existing Nous credential
# from any other profile. If present, offer to rehydrate it.
shared = _read_shared_nous_state()
if shared:
try:
shared_path = _nous_shared_store_path()
except RuntimeError:
shared_path = None
print()
if shared_path:
print(f"Found existing Nous OAuth credentials at {shared_path}")
else:
print("Found existing shared Nous OAuth credentials")
try:
do_import = input("Import these credentials? [Y/n]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
do_import = "y"
if do_import in ("", "y", "yes"):
print("Rehydrating Nous session from shared credentials...")
auth_state = _try_import_shared_nous_state(
timeout_seconds=timeout_seconds,
min_key_ttl_seconds=5 * 60,
)
if auth_state is None:
print("Could not refresh shared credentials — falling back to device-code login.")
if auth_state is None:
auth_state = _nous_device_code_login(
portal_base_url=getattr(args, "portal_url", None),
inference_base_url=getattr(args, "inference_url", None),
client_id=getattr(args, "client_id", None) or pconfig.client_id,
scope=getattr(args, "scope", None) or pconfig.scope,
open_browser=not getattr(args, "no_browser", False),
timeout_seconds=timeout_seconds,
insecure=insecure,
ca_bundle=ca_bundle,
min_key_ttl_seconds=5 * 60,
)
inference_base_url = auth_state["inference_base_url"]
@@ -4625,6 +4870,11 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
_save_provider_state(auth_store, "nous", auth_state)
saved_to = _save_auth_store(auth_store)
# Mirror to the shared store so other profiles can one-tap import
# these credentials. Best-effort: any I/O failure is logged and
# swallowed inside the helper.
_write_shared_nous_state(auth_state)
print()
print("Login successful!")
print(f" Auth state: {saved_to}")
+41
View File
@@ -245,6 +245,47 @@ def auth_add_command(args) -> None:
return
if provider == "nous":
# Codex-style auto-import: if a shared Nous credential lives at
# ~/.hermes/shared/nous_auth.json (written by any previous
# successful login), offer to import it instead of running the
# full device-code flow. This makes `hermes --profile <name>
# auth add nous --type oauth` a one-tap operation for users who
# run multiple profiles.
shared = auth_mod._read_shared_nous_state()
if shared:
try:
path = auth_mod._nous_shared_store_path()
except RuntimeError:
path = None
print()
if path:
print(f"Found existing Nous OAuth credentials at {path}")
else:
print("Found existing shared Nous OAuth credentials")
try:
do_import = input("Import these credentials? [Y/n]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
do_import = "y"
if do_import in ("", "y", "yes"):
print("Rehydrating Nous session from shared credentials...")
rehydrated = auth_mod._try_import_shared_nous_state(
timeout_seconds=getattr(args, "timeout", None) or 15.0,
min_key_ttl_seconds=max(
60, int(getattr(args, "min_key_ttl_seconds", 5 * 60))
),
)
if rehydrated is not None:
custom_label = (getattr(args, "label", None) or "").strip() or None
entry = auth_mod.persist_nous_credentials(rehydrated, label=custom_label)
shown_label = entry.label if entry is not None else label_from_token(
rehydrated.get("access_token", ""), _oauth_default_label(provider, 1),
)
print(f'Imported {provider} OAuth credentials: "{shown_label}"')
return
# Rehydrate failed (expired refresh_token, portal down, etc.)
# — fall through to device-code flow.
print("Could not refresh shared credentials — falling back to device-code login.")
creds = auth_mod._nous_device_code_login(
portal_base_url=getattr(args, "portal_url", None),
inference_base_url=getattr(args, "inference_url", None),
+15 -2
View File
@@ -61,6 +61,9 @@ _EXCLUDED_NAMES = {
"cron.pid",
}
# zipfile.open() drops Unix mode bits on extract; restore tightens these to 0600.
_SECRET_FILE_NAMES = {".env", "auth.json", "state.db"}
def _should_exclude(rel_path: Path) -> bool:
"""Return True if *rel_path* (relative to hermes root) should be skipped."""
@@ -381,6 +384,8 @@ def run_import(args) -> None:
target.parent.mkdir(parents=True, exist_ok=True)
with zf.open(member) as src, open(target, "wb") as dst:
dst.write(src.read())
if target.name in _SECRET_FILE_NAMES:
os.chmod(target, 0o600)
restored += 1
except (PermissionError, OSError) as exc:
errors.append(f" {rel}: {exc}")
@@ -788,9 +793,17 @@ def _prune_pre_update_backups(backup_dir: Path, keep: int) -> int:
Returns the number of files deleted. Only touches files matching
``pre-update-*.zip`` so hand-made zips dropped in the same directory
are never touched.
``keep`` is floored to 1 because this helper is only called immediately
after a fresh backup is written: deleting that backup right after the
user paid the disk/CPU cost to create it would leave them worse off
than no backup at all (and the wrapper in ``main.py`` would still print
a misleading ``Saved: <path>`` line for a file that no longer exists).
Operators who genuinely don't want a backup should set
``updates.pre_update_backup: false`` in config that gates creation.
"""
if keep < 0:
keep = 0
if keep < 1:
keep = 1
if not backup_dir.exists():
return 0
+161 -61
View File
@@ -10,6 +10,7 @@ To add an alias: set ``aliases=("short",)`` on the existing ``CommandDef``.
from __future__ import annotations
import logging
import os
import re
import shutil
@@ -21,6 +22,8 @@ from typing import Any
from utils import is_truthy_value
logger = logging.getLogger(__name__)
# prompt_toolkit is an optional CLI dependency — only needed for
# SlashCommandCompleter and SlashCommandAutoSuggest. Gateway and test
# environments that lack it must still be able to import this module
@@ -61,7 +64,9 @@ class CommandDef:
COMMAND_REGISTRY: list[CommandDef] = [
# Session
CommandDef("new", "Start a new session (fresh session ID + history)", "Session",
aliases=("reset",)),
aliases=("reset",), args_hint="[name]"),
CommandDef("topic", "Enable or inspect Telegram DM topic sessions", "Session",
gateway_only=True, args_hint="[off|help|session-id]"),
CommandDef("clear", "Clear screen and start a new session", "Session",
cli_only=True),
CommandDef("redraw", "Force a full UI repaint (recovers from terminal drift)", "Session",
@@ -396,6 +401,11 @@ def _is_gateway_available(cmd: CommandDef, config_overrides: set[str] | None = N
return False
def _requires_argument(args_hint: str) -> bool:
"""Return True when selecting a command without text would be incomplete."""
return args_hint.strip().startswith("<")
def gateway_help_lines() -> list[str]:
"""Generate gateway help text lines from the registry."""
overrides = _resolve_config_gates()
@@ -452,7 +462,9 @@ def telegram_bot_commands() -> list[tuple[str, str]]:
Telegram command names cannot contain hyphens, so they are replaced with
underscores. Aliases are skipped -- Telegram shows one menu entry per
canonical command.
canonical command. Commands that require arguments are skipped because
selecting a Telegram BotCommand sends only ``/command`` and would execute
an incomplete command.
Plugin-registered slash commands are included so plugins get native
autocomplete in Telegram without touching core code.
@@ -462,10 +474,14 @@ def telegram_bot_commands() -> list[tuple[str, str]]:
for cmd in COMMAND_REGISTRY:
if not _is_gateway_available(cmd, overrides):
continue
if _requires_argument(cmd.args_hint):
continue
tg_name = _sanitize_telegram_name(cmd.name)
if tg_name:
result.append((tg_name, cmd.description))
for name, description, _args_hint in _iter_plugin_command_entries():
for name, description, args_hint in _iter_plugin_command_entries():
if _requires_argument(args_hint):
continue
tg_name = _sanitize_telegram_name(name)
if tg_name:
result.append((tg_name, description))
@@ -499,9 +515,9 @@ def _sanitize_telegram_name(raw: str) -> str:
def _clamp_command_names(
entries: list[tuple[str, str]],
entries: list[tuple[str, ...]],
reserved: set[str],
) -> list[tuple[str, str]]:
) -> list[tuple[str, ...]]:
"""Enforce 32-char command name limit with collision avoidance.
Both Telegram and Discord cap slash command names at 32 characters.
@@ -509,10 +525,15 @@ def _clamp_command_names(
(against *reserved* names or earlier entries in the same batch), the name is
shortened to 31 chars and a digit ``0``-``9`` is appended to differentiate.
If all 10 digit slots are taken the entry is silently dropped.
Accepts tuples of any length >= 2. Extra elements beyond ``(name, desc)``
(e.g. ``cmd_key``) are passed through unchanged, so callers can attach
metadata that survives the rename.
"""
used: set[str] = set(reserved)
result: list[tuple[str, str]] = []
for name, desc in entries:
result: list[tuple] = []
for entry in entries:
name, desc, *extra = entry
if len(name) > _CMD_NAME_LIMIT:
candidate = name[:_CMD_NAME_LIMIT]
if candidate in used:
@@ -528,7 +549,7 @@ def _clamp_command_names(
if name in used:
continue
used.add(name)
result.append((name, desc))
result.append((name, desc, *extra))
return result
@@ -611,13 +632,26 @@ def _collect_gateway_skill_entries(
try:
from agent.skill_commands import get_skill_commands
from tools.skills_tool import SKILLS_DIR
from agent.skill_utils import get_external_skills_dirs
_skills_dir = str(SKILLS_DIR.resolve())
_hub_dir = str((SKILLS_DIR / ".hub").resolve())
_hub_dir = str((SKILLS_DIR / ".hub").resolve()).rstrip("/") + "/"
# Build set of allowed directory prefixes: local skills dir + any
# user-configured ``skills.external_dirs``. Ensure each prefix ends
# with ``/`` so ``/my-skills`` does not also match ``/my-skills-extra``.
# Without this widening, external skills are visible in
# ``hermes skills list`` and the agent's ``/skill-name`` dispatch but
# silently excluded from gateway slash menus (#8110).
_allowed_prefixes = [_skills_dir.rstrip("/") + "/"]
_allowed_prefixes.extend(
str(d).rstrip("/") + "/" for d in get_external_skills_dirs()
)
skill_cmds = get_skill_commands()
for cmd_key in sorted(skill_cmds):
info = skill_cmds[cmd_key]
skill_path = info.get("skill_md_path", "")
if not skill_path.startswith(_skills_dir):
if not skill_path:
continue
if not any(skill_path.startswith(prefix) for prefix in _allowed_prefixes):
continue
if skill_path.startswith(_hub_dir):
continue
@@ -635,17 +669,15 @@ def _collect_gateway_skill_entries(
except Exception:
pass
# Clamp names; _clamp_command_names works on (name, desc) pairs so we
# need to zip/unzip.
skill_pairs = [(n, d) for n, d, _ in skill_triples]
key_by_pair = {(n, d): k for n, d, k in skill_triples}
skill_pairs = _clamp_command_names(skill_pairs, reserved_names)
# Clamp names; cmd_key is passed through as extra payload so it survives
# any clamp-induced renames.
skill_triples = _clamp_command_names(skill_triples, reserved_names)
# Skills fill remaining slots — only tier that gets trimmed
remaining = max(0, max_slots - len(all_entries))
hidden_count = max(0, len(skill_pairs) - remaining)
for n, d in skill_pairs[:remaining]:
all_entries.append((n, d, key_by_pair.get((n, d), "")))
hidden_count = max(0, len(skill_triples) - remaining)
for n, d, k in skill_triples[:remaining]:
all_entries.append((n, d, k))
return all_entries[:max_slots], hidden_count
@@ -721,24 +753,40 @@ def discord_skill_commands(
def discord_skill_commands_by_category(
reserved_names: set[str],
) -> tuple[dict[str, list[tuple[str, str, str]]], list[tuple[str, str, str]], int]:
"""Return skill entries organized by category for Discord ``/skill`` subcommand groups.
"""Return skill entries organized by category for Discord ``/skill`` autocomplete.
Skills whose directory is nested at least 2 levels under ``SKILLS_DIR``
Skills whose directory is nested at least 2 levels under a scan root
(e.g. ``creative/ascii-art/SKILL.md``) are grouped by their top-level
category. Root-level skills (e.g. ``dogfood/SKILL.md``) are returned as
*uncategorized* the caller should register them as direct subcommands
of the ``/skill`` group.
*uncategorized*.
The same filtering as :func:`discord_skill_commands` is applied: hub
skills excluded, per-platform disabled excluded, names clamped.
Scan roots include the local ``SKILLS_DIR`` **and** any configured
``skills.external_dirs`` matching the widened filter applied to the
flat ``discord_skill_commands()`` collector in #18741. Without this
parity, external-dir skills are visible via ``hermes skills list`` and
the agent's ``/skill-name`` dispatch but silently absent from Discord's
``/skill`` autocomplete.
Filtering mirrors :func:`discord_skill_commands`: hub skills excluded,
per-platform disabled excluded, names clamped to 32 chars, descriptions
clamped to 100 chars.
The legacy 25-group × 25-subcommand caps (from the old nested
``/skill <cat> <name>`` layout) are **not** applied the live caller
(``_register_skill_group`` in ``gateway/platforms/discord.py``, refactored
in PR #11580) flattens these results and feeds them into a single
autocomplete callback, which scales to thousands of entries without any
per-command payload concerns. ``hidden_count`` is retained in the return
tuple for backward compatibility and still reports skills dropped for
other reasons (32-char clamp collision vs a reserved name).
Returns:
``(categories, uncategorized, hidden_count)``
- *categories*: ``{category_name: [(name, description, cmd_key), ...]}``
- *uncategorized*: ``[(name, description, cmd_key), ...]``
- *hidden_count*: skills dropped due to Discord group limits
(25 subcommand groups, 25 subcommands per group)
- *hidden_count*: skills dropped due to name clamp collisions
against already-registered command names.
"""
from pathlib import Path as _P
@@ -752,14 +800,33 @@ def discord_skill_commands_by_category(
# Collect raw skill data --------------------------------------------------
categories: dict[str, list[tuple[str, str, str]]] = {}
uncategorized: list[tuple[str, str, str]] = []
_names_used: set[str] = set(reserved_names)
# Map clamped-32-char-name → what it came from, so we can emit an
# actionable warning on collision. Reserved (gateway-builtin) command
# names are marked with a sentinel so the warning distinguishes
# "skill collided with a reserved command" from "two skills collided
# on the 32-char clamp" — the latter is the rename-worthy case.
_names_used: dict[str, str] = {n: "<reserved>" for n in reserved_names}
hidden = 0
try:
from agent.skill_commands import get_skill_commands
from agent.skill_utils import get_external_skills_dirs
from tools.skills_tool import SKILLS_DIR
_skills_dir = SKILLS_DIR.resolve()
_hub_dir = (SKILLS_DIR / ".hub").resolve()
# Build list of (resolved_root, is_local) tuples. Each external dir
# becomes its own scan root for category derivation — a skill at
# ``<external>/mlops/foo/SKILL.md`` is still categorized as "mlops".
_scan_roots: list[_P] = [_skills_dir]
try:
for ext in get_external_skills_dirs():
try:
_scan_roots.append(_P(ext).resolve())
except Exception:
continue
except Exception:
pass
skill_cmds = get_skill_commands()
for cmd_key in sorted(skill_cmds):
@@ -768,33 +835,72 @@ def discord_skill_commands_by_category(
if not skill_path:
continue
sp = _P(skill_path).resolve()
# Skip skills outside SKILLS_DIR or from the hub
if not str(sp).startswith(str(_skills_dir)):
continue
# Hub skills are loaded via the skill hub, not surfaced as
# slash commands.
if str(sp).startswith(str(_hub_dir)):
continue
# Accept skill if it lives under any scan root; record the
# matching root so we can derive the category correctly.
matched_root: _P | None = None
for root in _scan_roots:
try:
sp.relative_to(root)
except ValueError:
continue
matched_root = root
break
if matched_root is None:
continue
skill_name = info.get("name", "")
if skill_name in _platform_disabled:
continue
raw_name = cmd_key.lstrip("/")
# Clamp to 32 chars (Discord limit)
# Clamp to 32 chars (Discord per-command name limit)
discord_name = raw_name[:32]
if discord_name in _names_used:
# Two skills whose first 32 chars are identical. One wins
# (the first one seen, which is alphabetical because the
# caller iterates ``sorted(skill_cmds)``); the other is
# dropped from Discord's /skill autocomplete.
#
# Silently counting this as ``hidden`` (the old behavior)
# meant skill authors had no way to discover the drop —
# their skill just didn't appear in the picker. Emit a
# WARNING naming both sides so the author can rename the
# losing skill's frontmatter name to something with a
# distinct 32-char prefix.
prior = _names_used[discord_name]
if prior == "<reserved>":
logger.warning(
"Discord /skill: %r (from %r) collides on its 32-char "
"clamp with a reserved gateway command name %r — the "
"skill will not appear in the /skill autocomplete. "
"Rename the skill's frontmatter ``name:`` to differ "
"in its first 32 chars.",
discord_name, cmd_key, discord_name,
)
else:
logger.warning(
"Discord /skill: %r and %r both clamp to %r on "
"Discord's 32-char command-name limit — only %r "
"will appear in the /skill autocomplete. Rename "
"one skill's frontmatter ``name:`` to differ in "
"its first 32 chars.",
prior, cmd_key, discord_name, prior,
)
hidden += 1
continue
_names_used.add(discord_name)
_names_used[discord_name] = cmd_key
desc = info.get("description", "")
if len(desc) > 100:
desc = desc[:97] + "..."
# Determine category from the relative path within SKILLS_DIR.
# e.g. creative/ascii-art/SKILL.md → parts = ("creative", "ascii-art")
try:
rel = sp.parent.relative_to(_skills_dir)
except ValueError:
continue
# Determine category from the relative path within the matched
# scan root. e.g. creative/ascii-art/SKILL.md → ("creative", ...)
rel = sp.parent.relative_to(matched_root)
parts = rel.parts
if len(parts) >= 2:
cat = parts[0]
@@ -804,28 +910,7 @@ def discord_skill_commands_by_category(
except Exception:
pass
# Enforce Discord limits: 25 subcommand groups, 25 subcommands each ------
_MAX_GROUPS = 25
_MAX_PER_GROUP = 25
trimmed_categories: dict[str, list[tuple[str, str, str]]] = {}
group_count = 0
for cat in sorted(categories):
if group_count >= _MAX_GROUPS:
hidden += len(categories[cat])
continue
entries = categories[cat][:_MAX_PER_GROUP]
hidden += max(0, len(categories[cat]) - _MAX_PER_GROUP)
trimmed_categories[cat] = entries
group_count += 1
# Uncategorized skills also count against the 25 top-level limit
remaining_slots = _MAX_GROUPS - group_count
if len(uncategorized) > remaining_slots:
hidden += len(uncategorized) - remaining_slots
uncategorized = uncategorized[:remaining_slots]
return trimmed_categories, uncategorized, hidden
return categories, uncategorized, hidden
# ---------------------------------------------------------------------------
@@ -1043,6 +1128,12 @@ class SlashCommandCompleter(Completer):
except Exception:
return {}
# Commands that open pickers when run without arguments.
# These should NOT receive a trailing space in completions because:
# - The TUI's submit handler applies completions on Enter if input differs
# - Adding space makes "/model" → "/model " which blocks picker execution
_PICKER_COMMANDS = frozenset({"model", "skin", "personality"})
@staticmethod
def _completion_text(cmd_name: str, word: str) -> str:
"""Return replacement text for a completion.
@@ -1051,8 +1142,17 @@ class SlashCommandCompleter(Completer):
returning ``help`` would be a no-op and prompt_toolkit suppresses the
menu. Appending a trailing space keeps the dropdown visible and makes
backspacing retrigger it naturally.
However, commands that open pickers (model, skin, personality) should
NOT get a trailing space the TUI would apply the completion on Enter
and block the picker from opening.
"""
return f"{cmd_name} " if cmd_name == word else cmd_name
if cmd_name != word:
return cmd_name
# Don't add space for picker commands — allows Enter to execute them
if cmd_name in SlashCommandCompleter._PICKER_COMMANDS:
return cmd_name
return f"{cmd_name} "
@staticmethod
def _extract_path_word(text: str) -> str | None:
+27 -4
View File
@@ -400,7 +400,12 @@ DEFAULT_CONFIG = {
# The gateway stops accepting new work, waits for running agents
# to finish, then interrupts any remaining runs after the timeout.
# 0 = no drain, interrupt immediately.
"restart_drain_timeout": 60,
#
# 180s is calibrated for realistic in-flight agent turns: a typical
# coding conversation mid-reasoning runs 60150s per call, so a 60s
# budget routinely interrupted legitimate work on /restart. Raise
# further in config.yaml if you run very-long-reasoning models.
"restart_drain_timeout": 180,
# Max app-level retry attempts for API errors (connection drops,
# provider timeouts, 5xx, etc.) before the agent surfaces the
# failure. The OpenAI SDK already does its own low-level retries
@@ -639,6 +644,18 @@ DEFAULT_CONFIG = {
"cache_ttl": "5m",
},
# OpenRouter-specific settings.
# response_cache: enable OpenRouter response caching (X-OpenRouter-Cache header).
# When enabled, identical requests return cached responses for free (zero billing).
# This is separate from Anthropic prompt caching and works alongside it.
# See: https://openrouter.ai/docs/guides/features/response-caching
# response_cache_ttl: how long cached responses remain valid, in seconds (1-86400).
# Default 300 (5 minutes). Only used when response_cache is enabled.
"openrouter": {
"response_cache": True,
"response_cache_ttl": 300,
},
# AWS Bedrock provider configuration.
# Only used when model.provider is "bedrock".
"bedrock": {
@@ -792,6 +809,7 @@ DEFAULT_CONFIG = {
"enabled": False,
"fields": ["model", "context_pct", "cwd"], # Order shown; drop any to hide
},
"copy_shortcut": "auto", # "auto" (platform default) | "ctrl_c" | "ctrl_shift_c" | "disabled"
},
# Web dashboard settings
@@ -825,7 +843,7 @@ DEFAULT_CONFIG = {
# Voices: alloy, echo, fable, onyx, nova, shimmer
},
"xai": {
"voice_id": "eve",
"voice_id": "eve", # or custom voice ID — see https://docs.x.ai/developers/model-capabilities/audio/custom-voices
"language": "en",
"sample_rate": 24000,
"bit_rate": 128000,
@@ -1269,7 +1287,10 @@ DEFAULT_CONFIG = {
# for a single update run.
"pre_update_backup": False,
# How many pre-update backup zips to retain. Older ones are pruned
# automatically after each successful backup.
# automatically after each successful backup. Values below 1 are
# floored to 1 — the backup just created is always preserved. To
# disable backups entirely, set ``pre_update_backup: false`` above
# rather than ``backup_keep: 0``.
"backup_keep": 5,
},
@@ -4658,7 +4679,9 @@ def set_config_value(key: str, value: str):
"terminal.vercel_runtime": "TERMINAL_VERCEL_RUNTIME",
"terminal.docker_mount_cwd_to_workspace": "TERMINAL_DOCKER_MOUNT_CWD_TO_WORKSPACE",
"terminal.docker_run_as_host_user": "TERMINAL_DOCKER_RUN_AS_HOST_USER",
"terminal.cwd": "TERMINAL_CWD",
# terminal.cwd intentionally excluded — CLI resolves at runtime,
# gateway bridges it in gateway/run.py. Persisting to .env causes
# stale values to poison child processes.
"terminal.timeout": "TERMINAL_TIMEOUT",
"terminal.sandbox_dir": "TERMINAL_SANDBOX_DIR",
"terminal.persistent_shell": "TERMINAL_PERSISTENT_SHELL",
+8
View File
@@ -93,6 +93,8 @@ def cron_list(show_all: bool = False):
script = job.get("script")
if script:
print(f" Script: {script}")
if job.get("no_agent"):
print(f" Mode: {color('no-agent', Colors.DIM)} (script stdout delivered directly)")
workdir = job.get("workdir")
if workdir:
print(f" Workdir: {workdir}")
@@ -172,6 +174,7 @@ def cron_create(args):
skills=_normalize_skills(getattr(args, "skill", None), getattr(args, "skills", None)),
script=getattr(args, "script", None),
workdir=getattr(args, "workdir", None),
no_agent=getattr(args, "no_agent", False) or None,
)
if not result.get("success"):
print(color(f"Failed to create job: {result.get('error', 'unknown error')}", Colors.RED))
@@ -184,6 +187,8 @@ def cron_create(args):
job_data = result.get("job", {})
if job_data.get("script"):
print(f" Script: {job_data['script']}")
if job_data.get("no_agent"):
print(" Mode: no-agent (script stdout delivered directly)")
if job_data.get("workdir"):
print(f" Workdir: {job_data['workdir']}")
print(f" Next run: {result['next_run_at']}")
@@ -225,6 +230,7 @@ def cron_edit(args):
skills=final_skills,
script=getattr(args, "script", None),
workdir=getattr(args, "workdir", None),
no_agent=getattr(args, "no_agent", None),
)
if not result.get("success"):
print(color(f"Failed to update job: {result.get('error', 'unknown error')}", Colors.RED))
@@ -240,6 +246,8 @@ def cron_edit(args):
print(" Skills: none")
if updated.get("script"):
print(f" Script: {updated['script']}")
if updated.get("no_agent"):
print(" Mode: no-agent (script stdout delivered directly)")
if updated.get("workdir"):
print(f" Workdir: {updated['workdir']}")
return 0
+13 -1
View File
@@ -302,9 +302,21 @@ def _cmd_rollback(args) -> int:
print(f" reason: {manifest.get('reason', '?')}")
print(f" created_at: {manifest.get('created_at', '?')}")
print(f" skill files: {manifest.get('skill_files', '?')}")
cron = manifest.get("cron_jobs") or {}
if isinstance(cron, dict):
if cron.get("backed_up"):
print(
f" cron jobs: {cron.get('jobs_count', 0)} "
f"(will be restored for skill-link fields only)"
)
else:
reason = cron.get("reason", "not captured")
print(f" cron jobs: not in snapshot ({reason})")
print(
"\nThis will replace the current ~/.hermes/skills/ tree (a safety "
"snapshot of the current state is taken first so this is undoable)."
"snapshot of the current state is taken first so this is undoable). "
"Cron jobs that still exist will have their skills/skill fields "
"restored from the snapshot; all other cron fields are left alone."
)
if not getattr(args, "yes", False):
+6
View File
@@ -156,6 +156,8 @@ def curses_checklist(
flush_stdin()
return result_holder[0] if result_holder[0] is not None else cancel_returns
except KeyboardInterrupt:
return cancel_returns
except Exception:
return _numbered_fallback(title, items, selected, cancel_returns, status_fn)
@@ -278,6 +280,8 @@ def curses_radiolist(
flush_stdin()
return result_holder[0] if result_holder[0] is not None else cancel_returns
except KeyboardInterrupt:
return cancel_returns
except Exception:
return _radio_numbered_fallback(title, items, selected, cancel_returns)
@@ -401,6 +405,8 @@ def curses_single_select(
return None
return result_holder[0]
except KeyboardInterrupt:
return None
except Exception:
all_items = list(items) + [cancel_label]
cancel_idx = len(items)
+82 -7
View File
@@ -1,12 +1,19 @@
"""``hermes debug`` debug tools for Hermes Agent.
"""``hermes debug`` debug tools for Hermes Agent.
Currently supports:
hermes debug share Upload debug report (system info + logs) to a
paste service and print a shareable URL.
By default, log content is run through
``agent.redact.redact_sensitive_text`` with
``force=True`` before upload so credentials in
``~/.hermes/logs/*.log`` are not leaked into
the public paste service. Pass ``--no-redact``
to disable.
"""
import io
import json
import logging
import sys
import time
import urllib.error
@@ -19,6 +26,16 @@ from typing import Optional
from hermes_constants import get_hermes_home
from utils import atomic_replace
logger = logging.getLogger(__name__)
# Banner prepended to upload-bound log content when redaction is enabled.
# Visible in the public paste so reviewers know the content was sanitized.
# Kept short; the trailing newline guarantees the banner sits on its own line.
_REDACTION_BANNER = (
"[hermes debug share: log content redacted at upload time. "
"run with --no-redact to disable]\n"
)
# ---------------------------------------------------------------------------
# Paste services — try paste.rs first, dpaste.com as fallback.
@@ -368,17 +385,40 @@ def _resolve_log_path(log_name: str) -> Optional[Path]:
return None
def _redact_log_text(text: str) -> str:
"""Run ``redact_sensitive_text`` with ``force=True`` over upload-bound text.
Uses ``force=True`` so redaction fires regardless of the operator's
``security.redact_secrets`` setting. The local on-disk log file is
not modified; only the in-memory copy headed for the public paste
service is sanitized. Returns the redacted text (or the original
when empty / non-string).
"""
if not text:
return text
from agent.redact import redact_sensitive_text
return redact_sensitive_text(text, force=True)
def _capture_log_snapshot(
log_name: str,
*,
tail_lines: int,
max_bytes: int = _MAX_LOG_BYTES,
redact: bool = True,
) -> LogSnapshot:
"""Capture a log once and derive summary/full-log views from it.
The report tail and standalone log upload must come from the same file
snapshot. Otherwise a rotation/truncate between reads can make the report
look newer than the uploaded ``agent.log`` paste.
When ``redact`` is True (the default), both ``tail_text`` and
``full_text`` are run through ``_redact_log_text`` so the snapshot
returned is upload-safe. The on-disk log file is never modified.
Pass ``redact=False`` to capture original log content (used by
``hermes debug share --no-redact``).
"""
log_path = _resolve_log_path(log_name)
if log_path is None:
@@ -438,18 +478,34 @@ def _capture_log_snapshot(
if truncated:
full_text = f"[... truncated — showing last ~{max_bytes // 1024}KB ...]\n{full_text}"
if redact:
tail_text = _redact_log_text(tail_text)
full_text = _redact_log_text(full_text)
return LogSnapshot(path=log_path, tail_text=tail_text, full_text=full_text)
except Exception as exc:
return LogSnapshot(path=log_path, tail_text=f"(error reading: {exc})", full_text=None)
def _capture_default_log_snapshots(log_lines: int) -> dict[str, LogSnapshot]:
"""Capture all logs used by debug-share exactly once."""
def _capture_default_log_snapshots(
log_lines: int, *, redact: bool = True
) -> dict[str, LogSnapshot]:
"""Capture all logs used by debug-share exactly once.
``redact`` is forwarded to each ``_capture_log_snapshot`` call so all
captured logs share the same redaction policy for a given run.
"""
errors_lines = min(log_lines, 100)
return {
"agent": _capture_log_snapshot("agent", tail_lines=log_lines),
"errors": _capture_log_snapshot("errors", tail_lines=errors_lines),
"gateway": _capture_log_snapshot("gateway", tail_lines=errors_lines),
"agent": _capture_log_snapshot(
"agent", tail_lines=log_lines, redact=redact
),
"errors": _capture_log_snapshot(
"errors", tail_lines=errors_lines, redact=redact
),
"gateway": _capture_log_snapshot(
"gateway", tail_lines=errors_lines, redact=redact
),
}
@@ -532,6 +588,7 @@ def run_debug_share(args):
log_lines = getattr(args, "lines", 200)
expiry = getattr(args, "expire", 7)
local_only = getattr(args, "local", False)
redact = not getattr(args, "no_redact", False)
if not local_only:
print(_PRIVACY_NOTICE)
@@ -539,8 +596,16 @@ def run_debug_share(args):
print("Collecting debug report...")
# Capture dump once — prepended to every paste for context.
# The dump is already redacted at extract time via dump.py:_redact;
# log_snapshots are redacted by _capture_default_log_snapshots when
# redact=True so credentials never reach the public paste service.
dump_text = _capture_dump()
log_snapshots = _capture_default_log_snapshots(log_lines)
log_snapshots = _capture_default_log_snapshots(log_lines, redact=redact)
if redact:
logger.info(
"hermes debug share: applied force-mode redaction to log snapshots before upload"
)
report = collect_debug_report(
log_lines=log_lines,
@@ -556,6 +621,15 @@ def run_debug_share(args):
if gateway_log:
gateway_log = dump_text + "\n\n--- full gateway.log ---\n" + gateway_log
# Visible banner so reviewers reading the public paste know redaction
# was applied at upload time. Banner is omitted under --no-redact.
if redact:
report = _REDACTION_BANNER + report
if agent_log:
agent_log = _REDACTION_BANNER + agent_log
if gateway_log:
gateway_log = _REDACTION_BANNER + gateway_log
if local_only:
print(report)
if agent_log:
@@ -666,6 +740,7 @@ def run_debug(args):
print(" --lines N Number of log lines to include (default: 200)")
print(" --expire N Paste expiry in days (default: 7)")
print(" --local Print report locally instead of uploading")
print(" --no-redact Disable upload-time secret redaction (default: redact)")
print()
print("Options (delete):")
print(" <url> ... One or more paste URLs to delete")
+24 -4
View File
@@ -263,8 +263,11 @@ def run_doctor(args):
if env_path.exists():
check_ok(f"{_DHH}/.env file exists")
# Check for common issues
content = env_path.read_text()
# Check for common issues. Pin encoding to UTF-8 because .env files are
# written as UTF-8 everywhere in the codebase, while Path.read_text()
# defaults to the system locale — which crashes on non-UTF-8 Windows
# locales (e.g. GBK) as soon as the file contains any non-ASCII byte.
content = env_path.read_text(encoding="utf-8")
if _has_provider_env_config(content):
check_ok("API key or custom endpoint configured")
else:
@@ -932,6 +935,8 @@ def run_doctor(args):
agent_browser_path = PROJECT_ROOT / "node_modules" / "agent-browser"
if agent_browser_path.exists():
check_ok("agent-browser (Node.js)", "(browser automation)")
elif shutil.which("agent-browser"):
check_ok("agent-browser", "(browser automation)")
else:
if _is_termux():
check_info("agent-browser is not installed (expected in the tested Termux path)")
@@ -1093,9 +1098,10 @@ def run_doctor(args):
("Hugging Face", ("HF_TOKEN",), "https://router.huggingface.co/v1/models", "HF_BASE_URL", True),
("NVIDIA NIM", ("NVIDIA_API_KEY",), "https://integrate.api.nvidia.com/v1/models", "NVIDIA_BASE_URL", True),
("Alibaba/DashScope", ("DASHSCOPE_API_KEY",), "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models", "DASHSCOPE_BASE_URL", True),
# MiniMax: the /anthropic endpoint doesn't support /models, but the /v1 endpoint does.
# MiniMax global: /v1 endpoint supports /models.
("MiniMax", ("MINIMAX_API_KEY",), "https://api.minimax.io/v1/models", "MINIMAX_BASE_URL", True),
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), "https://api.minimaxi.com/v1/models", "MINIMAX_CN_BASE_URL", True),
# MiniMax CN: /v1 endpoint does NOT support /models (returns 404).
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), "https://api.minimaxi.com/v1/models", "MINIMAX_CN_BASE_URL", False),
("Vercel AI Gateway", ("AI_GATEWAY_API_KEY",), "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
("Kilo Code", ("KILOCODE_API_KEY",), "https://api.kilo.ai/api/gateway/models", "KILOCODE_BASE_URL", True),
("OpenCode Zen", ("OPENCODE_ZEN_API_KEY",), "https://opencode.ai/zen/v1/models", "OPENCODE_ZEN_BASE_URL", True),
@@ -1258,9 +1264,23 @@ def run_doctor(args):
check_warn("Skills Hub directory not initialized", "(run: hermes skills list)")
from hermes_cli.config import get_env_value
def _gh_authenticated() -> bool:
"""Check if gh CLI is authenticated via token file or device flow."""
try:
result = subprocess.run(
["gh", "auth", "status", "--json", "authenticated"],
capture_output=True, timeout=10,
)
return result.returncode == 0
except (FileNotFoundError, subprocess.TimeoutExpired):
return False
github_token = get_env_value("GITHUB_TOKEN") or get_env_value("GH_TOKEN")
if github_token:
check_ok("GitHub token configured (authenticated API access)")
elif _gh_authenticated():
check_ok("GitHub authenticated via gh CLI", "(full API access — no GITHUB_TOKEN needed)")
else:
check_warn("No GITHUB_TOKEN", f"(60 req/hr rate limit — set in {_DHH}/.env for better rates)")
+155 -11
View File
@@ -188,7 +188,7 @@ def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:
SIGUSR1 is wired in gateway/run.py to ``request_restart(via_service=True)``
which drains in-flight agent runs (up to ``agent.restart_drain_timeout``
seconds), then exits with code 75. Both systemd (``Restart=on-failure``
seconds), then exits with code 75. Both systemd (``Restart=always``
+ ``RestartForceExitStatus=75``) and launchd (``KeepAlive.SuccessfulExit
= false``) relaunch the process after the graceful exit.
@@ -237,6 +237,26 @@ def _graceful_restart_via_sigusr1(pid: int, drain_timeout: float) -> bool:
return False
def _get_ancestor_pids() -> set[int]:
"""Return the set of PIDs in the current process's ancestor chain.
Walks from the current PID up to PID 1 (init) so that process-table scans
never match the calling CLI process or any of its parents. This prevents
``hermes gateway status`` from falsely counting the ``hermes`` CLI that
invoked it as a running gateway instance (see #13242).
"""
ancestors: set[int] = set()
pid = os.getpid()
# Cap iterations to avoid infinite loops on exotic platforms.
for _ in range(64):
ancestors.add(pid)
parent = _get_parent_pid(pid)
if parent is None or parent <= 0 or parent in ancestors:
break
pid = parent
return ancestors
def _append_unique_pid(pids: list[int], pid: int | None, exclude_pids: set[int]) -> None:
if pid is None or pid <= 0:
return
@@ -252,6 +272,10 @@ def _scan_gateway_pids(exclude_pids: set[int], all_profiles: bool = False) -> li
a live gateway when the PID file is stale/missing, and ``--all`` sweeps can
discover gateways outside the current profile.
"""
# Exclude the entire ancestor chain so the CLI process that invoked this
# scan (e.g. ``hermes gateway status``) is never mistaken for a running
# gateway. See #13242.
exclude_pids = exclude_pids | _get_ancestor_pids()
pids: list[int] = []
patterns = [
"hermes_cli.main gateway",
@@ -690,6 +714,32 @@ def _print_gateway_process_mismatch(snapshot: GatewayRuntimeSnapshot) -> None:
print(" can refuse to start another copy until this process stops.")
def _print_other_profiles_gateway_status() -> None:
"""Print a summary of gateway status across all profiles.
Shown at the bottom of ``hermes gateway status`` output so users with
multiple profiles can tell at a glance which gateways are running and
avoid confusing another profile's process with the current one.
"""
try:
from hermes_cli.profiles import get_active_profile_name
current = get_active_profile_name()
other_processes = [
p for p in find_profile_gateway_processes()
if p.profile != current
]
if not other_processes:
return
print()
print("Other profiles:")
for proc in other_processes:
print(f"{proc.profile:<16s} — PID {proc.pid}")
except Exception:
pass
def kill_gateway_processes(force: bool = False, exclude_pids: set | None = None,
all_profiles: bool = False) -> int:
"""Kill any running gateway processes. Returns count killed.
@@ -735,6 +785,12 @@ def stop_profile_gateway() -> bool:
if pid is None:
return False
try:
from gateway.status import write_planned_stop_marker
write_planned_stop_marker(pid)
except Exception:
pass
try:
os.kill(pid, signal.SIGTERM)
except ProcessLookupError:
@@ -1558,6 +1614,46 @@ def _build_user_local_paths(home: Path, path_entries: list[str]) -> list[str]:
return [p for p in candidates if p not in path_entries and Path(p).exists()]
def _build_wsl_interop_paths(path_entries: list[str]) -> list[str]:
"""Return WSL Windows interop PATH entries for generated systemd units.
WSL shells normally inherit Windows PATH entries such as
``/mnt/c/WINDOWS/System32``. systemd user services do not, so gateway tools
that call ``powershell.exe``/``cmd.exe`` work in a terminal but fail in the
background service unless we persist the relevant entries at install time.
"""
if not is_wsl():
return []
candidates: list[str] = []
for entry in os.environ.get("PATH", "").split(os.pathsep):
if entry.startswith("/mnt/"):
candidates.append(entry)
for executable in ("powershell.exe", "cmd.exe", "explorer.exe", "wsl.exe"):
resolved = shutil.which(executable)
if resolved:
candidates.append(str(Path(resolved).parent))
for entry in (
"/mnt/c/WINDOWS/system32",
"/mnt/c/WINDOWS",
"/mnt/c/WINDOWS/System32/Wbem",
"/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0/",
"/mnt/c/WINDOWS/System32/OpenSSH/",
):
if Path(entry).exists():
candidates.append(entry)
result: list[str] = []
seen = set(path_entries)
for entry in candidates:
if entry and entry not in seen:
seen.add(entry)
result.append(entry)
return result
def _remap_path_for_user(path: str, target_home_dir: str) -> str:
"""Remap *path* from the current user's home to *target_home_dir*.
@@ -1649,14 +1745,14 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
node_bin = _remap_path_for_user(node_bin, home_dir)
path_entries = [_remap_path_for_user(p, home_dir) for p in path_entries]
path_entries.extend(_build_user_local_paths(Path(home_dir), path_entries))
path_entries.extend(_build_wsl_interop_paths(path_entries))
path_entries.extend(common_bin_paths)
sane_path = ":".join(path_entries)
return f"""[Unit]
Description={SERVICE_DESCRIPTION}
After=network-online.target
Wants=network-online.target
StartLimitIntervalSec=600
StartLimitBurst=5
StartLimitIntervalSec=0
[Service]
Type=simple
@@ -1670,8 +1766,10 @@ Environment="LOGNAME={username}"
Environment="PATH={sane_path}"
Environment="VIRTUAL_ENV={venv_dir}"
Environment="HERMES_HOME={hermes_home}"
Restart=on-failure
RestartSec=30
Restart=always
RestartSec=60
RestartMaxDelaySec=300
RestartSteps=5
RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
KillMode=mixed
KillSignal=SIGTERM
@@ -1687,13 +1785,14 @@ WantedBy=multi-user.target
hermes_home = str(get_hermes_home().resolve())
profile_arg = _profile_arg(hermes_home)
path_entries.extend(_build_user_local_paths(Path.home(), path_entries))
path_entries.extend(_build_wsl_interop_paths(path_entries))
path_entries.extend(common_bin_paths)
sane_path = ":".join(path_entries)
return f"""[Unit]
Description={SERVICE_DESCRIPTION}
After=network.target
StartLimitIntervalSec=600
StartLimitBurst=5
After=network-online.target
Wants=network-online.target
StartLimitIntervalSec=0
[Service]
Type=simple
@@ -1702,8 +1801,10 @@ WorkingDirectory={working_dir}
Environment="PATH={sane_path}"
Environment="VIRTUAL_ENV={venv_dir}"
Environment="HERMES_HOME={hermes_home}"
Restart=on-failure
RestartSec=30
Restart=always
RestartSec=60
RestartMaxDelaySec=300
RestartSteps=5
RestartForceExitStatus={GATEWAY_SERVICE_RESTART_EXIT_CODE}
KillMode=mixed
KillSignal=SIGTERM
@@ -1918,6 +2019,15 @@ def systemd_uninstall(system: bool = False):
print(f"{_service_scope_label(system).capitalize()} service uninstalled")
def _require_service_installed(action: str, system: bool = False) -> None:
unit_path = get_systemd_unit_path(system=system)
if not unit_path.exists():
scope_flag = " --system" if system else ""
print(f"✗ Gateway service is not installed")
print(f" Run: {'sudo ' if system else ''}hermes gateway install{scope_flag}")
sys.exit(1)
def systemd_start(system: bool = False):
system = _select_systemd_scope(system)
if system:
@@ -1927,6 +2037,7 @@ def systemd_start(system: bool = False):
# reachable (common on fresh RHEL/Debian SSH sessions without linger).
# Raises UserSystemdUnavailableError with a remediation message.
_preflight_user_systemd()
_require_service_installed("start", system=system)
refresh_systemd_unit_if_needed(system=system)
_run_systemctl(["start", get_service_name()], system=system, check=True, timeout=30)
print(f"{_service_scope_label(system).capitalize()} service started")
@@ -1937,6 +2048,14 @@ def systemd_stop(system: bool = False):
system = _select_systemd_scope(system)
if system:
_require_root_for_system_service("stop")
_require_service_installed("stop", system=system)
try:
from gateway.status import get_running_pid, write_planned_stop_marker
pid = get_running_pid(cleanup_stale=False)
if pid is not None:
write_planned_stop_marker(pid)
except Exception:
pass
_run_systemctl(["stop", get_service_name()], system=system, check=True, timeout=90)
print(f"{_service_scope_label(system).capitalize()} service stopped")
@@ -1948,6 +2067,7 @@ def systemd_restart(system: bool = False):
_require_root_for_system_service("restart")
else:
_preflight_user_systemd()
_require_service_installed("restart", system=system)
refresh_systemd_unit_if_needed(system=system)
from gateway.status import get_running_pid
@@ -2297,6 +2417,13 @@ def launchd_start():
def launchd_stop():
label = get_launchd_label()
target = f"{_launchd_domain()}/{label}"
try:
from gateway.status import get_running_pid, write_planned_stop_marker
pid = get_running_pid(cleanup_stale=False)
if pid is not None:
write_planned_stop_marker(pid)
except Exception:
pass
# bootout unloads the service definition so KeepAlive doesn't respawn
# the process. A plain `kill SIGTERM` only signals the process — launchd
# immediately restarts it because KeepAlive.SuccessfulExit = false.
@@ -2439,6 +2566,20 @@ def run_gateway(verbose: int = 0, quiet: bool = False, replace: bool = False):
hasn't fully exited yet.
"""
sys.path.insert(0, str(PROJECT_ROOT))
# Refresh the systemd unit definition on every boot so that restart
# settings (RestartSec, StartLimitIntervalSec, etc.) stay current even
# when the process was respawned via exit-code-75 (stale-code or
# /restart) rather than through `hermes gateway restart` which already
# calls refresh_systemd_unit_if_needed(). Without this, a code update
# that ships new unit settings won't take effect until the next manual
# `hermes gateway start/restart` — leaving the gateway vulnerable to
# the exact failure mode the new settings were meant to prevent.
if supports_systemd_services():
try:
refresh_systemd_unit_if_needed(system=False)
except Exception:
pass # best-effort; don't block gateway startup
from gateway.run import start_gateway
@@ -2451,7 +2592,7 @@ def run_gateway(verbose: int = 0, quiet: bool = False, replace: bool = False):
print()
# Exit with code 1 if gateway fails to connect any platform,
# so systemd Restart=on-failure will retry on transient errors
# so systemd Restart=always will retry on transient errors
verbosity = None if quiet else verbose
try:
success = asyncio.run(start_gateway(replace=replace, verbosity=verbosity))
@@ -4453,6 +4594,9 @@ def _gateway_command_inner(args):
print(" hermes gateway install # Install as user service")
print(" sudo hermes gateway install --system # Install as boot-time system service")
# Show other profiles' gateway status for multi-profile awareness
_print_other_profiles_gateway_status()
elif subcmd == "migrate-legacy":
# Stop, disable, and remove legacy Hermes gateway unit files from
# pre-rename installs (e.g. hermes.service). Profile units and
+309 -1
View File
@@ -169,11 +169,93 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
"or docs/hermes-kanban-v1-spec.pdf for the full design."
),
)
# --- global --board flag ---
# Applies to every subcommand below. When set, scopes all reads and
# writes to that board's DB. When omitted, resolves via the
# HERMES_KANBAN_BOARD env var, then the persisted current-board
# file, then "default". See kanban_db.get_current_board().
kanban_parser.add_argument(
"--board",
default=None,
metavar="<slug>",
help=(
"Board slug to operate on. Defaults to the current board "
"(set via `hermes kanban boards switch <slug>` or the "
"HERMES_KANBAN_BOARD env var). Use `hermes kanban boards list` "
"to see all boards."
),
)
sub = kanban_parser.add_subparsers(dest="kanban_action")
# --- init ---
sub.add_parser("init", help="Create kanban.db if missing (idempotent)")
# --- boards (new in v2: multi-project support) ---
p_boards = sub.add_parser(
"boards",
help="Manage kanban boards (one board per project / workstream)",
description=(
"Boards let you separate unrelated streams of work "
"(projects, repos, domains) into isolated queues. Each "
"board has its own DB, workspaces directory, and dispatcher "
"loop — tasks on one board cannot collide with tasks on "
"another. The first board is 'default' and always exists."
),
)
boards_sub = p_boards.add_subparsers(dest="boards_action")
b_list = boards_sub.add_parser(
"list", aliases=["ls"],
help="List all boards with task counts",
)
b_list.add_argument("--json", action="store_true")
b_list.add_argument("--all", action="store_true",
help="Include archived boards too")
b_create = boards_sub.add_parser(
"create", aliases=["new"],
help="Create a new board",
)
b_create.add_argument("slug",
help="Board slug (kebab-case, e.g. atm10-server)")
b_create.add_argument("--name", default=None,
help="Human-readable display name (defaults to Title Case of slug)")
b_create.add_argument("--description", default=None,
help="Optional description")
b_create.add_argument("--icon", default=None,
help="Optional emoji or single-character icon for the dashboard")
b_create.add_argument("--color", default=None,
help="Optional hex color (e.g. '#8b5cf6') for the dashboard")
b_create.add_argument("--switch", action="store_true",
help="Switch to the new board after creating it")
b_rm = boards_sub.add_parser(
"rm", aliases=["remove", "delete"],
help="Archive (default) or delete a board",
)
b_rm.add_argument("slug")
b_rm.add_argument("--delete", action="store_true",
help="Hard-delete the board directory instead of archiving it. "
"Default is to move it to boards/_archived/ so it's recoverable.")
b_switch = boards_sub.add_parser(
"switch", aliases=["use"],
help="Set the active board for subsequent CLI calls",
)
b_switch.add_argument("slug")
boards_sub.add_parser(
"show", aliases=["current"],
help="Print the currently-active board slug",
)
b_rename = boards_sub.add_parser(
"rename",
help="Change a board's human-readable display name (slug is immutable)",
)
b_rename.add_argument("slug")
b_rename.add_argument("name", help="New display name")
# --- create ---
p_create = sub.add_parser("create", help="Create a new task")
p_create.add_argument("title", help="Task title")
@@ -366,7 +448,7 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
# --- log ---
p_log = sub.add_parser(
"log",
help="Print the worker log for a task (from $HERMES_HOME/kanban/logs/)",
help="Print the worker log for a task (from <kanban-root>/kanban/logs/)",
)
p_log.add_argument("task_id")
p_log.add_argument("--tail", type=int, default=None,
@@ -442,6 +524,38 @@ def kanban_command(args: argparse.Namespace) -> int:
)
return 0
# `--board <slug>` applies to every subcommand below by way of an
# env-var pin for the duration of this call. Using HERMES_KANBAN_BOARD
# (rather than threading `board=` through 50+ kb.connect() sites)
# keeps the patch small and inherits the exact same resolution the
# dispatcher uses for workers — consistency is a feature here.
board_override = getattr(args, "board", None)
if board_override:
try:
normed = kb._normalize_board_slug(board_override)
except ValueError as exc:
print(f"kanban: {exc}", file=sys.stderr)
return 2
if not normed:
print("kanban: --board requires a slug", file=sys.stderr)
return 2
# Boards other than 'default' must already exist — typoed slugs
# would otherwise silently create an empty board.
if normed != kb.DEFAULT_BOARD and not kb.board_exists(normed):
print(
f"kanban: board {normed!r} does not exist. "
f"Create it with `hermes kanban boards create {normed}`.",
file=sys.stderr,
)
return 1
os.environ["HERMES_KANBAN_BOARD"] = normed
# Boards management doesn't touch the DB at all — dispatch early so
# fresh installs that haven't initialized any DB can still use
# `hermes kanban boards create …`.
if action == "boards":
return _dispatch_boards(args)
# Auto-initialize the DB before dispatching any subcommand. init_db
# is idempotent, so running it every invocation is cheap (one
# SELECT against sqlite_master when tables already exist) and
@@ -513,6 +627,185 @@ def _profile_author() -> str:
return "user"
# ---------------------------------------------------------------------------
# Boards management (hermes kanban boards …)
# ---------------------------------------------------------------------------
def _dispatch_boards(args: argparse.Namespace) -> int:
"""Handle ``hermes kanban boards <action>``.
Boards management is deliberately separate from the task-level
commands: it operates on the filesystem (board directories,
``current`` pointer, ``board.json``), not on the per-board SQLite
DB, so a fresh HERMES_HOME that has never called ``kanban init``
can still run ``boards create`` / ``boards list``.
"""
sub = getattr(args, "boards_action", None) or "list"
if sub in ("list", "ls"):
return _cmd_boards_list(args)
if sub in ("create", "new"):
return _cmd_boards_create(args)
if sub in ("rm", "remove", "delete"):
return _cmd_boards_rm(args)
if sub in ("switch", "use"):
return _cmd_boards_switch(args)
if sub in ("show", "current"):
return _cmd_boards_show(args)
if sub == "rename":
return _cmd_boards_rename(args)
print(f"kanban boards: unknown action {sub!r}", file=sys.stderr)
return 2
def _board_task_counts(slug: str) -> dict[str, int]:
"""Return ``{status: count}`` for a board. Safe to call on an empty DB."""
try:
path = kb.kanban_db_path(board=slug)
if not path.exists():
return {}
with kb.connect(board=slug) as conn:
rows = conn.execute(
"SELECT status, COUNT(*) AS n FROM tasks GROUP BY status"
).fetchall()
return {r["status"]: int(r["n"]) for r in rows}
except Exception:
return {}
def _cmd_boards_list(args: argparse.Namespace) -> int:
include_archived = bool(getattr(args, "all", False))
boards = kb.list_boards(include_archived=include_archived)
# Enrich each entry with task counts + whether it's the current board.
current = kb.get_current_board()
for b in boards:
b["is_current"] = (b["slug"] == current)
b["counts"] = _board_task_counts(b["slug"])
b["total"] = sum(b["counts"].values())
if getattr(args, "json", False):
print(json.dumps(boards, indent=2, ensure_ascii=False))
return 0
# Human table: marker (•) for current, slug, display name, counts.
if not boards:
print("(no boards — create one with `hermes kanban boards create <slug>`)")
return 0
print(f"{'':2s} {'SLUG':24s} {'NAME':28s} COUNTS")
for b in boards:
marker = "" if b["is_current"] else " "
counts = b["counts"] or {}
counts_str = (
", ".join(f"{k}={v}" for k, v in sorted(counts.items()))
or "(empty)"
)
name = b.get("name") or ""
if b.get("archived"):
name += " [archived]"
print(f"{marker:2s} {b['slug']:24s} {name:28s} {counts_str}")
print()
print(f"Current board: {current}")
if len(boards) > 1:
print("Switch boards with `hermes kanban boards switch <slug>`.")
return 0
def _cmd_boards_create(args: argparse.Namespace) -> int:
try:
normed = kb._normalize_board_slug(args.slug)
except ValueError as exc:
print(f"kanban boards create: {exc}", file=sys.stderr)
return 2
if not normed:
print("kanban boards create: slug is required", file=sys.stderr)
return 2
already = kb.board_exists(normed) and normed != kb.DEFAULT_BOARD
meta = kb.create_board(
normed,
name=args.name,
description=args.description,
icon=args.icon,
color=args.color,
)
verb = "already exists" if already else "created"
print(f"Board {meta['slug']!r} {verb}.")
print(f" Display name: {meta.get('name', '')}")
print(f" DB path: {meta['db_path']}")
if getattr(args, "switch", False):
kb.set_current_board(meta["slug"])
print(f" Switched to {meta['slug']!r}.")
else:
print(f" Use `hermes kanban boards switch {meta['slug']}` to make it current.")
return 0
def _cmd_boards_rm(args: argparse.Namespace) -> int:
try:
res = kb.remove_board(args.slug, archive=not getattr(args, "delete", False))
except ValueError as exc:
print(f"kanban boards rm: {exc}", file=sys.stderr)
return 1
if res["action"] == "archived":
print(f"Board {res['slug']!r} archived → {res['new_path']}")
print("Recover by moving the directory back to "
"<root>/kanban/boards/<slug>/.")
else:
print(f"Board {res['slug']!r} deleted.")
return 0
def _cmd_boards_switch(args: argparse.Namespace) -> int:
try:
normed = kb._normalize_board_slug(args.slug)
except ValueError as exc:
print(f"kanban boards switch: {exc}", file=sys.stderr)
return 2
if not normed:
print("kanban boards switch: slug is required", file=sys.stderr)
return 2
if not kb.board_exists(normed):
print(
f"kanban boards switch: board {normed!r} does not exist. "
f"Create it with `hermes kanban boards create {normed}`.",
file=sys.stderr,
)
return 1
kb.set_current_board(normed)
print(f"Active board is now {normed!r}.")
return 0
def _cmd_boards_show(args: argparse.Namespace) -> int:
current = kb.get_current_board()
meta = kb.read_board_metadata(current)
counts = _board_task_counts(current)
total = sum(counts.values())
print(f"Current board: {current}")
print(f" Display name: {meta.get('name', '')}")
if meta.get("description"):
print(f" Description: {meta['description']}")
print(f" DB path: {meta['db_path']}")
print(f" Tasks: {total} total"
+ (f" ({', '.join(f'{k}={v}' for k, v in sorted(counts.items()))})"
if counts else ""))
return 0
def _cmd_boards_rename(args: argparse.Namespace) -> int:
try:
normed = kb._normalize_board_slug(args.slug)
except ValueError as exc:
print(f"kanban boards rename: {exc}", file=sys.stderr)
return 2
if not normed or not kb.board_exists(normed):
print(f"kanban boards rename: board {args.slug!r} does not exist",
file=sys.stderr)
return 1
meta = kb.write_board_metadata(normed, name=args.name)
print(f"Board {normed!r} renamed to {meta['name']!r}.")
return 0
# ---------------------------------------------------------------------------
def _parse_duration(val) -> Optional[int]:
"""Parse ``30s`` / ``5m`` / ``2h`` / ``1d`` or a raw integer → seconds.
@@ -662,6 +955,21 @@ def _cmd_list(args: argparse.Namespace) -> int:
if getattr(args, "json", False):
print(json.dumps([_task_to_dict(t) for t in tasks], indent=2, ensure_ascii=False))
return 0
# Passive discoverability: when the user has multiple boards, surface
# which one they're looking at in the list header. Single-board users
# never see this — the feature stays invisible until you opt in.
try:
all_boards = kb.list_boards(include_archived=False)
except Exception:
all_boards = []
if len(all_boards) > 1:
current = kb.get_current_board()
other_count = len(all_boards) - 1
print(
f"Board: {current} "
f"({other_count} other board{'s' if other_count != 1 else ''}"
f"`hermes kanban boards list`)\n"
)
if not tasks:
print("(no matching tasks)")
return 0
+602 -41
View File
@@ -1,8 +1,56 @@
"""SQLite-backed Kanban board for multi-profile collaboration.
"""SQLite-backed Kanban board for multi-profile, multi-project collaboration.
The board lives at ``$HERMES_HOME/kanban.db`` (profile-agnostic on purpose:
multiple profiles on the same machine all see the same board, which IS the
coordination primitive).
In a fresh install the board lives at ``<root>/kanban.db`` where
``<root>`` is the **shared Hermes root** (the parent of any active
profile). Profiles intentionally collapse onto a shared board: it IS
the cross-profile coordination primitive. A worker spawned with
``hermes -p <profile>`` joins the same board as the dispatcher that
claimed the task. The same applies to ``<root>/kanban/workspaces/`` and
``<root>/kanban/logs/``.
**Multiple boards (projects):** users can create additional boards to
separate unrelated streams of work (e.g. one per project / repo / domain).
Each board is a directory under ``<root>/kanban/boards/<slug>/`` with
its own ``kanban.db``, ``workspaces/``, and ``logs/``. All boards share
the profile's Hermes home but are otherwise isolated: a worker spawned
for a task on board ``atm10-server`` sees only that board's tasks,
cannot enumerate other boards, and its dispatcher ticks don't touch
other boards' DBs.
The first (and for single-project users, only) board is ``default``.
For back-compat its on-disk DB is ``<root>/kanban.db`` (not
``boards/default/kanban.db``), so installs that predate the boards
feature keep working with zero migration. See :func:`kanban_db_path`.
Board resolution order (highest precedence first, all optional):
* ``board=`` argument passed directly to :func:`connect` / :func:`init_db`
(explicit used by the CLI ``--board`` flag and the dashboard
``?board=...`` query param).
* ``HERMES_KANBAN_BOARD`` env var (used by the dispatcher to pin workers
to the board their task lives on workers cannot see other boards).
* ``HERMES_KANBAN_DB`` env var (pins the DB file path directly legacy
override still honoured; highest precedence when the file path itself
is what the caller wants to force).
* ``<root>/kanban/current`` a one-line text file holding the slug of
the "currently selected" board. Written by ``hermes kanban boards
switch <slug>``. When absent, the active board is ``default``.
In standard installs ``<root>`` is ``~/.hermes``. In Docker / custom
deployments where ``HERMES_HOME`` points outside ``~/.hermes`` (e.g.
``/opt/hermes``), ``<root>`` is ``HERMES_HOME``. Legacy env-var
overrides still work:
* ``HERMES_KANBAN_DB`` pin the database file path directly.
* ``HERMES_KANBAN_WORKSPACES_ROOT`` pin the workspaces root directly.
* ``HERMES_KANBAN_HOME`` pin the umbrella root that anchors kanban
paths. Useful for tests and unusual deployments.
The dispatcher injects ``HERMES_KANBAN_DB``,
``HERMES_KANBAN_WORKSPACES_ROOT``, and ``HERMES_KANBAN_BOARD`` into
worker subprocess env so workers converge on the exact DB the
dispatcher used to claim their task even under unusual symlink or
Docker layouts.
Schema is intentionally small: tasks, task_links, task_comments,
task_events. The ``workspace_kind`` field decouples coordination from git
@@ -15,6 +63,9 @@ transactions + compare-and-swap (CAS) updates on ``tasks.status`` and
``tasks.claim_lock``. SQLite serializes writers via its WAL lock, so at
most one claimer can win any given task. Losers observe zero affected
rows and move on -- no retry loops, no distributed-lock machinery.
The CAS coordination is **per-board** each board is a separate DB,
so multi-board installs get the same atomicity guarantees without any
new locking.
"""
from __future__ import annotations
@@ -22,6 +73,7 @@ from __future__ import annotations
import contextlib
import json
import os
import re
import secrets
import sqlite3
import sys
@@ -61,16 +113,438 @@ _CTX_MAX_COMMENT_BYTES = 2 * 1024 # 2 KB per comment
# Paths
# ---------------------------------------------------------------------------
def kanban_db_path() -> Path:
"""Return the path to ``kanban.db`` inside the active HERMES_HOME."""
from hermes_constants import get_hermes_home
return get_hermes_home() / "kanban.db"
DEFAULT_BOARD = "default"
# Slug validator: lowercase alphanumerics, digits, hyphens; 164 chars.
# Strict enough to stop traversal (`..`) and embedded path separators, loose
# enough that kebab-case names like ``atm10-server`` or ``hermes-agent``
# pass without fuss. Board names with display formatting (spaces, emoji)
# live in ``board.json``; the slug is just the directory name.
_BOARD_SLUG_RE = re.compile(r"^[a-z0-9][a-z0-9\-_]{0,63}$")
def workspaces_root() -> Path:
"""Return the directory under which ``scratch`` workspaces are created."""
from hermes_constants import get_hermes_home
return get_hermes_home() / "kanban" / "workspaces"
def _normalize_board_slug(slug: Optional[str]) -> Optional[str]:
"""Lowercase + strip a slug; validate; return ``None`` for empty."""
if slug is None:
return None
s = str(slug).strip().lower()
if not s:
return None
if not _BOARD_SLUG_RE.match(s):
raise ValueError(
f"invalid board slug {slug!r}: must be 1-64 chars, lowercase "
f"alphanumerics / hyphens / underscores, not starting with '-' or '_'"
)
return s
def kanban_home() -> Path:
"""Return the shared Hermes root that anchors the kanban board.
Resolution order:
1. ``HERMES_KANBAN_HOME`` env var when set and non-empty (explicit
override for tests and unusual deployments).
2. ``get_default_hermes_root()``, which already returns ``<root>``
when ``HERMES_HOME`` is ``<root>/profiles/<name>``, and returns
``HERMES_HOME`` directly for Docker / custom deployments.
The kanban board is shared across profiles **by design** (see the
module docstring). Resolving the kanban paths through the active
profile's ``HERMES_HOME`` would silently fork the board per profile,
which breaks the dispatcher / worker handoff.
"""
override = os.environ.get("HERMES_KANBAN_HOME", "").strip()
if override:
return Path(override).expanduser()
from hermes_constants import get_default_hermes_root
return get_default_hermes_root()
def boards_root() -> Path:
"""Return ``<root>/kanban/boards`` — the parent of non-default board dirs.
``default`` is intentionally NOT under this directory its DB lives at
``<root>/kanban.db`` for back-compat with pre-boards installs. This
function returns the directory where *additional* named boards live,
used by :func:`list_boards` to enumerate them.
"""
return kanban_home() / "kanban" / "boards"
def current_board_path() -> Path:
"""Return the path to ``<root>/kanban/current``.
One-line text file written by ``hermes kanban boards switch <slug>``
to persist the user's board selection across CLI invocations. Absent
by default (meaning: active board is ``default``).
"""
return kanban_home() / "kanban" / "current"
def get_current_board() -> str:
"""Return the active board slug, honouring the resolution chain.
Order (highest precedence first):
1. ``HERMES_KANBAN_BOARD`` env var (set by the dispatcher on worker
spawn, or manually for ad-hoc overrides).
2. ``<root>/kanban/current`` on disk (set by ``hermes kanban boards
switch``).
3. ``DEFAULT_BOARD`` (``"default"``).
A malformed slug at any step falls through to the next layer with a
best-effort warning the dispatcher must never crash because a user
hand-edited a file.
"""
env = os.environ.get("HERMES_KANBAN_BOARD", "").strip()
if env:
try:
normed = _normalize_board_slug(env)
if normed:
return normed
except ValueError:
pass
try:
f = current_board_path()
if f.exists():
val = f.read_text(encoding="utf-8").strip()
if val:
try:
normed = _normalize_board_slug(val)
if normed:
return normed
except ValueError:
pass
except OSError:
pass
return DEFAULT_BOARD
def set_current_board(slug: str) -> Path:
"""Persist ``slug`` as the active board. Returns the file written.
Writes ``<root>/kanban/current``. The caller should validate the slug
exists first (via :func:`board_exists`) this function does not
so that ``hermes kanban boards switch <typo>`` returns an error
instead of silently pointing at nothing.
"""
normed = _normalize_board_slug(slug)
if not normed:
raise ValueError("board slug is required")
path = current_board_path()
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(normed + "\n", encoding="utf-8")
return path
def clear_current_board() -> None:
"""Remove ``<root>/kanban/current`` so the active board reverts to ``default``."""
try:
current_board_path().unlink()
except FileNotFoundError:
pass
def board_dir(board: Optional[str] = None) -> Path:
"""Return the on-disk directory for ``board``.
``default`` is ``<root>/kanban/boards/default/`` **for metadata only**
(board.json + workspaces/ + logs/). Its DB file stays at
``<root>/kanban.db`` for back-compat see :func:`kanban_db_path`.
All other boards live at ``<root>/kanban/boards/<slug>/`` with
everything inside that directory including the ``kanban.db``.
"""
slug = _normalize_board_slug(board) or DEFAULT_BOARD
return boards_root() / slug
def board_exists(board: Optional[str] = None) -> bool:
"""Return True if the board has a DB or a metadata dir on disk.
``default`` is considered to always exist its DB is created
on first :func:`connect` and there's no way for it to be missing
in a configuration where the kanban feature is usable at all.
"""
slug = _normalize_board_slug(board) or DEFAULT_BOARD
if slug == DEFAULT_BOARD:
return True
d = board_dir(slug)
return d.is_dir() or (d / "kanban.db").exists()
def kanban_db_path(board: Optional[str] = None) -> Path:
"""Return the path to the ``kanban.db`` for ``board``.
Resolution (highest precedence first):
1. ``HERMES_KANBAN_DB`` env var pins the path directly. Honoured for
back-compat and for the dispatcherworker handoff (defense in
depth: dispatcher injects this into worker env so workers are
immune to any path-resolution disagreement).
2. When ``board`` arg is None, the active board from
:func:`get_current_board` is used.
3. Board ``default`` ``<root>/kanban.db`` (back-compat path).
Other boards ``<root>/kanban/boards/<slug>/kanban.db``.
"""
override = os.environ.get("HERMES_KANBAN_DB", "").strip()
if override:
return Path(override).expanduser()
slug = _normalize_board_slug(board)
if slug is None:
slug = get_current_board()
if slug == DEFAULT_BOARD:
return kanban_home() / "kanban.db"
return board_dir(slug) / "kanban.db"
def workspaces_root(board: Optional[str] = None) -> Path:
"""Return the directory under which ``scratch`` workspaces are created.
Anchored per-board so workspaces don't leak between projects.
``HERMES_KANBAN_WORKSPACES_ROOT`` pins the path directly (highest
precedence) the dispatcher injects this into worker env.
``default`` keeps the legacy path ``<root>/kanban/workspaces/`` so
that existing scratch workspaces from before the boards feature are
preserved. Other boards use ``<root>/kanban/boards/<slug>/workspaces/``.
"""
override = os.environ.get("HERMES_KANBAN_WORKSPACES_ROOT", "").strip()
if override:
return Path(override).expanduser()
slug = _normalize_board_slug(board)
if slug is None:
slug = get_current_board()
if slug == DEFAULT_BOARD:
return kanban_home() / "kanban" / "workspaces"
return board_dir(slug) / "workspaces"
def worker_logs_dir(board: Optional[str] = None) -> Path:
"""Return the directory under which per-task worker logs are written.
``default`` keeps the legacy path ``<root>/kanban/logs/``. Other
boards use ``<root>/kanban/boards/<slug>/logs/``. Logs follow the
board makes ``hermes kanban log`` unambiguous even when multiple
boards have tasks with the same id.
"""
slug = _normalize_board_slug(board)
if slug is None:
slug = get_current_board()
if slug == DEFAULT_BOARD:
return kanban_home() / "kanban" / "logs"
return board_dir(slug) / "logs"
def board_metadata_path(board: Optional[str] = None) -> Path:
"""Return the path to ``board.json`` for ``board``.
Stores display metadata (display name, description, icon, color,
created_at). The on-disk slug is the canonical identity; this file
is purely for presentation in the CLI / dashboard.
"""
slug = _normalize_board_slug(board) or DEFAULT_BOARD
return board_dir(slug) / "board.json"
def _default_board_display_name(slug: str) -> str:
"""Turn a slug into a reasonable default display name.
``atm10-server`` ``Atm10 Server``. Users can override via
``board.json`` but the default should look presentable in the
dashboard without any follow-up editing.
"""
return " ".join(part.capitalize() for part in slug.replace("_", "-").split("-") if part) or slug
def read_board_metadata(board: Optional[str] = None) -> dict:
"""Return ``board.json`` contents (or synthesized defaults).
Never raises a missing / malformed ``board.json`` falls back to a
synthesised entry so the dashboard always has something to render.
Includes the canonical ``slug`` and ``db_path`` so the caller
doesn't need to reconstruct them.
"""
slug = _normalize_board_slug(board) or DEFAULT_BOARD
meta: dict[str, Any] = {
"slug": slug,
"name": _default_board_display_name(slug),
"description": "",
"icon": "",
"color": "",
"created_at": None,
"archived": False,
}
try:
p = board_metadata_path(slug)
if p.exists():
raw = json.loads(p.read_text(encoding="utf-8"))
if isinstance(raw, dict):
# Never let the metadata file claim a different slug than
# its directory — trust the filesystem.
raw["slug"] = slug
meta.update(raw)
except (OSError, json.JSONDecodeError):
pass
meta["db_path"] = str(kanban_db_path(slug))
return meta
def write_board_metadata(
board: Optional[str],
*,
name: Optional[str] = None,
description: Optional[str] = None,
icon: Optional[str] = None,
color: Optional[str] = None,
archived: Optional[bool] = None,
) -> dict:
"""Create / update ``board.json`` for ``board``.
Preserves any existing fields not mentioned in the call. Sets
``created_at`` on first write. Returns the resulting metadata dict.
"""
slug = _normalize_board_slug(board) or DEFAULT_BOARD
meta = read_board_metadata(slug)
# Preserve existing DB-derived fields — they get re-computed each
# read but shouldn't be written into board.json.
meta.pop("db_path", None)
if name is not None:
meta["name"] = str(name).strip() or _default_board_display_name(slug)
if description is not None:
meta["description"] = str(description)
if icon is not None:
meta["icon"] = str(icon)
if color is not None:
meta["color"] = str(color)
if archived is not None:
meta["archived"] = bool(archived)
if not meta.get("created_at"):
meta["created_at"] = int(time.time())
path = board_metadata_path(slug)
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(
json.dumps(meta, indent=2, ensure_ascii=False) + "\n",
encoding="utf-8",
)
meta["db_path"] = str(kanban_db_path(slug))
return meta
def create_board(
slug: str,
*,
name: Optional[str] = None,
description: Optional[str] = None,
icon: Optional[str] = None,
color: Optional[str] = None,
) -> dict:
"""Create a new board directory + DB + metadata. Idempotent.
Returns the resulting metadata. Raises :class:`ValueError` for a
malformed slug; returns the existing metadata (not an error) if the
board already exists matching ``mkdir -p`` semantics.
"""
normed = _normalize_board_slug(slug)
if not normed:
raise ValueError("board slug is required")
meta = write_board_metadata(
normed,
name=name,
description=description,
icon=icon,
color=color,
)
# Touch the DB so list_boards() sees it immediately.
init_db(board=normed)
return meta
def list_boards(*, include_archived: bool = True) -> list[dict]:
"""Enumerate all boards that exist on disk.
Always includes ``default`` (even when the ``boards/default/``
metadata dir doesn't exist, because its DB is at the legacy path).
Other boards are discovered by scanning ``boards/`` for subdirectories
that either contain a ``kanban.db`` or a ``board.json``.
Returns a list of metadata dicts, sorted with ``default`` first and
the rest alphabetically.
"""
entries: list[dict] = []
seen: set[str] = set()
# Default board is always first.
entries.append(read_board_metadata(DEFAULT_BOARD))
seen.add(DEFAULT_BOARD)
root = boards_root()
if root.is_dir():
for child in sorted(root.iterdir(), key=lambda p: p.name.lower()):
if not child.is_dir():
continue
slug = child.name
# Keep slug normalisation soft for discovery — but skip dirs
# that don't parse as valid slugs so we don't surface junk.
try:
normed = _normalize_board_slug(slug)
except ValueError:
continue
if not normed or normed in seen:
continue
has_db = (child / "kanban.db").exists()
has_meta = (child / "board.json").exists()
if not (has_db or has_meta):
continue
meta = read_board_metadata(normed)
if meta.get("archived") and not include_archived:
continue
entries.append(meta)
seen.add(normed)
return entries
def remove_board(slug: str, *, archive: bool = True) -> dict:
"""Remove or archive a board.
``archive=True`` (default) moves the board's directory to
``<root>/kanban/boards/_archived/<slug>-<timestamp>/`` so the data
is recoverable. ``archive=False`` deletes the directory outright.
The ``default`` board cannot be removed raises :class:`ValueError`.
Returns a summary dict describing what happened (``{"slug", "action",
"new_path"}``).
"""
normed = _normalize_board_slug(slug)
if not normed:
raise ValueError("board slug is required")
if normed == DEFAULT_BOARD:
raise ValueError("the 'default' board cannot be removed")
d = board_dir(normed)
if not d.exists():
raise ValueError(f"board {normed!r} does not exist")
# If the user removed the currently-active board, revert to default.
if get_current_board() == normed:
clear_current_board()
if archive:
archive_root = boards_root() / "_archived"
archive_root.mkdir(parents=True, exist_ok=True)
ts = int(time.time())
target = archive_root / f"{normed}-{ts}"
# Avoid collision on rapid double-archives.
suffix = 1
while target.exists():
target = archive_root / f"{normed}-{ts}-{suffix}"
suffix += 1
d.rename(target)
return {"slug": normed, "action": "archived", "new_path": str(target)}
else:
import shutil
shutil.rmtree(d)
return {"slug": normed, "action": "deleted", "new_path": ""}
# ---------------------------------------------------------------------------
@@ -368,7 +842,11 @@ CREATE INDEX IF NOT EXISTS idx_notify_task ON kanban_notify_subs(task_
_INITIALIZED_PATHS: set[str] = set()
def connect(db_path: Optional[Path] = None) -> sqlite3.Connection:
def connect(
db_path: Optional[Path] = None,
*,
board: Optional[str] = None,
) -> sqlite3.Connection:
"""Open (and initialize if needed) the kanban DB.
WAL mode is enabled on every connection; it's a no-op after the first
@@ -378,8 +856,19 @@ def connect(db_path: Optional[Path] = None) -> sqlite3.Connection:
fresh installs and test harnesses that construct `connect()`
directly don't have to remember a separate init step. Subsequent
connections skip the schema check via a module-level path cache.
Path resolution:
* ``db_path`` explicit used as-is (legacy callers, tests).
* ``board`` explicit resolves to that board's DB.
* Neither :func:`kanban_db_path` resolves via
``HERMES_KANBAN_DB`` env ``HERMES_KANBAN_BOARD`` env
``<root>/kanban/current`` ``default``.
"""
path = db_path or kanban_db_path()
if db_path is not None:
path = db_path
else:
path = kanban_db_path(board=board)
path.parent.mkdir(parents=True, exist_ok=True)
resolved = str(path.resolve())
needs_init = resolved not in _INITIALIZED_PATHS
@@ -398,7 +887,11 @@ def connect(db_path: Optional[Path] = None) -> sqlite3.Connection:
return conn
def init_db(db_path: Optional[Path] = None) -> Path:
def init_db(
db_path: Optional[Path] = None,
*,
board: Optional[str] = None,
) -> Path:
"""Create the schema if it doesn't exist; return the path used.
Kept as a public entry point so CLI ``hermes kanban init`` and the
@@ -409,7 +902,10 @@ def init_db(db_path: Optional[Path] = None) -> Path:
external tools that upgrade an old DB file can call this to
force re-migration.
"""
path = db_path or kanban_db_path()
if db_path is not None:
path = db_path
else:
path = kanban_db_path(board=board)
path.parent.mkdir(parents=True, exist_ok=True)
resolved = str(path.resolve())
# Clear the cache entry so the underlying connect() re-runs the
@@ -590,6 +1086,15 @@ def _claimer_id() -> str:
# Task creation / mutation
# ---------------------------------------------------------------------------
def _canonical_assignee(assignee: Optional[str]) -> Optional[str]:
"""Lowercase-assignee normalization for Kanban rows (dashboard/CLI parity)."""
if assignee is None:
return None
from hermes_cli.profiles import normalize_profile_name
return normalize_profile_name(assignee)
def create_task(
conn: sqlite3.Connection,
*,
@@ -631,6 +1136,7 @@ def create_task(
(e.g. ``skills=["translation"]`` so the worker loads the
translation skill regardless of the profile's default config).
"""
assignee = _canonical_assignee(assignee)
if not title or not title.strip():
raise ValueError("title is required")
if workspace_kind not in VALID_WORKSPACE_KINDS:
@@ -795,7 +1301,7 @@ def list_tasks(
params: list[Any] = []
if assignee is not None:
query += " AND assignee = ?"
params.append(assignee)
params.append(_canonical_assignee(assignee))
if status is not None:
if status not in VALID_STATUSES:
raise ValueError(f"status must be one of {sorted(VALID_STATUSES)}")
@@ -819,6 +1325,7 @@ def assign_task(conn: sqlite3.Connection, task_id: str, profile: Optional[str])
Refuses to reassign a task that's currently running (claim_lock set).
Reassign after the current run completes if needed.
"""
profile = _canonical_assignee(profile)
with write_txn(conn):
row = conn.execute(
"SELECT status, claim_lock FROM tasks WHERE id = ?", (task_id,)
@@ -1513,15 +2020,18 @@ def archive_task(conn: sqlite3.Connection, task_id: str) -> bool:
# Workspace resolution
# ---------------------------------------------------------------------------
def resolve_workspace(task: Task) -> Path:
def resolve_workspace(task: Task, *, board: Optional[str] = None) -> Path:
"""Resolve (and create if needed) the workspace for a task.
- ``scratch``: a fresh dir under ``$HERMES_HOME/kanban/workspaces/<id>/``.
- ``scratch``: a fresh dir under ``<board-root>/workspaces/<id>/``,
where ``<board-root>`` is the active board's root. The path is the
same for the dispatcher and every profile worker, so handoff is
path-stable.
- ``dir:<path>``: the path stored in ``workspace_path``. Created
if missing. MUST be absolute relative paths are rejected to
prevent confused-deputy traversal where ``../../../tmp/attacker``
resolves against the dispatcher's CWD instead of a meaningful
root. Users who want a HERMES_HOME-relative workspace should
root. Users who want a kanban-root-relative workspace should
compute the absolute path themselves.
- ``worktree``: a git worktree at ``workspace_path``. Not created
automatically in v1 -- the kanban-worker skill documents
@@ -1543,7 +2053,7 @@ def resolve_workspace(task: Task) -> Path:
f"{task.workspace_path!r}; workspace paths must be absolute"
)
else:
p = workspaces_root() / task.id
p = workspaces_root(board=board) / task.id
p.mkdir(parents=True, exist_ok=True)
return p
if kind == "dir":
@@ -1957,6 +2467,7 @@ def dispatch_once(
dry_run: bool = False,
max_spawn: Optional[int] = None,
failure_limit: int = DEFAULT_SPAWN_FAILURE_LIMIT,
board: Optional[str] = None,
) -> DispatchResult:
"""Run one dispatcher tick.
@@ -1965,15 +2476,17 @@ def dispatch_once(
2. Reclaim crashed running tasks (host-local PID no longer alive).
3. Promote todo -> ready where all parents are done.
4. For each ready task with an assignee, atomically claim and call
``spawn_fn(task, workspace_path) -> Optional[int]``. The return
value (if any) is recorded as ``worker_pid`` so subsequent ticks
can detect crashes before the TTL expires.
``spawn_fn(task, workspace_path, board) -> Optional[int]``. The
return value (if any) is recorded as ``worker_pid`` so subsequent
ticks can detect crashes before the TTL expires.
Spawn failures are counted per-task. After ``failure_limit`` consecutive
failures the task is auto-blocked with the last error as its reason
prevents the dispatcher from thrashing forever on an unfixable task.
``spawn_fn`` defaults to ``_default_spawn``. Tests pass a stub.
``board`` pins workspace/log/db resolution for this tick to a specific
board. When omitted, the current-board resolution chain is used.
"""
result = DispatchResult()
result.reclaimed = release_stale_claims(conn)
@@ -2000,7 +2513,7 @@ def dispatch_once(
if claimed is None:
continue
try:
workspace = resolve_workspace(claimed)
workspace = resolve_workspace(claimed, board=board)
except Exception as exc:
auto = _record_spawn_failure(
conn, claimed.id, f"workspace: {exc}",
@@ -2013,7 +2526,18 @@ def dispatch_once(
set_workspace_path(conn, claimed.id, str(workspace))
_spawn = spawn_fn if spawn_fn is not None else _default_spawn
try:
pid = _spawn(claimed, str(workspace))
# Back-compat: older spawn_fn signatures accept only
# (task, workspace). Test stubs in the suite rely on that.
# Introspect the callable and pass `board` only when supported.
import inspect
try:
sig = inspect.signature(_spawn)
if "board" in sig.parameters:
pid = _spawn(claimed, str(workspace), board=board)
else:
pid = _spawn(claimed, str(workspace))
except (TypeError, ValueError):
pid = _spawn(claimed, str(workspace))
if pid:
_set_worker_pid(conn, claimed.id, int(pid))
_clear_spawn_failures(conn, claimed.id)
@@ -2052,33 +2576,60 @@ def _rotate_worker_log(log_path: Path, max_bytes: int) -> None:
pass
def _default_spawn(task: Task, workspace: str) -> Optional[int]:
def _default_spawn(
task: Task,
workspace: str,
*,
board: Optional[str] = None,
) -> Optional[int]:
"""Fire-and-forget ``hermes -p <profile> chat -q ...`` subprocess.
Returns the spawned child's PID so the dispatcher can detect crashes
before the claim TTL expires. The child's completion is still observed
via the ``complete`` / ``block`` transitions the worker writes itself;
the PID check is a safety net for crashes, OOM kills, and Ctrl+C.
``board`` pins the child's kanban context to that board: the child's
``HERMES_KANBAN_DB`` / ``HERMES_KANBAN_BOARD`` / workspaces_root env
vars all resolve to the same board the dispatcher claimed the task
from. Workers cannot accidentally see other boards.
"""
import subprocess
if not task.assignee:
raise ValueError(f"task {task.id} has no assignee")
from hermes_cli.profiles import normalize_profile_name
profile_arg = normalize_profile_name(task.assignee)
prompt = f"work kanban task {task.id}"
env = dict(os.environ)
if task.tenant:
env["HERMES_TENANT"] = task.tenant
env["HERMES_KANBAN_TASK"] = task.id
env["HERMES_KANBAN_WORKSPACE"] = workspace
# Pin the shared board + workspaces root the dispatcher resolved, so
# that even when the worker activates a profile (`hermes -p <name>`
# rewrites HERMES_HOME), its kanban paths still match the
# dispatcher's. Belt-and-braces with the `get_default_hermes_root()`
# resolution in `kanban_home()` — symmetric resolution is the norm,
# but unusual symlink / Docker layouts are caught here too.
env["HERMES_KANBAN_DB"] = str(kanban_db_path(board=board))
env["HERMES_KANBAN_WORKSPACES_ROOT"] = str(workspaces_root(board=board))
# Board slug — the final defense-in-depth pin. If the worker ever
# resolves kanban paths without the DB / workspaces env vars, the
# board slug still forces it to the right directory.
resolved_board = _normalize_board_slug(board) or get_current_board()
env["HERMES_KANBAN_BOARD"] = resolved_board
# HERMES_PROFILE is the author the kanban_comment tool defaults to.
# `hermes -p <assignee>` activates the profile, but the env var is
# what the tool reads — set it explicitly here so comments are
# attributed correctly regardless of how the child loads config.
env["HERMES_PROFILE"] = task.assignee
env["HERMES_PROFILE"] = profile_arg
cmd = [
"hermes",
"-p", task.assignee,
"-p", profile_arg,
# Auto-load the kanban-worker skill so every dispatched worker
# has the pattern library (good summary/metadata shapes, retry
# diagnostics, block-reason examples) in its context, even if
@@ -2104,9 +2655,11 @@ def _default_spawn(task: Task, workspace: str) -> Optional[int]:
"chat",
"-q", prompt,
])
# Redirect output to a per-task log under HERMES_HOME/kanban/logs/.
from hermes_constants import get_hermes_home
log_dir = get_hermes_home() / "kanban" / "logs"
# Redirect output to a per-task log under <board-root>/logs/.
# Anchored at the board root (not the shared kanban root), so
# `hermes kanban log` on a specific board reads its own file and
# logs don't collide across boards that happen to share task ids.
log_dir = worker_logs_dir(board=board)
log_dir.mkdir(parents=True, exist_ok=True)
log_path = log_dir / f"{task.id}.log"
_rotate_worker_log(log_path, DEFAULT_LOG_ROTATE_BYTES)
@@ -2587,12 +3140,14 @@ def gc_events(
def gc_worker_logs(
*, older_than_seconds: int = 30 * 24 * 3600,
board: Optional[str] = None,
) -> int:
"""Delete worker log files older than ``older_than_seconds``. Returns
the number of files removed. Kept separate from ``gc_events`` because
log files live on disk, not in SQLite."""
from hermes_constants import get_hermes_home
log_dir = get_hermes_home() / "kanban" / "logs"
log files live on disk, not in SQLite. Scoped to ``board`` (defaults
to the active board) per-board isolation means deleting logs from
board A cannot touch board B's logs."""
log_dir = worker_logs_dir(board=board)
if not log_dir.exists():
return 0
cutoff = time.time() - older_than_seconds
@@ -2611,20 +3166,25 @@ def gc_worker_logs(
# Worker log accessor
# ---------------------------------------------------------------------------
def worker_log_path(task_id: str) -> Path:
def worker_log_path(task_id: str, *, board: Optional[str] = None) -> Path:
"""Return the path to a worker's log file. The file may not exist
(task never spawned, or log already GC'd)."""
from hermes_constants import get_hermes_home
return get_hermes_home() / "kanban" / "logs" / f"{task_id}.log"
(task never spawned, or log already GC'd).
When ``board`` is None, resolves via the active board (env var
current-board file default). The dispatcher always passes the
board explicitly to avoid any resolution ambiguity when multiple
boards exist."""
return worker_logs_dir(board=board) / f"{task_id}.log"
def read_worker_log(
task_id: str, *, tail_bytes: Optional[int] = None,
board: Optional[str] = None,
) -> Optional[str]:
"""Read the worker log for ``task_id``. Returns None if the file
doesn't exist. If ``tail_bytes`` is set, only the last N bytes are
returned (useful for the dashboard drawer which shouldn't page megabytes)."""
path = worker_log_path(task_id)
path = worker_log_path(task_id, board=board)
if not path.exists():
return None
try:
@@ -2661,7 +3221,8 @@ def list_profiles_on_disk() -> list[str]:
``config.yaml`` a bare dir without config isn't a real profile.
"""
try:
home = Path.home() / ".hermes" / "profiles"
from hermes_constants import get_default_hermes_root
home = get_default_hermes_root() / "profiles"
except Exception:
return []
if not home.is_dir():
+115 -23
View File
@@ -114,6 +114,16 @@ def _apply_profile_override() -> None:
consume = 1
break
# 1b. Reject values that can't be valid profile names (e.g. pytest's
# "-p no:xdist" would be misread as profile "no:xdist" otherwise).
# Mirrors hermes_cli.profiles._PROFILE_ID_RE so we never call
# resolve_profile_env() with a value it must reject + sys.exit on.
if profile_name is not None and consume == 2:
import re as _re
if not _re.match(r"^[a-z0-9][a-z0-9_-]{0,63}$", profile_name):
profile_name = None
consume = 0
# 1.5 If HERMES_HOME is already set and no explicit flag was given, trust it.
# This lets child processes (relaunch, subprocess) inherit the parent's
# profile choice without having to pass --profile again.
@@ -289,7 +299,7 @@ def _has_any_provider_configured() -> bool:
env_file = get_env_path()
if env_file.exists():
try:
for line in env_file.read_text().splitlines():
for line in env_file.read_text(encoding="utf-8").splitlines():
line = line.strip()
if line.startswith("#") or "=" not in line:
continue
@@ -837,7 +847,17 @@ def _print_tui_exit_summary(session_id: Optional[str], active_session_file: Opti
)
_NPM_LOCK_RUNTIME_KEYS = frozenset({"ideallyInert"})
_NPM_LOCK_RUNTIME_KEYS = frozenset({"ideallyInert", "peer"})
"""Lockfile fields npm writes non-deterministically at install time.
``ideallyInert`` is npm's runtime annotation for packages it skipped installing
(per-platform opt-outs). ``peer`` is dropped from the hidden ``.package-lock.json``
on dev-dependencies that are *also* declared as peers the canonical
``package-lock.json`` records the dual role, but npm 9's actualized tree strips
it. Neither key represents a real skew between what was declared and what was
installed, so we exclude them from the comparison in :func:`_tui_need_npm_install`
to avoid false-positive reinstalls on every launch.
"""
def _tui_need_npm_install(root: Path) -> bool:
@@ -1042,17 +1062,21 @@ def _make_tui_argv(tui_dir: Path, tui_dev: bool) -> tuple[list[str], Path]:
if _tui_need_npm_install(tui_dir):
if not os.environ.get("HERMES_QUIET"):
print("Installing TUI dependencies…")
# Capture stdout as well as stderr — some npm errors (notably EACCES on a
# root-owned node_modules in containers) are emitted on stdout, and a
# bare "npm install failed." with no preview defeats debugging. We keep
# the failure-only print path so a successful install stays silent.
result = subprocess.run(
[npm, "install", "--silent", "--no-fund", "--no-audit", "--progress=false"],
cwd=str(tui_dir),
stdout=subprocess.DEVNULL,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
env={**os.environ, "CI": "1"},
)
if result.returncode != 0:
err = (result.stderr or "").strip()
preview = "\n".join(err.splitlines()[-30:])
combined = f"{result.stdout or ''}\n{result.stderr or ''}".strip()
preview = "\n".join(combined.splitlines()[-30:])
print("npm install failed.")
if preview:
print(preview)
@@ -3399,10 +3423,10 @@ def _model_flow_named_custom(config, provider_info):
print()
print("Fetching available models...")
models = fetch_api_models(
api_key, base_url, timeout=8.0,
api_mode=api_mode or None,
)
fetch_kwargs = {"timeout": 8.0}
if api_mode:
fetch_kwargs["api_mode"] = api_mode
models = fetch_api_models(api_key, base_url, **fetch_kwargs)
if models:
default_idx = 0
@@ -6471,13 +6495,29 @@ def _cmd_update_check():
if sys.platform == "win32":
git_cmd = ["git", "-c", "windows.appendAtomically=false"]
print("→ Fetching from origin...")
# Fetch both origin and upstream; prefer upstream as the canonical reference
print("→ Fetching from upstream...")
fetch_result = subprocess.run(
git_cmd + ["fetch", "origin"],
git_cmd + ["fetch", "upstream"],
cwd=PROJECT_ROOT,
capture_output=True,
text=True,
)
if fetch_result.returncode != 0:
# Fallback to origin if upstream doesn't exist
print("→ Fetching from origin...")
fetch_result = subprocess.run(
git_cmd + ["fetch", "origin"],
cwd=PROJECT_ROOT,
capture_output=True,
text=True,
)
upstream_exists = False
compare_branch = "origin/main"
else:
upstream_exists = True
compare_branch = "upstream/main"
if fetch_result.returncode != 0:
stderr = fetch_result.stderr.strip()
if "Could not resolve host" in stderr or "unable to access" in stderr:
@@ -6485,13 +6525,13 @@ def _cmd_update_check():
elif "Authentication failed" in stderr or "could not read Username" in stderr:
print("✗ Authentication failed — check your git credentials or SSH key.")
else:
print("✗ Failed to fetch from origin.")
print("✗ Failed to fetch.")
if stderr:
print(f" {stderr.splitlines()[0]}")
sys.exit(1)
rev_result = subprocess.run(
git_cmd + ["rev-list", "HEAD..origin/main", "--count"],
git_cmd + ["rev-list", f"HEAD..{compare_branch}", "--count"],
cwd=PROJECT_ROOT,
capture_output=True,
text=True,
@@ -6503,7 +6543,7 @@ def _cmd_update_check():
print("✓ Already up to date.")
else:
commits_word = "commit" if behind == 1 else "commits"
print(f"⚕ Update available: {behind} {commits_word} behind origin/main.")
print(f"⚕ Update available: {behind} {commits_word} behind {compare_branch}.")
from hermes_cli.config import recommended_update_command
print(f" Run '{recommended_update_command()}' to install.")
@@ -7029,20 +7069,22 @@ def _cmd_update_impl(args, gateway_mode: bool):
except Exception as e:
logger.debug("Skills sync during update failed: %s", e)
# Sync bundled skills to all other profiles
# Sync bundled skills to all profiles (including the active one).
# seed_profile_skills() uses subprocess with an explicit HERMES_HOME so
# it is not affected by sync_skills()'s module-level HERMES_HOME cache,
# which means the active profile is reliably synced regardless of whether
# the caller's HERMES_HOME env var points at the default or a named profile.
try:
from hermes_cli.profiles import (
list_profiles,
get_active_profile_name,
seed_profile_skills,
)
active = get_active_profile_name()
other_profiles = [p for p in list_profiles() if p.name != active]
if other_profiles:
all_profiles = list_profiles()
if all_profiles:
print()
print("→ Syncing bundled skills to other profiles...")
for p in other_profiles:
print("→ Syncing bundled skills to all profiles...")
for p in all_profiles:
try:
r = seed_profile_skills(p.path, quiet=True)
if r:
@@ -8638,7 +8680,24 @@ def main():
)
cron_create.add_argument(
"--script",
help="Path to a Python script whose stdout is injected into the prompt each run",
help=(
"Path to a script under ~/.hermes/scripts/. Default mode: "
"script stdout is injected into the agent's prompt each run. "
"With --no-agent: the script IS the job and its stdout is "
"delivered verbatim. .sh/.bash files run via bash, everything "
"else via Python."
),
)
cron_create.add_argument(
"--no-agent",
dest="no_agent",
action="store_true",
default=False,
help=(
"Skip the LLM entirely — run --script on schedule and deliver "
"its stdout directly. Empty stdout = silent. Classic watchdog "
"pattern (memory alerts, disk alerts, CI pings)."
),
)
cron_create.add_argument(
"--workdir",
@@ -8680,7 +8739,29 @@ def main():
)
cron_edit.add_argument(
"--script",
help="Path to a Python script whose stdout is injected into the prompt each run. Pass empty string to clear.",
help=(
"Path to a script under ~/.hermes/scripts/. Pass empty string to clear. "
"With --no-agent the script IS the job; otherwise its stdout is "
"injected into the agent's prompt each run."
),
)
cron_edit.add_argument(
"--no-agent",
dest="no_agent",
action="store_const",
const=True,
default=None,
help=(
"Enable no-agent mode on this job (requires --script or an "
"existing script on the job)."
),
)
cron_edit.add_argument(
"--agent",
dest="no_agent",
action="store_const",
const=False,
help="Disable no-agent mode on this job (reverts to LLM-driven execution).",
)
cron_edit.add_argument(
"--workdir",
@@ -8891,6 +8972,7 @@ Examples:
hermes debug share --lines 500 Include more log lines
hermes debug share --expire 30 Keep paste for 30 days
hermes debug share --local Print report locally (no upload)
hermes debug share --no-redact Disable upload-time secret redaction
hermes debug delete <url> Delete a previously uploaded paste
""",
)
@@ -8916,6 +8998,16 @@ Examples:
action="store_true",
help="Print the report locally instead of uploading",
)
share_parser.add_argument(
"--no-redact",
action="store_true",
help=(
"Disable upload-time secret redaction (default: redact). Logs "
"are normally run through agent.redact.redact_sensitive_text "
"with force=True before upload so credentials are not leaked "
"into the public paste service."
),
)
delete_parser = debug_sub.add_parser(
"delete",
help="Delete a paste uploaded by 'hermes debug share'",
+1 -1
View File
@@ -361,7 +361,7 @@ def _write_env_vars(env_path: Path, env_writes: dict) -> None:
existing_lines = []
if env_path.exists():
existing_lines = env_path.read_text().splitlines()
existing_lines = env_path.read_text(encoding="utf-8").splitlines()
updated_keys = set()
new_lines = []
+65 -16
View File
@@ -904,6 +904,26 @@ def switch_model(
if any(m.get("name") == new_model for m in cfg_models if isinstance(m, dict)):
override = True
break
# Also check custom_providers list — models declared there should be accepted
# even if the remote /v1/models endpoint doesn't list them.
if not override and custom_providers and isinstance(custom_providers, list):
for entry in custom_providers:
if not isinstance(entry, dict):
continue
# Match by provider slug (custom:<name>) or by base_url
entry_name = entry.get("name", "")
entry_slug = f"custom:{entry_name}" if entry_name else ""
entry_url = entry.get("base_url", "")
if entry_slug == target_provider or entry_url == base_url:
# Check if the requested model matches the entry's model
entry_model = entry.get("model", "")
entry_models = entry.get("models", {})
if new_model == entry_model:
override = True
break
if isinstance(entry_models, dict) and new_model in entry_models:
override = True
break
if override:
validation = {"accepted": True, "persist": True, "recognized": False, "message": validation.get("message", "")}
else:
@@ -1057,6 +1077,45 @@ def list_authenticated_providers(
if normed:
_builtin_endpoints.add(normed)
def _has_fast_aws_sdk_signal() -> bool:
"""Return True when explicit AWS auth config is present.
This intentionally avoids botocore's full credential chain. Provider
picker/model-switch discovery can run for non-Bedrock providers, and
botocore may otherwise probe EC2 IMDS (169.254.169.254) on local
machines before returning no credentials.
"""
if os.environ.get("AWS_BEARER_TOKEN_BEDROCK", "").strip():
return True
if (
os.environ.get("AWS_ACCESS_KEY_ID", "").strip()
and os.environ.get("AWS_SECRET_ACCESS_KEY", "").strip()
):
return True
return any(
os.environ.get(name, "").strip()
for name in (
"AWS_PROFILE",
"AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
"AWS_CONTAINER_CREDENTIALS_FULL_URI",
"AWS_WEB_IDENTITY_TOKEN_FILE",
)
)
def _has_aws_sdk_creds_for_listing(slug: str) -> bool:
"""Credential check for AWS SDK providers in non-runtime discovery."""
slug_norm = str(slug or "").strip().lower()
current_norm = str(current_provider or "").strip().lower()
if _has_fast_aws_sdk_signal():
return True
if slug_norm != current_norm:
return False
try:
from agent.bedrock_adapter import has_aws_credentials
return bool(has_aws_credentials())
except Exception:
return False
data = fetch_models_dev()
# Build curated model lists keyed by hermes provider ID
@@ -1184,7 +1243,9 @@ def list_authenticated_providers(
# Check if credentials exist
has_creds = False
if overlay.extra_env_vars:
if overlay.auth_type == "aws_sdk":
has_creds = _has_aws_sdk_creds_for_listing(hermes_slug)
elif overlay.extra_env_vars:
has_creds = any(os.environ.get(ev) for ev in overlay.extra_env_vars)
# Also check api_key_env_vars from PROVIDER_REGISTRY for api_key auth_type
if not has_creds and overlay.auth_type == "api_key":
@@ -1203,11 +1264,7 @@ def list_authenticated_providers(
from hermes_cli.auth import _load_auth_store
store = _load_auth_store()
providers_store = store.get("providers", {})
pool_store = store.get("credential_pool", {})
if store and (
pid in providers_store or hermes_slug in providers_store
or pid in pool_store or hermes_slug in pool_store
):
if store and (pid in providers_store or hermes_slug in providers_store):
has_creds = True
except Exception as exc:
logger.debug("Auth store check failed for %s: %s", pid, exc)
@@ -1303,11 +1360,7 @@ def list_authenticated_providers(
from hermes_cli.auth import _load_auth_store
_cp_store = _load_auth_store()
_cp_providers_store = _cp_store.get("providers", {})
_cp_pool_store = _cp_store.get("credential_pool", {})
if _cp_store and (
_cp.slug in _cp_providers_store
or _cp.slug in _cp_pool_store
):
if _cp_store and _cp.slug in _cp_providers_store:
_cp_has_creds = True
except Exception:
pass
@@ -1324,11 +1377,7 @@ def list_authenticated_providers(
# credentials come from the boto3 credential chain (env vars,
# ~/.aws/credentials, instance roles, etc.)
if not _cp_has_creds and _cp_config and getattr(_cp_config, "auth_type", "") == "aws_sdk":
try:
from agent.bedrock_adapter import has_aws_credentials
_cp_has_creds = has_aws_credentials()
except Exception:
pass
_cp_has_creds = _has_aws_sdk_creds_for_listing(_cp.slug)
if not _cp_has_creds:
continue
+34 -9
View File
@@ -1740,10 +1740,20 @@ def model_supports_fast_mode(model_id: Optional[str]) -> bool:
def _is_anthropic_fast_model(model_id: Optional[str]) -> bool:
"""Return True if the model is a Claude model eligible for Anthropic Fast Mode."""
"""Return True if the model is a Claude model eligible for Anthropic Fast Mode.
Fast mode is currently supported on Claude Opus 4.6 only. Per Anthropic's
docs (https://platform.claude.com/docs/en/build-with-claude/fast-mode):
"Fast mode is currently supported on Opus 4.6 only. Sending speed: fast
with an unsupported model returns an error." Opus 4.7 explicitly rejects
the ``speed`` parameter with HTTP 400.
"""
raw = _strip_vendor_prefix(str(model_id or ""))
base = raw.split(":")[0]
return base.startswith("claude-")
if not base.startswith("claude-"):
return False
# Only Opus 4.6 supports fast mode at present.
return "opus-4-6" in base or "opus-4.6" in base
def resolve_fast_mode_overrides(model_id: Optional[str]) -> dict[str, Any] | None:
@@ -2896,6 +2906,19 @@ def fetch_api_models(
_OLLAMA_CLOUD_CACHE_TTL = 3600 # 1 hour
def _strip_ollama_cloud_suffix(model_id: str) -> str:
"""Strip :cloud / -cloud suffixes that models.dev appends to Ollama Cloud IDs.
The live API uses clean IDs (e.g. 'kimi-k2.6') while models.dev sometimes
returns them as 'kimi-k2.6:cloud'. Normalising before the dedup merge
prevents duplicate entries in the merged model list.
"""
for suffix in (":cloud", "-cloud"):
if model_id.endswith(suffix):
return model_id[: -len(suffix)]
return model_id
def _ollama_cloud_cache_path() -> Path:
"""Return the path for the Ollama Cloud model cache."""
from hermes_constants import get_hermes_home
@@ -2991,9 +3014,10 @@ def fetch_ollama_cloud_models(
seen.add(m)
merged.append(m)
for m in mdev_models:
if m and m not in seen:
seen.add(m)
merged.append(m)
normalized = _strip_ollama_cloud_suffix(m)
if normalized and normalized not in seen:
seen.add(normalized)
merged.append(normalized)
if merged:
_save_ollama_cloud_cache(merged)
return merged
@@ -3087,7 +3111,7 @@ def validate_requested_model(
"message": f"Model `{requested}` was not found in LM Studio's model listing.",
}
if normalized == "custom":
if normalized == "custom" or normalized.startswith("custom:"):
# Try probing with correct auth for the api_mode.
if api_mode == "anthropic_messages":
probe = probe_api_models(api_key, base_url, api_mode=api_mode)
@@ -3185,11 +3209,12 @@ def validate_requested_model(
if suggestions:
suggestion_text = "\n Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
return {
"accepted": False,
"persist": False,
"accepted": True,
"persist": True,
"recognized": False,
"message": (
f"Model `{requested}` was not found in the OpenAI Codex model listing."
f"Note: `{requested}` was not found in the OpenAI Codex model listing. "
"It may still work if your ChatGPT/Codex account has access to a newer or hidden model ID."
f"{suggestion_text}"
),
}
+111 -73
View File
@@ -179,8 +179,33 @@ def _get_wrapper_dir() -> Path:
# Validation
# ---------------------------------------------------------------------------
def normalize_profile_name(name: str) -> str:
"""Return the canonical profile id used on disk and in CLI ``-p`` argv.
Named profiles are stored lowercase under ``profiles/<id>/``. The special
alias ``default`` is matched case-insensitively (``Default`` ``default``).
Dashboards and tools may pass title-cased display labels; normalize before
validation, assignment, and subprocess spawn (see issue #18498).
"""
if not isinstance(name, str):
name = str(name)
stripped = name.strip()
if not stripped:
raise ValueError("profile name cannot be empty")
if stripped.casefold() == "default":
return "default"
return stripped.lower()
def validate_profile_name(name: str) -> None:
"""Raise ``ValueError`` if *name* is not a valid profile identifier."""
"""Raise ``ValueError`` if *name* is not a valid profile identifier.
Validates the input as-given strict lowercase match. Callers that accept
mixed-case or title-cased input from users (dashboard UI, CLI args) should
call :func:`normalize_profile_name` first. This separation keeps validate
honest about what the on-disk directory name must look like, while
ingress-point normalization handles UX flexibility (see #18498).
"""
if name == "default":
return # special alias for ~/.hermes
if not _PROFILE_ID_RE.match(name):
@@ -192,16 +217,18 @@ def validate_profile_name(name: str) -> None:
def get_profile_dir(name: str) -> Path:
"""Resolve a profile name to its HERMES_HOME directory."""
if name == "default":
canon = normalize_profile_name(name)
if canon == "default":
return _get_default_hermes_home()
return _get_profiles_root() / name
return _get_profiles_root() / canon
def profile_exists(name: str) -> bool:
"""Check whether a profile directory exists."""
if name == "default":
canon = normalize_profile_name(name)
if canon == "default":
return True
return get_profile_dir(name).is_dir()
return get_profile_dir(canon).is_dir()
# ---------------------------------------------------------------------------
@@ -213,28 +240,29 @@ def check_alias_collision(name: str) -> Optional[str]:
Checks: reserved names, hermes subcommands, existing binaries in PATH.
"""
if name in _RESERVED_NAMES:
return f"'{name}' is a reserved name"
if name in _HERMES_SUBCOMMANDS:
return f"'{name}' conflicts with a hermes subcommand"
canon = normalize_profile_name(name)
if canon in _RESERVED_NAMES:
return f"'{canon}' is a reserved name"
if canon in _HERMES_SUBCOMMANDS:
return f"'{canon}' conflicts with a hermes subcommand"
# Check existing commands in PATH
wrapper_dir = _get_wrapper_dir()
try:
result = subprocess.run(
["which", name], capture_output=True, text=True, timeout=5,
["which", canon], capture_output=True, text=True, timeout=5,
)
if result.returncode == 0:
existing_path = result.stdout.strip()
# Allow overwriting our own wrappers
if existing_path == str(wrapper_dir / name):
if existing_path == str(wrapper_dir / canon):
try:
content = (wrapper_dir / name).read_text()
content = (wrapper_dir / canon).read_text()
if "hermes -p" in content:
return None # it's our wrapper, safe to overwrite
except Exception:
pass
return f"'{name}' conflicts with an existing command ({existing_path})"
return f"'{canon}' conflicts with an existing command ({existing_path})"
except (FileNotFoundError, subprocess.TimeoutExpired):
pass
@@ -252,6 +280,7 @@ def create_wrapper_script(name: str) -> Optional[Path]:
Returns the path to the created wrapper, or None if creation failed.
"""
canon = normalize_profile_name(name)
wrapper_dir = _get_wrapper_dir()
try:
wrapper_dir.mkdir(parents=True, exist_ok=True)
@@ -259,9 +288,9 @@ def create_wrapper_script(name: str) -> Optional[Path]:
print(f"⚠ Could not create {wrapper_dir}: {e}")
return None
wrapper_path = wrapper_dir / name
wrapper_path = wrapper_dir / canon
try:
wrapper_path.write_text(f'#!/bin/sh\nexec hermes -p {name} "$@"\n')
wrapper_path.write_text(f'#!/bin/sh\nexec hermes -p {canon} "$@"\n')
wrapper_path.chmod(wrapper_path.stat().st_mode | stat.S_IEXEC | stat.S_IXGRP | stat.S_IXOTH)
return wrapper_path
except OSError as e:
@@ -271,7 +300,7 @@ def create_wrapper_script(name: str) -> Optional[Path]:
def remove_wrapper_script(name: str) -> bool:
"""Remove the wrapper script for a profile. Returns True if removed."""
wrapper_path = _get_wrapper_dir() / name
wrapper_path = _get_wrapper_dir() / normalize_profile_name(name)
if wrapper_path.exists():
try:
# Verify it's our wrapper before removing
@@ -421,16 +450,17 @@ def create_profile(
Path
The newly created profile directory.
"""
validate_profile_name(name)
canon = normalize_profile_name(name)
validate_profile_name(canon)
if name == "default":
if canon == "default":
raise ValueError(
"Cannot create a profile named 'default' — it is the built-in profile (~/.hermes)."
)
profile_dir = get_profile_dir(name)
profile_dir = get_profile_dir(canon)
if profile_dir.exists():
raise FileExistsError(f"Profile '{name}' already exists at {profile_dir}")
raise FileExistsError(f"Profile '{canon}' already exists at {profile_dir}")
# Resolve clone source
source_dir = None
@@ -440,6 +470,7 @@ def create_profile(
from hermes_constants import get_hermes_home
source_dir = get_hermes_home()
else:
clone_from = normalize_profile_name(clone_from)
validate_profile_name(clone_from)
source_dir = get_profile_dir(clone_from)
if not source_dir.is_dir():
@@ -540,24 +571,25 @@ def delete_profile(name: str, yes: bool = False) -> Path:
Returns the path that was removed.
"""
validate_profile_name(name)
canon = normalize_profile_name(name)
validate_profile_name(canon)
if name == "default":
if canon == "default":
raise ValueError(
"Cannot delete the default profile (~/.hermes).\n"
"To remove everything, use: hermes uninstall"
)
profile_dir = get_profile_dir(name)
profile_dir = get_profile_dir(canon)
if not profile_dir.is_dir():
raise FileNotFoundError(f"Profile '{name}' does not exist.")
raise FileNotFoundError(f"Profile '{canon}' does not exist.")
# Show what will be deleted
model, provider = _read_config_model(profile_dir)
gw_running = _check_gateway_running(profile_dir)
skill_count = _count_skills(profile_dir)
print(f"\nProfile: {name}")
print(f"\nProfile: {canon}")
print(f"Path: {profile_dir}")
if model:
print(f"Model: {model}" + (f" ({provider})" if provider else ""))
@@ -569,7 +601,7 @@ def delete_profile(name: str, yes: bool = False) -> Path:
]
# Check for service
wrapper_path = _get_wrapper_dir() / name
wrapper_path = _get_wrapper_dir() / canon
has_wrapper = wrapper_path.exists()
if has_wrapper:
items.append(f"Command alias ({wrapper_path})")
@@ -584,16 +616,16 @@ def delete_profile(name: str, yes: bool = False) -> Path:
if not yes:
print()
try:
confirm = input(f"Type '{name}' to confirm: ").strip()
confirm = input(f"Type '{canon}' to confirm: ").strip()
except (KeyboardInterrupt, EOFError):
print("\nCancelled.")
return profile_dir
if confirm != name:
if confirm != canon:
print("Cancelled.")
return profile_dir
# 1. Disable service (prevents auto-restart)
_cleanup_gateway_service(name, profile_dir)
_cleanup_gateway_service(canon, profile_dir)
# 2. Stop running gateway
if gw_running:
@@ -601,7 +633,7 @@ def delete_profile(name: str, yes: bool = False) -> Path:
# 3. Remove wrapper script
if has_wrapper:
if remove_wrapper_script(name):
if remove_wrapper_script(canon):
print(f"✓ Removed {wrapper_path}")
# 4. Remove profile directory
@@ -614,13 +646,13 @@ def delete_profile(name: str, yes: bool = False) -> Path:
# 5. Clear active_profile if it pointed to this profile
try:
active = get_active_profile()
if active == name:
if active == canon:
set_active_profile("default")
print("✓ Active profile reset to default")
except Exception:
pass
print(f"\nProfile '{name}' deleted.")
print(f"\nProfile '{canon}' deleted.")
return profile_dir
@@ -730,22 +762,23 @@ def set_active_profile(name: str) -> None:
Writes to ``~/.hermes/active_profile``. Use ``"default"`` to clear.
"""
validate_profile_name(name)
if name != "default" and not profile_exists(name):
canon = normalize_profile_name(name)
validate_profile_name(canon)
if canon != "default" and not profile_exists(canon):
raise FileNotFoundError(
f"Profile '{name}' does not exist. "
f"Create it with: hermes profile create {name}"
f"Profile '{canon}' does not exist. "
f"Create it with: hermes profile create {canon}"
)
path = _get_active_profile_path()
path.parent.mkdir(parents=True, exist_ok=True)
if name == "default":
if canon == "default":
# Remove the file to indicate default
path.unlink(missing_ok=True)
else:
# Atomic write
tmp = path.with_suffix(".tmp")
tmp.write_text(name + "\n")
tmp.write_text(canon + "\n")
tmp.replace(path)
@@ -811,16 +844,17 @@ def export_profile(name: str, output_path: str) -> Path:
"""
import tempfile
validate_profile_name(name)
profile_dir = get_profile_dir(name)
canon = normalize_profile_name(name)
validate_profile_name(canon)
profile_dir = get_profile_dir(canon)
if not profile_dir.is_dir():
raise FileNotFoundError(f"Profile '{name}' does not exist.")
raise FileNotFoundError(f"Profile '{canon}' does not exist.")
output = Path(output_path)
# shutil.make_archive wants the base name without extension
base = str(output).removesuffix(".tar.gz").removesuffix(".tgz")
if name == "default":
if canon == "default":
# The default profile IS ~/.hermes itself — its parent is ~/ and its
# directory name is ".hermes", not "default". We stage a clean copy
# under a temp dir so the archive contains ``default/...``.
@@ -836,14 +870,14 @@ def export_profile(name: str, output_path: str) -> Path:
# Named profiles — stage a filtered copy to exclude credentials
with tempfile.TemporaryDirectory() as tmpdir:
staged = Path(tmpdir) / name
staged = Path(tmpdir) / canon
_CREDENTIAL_FILES = {"auth.json", ".env"}
shutil.copytree(
profile_dir,
staged,
ignore=lambda d, contents: _CREDENTIAL_FILES & set(contents),
)
result = shutil.make_archive(base, "gztar", tmpdir, name)
result = shutil.make_archive(base, "gztar", tmpdir, canon)
return Path(result)
@@ -952,16 +986,17 @@ def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
# Archives exported from the default profile have "default/" as top-level
# dir. Importing as "default" would target ~/.hermes itself — disallow
# that and guide the user toward a named profile.
if inferred_name == "default":
canon = normalize_profile_name(inferred_name)
validate_profile_name(canon)
if canon == "default":
raise ValueError(
"Cannot import as 'default' — that is the built-in root profile (~/.hermes). "
"Specify a different name: hermes profile import <archive> --name <name>"
)
validate_profile_name(inferred_name)
profile_dir = get_profile_dir(inferred_name)
profile_dir = get_profile_dir(canon)
if profile_dir.exists():
raise FileExistsError(f"Profile '{inferred_name}' already exists at {profile_dir}")
raise FileExistsError(f"Profile '{canon}' already exists at {profile_dir}")
profiles_root = _get_profiles_root()
profiles_root.mkdir(parents=True, exist_ok=True)
@@ -977,8 +1012,8 @@ def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
)
final_source = extracted
if archive_root != inferred_name:
final_source = staging_root / inferred_name
if archive_root != canon:
final_source = staging_root / canon
extracted.rename(final_source)
shutil.move(str(final_source), str(profile_dir))
@@ -1048,25 +1083,27 @@ def rename_profile(old_name: str, new_name: str) -> Path:
Returns the new profile directory.
"""
validate_profile_name(old_name)
validate_profile_name(new_name)
old_canon = normalize_profile_name(old_name)
new_canon = normalize_profile_name(new_name)
validate_profile_name(old_canon)
validate_profile_name(new_canon)
if old_name == "default":
if old_canon == "default":
raise ValueError("Cannot rename the default profile.")
if new_name == "default":
if new_canon == "default":
raise ValueError("Cannot rename to 'default' — it is reserved.")
old_dir = get_profile_dir(old_name)
new_dir = get_profile_dir(new_name)
old_dir = get_profile_dir(old_canon)
new_dir = get_profile_dir(new_canon)
if not old_dir.is_dir():
raise FileNotFoundError(f"Profile '{old_name}' does not exist.")
raise FileNotFoundError(f"Profile '{old_canon}' does not exist.")
if new_dir.exists():
raise FileExistsError(f"Profile '{new_name}' already exists.")
raise FileExistsError(f"Profile '{new_canon}' already exists.")
# 1. Stop gateway if running
if _check_gateway_running(old_dir):
_cleanup_gateway_service(old_name, old_dir)
_cleanup_gateway_service(old_canon, old_dir)
_stop_gateway_process(old_dir)
# 2. Rename directory
@@ -1074,22 +1111,22 @@ def rename_profile(old_name: str, new_name: str) -> Path:
print(f"✓ Renamed {old_dir.name}{new_dir.name}")
# 3. Update profile-scoped Honcho host blocks, preserving aiPeer identity
_migrate_honcho_profile_host(old_name, new_name, new_dir)
_migrate_honcho_profile_host(old_canon, new_canon, new_dir)
# 4. Update wrapper script
remove_wrapper_script(old_name)
collision = check_alias_collision(new_name)
remove_wrapper_script(old_canon)
collision = check_alias_collision(new_canon)
if not collision:
create_wrapper_script(new_name)
print(f"✓ Alias updated: {new_name}")
create_wrapper_script(new_canon)
print(f"✓ Alias updated: {new_canon}")
else:
print(f"⚠ Cannot create alias '{new_name}'{collision}")
print(f"⚠ Cannot create alias '{new_canon}'{collision}")
# 5. Update active_profile if it pointed to old name
try:
if get_active_profile() == old_name:
set_active_profile(new_name)
print(f"✓ Active profile updated: {new_name}")
if get_active_profile() == old_canon:
set_active_profile(new_canon)
print(f"✓ Active profile updated: {new_canon}")
except Exception:
pass
@@ -1191,13 +1228,14 @@ def resolve_profile_env(profile_name: str) -> str:
Called early in the CLI entry point, before any hermes modules
are imported, to set the HERMES_HOME environment variable.
"""
validate_profile_name(profile_name)
profile_dir = get_profile_dir(profile_name)
canon = normalize_profile_name(profile_name)
validate_profile_name(canon)
profile_dir = get_profile_dir(canon)
if profile_name != "default" and not profile_dir.is_dir():
if canon != "default" and not profile_dir.is_dir():
raise FileNotFoundError(
f"Profile '{profile_name}' does not exist. "
f"Create it with: hermes profile create {profile_name}"
f"Profile '{canon}' does not exist. "
f"Create it with: hermes profile create {canon}"
)
return str(profile_dir)
+8 -3
View File
@@ -108,9 +108,14 @@ class PtyBridge:
"(or pip install -e '.[pty]')."
)
raise PtyUnavailableError("Pseudo-terminals are unavailable.")
# Let caller-supplied env fully override inheritance; if they pass
# None we inherit the server's env (same semantics as subprocess).
spawn_env = os.environ.copy() if env is None else env
# PTY-hosted programs expect TERM to describe the terminal type.
# CI often runs without TERM in the parent process, which makes
# simple terminal probes like `tput cols` fail before winsize reads.
# Preserve explicit caller overrides, but backfill a sensible default
# when TERM is missing or blank.
spawn_env = (os.environ.copy() if env is None else env.copy())
if not spawn_env.get("TERM"):
spawn_env["TERM"] = "xterm-256color"
proc = ptyprocess.PtyProcess.spawn( # type: ignore[union-attr]
list(argv),
cwd=cwd,
+57 -13
View File
@@ -964,7 +964,8 @@ def setup_model_provider(config: dict, *, quick: bool = False):
)
else:
_selected_vision_model = prompt(" Vision model (blank = use main/custom default)").strip()
save_env_value("AUXILIARY_VISION_MODEL", _selected_vision_model)
if _selected_vision_model:
save_env_value("AUXILIARY_VISION_MODEL", _selected_vision_model)
print_success(
f"Vision configured with {_base_url}"
+ (f" ({_selected_vision_model})" if _selected_vision_model else "")
@@ -1190,6 +1191,13 @@ def _setup_tts_provider(config: dict):
"Falling back to Edge TTS."
)
selected = "edge"
if selected == "xai":
print()
voice_id = prompt("xAI voice_id (Enter for 'eve', or paste a custom voice ID)")
if voice_id and voice_id.strip():
config.setdefault("tts", {}).setdefault("xai", {})["voice_id"] = voice_id.strip()
print_success(f"xAI voice_id set to: {voice_id.strip()}")
elif selected == "minimax":
existing = get_env_value("MINIMAX_API_KEY")
@@ -1321,15 +1329,13 @@ def setup_terminal_backend(config: dict):
print_success("Terminal backend: Local")
print_info("Commands run directly on this machine.")
# CWD for messaging
# Gateway/cron working directory
print()
print_info("Working directory for messaging sessions:")
print_info(" When using Hermes via Telegram/Discord, this is where")
print_info(
" the agent starts. CLI mode always starts in the current directory."
)
print_info("Gateway working directory:")
print_info(" Used by Telegram/Discord/cron sessions.")
print_info(" CLI/TUI always uses your launch directory instead.")
current_cwd = cfg_get(config, "terminal", "cwd", default="")
cwd = prompt(" Messaging working directory", current_cwd or str(Path.home()))
cwd = prompt(" Gateway working directory", current_cwd or str(Path.home()))
if cwd:
config["terminal"]["cwd"] = cwd
@@ -1643,7 +1649,11 @@ def setup_terminal_backend(config: dict):
def _apply_default_agent_settings(config: dict):
"""Apply recommended defaults for all agent settings without prompting."""
config.setdefault("agent", {})["max_turns"] = 90
save_env_value("HERMES_MAX_ITERATIONS", "90")
# config.yaml is the authoritative source for max_turns; the gateway
# bridges it into HERMES_MAX_ITERATIONS at startup. We no longer write
# to .env to avoid the dual-source inconsistency that caused the
# 60-vs-500 bug (stale .env entry silently shadowing config.yaml).
remove_env_value("HERMES_MAX_ITERATIONS")
config.setdefault("display", {})["tool_progress"] = "all"
@@ -1673,9 +1683,10 @@ def setup_agent_settings(config: dict):
print()
# ── Max Iterations ──
current_max = get_env_value("HERMES_MAX_ITERATIONS") or str(
cfg_get(config, "agent", "max_turns", default=90)
)
# config.yaml is authoritative; read from there. If a legacy .env
# entry is still around (from pre-PR#18413 setups), prefer the
# config value so we don't surface a stale number to the user.
current_max = str(cfg_get(config, "agent", "max_turns", default=90))
print_info("Maximum tool-calling iterations per conversation.")
print_info("Higher = more complex tasks, but costs more tokens.")
print_info(
@@ -1686,9 +1697,13 @@ def setup_agent_settings(config: dict):
try:
max_iter = int(max_iter_str)
if max_iter > 0:
save_env_value("HERMES_MAX_ITERATIONS", str(max_iter))
# Write to config.yaml (authoritative) only. Also clean up any
# stale .env entry from earlier setup runs — the gateway's
# bridge in gateway/run.py now unconditionally derives
# HERMES_MAX_ITERATIONS from agent.max_turns at startup.
config.setdefault("agent", {})["max_turns"] = max_iter
config.pop("max_turns", None)
remove_env_value("HERMES_MAX_ITERATIONS")
print_success(f"Max iterations set to {max_iter}")
except ValueError:
print_warning("Invalid number, keeping current value")
@@ -2033,6 +2048,16 @@ def _setup_slack():
print_warning("⚠️ No Slack allowlist set - unpaired users will be denied by default.")
print_info(" Set SLACK_ALLOW_ALL_USERS=true or GATEWAY_ALLOW_ALL_USERS=true only if you intentionally want open workspace access.")
print()
print_info("📬 Home Channel: where Hermes delivers cron job results,")
print_info(" cross-platform messages, and notifications.")
print_info(" To get a channel ID: open the channel in Slack, then right-click")
print_info(" the channel name → Copy link — the ID starts with C (e.g. C01ABC2DE3F).")
print_info(" You can also set this later by typing /set-home in a Slack channel.")
home_channel = prompt("Home channel ID (leave empty to set later with /set-home)")
if home_channel:
save_env_value("SLACK_HOME_CHANNEL", home_channel.strip())
def _write_slack_manifest_and_instruct():
"""Generate the Slack manifest, write it under HERMES_HOME, and print
@@ -2979,6 +3004,21 @@ def run_setup_wizard(args):
config = load_config()
hermes_home = get_hermes_home()
# Back up existing config before setup modifies it (#3522)
config_path = get_config_path()
if config_path.exists():
from datetime import datetime as _dt
_backup_path = config_path.with_suffix(
f".yaml.bak.{_dt.now().strftime('%Y%m%d_%H%M%S')}"
)
try:
import shutil
shutil.copy2(config_path, _backup_path)
except Exception:
_backup_path = None
else:
_backup_path = None
# Detect non-interactive environments (headless SSH, Docker, CI/CD)
non_interactive = getattr(args, 'non_interactive', False)
if not non_interactive and not is_interactive_stdin():
@@ -3148,6 +3188,10 @@ def run_setup_wizard(args):
# Save and show summary
save_config(config)
if _backup_path and _backup_path.exists():
print_info(f"Previous config backed up to: {_backup_path}")
print_info("If setup changed a value you customized, restore it with:")
print_info(f" cp {_backup_path} {config_path}")
_print_setup_summary(config, hermes_home)
_offer_launch_chat()
+25 -5
View File
@@ -122,11 +122,16 @@ def show_status(args):
print()
print(color("◆ API Keys", Colors.CYAN, Colors.BOLD))
keys = {
# Values may be a single env var name (str) or a tuple of alternates (first found wins).
keys: dict[str, str | tuple[str, ...]] = {
"OpenRouter": "OPENROUTER_API_KEY",
"OpenAI": "OPENAI_API_KEY",
"NVIDIA": "NVIDIA_API_KEY",
"Z.AI/GLM": "GLM_API_KEY",
"Anthropic": ("ANTHROPIC_API_KEY", "ANTHROPIC_TOKEN"),
"Google / Gemini": ("GOOGLE_API_KEY", "GEMINI_API_KEY"),
"DeepSeek": "DEEPSEEK_API_KEY",
"xAI / Grok": "XAI_API_KEY",
"NVIDIA NIM": "NVIDIA_API_KEY",
"Z.AI / GLM": "GLM_API_KEY",
"Kimi": "KIMI_API_KEY",
"StepFun Step Plan": "STEPFUN_API_KEY",
"MiniMax": "MINIMAX_API_KEY",
@@ -142,8 +147,23 @@ def show_status(args):
"GitHub": "GITHUB_TOKEN",
}
for name, env_var in keys.items():
value = get_env_value(env_var) or ""
def _resolve_env(env_ref) -> str:
"""Return first non-empty env var value from a str or tuple of names."""
if isinstance(env_ref, tuple):
for candidate in env_ref:
v = get_env_value(candidate) or ""
if v:
return v
return ""
return get_env_value(env_ref) or ""
for name, env_ref in keys.items():
# Anthropic already has a dedicated lookup below; keep that as the
# single source of truth (it also resolves OAuth tokens), skip here
# so we don't print two "Anthropic" rows.
if name == "Anthropic":
continue
value = _resolve_env(env_ref)
has_key = bool(value)
display = redact_key(value) if not show_all else value
print(f" {name:<12} {check_mark(has_key)} {display}")
+35 -6
View File
@@ -56,6 +56,7 @@ CONFIGURABLE_TOOLSETS = [
("file", "📁 File Operations", "read, write, patch, search"),
("code_execution", "⚡ Code Execution", "execute_code"),
("vision", "👁️ Vision / Image Analysis", "vision_analyze"),
("video", "🎬 Video Analysis", "video_analyze (requires video-capable model)"),
("image_gen", "🎨 Image Generation", "image_generate"),
("moa", "🧠 Mixture of Agents", "mixture_of_agents"),
("tts", "🔊 Text-to-Speech", "text_to_speech"),
@@ -78,7 +79,7 @@ CONFIGURABLE_TOOLSETS = [
# Toolsets that are OFF by default for new installs.
# They're still in _HERMES_CORE_TOOLS (available at runtime if enabled),
# but the setup checklist won't pre-select them for first-time users.
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin"}
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl", "spotify", "discord", "discord_admin", "video"}
# Platform-scoped toolsets: only appear in the `hermes tools` checklist for
# these platforms, and only resolve/save for these platforms. A toolset
@@ -1822,7 +1823,7 @@ def _reconfigure_tool(config: dict):
cat = TOOL_CATEGORIES.get(ts_key)
reqs = TOOLSET_ENV_REQUIREMENTS.get(ts_key)
if cat or reqs:
if _toolset_has_keys(ts_key, config):
if _toolset_has_keys(ts_key, config) or _toolset_enabled_for_reconfigure(ts_key, config):
configurable.append((ts_key, ts_label))
if not configurable:
@@ -1848,6 +1849,28 @@ def _reconfigure_tool(config: dict):
save_config(config)
def _toolset_enabled_for_reconfigure(ts_key: str, config: dict) -> bool:
"""Return True if a configurable toolset is enabled anywhere.
Reconfigure must include enabled-but-unconfigured categories so users can
finish provider/API-key setup without disabling and re-enabling the toolset.
"""
for platform in PLATFORMS:
if not _toolset_allowed_for_platform(ts_key, platform):
continue
try:
enabled = _get_platform_tools(
config,
platform,
include_default_mcp_servers=False,
)
except Exception:
continue
if ts_key in enabled:
return True
return False
def _configure_tool_category_for_reconfig(ts_key: str, cat: dict, config: dict):
"""Reconfigure a tool category - provider selection + API key update."""
icon = cat.get("icon", "")
@@ -1897,21 +1920,27 @@ def _reconfigure_provider(provider: dict, config: dict):
return
if provider.get("tts_provider"):
config.setdefault("tts", {})["provider"] = provider["tts_provider"]
tts_cfg = config.setdefault("tts", {})
tts_cfg["provider"] = provider["tts_provider"]
tts_cfg["use_gateway"] = bool(managed_feature)
_print_success(f" TTS provider set to: {provider['tts_provider']}")
if "browser_provider" in provider:
bp = provider["browser_provider"]
browser_cfg = config.setdefault("browser", {})
if bp == "local":
config.setdefault("browser", {})["cloud_provider"] = "local"
browser_cfg["cloud_provider"] = "local"
_print_success(" Browser set to local mode")
elif bp:
config.setdefault("browser", {})["cloud_provider"] = bp
browser_cfg["cloud_provider"] = bp
_print_success(f" Browser cloud provider set to: {bp}")
browser_cfg["use_gateway"] = bool(managed_feature)
# Set web search backend in config if applicable
if provider.get("web_backend"):
config.setdefault("web", {})["backend"] = provider["web_backend"]
web_cfg = config.setdefault("web", {})
web_cfg["backend"] = provider["web_backend"]
web_cfg["use_gateway"] = bool(managed_feature)
_print_success(f" Web backend set to: {provider['web_backend']}")
if managed_feature and managed_feature not in ("web", "tts", "browser"):
+186
View File
@@ -27,6 +27,192 @@ import sys
import threading
from typing import Any, Callable, Optional
# Modifier aliases mirrored from the TUI parser (``ui-tui/src/lib/platform.ts``)
# ``_MOD_ALIASES`` table — the contract that removes the cross-runtime
# mismatch Copilot flagged in round-9 on #19835.
#
# ``super``/``win``/``windows`` are intentionally absent: prompt_toolkit
# has no super/meta modifier for the Cmd key, so those spellings are
# TUI-only. The normalizer below returns the documented default
# (``c-b``) for them — a silent fallback was preferred to a hard
# startup crash (Copilot round-11). The CLI binding site
# (``_register_voice_handler`` in cli.py) logs a warning when that
# fallback fires so users see why their TUI-only shortcut isn't
# bound in the classic CLI.
_VOICE_MOD_ALIASES = {
"ctrl": "c-",
"control": "c-",
"alt": "a-",
"option": "a-",
"opt": "a-",
}
# Named keys prompt_toolkit accepts in ``c-<name>`` / ``a-<name>`` form.
# Aliases collapse to prompt_toolkit's canonical spelling so the same
# config value binds identically in both runtimes (Copilot round-10 on
# #19835).
_VOICE_NAMED_KEYS = {
"space": "space",
"spc": "space",
"enter": "enter",
"return": "enter",
"ret": "enter",
"tab": "tab",
"escape": "escape",
"esc": "escape",
"backspace": "backspace",
"bs": "backspace",
"delete": "delete",
"del": "delete",
}
# ``useInputHandlers()`` intercepts these before the voice check runs,
# so a binding like ``ctrl+c`` (interrupt), ``ctrl+d`` (quit), or
# ``ctrl+l`` (clear screen) would be advertised in /voice status but
# never fire push-to-talk — the same blocklist the TUI parser uses.
_VOICE_RESERVED_CTRL_CHARS = frozenset({"c", "d", "l"})
# On macOS the classic CLI's prompt_toolkit bindings for copy / exit /
# clear also claim ``a-c`` / ``a-d`` / ``a-l`` via the action-modifier
# lookup, and hermes-ink reports Alt as ``key.meta`` on many terminals.
# Mirror the TUI parser's darwin-only reservation so ``option+c`` etc.
# don't bind Alt+C in the CLI while the TUI silently falls back to
# Ctrl+B (Copilot round-14 on #19835).
_VOICE_RESERVED_ALT_CHARS_MAC = frozenset({"c", "d", "l"})
_DEFAULT_PT_KEY = "c-b"
def voice_record_key_from_config(cfg: Any) -> Any:
"""Shape-safe ``cfg.voice.record_key`` lookup.
``load_config()`` deep-merges raw YAML and preserves scalar
overrides, so a hand-edited ``voice: true`` / ``voice: cmd+b``
leaves ``cfg["voice"]`` as a bool/str instead of a dict, and the
naive ``.get("voice", {}).get("record_key")`` chain raises
AttributeError before voice can even start (Copilot round-11 on
#19835). Return ``None`` for malformed shapes so call sites can
feed the result straight into the normalizer/formatter and get
the documented default.
"""
if not isinstance(cfg, dict):
return None
voice = cfg.get("voice")
if not isinstance(voice, dict):
return None
return voice.get("record_key")
def normalize_voice_record_key_for_prompt_toolkit(raw: Any) -> str:
"""Coerce ``voice.record_key`` into prompt_toolkit's ``c-x`` / ``a-x`` format.
Mirrors the TUI parser contract (``ui-tui/src/lib/platform.ts``)
so one config value binds the same shortcut in both runtimes:
* non-string / empty / typo'd / bare-char / multi-modifier / reserved
``ctrl+c|d|l`` documented default ``c-b``
* single-char keys: ``ctrl+o`` ``c-o``
* named keys: ``ctrl+space`` ``c-space`` (aliases collapse:
``ctrl+return`` ``c-enter``)
* ``super`` / ``win`` / ``windows`` ``c-b`` (TUI-only modifiers
prompt_toolkit has no super mod; the CLI binding site is
expected to warn when this fallback fires so users see the
cross-runtime split, Copilot round-11 on #19835)
"""
if not isinstance(raw, str):
return _DEFAULT_PT_KEY
lowered = raw.strip().lower()
if not lowered:
return _DEFAULT_PT_KEY
parts = [p.strip() for p in lowered.split("+") if p.strip()]
if not parts:
return _DEFAULT_PT_KEY
# Multi-modifier chords like ``ctrl+alt+r`` bind different shortcuts
# in prompt_toolkit (a-c-r form) and hermes-ink rejects them; collapse
# to the documented default instead of silently diverging.
if len(parts) > 2:
return _DEFAULT_PT_KEY
# Bare char / bare named key (no explicit modifier) — the CLI's
# prompt_toolkit binds the raw key without a modifier, which the TUI
# parser refuses; reject here too so both runtimes agree.
if len(parts) == 1:
return _DEFAULT_PT_KEY
modifier_token, key_token = parts
# ``super`` / ``win`` / ``windows`` are TUI-only (prompt_toolkit has
# no super modifier, so ``@kb.add(super+b)`` crashes the CLI at
# startup). Fall back to the documented default here; the CLI
# binding site is expected to log a warning when the configured
# value is one of these spellings so users know the TUI+CLI
# runtimes diverge on that shortcut (Copilot round-11 on #19835).
if modifier_token in {"super", "win", "windows"}:
return _DEFAULT_PT_KEY
normalized_mod = _VOICE_MOD_ALIASES.get(modifier_token)
if not normalized_mod:
return _DEFAULT_PT_KEY
# Single-char key: reject reserved-ctrl chords that the TUI would
# also block at parse time, plus the mac-only alt reservation.
if len(key_token) == 1:
if normalized_mod == "c-" and key_token in _VOICE_RESERVED_CTRL_CHARS:
return _DEFAULT_PT_KEY
if (
normalized_mod == "a-"
and sys.platform == "darwin"
and key_token in _VOICE_RESERVED_ALT_CHARS_MAC
):
return _DEFAULT_PT_KEY
return f"{normalized_mod}{key_token}"
# Multi-char key token must be a known named key; typos like
# ``ctrl+spcae`` fall back to the default rather than being passed
# through as ``c-spcae`` (which prompt_toolkit would reject).
named = _VOICE_NAMED_KEYS.get(key_token)
if not named:
return _DEFAULT_PT_KEY
return f"{normalized_mod}{named}"
def format_voice_record_key_for_status(raw: Any) -> str:
"""Render ``voice.record_key`` for ``/voice status`` in CLI-friendly form.
Mirrors the TUI's ``formatVoiceRecordKey``: returns ``Ctrl+B`` /
``Alt+Space`` / ``Ctrl+Enter``. Malformed configs surface as the
documented default so status never advertises a shortcut that
won't bind (Copilot round-10 on #19835).
"""
normalized = normalize_voice_record_key_for_prompt_toolkit(raw)
if normalized.startswith("c-"):
prefix, key = "Ctrl+", normalized[2:]
elif normalized.startswith("a-"):
prefix, key = "Alt+", normalized[2:]
elif "+" in normalized:
# ``super+<key>`` / ``win+<key>`` — CLI won't bind them, but
# render in title case so status output is still readable.
mod, key = normalized.split("+", 1)
prefix = mod[0].upper() + mod[1:] + "+"
else:
return "Ctrl+B"
if not key:
return prefix.rstrip("+")
if len(key) == 1:
return prefix + key.upper()
return prefix + key[0].upper() + key[1:]
from tools.voice_mode import (
create_audio_recorder,
is_whisper_hallucination,
+36 -8
View File
@@ -470,10 +470,23 @@ except (ValueError, TypeError):
)
_GATEWAY_HEALTH_TIMEOUT = 3.0
# DEPRECATED (scheduled for removal): GATEWAY_HEALTH_URL / GATEWAY_HEALTH_TIMEOUT.
# Cross-container / cross-host gateway liveness detection will be folded into a
# first-class dashboard config key so it's no longer Docker-adjacent lore buried
# in env vars. The env vars still work for now so existing Compose deployments
# don't break. Do not add new callers — wire new uses through the planned
# config surface.
def _probe_gateway_health() -> tuple[bool, dict | None]:
"""Probe the gateway via its HTTP health endpoint (cross-container).
.. deprecated::
Driven by the deprecated ``GATEWAY_HEALTH_URL`` /
``GATEWAY_HEALTH_TIMEOUT`` env vars. Scheduled for removal alongside
a move to a first-class dashboard config key. See
:data:`_GATEWAY_HEALTH_URL` for context.
Uses ``/health/detailed`` first (returns full state), falling back to
the simpler ``/health`` endpoint. Returns ``(is_alive, body_dict)``.
@@ -2882,6 +2895,25 @@ _VALID_CHANNEL_RE = re.compile(r"^[A-Za-z0-9._-]{1,128}$")
# loopback so tests don't need to rewrite request scope.
_LOOPBACK_HOSTS = frozenset({"127.0.0.1", "::1", "localhost", "testclient"})
def _is_public_bind() -> bool:
"""True when bound to all-interfaces (operator used --insecure)."""
return getattr(app.state, "bound_host", "") in ("0.0.0.0", "::")
def _ws_client_is_allowed(ws: "WebSocket") -> bool:
"""Check if the WebSocket client IP is acceptable.
Allows loopback always; allows any IP when bound to all-interfaces
(--insecure mode, guarded by session token auth).
"""
if _is_public_bind():
return True
client_host = ws.client.host if ws.client else ""
if not client_host:
return True
return client_host in _LOOPBACK_HOSTS
# Per-channel subscriber registry used by /api/pub (PTY-side gateway → dashboard)
# and /api/events (dashboard → browser sidebar). Keyed by an opaque channel id
# the chat tab generates on mount; entries auto-evict when the last subscriber
@@ -2972,8 +3004,7 @@ async def pty_ws(ws: WebSocket) -> None:
await ws.close(code=4401)
return
client_host = ws.client.host if ws.client else ""
if client_host and client_host not in _LOOPBACK_HOSTS:
if not _ws_client_is_allowed(ws):
await ws.close(code=4403)
return
@@ -3080,8 +3111,7 @@ async def gateway_ws(ws: WebSocket) -> None:
await ws.close(code=4401)
return
client_host = ws.client.host if ws.client else ""
if client_host and client_host not in _LOOPBACK_HOSTS:
if not _ws_client_is_allowed(ws):
await ws.close(code=4403)
return
@@ -3113,8 +3143,7 @@ async def pub_ws(ws: WebSocket) -> None:
await ws.close(code=4401)
return
client_host = ws.client.host if ws.client else ""
if client_host and client_host not in _LOOPBACK_HOSTS:
if not _ws_client_is_allowed(ws):
await ws.close(code=4403)
return
@@ -3143,8 +3172,7 @@ async def events_ws(ws: WebSocket) -> None:
await ws.close(code=4401)
return
client_host = ws.client.host if ws.client else ""
if client_host and client_host not in _LOOPBACK_HOSTS:
if not _ws_client_is_allowed(ws):
await ws.close(code=4403)
return
+51 -1
View File
@@ -8,14 +8,64 @@ import os
from pathlib import Path
_profile_fallback_warned: bool = False
def get_hermes_home() -> Path:
"""Return the Hermes home directory (default: ~/.hermes).
Reads HERMES_HOME env var, falls back to ~/.hermes.
This is the single source of truth all other copies should import this.
When ``HERMES_HOME`` is unset but an ``active_profile`` file indicates
a non-default profile is active, logs a loud one-shot warning to
``errors.log`` so cross-profile data corruption is diagnosable instead
of silent. Behavior is unchanged otherwise we still return
``~/.hermes`` because raising here would brick 30+ module-level
callers that import this at load time. Subprocess spawners are
expected to propagate ``HERMES_HOME`` explicitly (see the systemd
template in ``hermes_cli/gateway.py`` and the kanban dispatcher in
``hermes_cli/kanban_db.py``). See https://github.com/NousResearch/hermes-agent/issues/18594.
"""
val = os.environ.get("HERMES_HOME", "").strip()
return Path(val) if val else Path.home() / ".hermes"
if val:
return Path(val)
# Guard: if a non-default profile is sticky-active, warn once that
# the fallback to the default profile is almost certainly wrong.
global _profile_fallback_warned
if not _profile_fallback_warned:
try:
# Inline the default-root resolution from get_default_hermes_root()
# to stay import-safe (this function is called from module scope
# in 30+ files; we cannot afford to trigger logging setup here).
active_path = (Path.home() / ".hermes" / "active_profile")
active = active_path.read_text().strip() if active_path.exists() else ""
except (UnicodeDecodeError, OSError):
active = ""
if active and active != "default":
_profile_fallback_warned = True
# Write directly to stderr. We intentionally do NOT route this
# through ``logging`` because (a) this function is called at
# module-import time from 30+ sites, often before logging is
# configured, and (b) root-logger propagation would double-emit
# on consoles where a StreamHandler is already attached.
import sys
msg = (
f"[HERMES_HOME fallback] HERMES_HOME is unset but active "
f"profile is {active!r}. Falling back to ~/.hermes, which "
f"is the DEFAULT profile — not {active!r}. Any data this "
f"process writes will land in the wrong profile. The "
f"subprocess spawner should pass HERMES_HOME explicitly "
f"(see issue #18594)."
)
try:
sys.stderr.write(msg + "\n")
sys.stderr.flush()
except Exception:
pass
return Path.home() / ".hermes"
def get_default_hermes_root() -> Path:
+382
View File
@@ -2148,6 +2148,388 @@ class SessionDB:
)
self._execute_write(_do)
def apply_telegram_topic_migration(self) -> None:
"""Create Telegram DM topic-mode tables on explicit /topic opt-in.
This migration is deliberately not part of automatic SessionDB startup
reconciliation. Operators must be able to upgrade Hermes, keep the old
Telegram bot behavior running, and only mutate topic-mode state when the
user executes /topic to opt into the feature.
Schema versions:
v1 initial shape (no ON DELETE CASCADE on session_id FK)
v2 session_id FK gets ON DELETE CASCADE so session pruning
automatically clears bindings.
"""
def _do(conn):
conn.executescript(
"""
CREATE TABLE IF NOT EXISTS telegram_dm_topic_mode (
chat_id TEXT PRIMARY KEY,
user_id TEXT NOT NULL,
enabled INTEGER NOT NULL DEFAULT 1,
activated_at REAL NOT NULL,
updated_at REAL NOT NULL,
has_topics_enabled INTEGER,
allows_users_to_create_topics INTEGER,
capability_checked_at REAL,
intro_message_id TEXT,
pinned_message_id TEXT
);
CREATE TABLE IF NOT EXISTS telegram_dm_topic_bindings (
chat_id TEXT NOT NULL,
thread_id TEXT NOT NULL,
user_id TEXT NOT NULL,
session_key TEXT NOT NULL,
session_id TEXT NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
managed_mode TEXT NOT NULL DEFAULT 'auto',
linked_at REAL NOT NULL,
updated_at REAL NOT NULL,
PRIMARY KEY (chat_id, thread_id)
);
CREATE UNIQUE INDEX IF NOT EXISTS idx_telegram_dm_topic_bindings_session
ON telegram_dm_topic_bindings(session_id);
CREATE INDEX IF NOT EXISTS idx_telegram_dm_topic_bindings_user
ON telegram_dm_topic_bindings(user_id, chat_id);
"""
)
# v1 → v2: rebuild telegram_dm_topic_bindings if its session_id FK
# lacks ON DELETE CASCADE. SQLite can't ALTER a foreign key, so we
# rebuild the table. Only runs once per DB (version gate).
current = conn.execute(
"SELECT value FROM state_meta WHERE key = ?",
("telegram_dm_topic_schema_version",),
).fetchone()
current_version = int(current[0]) if current and str(current[0]).isdigit() else 0
if current_version < 2:
fk_rows = conn.execute(
"PRAGMA foreign_key_list('telegram_dm_topic_bindings')"
).fetchall()
needs_rebuild = any(
row[2] == "sessions" and (row[6] or "") != "CASCADE"
for row in fk_rows
)
if needs_rebuild:
conn.executescript(
"""
CREATE TABLE telegram_dm_topic_bindings_new (
chat_id TEXT NOT NULL,
thread_id TEXT NOT NULL,
user_id TEXT NOT NULL,
session_key TEXT NOT NULL,
session_id TEXT NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
managed_mode TEXT NOT NULL DEFAULT 'auto',
linked_at REAL NOT NULL,
updated_at REAL NOT NULL,
PRIMARY KEY (chat_id, thread_id)
);
INSERT INTO telegram_dm_topic_bindings_new
SELECT chat_id, thread_id, user_id, session_key,
session_id, managed_mode, linked_at, updated_at
FROM telegram_dm_topic_bindings;
DROP TABLE telegram_dm_topic_bindings;
ALTER TABLE telegram_dm_topic_bindings_new
RENAME TO telegram_dm_topic_bindings;
CREATE UNIQUE INDEX idx_telegram_dm_topic_bindings_session
ON telegram_dm_topic_bindings(session_id);
CREATE INDEX idx_telegram_dm_topic_bindings_user
ON telegram_dm_topic_bindings(user_id, chat_id);
"""
)
conn.execute(
"INSERT INTO state_meta (key, value) VALUES (?, ?) "
"ON CONFLICT(key) DO UPDATE SET value = excluded.value",
("telegram_dm_topic_schema_version", "2"),
)
self._execute_write(_do)
def enable_telegram_topic_mode(
self,
*,
chat_id: str,
user_id: str,
has_topics_enabled: Optional[bool] = None,
allows_users_to_create_topics: Optional[bool] = None,
) -> None:
"""Enable Telegram DM topic mode for one private chat/user.
This method intentionally owns the explicit topic migration. Ordinary
SessionDB startup must not create these side tables.
"""
self.apply_telegram_topic_migration()
now = time.time()
def _to_int(value: Optional[bool]) -> Optional[int]:
if value is None:
return None
return 1 if value else 0
def _do(conn):
conn.execute(
"""
INSERT INTO telegram_dm_topic_mode (
chat_id, user_id, enabled, activated_at, updated_at,
has_topics_enabled, allows_users_to_create_topics,
capability_checked_at
) VALUES (?, ?, 1, ?, ?, ?, ?, ?)
ON CONFLICT(chat_id) DO UPDATE SET
user_id = excluded.user_id,
enabled = 1,
updated_at = excluded.updated_at,
has_topics_enabled = excluded.has_topics_enabled,
allows_users_to_create_topics = excluded.allows_users_to_create_topics,
capability_checked_at = excluded.capability_checked_at
""",
(
str(chat_id),
str(user_id),
now,
now,
_to_int(has_topics_enabled),
_to_int(allows_users_to_create_topics),
now,
),
)
self._execute_write(_do)
def disable_telegram_topic_mode(
self,
*,
chat_id: str,
clear_bindings: bool = True,
) -> None:
"""Disable Telegram DM topic mode for one private chat.
When ``clear_bindings`` is True (default) the (chat_id, thread_id)
bindings for this chat are also cleared so re-enabling later
starts from a clean slate. Set to False if the operator wants to
preserve bindings for a later re-enable.
Never creates the topic-mode tables from scratch; if they don't
exist there is nothing to disable and the call is a no-op.
"""
def _do(conn):
try:
conn.execute(
"UPDATE telegram_dm_topic_mode SET enabled = 0, updated_at = ? "
"WHERE chat_id = ?",
(time.time(), str(chat_id)),
)
if clear_bindings:
conn.execute(
"DELETE FROM telegram_dm_topic_bindings WHERE chat_id = ?",
(str(chat_id),),
)
except sqlite3.OperationalError:
# Tables don't exist yet — nothing to disable.
return
self._execute_write(_do)
def is_telegram_topic_mode_enabled(self, *, chat_id: str, user_id: str) -> bool:
"""Return whether Telegram DM topic mode is enabled for this chat/user."""
with self._lock:
try:
row = self._conn.execute(
"""
SELECT enabled FROM telegram_dm_topic_mode
WHERE chat_id = ? AND user_id = ?
""",
(str(chat_id), str(user_id)),
).fetchone()
except sqlite3.OperationalError:
return False
if row is None:
return False
enabled = row["enabled"] if isinstance(row, sqlite3.Row) else row[0]
return bool(enabled)
def get_telegram_topic_binding(
self,
*,
chat_id: str,
thread_id: str,
) -> Optional[Dict[str, Any]]:
"""Return the session binding for a Telegram DM topic, if present."""
with self._lock:
try:
row = self._conn.execute(
"""
SELECT * FROM telegram_dm_topic_bindings
WHERE chat_id = ? AND thread_id = ?
""",
(str(chat_id), str(thread_id)),
).fetchone()
except sqlite3.OperationalError:
return None
return dict(row) if row else None
def bind_telegram_topic(
self,
*,
chat_id: str,
thread_id: str,
user_id: str,
session_key: str,
session_id: str,
managed_mode: str = "auto",
) -> None:
"""Bind one Telegram DM topic thread to one Hermes session.
A Hermes session may only be linked to one Telegram topic in MVP.
Rebinding the same topic to the same session is idempotent; trying to
link the same session to a different topic raises ValueError.
"""
self.apply_telegram_topic_migration()
now = time.time()
chat_id = str(chat_id)
thread_id = str(thread_id)
user_id = str(user_id)
session_key = str(session_key)
session_id = str(session_id)
def _do(conn):
existing_session = conn.execute(
"""
SELECT chat_id, thread_id FROM telegram_dm_topic_bindings
WHERE session_id = ?
""",
(session_id,),
).fetchone()
if existing_session is not None:
linked_chat = existing_session["chat_id"] if isinstance(existing_session, sqlite3.Row) else existing_session[0]
linked_thread = existing_session["thread_id"] if isinstance(existing_session, sqlite3.Row) else existing_session[1]
if str(linked_chat) != chat_id or str(linked_thread) != thread_id:
raise ValueError("session is already linked to another Telegram topic")
conn.execute(
"""
INSERT INTO telegram_dm_topic_bindings (
chat_id, thread_id, user_id, session_key, session_id,
managed_mode, linked_at, updated_at
) VALUES (?, ?, ?, ?, ?, ?, ?, ?)
ON CONFLICT(chat_id, thread_id) DO UPDATE SET
user_id = excluded.user_id,
session_key = excluded.session_key,
session_id = excluded.session_id,
managed_mode = excluded.managed_mode,
updated_at = excluded.updated_at
""",
(
chat_id,
thread_id,
user_id,
session_key,
session_id,
managed_mode,
now,
now,
),
)
self._execute_write(_do)
def is_telegram_session_linked_to_topic(self, *, session_id: str) -> bool:
"""Return True if a Hermes session is already bound to any Telegram DM topic.
Read-only: does NOT trigger the telegram-topic migration. If the
topic-mode tables have not been created yet (i.e. nobody has run
``/topic`` in this profile), the session is by definition unbound
and we return False.
"""
with self._lock:
try:
row = self._conn.execute(
"""
SELECT 1 FROM telegram_dm_topic_bindings
WHERE session_id = ?
LIMIT 1
""",
(str(session_id),),
).fetchone()
except sqlite3.OperationalError:
return False
return row is not None
def list_unlinked_telegram_sessions_for_user(
self,
*,
chat_id: str,
user_id: str,
limit: int = 10,
) -> List[Dict[str, Any]]:
"""List previous Telegram sessions for this user that are not bound to a topic.
Read-only: does NOT trigger the telegram-topic migration. If the
topic-mode tables are absent, fall back to a simpler query that
just returns this user's Telegram sessions — there can't be any
bindings yet.
"""
with self._lock:
try:
rows = self._conn.execute(
"""
SELECT s.*,
COALESCE(
(SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS _preview_raw,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active
FROM sessions s
WHERE s.source = 'telegram'
AND s.user_id = ?
AND NOT EXISTS (
SELECT 1 FROM telegram_dm_topic_bindings b
WHERE b.session_id = s.id
)
ORDER BY last_active DESC, s.started_at DESC
LIMIT ?
""",
(str(user_id), int(limit)),
).fetchall()
except sqlite3.OperationalError:
# telegram_dm_topic_bindings doesn't exist yet — no bindings
# means every telegram session for this user is "unlinked".
rows = self._conn.execute(
"""
SELECT s.*,
COALESCE(
(SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS _preview_raw,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active
FROM sessions s
WHERE s.source = 'telegram'
AND s.user_id = ?
ORDER BY last_active DESC, s.started_at DESC
LIMIT ?
""",
(str(user_id), int(limit)),
).fetchall()
sessions: List[Dict[str, Any]] = []
for row in rows:
session = dict(row)
raw = str(session.pop("_preview_raw", "") or "").strip()
session["preview"] = raw[:60] + ("..." if len(raw) > 60 else "") if raw else ""
sessions.append(session)
return sessions
# ── Space reclamation ──
def vacuum(self) -> None:
+38 -3
View File
@@ -511,6 +511,12 @@ def coerce_tool_args(tool_name: str, args: Dict[str, Any]) -> Dict[str, Any]:
Handles ``"type": "integer"``, ``"type": "number"``, ``"type": "boolean"``,
and union types (``"type": ["integer", "string"]``).
Also wraps bare scalar values in a single-element list when the schema
declares ``"type": "array"``. Open-weight models (DeepSeek, Qwen, GLM)
sometimes emit ``{"urls": "https://a.com"}`` when the tool expects
``{"urls": ["https://a.com"]}``; wrapping here avoids a confusing tool
failure on what is otherwise a well-formed call.
"""
if not args or not isinstance(args, dict):
return args
@@ -523,13 +529,42 @@ def coerce_tool_args(tool_name: str, args: Dict[str, Any]) -> Dict[str, Any]:
if not properties:
return args
for key, value in args.items():
if not isinstance(value, str):
continue
for key, value in list(args.items()):
prop_schema = properties.get(key)
if not prop_schema:
continue
expected = prop_schema.get("type")
# Wrap bare non-list values when the schema declares ``array``.
# Strings still go through _coerce_value first so JSON-encoded
# arrays (``'["a","b"]'``) get parsed and nullable ``"null"``
# becomes ``None`` rather than ``["null"]``.
# ``None`` itself is preserved — we don't know whether the model
# meant "omit" or "empty list", and tools with sensible defaults
# (e.g. read_file's normalize_read_pagination) already handle it.
if expected == "array" and value is not None and not isinstance(value, (list, tuple)):
if isinstance(value, str):
coerced = _coerce_value(value, expected, schema=prop_schema)
if coerced is not value:
# _coerce_value handled it (JSON-parsed list or
# nullable "null" → None).
args[key] = coerced
continue
args[key] = [value]
logger.info(
"coerce_tool_args: wrapped bare string in list for %s.%s",
tool_name, key,
)
continue
args[key] = [value]
logger.info(
"coerce_tool_args: wrapped bare %s in list for %s.%s",
type(value).__name__, tool_name, key,
)
continue
if not isinstance(value, str):
continue
if not expected and not _schema_allows_null(prop_schema):
continue
coerced = _coerce_value(value, expected, schema=prop_schema)
+31 -24
View File
@@ -163,35 +163,42 @@
for entry in "''${ENTRIES[@]}"; do
IFS=":" read -r ATTR FOLDER NIX_FILE <<< "$entry"
echo "==> .#$ATTR ($FOLDER -> $NIX_FILE)"
OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --print-build-logs 2>&1)
STATUS=$?
if [ "$STATUS" -eq 0 ]; then
# Compute the actual hash from the lockfile directly using
# prefetch-npm-deps. This avoids false "ok" from nix build when
# an old derivation is cached in a substituter (cachix/cache.nixos.org).
LOCK_FILE="$FOLDER/package-lock.json"
NEW_HASH=$(${pkgs.lib.getExe pkgs.prefetch-npm-deps} "$LOCK_FILE" 2>/dev/null)
if [ -z "$NEW_HASH" ]; then
echo " prefetch-npm-deps failed, falling back to nix build" >&2
OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --print-build-logs 2>&1)
STATUS=$?
if [ "$STATUS" -eq 0 ]; then
echo " ok (via nix build)"
continue
fi
NEW_HASH=$(echo "$OUTPUT" | awk '/got:/ {print $2; exit}')
if [ -z "$NEW_HASH" ]; then
if echo "$OUTPUT" | grep -qE "throttled|HTTP error 418|substituter .* is disabled|some outputs of .* are not valid"; then
echo " skipped (transient cache failure see primary nix build for real status)" >&2
echo "$OUTPUT" | tail -8 >&2
continue
fi
echo " build failed with no hash mismatch:" >&2
echo "$OUTPUT" | tail -40 >&2
exit 1
fi
fi
OLD_HASH=$(grep -oE 'hash = "sha256-[^"]+"' "$NIX_FILE" | head -1 \
| sed -E 's/hash = "(.*)"/\1/')
if [ "$NEW_HASH" = "$OLD_HASH" ]; then
echo " ok"
continue
fi
NEW_HASH=$(echo "$OUTPUT" | awk '/got:/ {print $2; exit}')
if [ -z "$NEW_HASH" ]; then
# Magic-Nix-Cache occasionally returns HTTP 418 / cache-throttled
# mid-run; nix then prints "outputs … not valid, so checking is
# not possible" without a `got:` line. That's an infrastructure
# blip, not a stale lockfile — warn + skip rather than failing
# the lint. A real hash mismatch would still surface in the
# primary `.#$ATTR` build, which is a separate CI job.
if echo "$OUTPUT" | grep -qE "throttled|HTTP error 418|substituter .* is disabled|some outputs of .* are not valid"; then
echo " skipped (transient cache failure see primary nix build for real status)" >&2
echo "$OUTPUT" | tail -8 >&2
continue
fi
echo " build failed with no hash mismatch:" >&2
echo "$OUTPUT" | tail -40 >&2
exit 1
fi
HASH_LINE=$(grep -n 'hash = "sha256-' "$NIX_FILE" | head -1 | cut -d: -f1)
OLD_HASH=$(grep -oE 'hash = "sha256-[^"]+"' "$NIX_FILE" | head -1 \
| sed -E 's/hash = "(.*)"/\1/')
LOCK_FILE="$FOLDER/package-lock.json"
echo " stale: $NIX_FILE:$HASH_LINE $OLD_HASH -> $NEW_HASH"
STALE=1
+1 -1
View File
@@ -4,7 +4,7 @@ let
src = ../ui-tui;
npmDeps = pkgs.fetchNpmDeps {
inherit src;
hash = "sha256-a/HGI9OgVcTnZrMXA7xFMGnFoVxyHe95fulVz+WNYB0=";
hash = "sha256-MLcLhjTF6dgdvNBtJWzo8Nh19eNh/ZitD2b07nm61Tc=";
};
npm = hermesNpmLib.mkNpmPassthru { folder = "ui-tui"; attr = "tui"; pname = "hermes-tui"; };
@@ -0,0 +1,190 @@
---
name: hyperframes
description: Create HTML-based video compositions, animated title cards, social overlays, captioned talking-head videos, audio-reactive visuals, and shader transitions using HyperFrames. HTML is the source of truth for video. Use when the user wants a rendered MP4/WebM from an HTML composition, wants to animate text/logos/charts over media, needs captions synced to audio, wants TTS narration, or wants to convert a website into a video.
version: 1.0.0
author: heygen-com
license: Apache-2.0
prerequisites:
commands: [node, ffmpeg, npx]
metadata:
hermes:
tags: [creative, video, animation, html, gsap, motion-graphics]
related_skills: [manim-video, meme-generation]
category: creative
requires_toolsets: [terminal]
---
# HyperFrames
HTML is the source of truth for video. A composition is an HTML file with `data-*` attributes for timing, a GSAP timeline for animation, and CSS for appearance. The HyperFrames engine captures the page frame-by-frame and encodes to MP4/WebM with FFmpeg.
**Complement to `manim-video`:** Use `manim-video` for mathematical/geometric explainers (equations, 3B1B-style). Use `hyperframes` for motion-graphics, talking-head with captions, product tours, social overlays, shader transitions, and anything driven by real video/audio media.
## When to Use
- User asks for a rendered video from text, a script, or a website
- Animated title cards, lower thirds, or typographic intros
- Captioned narration video (TTS + captions synced to waveform)
- Audio-reactive visuals (beat sync, spectrum bars, pulsing glow)
- Scene-to-scene transitions (crossfade, wipe, shader warp, flash-through-white)
- Social overlays (Instagram/TikTok/YouTube style)
- Website-to-video pipeline (capture a URL, produce a promo)
- Any HTML/CSS/JS animation that must render deterministically to a video file
Do **not** use this skill for:
- Pure math/equation animation (→ `manim-video`)
- Image generation or memes (→ `meme-generation`, image models)
- Live video conferencing or streaming
## Quick Reference
```bash
npx hyperframes init my-video # scaffold a project
cd my-video
npx hyperframes lint # validate before preview/render
npx hyperframes preview # live-reload browser preview (port 3002)
npx hyperframes render --output final.mp4 # render to MP4
npx hyperframes doctor # diagnose environment issues
```
Render flags: `--quality draft|standard|high` · `--fps 24|30|60` · `--format mp4|webm` · `--docker` (reproducible) · `--strict`.
Full CLI reference: [references/cli.md](references/cli.md).
## Setup (one-time)
```bash
bash "$(dirname "$(find ~/.hermes/skills -path '*/hyperframes/SKILL.md' 2>/dev/null | head -1)")/scripts/setup.sh"
```
The script:
1. Verifies Node.js >= 22 and FFmpeg are installed (prints fix instructions if not).
2. Installs the `hyperframes` CLI globally (`npm install -g hyperframes@>=0.4.2`).
3. Pre-caches `chrome-headless-shell` via Puppeteer — **required** for best-quality rendering via Chrome's `HeadlessExperimental.beginFrame` capture path.
4. Runs `npx hyperframes doctor` and reports the result.
See [references/troubleshooting.md](references/troubleshooting.md) if setup fails.
## Procedure
### 1. Plan before writing HTML
Before touching code, articulate at a high level:
- **What** — narrative arc, key moments, emotional beats
- **Structure** — compositions, tracks (video/audio/overlays), durations
- **Visual identity** — colors, fonts, motion character (explosive / cinematic / fluid / technical)
- **Hero frame** — for each scene, the moment when the most elements are simultaneously visible. This is the static layout you'll build first.
**Visual Identity Gate (HARD-GATE).** Before writing ANY composition HTML, a visual identity must be defined. Do NOT write compositions with default or generic colors (`#333`, `#3b82f6`, `Roboto` are tells that this step was skipped). Check in order:
1. **`DESIGN.md` at project root?** → Use its exact colors, fonts, motion rules, and "What NOT to Do" constraints.
2. **User named a style** (e.g. "Swiss Pulse", "dark and techy", "luxury brand")? → Generate a minimal `DESIGN.md` with `## Style Prompt`, `## Colors` (3-5 hex with roles), `## Typography` (1-2 families), `## What NOT to Do` (3-5 anti-patterns).
3. **None of the above?** → Ask 3 questions before writing any HTML:
- Mood? (explosive / cinematic / fluid / technical / chaotic / warm)
- Light or dark canvas?
- Any brand colors, fonts, or visual references?
Then generate a `DESIGN.md` from the answers. Every composition must trace its palette and typography back to `DESIGN.md` or explicit user direction.
### 2. Scaffold
```bash
npx hyperframes init my-video --non-interactive
```
Templates: `blank`, `warm-grain`, `play-mode`, `swiss-grid`, `vignelli`, `decision-tree`, `kinetic-type`, `product-promo`, `nyt-graph`. Pass `--example <name>` to pick one, `--video clip.mp4` or `--audio track.mp3` to seed with media.
### 3. Layout before animation
Write the static HTML+CSS for the **hero frame first** — no GSAP yet. The `.scene-content` container must fill the scene (`width:100%; height:100%; padding:Npx`) with `display:flex` + `gap`. Use padding to push content inward — never `position: absolute; top: Npx` on a content container (content overflows when taller than the remaining space).
Only after the hero frame looks right, add `gsap.from()` entrances (animate **to** the CSS position) and `gsap.to()` exits (animate **from** it).
See [references/composition.md](references/composition.md) for the full data-attribute schema and composition rules.
### 4. Animate with GSAP
Every composition must:
- Register its timeline: `window.__timelines["<composition-id>"] = tl`
- Start paused: `gsap.timeline({ paused: true })` — the player controls playback
- Use finite `repeat` values (no `repeat: -1` — breaks the capture engine). Calculate: `repeat: Math.ceil(duration / cycleDuration) - 1`.
- Be deterministic — no `Math.random()`, `Date.now()`, or wall-clock logic. Use a seeded PRNG if you need pseudo-randomness.
- Build synchronously — no `async`/`await`, `setTimeout`, or Promises around timeline construction.
See [references/gsap.md](references/gsap.md) for the core GSAP API (tweens, eases, stagger, timelines).
### 5. Transitions between scenes
Multi-scene compositions require transitions. Rules:
1. **Always use a transition between scenes** — no jump cuts.
2. **Always use entrance animations** on every scene element (`gsap.from(...)`).
3. **Never use exit animations** except on the final scene — the transition IS the exit.
4. The final scene may fade out.
Use `npx hyperframes add <transition-name>` to install shader transitions (`flash-through-white`, `liquid-wipe`, etc.). Full list: `npx hyperframes add --list`.
### 6. Audio, captions, TTS, audio-reactive, highlighting
- **Audio:** always a separate `<audio>` element (video is `muted playsinline`).
- **TTS:** `npx hyperframes tts "Script text" --voice af_nova --output narration.wav`. List voices with `--list`. Voice ID first letter encodes language (`a`/`b`=English, `e`=Spanish, `f`=French, `j`=Japanese, `z`=Mandarin, etc.) — the CLI auto-infers the phonemizer locale; pass `--lang` only to override. Non-English phonemization requires `espeak-ng` installed system-wide.
- **Captions:** `npx hyperframes transcribe narration.wav` → word-level transcript. Pick style from the transcript tone (hype / corporate / tutorial / storytelling / social — see the table in `references/features.md`). **Language rule:** never use `.en` whisper models unless the audio is confirmed English — `.en` translates non-English audio instead of transcribing it. Every caption group MUST have a hard `tl.set(el, { opacity: 0, visibility: "hidden" }, group.end)` kill after its exit tween — otherwise groups leak visible into later ones.
- **Audio-reactive visuals:** pre-extract audio bands (bass / mid / treble) and sample per-frame inside the timeline with a `for` loop of `tl.call(draw, [], f / fps)` — a single long tween does NOT react to audio. Map bass → `scale` (pulse), treble → `textShadow`/`boxShadow` (glow), overall amplitude → `opacity`/`y`/`backgroundColor`. Avoid equalizer-bar clichés — let content guide the visual, audio drive its behavior.
- **Marker-style highlighting:** highlight, circle, burst, scribble, sketchout effects for text emphasis are deterministic CSS+GSAP — see `references/features.md#marker-highlighting`. Fully seekable, no animated SVG filters.
- **Scene transitions:** every multi-scene composition MUST use transitions (no jump cuts). Pick from CSS primitives (push slide, blur crossfade, zoom through, staggered blocks) or shader transitions (`flash-through-white`, `liquid-wipe`, `cross-warp-morph`, `chromatic-split`, etc.) via `npx hyperframes add`. Mood and energy tables live in `references/features.md#transitions`. Do not mix CSS and shader transitions in the same composition.
### 7. Lint, validate, inspect, preview, render
```bash
npx hyperframes lint # catches missing data-composition-id, overlapping tracks, unregistered timelines
npx hyperframes validate # WCAG contrast audit at 5 timestamps
npx hyperframes inspect # visual layout audit — overflow, off-frame elements, occluded text
npx hyperframes preview # live browser preview
npx hyperframes render --quality draft --output draft.mp4 # fast iteration
npx hyperframes render --quality high --output final.mp4 # final delivery
```
`hyperframes validate` samples background pixels behind every text element and warns on contrast ratios below 4.5:1 (or 3:1 for large text). `hyperframes inspect` is the layout-side companion — runs the page at multiple timestamps and flags issues that a static lint can't see (a caption that wraps past the safe area only at 4.5s, a card that overflows when its title is the longest variant, an element that ends up behind a transition shader). Run `inspect` especially on compositions with speech bubbles, cards, captions, or tight typography.
### 8. Website-to-video (if the user gives a URL)
Use the 7-step capture-to-video workflow in [references/website-to-video.md](references/website-to-video.md): capture → DESIGN.md → SCRIPT.md → storyboard → composition → render → deliver.
## Pitfalls
- **`HeadlessExperimental.beginFrame' wasn't found`** — Chromium 147+ removed this protocol. Ensure you're on `hyperframes@>=0.4.2` (auto-detects and falls back to screenshot mode). Escape hatch: `export PRODUCER_FORCE_SCREENSHOT=true`. See [hyperframes#294](https://github.com/heygen-com/hyperframes/issues/294) and [references/troubleshooting.md](references/troubleshooting.md).
- **System Chrome (not `chrome-headless-shell`)** — renders hang for 120s then timeout. Run `npx puppeteer browsers install chrome-headless-shell` (setup.sh does this). `hyperframes doctor` reports which binary will be used.
- **`repeat: -1` anywhere** — breaks the capture engine. Always compute a finite repeat count.
- **`gsap.set()` on clip elements that enter later** — the element doesn't exist at page load. Use `tl.set(selector, vars, timePosition)` inside the timeline instead, at or after the clip's `data-start`.
- **`<br>` inside content text** — forced breaks don't know the rendered font width, so natural wrap + `<br>` double-breaks. Use `max-width` to let text wrap. Exception: short display titles where each word is deliberately on its own line.
- **Animating `visibility` or `display`** — GSAP can't tween these. Use `autoAlpha` (handles both visibility and opacity).
- **Calling `video.play()` or `audio.play()`** — the framework owns playback. Never call these yourself.
- **Building timelines async** — the capture engine reads `window.__timelines` synchronously after page load. Never wrap timeline construction in `async`, `setTimeout`, or a Promise.
- **Standalone `index.html` wrapped in `<template>`** — hides all content from the browser. Only **sub-compositions** loaded via `data-composition-src` use `<template>`.
- **Using video for audio** — always muted `<video>` + separate `<audio>`.
## Verification
Before and after rendering:
1. **Lint + validate + inspect pass:** `npx hyperframes lint --strict && npx hyperframes validate && npx hyperframes inspect` (lint catches structural issues, validate catches contrast, inspect catches visual layout / overflow issues — see troubleshooting.md if warnings appear).
2. **Animation choreography** — for new compositions or significant animation changes, run the animation map. `npx hyperframes init` copies the skill scripts into the project, so the path is project-local:
```bash
node skills/hyperframes/scripts/animation-map.mjs <composition-dir> \
--out <composition-dir>/.hyperframes/anim-map
```
Outputs a single `animation-map.json` with per-tween summaries, ASCII Gantt timeline, stagger detection, dead zones (>1s with no animation), element lifecycles, and flags (`offscreen`, `collision`, `invisible`, `paced-fast` <0.2s, `paced-slow` >2s). Scan summaries and flags — fix or justify each. Skip on small edits.
3. **File exists + non-zero:** `ls -lh final.mp4`.
4. **Duration matches `data-duration`:** `ffprobe -v error -show_entries format=duration -of default=nw=1:nk=1 final.mp4`.
5. **Visual check:** extract a mid-composition frame: `ffmpeg -i final.mp4 -ss 00:00:05 -vframes 1 preview.png`.
6. **Audio present if expected:** `ffprobe -v error -show_streams -select_streams a -of default=nw=1:nk=1 final.mp4 | head -1`.
If `hyperframes render` fails, run `npx hyperframes doctor` and attach its output when reporting.
## References
- [composition.md](references/composition.md) — data attributes, timeline contract, non-negotiable rules, typography/asset rules
- [cli.md](references/cli.md) — every CLI command (init, capture, lint, validate, inspect, preview, render, transcribe, tts, doctor, browser, info, upgrade, benchmark)
- [gsap.md](references/gsap.md) — GSAP core API for HyperFrames (tweens, eases, stagger, timelines, matchMedia)
- [features.md](references/features.md) — captions, TTS, audio-reactive, marker highlighting, transitions (load on demand)
- [website-to-video.md](references/website-to-video.md) — 7-step capture-to-video workflow
- [troubleshooting.md](references/troubleshooting.md) — OpenClaw fix, env vars, common render errors
@@ -0,0 +1,185 @@
# HyperFrames CLI
Everything runs through `npx hyperframes` (or the globally-installed `hyperframes` after `npm install -g hyperframes`). Requires Node.js >= 22 and FFmpeg.
## Workflow
1. **Scaffold**`npx hyperframes init my-video` (or `npx hyperframes capture <url>` if starting from a website)
2. **Write** — author HTML composition (see `composition.md`)
3. **Lint**`npx hyperframes lint`
4. **Validate**`npx hyperframes validate` (WCAG contrast audit)
5. **Inspect**`npx hyperframes inspect` (visual layout audit)
6. **Preview**`npx hyperframes preview`
7. **Render**`npx hyperframes render`
Always lint before preview/render — catches missing `data-composition-id`, overlapping tracks, and unregistered timelines.
## init — Scaffold a Project
```bash
npx hyperframes init my-video # interactive wizard
npx hyperframes init my-video --example warm-grain # pick an example template
npx hyperframes init my-video --video clip.mp4 # seed with a video file
npx hyperframes init my-video --audio track.mp3 # seed with an audio file
npx hyperframes init my-video --non-interactive # skip prompts (CI / agent use)
```
Templates: `blank`, `warm-grain`, `play-mode`, `swiss-grid`, `vignelli`, `decision-tree`, `kinetic-type`, `product-promo`, `nyt-graph`.
`init` creates the correct file structure, copies media, transcribes audio with Whisper, and installs authoring skills. Use it instead of creating files by hand.
## capture — Website → Editable Components
```bash
npx hyperframes capture https://example.com # → captures/example.com/
npx hyperframes capture https://stripe.com -o stripe-video # custom output dir
npx hyperframes capture https://example.com --json # machine-readable output
npx hyperframes capture https://example.com --skip-assets # skip images/SVGs
```
Captures the site into `captures/<hostname>/capture/` by default, producing `capture/screenshots/`, `capture/assets/`, `capture/extracted/` (tokens.json, visible-text.txt, fonts.json), and a self-contained snapshot.
All downstream steps (DESIGN.md, SCRIPT.md, STORYBOARD, composition) read from the `capture/` subfolder — see `website-to-video.md`.
## lint
```bash
npx hyperframes lint # current directory
npx hyperframes lint ./my-project # specific project
npx hyperframes lint --verbose # include info-level findings
npx hyperframes lint --json # machine-readable output
```
Lints `index.html` and all files in `compositions/`. Reports errors (must fix), warnings (should fix), and info (only with `--verbose`).
## validate
```bash
npx hyperframes validate # WCAG contrast audit at 5 timestamps
npx hyperframes validate --no-contrast # skip while iterating
```
Seeks to 5 timestamps, screenshots the page, samples background pixels behind every text element, and warns on contrast ratios below 4.5:1 (normal text) or 3:1 (large text — 24px+, or 19px+ bold). Run before final render.
## inspect
```bash
npx hyperframes inspect # visual layout audit at 5 timestamps
npx hyperframes inspect ./my-project # specific project
npx hyperframes inspect --json # agent-readable findings
npx hyperframes inspect --samples 15 # denser timeline sweep
npx hyperframes inspect --at 1.5,4,7.25 # explicit hero-frame timestamps
```
Use this after `lint` and `validate`, especially for compositions with speech bubbles, cards, captions, or tight typography. Reports overflow, off-frame elements, occluded text, contrast warnings, and per-timestamp layout summaries — catches issues that pure timeline lint can't see (e.g., a caption that wraps past the safe area only at a specific timestamp).
`npx hyperframes layout` is a compatibility alias for the same visual inspection pass.
## preview
```bash
npx hyperframes preview # serve current directory (port 3002)
npx hyperframes preview --port 4567 # custom port
```
Hot-reloads on file changes. Opens the Studio in your browser automatically.
## render
```bash
npx hyperframes render # standard MP4
npx hyperframes render --output final.mp4 # named output
npx hyperframes render --quality draft # fast iteration
npx hyperframes render --fps 60 --quality high # final delivery
npx hyperframes render --format webm # transparent WebM
npx hyperframes render --docker # byte-identical reproducible render
```
| Flag | Options | Default | Notes |
| -------------- | ----------------------- | ------------------------------ | --------------------------- |
| `--output` | path | `renders/<name>_<timestamp>.mp4` | Output path |
| `--fps` | 24, 30, 60 | 30 | 60fps doubles render time |
| `--quality` | `draft`, `standard`, `high` | standard | draft for iterating |
| `--format` | `mp4`, `webm` | mp4 | WebM supports transparency |
| `--workers` | 18 or `auto` | auto | Each spawns Chrome |
| `--docker` | flag | off | Reproducible output |
| `--gpu` | flag | off | GPU-accelerated encoding |
| `--strict` | flag | off | Fail on lint errors |
| `--strict-all` | flag | off | Fail on errors AND warnings |
**Quality guidance:** `draft` while iterating, `standard` for review, `high` for final delivery.
## transcribe
```bash
npx hyperframes transcribe audio.mp3
npx hyperframes transcribe video.mp4 --model medium.en --language en
npx hyperframes transcribe subtitles.srt # import existing
npx hyperframes transcribe subtitles.vtt
npx hyperframes transcribe openai-response.json
```
Produces word-level timings suitable for caption components. First run downloads the Whisper model (cached after).
## tts
```bash
npx hyperframes tts "Text here" --voice af_nova --output narration.wav
npx hyperframes tts script.txt --voice bf_emma
npx hyperframes tts "La reunión empieza a las nueve" --voice ef_dora --output es.wav
npx hyperframes tts "Hello there" --voice af_heart --lang fr-fr --output accented.wav
npx hyperframes tts --list # show all voices
```
Uses Kokoro (local, no API key). Voice ID first letter encodes language: `a` American English, `b` British English, `e` Spanish, `f` French, `h` Hindi, `i` Italian, `j` Japanese, `p` Brazilian Portuguese, `z` Mandarin. The CLI auto-infers the phonemizer locale from that prefix — pass `--lang` only to override (e.g. stylized accents). Valid `--lang` codes: `en-us`, `en-gb`, `es`, `fr-fr`, `hi`, `it`, `pt-br`, `ja`, `zh`. Non-English phonemization requires `espeak-ng` installed system-wide (`apt-get install espeak-ng` / `brew install espeak-ng`).
## doctor
```bash
npx hyperframes doctor
```
Verifies environment:
- Node.js >= 22
- FFmpeg present on PATH
- Available RAM (renders are memory-hungry — 4 GB minimum)
- Chrome binary resolution (`chrome-headless-shell` preferred over system Chrome)
- Current `hyperframes` version
Run this **first** when a render fails. See `troubleshooting.md` for interpreting the output.
## browser
```bash
npx hyperframes browser --install # install the bundled chrome-headless-shell
npx hyperframes browser --path # print the resolved browser binary path
npx hyperframes browser --clean # clear the bundled browser cache
```
## info
```bash
npx hyperframes info
```
Prints version, Node version, FFmpeg version, OS, and resolved browser path — useful in bug reports.
## upgrade
```bash
npx hyperframes upgrade -y
```
Check for and install updates. Run this if you hit `HeadlessExperimental.beginFrame` errors — the auto-detect fix shipped in `hyperframes@0.4.2` (commit 4c72ba4, March 2026).
## Other
```bash
npx hyperframes compositions # list compositions in the project
npx hyperframes docs # open documentation in browser
npx hyperframes benchmark . # benchmark render performance
npx hyperframes add <block> # install a block/component from the catalog
npx hyperframes add --list # browse the catalog
```
Popular catalog blocks: `flash-through-white` (shader transition), `instagram-follow` (social overlay), `data-chart` (animated chart), `lower-third` (talking-head overlay). See [hyperframes.heygen.com/catalog](https://hyperframes.heygen.com/catalog).
@@ -0,0 +1,129 @@
# Composition Authoring
HTML structure, data attributes, timeline contract, and non-negotiable rules.
## Root Structure
Standalone `index.html` — the top-level composition. **Does NOT use `<template>`**. Put the `data-composition-id` div directly in `<body>`.
```html
<!doctype html>
<html>
<body>
<div
id="stage"
data-composition-id="root"
data-start="0"
data-duration="10"
data-width="1920"
data-height="1080"
>
<!-- clips go here -->
<video id="clip-1" data-start="0" data-duration="5" data-track-index="0" src="intro.mp4" muted playsinline></video>
<img id="logo" data-start="2" data-duration="3" data-track-index="1" src="logo.png" />
<audio id="music" data-start="0" data-duration="10" data-track-index="2" data-volume="0.5" src="music.wav"></audio>
</div>
<script src="https://cdn.jsdelivr.net/npm/gsap@3.14.2/dist/gsap.min.js"></script>
<script>
window.__timelines = window.__timelines || {};
const tl = gsap.timeline({ paused: true });
tl.from("#logo", { opacity: 0, y: 40, duration: 0.6 }, 2);
window.__timelines["root"] = tl;
</script>
</body>
</html>
```
Sub-compositions loaded via `data-composition-src` **DO** use `<template>`:
```html
<template id="my-comp-template">
<div data-composition-id="my-comp" data-width="1920" data-height="1080">
<!-- content + scoped <style> + <script> with window.__timelines["my-comp"] -->
</div>
</template>
```
Load from the root: `<div id="el-1" data-composition-id="my-comp" data-composition-src="compositions/my-comp.html" data-start="0" data-duration="10" data-track-index="1"></div>`
## Data Attributes
### All clips
| Attribute | Required | Values |
| ------------------ | --------------------------------- | ------------------------------------------------------ |
| `id` | Yes | Unique identifier |
| `data-start` | Yes | Seconds, or clip ID reference (`"el-1"`, `"intro + 2"`) |
| `data-duration` | Required for img/div/compositions | Seconds. Video/audio defaults to media duration. |
| `data-track-index` | Yes | Integer. Same-track clips cannot overlap. |
| `data-media-start` | No | Trim offset into source (seconds) |
| `data-volume` | No | 01 (default 1) |
`data-track-index` controls timeline layout only — **not** visual layering. Use CSS `z-index` for layering.
### Composition clips
| Attribute | Required | Values |
| ---------------------------- | -------- | -------------------------------------------- |
| `data-composition-id` | Yes | Unique composition ID |
| `data-start` | Yes | Start time (root composition: `"0"`) |
| `data-duration` | Yes | Takes precedence over GSAP timeline duration |
| `data-width` / `data-height` | Yes | Pixel dimensions (1920x1080 or 1080x1920) |
| `data-composition-src` | No | Path to external HTML file |
## Timeline Contract
- Every timeline starts `{ paused: true }` — the player controls playback.
- Register every timeline: `window.__timelines["<composition-id>"] = tl`.
- Duration comes from `data-duration`, not from the GSAP timeline length.
- Framework auto-nests sub-timelines — do NOT manually add them.
- Never create empty tweens just to set duration.
## Non-Negotiable Rules
1. **Deterministic.** No `Math.random()`, `Date.now()`, or time-based logic. Use a seeded PRNG (e.g. mulberry32) if you need pseudo-randomness.
2. **GSAP only on visual properties.** `opacity`, `x`, `y`, `scale`, `rotation`, `color`, `backgroundColor`, `borderRadius`, transforms. Never animate `visibility`, `display`, or call `video.play()`/`audio.play()`.
3. **No property conflicts across timelines.** Never animate the same property on the same element from multiple timelines simultaneously.
4. **No `repeat: -1`.** Infinite-repeat tweens break the capture engine. Compute `repeat: Math.ceil(duration / cycleDuration) - 1`.
5. **Synchronous timeline construction.** Never build timelines inside `async`/`await`, `setTimeout`, or Promises. The capture engine reads `window.__timelines` synchronously after page load. Fonts are embedded by the compiler — no need to wait for load.
6. **Root composition has no `<template>` wrapper.** Only sub-compositions use `<template>`.
7. **Video is always `muted playsinline`.** Audio is always a separate `<audio>` element — even if it's the same source file.
8. **Content containers use padding, not absolute positioning.** `.scene-content { width: 100%; height: 100%; padding: Npx; display: flex; flex-direction: column; gap: Npx; box-sizing: border-box }`. Absolute-positioned content containers overflow. Reserve `position: absolute` for decoratives only.
## Scene Transitions
Multi-scene compositions MUST follow all of these:
1. **Always use a transition between scenes.** No jump cuts.
2. **Always use entrance animations** on every scene element. Every element animates IN via `gsap.from(...)`. No element may appear fully-formed.
3. **Never use exit animations** (except on the final scene). This means NO `gsap.to()` that animates `opacity` to 0, `y` offscreen, etc. The transition IS the exit. Outgoing scene content must be fully visible at the moment the transition starts.
4. **Final scene only:** may fade elements out. This is the only scene where `gsap.to(..., { opacity: 0 })` is allowed.
## Typography and Assets
- **Fonts:** write the `font-family` you want in CSS — the compiler embeds supported fonts automatically. Unsupported fonts produce a compiler warning.
- Add `crossorigin="anonymous"` to external media.
- For dynamic text sizing, use `window.__hyperframes.fitTextFontSize(text, { maxWidth, fontFamily, fontWeight })`.
- All project files live at the project root alongside `index.html`. Sub-compositions reference assets with `../`.
- For rendered video: 60px+ headlines, 20px+ body, 16px+ data labels. `font-variant-numeric: tabular-nums` on number columns. Avoid full-screen linear gradients on dark backgrounds (H.264 banding — use radial or solid + localized glow).
## Animation Guardrails
- Offset the first animation 0.10.3s (not `t=0`).
- Vary eases across entrance tweens — at least 3 different eases per scene.
- Don't repeat an entrance pattern within a scene.
## Never Do
1. Forget `window.__timelines` registration.
2. Use video for audio — always muted video + separate `<audio>`.
3. Nest video inside a timed div — use a non-timed wrapper.
4. Use `data-layer` (use `data-track-index`) or `data-end` (use `data-duration`).
5. Animate video element dimensions — animate a wrapper div instead.
6. Call `play`/`pause`/`seek` on media — framework owns playback.
7. Create a top-level container without `data-composition-id`.
8. Use `repeat: -1` on any timeline or tween.
9. Build timelines asynchronously.
10. Use `gsap.set()` on elements from later scenes — they don't exist in the DOM at page load. Use `tl.set(selector, vars, timePosition)` inside the timeline at or after the clip's `data-start`.
11. Use `<br>` in content text — causes unwanted extra breaks when the text wraps naturally. Use `max-width` instead. Exception: short display titles (e.g., "THE\nIMMORTAL\nGAME") where each word is deliberately on its own line.
@@ -0,0 +1,289 @@
# HyperFrames Feature Reference
Load this file when a composition needs captions, TTS narration, audio-reactive visuals, marker-style text highlighting, or scene transitions. All patterns here are deterministic (no `Math.random()`, no `Date.now()`, no runtime audio analysis) and live on the same GSAP timeline as the rest of the composition.
## Captions
### Language Rule (Non-Negotiable)
**Never use `.en` whisper models unless the audio is confirmed English.** `.en` models TRANSLATE non-English audio into English instead of transcribing it.
- User says the language → `npx hyperframes transcribe audio.mp3 --model small --language <code>` (no `.en`)
- User confirms English → `--model small.en`
- Language unknown → `--model small` (auto-detects)
### Style Detection
If the user doesn't specify a caption style, detect it from the transcript tone:
| Tone | Font mood | Animation | Color | Size |
| ------------ | ------------------------ | ---------------------------------- | --------------------------- | ------- |
| Hype / launch | Heavy condensed, 800-900 | Scale-pop, `back.out(1.7)`, 0.1-0.2s | Bright on dark | 72-96px |
| Corporate | Clean sans, 600-700 | Fade+slide, `power3.out`, 0.3s | White / neutral + muted accent | 56-72px |
| Tutorial | Mono / clean sans, 500-600 | Typewriter or fade, 0.4-0.5s | High contrast, minimal | 48-64px |
| Storytelling | Serif / elegant, 400-500 | Slow fade, `power2.out`, 0.5-0.6s | Warm muted tones | 44-56px |
| Social | Rounded sans, 700-800 | Bounce, `elastic.out`, word-by-word | Playful, colored pills | 56-80px |
### Word Grouping
- High energy: 2-3 words, quick turnover.
- Conversational: 3-5 words, natural phrases.
- Measured / calm: 4-6 words.
Break on sentence boundaries, 150ms+ pauses, or a max word count.
### Positioning
- Landscape (1920x1080): bottom 80-120px, centered.
- Portrait (1080x1920): ~600-700px from bottom, centered.
- Never cover the subject's face. `position: absolute` (never relative). One caption group visible at a time.
### Text Overflow Prevention
Use the runtime helper so captions never overflow:
```js
const result = window.__hyperframes.fitTextFontSize(group.text.toUpperCase(), {
fontFamily: "Outfit",
fontWeight: 900,
maxWidth: 1600, // 1600 landscape, 900 portrait
});
el.style.fontSize = result.fontSize + "px";
```
When per-word styling uses `scale > 1.0`, compute `maxWidth = safeWidth / maxScale` to leave headroom. Container needs `overflow: visible` (not `hidden` — hidden clips scaled emphasis words and glow).
### Caption Exit Guarantee
Every group MUST have a hard kill after its exit tween — otherwise groups leak into later ones:
```js
tl.to(groupEl, { opacity: 0, scale: 0.95, duration: 0.12, ease: "power2.in" }, group.end - 0.12);
tl.set(groupEl, { opacity: 0, visibility: "hidden" }, group.end); // deterministic kill
```
### Per-Word Styling
Scan the transcript for words that deserve distinct treatment:
- Brand / product names — larger, unique color.
- ALL CAPS — scale boost, flash, accent color.
- Numbers / statistics — bold weight, accent color.
- Emotional keywords — exaggerated animation (overshoot, bounce).
- Call-to-action — highlight, underline, color pop.
## TTS (Kokoro-82M)
Local, no API key. Runs on CPU. Model downloads on first use (~311 MB + ~27 MB voices, cached in `~/.cache/hyperframes/tts/`).
### Voice Selection
| Content type | Voice | Why |
| ------------- | ----------------------- | --------------------------- |
| Product demo | `af_heart` / `af_nova` | Warm, professional |
| Tutorial | `am_adam` / `bf_emma` | Neutral, easy to follow |
| Marketing | `af_sky` / `am_michael` | Energetic or authoritative |
| Documentation | `bf_emma` / `bm_george` | Clear British English |
| Casual | `af_heart` / `af_sky` | Approachable, natural |
Run `npx hyperframes tts --list` for all 54 voices across 8 languages.
### Multilingual Phonemization
Voice ID first letter encodes language: `a`=American English, `b`=British English, `e`=Spanish, `f`=French, `h`=Hindi, `i`=Italian, `j`=Japanese, `p`=Brazilian Portuguese, `z`=Mandarin. The CLI auto-infers the phonemizer locale from that prefix — you don't need `--lang` when voice and text match.
```bash
npx hyperframes tts "La reunión empieza a las nueve" --voice ef_dora --output es.wav
npx hyperframes tts "今日はいい天気ですね" --voice jf_alpha --output ja.wav
```
Pass `--lang` only to override auto-detection (e.g. stylized accents):
```bash
npx hyperframes tts "Hello there" --voice af_heart --lang fr-fr --output accented.wav
```
Valid `--lang` codes: `en-us`, `en-gb`, `es`, `fr-fr`, `hi`, `it`, `pt-br`, `ja`, `zh`. Non-English phonemization requires `espeak-ng` installed system-wide (`apt-get install espeak-ng` / `brew install espeak-ng`).
### Speed
- `0.7-0.8` — tutorial, complex content
- `1.0` — natural (default)
- `1.1-1.2` — intros, upbeat content
- `1.5+` — rarely appropriate
### TTS + Captions Workflow
```bash
npx hyperframes tts script.txt --voice af_heart --output narration.wav
npx hyperframes transcribe narration.wav # → transcript.json (word-level)
```
## Audio-Reactive Visuals
Drive visuals from music, voice, or sound. Any GSAP-tweenable property can respond to pre-extracted audio data.
### Data format
```js
const AUDIO_DATA = {
fps: 30,
totalFrames: 900,
frames: [{ bands: [0.82, 0.45, 0.31, /* ... */] }, /* ... */],
};
```
`frames[i].bands[]` are frequency band amplitudes, 0-1. Index 0 = bass, higher indices = treble. Each band is normalized independently across the full track.
### Mapping audio to visuals
| Audio signal | Visual property | Effect |
| ---------------------- | --------------------------------- | -------------------------- |
| Bass (`bands[0]`) | `scale` | Pulse on beat |
| Treble (`bands[12-14]`)| `textShadow`, `boxShadow` | Glow intensity |
| Overall amplitude | `opacity`, `y`, `backgroundColor` | Breathe, lift, color shift |
| Mid-range (`bands[4-8]`)| `borderRadius`, `width` | Shape morphing |
Any GSAP-tweenable property works — `clipPath`, `filter`, SVG attributes, CSS custom properties. Let content guide the visual and let audio drive its behavior. **Never add** equalizer bars, spectrum analyzers, waveform displays, rainbow cycling, or generic particle systems — they look cheap.
### Sampling pattern (required)
Audio reactivity needs per-frame sampling via a `for` loop of `tl.call()`, NOT a single tween. A single long tween does NOT react to audio:
```js
for (let f = 0; f < AUDIO_DATA.totalFrames; f++) {
tl.call(
((frame) => () => draw(frame))(AUDIO_DATA.frames[f]),
[],
f / AUDIO_DATA.fps,
);
}
```
### Gotchas
- **textShadow on a container** with semi-transparent children (e.g. inactive caption words at `rgba(255,255,255,0.3)`) renders a visible glow rectangle behind every child. Apply the glow to active words individually, not to the container.
- **Subtlety for text** — 3-6% scale variation, soft glow. Heavy pulsing makes text unreadable.
- **Go bigger on non-text** — backgrounds and shapes can handle 10-30% swings.
- **Deterministic only** — pre-extracted audio data, no Web Audio API, no runtime analysis.
## Marker-Style Highlighting
Deterministic CSS + GSAP implementations of the classic "highlight / circle / burst / scribble / sketchout" drawing modes for emphasizing text. Fully seekable — no animated SVG filters, no JS timers.
### Highlight (yellow marker sweep)
```html
<span class="mh-highlight-wrap">
<span class="mh-highlight-bar" id="hl-1"></span>
<span class="mh-highlight-text">highlighted text</span>
</span>
```
```css
.mh-highlight-wrap { position: relative; display: inline; }
.mh-highlight-bar {
position: absolute; inset: 0 -6px;
background: #fdd835; opacity: 0.35;
transform: scaleX(0); transform-origin: left center;
border-radius: 3px; z-index: 0;
}
.mh-highlight-text { position: relative; z-index: 1; }
```
```js
tl.to("#hl-1", { scaleX: 1, duration: 0.5, ease: "power2.out" }, 0.6);
```
Multi-line: apply to `.mh-highlight-bar` with `stagger: 0.3`.
### Circle
Hand-drawn ellipse around a word. Use a positioned `::before` with `border-radius: 50%`, slight rotation, and `clip-path` to avoid covering the letters. Animate `clip-path` or `stroke-dashoffset` on an inline SVG circle.
### Burst
Short radiating lines around a word. Render 6-12 small `<span>` elements positioned in a radial pattern; animate `scaleY` from 0.
### Scribble
A chaotic overlay created by animating `stroke-dashoffset` on an inline SVG `<path>` with a `d` attribute describing a zig-zag. Seed values, never `Math.random()`.
### Sketchout
A rough rectangle outline. Two `<rect>`s with slight `transform` offsets, animated via `stroke-dashoffset`.
All five modes tween CSS transforms or `stroke-dashoffset` only — both tween cleanly, are deterministic, and seek correctly.
## Scene Transitions
Every multi-scene composition MUST use transitions. No jump cuts.
### Energy → primary transition
| Energy | CSS primary | Shader primary | Accent | Duration | Easing |
| ------------------------------------ | ---------------------------- | ------------------------------------ | ------------------------------ | --------- | ------------------------ |
| **Calm** (wellness, brand, luxury) | Blur crossfade, focus pull | Cross-warp morph, thermal distortion | Light leak, circle iris | 0.5-0.8s | `sine.inOut`, `power1` |
| **Medium** (corporate, SaaS) | Push slide, staggered blocks | Whip pan, cinematic zoom | Squeeze, vertical push | 0.3-0.5s | `power2`, `power3` |
| **High** (promos, sports, launch) | Zoom through, overexposure | Ridged burn, glitch, chromatic split | Staggered blocks, gravity drop | 0.15-0.3s | `power4`, `expo` |
Pick ONE primary (60-70% of scene changes) plus 1-2 accents. Never use a different transition for every scene.
### Mood → transition type
| Mood | Transitions |
| ------------------------ | --------------------------------------------------------------------------- |
| Warm / inviting | Light leak, blur crossfade, focus pull, film burn · _Shader:_ thermal distortion, cross-warp morph |
| Cold / clinical | Squeeze, zoom out, blinds, shutter, grid dissolve · _Shader:_ gravitational lens |
| Editorial / magazine | Push slide, vertical push, diagonal split, shutter · _Shader:_ whip pan |
| Tech / futuristic | Grid dissolve, staggered blocks, blinds · _Shader:_ glitch, chromatic split |
| Tense / edgy | Glitch, VHS, chromatic aberration, ripple · _Shader:_ ridged burn, domain warp |
| Playful / fun | Elastic push, 3D flip, circle iris, morph circle · _Shader:_ swirl vortex, ripple waves |
| Dramatic / cinematic | Zoom through, gravity drop, overexposure · _Shader:_ cinematic zoom, gravitational lens |
| Premium / luxury | Focus pull, blur crossfade, color dip to black · _Shader:_ cross-warp morph |
| Retro / analog | Film burn, light leak, VHS, clock wipe · _Shader:_ light leak |
### Presets
| Preset | Duration | Easing |
| ---------- | -------- | ----------------- |
| `snappy` | 0.2s | `power4.inOut` |
| `smooth` | 0.4s | `power2.inOut` |
| `gentle` | 0.6s | `sine.inOut` |
| `dramatic` | 0.5s | `power3.in` → out |
| `instant` | 0.15s | `expo.inOut` |
| `luxe` | 0.7s | `power1.inOut` |
### Install a shader transition
```bash
npx hyperframes add flash-through-white
npx hyperframes add --list
```
### CSS vs shader
- **CSS transitions** animate scene containers with opacity, transforms, `clip-path`, and filters. Simpler to set up.
- **Shader transitions** composite both scene textures per-pixel on a WebGL canvas — can warp, dissolve, and morph in ways CSS cannot. Import from `@hyperframes/shader-transitions` instead of writing raw GLSL.
Don't mix CSS and shader transitions in the same composition — once a composition uses shader transitions, the WebGL canvas replaces DOM-based scene switching for every transition.
### Shader-compatible CSS rules
Shader transitions capture DOM scenes to WebGL textures via html2canvas. The canvas 2D pipeline doesn't match CSS exactly:
1. No `transparent` keyword in gradients — use the target color at zero alpha: `rgba(200,117,51,0)` not `transparent`. (Canvas interpolates `transparent` as `rgba(0,0,0,0)` creating dark fringes.)
2. No gradient backgrounds on elements thinner than 4px. Use solid `background-color` on thin accent lines.
3. No CSS variables (`var()`) on elements visible during capture — html2canvas doesn't reliably resolve custom properties. Use literal color values.
4. Mark uncapturable decoratives with `data-no-capture` — they stay on the live DOM but are absent from the shader texture.
5. No gradient opacity below 0.15 — renders differently in canvas vs CSS.
6. Every `.scene` div must have explicit `background-color`, AND pass the same color as `bgColor` in the `init()` config. Without either, the texture renders as black.
These rules only apply to shader transition compositions. CSS-only compositions have no restrictions.
### Don't
- Mix CSS and shader transitions in one composition.
- Use exit animations on any scene except the final scene — the transition IS the exit.
- Introduce a new transition type every scene — pick one primary + 1-2 accents.
- Use transitions that create visible geometric repetition (grids, hex cells, uniform dots) — they look artificial regardless of the math behind them. Prefer organic noise (FBM, domain warping).
@@ -0,0 +1,136 @@
# GSAP for HyperFrames
GSAP is the animation engine for all HyperFrames compositions. Load from CDN inside the composition:
```html
<script src="https://cdn.jsdelivr.net/npm/gsap@3.14.2/dist/gsap.min.js"></script>
```
## Core Tween Methods
- **`gsap.to(targets, vars)`** — animate from current state to `vars`. Most common.
- **`gsap.from(targets, vars)`** — animate from `vars` to current state (entrances).
- **`gsap.fromTo(targets, fromVars, toVars)`** — explicit start and end.
- **`gsap.set(targets, vars)`** — apply immediately (duration 0). Don't use on clip elements that enter later — use `tl.set(selector, vars, time)` inside the timeline instead.
Always use **camelCase** property names (`backgroundColor`, `rotationX`, not `background-color`).
## Common vars
- **`duration`** — seconds (default 0.5).
- **`delay`** — seconds before start.
- **`ease`** — `"power1.out"` (default), `"power3.inOut"`, `"back.out(1.7)"`, `"elastic.out(1, 0.3)"`, `"none"`, `"expo.out"`, `"circ.inOut"`.
- **`stagger`** — number `0.1` or object: `{ amount: 0.3, from: "center" }`, `{ each: 0.1, from: "random" }`.
- **`overwrite`** — `false` (default), `true`, or `"auto"`.
- **`repeat`** — number (never `-1` in HyperFrames). **`yoyo`** — alternates direction with repeat.
- **`onComplete`**, **`onStart`**, **`onUpdate`** — callbacks.
- **`immediateRender`** — default `true` for `from()`/`fromTo()`. Set `false` on later tweens targeting the same property+element to avoid overwrite surprises.
## Transforms
Prefer GSAP's transform aliases over raw CSS `transform`:
| GSAP property | Equivalent |
| --------------------------- | -------------------------- |
| `x`, `y`, `z` | translateX/Y/Z (px) |
| `xPercent`, `yPercent` | translateX/Y (%) |
| `scale`, `scaleX`, `scaleY` | scale |
| `rotation` | rotate (deg) |
| `rotationX`, `rotationY` | 3D rotate |
| `skewX`, `skewY` | skew |
| `transformOrigin` | transform-origin |
- **`autoAlpha`** — prefer over `opacity`. At 0, also sets `visibility: hidden`.
- **CSS variables**`"--hue": 180`.
- **Directional rotation**`"360_cw"`, `"-170_short"`, `"90_ccw"`.
- **`clearProps`** — `"all"` or comma-separated; removes inline styles on complete.
- **Relative values**`"+=20"`, `"-=10"`, `"*=2"`.
## Function-based Values
```js
gsap.to(".item", {
x: (i, target, targets) => i * 50,
stagger: 0.1,
});
```
## Easing
Built-in eases: `power1` through `power4`, `back`, `bounce`, `circ`, `elastic`, `expo`, `sine`. Each has `.in`, `.out`, `.inOut`.
Rule of thumb:
- Entrances: `power3.out`, `expo.out`, `back.out(1.4)`
- Exits: `power2.in`, `expo.in`
- Scrubbed sections: `none` (linear)
- Vary eases across entrance tweens within a scene — at least 3 different eases.
## Defaults
```js
gsap.defaults({ duration: 0.6, ease: "power2.out" });
```
## Timelines (HyperFrames primary pattern)
```js
window.__timelines = window.__timelines || {};
const tl = gsap.timeline({ paused: true, defaults: { duration: 0.6, ease: "power2.out" } });
tl.from(".title", { y: 50, opacity: 0 }, 0.3);
tl.from(".subtitle", { y: 30, opacity: 0 }, 0.5);
tl.from(".cta", { scale: 0.8, opacity: 0, ease: "back.out(1.7)" }, 0.8);
window.__timelines["root"] = tl;
```
### Position parameter
Third argument to `.from()` / `.to()` / `.add()`:
- Absolute seconds: `0.5`, `2.1`.
- Relative to end: `">+0.2"` (0.2s after previous), `"<"` (same time as previous), `"<+0.3"` (0.3s after previous's start).
- Named labels: `tl.addLabel("act2", 5); tl.from(".x", { y: 30 }, "act2");`
### Nesting
HyperFrames auto-nests sub-composition timelines. **Do not** manually `tl.add(subTl)` — the framework wires sub-timelines into the parent at the sub-composition's `data-start`.
### Playback
The player controls playback. Don't call `tl.play()`, `tl.pause()`, or `tl.reverse()` at construction time. `{ paused: true }` is required.
## Stagger
```js
// even distribution
tl.from(".card", { opacity: 0, y: 40, stagger: 0.1 });
// control total amount
tl.from(".card", { opacity: 0, stagger: { amount: 0.6, from: "center" } });
// deterministic "random" stagger (HyperFrames compositions must be deterministic)
tl.from(".dot", { opacity: 0, stagger: { each: 0.05, from: "random" } });
```
`stagger.from`: `"start"` | `"end"` | `"center"` | `"edges"` | `"random"` | index | `[x, y]` for grid.
## Performance
- Animate transforms (`x`, `y`, `scale`, `rotation`, `opacity`) — cheap, GPU-accelerated.
- Avoid animating `width`, `height`, `top`, `left`, `margin` — causes layout thrash.
- Avoid box-shadow or filter animations on large elements — expensive.
- `will-change` is rarely needed; GSAP handles promotion.
## gsap.matchMedia (rarely needed in HyperFrames)
Compositions have fixed dimensions (`data-width`/`data-height`), so responsive breakpoints don't apply. You may still use `matchMedia` for `prefers-reduced-motion` when authoring UI previews, but it's not used in rendered video output.
## Don't Do
- `repeat: -1` anywhere — breaks the capture engine.
- `Math.random()`, `Date.now()`, performance.now()` inside tween values — non-deterministic.
- `async` / `setTimeout` / `Promise` around timeline construction — the capture engine reads `window.__timelines` synchronously.
- Animate `visibility` or `display` directly — use `autoAlpha`.
- `gsap.set()` on clip elements that enter later in the timeline — they don't exist in the DOM at page-load. Use `tl.set(sel, vars, time)` inside the timeline.
@@ -0,0 +1,137 @@
# Troubleshooting
## `HeadlessExperimental.beginFrame' wasn't found` (first thing to check)
**Symptom:** `npx hyperframes render` fails with:
```
✗ Render failed
Protocol error (HeadlessExperimental.beginFrame):
'HeadlessExperimental.beginFrame' wasn't found
```
**Cause:** Chromium 147+ removed the `HeadlessExperimental.beginFrame` CDP command. This affected sandbox environments (e.g., OpenClaw, some containerized agent hosts) that ship modern Chromium as the system browser. See [hyperframes#294](https://github.com/heygen-com/hyperframes/issues/294).
**Fix (permanent — preferred):** upgrade.
```bash
npx hyperframes upgrade -y
# or
npm install -g hyperframes@latest
```
`hyperframes >= 0.4.2` auto-detects whether the resolved browser supports `beginFrame` (checks for `chrome-headless-shell` in the binary path) and falls back to screenshot capture mode when it doesn't. Commit [`4c72ba4`](https://github.com/heygen-com/hyperframes/commit/4c72ba4a36ec2bd6733f7b9cb2a9e63f9fb234b9) (March 2026) shipped this auto-detect.
**Fix (escape hatch — if you can't upgrade):**
```bash
export PRODUCER_FORCE_SCREENSHOT=true
npx hyperframes render
```
This forces screenshot mode regardless of the binary. Screenshot mode is slightly slower but visually identical.
**Fix (prevent — recommended):** install `chrome-headless-shell` so the engine can use the fast BeginFrame path:
```bash
npx puppeteer browsers install chrome-headless-shell
# or let the CLI do it
npx hyperframes browser --install
```
`scripts/setup.sh` runs this automatically.
## `npx hyperframes render` hangs for 120s then times out
**Cause:** the resolved browser is system Chrome (e.g., `/usr/bin/google-chrome`) and doesn't support the BeginFrame path, but auto-detect also missed it (older `hyperframes` version).
**Fix:**
1. Check which binary is being used: `npx hyperframes browser --path`
2. If it's system Chrome, either:
- Install `chrome-headless-shell`: `npx hyperframes browser --install`, OR
- Set the escape hatch: `export PRODUCER_FORCE_SCREENSHOT=true`, OR
- Upgrade: `npx hyperframes upgrade -y`
## `ffmpeg: command not found`
Install FFmpeg via your system package manager:
| OS / distro | Command |
| --------------- | ----------------------------------- |
| Ubuntu / Debian | `sudo apt-get install -y ffmpeg` |
| Fedora / RHEL | `sudo dnf install -y ffmpeg` |
| Arch | `sudo pacman -S ffmpeg` |
| macOS | `brew install ffmpeg` |
| Windows | `winget install Gyan.FFmpeg` |
Verify: `ffmpeg -version`.
## `Node version X is not supported`
HyperFrames requires Node.js >= 22. Check with `node --version`.
- **nvm:** `nvm install 22 && nvm use 22`
- **Homebrew (macOS):** `brew install node@22 && brew link --overwrite node@22`
- **apt:** follow [nodesource](https://github.com/nodesource/distributions) for Node 22 LTS.
## `ENOSPC: no space left on device` or OOM kills during render
Renders are memory- and disk-hungry. Minimums:
- **RAM:** 4 GB free (8 GB recommended for 60fps / `--quality high`)
- **Disk:** 2 GB free scratch space — frames are written to `/tmp` during capture
Mitigations:
- Lower quality: `--quality draft`.
- Lower fps: `--fps 24`.
- Lower worker count: `--workers 1`.
- Set `TMPDIR` to a volume with more space: `export TMPDIR=/mnt/scratch`.
## Lint passes but the render is blank / black frames
Check the browser console in `preview` — usually:
- A timeline was registered with the wrong key (`__timelines["typo"]` instead of `__timelines["root"]`).
- The root composition was wrapped in `<template>` (only sub-compositions use `<template>`).
- A script tag failed to load — check Network tab in preview.
Run `npx hyperframes lint --verbose` to see info-level findings.
## Contrast warnings from `hyperframes validate`
```
⚠ WCAG AA contrast warnings (3):
· .subtitle "secondary text" — 2.67:1 (need 4.5:1, t=5.3s)
```
- **Dark backgrounds:** brighten the failing color until it clears 4.5:1 (normal text) or 3:1 (large text — 24px+ or 19px+ bold).
- **Light backgrounds:** darken it.
- Stay within the palette family — don't invent a new color, adjust the existing one.
- Skip the check temporarily with `--no-contrast` if iterating rapidly, but clear it before delivery.
## `Font family 'X' not supported by compiler`
The compiler embeds a curated set of web-safe + open-source fonts. If a font isn't supported, either:
- Swap to a supported alternative from the warning.
- Register a custom font via `@font-face` pointing to a `.woff2` in the project directory (the compiler embeds referenced `@font-face` files).
## Video plays back muted or with no audio
Check:
- The `<video>` element has `muted playsinline` (required — browser autoplay policy).
- Audio is a **separate** `<audio>` element, not the video element.
- Audio `data-volume` is set (defaults to 1).
- The audio file is at the expected path — compositions load relative to their own directory.
## Docker render fails on Linux with rootless Docker
Add `--privileged` or pass `--cap-add=SYS_ADMIN`:
```bash
npx hyperframes render --docker --docker-args "--cap-add=SYS_ADMIN"
```
The headless browser needs namespace permissions for sandboxing.
## Bug reports
Include `npx hyperframes info` output + the full error log. File at [github.com/heygen-com/hyperframes](https://github.com/heygen-com/hyperframes/issues).
@@ -0,0 +1,145 @@
# Website to Video
Capture a website, produce a professional video from it. Use when the user provides a URL and wants a video — social ad, product tour, 30-second promo, etc.
The workflow has 7 steps. Each produces an artifact that gates the next. **Do not skip steps** — each artifact prevents a downstream failure mode.
## Step 1: Capture & Understand
```bash
npx hyperframes capture https://example.com -o example-video
```
Produces `example-video/capture/` with:
- `capture/screenshots/` — above-the-fold + section screenshots (up to `--max-screenshots`)
- `capture/assets/` — logos, hero images, background video (if any)
- `capture/extracted/tokens.json` — colors, fonts, and spacing tokens
- `capture/extracted/visible-text.txt` — extracted headings, paragraphs, CTAs
- `capture/extracted/fonts.json` — font families and stacks detected in computed styles
- `capture/asset-descriptions.md` — auto-generated asset catalog
All subsequent steps read from the `capture/` subfolder — `capture/extracted/tokens.json`, `capture/assets/hero.png`, etc. Never strip the `capture/` prefix when referencing these files.
**Gate:** Print a site summary — name, top 3 colors, primary + display fonts, hero asset path, one-sentence vibe. Keep it in your context — don't re-capture.
## Step 2: Write DESIGN.md
Small brand reference at the project root. 6 sections, ~90 lines. This is the cheat sheet — not the creative plan.
```markdown
# DESIGN
## Brand
- Name: Example Co.
- One-line mission: "…"
## Colors
- Background: #0B0F14
- Primary: #00E0A4 (accent, CTA)
- Secondary: #7A8B9B (body text)
- Text: #FFFFFF
## Typography
- Display: "Inter Tight", 700, tight letter-spacing
- Body: "Inter", 400
## Motion
- Mood: precise, technical, confident
- Eases: `power3.out` for entrances, `expo.in` for exits
## Assets
- Logo: `capture/assets/logo.svg`
- Hero image: `capture/assets/hero.png`
## What NOT to Do
- No purple, no pastels, no serif body
- No playful/bubbly eases (`elastic`, `bounce`)
- No drop shadows on text
```
**Gate:** `DESIGN.md` exists in the project directory.
## Step 3: Write SCRIPT.md
Narration script. Story backbone. **Scene durations come from the narration, not from guessing.**
```markdown
# SCRIPT
## Scene 1 — Hook (0:000:04)
"What if your dashboards wrote themselves?"
## Scene 2 — Problem (0:040:11)
"Teams spend hours stitching together queries, charts, and callouts — every Monday."
## Scene 3 — Solution (0:110:22)
"Example Co. watches your data streams and proposes the dashboard you'd have built — in seconds."
## Scene 4 — CTA (0:220:28)
"Try it free at example.com."
```
Run `npx hyperframes tts SCRIPT.md --voice af_nova --output narration.wav` to generate TTS audio. Note the exact duration — that's the video's duration.
**Gate:** `SCRIPT.md` + `narration.wav` exist and durations match the plan (±0.3s).
## Step 4: Storyboard
Text-only scene plan: for each scene, describe the hero frame — what's on screen at the scene's most-visible moment.
```markdown
# STORYBOARD
## Scene 1 (0:000:04) — Hook
Hero frame: giant "WHAT IF YOUR DASHBOARDS WROTE THEMSELVES?" in display font, centered, on near-black. Logo top-left at 40% opacity.
Entrance: each word staggers in, 0.08s apart.
Transition out: flash-through-white into Scene 2.
```
One paragraph per scene. Do NOT skip this step — it's where you catch narrative gaps before writing HTML.
**Gate:** `STORYBOARD.md` exists. Each scene has: hero frame, entrance, transition.
## Step 5: Composition
Write `index.html` scene-by-scene:
- Each scene is a `<div class="scene scene-N">` positioned absolutely, full-bleed.
- Static HTML+CSS for the hero frame first (no GSAP).
- Layer the narration `<audio>` at `data-start="0"` on a high track index.
- Add a transitions component (`flash-through-white`, `liquid-wipe`, etc.) between each scene.
- THEN add GSAP entrances (`gsap.from()`), no exits — transitions own the exit.
- Register `window.__timelines["root"] = tl`.
Install transitions as needed:
```bash
npx hyperframes add flash-through-white
```
## Step 6: Render
```bash
npx hyperframes lint --strict # must pass
npx hyperframes validate # WCAG contrast audit
npx hyperframes render --quality draft --output draft.mp4
```
Watch the draft. Note issues in a `REVIEW.md` bullet list (scene, timestamp, issue). Fix, re-render.
When happy:
```bash
npx hyperframes render --quality high --output final.mp4
```
## Step 7: Deliver
- Report file path + duration + file size to the user.
- If the user wants a vertical cut, re-render with a 9:16 composition (`data-width="1080" data-height="1920"`) — typically requires a separate `index-vertical.html` with tighter typography and re-stacked scene layout.
## Common Failure Modes
- **Skipped DESIGN.md** → colors drift scene-to-scene; output feels like "AI slides."
- **Skipped STORYBOARD.md** → scenes overlap or hero frames collide with transitions.
- **Exit animations** before transitions → empty frames when the transition fires.
- **Narration longer than `data-duration`** → audio clips mid-sentence. Update the composition's `data-duration` to match the TTS output length + 0.5s buffer.
+135
View File
@@ -0,0 +1,135 @@
#!/usr/bin/env bash
# HyperFrames setup for Hermes.
#
# Verifies Node >= 22 and FFmpeg, installs the `hyperframes` CLI globally,
# pre-caches `chrome-headless-shell`, and runs `hyperframes doctor`.
#
# Pins `hyperframes@>=0.4.2` so the OpenClaw/Chromium-147 fix from
# https://github.com/heygen-com/hyperframes/issues/294 (commit 4c72ba4)
# is always present — the engine auto-detects `HeadlessExperimental.beginFrame`
# support and falls back to screenshot capture otherwise.
#
# Idempotent: safe to re-run.
set -euo pipefail
MIN_NODE_MAJOR=22
MIN_HYPERFRAMES_VERSION="0.4.2"
red() { printf '\033[31m%s\033[0m\n' "$*"; }
green() { printf '\033[32m%s\033[0m\n' "$*"; }
yellow() { printf '\033[33m%s\033[0m\n' "$*"; }
bold() { printf '\033[1m%s\033[0m\n' "$*"; }
bold "==> HyperFrames setup"
# --- 1. Node.js --------------------------------------------------------------
if ! command -v node >/dev/null 2>&1; then
red "✗ Node.js is not installed."
echo " Install Node.js >= ${MIN_NODE_MAJOR} (nvm, Homebrew, or your package manager) and re-run."
exit 1
fi
node_version="$(node --version | sed 's/^v//')"
node_major="$(echo "$node_version" | cut -d. -f1)"
if [ "$node_major" -lt "$MIN_NODE_MAJOR" ]; then
red "✗ Node.js ${node_version} is too old. HyperFrames requires Node.js >= ${MIN_NODE_MAJOR}."
echo " Upgrade with 'nvm install ${MIN_NODE_MAJOR} && nvm use ${MIN_NODE_MAJOR}' or your package manager."
exit 1
fi
green "✓ Node.js ${node_version}"
# --- 2. FFmpeg ---------------------------------------------------------------
if ! command -v ffmpeg >/dev/null 2>&1; then
red "✗ FFmpeg is not installed."
case "$(uname -s)" in
Linux*) echo " sudo apt-get install -y ffmpeg # Debian/Ubuntu"
echo " sudo dnf install -y ffmpeg # Fedora/RHEL";;
Darwin*) echo " brew install ffmpeg";;
MINGW*|MSYS*|CYGWIN*) echo " winget install Gyan.FFmpeg";;
*) echo " See https://ffmpeg.org/download.html";;
esac
exit 1
fi
green "✓ FFmpeg $(ffmpeg -version 2>&1 | head -1 | awk '{print $3}')"
# --- 3. npm ------------------------------------------------------------------
if ! command -v npm >/dev/null 2>&1; then
red "✗ npm is not installed (should ship with Node.js)."
exit 1
fi
# --- 4. Install / upgrade hyperframes CLI -----------------------------------
bold "==> Installing hyperframes CLI (>= ${MIN_HYPERFRAMES_VERSION})"
current_hyperframes=""
if command -v hyperframes >/dev/null 2>&1; then
current_hyperframes="$(hyperframes --version 2>/dev/null | tail -1 | sed 's/^v//')"
fi
if [ -n "$current_hyperframes" ]; then
yellow " Found hyperframes ${current_hyperframes}"
fi
# Always install/upgrade to >= MIN version.
# Using 'latest' so we pick up any newer auto-detect/capture fixes.
if ! npm install -g "hyperframes@latest" >/dev/null 2>&1; then
red "✗ npm install -g hyperframes@latest failed."
echo " Try: sudo npm install -g hyperframes@latest"
echo " Or use a user-scoped npm prefix: npm config set prefix ~/.npm-global && export PATH=\"\$HOME/.npm-global/bin:\$PATH\""
exit 1
fi
installed_version="$(hyperframes --version 2>/dev/null | tail -1 | sed 's/^v//')"
green "✓ hyperframes ${installed_version} installed globally"
# Sanity-check minimum version.
version_ge() {
# version_ge A B → true if A >= B
[ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -1)" = "$2" ]
}
if ! version_ge "$installed_version" "$MIN_HYPERFRAMES_VERSION"; then
red "✗ hyperframes ${installed_version} is below required minimum ${MIN_HYPERFRAMES_VERSION}."
echo " Try 'npm install -g hyperframes@latest' or 'sudo npm install -g hyperframes@latest'."
exit 1
fi
# --- 5. Pre-cache chrome-headless-shell --------------------------------------
#
# Chromium 147+ removed HeadlessExperimental.beginFrame. System Chrome (e.g.
# /usr/bin/google-chrome) can't render with the fast path, so the engine
# auto-detects and falls back to screenshot mode — but BeginFrame mode is
# faster and produces higher-quality output. Install chrome-headless-shell
# up front so the engine picks it over system Chrome.
bold "==> Pre-caching chrome-headless-shell (for best render quality)"
if ! npx --yes puppeteer browsers install chrome-headless-shell >/dev/null 2>&1; then
yellow "⚠ Could not pre-install chrome-headless-shell."
yellow " Rendering will still work via screenshot-mode fallback (slower)."
yellow " If you hit HeadlessExperimental.beginFrame errors:"
yellow " export PRODUCER_FORCE_SCREENSHOT=true"
yellow " See references/troubleshooting.md."
else
green "✓ chrome-headless-shell installed"
fi
# --- 6. Doctor ---------------------------------------------------------------
bold "==> Running hyperframes doctor"
if hyperframes doctor; then
green "✓ HyperFrames is ready"
echo
echo " Scaffold a project: npx hyperframes init my-video"
echo " Preview: npx hyperframes preview"
echo " Render: npx hyperframes render"
else
yellow "⚠ hyperframes doctor reported issues."
yellow " See references/troubleshooting.md or re-run 'hyperframes doctor'."
exit 1
fi
@@ -0,0 +1,206 @@
---
name: kanban-video-orchestrator
description: Plan, set up, and monitor a multi-agent video production pipeline backed by Hermes Kanban. Use when the user wants to make ANY video — narrative film, product/marketing, music video, explainer, ASCII/terminal art, abstract/generative loop, comic, 3D, real-time/installation — and the work warrants decomposition into specialized profiles (writer, designer, animator, renderer, voice, editor, etc.) coordinated through a kanban board. Performs adaptive discovery to scope the brief, designs an appropriate team for the requested style, generates the setup script that creates Hermes profiles + initial kanban task, then helps monitor execution and intervene when tasks stall or fail. Routes scenes to whichever Hermes rendering / audio / design skill fits each beat (`ascii-video`, `manim-video`, `p5js`, `comfyui`, `touchdesigner-mcp`, `blender-mcp`, `pixel-art`, `baoyu-comic`, `claude-design`, `excalidraw`, `songsee`, `heartmula`, …) plus external APIs for TTS, image-gen, and image-to-video as needed.
version: 1.0.0
author: [SHL0MS, alt-glitch]
license: MIT
metadata:
hermes:
tags: [video, kanban, multi-agent, orchestration, production-pipeline]
related_skills: [kanban-orchestrator, kanban-worker, ascii-video, manim-video, p5js, comfyui, touchdesigner-mcp, blender-mcp, pixel-art, ascii-art, songwriting-and-ai-music, heartmula, songsee, spotify, youtube-content, claude-design, excalidraw, architecture-diagram, concept-diagrams, baoyu-comic, baoyu-infographic, humanizer, gif-search, meme-generation]
credits: |
The single-project workspace layout, profile-config patching pattern,
SOUL.md-per-profile model, TEAM.md task-graph convention, and
`--workspace dir:<path>` discipline are adapted from alt-glitch's
original multi-agent video pipeline at
https://github.com/NousResearch/kanban-video-pipeline.
---
# Kanban Video Orchestrator
Wrap any video request — from a 15-second product teaser to a 5-minute narrative
short to a music video to an ASCII loop — in a Hermes Kanban pipeline that
decomposes the work to specialized agent profiles.
This skill does **not** render anything itself. It is a meta-pipeline that:
1. **Scopes** the request through targeted discovery
2. **Designs** an appropriate team (which roles, which tools per role) based on the style
3. **Generates** a setup script that creates Hermes profiles, project workspace, and the initial kanban task
4. **Hands off** to the director profile, which decomposes via the kanban
5. **Monitors** execution, helps intervene when tasks stall or fail
The actual rendering happens inside the kanban once it's running, via whichever
existing skills + tools fit the scenes — `ascii-video`, `manim-video`, `p5js`,
`comfyui`, `touchdesigner-mcp`, `blender-mcp`, `songwriting-and-ai-music`,
`heartmula`, external APIs, or plain Python with PIL + ffmpeg.
## When NOT to use this skill
- The video is one continuous procedural project that needs no specialists. Just write the code directly.
- The user wants a quick one-shot conversion (e.g. "convert this mp4 to a GIF") — use ffmpeg directly.
- The output is a static image, GIF, or audio-only artifact — use the matching specific skill (`ascii-art`, `gifs`, `meme-generation`, `songwriting-and-ai-music`).
- The work fits a single existing skill cleanly (e.g. a pure ASCII video — just use `ascii-video`).
## Workflow
```
DISCOVER → BRIEF → TEAM DESIGN → SETUP → EXECUTE → MONITOR
```
### Step 1 — Discover (ask the right questions)
The discovery process is **adaptive**: ask only what is actually needed. Always
start with three questions to identify the broad shape:
- **What is the video?** (one-sentence brief)
- **How long?** (5-30s teaser / 30-90s short / 90s-3min explainer / 3-10min film / longer)
- **What aspect ratio + target platform?** (1:1 / 9:16 / 16:9; X, IG, YouTube, internal, etc.)
From the answer, classify the style category. The style determines which
follow-up questions to ask. **Do not ask all questions at once.** Ask 2-4 at a
time, listen, then proceed. Make reasonable assumptions whenever the user
implies an answer.
For complete intake patterns and per-style question banks, see
**[references/intake.md](references/intake.md)**.
### Step 2 — Brief
Once enough is known, produce a structured `brief.md` using the template in
`assets/brief.md.tmpl`. Stages:
1. **Concept** — the one-sentence pitch + emotional north star
2. **Scope** — duration, aspect, platform, deadline
3. **Style** — visual references, brand constraints, tone
4. **Scenes** — beat-by-beat breakdown (durations, content, target tool)
5. **Audio** — narration / music / SFX / silent (per scene if needed)
6. **Deliverables** — file format, resolution, optional alternates (vertical cut, GIF, etc.)
Show the brief to the user for confirmation before designing the team. **The
brief is the contract** — every downstream task references it.
### Step 3 — Team design
Pick role archetypes from the library that fit this video. **Compose, don't
clone.** Most videos need 4-7 profiles. The director is always present; the
rest are picked by what the brief actually requires.
For the role library and per-style team compositions, see
**[references/role-archetypes.md](references/role-archetypes.md)**.
For mapping role → which Hermes skills + toolsets it loads, see
**[references/tool-matrix.md](references/tool-matrix.md)**.
### Step 4 — Setup
Generate a setup script (`setup.sh`) and run it. The script:
1. Creates the project workspace (`~/projects/video-pipeline/<slug>/`)
2. Copies any provided assets into `taste/`, `audio/`, `assets/`
3. Creates each Hermes profile via `hermes profile create --clone`
4. Writes per-profile `SOUL.md` (personality + role definition)
5. Configures profile YAML (toolsets, always_load skills, cwd)
6. Writes `brief.md`, `TEAM.md`, and `taste/` content
7. Fires the initial `hermes kanban create` task assigned to the director
Use `scripts/bootstrap_pipeline.py` to generate setup.sh from a brief +
team-design JSON. See **[references/kanban-setup.md](references/kanban-setup.md)**
for the setup script structure, profile config patterns, and the critical
"shared workspace" rule.
### Step 5 — Execute
Run `setup.sh`. Then provide the user with monitoring commands:
```bash
hermes kanban watch --tenant <project-tenant> # live events
hermes kanban list --tenant <project-tenant> # board snapshot
hermes dashboard # visual board UI
```
The director profile takes over from here, decomposing the work and routing
tasks to specialist profiles via the kanban toolset.
### Step 6 — Monitor and intervene
Stay engaged — the kanban runs autonomously but a stuck task or bad output
needs human (or AI) judgment.
Monitoring patterns: poll `kanban list` periodically, inspect any RUNNING task
that exceeds its expected duration with `kanban show <id>`, and check
heartbeats. When a worker's output fails review, the standard interventions are:
1. Comment on the worker's task with specific feedback (`kanban_comment`)
2. Create a re-run task with the original as parent
3. Adjust the brief's scope and let the director re-decompose
For diagnostic patterns, intervention recipes, and the "task is stuck"
playbook, see **[references/monitoring.md](references/monitoring.md)**.
## Reference: worked examples
Six concrete pipelines covering very different video styles — narrative film,
product/marketing, music video, math/algorithm explainer, ASCII video, real-time
installation — showing how the same workflow yields very different teams and
task graphs. See **[references/examples.md](references/examples.md)**.
## Critical rules
1. **Discovery before action.** Never start generating a brief or team without
asking at least the three baseline questions. A bad brief cascades through
the entire pipeline.
2. **Match the team to the video.** Don't reuse the same 4-profile setup for
every job. A music video that doesn't have a beat-analysis profile will
misfire. A narrative film that doesn't have a writer profile will produce
incoherent scenes. See `references/role-archetypes.md`.
3. **One workspace per project.** All profiles for a given video share the same
`dir:` workspace. Tasks pass artifacts via shared filesystem and structured
handoffs. **Every** `kanban_create` call passes
`workspace_kind="dir"` + `workspace_path="<absolute project path>"`.
4. **Tenant every project.** Use a project-specific tenant
(`--tenant <project-slug>`). Keeps the dashboard scoped and prevents
cross-pollination with other ongoing kanbans.
5. **Respect existing skills.** When a scene fits an existing skill, the
relevant renderer should load that skill via `--skill <name>` on its task
or `always_load` in its profile. Do not re-derive what a skill already
provides.
6. **The director never executes.** Even with the full `kanban + terminal +
file` toolset, the director's `SOUL.md` rules forbid it from executing
work itself. It decomposes and routes only — every concrete task becomes
a `hermes kanban create` call to a specialist profile. The
`kanban-orchestrator` skill spells this out further.
7. **Don't over-decompose.** A 30-second product video does NOT need 20 tasks.
Aim for the smallest task graph that still parallelizes well and exposes the
right human-review gates.
8. **Verify API keys BEFORE firing.** External APIs (TTS, image-gen,
image-to-video) need keys in `~/.hermes/.env` or the user's secret store.
A worker that hits a missing-key error wastes a task slot. The setup
script's `check_key` helper aborts cleanly if a required key is missing.
## File map
```
SKILL.md ← this file (workflow + rules)
references/
intake.md ← discovery question banks per style
role-archetypes.md ← role library (writer, designer, animator, …)
tool-matrix.md ← skill + toolset mapping per role
kanban-setup.md ← setup script structure & profile config
monitoring.md ← watch + intervene patterns
examples.md ← six worked pipelines
assets/
brief.md.tmpl ← brief skeleton
setup.sh.tmpl ← setup script skeleton
soul.md.tmpl ← profile personality skeleton
scripts/
bootstrap_pipeline.py ← generate setup.sh from brief + team JSON
monitor.py ← polling + intervention helpers
```
@@ -0,0 +1,79 @@
# Video Brief — {{TITLE}}
> Slug: `{{SLUG}}` · Tenant: `{{TENANT}}` · Project workspace: `{{WORKSPACE}}`
## 1. Concept
**One-line pitch.** {{ONE_LINE_PITCH}}
**Emotional north star.** {{EMOTIONAL_NORTH_STAR}}
*(What should the viewer feel walking away?)*
## 2. Scope
| | |
|---|---|
| Duration | {{DURATION_S}} seconds |
| Aspect ratio | {{ASPECT}} |
| Resolution | {{RESOLUTION}} |
| Frame rate | {{FPS}} fps |
| Target platforms | {{PLATFORMS}} |
| Deadline | {{DEADLINE}} |
| Quality bar | {{QUALITY_BAR}} *(rough draft / polished / archival)* |
## 3. Style
**Visual references.** {{VISUAL_REFS}}
**Tone.** {{TONE}}
**Brand constraints.** {{BRAND_CONSTRAINTS}}
*(colors, typography, motion language; or "n/a")*
**Aesthetic rules.**
{{AESTHETIC_RULES}}
## 4. Scenes
Beat-by-beat breakdown. Each scene gets a row.
| # | Time | Content | Target tool / skill | Audio | Notes |
|---|------|---------|---------------------|-------|-------|
| 1 | 0:000:0X | {{SCENE_1_CONTENT}} | {{SCENE_1_TOOL}} | {{SCENE_1_AUDIO}} | {{SCENE_1_NOTES}} |
| 2 | 0:0X0:0Y | ... | ... | ... | ... |
## 5. Audio
**Approach.** {{AUDIO_APPROACH}}
*(narration / music-only / synced to track / silent / mixed)*
**Voiceover.** {{VO_DETAILS}}
*(provider, voice, language, script source — "n/a" if no VO)*
**Music.** {{MUSIC_DETAILS}}
*(provided track path / commission via Suno / commission via heartmula /
license-free / "n/a")*
**SFX.** {{SFX_DETAILS}}
*(generated, library, or "n/a")*
## 6. Deliverables
| Format | Resolution | Notes |
|--------|-----------|-------|
| {{PRIMARY_FORMAT}} | {{PRIMARY_RES}} | The main output |
| {{ALT_FORMAT_1}} | {{ALT_RES_1}} | {{ALT_NOTES_1}} |
**Final filename.** `output/final.mp4`
*(plus optional `output/final-9x16.mp4`, `output/captions.srt`, etc.)*
## 7. Constraints
- API keys required: {{API_KEYS_REQUIRED}}
- External dependencies: {{EXT_DEPS}}
- Source assets to incorporate: {{SOURCE_ASSETS}}
---
**This brief is the contract. The director and every downstream profile read
it. If the brief changes, the kanban must be re-fired — don't edit live.**
@@ -0,0 +1,185 @@
#!/usr/bin/env bash
# ═══════════════════════════════════════════════════════════════════════
# Video Pipeline Setup — {{TITLE}}
#
# Generated by kanban-video-orchestrator skill.
#
# Slug: {{SLUG}}
# Workspace: {{WORKSPACE}}
# Tenant: {{TENANT}}
# ═══════════════════════════════════════════════════════════════════════
set -euo pipefail
PROJECT_SLUG="{{SLUG}}"
WORKSPACE="$HOME/projects/video-pipeline/${PROJECT_SLUG}"
TENANT="{{TENANT}}"
# ─────────────────────────────────────────────────────────────────────
# 1. Verify required API keys
# ─────────────────────────────────────────────────────────────────────
echo "═══ Checking required API keys ═══"
check_key() {
local var="$1"
local kc_account="${2:-hermes}"
local kc_service="${3:-$1}"
if grep -q "^${var}=" "$HOME/.hermes/.env" 2>/dev/null && \
[ -n "$(grep "^${var}=" "$HOME/.hermes/.env" | cut -d= -f2-)" ]; then
echo " ✓ ${var} (env)"
return 0
fi
if command -v security >/dev/null 2>&1 && \
security find-generic-password -a "${kc_account}" -s "${kc_service}" -w >/dev/null 2>&1; then
echo " ✓ ${var} (Keychain ${kc_account}/${kc_service})"
return 0
fi
echo " ✗ ${var} not set in ~/.hermes/.env or Keychain (${kc_account}/${kc_service})"
return 1
}
# Customize this list per project — only check keys actually used:
{{KEY_CHECKS}}
# ─────────────────────────────────────────────────────────────────────
# 2. Create project workspace
# ─────────────────────────────────────────────────────────────────────
echo "═══ Creating project workspace ═══"
mkdir -p "$WORKSPACE"/{taste,audio/{voiceover,sfx},assets,scenes,checkpoints,tools,output}
{{SCENE_DIRS}}
echo " ✓ $WORKSPACE"
# ─────────────────────────────────────────────────────────────────────
# 3. Create Hermes profiles
# ─────────────────────────────────────────────────────────────────────
echo "═══ Creating Hermes profiles ═══"
{{PROFILE_CREATE_COMMANDS}}
# ─────────────────────────────────────────────────────────────────────
# 4. Configure profiles (toolsets, skills, cwd)
# ─────────────────────────────────────────────────────────────────────
echo "═══ Configuring profiles ═══"
configure_profile() {
local profile="$1"
local toolsets_json="$2" # JSON array string, e.g. '["kanban","terminal","file"]'
local skills_json="$3" # JSON array string, e.g. '["kanban-worker","ascii-video"]'
python3 - "$profile" "$toolsets_json" "$skills_json" "$WORKSPACE" <<'PY'
"""Patch a Hermes profile config.yaml using PyYAML so we don't depend on the
exact default-config string format. Validates the patch took effect and exits
non-zero if anything's off."""
import json
import os
import sys
try:
import yaml
except ImportError:
print("ERROR: PyYAML required. pip install pyyaml", file=sys.stderr)
sys.exit(1)
profile, toolsets_json, skills_json, workspace = sys.argv[1:5]
toolsets = json.loads(toolsets_json)
skills = json.loads(skills_json)
p = os.path.expanduser(f"~/.hermes/profiles/{profile}/config.yaml")
if not os.path.exists(p):
print(f" ✗ profile config not found: {p}", file=sys.stderr)
sys.exit(1)
with open(p) as f:
cfg = yaml.safe_load(f) or {}
# Apply our changes — only the keys we actually want to set.
cfg["toolsets"] = toolsets
cfg.setdefault("skills", {})
cfg["skills"]["always_load"] = skills
# Note: we do NOT touch cfg["approvals"] — that's a security-sensitive
# setting (manual confirmation of tool calls). Workspace cwd is overridden
# per-task by `--workspace dir:<path>` on `hermes kanban create`, so we
# don't need to mutate cfg["terminal"]["cwd"] either.
with open(p, "w") as f:
yaml.safe_dump(cfg, f, sort_keys=False)
# Validate
with open(p) as f:
after = yaml.safe_load(f)
errors = []
if after.get("toolsets") != toolsets:
errors.append(f"toolsets mismatch: {after.get('toolsets')!r}")
if after.get("skills", {}).get("always_load") != skills:
errors.append(f"skills.always_load mismatch: {after.get('skills', {}).get('always_load')!r}")
if errors:
print(f" ✗ {profile}: " + "; ".join(errors), file=sys.stderr)
sys.exit(1)
PY
if [ $? -ne 0 ]; then
echo " ✗ failed to configure ${profile}" >&2
exit 1
fi
echo " ✓ ${profile}"
}
{{PROFILE_CONFIG_COMMANDS}}
# ─────────────────────────────────────────────────────────────────────
# 5. Write SOUL.md per profile
# ─────────────────────────────────────────────────────────────────────
echo "═══ Writing profile personalities ═══"
{{SOUL_WRITES}}
# ─────────────────────────────────────────────────────────────────────
# 6. Copy brief, TEAM.md, and any provided assets
# ─────────────────────────────────────────────────────────────────────
echo "═══ Writing brief + taste ═══"
cat > "$WORKSPACE/brief.md" <<'BRIEF_EOF'
{{BRIEF_CONTENTS}}
BRIEF_EOF
cat > "$WORKSPACE/TEAM.md" <<'TEAM_EOF'
{{TEAM_CONTENTS}}
TEAM_EOF
{{TASTE_WRITES}}
{{ASSET_COPIES}}
# ─────────────────────────────────────────────────────────────────────
# 7. Fire the initial kanban task
# ─────────────────────────────────────────────────────────────────────
echo "═══ Firing initial kanban task ═══"
hermes kanban create "Direct production of {{TITLE}}" \
--assignee director \
--workspace dir:"$WORKSPACE" \
--tenant "$TENANT" \
--priority 2 \
--max-runtime 4h \
--body "$(cat <<EOF
Read brief.md, TEAM.md, and taste/.
Decompose into the team graph defined in TEAM.md.
All child tasks MUST use:
workspace_kind="dir"
workspace_path="$WORKSPACE"
tenant="$TENANT"
Do not execute the work yourself — route every concrete subtask to the
appropriate profile via kanban_create.
EOF
)"
echo ""
echo "═══ Setup complete ═══"
echo ""
echo "Monitor with:"
echo " hermes kanban watch --tenant $TENANT"
echo " hermes kanban list --tenant $TENANT"
echo " hermes dashboard"
echo ""
echo "Workspace: $WORKSPACE"
@@ -0,0 +1,38 @@
# {{ROLE_NAME}}
You are the **{{ROLE_NAME}}** for this video production.
## Project context
- **Brief:** read `brief.md` in your CWD
- **Team graph:** read `TEAM.md` in your CWD
- **Style spec:** read `taste/brand-guide.md` and `taste/emotional-dna.md` in
your CWD
## What you do
{{ROLE_RESPONSIBILITIES}}
## Inputs you read
{{INPUTS_READ}}
## Outputs you produce
{{OUTPUTS_PRODUCED}}
## Tools and skills available
- **Toolsets:** {{TOOLSETS}}
- **Skills loaded:** {{SKILLS}}
- **External APIs / CLIs:** {{EXTERNAL_TOOLS}}
## Rules
{{ROLE_RULES}}
{{COMMON_RULES}}
## Common reference commands
{{COMMON_COMMANDS}}
@@ -0,0 +1,227 @@
# Worked Examples
Six concrete pipelines covering different video styles. Each shows the team
composition, task graph, and skill/tool choices the orchestrator would make
for that brief. **These are illustrative, not templates** — adapt to the
actual brief.
## Example 1 — Narrative short film (text-to-image → image-to-video → cut)
**Brief:** A 90-second noir-style short. A detective walks through a rainy
city. Voiceover narration. AI-generated visuals.
**Team:**
- `director` — vision, decomposition, approval
- `writer` — script + voiceover copy (loads `humanizer` for natural voice)
- `storyboarder` — beat-by-beat shot list (loads `excalidraw`)
- `image-generator` — generates each shot's still via local ComfyUI workflows
(loads `comfyui`)
- `image-to-video-generator` — animates each still (Runway/Kling, OR
ComfyUI's AnimateDiff/WAN workflows via `comfyui`)
- `voice-talent` — narration via ElevenLabs
- `audio-mixer` — VO + ambient pad
- `editor` — assembly + transitions
- `reviewer` — final QA
**Task graph:**
```
T0 director decompose
T1 writer script + voiceover.md (parent: T0)
T2 storyboarder shot list with framing per beat (parent: T1)
T3 image-generator one still per shot (~12 shots) (parent: T2)
T4 image-to-video animate each still (parent: T3)
T5 voice-talent generate narration audio (parent: T1)
T6 audio-mixer mix VO + ambient (parent: T5)
T7 editor cut + transitions + audio mux (parents: T4, T6)
T8 reviewer final QA (parent: T7)
```
**Key choices:**
- Local ComfyUI via `comfyui` skill is preferred over external API for
cost/control — but external APIs are fine if ComfyUI isn't installed
- `editor` profile is ffmpeg-only, no Hermes skill required beyond
`kanban-worker`
- Storyboarder produces `storyboard.excalidraw` alongside the markdown
## Example 2 — Product / marketing teaser
**Brief:** A 30-second product teaser for a developer tool. Shows code +
terminal + UI screen recordings, voiceover, CTA at end. Square 1:1.
**Team:**
- `director`
- `copywriter` — taglines, voiceover script, CTA (loads `humanizer`)
- `concept-artist` — style frames (loads `claude-design` for UI mockups)
- `renderer-motion-graphics` — animated UI sequences (Remotion CLI)
- `renderer-ascii` — terminal-style demo scenes (loads `ascii-video`)
- `voice-talent` — VO via ElevenLabs
- `editor` — assembly + brand-color treatment
- `audio-mixer` — VO + light music bed
- `captioner` — burned subtitles for muted-autoplay platforms
- `masterer` — produces 1:1 + 9:16 + 16:9 variants
**Task graph:**
```
T0 director decompose
T1 copywriter copy.md + cta + vo script (parent: T0)
T2 concept-artist visual-spec.md + style frames (parent: T1)
T3a renderer-motion-graphics scene 1: UI sequence (parent: T2)
T3b renderer-ascii scene 2: terminal demo (parent: T2)
T3c renderer-motion-graphics scene 3: feature highlight (parent: T2)
T3d renderer-motion-graphics scene 4: CTA card (parent: T2)
T4 voice-talent narration (parent: T1)
T5 audio-mixer VO + music bed (parent: T4)
T6 editor cut + transitions (parents: T3*, T5)
T7 captioner SRT + burned subtitles (parent: T6)
T8 masterer 1:1, 9:16, 16:9 variants (parent: T7)
```
**Key choices:**
- Multiple specialized renderers (motion-graphics + ASCII) coexist
- Captioner is included because muted autoplay is the norm on social
- `claude-design` skill for UI mockups maps directly to the product video idiom
## Example 3 — Music video (synced to provided track)
**Brief:** A 3-minute music video for a provided lo-fi hip-hop track. Visuals
should pulse with the beat. Generative + ASCII hybrid. Vertical 9:16.
**Team:**
- `director`
- `music-supervisor` — analyze track, emit `audio/beats.json` (loads `songsee`)
- `storyboarder` — beat-aligned shot list (loads `excalidraw`)
- `renderer-ascii` — ASCII scenes synced to bass kicks (loads `ascii-video`)
- `renderer-p5js` — generative particle scenes synced to highs (loads `p5js`)
- `editor` — beat-cut assembly using `beats.json`
- `reviewer` — sync QA
**Task graph:**
```
T0 director decompose
T1 music-supervisor analyze track → beats.json + spectrogram (parent: T0)
T2 storyboarder shot list aligned to beats (parents: T1, T0)
T3a renderer-ascii scene 1: bass-driven ASCII (parent: T2)
T3b renderer-p5js scene 2: high-end particle field (parent: T2)
... (more scenes)
T4 editor cut to beats + mux track (parents: T3*, T1)
T5 reviewer sync QA + final approval (parent: T4)
```
**Key choices:**
- `music-supervisor` runs FIRST — `beats.json` gates the renderers
- `editor` uses `beats.json` directly to align cuts to bass kicks
- No voice-talent — music is the audio
- Two specialized renderers (`ascii-video` + `p5js`) for visual variety
## Example 4 — Math/algorithm explainer
**Brief:** A 2-minute explainer of an algorithm. 3Blue1Brown-style. Animated
diagrams, equations, narration. Square 1:1.
**Team:**
- `director`
- `writer` — narration script (loads `humanizer`)
- `cinematographer` — visual spec (loads `manim-video`)
- `renderer-manim` — all animated scenes (loads `manim-video`)
- `voice-talent` — narration via ElevenLabs
- `editor` — assembly + audio mux
- `captioner` — burned subtitles
**Task graph:**
```
T0 director decompose
T1 writer script + narration (parent: T0)
T2 cinematographer visual spec for all scenes (parent: T1)
T3a-Tn renderer-manim scenes 1..N (parents: T2)
T4 voice-talent narration audio (parent: T1)
T5 editor cut + mux (parents: T3*, T4)
T6 captioner SRT + burn (parent: T5)
```
**Key choices:**
- `manim-video` skill drives both the cinematographer (visual language) and
the renderer (actual scene production)
- The `manim-video` skill's reference docs (animation-design-thinking,
scene-planning, equations) auto-load when needed via the renderer's pinned skill
## Example 5 — ASCII video, music-track-only
**Brief:** A 60-second pure-ASCII video reactive to an existing track. No
voiceover, no other tools. Square 1:1.
**Team:**
- `director`
- `music-supervisor` — track analysis (loads `songsee`)
- `renderer-ascii` — all visuals (loads `ascii-video`)
- `editor` — assembly + audio mux
**Task graph:**
```
T0 director decompose
T1 music-supervisor analyze track (parent: T0)
T2a renderer-ascii scene 1 (parents: T1, T0)
T2b renderer-ascii scene 2 (parents: T1, T0)
T2c renderer-ascii scene 3 (parents: T1, T0)
T3 editor stitch + mux audio (parents: T2*)
```
**Key choices:**
- Minimal team (4 profiles) for a focused single-tool project
- No reviewer — short experimental piece, director approves directly
- All scenes run through one `renderer-ascii` profile because the `ascii-video`
skill covers everything
This example illustrates the rule: **don't over-decompose**. Three scenes
through one renderer is fine. Don't spawn three renderer profiles.
## Example 6 — Real-time / installation art
**Brief:** A 2-minute audio-reactive visual for a gallery installation. Driven
by an audio input feed. TouchDesigner-based. 16:9 4K.
**Team:**
- `director`
- `cinematographer` — visual language spec (loads `touchdesigner-mcp`)
- `renderer-touchdesigner` — all visuals + record-to-disk
(loads `touchdesigner-mcp`)
- `audio-mixer` — final loudness pass on the captured audio (optional if
pre-mixed source)
- `editor` — assemble final clip from TouchDesigner recording
- `reviewer` — visual QA
**Task graph:**
```
T0 director decompose
T1 cinematographer TD operator graph spec (parent: T0)
T2 renderer-touchdesigner build TD network + record output (parent: T1)
T3 editor trim + audio mux (parent: T2)
T4 reviewer final QA (parent: T3)
```
**Key choices:**
- `touchdesigner-mcp` controls a running TouchDesigner instance — the
cinematographer designs the operator graph, renderer builds it
- Output is a recording from the running TD network, not a render-to-frames
process; editor mostly just trims
## Pattern recognition
When the user describes a video, look for these signals to map to an example:
- **Plot, characters, scripted dialogue** → Example 1 (narrative)
- **Specific product, CTA, brand colors, voiceover** → Example 2 (marketing)
- **Track file provided, "synced to music"** → Example 3 (music video)
- **"Explain how X works", math/algorithm/concept walkthrough** → Example 4 (manim explainer)
- **Terminal aesthetic, ASCII, retro pixel** → Example 5 (ASCII)
- **"Audio-reactive", "real-time", "installation"** → Example 6 (TouchDesigner)
- **Comic-style narrative** → use `renderer-comic` (`baoyu-comic` skill)
- **Retro game / pixel-art aesthetic** → use `renderer-pixel` (`pixel-art` skill)
- **3D scene, photoreal environment** → use `renderer-3d` (`blender-mcp`)
- **Generative art, particle system, shader** → use `renderer-p5js` (`p5js`)
- **AI-generated photoreal stills + animation** → use `renderer-comfyui`
(`comfyui`) for both stills and image-to-video
- **"video about how the system works", recursive demo** → composable from
any of the above; the recursion is a rendering technique, not a style
The actual team should be derived from the specific brief — these examples are
starting points, not endpoints.
@@ -0,0 +1,166 @@
# Intake — Discovery Question Banks
The discovery process is **adaptive**. Always start with three baseline
questions to identify the broad style category, then drill into a per-style
question bank. Ask 2-4 questions at a time, listen, then proceed. Make
reasonable assumptions whenever the user implies an answer.
## Tier 0 — Baseline (always ask)
1. **What is the video?** — One-sentence pitch
2. **How long?** — Approximate duration
3. **Aspect ratio + target platform?** — 16:9 / 9:16 / 1:1 / 4:5; X, IG, YouTube, internal, etc.
From these answers, classify the style category and pick the relevant Tier 1
follow-ups. **Do not** continue asking until you have at least these three.
## Style classification
Map the brief to one of these archetypes (or a hybrid):
| Archetype | Tells |
|-----------|-------|
| **Narrative film** | Plot, characters, scenes-with-events, dialogue, location |
| **Product / marketing** | A specific product or feature being shown / sold; CTA at end |
| **Music video** | A specific track exists; visuals sync to music |
| **Explainer / educational** | A concept being taught; voiceover-driven |
| **Tutorial / changelog** | Software demo, terminal-heavy, technical |
| **ASCII / terminal art** | Retro terminal aesthetic explicit, character-grid |
| **Abstract / loop** | Generative, no plot, often perfect-loop |
| **Documentary / interview cut** | Real footage, transcription-driven |
| **Real-time / installation** | Audio-reactive, gallery installation, VJ output |
If ambiguous, **ask** which category fits — don't guess. Hybrids are common
(e.g., a product video with a narrative arc); decompose into the dominant
mode + secondary modifiers.
**Recursive / meta** ("a video that shows its own production") is a
*rendering technique*, not a separate style — compose it from any of the
above by adding a two-pass render step where pass 2 uses pass 1's output as
texture inside the final scene.
## Tier 1 — Per-style follow-ups
### Narrative film
- **Setting / world?** — When and where the story takes place
- **Characters?** — How many, archetypes, who carries dialogue
- **Beat list or full script?** — Has the user written the story or do we draft it
- **Dialogue language?** — Spoken lines, on-screen subs only, silent
- **Visual generation approach?** — Text-to-image (FAL/Midjourney/Imagen) →
image-to-video (Runway/Kling), 3D animation (Blender), 2D animation,
procedural, or hybrid
- **Voice approach?** — TTS (which voice), recorded VO, no dialogue
- **Music / score?** — Commissioned (via `songwriting-and-ai-music` Suno
prompts, or local `heartmula`), licensed track provided, silent
### Product / marketing
- **Product?** — Name, what it does, key feature being shown
- **Target audience?** — Who's watching, what they care about
- **CTA?** — Visit URL, install, sign up, etc.
- **Tone?** — Serious, playful, technical, premium, edgy
- **Brand assets available?** — Logo files, color palette, fonts, existing footage
- **Animation style?** — Motion graphics (Remotion / AE-style), screen recording,
generative, illustrated
- **Voiceover?** — Yes (which voice / language) or text-only
- **Music?** — Track provided, license-free needed, custom-composed
### Music video
- **Track file?** — Path to the audio (essential — we'll analyze BPM + beats)
- **Track length to use?** — Full song or a section
- **Genre / energy?** — Tells what visual rhythm and density to use
- **Lyric / narrative content?** — Are there lyrics to render on screen,
or is it purely visual?
- **Visual reference style?** — Existing music videos / artists for reference
- **Performer footage?** — None, has clips, will provide
- **Visual generation approach?** — Per-beat generative, edit-driven cuts of stock
footage, illustrated, hybrid
### Explainer / educational
- **What concept is being taught?** — One-sentence concept, key takeaway
- **Audience expertise?** — Beginner / intermediate / expert
- **Diagram density?** — Heavy math / formulas / code / abstract concepts
- **Voiceover?** — TTS / recorded / on-screen text only
- **Tool preference?**`manim-video` (math), `p5js` (generative),
Remotion (UI motion graphics), `comfyui` (AI-generated visuals),
`ascii-video` (technical/retro), hybrid
- **Pacing?** — Fast and dense (3Blue1Brown) or slow and contemplative
### Tutorial / changelog / software demo
- **Software being demonstrated?** — Name, what it does
- **Demo script?** — Sequence of commands / screens to show
- **Terminal-only or with GUI?**
- **Voiceover for narration?**
- **Diagram support needed?** — Often these benefit from a diagram skill
alongside the screen-capture/render step (`excalidraw`,
`architecture-diagram`, `concept-diagrams`)
### ASCII / terminal art
- **Source material?** — Generative / driven by audio / converting existing
video / static image starting point
- **Color palette?** — Brand-driven (gold/black/blue), Matrix green, full
rainbow, monochrome
- **Audio reactivity?** — None / loose mood / tight beat sync / FFT-driven
- **Character set?** — ASCII only / Unicode block-drawing / mystic glyphs
- **Loop or narrative?** — Perfect loop or one-shot
### Abstract / loop
- **Mood / emotion?** — One word that captures the feel
- **Motion type?** — Zoom-into-itself, particle drift, wave, geometric, organic
- **Loop required?** — Perfect loop (Droste-style) or just satisfying ending
- **Audio?** — Silent, ambient pad, beat-synced
### Documentary / interview cut
- **Source footage?** — Provided clips, length per clip
- **Transcript / subtitles?** — Provided or to be generated
- **Story structure?** — Chronological / thematic / arc
- **B-roll approach?** — Generated, stock library, none
### Real-time / installation
- **Output environment?** — Gallery wall, projector, screen, web embed
- **Audio source?** — Live audio input, pre-recorded track, both
- **Reactivity tightness?** — Mood-level (loose) vs. tight beat-sync vs. live
parameter control
- **Tool preference?**`touchdesigner-mcp` for full TD operator graphs;
`p5js` for web-canvas; `comfyui` for generative-AI fed by audio features
## Tier 2 — Always ask near the end
- **Brand assets path?** — Where logo / color palette / fonts / music library lives
- **Output format requirements?** — Codec preference, target file size, accepted
alternates (vertical cut, GIF, audio-only)
- **Deadline?** — Affects task `max_runtime_seconds` and acceptable scope
- **Quality bar?** — Rough draft for review / polished final / archival
- **Existing footage / assets to reuse?** — Anything that should appear, not just inform
## Reasonable assumption defaults
When the user under-specifies, fill in these defaults rather than asking:
| Question | Default |
|----------|---------|
| Frame rate | 30 fps for X / IG; 60 fps for tutorials/explainers; 24 fps for narrative film |
| Resolution | 1080×1080 for square, 1920×1080 for 16:9, 1080×1920 for 9:16 |
| Codec | H.264 / yuv420p, CRF 18 |
| Audio codec | AAC 192 kbps |
| Voice | Provider's mid-range neutral voice unless brand calls for distinctive timbre |
| Music | Silent (require user to specify if music is wanted) |
| Captions | On for explainer/tutorial; off for narrative/abstract unless requested |
| Quality bar | Polished final unless user says draft |
State the assumption explicitly: *"Assuming 30fps and AAC audio unless you say otherwise — proceed?"*
## Anti-patterns
- **Asking 10 questions at once.** Maximum 4 per turn.
- **Asking for things the brief already implies.** If the user said "music video for my track," do not ask "is there a track?"
- **Failing to classify before drilling in.** Tier-1 questions depend on classification; mixing them up wastes turns.
- **Treating "make a video" as enough to proceed.** Always confirm the three baseline questions.
@@ -0,0 +1,276 @@
# Kanban Setup — Project Bootstrap & Profile Configuration
Once the brief is locked and the team is designed, the next step is producing
the actual `setup.sh` that creates the project workspace, configures Hermes
profiles, and fires the initial kanban task.
This file documents the patterns. The companion script
`scripts/bootstrap_pipeline.py` automates most of it from a structured input
JSON.
> **Credit:** the single-project-workspace layout, profile-config patching
> approach, SOUL.md-per-profile convention, and `--workspace dir:<path>` rule
> are adapted from alt-glitch's original multi-agent video pipeline:
> [NousResearch/kanban-video-pipeline](https://github.com/NousResearch/kanban-video-pipeline).
> This skill generalizes those patterns across video styles and replaces the
> string-replacement config patcher with a PyYAML-based one.
## Project workspace structure
Every video project gets one workspace under `~/projects/video-pipeline/<slug>/`:
```
~/projects/video-pipeline/<slug>/
├── brief.md ← the contract; all tasks reference
├── TEAM.md ← team composition + task graph (director reads this)
├── taste/
│ ├── brand-guide.md ← color, typography, motion rules
│ ├── emotional-dna.md ← what the piece should FEEL like
│ └── style-frames/ ← optional: visual references
├── audio/
│ ├── track.mp3 ← provided music (if any)
│ ├── voiceover/ ← per-line TTS clips
│ └── sfx/ ← sound effects
├── assets/
│ ├── logos/
│ ├── fonts/
│ └── existing-footage/ ← reusable provided clips
├── scenes/
│ ├── scene-01/
│ │ ├── VISUAL_SPEC.md ← cinematographer's per-scene spec
│ │ ├── render.py ← renderer's code (or sketch.html, etc.)
│ │ ├── checkpoints/ ← preview frames for QA
│ │ └── clip.mp4 ← the deliverable for this scene
│ ├── scene-02/...
│ └── ...
├── checkpoints/ ← global review frames
├── tools/ ← optional project-local helpers
└── output/
├── final.mp4 ← stitched + audio
├── final-noaudio.mp4
├── final-9x16.mp4 ← optional: vertical alternate
└── captions.srt ← optional: subtitle file
```
**The slug** is derived from the brief title: lowercase, hyphen-separated.
Example: `q3-product-teaser`, `ascii-mood-loop`, `interview-cut-2026-q1`.
## The setup.sh script
The setup script does six things in order:
1. **Create workspace tree** — all directories above
2. **Create profiles**`hermes profile create <name> --clone`
3. **Configure profiles** — patch each profile's
`~/.hermes/profiles/<name>/config.yaml` to set toolsets, always_load skills,
and `cwd`
4. **Write SOUL.md per profile** — the personality + role definition
5. **Copy any provided assets + write `brief.md`, `TEAM.md`, and `taste/`**
6. **Fire the initial kanban task**`hermes kanban create` assigned to the director
See `assets/setup.sh.tmpl` for the skeleton.
### Profile creation pattern
```bash
hermes profile create director --clone 2>/dev/null || true
```
The `--clone` flag clones from the active profile (preserving model, base
config). The `|| true` makes the script idempotent — re-running won't error if
the profile already exists.
### Profile config patching
Each profile has a YAML config at `~/.hermes/profiles/<name>/config.yaml`. The
setup script edits exactly two keys:
1. `toolsets:` — replace the default with the role's required toolsets
2. `skills.always_load:` — list the role's must-load skills (may be empty)
**Do NOT** modify `approvals.mode` (controls user-confirmation of tool calls
— a security setting that must stay as the user configured it). **Do NOT**
modify `terminal.cwd` — the kanban dispatcher overrides cwd per-task via
`--workspace dir:<path>`, so the profile's cwd is irrelevant to the kanban
work and changing it could break the user's interactive use of the profile.
Use **PyYAML**, not string replacement, so the patch is robust against
default-config schema drift:
```bash
configure_profile() {
local profile="$1"
local toolsets_json="$2" # JSON array, e.g. '["kanban","terminal","file"]'
local skills_json="$3" # JSON array, e.g. '["kanban-worker","ascii-video"]'
python3 - "$profile" "$toolsets_json" "$skills_json" <<'PY'
import json, os, sys, yaml
profile, ts_json, sk_json = sys.argv[1:4]
p = os.path.expanduser(f"~/.hermes/profiles/{profile}/config.yaml")
with open(p) as f:
cfg = yaml.safe_load(f) or {}
cfg["toolsets"] = json.loads(ts_json)
cfg.setdefault("skills", {})["always_load"] = json.loads(sk_json)
with open(p, "w") as f:
yaml.safe_dump(cfg, f, sort_keys=False)
PY
}
```
PyYAML must be installed in the user's Python (it ships with most Hermes
installs). If absent: `pip install pyyaml`.
The setup script should also **validate** the patch by re-reading the file
and comparing — see `assets/setup.sh.tmpl` for the validation pattern.
### SOUL.md per profile
Each profile gets a `SOUL.md` at `~/.hermes/profiles/<name>/SOUL.md` that
defines its role, voice, and rules. See `assets/soul.md.tmpl` for the
template. Customize per role and per project.
The director's SOUL.md should be the most opinionated — its voice flavors
the entire production. **Critical content for the director's SOUL.md:**
- **Anti-temptation rules:** "Do not execute the work yourself. For every
concrete task, create a kanban task and assign it. Decompose, route, comment,
approve — that's the whole job." (The `kanban-orchestrator` skill provides
the deeper playbook; load it.)
- **Decomposition steps:** Read `brief.md`, `TEAM.md`, `taste/`. Use the team
graph in `TEAM.md` to fan out tasks.
- **The workspace_path rule** (see below).
Other profiles' SOUL.md is briefer; mostly mechanical: who you are, what you
read, what you produce, what skills/tools to use, where to write outputs.
Most non-director profiles should `always_load: kanban-worker` for the
deeper-than-baseline kanban guidance.
### Initial kanban task
The final action of setup.sh is firing the kanban:
```bash
hermes kanban create "Direct production of <video title>" \
--assignee director \
--workspace dir:"$HOME/projects/video-pipeline/${PROJECT_SLUG}" \
--tenant ${PROJECT_SLUG} \
--priority 2 \
--max-runtime 4h \
--body "$(cat <<EOF
Read brief.md, TEAM.md, and taste/.
Decompose into the team graph defined in TEAM.md.
All child tasks MUST use:
workspace_kind="dir"
workspace_path="$HOME/projects/video-pipeline/${PROJECT_SLUG}"
tenant="${PROJECT_SLUG}"
EOF
)"
```
The `--workspace dir:<path>` flag is **critical** — it tells the kanban that
all child tasks share this workspace. Skipping or using `worktree` will
isolate profiles and break artifact sharing.
## The TEAM.md file
Alongside `brief.md`, write a `TEAM.md` that the director reads. It documents
the team composition + task graph the orchestrator should follow. This
removes ambiguity and prevents the director from inventing extra steps.
Example structure (for an ASCII video with a music supervisor and editor):
```markdown
# Team & Task Graph — <video title>
## Team
- `director` (this profile) — vision, decomposition, approval
- `cinematographer` — visual spec, quality review (loads `ascii-video`)
- `renderer-ascii` — ASCII scenes (loads `ascii-video`)
- `music-supervisor` — track analysis (loads `songsee`)
- `voice-talent` — narration (uses ElevenLabs API)
- `audio-mixer` — final mix (ffmpeg)
- `editor` — assembly (ffmpeg)
- `reviewer` — final QA gate
## Task Graph
T0: this task — decompose
├── T1: cinematographer "Design visual language" (parent: T0)
│ │
│ ├── T2a: renderer-ascii "Scene 1 — title card" (parent: T1)
│ ├── T2b: renderer-ascii "Scene 2 — main beat" (parent: T1)
│ ├── T2c: renderer-ascii "Scene 3 — outro" (parent: T1)
├── T3: music-supervisor "Analyze track + emit beats.json" (parent: T0)
├── T4: voice-talent "Generate narration" (parent: T0)
├── T5: audio-mixer "Mix VO + bg music" (parents: T3, T4)
├── T6: editor "Assemble cut + mux audio" (parents: T2*, T5)
└── T7: reviewer "Final QA" (parent: T6)
```
The director turns this into actual `kanban_create` calls.
## API-key prerequisites check
Before firing the kanban, verify required keys are available. Check both
`~/.hermes/.env` and macOS Keychain (if on macOS):
```bash
check_key() {
local var="$1"
local kc_account="$2"
local kc_service="$3"
if grep -q "^${var}=" ~/.hermes/.env 2>/dev/null && \
[ -n "$(grep "^${var}=" ~/.hermes/.env | cut -d= -f2-)" ]; then
return 0
fi
if command -v security >/dev/null 2>&1 && \
security find-generic-password -a "${kc_account}" -s "${kc_service}" -w >/dev/null 2>&1; then
return 0
fi
echo "ERROR: ${var} not set in ~/.hermes/.env or Keychain (${kc_account}/${kc_service})"
return 1
}
check_key ELEVENLABS_API_KEY hermes ELEVENLABS_API_KEY || exit 1
check_key OPENROUTER_API_KEY hermes OPENROUTER_API_KEY || exit 1
# ...
```
If a key is missing, the script aborts with a clear message rather than
firing a kanban that will hit credential errors mid-execution.
## Critical rules
1. **`workspace_kind="dir"` + `workspace_path="<absolute>"` on every kanban_create.** Otherwise profiles can't share artifacts.
2. **Tenant every task.** `--tenant <project-slug>` keeps the dashboard scoped
and prevents cross-pollination with other ongoing kanbans.
3. **Idempotency keys.** For tasks that should not duplicate on re-run (e.g.,
setup creating profiles), use the `idempotency_key` argument or check
existence first.
4. **`max_runtime_seconds` per task.** Renderers that get stuck eat compute.
Standard defaults:
- Renderer task: 1800s (30min)
- Editor task: 600s (10min)
- Voice-talent task: 300s (5min)
- Image-generator task: 600s (10min)
- Image-to-video-generator task: 900s (15min)
5. **Heartbeats for long renders.** Tasks expected to run >5min should emit
`kanban_heartbeat` periodically with progress. Renderers should report
frame counts; the editor should report assembly progress.
6. **The `audio/` and `taste/` dirs are populated BEFORE firing the kanban.**
Don't ask the director's pipeline to source these — copy at setup time.
7. **`brief.md` is read-only after setup.** If the brief changes during
execution, that's a significant pivot — re-fire the kanban rather than edit
live.
@@ -0,0 +1,180 @@
# Monitoring — Watch the Pipeline + Intervene
After `setup.sh` fires the kanban, the work runs autonomously. The role of
this skill in the execution phase is to help the user (and the AI overseeing
the session) detect problems early and intervene effectively.
## Live monitoring commands
```bash
# Live event stream — task spawns, status changes, heartbeats, completions
hermes kanban watch --tenant <project-slug>
# Snapshot of the board
hermes kanban list --tenant <project-slug>
hermes kanban list --tenant <project-slug> --json # machine-readable
# Per-status counts + oldest-ready age
hermes kanban stats --tenant <project-slug>
# Visual dashboard (browser)
hermes dashboard
# Inspect a specific task (includes comments + events)
hermes kanban show <task-id>
# Follow a single task's event stream
hermes kanban tail <task-id>
```
Verify available subcommands with `hermes kanban --help` — the kanban CLI
ships with `init / create / list / show / assign / link / unlink / claim /
comment / complete / block / unblock / archive / tail / dispatch / watch /
stats / heartbeat / log / runs / context / gc`.
The companion `scripts/monitor.py` polls the kanban via the CLI and surfaces
common issues (stuck tasks, missing heartbeats, repeated retries, dependency
deadlocks).
## What to watch for
### Healthy pipeline indicators
- Tasks transition `READY → RUNNING → DONE` in roughly the expected order
- Renderers emit periodic `kanban_heartbeat` events with progress (e.g. "frame
240/720")
- Each task's runtime is well under its `max_runtime_seconds` cap
- No task accumulates more than 1 retry
- Dependency arrows resolve (children unblock as parents complete)
### Warning signs
| Symptom | Likely cause | Action |
|---------|--------------|--------|
| Task RUNNING but no heartbeat in 2+ min | Worker stuck, infinite loop, blocked on input | `hermes kanban show <id>` — read the worker's last events. The dispatcher SIGTERMs tasks that exceed their `max-runtime`; if you need to stop one earlier, `hermes kanban block <id>` then `hermes kanban archive <id>`, and create a re-run task. |
| Same task retried 2+ times | Reproducible failure (missing key, bad spec, broken tool) | `hermes kanban show <id>` to read failure events. Fix root cause before re-running. |
| RUNNING longer than max_runtime | Task is slow but progressing OR genuinely stuck | Check heartbeats with `hermes kanban tail <id>`. If progressing, the dispatcher will SIGTERM eventually anyway — raise `max-runtime` on a re-created task. |
| Child task READY but parents still RUNNING for >2× expected | Cascade slow, dependency miswired | Check the dependency graph. Inspect the parent: sometimes it completed but its handoff fields (summary, metadata) were empty so the child has nothing to consume. |
| New tasks not appearing | Director is hung in decomposition | Inspect director task with `kanban show`. Often a malformed `kanban_create` call. |
| Specialist tasks completing instantly | Decomposition created tasks without bodies | Director didn't pass enough context. Re-create with explicit body content. |
| Tasks created but never picked up | Profile not running, or tenant mismatch, or dispatcher not running | Check `hermes profile list` (profile exists?), `hermes status` (gateway/dispatcher up?), and verify tenant. |
| Specific renderer task fails → review note → renderer redoes → fails again | Brief is asking for the impossible | Pivot the brief, not the renderer. |
## Intervention recipes
### Rejecting bad output
When a renderer ships a clip that doesn't pass review:
```bash
# 1. Comment on the renderer's task with specific feedback
hermes kanban comment <renderer-task-id> "Scene 3 looks too sparse \
— increase visual density. Tighten color palette to brand spec."
# 2. Create a re-render task with the original as parent
hermes kanban create "Scene 3 — re-render with feedback" \
--assignee renderer-ascii \
--parent <renderer-task-id> \
--workspace dir:"$HOME/projects/video-pipeline/<slug>" \
--tenant <slug> \
--skill ascii-video \
--max-runtime 30m
```
### Adding a new dependency mid-flight
When the editor needs an asset that wasn't originally planned (e.g., a captions
file):
```bash
# 1. Create the new task and capture its id
NEW_TASK_ID=$(hermes kanban create "Generate SRT captions from voiceover" \
--assignee captioner \
--workspace dir:"$HOME/projects/video-pipeline/<slug>" \
--tenant <slug> \
--json | python3 -c "import json,sys;print(json.load(sys.stdin)['id'])")
# 2. Wire it as a parent of the editor's task with `kanban link`
hermes kanban link "$NEW_TASK_ID" <editor-task-id>
```
`kanban link` takes `parent_id child_id` (parent first). Use `kanban unlink`
to remove a dependency.
### Stopping a worker that's stuck
The kanban dispatcher will SIGTERM (then SIGKILL) any task that exceeds its
`--max-runtime` automatically. To stop one sooner:
```bash
# Mark blocked so the dispatcher leaves it alone, then archive
hermes kanban block <task-id>
hermes kanban archive <task-id>
# Diagnose what happened
hermes kanban show <task-id> # task body, comments, recent events
hermes kanban tail <task-id> # follow the live event stream
hermes kanban log <task-id> # worker process log
```
After stopping, decide: fix root cause + re-create the task, or skip and
adjust dependent tasks.
### Pivoting the brief
If during execution the user wants something fundamentally different:
1. Cancel the active director task and all RUNNING children
2. Edit `brief.md` and `TEAM.md`
3. Re-fire the initial `hermes kanban create` for the director
Don't try to "edit while running" — the kanban's audit trail makes a clean
pivot more legible than mid-stream changes.
## Periodic check-in script
A simple polling pattern for hands-off monitoring:
```bash
while true; do
clear
hermes kanban list --tenant <slug>
echo "---"
hermes kanban stats --tenant <slug>
sleep 30
done
```
For a live event feed, run `hermes kanban watch --tenant <slug>` in a
separate terminal — it streams task lifecycle events as they happen.
For automated intervention (auto-restart stuck tasks, auto-create re-render on
review failure), see the `scripts/monitor.py` patterns.
## When to call it done
The pipeline is finished when:
1. All RENDER tasks complete and pass review
2. The editor's `output/final.mp4` exists and `ffprobe` confirms expected
duration + streams
3. The reviewer (if present) has approved
4. Optional masterer variants exist
At this point, present the final.mp4 path to the user along with any review
notes. Do NOT delete the workspace — the user may want to iterate on a single
scene without re-running the whole pipeline.
## Common gotchas
- **Tenant mismatches.** A task created with the wrong tenant won't appear in
monitoring. Always pass `--tenant <slug>` consistently.
- **Profile process not running.** Tasks queue indefinitely in READY if no
worker for that profile is online. Check `hermes profile list` and start
any missing profiles.
- **Workspace permissions.** All profiles need read+write to the workspace
directory. `chmod -R u+rw <workspace>` if any worker reports permission
errors.
- **Audio/visual sync.** The editor's clip stitching must match the
renderer's actual output durations. Don't hardcode scene durations in
the editor — read from the renderer's handoff metadata.
@@ -0,0 +1,298 @@
# Role Archetypes
The library of role archetypes for video production. **Compose a team from this
list, don't clone a fixed roster.** Most videos need 4-7 profiles. The director
is always present; everything else is conditional on the brief.
Each role's profile name is by convention `kebab-case` (e.g. `creative-director`,
`image-generator`). Multiple instances of the same role get descriptive suffixes
when they need different focus (e.g., `renderer-ascii`, `renderer-3d`).
For toolset + skill mapping per role, see [tool-matrix.md](tool-matrix.md).
## Always present
### director
The vision-holder. Reads the brief and brand guide, decomposes into a task
graph, comments to steer creative direction, approves the final cut.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-orchestrator`. The kanban plugin auto-injects baseline
orchestration guidance for free; `kanban-orchestrator` is the deeper
decomposition playbook. Add `creative-ideation` if the brief is wide-open
and needs framing help.
- **Personality:** Tied to the brand voice — see `assets/soul.md.tmpl`
The director has the same toolset as everyone else, but its `SOUL.md` rules
**forbid** execution. The "decompose, don't execute" discipline is enforced
by personality + the kanban-orchestrator skill, not by missing tools.
## Pre-production roles
Pick based on what the brief needs.
### writer / screenwriter
Writes scripts, dialogue, voiceover copy, narration. Use for any video with
spoken or written words beyond a tagline.
- **Toolsets:** kanban, file
- **Skills:** `kanban-worker`, `humanizer` (post-process to strip AI-tells)
- **Outputs:** `script.md`, `narration.md`, `dialogue/scene-NN.md`
### copywriter
Like `writer` but specifically for marketing copy: taglines, CTAs, voiceover
scripts for product videos.
- **Toolsets:** kanban, file
- **Skills:** `kanban-worker`, `humanizer`
- **Outputs:** `copy.md`
### concept-artist / visual-designer
Develops the visual identity: mood board, style frames, color palette
rationale, typography choices. Produces a `visual-spec.md` that all generators
follow. Often produces still reference frames using image-generation APIs or
local skills.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker` plus any project-specific design skill —
`claude-design` (UI/web), `sketch` (quick mockup variants),
`popular-web-designs` (matching known web aesthetic), `pixel-art` (retro),
`ascii-art` (terminal/retro), `excalidraw` (hand-drawn frames),
`design-md` (text-based design docs)
- **Outputs:** `visual-spec.md`, `taste/style-frames/*.png`
### storyboarder
Maps the brief to a beat-by-beat shot list with timing. Critical for narrative
film and music video. Often pairs with a diagramming tool.
- **Toolsets:** kanban, file
- **Skills:** `kanban-worker` plus a diagram skill — `excalidraw` (sketch),
`architecture-diagram` (technical/system), `concept-diagrams` (educational/
scientific)
- **Outputs:** `storyboard.md` with one row per scene/shot, optional
storyboard sketches
### cinematographer / dp
Designs the visual language: framing, color, motion, transitions. Reviews
generator output for visual consistency. Hands off per-scene `VISUAL_SPEC.md`.
- **Toolsets:** kanban, terminal, file, video, vision
- **Skills:** `kanban-worker` plus the visual skill that matches the project
(e.g., `ascii-video` for ASCII work, `manim-video` for explainers,
`touchdesigner-mcp` for real-time visuals, etc.)
- **Outputs:** `scenes/scene-NN/VISUAL_SPEC.md`, review comments on renderer
tasks
- **Reviews via:** `video_analyze` (sends full clip to multimodal LLM for
native review), `vision_analyze` for spot-checking frames, ffprobe summaries
## Production roles
### renderer (generic)
A worker that produces visual content for one or more scenes. Loaded with
whichever creative skill fits the scene's style. Multiple renderers can run in
parallel, each pinned to a different skill via `always_load` in their profile
or `--skill` on the task.
- **Toolsets:** kanban, terminal, file
- **Skills:** one creative skill (see specialized variants below)
- **Outputs:** `scenes/scene-NN/clip.mp4`
### Specialized renderer variants
When scenes need very different tools, create specialized renderer profiles
instead of overloading one. Each loads a different creative skill.
| Variant | Skill | Best for |
|---------|-------|----------|
| `renderer-ascii` | `ascii-video` | Terminal aesthetic, retro pixel, audio-reactive grid, video-to-ASCII conversion |
| `renderer-manim` | `manim-video` | Math, algorithms, 3Blue1Brown-style explainers, equation derivations |
| `renderer-p5js` | `p5js` | Generative art, particles, shaders, organic motion, web-canvas content |
| `renderer-comfyui` | `comfyui` | AI-generated stills + video using local ComfyUI workflows (img-to-img, img-to-video, etc.) |
| `renderer-touchdesigner` | `touchdesigner-mcp` | Real-time, audio-reactive, installation art, VJ-style content |
| `renderer-3d` | `blender-mcp` *(optional)* | 3D modeling, animation, photoreal environments, character animation |
| `renderer-pixel` | `pixel-art` | Retro game aesthetic with era-correct palettes |
| `renderer-comic` | `baoyu-comic` | Knowledge-comic style narrative scenes |
| `renderer-meme` | `meme-generation` *(optional)* | Meme-style stills for satirical/social content |
| `renderer-procedural` | (none — Python with PIL + ffmpeg directly) | Custom procedural content where no skill fits |
| `renderer-video` | (external image-to-video API: Runway / Kling / Luma) | Animating still images in narrative film |
| `renderer-motion-graphics` | (external — Remotion CLI) | Motion graphics, kinetic typography, UI animations |
For external-API renderers, the profile holds the API client logic; only
`kanban-worker` is loaded, plus the terminal toolset and the API key.
### image-generator
Specifically for text-to-image generation. Often produces stills that go to
`renderer-video` for animation.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`, optionally `comfyui` (drives a local
ComfyUI install for image generation)
- **External APIs (alternative to local ComfyUI):** FAL, Replicate, OpenAI
Images, Midjourney
- **Outputs:** `scenes/scene-NN/stills/*.png`
### image-to-video-generator
Takes still images and animates them via Runway/Kling/Luma APIs, or via
ComfyUI's image-to-video workflows locally. Almost always follows
`image-generator` in narrative film pipelines.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`, optionally `comfyui` (for local image-to-video
workflows like AnimateDiff or WAN)
- **External APIs:** Runway, Kling, Luma, Pika
- **Outputs:** `scenes/scene-NN/clip.mp4`
### music-supervisor
Sources, analyzes, and prepares the music track. For music videos, also
produces a beat/BPM map and key-moment timestamps. Uses `songsee` for
spectrograms when the editor or renderer needs a visual reference of the
audio's energy.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`, `songsee` (audio visualization), plus one of:
- `songwriting-and-ai-music` — when commissioning lyrics + Suno prompts
- `heartmula` — when generating music with the open-source local model
- `spotify` — when sourcing existing tracks
- **Outputs:** `audio/track.mp3`, `audio/beats.json`, optional
`audio/track-spectrogram.png`
### voice-talent / narrator
Generates voiceover audio. Calls a TTS API directly; no Hermes skill required
beyond `kanban-worker`. The user can also supply pre-recorded VO instead of
generation.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **External APIs:** ElevenLabs, OpenAI TTS, etc.
- **Outputs:** `audio/voiceover/line-NN.mp3`, `audio/voiceover/timeline.mp3`
### foley / sfx-designer
Sound effects and ambient design. Often optional unless the brief calls for
sound design specifically.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`, `songsee` for audio-feature visualization when
designing to a track
- **Outputs:** `audio/sfx/*.mp3`
## Post-production roles
### editor
Assembles the final cut from clips. Uses ffmpeg for stitching, fades,
transitions. Reviews each clip for pacing and quality before assembly.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **External tools:** ffmpeg, ffprobe
- **Outputs:** `output/final.mp4`, `output/final-noaudio.mp4`
### colorist
Color grading. Usually optional — if the renderers already produce
brand-consistent output and the editor just stitches, the colorist is overkill.
Worth including for narrative film with hero shots.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **Outputs:** `output/final-graded.mp4`
### audio-mixer
Mixes voiceover + music + SFX into a final audio track. Sets levels, ducks
music under VO, normalizes loudness (LUFS).
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **External tools:** ffmpeg with `loudnorm` filter, optional `sox`
- **Outputs:** `audio/final-mix.mp3`
### captioner
Burns subtitles into the video, generates SRT, handles accessibility. Can also
generate captions from audio via Whisper.
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **External tools:** Whisper (CLI or API), ffmpeg subtitle filters
- **Outputs:** `output/captions.srt`, `output/final-captioned.mp4`
### masterer
Final encode + format variants. Produces deliverables for each platform target
(square for IG, vertical for TikTok, full HD for YouTube, etc.).
- **Toolsets:** kanban, terminal, file
- **Skills:** `kanban-worker`
- **Outputs:** `output/final-1080.mp4`, `output/final-9x16.mp4`, etc.
## QA roles
### reviewer
A neutral quality gate. Reads the brief, watches the cut, comments
specifically on what's off (pacing, sync, brand alignment, technical
quality). Distinct from the cinematographer (who reviews visuals during
production) and the editor (who reviews for assembly).
- **Toolsets:** kanban, terminal, file, video, vision
- **Skills:** `kanban-worker`
- **Review tools:** `video_analyze` (native clip review via multimodal LLM),
`vision_analyze` (frame/thumbnail review), ffprobe
- **Outputs:** `review-notes.md`, comments on tasks
### brand-cop
Reviews specifically for brand compliance — colors, typography, voice. Use
when the brand guidelines are detailed and a generic reviewer might miss
violations.
- **Toolsets:** kanban, file
- **Skills:** `kanban-worker`
- **Outputs:** comments + `brand-review.md`
## Composing teams — heuristics
- **Always:** director + at least one renderer + editor.
- **Add writer** if scripted dialogue / narration / on-screen text exceeds a
tagline.
- **Add storyboarder** if the brief has more than 5 distinct beats and the
director hasn't already laid out a beat list.
- **Add cinematographer** if multiple renderer instances need consistent
visual language. (For a single-tool video, the renderer's own skill spec
is enough.)
- **Add image-generator + image-to-video-generator pair** for narrative film
with photorealistic visuals.
- **Add music-supervisor** when music is provided and rhythm matters
(music videos always; explainers sometimes).
- **Add voice-talent** for any voiceover / narrative dialogue.
- **Add audio-mixer** when there are 2+ audio sources (VO + music, music + SFX).
- **Add captioner** for accessibility-priority projects (explainer, tutorial,
any platform that defaults to muted playback).
- **Add reviewer** for high-stakes projects. Skip for quick experimental loops.
- **Add masterer** when multiple platform deliverables are needed.
## Anti-patterns
- **One renderer doing everything.** If scenes use very different tools
(ASCII + 3D + motion graphics), use specialized renderer variants. The
renderer loads ONE creative skill at a time; mixing styles in a single
renderer causes thrashing.
- **A separate profile per scene.** No. Profiles are per-role, not per-scene.
Eight scenes use one or two renderer profiles, not eight.
- **A "general" profile that does everything.** Worse than no specialization.
The kanban routing breaks down if every task fits every profile.
- **No reviewer for important deliverables.** Saves an hour of pipeline time
but ships flaws.
@@ -0,0 +1,317 @@
# Tool Matrix — Skills + Toolsets per Role
Maps each role archetype to the Hermes skills it should `always_load` and the
toolsets it needs. Only references skills that ship in the public hermes-agent
repository (under `skills/` or `optional-skills/`). External APIs and CLIs are
called from the terminal toolset; they don't appear in `always_load`.
## Hermes skills relevant to video production
### Visual / rendering skills (`hermes-agent/skills/creative/`)
| Skill | What it does | Best fit for |
|-------|--------------|--------------|
| `ascii-video` | Production pipeline for ASCII art video — generative, audio-reactive, video-to-ASCII | Renderer for ASCII / terminal / retro pixel content; cinematographer for ASCII projects |
| `ascii-art` | Static ASCII art generation | Concept artist for ASCII style frames; secondary tool for ASCII renderer |
| `manim-video` | Manim CE animations — math, algorithms, 3Blue1Brown-style explainers | Renderer for math, algorithm walkthroughs, technical concept explainers |
| `p5js` | p5.js sketches — generative art, shaders, interactive, 3D | Renderer for generative art, particle systems, organic motion, web-canvas content |
| `comfyui` | Generate images, video, audio with ComfyUI workflows (image-to-image, image-to-video, etc.) | image-generator, image-to-video-generator, or general renderer for AI-generated content |
| `touchdesigner-mcp` | Control a running TouchDesigner instance — real-time visuals, audio-reactive installation art, VJ | Renderer for real-time/audio-reactive content; installation art; live performance |
| `blender-mcp` *(optional)* | Control Blender 4.3+ via MCP — 3D modeling, animation, rendering | Renderer for 3D scenes, photoreal environments, character animation |
| `pixel-art` | Pixel art with era palettes (NES, Game Boy, PICO-8) | Renderer for retro game aesthetic; concept artist for pixel-style frames |
| `baoyu-comic` | Knowledge-comic generation (educational, biography, tutorial) | Renderer for comic-style narrative; explainer in panel form |
| `baoyu-infographic` | Infographic generation | Renderer for data-driven explainer scenes |
| `meme-generation` *(optional)* | Generate meme images by overlaying text on templates | Generator for satirical/social content; meme-style stills |
### Design / pre-production skills (`hermes-agent/skills/creative/`)
| Skill | What it does | Best fit for |
|-------|--------------|--------------|
| `claude-design` | Design one-off HTML artifacts (landing, deck, prototype) | Concept artist for product video style frames; storyboarder for UI-heavy content |
| `design-md` | Design markdown docs | Concept artist documenting visual specs |
| `popular-web-designs` | Reference patterns for popular web designs | Concept artist; cinematographer when matching a known UI aesthetic |
| `sketch` | Throwaway HTML mockups (2-3 design variants to compare) | Concept artist exploring directions; storyboarder for UI flows |
| `excalidraw` | Excalidraw-style hand-drawn diagrams | Storyboarder; concept artist for sketch-style frames |
| `architecture-diagram` | Software architecture diagrams | Storyboarder for technical content; explainer scenes about systems |
| `concept-diagrams` *(optional)* | Flat, minimal SVG diagrams (educational visual language; physics, chemistry, math, anatomy, etc.) | Renderer / storyboarder for explainer scenes with clean educational diagrams |
| `pretext` | Mathematical/scientific content authoring | Writer / cinematographer for technical-explainer pretexts |
| `creative-ideation` | Constraint-driven project ideation | Director / cinematographer when the brief is wide-open and needs framing |
| `humanizer` | Strip AI-isms from text, add real voice | Writer / copywriter post-process to avoid AI-tells in scripts and VO copy |
### Audio / media skills (`hermes-agent/skills/creative/` + `skills/media/`)
| Skill | What it does | Best fit for |
|-------|--------------|--------------|
| `songwriting-and-ai-music` | Songwriting craft + Suno prompt patterns | Music supervisor when commissioning a track via Suno |
| `heartmula` | Open-source music generation (Apache-2.0, Suno-like) | Music supervisor generating bespoke tracks without external APIs |
| `songsee` | Spectrograms, mel/chroma/MFCC of audio files | Music supervisor analyzing tracks; foley-designer designing to a beat; editor visualizing a mix |
| `spotify` | Spotify control — play, search, queue, manage playlists | Music supervisor sourcing existing tracks; reference research |
| `youtube-content` | Fetch transcripts + transform to chapters/summaries/posts | Documentary cut, content adaptation, research for explainers |
| `gif-search` | Find existing GIFs | Editor / concept artist sourcing references |
| `gifs` | GIF tooling | Masterer producing GIF deliverables |
### Kanban infrastructure (`hermes-agent/skills/devops/`)
| Skill | What it does | When to load |
|-------|--------------|--------------|
| `kanban-orchestrator` | Decomposition playbook + anti-temptation rules for orchestrator profiles | Director only |
| `kanban-worker` | Pitfalls, examples, edge cases for kanban workers (deeper than auto-injected guidance) | Any profile — load when handling tricky multi-step workflows |
The kanban plugin auto-injects baseline orchestration guidance into every
worker's system prompt — the `kanban_create` fan-out pattern, claim/handoff
lifecycle, and the "decompose, don't execute" rule for orchestrators.
`kanban-orchestrator` and `kanban-worker` are deeper playbooks loaded when a
profile needs them.
## External tools (called from terminal toolset)
These are **not** Hermes skills but external CLIs / APIs that profiles invoke.
They don't appear in `always_load`; instead the role's terminal commands hit
them directly.
| Tool | What it does | Profile that uses it |
|------|--------------|----------------------|
| `ffmpeg` | Video / audio encode, splice, mux | renderer, editor, audio-mixer, masterer |
| `ffprobe` | Inspect media | All media-touching profiles |
| Whisper (CLI or API) | Speech-to-text for captions | captioner |
| Text-to-image API (FAL / Replicate / OpenAI / Midjourney) | Stills generation | image-generator (alternative to local `comfyui`) |
| Image-to-video API (Runway / Kling / Luma / Pika) | Animate stills | image-to-video-generator |
| Text-to-speech API (ElevenLabs / OpenAI TTS / etc.) | Voiceover generation | voice-talent |
| Suno API or web | Track composition (paired with `songwriting-and-ai-music`) | music-supervisor |
| Remotion CLI (`npx remotion render`) | React-based motion graphics | renderer-motion-graphics |
| Manim CE (`manim`) | Math animation render (driven by `manim-video` skill's recipes) | renderer-manim |
| Blender (`blender -b`) | 3D rendering (alternative to `blender-mcp`) | renderer-3d |
## Built-in Hermes tools for media review
These are native Hermes tools — not invoked via terminal but through their own
toolsets. Enable them per-profile by adding the toolset to the profile config.
| Tool | Toolset | What it does | Profile that uses it |
|------|---------|--------------|----------------------|
| `video_analyze` | `video` (opt-in — `hermes tools enable video`) | Native video understanding — sends full clip to a multimodal LLM (Gemini via OpenRouter) for review without frame extraction. Supports mp4, webm, mov, avi, mkv. 50 MB cap. Model: `AUXILIARY_VIDEO_MODEL` env → `AUXILIARY_VISION_MODEL` fallback. | reviewer, cinematographer, editor |
| `vision_analyze` | `vision` (core — enabled by default) | Image/frame analysis — review stills, thumbnails, exported frames. Already available to all profiles without opt-in. | reviewer, cinematographer, concept-artist |
## Standard toolset configurations per role
### director
```yaml
toolsets:
- kanban
- terminal
- file
skills:
always_load:
- kanban-orchestrator
```
The director's terminal access is conventional but the SOUL.md rules forbid
execution. Audit logs catch violations.
### writer / copywriter
```yaml
toolsets:
- kanban
- file
skills:
always_load:
- kanban-worker
- humanizer # post-process scripts to strip AI-tells
```
No terminal — writers don't need it.
### concept-artist
```yaml
toolsets:
- kanban
- terminal
- file
skills:
always_load:
- kanban-worker
# plus one or more (style-dependent):
# - claude-design (UI / web product video)
# - sketch (quick mockup variants)
# - excalidraw (hand-drawn frames)
# - ascii-art (ASCII style frames)
# - pixel-art (retro/game aesthetic)
# - popular-web-designs (matching known web aesthetic)
# - design-md (text-based design docs)
```
### storyboarder
```yaml
toolsets:
- kanban
- file
skills:
always_load:
- kanban-worker
# one of:
# - excalidraw (sketch storyboards)
# - architecture-diagram (technical/system content)
# - concept-diagrams (educational / scientific content)
```
### cinematographer
```yaml
toolsets:
- kanban
- terminal
- file
- video # video_analyze — review full clips natively
- vision # vision_analyze — review stills / exported frames
skills:
always_load:
- kanban-worker
# the visual skill that matches the project, e.g.:
# - ascii-video (ASCII projects)
# - manim-video (math/explainer)
# - p5js (generative)
# - comfyui (AI-generated visuals)
# - blender-mcp (3D)
# - touchdesigner-mcp (real-time/installation)
```
### renderer (specialized variants)
```yaml
toolsets:
- kanban
- terminal
- file
skills:
always_load:
- kanban-worker
# ONE skill per renderer variant (or empty for external-API renderers):
# - ascii-video (renderer-ascii)
# - manim-video (renderer-manim)
# - p5js (renderer-p5js)
# - comfyui (renderer-comfyui — img/video AI gen)
# - touchdesigner-mcp (renderer-touchdesigner)
# - blender-mcp (renderer-3d)
# - pixel-art (renderer-pixel)
# - baoyu-comic (renderer-comic)
# - meme-generation (renderer-meme)
```
For external-API renderers (image-to-video-generator using Runway, voice-talent
using ElevenLabs, renderer-motion-graphics using Remotion), `always_load` only
contains `kanban-worker` — the role's work is API-driven and the API key +
terminal commands suffice.
For multi-skill renderer setups (rare — usually one variant per skill is
cleaner) use `--skill <name>` on individual `kanban_create` calls to override
which skill loads for that specific task.
### image-generator / image-to-video-generator / voice-talent
```yaml
toolsets:
- kanban
- terminal
- file
skills:
always_load:
- kanban-worker
# for image-generator that drives ComfyUI locally:
# - comfyui
env_required:
# populate based on the chosen API:
- FAL_KEY # or REPLICATE_API_TOKEN, OPENAI_API_KEY for image-gen
- RUNWAY_API_KEY # or KLING_API_KEY, LUMA_API_KEY for image-to-video
- ELEVENLABS_API_KEY # or OPENAI_API_KEY for TTS
```
If the user's setup has ComfyUI installed locally, the `comfyui` skill can
replace the external image-gen API entirely (cheaper, more control, supports
custom workflows for image-to-video too).
### music-supervisor
```yaml
toolsets:
- kanban
- terminal
- file
skills:
always_load:
- kanban-worker
- songsee # spectrograms / audio analysis
# plus (depending on what the project needs):
# - songwriting-and-ai-music (commissioning Suno tracks)
# - heartmula (commissioning open-source local generation)
# - spotify (sourcing existing tracks)
```
### editor / audio-mixer / captioner / masterer
```yaml
toolsets:
- kanban
- terminal
- file
- video # video_analyze — editor reviews assembled cuts natively
- vision # vision_analyze — spot-check frames
skills:
always_load:
- kanban-worker
```
These are mostly ffmpeg-driven; no special skill needed beyond `kanban-worker`.
For captioner add Whisper invocation patterns to the SOUL.md.
### reviewer / brand-cop
```yaml
toolsets:
- kanban
- terminal # for media inspection (ffprobe, etc.)
- file
- video # video_analyze — review full clips natively
- vision # vision_analyze — review stills / exported frames
skills:
always_load:
- kanban-worker
```
## API key requirements
Track these in the project setup. The setup script should verify each required
key is present in `~/.hermes/.env` (or macOS Keychain) before firing the kanban.
| Service | Env var | Used by |
|---------|---------|---------|
| ElevenLabs | `ELEVENLABS_API_KEY` | voice-talent |
| OpenAI | `OPENAI_API_KEY` | image-generator (DALL-E), voice-talent (TTS) |
| OpenRouter | `OPENROUTER_API_KEY` | reviewer, cinematographer, editor (`video_analyze` routes through `AUXILIARY_VIDEO_MODEL` → OpenRouter) |
| FAL | `FAL_KEY` | image-generator (FAL flux models) |
| Replicate | `REPLICATE_API_TOKEN` | image-generator (alternate provider) |
| Runway | `RUNWAY_API_KEY` | image-to-video-generator |
| Kling | `KLING_API_KEY` | image-to-video-generator (alternate) |
| Luma | `LUMA_API_KEY` | image-to-video-generator (alternate) |
| Suno | `SUNO_API_KEY` | music-supervisor (paired with `songwriting-and-ai-music`) |
| Spotify | `SPOTIFY_CLIENT_ID` + `SPOTIFY_CLIENT_SECRET` | music-supervisor (paired with `spotify` skill) |
| Anthropic | `ANTHROPIC_API_KEY` | every Hermes profile (Claude) |
If a key is missing, prompt the user to add it. Storage methods, in order of
preference: macOS Keychain → `~/.hermes/.env` → environment variable.
## Skill version pinning
If a specific skill version is desired, pass it via the per-task
`--skill <name>=<version>` flag. The default is whatever's installed.
## Adding a new skill to the matrix
When a new Hermes-public video skill ships:
1. Add a row to the relevant table at the top of this file
2. If it warrants a specialized renderer variant, add to `role-archetypes.md`
3. Update relevant per-style examples in `examples.md`
@@ -0,0 +1,501 @@
#!/usr/bin/env python3
"""
Bootstrap a video production kanban from a structured plan JSON.
Reads a plan.json describing the team + brief, expands templates from
../assets/, and writes a setup.sh that creates Hermes profiles and fires the
initial kanban task.
Profile-config patching, SOUL.md-per-profile, TEAM.md task-graph convention,
and the `hermes kanban create --workspace dir:` initial-task pattern are
adapted from alt-glitch's NousResearch/kanban-video-pipeline.
Usage:
bootstrap_pipeline.py plan.json [--out setup.sh]
The plan.json schema is documented inline below see the `validate_plan`
function. A minimal example:
{
"title": "Q3 Product Teaser",
"slug": "q3-product-teaser",
"tenant": "q3-product-teaser",
"duration_s": 30,
"aspect": "1:1",
"resolution": "1080x1080",
"fps": 30,
"team": [
{
"profile": "director",
"role": "director",
"toolsets": ["kanban", "terminal", "file"],
"skills": [],
"responsibilities": "...",
"inputs": "brief.md, TEAM.md, taste/",
"outputs": "kanban tasks for the team"
},
...
],
"scenes": [
{"n": 1, "time": "0:00-0:08", "content": "...", "tool": "renderer-ascii"},
...
],
"audio": {"approach": "voiceover + music bed", "vo": "ElevenLabs Lily",
"music": "license-free", "sfx": "n/a"},
"deliverables": [
{"format": "mp4", "resolution": "1080x1080", "notes": "primary"}
],
"api_keys_required": ["ELEVENLABS_API_KEY", "OPENROUTER_API_KEY"],
"brief_extra": {
"concept_one_liner": "...",
"emotional_north_star": "...",
"visual_refs": "...",
"tone": "...",
"brand_constraints": "..."
}
}
"""
from __future__ import annotations
import argparse
import json
import os
import re
import sys
from pathlib import Path
ASSETS_DIR = Path(__file__).resolve().parent.parent / "assets"
def load_template(name: str) -> str:
return (ASSETS_DIR / name).read_text()
PROFILE_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")
SLUG_RE = re.compile(r"^[a-z0-9][a-z0-9-]+$")
def validate_plan(plan: dict) -> list[str]:
"""Return a list of validation error strings; empty list = valid."""
errors = []
required_top = ["title", "slug", "tenant", "duration_s", "aspect",
"resolution", "fps", "team", "scenes", "audio",
"deliverables"]
for k in required_top:
if k not in plan:
errors.append(f"missing required key: {k}")
if "team" in plan:
if not isinstance(plan["team"], list) or not plan["team"]:
errors.append("team must be a non-empty list")
else:
roles = [t.get("role") for t in plan["team"]]
if "director" not in roles:
errors.append("team must include a director role")
seen_profiles = set()
for i, t in enumerate(plan["team"]):
for k in ["profile", "role", "toolsets", "skills",
"responsibilities"]:
if k not in t:
errors.append(f"team[{i}] missing {k}")
# Profile name must match Hermes's regex (lowercase
# alphanumeric + hyphens + underscores, up to 64 chars).
if "profile" in t:
if not PROFILE_NAME_RE.match(t["profile"]):
errors.append(
f"team[{i}].profile {t['profile']!r} must match "
f"[a-z0-9][a-z0-9_-]{{0,63}} per Hermes profile rules"
)
if t["profile"] in seen_profiles:
errors.append(
f"team[{i}].profile {t['profile']!r} is duplicated"
)
seen_profiles.add(t["profile"])
# Toolsets / skills must be lists, not strings.
if "toolsets" in t and not isinstance(t["toolsets"], list):
errors.append(
f"team[{i}].toolsets must be a list of strings"
)
if "skills" in t and not isinstance(t["skills"], list):
errors.append(
f"team[{i}].skills must be a list of strings"
)
if "slug" in plan:
if not SLUG_RE.match(plan["slug"]):
errors.append("slug must be lowercase, hyphenated, "
"starting with [a-z0-9]")
return errors
def render_brief(plan: dict) -> str:
"""Render brief.md from the plan."""
tmpl = load_template("brief.md.tmpl")
extra = plan.get("brief_extra", {})
# Scene table rows
scene_rows = []
for s in plan["scenes"]:
scene_rows.append(
f"| {s.get('n', '?')} | {s.get('time', '?')} | "
f"{s.get('content', '')} | {s.get('tool', '')} | "
f"{s.get('audio', '')} | {s.get('notes', '')} |"
)
scene_table = "\n".join(scene_rows) if scene_rows else "_(none yet)_"
# Deliverable rows
deliv_rows = []
for d in plan["deliverables"]:
deliv_rows.append(
f"| {d.get('format', '?')} | {d.get('resolution', '?')} | "
f"{d.get('notes', '')} |"
)
deliv_table = "\n".join(deliv_rows) if deliv_rows else "_(none)_"
# Replacements (single-pass)
replacements = {
"TITLE": plan["title"],
"SLUG": plan["slug"],
"TENANT": plan["tenant"],
"WORKSPACE": f"~/projects/video-pipeline/{plan['slug']}",
"ONE_LINE_PITCH": extra.get("concept_one_liner", "_(TBD)_"),
"EMOTIONAL_NORTH_STAR": extra.get("emotional_north_star", "_(TBD)_"),
"DURATION_S": str(plan["duration_s"]),
"ASPECT": plan["aspect"],
"RESOLUTION": plan["resolution"],
"FPS": str(plan["fps"]),
"PLATFORMS": extra.get("platforms", "_(TBD)_"),
"DEADLINE": extra.get("deadline", "_(none)_"),
"QUALITY_BAR": extra.get("quality_bar", "polished"),
"VISUAL_REFS": extra.get("visual_refs", "_(none)_"),
"TONE": extra.get("tone", "_(TBD)_"),
"BRAND_CONSTRAINTS": extra.get("brand_constraints", "_(none)_"),
"AESTHETIC_RULES": extra.get("aesthetic_rules", "_(TBD)_"),
"AUDIO_APPROACH": plan["audio"].get("approach", "_(TBD)_"),
"VO_DETAILS": plan["audio"].get("vo", "_(n/a)_"),
"MUSIC_DETAILS": plan["audio"].get("music", "_(n/a)_"),
"SFX_DETAILS": plan["audio"].get("sfx", "_(n/a)_"),
"PRIMARY_FORMAT": plan["deliverables"][0]["format"],
"PRIMARY_RES": plan["deliverables"][0]["resolution"],
"ALT_FORMAT_1": (plan["deliverables"][1]["format"]
if len(plan["deliverables"]) > 1 else "_(none)_"),
"ALT_RES_1": (plan["deliverables"][1]["resolution"]
if len(plan["deliverables"]) > 1 else ""),
"ALT_NOTES_1": (plan["deliverables"][1].get("notes", "")
if len(plan["deliverables"]) > 1 else ""),
"API_KEYS_REQUIRED": ", ".join(plan.get("api_keys_required", [])) or "none",
"EXT_DEPS": extra.get("ext_deps", "ffmpeg, Python 3.11+"),
"SOURCE_ASSETS": extra.get("source_assets", "_(none)_"),
}
out = tmpl
for k, v in replacements.items():
out = out.replace("{{" + k + "}}", str(v))
# Scene + deliv tables: replace the placeholder row in the template
out = re.sub(
r"\|\s*1\s*\|\s*0:000:0X.+?\n\|\s*2\s*\|.+?\n",
scene_table + "\n",
out, flags=re.DOTALL,
)
return out
def render_team_md(plan: dict) -> str:
"""Render TEAM.md from the team list + scene → tool mapping."""
lines = [f"# Team & Task Graph — {plan['title']}", "", "## Team", ""]
for t in plan["team"]:
skills = (
f"loads `{', '.join(t['skills'])}`"
if t["skills"] else "no skills required"
)
lines.append(
f"- `{t['profile']}` — {t['responsibilities']} ({skills})"
)
lines.extend(["", "## Task Graph", "", "```"])
# Build a simple task graph based on conventions
profiles_by_role = {t["role"]: t["profile"] for t in plan["team"]}
director = profiles_by_role.get("director", "director")
lines.append(f"T0 {director} — decompose")
next_id = 1
parents_for_renderer: list[str] = ["T0"]
if "cinematographer" in profiles_by_role:
cid = f"T{next_id}"
lines.append(
f"{cid:5} {profiles_by_role['cinematographer']} — visual spec for all scenes (parent: T0)"
)
parents_for_renderer = [cid]
next_id += 1
if "music-supervisor" in profiles_by_role:
cid = f"T{next_id}"
lines.append(
f"{cid:5} {profiles_by_role['music-supervisor']} — track analysis + beats.json (parent: T0)"
)
next_id += 1
ms_id = cid
else:
ms_id = None
# Scenes
scene_ids = []
for s in plan["scenes"]:
cid = f"T{next_id}"
renderer_profile = s.get("tool") or "renderer"
# Lookup the actual profile name
for t in plan["team"]:
if t["role"] == renderer_profile or t["profile"] == renderer_profile:
renderer_profile = t["profile"]
break
parents = parents_for_renderer + ([ms_id] if ms_id else [])
parent_str = ", ".join(parents)
lines.append(
f"{cid:5} {renderer_profile} — scene {s.get('n', '?')}: "
f"{s.get('content', '')[:50]} (parents: {parent_str})"
)
scene_ids.append(cid)
next_id += 1
# VO + audio mix
if "voice-talent" in profiles_by_role:
vo_id = f"T{next_id}"
lines.append(f"{vo_id:5} {profiles_by_role['voice-talent']} — narration (parent: T0)")
next_id += 1
else:
vo_id = None
if "audio-mixer" in profiles_by_role:
am_id = f"T{next_id}"
am_parents = [p for p in [ms_id, vo_id] if p]
lines.append(
f"{am_id:5} {profiles_by_role['audio-mixer']} — mix audio (parents: {', '.join(am_parents)})"
)
next_id += 1
else:
am_id = None
# Editor
if "editor" in profiles_by_role:
ed_id = f"T{next_id}"
ed_parents = scene_ids + [p for p in [am_id, vo_id, ms_id] if p and p not in scene_ids]
lines.append(
f"{ed_id:5} {profiles_by_role['editor']} — assemble + mux (parents: {', '.join(ed_parents)})"
)
next_id += 1
else:
ed_id = None
# Captioner
if "captioner" in profiles_by_role and ed_id:
cap_id = f"T{next_id}"
lines.append(
f"{cap_id:5} {profiles_by_role['captioner']} — SRT + burn (parent: {ed_id})"
)
next_id += 1
last = cap_id
else:
last = ed_id
# Reviewer
if "reviewer" in profiles_by_role and last:
rv_id = f"T{next_id}"
lines.append(
f"{rv_id:5} {profiles_by_role['reviewer']} — final QA (parent: {last})"
)
lines.append("```")
lines.extend([
"",
"## Per-task workspace requirement",
"",
f"All `kanban_create` calls MUST pass:",
f"```",
f'workspace_kind="dir"',
f'workspace_path="$HOME/projects/video-pipeline/{plan["slug"]}"',
f'tenant="{plan["tenant"]}"',
f"```",
])
return "\n".join(lines)
def render_setup_sh(plan: dict, brief_md: str, team_md: str) -> str:
"""Render setup.sh from the plan."""
tmpl = load_template("setup.sh.tmpl")
# API key checks
key_checks = []
for key in plan.get("api_keys_required", []):
key_checks.append(f'check_key {key} hermes {key} || exit 1')
key_checks_str = "\n".join(key_checks) if key_checks else "# (no API keys required)"
# Scene dirs
scene_dir_lines = []
for s in plan["scenes"]:
n = s.get("n", "?")
scene_dir_lines.append(f'mkdir -p "$WORKSPACE/scenes/scene-{n:02d}"/checkpoints')
scene_dirs = "\n".join(scene_dir_lines) if scene_dir_lines else ""
# Profile create
profile_creates = []
for t in plan["team"]:
profile_creates.append(
f'hermes profile create {t["profile"]} --clone 2>/dev/null || true'
)
# Profile config — emit JSON arrays so the bash function can pass them
# safely through to the Python YAML patcher.
profile_configs = []
for t in plan["team"]:
ts_json = json.dumps(t["toolsets"])
sk_json = json.dumps(t["skills"])
# Use single-quoted bash strings; JSON only contains "/[/], no single
# quotes, so this is safe.
profile_configs.append(
f"configure_profile {t['profile']!r} {ts_json!r} {sk_json!r}"
)
# SOUL writes — uses heredocs per profile
soul_writes = []
for t in plan["team"]:
soul_writes.append(
f'cat > "$HOME/.hermes/profiles/{t["profile"]}/SOUL.md" <<\'SOUL_EOF\'\n'
f"{render_soul_md(t, plan)}\n"
f"SOUL_EOF\n"
f'echo " ✓ SOUL.md for {t["profile"]}"'
)
# Taste writes (placeholder; real content optional)
taste_writes = (
'cat > "$WORKSPACE/taste/brand-guide.md" <<\'TASTE_EOF\'\n'
'# Brand Guide\n\n'
'_(Populate with project-specific colors, typography, motion rules)_\n'
'TASTE_EOF\n'
'cat > "$WORKSPACE/taste/emotional-dna.md" <<\'DNA_EOF\'\n'
'# Emotional DNA\n\n'
'_(What this piece should FEEL like — populate from the brief.)_\n'
'DNA_EOF'
)
# Asset copies — leave empty by default; user fills in
asset_copies = "# Add cp/rsync commands here for any provided assets"
out = tmpl
out = out.replace("{{TITLE}}", plan["title"])
out = out.replace("{{SLUG}}", plan["slug"])
out = out.replace("{{TENANT}}", plan["tenant"])
out = out.replace("{{WORKSPACE}}", f"~/projects/video-pipeline/{plan['slug']}")
out = out.replace("{{KEY_CHECKS}}", key_checks_str)
out = out.replace("{{SCENE_DIRS}}", scene_dirs)
out = out.replace("{{PROFILE_CREATE_COMMANDS}}", "\n".join(profile_creates))
out = out.replace("{{PROFILE_CONFIG_COMMANDS}}", "\n".join(profile_configs))
out = out.replace("{{SOUL_WRITES}}", "\n".join(soul_writes))
out = out.replace("{{BRIEF_CONTENTS}}", brief_md)
out = out.replace("{{TEAM_CONTENTS}}", team_md)
out = out.replace("{{TASTE_WRITES}}", taste_writes)
out = out.replace("{{ASSET_COPIES}}", asset_copies)
return out
def render_soul_md(team_member: dict, plan: dict) -> str:
"""Render a profile's SOUL.md from a team member dict + plan context."""
tmpl = load_template("soul.md.tmpl")
role = team_member["role"]
common_rules = (
"- **Read the brief and team graph** before doing anything else.\n"
"- **Pass `workspace_kind=\"dir\"` and `workspace_path` on every "
"`kanban_create` call.** This keeps the team in one shared workspace.\n"
f"- **Use tenant `{plan['tenant']}`** on every kanban call.\n"
"- **Write outputs to predictable paths.** Other profiles depend on "
"your filename conventions.\n"
"- **Emit heartbeats** during long-running work. Renderers should "
"report frame counts; editors should report assembly progress.\n"
)
if role == "director":
common_rules += (
"- **Do not execute the work yourself.** For every concrete task, "
"create a kanban task and assign it to the appropriate profile.\n"
"- **Decompose, route, comment, approve — that's the whole job.**\n"
"- **Read TEAM.md** for the canonical task graph. Do not invent "
"new roles unless the brief truly demands it.\n"
"- **Load the `kanban-orchestrator` skill** for the deeper "
"decomposition playbook beyond the auto-injected baseline.\n"
)
common_commands = (
"```bash\n"
"# Inspect a clip\n"
"ffprobe -v quiet -show_entries format=duration -show_entries "
"stream=codec_name,width,height,r_frame_rate <file.mp4>\n"
"\n"
"# Extract a frame for QA\n"
"ffmpeg -y -i <input.mp4> -vf \"select='eq(n,30)'\" -vsync vfr <out.png>\n"
"```"
)
out = tmpl
out = out.replace("{{ROLE_NAME}}", role)
out = out.replace("{{ROLE_RESPONSIBILITIES}}", team_member["responsibilities"])
out = out.replace("{{INPUTS_READ}}", team_member.get("inputs", "_(see brief)_"))
out = out.replace("{{OUTPUTS_PRODUCED}}", team_member.get("outputs", "_(see brief)_"))
out = out.replace("{{TOOLSETS}}", ", ".join(team_member["toolsets"]))
out = out.replace(
"{{SKILLS}}",
", ".join(team_member["skills"]) if team_member["skills"] else "(none)"
)
out = out.replace(
"{{EXTERNAL_TOOLS}}",
team_member.get("external_tools", "ffmpeg, ffprobe (via terminal)")
)
out = out.replace(
"{{ROLE_RULES}}",
team_member.get("role_rules", "_(see TEAM.md and brief.md)_")
)
out = out.replace("{{COMMON_RULES}}", common_rules)
out = out.replace("{{COMMON_COMMANDS}}", common_commands)
return out
def main():
ap = argparse.ArgumentParser(description=__doc__,
formatter_class=argparse.RawDescriptionHelpFormatter)
ap.add_argument("plan_json", help="Path to plan.json")
ap.add_argument("--out", default="setup.sh",
help="Output path for setup.sh (default: ./setup.sh)")
ap.add_argument("--brief-out", default=None,
help="Write brief.md alongside (default: skipped)")
ap.add_argument("--team-out", default=None,
help="Write TEAM.md alongside (default: skipped)")
args = ap.parse_args()
plan = json.loads(Path(args.plan_json).read_text())
errors = validate_plan(plan)
if errors:
print("Plan validation failed:", file=sys.stderr)
for e in errors:
print(f" - {e}", file=sys.stderr)
sys.exit(2)
brief = render_brief(plan)
team = render_team_md(plan)
setup = render_setup_sh(plan, brief, team)
Path(args.out).write_text(setup)
os.chmod(args.out, 0o755)
print(f"Wrote {args.out}")
if args.brief_out:
Path(args.brief_out).write_text(brief)
print(f"Wrote {args.brief_out}")
if args.team_out:
Path(args.team_out).write_text(team)
print(f"Wrote {args.team_out}")
if __name__ == "__main__":
main()
@@ -0,0 +1,195 @@
#!/usr/bin/env python3
"""
Monitor a running video-production kanban. Polls `hermes kanban list` and
`events` for a tenant and surfaces issues (stuck tasks, missing heartbeats,
repeated retries, dependency deadlocks).
Usage:
monitor.py --tenant <project-slug> [--interval 30]
Outputs a periodic snapshot to stdout. Sends alerts via stderr when issues
are detected. Designed to run alongside the kanban kill with Ctrl-C when
you're satisfied (or scripted to stop on completion).
This is best-effort observability. It does not auto-restart tasks; intervention
decisions should remain human/AI-overseen.
"""
from __future__ import annotations
import argparse
import json
import shutil
import subprocess
import sys
import time
from collections import defaultdict
from datetime import datetime, timedelta
def hermes_available() -> bool:
return shutil.which("hermes") is not None
def kanban_list(tenant: str) -> list[dict]:
"""Returns parsed task rows. Falls back to plain stdout parsing if JSON
output isn't supported by the installed hermes CLI."""
try:
out = subprocess.run(
["hermes", "kanban", "list", "--tenant", tenant, "--json"],
capture_output=True, text=True, check=False,
)
if out.returncode == 0 and out.stdout.strip().startswith("["):
return json.loads(out.stdout)
except (FileNotFoundError, json.JSONDecodeError):
pass
# Fallback: textual parse of `hermes kanban list`
out = subprocess.run(
["hermes", "kanban", "list", "--tenant", tenant],
capture_output=True, text=True, check=False,
)
rows = []
for line in out.stdout.splitlines():
line = line.strip()
if not line or line.startswith("#") or "STATUS" in line.upper():
continue
parts = line.split()
if len(parts) >= 4 and parts[0].startswith("t_"):
rows.append({
"id": parts[0],
"status": parts[1] if len(parts) > 1 else "?",
"assignee": parts[2] if len(parts) > 2 else "?",
"title": " ".join(parts[3:]) if len(parts) > 3 else "",
"started_at": None,
"heartbeat_at": None,
"max_runtime_s": None,
})
return rows
def kanban_show(task_id: str) -> dict | None:
out = subprocess.run(
["hermes", "kanban", "show", task_id, "--json"],
capture_output=True, text=True, check=False,
)
if out.returncode != 0:
return None
try:
return json.loads(out.stdout)
except json.JSONDecodeError:
return None
def detect_issues(tasks: list[dict]) -> list[str]:
"""Return a list of issue strings, one per concern."""
now = datetime.now()
issues: list[str] = []
by_status = defaultdict(list)
for t in tasks:
by_status[t.get("status", "?")].append(t)
# Stuck tasks: RUNNING with no heartbeat in 2 min
for t in by_status.get("running", []) + by_status.get("RUNNING", []):
hb = t.get("heartbeat_at")
if not hb:
continue
try:
hb_dt = datetime.fromisoformat(str(hb).rstrip("Z"))
except ValueError:
continue
if now - hb_dt > timedelta(minutes=2):
issues.append(
f"STUCK: {t['id']} ({t.get('assignee', '?')}) — "
f"no heartbeat in {(now - hb_dt).total_seconds():.0f}s"
)
# Tasks exceeding max_runtime
for t in by_status.get("running", []) + by_status.get("RUNNING", []):
started = t.get("started_at")
max_rt = t.get("max_runtime_s")
if not started or not max_rt:
continue
try:
started_dt = datetime.fromisoformat(str(started).rstrip("Z"))
except ValueError:
continue
elapsed = (now - started_dt).total_seconds()
if elapsed > max_rt:
issues.append(
f"OVERTIME: {t['id']} ({t.get('assignee', '?')}) — "
f"running {elapsed:.0f}s, cap was {max_rt}s"
)
# Repeated retries
for t in tasks:
retries = t.get("retries", 0)
if retries and retries >= 2:
issues.append(
f"FLAPPING: {t['id']} ({t.get('assignee', '?')}) — "
f"retried {retries}× — fix root cause before next run"
)
return issues
def snapshot(tenant: str) -> tuple[list[dict], list[str]]:
tasks = kanban_list(tenant)
issues = detect_issues(tasks)
return tasks, issues
def print_snapshot(tasks: list[dict], issues: list[str]):
counts = defaultdict(int)
for t in tasks:
counts[str(t.get("status", "?")).lower()] += 1
print(f"\n[{datetime.now().strftime('%H:%M:%S')}] "
f"Total: {len(tasks)} | "
+ " | ".join(f"{k}: {v}" for k, v in sorted(counts.items())))
for t in tasks:
bar = "" if str(t.get("status", "")).lower() == "done" else \
"" if str(t.get("status", "")).lower() == "running" else \
"·" if str(t.get("status", "")).lower() == "ready" else \
"" if str(t.get("status", "")).lower() == "failed" else "?"
print(f" {bar} {t.get('id', '?'):14} {t.get('assignee', '?'):20} "
f"{t.get('title', '')[:60]}")
if issues:
print("\n ⚠ ISSUES:", file=sys.stderr)
for i in issues:
print(f" {i}", file=sys.stderr)
def main():
ap = argparse.ArgumentParser(description=__doc__,
formatter_class=argparse.RawDescriptionHelpFormatter)
ap.add_argument("--tenant", required=True,
help="Project tenant slug to monitor")
ap.add_argument("--interval", type=int, default=30,
help="Poll interval in seconds (default: 30)")
ap.add_argument("--once", action="store_true",
help="Print one snapshot and exit (no polling loop)")
args = ap.parse_args()
if not hermes_available():
print("ERROR: 'hermes' CLI not found in PATH", file=sys.stderr)
sys.exit(1)
if args.once:
tasks, issues = snapshot(args.tenant)
print_snapshot(tasks, issues)
sys.exit(0 if not issues else 2)
print(f"Monitoring tenant '{args.tenant}' every {args.interval}s. "
"Ctrl-C to exit.")
try:
while True:
tasks, issues = snapshot(args.tenant)
print_snapshot(tasks, issues)
time.sleep(args.interval)
except KeyboardInterrupt:
print("\nStopped.")
if __name__ == "__main__":
main()
+8 -1
View File
@@ -43,7 +43,7 @@ class NodeServer:
def __init__(
self,
host: str = "0.0.0.0",
host: str = "127.0.0.1",
port: int = 18789,
token_path: Optional[Path] = None,
display_name: str = "hermes-meet-node",
@@ -76,6 +76,13 @@ class NodeServer:
json.dumps({"token": tok, "generated_at": time.time()}, indent=2),
encoding="utf-8",
)
# Restrict to owner-read-write only — the token grants full RPC
# access to the meet bot (start, transcribe, speak in meetings).
try:
tmp.chmod(0o600)
except (OSError, NotImplementedError):
# Best-effort on non-POSIX filesystems; mode is set on POSIX.
pass
tmp.replace(self.token_path)
self._token = tok
return tok

Some files were not shown because too many files have changed in this diff Show More