Compare commits

...

301 Commits

Author SHA1 Message Date
Brooklyn Nicholson f93a5c03be fix(ui-tui): use lipgloss-style space-fill backdrop instead of ░ chars
Pattern stolen from rogue / bubbletea: paint the scrim as lines of
SPACES with a backgroundColor (theme.color.statusBg), so the area reads
as a clean dimmed plane over the transcript instead of a noisy shade
texture. \`Dialog\` keeps \`opaque\` so its interior stays distinct from the
scrim. \`backdropChar\` prop dropped — single source of truth.
2026-05-07 11:06:36 -04:00
Brooklyn Nicholson 61b502853f feat(ui-tui): decorated character-fill backdrop for Overlay
Replace the empty `backgroundColor` scrim with an explicit character grid
(`░` by default, `backdropChar` prop to customize). Ink only paints
backgrounds where a Box has content, so an empty centering Box rendered
no scrim — that's why it looked black/white. Now every viewport cell is
painted with the backdrop char in `theme.color.border`, then `opaque` on
the Dialog blocks bleed-through inside the card.
2026-05-07 11:04:24 -04:00
Brooklyn Nicholson 5385b7573d feat(ui-tui): hot-key 'd' in /grid-test overlays a dialog on top
Drop a centered dialog over the active grid so the backdrop visibly dims
the grid behind it — easiest way to see the overlay primitive layer
inside a live overlay.
2026-05-07 11:01:53 -04:00
Brooklyn Nicholson bc7b84f575 feat(ui-tui): viewport overlay primitive with zones + faked backdrop
Add `Overlay` (zoned absolute positioning over the full viewport, optional
opaque backdrop) and `Dialog` (bordered card with title/hint slots) in
`components/overlay.tsx`. Mounted as a sibling of the main column inside
`AlternateScreen` so it stacks over transcript + composer without
disturbing the layout below.

Nine CSS-grid-style zones (corners + edges + center) drive placement
through deterministic Yoga `justifyContent` / `alignItems`, sourced from
`stdout` dims so positioning is depth-independent.

Wire a `dialog: DialogState | null` slot into the overlay store, gate
input on it, dismiss with Esc/q/Enter/Ctrl+C. Add `/dialog-test [zone]`
to drive every zone, and surface it in `/help` next to `/grid-test`.
2026-05-07 10:59:42 -04:00
Brooklyn Nicholson f3d958f482 feat(ui-tui): make widget grid composable + drop into TUI surfaces
Make `WidgetGrid` a real composition primitive: cells accept either a
width-aware `render(width, cell)` factory or a direct `children` subtree
(static, stateful, or another `WidgetGrid`). Layout core gains explicit
column counts and per-item `colStart` / `colSpan`, so sparse rows render
without collapsing holes. Cells clip with `overflow: hidden` so child
overflow can never bleed across cell or panel borders.

Wire the grid into the surfaces that were doing hand-rolled padding:
intro hero/session panel, generic `Panel` sections (incl. setup-required
and `/help`), the `/` slash-completion popover, the resume picker, and
the model/provider walkthrough. `/resume` is now full-overlay-span and
its rows are 1-col grid cells instead of fixed 30/30/title chunks.

Add `/grid-test` (debug-only): an interactive overlay that lets you
sweep cols/rows/gap/padding, toggle a sparse nested-preview pattern,
and Enter-zoom a parent cell into a fullscreen child grid. It dogfoods
the polymorphic API: parent labels are centered with a flex-grow box on
inner width (deterministic Yoga math, no `%` ambiguity) and the zoom
header itself is a 2-col `WidgetGrid`.

Tests: layout invariants (exact columns, sparse `colStart`, span
clamping) plus a component-level test that renders stateful direct
children and a nested grid inside cells.
2026-05-06 19:10:47 -05:00
Brooklyn Nicholson dff5dc34ce Merge origin/main into bb/widget-grid-slots
Resolve the appOverlays.tsx conflict by keeping the widget-grid overlay layout while adopting main's focused ui selectors for theme/session subscriptions.
2026-05-05 15:44:41 -05:00
Teknium 8fa5a03752 chore: AUTHOR_MAP entry for jethac 2026-05-05 13:43:04 -07:00
Jetha Chan b1476c76f6 docs(gemini): add Google Gemini guide 2026-05-05 13:43:04 -07:00
brooklyn! 794f48766c fix(tui): close slash parity gaps with CLI (#20339)
* fix(tui): close slash parity gaps with CLI

Route unsupported /skills subcommands through slash.exec, support /new <name>
titles, and handle /redraw natively so TUI behavior matches classic CLI. Also
filter gateway-only commands out of the TUI catalog while keeping /status
discoverable.

* fix(tui): run remaining CLI parity paths natively

Forward chat launch flags into the TUI runtime and handle live-session status
and skill reloads in the gateway process so TUI state no longer depends on the
slash worker's stale CLI instance.

* fix(tui): block stale snapshot restores

Prevent snapshot restore from running through the isolated slash worker because
it mutates disk state without refreshing the live TUI agent.

* chore: uptick

* fix(tui): guard async session title updates

Handle failures from the fire-and-forget session.title RPC so title-setting errors do not surface as unhandled promise rejections while preserving session-scoped messaging.
2026-05-05 15:42:39 -05:00
Jason Perlow acca3ec3af docs(providers): Together/Groq/Perplexity cookbook via custom_providers
Three worked recipes for OpenAI-compatible cloud providers, plus the
Copilot HTTP 401 auto-recovery info block and the GMI Cloud row in the
compatible providers table. All three additions were on the original
docs/custom-providers-cookbook branch but its merge base predated 1186
main commits, making the rebase impractical (84k+ line conflict).

Replays just the providers.md additions onto current main.
2026-05-05 13:42:20 -07:00
Wysie af312ccc97 docs: fix Camofox Docker setup instructions 2026-05-05 13:41:46 -07:00
JiaDe-Wu 7b05ccddc7 docs(bedrock): fix IAM permissions, add quickstart entry, add fallback provider, fix deployment section 2026-05-05 13:41:14 -07:00
Serhat Dolmac 84ec27616a docs(cli): expand hermes import reference — add description, warning, and examples 2026-05-05 13:40:26 -07:00
Teknium 9022804d78 feat(providers): make all 33 providers pluggable under plugins/model-providers/
Every provider profile is now a self-contained plugin under
plugins/model-providers/<name>/, mirroring the plugins/platforms/
pattern established for IRC and Teams. The ProviderProfile ABC
stays in providers/; the per-provider profile data moves out.

- plugins/model-providers/<name>/__init__.py calls register_provider()
- plugins/model-providers/<name>/plugin.yaml declares kind: model-provider
- providers/__init__.py._discover_providers() lazily scans bundled plugins
  then $HERMES_HOME/plugins/model-providers/<name>/ (user override path)
- User plugins with the same name override bundled ones (last-writer-wins
  in register_provider)
- Legacy providers/<name>.py layout still supported for back-compat with
  out-of-tree editable installs
- Hermes PluginManager: new kind=model-provider; skipped like memory
  plugins (providers/ discovery owns them); standalone plugins with
  register_provider+ProviderProfile in their __init__.py auto-coerce to
  this kind (same heuristic as memory providers)
- skip_names extended to include 'model-providers' so the general
  PluginManager doesn't double-scan the category
- 4 new tests in tests/providers/test_plugin_discovery.py covering
  bundled discovery, user override, and general-loader isolation
- Docs updated: website/docs/developer-guide/adding-providers.md,
  provider-runtime.md, providers/README.md, plugins/model-providers/README.md

No API break: auth.py / config.py / doctor.py / models.py / runtime_provider.py /
model_metadata.py / auxiliary_client.py / chat_completions.py / run_agent.py
all still consume providers via get_provider_profile() / list_providers() —
they just now see plugin-discovered entries instead of pkgutil-iterated ones.

Third parties can now drop a single directory into
~/.hermes/plugins/model-providers/<name>/ to add or override an inference
provider without touching the repo.
2026-05-05 13:40:01 -07:00
kshitijk4poor 20a4f79ed1 feat: provider modules — ProviderProfile ABC, 33 providers, fetch_models, transport single-path
Introduces providers/ package — single source of truth for every
inference provider. Adding a simple api-key provider now requires one
providers/<name>.py file with zero edits anywhere else.

What this PR ships:
- providers/ package (ProviderProfile ABC + 33 profiles across 4 api_modes)
- ProviderProfile declarative fields: name, api_mode, aliases, display_name,
  env_vars, base_url, models_url, auth_type, fallback_models, hostname,
  default_headers, fixed_temperature, default_max_tokens, default_aux_model
- 4 overridable hooks: prepare_messages, build_extra_body,
  build_api_kwargs_extras, fetch_models
- chat_completions.build_kwargs: profile path via _build_kwargs_from_profile,
  legacy flag path retained for lmstudio/tencent-tokenhub (which have
  session-aware reasoning probing that doesn't map cleanly to hooks yet)
- run_agent.py: profile path for all registered providers; legacy path
  variable scoping fixed (all flags defined before branching)
- Auto-wires: auth.PROVIDER_REGISTRY, models.CANONICAL_PROVIDERS,
  doctor health checks, config.OPTIONAL_ENV_VARS, model_metadata._URL_TO_PROVIDER
- GeminiProfile: thinking_config translation (native + openai-compat nested)
- New tests/providers/ (79 tests covering profile declarations, transport
  parity, hook overrides, e2e kwargs assembly)

Deltas vs original PR (salvaged onto current main):
- Added profiles: alibaba-coding-plan, azure-foundry, minimax-oauth
  (were added to main since original PR)
- Skipped profiles: lmstudio, tencent-tokenhub stay on legacy path (their
  reasoning_effort probing has no clean hook equivalent yet)
- Removed lmstudio alias from custom profile (it's a separate provider now)
- Skipped openrouter/custom from PROVIDER_REGISTRY auto-extension
  (resolve_provider special-cases them; adding breaks runtime resolution)
- runtime_provider: profile.api_mode only as fallback when URL detection
  finds nothing (was breaking minimax /v1 override)
- Preserved main's legacy-path improvements: deepseek reasoning_content
  preserve, gemini Gemma skip, OpenRouter response caching, Anthropic 1M
  beta recovery, etc.
- Kept agent/copilot_acp_client.py in place (rejected PR's relocation —
  main has 7 fixes landed since; relocation would revert them)
- _API_KEY_PROVIDER_AUX_MODELS alias kept for backward compat with existing
  test imports

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Closes #14418
2026-05-05 13:40:01 -07:00
Teknium 2b500ed68a chore: AUTHOR_MAP entry for asimons81 2026-05-05 13:34:03 -07:00
Tony Simons e4723f671a docs(cron): add context_from chaining section
Resolved merge against current main (new No-agent mode section added in parallel).

Co-authored-by: Tony Simons <tony@tonysimons.dev>
2026-05-05 13:34:03 -07:00
r266-tech b6e4e40df4 docs(guide): add Dispatch tools from slash commands section 2026-05-05 13:33:56 -07:00
r266-tech 91f339b981 docs(plugins): document ctx.dispatch_tool() in plugin capabilities table 2026-05-05 13:33:56 -07:00
Bartok9 72c33dfe95 docs(agent): remove stale BuiltinMemoryProvider references from memory module docstrings
The BuiltinMemoryProvider class was removed from the codebase but its
name lingered in the module-level docstrings of memory_manager.py and
memory_provider.py, creating false expectations:

- memory_manager.py docstring showed example code doing
  add_provider(BuiltinMemoryProvider(...)) which ImportError at runtime
- memory_provider.py docstring listed BuiltinMemoryProvider as
  'always present, not removable' — misleading for new contributors

The regression test (test_memory_user_id.py) already passes without
any reference to BuiltinMemoryProvider; it uses RecordingProvider
instances directly. The stale references were docs-only drift.

Update both docstrings to reflect the actual current architecture:
MemoryManager accepts external plugin providers only (one at a time).

Closes #14402
2026-05-05 13:33:49 -07:00
Teknium f67063ba81 feat(kanban): generic diagnostics engine for task distress signals (#20332)
* feat(kanban): generic diagnostics engine for task distress signals

Replaces the hallucination-specific ``warnings`` / ``RecoverySection``
surface (shipped in PR #20232) with a reusable diagnostic-rule engine
that covers five distress kinds in v1 and can be extended without
touching UI code. The "something's wrong with this task" signal is
no longer limited to phantom card ids.

Closes the follow-up from #20232 discussion.

New module
----------
``hermes_cli/kanban_diagnostics.py`` — stateless, no-side-effect rule
engine. Each rule is a pure function of
``(task, events, runs, now, config) -> list[Diagnostic]``. Registry
is a simple list; adding a new distress kind is one function + one
import, no UI or API changes required.

v1 rule set
-----------
* ``hallucinated_cards`` (error) — folds the existing
  ``completion_blocked_hallucination`` event into the new surface.
* ``prose_phantom_refs`` (warning) — folds
  ``suspected_hallucinated_references``.
* ``repeated_spawn_failures`` (error → critical at 2x threshold) —
  fires when ``tasks.spawn_failures >= 3``; suggests
  ``hermes -p <profile> doctor`` / ``auth``.
* ``repeated_crashes`` (error → critical) — fires after N consecutive
  ``crashed`` run outcomes with no successful completion between;
  suggests ``hermes kanban log <id>``.
* ``stuck_in_blocked`` (warning) — fires after 24h in ``blocked``
  state with no comments / unblock attempts; suggests commenting.

Every diagnostic carries structured ``actions`` (reclaim, reassign,
unblock, cli_hint, comment, open_docs) that render consistently in
both CLI and dashboard. Suggested actions are highlighted; generic
recovery actions (reclaim / reassign) are available on every kind as
fallbacks.

Diagnostics auto-clear when the underlying failure resolves — a
clean ``completed``/``edited`` event drops hallucination diagnostics,
a successful run drops crash diagnostics, a comment drops
stuck-blocked diagnostics. Audit events persist; the badge goes away.

API
---
``plugin_api.py``:
* ``/board`` now attaches ``diagnostics`` (full list) and
  ``warnings`` (compact summary with ``highest_severity``) per task.
* ``/tasks/{id}`` attaches diagnostics so the drawer's Diagnostics
  section auto-opens on flagged tasks.
* NEW ``/diagnostics`` endpoint — fleet-wide listing, filterable by
  severity, sorted critical-first.

CLI
---
* NEW ``hermes kanban diagnostics [--severity X] [--task id]
  [--json]`` — fleet view or single-task view, matches dashboard rule
  output so CLI users see the same picture.
* ``hermes kanban show <id>`` now renders a Diagnostics section near
  the top with severity markers + suggested actions.

Dashboard
---------
* Card badge is severity-coloured (⚠ amber warning, !! orange error,
  !!! red critical) using ``warnings.highest_severity``.
* Attention strip above the toolbar counts EVERY task with active
  diagnostics (not just hallucinations), severity-coloured, lists
  affected tasks with Open buttons when expanded.
* Drawer's old ``RecoverySection`` replaced with generic
  ``DiagnosticsSection`` rendering a card per active diagnostic:
  title + detail + structured data (task-id chips when payload keys
  look like id lists) + action buttons. Reassign profile picker is
  inline per-diagnostic. Clipboard fallback uses ``.catch()`` for
  environments where writeText rejects.
* Three-rung severity palette; amber for warning, orange for error,
  red for critical. Uses CSS variables so theming is straightforward.

Tests
-----
* NEW ``tests/hermes_cli/test_kanban_diagnostics.py`` — 14 unit tests
  covering each rule's positive/negative/threshold paths, severity
  sorting, broken-rule isolation, and sqlite3.Row integration.
* Dashboard plugin tests extended: ``/diagnostics`` endpoint (empty,
  populated, severity-filtered), ``/board`` exposes both diagnostic
  list and compact summary with ``highest_severity``.
* Existing hallucination-specific test (``test_board_surfaces_
  warnings_field_for_hallucinated_completions``) updated to reflect
  the new contract: warning summary keys by diagnostic kind
  (``hallucinated_cards``) not event kind.

379 kanban-suite tests pass (+16 net from this PR).

Live verification
-----------------
Seeded all 5 diagnostic kinds + one clean + one plain-running task
(7 total) into an isolated HERMES_HOME, spun up the dashboard, and
verified:

* Attention strip: shows ``!! 5 tasks need attention`` in the
  error-severity orange; Show expands to a list of 5 rows ordered
  critical > error > warning.
* Card badges: error tasks render ``!!`` orange, warning tasks
  render ``⚠`` amber, clean and plain-running tasks render no badge.
* Each of the 5 rules opens a correctly-coloured, correctly-styled
  diagnostic card in the drawer with its specific suggested action.
* Live reassign from a diagnostic card flipped
  ``broken-ml-worker → alice`` and the drawer refreshed with the
  new assignee + the same diagnostic still firing (correct:
  spawn_failures counter hasn't reset yet).
* CLI ``hermes kanban diagnostics`` prints all 5 in severity order;
  ``--severity error`` narrows to 3; ``kanban show <id>`` includes
  the Diagnostics block at the top with suggested action hint.

Migration note
--------------
The old ``warnings`` shape (``{count, kinds, latest_at}``) is
preserved on the API but ``kinds`` now keys by diagnostic kind
(``hallucinated_cards``) instead of event kind
(``completion_blocked_hallucination``). ``highest_severity`` is a
new required field. The dashboard was the only consumer and has
been updated in the same commit; external API consumers of the
``warnings`` field will need to update their kind-match logic.

* feat(kanban/diagnostics): lead titles with the actual error text

The generic 'Worker crashed N runs in a row' / 'Worker failed to spawn
N times' titles buried the actual cause in the data section. Operators
had to open logs or expand the diagnostic to see WHY the worker is
stuck — rate-limit vs insufficient quota vs bad auth vs context
overflow vs network blip all looked identical at a glance.

New titles:

  Agent crashed 3x: openai: 429 Too Many Requests - rate limit reached
  Agent crashed 3x: anthropic: 402 insufficient_quota - credit balance
  Agent crashed 3x: provider auth error: 401 Unauthorized
  Agent spawn failed 4x: insufficient_quota: You exceeded your current

Detail keeps the full error snippet (capped at 500 chars + ellipsis
for tracebacks). Title takes the first line capped at 160 chars.
Fallback title if no error recorded stays honest ('no error recorded').

Tests: 4 new cases covering 429/billing/spawn/truncation. 383 total
pass (+4).

Live-verified on dashboard with 6 seeded scenarios
(rate-limit, billing, auth, context, network, spawn-billing) —
each card title leads with the actionable error text.
2026-05-05 13:32:42 -07:00
r266-tech ec7f2f249e docs(cli): add skills reset subcommand to CLI reference
PR #11468 added `hermes skills reset` but cli-commands.md was not
updated. Adds the subcommand to the table and usage examples.

Closes #11543
2026-05-05 13:32:28 -07:00
Brooklyn Nicholson 00d25595c1 perf(ui-tui): narrow overlay subscriptions to focused selectors
Subscribe overlay components to computed theme/session selectors instead of the full UI store so unrelated UI state updates trigger fewer overlay renders.
2026-05-05 13:31:47 -07:00
r266-tech ee502e5640 docs(cli): add --deliver-only flag to hermes webhook subscribe
PR #12473 (merged 2026-04-19) added a new --deliver-only flag to
`hermes webhook subscribe` for zero-LLM direct delivery, but
website/docs/reference/cli-commands.md options table did not
reference it. Add the row so CLI users can discover the flag from
the reference page instead of having to read the source.
2026-05-05 13:30:06 -07:00
Teknium 0dc677f071 docs(skill/hermes-agent): sync slash commands + add durable-systems section
Mirrors the AGENTS.md #20226 additions (Toolsets / Delegation / Curator /
Cron / Kanban) into the user-facing hermes-agent skill, and closes the
drift in the in-session slash command list.

User report (wxrrior in Discord): the skill did not mention /goal, so a
brand-new session answering "/hermes-agent do you have any info on /goal"
confidently said it did not exist. Cross-check against the CommandDef
registry found 16 commands missing from the static list: /goal, /agents,
/busy, /copy, /curator, /debug, /footer, /gquota, /indicator, /kanban,
/redraw, /reload, /reload-skills, /snapshot, /steer, /topic.

Changes:
- Slash Commands header now tells the reader to run /help or check the
  live docs reference as the source of truth, and names the registry
  of record (hermes_cli/commands.py) so future drift gets flagged
  honestly instead of answered confidently wrong.
- Added all 16 missing commands, slotted into existing subsections
  (/goal and /steer in Session; /busy + /indicator + /footer in
  Configuration; /curator + /kanban + /reload-skills + /reload in
  Tools & Skills; /topic in Gateway; /copy in Utility; /gquota +
  /debug in Info).
- Toolsets table updated to the authoritative 30-key list from
  toolsets.py (added kanban, yuanbao, spotify, safe, debugging, video,
  feishu_doc, feishu_drive, discord, discord_admin, clarify; previously
  stopped at 20 keys).
- New "Durable & Background Systems" section before Troubleshooting
  covers Delegation, Cron, Curator, Kanban - each with a short rundown
  of CLI verbs, key invariants, and a pointer to the user-facing docs.
  Mirrors AGENTS.md #20226 but in the skill's user-facing register.
- Bumped version 2.0.0 -> 2.1.0.
2026-05-05 13:29:39 -07:00
Brooklyn Nicholson 4532182bda refactor(ui-tui): defer overlay selector perf split
Revert focused overlay selector subscriptions from the widget-grid branch so the layout PR stays scoped to grid behavior and picker sizing only.
2026-05-05 15:29:30 -05:00
r266-tech c28c2a2380 docs(tts): document per-provider max_text_length caps
PR #13743 replaced the global MAX_TEXT_LENGTH=4000 with a per-provider
table and a user-override 'max_text_length:' key, but the user-guide
TTS page documented no length behaviour at all. Users hitting truncation
had no way to discover the new caps or the override.

Add an 'Input length limits' subsection after the existing Configuration
YAML block: provider default caps (Edge 5000 / OpenAI 4096 / xAI 15000 /
MiniMax 10000 / Mistral 4000 / Gemini 5000 / ElevenLabs model-aware /
NeuTTS,KittenTTS 2000), ElevenLabs model_id -> cap table (5k-40k), an
override example, and the validation rules (non-positive / non-integer /
boolean values fall through to the provider default).
2026-05-05 13:28:53 -07:00
Teknium d5357f816d refactor(telegram): make typing thread-id resolver symmetric with send
Mirror _message_thread_id_for_typing() with _message_thread_id_for_send():
both now map the General forum topic (thread id "1") to None upfront.

That removes the need for the retry-without-thread fallback in send_typing()
entirely — if _message_thread_id_for_typing() returns a non-None value, it's
a real user-created topic and falling back to the root chat is never correct.
If Telegram rejects the typing action (e.g. topic deleted mid-session), we
swallow it at debug level instead of bleeding the indicator into All Messages.

Updates the General-topic typing regression test to assert the new single-call
contract.
2026-05-05 13:28:08 -07:00
helix4u 41545f7ec5 fix(telegram): keep DM topic typing scoped 2026-05-05 13:28:08 -07:00
WadydX 0664bf961a docs: fix broken nix-setup anchor for container-aware CLI 2026-05-05 13:27:38 -07:00
WadydX 58f93fb7d3 docs: remove dead papers.md link from saelens references 2026-05-05 13:27:12 -07:00
WadydX 2d5f20684a docs: remove dead reference links in flash-attention skill 2026-05-05 13:26:45 -07:00
Teknium c85a25faaa chore: AUTHOR_MAP entry for Beandon13 2026-05-05 13:26:12 -07:00
Brandon Zarnitz 27a8ba42ed docs(prompt): clarify supported customization surfaces 2026-05-05 13:26:12 -07:00
LeonSGP43 ce9888b52a docs(config): fix fallback provider config paths 2026-05-05 13:24:53 -07:00
beardthelion a6289927d3 docs(web_tools): correct web_extract summarizer timeout comment
The comment at tools/web_tools.py:700-702 stated the runtime default for
auxiliary.web_extract.timeout is 360s. The actual runtime default is 30s
(_DEFAULT_AUX_TIMEOUT in agent/auxiliary_client.py:3140), used by
_get_task_timeout when no auxiliary.web_extract.timeout key is present in
config.yaml.

The 360s figure is the config template default written by
hermes_cli/config.py:697 into freshly-generated config.yaml files. It only
takes effect when that key exists in the user's config — not as a fallback.
Users on configs that predate commit 20b4060d (Apr 5, 2026), or who removed
the key, fall through to the 30s _DEFAULT_AUX_TIMEOUT runtime default.

The comment was introduced in 20b4060d alongside the template-default bump
from 30 to 360. The runtime default in auxiliary_client.py was not changed
in that commit and has remained 30s since 839d9d74 (Mar 28, 2026).
2026-05-05 13:24:19 -07:00
Brooklyn Nicholson dbbd5512d5 feat(ui-tui): add responsive overlay widget grid
Introduce a shared widget grid and width-capped overlay pickers so wide terminals can tile widgets cleanly while reducing overlay rerenders via focused ui store selectors.
2026-05-05 15:09:18 -05:00
Siddharth Balyan 3b750715a3 fix: resolve lazy session creation regressions (#18370 fallout) (#20363)
Fix three regressions introduced by PR #18370 (lazy session creation):

1. _finalize_session() uses stale session_key after compression (#20001)
2. session_key not synced after auto-compression in run_conversation (#20001)
3. pending_title ValueError leaves title wedged forever (#19029)
4. Gateway silently swallows null responses when agent did work (#18765)
5. One-time cleanup for accumulated ghost compression continuations (#20001)

Changes:
- tui_gateway/server.py: _finalize_session() now uses agent.session_id
  (falls back to session_key when agent is None). Refactor
  _sync_session_key_after_compress() with clear_pending_title and
  restart_slash_worker policy flags. Call it post-run_conversation()
  to sync session_key after auto-compression. Add ValueError handler
  to pending_title flush.
- gateway/run.py: Extract _normalize_empty_agent_response() helper that
  consolidates failed/partial/null response handling. Surfaces user-facing
  error when agent did work (api_calls > 0) but returned no text.
- hermes_state.py: Add finalize_orphaned_compression_sessions() — marks
  ghost continuation sessions as ended (non-destructive, preserves data).
- cli.py: One-time startup migration for orphaned compression sessions.

Test changes:
- tests/test_tui_gateway_server.py: Update pending_title ValueError test
  for post-#18370 architecture (title applied post-message, not at create).
- tests/test_lazy_session_regressions.py: 14 new regression tests covering
  all fixed paths.
2026-05-06 01:11:49 +05:30
Teknium 0397be5939 feat(tui): remove /provider alias for /model (#20358)
/model is the canonical command; /provider was a redundant alias that
dispatched to the same ModelPicker overlay. Drop the alias, the regex
branch in useCompletion, and the alias-coverage test.
2026-05-05 12:23:21 -07:00
Teknium 87b113c2e3 chore: AUTHOR_MAP entry for Tkander1715 2026-05-05 10:18:58 -07:00
Traemond Anderson 60235dba5e feat(cli): add list_picker_providers for credential-filtered picker
The Telegram/Discord /model pickers currently call
list_authenticated_providers(), which returns every provider whose
credentials resolve locally and every model in its curated snapshot.
Two failure modes fall out:

- OpenRouter rows can include IDs the live catalog no longer carries.
- Provider rows can surface with zero callable models (e.g. a slug
  whose credential pool entry exists but has nothing behind it).

list_picker_providers() wraps the base function and post-processes the
result so the interactive picker only shows models the user can
actually select:

- OpenRouter's models come from fetch_openrouter_models() (live-catalog
  filtered against the curated OPENROUTER_MODELS snapshot).
- Rows with an empty models list are dropped, except custom endpoints
  (is_user_defined=True with an api_url) where the user may enter
  model ids manually.
- All other fields pass through unchanged.

The gateway /model handler switches to the new helper for the
interactive picker payload only. Typed /model <name> and the text
fallback list stay on list_authenticated_providers() so nothing is
hidden from power users or platforms without a picker.

Covered by nine focused unit tests in
tests/hermes_cli/test_list_picker_providers.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 10:18:58 -07:00
Teknium cc2c820975 chore: AUTHOR_MAP entry for Aslaaen 2026-05-05 10:18:28 -07:00
Aslaaen e8e9147377 fix(acp): preserve assistant reasoning metadata in session persistence 2026-05-05 10:18:28 -07:00
Teknium dbe9b15fa1 chore: AUTHOR_MAP entry for zeejaytan 2026-05-05 10:15:57 -07:00
Zeejay f8ba265340 fix(aux): trigger fallback on 429 rate-limit errors in auxiliary client
When a provider returns a 429 rate-limit error (not billing-related),
the auxiliary client's call_llm/async_call_llm previously did NOT trigger
the fallback chain. This caused auxiliary tasks like session_search to
exhaust all 3 retries against the same rate-limited endpoint, losing
session metadata that depended on the summarization completing.

Root cause: `_is_payment_error()` only matched 429s containing billing
keywords ("credits", "insufficient funds", etc.). Provider-specific
rate-limit messages like Nous's "Hold up for a bit, you've exceeded the
rate limit on your API key" didn't match, so `_is_payment_error` returned
False, `_is_connection_error` returned False, and `should_fallback` was
False — all retries hit the same rate-limited provider.

Fix:
- New `_is_rate_limit_error()` function that detects 429 + rate-limit
  keywords, generic 429 without billing keywords, and OpenAI SDK
  `RateLimitError` class instances (which may omit .status_code).
- Updated `should_fallback` in both `call_llm` and `async_call_llm` to
  include `_is_rate_limit_error`.
- Updated the max_tokens retry path to also check for rate-limit errors.
- Updated the reason string to include "rate limit".

This complements the Nous rate guard (PR #10568) which prevents new calls
to Nous when already rate-limited — this fix handles the case where a
request is already in flight when the 429 arrives.

Related: #8023, #12554, #11034
Co-authored-by: Zeejay <zjtan1@gmail.com>
2026-05-05 10:15:57 -07:00
Teknium 8c0f254c06 chore: AUTHOR_MAP entry for LeonSGP43 2026-05-05 10:15:31 -07:00
LeonSGP43 244bacd0dc fix(skills): support category-qualified local skill names 2026-05-05 10:15:31 -07:00
Teknium 4553e32bc4 chore: AUTHOR_MAP entry for Es1la 2026-05-05 10:15:09 -07:00
Es1la a877c3f6d9 fix(feishu): tolerate malformed dedup timestamps
Salvages @Es1la's PR #13632 — a non-numeric timestamp in the persisted
feishu dedup state crashed adapter startup with ValueError/TypeError
from the unguarded float() call. Wrap the float() conversion in
try/except; skip the bad key and keep loading the rest.

The original PR also restructured existing TestDedupTTL tests to use
tempfile.TemporaryDirectory + HERMES_HOME patching — that was
test-hygiene scope creep unrelated to the bug. Kept only the
malformed-timestamp fix and added a focused regression test.
2026-05-05 10:15:09 -07:00
Teknium 77a102b7de chore: AUTHOR_MAP entry for jkausel-ai 2026-05-05 10:14:48 -07:00
Justin Kausel 526742199b Prefer fallback for Gemini CloudCode rate limits 2026-05-05 10:14:48 -07:00
Teknium 12135b4c8a chore: AUTHOR_MAP entry for wysie 2026-05-05 10:14:17 -07:00
Wysie 0120d8f31e fix: merge plugin tools into builtin toolsets 2026-05-05 10:14:17 -07:00
Teknium d9f0875591 chore: AUTHOR_MAP entry for hharry11 2026-05-05 10:13:55 -07:00
hharry11 247c9d468c fix(gateway): ensure deterministic thread eviction in helpers 2026-05-05 10:13:55 -07:00
Teknium 935cf2fcca chore: AUTHOR_MAP entry for JTroyerOvermatch 2026-05-05 10:13:34 -07:00
Jonathan Troyer 6430d67569 fix(openrouter): use canonical X-Title attribution header
OpenRouter's dashboard attributes usage via the `X-Title` header.
Hermes was sending `X-OpenRouter-Title`, which OpenRouter does not
recognize, so Hermes usage showed up unlabeled. Rename to `X-Title`
to match the canonical header (already used elsewhere in the same
file via _AI_GATEWAY_HEADERS).

Salvages the core fix from @JTroyerOvermatch's PR #13649. Dropped the
PR's `HERMES_OPENROUTER_TITLE` / `HERMES_OPENROUTER_REFERER` env-var
override plumbing per the '.env is for secrets only' policy — if
per-deployment attribution is needed later it should go under
`openrouter.title` / `openrouter.referer` in config.yaml instead.
2026-05-05 10:13:34 -07:00
Teknium 269be4ec84 chore: AUTHOR_MAP entry for Bongulielmi 2026-05-05 10:13:13 -07:00
Remigio Bongulielmi d8097d587f refactor(env): use shared Hermes dotenv loader 2026-05-05 10:13:13 -07:00
Teknium c62d8c9b74 chore: AUTHOR_MAP entry for Bartok9 2026-05-05 10:12:40 -07:00
Bartok dad62c4c47 fix(whatsapp): auto-convert mp3/wav to ogg/opus in send-media for native voice bubbles
WhatsApp bridge (bridge.js) only sets ptt:true when file extension is .ogg
or .opus, causing mp3/wav files (from Edge TTS, NeuTTS, etc.) to arrive
as file attachments instead of voice bubbles — silently, with no error.

Fix: when audio type is sent with a non-ogg/opus format, run ffmpeg
conversion to ogg/opus in a temp file before sending. This makes
send_voice() self-sufficient regardless of what format the caller provides.

Fallback: if ffmpeg is unavailable, original buffer is sent (previous
behaviour) with a console.warn — no crash.

Addresses veloguardian's review comment on PR #4992.
2026-05-05 10:12:40 -07:00
Teknium 45949e944a chore: AUTHOR_MAP entry for Junass1 2026-05-05 10:05:23 -07:00
Teknium e4e0090b54 test(acp): regression for #13675 — save_session preserves existing messages on encode failure 2026-05-05 10:05:23 -07:00
Junass1 5795b3be4e fix(acp): use SessionDB.replace_messages for atomic history rewrite
ACP's save_session() did a non-atomic clear_messages() + append_message()
loop. If any message hit an exception mid-loop (bad tool_call shape, etc.),
the DELETE had already committed and the persisted conversation was lost.

SessionDB.replace_messages() wraps DELETE + bulk INSERT in a single
BEGIN IMMEDIATE transaction that rolls back on any exception, so a bad
message can no longer clobber previously-persisted history.

Salvages @Awsh1's PR #13675 — uses the existing replace_messages()
helper (which covers more message fields than the PR's own copy)
instead of adding a duplicate.
2026-05-05 10:05:23 -07:00
Justin Kausel e805380b82 Discover plugin commands during CLI dispatch 2026-05-05 09:58:37 -07:00
sprmn24 ecc909de38 fix(session): serialize JSONL transcript appends under existing lock 2026-05-05 09:57:31 -07:00
sprmn24 db84c1535d fix(ssh): add scp availability check to preflight validation 2026-05-05 09:57:23 -07:00
WuTianyi 8e18d10318 fix(feishu): force text mode for markdown tables
Feishu post-type 'md' elements do not render markdown tables.
When table content is sent as post (triggered by **bold** matching
_MARKDOWN_HINT_RE), the message appears blank on the client.

Add _MARKDOWN_TABLE_RE to detect markdown table syntax and force
text mode for table content, ensuring it is visible as plain text.
2026-05-05 09:57:14 -07:00
Teknium b014a3d315 test(cron): update _isolate_tick_lock fixture for _get_lock_paths
After PR #13725 replaced the module-level _LOCK_DIR/_LOCK_FILE constants
with a dynamic _get_lock_paths() helper, the xdist-isolation fixture
needs to patch the function instead of the removed constants.
2026-05-05 09:57:06 -07:00
邓taoyuan 969bfff449 fix: merge _get_hermes_home() dynamic resolution and feishu receive_id_type detection
- scheduler.py: Replace static _hermes_home with dynamic _get_hermes_home() function
  to support profile switching at runtime (HERMES_HOME override)
- scheduler.py: Replace static _LOCK_DIR/_LOCK_FILE with _get_lock_paths() function
  for profile-aware lock path resolution
- feishu.py: Add receive_id_type detection (oc_/ou_ -> open_id, else chat_id)
  to fix Feishu API '[230001] ext=invalid receive_id' error for user DMs
2026-05-05 09:57:06 -07:00
Teknium de9238d37e feat(kanban): hallucination gate + recovery UX for worker-created-card claims (#20232)
Workers completing a kanban task can now claim the ids of cards they
created via an optional ``created_cards`` field on ``kanban_complete``.
The kernel verifies each id exists and was created by the completing
worker's profile; any phantom id blocks the completion with a
``HallucinatedCardsError`` and records a
``completion_blocked_hallucination`` event on the task so the rejected
attempt is auditable. Successful completions also get a non-blocking
prose-scan pass over their ``summary`` + ``result`` that emits a
``suspected_hallucinated_references`` event for any ``t_<hex>``
reference that doesn't resolve.

Closes #20017.

Recovery UX (kernel + CLI + dashboard)
--------------------------------------

A structural gate alone isn't enough — operators also need to see and
act on stuck workers, especially when a profile's model is the root
cause. This PR ships the full loop:

* ``kanban_db.reclaim_task(task_id)`` — operator-driven reclaim that
  releases an active worker claim immediately (unlike
  ``release_stale_claims`` which only acts after claim_expires has
  passed). Emits a ``reclaimed`` event with ``manual: True`` payload.
* ``kanban_db.reassign_task(task_id, profile, reclaim_first=...)`` —
  switch a task to a different profile, optionally reclaiming a stuck
  running worker in the same call.
* ``hermes kanban reclaim <id> [--reason ...]`` and
  ``hermes kanban reassign <id> <profile> [--reclaim] [--reason ...]``
  CLI subcommands wired through to the same helpers.
* ``POST /api/plugins/kanban/tasks/{id}/reclaim`` and
  ``POST /api/plugins/kanban/tasks/{id}/reassign`` endpoints on the
  dashboard plugin.

Dashboard surfacing
-------------------

* ⚠ **warning badge** on cards with active hallucination events.
* **attention strip** at the top of the board listing all flagged
  tasks; dismissible per session.
* **events callout** in the task drawer — hallucination events render
  with a red left border, amber icon, and phantom ids as styled chips.
* **recovery section** in the task drawer with three actions: Reclaim,
  Reassign (with profile picker + reclaim-first checkbox), and a
  copy-to-clipboard hint for ``hermes -p <profile> model`` since
  profile config lives on disk and can't be edited from the browser.
  Auto-opens when the task has warnings, collapsed otherwise.
  Keyed by task id so state doesn't leak between drawers.

Active-vs-stale rule: warnings clear when a clean ``completed`` or
``edited`` event supersedes the hallucination, so recovery is never
permanently stigmatising — the audit events persist for debugging but
the badge goes away once the worker succeeds.

Skill updates
-------------

* ``skills/devops/kanban-worker/SKILL.md`` documents the
  ``created_cards`` contract with good/bad examples.
* ``skills/devops/kanban-orchestrator/SKILL.md`` gains a "Recovering
  stuck workers" section with the three actions and when to use each.

Tests
-----

* Kernel gate: verified-cards manifest, phantom rejection + audit
  event, cross-worker rejection, prose scan positive + negative.
* Recovery helpers: reclaim on running task, reclaim on non-running
  returns False, reassign refuses running without reclaim_first,
  reassign with reclaim_first succeeds on running.
* API endpoints: warnings field present on /board and /tasks/:id,
  warnings cleared after clean completion, reclaim 200 + 409 paths,
  reassign 200 + 409 + reclaim_first paths.
* CLI smoke: reclaim + reassign subcommands.

Live-verified end-to-end on a dashboard with seeded scenarios:
attention strip renders, badges land on the right cards, drawer
callout shows phantom chips, Reclaim on a running task flips status to
ready + emits manual reclaimed event + refreshes the drawer,
Reassign swaps the assignee and triggers board refresh.

359/359 kanban-suite tests pass
(test_kanban_{db,cli,boards,core_functionality} + dashboard + tools).
2026-05-05 08:06:55 -07:00
Teknium 7de3c86c5a feat(i18n): add display.language for static message translation (zh/ja/de/es) (#20231)
* revert(gateway): remove stale-code self-check and auto-restart

Removes the _detect_stale_code / _trigger_stale_code_restart mechanism
introduced in #17648 and iterated in #19740. On every incoming message
the gateway compared the boot-time git HEAD SHA to the current SHA on
disk, and if they differed it would reply with

    Gateway code was updated in the background --
    restarting this gateway so your next message runs
    on the new code. Please retry in a moment.

and then kick off a graceful restart. This is unwanted behaviour:
users who run a long-lived gateway and do their own ad-hoc git
operations on the checkout end up with their chat interrupted and
the current message dropped every time HEAD moves, with no way to
opt out.

If an operator really needs the old protection against stale
sys.modules after "hermes update", the SIGKILL-survivor sweep in
hermes update (hermes_cli/main.py, also tagged #17648) already
handles the supervisor-respawn case on its own.

Removed:
  gateway/run.py:
    - _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS
    - _read_git_head_sha(), _compute_repo_mtime() module helpers
    - class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha /
      _stale_code_restart_triggered defaults
    - __init__ boot-snapshot block (_boot_*, _cached_current_sha*,
      _repo_root_for_staleness, _stale_code_notified)
    - _current_git_sha_cached(), _detect_stale_code(),
      _trigger_stale_code_restart() methods
    - stale-code check + user-facing restart notice at the top of
      _handle_message()
  tests/gateway/test_stale_code_self_check.py (deleted, 412 lines)

No new logic added. Zero remaining references to any removed
symbol. Gateway test suite passes the same 4589 tests it passed
before; the 3 pre-existing unrelated failures (discord free-channel,
feishu bot admission, teams typing) are unchanged by this commit.

* feat(i18n): add display.language for static message translation (zh/ja/de/es)

Adds a thin-slice i18n layer covering the highest-impact static user-facing
messages: the CLI dangerous-command approval prompt and a handful of gateway
slash-command replies (restart-drain, goal cleared, approval expired, config
read/save errors).

Out of scope (stays English): agent responses, log lines, tool outputs,
slash-command descriptions, error tracebacks.

Infrastructure:
- agent/i18n.py: catalog loader, t() helper, language resolution
  (HERMES_LANGUAGE env var > display.language config > en)
- locales/{en,zh,ja,de,es}.yaml: ~19 translated strings per language
- display.language in DEFAULT_CONFIG (hermes_cli/config.py)

Tests:
- tests/agent/test_i18n.py: 21 tests covering catalog parity, placeholder
  parity across locales, fallback behavior, env-var override, alias
  normalization, missing-key graceful degradation.

Docs:
- website/docs/user-guide/configuration.md: display.language entry plus a
  short section explaining scope so users don't expect agent responses to
  translate via this knob.
2026-05-05 08:03:07 -07:00
Teknium b7bd177105 docs(AGENTS.md): add curator/cron/delegation/toolsets, fix plugin tree (#20226)
* docs(AGENTS.md): add curator/cron/delegation/toolsets, fix plugin tree, frontmatter, auto-discovery caveat

Closes #19101 and #19107 (@pty819).

Verified 16 claims from those two issues against current main. 12 were
real gaps; 2 were generated/hallucinated (#10 unverified --now flag is
actually real and already cited in AGENTS.md; #11 stale PR refs #5587
and #4950 do not appear in AGENTS.md at all); 2 were low-prio nits
(memory provider hierarchy, --now scope enumeration) deferred.

Changes:
- Project tree: add yuanbao to platforms comment; expand plugins/
  subtree with real directory names (kanban, hermes-achievements,
  observability, image_gen) instead of vague '<others>'.
- Test-count blurb: 15k/700 Apr → 17k/900 May (verified: 17,375 test
  defs, 915 files).
- Adding New Tools: clarify that auto-discovery wires up schemas but
  the tool only reaches an agent if its name is added to a toolset in
  toolsets.py. _HERMES_CORE_TOOLS is not dead code.
- Adding Configuration: enumerate top-level config.yaml sections
  including auxiliary and curator; note auxiliary is per-task
  overrides for side-LLM work.
- SKILL.md frontmatter: add author, license, related_skills. Note
  top-level tags/category are mirrored from metadata.hermes.*.
- New section 'Toolsets' — enumerates the 30 current TOOLSETS keys
  (including yuanbao, kanban, moa, spotify, safe, debugging).
- New section 'Delegation (delegate_task)' — sync semantics, batch
  mode, leaf vs orchestrator roles, config knobs, durability caveat.
- New section 'Curator (skill lifecycle)' — core files, 11 CLI verbs,
  telemetry sidecar, invariants (pin/delete split after PR #20220,
  bundled/hub off-limits), curator.* config section.
- New section 'Cron (scheduled jobs)' — 4 schedule formats, 7 CLI
  verbs, per-job fields, 3-min hard interrupt, catchup/grace windows,
  tick.lock, cron→session isolation.

Skipped (invalid claims):
- #19107 item 10: --now is real (hermes_cli/skills_hub.py:624/966/1013/1470)
- #19107 item 11: no '#5587' or '#4950' or 'async_delegation' in AGENTS.md

* docs(AGENTS.md): add Kanban section

Adds a Kanban entry alongside Curator / Cron / Delegation so the major
durable background systems are all represented. Covers the CLI verbs,
the HERMES_KANBAN_TASK-gated worker toolset, the in-gateway dispatcher,
plugin assets, and the board/tenant isolation model. Points at the full
742-line user docs for detail.
2026-05-05 07:56:29 -07:00
Teknium 7530ce04e0 chore: AUTHOR_MAP entry for MaHaoHao-ch 2026-05-05 06:12:42 -07:00
MaHaoHao-ch 02147cc850 fix(cli): sanitize bracketed paste markers during setup
Strip bracketed-paste control sequences from setup prompt input so pasted API keys work on Linux and WSL terminals, and add regression tests for normal/password prompts.

Closes #16491
2026-05-05 06:12:42 -07:00
Teknium 8ebb81fd76 chore: AUTHOR_MAP entry for rxdxxxx 2026-05-05 06:12:11 -07:00
rxdxxxx c46bc92949 fix(run_agent): use aux provider for compression context length lookup
Each auxiliary model must be resolved with its own provider so that
provider-specific paths (e.g. Bedrock static table, OpenRouter API)
are invoked for the correct client, not inherited from the main model.

When the main model is Bedrock, passing self.provider unconditionally
to get_model_context_length() for the aux model caused the Bedrock
static table hard-intercept (step 1b) to fire for non-Bedrock models,
returning BEDROCK_DEFAULT_CONTEXT_LENGTH=128K instead of the model's
real context window — triggering a false compression warning every session.

Fix: pass _aux_cfg_provider when explicitly set, falling back to
self.provider only when the aux provider is unset or "auto".

Closes #12977
Related: #13807, #17460
2026-05-05 06:12:11 -07:00
Teknium fb311952d7 chore: AUTHOR_MAP entry for Krionex 2026-05-05 06:11:38 -07:00
Teknium 285c208cf7 fix(gateway): also tolerate malformed env vars in custom human-delay mode
Widens @Krionex's PR #16933 fix to cover the second bug class at the sibling
site. natural mode used to pass env values through int() before the PR
caught mis-typed values crashing the gateway; custom mode had the exact
same bug one branch away (HERMES_HUMAN_DELAY_MIN_MS=oops in custom mode
still crashed). Same try/except/fallback pattern, scoped to the two
int() calls that feed random.uniform().
2026-05-05 06:11:38 -07:00
Krionex 3b16c590e0 fix(gateway): ignore malformed custom delay env vars in natural mode 2026-05-05 06:11:38 -07:00
Teknium 349d0da07e chore: AUTHOR_MAP entry for novax635 2026-05-05 06:11:03 -07:00
novax635 4e6f51167d fix(cli): fall back on invalid HERMES_MAX_ITERATIONS 2026-05-05 06:11:03 -07:00
Teknium 37b5731694 chore: AUTHOR_MAP entry for npmisantosh 2026-05-05 06:08:14 -07:00
Santosh f6677748a0 fix(claw): handle missing dir in _scan_workspace_state 2026-05-05 06:08:14 -07:00
Teknium f844e516d8 chore: AUTHOR_MAP entry for agentlinker 2026-05-05 06:07:44 -07:00
Leon 19eebf6e0d fix(openrouter): treat xiaomi models as reasoning-capable 2026-05-05 06:07:44 -07:00
vominh1919 96514de472 fix(auxiliary): avoid locking into custom path when api_key is empty
When auxiliary.<task> config has base_url set but api_key is empty
(common when user expects env var fallback), _resolve_task_provider_model()
returned provider="custom" with api_key=None. This caused downstream
client construction to make API calls without an Authorization header,
resulting in HTTP 401 errors.

Fix: only return "custom" when BOTH cfg_base_url AND cfg_api_key are
non-empty. When base_url is set without api_key but with a known
provider (e.g. "openrouter"), pass through to that provider so it can
resolve credentials from environment variables.

Fixes #16829
2026-05-05 06:07:07 -07:00
Teknium c7fc5af122 chore: AUTHOR_MAP entry for tangyuanjc 2026-05-05 06:04:20 -07:00
JC的AI分身 80b386a472 fix(feishu): refresh bot identity during hydration 2026-05-05 06:04:20 -07:00
Teknium 314361733f test(api_server): _run_agent result now carries session_id for #16938 2026-05-05 06:01:03 -07:00
vominh1919 7f735b4db2 fix: return effective session_id after context compression (#16938)
When context compression rotates the agent's session_id to a new
child session, the API server was still returning the stale parent
session_id in the X-Hermes-Session-Id response header.

This caused external clients to keep sending the old session_id,
loading uncompressed parent history instead of the compressed
continuation.

Fix: _run_agent() now includes the effective session_id in its
result dict, and the response header uses it instead of the
original provided session_id.
2026-05-05 06:01:03 -07:00
Hafiy Zakaria 34c6f93496 fix: resolve model.aliases from config.yaml in /model alias resolution
hermes config set model.aliases.xxx commands write to the model.aliases
nested key, but _load_direct_aliases() only read from the top-level
model_aliases key. This meant aliases set via hermes config set were
invisible to the /model command, and unrecognised inputs fell through
to the DeepSeek normaliser which mapped everything to deepseek-chat.

Add a second pass in _load_direct_aliases() that reads model.aliases
and converts string-value entries (provider/model format) into
DirectAlias objects. The provider is parsed from the slash prefix;
if no slash, the current default provider from config is used.

Also prevent simple aliases from overriding explicit model_aliases
dict entries when both exist.
2026-05-05 05:49:01 -07:00
briandevans c1a2710a32 test(aux): cover effort: 0 fallback in Codex reasoning translation
Copilot review on PR #17012 noted the docstring/comment lists `0`
among the falsy effort values that fall back to `medium`, but the
existing regression tests only cover `None` and `""`. Add the third
case to lock in the full contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 05:47:50 -07:00
briandevans 9e893d16d1 fix(aux): default Codex reasoning effort to medium when extra_body.reasoning.effort is falsy
auxiliary.<task>.extra_body.reasoning, but the new translation path in
_CodexCompletionsAdapter.create() reads the effort with
``reasoning_cfg.get("effort", "medium")``.  That returns the configured
value verbatim when the key is present, so ``effort: null`` /
``effort: ""`` (both common YAML shapes) flow through as
``{"effort": null, "summary": "auto"}`` and Codex rejects the request
with "Invalid value for parameter ``reasoning.effort``".

agent/transports/codex.py::build_kwargs() — which the new adapter is
documented to mirror — uses a truthy check (``elif
reasoning_config.get("effort"):``) so the same falsy values keep the
"medium" default.  Switch the auxiliary adapter to the same
``or "medium"`` truthy form so identical config produces identical
requests on both paths.

- [x] Two new regression tests cover ``effort: None`` and
  ``effort: ""`` and assert the request goes out as
  ``{"effort": "medium", "summary": "auto"}``.
- [x] Old behaviour fails the new tests (``{'effort': None} !=
  {'effort': 'medium'}``); fixed behaviour passes all 11 tests in the
  ``TestCodexAdapterReasoningTranslation`` class.
- [x] Adjacent suites green: ``tests/agent/test_auxiliary_client.py``
  (108 passed) and ``tests/agent/transports/test_codex_transport.py +
  test_chat_completions.py`` (73 passed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 05:47:50 -07:00
vominh1919 44cf33449d fix(mcp): add periodic keepalive to _wait_for_lifecycle_event
Sends a lightweight list_tools() probe every 3 minutes during idle
periods to prevent TCP connections from going stale behind LB / NAT
idle timeouts (commonly 300-600s).  When the keepalive fails, the
reconnect event fires so the transport rebuilds the session cleanly.

Salvages the keepalive portion of @vominh1919's PR #17016. The
circuit-breaker half-open recovery from the same PR was independently
landed on main via #benbarclay's commit 8cc3cebca ("fix(mcp): add
half-open state to circuit breaker", Apr 21); only the keepalive is
salvaged here.

Fixes #17003.
2026-05-05 05:47:33 -07:00
Teknium 005b2f4c5d chore: AUTHOR_MAP entry for beardthelion 2026-05-05 05:46:16 -07:00
beardthelion f15b0fbb4f fix: add PLATFORM_HINTS entry for api_server platform
The API server is a documented, first-class messaging platform with its own
gateway adapter, docs pages, and toolset. But it's the only messaging
platform missing from PLATFORM_HINTS in agent/prompt_builder.py.

Without a platform hint, the agent has no context about the API server's
rendering environment and defaults to markdown-heavy document-style outputs
(code fences, bold, bullet points) — which break on the plain-text frontends
most API server consumers wrap (Open WebUI, custom agents, third-party
bridges).

Adds a generic api_server entry that describes the medium (unknown rendering,
assume plain text) without encoding any specific use case. Individual consumers
can layer additional style guidance via ephemeral system prompts.

Before (DeepSeek V4 Pro via API server, no hint):
  **Sendblue bridge** at /opt/sendblue-bridge - **68MB** on disk

After (same prompt, with hint):
  Sendblue bridge at /opt/sendblue-bridge, 68MB on disk

No breaking changes — new dict entry only. Existing API server consumers see
no behavioral change except for models that previously defaulted to markdown
formatting, which now produce cleaner plain-text output.
2026-05-05 05:46:16 -07:00
Teknium b10e38e392 fix(skills): pin protects against deletion only, not edits (#20220)
Previously, pinning a skill blocked every skill_manage write action
(edit, patch, delete, write_file, remove_file). The 'hard fence'
design conflated two concerns:

  1. Pin as deletion protection — don't let the curator archive
     or the agent delete a stable skill.
  2. Pin as content freeze — don't let the agent rewrite it mid-conversation.

In practice (1) is what users pin for: they want a skill to survive
curator passes. (2) created friction — agents finding a new pitfall
in a pinned skill had to ask the user to unpin, then the agent
patches, then the user re-pins. The dance discouraged skill
maintenance and pinned skills went stale.

This narrows the _pinned_guard to skill_manage(action='delete') only.
Patches, edits, and supporting-file writes go through on pinned
skills so the agent can keep improving them. The curator's own
pinned-skip behavior (agent/curator.py:271 for auto-archive,
line 349 for the LLM review prompt) is unchanged — curator still
never touches pinned skills.

Changes:
- tools/skill_manager_tool.py: remove _pinned_guard calls from
  _edit_skill, _patch_skill, _write_file, _remove_file; keep on
  _delete_skill. Updated _pinned_guard docstring and error message.
- tools/skill_manager_tool.py: updated skill_manage model-facing tool
  description to reflect the new semantic.
- website/docs/user-guide/features/curator.md: updated pinning
  section.
- tests/tools/test_skill_manager_tool.py: flipped refuses-pinned
  tests for edit/patch/write_file/remove_file into allowed-when-pinned;
  kept test_delete_refuses_pinned (strengthened assertion to check the
  'cannot be deleted' wording).

Closes #18354
2026-05-05 05:43:10 -07:00
Teknium fe8560fc12 feat(api-server): X-Hermes-Session-Key header for long-term memory scoping (#20199)
* feat(api-server): X-Hermes-Session-Key header for long-term memory scoping

API Server integrations (Open WebUI, custom web UIs) can now pass a stable
per-channel identifier via X-Hermes-Session-Key that scopes long-term memory
(Honcho, etc.) independently of the transcript-scoped X-Hermes-Session-Id.
This matches the native gateway's session_key / session_id split: one stable
key per assistant channel, many independent transcripts that rotate on /new.

- _create_agent and _run_agent accept gateway_session_key and pass it to
  AIAgent(gateway_session_key=...), which is already honored by the Honcho
  memory provider (plugins/memory/honcho/client.py resolve_session_name).
- New shared helper _parse_session_key_header applies the same API-key
  gate, control-character sanitization, and a 256-char length cap as the
  existing session-id header.
- All three agent endpoints honor the header: /v1/chat/completions,
  /v1/responses, /v1/runs. JSON and SSE responses echo it back.
- /v1/capabilities advertises session_key_header so clients can
  feature-detect.

Closes #20060.

Co-authored-by: Andy Stewart <lazycat.manatee@gmail.com>

* chore: AUTHOR_MAP entry for manateelazycat

---------

Co-authored-by: Andy Stewart <lazycat.manatee@gmail.com>
2026-05-05 05:34:47 -07:00
Teknium 436672de0e feat(curator): add archive and prune subcommands (#20200)
* fix(curator): protect hub skills by frontmatter name

* test(skill_usage): add mark_agent_created to regression test

The cherry-picked test predates #19618/#19621 which rewrote
list_agent_created_skill_names() to require an explicit
created_by: 'agent' provenance marker. Without mark_agent_created(),
my-skill is excluded from the list and the positive assertion fails.

* feat(curator): add archive and prune subcommands

Adds 'hermes curator archive <skill>' and 'hermes curator prune
[--days N] [--yes] [--dry-run]' alongside the existing status, run,
pause, resume, pin, unpin, restore, backup, rollback verbs.

These are the two genuinely new user-facing verbs requested in #19384.
The other verbs proposed there ('stats' and 'restore') already exist
as 'curator status' and 'curator restore', so no duplicate surface is
added — all skill lifecycle commands live under the single 'hermes
curator' namespace.

- archive: manual archive of an agent-created skill. Refuses pinned
  skills with a hint pointing at 'hermes curator unpin'.
- prune: bulk-archive unpinned skills idle for >= N days (default 90).
  Falls back to created_at when last_activity_at is null so never-used
  skills can still be pruned. --dry-run previews, --yes skips prompt.

Adapted from @elmatadorgh's PR #19454 which placed the same verbs
under 'hermes skills' with a separate hermes_cli/skills_config.py
handler and rich table for stats. The 'stats' and 'restore' parts of
that PR duplicated existing surface, so only archive and prune are
kept, rewritten to match hermes_cli/curator.py's existing plain-text
handler style. Tests rewritten from scratch against the new handlers.

Closes #19384

Co-authored-by: elmatadorgh <coktinbaran5@gmail.com>

---------

Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com>
Co-authored-by: elmatadorgh <coktinbaran5@gmail.com>
2026-05-05 05:15:54 -07:00
Teknium 4f76166cf0 chore: AUTHOR_MAP entry for qxxaa 2026-05-05 05:01:12 -07:00
qxxaa 0a7cc85eab fix(honcho): pass user_message as search_query in get_prefetch_context
The user_message parameter was accepted by get_prefetch_context but intentionally discarded, with the rationale that passing it would
expose conversation content in server access logs.

This rationale is inconsistent: Honcho already persists every message in full via saveMessages. The content is already in the database. A search query in an access log adds negligible additional exposure, and is moot for self-hosted Honcho deployments where the operator owns the logs.

Without search_query, Honcho returns the full peer representation -
all observations, deductive/inductive layers, and peer card - in
insertion order. When contextTokens is set, the most useful parts
(peer card, dialectic conclusions) are truncated because raw
observations fill the budget first.

Passing user_message as search_query enables Honcho's semantic
retrieval to return only conclusions relevant to the current session
topic, reducing injection noise and improving context quality on cold starts.

The _fetch_peer_context method already accepts and passes search_query to the Honcho API. This change simply connects the two.
2026-05-05 05:01:12 -07:00
Teknium 046c293183 chore: AUTHOR_MAP entry for chengoak 2026-05-05 05:00:41 -07:00
chengoak 8f4c0bf088 fix(wecom): pad base64 AES key before decode
WeCom doesn't pad base64 aeskey, causing Python strict mode decode failure
on media/image/file messages. Add automatic padding before base64 decode:
aes_key + '=' * ((4 - len(aes_key) % 4) % 4).

Salvages the AES padding fix from @chengoak's PR #17040. The SSRF whitelist
entry for a private COS bucket hostname was dropped as it belongs in user
config, not the built-in trusted-private-IP-hosts list. The debug-level
full-body info log was dropped to avoid logging potentially sensitive
message content at INFO level.
2026-05-05 05:00:41 -07:00
Teknium 83a07f4759 chore: AUTHOR_MAP entry for happy5318 2026-05-05 05:00:05 -07:00
Teknium 9e0ef2a1bc test: pin per-turn reasoning extraction semantics
Covers four scenarios for the reasoning-box extraction loop:
 - simple turn with reasoning
 - simple turn with no reasoning
 - tool-calling turn where reasoning lives on the tool-call step
 - prior turn had reasoning, current turn does not (the stale-display
   bug the fix exists for)
 - tool-calling turn where reasoning lives on BOTH steps (latest wins)
 - empty-string reasoning treated as missing

Also updates the four inline replica loops in tests/cli/test_reasoning_command.py
to match the new turn-boundary shape so the test file reflects
production semantics.
2026-05-05 05:00:05 -07:00
happy5318 efe1cb00c8 fix: prevent stale reasoning from being reused across turns
The reasoning-box extraction loop in run_conversation() walked backwards
through the entire message history looking for any assistant message
with a non-empty 'reasoning' field.  When the current turn produced
no reasoning (e.g. the provider returned reasoning_content=null for a
trivial response), the loop walked past the current turn and showed
reasoning from a prior turn — stale text from minutes or hours ago
displayed as if it belonged to the current reply.

Fix: stop the walk at the user message that started the current turn.
That picks the most recent reasoning WITHIN the turn (correct for
tool-calling turns where reasoning lands on the tool-call step and
the final-answer step has reasoning=None — common on Claude thinking,
DeepSeek v4, Codex Responses), and returns None cleanly when the
current turn genuinely had no reasoning.

Co-authored-by: happy5318 <happy5318@users.noreply.github.com>
2026-05-05 05:00:05 -07:00
Teknium 4577f392f9 chore: AUTHOR_MAP entry for ashermorse 2026-05-05 04:58:23 -07:00
Asher Morse 6b76ea4707 fix(gateway): load reply_to_mode from config.yaml for Discord and Telegram
The YAML-to-env-var bridge in load_gateway_config() mapped every Discord
and Telegram config key (require_mention, auto_thread, reactions, etc.)
except reply_to_mode. Users setting discord.reply_to_mode or
telegram.reply_to_mode in ~/.hermes/config.yaml got no effect — the
adapter only read the env var, which nothing populated from YAML.

Add the missing bridge for both platforms, following the existing pattern.
Top-level <platform>.reply_to_mode preferred, falls back to
<platform>.extra.reply_to_mode, env var never overwritten. Handles YAML
1.1 bare `off` → Python False coercion.

This is a re-submission of the work from #9837 and #13930, which both
implemented the same fix but neither landed (see co-authors below).

Co-authored-by: Matteo De Agazio <hypnosis.mda@gmail.com>
Co-authored-by: ishardo <239075732+ishardo@users.noreply.github.com>
2026-05-05 04:58:23 -07:00
LeonSGP43 354502ee48 fix(kanban): preserve dashboard completion summaries 2026-05-05 04:57:38 -07:00
Teknium cca8587d35 docs(quickstart): link Onchain AI Garage Hermes tutorials playlist (#20192)
* revert(gateway): remove stale-code self-check and auto-restart

Removes the _detect_stale_code / _trigger_stale_code_restart mechanism
introduced in #17648 and iterated in #19740. On every incoming message
the gateway compared the boot-time git HEAD SHA to the current SHA on
disk, and if they differed it would reply with

    Gateway code was updated in the background --
    restarting this gateway so your next message runs
    on the new code. Please retry in a moment.

and then kick off a graceful restart. This is unwanted behaviour:
users who run a long-lived gateway and do their own ad-hoc git
operations on the checkout end up with their chat interrupted and
the current message dropped every time HEAD moves, with no way to
opt out.

If an operator really needs the old protection against stale
sys.modules after "hermes update", the SIGKILL-survivor sweep in
hermes update (hermes_cli/main.py, also tagged #17648) already
handles the supervisor-respawn case on its own.

Removed:
  gateway/run.py:
    - _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS
    - _read_git_head_sha(), _compute_repo_mtime() module helpers
    - class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha /
      _stale_code_restart_triggered defaults
    - __init__ boot-snapshot block (_boot_*, _cached_current_sha*,
      _repo_root_for_staleness, _stale_code_notified)
    - _current_git_sha_cached(), _detect_stale_code(),
      _trigger_stale_code_restart() methods
    - stale-code check + user-facing restart notice at the top of
      _handle_message()
  tests/gateway/test_stale_code_self_check.py (deleted, 412 lines)

No new logic added. Zero remaining references to any removed
symbol. Gateway test suite passes the same 4589 tests it passed
before; the 3 pre-existing unrelated failures (discord free-channel,
feishu bot admission, teams typing) are unchanged by this commit.

* docs(quickstart): link Onchain AI Garage Hermes tutorials playlist

Adds a 'Prefer to watch?' tip callout near the top of the quickstart page pointing to @OnchainAIGarage's Hermes Agent Tutorials + Use Cases playlist, which includes a Masterclass series covering install, setup, and basic commands.

* docs(quickstart): embed Masterclass video in Prefer to watch section

Swaps the plain-link tip callout for an inline responsive YouTube embed of the Hermes Agent Masterclass (R3YOGfTBcQg) plus a kept link to the full Onchain AI Garage tutorials playlist.
2026-05-05 04:56:54 -07:00
Teknium 4d0f59fa5a test(skill_usage): add mark_agent_created to regression test
The cherry-picked test predates #19618/#19621 which rewrote
list_agent_created_skill_names() to require an explicit
created_by: 'agent' provenance marker. Without mark_agent_created(),
my-skill is excluded from the list and the positive assertion fails.
2026-05-05 04:55:22 -07:00
LeonSGP43 68c1a08ad1 fix(curator): protect hub skills by frontmatter name 2026-05-05 04:55:22 -07:00
Teknium 5168226d60 feat(file_tools): post-write delta lint on write_file + patch, add JSON/YAML/TOML/Python in-process linters (#20191)
Closes the gap where write_file skipped the post-edit syntax check that
patch already ran, so silent file corruption (bad quote escaping,
truncated writes, etc.) would persist on disk until a later read.

## Changes

tools/file_operations.py:
- Add in-process linters for .py, .json, .yaml, .toml (LINTERS_INPROC).
  Python uses ast.parse, JSON/YAML/TOML use stdlib/PyYAML parsers.
  Zero subprocess overhead; preferred over shell linters when both apply.
- _check_lint() now accepts optional content and routes to in-process
  linter first. Shell linter (py_compile, node --check, tsc, go vet,
  rustfmt) remains the fallback for languages without an in-process
  equivalent.
- New _check_lint_delta() implements the post-first/pre-lazy pattern
  borrowed from Cline and OpenCode: lint post-write state first; only
  if errors are found AND pre-content was captured does it lint the
  pre-state and diff. If the pre-existing file had the SAME errors the
  edit didn't introduce anything new, so the file is reported as 'still
  broken, pre-existing' with success=False but a message explaining the
  errors were pre-existing. If the edit introduced genuinely new errors,
  those are surfaced and pre-existing ones are filtered out.
- WriteResult gains a lint field.
- write_file() captures pre-content for in-process-lintable extensions
  and calls _check_lint_delta after a successful write.
- patch_replace() switches from _check_lint to _check_lint_delta,
  reusing the pre-edit content it already has in scope.

tools/file_tools.py:
- Update write_file schema description to mention the post-write lint.

tests/tools/test_file_operations_edge_cases.py:
- Update existing brace-path tests to use .js (shell linter) now that
  .py is in-process.
- Add TestCheckLintInproc (9 tests) covering Python/JSON/YAML/TOML
  in-process linters.
- Add TestCheckLintDelta (5 tests) covering the post-first/pre-lazy
  short-circuit, new-file path, and the single-error-parser caveat.

## Performance

In-process linters are microseconds per call (ast.parse, json.loads).
The hot path (clean write) runs exactly one lint — matches main's cost
for patch. Pre-state capture is skipped when the file has no applicable
linter. Measured 4.89ms/write average over 100 .py writes including lint.

## Inspiration

- Cline's DiffViewProvider.getNewDiagnosticProblems() — filters pre-write
  diagnostics from post-write diagnostics (src/integrations/editor/DiffViewProvider.ts).
- OpenCode's WriteTool — runs lsp.diagnostics() after write and appends
  errors to tool output (packages/opencode/src/tool/write.ts).
- Claude Code's DiagnosticTrackingService — captures baseline via
  beforeFileEdited() and returns new-diagnostics-only from
  getNewDiagnostics() (src/services/diagnosticTracking.ts).

## Validation

- tests/tools/test_file_operations.py + test_file_operations_edge_cases.py
  + test_file_tools.py + test_file_tools_live.py + test_file_write_safety.py
  + test_write_deny.py + test_patch_parser.py + test_file_ops_cwd_tracking.py:
  228 passed locally.
- Live E2E reproduction of the tips.py corruption incident: broken
  content written; lint field surfaces 'SyntaxError: invalid syntax.
  Perhaps you forgot a comma? (line 6, column 5)' — the exact error
  that would have self-corrected the bug on the next turn.
2026-05-05 04:54:17 -07:00
Teknium b93643c8fe chore: AUTHOR_MAP entry for wmagev 2026-05-05 04:51:29 -07:00
wmagev 2eef395e1c fix(compaction): mark end of context summary in role=user fallback
When the head ends with assistant/tool and the tail starts with assistant,
the summary is inserted as a standalone role="user" message. The body's
verbatim "## Active Task" quote then gets read as fresh user input by
weak/local models (#11475, #14521).

The merge-into-tail path already appends an explicit end-of-summary marker
for this reason. Mirror it on the standalone path so both insertion routes
give the model the same "summary above, not new input" signal.
2026-05-05 04:51:29 -07:00
Teknium c725d7d648 chore: AUTHOR_MAP entry for TheEpTic 2026-05-05 04:45:32 -07:00
Nexus 660ce7c54b fix(ui-tui): prevent React effect cleanup from killing python TUI gateway subprocess
The useEffect at useMainApp.ts:546-565 calls gw.kill() in its cleanup function. React calls cleanup on every re-render when the dependency array ([gw, sys]) shifts — which happens whenever sys changes identity (any system message). This sends SIGTERM to the Python TUI gateway subprocess, silently killing the backend mid-session.

The kill path was already handled by entry.tsx's setupGracefulExit for real app exits (SIGINT, uncaught exception). The die() function also calls gw.kill() for explicit user exit. Removing the cleanup kill leaves all exit paths covered while preventing accidental mid-session kills on ordinary React re-renders.
2026-05-05 04:45:32 -07:00
LeonSGP43 1a03e3b1c6 fix(kanban): detect darwin zombie workers 2026-05-05 04:43:40 -07:00
0xsir0000 f6b68f0f50 fix(gateway): keep DoH-confirmed Telegram IPs that match system DNS (#14520)
discover_fallback_ips() filtered out any DoH-resolved IP that also appeared
in the system resolver's answer set, on the assumption that the system IP
was unreachable. When DoH and system DNS agreed (a common case), the
function returned the hardcoded _SEED_FALLBACK_IPS list instead — and on
networks where those seed addresses are not routable, the Telegram fallback
transport had nothing usable to retry against and polling failed.

Drop the system_ips exclusion so DoH-confirmed IPs are preserved regardless
of system DNS overlap. The TelegramFallbackTransport already tries the
primary path first via system DNS, then falls through to the IP-rewrite
path on connect failure; including the same IP in both lanes lets a
transient primary failure recover via the explicit IP route instead of
escalating to seed addresses.

Update the two tests that codified the old exclusion to reflect the new,
inclusion-by-default behaviour.

Fixes #14520
2026-05-05 04:42:59 -07:00
revaraver aacf36e943 fix(cli): persist manual compress handoff 2026-05-05 04:42:48 -07:00
Teknium fe8dc26bc9 chore: AUTHOR_MAP entry for revaraver noreply 2026-05-05 04:42:44 -07:00
revaraver 4a3e3e20e5 fix(compression): preserve iterative summary continuity 2026-05-05 04:42:44 -07:00
Teknium f8a6db68ca test(kanban): isolate HERMES_KANBAN_BOARD writes in pin-env tests
The helper under test writes to os.environ directly, bypassing
monkeypatch tracking. Without an explicit snapshot/restore fixture,
the mutation leaks into subsequent tests and breaks TestSharedBoardPaths
(kanban path resolution reads HERMES_KANBAN_BOARD and routes through
boards/<leaked-slug>/ instead of the test's own HERMES_HOME).

Add an autouse fixture that snapshots the env var before the test and
restores (or pops) it after, regardless of what the helper did.
2026-05-05 04:37:47 -07:00
0xDevNinja b22b3f506a fix(cli): pin HERMES_KANBAN_BOARD at chat boot to stop subprocess board drift
Without an explicit pin, in-process kanban tools and shelled-out
`hermes kanban …` subprocesses resolve the active board on different
paths: the env var when set, otherwise the global `<root>/kanban/current`
file. When a concurrent session toggles the current-board pointer
mid-turn, the same chat ends up routing tool calls to board A while its
shell calls hit board B, surfacing as phantom "no such task" errors.

Pin the resolved board into env once at `cmd_chat` boot when
HERMES_KANBAN_BOARD isn't already set. Mirrors what the dispatcher does
for spawned workers (kanban_db.py:2622-2623). Idempotent and a no-op
when the env is already pinned by the caller.

Closes #20074
2026-05-05 04:37:47 -07:00
Teknium d472d697cd chore(release): map stevekelly622@gmail.com → @steezkelly 2026-05-05 04:34:45 -07:00
Steve Kelly 8c82d0664d fix(kanban): ignore stale current board pointers 2026-05-05 04:34:45 -07:00
Teknium 2a285d5ec2 fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924) (#20184)
* revert(gateway): remove stale-code self-check and auto-restart

Removes the _detect_stale_code / _trigger_stale_code_restart mechanism
introduced in #17648 and iterated in #19740. On every incoming message
the gateway compared the boot-time git HEAD SHA to the current SHA on
disk, and if they differed it would reply with

    Gateway code was updated in the background --
    restarting this gateway so your next message runs
    on the new code. Please retry in a moment.

and then kick off a graceful restart. This is unwanted behaviour:
users who run a long-lived gateway and do their own ad-hoc git
operations on the checkout end up with their chat interrupted and
the current message dropped every time HEAD moves, with no way to
opt out.

If an operator really needs the old protection against stale
sys.modules after "hermes update", the SIGKILL-survivor sweep in
hermes update (hermes_cli/main.py, also tagged #17648) already
handles the supervisor-respawn case on its own.

Removed:
  gateway/run.py:
    - _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS
    - _read_git_head_sha(), _compute_repo_mtime() module helpers
    - class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha /
      _stale_code_restart_triggered defaults
    - __init__ boot-snapshot block (_boot_*, _cached_current_sha*,
      _repo_root_for_staleness, _stale_code_notified)
    - _current_git_sha_cached(), _detect_stale_code(),
      _trigger_stale_code_restart() methods
    - stale-code check + user-facing restart notice at the top of
      _handle_message()
  tests/gateway/test_stale_code_self_check.py (deleted, 412 lines)

No new logic added. Zero remaining references to any removed
symbol. Gateway test suite passes the same 4589 tests it passed
before; the 3 pre-existing unrelated failures (discord free-channel,
feishu bot admission, teams typing) are unchanged by this commit.

* fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924)

Per-delta _strip_think_blocks ran at _fire_stream_delta and destroyed
downstream state. When MiniMax-M2.7 / DeepSeek / Qwen3 streamed a tag
split across deltas (delta1='<think>', delta2='Let me check'), the
regex case-2 match erased delta1 entirely, so CLI/gateway state
machines never learned a block was open and leaked delta2 as content.
Raw consumers (ACP, api_server, TTS) had no downstream defense at all.

Replace the per-delta regex with a stateful StreamingThinkScrubber
that survives delta boundaries:
  - Closed <tag>X</tag> pairs always stripped (matches _strip_think_blocks
    case 1).
  - Unterminated open at block boundary enters a block; content
    discarded until close tag arrives.  At end-of-stream, held
    content is dropped.
  - Orphan close tags stripped without boundary gating.
  - Partial tags at delta boundaries held back until resolved.
  - Block-boundary rule (start-of-stream, after \n, or
    whitespace-only since last \n) preserves prose that mentions
    tag names.

Reset at turn start alongside the existing context scrubber; flush at
turn end so a benign '<' held back at end-of-stream reaches the UI.

E2E-verified on live OpenRouter->MiniMax-m2 streams: closed pairs
strip cleanly, first word of post-block content is preserved, pure
content passes through unchanged.  Stefan's screenshot case (#17924)
— 'Let me check' getting chopped to ' me check' — no longer happens.

Final _strip_think_blocks calls on completed strings (final_response,
replay, compression) are preserved; only the streaming per-delta call
site switched to the scrubber.
2026-05-05 04:33:38 -07:00
Chris Danis 28f4d6db63 fix(tool-schemas): reactive strip of pattern/format on llama.cpp grammar 400s
MCP servers commonly emit JSON Schema `pattern` (e.g. `\\d{4}-\\d{2}-\\d{2}`
for date-time params) and `format` keywords. llama.cpp's
`json-schema-to-grammar` converter rejects regex escape classes
(\\d/\\w/\\s) and most format values, returning HTTP 400
"parse: error parsing grammar: unknown escape at \\d" — the whole request
fails.

Cloud providers (OpenAI, Anthropic, OpenRouter, Gemini) accept these
keywords fine and use them as prompting hints. Stripping unconditionally
loses useful hints for every cloud user to fix a llama.cpp-only bug.

Approach: classify the llama.cpp grammar-parse 400 in the error
classifier, and on match do a one-shot in-place strip of pattern/format
from `self.tools`, then retry. Follows the existing
`thinking_signature` recovery pattern. Cloud users hit zero overhead;
llama.cpp users pay one failed request per session.

Changes
- agent/error_classifier.py: new `FailoverReason.llama_cpp_grammar_pattern`
  + narrow HTTP-400 branch matching "error parsing grammar",
  "json-schema-to-grammar", or "unable to generate parser ... template".
- tools/schema_sanitizer.py: new `strip_pattern_and_format()` helper —
  reactive, walks schema nodes, skips property names (search_files.pattern
  survives). Returns strip count for logging.
- run_agent.py: new one-shot recovery block in the retry loop. Strips,
  logs, continues. Falls through to normal retry if nothing to strip.
- tests: 4 classifier tests (3 variants + 1 non-400 negative), 7 strip
  tests including the property-name preservation and idempotency checks.

Co-authored-by: Chris Danis <cdanis@gmail.com>
2026-05-05 04:25:18 -07:00
Interstellar-code 542e06c789 fix: include default profile in kanban assignees 2026-05-05 04:25:05 -07:00
Teknium fc4aa66ee4 feat(tips): add 100 new CLI startup tips (#20168)
Expands TIPS corpus from 280 to 380 entries covering untapped
territory across slash commands, CLI flags, env vars, config keys,
and platform features. Every tip verified against real code and
docs.

Batch 1 (50): advanced slash commands (/steer, /goal, /snapshot,
/copy, /redraw, /agents, /footer, /busy, /topic, /approve, /restart,
/kanban, /reload), no-agent cron, gateway hooks, curator, credential
pools, provider routing, TUI/dashboard env vars and themes, checkpoints,
Piper TTS, API server, GATEWAY_PROXY_URL, MATRIX_DEVICE_ID,
TELEGRAM_WEBHOOK_SECRET, batch_runner --resume.

Batch 2 (50): lesser-known slash commands (/new, /clear, /history,
/save, /status, /image, /platforms, /commands, /toolsets, /gquota,
/voice tts, /reload-skills, /indicator, /debug), CLI subcommands
(hermes -z, --pass-session-id, --image, --ignore-user-config,
--source tool, dump --show-keys, sessions rename/delete, import,
fallback, pairing, setup, status --deep), agent behavior env vars
(HERMES_AGENT_TIMEOUT, HERMES_ENABLE_PROJECT_PLUGINS,
HERMES_DISABLE_FILE_STATE_GUARD, HERMES_ALLOW_PRIVATE_URLS,
HERMES_OPTIONAL_SKILLS, HERMES_BUNDLED_SKILLS,
HERMES_DUMP_REQUEST_STDOUT, HERMES_OAUTH_TRACE, HERMES_STREAM_RETRIES),
gateway env vars, image_gen config, auxiliary.session_search,
tirith_fail_open, source tool filtering, API_SERVER_MODEL_NAME,
dashboard plugins.
2026-05-05 04:15:58 -07:00
Brecht-H f25d3ec917 fix(kanban): suppress dispatcher stuck-warn when ready queue holds only non-spawnable assignees
After PR #20105 (dispatcher skips ready tasks whose assignee fails
``profile_exists()`` to prevent the orion-cc/orion-research crash
loop), the gateway and CLI emit a spurious "kanban dispatcher stuck:
ready queue non-empty for N consecutive ticks but 0 workers spawned"
warning every 5 minutes on multi-lane setups where the queue is
steadily full of human-pulled work assigned to terminal lanes.

The warn is intended to catch real failure modes (broken PATH,
missing venv, credential loss for a real Hermes profile). On a
multi-lane host it fires forever even though everything is healthy:
the dispatcher correctly chose not to spawn, and there is nothing
for the operator to fix.

Changes:

* ``DispatchResult`` gains a ``skipped_nonspawnable`` field
  (separate from ``skipped_unassigned``) so callers can distinguish
  "task missing an owner — operator should route it" from "task
  owned by a control-plane lane — terminal will pull it".
* ``dispatch_once`` routes the ``not profile_exists(assignee)`` skip
  into the new bucket (was lumped into ``skipped_unassigned``).
* New helper ``has_spawnable_ready(conn)`` returns True iff at least
  one ready+assigned+unclaimed task in the DB has an assignee that
  maps to a real Hermes profile. Falls back to legacy "any
  ready+assigned" when ``profile_exists`` is unimportable so degraded
  installs still surface the original warn.
* The gateway dispatcher (``gateway/run.py``) and the CLI standalone
  daemon (``hermes_cli/kanban.py``) both swap their cheap
  ``ready_nonempty`` probe to use ``has_spawnable_ready``. Stuck-warn
  now fires only when there is genuine spawnable work the dispatcher
  failed to start.
* CLI dispatch output prints ``Skipped (non-spawnable assignee —
  terminal lane, OK)`` for visibility without alarm.

Tests:

* New ``has_spawnable_ready`` cases (empty queue, terminal-lane
  only, mixed real+terminal).
* New ``test_dispatch_skips_nonspawnable_into_separate_bucket``
  verifies the bucketing change.
* Updated ``test_dispatch_skips_unassigned`` to assert no
  cross-leak.
* Added ``all_assignees_spawnable`` fixture in
  ``tests/hermes_cli/conftest.py`` and threaded it through dispatcher
  tests that use synthetic assignees ("alice", "bob"). PR #20105
  (the parent commit) silently broke 8 such tests by routing those
  assignees into ``skipped_nonspawnable`` instead of spawning; this
  PR repairs them as part of the same code area.

Verified locally: 246/246 kanban-suite tests pass.

Stacks on top of fix/kanban-dispatcher-skip-missing-profile-2026-05-05
(PR #20105). Reviewer: this PR is meant to merge AFTER #20105.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 04:13:12 -07:00
Brecht-H ca5595fe7b fix(kanban): dispatcher skips ready tasks whose assignee is not a real profile
The kanban dispatcher's `_default_spawn` invokes
``hermes -p <task.assignee> chat -q ...``. When ``assignee``
names a control-plane lane (e.g. an interactive Claude Code
terminal like ``orion-cc`` / ``orion-research``) instead of a
real Hermes profile, the subprocess fails on startup with
"Profile 'X' does not exist", gets reaped as a zombie, the
TTL/crash detector marks the task back to ``ready``, and the
next tick re-spawns the same crashing worker. Result: a
permanent crash loop emitting ``spawned=2 crashed=2 every tick``
in the gateway log and burning CPU forever.

Reproduce on a fresh Hermes-agent install:

  # 1. Create a kanban task whose assignee names a non-profile.
  hermes kanban create --assignee orion-cc --status ready \
      --title "Review PR #N" --body "..."
  # 2. Start the gateway with the embedded dispatcher.
  hermes gateway run
  # gateway.log lines every minute:
  #   kanban dispatcher: tick spawned=1 reclaimed=0 crashed=1 ...
  # 3. ps -ef | grep '[h]ermes.*defunct' shows zombies.

Fix
---
``dispatch_once()`` now pre-checks ``hermes_cli.profiles.
profile_exists(assignee)`` before claiming. If False, the row
is added to ``skipped_unassigned`` (it's effectively
"unassigned-to-an-executable-profile") and the dispatcher
moves on without claiming, spawning, or counting a crash.

The check is opt-in safe: if the import fails (e.g. test
isolation, profile module restructured), ``profile_exists``
falls back to ``None`` and the original behaviour is preserved
unchanged.

This addresses the explicit hint in the kanban task body
(``t_2bab06e3``):

  "Should ready-state tasks auto-spawn at all, or only on
  explicit orion-cc claim? If spurious, gate the auto-spawn
  behind a config flag (e.g. only assignee=hermes or
  assignee=auto)."

Profile-existence is a tighter gate than a config flag — it
self-documents (the user already knows whether they have an
``orion-cc`` profile), and it doesn't require Mac to maintain
an allowlist as new lane names appear. New lanes that ARE
real profiles (created via ``hermes profile create``) auto-
qualify the moment the profile dir is created.

Validated live
--------------
On Orion's hermes-agent install, two ``orion-research``-
assigned tasks (Bug A and Bug C investigations) had been
crash-looping since 2026-05-05 06:58 local. After applying
the patch + restarting the gateway:

- Stale ``running`` claims released to ``ready`` cleanly.
- New gateway emitted ``kanban dispatcher: embedded`` and
  has ticked silently for 2+ minutes — no spawned=,
  crashed=, or stuck= log lines (all spawn skips are quiet).
- Tasks remain ``ready`` with ``claim_lock=None``,
  ``worker_pid=None``, ``spawn_failures=0``.
- Dashboard + telegram + freqtrade unaffected.

Confidence: high (live verified on Orion).
Scope-risk: narrow (additive guard inside one function).
Not-tested: behaviour when a profile is renamed mid-tick —
current code re-imports ``profile_exists`` per row so a
freshly created profile auto-qualifies on the next tick.
Machine: orion-terminal

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 04:13:12 -07:00
Teknium 91ce8fc000 fix(setup): offer Keep/Replace/Clear when API key already exists
hermes setup / hermes model used to silently skip the key prompt when
any value was present in .env — even a malformed paste — leaving users
with a stuck '✓' and no way to recover without hand-editing .env.

Replace the silent acknowledgement at all three API-key provider flows
(Kimi, Stepfun, generic) with a single [K]eep / [R]eplace / [C]lear
menu via a shared `_prompt_api_key` helper.

- K / Enter / Ctrl-C / unknown input → keep (never destroys the key)
- R → getpass for new key; empty input cancels and preserves existing
- C → clears the env var, tells user to rerun hermes setup, aborts flow

LM Studio's no-auth-placeholder substitution stays on first-time entry
only; on Replace an empty input means 'cancel', not 'overwrite with
dummy key'.

11 unit tests cover all branches incl. garbage-input-keeps-key, Ctrl-C
at the choice prompt, Replace-cancel preserving the old key, Clear
wiping only the target env var, and lmstudio placeholder semantics.

Fixes #16394
Reshapes #18355 — original PR pasted the menu inline at 3 sites with
no tests; this consolidates to one helper (+88/-66) with coverage.

Co-authored-by: Feranmi10 <89228157+Feranmi10@users.noreply.github.com>
2026-05-05 04:08:11 -07:00
simbam99 8ad5e98f8d fix(gateway): preserve pending update prompts across restarts 2026-05-05 03:59:39 -07:00
Teknium 2785355750 chore(release): map bjianhang@gmail.com → @bjianhang 2026-05-05 03:59:00 -07:00
baojianhang c3112adac5 fix(tui): improve clipboard copy fallbacks 2026-05-05 03:59:00 -07:00
Siddharth Balyan 13a7cbcd64 fix(nix): refresh stale tui npmDepsHash + fix cache-blind detection (#20144)
The fix-lockfiles script used 'nix build .#tui.npmDeps' to detect stale
hashes. This always succeeds when the OLD derivation is cached in Cachix
or cache.nixos.org — even when the source package-lock.json has changed.

Fix: use prefetch-npm-deps to compute the hash directly from the lockfile
and compare against what's in the nix file. Falls back to nix build only
if prefetch-npm-deps fails.
2026-05-05 15:32:20 +05:30
teknium1 601e5f1d57 fix(teams): log reply() fallback for diagnostics
The previous bare except swallowed every exception from app.reply()
silently. Log at debug so real failures (auth, chat gone) leave a
trace while keeping the group-chat 400 fallback working. Also fix
the Teams entry's indentation in the messaging flowchart.
2026-05-04 20:59:18 -07:00
Aamir Jawaid 2333b7a7ec fix(tests): patch TypingActivityInput after mock on Python <3.12
The SDK requires Python >=3.12 so CI (3.11) falls to the except
ImportError branch, leaving TypingActivityInput=None. After loading
the adapter module, explicitly restore it from the mock so
test_send_typing doesn't silently no-op.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Aamir Jawaid 3f023450dd fix(teams): fall back to flat send when threading returns 400
Group chats return 400 for threaded sends. Catch the error and
fall back to a flat send so messages always get delivered.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Aamir Jawaid 69aeba0df7 feat(teams): implement threading via app.reply()
Wire reply_to into send() using App.reply(conv_id, msg_id, content)
which constructs the threaded conversation ID internally.
Threads supported in channels and group chats.

Update comparison table: Threads 

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Aamir Jawaid 10f89d7b72 docs(teams): add Teams to messaging/index.md
- Add to platform description and intro paragraph
- Add row to platform comparison table (images + typing)
- Add node to architecture mermaid diagram
- Add TEAMS_ALLOWED_USERS to security examples
- Add to platform-specific toolsets table
- Add to Next Steps links

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Aamir Jawaid 93869b48ab docs: add Microsoft Teams to platform lists across docs
Update all platform enumeration lists to include Teams:
index.md, quickstart.md, integrations/index.md, sessions.md,
slash-commands.md, updating.md, hooks.md, hermes-agent skill.

Skipped PII redaction docs — Teams uses AAD object IDs, not
phone numbers, so redaction doesn't apply there.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Aamir Jawaid ef94aa201f docs(teams): add Teams to sidebar
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 20:59:18 -07:00
Teknium c77a6e3faa chore(security): add OSV-Scanner CI + Dependabot for github-actions only (#20037)
Adds two supply-chain controls that complement our existing pinning
strategy (full-SHA action pins, exact-version source dep pins via
uv.lock / package-lock.json) without undermining it.

.github/workflows/osv-scanner.yml
  Detection-only scan of uv.lock and the ui-tui/website package-locks
  against the OSV vulnerability database. Runs on PRs that touch
  lockfiles, on push to main, and weekly against main so CVEs
  published after merge still surface. Uses Google's officially-
  recommended reusable workflow pinned by full SHA (v2.3.5).
  Findings upload to the Security tab; fail-on-vuln is disabled so
  pre-existing vulns in pinned deps do not block merges — we move
  pins deliberately, not under CI pressure.

.github/dependabot.yml
  Scoped to github-actions only. Action pins must be moved when
  upstream publishes patches (often themselves security fixes);
  Dependabot opens a PR with the new SHA + release notes for normal
  review. Source-dependency ecosystems (pip, npm) are deliberately
  NOT enabled — automatic version-bump PRs against uv.lock /
  package-lock.json would fight our pinning strategy. CVE-driven
  security updates for source deps are enabled separately via the
  repo's Dependabot security updates setting (GitHub UI), which
  fires only when a pinned version becomes known-vulnerable.
2026-05-04 20:58:21 -07:00
Stephen Schoettler 1d938832a7 test(kanban): patch dashboard websocket token stub 2026-05-04 20:50:24 -07:00
Stephen Schoettler f7918c9349 test(teams): mock ClientOptions in adapter tests 2026-05-04 20:50:24 -07:00
Teknium a1bed18194 docs: clarify that the Docker terminal backend is a single persistent container (#20003)
The docs were ambiguous about whether the Docker terminal backend spins up
a fresh container per command or reuses a long-lived one. It's the latter
— Hermes starts one container on first use and routes every terminal,
file, and execute_code call through docker exec into that same container
for the life of the process (across /new, /reset, and delegate_task
subagents). Working-directory changes, installed packages, and files in
/workspace persist from one tool call to the next, like a local shell.

- configuration.md: lead the Docker Backend section with the persistence
  model before the YAML example; sharpen the Backend Overview table row.
- features/tools.md: expand the Docker Backend block (previously just a
  2-line YAML stub) with a clear statement of the persistent-container
  semantics and a pointer to the full lifecycle section.
- docker.md: tighten the 'Docker as a terminal backend' bullet and the
  'Skills and credential files' paragraph to call out the single-container
  model explicitly.
2026-05-04 20:09:31 -07:00
Jeffrey Quesnelle d12f59aa53 Merge pull request #19866 from NousResearch/fix/clarify-placeholder-credential
clarify placeholder telegram credential in tests
2026-05-04 22:24:52 -04:00
helix4u b816fd4e26 fix(tui): complete absolute paths as paths 2026-05-04 16:14:40 -07:00
helix4u b632290166 fix(gateway): handle planned service stops 2026-05-04 16:00:49 -07:00
brooklyn! 20428f5e60 fix(tui): respect voice.record_key config (supersedes #19028, #19339) (#19835)
* fix(tui): respect voice.record_key config instead of hardcoded Ctrl+B

Classic CLI loaded ``voice.record_key`` from config.yaml and bound the
prompt-toolkit handler dynamically (``cli.py`` paths). The new TUI hard-
coded ``Ctrl+B`` everywhere — ``isVoiceToggleKey`` (input handler),
``/voice status`` ("Record key: Ctrl+B"), and ``/voice on`` ("Ctrl+B to
start/stop recording"). A user who set ``voice.record_key: ctrl+o``
(or any other key) saw the documented config silently ignored — only
Ctrl+B worked, the displayed shortcut lied about it.

Wire the configured key end to end through the existing channels:

* **Backend** (``tui_gateway/server.py``): ``voice.toggle`` action=status
  AND action=on/off responses now include ``record_key``, sourced from
  ``config.get('voice', {}).get('record_key', 'ctrl+b')``.
* **Backend types** (``ui-tui/src/gatewayTypes.ts``): ``ConfigFullResponse``
  now exposes ``config.voice.record_key`` and ``VoiceToggleResponse``
  carries ``record_key`` so the TUI can both bind and display it.
* **Frontend parser/formatter** (``ui-tui/src/lib/platform.ts``):
  ``parseVoiceRecordKey()`` accepts ``ctrl+b`` / ``alt+r`` / ``cmd+space``
  and the common aliases (``option``, ``cmd``, ``win``, …); falls back to
  the documented Ctrl+B for empty / multi-character / malformed input so
  a typo never silently disables the shortcut. ``formatVoiceRecordKey()``
  renders for status text. ``isVoiceToggleKey`` now takes a parsed
  ``ParsedVoiceRecordKey`` argument; the hardcoded ``ch === 'b'`` is
  gone. Default arg keeps existing call sites back-compat.
* **Hydration** (``ui-tui/src/app/useConfigSync.ts``,
  ``useMainApp.ts``): startup ``config.get full`` already runs; extract
  ``cfg.voice.record_key`` from it, parse, push into a new
  ``voiceRecordKey`` state, and forward to the input handler ctx
  (``InputHandlerContext.voice.recordKey``). Mtime-poll path also
  re-applies the parsed key so a hand-edit of config.yaml takes effect
  the next tick — matches existing behaviour for display options.
* **Input handler** (``ui-tui/src/app/useInputHandlers.ts``):
  ``isVoiceToggleKey(key, ch, voice.recordKey)`` so the configured
  binding fires.
* **Slash command** (``ui-tui/src/app/slash/commands/session.ts``):
  ``/voice status`` and ``/voice on`` use ``formatVoiceRecordKey`` on
  the response's ``record_key`` instead of the hardcoded label.

Tests:
* ``parseVoiceRecordKey`` covers ctrl/alt/cmd/super aliases, multi-char
  rejection, and empty fallback.
* ``formatVoiceRecordKey`` covers the doc examples (``Ctrl+B``,
  ``Ctrl+O``, ``Alt+R``, ``Cmd+B``).
* ``isVoiceToggleKey`` regression: ``ctrl+o`` configured → only ``o``
  matches, not ``b``; ``alt+r`` matches both alt-bit and meta-bit
  encodings (terminal protocol parity); omitted-arg call still binds
  Ctrl+B for back-compat.

Full TUI suite (555 tests) passes; ``tsc --noEmit`` clean.

Fixes #18994

Co-authored-by: asheriif <ahmedsherif95@gmail.com>

* fix(tui): support named-key tokens in voice.record_key (space, enter, …)

Reviewer caught that the round-1 parser in #18994 rejected every
multi-character token, so a config value like ``ctrl+space`` (which the
CLI happily binds via prompt_toolkit's ``c-space`` rewrite in
``cli.py``) silently fell back to the documented Ctrl+B default —
re-introducing the same false-shortcut bug the PR was meant to fix,
just at a different surface.

Add explicit named-key support that mirrors what the CLI accepts:

* ``space``         (alias: ``spc``)        → matches ``ch === ' '``
* ``enter``         (alias: ``return``, ``ret``) → matches ``key.return``
* ``tab``                                   → matches ``key.tab``
* ``escape``        (alias: ``esc``)        → matches ``key.escape``
* ``backspace``     (alias: ``bs``)         → matches ``key.backspace``
* ``delete``        (alias: ``del``)        → matches ``key.delete``

``ParsedVoiceRecordKey`` gains an optional ``named`` field; ``ch``
holds either a single char (back-compat) or the canonical named token,
and the runtime matcher dispatches on ``named`` before checking the
modifier shape. Aliases collapse to one canonical name so
``ctrl+esc`` and ``ctrl+escape`` behave identically.

Unrecognised multi-character tokens (e.g. ``ctrl+spcae`` typo, or
unsupported keys like ``ctrl+f5``) still fall back to the Ctrl+B
default rather than silently disabling the binding — keeps the "typo
never silently kills the shortcut" guarantee.

Tests:

* ``parseVoiceRecordKey`` parametrised over every named token + each
  alias variant.
* New ``isVoiceToggleKey`` cases for space (ch-based match), enter
  (``key.return``), tab, escape, backspace, delete, including
  modifier-mismatch negatives.
* ``formatVoiceRecordKey`` renders named keys in title case
  (``Ctrl+Space``, ``Ctrl+Enter``).
* Existing fall-back-to-Ctrl+B contract preserved for empty input
  AND unrecognised multi-char tokens.

Full TUI suite: 559/559 pass; ``tsc --noEmit`` clean.

Refs #18994 (round-1 review feedback)

Co-authored-by: asheriif <ahmedsherif95@gmail.com>

* test(tui): assert voice.toggle returns configured record_key

Salvage the backend regression from #19339 — asserts ``voice.toggle``
action=on AND action=status responses carry the configured
``voice.record_key`` end-to-end through ``_load_cfg()``. Keeps the
CLI→TUI parity contract visible in the Python test suite alongside
the existing frontend parser/matcher/formatter coverage from #19028.

* fix(tui): address Copilot review on #19835 voice.record_key wiring

Five tightenings on the parser + matcher + hydration surface, all
caught by the Copilot review on the PR — each one turns a silent
false-fire or display/binding skew into a deterministic behaviour.

* **isVoiceToggleKey ctrl branch was too permissive for named keys.**
  The doc-default macOS Cmd+B muscle-memory fallback
  (``isActionMod(key)`` on top of ``key.ctrl``) fired for every
  configured key, so bare Esc — which hermes-ink reports with
  ``key.meta`` on some macOS terminals — triggered ``ctrl+escape``,
  and Alt+Space / Alt+Tab triggered ``ctrl+space`` / ``ctrl+tab``.
  Gate the fallback to the literal ``ctrl+b`` binding so any custom
  chord requires the real Ctrl bit.
* **Alt branch guarded against Ctrl/Cmd co-press.** Without this,
  Ctrl+Alt+<letter> and Cmd+Alt+<letter> also fired ``alt+<letter>``.
* **Dropped the ``meta`` modifier variant and its alias.** In
  hermes-ink ``key.meta`` is Alt on xterm-style terminals and Cmd on
  legacy macOS ones, so a literal ``meta+b`` config displayed as
  ``Cmd+B`` while matching Alt+B — exactly the kind of false
  shortcut the PR was meant to remove. ``cmd`` / ``command`` now
  collapse onto ``super`` (kitty-style ``key.super``, with a macOS
  ``key.meta`` fallback) and render as ``Cmd+B``. Unknown modifier
  tokens fall back to the documented Ctrl+B default rather than
  silently coercing to Ctrl.
* **Slash-command display/binding skew.** ``/voice status`` and
  ``/voice on`` rendered from the fresh gateway ``record_key``
  response, but ``useInputHandlers()`` still bound the old key
  until the next 5s mtime poll. Thread ``setVoiceRecordKey``
  through ``SlashHandlerContext.voice`` and push the parsed spec
  into frontend state on every response so text and binding stay
  consistent.
* **Test coverage for the two paths Copilot flagged.** Added
  vitest coverage for (a) the three-case ``/voice`` slash output
  in ``createSlashHandler.test.ts`` and (b) the
  ``applyDisplay → voice.record_key`` hydration + omit-setter
  back-compat paths in ``useConfigSync.test.ts``. Plus regression
  cases for every false-fire scenario above.

Suite: 575/575 green, tsc --noEmit clean.

* fix(tui): address Copilot round-2 review on #19835

Three tightenings on the surface introduced in the round-1 fix:

* **``/voice tts`` reset custom bindings to Ctrl+B.** The ``tts`` branch
  of ``voice.toggle`` omitted ``record_key`` from its response, so the
  frontend's ``r.record_key ?? 'ctrl+b'`` coerced a user's custom
  binding back to the default on every TTS toggle. Two-sided fix:
  the backend now includes ``record_key`` on the ``tts`` branch (parity
  with ``status``/``on``/``off``), and the slash handler only pushes
  frontend state when the response actually carries ``record_key`` —
  belt-and-suspenders against any future branch forgetting to include
  it.

* **``super+b`` / ``win+b`` / ``cmd+b`` displayed "Cmd+B" on Linux and
  Windows.** ``formatVoiceRecordKey`` rendered ``mod === 'super'`` as
  ``Cmd`` universally, which told non-mac users the wrong modifier to
  press even though ``isVoiceToggleKey`` matched the right event bits.
  Gate the label to ``isMac`` so non-mac renders ``Super+B``.

* **``control+b`` / ``ctrl + b`` lost the macOS Cmd+B fallback.**
  ``_isDefaultVoiceKey`` keyed off ``parsed.raw`` — so
  semantically-equal aliases of the documented default dropped into
  the strict branch even though they bind Ctrl+B. Compare on the
  parsed spec (mod + ch + named) instead.

Coverage added: Linux ``Super+B`` rendering (and macOS ``Cmd+B``),
``control+b`` / ``ctrl + b`` accepting the Cmd+B fallback on darwin,
``/voice tts`` without ``record_key`` not clobbering cached binding,
and a backend regression asserting every ``voice.toggle`` branch
carries the configured key.

Suite: 579/579 TUI vitest green, 2/2 backend voice tests green,
tsc --noEmit clean.

* fix(tui): address Copilot round-3 review on #19835

Three classes of robustness issue caught on the second pass — all
revolve around malformed YAML tipping ``parseVoiceRecordKey`` or
``_voice_record_key`` into a crash instead of the documented
fallback.

* **Parser crashed on non-string YAML scalars.** ``config.get full``
  returns raw ``yaml.safe_load`` output, so ``voice.record_key: 1``
  or ``voice.record_key: true`` in a hand-edited config would hit
  ``.trim()`` on a number/bool and throw, breaking startup and
  every mtime re-apply. Accept ``unknown`` at the signature, guard
  with ``typeof raw !== 'string'``, and fall back to the default.

* **Backend blew up on non-dict ``voice:``.** Same YAML hazard on
  the gateway side: ``voice: true`` / ``voice: cmd+b`` left
  ``_load_cfg().get("voice")`` as a bool/str, so ``.get("record_key")``
  raised AttributeError and took every ``voice.toggle`` branch down
  with it. Centralised the lookup in a single
  ``_voice_record_key()`` helper that ``isinstance``-guards both
  ``voice`` and ``record_key`` and falls back to ``ctrl+b``.

* **Multi-modifier chords silently dropped extras.** The previous
  validator only checked the first modifier token, so ``ctrl+alt+r``
  silently parsed as ``ctrl+r`` and ``cmd+ctrl+b`` as ``super+b`` —
  a typo bound a different shortcut than the user configured.
  Reject multi-modifier spellings outright; the classic CLI only
  supports single-modifier bindings via prompt_toolkit's ``c-x`` /
  ``a-x`` rewrite, so this matches CLI parity.

Coverage added:

* ``parseVoiceRecordKey`` fallback on ``1`` / ``true`` / ``null`` /
  ``undefined`` / ``{}``.
* ``parseVoiceRecordKey`` fallback on ``ctrl+alt+r`` /
  ``cmd+ctrl+b`` / ``alt+ctrl+space``.
* ``test_voice_toggle_handles_non_dict_voice_cfg`` exercises
  every non-dict ``voice:`` shape (bool, str, None, int, list) and
  asserts each falls back to ``record_key: 'ctrl+b'``.

Suite: 581/581 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.

* fix(tui): address Copilot round-4 review on #19835

Four final corners of the voice.record_key surface:

* **Bare-char configs silently coerced to ``ctrl+<key>``.** A config
  like ``voice.record_key: o`` / ``space`` / ``escape`` fell through
  to the default ``mod = 'ctrl'`` and silently bound Ctrl+O, while
  the classic CLI's prompt_toolkit would bind the raw key (no
  rewrite) — so the two runtimes silently disagreed on what "o"
  means. Require an explicit modifier; bare-char configs fall back
  to the documented Ctrl+B default.

* **Reserved ctrl+<letter> bindings would never fire.**
  ``useInputHandlers()`` intercepts ``ctrl+c`` (interrupt),
  ``ctrl+d`` (quit), and ``ctrl+l`` (clear screen) before the voice
  check runs, so those configs would be advertised in /voice
  status but the advertised shortcut never actually triggers
  push-to-talk. Added ``_RESERVED_CTRL_CHARS`` at parse time so
  the user gets the documented default instead of a dead shortcut.
  (``alt+c``, ``cmd+l``, etc. are not intercepted and stay usable.)

* **``_load_cfg()`` root itself may be a non-dict.**
  ``_voice_record_key()`` isinstance-guarded the ``voice`` subkey
  but not the root — a malformed config.yaml that collapsed to a
  scalar/list at the top level (``config.yaml: true`` or ``[]``)
  would still raise on ``.get("voice")``. Added the top-level
  guard too so every malformed shape falls back to ``ctrl+b``.

* **Stale header comment on ``isVoiceToggleKey``.** The doc-comment
  still claimed "On macOS we additionally accept the platform
  action modifier (Cmd) for the configured letter" even though the
  implementation gates the Cmd fallback to the documented default
  only. Rewrote to match.

Coverage added:

* ``parseVoiceRecordKey`` fallback on bare chars (``o``, ``b``,
  ``space``, ``escape``).
* ``parseVoiceRecordKey`` fallback on ``ctrl+c`` / ``ctrl+d`` /
  ``ctrl+l``; positive case for ``alt+c`` / ``cmd+l`` still usable.
* Backend ``test_voice_toggle_handles_non_dict_voice_cfg`` now
  exercises 5 non-dict shapes at the YAML root too.

Suite: 583/583 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.

* fix(tui): address Copilot round-5 review on #19835

Three follow-ups on the voice matcher's modifier + shift discipline:

* **``super`` branch falsely fired on Alt+<key> / bare Esc on macOS.**
  ``isVoiceToggleKey`` accepted ``isMac && key.meta`` as a Cmd
  fallback for the ``super`` modifier — but hermes-ink sets
  ``key.meta`` for plain Alt/Option AND for bare Escape on some
  macOS terminals. A ``cmd+b`` config silently fired on Alt+B;
  ``cmd+space`` on Alt+Space; ``cmd+escape`` on bare Esc. Drop the
  fallback and require the literal ``key.super`` bit. Legacy-
  terminal users who need Cmd should upgrade to a kitty-protocol
  terminal or bind ``alt+X`` explicitly.

* **Shift bit was never checked.** The parser rejects multi-
  modifier configs like ``ctrl+shift+tab``, but the runtime
  matcher didn't check ``key.shift`` — so ``ctrl+tab`` also fired
  on Ctrl+Shift+Tab and ``alt+enter`` on Alt+Shift+Enter.
  Early-return on ``key.shift === true`` so the runtime only fires
  the exact chord the user configured.

* **Test leaked ``HERMES_VOICE=1`` into later tests.**
  ``voice.toggle`` action=on writes to ``os.environ`` directly
  (CLI parity, runtime-only flag); ``test_voice_toggle_returns_
  configured_record_key`` dispatched action=on without letting
  monkeypatch take ownership of the var first. Any later test
  that read voice mode in the same Python process could inherit a
  stale enabled state. Added ``monkeypatch.setenv("HERMES_VOICE",
  "0")`` up front so monkeypatch restores the original value at
  teardown.

Coverage added:

* ``cmd+b`` / ``cmd+space`` / ``cmd+escape`` do NOT fire on
  ``key.meta``-only events on darwin.
* ``ctrl+tab`` / ``alt+enter`` / ``ctrl+o`` reject matches when
  ``key.shift`` is held; sanity cases without Shift still fire.

Suite: 585/585 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.

* fix(tui): address Copilot round-6 review on #19835

Three classes of modifier-discipline tightening + one config-surface
honesty fix:

* **Default ``ctrl+b`` Cmd fallback leaked Alt+B.** The default's
  macOS Cmd+B muscle-memory path used ``isActionMod(key)``, which
  returns ``key.meta || key.super`` on darwin. hermes-ink also
  reports plain Alt as ``key.meta``, so Alt+B silently fired the
  default binding. Replaced with strict ``isMac && key.super ===
  true`` — kitty-style Cmd+B still works, Alt+B correctly
  rejected. Legacy-terminal mac users (Terminal.app without
  CSI-u) now get raw Ctrl+B only; the documented default still
  works everywhere.

* **ctrl / super branches accepted extra modifier bits.** The
  parser rejects multi-modifier configs like ``ctrl+alt+o``, but
  the runtime matcher was permissive — ``ctrl+o`` fired on
  Ctrl+Alt+O / Ctrl+Cmd+O, and ``super+b`` fired on Cmd+Alt+B /
  Ctrl+Cmd+B. Added strict ``!key.alt && !key.meta && key.super
  !== true`` on ctrl, and ``!key.ctrl && !key.alt && !key.meta``
  on super, so the runtime only fires the exact chord the parser
  would let you configure.

* **Dropped ``cmd`` / ``command`` aliases.** They parsed to
  ``super`` and rendered as ``Cmd+X``, but legacy macOS terminals
  report Cmd as ``key.meta`` (same signal as Alt), so a
  ``cmd+o`` config was advertised as working but never actually
  fired on Terminal.app-without-CSI-u. That recreated the
  "displayed shortcut does not work" problem this PR was meant to
  remove. Users who want the platform action modifier spell it
  ``super`` / ``win`` — that matches the unambiguous ``key.super``
  bit, and kitty-style macOS terminals render it as ``Cmd+X`` via
  platform-aware formatter.

Coverage updated:

* Default ctrl+b no longer fires on Alt+B via ``key.meta`` leak;
  raw Ctrl+B and kitty-style Cmd+B still fire.
* ``ctrl+o`` rejects Ctrl+Alt+O / Ctrl+Cmd+O / Ctrl+Meta+O chords.
* ``super+b`` rejects Cmd+Alt+B / Cmd+Meta+B / Ctrl+Cmd+B chords.
* ``cmd+b`` / ``command+b`` / ``meta+b`` all fall back to the
  documented default at parse time (joined the ambiguous-mac-mod
  rejection class).
* Round-2 expectations that asserted ``cmd+b`` parsed as super
  and accepted ``key.meta`` on darwin updated to reflect the new
  stricter contract.

Suite: 588/588 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.

* fix(tui): address Copilot follow-up on wire typing + escape precedence

Two follow-ups from the latest Copilot pass:

* **Config wire typing honesty (`gatewayTypes.ts`)**
  `config.get full` forwards raw `yaml.safe_load()` output, so
  `voice.record_key` can be any scalar/container when hand-edited.
  Typing it as `string` suggests a normalized contract that the
  backend does not guarantee and makes unsafe callers more likely.
  Change `ConfigVoiceConfig.record_key` to `unknown` with an
  explicit comment that callers must normalize at runtime.

* **Escape-based voice bindings were swallowed before voice check**
  `useInputHandlers()` handled `key.escape` for queue-edit cancel and
  selection clear before `isVoiceToggleKey(...)`, so configured
  `ctrl+escape` / `alt+escape` / `super+escape` chords were advertised
  but never toggled recording in those UI states.
  Add an early escape+voice check before generic Esc handlers so
  escape-based voice bindings win when configured, while plain Esc
  behavior remains unchanged.

Also updated PR #19835 description text to remove stale cmd/command
alias claims and match the current parser contract.

* fix(tui): pass configured voice shortcut through TextInput layer

Thread the live parsed voiceRecordKey into TextInput so configured voice.record_key chords bubble to useInputHandlers instead of being consumed as editor input. This removes the last hardcoded Ctrl+B pass-through in the composer path while preserving existing global control chord behavior.

* fix(tui): require explicit alt bit for escape-based alt chords

Hermes-ink reports bare Escape as meta=true+escape=true on some terminals, so a configured alt+escape binding was firing on bare Esc. Require an explicit key.alt bit when the configured named key is escape so plain Esc stays plain Esc; kitty-style alt+escape still fires.

* fix(tui): harden voice.record + TextInput paste + super-mod reserved list

Three round-7 Copilot follow-ups on #19835:

- voice.record start handler used _load_cfg().get('voice', {}).get(...) without
  shape checks, so malformed YAML (bool/scalar/list) returned 5025 instead of
  using VAD defaults. Centralized _voice_cfg_dict() helper and type-guarded
  silence_threshold/silence_duration with numeric fallbacks.
- TextInput pass-through check moved above paste/copy handling so configured
  voice chords (ctrl+v / alt+v / cmd+v) beat the composer's paste/copy
  defaults.
- parser now also rejects super+{c,d,l,v} — on macOS those are
  copy/exit/clear/paste and would be advertised in /voice status but never
  actually toggle recording.

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* fix(tui): round-8 Copilot review — allow ctrl+x, gate super reservations to macOS, preserve voice key on transient RPC failure

Three round-8 Copilot follow-ups on #19835:

- Revert ctrl+x addition to _RESERVED_CTRL_CHARS (landed via Copilot Autofix
  commit 731ec86): ctrl+x is only claimed during queue-edit
  (queueEditIdx !== null), so voice works the rest of the session and
  matches CLI ctrl+<letter> parity.
- Gate super+{c,d,l,v} reservation to isMac. Linux/Windows TUI globals key
  off Ctrl, so kitty/CSI-u super+<letter> configs don't collide on non-mac
  and should stay usable.
- applyDisplay() now skips setVoiceRecordKey when cfg is null so one
  transient quietRpc() failure after a config edit doesn't clobber the
  cached binding back to Ctrl+B until the next successful poll.

New coverage:
- parseVoiceRecordKey preserves ctrl+x on linux
- super+{c,d,l,v} rejected on darwin, allowed on linux
- applyDisplay(null, ...) leaves voiceRecordKey untouched

* fix(cli,tui): normalize voice.record_key aliases across CLI + TUI for parity

Round-9 Copilot review on #19835: TUI accepted control+/option+/opt+/super+/win+ aliases but the classic CLI only rewrote literal ctrl+/alt+ before handing to prompt_toolkit, so a TUI-valid config silently bound a different (or no) shortcut in the CLI.

- Added normalize_voice_record_key_for_prompt_toolkit() in hermes_cli/voice.py with a single alias table (ctrl/control/alt/option/opt → c-/a-).
- Wired it into all three cli.py sites (_enable_voice_mode hint, _show_voice_status display, and the prompt_toolkit binding in _register_voice_handler).
- /voice status display now renders control+x as Ctrl+X and option+x as Alt+X (canonical casing) to match TUI formatVoiceRecordKey.
- super/win/windows are intentionally left unchanged: prompt_toolkit has no super modifier, so the CLI will reject them loudly at startup rather than silently binding Ctrl+B. Documented this split at both the TUI _MOD_ALIASES comment and the CLI normalizer docstring.
- Added tests covering ctrl/control/alt/option/opt mapping, case-insensitivity, non-string fallback, empty-string fallback, and super/win pass-through.

* fix(cli): port TUI parser contract into CLI voice.record_key normalizer

Round-10 Copilot review on #19835.

hermes_cli/voice.py's normalize_voice_record_key_for_prompt_toolkit() previously did blind substring replacement with no trim/validate step, so the CLI diverged from the TUI parser on:
- whitespace ('ctrl + b' -> 'c- b' instead of 'c-b')
- typoed named keys ('ctrl+spcae' passed through as 'c-spcae' and prompt_toolkit would reject at startup)
- bare-char configs ('o' should fall back, not pass through as 'o')
- multi-modifier chords ('ctrl+alt+r')
- reserved ctrl chars ('ctrl+c/d/l')
- unknown modifiers ('meta+b' / 'shift+b')
- named-key aliases ('return'/'esc'/'bs'/'del' not collapsed to prompt_toolkit canonicals)

Port the TUI parser contract into Python (_VOICE_MOD_ALIASES, _VOICE_NAMED_KEYS, _VOICE_RESERVED_CTRL_CHARS) so one config value binds the same shortcut in both runtimes.

Also added format_voice_record_key_for_status() shared between the PTT hint and /voice status display. Non-string scalars (voice.record_key: true / 1) now surface as 'Ctrl+B' instead of the raw scalar — /voice status no longer advertises a shortcut that can never bind.

Tests: 29/29 in test_voice_wrapper.py, including 11 new regressions covering whitespace, named-key aliases, typos, bare-char, multi-modifier, reserved ctrl, unknown mods, non-string fallback, and formatter contract.

* fix(cli): shape-safe voice config read + graceful super/win fallback

Round-11 Copilot review on #19835.

Two remaining cross-runtime gaps:

1. load_config().get('voice', {}) still assumed voice was a dict, so a hand-edited voice: true / voice: cmd+b at the top level raised AttributeError before the voice UI could start. Added voice_record_key_from_config(cfg) to hermes_cli/voice.py that isinstance-guards both the root and the voice subkey. All three cli.py read sites (_enable_voice_mode hint, _show_voice_status, PTT binding) now use it.

2. The CLI normalizer previously passed super+/win+/windows+ through unrewritten so prompt_toolkit would reject them loudly at startup — but that crash was a worse UX than a silent fallback. Normalizer now returns c-b for those spellings, and the PTT binding site logs a warning so users see why their TUI-only shortcut isn't binding in the CLI.

Coverage: 34/34 in tests/hermes_cli/test_voice_wrapper.py (5 new cases for voice_record_key_from_config + malformed-root + malformed-voice + extractor/normalizer composition).

* fix(cli): self-audit cleanup — remaining voice-config shape safety + doc drift

Self-review of the voice.record_key change set turned up four remaining items Copilot would very likely flag next round:

1. cli.py _voice_start_continuous still read load_config().get('voice', {}).get('silence_threshold') without an isinstance guard, so a hand-edited voice: true / voice: cmd+b (non-dict) raised AttributeError on VAD recording start. Shape-safe coerce the voice dict and numeric-guard silence_threshold/silence_duration.

2. cli.py _enable_voice_mode's auto_tts check had the same bug — fixed with the same isinstance guard.

3. hermes_cli/voice.py module comment on _VOICE_MOD_ALIASES still said super/win/windows 'pass through unchanged and prompt_toolkit's add() call loudly rejects them at startup'. Round 11 changed the normalizer to silently fall back to c-b with a warning at the binding site; updated the comment to match.

4. ui-tui/src/lib/platform.ts header comment had the same stale 'CLI will loudly reject them at startup' claim; updated to 'falls back to the documented default and logs a warning'.

No behavior change on the code paths already covered by test_voice_wrapper.py; the two cli.py fixes are defensive against malformed YAML that previous rounds already hardened in tui_gateway/server.py but missed in the classic CLI.

* fix(cli,tui): round-12 Copilot review — alt-collide on mac, bool-in-int guards, voice UI hardcodes, mtime-reload test

Five round-12 Copilot review items on #19835:

1. platform.ts: hermes-ink reports Alt as key.meta on many terminals; isActionMod on darwin accepts key.meta as the action modifier. So alt+c/d/l get claimed by isCopyShortcut / isAction('d')/'l') before the voice check. Reject those configs at parse time on macOS only (non-mac keeps them usable).

2. cli.py: four remaining hardcoded 'Ctrl+B' sites in voice-facing UI (_get_voice_status_fragments status bar, _voice_start_recording hints, _get_placeholder composer text) were still lying about non-default configs. Added self._voice_record_key_label() shared helper and wired it into all three sites.

3. server.py + cli.py: bool is a subclass of int, so isinstance(silence_threshold, (int, float)) accepted True/False from malformed YAML and forwarded 1/0 to the VAD engine. Exclude bool explicitly so boolean typos fall back to the documented 200 / 3.0 defaults.

4. useConfigSync.ts: extracted the config.get-full fetch+apply body into a shared hydrateFullConfig() helper. Both the initial hydration and mtime-reload paths now use it, so the polling/RPC wiring is exercised by direct unit tests (4 new cases: fresh apply, reapply on new value, transient RPC failure preserves cache, back-compat without voice setter).

5. Added alt+{c,d,l} rejection regressions on darwin + allow on linux, and bool-leak regressions for both silence_threshold and silence_duration in tests/test_tui_gateway_server.py.

Suite: 602/602 TUI vitest, 38/38 backend voice tests, typecheck + lints clean.

* fix(cli): cache voice record-key label at binding time + status-bar coverage

Round-13 Copilot review on #19835.

_voice_record_key_label() was reading live config on every render, which caused two problems:

1. prompt_toolkit registers the push-to-talk binding once at session start (@kb.add(_voice_key)); the binding does NOT re-read config. Editing voice.record_key mid-session would switch the status-bar / placeholder / recording-hint label to the new shortcut while the actual keybinding stayed on the startup chord — reintroducing the display/binding drift this whole PR is fighting.

2. Hot render path: during recording the UI is invalidated every 150ms, so re-loading + deep-merging config on every call added avoidable UI overhead.

Fix: cache the label at the same site that registers the prompt_toolkit binding via new set_voice_record_key_cache(raw_key). _voice_record_key_label() now just returns the cached value (falls back to 'Ctrl+B' before startup). Status/placeholder/hint are always in sync with the live binding; no config reload per render.

Also added 4 regression cases to tests/cli/test_cli_status_bar.py: configured ctrl+<letter> renders in both wide and compact status bars, configured named key (ctrl+space) renders in the recording hint, pre-startup absent cache falls back to Ctrl+B, and malformed configs (bool True) fall through the formatter to Ctrl+B.

Suite: 60/60 test_cli_status_bar + test_voice_wrapper, typecheck + lints clean.

* fix(cli): route /voice on + /voice status through startup-pinned label; mac alt+cdl parity

Round-14 Copilot review on #19835. All three comments legit:

1. _enable_voice_mode still formatted label from live load_config() — mid-session config edit would make /voice on announce the new shortcut while the prompt_toolkit binding stayed the startup chord. Use self._voice_record_key_label() (cached at binding time, round-13) so /voice on cannot drift from the live binding.

2. _show_voice_status had the same bug — /voice status reported live config instead of the pinned startup binding. Fixed the same way.

3. CLI normalizer accepted alt+c/alt+d/alt+l even though the TUI parser rejects them on macOS (Copilot round-12 — hermes-ink reports Alt as key.meta, isActionMod on darwin accepts it, collides with isCopyShortcut / isAction). Added _VOICE_RESERVED_ALT_CHARS_MAC = {c,d,l} gated to sys.platform == 'darwin' so a shared config like option+c falls back to c-b on both runtimes on macOS; non-mac still binds a-c.

Coverage: 4 new tests in test_voice_wrapper.py covering mac alt+cdl rejection, linux alt+cdl allowed, option/opt alias forms, and mac-specific exclusions for other alt letters. 62/62 in voice wrapper + status bar suites.

---------

Co-authored-by: Tranquil-Flow <tranquil_flow@protonmail.com>
Co-authored-by: asheriif <ahmedsherif95@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-05-04 15:49:28 -07:00
kshitij 109c3e468c fix(terminal): guard background process spawn against deleted cwd (#19933)
Follow-up to #19928 which fixed the foreground path in _run_bash.
The background process spawn in process_registry.py had the same
vulnerability: Popen(cwd=session.cwd) and PtyProcess.spawn(cwd=...)
would raise FileNotFoundError if the directory was deleted.

Apply _resolve_safe_cwd() at session creation time so both the PTY
and pipe-mode Popen paths receive a validated cwd.
2026-05-04 15:35:34 -07:00
briandevans 9fa3a093f2 fix(local): test root as ancestor candidate; use real pipe for fake stdout
Address Copilot review on PR #17569:

1. _resolve_safe_cwd never tested the filesystem root because the loop
   exited when `os.path.dirname(parent) == parent`, which is true once
   `parent == '/'`. Restructure so the root is checked before the
   self-equal exit. Adds `test_returns_root_when_only_root_exists` —
   regression-guarded by reverting the loop and watching it fail.

2. The fake `Popen.stdout` was a `MagicMock`; `BaseEnvironment._wait_for_process`
   calls `proc.stdout.fileno()` then `select.select`/`os.read` against it,
   which raised `TypeError: fileno() returned a non-integer` (visible as a
   thread exception in test output) and could in theory read from an
   unrelated real fd. Hand `fake_popen` a real `os.pipe()` with the write
   end pre-closed so the drain loop sees EOF immediately. Helper records
   each fd so the test cleans up after itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:31:47 -07:00
briandevans 9644b8ae67 fix(local): recover when persistent_shell cwd is deleted (#17558)
When a tool call deletes its own working directory (`cd /tmp/foo &&
rm -rf /tmp/foo`), the next `subprocess.Popen(args, cwd=self.cwd)` raised
`FileNotFoundError: [Errno 2]` before bash even started — every subsequent
terminal/file-tool call hit the same wedge until the gateway restarted.

Fix in `LocalEnvironment._run_bash`: before handing `self.cwd` to Popen,
resolve a safe alternative when the path is gone (walk up to the nearest
existing ancestor, falling back to `tempfile.gettempdir()` only as a last
resort). Log a warning so the recovery is visible — not silent — and
update `self.cwd` so the next call doesn't repeat the message.

Defense in depth in `LocalEnvironment._update_cwd`: only adopt the new
cwd when it still exists as a directory. `pwd -P` from a deleted cwd can
leave a stale value in the marker file; refusing to store a missing path
keeps `self.cwd` valid by construction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:31:47 -07:00
Teknium b8fb9270c4 refactor(cli): drop dead c-S-c key binding (follow-up to #19895) (#19919)
#19884 added a prompt_toolkit key binding for Ctrl+Shift+C to
"prevent Hermes from intercepting the keystroke as an interrupt
signal." #19895 then wrapped the binding in try/except after
discovering it crashed startup with ValueError on every platform.

Both PRs were based on a misreading of how terminal key events
propagate:

1. Terminal emulators (GNOME Terminal, iTerm2, kitty, Windows Terminal,
   etc.) intercept Ctrl+Shift+C before the keystroke reaches the
   application's stdin. prompt_toolkit never sees it. The binding
   could never have intercepted anything.

2. prompt_toolkit's key spec parser doesn't recognise 'c-S-c' on any
   platform — the Shift modifier is meaningless on control-sequence
   keys. Verified: every prompt_toolkit version raises 'Invalid key:
   c-S-c' at registration time.

The handler is dead code. Delete it and leave a comment explaining
why no binding is needed here. Ctrl+Q alias (#19884's other addition)
stays — that's a real prompt_toolkit key and a legitimate interrupt
shortcut.

Verified the CLI starts cleanly — key binding phase no longer raises
and the subsequent chat flow reaches the provider setup check without
error.
2026-05-04 14:49:38 -07:00
Teknium 56a78e74b2 feat(kanban-dashboard): sharper home-channel toggle contrast, drop → running action (#19916)
Follow-up polish to the kanban dashboard from #19864 and #19705.

**Home-channel toggle contrast.** The `.hermes-kanban-home-sub--on`
class previously used `color-mix(var(--color-ring) 14%, transparent)`
which was nearly invisible on both the default teal and NERV themes —
the on/off distinction relied almost entirely on the ✓ prefix glyph.
Bump to 32% fill + full-opacity ring border + inner ring shadow +
font-weight 600. Still theme-scoped (no hardcoded colors), but reads
at a glance on both tested themes.

**Drop the → running status action.** Since #19705, `PATCH /tasks/:id`
rejects `status=running` with HTTP 400 — only the dispatcher's
`claim_task` path legitimately enters that state (so the run row,
claim lock, and worker PID are created atomically). The UI button was
still present and produced a 400 on click, which is a confusing dead
affordance. Remove it from `StatusActions`; add a comment pointing to
#19535 so future editors know why it's missing.

Live-tested on the default Hermes Teal theme. 53/53 kanban dashboard
plugin tests still pass.
2026-05-04 14:48:19 -07:00
nftpoetrist 429b8eceb4 fix(cli): guard c-S-c key binding with try/except to prevent startup crash (#19895)
PR #19884 added @kb.add('c-S-c') unconditionally. prompt_toolkit raises
ValueError("Invalid key: c-S-c") during HermesCLI.__init__ on platforms
where this key spec is not recognised — the process exits before reaching
the prompt loop. Reported on macOS (#19894) and Linux (#19896) immediately
after #19884 landed.

Fix: wrap the registration in try/except ValueError so that startup
continues cleanly on any platform/version that rejects the spec. Where
the spec is accepted the binding is registered normally as a no-op,
allowing the terminal to handle Ctrl+Shift+C natively as before.

Fixes #19894
Fixes #19896
2026-05-04 14:45:01 -07:00
Rames Jusso e493b1c482 docs(skill): add hyperframes inspect command to cli.md + SKILL.md
- references/cli.md: add Inspect step (5/7) to Workflow + dedicated `## inspect` section between validate and preview, covering --json/--samples/--at flags and the legacy `hyperframes layout` alias
- SKILL.md: rename procedure step 7 to "Lint, validate, inspect, preview, render" with the full pipeline; explain inspect as the layout-side companion to validate (catches overflow / off-frame / occluded text issues that static lint can't see)
- SKILL.md verification: lint + validate + inspect as a single combined pass
- SKILL.md References list: include `inspect` in the cli.md command list

Brings the optional skill in sync with hyperframes-oss main as of 2026-05-03 — `inspect` was added in heygen-com/hyperframes#480 (2026-04-25) and is documented as a real workflow step in skills/hyperframes-cli/SKILL.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:13:17 -07:00
James 20859cc408 docs(skill): sync hyperframes skill with upstream changes
Pulls the hyperframes skill up to the latest state of heygen-com/hyperframes
skill content. Opened 2026-04-17; upstream has shipped CLI, layout, and path
changes since.

- SKILL.md: promote the visual-style check to a proper HARD-GATE
  (DESIGN.md > named style > ask 3 questions, with the #333/#3b82f6/Roboto
  tells); expand Step 6 to cover audio-reactive (mandatory per-frame
  tl.call() sampling loop — a single long tween does NOT react to audio),
  caption exit guarantee (hard tl.set kill after group.end), marker
  highlighting, and scene transitions; add the animation-map script to
  Verification; link the new features.md.

- references/cli.md: add capture and validate (both shipped commands, both
  referenced from the workflow but missing from the reference). Add
  --lang to tts with the voice-prefix auto-inference table and espeak-ng
  dependency note (heygen-com/hyperframes#351, 2026-04-20 — after this
  PR opened).

- references/website-to-video.md: update all paths to the capture/
  subfolder layout introduced in heygen-com/hyperframes#345
  (capture/screenshots/, capture/assets/, capture/extracted/tokens.json).
  Old captured/ prefix was broken — agents following the skill were
  looking for files in wrong locations.

- references/features.md (new): distilled coverage for captions (language
  rule, tone table, word grouping, fitTextFontSize, exit guarantee), TTS
  (multilingual phonemization, speed tuning), audio-reactive (data
  format, mapping table, sampling pattern), marker highlighting
  (highlight/circle/burst/scribble/sketchout), and transitions (energy/
  mood tables, presets, shader-compatible CSS rules). Five topics the
  original PR didn't cover.
2026-05-04 14:13:17 -07:00
James 50aabb9eb2 feat(skill): add hyperframes optional creative skill
Adds an optional creative skill that integrates HyperFrames, an
HTML-based video rendering framework, as a sibling to manim-video.
Complements manim's math-focused animation with motion-graphics,
captioned narration, audio-reactive visuals, shader transitions, and
website-to-video production.

Scope:
- optional-skills/creative/hyperframes/SKILL.md      — entry point
- references/composition.md                          — data-attr schema, timeline contract
- references/cli.md                                  — every npx hyperframes command
- references/gsap.md                                 — GSAP core API for compositions
- references/website-to-video.md                     — 7-step capture-to-video workflow
- references/troubleshooting.md                      — OpenClaw / Chromium 147 fix
- scripts/setup.sh                                   — idempotent one-time setup

OpenClaw / Chromium 147 fix (hyperframes#294):
Pinning hyperframes@>=0.4.2 (commit 4c72ba4 ships the
HeadlessExperimental.beginFrame auto-detect + screenshot fallback).
setup.sh pre-caches chrome-headless-shell so the fast BeginFrame path
is preferred over system Chrome. The PRODUCER_FORCE_SCREENSHOT=true
escape hatch is documented in troubleshooting.md and in SKILL.md
Pitfalls.

Placed under optional-skills/ (not bundled) per CONTRIBUTING.md
guidance for heavyweight deps: requires Node.js >= 22, FFmpeg, and
~300 MB chrome-headless-shell download.
2026-05-04 14:13:17 -07:00
Teknium 8fabef9d35 fix(docs): register cron-script-only guide in sidebar (#19893)
PR #19709 added website/docs/guides/cron-script-only.md but never added the entry to website/sidebars.ts, which is explicitly enumerated (not autogenerated). Two consequences:

1. The guide didn't show up in the left-nav "Guides & Tutorials" list — users could only reach it via cross-links from other pages.
2. Landing on the guide page directly made the sidebar disappear entirely (Docusaurus treats unregistered docs as orphaned and renders them without their parent sidebar).

Added 'guides/cron-script-only' next to 'guides/automate-with-cron' so it slots in alongside the other cron content. Verified with `npm run build`: no orphan warnings, no broken links, page builds with sidebar intact.

No content change, docs only.
2026-05-04 12:57:01 -07:00
briandevans 81cd678291 fix(google-workspace): restore required_credential_files in SKILL.md (#16452)
PR #9931 ("feat(google-workspace): add --from flag for custom sender display name")
accidentally removed the required_credential_files frontmatter block that tells
hermes to bind-mount google_token.json and google_client_secret.json into Docker
and Modal remote terminals before running setup.py.

Without this header the credential files are never registered in the session-scoped
ContextVar, so get_credential_file_mounts() returns an empty list at container
creation time and the OAuth files are invisible inside the sandbox.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:43:14 -07:00
briandevans 60b143e9df fix(tui_gateway): guard sys.path against local package shadowing (#15989)
When the TUI backend (tui_gateway/entry.py) is spawned by Node.js with the
user's CWD containing a local utils/ directory, that directory shadows the
installed utils module, causing ImportError in run_agent and hermes_cli.

Strip '' and '.' from sys.path and prepend HERMES_PYTHON_SRC_ROOT (already
set by hermes_cli before spawning the subprocess) so installed packages
always win over CWD artifacts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 12:42:43 -07:00
Harry Riddle 645a2f482d fix(cli): fix shortcut config conflict in hermes_cli 2026-05-04 12:41:05 -07:00
Steven Chanin a919269eb5 fix(skills/email/himalaya): document v1.2.0 folder.aliases syntax
The bundled himalaya skill documented folder aliases using a stale
TOML schema (`[accounts.NAME.folder.alias]`, singular) that himalaya
v1.2.0 silently ignores. The TOML parses without error, but the
alias resolver never reads the sub-section — every lookup then falls
through to the canonical folder name.

Source: in `pimalaya/core` (the `email-lib` crate himalaya v1.2.0
depends on, currently v0.27.0), `email/src/folder/config.rs` defines
`FolderConfig { aliases: Option<HashMap<String, String>>, ... }`
(plural, no `#[serde(rename)]`/`alias` aliases, no
`deny_unknown_fields`), and `account/config/mod.rs::get_folder_alias`
returns the input verbatim when no alias is found. So the singular
`alias` key deserializes to nothing and lookups silently fall
through.

On Gmail (where `sent` resolves to `[Gmail]/Sent Mail`, not `Sent`)
this means save-to-Sent fails *after* SMTP delivery already
succeeded, and `himalaya message send` exits non-zero. Any caller
(agent, script, user) that retries on that exit code will re-run
the entire send — including SMTP — producing duplicate emails to
recipients. Silent ignore + caller-level retry is significantly
worse than a config that just doesn't work.

This commit updates SKILL.md and references/configuration.md to the
v1.2.0 `folder.aliases.X` syntax (plural, dotted keys, directly
under the account section), adds a Gmail-specific block with the
`[Gmail]/Sent Mail`-style mapping, and adds notes on the failure
mode so future readers don't hit the same trap. SKILL.md version
bumped 1.0.0 → 1.1.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:39:49 -07:00
Teknium 9cda237bb1 docs(cron): lead with agent-driven setup for no-agent mode (#19871)
The shipped no-agent docs introduced the feature via CLI first and
mentioned the chat path as a two-line afterthought. That buries the
actual value prop: the cronjob tool exposes no_agent directly to the
agent, so a user can describe a watchdog in plain language and Hermes
wires up the script + schedule + delivery without anyone opening an
editor.

Changes:

* cron-script-only.md: promote 'Create One from Chat' above
  'Create One from the CLI', flesh it out with a worked transcript
  (the actual tool calls the agent makes), add subsections covering
  'what the agent decides for you' (when to pick no_agent=True vs
  LLM mode) and 'managing watchdogs from chat' (pause/resume/edit/
  remove all agent-accessible).

* user-guide/features/cron.md:
  - Add 'no-agent mode' to the top-level feature list with a cross-
    link, plus a sentence up top making it clear everything is
    agent-accessible through the cronjob tool.
  - Add 'The agent sets these up for you' subsection to the no-agent
    section showing the exact tool call shape.

* automate-with-cron.md: tighten the existing tip box to mention the
  agent-driven path, not just CLI scheduling.

No behavior change — docs only.
2026-05-04 12:39:19 -07:00
briandevans eadf34633e fix(models): strip :cloud/-cloud suffix from models.dev Ollama Cloud IDs
models.dev appends :cloud and -cloud suffixes to Ollama Cloud model IDs
(e.g. kimi-k2.6:cloud, qwen3-coder:480b-cloud) that the live Ollama Cloud
API does not use. Without normalisation, these suffixed IDs bypass the
dedup check and appear alongside the correct clean IDs, causing 400/404
errors when users select them in /model or hermes model.

Add _strip_ollama_cloud_suffix() and apply it to mdev entries before the
dedup merge in fetch_ollama_cloud_models() so all model IDs stored in the
disk cache use the canonical form the API accepts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:38:15 -07:00
Yoimex c050ee6573 fix(file_ops): resolve search_files path/line collision for hyphenated numeric filenames 2026-05-04 12:37:47 -07:00
Ricardo-M-L fbc477df71 fix(run_agent): acquire lock in IterationBudget.used property
The `used` property was reading `self._used` without holding the lock,
while `consume()`, `refund()`, and `remaining` all properly acquire
`self._lock` before accessing `_used`. This means a concurrent call to
`used` during `consume()` or `refund()` could observe a partially-
updated value, leading to incorrect iteration budget metrics reported
to the gateway, or in extreme cases a ValueError from CPython's list
implementation when the internal array resizes during iteration.

Fix: acquire the lock in `used` just like `remaining` does.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-04 12:37:28 -07:00
ClawdIA 64ad7dec0d fix(file-ops): allow file search in hidden roots 2026-05-04 12:37:09 -07:00
briandevans 9e2628ee7c test(discord): annotate make_attachment content_type as Optional[str]
Copilot review: the helper accepted None in one test but was annotated str.
Matches actual usage where no-content-type attachments are a tested scenario.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:36:47 -07:00
Ioodu 1c7f47a58c fix(cron): add concurrency regression test for parallel job state writes
get_due_jobs() called load_jobs() and save_jobs() without holding
_jobs_file_lock, creating a race with the locked mark_job_run() and
advance_next_run(). Wrap get_due_jobs() with the lock (delegating to a
new _get_due_jobs_locked() inner function) so all load→modify→save
cycles are serialised. Add two regression tests: one verifying 3
concurrent mark_job_run() calls each land their correct last_status and
last_run_at without overwrites, and a stress test confirming 10 parallel
calls each increment their job's completed count to exactly 1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:36:29 -07:00
lhysdl 6875471916 fix(tts): update MiniMax API endpoint to v1/text_to_speech
MiniMax deprecated the old v1/t2a_v2 endpoint (api.minimax.io) and
moved to v1/text_to_speech (api.minimax.chat). The new API:

- Uses a flat payload: {model, text, voice_id} instead of nested
  voice_setting / audio_setting objects
- Returns raw audio bytes (Content-Type: audio/mpeg) instead of
  JSON with hex-encoded audio
- Uses model 'speech-01' instead of 'speech-2.8-hd'
- Updated default voice_id to 'female-shaonv' for Chinese TTS

The implementation detects Content-Type to handle both old and new
API responses, maintaining backward compatibility for any users who
manually configured the legacy base_url.
2026-05-04 12:36:09 -07:00
briandevans 75bce317a3 fix(cron): expand \${VAR} refs in config.yaml during job execution (#15890)
The cron scheduler's run_job() loaded config.yaml with yaml.safe_load()
but never called _expand_env_vars(), so ${HERMES_MODEL} and similar
references in model:, fallback_providers:, and other config.yaml fields
were forwarded to the LLM API as literal strings, causing HTTP 400 errors.

The normal CLI path has always called _expand_env_vars() via load_config(),
so this was a cron-only gap. The .env load at the top of run_job() already
populates os.environ before config.yaml is read, so the expansion sees the
correct values.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:35:46 -07:00
Albert.Zhou fd9c32c0f2 fix(email): drop non-allowlisted senders before dispatch to prevent mail loops
Add EMAIL_ALLOWED_USERS check in EmailAdapter._dispatch_message()
to silently discard emails from senders not in the allowlist.  This
prevents the adapter from creating thread context and dispatching a
MessageEvent for unauthorized senders, which could race with the
gateway authorization check and result in SMTP replies being sent
despite the handler returning None.

Test: tests/gateway/test_email.py::TestDispatchMessage::test_non_allowlisted_sender_dropped
Test: tests/gateway/test_email.py::TestDispatchMessage::test_allowlisted_sender_proceeds
Test: tests/gateway/test_email.py::TestDispatchMessage::test_empty_allowlist_allows_all
2026-05-04 12:35:22 -07:00
briandevans 20edca75e9 fix(update): sync bundled skills to all profiles, including active (#16176)
`hermes update` iterated only non-active profiles when seeding bundled
skills. `seed_profile_skills()` uses a subprocess with an explicit
HERMES_HOME so it correctly targets any profile path; the `p.name !=
active` filter was the only thing preventing the active profile from
being included, leaving it silently on stale skill content after every
update.

Drop the filter and update the header line from "other profiles" to
"all profiles". The active profile is now seeded on the same path as
every other profile. The earlier `sync_skills()` call (module-level
HERMES_HOME) remains for backward compatibility; the subprocess-based
loop is reliable regardless of which HERMES_HOME the CLI was invoked
with.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 12:34:53 -07:00
jjjojoj 103f51ad34 fix(doctor): check gh auth status when GITHUB_TOKEN absent
hermes doctor showed 'No GITHUB_TOKEN (60 req/hr)' warning even when
users had authenticated via gh auth login. Now falls back to
gh auth status --json authenticated when GITHUB_TOKEN and GH_TOKEN
are both unset.

Fixes #16115
2026-05-04 12:34:31 -07:00
fiver 8ab9f61dcf fix(gateway): preserve WSL interop PATH in systemd units 2026-05-04 12:34:06 -07:00
Teknium d90f73bcec fix(gateway): use git HEAD SHA, not file mtimes, for stale-code check (#19740)
The stale-code self-check (Issue #17648) used sentinel-file mtimes to
decide whether the gateway survived a `hermes update` with stale
`sys.modules`. That signal false-positives on any write to the
sentinel files — including agent-driven edits during Hermes-on-Hermes
dev sessions. Telling the agent to patch `run_agent.py` would flip
the check to True on the next user message and force a gateway
restart even though no update happened.

Switch the signal to `git rev-parse HEAD`. Agent file edits don't
move HEAD; `hermes update` (git pull) always does. Reading .git/HEAD
directly (no subprocess) with a 5s cache keeps the overhead negligible
on bursty chats. Non-git installs short-circuit to False — the
stale-modules class can't occur without a git-backed update path, so
there's nothing to detect.

The legacy `_compute_repo_mtime` helper is kept but unused by
detection, reserved as a fallback hook for future pip-install update
paths.

- _read_git_head_sha(): resolves HEAD across main checkout, worktree
  (follows `gitdir:` + `commondir` pointers), and packed-refs layouts.
- _current_git_sha_cached(): per-runner 5s SHA cache.
- _detect_stale_code(): boot SHA vs current SHA, returns False when
  either is unavailable.
- Tests cover all four layouts, the agent-edits-don't-trigger
  regression, and cache behavior.

Refs #17648.
2026-05-04 12:33:21 -07:00
Teknium a21f364ad7 chore(release): AUTHOR_MAP entries for Tier 1g salvage batch 2026-05-04 12:32:10 -07:00
Teknium 1c7c7c3c5f feat(kanban-dashboard): per-platform home-channel notification toggles (#19864)
* revert: auto-subscribe gateway chat on tool-driven kanban_create (#19718)

Reverts ff3d2773e2. Teknium reviewed the merged PR and decided this
behavior isn't wanted — tool-driven kanban_create should not mirror
the slash-command path's auto-subscribe. Orchestrators that want
their originating chat notified can call kanban_notify-subscribe
explicitly; we're not going to make it implicit.

* feat(kanban-dashboard): per-platform home-channel notification toggles

Adds a "Notify home channels" section to the task drawer in the kanban
dashboard plugin. Each platform where the user has set a home channel
(/sethome, TELEGRAM_HOME_CHANNEL env var, gateway.platforms.<p>.home_channel
in config.yaml) gets a toggle pill. Toggling on writes a kanban_notify_subs
row keyed to that platform's home (chat_id + thread_id); toggling off
removes it. The existing gateway notifier watcher delivers completed /
blocked / gave_up events without any new plumbing — this is purely a GUI
surface over existing machinery.

Replaces the reverted auto-subscribe behavior from #19718 with an explicit,
per-task, per-platform, user-controlled opt-in. No implicit subscription
on tool-driven kanban_create; no CLI commands; no slash commands. Just a
toggle in the drawer.

Backend (plugins/kanban/dashboard/plugin_api.py):
- GET  /api/plugins/kanban/home-channels[?task_id=X]
  Returns every platform with a configured home, plus a per-entry
  subscribed: bool relative to task_id (false when task_id omitted).
  Reads the live GatewayConfig via load_gateway_config() so env-var
  overlays stay honored.
- POST /api/plugins/kanban/tasks/:id/home-subscribe/:platform
  Idempotent add_notify_sub keyed to the platform's home.
- DELETE /api/plugins/kanban/tasks/:id/home-subscribe/:platform
  remove_notify_sub for the same tuple.
- 404 when the platform has no home configured, or task_id doesn't
  exist (POST only).

Frontend (plugins/kanban/dashboard/dist/index.js):
- TaskDrawer fetches /home-channels on open, keyed on task_id.
- HomeSubsSection renders nothing when zero platforms have a home (so
  users who haven't set one up don't see an empty UI block).
- Optimistic toggle with busy flag + revert-on-failure. One pill per
  platform; ✓ prefix and --on class indicate the subscribed state.

CSS (plugins/kanban/dashboard/dist/style.css):
- .hermes-kanban-home-subs flex row + .hermes-kanban-home-sub pill
  style + --on subscribed variant (subtle ring-colored background).

Live-tested against a dashboard with TELEGRAM + DISCORD_BOT_TOKEN /
HOME_CHANNEL env vars set: drawer shows both pills, toggling each
flips its visual state AND writes/removes the correct kanban_notify_subs
row (verified via direct DB read).

Tests (tests/plugins/test_kanban_dashboard_plugin.py, 11 new, 53/53
pass total):
- home-channels lists only platforms with a home (slack with a
  token but no home is excluded)
- no task_id -> all subscribed=false
- subscribe creates notify_sub row with correct chat/thread/platform
- subscribed=true reflected in subsequent GET
- idempotent re-subscribe
- unknown platform -> 404
- unknown task -> 404
- unsubscribe removes the row
- telegram + discord subscribe/unsubscribe independent
- zero homes -> empty list
2026-05-04 12:31:21 -07:00
emozilla 2bc82bb504 clarify placeholder telegram credential in tests 2026-05-04 15:31:15 -04:00
Teknium 3db6b9cc87 feat(cron): add no_agent mode for script-only cron jobs (watchdog pattern) (#19709)
* feat(cron): add no_agent mode for script-only cron jobs (watchdog pattern)

Adds a no_agent=True option to the cronjob system. When enabled, the
scheduler runs the attached script on schedule and delivers its stdout
directly to the job's target — no LLM, no agent loop, no token spend.
This is the classic bash-watchdog pattern (memory alert every 5 min,
disk alert every 15 min, CI ping) reimplemented as a first-class Hermes
primitive instead of a systemd timer + curl + bot token triplet living
outside the system.

## What

  hermes cron create "every 5m" \
    --no-agent \
    --script memory-watchdog.sh \
    --deliver telegram \
    --name memory-watchdog

Agent tool:

  cronjob(action='create',
          schedule='every 5m',
          script='memory-watchdog.sh',
          no_agent=True,
          deliver='telegram')

Semantics:
- Script stdout (trimmed) → delivered verbatim as the message
- Empty stdout          → silent tick (no delivery; watchdog pattern)
- wakeAgent=false gate  → silent tick (same gate LLM jobs use)
- Non-zero exit/timeout → delivered as an error alert
                          (broken watchdogs shouldn't fail silently)
- No LLM ever invoked; no tokens spent; no provider fallback applied

## Implementation

cron/jobs.py
  * create_job gains no_agent: bool = False
  * prompt becomes Optional (no_agent jobs don't need one)
  * Validation: no_agent=True requires a script at create time
  * Field roundtrips via load_jobs / save_jobs / update_job

cron/scheduler.py
  * run_job: new short-circuit branch at the top that runs the script,
    wraps its output into the (success, doc, final_response, error)
    tuple downstream delivery already expects, and returns before any
    AIAgent import or construction
  * _run_job_script: picks interpreter by extension — .sh/.bash run
    under /bin/bash, anything else under sys.executable (Python).
    Shell support unlocks the bash-watchdog pattern without wrapping
    scripts in Python. Extension is explicit; we deliberately do NOT
    trust the file's own shebang. Path-containment guard (scripts dir)
    unchanged.

tools/cronjob_tools.py
  * Schema: new no_agent boolean property with clear trigger guidance
  * cronjob() accepts no_agent and validates mode-specific shape:
    - no_agent=True requires script; prompt/skills optional
    - no_agent=False keeps the existing 'prompt or skill required' rule
  * update path rejects flipping no_agent=True on a job without a script
  * _format_job surfaces no_agent in list output
  * Handler lambda forwards no_agent from tool args

hermes_cli/main.py, hermes_cli/cron.py
  * 'hermes cron create --no-agent' and edit's --no-agent / --agent
    pair for toggling at CLI parity with the agent tool
  * Existing --script help text updated to describe both modes
  * List / create / edit output now shows 'Mode: no-agent (...)' when set

## Tests

tests/cron/test_cron_no_agent.py — 18 tests covering:
  * create_job: no_agent shape, validation, field persistence
  * update_job: flag roundtrip across reload
  * cronjob tool: schema validation, update toggling, mode-specific
    requirements, prompt-relaxation rule
  * run_job short-circuit:
    - success path delivers stdout verbatim
    - empty stdout → SILENT_MARKER (no delivery downstream)
    - wakeAgent=false gate → silent
    - script failure → error alert
    - run_job does NOT import AIAgent (verified via mock)
  * _run_job_script:
    - .sh executes via bash (no shebang required)
    - .bash executes via bash
    - .py still runs via sys.executable (regression)
    - path-traversal still blocked (security regression)

All 18 new tests pass. 341/342 pre-existing cron tests still pass; the
one failure (test_script_empty_output_noted) was already broken on main
and is unrelated to this change.

## Docs

website/docs/guides/cron-script-only.md — new dedicated guide covering
the watchdog pattern, interpreter rules, delivery mapping, worked
examples (memory / disk alerts), and the comparison table vs hermes send,
regular LLM cron jobs, and OS-level cron.

website/docs/user-guide/features/cron.md — new 'No-agent mode' section
in the cron feature reference, cross-linked to the guide.

website/docs/guides/automate-with-cron.md — new tip box pointing users
to no-agent mode when they don't need LLM reasoning.

## Compatibility

- Existing jobs: unchanged. no_agent defaults to False, existing code
  paths untouched until the flag is set.
- Schema additive only; older jobs.json without the field load fine
  via .get() with False default.
- New CLI flags are opt-in and don't alter existing flag behavior.

* fix(cron): lazy-import AIAgent + SessionDB so no_agent ticks pay zero

The unconditional `from run_agent import AIAgent` + SessionDB() init at
the top of run_job() meant every no_agent tick still paid the full agent
module load cost (~300ms + transitive imports + DB open) even though it
never touched any of that machinery.

Move both to live under the default (LLM) path, after the no_agent
short-circuit has returned. Now a no_agent tick's sys.modules stays
clean — verified end-to-end:

    assert 'run_agent' not in sys.modules  # before
    run_job(no_agent_job)
    assert 'run_agent' not in sys.modules  # after

The existing mock-based unit test (test_run_job_no_agent_never_invokes_aiagent)
kept passing because patch() replaces the class AFTER import; the leak
was only visible via real subprocess-style verification. End-to-end
demo confirmed: agent calls cronjob(no_agent=True) → script runs →
stdout delivered → no LLM machinery loaded.

* docs(cron): tighten no_agent tool schema — defaults, silent semantics, pick rule

Previous description buried the important bits in one long sentence.
Agents could plausibly miss three things an LLM-facing schema should
make unmissable:

1. What the default is — now first sentence + JSON Schema `default: false`
2. What 'silent run' actually means for the user — now spelled out:
   'nothing is sent to the user and they won't see anything happened'
3. When to pick True vs False — now a concrete decision rule with
   examples on both sides (watchdogs/metrics/pollers → True;
   summarize/draft/pick/rephrase → False)

Also adds explicit 'prompt and skills are ignored when True' since the
agent could otherwise still pass them out of habit.

No behavior change — schema text only.
2026-05-04 12:31:01 -07:00
teknium1 d35efb9898 feat(telegram): /topic off + help + auth gate + screenshot debounce
Four production-readiness additions to topic mode:

1. /topic off — clean disable path. Flips telegram_dm_topic_mode.enabled
   to 0 and clears telegram_dm_topic_bindings for this chat. Previously
   users had to edit state.db with sqlite3 to turn the feature off.
   Idempotent: calling /topic off when the chat was never enabled
   returns a friendly no-op message.

2. /topic help — inline usage printed in the DM so users don't have to
   visit docs to discover /topic off, /topic <session-id>, etc.

3. Authorization gate. /topic mutates SQLite side tables and flips the
   root DM into a lobby, so the action must be authorized. Now calls
   self._is_user_authorized(source); unauthorized DMs get a refusal
   instead of activation. Defense in depth on top of the gateway's
   existing pre-route auth.

4. BotFather screenshot debounce. A user repeatedly running /topic
   while Threads Settings is still disabled would previously re-upload
   the same screenshot every time. Now rate-limited to one send per
   5 minutes per chat. /topic off resets the counter so re-enabling
   starts fresh.

Command-def args hint updated: /topic [off|help|session-id].

Docs:
- New /topic subcommands table at the top of the multi-session section
- Disable instructions updated to recommend /topic off first, with the
  raw SQL fallback kept for bulk cleanup
- Under-the-hood list extended with the capability-hint debounce and
  the authorization gate

Tests (6 new):
- /topic help returns usage and doesn't create topic tables
- /topic off disables mode AND clears bindings
- /topic off is idempotent when never enabled
- Unauthorized users get refusal, no tables created
- Capability-hint debounce is per-chat
- /topic off resets both lobby and capability debounce counters

All 402 targeted tests pass. Full gateway sweep: 4809/4810
(pre-existing test_teams::test_send_typing unrelated).
2026-05-04 12:07:17 -07:00
teknium1 1381c89e56 fix(telegram): polish topic mode — CASCADE, General-topic handling, rename guard, debounce
Five follow-ups to topic mode based on integration audit:

1. ON DELETE CASCADE on telegram_dm_topic_bindings.session_id. Session
   pruning (manual /delete, auto-cleanup, any future prune job) would
   have thrown 'FOREIGN KEY constraint failed' for sessions bound to a
   topic. Migration bumped to v2, rebuilds the bindings table in place
   if FK lacks CASCADE. Idempotent; only runs once per DB.

2. Never auto-rename operator-declared topics. If an operator has
   extra.dm_topics configured AND a user runs /topic, messages in those
   pre-declared topics would previously trigger auto-rename and silently
   mutate operator config. _rename_telegram_topic_for_session_title now
   early-returns when _get_dm_topic_info returns a dict for this
   (chat_id, thread_id). Uses class-based lookup (not hasattr) so
   MagicMock test fixtures don't accidentally trip the guard.

3. General topic handling. Telegram's General (pinned top) topic in a
   forum-enabled private chat may send messages with message_thread_id=1
   or omit thread_id entirely depending on client. Both are now treated
   as the root lobby, not a topic lane. Prevents users from
   accidentally burning a session on the General topic.

4. Debounce the root-lobby reminder. 30-second cooldown per chat so a
   user who forgets topic mode is enabled and types ten messages in the
   root gets one reminder, not ten. Explicit command replies
   (/new-in-lobby, /topic <session-id>) still land every time.

5. Docs: added under-the-hood invariants for the above, plus a
   Downgrade section explaining that rolling back to a pre-/topic
   Hermes build leaves the DB tables orphaned but harmless — DMs just
   revert to native per-thread isolation.

Tests:
- test_operator_declared_topic_is_not_auto_renamed
- test_general_topic_is_treated_as_root_lobby
- test_lobby_reminder_is_debounced_per_chat
- test_binding_survives_session_deletion_via_cascade
- test_migration_rebuilds_v1_binding_table_with_cascade_fk

Validated: 4803/4804 tests pass (tests/gateway/ + tests/test_hermes_state.py).
Sole failure is a pre-existing test_teams::test_send_typing flake
unrelated to this PR.
2026-05-04 12:07:17 -07:00
teknium1 1a9542cf75 docs(telegram): document /topic multi-session DM mode
Adds a new section 'Multi-session DM mode (/topic)' to the Telegram
messaging docs, covering:

- Comparison table vs the existing config-driven extra.dm_topics
- BotFather prerequisites (Threads Settings, user-create permission)
- Activation flow and root-DM lobby behavior
- End-user flow for creating topics via the + button / All Messages
- Auto-renaming when Hermes generates session titles
- /new semantics inside a topic
- /topic <session-id> restore of previous sessions
- Persistence layout (SQLite side tables)
- How to disable the feature

Also:
- New /topic row in the messaging slash-commands reference
- Updated Bot API 9.4 summary to point at both topic features
2026-05-04 12:07:17 -07:00
teknium1 a7683d04a9 fix(telegram): harden DM topic binding — persist through switch_session, rebind on /new
Follow-up on @EmelyanenkoK's feat: add Telegram DM topic-mode sessions.

Three issues:

1. Split-brain session state. After get_or_create_session() returned a
   SessionEntry for a topic lane, the handler was mutating
   .session_id in place to the binding's target, but never persisting
   the switch through SessionStore. The sessions.json session_key →
   session_id map kept pointing at the lane's natural id; any reader
   that reloaded from disk saw the wrong id. Fixed by routing through
   SessionStore.switch_session(), which _save()s the mapping and ends
   the old session in SQLite like /resume does.

2. /new inside a topic was a one-message no-op. Reset created a new
   session but left the telegram_dm_topic_bindings row pointing at the
   old session_id, so the next message's binding lookup switched right
   back. Now _handle_reset_command rebinds the topic to the new
   session_id after reset.

3. is_telegram_session_linked_to_topic and
   list_unlinked_telegram_sessions_for_user both called
   apply_telegram_topic_migration() on read, contradicting the PR's
   own invariant that migration only runs on explicit /topic opt-in.
   They now tolerate missing topic tables and return empty/False.

Also: _telegram_topic_mode_enabled() now only treats True as enabled
(not any truthy return), so test fixtures with MagicMock session_db
don't accidentally flip every DM into lobby mode — this was breaking
4 pre-existing test_status_command tests.

Tests:
- New regression: /new inside a topic must update the binding row
  (test_new_inside_telegram_topic_rewrites_binding_to_new_session).
- _make_runner now stubs switch_session so existing restore tests
  still exercise the new code path.

Validated end-to-end with real SessionDB + SessionStore:
readers on fresh DB don't create topic tables; enable creates them;
binding override persists across SessionStore restart; /new rebinds
and the new id survives a restart.

Co-authored-by: EmelyanenkoK <emelyanenko.kirill@gmail.com>
2026-05-04 12:07:17 -07:00
EmelyanenkoK 25065283b3 fix: improve telegram topic mode setup 2026-05-04 12:07:17 -07:00
EmelyanenkoK d6615d8ec7 feat: add Telegram DM topic-mode sessions 2026-05-04 12:07:17 -07:00
asheriif 0ce1b9fe20 fix(tui): preserve prompt separator width (#19340)
* fix(tui): preserve prompt separator width

* fix(tui): align transcript height estimates with prompt width
2026-05-04 09:58:40 -07:00
brooklyn! d9c090fe36 Merge pull request #19338 from asheriif/fix/tui-plugin-slash-exec-live
fix(tui): run plugin slash commands live
2026-05-04 09:57:45 -07:00
kshitijk4poor 54e78cadb2 test: add regression test for Teams interactive_setup import fix
Adapted from PR #19188 by @LeonSGP43 — mocks cli_output helpers and
verifies interactive_setup persists credentials to .env without
crashing. Also adds megastary to AUTHOR_MAP.
2026-05-04 06:54:27 -07:00
megastary 38adfebe78 fix(teams): import prompt/print helpers from cli_output, not config
The Teams adapter's interactive_setup() tried to import prompt,
prompt_yes_no, print_info, print_success, and print_warning from
hermes_cli.config, but those helpers live in hermes_cli.cli_output.
Only get_env_value/save_env_value live in hermes_cli.config.

This caused 'hermes setup' to crash with ImportError as soon as the
user picked Teams in the messaging-platforms wizard.

Split the import accordingly.
2026-05-04 06:54:27 -07:00
kshitijk4poor cfd86dcdb8 chore: add bobashopcashier noreply email to AUTHOR_MAP 2026-05-04 06:23:52 -07:00
bobashopcashier d89e7a3cd4 fix(anthropic): restrict fast mode to Opus 4.6 (Anthropic API contract)
Per https://platform.claude.com/docs/en/build-with-claude/fast-mode:
"Fast mode is currently supported on Opus 4.6 only. Sending speed: fast
with an unsupported model returns an error."

Pre-fix, _is_anthropic_fast_model() returned True for any claude-* model,
so /fast on Opus 4.7 (or Sonnet/Haiku) would persist agent.service_tier=fast
in config.yaml and the adapter would inject extra_body["speed"] = "fast"
on every subsequent request. Opus 4.7 returns:

  HTTP 400: 'claude-opus-4-7' does not support the `speed` parameter.

This wedged sessions across model upgrades (a user who ran /fast on Opus 4.6
and later switched the default model to 4.7 hit a hard 400 on every turn
until they manually edited config.yaml).

Changes:
- _is_anthropic_fast_model: gate on "opus-4-6" / "opus-4.6" only
- anthropic_adapter: add _supports_fast_mode predicate as defensive guard
  so stale request_overrides on an unsupported model are dropped silently
  instead of 400'ing
- Tests: flip the assertions that mirrored the bug (Sonnet/Haiku/Opus 4.7
  asserting fast-mode support) to match the documented API contract
2026-05-04 06:23:52 -07:00
JasonOA888 a7417f8a4a fix(compressor): skip non-string tool content in summarization pass to prevent AttributeError
Commit 408dd8aa added a non-string guard for Pass 1 (dedup), but the same
pattern exists in Pass 2 (summarization/pruning) where content.startswith()
and len() are called on potentially non-string tool content.

When a provider returns tool results with non-string content (e.g. dict or
int from llama.cpp or similar), the pruning pass crashes with AttributeError.

Add the same isinstance(content, str) guard to Pass 2 for consistency.
2026-05-04 06:23:52 -07:00
helix4u eeb05cf556 docs: default custom tool creation to plugins
Steers custom tool creation toward the plugin route by default.
The adding-tools.md guide is now explicitly for built-in core Hermes
tools only.

Key fixes:
- Plugin quickstart: ctx.register_tool() now uses correct keyword-arg
  API (name=, toolset=, schema=, handler=) instead of broken 3-arg call
- Handler signature: (params, **kwargs) instead of (params)
- Handler return: json.dumps({...}) instead of plain string
- AGENTS.md: mentions plugin route before built-in tool instructions
- learning-path.md: plugins listed before core tool development
- contributing.md: separates plugin vs core tool paths

Based on PR #13138 by @helix4u.
2026-05-04 05:53:16 -07:00
ygd58 74c1b946e0 fix(browser): inject --no-sandbox for root and AppArmor userns restrictions
On VPS/Docker and some Ubuntu 23.10+ hosts, Chromium refuses to start
without --no-sandbox:
  - uid=0 (root): hard requirement (VPS/Docker deployments)
  - AppArmor apparmor_restrict_unprivileged_userns=1 (Ubuntu 23.10+):
    non-root too, under systemd or unprivileged containers

Detect both conditions and inject AGENT_BROWSER_CHROME_FLAGS with
--no-sandbox --disable-dev-shm-usage when the user hasn't already
set the flags themselves.

Salvage of #15771 — only the browser_tool.py fix is cherry-picked.
The PR's accompanying MCP preset addition (new feature surface)
was dropped so the bug fix can land independently.

Co-authored-by: ygd58 <buraysandro9@gmail.com>
2026-05-04 05:27:23 -07:00
briandevans ce22301dc6 test(sms): use clear=True in test_missing_phone_number_is_non_retryable
Prevents pre-existing TWILIO_PHONE_NUMBER or SMS_WEBHOOK_URL values in
the outer test environment from leaking into the assertion context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 05:25:09 -07:00
0668001438 83080772f2 fix(delegation): honor provider override for subagents
Clear inherited provider preference filters when delegation.provider is set so delegated children do not route back to the parent provider. Add a regression test for cross-provider delegation with parent OpenRouter filters.

Closes #10653
2026-05-04 05:22:35 -07:00
Pratik Rai 7a8ee8b29d fix(gateway): deduplicate Weixin messages by content fingerprint 2026-05-04 05:20:13 -07:00
briandevans 0b5fd40a01 fix(delegate): correct _spawn_child → _build_child_agent in comments
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 05:18:45 -07:00
briandevans 42d72b5922 fix(status): add missing popular provider API keys to hermes status display
Closes #16082.

`hermes status` silently omitted four widely-used LLM providers
(Google/Gemini, DeepSeek, xAI/Grok, NVIDIA NIM) from the API Keys
and API-Key Providers sections. Add them, along with tuple-valued
env var support (first found wins) so Google can accept either
GOOGLE_API_KEY or GEMINI_API_KEY.

Also deduplicates the "NVIDIA" and "NVIDIA NIM" rows that were
both pointing at NVIDIA_API_KEY.

Salvage of #16159 (core behavior preserved + NVIDIA dedup fixup
on top of the tuple-support refactor).

Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com>
2026-05-04 05:14:13 -07:00
VinVC 5d6431c114 fix(doctor): resolve merge conflicts, add kimi-coding-cn test
- Rebased on upstream/main to resolve conflicts
- Added test_run_doctor_accepts_kimi_coding_cn_provider test
- All 30 tests pass
2026-05-04 05:12:42 -07:00
阿泥豆 0e9416036a test: add unit tests for heartbeat stale threshold increase 2026-05-04 05:08:51 -07:00
阿泥豆 0cc63043e0 fix(delegation): increase heartbeat stale thresholds
The heartbeat stale detection was too aggressive:
- idle: 5 * 30s = 150s — LLM inference on slow providers (Zhipu/GLM)
  frequently exceeds 150s, causing heartbeat to stop prematurely
- in-tool: 20 * 30s = 600s — borderline for long tool calls

When heartbeat stops, parent._last_activity_ts freezes, eventually
triggering gateway timeout and killing the entire delegation.

New thresholds:
- idle: 15 * 30s = 450s — accommodates slow LLM inference
- in-tool: 40 * 30s = 1200s — accommodates long-running tool calls

child_timeout_seconds (config: delegation.child_timeout_seconds) remains
the hard cap for total delegation duration.
2026-05-04 05:08:51 -07:00
briandevans 6b4ccb9b14 fix(session-search): report source from resolved parent, not FTS5 child session (#15909)
When a delegation child session (e.g. source='telegram') contains the
FTS5 hit but _resolve_to_parent() maps it to a different root session
(source='api_server'), the result entry was still reporting the child's
source because the loop discarded session_meta as `_` and fell back to
match_info.get('source'), which carries the child session's value.

Use the resolved parent's session_meta for source, model, and started_at
with match_info as a fallback, so the output accurately reflects the
session the user actually interacted with.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 05:07:40 -07:00
briandevans b46b0c9888 fix(backup): floor pre-update backup_keep to 1 so the new backup survives
`updates.backup_keep: 0` (or any negative value) wiped the freshly-
created pre-update zip:

  _prune_pre_update_backups(backup_dir, keep=0):
      backups = sorted(..., reverse=True)   # newest first, includes
                                            # the zip we just wrote
      for p in backups[0:]:                 # = all of them
          p.unlink()

The wrapper in `main.py` then printed `Saved: <path>` for a file that
no longer existed (the size lookup is wrapped in `try/except OSError`
which silently degrades to "0 B"), leaving operators believing they had
a recovery point when they had none.

This is a real footgun because some config systems treat 0 as "keep
unlimited"; here it does the opposite — every backup is destroyed
right after creation.

Fix: clamp `keep` to a minimum of 1 inside `_prune_pre_update_backups`
since that helper is only invoked immediately after a fresh backup
is written.  Operators who genuinely want no backups should set
`updates.pre_update_backup: false` (which gates creation entirely)
rather than relying on `backup_keep: 0`.

Also extends the `backup_keep` config docstring to spell out the floor
and point at `pre_update_backup: false` as the off-switch.

## Tests

Three regression tests added in `TestPreUpdateBackup`:

  - `test_keep_zero_does_not_delete_freshly_created_backup` —
    asserts the file persists after `keep=0`
  - `test_keep_negative_does_not_delete_freshly_created_backup` —
    same for negative values
  - `test_keep_zero_still_prunes_older_backups` — proves the floor
    only protects the new backup; older ones are still rotated out

Verified the new tests fail on origin/main (without the floor) and
pass with it; full `tests/hermes_cli/test_backup.py` suite green
(84 tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 05:07:13 -07:00
Sanhu Li ef8c213e88 fix(model-switch): soft-accept unlisted openai-codex models 2026-05-04 05:06:53 -07:00
0xsir0000 52882dade6 fix(agent): include name field on every role:tool message for Gemini compatibility (#16478)
Gemini's OpenAI-compatibility endpoint strictly requires the `name` field
on `role: tool` messages — it returns HTTP 400 ("Request contains an
invalid argument") when the function name is missing. OpenAI/Anthropic/
ollama tolerate the absence, so the gap stays invisible until the
conversation accumulates a tool turn and the user routes it through Gemini
(direct API or via ollama-cloud proxy).

Fix: add a `_get_tool_call_name_static()` helper alongside the existing
`_get_tool_call_id_static()`, and populate `name` at every site that
constructs a `role: tool` message — the pre-call sanitizer stub, the
tool-call args repair marker, both interrupt-skip paths, both
result-append paths (parallel + sequential), the invalid-tool-name
recovery, the invalid-JSON-args recovery, and the exception fallback.

Each call site was already in scope of the function name (`function_name`,
`skipped_name`, `name`, or a dict tool_call), so the change is local —
no new lookups, no behavior change for providers that already worked.

Fixes #16478
2026-05-04 05:06:33 -07:00
OpenClaw Bot 0443484115 fix(qqbot): honor proxy env vars for websocket 2026-05-04 05:06:09 -07:00
陈运波0668001438 6cf7a9e330 fix(vision): preserve explicit provider auth with custom base_url
Keep the configured vision provider when base_url is overridden so credential-pool lookup still resolves provider-specific API keys (e.g. ZAI_API_KEY), and add a regression test for this path.
2026-05-04 05:05:43 -07:00
swithek b7bbc62503 fix(compressor): _prune_old_tool_results boundary direction 2026-05-04 05:05:18 -07:00
Dejie Guo d29f90e89d fix(error_classifier): avoid large-context false overflow heuristics
Generic 400 and server-disconnect heuristics used absolute token/message-count fallbacks that are too aggressive for 1M context sessions. Gate those absolute fallbacks to smaller context windows while preserving relative pressure checks.

Fixes #16351
2026-05-04 05:04:56 -07:00
giwaov 026a5e47df fix(cli): preserve Windows hidden-dir paths in markdown 2026-05-04 05:04:36 -07:00
Teknium 3fb35520c6 revert: auto-subscribe gateway chat on tool-driven kanban_create (#19718) (#19721)
Reverts ff3d2773e2. Teknium reviewed the merged PR and decided this
behavior isn't wanted — tool-driven kanban_create should not mirror
the slash-command path's auto-subscribe. Orchestrators that want
their originating chat notified can call kanban_notify-subscribe
explicitly; we're not going to make it implicit.
2026-05-04 05:04:01 -07:00
Teknium 25b7b0f8e6 chore(release): AUTHOR_MAP entries for Tier 1f salvage batch 2026-05-04 05:03:10 -07:00
Teknium ff3d2773e2 feat(kanban): auto-subscribe gateway chat on tool-driven kanban_create (#19718)
Closes #19479.

When an orchestrator agent calls kanban_create from a gateway session
(e.g. a Telegram user delegating to an orchestrator profile), auto-
subscribe the originating (platform, chat, thread, user) to the new
task's terminal events. Mirrors the behavior of the /kanban create
slash command in gateway/run.py so tool-driven creation is at parity
with human-driven creation.

Without this, a user who interacts with an orchestrator exclusively
via the gateway never receives blocked / completed / gave_up
notifications for tasks the orchestrator created on their behalf —
silently breaking the gateway-first multi-agent flow the reporter
describes.

Reads the context-local HERMES_SESSION_* vars via get_session_env()
(not os.environ — those are contextvars for asyncio concurrency
safety). Falls through cleanly in CLI / cron contexts with no
session active (subscribed=False in the response). Best-effort: if
the gateway module isn't importable (test rigs stubbing gateway.*),
the task still creates, we just skip the subscription.

Response gains a 'subscribed' bool so the orchestrator knows whether
terminal events will land back in the originating chat or whether it
needs to poll / unblock manually.

Tests: 4 new in tests/tools/test_kanban_tools.py covering
CLI/no-subscribe, telegram/gateway-auto-subscribe, discord-DM/no-
thread subscribe, and partial-ctx/no-chat_id no-subscribe. 40/40
kanban tool tests pass.
2026-05-04 05:02:23 -07:00
Nikolay Gusev fdf9343c51 fix(tools): wrap bare scalars in single-element list for array-typed args
Open-weight models (DeepSeek, Qwen, GLM) sometimes emit tool calls like
`{"urls": "https://a.com"}` when the tool schema declares
`type: array`.  The call was JSON-valid but semantically wrong, and
`coerce_tool_args` would pass the bare string through — the tool then
failed with a confusing type error.

`coerce_tool_args` now wraps non-list, non-null values in a
single-element list when the schema declares `array`.  Strings still go
through `_coerce_value` first so JSON-encoded arrays
(`'["a","b"]'`) parse correctly and nullable `"null"` still
becomes `None`.  `None` itself is preserved — tools with sensible
defaults already handle it, and we don't want to silently mask a
deliberate null.

Salvaged from #19652 (NikolayGusev-astra) — the broader validate-then-
repair layer had several issues (duplicated existing coercion,
mis-classified `old_string` as a path field, prepended non-JSON
prefixes to tool results that break downstream JSON parsing, hardcoded
offset/limit defaults unsuitable for non-read_file tools).  The one
genuinely new capability is wrapping bare scalars, which is implemented
here directly inside the existing coercion path.

Co-authored-by: Nikolay Gusev <ngusev@astralinux.ru>
2026-05-04 05:00:37 -07:00
ms-alan 6f864f8f94 fix(redact): add code_file param to skip false-positive ENV/JSON patterns
ENV-assignment and JSON-field regex patterns in redact_sensitive_text()
cause false positives when reading source code files:
- MAX_TOKENS=*** triggers the ENV assignment pattern
- "apiKey": "test" in test fixtures triggers the JSON field pattern

Add code_file=False parameter. When code_file=True, skip only the
ENV-assignment and JSON-field regex passes; all other patterns (prefixes,
auth headers, private keys, DB connstrings, JWTs, URL secrets) are
still applied.

Update file_tools.py (read_file and search_files) to pass code_file=True
so agent code analysis is not polluted by false-positive redactions.

Closes #15934
2026-05-04 04:56:28 -07:00
Teknium a175f39577 feat(nous): persist Nous OAuth across profiles via shared token store (#19712)
Mirrors the Codex auto-import UX. On successful Nous login (either
`hermes auth add nous --type oauth` or `hermes login nous`), tokens are
mirrored to `$HERMES_SHARED_AUTH_DIR/nous_auth.json` (default
`~/.hermes/shared/nous_auth.json`, outside any named profile's
HERMES_HOME). On next login in a new profile, the flow offers to import
those credentials ("Import these credentials? [Y/n]") and rehydrates via
a forced refresh+mint instead of running the full device-code flow.

Runtime refresh in any profile syncs the rotated refresh_token back to
the shared store so sibling profiles don't hit stale-token fallback
after rotation.

The volatile 24h agent_key is NOT persisted to the shared store —
only the long-lived OAuth tokens are cross-profile useful.

- `HERMES_SHARED_AUTH_DIR` env var for tests + custom layouts
- Pytest seat belt mirrors the existing `_auth_file_path` guard so
  forgetting to redirect the store in a test fails loudly
- File mode 0600 where platform supports it
- Runtime credential resolution is unchanged — shared store is only
  consulted during the login flow, so profile isolation at runtime is
  preserved
- Stale refresh_token + portal-down cases gracefully fall back to
  device-code

Addresses a user report from Mike Nguyen: running
`hermes --profile <name> auth add nous --type oauth` for every new
profile is unnecessary friction now that Codex has a shared-import
flow via `~/.codex/auth.json`.
2026-05-04 04:54:55 -07:00
QifengKuang 69fc6d9c1e fix(telegram): fall back to document on any send_photo failure, not just dim errors
Broadens the existing fallback (previously only fired for
Photo_invalid_dimensions) to cover every send_photo exception class:
rate limits, corrupt file markers, format edge cases. The expected
dimension case still logs at INFO (document is the right path); all
other cases log at WARNING with exc_info so they're visible in logs.

If send_document itself fails, we still fall back to the base adapter's
text-only 'Image: /path' rendering as a last resort.

Salvage of #15837 — original PR author QifengKuang proposed the broader
try/except-style fallback. Adapted to keep the existing INFO-vs-WARNING
log split for dimension errors (the expected case).

Co-authored-by: QifengKuang <k2767567815@gmail.com>
2026-05-04 04:54:54 -07:00
Teknium d3b22b76d8 fix(kanban): enforce worker task-ownership on destructive tool calls (#19713)
Closes #19534 (security).

A worker spawned by the kanban dispatcher has HERMES_KANBAN_TASK set
to its own task id. The destructive tools (kanban_complete,
kanban_block, kanban_heartbeat) resolved task_id via
_default_task_id() which preferred an explicit arg over the env var,
with no ownership check — so a buggy or prompt-injected worker could
complete / block / heartbeat any OTHER task (sibling, cross-tenant,
anything) by supplying its id. Reporter's repro: worker for t_A
passed task_id=t_B to kanban_complete and got {"ok": true}.

Fix: add _enforce_worker_task_ownership(tid). If HERMES_KANBAN_TASK
is set and tid doesn't match, return a structured tool error with
guidance to use kanban_comment (for information handoff across tasks)
or kanban_create (for follow-up work). Orchestrator profiles (no env
var, but kanban toolset enabled per #18968) are exempt — their job
is routing and sometimes includes closing out child tasks.

Kept unrestricted (deliberately):
- kanban_show — workers legitimately read parent/sibling handoff context
- kanban_comment — cross-task comments are the handoff mechanism
- kanban_create — orchestrator fan-out, worker follow-up spawning
- kanban_link — parent/child linking

Tests: 5 new regression tests in tests/tools/test_kanban_tools.py
covering the grid (worker-attacks-foreign ×3 tools, worker-own-task
preserved, orchestrator-unrestricted). 36/36 pass.
2026-05-04 04:54:02 -07:00
Teknium 1bd5ac7f2f fix(self-improvement-loop): bump background-review budget to 16 and suppress status leaks (#19710)
The background memory/skill review fork had two user-visible issues:

1. max_iterations=8 was too tight for multi-step reviews. A review that
   needs to skill_view one or two candidate skills, add a memory entry,
   and patch a skill routinely blew the budget — surfacing an 'Iteration
   budget exhausted (8/8)' warning to the user and leaving the review
   half-finished.

2. Mid-review lifecycle messages leaked into the user's terminal past the
   existing quiet_mode + redirect_stdout/stderr guards. _emit_status and
   _emit_warning route through _vprint(force=True) -> _print_fn /
   status_callback, which bypass sys.stdout entirely. The stdout redirect
   only catches raw print() calls.

Changes:
- Bump the review fork's max_iterations from 8 to 16.
- Set review_agent.suppress_status_output = True on the fork. This
  short-circuits _vprint unconditionally so _emit_status/_emit_warning
  emissions (iteration-budget warnings, rate-limit retries, compression
  messages) never reach the user. The only user-visible output remains
  the compact final summary line ('💾 Self-improvement review: ...')
  which is printed via self._safe_print on the *main* agent (outside
  the fork's redirect/suppress scope).

Summarizer filter is already correct — _summarize_background_review_actions
only surfaces tool calls with data.get('success') is truthy, so failed
attempts and reasoning text never reach the summary line.
2026-05-04 04:53:44 -07:00
Kathy a79b0ec461 fix: keep Feishu topic replies from falling back to new threads (local patch)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-04 04:53:28 -07:00
cong 3ccf723bf9 fix(gateway): read context_length from custom_providers in session info header 2026-05-04 04:51:13 -07:00
h0tp-ftw 8c8f95bc8e fix(gateway): show friendly error when service is not installed
Instead of an unhelpful CalledProcessError traceback when running
`hermes gateway start/stop/restart` without first installing the service,
check for the unit file and exit with an actionable install hint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-04 04:49:51 -07:00
Teknium c5789f4309 feat(achievements): share card render on unlocked badges (#19657)
* feat(achievements): share card render on unlocked badges

Adds a Share button to each unlocked achievement card that opens a
modal and renders a 1200x630 PNG share card client-side via Canvas2D
(no backend, no network, no new deps). Two actions: Download PNG and
Copy image to clipboard.

Card layout mirrors the in-dashboard visual language: tier-colored
glow, icon from the existing LUCIDE sprite set, achievement name,
tier badge pill, description, progress stat line, and a Hermes Agent
watermark. Sized for X/Twitter, Discord, LinkedIn, Bluesky link
previews.

Vendored on top of the upstream @PCinkusz bundle; the 'in-progress
scan banner' precedent already established this divergence pattern.
Manifest bumped 0.3.1 -> 0.4.0.

* feat(achievements): share-on-X as primary action on share dialog

Adds a 'Share on X' button as the primary action in the share dialog.
Opens https://x.com/intent/post with a pre-filled tweet referencing
the achievement name, tier, @NousResearch, and the Hermes docs URL.
Copy image and Download PNG become secondary actions: users who want
the badge attached can Copy image, paste into the X composer, post.

Primary button styled as X's signature black-on-white fill so the
action is unambiguous.
2026-05-04 04:47:53 -07:00
ygd58 297eaa3533 fix(api_server): emit run.failed when run_conversation returns failed=True
When run_conversation encounters a non-retryable client error (401, 400,
etc.), it returns a dict with failed=True instead of raising. The gateway's
_run_and_close only branched on exceptions, so it always emitted run.completed
even for failed runs — clients could not distinguish success from failure.

Inspect the result dict before emitting: if failed=True, emit run.failed
with the error message; otherwise emit run.completed as before. The existing
except Exception path is unchanged for genuine programming errors.

Fixes #15561
2026-05-04 04:47:36 -07:00
Teknium b2b479b40e docs(kanban): backfill multi-board refs in reference docs (#19704)
Followup to #19653. The feature PR updated the Kanban user guide but
missed four other pages that document the same surface. Caught when
Teknium asked 'did you add docs to the guide and any other kanban
related docs around this?'.

- reference/cli-commands.md: rewrite the `hermes kanban` section to
  document the `--board <slug>` global flag, the `boards`
  subcommand group (list/create/switch/show/rename/rm), board
  resolution order, and worked examples. Also fills in the
  `create` / `complete` flag lists that had drifted from the
  current CLI (`--summary`, `--metadata`, `--triage`,
  `--idempotency-key`, `--max-runtime`, `--skill`).
- reference/environment-variables.md: add `HERMES_KANBAN_BOARD`
  row, update `HERMES_KANBAN_DB` precedence note.
- reference/slash-commands.md: add `/kanban boards ...` and
  `/kanban --board <slug> ...` to the two `/kanban` rows (CLI
  table + gateway table).
- features/kanban-tutorial.md: the walkthrough uses the `default`
  board, so just a note pointing readers at the overview's Boards
  section if they want multiple queues, plus the corrected per-board
  DB path.

Skill docs (devops-kanban-orchestrator, -worker) intentionally not
updated: those are agent-facing lifecycle playbooks and boards are
transparent to workers (HERMES_KANBAN_BOARD env var pins the DB
automatically), so there's nothing new for a worker to know.
2026-05-04 04:47:19 -07:00
Teknium a8b689f0c2 test(kanban): regression for status=running rejection at dashboard PATCH
Reporter of #19535 explicitly asked for a regression test — covers it
here so a future refactor of _set_status_direct can't silently re-enable
the direct ready/todo -> running bypass.

Asserts both: (a) HTTP 400 with 'running' in the detail message, and
(b) the task's status is unchanged after the rejected PATCH (pre-request
status preserved, no partial mutation).
2026-05-04 04:46:47 -07:00
luyao618 6b3efcee49 fix(kanban): reject direct status transition to 'running' via dashboard API
The PATCH /tasks/:id endpoint allows setting status='running' via
_set_status_direct(), bypassing the dispatcher/claim path that creates
run rows, claim locks, expiry, and worker process metadata. This can
leave tasks stuck in 'running' with no active worker.

Fix: reject status='running' with HTTP 400, requiring all transitions
to 'running' to go through the canonical claim_task() path.

Closes #19535
2026-05-04 04:46:47 -07:00
vominh1919 652f8e6f3e fix(test): correct _coerce_number inf/nan test assertions
The test 'test_inf_stays_string_for_integer_only' incorrectly asserted
that _coerce_number('inf') returns float('inf'), but the function
correctly returns the original string 'inf' because infinity is not
JSON-serializable.

Fixed the assertion to expect the string 'inf', and added two new tests
for negative infinity and NaN edge cases to improve coverage of the
non-JSON-serializable number guard in _coerce_number().
2026-05-04 04:45:55 -07:00
Yoimex edf9c75621 fix(env): pass -- to cd for hyphen-prefixed workdirs 2026-05-04 04:45:03 -07:00
Teknium ae40fca955 fix(profiles): keep validate_profile_name strict; callers normalize first
Follow-up to @changchun989's cherry-pick: reverts the validate-via-
normalize change so validate_profile_name remains a strict regex check
on the input AS-GIVEN. Callers that accept mixed-case user input
(dashboard UI, CLI args, import flows) call normalize_profile_name()
first, then validate the result. This keeps validate honest about
what the on-disk directory name must look like — e.g. '  jules '
(trailing whitespace) is now rejected instead of silently trimmed
and accepted.

- validate_profile_name: strict lowercase/regex check again, 'UPPER'
  back in the invalid-names parametrize
- 8 call sites in profiles.py (create_profile, delete_profile,
  set_active_profile, export_profile, import_profile, rename_profile,
  resolve_profile_env, plus the clone_from branch): swap the
  normalize-then-validate order
- scripts/release.py: add changchun989@proton.me -> changchun989 to
  AUTHOR_MAP so CI doesn't block on the unmapped contributor email

All kanban + profile tests pass (268 across test_profiles.py +
test_kanban_db.py + test_kanban_core_functionality.py, plus 73 in
test_kanban_tools.py + test_kanban_dashboard_plugin.py).

Closes #18498.
2026-05-04 04:44:37 -07:00
changchun989 a31477dabb fix(profiles): normalize profile IDs for Kanban assignees and lookups
- Add normalize_profile_name() for lowercase canonical IDs and Default alias
- Use canonical names in create/delete/rename/export/import/set_active paths
- Canonicalize Kanban assignee on create/assign, list filter, and worker spawn
- Tests for mixed-case assignees and profile resolution (fixes #18498)
2026-05-04 04:44:37 -07:00
Yuyang Xu 60c4bc96fd fix(security): restore .env/auth.json/state.db with 0600 perms
`hermes import` was creating secret files with the process umask
(typically 0644) instead of 0600. zipfile.open() does not honor the
Unix mode bits stored in zip member external_attr; the restore loop
used open(target, "wb") which always falls back to umask.

Threat: silent privilege downgrade after a routine restore on
multi-user systems (shared dev boxes, CI runners, jump hosts) — any
local user could read API keys and OAuth tokens from ~/.hermes/.

Fix mirrors the convention already used at file creation
(hermes_cli/auth.py: stat.S_IRUSR | stat.S_IWUSR for auth.json).
The quick-snapshot restore path (restore_quick_snapshot) is
unaffected — it uses shutil.copy2 which preserves perms via
copystat().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 04:43:53 -07:00
MichaelWDanko da8654bb41 fix(dashboard): show custom theme palette swatches 2026-05-04 04:43:27 -07:00
Cameron Aragon 239ea1bdea fix(image-gen): preserve xAI API error status 2026-05-04 04:43:07 -07:00
atongrun 75b4a34670 fix(cli): check updates against upstream/main for fork users 2026-05-04 04:42:44 -07:00
Teknium 5ec6baa400 feat(kanban): multi-project boards — one install, many kanbans (#19653)
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.

Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
  its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
  keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
  installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
  their env alongside the existing `HERMES_KANBAN_DB` /
  `HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
  other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
  per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
  own SQLite DB, same WAL+IMMEDIATE machinery as before).

CLI
---
  hermes kanban boards list|create|switch|show|rename|rm
  hermes kanban --board <slug> <any-subcommand>

Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.

Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.

Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
  with all boards + task counts, `+ New board` button, `Archive`
  button (non-default only). Hidden entirely when only `default` exists
  and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
  + "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
  shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
  new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
  `PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
  `POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
  opens a fresh WS against the new board.

Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.

Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.

Regression surface: all 264 pre-existing kanban tests continue to pass.

Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
2026-05-04 04:42:38 -07:00
vominh1919 135b4c8b35 fix(mcp): decouple AnyUrl import from mcp dependency
AnyUrl was imported inside the same try block as mcp.client.auth, so
when the mcp package was not installed, AnyUrl was undefined and
_build_client_metadata raised NameError at runtime.

Moved the AnyUrl import to its own try/except block so it's available
whenever pydantic is installed (which is a core dependency), regardless
of whether the mcp SDK is present.

Also added pytest.importorskip('mcp') to the three
test_build_client_metadata tests that exercise _build_client_metadata,
since that function depends on OAuthClientMetadata from the mcp package.
2026-05-04 04:42:18 -07:00
vominh1919 0d563621fb fix(test): skip bedrock adapter tests when botocore is not installed
Six tests in test_bedrock_adapter.py import botocore.exceptions
directly (ConnectionClosedError, EndpointConnectionError,
ReadTimeoutError, ClientError) without guarding the import. When
botocore is not installed (it's an optional dependency), these tests
fail with ModuleNotFoundError instead of being gracefully skipped.

Added pytest.importorskip('botocore') to each affected test function,
following the same pattern used elsewhere in the test suite (e.g.
test_voice_mode.py for numpy, test_mcp_oauth.py for mcp).

Tests affected:
- TestIsStaleConnectionError: 3 tests
- TestCallConverseInvalidatesOnStaleError: 3 tests

Before: 6 FAIL with ModuleNotFoundError
After:  6 SKIP with reason message
2026-05-04 04:41:55 -07:00
vominh1919 d1d2d43387 fix(test): add skip marker for transcription tests requiring faster_whisper
TestTranscribeLocalExtended patches faster_whisper.WhisperModel, which
triggers an ImportError when the faster_whisper package is not installed.
Added a pytest.mark.skipif marker using importlib.util.find_spec so
these tests are gracefully skipped instead of failing with
ModuleNotFoundError.
2026-05-04 04:41:36 -07:00
Teknium 844d4a32ce chore(release): AUTHOR_MAP entries for Tier 1e salvage batch 2026-05-04 04:40:34 -07:00
Teknium 110387d149 docs(open-webui): fill gaps in quick setup — verify curls, ollama flag, restart note (#19654)
Reported by @neopabo — the Open WebUI page was missing several steps users
hit in practice:

- Use hermes config set instead of hand-editing .env (matches current UX)
- Restart-gateway note after enabling API_SERVER_ENABLED
- curl /health + /v1/models verification step before jumping to Docker
- ENABLE_OLLAMA_API=false in both docker run and compose snippets to
  suppress the empty Ollama backend that otherwise clutters the picker
- 15-30s startup wait note for first-run embedding model download
- Troubleshooting entry for the empty-Ollama-shadowing case
- /v1/models troubleshoot command now includes the Authorization header
2026-05-04 04:36:18 -07:00
Siddharth Balyan af6f9bc2a1 fix: refresh systemd unit on gateway boot (not just start/restart) (#19684)
The resilient restart settings from PR #18639 only took effect when
the gateway was started via `hermes gateway start` or `hermes gateway
restart` — both of which call refresh_systemd_unit_if_needed() which
writes the new unit and runs daemon-reload.

However, when the gateway self-restarts via exit-code-75 (stale-code
detection after `hermes update`, or the /restart command), systemd
respawns the process directly without going through any CLI function.
The unit file on disk stays stale, and systemd keeps using the old
cached settings (StartLimitBurst=5, RestartSec=30) until someone
manually runs `hermes gateway restart`.

This meant that after PR #18639 was deployed, users who never ran
`hermes gateway restart` manually were still vulnerable to the
permanent-death-on-network-outage bug.

Fix: call refresh_systemd_unit_if_needed() at the top of run_gateway()
(the foreground entry point that systemd's ExecStart invokes). This
ensures that on every boot — whether triggered by systemd restart,
exit-75 respawn, or manual foreground run — the unit definition and
daemon state are current. The call is best-effort (exceptions caught)
and a no-op when the unit is already current (one stat + string compare).
2026-05-04 16:27:51 +05:30
Teknium 33f554d83c feat(kanban-dashboard): workspace kind + path inputs in inline create form (#19679)
Closes #18718. Exposes the existing `workspace_kind` + `workspace_path`
fields (already accepted by POST /api/plugins/kanban/tasks) in the
dashboard's per-column inline-create form so users can create tasks
targeting a git worktree or an explicit directory without dropping
back to the CLI.

- Add a workspace-kind Select (scratch / worktree / dir) to
  InlineCreate in plugins/kanban/dashboard/dist/index.js.
- Conditionally render a workspace_path Input next to the select when
  kind != scratch; placeholder tells the user whether the path is
  required (dir) or optional (worktree — derived from assignee when
  blank).
- Submit wires `workspace_kind` / `workspace_path` into the POST body
  only when they're non-default, keeping the request shape small and
  interoperable with older dispatcher versions.

E2E verified in a dashboard pointed at the worktree: selecting dir +
typing /tmp/test-18718 produces a POST body with
{workspace_kind: 'dir', workspace_path: '/tmp/test-18718'} and the
task lands in sqlite with those fields set. 42/42 kanban dashboard
plugin tests pass.
2026-05-04 03:40:39 -07:00
Grey0202 a219a0a4df fix(anthropic): strip top-level oneOf/allOf/anyOf from tool input_schema
Extends the existing _normalize_tool_input_schema to also drop top-level
union keywords that Anthropic's tool schema validator rejects with HTTP 400.

Several upstream and plugin tools ship schemas with a top-level oneOf/
allOf/anyOf (common for Pydantic discriminated unions). The existing
strip_nullable_unions pass only handles anyOf-with-null patterns; a
non-null top-level union keyword sails through and hits the API.

Salvage of #16471 — approach folded into the existing normalize helper
rather than introducing a parallel _sanitize_input_schema function, to
avoid two schema-munging code paths running against the same input.

Co-authored-by: Grey0202 <grey0202@users.noreply.github.com>
2026-05-04 03:17:35 -07:00
charliekerfoot 412f2389f1 fix(google_oauth): close TOCTOU window when saving credentials 2026-05-04 03:16:19 -07:00
Ioodu e50809b771 fix(file-tools): cap read_file result size to prevent context window overflow
Set max_result_size_chars=100_000 on the read_file registry entry (was
float('inf')), closing the Layer 2 defense-in-depth gap in
tool_result_storage.py. The existing Layer 1 guard inside
_handle_read_file already returns a JSON error for oversized reads;
this aligns the registry cap with every other tool.

Update test_read_file_never_persisted → test_read_file_result_size_cap
to assert 100_000, and add test_read_file_registry_cap_is_100k as an
explicit regression guard against re-introducing float('inf').

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 03:14:59 -07:00
Teknium 5b6d413476 fix(cli,gateway): surface title errors from /new <name>
The contributor's PR silently swallowed ValueError from
SessionDB.set_session_title() with bare except Exception: pass.
Users typing /new <title> with an already-in-use title got an
untitled session and no feedback.

Changes:
- cli.py: catch ValueError from both sanitize_title() and
  set_session_title(); print the error and mark the session
  untitled in the banner (never echo the rejected title back).
- gateway/run.py: append a warning note to the reset reply on
  title rejection; reflect the accepted title in the header.
- Add regression tests for the duplicate-title path in CLI and
  gateway.

Also map exx@example.com -> @exxmen in scripts/release.py.
2026-05-04 03:14:50 -07:00
Exx f720751d79 feat(cli,gateway): /new accepts optional session name argument
Allow users to start a fresh session and immediately set its title by
passing a name to /new (or /reset):

    /new Refactor auth module

Changes:
- hermes_cli/commands.py: add args_hint='[name]' to /new command
- cli.py: parse title argument in process_command(), pass to new_session()
- cli.py: new_session() accepts title=None, sets title via SessionDB
- gateway/run.py: _handle_reset_command() parses title, sets on new entry
- gateway/session.py: reset_session() accepts optional display_name
- tests: add test_new_session_with_title, test_reset_command_with_title,
  test_new_command_in_help_output

All 36 affected tests pass.
2026-05-04 03:14:50 -07:00
ms-alan 055fde40e0 fix(doctor): check global agent-browser when local install not found
When agent-browser is globally installed via 'npm install -g agent-browser'
but not present in the local node_modules, doctor falsely warns that it's
not installed. Add shutil.which('agent-browser') as a fallback check after
the local path check.

Closes #15951
2026-05-04 03:13:22 -07:00
xyiy001 e69d11d30c fix(browser): allow CDP override to pass requirement checks
Treat explicit CDP override mode as a valid browser backend even when agent-browser is absent, and add a regression test to prevent false-negative availability gating.
2026-05-04 03:12:30 -07:00
kshitijk4poor 46072425fe fix(model-picker): exclude providers with empty credential pool entries
The auth check in list_authenticated_providers used mere key presence in
credential_pool to conclude a provider is authenticated.  An empty entry
(pool_store key with no actual credentials) caused providers like
ollama-cloud to appear as authenticated in the model picker even when no
OLLAMA_API_KEY was set.

The user's picker then offered nemotron-3-super under Ollama Cloud;
selecting it routed every subsequent turn to https://ollama.com/v1, which
rejected the requests with HTTP 400.

Fix: drop the pool_store key-existence check from both section 2
(HERMES_OVERLAYS) and section 2b (CANONICAL_PROVIDERS).  The following
load_pool().has_credentials() call already handles the legitimate pooled-
credential case; checking for an empty key just ahead of it was redundant
and actively harmful.
2026-05-04 03:12:12 -07:00
briandevans c8ecb56f27 fix(cli): reject invalid argv values from -p/--profile before resolving
`_apply_profile_override()` scans `sys.argv` for `-p / --profile` at
module import time. When `hermes_cli.main` is imported inside pytest
with `-p no:xdist` on the command line, it picks up `'no:xdist'` as a
profile name candidate, then passes it to `resolve_profile_env()` which
raises `ValueError` (invalid format), and the function calls
`sys.exit(1)` — aborting test collection with an INTERNALERROR before
any test runs.

The same conflict affects any tool or wrapper that uses `-p` for its
own flag and then imports `hermes_cli.main`.

Fix: add a format guard immediately after step 1 (explicit flag scan).
If `consume == 2` (the value came from `-p <value>`, not
`--profile=value`) and the candidate doesn't match the canonical
profile-name pattern `[a-z0-9][a-z0-9_-]{0,63}` (mirrored from
`hermes_cli.profiles._PROFILE_ID_RE`), discard it and continue as if
no `-p` flag was found. The `active_profile` file-based fallback
(step 2) only reads a file written by hermes itself, so it always
produces valid names and needs no guard.

Regression guard: with the guard reverted, importing
`hermes_cli.main` with `sys.argv = ['pytest', '-p', 'no:xdist', ...]`
raises `SystemExit(1)`. With the guard in place, the import succeeds
and `sys.argv` is left intact for pytest. Legitimate `-p coder` still
flows through to `resolve_profile_env()` unchanged.

Rebased onto current `origin/main` (`e5dad4ac5`) — the prior branch
base (`4fade39c9`) was 824 commits behind and the PR was DIRTY /
CONFLICTING. The 1.5 HERMES_HOME-set early-return block has since
landed between the original insertion point and step 2; the new guard
is positioned correctly before the early return so a bogus `-p` value
no longer prevents the early return from kicking in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 03:11:47 -07:00
ChanlerDev e3461e0b2a fix(cli): remove dead 'q' check from quit command resolution
The 'q' alias is defined for 'queue' command in commands.py:93.
The hardcoded 'q' in cli.py:5910 was dead code - resolve_command('q')
returns the queue CommandDef, so canonical would never be 'q'.

Removes the misleading check without changing any behavior:
- /quit and /exit still exit (defined aliases)
- /q still maps to queue (as intended)
2026-05-04 03:11:30 -07:00
YAMAGUCHI Seiji cba86b7303 fix(cronjob): treat bare 'custom' provider as unspecified in override
`_resolve_model_override` treated any non-empty `provider` string from
the LLM as user-specified and skipped the pin-to-current-provider
fallback. When the LLM wrote bare `'custom'` (instead of the canonical
`'custom:<name>'` referring to a custom_providers entry), the value
serialized into jobs.json as `"provider": "custom"` and the scheduler
could never resolve a provider from it — the cron job failed silently
at run time.

Treat bare `'custom'` as "no provider supplied" so the current main
provider gets pinned instead, matching behaviour for the omitted case.

Defence-in-depth complement to a schema-description fix (#15477) that
discourages the LLM from emitting bare `'custom'` in the first place.
2026-05-04 03:11:11 -07:00
pander 6b88f46c54 fix(compressor): trigger fallback on timeout errors alongside model-not-found
Previously only HTTP 404/503 and specific error strings triggered a fallback
to the main model when the summary model was unavailable. Timeout errors
(HTTP 408/429/502/504, or error strings containing 'timeout') entered a
short cooldown instead, leaving context to grow unbounded for the rest of
the session.

Add _is_timeout detection alongside _is_model_not_found so that transient
timeout errors on the summary model also trigger immediate fallback to the
main model, preventing compression failure from cascading.

Closes #15935
2026-05-04 03:10:53 -07:00
DaniuXie a45bd28598 fix(wecom): set SUPPORTS_MESSAGE_EDITING=False to prevent broken streaming 2026-05-04 03:10:36 -07:00
zng8418 d2ea959fe9 fix(doctor): skip /models health check for MiniMax CN (returns 404)
MiniMax China (api.minimaxi.com) does not expose a /v1/models endpoint.
The doctor command was probing it and reporting HTTP 404 as a warning,
even though the API works correctly for chat completions.

Set supports_health_check=False for MiniMax CN so doctor shows
"(key configured)" instead of the false 404 warning.

Refs #12768, #13757
2026-05-04 03:10:17 -07:00
ideathinklab01-source d17eff29d5 fix(delegate): guard _load_config() against delegation: null in config.yaml
YAML parses `delegation: null` as Python None. `dict.get(key, {})`
only uses the default when the key is *missing*, not when it exists with
a None value, so `cfg.get("max_concurrent_children")` crashes with
`'NoneType' object has no attribute 'get'`.

Same pattern as fd9b692d (fix(tui): tolerate null top-level sections).
Use `dict.get(key) or {}` to handle both missing and None-valued keys.

Closes: delegation null config crash (same class as #7215, #7346)
2026-05-04 03:09:59 -07:00
ygd58 2d3d1d9736 fix(tui): use --outdir instead of --outfile in hermes-ink build script
esbuild raises 'Must use outdir when there are multiple input files'
on Android/Termux ARM64 with esbuild >=0.25. The build script used
--outfile=dist/ink-bundle.js which is only valid for a single entry
point with no code splitting. Switching to --outdir=dist fixes the
error and names the output file dist/entry-exports.js (matching the
input file name). Update index.js to import from the new path.

Fixes #16072
2026-05-04 03:09:41 -07:00
LLing486 145a38a875 fix(agent): preserve dots in model names for Xiaomi MiMo provider
Add 'xiaomi' to the _anthropic_preserve_dots() provider whitelist and
'xiaomimimo.com' to the URL-based fallback check. Without this,
normalize_model_name() converts mimo-v2.5 to mimo-v2-5, which the
Xiaomi API rejects with HTTP 400.

Fixes #16156
2026-05-04 03:09:24 -07:00
YAMAGUCHI Seiji 0896944382 fix(cronjob): advertise 'custom:<name>' provider format in tool schema
The `provider` field in CRONJOB_SCHEMA only showed examples like
'openrouter' and 'anthropic', with no mention of the canonical
'custom:<name>' form required for custom_providers entries. When the
user has custom providers configured, LLMs tend to write the bare type
name ('custom') because the schema does not advertise the ':<name>'
suffix. The bare value then serializes into jobs.json and causes the
cron job to fail silently at run time — `_resolve_model_override`
treats it as a user-specified provider and skips the pin-to-current
fallback, but no provider ever resolves from the bare 'custom' string.

Clarifying the schema so the canonical form is discoverable addresses
the root cause at the tool-definition boundary.
2026-05-04 03:09:07 -07:00
jjjojoj 9c64d09610 fix(status): show NVIDIA NIM api key status
hermes status was missing NVIDIA API key from its API keys display.
Now shows NVIDIA NIM ✓/✗ with key hash like other providers.

Fixes #16082
2026-05-04 03:08:50 -07:00
Teknium 64b39d835e chore(release): AUTHOR_MAP entries for Tier 1d salvage batch 2026-05-04 03:07:30 -07:00
taeng0204 20a06c586f fix(dashboard): render null instead of flashing spinner during plugin load 2026-05-04 03:06:45 -07:00
taeng0204 06a6d6967a fix(dashboard): defer unknown-route redirect while dashboard plugins load 2026-05-04 03:06:45 -07:00
Teknium 986ec04048 docs: document /kanban slash command (#19584)
* docs: document /kanban slash command

The kanban user guide and slash-commands reference only mentioned the
/kanban slash command in passing. Add a proper section covering:

- CLI and gateway both expose the full hermes kanban surface via
  hermes_cli.kanban.run_slash (identical argument surface)
- Mid-run usage: /kanban bypasses the running-agent guard, so reads
  and writes land immediately while an agent is still in a turn
- Auto-subscribe on /kanban create from the gateway — originating
  chat is subscribed to terminal events, with a worked example
- Output truncation (~3800 chars) in messaging
- Autocomplete hint list vs full subcommand surface

Also adds /kanban rows to both slash-command tables (CLI + messaging)
in reference/slash-commands.md and moves it into the 'works in both'
notes bucket.

* docs(kanban): frame the model's tool surface as primary, CLI as the human surface

The kanban user guide and CLI reference read as if you drive the board
by running `hermes kanban` commands everywhere. In practice:

- **You** (human, scripts, cron, dashboard) use the `hermes kanban …`
  CLI, the `/kanban …` slash command, or the REST/dashboard.
- **Workers** spawned by the dispatcher use a dedicated `kanban_*`
  toolset (`kanban_show`, `kanban_complete`, `kanban_block`,
  `kanban_heartbeat`, `kanban_comment`, `kanban_create`,
  `kanban_link`) and never shell out to the CLI.

Changes to `user-guide/features/kanban.md`:

- New 'Two surfaces' intro distinguishes the two front doors up front.
- Quick-start section re-labelled so each step says who is running it
  (you vs. orchestrator vs. worker).
- 'How workers interact with the board' rewritten:
  - Lead with "Workers do not shell out to `hermes kanban`."
  - Tool table extended with required params.
  - Concrete worker-turn example (`kanban_show` → `kanban_heartbeat`
    → `kanban_complete`) and an orchestrator fan-out example
    (`kanban_create` x N with `parents=[...]`).
  - Moved 'Why tools not CLI' from a defensive aside to a clean
    follow-up section.
- 'Worker skill' section explicitly says the lifecycle is taught
  in tool calls, not CLI commands.
- 'Pinning extra skills' reordered — orchestrator tool form first
  (the usual case), human/CLI second, dashboard third.
- 'Orchestrator skill' now shows a canonical `kanban_create` /
  `kanban_link` / `kanban_complete` tool-call sequence instead of
  only describing what the skill teaches.
- CLI-command-reference heading now clarifies this is the human
  surface, with a cross-link to the tool-surface section.
- 'Runs — one row per attempt' structured-handoff example replaced:
  the primary example is now `kanban_complete(summary=..., metadata=...)`
  (what a worker actually does), with the CLI form retained as
  "when you, the human, need to close a task a worker can't."

Changes to `reference/cli-commands.md`:

- `hermes kanban` intro marks itself as the human / scripting surface
  and links out to the worker tool surface.
- Corrected `comment <id>` description — the next worker reads it via
  `kanban_show()`, not by running `hermes kanban show`.

* docs(kanban-tutorial): reframe worker actions as tool calls

Honest answer to Teknium's follow-up: no, the first pass missed the
tutorial. The four stories all showed `hermes kanban claim /
complete / block / unblock` as if the backend-dev, pm, and reviewer
personas were humans running CLI commands. In a real hermes kanban
run those agents are dispatcher-spawned workers driving the board
through the `kanban_*` tool surface.

Changes:

- Setup intro now distinguishes the three surfaces up front
  (dashboard / CLI for you, `kanban_*` tools for workers) and
  establishes the convention: `bash` blocks are commands *you* run,
  `# worker tool calls` blocks are what the agent emits.
- Story 1 (solo dev schema): 'Claim the schema task, do the work,
  hand off' block replaced with the dispatcher spawning the
  backend-dev worker and a `kanban_show → kanban_heartbeat →
  kanban_complete` tool-call sequence. The 'On the CLI' `hermes
  kanban show / runs` block re-labelled as 'you peeking at the board'
  to keep it correct as a human inspection step.
- Story 2 (fleet farming): note about structured handoff updated
  from `--summary` / `--metadata` CLI flags to
  `kanban_complete(summary=..., metadata=...)` tool form.
- Story 3 (role pipeline): the big PM/engineer/reviewer block fully
  rewritten as three worker tool-call sequences — PM worker
  completes spec, engineer worker blocks, human/reviewer
  `hermes kanban unblock` (or `/kanban unblock`), engineer worker
  respawns and completes. The respawn-as-new-run mechanic is now
  explicit.
- Reviewer paragraph: `build_worker_context` replaced with
  `kanban_show()` — that's the tool that delivers the parent
  handoff to the model.
- Structured handoff section heading and body updated:
  `--summary`/`--metadata` → `summary`/`metadata` (tool params),
  with a note that the tool surface doesn't expose a bulk variant
  for the same reason the CLI refuses multi-task `complete`.

Story 4 (circuit breaker) unchanged — its workers fail to spawn,
so there are no tool calls to show; the `hermes kanban create` and
`hermes kanban runs` commands in it are correctly human-driven.
2026-05-04 03:05:34 -07:00
Teknium 0628004709 docs(model-catalog): rename x-ai/grok-4.20-beta to x-ai/grok-4.20 (#19640)
OpenRouter and Nous Portal dropped the -beta suffix from the Grok 4.20 slug.
The OpenRouter section already used the new slug; this updates the Nous
Portal section and bumps updated_at.
2026-05-04 02:48:30 -07:00
ms-alan c659a16899 fix(cli): detect quoted relative paths in _detect_file_drop
Closes #15197
2026-05-04 02:48:20 -07:00
ms-alan 08b8465ca9 fix(email): add required Date header to send_message_tool._send_email
Adds RFC 5322 Date header to the _send_email tool path in tools/send_message_tool.py.

Issue #15160 noted that both gateway/platforms/email.py and tools/send_message_tool.py
construct MIMEMultipart/MIMEText messages without setting a Date header. RFC 5322
requires the Date header; mail filters reject messages that lack it.

PR #15207 fixed the gateway/platforms/email.py path but did not cover
tools/send_message_tool._send_email, which is used by the send_message tool
for cross-channel messaging.

This change adds msg["Date"] = formatdate(localtime=True) to _send_email,
mirroring the fix applied to the gateway email adapter.

Closes #15160
2026-05-04 02:48:20 -07:00
thchen 51dc98d314 fix(agent): detect Qwen3/Ollama inline thinking after tool calls
Ollama serves Qwen3 thinking inside the content field as <think>...</think>
blocks rather than in the API-level reasoning_content field.  This means
_has_structured was False for these responses, so an empty-looking reply
after a tool call triggered the nudge instead of the prefill continuation,
causing a double-response loop.

Fix: detect <think>/<thinking>/<reasoning> in final_response and:
  1. Skip the nudge when thinking is present (model is still reasoning)
  2. Include _has_inline_thinking in _has_structured so prefill kicks in
2026-05-04 02:47:29 -07:00
LeonSGP43 0df7e61d2c fix(cli): omit empty api_mode when probing custom models 2026-05-04 02:46:41 -07:00
QifengKuang 52c539d53a fix(agent): disable SDK retries on per-request OpenAI clients
Per-request OpenAI-wire clients (used by both non-streaming and
streaming chat-completions paths in _interruptible_api_call) should
not run the SDK's built-in retry loop: the agent's outer loop owns
retries with credential rotation, provider fallback, and backoff that
the SDK can't see.

Leaving SDK retries on (default 2) compounds with our outer retries
and lets a single hung provider request stretch to ~3x the per-call
timeout before our stale detector reports it.

Shared/primary clients and Anthropic / Bedrock paths are unaffected
(they don't go through here).

Salvage of #15811 core improvement — the timeout push-down in the
original PR required scaffolding that has since been refactored on
main, so only the max_retries=0 change is preserved.

Co-authored-by: QifengKuang <k2767567815@gmail.com>
2026-05-04 02:43:20 -07:00
Teknium 3c070f9f9d fix(curator): only mark agent-created for background-review sediment (#19621)
Tighten the provenance semantics added in #19618: skills a user asks a
foreground agent to write via skill_manage(create) now stay invisible to
the curator. Only skills the background self-improvement review fork
sediments through skill_manage get the created_by=agent marker.

- tools/skill_provenance.py — new ContextVar module mirroring the
  _approval_session_key pattern: set_current_write_origin / reset /
  get / is_background_review. Default origin is 'foreground'; the
  review fork sets 'background_review'.
- run_agent.py — run_conversation() binds the ContextVar from
  self._memory_write_origin at the top of each call. The review fork
  runs on its own thread (fresh context), so foreground and review
  contexts never cross-contaminate.
- tools/skill_manager_tool.py — skill_manage(action='create') now
  only calls mark_agent_created() when is_background_review(). All
  other cases (foreground create, patch, edit, write_file, delete)
  continue as before.
- tests: test_skill_provenance.py (6 tests covering the ContextVar
  surface), split test_full_create_via_dispatcher into foreground
  vs. review-fork variants, curator status tests now mark-first.

Why: the agent routinely edits existing user skills on the user's
behalf; those writes must never flip provenance. And when a user
explicitly asks the foreground agent to create a skill, that skill
belongs to the user. The curator should only be cleaning up after
its own autonomous sediment from the review nudge loop.
2026-05-04 02:42:16 -07:00
Teknium bff484a51b fix(kanban-dashboard): widen drawer, bump body fonts, fix code-block contrast (#19638)
Closes #18576. Addresses three of four complaints from the readability
report; live-verified in a dashboard against a seeded task with body,
comments, and run history.

- Drawer default width 480px → 640px, exposed as the CSS var
  `--hermes-kanban-drawer-width` so deployments / user themes can
  override without forking the plugin.
- Bump body/meta/pre/log/run-history font sizes from the 0.65-0.75rem
  cluster to the 0.78-0.85rem cluster. Long paths and code snippets in
  task bodies, run metadata, and worker logs are legible again instead
  of requiring a squint.
- Fix the black-text-on-dark-theme regression in fenced markdown code
  blocks. Root cause: themes that don't define `--color-foreground`
  (NERV, at least) leave `color: var(--color-foreground)` resolving
  empty on <code>, which then falls back to the UA default (near-black)
  instead of inheriting from the drawer's <body>. Fix: force
  `color: inherit` on both inline and fenced code, and give the fenced
  block background via `currentColor` instead of `--color-foreground`
  so there's a visible card even when the theme var is absent.

Out of scope for this PR (comments added to #18576):
- Draggable resize handle (structural JS work; plugin ships built-only,
  no src/ in-tree).
- Live worker-log viewer for running tasks (backend WS + component).
- Sibling fix: themes like NERV should define --color-foreground. The
  current changes make the drawer robust against that gap, but the
  root fix belongs in the theme layer.
2026-05-04 02:41:51 -07:00
alt-glitch 2a52e28568 fix(setup): skip AUXILIARY_VISION_MODEL write when input is blank
Guard the save_env_value('AUXILIARY_VISION_MODEL', ...) call with
'if _selected_vision_model:' so blank input at the non-OpenAI vision
model prompt doesn't nuke existing values in .env.

save_env_value has no internal guard against empty strings — it
faithfully writes whatever it receives, including empty values that
shadow the previously-configured model.

Salvage of #15504 (core hunk). Contributor's test was dropped because
it collided with subsequent test refactors; the fix stands on its own.

Co-authored-by: alt-glitch <balyan.sid@gmail.com>
2026-05-04 02:41:47 -07:00
LeonSGP43 7d36533aeb fix(pty): default TERM for resize probes
Preserve explicit caller overrides, but backfill a sensible default
TERM=xterm-256color when missing or blank in the spawn env. CI often
runs without TERM in the parent process, which makes terminal probes
like 'tput cols' fail before winsize reads.

Salvage of #15278's core code fix only — the test changes conflict
with subsequent test refactors on main that now exercise TIOCGWINSZ
directly instead of via 'tput'.

Co-authored-by: LeonSGP43 <154585401+LeonSGP43@users.noreply.github.com>
2026-05-04 02:38:54 -07:00
Bart 99faac212e fix(tui): prevent trailing space in picker-command completions
Commands that open pickers (/model, /skin, /personality) previously
received a trailing space in their completions to keep the dropdown
visible in the classic CLI. However, the TUI's submit handler applies
the completion when Enter is pressed and the result differs from the
input — so '/model' + space became '/model ' and the command was never
executed.

Picker commands now omit the trailing space for exact matches, allowing
Enter to submit and open the picker. Non-picker commands (/help, etc.)
are unaffected.
2026-05-04 02:35:33 -07:00
analista 6da970f15d fix(tui): close AIAgent on session teardown to prevent FD leak
session.close only closed the slash_worker subprocess but never called
agent.close() on the AIAgent instance.  In the long-lived TUI gateway
process, this left httpx clients for GC to finalize.  When the OS
recycled a closed FD number for a new active connection, the stale
finalizer would close the live socket, causing intermittent
[Errno 9] Bad file descriptor on subsequent LLM API calls.

Call agent.close() (which properly shuts down the httpx transport pool
and TCP sockets) before closing the slash_worker.
2026-05-04 02:34:53 -07:00
nftpoetrist 4e2b20b705 fix(cli): sync use_gateway in _reconfigure_provider for tts, browser, and web
_reconfigure_provider() updates cloud_provider/backend/tts.provider when
switching tool providers via "hermes setup tools → Reconfigure", but did
not update the matching use_gateway flag. _configure_provider() (the
initial-setup path) sets use_gateway on all three tool categories. The
omission in _reconfigure_provider leaves a stale value in config.yaml:
switching from a Nous-managed provider (use_gateway=True) to a self-hosted
one keeps use_gateway=True, continuing to route requests through the Nous
gateway; switching the other way leaves use_gateway unset so the managed
feature does not activate.

Fix: mirror _configure_provider's use_gateway = bool(managed_feature)
assignment in the tts, browser, and web blocks of _reconfigure_provider.
Symmetric across all three tool categories. No behavior change for any
provider that does not set tts_provider, browser_provider, or web_backend.

Fixes #15229
2026-05-04 02:33:55 -07:00
flobo3 ba8337464d fix(gemini): extract usageMetadata from streaming chunks for token tracking 2026-05-04 02:33:30 -07:00
ee-blog f6aa1965d7 fix(telegram): fallback to document when photo dimensions exceed limits
Telegram's send_photo has dimension limits (sum of width+height <= 10000px).
When sending large screenshots or tall images, the API returns
'Photo_invalid_dimensions' error.

Fix: Catch this specific error in send_image_file() and automatically
fallback to send_document() which has no dimension limits (only 50MB size).

This is similar to the existing 5MB URL fallback (commit 542faf22) but
handles local files with dimension issues instead of URL size issues.
2026-05-04 02:33:09 -07:00
barteq ad4542bf6d fix(gateway): allow free_response_channels to override DISCORD_IGNORE_NO_MENTION
When DISCORD_IGNORE_NO_MENTION is true (default), the bot ignores
messages without @mention. However, this check ran before evaluating
free_response_channels, so messages in free-response channels were
wrongly dropped unless they contained a mention.

This change adds a carve-out: if the message lands in a channel that
is configured as a free response channel (or its parent category is),
the ignore-no-mention rule is skipped.

Also removes the unconditional skip_thread for free response channels
so that auto_thread still creates threads there unless explicitly
disabled via DISCORD_NO_THREAD_CHANNELS.
2026-05-04 02:32:39 -07:00
hex-clawd 54cd633366 fix(cron): skip AI call when script produces no output
When a cron job has a pre-run script that runs successfully but produces
no output (e.g. email checker with no new mail), the scheduler previously
injected "[Script ran successfully but produced no output.]" into the
prompt and still called the AI model. This wastes tokens on every cycle.

Now _build_job_prompt() returns None when script output is empty, and
run_job() short-circuits with a SILENT response - zero API calls when
there is nothing to report.
2026-05-04 02:32:18 -07:00
dpaluy e2248045f5 fix(cron): drop stale env-var override of persisted provider
Cron jobs were passing os.getenv("HERMES_INFERENCE_PROVIDER") as the
"requested" arg to resolve_runtime_provider(), which short-circuited
the resolver's own precedence (explicit arg → persisted config → env)
and let stale shell/.env values outrank the user's saved provider.

Long-lived cron daemons inherit env from the shell that launched them,
so a since-changed provider (e.g. DeepSeek) could keep firing for jobs
that don't pin provider/model. Same bug class as f0b763c74 fixed for
the TUI /model switch.

Pass only job.get("provider") and let resolve_requested_provider fall
through to persisted config and env in the documented order.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 02:31:57 -07:00
flobo3 d7663c7808 fix(docker): exclude compose/profile runtime state from build context 2026-05-04 02:31:39 -07:00
helix4u f236cbfec3 fix(tui): declare nanostores dependency 2026-05-04 02:31:22 -07:00
B1GGersnow dc63ad0ad2 fix(anthropic): cap max_tokens at 65536 for Qwen models via DashScope
DashScope's Anthropic-compatible endpoint enforces max_tokens ∈ [1, 65536].
Adding "qwen3" to _ANTHROPIC_OUTPUT_LIMITS prevents 400 errors that were
misclassified as context overflow, triggering premature compression.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-04 02:31:05 -07:00
Emilien Domenge 83bbe9b458 fix(delegation): pass target_model to resolve_runtime_provider in _resolve_delegation_credentials
When delegation.model differs from model.default and the provider is
opencode-go or opencode-zen, the wrong api_mode is computed because
resolve_runtime_provider falls back to model_cfg.get('default') — the
main model — instead of the configured delegation model.

For example, with model.default=minimax-m2.7 (anthropic_messages) and
delegation.model=glm-5.1 (chat_completions), subagents get
anthropic_messages, which strips /v1 from the base URL and causes a 404.

resolve_runtime_provider already accepts target_model for exactly this
purpose; _resolve_delegation_credentials just wasn't passing it.

Fixes #15319
Related: #13678
2026-05-04 02:30:48 -07:00
nftpoetrist e2211b2683 fix(compressor): reset _summary_failure_cooldown_until in on_session_reset()
on_session_reset() cleared _previous_summary, _last_summary_error, and
_ineffective_compression_count but left _summary_failure_cooldown_until
intact. When a transient summary error sets a 60 s cooldown (or 600 s
for a missing-provider RuntimeError) and the user immediately runs /reset
or /new, the cooldown carries into the new session. If the new session
reaches the compression threshold before the cooldown expires,
_generate_summary() returns None early, middle turns are silently dropped
without a summary, and the agent continues with no indication that
compaction was skipped.

Fix: set _summary_failure_cooldown_until = 0.0 in on_session_reset(),
matching the value assigned in __init__ and symmetric with the other
per-session fields already cleared there.

Fixes #15547
2026-05-04 02:30:31 -07:00
Teknium 3e1559b910 chore(release): AUTHOR_MAP entries for Tier 1c salvage batch
Pre-adds author-email mappings for upcoming Tier 1c salvage PRs
(small Apr 24-25 fixes).
2026-05-04 02:29:18 -07:00
Teknium baf834cc0f chore(release): map cine.dreamer.one@gmail.com to @LeonSGP43 2026-05-04 02:19:28 -07:00
LeonSGP43 abcaf05229 fix(skills): keep manual skills out of curator 2026-05-04 02:19:28 -07:00
asheriif 21c7c9f0ca fix(tui): harden plugin slash exec errors 2026-05-04 09:07:37 +00:00
asheriif 7e780f4832 fix(tui): run plugin slash commands live 2026-05-03 19:42:16 +00:00
396 changed files with 33789 additions and 2754 deletions
+4
View File
@@ -25,3 +25,7 @@ ui-tui/packages/hermes-ink/dist/
# Runtime data (bind-mounted at /opt/data; must not leak into build context)
data/
# Compose/profile runtime state (bind-mounted; avoid ownership/secret issues)
hermes-config/
runtime/
+44
View File
@@ -0,0 +1,44 @@
# Dependabot configuration for hermes-agent.
#
# Deliberately scoped to github-actions only.
#
# We do NOT enable Dependabot for pip / npm / any source-dependency ecosystem
# because we pin source dependencies exactly (uv.lock, package-lock.json) as
# part of our supply-chain posture. Automatic version-bump PRs against those
# pins would undermine the strategy — pins are moved deliberately, after
# review, not on a schedule.
#
# github-actions is the exception: action pins (we use full commit SHAs per
# supply-chain policy) must be updated when upstream actions publish
# patches — usually themselves security fixes. Dependabot opens a PR with
# the new SHA and release notes; we review and merge like any other PR.
#
# Security-update PRs for source dependencies (opened ONLY when a CVE is
# published affecting a currently-pinned version) are enabled separately
# via the repo's Dependabot security updates setting
# (Settings → Code security → Dependabot → Dependabot security updates).
# Those are CVE-only, not schedule-driven, and do not conflict with our
# pinning strategy — they fire when a pinned version becomes known-bad,
# which is exactly when we want to move the pin.
version: 2
updates:
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "weekly"
day: "monday"
open-pull-requests-limit: 5
labels:
- "dependencies"
- "github-actions"
commit-message:
prefix: "chore(actions)"
include: "scope"
groups:
# Batch routine action bumps into one PR per week to reduce noise.
# Security updates still open individually and bypass grouping.
actions-minor-patch:
update-types:
- "minor"
- "patch"
+67
View File
@@ -0,0 +1,67 @@
name: OSV-Scanner
# Scans lockfiles (uv.lock, package-lock.json) against the OSV vulnerability
# database. Runs on every PR that touches a lockfile and on a weekly schedule
# against main.
#
# This is detection-only — OSV-Scanner does NOT open PRs or modify pins.
# It reports known CVEs in currently-pinned dependency versions so we can
# decide when and how to patch on our own schedule. Our pinning strategy
# (full SHA / exact version) is preserved; only the notification signal
# is added.
#
# Complements the existing supply-chain-audit.yml workflow (which scans
# for malicious code patterns in PR diffs) by covering the orthogonal
# "currently-pinned dep became known-vulnerable" case.
#
# Uses Google's officially-recommended reusable workflow, pinned by SHA.
# Findings land in the repo's Security tab (Code Scanning > OSV-Scanner).
# fail-on-vuln is disabled so the job does not block merges on pre-existing
# vulnerabilities in pinned deps that we may need to patch deliberately.
on:
pull_request:
branches: [main]
paths:
- 'uv.lock'
- 'pyproject.toml'
- 'package.json'
- 'package-lock.json'
- 'ui-tui/package.json'
- 'ui-tui/package-lock.json'
- 'website/package.json'
- 'website/package-lock.json'
- '.github/workflows/osv-scanner.yml'
push:
branches: [main]
paths:
- 'uv.lock'
- 'pyproject.toml'
- 'package.json'
- 'package-lock.json'
- 'ui-tui/package-lock.json'
- 'website/package-lock.json'
schedule:
# Weekly scan against main — catches CVEs published after merge for
# deps that haven't changed since.
- cron: '0 9 * * 1'
workflow_dispatch:
permissions:
# Required by the reusable workflow to upload SARIF to the Security tab.
actions: read
contents: read
security-events: write
jobs:
scan:
name: Scan lockfiles
uses: google/osv-scanner-action/.github/workflows/osv-scanner-reusable.yml@c51854704019a247608d928f370c98740469d4b5 # v2.3.5
with:
# Scan explicit lockfiles rather than recursing, so we only look at
# the three sources of truth and skip vendored / test / worktree dirs.
scan-args: |-
--lockfile=uv.lock
--lockfile=ui-tui/package-lock.json
--lockfile=website/package-lock.json
fail-on-vuln: false
+205 -10
View File
@@ -37,12 +37,17 @@ hermes-agent/
│ ├── platforms/ # Adapter per platform (telegram, discord, slack, whatsapp,
│ │ # homeassistant, signal, matrix, mattermost, email, sms,
│ │ # dingtalk, wecom, weixin, feishu, qqbot, bluebubbles,
│ │ # webhook, api_server, ...). See ADDING_A_PLATFORM.md.
│ │ # yuanbao, webhook, api_server, ...). See ADDING_A_PLATFORM.md.
│ └── builtin_hooks/ # Extension point for always-registered gateway hooks (none shipped)
├── plugins/ # Plugin system (see "Plugins" section below)
│ ├── memory/ # Memory-provider plugins (honcho, mem0, supermemory, ...)
│ ├── context_engine/ # Context-engine plugins
── <others>/ # Dashboard, image-gen, disk-cleanup, examples, ...
── kanban/ # Multi-agent board dispatcher + worker plugin
│ ├── hermes-achievements/ # Gamified achievement tracking
│ ├── observability/ # Metrics / traces / logs plugin
│ ├── image_gen/ # Image-generation providers
│ └── <others>/ # disk-cleanup, example-dashboard, google_meet, platforms,
│ # spotify, strike-freedom-cockpit, ...
├── optional-skills/ # Heavier/niche skills shipped but NOT active by default
├── skills/ # Built-in skills bundled with the repo
├── ui-tui/ # Ink (React) terminal UI — `hermes --tui`
@@ -53,7 +58,7 @@ hermes-agent/
├── environments/ # RL training environments (Atropos)
├── scripts/ # run_tests.sh, release.py, auxiliary scripts
├── website/ # Docusaurus docs site
└── tests/ # Pytest suite (~15k tests across ~700 files as of Apr 2026)
└── tests/ # Pytest suite (~17k tests across ~900 files as of May 2026)
```
**User config:** `~/.hermes/config.yaml` (settings), `~/.hermes/.env` (API keys only).
@@ -257,7 +262,16 @@ The dashboard embeds the real `hermes --tui` — **not** a rewrite. See `hermes
## Adding New Tools
Requires changes in **2 files**:
For most custom or local-only tools, do **not** edit Hermes core. Use the plugin
route instead: create `~/.hermes/plugins/<name>/plugin.yaml` and
`~/.hermes/plugins/<name>/__init__.py`, then register tools with
`ctx.register_tool(...)`. Plugin toolsets are discovered automatically and can be
enabled or disabled without touching `tools/` or `toolsets.py`.
Use the built-in route below only when the user is explicitly contributing a new
core Hermes tool that should ship in the base system.
Built-in/core tools require changes in **2 files**:
**1. Create `tools/your_tool.py`:**
```python
@@ -280,9 +294,9 @@ registry.register(
)
```
**2. Add to `toolsets.py`** — either `_HERMES_CORE_TOOLS` (all platforms) or a new toolset.
**2. Add to `toolsets.py`** — either `_HERMES_CORE_TOOLS` (all platforms) or a new toolset. **This step is required:** auto-discovery imports the tool and registers its schema, but the tool is only *exposed to an agent* if its name appears in a toolset. `_HERMES_CORE_TOOLS` is not dead code — it's the default bundle every platform's base toolset inherits from.
Auto-discovery: any `tools/*.py` file with a top-level `registry.register()` call is imported automatically — no manual import list to maintain.
Auto-discovery: any `tools/*.py` file with a top-level `registry.register()` call is imported automatically — no manual import list to maintain. Wiring into a toolset is still a deliberate, manual step.
The registry handles schema collection, dispatch, availability checking, and error wrapping. All handlers MUST return a JSON string.
@@ -304,6 +318,22 @@ The registry handles schema collection, dispatch, availability checking, and err
section is handled automatically by the deep-merge and does NOT require
a version bump.
### Top-level `config.yaml` sections (non-exhaustive):
`model`, `agent`, `terminal`, `compression`, `display`, `stt`, `tts`,
`memory`, `security`, `delegation`, `smart_model_routing`, `checkpoints`,
`auxiliary`, `curator`, `skills`, `gateway`, `logging`, `cron`, `profiles`,
`plugins`, `honcho`.
`auxiliary` holds per-task overrides for side-LLM work (curator, vision,
embedding, title generation, session_search, etc.) — each task can pin
its own provider/model/base_url/max_tokens/reasoning_effort. See
`agent/auxiliary_client.py::_resolve_auto` for resolution order.
`curator` holds the background skill-maintenance config —
`enabled`, `interval_hours`, `min_idle_hours`, `stale_after_days`,
`archive_after_days`, `backup` (nested).
### .env variables (SECRETS ONLY — API keys, tokens, passwords):
1. Add to `OPTIONAL_ENV_VARS` in `hermes_cli/config.py` with metadata:
```python
@@ -510,11 +540,176 @@ niche skills belong in `optional-skills/`.
### SKILL.md frontmatter
Standard fields: `name`, `description`, `version`, `platforms`
(OS-gating list: `[macos]`, `[linux, macos]`, ...),
Standard fields: `name`, `description`, `version`, `author`, `license`,
`platforms` (OS-gating list: `[macos]`, `[linux, macos]`, ...),
`metadata.hermes.tags`, `metadata.hermes.category`,
`metadata.hermes.config` (config.yaml settings the skill needs — stored
under `skills.config.<key>`, prompted during setup, injected at load time).
`metadata.hermes.related_skills`, `metadata.hermes.config` (config.yaml
settings the skill needs — stored under `skills.config.<key>`, prompted
during setup, injected at load time).
Top-level `tags:` and `category:` are also accepted and mirrored from
`metadata.hermes.*` by the loader.
---
## Toolsets
All toolsets are defined in `toolsets.py` as a single `TOOLSETS` dict.
Each platform's adapter picks a base toolset (e.g. Telegram uses
`"messaging"`); `_HERMES_CORE_TOOLS` is the default bundle most
platforms inherit from.
Current toolset keys: `browser`, `clarify`, `code_execution`, `cronjob`,
`debugging`, `delegation`, `discord`, `discord_admin`, `feishu_doc`,
`feishu_drive`, `file`, `homeassistant`, `image_gen`, `kanban`, `memory`,
`messaging`, `moa`, `rl`, `safe`, `search`, `session_search`, `skills`,
`spotify`, `terminal`, `todo`, `tts`, `video`, `vision`, `web`, `yuanbao`.
Enable/disable per platform via `hermes tools` (the curses UI) or the
`tools.<platform>.enabled` / `tools.<platform>.disabled` lists in
`config.yaml`.
---
## Delegation (`delegate_task`)
`tools/delegate_tool.py` spawns a subagent with an isolated
context + terminal session. Synchronous: the parent waits for the
child's summary before continuing its own loop — if the parent is
interrupted, the child is cancelled.
Two shapes:
- **Single:** pass `goal` (+ optional `context`, `toolsets`).
- **Batch (parallel):** pass `tasks: [...]` — each gets its own subagent
running concurrently. Concurrency is capped by
`delegation.max_concurrent_children` (default 3).
Roles:
- `role="leaf"` (default) — focused worker. Cannot call `delegate_task`,
`clarify`, `memory`, `send_message`, `execute_code`.
- `role="orchestrator"` — retains `delegate_task` so it can spawn its
own workers. Gated by `delegation.orchestrator_enabled` (default true)
and bounded by `delegation.max_spawn_depth` (default 2).
Key config knobs (under `delegation:` in `config.yaml`):
`max_concurrent_children`, `max_spawn_depth`, `child_timeout_seconds`,
`orchestrator_enabled`, `subagent_auto_approve`, `inherit_mcp_toolsets`,
`max_iterations`.
Synchronicity rule: delegate_task is **not** durable. For long-running
work that must outlive the current turn, use `cronjob` or
`terminal(background=True, notify_on_complete=True)` instead.
---
## Curator (skill lifecycle)
Background skill-maintenance system that tracks usage on agent-created
skills and auto-archives stale ones. Users never lose skills; archives
go to `~/.hermes/skills/.archive/` and are restorable.
- **Core:** `agent/curator.py` (review loop, auto-transitions, LLM review
prompt) + `agent/curator_backup.py` (pre-run tar.gz snapshots).
- **CLI:** `hermes_cli/curator.py` wires `hermes curator <verb>` where
verbs are: `status`, `run`, `pause`, `resume`, `pin`, `unpin`,
`archive`, `restore`, `prune`, `backup`, `rollback`.
- **Telemetry:** `tools/skill_usage.py` owns the sidecar
`~/.hermes/skills/.usage.json` — per-skill `use_count`, `view_count`,
`patch_count`, `last_activity_at`, `state` (active / stale /
archived), `pinned`.
Invariants:
- Curator only touches skills with `created_by: "agent"` provenance —
bundled + hub-installed skills are off-limits.
- Never deletes; max destructive action is archive.
- Pinned skills are exempt from every auto-transition and from the
LLM review pass.
- `skill_manage(action="delete")` refuses pinned skills; patch/edit/
write_file/remove_file go through so the agent can keep improving
pinned skills.
Config section (`curator:` in `config.yaml`):
`enabled`, `interval_hours`, `min_idle_hours`, `stale_after_days`,
`archive_after_days`, `backup.*`.
Full user-facing docs: `website/docs/user-guide/features/curator.md`.
---
## Cron (scheduled jobs)
`cron/jobs.py` (job store) + `cron/scheduler.py` (tick loop). Agents
schedule jobs via the `cronjob` tool; users via `hermes cron <verb>`
(`list`, `add`, `edit`, `pause`, `resume`, `run`, `remove`) or the
`/cron` slash command.
Supported schedule formats:
- Duration: `"30m"`, `"2h"`, `"1d"`
- "every" phrase: `"every 2h"`, `"every monday 9am"`
- 5-field cron expression: `"0 9 * * *"`
- ISO timestamp (one-shot): `"2026-06-01T09:00:00Z"`
Per-job fields include `skills` (load specific skills), `model` /
`provider` overrides, `script` (pre-run data-collection script whose
stdout is injected into the prompt; `no_agent=True` turns the script
into the entire job), `context_from` (chain job A's last output into
job B's prompt), `workdir` (run in a specific directory with its
`AGENTS.md`/`CLAUDE.md` loaded), and multi-platform delivery.
Hardening invariants:
- **3-minute hard interrupt** on cron sessions — runaway agent loops
cannot monopolize the scheduler.
- Catchup window: half the job's period, clamped to 120s2h.
- Grace window: 120s for one-shot jobs whose fire time was missed.
- File lock at `~/.hermes/cron/.tick.lock` prevents duplicate ticks
across processes.
- Cron sessions pass `skip_memory=True` by default; memory providers
intentionally do not run during cron.
Cron deliveries are **not** mirrored into the target gateway session —
they land in their own cron session with a header/footer frame so the
main conversation's message-role alternation stays intact.
---
## Kanban (multi-agent work queue)
Durable SQLite-backed board that lets multiple profiles / workers
collaborate on shared tasks. Users drive it via `hermes kanban <verb>`;
workers spawned by the dispatcher drive it via a dedicated `kanban_*`
toolset so their schema footprint is zero when they're not inside a
kanban task.
- **CLI:** `hermes_cli/kanban.py` wires `hermes kanban` with verbs
`init`, `create`, `list` (alias `ls`), `show`, `assign`, `link`,
`unlink`, `comment`, `complete`, `block`, `unblock`, `archive`,
`tail`, plus less-commonly-used `watch`, `stats`, `runs`, `log`,
`assignees`, `heartbeat`, `notify-*`, `dispatch`, `daemon`, `gc`.
- **Worker toolset:** `tools/kanban_tools.py` exposes `kanban_show`,
`kanban_complete`, `kanban_block`, `kanban_heartbeat`, `kanban_comment`,
`kanban_create`, `kanban_link` — gated by `HERMES_KANBAN_TASK` so
the schema only appears for processes actually running as a worker.
- **Dispatcher:** long-lived loop that (default every 60s) reclaims
stale claims, promotes ready tasks, atomically claims, and spawns
assigned profiles. Runs **inside the gateway** by default via
`kanban.dispatch_in_gateway: true`.
- **Plugin assets:** `plugins/kanban/dashboard/` (web UI) +
`plugins/kanban/systemd/` (`hermes-kanban-dispatcher.service` for
standalone dispatcher deployment).
Isolation model:
- **Board** is the hard boundary — workers are spawned with
`HERMES_KANBAN_BOARD` pinned in their env so they can't see other
boards.
- **Tenant** is a soft namespace *within* a board — one specialist
fleet can serve multiple businesses with workspace-path + memory-key
isolation.
- After ~5 consecutive spawn failures on the same task the dispatcher
auto-blocks it to prevent spin loops.
Full user-facing docs: `website/docs/user-guide/features/kanban.md`.
---
+4 -11
View File
@@ -466,17 +466,10 @@ class SessionManager:
except Exception:
logger.debug("Failed to update ACP session metadata", exc_info=True)
# Replace stored messages with current history.
db.clear_messages(state.session_id)
for msg in state.history:
db.append_message(
session_id=state.session_id,
role=msg.get("role", "user"),
content=msg.get("content"),
tool_name=msg.get("tool_name") or msg.get("name"),
tool_calls=msg.get("tool_calls"),
tool_call_id=msg.get("tool_call_id"),
)
# Replace stored messages with current history atomically so a
# mid-rewrite failure rolls back and the previously persisted
# conversation is preserved (salvaged from #13675).
db.replace_messages(state.session_id, state.history)
except Exception:
logger.warning("Failed to persist ACP session %s", state.session_id, exc_info=True)
+38 -3
View File
@@ -76,6 +76,7 @@ _ADAPTIVE_THINKING_SUBSTRINGS = ("4-6", "4.6", "4-7", "4.7")
# Models where temperature/top_p/top_k return 400 if set to non-default values.
# This is the Opus 4.7 contract; future 4.x+ models are expected to follow it.
_NO_SAMPLING_PARAMS_SUBSTRINGS = ("4-7", "4.7")
_FAST_MODE_SUPPORTED_SUBSTRINGS = ("opus-4-6", "opus-4.6")
# ── Max output token limits per Anthropic model ───────────────────────
# Source: Anthropic docs + Cline model catalog. Anthropic's API requires
@@ -105,6 +106,9 @@ _ANTHROPIC_OUTPUT_LIMITS = {
"claude-3-haiku": 4_096,
# Third-party Anthropic-compatible providers
"minimax": 131_072,
# Qwen models via DashScope Anthropic-compatible endpoint
# DashScope enforces max_tokens ∈ [1, 65536]
"qwen3": 65_536,
}
# For any model not in the table, assume the highest current limit.
@@ -216,6 +220,17 @@ def _forbids_sampling_params(model: str) -> bool:
return any(v in model for v in _NO_SAMPLING_PARAMS_SUBSTRINGS)
def _supports_fast_mode(model: str) -> bool:
"""Return True for models that support Anthropic Fast Mode (speed=fast).
Per Anthropic docs, fast mode is currently supported on Opus 4.6 only.
Sending ``speed: "fast"`` to any other Claude model (including Opus 4.7)
returns HTTP 400. This guard prevents silently 400'ing when stale config
or older callers leave fast mode enabled across a model upgrade.
"""
return any(v in model for v in _FAST_MODE_SUPPORTED_SUBSTRINGS)
# Beta headers for enhanced features (sent with ALL auth types).
# As of Opus 4.7 (2026-04-16), the first two are GA on Claude 4.6+ — the
# beta headers are still accepted (harmless no-op) but not required. Kept
@@ -1222,6 +1237,14 @@ def _normalize_tool_input_schema(schema: Any) -> Dict[str, Any]:
``keep_nullable_hint=False`` because the Anthropic validator does not
recognize the OpenAPI-style ``nullable: true`` extension and strict
schema-to-grammar converters may reject unknown keywords.
Top-level ``oneOf``/``allOf``/``anyOf`` are also stripped here: the
Anthropic API rejects union keywords at the schema root with a generic
HTTP 400. Several upstream and plugin tools ship schemas with one of
these keywords at the top level (commonly for Pydantic discriminated
unions). If we land here with those keywords still present after
nullable-union stripping, drop them and fall back to a plain object
schema so the tool still validates at the Anthropic boundary.
"""
if not schema:
return {"type": "object", "properties": {}}
@@ -1231,6 +1254,12 @@ def _normalize_tool_input_schema(schema: Any) -> Dict[str, Any]:
normalized = strip_nullable_unions(schema, keep_nullable_hint=False)
if not isinstance(normalized, dict):
return {"type": "object", "properties": {}}
# Strip top-level union keywords that Anthropic's validator rejects.
banned = {"oneOf", "allOf", "anyOf"}
if banned & normalized.keys():
normalized = {k: v for k, v in normalized.items() if k not in banned}
if "type" not in normalized:
normalized["type"] = "object"
if normalized.get("type") == "object" and not isinstance(normalized.get("properties"), dict):
normalized = {**normalized, "properties": {}}
return normalized
@@ -1915,9 +1944,15 @@ def build_anthropic_kwargs(
# ── Fast mode (Opus 4.6 only) ────────────────────────────────────
# Adds extra_body.speed="fast" + the fast-mode beta header for ~2.5x
# output speed. Only for native Anthropic endpoints — third-party
# providers would reject the unknown beta header and speed parameter.
if fast_mode and not _is_third_party_anthropic_endpoint(base_url):
# output speed. Per Anthropic docs, fast mode is only supported on
# Opus 4.6 — Opus 4.7 and other models 400 on the speed parameter.
# Only for native Anthropic endpoints — third-party providers would
# reject the unknown beta header and speed parameter.
if (
fast_mode
and not _is_third_party_anthropic_endpoint(base_url)
and _supports_fast_mode(model)
):
kwargs.setdefault("extra_body", {})["speed"] = "fast"
# Build extra_headers with ALL applicable betas (the per-request
# extra_headers override the client-level anthropic-beta header).
+130 -19
View File
@@ -216,7 +216,26 @@ def _fixed_temperature_for_model(
return None
# Default auxiliary models for direct API-key providers (cheap/fast for side tasks)
_API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
def _get_aux_model_for_provider(provider_id: str) -> str:
"""Return the cheap auxiliary model for a provider.
Reads from ProviderProfile.default_aux_model first, falling back to the
legacy hardcoded dict for providers that predate the profiles system.
"""
try:
from providers import get_provider_profile
_p = get_provider_profile(provider_id)
if _p and _p.default_aux_model:
return _p.default_aux_model
except Exception:
pass
return _API_KEY_PROVIDER_AUX_MODELS_FALLBACK.get(provider_id, "")
# Fallback for providers not yet migrated to ProviderProfile.default_aux_model,
# plus providers we intentionally keep pinned here (e.g. Anthropic predates
# profiles). New providers should set default_aux_model on their profile instead.
_API_KEY_PROVIDER_AUX_MODELS_FALLBACK: Dict[str, str] = {
"gemini": "gemini-3-flash-preview",
"zai": "glm-4.5-flash",
"kimi-coding": "kimi-k2-turbo-preview",
@@ -235,6 +254,10 @@ _API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
"tencent-tokenhub": "hy3-preview",
}
# Legacy alias — callers that haven't been updated to _get_aux_model_for_provider()
# can still use this dict directly. Kept in sync with _FALLBACK above.
_API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = _API_KEY_PROVIDER_AUX_MODELS_FALLBACK
# Vision-specific model overrides for direct providers.
# When the user's main provider has a dedicated vision/multimodal model that
# differs from their main chat model, map it here. The vision auto-detect
@@ -259,10 +282,12 @@ _PROVIDERS_WITHOUT_VISION: frozenset = frozenset({
"kimi-coding-cn",
})
# OpenRouter app attribution headers (base — always sent)
# OpenRouter app attribution headers (base — always sent).
# `X-Title` is the canonical attribution header OpenRouter's dashboard
# reads; the previous `X-OpenRouter-Title` label was not recognized there.
_OR_HEADERS_BASE = {
"HTTP-Referer": "https://hermes-agent.nousresearch.com",
"X-OpenRouter-Title": "Hermes Agent",
"X-Title": "Hermes Agent",
"X-OpenRouter-Categories": "productivity,cli-agent",
}
@@ -567,7 +592,12 @@ class _CodexCompletionsAdapter:
# API allows it.
pass
else:
effort = reasoning_cfg.get("effort", "medium")
# Truthy-only check mirrors agent/transports/codex.py
# build_kwargs(): falsy values (None, "", 0) fall back
# to the default rather than being forwarded to the
# Codex backend, which rejects e.g. {"effort": null}
# with a 400.
effort = reasoning_cfg.get("effort") or "medium"
# Codex backend rejects "minimal"; clamp to "low" to
# match the main-agent Codex transport behavior.
if effort == "minimal":
@@ -1150,7 +1180,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
raw_base_url = _pool_runtime_base_url(entry, pconfig.inference_base_url) or pconfig.inference_base_url
base_url = _to_openai_base_url(raw_base_url)
model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id)
model = _get_aux_model_for_provider(provider_id) or None
if model is None:
continue # skip provider if we don't know a valid aux model
logger.debug("Auxiliary text client: %s (%s) via pool", pconfig.name, model)
@@ -1166,6 +1196,14 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
from hermes_cli.models import copilot_default_headers
extra["default_headers"] = copilot_default_headers()
else:
try:
from providers import get_provider_profile as _gpf_aux
_ph_aux = _gpf_aux(provider_id)
if _ph_aux and _ph_aux.default_headers:
extra["default_headers"] = dict(_ph_aux.default_headers)
except Exception:
pass
_client = OpenAI(api_key=api_key, base_url=base_url, **extra)
_client = _maybe_wrap_anthropic(_client, model, api_key, raw_base_url)
return _client, model
@@ -1177,7 +1215,7 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
raw_base_url = str(creds.get("base_url", "")).strip().rstrip("/") or pconfig.inference_base_url
base_url = _to_openai_base_url(raw_base_url)
model = _API_KEY_PROVIDER_AUX_MODELS.get(provider_id)
model = _get_aux_model_for_provider(provider_id) or None
if model is None:
continue # skip provider if we don't know a valid aux model
logger.debug("Auxiliary text client: %s (%s)", pconfig.name, model)
@@ -1193,6 +1231,14 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
from hermes_cli.models import copilot_default_headers
extra["default_headers"] = copilot_default_headers()
else:
try:
from providers import get_provider_profile as _gpf_aux2
_ph_aux2 = _gpf_aux2(provider_id)
if _ph_aux2 and _ph_aux2.default_headers:
extra["default_headers"] = dict(_ph_aux2.default_headers)
except Exception:
pass
_client = OpenAI(api_key=api_key, base_url=base_url, **extra)
_client = _maybe_wrap_anthropic(_client, model, api_key, raw_base_url)
return _client, model
@@ -1565,7 +1611,7 @@ def _try_anthropic(explicit_api_key: str = None) -> Tuple[Optional[Any], Optiona
from agent.anthropic_adapter import _is_oauth_token
is_oauth = _is_oauth_token(token)
model = _API_KEY_PROVIDER_AUX_MODELS.get("anthropic", "claude-haiku-4-5-20251001")
model = _get_aux_model_for_provider("anthropic") or "claude-haiku-4-5-20251001"
logger.debug("Auxiliary client: Anthropic native (%s) at %s (oauth=%s)", model, base_url, is_oauth)
try:
real_client = build_anthropic_client(token, base_url)
@@ -1643,6 +1689,39 @@ def _is_payment_error(exc: Exception) -> bool:
return False
def _is_rate_limit_error(exc: Exception) -> bool:
"""Detect rate-limit errors that warrant provider fallback.
Returns True for HTTP 429 errors whose message indicates rate limiting
(as opposed to billing/quota exhaustion, which _is_payment_error handles).
Also catches OpenAI SDK RateLimitError instances that may not set
.status_code on the exception object.
"""
status = getattr(exc, "status_code", None)
err_lower = str(exc).lower()
# OpenAI SDK's RateLimitError sometimes omits .status_code —
# detect by class name so we don't miss these. (PR #8023 pattern)
if type(exc).__name__ == "RateLimitError":
return True
if status == 429:
# Distinguish rate-limit from billing: billing keywords are handled
# by _is_payment_error, everything else on 429 is a rate limit.
if any(kw in err_lower for kw in (
"rate limit", "rate_limit", "too many requests",
"try again", "retry after", "resets in",
)):
return True
# Generic 429 without billing keywords = likely a rate limit
if not any(kw in err_lower for kw in (
"credits", "insufficient funds", "billing",
"payment required", "can only afford",
)):
return True
return False
def _is_connection_error(exc: Exception) -> bool:
"""Detect connection/network errors that warrant provider fallback.
@@ -2368,7 +2447,7 @@ def resolve_provider_client(
if explicit_base_url:
base_url = _to_openai_base_url(explicit_base_url.strip().rstrip("/"))
default_model = _API_KEY_PROVIDER_AUX_MODELS.get(provider, "")
default_model = _get_aux_model_for_provider(provider)
final_model = _normalize_resolved_model(model or default_model, provider)
if provider == "gemini":
@@ -2648,8 +2727,11 @@ def resolve_vision_provider_client(
return resolved_provider, sync_client, final_model
if resolved_base_url:
provider_for_base_override = (
requested if requested and requested not in ("", "auto") else "custom"
)
client, final_model = resolve_provider_client(
"custom",
provider_for_base_override,
model=resolved_model,
async_mode=async_mode,
explicit_base_url=resolved_base_url,
@@ -2657,8 +2739,8 @@ def resolve_vision_provider_client(
api_mode=resolved_api_mode,
)
if client is None:
return "custom", None, None
return "custom", client, final_model
return provider_for_base_override, None, None
return provider_for_base_override, client, final_model
if requested == "auto":
# Vision auto-detection order:
@@ -3124,8 +3206,14 @@ def _resolve_task_provider_model(
if task:
# Config.yaml is the primary source for per-task overrides.
if cfg_base_url:
if cfg_base_url and cfg_api_key:
# Both base_url and api_key explicitly set → custom endpoint.
return "custom", resolved_model, cfg_base_url, cfg_api_key, resolved_api_mode
if cfg_base_url and cfg_provider and cfg_provider != "auto":
# base_url set without api_key but with a known provider — use
# the provider so it can resolve credentials from env vars
# (e.g. OPENROUTER_API_KEY) instead of locking into "custom".
return cfg_provider, resolved_model, cfg_base_url, None, resolved_api_mode
if cfg_provider and cfg_provider != "auto":
return cfg_provider, resolved_model, None, None, resolved_api_mode
@@ -3526,7 +3614,7 @@ def call_llm(
except Exception as retry_err:
# If the max_tokens retry also hits a payment or connection
# error, fall through to the fallback chain below.
if not (_is_payment_error(retry_err) or _is_connection_error(retry_err)):
if not (_is_payment_error(retry_err) or _is_connection_error(retry_err) or _is_rate_limit_error(retry_err)):
raise
first_err = retry_err
@@ -3609,13 +3697,27 @@ def call_llm(
# Codex/OAuth tokens that authenticate but whose endpoint is down,
# and providers the user never configured that got picked up by
# the auto-detection chain.
should_fallback = _is_payment_error(first_err) or _is_connection_error(first_err)
#
# ── Rate-limit fallback (#13579) ─────────────────────────────
# When the provider returns a 429 rate-limit (not billing), fall
# back to an alternative provider instead of exhausting retries
# against the same rate-limited endpoint.
should_fallback = (
_is_payment_error(first_err)
or _is_connection_error(first_err)
or _is_rate_limit_error(first_err)
)
# Only try alternative providers when the user didn't explicitly
# configure this task's provider. Explicit provider = hard constraint;
# auto (the default) = best-effort fallback chain. (#7559)
is_auto = resolved_provider in ("auto", "", None)
if should_fallback and is_auto:
reason = "payment error" if _is_payment_error(first_err) else "connection error"
if _is_payment_error(first_err):
reason = "payment error"
elif _is_rate_limit_error(first_err):
reason = "rate limit"
else:
reason = "connection error"
logger.info("Auxiliary %s: %s on %s (%s), trying fallback",
task or "call", reason, resolved_provider, first_err)
fb_client, fb_model, fb_label = _try_payment_fallback(
@@ -3818,7 +3920,7 @@ async def async_call_llm(
except Exception as retry_err:
# If the max_tokens retry also hits a payment or connection
# error, fall through to the fallback chain below.
if not (_is_payment_error(retry_err) or _is_connection_error(retry_err)):
if not (_is_payment_error(retry_err) or _is_connection_error(retry_err) or _is_rate_limit_error(retry_err)):
raise
first_err = retry_err
@@ -3887,11 +3989,20 @@ async def async_call_llm(
return _validate_llm_response(
await retry_client.chat.completions.create(**retry_kwargs), task)
# ── Payment / connection fallback (mirrors sync call_llm) ─────
should_fallback = _is_payment_error(first_err) or _is_connection_error(first_err)
# ── Payment / connection / rate-limit fallback (mirrors sync call_llm) ──
should_fallback = (
_is_payment_error(first_err)
or _is_connection_error(first_err)
or _is_rate_limit_error(first_err)
)
is_auto = resolved_provider in ("auto", "", None)
if should_fallback and is_auto:
reason = "payment error" if _is_payment_error(first_err) else "connection error"
if _is_payment_error(first_err):
reason = "payment error"
elif _is_rate_limit_error(first_err):
reason = "rate limit"
else:
reason = "connection error"
logger.info("Auxiliary %s (async): %s on %s (%s), trying fallback",
task or "call", reason, resolved_provider, first_err)
fb_client, fb_model, fb_label = _try_payment_fallback(
+70 -8
View File
@@ -344,6 +344,7 @@ class ContextCompressor(ContextEngine):
self._last_aux_model_failure_model = None
self._last_compression_savings_pct = 100.0
self._ineffective_compression_count = 0
self._summary_failure_cooldown_until = 0.0 # transient errors must not block a fresh session
def update_model(
self,
@@ -553,7 +554,16 @@ class ContextCompressor(ContextEngine):
break
accumulated += msg_tokens
boundary = i
prune_boundary = max(boundary, len(result) - min_protect)
# Translate the budget walk into a "protected count", apply the
# floor in count-space (where `max` reads naturally: protect at
# least `min_protect` messages or whatever the budget reserved,
# whichever is more), then convert back to a prune boundary.
# Doing this in index-space with `max` would invert the direction
# (smaller index = MORE protected), so a generous budget would
# silently get truncated back down to `min_protect`.
budget_protect_count = len(result) - boundary
protected_count = max(budget_protect_count, min_protect)
prune_boundary = len(result) - protected_count
else:
prune_boundary = len(result) - protect_tail_count
@@ -590,6 +600,8 @@ class ContextCompressor(ContextEngine):
# Skip multimodal content (list of content blocks)
if isinstance(content, list):
continue
if not isinstance(content, str):
continue
if not content or content == _PRUNED_TOOL_PLACEHOLDER:
continue
# Skip already-deduplicated or previously-summarized results
@@ -905,15 +917,19 @@ The user has requested that this compaction PRIORITISE preserving all informatio
or "does not exist" in _err_str
or "no available channel" in _err_str
)
_is_timeout = (
_status in (408, 429, 502, 504)
or "timeout" in _err_str
)
if (
_is_model_not_found
(_is_model_not_found or _is_timeout)
and self.summary_model
and self.summary_model != self.model
and not getattr(self, "_summary_model_fallen_back", False)
):
self._summary_model_fallen_back = True
logging.warning(
"Summary model '%s' not available (%s). "
"Summary model '%s' unavailable (%s). "
"Falling back to main model '%s' for compression.",
self.summary_model, e, self.model,
)
@@ -977,15 +993,39 @@ The user has requested that this compaction PRIORITISE preserving all informatio
return None
@staticmethod
def _with_summary_prefix(summary: str) -> str:
"""Normalize summary text to the current compaction handoff format."""
def _strip_summary_prefix(summary: str) -> str:
"""Return summary body without the current or legacy handoff prefix."""
text = (summary or "").strip()
for prefix in (LEGACY_SUMMARY_PREFIX, SUMMARY_PREFIX):
for prefix in (SUMMARY_PREFIX, LEGACY_SUMMARY_PREFIX):
if text.startswith(prefix):
text = text[len(prefix):].lstrip()
break
return text[len(prefix):].lstrip()
return text
@classmethod
def _with_summary_prefix(cls, summary: str) -> str:
"""Normalize summary text to the current compaction handoff format."""
text = cls._strip_summary_prefix(summary)
return f"{SUMMARY_PREFIX}\n{text}" if text else SUMMARY_PREFIX
@staticmethod
def _is_context_summary_content(content: Any) -> bool:
text = _content_text_for_contains(content).lstrip()
return text.startswith(SUMMARY_PREFIX) or text.startswith(LEGACY_SUMMARY_PREFIX)
@classmethod
def _find_latest_context_summary(
cls,
messages: List[Dict[str, Any]],
start: int,
end: int,
) -> tuple[Optional[int], str]:
"""Find the newest handoff summary inside a compression window."""
for idx in range(end - 1, start - 1, -1):
content = messages[idx].get("content")
if cls._is_context_summary_content(content):
return idx, cls._strip_summary_prefix(_content_text_for_contains(content))
return None, ""
# ------------------------------------------------------------------
# Tool-call / tool-result pair integrity helpers
# ------------------------------------------------------------------
@@ -1292,6 +1332,15 @@ The user has requested that this compaction PRIORITISE preserving all informatio
return messages
turns_to_summarize = messages[compress_start:compress_end]
summary_idx, summary_body = self._find_latest_context_summary(
messages,
compress_start,
compress_end,
)
if summary_idx is not None:
if summary_body and not self._previous_summary:
self._previous_summary = summary_body
turns_to_summarize = messages[summary_idx + 1:compress_end]
if not self.quiet_mode:
logger.info(
@@ -1369,6 +1418,19 @@ The user has requested that this compaction PRIORITISE preserving all informatio
# Merge the summary into the first tail message instead
# of inserting a standalone message that breaks alternation.
_merge_summary_into_tail = True
# When the summary lands as a standalone role="user" message,
# weak models read the verbatim "## Active Task" quote of a past
# user request as fresh input (#11475, #14521). Append the explicit
# end marker — the same one used in the merge-into-tail path — so
# the model has a clear "summary above, not new input" signal.
if not _merge_summary_into_tail and summary_role == "user":
summary = (
summary
+ "\n\n--- END OF CONTEXT SUMMARY — "
"respond to the message below, not the summary above ---"
)
if not _merge_summary_into_tail:
compressed.append({"role": summary_role, "content": summary})
+38 -2
View File
@@ -55,6 +55,7 @@ class FailoverReason(enum.Enum):
thinking_signature = "thinking_signature" # Anthropic thinking block sig invalid
long_context_tier = "long_context_tier" # Anthropic "extra usage" tier gate
oauth_long_context_beta_forbidden = "oauth_long_context_beta_forbidden" # Anthropic OAuth subscription rejects 1M context beta — disable beta and retry
llama_cpp_grammar_pattern = "llama_cpp_grammar_pattern" # llama.cpp json-schema-to-grammar rejects regex escapes in `pattern` / `format` — strip from tools and retry
# Catch-all
unknown = "unknown" # Unclassifiable — retry with backoff
@@ -470,6 +471,31 @@ def classify_api_error(
should_compress=False,
)
# llama.cpp's ``json-schema-to-grammar`` converter (used by its OAI
# server to build GBNF tool-call parsers) rejects regex escape classes
# like ``\d``/``\w``/``\s`` and most ``format`` values. MCP servers
# routinely emit ``"pattern": "\\d{4}-\\d{2}-\\d{2}"`` for date/phone/
# email params. llama.cpp surfaces this as HTTP 400 with one of a few
# recognizable phrases; on match we strip ``pattern``/``format`` from
# ``self.tools`` in the retry loop and retry once. Cloud providers are
# unaffected — they accept these keywords and we never hit this branch.
if (
status_code == 400
and (
"error parsing grammar" in error_msg
or "json-schema-to-grammar" in error_msg
or (
"unable to generate parser" in error_msg
and "template" in error_msg
)
)
):
return _result(
FailoverReason.llama_cpp_grammar_pattern,
retryable=True,
should_compress=False,
)
# ── 2. HTTP status code classification ──────────────────────────
if status_code is not None:
@@ -520,7 +546,12 @@ def classify_api_error(
is_disconnect = any(p in error_msg for p in _SERVER_DISCONNECT_PATTERNS)
if is_disconnect and not status_code:
is_large = approx_tokens > context_length * 0.6 or approx_tokens > 120000 or num_messages > 200
# Absolute token/message-count thresholds are only a proxy for smaller
# context windows. Large-context sessions can have hundreds of
# messages while still being far below their actual token budget.
is_large = approx_tokens > context_length * 0.6 or (
context_length <= 256000 and (approx_tokens > 120000 or num_messages > 200)
)
if is_large:
return _result(
FailoverReason.context_overflow,
@@ -766,7 +797,12 @@ def _classify_400(
if not err_body_msg:
err_body_msg = str(body.get("message") or "").strip().lower()
is_generic = len(err_body_msg) < 30 or err_body_msg in ("error", "")
is_large = approx_tokens > context_length * 0.4 or approx_tokens > 80000 or num_messages > 80
# Absolute token/message-count thresholds are only a proxy for smaller
# context windows. Large-context sessions can have many messages while
# still being far below their actual token budget.
is_large = approx_tokens > context_length * 0.4 or (
context_length <= 256000 and (approx_tokens > 80000 or num_messages > 80)
)
if is_generic and is_large:
return result_fn(
+15 -1
View File
@@ -679,7 +679,21 @@ def translate_stream_event(event: Dict[str, Any], model: str, tool_call_indices:
finish_reason_raw = str(cand.get("finishReason") or "")
if finish_reason_raw:
mapped = "tool_calls" if tool_call_indices else _map_gemini_finish_reason(finish_reason_raw)
chunks.append(_make_stream_chunk(model=model, finish_reason=mapped))
finish_chunk = _make_stream_chunk(model=model, finish_reason=mapped)
# Attach usage from this event's usageMetadata so the streaming
# loop in run_agent.py can record token counts (mirrors the
# non-streaming path in translate_gemini_response).
usage_meta = event.get("usageMetadata") or {}
if usage_meta:
finish_chunk.usage = SimpleNamespace(
prompt_tokens=int(usage_meta.get("promptTokenCount") or 0),
completion_tokens=int(usage_meta.get("candidatesTokenCount") or 0),
total_tokens=int(usage_meta.get("totalTokenCount") or 0),
prompt_tokens_details=SimpleNamespace(
cached_tokens=int(usage_meta.get("cachedContentTokenCount") or 0),
),
)
chunks.append(finish_chunk)
return chunks
+15 -2
View File
@@ -489,16 +489,29 @@ def save_credentials(creds: GoogleCredentials) -> Path:
"""Atomically write creds to disk with 0o600 permissions."""
path = _credentials_path()
path.parent.mkdir(parents=True, exist_ok=True)
# Tighten parent dir to 0o700 so siblings can't traverse to the creds file.
# On Windows this is a no-op (POSIX mode bits aren't enforced); ignore failures.
try:
os.chmod(path.parent, 0o700)
except OSError:
pass
payload = json.dumps(creds.to_dict(), indent=2, sort_keys=True) + "\n"
with _credentials_lock():
tmp_path = path.with_suffix(f".tmp.{os.getpid()}.{secrets.token_hex(4)}")
try:
with open(tmp_path, "w", encoding="utf-8") as fh:
# Create with 0o600 atomically to close the TOCTOU window where the
# default umask (often 0o644) would briefly expose tokens to other
# local users between open() and chmod().
fd = os.open(
str(tmp_path),
os.O_WRONLY | os.O_CREAT | os.O_EXCL,
stat.S_IRUSR | stat.S_IWUSR,
)
with os.fdopen(fd, "w", encoding="utf-8") as fh:
fh.write(payload)
fh.flush()
os.fsync(fh.fileno())
os.chmod(tmp_path, stat.S_IRUSR | stat.S_IWUSR)
atomic_replace(tmp_path, path)
finally:
try:
+230
View File
@@ -0,0 +1,230 @@
"""Lightweight internationalization (i18n) for Hermes static user-facing messages.
Scope (thin slice, by design): only the highest-impact static strings shown
to the user by Hermes itself -- approval prompts, a handful of gateway slash
command replies, restart-drain notices. Agent-generated output, log lines,
error tracebacks, tool outputs, and slash-command descriptions all stay in
English.
Catalog files live under ``locales/<lang>.yaml`` at the repo root. Each
catalog is a flat dict keyed by dotted paths (e.g. ``approval.choose`` or
``gateway.approval_expired``). Missing keys fall back to English; if English
is missing too, the key path itself is returned so a broken catalog never
crashes the agent.
Usage::
from agent.i18n import t
print(t("approval.choose_long")) # current lang
print(t("gateway.draining", count=3)) # {count} formatted
print(t("approval.choose_long", lang="zh")) # explicit override
Language resolution order:
1. Explicit ``lang=`` argument passed to :func:`t`
2. ``HERMES_LANGUAGE`` environment variable (for tests / quick override)
3. ``display.language`` from config.yaml
4. ``"en"`` (baseline)
Supported languages: en, zh, ja, de, es. Unknown values fall back to en.
"""
from __future__ import annotations
import logging
import os
import threading
from functools import lru_cache
from pathlib import Path
from typing import Any
logger = logging.getLogger(__name__)
SUPPORTED_LANGUAGES: tuple[str, ...] = ("en", "zh", "ja", "de", "es")
DEFAULT_LANGUAGE = "en"
# Accept a few natural aliases so users who type "chinese" / "zh-CN" / "jp"
# get the right catalog instead of silently falling back to English.
_LANGUAGE_ALIASES: dict[str, str] = {
"english": "en", "en-us": "en", "en-gb": "en",
"chinese": "zh", "mandarin": "zh", "zh-cn": "zh", "zh-tw": "zh", "zh-hans": "zh", "zh-hant": "zh",
"japanese": "ja", "jp": "ja", "ja-jp": "ja",
"german": "de", "deutsch": "de", "de-de": "de",
"spanish": "es", "español": "es", "espanol": "es", "es-es": "es", "es-mx": "es",
}
_catalog_cache: dict[str, dict[str, str]] = {}
_catalog_lock = threading.Lock()
def _locales_dir() -> Path:
"""Return the directory containing locale YAML files.
Lives next to the repo root so both the bundled install and editable
checkouts find it without PYTHONPATH gymnastics.
"""
# agent/i18n.py -> agent/ -> repo root
return Path(__file__).resolve().parent.parent / "locales"
def _normalize_lang(value: Any) -> str:
"""Normalize a user-supplied language value to a supported code.
Accepts supported codes directly, common aliases (``chinese`` -> ``zh``),
and case-insensitive regional tags (``zh-CN`` -> ``zh``). Returns the
default language for unknown values.
"""
if not isinstance(value, str):
return DEFAULT_LANGUAGE
key = value.strip().lower()
if not key:
return DEFAULT_LANGUAGE
if key in SUPPORTED_LANGUAGES:
return key
if key in _LANGUAGE_ALIASES:
return _LANGUAGE_ALIASES[key]
# Try stripping a region suffix (e.g. "pt-br" -> "pt" won't be supported,
# but "zh-CN" -> "zh" will).
base = key.split("-", 1)[0]
if base in SUPPORTED_LANGUAGES:
return base
return DEFAULT_LANGUAGE
def _load_catalog(lang: str) -> dict[str, str]:
"""Load and flatten one locale YAML file into a dotted-key dict.
YAML files can be nested for human readability; this produces the flat
key space :func:`t` expects. Cached per-language for the process.
"""
with _catalog_lock:
cached = _catalog_cache.get(lang)
if cached is not None:
return cached
path = _locales_dir() / f"{lang}.yaml"
if not path.is_file():
logger.debug("i18n catalog missing for %s at %s", lang, path)
with _catalog_lock:
_catalog_cache[lang] = {}
return {}
try:
import yaml # PyYAML is already a hermes dependency
with path.open("r", encoding="utf-8") as f:
raw = yaml.safe_load(f) or {}
except Exception as exc:
logger.warning("Failed to load i18n catalog %s: %s", path, exc)
with _catalog_lock:
_catalog_cache[lang] = {}
return {}
flat: dict[str, str] = {}
_flatten_into(raw, "", flat)
with _catalog_lock:
_catalog_cache[lang] = flat
return flat
def _flatten_into(node: Any, prefix: str, out: dict[str, str]) -> None:
if isinstance(node, dict):
for key, value in node.items():
child_key = f"{prefix}.{key}" if prefix else str(key)
_flatten_into(value, child_key, out)
elif isinstance(node, str):
out[prefix] = node
# Non-string, non-dict leaves are ignored -- catalogs are text-only.
@lru_cache(maxsize=1)
def _config_language_cached() -> str | None:
"""Read ``display.language`` from config.yaml once per process.
Cached because ``t()`` is called in hot paths (every approval prompt,
every gateway reply) and re-reading YAML each call would be wasteful.
``reset_language_cache()`` clears this when config changes at runtime
(e.g. after the setup wizard).
"""
try:
from hermes_cli.config import load_config
cfg = load_config()
lang = (cfg.get("display") or {}).get("language")
if lang:
return _normalize_lang(lang)
except Exception as exc:
logger.debug("Could not read display.language from config: %s", exc)
return None
def reset_language_cache() -> None:
"""Invalidate cached language resolution and catalogs.
Call after :func:`hermes_cli.config.save_config` if a running process
needs to pick up a changed ``display.language`` without restart.
"""
_config_language_cached.cache_clear()
with _catalog_lock:
_catalog_cache.clear()
def get_language() -> str:
"""Resolve the active language using env > config > default order."""
env_lang = os.environ.get("HERMES_LANGUAGE")
if env_lang:
return _normalize_lang(env_lang)
cfg_lang = _config_language_cached()
if cfg_lang:
return cfg_lang
return DEFAULT_LANGUAGE
def t(key: str, lang: str | None = None, **format_kwargs: Any) -> str:
"""Translate a dotted key to the active language.
Parameters
----------
key
Dotted path into the catalog, e.g. ``"approval.choose_long"``.
lang
Explicit language override. Takes precedence over env + config.
**format_kwargs
``str.format`` substitution arguments (``t("gateway.drain", count=3)``
expects a catalog entry with a ``{count}`` placeholder).
Returns
-------
The translated string, or the English fallback if the key is missing in
the target language, or the bare key if English is also missing.
"""
target = _normalize_lang(lang) if lang else get_language()
catalog = _load_catalog(target)
value = catalog.get(key)
if value is None and target != DEFAULT_LANGUAGE:
# Fall through to English rather than showing a key path to the user.
value = _load_catalog(DEFAULT_LANGUAGE).get(key)
if value is None:
# Last-ditch: return the key itself. A broken catalog should not
# crash anything; it just looks ugly until someone fixes it.
logger.debug("i18n miss: key=%r lang=%r", key, target)
value = key
if format_kwargs:
try:
return value.format(**format_kwargs)
except (KeyError, IndexError, ValueError) as exc:
logger.warning(
"i18n format failed for key=%r lang=%r kwargs=%r: %s",
key, target, format_kwargs, exc,
)
return value
return value
__all__ = [
"SUPPORTED_LANGUAGES",
"DEFAULT_LANGUAGE",
"t",
"get_language",
"reset_language_cache",
]
+3 -6
View File
@@ -1,17 +1,14 @@
"""MemoryManager — orchestrates the built-in memory provider plus at most
ONE external plugin memory provider.
"""MemoryManager — orchestrates memory providers for the agent.
Single integration point in run_agent.py. Replaces scattered per-backend
code with one manager that delegates to registered providers.
The BuiltinMemoryProvider is always registered first and cannot be removed.
Only ONE external (non-builtin) provider is allowed at a time attempting
to register a second external provider is rejected with a warning. This
Only ONE external plugin provider is allowed at a time attempting to
register a second external provider is rejected with a warning. This
prevents tool schema bloat and conflicting memory backends.
Usage in run_agent.py:
self._memory_manager = MemoryManager()
self._memory_manager.add_provider(BuiltinMemoryProvider(...))
# Only ONE of these:
self._memory_manager.add_provider(plugin_provider)
+8 -9
View File
@@ -1,17 +1,16 @@
"""Abstract base class for pluggable memory providers.
Memory providers give the agent persistent recall across sessions. One
external provider is active at a time alongside the always-on built-in
memory (MEMORY.md / USER.md). The MemoryManager enforces this limit.
Memory providers give the agent persistent recall across sessions.
The MemoryManager enforces a one-external-provider limit to prevent
tool schema bloat and conflicting memory backends.
Built-in memory is always active as the first provider and cannot be removed.
External providers (Honcho, Hindsight, Mem0, etc.) are additive they never
disable the built-in store. Only one external provider runs at a time to
prevent tool schema bloat and conflicting memory backends.
External providers (Honcho, Hindsight, Mem0, etc.) are registered
and managed via MemoryManager. Only one external provider runs at a
time.
Registration:
1. Built-in: BuiltinMemoryProvider always present, not removable.
2. Plugins: Ship in plugins/memory/<name>/, activated by memory.provider config.
Plugins ship in plugins/memory/<name>/ and are activated via
the memory.provider config key.
Lifecycle (called by MemoryManager, wired in run_agent.py):
initialize() connect, create resources, warm up
+11
View File
@@ -318,6 +318,17 @@ _URL_TO_PROVIDER: Dict[str, str] = {
"ollama.com": "ollama-cloud",
}
# Auto-extend with hostnames derived from provider profiles.
# Any provider with a base_url not already in the map gets added automatically.
try:
from providers import list_providers as _list_providers
for _pp in _list_providers():
_host = _pp.get_hostname()
if _host and _host not in _URL_TO_PROVIDER:
_URL_TO_PROVIDER[_host] = _pp.name
except Exception:
pass
def _infer_provider_from_url(base_url: str) -> Optional[str]:
"""Infer the models.dev provider name from a base URL.
+6
View File
@@ -513,6 +513,12 @@ PLATFORM_HINTS = {
"image and is the WRONG path. Bare Unicode emoji in text is also not a substitute "
"— when a sticker is the right response, use yb_send_sticker."
),
"api_server": (
"You're responding through an API server. The rendering layer is unknown — "
"assume plain text. No markdown formatting (no asterisks, bullets, headers, "
"code fences). Treat this like a conversation, not a document. Keep responses "
"brief and natural."
),
}
# ---------------------------------------------------------------------------
+17 -11
View File
@@ -305,13 +305,18 @@ def _redact_form_body(text: str) -> str:
return _redact_query_string(text.strip())
def redact_sensitive_text(text: str, *, force: bool = False) -> str:
def redact_sensitive_text(text: str, *, force: bool = False, code_file: bool = False) -> str:
"""Apply all redaction patterns to a block of text.
Safe to call on any string -- non-matching text passes through unchanged.
Disabled by default enable via security.redact_secrets: true in config.yaml.
Set force=True for safety boundaries that must never return raw secrets
regardless of the user's global logging redaction preference.
Set code_file=True to skip the ENV-assignment and JSON-field regex
patterns when the text is known to be source code (e.g. MAX_TOKENS=***
constants, "apiKey": "test" fixtures). Prefix patterns, auth headers,
private keys, DB connstrings, JWTs, and URL secrets are still redacted.
"""
if text is None:
return None
@@ -325,17 +330,18 @@ def redact_sensitive_text(text: str, *, force: bool = False) -> str:
# Known prefixes (sk-, ghp_, etc.)
text = _PREFIX_RE.sub(lambda m: _mask_token(m.group(1)), text)
# ENV assignments: OPENAI_API_KEY=sk-abc...
def _redact_env(m):
name, quote, value = m.group(1), m.group(2), m.group(3)
return f"{name}={quote}{_mask_token(value)}{quote}"
text = _ENV_ASSIGN_RE.sub(_redact_env, text)
# ENV assignments: OPENAI_API_KEY=*** (skip for code files — false positives)
if not code_file:
def _redact_env(m):
name, quote, value = m.group(1), m.group(2), m.group(3)
return f"{name}={quote}{_mask_token(value)}{quote}"
text = _ENV_ASSIGN_RE.sub(_redact_env, text)
# JSON fields: "apiKey": "value"
def _redact_json(m):
key, value = m.group(1), m.group(2)
return f'{key}: "{_mask_token(value)}"'
text = _JSON_FIELD_RE.sub(_redact_json, text)
# JSON fields: "apiKey": "***" (skip for code files — false positives)
def _redact_json(m):
key, value = m.group(1), m.group(2)
return f'{key}: "{_mask_token(value)}"'
text = _JSON_FIELD_RE.sub(_redact_json, text)
# Authorization headers
text = _AUTH_HEADER_RE.sub(
+386
View File
@@ -0,0 +1,386 @@
"""Stateful scrubber for reasoning/thinking blocks in streamed assistant text.
``run_agent._strip_think_blocks`` is regex-based and correct for a complete
string, but when it runs *per-delta* in ``_fire_stream_delta`` it destroys
the state that downstream consumers (CLI ``_stream_delta``, gateway
``GatewayStreamConsumer._filter_and_accumulate``) rely on.
Concretely, when MiniMax-M2.7 streams
delta1 = "<think>"
delta2 = "Let me check their config"
delta3 = "</think>"
the per-delta regex erases delta1 entirely (case 2: unterminated-open at
boundary matches ``^<think>...``), so the downstream state machine never
sees the open tag, treats delta2 as regular content, and leaks reasoning
to the user. Consumers that don't run their own state machine (ACP,
api_server, TTS) never had any defence at all they just emitted
whatever survived the upstream regex.
This module centralises the tag-suppression state machine at the
upstream layer so every stream_delta_callback sees text that has
already had reasoning blocks removed. Partial tags at delta
boundaries are held back until the next delta resolves them, and
end-of-stream flushing surfaces any held-back prose that turned out
not to be a real tag.
Usage::
scrubber = StreamingThinkScrubber()
for delta in stream:
visible = scrubber.feed(delta)
if visible:
emit(visible)
tail = scrubber.flush() # at end of stream
if tail:
emit(tail)
The scrubber is re-entrant per agent instance. Call ``reset()`` at
the top of each new turn so a hung block from an interrupted prior
stream cannot taint the next turn's output.
Tag variants handled (case-insensitive):
``<think>``, ``<thinking>``, ``<reasoning>``, ``<thought>``,
``<REASONING_SCRATCHPAD>``.
Block-boundary rule for opens: an opening tag is only treated as a
reasoning-block opener when it appears at the start of the stream,
after a newline (optionally followed by whitespace), or when only
whitespace has been emitted on the current line. This prevents prose
that *mentions* the tag name (e.g. ``"use <think> tags here"``) from
being incorrectly suppressed. Closed pairs (``<think>X</think>``) are
always suppressed regardless of boundary; a closed pair is an
intentional, bounded construct.
"""
from __future__ import annotations
from typing import Tuple
__all__ = ["StreamingThinkScrubber"]
class StreamingThinkScrubber:
"""Stateful scrubber for streaming reasoning/thinking blocks.
State machine:
- ``_in_block``: True while inside an opened block, waiting for
a close tag. All text inside is discarded.
- ``_buf``: held-back partial-tag tail. Emitted / discarded on
the next ``feed()`` call or by ``flush()``.
- ``_last_emitted_ended_newline``: True iff the most recent
emission to the consumer ended with ``\\n``, or nothing has
been emitted yet (start-of-stream counts as a boundary). Used
to decide whether an open tag at buffer position 0 is at a
block boundary.
"""
_OPEN_TAG_NAMES: Tuple[str, ...] = (
"think",
"thinking",
"reasoning",
"thought",
"REASONING_SCRATCHPAD",
)
# Materialise literal tag strings so the hot path does string
# operations, not regex compilation per feed().
_OPEN_TAGS: Tuple[str, ...] = tuple(f"<{name}>" for name in _OPEN_TAG_NAMES)
_CLOSE_TAGS: Tuple[str, ...] = tuple(f"</{name}>" for name in _OPEN_TAG_NAMES)
# Pre-compute the longest tag (for partial-tag hold-back bound).
_MAX_TAG_LEN: int = max(len(tag) for tag in _OPEN_TAGS + _CLOSE_TAGS)
def __init__(self) -> None:
self._in_block: bool = False
self._buf: str = ""
self._last_emitted_ended_newline: bool = True
def reset(self) -> None:
"""Reset all state. Call at the top of every new turn."""
self._in_block = False
self._buf = ""
self._last_emitted_ended_newline = True
def feed(self, text: str) -> str:
"""Feed one delta; return the scrubbed visible portion.
May return an empty string when the entire delta is reasoning
content or is being held back pending resolution of a partial
tag at the boundary.
"""
if not text:
return ""
buf = self._buf + text
self._buf = ""
out: list[str] = []
while buf:
if self._in_block:
# Hunt for the earliest close tag.
close_idx, close_len = self._find_first_tag(
buf, self._CLOSE_TAGS,
)
if close_idx == -1:
# No close yet — hold back a potential partial
# close-tag prefix; discard everything else.
held = self._max_partial_suffix(buf, self._CLOSE_TAGS)
self._buf = buf[-held:] if held else ""
return "".join(out)
# Found close: discard block content + tag, continue.
buf = buf[close_idx + close_len:]
self._in_block = False
else:
# Priority 1 — closed <tag>X</tag> pair anywhere in
# buf. Closed pairs are always an intentional,
# bounded construct (even mid-line prose containing
# an open/close pair is almost certainly a model
# leaking reasoning inline), so no boundary gating.
pair = self._find_earliest_closed_pair(buf)
# Priority 2 — unterminated open tag at a block
# boundary. Boundary-gated so prose that mentions
# '<think>' isn't over-stripped.
open_idx, open_len = self._find_open_at_boundary(
buf, out,
)
# Pick whichever match comes earliest in the buffer.
if pair is not None and (
open_idx == -1 or pair[0] <= open_idx
):
start_idx, end_idx = pair
preceding = buf[:start_idx]
if preceding:
preceding = self._strip_orphan_close_tags(preceding)
if preceding:
out.append(preceding)
self._last_emitted_ended_newline = (
preceding.endswith("\n")
)
buf = buf[end_idx:]
continue
if open_idx != -1:
# Unterminated open at boundary — emit preceding,
# enter block, continue loop with remainder.
preceding = buf[:open_idx]
if preceding:
preceding = self._strip_orphan_close_tags(preceding)
if preceding:
out.append(preceding)
self._last_emitted_ended_newline = (
preceding.endswith("\n")
)
self._in_block = True
buf = buf[open_idx + open_len:]
continue
# No resolvable tag structure in buf. Hold back any
# partial-tag prefix at the tail so a split tag
# across deltas isn't missed, then emit the rest.
held = self._max_partial_suffix(buf, self._OPEN_TAGS)
held_close = self._max_partial_suffix(
buf, self._CLOSE_TAGS,
)
held = max(held, held_close)
if held:
emit_text = buf[:-held]
self._buf = buf[-held:]
else:
emit_text = buf
self._buf = ""
if emit_text:
emit_text = self._strip_orphan_close_tags(emit_text)
if emit_text:
out.append(emit_text)
self._last_emitted_ended_newline = (
emit_text.endswith("\n")
)
return "".join(out)
return "".join(out)
def flush(self) -> str:
"""End-of-stream flush.
If still inside an unterminated block, held-back content is
discarded leaking partial reasoning is worse than a
truncated answer. Otherwise the held-back partial-tag tail is
emitted verbatim (it turned out not to be a real tag prefix).
"""
if self._in_block:
self._buf = ""
self._in_block = False
return ""
tail = self._buf
self._buf = ""
if not tail:
return ""
tail = self._strip_orphan_close_tags(tail)
if tail:
self._last_emitted_ended_newline = tail.endswith("\n")
return tail
# ── internal helpers ───────────────────────────────────────────────
@staticmethod
def _find_first_tag(
buf: str, tags: Tuple[str, ...],
) -> Tuple[int, int]:
"""Return (earliest_index, tag_length) over *tags*, or (-1, 0).
Case-insensitive match.
"""
buf_lower = buf.lower()
best_idx = -1
best_len = 0
for tag in tags:
idx = buf_lower.find(tag.lower())
if idx != -1 and (best_idx == -1 or idx < best_idx):
best_idx = idx
best_len = len(tag)
return best_idx, best_len
def _find_earliest_closed_pair(self, buf: str):
"""Return (start_idx, end_idx) of the earliest closed pair, else None.
A closed pair is ``<tag>...</tag>`` of any variant. Matches are
case-insensitive and non-greedy (the closest close tag after
an open tag wins), matching the regex ``<tag>.*?</tag>``
semantics of ``_strip_think_blocks`` case 1. When two tag
variants could both match, the one whose open tag appears
earlier wins.
"""
buf_lower = buf.lower()
best: "tuple[int, int] | None" = None
for open_tag, close_tag in zip(self._OPEN_TAGS, self._CLOSE_TAGS):
open_lower = open_tag.lower()
close_lower = close_tag.lower()
open_idx = buf_lower.find(open_lower)
if open_idx == -1:
continue
close_idx = buf_lower.find(
close_lower, open_idx + len(open_lower),
)
if close_idx == -1:
continue
end_idx = close_idx + len(close_lower)
if best is None or open_idx < best[0]:
best = (open_idx, end_idx)
return best
def _find_open_at_boundary(
self, buf: str, already_emitted: list[str],
) -> Tuple[int, int]:
"""Return the earliest block-boundary open-tag (idx, len).
Returns (-1, 0) if no boundary-legal opener is present.
"""
buf_lower = buf.lower()
best_idx = -1
best_len = 0
for tag in self._OPEN_TAGS:
tag_lower = tag.lower()
search_start = 0
while True:
idx = buf_lower.find(tag_lower, search_start)
if idx == -1:
break
if self._is_block_boundary(buf, idx, already_emitted):
if best_idx == -1 or idx < best_idx:
best_idx = idx
best_len = len(tag)
break # first boundary hit for this tag is enough
search_start = idx + 1
return best_idx, best_len
def _is_block_boundary(
self, buf: str, idx: int, already_emitted: list[str],
) -> bool:
"""True iff position *idx* in *buf* is a block boundary.
A block boundary is:
- buf position 0 AND the most recent emission ended with
a newline (or nothing has been emitted yet)
- any position whose preceding text on the current line
(since the last newline in buf) is whitespace-only, AND
if there is no newline in the preceding buf portion, the
most recent prior emission ended with a newline
"""
if idx == 0:
# Check whether the last already-emitted chunk in THIS
# feed() call ended with a newline, otherwise fall back
# to the cross-feed flag.
if already_emitted:
return already_emitted[-1].endswith("\n")
return self._last_emitted_ended_newline
preceding = buf[:idx]
last_nl = preceding.rfind("\n")
if last_nl == -1:
# No newline in buf before the tag — boundary only if the
# prior emission ended with a newline AND everything since
# is whitespace.
if already_emitted:
prior_newline = already_emitted[-1].endswith("\n")
else:
prior_newline = self._last_emitted_ended_newline
return prior_newline and preceding.strip() == ""
# Newline present — text between it and the tag must be
# whitespace-only.
return preceding[last_nl + 1:].strip() == ""
@classmethod
def _max_partial_suffix(
cls, buf: str, tags: Tuple[str, ...],
) -> int:
"""Return the longest buf-suffix that is a prefix of any tag.
Only prefixes strictly shorter than the tag itself count
(full-length suffixes are the tag and are handled as matches,
not held-back partials). Case-insensitive.
"""
if not buf:
return 0
buf_lower = buf.lower()
max_check = min(len(buf_lower), cls._MAX_TAG_LEN - 1)
for i in range(max_check, 0, -1):
suffix = buf_lower[-i:]
for tag in tags:
tag_lower = tag.lower()
if len(tag_lower) > i and tag_lower.startswith(suffix):
return i
return 0
@classmethod
def _strip_orphan_close_tags(cls, text: str) -> str:
"""Remove any close tags from *text* (orphan-close handling).
An orphan close tag has no matching open in the current
scrubber state; it's always noise, stripped with any trailing
whitespace so the surrounding prose flows naturally.
"""
if "</" not in text:
return text
text_lower = text.lower()
out: list[str] = []
i = 0
while i < len(text):
matched = False
if text_lower[i:i + 2] == "</":
for tag in cls._CLOSE_TAGS:
tag_lower = tag.lower()
tag_len = len(tag_lower)
if text_lower[i:i + tag_len] == tag_lower:
# Skip the tag and any trailing whitespace,
# matching _strip_think_blocks case 3.
j = i + tag_len
while j < len(text) and text[j] in " \t\n\r":
j += 1
i = j
matched = True
break
if not matched:
out.append(text[i])
i += 1
return "".join(out)
+13 -1
View File
@@ -17,6 +17,7 @@ logger = logging.getLogger(__name__)
# so silent-drops (e.g. OpenRouter 402 exhausting the fallback chain)
# become visible instead of piling up as NULL session titles.
FailureCallback = Callable[[str, BaseException], None]
TitleCallback = Callable[[str], None]
_TITLE_PROMPT = (
"Generate a short, descriptive title (3-7 words) for a conversation that starts with the "
@@ -90,6 +91,7 @@ def auto_title_session(
assistant_response: str,
failure_callback: Optional[FailureCallback] = None,
main_runtime: dict = None,
title_callback: Optional[TitleCallback] = None,
) -> None:
"""Generate and set a session title if one doesn't already exist.
@@ -119,6 +121,11 @@ def auto_title_session(
try:
session_db.set_session_title(session_id, title)
logger.debug("Auto-generated session title: %s", title)
if title_callback is not None:
try:
title_callback(title)
except Exception:
logger.debug("Auto-title callback failed", exc_info=True)
except Exception as e:
logger.debug("Failed to set auto-generated title: %s", e)
@@ -131,6 +138,7 @@ def maybe_auto_title(
conversation_history: list,
failure_callback: Optional[FailureCallback] = None,
main_runtime: dict = None,
title_callback: Optional[TitleCallback] = None,
) -> None:
"""Fire-and-forget title generation after the first exchange.
@@ -152,7 +160,11 @@ def maybe_auto_title(
thread = threading.Thread(
target=auto_title_session,
args=(session_db, session_id, user_message, assistant_response),
kwargs={"failure_callback": failure_callback, "main_runtime": main_runtime},
kwargs={
"failure_callback": failure_callback,
"main_runtime": main_runtime,
"title_callback": title_callback,
},
daemon=True,
name="auto-title",
)
+13 -1
View File
@@ -6,9 +6,16 @@ Usage:
result = transport.normalize_response(raw_response)
"""
from agent.transports.types import NormalizedResponse, ToolCall, Usage, build_tool_call, map_finish_reason # noqa: F401
from agent.transports.types import (
NormalizedResponse,
ToolCall,
Usage,
build_tool_call,
map_finish_reason,
) # noqa: F401
_REGISTRY: dict = {}
_discovered: bool = False
def register_transport(api_mode: str, transport_cls: type) -> None:
@@ -23,6 +30,9 @@ def get_transport(api_mode: str):
This allows gradual migration call sites can check for None
and fall back to the legacy code path.
"""
global _discovered
if not _discovered:
_discover_transports()
cls = _REGISTRY.get(api_mode)
if cls is None:
# The registry can be partially populated when a specific transport
@@ -38,6 +48,8 @@ def get_transport(api_mode: str):
def _discover_transports() -> None:
"""Import all transport modules to trigger auto-registration."""
global _discovered
_discovered = True
try:
import agent.transports.anthropic # noqa: F401
except ImportError:
+158 -90
View File
@@ -109,7 +109,9 @@ class ChatCompletionsTransport(ProviderTransport):
def api_mode(self) -> str:
return "chat_completions"
def convert_messages(self, messages: List[Dict[str, Any]], **kwargs) -> List[Dict[str, Any]]:
def convert_messages(
self, messages: list[dict[str, Any]], **kwargs
) -> list[dict[str, Any]]:
"""Messages are already in OpenAI format — sanitize Codex leaks only.
Strips Codex Responses API fields (``codex_reasoning_items`` /
@@ -126,7 +128,9 @@ class ChatCompletionsTransport(ProviderTransport):
tool_calls = msg.get("tool_calls")
if isinstance(tool_calls, list):
for tc in tool_calls:
if isinstance(tc, dict) and ("call_id" in tc or "response_item_id" in tc):
if isinstance(tc, dict) and (
"call_id" in tc or "response_item_id" in tc
):
needs_sanitize = True
break
if needs_sanitize:
@@ -149,39 +153,41 @@ class ChatCompletionsTransport(ProviderTransport):
tc.pop("response_item_id", None)
return sanitized
def convert_tools(self, tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
def convert_tools(self, tools: list[dict[str, Any]]) -> list[dict[str, Any]]:
"""Tools are already in OpenAI format — identity."""
return tools
def build_kwargs(
self,
model: str,
messages: List[Dict[str, Any]],
tools: Optional[List[Dict[str, Any]]] = None,
messages: list[dict[str, Any]],
tools: list[dict[str, Any]] | None = None,
**params,
) -> Dict[str, Any]:
) -> dict[str, Any]:
"""Build chat.completions.create() kwargs.
This is the most complex transport method it handles ~16 providers
via params rather than subclasses.
params:
params (all optional):
timeout: float API call timeout
max_tokens: int | None user-configured max tokens
ephemeral_max_output_tokens: int | None one-shot override (error recovery)
ephemeral_max_output_tokens: int | None one-shot override
max_tokens_param_fn: callable returns {max_tokens: N} or {max_completion_tokens: N}
reasoning_config: dict | None
request_overrides: dict | None
session_id: str | None
qwen_session_metadata: dict | None {sessionId, promptId} precomputed
model_lower: str lowercase model name for pattern matching
# Provider detection flags (all optional, default False)
# Provider profile path (all per-provider quirks live in providers/)
provider_profile: ProviderProfile | None when present, delegates to
_build_kwargs_from_profile(); all flag params below are bypassed.
# Legacy-path flags — only used when provider_profile is None
# (i.e. custom / unregistered providers). Known providers all go
# through provider_profile.
is_openrouter: bool
is_nous: bool
is_qwen_portal: bool
is_github_models: bool
is_nvidia_nim: bool
is_kimi: bool
is_tokenhub: bool
is_lmstudio: bool
is_custom_provider: bool
ollama_num_ctx: int | None
@@ -190,6 +196,7 @@ class ChatCompletionsTransport(ProviderTransport):
# Qwen-specific
qwen_prepare_fn: callable | None runs AFTER codex sanitization
qwen_prepare_inplace_fn: callable | None in-place variant for deepcopied lists
qwen_session_metadata: dict | None
# Temperature
fixed_temperature: Any from _fixed_temperature_for_model()
omit_temperature: bool
@@ -199,28 +206,21 @@ class ChatCompletionsTransport(ProviderTransport):
lmstudio_reasoning_options: list[str] | None # raw allowed_options from /api/v1/models
# Claude on OpenRouter/Nous max output
anthropic_max_output: int | None
# Extra
extra_body_additions: dict | None pre-built extra_body entries
extra_body_additions: dict | None
"""
# Codex sanitization: drop reasoning_items / call_id / response_item_id
sanitized = self.convert_messages(messages)
# Qwen portal prep AFTER codex sanitization. If sanitize already
# deepcopied, reuse that copy via the in-place variant to avoid a
# second deepcopy.
is_qwen = params.get("is_qwen_portal", False)
if is_qwen:
qwen_prep = params.get("qwen_prepare_fn")
qwen_prep_inplace = params.get("qwen_prepare_inplace_fn")
if sanitized is messages:
if qwen_prep is not None:
sanitized = qwen_prep(sanitized)
else:
# Already deepcopied — transform in place
if qwen_prep_inplace is not None:
qwen_prep_inplace(sanitized)
elif qwen_prep is not None:
sanitized = qwen_prep(sanitized)
# ── Provider profile: single-path when present ──────────────────
_profile = params.get("provider_profile")
if _profile:
return self._build_kwargs_from_profile(
_profile, model, sanitized, tools, params
)
# ── Legacy fallback (unregistered / unknown provider) ───────────
# Reached only when get_provider_profile() returned None.
# Known providers always go through the profile path above.
# Developer role swap for GPT-5/Codex models
model_lower = params.get("model_lower", (model or "").lower())
@@ -233,7 +233,7 @@ class ChatCompletionsTransport(ProviderTransport):
sanitized = list(sanitized)
sanitized[0] = {**sanitized[0], "role": "developer"}
api_kwargs: Dict[str, Any] = {
api_kwargs: dict[str, Any] = {
"model": model,
"messages": sanitized,
}
@@ -242,19 +242,6 @@ class ChatCompletionsTransport(ProviderTransport):
if timeout is not None:
api_kwargs["timeout"] = timeout
# Temperature
fixed_temp = params.get("fixed_temperature")
omit_temp = params.get("omit_temperature", False)
if omit_temp:
api_kwargs.pop("temperature", None)
elif fixed_temp is not None:
api_kwargs["temperature"] = fixed_temp
# Qwen metadata (caller precomputes {sessionId, promptId})
qwen_meta = params.get("qwen_session_metadata")
if qwen_meta and is_qwen:
api_kwargs["metadata"] = qwen_meta
# Tools
if tools:
# Moonshot/Kimi uses a stricter flavored JSON Schema. Rewriting
@@ -278,13 +265,6 @@ class ChatCompletionsTransport(ProviderTransport):
api_kwargs.update(max_tokens_fn(ephemeral))
elif max_tokens is not None and max_tokens_fn:
api_kwargs.update(max_tokens_fn(max_tokens))
elif is_nvidia_nim and max_tokens_fn:
api_kwargs.update(max_tokens_fn(16384))
elif is_qwen and max_tokens_fn:
api_kwargs.update(max_tokens_fn(65536))
elif is_kimi and max_tokens_fn:
# Kimi/Moonshot: 32000 matches Kimi CLI's default
api_kwargs.update(max_tokens_fn(32000))
elif anthropic_max_out is not None:
api_kwargs["max_tokens"] = anthropic_max_out
@@ -331,7 +311,7 @@ class ChatCompletionsTransport(ProviderTransport):
api_kwargs["reasoning_effort"] = _lm_effort
# extra_body assembly
extra_body: Dict[str, Any] = {}
extra_body: dict[str, Any] = {}
is_openrouter = params.get("is_openrouter", False)
is_nous = params.get("is_nous", False)
@@ -361,35 +341,7 @@ class ChatCompletionsTransport(ProviderTransport):
if gh_reasoning is not None:
extra_body["reasoning"] = gh_reasoning
else:
if reasoning_config is not None:
rc = dict(reasoning_config)
if is_nous and rc.get("enabled") is False:
pass # omit for Nous when disabled
else:
extra_body["reasoning"] = rc
else:
extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
if is_nous:
extra_body["tags"] = ["product=hermes-agent"]
# Ollama num_ctx
ollama_ctx = params.get("ollama_num_ctx")
if ollama_ctx:
options = extra_body.get("options", {})
options["num_ctx"] = ollama_ctx
extra_body["options"] = options
# Ollama/custom think=false
if params.get("is_custom_provider", False):
if reasoning_config and isinstance(reasoning_config, dict):
_effort = (reasoning_config.get("effort") or "").strip().lower()
_enabled = reasoning_config.get("enabled", True)
if _effort == "none" or _enabled is False:
extra_body["think"] = False
if is_qwen:
extra_body["vl_high_resolution_images"] = True
extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
if provider_name == "gemini":
raw_thinking_config = _build_gemini_thinking_config(model, reasoning_config)
@@ -423,6 +375,120 @@ class ChatCompletionsTransport(ProviderTransport):
return api_kwargs
def _build_kwargs_from_profile(self, profile, model, sanitized, tools, params):
"""Build API kwargs using a ProviderProfile — single path, no legacy flags.
This method replaces the entire flag-based kwargs assembly when a
provider_profile is passed. Every quirk comes from the profile object.
"""
from providers.base import OMIT_TEMPERATURE
# Message preprocessing
sanitized = profile.prepare_messages(sanitized)
# Developer role swap — model-name-based, applies to all providers
_model_lower = (model or "").lower()
if (
sanitized
and isinstance(sanitized[0], dict)
and sanitized[0].get("role") == "system"
and any(p in _model_lower for p in DEVELOPER_ROLE_MODELS)
):
sanitized = list(sanitized)
sanitized[0] = {**sanitized[0], "role": "developer"}
api_kwargs: dict[str, Any] = {
"model": model,
"messages": sanitized,
}
# Temperature
if profile.fixed_temperature is OMIT_TEMPERATURE:
pass # Don't include temperature at all
elif profile.fixed_temperature is not None:
api_kwargs["temperature"] = profile.fixed_temperature
else:
# Use caller's temperature if provided
temp = params.get("temperature")
if temp is not None:
api_kwargs["temperature"] = temp
# Timeout
timeout = params.get("timeout")
if timeout is not None:
api_kwargs["timeout"] = timeout
# Tools — apply Moonshot/Kimi schema sanitization regardless of path
if tools:
if is_moonshot_model(model):
tools = sanitize_moonshot_tools(tools)
api_kwargs["tools"] = tools
# max_tokens resolution — priority: ephemeral > user > profile default
max_tokens_fn = params.get("max_tokens_param_fn")
ephemeral = params.get("ephemeral_max_output_tokens")
user_max = params.get("max_tokens")
anthropic_max = params.get("anthropic_max_output")
if ephemeral is not None and max_tokens_fn:
api_kwargs.update(max_tokens_fn(ephemeral))
elif user_max is not None and max_tokens_fn:
api_kwargs.update(max_tokens_fn(user_max))
elif profile.default_max_tokens and max_tokens_fn:
api_kwargs.update(max_tokens_fn(profile.default_max_tokens))
elif anthropic_max is not None:
api_kwargs["max_tokens"] = anthropic_max
# Provider-specific api_kwargs extras (reasoning_effort, metadata, etc.)
reasoning_config = params.get("reasoning_config")
extra_body_from_profile, top_level_from_profile = (
profile.build_api_kwargs_extras(
reasoning_config=reasoning_config,
supports_reasoning=params.get("supports_reasoning", False),
qwen_session_metadata=params.get("qwen_session_metadata"),
model=model,
ollama_num_ctx=params.get("ollama_num_ctx"),
)
)
api_kwargs.update(top_level_from_profile)
# extra_body assembly
extra_body: dict[str, Any] = {}
# Profile's extra_body (tags, provider prefs, vl_high_resolution, etc.)
profile_body = profile.build_extra_body(
session_id=params.get("session_id"),
provider_preferences=params.get("provider_preferences"),
model=model,
base_url=params.get("base_url"),
reasoning_config=reasoning_config,
)
if profile_body:
extra_body.update(profile_body)
# Profile's reasoning/thinking extra_body entries
if extra_body_from_profile:
extra_body.update(extra_body_from_profile)
# Merge any pre-built extra_body additions from the caller
additions = params.get("extra_body_additions")
if additions:
extra_body.update(additions)
# Request overrides (user config)
overrides = params.get("request_overrides")
if overrides:
for k, v in overrides.items():
if k == "extra_body" and isinstance(v, dict):
extra_body.update(v)
else:
api_kwargs[k] = v
if extra_body:
api_kwargs["extra_body"] = extra_body
return api_kwargs
def normalize_response(self, response: Any, **kwargs) -> NormalizedResponse:
"""Normalize OpenAI ChatCompletion to NormalizedResponse.
@@ -444,7 +510,7 @@ class ChatCompletionsTransport(ProviderTransport):
# Gemini 3 thinking models attach extra_content with
# thought_signature — without replay on the next turn the API
# rejects the request with 400.
tc_provider_data: Dict[str, Any] = {}
tc_provider_data: dict[str, Any] = {}
extra = getattr(tc, "extra_content", None)
if extra is None and hasattr(tc, "model_extra"):
extra = (tc.model_extra or {}).get("extra_content")
@@ -455,12 +521,14 @@ class ChatCompletionsTransport(ProviderTransport):
except Exception:
pass
tc_provider_data["extra_content"] = extra
tool_calls.append(ToolCall(
id=tc.id,
name=tc.function.name,
arguments=tc.function.arguments,
provider_data=tc_provider_data or None,
))
tool_calls.append(
ToolCall(
id=tc.id,
name=tc.function.name,
arguments=tc.function.arguments,
provider_data=tc_provider_data or None,
)
)
usage = None
if hasattr(response, "usage") and response.usage:
@@ -508,7 +576,7 @@ class ChatCompletionsTransport(ProviderTransport):
return False
return True
def extract_cache_stats(self, response: Any) -> Optional[Dict[str, int]]:
def extract_cache_stats(self, response: Any) -> dict[str, int] | None:
"""Extract OpenRouter/OpenAI cache stats from prompt_tokens_details."""
usage = getattr(response, "usage", None)
if usage is None:
+15 -14
View File
@@ -12,7 +12,7 @@ from __future__ import annotations
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
from typing import Any
@dataclass
@@ -32,10 +32,10 @@ class ToolCall:
* Others: ``None``
"""
id: Optional[str]
id: str | None
name: str
arguments: str # JSON string
provider_data: Optional[Dict[str, Any]] = field(default=None, repr=False)
provider_data: dict[str, Any] | None = field(default=None, repr=False)
# ── Backward compatibility ──────────────────────────────────
# The agent loop reads tc.function.name / tc.function.arguments
@@ -47,17 +47,17 @@ class ToolCall:
return "function"
@property
def function(self) -> "ToolCall":
def function(self) -> ToolCall:
"""Return self so tc.function.name / tc.function.arguments work."""
return self
@property
def call_id(self) -> Optional[str]:
def call_id(self) -> str | None:
"""Codex call_id from provider_data, accessed via getattr by _build_assistant_message."""
return (self.provider_data or {}).get("call_id")
@property
def response_item_id(self) -> Optional[str]:
def response_item_id(self) -> str | None:
"""Codex response_item_id from provider_data."""
return (self.provider_data or {}).get("response_item_id")
@@ -101,18 +101,18 @@ class NormalizedResponse:
* Others: ``None``
"""
content: Optional[str]
tool_calls: Optional[List[ToolCall]]
content: str | None
tool_calls: list[ToolCall] | None
finish_reason: str # "stop", "tool_calls", "length", "content_filter"
reasoning: Optional[str] = None
usage: Optional[Usage] = None
provider_data: Optional[Dict[str, Any]] = field(default=None, repr=False)
reasoning: str | None = None
usage: Usage | None = None
provider_data: dict[str, Any] | None = field(default=None, repr=False)
# ── Backward compatibility ──────────────────────────────────
# The shim _nr_to_assistant_message() mapped these from provider_data.
# These properties let NormalizedResponse pass through directly.
@property
def reasoning_content(self) -> Optional[str]:
def reasoning_content(self) -> str | None:
pd = self.provider_data or {}
return pd.get("reasoning_content")
@@ -136,8 +136,9 @@ class NormalizedResponse:
# Factory helpers
# ---------------------------------------------------------------------------
def build_tool_call(
id: Optional[str],
id: str | None,
name: str,
arguments: Any,
**provider_fields: Any,
@@ -151,7 +152,7 @@ def build_tool_call(
return ToolCall(id=id, name=name, arguments=args_str, provider_data=pd)
def map_finish_reason(reason: Optional[str], mapping: Dict[str, str]) -> str:
def map_finish_reason(reason: str | None, mapping: dict[str, str]) -> str:
"""Translate a provider-specific stop reason to the normalised set.
Falls back to ``"stop"`` for unknown or ``None`` reasons.
+284 -39
View File
@@ -940,6 +940,18 @@ def _run_state_db_auto_maintenance(session_db) -> None:
except Exception as _prune_exc:
logger.debug("Ghost session prune skipped: %s", _prune_exc)
# One-time finalize of orphaned compression continuations (#20001).
try:
if not session_db.get_meta("orphaned_compression_finalize_v1"):
finalized = session_db.finalize_orphaned_compression_sessions()
session_db.set_meta("orphaned_compression_finalize_v1", "1")
if finalized:
logger.info(
"Finalized %d orphaned compression sessions", finalized
)
except Exception as _finalize_exc:
logger.debug("Orphan compression finalize skipped: %s", _finalize_exc)
cfg = (_load_full_config().get("sessions") or {})
if not cfg.get("auto_prune", False):
return
@@ -1226,6 +1238,28 @@ def _strip_markdown_syntax(text: str) -> str:
return plain.strip("\n")
_WINDOWS_PATH_WITH_DOT_SEGMENT_RE = re.compile(
r"(?i)(?:\b[a-z]:\\|\\\\)[^\s`]*\\\.[^\s`]*"
)
def _preserve_windows_dot_segments_for_markdown(text: str) -> str:
r"""Keep Windows path separators before hidden directories in Markdown.
CommonMark treats ``\.`` as an escaped literal dot, so Rich Markdown would
render ``D:\repo\.ai`` as ``D:\repo.ai``. Doubling only that separator
inside Windows path-looking tokens preserves the path without changing
ordinary markdown escapes like ``1\. not a list``.
"""
if "\\." not in text:
return text
def _protect(match: re.Match[str]) -> str:
return re.sub(r"(?<!\\)\\(?=\.)", r"\\\\", match.group(0))
return _WINDOWS_PATH_WITH_DOT_SEGMENT_RE.sub(_protect, text)
def _render_final_assistant_content(text: str, mode: str = "render"):
"""Render final assistant content as markdown, stripped text, or raw text."""
from rich.markdown import Markdown
@@ -1237,6 +1271,7 @@ def _render_final_assistant_content(text: str, mode: str = "render"):
return _rich_text_from_ansi(text or "")
plain = _rich_text_from_ansi(text or "").plain
plain = _preserve_windows_dot_segments_for_markdown(plain)
return Markdown(plain)
@@ -1505,6 +1540,10 @@ def _detect_file_drop(user_input: str) -> "dict | None":
or stripped.startswith('"~')
or stripped.startswith("'/")
or stripped.startswith("'~")
or stripped.startswith('"./')
or stripped.startswith('"../')
or stripped.startswith("'./")
or stripped.startswith("'../")
or (len(stripped) >= 4 and stripped[0] in ("'", '"') and stripped[2] == ":" and stripped[3] in ("\\", "/") and stripped[1].isalpha())
)
if not starts_like_path:
@@ -1863,8 +1902,8 @@ _skill_commands = scan_skill_commands()
def _get_plugin_cmd_handler_names() -> set:
"""Return plugin command names (without slash prefix) for dispatch matching."""
try:
from hermes_cli.plugins import get_plugin_manager
return set(get_plugin_manager()._plugin_commands.keys())
from hermes_cli.plugins import get_plugin_commands
return set(get_plugin_commands().keys())
except Exception:
return set()
@@ -2118,7 +2157,10 @@ class HermesCLI:
elif CLI_CONFIG.get("max_turns"): # Backwards compat: root-level max_turns
self.max_turns = CLI_CONFIG["max_turns"]
elif os.getenv("HERMES_MAX_ITERATIONS"):
self.max_turns = int(os.getenv("HERMES_MAX_ITERATIONS"))
try:
self.max_turns = int(os.getenv("HERMES_MAX_ITERATIONS", ""))
except (TypeError, ValueError):
self.max_turns = 90
else:
self.max_turns = 90
@@ -2552,23 +2594,59 @@ class HermesCLI:
return f" {txt} ({elapsed_str})"
return f" {txt}"
def _voice_record_key_label(self) -> str:
"""Return the configured voice push-to-talk key formatted for UI.
Shared helper so every voice-facing status line / placeholder /
recording hint advertises the SAME label as the registered
prompt_toolkit binding.
Cached at startup (see ``set_voice_record_key_cache``) rather
than re-read per render. Two reasons (Copilot round-13 on
#19835):
* The prompt_toolkit binding is registered once at session
start via ``@kb.add(_voice_key)``; re-reading config per
render meant the status bar could advertise a new shortcut
after a config edit while the actual binding was still the
startup chord exactly the display/binding drift this PR
is trying to eliminate.
* The label is on the hot render path (status bar + composer
placeholder invalidated every 150ms during recording), so
reading config on every call added avoidable UI overhead.
"""
return getattr(self, "_voice_record_key_display_cache", None) or "Ctrl+B"
def set_voice_record_key_cache(self, raw_key: object) -> None:
"""Populate the voice label cache from a raw ``voice.record_key``.
Called at CLI startup after the prompt_toolkit binding is
registered so the cached label always matches the live binding.
"""
try:
from hermes_cli.voice import format_voice_record_key_for_status
self._voice_record_key_display_cache = format_voice_record_key_for_status(raw_key)
except Exception:
self._voice_record_key_display_cache = "Ctrl+B"
def _get_voice_status_fragments(self, width: Optional[int] = None):
"""Return the voice status bar fragments for the interactive TUI."""
width = width or self._get_tui_terminal_width()
compact = self._use_minimal_tui_chrome(width=width)
label = self._voice_record_key_label()
if self._voice_recording:
if compact:
return [("class:voice-status-recording", " ● REC ")]
return [("class:voice-status-recording", " ● REC Ctrl+B to stop ")]
return [("class:voice-status-recording", f" ● REC {label} to stop ")]
if self._voice_processing:
if compact:
return [("class:voice-status", " ◉ STT ")]
return [("class:voice-status", " ◉ Transcribing... ")]
if compact:
return [("class:voice-status", " 🎤 Ctrl+B ")]
return [("class:voice-status", f" 🎤 {label} ")]
tts = " | TTS on" if self._voice_tts else ""
cont = " | Continuous" if self._voice_continuous else ""
return [("class:voice-status", f" 🎤 Voice mode{tts}{cont}Ctrl+B to record ")]
return [("class:voice-status", f" 🎤 Voice mode{tts}{cont}{label} to record ")]
def _build_status_bar_text(self, width: Optional[int] = None) -> str:
"""Return a compact one-line session status string for the TUI footer."""
@@ -4928,7 +5006,7 @@ class HermesCLI:
except Exception:
pass
def new_session(self, silent=False):
def new_session(self, silent=False, title=None):
"""Start a fresh session with a new session ID and cleared agent state."""
if self.agent and self.conversation_history:
# Trigger memory extraction on the old session before session_id rotates.
@@ -4983,6 +5061,28 @@ class HermesCLI:
self.agent._session_db_created = True
except Exception:
pass
if title and self._session_db:
from hermes_state import SessionDB
try:
sanitized = SessionDB.sanitize_title(title)
except ValueError as e:
_cprint(f" Title rejected: {e}")
sanitized = None
title = None
if sanitized:
try:
self._session_db.set_session_title(self.session_id, sanitized)
self._pending_title = None
title = sanitized
except ValueError as e:
_cprint(f" {e} — session started untitled.")
title = None
except Exception:
title = None
elif title is not None:
# sanitize_title returned empty (whitespace-only / unprintable)
_cprint(" Title is empty after cleanup — session started untitled.")
title = None
# Notify memory providers that session_id rotated to a fresh
# conversation. reset=True signals providers to flush accumulated
# per-session state (_session_turns, _turn_counter, _document_id).
@@ -5002,7 +5102,10 @@ class HermesCLI:
self._notify_session_boundary("on_session_reset")
if not silent:
print("(^_^)v New session started!")
if title:
print(f"(^_^)v New session started: {title}")
else:
print("(^_^)v New session started!")
def _handle_resume_command(self, cmd_original: str) -> None:
"""Handle /resume <session_id_or_title> — switch to a previous session mid-conversation."""
@@ -6278,7 +6381,7 @@ class HermesCLI:
_cmd_def = _resolve_cmd(_base_word)
canonical = _cmd_def.name if _cmd_def else _base_word
if canonical in ("quit", "exit", "q"):
if canonical in ("quit", "exit"):
return False
elif canonical == "help":
self.show_help()
@@ -6414,7 +6517,9 @@ class HermesCLI:
else:
_cprint(" Session database not available.")
elif canonical == "new":
self.new_session()
parts = cmd_original.split(maxsplit=1)
title = parts[1].strip() if len(parts) > 1 else None
self.new_session(title=title)
elif canonical == "resume":
self._handle_resume_command(cmd_original)
elif canonical == "model":
@@ -7569,6 +7674,10 @@ class HermesCLI:
):
self.session_id = self.agent.session_id
self._pending_title = None
# Manual /compress replaces conversation_history with a new
# compressed handoff for the child session. Persist it from
# offset 0 so resume can recover the continuation after exit.
self.agent._flush_messages_to_session_db(self.conversation_history, None)
new_tokens = estimate_request_tokens_rough(
self.conversation_history,
system_prompt=_sys_prompt,
@@ -8216,20 +8325,38 @@ class HermesCLI:
return
self._voice_recording = True
# Load silence detection params from config
voice_cfg = {}
# Load silence detection params from config. Shape-safe: a
# hand-edited ``voice: true`` / ``voice: cmd+b`` leaves
# ``load_config()['voice']`` as a non-dict; coerce to {} so
# continuous recording falls back to the documented defaults
# instead of crashing on ``.get()``.
voice_cfg: dict = {}
try:
from hermes_cli.config import load_config
voice_cfg = load_config().get("voice", {})
_cfg = load_config().get("voice")
voice_cfg = _cfg if isinstance(_cfg, dict) else {}
except Exception:
pass
if self._voice_recorder is None:
self._voice_recorder = create_audio_recorder()
# Apply config-driven silence params
self._voice_recorder._silence_threshold = voice_cfg.get("silence_threshold", 200)
self._voice_recorder._silence_duration = voice_cfg.get("silence_duration", 3.0)
# Apply config-driven silence params (numeric-guarded so YAML
# scalar corruption doesn't break recording start-up).
#
# ``bool`` is explicitly excluded from the numeric check — in
# Python bool is a subclass of int, so a hand-edited
# ``silence_threshold: true`` would otherwise be forwarded as
# ``1`` instead of falling back to the 200 default (Copilot
# round-12 on #19835).
_threshold = voice_cfg.get("silence_threshold")
_duration = voice_cfg.get("silence_duration")
self._voice_recorder._silence_threshold = (
_threshold if isinstance(_threshold, (int, float)) and not isinstance(_threshold, bool) else 200
)
self._voice_recorder._silence_duration = (
_duration if isinstance(_duration, (int, float)) and not isinstance(_duration, bool) else 3.0
)
def _on_silence():
"""Called by AudioRecorder when silence is detected after speech."""
@@ -8255,12 +8382,13 @@ class HermesCLI:
with self._voice_lock:
self._voice_recording = False
raise
_label = self._voice_record_key_label()
if getattr(self._voice_recorder, "supports_silence_autostop", True):
_recording_hint = "auto-stops on silence | Ctrl+B to stop & exit continuous"
_recording_hint = f"auto-stops on silence | {_label} to stop & exit continuous"
elif _is_termux_environment():
_recording_hint = "Termux:API capture | Ctrl+B to stop"
_recording_hint = f"Termux:API capture | {_label} to stop"
else:
_recording_hint = "Ctrl+B to stop"
_recording_hint = f"{_label} to stop"
_cprint(f"\n{_ACCENT}● Recording...{_RST} {_DIM}({_recording_hint}){_RST}")
# Periodically refresh prompt to update audio level indicator
@@ -8505,10 +8633,12 @@ class HermesCLI:
with self._voice_lock:
self._voice_mode = True
# Check config for auto_tts
# Check config for auto_tts (shape-safe — malformed ``voice:`` YAML
# leaves ``voice_config`` as a non-dict, so guard before .get()).
try:
from hermes_cli.config import load_config
voice_config = load_config().get("voice", {})
_raw_voice = load_config().get("voice")
voice_config = _raw_voice if isinstance(_raw_voice, dict) else {}
if voice_config.get("auto_tts", False):
with self._voice_lock:
self._voice_tts = True
@@ -8520,13 +8650,11 @@ class HermesCLI:
# _voice_message_prefix property and its usage in _process_message().
tts_status = " (TTS enabled)" if self._voice_tts else ""
try:
from hermes_cli.config import load_config
_raw_ptt = load_config().get("voice", {}).get("record_key", "ctrl+b")
_ptt_key = _raw_ptt.lower().replace("ctrl+", "c-").replace("alt+", "a-")
except Exception:
_ptt_key = "c-b"
_ptt_display = _ptt_key.replace("c-", "Ctrl+").upper()
# Use the startup-pinned cache so the advertised shortcut always
# matches the live prompt_toolkit binding — reading live config
# here would drift after a mid-session config edit (Copilot
# round-14 on #19835, same class as round-13).
_ptt_display = self._voice_record_key_label()
_cprint(f"\n{_ACCENT}Voice mode enabled{tts_status}{_RST}")
_cprint(f" {_DIM}{_ptt_display} to start/stop recording{_RST}")
_cprint(f" {_DIM}/voice tts to toggle speech output{_RST}")
@@ -8583,7 +8711,6 @@ class HermesCLI:
def _show_voice_status(self):
"""Show current voice mode status."""
from hermes_cli.config import load_config
from tools.voice_mode import check_voice_requirements
reqs = check_voice_requirements()
@@ -8592,9 +8719,11 @@ class HermesCLI:
_cprint(f" Mode: {'ON' if self._voice_mode else 'OFF'}")
_cprint(f" TTS: {'ON' if self._voice_tts else 'OFF'}")
_cprint(f" Recording: {'YES' if self._voice_recording else 'no'}")
_raw_key = load_config().get("voice", {}).get("record_key", "ctrl+b")
_display_key = _raw_key.replace("ctrl+", "Ctrl+").upper() if "ctrl+" in _raw_key.lower() else _raw_key
_cprint(f" Record key: {_display_key}")
# Display the startup-pinned label so /voice status always
# matches the live prompt_toolkit binding (Copilot round-14 on
# #19835, same class as round-13). Reading live config here
# would drift after a mid-session config edit.
_cprint(f" Record key: {self._voice_record_key_label()}")
_cprint(f"\n {_BOLD}Requirements:{_RST}")
for line in reqs["details"].split("\n"):
_cprint(f" {line}")
@@ -10429,7 +10558,92 @@ class HermesCLI:
else:
self._should_exit = True
event.app.exit()
# Ctrl+Shift+C: no binding needed. Terminal emulators (GNOME Terminal,
# iTerm2, kitty, Windows Terminal, etc.) intercept Ctrl+Shift+C before
# the keystroke reaches the application's stdin — prompt_toolkit never
# sees it, and prompt_toolkit's key spec parser doesn't even recognise
# 'c-S-c' anyway (the Shift modifier is meaningless on control-sequence
# keys). #19884 added a handler for this; #19895 patched the resulting
# startup crash with try/except. Both were based on a misreading of how
# terminal key events propagate. Deleting the dead handler outright.
@kb.add('c-q') # Ctrl+Q
def handle_ctrl_q(event):
"""Alternative interrupt/exit shortcut (Ctrl+Q).
Behaves like Ctrl+C: cancels active prompts, interrupts the
running agent, or clears the input buffer. Does not support
the double-press 'force exit' feature of Ctrl+C.
"""
# Cancel active voice recording.
_should_cancel_voice = False
_recorder_ref = None
with cli_ref._voice_lock:
if cli_ref._voice_recording and cli_ref._voice_recorder:
_recorder_ref = cli_ref._voice_recorder
cli_ref._voice_recording = False
cli_ref._voice_continuous = False
_should_cancel_voice = True
if _should_cancel_voice:
_cprint(f"\n{_DIM}Recording cancelled.{_RST}")
threading.Thread(
target=_recorder_ref.cancel, daemon=True
).start()
event.app.invalidate()
return
# Cancel sudo prompt
if self._sudo_state:
self._sudo_state["response_queue"].put("")
self._sudo_state = None
event.app.invalidate()
return
# Cancel secret prompt
if self._secret_state:
self._cancel_secret_capture()
event.app.current_buffer.reset()
event.app.invalidate()
return
# Cancel approval prompt (deny)
if self._approval_state:
self._approval_state["response_queue"].put("deny")
self._approval_state = None
event.app.invalidate()
return
# Cancel /model picker
if self._model_picker_state:
self._close_model_picker()
event.app.current_buffer.reset()
event.app.invalidate()
return
# Cancel clarify prompt
if self._clarify_state:
self._clarify_state["response_queue"].put(
"The user cancelled. Use your best judgement to proceed."
)
self._clarify_state = None
self._clarify_freetext = False
event.app.current_buffer.reset()
event.app.invalidate()
return
if self._agent_running and self.agent:
print("\n⚡ Interrupting agent...")
self.agent.interrupt()
else:
if event.app.current_buffer.text or self._attached_images:
event.app.current_buffer.reset()
self._attached_images.clear()
event.app.invalidate()
else:
self._should_exit = True
event.app.exit()
@kb.add('c-d')
def handle_ctrl_d(event):
"""Ctrl+D: delete char under cursor (standard readline behaviour).
@@ -10483,15 +10697,44 @@ class HermesCLI:
run_in_terminal(_suspend)
# Voice push-to-talk key: configurable via config.yaml (voice.record_key)
# Default: Ctrl+B (avoids conflict with Ctrl+R readline reverse-search)
# Config uses "ctrl+b" format; prompt_toolkit expects "c-b" format.
# Default: Ctrl+B (avoids conflict with Ctrl+R readline reverse-search).
# Config spellings (ctrl/control/alt/option/opt) are normalized to
# prompt_toolkit's c-x / a-x format via ``normalize_voice_record_key_for_prompt_toolkit``
# so the same config value binds identically in the TUI and CLI
# (Copilot round-9 review on #19835). ``super``/``win``/``windows``
# configs silently fall back to the default here since prompt_toolkit
# has no super modifier — log a warning so users notice the
# TUI/CLI split instead of a silent mismatch (round-11).
_raw_key: object = "ctrl+b"
try:
from hermes_cli.config import load_config
_raw_key = load_config().get("voice", {}).get("record_key", "ctrl+b")
_voice_key = _raw_key.lower().replace("ctrl+", "c-").replace("alt+", "a-")
from hermes_cli.voice import (
normalize_voice_record_key_for_prompt_toolkit,
voice_record_key_from_config,
)
_raw_key = voice_record_key_from_config(load_config())
_voice_key = normalize_voice_record_key_for_prompt_toolkit(_raw_key)
if (
isinstance(_raw_key, str)
and _raw_key.strip().lower().split("+", 1)[0].strip() in {"super", "win", "windows"}
and _voice_key == "c-b"
):
logger.warning(
"voice.record_key %r uses a TUI-only modifier (super/win); "
"CLI fell back to Ctrl+B. Use ctrl+<key> or alt+<key> for "
"cross-runtime parity.",
_raw_key,
)
except Exception:
_voice_key = "c-b"
# Cache the UI label here — same ``_raw_key`` that drives the
# prompt_toolkit binding below. Every status / placeholder /
# recording-hint render reads this cached value so display can
# never drift from the live keybinding even if the user edits
# voice.record_key mid-session (Copilot round-13 on #19835).
self.set_voice_record_key_cache(_raw_key)
@kb.add(_voice_key)
def handle_voice_record(event):
"""Toggle voice recording when voice mode is active.
@@ -10794,7 +11037,8 @@ class HermesCLI:
def _get_placeholder():
if cli_ref._voice_recording:
return "recording... Ctrl+B to stop, Ctrl+C to cancel"
_label = cli_ref._voice_record_key_label()
return f"recording... {_label} to stop, Ctrl+C to cancel"
if cli_ref._voice_processing:
return "transcribing..."
if cli_ref._sudo_state:
@@ -10814,7 +11058,8 @@ class HermesCLI:
if cli_ref._agent_running:
return "msg=interrupt · /queue · /bg · /steer · Ctrl+C cancel"
if cli_ref._voice_mode:
return "type or Ctrl+B to record"
_label = cli_ref._voice_record_key_label()
return f"type or {_label} to record"
return ""
input_area.control.input_processors.append(_PlaceholderProcessor(_get_placeholder))
+37 -6
View File
@@ -420,7 +420,7 @@ def _normalize_workdir(workdir: Optional[str]) -> Optional[str]:
def create_job(
prompt: str,
prompt: Optional[str],
schedule: str,
name: Optional[str] = None,
repeat: Optional[int] = None,
@@ -435,12 +435,14 @@ def create_job(
context_from: Optional[Union[str, List[str]]] = None,
enabled_toolsets: Optional[List[str]] = None,
workdir: Optional[str] = None,
no_agent: bool = False,
) -> Dict[str, Any]:
"""
Create a new cron job.
Args:
prompt: The prompt to run (must be self-contained, or a task instruction when skill is set)
prompt: The prompt to run (must be self-contained, or a task instruction when skill is set).
Ignored when ``no_agent=True`` except as an optional name hint.
schedule: Schedule string (see parse_schedule)
name: Optional friendly name
repeat: How many times to run (None = forever, 1 = once)
@@ -451,21 +453,33 @@ def create_job(
model: Optional per-job model override
provider: Optional per-job provider override
base_url: Optional per-job base URL override
script: Optional path to a Python script whose stdout is injected into the
prompt each run. The script runs before the agent turn, and its output
is prepended as context. Useful for data collection / change detection.
script: Optional path to a script whose stdout feeds the job. With
``no_agent=True`` the script IS the job its stdout is
delivered verbatim. Without ``no_agent``, its stdout is
injected into the agent's prompt as context (data-collection /
change-detection pattern). Paths resolve under
~/.hermes/scripts/; ``.sh`` / ``.bash`` files run via bash,
anything else via Python.
context_from: Optional job ID (or list of job IDs) whose most recent output
is injected into the prompt as context before each run.
Useful for chaining cron jobs: job A finds data, job B processes it.
enabled_toolsets: Optional list of toolset names to restrict the agent to.
When set, only tools from these toolsets are loaded, reducing
token overhead. When omitted, all default tools are loaded.
Ignored when ``no_agent=True``.
workdir: Optional absolute path. When set, the job runs as if launched
from that directory: AGENTS.md / CLAUDE.md / .cursorrules from
that directory are injected into the system prompt, and the
terminal/file/code_exec tools use it as their working directory
(via TERMINAL_CWD). When unset, the old behaviour is preserved
(no context files injected, tools use the scheduler's cwd).
With ``no_agent=True``, ``workdir`` is still applied as the
script's cwd so relative paths inside the script behave
predictably.
no_agent: When True, skip the agent entirely run ``script`` on schedule
and deliver its stdout directly. Empty stdout = silent (no
delivery). Requires ``script`` to be set. Ideal for classic
watchdogs and periodic alerts that don't need LLM reasoning.
Returns:
The created job dict
@@ -499,6 +513,16 @@ def create_job(
normalized_toolsets = [str(t).strip() for t in enabled_toolsets if str(t).strip()] if enabled_toolsets else None
normalized_toolsets = normalized_toolsets or None
normalized_workdir = _normalize_workdir(workdir)
normalized_no_agent = bool(no_agent)
# no_agent jobs are meaningless without a script — the script IS the job.
# Surface this as a clear ValueError at create time so bad configs never
# reach the scheduler.
if normalized_no_agent and not normalized_script:
raise ValueError(
"no_agent=True requires a script — with no agent and no script "
"there is nothing for the job to run."
)
# Normalize context_from: accept str or list of str, store as list or None
if isinstance(context_from, str):
@@ -508,7 +532,7 @@ def create_job(
else:
context_from = None
label_source = (prompt or (normalized_skills[0] if normalized_skills else None)) or "cron job"
label_source = (prompt or (normalized_skills[0] if normalized_skills else None) or (normalized_script if normalized_no_agent else None)) or "cron job"
job = {
"id": job_id,
"name": name or label_source[:50].strip(),
@@ -519,6 +543,7 @@ def create_job(
"provider": normalized_provider,
"base_url": normalized_base_url,
"script": normalized_script,
"no_agent": normalized_no_agent,
"context_from": context_from,
"schedule": parsed_schedule,
"schedule_display": parsed_schedule.get("display", schedule),
@@ -785,6 +810,12 @@ def get_due_jobs() -> List[Dict[str, Any]]:
the job is fast-forwarded to the next future run instead of firing
immediately. This prevents a burst of missed jobs on gateway restart.
"""
with _jobs_file_lock:
return _get_due_jobs_locked()
def _get_due_jobs_locked() -> List[Dict[str, Any]]:
"""Inner implementation of get_due_jobs(); must be called with _jobs_file_lock held."""
now = _hermes_now()
raw_jobs = load_jobs()
jobs = [_apply_skill_fields(j) for j in copy.deepcopy(raw_jobs)]
+169 -24
View File
@@ -35,7 +35,7 @@ from typing import List, Optional
sys.path.insert(0, str(Path(__file__).parent.parent))
from hermes_constants import get_hermes_home
from hermes_cli.config import load_config
from hermes_cli.config import load_config, _expand_env_vars
from hermes_time import now as _hermes_now
logger = logging.getLogger(__name__)
@@ -114,12 +114,20 @@ from cron.jobs import get_due_jobs, mark_job_run, save_job_output, advance_next_
# locally for audit.
SILENT_MARKER = "[SILENT]"
# Resolve Hermes home directory (respects HERMES_HOME override)
_hermes_home = get_hermes_home()
# Backward-compatible module override used by tests and emergency monkeypatches.
_hermes_home: Path | None = None
# File-based lock prevents concurrent ticks from gateway + daemon + systemd timer
_LOCK_DIR = _hermes_home / "cron"
_LOCK_FILE = _LOCK_DIR / ".tick.lock"
def _get_hermes_home() -> Path:
"""Resolve Hermes home dynamically while preserving test monkeypatch hooks."""
return _hermes_home or get_hermes_home()
def _get_lock_paths() -> tuple[Path, Path]:
"""Resolve cron lock paths at call time so profile/env changes are honored."""
hermes_home = _get_hermes_home()
lock_dir = hermes_home / "cron"
return lock_dir, lock_dir / ".tick.lock"
def _resolve_origin(job: dict) -> Optional[dict]:
@@ -576,8 +584,18 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
prevent arbitrary script execution via path traversal or absolute
path injection.
Supported interpreters (chosen by file extension):
* ``.sh`` / ``.bash`` run with ``/bin/bash``
* anything else run with the current Python interpreter
(``sys.executable``), preserving the original behaviour for
Python-based pre-check and data-collection scripts.
Shell support lets ``no_agent=True`` jobs ship classic bash watchdogs
(the `memory-watchdog.sh` pattern) without wrapping them in Python.
Args:
script_path: Path to a Python script. Relative paths are resolved
script_path: Path to the script. Relative paths are resolved
against HERMES_HOME/scripts/. Absolute and ~-prefixed paths
are also validated to ensure they stay within the scripts dir.
@@ -587,7 +605,7 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
"""
from hermes_constants import get_hermes_home
scripts_dir = get_hermes_home() / "scripts"
scripts_dir = _get_hermes_home() / "scripts"
scripts_dir.mkdir(parents=True, exist_ok=True)
scripts_dir_resolved = scripts_dir.resolve()
@@ -614,9 +632,19 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
script_timeout = _get_script_timeout()
# Pick an interpreter by extension. Bash for .sh/.bash, Python for
# everything else. We deliberately do NOT honour the file's own
# shebang: the scripts dir is trusted, but keeping the interpreter
# choice explicit here keeps the allowed surface small and auditable.
suffix = path.suffix.lower()
if suffix in (".sh", ".bash"):
argv = ["/bin/bash", str(path)]
else:
argv = [sys.executable, str(path)]
try:
result = subprocess.run(
[sys.executable, str(path)],
argv,
capture_output=True,
text=True,
timeout=script_timeout,
@@ -706,10 +734,8 @@ def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
f"{prompt}"
)
else:
prompt = (
"[Script ran successfully but produced no output.]\n\n"
f"{prompt}"
)
# Script produced no output — nothing to report, skip AI call.
return None
else:
prompt = (
"## Script Error\n"
@@ -832,8 +858,120 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
Returns:
Tuple of (success, full_output_doc, final_response, error_message)
"""
job_id = job["id"]
job_name = job["name"]
# ---------------------------------------------------------------
# no_agent short-circuit — the script IS the job, no LLM involvement.
# ---------------------------------------------------------------
# This mirrors the classic "run a bash script on a timer, send its
# stdout to telegram" watchdog pattern. The agent path is skipped
# entirely: no AIAgent, no prompt, no tool loop, no token spend.
#
# We check this BEFORE importing run_agent / constructing SessionDB so
# a pure-script tick never pays for the agent machinery it isn't going
# to use. Keep this block self-contained.
#
# Semantics:
# - script stdout (trimmed) → delivered verbatim as the final message
# - empty stdout → silent run (no delivery, success=True)
# - non-zero exit / timeout → delivered as an error alert, success=False
# - wakeAgent=false gate → treated like empty stdout (silent), since
# the whole point of no_agent is that there
# is no agent to wake
if job.get("no_agent"):
script_path = job.get("script")
if not script_path:
err = "no_agent=True but no script is set for this job"
logger.error("Job '%s': %s", job_id, err)
return False, "", "", err
# Apply workdir if configured — lets scripts use predictable relative
# paths. For no_agent jobs this is just the subprocess cwd (not an
# agent TERMINAL_CWD bridge).
_job_workdir = (job.get("workdir") or "").strip() or None
_prior_cwd = None
if _job_workdir and Path(_job_workdir).is_dir():
_prior_cwd = os.getcwd()
try:
os.chdir(_job_workdir)
except OSError:
_prior_cwd = None
try:
ok, output = _run_job_script(script_path)
finally:
if _prior_cwd is not None:
try:
os.chdir(_prior_cwd)
except OSError:
pass
now_iso = _hermes_now().strftime("%Y-%m-%d %H:%M:%S")
if not ok:
# Script crashed / timed out / exited non-zero. Deliver the
# error so the user knows the watchdog itself broke — silent
# failure for an alerting job is the worst-case outcome.
alert = (
f"⚠ Cron watchdog '{job_name}' script failed\n\n"
f"{output}\n\n"
f"Time: {now_iso}"
)
doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n"
f"**Status:** script failed\n\n"
f"{output}\n"
)
return False, doc, alert, output
# Honour the wakeAgent gate as a silent signal — `wakeAgent: false`
# means "nothing to report this tick", same as empty stdout.
if not _parse_wake_gate(output):
logger.info(
"Job '%s' (no_agent): wakeAgent=false gate — silent run", job_id
)
silent_doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n"
f"**Status:** silent (wakeAgent=false)\n"
)
return True, silent_doc, SILENT_MARKER, None
if not output.strip():
logger.info("Job '%s' (no_agent): empty stdout — silent run", job_id)
silent_doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n"
f"**Status:** silent (empty output)\n"
)
return True, silent_doc, SILENT_MARKER, None
doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {now_iso}\n"
f"**Mode:** no_agent (script)\n\n"
f"---\n\n"
f"{output}\n"
)
return True, doc, output, None
# ---------------------------------------------------------------
# Default (LLM) path — import and construct the agent machinery now
# that we know we actually need it. Doing these imports here instead of
# at module top keeps no_agent ticks from paying for AIAgent / SessionDB
# construction costs.
# ---------------------------------------------------------------
from run_agent import AIAgent
# Initialize SQLite session store so cron job messages are persisted
# and discoverable via session_search (same pattern as gateway/run.py).
_session_db = None
@@ -842,9 +980,6 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
_session_db = SessionDB()
except Exception as e:
logger.debug("Job '%s': SQLite session store not available: %s", job.get("id", "?"), e)
job_id = job["id"]
job_name = job["name"]
# Wake-gate: if this job has a pre-check script, run it BEFORE building
# the prompt so a ``{"wakeAgent": false}`` response can short-circuit
@@ -869,6 +1004,9 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
return True, silent_doc, SILENT_MARKER, None
prompt = _build_job_prompt(job, prerun_script=prerun_script)
if prompt is None:
logger.info("Job '%s': script produced no output, skipping AI call.", job_name)
return True, "", SILENT_MARKER, None
origin = _resolve_origin(job)
_cron_session_id = f"cron_{job_id}_{_hermes_now().strftime('%Y%m%d_%H%M%S')}"
@@ -928,9 +1066,9 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
# changes take effect without a gateway restart.
from dotenv import load_dotenv
try:
load_dotenv(str(_hermes_home / ".env"), override=True, encoding="utf-8")
load_dotenv(str(_get_hermes_home() / ".env"), override=True, encoding="utf-8")
except UnicodeDecodeError:
load_dotenv(str(_hermes_home / ".env"), override=True, encoding="latin-1")
load_dotenv(str(_get_hermes_home() / ".env"), override=True, encoding="latin-1")
delivery_target = _resolve_delivery_target(job)
if delivery_target:
@@ -948,10 +1086,11 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
_cfg = {}
try:
import yaml
_cfg_path = str(_hermes_home / "config.yaml")
_cfg_path = str(_get_hermes_home() / "config.yaml")
if os.path.exists(_cfg_path):
with open(_cfg_path) as _f:
_cfg = yaml.safe_load(_f) or {}
_cfg = _expand_env_vars(_cfg)
_model_cfg = _cfg.get("model", {})
if not job.get("model"):
if isinstance(_model_cfg, str):
@@ -981,7 +1120,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
if prefill_file:
pfpath = Path(prefill_file).expanduser()
if not pfpath.is_absolute():
pfpath = _hermes_home / pfpath
pfpath = _get_hermes_home() / pfpath
if pfpath.exists():
try:
with open(pfpath, "r", encoding="utf-8") as _pf:
@@ -1004,8 +1143,13 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
)
from hermes_cli.auth import AuthError
try:
# Do not inject HERMES_INFERENCE_PROVIDER here. resolve_runtime_provider()
# already prefers persisted config over stale shell/env overrides when
# no explicit provider is requested. Passing the env var here short-
# circuits that precedence and can resurrect old providers (for
# example DeepSeek) for cron jobs that do not pin provider/model.
runtime_kwargs = {
"requested": job.get("provider") or os.getenv("HERMES_INFERENCE_PROVIDER"),
"requested": job.get("provider"),
}
if job.get("base_url"):
runtime_kwargs["explicit_base_url"] = job.get("base_url")
@@ -1300,12 +1444,13 @@ def tick(verbose: bool = True, adapters=None, loop=None) -> int:
Returns:
Number of jobs executed (0 if another tick is already running)
"""
_LOCK_DIR.mkdir(parents=True, exist_ok=True)
lock_dir, lock_file = _get_lock_paths()
lock_dir.mkdir(parents=True, exist_ok=True)
# Cross-platform file locking: fcntl on Unix, msvcrt on Windows
lock_fd = None
try:
lock_fd = open(_LOCK_FILE, "w")
lock_fd = open(lock_file, "w")
if fcntl:
fcntl.flock(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
elif msvcrt:
@@ -0,0 +1,473 @@
# Telegram DM User-Managed Multi-Session Topics Implementation Plan
> **For Hermes:** Use test-driven-development for implementation. Use subagent-driven-development only after this plan is split into small reviewed tasks.
**Goal:** Add an opt-in Telegram DM multi-session mode where Telegram user-created private-chat topics become independent Hermes session lanes, while the root DM becomes a system lobby.
**Architecture:** Rely on Telegram's native private-chat topic UI. Users create new topics with the `+` button; Hermes maps each `message_thread_id` to a separate session lane. Hermes does not create topics for normal `/new` flow and does not try to manage topic lifecycle beyond activation/status, root-lobby behavior, and restoring legacy sessions into a user-created topic.
**Tech Stack:** Hermes gateway, Telegram Bot API 9.4+, python-telegram-bot adapter, SQLite SessionDB / side tables, pytest.
---
## 1. Product decisions
### Accepted
- PR-quality implementation: migrations, tests, docs, backwards compatibility.
- Use SQLite persistence, not JSON sidecars.
- Live status suffixes in topic titles are out of MVP.
- Topic title sync/editing is out of MVP except future-compatible storage if cheap.
- User creates Telegram topics manually through the Telegram bot interface.
- `/new` does **not** create Telegram topics.
- Root/main DM becomes a system lobby after activation.
- Existing Telegram behavior remains unchanged until the feature is activated/enabled.
- Migration of old sessions is supported through `/topic` listing and `/topic <session_id>` restore inside a user-created topic.
### Telegram API assumptions verified from Bot API docs
- `getMe` returns bot `User` fields:
- `has_topics_enabled`: forum/topic mode enabled in private chats.
- `allows_users_to_create_topics`: users may create/delete topics in private chats.
- `createForumTopic` works for private chats with a user, but MVP does not rely on it for normal flow.
- `Message.message_thread_id` identifies a topic in private chats.
- `sendMessage` supports `message_thread_id` for private-chat topics.
- `pinChatMessage` is allowed in private chats.
---
## 2. Target UX
### 2.1 Activation from root/main DM
User sends:
```text
/topic
```
Hermes:
1. calls Telegram `getMe`;
2. verifies `has_topics_enabled` and `allows_users_to_create_topics`;
3. enables multi-session topic mode for this Telegram DM user/chat;
4. sends an onboarding message;
5. pins the onboarding message if configured;
6. shows old/unlinked sessions that can be restored into topics.
Suggested onboarding text:
```text
Multi-session mode is enabled.
Create new Hermes chats with the + button in this bot interface. Each Telegram topic is an independent Hermes session, so you can work on different tasks in parallel.
This main chat is reserved for system commands, status, and session management.
To restore an old session:
1. Use /topic here to see unlinked sessions.
2. Create a new topic with the + button.
3. Send /topic <session_id> inside that topic.
```
### 2.2 Root/main DM after activation
Root DM is a system lobby.
Allowed/system commands include at least:
- `/topic`
- `/status`
- `/sessions` if available
- `/usage`
- `/help`
- `/platforms`
Normal user prompts in root DM do not enter the agent loop. Reply:
```text
This main chat is reserved for system commands.
To chat with Hermes, create a new topic using the + button in this bot interface. Each topic works as an independent Hermes session.
```
`/new` in root DM does not create a session/topic. Reply:
```text
To start a new parallel Hermes chat, create a new topic with the + button in this bot interface.
Each topic is an independent Hermes session. Use /new inside a topic only if you want to replace that topic's current session.
```
### 2.3 First message in a user-created topic
When a user creates a Telegram topic and sends the first message there:
1. Hermes receives a Telegram DM message with `message_thread_id`.
2. Hermes derives the existing thread-aware `session_key` from `(platform=telegram, chat_type=dm, chat_id, thread_id)`.
3. If no binding exists, Hermes creates a fresh Hermes session for this topic lane and persists the binding.
4. The message runs through the normal agent loop for that lane.
### 2.4 `/new` inside a non-main topic
`/new` remains supported but replaces the session attached to the current topic lane.
Hermes should warn:
```text
Started a new Hermes session in this topic.
Tip: for parallel work, create a new topic with the + button instead of using /new here. /new replaces the session attached to the current topic.
```
### 2.5 `/topic` in root/main DM after activation
Shows:
- mode enabled/disabled;
- last capability check result;
- whether intro message is pinned if known;
- count of known topic bindings;
- list of old/unlinked sessions.
Example:
```text
Telegram multi-session topics are enabled.
Create new Hermes chats with the + button in this bot interface.
Unlinked previous sessions:
1. 2026-05-01 Research notes — id: abc123
2. 2026-04-30 Deploy debugging — id: def456
3. Untitled session — id: ghi789
To restore one:
1. Create a new topic with the + button.
2. Open that topic.
3. Send /topic <id>
```
### 2.6 `/topic` inside a non-main topic
Without args, show the current topic binding:
```text
This topic is linked to:
Session: Research notes
ID: abc123
Use /new to replace this topic with a fresh session.
For parallel work, create another topic with the + button.
```
### 2.7 `/topic <session_id>` inside a non-main topic
Restore an old/unlinked session into the current user-created topic.
Behavior:
1. reject if not in Telegram DM topic;
2. verify session belongs to the same Telegram user/chat or is a safe legacy root DM session for this user;
3. reject if session is already linked to another active topic in MVP;
4. `SessionStore.switch_session(current_topic_session_key, target_session_id)`;
5. upsert binding with `managed_mode = restored`;
6. send two messages into the topic:
- session restored confirmation;
- last Hermes assistant message if available.
Example:
```text
Session restored: Research notes
Last Hermes message:
...
```
---
## 3. Persistence model
Use SQLite, but topic-mode schema changes are **explicit opt-in migrations**, not automatic startup reconciliation.
Important rollback-safety rule:
- upgrading Hermes and starting the gateway must not create Telegram topic-mode tables or columns;
- old/default Telegram behavior must keep working on the existing `state.db`;
- the first `/topic` activation path calls an idempotent explicit migration, then enables topic mode for that chat;
- if activation fails before the migration is needed, the database remains in the pre-topic-mode shape.
### 3.1 No eager `sessions` table mutation for MVP
Do **not** add `chat_id`, `chat_type`, `thread_id`, or `session_key` columns to `sessions` as part of ordinary `SessionDB()` startup. The existing declarative `_reconcile_columns()` mechanism would add them eagerly on every process start, which violates the managed-migration requirement.
For MVP, keep origin/session-lane data in topic-specific side tables created only by the explicit `/topic` migration. Legacy unlinked sessions can be discovered conservatively from existing data (`source = telegram`, `user_id = current Telegram user`) plus absence from topic bindings.
If future PRs need richer origin metadata for all gateway sessions, introduce it behind a separate explicit migration/command or a compatibility-reviewed schema bump.
### 3.2 Explicit `/topic` migration API
Add an idempotent method such as:
```python
def apply_telegram_topic_migration(self) -> None: ...
```
It creates only topic-mode side tables/indexes and records:
```text
state_meta.telegram_dm_topic_schema_version = 1
```
This method is called from `/topic` activation/status paths before reading or writing topic-mode state. It is not called from generic `SessionDB.__init__`, gateway startup, CLI startup, or auto-maintenance.
### 3.3 `telegram_dm_topic_mode`
Stores per-user/chat activation state. Created only by `apply_telegram_topic_migration()`.
Suggested fields:
- `chat_id` primary key
- `user_id`
- `enabled`
- `activated_at`
- `updated_at`
- `has_topics_enabled`
- `allows_users_to_create_topics`
- `capability_checked_at`
- `intro_message_id`
- `pinned_message_id`
### 3.4 `telegram_dm_topic_bindings`
Stores Telegram topic/thread to Hermes session binding. Created only by `apply_telegram_topic_migration()`.
Suggested fields:
- `chat_id`
- `thread_id`
- `user_id`
- `session_key`
- `session_id`
- `managed_mode`
- `auto`
- `restored`
- `new_replaced`
- `linked_at`
- `updated_at`
Recommended constraints:
- primary key `(chat_id, thread_id)`;
- unique index on `session_id` for MVP to prevent one session linked to multiple topics;
- index `(user_id, chat_id)` for status/listing.
### 3.5 Unlinked session semantics
For MVP, a session is unlinked if:
- `source = telegram`;
- `user_id = current Telegram user`;
- no row in `telegram_dm_topic_bindings` has `session_id = session_id`.
This is intentionally conservative until a future explicit migration adds richer cross-platform origin metadata.
Never dedupe by title.
---
## 4. Config
Suggested config block:
```yaml
platforms:
telegram:
extra:
multisession_topics:
enabled: false
mode: user_managed_topics
root_chat_behavior: system_lobby
pin_intro_message: true
```
Notes:
- `enabled: false` means existing Telegram behavior is unchanged.
- Activation via `/topic` may create per-chat enabled state only if global config permits it.
- `root_chat_behavior: system_lobby` is the MVP behavior for activated chats.
---
## 5. Command behavior summary
### `/topic` root/main DM
- If not activated: capability check, activate, send/pin onboarding, list unlinked sessions.
- If activated: show status and unlinked sessions.
### `/topic` non-main topic
- Show current binding.
### `/topic <session_id>` root/main DM
Reject with instructions:
```text
Create a new topic with the + button, open it, then send /topic <session_id> there to restore this session.
```
### `/topic <session_id>` non-main topic
Restore that session into this topic if ownership/linking checks pass.
### `/new` root/main DM when activated
Reply with instructions to use the `+` button. Do not enter agent loop.
### `/new` non-main topic
Create a new session in the current topic lane, persist/update binding, warn that `+` is preferred for parallel work.
### Normal text root/main DM when activated
Reply with system-lobby instruction. Do not enter agent loop.
### Normal text non-main topic
Normal Hermes agent flow for that topic's session lane.
---
## 6. PR breakdown
### PR 1 — Explicit topic-mode schema migration
**Goal:** Add rollback-safe SQLite support for Telegram topic mode without mutating `state.db` on ordinary upgrade/startup.
**Files likely touched:**
- `hermes_state.py`
- tests under `tests/`
**Tests first:**
1. opening an old/current DB with `SessionDB()` does not create topic-mode tables or `sessions` origin columns;
2. calling `apply_telegram_topic_migration()` creates `telegram_dm_topic_mode` and `telegram_dm_topic_bindings` idempotently;
3. migration records `state_meta.telegram_dm_topic_schema_version = 1`.
### PR 2 — Topic mode activation and binding APIs
**Goal:** Add SQLite persistence for activation and topic bindings.
**Tests first:**
1. enable/check mode row round-trips;
2. binding upsert and lookup by `(chat_id, user_id, thread_id)`;
3. linked sessions are excluded from unlinked list.
### PR 3 — `/topic` activation/status command
**Goal:** Implement root activation/status/listing behavior.
**Tests first:**
1. `/topic` in root checks `getMe` capabilities and records activation;
2. capability failure returns readable instructions;
3. activated root `/topic` lists unlinked sessions.
### PR 4 — System lobby behavior
**Goal:** Prevent root chat from entering agent loop after activation.
**Tests first:**
1. normal text in activated root returns lobby instruction;
2. `/new` in activated root returns `+` button instruction;
3. non-activated root behavior is unchanged.
### PR 5 — Auto-bind user-created topics
**Goal:** First message in non-main topic creates/uses an independent session lane.
**Tests first:**
1. new topic message creates binding with `auto_created`;
2. repeated topic message reuses same binding/lane;
3. two topics in same DM do not share sessions.
### PR 6 — Restore legacy sessions into a topic
**Goal:** Implement `/topic <session_id>` in non-main topics.
**Tests first:**
1. root `/topic <id>` rejects with instructions;
2. topic `/topic <id>` switches current topic lane to target session;
3. restore rejects sessions from other users/chats;
4. restore rejects already-linked sessions;
5. restore emits confirmation and last Hermes assistant message.
### PR 7 — `/new` inside topic updates binding
**Goal:** Keep existing `/new` semantics but persist topic binding replacement.
**Tests first:**
1. `/new` in topic creates a new session for same topic lane;
2. binding updates to `managed_mode = new_replaced`;
3. response includes guidance to use `+` for parallel work.
### PR 8 — Docs and polish
**Goal:** Document the feature and Telegram setup.
**Files likely touched:**
- `website/docs/user-guide/messaging/telegram.md`
- maybe `website/docs/user-guide/sessions.md`
Docs must explain:
- BotFather/Telegram settings for topic mode and user-created topics;
- `/topic` activation;
- root system lobby;
- using `+` for new parallel chats;
- restoring old sessions with `/topic <id>` inside a topic;
- limitations.
---
## 7. Testing / quality gates
Run targeted tests after each TDD cycle, then broader tests before completion.
Suggested commands after inspection confirms test paths:
```bash
python -m pytest tests/test_hermes_state.py -q
python -m pytest tests/gateway/ -q
python -m pytest tests/ -o 'addopts=' -q
```
Do not ship without verifying disabled-feature backwards compatibility.
---
## 8. Definition of done for MVP
- `/topic` activates/checks Telegram DM multi-session mode.
- Root DM becomes a system lobby after activation.
- Onboarding message tells users to create new chats with the Telegram `+` button.
- Onboarding message can be pinned in private chat.
- User-created topics automatically become independent Hermes session lanes.
- `/new` in root gives instructions, not a new agent run.
- `/new` in a topic creates a new session in that topic and warns that `+` is preferred for parallel work.
- `/topic` in root lists unlinked old sessions.
- `/topic <session_id>` inside a topic restores that session and sends confirmation + last Hermes assistant message.
- Ownership checks prevent restoring other users' sessions.
- Already-linked sessions are not restored into a second topic in MVP.
- Existing Telegram behavior is unchanged when the feature is disabled.
- Tests and docs are included.
Binary file not shown.

After

Width:  |  Height:  |  Size: 115 KiB

+20
View File
@@ -845,6 +845,16 @@ def load_gateway_config() -> GatewayConfig:
):
if yaml_key in allow_mentions_cfg and not os.getenv(env_key):
os.environ[env_key] = str(allow_mentions_cfg[yaml_key]).lower()
# reply_to_mode: top-level preferred, falls back to extra.reply_to_mode
# YAML 1.1 parses bare 'off' as boolean False — coerce to string "off".
_discord_extra = discord_cfg.get("extra") if isinstance(discord_cfg.get("extra"), dict) else {}
_discord_rtm = (
discord_cfg["reply_to_mode"] if "reply_to_mode" in discord_cfg
else _discord_extra.get("reply_to_mode")
)
if _discord_rtm is not None and not os.getenv("DISCORD_REPLY_TO_MODE"):
_rtm_str = "off" if _discord_rtm is False else str(_discord_rtm).lower()
os.environ["DISCORD_REPLY_TO_MODE"] = _rtm_str
# Bridge top-level require_mention to Telegram when the telegram: section
# does not already provide one. Users often write "require_mention: true"
@@ -881,6 +891,16 @@ def load_gateway_config() -> GatewayConfig:
os.environ["TELEGRAM_REACTIONS"] = str(telegram_cfg["reactions"]).lower()
if "proxy_url" in telegram_cfg and not os.getenv("TELEGRAM_PROXY"):
os.environ["TELEGRAM_PROXY"] = str(telegram_cfg["proxy_url"]).strip()
# reply_to_mode: top-level preferred, falls back to extra.reply_to_mode
# YAML 1.1 parses bare 'off' as boolean False — coerce to string "off".
_telegram_extra = telegram_cfg.get("extra") if isinstance(telegram_cfg.get("extra"), dict) else {}
_telegram_rtm = (
telegram_cfg["reply_to_mode"] if "reply_to_mode" in telegram_cfg
else _telegram_extra.get("reply_to_mode")
)
if _telegram_rtm is not None and not os.getenv("TELEGRAM_REPLY_TO_MODE"):
_rtm_str = "off" if _telegram_rtm is False else str(_telegram_rtm).lower()
os.environ["TELEGRAM_REPLY_TO_MODE"] = _rtm_str
allowed_users = telegram_cfg.get("allow_from")
if allowed_users is not None and not os.getenv("TELEGRAM_ALLOWED_USERS"):
if isinstance(allowed_users, list):
+168 -20
View File
@@ -2,8 +2,8 @@
OpenAI-compatible API server platform adapter.
Exposes an HTTP server with endpoints:
- POST /v1/chat/completions OpenAI Chat Completions format (stateless; opt-in session continuity via X-Hermes-Session-Id header)
- POST /v1/responses OpenAI Responses API format (stateful via previous_response_id)
- POST /v1/chat/completions OpenAI Chat Completions format (stateless; opt-in session continuity via X-Hermes-Session-Id header; opt-in long-term memory scoping via X-Hermes-Session-Key header)
- POST /v1/responses OpenAI Responses API format (stateful via previous_response_id; X-Hermes-Session-Key supported)
- GET /v1/responses/{response_id} Retrieve a stored response
- DELETE /v1/responses/{response_id} Delete a stored response
- GET /v1/models lists hermes-agent as an available model
@@ -698,6 +698,71 @@ class APIServerAdapter(BasePlatformAdapter):
status=401,
)
# ------------------------------------------------------------------
# Session header helpers
# ------------------------------------------------------------------
# Soft length cap for session identifiers. Headers are bounded in
# aggregate by aiohttp (``client_max_size`` / default 8 KiB per
# header), but we impose a tighter limit on the session headers so a
# caller can't burn memory by passing a multi-kilobyte "session key".
# 256 chars is well above any realistic stable channel identifier
# (e.g. ``agent:main:webui:dm:user-42``) while staying small enough
# that the sanitized form is safe to pass into Honcho / state.db.
_MAX_SESSION_HEADER_LEN = 256
def _parse_session_key_header(
self, request: "web.Request"
) -> tuple[Optional[str], Optional["web.Response"]]:
"""Extract and validate the ``X-Hermes-Session-Key`` header.
The session key is a stable per-channel identifier that scopes
long-term memory (e.g. Honcho sessions) across transcripts. It
is independent of ``X-Hermes-Session-Id``: callers may send
either, both, or neither.
Returns ``(session_key, None)`` on success (with an empty/absent
header yielding ``None`` for the key), or ``(None, error_response)``
on validation failure.
Security: like session continuation, accepting a caller-supplied
memory scope requires API-key authentication so that an
unauthenticated client on a local-only server can't inject itself
into another user's long-term memory scope by guessing a key.
"""
raw = request.headers.get("X-Hermes-Session-Key", "").strip()
if not raw:
return None, None
if not self._api_key:
logger.warning(
"X-Hermes-Session-Key rejected: no API key configured. "
"Set API_SERVER_KEY to enable long-term memory scoping."
)
return None, web.json_response(
_openai_error(
"X-Hermes-Session-Key requires API key authentication. "
"Configure API_SERVER_KEY to enable this feature."
),
status=403,
)
# Reject control characters that could enable header injection on
# the echo path.
if re.search(r'[\r\n\x00]', raw):
return None, web.json_response(
{"error": {"message": "Invalid session key", "type": "invalid_request_error"}},
status=400,
)
if len(raw) > self._MAX_SESSION_HEADER_LEN:
return None, web.json_response(
{"error": {"message": "Session key too long", "type": "invalid_request_error"}},
status=400,
)
return raw, None
# ------------------------------------------------------------------
# Session DB helper
# ------------------------------------------------------------------
@@ -728,6 +793,7 @@ class APIServerAdapter(BasePlatformAdapter):
tool_progress_callback=None,
tool_start_callback=None,
tool_complete_callback=None,
gateway_session_key: Optional[str] = None,
) -> Any:
"""
Create an AIAgent instance using the gateway's runtime config.
@@ -736,6 +802,13 @@ class APIServerAdapter(BasePlatformAdapter):
base_url, etc. from config.yaml / env vars. Toolsets are resolved
from config.yaml platform_toolsets.api_server (same as all other
gateway platforms), falling back to the hermes-api-server default.
``gateway_session_key`` is a stable per-channel identifier supplied
by the client (via ``X-Hermes-Session-Key``). Unlike ``session_id``
which scopes the short-term transcript and rotates on /new, this
key is meant to persist across transcripts so long-term memory
providers (e.g. Honcho) can scope their per-chat state correctly
matching the semantics of the native gateway's ``session_key``.
"""
from run_agent import AIAgent
from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config, GatewayRunner
@@ -771,6 +844,7 @@ class APIServerAdapter(BasePlatformAdapter):
session_db=self._ensure_session_db(),
fallback_model=fallback_model,
reasoning_config=reasoning_config,
gateway_session_key=gateway_session_key,
)
return agent
@@ -854,6 +928,7 @@ class APIServerAdapter(BasePlatformAdapter):
"run_stop": True,
"tool_progress_events": True,
"session_continuity_header": "X-Hermes-Session-Id",
"session_key_header": "X-Hermes-Session-Key",
"cors": bool(self._cors_origins),
},
"endpoints": {
@@ -925,6 +1000,15 @@ class APIServerAdapter(BasePlatformAdapter):
status=400,
)
# Allow caller to scope long-term memory (e.g. Honcho) with a
# stable per-channel identifier via X-Hermes-Session-Key. This
# is independent of X-Hermes-Session-Id: the key persists across
# transcripts while the id rotates when the caller starts a new
# transcript (i.e. /new semantics). See _parse_session_key_header.
gateway_session_key, key_err = self._parse_session_key_header(request)
if key_err is not None:
return key_err
# Allow caller to continue an existing session by passing X-Hermes-Session-Id.
# When provided, history is loaded from state.db instead of from the request body.
#
@@ -1059,11 +1143,13 @@ class APIServerAdapter(BasePlatformAdapter):
tool_start_callback=_on_tool_start,
tool_complete_callback=_on_tool_complete,
agent_ref=agent_ref,
gateway_session_key=gateway_session_key,
))
return await self._write_sse_chat_completion(
request, completion_id, model_name, created, _stream_q,
agent_task, agent_ref, session_id=session_id,
gateway_session_key=gateway_session_key,
)
# Non-streaming: run the agent (with optional Idempotency-Key)
@@ -1073,6 +1159,7 @@ class APIServerAdapter(BasePlatformAdapter):
conversation_history=history,
ephemeral_system_prompt=system_prompt,
session_id=session_id,
gateway_session_key=gateway_session_key,
)
idempotency_key = request.headers.get("Idempotency-Key")
@@ -1122,11 +1209,17 @@ class APIServerAdapter(BasePlatformAdapter):
},
}
return web.json_response(response_data, headers={"X-Hermes-Session-Id": session_id})
response_headers = {
"X-Hermes-Session-Id": result.get("session_id", session_id),
}
if gateway_session_key:
response_headers["X-Hermes-Session-Key"] = gateway_session_key
return web.json_response(response_data, headers=response_headers)
async def _write_sse_chat_completion(
self, request: "web.Request", completion_id: str, model: str,
created: int, stream_q, agent_task, agent_ref=None, session_id: str = None,
gateway_session_key: str = None,
) -> "web.StreamResponse":
"""Write real streaming SSE from agent's stream_delta_callback queue.
@@ -1149,6 +1242,8 @@ class APIServerAdapter(BasePlatformAdapter):
sse_headers.update(cors)
if session_id:
sse_headers["X-Hermes-Session-Id"] = session_id
if gateway_session_key:
sse_headers["X-Hermes-Session-Key"] = gateway_session_key
response = web.StreamResponse(status=200, headers=sse_headers)
await response.prepare(request)
@@ -1272,6 +1367,7 @@ class APIServerAdapter(BasePlatformAdapter):
conversation: Optional[str],
store: bool,
session_id: str,
gateway_session_key: Optional[str] = None,
) -> "web.StreamResponse":
"""Write an SSE stream for POST /v1/responses (OpenAI Responses API).
@@ -1314,6 +1410,8 @@ class APIServerAdapter(BasePlatformAdapter):
sse_headers.update(cors)
if session_id:
sse_headers["X-Hermes-Session-Id"] = session_id
if gateway_session_key:
sse_headers["X-Hermes-Session-Key"] = gateway_session_key
response = web.StreamResponse(status=200, headers=sse_headers)
await response.prepare(request)
@@ -1763,6 +1861,11 @@ class APIServerAdapter(BasePlatformAdapter):
if auth_err:
return auth_err
# Long-term memory scope header (see chat_completions for details).
gateway_session_key, key_err = self._parse_session_key_header(request)
if key_err is not None:
return key_err
# Parse request body
try:
body = await request.json()
@@ -1914,6 +2017,7 @@ class APIServerAdapter(BasePlatformAdapter):
tool_start_callback=_on_tool_start,
tool_complete_callback=_on_tool_complete,
agent_ref=agent_ref,
gateway_session_key=gateway_session_key,
))
response_id = f"resp_{uuid.uuid4().hex[:28]}"
@@ -1934,6 +2038,7 @@ class APIServerAdapter(BasePlatformAdapter):
conversation=conversation,
store=store,
session_id=session_id,
gateway_session_key=gateway_session_key,
)
async def _compute_response():
@@ -1942,6 +2047,7 @@ class APIServerAdapter(BasePlatformAdapter):
conversation_history=conversation_history,
ephemeral_system_prompt=instructions,
session_id=session_id,
gateway_session_key=gateway_session_key,
)
idempotency_key = request.headers.get("Idempotency-Key")
@@ -2016,7 +2122,10 @@ class APIServerAdapter(BasePlatformAdapter):
if conversation:
self._response_store.set_conversation(conversation, response_id)
return web.json_response(response_data)
response_headers = {"X-Hermes-Session-Id": session_id}
if gateway_session_key:
response_headers["X-Hermes-Session-Key"] = gateway_session_key
return web.json_response(response_data, headers=response_headers)
# ------------------------------------------------------------------
# GET / DELETE response endpoints
@@ -2338,6 +2447,7 @@ class APIServerAdapter(BasePlatformAdapter):
tool_start_callback=None,
tool_complete_callback=None,
agent_ref: Optional[list] = None,
gateway_session_key: Optional[str] = None,
) -> tuple:
"""
Create an agent and run a conversation in a thread executor.
@@ -2360,6 +2470,7 @@ class APIServerAdapter(BasePlatformAdapter):
tool_progress_callback=tool_progress_callback,
tool_start_callback=tool_start_callback,
tool_complete_callback=tool_complete_callback,
gateway_session_key=gateway_session_key,
)
if agent_ref is not None:
agent_ref[0] = agent
@@ -2374,6 +2485,12 @@ class APIServerAdapter(BasePlatformAdapter):
"output_tokens": getattr(agent, "session_completion_tokens", 0) or 0,
"total_tokens": getattr(agent, "session_total_tokens", 0) or 0,
}
# Include the effective session ID in the result so callers
# (e.g. X-Hermes-Session-Id header) can track compression-
# triggered session rotations. (#16938)
_eff_sid = getattr(agent, "session_id", session_id)
if isinstance(_eff_sid, str) and _eff_sid:
result["session_id"] = _eff_sid
return result, usage
return await loop.run_in_executor(None, _run)
@@ -2453,6 +2570,11 @@ class APIServerAdapter(BasePlatformAdapter):
if auth_err:
return auth_err
# Long-term memory scope header (see chat_completions for details).
gateway_session_key, key_err = self._parse_session_key_header(request)
if key_err is not None:
return key_err
# Enforce concurrency limit
if len(self._run_streams) >= self._MAX_CONCURRENT_RUNS:
return web.json_response(
@@ -2561,6 +2683,7 @@ class APIServerAdapter(BasePlatformAdapter):
session_id=session_id,
stream_delta_callback=_text_cb,
tool_progress_callback=event_cb,
gateway_session_key=gateway_session_key,
)
self._active_run_agents[run_id] = agent
def _run_sync():
@@ -2578,21 +2701,39 @@ class APIServerAdapter(BasePlatformAdapter):
return r, u
result, usage = await asyncio.get_running_loop().run_in_executor(None, _run_sync)
final_response = result.get("final_response", "") if isinstance(result, dict) else ""
q.put_nowait({
"event": "run.completed",
"run_id": run_id,
"timestamp": time.time(),
"output": final_response,
"usage": usage,
})
self._set_run_status(
run_id,
"completed",
output=final_response,
usage=usage,
last_event="run.completed",
)
# Check for structured failure (non-retryable client errors like
# 401/400 return failed=True instead of raising, so the except
# block below never fires — issue #15561).
if isinstance(result, dict) and result.get("failed"):
error_msg = result.get("error") or "agent run failed"
q.put_nowait({
"event": "run.failed",
"run_id": run_id,
"timestamp": time.time(),
"error": error_msg,
})
self._set_run_status(
run_id,
"failed",
error=error_msg,
last_event="run.failed",
)
else:
final_response = result.get("final_response", "") if isinstance(result, dict) else ""
q.put_nowait({
"event": "run.completed",
"run_id": run_id,
"timestamp": time.time(),
"output": final_response,
"usage": usage,
})
self._set_run_status(
run_id,
"completed",
output=final_response,
usage=usage,
last_event="run.completed",
)
except asyncio.CancelledError:
self._set_run_status(
run_id,
@@ -2643,7 +2784,14 @@ class APIServerAdapter(BasePlatformAdapter):
if hasattr(task, "add_done_callback"):
task.add_done_callback(self._background_tasks.discard)
return web.json_response({"run_id": run_id, "status": "started"}, status=202)
response_headers = (
{"X-Hermes-Session-Key": gateway_session_key} if gateway_session_key else {}
)
return web.json_response(
{"run_id": run_id, "status": "started"},
status=202,
headers=response_headers,
)
async def _handle_get_run(self, request: "web.Request") -> "web.Response":
"""GET /v1/runs/{run_id} — return pollable run status for external UIs."""
+30 -5
View File
@@ -2506,7 +2506,13 @@ class BasePlatformAdapter(ABC):
_r = await self._send_with_retry(
chat_id=event.source.chat_id,
content=_text,
reply_to=event.message_id,
reply_to=(
event.reply_to_message_id
if event.source.platform == Platform.FEISHU
and event.source.thread_id
and event.reply_to_message_id
else event.message_id
),
metadata=thread_meta,
)
if _eph_ttl > 0 and _r.success and _r.message_id:
@@ -2606,7 +2612,13 @@ class BasePlatformAdapter(ABC):
_r = await self._send_with_retry(
chat_id=event.source.chat_id,
content=_text,
reply_to=event.message_id,
reply_to=(
event.reply_to_message_id
if event.source.platform == Platform.FEISHU
and event.source.thread_id
and event.reply_to_message_id
else event.message_id
),
metadata=_thread_meta,
)
if _eph_ttl > 0 and _r.success and _r.message_id:
@@ -2663,10 +2675,18 @@ class BasePlatformAdapter(ABC):
mode = os.getenv("HERMES_HUMAN_DELAY_MODE", "off").lower()
if mode == "off":
return 0.0
min_ms = int(os.getenv("HERMES_HUMAN_DELAY_MIN_MS", "800"))
max_ms = int(os.getenv("HERMES_HUMAN_DELAY_MAX_MS", "2500"))
if mode == "natural":
min_ms, max_ms = 800, 2500
return random.uniform(min_ms / 1000.0, max_ms / 1000.0)
# custom mode — tolerate malformed env vars instead of crashing.
try:
min_ms = int(os.getenv("HERMES_HUMAN_DELAY_MIN_MS", "800"))
except (TypeError, ValueError):
min_ms = 800
try:
max_ms = int(os.getenv("HERMES_HUMAN_DELAY_MAX_MS", "2500"))
except (TypeError, ValueError):
max_ms = 2500
return random.uniform(min_ms / 1000.0, max_ms / 1000.0)
async def _process_message_background(self, event: MessageEvent, session_key: str) -> None:
@@ -2810,10 +2830,15 @@ class BasePlatformAdapter(ABC):
# Send the text portion
if text_content:
logger.info("[%s] Sending response (%d chars) to %s", self.name, len(text_content), event.source.chat_id)
_reply_anchor = (
event.reply_to_message_id
if event.source.platform == Platform.FEISHU and event.source.thread_id and event.reply_to_message_id
else event.message_id
)
result = await self._send_with_retry(
chat_id=event.source.chat_id,
content=text_content,
reply_to=event.message_id,
reply_to=_reply_anchor,
metadata=_thread_metadata,
)
_record_delivery(result)
+13 -2
View File
@@ -720,11 +720,22 @@ class DiscordAdapter(BasePlatformAdapter):
return
# If humans are mentioned but we're not → not for us
# (preserves old DISCORD_IGNORE_NO_MENTION=true behavior)
# EXCEPT in free-response channels where the bot should
# answer regardless of who is mentioned.
_ignore_no_mention = os.getenv(
"DISCORD_IGNORE_NO_MENTION", "true"
).lower() in ("true", "1", "yes")
if _ignore_no_mention and not _self_mentioned and not _other_bots_mentioned:
return
_channel_id = str(message.channel.id)
_parent_id = None
if hasattr(message.channel, "parent_id") and message.channel.parent_id:
_parent_id = str(message.channel.parent_id)
_free_channels = adapter_self._discord_free_response_channels()
_channel_ids = {_channel_id}
if _parent_id:
_channel_ids.add(_parent_id)
if "*" not in _free_channels and not (_channel_ids & _free_channels):
return
await self._handle_message(message)
@@ -3797,7 +3808,7 @@ class DiscordAdapter(BasePlatformAdapter):
if not is_thread and not isinstance(message.channel, discord.DMChannel):
no_thread_channels_raw = os.getenv("DISCORD_NO_THREAD_CHANNELS", "")
no_thread_channels = {ch.strip() for ch in no_thread_channels_raw.split(",") if ch.strip()}
skip_thread = bool(channel_ids & no_thread_channels) or is_free_channel
skip_thread = bool(channel_ids & no_thread_channels)
auto_thread = os.getenv("DISCORD_AUTO_THREAD", "true").lower() in ("true", "1", "yes")
is_reply_message = getattr(message, "type", None) == discord.MessageType.reply
if auto_thread and not skip_thread and not is_voice_linked_channel and not is_reply_message:
+12
View File
@@ -416,6 +416,18 @@ class EmailAdapter(BasePlatformAdapter):
logger.debug("[Email] Dropping automated sender at dispatch: %s", sender_addr)
return
# Skip senders not in EMAIL_ALLOWED_USERS — prevents the adapter
# from creating a MessageEvent (and thus thread context) for senders
# that the gateway will never authorize. Without this early guard,
# a race between dispatch and authorization can result in the adapter
# sending a reply even though the handler returned None.
allowed_raw = os.getenv("EMAIL_ALLOWED_USERS", "").strip()
if allowed_raw:
allowed = {addr.strip().lower() for addr in allowed_raw.split(",") if addr.strip()}
if sender_addr.lower() not in allowed:
logger.debug("[Email] Dropping non-allowlisted sender at dispatch: %s", sender_addr)
return
subject = msg_data["subject"]
body = msg_data["body"].strip()
attachments = msg_data["attachments"]
+75 -37
View File
@@ -153,6 +153,9 @@ _MARKDOWN_HINT_RE = re.compile(
r"(^#{1,6}\s)|(^\s*[-*]\s)|(^\s*\d+\.\s)|(^\s*---+\s*$)|(```)|(`[^`\n]+`)|(\*\*[^*\n].+?\*\*)|(~~[^~\n].+?~~)|(<u>.+?</u>)|(\*[^*\n]+\*)|(\[[^\]]+\]\([^)]+\))|(^>\s)",
re.MULTILINE,
)
# Detect markdown tables: a line starting with | followed by a separator line.
# Feishu post-type 'md' elements do not render tables, so we force text mode.
_MARKDOWN_TABLE_RE = re.compile(r"^\|.*\|\n\|[-|: ]+\|", re.MULTILINE)
_MARKDOWN_LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")
_MARKDOWN_FENCE_OPEN_RE = re.compile(r"^```([^\n`]*)\s*$")
_MARKDOWN_FENCE_CLOSE_RE = re.compile(r"^```\s*$")
@@ -2757,9 +2760,11 @@ class FeishuAdapter(BasePlatformAdapter):
if hint:
text = f"{hint}\n\n{text}" if text else hint
thread_id = getattr(message, "thread_id", None) or getattr(message, "root_id", None) or None
reply_to_message_id = (
getattr(message, "parent_id", None)
or getattr(message, "upper_message_id", None)
or getattr(message, "root_id", None)
or None
)
reply_to_text = await self._fetch_message_text(reply_to_message_id) if reply_to_message_id else None
@@ -2791,7 +2796,7 @@ class FeishuAdapter(BasePlatformAdapter):
chat_type=self._resolve_source_chat_type(chat_info=chat_info, event_chat_type=chat_type),
user_id=sender_profile["user_id"],
user_name=sender_profile["user_name"],
thread_id=getattr(message, "thread_id", None) or None,
thread_id=thread_id,
user_id_alt=sender_profile["user_id_alt"],
is_bot=is_bot,
)
@@ -3860,47 +3865,50 @@ class FeishuAdapter(BasePlatformAdapter):
and self-sent bot event filtering.
Populates ``_bot_open_id`` and ``_bot_name`` from /open-apis/bot/v3/info
(no extra scopes required beyond the tenant access token). Falls back to
the application info endpoint for ``_bot_name`` only when the first probe
doesn't return it. Each field is hydrated independently — a value already
supplied via env vars (FEISHU_BOT_OPEN_ID / FEISHU_BOT_USER_ID /
FEISHU_BOT_NAME) is preserved and skips its probe.
(no extra scopes required beyond the tenant access token). The probe
always runs when a client is available so stale env vars from app/bot
migrations do not break group @mention gating. Falls back to the
application info endpoint for ``_bot_name`` only when the first probe
doesn't return it. If the probe fails, env-provided values are preserved.
"""
if not self._client:
return
if self._bot_open_id and self._bot_name:
# Everything the self-send filter and precise mention gate need is
# already in place; nothing to probe.
return
# Primary probe: /open-apis/bot/v3/info — returns bot_name + open_id, no
# extra scopes required. This is the same endpoint the onboarding wizard
# uses via probe_bot().
if not self._bot_open_id or not self._bot_name:
try:
req = (
BaseRequest.builder()
.http_method(HttpMethod.GET)
.uri("/open-apis/bot/v3/info")
.token_types({AccessTokenType.TENANT})
.build()
)
resp = await asyncio.to_thread(self._client.request, req)
content = getattr(getattr(resp, "raw", None), "content", None)
if content:
payload = json.loads(content)
parsed = _parse_bot_response(payload) or {}
open_id = (parsed.get("bot_open_id") or "").strip()
bot_name = (parsed.get("bot_name") or "").strip()
if open_id and not self._bot_open_id:
self._bot_open_id = open_id
if bot_name and not self._bot_name:
self._bot_name = bot_name
except Exception:
logger.debug(
"[Feishu] /bot/v3/info probe failed during hydration",
exc_info=True,
)
try:
req = (
BaseRequest.builder()
.http_method(HttpMethod.GET)
.uri("/open-apis/bot/v3/info")
.token_types({AccessTokenType.TENANT})
.build()
)
resp = await asyncio.to_thread(self._client.request, req)
content = getattr(getattr(resp, "raw", None), "content", None)
if content:
payload = json.loads(content)
parsed = _parse_bot_response(payload) or {}
open_id = (parsed.get("bot_open_id") or "").strip()
bot_name = (parsed.get("bot_name") or "").strip()
if open_id:
if self._bot_open_id and self._bot_open_id != open_id:
logger.warning(
"[Feishu] FEISHU_BOT_OPEN_ID is stale; using /bot/v3/info open_id for group @mention gating."
)
self._bot_open_id = open_id
if bot_name:
if self._bot_name and self._bot_name != bot_name:
logger.info(
"[Feishu] FEISHU_BOT_NAME differs from /bot/v3/info; using hydrated bot name for group @mention gating."
)
self._bot_name = bot_name
except Exception:
logger.debug(
"[Feishu] /bot/v3/info probe failed during hydration",
exc_info=True,
)
# Fallback probe for _bot_name only: application info endpoint. Needs
# admin:app.info:readonly or application:application:self_manage scope,
@@ -3945,7 +3953,14 @@ class FeishuAdapter(BasePlatformAdapter):
if isinstance(seen_data, list):
entries: Dict[str, float] = {str(item).strip(): 0.0 for item in seen_data if str(item).strip()}
elif isinstance(seen_data, dict):
entries = {k: float(v) for k, v in seen_data.items() if isinstance(k, str) and k.strip()}
entries = {}
for key, value in seen_data.items():
if not isinstance(key, str) or not key.strip():
continue
try:
entries[key] = float(value)
except (TypeError, ValueError):
continue
else:
return
# Filter out TTL-expired entries (entries saved with ts=0.0 are treated as immortal
@@ -3990,6 +4005,12 @@ class FeishuAdapter(BasePlatformAdapter):
# =========================================================================
def _build_outbound_payload(self, content: str) -> tuple[str, str]:
# Feishu post-type 'md' elements do not render markdown tables; sending
# table content as post causes the message to appear blank on the client.
# Force plain text for anything that looks like a markdown table.
if _MARKDOWN_TABLE_RE.search(content):
text_payload = {"text": content}
return "text", json.dumps(text_payload, ensure_ascii=False)
if _MARKDOWN_HINT_RE.search(content):
return "post", _build_markdown_post_payload(content)
text_payload = {"text": content}
@@ -4085,7 +4106,15 @@ class FeishuAdapter(BasePlatformAdapter):
content=payload,
uuid_value=str(uuid.uuid4()),
)
request = self._build_create_message_request("chat_id", body)
# Detect whether chat_id is a user open_id (DM) or a chat_id (group).
# Feishu API expects receive_id_type="open_id" for user DMs (ou_ prefix)
# and receive_id_type="chat_id" for group chats (oc_ prefix, which IS
# the chat_id format — see https://open.feishu.cn/document/).
if chat_id.startswith("ou_"):
receive_id_type = "open_id"
else:
receive_id_type = "chat_id"
request = self._build_create_message_request(receive_id_type, body)
return await asyncio.to_thread(self._client.im.v1.message.create, request)
@staticmethod
@@ -4227,6 +4256,15 @@ class FeishuAdapter(BasePlatformAdapter):
if active_reply_to and not self._response_succeeded(response):
code = getattr(response, "code", None)
if code in _FEISHU_REPLY_FALLBACK_CODES:
if (metadata or {}).get("thread_id"):
logger.warning(
"[Feishu] Reply to %s failed in thread %s (code %s — message withdrawn/missing); "
"skipping top-level fallback to avoid creating a new topic",
active_reply_to,
(metadata or {}).get("thread_id"),
code,
)
return response
logger.warning(
"[Feishu] Reply to %s failed (code %s — message withdrawn/missing); "
"falling back to new message in chat %s",
+10 -6
View File
@@ -222,33 +222,37 @@ class ThreadParticipationTracker:
def __init__(self, platform_name: str, max_tracked: int = 500):
self._platform = platform_name
self._max_tracked = max_tracked
self._threads: set = self._load()
self._threads: dict[str, None] = {
str(thread_id): None for thread_id in self._load()
}
def _state_path(self) -> Path:
from hermes_constants import get_hermes_home
return get_hermes_home() / f"{self._platform}_threads.json"
def _load(self) -> set:
def _load(self) -> list[str]:
path = self._state_path()
if path.exists():
try:
return set(json.loads(path.read_text(encoding="utf-8")))
data = json.loads(path.read_text(encoding="utf-8"))
if isinstance(data, list):
return [str(thread_id) for thread_id in data]
except Exception:
pass
return set()
return []
def _save(self) -> None:
path = self._state_path()
thread_list = list(self._threads)
if len(thread_list) > self._max_tracked:
thread_list = thread_list[-self._max_tracked:]
self._threads = set(thread_list)
self._threads = {thread_id: None for thread_id in thread_list}
atomic_json_write(path, thread_list, indent=None)
def mark(self, thread_id: str) -> None:
"""Mark *thread_id* as participated and persist."""
if thread_id not in self._threads:
self._threads.add(thread_id)
self._threads[thread_id] = None
self._save()
def __contains__(self, thread_id: str) -> bool:
+12 -1
View File
@@ -397,13 +397,24 @@ class QQAdapter(BasePlatformAdapter):
await self._session.close()
self._session = None
self._session = aiohttp.ClientSession()
# Honor WSL proxy env for QQ WebSocket. Hermes upgrades overwrite this
# local patch, so QQ can regress to direct-connect timeouts after update.
self._session = aiohttp.ClientSession(trust_env=True)
ws_proxy = (
os.getenv("WSS_PROXY")
or os.getenv("wss_proxy")
or os.getenv("HTTPS_PROXY")
or os.getenv("https_proxy")
or os.getenv("ALL_PROXY")
or os.getenv("all_proxy")
)
self._ws = await self._session.ws_connect(
gateway_url,
headers={
"User-Agent": build_user_agent(),
},
timeout=CONNECT_TIMEOUT_SECONDS,
proxy=ws_proxy,
)
logger.info("[%s] WebSocket connected to %s", self._log_tag, gateway_url)
+9 -5
View File
@@ -10,7 +10,7 @@ Shares credentials with the optional telephony skill — same env vars:
Gateway-specific env vars:
- SMS_WEBHOOK_PORT (default 8080)
- SMS_WEBHOOK_HOST (default 0.0.0.0)
- SMS_WEBHOOK_HOST (default 127.0.0.1)
- SMS_WEBHOOK_URL (public URL for Twilio signature validation required)
- SMS_INSECURE_NO_SIGNATURE (true to disable signature validation dev only)
- SMS_ALLOWED_USERS (comma-separated E.164 phone numbers)
@@ -41,7 +41,7 @@ logger = logging.getLogger(__name__)
TWILIO_API_BASE = "https://api.twilio.com/2010-04-01/Accounts"
MAX_SMS_LENGTH = 1600 # ~10 SMS segments
DEFAULT_WEBHOOK_PORT = 8080
DEFAULT_WEBHOOK_HOST = "0.0.0.0"
DEFAULT_WEBHOOK_HOST = "127.0.0.1"
def check_sms_requirements() -> bool:
@@ -91,19 +91,23 @@ class SmsAdapter(BasePlatformAdapter):
from aiohttp import web
if not self._from_number:
logger.error("[sms] TWILIO_PHONE_NUMBER not set — cannot send replies")
msg = "[sms] TWILIO_PHONE_NUMBER not set — cannot send replies"
logger.error(msg)
self._set_fatal_error("sms_missing_phone_number", msg, retryable=False)
return False
insecure_no_sig = os.getenv("SMS_INSECURE_NO_SIGNATURE", "").lower() == "true"
if not self._webhook_url and not insecure_no_sig:
logger.error(
msg = (
"[sms] Refusing to start: SMS_WEBHOOK_URL is required for Twilio "
"signature validation. Set it to the public URL configured in your "
"Twilio console (e.g. https://example.com/webhooks/twilio). "
"For local development without validation, set "
"SMS_INSECURE_NO_SIGNATURE=true (NOT recommended for production).",
"SMS_INSECURE_NO_SIGNATURE=true (NOT recommended for production)."
)
logger.error(msg)
self._set_fatal_error("sms_missing_webhook_url", msg, retryable=False)
return False
if insecure_no_sig and not self._webhook_url:
+84 -22
View File
@@ -353,7 +353,10 @@ class TelegramAdapter(BasePlatformAdapter):
@classmethod
def _message_thread_id_for_typing(cls, thread_id: Optional[str]) -> Optional[int]:
if not thread_id:
# Mirrors _message_thread_id_for_send: the General forum topic (thread id
# "1") is represented as "no thread id" on the wire. User-created topics
# keep their real id so typing stays scoped to that topic.
if not thread_id or str(thread_id) == cls._GENERAL_TOPIC_THREAD_ID:
return None
return int(thread_id)
@@ -688,6 +691,29 @@ class TelegramAdapter(BasePlatformAdapter):
)
return None
async def rename_dm_topic(
self,
chat_id: int,
thread_id: int,
name: str,
) -> None:
"""Rename a forum topic in a private (DM) chat."""
if not self._bot:
return
try:
chat_id_arg = int(chat_id)
except (TypeError, ValueError):
chat_id_arg = chat_id
await self._bot.edit_forum_topic(
chat_id=chat_id_arg,
message_thread_id=int(thread_id),
name=name,
)
logger.info(
"[%s] Renamed DM topic in chat %s thread_id=%s -> '%s'",
self.name, chat_id, thread_id, name,
)
def _persist_dm_topic_thread_id(self, chat_id: int, topic_name: str, thread_id: int) -> None:
"""Save a newly created thread_id back into config.yaml so it persists across restarts."""
try:
@@ -2267,13 +2293,54 @@ class TelegramAdapter(BasePlatformAdapter):
)
return SendResult(success=True, message_id=str(msg.message_id))
except Exception as e:
logger.error(
"[%s] Failed to send Telegram local image, falling back to base adapter: %s",
self.name,
e,
exc_info=True,
error_str = str(e)
# Dimension-related errors are the expected case for valid image
# files that Telegram just refuses as photos (screenshots, extreme
# aspect ratios). Log at INFO because the document fallback is
# the correct path. Any other send_photo failure also falls back
# to document (rate limits, corrupt file markers, format edge
# cases), but at WARNING because it's unexpected and worth
# surfacing in logs.
is_dim_error = (
"Photo_invalid_dimensions" in error_str
or "PHOTO_INVALID_DIMENSIONS" in error_str
)
return await super().send_image_file(chat_id, image_path, caption, reply_to)
if is_dim_error:
logger.info(
"[%s] Image dimensions exceed Telegram photo limits, "
"sending as document: %s",
self.name,
image_path,
)
else:
logger.warning(
"[%s] Failed to send Telegram local image as photo, "
"trying document fallback: %s",
self.name,
e,
exc_info=True,
)
# Fallback to sending as document (file) — no dimension limit,
# only 50MB size limit. If even that fails, fall back to the
# base adapter's text-only "Image: /path" rendering.
try:
return await self.send_document(
chat_id=chat_id,
file_path=image_path,
caption=caption,
file_name=os.path.basename(image_path),
reply_to=reply_to,
metadata=metadata,
)
except Exception as doc_err:
logger.error(
"[%s] Failed to send Telegram local image as document, "
"falling back to base adapter: %s",
self.name,
doc_err,
exc_info=True,
)
return await super().send_image_file(chat_id, image_path, caption, reply_to)
async def send_document(
self,
@@ -2444,21 +2511,16 @@ class TelegramAdapter(BasePlatformAdapter):
try:
_typing_thread = self._metadata_thread_id(metadata)
message_thread_id = self._message_thread_id_for_typing(_typing_thread)
try:
await self._bot.send_chat_action(
chat_id=int(chat_id),
action="typing",
message_thread_id=message_thread_id,
)
except Exception as e:
if message_thread_id is not None and self._is_thread_not_found_error(e):
await self._bot.send_chat_action(
chat_id=int(chat_id),
action="typing",
message_thread_id=None,
)
else:
raise
# No retry-without-thread fallback here: _message_thread_id_for_typing
# already maps the forum General topic to None, so any non-None value
# reaching this call is a user-created topic. If Telegram rejects it
# (e.g. topic deleted mid-session), we swallow the failure rather than
# showing a typing indicator in the wrong chat/All Messages.
await self._bot.send_chat_action(
chat_id=int(chat_id),
action="typing",
message_thread_id=message_thread_id,
)
except Exception as e:
# Typing failures are non-fatal; log at debug level only.
logger.debug(
+10 -7
View File
@@ -185,10 +185,13 @@ async def _query_doh_provider(
async def discover_fallback_ips() -> list[str]:
"""Auto-discover Telegram API IPs via DNS-over-HTTPS.
Resolves api.telegram.org through Google and Cloudflare DoH, collects all
unique IPs, and excludes the system-DNS-resolved IP (which is presumably
unreachable on this network). Falls back to a hardcoded seed list when DoH
is also unavailable.
Resolves api.telegram.org through Google and Cloudflare DoH and returns all
unique A records. IPs that match the local system resolver are kept rather
than excluded: in many networks the system-DNS IP is the most reliable path
to api.telegram.org and a transient primary-path failure should be retried
against the same address via the IP-rewrite path before the seed list is
consulted (#14520). Falls back to a hardcoded seed list only when DoH
yields no usable answers.
"""
async with httpx.AsyncClient(timeout=httpx.Timeout(_DOH_TIMEOUT)) as client:
doh_tasks = [_query_doh_provider(client, p) for p in _DOH_PROVIDERS]
@@ -203,11 +206,11 @@ async def discover_fallback_ips() -> list[str]:
if isinstance(r, list):
doh_ips.extend(r)
# Deduplicate preserving order, exclude system-DNS IPs
# Deduplicate preserving order
seen: set[str] = set()
candidates: list[str] = []
for ip in doh_ips:
if ip not in seen and ip not in system_ips:
if ip not in seen:
seen.add(ip)
candidates.append(ip)
@@ -219,7 +222,7 @@ async def discover_fallback_ips() -> list[str]:
return validated
logger.info(
"DoH discovery yielded no new IPs (system DNS: %s); using seed fallback IPs %s",
"DoH discovery yielded no usable IPs (system DNS: %s); using seed fallback IPs %s",
", ".join(system_ips) or "unknown",
", ".join(_SEED_FALLBACK_IPS),
)
+3
View File
@@ -142,6 +142,7 @@ class WeComAdapter(BasePlatformAdapter):
"""WeCom AI Bot adapter backed by a persistent WebSocket connection."""
MAX_MESSAGE_LENGTH = MAX_MESSAGE_LENGTH
SUPPORTS_MESSAGE_EDITING = False
# Threshold for detecting WeCom client-side message splits.
# When a chunk is near the 4000-char limit, a continuation is almost certain.
_SPLIT_THRESHOLD = 3900
@@ -1014,6 +1015,8 @@ class WeComAdapter(BasePlatformAdapter):
if not aes_key:
raise ValueError("aes_key is required")
# WeCom doesn't pad base64 keys; add padding if needed
aes_key = aes_key + '=' * ((4 - len(aes_key) % 4) % 4)
key = base64.b64decode(aes_key)
if len(key) != 32:
raise ValueError(f"Invalid WeCom AES key length: expected 32 bytes, got {len(key)}")
+9 -2
View File
@@ -1333,6 +1333,15 @@ class WeixinAdapter(BasePlatformAdapter):
if message_id and self._dedup.is_duplicate(message_id):
return
# Secondary content-fingerprint dedup for text messages
item_list = message.get("item_list") or []
text = _extract_text(item_list)
if text:
content_key = f"content:{sender_id}:{hashlib.md5(text.encode()).hexdigest()}"
if self._dedup.is_duplicate(content_key):
logger.debug("[%s] Content-dedup: skipping duplicate message from %s", self.name, sender_id)
return
chat_type, effective_chat_id = _guess_chat_type(message, self._account_id)
if chat_type == "group":
if self._group_policy == "disabled":
@@ -1347,8 +1356,6 @@ class WeixinAdapter(BasePlatformAdapter):
self._token_store.set(self._account_id, sender_id, context_token)
asyncio.create_task(self._maybe_fetch_typing_ticket(sender_id, context_token or None))
item_list = message.get("item_list") or []
text = _extract_text(item_list)
media_paths: List[str] = []
media_types: List[str] = []
+979 -277
View File
File diff suppressed because it is too large Load Diff
+5 -4
View File
@@ -1121,7 +1121,7 @@ class SessionStore:
self._save()
return count
def reset_session(self, session_key: str) -> Optional[SessionEntry]:
def reset_session(self, session_key: str, display_name: Optional[str] = None) -> Optional[SessionEntry]:
"""Force reset a session, creating a new session ID."""
db_end_session_id = None
db_create_kwargs = None
@@ -1145,7 +1145,7 @@ class SessionStore:
created_at=now,
updated_at=now,
origin=old_entry.origin,
display_name=old_entry.display_name,
display_name=display_name if display_name is not None else old_entry.display_name,
platform=old_entry.platform,
chat_type=old_entry.chat_type,
is_fresh_reset=True,
@@ -1276,8 +1276,9 @@ class SessionStore:
# Also write legacy JSONL (keeps existing tooling working during transition)
transcript_path = self.get_transcript_path(session_id)
with open(transcript_path, "a", encoding="utf-8") as f:
f.write(json.dumps(message, ensure_ascii=False) + "\n")
with self._lock:
with open(transcript_path, "a", encoding="utf-8") as f:
f.write(json.dumps(message, ensure_ascii=False) + "\n")
def rewrite_transcript(self, session_id: str, messages: List[Dict[str, Any]]) -> None:
"""Replace the entire transcript for a session with new messages.
+107 -51
View File
@@ -637,6 +637,8 @@ def release_all_scoped_locks(
_TAKEOVER_MARKER_FILENAME = ".gateway-takeover.json"
_TAKEOVER_MARKER_TTL_S = 60 # Marker older than this is treated as stale
_PLANNED_STOP_MARKER_FILENAME = ".gateway-planned-stop.json"
_PLANNED_STOP_MARKER_TTL_S = 60
def _get_takeover_marker_path() -> Path:
@@ -645,6 +647,67 @@ def _get_takeover_marker_path() -> Path:
return home / _TAKEOVER_MARKER_FILENAME
def _get_planned_stop_marker_path() -> Path:
"""Return the path to the intentional gateway stop marker file."""
home = get_hermes_home()
return home / _PLANNED_STOP_MARKER_FILENAME
def _marker_is_stale(written_at: str, ttl_s: int) -> bool:
try:
written_dt = datetime.fromisoformat(written_at)
age = (datetime.now(timezone.utc) - written_dt).total_seconds()
return age > ttl_s
except (TypeError, ValueError):
return True
def _consume_pid_marker_for_self(
path: Path,
*,
pid_field: str,
start_time_field: str,
ttl_s: int,
) -> bool:
record = _read_json_file(path)
if not record:
return False
try:
target_pid = int(record[pid_field])
target_start_time = record.get(start_time_field)
written_at = record.get("written_at") or ""
except (KeyError, TypeError, ValueError):
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
if _marker_is_stale(written_at, ttl_s):
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
our_pid = os.getpid()
our_start_time = _get_process_start_time(our_pid)
matches = (
target_pid == our_pid
and target_start_time is not None
and our_start_time is not None
and target_start_time == our_start_time
)
try:
path.unlink(missing_ok=True)
except OSError:
pass
return matches
def write_takeover_marker(target_pid: int) -> bool:
"""Record that ``target_pid`` is being replaced by the current process.
@@ -681,59 +744,13 @@ def consume_takeover_marker_for_self() -> bool:
Always unlinks the marker on match (and on detected staleness) so
subsequent unrelated signals don't re-trigger.
"""
path = _get_takeover_marker_path()
record = _read_json_file(path)
if not record:
return False
# Any malformed or stale marker → drop it and return False
try:
target_pid = int(record["target_pid"])
target_start_time = record.get("target_start_time")
written_at = record.get("written_at") or ""
except (KeyError, TypeError, ValueError):
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
# TTL guard: a stale marker older than _TAKEOVER_MARKER_TTL_S is ignored.
stale = False
try:
written_dt = datetime.fromisoformat(written_at)
age = (datetime.now(timezone.utc) - written_dt).total_seconds()
if age > _TAKEOVER_MARKER_TTL_S:
stale = True
except (TypeError, ValueError):
stale = True # Unparseable timestamp — treat as stale
if stale:
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
# Does the marker name THIS process?
our_pid = os.getpid()
our_start_time = _get_process_start_time(our_pid)
matches = (
target_pid == our_pid
and target_start_time is not None
and our_start_time is not None
and target_start_time == our_start_time
return _consume_pid_marker_for_self(
_get_takeover_marker_path(),
pid_field="target_pid",
start_time_field="target_start_time",
ttl_s=_TAKEOVER_MARKER_TTL_S,
)
# Consume the marker whether it matched or not — a marker that doesn't
# match our identity is stale-for-us anyway.
try:
path.unlink(missing_ok=True)
except OSError:
pass
return matches
def clear_takeover_marker() -> None:
"""Remove the takeover marker unconditionally. Safe to call repeatedly."""
@@ -743,6 +760,45 @@ def clear_takeover_marker() -> None:
pass
def write_planned_stop_marker(target_pid: int) -> bool:
"""Record that ``target_pid`` is being stopped intentionally.
The gateway exits non-zero for unexpected SIGTERM so service managers can
revive it. Service stop commands send the same SIGTERM, so the CLI writes
this short-lived marker first to let the target process exit cleanly.
"""
try:
target_start_time = _get_process_start_time(target_pid)
record = {
"target_pid": target_pid,
"target_start_time": target_start_time,
"stopper_pid": os.getpid(),
"written_at": _utc_now_iso(),
}
_write_json_file(_get_planned_stop_marker_path(), record)
return True
except (OSError, PermissionError):
return False
def consume_planned_stop_marker_for_self() -> bool:
"""Return True when the current process is being intentionally stopped."""
return _consume_pid_marker_for_self(
_get_planned_stop_marker_path(),
pid_field="target_pid",
start_time_field="target_start_time",
ttl_s=_PLANNED_STOP_MARKER_TTL_S,
)
def clear_planned_stop_marker() -> None:
"""Remove the planned-stop marker unconditionally."""
try:
_get_planned_stop_marker_path().unlink(missing_ok=True)
except OSError:
pass
def get_running_pid(
pid_path: Optional[Path] = None,
*,
+304 -11
View File
@@ -416,6 +416,40 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
),
}
# Auto-extend PROVIDER_REGISTRY with any api-key provider registered in
# providers/ that is not already declared above. New providers only need a
# providers/*.py file — no edits to this file required.
try:
from providers import list_providers as _list_providers_for_registry
for _pp in _list_providers_for_registry():
if _pp.name in PROVIDER_REGISTRY:
continue
if _pp.auth_type != "api_key" or not _pp.env_vars:
continue
# Skip providers that need custom token resolution or are special-cased
# in resolve_provider() (copilot/kimi/zai have bespoke token refresh;
# openrouter/custom are aggregator/user-supplied and handled outside
# the registry — adding them here breaks runtime_provider resolution
# that relies on `openrouter not in PROVIDER_REGISTRY`).
if _pp.name in {"copilot", "kimi-coding", "kimi-coding-cn", "zai", "openrouter", "custom"}:
continue
_api_key_vars = tuple(v for v in _pp.env_vars if not v.endswith("_BASE_URL") and not v.endswith("_URL"))
_base_url_var = next((v for v in _pp.env_vars if v.endswith("_BASE_URL") or v.endswith("_URL")), None)
PROVIDER_REGISTRY[_pp.name] = ProviderConfig(
id=_pp.name,
name=_pp.display_name or _pp.name,
auth_type="api_key",
inference_base_url=_pp.base_url,
api_key_env_vars=_api_key_vars or _pp.env_vars,
base_url_env_var=_base_url_var or "",
)
# Also register aliases so resolve_provider() resolves them
for _alias in _pp.aliases:
if _alias not in PROVIDER_REGISTRY:
PROVIDER_REGISTRY[_alias] = PROVIDER_REGISTRY[_pp.name]
except Exception:
pass
# =============================================================================
# Anthropic Key Helper
@@ -1195,6 +1229,17 @@ def resolve_provider(
"vllm": "custom", "llamacpp": "custom",
"llama.cpp": "custom", "llama-cpp": "custom",
}
# Extend with aliases declared in providers/*.py that aren't already mapped.
# This keeps providers/ as the single source for new aliases while the
# hardcoded dict above remains authoritative for existing ones.
try:
from providers import list_providers as _lp
for _pp in _lp():
for _alias in _pp.aliases:
if _alias not in _PROVIDER_ALIASES:
_PROVIDER_ALIASES[_alias] = _pp.name
except Exception:
pass
normalized = _PROVIDER_ALIASES.get(normalized, normalized)
if normalized == "openrouter":
@@ -2589,6 +2634,208 @@ def _poll_for_token(
# Nous Portal — token refresh, agent key minting, model discovery
# =============================================================================
# -----------------------------------------------------------------------------
# Shared Nous token store — lets OAuth credentials persist across profiles
# so a new `hermes --profile <name> auth add nous --type oauth` can one-tap
# import instead of running the full device-code flow every time.
#
# File lives at ${HERMES_SHARED_AUTH_DIR}/nous_auth.json, defaulting to
# ~/.hermes/shared/nous_auth.json. It is OUTSIDE any named profile's
# HERMES_HOME so named profiles (which typically live under
# ~/.hermes/profiles/<name>/) all see the same file.
#
# Written on successful login and on every runtime refresh so the stored
# refresh_token stays current even if one profile refreshes and rotates it.
# If ever the stored refresh_token does go stale server-side, import fails
# gracefully and the user falls back to the normal device-code flow.
# -----------------------------------------------------------------------------
NOUS_SHARED_STORE_FILENAME = "nous_auth.json"
def _nous_shared_auth_dir() -> Path:
"""Resolve the directory that holds the shared Nous token store.
Honors ``HERMES_SHARED_AUTH_DIR`` so tests can redirect it to a tmp
path without touching the real user's home. Defaults to
``~/.hermes/shared/``.
"""
override = os.getenv("HERMES_SHARED_AUTH_DIR", "").strip()
if override:
return Path(override).expanduser()
return Path.home() / ".hermes" / "shared"
def _nous_shared_store_path() -> Path:
path = _nous_shared_auth_dir() / NOUS_SHARED_STORE_FILENAME
# Seat belt: if pytest is running and this resolves to a path under the
# real user's home, refuse rather than silently corrupt cross-profile
# state. Tests must set HERMES_SHARED_AUTH_DIR to a tmp_path (conftest
# does not do this automatically — mirror the _auth_file_path() guard
# so forgetting to set it fails loudly instead of writing to the real
# shared store).
if os.environ.get("PYTEST_CURRENT_TEST"):
real_home_shared = (
Path.home() / ".hermes" / "shared" / NOUS_SHARED_STORE_FILENAME
).resolve(strict=False)
try:
resolved = path.resolve(strict=False)
except Exception:
resolved = path
if resolved == real_home_shared:
raise RuntimeError(
f"Refusing to touch real user shared Nous auth store during test run: "
f"{path}. Set HERMES_SHARED_AUTH_DIR to a tmp_path in your test fixture."
)
return path
def _write_shared_nous_state(state: Dict[str, Any]) -> None:
"""Persist a minimal copy of the Nous OAuth state to the shared store.
Best-effort: any failure is swallowed after logging. The shared store
is a convenience layer; the per-profile auth.json remains the source
of truth.
We deliberately omit the short-lived ``agent_key`` (24h TTL, profile-
specific) only the long-lived OAuth tokens are cross-profile useful.
"""
refresh_token = state.get("refresh_token")
access_token = state.get("access_token")
if not (isinstance(refresh_token, str) and refresh_token.strip()):
# No refresh_token = nothing worth sharing across profiles
return
if not (isinstance(access_token, str) and access_token.strip()):
return
shared = {
"_schema": 1,
"access_token": access_token,
"refresh_token": refresh_token,
"token_type": state.get("token_type") or "Bearer",
"scope": state.get("scope") or DEFAULT_NOUS_SCOPE,
"client_id": state.get("client_id") or DEFAULT_NOUS_CLIENT_ID,
"portal_base_url": state.get("portal_base_url") or DEFAULT_NOUS_PORTAL_URL,
"inference_base_url": state.get("inference_base_url") or DEFAULT_NOUS_INFERENCE_URL,
"obtained_at": state.get("obtained_at"),
"expires_at": state.get("expires_at"),
"updated_at": datetime.now(timezone.utc).isoformat(),
}
try:
path = _nous_shared_store_path()
path.parent.mkdir(parents=True, exist_ok=True)
tmp = path.with_suffix(path.suffix + ".tmp")
tmp.write_text(json.dumps(shared, indent=2, sort_keys=True))
try:
os.chmod(tmp, 0o600)
except OSError:
pass
os.replace(tmp, path)
_oauth_trace(
"nous_shared_store_written",
path=str(path),
refresh_token_fp=_token_fingerprint(refresh_token),
)
except Exception as exc:
logger.debug("Failed to write shared Nous auth store: %s", exc)
def _read_shared_nous_state() -> Optional[Dict[str, Any]]:
"""Return the shared Nous OAuth state if present and well-formed.
Returns ``None`` when the file is missing, unreadable, malformed, or
lacks required fields. Callers should treat ``None`` as "no shared
credentials available fall through to device-code".
"""
try:
path = _nous_shared_store_path()
except RuntimeError:
# Test seat belt tripped — treat as missing
return None
if not path.is_file():
return None
try:
payload = json.loads(path.read_text())
except (OSError, ValueError) as exc:
logger.debug("Shared Nous auth store at %s is unreadable: %s", path, exc)
return None
if not isinstance(payload, dict):
return None
refresh_token = payload.get("refresh_token")
access_token = payload.get("access_token")
if not (isinstance(refresh_token, str) and refresh_token.strip()):
return None
if not (isinstance(access_token, str) and access_token.strip()):
return None
return payload
def _try_import_shared_nous_state(
*,
timeout_seconds: float = 15.0,
min_key_ttl_seconds: int = 5 * 60,
) -> Optional[Dict[str, Any]]:
"""Attempt to rehydrate Nous OAuth state from the shared store.
Reads the shared file (if present), runs a forced refresh+mint using
the stored refresh_token to produce a fresh access_token + agent_key
scoped to this profile, and returns the full auth_state dict ready
for ``persist_nous_credentials()``.
Returns ``None`` when no shared state is available or the rehydrate
fails for any reason (expired refresh_token, portal unreachable,
etc.) caller should then fall through to the normal device-code
flow.
"""
shared = _read_shared_nous_state()
if not shared:
return None
# Build a full state dict so refresh_nous_oauth_from_state has every
# field it needs. force_refresh=True gets us a fresh access_token
# for this profile; force_mint=True gets us a fresh agent_key.
state: Dict[str, Any] = {
"access_token": shared.get("access_token"),
"refresh_token": shared.get("refresh_token"),
"client_id": shared.get("client_id") or DEFAULT_NOUS_CLIENT_ID,
"portal_base_url": shared.get("portal_base_url") or DEFAULT_NOUS_PORTAL_URL,
"inference_base_url": shared.get("inference_base_url") or DEFAULT_NOUS_INFERENCE_URL,
"token_type": shared.get("token_type") or "Bearer",
"scope": shared.get("scope") or DEFAULT_NOUS_SCOPE,
"obtained_at": shared.get("obtained_at"),
"expires_at": shared.get("expires_at"),
"agent_key": None,
"agent_key_expires_at": None,
"tls": {"insecure": False, "ca_bundle": None},
}
try:
refreshed = refresh_nous_oauth_from_state(
state,
min_key_ttl_seconds=min_key_ttl_seconds,
timeout_seconds=timeout_seconds,
force_refresh=True,
force_mint=True,
)
except AuthError as exc:
_oauth_trace(
"nous_shared_import_failed",
error_type=type(exc).__name__,
error_code=getattr(exc, "code", None),
)
logger.debug("Shared Nous import failed: %s", exc)
return None
except Exception as exc:
_oauth_trace(
"nous_shared_import_failed",
error_type=type(exc).__name__,
)
logger.debug("Shared Nous import failed: %s", exc)
return None
return refreshed
def _refresh_access_token(
*,
client: httpx.Client,
@@ -2991,6 +3238,12 @@ def persist_nous_credentials(
_save_provider_state(auth_store, "nous", state)
_save_auth_store(auth_store)
# Mirror to the shared store so a new profile can one-tap import
# these credentials via `hermes auth add nous --type oauth`. Best-
# effort: any I/O failure is logged and swallowed (the per-profile
# auth.json is still the source of truth).
_write_shared_nous_state(state)
pool = load_pool("nous")
return next(
(e for e in pool.entries() if e.source == NOUS_DEVICE_CODE_SOURCE),
@@ -3059,6 +3312,11 @@ def resolve_nous_runtime_credentials(
refresh_token_fp=_token_fingerprint(state.get("refresh_token")),
access_token_fp=_token_fingerprint(state.get("access_token")),
)
# Mirror post-refresh state to the shared store so sibling
# profiles don't hold stale refresh_tokens after rotation.
# Best-effort — any failure is logged and swallowed inside
# _write_shared_nous_state.
_write_shared_nous_state(state)
verify = _resolve_verify(insecure=insecure, ca_bundle=ca_bundle, auth_state=state)
timeout = httpx.Timeout(timeout_seconds if timeout_seconds else 15.0)
@@ -4600,17 +4858,47 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
)
try:
auth_state = _nous_device_code_login(
portal_base_url=getattr(args, "portal_url", None),
inference_base_url=getattr(args, "inference_url", None),
client_id=getattr(args, "client_id", None) or pconfig.client_id,
scope=getattr(args, "scope", None) or pconfig.scope,
open_browser=not getattr(args, "no_browser", False),
timeout_seconds=timeout_seconds,
insecure=insecure,
ca_bundle=ca_bundle,
min_key_ttl_seconds=5 * 60,
)
auth_state = None
# Codex-style auto-import: before launching a fresh device-code
# flow, check the shared store for an existing Nous credential
# from any other profile. If present, offer to rehydrate it.
shared = _read_shared_nous_state()
if shared:
try:
shared_path = _nous_shared_store_path()
except RuntimeError:
shared_path = None
print()
if shared_path:
print(f"Found existing Nous OAuth credentials at {shared_path}")
else:
print("Found existing shared Nous OAuth credentials")
try:
do_import = input("Import these credentials? [Y/n]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
do_import = "y"
if do_import in ("", "y", "yes"):
print("Rehydrating Nous session from shared credentials...")
auth_state = _try_import_shared_nous_state(
timeout_seconds=timeout_seconds,
min_key_ttl_seconds=5 * 60,
)
if auth_state is None:
print("Could not refresh shared credentials — falling back to device-code login.")
if auth_state is None:
auth_state = _nous_device_code_login(
portal_base_url=getattr(args, "portal_url", None),
inference_base_url=getattr(args, "inference_url", None),
client_id=getattr(args, "client_id", None) or pconfig.client_id,
scope=getattr(args, "scope", None) or pconfig.scope,
open_browser=not getattr(args, "no_browser", False),
timeout_seconds=timeout_seconds,
insecure=insecure,
ca_bundle=ca_bundle,
min_key_ttl_seconds=5 * 60,
)
inference_base_url = auth_state["inference_base_url"]
@@ -4627,6 +4915,11 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
_save_provider_state(auth_store, "nous", auth_state)
saved_to = _save_auth_store(auth_store)
# Mirror to the shared store so other profiles can one-tap import
# these credentials. Best-effort: any I/O failure is logged and
# swallowed inside the helper.
_write_shared_nous_state(auth_state)
print()
print("Login successful!")
print(f" Auth state: {saved_to}")
+41
View File
@@ -245,6 +245,47 @@ def auth_add_command(args) -> None:
return
if provider == "nous":
# Codex-style auto-import: if a shared Nous credential lives at
# ~/.hermes/shared/nous_auth.json (written by any previous
# successful login), offer to import it instead of running the
# full device-code flow. This makes `hermes --profile <name>
# auth add nous --type oauth` a one-tap operation for users who
# run multiple profiles.
shared = auth_mod._read_shared_nous_state()
if shared:
try:
path = auth_mod._nous_shared_store_path()
except RuntimeError:
path = None
print()
if path:
print(f"Found existing Nous OAuth credentials at {path}")
else:
print("Found existing shared Nous OAuth credentials")
try:
do_import = input("Import these credentials? [Y/n]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
do_import = "y"
if do_import in ("", "y", "yes"):
print("Rehydrating Nous session from shared credentials...")
rehydrated = auth_mod._try_import_shared_nous_state(
timeout_seconds=getattr(args, "timeout", None) or 15.0,
min_key_ttl_seconds=max(
60, int(getattr(args, "min_key_ttl_seconds", 5 * 60))
),
)
if rehydrated is not None:
custom_label = (getattr(args, "label", None) or "").strip() or None
entry = auth_mod.persist_nous_credentials(rehydrated, label=custom_label)
shown_label = entry.label if entry is not None else label_from_token(
rehydrated.get("access_token", ""), _oauth_default_label(provider, 1),
)
print(f'Imported {provider} OAuth credentials: "{shown_label}"')
return
# Rehydrate failed (expired refresh_token, portal down, etc.)
# — fall through to device-code flow.
print("Could not refresh shared credentials — falling back to device-code login.")
creds = auth_mod._nous_device_code_login(
portal_base_url=getattr(args, "portal_url", None),
inference_base_url=getattr(args, "inference_url", None),
+15 -2
View File
@@ -61,6 +61,9 @@ _EXCLUDED_NAMES = {
"cron.pid",
}
# zipfile.open() drops Unix mode bits on extract; restore tightens these to 0600.
_SECRET_FILE_NAMES = {".env", "auth.json", "state.db"}
def _should_exclude(rel_path: Path) -> bool:
"""Return True if *rel_path* (relative to hermes root) should be skipped."""
@@ -381,6 +384,8 @@ def run_import(args) -> None:
target.parent.mkdir(parents=True, exist_ok=True)
with zf.open(member) as src, open(target, "wb") as dst:
dst.write(src.read())
if target.name in _SECRET_FILE_NAMES:
os.chmod(target, 0o600)
restored += 1
except (PermissionError, OSError) as exc:
errors.append(f" {rel}: {exc}")
@@ -788,9 +793,17 @@ def _prune_pre_update_backups(backup_dir: Path, keep: int) -> int:
Returns the number of files deleted. Only touches files matching
``pre-update-*.zip`` so hand-made zips dropped in the same directory
are never touched.
``keep`` is floored to 1 because this helper is only called immediately
after a fresh backup is written: deleting that backup right after the
user paid the disk/CPU cost to create it would leave them worse off
than no backup at all (and the wrapper in ``main.py`` would still print
a misleading ``Saved: <path>`` line for a file that no longer exists).
Operators who genuinely don't want a backup should set
``updates.pre_update_backup: false`` in config that gates creation.
"""
if keep < 0:
keep = 0
if keep < 1:
keep = 1
if not backup_dir.exists():
return 0
+9 -1
View File
@@ -235,6 +235,9 @@ def _scan_workspace_state(source_dir: Path) -> list[tuple[Path, str]]:
"""
findings: list[tuple[Path, str]] = []
if not source_dir.exists():
return findings
# Direct state files in the root
for name in ("todo.json", "sessions", "logs"):
candidate = source_dir / name
@@ -243,7 +246,12 @@ def _scan_workspace_state(source_dir: Path) -> list[tuple[Path, str]]:
findings.append((candidate, f"Root {kind}: {name}"))
# State files inside workspace directories
for child in sorted(source_dir.iterdir()):
try:
children = sorted(source_dir.iterdir())
except OSError:
return findings
for child in children:
if not child.is_dir() or child.name.startswith("."):
continue
# Check for workspace-like subdirectories
+19 -2
View File
@@ -64,7 +64,9 @@ class CommandDef:
COMMAND_REGISTRY: list[CommandDef] = [
# Session
CommandDef("new", "Start a new session (fresh session ID + history)", "Session",
aliases=("reset",)),
aliases=("reset",), args_hint="[name]"),
CommandDef("topic", "Enable or inspect Telegram DM topic sessions", "Session",
gateway_only=True, args_hint="[off|help|session-id]"),
CommandDef("clear", "Clear screen and start a new session", "Session",
cli_only=True),
CommandDef("redraw", "Force a full UI repaint (recovers from terminal drift)", "Session",
@@ -1126,6 +1128,12 @@ class SlashCommandCompleter(Completer):
except Exception:
return {}
# Commands that open pickers when run without arguments.
# These should NOT receive a trailing space in completions because:
# - The TUI's submit handler applies completions on Enter if input differs
# - Adding space makes "/model" → "/model " which blocks picker execution
_PICKER_COMMANDS = frozenset({"model", "skin", "personality"})
@staticmethod
def _completion_text(cmd_name: str, word: str) -> str:
"""Return replacement text for a completion.
@@ -1134,8 +1142,17 @@ class SlashCommandCompleter(Completer):
returning ``help`` would be a no-op and prompt_toolkit suppresses the
menu. Appending a trailing space keeps the dropdown visible and makes
backspacing retrigger it naturally.
However, commands that open pickers (model, skin, personality) should
NOT get a trailing space the TUI would apply the completion on Enter
and block the picker from opening.
"""
return f"{cmd_name} " if cmd_name == word else cmd_name
if cmd_name != word:
return cmd_name
# Don't add space for picker commands — allows Enter to execute them
if cmd_name in SlashCommandCompleter._PICKER_COMMANDS:
return cmd_name
return f"{cmd_name} "
@staticmethod
def _extract_path_word(text: str) -> str | None:
+54 -1
View File
@@ -781,6 +781,11 @@ DEFAULT_CONFIG = {
"inline_diffs": True, # Show inline diff previews for write actions (write_file, patch, skill_manage)
"show_cost": False, # Show $ cost in the status bar (off by default)
"skin": "default",
# UI language for static user-facing messages (approval prompts, a
# handful of gateway slash-command replies). Does NOT affect agent
# responses, log lines, tool outputs, or slash-command descriptions.
# Supported: en, zh, ja, de, es. Unknown values fall back to en.
"language": "en",
# TUI busy indicator style: kaomoji (default), emoji, unicode (braille
# spinner), or ascii. Live-swappable via `/indicator <style>`.
"tui_status_indicator": "kaomoji",
@@ -809,6 +814,7 @@ DEFAULT_CONFIG = {
"enabled": False,
"fields": ["model", "context_pct", "cwd"], # Order shown; drop any to hide
},
"copy_shortcut": "auto", # "auto" (platform default) | "ctrl_c" | "ctrl_shift_c" | "disabled"
},
# Web dashboard settings
@@ -1286,7 +1292,10 @@ DEFAULT_CONFIG = {
# for a single update run.
"pre_update_backup": False,
# How many pre-update backup zips to retain. Older ones are pruned
# automatically after each successful backup.
# automatically after each successful backup. Values below 1 are
# floored to 1 — the backup just created is always preserved. To
# disable backups entirely, set ``pre_update_backup: false`` above
# rather than ``backup_keep: 0``.
"backup_keep": 5,
},
@@ -3943,6 +3952,7 @@ _FALLBACK_COMMENT = """
# kimi-coding-cn (KIMI_CN_API_KEY) — Kimi / Moonshot (China)
# minimax (MINIMAX_API_KEY) — MiniMax
# minimax-cn (MINIMAX_CN_API_KEY) — MiniMax (China)
# bedrock (AWS IAM / boto3) — AWS Bedrock (Converse API)
#
# For custom OpenAI-compatible endpoints, add base_url and key_env.
#
@@ -3974,6 +3984,7 @@ _COMMENTED_SECTIONS = """
# kimi-coding-cn (KIMI_CN_API_KEY) — Kimi / Moonshot (China)
# minimax (MINIMAX_API_KEY) — MiniMax
# minimax-cn (MINIMAX_CN_API_KEY) — MiniMax (China)
# bedrock (AWS IAM / boto3) — AWS Bedrock (Converse API)
#
# For custom OpenAI-compatible endpoints, add base_url and key_env.
#
@@ -4831,3 +4842,45 @@ def config_command(args):
print(" hermes config path Show config file path")
print(" hermes config env-path Show .env file path")
sys.exit(1)
# ── Profile-driven env var injection ─────────────────────────────────────────
# Any provider registered in providers/ with auth_type="api_key" automatically
# gets its env_vars exposed in OPTIONAL_ENV_VARS without editing this file.
# Runs once at import time.
_profile_env_vars_injected = False
def _inject_profile_env_vars() -> None:
"""Populate OPTIONAL_ENV_VARS from provider profiles not already listed.
Called once at module load time. Idempotent repeated calls are no-ops.
"""
global _profile_env_vars_injected
if _profile_env_vars_injected:
return
_profile_env_vars_injected = True
try:
from providers import list_providers
for _pp in list_providers():
if _pp.auth_type not in ("api_key",):
continue
for _var in _pp.env_vars:
if _var in OPTIONAL_ENV_VARS:
continue
_is_key = not _var.endswith("_BASE_URL") and not _var.endswith("_URL")
OPTIONAL_ENV_VARS[_var] = {
"description": f"{_pp.display_name or _pp.name} {'API key' if _is_key else 'base URL override'}",
"prompt": f"{_pp.display_name or _pp.name} {'API key' if _is_key else 'base URL (leave empty for default)'}",
"url": _pp.signup_url or None,
"password": _is_key,
"category": "provider",
"advanced": True,
}
except Exception:
pass
# Eagerly inject so that OPTIONAL_ENV_VARS is fully populated at import time.
_inject_profile_env_vars()
+8
View File
@@ -93,6 +93,8 @@ def cron_list(show_all: bool = False):
script = job.get("script")
if script:
print(f" Script: {script}")
if job.get("no_agent"):
print(f" Mode: {color('no-agent', Colors.DIM)} (script stdout delivered directly)")
workdir = job.get("workdir")
if workdir:
print(f" Workdir: {workdir}")
@@ -172,6 +174,7 @@ def cron_create(args):
skills=_normalize_skills(getattr(args, "skill", None), getattr(args, "skills", None)),
script=getattr(args, "script", None),
workdir=getattr(args, "workdir", None),
no_agent=getattr(args, "no_agent", False) or None,
)
if not result.get("success"):
print(color(f"Failed to create job: {result.get('error', 'unknown error')}", Colors.RED))
@@ -184,6 +187,8 @@ def cron_create(args):
job_data = result.get("job", {})
if job_data.get("script"):
print(f" Script: {job_data['script']}")
if job_data.get("no_agent"):
print(" Mode: no-agent (script stdout delivered directly)")
if job_data.get("workdir"):
print(f" Workdir: {job_data['workdir']}")
print(f" Next run: {result['next_run_at']}")
@@ -225,6 +230,7 @@ def cron_edit(args):
skills=final_skills,
script=getattr(args, "script", None),
workdir=getattr(args, "workdir", None),
no_agent=getattr(args, "no_agent", None),
)
if not result.get("success"):
print(color(f"Failed to update job: {result.get('error', 'unknown error')}", Colors.RED))
@@ -240,6 +246,8 @@ def cron_edit(args):
print(" Skills: none")
if updated.get("script"):
print(f" Script: {updated['script']}")
if updated.get("no_agent"):
print(" Mode: no-agent (script stdout delivered directly)")
if updated.get("workdir"):
print(f" Workdir: {updated['workdir']}")
return 0
+130
View File
@@ -245,6 +245,111 @@ def _cmd_restore(args) -> int:
return 0 if ok else 1
def _cmd_archive(args) -> int:
"""Manually archive an agent-created skill. Refuses if pinned.
The auto-curator archives stale skills on its own schedule; this verb is
for the user who wants to archive *now* without waiting for a run.
"""
from tools import skill_usage
if skill_usage.get_record(args.skill).get("pinned"):
print(
f"curator: '{args.skill}' is pinned — unpin first with "
f"`hermes curator unpin {args.skill}`"
)
return 1
ok, msg = skill_usage.archive_skill(args.skill)
print(f"curator: {msg}")
return 0 if ok else 1
def _idle_days(record: dict) -> Optional[int]:
"""Days since the skill's last activity (view / use / patch).
Falls back to ``created_at`` so a skill that was authored but never used
can still be pruned otherwise never-touched skills would be immortal.
Returns None only when both fields are missing or unparseable.
"""
ts = record.get("last_activity_at") or record.get("created_at")
if not ts:
return None
try:
dt = datetime.fromisoformat(str(ts))
except (TypeError, ValueError):
return None
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
return max(0, (datetime.now(timezone.utc) - dt).days)
def _cmd_prune(args) -> int:
"""Bulk-archive agent-created skills idle for >= N days.
Pinned skills are exempt. Already-archived skills are skipped. Default
``--days 90`` matches a conservative read of the curator's own archive
threshold; adjust with ``--days``. Use ``--dry-run`` to preview.
"""
from tools import skill_usage
days = getattr(args, "days", 90)
if days < 1:
print(f"curator: --days must be >= 1 (got {days})", file=sys.stderr)
return 2
dry_run = bool(getattr(args, "dry_run", False))
skip_confirm = bool(getattr(args, "yes", False))
candidates = []
for r in skill_usage.agent_created_report():
if r.get("pinned"):
continue
if r.get("state") == skill_usage.STATE_ARCHIVED:
continue
idle = _idle_days(r)
if idle is None or idle < days:
continue
candidates.append((r["name"], idle))
if not candidates:
print(f"curator: nothing to prune (no unpinned skills idle >= {days}d)")
return 0
candidates.sort(key=lambda c: -c[1])
print(f"curator: {len(candidates)} skill(s) idle >= {days}d:")
for name, idle in candidates:
print(f" {name:40s} idle {idle}d")
if dry_run:
print("\n(dry run — no changes made)")
return 0
if not skip_confirm:
try:
reply = input(f"\nArchive {len(candidates)} skill(s)? [y/N] ").strip().lower()
except (EOFError, KeyboardInterrupt):
print("\ncurator: aborted")
return 1
if reply not in ("y", "yes"):
print("curator: aborted")
return 1
archived = 0
failures = []
for name, _ in candidates:
ok, msg = skill_usage.archive_skill(name)
if ok:
archived += 1
else:
failures.append((name, msg))
print(f"\ncurator: archived {archived}/{len(candidates)}")
if failures:
print("failures:")
for name, msg in failures:
print(f" {name}: {msg}")
return 1
return 0
def _cmd_backup(args) -> int:
"""Take a manual snapshot of the skills tree. Same mechanism as the
automatic pre-run snapshot, just user-initiated."""
@@ -383,6 +488,31 @@ def register_cli(parent: argparse.ArgumentParser) -> None:
p_restore.add_argument("skill", help="Skill name")
p_restore.set_defaults(func=_cmd_restore)
p_archive = subs.add_parser(
"archive",
help="Manually archive a skill (move to .archive/, excluded from prompt)",
)
p_archive.add_argument("skill", help="Skill name")
p_archive.set_defaults(func=_cmd_archive)
p_prune = subs.add_parser(
"prune",
help="Bulk-archive agent-created skills idle for >= N days (default 90)",
)
p_prune.add_argument(
"--days", type=int, default=90,
help="Archive skills idle for at least N days (default: 90)",
)
p_prune.add_argument(
"-y", "--yes", action="store_true",
help="Skip the confirmation prompt",
)
p_prune.add_argument(
"--dry-run", dest="dry_run", action="store_true",
help="Show what would be archived without doing it",
)
p_prune.set_defaults(func=_cmd_prune)
p_backup = subs.add_parser(
"backup",
help="Take a manual tar.gz snapshot of ~/.hermes/skills/ "
+102 -28
View File
@@ -12,6 +12,7 @@ import importlib.util
from pathlib import Path
from hermes_cli.config import get_project_root, get_hermes_home, get_env_path
from hermes_cli.env_loader import load_hermes_dotenv
from hermes_constants import display_hermes_home
PROJECT_ROOT = get_project_root()
@@ -19,15 +20,8 @@ HERMES_HOME = get_hermes_home()
_DHH = display_hermes_home() # user-facing display path (e.g. ~/.hermes or ~/.hermes/profiles/coder)
# Load environment variables from ~/.hermes/.env so API key checks work
from dotenv import load_dotenv
_env_path = get_env_path()
if _env_path.exists():
try:
load_dotenv(_env_path, encoding="utf-8")
except UnicodeDecodeError:
load_dotenv(_env_path, encoding="latin-1")
# Also try project .env as dev fallback
load_dotenv(PROJECT_ROOT / ".env", override=False, encoding="utf-8")
load_hermes_dotenv(hermes_home=_env_path.parent, project_env=PROJECT_ROOT / ".env")
from hermes_cli.colors import Colors, color
from hermes_cli.models import _HERMES_USER_AGENT
@@ -175,6 +169,85 @@ def _check_gateway_service_linger(issues: list[str]) -> None:
check_warn("Could not verify systemd linger", f"({linger_detail})")
_APIKEY_PROVIDERS_CACHE: list | None = None
def _build_apikey_providers_list() -> list:
"""Build the API-key provider health-check list once and cache it.
Tuple format: (name, env_vars, default_url, base_env, supports_models_endpoint)
Base list augmented with any ProviderProfile with auth_type="api_key" not
already present adding providers/*.py is sufficient to get into doctor.
"""
_static = [
("Z.AI / GLM", ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"), "https://api.z.ai/api/paas/v4/models", "GLM_BASE_URL", True),
("Kimi / Moonshot", ("KIMI_API_KEY",), "https://api.moonshot.ai/v1/models", "KIMI_BASE_URL", True),
("StepFun Step Plan", ("STEPFUN_API_KEY",), "https://api.stepfun.ai/step_plan/v1/models", "STEPFUN_BASE_URL", True),
("Kimi / Moonshot (China)", ("KIMI_CN_API_KEY",), "https://api.moonshot.cn/v1/models", None, True),
("Arcee AI", ("ARCEEAI_API_KEY",), "https://api.arcee.ai/api/v1/models", "ARCEE_BASE_URL", True),
("GMI Cloud", ("GMI_API_KEY",), "https://api.gmi-serving.com/v1/models", "GMI_BASE_URL", True),
("DeepSeek", ("DEEPSEEK_API_KEY",), "https://api.deepseek.com/v1/models", "DEEPSEEK_BASE_URL", True),
("Hugging Face", ("HF_TOKEN",), "https://router.huggingface.co/v1/models", "HF_BASE_URL", True),
("NVIDIA NIM", ("NVIDIA_API_KEY",), "https://integrate.api.nvidia.com/v1/models", "NVIDIA_BASE_URL", True),
("Alibaba/DashScope", ("DASHSCOPE_API_KEY",), "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models", "DASHSCOPE_BASE_URL", True),
# MiniMax global: /v1 endpoint supports /models.
("MiniMax", ("MINIMAX_API_KEY",), "https://api.minimax.io/v1/models", "MINIMAX_BASE_URL", True),
# MiniMax CN: /v1 endpoint does NOT support /models (returns 404).
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), "https://api.minimaxi.com/v1/models", "MINIMAX_CN_BASE_URL", False),
("Vercel AI Gateway", ("AI_GATEWAY_API_KEY",), "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
("Kilo Code", ("KILOCODE_API_KEY",), "https://api.kilo.ai/api/gateway/models", "KILOCODE_BASE_URL", True),
("OpenCode Zen", ("OPENCODE_ZEN_API_KEY",), "https://opencode.ai/zen/v1/models", "OPENCODE_ZEN_BASE_URL", True),
# OpenCode Go has no shared /models endpoint; skip the health check.
("OpenCode Go", ("OPENCODE_GO_API_KEY",), None, "OPENCODE_GO_BASE_URL", False),
]
_known_names = {t[0] for t in _static}
# Also index by profile canonical name so profiles without display_name
# don't create duplicate entries for providers already in the static list.
_known_canonical: set[str] = set()
_name_to_canonical = {
"Z.AI / GLM": "zai", "Kimi / Moonshot": "kimi-coding",
"StepFun Step Plan": "stepfun", "Kimi / Moonshot (China)": "kimi-coding-cn",
"Arcee AI": "arcee", "GMI Cloud": "gmi", "DeepSeek": "deepseek",
"Hugging Face": "huggingface", "NVIDIA NIM": "nvidia",
"Alibaba/DashScope": "alibaba", "MiniMax": "minimax",
"MiniMax (China)": "minimax-cn", "Vercel AI Gateway": "ai-gateway",
"Kilo Code": "kilocode", "OpenCode Zen": "opencode-zen",
"OpenCode Go": "opencode-go",
}
for _label, _canonical in _name_to_canonical.items():
_known_canonical.add(_canonical)
try:
from providers import list_providers
from providers.base import ProviderProfile as _PP
for _pp in list_providers():
if not isinstance(_pp, _PP) or _pp.auth_type != "api_key" or not _pp.env_vars:
continue
_label = _pp.display_name or _pp.name
if _label in _known_names or _pp.name in _known_canonical:
continue
# Separate API-key vars from base-URL override vars — the health-check
# loop sends the first found value as Authorization: Bearer, so a URL
# string must never be picked.
_key_vars = tuple(
v for v in _pp.env_vars
if not v.endswith("_BASE_URL") and not v.endswith("_URL")
)
_base_var = next(
(v for v in _pp.env_vars if v.endswith("_BASE_URL") or v.endswith("_URL")),
None,
)
if not _key_vars:
continue
_models_url = (
(_pp.models_url or (_pp.base_url.rstrip("/") + "/models"))
if _pp.base_url else None
)
_static.append((_label, _key_vars, _models_url, _base_var, True))
except Exception:
pass
return _static
def run_doctor(args):
"""Run diagnostic checks."""
should_fix = getattr(args, 'fix', False)
@@ -935,6 +1008,8 @@ def run_doctor(args):
agent_browser_path = PROJECT_ROOT / "node_modules" / "agent-browser"
if agent_browser_path.exists():
check_ok("agent-browser (Node.js)", "(browser automation)")
elif shutil.which("agent-browser"):
check_ok("agent-browser", "(browser automation)")
else:
if _is_termux():
check_info("agent-browser is not installed (expected in the tested Termux path)")
@@ -1085,26 +1160,11 @@ def run_doctor(args):
# -- API-key providers --
# Tuple: (name, env_vars, default_url, base_env, supports_models_endpoint)
# If supports_models_endpoint is False, we skip the health check and just show "configured"
_apikey_providers = [
("Z.AI / GLM", ("GLM_API_KEY", "ZAI_API_KEY", "Z_AI_API_KEY"), "https://api.z.ai/api/paas/v4/models", "GLM_BASE_URL", True),
("Kimi / Moonshot", ("KIMI_API_KEY",), "https://api.moonshot.ai/v1/models", "KIMI_BASE_URL", True),
("StepFun Step Plan", ("STEPFUN_API_KEY",), "https://api.stepfun.ai/step_plan/v1/models", "STEPFUN_BASE_URL", True),
("Kimi / Moonshot (China)", ("KIMI_CN_API_KEY",), "https://api.moonshot.cn/v1/models", None, True),
("Arcee AI", ("ARCEEAI_API_KEY",), "https://api.arcee.ai/api/v1/models", "ARCEE_BASE_URL", True),
("GMI Cloud", ("GMI_API_KEY",), "https://api.gmi-serving.com/v1/models", "GMI_BASE_URL", True),
("DeepSeek", ("DEEPSEEK_API_KEY",), "https://api.deepseek.com/v1/models", "DEEPSEEK_BASE_URL", True),
("Hugging Face", ("HF_TOKEN",), "https://router.huggingface.co/v1/models", "HF_BASE_URL", True),
("NVIDIA NIM", ("NVIDIA_API_KEY",), "https://integrate.api.nvidia.com/v1/models", "NVIDIA_BASE_URL", True),
("Alibaba/DashScope", ("DASHSCOPE_API_KEY",), "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models", "DASHSCOPE_BASE_URL", True),
# MiniMax: the /anthropic endpoint doesn't support /models, but the /v1 endpoint does.
("MiniMax", ("MINIMAX_API_KEY",), "https://api.minimax.io/v1/models", "MINIMAX_BASE_URL", True),
("MiniMax (China)", ("MINIMAX_CN_API_KEY",), "https://api.minimaxi.com/v1/models", "MINIMAX_CN_BASE_URL", True),
("Vercel AI Gateway", ("AI_GATEWAY_API_KEY",), "https://ai-gateway.vercel.sh/v1/models", "AI_GATEWAY_BASE_URL", True),
("Kilo Code", ("KILOCODE_API_KEY",), "https://api.kilo.ai/api/gateway/models", "KILOCODE_BASE_URL", True),
("OpenCode Zen", ("OPENCODE_ZEN_API_KEY",), "https://opencode.ai/zen/v1/models", "OPENCODE_ZEN_BASE_URL", True),
# OpenCode Go has no shared /models endpoint; skip the health check.
("OpenCode Go", ("OPENCODE_GO_API_KEY",), None, "OPENCODE_GO_BASE_URL", False),
]
# Cached at module level after first build — profiles auto-extend it.
global _APIKEY_PROVIDERS_CACHE
if _APIKEY_PROVIDERS_CACHE is None:
_APIKEY_PROVIDERS_CACHE = _build_apikey_providers_list()
_apikey_providers = _APIKEY_PROVIDERS_CACHE
for _pname, _env_vars, _default_url, _base_env, _supports_health_check in _apikey_providers:
_key = ""
for _ev in _env_vars:
@@ -1261,9 +1321,23 @@ def run_doctor(args):
check_warn("Skills Hub directory not initialized", "(run: hermes skills list)")
from hermes_cli.config import get_env_value
def _gh_authenticated() -> bool:
"""Check if gh CLI is authenticated via token file or device flow."""
try:
result = subprocess.run(
["gh", "auth", "status", "--json", "authenticated"],
capture_output=True, timeout=10,
)
return result.returncode == 0
except (FileNotFoundError, subprocess.TimeoutExpired):
return False
github_token = get_env_value("GITHUB_TOKEN") or get_env_value("GH_TOKEN")
if github_token:
check_ok("GitHub token configured (authenticated API access)")
elif _gh_authenticated():
check_ok("GitHub authenticated via gh CLI", "(full API access — no GITHUB_TOKEN needed)")
else:
check_warn("No GITHUB_TOKEN", f"(60 req/hr rate limit — set in {_DHH}/.env for better rates)")
+5 -8
View File
@@ -14,6 +14,7 @@ import sys
from pathlib import Path
from hermes_cli.config import get_hermes_home, get_env_path, get_project_root, load_config
from hermes_cli.env_loader import load_hermes_dotenv
from hermes_constants import display_hermes_home
@@ -195,15 +196,11 @@ def run_dump(args):
show_keys = getattr(args, "show_keys", False)
# Load env from .env file so key checks work
from dotenv import load_dotenv
env_path = get_env_path()
if env_path.exists():
try:
load_dotenv(env_path, encoding="utf-8")
except UnicodeDecodeError:
load_dotenv(env_path, encoding="latin-1")
# Also try project .env as dev fallback
load_dotenv(get_project_root() / ".env", override=False, encoding="utf-8")
load_hermes_dotenv(
hermes_home=env_path.parent,
project_env=get_project_root() / ".env",
)
project_root = get_project_root()
hermes_home = get_hermes_home()
+88
View File
@@ -785,6 +785,12 @@ def stop_profile_gateway() -> bool:
if pid is None:
return False
try:
from gateway.status import write_planned_stop_marker
write_planned_stop_marker(pid)
except Exception:
pass
try:
os.kill(pid, signal.SIGTERM)
except ProcessLookupError:
@@ -1608,6 +1614,46 @@ def _build_user_local_paths(home: Path, path_entries: list[str]) -> list[str]:
return [p for p in candidates if p not in path_entries and Path(p).exists()]
def _build_wsl_interop_paths(path_entries: list[str]) -> list[str]:
"""Return WSL Windows interop PATH entries for generated systemd units.
WSL shells normally inherit Windows PATH entries such as
``/mnt/c/WINDOWS/System32``. systemd user services do not, so gateway tools
that call ``powershell.exe``/``cmd.exe`` work in a terminal but fail in the
background service unless we persist the relevant entries at install time.
"""
if not is_wsl():
return []
candidates: list[str] = []
for entry in os.environ.get("PATH", "").split(os.pathsep):
if entry.startswith("/mnt/"):
candidates.append(entry)
for executable in ("powershell.exe", "cmd.exe", "explorer.exe", "wsl.exe"):
resolved = shutil.which(executable)
if resolved:
candidates.append(str(Path(resolved).parent))
for entry in (
"/mnt/c/WINDOWS/system32",
"/mnt/c/WINDOWS",
"/mnt/c/WINDOWS/System32/Wbem",
"/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0/",
"/mnt/c/WINDOWS/System32/OpenSSH/",
):
if Path(entry).exists():
candidates.append(entry)
result: list[str] = []
seen = set(path_entries)
for entry in candidates:
if entry and entry not in seen:
seen.add(entry)
result.append(entry)
return result
def _remap_path_for_user(path: str, target_home_dir: str) -> str:
"""Remap *path* from the current user's home to *target_home_dir*.
@@ -1699,6 +1745,7 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
node_bin = _remap_path_for_user(node_bin, home_dir)
path_entries = [_remap_path_for_user(p, home_dir) for p in path_entries]
path_entries.extend(_build_user_local_paths(Path(home_dir), path_entries))
path_entries.extend(_build_wsl_interop_paths(path_entries))
path_entries.extend(common_bin_paths)
sane_path = ":".join(path_entries)
return f"""[Unit]
@@ -1738,6 +1785,7 @@ WantedBy=multi-user.target
hermes_home = str(get_hermes_home().resolve())
profile_arg = _profile_arg(hermes_home)
path_entries.extend(_build_user_local_paths(Path.home(), path_entries))
path_entries.extend(_build_wsl_interop_paths(path_entries))
path_entries.extend(common_bin_paths)
sane_path = ":".join(path_entries)
return f"""[Unit]
@@ -1971,6 +2019,15 @@ def systemd_uninstall(system: bool = False):
print(f"{_service_scope_label(system).capitalize()} service uninstalled")
def _require_service_installed(action: str, system: bool = False) -> None:
unit_path = get_systemd_unit_path(system=system)
if not unit_path.exists():
scope_flag = " --system" if system else ""
print(f"✗ Gateway service is not installed")
print(f" Run: {'sudo ' if system else ''}hermes gateway install{scope_flag}")
sys.exit(1)
def systemd_start(system: bool = False):
system = _select_systemd_scope(system)
if system:
@@ -1980,6 +2037,7 @@ def systemd_start(system: bool = False):
# reachable (common on fresh RHEL/Debian SSH sessions without linger).
# Raises UserSystemdUnavailableError with a remediation message.
_preflight_user_systemd()
_require_service_installed("start", system=system)
refresh_systemd_unit_if_needed(system=system)
_run_systemctl(["start", get_service_name()], system=system, check=True, timeout=30)
print(f"{_service_scope_label(system).capitalize()} service started")
@@ -1990,6 +2048,14 @@ def systemd_stop(system: bool = False):
system = _select_systemd_scope(system)
if system:
_require_root_for_system_service("stop")
_require_service_installed("stop", system=system)
try:
from gateway.status import get_running_pid, write_planned_stop_marker
pid = get_running_pid(cleanup_stale=False)
if pid is not None:
write_planned_stop_marker(pid)
except Exception:
pass
_run_systemctl(["stop", get_service_name()], system=system, check=True, timeout=90)
print(f"{_service_scope_label(system).capitalize()} service stopped")
@@ -2001,6 +2067,7 @@ def systemd_restart(system: bool = False):
_require_root_for_system_service("restart")
else:
_preflight_user_systemd()
_require_service_installed("restart", system=system)
refresh_systemd_unit_if_needed(system=system)
from gateway.status import get_running_pid
@@ -2350,6 +2417,13 @@ def launchd_start():
def launchd_stop():
label = get_launchd_label()
target = f"{_launchd_domain()}/{label}"
try:
from gateway.status import get_running_pid, write_planned_stop_marker
pid = get_running_pid(cleanup_stale=False)
if pid is not None:
write_planned_stop_marker(pid)
except Exception:
pass
# bootout unloads the service definition so KeepAlive doesn't respawn
# the process. A plain `kill SIGTERM` only signals the process — launchd
# immediately restarts it because KeepAlive.SuccessfulExit = false.
@@ -2492,6 +2566,20 @@ def run_gateway(verbose: int = 0, quiet: bool = False, replace: bool = False):
hasn't fully exited yet.
"""
sys.path.insert(0, str(PROJECT_ROOT))
# Refresh the systemd unit definition on every boot so that restart
# settings (RestartSec, StartLimitIntervalSec, etc.) stay current even
# when the process was respawned via exit-code-75 (stale-code or
# /restart) rather than through `hermes gateway restart` which already
# calls refresh_systemd_unit_if_needed(). Without this, a code update
# that ships new unit settings won't take effect until the next manual
# `hermes gateway start/restart` — leaving the gateway vulnerable to
# the exact failure mode the new settings were meant to prevent.
if supports_systemd_services():
try:
refresh_systemd_unit_if_needed(system=False)
except Exception:
pass # best-effort; don't block gateway startup
from gateway.run import start_gateway
+615 -8
View File
@@ -169,11 +169,93 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
"or docs/hermes-kanban-v1-spec.pdf for the full design."
),
)
# --- global --board flag ---
# Applies to every subcommand below. When set, scopes all reads and
# writes to that board's DB. When omitted, resolves via the
# HERMES_KANBAN_BOARD env var, then the persisted current-board
# file, then "default". See kanban_db.get_current_board().
kanban_parser.add_argument(
"--board",
default=None,
metavar="<slug>",
help=(
"Board slug to operate on. Defaults to the current board "
"(set via `hermes kanban boards switch <slug>` or the "
"HERMES_KANBAN_BOARD env var). Use `hermes kanban boards list` "
"to see all boards."
),
)
sub = kanban_parser.add_subparsers(dest="kanban_action")
# --- init ---
sub.add_parser("init", help="Create kanban.db if missing (idempotent)")
# --- boards (new in v2: multi-project support) ---
p_boards = sub.add_parser(
"boards",
help="Manage kanban boards (one board per project / workstream)",
description=(
"Boards let you separate unrelated streams of work "
"(projects, repos, domains) into isolated queues. Each "
"board has its own DB, workspaces directory, and dispatcher "
"loop — tasks on one board cannot collide with tasks on "
"another. The first board is 'default' and always exists."
),
)
boards_sub = p_boards.add_subparsers(dest="boards_action")
b_list = boards_sub.add_parser(
"list", aliases=["ls"],
help="List all boards with task counts",
)
b_list.add_argument("--json", action="store_true")
b_list.add_argument("--all", action="store_true",
help="Include archived boards too")
b_create = boards_sub.add_parser(
"create", aliases=["new"],
help="Create a new board",
)
b_create.add_argument("slug",
help="Board slug (kebab-case, e.g. atm10-server)")
b_create.add_argument("--name", default=None,
help="Human-readable display name (defaults to Title Case of slug)")
b_create.add_argument("--description", default=None,
help="Optional description")
b_create.add_argument("--icon", default=None,
help="Optional emoji or single-character icon for the dashboard")
b_create.add_argument("--color", default=None,
help="Optional hex color (e.g. '#8b5cf6') for the dashboard")
b_create.add_argument("--switch", action="store_true",
help="Switch to the new board after creating it")
b_rm = boards_sub.add_parser(
"rm", aliases=["remove", "delete"],
help="Archive (default) or delete a board",
)
b_rm.add_argument("slug")
b_rm.add_argument("--delete", action="store_true",
help="Hard-delete the board directory instead of archiving it. "
"Default is to move it to boards/_archived/ so it's recoverable.")
b_switch = boards_sub.add_parser(
"switch", aliases=["use"],
help="Set the active board for subsequent CLI calls",
)
b_switch.add_argument("slug")
boards_sub.add_parser(
"show", aliases=["current"],
help="Print the currently-active board slug",
)
b_rename = boards_sub.add_parser(
"rename",
help="Change a board's human-readable display name (slug is immutable)",
)
b_rename.add_argument("slug")
b_rename.add_argument("name", help="New display name")
# --- create ---
p_create = sub.add_parser("create", help="Create a new task")
p_create.add_argument("title", help="Task title")
@@ -226,6 +308,57 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
p_assign.add_argument("task_id")
p_assign.add_argument("profile", help="Profile name (or 'none' to unassign)")
# --- reclaim / reassign (recovery) ---
p_reclaim = sub.add_parser(
"reclaim",
help="Release an active worker claim on a running task",
)
p_reclaim.add_argument("task_id")
p_reclaim.add_argument(
"--reason", default=None,
help="Human-readable reason (recorded on the reclaimed event)",
)
p_reassign = sub.add_parser(
"reassign",
help="Reassign a task to a different profile, optionally reclaiming first",
)
p_reassign.add_argument("task_id")
p_reassign.add_argument(
"profile",
help="New profile name (or 'none' to unassign)",
)
p_reassign.add_argument(
"--reclaim", action="store_true",
help="Release any active claim before reassigning (required if task is running)",
)
p_reassign.add_argument(
"--reason", default=None,
help="Human-readable reason (recorded on the reclaimed event)",
)
# --- diagnostics (board-wide health) ---
p_diag = sub.add_parser(
"diagnostics",
aliases=["diag"],
help="List active diagnostics on the current board",
)
p_diag.add_argument(
"--severity",
choices=["warning", "error", "critical"],
default=None,
help="Only show diagnostics at or above this severity",
)
p_diag.add_argument(
"--task",
default=None,
help="Only show diagnostics for one task id",
)
p_diag.add_argument(
"--json", action="store_true",
help="Emit JSON (structured) instead of the default human table",
)
# --- link / unlink ---
p_link = sub.add_parser("link", help="Add a parent->child dependency")
p_link.add_argument("parent_id")
@@ -261,6 +394,27 @@ def build_parser(parent_subparsers: argparse._SubParsersAction) -> argparse.Argu
help='JSON dict of structured facts (e.g. \'{"changed_files": [...], '
'"tests_run": 12}\'). Stored on the closing run.')
p_edit = sub.add_parser(
"edit",
help="Edit recovery fields on an already-completed task",
)
p_edit.add_argument("task_id")
p_edit.add_argument(
"--result",
required=True,
help="Backfilled task result text for a done task",
)
p_edit.add_argument(
"--summary",
default=None,
help="Structured handoff summary. Falls back to --result if omitted.",
)
p_edit.add_argument(
"--metadata",
default=None,
help="JSON dict of structured facts to store on the latest completed run.",
)
p_block = sub.add_parser("block", help="Mark one or more tasks blocked")
p_block.add_argument("task_id")
p_block.add_argument("reason", nargs="*", help="Reason (also appended as a comment)")
@@ -442,6 +596,38 @@ def kanban_command(args: argparse.Namespace) -> int:
)
return 0
# `--board <slug>` applies to every subcommand below by way of an
# env-var pin for the duration of this call. Using HERMES_KANBAN_BOARD
# (rather than threading `board=` through 50+ kb.connect() sites)
# keeps the patch small and inherits the exact same resolution the
# dispatcher uses for workers — consistency is a feature here.
board_override = getattr(args, "board", None)
if board_override:
try:
normed = kb._normalize_board_slug(board_override)
except ValueError as exc:
print(f"kanban: {exc}", file=sys.stderr)
return 2
if not normed:
print("kanban: --board requires a slug", file=sys.stderr)
return 2
# Boards other than 'default' must already exist — typoed slugs
# would otherwise silently create an empty board.
if normed != kb.DEFAULT_BOARD and not kb.board_exists(normed):
print(
f"kanban: board {normed!r} does not exist. "
f"Create it with `hermes kanban boards create {normed}`.",
file=sys.stderr,
)
return 1
os.environ["HERMES_KANBAN_BOARD"] = normed
# Boards management doesn't touch the DB at all — dispatch early so
# fresh installs that haven't initialized any DB can still use
# `hermes kanban boards create …`.
if action == "boards":
return _dispatch_boards(args)
# Auto-initialize the DB before dispatching any subcommand. init_db
# is idempotent, so running it every invocation is cheap (one
# SELECT against sqlite_master when tables already exist) and
@@ -462,11 +648,16 @@ def kanban_command(args: argparse.Namespace) -> int:
"ls": _cmd_list,
"show": _cmd_show,
"assign": _cmd_assign,
"reclaim": _cmd_reclaim,
"reassign": _cmd_reassign,
"diagnostics": _cmd_diagnostics,
"diag": _cmd_diagnostics,
"link": _cmd_link,
"unlink": _cmd_unlink,
"claim": _cmd_claim,
"comment": _cmd_comment,
"complete": _cmd_complete,
"edit": _cmd_edit,
"block": _cmd_block,
"unblock": _cmd_unblock,
"archive": _cmd_archive,
@@ -513,6 +704,185 @@ def _profile_author() -> str:
return "user"
# ---------------------------------------------------------------------------
# Boards management (hermes kanban boards …)
# ---------------------------------------------------------------------------
def _dispatch_boards(args: argparse.Namespace) -> int:
"""Handle ``hermes kanban boards <action>``.
Boards management is deliberately separate from the task-level
commands: it operates on the filesystem (board directories,
``current`` pointer, ``board.json``), not on the per-board SQLite
DB, so a fresh HERMES_HOME that has never called ``kanban init``
can still run ``boards create`` / ``boards list``.
"""
sub = getattr(args, "boards_action", None) or "list"
if sub in ("list", "ls"):
return _cmd_boards_list(args)
if sub in ("create", "new"):
return _cmd_boards_create(args)
if sub in ("rm", "remove", "delete"):
return _cmd_boards_rm(args)
if sub in ("switch", "use"):
return _cmd_boards_switch(args)
if sub in ("show", "current"):
return _cmd_boards_show(args)
if sub == "rename":
return _cmd_boards_rename(args)
print(f"kanban boards: unknown action {sub!r}", file=sys.stderr)
return 2
def _board_task_counts(slug: str) -> dict[str, int]:
"""Return ``{status: count}`` for a board. Safe to call on an empty DB."""
try:
path = kb.kanban_db_path(board=slug)
if not path.exists():
return {}
with kb.connect(board=slug) as conn:
rows = conn.execute(
"SELECT status, COUNT(*) AS n FROM tasks GROUP BY status"
).fetchall()
return {r["status"]: int(r["n"]) for r in rows}
except Exception:
return {}
def _cmd_boards_list(args: argparse.Namespace) -> int:
include_archived = bool(getattr(args, "all", False))
boards = kb.list_boards(include_archived=include_archived)
# Enrich each entry with task counts + whether it's the current board.
current = kb.get_current_board()
for b in boards:
b["is_current"] = (b["slug"] == current)
b["counts"] = _board_task_counts(b["slug"])
b["total"] = sum(b["counts"].values())
if getattr(args, "json", False):
print(json.dumps(boards, indent=2, ensure_ascii=False))
return 0
# Human table: marker (•) for current, slug, display name, counts.
if not boards:
print("(no boards — create one with `hermes kanban boards create <slug>`)")
return 0
print(f"{'':2s} {'SLUG':24s} {'NAME':28s} COUNTS")
for b in boards:
marker = "" if b["is_current"] else " "
counts = b["counts"] or {}
counts_str = (
", ".join(f"{k}={v}" for k, v in sorted(counts.items()))
or "(empty)"
)
name = b.get("name") or ""
if b.get("archived"):
name += " [archived]"
print(f"{marker:2s} {b['slug']:24s} {name:28s} {counts_str}")
print()
print(f"Current board: {current}")
if len(boards) > 1:
print("Switch boards with `hermes kanban boards switch <slug>`.")
return 0
def _cmd_boards_create(args: argparse.Namespace) -> int:
try:
normed = kb._normalize_board_slug(args.slug)
except ValueError as exc:
print(f"kanban boards create: {exc}", file=sys.stderr)
return 2
if not normed:
print("kanban boards create: slug is required", file=sys.stderr)
return 2
already = kb.board_exists(normed) and normed != kb.DEFAULT_BOARD
meta = kb.create_board(
normed,
name=args.name,
description=args.description,
icon=args.icon,
color=args.color,
)
verb = "already exists" if already else "created"
print(f"Board {meta['slug']!r} {verb}.")
print(f" Display name: {meta.get('name', '')}")
print(f" DB path: {meta['db_path']}")
if getattr(args, "switch", False):
kb.set_current_board(meta["slug"])
print(f" Switched to {meta['slug']!r}.")
else:
print(f" Use `hermes kanban boards switch {meta['slug']}` to make it current.")
return 0
def _cmd_boards_rm(args: argparse.Namespace) -> int:
try:
res = kb.remove_board(args.slug, archive=not getattr(args, "delete", False))
except ValueError as exc:
print(f"kanban boards rm: {exc}", file=sys.stderr)
return 1
if res["action"] == "archived":
print(f"Board {res['slug']!r} archived → {res['new_path']}")
print("Recover by moving the directory back to "
"<root>/kanban/boards/<slug>/.")
else:
print(f"Board {res['slug']!r} deleted.")
return 0
def _cmd_boards_switch(args: argparse.Namespace) -> int:
try:
normed = kb._normalize_board_slug(args.slug)
except ValueError as exc:
print(f"kanban boards switch: {exc}", file=sys.stderr)
return 2
if not normed:
print("kanban boards switch: slug is required", file=sys.stderr)
return 2
if not kb.board_exists(normed):
print(
f"kanban boards switch: board {normed!r} does not exist. "
f"Create it with `hermes kanban boards create {normed}`.",
file=sys.stderr,
)
return 1
kb.set_current_board(normed)
print(f"Active board is now {normed!r}.")
return 0
def _cmd_boards_show(args: argparse.Namespace) -> int:
current = kb.get_current_board()
meta = kb.read_board_metadata(current)
counts = _board_task_counts(current)
total = sum(counts.values())
print(f"Current board: {current}")
print(f" Display name: {meta.get('name', '')}")
if meta.get("description"):
print(f" Description: {meta['description']}")
print(f" DB path: {meta['db_path']}")
print(f" Tasks: {total} total"
+ (f" ({', '.join(f'{k}={v}' for k, v in sorted(counts.items()))})"
if counts else ""))
return 0
def _cmd_boards_rename(args: argparse.Namespace) -> int:
try:
normed = kb._normalize_board_slug(args.slug)
except ValueError as exc:
print(f"kanban boards rename: {exc}", file=sys.stderr)
return 2
if not normed or not kb.board_exists(normed):
print(f"kanban boards rename: board {args.slug!r} does not exist",
file=sys.stderr)
return 1
meta = kb.write_board_metadata(normed, name=args.name)
print(f"Board {normed!r} renamed to {meta['name']!r}.")
return 0
# ---------------------------------------------------------------------------
def _parse_duration(val) -> Optional[int]:
"""Parse ``30s`` / ``5m`` / ``2h`` / ``1d`` or a raw integer → seconds.
@@ -662,6 +1032,21 @@ def _cmd_list(args: argparse.Namespace) -> int:
if getattr(args, "json", False):
print(json.dumps([_task_to_dict(t) for t in tasks], indent=2, ensure_ascii=False))
return 0
# Passive discoverability: when the user has multiple boards, surface
# which one they're looking at in the list header. Single-board users
# never see this — the feature stays invisible until you opt in.
try:
all_boards = kb.list_boards(include_archived=False)
except Exception:
all_boards = []
if len(all_boards) > 1:
current = kb.get_current_board()
other_count = len(all_boards) - 1
print(
f"Board: {current} "
f"({other_count} other board{'s' if other_count != 1 else ''}"
f"`hermes kanban boards list`)\n"
)
if not tasks:
print("(no matching tasks)")
return 0
@@ -730,6 +1115,31 @@ def _cmd_show(args: argparse.Namespace) -> int:
if task.skills:
print(f" skills: {', '.join(task.skills)}")
print(f" created: {_fmt_ts(task.created_at)} by {task.created_by or '-'}")
# Diagnostics section — surface active distress signals at the top
# of show output so CLI users see them before scrolling through
# comments / runs.
from hermes_cli import kanban_diagnostics as kd
diags = kd.compute_task_diagnostics(task, events, runs)
if diags:
sev_marker = {"warning": "", "error": "!!", "critical": "!!!"}
print(f"\n Diagnostics ({len(diags)}):")
for d in diags:
print(f" {sev_marker.get(d.severity, '?')} [{d.severity}] {d.title}")
if d.data:
bits = []
for k, v in d.data.items():
if isinstance(v, list):
bits.append(f"{k}={','.join(str(x) for x in v)}")
else:
bits.append(f"{k}={v}")
if bits:
print(f" data: {' | '.join(bits)}")
# Only show suggested actions in show output to keep it tight;
# full list is available via `kanban diagnostics --task <id>`.
for a in d.actions:
if a.suggested:
print(f"{a.label}")
if task.started_at:
print(f" started: {_fmt_ts(task.started_at)}")
if task.completed_at:
@@ -787,6 +1197,167 @@ def _cmd_assign(args: argparse.Namespace) -> int:
return 0
def _cmd_reclaim(args: argparse.Namespace) -> int:
with kb.connect() as conn:
ok = kb.reclaim_task(
conn, args.task_id,
reason=getattr(args, "reason", None),
)
if not ok:
print(
f"cannot reclaim {args.task_id} (not running or unknown id)",
file=sys.stderr,
)
return 1
print(f"Reclaimed {args.task_id}")
return 0
def _cmd_reassign(args: argparse.Namespace) -> int:
profile = None if args.profile.lower() in ("none", "-", "null") else args.profile
with kb.connect() as conn:
ok = kb.reassign_task(
conn, args.task_id, profile,
reclaim_first=bool(getattr(args, "reclaim", False)),
reason=getattr(args, "reason", None),
)
if not ok:
print(
f"cannot reassign {args.task_id} "
f"(unknown id, or still running — pass --reclaim to release first)",
file=sys.stderr,
)
return 1
print(
f"Reassigned {args.task_id} to "
f"{profile or '(unassigned)'}"
+ (" (claim reclaimed)" if getattr(args, "reclaim", False) else "")
)
return 0
def _cmd_diagnostics(args: argparse.Namespace) -> int:
"""List active diagnostics on the board. Wraps the same rule engine
the dashboard uses, so CLI output matches what the UI shows.
"""
from hermes_cli import kanban_diagnostics as kd
with kb.connect() as conn:
# Either one-task mode or fleet mode.
if getattr(args, "task", None):
task = kb.get_task(conn, args.task)
if task is None:
print(f"no such task: {args.task}", file=sys.stderr)
return 1
diags_by_task = {
args.task: kd.compute_task_diagnostics(
task,
kb.list_events(conn, args.task),
kb.list_runs(conn, args.task),
)
}
else:
# Fleet mode: pull all non-archived tasks + their events/runs.
rows = list(conn.execute(
"SELECT * FROM tasks WHERE status != 'archived'"
).fetchall())
ids = [r["id"] for r in rows]
if not ids:
diags_by_task = {}
else:
placeholders = ",".join(["?"] * len(ids))
ev_by = {i: [] for i in ids}
for row in conn.execute(
f"SELECT * FROM task_events WHERE task_id IN ({placeholders}) ORDER BY id",
tuple(ids),
):
ev_by.setdefault(row["task_id"], []).append(row)
run_by = {i: [] for i in ids}
for row in conn.execute(
f"SELECT * FROM task_runs WHERE task_id IN ({placeholders}) ORDER BY id",
tuple(ids),
):
run_by.setdefault(row["task_id"], []).append(row)
diags_by_task = {}
for r in rows:
tid = r["id"]
dl = kd.compute_task_diagnostics(r, ev_by.get(tid, []), run_by.get(tid, []))
if dl:
diags_by_task[tid] = dl
# Severity filter.
sev = getattr(args, "severity", None)
if sev:
for tid in list(diags_by_task.keys()):
kept = [d for d in diags_by_task[tid] if d.severity == sev]
if kept:
diags_by_task[tid] = kept
else:
del diags_by_task[tid]
# Map task_id → title/status/assignee for the table output.
meta: dict[str, dict] = {}
if diags_by_task:
placeholders = ",".join(["?"] * len(diags_by_task))
for r in conn.execute(
f"SELECT id, title, status, assignee FROM tasks WHERE id IN ({placeholders})",
tuple(diags_by_task.keys()),
):
meta[r["id"]] = {
"title": r["title"], "status": r["status"],
"assignee": r["assignee"],
}
if getattr(args, "json", False):
out_json = [
{
"task_id": tid,
**meta.get(tid, {}),
"diagnostics": [d.to_dict() for d in dl],
}
for tid, dl in diags_by_task.items()
]
print(json.dumps(out_json, indent=2, ensure_ascii=False))
return 0
if not diags_by_task:
print("No active diagnostics on this board.")
return 0
# Human-readable summary: grouped by task, severity-marked, with
# suggested actions inline.
sev_marker = {"warning": "", "error": "!!", "critical": "!!!"}
total = sum(len(dl) for dl in diags_by_task.values())
print(
f"{total} active diagnostic(s) across "
f"{len(diags_by_task)} task(s):\n"
)
for tid, dl in diags_by_task.items():
m = meta.get(tid, {})
title = m.get("title") or "(untitled)"
status = m.get("status") or "?"
assignee = m.get("assignee") or "(unassigned)"
print(f" {tid} {status:8s} @{assignee:18s} {title}")
for d in dl:
print(f" {sev_marker.get(d.severity, '?')} [{d.severity}] {d.kind}: {d.title}")
if d.data:
# Compact key:value pairs on one line.
bits = []
for k, v in d.data.items():
if isinstance(v, list):
bits.append(f"{k}={','.join(str(x) for x in v)}")
else:
bits.append(f"{k}={v}")
if bits:
print(f" data: {' | '.join(bits)}")
# Suggested actions first.
for a in d.actions:
if a.suggested:
print(f"{a.label}")
print()
return 0
def _cmd_link(args: argparse.Namespace) -> int:
with kb.connect() as conn:
kb.link_tasks(conn, args.parent_id, args.child_id)
@@ -879,6 +1450,34 @@ def _cmd_complete(args: argparse.Namespace) -> int:
return 0 if not failed else 1
def _cmd_edit(args: argparse.Namespace) -> int:
raw_meta = getattr(args, "metadata", None)
metadata = None
if raw_meta:
try:
metadata = json.loads(raw_meta)
if not isinstance(metadata, dict):
raise ValueError("must be a JSON object")
except (ValueError, json.JSONDecodeError) as exc:
print(f"kanban: --metadata: {exc}", file=sys.stderr)
return 2
with kb.connect() as conn:
if not kb.edit_completed_task_result(
conn,
args.task_id,
result=args.result,
summary=getattr(args, "summary", None),
metadata=metadata,
):
print(
f"cannot edit {args.task_id} (unknown id or task is not done)",
file=sys.stderr,
)
return 1
print(f"Edited {args.task_id}")
return 0
def _cmd_block(args: argparse.Namespace) -> int:
reason = " ".join(args.reason).strip() if args.reason else None
author = _profile_author()
@@ -966,6 +1565,7 @@ def _cmd_dispatch(args: argparse.Namespace) -> int:
for (tid, who, ws) in res.spawned
],
"skipped_unassigned": res.skipped_unassigned,
"skipped_nonspawnable": res.skipped_nonspawnable,
}, indent=2))
return 0
print(f"Reclaimed: {res.reclaimed}")
@@ -985,6 +1585,11 @@ def _cmd_dispatch(args: argparse.Namespace) -> int:
print(f" - {tid} -> {who} @ {ws or '-'}{tag}")
if res.skipped_unassigned:
print(f"Skipped (unassigned): {', '.join(res.skipped_unassigned)}")
if res.skipped_nonspawnable:
print(
f"Skipped (non-spawnable assignee — terminal lane, OK): "
f"{', '.join(res.skipped_nonspawnable)}"
)
return 0
@@ -1096,16 +1701,18 @@ def _cmd_daemon(args: argparse.Namespace) -> int:
)
def _ready_queue_nonempty() -> bool:
"""Cheap SELECT — just asks whether there's at least one ready
task with an assignee that the dispatcher could have picked up."""
"""Cheap probe — is there at least one ready+assigned+unclaimed
task whose assignee maps to a real Hermes profile (i.e. one the
dispatcher would actually try to spawn for)?
Filters out tasks assigned to control-plane lanes
(e.g. ``orion-cc``, ``orion-research``) that are pulled by
terminals via ``claim_task`` directly those are correctly idle
from the dispatcher's perspective, not stuck.
"""
try:
with kb.connect() as conn:
row = conn.execute(
"SELECT 1 FROM tasks "
"WHERE status = 'ready' AND assignee IS NOT NULL "
" AND claim_lock IS NULL LIMIT 1"
).fetchone()
return row is not None
return kb.has_spawnable_ready(conn)
except Exception:
return False
+1010 -89
View File
File diff suppressed because it is too large Load Diff
+570
View File
@@ -0,0 +1,570 @@
"""Kanban diagnostics — structured, actionable distress signals for tasks.
A ``Diagnostic`` is a machine-readable description of something that's wrong
with a kanban task: a hallucinated card id, a spawn crash-loop, a task
stuck blocked for too long, etc. Each one carries:
* A **kind** (canonical code; UI/tests match on this).
* A **severity** (``warning`` / ``error`` / ``critical``).
* A **title** (one-line human description) and **detail** (longer text).
* A list of **suggested actions** structured entries the dashboard
turns into buttons and the CLI turns into hints.
Rules run over (task, recent events, recent runs) and emit diagnostics.
They are stateless and read-only no DB writes. Callers compute
diagnostics on demand (on ``/board`` load, ``/tasks/:id`` fetch, or
``hermes kanban diagnostics``).
Design goals:
* Fixable-on-the-operator's-side signals only (missing config, phantom
ids, crash loop). Not "the provider returned 502 once" that's a
transient runtime blip, not a diagnostic.
* Recoverable: every diagnostic comes with at least one suggested
recovery action the operator can actually take from the UI.
* Auto-clearing: when the underlying failure mode resolves (a clean
``completed`` event arrives, a spawn succeeds, the task gets
unblocked), the diagnostic stops firing. The audit event trail stays.
"""
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, Optional
import json
import time
# Severity rungs, ordered least → most urgent. The UI colors them
# amber (warning), orange (error), red (critical). Sorted outputs put
# critical first so operators see the worst fires at the top.
SEVERITY_ORDER = ("warning", "error", "critical")
@dataclass
class DiagnosticAction:
"""A single recovery action attached to a diagnostic.
The ``kind`` determines how both the UI and CLI render it:
* ``reclaim`` / ``reassign`` POST to the matching /tasks/:id/*
endpoint; dashboard wires into the existing recovery popover.
* ``unblock`` PATCH status back to ``ready`` (for stuck-blocked
diagnostics).
* ``cli_hint`` print/copy a shell command (e.g.
``hermes -p <profile> auth``). No HTTP side effect.
* ``open_docs`` deep-link to the docs URL named in ``payload.url``.
* ``comment`` nudge the operator to add a comment (for
stuck-blocked tasks that need human input).
``suggested=True`` marks the action as the recommended first step;
the UI highlights it. Multiple actions can be suggested if they're
equally valid.
"""
kind: str
label: str
payload: dict = field(default_factory=dict)
suggested: bool = False
def to_dict(self) -> dict:
return {
"kind": self.kind,
"label": self.label,
"payload": self.payload,
"suggested": self.suggested,
}
@dataclass
class Diagnostic:
"""One active distress signal on a task."""
kind: str
severity: str # "warning" | "error" | "critical"
title: str
detail: str
actions: list[DiagnosticAction] = field(default_factory=list)
first_seen_at: int = 0
last_seen_at: int = 0
count: int = 1
# Optional: the run id this diagnostic is scoped to. None = task-wide.
run_id: Optional[int] = None
# Optional structured payload for the UI (phantom ids, failure count).
data: dict = field(default_factory=dict)
def to_dict(self) -> dict:
return {
"kind": self.kind,
"severity": self.severity,
"title": self.title,
"detail": self.detail,
"actions": [a.to_dict() for a in self.actions],
"first_seen_at": self.first_seen_at,
"last_seen_at": self.last_seen_at,
"count": self.count,
"run_id": self.run_id,
"data": self.data,
}
# ---------------------------------------------------------------------------
# Rule helpers
# ---------------------------------------------------------------------------
def _task_field(task, name, default=None):
"""Read a field from a task regardless of representation.
Callers pass sqlite3.Row (dict-like with [] but no attribute
access), kanban_db.Task dataclasses (attribute access), or plain
dicts (both). This normalises them so rule functions don't have
to branch on type each time.
"""
if task is None:
return default
# sqlite Row + plain dicts both support mapping access; Row also
# supports .keys().
try:
# Row raises IndexError if the key isn't a column in the query;
# dicts return default via .get. Handle both.
if hasattr(task, "keys") and name in task.keys():
return task[name]
except Exception:
pass
if isinstance(task, dict):
return task.get(name, default)
return getattr(task, name, default)
def _parse_payload(ev) -> dict:
"""Tolerate event.payload being either a dict or a JSON string."""
p = _task_field(ev, "payload", None)
if p is None:
return {}
if isinstance(p, dict):
return p
if isinstance(p, str):
try:
return json.loads(p) or {}
except Exception:
return {}
return {}
def _event_kind(ev) -> str:
return _task_field(ev, "kind", "") or ""
def _event_ts(ev) -> int:
t = _task_field(ev, "created_at", 0)
return int(t or 0)
def _active_hallucination_events(
events: Iterable[Any],
kind: str,
) -> list[Any]:
"""Return events of ``kind`` that have no ``completed``/``edited``
event *strictly after* them. Walks chronologically: each clean
event resets the accumulator; each matching event gets appended.
Events must be sorted by id (i.e. arrival order); callers pass the
task's full event list which the DB already returns in that order.
"""
# Events arrive sorted by id asc (chronological). Walk once, track
# which hallucination events are still "active" (no clean event
# supersedes them).
active: list[Any] = []
for ev in events:
k = _event_kind(ev)
if k in ("completed", "edited"):
active.clear()
elif k == kind:
active.append(ev)
return active
def _latest_clean_event_ts(events: Iterable[Any]) -> int:
"""Timestamp of the most recent clean completion / edit event.
Kept for general "has this task ever been successfully completed"
lookups; hallucination rules use ``_active_hallucination_events``
instead because they need strict ordering.
"""
latest = 0
for ev in events:
if _event_kind(ev) in ("completed", "edited"):
t = _event_ts(ev)
if t > latest:
latest = t
return latest
# Standard always-available actions. Every diagnostic can offer these as
# fallbacks regardless of kind — they're the two baseline recovery
# primitives the kernel supports.
def _generic_recovery_actions(task: Any, *, running: bool) -> list[DiagnosticAction]:
out: list[DiagnosticAction] = []
if running:
out.append(DiagnosticAction(
kind="reclaim",
label="Reclaim task",
payload={},
))
out.append(DiagnosticAction(
kind="reassign",
label="Reassign to different profile",
payload={"reclaim_first": running},
))
return out
# ---------------------------------------------------------------------------
# Rule implementations
# ---------------------------------------------------------------------------
# Each rule takes (task, events, runs, now_ts, config) and returns
# zero or more Diagnostic instances. ``events`` / ``runs`` are lists of
# kanban_db.Event / kanban_db.Run (or plain dicts matching the same
# shape — for test convenience).
RuleFn = Callable[[Any, list[Any], list[Any], int, dict], list[Diagnostic]]
def _rule_hallucinated_cards(task, events, runs, now, cfg) -> list[Diagnostic]:
"""Blocked-hallucination gate fires: a worker called kanban_complete
with created_cards that didn't exist or weren't created by the
completing profile. Task stayed in its prior state; the operator
needs to decide how to proceed.
Auto-clears when a successful completion (or edit) follows the
blocked event.
"""
hits = _active_hallucination_events(events, "completion_blocked_hallucination")
if not hits:
return []
phantom_ids: list[str] = []
first = _event_ts(hits[0])
last = _event_ts(hits[-1])
for ev in hits:
payload = _parse_payload(ev)
for pid in payload.get("phantom_cards", []) or []:
if pid not in phantom_ids:
phantom_ids.append(pid)
running = _task_field(task, "status") == "running"
actions: list[DiagnosticAction] = []
actions.append(DiagnosticAction(
kind="comment",
label="Add a comment explaining what to do",
suggested=False,
))
actions.extend(_generic_recovery_actions(task, running=running))
return [Diagnostic(
kind="hallucinated_cards",
severity="error",
title="Worker claimed cards that don't exist",
detail=(
f"The completing worker declared created_cards that either didn't "
f"exist or weren't created by its profile. The completion was "
f"blocked and the task stayed in its prior state. "
f"Usually means the worker hallucinated ids instead of capturing "
f"return values from kanban_create."
),
actions=actions,
first_seen_at=first,
last_seen_at=last,
count=len(hits),
data={"phantom_ids": phantom_ids},
)]
def _rule_prose_phantom_refs(task, events, runs, now, cfg) -> list[Diagnostic]:
"""Advisory prose-scan: the completion summary mentions ``t_<hex>``
ids that don't resolve. Non-blocking; surfaced as a warning only.
Auto-clears when a fresh clean completion arrives AFTER the
suspected event.
"""
hits = _active_hallucination_events(events, "suspected_hallucinated_references")
if not hits:
return []
phantom_refs: list[str] = []
for ev in hits:
for pid in _parse_payload(ev).get("phantom_refs", []) or []:
if pid not in phantom_refs:
phantom_refs.append(pid)
running = _task_field(task, "status") == "running"
return [Diagnostic(
kind="prose_phantom_refs",
severity="warning",
title="Completion summary references unknown task ids",
detail=(
"The completion summary mentions task ids that don't resolve "
"in this board's database. The completion itself succeeded, "
"but downstream consumers parsing the summary may be pointed "
"at cards that never existed."
),
actions=_generic_recovery_actions(task, running=running),
first_seen_at=_event_ts(hits[0]),
last_seen_at=_event_ts(hits[-1]),
count=len(hits),
data={"phantom_refs": phantom_refs},
)]
def _rule_repeated_spawn_failures(task, events, runs, now, cfg) -> list[Diagnostic]:
"""Task's ``spawn_failures`` counter is climbing — worker can't
even start. Usually a profile misconfiguration (missing config.yaml,
bad PATH/venv, wrong credentials).
Threshold: cfg["spawn_failure_threshold"] (default 3).
"""
threshold = int(cfg.get("spawn_failure_threshold", 3))
failures = _task_field(task, "spawn_failures", 0)
if failures is None or failures < threshold:
return []
last_err = _task_field(task, "last_spawn_error")
assignee = _task_field(task, "assignee")
actions: list[DiagnosticAction] = []
if assignee and assignee != "default":
actions.append(DiagnosticAction(
kind="cli_hint",
label=f"Verify profile: hermes -p {assignee} doctor",
payload={"command": f"hermes -p {assignee} doctor"},
suggested=True,
))
actions.append(DiagnosticAction(
kind="cli_hint",
label=f"Fix profile auth: hermes -p {assignee} auth",
payload={"command": f"hermes -p {assignee} auth"},
))
actions.extend(_generic_recovery_actions(task, running=False))
severity = "critical" if failures >= threshold * 2 else "error"
err_text = (last_err or "").strip() if last_err else ""
err_snippet = err_text[:500] + ("" if len(err_text) > 500 else "") if err_text else ""
if err_snippet:
title = f"Agent spawn failed {failures}x: {err_snippet.splitlines()[0][:160]}"
detail = (
f"The dispatcher tried to launch a worker {failures} times "
f"and failed every time. Full last error:\n\n{err_snippet}\n\n"
f"Common causes: missing config.yaml, bad venv/PATH, or "
f"missing credentials for the profile's configured provider."
)
else:
title = f"Agent spawn failed {failures}x (no error recorded)"
detail = (
f"The dispatcher tried to launch a worker {failures} times "
f"and failed every time, but no error text was captured. "
f"Usually a profile configuration issue — check profile "
f"health with the suggested command."
)
return [Diagnostic(
kind="repeated_spawn_failures",
severity=severity,
title=title,
detail=detail,
actions=actions,
first_seen_at=now,
last_seen_at=now,
count=failures,
data={"spawn_failures": failures, "last_spawn_error": last_err},
)]
def _rule_repeated_crashes(task, events, runs, now, cfg) -> list[Diagnostic]:
"""The worker spawns fine but keeps crashing mid-run. Check the last
N runs' outcomes; N consecutive ``crashed`` without a successful
``completed`` means something about the task + profile combo is
broken (OOM, missing dependency, tool it needs is down).
Threshold: cfg["crash_threshold"] (default 2).
"""
threshold = int(cfg.get("crash_threshold", 2))
ordered = sorted(runs, key=lambda r: _task_field(r, "id", 0))
# Count trailing consecutive 'crashed' outcomes.
consecutive = 0
last_err = None
for r in reversed(ordered):
outcome = _task_field(r, "outcome")
if outcome == "crashed":
consecutive += 1
if last_err is None:
last_err = _task_field(r, "error")
elif outcome in ("completed", "reclaimed"):
# A success (or manual reclaim) breaks the streak.
break
else:
# Other outcomes (timed_out, blocked, spawn_failed, gave_up)
# aren't crash signals — don't count them, but they also
# don't break the crash streak.
continue
if consecutive < threshold:
return []
task_id = _task_field(task, "id")
actions: list[DiagnosticAction] = []
if task_id:
actions.append(DiagnosticAction(
kind="cli_hint",
label=f"Check logs: hermes kanban log {task_id}",
payload={"command": f"hermes kanban log {task_id}"},
suggested=True,
))
running = _task_field(task, "status") == "running"
actions.extend(_generic_recovery_actions(task, running=running))
severity = "critical" if consecutive >= threshold * 2 else "error"
# Put the actual error up-front so operators see WHAT broke without
# having to open the logs. Truncate defensively — these can be huge
# (full tracebacks).
err_text = (last_err or "").strip() if last_err else ""
err_snippet = err_text[:500] + ("" if len(err_text) > 500 else "") if err_text else ""
if err_snippet:
title = f"Agent crashed {consecutive}x: {err_snippet.splitlines()[0][:160]}"
detail = (
f"The last {consecutive} runs ended with outcome=crashed. "
f"Full last error:\n\n{err_snippet}"
)
else:
title = f"Agent crashed {consecutive}x (no error recorded)"
detail = (
f"The last {consecutive} runs ended with outcome=crashed but "
f"no error text was captured. Check the worker log for more."
)
return [Diagnostic(
kind="repeated_crashes",
severity=severity,
title=title,
detail=detail,
actions=actions,
first_seen_at=now,
last_seen_at=now,
count=consecutive,
data={"consecutive_crashes": consecutive, "last_error": last_err},
)]
def _rule_stuck_in_blocked(task, events, runs, now, cfg) -> list[Diagnostic]:
"""Task has been in ``blocked`` status for too long without a comment.
Threshold: cfg["blocked_stale_hours"] (default 24).
Surfaced as a warning so humans know there's a pending unblock.
"""
hours = float(cfg.get("blocked_stale_hours", 24))
status = _task_field(task, "status")
if status != "blocked":
return []
# Find the most recent ``blocked`` event.
last_blocked_ts = 0
for ev in events:
if _event_kind(ev) == "blocked":
t = _event_ts(ev)
if t > last_blocked_ts:
last_blocked_ts = t
if last_blocked_ts == 0:
return []
age_hours = (now - last_blocked_ts) / 3600.0
if age_hours < hours:
return []
# Any comment / unblock after the block breaks the "stale" signal.
for ev in events:
if _event_kind(ev) in ("commented", "unblocked") and _event_ts(ev) > last_blocked_ts:
return []
actions: list[DiagnosticAction] = [
DiagnosticAction(
kind="comment",
label="Add a comment / unblock the task",
suggested=True,
),
]
return [Diagnostic(
kind="stuck_in_blocked",
severity="warning",
title=f"Task has been blocked for {int(age_hours)}h",
detail=(
f"This task transitioned to blocked {int(age_hours)}h ago and "
f"has had no comments or unblock attempts since. Blocked tasks "
f"are waiting for human input — check the block reason and "
f"either unblock with feedback or answer with a comment."
),
actions=actions,
first_seen_at=last_blocked_ts,
last_seen_at=last_blocked_ts,
count=1,
data={"blocked_at": last_blocked_ts, "age_hours": round(age_hours, 1)},
)]
# Registry — order matters: rules higher on the list render first when
# severity ties. Add new rules here.
_RULES: list[RuleFn] = [
_rule_hallucinated_cards,
_rule_prose_phantom_refs,
_rule_repeated_spawn_failures,
_rule_repeated_crashes,
_rule_stuck_in_blocked,
]
# Known kinds (for the UI's filter / legend / i18n keys). Update when
# rules are added.
DIAGNOSTIC_KINDS = (
"hallucinated_cards",
"prose_phantom_refs",
"repeated_spawn_failures",
"repeated_crashes",
"stuck_in_blocked",
)
DEFAULT_CONFIG = {
"spawn_failure_threshold": 3,
"crash_threshold": 2,
"blocked_stale_hours": 24,
}
def compute_task_diagnostics(
task,
events: list,
runs: list,
*,
now: Optional[int] = None,
config: Optional[dict] = None,
) -> list[Diagnostic]:
"""Run every rule against a single task's state and return a
severity-sorted list of active diagnostics.
Sorting: critical first, then error, then warning; ties broken by
most-recent ``last_seen_at``.
"""
now_ts = int(now if now is not None else time.time())
cfg = {**DEFAULT_CONFIG, **(config or {})}
out: list[Diagnostic] = []
for rule in _RULES:
try:
out.extend(rule(task, events, runs, now_ts, cfg))
except Exception:
# A broken rule must never crash the dashboard. Rule bugs
# get caught in tests; in production we'd rather drop the
# diagnostic than 500 a whole /board request.
continue
severity_idx = {s: i for i, s in enumerate(SEVERITY_ORDER)}
out.sort(
key=lambda d: (
-severity_idx.get(d.severity, -1),
-(d.last_seen_at or 0),
)
)
return out
def severity_of_highest(diagnostics: Iterable[Diagnostic]) -> Optional[str]:
"""Highest severity present in the list, or None if empty. Useful
for card badges that need a single color."""
highest_idx = -1
highest = None
for d in diagnostics:
idx = SEVERITY_ORDER.index(d.severity) if d.severity in SEVERITY_ORDER else -1
if idx > highest_idx:
highest_idx = idx
highest = d.severity
return highest
+617 -219
View File
File diff suppressed because it is too large Load Diff
+89 -10
View File
@@ -190,11 +190,18 @@ def _load_direct_aliases() -> dict[str, DirectAlias]:
model: "minimax-m2.7"
provider: custom
base_url: "https://ollama.com/v1"
Also reads ``model.aliases`` (set by ``hermes config set model.aliases.xxx``)
and converts simple string entries (``ds-flash: deepseek/deepseek-v4-flash``)
into DirectAlias objects. The provider is parsed from the ``provider/``
prefix in the value; if no slash, the current provider is used.
"""
merged = dict(_BUILTIN_DIRECT_ALIASES)
try:
from hermes_cli.config import load_config
cfg = load_config()
# --- model_aliases (dict-based format) ---
user_aliases = cfg.get("model_aliases")
if isinstance(user_aliases, dict):
for name, entry in user_aliases.items():
@@ -207,6 +214,30 @@ def _load_direct_aliases() -> dict[str, DirectAlias]:
merged[name.strip().lower()] = DirectAlias(
model=model, provider=provider, base_url=base_url,
)
# --- model.aliases (string-based format, from config set) ---
model_section = cfg.get("model", {})
if isinstance(model_section, dict):
simple_aliases = model_section.get("aliases")
if isinstance(simple_aliases, dict):
current_provider = model_section.get("provider", "")
for name, value in simple_aliases.items():
if not isinstance(value, str) or not value.strip():
continue
key = name.strip().lower()
if key in merged:
continue # don't override explicit model_aliases entries
val = value.strip()
if "/" in val:
provider, model = val.split("/", 1)
else:
provider = current_provider
model = val
merged[key] = DirectAlias(
model=model.strip(),
provider=provider.strip() or current_provider,
base_url="",
)
except Exception:
pass
return merged
@@ -1264,11 +1295,7 @@ def list_authenticated_providers(
from hermes_cli.auth import _load_auth_store
store = _load_auth_store()
providers_store = store.get("providers", {})
pool_store = store.get("credential_pool", {})
if store and (
pid in providers_store or hermes_slug in providers_store
or pid in pool_store or hermes_slug in pool_store
):
if store and (pid in providers_store or hermes_slug in providers_store):
has_creds = True
except Exception as exc:
logger.debug("Auth store check failed for %s: %s", pid, exc)
@@ -1364,11 +1391,7 @@ def list_authenticated_providers(
from hermes_cli.auth import _load_auth_store
_cp_store = _load_auth_store()
_cp_providers_store = _cp_store.get("providers", {})
_cp_pool_store = _cp_store.get("credential_pool", {})
if _cp_store and (
_cp.slug in _cp_providers_store
or _cp.slug in _cp_pool_store
):
if _cp_store and _cp.slug in _cp_providers_store:
_cp_has_creds = True
except Exception:
pass
@@ -1660,3 +1683,59 @@ def list_authenticated_providers(
results.sort(key=lambda r: (not r["is_current"], -r["total_models"]))
return results
def list_picker_providers(
current_provider: str = "",
user_providers: dict = None,
custom_providers: list | None = None,
max_models: int = 8,
) -> List[dict]:
"""Interactive-picker variant of :func:`list_authenticated_providers`.
Post-processes the base list so the ``/model`` picker (Telegram/Discord
inline keyboards) only surfaces models that are actually callable in the
current install:
- OpenRouter's model list is replaced with the output of
:func:`hermes_cli.models.fetch_openrouter_models`, which filters the
curated ``OPENROUTER_MODELS`` snapshot against the live OpenRouter
catalog. IDs the live catalog no longer carries drop out, so the
picker never offers a model the user can't call.
- Provider rows whose model list ends up empty are dropped, except
custom endpoints (``is_user_defined=True`` with an ``api_url``) where
the user may supply their own model set through config.
All other providers and metadata fields are passed through unchanged.
The typed ``/model <name>`` path is unaffected -- only the interactive
picker payload is narrowed.
"""
from hermes_cli.models import fetch_openrouter_models
providers = list_authenticated_providers(
current_provider=current_provider,
user_providers=user_providers,
custom_providers=custom_providers,
max_models=max_models,
)
filtered: List[dict] = []
for p in providers:
slug = str(p.get("slug", "")).lower()
if slug == "openrouter":
try:
live = fetch_openrouter_models()
live_ids = [mid for mid, _ in live]
except Exception:
live_ids = list(p.get("models", []))
p = dict(p)
p["models"] = live_ids[:max_models]
p["total_models"] = len(live_ids)
has_models = bool(p.get("models"))
is_custom_endpoint = bool(p.get("is_user_defined")) and bool(p.get("api_url"))
if not has_models and not is_custom_endpoint:
continue
filtered.append(p)
return filtered
+80 -8
View File
@@ -806,6 +806,25 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("ai-gateway", "Vercel AI Gateway", "Vercel AI Gateway"),
]
# Auto-extend CANONICAL_PROVIDERS with any provider registered in providers/
# that is not already in the list above. Adding providers/*.py is sufficient
# to expose a new provider in the model picker, /model, and all downstream
# consumers — no edits to this file needed.
_canonical_slugs = {p.slug for p in CANONICAL_PROVIDERS}
try:
from providers import list_providers as _list_providers_for_canonical
for _pp in _list_providers_for_canonical():
if _pp.name in _canonical_slugs:
continue
if _pp.auth_type in ("oauth_device_code", "oauth_external", "external_process", "aws_sdk", "copilot"):
continue # non-api-key flows need bespoke picker UX; skip auto-inject
_label = _pp.display_name or _pp.name
_desc = _pp.description or f"{_label} (direct API)"
CANONICAL_PROVIDERS.append(ProviderEntry(_pp.name, _label, _desc))
_canonical_slugs.add(_pp.name)
except Exception:
pass
# Derived dicts — used throughout the codebase
_PROVIDER_LABELS = {p.slug: p.label for p in CANONICAL_PROVIDERS}
_PROVIDER_LABELS["custom"] = "Custom endpoint" # special case: not a named provider
@@ -1740,10 +1759,20 @@ def model_supports_fast_mode(model_id: Optional[str]) -> bool:
def _is_anthropic_fast_model(model_id: Optional[str]) -> bool:
"""Return True if the model is a Claude model eligible for Anthropic Fast Mode."""
"""Return True if the model is a Claude model eligible for Anthropic Fast Mode.
Fast mode is currently supported on Claude Opus 4.6 only. Per Anthropic's
docs (https://platform.claude.com/docs/en/build-with-claude/fast-mode):
"Fast mode is currently supported on Opus 4.6 only. Sending speed: fast
with an unsupported model returns an error." Opus 4.7 explicitly rejects
the ``speed`` parameter with HTTP 400.
"""
raw = _strip_vendor_prefix(str(model_id or ""))
base = raw.split(":")[0]
return base.startswith("claude-")
if not base.startswith("claude-"):
return False
# Only Opus 4.6 supports fast mode at present.
return "opus-4-6" in base or "opus-4.6" in base
def resolve_fast_mode_overrides(model_id: Optional[str]) -> dict[str, Any] | None:
@@ -2013,6 +2042,34 @@ def provider_model_ids(provider: Optional[str], *, force_refresh: bool = False)
return ids
except Exception:
pass
# ── Profile-based generic live fetch (all simple api-key providers) ──
# Handles any provider registered in providers/ with auth_type="api_key".
# Replaces per-provider copy-paste blocks (stepfun, gmi, zai, etc.).
try:
from providers import get_provider_profile
from hermes_cli.auth import resolve_api_key_provider_credentials
_p = get_provider_profile(normalized)
if _p and _p.auth_type == "api_key" and _p.base_url:
try:
creds = resolve_api_key_provider_credentials(normalized)
api_key = str(creds.get("api_key") or "").strip()
base_url = str(creds.get("base_url") or "").strip()
except Exception:
api_key, base_url = "", _p.base_url
if not base_url:
base_url = _p.base_url
if api_key:
live = _p.fetch_models(api_key=api_key)
if live:
return live
# Use profile's fallback_models if defined
if _p.fallback_models:
return list(_p.fallback_models)
except Exception:
pass
curated_static = list(_PROVIDER_MODELS.get(normalized, []))
if normalized in _MODELS_DEV_PREFERRED:
return _merge_with_models_dev(normalized, curated_static)
@@ -2896,6 +2953,19 @@ def fetch_api_models(
_OLLAMA_CLOUD_CACHE_TTL = 3600 # 1 hour
def _strip_ollama_cloud_suffix(model_id: str) -> str:
"""Strip :cloud / -cloud suffixes that models.dev appends to Ollama Cloud IDs.
The live API uses clean IDs (e.g. 'kimi-k2.6') while models.dev sometimes
returns them as 'kimi-k2.6:cloud'. Normalising before the dedup merge
prevents duplicate entries in the merged model list.
"""
for suffix in (":cloud", "-cloud"):
if model_id.endswith(suffix):
return model_id[: -len(suffix)]
return model_id
def _ollama_cloud_cache_path() -> Path:
"""Return the path for the Ollama Cloud model cache."""
from hermes_constants import get_hermes_home
@@ -2991,9 +3061,10 @@ def fetch_ollama_cloud_models(
seen.add(m)
merged.append(m)
for m in mdev_models:
if m and m not in seen:
seen.add(m)
merged.append(m)
normalized = _strip_ollama_cloud_suffix(m)
if normalized and normalized not in seen:
seen.add(normalized)
merged.append(normalized)
if merged:
_save_ollama_cloud_cache(merged)
return merged
@@ -3185,11 +3256,12 @@ def validate_requested_model(
if suggestions:
suggestion_text = "\n Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
return {
"accepted": False,
"persist": False,
"accepted": True,
"persist": True,
"recognized": False,
"message": (
f"Model `{requested}` was not found in the OpenAI Codex model listing."
f"Note: `{requested}` was not found in the OpenAI Codex model listing. "
"It may still work if your ChatGPT/Codex account has access to a newer or hidden model ID."
f"{suggestion_text}"
),
}
+35 -5
View File
@@ -173,7 +173,7 @@ def _get_enabled_plugins() -> Optional[set]:
# Data classes
# ---------------------------------------------------------------------------
_VALID_PLUGIN_KINDS: Set[str] = {"standalone", "backend", "exclusive", "platform"}
_VALID_PLUGIN_KINDS: Set[str] = {"standalone", "backend", "exclusive", "platform", "model-provider"}
@dataclass
@@ -643,15 +643,17 @@ class PluginManager:
# - flat: ``plugins/disk-cleanup/plugin.yaml`` (standalone)
# - category: ``plugins/image_gen/openai/plugin.yaml`` (backend)
#
# ``memory/`` and ``context_engine/`` are skipped at the top level —
# they have their own discovery systems. ``platforms/`` is a category
# holding platform adapters (scanned one level deeper below).
# ``memory/``, ``context_engine/``, and ``model-providers/`` are
# skipped at the top level — they have their own discovery systems
# (plugins/memory/__init__.py, providers/__init__.py). ``platforms/``
# is a category holding platform adapters (scanned one level deeper
# below).
repo_plugins = get_bundled_plugins_dir()
manifests.extend(
self._scan_directory(
repo_plugins,
source="bundled",
skip_names={"memory", "context_engine", "platforms"},
skip_names={"memory", "context_engine", "platforms", "model-providers"},
)
)
manifests.extend(
@@ -709,6 +711,21 @@ class PluginManager:
)
continue
# Model provider plugins are loaded by providers/__init__.py
# (its own lazy discovery keyed off first get_provider_profile()
# call). We record the manifest here for introspection but do
# not import the module — a second import would create two
# ProviderProfile instances and break the "last writer wins"
# override semantics between bundled and user plugins.
if manifest.kind == "model-provider":
loaded = LoadedPlugin(manifest=manifest, enabled=True)
self._plugins[lookup_key] = loaded
logger.debug(
"Skipping '%s' (model-provider, handled by providers/ discovery)",
lookup_key,
)
continue
# Built-in backends auto-load — they ship with hermes and must
# just work. Selection among them (e.g. which image_gen backend
# services calls) is driven by ``<category>.provider`` config,
@@ -886,6 +903,19 @@ class PluginManager:
"treating as kind='exclusive'",
key,
)
elif (
"register_provider" in source_text
and "ProviderProfile" in source_text
):
# Model provider plugin (calls register_provider()
# from ``providers`` with a ProviderProfile). Route
# to providers/__init__.py discovery.
kind = "model-provider"
logger.debug(
"Plugin %s: detected model provider, "
"treating as kind='model-provider'",
key,
)
except Exception:
pass
+111 -73
View File
@@ -179,8 +179,33 @@ def _get_wrapper_dir() -> Path:
# Validation
# ---------------------------------------------------------------------------
def normalize_profile_name(name: str) -> str:
"""Return the canonical profile id used on disk and in CLI ``-p`` argv.
Named profiles are stored lowercase under ``profiles/<id>/``. The special
alias ``default`` is matched case-insensitively (``Default`` ``default``).
Dashboards and tools may pass title-cased display labels; normalize before
validation, assignment, and subprocess spawn (see issue #18498).
"""
if not isinstance(name, str):
name = str(name)
stripped = name.strip()
if not stripped:
raise ValueError("profile name cannot be empty")
if stripped.casefold() == "default":
return "default"
return stripped.lower()
def validate_profile_name(name: str) -> None:
"""Raise ``ValueError`` if *name* is not a valid profile identifier."""
"""Raise ``ValueError`` if *name* is not a valid profile identifier.
Validates the input as-given strict lowercase match. Callers that accept
mixed-case or title-cased input from users (dashboard UI, CLI args) should
call :func:`normalize_profile_name` first. This separation keeps validate
honest about what the on-disk directory name must look like, while
ingress-point normalization handles UX flexibility (see #18498).
"""
if name == "default":
return # special alias for ~/.hermes
if not _PROFILE_ID_RE.match(name):
@@ -192,16 +217,18 @@ def validate_profile_name(name: str) -> None:
def get_profile_dir(name: str) -> Path:
"""Resolve a profile name to its HERMES_HOME directory."""
if name == "default":
canon = normalize_profile_name(name)
if canon == "default":
return _get_default_hermes_home()
return _get_profiles_root() / name
return _get_profiles_root() / canon
def profile_exists(name: str) -> bool:
"""Check whether a profile directory exists."""
if name == "default":
canon = normalize_profile_name(name)
if canon == "default":
return True
return get_profile_dir(name).is_dir()
return get_profile_dir(canon).is_dir()
# ---------------------------------------------------------------------------
@@ -213,28 +240,29 @@ def check_alias_collision(name: str) -> Optional[str]:
Checks: reserved names, hermes subcommands, existing binaries in PATH.
"""
if name in _RESERVED_NAMES:
return f"'{name}' is a reserved name"
if name in _HERMES_SUBCOMMANDS:
return f"'{name}' conflicts with a hermes subcommand"
canon = normalize_profile_name(name)
if canon in _RESERVED_NAMES:
return f"'{canon}' is a reserved name"
if canon in _HERMES_SUBCOMMANDS:
return f"'{canon}' conflicts with a hermes subcommand"
# Check existing commands in PATH
wrapper_dir = _get_wrapper_dir()
try:
result = subprocess.run(
["which", name], capture_output=True, text=True, timeout=5,
["which", canon], capture_output=True, text=True, timeout=5,
)
if result.returncode == 0:
existing_path = result.stdout.strip()
# Allow overwriting our own wrappers
if existing_path == str(wrapper_dir / name):
if existing_path == str(wrapper_dir / canon):
try:
content = (wrapper_dir / name).read_text()
content = (wrapper_dir / canon).read_text()
if "hermes -p" in content:
return None # it's our wrapper, safe to overwrite
except Exception:
pass
return f"'{name}' conflicts with an existing command ({existing_path})"
return f"'{canon}' conflicts with an existing command ({existing_path})"
except (FileNotFoundError, subprocess.TimeoutExpired):
pass
@@ -252,6 +280,7 @@ def create_wrapper_script(name: str) -> Optional[Path]:
Returns the path to the created wrapper, or None if creation failed.
"""
canon = normalize_profile_name(name)
wrapper_dir = _get_wrapper_dir()
try:
wrapper_dir.mkdir(parents=True, exist_ok=True)
@@ -259,9 +288,9 @@ def create_wrapper_script(name: str) -> Optional[Path]:
print(f"⚠ Could not create {wrapper_dir}: {e}")
return None
wrapper_path = wrapper_dir / name
wrapper_path = wrapper_dir / canon
try:
wrapper_path.write_text(f'#!/bin/sh\nexec hermes -p {name} "$@"\n')
wrapper_path.write_text(f'#!/bin/sh\nexec hermes -p {canon} "$@"\n')
wrapper_path.chmod(wrapper_path.stat().st_mode | stat.S_IEXEC | stat.S_IXGRP | stat.S_IXOTH)
return wrapper_path
except OSError as e:
@@ -271,7 +300,7 @@ def create_wrapper_script(name: str) -> Optional[Path]:
def remove_wrapper_script(name: str) -> bool:
"""Remove the wrapper script for a profile. Returns True if removed."""
wrapper_path = _get_wrapper_dir() / name
wrapper_path = _get_wrapper_dir() / normalize_profile_name(name)
if wrapper_path.exists():
try:
# Verify it's our wrapper before removing
@@ -421,16 +450,17 @@ def create_profile(
Path
The newly created profile directory.
"""
validate_profile_name(name)
canon = normalize_profile_name(name)
validate_profile_name(canon)
if name == "default":
if canon == "default":
raise ValueError(
"Cannot create a profile named 'default' — it is the built-in profile (~/.hermes)."
)
profile_dir = get_profile_dir(name)
profile_dir = get_profile_dir(canon)
if profile_dir.exists():
raise FileExistsError(f"Profile '{name}' already exists at {profile_dir}")
raise FileExistsError(f"Profile '{canon}' already exists at {profile_dir}")
# Resolve clone source
source_dir = None
@@ -440,6 +470,7 @@ def create_profile(
from hermes_constants import get_hermes_home
source_dir = get_hermes_home()
else:
clone_from = normalize_profile_name(clone_from)
validate_profile_name(clone_from)
source_dir = get_profile_dir(clone_from)
if not source_dir.is_dir():
@@ -540,24 +571,25 @@ def delete_profile(name: str, yes: bool = False) -> Path:
Returns the path that was removed.
"""
validate_profile_name(name)
canon = normalize_profile_name(name)
validate_profile_name(canon)
if name == "default":
if canon == "default":
raise ValueError(
"Cannot delete the default profile (~/.hermes).\n"
"To remove everything, use: hermes uninstall"
)
profile_dir = get_profile_dir(name)
profile_dir = get_profile_dir(canon)
if not profile_dir.is_dir():
raise FileNotFoundError(f"Profile '{name}' does not exist.")
raise FileNotFoundError(f"Profile '{canon}' does not exist.")
# Show what will be deleted
model, provider = _read_config_model(profile_dir)
gw_running = _check_gateway_running(profile_dir)
skill_count = _count_skills(profile_dir)
print(f"\nProfile: {name}")
print(f"\nProfile: {canon}")
print(f"Path: {profile_dir}")
if model:
print(f"Model: {model}" + (f" ({provider})" if provider else ""))
@@ -569,7 +601,7 @@ def delete_profile(name: str, yes: bool = False) -> Path:
]
# Check for service
wrapper_path = _get_wrapper_dir() / name
wrapper_path = _get_wrapper_dir() / canon
has_wrapper = wrapper_path.exists()
if has_wrapper:
items.append(f"Command alias ({wrapper_path})")
@@ -584,16 +616,16 @@ def delete_profile(name: str, yes: bool = False) -> Path:
if not yes:
print()
try:
confirm = input(f"Type '{name}' to confirm: ").strip()
confirm = input(f"Type '{canon}' to confirm: ").strip()
except (KeyboardInterrupt, EOFError):
print("\nCancelled.")
return profile_dir
if confirm != name:
if confirm != canon:
print("Cancelled.")
return profile_dir
# 1. Disable service (prevents auto-restart)
_cleanup_gateway_service(name, profile_dir)
_cleanup_gateway_service(canon, profile_dir)
# 2. Stop running gateway
if gw_running:
@@ -601,7 +633,7 @@ def delete_profile(name: str, yes: bool = False) -> Path:
# 3. Remove wrapper script
if has_wrapper:
if remove_wrapper_script(name):
if remove_wrapper_script(canon):
print(f"✓ Removed {wrapper_path}")
# 4. Remove profile directory
@@ -614,13 +646,13 @@ def delete_profile(name: str, yes: bool = False) -> Path:
# 5. Clear active_profile if it pointed to this profile
try:
active = get_active_profile()
if active == name:
if active == canon:
set_active_profile("default")
print("✓ Active profile reset to default")
except Exception:
pass
print(f"\nProfile '{name}' deleted.")
print(f"\nProfile '{canon}' deleted.")
return profile_dir
@@ -730,22 +762,23 @@ def set_active_profile(name: str) -> None:
Writes to ``~/.hermes/active_profile``. Use ``"default"`` to clear.
"""
validate_profile_name(name)
if name != "default" and not profile_exists(name):
canon = normalize_profile_name(name)
validate_profile_name(canon)
if canon != "default" and not profile_exists(canon):
raise FileNotFoundError(
f"Profile '{name}' does not exist. "
f"Create it with: hermes profile create {name}"
f"Profile '{canon}' does not exist. "
f"Create it with: hermes profile create {canon}"
)
path = _get_active_profile_path()
path.parent.mkdir(parents=True, exist_ok=True)
if name == "default":
if canon == "default":
# Remove the file to indicate default
path.unlink(missing_ok=True)
else:
# Atomic write
tmp = path.with_suffix(".tmp")
tmp.write_text(name + "\n")
tmp.write_text(canon + "\n")
tmp.replace(path)
@@ -811,16 +844,17 @@ def export_profile(name: str, output_path: str) -> Path:
"""
import tempfile
validate_profile_name(name)
profile_dir = get_profile_dir(name)
canon = normalize_profile_name(name)
validate_profile_name(canon)
profile_dir = get_profile_dir(canon)
if not profile_dir.is_dir():
raise FileNotFoundError(f"Profile '{name}' does not exist.")
raise FileNotFoundError(f"Profile '{canon}' does not exist.")
output = Path(output_path)
# shutil.make_archive wants the base name without extension
base = str(output).removesuffix(".tar.gz").removesuffix(".tgz")
if name == "default":
if canon == "default":
# The default profile IS ~/.hermes itself — its parent is ~/ and its
# directory name is ".hermes", not "default". We stage a clean copy
# under a temp dir so the archive contains ``default/...``.
@@ -836,14 +870,14 @@ def export_profile(name: str, output_path: str) -> Path:
# Named profiles — stage a filtered copy to exclude credentials
with tempfile.TemporaryDirectory() as tmpdir:
staged = Path(tmpdir) / name
staged = Path(tmpdir) / canon
_CREDENTIAL_FILES = {"auth.json", ".env"}
shutil.copytree(
profile_dir,
staged,
ignore=lambda d, contents: _CREDENTIAL_FILES & set(contents),
)
result = shutil.make_archive(base, "gztar", tmpdir, name)
result = shutil.make_archive(base, "gztar", tmpdir, canon)
return Path(result)
@@ -952,16 +986,17 @@ def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
# Archives exported from the default profile have "default/" as top-level
# dir. Importing as "default" would target ~/.hermes itself — disallow
# that and guide the user toward a named profile.
if inferred_name == "default":
canon = normalize_profile_name(inferred_name)
validate_profile_name(canon)
if canon == "default":
raise ValueError(
"Cannot import as 'default' — that is the built-in root profile (~/.hermes). "
"Specify a different name: hermes profile import <archive> --name <name>"
)
validate_profile_name(inferred_name)
profile_dir = get_profile_dir(inferred_name)
profile_dir = get_profile_dir(canon)
if profile_dir.exists():
raise FileExistsError(f"Profile '{inferred_name}' already exists at {profile_dir}")
raise FileExistsError(f"Profile '{canon}' already exists at {profile_dir}")
profiles_root = _get_profiles_root()
profiles_root.mkdir(parents=True, exist_ok=True)
@@ -977,8 +1012,8 @@ def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
)
final_source = extracted
if archive_root != inferred_name:
final_source = staging_root / inferred_name
if archive_root != canon:
final_source = staging_root / canon
extracted.rename(final_source)
shutil.move(str(final_source), str(profile_dir))
@@ -1048,25 +1083,27 @@ def rename_profile(old_name: str, new_name: str) -> Path:
Returns the new profile directory.
"""
validate_profile_name(old_name)
validate_profile_name(new_name)
old_canon = normalize_profile_name(old_name)
new_canon = normalize_profile_name(new_name)
validate_profile_name(old_canon)
validate_profile_name(new_canon)
if old_name == "default":
if old_canon == "default":
raise ValueError("Cannot rename the default profile.")
if new_name == "default":
if new_canon == "default":
raise ValueError("Cannot rename to 'default' — it is reserved.")
old_dir = get_profile_dir(old_name)
new_dir = get_profile_dir(new_name)
old_dir = get_profile_dir(old_canon)
new_dir = get_profile_dir(new_canon)
if not old_dir.is_dir():
raise FileNotFoundError(f"Profile '{old_name}' does not exist.")
raise FileNotFoundError(f"Profile '{old_canon}' does not exist.")
if new_dir.exists():
raise FileExistsError(f"Profile '{new_name}' already exists.")
raise FileExistsError(f"Profile '{new_canon}' already exists.")
# 1. Stop gateway if running
if _check_gateway_running(old_dir):
_cleanup_gateway_service(old_name, old_dir)
_cleanup_gateway_service(old_canon, old_dir)
_stop_gateway_process(old_dir)
# 2. Rename directory
@@ -1074,22 +1111,22 @@ def rename_profile(old_name: str, new_name: str) -> Path:
print(f"✓ Renamed {old_dir.name}{new_dir.name}")
# 3. Update profile-scoped Honcho host blocks, preserving aiPeer identity
_migrate_honcho_profile_host(old_name, new_name, new_dir)
_migrate_honcho_profile_host(old_canon, new_canon, new_dir)
# 4. Update wrapper script
remove_wrapper_script(old_name)
collision = check_alias_collision(new_name)
remove_wrapper_script(old_canon)
collision = check_alias_collision(new_canon)
if not collision:
create_wrapper_script(new_name)
print(f"✓ Alias updated: {new_name}")
create_wrapper_script(new_canon)
print(f"✓ Alias updated: {new_canon}")
else:
print(f"⚠ Cannot create alias '{new_name}'{collision}")
print(f"⚠ Cannot create alias '{new_canon}'{collision}")
# 5. Update active_profile if it pointed to old name
try:
if get_active_profile() == old_name:
set_active_profile(new_name)
print(f"✓ Active profile updated: {new_name}")
if get_active_profile() == old_canon:
set_active_profile(new_canon)
print(f"✓ Active profile updated: {new_canon}")
except Exception:
pass
@@ -1191,13 +1228,14 @@ def resolve_profile_env(profile_name: str) -> str:
Called early in the CLI entry point, before any hermes modules
are imported, to set the HERMES_HOME environment variable.
"""
validate_profile_name(profile_name)
profile_dir = get_profile_dir(profile_name)
canon = normalize_profile_name(profile_name)
validate_profile_name(canon)
profile_dir = get_profile_dir(canon)
if profile_name != "default" and not profile_dir.is_dir():
if canon != "default" and not profile_dir.is_dir():
raise FileNotFoundError(
f"Profile '{profile_name}' does not exist. "
f"Create it with: hermes profile create {profile_name}"
f"Profile '{canon}' does not exist. "
f"Create it with: hermes profile create {canon}"
)
return str(profile_dir)
+8 -3
View File
@@ -108,9 +108,14 @@ class PtyBridge:
"(or pip install -e '.[pty]')."
)
raise PtyUnavailableError("Pseudo-terminals are unavailable.")
# Let caller-supplied env fully override inheritance; if they pass
# None we inherit the server's env (same semantics as subprocess).
spawn_env = os.environ.copy() if env is None else env
# PTY-hosted programs expect TERM to describe the terminal type.
# CI often runs without TERM in the parent process, which makes
# simple terminal probes like `tput cols` fail before winsize reads.
# Preserve explicit caller overrides, but backfill a sensible default
# when TERM is missing or blank.
spawn_env = (os.environ.copy() if env is None else env.copy())
if not spawn_env.get("TERM"):
spawn_env["TERM"] = "xterm-256color"
proc = ptyprocess.PtyProcess.spawn( # type: ignore[union-attr]
list(argv),
cwd=cwd,
+15 -2
View File
@@ -15,6 +15,7 @@ import importlib.util
import json
import logging
import os
import re
import shutil
import sys
import copy
@@ -208,12 +209,23 @@ def prompt(question: str, default: str = None, password: bool = False) -> str:
else:
value = input(color(display, Colors.YELLOW))
return value.strip() or default or ""
cleaned = _sanitize_pasted_input(value)
return cleaned.strip() or default or ""
except (KeyboardInterrupt, EOFError):
print()
sys.exit(1)
_BRACKETED_PASTE_PATTERN = re.compile(r"\x1b\[\s*200~|\x1b\[\s*201~")
def _sanitize_pasted_input(value: str) -> str:
"""Strip terminal bracketed-paste control markers from pasted text."""
if not isinstance(value, str) or not value:
return value
return _BRACKETED_PASTE_PATTERN.sub("", value)
def _curses_prompt_choice(question: str, choices: list, default: int = 0, description: str | None = None) -> int:
"""Single-select menu using curses. Delegates to curses_radiolist."""
from hermes_cli.curses_ui import curses_radiolist
@@ -964,7 +976,8 @@ def setup_model_provider(config: dict, *, quick: bool = False):
)
else:
_selected_vision_model = prompt(" Vision model (blank = use main/custom default)").strip()
save_env_value("AUXILIARY_VISION_MODEL", _selected_vision_model)
if _selected_vision_model:
save_env_value("AUXILIARY_VISION_MODEL", _selected_vision_model)
print_success(
f"Vision configured with {_base_url}"
+ (f" ({_selected_vision_model})" if _selected_vision_model else "")
+25 -5
View File
@@ -122,11 +122,16 @@ def show_status(args):
print()
print(color("◆ API Keys", Colors.CYAN, Colors.BOLD))
keys = {
# Values may be a single env var name (str) or a tuple of alternates (first found wins).
keys: dict[str, str | tuple[str, ...]] = {
"OpenRouter": "OPENROUTER_API_KEY",
"OpenAI": "OPENAI_API_KEY",
"NVIDIA": "NVIDIA_API_KEY",
"Z.AI/GLM": "GLM_API_KEY",
"Anthropic": ("ANTHROPIC_API_KEY", "ANTHROPIC_TOKEN"),
"Google / Gemini": ("GOOGLE_API_KEY", "GEMINI_API_KEY"),
"DeepSeek": "DEEPSEEK_API_KEY",
"xAI / Grok": "XAI_API_KEY",
"NVIDIA NIM": "NVIDIA_API_KEY",
"Z.AI / GLM": "GLM_API_KEY",
"Kimi": "KIMI_API_KEY",
"StepFun Step Plan": "STEPFUN_API_KEY",
"MiniMax": "MINIMAX_API_KEY",
@@ -142,8 +147,23 @@ def show_status(args):
"GitHub": "GITHUB_TOKEN",
}
for name, env_var in keys.items():
value = get_env_value(env_var) or ""
def _resolve_env(env_ref) -> str:
"""Return first non-empty env var value from a str or tuple of names."""
if isinstance(env_ref, tuple):
for candidate in env_ref:
v = get_env_value(candidate) or ""
if v:
return v
return ""
return get_env_value(env_ref) or ""
for name, env_ref in keys.items():
# Anthropic already has a dedicated lookup below; keep that as the
# single source of truth (it also resolves OAuth tokens), skip here
# so we don't print two "Anthropic" rows.
if name == "Anthropic":
continue
value = _resolve_env(env_ref)
has_key = bool(value)
display = redact_key(value) if not show_all else value
print(f" {name:<12} {check_mark(has_key)} {display}")
+138
View File
@@ -334,6 +334,144 @@ TIPS = [
"MCP ${ENV_VAR} placeholders in config are resolved at server spawn — including vars from ~/.hermes/.env.",
"Skills from trusted repos (NousResearch) get a 'trusted' security level; community skills get extra scanning.",
"The skills quarantine at ~/.hermes/skills/.hub/quarantine/ holds skills pending security review.",
# --- Advanced Slash Commands ---
'/steer <prompt> injects a note after the next tool call — nudge direction mid-task without interrupting.',
'/goal <text> sets a standing Ralph-loop objective — Hermes auto-continues turn after turn until a judge says done.',
'/snapshot create [label] saves a full state snapshot of Hermes config; /snapshot restore <id> reverts later.',
'/copy [N] copies the last assistant response to your clipboard, or the Nth-from-last with a number.',
'/redraw forces a full UI repaint, fixing terminal drift after tmux resize or mouse selection artifacts.',
'/agents (alias /tasks) shows active agents and running background tasks across the current session.',
'/footer toggles the gateway footer on final replies showing model, tool counts, and turn timing.',
'/busy queue|steer|interrupt controls what pressing Enter does while Hermes is working.',
'/topic in Telegram DMs enables user-managed multi-session topic mode — /topic <id> restores past sessions inline.',
'/approve session|always runs a pending dangerous command with your chosen trust scope; /deny rejects it.',
'/restart gracefully restarts the gateway after draining active runs, then pings the requester when back up.',
'/kanban boards switch <slug> changes the active multi-project Kanban board from inside chat.',
'/reload reloads ~/.hermes/.env into the running session — pick up new API keys without restarting.',
# --- Cron (no-agent & scripts) ---
'cronjob with no_agent=True runs a script on schedule and sends its stdout directly — zero tokens, zero LLM.',
'An empty cron script stdout means silent tick — nothing is delivered, perfect for threshold watchdogs.',
"HERMES_CRON_MAX_PARALLEL (default 4) caps how many cron jobs run per tick so bursts don't saturate your keys.",
# --- Gateway Hooks ---
'Gateway hooks live under ~/.hermes/hooks/<name>/ with HOOK.yaml + handler.py — handler must be named `handle`.',
'Hook events include gateway:startup, session:start, agent:step, and command:* wildcard subscriptions.',
'Drop a ~/.hermes/BOOT.md checklist and a gateway:startup hook runs it as a one-shot agent every boot.',
# --- Curator ---
'hermes curator run --dry-run previews what the curator would archive or consolidate without mutating anything.',
"hermes curator pin <skill> hard-fences a skill against both auto-archival and the agent's skill_manage tool.",
'hermes curator rollback restores skills from a pre-run snapshot — backups live under skills/.curator_backups/.',
# --- Credential Pools & Routing ---
'hermes auth reset <provider> clears all cooldowns and exhaustion flags on a credential pool.',
'credential_pool_strategies.<provider>: round_robin cycles keys evenly instead of the fill_first default.',
'use_gateway: true per-tool routes web, image, tts, or browser through your Nous subscription — no extra keys.',
'provider_routing.data_collection: deny excludes data-storing providers on OpenRouter.',
'provider_routing.require_parameters: true only routes to providers that support every param in your request.',
# --- TUI & Dashboard ---
'HERMES_TUI_RESUME=1 auto-re-attaches to the most recent TUI session on launch — handy after SSH drops.',
"HERMES_TUI_THEME=light|dark|<hex> forces the TUI theme on terminals that don't set COLORFGBG.",
'Ctrl+G or Ctrl+X Ctrl+E in the TUI opens the input buffer in $EDITOR for long multi-line prompts.',
'The TUI renders LaTeX inline — $E=mc^2$ becomes Unicode math instead of raw TeX.',
'hermes dashboard launches a local web UI at 127.0.0.1:9119 — zero data leaves localhost.',
'hermes dashboard --tui embeds the full Hermes TUI in your browser via xterm.js and a WebSocket PTY.',
'Drop a YAML in ~/.hermes/dashboard-themes/ with two palette colors to reskin the entire dashboard.',
'Dashboard plugins are drop-in: manifest.json + JS bundle in ~/.hermes/dashboard-plugins/ — no npm build required.',
'layoutVariant: cockpit in a dashboard theme adds a 260px left rail that plugins can populate via the sidebar slot.',
# --- Env Vars & Config Gates ---
"display.tool_progress_command: true exposes /verbose on messaging platforms; it's CLI-only by default.",
'HERMES_BACKGROUND_NOTIFICATIONS=result only pings when background tasks finish (vs all/error/off).',
'HERMES_WRITE_SAFE_ROOT restricts write_file and patch to a directory prefix; writes outside require approval.',
'HERMES_IGNORE_RULES skips auto-injection of AGENTS.md, SOUL.md, .cursorrules, memory, and preloaded skills.',
'HERMES_ACCEPT_HOOKS auto-approves unseen shell hooks declared in config.yaml without a TTY prompt.',
'auxiliary.goal_judge.model routes the /goal judge to a cheap fast model to keep loop cost near zero.',
'Checkpoints skip directories with more than 50,000 files to avoid slow git operations on massive monorepos.',
# --- TTS ---
'tts.provider: piper runs 44-language local TTS on CPU — voices auto-download to ~/.hermes/cache/piper-voices/.',
'tts.providers.<name>.type: command wires any CLI TTS engine with {input_path} and {output_path} placeholders.',
# --- API Server & Proxy ---
'API_SERVER_ENABLED=true runs an OpenAI-compatible endpoint alongside the gateway for Open WebUI and LibreChat.',
'GATEWAY_PROXY_URL runs a split setup: platform I/O locally, agent work delegated to a remote API server.',
# --- Platform-specific ---
'MATRIX_DEVICE_ID pins a stable device ID for E2EE — without it, keys rotate every start and historic decrypt breaks.',
'TELEGRAM_WEBHOOK_SECRET is required whenever TELEGRAM_WEBHOOK_URL is set — generate with openssl rand -hex 32.',
# --- Batch ---
"batch_runner.py --resume content-matches completed prompts by text so dataset reorders don't re-run finished work.",
# --- Less-Known Slash Commands ---
'/new starts a fresh session in place (alias /reset) — fresh session ID, clean history, CLI stays open.',
'/clear wipes the terminal screen AND starts a new session — one shortcut for a visual reset.',
'/history prints the current conversation in-line without leaving the CLI — useful for a quick re-read.',
'/save writes the current conversation to disk without ending the session.',
'/status shows session info at a glance: ID, title, model, token usage, and elapsed time.',
'/image <path> attaches a local image file for your next prompt without pasting or drag-and-drop.',
'/platforms shows gateway and messaging-platform connection status right from inside chat.',
'/commands paginates the full slash-command + installed-skill list — useful on platforms without tab completion.',
'/toolsets lists every available toolset so you know what -t/--toolsets accepts.',
'/gquota shows Google Gemini Code Assist quota usage with progress bars when that provider is active.',
'/voice tts toggles TTS-only mode — agent replies out loud but you still type your prompts.',
'/reload-skills re-scans ~/.hermes/skills/ so drop-in skills appear without restarting the session.',
'/indicator kaomoji|emoji|unicode|ascii picks the TUI busy-indicator style shown during agent runs.',
'/debug uploads a support bundle (system info + logs) and returns shareable links — works in chat too.',
# --- CLI Subcommands & Flags ---
'hermes -z "<prompt>" is the purest one-shot: final answer on stdout, nothing else — ideal for piping in scripts.',
'hermes chat --pass-session-id injects the session ID into the system prompt so the agent can self-reference it.',
'hermes chat --image path/to/pic.png attaches a local image to a single -q query without a separate upload step.',
'hermes chat --ignore-user-config skips ~/.hermes/config.yaml — reproducible bug reports and CI runs.',
"hermes chat --source tool tags programmatic chats so they don't clutter hermes sessions list.",
'hermes dump --show-keys includes redacted API key fingerprints for deeper support debugging.',
'hermes sessions rename <ID> "new title" renames any past session; hermes sessions delete <ID> removes one.',
'hermes import restores a session export or profile archive produced by sessions export or profile export.',
'hermes fallback manages the fallback_model chain interactively — no hand-editing config.yaml.',
'hermes pairing rotates the DM pairing token — the first messager after rotation claims access to the bot.',
'hermes setup walks first-time users through provider, keys, and platform wiring in one interactive flow.',
'hermes status --deep runs the full health sweep across every component; plain hermes status is the quick view.',
# --- Agent Behavior Env Vars ---
'HERMES_AGENT_TIMEOUT=0 disables the gateway inactivity kill for a running agent — use for long research runs.',
'HERMES_ENABLE_PROJECT_PLUGINS=1 auto-loads repo-local plugins from ./.hermes/plugins/ — trust-gated by design.',
"HERMES_DISABLE_FILE_STATE_GUARD=1 turns off the 'file changed since you read it' guard on patch and write_file.",
'HERMES_ALLOW_PRIVATE_URLS=true lets web tools hit localhost and private networks — off by default in gateway mode.',
'HERMES_OPTIONAL_SKILLS=name1,name2 auto-installs extra optional-catalog skills on first run per profile.',
'HERMES_BUNDLED_SKILLS points at a custom bundled-skill tree — used by Homebrew and Nix packaging.',
'HERMES_DUMP_REQUEST_STDOUT=1 dumps every API request payload to stdout instead of log files.',
'HERMES_OAUTH_TRACE=1 logs redacted OAuth token exchange and refresh attempts for debugging provider auth.',
'HERMES_STREAM_RETRIES (default 3) controls mid-stream reconnect attempts on transient network errors.',
# --- Gateway Behavior Env Vars ---
'HERMES_GATEWAY_BUSY_ACK_ENABLED=false silences the ⚡/⏳/⏩ ack messages when a user messages a busy agent.',
'HERMES_AGENT_NOTIFY_INTERVAL (default 180s) sets how often the gateway pings with progress on long turns.',
'HERMES_RESTART_DRAIN_TIMEOUT (default 900s) caps how long /restart waits for in-flight runs before forcing.',
'HERMES_CHECKPOINT_TIMEOUT (default 30s) caps filesystem checkpoint creation — raise it on huge monorepos.',
# --- Auxiliary Tasks & Image Generation ---
'image_gen.model in config.yaml picks the FAL model: flux-2/klein, gpt-image-2, nano-banana-pro, and more.',
'image_gen.provider routes image generation through a plugin (OpenAI Images, Codex, FAL) instead of the default.',
'AUXILIARY_VISION_BASE_URL + AUXILIARY_VISION_API_KEY point vision analysis at any OpenAI-compatible endpoint.',
'auxiliary.session_search.max_concurrency bounds how many matched sessions are summarized in parallel (default 3).',
'auxiliary.session_search.extra_body forwards provider-specific OpenAI-compatible fields on summarization calls.',
# --- Security ---
'security.tirith_fail_open: false makes Hermes block commands when the tirith scanner itself errors out.',
'TIRITH_FAIL_OPEN env var overrides the tirith_fail_open config — a quick toggle without editing config.yaml.',
# --- Sessions & Source Tags ---
'--source tool chats are excluded from hermes sessions list by default — set --source explicitly to see them.',
'Session IDs are timestamp-prefixed (20250305_091523_abcd) so sorting works naturally in ls and jq.',
# --- Misc ---
'API_SERVER_MODEL_NAME customizes the model name on /v1/models — essential for multi-profile Open WebUI setups.',
'Dashboard plugins are served from /dashboard-plugins/<name>/ — drop files into ~/.hermes/dashboard-plugins/.',
]
+10 -4
View File
@@ -1920,21 +1920,27 @@ def _reconfigure_provider(provider: dict, config: dict):
return
if provider.get("tts_provider"):
config.setdefault("tts", {})["provider"] = provider["tts_provider"]
tts_cfg = config.setdefault("tts", {})
tts_cfg["provider"] = provider["tts_provider"]
tts_cfg["use_gateway"] = bool(managed_feature)
_print_success(f" TTS provider set to: {provider['tts_provider']}")
if "browser_provider" in provider:
bp = provider["browser_provider"]
browser_cfg = config.setdefault("browser", {})
if bp == "local":
config.setdefault("browser", {})["cloud_provider"] = "local"
browser_cfg["cloud_provider"] = "local"
_print_success(" Browser set to local mode")
elif bp:
config.setdefault("browser", {})["cloud_provider"] = bp
browser_cfg["cloud_provider"] = bp
_print_success(f" Browser cloud provider set to: {bp}")
browser_cfg["use_gateway"] = bool(managed_feature)
# Set web search backend in config if applicable
if provider.get("web_backend"):
config.setdefault("web", {})["backend"] = provider["web_backend"]
web_cfg = config.setdefault("web", {})
web_cfg["backend"] = provider["web_backend"]
web_cfg["use_gateway"] = bool(managed_feature)
_print_success(f" Web backend set to: {provider['web_backend']}")
if managed_feature and managed_feature not in ("web", "tts", "browser"):
+186
View File
@@ -27,6 +27,192 @@ import sys
import threading
from typing import Any, Callable, Optional
# Modifier aliases mirrored from the TUI parser (``ui-tui/src/lib/platform.ts``)
# ``_MOD_ALIASES`` table — the contract that removes the cross-runtime
# mismatch Copilot flagged in round-9 on #19835.
#
# ``super``/``win``/``windows`` are intentionally absent: prompt_toolkit
# has no super/meta modifier for the Cmd key, so those spellings are
# TUI-only. The normalizer below returns the documented default
# (``c-b``) for them — a silent fallback was preferred to a hard
# startup crash (Copilot round-11). The CLI binding site
# (``_register_voice_handler`` in cli.py) logs a warning when that
# fallback fires so users see why their TUI-only shortcut isn't
# bound in the classic CLI.
_VOICE_MOD_ALIASES = {
"ctrl": "c-",
"control": "c-",
"alt": "a-",
"option": "a-",
"opt": "a-",
}
# Named keys prompt_toolkit accepts in ``c-<name>`` / ``a-<name>`` form.
# Aliases collapse to prompt_toolkit's canonical spelling so the same
# config value binds identically in both runtimes (Copilot round-10 on
# #19835).
_VOICE_NAMED_KEYS = {
"space": "space",
"spc": "space",
"enter": "enter",
"return": "enter",
"ret": "enter",
"tab": "tab",
"escape": "escape",
"esc": "escape",
"backspace": "backspace",
"bs": "backspace",
"delete": "delete",
"del": "delete",
}
# ``useInputHandlers()`` intercepts these before the voice check runs,
# so a binding like ``ctrl+c`` (interrupt), ``ctrl+d`` (quit), or
# ``ctrl+l`` (clear screen) would be advertised in /voice status but
# never fire push-to-talk — the same blocklist the TUI parser uses.
_VOICE_RESERVED_CTRL_CHARS = frozenset({"c", "d", "l"})
# On macOS the classic CLI's prompt_toolkit bindings for copy / exit /
# clear also claim ``a-c`` / ``a-d`` / ``a-l`` via the action-modifier
# lookup, and hermes-ink reports Alt as ``key.meta`` on many terminals.
# Mirror the TUI parser's darwin-only reservation so ``option+c`` etc.
# don't bind Alt+C in the CLI while the TUI silently falls back to
# Ctrl+B (Copilot round-14 on #19835).
_VOICE_RESERVED_ALT_CHARS_MAC = frozenset({"c", "d", "l"})
_DEFAULT_PT_KEY = "c-b"
def voice_record_key_from_config(cfg: Any) -> Any:
"""Shape-safe ``cfg.voice.record_key`` lookup.
``load_config()`` deep-merges raw YAML and preserves scalar
overrides, so a hand-edited ``voice: true`` / ``voice: cmd+b``
leaves ``cfg["voice"]`` as a bool/str instead of a dict, and the
naive ``.get("voice", {}).get("record_key")`` chain raises
AttributeError before voice can even start (Copilot round-11 on
#19835). Return ``None`` for malformed shapes so call sites can
feed the result straight into the normalizer/formatter and get
the documented default.
"""
if not isinstance(cfg, dict):
return None
voice = cfg.get("voice")
if not isinstance(voice, dict):
return None
return voice.get("record_key")
def normalize_voice_record_key_for_prompt_toolkit(raw: Any) -> str:
"""Coerce ``voice.record_key`` into prompt_toolkit's ``c-x`` / ``a-x`` format.
Mirrors the TUI parser contract (``ui-tui/src/lib/platform.ts``)
so one config value binds the same shortcut in both runtimes:
* non-string / empty / typo'd / bare-char / multi-modifier / reserved
``ctrl+c|d|l`` documented default ``c-b``
* single-char keys: ``ctrl+o`` ``c-o``
* named keys: ``ctrl+space`` ``c-space`` (aliases collapse:
``ctrl+return`` ``c-enter``)
* ``super`` / ``win`` / ``windows`` ``c-b`` (TUI-only modifiers
prompt_toolkit has no super mod; the CLI binding site is
expected to warn when this fallback fires so users see the
cross-runtime split, Copilot round-11 on #19835)
"""
if not isinstance(raw, str):
return _DEFAULT_PT_KEY
lowered = raw.strip().lower()
if not lowered:
return _DEFAULT_PT_KEY
parts = [p.strip() for p in lowered.split("+") if p.strip()]
if not parts:
return _DEFAULT_PT_KEY
# Multi-modifier chords like ``ctrl+alt+r`` bind different shortcuts
# in prompt_toolkit (a-c-r form) and hermes-ink rejects them; collapse
# to the documented default instead of silently diverging.
if len(parts) > 2:
return _DEFAULT_PT_KEY
# Bare char / bare named key (no explicit modifier) — the CLI's
# prompt_toolkit binds the raw key without a modifier, which the TUI
# parser refuses; reject here too so both runtimes agree.
if len(parts) == 1:
return _DEFAULT_PT_KEY
modifier_token, key_token = parts
# ``super`` / ``win`` / ``windows`` are TUI-only (prompt_toolkit has
# no super modifier, so ``@kb.add(super+b)`` crashes the CLI at
# startup). Fall back to the documented default here; the CLI
# binding site is expected to log a warning when the configured
# value is one of these spellings so users know the TUI+CLI
# runtimes diverge on that shortcut (Copilot round-11 on #19835).
if modifier_token in {"super", "win", "windows"}:
return _DEFAULT_PT_KEY
normalized_mod = _VOICE_MOD_ALIASES.get(modifier_token)
if not normalized_mod:
return _DEFAULT_PT_KEY
# Single-char key: reject reserved-ctrl chords that the TUI would
# also block at parse time, plus the mac-only alt reservation.
if len(key_token) == 1:
if normalized_mod == "c-" and key_token in _VOICE_RESERVED_CTRL_CHARS:
return _DEFAULT_PT_KEY
if (
normalized_mod == "a-"
and sys.platform == "darwin"
and key_token in _VOICE_RESERVED_ALT_CHARS_MAC
):
return _DEFAULT_PT_KEY
return f"{normalized_mod}{key_token}"
# Multi-char key token must be a known named key; typos like
# ``ctrl+spcae`` fall back to the default rather than being passed
# through as ``c-spcae`` (which prompt_toolkit would reject).
named = _VOICE_NAMED_KEYS.get(key_token)
if not named:
return _DEFAULT_PT_KEY
return f"{normalized_mod}{named}"
def format_voice_record_key_for_status(raw: Any) -> str:
"""Render ``voice.record_key`` for ``/voice status`` in CLI-friendly form.
Mirrors the TUI's ``formatVoiceRecordKey``: returns ``Ctrl+B`` /
``Alt+Space`` / ``Ctrl+Enter``. Malformed configs surface as the
documented default so status never advertises a shortcut that
won't bind (Copilot round-10 on #19835).
"""
normalized = normalize_voice_record_key_for_prompt_toolkit(raw)
if normalized.startswith("c-"):
prefix, key = "Ctrl+", normalized[2:]
elif normalized.startswith("a-"):
prefix, key = "Alt+", normalized[2:]
elif "+" in normalized:
# ``super+<key>`` / ``win+<key>`` — CLI won't bind them, but
# render in title case so status output is still readable.
mod, key = normalized.split("+", 1)
prefix = mod[0].upper() + mod[1:] + "+"
else:
return "Ctrl+B"
if not key:
return prefix.rstrip("+")
if len(key) == 1:
return prefix + key.upper()
return prefix + key[0].upper() + key[1:]
from tools.voice_mode import (
create_audio_recorder,
is_whisper_hallucination,
+421
View File
@@ -718,6 +718,45 @@ class SessionDB:
self._remove_session_files(sessions_dir, sid)
return len(removed_ids)
def finalize_orphaned_compression_sessions(self) -> int:
"""Mark orphaned compression continuation sessions as ended.
Targets child sessions that were never finalized: parent is ended
with reason='compression', child has messages but no end_reason/ended_at
and api_call_count=0. Non-destructive: preserves all messages and sets
end_reason='orphaned_compression'. Fix for #20001.
"""
cutoff = time.time() - 604800 # 7 days
def _do(conn):
now = time.time()
result = conn.execute(
"""
UPDATE sessions
SET ended_at = ?,
end_reason = 'orphaned_compression'
WHERE api_call_count = 0
AND end_reason IS NULL
AND ended_at IS NULL
AND started_at < ?
AND parent_session_id IS NOT NULL
AND EXISTS (
SELECT 1 FROM sessions p
WHERE p.id = sessions.parent_session_id
AND p.end_reason = 'compression'
AND p.ended_at IS NOT NULL
)
AND EXISTS (
SELECT 1 FROM messages m
WHERE m.session_id = sessions.id
)
""",
(now, cutoff),
)
return result.rowcount
return self._execute_write(_do) or 0
def get_session(self, session_id: str) -> Optional[Dict[str, Any]]:
"""Get a session by ID."""
with self._lock:
@@ -2148,6 +2187,388 @@ class SessionDB:
)
self._execute_write(_do)
def apply_telegram_topic_migration(self) -> None:
"""Create Telegram DM topic-mode tables on explicit /topic opt-in.
This migration is deliberately not part of automatic SessionDB startup
reconciliation. Operators must be able to upgrade Hermes, keep the old
Telegram bot behavior running, and only mutate topic-mode state when the
user executes /topic to opt into the feature.
Schema versions:
v1 initial shape (no ON DELETE CASCADE on session_id FK)
v2 session_id FK gets ON DELETE CASCADE so session pruning
automatically clears bindings.
"""
def _do(conn):
conn.executescript(
"""
CREATE TABLE IF NOT EXISTS telegram_dm_topic_mode (
chat_id TEXT PRIMARY KEY,
user_id TEXT NOT NULL,
enabled INTEGER NOT NULL DEFAULT 1,
activated_at REAL NOT NULL,
updated_at REAL NOT NULL,
has_topics_enabled INTEGER,
allows_users_to_create_topics INTEGER,
capability_checked_at REAL,
intro_message_id TEXT,
pinned_message_id TEXT
);
CREATE TABLE IF NOT EXISTS telegram_dm_topic_bindings (
chat_id TEXT NOT NULL,
thread_id TEXT NOT NULL,
user_id TEXT NOT NULL,
session_key TEXT NOT NULL,
session_id TEXT NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
managed_mode TEXT NOT NULL DEFAULT 'auto',
linked_at REAL NOT NULL,
updated_at REAL NOT NULL,
PRIMARY KEY (chat_id, thread_id)
);
CREATE UNIQUE INDEX IF NOT EXISTS idx_telegram_dm_topic_bindings_session
ON telegram_dm_topic_bindings(session_id);
CREATE INDEX IF NOT EXISTS idx_telegram_dm_topic_bindings_user
ON telegram_dm_topic_bindings(user_id, chat_id);
"""
)
# v1 → v2: rebuild telegram_dm_topic_bindings if its session_id FK
# lacks ON DELETE CASCADE. SQLite can't ALTER a foreign key, so we
# rebuild the table. Only runs once per DB (version gate).
current = conn.execute(
"SELECT value FROM state_meta WHERE key = ?",
("telegram_dm_topic_schema_version",),
).fetchone()
current_version = int(current[0]) if current and str(current[0]).isdigit() else 0
if current_version < 2:
fk_rows = conn.execute(
"PRAGMA foreign_key_list('telegram_dm_topic_bindings')"
).fetchall()
needs_rebuild = any(
row[2] == "sessions" and (row[6] or "") != "CASCADE"
for row in fk_rows
)
if needs_rebuild:
conn.executescript(
"""
CREATE TABLE telegram_dm_topic_bindings_new (
chat_id TEXT NOT NULL,
thread_id TEXT NOT NULL,
user_id TEXT NOT NULL,
session_key TEXT NOT NULL,
session_id TEXT NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
managed_mode TEXT NOT NULL DEFAULT 'auto',
linked_at REAL NOT NULL,
updated_at REAL NOT NULL,
PRIMARY KEY (chat_id, thread_id)
);
INSERT INTO telegram_dm_topic_bindings_new
SELECT chat_id, thread_id, user_id, session_key,
session_id, managed_mode, linked_at, updated_at
FROM telegram_dm_topic_bindings;
DROP TABLE telegram_dm_topic_bindings;
ALTER TABLE telegram_dm_topic_bindings_new
RENAME TO telegram_dm_topic_bindings;
CREATE UNIQUE INDEX idx_telegram_dm_topic_bindings_session
ON telegram_dm_topic_bindings(session_id);
CREATE INDEX idx_telegram_dm_topic_bindings_user
ON telegram_dm_topic_bindings(user_id, chat_id);
"""
)
conn.execute(
"INSERT INTO state_meta (key, value) VALUES (?, ?) "
"ON CONFLICT(key) DO UPDATE SET value = excluded.value",
("telegram_dm_topic_schema_version", "2"),
)
self._execute_write(_do)
def enable_telegram_topic_mode(
self,
*,
chat_id: str,
user_id: str,
has_topics_enabled: Optional[bool] = None,
allows_users_to_create_topics: Optional[bool] = None,
) -> None:
"""Enable Telegram DM topic mode for one private chat/user.
This method intentionally owns the explicit topic migration. Ordinary
SessionDB startup must not create these side tables.
"""
self.apply_telegram_topic_migration()
now = time.time()
def _to_int(value: Optional[bool]) -> Optional[int]:
if value is None:
return None
return 1 if value else 0
def _do(conn):
conn.execute(
"""
INSERT INTO telegram_dm_topic_mode (
chat_id, user_id, enabled, activated_at, updated_at,
has_topics_enabled, allows_users_to_create_topics,
capability_checked_at
) VALUES (?, ?, 1, ?, ?, ?, ?, ?)
ON CONFLICT(chat_id) DO UPDATE SET
user_id = excluded.user_id,
enabled = 1,
updated_at = excluded.updated_at,
has_topics_enabled = excluded.has_topics_enabled,
allows_users_to_create_topics = excluded.allows_users_to_create_topics,
capability_checked_at = excluded.capability_checked_at
""",
(
str(chat_id),
str(user_id),
now,
now,
_to_int(has_topics_enabled),
_to_int(allows_users_to_create_topics),
now,
),
)
self._execute_write(_do)
def disable_telegram_topic_mode(
self,
*,
chat_id: str,
clear_bindings: bool = True,
) -> None:
"""Disable Telegram DM topic mode for one private chat.
When ``clear_bindings`` is True (default) the (chat_id, thread_id)
bindings for this chat are also cleared so re-enabling later
starts from a clean slate. Set to False if the operator wants to
preserve bindings for a later re-enable.
Never creates the topic-mode tables from scratch; if they don't
exist there is nothing to disable and the call is a no-op.
"""
def _do(conn):
try:
conn.execute(
"UPDATE telegram_dm_topic_mode SET enabled = 0, updated_at = ? "
"WHERE chat_id = ?",
(time.time(), str(chat_id)),
)
if clear_bindings:
conn.execute(
"DELETE FROM telegram_dm_topic_bindings WHERE chat_id = ?",
(str(chat_id),),
)
except sqlite3.OperationalError:
# Tables don't exist yet — nothing to disable.
return
self._execute_write(_do)
def is_telegram_topic_mode_enabled(self, *, chat_id: str, user_id: str) -> bool:
"""Return whether Telegram DM topic mode is enabled for this chat/user."""
with self._lock:
try:
row = self._conn.execute(
"""
SELECT enabled FROM telegram_dm_topic_mode
WHERE chat_id = ? AND user_id = ?
""",
(str(chat_id), str(user_id)),
).fetchone()
except sqlite3.OperationalError:
return False
if row is None:
return False
enabled = row["enabled"] if isinstance(row, sqlite3.Row) else row[0]
return bool(enabled)
def get_telegram_topic_binding(
self,
*,
chat_id: str,
thread_id: str,
) -> Optional[Dict[str, Any]]:
"""Return the session binding for a Telegram DM topic, if present."""
with self._lock:
try:
row = self._conn.execute(
"""
SELECT * FROM telegram_dm_topic_bindings
WHERE chat_id = ? AND thread_id = ?
""",
(str(chat_id), str(thread_id)),
).fetchone()
except sqlite3.OperationalError:
return None
return dict(row) if row else None
def bind_telegram_topic(
self,
*,
chat_id: str,
thread_id: str,
user_id: str,
session_key: str,
session_id: str,
managed_mode: str = "auto",
) -> None:
"""Bind one Telegram DM topic thread to one Hermes session.
A Hermes session may only be linked to one Telegram topic in MVP.
Rebinding the same topic to the same session is idempotent; trying to
link the same session to a different topic raises ValueError.
"""
self.apply_telegram_topic_migration()
now = time.time()
chat_id = str(chat_id)
thread_id = str(thread_id)
user_id = str(user_id)
session_key = str(session_key)
session_id = str(session_id)
def _do(conn):
existing_session = conn.execute(
"""
SELECT chat_id, thread_id FROM telegram_dm_topic_bindings
WHERE session_id = ?
""",
(session_id,),
).fetchone()
if existing_session is not None:
linked_chat = existing_session["chat_id"] if isinstance(existing_session, sqlite3.Row) else existing_session[0]
linked_thread = existing_session["thread_id"] if isinstance(existing_session, sqlite3.Row) else existing_session[1]
if str(linked_chat) != chat_id or str(linked_thread) != thread_id:
raise ValueError("session is already linked to another Telegram topic")
conn.execute(
"""
INSERT INTO telegram_dm_topic_bindings (
chat_id, thread_id, user_id, session_key, session_id,
managed_mode, linked_at, updated_at
) VALUES (?, ?, ?, ?, ?, ?, ?, ?)
ON CONFLICT(chat_id, thread_id) DO UPDATE SET
user_id = excluded.user_id,
session_key = excluded.session_key,
session_id = excluded.session_id,
managed_mode = excluded.managed_mode,
updated_at = excluded.updated_at
""",
(
chat_id,
thread_id,
user_id,
session_key,
session_id,
managed_mode,
now,
now,
),
)
self._execute_write(_do)
def is_telegram_session_linked_to_topic(self, *, session_id: str) -> bool:
"""Return True if a Hermes session is already bound to any Telegram DM topic.
Read-only: does NOT trigger the telegram-topic migration. If the
topic-mode tables have not been created yet (i.e. nobody has run
``/topic`` in this profile), the session is by definition unbound
and we return False.
"""
with self._lock:
try:
row = self._conn.execute(
"""
SELECT 1 FROM telegram_dm_topic_bindings
WHERE session_id = ?
LIMIT 1
""",
(str(session_id),),
).fetchone()
except sqlite3.OperationalError:
return False
return row is not None
def list_unlinked_telegram_sessions_for_user(
self,
*,
chat_id: str,
user_id: str,
limit: int = 10,
) -> List[Dict[str, Any]]:
"""List previous Telegram sessions for this user that are not bound to a topic.
Read-only: does NOT trigger the telegram-topic migration. If the
topic-mode tables are absent, fall back to a simpler query that
just returns this user's Telegram sessions — there can't be any
bindings yet.
"""
with self._lock:
try:
rows = self._conn.execute(
"""
SELECT s.*,
COALESCE(
(SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS _preview_raw,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active
FROM sessions s
WHERE s.source = 'telegram'
AND s.user_id = ?
AND NOT EXISTS (
SELECT 1 FROM telegram_dm_topic_bindings b
WHERE b.session_id = s.id
)
ORDER BY last_active DESC, s.started_at DESC
LIMIT ?
""",
(str(user_id), int(limit)),
).fetchall()
except sqlite3.OperationalError:
# telegram_dm_topic_bindings doesn't exist yet — no bindings
# means every telegram session for this user is "unlinked".
rows = self._conn.execute(
"""
SELECT s.*,
COALESCE(
(SELECT SUBSTR(REPLACE(REPLACE(m.content, X'0A', ' '), X'0D', ' '), 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS _preview_raw,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active
FROM sessions s
WHERE s.source = 'telegram'
AND s.user_id = ?
ORDER BY last_active DESC, s.started_at DESC
LIMIT ?
""",
(str(user_id), int(limit)),
).fetchall()
sessions: List[Dict[str, Any]] = []
for row in rows:
session = dict(row)
raw = str(session.pop("_preview_raw", "") or "").strip()
session["preview"] = raw[:60] + ("..." if len(raw) > 60 else "") if raw else ""
sessions.append(session)
return sessions
# ── Space reclamation ──
def vacuum(self) -> None:
+24
View File
@@ -0,0 +1,24 @@
# Hermes-Katalog für statische Meldungen -- Deutsch
# See locales/en.yaml for the source of truth; keep keys in sync.
approval:
dangerous_header: "⚠️ GEFÄHRLICHER BEFEHL: {description}"
choose_long: " [o]einmal | [s]sitzung | [a]immer | [d]ablehnen"
choose_short: " [o]einmal | [s]sitzung | [d]ablehnen"
prompt_long: " Auswahl [o/s/a/D]: "
prompt_short: " Auswahl [o/s/D]: "
timeout: " ⏱ Zeitüberschreitung Befehl wird abgelehnt"
allowed_once: " ✓ Einmalig erlaubt"
allowed_session: " ✓ Für diese Sitzung erlaubt"
allowed_always: " ✓ Zur dauerhaften Erlaubnisliste hinzugefügt"
denied: " ✗ Abgelehnt"
cancelled: " ✗ Abgebrochen"
blocklist_message: "Dieser Befehl steht auf der unbedingten Sperrliste und kann nicht genehmigt werden."
gateway:
approval_expired: "⚠️ Genehmigung abgelaufen (Agent wartet nicht mehr). Bitten Sie den Agenten, es erneut zu versuchen."
draining: "⏳ Warte auf {count} aktive(n) Agent(en) vor dem Neustart..."
goal_cleared: "✓ Ziel gelöscht."
no_active_goal: "Kein aktives Ziel."
config_read_failed: "⚠️ config.yaml konnte nicht gelesen werden: {error}"
config_save_failed: "⚠️ Konfiguration konnte nicht gespeichert werden: {error}"
+35
View File
@@ -0,0 +1,35 @@
# Hermes static-message catalog -- English (baseline / source of truth)
#
# Only user-facing static messages from the CLI approval prompt and a handful
# of gateway slash-command replies live here. Agent-generated output, log
# lines, error tracebacks, tool outputs, and slash-command descriptions stay
# in English and are NOT translated -- see agent/i18n.py for scope rationale.
#
# Keys are dotted paths; nesting below is purely for readability. Values may
# contain {placeholder} tokens for str.format substitution. When adding a
# new key, add it to EVERY locale file (en/zh/ja/de/es) in the same commit --
# tests/agent/test_i18n.py asserts catalog parity.
approval:
# CLI approval prompt -- shown when a dangerous command needs user review.
dangerous_header: "⚠️ DANGEROUS COMMAND: {description}"
choose_long: " [o]nce | [s]ession | [a]lways | [d]eny"
choose_short: " [o]nce | [s]ession | [d]eny"
prompt_long: " Choice [o/s/a/D]: "
prompt_short: " Choice [o/s/D]: "
timeout: " ⏱ Timeout - denying command"
allowed_once: " ✓ Allowed once"
allowed_session: " ✓ Allowed for this session"
allowed_always: " ✓ Added to permanent allowlist"
denied: " ✗ Denied"
cancelled: " ✗ Cancelled"
blocklist_message: "This command is on the unconditional blocklist and cannot be approved."
gateway:
# Messenger replies to slash commands and implicit state changes.
approval_expired: "⚠️ Approval expired (agent is no longer waiting). Ask the agent to try again."
draining: "⏳ Draining {count} active agent(s) before restart..."
goal_cleared: "✓ Goal cleared."
no_active_goal: "No active goal."
config_read_failed: "⚠️ Could not read config.yaml: {error}"
config_save_failed: "⚠️ Could not save config: {error}"
+24
View File
@@ -0,0 +1,24 @@
# Catálogo de mensajes estáticos de Hermes -- Español
# See locales/en.yaml for the source of truth; keep keys in sync.
approval:
dangerous_header: "⚠️ COMANDO PELIGROSO: {description}"
choose_long: " [o]una vez | [s]sesión | [a]siempre | [d]denegar"
choose_short: " [o]una vez | [s]sesión | [d]denegar"
prompt_long: " Opción [o/s/a/D]: "
prompt_short: " Opción [o/s/D]: "
timeout: " ⏱ Tiempo agotado — comando denegado"
allowed_once: " ✓ Permitido una vez"
allowed_session: " ✓ Permitido en esta sesión"
allowed_always: " ✓ Añadido a la lista de permitidos permanente"
denied: " ✗ Denegado"
cancelled: " ✗ Cancelado"
blocklist_message: "Este comando está en la lista de bloqueo incondicional y no se puede aprobar."
gateway:
approval_expired: "⚠️ La aprobación ha caducado (el agente ya no está esperando). Pida al agente que lo intente de nuevo."
draining: "⏳ Esperando a que terminen {count} agente(s) activo(s) antes de reiniciar..."
goal_cleared: "✓ Objetivo eliminado."
no_active_goal: "No hay objetivo activo."
config_read_failed: "⚠️ No se pudo leer config.yaml: {error}"
config_save_failed: "⚠️ No se pudo guardar la configuración: {error}"
+24
View File
@@ -0,0 +1,24 @@
# Hermes 静的メッセージカタログ -- 日本語
# See locales/en.yaml for the source of truth; keep keys in sync.
approval:
dangerous_header: "⚠️ 危険なコマンド: {description}"
choose_long: " [o]今回のみ | [s]セッション中 | [a]常に許可 | [d]拒否"
choose_short: " [o]今回のみ | [s]セッション中 | [d]拒否"
prompt_long: " 選択 [o/s/a/D]: "
prompt_short: " 選択 [o/s/D]: "
timeout: " ⏱ タイムアウト — コマンドを拒否しました"
allowed_once: " ✓ 今回のみ許可"
allowed_session: " ✓ このセッション中は許可"
allowed_always: " ✓ 永続的な許可リストに追加"
denied: " ✗ 拒否しました"
cancelled: " ✗ キャンセルしました"
blocklist_message: "このコマンドは無条件ブロックリストに含まれており、承認できません。"
gateway:
approval_expired: "⚠️ 承認の有効期限が切れました(エージェントはもう待機していません)。エージェントに再試行を依頼してください。"
draining: "⏳ 再起動前に {count} 個のアクティブエージェントの終了を待っています..."
goal_cleared: "✓ 目標をクリアしました。"
no_active_goal: "アクティブな目標はありません。"
config_read_failed: "⚠️ config.yaml を読み込めませんでした: {error}"
config_save_failed: "⚠️ 設定を保存できませんでした: {error}"
+24
View File
@@ -0,0 +1,24 @@
# Hermes 静态消息目录 -- 中文(简体)
# See locales/en.yaml for the source of truth; keep keys in sync.
approval:
dangerous_header: "⚠️ 危险命令: {description}"
choose_long: " [o]仅此一次 | [s]本次会话 | [a]永久允许 | [d]拒绝"
choose_short: " [o]仅此一次 | [s]本次会话 | [d]拒绝"
prompt_long: " 选择 [o/s/a/D]: "
prompt_short: " 选择 [o/s/D]: "
timeout: " ⏱ 超时 — 已拒绝命令"
allowed_once: " ✓ 本次允许"
allowed_session: " ✓ 本次会话内允许"
allowed_always: " ✓ 已加入永久允许列表"
denied: " ✗ 已拒绝"
cancelled: " ✗ 已取消"
blocklist_message: "此命令位于无条件拦截列表中,无法被批准。"
gateway:
approval_expired: "⚠️ 批准已过期(代理不再等待)。请让代理重试。"
draining: "⏳ 正在等待 {count} 个活跃代理结束后重启..."
goal_cleared: "✓ 目标已清除。"
no_active_goal: "当前没有活跃的目标。"
config_read_failed: "⚠️ 无法读取 config.yaml{error}"
config_save_failed: "⚠️ 无法保存配置:{error}"
+38 -3
View File
@@ -511,6 +511,12 @@ def coerce_tool_args(tool_name: str, args: Dict[str, Any]) -> Dict[str, Any]:
Handles ``"type": "integer"``, ``"type": "number"``, ``"type": "boolean"``,
and union types (``"type": ["integer", "string"]``).
Also wraps bare scalar values in a single-element list when the schema
declares ``"type": "array"``. Open-weight models (DeepSeek, Qwen, GLM)
sometimes emit ``{"urls": "https://a.com"}`` when the tool expects
``{"urls": ["https://a.com"]}``; wrapping here avoids a confusing tool
failure on what is otherwise a well-formed call.
"""
if not args or not isinstance(args, dict):
return args
@@ -523,13 +529,42 @@ def coerce_tool_args(tool_name: str, args: Dict[str, Any]) -> Dict[str, Any]:
if not properties:
return args
for key, value in args.items():
if not isinstance(value, str):
continue
for key, value in list(args.items()):
prop_schema = properties.get(key)
if not prop_schema:
continue
expected = prop_schema.get("type")
# Wrap bare non-list values when the schema declares ``array``.
# Strings still go through _coerce_value first so JSON-encoded
# arrays (``'["a","b"]'``) get parsed and nullable ``"null"``
# becomes ``None`` rather than ``["null"]``.
# ``None`` itself is preserved — we don't know whether the model
# meant "omit" or "empty list", and tools with sensible defaults
# (e.g. read_file's normalize_read_pagination) already handle it.
if expected == "array" and value is not None and not isinstance(value, (list, tuple)):
if isinstance(value, str):
coerced = _coerce_value(value, expected, schema=prop_schema)
if coerced is not value:
# _coerce_value handled it (JSON-parsed list or
# nullable "null" → None).
args[key] = coerced
continue
args[key] = [value]
logger.info(
"coerce_tool_args: wrapped bare string in list for %s.%s",
tool_name, key,
)
continue
args[key] = [value]
logger.info(
"coerce_tool_args: wrapped bare %s in list for %s.%s",
type(value).__name__, tool_name, key,
)
continue
if not isinstance(value, str):
continue
if not expected and not _schema_allows_null(prop_schema):
continue
coerced = _coerce_value(value, expected, schema=prop_schema)
+31 -24
View File
@@ -163,35 +163,42 @@
for entry in "''${ENTRIES[@]}"; do
IFS=":" read -r ATTR FOLDER NIX_FILE <<< "$entry"
echo "==> .#$ATTR ($FOLDER -> $NIX_FILE)"
OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --print-build-logs 2>&1)
STATUS=$?
if [ "$STATUS" -eq 0 ]; then
# Compute the actual hash from the lockfile directly using
# prefetch-npm-deps. This avoids false "ok" from nix build when
# an old derivation is cached in a substituter (cachix/cache.nixos.org).
LOCK_FILE="$FOLDER/package-lock.json"
NEW_HASH=$(${pkgs.lib.getExe pkgs.prefetch-npm-deps} "$LOCK_FILE" 2>/dev/null)
if [ -z "$NEW_HASH" ]; then
echo " prefetch-npm-deps failed, falling back to nix build" >&2
OUTPUT=$(nix build ".#$ATTR.npmDeps" --no-link --print-build-logs 2>&1)
STATUS=$?
if [ "$STATUS" -eq 0 ]; then
echo " ok (via nix build)"
continue
fi
NEW_HASH=$(echo "$OUTPUT" | awk '/got:/ {print $2; exit}')
if [ -z "$NEW_HASH" ]; then
if echo "$OUTPUT" | grep -qE "throttled|HTTP error 418|substituter .* is disabled|some outputs of .* are not valid"; then
echo " skipped (transient cache failure see primary nix build for real status)" >&2
echo "$OUTPUT" | tail -8 >&2
continue
fi
echo " build failed with no hash mismatch:" >&2
echo "$OUTPUT" | tail -40 >&2
exit 1
fi
fi
OLD_HASH=$(grep -oE 'hash = "sha256-[^"]+"' "$NIX_FILE" | head -1 \
| sed -E 's/hash = "(.*)"/\1/')
if [ "$NEW_HASH" = "$OLD_HASH" ]; then
echo " ok"
continue
fi
NEW_HASH=$(echo "$OUTPUT" | awk '/got:/ {print $2; exit}')
if [ -z "$NEW_HASH" ]; then
# Magic-Nix-Cache occasionally returns HTTP 418 / cache-throttled
# mid-run; nix then prints "outputs … not valid, so checking is
# not possible" without a `got:` line. That's an infrastructure
# blip, not a stale lockfile — warn + skip rather than failing
# the lint. A real hash mismatch would still surface in the
# primary `.#$ATTR` build, which is a separate CI job.
if echo "$OUTPUT" | grep -qE "throttled|HTTP error 418|substituter .* is disabled|some outputs of .* are not valid"; then
echo " skipped (transient cache failure see primary nix build for real status)" >&2
echo "$OUTPUT" | tail -8 >&2
continue
fi
echo " build failed with no hash mismatch:" >&2
echo "$OUTPUT" | tail -40 >&2
exit 1
fi
HASH_LINE=$(grep -n 'hash = "sha256-' "$NIX_FILE" | head -1 | cut -d: -f1)
OLD_HASH=$(grep -oE 'hash = "sha256-[^"]+"' "$NIX_FILE" | head -1 \
| sed -E 's/hash = "(.*)"/\1/')
LOCK_FILE="$FOLDER/package-lock.json"
echo " stale: $NIX_FILE:$HASH_LINE $OLD_HASH -> $NEW_HASH"
STALE=1
+1 -1
View File
@@ -4,7 +4,7 @@ let
src = ../ui-tui;
npmDeps = pkgs.fetchNpmDeps {
inherit src;
hash = "sha256-a/HGI9OgVcTnZrMXA7xFMGnFoVxyHe95fulVz+WNYB0=";
hash = "sha256-MLcLhjTF6dgdvNBtJWzo8Nh19eNh/ZitD2b07nm61Tc=";
};
npm = hermesNpmLib.mkNpmPassthru { folder = "ui-tui"; attr = "tui"; pname = "hermes-tui"; };
@@ -0,0 +1,190 @@
---
name: hyperframes
description: Create HTML-based video compositions, animated title cards, social overlays, captioned talking-head videos, audio-reactive visuals, and shader transitions using HyperFrames. HTML is the source of truth for video. Use when the user wants a rendered MP4/WebM from an HTML composition, wants to animate text/logos/charts over media, needs captions synced to audio, wants TTS narration, or wants to convert a website into a video.
version: 1.0.0
author: heygen-com
license: Apache-2.0
prerequisites:
commands: [node, ffmpeg, npx]
metadata:
hermes:
tags: [creative, video, animation, html, gsap, motion-graphics]
related_skills: [manim-video, meme-generation]
category: creative
requires_toolsets: [terminal]
---
# HyperFrames
HTML is the source of truth for video. A composition is an HTML file with `data-*` attributes for timing, a GSAP timeline for animation, and CSS for appearance. The HyperFrames engine captures the page frame-by-frame and encodes to MP4/WebM with FFmpeg.
**Complement to `manim-video`:** Use `manim-video` for mathematical/geometric explainers (equations, 3B1B-style). Use `hyperframes` for motion-graphics, talking-head with captions, product tours, social overlays, shader transitions, and anything driven by real video/audio media.
## When to Use
- User asks for a rendered video from text, a script, or a website
- Animated title cards, lower thirds, or typographic intros
- Captioned narration video (TTS + captions synced to waveform)
- Audio-reactive visuals (beat sync, spectrum bars, pulsing glow)
- Scene-to-scene transitions (crossfade, wipe, shader warp, flash-through-white)
- Social overlays (Instagram/TikTok/YouTube style)
- Website-to-video pipeline (capture a URL, produce a promo)
- Any HTML/CSS/JS animation that must render deterministically to a video file
Do **not** use this skill for:
- Pure math/equation animation (→ `manim-video`)
- Image generation or memes (→ `meme-generation`, image models)
- Live video conferencing or streaming
## Quick Reference
```bash
npx hyperframes init my-video # scaffold a project
cd my-video
npx hyperframes lint # validate before preview/render
npx hyperframes preview # live-reload browser preview (port 3002)
npx hyperframes render --output final.mp4 # render to MP4
npx hyperframes doctor # diagnose environment issues
```
Render flags: `--quality draft|standard|high` · `--fps 24|30|60` · `--format mp4|webm` · `--docker` (reproducible) · `--strict`.
Full CLI reference: [references/cli.md](references/cli.md).
## Setup (one-time)
```bash
bash "$(dirname "$(find ~/.hermes/skills -path '*/hyperframes/SKILL.md' 2>/dev/null | head -1)")/scripts/setup.sh"
```
The script:
1. Verifies Node.js >= 22 and FFmpeg are installed (prints fix instructions if not).
2. Installs the `hyperframes` CLI globally (`npm install -g hyperframes@>=0.4.2`).
3. Pre-caches `chrome-headless-shell` via Puppeteer — **required** for best-quality rendering via Chrome's `HeadlessExperimental.beginFrame` capture path.
4. Runs `npx hyperframes doctor` and reports the result.
See [references/troubleshooting.md](references/troubleshooting.md) if setup fails.
## Procedure
### 1. Plan before writing HTML
Before touching code, articulate at a high level:
- **What** — narrative arc, key moments, emotional beats
- **Structure** — compositions, tracks (video/audio/overlays), durations
- **Visual identity** — colors, fonts, motion character (explosive / cinematic / fluid / technical)
- **Hero frame** — for each scene, the moment when the most elements are simultaneously visible. This is the static layout you'll build first.
**Visual Identity Gate (HARD-GATE).** Before writing ANY composition HTML, a visual identity must be defined. Do NOT write compositions with default or generic colors (`#333`, `#3b82f6`, `Roboto` are tells that this step was skipped). Check in order:
1. **`DESIGN.md` at project root?** → Use its exact colors, fonts, motion rules, and "What NOT to Do" constraints.
2. **User named a style** (e.g. "Swiss Pulse", "dark and techy", "luxury brand")? → Generate a minimal `DESIGN.md` with `## Style Prompt`, `## Colors` (3-5 hex with roles), `## Typography` (1-2 families), `## What NOT to Do` (3-5 anti-patterns).
3. **None of the above?** → Ask 3 questions before writing any HTML:
- Mood? (explosive / cinematic / fluid / technical / chaotic / warm)
- Light or dark canvas?
- Any brand colors, fonts, or visual references?
Then generate a `DESIGN.md` from the answers. Every composition must trace its palette and typography back to `DESIGN.md` or explicit user direction.
### 2. Scaffold
```bash
npx hyperframes init my-video --non-interactive
```
Templates: `blank`, `warm-grain`, `play-mode`, `swiss-grid`, `vignelli`, `decision-tree`, `kinetic-type`, `product-promo`, `nyt-graph`. Pass `--example <name>` to pick one, `--video clip.mp4` or `--audio track.mp3` to seed with media.
### 3. Layout before animation
Write the static HTML+CSS for the **hero frame first** — no GSAP yet. The `.scene-content` container must fill the scene (`width:100%; height:100%; padding:Npx`) with `display:flex` + `gap`. Use padding to push content inward — never `position: absolute; top: Npx` on a content container (content overflows when taller than the remaining space).
Only after the hero frame looks right, add `gsap.from()` entrances (animate **to** the CSS position) and `gsap.to()` exits (animate **from** it).
See [references/composition.md](references/composition.md) for the full data-attribute schema and composition rules.
### 4. Animate with GSAP
Every composition must:
- Register its timeline: `window.__timelines["<composition-id>"] = tl`
- Start paused: `gsap.timeline({ paused: true })` — the player controls playback
- Use finite `repeat` values (no `repeat: -1` — breaks the capture engine). Calculate: `repeat: Math.ceil(duration / cycleDuration) - 1`.
- Be deterministic — no `Math.random()`, `Date.now()`, or wall-clock logic. Use a seeded PRNG if you need pseudo-randomness.
- Build synchronously — no `async`/`await`, `setTimeout`, or Promises around timeline construction.
See [references/gsap.md](references/gsap.md) for the core GSAP API (tweens, eases, stagger, timelines).
### 5. Transitions between scenes
Multi-scene compositions require transitions. Rules:
1. **Always use a transition between scenes** — no jump cuts.
2. **Always use entrance animations** on every scene element (`gsap.from(...)`).
3. **Never use exit animations** except on the final scene — the transition IS the exit.
4. The final scene may fade out.
Use `npx hyperframes add <transition-name>` to install shader transitions (`flash-through-white`, `liquid-wipe`, etc.). Full list: `npx hyperframes add --list`.
### 6. Audio, captions, TTS, audio-reactive, highlighting
- **Audio:** always a separate `<audio>` element (video is `muted playsinline`).
- **TTS:** `npx hyperframes tts "Script text" --voice af_nova --output narration.wav`. List voices with `--list`. Voice ID first letter encodes language (`a`/`b`=English, `e`=Spanish, `f`=French, `j`=Japanese, `z`=Mandarin, etc.) — the CLI auto-infers the phonemizer locale; pass `--lang` only to override. Non-English phonemization requires `espeak-ng` installed system-wide.
- **Captions:** `npx hyperframes transcribe narration.wav` → word-level transcript. Pick style from the transcript tone (hype / corporate / tutorial / storytelling / social — see the table in `references/features.md`). **Language rule:** never use `.en` whisper models unless the audio is confirmed English — `.en` translates non-English audio instead of transcribing it. Every caption group MUST have a hard `tl.set(el, { opacity: 0, visibility: "hidden" }, group.end)` kill after its exit tween — otherwise groups leak visible into later ones.
- **Audio-reactive visuals:** pre-extract audio bands (bass / mid / treble) and sample per-frame inside the timeline with a `for` loop of `tl.call(draw, [], f / fps)` — a single long tween does NOT react to audio. Map bass → `scale` (pulse), treble → `textShadow`/`boxShadow` (glow), overall amplitude → `opacity`/`y`/`backgroundColor`. Avoid equalizer-bar clichés — let content guide the visual, audio drive its behavior.
- **Marker-style highlighting:** highlight, circle, burst, scribble, sketchout effects for text emphasis are deterministic CSS+GSAP — see `references/features.md#marker-highlighting`. Fully seekable, no animated SVG filters.
- **Scene transitions:** every multi-scene composition MUST use transitions (no jump cuts). Pick from CSS primitives (push slide, blur crossfade, zoom through, staggered blocks) or shader transitions (`flash-through-white`, `liquid-wipe`, `cross-warp-morph`, `chromatic-split`, etc.) via `npx hyperframes add`. Mood and energy tables live in `references/features.md#transitions`. Do not mix CSS and shader transitions in the same composition.
### 7. Lint, validate, inspect, preview, render
```bash
npx hyperframes lint # catches missing data-composition-id, overlapping tracks, unregistered timelines
npx hyperframes validate # WCAG contrast audit at 5 timestamps
npx hyperframes inspect # visual layout audit — overflow, off-frame elements, occluded text
npx hyperframes preview # live browser preview
npx hyperframes render --quality draft --output draft.mp4 # fast iteration
npx hyperframes render --quality high --output final.mp4 # final delivery
```
`hyperframes validate` samples background pixels behind every text element and warns on contrast ratios below 4.5:1 (or 3:1 for large text). `hyperframes inspect` is the layout-side companion — runs the page at multiple timestamps and flags issues that a static lint can't see (a caption that wraps past the safe area only at 4.5s, a card that overflows when its title is the longest variant, an element that ends up behind a transition shader). Run `inspect` especially on compositions with speech bubbles, cards, captions, or tight typography.
### 8. Website-to-video (if the user gives a URL)
Use the 7-step capture-to-video workflow in [references/website-to-video.md](references/website-to-video.md): capture → DESIGN.md → SCRIPT.md → storyboard → composition → render → deliver.
## Pitfalls
- **`HeadlessExperimental.beginFrame' wasn't found`** — Chromium 147+ removed this protocol. Ensure you're on `hyperframes@>=0.4.2` (auto-detects and falls back to screenshot mode). Escape hatch: `export PRODUCER_FORCE_SCREENSHOT=true`. See [hyperframes#294](https://github.com/heygen-com/hyperframes/issues/294) and [references/troubleshooting.md](references/troubleshooting.md).
- **System Chrome (not `chrome-headless-shell`)** — renders hang for 120s then timeout. Run `npx puppeteer browsers install chrome-headless-shell` (setup.sh does this). `hyperframes doctor` reports which binary will be used.
- **`repeat: -1` anywhere** — breaks the capture engine. Always compute a finite repeat count.
- **`gsap.set()` on clip elements that enter later** — the element doesn't exist at page load. Use `tl.set(selector, vars, timePosition)` inside the timeline instead, at or after the clip's `data-start`.
- **`<br>` inside content text** — forced breaks don't know the rendered font width, so natural wrap + `<br>` double-breaks. Use `max-width` to let text wrap. Exception: short display titles where each word is deliberately on its own line.
- **Animating `visibility` or `display`** — GSAP can't tween these. Use `autoAlpha` (handles both visibility and opacity).
- **Calling `video.play()` or `audio.play()`** — the framework owns playback. Never call these yourself.
- **Building timelines async** — the capture engine reads `window.__timelines` synchronously after page load. Never wrap timeline construction in `async`, `setTimeout`, or a Promise.
- **Standalone `index.html` wrapped in `<template>`** — hides all content from the browser. Only **sub-compositions** loaded via `data-composition-src` use `<template>`.
- **Using video for audio** — always muted `<video>` + separate `<audio>`.
## Verification
Before and after rendering:
1. **Lint + validate + inspect pass:** `npx hyperframes lint --strict && npx hyperframes validate && npx hyperframes inspect` (lint catches structural issues, validate catches contrast, inspect catches visual layout / overflow issues — see troubleshooting.md if warnings appear).
2. **Animation choreography** — for new compositions or significant animation changes, run the animation map. `npx hyperframes init` copies the skill scripts into the project, so the path is project-local:
```bash
node skills/hyperframes/scripts/animation-map.mjs <composition-dir> \
--out <composition-dir>/.hyperframes/anim-map
```
Outputs a single `animation-map.json` with per-tween summaries, ASCII Gantt timeline, stagger detection, dead zones (>1s with no animation), element lifecycles, and flags (`offscreen`, `collision`, `invisible`, `paced-fast` <0.2s, `paced-slow` >2s). Scan summaries and flags — fix or justify each. Skip on small edits.
3. **File exists + non-zero:** `ls -lh final.mp4`.
4. **Duration matches `data-duration`:** `ffprobe -v error -show_entries format=duration -of default=nw=1:nk=1 final.mp4`.
5. **Visual check:** extract a mid-composition frame: `ffmpeg -i final.mp4 -ss 00:00:05 -vframes 1 preview.png`.
6. **Audio present if expected:** `ffprobe -v error -show_streams -select_streams a -of default=nw=1:nk=1 final.mp4 | head -1`.
If `hyperframes render` fails, run `npx hyperframes doctor` and attach its output when reporting.
## References
- [composition.md](references/composition.md) — data attributes, timeline contract, non-negotiable rules, typography/asset rules
- [cli.md](references/cli.md) — every CLI command (init, capture, lint, validate, inspect, preview, render, transcribe, tts, doctor, browser, info, upgrade, benchmark)
- [gsap.md](references/gsap.md) — GSAP core API for HyperFrames (tweens, eases, stagger, timelines, matchMedia)
- [features.md](references/features.md) — captions, TTS, audio-reactive, marker highlighting, transitions (load on demand)
- [website-to-video.md](references/website-to-video.md) — 7-step capture-to-video workflow
- [troubleshooting.md](references/troubleshooting.md) — OpenClaw fix, env vars, common render errors
@@ -0,0 +1,185 @@
# HyperFrames CLI
Everything runs through `npx hyperframes` (or the globally-installed `hyperframes` after `npm install -g hyperframes`). Requires Node.js >= 22 and FFmpeg.
## Workflow
1. **Scaffold**`npx hyperframes init my-video` (or `npx hyperframes capture <url>` if starting from a website)
2. **Write** — author HTML composition (see `composition.md`)
3. **Lint**`npx hyperframes lint`
4. **Validate**`npx hyperframes validate` (WCAG contrast audit)
5. **Inspect**`npx hyperframes inspect` (visual layout audit)
6. **Preview**`npx hyperframes preview`
7. **Render**`npx hyperframes render`
Always lint before preview/render — catches missing `data-composition-id`, overlapping tracks, and unregistered timelines.
## init — Scaffold a Project
```bash
npx hyperframes init my-video # interactive wizard
npx hyperframes init my-video --example warm-grain # pick an example template
npx hyperframes init my-video --video clip.mp4 # seed with a video file
npx hyperframes init my-video --audio track.mp3 # seed with an audio file
npx hyperframes init my-video --non-interactive # skip prompts (CI / agent use)
```
Templates: `blank`, `warm-grain`, `play-mode`, `swiss-grid`, `vignelli`, `decision-tree`, `kinetic-type`, `product-promo`, `nyt-graph`.
`init` creates the correct file structure, copies media, transcribes audio with Whisper, and installs authoring skills. Use it instead of creating files by hand.
## capture — Website → Editable Components
```bash
npx hyperframes capture https://example.com # → captures/example.com/
npx hyperframes capture https://stripe.com -o stripe-video # custom output dir
npx hyperframes capture https://example.com --json # machine-readable output
npx hyperframes capture https://example.com --skip-assets # skip images/SVGs
```
Captures the site into `captures/<hostname>/capture/` by default, producing `capture/screenshots/`, `capture/assets/`, `capture/extracted/` (tokens.json, visible-text.txt, fonts.json), and a self-contained snapshot.
All downstream steps (DESIGN.md, SCRIPT.md, STORYBOARD, composition) read from the `capture/` subfolder — see `website-to-video.md`.
## lint
```bash
npx hyperframes lint # current directory
npx hyperframes lint ./my-project # specific project
npx hyperframes lint --verbose # include info-level findings
npx hyperframes lint --json # machine-readable output
```
Lints `index.html` and all files in `compositions/`. Reports errors (must fix), warnings (should fix), and info (only with `--verbose`).
## validate
```bash
npx hyperframes validate # WCAG contrast audit at 5 timestamps
npx hyperframes validate --no-contrast # skip while iterating
```
Seeks to 5 timestamps, screenshots the page, samples background pixels behind every text element, and warns on contrast ratios below 4.5:1 (normal text) or 3:1 (large text — 24px+, or 19px+ bold). Run before final render.
## inspect
```bash
npx hyperframes inspect # visual layout audit at 5 timestamps
npx hyperframes inspect ./my-project # specific project
npx hyperframes inspect --json # agent-readable findings
npx hyperframes inspect --samples 15 # denser timeline sweep
npx hyperframes inspect --at 1.5,4,7.25 # explicit hero-frame timestamps
```
Use this after `lint` and `validate`, especially for compositions with speech bubbles, cards, captions, or tight typography. Reports overflow, off-frame elements, occluded text, contrast warnings, and per-timestamp layout summaries — catches issues that pure timeline lint can't see (e.g., a caption that wraps past the safe area only at a specific timestamp).
`npx hyperframes layout` is a compatibility alias for the same visual inspection pass.
## preview
```bash
npx hyperframes preview # serve current directory (port 3002)
npx hyperframes preview --port 4567 # custom port
```
Hot-reloads on file changes. Opens the Studio in your browser automatically.
## render
```bash
npx hyperframes render # standard MP4
npx hyperframes render --output final.mp4 # named output
npx hyperframes render --quality draft # fast iteration
npx hyperframes render --fps 60 --quality high # final delivery
npx hyperframes render --format webm # transparent WebM
npx hyperframes render --docker # byte-identical reproducible render
```
| Flag | Options | Default | Notes |
| -------------- | ----------------------- | ------------------------------ | --------------------------- |
| `--output` | path | `renders/<name>_<timestamp>.mp4` | Output path |
| `--fps` | 24, 30, 60 | 30 | 60fps doubles render time |
| `--quality` | `draft`, `standard`, `high` | standard | draft for iterating |
| `--format` | `mp4`, `webm` | mp4 | WebM supports transparency |
| `--workers` | 18 or `auto` | auto | Each spawns Chrome |
| `--docker` | flag | off | Reproducible output |
| `--gpu` | flag | off | GPU-accelerated encoding |
| `--strict` | flag | off | Fail on lint errors |
| `--strict-all` | flag | off | Fail on errors AND warnings |
**Quality guidance:** `draft` while iterating, `standard` for review, `high` for final delivery.
## transcribe
```bash
npx hyperframes transcribe audio.mp3
npx hyperframes transcribe video.mp4 --model medium.en --language en
npx hyperframes transcribe subtitles.srt # import existing
npx hyperframes transcribe subtitles.vtt
npx hyperframes transcribe openai-response.json
```
Produces word-level timings suitable for caption components. First run downloads the Whisper model (cached after).
## tts
```bash
npx hyperframes tts "Text here" --voice af_nova --output narration.wav
npx hyperframes tts script.txt --voice bf_emma
npx hyperframes tts "La reunión empieza a las nueve" --voice ef_dora --output es.wav
npx hyperframes tts "Hello there" --voice af_heart --lang fr-fr --output accented.wav
npx hyperframes tts --list # show all voices
```
Uses Kokoro (local, no API key). Voice ID first letter encodes language: `a` American English, `b` British English, `e` Spanish, `f` French, `h` Hindi, `i` Italian, `j` Japanese, `p` Brazilian Portuguese, `z` Mandarin. The CLI auto-infers the phonemizer locale from that prefix — pass `--lang` only to override (e.g. stylized accents). Valid `--lang` codes: `en-us`, `en-gb`, `es`, `fr-fr`, `hi`, `it`, `pt-br`, `ja`, `zh`. Non-English phonemization requires `espeak-ng` installed system-wide (`apt-get install espeak-ng` / `brew install espeak-ng`).
## doctor
```bash
npx hyperframes doctor
```
Verifies environment:
- Node.js >= 22
- FFmpeg present on PATH
- Available RAM (renders are memory-hungry — 4 GB minimum)
- Chrome binary resolution (`chrome-headless-shell` preferred over system Chrome)
- Current `hyperframes` version
Run this **first** when a render fails. See `troubleshooting.md` for interpreting the output.
## browser
```bash
npx hyperframes browser --install # install the bundled chrome-headless-shell
npx hyperframes browser --path # print the resolved browser binary path
npx hyperframes browser --clean # clear the bundled browser cache
```
## info
```bash
npx hyperframes info
```
Prints version, Node version, FFmpeg version, OS, and resolved browser path — useful in bug reports.
## upgrade
```bash
npx hyperframes upgrade -y
```
Check for and install updates. Run this if you hit `HeadlessExperimental.beginFrame` errors — the auto-detect fix shipped in `hyperframes@0.4.2` (commit 4c72ba4, March 2026).
## Other
```bash
npx hyperframes compositions # list compositions in the project
npx hyperframes docs # open documentation in browser
npx hyperframes benchmark . # benchmark render performance
npx hyperframes add <block> # install a block/component from the catalog
npx hyperframes add --list # browse the catalog
```
Popular catalog blocks: `flash-through-white` (shader transition), `instagram-follow` (social overlay), `data-chart` (animated chart), `lower-third` (talking-head overlay). See [hyperframes.heygen.com/catalog](https://hyperframes.heygen.com/catalog).
@@ -0,0 +1,129 @@
# Composition Authoring
HTML structure, data attributes, timeline contract, and non-negotiable rules.
## Root Structure
Standalone `index.html` — the top-level composition. **Does NOT use `<template>`**. Put the `data-composition-id` div directly in `<body>`.
```html
<!doctype html>
<html>
<body>
<div
id="stage"
data-composition-id="root"
data-start="0"
data-duration="10"
data-width="1920"
data-height="1080"
>
<!-- clips go here -->
<video id="clip-1" data-start="0" data-duration="5" data-track-index="0" src="intro.mp4" muted playsinline></video>
<img id="logo" data-start="2" data-duration="3" data-track-index="1" src="logo.png" />
<audio id="music" data-start="0" data-duration="10" data-track-index="2" data-volume="0.5" src="music.wav"></audio>
</div>
<script src="https://cdn.jsdelivr.net/npm/gsap@3.14.2/dist/gsap.min.js"></script>
<script>
window.__timelines = window.__timelines || {};
const tl = gsap.timeline({ paused: true });
tl.from("#logo", { opacity: 0, y: 40, duration: 0.6 }, 2);
window.__timelines["root"] = tl;
</script>
</body>
</html>
```
Sub-compositions loaded via `data-composition-src` **DO** use `<template>`:
```html
<template id="my-comp-template">
<div data-composition-id="my-comp" data-width="1920" data-height="1080">
<!-- content + scoped <style> + <script> with window.__timelines["my-comp"] -->
</div>
</template>
```
Load from the root: `<div id="el-1" data-composition-id="my-comp" data-composition-src="compositions/my-comp.html" data-start="0" data-duration="10" data-track-index="1"></div>`
## Data Attributes
### All clips
| Attribute | Required | Values |
| ------------------ | --------------------------------- | ------------------------------------------------------ |
| `id` | Yes | Unique identifier |
| `data-start` | Yes | Seconds, or clip ID reference (`"el-1"`, `"intro + 2"`) |
| `data-duration` | Required for img/div/compositions | Seconds. Video/audio defaults to media duration. |
| `data-track-index` | Yes | Integer. Same-track clips cannot overlap. |
| `data-media-start` | No | Trim offset into source (seconds) |
| `data-volume` | No | 01 (default 1) |
`data-track-index` controls timeline layout only — **not** visual layering. Use CSS `z-index` for layering.
### Composition clips
| Attribute | Required | Values |
| ---------------------------- | -------- | -------------------------------------------- |
| `data-composition-id` | Yes | Unique composition ID |
| `data-start` | Yes | Start time (root composition: `"0"`) |
| `data-duration` | Yes | Takes precedence over GSAP timeline duration |
| `data-width` / `data-height` | Yes | Pixel dimensions (1920x1080 or 1080x1920) |
| `data-composition-src` | No | Path to external HTML file |
## Timeline Contract
- Every timeline starts `{ paused: true }` — the player controls playback.
- Register every timeline: `window.__timelines["<composition-id>"] = tl`.
- Duration comes from `data-duration`, not from the GSAP timeline length.
- Framework auto-nests sub-timelines — do NOT manually add them.
- Never create empty tweens just to set duration.
## Non-Negotiable Rules
1. **Deterministic.** No `Math.random()`, `Date.now()`, or time-based logic. Use a seeded PRNG (e.g. mulberry32) if you need pseudo-randomness.
2. **GSAP only on visual properties.** `opacity`, `x`, `y`, `scale`, `rotation`, `color`, `backgroundColor`, `borderRadius`, transforms. Never animate `visibility`, `display`, or call `video.play()`/`audio.play()`.
3. **No property conflicts across timelines.** Never animate the same property on the same element from multiple timelines simultaneously.
4. **No `repeat: -1`.** Infinite-repeat tweens break the capture engine. Compute `repeat: Math.ceil(duration / cycleDuration) - 1`.
5. **Synchronous timeline construction.** Never build timelines inside `async`/`await`, `setTimeout`, or Promises. The capture engine reads `window.__timelines` synchronously after page load. Fonts are embedded by the compiler — no need to wait for load.
6. **Root composition has no `<template>` wrapper.** Only sub-compositions use `<template>`.
7. **Video is always `muted playsinline`.** Audio is always a separate `<audio>` element — even if it's the same source file.
8. **Content containers use padding, not absolute positioning.** `.scene-content { width: 100%; height: 100%; padding: Npx; display: flex; flex-direction: column; gap: Npx; box-sizing: border-box }`. Absolute-positioned content containers overflow. Reserve `position: absolute` for decoratives only.
## Scene Transitions
Multi-scene compositions MUST follow all of these:
1. **Always use a transition between scenes.** No jump cuts.
2. **Always use entrance animations** on every scene element. Every element animates IN via `gsap.from(...)`. No element may appear fully-formed.
3. **Never use exit animations** (except on the final scene). This means NO `gsap.to()` that animates `opacity` to 0, `y` offscreen, etc. The transition IS the exit. Outgoing scene content must be fully visible at the moment the transition starts.
4. **Final scene only:** may fade elements out. This is the only scene where `gsap.to(..., { opacity: 0 })` is allowed.
## Typography and Assets
- **Fonts:** write the `font-family` you want in CSS — the compiler embeds supported fonts automatically. Unsupported fonts produce a compiler warning.
- Add `crossorigin="anonymous"` to external media.
- For dynamic text sizing, use `window.__hyperframes.fitTextFontSize(text, { maxWidth, fontFamily, fontWeight })`.
- All project files live at the project root alongside `index.html`. Sub-compositions reference assets with `../`.
- For rendered video: 60px+ headlines, 20px+ body, 16px+ data labels. `font-variant-numeric: tabular-nums` on number columns. Avoid full-screen linear gradients on dark backgrounds (H.264 banding — use radial or solid + localized glow).
## Animation Guardrails
- Offset the first animation 0.10.3s (not `t=0`).
- Vary eases across entrance tweens — at least 3 different eases per scene.
- Don't repeat an entrance pattern within a scene.
## Never Do
1. Forget `window.__timelines` registration.
2. Use video for audio — always muted video + separate `<audio>`.
3. Nest video inside a timed div — use a non-timed wrapper.
4. Use `data-layer` (use `data-track-index`) or `data-end` (use `data-duration`).
5. Animate video element dimensions — animate a wrapper div instead.
6. Call `play`/`pause`/`seek` on media — framework owns playback.
7. Create a top-level container without `data-composition-id`.
8. Use `repeat: -1` on any timeline or tween.
9. Build timelines asynchronously.
10. Use `gsap.set()` on elements from later scenes — they don't exist in the DOM at page load. Use `tl.set(selector, vars, timePosition)` inside the timeline at or after the clip's `data-start`.
11. Use `<br>` in content text — causes unwanted extra breaks when the text wraps naturally. Use `max-width` instead. Exception: short display titles (e.g., "THE\nIMMORTAL\nGAME") where each word is deliberately on its own line.
@@ -0,0 +1,289 @@
# HyperFrames Feature Reference
Load this file when a composition needs captions, TTS narration, audio-reactive visuals, marker-style text highlighting, or scene transitions. All patterns here are deterministic (no `Math.random()`, no `Date.now()`, no runtime audio analysis) and live on the same GSAP timeline as the rest of the composition.
## Captions
### Language Rule (Non-Negotiable)
**Never use `.en` whisper models unless the audio is confirmed English.** `.en` models TRANSLATE non-English audio into English instead of transcribing it.
- User says the language → `npx hyperframes transcribe audio.mp3 --model small --language <code>` (no `.en`)
- User confirms English → `--model small.en`
- Language unknown → `--model small` (auto-detects)
### Style Detection
If the user doesn't specify a caption style, detect it from the transcript tone:
| Tone | Font mood | Animation | Color | Size |
| ------------ | ------------------------ | ---------------------------------- | --------------------------- | ------- |
| Hype / launch | Heavy condensed, 800-900 | Scale-pop, `back.out(1.7)`, 0.1-0.2s | Bright on dark | 72-96px |
| Corporate | Clean sans, 600-700 | Fade+slide, `power3.out`, 0.3s | White / neutral + muted accent | 56-72px |
| Tutorial | Mono / clean sans, 500-600 | Typewriter or fade, 0.4-0.5s | High contrast, minimal | 48-64px |
| Storytelling | Serif / elegant, 400-500 | Slow fade, `power2.out`, 0.5-0.6s | Warm muted tones | 44-56px |
| Social | Rounded sans, 700-800 | Bounce, `elastic.out`, word-by-word | Playful, colored pills | 56-80px |
### Word Grouping
- High energy: 2-3 words, quick turnover.
- Conversational: 3-5 words, natural phrases.
- Measured / calm: 4-6 words.
Break on sentence boundaries, 150ms+ pauses, or a max word count.
### Positioning
- Landscape (1920x1080): bottom 80-120px, centered.
- Portrait (1080x1920): ~600-700px from bottom, centered.
- Never cover the subject's face. `position: absolute` (never relative). One caption group visible at a time.
### Text Overflow Prevention
Use the runtime helper so captions never overflow:
```js
const result = window.__hyperframes.fitTextFontSize(group.text.toUpperCase(), {
fontFamily: "Outfit",
fontWeight: 900,
maxWidth: 1600, // 1600 landscape, 900 portrait
});
el.style.fontSize = result.fontSize + "px";
```
When per-word styling uses `scale > 1.0`, compute `maxWidth = safeWidth / maxScale` to leave headroom. Container needs `overflow: visible` (not `hidden` — hidden clips scaled emphasis words and glow).
### Caption Exit Guarantee
Every group MUST have a hard kill after its exit tween — otherwise groups leak into later ones:
```js
tl.to(groupEl, { opacity: 0, scale: 0.95, duration: 0.12, ease: "power2.in" }, group.end - 0.12);
tl.set(groupEl, { opacity: 0, visibility: "hidden" }, group.end); // deterministic kill
```
### Per-Word Styling
Scan the transcript for words that deserve distinct treatment:
- Brand / product names — larger, unique color.
- ALL CAPS — scale boost, flash, accent color.
- Numbers / statistics — bold weight, accent color.
- Emotional keywords — exaggerated animation (overshoot, bounce).
- Call-to-action — highlight, underline, color pop.
## TTS (Kokoro-82M)
Local, no API key. Runs on CPU. Model downloads on first use (~311 MB + ~27 MB voices, cached in `~/.cache/hyperframes/tts/`).
### Voice Selection
| Content type | Voice | Why |
| ------------- | ----------------------- | --------------------------- |
| Product demo | `af_heart` / `af_nova` | Warm, professional |
| Tutorial | `am_adam` / `bf_emma` | Neutral, easy to follow |
| Marketing | `af_sky` / `am_michael` | Energetic or authoritative |
| Documentation | `bf_emma` / `bm_george` | Clear British English |
| Casual | `af_heart` / `af_sky` | Approachable, natural |
Run `npx hyperframes tts --list` for all 54 voices across 8 languages.
### Multilingual Phonemization
Voice ID first letter encodes language: `a`=American English, `b`=British English, `e`=Spanish, `f`=French, `h`=Hindi, `i`=Italian, `j`=Japanese, `p`=Brazilian Portuguese, `z`=Mandarin. The CLI auto-infers the phonemizer locale from that prefix — you don't need `--lang` when voice and text match.
```bash
npx hyperframes tts "La reunión empieza a las nueve" --voice ef_dora --output es.wav
npx hyperframes tts "今日はいい天気ですね" --voice jf_alpha --output ja.wav
```
Pass `--lang` only to override auto-detection (e.g. stylized accents):
```bash
npx hyperframes tts "Hello there" --voice af_heart --lang fr-fr --output accented.wav
```
Valid `--lang` codes: `en-us`, `en-gb`, `es`, `fr-fr`, `hi`, `it`, `pt-br`, `ja`, `zh`. Non-English phonemization requires `espeak-ng` installed system-wide (`apt-get install espeak-ng` / `brew install espeak-ng`).
### Speed
- `0.7-0.8` — tutorial, complex content
- `1.0` — natural (default)
- `1.1-1.2` — intros, upbeat content
- `1.5+` — rarely appropriate
### TTS + Captions Workflow
```bash
npx hyperframes tts script.txt --voice af_heart --output narration.wav
npx hyperframes transcribe narration.wav # → transcript.json (word-level)
```
## Audio-Reactive Visuals
Drive visuals from music, voice, or sound. Any GSAP-tweenable property can respond to pre-extracted audio data.
### Data format
```js
const AUDIO_DATA = {
fps: 30,
totalFrames: 900,
frames: [{ bands: [0.82, 0.45, 0.31, /* ... */] }, /* ... */],
};
```
`frames[i].bands[]` are frequency band amplitudes, 0-1. Index 0 = bass, higher indices = treble. Each band is normalized independently across the full track.
### Mapping audio to visuals
| Audio signal | Visual property | Effect |
| ---------------------- | --------------------------------- | -------------------------- |
| Bass (`bands[0]`) | `scale` | Pulse on beat |
| Treble (`bands[12-14]`)| `textShadow`, `boxShadow` | Glow intensity |
| Overall amplitude | `opacity`, `y`, `backgroundColor` | Breathe, lift, color shift |
| Mid-range (`bands[4-8]`)| `borderRadius`, `width` | Shape morphing |
Any GSAP-tweenable property works — `clipPath`, `filter`, SVG attributes, CSS custom properties. Let content guide the visual and let audio drive its behavior. **Never add** equalizer bars, spectrum analyzers, waveform displays, rainbow cycling, or generic particle systems — they look cheap.
### Sampling pattern (required)
Audio reactivity needs per-frame sampling via a `for` loop of `tl.call()`, NOT a single tween. A single long tween does NOT react to audio:
```js
for (let f = 0; f < AUDIO_DATA.totalFrames; f++) {
tl.call(
((frame) => () => draw(frame))(AUDIO_DATA.frames[f]),
[],
f / AUDIO_DATA.fps,
);
}
```
### Gotchas
- **textShadow on a container** with semi-transparent children (e.g. inactive caption words at `rgba(255,255,255,0.3)`) renders a visible glow rectangle behind every child. Apply the glow to active words individually, not to the container.
- **Subtlety for text** — 3-6% scale variation, soft glow. Heavy pulsing makes text unreadable.
- **Go bigger on non-text** — backgrounds and shapes can handle 10-30% swings.
- **Deterministic only** — pre-extracted audio data, no Web Audio API, no runtime analysis.
## Marker-Style Highlighting
Deterministic CSS + GSAP implementations of the classic "highlight / circle / burst / scribble / sketchout" drawing modes for emphasizing text. Fully seekable — no animated SVG filters, no JS timers.
### Highlight (yellow marker sweep)
```html
<span class="mh-highlight-wrap">
<span class="mh-highlight-bar" id="hl-1"></span>
<span class="mh-highlight-text">highlighted text</span>
</span>
```
```css
.mh-highlight-wrap { position: relative; display: inline; }
.mh-highlight-bar {
position: absolute; inset: 0 -6px;
background: #fdd835; opacity: 0.35;
transform: scaleX(0); transform-origin: left center;
border-radius: 3px; z-index: 0;
}
.mh-highlight-text { position: relative; z-index: 1; }
```
```js
tl.to("#hl-1", { scaleX: 1, duration: 0.5, ease: "power2.out" }, 0.6);
```
Multi-line: apply to `.mh-highlight-bar` with `stagger: 0.3`.
### Circle
Hand-drawn ellipse around a word. Use a positioned `::before` with `border-radius: 50%`, slight rotation, and `clip-path` to avoid covering the letters. Animate `clip-path` or `stroke-dashoffset` on an inline SVG circle.
### Burst
Short radiating lines around a word. Render 6-12 small `<span>` elements positioned in a radial pattern; animate `scaleY` from 0.
### Scribble
A chaotic overlay created by animating `stroke-dashoffset` on an inline SVG `<path>` with a `d` attribute describing a zig-zag. Seed values, never `Math.random()`.
### Sketchout
A rough rectangle outline. Two `<rect>`s with slight `transform` offsets, animated via `stroke-dashoffset`.
All five modes tween CSS transforms or `stroke-dashoffset` only — both tween cleanly, are deterministic, and seek correctly.
## Scene Transitions
Every multi-scene composition MUST use transitions. No jump cuts.
### Energy → primary transition
| Energy | CSS primary | Shader primary | Accent | Duration | Easing |
| ------------------------------------ | ---------------------------- | ------------------------------------ | ------------------------------ | --------- | ------------------------ |
| **Calm** (wellness, brand, luxury) | Blur crossfade, focus pull | Cross-warp morph, thermal distortion | Light leak, circle iris | 0.5-0.8s | `sine.inOut`, `power1` |
| **Medium** (corporate, SaaS) | Push slide, staggered blocks | Whip pan, cinematic zoom | Squeeze, vertical push | 0.3-0.5s | `power2`, `power3` |
| **High** (promos, sports, launch) | Zoom through, overexposure | Ridged burn, glitch, chromatic split | Staggered blocks, gravity drop | 0.15-0.3s | `power4`, `expo` |
Pick ONE primary (60-70% of scene changes) plus 1-2 accents. Never use a different transition for every scene.
### Mood → transition type
| Mood | Transitions |
| ------------------------ | --------------------------------------------------------------------------- |
| Warm / inviting | Light leak, blur crossfade, focus pull, film burn · _Shader:_ thermal distortion, cross-warp morph |
| Cold / clinical | Squeeze, zoom out, blinds, shutter, grid dissolve · _Shader:_ gravitational lens |
| Editorial / magazine | Push slide, vertical push, diagonal split, shutter · _Shader:_ whip pan |
| Tech / futuristic | Grid dissolve, staggered blocks, blinds · _Shader:_ glitch, chromatic split |
| Tense / edgy | Glitch, VHS, chromatic aberration, ripple · _Shader:_ ridged burn, domain warp |
| Playful / fun | Elastic push, 3D flip, circle iris, morph circle · _Shader:_ swirl vortex, ripple waves |
| Dramatic / cinematic | Zoom through, gravity drop, overexposure · _Shader:_ cinematic zoom, gravitational lens |
| Premium / luxury | Focus pull, blur crossfade, color dip to black · _Shader:_ cross-warp morph |
| Retro / analog | Film burn, light leak, VHS, clock wipe · _Shader:_ light leak |
### Presets
| Preset | Duration | Easing |
| ---------- | -------- | ----------------- |
| `snappy` | 0.2s | `power4.inOut` |
| `smooth` | 0.4s | `power2.inOut` |
| `gentle` | 0.6s | `sine.inOut` |
| `dramatic` | 0.5s | `power3.in` → out |
| `instant` | 0.15s | `expo.inOut` |
| `luxe` | 0.7s | `power1.inOut` |
### Install a shader transition
```bash
npx hyperframes add flash-through-white
npx hyperframes add --list
```
### CSS vs shader
- **CSS transitions** animate scene containers with opacity, transforms, `clip-path`, and filters. Simpler to set up.
- **Shader transitions** composite both scene textures per-pixel on a WebGL canvas — can warp, dissolve, and morph in ways CSS cannot. Import from `@hyperframes/shader-transitions` instead of writing raw GLSL.
Don't mix CSS and shader transitions in the same composition — once a composition uses shader transitions, the WebGL canvas replaces DOM-based scene switching for every transition.
### Shader-compatible CSS rules
Shader transitions capture DOM scenes to WebGL textures via html2canvas. The canvas 2D pipeline doesn't match CSS exactly:
1. No `transparent` keyword in gradients — use the target color at zero alpha: `rgba(200,117,51,0)` not `transparent`. (Canvas interpolates `transparent` as `rgba(0,0,0,0)` creating dark fringes.)
2. No gradient backgrounds on elements thinner than 4px. Use solid `background-color` on thin accent lines.
3. No CSS variables (`var()`) on elements visible during capture — html2canvas doesn't reliably resolve custom properties. Use literal color values.
4. Mark uncapturable decoratives with `data-no-capture` — they stay on the live DOM but are absent from the shader texture.
5. No gradient opacity below 0.15 — renders differently in canvas vs CSS.
6. Every `.scene` div must have explicit `background-color`, AND pass the same color as `bgColor` in the `init()` config. Without either, the texture renders as black.
These rules only apply to shader transition compositions. CSS-only compositions have no restrictions.
### Don't
- Mix CSS and shader transitions in one composition.
- Use exit animations on any scene except the final scene — the transition IS the exit.
- Introduce a new transition type every scene — pick one primary + 1-2 accents.
- Use transitions that create visible geometric repetition (grids, hex cells, uniform dots) — they look artificial regardless of the math behind them. Prefer organic noise (FBM, domain warping).
@@ -0,0 +1,136 @@
# GSAP for HyperFrames
GSAP is the animation engine for all HyperFrames compositions. Load from CDN inside the composition:
```html
<script src="https://cdn.jsdelivr.net/npm/gsap@3.14.2/dist/gsap.min.js"></script>
```
## Core Tween Methods
- **`gsap.to(targets, vars)`** — animate from current state to `vars`. Most common.
- **`gsap.from(targets, vars)`** — animate from `vars` to current state (entrances).
- **`gsap.fromTo(targets, fromVars, toVars)`** — explicit start and end.
- **`gsap.set(targets, vars)`** — apply immediately (duration 0). Don't use on clip elements that enter later — use `tl.set(selector, vars, time)` inside the timeline instead.
Always use **camelCase** property names (`backgroundColor`, `rotationX`, not `background-color`).
## Common vars
- **`duration`** — seconds (default 0.5).
- **`delay`** — seconds before start.
- **`ease`** — `"power1.out"` (default), `"power3.inOut"`, `"back.out(1.7)"`, `"elastic.out(1, 0.3)"`, `"none"`, `"expo.out"`, `"circ.inOut"`.
- **`stagger`** — number `0.1` or object: `{ amount: 0.3, from: "center" }`, `{ each: 0.1, from: "random" }`.
- **`overwrite`** — `false` (default), `true`, or `"auto"`.
- **`repeat`** — number (never `-1` in HyperFrames). **`yoyo`** — alternates direction with repeat.
- **`onComplete`**, **`onStart`**, **`onUpdate`** — callbacks.
- **`immediateRender`** — default `true` for `from()`/`fromTo()`. Set `false` on later tweens targeting the same property+element to avoid overwrite surprises.
## Transforms
Prefer GSAP's transform aliases over raw CSS `transform`:
| GSAP property | Equivalent |
| --------------------------- | -------------------------- |
| `x`, `y`, `z` | translateX/Y/Z (px) |
| `xPercent`, `yPercent` | translateX/Y (%) |
| `scale`, `scaleX`, `scaleY` | scale |
| `rotation` | rotate (deg) |
| `rotationX`, `rotationY` | 3D rotate |
| `skewX`, `skewY` | skew |
| `transformOrigin` | transform-origin |
- **`autoAlpha`** — prefer over `opacity`. At 0, also sets `visibility: hidden`.
- **CSS variables**`"--hue": 180`.
- **Directional rotation**`"360_cw"`, `"-170_short"`, `"90_ccw"`.
- **`clearProps`** — `"all"` or comma-separated; removes inline styles on complete.
- **Relative values**`"+=20"`, `"-=10"`, `"*=2"`.
## Function-based Values
```js
gsap.to(".item", {
x: (i, target, targets) => i * 50,
stagger: 0.1,
});
```
## Easing
Built-in eases: `power1` through `power4`, `back`, `bounce`, `circ`, `elastic`, `expo`, `sine`. Each has `.in`, `.out`, `.inOut`.
Rule of thumb:
- Entrances: `power3.out`, `expo.out`, `back.out(1.4)`
- Exits: `power2.in`, `expo.in`
- Scrubbed sections: `none` (linear)
- Vary eases across entrance tweens within a scene — at least 3 different eases.
## Defaults
```js
gsap.defaults({ duration: 0.6, ease: "power2.out" });
```
## Timelines (HyperFrames primary pattern)
```js
window.__timelines = window.__timelines || {};
const tl = gsap.timeline({ paused: true, defaults: { duration: 0.6, ease: "power2.out" } });
tl.from(".title", { y: 50, opacity: 0 }, 0.3);
tl.from(".subtitle", { y: 30, opacity: 0 }, 0.5);
tl.from(".cta", { scale: 0.8, opacity: 0, ease: "back.out(1.7)" }, 0.8);
window.__timelines["root"] = tl;
```
### Position parameter
Third argument to `.from()` / `.to()` / `.add()`:
- Absolute seconds: `0.5`, `2.1`.
- Relative to end: `">+0.2"` (0.2s after previous), `"<"` (same time as previous), `"<+0.3"` (0.3s after previous's start).
- Named labels: `tl.addLabel("act2", 5); tl.from(".x", { y: 30 }, "act2");`
### Nesting
HyperFrames auto-nests sub-composition timelines. **Do not** manually `tl.add(subTl)` — the framework wires sub-timelines into the parent at the sub-composition's `data-start`.
### Playback
The player controls playback. Don't call `tl.play()`, `tl.pause()`, or `tl.reverse()` at construction time. `{ paused: true }` is required.
## Stagger
```js
// even distribution
tl.from(".card", { opacity: 0, y: 40, stagger: 0.1 });
// control total amount
tl.from(".card", { opacity: 0, stagger: { amount: 0.6, from: "center" } });
// deterministic "random" stagger (HyperFrames compositions must be deterministic)
tl.from(".dot", { opacity: 0, stagger: { each: 0.05, from: "random" } });
```
`stagger.from`: `"start"` | `"end"` | `"center"` | `"edges"` | `"random"` | index | `[x, y]` for grid.
## Performance
- Animate transforms (`x`, `y`, `scale`, `rotation`, `opacity`) — cheap, GPU-accelerated.
- Avoid animating `width`, `height`, `top`, `left`, `margin` — causes layout thrash.
- Avoid box-shadow or filter animations on large elements — expensive.
- `will-change` is rarely needed; GSAP handles promotion.
## gsap.matchMedia (rarely needed in HyperFrames)
Compositions have fixed dimensions (`data-width`/`data-height`), so responsive breakpoints don't apply. You may still use `matchMedia` for `prefers-reduced-motion` when authoring UI previews, but it's not used in rendered video output.
## Don't Do
- `repeat: -1` anywhere — breaks the capture engine.
- `Math.random()`, `Date.now()`, performance.now()` inside tween values — non-deterministic.
- `async` / `setTimeout` / `Promise` around timeline construction — the capture engine reads `window.__timelines` synchronously.
- Animate `visibility` or `display` directly — use `autoAlpha`.
- `gsap.set()` on clip elements that enter later in the timeline — they don't exist in the DOM at page-load. Use `tl.set(sel, vars, time)` inside the timeline.
@@ -0,0 +1,137 @@
# Troubleshooting
## `HeadlessExperimental.beginFrame' wasn't found` (first thing to check)
**Symptom:** `npx hyperframes render` fails with:
```
✗ Render failed
Protocol error (HeadlessExperimental.beginFrame):
'HeadlessExperimental.beginFrame' wasn't found
```
**Cause:** Chromium 147+ removed the `HeadlessExperimental.beginFrame` CDP command. This affected sandbox environments (e.g., OpenClaw, some containerized agent hosts) that ship modern Chromium as the system browser. See [hyperframes#294](https://github.com/heygen-com/hyperframes/issues/294).
**Fix (permanent — preferred):** upgrade.
```bash
npx hyperframes upgrade -y
# or
npm install -g hyperframes@latest
```
`hyperframes >= 0.4.2` auto-detects whether the resolved browser supports `beginFrame` (checks for `chrome-headless-shell` in the binary path) and falls back to screenshot capture mode when it doesn't. Commit [`4c72ba4`](https://github.com/heygen-com/hyperframes/commit/4c72ba4a36ec2bd6733f7b9cb2a9e63f9fb234b9) (March 2026) shipped this auto-detect.
**Fix (escape hatch — if you can't upgrade):**
```bash
export PRODUCER_FORCE_SCREENSHOT=true
npx hyperframes render
```
This forces screenshot mode regardless of the binary. Screenshot mode is slightly slower but visually identical.
**Fix (prevent — recommended):** install `chrome-headless-shell` so the engine can use the fast BeginFrame path:
```bash
npx puppeteer browsers install chrome-headless-shell
# or let the CLI do it
npx hyperframes browser --install
```
`scripts/setup.sh` runs this automatically.
## `npx hyperframes render` hangs for 120s then times out
**Cause:** the resolved browser is system Chrome (e.g., `/usr/bin/google-chrome`) and doesn't support the BeginFrame path, but auto-detect also missed it (older `hyperframes` version).
**Fix:**
1. Check which binary is being used: `npx hyperframes browser --path`
2. If it's system Chrome, either:
- Install `chrome-headless-shell`: `npx hyperframes browser --install`, OR
- Set the escape hatch: `export PRODUCER_FORCE_SCREENSHOT=true`, OR
- Upgrade: `npx hyperframes upgrade -y`
## `ffmpeg: command not found`
Install FFmpeg via your system package manager:
| OS / distro | Command |
| --------------- | ----------------------------------- |
| Ubuntu / Debian | `sudo apt-get install -y ffmpeg` |
| Fedora / RHEL | `sudo dnf install -y ffmpeg` |
| Arch | `sudo pacman -S ffmpeg` |
| macOS | `brew install ffmpeg` |
| Windows | `winget install Gyan.FFmpeg` |
Verify: `ffmpeg -version`.
## `Node version X is not supported`
HyperFrames requires Node.js >= 22. Check with `node --version`.
- **nvm:** `nvm install 22 && nvm use 22`
- **Homebrew (macOS):** `brew install node@22 && brew link --overwrite node@22`
- **apt:** follow [nodesource](https://github.com/nodesource/distributions) for Node 22 LTS.
## `ENOSPC: no space left on device` or OOM kills during render
Renders are memory- and disk-hungry. Minimums:
- **RAM:** 4 GB free (8 GB recommended for 60fps / `--quality high`)
- **Disk:** 2 GB free scratch space — frames are written to `/tmp` during capture
Mitigations:
- Lower quality: `--quality draft`.
- Lower fps: `--fps 24`.
- Lower worker count: `--workers 1`.
- Set `TMPDIR` to a volume with more space: `export TMPDIR=/mnt/scratch`.
## Lint passes but the render is blank / black frames
Check the browser console in `preview` — usually:
- A timeline was registered with the wrong key (`__timelines["typo"]` instead of `__timelines["root"]`).
- The root composition was wrapped in `<template>` (only sub-compositions use `<template>`).
- A script tag failed to load — check Network tab in preview.
Run `npx hyperframes lint --verbose` to see info-level findings.
## Contrast warnings from `hyperframes validate`
```
⚠ WCAG AA contrast warnings (3):
· .subtitle "secondary text" — 2.67:1 (need 4.5:1, t=5.3s)
```
- **Dark backgrounds:** brighten the failing color until it clears 4.5:1 (normal text) or 3:1 (large text — 24px+ or 19px+ bold).
- **Light backgrounds:** darken it.
- Stay within the palette family — don't invent a new color, adjust the existing one.
- Skip the check temporarily with `--no-contrast` if iterating rapidly, but clear it before delivery.
## `Font family 'X' not supported by compiler`
The compiler embeds a curated set of web-safe + open-source fonts. If a font isn't supported, either:
- Swap to a supported alternative from the warning.
- Register a custom font via `@font-face` pointing to a `.woff2` in the project directory (the compiler embeds referenced `@font-face` files).
## Video plays back muted or with no audio
Check:
- The `<video>` element has `muted playsinline` (required — browser autoplay policy).
- Audio is a **separate** `<audio>` element, not the video element.
- Audio `data-volume` is set (defaults to 1).
- The audio file is at the expected path — compositions load relative to their own directory.
## Docker render fails on Linux with rootless Docker
Add `--privileged` or pass `--cap-add=SYS_ADMIN`:
```bash
npx hyperframes render --docker --docker-args "--cap-add=SYS_ADMIN"
```
The headless browser needs namespace permissions for sandboxing.
## Bug reports
Include `npx hyperframes info` output + the full error log. File at [github.com/heygen-com/hyperframes](https://github.com/heygen-com/hyperframes/issues).
@@ -0,0 +1,145 @@
# Website to Video
Capture a website, produce a professional video from it. Use when the user provides a URL and wants a video — social ad, product tour, 30-second promo, etc.
The workflow has 7 steps. Each produces an artifact that gates the next. **Do not skip steps** — each artifact prevents a downstream failure mode.
## Step 1: Capture & Understand
```bash
npx hyperframes capture https://example.com -o example-video
```
Produces `example-video/capture/` with:
- `capture/screenshots/` — above-the-fold + section screenshots (up to `--max-screenshots`)
- `capture/assets/` — logos, hero images, background video (if any)
- `capture/extracted/tokens.json` — colors, fonts, and spacing tokens
- `capture/extracted/visible-text.txt` — extracted headings, paragraphs, CTAs
- `capture/extracted/fonts.json` — font families and stacks detected in computed styles
- `capture/asset-descriptions.md` — auto-generated asset catalog
All subsequent steps read from the `capture/` subfolder — `capture/extracted/tokens.json`, `capture/assets/hero.png`, etc. Never strip the `capture/` prefix when referencing these files.
**Gate:** Print a site summary — name, top 3 colors, primary + display fonts, hero asset path, one-sentence vibe. Keep it in your context — don't re-capture.
## Step 2: Write DESIGN.md
Small brand reference at the project root. 6 sections, ~90 lines. This is the cheat sheet — not the creative plan.
```markdown
# DESIGN
## Brand
- Name: Example Co.
- One-line mission: "…"
## Colors
- Background: #0B0F14
- Primary: #00E0A4 (accent, CTA)
- Secondary: #7A8B9B (body text)
- Text: #FFFFFF
## Typography
- Display: "Inter Tight", 700, tight letter-spacing
- Body: "Inter", 400
## Motion
- Mood: precise, technical, confident
- Eases: `power3.out` for entrances, `expo.in` for exits
## Assets
- Logo: `capture/assets/logo.svg`
- Hero image: `capture/assets/hero.png`
## What NOT to Do
- No purple, no pastels, no serif body
- No playful/bubbly eases (`elastic`, `bounce`)
- No drop shadows on text
```
**Gate:** `DESIGN.md` exists in the project directory.
## Step 3: Write SCRIPT.md
Narration script. Story backbone. **Scene durations come from the narration, not from guessing.**
```markdown
# SCRIPT
## Scene 1 — Hook (0:000:04)
"What if your dashboards wrote themselves?"
## Scene 2 — Problem (0:040:11)
"Teams spend hours stitching together queries, charts, and callouts — every Monday."
## Scene 3 — Solution (0:110:22)
"Example Co. watches your data streams and proposes the dashboard you'd have built — in seconds."
## Scene 4 — CTA (0:220:28)
"Try it free at example.com."
```
Run `npx hyperframes tts SCRIPT.md --voice af_nova --output narration.wav` to generate TTS audio. Note the exact duration — that's the video's duration.
**Gate:** `SCRIPT.md` + `narration.wav` exist and durations match the plan (±0.3s).
## Step 4: Storyboard
Text-only scene plan: for each scene, describe the hero frame — what's on screen at the scene's most-visible moment.
```markdown
# STORYBOARD
## Scene 1 (0:000:04) — Hook
Hero frame: giant "WHAT IF YOUR DASHBOARDS WROTE THEMSELVES?" in display font, centered, on near-black. Logo top-left at 40% opacity.
Entrance: each word staggers in, 0.08s apart.
Transition out: flash-through-white into Scene 2.
```
One paragraph per scene. Do NOT skip this step — it's where you catch narrative gaps before writing HTML.
**Gate:** `STORYBOARD.md` exists. Each scene has: hero frame, entrance, transition.
## Step 5: Composition
Write `index.html` scene-by-scene:
- Each scene is a `<div class="scene scene-N">` positioned absolutely, full-bleed.
- Static HTML+CSS for the hero frame first (no GSAP).
- Layer the narration `<audio>` at `data-start="0"` on a high track index.
- Add a transitions component (`flash-through-white`, `liquid-wipe`, etc.) between each scene.
- THEN add GSAP entrances (`gsap.from()`), no exits — transitions own the exit.
- Register `window.__timelines["root"] = tl`.
Install transitions as needed:
```bash
npx hyperframes add flash-through-white
```
## Step 6: Render
```bash
npx hyperframes lint --strict # must pass
npx hyperframes validate # WCAG contrast audit
npx hyperframes render --quality draft --output draft.mp4
```
Watch the draft. Note issues in a `REVIEW.md` bullet list (scene, timestamp, issue). Fix, re-render.
When happy:
```bash
npx hyperframes render --quality high --output final.mp4
```
## Step 7: Deliver
- Report file path + duration + file size to the user.
- If the user wants a vertical cut, re-render with a 9:16 composition (`data-width="1080" data-height="1920"`) — typically requires a separate `index-vertical.html` with tighter typography and re-stacked scene layout.
## Common Failure Modes
- **Skipped DESIGN.md** → colors drift scene-to-scene; output feels like "AI slides."
- **Skipped STORYBOARD.md** → scenes overlap or hero frames collide with transitions.
- **Exit animations** before transitions → empty frames when the transition fires.
- **Narration longer than `data-duration`** → audio clips mid-sentence. Update the composition's `data-duration` to match the TTS output length + 0.5s buffer.
+135
View File
@@ -0,0 +1,135 @@
#!/usr/bin/env bash
# HyperFrames setup for Hermes.
#
# Verifies Node >= 22 and FFmpeg, installs the `hyperframes` CLI globally,
# pre-caches `chrome-headless-shell`, and runs `hyperframes doctor`.
#
# Pins `hyperframes@>=0.4.2` so the OpenClaw/Chromium-147 fix from
# https://github.com/heygen-com/hyperframes/issues/294 (commit 4c72ba4)
# is always present — the engine auto-detects `HeadlessExperimental.beginFrame`
# support and falls back to screenshot capture otherwise.
#
# Idempotent: safe to re-run.
set -euo pipefail
MIN_NODE_MAJOR=22
MIN_HYPERFRAMES_VERSION="0.4.2"
red() { printf '\033[31m%s\033[0m\n' "$*"; }
green() { printf '\033[32m%s\033[0m\n' "$*"; }
yellow() { printf '\033[33m%s\033[0m\n' "$*"; }
bold() { printf '\033[1m%s\033[0m\n' "$*"; }
bold "==> HyperFrames setup"
# --- 1. Node.js --------------------------------------------------------------
if ! command -v node >/dev/null 2>&1; then
red "✗ Node.js is not installed."
echo " Install Node.js >= ${MIN_NODE_MAJOR} (nvm, Homebrew, or your package manager) and re-run."
exit 1
fi
node_version="$(node --version | sed 's/^v//')"
node_major="$(echo "$node_version" | cut -d. -f1)"
if [ "$node_major" -lt "$MIN_NODE_MAJOR" ]; then
red "✗ Node.js ${node_version} is too old. HyperFrames requires Node.js >= ${MIN_NODE_MAJOR}."
echo " Upgrade with 'nvm install ${MIN_NODE_MAJOR} && nvm use ${MIN_NODE_MAJOR}' or your package manager."
exit 1
fi
green "✓ Node.js ${node_version}"
# --- 2. FFmpeg ---------------------------------------------------------------
if ! command -v ffmpeg >/dev/null 2>&1; then
red "✗ FFmpeg is not installed."
case "$(uname -s)" in
Linux*) echo " sudo apt-get install -y ffmpeg # Debian/Ubuntu"
echo " sudo dnf install -y ffmpeg # Fedora/RHEL";;
Darwin*) echo " brew install ffmpeg";;
MINGW*|MSYS*|CYGWIN*) echo " winget install Gyan.FFmpeg";;
*) echo " See https://ffmpeg.org/download.html";;
esac
exit 1
fi
green "✓ FFmpeg $(ffmpeg -version 2>&1 | head -1 | awk '{print $3}')"
# --- 3. npm ------------------------------------------------------------------
if ! command -v npm >/dev/null 2>&1; then
red "✗ npm is not installed (should ship with Node.js)."
exit 1
fi
# --- 4. Install / upgrade hyperframes CLI -----------------------------------
bold "==> Installing hyperframes CLI (>= ${MIN_HYPERFRAMES_VERSION})"
current_hyperframes=""
if command -v hyperframes >/dev/null 2>&1; then
current_hyperframes="$(hyperframes --version 2>/dev/null | tail -1 | sed 's/^v//')"
fi
if [ -n "$current_hyperframes" ]; then
yellow " Found hyperframes ${current_hyperframes}"
fi
# Always install/upgrade to >= MIN version.
# Using 'latest' so we pick up any newer auto-detect/capture fixes.
if ! npm install -g "hyperframes@latest" >/dev/null 2>&1; then
red "✗ npm install -g hyperframes@latest failed."
echo " Try: sudo npm install -g hyperframes@latest"
echo " Or use a user-scoped npm prefix: npm config set prefix ~/.npm-global && export PATH=\"\$HOME/.npm-global/bin:\$PATH\""
exit 1
fi
installed_version="$(hyperframes --version 2>/dev/null | tail -1 | sed 's/^v//')"
green "✓ hyperframes ${installed_version} installed globally"
# Sanity-check minimum version.
version_ge() {
# version_ge A B → true if A >= B
[ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -1)" = "$2" ]
}
if ! version_ge "$installed_version" "$MIN_HYPERFRAMES_VERSION"; then
red "✗ hyperframes ${installed_version} is below required minimum ${MIN_HYPERFRAMES_VERSION}."
echo " Try 'npm install -g hyperframes@latest' or 'sudo npm install -g hyperframes@latest'."
exit 1
fi
# --- 5. Pre-cache chrome-headless-shell --------------------------------------
#
# Chromium 147+ removed HeadlessExperimental.beginFrame. System Chrome (e.g.
# /usr/bin/google-chrome) can't render with the fast path, so the engine
# auto-detects and falls back to screenshot mode — but BeginFrame mode is
# faster and produces higher-quality output. Install chrome-headless-shell
# up front so the engine picks it over system Chrome.
bold "==> Pre-caching chrome-headless-shell (for best render quality)"
if ! npx --yes puppeteer browsers install chrome-headless-shell >/dev/null 2>&1; then
yellow "⚠ Could not pre-install chrome-headless-shell."
yellow " Rendering will still work via screenshot-mode fallback (slower)."
yellow " If you hit HeadlessExperimental.beginFrame errors:"
yellow " export PRODUCER_FORCE_SCREENSHOT=true"
yellow " See references/troubleshooting.md."
else
green "✓ chrome-headless-shell installed"
fi
# --- 6. Doctor ---------------------------------------------------------------
bold "==> Running hyperframes doctor"
if hyperframes doctor; then
green "✓ HyperFrames is ready"
echo
echo " Scaffold a project: npx hyperframes init my-video"
echo " Preview: npx hyperframes preview"
echo " Render: npx hyperframes render"
else
yellow "⚠ hyperframes doctor reported issues."
yellow " See references/troubleshooting.md or re-run 'hyperframes doctor'."
exit 1
fi
@@ -345,10 +345,6 @@ Flash Attention uses float16/bfloat16 for speed. Float32 not supported.
**Performance benchmarks**: See [references/benchmarks.md](references/benchmarks.md) for detailed speed and memory comparisons across GPUs and sequence lengths.
**Algorithm details**: See [references/algorithm.md](references/algorithm.md) for tiling strategy, recomputation, and IO complexity analysis.
**Advanced features**: See [references/advanced-features.md](references/advanced-features.md) for rotary embeddings, ALiBi, paged KV cache, and custom attention masks.
## Hardware requirements
- **GPU**: NVIDIA Ampere+ (A100, A10, A30) or AMD MI200+
@@ -6,7 +6,6 @@ This directory contains comprehensive reference materials for SAELens.
- [api.md](api.md) - Complete API reference for SAE, TrainingSAE, and configuration classes
- [tutorials.md](tutorials.md) - Step-by-step tutorials for training and analyzing SAEs
- [papers.md](papers.md) - Key research papers on sparse autoencoders
## Quick Links
+2
View File
@@ -11,6 +11,8 @@ Achievement system for the Hermes Dashboard: collectible, tiered badges generate
The screenshots use temporary demo tier data to show the full visual range. The plugin itself reads real local Hermes session history by default.
> **Update notice (2026-04-29):** If you installed this plugin before today, update to the latest version. The achievements scan path was refactored for much faster warm loads (snapshot cache + incremental checkpoint scan).
>
> **Share cards (2026-05-04, vendored in hermes-agent v0.4.0):** Unlocked achievement cards now have a "Share" button that renders a 1200×630 PNG share card (client-side canvas, no backend, no network) with Download + Copy-to-clipboard actions. Fits X/Twitter, Discord, LinkedIn, Bluesky link-preview dimensions.
## What it does
+303 -2
View File
@@ -66,6 +66,296 @@
});
}
const TIER_HEX = {
"Copper": "#b87333",
"Silver": "#c0c7d2",
"Gold": "#f2c94c",
"Diamond": "#67e8f9",
"Olympian": "#c084fc",
};
function tierHex(tier) {
return TIER_HEX[tier] || "#67e8f9";
}
// Render a LUCIDE icon path fragment into a standalone SVG string with an
// explicit stroke color so it can be rasterized onto a <canvas> via Image.
// The normal render path uses stroke="currentColor" which browsers honor in
// DOM but NOT when the SVG is drawn to a canvas from a data URL.
function iconSvgForCanvas(iconKey, strokeColor) {
const paths = LUCIDE[iconKey] || LUCIDE.secret;
return "<svg xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 24 24\" fill=\"none\" " +
"stroke=\"" + strokeColor + "\" stroke-width=\"2\" stroke-linecap=\"round\" stroke-linejoin=\"round\">" +
paths + "</svg>";
}
function loadSvgImage(svgString) {
return new Promise(function (resolve, reject) {
const blob = new Blob([svgString], { type: "image/svg+xml;charset=utf-8" });
const url = URL.createObjectURL(blob);
const img = new Image();
img.onload = function () { URL.revokeObjectURL(url); resolve(img); };
img.onerror = function (e) { URL.revokeObjectURL(url); reject(e); };
img.src = url;
});
}
function wrapText(ctx, text, maxWidth) {
const words = String(text || "").split(/\s+/).filter(Boolean);
const lines = [];
let current = "";
for (let i = 0; i < words.length; i++) {
const candidate = current ? current + " " + words[i] : words[i];
if (ctx.measureText(candidate).width <= maxWidth) {
current = candidate;
} else {
if (current) lines.push(current);
current = words[i];
}
}
if (current) lines.push(current);
return lines;
}
// Build a 1200x630 share card PNG for a single achievement. Returns a Blob.
// Pure client-side render via Canvas2D — no external deps, no network.
async function buildShareImage(achievement) {
const W = 1200;
const H = 630;
const canvas = document.createElement("canvas");
canvas.width = W;
canvas.height = H;
const ctx = canvas.getContext("2d");
const tier = achievement.tier || achievement.next_tier || "Copper";
const color = tierHex(tier);
// Background: dark charcoal with a tier-tinted radial highlight on the
// top-left, echoing the card visual language.
ctx.fillStyle = "#0b0d11";
ctx.fillRect(0, 0, W, H);
const bgGrad = ctx.createRadialGradient(260, 220, 60, 260, 220, 820);
bgGrad.addColorStop(0, color + "33");
bgGrad.addColorStop(0.55, color + "0a");
bgGrad.addColorStop(1, "#0b0d1100");
ctx.fillStyle = bgGrad;
ctx.fillRect(0, 0, W, H);
// Outer border
ctx.strokeStyle = color + "66";
ctx.lineWidth = 2;
ctx.strokeRect(1, 1, W - 2, H - 2);
// Icon block — 380x380 on the left
try {
const svg = iconSvgForCanvas(achievement.icon || "secret", color);
const iconImg = await loadSvgImage(svg);
const ix = 90;
const iy = 125;
const isize = 380;
// Icon glow
ctx.save();
ctx.shadowColor = color;
ctx.shadowBlur = 40;
ctx.drawImage(iconImg, ix, iy, isize, isize);
ctx.restore();
} catch (_) {
// Icon render failure is non-fatal; card still useful without it.
}
// Right column text layout
const rx = 520;
const rMaxWidth = W - rx - 70;
// Category label (kicker)
ctx.fillStyle = "#8b95a8";
ctx.font = "600 22px ui-monospace, 'SF Mono', Menlo, monospace";
ctx.textBaseline = "top";
ctx.fillText((achievement.category || "").toUpperCase(), rx, 112);
// Achievement name — wrap to 2 lines if needed
ctx.fillStyle = "#ffffff";
ctx.font = "780 68px system-ui, -apple-system, 'Segoe UI', sans-serif";
const nameLines = wrapText(ctx, achievement.name || "Achievement", rMaxWidth).slice(0, 2);
let cursorY = 150;
for (let i = 0; i < nameLines.length; i++) {
ctx.fillText(nameLines[i], rx, cursorY);
cursorY += 76;
}
// Tier badge pill
const badgeLabel = tier.toUpperCase() + " TIER";
ctx.font = "700 22px ui-monospace, 'SF Mono', Menlo, monospace";
const badgeWidth = ctx.measureText(badgeLabel).width + 32;
const badgeX = rx;
const badgeY = cursorY + 14;
const badgeH = 40;
ctx.fillStyle = color + "1f";
ctx.strokeStyle = color;
ctx.lineWidth = 1.5;
ctx.beginPath();
ctx.rect(badgeX, badgeY, badgeWidth, badgeH);
ctx.fill();
ctx.stroke();
ctx.fillStyle = color;
ctx.textBaseline = "middle";
ctx.fillText(badgeLabel, badgeX + 16, badgeY + badgeH / 2 + 1);
ctx.textBaseline = "top";
// Description — wrap up to 3 lines
ctx.fillStyle = "#c3cad6";
ctx.font = "400 26px system-ui, -apple-system, 'Segoe UI', sans-serif";
const descLines = wrapText(ctx, achievement.description || "", rMaxWidth).slice(0, 3);
let descY = badgeY + badgeH + 28;
for (let i = 0; i < descLines.length; i++) {
ctx.fillText(descLines[i], rx, descY);
descY += 34;
}
// Progress / stat line (if meaningful)
const progressValue = achievement.progress;
const threshold = achievement.next_threshold;
let statLine = null;
if (progressValue && threshold) {
statLine = progressValue.toLocaleString() + " / " + threshold.toLocaleString();
} else if (progressValue) {
statLine = progressValue.toLocaleString();
}
if (statLine) {
ctx.fillStyle = color;
ctx.font = "700 28px ui-monospace, 'SF Mono', Menlo, monospace";
ctx.fillText(statLine, rx, descY + 14);
}
// Footer watermark
ctx.fillStyle = "#8b95a8";
ctx.font = "600 20px ui-monospace, 'SF Mono', Menlo, monospace";
ctx.textBaseline = "bottom";
ctx.fillText("HERMES AGENT · hermes-agent.nousresearch.com", 70, H - 40);
// "UNLOCKED" stamp upper-right
ctx.textBaseline = "top";
ctx.fillStyle = color;
ctx.font = "800 24px ui-monospace, 'SF Mono', Menlo, monospace";
const stamp = "◆ UNLOCKED";
const stampW = ctx.measureText(stamp).width;
ctx.fillText(stamp, W - 70 - stampW, 70);
return await new Promise(function (resolve, reject) {
canvas.toBlob(function (blob) {
if (blob) resolve(blob); else reject(new Error("canvas.toBlob returned null"));
}, "image/png");
});
}
function ShareDialog({ achievement, onClose }) {
const [status, setStatus] = hooks.useState("rendering"); // rendering | ready | copied | error
const [errorMsg, setErrorMsg] = hooks.useState(null);
const [previewUrl, setPreviewUrl] = hooks.useState(null);
const blobRef = React.useRef(null);
hooks.useEffect(function () {
let cancelled = false;
let createdUrl = null;
buildShareImage(achievement).then(function (blob) {
if (cancelled) return;
blobRef.current = blob;
createdUrl = URL.createObjectURL(blob);
setPreviewUrl(createdUrl);
setStatus("ready");
}).catch(function (err) {
if (cancelled) return;
setErrorMsg(String(err && err.message || err));
setStatus("error");
});
return function () {
cancelled = true;
if (createdUrl) URL.revokeObjectURL(createdUrl);
};
}, [achievement.id]);
function download() {
if (!blobRef.current) return;
const url = URL.createObjectURL(blobRef.current);
const a = document.createElement("a");
a.href = url;
a.download = "hermes-achievement-" + (achievement.id || "badge") + ".png";
document.body.appendChild(a);
a.click();
a.remove();
setTimeout(function () { URL.revokeObjectURL(url); }, 1000);
}
async function copyToClipboard() {
if (!blobRef.current) return;
try {
if (!navigator.clipboard || !window.ClipboardItem) {
throw new Error("Clipboard image copy not supported in this browser — use Download instead.");
}
await navigator.clipboard.write([
new window.ClipboardItem({ "image/png": blobRef.current }),
]);
setStatus("copied");
setTimeout(function () { setStatus("ready"); }, 1800);
} catch (err) {
setErrorMsg(String(err && err.message || err));
setStatus("error");
}
}
// Build the pre-filled tweet text. Keep it short so X doesn't truncate
// when the user hasn't attached the PNG yet — they'll copy-image and
// paste in the same flow.
function tweetText() {
const tierPart = achievement.tier ? (achievement.tier + " tier ") : "";
return "Just unlocked " + tierPart + "\"" + achievement.name + "\" in Hermes Agent ☤\n\n" +
"@NousResearch · https://hermes-agent.nousresearch.com";
}
function shareOnX() {
const url = "https://x.com/intent/post?text=" + encodeURIComponent(tweetText());
window.open(url, "_blank", "noopener,noreferrer");
}
return React.createElement("div", {
className: "ha-share-backdrop",
onClick: function (e) { if (e.target === e.currentTarget) onClose(); },
},
React.createElement("div", { className: "ha-share-dialog", role: "dialog", "aria-label": "Share achievement" },
React.createElement("div", { className: "ha-share-head" },
React.createElement("strong", null, "Share: " + achievement.name),
React.createElement("button", { className: "ha-share-close", onClick: onClose, "aria-label": "Close" }, "×")
),
React.createElement("div", { className: "ha-share-preview" },
status === "rendering" && React.createElement("div", { className: "ha-share-placeholder" }, "Rendering…"),
previewUrl && React.createElement("img", { src: previewUrl, alt: achievement.name + " share card" })
),
status === "error" && React.createElement("div", { className: "ha-share-error" }, errorMsg || "Something went wrong."),
React.createElement("div", { className: "ha-share-actions" },
React.createElement("button", {
className: "ha-share-btn ha-share-btn-primary",
onClick: shareOnX,
title: "Opens X with a pre-filled post",
}, "Share on X"),
React.createElement("button", {
className: "ha-share-btn",
onClick: copyToClipboard,
disabled: status !== "ready" && status !== "copied",
title: "Copy the image to paste into your post",
}, status === "copied" ? "Copied ✓" : "Copy image"),
React.createElement("button", {
className: "ha-share-btn",
onClick: download,
disabled: status !== "ready" && status !== "copied",
}, "Download PNG")
),
React.createElement("p", { className: "ha-share-hint" },
"Share on X opens a pre-filled post in a new tab. Click Copy image first if you want the 1200×630 badge attached — X lets you paste it right into the tweet composer. Download PNG saves the file for use anywhere."
)
)
);
}
function StatCard(props) {
return React.createElement(C.Card, { className: "ha-stat" },
React.createElement(C.CardContent, { className: "ha-stat-content" },
@@ -170,6 +460,7 @@
const targetTier = achievement.next_tier || achievement.tier;
const tierLabel = achievement.tier ? achievement.tier : (targetTier ? "Target " + targetTier : (state === "secret" ? "Hidden" : (unlocked ? "Complete" : "Objective")));
const progressText = state === "secret" ? "hidden" : (progress + (achievement.next_threshold ? " / " + achievement.next_threshold : ""));
const [shareOpen, setShareOpen] = hooks.useState(false);
return React.createElement(C.Card, { className: cn("ha-card", "ha-state-" + state, tierClass(achievement.tier || achievement.next_tier)) },
React.createElement(C.CardContent, { className: "ha-card-content" },
React.createElement("div", { className: "ha-card-head" },
@@ -180,7 +471,13 @@
),
React.createElement("div", { className: "ha-badges" },
React.createElement("span", { className: "ha-state-badge" }, stateLabel),
React.createElement("span", { className: "ha-tier-badge" }, tierLabel)
React.createElement("span", { className: "ha-tier-badge" }, tierLabel),
state === "unlocked" && React.createElement("button", {
className: "ha-share-trigger",
onClick: function () { setShareOpen(true); },
title: "Share this achievement",
"aria-label": "Share " + achievement.name,
}, "Share")
)
),
React.createElement("p", { className: "ha-description" }, achievement.description),
@@ -200,7 +497,11 @@
),
React.createElement("span", { className: "ha-progress-text" }, progressText)
)
)
),
shareOpen && React.createElement(ShareDialog, {
achievement: achievement,
onClose: function () { setShareOpen(false); },
})
);
}
+26
View File
@@ -118,3 +118,29 @@
.ha-scan-banner-text p { margin: .25rem 0 0; font-size: .78rem; line-height: 1.35; color: var(--color-muted-foreground); text-transform: none; letter-spacing: normal; }
.ha-scan-progress-track { height: .4rem; border: 1px solid color-mix(in srgb, #67e8f9 28%, var(--color-border)); background: rgba(0,0,0,.22); overflow: hidden; }
.ha-scan-progress-fill { height: 100%; background: linear-gradient(90deg, #67e8f9, color-mix(in srgb, #67e8f9 48%, white)); transition: width .4s ease-out; }
/* Share achievement trigger button on unlocked cards + modal dialog.
* Added to the vendored bundle (on top of the upstream PCinkusz base).
* Canvas rendering is pure client-side, no backend, no network.
*/
.ha-share-trigger { border: 1px solid color-mix(in srgb, var(--ha-tier) 58%, var(--color-border)); color: var(--ha-tier); background: color-mix(in srgb, var(--ha-tier) 8%, transparent); padding: .18rem .42rem; font-size: .66rem; text-transform: uppercase; letter-spacing: .08em; font-family: var(--font-mono, ui-monospace, monospace); cursor: pointer; margin-top: .05rem; transition: background .12s ease, border-color .12s ease; }
.ha-share-trigger:hover { background: color-mix(in srgb, var(--ha-tier) 20%, transparent); border-color: var(--ha-tier); }
.ha-share-trigger:focus-visible { outline: 2px solid var(--ha-tier); outline-offset: 2px; }
.ha-share-backdrop { position: fixed; inset: 0; z-index: 1000; background: rgba(4,6,10,.72); backdrop-filter: blur(6px); display: flex; align-items: center; justify-content: center; padding: 1.5rem; animation: ha-fade-in .14s ease-out; }
.ha-share-dialog { width: min(760px, 100%); max-height: calc(100vh - 3rem); overflow: auto; border: 1px solid color-mix(in srgb, var(--color-border) 70%, var(--color-ring)); background: color-mix(in srgb, var(--color-card) 94%, #000); box-shadow: 0 24px 60px rgba(0,0,0,.55); display: flex; flex-direction: column; gap: .9rem; padding: 1rem 1.1rem 1.1rem; }
.ha-share-head { display: flex; align-items: center; justify-content: space-between; gap: .75rem; }
.ha-share-head strong { font-size: .82rem; text-transform: uppercase; letter-spacing: .1em; font-family: var(--font-mono, ui-monospace, monospace); color: var(--color-foreground); }
.ha-share-close { width: 1.9rem; height: 1.9rem; display: grid; place-items: center; border: 1px solid var(--color-border); background: transparent; color: var(--color-muted-foreground); font-size: 1.1rem; cursor: pointer; line-height: 1; }
.ha-share-close:hover { color: var(--color-foreground); border-color: var(--color-ring); }
.ha-share-preview { position: relative; border: 1px solid var(--color-border); background: #0b0d11; overflow: hidden; aspect-ratio: 1200 / 630; }
.ha-share-preview img { display: block; width: 100%; height: 100%; object-fit: contain; }
.ha-share-placeholder { position: absolute; inset: 0; display: grid; place-items: center; color: var(--color-muted-foreground); font-family: var(--font-mono, ui-monospace, monospace); font-size: .82rem; text-transform: uppercase; letter-spacing: .1em; animation: ha-pulse 1.4s ease-in-out infinite; border-radius: 0; }
.ha-share-error { border: 1px solid #ef4444; color: #fecaca; background: color-mix(in srgb, #ef4444 10%, transparent); padding: .55rem .7rem; font-size: .78rem; font-family: var(--font-mono, ui-monospace, monospace); }
.ha-share-actions { display: flex; gap: .55rem; flex-wrap: wrap; }
.ha-share-btn { border: 1px solid var(--color-border); background: color-mix(in srgb, var(--color-card) 72%, transparent); color: var(--color-foreground); padding: .5rem .85rem; font-size: .82rem; font-family: var(--font-mono, ui-monospace, monospace); text-transform: uppercase; letter-spacing: .08em; cursor: pointer; transition: border-color .12s ease, background .12s ease; }
.ha-share-btn:hover:not(:disabled) { border-color: var(--color-ring); background: color-mix(in srgb, var(--color-primary) 16%, var(--color-card)); }
.ha-share-btn:disabled { opacity: .5; cursor: not-allowed; }
.ha-share-btn-primary { border-color: #ffffff; color: #ffffff; background: #000000; }
.ha-share-btn-primary:hover:not(:disabled) { background: #1a1a1a; border-color: #67e8f9; color: #67e8f9; }
.ha-share-hint { margin: 0; color: var(--color-muted-foreground); font-size: .76rem; line-height: 1.45; }
@@ -3,7 +3,7 @@
"label": "Achievements",
"description": "Steam-style achievements for vibe coding and agentic Hermes workflows.",
"icon": "Star",
"version": "0.3.1",
"version": "0.4.0",
"tab": { "path": "/achievements", "position": "after:analytics" },
"entry": "dist/index.js",
"css": "dist/style.css",
+4 -3
View File
@@ -203,11 +203,12 @@ class XAIImageGenProvider(ImageGenProvider):
)
response.raise_for_status()
except requests.HTTPError as exc:
status = exc.response.status_code if exc.response else 0
response = exc.response
status = response.status_code if response is not None else 0
try:
err_msg = exc.response.json().get("error", {}).get("message", exc.response.text[:300])
err_msg = response.json().get("error", {}).get("message", response.text[:300])
except Exception:
err_msg = exc.response.text[:300] if exc.response else str(exc)
err_msg = response.text[:300] if response is not None else str(exc)
logger.error("xAI image gen failed (%d): %s", status, err_msg)
return error_response(
error=f"xAI image generation failed ({status}): {err_msg}",
+1010 -52
View File
File diff suppressed because it is too large Load Diff
+526 -14
View File
@@ -268,7 +268,7 @@
}
.hermes-kanban-drawer {
width: min(480px, 92vw);
width: min(var(--hermes-kanban-drawer-width, 640px), 92vw);
height: 100vh;
background: var(--color-card);
border-left: 1px solid var(--color-border);
@@ -334,7 +334,7 @@
.hermes-kanban-meta-row {
display: flex;
gap: 0.5rem;
font-size: 0.72rem;
font-size: 0.8rem;
}
.hermes-kanban-meta-label {
width: 92px;
@@ -351,6 +351,30 @@
gap: 0.3rem;
}
/* ---- Home channel subscription toggles (per-platform, per-task) ----- */
.hermes-kanban-home-subs {
display: flex;
flex-wrap: wrap;
gap: 0.3rem;
}
.hermes-kanban-home-sub {
font-family: var(--font-mono, ui-monospace, monospace);
text-transform: lowercase;
letter-spacing: 0.02em;
}
.hermes-kanban-home-sub--on {
/* Subscribed toggle use a strong ring-colored accent so the on/off
* distinction reads at a glance, not just from the prefix. Border +
* filled background + bolder weight keep the state obvious across
* themes (tested on default teal and NERV orange). */
border-color: var(--color-ring);
background: color-mix(in srgb, var(--color-ring) 32%, transparent);
color: var(--color-foreground);
font-weight: 600;
box-shadow: inset 0 0 0 1px color-mix(in srgb, var(--color-ring) 40%, transparent);
}
.hermes-kanban-section {
display: flex;
flex-direction: column;
@@ -367,14 +391,15 @@
.hermes-kanban-pre {
margin: 0;
padding: 0.45rem 0.55rem;
padding: 0.5rem 0.6rem;
white-space: pre-wrap;
word-break: break-word;
background: color-mix(in srgb, var(--color-foreground) 4%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.72rem;
font-size: 0.8rem;
line-height: 1.5;
color: var(--color-foreground);
}
@@ -605,8 +630,8 @@
/* ---- Markdown rendering -------------------------------------------- */
.hermes-kanban-md {
font-size: 0.8rem;
line-height: 1.55;
font-size: 0.85rem;
line-height: 1.6;
color: var(--color-foreground);
}
.hermes-kanban-md p { margin: 0.25rem 0; }
@@ -632,15 +657,22 @@
}
.hermes-kanban-md code {
font-family: var(--font-mono, ui-monospace, monospace);
font-size: 0.75rem;
font-size: 0.8rem;
padding: 0.05rem 0.3rem;
background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
border-radius: 3px;
color: inherit;
}
/* Fenced code block. Set a visible background even when --color-foreground
* is empty (color-mix falls through to transparent in that case), and force
* color: inherit so the text tracks the drawer foreground rather than the
* UA default on <code> elements otherwise themes that don't set
* --color-foreground leave code text rendering near-black on dark themes
* (see issue #18576). */
.hermes-kanban-md-code {
margin: 0.35rem 0;
padding: 0.5rem 0.6rem;
background: color-mix(in srgb, var(--color-foreground) 5%, transparent);
background: color-mix(in srgb, currentColor 6%, transparent);
border: 1px solid var(--color-border);
border-radius: var(--radius-sm, 0.25rem);
overflow-x: auto;
@@ -648,8 +680,9 @@
.hermes-kanban-md-code code {
background: transparent;
padding: 0;
font-size: 0.75rem;
font-size: 0.8rem;
white-space: pre;
color: inherit;
}
.hermes-kanban-md strong { font-weight: 600; }
@@ -684,11 +717,11 @@
/* ---- Worker log pane ------------------------------------------------ */
.hermes-kanban-log {
max-height: 340px;
max-height: 360px;
overflow: auto;
white-space: pre;
font-size: 0.7rem;
line-height: 1.45;
font-size: 0.78rem;
line-height: 1.5;
}
@@ -739,7 +772,8 @@
color: var(--color-muted-foreground);
}
.hermes-kanban-run-summary {
font-size: 0.75rem;
font-size: 0.82rem;
line-height: 1.5;
padding: 0.2rem 0 0;
color: var(--color-foreground);
}
@@ -751,10 +785,488 @@
}
.hermes-kanban-run-meta {
display: block;
font-size: 0.65rem;
font-size: 0.72rem;
line-height: 1.5;
padding: 0.15rem 0 0;
color: var(--color-muted-foreground);
white-space: pre-wrap;
word-break: break-word;
font-family: var(--font-mono, ui-monospace, monospace);
}
/* -------------------------------------------------------------------------
Multi-project: board switcher + create-board dialog
------------------------------------------------------------------------- */
.hermes-kanban-boardswitcher {
border: 1px solid var(--color-border, rgba(120, 120, 140, 0.25));
border-radius: 0.5rem;
padding: 0.6rem 0.85rem;
background: var(--color-card-subtle, rgba(255, 255, 255, 0.02));
}
.hermes-kanban-boardswitcher-inner {
display: flex;
align-items: flex-end;
gap: 0.75rem;
flex-wrap: wrap;
}
.hermes-kanban-boardswitcher-compact {
display: flex;
justify-content: flex-end;
padding: 0 0.25rem;
}
.hermes-kanban-dialog-backdrop {
position: fixed;
inset: 0;
background: rgba(8, 10, 16, 0.55);
backdrop-filter: blur(2px);
z-index: 60;
display: flex;
align-items: center;
justify-content: center;
}
.hermes-kanban-dialog {
background: var(--color-card, #121421);
color: var(--color-foreground);
border: 1px solid var(--color-border, rgba(120, 120, 140, 0.25));
border-radius: 0.5rem;
padding: 1.1rem 1.2rem 1rem;
width: 28rem;
max-width: calc(100vw - 2rem);
max-height: calc(100vh - 3rem);
overflow: auto;
box-shadow: 0 18px 40px rgba(0, 0, 0, 0.5);
}
.hermes-kanban-dialog-title {
font-size: 1rem;
font-weight: 600;
margin-bottom: 0.25rem;
}
.hermes-kanban-dialog-actions {
display: flex;
justify-content: flex-end;
gap: 0.5rem;
margin-top: 1rem;
}
/* ---------------------------------------------------------------------- */
/* Hallucination warnings: per-card badge, events callout, attention */
/* strip, recovery popover. Orange/red palette but muted so the board */
/* doesn't scream on every render. */
/* ---------------------------------------------------------------------- */
.hermes-kanban-warning-badge {
display: inline-flex;
align-items: center;
justify-content: center;
font-size: 0.75rem;
color: #ff9e3b;
margin-left: 0.25rem;
cursor: help;
}
/* Attention strip — collapsed state is a thin bar. */
.hermes-kanban-attention {
border: 1px solid rgba(255, 158, 59, 0.35);
background: rgba(255, 158, 59, 0.06);
border-radius: 0.5rem;
overflow: hidden;
}
.hermes-kanban-attention-bar {
display: flex;
align-items: center;
gap: 0.5rem;
padding: 0.4rem 0.75rem;
font-size: 0.8125rem;
}
.hermes-kanban-attention-icon { color: #ff9e3b; font-size: 1rem; }
.hermes-kanban-attention-text { flex: 1; }
.hermes-kanban-attention-toggle,
.hermes-kanban-attention-dismiss,
.hermes-kanban-attention-row-btn {
background: transparent;
border: 1px solid rgba(120, 120, 140, 0.3);
border-radius: 0.3rem;
padding: 0.15rem 0.55rem;
font-size: 0.75rem;
color: inherit;
cursor: pointer;
}
.hermes-kanban-attention-toggle:hover,
.hermes-kanban-attention-dismiss:hover,
.hermes-kanban-attention-row-btn:hover {
background: rgba(255, 158, 59, 0.12);
}
.hermes-kanban-attention-list {
border-top: 1px solid rgba(255, 158, 59, 0.2);
padding: 0.25rem 0;
}
.hermes-kanban-attention-row {
display: flex;
align-items: center;
gap: 0.5rem;
padding: 0.3rem 0.75rem;
font-size: 0.8125rem;
}
.hermes-kanban-attention-row:hover {
background: rgba(255, 158, 59, 0.08);
}
.hermes-kanban-attention-row-id {
font-family: ui-monospace, SFMono-Regular, monospace;
font-size: 0.75rem;
color: var(--color-muted-foreground, #888);
min-width: 7rem;
}
.hermes-kanban-attention-row-title {
flex: 1;
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
}
.hermes-kanban-attention-row-meta {
font-size: 0.75rem;
color: var(--color-muted-foreground, #888);
}
/* Events tab — callout style for hallucination events. */
.hermes-kanban-event--hallucination {
border-left: 3px solid #ff6b6b;
background: rgba(255, 107, 107, 0.08);
padding: 0.5rem 0.65rem;
border-radius: 0.35rem;
margin: 0.25rem 0;
}
.hermes-kanban-event-header,
.hermes-kanban-event-header-plain {
display: flex;
align-items: center;
gap: 0.5rem;
}
.hermes-kanban-event-warning-icon { color: #ff6b6b; font-size: 1rem; }
.hermes-kanban-event-warning-label {
color: #ff6b6b;
font-weight: 600;
font-size: 0.8125rem;
}
.hermes-kanban-event-phantom-row {
display: flex;
align-items: center;
gap: 0.4rem;
flex-wrap: wrap;
margin-top: 0.3rem;
padding-left: 1.35rem;
}
.hermes-kanban-event-phantom-label {
font-size: 0.75rem;
color: var(--color-muted-foreground, #999);
}
.hermes-kanban-event-phantom-chip {
font-family: ui-monospace, SFMono-Regular, monospace;
font-size: 0.75rem;
padding: 0.1rem 0.4rem;
background: rgba(255, 107, 107, 0.15);
border: 1px solid rgba(255, 107, 107, 0.3);
border-radius: 0.3rem;
}
/* Recovery section header — amber accent when the task has warnings. */
.hermes-kanban-section-head-warning { color: #ff9e3b; }
.hermes-kanban-section-head-row {
display: flex;
align-items: center;
justify-content: space-between;
gap: 0.5rem;
}
.hermes-kanban-section-toggle {
background: transparent;
border: 1px solid rgba(120, 120, 140, 0.3);
border-radius: 0.3rem;
padding: 0.15rem 0.55rem;
font-size: 0.75rem;
color: inherit;
cursor: pointer;
}
/* Recovery popover body. */
.hermes-kanban-recovery {
border: 1px solid rgba(120, 120, 140, 0.25);
background: rgba(255, 158, 59, 0.04);
border-radius: 0.5rem;
padding: 0.75rem;
display: flex;
flex-direction: column;
gap: 0.75rem;
}
.hermes-kanban-recovery-title {
font-weight: 600;
font-size: 0.8125rem;
}
.hermes-kanban-recovery-hint {
font-size: 0.75rem;
color: var(--color-muted-foreground, #888);
line-height: 1.35;
}
.hermes-kanban-recovery-section {
display: flex;
flex-direction: column;
gap: 0.35rem;
}
.hermes-kanban-recovery-label {
font-size: 0.75rem;
color: var(--color-muted-foreground, #888);
}
.hermes-kanban-recovery-input,
.hermes-kanban-recovery-select {
padding: 0.25rem 0.4rem;
font-size: 0.8125rem;
background: rgba(0, 0, 0, 0.15);
border: 1px solid rgba(120, 120, 140, 0.3);
border-radius: 0.3rem;
color: inherit;
outline: none;
}
.hermes-kanban-recovery-action-row {
display: flex;
align-items: center;
gap: 0.5rem;
flex-wrap: wrap;
}
.hermes-kanban-recovery-action-label {
font-size: 0.8125rem;
font-weight: 600;
min-width: 8rem;
}
.hermes-kanban-recovery-action-desc {
flex: 1;
font-size: 0.75rem;
color: var(--color-muted-foreground, #888);
}
.hermes-kanban-recovery-btn {
padding: 0.25rem 0.7rem;
font-size: 0.75rem;
background: rgba(255, 158, 59, 0.15);
border: 1px solid rgba(255, 158, 59, 0.4);
border-radius: 0.3rem;
color: inherit;
cursor: pointer;
}
.hermes-kanban-recovery-btn:hover:not(:disabled) {
background: rgba(255, 158, 59, 0.25);
}
.hermes-kanban-recovery-btn:disabled {
opacity: 0.4;
cursor: not-allowed;
}
.hermes-kanban-recovery-reassign-row {
display: flex;
align-items: center;
gap: 0.5rem;
flex-wrap: wrap;
}
.hermes-kanban-recovery-checkbox {
font-size: 0.75rem;
display: inline-flex;
align-items: center;
gap: 0.25rem;
}
.hermes-kanban-recovery-cmd-row {
display: flex;
align-items: center;
gap: 0.5rem;
flex-wrap: wrap;
}
.hermes-kanban-recovery-cmd {
font-family: ui-monospace, SFMono-Regular, monospace;
font-size: 0.75rem;
padding: 0.2rem 0.5rem;
background: rgba(0, 0, 0, 0.2);
border: 1px solid rgba(120, 120, 140, 0.3);
border-radius: 0.3rem;
flex: 1;
min-width: 10rem;
overflow-x: auto;
white-space: nowrap;
}
.hermes-kanban-recovery-msg {
font-size: 0.75rem;
padding: 0.35rem 0.5rem;
border-radius: 0.3rem;
}
.hermes-kanban-recovery-msg--ok {
background: rgba(120, 200, 120, 0.12);
color: #6bc46b;
border: 1px solid rgba(120, 200, 120, 0.3);
}
.hermes-kanban-recovery-msg--err {
background: rgba(255, 107, 107, 0.12);
color: #ff8b8b;
border: 1px solid rgba(255, 107, 107, 0.3);
}
/* ---------------------------------------------------------------------- */
/* Diagnostics — generic, severity-coloured distress signals on tasks. */
/* Three rungs: warning (amber), error (orange), critical (red). */
/* ---------------------------------------------------------------------- */
/* Severity token variables so every diagnostic-coloured surface uses the */
/* same palette. */
.hermes-kanban-diag,
.hermes-kanban-attention,
.hermes-kanban-warning-badge,
.hermes-kanban-attention-row {
--hermes-diag-warning: #ff9e3b;
--hermes-diag-error: #ff6b3d;
--hermes-diag-critical: #ff4d4d;
}
/* Warning-badge severity variants (overrides the base colour). */
.hermes-kanban-warning-badge--warning { color: var(--hermes-diag-warning); }
.hermes-kanban-warning-badge--error { color: var(--hermes-diag-error); font-weight: 700; }
.hermes-kanban-warning-badge--critical { color: var(--hermes-diag-critical); font-weight: 700; }
/* Attention-strip severity variants. */
.hermes-kanban-attention--warning {
border-color: rgba(255, 158, 59, 0.35);
background: rgba(255, 158, 59, 0.06);
}
.hermes-kanban-attention--error {
border-color: rgba(255, 107, 61, 0.45);
background: rgba(255, 107, 61, 0.08);
}
.hermes-kanban-attention--critical {
border-color: rgba(255, 77, 77, 0.55);
background: rgba(255, 77, 77, 0.10);
}
.hermes-kanban-attention--error .hermes-kanban-attention-icon { color: var(--hermes-diag-error); }
.hermes-kanban-attention--critical .hermes-kanban-attention-icon { color: var(--hermes-diag-critical); }
/* Per-row severity marker in the expanded attention list. */
.hermes-kanban-attention-row-sev {
display: inline-block;
min-width: 1.5rem;
font-weight: 600;
}
.hermes-kanban-attention-row--warning .hermes-kanban-attention-row-sev { color: var(--hermes-diag-warning); }
.hermes-kanban-attention-row--error .hermes-kanban-attention-row-sev { color: var(--hermes-diag-error); font-weight: 700; }
.hermes-kanban-attention-row--critical .hermes-kanban-attention-row-sev { color: var(--hermes-diag-critical); font-weight: 700; }
/* Individual diagnostic card inside the drawer's Diagnostics section. */
.hermes-kanban-diag-list {
display: flex;
flex-direction: column;
gap: 0.6rem;
}
.hermes-kanban-diag {
border-left: 3px solid var(--hermes-diag-warning);
background: rgba(255, 158, 59, 0.05);
border-radius: 0.35rem;
padding: 0.6rem 0.75rem;
display: flex;
flex-direction: column;
gap: 0.4rem;
}
.hermes-kanban-diag--error {
border-left-color: var(--hermes-diag-error);
background: rgba(255, 107, 61, 0.06);
}
.hermes-kanban-diag--critical {
border-left-color: var(--hermes-diag-critical);
background: rgba(255, 77, 77, 0.07);
}
.hermes-kanban-diag-header {
display: flex;
align-items: center;
gap: 0.5rem;
}
.hermes-kanban-diag-sev {
font-weight: 700;
min-width: 1.5rem;
}
.hermes-kanban-diag--warning .hermes-kanban-diag-sev { color: var(--hermes-diag-warning); }
.hermes-kanban-diag--error .hermes-kanban-diag-sev { color: var(--hermes-diag-error); }
.hermes-kanban-diag--critical .hermes-kanban-diag-sev { color: var(--hermes-diag-critical); }
.hermes-kanban-diag-title {
font-weight: 600;
font-size: 0.875rem;
}
.hermes-kanban-diag-detail {
font-size: 0.8125rem;
color: var(--color-foreground, #ccc);
line-height: 1.4;
}
.hermes-kanban-diag-data {
display: flex;
flex-direction: column;
gap: 0.2rem;
font-size: 0.75rem;
}
.hermes-kanban-diag-data-row {
display: flex;
align-items: center;
gap: 0.35rem;
flex-wrap: wrap;
}
.hermes-kanban-diag-data-key {
color: var(--color-muted-foreground, #888);
font-weight: 500;
}
.hermes-kanban-diag-data-val {
font-family: ui-monospace, SFMono-Regular, monospace;
}
.hermes-kanban-diag-reassign-row {
display: flex;
align-items: center;
gap: 0.4rem;
font-size: 0.75rem;
}
.hermes-kanban-diag-reassign-label {
color: var(--color-muted-foreground, #888);
}
.hermes-kanban-diag-actions {
display: flex;
flex-wrap: wrap;
gap: 0.4rem;
margin-top: 0.1rem;
}
.hermes-kanban-diag-action-btn {
padding: 0.25rem 0.6rem;
font-size: 0.75rem;
background: rgba(0, 0, 0, 0.2);
border: 1px solid rgba(120, 120, 140, 0.3);
border-radius: 0.3rem;
color: inherit;
cursor: pointer;
text-decoration: none;
}
.hermes-kanban-diag-action-btn:hover:not(:disabled) {
background: rgba(0, 0, 0, 0.3);
}
.hermes-kanban-diag-action-btn:disabled {
opacity: 0.4;
cursor: not-allowed;
}
.hermes-kanban-diag-action-btn--suggested {
background: rgba(255, 158, 59, 0.15);
border-color: rgba(255, 158, 59, 0.4);
font-weight: 600;
}
.hermes-kanban-diag-action-btn--suggested:hover:not(:disabled) {
background: rgba(255, 158, 59, 0.25);
}
.hermes-kanban-diag-action-btn--unknown {
opacity: 0.6;
cursor: default;
}
.hermes-kanban-diag-msg {
font-size: 0.75rem;
padding: 0.35rem 0.5rem;
border-radius: 0.3rem;
}
.hermes-kanban-diag-msg--ok {
background: rgba(120, 200, 120, 0.12);
color: #6bc46b;
border: 1px solid rgba(120, 200, 120, 0.3);
}
.hermes-kanban-diag-msg--err {
background: rgba(255, 107, 61, 0.12);
color: #ff8b6b;
border: 1px solid rgba(255, 107, 61, 0.3);
}
+675 -33
View File
@@ -72,19 +72,45 @@ def _check_ws_token(provided: Optional[str]) -> bool:
return hmac.compare_digest(str(provided), str(expected))
def _conn():
def _resolve_board(board: Optional[str]) -> Optional[str]:
"""Validate and normalise a board slug from a query param.
Raises :class:`HTTPException` 400 on malformed slugs so the browser
sees a clean error instead of a 500. Returns the normalised slug,
or ``None`` when the caller omitted the param (which then falls
through to the active board inside ``kb.connect()``).
"""
if board is None or board == "":
return None
try:
normed = kanban_db._normalize_board_slug(board)
except ValueError as exc:
raise HTTPException(status_code=400, detail=str(exc))
if normed and normed != kanban_db.DEFAULT_BOARD and not kanban_db.board_exists(normed):
raise HTTPException(
status_code=404,
detail=f"board {normed!r} does not exist",
)
return normed
def _conn(board: Optional[str] = None):
"""Open a kanban_db connection, creating the schema on first use.
Every handler that mutates the DB goes through this so the plugin
self-heals on a fresh install (no user-visible "no such table"
error if somebody hits POST /tasks before GET /board).
``init_db`` is idempotent.
``board`` is the query-param slug (already normalised by
:func:`_resolve_board`). When ``None`` the active board is used
via the resolution chain (env var ``current`` file ``default``).
"""
try:
kanban_db.init_db()
kanban_db.init_db(board=board)
except Exception as exc:
log.warning("kanban init_db failed: %s", exc)
return kanban_db.connect()
return kanban_db.connect(board=board)
# ---------------------------------------------------------------------------
@@ -150,6 +176,120 @@ def _run_dict(r: kanban_db.Run) -> dict[str, Any]:
}
# Hallucination-warning event kinds — see complete_task() in kanban_db.py.
# completion_blocked_hallucination: kernel rejected created_cards with
# phantom ids; task stays in prior state.
# suspected_hallucinated_references: prose scan found t_<hex> in summary
# that doesn't resolve; completion succeeded, advisory only.
_WARNING_EVENT_KINDS = (
"completion_blocked_hallucination",
"suspected_hallucinated_references",
)
def _compute_task_diagnostics(
conn: sqlite3.Connection,
task_ids: Optional[list[str]] = None,
) -> dict[str, list[dict]]:
"""Run the diagnostic rule engine against every task (or a subset)
and return ``{task_id: [diagnostic_dict, ...]}``.
Tasks with no active diagnostics are omitted from the result.
Uses ``hermes_cli.kanban_diagnostics`` see that module for the
rule definitions.
"""
from hermes_cli import kanban_diagnostics as kd
# Build the candidate task list. We need each task's row + its
# events + its runs. Doing N separate queries works but scales
# poorly; do three aggregate queries instead.
if task_ids is not None:
if not task_ids:
return {}
placeholders = ",".join(["?"] * len(task_ids))
rows = conn.execute(
f"SELECT * FROM tasks WHERE id IN ({placeholders})",
tuple(task_ids),
).fetchall()
else:
rows = conn.execute(
"SELECT * FROM tasks WHERE status != 'archived'",
).fetchall()
if not rows:
return {}
# Index events + runs by task id. For very large boards this will
# slurp a lot — acceptable on the dashboard's typical working set
# (hundreds of tasks), but we can add pagination / filtering later
# if profiling shows it's a hotspot.
row_ids = [r["id"] for r in rows]
placeholders = ",".join(["?"] * len(row_ids))
events_by_task: dict[str, list] = {tid: [] for tid in row_ids}
for ev_row in conn.execute(
f"SELECT * FROM task_events WHERE task_id IN ({placeholders}) ORDER BY id",
tuple(row_ids),
).fetchall():
events_by_task.setdefault(ev_row["task_id"], []).append(ev_row)
runs_by_task: dict[str, list] = {tid: [] for tid in row_ids}
for run_row in conn.execute(
f"SELECT * FROM task_runs WHERE task_id IN ({placeholders}) ORDER BY id",
tuple(row_ids),
).fetchall():
runs_by_task.setdefault(run_row["task_id"], []).append(run_row)
out: dict[str, list[dict]] = {}
for r in rows:
tid = r["id"]
diags = kd.compute_task_diagnostics(
r,
events_by_task.get(tid, []),
runs_by_task.get(tid, []),
)
if diags:
out[tid] = [d.to_dict() for d in diags]
return out
def _warnings_summary_from_diagnostics(
diagnostics: list[dict],
) -> Optional[dict]:
"""Compact summary for cards: {count, highest_severity, kinds,
latest_at}. Replaces the old hallucination-only ``warnings`` object
same shape additions plus ``highest_severity`` so the UI can color
badges per diagnostic severity.
Returns None when ``diagnostics`` is empty.
"""
if not diagnostics:
return None
from hermes_cli.kanban_diagnostics import SEVERITY_ORDER
kinds: dict[str, int] = {}
latest = 0
highest_idx = -1
highest_sev: Optional[str] = None
count = 0
for d in diagnostics:
kinds[d["kind"]] = kinds.get(d["kind"], 0) + d.get("count", 1)
count += d.get("count", 1)
la = d.get("last_seen_at") or 0
if la > latest:
latest = la
sev = d.get("severity")
if sev in SEVERITY_ORDER:
idx = SEVERITY_ORDER.index(sev)
if idx > highest_idx:
highest_idx = idx
highest_sev = sev
return {
"count": count,
"kinds": kinds,
"latest_at": latest,
"highest_severity": highest_sev,
}
def _links_for(conn: sqlite3.Connection, task_id: str) -> dict[str, list[str]]:
"""Return {'parents': [...], 'children': [...]} for a task."""
parents = [
@@ -177,13 +317,19 @@ def _links_for(conn: sqlite3.Connection, task_id: str) -> dict[str, list[str]]:
def get_board(
tenant: Optional[str] = Query(None, description="Filter to a single tenant"),
include_archived: bool = Query(False),
board: Optional[str] = Query(None, description="Kanban board slug (omit for current)"),
):
"""Return the full board grouped by status column.
``_conn()`` auto-initializes ``kanban.db`` on first call so a fresh
install doesn't surface a "failed to load" error on the plugin tab.
``board`` selects which board to read from. Omitting it falls
through to the active board (``HERMES_KANBAN_BOARD`` env on-disk
``current`` pointer ``default``).
"""
conn = _conn()
board = _resolve_board(board)
conn = _conn(board=board)
try:
tasks = kanban_db.list_tasks(
conn, tenant=tenant, include_archived=include_archived
@@ -221,6 +367,12 @@ def get_board(
if row["cstatus"] == "done":
p["done"] += 1
# Diagnostics rollup for this board — see kanban_diagnostics.
# We get the full structured list per task AND a compact
# summary for the card badge (so cards don't carry the detail
# text; the drawer fetches that via /tasks/:id or /diagnostics).
diagnostics_per_task = _compute_task_diagnostics(conn, task_ids=None)
latest_event_id = conn.execute(
"SELECT COALESCE(MAX(id), 0) AS m FROM task_events"
).fetchone()["m"]
@@ -234,6 +386,13 @@ def get_board(
d["link_counts"] = link_counts.get(t.id, {"parents": 0, "children": 0})
d["comment_count"] = comment_counts.get(t.id, 0)
d["progress"] = progress.get(t.id) # None when the task has no children
diags = diagnostics_per_task.get(t.id)
if diags:
# Full list goes into the payload so the drawer can render
# without a second round-trip. The board-level badge only
# needs the summary.
d["diagnostics"] = diags
d["warnings"] = _warnings_summary_from_diagnostics(diags)
col = t.status if t.status in columns else "todo"
columns[col].append(d)
@@ -274,14 +433,23 @@ def get_board(
# ---------------------------------------------------------------------------
@router.get("/tasks/{task_id}")
def get_task(task_id: str):
conn = _conn()
def get_task(task_id: str, board: Optional[str] = Query(None)):
board = _resolve_board(board)
conn = _conn(board=board)
try:
task = kanban_db.get_task(conn, task_id)
if task is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
task_d = _task_dict(task)
# Attach diagnostics so the drawer's Diagnostics section can
# render recovery actions without a second round-trip.
diags = _compute_task_diagnostics(conn, task_ids=[task_id])
diag_list = diags.get(task_id) or []
if diag_list:
task_d["diagnostics"] = diag_list
task_d["warnings"] = _warnings_summary_from_diagnostics(diag_list)
return {
"task": _task_dict(task),
"task": task_d,
"comments": [_comment_dict(c) for c in kanban_db.list_comments(conn, task_id)],
"events": [_event_dict(e) for e in kanban_db.list_events(conn, task_id)],
"links": _links_for(conn, task_id),
@@ -311,8 +479,9 @@ class CreateTaskBody(BaseModel):
@router.post("/tasks")
def create_task(payload: CreateTaskBody):
conn = _conn()
def create_task(payload: CreateTaskBody, board: Optional[str] = Query(None)):
board = _resolve_board(board)
conn = _conn(board=board)
try:
task_id = kanban_db.create_task(
conn,
@@ -373,8 +542,9 @@ class UpdateTaskBody(BaseModel):
@router.patch("/tasks/{task_id}")
def update_task(task_id: str, payload: UpdateTaskBody):
conn = _conn()
def update_task(task_id: str, payload: UpdateTaskBody, board: Optional[str] = Query(None)):
board = _resolve_board(board)
conn = _conn(board=board)
try:
task = kanban_db.get_task(conn, task_id)
if task is None:
@@ -414,7 +584,12 @@ def update_task(task_id: str, payload: UpdateTaskBody):
ok = _set_status_direct(conn, task_id, "ready")
elif s == "archived":
ok = kanban_db.archive_task(conn, task_id)
elif s in ("todo", "running", "triage"):
elif s == "running":
raise HTTPException(
status_code=400,
detail="Cannot set status to 'running' directly; use the dispatcher/claim path",
)
elif s in ("todo", "triage"):
ok = _set_status_direct(conn, task_id, s)
else:
raise HTTPException(status_code=400, detail=f"unknown status: {s}")
@@ -527,10 +702,11 @@ class CommentBody(BaseModel):
@router.post("/tasks/{task_id}/comments")
def add_comment(task_id: str, payload: CommentBody):
def add_comment(task_id: str, payload: CommentBody, board: Optional[str] = Query(None)):
if not payload.body.strip():
raise HTTPException(status_code=400, detail="body is required")
conn = _conn()
board = _resolve_board(board)
conn = _conn(board=board)
try:
if kanban_db.get_task(conn, task_id) is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
@@ -552,8 +728,9 @@ class LinkBody(BaseModel):
@router.post("/links")
def add_link(payload: LinkBody):
conn = _conn()
def add_link(payload: LinkBody, board: Optional[str] = Query(None)):
board = _resolve_board(board)
conn = _conn(board=board)
try:
kanban_db.link_tasks(conn, payload.parent_id, payload.child_id)
return {"ok": True}
@@ -564,8 +741,13 @@ def add_link(payload: LinkBody):
@router.delete("/links")
def delete_link(parent_id: str = Query(...), child_id: str = Query(...)):
conn = _conn()
def delete_link(
parent_id: str = Query(...),
child_id: str = Query(...),
board: Optional[str] = Query(None),
):
board = _resolve_board(board)
conn = _conn(board=board)
try:
ok = kanban_db.unlink_tasks(conn, parent_id, child_id)
return {"ok": bool(ok)}
@@ -583,10 +765,13 @@ class BulkTaskBody(BaseModel):
assignee: Optional[str] = None # "" or None = unassign
priority: Optional[int] = None
archive: bool = False
result: Optional[str] = None
summary: Optional[str] = None
metadata: Optional[dict] = None
@router.post("/tasks/bulk")
def bulk_update(payload: BulkTaskBody):
def bulk_update(payload: BulkTaskBody, board: Optional[str] = Query(None)):
"""Apply the same patch to every id in ``payload.ids``.
This is an *independent* iteration per-task failures don't abort
@@ -596,7 +781,8 @@ def bulk_update(payload: BulkTaskBody):
if not ids:
raise HTTPException(status_code=400, detail="ids is required")
results: list[dict] = []
conn = _conn()
board = _resolve_board(board)
conn = _conn(board=board)
try:
for tid in ids:
entry: dict[str, Any] = {"id": tid, "ok": True}
@@ -612,7 +798,12 @@ def bulk_update(payload: BulkTaskBody):
if payload.status is not None and not payload.archive:
s = payload.status
if s == "done":
ok = kanban_db.complete_task(conn, tid)
ok = kanban_db.complete_task(
conn, tid,
result=payload.result,
summary=payload.summary,
metadata=payload.metadata,
)
elif s == "blocked":
ok = kanban_db.block_task(conn, tid)
elif s == "ready":
@@ -657,6 +848,168 @@ def bulk_update(payload: BulkTaskBody):
conn.close()
# ---------------------------------------------------------------------------
# Diagnostics — fleet-wide distress signals (hallucinations, crashes,
# spawn failures, stuck-blocked). See hermes_cli.kanban_diagnostics for
# the rule engine.
# ---------------------------------------------------------------------------
@router.get("/diagnostics")
def list_diagnostics(
board: Optional[str] = Query(None, description="Kanban board slug (omit for current)"),
severity: Optional[str] = Query(
None,
description="Filter by severity: warning|error|critical",
),
):
"""Return ``[{task_id, task_title, task_status, task_assignee,
diagnostics: [...]}, ...]`` for every task on the board with at
least one active diagnostic.
Severity-filterable so the UI can render "just the critical ones"
or the CLI can grep. Useful for the board-header attention strip
AND for ``hermes kanban diagnostics`` which shells to this
endpoint when the dashboard's running, or invokes the engine
directly when it isn't.
"""
board = _resolve_board(board)
conn = _conn(board=board)
try:
diags_by_task = _compute_task_diagnostics(conn, task_ids=None)
if not diags_by_task:
return {"diagnostics": [], "count": 0}
# Narrow by severity if asked.
if severity:
filtered: dict[str, list[dict]] = {}
for tid, dl in diags_by_task.items():
keep = [d for d in dl if d.get("severity") == severity]
if keep:
filtered[tid] = keep
diags_by_task = filtered
if not diags_by_task:
return {"diagnostics": [], "count": 0}
# Pull the task rows we need in one query so we can include
# titles/statuses without a per-task lookup.
ids = list(diags_by_task.keys())
placeholders = ",".join(["?"] * len(ids))
rows = {
r["id"]: r
for r in conn.execute(
f"SELECT id, title, status, assignee FROM tasks WHERE id IN ({placeholders})",
tuple(ids),
).fetchall()
}
out = []
for tid, dl in diags_by_task.items():
r = rows.get(tid)
out.append({
"task_id": tid,
"task_title": r["title"] if r else None,
"task_status": r["status"] if r else None,
"task_assignee": r["assignee"] if r else None,
"diagnostics": dl,
})
# Sort: highest severity first, then most recent.
from hermes_cli.kanban_diagnostics import SEVERITY_ORDER
sev_idx = {s: i for i, s in enumerate(SEVERITY_ORDER)}
def _sort_key(row):
top = row["diagnostics"][0]
return (
-sev_idx.get(top.get("severity"), -1),
-(top.get("last_seen_at") or 0),
)
out.sort(key=_sort_key)
return {
"diagnostics": out,
"count": sum(len(d["diagnostics"]) for d in out),
}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Recovery actions — reclaim a running claim, reassign to a new profile
# ---------------------------------------------------------------------------
class ReclaimBody(BaseModel):
reason: Optional[str] = None
@router.post("/tasks/{task_id}/reclaim")
def reclaim_task_endpoint(
task_id: str,
payload: ReclaimBody,
board: Optional[str] = Query(None),
):
"""Release an active worker claim on a running task.
Used by the dashboard recovery popover when an operator wants to
abort a stuck worker (e.g. one that keeps hallucinating card ids)
without waiting for the claim TTL. Maps 1:1 to
``hermes kanban reclaim <task_id> --reason ...``.
"""
board = _resolve_board(board)
conn = _conn(board=board)
try:
ok = kanban_db.reclaim_task(conn, task_id, reason=payload.reason)
if not ok:
raise HTTPException(
status_code=409,
detail=(
f"cannot reclaim {task_id}: not in a claimable state "
"(not running, or unknown id)"
),
)
return {"ok": True, "task_id": task_id}
finally:
conn.close()
class ReassignBody(BaseModel):
profile: Optional[str] = None # "" or None = unassign
reclaim_first: bool = False
reason: Optional[str] = None
@router.post("/tasks/{task_id}/reassign")
def reassign_task_endpoint(
task_id: str,
payload: ReassignBody,
board: Optional[str] = Query(None),
):
"""Reassign a task to a different profile, optionally reclaiming first.
Used by the dashboard recovery popover when an operator wants to
retry a task with a different worker profile (e.g. switch to a
smarter model after the assigned profile keeps hallucinating).
Maps 1:1 to ``hermes kanban reassign <task_id> <profile> [--reclaim]``.
"""
board = _resolve_board(board)
conn = _conn(board=board)
try:
ok = kanban_db.reassign_task(
conn, task_id,
payload.profile or None,
reclaim_first=bool(payload.reclaim_first),
reason=payload.reason,
)
if not ok:
raise HTTPException(
status_code=409,
detail=(
f"cannot reassign {task_id}: unknown id, or still "
"running (pass reclaim_first=true to release the claim first)"
),
)
return {"ok": True, "task_id": task_id, "assignee": payload.profile or None}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Plugin config (read dashboard.kanban.* defaults from config.yaml)
# ---------------------------------------------------------------------------
@@ -685,19 +1038,169 @@ def get_config():
}
# ---------------------------------------------------------------------------
# Home-channel subscriptions (per-task, per-platform toggles)
# ---------------------------------------------------------------------------
#
# Home channels are a first-class gateway concept — each configured platform
# can have exactly one (chat_id, thread_id, name) it considers "home". The
# dashboard surfaces these as per-task toggles so a user can opt a specific
# task into receiving terminal notifications (completed / blocked / gave_up)
# at their telegram/discord/slack home, without touching the CLI.
#
# The wire format mirrors kanban_db.add_notify_sub — (task_id, platform,
# chat_id, thread_id) — so toggle-on creates exactly the same row the
# `/kanban create` slash command would, and the existing gateway notifier
# watcher delivers events without any additional plumbing.
def _configured_home_channels() -> list[dict]:
"""Return every platform that has a home_channel set, fully hydrated.
Reads the live GatewayConfig so env-var overlays (``TELEGRAM_HOME_CHANNEL``
etc.) are honored alongside config.yaml. Returns platforms in a stable
order and drops platforms without a home.
"""
try:
from gateway.config import load_gateway_config
except Exception:
return []
try:
gw_cfg = load_gateway_config()
except Exception:
return []
result: list[dict] = []
for platform, pcfg in gw_cfg.platforms.items():
if not pcfg or not pcfg.home_channel:
continue
hc = pcfg.home_channel
result.append({
"platform": platform.value,
"chat_id": hc.chat_id,
"thread_id": hc.thread_id or "",
"name": hc.name or "Home",
})
# Stable order for deterministic UI — platform name alphabetical.
result.sort(key=lambda r: r["platform"])
return result
def _home_sub_matches(sub: dict, home: dict) -> bool:
"""True if a notify_subs row corresponds to the given home channel."""
return (
sub.get("platform") == home["platform"]
and str(sub.get("chat_id", "")) == str(home["chat_id"])
and str(sub.get("thread_id") or "") == str(home["thread_id"] or "")
)
@router.get("/home-channels")
def get_home_channels(
task_id: Optional[str] = Query(None),
board: Optional[str] = Query(None),
):
"""List every platform with a home channel, plus whether *task_id*
(if given) is currently subscribed to that home.
When ``task_id`` is omitted, every entry's ``subscribed`` is ``false``
useful for the "no task selected" state of the UI.
"""
homes = _configured_home_channels()
subscribed_homes: set[tuple[str, str, str]] = set()
if task_id:
board = _resolve_board(board)
conn = _conn(board=board)
try:
subs = kanban_db.list_notify_subs(conn, task_id)
finally:
conn.close()
for sub in subs:
key = (
str(sub.get("platform") or ""),
str(sub.get("chat_id") or ""),
str(sub.get("thread_id") or ""),
)
subscribed_homes.add(key)
result = []
for home in homes:
key = (home["platform"], home["chat_id"], home["thread_id"])
result.append({**home, "subscribed": key in subscribed_homes})
return {"home_channels": result}
@router.post("/tasks/{task_id}/home-subscribe/{platform}")
def subscribe_home(task_id: str, platform: str, board: Optional[str] = Query(None)):
"""Subscribe *task_id* to notifications routed to *platform*'s home channel.
Idempotent re-subscribing is a no-op at the DB layer. 404 if the
platform has no home channel configured. 404 if the task doesn't exist.
"""
homes = _configured_home_channels()
home = next((h for h in homes if h["platform"] == platform), None)
if not home:
raise HTTPException(
status_code=404,
detail=f"No home channel configured for platform {platform!r}. "
f"Set one from the messenger via /sethome, or configure "
f"gateway.platforms.{platform}.home_channel in config.yaml.",
)
board = _resolve_board(board)
conn = _conn(board=board)
try:
task = kanban_db.get_task(conn, task_id)
if task is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
kanban_db.add_notify_sub(
conn,
task_id=task_id,
platform=platform,
chat_id=home["chat_id"],
thread_id=home["thread_id"] or None,
)
return {"ok": True, "task_id": task_id, "home_channel": home}
finally:
conn.close()
@router.delete("/tasks/{task_id}/home-subscribe/{platform}")
def unsubscribe_home(task_id: str, platform: str, board: Optional[str] = Query(None)):
"""Remove any notify subscription on *task_id* that matches *platform*'s home."""
homes = _configured_home_channels()
home = next((h for h in homes if h["platform"] == platform), None)
if not home:
raise HTTPException(
status_code=404,
detail=f"No home channel configured for platform {platform!r}.",
)
board = _resolve_board(board)
conn = _conn(board=board)
try:
kanban_db.remove_notify_sub(
conn,
task_id=task_id,
platform=platform,
chat_id=home["chat_id"],
thread_id=home["thread_id"] or None,
)
return {"ok": True, "task_id": task_id, "home_channel": home}
finally:
conn.close()
# ---------------------------------------------------------------------------
# Stats (per-profile / per-status counts + oldest-ready age)
# ---------------------------------------------------------------------------
@router.get("/stats")
def get_stats():
def get_stats(board: Optional[str] = Query(None)):
"""Per-status + per-assignee counts + oldest-ready age.
Designed for the dashboard HUD and for router profiles that need to
answer "is this specialist overloaded?" without scanning the whole
board themselves.
"""
conn = _conn()
board = _resolve_board(board)
conn = _conn(board=board)
try:
return kanban_db.board_stats(conn)
finally:
@@ -705,7 +1208,7 @@ def get_stats():
@router.get("/assignees")
def get_assignees():
def get_assignees(board: Optional[str] = Query(None)):
"""Known profiles + per-profile task counts.
Returns the union of ``~/.hermes/profiles/*`` on disk and every
@@ -713,7 +1216,8 @@ def get_assignees():
this to populate its assignee dropdown so a freshly-created profile
appears in the picker before it's been given any task.
"""
conn = _conn()
board = _resolve_board(board)
conn = _conn(board=board)
try:
return {"assignees": kanban_db.known_assignees(conn)}
finally:
@@ -725,7 +1229,11 @@ def get_assignees():
# ---------------------------------------------------------------------------
@router.get("/tasks/{task_id}/log")
def get_task_log(task_id: str, tail: Optional[int] = Query(None, ge=1, le=2_000_000)):
def get_task_log(
task_id: str,
tail: Optional[int] = Query(None, ge=1, le=2_000_000),
board: Optional[str] = Query(None),
):
"""Return the worker's stdout/stderr log.
``tail`` caps the response size (bytes) so the dashboard drawer
@@ -734,15 +1242,16 @@ def get_task_log(task_id: str, tail: Optional[int] = Query(None, ge=1, le=2_000_
``_rotate_worker_log`` a single ``.log.1`` is kept, no further
generations, so disk usage per task is bounded at ~4 MiB.
"""
conn = _conn()
board = _resolve_board(board)
conn = _conn(board=board)
try:
task = kanban_db.get_task(conn, task_id)
finally:
conn.close()
if task is None:
raise HTTPException(status_code=404, detail=f"task {task_id} not found")
content = kanban_db.read_worker_log(task_id, tail_bytes=tail)
log_path = kanban_db.worker_log_path(task_id)
content = kanban_db.read_worker_log(task_id, tail_bytes=tail, board=board)
log_path = kanban_db.worker_log_path(task_id, board=board)
size = log_path.stat().st_size if log_path.exists() else 0
return {
"task_id": task_id,
@@ -760,11 +1269,16 @@ def get_task_log(task_id: str, tail: Optional[int] = Query(None, ge=1, le=2_000_
# ---------------------------------------------------------------------------
@router.post("/dispatch")
def dispatch(dry_run: bool = Query(False), max_n: int = Query(8, alias="max")):
conn = _conn()
def dispatch(
dry_run: bool = Query(False),
max_n: int = Query(8, alias="max"),
board: Optional[str] = Query(None),
):
board = _resolve_board(board)
conn = _conn(board=board)
try:
result = kanban_db.dispatch_once(
conn, dry_run=dry_run, max_spawn=max_n,
conn, dry_run=dry_run, max_spawn=max_n, board=board,
)
# DispatchResult is a dataclass.
try:
@@ -775,6 +1289,124 @@ def dispatch(dry_run: bool = Query(False), max_n: int = Query(8, alias="max")):
conn.close()
# ---------------------------------------------------------------------------
# Boards CRUD (multi-project support)
# ---------------------------------------------------------------------------
class CreateBoardBody(BaseModel):
slug: str
name: Optional[str] = None
description: Optional[str] = None
icon: Optional[str] = None
color: Optional[str] = None
switch: bool = False
class RenameBoardBody(BaseModel):
name: Optional[str] = None
description: Optional[str] = None
icon: Optional[str] = None
color: Optional[str] = None
def _board_counts(slug: str) -> dict[str, int]:
"""Return ``{status: count}`` for a board. Safe on an empty DB."""
try:
path = kanban_db.kanban_db_path(board=slug)
if not path.exists():
return {}
conn = kanban_db.connect(board=slug)
try:
rows = conn.execute(
"SELECT status, COUNT(*) AS n FROM tasks GROUP BY status"
).fetchall()
return {r["status"]: int(r["n"]) for r in rows}
finally:
conn.close()
except Exception:
return {}
@router.get("/boards")
def list_boards(include_archived: bool = Query(False)):
"""Return every board on disk with task counts and the active slug."""
boards = kanban_db.list_boards(include_archived=include_archived)
current = kanban_db.get_current_board()
for b in boards:
b["is_current"] = (b["slug"] == current)
b["counts"] = _board_counts(b["slug"])
b["total"] = sum(b["counts"].values())
return {"boards": boards, "current": current}
@router.post("/boards")
def create_board_endpoint(payload: CreateBoardBody):
"""Create a new board. Idempotent — ``slug`` collision returns existing."""
try:
meta = kanban_db.create_board(
payload.slug,
name=payload.name,
description=payload.description,
icon=payload.icon,
color=payload.color,
)
except ValueError as exc:
raise HTTPException(status_code=400, detail=str(exc))
if payload.switch:
try:
kanban_db.set_current_board(meta["slug"])
except ValueError as exc:
raise HTTPException(status_code=400, detail=str(exc))
return {"board": meta, "current": kanban_db.get_current_board()}
@router.patch("/boards/{slug}")
def rename_board(slug: str, payload: RenameBoardBody):
"""Update a board's display metadata (slug is immutable — create a new one to rename the directory)."""
try:
normed = kanban_db._normalize_board_slug(slug)
except ValueError as exc:
raise HTTPException(status_code=400, detail=str(exc))
if not normed or not kanban_db.board_exists(normed):
raise HTTPException(status_code=404, detail=f"board {slug!r} does not exist")
meta = kanban_db.write_board_metadata(
normed,
name=payload.name,
description=payload.description,
icon=payload.icon,
color=payload.color,
)
return {"board": meta}
@router.delete("/boards/{slug}")
def delete_board(slug: str, delete: bool = Query(False, description="Hard-delete instead of archive")):
"""Archive (default) or hard-delete a board."""
try:
res = kanban_db.remove_board(slug, archive=not delete)
except ValueError as exc:
raise HTTPException(status_code=400, detail=str(exc))
return {"result": res, "current": kanban_db.get_current_board()}
@router.post("/boards/{slug}/switch")
def switch_board(slug: str):
"""Persist ``slug`` as the active board for subsequent CLI / slash calls.
Dashboard users pick boards via a client-side ``localStorage`` this
endpoint is for ``/kanban boards switch`` parity so gateway slash
commands and the CLI share the same current-board pointer.
"""
try:
normed = kanban_db._normalize_board_slug(slug)
except ValueError as exc:
raise HTTPException(status_code=400, detail=str(exc))
if not normed or not kanban_db.board_exists(normed):
raise HTTPException(status_code=404, detail=f"board {slug!r} does not exist")
kanban_db.set_current_board(normed)
return {"current": normed}
# ---------------------------------------------------------------------------
# WebSocket: /events?since=<event_id>
# ---------------------------------------------------------------------------
@@ -802,8 +1434,18 @@ async def stream_events(ws: WebSocket):
except ValueError:
cursor = 0
# Board selection — pinned at the WS handshake; re-subscribe to
# switch boards. Changing boards mid-stream would require
# reconciling two cursors, so the UI just opens a new WS on
# board change.
ws_board_raw = ws.query_params.get("board")
try:
ws_board = kanban_db._normalize_board_slug(ws_board_raw) if ws_board_raw else None
except ValueError:
ws_board = None
def _fetch_new(cursor_val: int) -> tuple[int, list[dict]]:
conn = kanban_db.connect()
conn = kanban_db.connect(board=ws_board)
try:
rows = conn.execute(
"SELECT id, task_id, run_id, kind, payload, created_at "
+7 -6
View File
@@ -626,14 +626,15 @@ class HonchoSessionManager:
Pre-fetch user and AI peer context from Honcho.
Fetches peer_representation and peer_card for both peers, plus the
session summary when available. search_query is intentionally omitted
it would only affect additional excerpts that this code does not
consume, and passing the raw message exposes conversation content in
server access logs.
session summary when available. When user_message is provided, it is
passed as search_query to the peer context call so Honcho returns
conclusions relevant to the session topic rather than the full
observation dump.
Args:
session_key: The session key to get context for.
user_message: Unused; kept for call-site compatibility.
user_message: Optional first user message used as search_query for
topic-relevant context retrieval.
Returns:
Dictionary with 'representation', 'card', 'ai_representation',
@@ -659,7 +660,7 @@ class HonchoSessionManager:
logger.debug("Failed to fetch session summary from Honcho: %s", e)
try:
user_ctx = self._fetch_peer_context(session.user_peer_id, target=session.user_peer_id)
user_ctx = self._fetch_peer_context(session.user_peer_id, search_query=user_message or None, target=session.user_peer_id)
result["representation"] = user_ctx["representation"]
result["card"] = "\n".join(user_ctx["card"])
except Exception as e:
+70
View File
@@ -0,0 +1,70 @@
# Model Provider Plugins
Each subdirectory is a self-contained provider profile plugin. The
directory layout mirrors `plugins/platforms/`:
```
plugins/model-providers/
├── openrouter/
│ ├── __init__.py # registers the ProviderProfile
│ └── plugin.yaml # manifest: name, kind, version, description
├── anthropic/
│ ├── __init__.py
│ └── plugin.yaml
└── ...
```
## How discovery works
`providers/__init__.py._discover_providers()` scans this directory (and
`$HERMES_HOME/plugins/model-providers/`) the first time anything calls
`get_provider_profile()` or `list_providers()`. Each `__init__.py` is
imported and expected to call `providers.register_provider(profile)`.
User plugins at `$HERMES_HOME/plugins/model-providers/<name>/` override
bundled plugins of the same name — last-writer-wins in
`register_provider()`. Drop a file there to replace a built-in.
## Adding a new provider
1. Create `plugins/model-providers/<your_provider>/__init__.py`:
```python
from providers import register_provider
from providers.base import ProviderProfile
my_provider = ProviderProfile(
name="your-provider",
aliases=("alias1", "alias2"),
display_name="Your Provider",
description="One-line description shown in the setup picker",
signup_url="https://your-provider.example.com/keys",
env_vars=("YOUR_PROVIDER_API_KEY", "YOUR_PROVIDER_BASE_URL"),
base_url="https://api.your-provider.example.com/v1",
default_aux_model="your-cheap-model",
)
register_provider(my_provider)
```
2. Create `plugins/model-providers/<your_provider>/plugin.yaml`:
```yaml
name: your-provider-profile
kind: model-provider
version: 1.0.0
description: Short sentence about the provider
author: Your Name
```
Nothing else needs to change. `auth.py`, `config.py`, `models.py`,
`doctor.py`, `model_metadata.py`, `runtime_provider.py`, and the
chat_completions transport all auto-wire from the registry.
## Non-trivial profiles
Override the `ProviderProfile` hooks in a subclass for per-provider
quirks — see `plugins/model-providers/openrouter/__init__.py` for
`build_extra_body` and `build_api_kwargs_extras` examples, and
`plugins/model-providers/gemini/__init__.py` for `thinking_config`
translation.
@@ -0,0 +1,43 @@
"""Vercel AI Gateway provider profile.
AI Gateway routes to multiple backends. Hermes sends attribution
headers and full reasoning config passthrough.
"""
from typing import Any
from providers import register_provider
from providers.base import ProviderProfile
class VercelAIGatewayProfile(ProviderProfile):
"""Vercel AI Gateway — attribution headers + reasoning passthrough."""
def build_api_kwargs_extras(
self,
*,
reasoning_config: dict | None = None,
supports_reasoning: bool = True,
**ctx: Any,
) -> tuple[dict[str, Any], dict[str, Any]]:
extra_body: dict[str, Any] = {}
if supports_reasoning and reasoning_config is not None:
extra_body["reasoning"] = dict(reasoning_config)
elif supports_reasoning:
extra_body["reasoning"] = {"enabled": True, "effort": "medium"}
return extra_body, {}
vercel = VercelAIGatewayProfile(
name="ai-gateway",
aliases=("vercel", "vercel-ai-gateway", "ai_gateway", "aigateway"),
env_vars=("AI_GATEWAY_API_KEY",),
base_url="https://ai-gateway.vercel.sh/v1",
default_headers={
"HTTP-Referer": "https://hermes-agent.nousresearch.com",
"X-Title": "Hermes Agent",
},
default_aux_model="google/gemini-3-flash",
)
register_provider(vercel)
@@ -0,0 +1,5 @@
name: ai-gateway-provider
kind: model-provider
version: 1.0.0
description: Vercel AI Gateway
author: Nous Research
@@ -0,0 +1,21 @@
"""Alibaba Cloud Coding Plan provider profile.
Separate from the standard `alibaba` profile because it hits a different
endpoint (coding-intl.dashscope.aliyuncs.com) with a dedicated API key tier.
"""
from providers import register_provider
from providers.base import ProviderProfile
alibaba_coding_plan = ProviderProfile(
name="alibaba-coding-plan",
aliases=("alibaba_coding", "alibaba-coding", "dashscope-coding"),
display_name="Alibaba Cloud (Coding Plan)",
description="Alibaba Cloud Coding Plan — dedicated coding tier",
signup_url="https://help.aliyun.com/zh/model-studio/",
env_vars=("ALIBABA_CODING_PLAN_API_KEY", "DASHSCOPE_API_KEY", "ALIBABA_CODING_PLAN_BASE_URL"),
base_url="https://coding-intl.dashscope.aliyuncs.com/v1",
auth_type="api_key",
)
register_provider(alibaba_coding_plan)

Some files were not shown because too many files have changed in this diff Show More