Compare commits

...

473 Commits

Author SHA1 Message Date
Teknium 4b8272f549 feat(browser): add browser_dialog for native JS dialog handling
Ergonomic wrapper over CDP's Page.handleJavaScriptDialog that accepts
or dismisses alert/confirm/prompt/beforeunload dialogs blocking a page.
Unsticks pages whose JS thread is frozen by an unhandled dialog —
symptom is that browser_snapshot, browser_console, browser_click etc.
start hanging or erroring.

- action='accept'|'dismiss' required; prompt_text optional for prompt()
- target_id auto-resolves when exactly one page tab is open; with
  multiple page tabs, errors with the tab list so the agent picks one
- Shares browser_cdp's check_fn gate — only appears when CDP is
  reachable (/browser connect or browser.cdp_url in config). Hidden
  otherwise so backends that can't use it don't see it.
- Safe as a probe: CDP returns a clean 'No dialog is showing' error
  when nothing's pending, which we pass through verbatim

Dialog detection (knowing a dialog is open without being told) is NOT
included — it requires persistent CDP subscriptions per session, a
larger architectural change. Documented as a follow-up; agents infer
from symptoms and use this tool to recover.

Tests: 11 new unit tests against mock CDP server covering the wrapper
(action validation, auto-resolve with 0/1/multiple page targets,
explicit target_id accept/dismiss flow, prompt_text passthrough, shared
gate with browser_cdp, registry dispatch). E2E probe case against real
headless Chrome passes. Positive-case real-Chrome E2E is blocked by
Chromium's headless auto-dismiss behavior when no persistent listener
is attached — unit tests exercise the exact CDP protocol we send, so
the handling path is protocol-verified; headful real-browser usage
(the actual /browser connect case) keeps dialogs alive via the Chrome
UI.
2026-04-19 05:20:51 -07:00
Teknium 62ce6a38ae fix(gateway): cancel_background_tasks must drain late-arrivals (#12471)
During gateway shutdown, a message arriving while
cancel_background_tasks is mid-await (inside asyncio.gather) spawns
a fresh _process_message_background task via handle_message and adds
it to self._background_tasks.  The original implementation's
_background_tasks.clear() at the end of cancel_background_tasks
dropped the reference; the task ran untracked against a disconnecting
adapter, logged send-failures, and lingered until it completed on
its own.

Fix: wrap the cancel+gather in a bounded loop (MAX_DRAIN_ROUNDS=5).
If new tasks appeared during the gather, cancel them in the next
round.  The .clear() at the end is preserved as a safety net for
any task that appeared after MAX_DRAIN_ROUNDS — but in practice the
drain stabilizes in 1-2 rounds.

Tests: tests/gateway/test_cancel_background_drain.py — 3 cases.
- test_cancel_background_tasks_drains_late_arrivals: spawn M1, start
  cancel, inject M2 during M1's shielded cleanup, verify M2 is
  cancelled.
- test_cancel_background_tasks_handles_no_tasks: no-op path still
  terminates cleanly.
- test_cancel_background_tasks_bounded_rounds: baseline — single
  task cancels in one round, loop terminates.

Regression-guard validated: against the unpatched implementation,
the late-arrival test fails with exactly the expected message
('task leaked').  With the fix it passes.

Blast radius is shutdown-only; the audit classified this as MED.
Shipping because the fix is small and the hygiene is worth it.

While investigating the audit's other MEDs (busy-handler double-ack,
Discord ExecApprovalView double-resolve, UpdatePromptView
double-resolve), I verified all three were false positives — the
check-and-set patterns have no await between them, so they're
atomic on single-threaded asyncio.  No fix needed for those.
2026-04-19 01:48:42 -07:00
konsisumer 1d1e1277e4 fix(gateway): flush undelivered tail before segment reset to preserve streamed text (#8124)
When a streaming edit fails mid-stream (flood control, transport error)
and a tool boundary arrives before the fallback threshold is reached,
the pre-boundary tail in `_accumulated` was silently discarded by
`_reset_segment_state`. The user saw a frozen partial message and
missing words on the other side of the tool call.

Flush the undelivered tail as a continuation message before the reset,
computed relative to the last successfully-delivered prefix so we don't
duplicate content the user already saw.
2026-04-19 01:43:04 -07:00
Teknium e017131403 feat(cron): add wakeAgent gate — scripts can skip the agent entirely
Extends the existing cron script hook with a wake gate ported from
nanoclaw #1232. When a cron job's pre-check Python script (already
sandboxed to HERMES_HOME/scripts/) writes a JSON line like
```json
{"wakeAgent": false}
```
on its last stdout line, `run_job()` returns the SILENT marker and
skips the agent entirely — no LLM call, no delivery, no tokens spent.
Useful for frequent polls (every 1-5 min) that only need to wake the
agent when something has genuinely changed.

Any other script output (non-JSON, missing key, non-dict, `wakeAgent: true`,
truthy/falsy non-False values) behaves as before: stdout is injected
as context and the agent runs normally. Strict `False` is required
to skip — avoids accidental gating from arbitrary JSON.

Refactor:
- New pure helper `_parse_wake_gate(script_output)` in cron/scheduler.py
- `_build_job_prompt` accepts optional `prerun_script` tuple so the
  script runs exactly once per job (run_job runs it for the gate check,
  reuses the output for prompt injection)
- `run_job` short-circuits with SILENT_MARKER when gate fires

Script failures (success=False) still cannot trigger the gate — the
failure is reported as context to the agent as before.

This replaces the approach in closed PR #3837, which inlined bash
scripts via tempfile and lost the path-traversal/scripts-dir sandbox
that main's impl has. The wake-gate idea (the one net-new capability)
is ported on top of the existing sandboxed Python-script model.

Tests:
- 11 pure unit tests for _parse_wake_gate (empty, whitespace, non-JSON,
  non-dict JSON, missing key, truthy/falsy non-False, multi-line,
  trailing blanks, non-last-line JSON)
- 5 integration tests for run_job wake-gate (skip returns SILENT,
  wake-true passes through, script-runs-only-once, script failure
  doesn't gate, no-script regression)
- Full tests/cron/ suite: 194/194 pass
2026-04-19 01:42:35 -07:00
helix4u c94d26c69b fix(cli): sanitize interactive command output 2026-04-19 01:16:34 -07:00
kshitijk4poor 175cf7e6bb fix: tighten quiet-mode salvage follow-ups
Follow-up for the helix4u easy-fix salvage batch:
- route remaining context-engine quiet-mode output through
  _should_emit_quiet_tool_messages() so non-CLI/library callers stay
  silent consistently
- drop the extra senderAliases computation from WhatsApp allowlist-drop
  logging and remove the now-unused import

This keeps the batch scoped to the intended fixes while avoiding
leaked quiet-mode output and unnecessary duplicate work in the bridge.
2026-04-19 00:28:25 -07:00
helix4u cd59af17cc fix(agent): silence quiet_mode in python library use 2026-04-19 00:28:25 -07:00
helix4u 361675018f fix(setup): stop hardcoding max-iterations copy 2026-04-19 00:28:25 -07:00
helix4u 3ade655999 fix(whatsapp): log allowlist drops in bridge 2026-04-19 00:28:25 -07:00
Teknium 7c10761dd2 fix(discord): shield text-batch flush from follow-up cancel (#12444)
When Discord splits a long message at 2000 chars, _enqueue_text_event
buffers each chunk and schedules a _flush_text_batch task with a
short delay.  If another chunk lands while the prior flush task is
already inside handle_message, _enqueue_text_event calls
prior_task.cancel() — and without asyncio.shield, CancelledError
propagates from the flush task into handle_message → the agent's
streaming request, aborting the response the user was waiting on.

Reproducer: user sends a 3000-char prompt (split by Discord into 2
messages).  Chunk 1 lands, flush delay starts, chunk 2 lands during
the brief window when chunk 1's flush has already committed to
handle_message.  Agent's current streaming response is cancelled
with CancelledError, user sees a truncated or missing reply.

Fix (gateway/platforms/discord.py):
- Wrap the handle_message call in asyncio.shield so the inner
  dispatch is protected from the outer task's cancel.
- Add an except asyncio.CancelledError clause so the outer task
  still exits cleanly when cancel lands during the sleep window
  (before the pop) — semantics for that path are unchanged.

The new flush task spawned by the follow-up chunk still handles its
own batch via the normal pending-message / active-session machinery
in base.py, so follow-ups are not lost.

Tests: tests/gateway/test_text_batching.py —
test_shield_protects_handle_message_from_cancel.  Tracks a distinct
first_handle_cancelled event so the assertion fails cleanly when the
shield is missing (verified by stashing the fix and re-running).

Live E2E on the live-loaded DiscordAdapter:
  first_handle_cancelled: False  (shield worked)
  first_handle_completed: True   (handle_message ran to completion)
2026-04-19 00:09:38 -07:00
Teknium dca439fe92 fix(tui): scope session.interrupt pending-prompt release to the calling session (#12441)
session.interrupt on session A was blast-resolving pending
clarify/sudo/secret prompts on ALL sessions sharing the same
tui_gateway process.  Other sessions' agent threads unblocked with
empty-string answers as if the user had cancelled — silent
cross-session corruption.

Root cause: _pending and _answers were globals keyed by random rid
with no record of the owning session.  _clear_pending() iterated
every entry, so the session.interrupt handler had no way to limit
the release to its own sid.

Fix:
- tui_gateway/server.py: _pending now maps rid to (sid, Event)
  tuples.  _clear_pending takes an optional sid argument and filters
  by owner_sid when provided.  session.interrupt passes the calling
  sid so unrelated sessions are untouched.  _clear_pending(None)
  remains the shutdown path for completeness.
- _block and _respond updated to pack/unpack the new tuple format.

Tests (tests/test_tui_gateway_server.py): 4 new cases.
- test_interrupt_only_clears_own_session_pending: two sessions with
  pending prompts, interrupting one must not release the other.
- test_interrupt_clears_multiple_own_pending: same-sid multi-prompt
  release works.
- test_clear_pending_without_sid_clears_all: shutdown path preserved.
- test_respond_unpacks_sid_tuple_correctly: _respond handles the
  tuple format.

Also updated tests/tui_gateway/test_protocol.py to use the new tuple
format for test_block_and_respond and test_clear_pending.

Live E2E against the live Python environment confirmed cross-session
isolation: interrupting sid_a released its own pending prompt without
touching sid_b's.  All 78 related tests pass.
2026-04-19 00:03:58 -07:00
Teknium ce410521b3 feat(browser): add browser_cdp raw DevTools Protocol passthrough (#12369)
Agents can now send arbitrary CDP commands to the browser. The tool is
gated on a reachable CDP endpoint at session start — it only appears in
the toolset when BROWSER_CDP_URL is set (from '/browser connect') or
'browser.cdp_url' is configured in config.yaml. Backends that don't
currently expose CDP to the Python side (Camofox, default local
agent-browser, cloud providers whose per-session cdp_url is not yet
surfaced) do not see the tool at all.

Tool schema description links to the CDP method reference at
https://chromedevtools.github.io/devtools-protocol/ so the agent can
web_extract specific method docs on demand.

Stateless per call. Browser-level methods (Target.*, Browser.*,
Storage.*) omit target_id. Page-level methods attach to the target
with flatten=true and dispatch the method on the returned sessionId.
Clean errors when the endpoint becomes unreachable mid-session or
the URL isn't a WebSocket.

Tests: 19 unit (mock CDP server + gate checks) + E2E against real
headless Chrome (Target.getTargets, Browser.getVersion,
Runtime.evaluate with target_id, Page.navigate + re-eval, bogus
method, bogus target_id, missing endpoint) + E2E of the check_fn
gate (tool hidden without CDP URL, visible with it, hidden again
after unset).
2026-04-19 00:03:10 -07:00
helix4u d66414a844 docs(custom-providers): use key_env in examples 2026-04-18 23:07:59 -07:00
helix4u 7b1a11b971 fix(memory): keep Honcho provider opt-in 2026-04-18 22:50:55 -07:00
kshitijk4poor 0a8d48809f chore: add LeonSGP43 numeric noreply email to AUTHOR_MAP
The cherry-picked commit from #11434 uses the 154585401+ prefixed
noreply format. Add it alongside the existing bare entry so the
contributor audit passes.
2026-04-18 22:50:55 -07:00
Erosika 21d5ef2f17 feat(honcho): wizard cadence default 2, surface reasoning level, backwards-compat fallback
Setup wizard now always writes dialecticCadence=2 on new configs and
surfaces the reasoning level as an explicit step with all five options
(minimal / low / medium / high / max), always writing
dialecticReasoningLevel.

Code keeps a backwards-compat fallback of 1 when dialecticCadence is
unset so existing honcho.json configs that predate the setting keep
firing every turn on upgrade. New setups via the wizard get 2
explicitly; docs show 2 as the default.

Also scrubs editorial lines from code and docs ("max is reserved for
explicit tool-path selection", "Unset → every turn; wizard pre-fills 2",
and similar process-exposing phrasing) and adds an inline link to
app.honcho.dev where the server-side observation sync is mentioned in
honcho.md. Recommended cadence range updated to 1-5 across docs and
wizard copy.
2026-04-18 22:50:55 -07:00
LeonSGP43 5b6792f04d fix(honcho): scope gateway sessions by runtime user id 2026-04-18 22:50:55 -07:00
Erosika ba7da73ca9 test(honcho): drop two first-turn tests subsumed by prewarm + smoke coverage
- TestDialecticDepth::test_first_turn_runs_dialectic_synchronously:
  covered by TestSessionStartDialecticPrewarm::test_turn1_falls_back_to_sync_when_prewarm_missing
  (more realistic — exercises the empty-prewarm → sync-fallback path)
- TestDialecticDepth::test_first_turn_dialectic_does_not_double_fire:
  covered by TestDialecticLifecycleSmoke (turn 1 flow) and
  TestDialecticCadenceAdvancesOnSuccess::test_empty_dialectic_result_does_not_advance_cadence

Both predate the prewarm refactor and test paths that are now
fallback behaviors already covered elsewhere.
2026-04-18 22:50:55 -07:00
Erosika c630dfcdac feat(honcho): dialectic liveness — stale-thread watchdog, stale-result discard, empty-streak backoff
Hardens the dialectic lifecycle against three failure modes that could
leave the prefetch pipeline stuck or injecting stale content:

- Stale-thread watchdog: _thread_is_live() treats any prefetch thread
  older than timeout × 2.0 as dead. A hung Honcho call can no longer
  block subsequent fires indefinitely.

- Stale-result discard: pending _prefetch_result is tagged with its
  fire turn. prefetch() discards the result if more than cadence × 2
  turns passed before a consumer read it (e.g. a run of trivial-prompt
  turns between fire and read).

- Empty-streak backoff: consecutive empty dialectic returns widen the
  effective cadence (dialectic_cadence + streak, capped at cadence × 8).
  A healthy fire resets the streak. Prevents the plugin from hammering
  the backend every turn when the peer graph is cold.

- liveness_snapshot() on the provider exposes current turn, last fire,
  pending fire-at, empty streak, effective cadence, and thread status
  for in-process diagnostics.

- system_prompt_block: nudge the model that honcho_reasoning accepts
  reasoning_level minimal/low/medium/high/max per call.

- hermes honcho status: surface base reasoning level, cap, and heuristic
  toggle so config drift is visible at a glance.

Tests: 550 passed.
- TestDialecticLiveness (8 tests): stale-thread recovery, stale-result
  discard, fresh-result retention, backoff widening, backoff ceiling,
  streak reset on success, streak increment on empty, snapshot shape.
- Existing TestDialecticCadenceAdvancesOnSuccess::test_in_flight_thread_is_not_stacked
  updated to set _prefetch_thread_started_at so it tests the
  fresh-thread-blocks branch (stale path covered separately).
- test_cli TestCmdStatus fake updated with the new config attrs surfaced
  in the status block.
2026-04-18 22:50:55 -07:00
Erosika 098efde848 docs(honcho): wizard cadence default 2, prewarm/depth + observation + multi-peer
- cli: setup wizard pre-fills dialecticCadence=2 (code default stays 1
  so unset → every turn)
- honcho.md: fix stale dialecticCadence default in tables, add
  Session-Start Prewarm subsection (depth runs at init), add
  Query-Adaptive Reasoning Level subsection, expand Observation
  section with directional vs unified semantics and per-peer patterns
- memory-providers.md: fix stale default, rename Multi-agent/Profiles
  to Multi-peer setup, add concrete walkthrough for new profiles and
  sync, document observation toggles + presets, link to honcho.md
- SKILL.md: fix stale defaults, add Depth at session start callout
2026-04-18 22:50:55 -07:00
Erosika 5f9907c116 chore(honcho): drop docs from PR scope, scrub commentary
- Revert website/docs and SKILL.md changes; docs unification handled separately
- Scrub commit/PR refs and process narration from code comments and test
  docstrings (no behavior change)
2026-04-18 22:50:55 -07:00
Erosika 78586ce036 fix(honcho): dialectic lifecycle — defaults, retry, prewarm consumption
Several correctness and cost-safety fixes to the Honcho dialectic path
after a multi-turn investigation surfaced a chain of silent failures:

- dialecticCadence default flipped 3 → 1. PR #10619 changed this from 1 to
  3 for cost, but existing installs with no explicit config silently went
  from per-turn dialectic to every-3-turns on upgrade. Restores pre-#10619
  behavior; 3+ remains available for cost-conscious setups. Docs + wizard
  + status output updated to match.

- Session-start prewarm now consumed. Previously fired a .chat() on init
  whose result landed in HonchoSessionManager._dialectic_cache and was
  never read — pop_dialectic_result had zero call sites. Turn 1 paid for
  a duplicate synchronous dialectic. Prewarm now writes directly to the
  plugin's _prefetch_result via _prefetch_lock so turn 1 consumes it with
  no extra call.

- Prewarm is now dialecticDepth-aware. A single-pass prewarm can return
  weak output on cold peers; the multi-pass audit/reconcile cycle is
  exactly the case dialecticDepth was built for. Prewarm now runs the
  full configured depth in the background.

- Silent dialectic failure no longer burns the cadence window.
  _last_dialectic_turn now advances only when the result is non-empty.
  Empty result → next eligible turn retries immediately instead of
  waiting the full cadence gap.

- Thread pile-up guard. queue_prefetch skips when a prior dialectic
  thread is still in-flight, preventing stacked races on _prefetch_result.

- First-turn sync timeout is recoverable. Previously on timeout the
  background thread's result was stored in a dead local list. Now the
  thread writes into _prefetch_result under lock so the next turn
  picks it up.

- Cadence gate applies uniformly. At cadence=1 the old "cadence > 1"
  guard let first-turn sync + same-turn queue_prefetch both fire.
  Gate now always applies.

- Restored query-length reasoning-level scaling, dropped in 9a0ab34c.
  Scales dialecticReasoningLevel up on longer queries (+1 at ≥120 chars,
  +2 at ≥400), clamped at reasoningLevelCap. Two new config keys:
  `reasoningHeuristic` (bool, default true) and `reasoningLevelCap`
  (string, default "high"; previously parsed but never enforced).
  Respects dialecticDepthLevels and proportional lighter-early passes.

- Restored short-prompt skip, dropped in ef7f3156. One-word
  acknowledgements ("ok", "y", "thanks") and slash commands bypass
  both injection and dialectic fire.

- Purged dead code in session.py: prefetch_dialectic, _dialectic_cache,
  set_dialectic_result, pop_dialectic_result — all unused after prewarm
  refactor.

Tests: 542 passed across honcho_plugin/, agent/test_memory_provider.py,
and run_agent/test_run_agent.py. New coverage:
- TestTrivialPromptHeuristic (classifier + prefetch/queue skip)
- TestDialecticCadenceAdvancesOnSuccess (empty-result retry, pile-up guard)
- TestSessionStartDialecticPrewarm (prewarm consumed, sync fallback)
- TestReasoningHeuristic (length bumps, cap clamp, interaction with depth)
- TestDialecticLifecycleSmoke (end-to-end 8-turn session walk)
2026-04-18 22:50:55 -07:00
Teknium bf5d7462ba fix(tui): reject history-mutating commands while session is running (#12416)
Fixes silent data loss in the TUI when /undo, /compress, /retry, or
rollback.restore runs during an in-flight agent turn.  The version-
guard at prompt.submit:1449 would fail the version check and silently
skip writing the agent's result — UI showed the assistant reply but
DB / backend history never received it, causing UI↔backend desync
that persisted across session resume.

Changes (tui_gateway/server.py):
- session.undo, session.compress, /retry, rollback.restore (full-history
  only — file-scoped rollbacks still allowed): reject with 4009 when
  session.running is True.  Users can /interrupt first.
- prompt.submit: on history_version mismatch (defensive backstop),
  attach a 'warning' field to message.complete and log to stderr
  instead of silently dropping the agent's output.  The UI can surface
  the warning to the user; the operator can spot it in logs.

Tests (tests/test_tui_gateway_server.py): 6 new cases.
- test_session_undo_rejects_while_running
- test_session_undo_allowed_when_idle (regression guard)
- test_session_compress_rejects_while_running
- test_rollback_restore_rejects_full_history_while_running
- test_prompt_submit_history_version_mismatch_surfaces_warning
- test_prompt_submit_history_version_match_persists_normally (regression)

Validated: against unpatched server.py the three 'rejects_while_running'
tests fail and the version-mismatch test fails (no 'warning' field).
With the fix, all 6 pass, all 33 tests in the file pass, 74 TUI tests
in total pass.  Live E2E against the live Python environment confirmed
all 5 patches present and guards enforce 4009 exactly as designed.
2026-04-18 22:30:10 -07:00
Teknium 3a6351454b fix(gateway): close pending-drain and late-arrival races in base adapter (#12371)
Two related race conditions in gateway/platforms/base.py that could
produce duplicate agent runs or silently drop messages. Neither is
specific to any one platform — all adapters inherit this logic.

R5 (HIGH) — duplicate agent spawn on turn chain
  In _process_message_background, the pending-drain path deleted
  _active_sessions[session_key] before awaiting typing_task.cancel()
  and then recursively awaiting _process_message_background for the
  queued event. During the typing_task await, a fresh inbound message
  M3 could pass the Level-1 guard (entry now missing), set its own
  Event, and spawn a second _process_message_background for the same
  session_key — two agents running simultaneously, duplicate responses,
  duplicate tool calls.

  Fix: keep the _active_sessions entry populated and only clear() the
  Event. The guard stays live, so any concurrent inbound message takes
  the busy-handler path (queue + interrupt) as intended.

R6 (MED-HIGH) — message dropped during finally cleanup
  The finally block has two await points (typing_task, stop_typing)
  before it deletes _active_sessions. A message arriving in that
  window passes the guard (entry still live), lands in
  _pending_messages via the busy-handler — and then the unconditional
  del removes the guard with that message still queued. Nothing
  drains it; the user never gets a reply.

  Fix: before deleting _active_sessions in finally, pop any late
  pending_messages entry and spawn a drain task for it. Only delete
  _active_sessions when no pending is waiting.

Tests: tests/gateway/test_pending_drain_race.py — three regression
cases. Validated: without the fix, two of the three fail exactly
where the races manifest (duplicate-spawn guard loses identity,
late-arrival 'LATE' message not in processed list).
2026-04-18 19:32:26 -07:00
Teknium 762f7e9796 feat: configurable approval mode for cron jobs (approvals.cron_mode)
Add approvals.cron_mode config option that controls how cron jobs handle
dangerous commands. Previously, cron jobs silently auto-approved all
dangerous commands because there was no user present to approve them.

Now the behavior is configurable:
  - deny (default): block dangerous commands and return a message telling
    the agent to find an alternative approach. The agent loop continues —
    it just can't use that specific command.
  - approve: auto-approve all dangerous commands (previous behavior).

When a command is blocked, the agent receives the same response format as
a user denial in the CLI — exit_code=-1, status=blocked, with a message
explaining why and pointing to the config option. This keeps the agent
loop running and encourages it to adapt.

Implementation:
  - config.py: add approvals.cron_mode to DEFAULT_CONFIG
  - scheduler.py: set HERMES_CRON_SESSION=1 env var before agent runs
  - approval.py: both check_command_approval() and check_all_command_guards()
    now check for cron sessions and apply the configured mode
  - 21 new tests covering config parsing, deny/approve behavior, and
    interaction with other bypass mechanisms (yolo, containers)
2026-04-18 19:24:35 -07:00
Teknium b02833f32d fix(codex): Hermes owns its own Codex auth; stop touching ~/.codex/auth.json (#12360)
Codex OAuth refresh tokens are single-use and rotate on every refresh.
Sharing them with the Codex CLI / VS Code via ~/.codex/auth.json made
concurrent use of both tools a race: whoever refreshed last invalidated
the other side's refresh_token.  On top of that, the silent auto-import
path picked up placeholder / aborted-auth data from ~/.codex/auth.json
(e.g. literal {"access_token":"access-new","refresh_token":"refresh-new"})
and seeded it into the Hermes pool as an entry the selector could
eventually pick.

Hermes now owns its own Codex auth state end-to-end:

Removed
- agent/credential_pool.py: _sync_codex_entry_from_cli() method,
  its pre-refresh + retry + _available_entries call sites, and the
  post-refresh write-back to ~/.codex/auth.json.
- agent/credential_pool.py: auto-import from ~/.codex/auth.json in
  _seed_from_singletons() — users now run `hermes auth openai-codex`
  explicitly.
- hermes_cli/auth.py: silent runtime migration in
  resolve_codex_runtime_credentials() — now surfaces
  `codex_auth_missing` directly (message already points to `hermes auth`).
- hermes_cli/auth.py: post-refresh write-back in
  _refresh_codex_auth_tokens().
- hermes_cli/auth.py: dead helper _write_codex_cli_tokens() and its 4
  tests in test_auth_codex_provider.py.

Kept
- hermes_cli/auth.py: _import_codex_cli_tokens() — still used by the
  interactive `hermes auth openai-codex` setup flow for a user-gated
  one-time import (with "a separate login is recommended" messaging).

User-visible impact
- On existing installs with Hermes auth already present: no change.
- On a fresh install where the user has only logged in via Codex CLI:
  `hermes chat --provider openai-codex` now fails with "No Codex
  credentials stored. Run `hermes auth` to authenticate." The
  interactive setup flow then detects ~/.codex/auth.json and offers a
  one-time import.
- On an install where Codex CLI later refreshes its token: Hermes is
  unaffected (we no longer read from that file at runtime).

Tests
- tests/hermes_cli/test_auth_codex_provider.py: 15/15 pass.
- tests/hermes_cli/test_auth_commands.py: 20/20 pass.
- tests/agent/test_credential_pool.py: 31/31 pass.
- Live E2E on openai-codex/gpt-5.4: 1 API call, 1.7s latency,
  3 log lines, no refresh events, no auth drama.

The related 14:52 refresh-loop bug (hundreds of rotations/minute on a
single entry) is a separate issue — that requires a refresh-attempt
cap on the auth-recovery path in run_agent.py, which remains open.
2026-04-18 19:19:46 -07:00
yeyitech bd01ec7885 fix(cli): strip all reasoning tag variants from /resume recap
HermesCLI._display_resumed_history() calls the module-level _strip_reasoning_tags() to clean assistant content before rendering the recap panel.  The tag list was missing <thought> (Gemma 4) and there was no pass for stray orphan </tag> closes, so those variants leaked internal reasoning into the recap display (#11316).

- Add <thought> to _REASONING_TAGS.
- Add a third regex pass that strips orphan close tags (e.g. 'stuff</think>answer' → 'stuffanswer').
- Apply IGNORECASE to closed-pair and unclosed-pair passes so mixed-case variants (<THINK>, <Thinking>) are handled uniformly — previously both 'THINKING' and 'thinking' had to be listed explicitly as distinct tuple entries, which missed <Thinking>.

7 new regression tests in tests/cli/test_resume_display.py covering: <think>, <thinking>, <reasoning>, <thought>, unclosed <think>, multiple interleaved blocks, and orphan </think> close.

Resolves #11316.

Originally proposed as PR #11366.
2026-04-18 19:19:24 -07:00
Tranquil-Flow ec48ec5530 fix(agent): strip <think> blocks from stored assistant content
Inline reasoning tags in an assistant message's content field leak to every downstream consumer: messaging platforms (#8878, #9568), API replay of prior turns, session transcript, CLI recap, generated session titles, and context compression.  _extract_reasoning() already captures the reasoning text into msg['reasoning'] separately, so the raw tags in content are redundant.

Stripping once at the storage boundary in _build_assistant_message() cleans the content for every downstream path in one place — no per-platform or per-path stripper needed.  Measured impact on a real MiniMax M2.7-highspeed session (per @luoyejiaoe-source, #9306): 55% of assistant messages started with <think> blocks, 51/100 session titles were polluted, 16% content-size reduction.

3 new regression tests in TestBuildAssistantMessage: closed-pair strip with reasoning capture, no-think-tag passthrough, and unterminated-block strip.

Resolves #8878 and #9568.

Originally proposed as PR #9250.
2026-04-18 19:19:24 -07:00
Teknium 9489d1577d fix(agent): strip unterminated <think> blocks from visible content
Providers served via NIM (MiniMax M2.7, some Moonshot/DeepSeek proxies) sometimes drop the closing </think> tag, leaving raw reasoning in the assistant's content field.  _strip_think_blocks()'s closed-pair regex is non-greedy so it only matches complete blocks — any orphan <think>...EOF survived the stripper and leaked to users (#8878, #9568, #10408).

Adds an unterminated-tag pass that fires when an open reasoning tag sits at a block boundary (start of text or after a newline) with no matching close.  Everything from that tag to end of string is stripped.  The block-boundary check mirrors gateway/stream_consumer.py's filter so models that mention <think> in prose are not over-stripped.

Also makes the closed-pair regexes consistently case-insensitive so <THINK>...</THINK> and <Thinking>...</Thinking> are handled uniformly — previously the mixed-case open tag would bypass the closed-pair pass and be caught by the unterminated-tag pass, taking trailing visible content with it.

6 new regression tests in TestStripThinkBlocks covering: unterminated <think>, unterminated <thought>, multi-line unterminated, line-start orphan with preserved prefix, prose-mention non-regression, mixed-case closed pairs.

The implementation is inspired by @luinbytes's PR #10408 report of the NIM/MiniMax symptom.  This commit does not include the 💭/🧠 emoji regexes from that PR — those glyphs are Hermes CLI display decorations, not model content markers.
2026-04-18 19:19:24 -07:00
Teknium 79c5a381c5 feat(uninstall): offer to remove named profiles when uninstalling from default
When `hermes uninstall` runs from the default HERMES_HOME (~/.hermes)
and other named profiles exist under ~/.hermes/profiles/, show them in
the installation overview and prompt:

    Also stop and remove these N profile(s)? [y/N]

If confirmed, for each named profile we:
  1. Shell out to `python -m hermes_cli.main -p <name> gateway stop/uninstall`
     to stop the gateway and remove its systemd unit or launchd plist
     (service names + unit paths are derived from HERMES_HOME, so we
     can't cleanly switch in-process)
  2. Remove the ~/.local/bin/<name> alias wrapper (outside HERMES_HOME)
  3. Wipe the profile's HERMES_HOME dir

Previously `hermes uninstall` was silently profile-scoped, leaving
zombie systemd units at ~/.config/systemd/user/hermes-gateway-<profile>.service
and zombie HERMES_HOMEs under ~/.hermes/profiles/ whenever a user
uninstalled from default with other profiles configured.

Prompt only appears when uninstalling from the default root. Uninstalling
from within a named profile stays profile-scoped as before.
2026-04-18 19:18:13 -07:00
Teknium 3fe0d503b6 fix(uninstall): properly stop and destroy gateway on hermes uninstall
The uninstaller's gateway cleanup was incomplete:
- Linux only (ignored macOS launchd)
- Only checked user systemd scope (missed system services)
- Didn't kill standalone gateway processes (hermes gateway run)
- Missing DBUS env setup for headless servers

Now delegates to gateway.py's existing machinery:
1. Kill any standalone gateway processes (all platforms)
2. Linux: stop + disable + remove both user AND system systemd services
3. macOS: unload + remove launchd plist
4. Warns (instead of silently failing) when system service needs sudo
2026-04-18 19:18:13 -07:00
Teknium 1e5f0439d9 docs: update Anthropic console URLs to platform.claude.com
Anthropic migrated their developer console from console.anthropic.com
to platform.claude.com. Two user-facing display URLs were still pointing
to the old domain:

- hermes_cli/main.py — API key prompt in the Anthropic model flow
- run_agent.py — 401 troubleshooting output

The OAuth token refresh endpoint was already migrated in PR #3246
(with fallback).

Spotted by @LucidPaths in PR #3237.

(Salvage of #3758 — dropped the setup.py hunk since that section was
refactored away and no longer contains the stale URL.)
2026-04-18 18:55:58 -07:00
Teknium 2a2e5c0fed fix: force relogin on 401/403 Codex token refresh failures
When the OAuth token endpoint returns 401/403 but the JSON body
doesn't contain a known error code (invalid_grant, etc.),
relogin_required stayed False. Users saw a bare error message
without guidance to re-authenticate.

Now any 401/403 from the token endpoint forces relogin_required=True,
since these status codes always indicate invalid credentials on a
refresh endpoint. 500+ errors remain as transient (no relogin).
2026-04-18 18:54:34 -07:00
Teknium beabbd87ef fix(gateway): close adapter resources when connect() fails or raises (#12339)
Gateway startup leaks aiohttp.ClientSession (and other partial-init
resources) when an adapter's connect() returns False or raises. The
adapter is never added to self.adapters, so the shutdown path at
gateway/run.py:2426 never calls disconnect() on it — Python GC later
logs 'Unclosed client session' at process exit.

Seen on 2026-04-18 18:08:16 during a double --replace takeover cycle:
one of the partial-init sessions survived past shutdown and emitted
the warning right before status=75/TEMPFAIL.

Fix:
- New GatewayRunner._safe_adapter_disconnect() helper — calls
  adapter.disconnect() and swallows any exception. Used on error paths.
- Connect loop calls it in both failure branches: success=False and
  except Exception.
- Adapter disconnect() implementations are already expected to be
  idempotent and tolerate partial-init state (they all guard on
  self._http_session / self._bridge_process before touching them).

Tests: tests/gateway/test_safe_adapter_disconnect.py — 3 cases verify
the helper forwards to disconnect, swallows exceptions, and tolerates
platform=None.
2026-04-18 18:53:31 -07:00
Teknium 632a807a3e fix(gateway): slash commands never interrupt a running agent (#12334)
Any recognized slash command now bypasses the Level-1 active-session
guard instead of queueing + interrupting. A mid-run /model (or
/reasoning, /voice, /insights, /title, /resume, /retry, /undo,
/compress, /usage, /provider, /reload-mcp, /sethome, /reset) used to
interrupt the agent AND get silently discarded by the slash-command
safety net — zero-char response, dropped tool calls.

Root cause:
- Discord registers 41 native slash commands via tree.command().
- Only 14 were in ACTIVE_SESSION_BYPASS_COMMANDS.
- The other ~15 user-facing ones fell through base.py:handle_message
  to the busy-session handler, which calls running_agent.interrupt()
  AND queues the text.
- After the aborted run, gateway/run.py:9912 correctly identifies the
  queued text as a slash command and discards it — but the damage
  (interrupt + zero-char response) already happened.

Fix:
- should_bypass_active_session() now returns True for any resolvable
  slash command. ACTIVE_SESSION_BYPASS_COMMANDS stays as the subset
  with dedicated Level-2 handlers (documentation + tests).
- gateway/run.py adds a catch-all after the dedicated handlers that
  returns a user-visible "agent busy — wait or /stop first" response
  for any other resolvable command.
- Unknown text / file-path-like messages are unchanged — they still
  queue.

Also:
- gateway/platforms/discord.py logs the invoker identity on every
  slash command (user id + name + channel + guild) so future
  ghost-command reports can be triaged without guessing.

Tests:
- 15 new parametrized cases in test_command_bypass_active_session.py
  cover every previously-broken Discord slash command.
- Existing tests for /stop, /new, /approve, /deny, /help, /status,
  /agents, /background, /steer, /update, /queue still pass.
- test_steer.py's ACTIVE_SESSION_BYPASS_COMMANDS check still passes.

Fixes #5057. Related: #6252, #10370, #4665.
2026-04-18 18:53:22 -07:00
Teknium 41560192c4 chore(attribution): add AUTHOR_MAP entry for nish3451
Adds the nish3451 noreply email to the AUTHOR_MAP so CI attribution checks
pass for the #6100 Telegram DM fallback fix merged in 1a9a2d7f.
2026-04-18 18:52:41 -07:00
Teknium aa5f89d3ea test: add coverage for from_user=None DM fallback
Tests the three cases:
- DM with from_user=None: user_id falls back to chat.id
- Group with from_user=None: user_id stays None (safe default)
- DM with from_user present: user_id uses from_user.id (no regression)
2026-04-18 18:18:01 -07:00
Nish 1a9a2d7fe8 fix(gateway/telegram): fall back to chat.id when from_user is None in DMs
When `message.from_user` is None — which can happen for forwarded messages,
anonymous admin mode in groups, or certain Telegram client edge cases —
`_build_message_event` set `source.user_id` to None. This caused:

1. `_is_user_authorized()` to early-return False (`if not user_id: return False`)
2. The access check never compared against `TELEGRAM_ALLOWED_USERS` even when
   the user actually was in the allowlist
3. The pairing flow fired and generated a code for `user_id=None`
4. The pairing approval saved an entry under the literal string key "null"
5. The user was effectively locked out because their real user_id never
   matched the "null" key on subsequent messages

For DMs (`chat_type == "dm"`), Telegram guarantees `chat.id == user.id` —
they are the same numeric ID for private chats. Falling back to `chat.id`
when `from_user` is None for DMs restores the expected access-control
behavior without weakening it (group/channel chats correctly stay None).

Also adds a parallel `user_name` fallback to `chat.full_name` so the
display name still works in the same edge case.
2026-04-18 18:18:01 -07:00
Teknium 139a6da67c fix(skills): touchdesigner-mcp setup.sh — correct pgrep match + suppress stray yaml output
Discovered while dogfooding the skill end-to-end:

- pgrep -if "TouchDesigner" matched any shell whose command line
  contained the substring (including the setup script's own invocation
  under certain wrappers), falsely reporting TD running on machines
  where it isn't. Switch to pgrep -x (exact process name match,
  supported on both macOS and Linux) and also check TouchDesignerFTE
  (the non-commercial variant).
- The embedded python3 yaml-writer printed 'added' / 'exists' to
  stdout as status, which leaked a stray word into the setup output
  right before the ✔ line. Drop the print()s — the bash-level ✔/✘ is
  the status indicator.
2026-04-18 17:43:42 -07:00
Teknium 6b31e20894 chore(skills): touchdesigner-mcp follow-ups
- Remove orphan skills/creative/touchdesigner/references/pitfalls.md
  left over from the rename commit (git add-then-edit instead of git mv
  meant the old file never got deleted).
- Honour $HERMES_HOME in setup.sh and SKILL.md setup invocation so
  profile-aware installs work correctly.
- Fix troubleshooting.md config path to use $HERMES_HOME instead of
  hardcoding ~/.hermes/.
- Add touchdesigner-mcp entries to skills-catalog.md and
  optional-skills-catalog.md for parity with blender-mcp/meme-generation.
2026-04-18 17:43:42 -07:00
Teknium 11ee87e605 chore(attribution): add AUTHOR_MAP entry for kshitijk4poor@gmail.com
Covers the non-noreply email used on commit dd3e6424 (rename of the
TouchDesigner skill to touchdesigner-mcp).
2026-04-18 17:43:42 -07:00
kshitijk4poor 6d2fe1d624 feat: rename touchdesigner -> touchdesigner-mcp, move to optional-skills/
- Rename skill to touchdesigner-mcp (matches blender-mcp convention)
- Move from skills/creative/ to optional-skills/creative/
- Fix duplicate pitfall numbering (#3 appeared twice)
- Update SKILL.md cross-references for renumbered pitfalls
- Update setup.sh path for new directory location
2026-04-18 17:43:42 -07:00
kshitijk4poor 6f27390fae feat: rewrite TouchDesigner skill for twozero MCP (v2.0.0)
Major rewrite of the TouchDesigner skill:
- Replace custom API handler with twozero MCP (36 native tools)
- Add audio-reactive GLSL proven recipe (spectrum chain, pitfalls)
- Add recording checklist (FPS>0, non-black, audio cueing)
- Expand pitfalls: 38 entries from real sessions (was 20)
- Update network-patterns with MCP-native build scripts
- Rewrite mcp-tools reference for twozero v2.774+
- Update troubleshooting for MCP-based workflow
- Remove obsolete custom_api_handler.py
- Generalize Environment section for all users
- Remove session-specific Paired Skills section
- Bump version to 2.0.0
2026-04-18 17:43:42 -07:00
kshitijk4poor 7a5371b20d feat: add TouchDesigner integration skill
New skill: creative/touchdesigner — control a running TouchDesigner
instance via REST API. Build real-time visual networks programmatically.

Architecture:
  Hermes Agent -> HTTP REST (curl) -> TD WebServer DAT -> TD Python env

Key features:
- Custom API handler (scripts/custom_api_handler.py) that creates a
  self-contained WebServer DAT + callback in TD. More reliable than the
  official mcp_webserver_base.tox which frequently fails module imports.
- Discovery-first workflow: never hardcode TD parameter names. Always
  probe the running instance first since names change across versions.
- Persistent setup: save the TD project once with the API handler baked
  in. TD auto-opens the last project on launch, so port 9981 is live
  with zero manual steps after first-time setup.
- Works via curl in execute_code (no MCP dependency required).
- Optional MCP server config for touchdesigner-mcp-server npm package.

Skill structure (2823 lines total):
  SKILL.md (209 lines) — setup, workflow, key rules, operator reference
  references/pitfalls.md (276 lines) — 24 hard-won lessons
  references/operators.md (239 lines) — all 6 operator families
  references/network-patterns.md (589 lines) — audio-reactive, generative,
    video processing, GLSL, instancing, live performance recipes
  references/mcp-tools.md (501 lines) — 13 MCP tool schemas
  references/python-api.md (443 lines) — TD Python scripting patterns
  references/troubleshooting.md (274 lines) — connection diagnostics
  scripts/custom_api_handler.py (140 lines) — REST API handler for TD
  scripts/setup.sh (152 lines) — prerequisite checker

Tested on TouchDesigner 099 Non-Commercial (macOS/darwin).
2026-04-18 17:43:42 -07:00
Teknium c49a58a6d0 fix(gateway): mark only still-running sessions resume_pending on drain timeout (#12332)
Follow-up to #12301.

The drain-timeout branch of _stop_impl() was iterating the drain-start
snapshot (active_agents) when marking sessions resume_pending. That
snapshot can include sessions that finished gracefully during the drain
window — marking them would give their next turn a stray
'your previous turn was interrupted by a gateway restart' system note
even though the prior turn actually completed cleanly.

Iterate self._running_agents at timeout time instead, mirroring
_interrupt_running_agents() exactly:
- only sessions still blocking the shutdown get marked
- pending sentinels (AIAgent construction not yet complete) are skipped

Changes:
- gateway/run.py: swap active_agents.keys() for filtered
  self._running_agents.items() iteration in the drain-timeout mark loop.
- tests/gateway/test_restart_resume_pending.py: two regression tests —
  finisher-during-drain not marked, pending sentinel not marked.
2026-04-18 17:40:34 -07:00
Teknium cb4addacab fix(gateway): auto-resume sessions after drain-timeout restart (#11852) (#12301)
The shutdown banner promised "send any message after restart to resume
where you left off" but the code did the opposite: a drain-timeout
restart skipped the .clean_shutdown marker, which made the next startup
call suspend_recently_active(), which marked the session suspended,
which made get_or_create_session() spawn a fresh session_id with a
'Session automatically reset. Use /resume...' notice — contradicting
the banner.

Introduce a resume_pending state on SessionEntry that is distinct from
suspended. Drain-timeout shutdown flags active sessions resume_pending
instead of letting startup-wide suspension destroy them. The next
message on the same session_key preserves the session_id, reloads the
transcript, and the agent receives a reason-aware restart-resume
system note that subsumes the existing tool-tail auto-continue note
(PR #9934).

Terminal escalation still flows through the existing
.restart_failure_counts stuck-loop counter (PR #7536, threshold 3) —
no parallel counter on SessionEntry. suspended still wins over
resume_pending in get_or_create_session() so genuinely stuck sessions
converge to a clean slate.

Spec: PR #11852 (BrennerSpear). Implementation follows the spec with
the approved correction (reuse .restart_failure_counts rather than
adding a resume_attempts field).

Changes:
- gateway/session.py: SessionEntry.resume_pending/resume_reason/
  last_resume_marked_at + to_dict/from_dict; SessionStore
  .mark_resume_pending()/clear_resume_pending(); get_or_create_session()
  returns existing entry when resume_pending (suspended still wins);
  suspend_recently_active() skips resume_pending entries.
- gateway/run.py: _stop_impl() drain-timeout branch marks active
  sessions resume_pending before _interrupt_running_agents();
  _run_agent() injects reason-aware restart-resume system note that
  subsumes the tool-tail case; successful-turn cleanup also clears
  resume_pending next to _clear_restart_failure_count();
  _notify_active_sessions_of_shutdown() softens the restart banner to
  'I'll try to resume where you left off' (honest about stuck-loop
  escalation).
- tests/gateway/test_restart_resume_pending.py: 29 new tests covering
  SessionEntry roundtrip, mark/clear helpers, get_or_create_session
  precedence (suspended > resume_pending), suspend_recently_active
  skip, drain-timeout mark reason (restart vs shutdown), system-note
  injection decision tree (including tool-tail subsumption), banner
  wording, and stuck-loop escalation override.
2026-04-18 17:32:17 -07:00
brooklyn! ad99e32371 Merge pull request #12312 from NousResearch/bb/tui-ux-pack
feat(tui): UX pack — stable picker keys, /clear confirm, light-theme preset
2026-04-18 18:13:06 -05:00
Brooklyn Nicholson df5ca5065f feat(tui): replace /clear double-press gate with a proper confirm overlay
The time-window gate felt wrong — users would hit /clear, read the
prompt, retype, and consistently blow past the window. Swapping to a
real yes/no overlay that blocks input like the existing Approval and
Clarify prompts.

- add ConfirmReq type + OverlayState.confirm + $isBlocked coverage
- ConfirmPrompt component (prompts.tsx): cancel row on top as the
  default, danger-coloured confirm row on the bottom, Y/N hotkeys,
  Enter on default = cancel, Esc/Ctrl+C cancel
- wire into PromptZone (appOverlays.tsx)
- /clear + /new now push onto the overlay instead of arming a timer
- HERMES_TUI_NO_CONFIRM=1 still skips the prompt for scripting
- drop the destructiveGate + createSlashHandler reset wiring
  (destructive.ts and its tests removed)

Refs #4069.
2026-04-18 18:04:08 -05:00
Brooklyn Nicholson 75377feb07 fix(tui): make /clear confirm window humane (3s → 30s, reset on other slash)
The 3s gate was too tight — users reading the prompt and retyping
consistently blow past it and get stuck in a loop ("press /clear
again within 3s" forever). Fixes:

- bump CONFIRM_WINDOW_MS 3_000 → 30_000
- drop the time number from the confirmation message to remove the
  pressure vibe: "press /clear again to confirm — starts a new session"
- reset the gate from createSlashHandler whenever any non-destructive
  slash command runs, so stale arming from 20s ago can't silently
  turn the next /clear into an unintended confirm
- export the gate + isDestructiveCommand helper for that wiring
- add armed() introspection method

Follow-up to #4069 / 3366714b.
2026-04-18 17:55:53 -05:00
Brooklyn Nicholson 20eab355e7 feat(tui): add LIGHT_THEME preset for white/light terminal backgrounds
Splits the existing palette into DARK_THEME (current yellow-heavy
default) and LIGHT_THEME (darker browns + proper contrast on white).
DEFAULT_THEME aliases DARK_THEME, and flips to LIGHT_THEME when
HERMES_TUI_LIGHT=1 is set at launch.

Skin system (fromSkin) still layers on top of whichever preset is
active, so users can keep customizing on top of either palette.

Refs #11300.
2026-04-18 17:49:40 -05:00
Brooklyn Nicholson 3366714ba4 feat(tui): double-press confirm on /clear and /new
Prevents accidental session loss: the first press prints
"press /clear again within 3s to confirm"; a second press inside
the window actually starts a new session. Outside the window the
gate re-arms.

Opt out with HERMES_TUI_NO_CONFIRM=1 for scripted / muscle-memory
workflows.

Refs #4069.
2026-04-18 17:48:34 -05:00
Brooklyn Nicholson 52124384de fix(tui): stable React keys in /model picker rows
Use provider.slug (and a composite key for model rows) instead of the
rendered string, so dupes in the backend response can't collapse two
rows into one or trigger key-collision warnings.
2026-04-18 17:47:26 -05:00
brooklyn! db59c190c1 Merge pull request #12305 from NousResearch/bb/tui-status-git-branch
feat(tui): append git branch to cwd label in status bar
2026-04-18 17:27:40 -05:00
brooklyn! c0edcf2d53 Merge pull request #12306 from NousResearch/bb/tui-model-picker-dedupe-names
fix(tui): disambiguate /model picker rows when provider display names collide
2026-04-18 17:27:31 -05:00
Brooklyn Nicholson 4aa52590d8 fix(tui): disambiguate /model picker rows when provider display names collide
If the gateway returns two providers that resolve to the same display name
(e.g. `kimi-coding` and `kimi-coding-cn` both → "Kimi For Coding"), the
picker now appends the slug so users can tell them apart, in both the
provider list and the selected-provider header. No-op when names are
already unique.

Refs #10526 — the Python backend dedupe from #10599 skips one alias, but
user-defined providers, canonical overlays, and future regressions can
still surface as indistinguishable rows in the picker. This is a
client-side safety net on top of that.
2026-04-18 17:22:23 -05:00
Brooklyn Nicholson ff2aa7ccd7 feat(tui): append git branch to cwd label in status bar
Adds useGitBranch hook (async, cached, 15s TTL) and fmtCwdBranch
helper so the footer shows `~/repo (main)` instead of just `~/repo`.
Degrades silently when git is unavailable or cwd is outside a repo.

Partial fix for #12267 (TUI portion; #12277 covers the Python side).
2026-04-18 17:17:05 -05:00
Teknium 0175ff7516 feat(skills): replace xitter with xurl — the official X API CLI (#12303)
Swap the social-media/xitter skill (third-party wrapper around
Infatoshi/x-cli) for a new social-media/xurl skill wrapping
xdevplatform/xurl — the official X API CLI from the X developer
platform team.

Why:
- xurl is officially maintained by the X dev platform team
- OAuth 2.0 PKCE with auto-refresh + multi-app / multi-user support
  (vs. xitter's 5-env-var OAuth 1.0a + single account)
- Credentials stored in ~/.xurl managed by xurl itself — no manual
  env var juggling for users
- Substantially larger API surface: DMs, follows, blocks, mutes,
  media upload, streaming, and raw v2 endpoint access
- Ships stronger agent-safety guardrails (forbidden-flag list,
  no --verbose in agent mode, never-read-~/.xurl rule)

Adaptation:
- Ported the openclaw SKILL.md (which the xdevplatform team seeded)
  to Hermes frontmatter conventions (prerequisites.commands, platforms,
  metadata.hermes.tags/homepage) — dropped openclaw-specific metadata
- Added a Hermes-oriented one-time user setup section so the agent
  knows to direct the user to run auth commands themselves, never
  execute them with inline secrets
- Preserved the mandatory secret-safety rules verbatim
- Attribution block credits xdevplatform, openclaw, and the Hermes
  port

Docs: updated website/docs/reference/skills-catalog.md to replace
the xitter row with xurl.
2026-04-18 15:11:32 -07:00
Teknium 6a3a6a0fb6 Merge pull request #12263 from NousResearch/bb/tui-audit-followup
fix(tui): TUI v2 audit follow-up — registry, overlays, paste, reasoning, hyperlinks
2026-04-18 14:40:16 -07:00
helix4u 4e8f60fd11 fix(cli): use display width for wrapped spinner height 2026-04-18 14:34:05 -07:00
Brooklyn Nicholson fb06bc67de fix(tui): Ctrl+C with input selection actually preserves input (lift handler to app level)
Previous fix in 9dbf1ec6 handled Ctrl+C inside textInput but the APP-level
useInputHandlers fires the same keypress in a separate React hook and ran
clearIn() regardless. Net effect: the OSC 52 copy succeeded but the input
wiped right after, so Brooklyn only noticed the wipe.

Lift the selection-aware Ctrl+C to a single place by threading input
selection state through a new nanostore (src/app/inputSelectionStore.ts).
textInput syncs its derived `selected` range + a clear() callback to the
store on every selection change, and the app-level Ctrl+C handler reads
the store before its clear/interrupt/die chain:

  - terminal-level selection (scrollback) → copy, existing behavior
  - in-input selection present → copy + clear selection, preserve input
  - input has text, no selection → clearIn(), existing behavior
  - empty + busy → interrupt turn
  - empty + idle → die

textInput no longer has its own Ctrl+C block; keypress falls through to
app-level like it did before 9dbf1ec6.
2026-04-18 16:28:51 -05:00
Brooklyn Nicholson bfac5d039d Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/tui-audit-followup 2026-04-18 15:27:40 -05:00
Brooklyn Nicholson 17e95a26b7 fix(tui): render /skills browse as a formatted Panel instead of raw JSON
Previous handler dumped the raw skills.manage response into a pager, which
was unreadable and hid the pagination metadata. Also silently accepted
non-numeric page args.

Now:
- validates page arg (rejects NaN / <1 with a usage message)
- shows "fetching community skills (scans 6 sources, may take ~15s)…" up
  front so the 10-30s hub fetch isn't a silent hang
- renders items as {name · trust, description (truncated 160 chars)} rows
  in the existing Panel component
- footer shows "page X of Y · N skills total · /skills browse N+1 for more"
  when the server returned pagination metadata

Skills hub's remote fetch latency is a separate upstream issue
(browse_skills hits 6 sources sequentially) — client-side we just stop
misrepresenting it.
2026-04-18 15:22:43 -05:00
Brooklyn Nicholson 7e9a098574 chore: uptick 2026-04-18 15:17:42 -05:00
Brooklyn Nicholson 450ded98db chore(tui): prettier whitespace on files touched in this branch 2026-04-18 15:13:31 -05:00
Brooklyn Nicholson 93b4080b78 Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/tui-audit-followup
# Conflicts:
#	ui-tui/src/components/markdown.tsx
#	ui-tui/src/types/hermes-ink.d.ts
2026-04-18 14:52:54 -05:00
helix4u ca32a2a60b fix(gemini): restore bearer auth on openai route 2026-04-18 12:52:01 -07:00
helix4u a7dd6a3449 fix(gemini): hide stale and low-TPM Google models 2026-04-18 12:52:01 -07:00
helix4u 2eab7ee15f fix(gemini): hide low-TPM Gemma models from exposed lists 2026-04-18 12:52:01 -07:00
LVT382009 f7af90e2da fix: wire _ephemeral_max_output_tokens into chat_completions and add NVIDIA NIM default
Based on #12152 by @LVT382009.

Two fixes to run_agent.py:

1. _ephemeral_max_output_tokens consumption in chat_completions path:
   The error-recovery ephemeral override was only consumed in the
   anthropic_messages branch of _build_api_kwargs.  All chat_completions
   providers (OpenRouter, NVIDIA NIM, Qwen, Alibaba, custom, etc.)
   silently ignored it.  Now consumed at highest priority, matching the
   anthropic pattern.

2. NVIDIA NIM max_tokens default (16384):
   NVIDIA NIM falls back to a very low internal default when max_tokens
   is omitted, causing models like GLM-4.7 to truncate immediately
   (thinking tokens exhaust the budget before the response starts).

3. Progressive length-continuation boost:
   When finish_reason='length' triggers a continuation retry, the output
   budget now grows progressively (2x base on retry 1, 3x on retry 2,
   capped at 32768) via _ephemeral_max_output_tokens.  Previously the
   retry loop just re-sent the same token limit on all 3 attempts.
2026-04-18 12:51:30 -07:00
jarvischer 0f778f7768 fix: prevent tool name duplication in streaming accumulator (MiniMax/NVIDIA NIM)
Based on #11984 by @maxchernin.  Fixes #8259.

Some providers (MiniMax M2.7 via NVIDIA NIM) resend the full function
name in every streaming chunk instead of only the first.  The old
accumulator used += which concatenated them into 'read_fileread_file'.

Changed to simple assignment (=), matching the OpenAI Node SDK, LiteLLM,
and Vercel AI SDK patterns.  Function names are atomic identifiers
delivered complete — no provider splits them across chunks, so
concatenation was never correct semantics.
2026-04-18 12:50:32 -07:00
Brooklyn Nicholson 4caf6c23dd fix(tui): strip <think>…</think> tags from assistant content and route to reasoning panel
Models that emit reasoning inline as <think>/<reasoning>/<thinking>/<thought>/
<REASONING_SCRATCHPAD> tags in the content field (rather than a separate API
reasoning channel) had the raw tags + inner content shown twice: once as body
text with literal <think> markers, and again in the thinking panel when the
reasoning field was populated.

Port v1's tag set to lib/reasoning.ts with a splitReasoning(text) helper that
returns { reasoning, text }. Applied in three spots:

  - scheduleStreaming: strips tags from the live streaming view so the user
    never sees <think> mid-turn.
  - flushStreamingSegment: when a tool interrupts assistant output mid-turn,
    the saved segment is the stripped text; extracted reasoning promotes to
    reasoningText if the API channel hasn't already populated it.
  - recordMessageComplete: final message text is split, extracted reasoning
    merges with any existing reasoning (API channel wins on conflicts so we
    don't double-count when both are present).
2026-04-18 14:46:38 -05:00
Brooklyn Nicholson 37cba82bfc fix(tui): Ctrl+C on in-input selection copies to clipboard instead of clearing
Before: textInput explicitly ignored Ctrl+C so the app-level handler took
over — with no knowledge of the TextInput's own selection — and fell through
to clearIn() whenever input had text. Selecting part of the composer and
pressing Ctrl+C silently nuked everything you typed.

Now: Ctrl+C with an active in-input selection writes the selected substring
to the clipboard via OSC 52 and clears the selection. The original semantics
(Ctrl+C with no selection → app-level interrupt/clear/die chain) are
preserved by still returning early in that case.
2026-04-18 14:42:03 -05:00
Teknium 0bebf5b948 chore(attribution): add AUTHOR_MAP entry for Honghua Yang (honghua) 2026-04-18 12:40:56 -07:00
Honghua Yang 3128d9fcd2 fix(context_compressor): keep tool-call arguments JSON valid when shrinking
Pass 3 of `_prune_old_tool_results` previously shrunk long `function.arguments`
blobs by slicing the raw JSON string at byte 200 and appending the literal
text `...[truncated]`. That routinely produced payloads like::

    {"path": "/foo.md", "content": "# Long markdown
    ...[truncated]

— an unterminated string with no closing brace. Strict providers (observed
on MiniMax) reject this as `invalid function arguments json string` with a
non-retryable 400. Because the broken call survives in the session history,
every subsequent turn re-sends the same malformed payload and gets the same
400, locking the session into a re-send loop until the call falls out of
the window.

Fix: parse the arguments first, shrink long string leaves inside the parsed
structure, and re-serialise. Non-string values (paths, ints, booleans, lists)
pass through intact. Arguments that are not valid JSON to begin with (rare,
some backends use non-JSON tool args) are returned unchanged rather than
replaced with something neither we nor the provider can parse.

Observed in the wild: a `write_file` with ~800 chars of markdown `content`
triggered this on a real session against MiniMax-M2.7; every turn after
compression got rejected until the session was manually reset.

Tests:
- 7 direct tests of `_truncate_tool_call_args_json` covering valid-JSON
  output, non-JSON pass-through, nested structures, non-string leaves,
  scalar JSON, and Unicode preservation
- 1 end-to-end test through `_prune_old_tool_results` Pass 3 that
  reproduces the exact failure payload shape from the incident

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:40:56 -07:00
Brooklyn Nicholson 5c8b291607 fix(tui): wrap markdown links in Link so Ghostty/iTerm/kitty get real OSC 8 hyperlinks
renderLink was discarding the URL entirely — it rendered the label as amber
underlined text and dropped the href. Result: Cmd+Click / Ctrl+Click did
nothing in any terminal, including Ghostty.

Now both markdown links `[label](url)` and bare `https://…` URLs are wrapped
in @hermes/ink's Link component, which emits OSC 8 (\\x1b]8;;url\\x07label\\x1b]8;;\\x07)
when supportsHyperlinks() returns true. ADDITIONAL_HYPERLINK_TERMINALS already
includes ghostty, iTerm2, kitty, alacritty, Hyper.

Autolinks that look like bare emails (foo@bar.com) now prepend mailto: in the
href so they open the mail client correctly.

Also adds a typed declaration for Link in hermes-ink.d.ts.
2026-04-18 14:39:24 -05:00
Brooklyn Nicholson a7f4d756b7 fix(tui): cap approval prompt command preview at 10 lines
Large inline scripts (e.g. Python code_execution bodies) rendered as a single
unbounded <Text> block, pushing the Allow/Deny options below the visible
viewport. Users had to scroll the terminal to vote.

Preview now shows the first 10 lines with truncate-end wrap per line and a
dim "… +N more lines" indicator. Full text remains in the transcript above.
2026-04-18 14:36:34 -05:00
Teknium b73ebfee30 chore(attribution): add AUTHOR_MAP entry for Jim Liu (JimLiu)
Maps junminliu@gmail.com → JimLiu for the baoyu-infographic skill port
co-author attribution.
2026-04-18 12:32:16 -07:00
Teknium ade7958f1f docs: add PORT_NOTES.md for baoyu-infographic
Documents what changed from upstream and how to sync future updates.
2026-04-18 12:32:16 -07:00
Teknium 65c0a30a77 feat(skills): add baoyu-infographic skill — 21 layouts × 21 styles
Port of baoyu-infographic from JimLiu/baoyu-skills (v1.56.1) adapted
for Hermes Agent's tool ecosystem.

Adaptations from upstream:
- Frontmatter: openclaw metadata → hermes metadata
- Usage: slash command syntax → natural language triggers
- Removed EXTEND.md config system (not part of Hermes infrastructure)
- AskUserQuestion → clarify tool (one question at a time)
- Image generation → image_generate tool
- Removed Windows-specific paths
- Simplified file operations to use Hermes file tools
- All 45 reference files (layouts, styles, templates) preserved intact

Attribution preserved per agreement with 宝玉 (Jim Liu):
- author, version, GitHub homepage URL in frontmatter

Co-authored-by: Jim Liu 宝玉 <junminliu@gmail.com>
2026-04-18 12:32:16 -07:00
Siddharth Balyan a828daa7f8 perf(docker): layer-cache npm/Playwright and skip redundant web rebuild (#12225)
* perf(docker): layer-cache npm/Playwright and skip redundant web rebuild

Copy package manifests before source so npm install + Playwright only
re-run when lockfiles change. Use COPY --chown instead of chown -R,
set HERMES_WEB_DIST to skip runtime web rebuild, and drop the
USER root / chmod dance since entrypoint.sh is already executable in git.

* Update Dockerfile
2026-04-18 22:44:31 +05:30
bluefishs b0bde98b0f fix(docker): build web/ dashboard assets in image (#12180)
The Dockerfile installs root-level npm dependencies (for Playwright) and the
whatsapp-bridge bundle, but never builds the web/ Vite project. As a result,
'hermes dashboard' starts FastAPI on :9119 but serves a broken SPA because
hermes_cli/web_dist/ is empty and requests to /assets/index-<hash>.js 404.

Add a build step inside web/ so the Vite output is baked into the image.

Reproduce (before):
  docker build -t hermes-repro -f Dockerfile .
  docker run --rm -p 9119:9119 hermes-repro hermes dashboard
  curl -sI http://localhost:9119/assets/ | head -1   # -> 404

After: /assets/ returns the built asset path.
2026-04-18 22:20:24 +05:30
kshitij c14b3b5880 fix(kimi): force fixed temperature on kimi-k2.* models (k2.5, thinking, turbo) (#12144)
* fix(kimi): force fixed temperature on kimi-k2.* models (k2.5, thinking, turbo)

The prior override only matched the literal model name "kimi-for-coding",
but Moonshot's coding endpoint is hit with real model IDs such as
`kimi-k2.5`, `kimi-k2-turbo-preview`, `kimi-k2-thinking`, etc.  Those
requests bypassed the override and kept the caller's temperature, so
Moonshot returns HTTP 400 "invalid temperature: only 0.6 is allowed for
this model" (or 1.0 for thinking variants).

Match the whole kimi-k2.* family:
  * kimi-k2-thinking / kimi-k2-thinking-turbo -> 1.0 (thinking mode)
  * all other kimi-k2.* -> 0.6 (non-thinking / instant mode)

Also accept an optional vendor prefix (e.g. `moonshotai/kimi-k2.5`) so
aggregator routings are covered.

* refactor(kimi): whitelist-match kimi coding models instead of prefix

Addresses review feedback on PR #12144.

- Replace `startswith("kimi-k2")` with explicit frozensets sourced from
  Moonshot's kimi-for-coding model list.  The prefix match would have also
  clamped `kimi-k2-instruct` / `kimi-k2-instruct-0905`, which are the
  separate non-coding K2 family with variable temperature (recommended 0.6
  but not enforced — see huggingface.co/moonshotai/Kimi-K2-Instruct).
- Confirmed via platform.kimi.ai docs that all five coding models
  (k2.5, k2-turbo-preview, k2-0905-preview, k2-thinking, k2-thinking-turbo)
  share the fixed-temperature lock, so the preview-model mapping is no
  longer an assumption.
- Drop the fragile `"thinking" in bare` substring test for a set lookup.
- Log a debug line on each override so operators can see when Hermes
  silently rewrites temperature.
- Update class docstring.  Extend the negative test to parametrize over
  kimi-k2-instruct, Kimi-K2-Instruct-0905, and a hypothetical future
  kimi-k2-experimental name — all must keep the caller's temperature.
2026-04-18 09:35:51 -07:00
kshitijk4poor 656c375855 fix(tui): review follow-up — /retry, /plan, ANSI truncation, caching
- /retry: use session['history'] instead of non-existent
  agent.conversation_history; truncate history at last user message
  to match CLI retry_last() behavior; add history_lock safety
- /plan: pass user instruction (arg) to build_plan_path instead of
  session_key; add runtime_note so agent knows where to save the plan
- ANSI tool results: render full text via <Ansi wrap=truncate-end>
  instead of slicing raw ANSI through compactPreview (which cuts
  mid-escape-sequence producing garbled output)
- Move _PENDING_INPUT_COMMANDS frozenset to module level
- Use get_skill_commands() (cached) instead of scan_skill_commands()
  (rescans disk) in slash.exec skill interception
- Add 3 retry tests: happy path with history truncation verification,
  empty history error, multipart content extraction
- Update test mock target from scan_skill_commands to get_skill_commands
2026-04-18 09:30:48 -07:00
kshitijk4poor abc95338c2 fix(tui): slash.exec _pending_input commands, tool ANSI, terminal title
Additional TUI fixes discovered in the same audit:

1. /plan slash command was silently lost — process_command() queues the
   plan skill invocation onto _pending_input which nobody reads in the
   slash worker subprocess.  Now intercepted in slash.exec and routed
   through command.dispatch with a new 'send' dispatch type.

   Same interception added for /retry, /queue, /steer as safety nets
   (these already have correct TUI-local handlers in core.ts, but the
   server-side guard prevents regressions if the local handler is
   bypassed).

2. Tool results were stripping ANSI escape codes — the messageLine
   component used stripAnsi() + plain <Text> for tool role messages,
   losing all color/styling from terminal, search_files, etc.  Now
   uses <Ansi> component (already imported) when ANSI is detected.

3. Terminal tab title now shows model + busy status via useTerminalTitle
   hook from @hermes/ink (was never used).  Users can identify Hermes
   tabs and see at a glance whether the agent is busy or ready.

4. Added 'send' variant to CommandDispatchResponse type + asCommandDispatch
   parser + createSlashHandler handler for commands that need to inject
   a message into the conversation (plan, queue fallback, steer fallback).
2026-04-18 09:30:48 -07:00
kshitijk4poor 2da558ec36 fix(tui): clickable hyperlinks and skill slash command dispatch
Two TUI fixes:

1. Hyperlinks are now clickable (Cmd+Click / Ctrl+Click) in terminals
   that support OSC 8.  The markdown renderer was rendering links as
   plain colored text — now wraps them in the existing <Link> component
   from @hermes/ink which emits OSC 8 escape sequences.

2. Skill slash commands (e.g. /hermes-agent-dev) now work in the TUI.
   The slash.exec handler was delegating to the _SlashWorker subprocess
   which calls cli.process_command().  For skills, process_command()
   queues the invocation message onto _pending_input — a Queue that
   nobody reads in the worker subprocess.  The skill message was lost.
   Now slash.exec detects skill commands early and rejects them so
   the TUI falls through to command.dispatch, which correctly builds
   and returns the skill payload for the client to send().
2026-04-18 09:30:48 -07:00
Siddharth Balyan b0efdf37d7 fix(nix): upgrade Python 3.11 → 3.12, add cross-platform eval check (#12208) 2026-04-18 21:51:03 +05:30
Siddharth Balyan 8a0c774e9e Add web dashboard build to Nix flake (#12194)
The web dashboard (Vite/React frontend) is now built as a separate Nix
derivation and baked into the Hermes package. The build output is
installed to a standard location and exposed via the `HERMES_WEB_DIST`
environment variable, allowing the dashboard command to use pre-built
assets when available (e.g., in packaged releases) instead of rebuilding
on every invocation.
2026-04-18 20:55:39 +05:30
Brooklyn Nicholson f8becbfbea feat(tui): per-language syntax highlighting in markdown code fences
Adds a minimal hand-rolled highlighter for ts/js/jsx/tsx, py, sh/bash, go, rust,
json, yaml, sql. Recognizes whole-line comments, single/double/backtick strings,
numbers, and per-language keyword sets. Unknown langs fall through to the current
plain rendering; the existing diff-specific colorization is preserved.

Closes the §8 "Markdown syntax highlighting is missing (only diff gets colored)"
finding from the TUI v2 audit without pulling in a highlighter library.
2026-04-18 09:48:38 -05:00
Brooklyn Nicholson 5e148ca3d0 fix(tui): route /skills subcommands through skills.manage instead of curses slash.exec
/skills install, inspect, search, browse, list now call the typed skills.manage RPC
and render results via panel/page. Previously they fell through to slash.exec which
invokes v1's curses code path — that hangs or crashes inside the Ink worker per the
§2 parity-audit finding.

Also drop Enter-as-install from the Skills Hub action stage since the Hub lists
locally installed skills; primary action is inspect-and-close. x still triggers a
manual reinstall for power users.
2026-04-18 09:46:36 -05:00
Brooklyn Nicholson 949b8f5521 feat(tui): register /skills slash command to open Skills Hub
Intercept bare /skills locally and flip overlay.skillsHub, so the
overlay opens instantly without waiting on slash.exec. /skills <args>
still forwards to slash.exec and paginates any output. Tests cover
both branches.
2026-04-18 09:42:57 -05:00
Brooklyn Nicholson ef284e021a feat(tui): add two-step SkillsHub overlay component
New SkillsHub mirrors ModelPicker's category → item → actions flow with
paginated 12-line lists, 1-9/0 quick-pick, Esc-back navigation, and
lazy skills.manage inspect/install calls. Mount it from appOverlays
when overlay.skillsHub is true.
2026-04-18 09:42:57 -05:00
Brooklyn Nicholson 6fbfae8f42 feat(tui): add skillsHub overlay state wiring
Extend OverlayState with a skillsHub flag, fold it into $isBlocked, and
teach Ctrl+C to close the overlay so later PRs can render the component
behind this slot.
2026-04-18 09:42:57 -05:00
Brooklyn Nicholson 3821323029 feat(tui): render per-MCP-server status block in SessionPanel 2026-04-18 09:42:57 -05:00
Brooklyn Nicholson b82ec6419d test(tui-gateway): cover mcp_servers field in _session_info output 2026-04-18 09:42:57 -05:00
Brooklyn Nicholson 202b78ec68 feat(tui-gateway): include per-MCP-server status in session.info payload 2026-04-18 09:42:57 -05:00
Brooklyn Nicholson fd6ffc777f feat(tui): honor display.* flags in turn renderer, status bar, and event handler
- turnController gates scheduleStreaming / reasoning recorders on
  streaming + showReasoning so disabling them keeps the buffer silent
  until message.complete flushes
- createGatewayEventHandler only surfaces inline_diff previews when
  inlineDiffs is on
- StatusRule takes a showCost prop and renders `· $X.XXXX` with the
  same toFixed(4) formatting as /usage when usage.cost_usd is present
- Usage grows cost_usd?: number to match the gateway payload
- Existing handler tests flip showReasoning on in beforeEach so
  reasoning-flow assertions keep their meaning
2026-04-18 09:42:57 -05:00
Brooklyn Nicholson 200c17433c feat(tui): read display.streaming / show_reasoning / show_cost / inline_diffs from config
Extends ConfigDisplayConfig and UiState so the four new display flags
flow from `config.get {key:"full"}` into the nanostore. applyDisplay is
exported to keep the fan-out testable without an Ink harness.

Defaults mirror v1 parity: streaming + inline_diffs default true
(opt-out via `=== false`), show_cost + show_reasoning default false
(opt-in via plain truthy check).
2026-04-18 09:42:57 -05:00
Brooklyn Nicholson 586b2f2089 feat(tui): persist large pastes to ~/.hermes/pastes/ via paste.collapse 2026-04-18 09:42:57 -05:00
Brooklyn Nicholson a397b0fd4d test(tui-gateway): assert quick_commands appear in commands.catalog output 2026-04-18 09:42:57 -05:00
Brooklyn Nicholson 5152e1ad86 feat(tui-gateway): surface config.quick_commands in commands.catalog 2026-04-18 09:42:57 -05:00
Brooklyn Nicholson 4e1ea79edc feat(tui): accept raw Ctrl+V as clipboard image paste fallback 2026-04-18 09:42:57 -05:00
Brooklyn Nicholson f0638f3596 fix(tui): split /model picker from /provider wizard to resolve registry collision 2026-04-18 09:42:57 -05:00
Siddharth Balyan 6fb69229ca fix(nix): fix build failures, TUI Node.js crash, and upgrade container to Node 22 (#12159)
* Add setuptools build dep for legacy alibabacloud packages and updated
stale npm-deps hash

* Add HERMES_NODE env var to pin Node.js version

The TUI requires Node.js 20+ for regex `/v` flag support (used by
string-width). Instead of relying on PATH lookup, explicitly set
HERMES_NODE to the bundled Node 22 in the Nix wrapper, and add a
fallback check in the Python code to use HERMES_NODE if available.

Also upgrade container provisioning to Node 22 via NodeSource (Ubuntu
24.04 ships Node 18 which is EOL) and add a Nix check to verify the
wrapper and Node version at build time.
2026-04-18 19:21:28 +05:30
Teknium 2edebedc9e feat(steer): /steer <prompt> injects a mid-run note after the next tool call (#12116)
* feat(steer): /steer <prompt> injects a mid-run note after the next tool call

Adds a new slash command that sits between /queue (turn boundary) and
interrupt. /steer <text> stashes the message on the running agent and
the agent loop appends it to the LAST tool result's content once the
current tool batch finishes. The model sees it as part of the tool
output on its next iteration.

No interrupt is fired, no new user turn is inserted, and no prompt
cache invalidation happens beyond the normal per-turn tool-result
churn. Message-role alternation is preserved — we only modify an
existing role:"tool" message's content.

Wiring
------
- hermes_cli/commands.py: register /steer + add to ACTIVE_SESSION_BYPASS_COMMANDS.
- run_agent.py: add _pending_steer state, AIAgent.steer(), _drain_pending_steer(),
  _apply_pending_steer_to_tool_results(); drain at end of both parallel and
  sequential tool executors; clear on interrupt; return leftover as
  result['pending_steer'] if the agent exits before another tool batch.
- cli.py: /steer handler — route to agent.steer() when running, fall back to
  the regular queue otherwise; deliver result['pending_steer'] as next turn.
- gateway/run.py: running-agent intercept calls running_agent.steer(); idle-agent
  path strips the prefix and forwards as a regular user message.
- tui_gateway/server.py: new session.steer JSON-RPC method.
- ui-tui: SessionSteerResponse type + local /steer slash command that calls
  session.steer when ui.busy, otherwise enqueues for the next turn.

Fallbacks
---------
- Agent exits mid-steer → surfaces in run_conversation result as pending_steer
  so CLI/gateway deliver it as the next user turn instead of silently dropping it.
- All tools skipped after interrupt → re-stashes pending_steer for the caller.
- No active agent → /steer reduces to sending the text as a normal message.

Tests
-----
- tests/run_agent/test_steer.py — accept/reject, concatenation, drain,
  last-tool-result injection, multimodal list content, thread safety,
  cleared-on-interrupt, registry membership, bypass-set membership.
- tests/gateway/test_steer_command.py — running agent, pending sentinel,
  missing steer() method, rejected payload, empty payload.
- tests/gateway/test_command_bypass_active_session.py — /steer bypasses
  the Level-1 base adapter guard.
- tests/test_tui_gateway_server.py — session.steer RPC paths.

72/72 targeted tests pass under scripts/run_tests.sh.

* feat(steer): register /steer in Discord's native slash tree

Discord's app_commands tree is a curated subset of slash commands (not
derived from COMMAND_REGISTRY like Telegram/Slack). /steer already
works there as plain text (routes through handle_message → base
adapter bypass → runner), but registering it here adds Discord's
native autocomplete + argument hint UI so users can discover and
type it like any other first-class command.
2026-04-18 04:17:18 -07:00
Teknium f9667331e5 docs(browser): improve /browser connect setup guidance (#12123)
- Note that /browser connect is CLI-only and won't work in gateways (WebUI, Telegram, Discord).
- Update the Chrome launch command to use a dedicated --user-data-dir, so port 9222 actually comes up even when Chrome is already running with the user's regular profile.
- Add --no-first-run --no-default-browser-check to skip the fresh-profile wizard.
- Explain why the dedicated user-data-dir matters.

Community tip via Karamjit Singh.

Co-authored-by: teknium1 <teknium@noreply.github.com>
2026-04-18 04:14:05 -07:00
Teknium 9527707f80 fix(signal): back off sendTyping spam for unreachable recipients (#12118)
base.py's _keep_typing refresh loop calls send_typing every ~2s while
the agent is processing. If signal-cli returns NETWORK_FAILURE for the
recipient (offline, unroutable, group membership lost), the unmitigated
path was a WARNING log every 2 seconds for as long as the agent stayed
busy — a user report showed 1048 warnings in 41 minutes for one
offline contact, plus the matching volume of pointless RPC traffic to
signal-cli.

- _rpc() accepts log_failures=False so callers can route repeated
  expected failures (typing) to DEBUG while keeping send/receive at
  WARNING.
- send_typing() tracks consecutive failures per chat. First failure
  still logs WARNING so transport issues remain visible; subsequent
  failures log at DEBUG. After three consecutive failures we skip the
  RPC during an exponential cooldown (16s, 32s, 60s cap) so we stop
  hammering signal-cli for a recipient it can't deliver to. A
  successful sendTyping resets the counters.
- _stop_typing_indicator() clears the backoff state so the next agent
  turn starts fresh.

E2E simulation against the reported 41-minute window: RPCs drop from
1230 to 45 (-96%), log lines from 1048 WARNINGs to 1 WARNING + 44
DEBUGs.

Credits kshitijk4poor (#12056) for the _rpc log_failures kwarg idea;
the broader restructure in that PR (nested per-chat loop inside
send_typing) is avoided here in favour of stateful backoff that
preserves base.py's existing _keep_typing architecture.
2026-04-18 04:13:32 -07:00
Teknium cf012a05d8 docs(terminal): warn against stacking watch_patterns + notify_on_complete on end-of-run markers (#12113)
Stacking both features on the same event produces duplicate, delayed
notifications — delivery is async and continues firing after the process
exits, so matches on end-of-run markers (SUMMARY, DONE, PASS) arrive
after the agent has already polled/waited and moved on.

Updates both the terminal tool JSON schema description and the
terminal_tool() function docstring to make the split explicit:

- watch_patterns: mid-process signals only (errors, readiness markers,
  intermediate steps you want to react to before the process exits)
- notify_on_complete: end-of-run completion signal

No behavioural change.
2026-04-18 03:53:21 -07:00
teknium1 3b69b2fd61 test(session-search): regression coverage for CJK LIKE fallback
Twelve tests under TestCJKSearchFallback guarding:
 - CJK detection across Chinese/Japanese/Korean/Hiragana/Katakana ranges
   (including the full Hangul syllables block \uac00-\ud7af, to catch
   the shorter-range typo from one of the duplicate PRs)
 - Substring match for multi-char Chinese, Japanese, Korean queries
 - Filter preservation (source_filter, exclude_sources, role_filter)
   in the LIKE path — guards against the SQL-builder bug from another
   duplicate PR where filter clauses landed after LIMIT/OFFSET
 - Snippet centered on the matched term (instr-based substr window),
   not the leading 200 chars of content
 - English fast-path untouched
 - Empty/no-match cases
 - Mixed CJK+English queries

Also:
 - hermes_state.py: LIKE-fallback snippet is now
   `substr(content, max(1, instr(content, ?) - 40), 120)`, centered on
   the match instead of the whole-content default. Credit goes to
   @iamagenius00 for the snippet idea in PR #11517.
 - scripts/release.py: add @iamagenius00 to AUTHOR_MAP so future
   release attribution resolves cleanly.

Refs #11511, #11516, #11517, #11541.

Co-authored-by: iamagenius00 <iamagenius00@users.noreply.github.com>
2026-04-18 01:57:57 -07:00
vominh1919 8826d9c197 fix: FTS5 LIKE fallback for CJK (Chinese/Japanese/Korean) queries
FTS5 default tokenizer splits CJK text character-by-character, causing
multi-character queries like '记忆断裂' to return 0 results.

This fix adds a LIKE fallback: when FTS5 returns no results and the
query contains CJK characters, retry with WHERE content LIKE '%query%'.
Preserves FTS5 performance for English queries.

Fixes #11511
2026-04-18 01:57:57 -07:00
Teknium a2c9f5d0a7 docs(execute_code): document project/strict execution modes (#12073)
Follow-up to PR #11971. Documents the new code_execution.mode config
key and what each mode actually does.

- user-guide/configuration.md: add mode: project to the yaml example,
  explain project vs strict and call out that security invariants are
  identical across modes.
- user-guide/features/code-execution.md: new 'Execution Mode' section
  with a comparison table and usage guidance; update the 'temporary
  directory' note so it reflects that script.py runs in the session
  CWD in project mode (staging dir stays on PYTHONPATH for imports);
  drop stale 'sandboxed' framing from the intro and skill-passthrough
  paragraph.
- getting-started/learning-path.md: update the one-line Code Execution
  summary to match (no longer 'sandboxed environments' — the default
  runs in the session's real working directory).

No code changes.
2026-04-18 01:53:09 -07:00
Teknium 8322b42c6c fix(streaming): surface dropped tool-call on mid-stream stall (#12072)
When streaming died after text was already delivered to the user but
before a tool-call's arguments finished streaming, the partial-stream
stub at the end of _interruptible_streaming_api_call silently set
`tool_calls=None` on the returned message and kept `finish_reason=stop`.
The agent treated the turn as complete, the session exited cleanly with
code 0, and the attempted action was lost with zero user-facing signal.

Live-observed Apr 2026 with MiniMax M2.7 on a ~6-minute audit task:
agent streamed 'Let me write the audit:', started emitting a write_file
tool call, MiniMax stalled for 240s mid-arguments, the stale-stream
detector killed the connection, the stub fired, session ended, no file
written, no error shown.

Fix: the streaming accumulator now records each tool-call's name into
`result['partial_tool_names']` as soon as the name is known. When the
stub builder fires after a partial delivery and finds any recorded tool
names, it appends a human-visible warning to the stub's content — and
also fires it as a live stream delta so the user sees it immediately,
not only in the persisted transcript. The next turn's model also sees
the warning in conversation history and can retry on its own. Text-only
partial streams keep the original bare-recovery behaviour (no warning).

Validation:
| Scenario                                    | Before                    | After                                       |
|---------------------------------------------|---------------------------|---------------------------------------------|
| Stream dies mid tool-call, text already sent | Silent exit, no indication | User sees ⚠ warning naming the dropped tool |
| Text-only partial stream                     | Bare recovered text       | Unchanged                                   |
| tests/run_agent/test_streaming.py            | 24 passed                 | 26 passed (2 new)                           |
2026-04-18 01:52:06 -07:00
Teknium 285bb2b915 feat(execute_code): add project/strict execution modes, default to project (#11971)
Weaker models (Gemma-class) repeatedly rediscover and forget that
execute_code uses a different CWD and Python interpreter than terminal(),
causing them to flip-flop on whether user files exist and to hit import
errors on project dependencies like pandas.

Adds a new 'code_execution.mode' config key (default 'project') that
brings execute_code into line with terminal()'s filesystem/interpreter:

  project (new default):
    - cwd       = session's TERMINAL_CWD (falls back to os.getcwd())
    - python    = active VIRTUAL_ENV/bin/python or CONDA_PREFIX/bin/python
                  with a Python 3.8+ version check; falls back cleanly to
                  sys.executable if no venv or the candidate fails
    - result    : 'import pandas' works, '.env' resolves, matches terminal()

  strict (opt-in):
    - cwd       = staging tmpdir (today's behavior)
    - python    = sys.executable (today's behavior)
    - result    : maximum reproducibility and isolation; project deps
                  won't resolve

Security-critical invariants are identical across both modes and covered by
explicit regression tests:

  - env scrubbing (strips *_API_KEY, *_TOKEN, *_SECRET, *_PASSWORD,
    *_CREDENTIAL, *_PASSWD, *_AUTH substrings)
  - SANDBOX_ALLOWED_TOOLS whitelist (no execute_code recursion, no
    delegate_task, no MCP from inside scripts)
  - resource caps (5-min timeout, 50KB stdout, 50 tool calls)

Deliberately avoids 'sandbox'/'isolated'/'cloud' language in tool
descriptions (regression from commit 39b83f34 where agents on local
backends falsely believed they were sandboxed and refused networking).

Override via env var: HERMES_EXECUTE_CODE_MODE=strict|project
2026-04-18 01:46:25 -07:00
Teknium 54e0eb24c0 docs: correctness audit — fix wrong values, add missing coverage (#11972)
Comprehensive audit of every reference/messaging/feature doc page against the
live code registries (PROVIDER_REGISTRY, OPTIONAL_ENV_VARS, COMMAND_REGISTRY,
TOOLSETS, tool registry, on-disk skills). Every fix was verified against code
before writing.

### Wrong values fixed (users would paste-and-fail)

- reference/environment-variables.md:
  - DASHSCOPE_BASE_URL default was `coding-intl.dashscope.aliyuncs.com/v1` \u2192
    actual `dashscope-intl.aliyuncs.com/compatible-mode/v1`.
  - MINIMAX_BASE_URL and MINIMAX_CN_BASE_URL defaults were `/v1` \u2192 actual
    `/anthropic` (Hermes calls MiniMax via its Anthropic Messages endpoint).
- reference/toolsets-reference.md MCP example used the non-existent nested
  `mcp: servers:` key \u2192 real key is the flat `mcp_servers:`.
- reference/skills-catalog.md listed ~20 bundled skills that no longer exist
  on disk (all moved to `optional-skills/`). Regenerated the whole bundled
  section from `skills/**/SKILL.md` \u2014 79 skills, accurate paths and names.
- messaging/slack.md ":::info" callout claimed Slack has no
  `free_response_channels` equivalent; both the env var and the yaml key are
  in fact read.
- messaging/qqbot.md documented `QQ_MARKDOWN_SUPPORT` as an env var, but the
  adapter only reads `extra.markdown_support` from config.yaml. Removed the
  env var row and noted config-only nature.
- messaging/qqbot.md `hermes setup gateway` \u2192 `hermes gateway setup`.

### Missing coverage added

- Providers: AWS Bedrock and Qwen Portal (qwen-oauth) \u2014 both in
  PROVIDER_REGISTRY but undocumented everywhere. Added sections to
  integrations/providers.md, rows to quickstart.md and fallback-providers.md.
- integrations/providers.md "Fallback Model" provider list now includes
  gemini, google-gemini-cli, qwen-oauth, xai, nvidia, ollama-cloud, bedrock.
- reference/cli-commands.md `--provider` enum and HERMES_INFERENCE_PROVIDER
  enum in env-vars now include the same set.
- reference/slash-commands.md: added `/agents` (alias `/tasks`) and `/copy`.
  Removed duplicate rows for `/snapshot`, `/fast` (\u00d72), `/debug`.
- reference/tools-reference.md: fixed "47 built-in tools" \u2192 52. Added
  `feishu_doc` and `feishu_drive` toolset sections.
- reference/toolsets-reference.md: added `feishu_doc` / `feishu_drive` core
  rows + all missing `hermes-<platform>` toolsets in the platform table
  (bluebubbles, dingtalk, feishu, qqbot, wecom, wecom-callback, weixin,
  homeassistant, webhook, gateway). Fixed the `debugging` composite to
  describe the actual `includes=[...]` mechanism.
- reference/optional-skills-catalog.md: added `fitness-nutrition`.
- reference/environment-variables.md: added NOUS_BASE_URL,
  NOUS_INFERENCE_BASE_URL, NVIDIA_API_KEY/BASE_URL, OLLAMA_API_KEY/BASE_URL,
  XAI_API_KEY/BASE_URL, MISTRAL_API_KEY, AWS_REGION/AWS_PROFILE,
  BEDROCK_BASE_URL, HERMES_QWEN_BASE_URL, DISCORD_ALLOWED_CHANNELS,
  DISCORD_PROXY, TELEGRAM_REPLY_TO_MODE, MATRIX_DEVICE_ID, MATRIX_REACTIONS,
  QQBOT_HOME_CHANNEL_NAME, QQ_SANDBOX.
- messaging/discord.md: documented DISCORD_ALLOWED_CHANNELS, DISCORD_PROXY,
  HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS and HERMES_DISCORD_TEXT_BATCH_SPLIT
  _DELAY_SECONDS (all actively read by the adapter).
- messaging/matrix.md: documented MATRIX_REACTIONS (default true).
- messaging/telegram.md: removed the redundant second Webhook Mode section
  that invented a `telegram.webhook_mode: true` yaml key the adapter does
  not read.
- user-guide/features/hooks.md: added `on_session_finalize` and
  `on_session_reset` (both emitted via invoke_hook but undocumented).
- user-guide/features/api-server.md: documented GET /health/detailed, the
  `/api/jobs/*` CRUD surface, POST /v1/runs, and GET /v1/runs/{id}/events
  (10 routes that were live but undocumented).
- user-guide/features/fallback-providers.md: added `approval` and
  `title_generation` auxiliary-task rows; added gemini, bedrock, qwen-oauth
  to the supported-providers table.
- user-guide/features/tts.md: "seven providers" \u2192 "eight" (post-xAI add
  oversight in #11942).
- user-guide/configuration.md: TTS provider enum gains `xai` and `gemini`;
  yaml example block gains `mistral:`, `gemini:`, `xai:` subsections.
  Auxiliary-provider enum now enumerates all real registry entries.
- reference/faq.md: stale AIAgent/config examples bumped from
  `nous/hermes-3-llama-3.1-70b` and `claude-sonnet-4.6` to
  `claude-opus-4.7`.

### Docs-site integrity

- guides/build-a-hermes-plugin.md referenced two nonexistent hooks
  (`pre_api_request`, `post_api_request`). Replaced with the real
  `on_session_finalize` / `on_session_reset` entries.
- messaging/open-webui.md and features/api-server.md had pre-existing
  broken links to `/docs/user-guide/features/profiles` (actual path is
  `/docs/user-guide/profiles`). Fixed.
- reference/skills-catalog.md had one `<1%` literal that MDX parsed as a
  JSX tag. Escaped to `&lt;1%`.

### False positives filtered out (not changed, verified correct)

- `/set-home` is a registered alias of `/sethome` \u2014 docs were fine.
- `hermes setup gateway` is valid syntax (`hermes setup \<section\>`);
  changed in qqbot.md for cross-doc consistency, not as a bug fix.
- Telegram reactions "disabled by default" matches code (default `"false"`).
- Matrix encryption "opt-in" matches code (empty env default \u2192 disabled).
- `pre_api_request` / `post_api_request` hooks do NOT exist in current code;
  documented instead the real `on_session_finalize` / `on_session_reset`.
- SIGNAL_IGNORE_STORIES is already in env-vars.md (subagent missed it).

Validation:
- `docusaurus build` \u2014 passes (only pre-existing nix-setup anchor warning).
- `ascii-guard lint docs` \u2014 124 files, 0 errors.
- 22 files changed, +317 / \u2212158.
2026-04-18 01:45:48 -07:00
Teknium 73bccc94c7 skills: consolidate mlops redundancies (gguf+llama-cpp, grpo+trl, guidance→optional) (#11965)
Three tightly-scoped built-in skill consolidations to reduce redundancy in
the available_skills listing injected into every system prompt:

1. gguf-quantization → llama-cpp (merged)
   GGUF is llama.cpp's format; two skills covered the same toolchain. The
   merged llama-cpp skill keeps the full K-quant table + imatrix workflow
   from gguf and the ROCm/benchmarks/supported-models sections from the
   original llama-cpp. All 5 reference files preserved.

2. grpo-rl-training → fine-tuning-with-trl (folded in)
   GRPO isn't a framework, it's a trainer inside TRL. Moved the 17KB
   deep-dive SKILL.md to references/grpo-training.md and the working
   template to templates/basic_grpo_training.py. TRL's GRPO workflow
   section now points to both. Atropos skill's related_skills updated.

3. guidance → optional-skills/mlops/
   Dropped from built-in. Outlines (still built-in) covers the same
   structured-generation ground with wider adoption. Listed in the
   optional catalog for users who specifically want Guidance.

Net: 3 fewer built-in skill lines in every system prompt, zero content
loss. Contributor authorship preserved via git rename detection.
2026-04-17 21:36:40 -07:00
Teknium 598cba62ad test: update stale tests to match current code (#11963)
Seven test files were asserting against older function signatures and
behaviors. CI has been red on main because of accumulated test debt
from other PRs; this catches the tests up.

- tests/agent/test_subagent_progress.py: _build_child_progress_callback
  now takes (task_index, goal, parent_agent, task_count=1); update all
  call sites and rewrite tests that assumed the old 'batch-only' relay
  semantics (now relays per-tool AND flushes a summary at BATCH_SIZE).
  Renamed test_thinking_not_relayed_to_gateway → test_thinking_relayed_to_gateway
  since thinking IS now relayed as subagent.thinking.
- tests/tools/test_delegate.py: _build_child_agent now requires
  task_count; add task_count=1 to all 8 call sites.
- tests/cli/test_reasoning_command.py: AIAgent gained _stream_callback;
  stub it on the two test agent helpers that use spec=AIAgent / __new__.
- tests/hermes_cli/test_cmd_update.py: cmd_update now runs npm install
  in repo root + ui-tui/ + web/ and 'npm run build' in web/; assert
  all four subprocess calls in the expected order.
- tests/hermes_cli/test_model_validation.py: dissimilar unknown models
  now return accepted=False (previously True with warning); update
  both affected tests.
- tests/tools/test_registry.py: include feishu_doc_tool and
  feishu_drive_tool in the expected builtin tool set.
- tests/gateway/test_voice_command.py: missing-voice-deps message now
  suggests 'pip install PyNaCl' not 'hermes-agent[messaging]'.

411/411 pass locally across these 7 files.
2026-04-17 21:35:30 -07:00
Teknium 5ff65dbf68 docs(execute_code): clarify that scripts run in their own temp dir, not session CWD (#11956)
Weaker models (Gemma-class) repeatedly rediscover and forget that execute_code's
working directory differs from terminal()/read_file()'s, leading to
os.path.exists('.env') returning False even though the file exists in the
session's CWD. They then bounce between 'the file exists' and 'the file is
missing' across tool calls.

Adds a 'Working directory' note to the execute_code schema description
pointing agents at absolute paths (os.path.expanduser) or terminal()/read_file()
for inspecting user files.

Carefully avoids the 'sandbox'/'isolated'/'cloud' language that commit
39b83f34 removed (it caused agents on local backends to refuse networking
tasks and save false sandbox beliefs to persistent memory). Purely factual
CWD guidance — no restriction implications.
2026-04-17 21:30:34 -07:00
Teknium c20e236b71 chore: map AviArora02-commits author email in release AUTHOR_MAP 2026-04-17 21:30:17 -07:00
AviArora02-commits 994faacce8 fix: suppress Authorization: Bearer for Gemini provider to prevent HTTP 400 (#7893) 2026-04-17 21:30:17 -07:00
Teknium 8a59f8a9ed fix(update): survive mid-update terminal disconnect (#11960)
hermes update no longer dies when the controlling terminal closes
(SSH drop, shell close) during pip install.  SIGHUP is set to SIG_IGN
for the duration of the update, and stdout/stderr are wrapped so writes
to a closed pipe are absorbed instead of cascading into process exit.
All update output is mirrored to ~/.hermes/logs/update.log so users can
see what happened after reconnecting.

SIGINT (Ctrl-C) and SIGTERM (systemd) are intentionally still honored —
those are deliberate cancellations, not accidents.  In gateway mode the
helper is a no-op since the update is already detached.

POSIX preserves SIG_IGN across exec(), so pip and git subprocesses
inherit hangup protection automatically — no changes to subprocess
spawning needed.
2026-04-17 21:29:24 -07:00
Teknium 1c352f6b1d docs(browser): expand Camofox persistence guide with troubleshooting (#11957)
The existing 'Persistent browser sessions' section had the correct config
snippet but users still hit the flag at the wrong config path, assumed
Hermes could force persistence when the server was ephemeral, and had no
way to verify the flag was actually taking effect.

Adds to that section:
- Warning admonition calling out the nested path vs top-level mistake.
- Explicit 'What Hermes does / does not do' split so users understand
  Hermes can only send a stable userId; the Camofox server must map it
  to a persistent profile.
- 5-step verification flow for confirming persistence works end-to-end.
- Reminder to restart Hermes after editing config.yaml.
- Where Hermes derives the stable userId (~/.hermes/browser_auth/camofox/)
  so users can reset or back up state.

Docs-only change.
2026-04-17 21:23:31 -07:00
Teknium 11a89cc032 docs: backfill coverage for recently-merged features (#11942)
Fills documentation gaps that accumulated as features merged ahead of their
docs updates. All additions are verified against code and the originating PRs.

Providers:
- Ollama Cloud (#10782) — new provider section, env vars, quickstart/fallback rows
- xAI Grok Responses API + TTS (#10783) — provider note, TTS table + config
- Google Gemini CLI OAuth (#11270) — quickstart/fallback/cli-commands entries
- NVIDIA NIM (#11774) — NVIDIA_API_KEY / NVIDIA_BASE_URL in env-vars reference
- HERMES_INFERENCE_PROVIDER enum updated

Messaging:
- DISCORD_ALLOWED_ROLES (#11608) — env-vars, discord.md access control section
- DingTalk QR device-flow (#11574) — wizard path in Option A + openClaw disclosure
- Feishu document comment intelligent reply (#11898) — full section + 3-tier access control + CLI

Skills / commands:
- concept-diagrams skill (#11363) — optional-skills-catalog entry
- /gquota (#11270) — slash-commands reference

Build: docusaurus build passes, ascii-guard lint 0 errors.
2026-04-17 21:22:11 -07:00
Teknium 45acd9beb5 fix(gateway): ignore redelivered /restart after PTB offset ACK fails (#11940)
When a Telegram /restart fires and PTB's graceful-shutdown `get_updates`
ACK call times out ("When polling for updates is restarted, updates may
be received twice" in gateway.log), the new gateway receives the same
/restart again and restarts a second time — a self-perpetuating loop.

Record the triggering update_id in `.restart_last_processed.json` when
handling /restart.  On the next process, reject a /restart whose
update_id <= the recorded one as a stale redelivery.  5-minute staleness
guard so an orphaned marker can't block a legitimately new /restart.

- gateway/platforms/base.py: add `platform_update_id` to MessageEvent
- gateway/platforms/telegram.py: propagate `update.update_id` through
  _build_message_event for text/command/location/media handlers
- gateway/run.py: write dedup marker in _handle_restart_command;
  _is_stale_restart_redelivery checks it before processing /restart
- tests/gateway/test_restart_redelivery_dedup.py: 9 new tests covering
  fresh restart, redelivery, staleness window, cross-platform,
  malformed-marker resilience, and no-update_id (CLI) bypass

Only active for Telegram today (the one platform with monotonic
cross-session update ordering); other platforms return False from
_is_stale_restart_redelivery and proceed normally.
2026-04-17 21:17:33 -07:00
Teknium c5c0bb9a73 fix: point optional-dep install hints at the venv's python (#11938)
Error messages that tell users to install optional extras now use
{sys.executable} -m pip install ... instead of a bare 'pip install
hermes-agent[extra]' string.  Under the curl installer, bare 'pip'
resolves to system pip, which either fails with PEP 668
externally-managed-environment or installs into the wrong Python.

Affects: hermes dashboard, hermes web server startup, mcp_serve,
hermes doctor Bedrock check, CLI voice mode, voice_mode tool runtime
error, Discord voice-channel join failure message.
2026-04-17 21:16:33 -07:00
Teknium 20f2258f34 fix(interrupt): propagate to concurrent-tool workers + opt-in debug trace (#11907)
* fix(interrupt): propagate to concurrent-tool workers + opt-in debug trace

interrupt() previously only flagged the agent's _execution_thread_id.
Tools running inside _execute_tool_calls_concurrent execute on
ThreadPoolExecutor worker threads whose tids are distinct from the
agent's, so is_interrupted() inside those tools returned False no matter
how many times the gateway called .interrupt() — hung ssh / curl / long
make-builds ran to their own timeout.

Changes:
- run_agent.py: track concurrent-tool worker tids in a per-agent set,
  fan interrupt()/clear_interrupt() out to them, and handle the
  register-after-interrupt race at _run_tool entry.  getattr fallback
  for the tracker so test stubs built via object.__new__ keep working.
- tools/environments/base.py: opt-in _wait_for_process trace (ENTER,
  per-30s HEARTBEAT with interrupt+activity-cb state, INTERRUPT
  DETECTED, TIMEOUT, EXIT) behind HERMES_DEBUG_INTERRUPT=1.
- tools/interrupt.py: opt-in set_interrupt() trace (caller tid, target
  tid, set snapshot) behind the same env flag.
- tests: new regression test runs a polling tool on a concurrent worker
  and asserts is_interrupted() flips to True within ~1s of interrupt().
  Second new test guards clear_interrupt() clearing tracked worker bits.

Validation: tests/run_agent/ all 762 pass; tests/tools/ interrupt+env
subset 216 pass.

* fix(interrupt-debug): bypass quiet_mode logger filter so trace reaches agent.log

AIAgent.__init__ sets logging.getLogger('tools').setLevel(ERROR) when
quiet_mode=True (the CLI default). This would silently swallow every
INFO-level trace line from the HERMES_DEBUG_INTERRUPT=1 instrumentation
added in the parent commit — confirmed by running hermes chat -q with
the flag and finding zero trace lines in agent.log even though
_wait_for_process was clearly executing (subprocess pid existed).

Fix: when HERMES_DEBUG_INTERRUPT=1, each traced module explicitly sets
its own logger level to INFO at import time, overriding the 'tools'
parent-level filter. Scoped to the opt-in case only, so production
(quiet_mode default) logs stay quiet as designed.

Validation: hermes chat -q with HERMES_DEBUG_INTERRUPT=1 now writes
'_wait_for_process ENTER/EXIT' lines to agent.log as expected.

* fix(cli): SIGTERM/SIGHUP no longer orphans tool subprocesses

Tool subprocesses spawned by the local environment backend use
os.setsid so they run in their own process group. Before this fix,
SIGTERM/SIGHUP to the hermes CLI killed the main thread via
KeyboardInterrupt but the worker thread running _wait_for_process
never got a chance to call _kill_process — Python exited, the child
was reparented to init (PPID=1), and the subprocess ran to its
natural end (confirmed live: sleep 300 survived 4+ min after SIGTERM
to the agent until manual cleanup).

Changes:
- cli.py _signal_handler (interactive) + _signal_handler_q (-q mode):
  route SIGTERM/SIGHUP through agent.interrupt() so the worker's poll
  loop sees the per-thread interrupt flag and calls _kill_process
  (os.killpg) on the subprocess group. HERMES_SIGTERM_GRACE (default
  1.5s) gives the worker time to complete its SIGTERM+SIGKILL
  escalation before KeyboardInterrupt unwinds main.
- tools/environments/base.py _wait_for_process: wrap the poll loop in
  try/except (KeyboardInterrupt, SystemExit) so the cleanup fires
  even on paths the signal handlers don't cover (direct sys.exit,
  unhandled KI from nested code, etc.). Emits EXCEPTION_EXIT trace
  line when HERMES_DEBUG_INTERRUPT=1.
- New regression test: injects KeyboardInterrupt into a running
  _wait_for_process via PyThreadState_SetAsyncExc, verifies the
  subprocess process group is dead within 3s of the exception and
  that KeyboardInterrupt re-raises cleanly afterward.

Validation:
| Before                                                  | After              |
|---------------------------------------------------------|--------------------|
| sleep 300 survives 4+ min as PPID=1 orphan after SIGTERM | dies within 2 s   |
| No INTERRUPT DETECTED in trace                          | INTERRUPT DETECTED fires + killing process group |
| tests/tools/test_local_interrupt_cleanup                | 1/1 pass          |
| tests/run_agent/test_concurrent_interrupt               | 4/4 pass          |
2026-04-17 20:39:25 -07:00
Teknium 607be54a24 fix(discord): forum channel media + polish
Extend forum support from PR #10145:

- REST path (_send_discord): forum thread creation now uploads media
  files as multipart attachments on the starter message in a single
  call. Previously media files were silently dropped on the forum
  path.
- Websocket media paths (_send_file_attachment, send_voice, send_image,
  send_animation — covers send_image_file, send_video, send_document
  transitively): forum channels now go through a new _forum_post_file
  helper that creates a thread with the file as starter content,
  instead of failing via channel.send(file=...) which forums reject.
- _send_to_forum chunk follow-up failures are collected into
  raw_response['warnings'] so partial-send outcomes surface.
- Process-local probe cache (_DISCORD_CHANNEL_TYPE_PROBE_CACHE) avoids
  GET /channels/{id} on every uncached send after the first.
- Dedup of TestSendDiscordMedia that the PR merge-resolution left
  behind.
- Docs: Forum Channels section under website/docs/user-guide/messaging/discord.md.

Tests: 117 passed (22 new for forum+media, probe cache, warnings).
2026-04-17 20:25:48 -07:00
ChimingLiu e5333e793c feat(discord): support forum channels 2026-04-17 20:25:48 -07:00
helix4u 148459716c fix(kimi): cover remaining fixed-temperature bypasses 2026-04-17 20:25:42 -07:00
Teknium 53e4a2f2c6 feat(update): warn about legacy hermes.service units during hermes update (#11918)
Follow-up to #11909: surface the legacy-unit warning where users are most
likely to see it. After a 'hermes update', if a pre-rename hermes.service
is still installed alongside the current hermes-gateway.service, print
the list of legacy units + the 'hermes gateway migrate-legacy' command.

Profile-safe: reuses _find_legacy_hermes_units() which is an explicit
allowlist of hermes.service only — profile units never match.
Platform-gated: only prints on systemd hosts (the rename is Linux-only).
Non-blocking: just prints, never prompts, so gateway-spawned
hermes update --gateway runs aren't affected.
2026-04-17 19:35:12 -07:00
Teknium 07db20c72d fix(gateway): detect legacy hermes.service + mark --replace SIGTERM as planned (#11909)
* fix(gateway): detect legacy hermes.service units from pre-rename installs

Older Hermes installs used a different service name (hermes.service) before
the rename to hermes-gateway.service. When both units remain installed, they
fight over the same bot token — after PR #5646's signal-recovery change,
this manifests as a 30-second SIGTERM flap loop between the two services.

Detection is an explicit allowlist (no globbing) plus an ExecStart content
check, so profile units (hermes-gateway-<profile>.service) and unrelated
third-party services named 'hermes' are never matched.

Wired into systemd_install, systemd_status, gateway_setup wizard, and the
main hermes setup flow — anywhere we already warn about scope conflicts now
also warns about legacy units.

* feat(gateway): add migrate-legacy command + install-time removal prompt

- New hermes_cli.gateway.remove_legacy_hermes_units() removes legacy
  unit files with stop → disable → unlink → daemon-reload. Handles user
  and system scopes separately; system scope returns path list when not
  running as root so the caller can tell the user to re-run with sudo.
- New 'hermes gateway migrate-legacy' subcommand (with --dry-run and -y)
  routes to remove_legacy_hermes_units via gateway_command dispatch.
- systemd_install now offers to remove legacy units BEFORE installing
  the new hermes-gateway.service, preventing the SIGTERM flap loop that
  hits users who still have pre-rename hermes.service around.

Profile units (hermes-gateway-<profile>.service) remain untouched in
all paths — the legacy allowlist is explicit (_LEGACY_SERVICE_NAMES)
and the ExecStart content check further narrows matches.

* fix(gateway): mark --replace SIGTERM as planned so target exits 0

PR #5646 made SIGTERM exit the gateway with code 1 so systemd's
Restart=on-failure revives it after unexpected kills. But when a user has
two gateway units fighting for the same bot token (e.g. legacy
hermes.service + hermes-gateway.service from a pre-rename install), the
--replace takeover itself becomes the 'unexpected' SIGTERM — the loser
exits 1, systemd revives it 30s later, and the cycle flaps indefinitely.

Before calling terminate_pid(), --replace now writes a short-lived marker
file naming the target PID + start_time. The target's shutdown_signal_handler
consumes the marker and, when it names this process, leaves
_signal_initiated_shutdown=False so the final exit code stays 0.

Staleness defences:
- PID + start_time combo prevents PID reuse matching an old marker
- Marker older than 60s is treated as stale and discarded
- Marker is unlinked on first read even if it doesn't match this process
- Replacer clears the marker post-loop + on permission-denied give-up
2026-04-17 19:27:58 -07:00
Teknium 38436eb4e3 chore(release): add pedh to AUTHOR_MAP 2026-04-17 19:26:53 -07:00
pedh 86fd0f846d docs(dingtalk): document AI Cards, emoji reactions, and display settings
- AI Cards: how to configure ``card_template_id`` for streaming rich replies
- Emoji reactions: 🤔Thinking → 🥳Done lifecycle
- Per-platform display settings (streaming, tool_progress, reasoning, etc.)
- Installation: switch to the ``hermes-agent[dingtalk]`` extra (adds
  alibabacloud-dingtalk alongside dingtalk-stream)
- Messaging capability matrix updated to reflect images, audio, video,
  and threading support
2026-04-17 19:26:53 -07:00
pedh 4459913f40 feat(dingtalk): AI Cards streaming, emoji reactions, and media handling
Cherry-picked from #10985 by pedh, adapted to current main:

* Keeps main's full group-chat gating (require_mention + allowed_users +
  free_response_chats + mention_patterns) — PR's simpler subset dropped.
* Keeps main's fire-and-forget process() dispatch + session_webhook
  fallback for SDK >= 0.24.
* Picks up PR's REQUIRES_EDIT_FINALIZE capability flag on
  BasePlatformAdapter + finalize kwarg on edit_message(), plumbed through
  stream_consumer.  Default False so Telegram/Slack/Discord/Matrix stay
  on the zero-overhead fast path.
* DingTalk AI Card lifecycle: per-chat _message_contexts, two-card flow
  (tool-progress + final response) with sibling auto-close driven by
  reply_to, idempotent 🤔Thinking → 🥳Done swap, $alibabacloud-dingtalk$
  for media URL resolution (replaces raw HTTP that was 403-ing).
* pyproject: dingtalk extra now dingtalk-stream>=0.20,<1 +
  alibabacloud-dingtalk>=2.0.0 + qrcode.

Closes #10991

Co-authored-by: pedh
2026-04-17 19:26:53 -07:00
Teknium d7ef562a05 fix(file-ops): follow terminal env's live cwd in _exec instead of init-time cached cwd (#11912)
ShellFileOperations captured the terminal env's cwd at __init__ time and
used that stale value for every subsequent _exec() call.  When the user
ran `cd` via the terminal tool, `env.cwd` updated but `ops.cwd` did not.
Relative paths passed to patch_replace / read_file / write_file / search
then targeted the ORIGINAL directory instead of the current one.

Observed symptom in agent sessions:

  terminal: cd .worktrees/my-branch
  patch hermes_cli/main.py <old> <new>
    → returns {"success": true} with a plausible unified diff
    → but `git diff` in the worktree shows nothing
    → the patch landed in the main repo's checkout of main.py instead

The diff looked legitimate because patch_replace computes it from the
IN-MEMORY content vs new_content, not by re-reading the file.  The
write itself DID succeed — it just wrote to the wrong directory's copy
of the same-named file.

Fix: _exec() now resolves cwd from live sources in this order:

  1. Explicit `cwd` arg (if provided by the caller)
  2. Live `self.env.cwd` (tracks `cd` commands run via terminal)
  3. Init-time `self.cwd` (fallback when env has no cwd attribute)

Includes a 5-test regression suite covering:
  - cd followed by relative read follows live cwd
  - the exact reported bug: patch_replace with relative path after cd
  - explicit cwd= arg still wins over env.cwd
  - env without cwd attribute falls back to init-time cwd
  - patch_replace success reflects real file state (safety rail)

Co-authored-by: teknium1 <teknium@nousresearch.com>
2026-04-17 19:26:40 -07:00
helix4u 47010e0757 fix(gateway): allow systemd-backed distrobox services 2026-04-17 19:24:30 -07:00
Teknium 213e39463b chore(release): add akhater to AUTHOR_MAP
Contributor of PR #11858 (nous OAuth providers mirror fix).  CI
blocks releases on unmapped author emails.
2026-04-17 19:13:40 -07:00
Teknium 2297c5f5ce fix(auth): restore --label for hermes auth add nous --type oauth
persist_nous_credentials() now accepts an optional label kwarg which
gets embedded in providers.nous under the 'label' key.
_seed_from_singletons() prefers the embedded label over the
auto-derived label_from_token() fingerprint when materialising the
pool entry, so re-seeding on every load_pool('nous') preserves the
user's chosen label.

auth_commands.py threads --label through to the helper, restoring
parity with how other OAuth providers (anthropic, codex, google,
qwen) honor the flag.

Tests: 4 new (embed, reseed-survives, no-label fallback, end-to-end
through auth_add_command). All 390 nous/auth/credential_pool tests
pass.
2026-04-17 19:13:40 -07:00
Antoine Khater c7fece1f9d fix: normalise Nous device-code pool source to avoid duplicates
Review feedback on the original commit: the helper wrote a pool entry
with source `manual:device_code` while `_seed_from_singletons()` upserts
with `device_code` (no `manual:` prefix), so the pool grew a duplicate
row on every `load_pool()` after login.

Normalise: the helper now writes `providers.nous` and delegates the pool
write entirely to `_seed_from_singletons()` via a follow-up
`load_pool()` call. The canonical source is `device_code`; the helper
never materialises a parallel `manual:device_code` entry.

- `persist_nous_credentials()` loses its `label` and `source` kwargs —
  both are now derived by the seed path from the singleton state.
- CLI and web dashboard call sites simplified accordingly.
- New test `test_persist_nous_credentials_idempotent_no_duplicate_pool_entries`
  asserts that two consecutive persists leave exactly one pool row and
  no stray `manual:` entries.
- Existing `test_auth_add_nous_oauth_persists_pool_entry` updated to
  assert the canonical source and single-entry invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 19:13:40 -07:00
Antoine Khater c096a6935f fix(auth): mirror Nous OAuth credentials to providers.nous on CLI login
`hermes auth add nous --type oauth` only wrote credential_pool.nous,
leaving providers.nous empty. When the Nous agent_key's 24h TTL expired,
run_agent.py's 401-recovery path called resolve_nous_runtime_credentials
(which reads providers.nous), got AuthError "Hermes is not logged into
Nous Portal", caught it as logger.debug (suppressed at INFO level), and
the agent died with "Non-retryable client error" — no signal to the
user that recovery even tried.

Introduce persist_nous_credentials() as the single source of truth for
Nous device-code login persistence. Both auth_commands (CLI) and
web_server (dashboard) now route through it, so pool and providers
stay in sync at write time.

Why: CLI-provisioned profiles couldn't recover from agent_key expiry,
producing silent daily outages 24h after first login. PR #6856/#6869
addressed adjacent issues but assumed providers.nous was populated;
this one wasn't being written.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 19:13:40 -07:00
Teknium a155b4a159 feat(auxiliary): default 'auto' routing to main model for all users (#11900)
Before: aggregator users (OpenRouter / Nous Portal) running 'auto'
routing for auxiliary tasks — compression, vision, web extraction,
session search, etc. — got routed to a cheap provider-side default
model (Gemini Flash).  Non-aggregator users already got their main
model.  Behavior was inconsistent and surprising — users picked
Claude / GPT / their preferred model, but side tasks ran on
Gemini Flash.

After: 'auto' means "use my main chat model" for every user,
regardless of provider type.  Only when the main provider has no
working client does the fallback chain run (OpenRouter → Nous →
custom → Codex → API-key providers).  Explicit per-task overrides
in config.yaml (auxiliary.<task>.provider / .model) still win —
they are a hard constraint, not subject to the auto policy.

Vision auto-detection follows the same policy: try main provider +
main model first (with _PROVIDER_VISION_MODELS overrides preserved
for providers like xiaomi and zai that ship a dedicated multimodal
model distinct from their chat model).  Aggregator strict vision
backends are fallbacks, not the primary path.

Changes:
  - agent/auxiliary_client.py: _resolve_auto() drops the
    `_AGGREGATOR_PROVIDERS` guard.  resolve_vision_provider_client()
    auto branch unifies aggregator and exotic-provider paths —
    everyone goes through resolve_provider_client() with main_model.
    Dead _AGGREGATOR_PROVIDERS constant removed (was only used by
    the guard we just removed).
  - hermes_cli/main.py: aux config menu copy updated to reflect
    the new semantics ("'auto' means 'use my main model'").
  - tests/agent/test_auxiliary_main_first.py: 12 regression tests
    covering OpenRouter/Nous/DeepSeek main paths, runtime-override
    wins, explicit-config wins, vision override preservation for
    exotic providers, and fallback-chain activation when the main
    provider has no working client.

Co-authored-by: teknium1 <teknium@nousresearch.com>
2026-04-17 19:13:23 -07:00
Teknium b449a0e049 fix(feishu-comment): use get_hermes_home(); drop dead asyncio wrapper; AUTHOR_MAP
Follow-up polish on top of the cherry-picked #11023 commit.

- feishu_comment_rules.py: replace import-time "~/.hermes" expanduser fallback
  with get_hermes_home() from hermes_constants (canonical, profile-safe).
- tools/feishu_doc_tool.py, tools/feishu_drive_tool.py: drop the
  asyncio.get_event_loop().run_until_complete(asyncio.to_thread(...)) dance.
  Tool handlers run synchronously in a worker thread with no running loop, so
  the RuntimeError branch was always the one that executed. Calls client.request
  directly now. Unused asyncio import removed.
- tests/gateway/test_feishu.py: add register_p2_customized_event to the mock
  EventDispatcher builder so the existing adapter test matches the new handler
  registration for drive.notice.comment_add_v1.
- scripts/release.py: map liujinkun@bytedance.com -> liujinkun2025 for
  contributor attribution on release notes.
2026-04-17 19:04:11 -07:00
liujinkun 85cdb04bd4 feat: add Feishu document comment intelligent reply with 3-tier access control
- Full comment handler: parse drive.notice.comment_add_v1 events, build
  timeline, run agent, deliver reply with chunking support.
- 5 tools: feishu_doc_read, feishu_drive_list_comments,
  feishu_drive_list_comment_replies, feishu_drive_reply_comment,
  feishu_drive_add_comment.
- 3-tier access control rules (exact doc > wildcard "*" > top-level >
  defaults) with per-field fallback. Config via
  ~/.hermes/feishu_comment_rules.json, mtime-cached hot-reload.
- Self-reply filter using generalized self_open_id (supports future
  user-identity subscriptions). Receiver check: only process events
  where the bot is the @mentioned target.
- Smart timeline selection, long text chunking, semantic text extraction,
  session sharing per document, wiki link resolution.

Change-Id: I31e82fd6355173dbcc400b8934b6d9799e3137b9
2026-04-17 19:04:11 -07:00
Teknium 9b14b76eb3 fix(wecom): bound req_id cache, revert undocumented is_group change, add tests
Follow-up to the cherry-picked contributor fix:

- Extract `_remember_chat_req_id()` and bound it at DEDUP_MAX_SIZE like
  `_reply_req_ids` — the unbounded dict would grow forever on a long-
  running gateway with many chats.
- Move the cache write to AFTER the group/DM policy check so we don't
  cache req_ids from blocked senders.
- Revert the undocumented `is_group` change: the contributor flipped
  `chattype == 'group'` to `bool(chatid)`, which wasn't mentioned in
  the PR description and weakens the signal (chattype is the explicit
  hint; relying on chatid presence assumes DMs never carry it). Keep
  the original check.
- Drop the defensive `getattr(self, '_last_chat_req_ids', {})` reads
  at both send sites — the attribute is initialized in __init__.
- Update `test_send_uses_passive_reply_stream_...` → `_markdown_...`
  to match the new msgtype, and add a new TestWeComZombieSessionFix
  class covering device_id presence in subscribe, per-chat req_id
  caching + bounding, blocked-sender cache exclusion, and the group
  APP_CMD_RESPONSE fallback path.
2026-04-17 19:03:29 -07:00
Devorun 2992802b35 fix(wecom): resolve WebSocket zombie sessions and group chat 600039 errors #11554 2026-04-17 19:03:29 -07:00
Teknium 04a0c3cb95 fix(config): preserve env refs when save_config rewrites config (#11892)
Co-authored-by: binhnt92 <84617813+binhnt92@users.noreply.github.com>
2026-04-17 19:03:26 -07:00
Teknium 8444f66890 feat(hermes model): add Configure auxiliary models UI to hermes model (#11891)
Previously users had to hand-edit config.yaml to route individual auxiliary
tasks (vision, compression, web_extract, etc.) to a specific provider+model.
Add a first-class picker reachable from the bottom of the existing `hermes
model` provider list.

Flow:
  hermes model
    → Configure auxiliary models...
      → <task picker: 9 tasks, shows current setting inline>
        → <provider picker: authenticated providers + auto + custom>
          → <model picker: curated list + live pricing>

The aux picker does NOT re-run credential/OAuth setup; users authenticate
providers through the normal `hermes model` flow, then route aux tasks to
them here.  `list_authenticated_providers()` gates the list to providers
the user has configured.

Also:
  - 'Cancel' entry relabeled 'Leave unchanged' (sentinel still 'cancel'
    internally, so dispatch logic is unchanged)
  - 'Reset all to auto' entry to bulk-clear aux overrides; preserves
    user-tuned timeout / download_timeout values
  - Adds `title_generation` task to DEFAULT_CONFIG.auxiliary — the task
    was called from agent/title_generator.py but was missing from defaults,
    so config-backed timeout overrides never worked for it

Co-authored-by: teknium1 <teknium@nousresearch.com>
2026-04-17 19:02:06 -07:00
Teknium bb85404b16 chore: add Sara Reynolds to AUTHOR_MAP 2026-04-17 18:58:29 -07:00
Sara Reynolds 8ab1aa2efc fix(gateway): fix discrepancies in gateway status 2026-04-17 18:58:29 -07:00
Xowiek 511ed4dacc fix(gateway): bypass active-session guard for gateway-handled slash commands 2026-04-17 18:58:03 -07:00
Michel Belleau d465fc5869 fix(skills): use frontmatter name in skills index instead of directory name
build_skills_system_prompt() was using the skill directory name (skill_name)
when appending to skills_by_category in all three code paths (snapshot cache,
cold filesystem scan, external dirs). This meant any skill whose directory name
differed from its frontmatter `name` field would appear under the wrong name in
the system prompt, causing LLM routing failures.

The snapshot entry already stores both skill_name (dir) and frontmatter_name
(declared); switch the three tuple appends to use frontmatter_name. Also fix
the external-dir dedup set (seen_skill_names) to track frontmatter names for
consistency with the local-skill tuples now stored under frontmatter_name.

Fixes #11777

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 18:56:37 -07:00
helix4u 016ae5c334 fix(kimi): force 0.6 on main chat path 2026-04-17 18:47:01 -07:00
Teknium 304fb921bf fix: two process leaks (agent-browser daemons, paste.rs sleepers) (#11843)
Both fixes close process leaks observed in production (18+ orphaned
agent-browser node daemons, 15+ orphaned paste.rs sleep interpreters
accumulated over ~3 days, ~2.7 GB RSS).

## agent-browser daemon leak

Previously the orphan reaper (_reap_orphaned_browser_sessions) only ran
from _start_browser_cleanup_thread, which is only invoked on the first
browser tool call in a process. Hermes sessions that never used the
browser never swept orphans, and the cross-process orphan detection
relied on in-process _active_sessions, which doesn't see other hermes
PIDs' sessions (race risk).

- Write <session>.owner_pid alongside the socket dir recording the
  hermes PID that owns the daemon (extracted into _write_owner_pid for
  direct testability).
- Reaper prefers owner_pid liveness over in-process _active_sessions.
  Cross-process safe: concurrent hermes instances won't reap each
  other's daemons. Legacy tracked_names fallback kept for daemons
  that predate owner_pid.
- atexit handler (_emergency_cleanup_all_sessions) now always runs
  the reaper, not just when this process had active sessions —
  every clean hermes exit sweeps accumulated orphans.

## paste.rs auto-delete leak

_schedule_auto_delete spawned a detached Python subprocess per call
that slept 6 hours then issued DELETE requests. No dedup, no tracking —
every 'hermes debug share' invocation added ~20 MB of resident Python
interpreters that stuck around until the sleep finished.

- Replaced the spawn with ~/.hermes/pastes/pending.json: records
  {url, expire_at} entries.
- _sweep_expired_pastes() synchronously DELETEs past-due entries on
  every 'hermes debug' invocation (run_debug() dispatcher).
- Network failures stay in pending.json for up to 24h, then give up
  (paste.rs's own retention handles the 'user never runs hermes again'
  edge case).
- Zero subprocesses; regression test asserts subprocess/Popen/time.sleep
  never appear in the function source (skipping docstrings via AST).

## Validation

|                              | Before        | After        |
|------------------------------|---------------|--------------|
| Orphan agent-browser daemons | 18 accumulated| 2 (live)     |
| paste.rs sleep interpreters  | 15 accumulated| 0            |
| RSS reclaimed                | -             | ~2.7 GB      |
| Targeted tests               | -             | 2253 pass    |

E2E verified: alive-owner daemons NOT reaped; dead-owner daemons
SIGTERM'd and socket dirs cleaned; pending.json sweep deletes expired
entries without spawning subprocesses.
2026-04-17 18:46:30 -07:00
helix4u 64b354719f Support browser CDP URL from config 2026-04-17 16:05:04 -07:00
brooklyn! e9b8ece103 Merge pull request #4692 from NousResearch/feat/ink-refactor
Feat/ink refactor
2026-04-17 18:02:37 -05:00
Teknium 3f43aec15d fix(tools): bound _read_tracker sub-containers + prune _completion_consumed (#11839)
Two accretion-over-time leaks that compound over long CLI / gateway
lifetimes.  Both were flagged in the memory-leak audit.

## file_tools._read_tracker

_read_tracker[task_id] holds three sub-containers that grew unbounded:

  read_history     set of (path, offset, limit) tuples — 1 per unique read
  dedup            dict of (path, offset, limit) → mtime — same growth pattern
  read_timestamps  dict of resolved_path → mtime — 1 per unique path

A CLI session uses one stable task_id for its lifetime, so these were
uncapped.  A 10k-read session accumulated ~1.5MB of tracker state that
the tool no longer needed (only the most recent reads are relevant for
dedup, consecutive-loop detection, and write/patch external-edit
warnings).

Fix: _cap_read_tracker_data() enforces hard caps on each container
after every add.  Defaults: read_history=500, dedup=1000,
read_timestamps=1000.  Eviction is insertion-order (Python 3.7+ dict
guarantee) for the dicts; arbitrary for the set (which only feeds
diagnostic summaries).

## process_registry._completion_consumed

Module-level set that recorded every session_id ever polled / waited /
logged.  No pruning.  Each entry is ~20 bytes, so the absolute leak is
small, but on a gateway processing thousands of background commands
per day the set grows until process exit.

Fix: _prune_if_needed() now discards _completion_consumed entries
alongside the session dict evictions it already performs (both the
TTL-based prune and the LRU-over-cap prune).  Adds a final
belt-and-suspenders pass that drops any dangling entries whose
session_id no longer appears in _running or _finished.

Tests: tests/tools/test_accretion_caps.py — 9 cases
  * Each container bound respected, oldest evicted
  * No-op when under cap (no unnecessary work)
  * Handles missing sub-containers without crashing
  * Live read_file_tool path enforces caps end-to-end
  * _completion_consumed pruned on TTL expiry
  * _completion_consumed pruned on LRU eviction
  * Dangling entries (no backing session) cleared

Broader suite: 3486 tests/tools + tests/cli pass.  The single flake
(test_alias_command_passes_args) reproduces on unchanged main — known
cross-test pollution under suite-order load.
2026-04-17 15:53:57 -07:00
Brooklyn Nicholson aa583cb14e Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 17:51:40 -05:00
Teknium 0a83187801 refactor(kimi): use _fixed_temperature_for_model helper in flush_memories
Replace the hardcoded 'kimi-for-coding' string check with the helper
from auxiliary_client so there is one source of truth for the list of
models with fixed-temperature contracts. Adding a new entry to
_FIXED_TEMPERATURE_MODELS now automatically covers flush_memories too.
2026-04-17 15:49:14 -07:00
helix4u 2b60478fc2 fix(kimi): force kimi-for-coding temperature to 0.6 2026-04-17 15:49:14 -07:00
Teknium c6fd2619f7 fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833)
Google-side 429 Code Assist errors now flow through Hermes' normal rate-limit
path (status_code on the exception, Retry-After preserved via error.response)
instead of being opaque RuntimeErrors. User sees a one-line capacity message
instead of a 500-char JSON dump.

Changes
- CodeAssistError grows status_code / response / retry_after / details attrs.
  _extract_status_code in error_classifier picks up status_code and classifies
  429 as FailoverReason.rate_limit, so fallback_providers triggers the same
  way it does for SDK errors. run_agent.py line ~10428 already walks
  error.response.headers for Retry-After — preserving the response means that
  path just works.
- _gemini_http_error parses the Google error envelope (error.status +
  error.details[].reason from google.rpc.ErrorInfo, retryDelay from
  google.rpc.RetryInfo). MODEL_CAPACITY_EXHAUSTED / RESOURCE_EXHAUSTED / 404
  model-not-found each produce a human-readable message; unknown shapes fall
  back to the previous raw-body format.
- Drop gemma-4-26b-it from hermes_cli/models.py, hermes_cli/setup.py, and
  agent/model_metadata.py — Google returned 404 for it today in local repro.
  Kept gemma-4-31b-it (capacity-constrained but not retired).

Validation
|                           | Before                         | After                                     |
|---------------------------|--------------------------------|-------------------------------------------|
| Error message             | 'Code Assist returned HTTP 429: {500 chars JSON}' | 'Gemini capacity exhausted for gemini-2.5-pro (Google-side throttle...)' |
| status_code on error      | None (opaque RuntimeError)     | 429                                       |
| Classifier reason         | unknown (string-match fallback) | FailoverReason.rate_limit                |
| Retry-After honored       | ignored                        | extracted from RetryInfo or header        |
| gemma-4-26b-it picker     | advertised (404s on Google)    | removed                                   |

Unit + E2E tests cover non-streaming 429, streaming 429, 404 model-not-found,
Retry-After header fallback, malformed body, and classifier integration.
Targeted suites: tests/agent/test_gemini_cloudcode.py (81 tests), full
tests/hermes_cli (2203 tests) green.

Co-authored-by: teknium1 <teknium@nousresearch.com>
2026-04-17 15:34:12 -07:00
Teknium d2206c69cc fix(qqbot): add back-compat for env var rename; drop qrcode core dep
Follow-up to WideLee's salvaged PR #11582.

Back-compat for QQ_HOME_CHANNEL → QQBOT_HOME_CHANNEL rename:
  - gateway/config.py reads QQBOT_HOME_CHANNEL, falls back to QQ_HOME_CHANNEL
    with a one-shot deprecation warning so users on the old name aren't
    silently broken.
  - cron/scheduler.py: _HOME_TARGET_ENV_VARS['qqbot'] now maps to the new
    name; _get_home_target_chat_id falls back to the legacy name via a
    _LEGACY_HOME_TARGET_ENV_VARS table.
  - hermes_cli/status.py + hermes_cli/setup.py: honor both names when
    displaying or checking for missing home channels.
  - hermes_cli/config.py: keep legacy QQ_HOME_CHANNEL[_NAME] in
    _EXTRA_ENV_KEYS so .env sanitization still recognizes them.

Scope cleanup:
  - Drop qrcode from core dependencies and requirements.txt (remains in
    messaging/dingtalk/feishu extras). _qqbot_render_qr already degrades
    gracefully when qrcode is missing, printing a 'pip install qrcode' tip
    and falling back to URL-only display.
  - Restore @staticmethod on QQAdapter._detect_message_type (it doesn't
    use self). Revert the test change that was only needed when it was
    converted to an instance method.
  - Reset uv.lock to origin/main; the PR's stale lock also included
    unrelated changes (atroposlib source URL, hermes-agent version bump,
    fastapi additions) that don't belong.

Verified E2E:
  - Existing user (QQ_HOME_CHANNEL set): gateway + cron both pick up the
    legacy name; deprecation warning logs once.
  - Fresh user (QQBOT_HOME_CHANNEL set): gateway + cron use new name,
    no warning.
  - Both set: new name wins on both surfaces.

Targeted tests: 296 passed, 4 skipped (qqbot + cron + hermes_cli).
2026-04-17 15:31:14 -07:00
WideLee 103beea7a6 fix(qqbot): fix test failures after package refactor
- Re-export _ssrf_redirect_guard from __init__.py
- Fix _parse_json @staticmethod using self._log_tag
- Update test_detect_message_type to call as instance method
- Fix mock.patch path for httpx.AsyncClient in adapter submodule
2026-04-17 15:31:14 -07:00
WideLee 287d3e12c7 chore: add author map 2026-04-17 15:31:14 -07:00
WideLee 6fd58e1e4a refactor(qqbot): replace log tags with self._log_tag 2026-04-17 15:31:14 -07:00
WideLee 235e6ecc0e refactor(qqbot): replace hardcoded log tags with self._log_tag and adjust STT log levels
- Remove @staticmethod from _detect_message_type, _convert_silk_to_wav,
  _convert_raw_to_wav, _convert_ffmpeg_to_wav so they can use self._log_tag
- Replace all remaining hardcoded "QQBot" log args with self._log_tag
- Downgrade STT routine flow logs (download, convert, success) from info to debug
- Keep warning level for actual failures (STT failed, ffmpeg error, empty transcript)
2026-04-17 15:31:14 -07:00
WideLee 1648e41c17 refactor(qqbot): change qrcode style 2026-04-17 15:31:14 -07:00
WideLee c4cdf3b861 refactor(qqbot): change setup method selection prompt_choice style 2026-04-17 15:31:14 -07:00
WideLee 02f5e3dc27 refactor(qqbot): use _log_tag with app_id in all logger calls for multi-instance disambiguation 2026-04-17 15:31:14 -07:00
WideLee b7d330211a fix(qqbot): simplify home channel prompt wording 2026-04-17 15:31:14 -07:00
WideLee a5f4d652d3 feat(qqbot): prompt to add scanned user to allow list and home channel during setup 2026-04-17 15:31:14 -07:00
WideLee 6358501915 refactor(qqbot): split qqbot.py into package & add QR scan-to-configure onboard flow
- Refactor gateway/platforms/qqbot.py into gateway/platforms/qqbot/ package:
  - adapter.py: core QQAdapter (unchanged logic, constants from shared module)
  - constants.py: shared constants (API URLs, timeouts, message types)
  - crypto.py: AES-256-GCM key generation and secret decryption
  - onboard.py: QR-code scan-to-configure API (create_bind_task, poll_bind_result)
  - utils.py: User-Agent builder, HTTP headers, config helpers
  - __init__.py: re-exports all public symbols for backward compatibility

- Add interactive QR-code setup flow in hermes_cli/gateway.py:
  - Terminal QR rendering via qrcode package (graceful fallback to URL)
  - Auto-refresh on QR expiry (up to 3 times)
  - AES-256-GCM encrypted credential exchange
  - DM security policy selection (pairing/allowlist/open)

- Update hermes_cli/setup.py to delegate to gateway's _setup_qqbot()
- Add qrcode>=7.4 dependency to pyproject.toml and requirements.txt
2026-04-17 15:31:14 -07:00
Teknium 31e7276474 fix(gateway): consolidate per-session cleanup; close SessionDB on shutdown (#11800)
Three closely-related fixes for shutdown / lifecycle hygiene.

1. _release_running_agent_state(session_key) helper
   ----------------------------------------------------
   Per-running-agent state lived in three dicts that drifted out of sync
   across cleanup sites:
     self._running_agents       — AIAgent per session_key
     self._running_agents_ts    — start timestamp per session_key
     self._busy_ack_ts          — last busy-ack timestamp per session_key

   Inventory before this PR:
     8 sites: del self._running_agents[key]
       — only 1 (stale-eviction) cleaned all three
       — 1 cleaned _running_agents + _running_agents_ts only
       — 6 cleaned _running_agents only

   Each missed entry was a (str, float) tuple per session per gateway
   lifetime — small, persistent, accumulates across thousands of
   sessions over months.  Per-platform leaks compounded.

   This change adds a single helper that pops all three dicts in
   lockstep, and replaces every bare 'del self._running_agents[key]'
   site with it.  Per-session state that PERSISTS across turns
   (_session_model_overrides, _voice_mode, _pending_approvals,
   _update_prompt_pending) is intentionally NOT touched here — those
   have their own lifecycles tied to user actions, not turn boundaries.

2. _running_agents_ts cleared in _stop_impl
   ----------------------------------------
   Was being missed alongside _running_agents.clear(); now included.

3. SessionDB close() in _stop_impl
   ---------------------------------
   The SQLite WAL write lock stayed held by the old gateway connection
   until Python actually exited — causing 'database is locked' errors
   when --replace launched a new gateway against the same file.  We
   now explicitly close both self._db and self.session_store._db
   inside _stop_impl, with try/except so a flaky close on one doesn't
   block the other.

Tests
-----
tests/gateway/test_session_state_cleanup.py — 10 cases covering:
  * helper pops all three dicts atomically
  * idempotent on missing/empty keys
  * preserves other sessions
  * tolerates older runners without _busy_ack_ts attribute
  * thread-safe under concurrent release
  * regression guard: scans gateway/run.py and fails if a future
    contributor reintroduces 'del self._running_agents[...]'
    outside docstrings
  * SessionDB close called on both holders during shutdown
  * shutdown tolerates missing session_store
  * shutdown tolerates close() raising on one db (other still closes)

Broader gateway suite: 3108 passed (vs 3100 on baseline) — failure
delta is +8 net passes; the 10 remaining failures are pre-existing
cross-test pollution / missing optional deps (matrix needs olm,
signal/telegram approval flake, dingtalk Mock wiring), all reproduce
on stashed baseline.
2026-04-17 15:18:23 -07:00
Teknium 036dacf659 feat(telegram): auto-wrap markdown tables in code blocks (#11794)
Telegram's MarkdownV2 has no table syntax — pipes get backslash-escaped
and tables render as noisy unaligned text.  format_message now detects
GFM-style pipe tables (header row + delimiter row + optional body) and
wraps them in ``` fences before the existing MarkdownV2 conversion runs.
Telegram renders fenced code blocks as monospace preformatted text with
columns intact.

Tables already inside an existing code block are left alone.  Plain
prose with pipes, lone '---' horizontal rules, and non-table content
are unaffected.

Closes the recurring community request to stop having to ask the agent
to re-render tables as code blocks manually.
2026-04-17 14:27:26 -07:00
Teknium 3207b9bda0 test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)
Cuts shard-3 local runtime in half by neutralizing real wall-clock
waits across three classes of slow test:

## 1. Retry backoff mocks

- tests/run_agent/conftest.py (NEW): autouse fixture mocks
  jittered_backoff to 0.0 so the `while time.time() < sleep_end`
  busy-loop exits immediately. No global time.sleep mock (would
  break threading tests).
- test_anthropic_error_handling, test_413_compression,
  test_run_agent_codex_responses, test_fallback_model: per-file
  fixtures mock time.sleep / asyncio.sleep for retry / compression
  paths.
- test_retaindb_plugin: cap the retaindb module's bound time.sleep
  to 0.05s via a per-test shim (background writer-thread retries
  sleep 2s after errors; tests don't care about exact duration).
  Plus replace arbitrary time.sleep(N) waits with short polling
  loops bounded by deadline.

## 2. Subprocess sleeps in production code

- test_update_gateway_restart: mock time.sleep. Production code
  does time.sleep(3) after `systemctl restart` to verify the
  service survived. Tests mock subprocess.run \u2014 nothing actually
  restarts \u2014 so the wait is dead time.

## 3. Network / IMDS timeouts (biggest single win)

- tests/conftest.py: add AWS_EC2_METADATA_DISABLED=true plus
  AWS_METADATA_SERVICE_TIMEOUT=1 and ATTEMPTS=1. boto3 falls back
  to IMDS (169.254.169.254) when no AWS creds are set. Any test
  hitting has_aws_credentials() / resolve_aws_auth_env_var() (e.g.
  test_status, test_setup_copilot_acp, anything that touches
  provider auto-detect) burned ~2-4s waiting for that to time out.
- test_exit_cleanup_interrupt: explicitly mock
  resolve_runtime_provider which was doing real network auto-detect
  (~4s). Tests don't care about provider resolution \u2014 the agent
  is already mocked.
- test_timezone: collapse the 3-test "TZ env in subprocess" suite
  into 2 tests by checking both injection AND no-leak in the same
  subprocess spawn (was 3 \u00d7 3.2s, now 2 \u00d7 4s).

## Validation

| Test | Before | After |
|---|---|---|
| test_anthropic_error_handling (8 tests) | ~80s | ~15s |
| test_413_compression (14 tests) | ~18s | 2.3s |
| test_retaindb_plugin (67 tests) | ~13s | 1.3s |
| test_status_includes_tavily_key | 4.0s | 0.05s |
| test_setup_copilot_acp_skips_same_provider_pool_step | 8.0s | 0.26s |
| test_update_gateway_restart (5 tests) | ~18s total | ~0.35s total |
| test_exit_cleanup_interrupt (2 tests) | 8s | 1.5s |
| **Matrix shard 3 local** | **108s** | **50s** |

No behavioral contract changed \u2014 tests still verify retry happens,
service restart logic runs, etc.; they just don't burn real seconds
waiting for it.

Supersedes PR #11779 (those changes are included here).
2026-04-17 14:21:22 -07:00
Teknium eb07c05646 fix(gateway): prune stale SessionStore entries to bound memory + disk (#11789)
SessionStore._entries grew unbounded.  Every unique
(platform, chat_id, thread_id, user_id) tuple ever seen was kept in
RAM and rewritten to sessions.json on every message.  A Discord bot
in 100 servers x 100 channels x ~100 rotating users accumulates on
the order of 10^5 entries after a few months; each sessions.json
write becomes an O(n) fsync.  Nothing trimmed this — there was no
TTL, no cap, no eviction path.

Changes
-------
* SessionStore.prune_old_entries(max_age_days) — drops entries whose
  updated_at is older than the cutoff.  Preserves:
    - suspended entries (user paused them via /stop for later resume)
    - entries with an active background process attached
  Pruning is functionally identical to a natural reset-policy expiry:
  SQLite transcript stays, session_key -> session_id mapping dropped,
  returning user gets a fresh session.

* GatewayConfig.session_store_max_age_days (default 90; 0 disables).
  Serialized in to_dict/from_dict, coerced from bad types / negatives
  to safe defaults.  No migration needed — missing field -> 90 days.

* _session_expiry_watcher calls prune_old_entries once per hour
  (first tick is immediate).  Uses the existing watcher loop so no
  new background task is created.

Why not more aggressive
-----------------------
90 days is long enough that legitimate long-idle users (seasonal,
vacation, etc.) aren't surprised — pruning just means they get a
fresh session on return, same outcome they'd get from any other
reset-policy trigger.  Admins can lower it via config; 0 disables.

Tests
-----
tests/gateway/test_session_store_prune.py — 17 cases covering:
  * entry age based on updated_at, not created_at
  * max_age_days=0 disables; negative coerces to 0
  * suspended + active-process entries are skipped
  * _save fires iff something was removed
  * disk JSON reflects post-prune state
  * thread safety against concurrent readers
  * config field roundtrips + graceful fallback on bad values
  * watcher gate logic (first tick prunes, subsequent within 1h don't)

119 broader session/gateway tests remain green.
2026-04-17 13:48:49 -07:00
Teknium f362083c64 fix(providers): complete NVIDIA NIM parity with other providers
Follow-up on the native NVIDIA NIM provider salvage. The original PR wired
PROVIDER_REGISTRY + HERMES_OVERLAYS correctly but missed several touchpoints
required for full parity with other OpenAI-compatible providers (xai,
huggingface, deepseek, zai).

Gaps closed:

- hermes_cli/main.py:
  - Add 'nvidia' to the _model_flow_api_key_provider dispatch tuple so
    selecting 'NVIDIA NIM' in `hermes model` actually runs the api-key
    provider flow (previously fell through silently).
  - Add 'nvidia' to `hermes chat --provider` argparse choices so the
    documented test command (`hermes chat --provider nvidia --model ...`)
    parses successfully.

- hermes_cli/config.py: Register NVIDIA_API_KEY and NVIDIA_BASE_URL in
  OPTIONAL_ENV_VARS so setup wizard can prompt for them and they're
  auto-added to the subprocess env blocklist.

- hermes_cli/doctor.py: Add NVIDIA NIM row to `_apikey_providers` so
  `hermes doctor` probes https://integrate.api.nvidia.com/v1/models.

- hermes_cli/dump.py: Add NVIDIA_API_KEY → 'nvidia' mapping for
  `hermes dump` credential masking.

- tests/tools/test_local_env_blocklist.py: Extend registry_vars fixture
  with NVIDIA_API_KEY to verify it's blocked from leaking into subprocesses.

- agent/model_metadata.py: Add 'nemotron' → 131072 context-length entry
  so all Nemotron variants get 128K context via substring match (rather
  than falling back to MINIMUM_CONTEXT_LENGTH).

- hermes_cli/models.py: Fix hallucinated model ID
  'nvidia/nemotron-3-nano-8b-a4b' → 'nvidia/nemotron-3-nano-30b-a3b'
  (verified against live integrate.api.nvidia.com/v1/models catalog).
  Expand curated list from 5 to 9 agentic models mapping to OpenRouter
  defaults per provider-guide convention: add qwen3.5-397b-a17b,
  deepseek-v3.2, llama-3.3-nemotron-super-49b-v1.5, gpt-oss-120b.

- cli-config.yaml.example: Document 'nvidia' provider option.

- scripts/release.py: Map asurla@nvidia.com → anniesurla in AUTHOR_MAP
  for CI attribution.

E2E verified: `hermes chat --provider nvidia ...` now reaches NVIDIA's
endpoint (returns 401 with bogus key instead of argparse error);
`hermes doctor` detects NVIDIA NIM when NVIDIA_API_KEY is set.
2026-04-17 13:47:46 -07:00
asurla 3b569ff576 feat(providers): add native NVIDIA NIM provider
Adds NVIDIA NIM as a first-class provider: ProviderConfig in
auth.py, HermesOverlay in providers.py, curated models
(Nemotron plus other open source models hosted on
build.nvidia.com), URL mapping in model_metadata.py, aliases
(nim, nvidia-nim, build-nvidia, nemotron), and env var tests.

Docs updated: providers page, quickstart table, fallback
providers table, and README provider list.
2026-04-17 13:47:46 -07:00
Brooklyn Nicholson bd09e42eac Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 15:44:57 -05:00
Teknium cc3aa76675 build(deps): add qrcode to dingtalk + feishu extras (parity with messaging) (#11627)
#4b1567f4 (anthhub) added qrcode to the messaging extra for Weixin's
QR login. The same package is needed by:

  * hermes_cli/dingtalk_auth.py — QR device-flow auth shipped in #11574
  * gateway/platforms/feishu.py:3962 — Feishu QR login

These extras are independent of [messaging] (users can install
hermes-agent[dingtalk] or hermes-agent[feishu] without [messaging]),
so the dep needs to be declared on each.

Pin matches anthhub's choice (>=7.0,<8) for consistency. The all
extra inherits from all three, so it picks up qrcode transitively.

Adds parallel tests to tests/test_project_metadata.py — same shape
as test_messaging_extra_includes_qrcode_for_weixin_setup.

Refs #9431.
2026-04-17 13:31:53 -07:00
Teknium 2ff1ef6ae6 fix(surrogates): sanitize reasoning/reasoning_content/reasoning_details fields (#11628)
Byte-level reasoning models (xiaomi/mimo-v2-pro, kimi, glm) can emit lone
surrogates in reasoning output. The proactive sanitizer walked content/
name/tool_calls but not extra fields like reasoning or the nested
reasoning_details array. Surrogates in those fields survived the
proactive pass, crashed json.dumps() in the OpenAI SDK, and the recovery
block's _sanitize_messages_surrogates(messages) call also didn't check
those fields — so 'found' was False, no retry happened, and after 3
attempts the user saw:

  API call failed after 3 retries. 'utf-8' codec can't encode characters
  in position N-M: surrogates not allowed

Changes:
- _sanitize_messages_surrogates: walk any extra string fields (reasoning,
  reasoning_content, etc.) and recurse into nested dict/list values
  (reasoning_details). Mirrors _sanitize_messages_non_ascii coverage
  added in PR #10537.
- _sanitize_structure_surrogates: new recursive walker, mirror of
  _sanitize_structure_non_ascii but for surrogate recovery.
- UnicodeEncodeError recovery block: also sanitize api_messages,
  api_kwargs, and prefill_messages (not just the canonical messages
  list — the API-copy carries reasoning_content transformed from
  reasoning and that's what the SDK actually serializes). Always
  retry on detected surrogate errors, not only when we found
  something to strip — gate on error type per PR #10537's pattern.

Tests: extended tests/cli/test_surrogate_sanitization.py with
coverage for reasoning, reasoning_content, reasoning_details (flat
and deeply nested), structure walker, and an integration case that
reproduces the exact api_messages shape that was crashing.
2026-04-17 13:30:47 -07:00
Teknium 1229d8855c fix: remove misleading model.max_tokens suggestion from thinking-exhausted error (#11626)
The 'Thinking Budget Exhausted' user-facing error message advised users to
'set model.max_tokens in config.yaml'. That config key is documented but
intentionally not wired through to the API call in CLI/gateway paths — we
omit max_tokens by default so the inference server uses its full output
budget (llama-server -1=infinity, vLLM max_model_len-prompt_len, etc.).

Users followed the suggestion, saw no change, and kept filing bugs (see
closed #4404, #10917, #6955 and PRs #5001/#6080/#6446/#6707/#7075/#8804/
#10924/#11173/#11268 — all reporting the same misdirection).

Replace the misleading suggestion with an actionable one: switch models
via /model. Lowering reasoning effort remains the primary remediation.
2026-04-17 13:29:54 -07:00
Henkey d49126b987 fix(release): map HenkDz contributor email 2026-04-17 13:29:26 -07:00
Henkey cb883f9e97 fix(acp): improve zed integration 2026-04-17 13:29:26 -07:00
Brooklyn Nicholson d5b9db8b4a Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 15:13:36 -05:00
Brooklyn Nicholson 6a37802476 chore: uptick 2026-04-17 15:13:33 -05:00
Teknium d0e1388ca9 fix(tests): make AIAgent constructor calls self-contained (#11755)
* fix(tests): make AIAgent constructor calls self-contained (no env leakage)

Tests in tests/run_agent/ were constructing AIAgent() without passing
both api_key and base_url, then relying on leaked state from other
tests in the same xdist worker (or process-level env vars) to keep
provider resolution happy. Under hermetic conftest + pytest-split,
that state is gone and the tests fail with 'No LLM provider configured'.

Fix: pass both api_key and base_url explicitly on 47 AIAgent()
construction sites across 13 files. AIAgent.__init__ with both set
takes the direct-construction path (line 960 in run_agent.py) and
skips the resolver entirely.

One call site (test_none_base_url_passed_as_none) left alone — that
test asserts behavior for base_url=None specifically.

This is a prerequisite for any future matrix-split or stricter
isolation work, and lands cleanly on its own.

Validation:
- tests/run_agent/ full: 760 passed, 0 failed (local)
- Previously relied on cross-test pollution; now self-contained

* fix(tests): update opencode-go model order assertion to match kimi-k2.5-first

commit 78a74bb promoted kimi-k2.5 to first position in model suggestion
lists but didn't update this test, which has been failing on main since.
Reorder expected list to match the new canonical order.
2026-04-17 12:32:03 -07:00
kshitij 78a74bb097 feat: promote kimi-k2.5 to first position in all model suggestion lists (#11745)
Move moonshotai/kimi-k2.5 to position #1 in every model picker list:
- OPENROUTER_MODELS (with 'recommended' tag)
- _PROVIDER_MODELS: nous, kimi-coding, opencode-zen, opencode-go, alibaba, huggingface
- _model_flow_kimi() Coding Plan model list in main.py

kimi-coding-cn and moonshot lists already had kimi-k2.5 first.
2026-04-17 12:05:22 -07:00
Brooklyn Nicholson bedbeebbc8 feat(tui): interleave tool rows into live assistant turns
Live turn rendering used to show the streaming assistant text as one
blob with tool calls pooled in a separate section below, so the live
view drifted from the reload view (which threads tool rows inline via
toTranscriptMessages). Model now mirrors reload:

- turnStore gains streamSegments (completed assistant chunks, each
  with any tool rows that landed between its predecessor and itself)
  and streamPendingTools (tool rows waiting for the next chunk)
- turnController.flushStreamingSegment() seals the current bufRef into
  a segment when a new tool.start fires; pending tools get attached to
  that next chunk so order matches reload hydration
- recordMessageComplete returns finalMessages instead of one payload,
  so appendMessage gets the same shape for live-ending turns as for
  reloaded ones
- appLayout renders segments before the progress/streaming area, and
  the streaming message + pending-tools fallback carry whatever tools
  arrived after the last assistant chunk
2026-04-17 11:33:29 -05:00
Brooklyn Nicholson f53250b5e1 fix(tui): tighten /resume render, follow-up to 42721dbe
- useVirtualHistory: track last-seen ScrollBox metrics in a ref inside
  the post-layout effect and bump ver when sticky/top/vp change — the
  subscribe-based rearm was sufficient for fresh clicks but not for the
  "hydrated mid-commit, measured empty, then metrics settle" path where
  nothing re-triggered the hook until the next unrelated keystroke
- useSessionLifecycle: resume scrollToBottom from queueMicrotask to
  setTimeout(..., 0) so the fresh transcript has a full task turn to
  commit + measure before we try to land at the newest content
2026-04-17 11:33:14 -05:00
Brooklyn Nicholson 00591e3801 chore: fmt 2026-04-17 11:06:25 -05:00
Brooklyn Nicholson be768db627 fix: long history session thingy 2026-04-17 11:05:23 -05:00
Brooklyn Nicholson 42721dbe1c fix(tui): big-session /resume now renders without first keystroke
useVirtualHistory set up its useSyncExternalStore subscription during
the first render, when scrollRef.current was still null (the ScrollBox
ref attaches during commit, after render). Its useCallback for
subscribe had a stable scrollRef identity as its only dep, so it never
re-subscribed once the ref actually attached — the hook stayed stuck
with vp=0, top=0, no scroll subscription. Small sessions fit entirely
in cold-start so you didn't notice; big /resume sessions got sliced to
the last 40 items with a huge topSpacer and the viewport sat on empty
space until some unrelated state change (e.g. a keystroke) re-rendered
and finally read a real vp.

- flip a hasScrollRef flag in useLayoutEffect once the ref attaches and
  add it to the subscribe useCallback deps so useSyncExternalStore
  rearms with a real subscription
- on resume, scrollToBottom() after history hydrates so the ScrollBox
  lands at the newest messages instead of scrollTop=0 (stickyScroll
  doesn't auto-engage on the initial empty→full dump)
2026-04-17 11:04:29 -05:00
Brooklyn Nicholson 8f553a55b2 chore(tui): fix eslint/prettier nits from npm run fix
- drop inline `import()` type annotation in useSessionLifecycle (import
  `PanelSection` at the top like everything else)
- include `panel` and `session.resumeById` in the useMainApp useMemo
  deps now that the event handler depends on them
- wrap the derived `selected` range in a useMemo so it has stable
  identity and stops invalidating the TextInput `rendered` memo every
  render
- prettier re-sorting of a couple of export/import lines
2026-04-17 11:00:15 -05:00
Brooklyn Nicholson a82097e7a2 feat(tui): /model and /setup slash commands with in-place CLI handoff
- hermes-ink: export `withInkSuspended()` + `useExternalProcess()` that
  pause/resume Ink around an arbitrary external process (built on the
  existing enterAlternateScreen/exitAlternateScreen plumbing)
- tui: `launchHermesCommand(args)` spawns the `hermes` binary with
  inherited stdio, with `HERMES_BIN` override for non-standard launches
- tui: `/model` and `/setup` slash commands invoke the CLI wizards
  in-place, then re-preflight `setup.status` and auto-start a session on
  success — no more exit-and-relaunch to finish first-run setup
- setup panel now advertises those slashes instead of only pointing
  users back at the shell
2026-04-17 10:58:18 -05:00
Brooklyn Nicholson 0dd5055d59 fix(tui): first-run setup preflight + actionable no-provider panel
- tui_gateway: new `setup.status` RPC that reuses CLI's
  `_has_any_provider_configured()`, so the TUI can ask the same question
  the CLI bootstrap asks before launching a session
- useSessionLifecycle: preflight `setup.status` before both `newSession`
  and `resumeById`, and render a clear "Setup Required" panel when no
  provider is configured instead of booting a session that immediately
  fails with `agent init failed`
- createGatewayEventHandler: drop duplicate startup resume logic in
  favor of the preflighted `resumeById`, and special-case the
  no-provider agent-init error as a last-mile fallback to the same
  setup panel
- add regression tests for both paths
2026-04-17 10:58:01 -05:00
Brooklyn Nicholson 5b386ced71 fix(tui): approval flow + input ergonomics + selection perf
- tui_gateway: route approvals through gateway callback (HERMES_GATEWAY_SESSION/
  HERMES_EXEC_ASK) so dangerous commands emit approval.request instead of
  silently falling through the CLI input() path and auto-denying
- approval UX: dedicated PromptZone between transcript and composer, safer
  defaults (sel=0, numeric quick-picks, no Esc=deny), activity trail line,
  outcome footer under the cost row
- text input: Ctrl+A select-all, real forward Delete, Ctrl+W always consumed
  (fixes Ctrl+Backspace at cursor 0 inserting literal w)
- hermes-ink selection: swap synchronous onRender() for throttled
  scheduleRender() on drag, and only notify React subscribers on presence
  change — no more per-cell paint/subscribe spam
- useConfigSync: silence config.get polling failures instead of surfacing
  'error: timeout: config.get' in the transcript
2026-04-17 10:37:48 -05:00
Brooklyn Nicholson 0219da9626 chore: uptick 2026-04-17 09:47:19 -05:00
Brooklyn Nicholson 1f37ef2fd1 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 08:59:33 -05:00
Teknium 6ea7386a6f chore: map memosr, anthhub, shenuu, xiayh0107 emails to AUTHOR_MAP 2026-04-17 06:50:36 -07:00
Young Sherlock 8dcd08d8bb Fix Weixin media uploads and refresh lockfile 2026-04-17 06:50:36 -07:00
shenuu 3a0ec1d935 fix(weixin): macOS SSL cert, QR data, and refresh rendering
- Use certifi CA bundle for aiohttp SSL in qr_login(), start(), and
  send_weixin_direct() to fix SSL verification failures against
  Tencent's iLink server on macOS (Homebrew OpenSSL lacks system certs)
- Fix QR code data: encode qrcode_img_content (full liteapp URL) instead
  of raw hex token — WeChat needs the full URL to resolve the scan
- Render ASCII QR on refresh so the user can re-scan without restarting
- Improve error message on QR render failure to show the actual exception

Tested on macOS (Apple Silicon, Homebrew Python 3.13)
2026-04-17 06:50:36 -07:00
jinzheng8115 e105b7ac93 fix(weixin): retry send without context_token on iLink session expiry
iLink context_token has a limited TTL. When no user message has arrived
for an extended period (e.g. overnight), cron-initiated pushes fail with
errcode -14 (session timeout).

Tested that iLink accepts sends without context_token as a degraded
fallback, so we now automatically strip the expired token and retry
once. This keeps scheduled push messages (weather, digests, etc.)
working reliably without requiring a user message to refresh the
session first.

Changes:
- _send_text_chunk() catches iLinkDeliveryError with session-expired
  errcode (-14) and retries without context_token
- Stale tokens are cleared from ContextTokenStore on session expiry
- All 34 existing weixin tests pass
2026-04-17 06:50:36 -07:00
anthhub 4b1567f425 fix(packaging): include qrcode in messaging extra 2026-04-17 06:50:36 -07:00
memosr cedc95c100 fix(security): validate WeChat media URLs against CDN allowlist to prevent SSRF 2026-04-17 06:50:36 -07:00
Teknium c7334b4a50 chore(release): map @Hypn0sis and @OwenYWT to AUTHOR_MAP 2026-04-17 06:46:52 -07:00
Teknium 3f3d8a7b24 fix(discord): strip mention syntax from auto-thread names
Previously a message like `<@&1490963422786093149> help` would spawn a
thread literally named `<@&1490963422786093149> help`, exposing raw
Discord mention markers in the thread list. Only user mentions
(`<@id>`) were being stripped upstream — role mentions (`<@&id>`) and
channel mentions (`<#id>`) leaked through.

Fix: strip all three mention patterns in `_auto_create_thread` before
building the thread name. Collapse runs of whitespace left by the
removal. If the entire content was mention-only, fall back to 'Hermes'
instead of an empty title.

Fixes #6336.

Tests: two new regression guards in test_discord_slash_commands.py
covering mixed-mention content and mention-only content.
2026-04-17 06:46:52 -07:00
sgaofen 32a694ad5f fix(discord): fall back when auto-thread creation fails 2026-04-17 06:46:52 -07:00
OwenYWT f5dc4e905d fix(discord): skip auto-threading reply messages 2026-04-17 06:46:52 -07:00
Matteo De Agazio 93fe4b357d fix(discord): free-response channels skip auto-threading
Free-response channels already bypassed the @mention gate so users could
chat inline with the bot, but auto-threading still fired on every
message — spinning off a thread per message and defeating the
lightweight-chat purpose.

Fix: fold `is_free_channel` into `skip_thread` so threading is skipped
whenever the channel is in DISCORD_FREE_RESPONSE_CHANNELS (via env or
discord.free_response_channels in config.yaml).

Net change: one line in _handle_message + one regression test.

Partially addresses #9399. Authored by @Hypn0sis (salvaged from PR #9650;
the bundled 'smart' auto-thread mode from that PR was dropped in favor
of deterministic true/false semantics).
2026-04-17 06:46:52 -07:00
Teknium 8d7b7feb0d fix(gateway): bound _agent_cache with LRU cap + idle TTL eviction (#11565)
* fix(gateway): bound _agent_cache with LRU cap + idle TTL eviction

The per-session AIAgent cache was unbounded. Each cached AIAgent holds
LLM clients, tool schemas, memory providers, and a conversation buffer.
In a long-lived gateway serving many chats/threads, cached agents
accumulated indefinitely — entries were only evicted on /new, /model,
or session reset.

Changes:
- Cache is now an OrderedDict so we can pop least-recently-used entries.
- _enforce_agent_cache_cap() pops entries beyond _AGENT_CACHE_MAX_SIZE=64
  when a new agent is inserted. LRU order is refreshed via move_to_end()
  on cache hits.
- _sweep_idle_cached_agents() evicts entries whose AIAgent has been idle
  longer than _AGENT_CACHE_IDLE_TTL_SECS=3600s. Runs from the existing
  _session_expiry_watcher so no new background task is created.
- The expiry watcher now also pops the cache entry after calling
  _cleanup_agent_resources on a flushed session — previously the agent
  was shut down but its reference stayed in the cache dict.
- Evicted agents have _cleanup_agent_resources() called on a daemon
  thread so the cache lock isn't held during slow teardown.

Both tuning constants live at module scope so tests can monkeypatch
them without touching class state.

Tests: 7 new cases in test_agent_cache.py covering LRU eviction,
move_to_end refresh, cleanup thread dispatch, idle TTL sweep,
defensive handling of agents without _last_activity_ts, and plain-dict
test fixture tolerance.

* tweak: bump _AGENT_CACHE_MAX_SIZE 64 -> 128

* fix(gateway): never evict mid-turn agents; live spillover tests

The prior commit could tear down an active agent if its session_key
happened to be LRU when the cap was exceeded.  AIAgent.close() kills
process_registry entries for the task, tears down the terminal
sandbox, closes the OpenAI client (sets self.client = None), and
cascades .close() into any active child subagents — all fatal if
the agent is still processing a turn.

Changes:
- _enforce_agent_cache_cap and _sweep_idle_cached_agents now look at
  GatewayRunner._running_agents and skip any entry whose AIAgent
  instance is present (identity via id(), so MagicMock doesn't
  confuse lookup in tests).  _AGENT_PENDING_SENTINEL is treated
  as 'not active' since no real agent exists yet.
- Eviction only considers the LRU-excess window (first size-cap
  entries).  If an excess slot is held by a mid-turn agent, we skip
  it WITHOUT compensating by evicting a newer entry.  A freshly
  inserted session (zero cache history) shouldn't be punished to
  protect a long-lived one that happens to be busy.
- Cache may therefore stay transiently over cap when load spikes;
  a WARNING is logged so operators can see it, and the next insert
  re-runs the check after some turns have finished.

New tests (TestAgentCacheActiveSafety + TestAgentCacheSpilloverLive):
- Active LRU entry is skipped; no newer entry compensated
- Mixed active/idle excess window: only idle slots go
- All-active cache: no eviction, WARNING logged, all clients intact
- _AGENT_PENDING_SENTINEL doesn't block other evictions
- Idle-TTL sweep skips active agents
- End-to-end: active agent's .client survives eviction attempt
- Live fill-to-cap with real AIAgents, then spillover
- Live: CAP=4 all active + 1 newcomer — cache grows to 5, no teardown
- Live: 8 threads racing 160 inserts into CAP=16 — settles at 16
- Live: evicted session's next turn gets a fresh agent that works

30 tests pass (13 pre-existing + 17 new).  Related gateway suites
(model switch, session reset, proxy, etc.) all green.

* fix(gateway): cache eviction preserves per-task state for session resume

The prior commits called AIAgent.close() on cache-evicted agents, which
tears down process_registry entries, terminal sandbox, and browser
daemon for that task_id — permanently. Fine for session-expiry (session
ended), wrong for cache eviction (session may resume).

Real-world scenario: a user leaves a Telegram session open for 2+ hours,
idle TTL evicts the cached AIAgent, user returns and sends a message.
Conversation history is preserved via SessionStore, but their terminal
sandbox (cwd, env vars, bg shells) and browser state were destroyed.

Fix: split the two cleanup modes.

  close()               Full teardown — session ended. Kills bg procs,
                        tears down terminal sandbox + browser daemon,
                        closes LLM client. Used by session-expiry,
                        /new, /reset (unchanged).

  release_clients()     Soft cleanup — session may resume. Closes
                        LLM client only. Leaves process_registry,
                        terminal sandbox, browser daemon intact
                        for the resuming agent to inherit via
                        shared task_id.

Gateway cache eviction (_enforce_agent_cache_cap, _sweep_idle_cached_agents)
now dispatches _release_evicted_agent_soft on the daemon thread instead
of _cleanup_agent_resources. All session-expiry call sites of
_cleanup_agent_resources are unchanged.

Tests (TestAgentCacheIdleResume, 5 new cases):
- release_clients does NOT call process_registry.kill_all
- release_clients does NOT call cleanup_vm / cleanup_browser
- release_clients DOES close the LLM client (agent.client is None after)
- close() vs release_clients() — semantic contract pinned
- Idle-evicted session's rebuild with same session_id gets same task_id

Updated test_cap_triggers_cleanup_thread to assert the soft path fires
and the hard path does NOT.

35 tests pass in test_agent_cache.py; 67 related tests green.
2026-04-17 06:36:34 -07:00
Teknium fc04f83062 chore(release): map jvcl author email for release notes 2026-04-17 06:33:21 -07:00
Jorge fe0e7edd27 fix(cli): clear input buffer after /model picker selection
The Enter handler that confirms a selection in the /model picker closed
the picker but never reset event.app.current_buffer, leaving the user's
original "/model" command lingering in the prompt. Match the ESC and
Ctrl+C handlers (which already reset the buffer) so the prompt is empty
after a successful switch.
2026-04-17 06:33:21 -07:00
Jorge 86f02d8d71 refactor(cli): align model picker viewport with PR #11260 vocabulary
Match the row-budget naming introduced in PR #11260 for the approval and
clarify panels: rename chrome_reserve=14 into reserved_below=6 (input
chrome below the panel) + panel_chrome=6 (this panel's borders, blanks,
and hint row) + min_visible=3 (floor on visible items). Same arithmetic
as before, but a reviewer reading both files now sees the same handle.

Compact-chrome mode is intentionally not adopted — that pattern fits the
"fixed mandatory content might overflow" shape of approval/clarify
(solved by truncating with a marker), whereas the picker's overflow is
already handled by the scrolling viewport.
2026-04-17 06:33:21 -07:00
Jorge 5fbe16635b fix(cli): scroll the /model picker viewport so long catalogs aren't clipped
The /model picker rendered every choice into a prompt_toolkit Window
with no max height. Providers with many models (e.g. Ollama Cloud's 36+)
overflowed the terminal, clipping the bottom border and the last items.

- Add HermesCLI._compute_model_picker_viewport() to slide a scroll
  offset that keeps the cursor on screen, sized from the live terminal
  rows minus chrome reserved for input/status/border.
- Render only the visible slice in _get_model_picker_display() and
  persist the offset on _model_picker_state across redraws.
- Bind ESC (eager) to close the picker, matching the Cancel button.
- Cover the viewport math with 8 unit tests in
  tests/hermes_cli/test_model_picker_viewport.py.
2026-04-17 06:33:21 -07:00
Teknium fdf42d62a0 chore: map briandevans and LLQWQ emails to AUTHOR_MAP 2026-04-17 06:26:43 -07:00
Teknium f64241ed90 feat(cron+tests): extend origin fallback to email/dingtalk/qqbot + fix Weixin test mocks
Cron origin fallback extension (builds on #9193's _HOME_TARGET_ENV_VARS):
adds the three remaining origin-fallback-eligible platforms that have
home channel env vars configured in gateway/config.py but use non-generic
env var names:

- email    → EMAIL_HOME_ADDRESS   (non-standard suffix)
- dingtalk → DINGTALK_HOME_CHANNEL
- qqbot    → QQ_HOME_CHANNEL      (non-standard prefix: QQ_ not QQBOT_)

Picks up the completeness intent of @Xowiek's PR #11317 using the
architecturally-correct dict-based lookup from #9193, so platforms with
non-standard env var names actually resolve instead of silently missing.
Extended the parametrized regression test to cover the new three.

Weixin test mock alignment (builds on #10091's _send_session split):
Three test sites added in Batch 1 (TestWeixinSendImageFileParameterName)
and Batch 3 (TestWeixinVoiceSending) mocked only adapter._session, but
#10091 switched the send paths to check self._send_session. Added the
companion setter so the tests stay green with the session split in place.
2026-04-17 06:26:43 -07:00
bde3249023 b46db048c3 fix(cron): align home target env lookup 2026-04-17 06:26:43 -07:00
bde3249023 f696b4745a fix(cron): restore origin fallback for feishu home channels 2026-04-17 06:26:43 -07:00
Ubuntu 5ca52bae5b fix(gateway/weixin): split poll/send sessions, reuse live adapter for cron & send_message
- gateway/platforms/weixin.py:
  - Split aiohttp.ClientSession into _poll_session and _send_session
  - Add _LIVE_ADAPTERS registry so send_weixin_direct() reuses the connected gateway adapter instead of creating a competing session
  - Fixes silent message loss when gateway is running (iLink token contention)

- cron/scheduler.py:
  - Support comma-separated deliver values (e.g. 'feishu,weixin') for multi-target delivery
  - Delay pconfig/enabled check until standalone fallback so live adapters work even when platform is not in gateway config

- tools/send_message_tool.py:
  - Synthesize PlatformConfig from WEIXIN_* env vars when gateway config lacks a weixin entry
  - Fall back to WEIXIN_HOME_CHANNEL env var for home channel resolution

- tests/gateway/test_weixin.py:
  - Update mocks to include _send_session
2026-04-17 06:26:43 -07:00
Teknium c60b6dc317 test(dingtalk): cover get_connected_platforms + null platform_toolsets
Follow-ups to the salvaged commits in this PR:

* gateway/config.py — strip trailing whitespace from youngDoo's diff
  (line 315 had ~140 trailing spaces).

* hermes_cli/tools_config.py — replace `config.get("platform_toolsets", {})`
  with `config.get("platform_toolsets") or {}`. Handles the case where the
  YAML key is present but explicitly null (parses as None, previously
  crashed with AttributeError on the next line's .get(platform)).
  Cherry-picked from yyq4193's #9003 with attribution.

* tests/gateway/test_config.py — 4 new tests for TestGetConnectedPlatforms
  covering DingTalk via extras, via env vars, disabled, and missing creds.

* tests/hermes_cli/test_tools_config.py — regression test for the null
  platform_toolsets edge case.

* scripts/release.py — add kagura-agent, youngDoo, yyq4193 to AUTHOR_MAP.

Co-authored-by: yyq4193 <39405770+yyq4193@users.noreply.github.com>
2026-04-17 06:26:18 -07:00
kagura-agent 47a0dd1024 fix(dingtalk): fire-and-forget message processing & session_webhook fallback
Fixes #11463: DingTalk channel receives messages but fails to reply
with 'No session_webhook available'.

Two changes:

1. **Fire-and-forget message processing**: process() now dispatches
   _on_message as a background task via asyncio.create_task instead of
   awaiting it. This ensures the SDK ACK is returned immediately,
   preventing heartbeat timeouts and disconnections when message
   processing takes longer than the SDK's ACK deadline.

2. **session_webhook extraction fallback**: If ChatbotMessage.from_dict()
   fails to map the sessionWebhook field (possible across SDK versions),
   the handler now falls back to extracting it directly from the raw
   callback data dict using both 'sessionWebhook' and 'session_webhook'
   key variants.

Added 3 tests covering webhook extraction, fallback behavior, and
fire-and-forget ACK timing.
2026-04-17 06:26:18 -07:00
youngDoo 91e7aff219 gateway cant add DingTalk platform
gateway cant add DingTalk platform without key and secret
2026-04-17 06:26:18 -07:00
Teknium d404849351 test: make test env hermetic; enforce CI parity via scripts/run_tests.sh (#11577)
* test: make test env hermetic; enforce CI parity via scripts/run_tests.sh

Fixes the recurring 'works locally, fails in CI' (and vice versa) class
of flakes by making tests hermetic and providing a canonical local runner
that matches CI's environment.

## Layer 1 — hermetic conftest.py (tests/conftest.py)

Autouse fixture now unsets every credential-shaped env var before every
test, so developer-local API keys can't leak into tests that assert
'auto-detect provider when key present'.

Pattern: unset any var ending in _API_KEY, _TOKEN, _SECRET, _PASSWORD,
_CREDENTIALS, _ACCESS_KEY, _PRIVATE_KEY, etc. Plus an explicit list of
credential names that don't fit the suffix pattern (AWS_ACCESS_KEY_ID,
FAL_KEY, GH_TOKEN, etc.) and all the provider BASE_URL overrides that
change auto-detect behavior.

Also unsets HERMES_* behavioral vars (HERMES_YOLO_MODE, HERMES_QUIET,
HERMES_SESSION_*, etc.) that mutate agent behavior.

Also:
  - Redirects HOME to a per-test tempdir (not just HERMES_HOME), so
    code reading ~/.hermes/* directly can't touch the real dir.
  - Pins TZ=UTC, LANG=C.UTF-8, LC_ALL=C.UTF-8, PYTHONHASHSEED=0 to
    match CI's deterministic runtime.

The old _isolate_hermes_home fixture name is preserved as an alias so
any test that yields it explicitly still works.

## Layer 2 — scripts/run_tests.sh canonical runner

'Always use scripts/run_tests.sh, never call pytest directly' is the
new rule (documented in AGENTS.md). The script:
  - Unsets all credential env vars (belt-and-suspenders for callers
    who bypass conftest — e.g. IDE integrations)
  - Pins TZ/LANG/PYTHONHASHSEED
  - Uses -n 4 xdist workers (matches GHA ubuntu-latest; -n auto on
    a 20-core workstation surfaces test-ordering flakes CI will never
    see, causing the infamous 'passes in CI, fails locally' drift)
  - Finds the venv in .venv, venv, or main checkout's venv
  - Passes through arbitrary pytest args

Installs pytest-split on demand so the script can also be used to run
matrix-split subsets locally for debugging.

## Remove 3 module-level dotenv stubs that broke test isolation

tests/hermes_cli/test_{arcee,xiaomi,api_key}_provider.py each had a
module-level:

    if 'dotenv' not in sys.modules:
        fake_dotenv = types.ModuleType('dotenv')
        fake_dotenv.load_dotenv = lambda *a, **kw: None
        sys.modules['dotenv'] = fake_dotenv

This patches sys.modules['dotenv'] to a fake at import time with no
teardown. Under pytest-xdist LoadScheduling, whichever worker collected
one of these files first poisoned its sys.modules; subsequent tests in
the same worker that imported load_dotenv transitively (e.g.
test_env_loader.py via hermes_cli.env_loader) got the no-op lambda and
saw their assertions fail.

dotenv is a required dependency (python-dotenv>=1.2.1 in pyproject.toml),
so the defensive stub was never needed. Removed.

## Validation

- tests/hermes_cli/ alone: 2178 passed, 1 skipped, 0 failed (was 4
  failures in test_env_loader.py before this fix)
- tests/test_plugin_skills.py, tests/hermes_cli/test_plugins.py,
  tests/test_hermes_logging.py combined: 123 passed (the caplog
  regression tests from PR #11453 still pass)
- Local full run shows no F/E clusters in the 0-55% range that were
  previously present before the conftest hardening

## Background

See AGENTS.md 'Testing' section for the full list of drift sources
this closes. Matrix split (closed as #11566) will be re-attempted
once this foundation lands — cross-test pollution was the root cause
of the shard-3 hang in that PR.

* fix(conftest): don't redirect HOME — it broke CI subprocesses

PR #11577's autouse fixture was setting HOME to a per-test tempdir.
CI started timing out at 97% complete with dozens of E/F markers and
orphan python processes at cleanup — tests (or transitive deps)
spawn subprocesses that expect a stable HOME, and the redirect broke
them in non-obvious ways.

Env-var unsetting and TZ/LANG/hashseed pinning (the actual CI-drift
fixes) are unchanged and still in place. HERMES_HOME redirection is
also unchanged — that's the canonical way to isolate tests from
~/.hermes/, not HOME.

Any code in the codebase reading ~/.hermes/* via `Path.home() / ".hermes"`
instead of `get_hermes_home()` is a bug to fix at the callsite, not
something to paper over in conftest.
2026-04-17 06:09:09 -07:00
Teknium ee95822e07 chore(release): map jz.pentest@gmail.com to @0xyg3n 2026-04-17 05:48:26 -07:00
Teknium e5b880264b fix(discord): harden DISCORD_ALLOWED_ROLES and cover gateway layer
Two follow-ups to the cherry-picked PR #9873 (`e3bcc819`):

1. `_is_allowed_user` now uses `getattr(self, '_allowed_*_ids', set())`
   so test fixtures that build the adapter via `object.__new__`
   (skipping __init__) don't crash with AttributeError.
   See AGENTS.md pitfall #17 — same pattern as gateway.run.

2. New 3-case regression coverage in test_discord_bot_auth_bypass.py:
   - role-only config bypasses the gateway 'no allowlists' branch
   - roles + users combined still authorizes user-allowlist matches
   - the role bypass does NOT leak to other platforms (Telegram, etc.)

3. Autouse fixture in test_discord_bot_auth_bypass.py clears all Discord
   auth env vars before each test so DISCORD_ALLOWED_ROLES leakage from
   a previous test in the session can't flip later 'should-reject' tests
   into false-pass.

Required because the bare cherry-pick of #9873 only added the adapter-
level role check — it didn't cover the gateway-level _is_user_authorized,
which still rejected role-only setups via the 'no allowlists configured'
branch.
2026-04-17 05:48:26 -07:00
0xyg3n 541a3e27d7 feat(discord): add DISCORD_ALLOWED_ROLES env var for role-based access control
Adds a new DISCORD_ALLOWED_ROLES environment variable that allows filtering
bot interactions by Discord role ID. Uses OR semantics with the existing
DISCORD_ALLOWED_USERS - if a user matches either allowlist, they're permitted.

Changes:
- Parse DISCORD_ALLOWED_ROLES comma-separated role IDs on connect
- Enable members intent when roles are configured (needed for role lookup)
- Update _is_allowed_user() to accept optional author param for direct role check
- Fallback to scanning mutual guilds when author object lacks roles (DMs, voice)
- Fully backwards compatible: no behavior change when env var is unset
2026-04-17 05:48:26 -07:00
Teknium 0741f22463 chore(release): map gnanasekaran.sekareee@gmail.com to @gnanam1990 2026-04-17 05:42:04 -07:00
Teknium 7d888ab49c test(discord): regression guard for DISCORD_ALLOW_BOTS auth bypass
Six test cases covering:
- DISCORD_ALLOW_BOTS=mentions + bot not in DISCORD_ALLOWED_USERS → authorized
- DISCORD_ALLOW_BOTS=all + bot not in DISCORD_ALLOWED_USERS → authorized
- DISCORD_ALLOW_BOTS=none → bots still rejected (preserves security)
- DISCORD_ALLOW_BOTS unset → same as 'none'
- Humans still checked against allowlist even with allow_bots=all
- Bot bypass is Discord-specific — doesn't leak to other platforms

Guards against a regression where the is_bot bypass in _is_user_authorized
gets moved, removed, or accidentally extended to other platforms.
2026-04-17 05:42:04 -07:00
gnanam1990 0f4403346d fix(discord): DISCORD_ALLOW_BOTS=mentions/all now works without DISCORD_ALLOWED_USERS
Fixes #4466.

Root cause: two sequential authorization gates both independently rejected
bot messages, making DISCORD_ALLOW_BOTS completely ineffective.

Gate 1 — `discord.py` `on_message`:
    _is_allowed_user ran BEFORE the bot filter, so bot senders were dropped
    before the DISCORD_ALLOW_BOTS policy was ever evaluated.

Gate 2 — `gateway/run.py` _is_user_authorized:
    The gateway-level allowlist check rejected bot IDs with 'Unauthorized
    user: <bot_id>' even if they passed Gate 1.

Fix:

  gateway/platforms/discord.py — reorder on_message so DISCORD_ALLOW_BOTS
  runs BEFORE _is_allowed_user. Bots permitted by the filter skip the
  user allowlist; non-bots are still checked.

  gateway/session.py — add is_bot: bool = False to SessionSource so the
  gateway layer can distinguish bot senders.

  gateway/platforms/base.py — expose is_bot parameter in build_source.

  gateway/platforms/discord.py _handle_message — set is_bot=True when
  building the SessionSource for bot authors.

  gateway/run.py _is_user_authorized — when source.is_bot is True AND
  DISCORD_ALLOW_BOTS is 'mentions' or 'all', return True early. Platform
  filter already validated the message at on_message; don't re-reject.

Behavior matrix:

  | Config                                     | Before  | After   |
  | DISCORD_ALLOW_BOTS=none (default)          | Blocked | Blocked |
  | DISCORD_ALLOW_BOTS=all                     | Blocked | Allowed |
  | DISCORD_ALLOW_BOTS=mentions + @mention     | Blocked | Allowed |
  | DISCORD_ALLOW_BOTS=mentions, no mention    | Blocked | Blocked |
  | Human in DISCORD_ALLOWED_USERS             | Allowed | Allowed |
  | Human NOT in DISCORD_ALLOWED_USERS         | Blocked | Blocked |

Co-authored-by: Hermes Maintainer <hermes@nousresearch.com>
2026-04-17 05:42:04 -07:00
Teknium d7fb435e0e fix(discord): flat /skill command with autocomplete — fits 8KB limit trivially (#11580)
Closes #11321, closes #10259.

## Problem

The nested /skill command group (category subcommand groups + skill
subcommands) serialized to ~14KB with the default 75-skill catalog,
exceeding Discord's ~8000-byte per-command registration payload. The
entire tree.sync() rejected with error 50035 — ALL slash commands
including the 27 base commands failed to register.

## Fix

Replace the nested Group layout with a single flat Command:

    /skill name:<autocomplete> args:<optional string>

Autocomplete options are fetched dynamically by Discord when the user
types — they do NOT count against the per-command registration budget.
So this single command registers at ~200 bytes regardless of how many
skills exist. Scales to thousands of skills with no size calculations,
no splitting, no hidden skills.

UX improvements:
- Discord live-filters by user's typed prefix against BOTH name and
  description, so '/skill pdf' finds 'ocr-and-documents' via its
  description. More discoverable than clicking through category menus.
- Unknown skill name → ephemeral error pointing user at autocomplete.
- Stable alphabetical ordering across restarts.

## Why not the other proposed approaches

Three prior PRs tried to fit within the 8KB limit by modifying the
nested layout:

- #10214 (njiangk): truncated all descriptions to 'Run <name>' and
  category descriptions to 'Skills'. Works but destroys slash picker UX.
- #11385 (LeonSGP43): 40-char description clamp + iterative
  trim-largest-category fallback. Works but HIDES skills the user can
  no longer invoke via slash — functional regression.
- #10261 (zeapsu): adaptive split into /skill-<cat> top-level groups.
  Preserves all skills but pollutes the slash namespace with 20
  top-level commands.

All three work around the symptom. The flat autocomplete design
dissolves the problem — there is no payload-size pressure to manage.

## Tests

tests/gateway/test_discord_slash_commands.py — 5 new test cases replace
the 3 old nested-structure tests:

- flat-not-nested structure assertion
- empty skills → no command registered
- callback dispatches the right cmd_key by name
- unknown name → ephemeral error, no dispatch
- large-catalog regression guard (500 skills) — command payload stays
  under 500 bytes regardless

E2E validated against real discord.py 2.7.1:
- Command registers as discord.app_commands.Command (not Group).
- Autocomplete filters by name AND description (verified across several
  queries including description-only matches like 'pdf' → OCR skill).
- 500-skill catalog returns max 25 results per autocomplete query
  (Discord's hard cap), filtered correctly.
- Choice labels formatted as 'name — description' clamped to 100 chars.
2026-04-17 05:19:14 -07:00
Teknium 13f2d997b0 test(dingtalk): cover QR device-flow auth + OpenClaw branding disclosure
Adds 15 regression tests for hermes_cli/dingtalk_auth.py covering:
  * _api_post — network error mapping, errcode-nonzero mapping, success path
  * begin_registration — 2-step chain, missing-nonce/device_code/uri
    error cases
  * wait_for_registration_success — success path, missing-creds guard,
    on_waiting callback invocation
  * render_qr_to_terminal — returns False when qrcode missing, prints
    when available
  * Configuration — BASE_URL default + override, SOURCE default

Also adds a one-line disclosure in dingtalk_qr_auth() telling users
the scan page will be OpenClaw-branded. Interim measure: DingTalk's
registration portal is hardcoded to route all sources to /openapp/
registration/openClaw, so users see OpenClaw branding regardless of
what 'source' value we send. We keep 'openClaw' as the source token
until DingTalk-Real-AI registers a Hermes-specific template.

Also adds meng93 to scripts/release.py AUTHOR_MAP.
2026-04-17 05:08:07 -07:00
meng93 9deeee7bb7 feat(dingtalk): add QR code auth support and fix 3 critical bugs
- feat: support one-click QR scan to create DingTalk bot and establish connection
- fix(gateway): wrap blocking DingTalkStreamClient.start() with asyncio.to_thread()
- fix(gateway): extract message fields from CallbackMessage payload instead of ChatbotMessage
- fix(gateway): add oapi.dingtalk.com to allowed webhook URL domains
2026-04-17 05:08:07 -07:00
Teknium 08930a65ea chore: map Patrick Wang, Hedgeho9, Berny Linville emails to AUTHOR_MAP 2026-04-17 05:01:29 -07:00
Berny Linville 6ee65b4d61 fix(weixin): preserve native markdown rendering
- stop rewriting markdown tables, headings, and links before delivery
- keep markdown table blocks and headings together during chunking
- update Weixin tests and docs for native markdown rendering

Closes #10308
2026-04-17 05:01:29 -07:00
Hedgeho9 498fc6780e fix(weixin): extract and deliver MEDIA: attachments in normal send() path
The Weixin adapter's send() method previously split and delivered the
raw response text without first extracting MEDIA: tags or bare local
file paths. This meant images, documents, and voice files referenced
by the agent were silently dropped in normal (non-streaming,
non-background) conversations.

Changes:
- In WeixinAdapter.send(), call extract_media() and
  extract_local_files() before formatting/splitting text.
- Deliver extracted files via send_image_file(), send_document(),
  send_voice(), or send_video() prior to sending text chunks.
- Also fix two minor typing issues in gateway/run.py where
  extract_media() tuples were not unpacked correctly in background
  and /btw task handlers.

Fixes missing media delivery on Weixin personal accounts.
2026-04-17 05:01:29 -07:00
Patrick Wang 4ed6e4c1a5 refactor(weixin): drop pilk dependency from voice fallback 2026-04-17 05:01:29 -07:00
Patrick Wang 649f38390c fix: force Weixin voice fallback to file attachments 2026-04-17 05:01:29 -07:00
Patrick Wang 678b69ec1b fix(weixin): use Tencent SILK encoding for voice replies 2026-04-17 05:01:29 -07:00
Teknium 53da34a4fc fix(discord): route attachment downloads through authenticated bot session (#11568)
Three open issues — #8242, #6587, #11345 — all trace to the same root
cause: the image / audio / document download paths in
`DiscordAdapter._handle_message` used plain, unauthenticated HTTP to
fetch `att.url`. That broke in three independent ways:

  #8242  cdn.discordapp.com attachment URLs increasingly require the
         bot session to download; unauthenticated httpx sees 403
         Forbidden, image/voice analysis fail silently.

  #6587  Some user environments (VPNs, corporate DNS, tunnels) resolve
         cdn.discordapp.com to private-looking IPs. Our is_safe_url()
         guard correctly blocks them as SSRF risks, but the user
         environment is legitimate — image analysis and voice STT die.

  #11345 The document download path skipped is_safe_url() entirely —
         raw aiohttp.ClientSession.get(att.url) with no SSRF check,
         inconsistent with the image/audio branches.

Unified fix: use `discord.Attachment.read()` as the primary download
path on all three branches. `att.read()` routes through discord.py's
own authenticated HTTPClient, so:

  - Discord CDN auth is handled (#8242 resolved).
  - Our is_safe_url() gate isn't consulted for the attachment path at
    all — the bot session handles networking internally (#6587 resolved).
  - All three branches now share the same code path, eliminating the
    document-path SSRF gap (#11345 resolved).

Falls back to the existing cache_*_from_url helpers (image/audio) or an
SSRF-gated aiohttp fetch (documents) when `att.read()` is unavailable
or fails — preserves defense-in-depth for any future payload-schema
drift that could slip a non-CDN URL into att.url.

New helpers on DiscordAdapter:
  - _read_attachment_bytes(att)  — safe att.read() wrapper
  - _cache_discord_image(att, ext)     — primary + URL fallback
  - _cache_discord_audio(att, ext)     — primary + URL fallback
  - _cache_discord_document(att, ext)  — primary + SSRF-gated aiohttp fallback

Tests:
  - tests/gateway/test_discord_attachment_download.py — 12 new cases
    covering all three helpers: primary path, fallback on missing
    .read(), fallback on validator rejection, SSRF guard on document
    fallback, aiohttp fallback happy-path, and an E2E case via
    _handle_message confirming cache_image_from_url is never invoked
    when att.read() succeeds.
  - All 11 existing document-handling tests continue to pass via the
    aiohttp fallback path (their SimpleNamespace attachments have no
    .read(), which triggers the fallback — now SSRF-gated).

Closes #8242, closes #6587, closes #11345.
2026-04-17 04:59:03 -07:00
Teknium 24342813fe fix(qqbot): correct Authorization header format in send_message REST path (#11569)
The send_message tool's direct-REST QQBot path used "QQBotAccessToken {token}"
which QQ's API rejects with 401. The correct format is "QQBot {token}" — the
gateway adapter at gateway/platforms/qqbot.py uses this format in all 5 header
sites (lines 341, 551, 579, 1068, 1467); this was the one outlier.

Credit to @Quon for surfacing this in #10257 (that PR had unrelated issues in
its media-upload logic and was closed; this salvages the genuine 1-line fix).
2026-04-17 04:25:47 -07:00
Teknium ca03e80348 chore: map LehaoLin email to AUTHOR_MAP for release script 2026-04-17 04:22:40 -07:00
LehaoLin 504e7eb9e5 fix(gateway): wait for reconnection before dropping WebSocket sends
When a WebSocket-based platform adapter (e.g. QQ Bot) temporarily
loses its connection, send() now polls is_connected for up to 15s
instead of immediately returning a non-retryable failure. If the
auto-reconnect completes within the window, the message is delivered
normally. On timeout, the SendResult is marked retryable=True so the
base class retry mechanism can attempt re-delivery.

Same treatment applied to _send_media().

Adds 4 async tests covering:
- Successful send after simulated reconnection
- Retryable failure on timeout
- Immediate success when already connected
- _send_media reconnection wait

Fixes #11163
2026-04-17 04:22:40 -07:00
dieutx b594b30de4 fix(release): map dieutx email in author map 2026-04-17 04:22:40 -07:00
dieutx 995177d542 fix(gateway): honor QQ_GROUP_ALLOWED_USERS in runner auth 2026-04-17 04:22:40 -07:00
Pedro Gonzalez 590c9964e1 Fix QQ voice attachment SSRF validation 2026-04-17 04:22:40 -07:00
yeyitech a97b08e30c fix: allow trusted QQ CDN benchmark IP resolution 2026-04-17 04:22:40 -07:00
Teknium aca81ac7bb test(dingtalk): cover require_mention + allowed_users gating
Adds 16 regression tests for the gating logic introduced in the
salvaged commit:

  * TestAllowedUsersGate — empty/wildcard/case-insensitive matching,
    staff_id vs sender_id, env var CSV population
  * TestMentionPatterns — compilation, case-insensitivity, invalid
    regex is skipped-not-raised, JSON env var, newline fallback
  * TestShouldProcessMessage — DM always accepted, group gating via
    require_mention / is_in_at_list / wake-word pattern / free_response_chats

Also adds yule975 to scripts/release.py AUTHOR_MAP (release CI blocks
unmapped emails).
2026-04-17 04:21:49 -07:00
yule975 9039273ff0 feat(platforms): add require_mention + allowed_users gating to DingTalk
DingTalk was the only messaging platform without group-mention gating or a
per-user allowlist. Slack, Telegram, Discord, WhatsApp, Matrix, and Mattermost
all support these via config.yaml + matching env vars; this change closes the
gap for DingTalk using the same surface:

Config:
  platforms.dingtalk.require_mention: bool   (env: DINGTALK_REQUIRE_MENTION)
  platforms.dingtalk.mention_patterns: list  (env: DINGTALK_MENTION_PATTERNS)
  platforms.dingtalk.free_response_chats: list  (env: DINGTALK_FREE_RESPONSE_CHATS)
  platforms.dingtalk.allowed_users: list     (env: DINGTALK_ALLOWED_USERS)

Semantics mirror Telegram's implementation:
- DMs are always accepted (subject to allowed_users).
- Group messages are accepted only when the chat is allowlisted, mention is
  not required, the bot was @mentioned (dingtalk_stream sets is_in_at_list),
  or the text matches a configured regex wake-word.
- allowed_users matches sender_id / sender_staff_id case-insensitively;
  a single "*" disables the check.

Rationale: without this, any DingTalk user in a group chat can trigger the
bot, which makes DingTalk less safe to deploy than the other platforms. A
user's config.yaml already accepts require_mention for dingtalk but the value
was silently ignored.
2026-04-17 04:21:49 -07:00
Teknium 29d5d36b14 fix(copilot): normalize vendor-prefixed and dash-notation model IDs (#6879) (#11561)
The Copilot API returns HTTP 400 "model_not_supported" when it receives a
model ID it doesn't recognize (vendor-prefixed like
`anthropic/claude-sonnet-4.6` or dash-notation like `claude-sonnet-4-6`).
Two bugs combined to leave both formats unhandled:

1. `_COPILOT_MODEL_ALIASES` in hermes_cli/models.py only covered bare
   dot-notation and vendor-prefixed dot-notation.  Hermes' default Claude
   IDs elsewhere use hyphens (anthropic native format), and users with an
   aggregator-style config who switch `model.provider` to `copilot`
   inherit `anthropic/claude-X-4.6` — neither case was in the table.

2. The Copilot branch of `normalize_model_for_provider()` only stripped
   the vendor prefix when it matched the target provider (`copilot/`) or
   was the special-cased `openai/` for openai-codex.  Every other vendor
   prefix survived to the Copilot request unchanged.

Fix:

- Add dash-notation aliases (`claude-{opus,sonnet,haiku}-4-{5,6}` and the
  `anthropic/`-prefixed variants) to the alias table.
- Rewire the Copilot / Copilot-ACP branch of
  `normalize_model_for_provider()` to delegate to the existing
  `normalize_copilot_model_id()`.  That function already does alias
  lookups, catalog-aware resolution, and vendor-prefix fallback — it was
  being bypassed for the generic normalisation entry point.

Because `switch_model()` already calls `normalize_model_for_provider()`
for every `/model` switch (line 685 in model_switch.py), this single fix
covers the CLI startup path (cli.py), the `/model` slash command path,
and the gateway load-from-config path.

Closes #6879

Credits dsr-restyn (#6743) who independently diagnosed the dash-notation
case; their aliases are folded into this consolidated fix alongside the
vendor-prefix stripping repair.
2026-04-17 04:19:36 -07:00
Teknium eabe14af1c test(discord): update reply_mode fixture for new to_reference() wrapping
Follow-up to the reply-reference fix: `_make_discord_adapter` used to return
the raw fetched `Message` as the expected reference, but the adapter now
wraps it via `ref_msg.to_reference(fail_if_not_exists=False)` so Discord
treats a deleted target as 'send without reply chip'. Update the fixture
to return the MessageReference sentinel so the 4 chunk-reference-identity
tests assert against the right object.

No production behavior change; only aligns the stale test fixture.
2026-04-17 04:17:56 -07:00
Teknium ef37aa7cce test(discord): add regression guard for non-reference send errors
Follow-up to the reply-reference fix: ensure errors unrelated to the reply
reference (e.g. 50013 Missing Permissions) do NOT trigger the no-reference
retry path and still surface as a failed SendResult. Keeps the wider retry
condition from silently swallowing unrelated API errors.

Proposed in the original issue writeup (#11342) as test case
`test_non_reference_errors_still_propagate`.
2026-04-17 04:17:56 -07:00
LeonSGP43 a448e7a04d fix(discord): drop invalid reply references 2026-04-17 04:17:56 -07:00
Teknium 0231f8882b chore(release): add Asunfly to AUTHOR_MAP for #10070 salvage 2026-04-17 04:11:30 -07:00
Asunfly 7c932c5aa4 fix(dingtalk): close websocket on disconnect 2026-04-17 04:11:30 -07:00
Teknium f268215019 fix(auth): codex auth remove no longer silently undone by auto-import (#11485)
* feat(skills): add 'hermes skills reset' to un-stick bundled skills

When a user edits a bundled skill, sync flags it as user_modified and
skips it forever. The problem: if the user later tries to undo the edit
by copying the current bundled version back into ~/.hermes/skills/, the
manifest still holds the old origin hash from the last successful
sync, so the fresh bundled hash still doesn't match and the skill stays
stuck as user_modified.

Adds an escape hatch for this case.

  hermes skills reset <name>
      Drops the skill's entry from ~/.hermes/skills/.bundled_manifest and
      re-baselines against the user's current copy. Future 'hermes update'
      runs accept upstream changes again. Non-destructive.

  hermes skills reset <name> --restore
      Also deletes the user's copy and re-copies the bundled version.
      Use when you want the pristine upstream skill back.

Also available as /skills reset in chat.

- tools/skills_sync.py: new reset_bundled_skill(name, restore=False)
- hermes_cli/skills_hub.py: do_reset() + wired into skills_command and
  handle_skills_slash; added to the slash /skills help panel
- hermes_cli/main.py: argparse entry for 'hermes skills reset'
- tests/tools/test_skills_sync.py: 5 new tests covering the stuck-flag
  repro, --restore, unknown-skill error, upstream-removed-skill, and
  no-op on already-clean state
- website/docs/user-guide/features/skills.md: new 'Bundled skill updates'
  section explaining the origin-hash mechanic + reset usage

* fix(auth): codex auth remove no longer silently undone by auto-import

'hermes auth remove openai-codex' appeared to succeed but the credential
reappeared on the next command.  Two compounding bugs:

1. _seed_from_singletons() for openai-codex unconditionally re-imports
   tokens from ~/.codex/auth.json whenever the Hermes auth store is
   empty (by design — the Codex CLI and Hermes share that file).  There
   was no suppression check, unlike the claude_code seed path.

2. auth_remove_command's cleanup branch only matched
   removed.source == 'device_code' exactly.  Entries added via
   'hermes auth add openai-codex' have source 'manual:device_code', so
   for those the Hermes auth store's providers['openai-codex'] state was
   never cleared on remove — the next load_pool() re-seeded straight
   from there.

Net effect: there was no way to make a codex removal stick short of
manually editing both ~/.hermes/auth.json and ~/.codex/auth.json before
opening Hermes again.

Fix:

- Add unsuppress_credential_source() helper (mirrors
  suppress_credential_source()).
- Gate the openai-codex branch in _seed_from_singletons() with
  is_source_suppressed(), matching the claude_code pattern.
- Broaden auth_remove_command's codex match to handle both
  'device_code' and 'manual:device_code' (via endswith check), always
  call suppress_credential_source(), and print guidance about the
  unchanged ~/.codex/auth.json file.
- Clear the suppression marker in auth_add_command's openai-codex
  branch so re-linking via 'hermes auth add openai-codex' works.

~/.codex/auth.json is left untouched — that's the Codex CLI's own
credential store, not ours to delete.

Tests cover: unsuppress helper behavior, remove of both source
variants, add clears suppression, seed respects suppression.  E2E
verified: remove → load → add → load flow now behaves correctly.
2026-04-17 04:10:17 -07:00
Teknium 8b312248dc chore: map RucchiZ email to AUTHOR_MAP for release script 2026-04-17 04:09:21 -07:00
赵晨飞 82969615bb test(weixin): add regression test for send_image_file parameter name
Add TestWeixinSendImageFileParameterName test class with two tests:
- test_send_image_file_uses_image_path_parameter: verifies the correct
  parameter name (image_path) is used when gateway calls send_image_file
- test_send_image_file_works_without_optional_params: ensures minimal
  params work correctly

This prevents the interface from drifting again as noted by Copilot.
2026-04-17 04:09:21 -07:00
赵晨飞 902d6b97d6 fix(weixin): correct send_image_file parameter name to match base class
The send_image_file method in WeixinAdapter used 'path' as parameter
name, but BasePlatformAdapter and gateway callers use 'image_path'.
This mismatch caused image sending to fail when called through the
gateway's extract_media path.

Changed parameter name from 'path' to 'image_path' to match the
interface defined in base.py and the calls in gateway/run.py.
2026-04-17 04:09:21 -07:00
Teknium 5d929caa59 chore(release): map michel.belleau@malaiwah.com to @malaiwah 2026-04-17 04:08:42 -07:00
Michel Belleau efa6c9f715 fix(discord): default allowed_mentions to block @everyone and role pings
discord.py does not apply a default AllowedMentions to the client, so any
reply whose content contains @everyone/@here or a role mention would ping
the whole server — including verbatim echoes of user input or LLM output
that happens to contain those tokens.

Set a safe default on commands.Bot: everyone=False, roles=False,
users=True, replied_user=True. Operators can opt back in via four
DISCORD_ALLOW_MENTION_* env vars or discord.allow_mentions.* in
config.yaml. No behavior change for normal user/reply pings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 04:08:42 -07:00
Teknium 2367c6ffd5 test: remove 169 change-detector tests across 21 files (#11472)
First pass of test-suite reduction to address flaky CI and bloat.

Removed tests that fall into these change-detector patterns:

1. Source-grep tests (tests/gateway/test_feishu.py, test_email.py): tests
   that call inspect.getsource() on production modules and grep for string
   literals. Break on any refactor/rename even when behavior is correct.

2. Platform enum tautologies (every gateway/test_X.py): assertions like
   `Platform.X.value == 'x'` duplicated across ~9 adapter test files.

3. Toolset/PLATFORM_HINTS/setup-wizard registry-presence checks: tests that
   only verify a key exists in a dict. Data-layout tests, not behavior.

4. Argparse wiring tests (test_argparse_flag_propagation, test_subparser_routing
   _fallback): tests that do parser.parse_args([...]) then assert args.field.
   Tests Python's argparse, not our code.

5. Pure dispatch tests (test_plugins_cmd.TestPluginsCommandDispatch): patch
   cmd_X, call plugins_command with matching action, assert mock called.
   Tests the if/elif chain, not behavior.

6. Kwarg-to-mock verification (test_auxiliary_client ~45 tests,
   test_web_tools_config, test_gemini_cloudcode, test_retaindb_plugin): tests
   that mock the external API client, call our function, and assert exact
   kwargs. Break on refactor even when behavior is preserved.

7. Schedule-internal "function-was-called" tests (acp/test_server scheduling
   tests): tests that patch own helper method, then assert it was called.

Kept behavioral tests throughout: error paths (pytest.raises), security
tests (path traversal, SSRF, redaction), message alternation invariants,
provider API format conversion, streaming logic, memory contract, real
config load/merge tests.

Net reduction: 169 tests removed. 38 empty classes cleaned up.

Collected before: 12,522 tests
Collected after:  12,353 tests
2026-04-17 01:05:09 -07:00
Teknium e33cb65a98 fix(insights): hide cache read/write and cost metrics from display (#11477)
The cache-read, cache-write, and total estimated-cost values shown in
/insights (and the per-model Cost column) were unreliable. Hide them from
both terminal and gateway renderings.

The underlying data pipeline is untouched — sessions still store
cache_read_tokens, cache_write_tokens, and estimated_cost_usd; the web
server, /usage command, and status bar are unaffected. Only the
InsightsEngine display layer is trimmed.

Changes:
- format_terminal: drop 'Cache read / Cache write' line, drop 'Est. cost'
  from the Total tokens row, drop per-model 'Cost' column, drop the
  '* Cost N/A for custom/self-hosted' footnote.
- format_gateway: drop cache breakdown from Tokens line, drop 'Est. cost'
  line, drop per-model cost suffix.
- Tests updated to assert these strings are now absent.
2026-04-17 01:02:06 -07:00
Teknium 3f74dafaee fix(nous): respect 'Skip (keep current)' after OAuth login (#11476)
* feat(skills): add 'hermes skills reset' to un-stick bundled skills

When a user edits a bundled skill, sync flags it as user_modified and
skips it forever. The problem: if the user later tries to undo the edit
by copying the current bundled version back into ~/.hermes/skills/, the
manifest still holds the old origin hash from the last successful
sync, so the fresh bundled hash still doesn't match and the skill stays
stuck as user_modified.

Adds an escape hatch for this case.

  hermes skills reset <name>
      Drops the skill's entry from ~/.hermes/skills/.bundled_manifest and
      re-baselines against the user's current copy. Future 'hermes update'
      runs accept upstream changes again. Non-destructive.

  hermes skills reset <name> --restore
      Also deletes the user's copy and re-copies the bundled version.
      Use when you want the pristine upstream skill back.

Also available as /skills reset in chat.

- tools/skills_sync.py: new reset_bundled_skill(name, restore=False)
- hermes_cli/skills_hub.py: do_reset() + wired into skills_command and
  handle_skills_slash; added to the slash /skills help panel
- hermes_cli/main.py: argparse entry for 'hermes skills reset'
- tests/tools/test_skills_sync.py: 5 new tests covering the stuck-flag
  repro, --restore, unknown-skill error, upstream-removed-skill, and
  no-op on already-clean state
- website/docs/user-guide/features/skills.md: new 'Bundled skill updates'
  section explaining the origin-hash mechanic + reset usage

* fix(nous): respect 'Skip (keep current)' after OAuth login

When a user already set up on another provider (e.g. OpenRouter) runs
`hermes model` and picks Nous Portal, OAuth succeeds and then a model
picker is shown.  If the user picks 'Skip (keep current)', the previous
provider + model should be preserved.

Previously, \_update_config_for_provider was called unconditionally after
login, which flipped config.yaml model.provider to 'nous' while keeping
the old model.default (e.g. anthropic/claude-opus-4.6 from OpenRouter),
leaving the user with a mismatched provider/model pair on the next
request.

Fix: snapshot the prior active_provider before login, and if no model is
selected (Skip, or no models available, or fetch failure), restore the
prior active_provider and leave config.yaml untouched.  The Nous OAuth
tokens stay saved so future `hermes model` -> Nous works without
re-authenticating.

Test plan:
- New tests cover Skip path (preserves provider+model, saves creds),
  pick-a-model path (switches to nous), and fresh-install Skip path
  (active_provider cleared, not stuck as 'nous').
2026-04-17 00:52:42 -07:00
Teknium 3438d274f6 fix(dingtalk): repair _extract_text for dingtalk-stream >= 0.20 SDK shape
The cherry-picked SDK compat fix (previous commit) wired process() to
parse CallbackMessage.data into a ChatbotMessage, but _extract_text()
was still written against the pre-0.20 payload shape:

  * message.text changed from dict {content: ...} → TextContent object.
    The old code's str(text) fallback produced 'TextContent(content=...)'
    as the agent's input, so every received message came in mangled.
  * rich_text moved from message.rich_text (list) to
    message.rich_text_content.rich_text_list.

This preserves legacy fallbacks (dict-shaped text, bare rich_text list)
while handling the current SDK layout via hasattr(text, 'content').

Adds regression tests covering:
  * webhook domain allowlist (api.*, oapi.*, and hostile lookalikes)
  * _IncomingHandler.process is a coroutine function
  * _extract_text against TextContent object, dict, rich_text_content,
    legacy rich_text, and empty-message cases

Also adds kevinskysunny to scripts/release.py AUTHOR_MAP (release CI
blocks unmapped emails).
2026-04-17 00:52:35 -07:00
Kevin S. Sunny c3d2895b18 fix(dingtalk): support dingtalk-stream 0.24+ and oapi webhooks 2026-04-17 00:52:35 -07:00
Teknium e5cde568b7 feat(skills): add 'hermes skills reset' to un-stick bundled skills (#11468)
When a user edits a bundled skill, sync flags it as user_modified and
skips it forever. The problem: if the user later tries to undo the edit
by copying the current bundled version back into ~/.hermes/skills/, the
manifest still holds the old origin hash from the last successful
sync, so the fresh bundled hash still doesn't match and the skill stays
stuck as user_modified.

Adds an escape hatch for this case.

  hermes skills reset <name>
      Drops the skill's entry from ~/.hermes/skills/.bundled_manifest and
      re-baselines against the user's current copy. Future 'hermes update'
      runs accept upstream changes again. Non-destructive.

  hermes skills reset <name> --restore
      Also deletes the user's copy and re-copies the bundled version.
      Use when you want the pristine upstream skill back.

Also available as /skills reset in chat.

- tools/skills_sync.py: new reset_bundled_skill(name, restore=False)
- hermes_cli/skills_hub.py: do_reset() + wired into skills_command and
  handle_skills_slash; added to the slash /skills help panel
- hermes_cli/main.py: argparse entry for 'hermes skills reset'
- tests/tools/test_skills_sync.py: 5 new tests covering the stuck-flag
  repro, --restore, unknown-skill error, upstream-removed-skill, and
  no-op on already-clean state
- website/docs/user-guide/features/skills.md: new 'Bundled skill updates'
  section explaining the origin-hash mechanic + reset usage
2026-04-17 00:41:31 -07:00
Teknium a55a133387 fix(tests): attach caplog to specific logger in 3 order-dependent tests (#11453)
Three tests in tests/test_plugin_skills.py and tests/hermes_cli/test_plugins.py
used caplog.at_level(logging.WARNING) without specifying a logger. When another
test earlier in the same xdist worker touched propagation on tools.skills_tool
or hermes_cli.plugins, caplog would miss the warning and the assertion would
fail intermittently in CI.

These three tests accounted for 15 of the last ~30 Tests workflow failures
(5 each), including the recent main failure on commit 436a7359 (PR #11398).

Fix: pass logger="tools.skills_tool" / logger="hermes_cli.plugins" to
caplog.at_level() so the handler attaches directly to the logger under test
and capture is independent of global propagation state.

Affected tests:
- tests/test_plugin_skills.py::TestSkillViewPluginGuards::test_injection_logged_but_served
- tests/hermes_cli/test_plugins.py::TestPluginCommands::test_register_command_empty_name_rejected
- tests/hermes_cli/test_plugins.py::TestPluginCommands::test_register_command_builtin_conflict_rejected

No production code change. Verified passing under xdist (-n 4) alongside
test_hermes_logging.py (the test most likely to poison the logger state).
2026-04-17 00:20:40 -07:00
Teknium 816e3e3774 test(feishu): cover new SDK event handler registrations
Extends test_build_event_handler_registers_reaction_and_card_processors
to assert that register_p2_im_chat_access_event_bot_p2p_chat_entered_v1
and register_p2_im_message_recalled_v1 are called when building the
event handler, matching the production registrations.

Also adds Fatty911 to scripts/release.py AUTHOR_MAP for credit on the
salvaged event-handler fix.
2026-04-16 22:08:11 -07:00
Fatty911 94168b7f60 fix: register missing Feishu event handlers for P2P chat entered and message recalled 2026-04-16 22:08:11 -07:00
Teknium 220fa7db90 feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro (#11406)
* feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro

Upstream asked for these two upgrades ASAP — the old entries show
stale models when newer, higher-quality versions are available on FAL.

Recraft V3 → Recraft V4 Pro
  ID:    fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image
  Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier)
  Schema: V4 dropped the required `style` enum entirely; defaults
          handle taste now. Added `colors` and `background_color`
          to supports for brand-palette control. `seed` is not
          supported by V4 per the API docs.

Nano Banana → Nano Banana Pro
  ID:    fal-ai/nano-banana → fal-ai/nano-banana-pro
  Price: $0.08/image → $0.15/image (1K); $0.30 at 4K
  Schema: Aspect ratio family unchanged. Added `resolution`
          (1K/2K/4K, default 1K for billing predictability),
          `enable_web_search` (real-time info grounding, +$0.015),
          and `limit_generations` (force exactly 1 image).
  Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality
                and reasoning depth improved; slower (~6s → ~8s).

Migration: users who had the old IDs in `image_gen.model` will
fall through the existing 'unknown model → default' warning path
in `_resolve_fal_model()` and get the Klein 9B default on the next
run. Re-run `hermes tools` → Image Generation to pick the new
version. No silent cost-upgrade aliasing — the 2-6x price jump
on these tiers warrants explicit user re-selection.

Portal note: both new model IDs need to be allowlisted on the
Nous fal-queue-gateway alongside the previous 7 additions, or
users on Nous Subscription will see the 'managed gateway rejected
model' error we added previously (which is clear and
self-remediating, just noisy).

* docs: wrap '<1s' in backticks to unblock MDX compilation

Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and
'<1s' fails because '1' isn't a valid tag-name start character. This
was broken on main since PR #11265 (never noticed because
docs-site-checks was failing on OTHER issues at the time and we
admin-merged through it).

Wrapping in backticks also gives the cell monospace styling which
reads more cleanly alongside the inline-code model ID in the same row.

The other '<1s' occurrence (line 52) is inside a fenced code block
and is already safe — code fences bypass MDX parsing.
2026-04-16 22:05:41 -07:00
Teknium 70768665a4 fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383)
* feat(mcp-oauth): scaffold MCPOAuthManager

Central manager for per-server MCP OAuth state. Provides
get_or_build_provider (cached), remove (evicts cache + deletes
disk), invalidate_if_disk_changed (mtime watch, core fix for
external-refresh workflow), and handle_401 (dedup'd recovery).

No behavior change yet — existing call sites still use
build_oauth_auth directly. Task 1 of 8 in the MCP OAuth
consolidation (fixes Cthulhu's BetterStack reliability issues).

* feat(mcp-oauth): add HermesMCPOAuthProvider with pre-flow disk watch

Subclasses the MCP SDK's OAuthClientProvider to inject a disk
mtime check before every async_auth_flow, via the central
manager. When a subclass instance is used, external token
refreshes (cron, another CLI instance) are picked up before
the next API call.

Still dead code: the manager's _build_provider still delegates
to build_oauth_auth and returns the plain OAuthClientProvider.
Task 4 wires this subclass in. Task 2 of 8.

* refactor(mcp-oauth): extract build_oauth_auth helpers

Decomposes build_oauth_auth into _configure_callback_port,
_build_client_metadata, _maybe_preregister_client, and
_parse_base_url. Public API preserved. These helpers let
MCPOAuthManager._build_provider reuse the same logic in Task 4
instead of duplicating the construction dance.

Also updates the SDK version hint in the warning from 1.10.0 to
1.26.0 (which is what we actually require for the OAuth types
used here). Task 3 of 8.

* feat(mcp-oauth): manager now builds HermesMCPOAuthProvider directly

_build_provider constructs the disk-watching subclass using the
helpers from Task 3, instead of delegating to the plain
build_oauth_auth factory. Any consumer using the manager now gets
pre-flow disk-freshness checks automatically.

build_oauth_auth is preserved as the public API for backwards
compatibility. The code path is now:

    MCPOAuthManager.get_or_build_provider  ->
      _build_provider  ->
        _configure_callback_port
        _build_client_metadata
        _maybe_preregister_client
        _parse_base_url
        HermesMCPOAuthProvider(...)

Task 4 of 8.

* feat(mcp): wire OAuth manager + add _reconnect_event

MCPServerTask gains _reconnect_event alongside _shutdown_event.
When set, _run_http / _run_stdio exit their async-with blocks
cleanly (no exception), and the outer run() loop re-enters the
transport to rebuild the MCP session with fresh credentials.
This is the recovery path for OAuth failures that the SDK's
in-place httpx.Auth cannot handle (e.g. cron externally consumed
the refresh_token, or server-side session invalidation).

_run_http now asks MCPOAuthManager for the OAuth provider
instead of calling build_oauth_auth directly. Config-time,
runtime, and reconnect paths all share one provider instance
with pre-flow disk-watch active.

shutdown() defensively sets both events so there is no race
between reconnect and shutdown signalling.

Task 5 of 8.

* feat(mcp): detect auth failures in tool handlers, trigger reconnect

All 5 MCP tool handlers (tool call, list_resources, read_resource,
list_prompts, get_prompt) now detect auth failures and route
through MCPOAuthManager.handle_401:

  1. If the manager says recovery is viable (disk has fresh tokens,
     or SDK can refresh in-place), signal MCPServerTask._reconnect_event
     to tear down and rebuild the MCP session with fresh credentials,
     then retry the tool call once.

  2. If no recovery path exists, return a structured needs_reauth
     JSON error so the model stops hallucinating manual refresh
     attempts (the 'let me curl the token endpoint' loop Cthulhu
     pasted from Discord).

_is_auth_error catches OAuthFlowError, OAuthTokenError,
OAuthNonInteractiveError, and httpx.HTTPStatusError(401). Non-auth
exceptions still surface via the generic error path unchanged.

Task 6 of 8.

* feat(mcp-cli): route add/remove through manager, add 'hermes mcp login'

cmd_mcp_add and cmd_mcp_remove now go through MCPOAuthManager
instead of calling build_oauth_auth / remove_oauth_tokens
directly. This means CLI config-time state and runtime MCP
session state are backed by the same provider cache — removing
a server evicts the live provider, adding a server populates
the same cache the MCP session will read from.

New 'hermes mcp login <name>' command:
  - Wipes both the on-disk tokens file and the in-memory
    MCPOAuthManager cache
  - Triggers a fresh OAuth browser flow via the existing probe
    path
  - Intended target for the needs_reauth error Task 6 returns
    to the model

Task 7 of 8.

* test(mcp-oauth): end-to-end integration tests

Five new tests exercising the full consolidation with real file
I/O and real imports (no transport mocks):

  1. external_refresh_picked_up_without_restart — Cthulhu's cron
     workflow. External process writes fresh tokens to disk;
     on the next auth flow the manager's mtime-watch flips
     _initialized and the SDK re-reads from storage.

  2. handle_401_deduplicates_concurrent_callers — 10 concurrent
     handlers for the same failed token fire exactly ONE recovery
     attempt (thundering-herd protection).

  3. handle_401_returns_false_when_no_provider — defensive path
     for unknown servers.

  4. invalidate_if_disk_changed_handles_missing_file — pre-auth
     state returns False cleanly.

  5. provider_is_reused_across_reconnects — cache stickiness so
     reconnects preserve the disk-watch baseline mtime.

Task 8 of 8 — consolidation complete.
2026-04-16 21:57:10 -07:00
Teknium 436a7359cd feat: add claude-opus-4.7 to Nous Portal curated model list (#11398)
Mirrors OpenRouter which already lists anthropic/claude-opus-4.7 as
recommended. Surfaces the model in the `hermes model` picker and the
gateway /model flow for Nous Portal users.

Context length (1M) is already covered by the existing claude-opus-4.7
entry in agent/model_metadata.py DEFAULT_CONTEXT_LENGTHS.
2026-04-16 21:37:06 -07:00
Teknium 24fa055763 fix(ci): resolve 4 pre-existing main failures (docs lint + 3 stale tests) (#11373)
* docs: fix ascii-guard border alignment errors

Three docs pages had ASCII diagram boxes with off-by-one column
alignment issues that failed docs-site-checks CI:

- architecture.md: outer box is 71 cols but inner-box content lines
  and border corners were offset by 1 col, making content-line right
  border at col 70/72 while top/bottom border was at col 71. Inner
  boxes also had border corners at cols 19/36/53 but content pipes
  at cols 20/37/54. Rewrote the diagram with consistent 71-col width
  throughout, aligned inner boxes at cols 4-19, 22-37, 40-55 with
  2-space gaps and 15-space trailing padding.

- gateway-internals.md: same class of issue — outer box at 51 cols,
  inner content lines varied 52-54 cols. Rewrote with consistent
  51-col width, inner boxes at cols 4-15, 18-29, 32-43. Also
  restructured the bottom-half message flow so it's bare text
  (not half-open box cells) matching the intent of the original.

- agent-loop.md line 112-114: box 2 (API thread) content lines had
  one extra space pushing the right border to col 46 while the top
  and bottom borders of that box sat at col 45. Trimmed one trailing
  space from each of the three content lines.

All 123 docs files now pass `npm run lint:diagrams`:
  ✓ Errors: 0  (warnings: 6, non-fatal)

Pre-existing failures on main — unrelated to any open PR.

* test(setup): accept description kwarg in prompt_choice mock lambdas

setup.py's `_curses_prompt_choice` gained an optional `description`
parameter (used for rendering context hints alongside the prompt).
`prompt_choice` forwards it via keyword arg. The two existing tests
mocked `_curses_prompt_choice` with lambdas that didn't accept the
new kwarg, so the forwarded call raised TypeError.

Fix: add `description=None` to both mock lambda signatures so they
absorb the new kwarg without changing behavior.

* test(matrix): update stale audio-caching assertion

test_regular_audio_has_http_url asserted that non-voice audio
messages keep their HTTP URL and are NOT downloaded/cached. That
was true when the caching code only triggered on
`is_voice_message`. Since bec02f37 (encrypted-media caching
refactor), matrix.py caches all media locally — photos, audio,
video, documents — so downstream tools can read them as real
files via media_urls. This applies to regular audio too.

Renamed the test to `test_regular_audio_is_cached_locally`,
flipped the assertions accordingly, and documented the
intentional behavior change in the docstring. Other tests in
the file (voice-specific caching, message-type detection,
reply-to threading) continue to pass.

* test(413): allow multi-pass preflight compression

run_agent.py's preflight compression runs up to 3 passes in a loop
for very large sessions (each pass summarizes the middle N turns,
then re-checks tokens). The loop breaks when a pass returns a
message list no shorter than its input (can't compress further).

test_preflight_compresses_oversized_history used a static mock
return value that returned the same 2 messages regardless of input,
so the loop ran pass 1 (41 -> 2) and pass 2 (2 -> 2 -> break),
making call_count == 2. The assert_called_once() assertion was
strictly wrong under the multi-pass design.

The invariant the test actually cares about is: preflight ran, and
its first invocation received the full oversized history. Replaced
the count assertion with those two invariants.

* docs: drop '...' from gateway diagram, merge side-by-side boxes

ascii-guard 2.3.0 flagged two remaining issues after the initial fix
pass:

1. gateway-internals.md L33: the '...' suffix after inner box 3's
   right border got parsed as 'extra characters after inner-box right
   border'. Dropped the '...' — the surrounding prose already conveys
   'and more platforms' without needing the visual hint.

2. agent-loop.md: ascii-guard can't cleanly parse two side-by-side
   boxes of different heights (main thread 7 rows, API thread 5 rows).
   Even equalizing heights didn't help — the linter treats the left
   box's right border as the end of the diagram. Merged into a single
   54-char-wide outer box with both threads labeled as regions inside,
   keeping the ▶ arrow to preserve the main→API flow direction.
2026-04-16 20:43:41 -07:00
Teknium fdefd98aa3 docs(skills): make descriptions self-contained, not cross-dependent
Previous pass assumed both skills would always be loaded together, so
each description pointed at the other ('use concept-diagrams instead').
That breaks when only one skill is active — the agent reads 'use the
other skill' and there is no other skill.

Now each skill's description and scope section is fully self-contained:

- States what it's best suited for
- Lists subjects where a more specialized skill (if available) would be
  a better fit, naming them only as 'consider X if available'
- Explicitly offers itself as a general SVG diagram fallback when no
  more specialized skill exists

An agent loading either skill alone gets unambiguous guidance; an
agent with both loaded still gets useful routing via the 'consider X
if available' hints and the related_skills metadata.
2026-04-16 20:39:55 -07:00
Teknium 7d535969ff docs(skills): make architecture-diagram vs concept-diagrams routing explicit
Both skills generate SVG system diagrams, but for very different subjects
and aesthetics. The old descriptions didn't make the split clear, so an
agent loading either one couldn't confidently pick.

Changes:

- Rewrote both frontmatter descriptions to state the scope up front plus
  an explicit 'for X, use the other skill instead' pointer.
- Added a symmetric 'When to use this skill vs <other>' decision table
  to the top of each SKILL.md body, so the guidance is visible whether
  the agent is reading frontmatter or full content.
- Added architecture-diagram <-> concept-diagrams to each other's
  related_skills metadata.

Rule of thumb baked into both skills:
  software/cloud infra -> architecture-diagram
  physical / scientific / educational -> concept-diagrams
2026-04-16 20:39:55 -07:00
Teknium 19c589a20b refactor(concept-diagrams): rename + tighten v1k22's skill for merge
Salvage of PR #11045 (original by v1k22). Changes on top of the
original commit:

- Rename 'architecture-visualization-svg-diagrams' -> 'concept-diagrams'
  to differentiate from the existing architecture-diagram skill.
  architecture-diagram stays as the dark-themed Cocoon-style option for
  software/infra; concept-diagrams covers physics, chemistry, math,
  engineering, physical objects, and educational visuals.
- Trigger description scoped to actual use cases; removed the 'always
  use this skill' language and long phrase-capture list to stop
  colliding with architecture-diagram, excalidraw, generative-widgets,
  manim-video.
- Default output is now a standalone self-contained HTML file (works
  offline, no server). The preview server is opt-in and no longer part
  of the default workflow.
- When the server IS used: bind to 127.0.0.1 instead of 0.0.0.0 (was a
  LAN exposure hazard on shared networks) and let the OS pick a free
  ephemeral port instead of hard-coding 22223 (collision prone).
- Shrink SKILL.md from 1540 to 353 lines by extracting reusable
  material into linked files:
    - templates/template.html (host page with full CSS design system)
    - references/physical-shape-cookbook.md
    - references/infrastructure-patterns.md
    - references/dashboard-patterns.md
  All 15 examples kept intact.
- Add dhandhalyabhavik@gmail.com -> v1k22 to AUTHOR_MAP.

Preserves v1k22's authorship on the underlying commit.
2026-04-16 20:39:55 -07:00
v1k22 9a4766fc18 feat: add architecture-visualization-svg-diagrams skill to creative category
- SKILL.md with full SVG design system (color palette, typography, spacing, dark mode)
- 15 example diagrams covering flowcharts, physical structures, chemistry, charts, floor plans, and more
- Supports 8 diagram types: flowchart, structural, API map, microservice, data flow, physical, infrastructure, UI mockups
- Auto-hosts diagrams on 0.0.0.0:22223 as interactive web pages
2026-04-16 20:39:55 -07:00
Teknium 7af9bf3a54 fix(feishu): queue inbound events when adapter loop not ready (#5499) (#11372)
Inbound Feishu messages arriving during brief windows when the adapter
loop is unavailable (startup/restart transitions, network-flap reconnect)
were silently dropped with a WARNING log. This matches the symptom in
issue #5499 — and users have reported seeing only a subset of their
messages reach the agent.

Fix: queue pending events in a thread-safe list and spawn a single
drainer thread that replays them once the loop becomes ready. Covers
these scenarios:

  * Queue events instead of dropping when loop is None/closed
  * Single drainer handles the full queue (not thread-per-event)
  * Thread-safe with threading.Lock on the queue and schedule flag
  * Handles mid-drain bursts (new events arrive while drainer is working)
  * Handles RuntimeError if loop closes between check and submit
  * Depth cap (1000) prevents unbounded growth during extended outages
  * Drops queue cleanly on disconnect rather than holding forever
  * Safety timeout (120s) prevents infinite retention on broken adapters

Based on the approach proposed in #4789 by milkoor, rewritten for
thread-safety and correctness.

Test plan:
  * 5 new unit tests (TestPendingInboundQueue) — all passing
  * E2E test with real asyncio loop + fake WS thread: 10-event burst
    before loop ready → all 10 delivered in order
  * E2E concurrent burst test: 20 events queued, 20 more arrive during
    drainer dispatch → all 40 delivered, no loss, no duplicates
  * All 111 existing feishu tests pass

Related: #5499, #4789

Co-authored-by: milkoor <milkoor@users.noreply.github.com>
2026-04-16 20:36:59 -07:00
Brooklyn Nicholson 5435287dec chore: uptick 2026-04-16 22:35:45 -05:00
Brooklyn Nicholson 41d3d7afb7 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 22:35:27 -05:00
Brooklyn Nicholson 39231f29c6 refactor(tui): /clean pass across ui-tui — 49 files, −217 LOC
Full codebase pass using the /clean doctrine (KISS/DRY, no one-off
helpers, no variables-used-once, pure functional where natural,
inlined obvious one-liners, killed dead exports, narrowed types,
spaced JSX). All contracts preserved — no RPC method, event name,
or exported type shape changed.

app/ — 15 files, -134 LOC
- inlined 4 one-off helpers (titleCase, isLong, statusToneFrom,
  focusOutside predicate)
- stores to arrow-const style (buildUiState, buildTurnState,
  buildOverlayState plus get/patch/reset triplets)
- functional slash/registry byName map (flatMap over for-loops)
- dropped dead param `live` in cancelOverlayFromCtrlC
- DRY'd duplicate shift() call in scrollWithSelection
- consolidated sections.push calls in /help

components/ — 12 files, -40 LOC
- extracted inline prop types to interfaces at file bottom (13×)
- inlined 6 one-off vars (pctLabel, logoW, heroW, cwd, title, hint)
- promoted HEART_COLORS + OPTS/LABELS to module scope
- JSX sibling spacing across 9 files
- un-shadowed `raw` in textInput
- components/thinking.tsx + components/markdown.tsx untouched
  (structurally load-bearing / edge-case-heavy)

config content domain protocol/ — 8 files, -77 LOC
- tightened 3 regexes (MOUSE_TRACKING, looksLikeSlashCommand,
  hasInterpolation — dropped stateful lastIndex dance)
- dead export ParsedSlashCommand removed
- MODES narrowed to `as const`, `.find(m => m === s)` replaces
  `.includes() ? (as cast) : null`
- fortunes.ts hash via reduce
- fmtDuration ternary chain
- inlined aboveViewport predicate in viewport.ts

hooks/ + lib/ — 9 files, -38 LOC
- ANSI_RE via String.fromCharCode(27) + WS_RE lifted to module
  scope (no more eslint-disable no-control-regex)
- compactPreview/edgePreview/thinkingPreview → ternary arrows
- useCompletion: hoisted pathReplace, moved stale-ref guard earlier
- useInputHistory: dropped useCallback wrapper (append is stable)
- useVirtualHistory: replaced 4× any with unknown + narrow
  MeasuredNode interface + one cast site

root TS — 3 files, -63 LOC
- banner.ts: parseRichMarkup via matchAll instead of exec/lastIndex,
  artWidth via reduce
- gatewayClient.ts: resolvePython candidate list collapse, inlined
  one-branch guards in dispatch/pushLog/drain/request
- types.ts: alpha-sorted ActiveTool / Msg / SudoReq / SecretReq
  members

eslint config
- disabled react-hooks/exhaustive-deps on packages/hermes-ink/**
  (compiled by react/compiler, deps live in $[N] memo arrays that
  eslint can't introspect) and removed the now-orphan in-file
  disable directive in ScrollBox.tsx

fixes (not from the cleaner pass)
- useComposerState: unlinkSync(file) + try/catch → rmSync(file,
  { force: true }) — kills the no-empty lint error and is more
  idiomatic
- useConfigSync: added setBellOnComplete + setVoiceEnabled to the
  two useEffect dep arrays (they're stable React setState setters;
  adding is safe and silences exhaustive-deps)

verification
- npx eslint src/ packages/ → 0 errors, 0 warnings
- npm run type-check → clean
- npm test → 50/50
- npm run build → 394.8kb ink-bundle.js, 11ms esbuild
- pytest tests/tui_gateway/ tests/test_tui_gateway_server.py
  tests/hermes_cli/test_tui_resume_flow.py
  tests/hermes_cli/test_tui_npm_install.py → 57/57
2026-04-16 22:32:53 -05:00
Teknium 01906e99dd feat(image_gen): multi-model FAL support with picker in hermes tools (#11265)
* feat(image_gen): multi-model FAL support with picker in hermes tools

Adds 8 FAL text-to-image models selectable via `hermes tools` →
Image Generation → (FAL.ai | Nous Subscription) → model picker.

Models supported:
- fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP)
- fal-ai/flux-2-pro (previous default, kept backward-compat upscaling)
- fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN)
- fal-ai/nano-banana (Gemini 2.5 Flash Image)
- fal-ai/gpt-image-1.5 (with quality tier: low/medium/high)
- fal-ai/ideogram/v3 (best typography)
- fal-ai/recraft-v3 (vector, brand styles)
- fal-ai/qwen-image (LLM-based)

Architecture:
- FAL_MODELS catalog declares per-model size family, defaults, supports
  whitelist, and upscale flag. Three size families handled uniformly:
  image_size_preset (flux family), aspect_ratio (nano-banana), and
  gpt_literal (gpt-image-1.5).
- _build_fal_payload() translates unified inputs (prompt + aspect_ratio)
  into model-specific payloads, merges defaults, applies caller overrides,
  wires GPT quality_setting, then filters to the supports whitelist — so
  models never receive rejected keys.
- IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen
  providers (Replicate, Stability, etc.); each provider entry tags itself
  with imagegen_backend: 'fal' to select the right catalog.
- Upscaler (Clarity) defaults off for new models (preserves <1s value
  prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS.

Config:
  image_gen.model           = fal-ai/flux-2/klein/9b  (new)
  image_gen.quality_setting = medium                  (new, GPT only)
  image_gen.use_gateway     = bool                    (existing)

Agent-facing schema unchanged (prompt + aspect_ratio only) — model
choice is a user-level config decision, not an agent-level arg.

Picker uses curses_radiolist (arrow keys, auto numbered-fallback on
non-TTY). Column-aligned: Model / Speed / Strengths / Price.

Docs: image-generation.md rewritten with the model table and picker
walkthrough. tools-reference, tool-gateway, overview updated to drop
the stale "FLUX 2 Pro" wording.

Tests: 42 new in tests/tools/test_image_generation.py covering catalog
integrity, all 3 size families, supports filter, default merging, GPT
quality wiring, model resolution fallback. 8 new in
tests/hermes_cli/test_tools_config.py for picker wiring (registry,
config writes, GPT quality follow-up prompt, corrupt-config repair).

* feat(image_gen): translate managed-gateway 4xx to actionable error

When the Nous Subscription managed FAL proxy rejects a model with 4xx
(likely portal-side allowlist miss or billing gate), surface a clear
message explaining:
  1. The rejected model ID + HTTP status
  2. Two remediation paths: set FAL_KEY for direct access, or
     pick a different model via `hermes tools`

5xx, connection errors, and direct-FAL errors pass through unchanged
(those have different root causes and reasonable native messages).

Motivation: new FAL models added to this release (flux-2-klein-9b,
z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3,
qwen-image) are untested against the Nous Portal proxy. If the portal
allowlists model IDs, users on Nous Subscription will hit cryptic
4xx errors without guidance on how to work around it.

Tests: 8 new cases covering status extraction across httpx/fal error
shapes and 4xx-vs-5xx-vs-ConnectionError translation policy.

Docs: brief note in image-generation.md for Nous subscribers.

Operator action (Nous Portal side): verify that fal-queue-gateway
passes through these 7 new FAL model IDs. If the proxy has an
allowlist, add them; otherwise Nous Subscription users will see the
new translated error and fall back to direct FAL.

* feat(image_gen): pin GPT-Image quality to medium (no user choice)

Previously the tools picker asked a follow-up question for GPT-Image
quality tier (low / medium / high) and persisted the answer to
`image_gen.quality_setting`. This created two problems:

1. Nous Portal billing complexity — the 22x cost spread between tiers
   ($0.009 low / $0.20 high) forces the gateway to meter per-tier per
   user, which the portal team can't easily support at launch.
2. User footgun — anyone picking `high` by mistake burns through
   credit ~6x faster than `medium`.

This commit pins quality at medium by baking it into FAL_MODELS
defaults for gpt-image-1.5 and removes all user-facing override paths:

- Removed `_resolve_gpt_quality()` runtime lookup
- Removed `honors_quality_setting` flag on the model entry
- Removed `_configure_gpt_quality_setting()` picker helper
- Removed `_GPT_QUALITY_CHOICES` constant
- Removed the follow-up prompt call in `_configure_imagegen_model()`
- Even if a user manually edits `image_gen.quality_setting` in
  config.yaml, no code path reads it — always sends medium.

Tests:
- Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium
  (5 tests) — proves medium is baked in, config is ignored, flag is
  removed, helper is removed, non-gpt models never get quality.
- Replaced test_picker_with_gpt_image_also_prompts_quality with
  test_picker_with_gpt_image_does_not_prompt_quality — proves only 1
  picker call fires when gpt-image is selected (no quality follow-up).

Docs updated: image-generation.md replaces the quality-tier table
with a short note explaining the pinning decision.

* docs(image_gen): drop stale 'wires GPT quality tier' line from internals section

Caught in a cleanup sweep after pinning quality to medium. The
"How It Works Internally" walkthrough still described the removed
quality-wiring step.
2026-04-16 20:19:53 -07:00
Teknium 0061dca950 fix(installer): make prompt_yes_no bash 3.2 compatible
The helper used ${var,,} (bash 4+ lowercase parameter expansion) and
[[ =~ ]], which fail on macOS default /bin/bash (3.2.57) with:

    bash: ${default,,}: bad substitution

With 'set -e' at the top of the script, that aborts the whole
installer for macOS users who don't have a newer bash on PATH.

Replace the lowercase expansions with POSIX-style case patterns
(`[yY]|[yY][eE][sS]|...`) that behave identically and parse cleanly
on bash 3.2. Verified with a 15-case behavior test on both bash 3.2
and bash 5.2 — all pass.
2026-04-16 20:14:02 -07:00
helix4u 5be8e95604 fix(installer): use line-based tty confirmation prompts 2026-04-16 20:14:02 -07:00
Teknium 8c478983ed fix: enable TCP keepalives to detect dead provider connections (#10324) (#11277)
Re-land of #10933, now guarded by the tests in #11266.

When a provider drops a TCP connection mid-stream, the socket can enter
CLOSE-WAIT and ''epoll_wait'' may never fire — no data or error signal
arrives, so the httpx read timeout never triggers and the agent hangs
indefinitely. The other defenses (''_force_close_tcp_sockets'', stale
stream detector) all ride on the socket layer reporting the dead
connection, which it never does without probes.

Inject ''SO_KEEPALIVE'' + ''TCP_KEEPIDLE''/''KEEPINTVL''/''KEEPCNT''
into the httpx transport. Kernel probes after 30s idle, retries every
10s, gives up after 3 → dead peer detected within ~60s instead of
hanging forever. Platform-aware: ''TCP_KEEPIDLE'' on Linux,
''TCP_KEEPALIVE'' on macOS. Silent no-op on Windows or anywhere
the socket options aren't available.

The original land (#10933) mutated ''client_kwargs'' in place when it
injected the ''httpx.Client''. Since callers pass ''self._client_kwargs''
by reference, the injected client leaked into the instance state. After
the first request, the OpenAI SDK closed its ''http_client'' — including
the injected one. The next ''_create_openai_client'' call re-read the
now-closed ''httpx.Client'' from ''self._client_kwargs'' and every
subsequent chat raised ''APIConnectionError'' with cause ''RuntimeError:
Cannot send a request, as the client has been closed'' (AlexKucera's
Discord report, 2026-04-16).

The defensive ''client_kwargs = dict(client_kwargs)'' copy already on
main (taeuk178's #10978) means this injection only lands in the
per-call local copy. Each ''_create_openai_client'' invocation gets
its OWN fresh ''httpx.Client'' whose lifetime is tied to the paired
''OpenAI'' client. When that ''OpenAI'' client is closed (rebuild,
teardown, credential rotation), its ''httpx.Client'' closes with it
and the next call constructs a fresh one — no stale closed transport
can be reused.

Full 4-test matrix all green (unit + live with real OpenRouter round
trips, HERMES_LIVE_TESTS=1):

    tests/run_agent/test_create_openai_client_kwargs_isolation.py      PASS
    tests/run_agent/test_create_openai_client_reuse.py                 PASS (2)
    tests/run_agent/test_sequential_chats_live.py                      PASS

Socket options verified on the live httpx transport:

    _socket_options: [(1, 9, 1), (6, 4, 30), (6, 5, 10), (6, 6, 3)]
    = (SO_KEEPALIVE=1, TCP_KEEPIDLE=30s, TCP_KEEPINTVL=10s, TCP_KEEPCNT=3)

Sequential-chat reproduction of the #10933 failure was explicitly
run against this patch — the defensive copy on main prevents the
closed transport from leaking back into ''self._client_kwargs'', so
every rebuild constructs a fresh transport.

Closes #10324
2026-04-16 20:04:54 -07:00
Teknium ab33ce1c86 fix(opencode): strip /v1 from base_url on mid-session /model switch to Anthropic-routed models (#11286)
PR #4918 fixed the double-/v1 bug at fresh agent init by stripping the
trailing /v1 from OpenCode base URLs when api_mode is anthropic_messages
(so the Anthropic SDK's own /v1/messages doesn't land on /v1/v1/messages).
The same logic was missing from the /model mid-session switch path.

Repro: start a session on opencode-go with GLM-5 (or any chat_completions
model), then `/model minimax-m2.7`. switch_model() correctly sets
api_mode=anthropic_messages via opencode_model_api_mode(), but base_url
passes through as https://opencode.ai/zen/go/v1. The Anthropic SDK then
POSTs to https://opencode.ai/zen/go/v1/v1/messages, which returns the
OpenCode website 404 HTML page (title 'Not Found | opencode').

Same bug affects `/model claude-sonnet-4-6` on opencode-zen.

Verified upstream: POST /v1/messages returns clean JSON 401 with x-api-key
auth (route works), while POST /v1/v1/messages returns the exact HTML 404
users reported.

Fix mirrors runtime_provider.resolve_runtime_provider:
- hermes_cli/model_switch.py::switch_model() strips /v1 after the OpenCode
  api_mode override when the resolved mode is anthropic_messages.
- run_agent.py::AIAgent.switch_model() applies the same strip as
  defense-in-depth, so any direct caller can't reintroduce the double-/v1.

Tests: 9 new regression tests in tests/hermes_cli/test_model_switch_opencode_anthropic.py
covering minimax on opencode-go, claude on opencode-zen, chat_completions
(GLM/Kimi/Gemini) keeping /v1 intact, codex_responses (GPT) keeping /v1
intact, trailing-slash handling, and the agent-level defense-in-depth.
2026-04-16 19:41:41 -07:00
Teknium 7fd508979e fix: harden sync_back — PID-suffix temp path, size cap, lifecycle guards
Follow-ups on top of kshitijk4poor's cherry-picked salvage of PR #8018:

tools/environments/daytona.py
  - PID-suffix /tmp/.hermes_sync.<pid>.tar so concurrent sync_back calls
    against the same sandbox don't collide on the remote temp path
  - Move sync_back() inside the cleanup lock and after the _sandbox-None
    guard, with its own try/except. Previously a no-op cleanup (sandbox
    already cleared) still fired sync_back → 3-attempt retry storm against
    a nil sandbox (~6s of sleep). Now short-circuits cleanly.

tools/environments/file_sync.py
  - Add _SYNC_BACK_MAX_BYTES (2 GiB) defensive cap: refuse to extract a
    tar larger than the limit. Protects against runaway sandboxes
    producing arbitrary-size archives.
  - Add 'nothing previously pushed' guard at the top of sync_back(). If
    _pushed_hashes and _synced_files are both empty, the FileSyncManager
    was never initialized from the host side — there is nothing coherent
    to sync back. Skips the retry/backoff machinery on uninitialized
    managers and eliminates test-suite slowdown from pre-existing cleanup
    tests that don't mock the sync layer.

tests/tools/test_file_sync_back.py
  - Update _make_manager helper to seed a _pushed_hashes entry by default
    so sync_back() exercises its real path. A seed_pushed_state=False
    opt-out is available for noop-path tests.
  - Add TestSyncBackSizeCap with positive and negative coverage of the
    new cap.

tests/tools/test_sync_back_backends.py
  - Update Daytona bulk download test to assert the PID-suffixed path
    pattern instead of the fixed /tmp/.hermes_sync.tar.
2026-04-16 19:39:21 -07:00
kshitijk4poor d64446e315 feat(file-sync): sync remote changes back to host on teardown
Salvage of PR #8018 by @alt-glitch onto current main.

On sandbox teardown, FileSyncManager now downloads the remote .hermes/
directory, diffs against SHA-256 hashes of what was originally pushed,
and applies only changed files back to the host.

Core (tools/environments/file_sync.py):
- sync_back(): orchestrates download -> unpack -> diff -> apply with:
  - Retry with exponential backoff (3 attempts, 2s/4s/8s)
  - SIGINT trap + defer (prevents partial writes on Ctrl-C)
  - fcntl.flock serialization (concurrent gateway sandboxes)
  - Last-write-wins conflict resolution with warning
  - New remote files pulled back via _infer_host_path prefix matching

Backends:
- SSH: _ssh_bulk_download — tar cf - piped over SSH
- Modal: _modal_bulk_download — exec tar cf - -> proc.stdout.read
- Daytona: _daytona_bulk_download — exec tar cf -> SDK download_file
- All three call sync_back() at the top of cleanup()

Fixes applied during salvage (vs original PR #8018):

| # | Issue | Fix |
|---|-------|-----|
| C1 | import fcntl unconditional — crashes Windows | try/except with fallback; _sync_back_locked skips locking when fcntl=None |
| W1 | assert for runtime guard (stripped by -O) | Replaced with proper if/raise RuntimeError |
| W2 | O(n*m) from _get_files_fn() called per file | Cache mapping once at start of _sync_back_impl, pass to resolve/infer |
| W3 | Dead BulkDownloadFn imports in 3 backends | Removed unused imports |
| W4 | Modal hardcodes root/.hermes, no explanation | Added docstring comment explaining Modal always runs as root |
| S1 | SHA-256 computed for new files where pushed_hash=None | Skip hashing when pushed_hash is None (comparison always False) |
| S2 | Daytona /tmp/.hermes_sync.tar never cleaned up | Added rm -f after download (best-effort) |

Tests: 49 passing (17 new: _infer_host_path edge cases, SIGINT
main/worker thread, Windows fcntl=None fallback, Daytona tar cleanup).

Based on #8018 by @alt-glitch.
2026-04-16 19:39:21 -07:00
Brooklyn Nicholson c730ab8ad7 chore: fmt 2026-04-16 21:09:50 -05:00
Brooklyn Nicholson c74017f405 fix(tui): sticky prompt correctness + scrollbar re-render thrash
Sticky prompt:
The loop was skipping `first` (the first row in the viewport) when
looking for a user message scrolled above the top edge. If `first`
itself was a user row that had just ticked above the viewport, we'd
fall through the early-return guard (`role === 'user' && !above`),
then walk from `first - 1` backward — never rechecking `first`, never
finding anything, returning '' and leaving the sticky empty. This is
why it felt "stuck" at the start: one-turn sessions with the user row
exactly at/near the top never surfaced the breadcrumb.

Collapsed the two branches into one loop starting at `first`: nearest
user wins — still-on-screen → empty (redundant to echo), already
above → text. Same semantics, covers the gap.

Scrollbar:
`useSyncExternalStore` snapshot was `scrollTop:vp:scrollHeight` —
scrollHeight ticks up by ~1 row on every streamed chunk, forcing a
re-render per chunk. Quantized snapshot to the displayed values
(`thumbTop:thumbSize:vp`) so we only re-render when the bar actually
changes. Drops render count per turn by ~100x during streaming and
stops the "constantly resizes" flicker.
2026-04-16 21:07:19 -05:00
Brooklyn Nicholson 40f2368875 fix(tui): ungate reasoning events so the Thinking panel shows live tokens
The gateway was gating `reasoning.delta` and `reasoning.available`
behind `_reasoning_visible(sid)` (true iff `display.show_reasoning:
true` or `tool_progress_mode: verbose`). With the default config,
neither was true — so reasoning events never reached the TUI,
`turn.reasoning` stayed empty, `reasoningTokens` stayed 0, and the
Thinking expander showed no token label for the whole turn. Tools
still reported tokens because `tool.start` had no such gate.

Then `message.complete` fired with `payload.reasoning` populated, the
TUI saved it into `msg.thinking`, and the finalized row's expander
sprouted "~36 tokens" post-hoc. That's the "tokens appear after the
turn" jank.

Remove the gate on emission. The TUI is responsible for whether to
display reasoning content (detailsMode + collapsed expander already
handle that). Token counting becomes continuous throughout the turn,
matching how tools work.

Also dropped the now-unused `_reasoning_visible` and
`_session_show_reasoning` helpers. `show_reasoning` config key stays
in place — it's still toggled via `/reasoning show|hide` and read
elsewhere for potential future TUI-side gating.
2026-04-16 20:56:47 -05:00
Brooklyn Nicholson 319aabbb80 refactor(tui): wrap progress panel + streaming body in StreamingAssistant
Two improvements:

1. The progress ToolTrail and the streaming MessageLine were two
   sibling JSX blocks in appLayout with hand-rolled margin glue
   between them. Extracted into `<StreamingAssistant>`, a single
   component that owns both the trail and the streaming body plus
   the 1-row gap between them. appLayout just hands it `progress`
   and theme; the layout logic lives in one place, matching the
   mental model that these two pieces are one live assistant turn.

2. Thinking token label was hidden when `reasoningTokens === 0` even
   if the live reasoning text was already populated (the
   scheduleReasoning timer hadn't ticked, or the model sent no
   reasoning but the text was coming in via reasoning.delta).
   Changed the tokenCount fallback from `reasoningTokens !==
   undefined ? reasoningTokens : estimate` to `reasoningTokens > 0 ?
   ... : estimate` so the label appears the moment text exists.
2026-04-16 20:49:41 -05:00
Brooklyn Nicholson 26f3a05c9c fix(tui): don't clobber busy on the progress panel during streaming
`appLayout` was passing `busy={ui.busy && !progress.streaming}` into
ToolTrail, so the moment `message.delta` fired and streaming began,
the panel internally saw `busy=false`. With the prior fix in place
(hasThinking = !!cot || reasoningActive || busy), that flipped
hasThinking to false and the Thinking expander vanished mid-turn —
reappearing only after message.complete when the finalized row
rendered with its own internal expander.

The `!progress.streaming` override was a defensive guard against the
panel implying "still thinking" once the response text was streaming.
But that's already handled inside ToolTrail — `streaming` prop on the
Thinking component uses `busy && reasoningStreaming`, and
reasoningStreaming is already falsey once recordMessageDelta calls
endReasoningPhase.

Pass plain `busy={ui.busy}`. Panel stays up start-to-finish; handoff
to the finalized-message row is continuous.
2026-04-16 20:39:02 -05:00
Brooklyn Nicholson 15096903c7 fix(tui): keep the newline above the streaming assistant text
Finalized assistant messages rendered the thinking/tools trail inside
MessageLine with marginBottom=1 before the response body — giving a
clean blank line above the text. The streaming path rendered the
progress ToolTrail and the streaming MessageLine as two separate
siblings with no margin between, so the in-progress response butted
right up against the thinking panel. That's the "newline appears
after it's done" jank.

Wrap the streaming MessageLine in a Box with marginTop=1 whenever the
progress area is visible above it. Same spacing as the finalized
version, continuous through the handoff.
2026-04-16 20:35:46 -05:00
Brooklyn Nicholson 26859e3fcb fix(tui): keep the Thinking expander visible for the whole turn
Previously `hasThinking = !!cot || reasoningActive || (busy && !hasTools)`
so the moment a tool started streaming (`hasTools` → true) the expander
vanished mid-turn. If the model also produced no `reasoning.delta`
events (reasoning-less models, or reasoning arriving after tools), the
whole turn ran with no Thinking row — then `message.complete`
populated `msg.thinking` from the payload's post-hoc reasoning trace
and the expander suddenly appeared in the transcript AFTER the turn.

Drop the `!hasTools` restriction. The Thinking row now anchors for the
entire `busy` window; tools and thinking coexist as sibling sections
(they already did — the exclusion was a UX mistake). Reasoning-less
models show a dim empty header; streaming models show live content;
tool-interleaved turns keep the anchor visible throughout.
2026-04-16 20:27:06 -05:00
Brooklyn Nicholson aedc767c66 feat(tui): put the kawaii face+verb ticker in the status bar, not the thinking panel
The status bar was showing stale lifecycle text ("running…") while the
face+verb stream flickered through the thinking panel as Python pushed
thinking.delta events. That's backwards — the face ticker is the
primary "I'm alive" signal, it belongs in the status bar; the thinking
panel is for substantive reasoning and tool activity.

Status bar now reads `ui.busy`: when true, renders a local `<FaceTicker>`
cycling FACES × VERBS on a 2.5s interval, unaffected by server events.
When false, the bar shows the actual status string (ready, starting
agent…, interrupted, etc.).

Side effect: `scheduleThinkingStatus` still patches `ui.status` with
Python's face text, but while busy the bar ignores that string and uses
the ticker instead. No server-side changes needed — Python keeps
emitting thinking.delta as a liveness heartbeat, the TUI just doesn't
let it fight the status bar.
2026-04-16 20:14:25 -05:00
Brooklyn Nicholson 23212d6b40 docs: kill "PT" shorthand — say "classic (prompt_toolkit) CLI"
"PT" was internal shorthand for prompt_toolkit that leaked into
AGENTS.md and the TUI post-mortem. Spell it out.

- AGENTS.md: "PT CLI" → "classic (prompt_toolkit) CLI"
- docs/plans/2026-04-01-ink-gateway-tui-migration-plan.md: both hits
2026-04-16 19:39:09 -05:00
Brooklyn Nicholson 7ffefc2d6c docs(tui): rename "Ink TUI" to just "TUI" throughout user-facing surfaces
"Ink" is the React reconciler — implementation detail, not branding.
Consistent naming: the classic CLI is the CLI, the new one is the TUI.

Updated docs: user-guide/tui.md, user-guide/cli.md cross-link, quickstart,
cli-commands reference, environment-variables reference.

Updated code: main.py --tui help text, server.py user-visible setup
error, AGENTS.md "TUI Architecture" section.

Kept "Ink" only where it is literally the library (hermes-ink internal
source comments, AGENTS.md tree note flagging ui-tui/ as a React/Ink dir).
2026-04-16 19:38:21 -05:00
Brooklyn Nicholson 2812bfe5b9 docs(tui): add Ink TUI user guide + cross-link from CLI docs
New primary guide at `user-guide/tui.md` covering launch, requirements,
keybindings, slash commands, status line, configuration, sessions, and
the revert path. Matches the voice of `user-guide/cli.md`.

Cross-links:
- `user-guide/cli.md`: tip callout pointing readers at the Ink TUI
- `getting-started/quickstart.md`: shows both `hermes` and `hermes --tui`
  under "Start Chatting" so first-run users know they have the choice
- `reference/environment-variables.md`: new "Interface" section with
  `HERMES_TUI` and `HERMES_TUI_DIR`
- `reference/cli-commands.md`: `--tui` and `--dev` added to global options

Sidebar: `user-guide/tui` slotted right after `user-guide/cli`.
2026-04-16 19:29:18 -05:00
Brooklyn Nicholson ca30803d89 chore(tui): strip noise comments 2026-04-16 19:14:05 -05:00
Brooklyn Nicholson 7f1204840d test(tui): fix stale mocks + xdist flakes in TUI test suite
All 61 TUI-related tests green across 3 consecutive xdist runs.

tests/tui_gateway/test_protocol.py:
- rename `get_messages` → `get_messages_as_conversation` on mock DB (method
  was renamed in the real backend, test was still stubbing the old name)
- update tool-message shape expectation: `{role, name, context}` matches
  current `_history_to_messages` output, not the legacy `{role, text}`

tests/hermes_cli/test_tui_resume_flow.py:
- `cmd_chat` grew a first-run provider-gate that bailed to "Run: hermes
  setup" before `_launch_tui` was ever reached; 3 tests stubbed
  `_resolve_last_session` + `_launch_tui` but not the gate
- factored a `main_mod` fixture that stubs `_has_any_provider_configured`,
  reused by all three tests

tests/test_tui_gateway_server.py:
- `test_config_set_personality_resets_history_and_returns_info` was flaky
  under xdist because the real `_write_config_key` touches
  `~/.hermes/config.yaml`, racing with any other worker that writes
  config. Stub it in the test.
2026-04-16 19:07:49 -05:00
Brooklyn Nicholson dd2ec6bfa0 chore: uptick 2026-04-16 18:57:56 -05:00
Brooklyn Nicholson 3746c60439 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 18:25:49 -05:00
Brooklyn Nicholson 727f0eaf74 refactor(tui): clean up touched files — DRY, KISS, functional
Python (tui_gateway/server.py):
- hoist `_wait_agent` next to `_sess` so `_sess` no longer forward-refs
- simplify `_wait_agent`: `ready.wait()` already returns True when set,
  no separate `.is_set()` check, collapse two returns into one expr
- factor `_sess_nowait` for handlers that don't need the agent (currently
  `terminal.resize` + `input.detect_drop`) — DRY up the duplicated
  `_sessions.get` + "session not found" dance
- inline `session = _sessions[sid]` in the session.create build thread so
  agent/worker writes don't re-look-up the dict each time
- rename inline `ready_event` → `ready` (it's never ambiguous)

TS:
- `useSessionLifecycle.newSession`: hoist `r.info ?? null` into `info`
  so it's one lookup, drop ceremonial `{ … }` blocks around single-line
  bodies
- `createGatewayEventHandler.session.info`: wrap the case in a block,
  hoist `ev.payload` into `info`, tighten comments
- `useMainApp` flush effect: collapse two guard returns into one
- `bootBanner.ts`: lift `TAGLINE` + `FALLBACK` to module constants, make
  `GRADIENT` readonly, one-liner return via template literal
- `theme.ts`: group `selectionBg` inside the status* block (it's a UI
  surface bg, same family), trim the comment
2026-04-16 18:07:23 -05:00
Brooklyn Nicholson 275256cdb4 feat(tui): uniform selection background instead of SGR inverse
Selection was falling back to SGR-7 inverse (fg ↔ bg per cell), which
fragments over syntax-highlighted content — each amber/gold/dim/cornsilk
fg turned into a different bg stripe, producing the staircase look.

Now `useMainApp` calls `selection.setSelectionBgColor()` with a muted
navy (`#3a3a55`) on theme change. `setSelectionBg` in screen.ts replaces
just the bg cell-by-cell while preserving fg/bold/dim/italic, so the
highlight is one solid color across the whole drag range and the text
stays readable in its original color.

Skins can override via `selection_bg` in their color map.
2026-04-16 15:50:28 -05:00
Brooklyn Nicholson 9503896aa2 perf(tui): paint banner to stdout in ~2ms, before Ink loads
Dynamic-importing @hermes/ink + App costs ~170ms on cold start — during
that window the terminal was blank. Now `entry.tsx` writes a raw-ANSI
banner to stdout immediately after the TTY check, using hardcoded
DEFAULT_THEME colors. Ink's `<AlternateScreen>` wipes the normal-screen
buffer when it mounts, so the boot banner is replaced seamlessly by the
real React render a moment later — no double-banner, no flash.

  T=2ms    banner visible (vs. ~170ms before)
  T=~170ms React + Ink mounts
  T=~200ms alt screen takes over, Banner component repaints

Palette drift between `bootBanner.ts` and the live theme is harmless —
the live render overrides after ~200ms. Narrow terminals (cols < 98)
fall back to the one-line "⚕ NOUS HERMES" marker.
2026-04-16 15:48:41 -05:00
Brooklyn Nicholson 04e36851b7 feat(tui): honest status 'starting agent…' until session.info arrives
Post-async-session.create, `session.create` returns in ~1ms with partial
info and the real agent fires `session.info` ~1s later. Previously the
status bar went straight to 'ready' right after the instant RPC return,
which was misleading — `prompt.submit` would block server-side waiting
for the agent to finish building.

Now:
- `newSession`: status = 'starting agent…' when info has no `version`,
  else 'ready' (covers the fast resume path too)
- `session.info` event: flips status to 'ready' only if it was
  'starting agent…', preserving running/interrupted/error states
2026-04-16 15:41:44 -05:00
Brooklyn Nicholson a8e0a1148f perf(tui): async session.create — sid live in ~250ms instead of ~1350ms
Previously `session.create` blocked for ~1.2s on `_make_agent` (mostly
`run_agent` transitive imports + AIAgent constructor). The UI waited
through that whole window before sid became known and the banner/panel
could render.

Now `session.create` returns immediately with `{session_id, info:
{model, cwd, tools:{}, skills:{}}}` and spawns a background thread that
does the real `_make_agent` + `_init_session`. When the agent is live,
the thread emits `session.info` with the full payload.

Python side:
- `_sessions[sid]` gets a placeholder dict with `agent=None` and a
  `threading.Event()` named `agent_ready`
- `_wait_agent(session, rid, timeout=30)` blocks until the event is set
  (no-op when already set or absent, e.g. for `session.resume`)
- `_sess()` now calls `_wait_agent` — so every handler routed through it
  (prompt.submit, session.usage, session.compress, session.branch,
  rollback.*, tools.configure, etc.) automatically holds until the agent
  is live, but only during the ~1s startup window
- `terminal.resize` and `input.detect_drop` bypass the wait via direct
  dict lookup — they don't touch the agent and would otherwise block
  the first post-startup RPCs unnecessarily

TS side:
- `session.info` event handler now patches the intro message's `info`
  in-place so the seeded banner upgrades to the full session panel when
  the agent finishes initializing
- `appLayout` gates `SessionPanel` on `info.version` being present
  (only set by `_session_info(agent)`, not by the partial payload from
  `session.create`) — so the panel only appears when real data arrives

Net effect on cold start:
  T=~400ms  banner paints (seeded intro)
  T=~245ms  ui.sid set (session.create responds in ~1ms after ready)
  T=~1400ms session panel fills in (real session.info event)

Pre-session keystrokes queue as before (already handled by the flush
effect); `prompt.submit` will wait on `agent_ready` on the Python side
when the flush tries to send before the agent is live.
2026-04-16 15:39:19 -05:00
Brooklyn Nicholson 842a122964 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 15:37:28 -05:00
Brooklyn Nicholson 2d693c865c perf(tui): spawn python gateway before loading @hermes/ink
Before: entry.tsx imports @hermes/ink (394KB bundle) + App + GatewayClient
in declaration order, then calls `gw.start()` at ~T=220ms. Python fork +
server.py import starts then.

After: only `GatewayClient` is statically imported (5ms, node builtins
only). `gw.start()` fires at ~T=5ms. @hermes/ink + App load in parallel
via `Promise.all(import(...))`. Python gets ~215ms of free runway to do
its own module import before node even finishes loading.

Net: session.info arrives ~150ms earlier in cold start. First React frame
timing is unchanged (still ~240ms — still gated by ink+app imports).

Removed a previously-tried warm-thread in server.py that pre-imported
`run_agent` in the background. Measured variance showed occasional
5-10s outliers (GIL thrashing); median gain was <100ms. Not worth the
non-determinism.
2026-04-16 15:21:49 -05:00
Brooklyn Nicholson f3920fec0b feat(tui): queue pre-session input, auto-flush when session lands
The TUI is fully interactive from the first frame but `session.create`
(agent + tools + MCP) takes ~2s. Plain-text messages typed before the
session is live used to fail with "session not ready yet"; slash and
shell commands worked but agent prompts were dropped.

Now:
- `dispatchSubmission` enqueues plain text when `sid` is null (slash/shell
  still short-circuit first)
- `useMainApp` tracks sid transitions and kicks off one `sendQueued()`
  when the session first becomes ready; subsequent queued messages drain
  on `message.complete` as before
- Fixed pre-existing double-Enter bug that dequeued without sid check

User flow: type `hello` → shows in `queuedDisplay` preview → 2s later
agent wakes → message auto-sends → reply streams. Zero wasted input.
2026-04-16 15:04:18 -05:00
Brooklyn Nicholson c6ed61430a perf(tui): paint banner on first frame, don't wait on session.create
Previously `historyItems` was seeded empty and the intro (with Banner +
SessionPanel) was only pushed after Python's `session.create` returned —
~1.8s of agent + tools + MCP init with nothing on screen. Base CLI feels
instant because it prints the banner as its first action.

Seed `historyItems` with an info-less intro on mount. `appLayout` now
renders the Banner unconditionally for `kind === 'intro'` and gates only
the SessionPanel on `info` being present. Gateway.ready swaps the skin
(~200ms) and session.info fills in the panel when the agent is ready.

Net: first usable frame drops from ~2s to ~300ms (node + module graph +
React mount). No behavior change — intro message is replaced in place
by `introMsg(info)` when `newSession()` / `resumeById()` resolve.
2026-04-16 14:58:12 -05:00
Brooklyn Nicholson cb2a737bc8 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 14:48:33 -05:00
Brooklyn Nicholson 18840bcff8 chore: uptick 2026-04-16 14:48:29 -05:00
Brooklyn Nicholson 0478266831 refactor(tui): stop shadowing python — slash fallback inherits worker output
Python's slash worker already prints every echo/panel command through Rich.
TS was reformatting the same data client-side for 23 commands. Delete those
shadows; let the `slash.exec` fallback in `createSlashHandler` route the
worker's text (via `<Ansi>`) and page-wrap long output.

TS registry now contains 23 commands (down from 45) — only those that:
  - mutate React-local state (composer, transcript, overlays, uiStore)
  - touch the terminal (OSC52 copy, `$EDITOR`, clipboard)
  - open pickers (`/model`, `/resume`)
  - trigger history surgery (`/undo`, `/retry`, `/compress`, `/personality`)
  - need TS-only composition (`/help` merges HOTKEYS + catalog)

Deleted shadows:
  session: yolo, skin, verbose, reasoning, provider, stop, reload-mcp,
           save, title, insights, debug, fast, platforms, snapshot,
           usage, history, profile
  ops:     plugins, rollback, agents, tasks, cron, config, toolsets,
           browser, skills (list/browse only; `/tools configure` kept
           for its history-reset side effect)

Side effects:
- Drops `slash/shared.ts` + `SlashShared` + `shared`/`SLASH_OUTPUT_PAGE` —
  generic slash.exec fallback handles titled paging via `createSlashHandler`.
- Prunes 17 now-unreferenced `*Response` interfaces from gatewayTypes.ts.
- `createSlashHandler` fallback now pages long output (len>180 || lines>2)
  and uses the command name as title.

session.ts: 670 -> 199  (-70%)
ops.ts:     460 ->  52  (-88%)
gatewayTypes.ts: 450 -> 302  (-33%)
2026-04-16 14:26:15 -05:00
Brooklyn Nicholson beccd1bc04 Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 12:42:44 -05:00
Brooklyn Nicholson 68ecdb6e26 refactor(tui): store-driven turn state + slash registry + module split
Hoist turn state from a 286-line hook into $turnState atom + turnController
singleton. createGatewayEventHandler becomes a typed dispatch over the
controller; its ctx shrinks from 30 fields to 5. Event-handler refs and 16
threaded actions are gone.

Fold three createSlash*Handler factories into a data-driven SlashCommand[]
registry under slash/commands/{core,session,ops}.ts. Aliases are data;
findSlashCommand does name+alias lookup. Shared guarded/guardedErr combinator
in slash/guarded.ts.

Split constants.ts + app/helpers.ts into config/ (timing/limits/env),
content/ (faces/placeholders/hotkeys/verbs/charms/fortunes), domain/ (roles/
details/messages/paths/slash/viewport/usage), protocol/ (interpolation/paste).

Type every RPC response in gatewayTypes.ts (26 new interfaces); drop all
`(r: any)` across slash + main app.

Shrink useMainApp from 1216 -> 646 lines by extracting useSessionLifecycle,
useSubmission, useConfigSync. Add <Fg> themed primitive and strip ~50
`as any` color casts.

Tests: 50 passing. Build + type-check clean.
2026-04-16 12:34:45 -05:00
Ari Lotter fc0623f0af update nix 2026-04-16 11:50:35 -04:00
Brooklyn Nicholson 9c71f3a6ea Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 10:47:41 -05:00
Brooklyn Nicholson c4b9750bc1 feat: lazy bootstrap node 2026-04-16 10:47:37 -05:00
Brooklyn Nicholson 39b1336d1f fix: ctx usage display 2026-04-16 08:27:41 -05:00
Brooklyn Nicholson f81dba0da2 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 08:23:20 -05:00
Brooklyn Nicholson 8e06db56fd chore: uptick 2026-04-16 01:04:35 -05:00
Brooklyn Nicholson cb31732c4f chore: uptick 2026-04-15 23:29:00 -05:00
Brooklyn Nicholson 097702c8a7 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 19:11:07 -05:00
Brooklyn Nicholson 72aebfbb24 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 17:43:41 -05:00
Brooklyn Nicholson c9f78d110a feat: good vibes indi 2026-04-15 17:43:38 -05:00
Brooklyn Nicholson baa0de7649 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 16:35:01 -05:00
Brooklyn Nicholson 57e4b61155 feat: change to $ when in ! mode 2026-04-15 16:34:58 -05:00
Brooklyn Nicholson 53a024a941 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 14:37:54 -05:00
Brooklyn Nicholson cb7b740e32 feat: add subagent details 2026-04-15 14:35:42 -05:00
Brooklyn Nicholson 4b4b4d47bc feat: just more cleaning 2026-04-15 14:14:01 -05:00
Brooklyn Nicholson 46cef4b7fa Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 12:48:17 -05:00
Brooklyn Nicholson 9931d1d814 chore: cleanup 2026-04-15 10:35:08 -05:00
Brooklyn Nicholson cc15b55bb9 chore: uptick 2026-04-15 10:23:15 -05:00
Brooklyn Nicholson 371166fe26 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 10:21:00 -05:00
Brooklyn Nicholson 33c615504d feat: add inline token count etc and fix venv 2026-04-15 10:20:56 -05:00
Brooklyn Nicholson 561cea0d4a Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 00:02:31 -05:00
Brooklyn Nicholson 496bfb3c59 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 22:30:22 -05:00
Brooklyn Nicholson 99d859ce4a feat: refactor by splitting up app and doing proper state 2026-04-14 22:30:18 -05:00
Brooklyn Nicholson 4cbf54fb33 chore: uptick 2026-04-14 19:38:04 -05:00
Brooklyn Nicholson 77cd5bf565 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 19:33:03 -05:00
Brooklyn Nicholson bf54f1fb2f Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 18:26:05 -05:00
Brooklyn Nicholson 3bc661ea29 fix: model et al selection on enter 2026-04-14 18:26:00 -05:00
Brooklyn Nicholson 52c11d172a feat: add scrollbar and fix selection on scroll 2026-04-14 14:34:33 -05:00
Brooklyn Nicholson 9804aa7443 fix: scrolling while selecting 2026-04-14 12:50:22 -05:00
Brooklyn Nicholson 7aed09e1ba fix: ctrlc 2026-04-14 12:07:29 -05:00
Brooklyn Nicholson dd2b0b4775 chore: uptick 2026-04-14 11:53:55 -05:00
Brooklyn Nicholson ea2d5754ab Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 11:49:40 -05:00
Brooklyn Nicholson 9a3a2925ed feat: scroll aware sticky prompt 2026-04-14 11:49:32 -05:00
Brooklyn Nicholson c189d5e98b fix: pasting 2026-04-13 22:39:03 -05:00
Brooklyn Nicholson 6bbac046a7 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 21:46:11 -05:00
Brooklyn Nicholson bbc7316007 feat: add cur cwd 2026-04-13 21:46:08 -05:00
Brooklyn Nicholson 35dbb1da3f chore: uptick 2026-04-13 21:22:44 -05:00
Brooklyn Nicholson 6d6b3b03ac feat: add clicky handles 2026-04-13 21:20:55 -05:00
Brooklyn Nicholson 1b573b7b21 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 21:17:41 -05:00
Brooklyn Nicholson 7e4dd6ea02 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 18:32:13 -05:00
Brooklyn Nicholson aeb53131f3 fix(ui-tui): harden TUI error handling, model validation, command UX parity, and gateway lifecycle 2026-04-13 18:29:24 -05:00
Brooklyn Nicholson 783c6b6ed6 chore: uptick 2026-04-13 15:08:06 -05:00
Brooklyn Nicholson 4a260b51fe fix: deep markdown parsing 2026-04-13 15:01:15 -05:00
Brooklyn Nicholson ebe3270430 fix: fake models 2026-04-13 14:57:42 -05:00
Brooklyn Nicholson 77b97b810a chore: update how txt pasting ux feels 2026-04-13 14:49:10 -05:00
Brooklyn Nicholson 9db94e8521 Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 14:17:55 -05:00
Brooklyn Nicholson cac1b1b724 fix(ui-tui): surface RPC errors and guard invalid gateway responses 2026-04-13 14:17:52 -05:00
Ari Lotter 56524bb1d9 fix: nix local dev with tui 2026-04-13 15:09:31 -04:00
Brooklyn Nicholson 0642b6cc53 fix: clean newline paste thingy 2026-04-13 12:54:48 -05:00
Brooklyn Nicholson eec1db36f7 chore: preserve commands 2026-04-13 10:43:42 -05:00
Brooklyn Nicholson 713a614ea8 chore: uptick 2026-04-13 10:22:44 -05:00
Brooklyn Nicholson a27167fb30 chore: fmt 2026-04-13 10:14:05 -05:00
Brooklyn Nicholson a2c0597ae4 feat: show thinking indicator while inferencing 2026-04-13 10:11:18 -05:00
Brooklyn Nicholson 0fd33a98cd feat: ctrl t for diff thinking rendering types 2026-04-12 20:08:12 -05:00
Brooklyn Nicholson ddb0871769 feat(tui): hierarchical tool progress with grouped parent/child rows and transient line pruning 2026-04-12 17:39:17 -05:00
Brooklyn Nicholson e03bef684e chore: fmt 2026-04-12 16:33:25 -05:00
Brooklyn Nicholson 4b026d6761 fix: little box typey thing 2026-04-12 16:31:30 -05:00
Brooklyn Nicholson 8efd3db1b4 fix: force builds 2026-04-12 16:08:03 -05:00
Brooklyn Nicholson ef51bb0091 fix: tool drafting stuff 2026-04-12 16:06:39 -05:00
Ari Lotter 3bf0f39337 wrap preformatted ansi in <Ansi> component 2026-04-12 16:53:53 -04:00
Brooklyn Nicholson 690d62a6d1 Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-12 13:19:07 -05:00
Brooklyn Nicholson 2aea75e91e Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-12 13:18:55 -05:00
Austin Pickett 5552e1ffe1 Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 22:10:11 -04:00
Austin Pickett 90890f8f04 feat: personality selector 2026-04-11 22:10:02 -04:00
Ari Lotter 8e0df1d532 launch tui later to allow setup et al 2026-04-11 20:23:30 -04:00
Ari Lotter 29721fcc58 nix fixes 2026-04-11 19:35:00 -04:00
Brooklyn Nicholson a1d2a0c0fd feat: self update npm deps on hermes update 2026-04-11 18:29:18 -05:00
Brooklyn Nicholson ec553fdb49 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 17:15:41 -05:00
Brooklyn Nicholson 24a498eb90 feat: better markdown 2026-04-11 17:15:36 -05:00
Brooklyn Nicholson 9ccb490cf3 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 15:30:23 -05:00
Brooklyn Nicholson 32302c37dd feat: fix types and add type checking plus lazybundle on launch andddd dev flag 2026-04-11 14:42:28 -05:00
Ari Lotter 5e5e65f6d5 fix nix build 2026-04-11 15:30:37 -04:00
Brooklyn Nicholson acbf1794f2 Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 14:05:17 -05:00
Brooklyn Nicholson e2ea8934d4 feat: ensure feature parity once again 2026-04-11 14:02:36 -05:00
Austin Pickett 7e7f78f86c Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 15:00:28 -04:00
Austin Pickett 5fb6a4418b feat: panels 2026-04-11 14:29:24 -04:00
Brooklyn Nicholson bf6af95ff5 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 13:14:36 -05:00
Brooklyn Nicholson 3fd5cf6e3c feat: fix img pasting in new ink plus newline after tools 2026-04-11 13:14:32 -05:00
Brooklyn Nicholson b04248f4d5 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor
# Conflicts:
#	gateway/platforms/base.py
#	gateway/run.py
#	tests/gateway/test_command_bypass_active_session.py
2026-04-11 11:39:47 -05:00
Brooklyn Nicholson 7803d21bcc Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 11:39:19 -05:00
Brooklyn Nicholson 8760faf991 feat: fork ink and make it work nicely 2026-04-11 11:29:08 -05:00
jonny cab6447d58 fix(tui): render tool trail consistently between live and resume
Resumed sessions showed raw JSON tool output in content boxes instead
of the compact trail lines seen during live use. The root cause was
two separate rendering paths with no shared code.

Extract buildToolTrailLine() into lib/text.ts as the single source
of truth for formatting tool trail lines. Both the live tool.complete
handler and toTranscriptMessages now call it.

Server-side, reconstruct tool name and args from the assistant
message's tool_calls field (tool_name column is unpopulated) and
pass them through _tool_ctx/build_tool_preview — the same path
the live tool.start callback uses.
2026-04-11 06:35:00 +00:00
jonny 57e8d44af8 fix(tui): preserve tool metadata in resumed session history
session.resume was building conversation history with only role and
content, stripping tool_call_id, tool_calls, and tool_name. The API
requires tool messages to reference their parent tool_call, so resumed
sessions with tool history would fail with HTTP 500.

Use get_messages_as_conversation() which already preserves the full
message structure including tool metadata and reasoning fields.
2026-04-11 05:23:44 +00:00
jonny cb79018977 fix(tui): improve session picker readability
- Show full session ID in a fixed-width column for easy scanning
- Pad row numbers to 2 digits to keep alignment past 9 entries
- Always show session source (tui/cli) instead of conditionally hiding it
- Use Box-based column layout so ID, metadata, and title don't run together
2026-04-10 11:16:41 +00:00
jonny 90f0aa174d fix(tui): support /resume <id> to bypass session picker
- Extract resumeById callback from inline onSelect handler
- /resume with no arg opens picker (unchanged behavior)
- /resume <id> resumes directly, skipping the picker
2026-04-10 11:00:08 +00:00
jonny 304f1463a9 fix(tui): show CLI sessions in resume picker
- session.list RPC now queries both tui and cli sources, merged by recency
- Session picker shows source label for non-tui sessions (e.g. ", cli")
- Added source field to SessionItem interface
2026-04-10 09:34:01 +00:00
jonny 294c377c0c fix(tui): use PROJECT_ROOT instead of cwd for HERMES_ROOT fallback
When HERMES_ROOT was added for Nix-bundled TUI support, the fallback
was set to os.getcwd(). This overrode the TUI's own import.meta.dirname
resolution, so launching `hermes --tui` from outside the repo caused
the gateway client to look for venv/bin/python relative to the user's
working directory instead of the repo root.

Use PROJECT_ROOT (resolved from the source file location) as the
fallback, which is stable regardless of where the command is invoked.
2026-04-10 09:18:06 +00:00
Ari Lotter 660379637a one more nix fix 2026-04-10 01:41:29 -04:00
Ari Lotter bc80848e49 update lockfile 2026-04-10 00:50:39 -04:00
Ari Lotter 658cd2dd4c nix: add tui lockfile update script 2026-04-10 00:46:37 -04:00
Brooklyn Nicholson 8c1ba639c6 Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-09 23:35:29 -05:00
Brooklyn Nicholson 17a9c47178 feat: support shift enter for ghostty etc 2026-04-09 23:35:25 -05:00
Austin Pickett e1df13cf20 fix: menus 2026-04-10 00:01:37 -04:00
Brooklyn Nicholson 4fe78d5b88 chore: fix bad merge apparently? 2026-04-09 19:17:06 -05:00
Brooklyn Nicholson aa5b697a9d Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-09 19:12:31 -05:00
Brooklyn Nicholson aca479c1ae Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-09 19:08:52 -05:00
Brooklyn Nicholson b85ff282bc feat(ui-tui): slash command history/display, CoT fade, live skin switch, fix double reasoning 2026-04-09 19:08:47 -05:00
Austin Pickett f805323517 chore: merge main 2026-04-09 20:00:34 -04:00
Austin Pickett 4406b4b100 fix: add delete support 2026-04-09 19:53:55 -04:00
Brooklyn Nicholson 17ecdce936 feat: add slash commands to the history so it doesnt get lost 2026-04-09 18:51:17 -05:00
Brooklyn Nicholson 7e813a30e0 fix: sexier cots 2026-04-09 18:33:25 -05:00
Brooklyn Nicholson 6e24b9947e feat(ui-tui): render tool calls inline in message flow instead of activity lane 2026-04-09 17:40:30 -05:00
Brooklyn Nicholson 99fd3b518d feat: add /copy and /agents 2026-04-09 17:19:36 -05:00
Brooklyn Nicholson c5511bbc5a fix: leading ./ thingy 2026-04-09 16:27:06 -05:00
Brooklyn Nicholson b7d4ea1550 feat: better hyperlink formatting 2026-04-09 15:13:43 -05:00
Ari Lotter 74241328f0 direnv: watch lockfiles/nix files; gitignore .nix-stamps 2026-04-09 15:50:24 -04:00
Ari Lotter df5874c119 nix: add bundled TUI build-time verification check 2026-04-09 15:50:24 -04:00
Ari Lotter 21afb3fa3c nix: delegate devShell setup to package passthru hooks
- use inputsFrom to inherit build inputs from packages
- concat passthru.devShellHook from each package
2026-04-09 15:50:24 -04:00
Ari Lotter 31b2c12f0f nix: bundle TUI in main package with passthru hooks
- build tui.nix, copy to $out/ui-tui/ (same layout as dev)
- set HERMES_TUI_DIR, HERMES_PYTHON in wrapper
- add passthru.devShellHook with stamp-checked venv setup
- expose tui as separate package output
2026-04-09 15:50:24 -04:00
Ari Lotter 405c1b4e84 nix: add TUI derivation with buildNpmPackage
- fetchNpmDeps for reproducibilty
- compile ts to js
- passthru.devShellHook for dev shell stamp-checked auto dep install
2026-04-09 15:50:24 -04:00
Ari Lotter 5ff96551d5 cli: support bundled TUI at HERMES_TUI_DIR (for nix)
- Fix cwd to use bundled TUI dir, not PROJECT_ROOT
- Set HERMES_ROOT from env with cwd fallback
2026-04-09 15:50:24 -04:00
Ari Lotter 2b4272ef5b ui-tui: update package-lock.json 2026-04-09 15:35:54 -04:00
Ari Lotter 670dcea8f4 ui-tui: add tsc build pipeline
- Switch tsconfig to nodenext module resolution for Node 22 (used by
installer script)
- Add shebang to entry.tsx, preserved into index.js
- Add HERMES_ROOT env var fallback for repo root resolution
2026-04-09 15:35:29 -04:00
Brooklyn Nicholson 17f13013eb chore: fmt 2026-04-09 14:17:45 -05:00
Brooklyn Nicholson 00e1d42b9e feat: image pasting 2026-04-09 13:45:23 -05:00
Brooklyn Nicholson b2ea9b4176 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-09 12:31:20 -05:00
Brooklyn Nicholson 0d7c19a42f fix(ui-tui): ref-based input buffer, gateway listener stability, usage display, and 6 correctness bugs 2026-04-09 12:21:24 -05:00
Brooklyn Nicholson 8755b9dfc0 fix: resizing etc 2026-04-09 00:46:35 -05:00
Brooklyn Nicholson 54bd25ff4a fix(tui): -c resume, ctrl z, pasting updates, exit summary, session fix 2026-04-09 00:36:53 -05:00
Brooklyn Nicholson b66550ed08 fix(tui): stabilize multiline input, persist tool traces, and port CLI-style context status bar 2026-04-08 23:59:56 -05:00
Brooklyn Nicholson c49bbbe8c2 chore: fmt 2026-04-08 22:02:38 -05:00
Brooklyn Nicholson 9d8f9765c1 feat: add tests and update mds 2026-04-08 19:31:25 -05:00
Brooklyn Nicholson f226e6be10 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-08 19:11:44 -05:00
Brooklyn Nicholson a435c7274a chore: uptick 2026-04-08 14:22:36 -05:00
Brooklyn Nicholson b597123489 feat: better bg tasks 2026-04-08 14:18:37 -05:00
Brooklyn Nicholson af0f4a52fe feat: cute spinners 2026-04-08 13:45:34 -05:00
Brooklyn Nicholson b50d81f212 fix: diff colours 2026-04-08 12:11:55 -05:00
Brooklyn Nicholson a9fa054df9 chore: uptick 2026-04-08 10:35:07 -05:00
Brooklyn Nicholson 31cb23890a Merge branch 'feat/ink-refactor' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-08 09:46:46 -05:00
Brooklyn Nicholson a3cfb1de86 feat: auto install tui deps 2026-04-08 09:46:40 -05:00
Austin Pickett 371efafc46 feat: personality 2026-04-08 00:15:15 -04:00
Austin Pickett ebd2d83ef2 feat: add skin logo support 2026-04-07 23:59:11 -04:00
Brooklyn Nicholson af077b2c0d fix: history up arrow 2026-04-07 20:47:59 -05:00
Brooklyn Nicholson 2d884ff12d chore: uptick 2026-04-07 20:46:59 -05:00
Brooklyn Nicholson b397c91d4a chore: uptick 2026-04-07 20:44:18 -05:00
Brooklyn Nicholson 9c2c9e3a3e chore: fmt 2026-04-07 20:30:22 -05:00
Brooklyn Nicholson c3eeb03e26 chore: clean exit 2026-04-07 20:29:31 -05:00
Brooklyn Nicholson d9d0ac06b9 chore: readme update 2026-04-07 20:24:46 -05:00
Brooklyn Nicholson 29f2610e4b tui updates for rendering pipeline 2026-04-07 20:11:05 -05:00
Brooklyn Nicholson dcb97f7465 chore: readme 2026-04-06 18:52:45 -05:00
Brooklyn Nicholson 86308b6de4 chore: better command support 2026-04-06 18:49:40 -05:00
Brooklyn Nicholson 2d349bbf7a chore: fmt 2026-04-06 18:43:00 -05:00
Brooklyn Nicholson 39878aff00 chore: uptick 2026-04-06 18:40:21 -05:00
Brooklyn Nicholson afd670a36f feat: small refactors 2026-04-06 18:38:13 -05:00
Brooklyn Nicholson e2b3b1c5e4 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-06 17:56:45 -05:00
Brooklyn Nicholson 4c7d5ec778 tui: add tui arg 2026-04-05 18:55:59 -05:00
Brooklyn Nicholson f116c59071 tui: inherit Python-side rendering via gateway bridge 2026-04-05 18:50:41 -05:00
Brooklyn Nicholson 0f556a17f5 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-05 18:24:10 -05:00
Brooklyn Nicholson ee92460763 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-04 16:35:13 -05:00
Brooklyn Nicholson 2893e9df71 feat: add image pasting capability 2026-04-04 13:00:55 -05:00
Brooklyn Nicholson 5a5d90c85a chore: formatting etc 2026-04-03 20:14:57 -05:00
Brooklyn Nicholson 56a69e519b chore: uptick 2026-04-03 19:55:15 -05:00
Brooklyn Nicholson fab4d8d470 chore: uptick 2026-04-03 19:52:50 -05:00
Brooklyn Nicholson 1218994992 chore: uptick 2026-04-03 14:44:50 -05:00
Brooklyn Nicholson f4bf57ff7a chore: uptick 2026-04-02 23:00:38 -05:00
Brooklyn Nicholson bbba9ed4f2 feat: split apart main.tsx 2026-04-02 20:39:52 -05:00
Brooklyn Nicholson 2818dd8611 feat: add prettier etc for ui-tui 2026-04-02 19:34:30 -05:00
Brooklyn Nicholson 2ea5345a7b feat: new tui based on ink 2026-04-02 19:07:53 -05:00
655 changed files with 101208 additions and 7135 deletions
+4
View File
@@ -1 +1,5 @@
watch_file pyproject.toml uv.lock
watch_file ui-tui/package-lock.json ui-tui/package.json
watch_file flake.nix flake.lock nix/devShell.nix nix/tui.nix nix/package.nix nix/python.nix
use flake
+1
View File
@@ -60,5 +60,6 @@ mini-swe-agent/
# Nix
.direnv/
.nix-stamps/
result
website/static/api/skills-index.json
+104 -6
View File
@@ -56,6 +56,19 @@ hermes-agent/
│ ├── run.py # Main loop, slash commands, message dispatch
│ ├── session.py # SessionStore — conversation persistence
│ └── platforms/ # Adapters: telegram, discord, slack, whatsapp, homeassistant, signal, qqbot
├── ui-tui/ # Ink (React) terminal UI — `hermes --tui`
│ ├── src/entry.tsx # TTY gate + render()
│ ├── src/app.tsx # Main state machine and UI
│ ├── src/gatewayClient.ts # Child process + JSON-RPC bridge
│ ├── src/app/ # Decomposed app logic (event handler, slash handler, stores, hooks)
│ ├── src/components/ # Ink components (branding, markdown, prompts, pickers, etc.)
│ ├── src/hooks/ # useCompletion, useInputHistory, useQueue, useVirtualHistory
│ └── src/lib/ # Pure helpers (history, osc52, text, rpc, messages)
├── tui_gateway/ # Python JSON-RPC backend for the TUI
│ ├── entry.py # stdio entrypoint
│ ├── server.py # RPC handlers and session logic
│ ├── render.py # Optional rich/ANSI bridge
│ └── slash_worker.py # Persistent HermesCLI subprocess for slash commands
├── acp_adapter/ # ACP server (VS Code / Zed / JetBrains integration)
├── cron/ # Scheduler (jobs.py, scheduler.py)
├── environments/ # RL training environments (Atropos)
@@ -179,6 +192,59 @@ if canonical == "mycommand":
---
## TUI Architecture (ui-tui + tui_gateway)
The TUI is a full replacement for the classic (prompt_toolkit) CLI, activated via `hermes --tui` or `HERMES_TUI=1`.
### Process Model
```
hermes --tui
└─ Node (Ink) ──stdio JSON-RPC── Python (tui_gateway)
│ └─ AIAgent + tools + sessions
└─ renders transcript, composer, prompts, activity
```
TypeScript owns the screen. Python owns sessions, tools, model calls, and slash command logic.
### Transport
Newline-delimited JSON-RPC over stdio. Requests from Ink, events from Python. See `tui_gateway/server.py` for the full method/event catalog.
### Key Surfaces
| Surface | Ink component | Gateway method |
|---------|---------------|----------------|
| Chat streaming | `app.tsx` + `messageLine.tsx` | `prompt.submit``message.delta/complete` |
| Tool activity | `thinking.tsx` | `tool.start/progress/complete` |
| Approvals | `prompts.tsx` | `approval.respond``approval.request` |
| Clarify/sudo/secret | `prompts.tsx`, `maskedPrompt.tsx` | `clarify/sudo/secret.respond` |
| Session picker | `sessionPicker.tsx` | `session.list/resume` |
| Slash commands | Local handler + fallthrough | `slash.exec``_SlashWorker`, `command.dispatch` |
| Completions | `useCompletion` hook | `complete.slash`, `complete.path` |
| Theming | `theme.ts` + `branding.tsx` | `gateway.ready` with skin data |
### Slash Command Flow
1. Built-in client commands (`/help`, `/quit`, `/clear`, `/resume`, `/copy`, `/paste`, etc.) handled locally in `app.tsx`
2. Everything else → `slash.exec` (runs in persistent `_SlashWorker` subprocess) → `command.dispatch` fallback
### Dev Commands
```bash
cd ui-tui
npm install # first time
npm run dev # watch mode (rebuilds hermes-ink + tsx --watch)
npm start # production
npm run build # full build (hermes-ink + tsc)
npm run type-check # typecheck only (tsc --noEmit)
npm run lint # eslint
npm run fmt # prettier
npm test # vitest
```
---
## Adding New Tools
Requires changes in **2 files**:
@@ -458,13 +524,45 @@ def profile_env(tmp_path, monkeypatch):
## Testing
**ALWAYS use `scripts/run_tests.sh`** — do not call `pytest` directly. The script enforces
hermetic environment parity with CI (unset credential vars, TZ=UTC, LANG=C.UTF-8,
4 xdist workers matching GHA ubuntu-latest). Direct `pytest` on a 16+ core
developer machine with API keys set diverges from CI in ways that have caused
multiple "works locally, fails in CI" incidents (and the reverse).
```bash
source venv/bin/activate
python -m pytest tests/ -q # Full suite (~3000 tests, ~3 min)
python -m pytest tests/test_model_tools.py -q # Toolset resolution
python -m pytest tests/test_cli_init.py -q # CLI config loading
python -m pytest tests/gateway/ -q # Gateway tests
python -m pytest tests/tools/ -q # Tool-level tests
scripts/run_tests.sh # full suite, CI-parity
scripts/run_tests.sh tests/gateway/ # one directory
scripts/run_tests.sh tests/agent/test_foo.py::test_x # one test
scripts/run_tests.sh -v --tb=long # pass-through pytest flags
```
### Why the wrapper (and why the old "just call pytest" doesn't work)
Five real sources of local-vs-CI drift the script closes:
| | Without wrapper | With wrapper |
|---|---|---|
| Provider API keys | Whatever is in your env (auto-detects pool) | All `*_API_KEY`/`*_TOKEN`/etc. unset |
| HOME / `~/.hermes/` | Your real config+auth.json | Temp dir per test |
| Timezone | Local TZ (PDT etc.) | UTC |
| Locale | Whatever is set | C.UTF-8 |
| xdist workers | `-n auto` = all cores (20+ on a workstation) | `-n 4` matching CI |
`tests/conftest.py` also enforces points 1-4 as an autouse fixture so ANY pytest
invocation (including IDE integrations) gets hermetic behavior — but the wrapper
is belt-and-suspenders.
### Running without the wrapper (only if you must)
If you can't use the wrapper (e.g. on Windows or inside an IDE that shells
pytest directly), at minimum activate the venv and pass `-n 4`:
```bash
source venv/bin/activate
python -m pytest tests/ -q -n 4
```
Worker count above 4 will surface test-ordering flakes that CI never sees.
Always run the full suite before pushing changes.
+20 -10
View File
@@ -21,26 +21,36 @@ RUN useradd -u 10000 -m -d /opt/data hermes
COPY --chmod=0755 --from=gosu_source /gosu /usr/local/bin/
COPY --chmod=0755 --from=uv_source /usr/local/bin/uv /usr/local/bin/uvx /usr/local/bin/
COPY . /opt/hermes
WORKDIR /opt/hermes
# Install Node dependencies and Playwright as root (--with-deps needs apt)
# ---------- Layer-cached dependency install ----------
# Copy only package manifests first so npm install + Playwright are cached
# unless the lockfiles themselves change.
COPY package.json package-lock.json ./
COPY scripts/whatsapp-bridge/package.json scripts/whatsapp-bridge/package-lock.json scripts/whatsapp-bridge/
COPY web/package.json web/package-lock.json web/
RUN npm install --prefer-offline --no-audit && \
npx playwright install --with-deps chromium --only-shell && \
cd /opt/hermes/scripts/whatsapp-bridge && \
npm install --prefer-offline --no-audit && \
(cd scripts/whatsapp-bridge && npm install --prefer-offline --no-audit) && \
(cd web && npm install --prefer-offline --no-audit) && \
npm cache clean --force
# Hand ownership to hermes user, then install Python deps in a virtualenv
RUN chown -R hermes:hermes /opt/hermes
USER hermes
# ---------- Source code ----------
# .dockerignore excludes node_modules, so the installs above survive.
COPY --chown=hermes:hermes . .
# Build web dashboard (Vite outputs to hermes_cli/web_dist/)
RUN cd web && npm run build
# ---------- Python virtualenv ----------
RUN chown hermes:hermes /opt/hermes
USER hermes
RUN uv venv && \
uv pip install --no-cache-dir -e ".[all]"
USER root
RUN chmod +x /opt/hermes/docker/entrypoint.sh
# ---------- Runtime ----------
ENV HERMES_WEB_DIST=/opt/hermes/hermes_cli/web_dist
ENV HERMES_HOME=/opt/data
VOLUME [ "/opt/data" ]
ENTRYPOINT [ "/opt/hermes/docker/entrypoint.sh" ]
+9 -2
View File
@@ -13,7 +13,7 @@
**The self-improving AI agent built by [Nous Research](https://nousresearch.com).** It's the only agent with a built-in learning loop — it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations, and builds a deepening model of who you are across sessions. Run it on a $5 VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. It's not tied to your laptop — talk to it from Telegram while it works on a cloud VM.
Use any model you want — [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai) (200+ models), [Xiaomi MiMo](https://platform.xiaomimimo.com), [z.ai/GLM](https://z.ai), [Kimi/Moonshot](https://platform.moonshot.ai), [MiniMax](https://www.minimax.io), [Hugging Face](https://huggingface.co), OpenAI, or your own endpoint. Switch with `hermes model` — no code changes, no lock-in.
Use any model you want — [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai) (200+ models), [NVIDIA NIM](https://build.nvidia.com) (Nemotron), [Xiaomi MiMo](https://platform.xiaomimimo.com), [z.ai/GLM](https://z.ai), [Kimi/Moonshot](https://platform.moonshot.ai), [MiniMax](https://www.minimax.io), [Hugging Face](https://huggingface.co), OpenAI, or your own endpoint. Switch with `hermes model` — no code changes, no lock-in.
<table>
<tr><td><b>A real terminal interface</b></td><td>Full TUI with multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, and streaming tool output.</td></tr>
@@ -141,11 +141,18 @@ See `hermes claw migrate --help` for all options, or use the `openclaw-migration
We welcome contributions! See the [Contributing Guide](https://hermes-agent.nousresearch.com/docs/developer-guide/contributing) for development setup, code style, and PR process.
Quick start for contributors:
Quick start for contributors — clone and go with `setup-hermes.sh`:
```bash
git clone https://github.com/NousResearch/hermes-agent.git
cd hermes-agent
./setup-hermes.sh # installs uv, creates venv, installs .[all], symlinks ~/.local/bin/hermes
./hermes # auto-detects the venv, no need to `source` first
```
Manual path (equivalent to the above):
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv venv --python 3.11
source venv/bin/activate
+20 -1
View File
@@ -49,6 +49,7 @@ def make_tool_progress_cb(
session_id: str,
loop: asyncio.AbstractEventLoop,
tool_call_ids: Dict[str, Deque[str]],
tool_call_meta: Dict[str, Dict[str, Any]],
) -> Callable:
"""Create a ``tool_progress_callback`` for AIAgent.
@@ -84,6 +85,16 @@ def make_tool_progress_cb(
tool_call_ids[name] = queue
queue.append(tc_id)
snapshot = None
if name in {"write_file", "patch", "skill_manage"}:
try:
from agent.display import capture_local_edit_snapshot
snapshot = capture_local_edit_snapshot(name, args)
except Exception:
logger.debug("Failed to capture ACP edit snapshot for %s", name, exc_info=True)
tool_call_meta[tc_id] = {"args": args, "snapshot": snapshot}
update = build_tool_start(tc_id, name, args)
_send_update(conn, session_id, loop, update)
@@ -119,6 +130,7 @@ def make_step_cb(
session_id: str,
loop: asyncio.AbstractEventLoop,
tool_call_ids: Dict[str, Deque[str]],
tool_call_meta: Dict[str, Dict[str, Any]],
) -> Callable:
"""Create a ``step_callback`` for AIAgent.
@@ -132,10 +144,12 @@ def make_step_cb(
for tool_info in prev_tools:
tool_name = None
result = None
function_args = None
if isinstance(tool_info, dict):
tool_name = tool_info.get("name") or tool_info.get("function_name")
result = tool_info.get("result") or tool_info.get("output")
function_args = tool_info.get("arguments") or tool_info.get("args")
elif isinstance(tool_info, str):
tool_name = tool_info
@@ -145,8 +159,13 @@ def make_step_cb(
tool_call_ids[tool_name] = queue
if tool_name and queue:
tc_id = queue.popleft()
meta = tool_call_meta.pop(tc_id, {})
update = build_tool_complete(
tc_id, tool_name, result=str(result) if result is not None else None
tc_id,
tool_name,
result=str(result) if result is not None else None,
function_args=function_args or meta.get("args"),
snapshot=meta.get("snapshot"),
)
_send_update(conn, session_id, loop, update)
if not queue:
+148 -30
View File
@@ -26,6 +26,7 @@ from acp.schema import (
McpServerHttp,
McpServerSse,
McpServerStdio,
ModelInfo,
NewSessionResponse,
PromptResponse,
ResumeSessionResponse,
@@ -36,6 +37,7 @@ from acp.schema import (
SessionCapabilities,
SessionForkCapabilities,
SessionListCapabilities,
SessionModelState,
SessionResumeCapabilities,
SessionInfo,
TextContentBlock,
@@ -147,6 +149,98 @@ class HermesACPAgent(acp.Agent):
self._conn = conn
logger.info("ACP client connected")
@staticmethod
def _encode_model_choice(provider: str | None, model: str | None) -> str:
"""Encode a model selection so ACP clients can keep provider context."""
raw_model = str(model or "").strip()
if not raw_model:
return ""
raw_provider = str(provider or "").strip().lower()
if not raw_provider:
return raw_model
return f"{raw_provider}:{raw_model}"
def _build_model_state(self, state: SessionState) -> SessionModelState | None:
"""Return the ACP model selector payload for editors like Zed."""
model = str(state.model or getattr(state.agent, "model", "") or "").strip()
provider = getattr(state.agent, "provider", None) or detect_provider() or "openrouter"
try:
from hermes_cli.models import curated_models_for_provider, normalize_provider, provider_label
normalized_provider = normalize_provider(provider)
provider_name = provider_label(normalized_provider)
available_models: list[ModelInfo] = []
seen_ids: set[str] = set()
for model_id, description in curated_models_for_provider(normalized_provider):
rendered_model = str(model_id or "").strip()
if not rendered_model:
continue
choice_id = self._encode_model_choice(normalized_provider, rendered_model)
if choice_id in seen_ids:
continue
desc_parts = [f"Provider: {provider_name}"]
if description:
desc_parts.append(str(description).strip())
if rendered_model == model:
desc_parts.append("current")
available_models.append(
ModelInfo(
model_id=choice_id,
name=rendered_model,
description="".join(part for part in desc_parts if part),
)
)
seen_ids.add(choice_id)
current_model_id = self._encode_model_choice(normalized_provider, model)
if current_model_id and current_model_id not in seen_ids:
available_models.insert(
0,
ModelInfo(
model_id=current_model_id,
name=model,
description=f"Provider: {provider_name} • current",
),
)
if available_models:
return SessionModelState(
available_models=available_models,
current_model_id=current_model_id or available_models[0].model_id,
)
except Exception:
logger.debug("Could not build ACP model state", exc_info=True)
if not model:
return None
fallback_choice = self._encode_model_choice(provider, model)
return SessionModelState(
available_models=[ModelInfo(model_id=fallback_choice, name=model)],
current_model_id=fallback_choice,
)
@staticmethod
def _resolve_model_selection(raw_model: str, current_provider: str) -> tuple[str, str]:
"""Resolve ``provider:model`` input into the provider and normalized model id."""
target_provider = current_provider
new_model = raw_model.strip()
try:
from hermes_cli.models import detect_provider_for_model, parse_model_input
target_provider, new_model = parse_model_input(new_model, current_provider)
if target_provider == current_provider:
detected = detect_provider_for_model(new_model, current_provider)
if detected:
target_provider, new_model = detected
except Exception:
logger.debug("Provider detection failed, using model as-is", exc_info=True)
return target_provider, new_model
async def _register_session_mcp_servers(
self,
state: SessionState,
@@ -273,7 +367,10 @@ class HermesACPAgent(acp.Agent):
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("New session %s (cwd=%s)", state.session_id, cwd)
self._schedule_available_commands_update(state.session_id)
return NewSessionResponse(session_id=state.session_id)
return NewSessionResponse(
session_id=state.session_id,
models=self._build_model_state(state),
)
async def load_session(
self,
@@ -289,7 +386,7 @@ class HermesACPAgent(acp.Agent):
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Loaded session %s", session_id)
self._schedule_available_commands_update(session_id)
return LoadSessionResponse()
return LoadSessionResponse(models=self._build_model_state(state))
async def resume_session(
self,
@@ -305,7 +402,7 @@ class HermesACPAgent(acp.Agent):
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Resumed session %s", state.session_id)
self._schedule_available_commands_update(state.session_id)
return ResumeSessionResponse()
return ResumeSessionResponse(models=self._build_model_state(state))
async def cancel(self, session_id: str, **kwargs: Any) -> None:
state = self.session_manager.get_session(session_id)
@@ -340,11 +437,20 @@ class HermesACPAgent(acp.Agent):
cwd: str | None = None,
**kwargs: Any,
) -> ListSessionsResponse:
infos = self.session_manager.list_sessions()
sessions = [
SessionInfo(session_id=s["session_id"], cwd=s["cwd"])
for s in infos
]
infos = self.session_manager.list_sessions(cwd=cwd)
sessions = []
for s in infos:
updated_at = s.get("updated_at")
if updated_at is not None and not isinstance(updated_at, str):
updated_at = str(updated_at)
sessions.append(
SessionInfo(
session_id=s["session_id"],
cwd=s["cwd"],
title=s.get("title"),
updated_at=updated_at,
)
)
return ListSessionsResponse(sessions=sessions)
# ---- Prompt (core) ------------------------------------------------------
@@ -389,12 +495,13 @@ class HermesACPAgent(acp.Agent):
state.cancel_event.clear()
tool_call_ids: dict[str, Deque[str]] = defaultdict(deque)
tool_call_meta: dict[str, dict[str, Any]] = {}
previous_approval_cb = None
if conn:
tool_progress_cb = make_tool_progress_cb(conn, session_id, loop, tool_call_ids)
tool_progress_cb = make_tool_progress_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
thinking_cb = make_thinking_cb(conn, session_id, loop)
step_cb = make_step_cb(conn, session_id, loop, tool_call_ids)
step_cb = make_step_cb(conn, session_id, loop, tool_call_ids, tool_call_meta)
message_cb = make_message_cb(conn, session_id, loop)
approval_cb = make_approval_callback(conn.request_permission, loop, session_id)
else:
@@ -449,6 +556,19 @@ class HermesACPAgent(acp.Agent):
self.session_manager.save_session(session_id)
final_response = result.get("final_response", "")
if final_response:
try:
from agent.title_generator import maybe_auto_title
maybe_auto_title(
self.session_manager._get_db(),
session_id,
user_text,
final_response,
state.history,
)
except Exception:
logger.debug("Failed to auto-title ACP session %s", session_id, exc_info=True)
if final_response and conn:
update = acp.update_agent_message_text(final_response)
await conn.session_update(session_id, update)
@@ -556,27 +676,15 @@ class HermesACPAgent(acp.Agent):
provider = getattr(state.agent, "provider", None) or "auto"
return f"Current model: {model}\nProvider: {provider}"
new_model = args.strip()
target_provider = None
current_provider = getattr(state.agent, "provider", None) or "openrouter"
# Auto-detect provider for the requested model
try:
from hermes_cli.models import parse_model_input, detect_provider_for_model
target_provider, new_model = parse_model_input(new_model, current_provider)
if target_provider == current_provider:
detected = detect_provider_for_model(new_model, current_provider)
if detected:
target_provider, new_model = detected
except Exception:
logger.debug("Provider detection failed, using model as-is", exc_info=True)
target_provider, new_model = self._resolve_model_selection(args, current_provider)
state.model = new_model
state.agent = self.session_manager._make_agent(
session_id=state.session_id,
cwd=state.cwd,
model=new_model,
requested_provider=target_provider or current_provider,
requested_provider=target_provider,
)
self.session_manager.save_session(state.session_id)
provider_label = getattr(state.agent, "provider", None) or target_provider or current_provider
@@ -678,20 +786,30 @@ class HermesACPAgent(acp.Agent):
"""Switch the model for a session (called by ACP protocol)."""
state = self.session_manager.get_session(session_id)
if state:
state.model = model_id
current_provider = getattr(state.agent, "provider", None)
current_base_url = getattr(state.agent, "base_url", None)
current_api_mode = getattr(state.agent, "api_mode", None)
requested_provider, resolved_model = self._resolve_model_selection(
model_id,
current_provider or "openrouter",
)
state.model = resolved_model
provider_changed = bool(current_provider and requested_provider != current_provider)
current_base_url = None if provider_changed else getattr(state.agent, "base_url", None)
current_api_mode = None if provider_changed else getattr(state.agent, "api_mode", None)
state.agent = self.session_manager._make_agent(
session_id=session_id,
cwd=state.cwd,
model=model_id,
requested_provider=current_provider,
model=resolved_model,
requested_provider=requested_provider,
base_url=current_base_url,
api_mode=current_api_mode,
)
self.session_manager.save_session(session_id)
logger.info("Session %s: model switched to %s", session_id, model_id)
logger.info(
"Session %s: model switched to %s via provider %s",
session_id,
resolved_model,
requested_provider,
)
return SetSessionModelResponse()
logger.warning("Session %s: model switch requested for missing session", session_id)
return None
+127 -34
View File
@@ -13,8 +13,12 @@ from hermes_constants import get_hermes_home
import copy
import json
import logging
import os
import re
import sys
import time
import uuid
from datetime import datetime, timezone
from dataclasses import dataclass, field
from threading import Lock
from typing import Any, Dict, List, Optional
@@ -22,6 +26,64 @@ from typing import Any, Dict, List, Optional
logger = logging.getLogger(__name__)
def _normalize_cwd_for_compare(cwd: str | None) -> str:
raw = str(cwd or ".").strip()
if not raw:
raw = "."
expanded = os.path.expanduser(raw)
# Normalize Windows drive paths into the equivalent WSL mount form so
# ACP history filters match the same workspace across Windows and WSL.
match = re.match(r"^([A-Za-z]):[\\/](.*)$", expanded)
if match:
drive = match.group(1).lower()
tail = match.group(2).replace("\\", "/")
expanded = f"/mnt/{drive}/{tail}"
elif re.match(r"^/mnt/[A-Za-z]/", expanded):
expanded = f"/mnt/{expanded[5].lower()}/{expanded[7:]}"
return os.path.normpath(expanded)
def _build_session_title(title: Any, preview: Any, cwd: str | None) -> str:
explicit = str(title or "").strip()
if explicit:
return explicit
preview_text = str(preview or "").strip()
if preview_text:
return preview_text
leaf = os.path.basename(str(cwd or "").rstrip("/\\"))
return leaf or "New thread"
def _format_updated_at(value: Any) -> str | None:
if value is None:
return None
if isinstance(value, str) and value.strip():
return value
try:
return datetime.fromtimestamp(float(value), tz=timezone.utc).isoformat()
except Exception:
return None
def _updated_at_sort_key(value: Any) -> float:
if value is None:
return float("-inf")
if isinstance(value, (int, float)):
return float(value)
raw = str(value).strip()
if not raw:
return float("-inf")
try:
return datetime.fromisoformat(raw.replace("Z", "+00:00")).timestamp()
except Exception:
try:
return float(raw)
except Exception:
return float("-inf")
def _acp_stderr_print(*args, **kwargs) -> None:
"""Best-effort human-readable output sink for ACP stdio sessions.
@@ -162,47 +224,78 @@ class SessionManager:
logger.info("Forked ACP session %s -> %s", session_id, new_id)
return state
def list_sessions(self) -> List[Dict[str, Any]]:
def list_sessions(self, cwd: str | None = None) -> List[Dict[str, Any]]:
"""Return lightweight info dicts for all sessions (memory + database)."""
normalized_cwd = _normalize_cwd_for_compare(cwd) if cwd else None
db = self._get_db()
persisted_rows: dict[str, dict[str, Any]] = {}
if db is not None:
try:
for row in db.list_sessions_rich(source="acp", limit=1000):
persisted_rows[str(row["id"])] = dict(row)
except Exception:
logger.debug("Failed to load ACP sessions from DB", exc_info=True)
# Collect in-memory sessions first.
with self._lock:
seen_ids = set(self._sessions.keys())
results = [
{
"session_id": s.session_id,
"cwd": s.cwd,
"model": s.model,
"history_len": len(s.history),
}
for s in self._sessions.values()
]
results = []
for s in self._sessions.values():
history_len = len(s.history)
if history_len <= 0:
continue
if normalized_cwd and _normalize_cwd_for_compare(s.cwd) != normalized_cwd:
continue
persisted = persisted_rows.get(s.session_id, {})
preview = next(
(
str(msg.get("content") or "").strip()
for msg in s.history
if msg.get("role") == "user" and str(msg.get("content") or "").strip()
),
persisted.get("preview") or "",
)
results.append(
{
"session_id": s.session_id,
"cwd": s.cwd,
"model": s.model,
"history_len": history_len,
"title": _build_session_title(persisted.get("title"), preview, s.cwd),
"updated_at": _format_updated_at(
persisted.get("last_active") or persisted.get("started_at") or time.time()
),
}
)
# Merge any persisted sessions not currently in memory.
db = self._get_db()
if db is not None:
try:
rows = db.search_sessions(source="acp", limit=1000)
for row in rows:
sid = row["id"]
if sid in seen_ids:
continue
# Extract cwd from model_config JSON.
cwd = "."
mc = row.get("model_config")
if mc:
try:
cwd = json.loads(mc).get("cwd", ".")
except (json.JSONDecodeError, TypeError):
pass
results.append({
"session_id": sid,
"cwd": cwd,
"model": row.get("model") or "",
"history_len": row.get("message_count") or 0,
})
except Exception:
logger.debug("Failed to list ACP sessions from DB", exc_info=True)
for sid, row in persisted_rows.items():
if sid in seen_ids:
continue
message_count = int(row.get("message_count") or 0)
if message_count <= 0:
continue
# Extract cwd from model_config JSON.
session_cwd = "."
mc = row.get("model_config")
if mc:
try:
session_cwd = json.loads(mc).get("cwd", ".")
except (json.JSONDecodeError, TypeError):
pass
if normalized_cwd and _normalize_cwd_for_compare(session_cwd) != normalized_cwd:
continue
results.append({
"session_id": sid,
"cwd": session_cwd,
"model": row.get("model") or "",
"history_len": message_count,
"title": _build_session_title(row.get("title"), row.get("preview"), session_cwd),
"updated_at": _format_updated_at(row.get("last_active") or row.get("started_at")),
})
results.sort(key=lambda item: _updated_at_sort_key(item.get("updated_at")), reverse=True)
return results
def update_cwd(self, session_id: str, cwd: str) -> Optional[SessionState]:
+174 -9
View File
@@ -2,6 +2,7 @@
from __future__ import annotations
import json
import uuid
from typing import Any, Dict, List, Optional
@@ -96,6 +97,170 @@ def build_tool_title(tool_name: str, args: Dict[str, Any]) -> str:
return tool_name
def _build_patch_mode_content(patch_text: str) -> List[Any]:
"""Parse V4A patch mode input into ACP diff blocks when possible."""
if not patch_text:
return [acp.tool_content(acp.text_block(""))]
try:
from tools.patch_parser import OperationType, parse_v4a_patch
operations, error = parse_v4a_patch(patch_text)
if error or not operations:
return [acp.tool_content(acp.text_block(patch_text))]
content: List[Any] = []
for op in operations:
if op.operation == OperationType.UPDATE:
old_chunks: list[str] = []
new_chunks: list[str] = []
for hunk in op.hunks:
old_lines = [line.content for line in hunk.lines if line.prefix in (" ", "-")]
new_lines = [line.content for line in hunk.lines if line.prefix in (" ", "+")]
if old_lines or new_lines:
old_chunks.append("\n".join(old_lines))
new_chunks.append("\n".join(new_lines))
old_text = "\n...\n".join(chunk for chunk in old_chunks if chunk)
new_text = "\n...\n".join(chunk for chunk in new_chunks if chunk)
if old_text or new_text:
content.append(
acp.tool_diff_content(
path=op.file_path,
old_text=old_text or None,
new_text=new_text or "",
)
)
continue
if op.operation == OperationType.ADD:
added_lines = [line.content for hunk in op.hunks for line in hunk.lines if line.prefix == "+"]
content.append(
acp.tool_diff_content(
path=op.file_path,
new_text="\n".join(added_lines),
)
)
continue
if op.operation == OperationType.DELETE:
content.append(
acp.tool_diff_content(
path=op.file_path,
old_text=f"Delete file: {op.file_path}",
new_text="",
)
)
continue
if op.operation == OperationType.MOVE:
content.append(
acp.tool_content(acp.text_block(f"Move file: {op.file_path} -> {op.new_path}"))
)
return content or [acp.tool_content(acp.text_block(patch_text))]
except Exception:
return [acp.tool_content(acp.text_block(patch_text))]
def _strip_diff_prefix(path: str) -> str:
raw = str(path or "").strip()
if raw.startswith(("a/", "b/")):
return raw[2:]
return raw
def _parse_unified_diff_content(diff_text: str) -> List[Any]:
"""Convert unified diff text into ACP diff content blocks."""
if not diff_text:
return []
content: List[Any] = []
current_old_path: Optional[str] = None
current_new_path: Optional[str] = None
old_lines: list[str] = []
new_lines: list[str] = []
def _flush() -> None:
nonlocal current_old_path, current_new_path, old_lines, new_lines
if current_old_path is None and current_new_path is None:
return
path = current_new_path if current_new_path and current_new_path != "/dev/null" else current_old_path
if not path or path == "/dev/null":
current_old_path = None
current_new_path = None
old_lines = []
new_lines = []
return
content.append(
acp.tool_diff_content(
path=_strip_diff_prefix(path),
old_text="\n".join(old_lines) if old_lines else None,
new_text="\n".join(new_lines),
)
)
current_old_path = None
current_new_path = None
old_lines = []
new_lines = []
for line in diff_text.splitlines():
if line.startswith("--- "):
_flush()
current_old_path = line[4:].strip()
continue
if line.startswith("+++ "):
current_new_path = line[4:].strip()
continue
if line.startswith("@@"):
continue
if current_old_path is None and current_new_path is None:
continue
if line.startswith("+"):
new_lines.append(line[1:])
elif line.startswith("-"):
old_lines.append(line[1:])
elif line.startswith(" "):
shared = line[1:]
old_lines.append(shared)
new_lines.append(shared)
_flush()
return content
def _build_tool_complete_content(
tool_name: str,
result: Optional[str],
*,
function_args: Optional[Dict[str, Any]] = None,
snapshot: Any = None,
) -> List[Any]:
"""Build structured ACP completion content, falling back to plain text."""
display_result = result or ""
if len(display_result) > 5000:
display_result = display_result[:4900] + f"\n... ({len(result)} chars total, truncated)"
if tool_name in {"write_file", "patch", "skill_manage"}:
try:
from agent.display import extract_edit_diff
diff_text = extract_edit_diff(
tool_name,
result,
function_args=function_args,
snapshot=snapshot,
)
if isinstance(diff_text, str) and diff_text.strip():
diff_content = _parse_unified_diff_content(diff_text)
if diff_content:
return diff_content
except Exception:
pass
return [acp.tool_content(acp.text_block(display_result))]
# ---------------------------------------------------------------------------
# Build ACP content objects for tool-call events
# ---------------------------------------------------------------------------
@@ -119,9 +284,8 @@ def build_tool_start(
new = arguments.get("new_string", "")
content = [acp.tool_diff_content(path=path, new_text=new, old_text=old)]
else:
# Patch mode — show the patch content as text
patch_text = arguments.get("patch", "")
content = [acp.tool_content(acp.text_block(patch_text))]
content = _build_patch_mode_content(patch_text)
return acp.start_tool_call(
tool_call_id, title, kind=kind, content=content, locations=locations,
raw_input=arguments,
@@ -178,16 +342,17 @@ def build_tool_complete(
tool_call_id: str,
tool_name: str,
result: Optional[str] = None,
function_args: Optional[Dict[str, Any]] = None,
snapshot: Any = None,
) -> ToolCallProgress:
"""Create a ToolCallUpdate (progress) event for a completed tool call."""
kind = get_tool_kind(tool_name)
# Truncate very large results for the UI
display_result = result or ""
if len(display_result) > 5000:
display_result = display_result[:4900] + f"\n... ({len(result)} chars total, truncated)"
content = [acp.tool_content(acp.text_block(display_result))]
content = _build_tool_complete_content(
tool_name,
result,
function_args=function_args,
snapshot=snapshot,
)
return acp.update_tool_call(
tool_call_id,
kind=kind,
+88 -33
View File
@@ -94,6 +94,54 @@ def _normalize_aux_provider(provider: Optional[str]) -> str:
return "custom"
return _PROVIDER_ALIASES.get(normalized, normalized)
_FIXED_TEMPERATURE_MODELS: Dict[str, float] = {
"kimi-for-coding": 0.6,
}
# Moonshot's kimi-for-coding endpoint (api.kimi.com/coding) documents:
# "k2.5 model will use a fixed value 1.0, non-thinking mode will use a fixed
# value 0.6. Any other value will result in an error." The same lock applies
# to the other k2.* models served on that endpoint. Enumerated explicitly so
# non-coding siblings like `kimi-k2-instruct` (variable temperature, served on
# the standard chat API and third parties) are NOT clamped.
# Source: https://platform.kimi.ai/docs/guide/kimi-k2-5-quickstart
_KIMI_INSTANT_MODELS: frozenset = frozenset({
"kimi-k2.5",
"kimi-k2-turbo-preview",
"kimi-k2-0905-preview",
})
_KIMI_THINKING_MODELS: frozenset = frozenset({
"kimi-k2-thinking",
"kimi-k2-thinking-turbo",
})
def _fixed_temperature_for_model(model: Optional[str]) -> Optional[float]:
"""Return a required temperature override for models with strict contracts.
Moonshot's kimi-for-coding endpoint rejects any non-approved temperature on
the k2.5 family. Non-thinking variants require exactly 0.6; thinking
variants require 1.0. An optional ``vendor/`` prefix (e.g.
``moonshotai/kimi-k2.5``) is tolerated for aggregator routings.
Returns ``None`` for every other model, including ``kimi-k2-instruct*``
which is the separate non-coding K2 family with variable temperature.
"""
normalized = (model or "").strip().lower()
fixed = _FIXED_TEMPERATURE_MODELS.get(normalized)
if fixed is not None:
logger.debug("Forcing temperature=%s for model %r (fixed map)", fixed, model)
return fixed
bare = normalized.rsplit("/", 1)[-1]
if bare in _KIMI_THINKING_MODELS:
logger.debug("Forcing temperature=1.0 for kimi thinking model %r", model)
return 1.0
if bare in _KIMI_INSTANT_MODELS:
logger.debug("Forcing temperature=0.6 for kimi instant model %r", model)
return 0.6
return None
# Default auxiliary models for direct API-key providers (cheap/fast for side tasks)
_API_KEY_PROVIDER_AUX_MODELS: Dict[str, str] = {
"gemini": "gemini-3-flash-preview",
@@ -1064,8 +1112,6 @@ _AUTO_PROVIDER_LABELS = {
"_resolve_api_key_provider": "api-key",
}
_AGGREGATOR_PROVIDERS = frozenset({"openrouter", "nous"})
_MAIN_RUNTIME_FIELDS = ("provider", "model", "base_url", "api_key", "api_mode")
@@ -1196,11 +1242,15 @@ def _resolve_auto(main_runtime: Optional[Dict[str, Any]] = None) -> Tuple[Option
"""Full auto-detection chain.
Priority:
1. If the user's main provider is NOT an aggregator (OpenRouter / Nous),
use their main provider + main model directly. This ensures users on
Alibaba, DeepSeek, ZAI, etc. get auxiliary tasks handled by the same
provider they already have credentials for — no OpenRouter key needed.
2. OpenRouter → Nous → custom → Codex → API-key providers (original chain).
1. User's main provider + main model, regardless of provider type.
This means auxiliary tasks (compression, vision, web extraction,
session search, etc.) use the same model the user configured for
chat. Users on OpenRouter/Nous get their chosen chat model; users
on DeepSeek/ZAI/Alibaba get theirs; etc. Running aux tasks on the
user's picked model keeps behavior predictable — no surprise
switches to a cheap fallback model for side tasks.
2. OpenRouter → Nous → custom → Codex → API-key providers (fallback
chain, only used when the main provider has no working client).
"""
global auxiliary_is_nous, _stale_base_url_warned
auxiliary_is_nous = False # Reset — _try_nous() will set True if it wins
@@ -1230,11 +1280,16 @@ def _resolve_auto(main_runtime: Optional[Dict[str, Any]] = None) -> Tuple[Option
)
_stale_base_url_warned = True
# ── Step 1: non-aggregator main provider → use main model directly ──
# ── Step 1: main provider + main model → use them directly ──
#
# This is the primary aux backend for every user. "auto" means
# "use my main chat model for side tasks as well" — including users
# on aggregators (OpenRouter, Nous) who previously got routed to a
# cheap provider-side default. Explicit per-task overrides set via
# config.yaml (auxiliary.<task>.provider) still win over this.
main_provider = runtime_provider or _read_main_provider()
main_model = runtime_model or _read_main_model()
if (main_provider and main_model
and main_provider not in _AGGREGATOR_PROVIDERS
and main_provider not in ("auto", "")):
resolved_provider = main_provider
explicit_base_url = None
@@ -1593,7 +1648,6 @@ def resolve_provider_client(
from hermes_cli.models import copilot_default_headers
headers.update(copilot_default_headers())
client = OpenAI(api_key=api_key, base_url=base_url,
**({"default_headers": headers} if headers else {}))
@@ -1817,34 +1871,31 @@ def resolve_vision_provider_client(
if requested == "auto":
# Vision auto-detection order:
# 1. Active provider + model (user's main chat config)
# 2. OpenRouter (known vision-capable default model)
# 3. Nous Portal (known vision-capable default model)
# 1. User's main provider + main model (including aggregators).
# _PROVIDER_VISION_MODELS provides per-provider vision model
# overrides when the provider has a dedicated multimodal model
# that differs from the chat model (e.g. xiaomi → mimo-v2-omni,
# zai → glm-5v-turbo).
# 2. OpenRouter (vision-capable aggregator fallback)
# 3. Nous Portal (vision-capable aggregator fallback)
# 4. Stop
main_provider = _read_main_provider()
main_model = _read_main_model()
if main_provider and main_provider not in ("auto", ""):
if main_provider in _VISION_AUTO_PROVIDER_ORDER:
# Known strict backend — use its defaults.
sync_client, default_model = _resolve_strict_vision_backend(main_provider)
if sync_client is not None:
return _finalize(main_provider, sync_client, default_model)
else:
# Exotic provider (DeepSeek, Alibaba, Xiaomi, named custom, etc.)
# Use provider-specific vision model if available, otherwise main model.
vision_model = _PROVIDER_VISION_MODELS.get(main_provider, main_model)
rpc_client, rpc_model = resolve_provider_client(
main_provider, vision_model,
api_mode=resolved_api_mode)
if rpc_client is not None:
logger.info(
"Vision auto-detect: using active provider %s (%s)",
main_provider, rpc_model or vision_model,
)
return _finalize(
main_provider, rpc_client, rpc_model or vision_model)
vision_model = _PROVIDER_VISION_MODELS.get(main_provider, main_model)
rpc_client, rpc_model = resolve_provider_client(
main_provider, vision_model,
api_mode=resolved_api_mode)
if rpc_client is not None:
logger.info(
"Vision auto-detect: using main provider %s (%s)",
main_provider, rpc_model or vision_model,
)
return _finalize(
main_provider, rpc_client, rpc_model or vision_model)
# Fall back through aggregators.
# Fall back through aggregators (uses their dedicated vision model,
# not the user's main model) when main provider has no client.
for candidate in _VISION_AUTO_PROVIDER_ORDER:
if candidate == main_provider:
continue # already tried above
@@ -2293,6 +2344,10 @@ def _build_call_kwargs(
"timeout": timeout,
}
fixed_temperature = _fixed_temperature_for_model(model)
if fixed_temperature is not None:
temperature = fixed_temperature
# Opus 4.7+ rejects any non-default temperature/top_p/top_k — silently
# drop here so auxiliary callers that hardcode temperature (e.g. 0.3 on
# flush_memories, 0 on structured-JSON extraction) don't 400 the moment
+55 -2
View File
@@ -63,6 +63,52 @@ _CHARS_PER_TOKEN = 4
_SUMMARY_FAILURE_COOLDOWN_SECONDS = 600
def _truncate_tool_call_args_json(args: str, head_chars: int = 200) -> str:
"""Shrink long string values inside a tool-call arguments JSON blob while
preserving JSON validity.
The ``function.arguments`` field on a tool call is a JSON-encoded string
passed through to the LLM provider; downstream providers strictly
validate it and return a non-retryable 400 when it is not well-formed.
An earlier implementation sliced the raw JSON at a fixed byte offset and
appended ``...[truncated]`` — which routinely produced strings like::
{"path": "/foo/bar", "content": "# long markdown
...[truncated]
i.e. an unterminated string and a missing closing brace. MiniMax, for
example, rejects this with ``invalid function arguments json string``
and the session gets stuck re-sending the same broken history on every
turn. See issue #11762 for the observed loop.
This helper parses the arguments, shrinks long string leaves inside the
parsed structure, and re-serialises. Non-string values (paths, ints,
booleans) are preserved intact. If the arguments are not valid JSON
to begin with — some model backends use non-JSON tool arguments — the
original string is returned unchanged rather than replaced with
something neither we nor the backend can parse.
"""
try:
parsed = json.loads(args)
except (ValueError, TypeError):
return args
def _shrink(obj: Any) -> Any:
if isinstance(obj, str):
if len(obj) > head_chars:
return obj[:head_chars] + "...[truncated]"
return obj
if isinstance(obj, dict):
return {k: _shrink(v) for k, v in obj.items()}
if isinstance(obj, list):
return [_shrink(v) for v in obj]
return obj
shrunken = _shrink(parsed)
# ensure_ascii=False preserves CJK/emoji instead of bloating with \uXXXX
return json.dumps(shrunken, ensure_ascii=False)
def _summarize_tool_result(tool_name: str, tool_args: str, tool_content: str) -> str:
"""Create an informative 1-line summary of a tool call + result.
@@ -449,6 +495,11 @@ class ContextCompressor(ContextEngine):
# Pass 3: Truncate large tool_call arguments in assistant messages
# outside the protected tail. write_file with 50KB content, for
# example, survives pruning entirely without this.
#
# The shrinking is done inside the parsed JSON structure so the
# result remains valid JSON — otherwise downstream providers 400
# on every subsequent turn until the broken call falls out of
# the window. See ``_truncate_tool_call_args_json`` docstring.
for i in range(prune_boundary):
msg = result[i]
if msg.get("role") != "assistant" or not msg.get("tool_calls"):
@@ -459,8 +510,10 @@ class ContextCompressor(ContextEngine):
if isinstance(tc, dict):
args = tc.get("function", {}).get("arguments", "")
if len(args) > 500:
tc = {**tc, "function": {**tc["function"], "arguments": args[:200] + "...[truncated]"}}
modified = True
new_args = _truncate_tool_call_args_json(args)
if new_args != args:
tc = {**tc, "function": {**tc["function"], "arguments": new_args}}
modified = True
new_tcs.append(tc)
if modified:
result[i] = {**msg, "tool_calls": new_tcs}
+28 -120
View File
@@ -22,8 +22,6 @@ from hermes_cli.auth import (
_auth_store_lock,
_codex_access_token_is_expiring,
_decode_jwt_claims,
_import_codex_cli_tokens,
_write_codex_cli_tokens,
_load_auth_store,
_load_provider_state,
_resolve_kimi_base_url,
@@ -457,39 +455,6 @@ class CredentialPool:
logger.debug("Failed to sync from credentials file: %s", exc)
return entry
def _sync_codex_entry_from_cli(self, entry: PooledCredential) -> PooledCredential:
"""Sync an openai-codex pool entry from ~/.codex/auth.json if tokens differ.
OpenAI OAuth refresh tokens are single-use and rotate on every refresh.
When the Codex CLI (or another Hermes profile) refreshes its token,
the pool entry's refresh_token becomes stale. This method detects that
by comparing against ~/.codex/auth.json and syncing the fresh pair.
"""
if self.provider != "openai-codex":
return entry
try:
cli_tokens = _import_codex_cli_tokens()
if not cli_tokens:
return entry
cli_refresh = cli_tokens.get("refresh_token", "")
cli_access = cli_tokens.get("access_token", "")
if cli_refresh and cli_refresh != entry.refresh_token:
logger.debug("Pool entry %s: syncing tokens from ~/.codex/auth.json (refresh token changed)", entry.id)
updated = replace(
entry,
access_token=cli_access,
refresh_token=cli_refresh,
last_status=None,
last_status_at=None,
last_error_code=None,
)
self._replace_entry(entry, updated)
self._persist()
return updated
except Exception as exc:
logger.debug("Failed to sync from ~/.codex/auth.json: %s", exc)
return entry
def _sync_device_code_entry_to_auth_store(self, entry: PooledCredential) -> None:
"""Write refreshed pool entry tokens back to auth.json providers.
@@ -585,13 +550,6 @@ class CredentialPool:
except Exception as wexc:
logger.debug("Failed to write refreshed token to credentials file: %s", wexc)
elif self.provider == "openai-codex":
# Proactively sync from ~/.codex/auth.json before refresh.
# The Codex CLI (or another Hermes profile) may have already
# consumed our refresh_token. Syncing first avoids a
# "refresh_token_reused" error when the CLI has a newer pair.
synced = self._sync_codex_entry_from_cli(entry)
if synced is not entry:
entry = synced
refreshed = auth_mod.refresh_codex_oauth_pure(
entry.access_token,
entry.refresh_token,
@@ -677,45 +635,6 @@ class CredentialPool:
# Credentials file had a valid (non-expired) token — use it directly
logger.debug("Credentials file has valid token, using without refresh")
return synced
# For openai-codex: the refresh_token may have been consumed by
# the Codex CLI between our proactive sync and the refresh call.
# Re-sync and retry once.
if self.provider == "openai-codex":
synced = self._sync_codex_entry_from_cli(entry)
if synced.refresh_token != entry.refresh_token:
logger.debug("Retrying Codex refresh with synced token from ~/.codex/auth.json")
try:
refreshed = auth_mod.refresh_codex_oauth_pure(
synced.access_token,
synced.refresh_token,
)
updated = replace(
synced,
access_token=refreshed["access_token"],
refresh_token=refreshed["refresh_token"],
last_refresh=refreshed.get("last_refresh"),
last_status=STATUS_OK,
last_status_at=None,
last_error_code=None,
)
self._replace_entry(synced, updated)
self._persist()
self._sync_device_code_entry_to_auth_store(updated)
try:
_write_codex_cli_tokens(
updated.access_token,
updated.refresh_token,
last_refresh=updated.last_refresh,
)
except Exception as wexc:
logger.debug("Failed to write refreshed Codex tokens to CLI file (retry): %s", wexc)
return updated
except Exception as retry_exc:
logger.debug("Codex retry refresh also failed: %s", retry_exc)
elif not self._entry_needs_refresh(synced):
logger.debug("Codex CLI has valid token, using without refresh")
self._sync_device_code_entry_to_auth_store(synced)
return synced
self._mark_exhausted(entry, None)
return None
@@ -734,17 +653,6 @@ class CredentialPool:
# _seed_from_singletons() on the next load_pool() sees fresh state
# instead of re-seeding stale/consumed tokens.
self._sync_device_code_entry_to_auth_store(updated)
# Write refreshed tokens back to ~/.codex/auth.json so Codex CLI
# and VS Code don't hit "refresh_token_reused" on their next refresh.
if self.provider == "openai-codex":
try:
_write_codex_cli_tokens(
updated.access_token,
updated.refresh_token,
last_refresh=updated.last_refresh,
)
except Exception as wexc:
logger.debug("Failed to write refreshed Codex tokens to CLI file: %s", wexc)
return updated
def _entry_needs_refresh(self, entry: PooledCredential) -> bool:
@@ -790,16 +698,6 @@ class CredentialPool:
if synced is not entry:
entry = synced
cleared_any = True
# For openai-codex entries, sync from ~/.codex/auth.json before
# any status/refresh checks. This picks up tokens refreshed by
# the Codex CLI or another Hermes profile.
if (self.provider == "openai-codex"
and entry.last_status == STATUS_EXHAUSTED
and entry.refresh_token):
synced = self._sync_codex_entry_from_cli(entry)
if synced is not entry:
entry = synced
cleared_any = True
if entry.last_status == STATUS_EXHAUSTED:
exhausted_until = _exhausted_until(entry)
if exhausted_until is not None and now < exhausted_until:
@@ -1130,6 +1028,14 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
state = _load_provider_state(auth_store, "nous")
if state:
active_sources.add("device_code")
# Prefer a user-supplied label embedded in the singleton state
# (set by persist_nous_credentials(label=...) when the user ran
# `hermes auth add nous --label <name>`). Fall back to the
# auto-derived token fingerprint for logins that didn't supply one.
custom_label = str(state.get("label") or "").strip()
seeded_label = custom_label or label_from_token(
state.get("access_token", ""), "device_code"
)
changed |= _upsert_entry(
entries,
provider,
@@ -1148,7 +1054,7 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
"agent_key": state.get("agent_key"),
"agent_key_expires_at": state.get("agent_key_expires_at"),
"tls": state.get("tls") if isinstance(state.get("tls"), dict) else None,
"label": label_from_token(state.get("access_token", ""), "device_code"),
"label": seeded_label,
},
)
@@ -1208,25 +1114,27 @@ def _seed_from_singletons(provider: str, entries: List[PooledCredential]) -> Tup
logger.debug("Qwen OAuth token seed failed: %s", exc)
elif provider == "openai-codex":
# Respect user suppression — `hermes auth remove openai-codex` marks
# the device_code source as suppressed so it won't be re-seeded from
# the Hermes auth store. Without this gate the removal is instantly
# undone on the next load_pool() call.
codex_suppressed = False
try:
from hermes_cli.auth import is_source_suppressed
codex_suppressed = is_source_suppressed(provider, "device_code")
except ImportError:
pass
if codex_suppressed:
return changed, active_sources
state = _load_provider_state(auth_store, "openai-codex")
tokens = state.get("tokens") if isinstance(state, dict) else None
# Fallback: import from Codex CLI (~/.codex/auth.json) if Hermes auth
# store has no tokens. This mirrors resolve_codex_runtime_credentials()
# so that load_pool() and list_authenticated_providers() detect tokens
# that only exist in the Codex CLI shared file.
if not (isinstance(tokens, dict) and tokens.get("access_token")):
try:
from hermes_cli.auth import _import_codex_cli_tokens, _save_codex_tokens
cli_tokens = _import_codex_cli_tokens()
if cli_tokens:
logger.info("Importing Codex CLI tokens into Hermes auth store.")
_save_codex_tokens(cli_tokens)
# Re-read state after import
auth_store = _load_auth_store()
state = _load_provider_state(auth_store, "openai-codex")
tokens = state.get("tokens") if isinstance(state, dict) else None
except Exception as exc:
logger.debug("Codex CLI token import failed: %s", exc)
# Hermes owns its own Codex auth state — we do NOT auto-import from
# ~/.codex/auth.json at pool-load time. OAuth refresh tokens are
# single-use, so sharing them with Codex CLI / VS Code causes
# refresh_token_reused race failures. Users who want to adopt
# existing Codex CLI credentials get a one-time, explicit prompt
# via `hermes auth openai-codex`.
if isinstance(tokens, dict) and tokens.get("access_token"):
active_sources.add("device_code")
changed |= _upsert_entry(
+135 -4
View File
@@ -747,18 +747,149 @@ class GeminiCloudCodeClient:
def _gemini_http_error(response: httpx.Response) -> CodeAssistError:
"""Translate an httpx response into a CodeAssistError with rich metadata.
Parses Google's error envelope (``{"error": {"code", "message", "status",
"details": [...]}}``) so the agent's error classifier can reason about
the failure — ``status_code`` enables the rate_limit / auth classification
paths, and ``response`` lets the main loop honor ``Retry-After`` just
like it does for OpenAI SDK exceptions.
Also lifts a few recognizable Google conditions into human-readable
messages so the user sees something better than a 500-char JSON dump:
MODEL_CAPACITY_EXHAUSTED → "Gemini model capacity exhausted for
<model>. This is a Google-side throttle..."
RESOURCE_EXHAUSTED w/o reason → quota-style message
404 → "Model <name> not found at cloudcode-pa..."
"""
status = response.status_code
# Parse the body once, surviving any weird encodings.
body_text = ""
body_json: Dict[str, Any] = {}
try:
body = response.text[:500]
body_text = response.text
except Exception:
body = ""
# Let run_agent's retry logic see auth errors as rotatable via `api_key`
body_text = ""
if body_text:
try:
parsed = json.loads(body_text)
if isinstance(parsed, dict):
body_json = parsed
except (ValueError, TypeError):
body_json = {}
# Dig into Google's error envelope. Shape is:
# {"error": {"code": 429, "message": "...", "status": "RESOURCE_EXHAUSTED",
# "details": [{"@type": ".../ErrorInfo", "reason": "MODEL_CAPACITY_EXHAUSTED",
# "metadata": {...}},
# {"@type": ".../RetryInfo", "retryDelay": "30s"}]}}
err_obj = body_json.get("error") if isinstance(body_json, dict) else None
if not isinstance(err_obj, dict):
err_obj = {}
err_status = str(err_obj.get("status") or "").strip()
err_message = str(err_obj.get("message") or "").strip()
err_details_list = err_obj.get("details") if isinstance(err_obj.get("details"), list) else []
# Extract google.rpc.ErrorInfo reason + metadata. There may be more
# than one ErrorInfo (rare), so we pick the first one with a reason.
error_reason = ""
error_metadata: Dict[str, Any] = {}
retry_delay_seconds: Optional[float] = None
for detail in err_details_list:
if not isinstance(detail, dict):
continue
type_url = str(detail.get("@type") or "")
if not error_reason and type_url.endswith("/google.rpc.ErrorInfo"):
reason = detail.get("reason")
if isinstance(reason, str) and reason:
error_reason = reason
md = detail.get("metadata")
if isinstance(md, dict):
error_metadata = md
elif retry_delay_seconds is None and type_url.endswith("/google.rpc.RetryInfo"):
# retryDelay is a google.protobuf.Duration string like "30s" or "1.5s".
delay_raw = detail.get("retryDelay")
if isinstance(delay_raw, str) and delay_raw.endswith("s"):
try:
retry_delay_seconds = float(delay_raw[:-1])
except ValueError:
pass
elif isinstance(delay_raw, (int, float)):
retry_delay_seconds = float(delay_raw)
# Fall back to the Retry-After header if the body didn't include RetryInfo.
if retry_delay_seconds is None:
try:
header_val = response.headers.get("Retry-After") or response.headers.get("retry-after")
except Exception:
header_val = None
if header_val:
try:
retry_delay_seconds = float(header_val)
except (TypeError, ValueError):
retry_delay_seconds = None
# Classify the error code. ``code_assist_rate_limited`` stays the default
# for 429s; a more specific reason tag helps downstream callers (e.g. tests,
# logs) without changing the rate_limit classification path.
code = f"code_assist_http_{status}"
if status == 401:
code = "code_assist_unauthorized"
elif status == 429:
code = "code_assist_rate_limited"
if error_reason == "MODEL_CAPACITY_EXHAUSTED":
code = "code_assist_capacity_exhausted"
# Build a human-readable message. Keep the status + a raw-body tail for
# debugging, but lead with a friendlier summary when we recognize the
# Google signal.
model_hint = ""
if isinstance(error_metadata, dict):
model_hint = str(error_metadata.get("model") or error_metadata.get("modelId") or "").strip()
if status == 429 and error_reason == "MODEL_CAPACITY_EXHAUSTED":
target = model_hint or "this Gemini model"
message = (
f"Gemini capacity exhausted for {target} (Google-side throttle, "
f"not a Hermes issue). Try a different Gemini model or set a "
f"fallback_providers entry to a non-Gemini provider."
)
if retry_delay_seconds is not None:
message += f" Google suggests retrying in {retry_delay_seconds:g}s."
elif status == 429 and err_status == "RESOURCE_EXHAUSTED":
message = (
f"Gemini quota exhausted ({err_message or 'RESOURCE_EXHAUSTED'}). "
f"Check /gquota for remaining daily requests."
)
if retry_delay_seconds is not None:
message += f" Retry suggested in {retry_delay_seconds:g}s."
elif status == 404:
# Google returns 404 when a model has been retired or renamed.
target = model_hint or (err_message or "model")
message = (
f"Code Assist 404: {target} is not available at "
f"cloudcode-pa.googleapis.com. It may have been renamed or "
f"retired. Check hermes_cli/models.py for the current list."
)
elif err_message:
# Generic fallback with the parsed message.
message = f"Code Assist HTTP {status} ({err_status or 'error'}): {err_message}"
else:
# Last-ditch fallback — raw body snippet.
message = f"Code Assist returned HTTP {status}: {body_text[:500]}"
return CodeAssistError(
f"Code Assist returned HTTP {status}: {body}",
message,
code=code,
status_code=status,
response=response,
retry_after=retry_delay_seconds,
details={
"status": err_status,
"reason": error_reason,
"metadata": error_metadata,
"message": err_message,
},
)
+37 -1
View File
@@ -68,9 +68,45 @@ _ONBOARDING_POLL_INTERVAL_SECONDS = 5.0
class CodeAssistError(RuntimeError):
def __init__(self, message: str, *, code: str = "code_assist_error") -> None:
"""Exception raised by the Code Assist (``cloudcode-pa``) integration.
Carries HTTP status / response / retry-after metadata so the agent's
``error_classifier._extract_status_code`` and the main loop's Retry-After
handling (which walks ``error.response.headers``) pick up the right
signals. Without these, 429s from the OAuth path look like opaque
``RuntimeError`` and skip the rate-limit path.
"""
def __init__(
self,
message: str,
*,
code: str = "code_assist_error",
status_code: Optional[int] = None,
response: Any = None,
retry_after: Optional[float] = None,
details: Optional[Dict[str, Any]] = None,
) -> None:
super().__init__(message)
self.code = code
# ``status_code`` is picked up by ``agent.error_classifier._extract_status_code``
# so a 429 from Code Assist classifies as FailoverReason.rate_limit and
# triggers the main loop's fallback_providers chain the same way SDK
# errors do.
self.status_code = status_code
# ``response`` is the underlying ``httpx.Response`` (or a shim with a
# ``.headers`` mapping and ``.json()`` method). The main loop reads
# ``error.response.headers["Retry-After"]`` to honor Google's retry
# hints when the backend throttles us.
self.response = response
# Parsed ``Retry-After`` seconds (kept separately for convenience —
# Google returns retry hints in both the header and the error body's
# ``google.rpc.RetryInfo`` details, and we pick whichever we found).
self.retry_after = retry_after
# Parsed structured error details from the Google error envelope
# (e.g. ``{"reason": "MODEL_CAPACITY_EXHAUSTED", "status": "RESOURCE_EXHAUSTED"}``).
# Useful for logging and for tests that want to assert on specifics.
self.details = details or {}
class ProjectIdRequiredError(CodeAssistError):
+5 -26
View File
@@ -634,13 +634,7 @@ class InsightsEngine:
lines.append(f" Sessions: {o['total_sessions']:<12} Messages: {o['total_messages']:,}")
lines.append(f" Tool calls: {o['total_tool_calls']:<12,} User messages: {o['user_messages']:,}")
lines.append(f" Input tokens: {o['total_input_tokens']:<12,} Output tokens: {o['total_output_tokens']:,}")
cache_total = o.get("total_cache_read_tokens", 0) + o.get("total_cache_write_tokens", 0)
if cache_total > 0:
lines.append(f" Cache read: {o['total_cache_read_tokens']:<12,} Cache write: {o['total_cache_write_tokens']:,}")
cost_str = f"${o['estimated_cost']:.2f}"
if o.get("models_without_pricing"):
cost_str += " *"
lines.append(f" Total tokens: {o['total_tokens']:<12,} Est. cost: {cost_str}")
lines.append(f" Total tokens: {o['total_tokens']:,}")
if o["total_hours"] > 0:
lines.append(f" Active time: ~{_format_duration(o['total_hours'] * 3600):<11} Avg session: ~{_format_duration(o['avg_session_duration'])}")
lines.append(f" Avg msgs/session: {o['avg_messages_per_session']:.1f}")
@@ -650,16 +644,10 @@ class InsightsEngine:
if report["models"]:
lines.append(" 🤖 Models Used")
lines.append(" " + "" * 56)
lines.append(f" {'Model':<30} {'Sessions':>8} {'Tokens':>12} {'Cost':>8}")
lines.append(f" {'Model':<30} {'Sessions':>8} {'Tokens':>12}")
for m in report["models"]:
model_name = m["model"][:28]
if m.get("has_pricing"):
cost_cell = f"${m['cost']:>6.2f}"
else:
cost_cell = " N/A"
lines.append(f" {model_name:<30} {m['sessions']:>8} {m['total_tokens']:>12,} {cost_cell}")
if o.get("models_without_pricing"):
lines.append(" * Cost N/A for custom/self-hosted models")
lines.append(f" {model_name:<30} {m['sessions']:>8} {m['total_tokens']:>12,}")
lines.append("")
# Platform breakdown
@@ -739,15 +727,7 @@ class InsightsEngine:
# Overview
lines.append(f"**Sessions:** {o['total_sessions']} | **Messages:** {o['total_messages']:,} | **Tool calls:** {o['total_tool_calls']:,}")
cache_total = o.get("total_cache_read_tokens", 0) + o.get("total_cache_write_tokens", 0)
if cache_total > 0:
lines.append(f"**Tokens:** {o['total_tokens']:,} (in: {o['total_input_tokens']:,} / out: {o['total_output_tokens']:,} / cache: {cache_total:,})")
else:
lines.append(f"**Tokens:** {o['total_tokens']:,} (in: {o['total_input_tokens']:,} / out: {o['total_output_tokens']:,})")
cost_note = ""
if o.get("models_without_pricing"):
cost_note = " _(excludes custom/self-hosted models)_"
lines.append(f"**Est. cost:** ${o['estimated_cost']:.2f}{cost_note}")
lines.append(f"**Tokens:** {o['total_tokens']:,} (in: {o['total_input_tokens']:,} / out: {o['total_output_tokens']:,})")
if o["total_hours"] > 0:
lines.append(f"**Active time:** ~{_format_duration(o['total_hours'] * 3600)} | **Avg session:** ~{_format_duration(o['avg_session_duration'])}")
lines.append("")
@@ -756,8 +736,7 @@ class InsightsEngine:
if report["models"]:
lines.append("**🤖 Models:**")
for m in report["models"][:5]:
cost_str = f"${m['cost']:.2f}" if m.get("has_pricing") else "N/A"
lines.append(f" {m['model'][:25]}{m['sessions']} sessions, {m['total_tokens']:,} tokens, {cost_str}")
lines.append(f" {m['model'][:25]}{m['sessions']} sessions, {m['total_tokens']:,} tokens")
lines.append("")
# Platforms (if multi-platform)
+4 -1
View File
@@ -38,6 +38,7 @@ _PROVIDER_PREFIXES: frozenset[str] = frozenset({
"mimo", "xiaomi-mimo",
"arcee-ai", "arceeai",
"xai", "x-ai", "x.ai", "grok",
"nvidia", "nim", "nvidia-nim", "nemotron",
"qwen-portal",
})
@@ -124,7 +125,6 @@ DEFAULT_CONTEXT_LENGTHS = {
"gemini": 1048576,
# Gemma (open models served via AI Studio)
"gemma-4-31b": 256000,
"gemma-4-26b": 256000,
"gemma-3": 131072,
"gemma": 8192, # fallback for older gemma models
# DeepSeek
@@ -158,6 +158,8 @@ DEFAULT_CONTEXT_LENGTHS = {
"grok": 131072, # catch-all (grok-beta, unknown grok-*)
# Kimi
"kimi": 262144,
# Nemotron — NVIDIA's open-weights series (128K context across all sizes)
"nemotron": 131072,
# Arcee
"trinity": 262144,
# OpenRouter
@@ -240,6 +242,7 @@ _URL_TO_PROVIDER: Dict[str, str] = {
"api.fireworks.ai": "fireworks",
"opencode.ai": "opencode-go",
"api.x.ai": "xai",
"integrate.api.nvidia.com": "nvidia",
"api.xiaomimimo.com": "xiaomi",
"xiaomimimo.com": "xiaomi",
"ollama.com": "ollama-cloud",
+43 -3
View File
@@ -420,7 +420,10 @@ def list_provider_models(provider: str) -> List[str]:
models = _get_provider_models(provider)
if models is None:
return []
return list(models.keys())
return [
mid for mid in models.keys()
if not _should_hide_from_provider_catalog(provider, mid)
]
# Patterns that indicate non-agentic or noise models (TTS, embedding,
@@ -432,6 +435,43 @@ _NOISE_PATTERNS: re.Pattern = re.compile(
re.IGNORECASE,
)
# Google's live Gemini catalogs currently include a mix of stale slugs and
# Gemma models whose TPM quotas are too small for normal Hermes agent traffic.
# Keep capability metadata available for direct/manual use, but hide these from
# the Gemini model catalogs we surface in setup and model selection.
_GOOGLE_HIDDEN_MODELS = frozenset({
# Low-TPM Gemma models that trip Google input-token quota walls under
# agent-style traffic despite advertising large context windows.
"gemma-4-31b-it",
"gemma-4-26b-it",
"gemma-4-26b-a4b-it",
"gemma-3-1b",
"gemma-3-1b-it",
"gemma-3-2b",
"gemma-3-2b-it",
"gemma-3-4b",
"gemma-3-4b-it",
"gemma-3-12b",
"gemma-3-12b-it",
"gemma-3-27b",
"gemma-3-27b-it",
# Stale/retired Google slugs that still surface through models.dev-backed
# Gemini selection but 404 on the current Google endpoints.
"gemini-1.5-flash",
"gemini-1.5-pro",
"gemini-1.5-flash-8b",
"gemini-2.0-flash",
"gemini-2.0-flash-lite",
})
def _should_hide_from_provider_catalog(provider: str, model_id: str) -> bool:
provider_lower = (provider or "").strip().lower()
model_lower = (model_id or "").strip().lower()
if provider_lower in {"gemini", "google"} and model_lower in _GOOGLE_HIDDEN_MODELS:
return True
return False
def list_agentic_models(provider: str) -> List[str]:
"""Return model IDs suitable for agentic use from models.dev.
@@ -448,6 +488,8 @@ def list_agentic_models(provider: str) -> List[str]:
for mid, entry in models.items():
if not isinstance(entry, dict):
continue
if _should_hide_from_provider_catalog(provider, mid):
continue
if not entry.get("tool_call", False):
continue
if _NOISE_PATTERNS.search(mid):
@@ -582,5 +624,3 @@ def get_model_info(
return _parse_model_info(mid, mdata, mdev_id)
return None
+7 -6
View File
@@ -654,7 +654,7 @@ def build_skills_system_prompt(
):
continue
skills_by_category.setdefault(category, []).append(
(skill_name, entry.get("description", ""))
(frontmatter_name, entry.get("description", ""))
)
category_descriptions = {
str(k): str(v)
@@ -679,7 +679,7 @@ def build_skills_system_prompt(
):
continue
skills_by_category.setdefault(entry["category"], []).append(
(skill_name, entry["description"])
(entry["frontmatter_name"], entry["description"])
)
# Read category-level DESCRIPTION.md files
@@ -722,9 +722,10 @@ def build_skills_system_prompt(
continue
entry = _build_snapshot_entry(skill_file, ext_dir, frontmatter, desc)
skill_name = entry["skill_name"]
if skill_name in seen_skill_names:
frontmatter_name = entry["frontmatter_name"]
if frontmatter_name in seen_skill_names:
continue
if entry["frontmatter_name"] in disabled or skill_name in disabled:
if frontmatter_name in disabled or skill_name in disabled:
continue
if not _skill_should_show(
extract_skill_conditions(frontmatter),
@@ -732,9 +733,9 @@ def build_skills_system_prompt(
available_toolsets,
):
continue
seen_skill_names.add(skill_name)
seen_skill_names.add(frontmatter_name)
skills_by_category.setdefault(entry["category"], []).append(
(skill_name, entry["description"])
(frontmatter_name, entry["description"])
)
except Exception as e:
logger.debug("Error reading external skill %s: %s", skill_file, e)
+1
View File
@@ -24,6 +24,7 @@ model:
# "minimax" - MiniMax global (requires: MINIMAX_API_KEY)
# "minimax-cn" - MiniMax China (requires: MINIMAX_CN_API_KEY)
# "huggingface" - Hugging Face Inference (requires: HF_TOKEN)
# "nvidia" - NVIDIA NIM / build.nvidia.com (requires: NVIDIA_API_KEY)
# "xiaomi" - Xiaomi MiMo (requires: XIAOMI_API_KEY)
# "arcee" - Arcee AI Trinity models (requires: ARCEEAI_API_KEY)
# "ollama-cloud" - Ollama Cloud (requires: OLLAMA_API_KEY — https://ollama.com/settings)
+399 -97
View File
@@ -18,6 +18,8 @@ import os
import shutil
import sys
import json
import re
import base64
import atexit
import tempfile
import time
@@ -78,6 +80,76 @@ _project_env = Path(__file__).parent / '.env'
load_hermes_dotenv(hermes_home=_hermes_home, project_env=_project_env)
_REASONING_TAGS = (
"REASONING_SCRATCHPAD",
"think",
"thinking",
"reasoning",
"thought",
)
def _strip_reasoning_tags(text: str) -> str:
"""Remove reasoning/thinking blocks from displayed text.
Handles every case:
* Closed pairs ``<tag></tag>`` (case-insensitive, multi-line).
* Unterminated open tags that run to end-of-text (e.g. truncated
generations on NIM/MiniMax where the close tag is dropped).
* Stray orphan close tags (``stuff</think>answer``) left behind by
partial-content dumps.
Covers the variants emitted by reasoning models today: ``<think>``,
``<thinking>``, ``<reasoning>``, ``<REASONING_SCRATCHPAD>``, and
``<thought>`` (Gemma 4). Must stay in sync with
``run_agent.py::_strip_think_blocks`` and the stream consumer's
``_OPEN_THINK_TAGS`` / ``_CLOSE_THINK_TAGS`` tuples.
"""
cleaned = text
for tag in _REASONING_TAGS:
# Closed pair — case-insensitive so <THINK>…</THINK> is handled too.
cleaned = re.sub(
rf"<{tag}>.*?</{tag}>\s*",
"",
cleaned,
flags=re.DOTALL | re.IGNORECASE,
)
# Unterminated open tag — strip from the tag to end of text.
cleaned = re.sub(
rf"<{tag}>.*$",
"",
cleaned,
flags=re.DOTALL | re.IGNORECASE,
)
# Stray orphan close tag left behind by partial dumps.
cleaned = re.sub(
rf"</{tag}>\s*",
"",
cleaned,
flags=re.IGNORECASE,
)
return cleaned.strip()
def _assistant_content_as_text(content: Any) -> str:
if content is None:
return ""
if isinstance(content, str):
return content
if isinstance(content, list):
parts = [
str(part.get("text", ""))
for part in content
if isinstance(part, dict) and part.get("type") == "text"
]
return "\n".join(p for p in parts if p)
return str(content)
def _assistant_copy_text(content: Any) -> str:
return _strip_reasoning_tags(_assistant_content_as_text(content))
# =============================================================================
# Configuration Loading
# =============================================================================
@@ -1172,6 +1244,10 @@ def _resolve_attachment_path(raw_path: str) -> Path | None:
return None
expanded = os.path.expandvars(os.path.expanduser(token))
if os.name != "nt":
normalized = expanded.replace("\\", "/")
if len(normalized) >= 3 and normalized[1] == ":" and normalized[2] == "/" and normalized[0].isalpha():
expanded = f"/mnt/{normalized[0].lower()}/{normalized[3:]}"
path = Path(expanded)
if not path.is_absolute():
base_dir = Path(os.getenv("TERMINAL_CWD", os.getcwd()))
@@ -1254,10 +1330,12 @@ def _detect_file_drop(user_input: str) -> "dict | None":
or stripped.startswith("~")
or stripped.startswith("./")
or stripped.startswith("../")
or (len(stripped) >= 3 and stripped[1] == ":" and stripped[2] in ("\\", "/") and stripped[0].isalpha())
or stripped.startswith('"/')
or stripped.startswith('"~')
or stripped.startswith("'/")
or stripped.startswith("'~")
or (len(stripped) >= 4 and stripped[0] in ("'", '"') and stripped[2] == ":" and stripped[3] in ("\\", "/") and stripped[1].isalpha())
)
if not starts_like_path:
return None
@@ -1732,7 +1810,7 @@ class HermesCLI:
mcp_names = set((CLI_CONFIG.get("mcp_servers") or {}).keys())
invalid = [t for t in toolsets if not validate_toolset(t) and t not in mcp_names]
if invalid:
self.console.print(f"[bold red]Warning: Unknown toolsets: {', '.join(invalid)}[/]")
self._console_print(f"[bold red]Warning: Unknown toolsets: {', '.join(invalid)}[/]")
# Filesystem checkpoints: CLI flag > config
cp_cfg = CLI_CONFIG.get("checkpoints", {})
@@ -2024,20 +2102,35 @@ class HermesCLI:
def _spinner_widget_height(self, width: Optional[int] = None) -> int:
"""Return the visible height for the spinner/status text line above the status bar."""
if not getattr(self, "_spinner_text", ""):
spinner_line = self._render_spinner_text()
if not spinner_line:
return 0
if self._use_minimal_tui_chrome(width=width):
return 0
# Compute how many lines the spinner text needs when wrapped.
# The rendered text is " {emoji} {label} ({elapsed})" — about
# len(_spinner_text) + 16 chars for indent + timer suffix.
width = width or self._get_tui_terminal_width()
if width and width > 10:
import math
text_len = len(self._spinner_text) + 16 # indent + timer
return max(1, math.ceil(text_len / width))
text_width = self._status_bar_display_width(spinner_line)
return max(1, math.ceil(text_width / width))
return 1
def _render_spinner_text(self) -> str:
"""Return the live spinner/status text exactly as rendered in the TUI."""
txt = getattr(self, "_spinner_text", "")
if not txt:
return ""
t0 = getattr(self, "_tool_start_time", 0) or 0
if t0 > 0:
import time as _time
elapsed = _time.monotonic() - t0
if elapsed >= 60:
_m, _s = int(elapsed // 60), int(elapsed % 60)
elapsed_str = f"{_m}m {_s}s"
else:
elapsed_str = f"{elapsed:.1f}s"
return f" {txt} ({elapsed_str})"
return f" {txt}"
def _get_voice_status_fragments(self, width: Optional[int] = None):
"""Return the voice status bar fragments for the interactive TUI."""
width = width or self._get_tui_terminal_width()
@@ -2168,7 +2261,7 @@ class HermesCLI:
normalized_model = normalize_model_for_provider(current_model, resolved_provider)
if normalized_model and normalized_model != current_model:
if not self._model_is_default:
self.console.print(
self._console_print(
f"[yellow]⚠️ Normalized model '{current_model}' to '{normalized_model}' for {resolved_provider}.[/]"
)
self.model = normalized_model
@@ -2184,7 +2277,7 @@ class HermesCLI:
canonical = normalize_copilot_model_id(current_model, api_key=self.api_key)
if canonical and canonical != current_model:
if not self._model_is_default:
self.console.print(
self._console_print(
f"[yellow]⚠️ Normalized Copilot model '{current_model}' to '{canonical}'.[/]"
)
self.model = canonical
@@ -2206,7 +2299,7 @@ class HermesCLI:
canonical = normalize_opencode_model_id(resolved_provider, current_model)
if canonical and canonical != current_model:
if not self._model_is_default:
self.console.print(
self._console_print(
f"[yellow]⚠️ Stripped provider prefix from '{current_model}'; using '{canonical}' for {resolved_provider}.[/]"
)
self.model = canonical
@@ -2228,7 +2321,7 @@ class HermesCLI:
if "/" in current_model:
slug = current_model.split("/", 1)[1]
if not self._model_is_default:
self.console.print(
self._console_print(
f"[yellow]⚠️ Stripped provider prefix from '{current_model}'; "
f"using '{slug}' for OpenAI Codex.[/]"
)
@@ -2977,7 +3070,7 @@ class HermesCLI:
use_compact = self.compact or term_width < 80
if use_compact:
self.console.print(_build_compact_banner())
self._console_print(_build_compact_banner())
self._show_status()
else:
# Get tools for display
@@ -3002,25 +3095,25 @@ class HermesCLI:
# Warn about very low context lengths (common with local servers)
if ctx_len and ctx_len <= 8192:
self.console.print()
self.console.print(
self._console_print()
self._console_print(
f"[yellow]⚠️ Context length is only {ctx_len:,} tokens — "
f"this is likely too low for agent use with tools.[/]"
)
self.console.print(
self._console_print(
"[dim] Hermes needs 16k32k minimum. Tool schemas + system prompt alone use ~4k8k.[/]"
)
base_url = getattr(self, "base_url", "") or ""
if "11434" in base_url or "ollama" in base_url.lower():
self.console.print(
self._console_print(
"[dim] Ollama fix: OLLAMA_CONTEXT_LENGTH=32768 ollama serve[/]"
)
elif "1234" in base_url:
self.console.print(
self._console_print(
"[dim] LM Studio fix: Set context length in model settings → reload model[/]"
)
else:
self.console.print(
self._console_print(
"[dim] Fix: Set model.context_length in config.yaml, or increase your server's context setting[/]"
)
@@ -3029,20 +3122,20 @@ class HermesCLI:
model_name = getattr(self, "model", "") or ""
if is_nous_hermes_non_agentic(model_name):
self.console.print()
self.console.print(
self._console_print()
self._console_print(
"[bold yellow]⚠ Nous Research Hermes 3 & 4 models are NOT agentic and are not "
"designed for use with Hermes Agent.[/]"
)
self.console.print(
self._console_print(
"[dim] They lack tool-calling capabilities required for agent workflows. "
"Consider using an agentic model (Claude, GPT, Gemini, DeepSeek, etc.).[/]"
)
self.console.print(
self._console_print(
"[dim] Switch with: /model sonnet or /model gpt5[/]"
)
self.console.print()
self._console_print()
def _preload_resumed_session(self) -> bool:
"""Load a resumed session's history from the DB early (before first chat).
@@ -3060,10 +3153,10 @@ class HermesCLI:
session_meta = self._session_db.get_session(self.session_id)
if not session_meta:
self.console.print(
self._console_print(
f"[bold red]Session not found: {self.session_id}[/]"
)
self.console.print(
self._console_print(
"[dim]Use a session ID from a previous CLI run "
"(hermes sessions list).[/]"
)
@@ -3078,7 +3171,7 @@ class HermesCLI:
if session_meta.get("title"):
title_part = f' "{session_meta["title"]}"'
accent_color = _accent_hex()
self.console.print(
self._console_print(
f"[{accent_color}]↻ Resumed session [bold]{self.session_id}[/bold]"
f"{title_part} "
f"({msg_count} user message{'s' if msg_count != 1 else ''}, "
@@ -3086,7 +3179,7 @@ class HermesCLI:
)
else:
accent_color = _accent_hex()
self.console.print(
self._console_print(
f"[{accent_color}]Session {self.session_id} found but has no "
f"messages. Starting fresh.[/]"
)
@@ -3125,21 +3218,6 @@ class HermesCLI:
MAX_ASST_LEN = 200 # truncate assistant text
MAX_ASST_LINES = 3 # max lines of assistant text
def _strip_reasoning(text: str) -> str:
"""Remove <REASONING_SCRATCHPAD>...</REASONING_SCRATCHPAD> blocks
from displayed text (reasoning model internal thoughts)."""
import re
cleaned = re.sub(
r"<REASONING_SCRATCHPAD>.*?</REASONING_SCRATCHPAD>\s*",
"", text, flags=re.DOTALL,
)
# Also strip unclosed reasoning tags at the end
cleaned = re.sub(
r"<REASONING_SCRATCHPAD>.*$",
"", cleaned, flags=re.DOTALL,
)
return cleaned.strip()
# Collect displayable entries (skip system, tool-result messages)
entries = [] # list of (role, display_text)
_last_asst_idx = None # index of last assistant entry
@@ -3171,7 +3249,7 @@ class HermesCLI:
elif role == "assistant":
text = "" if content is None else str(content)
text = _strip_reasoning(text)
text = _strip_reasoning_tags(text)
parts = []
full_parts = [] # un-truncated version
if text:
@@ -3276,7 +3354,7 @@ class HermesCLI:
padding=(0, 1),
style=_history_text_c,
)
self.console.print(panel)
self._console_print(panel)
def _try_attach_clipboard_image(self) -> bool:
"""Check clipboard for an image and attach it if found.
@@ -3510,6 +3588,26 @@ class HermesCLI:
killed = process_registry.kill_all()
print(f" ✅ Stopped {killed} process(es).")
def _handle_agents_command(self):
"""Handle /agents — show background processes and agent status."""
from tools.process_registry import format_uptime_short, process_registry
processes = process_registry.list_sessions()
running = [p for p in processes if p.get("status") == "running"]
finished = [p for p in processes if p.get("status") != "running"]
_cprint(f" Running processes: {len(running)}")
for p in running:
cmd = p.get("command", "")[:80]
up = format_uptime_short(p.get("uptime_seconds", 0))
_cprint(f" {p.get('session_id', '?')} · {up} · {cmd}")
if finished:
_cprint(f" Recently finished: {len(finished)}")
agent_running = getattr(self, "_agent_running", False)
_cprint(f" Agent: {'running' if agent_running else 'idle'}")
def _handle_paste_command(self):
"""Handle /paste — explicitly check clipboard for an image.
@@ -3535,6 +3633,61 @@ class HermesCLI:
else:
_cprint(f" {_DIM}(._.) No image found in clipboard{_RST}")
def _write_osc52_clipboard(self, text: str) -> None:
"""Copy *text* to terminal clipboard via OSC 52."""
payload = base64.b64encode(text.encode("utf-8")).decode("ascii")
seq = f"\x1b]52;c;{payload}\x07"
out = getattr(self, "_app", None)
output = getattr(out, "output", None) if out else None
if output and hasattr(output, "write_raw"):
output.write_raw(seq)
output.flush()
return
if output and hasattr(output, "write"):
output.write(seq)
output.flush()
return
sys.stdout.write(seq)
sys.stdout.flush()
def _handle_copy_command(self, cmd_original: str) -> None:
"""Handle /copy [number] — copy assistant output to clipboard."""
parts = cmd_original.split(maxsplit=1)
arg = parts[1].strip() if len(parts) > 1 else ""
assistant = [m for m in self.conversation_history if m.get("role") == "assistant"]
if not assistant:
_cprint(" Nothing to copy yet.")
return
if arg:
try:
idx = int(arg) - 1
except ValueError:
_cprint(" Usage: /copy [number]")
return
if idx < 0 or idx >= len(assistant):
_cprint(f" Invalid response number. Use 1-{len(assistant)}.")
return
else:
idx = len(assistant) - 1
while idx >= 0 and not _assistant_copy_text(assistant[idx].get("content")):
idx -= 1
if idx < 0:
_cprint(" Nothing to copy in assistant responses yet.")
return
text = _assistant_copy_text(assistant[idx].get("content"))
if not text:
_cprint(" Nothing to copy in that assistant response.")
return
try:
self._write_osc52_clipboard(text)
_cprint(f" Copied assistant response #{idx + 1} to clipboard")
except Exception as e:
_cprint(f" Clipboard copy failed: {e}")
def _handle_image_command(self, cmd_original: str):
"""Handle /image <path> — attach a local image file for the next prompt."""
raw_args = (cmd_original.split(None, 1)[1].strip() if " " in cmd_original else "")
@@ -3637,14 +3790,14 @@ class HermesCLI:
api_key_missing = [u for u in unavailable if u["missing_vars"]]
if api_key_missing:
self.console.print()
self.console.print("[yellow]⚠️ Some tools disabled (missing API keys):[/]")
self._console_print()
self._console_print("[yellow]⚠️ Some tools disabled (missing API keys):[/]")
for item in api_key_missing:
tools_str = ", ".join(item["tools"][:2]) # Show first 2 tools
if len(item["tools"]) > 2:
tools_str += f", +{len(item['tools'])-2} more"
self.console.print(f" [dim]• {item['name']}[/] [dim italic]({', '.join(item['missing_vars'])})[/]")
self.console.print("[dim] Run 'hermes setup' to configure[/]")
self._console_print(f" [dim]• {item['name']}[/] [dim italic]({', '.join(item['missing_vars'])})[/]")
self._console_print("[dim] Run 'hermes setup' to configure[/]")
except Exception:
pass # Don't crash on import errors
@@ -3671,7 +3824,7 @@ class HermesCLI:
skin = get_active_skin()
separator_color = skin.get_color("banner_dim", "#B8860B")
accent_color = skin.get_color("ui_accent", "#FFBF00")
label_color = skin.get_color("ui_label", "#4dd0e1")
label_color = skin.get_color("ui_label", "#DAA520")
except Exception:
separator_color, accent_color, label_color = "#B8860B", "#FFBF00", "cyan"
toolsets_info = ""
@@ -3682,7 +3835,7 @@ class HermesCLI:
if self._provider_source:
provider_info += f" [dim {separator_color}]·[/] [dim]auth: {self._provider_source}[/]"
self.console.print(
self._console_print(
f" {api_indicator} [{accent_color}]{model_short}[/] "
f"[dim {separator_color}]·[/] [bold {label_color}]{tool_count} tools[/]"
f"{toolsets_info}{provider_info}"
@@ -3739,7 +3892,7 @@ class HermesCLI:
f"Tokens: {total_tokens:,}",
f"Agent Running: {'Yes' if is_running else 'No'}",
])
self.console.print("\n".join(lines), highlight=False, markup=False)
self._console_print("\n".join(lines), highlight=False, markup=False)
def _fast_command_available(self) -> bool:
try:
@@ -4514,6 +4667,34 @@ class HermesCLI:
self._restore_modal_input_snapshot()
self._invalidate(min_interval=0.0)
@staticmethod
def _compute_model_picker_viewport(
selected: int,
scroll_offset: int,
n: int,
term_rows: int,
reserved_below: int = 6,
panel_chrome: int = 6,
min_visible: int = 3,
) -> tuple[int, int]:
"""Resolve (scroll_offset, visible) for the /model picker viewport.
``reserved_below`` matches the approval / clarify panels input area,
status bar, and separators below the panel. ``panel_chrome`` covers
this panel's own borders + blanks + hint row. The remaining rows hold
the scrollable list, with the offset slid to keep ``selected`` on screen.
"""
max_visible = max(min_visible, term_rows - reserved_below - panel_chrome)
if n <= max_visible:
return 0, n
visible = max_visible
if selected < scroll_offset:
scroll_offset = selected
elif selected >= scroll_offset + visible:
scroll_offset = selected - visible + 1
scroll_offset = max(0, min(scroll_offset, n - visible))
return scroll_offset, visible
def _apply_model_switch_result(self, result, persist_global: bool) -> None:
if not result.success:
_cprint(f"{result.error_message}")
@@ -4909,8 +5090,15 @@ class HermesCLI:
print(" To change model or provider, use: hermes model")
def _output_console(self):
"""Use prompt_toolkit-safe Rich rendering once the TUI is live."""
if getattr(self, "_app", None):
return ChatConsole()
return self.console
def _console_print(self, *args, **kwargs):
"""Print through the active command-safe console."""
self._output_console().print(*args, **kwargs)
@staticmethod
def _resolve_personality_prompt(value) -> str:
@@ -4930,14 +5118,14 @@ class HermesCLI:
from agent.google_oauth import get_valid_access_token, GoogleOAuthError, load_credentials
from agent.google_code_assist import retrieve_user_quota, CodeAssistError
except ImportError as exc:
self.console.print(f" [red]Gemini modules unavailable: {exc}[/]")
self._console_print(f" [red]Gemini modules unavailable: {exc}[/]")
return
try:
access_token = get_valid_access_token()
except GoogleOAuthError as exc:
self.console.print(f" [yellow]{exc}[/]")
self.console.print(" Run [bold]/model[/] and pick 'Google Gemini (OAuth)' to sign in.")
self._console_print(f" [yellow]{exc}[/]")
self._console_print(" Run [bold]/model[/] and pick 'Google Gemini (OAuth)' to sign in.")
return
creds = load_credentials()
@@ -4946,18 +5134,18 @@ class HermesCLI:
try:
buckets = retrieve_user_quota(access_token, project_id=project_id)
except CodeAssistError as exc:
self.console.print(f" [red]Quota lookup failed:[/] {exc}")
self._console_print(f" [red]Quota lookup failed:[/] {exc}")
return
if not buckets:
self.console.print(" [dim]No quota buckets reported (account may be on legacy/unmetered tier).[/]")
self._console_print(" [dim]No quota buckets reported (account may be on legacy/unmetered tier).[/]")
return
# Sort for stable display, group by model
buckets.sort(key=lambda b: (b.model_id, b.token_type))
self.console.print()
self.console.print(f" [bold]Gemini Code Assist quota[/] (project: {project_id or '(auto / free-tier)'})")
self.console.print()
self._console_print()
self._console_print(f" [bold]Gemini Code Assist quota[/] (project: {project_id or '(auto / free-tier)'})")
self._console_print()
for b in buckets:
pct = max(0.0, min(1.0, b.remaining_fraction))
width = 20
@@ -4967,8 +5155,8 @@ class HermesCLI:
header = b.model_id
if b.token_type:
header += f" [{b.token_type}]"
self.console.print(f" {header:40s} {bar} {pct_str}")
self.console.print()
self._console_print(f" {header:40s} {bar} {pct_str}")
self._console_print()
def _handle_personality_command(self, cmd: str):
"""Handle the /personality command to set predefined personalities."""
@@ -5416,7 +5604,7 @@ class HermesCLI:
_tip_color = get_active_skin().get_color("banner_dim", "#B8860B")
except Exception:
_tip_color = "#B8860B"
self.console.print(f"[dim {_tip_color}]✦ Tip: {_tip}[/]")
self._console_print(f"[dim {_tip_color}]✦ Tip: {_tip}[/]")
except Exception:
pass
elif canonical == "history":
@@ -5510,7 +5698,7 @@ class HermesCLI:
elif canonical == "statusbar":
self._status_bar_visible = not self._status_bar_visible
state = "visible" if self._status_bar_visible else "hidden"
self.console.print(f" Status bar {state}")
self._console_print(f" Status bar {state}")
elif canonical == "verbose":
self._toggle_verbose()
elif canonical == "yolo":
@@ -5525,6 +5713,8 @@ class HermesCLI:
self._show_usage()
elif canonical == "insights":
self._show_insights(cmd_original)
elif canonical == "copy":
self._handle_copy_command(cmd_original)
elif canonical == "debug":
self._handle_debug_command()
elif canonical == "paste":
@@ -5568,6 +5758,8 @@ class HermesCLI:
self._handle_snapshot_command(cmd_original)
elif canonical == "stop":
self._handle_stop_command()
elif canonical == "agents":
self._handle_agents_command()
elif canonical == "background":
self._handle_background_command(cmd_original)
elif canonical == "btw":
@@ -5584,6 +5776,30 @@ class HermesCLI:
_cprint(f" Queued for the next turn: {payload[:80]}{'...' if len(payload) > 80 else ''}")
else:
_cprint(f" Queued: {payload[:80]}{'...' if len(payload) > 80 else ''}")
elif canonical == "steer":
# Inject a message after the next tool call without interrupting.
# If the agent is actively running, push the text into the agent's
# pending_steer slot — the drain hook in _execute_tool_calls_*
# will append it to the next tool result's content. If no agent
# is running, fall back to queue semantics (same as /queue).
parts = cmd_original.split(None, 1)
payload = parts[1].strip() if len(parts) > 1 else ""
if not payload:
_cprint(" Usage: /steer <prompt>")
elif self._agent_running and self.agent is not None and hasattr(self.agent, "steer"):
try:
accepted = self.agent.steer(payload)
except Exception as exc:
_cprint(f" Steer failed: {exc}")
else:
if accepted:
_cprint(f" ⏩ Steer queued — arrives after the next tool call: {payload[:80]}{'...' if len(payload) > 80 else ''}")
else:
_cprint(" Steer rejected (empty payload).")
else:
# No active run — treat as a normal next-turn message.
self._pending_input.put(payload)
_cprint(f" No agent running; queued as next turn: {payload[:80]}{'...' if len(payload) > 80 else ''}")
elif canonical == "skin":
self._handle_skin_command(cmd_original)
elif canonical == "voice":
@@ -5605,15 +5821,15 @@ class HermesCLI:
)
output = result.stdout.strip() or result.stderr.strip()
if output:
self.console.print(_rich_text_from_ansi(output))
self._console_print(_rich_text_from_ansi(output))
else:
self.console.print("[dim]Command returned no output[/]")
self._console_print("[dim]Command returned no output[/]")
except subprocess.TimeoutExpired:
self.console.print("[bold red]Quick command timed out (30s)[/]")
self._console_print("[bold red]Quick command timed out (30s)[/]")
except Exception as e:
self.console.print(f"[bold red]Quick command error: {e}[/]")
self._console_print(f"[bold red]Quick command error: {e}[/]")
else:
self.console.print(f"[bold red]Quick command '{base_cmd}' has no command defined[/]")
self._console_print(f"[bold red]Quick command '{base_cmd}' has no command defined[/]")
elif qcmd.get("type") == "alias":
target = qcmd.get("target", "").strip()
if target:
@@ -5622,9 +5838,9 @@ class HermesCLI:
aliased_command = f"{target} {user_args}".strip()
return self.process_command(aliased_command)
else:
self.console.print(f"[bold red]Quick command '{base_cmd}' has no target defined[/]")
self._console_print(f"[bold red]Quick command '{base_cmd}' has no target defined[/]")
else:
self.console.print(f"[bold red]Quick command '{base_cmd}' has unsupported type (supported: 'exec', 'alias')[/]")
self._console_print(f"[bold red]Quick command '{base_cmd}' has unsupported type (supported: 'exec', 'alias')[/]")
# Check for plugin-registered slash commands
elif base_cmd.lstrip("/") in _get_plugin_cmd_handler_names():
from hermes_cli.plugins import get_plugin_command_handler
@@ -6881,8 +7097,7 @@ class HermesCLI:
)
raise RuntimeError(
"Voice mode requires sounddevice and numpy.\n"
"Install with: pip install sounddevice numpy\n"
"Or: pip install hermes-agent[voice]"
f"Install with: {sys.executable} -m pip install sounddevice numpy"
)
if not reqs.get("stt_available", reqs.get("stt_key_set")):
raise RuntimeError(
@@ -7158,8 +7373,7 @@ class HermesCLI:
_cprint(f" {_DIM}Then install/update the Termux:API Android app for microphone capture{_RST}")
_cprint(f" {_BOLD}Option 2: pkg install python-numpy portaudio && python -m pip install sounddevice{_RST}")
else:
_cprint(f"\n {_BOLD}Install: pip install {' '.join(reqs['missing_packages'])}{_RST}")
_cprint(f" {_DIM}Or: pip install hermes-agent[voice]{_RST}")
_cprint(f"\n {_BOLD}Install: {sys.executable} -m pip install {' '.join(reqs['missing_packages'])}{_RST}")
return
with self._voice_lock:
@@ -8110,7 +8324,15 @@ class HermesCLI:
else:
print(f"\n⚡ Sending after interrupt: '{preview}'")
self._pending_input.put(combined)
# If a /steer was left over (agent finished before another tool
# batch could absorb it), deliver it as the next user turn.
_leftover_steer = result.get("pending_steer") if result else None
if _leftover_steer and hasattr(self, '_pending_input'):
preview = _leftover_steer[:60] + ("..." if len(_leftover_steer) > 60 else "")
print(f"\n⏩ Delivering leftover /steer as next turn: '{preview}'")
self._pending_input.put(_leftover_steer)
return response
except Exception as e:
@@ -8388,7 +8610,7 @@ class HermesCLI:
except Exception:
_welcome_text = "Welcome to Hermes Agent! Type your message or /help for commands."
_welcome_color = "#FFF8DC"
self.console.print(f"[{_welcome_color}]{_welcome_text}[/]")
self._console_print(f"[{_welcome_color}]{_welcome_text}[/]")
# Show a random tip to help users discover features
try:
from hermes_cli.tips import get_random_tip
@@ -8397,16 +8619,16 @@ class HermesCLI:
_tip_color = _welcome_skin.get_color("banner_dim", "#B8860B")
except Exception:
_tip_color = "#B8860B"
self.console.print(f"[dim {_tip_color}]✦ Tip: {_tip}[/]")
self._console_print(f"[dim {_tip_color}]✦ Tip: {_tip}[/]")
except Exception:
pass # Tips are non-critical — never break startup
if self.preloaded_skills and not self._startup_skills_line_shown:
skills_label = ", ".join(self.preloaded_skills)
self.console.print(
self._console_print(
f"[bold {_accent_hex()}]Activated skills:[/] {skills_label}"
)
self._startup_skills_line_shown = True
self.console.print()
self._console_print()
# State for async operation
self._agent_running = False
@@ -8528,6 +8750,7 @@ class HermesCLI:
# --- /model picker modal ---
if self._model_picker_state:
self._handle_model_picker_selection()
event.app.current_buffer.reset()
event.app.invalidate()
return
@@ -8693,6 +8916,13 @@ class HermesCLI:
state["selected"] = min(max_idx, state.get("selected", 0) + 1)
event.app.invalidate()
@kb.add('escape', filter=Condition(lambda: bool(self._model_picker_state)), eager=True)
def model_picker_escape(event):
"""ESC closes the /model picker."""
self._close_model_picker()
event.app.current_buffer.reset()
event.app.invalidate()
# --- History navigation: up/down browse history in normal input mode ---
# The TextArea is multiline, so by default up/down only move the cursor.
# Buffer.auto_up/auto_down handle both: cursor movement when multi-line,
@@ -9201,21 +9431,10 @@ class HermesCLI:
return cli_ref._agent_spacer_height()
def get_spinner_text():
txt = cli_ref._spinner_text
if not txt:
spinner_line = cli_ref._render_spinner_text()
if not spinner_line:
return []
# Append live elapsed timer when a tool is running
t0 = cli_ref._tool_start_time
if t0 > 0:
import time as _time
elapsed = _time.monotonic() - t0
if elapsed >= 60:
_m, _s = int(elapsed // 60), int(elapsed % 60)
elapsed_str = f"{_m}m {_s}s"
else:
elapsed_str = f"{elapsed:.1f}s"
return [('class:hint', f' {txt} ({elapsed_str})')]
return [('class:hint', f' {txt}')]
return [('class:hint', spinner_line)]
def get_spinner_height():
return cli_ref._spinner_widget_height()
@@ -9494,6 +9713,22 @@ class HermesCLI:
box_width = _panel_box_width(title, [hint] + choices, min_width=46, max_width=84)
inner_text_width = max(8, box_width - 6)
selected = state.get("selected", 0)
# Scrolling viewport: the panel renders into a Window with no max
# height, so without limiting visible items the bottom border and
# any items past the available terminal rows get clipped on long
# provider catalogs (e.g. Ollama Cloud's 36+ models).
try:
from prompt_toolkit.application import get_app
term_rows = get_app().output.get_size().rows
except Exception:
term_rows = shutil.get_terminal_size((100, 24)).lines
scroll_offset, visible = HermesCLI._compute_model_picker_viewport(
selected, state.get("_scroll_offset", 0), len(choices), term_rows,
)
state["_scroll_offset"] = scroll_offset
lines = []
lines.append(('class:clarify-border', '╭─ '))
lines.append(('class:clarify-title', title))
@@ -9501,8 +9736,8 @@ class HermesCLI:
_append_blank_panel_line(lines, 'class:clarify-border', box_width)
_append_panel_line(lines, 'class:clarify-border', 'class:clarify-hint', hint, box_width)
_append_blank_panel_line(lines, 'class:clarify-border', box_width)
selected = state.get("selected", 0)
for idx, choice in enumerate(choices):
for idx in range(scroll_offset, scroll_offset + visible):
choice = choices[idx]
style = 'class:clarify-selected' if idx == selected else 'class:clarify-choice'
prefix = ' ' if idx == selected else ' '
for wrapped in _wrap_panel_text(prefix + choice, inner_text_width, subsequent_indent=' '):
@@ -9907,8 +10142,36 @@ class HermesCLI:
# Register signal handlers for graceful shutdown on SSH disconnect / SIGTERM
def _signal_handler(signum, frame):
"""Handle SIGHUP/SIGTERM by triggering graceful cleanup."""
"""Handle SIGHUP/SIGTERM by triggering graceful cleanup.
Calls ``self.agent.interrupt()`` first so the agent daemon
thread's poll loop sees the per-thread interrupt and kills the
tool's subprocess group via ``_kill_process`` (os.killpg).
Without this, the main thread dies from KeyboardInterrupt and
the daemon thread is killed with it before it can run one
more poll iteration to clean up the subprocess, which was
spawned with ``os.setsid`` and therefore survives as an orphan
with PPID=1.
Grace window (``HERMES_SIGTERM_GRACE``, default 1.5 s) gives
the daemon time to: detect the interrupt (next 200 ms poll)
call _kill_process (SIGTERM + 1 s wait + SIGKILL if needed)
return from _wait_for_process. ``time.sleep`` releases the
GIL so the daemon actually runs during the window.
"""
logger.debug("Received signal %s, triggering graceful shutdown", signum)
try:
if getattr(self, "agent", None) and getattr(self, "_agent_running", False):
self.agent.interrupt(f"received signal {signum}")
import time as _t
try:
_grace = float(os.getenv("HERMES_SIGTERM_GRACE", "1.5"))
except (TypeError, ValueError):
_grace = 1.5
if _grace > 0:
_t.sleep(_grace)
except Exception:
pass # never block signal handling
raise KeyboardInterrupt()
try:
@@ -10211,6 +10474,45 @@ def main(
# Register cleanup for single-query mode (interactive mode registers in run())
atexit.register(_run_cleanup)
# Also install signal handlers in single-query / `-q` mode. Interactive
# mode registers its own inside HermesCLI.run(), but `-q` runs
# cli.agent.run_conversation() below and AIAgent spawns worker threads
# for tools — so when SIGTERM arrives on the main thread, raising
# KeyboardInterrupt only unwinds the main thread, not the worker
# running _wait_for_process. Python then exits, the child subprocess
# (spawned with os.setsid, its own process group) is reparented to
# init and keeps running as an orphan.
#
# Fix: route SIGTERM/SIGHUP through agent.interrupt() which sets the
# per-thread interrupt flag the worker's poll loop checks every 200 ms.
# Give the worker a grace window to call _kill_process (SIGTERM to the
# process group, then SIGKILL after 1 s), then raise KeyboardInterrupt
# so main unwinds normally. HERMES_SIGTERM_GRACE overrides the 1.5 s
# default for debugging.
def _signal_handler_q(signum, frame):
logger.debug("Received signal %s in single-query mode", signum)
try:
_agent = getattr(cli, "agent", None)
if _agent is not None:
_agent.interrupt(f"received signal {signum}")
import time as _t
try:
_grace = float(os.getenv("HERMES_SIGTERM_GRACE", "1.5"))
except (TypeError, ValueError):
_grace = 1.5
if _grace > 0:
_t.sleep(_grace)
except Exception:
pass # never block signal handling
raise KeyboardInterrupt()
try:
import signal as _signal
_signal.signal(_signal.SIGTERM, _signal_handler_q)
if hasattr(_signal, "SIGHUP"):
_signal.signal(_signal.SIGHUP, _signal_handler_q)
except Exception:
pass # signal handler may fail in restricted environments
# Handle single query mode
if query or image:
+253 -110
View File
@@ -27,7 +27,7 @@ except ImportError:
except ImportError:
msvcrt = None
from pathlib import Path
from typing import Optional
from typing import List, Optional
# Add parent directory to path for imports BEFORE repo-level imports.
# Without this, standalone invocations (e.g. after `hermes update` reloads
@@ -49,6 +49,33 @@ _KNOWN_DELIVERY_PLATFORMS = frozenset({
"qqbot",
})
# Platforms that support a configured cron/notification home target, mapped to
# the environment variable used by gateway setup/runtime config.
_HOME_TARGET_ENV_VARS = {
"matrix": "MATRIX_HOME_ROOM",
"telegram": "TELEGRAM_HOME_CHANNEL",
"discord": "DISCORD_HOME_CHANNEL",
"slack": "SLACK_HOME_CHANNEL",
"signal": "SIGNAL_HOME_CHANNEL",
"mattermost": "MATTERMOST_HOME_CHANNEL",
"sms": "SMS_HOME_CHANNEL",
"email": "EMAIL_HOME_ADDRESS",
"dingtalk": "DINGTALK_HOME_CHANNEL",
"feishu": "FEISHU_HOME_CHANNEL",
"wecom": "WECOM_HOME_CHANNEL",
"weixin": "WEIXIN_HOME_CHANNEL",
"bluebubbles": "BLUEBUBBLES_HOME_CHANNEL",
"qqbot": "QQBOT_HOME_CHANNEL",
}
# Legacy env var names kept for back-compat. Each entry is the current
# primary env var → the previous name. _get_home_target_chat_id falls
# back to the legacy name if the primary is unset, so users who set the
# old name before the rename keep working until they migrate.
_LEGACY_HOME_TARGET_ENV_VARS = {
"QQBOT_HOME_CHANNEL": "QQ_HOME_CHANNEL",
}
from cron.jobs import get_due_jobs, mark_job_run, save_job_output, advance_next_run
# Sentinel: when a cron agent has nothing new to report, it can start its
@@ -76,15 +103,28 @@ def _resolve_origin(job: dict) -> Optional[dict]:
return None
def _resolve_delivery_target(job: dict) -> Optional[dict]:
"""Resolve the concrete auto-delivery target for a cron job, if any."""
deliver = job.get("deliver", "local")
def _get_home_target_chat_id(platform_name: str) -> str:
"""Return the configured home target chat/room ID for a delivery platform."""
env_var = _HOME_TARGET_ENV_VARS.get(platform_name.lower())
if not env_var:
return ""
value = os.getenv(env_var, "")
if not value:
legacy = _LEGACY_HOME_TARGET_ENV_VARS.get(env_var)
if legacy:
value = os.getenv(legacy, "")
return value
def _resolve_single_delivery_target(job: dict, deliver_value: str) -> Optional[dict]:
"""Resolve one concrete auto-delivery target for a cron job."""
origin = _resolve_origin(job)
if deliver == "local":
if deliver_value == "local":
return None
if deliver == "origin":
if deliver_value == "origin":
if origin:
return {
"platform": origin["platform"],
@@ -93,8 +133,8 @@ def _resolve_delivery_target(job: dict) -> Optional[dict]:
}
# Origin missing (e.g. job created via API/script) — try each
# platform's home channel as a fallback instead of silently dropping.
for platform_name in ("matrix", "telegram", "discord", "slack", "bluebubbles"):
chat_id = os.getenv(f"{platform_name.upper()}_HOME_CHANNEL", "")
for platform_name in _HOME_TARGET_ENV_VARS:
chat_id = _get_home_target_chat_id(platform_name)
if chat_id:
logger.info(
"Job '%s' has deliver=origin but no origin; falling back to %s home channel",
@@ -108,8 +148,8 @@ def _resolve_delivery_target(job: dict) -> Optional[dict]:
}
return None
if ":" in deliver:
platform_name, rest = deliver.split(":", 1)
if ":" in deliver_value:
platform_name, rest = deliver_value.split(":", 1)
platform_key = platform_name.lower()
from tools.send_message_tool import _parse_target_ref
@@ -139,7 +179,7 @@ def _resolve_delivery_target(job: dict) -> Optional[dict]:
"thread_id": thread_id,
}
platform_name = deliver
platform_name = deliver_value
if origin and origin.get("platform") == platform_name:
return {
"platform": platform_name,
@@ -149,7 +189,7 @@ def _resolve_delivery_target(job: dict) -> Optional[dict]:
if platform_name.lower() not in _KNOWN_DELIVERY_PLATFORMS:
return None
chat_id = os.getenv(f"{platform_name.upper()}_HOME_CHANNEL", "")
chat_id = _get_home_target_chat_id(platform_name)
if not chat_id:
return None
@@ -160,6 +200,30 @@ def _resolve_delivery_target(job: dict) -> Optional[dict]:
}
def _resolve_delivery_targets(job: dict) -> List[dict]:
"""Resolve all concrete auto-delivery targets for a cron job (supports comma-separated deliver)."""
deliver = job.get("deliver", "local")
if deliver == "local":
return []
parts = [p.strip() for p in str(deliver).split(",") if p.strip()]
seen = set()
targets = []
for part in parts:
target = _resolve_single_delivery_target(job, part)
if target:
key = (target["platform"].lower(), str(target["chat_id"]), target.get("thread_id"))
if key not in seen:
seen.add(key)
targets.append(target)
return targets
def _resolve_delivery_target(job: dict) -> Optional[dict]:
"""Resolve the concrete auto-delivery target for a cron job, if any."""
targets = _resolve_delivery_targets(job)
return targets[0] if targets else None
# Media extension sets — keep in sync with gateway/platforms/base.py:_process_message_background
_AUDIO_EXTS = frozenset({'.ogg', '.opus', '.mp3', '.wav', '.m4a'})
_VIDEO_EXTS = frozenset({'.mp4', '.mov', '.avi', '.mkv', '.webm', '.3gp'})
@@ -200,7 +264,7 @@ def _send_media_via_adapter(adapter, chat_id: str, media_files: list, metadata:
def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Optional[str]:
"""
Deliver job output to the configured target (origin chat, specific platform, etc.).
Deliver job output to the configured target(s) (origin chat, specific platform, etc.).
When ``adapters`` and ``loop`` are provided (gateway is running), tries to
use the live adapter first this supports E2EE rooms (e.g. Matrix) where
@@ -209,33 +273,14 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
Returns None on success, or an error string on failure.
"""
target = _resolve_delivery_target(job)
if not target:
targets = _resolve_delivery_targets(job)
if not targets:
if job.get("deliver", "local") != "local":
msg = f"no delivery target resolved for deliver={job.get('deliver', 'local')}"
logger.warning("Job '%s': %s", job["id"], msg)
return msg
return None # local-only jobs don't deliver — not a failure
platform_name = target["platform"]
chat_id = target["chat_id"]
thread_id = target.get("thread_id")
# Diagnostic: log thread_id for topic-aware delivery debugging
origin = job.get("origin") or {}
origin_thread = origin.get("thread_id")
if origin_thread and not thread_id:
logger.warning(
"Job '%s': origin has thread_id=%s but delivery target lost it "
"(deliver=%s, target=%s)",
job["id"], origin_thread, job.get("deliver", "local"), target,
)
elif thread_id:
logger.debug(
"Job '%s': delivering to %s:%s thread_id=%s",
job["id"], platform_name, chat_id, thread_id,
)
from tools.send_message_tool import _send_to_platform
from gateway.config import load_gateway_config, Platform
@@ -258,24 +303,6 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
"bluebubbles": Platform.BLUEBUBBLES,
"qqbot": Platform.QQBOT,
}
platform = platform_map.get(platform_name.lower())
if not platform:
msg = f"unknown platform '{platform_name}'"
logger.warning("Job '%s': %s", job["id"], msg)
return msg
try:
config = load_gateway_config()
except Exception as e:
msg = f"failed to load gateway config: {e}"
logger.error("Job '%s': %s", job["id"], msg)
return msg
pconfig = config.platforms.get(platform)
if not pconfig or not pconfig.enabled:
msg = f"platform '{platform_name}' not configured/enabled"
logger.warning("Job '%s': %s", job["id"], msg)
return msg
# Optionally wrap the content with a header/footer so the user knows this
# is a cron delivery. Wrapping is on by default; set cron.wrap_response: false
@@ -304,67 +331,117 @@ def _deliver_result(job: dict, content: str, adapters=None, loop=None) -> Option
from gateway.platforms.base import BasePlatformAdapter
media_files, cleaned_delivery_content = BasePlatformAdapter.extract_media(delivery_content)
# Prefer the live adapter when the gateway is running — this supports E2EE
# rooms (e.g. Matrix) where the standalone HTTP path cannot encrypt.
runtime_adapter = (adapters or {}).get(platform)
if runtime_adapter is not None and loop is not None and getattr(loop, "is_running", lambda: False)():
send_metadata = {"thread_id": thread_id} if thread_id else None
try:
# Send cleaned text (MEDIA tags stripped) — not the raw content
text_to_send = cleaned_delivery_content.strip()
adapter_ok = True
if text_to_send:
future = asyncio.run_coroutine_threadsafe(
runtime_adapter.send(chat_id, text_to_send, metadata=send_metadata),
loop,
)
send_result = future.result(timeout=60)
if send_result and not getattr(send_result, "success", True):
err = getattr(send_result, "error", "unknown")
logger.warning(
"Job '%s': live adapter send to %s:%s failed (%s), falling back to standalone",
job["id"], platform_name, chat_id, err,
)
adapter_ok = False # fall through to standalone path
try:
config = load_gateway_config()
except Exception as e:
msg = f"failed to load gateway config: {e}"
logger.error("Job '%s': %s", job["id"], msg)
return msg
# Send extracted media files as native attachments via the live adapter
if adapter_ok and media_files:
_send_media_via_adapter(runtime_adapter, chat_id, media_files, send_metadata, loop, job)
delivery_errors = []
if adapter_ok:
logger.info("Job '%s': delivered to %s:%s via live adapter", job["id"], platform_name, chat_id)
return None
except Exception as e:
for target in targets:
platform_name = target["platform"]
chat_id = target["chat_id"]
thread_id = target.get("thread_id")
# Diagnostic: log thread_id for topic-aware delivery debugging
origin = job.get("origin") or {}
origin_thread = origin.get("thread_id")
if origin_thread and not thread_id:
logger.warning(
"Job '%s': live adapter delivery to %s:%s failed (%s), falling back to standalone",
job["id"], platform_name, chat_id, e,
"Job '%s': origin has thread_id=%s but delivery target lost it "
"(deliver=%s, target=%s)",
job["id"], origin_thread, job.get("deliver", "local"), target,
)
elif thread_id:
logger.debug(
"Job '%s': delivering to %s:%s thread_id=%s",
job["id"], platform_name, chat_id, thread_id,
)
# Standalone path: run the async send in a fresh event loop (safe from any thread)
coro = _send_to_platform(platform, pconfig, chat_id, cleaned_delivery_content, thread_id=thread_id, media_files=media_files)
try:
result = asyncio.run(coro)
except RuntimeError:
# asyncio.run() checks for a running loop before awaiting the coroutine;
# when it raises, the original coro was never started — close it to
# prevent "coroutine was never awaited" RuntimeWarning, then retry in a
# fresh thread that has no running loop.
coro.close()
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
future = pool.submit(asyncio.run, _send_to_platform(platform, pconfig, chat_id, cleaned_delivery_content, thread_id=thread_id, media_files=media_files))
result = future.result(timeout=30)
except Exception as e:
msg = f"delivery to {platform_name}:{chat_id} failed: {e}"
logger.error("Job '%s': %s", job["id"], msg)
return msg
platform = platform_map.get(platform_name.lower())
if not platform:
msg = f"unknown platform '{platform_name}'"
logger.warning("Job '%s': %s", job["id"], msg)
delivery_errors.append(msg)
continue
if result and result.get("error"):
msg = f"delivery error: {result['error']}"
logger.error("Job '%s': %s", job["id"], msg)
return msg
# Prefer the live adapter when the gateway is running — this supports E2EE
# rooms (e.g. Matrix) where the standalone HTTP path cannot encrypt.
runtime_adapter = (adapters or {}).get(platform)
delivered = False
if runtime_adapter is not None and loop is not None and getattr(loop, "is_running", lambda: False)():
send_metadata = {"thread_id": thread_id} if thread_id else None
try:
# Send cleaned text (MEDIA tags stripped) — not the raw content
text_to_send = cleaned_delivery_content.strip()
adapter_ok = True
if text_to_send:
future = asyncio.run_coroutine_threadsafe(
runtime_adapter.send(chat_id, text_to_send, metadata=send_metadata),
loop,
)
send_result = future.result(timeout=60)
if send_result and not getattr(send_result, "success", True):
err = getattr(send_result, "error", "unknown")
logger.warning(
"Job '%s': live adapter send to %s:%s failed (%s), falling back to standalone",
job["id"], platform_name, chat_id, err,
)
adapter_ok = False # fall through to standalone path
logger.info("Job '%s': delivered to %s:%s", job["id"], platform_name, chat_id)
# Send extracted media files as native attachments via the live adapter
if adapter_ok and media_files:
_send_media_via_adapter(runtime_adapter, chat_id, media_files, send_metadata, loop, job)
if adapter_ok:
logger.info("Job '%s': delivered to %s:%s via live adapter", job["id"], platform_name, chat_id)
delivered = True
except Exception as e:
logger.warning(
"Job '%s': live adapter delivery to %s:%s failed (%s), falling back to standalone",
job["id"], platform_name, chat_id, e,
)
if not delivered:
pconfig = config.platforms.get(platform)
if not pconfig or not pconfig.enabled:
msg = f"platform '{platform_name}' not configured/enabled"
logger.warning("Job '%s': %s", job["id"], msg)
delivery_errors.append(msg)
continue
# Standalone path: run the async send in a fresh event loop (safe from any thread)
coro = _send_to_platform(platform, pconfig, chat_id, cleaned_delivery_content, thread_id=thread_id, media_files=media_files)
try:
result = asyncio.run(coro)
except RuntimeError:
# asyncio.run() checks for a running loop before awaiting the coroutine;
# when it raises, the original coro was never started — close it to
# prevent "coroutine was never awaited" RuntimeWarning, then retry in a
# fresh thread that has no running loop.
coro.close()
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
future = pool.submit(asyncio.run, _send_to_platform(platform, pconfig, chat_id, cleaned_delivery_content, thread_id=thread_id, media_files=media_files))
result = future.result(timeout=30)
except Exception as e:
msg = f"delivery to {platform_name}:{chat_id} failed: {e}"
logger.error("Job '%s': %s", job["id"], msg)
delivery_errors.append(msg)
continue
if result and result.get("error"):
msg = f"delivery error: {result['error']}"
logger.error("Job '%s': %s", job["id"], msg)
delivery_errors.append(msg)
continue
logger.info("Job '%s': delivered to %s:%s", job["id"], platform_name, chat_id)
if delivery_errors:
return "; ".join(delivery_errors)
return None
@@ -487,15 +564,53 @@ def _run_job_script(script_path: str) -> tuple[bool, str]:
return False, f"Script execution failed: {exc}"
def _build_job_prompt(job: dict) -> str:
"""Build the effective prompt for a cron job, optionally loading one or more skills first."""
def _parse_wake_gate(script_output: str) -> bool:
"""Parse the last non-empty stdout line of a cron job's pre-check script
as a wake gate.
The convention (ported from nanoclaw #1232): if the last stdout line is
JSON like ``{"wakeAgent": false}``, the agent is skipped entirely no
LLM run, no delivery. Any other output (non-JSON, missing flag, gate
absent, or ``wakeAgent: true``) means wake the agent normally.
Returns True if the agent should wake, False to skip.
"""
if not script_output:
return True
stripped_lines = [line for line in script_output.splitlines() if line.strip()]
if not stripped_lines:
return True
last_line = stripped_lines[-1].strip()
try:
gate = json.loads(last_line)
except (json.JSONDecodeError, ValueError):
return True
if not isinstance(gate, dict):
return True
return gate.get("wakeAgent", True) is not False
def _build_job_prompt(job: dict, prerun_script: Optional[tuple] = None) -> str:
"""Build the effective prompt for a cron job, optionally loading one or more skills first.
Args:
job: The cron job dict.
prerun_script: Optional ``(success, stdout)`` from a script that has
already been executed by the caller (e.g. for a wake-gate check).
When provided, the script is not re-executed and the cached
result is used for prompt injection. When omitted, the script
(if any) runs inline as before.
"""
prompt = job.get("prompt", "")
skills = job.get("skills")
# Run data-collection script if configured, inject output as context.
script_path = job.get("script")
if script_path:
success, script_output = _run_job_script(script_path)
if prerun_script is not None:
success, script_output = prerun_script
else:
success, script_output = _run_job_script(script_path)
if success:
if script_output:
prompt = (
@@ -597,13 +712,41 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
job_id = job["id"]
job_name = job["name"]
prompt = _build_job_prompt(job)
# Wake-gate: if this job has a pre-check script, run it BEFORE building
# the prompt so a ``{"wakeAgent": false}`` response can short-circuit
# the whole agent run. We pass the result into _build_job_prompt so
# the script is only executed once.
prerun_script = None
script_path = job.get("script")
if script_path:
prerun_script = _run_job_script(script_path)
_ran_ok, _script_output = prerun_script
if _ran_ok and not _parse_wake_gate(_script_output):
logger.info(
"Job '%s' (ID: %s): wakeAgent=false, skipping agent run",
job_name, job_id,
)
silent_doc = (
f"# Cron Job: {job_name}\n\n"
f"**Job ID:** {job_id}\n"
f"**Run Time:** {_hermes_now().strftime('%Y-%m-%d %H:%M:%S')}\n\n"
"Script gate returned `wakeAgent=false` — agent skipped.\n"
)
return True, silent_doc, SILENT_MARKER, None
prompt = _build_job_prompt(job, prerun_script=prerun_script)
origin = _resolve_origin(job)
_cron_session_id = f"cron_{job_id}_{_hermes_now().strftime('%Y%m%d_%H%M%S')}"
logger.info("Running job '%s' (ID: %s)", job_name, job_id)
logger.info("Prompt: %s", prompt[:100])
# Mark this as a cron session so the approval system can apply cron_mode.
# This env var is process-wide and persists for the lifetime of the
# scheduler process — every job this process runs is a cron job.
os.environ["HERMES_CRON_SESSION"] = "1"
try:
# Inject origin context so the agent's send_message tool knows the chat.
# Must be INSIDE the try block so the finally cleanup always runs.
@@ -0,0 +1,108 @@
# Ink Gateway TUI Migration — Post-mortem
Planned: 2026-04-01 · Delivered: 2026-04 · Status: shipped, classic (prompt_toolkit) CLI still present
## What Shipped
Three layers, same repo, Python runtime unchanged.
```
ui-tui (Node/TS) ──stdio JSON-RPC──▶ tui_gateway (Py) ──▶ AIAgent (run_agent.py)
```
### Backend — `tui_gateway/`
```
tui_gateway/
├── entry.py # subprocess entrypoint, stdio read/write loop
├── server.py # everything: sessions dict, @method handlers, _emit
├── render.py # stream renderer, diff rendering, message rendering
├── slash_worker.py # subprocess that runs hermes_cli slash commands
└── __init__.py
```
`server.py` owns the full runtime-control surface: session store (`_sessions: dict[str, dict]`), method registry (`@method("…")` decorator), event emitter (`_emit`), agent lifecycle (`_make_agent`, `_init_session`, `_wire_callbacks`), approval/sudo/clarify round-trips, and JSON-RPC dispatch.
Protocol methods (`@method(...)` in `server.py`):
- session: `session.{create, resume, list, close, interrupt, usage, history, compress, branch, title, save, undo}`
- prompt: `prompt.{submit, background, btw}`
- tools: `tools.{list, show, configure}`
- slash: `slash.exec`, `command.{dispatch, resolve}`, `commands.catalog`, `complete.{path, slash}`
- approvals: `approval.respond`, `sudo.respond`, `clarify.respond`, `secret.respond`
- config/state: `config.{get, set, show}`, `model.options`, `reload.mcp`
- ops: `shell.exec`, `cli.exec`, `terminal.resize`, `input.detect_drop`, `clipboard.paste`, `paste.collapse`, `image.attach`, `process.stop`
- misc: `agents.list`, `skills.manage`, `plugins.list`, `cron.manage`, `insights.get`, `rollback.{list, diff, restore}`, `browser.manage`
Protocol events (`_emit(…)` → handled in `ui-tui/src/app/createGatewayEventHandler.ts`):
- lifecycle: `gateway.{ready, stderr}`, `session.info`, `skin.changed`
- stream: `message.{start, delta, complete}`, `thinking.delta`, `reasoning.{delta, available}`, `status.update`
- tools: `tool.{start, progress, complete, generating}`, `subagent.{start, thinking, tool, progress, complete}`
- interactive: `approval.request`, `sudo.request`, `clarify.request`, `secret.request`
- async: `background.complete`, `btw.complete`, `error`
### Frontend — `ui-tui/src/`
```
src/
├── entry.tsx # node bootstrap: bootBanner → spawn python → dynamic-import Ink → render(<App/>)
├── app.tsx # <GatewayProvider> wraps <AppLayout>
├── bootBanner.ts # raw-ANSI banner to stdout in ~2ms, pre-React
├── gatewayClient.ts # JSON-RPC client over child_process stdio
├── gatewayTypes.ts # typed RPC responses + GatewayEvent union
├── theme.ts # DEFAULT_THEME + fromSkin
├── app/ # hooks + stores — the orchestration layer
│ ├── uiStore.ts # nanostore: sid, info, busy, usage, theme, status…
│ ├── turnStore.ts # nanostore: per-turn activity / reasoning / tools
│ ├── turnController.ts # imperative singleton for stream-time operations
│ ├── overlayStore.ts # nanostore: modal/overlay state
│ ├── useMainApp.ts # top-level composition hook
│ ├── useSessionLifecycle.ts # session.create/resume/close/reset
│ ├── useSubmission.ts # shell/slash/prompt dispatch + interpolation
│ ├── useConfigSync.ts # config.get + mtime poll
│ ├── useComposerState.ts # input buffer, paste snippets, editor mode
│ ├── useInputHandlers.ts # key bindings
│ ├── createGatewayEventHandler.ts # event-stream dispatcher
│ ├── createSlashHandler.ts # slash command router (registry + python fallback)
│ └── slash/commands/ # core.ts, ops.ts, session.ts — TS-owned slash commands
├── components/ # AppLayout, AppChrome, AppOverlays, MessageLine, Thinking, Markdown, pickers, prompts, Banner, SessionPanel
├── config/ # env, limits, timing constants
├── content/ # charms, faces, fortunes, hotkeys, placeholders, verbs
├── domain/ # details, messages, paths, roles, slash, usage, viewport
├── protocol/ # interpolation, paste regex
├── hooks/ # useCompletion, useInputHistory, useQueue, useVirtualHistory
└── lib/ # history, messages, osc52, rpc, text
```
### CLI entry points — `hermes_cli/main.py`
- `hermes --tui``node dist/entry.js` (auto-builds when `.ts`/`.tsx` newer than `dist/entry.js`)
- `hermes --tui --dev``tsx src/entry.tsx` (skip build)
- `HERMES_TUI_DIR=…` → external prebuilt dist (nix, distro packaging)
## Diverged From Original Plan
| Plan | Reality | Why |
|---|---|---|
| `tui_gateway/{controller,session_state,events,protocol}.py` | all collapsed into `server.py` | no second consumer ever emerged, keeping one file cheaper than four |
| `ui-tui/src/main.tsx` | split into `entry.tsx` (bootstrap) + `app.tsx` (shell) | boot banner + early python spawn wanted a pre-React moment |
| `ui-tui/src/state/store.ts` | three nanostores (`uiStore`, `turnStore`, `overlayStore`) | separate lifetimes: ui persists, turn resets per reply, overlay is modal |
| `approval.requested` / `sudo.requested` / `clarify.requested` | `*.request` (no `-ed`) | cosmetic |
| `session.cancel` | dropped | `session.interrupt` covers it |
| `HERMES_EXPERIMENTAL_TUI=1`, `display.experimental_tui: true`, `/tui on/off/status` | none shipped | `--tui` went from opt-in to first-class without an experimental phase |
## Post-migration Additions (not in original plan)
- **Async `session.create`** — returns sid in ~1ms, agent builds on a background thread, `session.info` broadcasts when ready; `_wait_agent()` gates every agent-touching handler via `_sess`
- **`bootBanner`** — raw-ANSI logo painted to stdout at T≈2ms, before Ink loads; `<AlternateScreen>` wipes it seamlessly when React mounts
- **Selection uniform bg**`theme.color.selectionBg` wired via `useSelection().setSelectionBgColor`; replaces SGR-inverse per-cell swap that fragmented over amber/gold fg
- **Slash command registry** — TS-owned commands in `app/slash/commands/{core,ops,session}.ts`, everything else falls through to `slash.exec` (python worker)
- **Turn store + controller split** — imperative singleton (`turnController`) holds refs/timers, nanostore (`turnStore`) holds render-visible state
## What's Still Open
- **Classic CLI not deleted.** `cli.py` still has ~80 `prompt_toolkit` references; classic REPL is still the default when `--tui` is absent. The original plan's "Cut 4 · prompt_toolkit removal later" hasn't happened.
- **No config-file opt-in.** `HERMES_EXPERIMENTAL_TUI` and `display.experimental_tui` were never built; only the CLI flag exists. Fine for now — if we want "default to TUI", a single line in `main.py` flips it.
+36 -27
View File
@@ -6,6 +6,11 @@
# All fields are optional — missing values inherit from the default skin.
# Activate with: /skin <name> or display.skin: <name> in config.yaml
#
# Keys are marked:
# (both) — applies to both the classic CLI and the TUI
# (classic) — classic CLI only (see hermes --tui in user-guide/tui.md)
# (tui) — TUI only
#
# See hermes_cli/skin_engine.py for the full schema reference.
# ============================================================================
@@ -14,43 +19,47 @@ name: example
description: An example custom skin — copy and modify this template
# ── Colors ──────────────────────────────────────────────────────────────────
# Hex color values for Rich markup. These control the CLI's visual palette.
# Hex color values. These control the visual palette.
colors:
# Banner panel (the startup welcome box)
# Banner panel (the startup welcome box) — (both)
banner_border: "#CD7F32" # Panel border
banner_title: "#FFD700" # Panel title text
banner_accent: "#FFBF00" # Section headers (Available Tools, Skills, etc.)
banner_dim: "#B8860B" # Dim/muted text (separators, model info)
banner_text: "#FFF8DC" # Body text (tool names, skill names)
# UI elements
ui_accent: "#FFBF00" # General accent color
# UI elements — (both)
ui_accent: "#FFBF00" # General accent (falls back to banner_accent)
ui_label: "#4dd0e1" # Labels
ui_ok: "#4caf50" # Success indicators
ui_error: "#ef5350" # Error indicators
ui_warn: "#ffa726" # Warning indicators
# Input area
prompt: "#FFF8DC" # Prompt text color
input_rule: "#CD7F32" # Horizontal rule around input
prompt: "#FFF8DC" # Prompt text / `` glyph color (both)
input_rule: "#CD7F32" # Horizontal rule above input (classic)
# Response box
response_border: "#FFD700" # Response box border (ANSI color)
# Response box — (classic)
response_border: "#FFD700" # Response box border
# Session display
session_label: "#DAA520" # Session label
session_border: "#8B8682" # Session ID dim color
# Session display — (both)
session_label: "#DAA520" # "Session: " label
session_border: "#8B8682" # Session ID text
# TUI surfaces
status_bar_bg: "#1a1a2e" # Status / usage bar background
voice_status_bg: "#1a1a2e" # Voice-mode badge background
completion_menu_bg: "#1a1a2e" # Completion list background
completion_menu_current_bg: "#333355" # Active completion row background
completion_menu_meta_bg: "#1a1a2e" # Completion meta column background
completion_menu_meta_current_bg: "#333355" # Active completion meta background
# TUI / CLI surfaces — (classic: status bar, voice badge, completion meta)
status_bar_bg: "#1a1a2e" # Status / usage bar background (classic)
voice_status_bg: "#1a1a2e" # Voice-mode badge background (classic)
completion_menu_bg: "#1a1a2e" # Completion list background (both)
completion_menu_current_bg: "#333355" # Active completion row background (both)
completion_menu_meta_bg: "#1a1a2e" # Completion meta column bg (classic)
completion_menu_meta_current_bg: "#333355" # Active meta bg (classic)
# Drag-to-select background — (tui)
selection_bg: "#3a3a55" # Uniform selection highlight in the TUI
# ── Spinner ─────────────────────────────────────────────────────────────────
# Customize the animated spinner shown during API calls and tool execution.
# (classic) — the TUI uses its own animated indicators; spinner config here
# is only read by the classic prompt_toolkit CLI.
spinner:
# Faces shown while waiting for the API response
waiting_faces:
@@ -78,17 +87,17 @@ spinner:
# - ["⟪▲", "▲⟫"]
# ── Branding ────────────────────────────────────────────────────────────────
# Text strings used throughout the CLI interface.
# Text strings used throughout the interface.
branding:
agent_name: "Hermes Agent" # Banner title, about display
welcome: "Welcome! Type your message or /help for commands."
goodbye: "Goodbye! ⚕" # Exit message
response_label: " ⚕ Hermes " # Response box header label
prompt_symbol: " " # Input prompt symbol
help_header: "(^_^)? Available Commands" # /help header text
agent_name: "Hermes Agent" # (both) Banner title, about display
welcome: "Welcome! Type your message or /help for commands." # (both)
goodbye: "Goodbye! ⚕" # (both) Exit message
response_label: " ⚕ Hermes " # (classic) Response box header label
prompt_symbol: " " # (both) Input prompt glyph
help_header: "(^_^)? Available Commands" # (both) /help overlay title
# ── Tool Output ─────────────────────────────────────────────────────────────
# Character used as the prefix for tool output lines.
# Character used as the prefix for tool output lines. (both)
# Default is "┊" (thin dotted vertical line). Some alternatives:
# "╎" (light triple dash vertical)
# "▏" (left one-eighth block)
Generated
+21
View File
@@ -36,6 +36,26 @@
"type": "github"
}
},
"npm-lockfile-fix": {
"inputs": {
"nixpkgs": [
"nixpkgs"
]
},
"locked": {
"lastModified": 1775903712,
"narHash": "sha256-2GV79U6iVH4gKAPWYrxUReB0S41ty/Y3dBLquU8AlaA=",
"owner": "jeslie0",
"repo": "npm-lockfile-fix",
"rev": "c6093acb0c0548e0f9b8b3d82918823721930fe8",
"type": "github"
},
"original": {
"owner": "jeslie0",
"repo": "npm-lockfile-fix",
"type": "github"
}
},
"pyproject-build-systems": {
"inputs": {
"nixpkgs": [
@@ -124,6 +144,7 @@
"inputs": {
"flake-parts": "flake-parts",
"nixpkgs": "nixpkgs",
"npm-lockfile-fix": "npm-lockfile-fix",
"pyproject-build-systems": "pyproject-build-systems",
"pyproject-nix": "pyproject-nix_2",
"uv2nix": "uv2nix_2"
+11 -2
View File
@@ -19,11 +19,20 @@
url = "github:pyproject-nix/build-system-pkgs";
inputs.nixpkgs.follows = "nixpkgs";
};
npm-lockfile-fix = {
url = "github:jeslie0/npm-lockfile-fix";
inputs.nixpkgs.follows = "nixpkgs";
};
};
outputs = inputs:
outputs =
inputs:
inputs.flake-parts.lib.mkFlake { inherit inputs; } {
systems = [ "x86_64-linux" "aarch64-linux" "aarch64-darwin" ];
systems = [
"x86_64-linux"
"aarch64-linux"
"aarch64-darwin"
];
imports = [
./nix/packages.nix
+19 -1
View File
@@ -100,7 +100,7 @@ def build_channel_directory(adapters: Dict[Any, Any]) -> Dict[str, Any]:
def _build_discord(adapter) -> List[Dict[str, str]]:
"""Enumerate all text channels the Discord bot can see."""
"""Enumerate all text channels and forum channels the Discord bot can see."""
channels = []
client = getattr(adapter, "_client", None)
if not client:
@@ -119,6 +119,15 @@ def _build_discord(adapter) -> List[Dict[str, str]]:
"guild": guild.name,
"type": "channel",
})
# Forum channels (type 15) — creating a message auto-spawns a thread post.
forums = getattr(guild, "forum_channels", None) or []
for ch in forums:
channels.append({
"id": str(ch.id),
"name": ch.name,
"guild": guild.name,
"type": "forum",
})
# Also include DM-capable users we've interacted with is not
# feasible via guild enumeration; those come from sessions.
@@ -191,6 +200,15 @@ def load_directory() -> Dict[str, Any]:
return {"updated_at": None, "platforms": {}}
def lookup_channel_type(platform_name: str, chat_id: str) -> Optional[str]:
"""Return the channel ``type`` string (e.g. ``"channel"``, ``"forum"``) for *chat_id*, or *None* if unknown."""
directory = load_directory()
for ch in directory.get("platforms", {}).get(platform_name, []):
if ch.get("id") == chat_id:
return ch.get("type")
return None
def resolve_channel_name(platform_name: str, name: str) -> Optional[str]:
"""
Resolve a human-friendly channel name to a numeric ID.
+89 -2
View File
@@ -258,6 +258,13 @@ class GatewayConfig:
# Streaming configuration
streaming: StreamingConfig = field(default_factory=StreamingConfig)
# Session store pruning: drop SessionEntry records older than this many
# days from the in-memory dict and sessions.json. Keeps the store from
# growing unbounded in gateways serving many chats/threads/users over
# months. Pruning is invisible to users — if they resume, they get a
# fresh session exactly as if the reset policy had fired. 0 = disabled.
session_store_max_age_days: int = 90
def get_connected_platforms(self) -> List[Platform]:
"""Return list of platforms that are enabled and configured."""
connected = []
@@ -307,6 +314,14 @@ class GatewayConfig:
# QQBot uses extra dict for app credentials
elif platform == Platform.QQBOT and config.extra.get("app_id") and config.extra.get("client_secret"):
connected.append(platform)
# DingTalk uses client_id/client_secret from config.extra or env vars
elif platform == Platform.DINGTALK and (
config.extra.get("client_id") or os.getenv("DINGTALK_CLIENT_ID")
) and (
config.extra.get("client_secret") or os.getenv("DINGTALK_CLIENT_SECRET")
):
connected.append(platform)
return connected
def get_home_channel(self, platform: Platform) -> Optional[HomeChannel]:
@@ -357,6 +372,7 @@ class GatewayConfig:
"thread_sessions_per_user": self.thread_sessions_per_user,
"unauthorized_dm_behavior": self.unauthorized_dm_behavior,
"streaming": self.streaming.to_dict(),
"session_store_max_age_days": self.session_store_max_age_days,
}
@classmethod
@@ -404,6 +420,13 @@ class GatewayConfig:
"pair",
)
try:
session_store_max_age_days = int(data.get("session_store_max_age_days", 90))
if session_store_max_age_days < 0:
session_store_max_age_days = 0
except (TypeError, ValueError):
session_store_max_age_days = 90
return cls(
platforms=platforms,
default_reset_policy=default_policy,
@@ -418,6 +441,7 @@ class GatewayConfig:
thread_sessions_per_user=_coerce_bool(thread_sessions_per_user, False),
unauthorized_dm_behavior=unauthorized_dm_behavior,
streaming=StreamingConfig.from_dict(data.get("streaming", {})),
session_store_max_age_days=session_store_max_age_days,
)
def get_unauthorized_dm_behavior(self, platform: Optional[Platform] = None) -> str:
@@ -617,6 +641,20 @@ def load_gateway_config() -> GatewayConfig:
if isinstance(ntc, list):
ntc = ",".join(str(v) for v in ntc)
os.environ["DISCORD_NO_THREAD_CHANNELS"] = str(ntc)
# allow_mentions: granular control over what the bot can ping.
# Safe defaults (no @everyone/roles) are applied in the adapter;
# these YAML keys only override when set and let users opt back
# into unsafe modes (e.g. roles=true) if they actually want it.
allow_mentions_cfg = discord_cfg.get("allow_mentions")
if isinstance(allow_mentions_cfg, dict):
for yaml_key, env_key in (
("everyone", "DISCORD_ALLOW_MENTION_EVERYONE"),
("roles", "DISCORD_ALLOW_MENTION_ROLES"),
("users", "DISCORD_ALLOW_MENTION_USERS"),
("replied_user", "DISCORD_ALLOW_MENTION_REPLIED_USER"),
):
if yaml_key in allow_mentions_cfg and not os.getenv(env_key):
os.environ[env_key] = str(allow_mentions_cfg[yaml_key]).lower()
# Telegram settings → env vars (env vars take precedence)
telegram_cfg = yaml_cfg.get("telegram", {})
@@ -663,6 +701,24 @@ def load_gateway_config() -> GatewayConfig:
frc = ",".join(str(v) for v in frc)
os.environ["WHATSAPP_FREE_RESPONSE_CHATS"] = str(frc)
# DingTalk settings → env vars (env vars take precedence)
dingtalk_cfg = yaml_cfg.get("dingtalk", {})
if isinstance(dingtalk_cfg, dict):
if "require_mention" in dingtalk_cfg and not os.getenv("DINGTALK_REQUIRE_MENTION"):
os.environ["DINGTALK_REQUIRE_MENTION"] = str(dingtalk_cfg["require_mention"]).lower()
if "mention_patterns" in dingtalk_cfg and not os.getenv("DINGTALK_MENTION_PATTERNS"):
os.environ["DINGTALK_MENTION_PATTERNS"] = json.dumps(dingtalk_cfg["mention_patterns"])
frc = dingtalk_cfg.get("free_response_chats")
if frc is not None and not os.getenv("DINGTALK_FREE_RESPONSE_CHATS"):
if isinstance(frc, list):
frc = ",".join(str(v) for v in frc)
os.environ["DINGTALK_FREE_RESPONSE_CHATS"] = str(frc)
allowed = dingtalk_cfg.get("allowed_users")
if allowed is not None and not os.getenv("DINGTALK_ALLOWED_USERS"):
if isinstance(allowed, list):
allowed = ",".join(str(v) for v in allowed)
os.environ["DINGTALK_ALLOWED_USERS"] = str(allowed)
# Matrix settings → env vars (env vars take precedence)
matrix_cfg = yaml_cfg.get("matrix", {})
if isinstance(matrix_cfg, dict):
@@ -1006,6 +1062,25 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
if webhook_secret:
config.platforms[Platform.WEBHOOK].extra["secret"] = webhook_secret
# DingTalk
dingtalk_client_id = os.getenv("DINGTALK_CLIENT_ID")
dingtalk_client_secret = os.getenv("DINGTALK_CLIENT_SECRET")
if dingtalk_client_id and dingtalk_client_secret:
if Platform.DINGTALK not in config.platforms:
config.platforms[Platform.DINGTALK] = PlatformConfig()
config.platforms[Platform.DINGTALK].enabled = True
config.platforms[Platform.DINGTALK].extra.update({
"client_id": dingtalk_client_id,
"client_secret": dingtalk_client_secret,
})
dingtalk_home = os.getenv("DINGTALK_HOME_CHANNEL")
if dingtalk_home:
config.platforms[Platform.DINGTALK].home_channel = HomeChannel(
platform=Platform.DINGTALK,
chat_id=dingtalk_home,
name=os.getenv("DINGTALK_HOME_CHANNEL_NAME", "Home"),
)
# Feishu / Lark
feishu_app_id = os.getenv("FEISHU_APP_ID")
feishu_app_secret = os.getenv("FEISHU_APP_SECRET")
@@ -1154,12 +1229,24 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
qq_group_allowed = os.getenv("QQ_GROUP_ALLOWED_USERS", "").strip()
if qq_group_allowed:
extra["group_allow_from"] = qq_group_allowed
qq_home = os.getenv("QQ_HOME_CHANNEL", "").strip()
qq_home = os.getenv("QQBOT_HOME_CHANNEL", "").strip()
qq_home_name_env = "QQBOT_HOME_CHANNEL_NAME"
if not qq_home:
# Back-compat: accept the pre-rename name and log a one-time warning.
legacy_home = os.getenv("QQ_HOME_CHANNEL", "").strip()
if legacy_home:
qq_home = legacy_home
qq_home_name_env = "QQ_HOME_CHANNEL_NAME"
import logging
logging.getLogger(__name__).warning(
"QQ_HOME_CHANNEL is deprecated; rename to QQBOT_HOME_CHANNEL "
"in your .env for consistency with the platform key."
)
if qq_home:
config.platforms[Platform.QQBOT].home_channel = HomeChannel(
platform=Platform.QQBOT,
chat_id=qq_home,
name=os.getenv("QQ_HOME_CHANNEL_NAME", "Home"),
name=os.getenv("QQBOT_HOME_CHANNEL_NAME") or os.getenv(qq_home_name_env, "Home"),
)
# Session settings
+97 -9
View File
@@ -669,6 +669,15 @@ class MessageEvent:
# Original platform data
raw_message: Any = None
message_id: Optional[str] = None
# Platform-specific update identifier. For Telegram this is the
# ``update_id`` from the PTB Update wrapper; other platforms currently
# ignore it. Used by ``/restart`` to record the triggering update so the
# new gateway can advance the Telegram offset past it and avoid processing
# the same ``/restart`` twice if PTB's graceful-shutdown ACK times out
# ("Error while calling `get_updates` one more time to mark all fetched
# updates" in gateway.log).
platform_update_id: Optional[int] = None
# Media attachments
# media_urls: local file paths (for vision tool access)
@@ -1045,16 +1054,40 @@ class BasePlatformAdapter(ABC):
"""
pass
# Default: the adapter treats ``finalize=True`` on edit_message as a
# no-op and is happy to have the stream consumer skip redundant final
# edits. Subclasses that *require* an explicit finalize call to close
# out the message lifecycle (e.g. rich card / AI assistant surfaces
# such as DingTalk AI Cards) override this to True (class attribute or
# property) so the stream consumer knows not to short-circuit.
REQUIRES_EDIT_FINALIZE: bool = False
async def edit_message(
self,
chat_id: str,
message_id: str,
content: str,
*,
finalize: bool = False,
) -> SendResult:
"""
Edit a previously sent message. Optional platforms that don't
support editing return success=False and callers fall back to
sending a new message.
``finalize`` signals that this is the last edit in a streaming
sequence. Most platforms (Telegram, Slack, Discord, Matrix,
etc.) treat it as a no-op because their edit APIs have no notion
of message lifecycle state an edit is an edit. Platforms that
render streaming updates with a distinct "in progress" state and
require explicit closure (e.g. rich card / AI assistant surfaces
such as DingTalk AI Cards) use it to finalize the message and
transition the UI out of the streaming indicator those should
also set ``REQUIRES_EDIT_FINALIZE = True`` so callers route a
final edit through even when content is unchanged. Callers
should set ``finalize=True`` on the final edit of a streamed
response (typically when ``got_done`` fires in the stream
consumer) and leave it ``False`` on intermediate edits.
"""
return SendResult(success=False, error="Not supported")
@@ -1579,7 +1612,9 @@ class BasePlatformAdapter(ABC):
# session lifecycle and its cleanup races with the running task
# (see PR #4926).
cmd = event.get_command()
if cmd in ("approve", "deny", "status", "stop", "new", "reset", "background", "restart", "queue", "q"):
from hermes_cli.commands import should_bypass_active_session
if should_bypass_active_session(cmd):
logger.debug(
"[%s] Command '/%s' bypassing active-session guard for %s",
self.name, cmd, session_key,
@@ -1891,9 +1926,18 @@ class BasePlatformAdapter(ABC):
if session_key in self._pending_messages:
pending_event = self._pending_messages.pop(session_key)
logger.debug("[%s] Processing queued message from interrupt", self.name)
# Clean up current session before processing pending
if session_key in self._active_sessions:
del self._active_sessions[session_key]
# Keep the _active_sessions entry live across the turn chain
# and only CLEAR the interrupt Event — do NOT delete the entry.
# If we deleted here, a concurrent inbound message arriving
# during the awaits below would pass the Level-1 guard, spawn
# its own _process_message_background, and run simultaneously
# with the recursive drain below. Two agents on one
# session_key = duplicate responses, duplicate tool calls.
# Clearing the Event keeps the guard live so follow-ups take
# the busy-handler path (queue + interrupt) as intended.
_active = self._active_sessions.get(session_key)
if _active is not None:
_active.clear()
typing_task.cancel()
try:
await typing_task
@@ -1951,6 +1995,34 @@ class BasePlatformAdapter(ABC):
await self.stop_typing(event.source.chat_id)
except Exception:
pass
# Late-arrival drain: a message may have arrived during the
# cleanup awaits above (typing_task cancel, stop_typing). Such
# messages passed the Level-1 guard (entry still live, Event
# possibly set) and landed in _pending_messages via the
# busy-handler path. Without this block, we would delete the
# active-session entry and the queued message would be silently
# dropped (user never gets a reply).
late_pending = self._pending_messages.pop(session_key, None)
if late_pending is not None:
logger.debug(
"[%s] Late-arrival pending message during cleanup — spawning drain task",
self.name,
)
_active = self._active_sessions.get(session_key)
if _active is not None:
_active.clear()
drain_task = asyncio.create_task(
self._process_message_background(late_pending, session_key)
)
try:
self._background_tasks.add(drain_task)
drain_task.add_done_callback(self._background_tasks.discard)
except TypeError:
# Tests stub create_task() with non-hashable sentinels; tolerate.
pass
# Leave _active_sessions[session_key] populated — the drain
# task's own lifecycle will clean it up.
return
# Clean up session tracking
if session_key in self._active_sessions:
del self._active_sessions[session_key]
@@ -1961,12 +2033,26 @@ class BasePlatformAdapter(ABC):
Used during gateway shutdown/replacement so active sessions from the old
process do not keep running after adapters are being torn down.
"""
tasks = [task for task in self._background_tasks if not task.done()]
for task in tasks:
self._expected_cancelled_tasks.add(task)
task.cancel()
if tasks:
# Loop until no new tasks appear. Without this, a message
# arriving during the `await asyncio.gather` below would spawn
# a fresh _process_message_background task (added to
# self._background_tasks at line ~1668 via handle_message),
# and the _background_tasks.clear() at the end of this method
# would drop the reference — the task runs untracked against a
# disconnecting adapter, logs send-failures, and may linger
# until it completes on its own. Retrying the drain until the
# task set stabilizes closes the window.
MAX_DRAIN_ROUNDS = 5
for _ in range(MAX_DRAIN_ROUNDS):
tasks = [task for task in self._background_tasks if not task.done()]
if not tasks:
break
for task in tasks:
self._expected_cancelled_tasks.add(task)
task.cancel()
await asyncio.gather(*tasks, return_exceptions=True)
# Loop: late-arrival tasks spawned during the gather above
# will be in self._background_tasks now. Re-check.
self._background_tasks.clear()
self._expected_cancelled_tasks.clear()
self._pending_messages.clear()
@@ -1991,6 +2077,7 @@ class BasePlatformAdapter(ABC):
chat_topic: Optional[str] = None,
user_id_alt: Optional[str] = None,
chat_id_alt: Optional[str] = None,
is_bot: bool = False,
) -> SessionSource:
"""Helper to build a SessionSource for this platform."""
# Normalize empty topic to None
@@ -2007,6 +2094,7 @@ class BasePlatformAdapter(ABC):
chat_topic=chat_topic.strip() if chat_topic else None,
user_id_alt=user_id_alt,
chat_id_alt=chat_id_alt,
is_bot=is_bot,
)
@abstractmethod
File diff suppressed because it is too large Load Diff
+580 -92
View File
@@ -51,7 +51,9 @@ from gateway.platforms.base import (
ProcessingOutcome,
SendResult,
cache_image_from_url,
cache_image_from_bytes,
cache_audio_from_url,
cache_audio_from_bytes,
cache_document_from_bytes,
SUPPORTED_DOCUMENT_TYPES,
)
@@ -80,6 +82,41 @@ def check_discord_requirements() -> bool:
return DISCORD_AVAILABLE
def _build_allowed_mentions():
"""Build Discord ``AllowedMentions`` with safe defaults, overridable via env.
Discord bots default to parsing ``@everyone``, ``@here``, role pings, and
user pings when ``allowed_mentions`` is unset on the client any LLM
output or echoed user content that contains ``@everyone`` would therefore
ping the whole server. We explicitly deny ``@everyone`` and role pings
by default and keep user / replied-user pings enabled so normal
conversation still works.
Override via environment variables (or ``discord.allow_mentions.*`` in
config.yaml):
DISCORD_ALLOW_MENTION_EVERYONE default false @everyone + @here
DISCORD_ALLOW_MENTION_ROLES default false @role pings
DISCORD_ALLOW_MENTION_USERS default true @user pings
DISCORD_ALLOW_MENTION_REPLIED_USER default true reply-ping author
"""
if not DISCORD_AVAILABLE:
return None
def _b(name: str, default: bool) -> bool:
raw = os.getenv(name, "").strip().lower()
if not raw:
return default
return raw in ("true", "1", "yes", "on")
return discord.AllowedMentions(
everyone=_b("DISCORD_ALLOW_MENTION_EVERYONE", False),
roles=_b("DISCORD_ALLOW_MENTION_ROLES", False),
users=_b("DISCORD_ALLOW_MENTION_USERS", True),
replied_user=_b("DISCORD_ALLOW_MENTION_REPLIED_USER", True),
)
class VoiceReceiver:
"""Captures and decodes voice audio from a Discord voice channel.
@@ -458,6 +495,7 @@ class DiscordAdapter(BasePlatformAdapter):
self._client: Optional[commands.Bot] = None
self._ready_event = asyncio.Event()
self._allowed_user_ids: set = set() # For button approval authorization
self._allowed_role_ids: set = set() # For DISCORD_ALLOWED_ROLES filtering
# Voice channel state (per-guild)
self._voice_clients: Dict[int, Any] = {} # guild_id -> VoiceClient
# Text batching: merge rapid successive messages (Telegram-style)
@@ -536,6 +574,15 @@ class DiscordAdapter(BasePlatformAdapter):
if uid.strip()
}
# Parse DISCORD_ALLOWED_ROLES — comma-separated role IDs.
# Users with ANY of these roles can interact with the bot.
roles_env = os.getenv("DISCORD_ALLOWED_ROLES", "")
if roles_env:
self._allowed_role_ids = {
int(rid.strip()) for rid in roles_env.split(",")
if rid.strip().isdigit()
}
# Set up intents.
# Message Content is required for normal text replies.
# Server Members is only needed when the allowlist contains usernames
@@ -547,7 +594,10 @@ class DiscordAdapter(BasePlatformAdapter):
intents.message_content = True
intents.dm_messages = True
intents.guild_messages = True
intents.members = any(not entry.isdigit() for entry in self._allowed_user_ids)
intents.members = (
any(not entry.isdigit() for entry in self._allowed_user_ids)
or bool(self._allowed_role_ids) # Need members intent for role lookup
)
intents.voice_states = True
# Resolve proxy (DISCORD_PROXY > generic env vars > macOS system proxy)
@@ -556,10 +606,15 @@ class DiscordAdapter(BasePlatformAdapter):
if proxy_url:
logger.info("[%s] Using proxy for Discord: %s", self.name, proxy_url)
# Create bot — proxy= for HTTP, connector= for SOCKS
# Create bot — proxy= for HTTP, connector= for SOCKS.
# allowed_mentions is set with safe defaults (no @everyone/roles)
# so LLM output or echoed user content can't ping the whole
# server; override per DISCORD_ALLOW_MENTION_* env vars or the
# discord.allow_mentions.* block in config.yaml.
self._client = commands.Bot(
command_prefix="!", # Not really used, we handle raw messages
intents=intents,
allowed_mentions=_build_allowed_mentions(),
**proxy_kwargs_for_bot(proxy_url),
)
adapter_self = self # capture for closure
@@ -594,14 +649,13 @@ class DiscordAdapter(BasePlatformAdapter):
if message.type not in (discord.MessageType.default, discord.MessageType.reply):
return
# Check if the message author is in the allowed user list
if not self._is_allowed_user(str(message.author.id)):
return
# Bot message filtering (DISCORD_ALLOW_BOTS):
# "none" — ignore all other bots (default)
# "mentions" — accept bot messages only when they @mention us
# "all" — accept all bot messages
# Must run BEFORE the user allowlist check so that bots
# permitted by DISCORD_ALLOW_BOTS are not rejected for
# not being in DISCORD_ALLOWED_USERS (fixes #4466).
if getattr(message.author, "bot", False):
allow_bots = os.getenv("DISCORD_ALLOW_BOTS", "none").lower().strip()
if allow_bots == "none":
@@ -609,7 +663,12 @@ class DiscordAdapter(BasePlatformAdapter):
elif allow_bots == "mentions":
if not self._client.user or self._client.user not in message.mentions:
return
# "all" falls through to handle_message
# "all" falls through; bot is permitted — skip the
# human-user allowlist below (bots aren't in it).
else:
# Non-bot: enforce the configured user/role allowlists.
if not self._is_allowed_user(str(message.author.id), message.author):
return
# Multi-agent filtering: if the message mentions specific bots
# but NOT this bot, the sender is talking to another agent —
@@ -798,6 +857,9 @@ class DiscordAdapter(BasePlatformAdapter):
When metadata contains a thread_id, the message is sent to that
thread instead of the parent channel identified by chat_id.
Forum channels (type 15) reject direct messages a thread post is
created automatically.
"""
if not self._client:
return SendResult(success=False, error="Not connected")
@@ -823,6 +885,10 @@ class DiscordAdapter(BasePlatformAdapter):
if not channel:
return SendResult(success=False, error=f"Channel {chat_id} not found")
# Forum channels reject channel.send() — create a thread post instead.
if self._is_forum_parent(channel):
return await self._send_to_forum(channel, content)
# Format and split message if needed
formatted = self.format_message(content)
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
@@ -833,7 +899,10 @@ class DiscordAdapter(BasePlatformAdapter):
if reply_to and self._reply_to_mode != "off":
try:
ref_msg = await channel.fetch_message(int(reply_to))
reference = ref_msg
if hasattr(ref_msg, "to_reference"):
reference = ref_msg.to_reference(fail_if_not_exists=False)
else:
reference = ref_msg
except Exception as e:
logger.debug("Could not fetch reply-to message: %s", e)
@@ -851,14 +920,20 @@ class DiscordAdapter(BasePlatformAdapter):
err_text = str(e)
if (
chunk_reference is not None
and "error code: 50035" in err_text
and "Cannot reply to a system message" in err_text
and (
(
"error code: 50035" in err_text
and "Cannot reply to a system message" in err_text
)
or "error code: 10008" in err_text
)
):
logger.warning(
"[%s] Reply target %s is a Discord system message; retrying send without reply reference",
"[%s] Reply target %s rejected the reply reference; retrying send without reply reference",
self.name,
reply_to,
)
reference = None
msg = await channel.send(
content=chunk,
reference=None,
@@ -877,6 +952,120 @@ class DiscordAdapter(BasePlatformAdapter):
logger.error("[%s] Failed to send Discord message: %s", self.name, e, exc_info=True)
return SendResult(success=False, error=str(e))
async def _send_to_forum(self, forum_channel: Any, content: str) -> SendResult:
"""Create a thread post in a forum channel with the message as starter content.
Forum channels (type 15) don't support direct messages. Instead we
POST to /channels/{forum_id}/threads with a thread name derived from
the first line of the message. Any follow-up chunk failures are
reported in ``raw_response['warnings']`` so the caller can surface
partial-send issues.
"""
from tools.send_message_tool import _derive_forum_thread_name
formatted = self.format_message(content)
chunks = self.truncate_message(formatted, self.MAX_MESSAGE_LENGTH)
thread_name = _derive_forum_thread_name(content)
starter_content = chunks[0] if chunks else thread_name
try:
thread = await forum_channel.create_thread(
name=thread_name,
content=starter_content,
)
except Exception as e:
logger.error("[%s] Failed to create forum thread in %s: %s", self.name, forum_channel.id, e)
return SendResult(success=False, error=f"Forum thread creation failed: {e}")
thread_channel = thread if hasattr(thread, "send") else getattr(thread, "thread", None)
thread_id = str(getattr(thread_channel, "id", getattr(thread, "id", "")))
starter_msg = getattr(thread, "message", None)
message_id = str(getattr(starter_msg, "id", thread_id)) if starter_msg else thread_id
# Send remaining chunks into the newly created thread. Track any
# per-chunk failures so the caller sees partial-send outcomes.
message_ids = [message_id]
warnings: list[str] = []
for chunk in chunks[1:]:
try:
msg = await thread_channel.send(content=chunk)
message_ids.append(str(msg.id))
except Exception as e:
warning = f"Failed to send follow-up chunk to forum thread {thread_id}: {e}"
logger.warning("[%s] %s", self.name, warning)
warnings.append(warning)
raw_response: Dict[str, Any] = {"message_ids": message_ids, "thread_id": thread_id}
if warnings:
raw_response["warnings"] = warnings
return SendResult(
success=True,
message_id=message_ids[0],
raw_response=raw_response,
)
async def _forum_post_file(
self,
forum_channel: Any,
*,
thread_name: Optional[str] = None,
content: str = "",
file: Any = None,
files: Optional[list] = None,
) -> SendResult:
"""Create a forum thread whose starter message carries file attachments.
Used by the send_voice / send_image_file / send_document paths when
the target channel is a forum (type 15). ``create_thread`` on a
ForumChannel accepts the same file/files/content kwargs as
``channel.send``, creating the thread and starter message atomically.
"""
from tools.send_message_tool import _derive_forum_thread_name
if not thread_name:
# Prefer the text content, fall back to the first attached
# filename, fall back to the generic default.
hint = content or ""
if not hint.strip():
if file is not None:
hint = getattr(file, "filename", "") or ""
elif files:
hint = getattr(files[0], "filename", "") or ""
thread_name = _derive_forum_thread_name(hint) if hint.strip() else "New Post"
kwargs: Dict[str, Any] = {"name": thread_name}
if content:
kwargs["content"] = content
if file is not None:
kwargs["file"] = file
if files:
kwargs["files"] = files
try:
thread = await forum_channel.create_thread(**kwargs)
except Exception as e:
logger.error(
"[%s] Failed to create forum thread with file in %s: %s",
self.name,
getattr(forum_channel, "id", "?"),
e,
)
return SendResult(success=False, error=f"Forum thread creation failed: {e}")
thread_channel = thread if hasattr(thread, "send") else getattr(thread, "thread", None)
thread_id = str(getattr(thread_channel, "id", getattr(thread, "id", "")))
starter_msg = getattr(thread, "message", None)
message_id = str(getattr(starter_msg, "id", thread_id)) if starter_msg else thread_id
return SendResult(
success=True,
message_id=message_id,
raw_response={"thread_id": thread_id},
)
async def edit_message(
self,
chat_id: str,
@@ -907,7 +1096,11 @@ class DiscordAdapter(BasePlatformAdapter):
caption: Optional[str] = None,
file_name: Optional[str] = None,
) -> SendResult:
"""Send a local file as a Discord attachment."""
"""Send a local file as a Discord attachment.
Forum channels (type 15) get a new thread whose starter message
carries the file they reject direct POST /messages.
"""
if not self._client:
return SendResult(success=False, error="Not connected")
@@ -920,6 +1113,12 @@ class DiscordAdapter(BasePlatformAdapter):
filename = file_name or os.path.basename(file_path)
with open(file_path, "rb") as fh:
file = discord.File(fh, filename=filename)
if self._is_forum_parent(channel):
return await self._forum_post_file(
channel,
content=(caption or "").strip(),
file=file,
)
msg = await channel.send(content=caption if caption else None, file=file)
return SendResult(success=True, message_id=str(msg.id))
@@ -968,6 +1167,18 @@ class DiscordAdapter(BasePlatformAdapter):
with open(audio_path, "rb") as f:
file_data = f.read()
# Forum channels (type 15) reject direct POST /messages — the
# native voice flag path also targets /messages so it would fail
# too. Create a thread post with the audio as the starter
# attachment instead.
if self._is_forum_parent(channel):
forum_file = discord.File(io.BytesIO(file_data), filename=filename)
return await self._forum_post_file(
channel,
content=(caption or "").strip(),
file=forum_file,
)
# Try sending as a native voice message via raw API (flags=8192).
try:
import base64
@@ -1310,11 +1521,48 @@ class DiscordAdapter(BasePlatformAdapter):
except OSError:
pass
def _is_allowed_user(self, user_id: str) -> bool:
"""Check if user is in DISCORD_ALLOWED_USERS."""
if not self._allowed_user_ids:
def _is_allowed_user(self, user_id: str, author=None) -> bool:
"""Check if user is allowed via DISCORD_ALLOWED_USERS or DISCORD_ALLOWED_ROLES.
Uses OR semantics: if the user matches EITHER allowlist, they're allowed.
If both allowlists are empty, everyone is allowed (backwards compatible).
When author is a Member, checks .roles directly; otherwise falls back
to scanning the bot's mutual guilds for a Member record.
"""
# ``getattr`` fallbacks here guard against test fixtures that build
# an adapter via ``object.__new__(DiscordAdapter)`` and skip __init__
# (see AGENTS.md pitfall #17 — same pattern as gateway.run).
allowed_users = getattr(self, "_allowed_user_ids", set())
allowed_roles = getattr(self, "_allowed_role_ids", set())
has_users = bool(allowed_users)
has_roles = bool(allowed_roles)
if not has_users and not has_roles:
return True
return user_id in self._allowed_user_ids
# Check user ID allowlist
if has_users and user_id in allowed_users:
return True
# Check role allowlist
if has_roles:
# Try direct role check from Member object
direct_roles = getattr(author, "roles", None) if author is not None else None
if direct_roles:
if any(getattr(r, "id", None) in allowed_roles for r in direct_roles):
return True
# Fallback: scan mutual guilds for member's roles
if self._client is not None:
try:
uid_int = int(user_id)
except (TypeError, ValueError):
uid_int = None
if uid_int is not None:
for guild in self._client.guilds:
m = guild.get_member(uid_int)
if m is None:
continue
m_roles = getattr(m, "roles", None) or []
if any(getattr(r, "id", None) in allowed_roles for r in m_roles):
return True
return False
async def send_image_file(
self,
@@ -1383,6 +1631,13 @@ class DiscordAdapter(BasePlatformAdapter):
import io
file = discord.File(io.BytesIO(image_data), filename=f"image.{ext}")
if self._is_forum_parent(channel):
return await self._forum_post_file(
channel,
content=(caption or "").strip(),
file=file,
)
msg = await channel.send(
content=caption if caption else None,
file=file,
@@ -1445,6 +1700,13 @@ class DiscordAdapter(BasePlatformAdapter):
import io
file = discord.File(io.BytesIO(animation_data), filename="animation.gif")
if self._is_forum_parent(channel):
return await self._forum_post_file(
channel,
content=(caption or "").strip(),
file=file,
)
msg = await channel.send(
content=caption if caption else None,
file=file,
@@ -1671,6 +1933,24 @@ class DiscordAdapter(BasePlatformAdapter):
the "thinking..." indicator is replaced with that text; otherwise it
is deleted so the channel isn't cluttered.
"""
# Log the invoker so ghost-command reports can be triaged. Discord
# native slash invocations are always user-initiated (no bot can fire
# them), but mobile autocomplete / keyboard shortcuts / other users
# in the same channel are easy to miss in post-mortems.
try:
_user = interaction.user
_chan_id = getattr(interaction.channel, "id", None) or getattr(interaction, "channel_id", None)
logger.info(
"[Discord] slash '%s' invoked by user=%s id=%s channel=%s guild=%s",
command_text,
getattr(_user, "name", "?"),
getattr(_user, "id", "?"),
_chan_id,
getattr(interaction, "guild_id", None),
)
except Exception:
pass # logging must never block command dispatch
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, command_text)
await self.handle_message(event)
@@ -1732,6 +2012,11 @@ class DiscordAdapter(BasePlatformAdapter):
async def slash_stop(interaction: discord.Interaction):
await self._run_simple_slash(interaction, "/stop", "Stop requested~")
@tree.command(name="steer", description="Inject a message after the next tool call (no interrupt)")
@discord.app_commands.describe(prompt="Text to inject into the agent's next tool result")
async def slash_steer(interaction: discord.Interaction, prompt: str):
await self._run_simple_slash(interaction, f"/steer {prompt}".strip())
@tree.command(name="compress", description="Compress conversation context")
async def slash_compress(interaction: discord.Interaction):
await self._run_simple_slash(interaction, "/compress")
@@ -1904,12 +2189,23 @@ class DiscordAdapter(BasePlatformAdapter):
self._register_skill_group(tree)
def _register_skill_group(self, tree) -> None:
"""Register a ``/skill`` command group with category subcommand groups.
"""Register a single ``/skill`` command with autocomplete on the name.
Skills are organized by their directory category under ``SKILLS_DIR``.
Each category becomes a subcommand group; root-level skills become
direct subcommands. Discord supports 25 subcommand groups × 25
subcommands each = 625 skills well beyond the old 100-command cap.
Discord enforces an ~8000-byte per-command payload limit. The older
nested layout (``/skill <category> <name>``) registered one giant
command whose serialized payload grew linearly with the skill
catalog with the default ~75 skills the payload was ~14 KB and
``tree.sync()`` rejected the entire slash-command batch (issues
#11321, #10259, #11385, #10261, #10214).
Autocomplete options are fetched dynamically by Discord when the
user types they do NOT count against the per-command registration
budget. So we register ONE flat ``/skill`` command with
``name: str`` (autocompleted) and ``args: str = ""``. This scales
to thousands of skills with no size math, no splitting, and no
hidden skills. The slash picker also becomes more discoverable
Discord live-filters by the user's typed prefix against both the
skill name and its description.
"""
try:
from hermes_cli.commands import discord_skill_commands_by_category
@@ -1920,68 +2216,97 @@ class DiscordAdapter(BasePlatformAdapter):
except Exception:
pass
# Reuse the existing collector for consistent filtering
# (per-platform disabled, hub-excluded, name clamping), then
# flatten — the category grouping was only useful for the
# nested layout.
categories, uncategorized, hidden = discord_skill_commands_by_category(
reserved_names=existing_names,
)
entries: list[tuple[str, str, str]] = list(uncategorized)
for cat_skills in categories.values():
entries.extend(cat_skills)
if not categories and not uncategorized:
if not entries:
return
skill_group = discord.app_commands.Group(
# Stable alphabetical order so the autocomplete suggestion
# list is predictable across restarts.
entries.sort(key=lambda t: t[0])
# name -> (description, cmd_key) — used by both the autocomplete
# callback and the handler for O(1) dispatch.
skill_lookup: dict[str, tuple[str, str]] = {
n: (d, k) for n, d, k in entries
}
async def _autocomplete_name(
interaction: "discord.Interaction", current: str,
) -> list:
"""Filter skills by the user's typed prefix.
Matches both the skill name and its description so
"/skill pdf" surfaces skills whose description mentions
PDFs even if the name doesn't. Discord caps this list at
25 entries per query.
"""
q = (current or "").strip().lower()
choices: list = []
for name, desc, _key in entries:
if not q or q in name.lower() or (desc and q in desc.lower()):
if desc:
label = f"{name}{desc}"
else:
label = name
# Discord's Choice.name is capped at 100 chars.
if len(label) > 100:
label = label[:97] + "..."
choices.append(
discord.app_commands.Choice(name=label, value=name)
)
if len(choices) >= 25:
break
return choices
@discord.app_commands.describe(
name="Which skill to run",
args="Optional arguments for the skill",
)
@discord.app_commands.autocomplete(name=_autocomplete_name)
async def _skill_handler(
interaction: "discord.Interaction", name: str, args: str = "",
):
entry = skill_lookup.get(name)
if not entry:
await interaction.response.send_message(
f"Unknown skill: `{name}`. Start typing for "
f"autocomplete suggestions.",
ephemeral=True,
)
return
_desc, cmd_key = entry
await self._run_simple_slash(
interaction, f"{cmd_key} {args}".strip()
)
cmd = discord.app_commands.Command(
name="skill",
description="Run a Hermes skill",
callback=_skill_handler,
)
tree.add_command(cmd)
# ── Helper: build a callback for a skill command key ──
def _make_handler(_key: str):
@discord.app_commands.describe(args="Optional arguments for the skill")
async def _handler(interaction: discord.Interaction, args: str = ""):
await self._run_simple_slash(interaction, f"{_key} {args}".strip())
_handler.__name__ = f"skill_{_key.lstrip('/').replace('-', '_')}"
return _handler
# ── Uncategorized (root-level) skills → direct subcommands ──
for discord_name, description, cmd_key in uncategorized:
cmd = discord.app_commands.Command(
name=discord_name,
description=description or f"Run the {discord_name} skill",
callback=_make_handler(cmd_key),
)
skill_group.add_command(cmd)
# ── Category subcommand groups ──
for cat_name in sorted(categories):
cat_desc = f"{cat_name.replace('-', ' ').title()} skills"
if len(cat_desc) > 100:
cat_desc = cat_desc[:97] + "..."
cat_group = discord.app_commands.Group(
name=cat_name,
description=cat_desc,
parent=skill_group,
)
for discord_name, description, cmd_key in categories[cat_name]:
cmd = discord.app_commands.Command(
name=discord_name,
description=description or f"Run the {discord_name} skill",
callback=_make_handler(cmd_key),
)
cat_group.add_command(cmd)
tree.add_command(skill_group)
total = sum(len(v) for v in categories.values()) + len(uncategorized)
logger.info(
"[%s] Registered /skill group: %d skill(s) across %d categories"
" + %d uncategorized",
self.name, total, len(categories), len(uncategorized),
"[%s] Registered /skill command with %d skill(s) via autocomplete",
self.name, len(entries),
)
if hidden:
logger.warning(
"[%s] %d skill(s) not registered (Discord subcommand limits)",
logger.info(
"[%s] %d skill(s) filtered out of /skill (name clamp / reserved)",
self.name, hidden,
)
except Exception as exc:
logger.warning("[%s] Failed to register /skill group: %s", self.name, exc)
logger.warning("[%s] Failed to register /skill command: %s", self.name, exc)
def _build_slash_event(self, interaction: discord.Interaction, text: str) -> MessageEvent:
"""Build a MessageEvent from a Discord slash command interaction."""
@@ -2140,6 +2465,26 @@ class DiscordAdapter(BasePlatformAdapter):
from gateway.platforms.base import resolve_channel_prompt
return resolve_channel_prompt(self.config.extra, channel_id, parent_id)
def _discord_require_mention(self) -> bool:
"""Return whether Discord channel messages require a bot mention."""
configured = self.config.extra.get("require_mention")
if configured is not None:
if isinstance(configured, str):
return configured.lower() not in ("false", "0", "no", "off")
return bool(configured)
return os.getenv("DISCORD_REQUIRE_MENTION", "true").lower() not in ("false", "0", "no", "off")
def _discord_free_response_channels(self) -> set:
"""Return Discord channel IDs where no bot mention is required."""
raw = self.config.extra.get("free_response_channels")
if raw is None:
raw = os.getenv("DISCORD_FREE_RESPONSE_CHANNELS", "")
if isinstance(raw, list):
return {str(part).strip() for part in raw if str(part).strip()}
if isinstance(raw, str) and raw.strip():
return {part.strip() for part in raw.split(",") if part.strip()}
return set()
def _thread_parent_channel(self, channel: Any) -> Any:
"""Return the parent text channel when invoked from a thread."""
return getattr(channel, "parent", None) or channel
@@ -2242,8 +2587,15 @@ class DiscordAdapter(BasePlatformAdapter):
Returns the created thread object, or ``None`` on failure.
"""
# Build a short thread name from the message
# Build a short thread name from the message. Strip Discord mention
# syntax (users / roles / channels) so thread titles don't end up
# showing raw <@id>, <@&id>, or <#id> markers — the ID isn't
# meaningful to humans glancing at the thread list (#6336).
content = (message.content or "").strip()
# <@123>, <@!123>, <@&123>, <#123> — collapse to empty; normalize spaces.
content = re.sub(r"<@[!&]?\d+>", "", content)
content = re.sub(r"<#\d+>", "", content)
content = re.sub(r"\s+", " ", content).strip()
thread_name = content[:80] if content else "Hermes"
if len(content) > 80:
thread_name = thread_name[:77] + "..."
@@ -2251,9 +2603,25 @@ class DiscordAdapter(BasePlatformAdapter):
try:
thread = await message.create_thread(name=thread_name, auto_archive_duration=1440)
return thread
except Exception as e:
logger.warning("[%s] Auto-thread creation failed: %s", self.name, e)
return None
except Exception as direct_error:
display_name = getattr(getattr(message, "author", None), "display_name", None) or "unknown user"
reason = f"Auto-threaded from mention by {display_name}"
try:
seed_msg = await message.channel.send(f"\U0001f9f5 Thread created by Hermes: **{thread_name}**")
thread = await seed_msg.create_thread(
name=thread_name,
auto_archive_duration=1440,
reason=reason,
)
return thread
except Exception as fallback_error:
logger.warning(
"[%s] Auto-thread creation failed. Direct error: %s. Fallback error: %s",
self.name,
direct_error,
fallback_error,
)
return None
async def send_exec_approval(
self, chat_id: str, command: str, session_key: str,
@@ -2440,6 +2808,124 @@ class DiscordAdapter(BasePlatformAdapter):
return f"{parent_name} / {thread_name}"
return thread_name
# ------------------------------------------------------------------
# Attachment download helpers
#
# Discord attachments (images / audio / documents) are fetched via the
# authenticated bot session whenever the Attachment object exposes
# ``read()``. That sidesteps two classes of bug that hit the older
# plain-HTTP path:
#
# 1. ``cdn.discordapp.com`` URLs increasingly require bot auth on
# download — unauthenticated httpx sees 403 Forbidden.
# (issue #8242)
# 2. Some user environments (VPNs, corporate DNS, tunnels) resolve
# ``cdn.discordapp.com`` to private-looking IPs that our
# ``is_safe_url`` guard classifies as SSRF risks. Routing the
# fetch through discord.py's own HTTP client handles DNS
# internally so our guard isn't consulted for the attachment
# path. (issue #6587)
#
# If ``att.read()`` is unavailable (unexpected object shape / test
# stub) or the bot session fetch fails, we fall back to the existing
# SSRF-gated URL downloaders. The fallback keeps defense-in-depth
# against any future Discord payload-schema drift that could slip a
# non-CDN URL into the ``att.url`` field. (issue #11345)
# ------------------------------------------------------------------
async def _read_attachment_bytes(self, att) -> Optional[bytes]:
"""Read an attachment via discord.py's authenticated bot session.
Returns the raw bytes on success, or ``None`` if ``att`` doesn't
expose a callable ``read()`` or the read itself fails. Callers
should treat ``None`` as a signal to fall back to the URL-based
downloaders.
"""
reader = getattr(att, "read", None)
if reader is None or not callable(reader):
return None
try:
return await reader()
except Exception as e:
logger.warning(
"[Discord] Authenticated attachment read failed for %s: %s",
getattr(att, "filename", None) or getattr(att, "url", "<unknown>"),
e,
)
return None
async def _cache_discord_image(self, att, ext: str) -> str:
"""Cache a Discord image attachment to local disk.
Primary path: ``att.read()`` + ``cache_image_from_bytes``
(authenticated, no SSRF gate).
Fallback: ``cache_image_from_url`` (plain httpx, SSRF-gated).
"""
raw_bytes = await self._read_attachment_bytes(att)
if raw_bytes is not None:
try:
return cache_image_from_bytes(raw_bytes, ext=ext)
except Exception as e:
logger.debug(
"[Discord] cache_image_from_bytes rejected att.read() data; falling back to URL: %s",
e,
)
return await cache_image_from_url(att.url, ext=ext)
async def _cache_discord_audio(self, att, ext: str) -> str:
"""Cache a Discord audio attachment to local disk.
Primary path: ``att.read()`` + ``cache_audio_from_bytes``
(authenticated, no SSRF gate).
Fallback: ``cache_audio_from_url`` (plain httpx, SSRF-gated).
"""
raw_bytes = await self._read_attachment_bytes(att)
if raw_bytes is not None:
try:
return cache_audio_from_bytes(raw_bytes, ext=ext)
except Exception as e:
logger.debug(
"[Discord] cache_audio_from_bytes failed; falling back to URL: %s",
e,
)
return await cache_audio_from_url(att.url, ext=ext)
async def _cache_discord_document(self, att, ext: str) -> bytes:
"""Download a Discord document attachment and return the raw bytes.
Primary path: ``att.read()`` (authenticated, no SSRF gate).
Fallback: SSRF-gated ``aiohttp`` download. This closes the gap
where the old document path made raw ``aiohttp.ClientSession``
requests with no safety check (#11345). The caller is responsible
for passing the returned bytes to ``cache_document_from_bytes``
(and, where applicable, for injecting text content).
"""
raw_bytes = await self._read_attachment_bytes(att)
if raw_bytes is not None:
return raw_bytes
# Fallback: SSRF-gated URL download.
if not is_safe_url(att.url):
raise ValueError(
f"Blocked unsafe attachment URL (SSRF protection): {att.url}"
)
import aiohttp
from gateway.platforms.base import resolve_proxy_url, proxy_kwargs_for_aiohttp
_proxy = resolve_proxy_url(platform_env_var="DISCORD_PROXY")
_sess_kw, _req_kw = proxy_kwargs_for_aiohttp(_proxy)
async with aiohttp.ClientSession(**_sess_kw) as session:
async with session.get(
att.url,
timeout=aiohttp.ClientTimeout(total=30),
**_req_kw,
) as resp:
if resp.status != 200:
raise Exception(f"HTTP {resp.status}")
return await resp.read()
async def _handle_message(self, message: DiscordMessage) -> None:
"""Handle incoming Discord messages."""
# In server channels (not DMs), require the bot to be @mentioned
@@ -2482,12 +2968,11 @@ class DiscordAdapter(BasePlatformAdapter):
logger.debug("[%s] Ignoring message in ignored channel: %s", self.name, channel_ids)
return
free_channels_raw = os.getenv("DISCORD_FREE_RESPONSE_CHANNELS", "")
free_channels = {ch.strip() for ch in free_channels_raw.split(",") if ch.strip()}
free_channels = self._discord_free_response_channels()
if parent_channel_id:
channel_ids.add(parent_channel_id)
require_mention = os.getenv("DISCORD_REQUIRE_MENTION", "true").lower() not in ("false", "0", "no")
require_mention = self._discord_require_mention()
# Voice-linked text channels act as free-response while voice is active.
# Only the exact bound channel gets the exemption, not sibling threads.
voice_linked_ids = {str(ch_id) for ch_id in self._voice_text_channels.values()}
@@ -2515,9 +3000,10 @@ class DiscordAdapter(BasePlatformAdapter):
if not is_thread and not isinstance(message.channel, discord.DMChannel):
no_thread_channels_raw = os.getenv("DISCORD_NO_THREAD_CHANNELS", "")
no_thread_channels = {ch.strip() for ch in no_thread_channels_raw.split(",") if ch.strip()}
skip_thread = bool(channel_ids & no_thread_channels)
skip_thread = bool(channel_ids & no_thread_channels) or is_free_channel
auto_thread = os.getenv("DISCORD_AUTO_THREAD", "true").lower() in ("true", "1", "yes")
if auto_thread and not skip_thread and not is_voice_linked_channel:
is_reply_message = getattr(message, "type", None) == discord.MessageType.reply
if auto_thread and not skip_thread and not is_voice_linked_channel and not is_reply_message:
thread = await self._auto_create_thread(message)
if thread:
is_thread = True
@@ -2578,6 +3064,7 @@ class DiscordAdapter(BasePlatformAdapter):
user_name=message.author.display_name,
thread_id=thread_id,
chat_topic=chat_topic,
is_bot=getattr(message.author, "bot", False),
)
# Build media URLs -- download image attachments to local cache so the
@@ -2593,7 +3080,7 @@ class DiscordAdapter(BasePlatformAdapter):
ext = "." + content_type.split("/")[-1].split(";")[0]
if ext not in (".jpg", ".jpeg", ".png", ".gif", ".webp"):
ext = ".jpg"
cached_path = await cache_image_from_url(att.url, ext=ext)
cached_path = await self._cache_discord_image(att, ext)
media_urls.append(cached_path)
media_types.append(content_type)
print(f"[Discord] Cached user image: {cached_path}", flush=True)
@@ -2607,7 +3094,7 @@ class DiscordAdapter(BasePlatformAdapter):
ext = "." + content_type.split("/")[-1].split(";")[0]
if ext not in (".ogg", ".mp3", ".wav", ".webm", ".m4a"):
ext = ".ogg"
cached_path = await cache_audio_from_url(att.url, ext=ext)
cached_path = await self._cache_discord_audio(att, ext)
media_urls.append(cached_path)
media_types.append(content_type)
print(f"[Discord] Cached user audio: {cached_path}", flush=True)
@@ -2638,19 +3125,7 @@ class DiscordAdapter(BasePlatformAdapter):
)
else:
try:
import aiohttp
from gateway.platforms.base import resolve_proxy_url, proxy_kwargs_for_aiohttp
_proxy = resolve_proxy_url(platform_env_var="DISCORD_PROXY")
_sess_kw, _req_kw = proxy_kwargs_for_aiohttp(_proxy)
async with aiohttp.ClientSession(**_sess_kw) as session:
async with session.get(
att.url,
timeout=aiohttp.ClientTimeout(total=30),
**_req_kw,
) as resp:
if resp.status != 200:
raise Exception(f"HTTP {resp.status}")
raw_bytes = await resp.read()
raw_bytes = await self._cache_discord_document(att, ext)
cached_path = cache_document_from_bytes(
raw_bytes, att.filename or f"document{ext}"
)
@@ -2790,7 +3265,20 @@ class DiscordAdapter(BasePlatformAdapter):
"[Discord] Flushing text batch %s (%d chars)",
key, len(event.text or ""),
)
await self.handle_message(event)
# Shield the downstream dispatch so that a subsequent chunk
# arriving while handle_message is mid-flight cannot cancel
# the running agent turn. _enqueue_text_event always cancels
# the prior flush task when a new chunk lands; without this
# shield, CancelledError would propagate from our task down
# into handle_message → the agent's streaming request,
# aborting the response the user was waiting on. The new
# chunk is handled by the fresh flush task regardless.
await asyncio.shield(self.handle_message(event))
except asyncio.CancelledError:
# Only reached if cancel landed before the pop — the shielded
# handle_message is unaffected either way. Let the task exit
# cleanly so the finally block cleans up.
pass
finally:
if self._pending_text_batch_tasks.get(key) is current_task:
self._pending_text_batch_tasks.pop(key, None)
+173 -3
View File
@@ -1073,6 +1073,13 @@ class FeishuAdapter(BasePlatformAdapter):
self._webhook_rate_counts: Dict[str, tuple[int, float]] = {} # rate_key → (count, window_start)
self._webhook_anomaly_counts: Dict[str, tuple[int, str, float]] = {} # ip → (count, last_status, first_seen)
self._card_action_tokens: Dict[str, float] = {} # token → first_seen_time
# Inbound events that arrived before the adapter loop was ready
# (e.g. during startup/restart or network-flap reconnect). A single
# drainer thread replays them as soon as the loop becomes available.
self._pending_inbound_events: List[Any] = []
self._pending_inbound_lock = threading.Lock()
self._pending_drain_scheduled = False
self._pending_inbound_max_depth = 1000 # cap queue; drop oldest beyond
self._chat_locks: Dict[str, asyncio.Lock] = {} # chat_id → lock (per-chat serial processing)
self._sent_message_ids_to_chat: Dict[str, str] = {} # message_id → chat_id (for reaction routing)
self._sent_message_id_order: List[str] = [] # LRU order for _sent_message_ids_to_chat
@@ -1219,6 +1226,12 @@ class FeishuAdapter(BasePlatformAdapter):
.register_p2_card_action_trigger(self._on_card_action_trigger)
.register_p2_im_chat_member_bot_added_v1(self._on_bot_added_to_chat)
.register_p2_im_chat_member_bot_deleted_v1(self._on_bot_removed_from_chat)
.register_p2_im_chat_access_event_bot_p2p_chat_entered_v1(self._on_p2p_chat_entered)
.register_p2_im_message_recalled_v1(self._on_message_recalled)
.register_p2_customized_event(
"drive.notice.comment_add_v1",
self._on_drive_comment_event,
)
.build()
)
@@ -1757,10 +1770,22 @@ class FeishuAdapter(BasePlatformAdapter):
# =========================================================================
def _on_message_event(self, data: Any) -> None:
"""Normalize Feishu inbound events into MessageEvent."""
"""Normalize Feishu inbound events into MessageEvent.
Called by the lark_oapi SDK's event dispatcher on a background thread.
If the adapter loop is not currently accepting callbacks (brief window
during startup/restart or network-flap reconnect), the event is queued
for replay instead of dropped.
"""
loop = self._loop
if loop is None or bool(getattr(loop, "is_closed", lambda: False)()):
logger.warning("[Feishu] Dropping inbound message before adapter loop is ready")
if not self._loop_accepts_callbacks(loop):
start_drainer = self._enqueue_pending_inbound_event(data)
if start_drainer:
threading.Thread(
target=self._drain_pending_inbound_events,
name="feishu-pending-inbound-drainer",
daemon=True,
).start()
return
future = asyncio.run_coroutine_threadsafe(
self._handle_message_event_data(data),
@@ -1768,6 +1793,124 @@ class FeishuAdapter(BasePlatformAdapter):
)
future.add_done_callback(self._log_background_failure)
def _enqueue_pending_inbound_event(self, data: Any) -> bool:
"""Append an event to the pending-inbound queue.
Returns True if the caller should spawn a drainer thread (no drainer
currently scheduled), False if a drainer is already running and will
pick up the new event on its next pass.
"""
with self._pending_inbound_lock:
if len(self._pending_inbound_events) >= self._pending_inbound_max_depth:
# Queue full — drop the oldest to make room. This happens only
# if the loop stays unavailable for an extended period AND the
# WS keeps firing callbacks. Still better than silent drops.
dropped = self._pending_inbound_events.pop(0)
try:
event = getattr(dropped, "event", None)
message = getattr(event, "message", None)
message_id = str(getattr(message, "message_id", "") or "unknown")
except Exception:
message_id = "unknown"
logger.error(
"[Feishu] Pending-inbound queue full (%d); dropped oldest event %s",
self._pending_inbound_max_depth,
message_id,
)
self._pending_inbound_events.append(data)
depth = len(self._pending_inbound_events)
should_start = not self._pending_drain_scheduled
if should_start:
self._pending_drain_scheduled = True
logger.warning(
"[Feishu] Queued inbound event for replay (loop not ready, queue depth=%d)",
depth,
)
return should_start
def _drain_pending_inbound_events(self) -> None:
"""Replay queued inbound events once the adapter loop is ready.
Runs in a dedicated daemon thread. Polls ``_running`` and
``_loop_accepts_callbacks`` until events can be dispatched or the
adapter shuts down. A single drainer handles the entire queue;
concurrent ``_on_message_event`` calls just append.
"""
poll_interval = 0.25
max_wait_seconds = 120.0 # safety cap: drop queue after 2 minutes
waited = 0.0
try:
while True:
if not getattr(self, "_running", True):
# Adapter shutting down — drop queued events rather than
# holding them against a closed loop.
with self._pending_inbound_lock:
dropped = len(self._pending_inbound_events)
self._pending_inbound_events.clear()
if dropped:
logger.warning(
"[Feishu] Dropped %d queued inbound event(s) during shutdown",
dropped,
)
return
loop = self._loop
if self._loop_accepts_callbacks(loop):
with self._pending_inbound_lock:
batch = self._pending_inbound_events[:]
self._pending_inbound_events.clear()
if not batch:
# Queue emptied between check and grab; done.
with self._pending_inbound_lock:
if not self._pending_inbound_events:
return
continue
dispatched = 0
requeue: List[Any] = []
for event in batch:
try:
fut = asyncio.run_coroutine_threadsafe(
self._handle_message_event_data(event),
loop,
)
fut.add_done_callback(self._log_background_failure)
dispatched += 1
except RuntimeError:
# Loop closed between check and submit — requeue
# and poll again.
requeue.append(event)
if requeue:
with self._pending_inbound_lock:
self._pending_inbound_events[:0] = requeue
if dispatched:
logger.info(
"[Feishu] Replayed %d queued inbound event(s)",
dispatched,
)
if not requeue:
# Successfully drained; check if more arrived while
# we were dispatching and exit if not.
with self._pending_inbound_lock:
if not self._pending_inbound_events:
return
# More events queued or requeue pending — loop again.
continue
if waited >= max_wait_seconds:
with self._pending_inbound_lock:
dropped = len(self._pending_inbound_events)
self._pending_inbound_events.clear()
logger.error(
"[Feishu] Adapter loop unavailable for %.0fs; "
"dropped %d queued inbound event(s)",
max_wait_seconds,
dropped,
)
return
time.sleep(poll_interval)
waited += poll_interval
finally:
with self._pending_inbound_lock:
self._pending_drain_scheduled = False
async def _handle_message_event_data(self, data: Any) -> None:
"""Shared inbound message handling for websocket and webhook transports."""
event = getattr(data, "event", None)
@@ -1820,6 +1963,31 @@ class FeishuAdapter(BasePlatformAdapter):
logger.info("[Feishu] Bot removed from chat: %s", chat_id)
self._chat_info_cache.pop(chat_id, None)
def _on_p2p_chat_entered(self, data: Any) -> None:
logger.debug("[Feishu] User entered P2P chat with bot")
def _on_message_recalled(self, data: Any) -> None:
logger.debug("[Feishu] Message recalled by user")
def _on_drive_comment_event(self, data: Any) -> None:
"""Handle drive document comment notification (drive.notice.comment_add_v1).
Delegates to :mod:`gateway.platforms.feishu_comment` for parsing,
logging, and reaction. Scheduling follows the same
``run_coroutine_threadsafe`` pattern used by ``_on_message_event``.
"""
from gateway.platforms.feishu_comment import handle_drive_comment_event
loop = self._loop
if not self._loop_accepts_callbacks(loop):
logger.warning("[Feishu] Dropping drive comment event before adapter loop is ready")
return
future = asyncio.run_coroutine_threadsafe(
handle_drive_comment_event(self._client, data, self_open_id=self._bot_open_id),
loop,
)
future.add_done_callback(self._log_background_failure)
def _on_reaction_event(self, event_type: str, data: Any) -> None:
"""Route user reactions on bot messages as synthetic text events."""
event = getattr(data, "event", None)
@@ -2445,6 +2613,8 @@ class FeishuAdapter(BasePlatformAdapter):
self._on_reaction_event(event_type, data)
elif event_type == "card.action.trigger":
self._on_card_action_trigger(data)
elif event_type == "drive.notice.comment_add_v1":
self._on_drive_comment_event(data)
else:
logger.debug("[Feishu] Ignoring webhook event type: %s", event_type or "unknown")
return web.json_response({"code": 0, "msg": "ok"})
File diff suppressed because it is too large Load Diff
+429
View File
@@ -0,0 +1,429 @@
"""
Feishu document comment access-control rules.
3-tier rule resolution: exact doc > wildcard "*" > top-level > code defaults.
Each field (enabled/policy/allow_from) falls back independently.
Config: ~/.hermes/feishu_comment_rules.json (mtime-cached, hot-reload).
Pairing store: ~/.hermes/feishu_comment_pairing.json.
"""
from __future__ import annotations
import json
import logging
import time
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, Optional
from hermes_constants import get_hermes_home
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Paths
# ---------------------------------------------------------------------------
#
# Uses the canonical ``get_hermes_home()`` helper (HERMES_HOME-aware and
# profile-safe). Resolved at import time; this module is lazy-imported by
# the Feishu comment event handler, which runs long after profile overrides
# have been applied, so freezing paths here is safe.
RULES_FILE = get_hermes_home() / "feishu_comment_rules.json"
PAIRING_FILE = get_hermes_home() / "feishu_comment_pairing.json"
# ---------------------------------------------------------------------------
# Data models
# ---------------------------------------------------------------------------
_VALID_POLICIES = ("allowlist", "pairing")
@dataclass(frozen=True)
class CommentDocumentRule:
"""Per-document rule. ``None`` means 'inherit from lower tier'."""
enabled: Optional[bool] = None
policy: Optional[str] = None
allow_from: Optional[frozenset] = None
@dataclass(frozen=True)
class CommentsConfig:
"""Top-level comment access config."""
enabled: bool = True
policy: str = "pairing"
allow_from: frozenset = field(default_factory=frozenset)
documents: Dict[str, CommentDocumentRule] = field(default_factory=dict)
@dataclass(frozen=True)
class ResolvedCommentRule:
"""Fully resolved rule after field-by-field fallback."""
enabled: bool
policy: str
allow_from: frozenset
match_source: str # e.g. "exact:docx:xxx" | "wildcard" | "top" | "default"
# ---------------------------------------------------------------------------
# Mtime-cached file loading
# ---------------------------------------------------------------------------
class _MtimeCache:
"""Generic mtime-based file cache. ``stat()`` per access, re-read only on change."""
def __init__(self, path: Path):
self._path = path
self._mtime: float = 0.0
self._data: Optional[dict] = None
def load(self) -> dict:
try:
st = self._path.stat()
mtime = st.st_mtime
except FileNotFoundError:
self._mtime = 0.0
self._data = {}
return {}
if mtime == self._mtime and self._data is not None:
return self._data
try:
with open(self._path, "r", encoding="utf-8") as f:
data = json.load(f)
if not isinstance(data, dict):
data = {}
except (json.JSONDecodeError, OSError):
logger.warning("[Feishu-Rules] Failed to read %s, using empty config", self._path)
data = {}
self._mtime = mtime
self._data = data
return data
_rules_cache = _MtimeCache(RULES_FILE)
_pairing_cache = _MtimeCache(PAIRING_FILE)
# ---------------------------------------------------------------------------
# Config parsing
# ---------------------------------------------------------------------------
def _parse_frozenset(raw: Any) -> Optional[frozenset]:
"""Parse a list of strings into a frozenset; return None if key absent."""
if raw is None:
return None
if isinstance(raw, (list, tuple)):
return frozenset(str(u).strip() for u in raw if str(u).strip())
return None
def _parse_document_rule(raw: dict) -> CommentDocumentRule:
enabled = raw.get("enabled")
if enabled is not None:
enabled = bool(enabled)
policy = raw.get("policy")
if policy is not None:
policy = str(policy).strip().lower()
if policy not in _VALID_POLICIES:
policy = None
allow_from = _parse_frozenset(raw.get("allow_from"))
return CommentDocumentRule(enabled=enabled, policy=policy, allow_from=allow_from)
def load_config() -> CommentsConfig:
"""Load comment rules from disk (mtime-cached)."""
raw = _rules_cache.load()
if not raw:
return CommentsConfig()
documents: Dict[str, CommentDocumentRule] = {}
raw_docs = raw.get("documents", {})
if isinstance(raw_docs, dict):
for key, rule_raw in raw_docs.items():
if isinstance(rule_raw, dict):
documents[str(key)] = _parse_document_rule(rule_raw)
policy = str(raw.get("policy", "pairing")).strip().lower()
if policy not in _VALID_POLICIES:
policy = "pairing"
return CommentsConfig(
enabled=raw.get("enabled", True),
policy=policy,
allow_from=_parse_frozenset(raw.get("allow_from")) or frozenset(),
documents=documents,
)
# ---------------------------------------------------------------------------
# Rule resolution (§8.4 field-by-field fallback)
# ---------------------------------------------------------------------------
def has_wiki_keys(cfg: CommentsConfig) -> bool:
"""Check if any document rule key starts with 'wiki:'."""
return any(k.startswith("wiki:") for k in cfg.documents)
def resolve_rule(
cfg: CommentsConfig,
file_type: str,
file_token: str,
wiki_token: str = "",
) -> ResolvedCommentRule:
"""Resolve effective rule: exact doc → wiki key → wildcard → top-level → defaults."""
exact_key = f"{file_type}:{file_token}"
exact = cfg.documents.get(exact_key)
exact_src = f"exact:{exact_key}"
if exact is None and wiki_token:
wiki_key = f"wiki:{wiki_token}"
exact = cfg.documents.get(wiki_key)
exact_src = f"exact:{wiki_key}"
wildcard = cfg.documents.get("*")
layers = []
if exact is not None:
layers.append((exact, exact_src))
if wildcard is not None:
layers.append((wildcard, "wildcard"))
def _pick(field_name: str):
for layer, source in layers:
val = getattr(layer, field_name)
if val is not None:
return val, source
return getattr(cfg, field_name), "top"
enabled, en_src = _pick("enabled")
policy, pol_src = _pick("policy")
allow_from, _ = _pick("allow_from")
# match_source = highest-priority tier that contributed any field
priority_order = {"exact": 0, "wildcard": 1, "top": 2}
best_src = min(
[en_src, pol_src],
key=lambda s: priority_order.get(s.split(":")[0], 3),
)
return ResolvedCommentRule(
enabled=enabled,
policy=policy,
allow_from=allow_from,
match_source=best_src,
)
# ---------------------------------------------------------------------------
# Pairing store
# ---------------------------------------------------------------------------
def _load_pairing_approved() -> set:
"""Return set of approved user open_ids (mtime-cached)."""
data = _pairing_cache.load()
approved = data.get("approved", {})
if isinstance(approved, dict):
return set(approved.keys())
if isinstance(approved, list):
return set(str(u) for u in approved if u)
return set()
def _save_pairing(data: dict) -> None:
PAIRING_FILE.parent.mkdir(parents=True, exist_ok=True)
tmp = PAIRING_FILE.with_suffix(".tmp")
with open(tmp, "w", encoding="utf-8") as f:
json.dump(data, f, indent=2, ensure_ascii=False)
tmp.replace(PAIRING_FILE)
# Invalidate cache so next load picks up change
_pairing_cache._mtime = 0.0
_pairing_cache._data = None
def pairing_add(user_open_id: str) -> bool:
"""Add a user to the pairing-approved list. Returns True if newly added."""
data = _pairing_cache.load()
approved = data.get("approved", {})
if not isinstance(approved, dict):
approved = {}
if user_open_id in approved:
return False
approved[user_open_id] = {"approved_at": time.time()}
data["approved"] = approved
_save_pairing(data)
return True
def pairing_remove(user_open_id: str) -> bool:
"""Remove a user from the pairing-approved list. Returns True if removed."""
data = _pairing_cache.load()
approved = data.get("approved", {})
if not isinstance(approved, dict):
return False
if user_open_id not in approved:
return False
del approved[user_open_id]
data["approved"] = approved
_save_pairing(data)
return True
def pairing_list() -> Dict[str, Any]:
"""Return the approved dict {user_open_id: {approved_at: ...}}."""
data = _pairing_cache.load()
approved = data.get("approved", {})
return dict(approved) if isinstance(approved, dict) else {}
# ---------------------------------------------------------------------------
# Access check (public API for feishu_comment.py)
# ---------------------------------------------------------------------------
def is_user_allowed(rule: ResolvedCommentRule, user_open_id: str) -> bool:
"""Check if user passes the resolved rule's policy gate."""
if user_open_id in rule.allow_from:
return True
if rule.policy == "pairing":
return user_open_id in _load_pairing_approved()
return False
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def _print_status() -> None:
cfg = load_config()
print(f"Rules file: {RULES_FILE}")
print(f" exists: {RULES_FILE.exists()}")
print(f"Pairing file: {PAIRING_FILE}")
print(f" exists: {PAIRING_FILE.exists()}")
print()
print(f"Top-level:")
print(f" enabled: {cfg.enabled}")
print(f" policy: {cfg.policy}")
print(f" allow_from: {sorted(cfg.allow_from) if cfg.allow_from else '[]'}")
print()
if cfg.documents:
print(f"Document rules ({len(cfg.documents)}):")
for key, rule in sorted(cfg.documents.items()):
parts = []
if rule.enabled is not None:
parts.append(f"enabled={rule.enabled}")
if rule.policy is not None:
parts.append(f"policy={rule.policy}")
if rule.allow_from is not None:
parts.append(f"allow_from={sorted(rule.allow_from)}")
print(f" [{key}] {', '.join(parts) if parts else '(empty — inherits all)'}")
else:
print("Document rules: (none)")
print()
approved = pairing_list()
print(f"Pairing approved ({len(approved)}):")
for uid, meta in sorted(approved.items()):
ts = meta.get("approved_at", 0)
print(f" {uid} (approved_at={ts})")
def _do_check(doc_key: str, user_open_id: str) -> None:
cfg = load_config()
parts = doc_key.split(":", 1)
if len(parts) != 2:
print(f"Error: doc_key must be 'fileType:fileToken', got '{doc_key}'")
return
file_type, file_token = parts
rule = resolve_rule(cfg, file_type, file_token)
allowed = is_user_allowed(rule, user_open_id)
print(f"Document: {doc_key}")
print(f"User: {user_open_id}")
print(f"Resolved rule:")
print(f" enabled: {rule.enabled}")
print(f" policy: {rule.policy}")
print(f" allow_from: {sorted(rule.allow_from) if rule.allow_from else '[]'}")
print(f" match_source: {rule.match_source}")
print(f"Result: {'ALLOWED' if allowed else 'DENIED'}")
def _main() -> int:
import sys
try:
from hermes_cli.env_loader import load_hermes_dotenv
load_hermes_dotenv()
except Exception:
pass
usage = (
"Usage: python -m gateway.platforms.feishu_comment_rules <command> [args]\n"
"\n"
"Commands:\n"
" status Show rules config and pairing state\n"
" check <fileType:token> <user> Simulate access check\n"
" pairing add <user_open_id> Add user to pairing-approved list\n"
" pairing remove <user_open_id> Remove user from pairing-approved list\n"
" pairing list List pairing-approved users\n"
"\n"
f"Rules config file: {RULES_FILE}\n"
" Edit this JSON file directly to configure policies and document rules.\n"
" Changes take effect on the next comment event (no restart needed).\n"
)
args = sys.argv[1:]
if not args:
print(usage)
return 1
cmd = args[0]
if cmd == "status":
_print_status()
elif cmd == "check":
if len(args) < 3:
print("Usage: check <fileType:fileToken> <user_open_id>")
return 1
_do_check(args[1], args[2])
elif cmd == "pairing":
if len(args) < 2:
print("Usage: pairing <add|remove|list> [args]")
return 1
sub = args[1]
if sub == "add":
if len(args) < 3:
print("Usage: pairing add <user_open_id>")
return 1
if pairing_add(args[2]):
print(f"Added: {args[2]}")
else:
print(f"Already approved: {args[2]}")
elif sub == "remove":
if len(args) < 3:
print("Usage: pairing remove <user_open_id>")
return 1
if pairing_remove(args[2]):
print(f"Removed: {args[2]}")
else:
print(f"Not in approved list: {args[2]}")
elif sub == "list":
approved = pairing_list()
if not approved:
print("(no approved users)")
for uid, meta in sorted(approved.items()):
print(f" {uid} approved_at={meta.get('approved_at', '?')}")
else:
print(f"Unknown pairing subcommand: {sub}")
return 1
else:
print(f"Unknown command: {cmd}\n")
print(usage)
return 1
return 0
if __name__ == "__main__":
import sys
sys.exit(_main())
+57
View File
@@ -0,0 +1,57 @@
"""
QQBot platform package.
Re-exports the main adapter symbols from ``adapter.py`` (the original
``qqbot.py``) so that **all existing import paths remain unchanged**::
from gateway.platforms.qqbot import QQAdapter # works
from gateway.platforms.qqbot import check_qq_requirements # works
New modules:
- ``constants`` shared constants (API URLs, timeouts, message types)
- ``utils`` User-Agent builder, config helpers
- ``crypto`` AES-256-GCM key generation and decryption
- ``onboard`` QR-code scan-to-configure flow
"""
# -- Adapter (original qqbot.py) ------------------------------------------
from .adapter import ( # noqa: F401
QQAdapter,
QQCloseError,
check_qq_requirements,
_coerce_list,
_ssrf_redirect_guard,
)
# -- Onboard (QR-code scan-to-configure) -----------------------------------
from .onboard import ( # noqa: F401
BindStatus,
create_bind_task,
poll_bind_result,
build_connect_url,
)
from .crypto import decrypt_secret, generate_bind_key # noqa: F401
# -- Utils -----------------------------------------------------------------
from .utils import build_user_agent, get_api_headers, coerce_list # noqa: F401
__all__ = [
# adapter
"QQAdapter",
"QQCloseError",
"check_qq_requirements",
"_coerce_list",
"_ssrf_redirect_guard",
# onboard
"BindStatus",
"create_bind_task",
"poll_bind_result",
"build_connect_url",
# crypto
"decrypt_secret",
"generate_bind_key",
# utils
"build_user_agent",
"get_api_headers",
"coerce_list",
]
File diff suppressed because it is too large Load Diff
+74
View File
@@ -0,0 +1,74 @@
"""QQBot package-level constants shared across adapter, onboard, and other modules."""
from __future__ import annotations
import os
# ---------------------------------------------------------------------------
# QQBot adapter version — bump on functional changes to the adapter package.
# ---------------------------------------------------------------------------
QQBOT_VERSION = "1.1.0"
# ---------------------------------------------------------------------------
# API endpoints
# ---------------------------------------------------------------------------
# The portal domain is configurable via QQ_API_HOST for corporate proxies
# or test environments. Default: q.qq.com (production).
PORTAL_HOST = os.getenv("QQ_PORTAL_HOST", "q.qq.com")
API_BASE = "https://api.sgroup.qq.com"
TOKEN_URL = "https://bots.qq.com/app/getAppAccessToken"
GATEWAY_URL_PATH = "/gateway"
# QR-code onboard endpoints (on the portal host)
ONBOARD_CREATE_PATH = "/lite/create_bind_task"
ONBOARD_POLL_PATH = "/lite/poll_bind_result"
QR_URL_TEMPLATE = (
"https://q.qq.com/qqbot/openclaw/connect.html"
"?task_id={task_id}&_wv=2&source=hermes"
)
# ---------------------------------------------------------------------------
# Timeouts & retry
# ---------------------------------------------------------------------------
DEFAULT_API_TIMEOUT = 30.0
FILE_UPLOAD_TIMEOUT = 120.0
CONNECT_TIMEOUT_SECONDS = 20.0
RECONNECT_BACKOFF = [2, 5, 10, 30, 60]
MAX_RECONNECT_ATTEMPTS = 100
RATE_LIMIT_DELAY = 60 # seconds
QUICK_DISCONNECT_THRESHOLD = 5.0 # seconds
MAX_QUICK_DISCONNECT_COUNT = 3
ONBOARD_POLL_INTERVAL = 2.0 # seconds between poll_bind_result calls
ONBOARD_API_TIMEOUT = 10.0
# ---------------------------------------------------------------------------
# Message limits
# ---------------------------------------------------------------------------
MAX_MESSAGE_LENGTH = 4000
DEDUP_WINDOW_SECONDS = 300
DEDUP_MAX_SIZE = 1000
# ---------------------------------------------------------------------------
# QQ Bot message types
# ---------------------------------------------------------------------------
MSG_TYPE_TEXT = 0
MSG_TYPE_MARKDOWN = 2
MSG_TYPE_MEDIA = 7
MSG_TYPE_INPUT_NOTIFY = 6
# ---------------------------------------------------------------------------
# QQ Bot file media types
# ---------------------------------------------------------------------------
MEDIA_TYPE_IMAGE = 1
MEDIA_TYPE_VIDEO = 2
MEDIA_TYPE_VOICE = 3
MEDIA_TYPE_FILE = 4
+45
View File
@@ -0,0 +1,45 @@
"""AES-256-GCM utilities for QQBot scan-to-configure credential decryption."""
from __future__ import annotations
import base64
import os
def generate_bind_key() -> str:
"""Generate a 256-bit random AES key and return it as base64.
The key is passed to ``create_bind_task`` so the server can encrypt
the bot's *client_secret* before returning it. Only this CLI holds
the key, ensuring the secret never travels in plaintext.
"""
return base64.b64encode(os.urandom(32)).decode()
def decrypt_secret(encrypted_base64: str, key_base64: str) -> str:
"""Decrypt a base64-encoded AES-256-GCM ciphertext.
Ciphertext layout (after base64-decoding)::
IV (12 bytes) ciphertext (N bytes) AuthTag (16 bytes)
Args:
encrypted_base64: The ``bot_encrypt_secret`` value from
``poll_bind_result``.
key_base64: The base64 AES key generated by
:func:`generate_bind_key`.
Returns:
The decrypted *client_secret* as a UTF-8 string.
"""
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
key = base64.b64decode(key_base64)
raw = base64.b64decode(encrypted_base64)
iv = raw[:12]
ciphertext_with_tag = raw[12:] # AESGCM expects ciphertext + tag concatenated
aesgcm = AESGCM(key)
plaintext = aesgcm.decrypt(iv, ciphertext_with_tag, None)
return plaintext.decode("utf-8")
+124
View File
@@ -0,0 +1,124 @@
"""
QQBot scan-to-configure (QR code onboard) module.
Calls the ``q.qq.com`` ``create_bind_task`` / ``poll_bind_result`` APIs to
generate a QR-code URL and poll for scan completion. On success the caller
receives the bot's *app_id*, *client_secret* (decrypted locally), and the
scanner's *user_openid* — enough to fully configure the QQBot gateway.
Reference: https://bot.q.qq.com/wiki/develop/api-v2/
"""
from __future__ import annotations
import logging
from enum import IntEnum
from typing import Tuple
from urllib.parse import quote
from .constants import (
ONBOARD_API_TIMEOUT,
ONBOARD_CREATE_PATH,
ONBOARD_POLL_PATH,
PORTAL_HOST,
QR_URL_TEMPLATE,
)
from .crypto import generate_bind_key
from .utils import get_api_headers
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Bind status
# ---------------------------------------------------------------------------
class BindStatus(IntEnum):
"""Status codes returned by ``poll_bind_result``."""
NONE = 0
PENDING = 1
COMPLETED = 2
EXPIRED = 3
# ---------------------------------------------------------------------------
# Public API
# ---------------------------------------------------------------------------
async def create_bind_task(
timeout: float = ONBOARD_API_TIMEOUT,
) -> Tuple[str, str]:
"""Create a bind task and return *(task_id, aes_key_base64)*.
The AES key is generated locally and sent to the server so it can
encrypt the bot credentials before returning them.
Raises:
RuntimeError: If the API returns a non-zero ``retcode``.
"""
import httpx
url = f"https://{PORTAL_HOST}{ONBOARD_CREATE_PATH}"
key = generate_bind_key()
async with httpx.AsyncClient(timeout=timeout, follow_redirects=True) as client:
resp = await client.post(url, json={"key": key}, headers=get_api_headers())
resp.raise_for_status()
data = resp.json()
if data.get("retcode") != 0:
raise RuntimeError(data.get("msg", "create_bind_task failed"))
task_id = data.get("data", {}).get("task_id")
if not task_id:
raise RuntimeError("create_bind_task: missing task_id in response")
logger.debug("create_bind_task ok: task_id=%s", task_id)
return task_id, key
async def poll_bind_result(
task_id: str,
timeout: float = ONBOARD_API_TIMEOUT,
) -> Tuple[BindStatus, str, str, str]:
"""Poll the bind result for *task_id*.
Returns:
A 4-tuple of ``(status, bot_appid, bot_encrypt_secret, user_openid)``.
* ``bot_encrypt_secret`` is AES-256-GCM encrypted decrypt it with
:func:`~gateway.platforms.qqbot.crypto.decrypt_secret` using the
key from :func:`create_bind_task`.
* ``user_openid`` is the OpenID of the person who scanned the code
(available when ``status == COMPLETED``).
Raises:
RuntimeError: If the API returns a non-zero ``retcode``.
"""
import httpx
url = f"https://{PORTAL_HOST}{ONBOARD_POLL_PATH}"
async with httpx.AsyncClient(timeout=timeout, follow_redirects=True) as client:
resp = await client.post(url, json={"task_id": task_id}, headers=get_api_headers())
resp.raise_for_status()
data = resp.json()
if data.get("retcode") != 0:
raise RuntimeError(data.get("msg", "poll_bind_result failed"))
d = data.get("data", {})
return (
BindStatus(d.get("status", 0)),
str(d.get("bot_appid", "")),
d.get("bot_encrypt_secret", ""),
d.get("user_openid", ""),
)
def build_connect_url(task_id: str) -> str:
"""Build the QR-code target URL for a given *task_id*."""
return QR_URL_TEMPLATE.format(task_id=quote(task_id))
+71
View File
@@ -0,0 +1,71 @@
"""QQBot shared utilities — User-Agent, HTTP helpers, config coercion."""
from __future__ import annotations
import platform
import sys
from typing import Any, Dict, List
from .constants import QQBOT_VERSION
# ---------------------------------------------------------------------------
# User-Agent
# ---------------------------------------------------------------------------
def _get_hermes_version() -> str:
"""Return the hermes-agent package version, or 'dev' if unavailable."""
try:
from importlib.metadata import version
return version("hermes-agent")
except Exception:
return "dev"
def build_user_agent() -> str:
"""Build a descriptive User-Agent string.
Format::
QQBotAdapter/<qqbot_version> (Python/<py_version>; <os>; Hermes/<hermes_version>)
Example::
QQBotAdapter/1.0.0 (Python/3.11.15; darwin; Hermes/0.9.0)
"""
py_version = f"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}"
os_name = platform.system().lower()
hermes_version = _get_hermes_version()
return f"QQBotAdapter/{QQBOT_VERSION} (Python/{py_version}; {os_name}; Hermes/{hermes_version})"
def get_api_headers() -> Dict[str, str]:
"""Return standard HTTP headers for QQBot API requests.
Includes ``Content-Type``, ``Accept``, and a dynamic ``User-Agent``.
``q.qq.com`` requires ``Accept: application/json`` without it,
the server returns a JavaScript anti-bot challenge page.
"""
return {
"Content-Type": "application/json",
"Accept": "application/json",
"User-Agent": build_user_agent(),
}
# ---------------------------------------------------------------------------
# Config helpers
# ---------------------------------------------------------------------------
def coerce_list(value: Any) -> List[str]:
"""Coerce config values into a trimmed string list.
Accepts comma-separated strings, lists, tuples, sets, or single values.
"""
if value is None:
return []
if isinstance(value, str):
return [item.strip() for item in value.split(",") if item.strip()]
if isinstance(value, (list, tuple, set)):
return [str(item).strip() for item in value if str(item).strip()]
return [str(value).strip()] if str(value).strip() else []
+78 -6
View File
@@ -160,6 +160,14 @@ class SignalAdapter(BasePlatformAdapter):
self._sse_task: Optional[asyncio.Task] = None
self._health_monitor_task: Optional[asyncio.Task] = None
self._typing_tasks: Dict[str, asyncio.Task] = {}
# Per-chat typing-indicator backoff. When signal-cli reports
# NETWORK_FAILURE (recipient offline / unroutable), base.py's
# _keep_typing refresh loop would otherwise hammer sendTyping every
# ~2s indefinitely, producing WARNING-level log spam and pointless
# RPC traffic. We track consecutive failures per chat and skip the
# RPC during a cooldown window instead.
self._typing_failures: Dict[str, int] = {}
self._typing_skip_until: Dict[str, float] = {}
self._running = False
self._last_sse_activity = 0.0
self._sse_response: Optional[httpx.Response] = None
@@ -548,8 +556,22 @@ class SignalAdapter(BasePlatformAdapter):
# JSON-RPC Communication
# ------------------------------------------------------------------
async def _rpc(self, method: str, params: dict, rpc_id: str = None) -> Any:
"""Send a JSON-RPC 2.0 request to signal-cli daemon."""
async def _rpc(
self,
method: str,
params: dict,
rpc_id: str = None,
*,
log_failures: bool = True,
) -> Any:
"""Send a JSON-RPC 2.0 request to signal-cli daemon.
When ``log_failures=False``, error and exception paths log at DEBUG
instead of WARNING used by the typing-indicator path to silence
repeated NETWORK_FAILURE spam for unreachable recipients while
still preserving visibility for the first occurrence and for
unrelated RPCs.
"""
if not self.client:
logger.warning("Signal: RPC called but client not connected")
return None
@@ -574,13 +596,19 @@ class SignalAdapter(BasePlatformAdapter):
data = resp.json()
if "error" in data:
logger.warning("Signal RPC error (%s): %s", method, data["error"])
if log_failures:
logger.warning("Signal RPC error (%s): %s", method, data["error"])
else:
logger.debug("Signal RPC error (%s): %s", method, data["error"])
return None
return data.get("result")
except Exception as e:
logger.warning("Signal RPC %s failed: %s", method, e)
if log_failures:
logger.warning("Signal RPC %s failed: %s", method, e)
else:
logger.debug("Signal RPC %s failed: %s", method, e)
return None
# ------------------------------------------------------------------
@@ -627,7 +655,28 @@ class SignalAdapter(BasePlatformAdapter):
self._recent_sent_timestamps.pop()
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""Send a typing indicator."""
"""Send a typing indicator.
base.py's ``_keep_typing`` refresh loop calls this every ~2s while
the agent is processing. If signal-cli returns NETWORK_FAILURE for
this recipient (offline, unroutable, group membership lost, etc.)
the unmitigated behaviour is: a WARNING log every 2 seconds for as
long as the agent keeps running. Instead we:
- silence the WARNING after the first consecutive failure (subsequent
attempts log at DEBUG) so transport issues are still visible once
but don't flood the log,
- skip the RPC entirely during an exponential cooldown window once
three consecutive failures have happened, so we stop hammering
signal-cli with requests it can't deliver.
A successful sendTyping clears the counters.
"""
now = time.monotonic()
skip_until = self._typing_skip_until.get(chat_id, 0.0)
if now < skip_until:
return
params: Dict[str, Any] = {
"account": self.account,
}
@@ -637,7 +686,26 @@ class SignalAdapter(BasePlatformAdapter):
else:
params["recipient"] = [chat_id]
await self._rpc("sendTyping", params, rpc_id="typing")
fails = self._typing_failures.get(chat_id, 0)
result = await self._rpc(
"sendTyping",
params,
rpc_id="typing",
log_failures=(fails == 0),
)
if result is None:
fails += 1
self._typing_failures[chat_id] = fails
# After 3 consecutive failures, back off exponentially (16s,
# 32s, 60s cap) to stop spamming signal-cli for a recipient
# that clearly isn't reachable right now.
if fails >= 3:
backoff = min(60.0, 16.0 * (2 ** (fails - 3)))
self._typing_skip_until[chat_id] = now + backoff
else:
self._typing_failures.pop(chat_id, None)
self._typing_skip_until.pop(chat_id, None)
async def send_image(
self,
@@ -789,6 +857,10 @@ class SignalAdapter(BasePlatformAdapter):
await task
except asyncio.CancelledError:
pass
# Reset per-chat typing backoff state so the next agent turn starts
# fresh rather than inheriting a cooldown from a prior conversation.
self._typing_failures.pop(chat_id, None)
self._typing_skip_until.pop(chat_id, None)
async def stop_typing(self, chat_id: str) -> None:
"""Public interface for stopping typing — called by base adapter's
+104 -8
View File
@@ -118,6 +118,84 @@ def _strip_mdv2(text: str) -> str:
return cleaned
# ---------------------------------------------------------------------------
# Markdown table → code block conversion
# ---------------------------------------------------------------------------
# Telegram's MarkdownV2 has no table syntax — '|' is just an escaped literal,
# so pipe tables render as noisy backslash-pipe text with no alignment.
# Wrapping the table in a fenced code block makes Telegram render it as
# monospace preformatted text with columns intact.
# Matches a GFM table delimiter row: optional outer pipes, cells containing
# only dashes (with optional leading/trailing colons for alignment) separated
# by '|'. Requires at least one internal '|' so lone '---' horizontal rules
# are NOT matched.
_TABLE_SEPARATOR_RE = re.compile(
r'^\s*\|?\s*:?-+:?\s*(?:\|\s*:?-+:?\s*){1,}\|?\s*$'
)
def _is_table_row(line: str) -> bool:
"""Return True if *line* could plausibly be a table data row."""
stripped = line.strip()
return bool(stripped) and '|' in stripped
def _wrap_markdown_tables(text: str) -> str:
"""Wrap GFM-style pipe tables in ``` fences so Telegram renders them.
Detected by a row containing '|' immediately followed by a delimiter
row matching :data:`_TABLE_SEPARATOR_RE`. Subsequent pipe-containing
non-blank lines are consumed as the table body and included in the
wrapped block. Tables inside existing fenced code blocks are left
alone.
"""
if '|' not in text or '-' not in text:
return text
lines = text.split('\n')
out: list[str] = []
in_fence = False
i = 0
while i < len(lines):
line = lines[i]
stripped = line.lstrip()
# Track existing fenced code blocks — never touch content inside.
if stripped.startswith('```'):
in_fence = not in_fence
out.append(line)
i += 1
continue
if in_fence:
out.append(line)
i += 1
continue
# Look for a header row (contains '|') immediately followed by a
# delimiter row.
if (
'|' in line
and i + 1 < len(lines)
and _TABLE_SEPARATOR_RE.match(lines[i + 1])
):
table_block = [line, lines[i + 1]]
j = i + 2
while j < len(lines) and _is_table_row(lines[j]):
table_block.append(lines[j])
j += 1
out.append('```')
out.extend(table_block)
out.append('```')
i = j
continue
out.append(line)
i += 1
return '\n'.join(out)
class TelegramAdapter(BasePlatformAdapter):
"""
Telegram bot adapter.
@@ -1916,6 +1994,12 @@ class TelegramAdapter(BasePlatformAdapter):
text = content
# 0) Pre-wrap GFM-style pipe tables in ``` fences. Telegram can't
# render tables natively, but fenced code blocks render as
# monospace preformatted text with columns intact. The wrapped
# tables then flow through step (1) below as protected regions.
text = _wrap_markdown_tables(text)
# 1) Protect fenced code blocks (``` ... ```)
# Per MarkdownV2 spec, \ and ` inside pre/code must be escaped.
def _protect_fenced(m):
@@ -2242,7 +2326,7 @@ class TelegramAdapter(BasePlatformAdapter):
if not self._should_process_message(update.message):
return
event = self._build_message_event(update.message, MessageType.TEXT)
event = self._build_message_event(update.message, MessageType.TEXT, update_id=update.update_id)
event.text = self._clean_bot_trigger_text(event.text)
self._enqueue_text_event(event)
@@ -2253,7 +2337,7 @@ class TelegramAdapter(BasePlatformAdapter):
if not self._should_process_message(update.message, is_command=True):
return
event = self._build_message_event(update.message, MessageType.COMMAND)
event = self._build_message_event(update.message, MessageType.COMMAND, update_id=update.update_id)
await self.handle_message(event)
async def _handle_location_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
@@ -2289,7 +2373,7 @@ class TelegramAdapter(BasePlatformAdapter):
parts.append(f"Map: https://www.google.com/maps/search/?api=1&query={lat},{lon}")
parts.append("Ask what they'd like to find nearby (restaurants, cafes, etc.) and any preferences.")
event = self._build_message_event(msg, MessageType.LOCATION)
event = self._build_message_event(msg, MessageType.LOCATION, update_id=update.update_id)
event.text = "\n".join(parts)
await self.handle_message(event)
@@ -2440,7 +2524,7 @@ class TelegramAdapter(BasePlatformAdapter):
else:
msg_type = MessageType.DOCUMENT
event = self._build_message_event(msg, msg_type)
event = self._build_message_event(msg, msg_type, update_id=update.update_id)
# Add caption as text
if msg.caption:
@@ -2779,8 +2863,19 @@ class TelegramAdapter(BasePlatformAdapter):
self.name, cache_key, thread_id,
)
def _build_message_event(self, message: Message, msg_type: MessageType) -> MessageEvent:
"""Build a MessageEvent from a Telegram message."""
def _build_message_event(
self,
message: Message,
msg_type: MessageType,
update_id: Optional[int] = None,
) -> MessageEvent:
"""Build a MessageEvent from a Telegram message.
``update_id`` is the ``Update.update_id`` from PTB; passing it through
lets ``/restart`` record the triggering offset so the new gateway
process can advance past it (prevents ``/restart`` being re-delivered
when PTB's graceful-shutdown ACK fails).
"""
chat = message.chat
user = message.from_user
@@ -2831,8 +2926,8 @@ class TelegramAdapter(BasePlatformAdapter):
chat_id=str(chat.id),
chat_name=chat.title or (chat.full_name if hasattr(chat, "full_name") else None),
chat_type=chat_type,
user_id=str(user.id) if user else None,
user_name=user.full_name if user else None,
user_id=str(user.id) if user else (str(chat.id) if chat_type == "dm" else None),
user_name=user.full_name if user else (chat.full_name if hasattr(chat, "full_name") and chat_type == "dm" else None),
thread_id=thread_id_str,
chat_topic=chat_topic,
)
@@ -2859,6 +2954,7 @@ class TelegramAdapter(BasePlatformAdapter):
source=source,
raw_message=message,
message_id=str(message.message_id),
platform_update_id=update_id,
reply_to_message_id=reply_to_id,
reply_to_text=reply_to_text,
auto_skill=topic_skill,
+41 -10
View File
@@ -180,6 +180,8 @@ class WeComAdapter(BasePlatformAdapter):
self._text_batch_split_delay_seconds = float(os.getenv("HERMES_WECOM_TEXT_BATCH_SPLIT_DELAY_SECONDS", "2.0"))
self._pending_text_batches: Dict[str, MessageEvent] = {}
self._pending_text_batch_tasks: Dict[str, asyncio.Task] = {}
self._device_id = uuid.uuid4().hex
self._last_chat_req_ids: Dict[str, str] = {}
# ------------------------------------------------------------------
# Connection lifecycle
@@ -277,7 +279,11 @@ class WeComAdapter(BasePlatformAdapter):
{
"cmd": APP_CMD_SUBSCRIBE,
"headers": {"req_id": req_id},
"body": {"bot_id": self._bot_id, "secret": self._secret},
"body": {
"bot_id": self._bot_id,
"secret": self._secret,
"device_id": self._device_id,
},
}
)
@@ -496,6 +502,11 @@ class WeComAdapter(BasePlatformAdapter):
logger.debug("[%s] DM sender %s blocked by policy", self.name, sender_id)
return
# Cache the inbound req_id after policy checks so proactive sends to
# this chat can fall back to APP_CMD_RESPONSE (required for groups —
# WeCom AI Bots cannot initiate APP_CMD_SEND in group chats).
self._remember_chat_req_id(chat_id, self._payload_req_id(payload))
text, reply_text = self._extract_text(body)
media_urls, media_types = await self._extract_media(body)
message_type = self._derive_message_type(body, text, media_types)
@@ -847,6 +858,23 @@ class WeComAdapter(BasePlatformAdapter):
while len(self._reply_req_ids) > DEDUP_MAX_SIZE:
self._reply_req_ids.pop(next(iter(self._reply_req_ids)))
def _remember_chat_req_id(self, chat_id: str, req_id: str) -> None:
"""Cache the most recent inbound req_id per chat.
Used as a fallback reply target when we need to send into a group
without an explicit ``reply_to`` WeCom AI Bots are blocked from
APP_CMD_SEND in groups and must use APP_CMD_RESPONSE bound to some
prior req_id. Bounded like _reply_req_ids so long-running gateways
don't leak memory across many chats.
"""
normalized_chat_id = str(chat_id or "").strip()
normalized_req_id = str(req_id or "").strip()
if not normalized_chat_id or not normalized_req_id:
return
self._last_chat_req_ids[normalized_chat_id] = normalized_req_id
while len(self._last_chat_req_ids) > DEDUP_MAX_SIZE:
self._last_chat_req_ids.pop(next(iter(self._last_chat_req_ids)))
def _reply_req_id_for_message(self, reply_to: Optional[str]) -> Optional[str]:
normalized = str(reply_to or "").strip()
if not normalized or normalized.startswith("quote:"):
@@ -1163,19 +1191,15 @@ class WeComAdapter(BasePlatformAdapter):
self._raise_for_wecom_error(response, "send media message")
return response
async def _send_reply_stream(self, reply_req_id: str, content: str) -> Dict[str, Any]:
async def _send_reply_markdown(self, reply_req_id: str, content: str) -> Dict[str, Any]:
response = await self._send_reply_request(
reply_req_id,
{
"msgtype": "stream",
"stream": {
"id": self._new_req_id("stream"),
"finish": True,
"content": content[:self.MAX_MESSAGE_LENGTH],
},
"msgtype": "markdown",
"markdown": {"content": content[:self.MAX_MESSAGE_LENGTH]},
},
)
self._raise_for_wecom_error(response, "send reply stream")
self._raise_for_wecom_error(response, "send reply markdown")
return response
async def _send_reply_media_message(
@@ -1235,6 +1259,9 @@ class WeComAdapter(BasePlatformAdapter):
return SendResult(success=False, error=prepared["reject_reason"])
reply_req_id = self._reply_req_id_for_message(reply_to)
if not reply_req_id and chat_id in self._last_chat_req_ids:
reply_req_id = self._last_chat_req_ids[chat_id]
try:
upload_result = await self._upload_media_bytes(
prepared["data"],
@@ -1302,8 +1329,12 @@ class WeComAdapter(BasePlatformAdapter):
try:
reply_req_id = self._reply_req_id_for_message(reply_to)
if not reply_req_id and chat_id in self._last_chat_req_ids:
reply_req_id = self._last_chat_req_ids[chat_id]
if reply_req_id:
response = await self._send_reply_stream(reply_req_id, content)
response = await self._send_reply_markdown(reply_req_id, content)
else:
response = await self._send_request(
APP_CMD_SEND,
+310 -86
View File
@@ -28,7 +28,7 @@ import uuid
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from urllib.parse import quote
from urllib.parse import quote, urlparse
logger = logging.getLogger(__name__)
@@ -96,6 +96,28 @@ MEDIA_VIDEO = 2
MEDIA_FILE = 3
MEDIA_VOICE = 4
_LIVE_ADAPTERS: Dict[str, Any] = {}
def _make_ssl_connector() -> Optional["aiohttp.TCPConnector"]:
"""Return a TCPConnector with a certifi CA bundle, or None if certifi is unavailable.
Tencent's iLink server (``ilinkai.weixin.qq.com``) is not verifiable against
some system CA stores (notably Homebrew's OpenSSL on macOS Apple Silicon).
When ``certifi`` is installed, use its Mozilla CA bundle to guarantee
verification. Otherwise fall back to aiohttp's default (which honors
``SSL_CERT_FILE`` env var via ``trust_env=True``).
"""
try:
import ssl
import certifi
except ImportError:
return None
if not AIOHTTP_AVAILABLE:
return None
ssl_ctx = ssl.create_default_context(cafile=certifi.where())
return aiohttp.TCPConnector(ssl=ssl_ctx)
ITEM_TEXT = 1
ITEM_IMAGE = 2
ITEM_VOICE = 3
@@ -398,7 +420,12 @@ async def _send_message(
text: str,
context_token: Optional[str],
client_id: str,
) -> None:
) -> Dict[str, Any]:
"""Send a text message via iLink sendmessage API.
Returns the raw API response dict (may contain error codes like
``errcode: -14`` for session expiry that the caller can inspect).
"""
if not text or not text.strip():
raise ValueError("_send_message: text must not be empty")
message: Dict[str, Any] = {
@@ -411,7 +438,7 @@ async def _send_message(
}
if context_token:
message["context_token"] = context_token
await _api_post(
return await _api_post(
session,
base_url=base_url,
endpoint=EP_SEND_MESSAGE,
@@ -533,6 +560,39 @@ async def _download_bytes(
return await response.read()
_WEIXIN_CDN_ALLOWLIST: frozenset[str] = frozenset(
{
"novac2c.cdn.weixin.qq.com",
"ilinkai.weixin.qq.com",
"wx.qlogo.cn",
"thirdwx.qlogo.cn",
"res.wx.qq.com",
"mmbiz.qpic.cn",
"mmbiz.qlogo.cn",
}
)
def _assert_weixin_cdn_url(url: str) -> None:
"""Raise ValueError if *url* does not point at a known WeChat CDN host."""
try:
parsed = urlparse(url)
scheme = parsed.scheme.lower()
host = parsed.hostname or ""
except Exception as exc: # noqa: BLE001
raise ValueError(f"Unparseable media URL: {url!r}") from exc
if scheme not in ("http", "https"):
raise ValueError(
f"Media URL has disallowed scheme {scheme!r}; only http/https are permitted."
)
if host not in _WEIXIN_CDN_ALLOWLIST:
raise ValueError(
f"Media URL host {host!r} is not in the WeChat CDN allowlist. "
"Refusing to fetch to prevent SSRF."
)
def _media_reference(item: Dict[str, Any], key: str) -> Dict[str, Any]:
return (item.get(key) or {}).get("media") or {}
@@ -553,6 +613,7 @@ async def _download_and_decrypt_media(
timeout_seconds=timeout_seconds,
)
elif full_url:
_assert_weixin_cdn_url(full_url)
raw = await _download_bytes(session, url=full_url, timeout_seconds=timeout_seconds)
else:
raise RuntimeError("media item had neither encrypt_query_param nor full_url")
@@ -623,42 +684,31 @@ def _rewrite_table_block_for_weixin(lines: List[str]) -> str:
def _normalize_markdown_blocks(content: str) -> str:
lines = content.splitlines()
result: List[str] = []
i = 0
in_code_block = False
blank_run = 0
while i < len(lines):
line = lines[i].rstrip()
fence_match = _FENCE_RE.match(line.strip())
if fence_match:
for raw_line in lines:
line = raw_line.rstrip()
if _FENCE_RE.match(line.strip()):
in_code_block = not in_code_block
result.append(line)
i += 1
blank_run = 0
continue
if in_code_block:
result.append(line)
i += 1
continue
if (
i + 1 < len(lines)
and "|" in lines[i]
and _TABLE_RULE_RE.match(lines[i + 1].rstrip())
):
table_lines = [lines[i].rstrip(), lines[i + 1].rstrip()]
i += 2
while i < len(lines) and "|" in lines[i]:
table_lines.append(lines[i].rstrip())
i += 1
result.append(_rewrite_table_block_for_weixin(table_lines))
if not line.strip():
blank_run += 1
if blank_run <= 1:
result.append("")
continue
result.append(_MARKDOWN_LINK_RE.sub(r"\1 (\2)", _rewrite_headers_for_weixin(line)))
i += 1
blank_run = 0
result.append(line)
normalized = "\n".join(item.rstrip() for item in result)
normalized = re.sub(r"\n{3,}", "\n\n", normalized)
return normalized.strip()
return "\n".join(result).strip()
def _split_markdown_blocks(content: str) -> List[str]:
@@ -704,8 +754,8 @@ def _split_delivery_units_for_weixin(content: str) -> List[str]:
Weixin can render Markdown, but chat readability is better when top-level
line breaks become separate messages. Keep fenced code blocks intact and
attach indented continuation lines to the previous top-level line so
transformed tables/lists do not get torn apart.
attach indented continuation lines to the previous top-level line so nested
list items do not get torn apart.
"""
units: List[str] = []
@@ -747,7 +797,9 @@ def _looks_like_chatty_line_for_weixin(line: str) -> bool:
return False
if line.startswith((" ", "\t")):
return False
if stripped.startswith((">", "-", "*", "")):
if stripped.startswith((">", "-", "*", "", "#", "|")):
return False
if _TABLE_RULE_RE.match(stripped):
return False
if re.match(r"^\*\*[^*]+\*\*$", stripped):
return False
@@ -757,10 +809,12 @@ def _looks_like_chatty_line_for_weixin(line: str) -> bool:
def _looks_like_heading_line_for_weixin(line: str) -> bool:
"""Return True when a short line behaves like a plain-text heading."""
"""Return True when a short line behaves like a heading."""
stripped = line.strip()
if not stripped:
return False
if _HEADER_RE.match(stripped):
return True
return len(stripped) <= 24 and stripped.endswith((":", ""))
@@ -935,7 +989,7 @@ async def qr_login(
if not AIOHTTP_AVAILABLE:
raise RuntimeError("aiohttp is required for Weixin QR login")
async with aiohttp.ClientSession(trust_env=True) as session:
async with aiohttp.ClientSession(trust_env=True, connector=_make_ssl_connector()) as session:
try:
qr_resp = await _api_get(
session,
@@ -953,6 +1007,10 @@ async def qr_login(
logger.error("weixin: QR response missing qrcode")
return None
# qrcode_url is the full scannable liteapp URL; qrcode_value is just the hex token
# WeChat needs to scan the full URL, not the raw hex string
qr_scan_data = qrcode_url if qrcode_url else qrcode_value
print("\n请使用微信扫描以下二维码:")
if qrcode_url:
print(qrcode_url)
@@ -960,11 +1018,11 @@ async def qr_login(
import qrcode
qr = qrcode.QRCode()
qr.add_data(qrcode_url or qrcode_value)
qr.add_data(qr_scan_data)
qr.make(fit=True)
qr.print_ascii(invert=True)
except Exception:
print("(终端二维码渲染失败,请直接打开上面的二维码链接)")
except Exception as _qr_exc:
print(f"(终端二维码渲染失败: {_qr_exc},请直接打开上面的二维码链接)")
deadline = time.time() + timeout_seconds
current_base_url = ILINK_BASE_URL
@@ -1010,8 +1068,17 @@ async def qr_login(
)
qrcode_value = str(qr_resp.get("qrcode") or "")
qrcode_url = str(qr_resp.get("qrcode_img_content") or "")
qr_scan_data = qrcode_url if qrcode_url else qrcode_value
if qrcode_url:
print(qrcode_url)
try:
import qrcode as _qrcode
qr = _qrcode.QRCode()
qr.add_data(qr_scan_data)
qr.make(fit=True)
qr.print_ascii(invert=True)
except Exception:
pass
except Exception as exc:
logger.error("weixin: QR refresh failed: %s", exc)
return None
@@ -1059,7 +1126,8 @@ class WeixinAdapter(BasePlatformAdapter):
self._hermes_home = hermes_home
self._token_store = ContextTokenStore(hermes_home)
self._typing_cache = TypingTicketCache()
self._session: Optional[aiohttp.ClientSession] = None
self._poll_session: Optional[aiohttp.ClientSession] = None
self._send_session: Optional[aiohttp.ClientSession] = None
self._poll_task: Optional[asyncio.Task] = None
self._dedup = MessageDeduplicator(ttl_seconds=MESSAGE_DEDUP_TTL_SECONDS)
@@ -1134,14 +1202,17 @@ class WeixinAdapter(BasePlatformAdapter):
except Exception as exc:
logger.debug("[%s] Token lock unavailable (non-fatal): %s", self.name, exc)
self._session = aiohttp.ClientSession(trust_env=True)
self._poll_session = aiohttp.ClientSession(trust_env=True, connector=_make_ssl_connector())
self._send_session = aiohttp.ClientSession(trust_env=True, connector=_make_ssl_connector())
self._token_store.restore(self._account_id)
self._poll_task = asyncio.create_task(self._poll_loop(), name="weixin-poll")
self._mark_connected()
_LIVE_ADAPTERS[self._token] = self
logger.info("[%s] Connected account=%s base=%s", self.name, _safe_id(self._account_id), self._base_url)
return True
async def disconnect(self) -> None:
_LIVE_ADAPTERS.pop(self._token, None)
self._running = False
if self._poll_task and not self._poll_task.done():
self._poll_task.cancel()
@@ -1150,15 +1221,18 @@ class WeixinAdapter(BasePlatformAdapter):
except asyncio.CancelledError:
pass
self._poll_task = None
if self._session and not self._session.closed:
await self._session.close()
self._session = None
if self._poll_session and not self._poll_session.closed:
await self._poll_session.close()
self._poll_session = None
if self._send_session and not self._send_session.closed:
await self._send_session.close()
self._send_session = None
self._release_platform_lock()
self._mark_disconnected()
logger.info("[%s] Disconnected", self.name)
async def _poll_loop(self) -> None:
assert self._session is not None
assert self._poll_session is not None
sync_buf = _load_sync_buf(self._hermes_home, self._account_id)
timeout_ms = LONG_POLL_TIMEOUT_MS
consecutive_failures = 0
@@ -1166,7 +1240,7 @@ class WeixinAdapter(BasePlatformAdapter):
while self._running:
try:
response = await _get_updates(
self._session,
self._poll_session,
base_url=self._base_url,
token=self._token,
sync_buf=sync_buf,
@@ -1223,7 +1297,7 @@ class WeixinAdapter(BasePlatformAdapter):
logger.error("[%s] unhandled inbound error from=%s: %s", self.name, _safe_id(message.get("from_user_id")), exc, exc_info=True)
async def _process_message(self, message: Dict[str, Any]) -> None:
assert self._session is not None
assert self._poll_session is not None
sender_id = str(message.get("from_user_id") or "").strip()
if not sender_id:
return
@@ -1316,7 +1390,7 @@ class WeixinAdapter(BasePlatformAdapter):
media = _media_reference(item, "image_item")
try:
data = await _download_and_decrypt_media(
self._session,
self._poll_session,
cdn_base_url=self._cdn_base_url,
encrypted_query_param=media.get("encrypt_query_param"),
aes_key_b64=(item.get("image_item") or {}).get("aeskey")
@@ -1334,7 +1408,7 @@ class WeixinAdapter(BasePlatformAdapter):
media = _media_reference(item, "video_item")
try:
data = await _download_and_decrypt_media(
self._session,
self._poll_session,
cdn_base_url=self._cdn_base_url,
encrypted_query_param=media.get("encrypt_query_param"),
aes_key_b64=media.get("aes_key"),
@@ -1353,7 +1427,7 @@ class WeixinAdapter(BasePlatformAdapter):
mime = _mime_from_filename(filename)
try:
data = await _download_and_decrypt_media(
self._session,
self._poll_session,
cdn_base_url=self._cdn_base_url,
encrypted_query_param=media.get("encrypt_query_param"),
aes_key_b64=media.get("aes_key"),
@@ -1372,7 +1446,7 @@ class WeixinAdapter(BasePlatformAdapter):
return None
try:
data = await _download_and_decrypt_media(
self._session,
self._poll_session,
cdn_base_url=self._cdn_base_url,
encrypted_query_param=media.get("encrypt_query_param"),
aes_key_b64=media.get("aes_key"),
@@ -1385,13 +1459,13 @@ class WeixinAdapter(BasePlatformAdapter):
return None
async def _maybe_fetch_typing_ticket(self, user_id: str, context_token: Optional[str]) -> None:
if not self._session or not self._token:
if not self._poll_session or not self._token:
return
if self._typing_cache.get(user_id):
return
try:
response = await _get_config(
self._session,
self._poll_session,
base_url=self._base_url,
token=self._token,
user_id=user_id,
@@ -1416,12 +1490,19 @@ class WeixinAdapter(BasePlatformAdapter):
context_token: Optional[str],
client_id: str,
) -> None:
"""Send a single text chunk with per-chunk retry and backoff."""
"""Send a single text chunk with per-chunk retry and backoff.
On session-expired errors (errcode -14), automatically retries
*without* ``context_token`` iLink accepts tokenless sends as a
degraded fallback, which keeps cron-initiated push messages working
even when no user message has refreshed the session recently.
"""
last_error: Optional[Exception] = None
retried_without_token = False
for attempt in range(self._send_chunk_retries + 1):
try:
await _send_message(
self._session,
resp = await _send_message(
self._send_session,
base_url=self._base_url,
token=self._token,
to=chat_id,
@@ -1429,6 +1510,31 @@ class WeixinAdapter(BasePlatformAdapter):
context_token=context_token,
client_id=client_id,
)
# Check iLink response for session-expired error
if resp and isinstance(resp, dict):
ret = resp.get("ret")
errcode = resp.get("errcode")
if (ret is not None and ret not in (0,)) or (errcode is not None and errcode not in (0,)):
is_session_expired = (
ret == SESSION_EXPIRED_ERRCODE
or errcode == SESSION_EXPIRED_ERRCODE
)
# Session expired — strip token and retry once
if is_session_expired and not retried_without_token and context_token:
retried_without_token = True
context_token = None
self._token_store._cache.pop(
self._token_store._key(self._account_id, chat_id), None
)
logger.warning(
"[%s] session expired for %s; retrying without context_token",
self.name, _safe_id(chat_id),
)
continue
errmsg = resp.get("errmsg") or resp.get("msg") or "unknown error"
raise RuntimeError(
f"iLink sendmessage error: ret={ret} errcode={errcode} errmsg={errmsg}"
)
return
except Exception as exc:
last_error = exc
@@ -1456,12 +1562,48 @@ class WeixinAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
if not self._session or not self._token:
if not self._send_session or not self._token:
return SendResult(success=False, error="Not connected")
context_token = self._token_store.get(self._account_id, chat_id)
last_message_id: Optional[str] = None
# Extract MEDIA: tags and bare local file paths before text delivery.
media_files, cleaned_content = self.extract_media(content)
_, image_cleaned = self.extract_images(cleaned_content)
local_files, final_content = self.extract_local_files(image_cleaned)
_AUDIO_EXTS = {".ogg", ".opus", ".mp3", ".wav", ".m4a"}
_VIDEO_EXTS = {".mp4", ".mov", ".avi", ".mkv", ".webm", ".3gp"}
_IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}
async def _deliver_media(path: str, is_voice: bool = False) -> None:
ext = Path(path).suffix.lower()
if is_voice or ext in _AUDIO_EXTS:
await self.send_voice(chat_id=chat_id, audio_path=path, metadata=metadata)
elif ext in _VIDEO_EXTS:
await self.send_video(chat_id=chat_id, video_path=path, metadata=metadata)
elif ext in _IMAGE_EXTS:
await self.send_image_file(chat_id=chat_id, image_path=path, metadata=metadata)
else:
await self.send_document(chat_id=chat_id, file_path=path, metadata=metadata)
try:
chunks = [c for c in self._split_text(self.format_message(content)) if c and c.strip()]
# Deliver extracted MEDIA: attachments first.
for media_path, is_voice in media_files:
try:
await _deliver_media(media_path, is_voice)
except Exception as exc:
logger.warning("[%s] media delivery failed for %s: %s", self.name, media_path, exc)
# Deliver bare local file paths.
for file_path in local_files:
try:
await _deliver_media(file_path, is_voice=False)
except Exception as exc:
logger.warning("[%s] local file delivery failed for %s: %s", self.name, file_path, exc)
# Deliver text content.
chunks = [c for c in self._split_text(self.format_message(final_content)) if c and c.strip()]
for idx, chunk in enumerate(chunks):
client_id = f"hermes-weixin-{uuid.uuid4().hex}"
await self._send_text_chunk(
@@ -1479,14 +1621,14 @@ class WeixinAdapter(BasePlatformAdapter):
return SendResult(success=False, error=str(exc))
async def send_typing(self, chat_id: str, metadata: Optional[Dict[str, Any]] = None) -> None:
if not self._session or not self._token:
if not self._send_session or not self._token:
return
typing_ticket = self._typing_cache.get(chat_id)
if not typing_ticket:
return
try:
await _send_typing(
self._session,
self._send_session,
base_url=self._base_url,
token=self._token,
to_user_id=chat_id,
@@ -1497,14 +1639,14 @@ class WeixinAdapter(BasePlatformAdapter):
logger.debug("[%s] typing start failed for %s: %s", self.name, _safe_id(chat_id), exc)
async def stop_typing(self, chat_id: str) -> None:
if not self._session or not self._token:
if not self._send_session or not self._token:
return
typing_ticket = self._typing_cache.get(chat_id)
if not typing_ticket:
return
try:
await _send_typing(
self._session,
self._send_session,
base_url=self._base_url,
token=self._token,
to_user_id=chat_id,
@@ -1542,24 +1684,35 @@ class WeixinAdapter(BasePlatformAdapter):
async def send_image_file(
self,
chat_id: str,
path: str,
caption: str = "",
image_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs,
) -> SendResult:
return await self.send_document(chat_id, file_path=path, caption=caption, metadata=metadata)
del reply_to, kwargs
return await self.send_document(
chat_id=chat_id,
file_path=image_path,
caption=caption,
metadata=metadata,
)
async def send_document(
self,
chat_id: str,
file_path: str,
caption: str = "",
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
**kwargs,
) -> SendResult:
if not self._session or not self._token:
del file_name, reply_to, metadata, kwargs
if not self._send_session or not self._token:
return SendResult(success=False, error="Not connected")
try:
message_id = await self._send_file(chat_id, file_path, caption)
message_id = await self._send_file(chat_id, file_path, caption or "")
return SendResult(success=True, message_id=message_id)
except Exception as exc:
logger.error("[%s] send_document failed to=%s: %s", self.name, _safe_id(chat_id), exc)
@@ -1573,7 +1726,7 @@ class WeixinAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
if not self._session or not self._token:
if not self._send_session or not self._token:
return SendResult(success=False, error="Not connected")
try:
message_id = await self._send_file(chat_id, video_path, caption or "")
@@ -1590,7 +1743,24 @@ class WeixinAdapter(BasePlatformAdapter):
reply_to: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None,
) -> SendResult:
return await self.send_document(chat_id, audio_path, caption=caption or "", metadata=metadata)
if not self._send_session or not self._token:
return SendResult(success=False, error="Not connected")
# Native outbound Weixin voice bubbles are not proven-working in the
# upstream reference implementation. Prefer a reliable file attachment
# fallback so users at least receive playable audio, even for .silk.
fallback_caption = caption or "[voice message as attachment]"
try:
message_id = await self._send_file(
chat_id,
audio_path,
fallback_caption,
force_file_attachment=True,
)
return SendResult(success=True, message_id=message_id)
except Exception as exc:
logger.error("[%s] send_voice failed to=%s: %s", self.name, _safe_id(chat_id), exc)
return SendResult(success=False, error=str(exc))
async def _download_remote_media(self, url: str) -> str:
from tools.url_safety import is_safe_url
@@ -1598,8 +1768,8 @@ class WeixinAdapter(BasePlatformAdapter):
if not is_safe_url(url):
raise ValueError(f"Blocked unsafe URL (SSRF protection): {url}")
assert self._session is not None
async with self._session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as response:
assert self._send_session is not None
async with self._send_session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as response:
response.raise_for_status()
data = await response.read()
suffix = Path(url.split("?", 1)[0]).suffix or ".bin"
@@ -1607,16 +1777,22 @@ class WeixinAdapter(BasePlatformAdapter):
handle.write(data)
return handle.name
async def _send_file(self, chat_id: str, path: str, caption: str) -> str:
assert self._session is not None and self._token is not None
async def _send_file(
self,
chat_id: str,
path: str,
caption: str,
force_file_attachment: bool = False,
) -> str:
assert self._send_session is not None and self._token is not None
plaintext = Path(path).read_bytes()
media_type, item_builder = self._outbound_media_builder(path)
media_type, item_builder = self._outbound_media_builder(path, force_file_attachment=force_file_attachment)
filekey = secrets.token_hex(16)
aes_key = secrets.token_bytes(16)
rawsize = len(plaintext)
rawfilemd5 = hashlib.md5(plaintext).hexdigest()
upload_response = await _get_upload_url(
self._session,
self._send_session,
base_url=self._base_url,
token=self._token,
to_user_id=chat_id,
@@ -1642,30 +1818,34 @@ class WeixinAdapter(BasePlatformAdapter):
raise RuntimeError(f"getUploadUrl returned neither upload_param nor upload_full_url: {upload_response}")
encrypted_query_param = await _upload_ciphertext(
self._session,
self._send_session,
ciphertext=ciphertext,
upload_url=upload_url,
)
context_token = self._token_store.get(self._account_id, chat_id)
# The iLink API expects aes_key as base64(hex_string), not base64(raw_bytes).
# Sending base64(raw_bytes) causes images to show as grey boxes on the
# receiver side because the decryption key doesn't match.
aes_key_for_api = base64.b64encode(aes_key.hex().encode("ascii")).decode("ascii")
media_item = item_builder(
encrypt_query_param=encrypted_query_param,
aes_key_for_api=aes_key_for_api,
ciphertext_size=len(ciphertext),
plaintext_size=rawsize,
filename=Path(path).name,
rawfilemd5=rawfilemd5,
)
item_kwargs = {
"encrypt_query_param": encrypted_query_param,
"aes_key_for_api": aes_key_for_api,
"ciphertext_size": len(ciphertext),
"plaintext_size": rawsize,
"filename": Path(path).name,
"rawfilemd5": rawfilemd5,
}
if media_type == MEDIA_VOICE and path.endswith(".silk"):
item_kwargs["encode_type"] = 6
item_kwargs["sample_rate"] = 24000
item_kwargs["bits_per_sample"] = 16
media_item = item_builder(**item_kwargs)
last_message_id = None
if caption:
last_message_id = f"hermes-weixin-{uuid.uuid4().hex}"
await _send_message(
self._session,
self._send_session,
base_url=self._base_url,
token=self._token,
to=chat_id,
@@ -1676,7 +1856,7 @@ class WeixinAdapter(BasePlatformAdapter):
last_message_id = f"hermes-weixin-{uuid.uuid4().hex}"
await _api_post(
self._session,
self._send_session,
base_url=self._base_url,
endpoint=EP_SEND_MESSAGE,
payload={
@@ -1695,7 +1875,7 @@ class WeixinAdapter(BasePlatformAdapter):
)
return last_message_id
def _outbound_media_builder(self, path: str):
def _outbound_media_builder(self, path: str, force_file_attachment: bool = False):
mime = mimetypes.guess_type(path)[0] or "application/octet-stream"
if mime.startswith("image/"):
return MEDIA_IMAGE, lambda **kw: {
@@ -1723,7 +1903,7 @@ class WeixinAdapter(BasePlatformAdapter):
"video_md5": kw.get("rawfilemd5", ""),
},
}
if mime.startswith("audio/") or path.endswith(".silk"):
if path.endswith(".silk") and not force_file_attachment:
return MEDIA_VOICE, lambda **kw: {
"type": ITEM_VOICE,
"voice_item": {
@@ -1732,9 +1912,25 @@ class WeixinAdapter(BasePlatformAdapter):
"aes_key": kw["aes_key_for_api"],
"encrypt_type": 1,
},
"encode_type": kw.get("encode_type"),
"bits_per_sample": kw.get("bits_per_sample"),
"sample_rate": kw.get("sample_rate"),
"playtime": kw.get("playtime", 0),
},
}
if mime.startswith("audio/"):
return MEDIA_FILE, lambda **kw: {
"type": ITEM_FILE,
"file_item": {
"media": {
"encrypt_query_param": kw["encrypt_query_param"],
"aes_key": kw["aes_key_for_api"],
"encrypt_type": 1,
},
"file_name": kw["filename"],
"len": str(kw["plaintext_size"]),
},
}
return MEDIA_FILE, lambda **kw: {
"type": ITEM_FILE,
"file_item": {
@@ -1784,7 +1980,34 @@ async def send_weixin_direct(
token_store.restore(account_id)
context_token = token_store.get(account_id, chat_id)
async with aiohttp.ClientSession(trust_env=True) as session:
live_adapter = _LIVE_ADAPTERS.get(resolved_token)
send_session = getattr(live_adapter, '_send_session', None)
if live_adapter is not None and send_session is not None and not send_session.closed:
last_result: Optional[SendResult] = None
cleaned = live_adapter.format_message(message)
if cleaned:
last_result = await live_adapter.send(chat_id, cleaned)
if not last_result.success:
return {"error": f"Weixin send failed: {last_result.error}"}
for media_path, _is_voice in media_files or []:
ext = Path(media_path).suffix.lower()
if ext in {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp"}:
last_result = await live_adapter.send_image_file(chat_id, media_path)
else:
last_result = await live_adapter.send_document(chat_id, media_path)
if not last_result.success:
return {"error": f"Weixin media send failed: {last_result.error}"}
return {
"success": True,
"platform": "weixin",
"chat_id": chat_id,
"message_id": last_result.message_id if last_result else None,
"context_token_used": bool(context_token),
}
async with aiohttp.ClientSession(trust_env=True, connector=_make_ssl_connector()) as session:
adapter = WeixinAdapter(
PlatformConfig(
enabled=True,
@@ -1797,6 +2020,7 @@ async def send_weixin_direct(
},
)
)
adapter._send_session = session
adapter._session = session
adapter._token = resolved_token
adapter._account_id = account_id
+749 -31
View File
File diff suppressed because it is too large Load Diff
+156 -3
View File
@@ -82,6 +82,7 @@ class SessionSource:
chat_topic: Optional[str] = None # Channel topic/description (Discord, Slack)
user_id_alt: Optional[str] = None # Signal UUID (alternative to phone number)
chat_id_alt: Optional[str] = None # Signal group internal ID
is_bot: bool = False # True when the message author is a bot/webhook (Discord)
@property
def description(self) -> str:
@@ -376,7 +377,19 @@ class SessionEntry:
# this session (create a new session_id) so the user starts fresh.
# Set by /stop to break stuck-resume loops (#7536).
suspended: bool = False
# When True the session was interrupted by a gateway restart/shutdown
# drain timeout, but recovery is still expected. Unlike ``suspended``,
# ``resume_pending`` preserves the existing session_id on next access —
# the user stays on the same transcript and the agent auto-continues
# from where it left off. Cleared after the next successful turn.
# Escalation to ``suspended`` is handled by the existing
# ``.restart_failure_counts`` stuck-loop counter (#7536), not by a
# parallel counter on this entry.
resume_pending: bool = False
resume_reason: Optional[str] = None # e.g. "restart_timeout"
last_resume_marked_at: Optional[datetime] = None
def to_dict(self) -> Dict[str, Any]:
result = {
"session_key": self.session_key,
@@ -396,6 +409,13 @@ class SessionEntry:
"cost_status": self.cost_status,
"memory_flushed": self.memory_flushed,
"suspended": self.suspended,
"resume_pending": self.resume_pending,
"resume_reason": self.resume_reason,
"last_resume_marked_at": (
self.last_resume_marked_at.isoformat()
if self.last_resume_marked_at
else None
),
}
if self.origin:
result["origin"] = self.origin.to_dict()
@@ -413,7 +433,15 @@ class SessionEntry:
platform = Platform(data["platform"])
except ValueError as e:
logger.debug("Unknown platform value %r: %s", data["platform"], e)
last_resume_marked_at = None
_lrma = data.get("last_resume_marked_at")
if _lrma:
try:
last_resume_marked_at = datetime.fromisoformat(_lrma)
except (TypeError, ValueError):
last_resume_marked_at = None
return cls(
session_key=data["session_key"],
session_id=data["session_id"],
@@ -433,6 +461,9 @@ class SessionEntry:
cost_status=data.get("cost_status", "unknown"),
memory_flushed=data.get("memory_flushed", False),
suspended=data.get("suspended", False),
resume_pending=data.get("resume_pending", False),
resume_reason=data.get("resume_reason"),
last_resume_marked_at=last_resume_marked_at,
)
@@ -709,9 +740,23 @@ class SessionStore:
entry = self._entries[session_key]
# Auto-reset sessions marked as suspended (e.g. after /stop
# broke a stuck loop — #7536).
# broke a stuck loop — #7536). ``suspended`` is the hard
# forced-wipe signal and always wins over ``resume_pending``,
# so repeated interrupted restarts that escalate via the
# existing ``.restart_failure_counts`` stuck-loop counter
# still converge to a clean slate.
if entry.suspended:
reset_reason = "suspended"
elif entry.resume_pending:
# Restart-interrupted session: preserve the session_id
# and return the existing entry so the transcript
# reloads intact. ``resume_pending`` is cleared after
# the NEXT successful turn completes (not here), which
# means a re-interrupted retry keeps trying — the
# stuck-loop counter handles terminal escalation.
entry.updated_at = now
self._save()
return entry
else:
reset_reason = self._should_reset(entry, source)
if not reset_reason:
@@ -801,6 +846,106 @@ class SessionStore:
return True
return False
def mark_resume_pending(
self,
session_key: str,
reason: str = "restart_timeout",
) -> bool:
"""Mark a session as resumable after a restart interruption.
Unlike ``suspend_session()``, this preserves the existing
``session_id`` and the transcript. The next call to
``get_or_create_session()`` for this key returns the same entry
so the user auto-resumes on the same conversation lane.
Returns True if the session existed and was marked.
"""
with self._lock:
self._ensure_loaded_locked()
if session_key in self._entries:
entry = self._entries[session_key]
# Never override an explicit ``suspended`` — that is a hard
# forced-wipe signal (from /stop or stuck-loop escalation).
if entry.suspended:
return False
entry.resume_pending = True
entry.resume_reason = reason
entry.last_resume_marked_at = _now()
self._save()
return True
return False
def clear_resume_pending(self, session_key: str) -> bool:
"""Clear the resume-pending flag after a successful resumed turn.
Called from the gateway after ``run_conversation()`` returns a
final response for a session that had ``resume_pending=True``,
signalling that recovery succeeded.
Returns True if a flag was cleared.
"""
with self._lock:
self._ensure_loaded_locked()
entry = self._entries.get(session_key)
if entry is None or not entry.resume_pending:
return False
entry.resume_pending = False
entry.resume_reason = None
entry.last_resume_marked_at = None
self._save()
return True
def prune_old_entries(self, max_age_days: int) -> int:
"""Drop SessionEntry records older than max_age_days.
Pruning is based on ``updated_at`` (last activity), not ``created_at``.
A session that's been active within the window is kept regardless of
how old it is. Entries marked ``suspended`` are kept the user
explicitly paused them for later resume. Entries held by an active
process (via has_active_processes_fn) are also kept so long-running
background work isn't orphaned.
Pruning is functionally identical to a natural reset-policy expiry:
the transcript in SQLite stays, but the session_key session_id
mapping is dropped and the user starts a fresh session on return.
``max_age_days <= 0`` disables pruning; returns 0 immediately.
Returns the number of entries removed.
"""
if max_age_days is None or max_age_days <= 0:
return 0
from datetime import timedelta
cutoff = _now() - timedelta(days=max_age_days)
removed_keys: list[str] = []
with self._lock:
self._ensure_loaded_locked()
for key, entry in list(self._entries.items()):
if entry.suspended:
continue
# Never prune sessions with an active background process
# attached — the user may still be waiting on output.
if self._has_active_processes_fn is not None:
try:
if self._has_active_processes_fn(entry.session_id):
continue
except Exception:
pass
if entry.updated_at < cutoff:
removed_keys.append(key)
for key in removed_keys:
self._entries.pop(key, None)
if removed_keys:
self._save()
if removed_keys:
logger.info(
"SessionStore pruned %d entries older than %d days",
len(removed_keys), max_age_days,
)
return len(removed_keys)
def suspend_recently_active(self, max_age_seconds: int = 120) -> int:
"""Mark recently-active sessions as suspended.
@@ -809,6 +954,12 @@ class SessionStore:
(#7536). Only suspends sessions updated within *max_age_seconds*
to avoid resetting long-idle sessions that are harmless to resume.
Returns the number of sessions that were suspended.
Entries flagged ``resume_pending=True`` are skipped those were
marked intentionally by the drain-timeout path as recoverable.
Terminal escalation for genuinely stuck ``resume_pending`` sessions
is handled by the existing ``.restart_failure_counts`` stuck-loop
counter, which runs after this method on startup.
"""
from datetime import timedelta
@@ -817,6 +968,8 @@ class SessionStore:
with self._lock:
self._ensure_loaded_locked()
for entry in self._entries.values():
if entry.resume_pending:
continue
if not entry.suspended and entry.updated_at >= cutoff:
entry.suspended = True
count += 1
+159 -11
View File
@@ -188,8 +188,8 @@ def _write_json_file(path: Path, payload: dict[str, Any]) -> None:
path.write_text(json.dumps(payload))
def _read_pid_record() -> Optional[dict]:
pid_path = _get_pid_path()
def _read_pid_record(pid_path: Optional[Path] = None) -> Optional[dict]:
pid_path = pid_path or _get_pid_path()
if not pid_path.exists():
return None
@@ -212,6 +212,18 @@ def _read_pid_record() -> Optional[dict]:
return None
def _cleanup_invalid_pid_path(pid_path: Path, *, cleanup_stale: bool) -> None:
if not cleanup_stale:
return
try:
if pid_path == _get_pid_path():
remove_pid_file()
else:
pid_path.unlink(missing_ok=True)
except Exception:
pass
def write_pid_file() -> None:
"""Write the current process PID and metadata to the gateway PID file."""
_write_json_file(_get_pid_path(), _build_pid_record())
@@ -413,43 +425,179 @@ def release_all_scoped_locks() -> int:
return removed
def get_running_pid() -> Optional[int]:
# ── --replace takeover marker ─────────────────────────────────────────
#
# When a new gateway starts with ``--replace``, it SIGTERMs the existing
# gateway so it can take over the bot token. PR #5646 made SIGTERM exit
# the gateway with code 1 so ``Restart=on-failure`` can revive it after
# unexpected kills — but that also means a --replace takeover target
# exits 1, which tricks systemd into reviving it 30 seconds later,
# starting a flap loop against the replacer when both services are
# enabled in the user's systemd (e.g. ``hermes.service`` + ``hermes-
# gateway.service``).
#
# The takeover marker breaks the loop: the replacer writes a short-lived
# file naming the target PID + start_time BEFORE sending SIGTERM.
# The target's shutdown handler reads the marker and, if it names
# this process, treats the SIGTERM as a planned takeover and exits 0.
# The marker is unlinked after the target has consumed it, so a stale
# marker left by a crashed replacer can grief at most one future
# shutdown on the same PID — and only within _TAKEOVER_MARKER_TTL_S.
_TAKEOVER_MARKER_FILENAME = ".gateway-takeover.json"
_TAKEOVER_MARKER_TTL_S = 60 # Marker older than this is treated as stale
def _get_takeover_marker_path() -> Path:
"""Return the path to the --replace takeover marker file."""
home = get_hermes_home()
return home / _TAKEOVER_MARKER_FILENAME
def write_takeover_marker(target_pid: int) -> bool:
"""Record that ``target_pid`` is being replaced by the current process.
Captures the target's ``start_time`` so that PID reuse after the
target exits cannot later match the marker. Also records the
replacer's PID and a UTC timestamp for TTL-based staleness checks.
Returns True on successful write, False on any failure. The caller
should proceed with the SIGTERM even if the write fails (the marker
is a best-effort signal, not a correctness requirement).
"""
try:
target_start_time = _get_process_start_time(target_pid)
record = {
"target_pid": target_pid,
"target_start_time": target_start_time,
"replacer_pid": os.getpid(),
"written_at": _utc_now_iso(),
}
_write_json_file(_get_takeover_marker_path(), record)
return True
except (OSError, PermissionError):
return False
def consume_takeover_marker_for_self() -> bool:
"""Check & unlink the takeover marker if it names the current process.
Returns True only when a valid (non-stale) marker names this PID +
start_time. A returning True indicates the current SIGTERM is a
planned --replace takeover; the caller should exit 0 instead of
signalling ``_signal_initiated_shutdown``.
Always unlinks the marker on match (and on detected staleness) so
subsequent unrelated signals don't re-trigger.
"""
path = _get_takeover_marker_path()
record = _read_json_file(path)
if not record:
return False
# Any malformed or stale marker → drop it and return False
try:
target_pid = int(record["target_pid"])
target_start_time = record.get("target_start_time")
written_at = record.get("written_at") or ""
except (KeyError, TypeError, ValueError):
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
# TTL guard: a stale marker older than _TAKEOVER_MARKER_TTL_S is ignored.
stale = False
try:
written_dt = datetime.fromisoformat(written_at)
age = (datetime.now(timezone.utc) - written_dt).total_seconds()
if age > _TAKEOVER_MARKER_TTL_S:
stale = True
except (TypeError, ValueError):
stale = True # Unparseable timestamp — treat as stale
if stale:
try:
path.unlink(missing_ok=True)
except OSError:
pass
return False
# Does the marker name THIS process?
our_pid = os.getpid()
our_start_time = _get_process_start_time(our_pid)
matches = (
target_pid == our_pid
and target_start_time is not None
and our_start_time is not None
and target_start_time == our_start_time
)
# Consume the marker whether it matched or not — a marker that doesn't
# match our identity is stale-for-us anyway.
try:
path.unlink(missing_ok=True)
except OSError:
pass
return matches
def clear_takeover_marker() -> None:
"""Remove the takeover marker unconditionally. Safe to call repeatedly."""
try:
_get_takeover_marker_path().unlink(missing_ok=True)
except OSError:
pass
def get_running_pid(
pid_path: Optional[Path] = None,
*,
cleanup_stale: bool = True,
) -> Optional[int]:
"""Return the PID of a running gateway instance, or ``None``.
Checks the PID file and verifies the process is actually alive.
Cleans up stale PID files automatically.
"""
record = _read_pid_record()
resolved_pid_path = pid_path or _get_pid_path()
record = _read_pid_record(resolved_pid_path)
if not record:
remove_pid_file()
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
try:
pid = int(record["pid"])
except (KeyError, TypeError, ValueError):
remove_pid_file()
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
try:
os.kill(pid, 0) # signal 0 = existence check, no actual signal sent
except (ProcessLookupError, PermissionError):
remove_pid_file()
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
recorded_start = record.get("start_time")
current_start = _get_process_start_time(pid)
if recorded_start is not None and current_start is not None and current_start != recorded_start:
remove_pid_file()
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
if not _looks_like_gateway_process(pid):
if not _record_looks_like_gateway(record):
remove_pid_file()
_cleanup_invalid_pid_path(resolved_pid_path, cleanup_stale=cleanup_stale)
return None
return pid
def is_gateway_running() -> bool:
def is_gateway_running(
pid_path: Optional[Path] = None,
*,
cleanup_stale: bool = True,
) -> bool:
"""Check if the gateway daemon is currently running."""
return get_running_pid() is not None
return get_running_pid(pid_path, cleanup_stale=cleanup_stale) is not None
+94 -6
View File
@@ -100,6 +100,14 @@ class GatewayStreamConsumer:
self._flood_strikes = 0 # Consecutive flood-control edit failures
self._current_edit_interval = self.cfg.edit_interval # Adaptive backoff
self._final_response_sent = False
# Cache adapter lifecycle capability: only platforms that need an
# explicit finalize call (e.g. DingTalk AI Cards) force us to make
# a redundant final edit. Everyone else keeps the fast path.
# Use ``is True`` (not ``bool(...)``) so MagicMock attribute access
# in tests doesn't incorrectly enable this path.
self._adapter_requires_finalize: bool = (
getattr(adapter, "REQUIRES_EDIT_FINALIZE", False) is True
)
# Think-block filter state (mirrors CLI's _stream_delta tag suppression)
self._in_think_block = False
@@ -361,7 +369,16 @@ class GatewayStreamConsumer:
if not got_done and not got_segment_break and commentary_text is None:
display_text += self.cfg.cursor
current_update_visible = await self._send_or_edit(display_text)
# Segment break: finalize the current message so platforms
# that need explicit closure (e.g. DingTalk AI Cards) don't
# leave the previous segment stuck in a loading state when
# the next segment (tool progress, next chunk) creates a
# new message below it. got_done has its own finalize
# path below so we don't finalize here for it.
current_update_visible = await self._send_or_edit(
display_text,
finalize=got_segment_break,
)
self._last_edit_time = time.monotonic()
if got_done:
@@ -372,10 +389,22 @@ class GatewayStreamConsumer:
if self._accumulated:
if self._fallback_final_send:
await self._send_fallback_final(self._accumulated)
elif current_update_visible:
elif (
current_update_visible
and not self._adapter_requires_finalize
):
# Mid-stream edit above already delivered the
# final accumulated content. Skip the redundant
# final edit — but only for adapters that don't
# need an explicit finalize signal.
self._final_response_sent = True
elif self._message_id:
self._final_response_sent = await self._send_or_edit(self._accumulated)
# Either the mid-stream edit didn't run (no
# visible update this tick) OR the adapter needs
# explicit finalize=True to close the stream.
self._final_response_sent = await self._send_or_edit(
self._accumulated, finalize=True,
)
elif not self._already_sent:
self._final_response_sent = await self._send_or_edit(self._accumulated)
return
@@ -401,6 +430,21 @@ class GatewayStreamConsumer:
# a real string like "msg_1", not "__no_edit__", so that case
# still resets and creates a fresh segment as intended.)
if got_segment_break:
# If the segment-break edit failed to deliver the
# accumulated content (flood control that has not yet
# promoted to fallback mode, or fallback mode itself),
# _accumulated still holds pre-boundary text the user
# never saw. Flush that tail as a continuation message
# before the reset below wipes _accumulated — otherwise
# text generated before the tool boundary is silently
# dropped (issue #8124).
if (
self._accumulated
and not current_update_visible
and self._message_id
and self._message_id != "__no_edit__"
):
await self._flush_segment_tail_on_edit_failure()
self._reset_segment_state(preserve_no_edit=True)
await asyncio.sleep(0.05) # Small yield to not busy-loop
@@ -591,6 +635,39 @@ class GatewayStreamConsumer:
err_lower = err.lower()
return "flood" in err_lower or "retry after" in err_lower or "rate" in err_lower
async def _flush_segment_tail_on_edit_failure(self) -> None:
"""Deliver un-sent tail content before a segment-break reset.
When an edit fails (flood control, transport error) and a tool
boundary arrives before the next retry, ``_accumulated`` holds text
that was generated but never shown to the user. Without this flush,
the segment reset would discard that tail and leave a frozen cursor
in the partial message.
Sends the tail that sits after the last successfully-delivered
prefix as a new message, and best-effort strips the stuck cursor
from the previous partial message.
"""
if not self._fallback_final_send:
await self._try_strip_cursor()
visible = self._fallback_prefix or self._visible_prefix()
tail = self._accumulated
if visible and tail.startswith(visible):
tail = tail[len(visible):].lstrip()
tail = self._clean_for_display(tail)
if not tail.strip():
return
try:
result = await self.adapter.send(
chat_id=self.chat_id,
content=tail,
metadata=self.metadata,
)
if result.success:
self._already_sent = True
except Exception as e:
logger.error("Segment-break tail flush error: %s", e)
async def _try_strip_cursor(self) -> None:
"""Best-effort edit to remove the cursor from the last visible message.
@@ -633,12 +710,15 @@ class GatewayStreamConsumer:
logger.error("Commentary send error: %s", e)
return False
async def _send_or_edit(self, text: str) -> bool:
async def _send_or_edit(self, text: str, *, finalize: bool = False) -> bool:
"""Send or edit the streaming message.
Returns True if the text was successfully delivered (sent or edited),
False otherwise. Callers like the overflow split loop use this to
decide whether to advance past the delivered chunk.
``finalize`` is True when this is the last edit in a streaming
sequence.
"""
# Strip MEDIA: directives so they don't appear as visible text.
# Media files are delivered as native attachments after the stream
@@ -672,14 +752,22 @@ class GatewayStreamConsumer:
try:
if self._message_id is not None:
if self._edit_supported:
# Skip if text is identical to what we last sent
if text == self._last_sent_text:
# Skip if text is identical to what we last sent.
# Exception: adapters that require an explicit finalize
# call (REQUIRES_EDIT_FINALIZE) must still receive the
# finalize=True edit even when content is unchanged, so
# their streaming UI can transition out of the in-
# progress state. Everyone else short-circuits.
if text == self._last_sent_text and not (
finalize and self._adapter_requires_finalize
):
return True
# Edit existing message
result = await self.adapter.edit_message(
chat_id=self.chat_id,
message_id=self._message_id,
content=text,
finalize=finalize,
)
if result.success:
self._already_sent = True
+121 -68
View File
@@ -233,6 +233,14 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
api_key_env_vars=("XAI_API_KEY",),
base_url_env_var="XAI_BASE_URL",
),
"nvidia": ProviderConfig(
id="nvidia",
name="NVIDIA NIM",
auth_type="api_key",
inference_base_url="https://integrate.api.nvidia.com/v1",
api_key_env_vars=("NVIDIA_API_KEY",),
base_url_env_var="NVIDIA_BASE_URL",
),
"ai-gateway": ProviderConfig(
id="ai-gateway",
name="Vercel AI Gateway",
@@ -773,6 +781,28 @@ def is_source_suppressed(provider_id: str, source: str) -> bool:
return False
def unsuppress_credential_source(provider_id: str, source: str) -> bool:
"""Clear a suppression marker so the source will be re-seeded on the next load.
Returns True if a marker was cleared, False if no marker existed.
"""
with _auth_store_lock():
auth_store = _load_auth_store()
suppressed = auth_store.get("suppressed_sources")
if not isinstance(suppressed, dict):
return False
provider_list = suppressed.get(provider_id)
if not isinstance(provider_list, list) or source not in provider_list:
return False
provider_list.remove(source)
if not provider_list:
suppressed.pop(provider_id, None)
if not suppressed:
auth_store.pop("suppressed_sources", None)
_save_auth_store(auth_store)
return True
def get_provider_auth_state(provider_id: str) -> Optional[Dict[str, Any]]:
"""Return persisted auth state for a provider, or None."""
auth_store = _load_auth_store()
@@ -1404,49 +1434,6 @@ def _read_codex_tokens(*, _lock: bool = True) -> Dict[str, Any]:
}
def _write_codex_cli_tokens(
access_token: str,
refresh_token: str,
*,
last_refresh: Optional[str] = None,
) -> None:
"""Write refreshed tokens back to ~/.codex/auth.json.
OpenAI OAuth refresh tokens are single-use and rotate on every refresh.
When Hermes refreshes a token it consumes the old refresh_token; if we
don't write the new pair back, the Codex CLI (or VS Code extension) will
fail with ``refresh_token_reused`` on its next refresh attempt.
This mirrors the Anthropic write-back to ~/.claude/.credentials.json
via ``_write_claude_code_credentials()``.
"""
codex_home = os.getenv("CODEX_HOME", "").strip()
if not codex_home:
codex_home = str(Path.home() / ".codex")
auth_path = Path(codex_home).expanduser() / "auth.json"
try:
existing: Dict[str, Any] = {}
if auth_path.is_file():
existing = json.loads(auth_path.read_text(encoding="utf-8"))
if not isinstance(existing, dict):
existing = {}
tokens_dict = existing.get("tokens")
if not isinstance(tokens_dict, dict):
tokens_dict = {}
tokens_dict["access_token"] = access_token
tokens_dict["refresh_token"] = refresh_token
existing["tokens"] = tokens_dict
if last_refresh is not None:
existing["last_refresh"] = last_refresh
auth_path.parent.mkdir(parents=True, exist_ok=True)
auth_path.write_text(json.dumps(existing, indent=2), encoding="utf-8")
auth_path.chmod(0o600)
except (OSError, IOError) as exc:
logger.debug("Failed to write refreshed tokens to %s: %s", auth_path, exc)
def _save_codex_tokens(tokens: Dict[str, str], last_refresh: str = None) -> None:
"""Save Codex OAuth tokens to Hermes auth store (~/.hermes/auth.json)."""
if last_refresh is None:
@@ -1514,6 +1501,11 @@ def refresh_codex_oauth_pure(
"then run `hermes auth` to re-authenticate."
)
relogin_required = True
# A 401/403 from the token endpoint always means the refresh token
# is invalid/expired — force relogin even if the body error code
# wasn't one of the known strings above.
if response.status_code in (401, 403) and not relogin_required:
relogin_required = True
raise AuthError(
message,
provider="openai-codex",
@@ -1569,12 +1561,6 @@ def _refresh_codex_auth_tokens(
updated_tokens["refresh_token"] = refreshed["refresh_token"]
_save_codex_tokens(updated_tokens)
# Write back to ~/.codex/auth.json so Codex CLI / VS Code stay in sync.
_write_codex_cli_tokens(
refreshed["access_token"],
refreshed["refresh_token"],
last_refresh=refreshed.get("last_refresh"),
)
return updated_tokens
@@ -1619,25 +1605,7 @@ def resolve_codex_runtime_credentials(
refresh_skew_seconds: int = CODEX_ACCESS_TOKEN_REFRESH_SKEW_SECONDS,
) -> Dict[str, Any]:
"""Resolve runtime credentials from Hermes's own Codex token store."""
try:
data = _read_codex_tokens()
except AuthError as orig_err:
# Only attempt migration when there are NO tokens stored at all
# (code == "codex_auth_missing"), not when tokens exist but are invalid.
if orig_err.code != "codex_auth_missing":
raise
# Migration: user had Codex as active provider with old storage (~/.codex/).
cli_tokens = _import_codex_cli_tokens()
if cli_tokens:
logger.info("Migrating Codex credentials from ~/.codex/ to Hermes auth store")
print("⚠️ Migrating Codex credentials to Hermes's own auth store.")
print(" This avoids conflicts with Codex CLI and VS Code.")
print(" Run `hermes auth` to create a fully independent session.\n")
_save_codex_tokens(cli_tokens)
data = _read_codex_tokens()
else:
raise
data = _read_codex_tokens()
tokens = dict(data["tokens"])
access_token = str(tokens.get("access_token", "") or "").strip()
refresh_timeout_seconds = float(os.getenv("HERMES_CODEX_REFRESH_TIMEOUT_SECONDS", "20"))
@@ -2129,6 +2097,62 @@ def refresh_nous_oauth_from_state(
)
NOUS_DEVICE_CODE_SOURCE = "device_code"
def persist_nous_credentials(
creds: Dict[str, Any],
*,
label: Optional[str] = None,
):
"""Persist minted Nous OAuth credentials as the singleton provider state
and ensure the credential pool is in sync.
Nous credentials are read at runtime from two independent locations:
- ``providers.nous``: singleton state read by
``resolve_nous_runtime_credentials()`` during 401 recovery and by
``_seed_from_singletons()`` during pool load.
- ``credential_pool.nous``: used by the runtime ``pool.select()`` path.
Historically ``hermes auth add nous`` wrote a ``manual:device_code`` pool
entry only, skipping ``providers.nous``. When the 24h agent_key TTL
expired, the recovery path read the empty singleton state and raised
``AuthError`` silently (``logger.debug`` at INFO level).
This helper writes ``providers.nous`` then calls ``load_pool("nous")`` so
``_seed_from_singletons`` materialises the canonical ``device_code`` pool
entry from the singleton. Re-running login upserts the same entry in
place; the pool never accumulates duplicate device_code rows.
``label`` is an optional user-chosen display name (from
``hermes auth add nous --label <name>``). It gets embedded in the
singleton state so that ``_seed_from_singletons`` uses it as the pool
entry's label on every subsequent ``load_pool("nous")`` instead of the
auto-derived token fingerprint. When ``None``, the auto-derived label
via ``label_from_token`` is used (unchanged default behaviour).
Returns the upserted :class:`PooledCredential` entry (or ``None`` if
seeding somehow produced no match shouldn't happen).
"""
from agent.credential_pool import load_pool
state = dict(creds)
if label and str(label).strip():
state["label"] = str(label).strip()
with _auth_store_lock():
auth_store = _load_auth_store()
_save_provider_state(auth_store, "nous", state)
_save_auth_store(auth_store)
pool = load_pool("nous")
return next(
(e for e in pool.entries() if e.source == NOUS_DEVICE_CODE_SOURCE),
None,
)
def resolve_nous_runtime_credentials(
*,
min_key_ttl_seconds: int = DEFAULT_AGENT_KEY_MIN_TTL_SECONDS,
@@ -3297,6 +3321,14 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
inference_base_url = auth_state["inference_base_url"]
# Snapshot the prior active_provider BEFORE _save_provider_state
# overwrites it to "nous". If the user picks "Skip (keep current)"
# during model selection below, we restore this so the user's previous
# provider (e.g. openrouter) is preserved.
with _auth_store_lock():
_prior_store = _load_auth_store()
prior_active_provider = _prior_store.get("active_provider")
with _auth_store_lock():
auth_store = _load_auth_store()
_save_provider_state(auth_store, "nous", auth_state)
@@ -3356,6 +3388,27 @@ def _login_nous(args, pconfig: ProviderConfig) -> None:
print(f"Login succeeded, but could not fetch available models. Reason: {message}")
# Write provider + model atomically so config is never mismatched.
# If no model was selected (user picked "Skip (keep current)",
# model list fetch failed, or no curated models were available),
# preserve the user's previous provider — don't silently switch
# them to Nous with a mismatched model. The Nous OAuth tokens
# stay saved for future use.
if not selected_model:
# Restore the prior active_provider that _save_provider_state
# overwrote to "nous". config.yaml model.provider is left
# untouched, so the user's previous provider is fully preserved.
with _auth_store_lock():
auth_store = _load_auth_store()
if prior_active_provider:
auth_store["active_provider"] = prior_active_provider
else:
auth_store.pop("active_provider", None)
_save_auth_store(auth_store)
print()
print("No provider change. Nous credentials saved for future use.")
print(" Run `hermes model` again to switch to Nous Portal.")
return
config_path = _update_config_for_provider(
"nous", inference_base_url, default_model=selected_model,
)
+39 -13
View File
@@ -217,22 +217,21 @@ def auth_add_command(args) -> None:
ca_bundle=getattr(args, "ca_bundle", None),
min_key_ttl_seconds=max(60, int(getattr(args, "min_key_ttl_seconds", 5 * 60))),
)
label = (getattr(args, "label", None) or "").strip() or label_from_token(
creds.get("access_token", ""),
_oauth_default_label(provider, len(pool.entries()) + 1),
# Honor `--label <name>` so nous matches other providers' UX. The
# helper embeds this into providers.nous so that label_from_token
# doesn't overwrite it on every subsequent load_pool("nous").
custom_label = (getattr(args, "label", None) or "").strip() or None
entry = auth_mod.persist_nous_credentials(creds, label=custom_label)
shown_label = entry.label if entry is not None else label_from_token(
creds.get("access_token", ""), _oauth_default_label(provider, 1),
)
entry = PooledCredential.from_dict(provider, {
**creds,
"label": label,
"auth_type": AUTH_TYPE_OAUTH,
"source": f"{SOURCE_MANUAL}:device_code",
"base_url": creds.get("inference_base_url"),
})
pool.add_entry(entry)
print(f'Added {provider} OAuth credential #{len(pool.entries())}: "{entry.label}"')
print(f'Saved {provider} OAuth device-code credentials: "{shown_label}"')
return
if provider == "openai-codex":
# Clear any existing suppression marker so a re-link after `hermes auth
# remove openai-codex` works without the new tokens being skipped.
auth_mod.unsuppress_credential_source(provider, "device_code")
creds = auth_mod._codex_device_code_login()
label = (getattr(args, "label", None) or "").strip() or label_from_token(
creds["tokens"]["access_token"],
@@ -352,7 +351,34 @@ def auth_remove_command(args) -> None:
# If this was a singleton-seeded credential (OAuth device_code, hermes_pkce),
# clear the underlying auth store / credential file so it doesn't get
# re-seeded on the next load_pool() call.
elif removed.source == "device_code" and provider in ("openai-codex", "nous"):
elif provider == "openai-codex" and (
removed.source == "device_code" or removed.source.endswith(":device_code")
):
# Codex tokens live in TWO places: the Hermes auth store and
# ~/.codex/auth.json (the Codex CLI shared file). On every refresh,
# refresh_codex_oauth_pure() writes to both. So clearing only the
# Hermes auth store is not enough — _seed_from_singletons() will
# auto-import from ~/.codex/auth.json on the next load_pool() and
# the removal is instantly undone. Mark the source as suppressed
# so auto-import is skipped; leave ~/.codex/auth.json untouched so
# the Codex CLI itself keeps working.
from hermes_cli.auth import (
_load_auth_store, _save_auth_store, _auth_store_lock,
suppress_credential_source,
)
with _auth_store_lock():
auth_store = _load_auth_store()
providers_dict = auth_store.get("providers")
if isinstance(providers_dict, dict) and provider in providers_dict:
del providers_dict[provider]
_save_auth_store(auth_store)
print(f"Cleared {provider} OAuth tokens from auth store")
suppress_credential_source(provider, "device_code")
print("Suppressed openai-codex device_code source — it will not be re-seeded.")
print("Note: Codex CLI credentials still live in ~/.codex/auth.json")
print("Run `hermes auth add openai-codex` to re-enable if needed.")
elif removed.source == "device_code" and provider == "nous":
from hermes_cli.auth import (
_load_auth_store, _save_auth_store, _auth_store_lock,
)
+119 -70
View File
@@ -7,8 +7,8 @@ CLI tools that ship with the platform (or are commonly installed).
Platform support:
macOS osascript (always available), pngpaste (if installed)
Windows PowerShell via .NET System.Windows.Forms.Clipboard
WSL2 powershell.exe via .NET System.Windows.Forms.Clipboard
Windows PowerShell via WinForms, Get-Clipboard, file-drop fallback
WSL2 powershell.exe via WinForms, Get-Clipboard, file-drop fallback
Linux wl-paste (Wayland), xclip (X11)
"""
@@ -46,10 +46,11 @@ def has_clipboard_image() -> bool:
return _macos_has_image()
if sys.platform == "win32":
return _windows_has_image()
if _is_wsl():
return _wsl_has_image()
if os.environ.get("WAYLAND_DISPLAY"):
return _wayland_has_image()
# Match _linux_save fallthrough order: WSL → Wayland → X11
if _is_wsl() and _wsl_has_image():
return True
if os.environ.get("WAYLAND_DISPLAY") and _wayland_has_image():
return True
return _xclip_has_image()
@@ -135,6 +136,114 @@ _PS_EXTRACT_IMAGE = (
"[System.Convert]::ToBase64String($ms.ToArray())"
)
_PS_CHECK_IMAGE_GET_CLIPBOARD = (
"try { "
"$img = Get-Clipboard -Format Image -ErrorAction Stop;"
"if ($null -ne $img) { 'True' } else { 'False' }"
"} catch { 'False' }"
)
_PS_EXTRACT_IMAGE_GET_CLIPBOARD = (
"try { "
"Add-Type -AssemblyName System.Drawing;"
"Add-Type -AssemblyName PresentationCore;"
"Add-Type -AssemblyName WindowsBase;"
"$img = Get-Clipboard -Format Image -ErrorAction Stop;"
"if ($null -eq $img) { exit 1 }"
"$ms = New-Object System.IO.MemoryStream;"
"if ($img -is [System.Drawing.Image]) {"
"$img.Save($ms, [System.Drawing.Imaging.ImageFormat]::Png)"
"} elseif ($img -is [System.Windows.Media.Imaging.BitmapSource]) {"
"$enc = New-Object System.Windows.Media.Imaging.PngBitmapEncoder;"
"$enc.Frames.Add([System.Windows.Media.Imaging.BitmapFrame]::Create($img));"
"$enc.Save($ms)"
"} else { exit 2 }"
"[System.Convert]::ToBase64String($ms.ToArray())"
"} catch { exit 1 }"
)
_FILEDROP_IMAGE_EXTS = "'.png','.jpg','.jpeg','.gif','.webp','.bmp','.tiff','.tif'"
_PS_CHECK_FILEDROP_IMAGE = (
"try { "
"$files = Get-Clipboard -Format FileDropList -ErrorAction Stop;"
f"$exts = @({_FILEDROP_IMAGE_EXTS});"
"$hit = $files | Where-Object { $exts -contains ([System.IO.Path]::GetExtension($_).ToLowerInvariant()) } | Select-Object -First 1;"
"if ($null -ne $hit) { 'True' } else { 'False' }"
"} catch { 'False' }"
)
_PS_EXTRACT_FILEDROP_IMAGE = (
"try { "
"$files = Get-Clipboard -Format FileDropList -ErrorAction Stop;"
f"$exts = @({_FILEDROP_IMAGE_EXTS});"
"$hit = $files | Where-Object { $exts -contains ([System.IO.Path]::GetExtension($_).ToLowerInvariant()) } | Select-Object -First 1;"
"if ($null -eq $hit) { exit 1 }"
"[System.Convert]::ToBase64String([System.IO.File]::ReadAllBytes($hit))"
"} catch { exit 1 }"
)
_POWERSHELL_HAS_IMAGE_SCRIPTS = (
_PS_CHECK_IMAGE,
_PS_CHECK_IMAGE_GET_CLIPBOARD,
_PS_CHECK_FILEDROP_IMAGE,
)
_POWERSHELL_EXTRACT_IMAGE_SCRIPTS = (
_PS_EXTRACT_IMAGE,
_PS_EXTRACT_IMAGE_GET_CLIPBOARD,
_PS_EXTRACT_FILEDROP_IMAGE,
)
def _run_powershell(exe: str, script: str, timeout: int) -> subprocess.CompletedProcess:
return subprocess.run(
[exe, "-NoProfile", "-NonInteractive", "-Command", script],
capture_output=True, text=True, timeout=timeout,
)
def _write_base64_image(dest: Path, b64_data: str) -> bool:
image_bytes = base64.b64decode(b64_data, validate=True)
dest.write_bytes(image_bytes)
return dest.exists() and dest.stat().st_size > 0
def _powershell_has_image(exe: str, *, timeout: int, label: str) -> bool:
for script in _POWERSHELL_HAS_IMAGE_SCRIPTS:
try:
r = _run_powershell(exe, script, timeout=timeout)
if r.returncode == 0 and "True" in r.stdout:
return True
except FileNotFoundError:
logger.debug("%s not found — clipboard unavailable", exe)
return False
except Exception as e:
logger.debug("%s clipboard image check failed: %s", label, e)
return False
def _powershell_save_image(exe: str, dest: Path, *, timeout: int, label: str) -> bool:
for script in _POWERSHELL_EXTRACT_IMAGE_SCRIPTS:
try:
r = _run_powershell(exe, script, timeout=timeout)
if r.returncode != 0:
continue
b64_data = r.stdout.strip()
if not b64_data:
continue
if _write_base64_image(dest, b64_data):
return True
except FileNotFoundError:
logger.debug("%s not found — clipboard unavailable", exe)
return False
except Exception as e:
logger.debug("%s clipboard image extraction failed: %s", label, e)
dest.unlink(missing_ok=True)
return False
# ── Native Windows ────────────────────────────────────────────────────────
@@ -175,15 +284,7 @@ def _windows_has_image() -> bool:
ps = _get_ps_exe()
if ps is None:
return False
try:
r = subprocess.run(
[ps, "-NoProfile", "-NonInteractive", "-Command", _PS_CHECK_IMAGE],
capture_output=True, text=True, timeout=5,
)
return r.returncode == 0 and "True" in r.stdout
except Exception as e:
logger.debug("Windows clipboard image check failed: %s", e)
return False
return _powershell_has_image(ps, timeout=5, label="Windows")
def _windows_save(dest: Path) -> bool:
@@ -192,26 +293,7 @@ def _windows_save(dest: Path) -> bool:
if ps is None:
logger.debug("No PowerShell found — Windows clipboard image paste unavailable")
return False
try:
r = subprocess.run(
[ps, "-NoProfile", "-NonInteractive", "-Command", _PS_EXTRACT_IMAGE],
capture_output=True, text=True, timeout=15,
)
if r.returncode != 0:
return False
b64_data = r.stdout.strip()
if not b64_data:
return False
png_bytes = base64.b64decode(b64_data)
dest.write_bytes(png_bytes)
return dest.exists() and dest.stat().st_size > 0
except Exception as e:
logger.debug("Windows clipboard image extraction failed: %s", e)
dest.unlink(missing_ok=True)
return False
return _powershell_save_image(ps, dest, timeout=15, label="Windows")
# ── Linux ────────────────────────────────────────────────────────────────
@@ -235,45 +317,12 @@ def _linux_save(dest: Path) -> bool:
def _wsl_has_image() -> bool:
"""Check if Windows clipboard has an image (via powershell.exe)."""
try:
r = subprocess.run(
["powershell.exe", "-NoProfile", "-NonInteractive", "-Command",
_PS_CHECK_IMAGE],
capture_output=True, text=True, timeout=8,
)
return r.returncode == 0 and "True" in r.stdout
except FileNotFoundError:
logger.debug("powershell.exe not found — WSL clipboard unavailable")
except Exception as e:
logger.debug("WSL clipboard check failed: %s", e)
return False
return _powershell_has_image("powershell.exe", timeout=8, label="WSL")
def _wsl_save(dest: Path) -> bool:
"""Extract clipboard image via powershell.exe → base64 → decode to PNG."""
try:
r = subprocess.run(
["powershell.exe", "-NoProfile", "-NonInteractive", "-Command",
_PS_EXTRACT_IMAGE],
capture_output=True, text=True, timeout=15,
)
if r.returncode != 0:
return False
b64_data = r.stdout.strip()
if not b64_data:
return False
png_bytes = base64.b64decode(b64_data)
dest.write_bytes(png_bytes)
return dest.exists() and dest.stat().st_size > 0
except FileNotFoundError:
logger.debug("powershell.exe not found — WSL clipboard unavailable")
except Exception as e:
logger.debug("WSL clipboard extraction failed: %s", e)
dest.unlink(missing_ok=True)
return False
return _powershell_save_image("powershell.exe", dest, timeout=15, label="WSL")
# ── Wayland (wl-paste) ──────────────────────────────────────────────────
+112 -7
View File
@@ -87,8 +87,12 @@ COMMAND_REGISTRY: list[CommandDef] = [
aliases=("bg",), args_hint="<prompt>"),
CommandDef("btw", "Ephemeral side question using session context (no tools, not persisted)", "Session",
args_hint="<question>"),
CommandDef("agents", "Show active agents and running tasks", "Session",
aliases=("tasks",)),
CommandDef("queue", "Queue a prompt for the next turn (doesn't interrupt)", "Session",
aliases=("q",), args_hint="<prompt>"),
CommandDef("steer", "Inject a message after the next tool call without interrupting", "Session",
args_hint="<prompt>"),
CommandDef("status", "Show session info", "Session"),
CommandDef("profile", "Show active profile name and home directory", "Info"),
CommandDef("sethome", "Set this chat as the home channel", "Session",
@@ -99,7 +103,7 @@ COMMAND_REGISTRY: list[CommandDef] = [
# Configuration
CommandDef("config", "Show current configuration", "Configuration",
cli_only=True),
CommandDef("model", "Switch model for this session", "Configuration", args_hint="[model] [--global]"),
CommandDef("model", "Switch model for this session", "Configuration", args_hint="[model] [--provider name] [--global]"),
CommandDef("provider", "Show available providers and current provider",
"Configuration"),
CommandDef("gquota", "Show Google Gemini Code Assist quota usage", "Info"),
@@ -120,7 +124,7 @@ COMMAND_REGISTRY: list[CommandDef] = [
args_hint="[normal|fast|status]",
subcommands=("normal", "fast", "status", "on", "off")),
CommandDef("skin", "Show or change the display skin/theme", "Configuration",
cli_only=True, args_hint="[name]"),
args_hint="[name]"),
CommandDef("voice", "Toggle voice mode", "Configuration",
args_hint="[on|off|tts|status]", subcommands=("on", "off", "tts", "status")),
@@ -155,7 +159,9 @@ COMMAND_REGISTRY: list[CommandDef] = [
args_hint="[days]"),
CommandDef("platforms", "Show gateway/messaging platform status", "Info",
cli_only=True, aliases=("gateway",)),
CommandDef("paste", "Check clipboard for an image and attach it", "Info",
CommandDef("copy", "Copy the last assistant response to clipboard", "Info",
cli_only=True, args_hint="[number]"),
CommandDef("paste", "Attach clipboard image from your clipboard", "Info",
cli_only=True),
CommandDef("image", "Attach a local image file for your next prompt", "Info",
cli_only=True, args_hint="<path>"),
@@ -254,6 +260,53 @@ GATEWAY_KNOWN_COMMANDS: frozenset[str] = frozenset(
)
# Commands with explicit Level-2 running-agent handlers in gateway/run.py.
# Listed here for introspection / tests; semantically a subset of
# "all resolvable commands" — which is the real bypass set (see
# should_bypass_active_session below).
ACTIVE_SESSION_BYPASS_COMMANDS: frozenset[str] = frozenset(
{
"agents",
"approve",
"background",
"commands",
"deny",
"help",
"new",
"profile",
"queue",
"restart",
"status",
"steer",
"stop",
"update",
}
)
def should_bypass_active_session(command_name: str | None) -> bool:
"""Return True for any resolvable slash command.
Rationale: every gateway-registered slash command either has a
specific Level-2 handler in gateway/run.py (/stop, /new, /model,
/approve, etc.) or reaches the running-agent catch-all that returns
a "busy — wait or /stop first" response. In both paths the command
is dispatched, not queued.
Queueing is always wrong for a recognized slash command because the
safety net in gateway.run discards any command text that reaches
the pending queue which meant a mid-run /model (or /reasoning,
/voice, /insights, /title, /resume, /retry, /undo, /compress,
/usage, /provider, /reload-mcp, /sethome, /reset) would silently
interrupt the agent AND get discarded, producing a zero-char
response. See issue #5057 / PRs #6252, #10370, #4665.
ACTIVE_SESSION_BYPASS_COMMANDS remains the subset of commands with
explicit Level-2 handlers; the rest fall through to the catch-all.
"""
return resolve_command(command_name) is not None if command_name else False
def _resolve_config_gates() -> set[str]:
"""Return canonical names of commands whose ``gateway_config_gate`` is truthy.
@@ -1044,6 +1097,51 @@ class SlashCommandCompleter(Completer):
display_meta=f"{fp} {meta}" if meta else fp,
)
@staticmethod
def _skin_completions(sub_text: str, sub_lower: str):
"""Yield completions for /skin from available skins."""
try:
from hermes_cli.skin_engine import list_skins
for s in list_skins():
name = s["name"]
if name.startswith(sub_lower) and name != sub_lower:
yield Completion(
name,
start_position=-len(sub_text),
display=name,
display_meta=s.get("description", "") or s.get("source", ""),
)
except Exception:
pass
@staticmethod
def _personality_completions(sub_text: str, sub_lower: str):
"""Yield completions for /personality from configured personalities."""
try:
from hermes_cli.config import load_config
personalities = load_config().get("agent", {}).get("personalities", {})
if "none".startswith(sub_lower) and "none" != sub_lower:
yield Completion(
"none",
start_position=-len(sub_text),
display="none",
display_meta="clear personality overlay",
)
for name, prompt in personalities.items():
if name.startswith(sub_lower) and name != sub_lower:
if isinstance(prompt, dict):
meta = prompt.get("description") or prompt.get("system_prompt", "")[:50]
else:
meta = str(prompt)[:50]
yield Completion(
name,
start_position=-len(sub_text),
display=name,
display_meta=meta,
)
except Exception:
pass
def _model_completions(self, sub_text: str, sub_lower: str):
"""Yield completions for /model from config aliases + built-in aliases."""
seen = set()
@@ -1098,10 +1196,17 @@ class SlashCommandCompleter(Completer):
sub_text = parts[1] if len(parts) > 1 else ""
sub_lower = sub_text.lower()
# Dynamic model alias completions for /model
if " " not in sub_text and base_cmd == "/model":
yield from self._model_completions(sub_text, sub_lower)
return
# Dynamic completions for commands with runtime lists
if " " not in sub_text:
if base_cmd == "/model":
yield from self._model_completions(sub_text, sub_lower)
return
if base_cmd == "/skin":
yield from self._skin_completions(sub_text, sub_lower)
return
if base_cmd == "/personality":
yield from self._personality_completions(sub_text, sub_lower)
return
# Static subcommand completions
if " " not in sub_text and base_cmd in SUBCOMMANDS and self._command_allowed(base_cmd):
+146 -10
View File
@@ -12,6 +12,7 @@ This module provides:
- hermes config wizard - Re-run setup wizard
"""
import copy
import os
import platform
import re
@@ -26,6 +27,7 @@ from typing import Dict, Any, Optional, List, Tuple
_IS_WINDOWS = platform.system() == "Windows"
_ENV_VAR_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_LAST_EXPANDED_CONFIG_BY_PATH: Dict[str, Any] = {}
# Env var names written to .env that aren't in OPTIONAL_ENV_VARS
# (managed by setup/provider flows directly).
_EXTRA_ENV_KEYS = frozenset({
@@ -44,7 +46,8 @@ _EXTRA_ENV_KEYS = frozenset({
"WEIXIN_HOME_CHANNEL", "WEIXIN_HOME_CHANNEL_NAME", "WEIXIN_DM_POLICY", "WEIXIN_GROUP_POLICY",
"WEIXIN_ALLOWED_USERS", "WEIXIN_GROUP_ALLOWED_USERS", "WEIXIN_ALLOW_ALL_USERS",
"BLUEBUBBLES_SERVER_URL", "BLUEBUBBLES_PASSWORD",
"QQ_APP_ID", "QQ_CLIENT_SECRET", "QQ_HOME_CHANNEL", "QQ_HOME_CHANNEL_NAME",
"QQ_APP_ID", "QQ_CLIENT_SECRET", "QQBOT_HOME_CHANNEL", "QQBOT_HOME_CHANNEL_NAME",
"QQ_HOME_CHANNEL", "QQ_HOME_CHANNEL_NAME", # legacy aliases (pre-rename, still read for back-compat)
"QQ_ALLOWED_USERS", "QQ_GROUP_ALLOWED_USERS", "QQ_ALLOW_ALL_USERS", "QQ_MARKDOWN_SUPPORT",
"QQ_STT_API_KEY", "QQ_STT_BASE_URL", "QQ_STT_MODEL",
"TERMINAL_ENV", "TERMINAL_SSH_KEY", "TERMINAL_SSH_PORT",
@@ -417,6 +420,7 @@ DEFAULT_CONFIG = {
"command_timeout": 30, # Timeout for browser commands in seconds (screenshot, navigate, etc.)
"record_sessions": False, # Auto-record browser sessions as WebM videos
"allow_private_urls": False, # Allow navigating to private/internal IPs (localhost, 192.168.x.x, etc.)
"cdp_url": "", # Optional persistent CDP endpoint for attaching to an existing Chromium/Chrome
"camofox": {
# When true, Hermes sends a stable profile-scoped userId to Camofox
# so the server maps it to a persistent Firefox profile automatically.
@@ -537,6 +541,13 @@ DEFAULT_CONFIG = {
"api_key": "",
"timeout": 30,
},
"title_generation": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30,
},
},
"display": {
@@ -726,9 +737,14 @@ DEFAULT_CONFIG = {
# manual — always prompt the user (default)
# smart — use auxiliary LLM to auto-approve low-risk commands, prompt for high-risk
# off — skip all approval prompts (equivalent to --yolo)
#
# cron_mode — what to do when a cron job hits a dangerous command:
# deny — block the command and let the agent find another way (default, safe)
# approve — auto-approve all dangerous commands in cron jobs
"approvals": {
"mode": "manual",
"timeout": 60,
"cron_mode": "deny",
},
# Permanently allowed dangerous command patterns (added via "always" approval)
@@ -760,6 +776,20 @@ DEFAULT_CONFIG = {
"wrap_response": True,
},
# execute_code settings — controls the tool used for programmatic tool calls.
"code_execution": {
# Execution mode:
# project (default) — scripts run in the session's working directory
# with the active virtualenv/conda env's python, so project deps
# (pandas, torch, project packages) and relative paths resolve.
# strict — scripts run in an isolated temp directory with
# hermes-agent's own python (sys.executable). Maximum isolation
# and reproducibility; project deps and relative paths won't work.
# Env scrubbing (strips *_API_KEY, *_TOKEN, *_SECRET, ...) and the
# tool whitelist apply identically in both modes.
"mode": "project",
},
# Logging — controls file logging to ~/.hermes/logs/.
# agent.log captures INFO+ (all agent activity); errors.log captures WARNING+.
"logging": {
@@ -777,7 +807,7 @@ DEFAULT_CONFIG = {
},
# Config schema version - bump this when adding new required fields
"_config_version": 18,
"_config_version": 19,
}
# =============================================================================
@@ -861,6 +891,22 @@ OPTIONAL_ENV_VARS = {
"category": "provider",
"advanced": True,
},
"NVIDIA_API_KEY": {
"description": "NVIDIA NIM API key (build.nvidia.com or local NIM endpoint)",
"prompt": "NVIDIA NIM API key",
"url": "https://build.nvidia.com/",
"password": True,
"category": "provider",
"advanced": True,
},
"NVIDIA_BASE_URL": {
"description": "NVIDIA NIM base URL override (e.g. http://localhost:8000/v1 for local NIM)",
"prompt": "NVIDIA NIM base URL (leave empty for default)",
"url": None,
"password": False,
"category": "provider",
"advanced": True,
},
"GLM_API_KEY": {
"description": "Z.AI / GLM API key (also recognized as ZAI_API_KEY / Z_AI_API_KEY)",
"prompt": "Z.AI / GLM API key",
@@ -1518,12 +1564,12 @@ OPTIONAL_ENV_VARS = {
"prompt": "Allow All QQ Users",
"category": "messaging",
},
"QQ_HOME_CHANNEL": {
"QQBOT_HOME_CHANNEL": {
"description": "Default QQ channel/group for cron delivery and notifications",
"prompt": "QQ Home Channel",
"category": "messaging",
},
"QQ_HOME_CHANNEL_NAME": {
"QQBOT_HOME_CHANNEL_NAME": {
"description": "Display name for the QQ home channel",
"prompt": "QQ Home Channel Name",
"category": "messaging",
@@ -2610,6 +2656,85 @@ def _expand_env_vars(obj):
return obj
def _items_by_unique_name(items):
"""Return a name-indexed dict only when all items have unique string names."""
if not isinstance(items, list):
return None
indexed = {}
for item in items:
if not isinstance(item, dict) or not isinstance(item.get("name"), str):
return None
name = item["name"]
if name in indexed:
return None
indexed[name] = item
return indexed
def _preserve_env_ref_templates(current, raw, loaded_expanded=None):
"""Restore raw ``${VAR}`` templates when a value is otherwise unchanged.
``load_config()`` expands env refs for runtime use. When a caller later
persists that config after modifying some unrelated setting, keep the
original on-disk template instead of writing the expanded plaintext
secret back to ``config.yaml``.
Prefer preserving the raw template when ``current`` still matches either
the value previously returned by ``load_config()`` for this config path or
the current environment expansion of ``raw``. This handles env-var
rotation between load and save while still treating mixed literal/template
string edits as caller-owned once their rendered value diverges.
"""
if isinstance(current, str) and isinstance(raw, str) and re.search(r"\${[^}]+}", raw):
if current == raw:
return raw
if isinstance(loaded_expanded, str) and current == loaded_expanded:
return raw
if _expand_env_vars(raw) == current:
return raw
return current
if isinstance(current, dict) and isinstance(raw, dict):
return {
key: _preserve_env_ref_templates(
value,
raw.get(key),
loaded_expanded.get(key) if isinstance(loaded_expanded, dict) else None,
)
for key, value in current.items()
}
if isinstance(current, list) and isinstance(raw, list):
# Prefer matching named config objects (e.g. custom_providers) by name
# so harmless reordering doesn't drop the original template. If names
# are duplicated, fall back to positional matching instead of silently
# shadowing one entry.
current_by_name = _items_by_unique_name(current)
raw_by_name = _items_by_unique_name(raw)
loaded_by_name = _items_by_unique_name(loaded_expanded)
if current_by_name is not None and raw_by_name is not None:
return [
_preserve_env_ref_templates(
item,
raw_by_name.get(item.get("name")),
loaded_by_name.get(item.get("name")) if loaded_by_name is not None else None,
)
for item in current
]
return [
_preserve_env_ref_templates(
item,
raw[index] if index < len(raw) else None,
loaded_expanded[index]
if isinstance(loaded_expanded, list) and index < len(loaded_expanded)
else None,
)
for index, item in enumerate(current)
]
return current
def _normalize_root_model_keys(config: Dict[str, Any]) -> Dict[str, Any]:
"""Move stale root-level provider/base_url into model section.
@@ -2677,7 +2802,6 @@ def read_raw_config() -> Dict[str, Any]:
def load_config() -> Dict[str, Any]:
"""Load configuration from ~/.hermes/config.yaml."""
import copy
ensure_hermes_home()
config_path = get_config_path()
@@ -2698,8 +2822,11 @@ def load_config() -> Dict[str, Any]:
config = _deep_merge(config, user_config)
except Exception as e:
print(f"Warning: Failed to load config: {e}")
return _expand_env_vars(_normalize_root_model_keys(_normalize_max_turns_config(config)))
normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
expanded = _expand_env_vars(normalized)
_LAST_EXPANDED_CONFIG_BY_PATH[str(config_path)] = copy.deepcopy(expanded)
return expanded
_SECURITY_COMMENT = """
@@ -2734,7 +2861,7 @@ _FALLBACK_COMMENT = """
# minimax (MINIMAX_API_KEY) — MiniMax
# minimax-cn (MINIMAX_CN_API_KEY) — MiniMax (China)
#
# For custom OpenAI-compatible endpoints, add base_url and api_key_env.
# For custom OpenAI-compatible endpoints, add base_url and key_env.
#
# fallback_model:
# provider: openrouter
@@ -2778,7 +2905,7 @@ _COMMENTED_SECTIONS = """
# minimax (MINIMAX_API_KEY) — MiniMax
# minimax-cn (MINIMAX_CN_API_KEY) — MiniMax (China)
#
# For custom OpenAI-compatible endpoints, add base_url and api_key_env.
# For custom OpenAI-compatible endpoints, add base_url and key_env.
#
# fallback_model:
# provider: openrouter
@@ -2808,7 +2935,15 @@ def save_config(config: Dict[str, Any]):
ensure_hermes_home()
config_path = get_config_path()
normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
current_normalized = _normalize_root_model_keys(_normalize_max_turns_config(config))
normalized = current_normalized
raw_existing = _normalize_root_model_keys(_normalize_max_turns_config(read_raw_config()))
if raw_existing:
normalized = _preserve_env_ref_templates(
normalized,
raw_existing,
_LAST_EXPANDED_CONFIG_BY_PATH.get(str(config_path)),
)
# Build optional commented-out sections for features that are off by
# default or only relevant when explicitly configured.
@@ -2826,6 +2961,7 @@ def save_config(config: Dict[str, Any]):
extra_content="".join(parts) if parts else None,
)
_secure_file(config_path)
_LAST_EXPANDED_CONFIG_BY_PATH[str(config_path)] = copy.deepcopy(current_normalized)
def load_env() -> Dict[str, str]:
+137 -29
View File
@@ -6,7 +6,10 @@ Currently supports:
"""
import io
import json
import os
import sys
import time
import urllib.error
import urllib.parse
import urllib.request
@@ -31,6 +34,119 @@ _MAX_LOG_BYTES = 512_000
_AUTO_DELETE_SECONDS = 21600
# ---------------------------------------------------------------------------
# Pending-deletion tracking (replaces the old fork-and-sleep subprocess).
# ---------------------------------------------------------------------------
def _pending_file() -> Path:
"""Path to ``~/.hermes/pastes/pending.json``.
Each entry: ``{"url": "...", "expire_at": <unix_ts>}``. Scheduled
DELETEs used to be handled by spawning a detached Python process per
paste that slept for 6 hours; those accumulated forever if the user
ran ``hermes debug share`` repeatedly. We now persist the schedule
to disk and sweep expired entries on the next debug invocation.
"""
return get_hermes_home() / "pastes" / "pending.json"
def _load_pending() -> list[dict]:
path = _pending_file()
if not path.exists():
return []
try:
data = json.loads(path.read_text(encoding="utf-8"))
if isinstance(data, list):
# Filter to well-formed entries only
return [
e for e in data
if isinstance(e, dict) and "url" in e and "expire_at" in e
]
except (OSError, ValueError, json.JSONDecodeError):
pass
return []
def _save_pending(entries: list[dict]) -> None:
path = _pending_file()
try:
path.parent.mkdir(parents=True, exist_ok=True)
tmp = path.with_suffix(".json.tmp")
tmp.write_text(json.dumps(entries, indent=2), encoding="utf-8")
os.replace(tmp, path)
except OSError:
# Non-fatal — worst case the user has to run ``hermes debug delete``
# manually.
pass
def _record_pending(urls: list[str], delay_seconds: int = _AUTO_DELETE_SECONDS) -> None:
"""Record *urls* for deletion at ``now + delay_seconds``.
Only paste.rs URLs are recorded (dpaste.com auto-expires). Entries
are merged into any existing pending.json.
"""
paste_rs_urls = [u for u in urls if _extract_paste_id(u)]
if not paste_rs_urls:
return
entries = _load_pending()
# Dedupe by URL: keep the later expire_at if same URL appears twice
by_url: dict[str, float] = {e["url"]: float(e["expire_at"]) for e in entries}
expire_at = time.time() + delay_seconds
for u in paste_rs_urls:
by_url[u] = max(expire_at, by_url.get(u, 0.0))
merged = [{"url": u, "expire_at": ts} for u, ts in by_url.items()]
_save_pending(merged)
def _sweep_expired_pastes(now: Optional[float] = None) -> tuple[int, int]:
"""Synchronously DELETE any pending pastes whose ``expire_at`` has passed.
Returns ``(deleted, remaining)``. Best-effort: failed deletes stay in
the pending file and will be retried on the next sweep. Silent
intended to be called from every ``hermes debug`` invocation with
minimal noise.
"""
entries = _load_pending()
if not entries:
return (0, 0)
current = time.time() if now is None else now
deleted = 0
remaining: list[dict] = []
for entry in entries:
try:
expire_at = float(entry.get("expire_at", 0))
except (TypeError, ValueError):
continue # drop malformed entries
if expire_at > current:
remaining.append(entry)
continue
url = entry.get("url", "")
try:
if delete_paste(url):
deleted += 1
continue
except Exception:
# Network hiccup, 404 (already gone), etc. — drop the entry
# after a grace period; don't retry forever.
pass
# Retain failed deletes for up to 24h past expiration, then give up.
if expire_at + 86400 > current:
remaining.append(entry)
else:
deleted += 1 # count as reaped (paste.rs will GC eventually)
if deleted:
_save_pending(remaining)
return (deleted, len(remaining))
# ---------------------------------------------------------------------------
# Privacy / delete helpers
# ---------------------------------------------------------------------------
@@ -90,37 +206,19 @@ def delete_paste(url: str) -> bool:
def _schedule_auto_delete(urls: list[str], delay_seconds: int = _AUTO_DELETE_SECONDS):
"""Spawn a detached process to delete paste.rs pastes after *delay_seconds*.
"""Record *urls* for deletion ``delay_seconds`` from now.
The child process is fully detached (``start_new_session=True``) so it
survives the parent exiting (important for CLI mode). Only paste.rs
URLs are attempted dpaste.com pastes auto-expire on their own.
Previously this spawned a detached Python subprocess per call that slept
for 6 hours and then issued DELETE requests. Those subprocesses leaked
every ``hermes debug share`` invocation added ~20 MB of resident Python
interpreters that never exited until the sleep completed.
The replacement is stateless: we append to ``~/.hermes/pastes/pending.json``
and rely on opportunistic sweeps (``_sweep_expired_pastes``) called from
every ``hermes debug`` invocation. If the user never runs ``hermes debug``
again, paste.rs's own retention policy handles cleanup.
"""
import subprocess
paste_rs_urls = [u for u in urls if _extract_paste_id(u)]
if not paste_rs_urls:
return
# Build a tiny inline Python script. No imports beyond stdlib.
url_list = ", ".join(f'"{u}"' for u in paste_rs_urls)
script = (
"import time, urllib.request; "
f"time.sleep({delay_seconds}); "
f"[urllib.request.urlopen(urllib.request.Request(u, method='DELETE', "
f"headers={{'User-Agent': 'hermes-agent/auto-delete'}}), timeout=15) "
f"for u in [{url_list}]]"
)
try:
subprocess.Popen(
[sys.executable, "-c", script],
start_new_session=True,
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
)
except Exception:
pass # Best-effort; manual delete still available.
_record_pending(urls, delay_seconds=delay_seconds)
def _delete_hint(url: str) -> str:
@@ -455,6 +553,16 @@ def run_debug_delete(args):
def run_debug(args):
"""Route debug subcommands."""
# Opportunistic sweep of expired pastes on every ``hermes debug`` call.
# Replaces the old per-paste sleeping subprocess that used to leak as
# one orphaned Python interpreter per scheduled deletion. Silent and
# best-effort — any failure is swallowed so ``hermes debug`` stays
# reliable even when offline.
try:
_sweep_expired_pastes()
except Exception:
pass
subcmd = getattr(args, "debug_command", None)
if subcmd == "share":
run_debug_share(args)
+294
View File
@@ -0,0 +1,294 @@
"""
DingTalk Device Flow authorization.
Implements the same 3-step registration flow as dingtalk-openclaw-connector:
1. POST /app/registration/init get nonce
2. POST /app/registration/begin get device_code + verification_uri_complete
3. POST /app/registration/poll poll until SUCCESS get client_id + client_secret
The verification_uri_complete is rendered as a QR code in the terminal so the
user can scan it with DingTalk to authorize, yielding AppKey + AppSecret
automatically.
"""
from __future__ import annotations
import io
import os
import sys
import time
import logging
from typing import Optional, Tuple
import requests
logger = logging.getLogger(__name__)
# ── Configuration ──────────────────────────────────────────────────────────
REGISTRATION_BASE_URL = os.environ.get(
"DINGTALK_REGISTRATION_BASE_URL", "https://oapi.dingtalk.com"
).rstrip("/")
REGISTRATION_SOURCE = os.environ.get("DINGTALK_REGISTRATION_SOURCE", "openClaw")
# ── API helpers ────────────────────────────────────────────────────────────
class RegistrationError(Exception):
"""Raised when a DingTalk registration API call fails."""
def _api_post(path: str, payload: dict) -> dict:
"""POST to the registration API and return the parsed JSON body."""
url = f"{REGISTRATION_BASE_URL}{path}"
try:
resp = requests.post(url, json=payload, timeout=15)
resp.raise_for_status()
data = resp.json()
except requests.RequestException as exc:
raise RegistrationError(f"Network error calling {url}: {exc}") from exc
errcode = data.get("errcode", -1)
if errcode != 0:
errmsg = data.get("errmsg", "unknown error")
raise RegistrationError(f"API error [{path}]: {errmsg} (errcode={errcode})")
return data
# ── Core flow ──────────────────────────────────────────────────────────────
def begin_registration() -> dict:
"""Start a device-flow registration.
Returns a dict with keys:
device_code, verification_uri_complete, expires_in, interval
"""
# Step 1: init → nonce
init_data = _api_post("/app/registration/init", {"source": REGISTRATION_SOURCE})
nonce = str(init_data.get("nonce", "")).strip()
if not nonce:
raise RegistrationError("init response missing nonce")
# Step 2: begin → device_code, verification_uri_complete
begin_data = _api_post("/app/registration/begin", {"nonce": nonce})
device_code = str(begin_data.get("device_code", "")).strip()
verification_uri_complete = str(begin_data.get("verification_uri_complete", "")).strip()
if not device_code:
raise RegistrationError("begin response missing device_code")
if not verification_uri_complete:
raise RegistrationError("begin response missing verification_uri_complete")
return {
"device_code": device_code,
"verification_uri_complete": verification_uri_complete,
"expires_in": int(begin_data.get("expires_in", 7200)),
"interval": max(int(begin_data.get("interval", 3)), 2),
}
def poll_registration(device_code: str) -> dict:
"""Poll the registration status once.
Returns a dict with keys: status, client_id?, client_secret?, fail_reason?
"""
data = _api_post("/app/registration/poll", {"device_code": device_code})
status_raw = str(data.get("status", "")).strip().upper()
if status_raw not in ("WAITING", "SUCCESS", "FAIL", "EXPIRED"):
status_raw = "UNKNOWN"
return {
"status": status_raw,
"client_id": str(data.get("client_id", "")).strip() or None,
"client_secret": str(data.get("client_secret", "")).strip() or None,
"fail_reason": str(data.get("fail_reason", "")).strip() or None,
}
def wait_for_registration_success(
device_code: str,
interval: int = 3,
expires_in: int = 7200,
on_waiting: Optional[callable] = None,
) -> Tuple[str, str]:
"""Block until the registration succeeds or times out.
Returns (client_id, client_secret).
"""
deadline = time.monotonic() + expires_in
retry_window = 120 # 2 minutes for transient errors
retry_start = 0.0
while time.monotonic() < deadline:
time.sleep(interval)
try:
result = poll_registration(device_code)
except RegistrationError:
if retry_start == 0:
retry_start = time.monotonic()
if time.monotonic() - retry_start < retry_window:
continue
raise
status = result["status"]
if status == "WAITING":
retry_start = 0
if on_waiting:
on_waiting()
continue
if status == "SUCCESS":
cid = result["client_id"]
csecret = result["client_secret"]
if not cid or not csecret:
raise RegistrationError("authorization succeeded but credentials are missing")
return cid, csecret
# FAIL / EXPIRED / UNKNOWN
if retry_start == 0:
retry_start = time.monotonic()
if time.monotonic() - retry_start < retry_window:
continue
reason = result.get("fail_reason") or status
raise RegistrationError(f"authorization failed: {reason}")
raise RegistrationError("authorization timed out, please retry")
# ── QR code rendering ─────────────────────────────────────────────────────
def _ensure_qrcode_installed() -> bool:
"""Try to import qrcode; if missing, auto-install it via pip/uv."""
try:
import qrcode # noqa: F401
return True
except ImportError:
pass
import subprocess
# Try uv first (Hermes convention), then pip
for cmd in (
[sys.executable, "-m", "uv", "pip", "install", "qrcode"],
[sys.executable, "-m", "pip", "install", "-q", "qrcode"],
):
try:
subprocess.check_call(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
import qrcode # noqa: F401,F811
return True
except (subprocess.CalledProcessError, ImportError, FileNotFoundError):
continue
return False
def render_qr_to_terminal(url: str) -> bool:
"""Render *url* as a compact QR code in the terminal.
Returns True if the QR code was printed, False if the library is missing.
"""
try:
import qrcode
except ImportError:
return False
qr = qrcode.QRCode(
version=1,
error_correction=qrcode.constants.ERROR_CORRECT_L,
box_size=1,
border=1,
)
qr.add_data(url)
qr.make(fit=True)
# Use half-block characters for compact rendering (2 rows per character)
matrix = qr.get_matrix()
rows = len(matrix)
lines: list[str] = []
TOP_HALF = "\u2580" # ▀
BOTTOM_HALF = "\u2584" # ▄
FULL_BLOCK = "\u2588" # █
EMPTY = " "
for r in range(0, rows, 2):
line_chars: list[str] = []
for c in range(len(matrix[r])):
top = matrix[r][c]
bottom = matrix[r + 1][c] if r + 1 < rows else False
if top and bottom:
line_chars.append(FULL_BLOCK)
elif top:
line_chars.append(TOP_HALF)
elif bottom:
line_chars.append(BOTTOM_HALF)
else:
line_chars.append(EMPTY)
lines.append(" " + "".join(line_chars))
print("\n".join(lines))
return True
# ── High-level entry point for the setup wizard ───────────────────────────
def dingtalk_qr_auth() -> Optional[Tuple[str, str]]:
"""Run the interactive QR-code device-flow authorization.
Returns (client_id, client_secret) on success, or None if the user
cancelled or the flow failed.
"""
from hermes_cli.setup import print_info, print_success, print_warning, print_error
print()
print_info(" Initializing DingTalk device authorization...")
print_info(" Note: the scan page is branded 'OpenClaw' — DingTalk's")
print_info(" ecosystem onboarding bridge. Safe to use.")
try:
reg = begin_registration()
except RegistrationError as exc:
print_error(f" Authorization init failed: {exc}")
return None
url = reg["verification_uri_complete"]
# Ensure qrcode library is available (auto-install if missing)
if not _ensure_qrcode_installed():
print_warning(" qrcode library install failed, will show link only.")
print()
print_info(" Please scan the QR code below with DingTalk to authorize:")
print()
if not render_qr_to_terminal(url):
print_warning(f" QR code render failed, please open the link below to authorize:")
print()
print_info(f" Or open this link manually: {url}")
print()
print_info(" Waiting for QR scan authorization... (timeout: 2 hours)")
dot_count = 0
def _on_waiting():
nonlocal dot_count
dot_count += 1
if dot_count % 10 == 0:
sys.stdout.write(".")
sys.stdout.flush()
try:
client_id, client_secret = wait_for_registration_success(
device_code=reg["device_code"],
interval=reg["interval"],
expires_in=reg["expires_in"],
on_waiting=_on_waiting,
)
except RegistrationError as exc:
print()
print_error(f" Authorization failed: {exc}")
return None
print()
print_success(" QR scan authorization successful!")
print_success(f" Client ID: {client_id}")
print_success(f" Client Secret: {client_secret[:8]}{'*' * (len(client_secret) - 8)}")
return client_id, client_secret
+3 -2
View File
@@ -825,6 +825,7 @@ def run_doctor(args):
("Arcee AI", ("ARCEEAI_API_KEY",), "https://api.arcee.ai/api/v1/models", "ARCEE_BASE_URL", True),
("DeepSeek", ("DEEPSEEK_API_KEY",), "https://api.deepseek.com/v1/models", "DEEPSEEK_BASE_URL", True),
("Hugging Face", ("HF_TOKEN",), "https://router.huggingface.co/v1/models", "HF_BASE_URL", True),
("NVIDIA NIM", ("NVIDIA_API_KEY",), "https://integrate.api.nvidia.com/v1/models", "NVIDIA_BASE_URL", True),
("Alibaba/DashScope", ("DASHSCOPE_API_KEY",), "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models", "DASHSCOPE_BASE_URL", True),
# MiniMax: the /anthropic endpoint doesn't support /models, but the /v1 endpoint does.
("MiniMax", ("MINIMAX_API_KEY",), "https://api.minimax.io/v1/models", "MINIMAX_BASE_URL", True),
@@ -894,8 +895,8 @@ def run_doctor(args):
_model_count = len(_br_resp.get("modelSummaries", []))
print(f"\r {color('', Colors.GREEN)} {_label} {color(f'({_auth_var}, {_region}, {_model_count} models)', Colors.DIM)} ")
except ImportError:
print(f"\r {color('', Colors.YELLOW)} {_label} {color('(boto3 not installed — pip install hermes-agent[bedrock])', Colors.DIM)} ")
issues.append("Install boto3 for Bedrock: pip install hermes-agent[bedrock]")
print(f"\r {color('', Colors.YELLOW)} {_label} {color(f'(boto3 not installed — {sys.executable} -m pip install boto3)', Colors.DIM)} ")
issues.append(f"Install boto3 for Bedrock: {sys.executable} -m pip install boto3")
except Exception as _e:
_err_name = type(_e).__name__
print(f"\r {color('', Colors.YELLOW)} {_label} {color(f'({_err_name}: {_e})', Colors.DIM)} ")
+15 -35
View File
@@ -43,41 +43,20 @@ def _redact(value: str) -> str:
def _gateway_status() -> str:
"""Return a short gateway status string."""
if sys.platform.startswith("linux"):
from hermes_constants import is_container
if is_container():
try:
from hermes_cli.gateway import find_gateway_pids
pids = find_gateway_pids()
if pids:
return f"running (docker, pid {pids[0]})"
return "stopped (docker)"
except Exception:
return "stopped (docker)"
try:
from hermes_cli.gateway import get_service_name
svc = get_service_name()
except Exception:
svc = "hermes-gateway"
try:
r = subprocess.run(
["systemctl", "--user", "is-active", svc],
capture_output=True, text=True, timeout=5,
)
return "running (systemd)" if r.stdout.strip() == "active" else "stopped"
except Exception:
return "unknown"
elif sys.platform == "darwin":
try:
from hermes_cli.gateway import get_launchd_label
r = subprocess.run(
["launchctl", "list", get_launchd_label()],
capture_output=True, text=True, timeout=5,
)
return "loaded (launchd)" if r.returncode == 0 else "not loaded"
except Exception:
return "unknown"
return "N/A"
try:
from hermes_cli.gateway import get_gateway_runtime_snapshot
snapshot = get_gateway_runtime_snapshot()
if snapshot.running:
mode = snapshot.manager
if snapshot.has_process_service_mismatch:
mode = "manual"
return f"running ({mode}, pid {snapshot.gateway_pids[0]})"
if snapshot.service_installed and not snapshot.service_running:
return f"stopped ({snapshot.manager})"
return f"stopped ({snapshot.manager})"
except Exception:
return "unknown" if sys.platform.startswith(("linux", "darwin")) else "N/A"
def _count_skills(hermes_home: Path) -> int:
@@ -296,6 +275,7 @@ def run_dump(args):
("DEEPSEEK_API_KEY", "deepseek"),
("DASHSCOPE_API_KEY", "dashscope"),
("HF_TOKEN", "huggingface"),
("NVIDIA_API_KEY", "nvidia"),
("AI_GATEWAY_API_KEY", "ai_gateway"),
("OPENCODE_ZEN_API_KEY", "opencode_zen"),
("OPENCODE_GO_API_KEY", "opencode_go"),
+691 -34
View File
@@ -10,6 +10,7 @@ import shutil
import signal
import subprocess
import sys
from dataclasses import dataclass
from pathlib import Path
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
@@ -41,6 +42,23 @@ from hermes_cli.colors import Colors, color
# Process Management (for manual gateway runs)
# =============================================================================
@dataclass(frozen=True)
class GatewayRuntimeSnapshot:
manager: str
service_installed: bool = False
service_running: bool = False
gateway_pids: tuple[int, ...] = ()
service_scope: str | None = None
@property
def running(self) -> bool:
return self.service_running or bool(self.gateway_pids)
@property
def has_process_service_mismatch(self) -> bool:
return self.service_installed and self.running and not self.service_running
def _get_service_pids() -> set:
"""Return PIDs currently managed by systemd or launchd gateway services.
@@ -157,20 +175,22 @@ def _request_gateway_self_restart(pid: int) -> bool:
return True
def find_gateway_pids(exclude_pids: set | None = None, all_profiles: bool = False) -> list:
"""Find PIDs of running gateway processes.
def _append_unique_pid(pids: list[int], pid: int | None, exclude_pids: set[int]) -> None:
if pid is None or pid <= 0:
return
if pid == os.getpid() or pid in exclude_pids or pid in pids:
return
pids.append(pid)
Args:
exclude_pids: PIDs to exclude from the result (e.g. service-managed
PIDs that should not be killed during a stale-process sweep).
all_profiles: When ``True``, return gateway PIDs across **all**
profiles (the pre-7923 global behaviour). ``hermes update``
needs this because a code update affects every profile.
When ``False`` (default), only PIDs belonging to the current
Hermes profile are returned.
def _scan_gateway_pids(exclude_pids: set[int], all_profiles: bool = False) -> list[int]:
"""Best-effort process-table scan for gateway PIDs.
This supplements the profile-scoped PID file so status views can still spot
a live gateway when the PID file is stale/missing, and ``--all`` sweeps can
discover gateways outside the current profile.
"""
_exclude = exclude_pids or set()
pids = [pid for pid in _get_service_pids() if pid not in _exclude]
pids: list[int] = []
patterns = [
"hermes_cli.main gateway",
"hermes_cli.main --profile",
@@ -203,20 +223,24 @@ def find_gateway_pids(exclude_pids: set | None = None, all_profiles: bool = Fals
if is_windows():
result = subprocess.run(
["wmic", "process", "get", "ProcessId,CommandLine", "/FORMAT:LIST"],
capture_output=True, text=True, timeout=10
capture_output=True,
text=True,
timeout=10,
)
if result.returncode != 0:
return []
current_cmd = ""
for line in result.stdout.split('\n'):
for line in result.stdout.split("\n"):
line = line.strip()
if line.startswith("CommandLine="):
current_cmd = line[len("CommandLine="):]
elif line.startswith("ProcessId="):
pid_str = line[len("ProcessId="):]
if any(p in current_cmd for p in patterns) and (all_profiles or _matches_current_profile(current_cmd)):
if any(p in current_cmd for p in patterns) and (
all_profiles or _matches_current_profile(current_cmd)
):
try:
pid = int(pid_str)
if pid != os.getpid() and pid not in pids and pid not in _exclude:
pids.append(pid)
_append_unique_pid(pids, int(pid_str), exclude_pids)
except ValueError:
pass
current_cmd = ""
@@ -227,9 +251,11 @@ def find_gateway_pids(exclude_pids: set | None = None, all_profiles: bool = Fals
text=True,
timeout=10,
)
for line in result.stdout.split('\n'):
if result.returncode != 0:
return []
for line in result.stdout.split("\n"):
stripped = line.strip()
if not stripped or 'grep' in stripped:
if not stripped or "grep" in stripped:
continue
pid = None
@@ -251,16 +277,137 @@ def find_gateway_pids(exclude_pids: set | None = None, all_profiles: bool = Fals
if pid is None:
continue
if pid == os.getpid() or pid in pids or pid in _exclude:
continue
if any(pattern in command for pattern in patterns) and (all_profiles or _matches_current_profile(command)):
pids.append(pid)
if any(pattern in command for pattern in patterns) and (
all_profiles or _matches_current_profile(command)
):
_append_unique_pid(pids, pid, exclude_pids)
except (OSError, subprocess.TimeoutExpired):
pass
return []
return pids
def find_gateway_pids(exclude_pids: set | None = None, all_profiles: bool = False) -> list:
"""Find PIDs of running gateway processes.
Args:
exclude_pids: PIDs to exclude from the result (e.g. service-managed
PIDs that should not be killed during a stale-process sweep).
all_profiles: When ``True``, return gateway PIDs across **all**
profiles (the pre-7923 global behaviour). ``hermes update``
needs this because a code update affects every profile.
When ``False`` (default), only PIDs belonging to the current
Hermes profile are returned.
"""
_exclude = set(exclude_pids or set())
pids: list[int] = []
if not all_profiles:
try:
from gateway.status import get_running_pid
_append_unique_pid(pids, get_running_pid(), _exclude)
except Exception:
pass
for pid in _get_service_pids():
_append_unique_pid(pids, pid, _exclude)
for pid in _scan_gateway_pids(_exclude, all_profiles=all_profiles):
_append_unique_pid(pids, pid, _exclude)
return pids
def _probe_systemd_service_running(system: bool = False) -> tuple[bool, bool]:
selected_system = _select_systemd_scope(system)
unit_exists = get_systemd_unit_path(system=selected_system).exists()
if not unit_exists:
return selected_system, False
try:
result = _run_systemctl(
["is-active", get_service_name()],
system=selected_system,
capture_output=True,
text=True,
timeout=10,
)
except (RuntimeError, subprocess.TimeoutExpired):
return selected_system, False
return selected_system, result.stdout.strip() == "active"
def _probe_launchd_service_running() -> bool:
if not get_launchd_plist_path().exists():
return False
try:
result = subprocess.run(
["launchctl", "list", get_launchd_label()],
capture_output=True,
text=True,
timeout=10,
)
except subprocess.TimeoutExpired:
return False
return result.returncode == 0
def get_gateway_runtime_snapshot(system: bool = False) -> GatewayRuntimeSnapshot:
"""Return a unified view of gateway liveness for the current profile."""
gateway_pids = tuple(find_gateway_pids())
if is_termux():
return GatewayRuntimeSnapshot(
manager="Termux / manual process",
gateway_pids=gateway_pids,
)
from hermes_constants import is_container
if is_linux() and is_container():
return GatewayRuntimeSnapshot(
manager="docker (foreground)",
gateway_pids=gateway_pids,
)
if supports_systemd_services():
selected_system, service_running = _probe_systemd_service_running(system=system)
scope_label = _service_scope_label(selected_system)
return GatewayRuntimeSnapshot(
manager=f"systemd ({scope_label})",
service_installed=get_systemd_unit_path(system=selected_system).exists(),
service_running=service_running,
gateway_pids=gateway_pids,
service_scope=scope_label,
)
if is_macos():
return GatewayRuntimeSnapshot(
manager="launchd",
service_installed=get_launchd_plist_path().exists(),
service_running=_probe_launchd_service_running(),
gateway_pids=gateway_pids,
service_scope="launchd",
)
return GatewayRuntimeSnapshot(
manager="manual process",
gateway_pids=gateway_pids,
)
def _format_gateway_pids(pids: tuple[int, ...] | list[int], *, limit: int | None = 3) -> str:
rendered = [str(pid) for pid in pids[:limit] if pid > 0] if limit is not None else [str(pid) for pid in pids if pid > 0]
if limit is not None and len(pids) > limit:
rendered.append("...")
return ", ".join(rendered)
def _print_gateway_process_mismatch(snapshot: GatewayRuntimeSnapshot) -> None:
if not snapshot.has_process_service_mismatch:
return
print()
print("⚠ Gateway process is running for this profile, but the service is not active")
print(f" PID(s): {_format_gateway_pids(snapshot.gateway_pids, limit=None)}")
print(" This is usually a manual foreground/tmux/nohup run, so `hermes gateway`")
print(" can refuse to start another copy until this process stops.")
def kill_gateway_processes(force: bool = False, exclude_pids: set | None = None,
all_profiles: bool = False) -> int:
"""Kill any running gateway processes. Returns count killed.
@@ -340,25 +487,44 @@ def _wsl_systemd_operational() -> bool:
WSL2 with ``systemd=true`` in wsl.conf has working systemd.
WSL2 without it (or WSL1) does not systemctl commands fail.
"""
return _systemd_operational(system=True)
def _systemd_operational(system: bool = False) -> bool:
"""Return True when the requested systemd scope is usable."""
try:
result = subprocess.run(
["systemctl", "is-system-running"],
capture_output=True, text=True, timeout=5,
result = _run_systemctl(
["is-system-running"],
system=system,
capture_output=True,
text=True,
timeout=5,
)
# "running", "degraded", "starting" all mean systemd is PID 1
status = result.stdout.strip().lower()
return status in ("running", "degraded", "starting", "initializing")
except (FileNotFoundError, subprocess.TimeoutExpired, OSError):
except (RuntimeError, subprocess.TimeoutExpired, OSError):
return False
def _container_systemd_operational() -> bool:
"""Return True when a container exposes working user or system systemd."""
if _systemd_operational(system=False):
return True
if _systemd_operational(system=True):
return True
return False
def supports_systemd_services() -> bool:
if not is_linux() or is_termux() or is_container():
if not is_linux() or is_termux():
return False
if shutil.which("systemctl") is None:
return False
if is_wsl():
return _wsl_systemd_operational()
if is_container():
return _container_systemd_operational()
return True
@@ -521,6 +687,195 @@ def has_conflicting_systemd_units() -> bool:
return len(get_installed_systemd_scopes()) > 1
# Legacy service names from older Hermes installs that predate the
# hermes-gateway rename. Kept as an explicit allowlist (NOT a glob) so
# profile units (hermes-gateway-*.service) and unrelated third-party
# "hermes" units are never matched.
_LEGACY_SERVICE_NAMES: tuple[str, ...] = ("hermes.service",)
# ExecStart content markers that identify a unit as running our gateway.
# A legacy unit is only flagged when its file contains one of these.
_LEGACY_UNIT_EXECSTART_MARKERS: tuple[str, ...] = (
"hermes_cli.main gateway",
"hermes_cli/main.py gateway",
"gateway/run.py",
" hermes gateway ",
"/hermes gateway ",
)
def _legacy_unit_search_paths() -> list[tuple[bool, Path]]:
"""Return ``[(is_system, base_dir), ...]`` — directories to scan for legacy units.
Factored out so tests can monkeypatch the search roots without touching
real filesystem paths.
"""
return [
(False, Path.home() / ".config" / "systemd" / "user"),
(True, Path("/etc/systemd/system")),
]
def _find_legacy_hermes_units() -> list[tuple[str, Path, bool]]:
"""Return ``[(unit_name, unit_path, is_system)]`` for legacy Hermes gateway units.
Detects unit files installed by older Hermes versions that used a
different service name (e.g. ``hermes.service`` before the rename to
``hermes-gateway.service``). When both a legacy unit and the current
``hermes-gateway.service`` are active, they fight over the same bot
token the PR #5646 signal-recovery change turns this into a 30-second
SIGTERM flap loop.
Safety guards:
* Explicit allowlist of legacy names (no globbing). Profile units such
as ``hermes-gateway-coder.service`` and unrelated third-party
``hermes-*`` services are never matched.
* ExecStart content check only flag units that invoke our gateway
entrypoint. A user-created ``hermes.service`` running an unrelated
binary is left untouched.
* Results are returned purely for caller inspection; this function
never mutates or removes anything.
"""
results: list[tuple[str, Path, bool]] = []
for is_system, base in _legacy_unit_search_paths():
for name in _LEGACY_SERVICE_NAMES:
unit_path = base / name
try:
if not unit_path.exists():
continue
text = unit_path.read_text(encoding="utf-8", errors="ignore")
except (OSError, PermissionError):
continue
if not any(marker in text for marker in _LEGACY_UNIT_EXECSTART_MARKERS):
# Not our gateway — leave alone
continue
results.append((name, unit_path, is_system))
return results
def has_legacy_hermes_units() -> bool:
"""Return True when any legacy Hermes gateway unit files exist."""
return bool(_find_legacy_hermes_units())
def print_legacy_unit_warning() -> None:
"""Warn about legacy Hermes gateway unit files if any are installed.
Idempotent: prints nothing when no legacy units are detected. Safe to
call from any status/install/setup path.
"""
legacy = _find_legacy_hermes_units()
if not legacy:
return
print_warning("Legacy Hermes gateway unit(s) detected from an older install:")
for name, path, is_system in legacy:
scope = "system" if is_system else "user"
print_info(f" {path} ({scope} scope)")
print_info(" These run alongside the current hermes-gateway service and")
print_info(" cause SIGTERM flap loops — both try to use the same bot token.")
print_info(" Remove them with:")
print_info(" hermes gateway migrate-legacy")
def remove_legacy_hermes_units(
interactive: bool = True,
dry_run: bool = False,
) -> tuple[int, list[Path]]:
"""Stop, disable, and remove legacy Hermes gateway unit files.
Iterates over whatever ``_find_legacy_hermes_units()`` returns which is
an explicit allowlist of legacy names (not a glob). Profile units and
unrelated third-party services are never touched.
Args:
interactive: When True, prompt before removing. When False, remove
without asking (used when another prompt has already confirmed,
e.g. from the install flow).
dry_run: When True, list what would be removed and return.
Returns:
``(removed_count, remaining_paths)`` remaining includes units we
couldn't remove (typically system-scope when not running as root).
"""
legacy = _find_legacy_hermes_units()
if not legacy:
print("No legacy Hermes gateway units found.")
return 0, []
user_units = [(n, p) for n, p, is_sys in legacy if not is_sys]
system_units = [(n, p) for n, p, is_sys in legacy if is_sys]
print()
print("Legacy Hermes gateway unit(s) found:")
for name, path, is_system in legacy:
scope = "system" if is_system else "user"
print(f" {path} ({scope} scope)")
print()
if dry_run:
print("(dry-run — nothing removed)")
return 0, [p for _, p, _ in legacy]
if interactive and not prompt_yes_no("Remove these legacy units?", True):
print("Skipped. Run again with: hermes gateway migrate-legacy")
return 0, [p for _, p, _ in legacy]
removed = 0
remaining: list[Path] = []
# User-scope removal
for name, path in user_units:
try:
_run_systemctl(["stop", name], system=False, check=False, timeout=90)
_run_systemctl(["disable", name], system=False, check=False, timeout=30)
path.unlink(missing_ok=True)
print(f" ✓ Removed {path}")
removed += 1
except (OSError, RuntimeError) as e:
print(f" ⚠ Could not remove {path}: {e}")
remaining.append(path)
if user_units:
try:
_run_systemctl(["daemon-reload"], system=False, check=False, timeout=30)
except RuntimeError:
pass
# System-scope removal (needs root)
if system_units:
if os.geteuid() != 0:
print()
print_warning("System-scope legacy units require root to remove.")
print_info(" Re-run with: sudo hermes gateway migrate-legacy")
for _, path in system_units:
remaining.append(path)
else:
for name, path in system_units:
try:
_run_systemctl(["stop", name], system=True, check=False, timeout=90)
_run_systemctl(["disable", name], system=True, check=False, timeout=30)
path.unlink(missing_ok=True)
print(f" ✓ Removed {path}")
removed += 1
except (OSError, RuntimeError) as e:
print(f" ⚠ Could not remove {path}: {e}")
remaining.append(path)
try:
_run_systemctl(["daemon-reload"], system=True, check=False, timeout=30)
except RuntimeError:
pass
print()
if remaining:
print_warning(f"{len(remaining)} legacy unit(s) still present — see messages above.")
else:
print_success(f"Removed {removed} legacy unit(s).")
return removed, remaining
def print_systemd_scope_conflict_warning() -> None:
scopes = get_installed_systemd_scopes()
if len(scopes) < 2:
@@ -1054,6 +1409,19 @@ def systemd_install(force: bool = False, system: bool = False, run_as_user: str
if system:
_require_root_for_system_service("install")
# Offer to remove legacy units (hermes.service from pre-rename installs)
# before installing the new hermes-gateway.service. If both remain, they
# flap-fight for the Telegram bot token on every gateway startup.
# Only removes units matching _LEGACY_SERVICE_NAMES + our ExecStart
# signature — profile units are never touched.
if has_legacy_hermes_units():
print()
print_legacy_unit_warning()
print()
if prompt_yes_no("Remove the legacy unit(s) before installing?", True):
remove_legacy_hermes_units(interactive=False)
print()
unit_path = get_systemd_unit_path(system=system)
scope_flag = " --system" if system else ""
@@ -1092,6 +1460,7 @@ def systemd_install(force: bool = False, system: bool = False, run_as_user: str
_ensure_linger_enabled()
print_systemd_scope_conflict_warning()
print_legacy_unit_warning()
def systemd_uninstall(system: bool = False):
@@ -1215,6 +1584,10 @@ def systemd_status(deep: bool = False, system: bool = False):
print_systemd_scope_conflict_warning()
print()
if has_legacy_hermes_units():
print_legacy_unit_warning()
print()
if not systemd_unit_is_current(system=system):
print("⚠ Installed gateway service definition is outdated")
print(f" Run: {'sudo ' if system else ''}hermes gateway restart{scope_flag} # auto-refreshes the unit")
@@ -1998,7 +2371,7 @@ _PLATFORMS = [
{"name": "QQ_ALLOWED_USERS", "prompt": "Allowed user OpenIDs (comma-separated, leave empty for open access)", "password": False,
"is_allowlist": True,
"help": "Optional — restrict DM access to specific user OpenIDs."},
{"name": "QQ_HOME_CHANNEL", "prompt": "Home channel (user/group OpenID for cron delivery, or empty)", "password": False,
{"name": "QQBOT_HOME_CHANNEL", "prompt": "Home channel (user/group OpenID for cron delivery, or empty)", "password": False,
"help": "OpenID to deliver cron results and notifications to."},
],
},
@@ -2211,9 +2584,62 @@ def _setup_sms():
def _setup_dingtalk():
"""Configure DingTalk via the standard platform setup."""
"""Configure DingTalk — QR scan (recommended) or manual credential entry."""
from hermes_cli.setup import (
prompt_choice, prompt_yes_no, print_info, print_success, print_warning,
)
dingtalk_platform = next(p for p in _PLATFORMS if p["key"] == "dingtalk")
_setup_standard_platform(dingtalk_platform)
emoji = dingtalk_platform["emoji"]
label = dingtalk_platform["label"]
print()
print(color(f" ─── {emoji} {label} Setup ───", Colors.CYAN))
existing = get_env_value("DINGTALK_CLIENT_ID")
if existing:
print()
print_success(f"{label} is already configured (Client ID: {existing}).")
if not prompt_yes_no(f" Reconfigure {label}?", False):
return
print()
method = prompt_choice(
" Choose setup method",
[
"QR Code Scan (Recommended, auto-obtain Client ID and Client Secret)",
"Manual Input (Client ID and Client Secret)",
],
default=0,
)
if method == 0:
# ── QR-code device-flow authorization ──
try:
from hermes_cli.dingtalk_auth import dingtalk_qr_auth
except ImportError as exc:
print_warning(f" QR auth module failed to load ({exc}), falling back to manual input.")
_setup_standard_platform(dingtalk_platform)
return
result = dingtalk_qr_auth()
if result is None:
print_warning(" QR auth incomplete, falling back to manual input.")
_setup_standard_platform(dingtalk_platform)
return
client_id, client_secret = result
save_env_value("DINGTALK_CLIENT_ID", client_id)
save_env_value("DINGTALK_CLIENT_SECRET", client_secret)
save_env_value("DINGTALK_ALLOW_ALL_USERS", "true")
print()
print_success(f"{emoji} {label} configured via QR scan!")
else:
# ── Manual entry ──
_setup_standard_platform(dingtalk_platform)
# Also enable allow-all by default for convenience
if get_env_value("DINGTALK_CLIENT_ID"):
save_env_value("DINGTALK_ALLOW_ALL_USERS", "true")
def _setup_wecom():
@@ -2572,6 +2998,215 @@ def _setup_feishu():
print_info(f" Bot: {bot_name}")
def _setup_qqbot():
"""Interactive setup for QQ Bot — scan-to-configure or manual credentials."""
print()
print(color(" ─── 🐧 QQ Bot Setup ───", Colors.CYAN))
existing_app_id = get_env_value("QQ_APP_ID")
existing_secret = get_env_value("QQ_CLIENT_SECRET")
if existing_app_id and existing_secret:
print()
print_success("QQ Bot is already configured.")
if not prompt_yes_no(" Reconfigure QQ Bot?", False):
return
# ── Choose setup method ──
print()
method_choices = [
"Scan QR code to add bot automatically (recommended)",
"Enter existing App ID and App Secret manually",
]
method_idx = prompt_choice(" How would you like to set up QQ Bot?", method_choices, 0)
credentials = None
used_qr = False
if method_idx == 0:
# ── QR scan-to-configure ──
try:
credentials = _qqbot_qr_flow()
except KeyboardInterrupt:
print()
print_warning(" QQ Bot setup cancelled.")
return
if credentials:
used_qr = True
if not credentials:
print_info(" QR setup did not complete. Continuing with manual input.")
# ── Manual credential input ──
if not credentials:
print()
print_info(" Go to https://q.qq.com to register a QQ Bot application.")
print_info(" Note your App ID and App Secret from the application page.")
print()
app_id = prompt(" App ID", password=False)
if not app_id:
print_warning(" Skipped — QQ Bot won't work without an App ID.")
return
app_secret = prompt(" App Secret", password=True)
if not app_secret:
print_warning(" Skipped — QQ Bot won't work without an App Secret.")
return
credentials = {"app_id": app_id.strip(), "client_secret": app_secret.strip(), "user_openid": ""}
# ── Save core credentials ──
save_env_value("QQ_APP_ID", credentials["app_id"])
save_env_value("QQ_CLIENT_SECRET", credentials["client_secret"])
user_openid = credentials.get("user_openid", "")
# ── DM security policy ──
print()
access_choices = [
"Use DM pairing approval (recommended)",
"Allow all direct messages",
"Only allow listed user OpenIDs",
]
access_idx = prompt_choice(" How should direct messages be authorized?", access_choices, 0)
if access_idx == 0:
save_env_value("QQ_ALLOW_ALL_USERS", "false")
if user_openid:
print()
if prompt_yes_no(f" Add yourself ({user_openid}) to the allow list?", True):
save_env_value("QQ_ALLOWED_USERS", user_openid)
print_success(f" Allow list set to {user_openid}")
else:
save_env_value("QQ_ALLOWED_USERS", "")
else:
save_env_value("QQ_ALLOWED_USERS", "")
print_success(" DM pairing enabled.")
print_info(" Unknown users can request access; approve with `hermes pairing approve`.")
elif access_idx == 1:
save_env_value("QQ_ALLOW_ALL_USERS", "true")
save_env_value("QQ_ALLOWED_USERS", "")
print_warning(" Open DM access enabled for QQ Bot.")
else:
default_allow = user_openid or ""
allowlist = prompt(" Allowed user OpenIDs (comma-separated)", default_allow, password=False).replace(" ", "")
save_env_value("QQ_ALLOW_ALL_USERS", "false")
save_env_value("QQ_ALLOWED_USERS", allowlist)
print_success(" Allowlist saved.")
# ── Home channel ──
if user_openid:
print()
if prompt_yes_no(f" Use your QQ user ID ({user_openid}) as the home channel?", True):
save_env_value("QQBOT_HOME_CHANNEL", user_openid)
print_success(f" Home channel set to {user_openid}")
else:
print()
home_channel = prompt(" Home channel OpenID (for cron/notifications, or empty)", password=False)
if home_channel:
save_env_value("QQBOT_HOME_CHANNEL", home_channel.strip())
print_success(f" Home channel set to {home_channel.strip()}")
print()
print_success("🐧 QQ Bot configured!")
print_info(f" App ID: {credentials['app_id']}")
def _qqbot_render_qr(url: str) -> bool:
"""Try to render a QR code in the terminal. Returns True if successful."""
try:
import qrcode as _qr
qr = _qr.QRCode(border=1,error_correction=_qr.constants.ERROR_CORRECT_L)
qr.add_data(url)
qr.make(fit=True)
qr.print_ascii(invert=True)
return True
except Exception:
return False
def _qqbot_qr_flow():
"""Run the QR-code scan-to-configure flow.
Returns a dict with app_id, client_secret, user_openid on success,
or None on failure/cancel.
"""
try:
from gateway.platforms.qqbot import (
create_bind_task, poll_bind_result, build_connect_url,
decrypt_secret, BindStatus,
)
from gateway.platforms.qqbot.constants import ONBOARD_POLL_INTERVAL
except Exception as exc:
print_error(f" QQBot onboard import failed: {exc}")
return None
import asyncio
import time
MAX_REFRESHES = 3
refresh_count = 0
while refresh_count <= MAX_REFRESHES:
loop = asyncio.new_event_loop()
# ── Create bind task ──
try:
task_id, aes_key = loop.run_until_complete(create_bind_task())
except Exception as e:
print_warning(f" Failed to create bind task: {e}")
loop.close()
return None
url = build_connect_url(task_id)
# ── Display QR code + URL ──
print()
if _qqbot_render_qr(url):
print(f" Scan the QR code above, or open this URL directly:\n {url}")
else:
print(f" Open this URL in QQ on your phone:\n {url}")
print_info(" Tip: pip install qrcode to show a scannable QR code here")
# ── Poll loop (silent — keep QR visible at bottom) ──
try:
while True:
try:
status, app_id, encrypted_secret, user_openid = loop.run_until_complete(
poll_bind_result(task_id)
)
except Exception:
time.sleep(ONBOARD_POLL_INTERVAL)
continue
if status == BindStatus.COMPLETED:
client_secret = decrypt_secret(encrypted_secret, aes_key)
print()
print_success(f" QR scan complete! (App ID: {app_id})")
if user_openid:
print_info(f" Scanner's OpenID: {user_openid}")
return {
"app_id": app_id,
"client_secret": client_secret,
"user_openid": user_openid,
}
if status == BindStatus.EXPIRED:
refresh_count += 1
if refresh_count > MAX_REFRESHES:
print()
print_warning(f" QR code expired {MAX_REFRESHES} times — giving up.")
return None
print()
print_warning(f" QR code expired, refreshing... ({refresh_count}/{MAX_REFRESHES})")
loop.close()
break # outer while creates a new task
time.sleep(ONBOARD_POLL_INTERVAL)
except KeyboardInterrupt:
loop.close()
raise
finally:
loop.close()
return None
def _setup_signal():
"""Interactive setup for Signal messenger."""
import shutil
@@ -2709,6 +3344,10 @@ def gateway_setup():
print_systemd_scope_conflict_warning()
print()
if supports_systemd_services() and has_legacy_hermes_units():
print_legacy_unit_warning()
print()
if service_installed and service_running:
print_success("Gateway service is installed and running.")
elif service_installed:
@@ -2749,8 +3388,12 @@ def gateway_setup():
_setup_signal()
elif platform["key"] == "weixin":
_setup_weixin()
elif platform["key"] == "dingtalk":
_setup_dingtalk()
elif platform["key"] == "feishu":
_setup_feishu()
elif platform["key"] == "qqbot":
_setup_qqbot()
else:
_setup_standard_platform(platform)
@@ -3110,15 +3753,18 @@ def gateway_command(args):
elif subcmd == "status":
deep = getattr(args, 'deep', False)
system = getattr(args, 'system', False)
snapshot = get_gateway_runtime_snapshot(system=system)
# Check for service first
if supports_systemd_services() and (get_systemd_unit_path(system=False).exists() or get_systemd_unit_path(system=True).exists()):
systemd_status(deep, system=system)
_print_gateway_process_mismatch(snapshot)
elif is_macos() and get_launchd_plist_path().exists():
launchd_status(deep)
_print_gateway_process_mismatch(snapshot)
else:
# Check for manually running processes
pids = find_gateway_pids()
pids = list(snapshot.gateway_pids)
if pids:
print(f"✓ Gateway is running (PID: {', '.join(map(str, pids))})")
print(" (Running manually, not as a system service)")
@@ -3159,3 +3805,14 @@ def gateway_command(args):
else:
print(" hermes gateway install # Install as user service")
print(" sudo hermes gateway install --system # Install as boot-time system service")
elif subcmd == "migrate-legacy":
# Stop, disable, and remove legacy Hermes gateway unit files from
# pre-rename installs (e.g. hermes.service). Profile units and
# unrelated third-party services are never touched.
dry_run = getattr(args, 'dry_run', False)
yes = getattr(args, 'yes', False)
if not supports_systemd_services() and not is_macos():
print("Legacy unit migration only applies to systemd-based Linux hosts.")
return
remove_legacy_hermes_units(interactive=not yes, dry_run=dry_run)
+2609 -686
View File
File diff suppressed because it is too large Load Diff
+66 -5
View File
@@ -279,8 +279,8 @@ def cmd_mcp_add(args):
_info(f"Starting OAuth flow for '{name}'...")
oauth_ok = False
try:
from tools.mcp_oauth import build_oauth_auth
oauth_auth = build_oauth_auth(name, url)
from tools.mcp_oauth_manager import get_manager
oauth_auth = get_manager().get_or_build_provider(name, url, None)
if oauth_auth:
server_config["auth"] = "oauth"
_success("OAuth configured (tokens will be acquired on first connection)")
@@ -428,10 +428,12 @@ def cmd_mcp_remove(args):
_remove_mcp_server(name)
_success(f"Removed '{name}' from config")
# Clean up OAuth tokens if they exist
# Clean up OAuth tokens if they exist — route through MCPOAuthManager so
# any provider instance cached in the current process (e.g. from an
# earlier `hermes mcp test` in the same session) is evicted too.
try:
from tools.mcp_oauth import remove_oauth_tokens
remove_oauth_tokens(name)
from tools.mcp_oauth_manager import get_manager
get_manager().remove(name)
_success("Cleaned up OAuth tokens")
except Exception:
pass
@@ -577,6 +579,63 @@ def _interpolate_value(value: str) -> str:
return re.sub(r"\$\{(\w+)\}", _replace, value)
# ─── hermes mcp login ────────────────────────────────────────────────────────
def cmd_mcp_login(args):
"""Force re-authentication for an OAuth-based MCP server.
Deletes cached tokens (both on disk and in the running process's
MCPOAuthManager cache) and triggers a fresh OAuth flow via the
existing probe path.
Use this when:
- Tokens are stuck in a bad state (server revoked, refresh token
consumed by an external process, etc.)
- You want to re-authenticate to change scopes or account
- A tool call returned ``needs_reauth: true``
"""
name = args.name
servers = _get_mcp_servers()
if name not in servers:
_error(f"Server '{name}' not found in config.")
if servers:
_info(f"Available servers: {', '.join(servers)}")
return
server_config = servers[name]
url = server_config.get("url")
if not url:
_error(f"Server '{name}' has no URL — not an OAuth-capable server")
return
if server_config.get("auth") != "oauth":
_error(f"Server '{name}' is not configured for OAuth (auth={server_config.get('auth')})")
_info("Use `hermes mcp remove` + `hermes mcp add` to reconfigure auth.")
return
# Wipe both disk and in-memory cache so the next probe forces a fresh
# OAuth flow.
try:
from tools.mcp_oauth_manager import get_manager
mgr = get_manager()
mgr.remove(name)
except Exception as exc:
_warning(f"Could not clear existing OAuth state: {exc}")
print()
_info(f"Starting OAuth flow for '{name}'...")
# Probe triggers the OAuth flow (browser redirect + callback capture).
try:
tools = _probe_single_server(name, server_config)
if tools:
_success(f"Authenticated — {len(tools)} tool(s) available")
else:
_success("Authenticated (server reported no tools)")
except Exception as exc:
_error(f"Authentication failed: {exc}")
# ─── hermes mcp configure ────────────────────────────────────────────────────
def cmd_mcp_configure(args):
@@ -696,6 +755,7 @@ def mcp_command(args):
"test": cmd_mcp_test,
"configure": cmd_mcp_configure,
"config": cmd_mcp_configure,
"login": cmd_mcp_login,
}
handler = handlers.get(action)
@@ -713,4 +773,5 @@ def mcp_command(args):
_info("hermes mcp list List servers")
_info("hermes mcp test <name> Test connection")
_info("hermes mcp configure <name> Toggle tools")
_info("hermes mcp login <name> Re-authenticate OAuth")
print()
+20 -1
View File
@@ -374,7 +374,26 @@ def normalize_model_for_provider(model_input: str, target_provider: str) -> str:
return bare
return _dots_to_hyphens(bare)
# --- Copilot: strip matching provider prefix, keep dots ---
# --- Copilot / Copilot ACP: delegate to the Copilot-specific
# normalizer. It knows about the alias table (vendor-prefix
# stripping for Anthropic/OpenAI, dash-to-dot repair for Claude)
# and live-catalog lookups. Without this, vendor-prefixed or
# dash-notation Claude IDs survive to the Copilot API and hit
# HTTP 400 "model_not_supported". See issue #6879.
if provider in {"copilot", "copilot-acp"}:
try:
from hermes_cli.models import normalize_copilot_model_id
normalized = normalize_copilot_model_id(name)
if normalized:
return normalized
except Exception:
# Fall through to the generic strip-vendor behaviour below
# if the Copilot-specific path is unavailable for any reason.
pass
# --- Copilot / Copilot ACP / openai-codex fallback:
# strip matching provider prefix, keep dots ---
if provider in _STRIP_VENDOR_ONLY_PROVIDERS:
stripped = _strip_matching_provider_prefix(name, provider)
if stripped == name and name.startswith("openai/"):
+20 -4
View File
@@ -692,12 +692,12 @@ def switch_model(
api_key=api_key,
base_url=base_url,
)
except Exception:
except Exception as e:
validation = {
"accepted": True,
"persist": True,
"accepted": False,
"persist": False,
"recognized": False,
"message": None,
"message": f"Could not validate `{new_model}`: {e}",
}
if not validation.get("accepted"):
@@ -727,6 +727,22 @@ def switch_model(
if not api_mode:
api_mode = determine_api_mode(target_provider, base_url)
# OpenCode base URLs end with /v1 for OpenAI-compatible models, but the
# Anthropic SDK prepends its own /v1/messages to the base_url. Strip the
# trailing /v1 so the SDK constructs the correct path (e.g.
# https://opencode.ai/zen/go/v1/messages instead of .../v1/v1/messages).
# Mirrors the same logic in hermes_cli.runtime_provider.resolve_runtime_provider;
# without it, /model switches into an anthropic_messages-routed OpenCode
# model (e.g. `/model minimax-m2.7` on opencode-go, `/model claude-sonnet-4-6`
# on opencode-zen) hit a double /v1 and returned OpenCode's website 404 page.
if (
api_mode == "anthropic_messages"
and target_provider in {"opencode-zen", "opencode-go"}
and isinstance(base_url, str)
and base_url
):
base_url = re.sub(r"/v1/?$", "", base_url)
# --- Get capabilities (legacy) ---
capabilities = get_model_capabilities(target_provider, new_model)
+59 -31
View File
@@ -26,7 +26,8 @@ COPILOT_REASONING_EFFORTS_O_SERIES = ["low", "medium", "high"]
# Fallback OpenRouter snapshot used when the live catalog is unavailable.
# (model_id, display description shown in menus)
OPENROUTER_MODELS: list[tuple[str, str]] = [
("anthropic/claude-opus-4.7", "recommended"),
("moonshotai/kimi-k2.5", "recommended"),
("anthropic/claude-opus-4.7", ""),
("anthropic/claude-opus-4.6", ""),
("anthropic/claude-sonnet-4.6", ""),
("qwen/qwen3.6-plus", ""),
@@ -49,7 +50,6 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("z-ai/glm-5.1", ""),
("z-ai/glm-5v-turbo", ""),
("z-ai/glm-5-turbo", ""),
("moonshotai/kimi-k2.5", ""),
("x-ai/grok-4.20", ""),
("nvidia/nemotron-3-super-120b-a12b", ""),
("nvidia/nemotron-3-super-120b-a12b:free", "free"),
@@ -75,7 +75,9 @@ def _codex_curated_models() -> list[str]:
_PROVIDER_MODELS: dict[str, list[str]] = {
"nous": [
"moonshotai/kimi-k2.5",
"xiaomi/mimo-v2-pro",
"anthropic/claude-opus-4.7",
"anthropic/claude-opus-4.6",
"anthropic/claude-sonnet-4.6",
"anthropic/claude-sonnet-4.5",
@@ -95,7 +97,6 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"z-ai/glm-5.1",
"z-ai/glm-5v-turbo",
"z-ai/glm-5-turbo",
"moonshotai/kimi-k2.5",
"x-ai/grok-4.20-beta",
"nvidia/nemotron-3-super-120b-a12b",
"nvidia/nemotron-3-super-120b-a12b:free",
@@ -132,9 +133,6 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"gemini-2.5-pro",
"gemini-2.5-flash",
"gemini-2.5-flash-lite",
# Gemma open models (also served via AI Studio)
"gemma-4-31b-it",
"gemma-4-26b-it",
],
"google-gemini-cli": [
"gemini-2.5-pro",
@@ -154,9 +152,23 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"grok-4.20-reasoning",
"grok-4-1-fast-reasoning",
],
"nvidia": [
# NVIDIA flagship reasoning models
"nvidia/nemotron-3-super-120b-a12b",
"nvidia/nemotron-3-nano-30b-a3b",
"nvidia/llama-3.3-nemotron-super-49b-v1.5",
# Third-party agentic models hosted on build.nvidia.com
# (map to OpenRouter defaults — users get familiar picks on NIM)
"qwen/qwen3.5-397b-a17b",
"deepseek-ai/deepseek-v3.2",
"moonshotai/kimi-k2.5",
"minimaxai/minimax-m2.5",
"z-ai/glm5",
"openai/gpt-oss-120b",
],
"kimi-coding": [
"kimi-for-coding",
"kimi-k2.5",
"kimi-for-coding",
"kimi-k2-thinking",
"kimi-k2-thinking-turbo",
"kimi-k2-turbo-preview",
@@ -211,6 +223,7 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"trinity-mini",
],
"opencode-zen": [
"kimi-k2.5",
"gpt-5.4-pro",
"gpt-5.4",
"gpt-5.3-codex",
@@ -242,16 +255,15 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"glm-5",
"glm-4.7",
"glm-4.6",
"kimi-k2.5",
"kimi-k2-thinking",
"kimi-k2",
"qwen3-coder",
"big-pickle",
],
"opencode-go": [
"kimi-k2.5",
"glm-5.1",
"glm-5",
"kimi-k2.5",
"mimo-v2-pro",
"mimo-v2-omni",
"minimax-m2.7",
@@ -284,21 +296,21 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
# to https://dashscope-intl.aliyuncs.com/compatible-mode/v1 (OpenAI-compat)
# or https://dashscope-intl.aliyuncs.com/apps/anthropic (Anthropic-compat).
"alibaba": [
"kimi-k2.5",
"qwen3.5-plus",
"qwen3-coder-plus",
"qwen3-coder-next",
# Third-party models available on coding-intl
"glm-5",
"glm-4.7",
"kimi-k2.5",
"MiniMax-M2.5",
],
# Curated HF model list — only agentic models that map to OpenRouter defaults.
"huggingface": [
"moonshotai/Kimi-K2.5",
"Qwen/Qwen3.5-397B-A17B",
"Qwen/Qwen3.5-35B-A3B",
"deepseek-ai/DeepSeek-V3.2",
"moonshotai/Kimi-K2.5",
"MiniMaxAI/MiniMax-M2.5",
"zai-org/GLM-5",
"XiaomiMiMo/MiMo-V2-Flash",
@@ -535,6 +547,7 @@ CANONICAL_PROVIDERS: list[ProviderEntry] = [
ProviderEntry("anthropic", "Anthropic", "Anthropic (Claude models — API key or Claude Code)"),
ProviderEntry("openai-codex", "OpenAI Codex", "OpenAI Codex"),
ProviderEntry("xiaomi", "Xiaomi MiMo", "Xiaomi MiMo (MiMo-V2 models — pro, omni, flash)"),
ProviderEntry("nvidia", "NVIDIA NIM", "NVIDIA NIM (Nemotron models — build.nvidia.com or local NIM)"),
ProviderEntry("qwen-oauth", "Qwen OAuth (Portal)", "Qwen OAuth (reuses local Qwen CLI login)"),
ProviderEntry("copilot", "GitHub Copilot", "GitHub Copilot (uses GITHUB_TOKEN or gh auth token)"),
ProviderEntry("copilot-acp", "GitHub Copilot ACP", "GitHub Copilot ACP (spawns `copilot --acp --stdio`)"),
@@ -617,6 +630,10 @@ _PROVIDER_ALIASES = {
"grok": "xai",
"x-ai": "xai",
"x.ai": "xai",
"nim": "nvidia",
"nvidia-nim": "nvidia",
"build-nvidia": "nvidia",
"nemotron": "nvidia",
"ollama": "custom", # bare "ollama" = local; use "ollama-cloud" for cloud
"ollama_cloud": "ollama-cloud",
}
@@ -1487,6 +1504,19 @@ _COPILOT_MODEL_ALIASES = {
"anthropic/claude-sonnet-4.6": "claude-sonnet-4.6",
"anthropic/claude-sonnet-4.5": "claude-sonnet-4.5",
"anthropic/claude-haiku-4.5": "claude-haiku-4.5",
# Dash-notation fallbacks: Hermes' default Claude IDs elsewhere use
# hyphens (anthropic native format), but Copilot's API only accepts
# dot-notation. Accept both so users who configure copilot + a
# default hyphenated Claude model don't hit HTTP 400
# "model_not_supported". See issue #6879.
"claude-opus-4-6": "claude-opus-4.6",
"claude-sonnet-4-6": "claude-sonnet-4.6",
"claude-sonnet-4-5": "claude-sonnet-4.5",
"claude-haiku-4-5": "claude-haiku-4.5",
"anthropic/claude-opus-4-6": "claude-opus-4.6",
"anthropic/claude-sonnet-4-6": "claude-sonnet-4.6",
"anthropic/claude-sonnet-4-5": "claude-sonnet-4.5",
"anthropic/claude-haiku-4-5": "claude-haiku-4.5",
}
@@ -2018,8 +2048,8 @@ def validate_requested_model(
)
return {
"accepted": True,
"persist": True,
"accepted": False,
"persist": False,
"recognized": False,
"message": message,
}
@@ -2032,8 +2062,8 @@ def validate_requested_model(
message += f"\n If this server expects `/v1`, try base URL: `{probe.get('suggested_base_url')}`"
return {
"accepted": True,
"persist": True,
"accepted": False,
"persist": False,
"recognized": False,
"message": message,
}
@@ -2067,12 +2097,11 @@ def validate_requested_model(
if suggestions:
suggestion_text = "\n Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
return {
"accepted": True,
"persist": True,
"accepted": False,
"persist": False,
"recognized": False,
"message": (
f"Note: `{requested}` was not found in the OpenAI Codex model listing. "
f"It may still work if your account has access to it."
f"Model `{requested}` was not found in the OpenAI Codex model listing."
f"{suggestion_text}"
),
}
@@ -2111,16 +2140,15 @@ def validate_requested_model(
if suggestions:
suggestion_text = "\n Similar models: " + ", ".join(f"`{s}`" for s in suggestions)
return {
"accepted": True,
"persist": True,
"recognized": False,
"message": (
f"Note: `{requested}` was not found in this provider's model listing. "
f"It may still work if your plan supports it."
f"{suggestion_text}"
),
}
return {
"accepted": False,
"persist": False,
"recognized": False,
"message": (
f"Model `{requested}` was not found in this provider's model listing."
f"{suggestion_text}"
),
}
# api_models is None — couldn't reach API. Accept and persist,
# but warn so typos don't silently break things.
@@ -2162,8 +2190,8 @@ def validate_requested_model(
provider_label = _PROVIDER_LABELS.get(normalized, normalized)
return {
"accepted": True,
"persist": True,
"accepted": False,
"persist": False,
"recognized": False,
"message": (
f"Could not reach the {provider_label} API to validate `{requested}`. "
+3 -12
View File
@@ -300,19 +300,10 @@ def _read_config_model(profile_dir: Path) -> tuple:
def _check_gateway_running(profile_dir: Path) -> bool:
"""Check if a gateway is running for a given profile directory."""
pid_file = profile_dir / "gateway.pid"
if not pid_file.exists():
return False
try:
raw = pid_file.read_text().strip()
if not raw:
return False
data = json.loads(raw) if raw.startswith("{") else {"pid": int(raw)}
pid = int(data["pid"])
os.kill(pid, 0) # existence check
return True
except (json.JSONDecodeError, KeyError, ValueError, TypeError,
ProcessLookupError, PermissionError, OSError):
from gateway.status import get_running_pid
return get_running_pid(profile_dir / "gateway.pid", cleanup_stale=False) is not None
except Exception:
return False
+11
View File
@@ -137,6 +137,11 @@ HERMES_OVERLAYS: Dict[str, HermesOverlay] = {
base_url_override="https://api.x.ai/v1",
base_url_env_var="XAI_BASE_URL",
),
"nvidia": HermesOverlay(
transport="openai_chat",
base_url_override="https://integrate.api.nvidia.com/v1",
base_url_env_var="NVIDIA_BASE_URL",
),
"xiaomi": HermesOverlay(
transport="openai_chat",
base_url_env_var="XIAOMI_BASE_URL",
@@ -191,6 +196,12 @@ ALIASES: Dict[str, str] = {
"x.ai": "xai",
"grok": "xai",
# nvidia
"nim": "nvidia",
"nvidia-nim": "nvidia",
"build-nvidia": "nvidia",
"nemotron": "nvidia",
# kimi-for-coding (models.dev ID)
"kimi": "kimi-for-coding",
"kimi-coding": "kimi-for-coding",
+15 -55
View File
@@ -91,7 +91,6 @@ _DEFAULT_PROVIDER_MODELS = {
"gemini": [
"gemini-3.1-pro-preview", "gemini-3-flash-preview", "gemini-3.1-flash-lite-preview",
"gemini-2.5-pro", "gemini-2.5-flash", "gemini-2.5-flash-lite",
"gemma-4-31b-it", "gemma-4-26b-it",
],
"zai": ["glm-5.1", "glm-5", "glm-4.7", "glm-4.5", "glm-4.5-flash"],
"kimi-coding": ["kimi-k2.5", "kimi-k2-thinking", "kimi-k2-turbo-preview"],
@@ -1461,7 +1460,9 @@ def setup_agent_settings(config: dict):
)
print_info("Maximum tool-calling iterations per conversation.")
print_info("Higher = more complex tasks, but costs more tokens.")
print_info("Default is 90, which works for most tasks. Use 150+ for open exploration.")
print_info(
f"Press Enter to keep {current_max}. Use 90 for most tasks or 150+ for open exploration."
)
max_iter_str = prompt("Max iterations", current_max)
try:
@@ -2005,52 +2006,6 @@ def _setup_wecom_callback():
_gw_setup()
def _setup_qqbot():
"""Configure QQ Bot gateway."""
print_header("QQ Bot")
existing = get_env_value("QQ_APP_ID")
if existing:
print_info("QQ Bot: already configured")
if not prompt_yes_no("Reconfigure QQ Bot?", False):
return
print_info("Connects Hermes to QQ via the Official QQ Bot API (v2).")
print_info(" Requires a QQ Bot application at q.qq.com")
print_info(" Reference: https://bot.q.qq.com/wiki/develop/api-v2/")
print()
app_id = prompt("QQ Bot App ID")
if not app_id:
print_warning("App ID is required — skipping QQ Bot setup")
return
save_env_value("QQ_APP_ID", app_id.strip())
client_secret = prompt("QQ Bot App Secret", password=True)
if not client_secret:
print_warning("App Secret is required — skipping QQ Bot setup")
return
save_env_value("QQ_CLIENT_SECRET", client_secret)
print_success("QQ Bot credentials saved")
print()
print_info("🔒 Security: Restrict who can DM your bot")
print_info(" Use QQ user OpenIDs (found in event payloads)")
print()
allowed_users = prompt("Allowed user OpenIDs (comma-separated, leave empty for open access)")
if allowed_users:
save_env_value("QQ_ALLOWED_USERS", allowed_users.replace(" ", ""))
print_success("QQ Bot allowlist configured")
else:
print_info("⚠️ No allowlist set — anyone can DM the bot!")
print()
print_info("📬 Home Channel: OpenID for cron job delivery and notifications.")
home_channel = prompt("Home channel OpenID (leave empty to set later)")
if home_channel:
save_env_value("QQ_HOME_CHANNEL", home_channel)
print()
print_success("QQ Bot configured!")
def _setup_bluebubbles():
@@ -2119,12 +2074,9 @@ def _setup_bluebubbles():
def _setup_qqbot():
"""Configure QQ Bot (Official API v2) via standard platform setup."""
from hermes_cli.gateway import _PLATFORMS
qq_platform = next((p for p in _PLATFORMS if p["key"] == "qqbot"), None)
if qq_platform:
from hermes_cli.gateway import _setup_standard_platform
_setup_standard_platform(qq_platform)
"""Configure QQ Bot (Official API v2) via gateway setup."""
from hermes_cli.gateway import _setup_qqbot as _gateway_setup_qqbot
_gateway_setup_qqbot()
def _setup_webhooks():
@@ -2264,7 +2216,9 @@ def setup_gateway(config: dict):
missing_home.append("Slack")
if get_env_value("BLUEBUBBLES_SERVER_URL") and not get_env_value("BLUEBUBBLES_HOME_CHANNEL"):
missing_home.append("BlueBubbles")
if get_env_value("QQ_APP_ID") and not get_env_value("QQ_HOME_CHANNEL"):
if get_env_value("QQ_APP_ID") and not (
get_env_value("QQBOT_HOME_CHANNEL") or get_env_value("QQ_HOME_CHANNEL")
):
missing_home.append("QQBot")
if missing_home:
@@ -2289,8 +2243,10 @@ def setup_gateway(config: dict):
_is_service_running,
supports_systemd_services,
has_conflicting_systemd_units,
has_legacy_hermes_units,
install_linux_gateway_from_setup,
print_systemd_scope_conflict_warning,
print_legacy_unit_warning,
systemd_start,
systemd_restart,
launchd_install,
@@ -2308,6 +2264,10 @@ def setup_gateway(config: dict):
print_systemd_scope_conflict_warning()
print()
if supports_systemd and has_legacy_hermes_units():
print_legacy_unit_warning()
print()
if service_running:
if prompt_yes_no(" Restart the gateway to pick up changes?", True):
try:
+147 -1
View File
@@ -515,6 +515,90 @@ def do_inspect(identifier: str, console: Optional[Console] = None) -> None:
c.print()
def browse_skills(page: int = 1, page_size: int = 20, source: str = "all") -> dict:
"""Paginated hub browse for programmatic callers (e.g. TUI gateway).
Returns ``{"items": [...], "page": int, "total_pages": int, "total": int}``.
"""
from tools.skills_hub import GitHubAuth, create_source_router
page_size = max(1, min(page_size, 100))
_TRUST_RANK = {"builtin": 3, "trusted": 2, "community": 1}
_PER_SOURCE_LIMIT = {"official": 100, "skills-sh": 100, "well-known": 25, "github": 100, "clawhub": 50,
"claude-marketplace": 50, "lobehub": 50}
auth = GitHubAuth()
sources = create_source_router(auth)
all_results: list = []
for src in sources:
sid = src.source_id()
if source != "all" and sid != source and sid != "official":
continue
try:
limit = _PER_SOURCE_LIMIT.get(sid, 50)
all_results.extend(src.search("", limit=limit))
except Exception:
continue
if not all_results:
return {"items": [], "page": 1, "total_pages": 1, "total": 0}
seen: dict = {}
for r in all_results:
rank = _TRUST_RANK.get(r.trust_level, 0)
if r.name not in seen or rank > _TRUST_RANK.get(seen[r.name].trust_level, 0):
seen[r.name] = r
deduped = list(seen.values())
deduped.sort(key=lambda r: (-_TRUST_RANK.get(r.trust_level, 0), r.source != "official", r.name.lower()))
total = len(deduped)
total_pages = max(1, (total + page_size - 1) // page_size)
page = max(1, min(page, total_pages))
start = (page - 1) * page_size
page_items = deduped[start : min(start + page_size, total)]
return {
"items": [{"name": r.name, "description": r.description, "source": r.source,
"trust": r.trust_level} for r in page_items],
"page": page,
"total_pages": total_pages,
"total": total,
}
def inspect_skill(identifier: str) -> Optional[dict]:
"""Skill metadata (+ SKILL.md preview) for programmatic callers."""
from tools.skills_hub import GitHubAuth, create_source_router
class _Q:
def print(self, *a, **k):
pass
c = _Q()
auth = GitHubAuth()
sources = create_source_router(auth)
ident = identifier
if "/" not in ident:
ident = _resolve_short_name(ident, sources, c)
if not ident:
return None
meta, bundle, _ = _resolve_source_meta_and_bundle(ident, sources)
if not meta:
return None
out: dict = {
"name": meta.name,
"description": meta.description,
"source": meta.source,
"identifier": meta.identifier,
"tags": list(meta.tags) if meta.tags else [],
}
if bundle and "SKILL.md" in bundle.files:
content = bundle.files["SKILL.md"]
if isinstance(content, bytes):
content = content.decode("utf-8", errors="replace")
lines = content.split("\n")
preview = "\n".join(lines[:50])
if len(lines) > 50:
preview += f"\n\n... ({len(lines) - 50} more lines)"
out["skill_md_preview"] = preview
return out
def do_list(source_filter: str = "all", console: Optional[Console] = None) -> None:
"""List installed skills, distinguishing hub, builtin, and local skills."""
from tools.skills_hub import HubLockFile, ensure_hub_dirs
@@ -684,6 +768,51 @@ def do_uninstall(name: str, console: Optional[Console] = None,
c.print(f"[bold red]Error:[/] {msg}\n")
def do_reset(name: str, restore: bool = False,
console: Optional[Console] = None,
skip_confirm: bool = False,
invalidate_cache: bool = True) -> None:
"""Reset a bundled skill's manifest tracking (+ optionally restore from bundled)."""
from tools.skills_sync import reset_bundled_skill
c = console or _console
if not skip_confirm and restore:
c.print(f"\n[bold]Restore '{name}' from bundled source?[/]")
c.print("[dim]This will DELETE your current copy and re-copy the bundled version.[/]")
try:
answer = input("Confirm [y/N]: ").strip().lower()
except (EOFError, KeyboardInterrupt):
answer = "n"
if answer not in ("y", "yes"):
c.print("[dim]Cancelled.[/]\n")
return
result = reset_bundled_skill(name, restore=restore)
if not result["ok"]:
c.print(f"[bold red]Error:[/] {result['message']}\n")
return
c.print(f"[bold green]{result['message']}[/]")
synced = result.get("synced") or {}
if synced.get("copied"):
c.print(f"[dim]Copied: {', '.join(synced['copied'])}[/]")
if synced.get("updated"):
c.print(f"[dim]Updated: {', '.join(synced['updated'])}[/]")
c.print()
if invalidate_cache:
try:
from agent.prompt_builder import clear_skills_system_prompt_cache
clear_skills_system_prompt_cache(clear_snapshot=True)
except Exception:
pass
else:
c.print("[dim]Change will take effect in your next session.[/]")
c.print("[dim]Use /reset to start a new session now, or --now to apply immediately (invalidates prompt cache).[/]\n")
def do_tap(action: str, repo: str = "", console: Optional[Console] = None) -> None:
"""Manage taps (custom GitHub repo sources)."""
from tools.skills_hub import TapsManager
@@ -1007,6 +1136,9 @@ def skills_command(args) -> None:
do_audit(name=getattr(args, "name", None))
elif action == "uninstall":
do_uninstall(args.name)
elif action == "reset":
do_reset(args.name, restore=getattr(args, "restore", False),
skip_confirm=getattr(args, "yes", False))
elif action == "publish":
do_publish(
args.skill_path,
@@ -1029,7 +1161,7 @@ def skills_command(args) -> None:
return
do_tap(tap_action, repo=repo)
else:
_console.print("Usage: hermes skills [browse|search|install|inspect|list|check|update|audit|uninstall|publish|snapshot|tap]\n")
_console.print("Usage: hermes skills [browse|search|install|inspect|list|check|update|audit|uninstall|reset|publish|snapshot|tap]\n")
_console.print("Run 'hermes skills <command> --help' for details.\n")
@@ -1175,6 +1307,19 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
do_uninstall(args[0], console=c, skip_confirm=skip_confirm,
invalidate_cache=invalidate_cache)
elif action == "reset":
if not args:
c.print("[bold red]Usage:[/] /skills reset <name> [--restore] [--now]\n")
c.print("[dim]Clears the bundled-skills manifest entry so future updates stop marking it as user-modified.[/]")
c.print("[dim]Pass --restore to also replace the current copy with the bundled version.[/]\n")
return
name = args[0]
restore = "--restore" in args
invalidate_cache = "--now" in args
# Slash commands can't prompt — --restore in slash mode is implicit consent.
do_reset(name, restore=restore, console=c, skip_confirm=True,
invalidate_cache=invalidate_cache)
elif action == "publish":
if not args:
c.print("[bold red]Usage:[/] /skills publish <skill-path> [--to github] [--repo owner/repo]\n")
@@ -1231,6 +1376,7 @@ def _print_skills_help(console: Console) -> None:
" [cyan]update[/] [name] Update hub skills with upstream changes\n"
" [cyan]audit[/] [name] Re-scan hub skills for security\n"
" [cyan]uninstall[/] <name> Remove a hub-installed skill\n"
" [cyan]reset[/] <name> [--restore] Reset bundled-skill tracking (fix 'user-modified' flag)\n"
" [cyan]publish[/] <path> --repo <r> Publish a skill to GitHub via PR\n"
" [cyan]snapshot[/] export|import Export/import skill configurations\n"
" [cyan]tap[/] list|add|remove Manage skill sources\n",
+2 -2
View File
@@ -23,7 +23,7 @@ All fields are optional. Missing values inherit from the ``default`` skin.
banner_dim: "#B8860B" # Dim/muted text (separators, labels)
banner_text: "#FFF8DC" # Body text (tool names, skill names)
ui_accent: "#FFBF00" # General UI accent
ui_label: "#4dd0e1" # UI labels
ui_label: "#DAA520" # UI labels (warm gold; teal clashed w/ default banner gold)
ui_ok: "#4caf50" # Success indicators
ui_error: "#ef5350" # Error indicators
ui_warn: "#ffa726" # Warning indicators
@@ -163,7 +163,7 @@ _BUILTIN_SKINS: Dict[str, Dict[str, Any]] = {
"banner_dim": "#B8860B",
"banner_text": "#FFF8DC",
"ui_accent": "#FFBF00",
"ui_label": "#4dd0e1",
"ui_label": "#DAA520",
"ui_ok": "#4caf50",
"ui_error": "#ef5350",
"ui_warn": "#ffa726",
+30 -64
View File
@@ -317,7 +317,7 @@ def show_status(args):
"WeCom Callback": ("WECOM_CALLBACK_CORP_ID", None),
"Weixin": ("WEIXIN_ACCOUNT_ID", "WEIXIN_HOME_CHANNEL"),
"BlueBubbles": ("BLUEBUBBLES_SERVER_URL", "BLUEBUBBLES_HOME_CHANNEL"),
"QQBot": ("QQ_APP_ID", "QQ_HOME_CHANNEL"),
"QQBot": ("QQ_APP_ID", "QQBOT_HOME_CHANNEL"),
}
for name, (token_var, home_var) in platforms.items():
@@ -327,6 +327,9 @@ def show_status(args):
home_channel = ""
if home_var:
home_channel = os.getenv(home_var, "")
# Back-compat: QQBot home channel was renamed from QQ_HOME_CHANNEL to QQBOT_HOME_CHANNEL
if not home_channel and home_var == "QQBOT_HOME_CHANNEL":
home_channel = os.getenv("QQ_HOME_CHANNEL", "")
status = "configured" if has_token else "not configured"
if home_channel:
@@ -339,73 +342,36 @@ def show_status(args):
# =========================================================================
print()
print(color("◆ Gateway Service", Colors.CYAN, Colors.BOLD))
if _is_termux():
try:
from hermes_cli.gateway import find_gateway_pids
gateway_pids = find_gateway_pids()
except Exception:
gateway_pids = []
is_running = bool(gateway_pids)
try:
from hermes_cli.gateway import get_gateway_runtime_snapshot, _format_gateway_pids
snapshot = get_gateway_runtime_snapshot()
is_running = snapshot.running
print(f" Status: {check_mark(is_running)} {'running' if is_running else 'stopped'}")
print(" Manager: Termux / manual process")
if gateway_pids:
rendered = ", ".join(str(pid) for pid in gateway_pids[:3])
if len(gateway_pids) > 3:
rendered += ", ..."
print(f" PID(s): {rendered}")
else:
print(f" Manager: {snapshot.manager}")
if snapshot.gateway_pids:
print(f" PID(s): {_format_gateway_pids(snapshot.gateway_pids)}")
if snapshot.has_process_service_mismatch:
print(" Service: installed but not managing the current running gateway")
elif _is_termux() and not snapshot.gateway_pids:
print(" Start with: hermes gateway")
print(" Note: Android may stop background jobs when Termux is suspended")
elif sys.platform.startswith('linux'):
from hermes_constants import is_container
if is_container():
# Docker/Podman: no systemd — check for running gateway processes
try:
from hermes_cli.gateway import find_gateway_pids
gateway_pids = find_gateway_pids()
is_active = len(gateway_pids) > 0
except Exception:
is_active = False
print(f" Status: {check_mark(is_active)} {'running' if is_active else 'stopped'}")
print(" Manager: docker (foreground)")
elif snapshot.service_installed and not snapshot.service_running:
print(" Service: installed but stopped")
except Exception:
if _is_termux():
print(f" Status: {color('unknown', Colors.DIM)}")
print(" Manager: Termux / manual process")
elif sys.platform.startswith('linux'):
print(f" Status: {color('unknown', Colors.DIM)}")
print(" Manager: systemd/manual")
elif sys.platform == 'darwin':
print(f" Status: {color('unknown', Colors.DIM)}")
print(" Manager: launchd")
else:
try:
from hermes_cli.gateway import get_service_name
_gw_svc = get_service_name()
except Exception:
_gw_svc = "hermes-gateway"
try:
result = subprocess.run(
["systemctl", "--user", "is-active", _gw_svc],
capture_output=True,
text=True,
timeout=5
)
is_active = result.stdout.strip() == "active"
except (FileNotFoundError, subprocess.TimeoutExpired):
is_active = False
print(f" Status: {check_mark(is_active)} {'running' if is_active else 'stopped'}")
print(" Manager: systemd (user)")
elif sys.platform == 'darwin':
from hermes_cli.gateway import get_launchd_label
try:
result = subprocess.run(
["launchctl", "list", get_launchd_label()],
capture_output=True,
text=True,
timeout=5
)
is_loaded = result.returncode == 0
except subprocess.TimeoutExpired:
is_loaded = False
print(f" Status: {check_mark(is_loaded)} {'loaded' if is_loaded else 'not loaded'}")
print(" Manager: launchd")
else:
print(f" Status: {color('N/A', Colors.DIM)}")
print(" Manager: (not supported on this platform)")
print(f" Status: {color('N/A', Colors.DIM)}")
print(" Manager: (not supported on this platform)")
# =========================================================================
# Cron Jobs
+121 -2
View File
@@ -258,14 +258,16 @@ TOOL_CATEGORIES = {
"requires_nous_auth": True,
"managed_nous_feature": "image_gen",
"override_env_vars": ["FAL_KEY"],
"imagegen_backend": "fal",
},
{
"name": "FAL.ai",
"badge": "paid",
"tag": "FLUX 2 Pro with auto-upscaling",
"tag": "Pick from flux-2-klein, flux-2-pro, gpt-image, nano-banana, etc.",
"env_vars": [
{"key": "FAL_KEY", "prompt": "FAL API key", "url": "https://fal.ai/dashboard/keys"},
],
"imagegen_backend": "fal",
},
],
},
@@ -510,7 +512,7 @@ def _get_platform_tools(
"""Resolve which individual toolset names are enabled for a platform."""
from toolsets import resolve_toolset
platform_toolsets = config.get("platform_toolsets", {})
platform_toolsets = config.get("platform_toolsets") or {}
toolset_names = platform_toolsets.get(platform)
if toolset_names is None or not isinstance(toolset_names, list):
@@ -950,6 +952,106 @@ def _detect_active_provider_index(providers: list, config: dict) -> int:
return 0
# ─── Image Generation Model Pickers ───────────────────────────────────────────
#
# IMAGEGEN_BACKENDS is a per-backend catalog. Each entry exposes:
# - config_key: top-level config.yaml key for this backend's settings
# - model_catalog_fn: returns an OrderedDict-like {model_id: metadata}
# - default_model: fallback when nothing is configured
#
# This prepares for future imagegen backends (Replicate, Stability, etc.):
# each new backend registers its own entry; the FAL provider entry in
# TOOL_CATEGORIES tags itself with `imagegen_backend: "fal"` to select the
# right catalog at picker time.
def _fal_model_catalog():
"""Lazy-load the FAL model catalog from the tool module."""
from tools.image_generation_tool import FAL_MODELS, DEFAULT_MODEL
return FAL_MODELS, DEFAULT_MODEL
IMAGEGEN_BACKENDS = {
"fal": {
"display": "FAL.ai",
"config_key": "image_gen",
"catalog_fn": _fal_model_catalog,
},
}
def _format_imagegen_model_row(model_id: str, meta: dict, widths: dict) -> str:
"""Format a single picker row with column-aligned speed / strengths / price."""
return (
f"{model_id:<{widths['model']}} "
f"{meta.get('speed', ''):<{widths['speed']}} "
f"{meta.get('strengths', ''):<{widths['strengths']}} "
f"{meta.get('price', '')}"
)
def _configure_imagegen_model(backend_name: str, config: dict) -> None:
"""Prompt the user to pick a model for the given imagegen backend.
Writes selection to ``config[backend_config_key]["model"]``. Safe to
call even when stdin is not a TTY curses_radiolist falls back to
keeping the current selection.
"""
backend = IMAGEGEN_BACKENDS.get(backend_name)
if not backend:
return
catalog, default_model = backend["catalog_fn"]()
if not catalog:
return
cfg_key = backend["config_key"]
cur_cfg = config.setdefault(cfg_key, {})
if not isinstance(cur_cfg, dict):
cur_cfg = {}
config[cfg_key] = cur_cfg
current_model = cur_cfg.get("model") or default_model
if current_model not in catalog:
current_model = default_model
model_ids = list(catalog.keys())
# Put current model at the top so the cursor lands on it by default.
ordered = [current_model] + [m for m in model_ids if m != current_model]
# Column widths
widths = {
"model": max(len(m) for m in model_ids),
"speed": max((len(catalog[m].get("speed", "")) for m in model_ids), default=6),
"strengths": max((len(catalog[m].get("strengths", "")) for m in model_ids), default=0),
}
print()
header = (
f" {'Model':<{widths['model']}} "
f"{'Speed':<{widths['speed']}} "
f"{'Strengths':<{widths['strengths']}} "
f"Price"
)
print(color(header, Colors.CYAN))
rows = []
for mid in ordered:
row = _format_imagegen_model_row(mid, catalog[mid], widths)
if mid == current_model:
row += " ← currently in use"
rows.append(row)
idx = _prompt_choice(
f" Choose {backend['display']} model:",
rows,
default=0,
)
chosen = ordered[idx]
cur_cfg["model"] = chosen
_print_success(f" Model set to: {chosen}")
def _configure_provider(provider: dict, config: dict):
"""Configure a single provider - prompt for API keys and set config."""
env_vars = provider.get("env_vars", [])
@@ -1006,6 +1108,10 @@ def _configure_provider(provider: dict, config: dict):
_print_success(f" {provider['name']} - no configuration needed!")
if managed_feature:
_print_info(" Requests for this tool will be billed to your Nous subscription.")
# Imagegen backends prompt for model selection after backend pick.
backend = provider.get("imagegen_backend")
if backend:
_configure_imagegen_model(backend, config)
return
# Prompt for each required env var
@@ -1040,6 +1146,10 @@ def _configure_provider(provider: dict, config: dict):
if all_configured:
_print_success(f" {provider['name']} configured!")
# Imagegen backends prompt for model selection after env vars are in.
backend = provider.get("imagegen_backend")
if backend:
_configure_imagegen_model(backend, config)
def _configure_simple_requirements(ts_key: str):
@@ -1211,6 +1321,10 @@ def _reconfigure_provider(provider: dict, config: dict):
_print_success(f" {provider['name']} - no configuration needed!")
if managed_feature:
_print_info(" Requests for this tool will be billed to your Nous subscription.")
# Imagegen backends prompt for model selection on reconfig too.
backend = provider.get("imagegen_backend")
if backend:
_configure_imagegen_model(backend, config)
return
for var in env_vars:
@@ -1228,6 +1342,11 @@ def _reconfigure_provider(provider: dict, config: dict):
else:
_print_info(" Kept current")
# Imagegen backends prompt for model selection on reconfig too.
backend = provider.get("imagegen_backend")
if backend:
_configure_imagegen_model(backend, config)
def _reconfigure_simple_requirements(ts_key: str):
"""Reconfigure simple env var requirements."""
+210 -55
View File
@@ -118,59 +118,166 @@ def remove_wrapper_script():
def uninstall_gateway_service():
"""Stop and uninstall the gateway service if running."""
"""Stop and uninstall the gateway service (systemd, launchd) and kill any
standalone gateway processes.
Delegates to the gateway module which handles:
- Linux: user + system systemd services (with proper DBUS env setup)
- macOS: launchd plists
- All platforms: standalone ``hermes gateway run`` processes
- Termux/Android: skips systemd (no systemd on Android), still kills standalone processes
"""
import platform
if platform.system() != "Linux":
return False
stopped_something = False
prefix = os.getenv("PREFIX", "")
if os.getenv("TERMUX_VERSION") or "com.termux/files/usr" in prefix:
return False
# 1. Kill any standalone gateway processes (all platforms, including Termux)
try:
from hermes_cli.gateway import get_service_name
svc_name = get_service_name()
except Exception:
svc_name = "hermes-gateway"
service_file = Path.home() / ".config" / "systemd" / "user" / f"{svc_name}.service"
if not service_file.exists():
return False
try:
# Stop the service
subprocess.run(
["systemctl", "--user", "stop", svc_name],
capture_output=True,
check=False
)
# Disable the service
subprocess.run(
["systemctl", "--user", "disable", svc_name],
capture_output=True,
check=False
)
# Remove service file
service_file.unlink()
# Reload systemd
subprocess.run(
["systemctl", "--user", "daemon-reload"],
capture_output=True,
check=False
)
return True
from hermes_cli.gateway import kill_gateway_processes, find_gateway_pids
pids = find_gateway_pids()
if pids:
killed = kill_gateway_processes()
if killed:
log_success(f"Killed {killed} running gateway process(es)")
stopped_something = True
except Exception as e:
log_warn(f"Could not fully remove gateway service: {e}")
log_warn(f"Could not check for gateway processes: {e}")
system = platform.system()
# Termux/Android has no systemd and no launchd — nothing left to do.
prefix = os.getenv("PREFIX", "")
is_termux = bool(os.getenv("TERMUX_VERSION") or "com.termux/files/usr" in prefix)
if is_termux:
return stopped_something
# 2. Linux: uninstall systemd services (both user and system scopes)
if system == "Linux":
try:
from hermes_cli.gateway import (
get_systemd_unit_path,
get_service_name,
_systemctl_cmd,
)
svc_name = get_service_name()
for is_system in (False, True):
unit_path = get_systemd_unit_path(system=is_system)
if not unit_path.exists():
continue
scope = "system" if is_system else "user"
try:
if is_system and os.geteuid() != 0:
log_warn(f"System gateway service exists at {unit_path} "
f"but needs sudo to remove")
continue
cmd = _systemctl_cmd(is_system)
subprocess.run(cmd + ["stop", svc_name],
capture_output=True, check=False)
subprocess.run(cmd + ["disable", svc_name],
capture_output=True, check=False)
unit_path.unlink()
subprocess.run(cmd + ["daemon-reload"],
capture_output=True, check=False)
log_success(f"Removed {scope} gateway service ({unit_path})")
stopped_something = True
except Exception as e:
log_warn(f"Could not remove {scope} gateway service: {e}")
except Exception as e:
log_warn(f"Could not check systemd gateway services: {e}")
# 3. macOS: uninstall launchd plist
elif system == "Darwin":
try:
from hermes_cli.gateway import get_launchd_plist_path
plist_path = get_launchd_plist_path()
if plist_path.exists():
subprocess.run(["launchctl", "unload", str(plist_path)],
capture_output=True, check=False)
plist_path.unlink()
log_success(f"Removed macOS gateway service ({plist_path})")
stopped_something = True
except Exception as e:
log_warn(f"Could not remove launchd gateway service: {e}")
return stopped_something
def _is_default_hermes_home(hermes_home: Path) -> bool:
"""Return True when ``hermes_home`` points at the default (non-profile) root."""
try:
from hermes_constants import get_default_hermes_root
return hermes_home.resolve() == get_default_hermes_root().resolve()
except Exception:
return False
def _discover_named_profiles():
"""Return a list of ``ProfileInfo`` for every non-default profile, or ``[]``
if profile support is unavailable or nothing is installed beyond the
default root."""
try:
from hermes_cli.profiles import list_profiles
except Exception:
return []
try:
return [p for p in list_profiles() if not getattr(p, "is_default", False)]
except Exception as e:
log_warn(f"Could not enumerate profiles: {e}")
return []
def _uninstall_profile(profile) -> None:
"""Fully uninstall a single named profile: stop its gateway service,
remove its alias wrapper, and wipe its HERMES_HOME directory.
We shell out to ``hermes -p <name> gateway stop|uninstall`` because
service names, unit paths, and plist paths are all derived from the
current HERMES_HOME and can't be easily switched in-process.
"""
import sys as _sys
name = profile.name
profile_home = profile.path
log_info(f"Uninstalling profile '{name}'...")
# 1. Stop and remove this profile's gateway service.
# Use `python -m hermes_cli.main` so we don't depend on a `hermes`
# wrapper that may be half-removed mid-uninstall.
hermes_invocation = [_sys.executable, "-m", "hermes_cli.main", "--profile", name]
for subcmd in ("stop", "uninstall"):
try:
subprocess.run(
hermes_invocation + ["gateway", subcmd],
capture_output=True,
text=True,
timeout=60,
check=False,
)
except subprocess.TimeoutExpired:
log_warn(f" Gateway {subcmd} timed out for '{name}'")
except Exception as e:
log_warn(f" Could not run gateway {subcmd} for '{name}': {e}")
# 2. Remove the wrapper alias script at ~/.local/bin/<name> (if any).
alias_path = getattr(profile, "alias_path", None)
if alias_path and alias_path.exists():
try:
alias_path.unlink()
log_success(f" Removed alias {alias_path}")
except Exception as e:
log_warn(f" Could not remove alias {alias_path}: {e}")
# 3. Wipe the profile's HERMES_HOME directory.
try:
if profile_home.exists():
shutil.rmtree(profile_home)
log_success(f" Removed {profile_home}")
except Exception as e:
log_warn(f" Could not remove {profile_home}: {e}")
def run_uninstall(args):
"""
Run the uninstall process.
@@ -181,7 +288,13 @@ def run_uninstall(args):
"""
project_root = get_project_root()
hermes_home = get_hermes_home()
# Detect named profiles when uninstalling from the default root —
# offer to clean them up too instead of leaving zombie HERMES_HOMEs
# and systemd units behind.
is_default_profile = _is_default_hermes_home(hermes_home)
named_profiles = _discover_named_profiles() if is_default_profile else []
print()
print(color("┌─────────────────────────────────────────────────────────┐", Colors.MAGENTA, Colors.BOLD))
print(color("│ ⚕ Hermes Agent Uninstaller │", Colors.MAGENTA, Colors.BOLD))
@@ -195,6 +308,13 @@ def run_uninstall(args):
print(f" Secrets: {hermes_home / '.env'}")
print(f" Data: {hermes_home / 'cron/'}, {hermes_home / 'sessions/'}, {hermes_home / 'logs/'}")
print()
if named_profiles:
print(color("Other profiles detected:", Colors.CYAN, Colors.BOLD))
for p in named_profiles:
running = " (gateway running)" if getattr(p, "gateway_running", False) else ""
print(f"{p.name}{running}: {p.path}")
print()
# Ask for confirmation
print(color("Uninstall Options:", Colors.YELLOW, Colors.BOLD))
@@ -221,12 +341,40 @@ def run_uninstall(args):
return
full_uninstall = (choice == "2")
# When doing a full uninstall from the default profile, also offer to
# remove any named profiles — stopping their gateway services, unlinking
# their alias wrappers, and wiping their HERMES_HOME dirs. Otherwise
# those leave zombie services and data behind.
remove_profiles = False
if full_uninstall and named_profiles:
print()
print(color("Other profiles will NOT be removed by default.", Colors.YELLOW))
print(f"Found {len(named_profiles)} named profile(s): " +
", ".join(p.name for p in named_profiles))
print()
try:
resp = input(color(
f"Also stop and remove these {len(named_profiles)} profile(s)? [y/N]: ",
Colors.BOLD
)).strip().lower()
except (KeyboardInterrupt, EOFError):
print()
print("Cancelled.")
return
remove_profiles = resp in ("y", "yes")
# Final confirmation
print()
if full_uninstall:
print(color("⚠️ WARNING: This will permanently delete ALL Hermes data!", Colors.RED, Colors.BOLD))
print(color(" Including: configs, API keys, sessions, scheduled jobs, logs", Colors.RED))
if remove_profiles:
print(color(
f" Plus {len(named_profiles)} profile(s): " +
", ".join(p.name for p in named_profiles),
Colors.RED
))
else:
print("This will remove the Hermes code but keep your configuration and data.")
@@ -247,12 +395,10 @@ def run_uninstall(args):
print(color("Uninstalling...", Colors.CYAN, Colors.BOLD))
print()
# 1. Stop and uninstall gateway service
log_info("Checking for gateway service...")
if uninstall_gateway_service():
log_success("Gateway service stopped and removed")
else:
log_info("No gateway service found")
# 1. Stop and uninstall gateway service + kill standalone processes
log_info("Checking for running gateway...")
if not uninstall_gateway_service():
log_info("No gateway service or processes found")
# 2. Remove PATH entries from shell configs
log_info("Removing PATH entries from shell configs...")
@@ -291,8 +437,17 @@ def run_uninstall(args):
log_warn(f"Could not fully remove {project_root}: {e}")
log_info("You may need to manually remove it")
# 5. Optionally remove ~/.hermes/ data directory
# 5. Optionally remove ~/.hermes/ data directory (and named profiles)
if full_uninstall:
# 5a. Stop and remove each named profile's gateway service and
# alias wrapper. The profile HERMES_HOME dirs live under
# ``<default>/profiles/<name>/`` and will be swept away by the
# rmtree below, but services + alias scripts live OUTSIDE the
# default root and have to be cleaned up explicitly.
if remove_profiles and named_profiles:
for prof in named_profiles:
_uninstall_profile(prof)
log_info("Removing configuration and data...")
try:
if hermes_home.exists():
+4 -34
View File
@@ -56,10 +56,10 @@ try:
except ImportError:
raise SystemExit(
"Web UI requires fastapi and uvicorn.\n"
"Run 'hermes web' to auto-install, or: pip install hermes-agent[web]"
f"Install with: {sys.executable} -m pip install 'fastapi' 'uvicorn[standard]'"
)
WEB_DIST = Path(__file__).parent / "web_dist"
WEB_DIST = Path(os.environ["HERMES_WEB_DIST"]) if "HERMES_WEB_DIST" in os.environ else Path(__file__).parent / "web_dist"
_log = logging.getLogger(__name__)
app = FastAPI(title="Hermes Agent", version=__version__)
@@ -1444,38 +1444,8 @@ def _nous_poller(session_id: str) -> None:
auth_state, min_key_ttl_seconds=300, timeout_seconds=15.0,
force_refresh=False, force_mint=True,
)
# Save into credential pool same as auth_commands.py does
from agent.credential_pool import (
PooledCredential,
load_pool,
AUTH_TYPE_OAUTH,
SOURCE_MANUAL,
)
pool = load_pool("nous")
entry = PooledCredential.from_dict("nous", {
**full_state,
"label": "dashboard device_code",
"auth_type": AUTH_TYPE_OAUTH,
"source": f"{SOURCE_MANUAL}:dashboard_device_code",
"base_url": full_state.get("inference_base_url"),
})
pool.add_entry(entry)
# Also persist to auth store so get_nous_auth_status() sees it
# (matches what _login_nous in auth.py does for the CLI flow).
try:
from hermes_cli.auth import (
_load_auth_store, _save_provider_state, _save_auth_store,
_auth_store_lock,
)
with _auth_store_lock():
auth_store = _load_auth_store()
_save_provider_state(auth_store, "nous", full_state)
_save_auth_store(auth_store)
except Exception as store_exc:
_log.warning(
"oauth/device: credential pool saved but auth store write failed "
"(session=%s): %s", session_id, store_exc,
)
from hermes_cli.auth import persist_nous_credentials
persist_nous_credentials(full_state)
with _oauth_sessions_lock:
sess["status"] = "approved"
_log.info("oauth/device: nous login completed (session=%s)", session_id)
+2 -1
View File
@@ -14,7 +14,8 @@ def get_hermes_home() -> Path:
Reads HERMES_HOME env var, falls back to ~/.hermes.
This is the single source of truth all other copies should import this.
"""
return Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
val = os.environ.get("HERMES_HOME", "").strip()
return Path(val) if val else Path.home() / ".hermes"
def get_default_hermes_root() -> Path:
+57 -2
View File
@@ -987,6 +987,22 @@ class SessionDB:
return sanitized.strip()
@staticmethod
def _contains_cjk(text: str) -> bool:
"""Check if text contains CJK (Chinese, Japanese, Korean) characters."""
for ch in text:
cp = ord(ch)
if (0x4E00 <= cp <= 0x9FFF or # CJK Unified Ideographs
0x3400 <= cp <= 0x4DBF or # CJK Extension A
0x20000 <= cp <= 0x2A6DF or # CJK Extension B
0x3000 <= cp <= 0x303F or # CJK Symbols
0x3040 <= cp <= 0x309F or # Hiragana
0x30A0 <= cp <= 0x30FF or # Katakana
0xAC00 <= cp <= 0xD7AF): # Hangul Syllables
return True
return False
def search_messages(
self,
query: str,
@@ -1062,8 +1078,47 @@ class SessionDB:
cursor = self._conn.execute(sql, params)
except sqlite3.OperationalError:
# FTS5 query syntax error despite sanitization — return empty
return []
matches = [dict(row) for row in cursor.fetchall()]
# unless query contains CJK (fall back to LIKE below)
if not self._contains_cjk(query):
return []
matches = []
else:
matches = [dict(row) for row in cursor.fetchall()]
# LIKE fallback for CJK queries: FTS5 default tokenizer splits CJK
# characters individually, causing multi-character queries to fail.
if not matches and self._contains_cjk(query):
raw_query = query.strip('"').strip()
like_where = ["m.content LIKE ?"]
like_params: list = [f"%{raw_query}%"]
if source_filter is not None:
like_where.append(f"s.source IN ({','.join('?' for _ in source_filter)})")
like_params.extend(source_filter)
if exclude_sources is not None:
like_where.append(f"s.source NOT IN ({','.join('?' for _ in exclude_sources)})")
like_params.extend(exclude_sources)
if role_filter:
like_where.append(f"m.role IN ({','.join('?' for _ in role_filter)})")
like_params.extend(role_filter)
like_sql = f"""
SELECT m.id, m.session_id, m.role,
substr(m.content,
max(1, instr(m.content, ?) - 40),
120) AS snippet,
m.content, m.timestamp, m.tool_name,
s.source, s.model, s.started_at AS session_started
FROM messages m
JOIN sessions s ON s.id = m.session_id
WHERE {' AND '.join(like_where)}
ORDER BY m.timestamp DESC
LIMIT ? OFFSET ?
"""
like_params.extend([limit, offset])
# instr() parameter goes first in the bound list
like_params = [raw_query] + like_params
with self._lock:
like_cursor = self._conn.execute(like_sql, like_params)
matches = [dict(row) for row in like_cursor.fetchall()]
# Add surrounding context (1 message before + after each match).
# Done outside the lock so we don't hold it across N sequential queries.
+2 -2
View File
@@ -433,7 +433,7 @@ def create_mcp_server(event_bridge: Optional[EventBridge] = None) -> "FastMCP":
if not _MCP_SERVER_AVAILABLE:
raise ImportError(
"MCP server requires the 'mcp' package. "
"Install with: pip install 'hermes-agent[mcp]'"
f"Install with: {sys.executable} -m pip install 'mcp'"
)
mcp = FastMCP(
@@ -838,7 +838,7 @@ def run_mcp_server(verbose: bool = False) -> None:
if not _MCP_SERVER_AVAILABLE:
print(
"Error: MCP server requires the 'mcp' package.\n"
"Install with: pip install 'hermes-agent[mcp]'",
f"Install with: {sys.executable} -m pip install 'mcp'",
file=sys.stderr,
)
sys.exit(1)
+20 -6
View File
@@ -43,6 +43,15 @@ from dotenv import load_dotenv
load_dotenv()
def _effective_temperature_for_model(model: str) -> Optional[float]:
"""Return a fixed temperature for models with strict sampling contracts."""
try:
from agent.auxiliary_client import _fixed_temperature_for_model
except Exception:
return None
return _fixed_temperature_for_model(model)
# ============================================================================
@@ -442,12 +451,17 @@ Complete the user's task step by step."""
# Make API call
try:
response = self.client.chat.completions.create(
model=self.model,
messages=api_messages,
tools=self.tools,
timeout=300.0
)
api_kwargs = {
"model": self.model,
"messages": api_messages,
"tools": self.tools,
"timeout": 300.0,
}
fixed_temperature = _effective_temperature_for_model(self.model)
if fixed_temperature is not None:
api_kwargs["temperature"] = fixed_temperature
response = self.client.chat.completions.create(**api_kwargs)
except Exception as e:
self.logger.error(f"API call failed: {e}")
break
+2 -2
View File
@@ -274,9 +274,9 @@ def get_tool_definitions(
# execute_code" even when the API key isn't configured or the toolset is
# disabled (#560-discord).
if "execute_code" in available_tool_names:
from tools.code_execution_tool import SANDBOX_ALLOWED_TOOLS, build_execute_code_schema
from tools.code_execution_tool import SANDBOX_ALLOWED_TOOLS, build_execute_code_schema, _get_execution_mode
sandbox_enabled = SANDBOX_ALLOWED_TOOLS & available_tool_names
dynamic_schema = build_execute_code_schema(sandbox_enabled)
dynamic_schema = build_execute_code_schema(sandbox_enabled, mode=_get_execution_mode())
for i, td in enumerate(filtered_tools):
if td.get("function", {}).get("name") == "execute_code":
filtered_tools[i] = {"type": "function", "function": dynamic_schema}
+69 -1
View File
@@ -37,7 +37,30 @@ json.dump(sorted(leaf_paths(DEFAULT_CONFIG)), sys.stdout, indent=2)
in {
packages.configKeys = configKeys;
checks = lib.optionalAttrs pkgs.stdenv.hostPlatform.isLinux {
checks = {
# Cross-platform evaluation — catches "not supported for interpreter"
# errors (e.g. sphinx dropping python311) without needing a darwin builder.
# Evaluation is pure and instant; it doesn't build anything.
cross-eval = let
targetSystems = builtins.filter
(s: inputs.self.packages ? ${s})
[ "x86_64-linux" "aarch64-linux" "aarch64-darwin" "x86_64-darwin" ];
tryEvalPkg = sys:
let pkg = inputs.self.packages.${sys}.default;
in builtins.tryEval (builtins.seq pkg.drvPath true);
results = map (sys: { inherit sys; result = tryEvalPkg sys; }) targetSystems;
failures = builtins.filter (r: !r.result.success) results;
failMsg = lib.concatMapStringsSep "\n" (r: " - ${r.sys}") failures;
in pkgs.runCommand "hermes-cross-eval" { } (
if failures != [] then
builtins.throw "Package fails to evaluate on:\n${failMsg}"
else ''
echo "PASS: package evaluates on all ${toString (builtins.length targetSystems)} platforms"
mkdir -p $out
echo "ok" > $out/result
''
);
} // lib.optionalAttrs pkgs.stdenv.hostPlatform.isLinux {
# Verify binaries exist and are executable
package-contents = pkgs.runCommand "hermes-package-contents" { } ''
set -e
@@ -103,6 +126,51 @@ json.dump(sorted(leaf_paths(DEFAULT_CONFIG)), sys.stdout, indent=2)
echo "ok" > $out/result
'';
# Verify bundled TUI is present and compiled
bundled-tui = pkgs.runCommand "hermes-bundled-tui" { } ''
set -e
echo "=== Checking bundled TUI ==="
test -d ${hermes-agent}/ui-tui || (echo "FAIL: ui-tui directory missing"; exit 1)
echo "PASS: ui-tui directory exists"
test -f ${hermes-agent}/ui-tui/dist/entry.js || (echo "FAIL: compiled entry.js missing"; exit 1)
echo "PASS: compiled entry.js present"
test -d ${hermes-agent}/ui-tui/node_modules || (echo "FAIL: node_modules missing"; exit 1)
echo "PASS: node_modules present"
grep -q "HERMES_TUI_DIR" ${hermes-agent}/bin/hermes || \
(echo "FAIL: HERMES_TUI_DIR not in wrapper"; exit 1)
echo "PASS: HERMES_TUI_DIR set in wrapper"
echo "=== All bundled TUI checks passed ==="
mkdir -p $out
echo "ok" > $out/result
'';
# Verify HERMES_NODE is set in wrapper and points to Node 20+
# (string-width uses the /v regex flag which requires Node 20+)
hermes-node = pkgs.runCommand "hermes-node-version" { } ''
set -e
echo "=== Checking HERMES_NODE in wrapper ==="
grep -q "HERMES_NODE" ${hermes-agent}/bin/hermes || \
(echo "FAIL: HERMES_NODE not set in wrapper"; exit 1)
echo "PASS: HERMES_NODE present in wrapper"
HERMES_NODE=$(sed -n "s/^export HERMES_NODE='\(.*\)'/\1/p" ${hermes-agent}/bin/hermes)
test -x "$HERMES_NODE" || (echo "FAIL: HERMES_NODE=$HERMES_NODE not executable"; exit 1)
echo "PASS: HERMES_NODE executable at $HERMES_NODE"
NODE_MAJOR=$("$HERMES_NODE" --version | sed 's/^v//' | cut -d. -f1)
test "$NODE_MAJOR" -ge 20 || \
(echo "FAIL: Node v$NODE_MAJOR < 20, TUI needs /v regex flag support"; exit 1)
echo "PASS: Node v$NODE_MAJOR >= 20"
echo "=== All HERMES_NODE checks passed ==="
mkdir -p $out
echo "ok" > $out/result
'';
# Verify HERMES_MANAGED guard works on all mutation commands
managed-guard = pkgs.runCommand "hermes-managed-guard" { } ''
set -e
+15 -38
View File
@@ -1,49 +1,26 @@
# nix/devShell.nix — Fast dev shell with stamp-file optimization
# nix/devShell.nix — Dev shell that delegates setup to each package
#
# Each package in inputsFrom exposes passthru.devShellHook — a bash snippet
# with stamp-checked setup logic. This file collects and runs them all.
{ inputs, ... }: {
perSystem = { pkgs, ... }:
perSystem = { pkgs, system, ... }:
let
python = pkgs.python311;
hermes-agent = inputs.self.packages.${system}.default;
hermes-tui = inputs.self.packages.${system}.tui;
packages = [ hermes-agent hermes-tui ];
in {
devShells.default = pkgs.mkShell {
inputsFrom = packages;
packages = with pkgs; [
python uv nodejs_20 ripgrep git openssh ffmpeg
python312 uv nodejs_22 ripgrep git openssh ffmpeg
];
shellHook = ''
shellHook = let
hooks = map (p: p.passthru.devShellHook or "") packages;
combined = pkgs.lib.concatStringsSep "\n" (builtins.filter (h: h != "") hooks);
in ''
echo "Hermes Agent dev shell"
# Composite stamp: changes when nix python or uv change
STAMP_VALUE="${python}:${pkgs.uv}"
STAMP_FILE=".venv/.nix-stamp"
# Create venv if missing
if [ ! -d .venv ]; then
echo "Creating Python 3.11 venv..."
uv venv .venv --python ${python}/bin/python3
fi
source .venv/bin/activate
# Only install if stamp is stale or missing
if [ ! -f "$STAMP_FILE" ] || [ "$(cat "$STAMP_FILE")" != "$STAMP_VALUE" ]; then
echo "Installing Python dependencies..."
uv pip install -e ".[all]"
if [ -d mini-swe-agent ]; then
uv pip install -e ./mini-swe-agent 2>/dev/null || true
fi
if [ -d tinker-atropos ]; then
uv pip install -e ./tinker-atropos 2>/dev/null || true
fi
# Install npm deps
if [ -f package.json ] && [ ! -d node_modules ]; then
echo "Installing npm dependencies..."
npm install
fi
echo "$STAMP_VALUE" > "$STAMP_FILE"
fi
${combined}
echo "Ready. Run 'hermes' to start."
'';
};
+14 -7
View File
@@ -121,11 +121,19 @@
# ── Provision apt packages (first boot only, cached in writable layer) ──
# sudo: agent self-modification
# nodejs/npm: writable node so npm i -g works (nix store copies are read-only)
# curl: needed for uv installer
# Node 22 via NodeSource — Ubuntu 24.04 ships Node 18 which is EOL.
# curl: needed for uv installer + NodeSource setup
if [ ! -f /var/lib/hermes-tools-provisioned ] && command -v apt-get >/dev/null 2>&1; then
echo "First boot: provisioning agent tools..."
apt-get update -qq
apt-get install -y -qq sudo nodejs npm curl
apt-get install -y -qq sudo curl ca-certificates gnupg
mkdir -p /etc/apt/keyrings
curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key \
| gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg
echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_22.x nodistro main" \
> /etc/apt/sources.list.d/nodesource.list
apt-get update -qq
apt-get install -y -qq nodejs
touch /var/lib/hermes-tools-provisioned
fi
@@ -140,15 +148,14 @@
su -s /bin/sh "$TARGET_USER" -c 'curl -LsSf https://astral.sh/uv/install.sh | sh' || true
fi
# Python 3.11 venv — gives the agent a writable Python with pip.
# Uses uv to install Python 3.11 (Ubuntu 24.04 ships 3.12).
# Python 3.12 venv — gives the agent a writable Python with pip.
# --seed includes pip/setuptools so bare `pip install` works.
_UV_BIN="$TARGET_HOME/.local/bin/uv"
if [ ! -d "$TARGET_HOME/.venv" ] && [ -x "$_UV_BIN" ]; then
su -s /bin/sh "$TARGET_USER" -c "
export PATH=\"\$HOME/.local/bin:\$PATH\"
uv python install 3.11
uv venv --python 3.11 --seed \"\$HOME/.venv\"
uv python install 3.12
uv venv --python 3.12 --seed \"\$HOME/.venv\"
" || true
fi
@@ -171,7 +178,7 @@
# Package and entrypoint use stable symlinks (current-package, current-entrypoint)
# so they can update without recreation. Env vars go through $HERMES_HOME/.env.
containerIdentity = builtins.hashString "sha256" (builtins.toJSON {
schema = 3; # bump when identity inputs change
schema = 4; # bump when identity inputs change (4: Node 18→22 via NodeSource)
image = cfg.container.image;
extraVolumes = cfg.container.extraVolumes;
extraOptions = cfg.container.extraOptions;
+91 -29
View File
@@ -1,54 +1,116 @@
# nix/packages.nix — Hermes Agent package built with uv2nix
{ inputs, ... }: {
perSystem = { pkgs, system, ... }:
{ inputs, ... }:
{
perSystem =
{ pkgs, inputs', ... }:
let
hermesVenv = pkgs.callPackage ./python.nix {
inherit (inputs) uv2nix pyproject-nix pyproject-build-systems;
};
hermesTui = pkgs.callPackage ./tui.nix {
npm-lockfile-fix = inputs'.npm-lockfile-fix.packages.default;
};
# Import bundled skills, excluding runtime caches
bundledSkills = pkgs.lib.cleanSourceWith {
src = ../skills;
filter = path: _type:
!(pkgs.lib.hasInfix "/index-cache/" path);
filter = path: _type: !(pkgs.lib.hasInfix "/index-cache/" path);
};
hermesWeb = pkgs.callPackage ./web.nix {
npm-lockfile-fix = inputs'.npm-lockfile-fix.packages.default;
};
runtimeDeps = with pkgs; [
nodejs_20 ripgrep git openssh ffmpeg tirith
nodejs_22
ripgrep
git
openssh
ffmpeg
tirith
];
runtimePath = pkgs.lib.makeBinPath runtimeDeps;
in {
packages.default = pkgs.stdenv.mkDerivation {
pname = "hermes-agent";
version = (builtins.fromTOML (builtins.readFile ../pyproject.toml)).project.version;
dontUnpack = true;
dontBuild = true;
nativeBuildInputs = [ pkgs.makeWrapper ];
# Lockfile hashes for dev shell stamps
pyprojectHash = builtins.hashString "sha256" (builtins.readFile ../pyproject.toml);
uvLockHash =
if builtins.pathExists ../uv.lock then
builtins.hashString "sha256" (builtins.readFile ../uv.lock)
else
"none";
in
{
packages = {
default = pkgs.stdenv.mkDerivation {
pname = "hermes-agent";
version = (fromTOML (builtins.readFile ../pyproject.toml)).project.version;
installPhase = ''
runHook preInstall
dontUnpack = true;
dontBuild = true;
nativeBuildInputs = [ pkgs.makeWrapper ];
mkdir -p $out/share/hermes-agent $out/bin
cp -r ${bundledSkills} $out/share/hermes-agent/skills
installPhase = ''
runHook preInstall
${pkgs.lib.concatMapStringsSep "\n" (name: ''
makeWrapper ${hermesVenv}/bin/${name} $out/bin/${name} \
--suffix PATH : "${runtimePath}" \
--set HERMES_BUNDLED_SKILLS $out/share/hermes-agent/skills
'') [ "hermes" "hermes-agent" "hermes-acp" ]}
mkdir -p $out/share/hermes-agent $out/bin
cp -r ${bundledSkills} $out/share/hermes-agent/skills
cp -r ${hermesWeb} $out/share/hermes-agent/web_dist
runHook postInstall
'';
# copy pre-built TUI (same layout as dev: ui-tui/dist/ + node_modules/)
mkdir -p $out/ui-tui
cp -r ${hermesTui}/lib/hermes-tui/* $out/ui-tui/
meta = with pkgs.lib; {
description = "AI agent with advanced tool-calling capabilities";
homepage = "https://github.com/NousResearch/hermes-agent";
mainProgram = "hermes";
license = licenses.mit;
platforms = platforms.unix;
${pkgs.lib.concatMapStringsSep "\n"
(name: ''
makeWrapper ${hermesVenv}/bin/${name} $out/bin/${name} \
--suffix PATH : "${runtimePath}" \
--set HERMES_BUNDLED_SKILLS $out/share/hermes-agent/skills \
--set HERMES_WEB_DIST $out/share/hermes-agent/web_dist \
--set HERMES_TUI_DIR $out/ui-tui \
--set HERMES_PYTHON ${hermesVenv}/bin/python3 \
--set HERMES_NODE ${pkgs.nodejs_22}/bin/node
'')
[
"hermes"
"hermes-agent"
"hermes-acp"
]
}
runHook postInstall
'';
passthru.devShellHook = ''
STAMP=".nix-stamps/hermes-agent"
STAMP_VALUE="${pyprojectHash}:${uvLockHash}"
if [ ! -f "$STAMP" ] || [ "$(cat "$STAMP")" != "$STAMP_VALUE" ]; then
echo "hermes-agent: installing Python dependencies..."
uv venv .venv --python ${pkgs.python312}/bin/python3 2>/dev/null || true
source .venv/bin/activate
uv pip install -e ".[all]"
[ -d mini-swe-agent ] && uv pip install -e ./mini-swe-agent 2>/dev/null || true
[ -d tinker-atropos ] && uv pip install -e ./tinker-atropos 2>/dev/null || true
mkdir -p .nix-stamps
echo "$STAMP_VALUE" > "$STAMP"
else
source .venv/bin/activate
export HERMES_PYTHON=${hermesVenv}/bin/python3
fi
'';
meta = with pkgs.lib; {
description = "AI agent with advanced tool-calling capabilities";
homepage = "https://github.com/NousResearch/hermes-agent";
mainProgram = "hermes";
license = licenses.mit;
platforms = platforms.unix;
};
};
tui = hermesTui;
web = hermesWeb;
};
};
}
+26 -9
View File
@@ -1,6 +1,6 @@
# nix/python.nix — uv2nix virtual environment builder
{
python311,
python312,
lib,
callPackage,
uv2nix,
@@ -35,30 +35,46 @@ let
};
};
# Legacy alibabacloud packages ship only sdists with setup.py/setup.cfg
# and no pyproject.toml, so setuptools isn't declared as a build dep.
buildSystemOverrides = final: prev: builtins.mapAttrs
(name: _: prev.${name}.overrideAttrs (old: {
nativeBuildInputs = (old.nativeBuildInputs or [ ]) ++ [ final.setuptools ];
}))
(lib.genAttrs [
"alibabacloud-credentials-api"
"alibabacloud-endpoint-util"
"alibabacloud-gateway-dingtalk"
"alibabacloud-gateway-spi"
"alibabacloud-tea"
] (_: null));
pythonPackageOverrides = final: _prev:
if isAarch64Darwin then {
numpy = mkPrebuiltOverride final python311.pkgs.numpy { };
numpy = mkPrebuiltOverride final python312.pkgs.numpy { };
av = mkPrebuiltOverride final python311.pkgs.av { };
pyarrow = mkPrebuiltOverride final python312.pkgs.pyarrow { };
humanfriendly = mkPrebuiltOverride final python311.pkgs.humanfriendly { };
av = mkPrebuiltOverride final python312.pkgs.av { };
coloredlogs = mkPrebuiltOverride final python311.pkgs.coloredlogs {
humanfriendly = mkPrebuiltOverride final python312.pkgs.humanfriendly { };
coloredlogs = mkPrebuiltOverride final python312.pkgs.coloredlogs {
humanfriendly = [ ];
};
onnxruntime = mkPrebuiltOverride final python311.pkgs.onnxruntime {
onnxruntime = mkPrebuiltOverride final python312.pkgs.onnxruntime {
coloredlogs = [ ];
numpy = [ ];
packaging = [ ];
};
ctranslate2 = mkPrebuiltOverride final python311.pkgs.ctranslate2 {
ctranslate2 = mkPrebuiltOverride final python312.pkgs.ctranslate2 {
numpy = [ ];
pyyaml = [ ];
};
faster-whisper = mkPrebuiltOverride final python311.pkgs.faster-whisper {
faster-whisper = mkPrebuiltOverride final python312.pkgs.faster-whisper {
av = [ ];
ctranslate2 = [ ];
huggingface-hub = [ ];
@@ -70,11 +86,12 @@ let
pythonSet =
(callPackage pyproject-nix.build.packages {
python = python311;
python = python312;
}).overrideScope
(lib.composeManyExtensions [
pyproject-build-systems.overlays.default
overlay
buildSystemOverrides
pythonPackageOverrides
]);
in
+77
View File
@@ -0,0 +1,77 @@
# nix/tui.nix — Hermes TUI (Ink/React) compiled with tsc and bundled
{ pkgs, npm-lockfile-fix, ... }:
let
src = ../ui-tui;
npmDeps = pkgs.fetchNpmDeps {
inherit src;
hash = "sha256-mG3vpgGi4ljt4X3XIf3I/5mIcm+rVTUAmx2DQ6YVA90=";
};
packageJson = builtins.fromJSON (builtins.readFile (src + "/package.json"));
version = packageJson.version;
npmLockHash = builtins.hashString "sha256" (builtins.readFile ../ui-tui/package-lock.json);
in
pkgs.buildNpmPackage {
pname = "hermes-tui";
inherit src npmDeps version;
doCheck = false;
installPhase = ''
runHook preInstall
mkdir -p $out/lib/hermes-tui
cp -r dist $out/lib/hermes-tui/dist
# runtime node_modules
cp -r node_modules $out/lib/hermes-tui/node_modules
# @hermes/ink is a file: dependency, we need to copy it in fr
rm -f $out/lib/hermes-tui/node_modules/@hermes/ink
cp -r packages/hermes-ink $out/lib/hermes-tui/node_modules/@hermes/ink
# package.json needed for "type": "module" resolution
cp package.json $out/lib/hermes-tui/
runHook postInstall
'';
nativeBuildInputs = [
(pkgs.writeShellScriptBin "update_tui_lockfile" ''
set -euox pipefail
# get root of repo
REPO_ROOT=$(git rev-parse --show-toplevel)
# cd into ui-tui and reinstall
cd "$REPO_ROOT/ui-tui"
rm -rf node_modules/
npm cache clean --force
CI=true npm install # ci env var to suppress annoying unicode install banner lag
${pkgs.lib.getExe npm-lockfile-fix} ./package-lock.json
NIX_FILE="$REPO_ROOT/nix/tui.nix"
# compute the new hash
sed -i "s/hash = \"[^\"]*\";/hash = \"\";/" $NIX_FILE
NIX_OUTPUT=$(nix build .#tui 2>&1 || true)
NEW_HASH=$(echo "$NIX_OUTPUT" | grep 'got:' | awk '{print $2}')
echo got new hash $NEW_HASH
sed -i "s|hash = \"[^\"]*\";|hash = \"$NEW_HASH\";|" $NIX_FILE
nix build .#tui
echo "Updated npm hash in $NIX_FILE to $NEW_HASH"
'')
];
passthru.devShellHook = ''
STAMP=".nix-stamps/hermes-tui"
STAMP_VALUE="${npmLockHash}"
if [ ! -f "$STAMP" ] || [ "$(cat "$STAMP")" != "$STAMP_VALUE" ]; then
echo "hermes-tui: installing npm dependencies..."
cd ui-tui && CI=true npm install --silent --no-fund --no-audit 2>/dev/null && cd ..
mkdir -p .nix-stamps
echo "$STAMP_VALUE" > "$STAMP"
fi
'';
}
+63
View File
@@ -0,0 +1,63 @@
# nix/web.nix — Hermes Web Dashboard (Vite/React) frontend build
{ pkgs, npm-lockfile-fix, ... }:
let
src = ../web;
npmDeps = pkgs.fetchNpmDeps {
inherit src;
hash = "sha256-Y0pOzdFG8BLjfvCLmsvqYpjxFjAQabXp1i7X9W/cCU4=";
};
npmLockHash = builtins.hashString "sha256" (builtins.readFile ../web/package-lock.json);
in
pkgs.buildNpmPackage {
pname = "hermes-web";
version = "0.0.0";
inherit src npmDeps;
doCheck = false;
buildPhase = ''
npx tsc -b
npx vite build --outDir dist
'';
installPhase = ''
runHook preInstall
cp -r dist $out
runHook postInstall
'';
nativeBuildInputs = [
(pkgs.writeShellScriptBin "update_web_lockfile" ''
set -euox pipefail
REPO_ROOT=$(git rev-parse --show-toplevel)
cd "$REPO_ROOT/web"
rm -rf node_modules/
npm cache clean --force
CI=true npm install
${pkgs.lib.getExe npm-lockfile-fix} ./package-lock.json
NIX_FILE="$REPO_ROOT/nix/web.nix"
sed -i "s/hash = \"[^\"]*\";/hash = \"\";/" $NIX_FILE
NIX_OUTPUT=$(nix build .#web 2>&1 || true)
NEW_HASH=$(echo "$NIX_OUTPUT" | grep 'got:' | awk '{print $2}')
echo got new hash $NEW_HASH
sed -i "s|hash = \"[^\"]*\";|hash = \"$NEW_HASH\";|" $NIX_FILE
nix build .#web
echo "Updated npm hash in $NIX_FILE to $NEW_HASH"
'')
];
passthru.devShellHook = ''
STAMP=".nix-stamps/hermes-web"
STAMP_VALUE="${npmLockHash}"
if [ ! -f "$STAMP" ] || [ "$(cat "$STAMP")" != "$STAMP_VALUE" ]; then
echo "hermes-web: installing npm dependencies..."
cd web && CI=true npm install --silent --no-fund --no-audit 2>/dev/null && cd ..
mkdir -p .nix-stamps
echo "$STAMP_VALUE" > "$STAMP"
fi
'';
}
@@ -145,10 +145,10 @@ Controls **how often** dialectic and context calls happen.
| Key | Default | Description |
|-----|---------|-------------|
| `contextCadence` | `1` | Min turns between context API calls |
| `dialecticCadence` | `3` | Min turns between dialectic API calls |
| `dialecticCadence` | `2` | Min turns between dialectic API calls. Recommended 15 |
| `injectionFrequency` | `every-turn` | `every-turn` or `first-turn` for base context injection |
Higher cadence values reduce API calls and cost. `dialecticCadence: 3` (default) means the dialectic engine fires at most every 3rd turn.
Higher cadence values fire the dialectic LLM less often. `dialecticCadence: 2` means the engine fires every other turn. Setting it to `1` fires every turn.
### Depth (how many)
@@ -180,6 +180,8 @@ If `dialecticDepthLevels` is omitted, rounds use **proportional levels** derived
This keeps earlier passes cheap while using full depth on the final synthesis.
**Depth at session start.** The session-start prewarm runs the full configured `dialecticDepth` in the background before turn 1. A single-pass prewarm on a cold peer often returns thin output — multi-pass depth runs the audit/reconcile cycle before the user ever speaks. Turn 1 consumes the prewarm result directly; if prewarm hasn't landed in time, turn 1 falls back to a synchronous call with a bounded timeout.
### Level (how hard)
Controls the **intensity** of each dialectic reasoning round.
@@ -368,7 +370,7 @@ Config file: `$HERMES_HOME/honcho.json` (profile-local) or `~/.honcho/config.jso
| `contextTokens` | uncapped | Max tokens for the combined base context injection (summary + representation + card). Opt-in cap — omit to leave uncapped, set to an integer to bound injection size. |
| `injectionFrequency` | `every-turn` | `every-turn` or `first-turn` |
| `contextCadence` | `1` | Min turns between context API calls |
| `dialecticCadence` | `3` | Min turns between dialectic LLM calls |
| `dialecticCadence` | `2` | Min turns between dialectic LLM calls (recommended 15) |
The `contextTokens` budget is enforced at injection time. If the session summary + representation + card exceed the budget, Honcho trims the summary first, then the representation, preserving the card. This prevents context blowup in long sessions.
@@ -0,0 +1,361 @@
---
name: concept-diagrams
description: Generate flat, minimal light/dark-aware SVG diagrams as standalone HTML files, using a unified educational visual language with 9 semantic color ramps, sentence-case typography, and automatic dark mode. Best suited for educational and non-software visuals — physics setups, chemistry mechanisms, math curves, physical objects (aircraft, turbines, smartphones, mechanical watches), anatomy, floor plans, cross-sections, narrative journeys (lifecycle of X, process of Y), hub-spoke system integrations (smart city, IoT), and exploded layer views. If a more specialized skill exists for the subject (dedicated software/cloud architecture, hand-drawn sketches, animated explainers, etc.), prefer that — otherwise this skill can also serve as a general-purpose SVG diagram fallback with a clean educational look. Ships with 15 example diagrams.
version: 0.1.0
author: v1k22 (original PR), ported into hermes-agent
license: MIT
dependencies: []
metadata:
hermes:
tags: [diagrams, svg, visualization, education, physics, chemistry, engineering]
related_skills: [architecture-diagram, excalidraw, generative-widgets]
---
# Concept Diagrams
Generate production-quality SVG diagrams with a unified flat, minimal design system. Output is a single self-contained HTML file that renders identically in any modern browser, with automatic light/dark mode.
## Scope
**Best suited for:**
- Physics setups, chemistry mechanisms, math curves, biology
- Physical objects (aircraft, turbines, smartphones, mechanical watches, cells)
- Anatomy, cross-sections, exploded layer views
- Floor plans, architectural conversions
- Narrative journeys (lifecycle of X, process of Y)
- Hub-spoke system integrations (smart city, IoT networks, electricity grids)
- Educational / textbook-style visuals in any domain
- Quantitative charts (grouped bars, energy profiles)
**Look elsewhere first for:**
- Dedicated software / cloud infrastructure architecture with a dark tech aesthetic (consider `architecture-diagram` if available)
- Hand-drawn whiteboard sketches (consider `excalidraw` if available)
- Animated explainers or video output (consider an animation skill)
If a more specialized skill is available for the subject, prefer that. If none fits, this skill can serve as a general-purpose SVG diagram fallback — the output will carry the clean educational aesthetic described below, which is a reasonable default for almost any subject.
## Workflow
1. Decide on the diagram type (see Diagram Types below).
2. Lay out components using the Design System rules.
3. Write the full HTML page using `templates/template.html` as the wrapper — paste your SVG where the template says `<!-- PASTE SVG HERE -->`.
4. Save as a standalone `.html` file (for example `~/my-diagram.html` or `./my-diagram.html`).
5. User opens it directly in a browser — no server, no dependencies.
Optional: if the user wants a browsable gallery of multiple diagrams, see "Local Preview Server" at the bottom.
Load the HTML template:
```
skill_view(name="concept-diagrams", file_path="templates/template.html")
```
The template embeds the full CSS design system (`c-*` color classes, text classes, light/dark variables, arrow marker styles). The SVG you generate relies on these classes being present on the hosting page.
---
## Design System
### Philosophy
- **Flat**: no gradients, drop shadows, blur, glow, or neon effects.
- **Minimal**: show the essential. No decorative icons inside boxes.
- **Consistent**: same colors, spacing, typography, and stroke widths across every diagram.
- **Dark-mode ready**: all colors auto-adapt via CSS classes — no per-mode SVG.
### Color Palette
9 color ramps, each with 7 stops. Put the class name on a `<g>` or shape element; the template CSS handles both modes.
| Class | 50 (lightest) | 100 | 200 | 400 | 600 | 800 | 900 (darkest) |
|------------|---------------|---------|---------|---------|---------|---------|---------------|
| `c-purple` | #EEEDFE | #CECBF6 | #AFA9EC | #7F77DD | #534AB7 | #3C3489 | #26215C |
| `c-teal` | #E1F5EE | #9FE1CB | #5DCAA5 | #1D9E75 | #0F6E56 | #085041 | #04342C |
| `c-coral` | #FAECE7 | #F5C4B3 | #F0997B | #D85A30 | #993C1D | #712B13 | #4A1B0C |
| `c-pink` | #FBEAF0 | #F4C0D1 | #ED93B1 | #D4537E | #993556 | #72243E | #4B1528 |
| `c-gray` | #F1EFE8 | #D3D1C7 | #B4B2A9 | #888780 | #5F5E5A | #444441 | #2C2C2A |
| `c-blue` | #E6F1FB | #B5D4F4 | #85B7EB | #378ADD | #185FA5 | #0C447C | #042C53 |
| `c-green` | #EAF3DE | #C0DD97 | #97C459 | #639922 | #3B6D11 | #27500A | #173404 |
| `c-amber` | #FAEEDA | #FAC775 | #EF9F27 | #BA7517 | #854F0B | #633806 | #412402 |
| `c-red` | #FCEBEB | #F7C1C1 | #F09595 | #E24B4A | #A32D2D | #791F1F | #501313 |
#### Color Assignment Rules
Color encodes **meaning**, not sequence. Never cycle through colors like a rainbow.
- Group nodes by **category** — all nodes of the same type share one color.
- Use `c-gray` for neutral/structural nodes (start, end, generic steps, users).
- Use **2-3 colors per diagram**, not 6+.
- Prefer `c-purple`, `c-teal`, `c-coral`, `c-pink` for general categories.
- Reserve `c-blue`, `c-green`, `c-amber`, `c-red` for semantic meaning (info, success, warning, error).
Light/dark stop mapping (handled by the template CSS — just use the class):
- Light mode: 50 fill + 600 stroke + 800 title / 600 subtitle
- Dark mode: 800 fill + 200 stroke + 100 title / 200 subtitle
### Typography
Only two font sizes. No exceptions.
| Class | Size | Weight | Use |
|-------|------|--------|-----|
| `th` | 14px | 500 | Node titles, region labels |
| `ts` | 12px | 400 | Subtitles, descriptions, arrow labels |
| `t` | 14px | 400 | General text |
- **Sentence case always.** Never Title Case, never ALL CAPS.
- Every `<text>` MUST carry a class (`t`, `ts`, or `th`). No unclassed text.
- `dominant-baseline="central"` on all text inside boxes.
- `text-anchor="middle"` for centered text in boxes.
**Width estimation (approx):**
- 14px weight 500: ~8px per character
- 12px weight 400: ~6.5px per character
- Always verify: `box_width >= (char_count × px_per_char) + 48` (24px padding each side)
### Spacing & Layout
- **ViewBox**: `viewBox="0 0 680 H"` where H = content height + 40px buffer.
- **Safe area**: x=40 to x=640, y=40 to y=(H-40).
- **Between boxes**: 60px minimum gap.
- **Inside boxes**: 24px horizontal padding, 12px vertical padding.
- **Arrowhead gap**: 10px between arrowhead and box edge.
- **Single-line box**: 44px height.
- **Two-line box**: 56px height, 18px between title and subtitle baselines.
- **Container padding**: 20px minimum inside every container.
- **Max nesting**: 2-3 levels deep. Deeper gets unreadable at 680px width.
### Stroke & Shape
- **Stroke width**: 0.5px on all node borders. Not 1px, not 2px.
- **Rect rounding**: `rx="8"` for nodes, `rx="12"` for inner containers, `rx="16"` to `rx="20"` for outer containers.
- **Connector paths**: MUST have `fill="none"`. SVG defaults to `fill: black` otherwise.
### Arrow Marker
Include this `<defs>` block at the start of **every** SVG:
```xml
<defs>
<marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
</defs>
```
Use `marker-end="url(#arrow)"` on lines. The arrowhead inherits the line color via `context-stroke`.
### CSS Classes (Provided by the Template)
The template page provides:
- Text: `.t`, `.ts`, `.th`
- Neutral: `.box`, `.arr`, `.leader`, `.node`
- Color ramps: `.c-purple`, `.c-teal`, `.c-coral`, `.c-pink`, `.c-gray`, `.c-blue`, `.c-green`, `.c-amber`, `.c-red` (all with automatic light/dark mode)
You do **not** need to redefine these — just apply them in your SVG. The template file contains the full CSS definitions.
---
## SVG Boilerplate
Every SVG inside the template page starts with this exact structure:
```xml
<svg width="100%" viewBox="0 0 680 {HEIGHT}" xmlns="http://www.w3.org/2000/svg">
<defs>
<marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
</defs>
<!-- Diagram content here -->
</svg>
```
Replace `{HEIGHT}` with the actual computed height (last element bottom + 40px).
### Node Patterns
**Single-line node (44px):**
```xml
<g class="node c-blue">
<rect x="100" y="20" width="180" height="44" rx="8" stroke-width="0.5"/>
<text class="th" x="190" y="42" text-anchor="middle" dominant-baseline="central">Service name</text>
</g>
```
**Two-line node (56px):**
```xml
<g class="node c-teal">
<rect x="100" y="20" width="200" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="200" y="38" text-anchor="middle" dominant-baseline="central">Service name</text>
<text class="ts" x="200" y="56" text-anchor="middle" dominant-baseline="central">Short description</text>
</g>
```
**Connector (no label):**
```xml
<line x1="200" y1="76" x2="200" y2="120" class="arr" marker-end="url(#arrow)"/>
```
**Container (dashed or solid):**
```xml
<g class="c-purple">
<rect x="40" y="92" width="600" height="300" rx="16" stroke-width="0.5"/>
<text class="th" x="66" y="116">Container label</text>
<text class="ts" x="66" y="134">Subtitle info</text>
</g>
```
---
## Diagram Types
Choose the layout that fits the subject:
1. **Flowchart** — CI/CD pipelines, request lifecycles, approval workflows, data processing. Single-direction flow (top-down or left-right). Max 4-5 nodes per row.
2. **Structural / Containment** — Cloud infrastructure nesting, system architecture with layers. Large outer containers with inner regions. Dashed rects for logical groupings.
3. **API / Endpoint Map** — REST routes, GraphQL schemas. Tree from root, branching to resource groups, each containing endpoint nodes.
4. **Microservice Topology** — Service mesh, event-driven systems. Services as nodes, arrows for communication patterns, message queues between.
5. **Data Flow** — ETL pipelines, streaming architectures. Left-to-right flow from sources through processing to sinks.
6. **Physical / Structural** — Vehicles, buildings, hardware, anatomy. Use shapes that match the physical form — `<path>` for curved bodies, `<polygon>` for tapered shapes, `<ellipse>`/`<circle>` for cylindrical parts, nested `<rect>` for compartments. See `references/physical-shape-cookbook.md`.
7. **Infrastructure / Systems Integration** — Smart cities, IoT networks, multi-domain systems. Hub-spoke layout with central platform connecting subsystems. Semantic line styles (`.data-line`, `.power-line`, `.water-pipe`, `.road`). See `references/infrastructure-patterns.md`.
8. **UI / Dashboard Mockups** — Admin panels, monitoring dashboards. Screen frame with nested chart/gauge/indicator elements. See `references/dashboard-patterns.md`.
For physical, infrastructure, and dashboard diagrams, load the matching reference file before generating — each one provides ready-made CSS classes and shape primitives.
---
## Validation Checklist
Before finalizing any SVG, verify ALL of the following:
1. Every `<text>` has class `t`, `ts`, or `th`.
2. Every `<text>` inside a box has `dominant-baseline="central"`.
3. Every connector `<path>` or `<line>` used as arrow has `fill="none"`.
4. No arrow line crosses through an unrelated box.
5. `box_width >= (longest_label_chars × 8) + 48` for 14px text.
6. `box_width >= (longest_label_chars × 6.5) + 48` for 12px text.
7. ViewBox height = bottom-most element + 40px.
8. All content stays within x=40 to x=640.
9. Color classes (`c-*`) are on `<g>` or shape elements, never on `<path>` connectors.
10. Arrow `<defs>` block is present.
11. No gradients, shadows, blur, or glow effects.
12. Stroke width is 0.5px on all node borders.
---
## Output & Preview
### Default: standalone HTML file
Write a single `.html` file the user can open directly. No server, no dependencies, works offline. Pattern:
```python
# 1. Load the template
template = skill_view("concept-diagrams", "templates/template.html")
# 2. Fill in title, subtitle, and paste your SVG
html = template.replace(
"<!-- DIAGRAM TITLE HERE -->", "SN2 reaction mechanism"
).replace(
"<!-- OPTIONAL SUBTITLE HERE -->", "Bimolecular nucleophilic substitution"
).replace(
"<!-- PASTE SVG HERE -->", svg_content
)
# 3. Write to a user-chosen path (or ./ by default)
write_file("./sn2-mechanism.html", html)
```
Tell the user how to open it:
```
# macOS
open ./sn2-mechanism.html
# Linux
xdg-open ./sn2-mechanism.html
```
### Optional: local preview server (multi-diagram gallery)
Only use this when the user explicitly wants a browsable gallery of multiple diagrams.
**Rules:**
- Bind to `127.0.0.1` only. Never `0.0.0.0`. Exposing diagrams on all network interfaces is a security hazard on shared networks.
- Pick a free port (do NOT hard-code one) and tell the user the chosen URL.
- The server is optional and opt-in — prefer the standalone HTML file first.
Recommended pattern (lets the OS pick a free ephemeral port):
```bash
# Put each diagram in its own folder under .diagrams/
mkdir -p .diagrams/sn2-mechanism
# ...write .diagrams/sn2-mechanism/index.html...
# Serve on loopback only, free port
cd .diagrams && python3 -c "
import http.server, socketserver
with socketserver.TCPServer(('127.0.0.1', 0), http.server.SimpleHTTPRequestHandler) as s:
print(f'Serving at http://127.0.0.1:{s.server_address[1]}/')
s.serve_forever()
" &
```
If the user insists on a fixed port, use `127.0.0.1:<port>` — still never `0.0.0.0`. Document how to stop the server (`kill %1` or `pkill -f "http.server"`).
---
## Examples Reference
The `examples/` directory ships 15 complete, tested diagrams. Browse them for working patterns before writing a new diagram of a similar type:
| File | Type | Demonstrates |
|------|------|--------------|
| `hospital-emergency-department-flow.md` | Flowchart | Priority routing with semantic colors |
| `feature-film-production-pipeline.md` | Flowchart | Phased workflow, horizontal sub-flows |
| `automated-password-reset-flow.md` | Flowchart | Auth flow with error branches |
| `autonomous-llm-research-agent-flow.md` | Flowchart | Loop-back arrows, decision branches |
| `place-order-uml-sequence.md` | Sequence | UML sequence diagram style |
| `commercial-aircraft-structure.md` | Physical | Paths, polygons, ellipses for realistic shapes |
| `wind-turbine-structure.md` | Physical cross-section | Underground/above-ground separation, color coding |
| `smartphone-layer-anatomy.md` | Exploded view | Alternating left/right labels, layered components |
| `apartment-floor-plan-conversion.md` | Floor plan | Walls, doors, proposed changes in dotted red |
| `banana-journey-tree-to-smoothie.md` | Narrative journey | Winding path, progressive state changes |
| `cpu-ooo-microarchitecture.md` | Hardware pipeline | Fan-out, memory hierarchy sidebar |
| `sn2-reaction-mechanism.md` | Chemistry | Molecules, curved arrows, energy profile |
| `smart-city-infrastructure.md` | Hub-spoke | Semantic line styles per system |
| `electricity-grid-flow.md` | Multi-stage flow | Voltage hierarchy, flow markers |
| `ml-benchmark-grouped-bar-chart.md` | Chart | Grouped bars, dual axis |
Load any example with:
```
skill_view(name="concept-diagrams", file_path="examples/<filename>")
```
---
## Quick Reference: What to Use When
| User says | Diagram type | Suggested colors |
|-----------|--------------|------------------|
| "show the pipeline" | Flowchart | gray start/end, purple steps, red errors, teal deploy |
| "draw the data flow" | Data pipeline (left-right) | gray sources, purple processing, teal sinks |
| "visualize the system" | Structural (containment) | purple container, teal services, coral data |
| "map the endpoints" | API tree | purple root, one ramp per resource group |
| "show the services" | Microservice topology | gray ingress, teal services, purple bus, coral workers |
| "draw the aircraft/vehicle" | Physical | paths, polygons, ellipses for realistic shapes |
| "smart city / IoT" | Hub-spoke integration | semantic line styles per subsystem |
| "show the dashboard" | UI mockup | dark screen, chart colors: teal, purple, coral for alerts |
| "power grid / electricity" | Multi-stage flow | voltage hierarchy (HV/MV/LV line weights) |
| "wind turbine / turbine" | Physical cross-section | foundation + tower cutaway + nacelle color-coded |
| "journey of X / lifecycle" | Narrative journey | winding path, progressive state changes |
| "layers of X / exploded" | Exploded layer view | vertical stack, alternating labels |
| "CPU / pipeline" | Hardware pipeline | vertical stages, fan-out to execution ports |
| "floor plan / apartment" | Floor plan | walls, doors, proposed changes in dotted red |
| "reaction mechanism" | Chemistry | atoms, bonds, curved arrows, transition state, energy profile |
@@ -0,0 +1,244 @@
# Apartment Floor Plan: 3 BHK to 4 BHK Conversion
An architectural floor plan showing a 1,500 sq ft apartment with proposed modifications to convert from 3 BHK to 4 BHK. Demonstrates architectural drawing conventions, room layouts, proposed changes with dotted lines, and area comparison tables.
## Key Patterns Used
- **Architectural floor plan**: Top-down view with walls, doors, windows
- **Proposed modifications**: Dotted red lines for new walls
- **Room color coding**: Light fills to distinguish room types
- **Circulation paths**: Arrows showing new access routes
- **Data table**: Before/after area comparison with highlighting
- **Architectural symbols**: North arrow, scale bar, door swings
## Diagram Type
This is an **architectural floor plan** with:
- **Plan view**: Top-down orthographic projection
- **Overlay technique**: Existing structure + proposed changes
- **Quantitative data**: Area measurements and comparison table
## Architectural Drawing Elements
### Wall Styles
```xml
<!-- Outer walls (thick) -->
<line class="wall" x1="0" y1="0" x2="560" y2="0"/>
<!-- Internal walls (thinner) -->
<line class="wall-thin" x1="180" y1="0" x2="180" y2="140"/>
<!-- Proposed new walls (dotted red) -->
<line class="proposed-wall" x1="125" y1="170" x2="125" y2="330"/>
```
```css
.wall { stroke: var(--text-primary); stroke-width: 6; fill: none; stroke-linecap: square; }
.wall-thin { stroke: var(--text-primary); stroke-width: 3; fill: none; }
.proposed-wall { stroke: #A32D2D; stroke-width: 4; fill: none; stroke-dasharray: 8 4; }
```
### Door Symbols
```xml
<!-- Door opening with swing arc -->
<rect x="150" y="137" width="25" height="6" fill="var(--bg-primary)"/>
<path class="door" d="M150,140 L150,165"/>
<path class="door-swing" d="M150,140 A25,25 0 0,0 175,140"/>
<!-- Sliding door (balcony) -->
<rect x="60" y="327" width="60" height="6" fill="var(--bg-primary)" stroke="var(--text-secondary)" stroke-width="1"/>
<line x1="60" y1="330" x2="90" y2="330" stroke="var(--text-secondary)" stroke-width="2"/>
<line x1="90" y1="330" x2="120" y2="330" stroke="var(--text-secondary)" stroke-width="2" stroke-dasharray="3 3"/>
<!-- Proposed door (dotted) -->
<rect x="143" y="292" width="22" height="6" fill="var(--bg-primary)" stroke="#A32D2D" stroke-width="1" stroke-dasharray="3 2"/>
<path d="M165,295 A22,22 0 0,0 165,273" stroke="#A32D2D" stroke-width="1" stroke-dasharray="3 2" fill="none"/>
```
```css
.door { stroke: var(--text-secondary); stroke-width: 1.5; fill: none; }
.door-swing { stroke: var(--text-tertiary); stroke-width: 1; fill: none; stroke-dasharray: 3 2; }
```
### Window Symbols
```xml
<!-- Window with glass indication -->
<rect class="window" x="-3" y="30" width="6" height="50"/>
<line class="window-glass" x1="0" y1="35" x2="0" y2="75"/>
<!-- Horizontal window (top wall) -->
<rect class="window" x="220" y="-3" width="60" height="6"/>
<line class="window-glass" x1="225" y1="0" x2="275" y2="0"/>
```
```css
.window { stroke: var(--text-primary); stroke-width: 1; fill: var(--bg-primary); }
.window-glass { stroke: #378ADD; stroke-width: 2; fill: none; }
```
### Room Fills
```xml
<!-- Different colors for room types -->
<rect class="room-master" x="3" y="3" width="174" height="134" rx="2"/>
<rect class="room-bed2" x="183" y="3" width="134" height="104" rx="2"/>
<rect class="room-living" x="3" y="173" width="554" height="154" rx="2"/>
<rect class="room-kitchen" x="443" y="3" width="114" height="104" rx="2"/>
<rect class="room-bath" x="183" y="113" width="54" height="54" rx="2"/>
<!-- Proposed new room (highlighted) -->
<rect class="room-new" x="3" y="223" width="120" height="104"/>
```
```css
.room-master { fill: rgba(206, 203, 246, 0.3); } /* purple tint */
.room-bed2 { fill: rgba(159, 225, 203, 0.3); } /* teal tint */
.room-bed3 { fill: rgba(250, 199, 117, 0.3); } /* amber tint */
.room-living { fill: rgba(245, 196, 179, 0.3); } /* coral tint */
.room-kitchen { fill: rgba(237, 147, 177, 0.3); } /* pink tint */
.room-bath { fill: rgba(133, 183, 235, 0.3); } /* blue tint */
.room-new { fill: rgba(163, 45, 45, 0.15); } /* red tint for proposed */
```
### Support Fixtures
```xml
<!-- Kitchen counter hint -->
<rect x="450" y="15" width="50" height="25" fill="none" stroke="var(--text-tertiary)" stroke-width="0.5" rx="2"/>
<text class="tx" x="475" y="30" text-anchor="middle">Counter</text>
<!-- Balcony (dashed outline) -->
<rect class="balcony-fill" x="3" y="333" width="200" height="50"/>
```
```css
.balcony { fill: none; stroke: var(--text-secondary); stroke-width: 2; stroke-dasharray: 6 3; }
.balcony-fill { fill: rgba(93, 202, 165, 0.1); }
```
### Room Labels
```xml
<!-- Room name and area -->
<text class="room-label" x="90" y="65" text-anchor="middle">MASTER</text>
<text class="room-label" x="90" y="78" text-anchor="middle">BEDROOM</text>
<text class="area-label" x="90" y="95" text-anchor="middle">195 sq ft</text>
<!-- Proposed room (in red) -->
<text class="room-label" x="63" y="268" text-anchor="middle" fill="#A32D2D">BEDROOM 4</text>
<text class="tx" x="63" y="282" text-anchor="middle" fill="#A32D2D">(NEW)</text>
```
```css
.room-label { font-family: system-ui; font-size: 11px; fill: var(--text-primary); font-weight: 500; }
.area-label { font-family: system-ui; font-size: 9px; fill: var(--text-tertiary); }
```
### Circulation Arrow
```xml
<defs>
<marker id="circ-arrow" viewBox="0 0 10 10" refX="8" refY="5" markerWidth="6" markerHeight="6" orient="auto">
<path d="M0,0 L10,5 L0,10 Z" class="circulation-fill"/>
</marker>
</defs>
<path class="circulation" d="M300,250 L200,250 L145,250 L145,280" marker-end="url(#circ-arrow)"/>
<text class="tx" x="250" y="242" fill="#3B6D11" font-weight="500">New corridor access</text>
```
```css
.circulation { stroke: #3B6D11; stroke-width: 2; fill: none; }
.circulation-fill { fill: #3B6D11; }
```
### North Arrow and Scale Bar
```xml
<!-- North arrow -->
<g transform="translate(520, 260)">
<circle cx="0" cy="0" r="20" fill="none" stroke="var(--text-tertiary)" stroke-width="0.5"/>
<polygon points="0,-18 -5,5 0,0 5,5" fill="var(--text-primary)"/>
<text class="tx" x="0" y="-22" text-anchor="middle">N</text>
</g>
<!-- Scale bar -->
<g transform="translate(420, 300)">
<line x1="0" y1="0" x2="100" y2="0" stroke="var(--text-primary)" stroke-width="2"/>
<line x1="0" y1="-5" x2="0" y2="5" stroke="var(--text-primary)" stroke-width="1"/>
<line x1="50" y1="-3" x2="50" y2="3" stroke="var(--text-primary)" stroke-width="1"/>
<line x1="100" y1="-5" x2="100" y2="5" stroke="var(--text-primary)" stroke-width="1"/>
<text class="tx" x="0" y="15" text-anchor="middle">0</text>
<text class="tx" x="50" y="15" text-anchor="middle">5'</text>
<text class="tx" x="100" y="15" text-anchor="middle">10'</text>
</g>
```
## Area Comparison Table
### Table Structure
```xml
<!-- Header row -->
<rect class="table-header" x="0" y="0" width="180" height="28" rx="4 4 0 0"/>
<text class="ts" x="90" y="18" text-anchor="middle" font-weight="500">Room</text>
<!-- Normal row -->
<rect class="table-row" x="0" y="28" width="180" height="24"/>
<text class="tx" x="10" y="44">Master Bedroom</text>
<text class="tx" x="230" y="44" text-anchor="middle">195</text>
<!-- Alternating row -->
<rect class="table-row-alt" x="0" y="52" width="180" height="24"/>
<!-- Highlighted row (for changes) -->
<rect class="table-highlight" x="0" y="100" width="180" height="24"/>
<text class="tx" x="10" y="116" fill="#A32D2D" font-weight="500">Bedroom 4 (NEW)</text>
<text class="tx" x="430" y="116" text-anchor="middle" fill="#3B6D11">+100</text>
<!-- Total row -->
<rect x="0" y="268" width="180" height="28" fill="var(--bg-secondary)" stroke="var(--border)" stroke-width="1"/>
<text class="ts" x="10" y="286" font-weight="500">TOTAL CARPET AREA</text>
```
```css
.table-header { fill: var(--bg-secondary); }
.table-row { fill: var(--bg-primary); stroke: var(--border); stroke-width: 0.5; }
.table-row-alt { fill: var(--bg-tertiary); stroke: var(--border); stroke-width: 0.5; }
.table-highlight { fill: rgba(163, 45, 45, 0.1); stroke: #A32D2D; stroke-width: 0.5; }
```
## Layout Notes
- **ViewBox**: 800×780 (portrait for floor plan + table)
- **Scale**: 10px = 1 foot (apartment ~50ft × 33ft)
- **Floor plan origin**: Offset at (50, 60) for margins
- **Wall thickness**: 6px outer, 3px inner (represents ~6" walls)
- **Room labels**: Centered in each room with area below
- **Table placement**: Below floor plan with full width
## Color Coding
| Element | Color | Usage |
|---------|-------|-------|
| Proposed walls | Red (#A32D2D) dotted | New construction |
| New room fill | Red 15% opacity | Bedroom 4 area |
| Circulation | Green (#3B6D11) | New access path |
| Window glass | Blue (#378ADD) | Glass indication |
| Bedrooms | Purple/Teal/Amber tints | Room differentiation |
| Wet areas | Blue tint | Bathrooms |
| Living | Coral tint | Common areas |
## When to Use This Pattern
Use this diagram style for:
- Apartment/house floor plans
- Office layout planning
- Renovation proposals showing before/after
- Space planning with area calculations
- Real estate marketing materials
- Interior design presentations
- Building permit documentation
@@ -0,0 +1,276 @@
# Automated Password Reset Flow
A two-section flowchart tracing the full user journey for a web application password reset: the initial request phase (forgot password → email check → token generation) and the reset-form phase (link click → new password entry → token/password validation). Demonstrates multi-exit decision diamonds, a three-column branching layout, a loop-back path, and a cross-section separator arrow.
## Key Patterns Used
- **Three-column layout**: Left column (error/terminal branches at cx=115), center column (main happy path at cx=340), right column (expired-token branch at cx=552) — allows side branches to live at the same y-level as center nodes without overlap
- **Decision diamonds with `<polygon>`**: Each decision uses a `<g class="decision">` wrapper containing a `<polygon>` and centered `<text>`; the diamond points are computed as `cx±hw, cy±hh` (hw=100, hh=28)
- **Pill-shaped terminals**: Start and end nodes use `rx=22` on their `<rect>` to signal entry/exit points; all mid-flow process nodes use `rx=8`
- **Three-branch decision paths**: Each diamond has a "Yes" branch (down, short `<line>`) and a "No" branch (`<path>` going horizontal then vertical to a side column)
- **Loop-back path**: Mismatch error node loops back to the password-entry node via a routing corridor at x=215 — a 5-px gap between the left column (right edge x=210) and center column (left edge x=220); the path exits the bottom of the error node, drops below it, travels right to x=215, then goes up to the target node's center y, then right 5 px into the node's left edge
- **Section separator**: A dashed horizontal `<line>` at y=452 splits the two phases; the connecting arrow crosses it with a faded label ("user receives email") to preserve flow continuity
- **Italic annotation**: The exact UX copy for the generic message ("If that email exists…") is shown as a faded italic `ts` text block below the left-branch terminal node
- **Legend row**: Five inline swatches (gray, purple, teal, red, amber diamond) at the bottom explain the color-to-role mapping
## Diagram
```xml
<svg width="100%" viewBox="0 0 680 960" xmlns="http://www.w3.org/2000/svg">
<defs>
<marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
</defs>
<!--
Column layout (680px viewBox, safe area x=40640):
Left col : x=20, w=190, cx=115 (error / terminal branches)
Center col: x=220, w=240, cx=340 (main happy path)
Right col: x=465, w=175, cx=552 (expired-token branch)
Loop corridor at x=215 (5-px gap between left and center cols)
-->
<!-- ═══ SECTION 1 — Forgot password request ═══ -->
<text class="ts" x="40" y="38" opacity=".45">Section 1 — Forgot password request</text>
<!-- START terminal (pill rx=22 signals start/end) -->
<g class="c-gray">
<rect x="220" y="46" width="240" height="44" rx="22"/>
<text class="th" x="340" y="68" text-anchor="middle" dominant-baseline="central">User: &quot;Forgot password&quot;</text>
</g>
<line x1="340" y1="90" x2="340" y2="108" class="arr" marker-end="url(#arrow)"/>
<!-- N2 · Enter email -->
<g class="c-gray">
<rect x="220" y="108" width="240" height="44" rx="8"/>
<text class="th" x="340" y="130" text-anchor="middle" dominant-baseline="central">Enter email address</text>
</g>
<line x1="340" y1="152" x2="340" y2="172" class="arr" marker-end="url(#arrow)"/>
<!-- D1 · Email in system? diamond: center=(340,200) hw=100 hh=28 -->
<g class="decision">
<polygon points="340,172 440,200 340,228 240,200"/>
<text class="th" x="340" y="200" text-anchor="middle" dominant-baseline="central">Email in system?</text>
</g>
<!-- D1 "No" → left column -->
<path d="M 240,200 L 115,200 L 115,248" class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="178" y="193" text-anchor="middle" opacity=".75">No</text>
<!-- D1 "Yes" → continue down -->
<line x1="340" y1="228" x2="340" y2="248" class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="348" y="242" text-anchor="start" opacity=".75">Yes</text>
<!-- ── Left branch (D1 = No): generic security message → end ── -->
<!-- L1 · Generic message (security: never confirm email existence) -->
<g class="c-gray">
<rect x="20" y="248" width="190" height="56" rx="8"/>
<text class="th" x="115" y="269" text-anchor="middle" dominant-baseline="central">Generic message shown</text>
<text class="ts" x="115" y="287" text-anchor="middle" dominant-baseline="central">Email sent if found</text>
</g>
<line x1="115" y1="304" x2="115" y2="324" class="arr" marker-end="url(#arrow)"/>
<!-- L2 · End terminal (left) -->
<g class="c-gray">
<rect x="20" y="324" width="190" height="44" rx="22"/>
<text class="th" x="115" y="346" text-anchor="middle" dominant-baseline="central">Request handled</text>
</g>
<!-- Italic annotation: actual UX copy shown below the end node -->
<text class="ts" x="20" y="384" opacity=".45" font-style="italic">&quot;If that email exists, a reset</text>
<text class="ts" x="20" y="398" opacity=".45" font-style="italic">link has been sent.&quot;</text>
<!-- ── Center Yes branch: system generates & sends token ── -->
<!-- N3 · Generate unique token -->
<g class="c-purple">
<rect x="220" y="248" width="240" height="56" rx="8"/>
<text class="th" x="340" y="269" text-anchor="middle" dominant-baseline="central">Generate unique token</text>
<text class="ts" x="340" y="287" text-anchor="middle" dominant-baseline="central">Time-limited, cryptographic</text>
</g>
<line x1="340" y1="304" x2="340" y2="324" class="arr" marker-end="url(#arrow)"/>
<!-- N4 · Store token + user ID -->
<g class="c-purple">
<rect x="220" y="324" width="240" height="44" rx="8"/>
<text class="th" x="340" y="346" text-anchor="middle" dominant-baseline="central">Store token + user ID</text>
</g>
<line x1="340" y1="368" x2="340" y2="388" class="arr" marker-end="url(#arrow)"/>
<!-- N5 · Send reset email -->
<g class="c-teal">
<rect x="220" y="388" width="240" height="44" rx="8"/>
<text class="th" x="340" y="410" text-anchor="middle" dominant-baseline="central">Send reset link via email</text>
</g>
<!-- ═══ Section separator ═══ -->
<line x1="40" y1="452" x2="640" y2="452"
stroke="var(--border)" stroke-width="1" stroke-dasharray="8 5"/>
<!-- Arrow crossing separator (with inline label) -->
<line x1="340" y1="432" x2="340" y2="472" class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="348" y="448" text-anchor="start" opacity=".55">user receives email</text>
<text class="ts" x="40" y="464" opacity=".45">Section 2 — Password reset form</text>
<!-- ═══ SECTION 2 — Password reset form ═══ -->
<!-- N6 · User clicks reset link -->
<g class="c-gray">
<rect x="220" y="480" width="240" height="44" rx="8"/>
<text class="th" x="340" y="502" text-anchor="middle" dominant-baseline="central">User clicks reset link</text>
</g>
<line x1="340" y1="524" x2="340" y2="544" class="arr" marker-end="url(#arrow)"/>
<!-- N7 · Enter new password ×2 -->
<g class="c-gray">
<rect x="220" y="544" width="240" height="56" rx="8"/>
<text class="th" x="340" y="565" text-anchor="middle" dominant-baseline="central">Enter new password ×2</text>
<text class="ts" x="340" y="583" text-anchor="middle" dominant-baseline="central">Confirm both passwords match</text>
</g>
<line x1="340" y1="600" x2="340" y2="620" class="arr" marker-end="url(#arrow)"/>
<!-- D2 · Token expired? diamond: center=(340,648) hw=100 hh=28 -->
<g class="decision">
<polygon points="340,620 440,648 340,676 240,648"/>
<text class="th" x="340" y="648" text-anchor="middle" dominant-baseline="central">Token expired?</text>
</g>
<!-- D2 "Yes" → right column (expired-token branch) -->
<path d="M 440,648 L 552,648 L 552,692" class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="496" y="641" text-anchor="middle" opacity=".75">Yes</text>
<!-- D2 "No" → down to password-match check -->
<line x1="340" y1="676" x2="340" y2="714" class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="348" y="698" text-anchor="start" opacity=".75">No</text>
<!-- ── Right branch (D2 = Yes): token expired → dead end ── -->
<!-- R1 · Token expired error -->
<g class="c-red">
<rect x="465" y="692" width="175" height="56" rx="8"/>
<text class="th" x="552" y="713" text-anchor="middle" dominant-baseline="central">Token expired</text>
<text class="ts" x="552" y="731" text-anchor="middle" dominant-baseline="central">Show expiry error</text>
</g>
<line x1="552" y1="748" x2="552" y2="768" class="arr" marker-end="url(#arrow)"/>
<!-- R2 · End terminal (right) -->
<g class="c-gray">
<rect x="465" y="768" width="175" height="44" rx="22"/>
<text class="th" x="552" y="790" text-anchor="middle" dominant-baseline="central">End — request again</text>
</g>
<!-- D3 · Passwords match? diamond: center=(340,742) hw=100 hh=28 -->
<g class="decision">
<polygon points="340,714 440,742 340,770 240,742"/>
<text class="th" x="340" y="742" text-anchor="middle" dominant-baseline="central">Passwords match?</text>
</g>
<!-- D3 "No" → left column (mismatch branch) -->
<path d="M 240,742 L 115,742 L 115,786" class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="178" y="735" text-anchor="middle" opacity=".75">No</text>
<!-- D3 "Yes" → down to reset -->
<line x1="340" y1="770" x2="340" y2="790" class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="348" y="783" text-anchor="start" opacity=".75">Yes</text>
<!-- ── Left branch (D3 = No): passwords don't match → loop back ── -->
<!-- L3 · Password mismatch error -->
<g class="c-red">
<rect x="20" y="786" width="190" height="56" rx="8"/>
<text class="th" x="115" y="807" text-anchor="middle" dominant-baseline="central">Password mismatch</text>
<text class="ts" x="115" y="825" text-anchor="middle" dominant-baseline="central">Passwords do not match</text>
</g>
<!-- Loop-back arrow: exits L3 bottom → drops to y=862 →
travels right to corridor x=215 → climbs to N7 center y=572 →
enters N7 left edge at (220, 572) pointing right -->
<path d="M 115,842 L 115,862 L 215,862 L 215,572 L 220,572"
class="arr" marker-end="url(#arrow)"/>
<text class="ts" x="224" y="538" text-anchor="start" opacity=".6">retry</text>
<!-- ── Center Yes branch (D3 = Yes): reset password & invalidate token ── -->
<!-- N8 · Reset password -->
<g class="c-teal">
<rect x="220" y="790" width="240" height="56" rx="8"/>
<text class="th" x="340" y="811" text-anchor="middle" dominant-baseline="central">Reset password</text>
<text class="ts" x="340" y="829" text-anchor="middle" dominant-baseline="central">Invalidate used token</text>
</g>
<line x1="340" y1="846" x2="340" y2="866" class="arr" marker-end="url(#arrow)"/>
<!-- N9 · Success terminal -->
<g class="c-green">
<rect x="220" y="866" width="240" height="44" rx="22"/>
<text class="th" x="340" y="888" text-anchor="middle" dominant-baseline="central">Password reset complete</text>
</g>
<!-- ═══ Legend ═══ -->
<text class="ts" x="40" y="930" opacity=".4">Legend —</text>
<rect x="108" y="920" width="13" height="13" rx="2" fill="#F1EFE8" stroke="#5F5E5A" stroke-width="0.5"/>
<text class="ts" x="126" y="930" opacity=".7">User action</text>
<rect x="210" y="920" width="13" height="13" rx="2" fill="#EEEDFE" stroke="#534AB7" stroke-width="0.5"/>
<text class="ts" x="228" y="930" opacity=".7">System process</text>
<rect x="334" y="920" width="13" height="13" rx="2" fill="#E1F5EE" stroke="#0F6E56" stroke-width="0.5"/>
<text class="ts" x="352" y="930" opacity=".7">Email / success</text>
<rect x="455" y="920" width="13" height="13" rx="2" fill="#FCEBEB" stroke="#A32D2D" stroke-width="0.5"/>
<text class="ts" x="473" y="930" opacity=".7">Error state</text>
<polygon points="556,926 566,932 556,938 546,932" fill="#FAEEDA" stroke="#854F0B" stroke-width="0.5"/>
<text class="ts" x="572" y="932" opacity=".7">Decision</text>
</svg>
```
## Custom CSS
Add these classes to the hosting page `<style>` block (in addition to the standard skill CSS):
```css
/* Decision diamond — amber fill, same palette as c-amber */
.decision > polygon { fill: #FAEEDA; stroke: #854F0B; stroke-width: 0.5; }
.decision > .th { fill: #633806; }
@media (prefers-color-scheme: dark) {
.decision > polygon { fill: #633806; stroke: #EF9F27; }
.decision > .th { fill: #FAC775; }
}
```
## Color Assignments
| Element | Color | Reason |
|---------|-------|--------|
| Start / end terminals | `c-gray` | Neutral entry and exit points |
| User actions (enter email, click link, enter password) | `c-gray` | User-facing steps with no system processing |
| Generic message + request-handled terminal | `c-gray` | Intentionally neutral — the security message must not reveal data |
| Generate & store token | `c-purple` | Backend system operations |
| Send reset email | `c-teal` | Positive external action (outbound communication) |
| Token expired error | `c-red` | Failure / blocking error state |
| Password mismatch error | `c-red` | Validation failure |
| Reset password + success | `c-teal` / `c-green` | Positive outcome: teal for the action, green pill for the terminal |
| Decision diamonds | `c-amber` (custom `.decision`) | Warning / branch point — matches amber semantic meaning |
## Layout Notes
- **ViewBox**: 680×960 — tall flowchart with two phases
- **Three-column structure**: Left (cx=115), center (cx=340), right (cx=552) — each branch stays within its column; only `<path>` arrows cross column boundaries
- **Diamond formula**: `<polygon points="cx,cy-hh cx+hw,cy cx,cy+hh cx-hw,cy"/>` with hw=100, hh=28 gives a 200×56px diamond that sits flush with the center column (x=220460)
- **Branch routing pattern**: "No" paths use `<path d="M left_point,cy L side_cx,cy L side_cx,node_top">` — one horizontal segment + one vertical segment, no curves needed
- **Loop corridor**: The 5-px gap at x=210220 between left and center columns provides a clean vertical channel for the loop-back path without any node overlap; the path exits node bottom, drops 20px, goes right to x=215, climbs to target y, enters from left
- **Section separator**: A dashed `<line>` at y=452 with `stroke-dasharray="8 5"` provides a visual phase break; the single connecting arrow crosses it at center, with a faded label on the arrow
- **Pill terminals**: `rx=22` (half the 44px node height) produces a perfect capsule/pill shape — use this consistently for all start/end terminals
- **Error annotation**: The exact UX copy is rendered as faded (`opacity=".45"`) italic `ts` text below the relevant node, keeping it informative without cluttering the flow
@@ -0,0 +1,240 @@
# Autonomous LLM Research Agent Flow
A multi-section flowchart showing Karpathy's autoresearch framework: human-agent handoff, the autonomous experiment loop with keep/discard decision branching, and the modifiable training pipeline. Demonstrates loop-back arrows, convergent decision paths, and semantic color coding for outcomes.
## Key Patterns Used
- **Three-section layout**: Setup row, main loop container, and detail container — each visually distinct
- **Neutral dashed containers**: Loop and training pipeline use `var(--bg-secondary)` fill with dashed borders to recede behind colored content nodes
- **Decision branching with convergence**: "val_bpb improved?" splits into Keep (green) and Discard (red), then both converge back to "Log to results.tsv"
- **Loop-back arrow**: Dashed path with rounded corners on the right side of the container showing infinite repetition
- **Semantic color for outcomes**: Green = improvement (keep), Red = no improvement (discard) — not arbitrary decoration
- **Highlighted key step**: "Run training" uses `c-coral` to visually distinguish the most important step from other `c-teal` actions
- **Horizontal pipeline flow**: Training details section uses left-to-right arrow-connected nodes (GPT → MuonAdamW → Evaluation)
- **Footer metadata**: Fixed constraints shown as subtle centered text below the pipeline nodes
- **Legend row**: Color key at the bottom explaining what each color means
## Diagram
```xml
<svg width="100%" viewBox="0 0 680 920" xmlns="http://www.w3.org/2000/svg">
<defs>
<marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
</defs>
<!-- ========================================== -->
<!-- SECTION 1: SETUP (Human → program.md → AI) -->
<!-- ========================================== -->
<text class="ts" x="40" y="30" text-anchor="start" opacity=".5">One-time setup</text>
<!-- Human -->
<g class="node c-gray">
<rect x="60" y="42" width="140" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="130" y="62" text-anchor="middle" dominant-baseline="central">Human</text>
<text class="ts" x="130" y="82" text-anchor="middle" dominant-baseline="central">Researcher</text>
</g>
<!-- Arrow: Human → program.md -->
<line x1="200" y1="70" x2="250" y2="70" class="arr" marker-end="url(#arrow)"/>
<!-- program.md -->
<g class="node c-gray">
<rect x="250" y="42" width="180" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="340" y="62" text-anchor="middle" dominant-baseline="central">program.md</text>
<text class="ts" x="340" y="82" text-anchor="middle" dominant-baseline="central">Agent instructions</text>
</g>
<!-- Arrow: program.md → AI Agent -->
<line x1="430" y1="70" x2="470" y2="70" class="arr" marker-end="url(#arrow)"/>
<!-- AI Agent -->
<g class="node c-purple">
<rect x="470" y="42" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="550" y="62" text-anchor="middle" dominant-baseline="central">AI agent</text>
<text class="ts" x="550" y="82" text-anchor="middle" dominant-baseline="central">Claude / Codex</text>
</g>
<!-- Arrow: Setup row → Loop (from program.md center down) -->
<line x1="340" y1="98" x2="340" y2="142" class="arr" marker-end="url(#arrow)"/>
<!-- ========================================== -->
<!-- SECTION 2: AUTONOMOUS EXPERIMENT LOOP -->
<!-- ========================================== -->
<!-- Loop container (neutral dashed) -->
<g>
<rect x="40" y="142" width="600" height="528" rx="16"
stroke-width="1" stroke-dasharray="6 4"
fill="var(--bg-secondary)" stroke="var(--border)"/>
<text class="th" x="66" y="170">Autonomous experiment loop</text>
<text class="ts" x="66" y="188">~12 experiments/hour — runs until manually stopped</text>
</g>
<!-- Step 1: Read code + past results -->
<g class="node c-teal">
<rect x="170" y="208" width="280" height="44" rx="8" stroke-width="0.5"/>
<text class="th" x="310" y="230" text-anchor="middle" dominant-baseline="central">Read code + past results</text>
</g>
<!-- Arrow: S1 → S2 -->
<line x1="310" y1="252" x2="310" y2="274" class="arr" marker-end="url(#arrow)"/>
<!-- Step 2: Propose + edit train.py -->
<g class="node c-teal">
<rect x="170" y="274" width="280" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="310" y="294" text-anchor="middle" dominant-baseline="central">Propose + edit train.py</text>
<text class="ts" x="310" y="314" text-anchor="middle" dominant-baseline="central">Arch, optimizer, hyperparameters</text>
</g>
<!-- Arrow: S2 → S3 -->
<line x1="310" y1="330" x2="310" y2="352" class="arr" marker-end="url(#arrow)"/>
<!-- Step 3: Run training (highlighted — key step) -->
<g class="node c-coral">
<rect x="170" y="352" width="280" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="310" y="372" text-anchor="middle" dominant-baseline="central">Run training</text>
<text class="ts" x="310" y="392" text-anchor="middle" dominant-baseline="central">uv run train.py (5 min budget)</text>
</g>
<!-- Arrow: S3 → S4 -->
<line x1="310" y1="408" x2="310" y2="430" class="arr" marker-end="url(#arrow)"/>
<!-- Step 4: Decision — val_bpb improved? -->
<g class="node c-gray">
<rect x="170" y="430" width="280" height="44" rx="8" stroke-width="0.5"/>
<text class="th" x="310" y="452" text-anchor="middle" dominant-baseline="central">val_bpb improved?</text>
</g>
<!-- Decision arrows to Keep / Discard -->
<line x1="240" y1="474" x2="175" y2="508" class="arr" marker-end="url(#arrow)"/>
<line x1="380" y1="474" x2="445" y2="508" class="arr" marker-end="url(#arrow)"/>
<!-- Decision labels -->
<text class="ts" x="195" y="496" opacity=".6">yes</text>
<text class="ts" x="416" y="496" opacity=".6">no</text>
<!-- Keep — advance branch -->
<g class="node c-green">
<rect x="70" y="508" width="210" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="175" y="528" text-anchor="middle" dominant-baseline="central">Keep</text>
<text class="ts" x="175" y="548" text-anchor="middle" dominant-baseline="central">Advance git branch</text>
</g>
<!-- Discard — git reset -->
<g class="node c-red">
<rect x="340" y="508" width="210" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="445" y="528" text-anchor="middle" dominant-baseline="central">Discard</text>
<text class="ts" x="445" y="548" text-anchor="middle" dominant-baseline="central">Git reset to previous</text>
</g>
<!-- Converge arrows: Keep → Log, Discard → Log -->
<line x1="175" y1="564" x2="250" y2="590" class="arr" marker-end="url(#arrow)"/>
<line x1="445" y1="564" x2="370" y2="590" class="arr" marker-end="url(#arrow)"/>
<!-- Step 6: Log to results.tsv -->
<g class="node c-teal">
<rect x="170" y="590" width="280" height="44" rx="8" stroke-width="0.5"/>
<text class="th" x="310" y="612" text-anchor="middle" dominant-baseline="central">Log to results.tsv</text>
</g>
<!-- Loop-back arrow (dashed, right side) -->
<path d="M 450 612 L 564 612 Q 576 612 576 600 L 576 242 Q 576 230 564 230 L 450 230"
fill="none" class="arr" stroke-dasharray="4 3" marker-end="url(#arrow)"/>
<!-- ========================================== -->
<!-- SECTION 3: TRAINING PIPELINE DETAILS -->
<!-- ========================================== -->
<!-- Connection arrow: Loop → Training details -->
<line x1="310" y1="670" x2="310" y2="710" class="arr" marker-end="url(#arrow)"/>
<!-- Training container (neutral dashed) -->
<g>
<rect x="40" y="710" width="600" height="170" rx="16"
stroke-width="1" stroke-dasharray="6 4"
fill="var(--bg-secondary)" stroke="var(--border)"/>
<text class="th" x="66" y="738">train.py — modifiable training pipeline</text>
<text class="ts" x="66" y="756">Runs during each training step — single GPU, single file</text>
</g>
<!-- GPT model -->
<g class="node c-coral">
<rect x="70" y="774" width="155" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="147" y="794" text-anchor="middle" dominant-baseline="central">GPT model</text>
<text class="ts" x="147" y="814" text-anchor="middle" dominant-baseline="central">RoPE, FlashAttn3</text>
</g>
<!-- Arrow: GPT → MuonAdamW -->
<line x1="225" y1="802" x2="260" y2="802" class="arr" marker-end="url(#arrow)"/>
<!-- MuonAdamW optimizer -->
<g class="node c-coral">
<rect x="260" y="774" width="155" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="337" y="794" text-anchor="middle" dominant-baseline="central">MuonAdamW</text>
<text class="ts" x="337" y="814" text-anchor="middle" dominant-baseline="central">Hybrid optimizer</text>
</g>
<!-- Arrow: MuonAdamW → Evaluation -->
<line x1="415" y1="802" x2="450" y2="802" class="arr" marker-end="url(#arrow)"/>
<!-- Evaluation -->
<g class="node c-amber">
<rect x="450" y="774" width="155" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="527" y="794" text-anchor="middle" dominant-baseline="central">Evaluation</text>
<text class="ts" x="527" y="814" text-anchor="middle" dominant-baseline="central">val_bpb metric</text>
</g>
<!-- Footer: fixed constraints -->
<text class="ts" x="340" y="856" text-anchor="middle" opacity=".5">climbmix-400b data · 8K BPE vocab · 300s budget · 2048 context</text>
<!-- ========================================== -->
<!-- LEGEND -->
<!-- ========================================== -->
<g class="c-teal"><rect x="40" y="890" width="14" height="14" rx="3" stroke-width="0.5"/></g>
<text class="ts" x="62" y="902">Agent actions</text>
<g class="c-coral"><rect x="170" y="890" width="14" height="14" rx="3" stroke-width="0.5"/></g>
<text class="ts" x="192" y="902">Training run</text>
<g class="c-green"><rect x="300" y="890" width="14" height="14" rx="3" stroke-width="0.5"/></g>
<text class="ts" x="322" y="902">Improvement</text>
<g class="c-red"><rect x="430" y="890" width="14" height="14" rx="3" stroke-width="0.5"/></g>
<text class="ts" x="452" y="902">No improvement</text>
</svg>
```
## Color Assignments
| Element | Color | Reason |
|---------|-------|--------|
| Human, program.md | `c-gray` | Neutral setup / input nodes |
| AI agent | `c-purple` | The active intelligent actor |
| Loop action steps | `c-teal` | Agent's analytical/editing actions |
| Run training | `c-coral` | Highlighted key step — the 5-min training run |
| Decision check | `c-gray` | Neutral evaluation checkpoint |
| Keep (improved) | `c-green` | Semantic success — val_bpb decreased |
| Discard (not improved) | `c-red` | Semantic failure — no improvement |
| Training pipeline nodes | `c-coral` | Training infrastructure components |
| Evaluation node | `c-amber` | Distinct from training — measurement/metric role |
| Containers | Neutral (dashed) | Subtle grouping that recedes behind content |
## Layout Notes
- **ViewBox**: 680×920 (standard width, tall for 3 sections)
- **Three sections**: Setup row (y=3098), loop container (y=142670), training details (y=710880)
- **Container style**: Dashed border (`stroke-dasharray="6 4"`), neutral fill (`var(--bg-secondary)`), `stroke-width="1"` — not colored, so inner nodes pop
- **Loop-back arrow**: Dashed `<path>` with quadratic curves (`Q`) at corners for smooth rounded turns, running up the right side of the loop container from "Log" back to "Read code"
- **Decision pattern**: Single question node ("val_bpb improved?") with diagonal arrows to Keep/Discard, then convergent diagonal arrows back to "Log to results.tsv"
- **Decision labels**: "yes"/"no" labels placed along the diagonal arrows with `opacity=".6"` to stay subtle
- **Key step highlight**: "Run training" uses `c-coral` while surrounding steps use `c-teal`, drawing the eye to the most important step
- **Horizontal sub-flow**: Training pipeline uses left-to-right arrow-connected nodes (GPT model → MuonAdamW → Evaluation)
- **Footer metadata**: Fixed constraints (data, vocab, budget, context) shown as a single centered `ts` text line with `opacity=".5"`
- **Legend**: Four color swatches at the bottom explaining the semantic meaning of each color used
@@ -0,0 +1,161 @@
# Journey of a Banana: From Tree to Smoothie
A narrative journey diagram following a single banana across 3,000 miles and 3 weeks, from harvest in Costa Rica to a smoothie in the consumer's kitchen. Demonstrates storytelling through visualization, winding path layout, and progressive state changes.
## Key Patterns Used
- **Winding journey path**: S-curve connecting all stages visually
- **Location markers**: Country flags and place names for geographic context
- **Progressive state changes**: Banana color changes (green → yellow → brown → frozen → smoothie)
- **Narrative details**: Fun elements like spider check, stickers, price tags
- **Timeline**: Bottom timeline showing duration of journey
- **Environmental context**: Ocean waves, gas clouds, store awning
## New Shape Techniques
### Banana (curved fruit shape)
```xml
<!-- Green banana -->
<path class="banana-green" d="M 5 0 Q 0 10 3 20 Q 6 25 10 20 Q 13 10 8 0 Z"/>
<!-- Yellow banana -->
<path class="banana-yellow" d="M 0 5 Q -6 18 0 32 Q 7 40 15 30 Q 20 15 12 5 Z"/>
<!-- Brown overripe banana with spots -->
<path class="banana-brown" d="M 0 5 Q -5 15 0 28 Q 6 35 14 26 Q 18 14 12 5 Z"/>
<circle class="banana-spots" cx="5" cy="15" r="1.5"/>
<circle class="banana-spots" cx="9" cy="20" r="1"/>
```
### Banana Tree
```xml
<!-- Trunk -->
<rect class="tree-trunk" x="55" y="50" width="15" height="60" rx="3"/>
<!-- Leaves (rotated ellipses) -->
<ellipse class="tree-leaf" cx="62" cy="45" rx="40" ry="15" transform="rotate(-20, 62, 45)"/>
<ellipse class="tree-leaf" cx="62" cy="50" rx="35" ry="12" transform="rotate(25, 62, 50)"/>
<!-- Banana bunch hanging -->
<g transform="translate(40, 55)">
<path class="banana-green" d="M 5 0 Q 0 10 3 20 Q 6 25 10 20 Q 13 10 8 0 Z"/>
<path class="banana-green" d="M 12 2 Q 8 12 11 22 Q 14 27 18 22 Q 21 12 16 2 Z"/>
<rect class="stem" x="8" y="-5" width="12" height="8" rx="2"/>
</g>
```
### Cargo Ship
```xml
<!-- Ocean waves -->
<path class="ocean" d="M 0 90 Q 30 85 60 90 Q 90 95 120 90 Q 150 85 180 90 L 180 110 L 0 110 Z" opacity="0.5"/>
<!-- Hull -->
<path class="ship-hull" d="M 20 90 L 30 60 L 160 60 L 170 90 Q 150 95 95 95 Q 40 95 20 90 Z"/>
<!-- Deck -->
<rect class="ship-deck" x="40" y="45" width="110" height="18" rx="2"/>
<!-- Reefer containers -->
<rect class="container" x="45" y="25" width="30" height="22" rx="2"/>
<!-- Refrigeration symbol -->
<text x="60" y="40" text-anchor="middle" fill="#185FA5" style="font-size:10px">❄</text>
<!-- Smoke stack -->
<rect x="145" y="35" width="8" height="15" fill="#444441"/>
```
### Inspector Figure
```xml
<!-- Body -->
<rect class="inspector" x="10" y="20" width="25" height="35" rx="3"/>
<!-- Head -->
<circle class="inspector" cx="22" cy="12" r="10"/>
<!-- Hat -->
<rect x="12" y="2" width="20" height="6" rx="2" fill="#534AB7"/>
<!-- Clipboard -->
<rect class="clipboard" x="38" y="28" width="15" height="20" rx="2"/>
<line x1="42" y1="34" x2="50" y2="34" stroke="#888780" stroke-width="1"/>
```
### Spider with "No" Symbol
```xml
<circle cx="15" cy="15" r="18" fill="none" stroke="#A32D2D" stroke-width="2"/>
<line x1="3" y1="3" x2="27" y2="27" stroke="#A32D2D" stroke-width="2"/>
<!-- Spider body -->
<ellipse class="spider" cx="15" cy="15" rx="4" ry="5"/>
<ellipse class="spider" cx="15" cy="10" rx="3" ry="3"/>
<!-- Legs -->
<line x1="12" y1="14" x2="5" y2="10" stroke="#2C2C2A" stroke-width="1"/>
<line x1="18" y1="14" x2="25" y2="10" stroke="#2C2C2A" stroke-width="1"/>
```
### Blender with Smoothie
```xml
<!-- Blender jar -->
<path class="blender" d="M 5 5 L 0 45 L 35 45 L 30 5 Z"/>
<!-- Smoothie inside (wavy top) -->
<path class="smoothie" d="M 3 20 L 0 45 L 35 45 L 32 20 Q 25 18 17 22 Q 10 18 3 20 Z"/>
<!-- Blender base -->
<rect class="blender" x="-2" y="45" width="40" height="12" rx="3"/>
<!-- Lid -->
<rect x="8" y="0" width="20" height="8" rx="2" fill="#AFA9EC" stroke="#534AB7"/>
<!-- Banana chunks floating -->
<ellipse cx="12" cy="32" rx="4" ry="2" fill="#FAC775"/>
```
### Winding Journey Path
```xml
<path class="journey-path" d="
M 80 100
L 200 100
Q 280 100 280 150
L 280 180
Q 280 220 320 220
L 520 220
Q 560 220 560 260
L 560 320
Q 560 360 520 360
L 280 360
...
"/>
```
## CSS Classes
```css
/* Journey */
.journey-path { stroke: #D3D1C7; stroke-width: 3; fill: none; stroke-linecap: round; }
/* Banana ripeness stages */
.banana-green { fill: #97C459; stroke: #3B6D11; stroke-width: 0.5; }
.banana-yellow { fill: #FAC775; stroke: #BA7517; stroke-width: 0.5; }
.banana-brown { fill: #854F0B; stroke: #633806; stroke-width: 0.5; }
.banana-spots { fill: #633806; }
/* Environment elements */
.tree-trunk { fill: #854F0B; stroke: #633806; stroke-width: 1; }
.tree-leaf { fill: #97C459; stroke: #3B6D11; stroke-width: 0.5; }
.ocean { fill: #85B7EB; }
.ship-hull { fill: #5F5E5A; stroke: #444441; stroke-width: 1; }
.container { fill: #E6F1FB; stroke: #185FA5; stroke-width: 1; }
.gas-cloud { fill: #C0DD97; stroke: #97C459; stroke-width: 0.5; opacity: 0.6; }
/* Buildings */
.packhouse { fill: #F1EFE8; stroke: #5F5E5A; stroke-width: 1; }
.warehouse { fill: #FAEEDA; stroke: #854F0B; stroke-width: 1; }
.store { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 1; }
/* Kitchen */
.counter { fill: #FAECE7; stroke: #993C1D; stroke-width: 1; }
.blender { fill: #EEEDFE; stroke: #534AB7; stroke-width: 1; }
.smoothie { fill: #FAC775; }
.freezer { fill: #E6F1FB; stroke: #185FA5; stroke-width: 1; }
/* Details */
.sticker { fill: #378ADD; stroke: #185FA5; stroke-width: 0.3; }
.spider { fill: #2C2C2A; stroke: #1a1a18; stroke-width: 0.3; }
```
## Layout Notes
- **ViewBox**: 850×680 (tall for winding path)
- **Path style**: S-curve winding path connects all 7 stages
- **Location labels**: Country flags + place names anchor geographic context
- **State progression**: Same object (banana) shown in different states throughout
- **Timeline**: Horizontal timeline at bottom shows journey duration
- **Narrative elements**: Fun details (spider, stickers, price tags) add storytelling value
- **Environmental context**: Ocean waves, gas clouds, awnings create sense of place
@@ -0,0 +1,209 @@
# Commercial Aircraft Structure
A physical/structural diagram showing an aircraft side profile using appropriate SVG shapes beyond rectangles - paths, polygons, ellipses for realistic representation.
## Key Patterns Used
- **Path elements**: Curved fuselage body with nose cone using quadratic bezier curves
- **Polygon elements**: Tapered wing shape, triangular stabilizers, control surfaces
- **Ellipse elements**: Engines (cylinders), wheels (circles)
- **Line elements**: Landing gear struts, leader lines for labels
- **Dashed strokes**: Interior sections (fuel tank), movable control surfaces (rudder, elevator)
- **Layered composition**: Cabin sections drawn inside the fuselage shape
- **Leader lines with labels**: Connect labels to components they describe
## Diagram
```xml
<svg width="100%" viewBox="0 0 680 400" xmlns="http://www.w3.org/2000/svg">
<!-- FUSELAGE - main body cylinder with nose cone -->
<path class="fuselage" d="
M 80 180
Q 40 180 40 200
Q 40 220 80 220
L 560 220
Q 580 220 580 200
Q 580 180 560 180
Z
"/>
<!-- Nose cone -->
<path class="fuselage" d="
M 80 180
Q 50 180 35 200
Q 50 220 80 220
" fill="none" stroke-width="1"/>
<!-- COCKPIT windows -->
<path class="cockpit" d="
M 45 190
L 75 185
L 75 200
L 50 200
Z
"/>
<line x1="55" y1="188" x2="55" y2="200" stroke="#534AB7" stroke-width="0.5"/>
<line x1="65" y1="186" x2="65" y2="200" stroke="#534AB7" stroke-width="0.5"/>
<!-- CABIN SECTIONS (inside fuselage) -->
<!-- First class -->
<rect class="first-class" x="85" y="183" width="50" height="34" rx="2"/>
<text class="tl" x="110" y="203" text-anchor="middle">First</text>
<!-- Business class -->
<rect class="business-class" x="140" y="183" width="80" height="34" rx="2"/>
<text class="tl" x="180" y="203" text-anchor="middle">Business</text>
<!-- Economy class -->
<rect class="economy-class" x="225" y="183" width="200" height="34" rx="2"/>
<text class="tl" x="325" y="203" text-anchor="middle">Economy</text>
<!-- CARGO HOLD (lower section indication) -->
<line x1="85" y1="217" x2="520" y2="217" class="leader"/>
<text class="tl" x="300" y="228" text-anchor="middle" opacity=".6">Cargo hold below deck</text>
<!-- WING - main wing shape -->
<polygon class="wing" points="
200,220
120,300
130,305
160,305
340,235
340,220
"/>
<!-- Wing fuel tank (dashed interior) -->
<polygon class="fuel-tank" points="
210,225
150,280
160,283
180,283
310,232
310,225
"/>
<text class="tl" x="220" y="260" opacity=".7">Fuel</text>
<!-- Flaps (trailing edge) -->
<polygon class="flap" points="
130,300
120,305
160,310
165,305
"/>
<text class="tl" x="143" y="320">Flaps</text>
<!-- ENGINE under wing -->
<ellipse class="engine" cx="175" cy="285" rx="25" ry="12"/>
<ellipse cx="155" cy="285" rx="8" ry="10" fill="none" stroke="#993C1D" stroke-width="0.5"/>
<!-- Engine pylon -->
<line x1="175" y1="273" x2="190" y2="245" stroke="#5F5E5A" stroke-width="2"/>
<text class="tl" x="175" y="308" text-anchor="middle">Engine</text>
<!-- TAIL SECTION -->
<!-- Vertical stabilizer -->
<polygon class="tail-v" points="
520,180
560,100
580,100
580,180
"/>
<text class="tl" x="565" y="150" text-anchor="middle">Vertical</text>
<text class="tl" x="565" y="162" text-anchor="middle">stabilizer</text>
<!-- Rudder -->
<polygon points="575,105 590,105 590,178 580,178" fill="none" stroke="#185FA5" stroke-width="0.5" stroke-dasharray="3 2"/>
<text class="tl" x="595" y="145" opacity=".6">Rudder</text>
<!-- Horizontal stabilizer -->
<polygon class="tail-h" points="
500,195
460,175
465,170
580,170
580,180
520,195
"/>
<text class="tl" x="510" y="166">Horizontal stabilizer</text>
<!-- Elevator -->
<polygon points="462,174 450,168 455,163 467,169" fill="none" stroke="#185FA5" stroke-width="0.5" stroke-dasharray="3 2"/>
<text class="tl" x="440" y="158" opacity=".6">Elevator</text>
<!-- LANDING GEAR -->
<!-- Nose gear -->
<line class="gear" x1="100" y1="220" x2="100" y2="260" stroke-width="3"/>
<ellipse class="wheel" cx="100" cy="268" rx="8" ry="10"/>
<text class="tl" x="100" y="290" text-anchor="middle">Nose gear</text>
<!-- Main gear (under wing/fuselage junction) -->
<line class="gear" x1="280" y1="220" x2="280" y2="270" stroke-width="4"/>
<line class="gear" x1="268" y1="265" x2="292" y2="265" stroke-width="3"/>
<ellipse class="wheel" cx="268" cy="278" rx="10" ry="12"/>
<ellipse class="wheel" cx="292" cy="278" rx="10" ry="12"/>
<text class="tl" x="280" y="302" text-anchor="middle">Main gear</text>
<!-- LABELS with leader lines -->
<!-- Cockpit label -->
<line class="leader" x1="60" y1="175" x2="60" y2="140"/>
<text class="ts" x="60" y="132" text-anchor="middle">Cockpit</text>
<!-- Wing label -->
<line class="leader" x1="250" y1="250" x2="290" y2="330"/>
<text class="ts" x="290" y="345" text-anchor="middle">Wing structure</text>
<text class="tl" x="290" y="358" text-anchor="middle">Spars, ribs, skin</text>
<!-- Fuselage label -->
<line class="leader" x1="400" y1="180" x2="400" y2="140"/>
<text class="ts" x="400" y="132" text-anchor="middle">Fuselage</text>
<text class="tl" x="400" y="145" text-anchor="middle">Pressure vessel</text>
</svg>
```
## CSS Classes for Physical Diagrams
When creating physical/structural diagrams, define semantic classes for each component type:
```css
/* Structure shapes */
.fuselage { fill: #F1EFE8; stroke: #5F5E5A; stroke-width: 1; }
.wing { fill: #E6F1FB; stroke: #185FA5; stroke-width: 1; }
.tail-v { fill: #E6F1FB; stroke: #185FA5; stroke-width: 1; }
.tail-h { fill: #E6F1FB; stroke: #185FA5; stroke-width: 1; }
/* Interior sections */
.cockpit { fill: #EEEDFE; stroke: #534AB7; stroke-width: 1; }
.first-class { fill: #FBEAF0; stroke: #993556; stroke-width: 0.5; }
.business-class { fill: #FAECE7; stroke: #993C1D; stroke-width: 0.5; }
.economy-class { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 0.5; }
.cargo { fill: #D3D1C7; stroke: #5F5E5A; stroke-width: 0.5; }
/* Systems */
.engine { fill: #FAECE7; stroke: #993C1D; stroke-width: 1; }
.fuel-tank { fill: #FAEEDA; stroke: #854F0B; stroke-width: 0.5; stroke-dasharray: 3 2; }
.flap { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 0.5; }
/* Mechanical */
.gear { fill: #444441; stroke: #2C2C2A; stroke-width: 0.5; }
.wheel { fill: #2C2C2A; stroke: #1a1a18; stroke-width: 0.5; }
```
## Shape Selection Guide
| Physical form | SVG element | Example |
|---------------|-------------|---------|
| Curved body | `<path>` with Q (quadratic) or C (cubic) curves | Fuselage, nose cone |
| Tapered/angular | `<polygon>` | Wings, stabilizers |
| Cylindrical | `<ellipse>` | Engines, wheels, tanks |
| Linear structure | `<line>` | Struts, pylons, gear legs |
| Internal sections | `<rect>` inside parent shape | Cabin classes |
| Dashed boundaries | `stroke-dasharray` on any shape | Fuel tanks, control surfaces |
## Layout Notes
- **ViewBox**: 680×400 (wider aspect ratio suits side profile)
- **Layering**: Draw outer structures first, then interior details on top
- **Leader lines**: Use `.leader` class (dashed) to connect labels to components
- **Text sizes**: Use `.tl` (10px) for component labels, `.ts` (12px) for section labels
- **Semantic colors**: Group by system (structure=blue, propulsion=coral, fuel=amber, etc.)
@@ -0,0 +1,236 @@
# Out-of-Order CPU Core Microarchitecture
A structural diagram showing the internal pipeline stages of a modern superscalar out-of-order CPU core. Demonstrates multi-stage vertical flow with parallel paths, fan-out patterns for execution ports, and a separate memory hierarchy sidebar.
## Key Patterns Used
- **Multi-stage vertical flow**: Six pipeline stages (Front End → Rename → Schedule → Execute → Retire)
- **Parallel decode paths**: Main decode and µop cache bypass (dashed line for cache hit)
- **Container grouping**: Logical stages grouped in colored containers
- **Fan-out pattern**: Single scheduler dispatching to 6 execution ports
- **Sidebar layout**: Memory hierarchy placed in separate column on right
- **Stage labels**: Left-aligned labels indicating pipeline phase
- **Color-coded semantics**: Different colors for each functional unit category
## Diagram Type
This is a **hybrid structural/flow** diagram:
- **Flow aspect**: Instructions move top-to-bottom through pipeline stages
- **Structural aspect**: Components are grouped by function (rename unit, execution cluster)
- **Sidebar**: Memory hierarchy is architecturally separate but connected via data paths
## Pipeline Stage Breakdown
### Front End (Purple)
```xml
<!-- Fetch Unit -->
<g class="node c-purple">
<rect x="40" y="70" width="140" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="110" y="90" text-anchor="middle" dominant-baseline="central">Fetch unit</text>
<text class="ts" x="110" y="110" text-anchor="middle" dominant-baseline="central">6-wide, 32B/cycle</text>
</g>
<!-- Branch Predictor (subordinate) -->
<g class="node c-purple">
<rect x="40" y="140" width="140" height="44" rx="8" stroke-width="0.5"/>
<text class="th" x="110" y="162" text-anchor="middle" dominant-baseline="central">Branch predictor</text>
</g>
<!-- Decode -->
<g class="node c-purple">
<rect x="230" y="70" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="310" y="90" text-anchor="middle" dominant-baseline="central">Decode</text>
<text class="ts" x="310" y="110" text-anchor="middle" dominant-baseline="central">x86 → µops, 6-wide</text>
</g>
```
### µop Cache Bypass Path (Teal)
The µop cache (Decoded Stream Buffer) provides an alternate path that bypasses the complex decoder:
```xml
<!-- µop Cache parallel to decode -->
<g class="node c-teal">
<rect x="230" y="150" width="160" height="50" rx="8" stroke-width="0.5"/>
<text class="th" x="310" y="168" text-anchor="middle" dominant-baseline="central">µop cache (DSB)</text>
<text class="ts" x="310" y="186" text-anchor="middle" dominant-baseline="central">4K entries, 8-wide</text>
</g>
<!-- Dashed bypass path indicating cache hit -->
<path d="M180 110 L205 110 L205 175 L230 175" fill="none" class="arr"
stroke-dasharray="4 3" marker-end="url(#arrow)"/>
<text class="tx" x="164" y="148" opacity=".6">hit</text>
```
### Rename/Allocate Container (Coral)
Groups related rename components in a container:
```xml
<!-- Outer container -->
<g class="c-coral">
<rect x="40" y="250" width="530" height="130" rx="12" stroke-width="0.5"/>
<text class="th" x="60" y="274">Rename / allocate</text>
<text class="ts" x="60" y="292">Map architectural → physical registers</text>
</g>
<!-- Inner components -->
<g class="node c-coral">
<rect x="60" y="310" width="180" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="150" y="330" text-anchor="middle" dominant-baseline="central">Register alias table</text>
<text class="ts" x="150" y="350" text-anchor="middle" dominant-baseline="central">180 physical regs</text>
</g>
```
### Scheduler Fan-Out Pattern (Amber → Teal)
Single unified scheduler dispatching to multiple execution ports:
```xml
<!-- Unified Scheduler -->
<g class="node c-amber">
<rect x="140" y="420" width="330" height="50" rx="8" stroke-width="0.5"/>
<text class="th" x="305" y="438" text-anchor="middle" dominant-baseline="central">Unified scheduler</text>
<text class="ts" x="305" y="456" text-anchor="middle" dominant-baseline="central">97 entries, out-of-order dispatch</text>
</g>
<!-- Fan-out arrows to 6 ports -->
<line x1="170" y1="470" x2="90" y2="540" class="arr" marker-end="url(#arrow)"/>
<line x1="215" y1="470" x2="170" y2="540" class="arr" marker-end="url(#arrow)"/>
<line x1="265" y1="470" x2="250" y2="540" class="arr" marker-end="url(#arrow)"/>
<line x1="305" y1="470" x2="330" y2="540" class="arr" marker-end="url(#arrow)"/>
<line x1="355" y1="470" x2="410" y2="540" class="arr" marker-end="url(#arrow)"/>
<line x1="420" y1="470" x2="490" y2="540" class="arr" marker-end="url(#arrow)"/>
```
### Execution Port Box Pattern
Compact boxes showing port number and capabilities:
```xml
<!-- Execution port with multi-line capability -->
<g class="node c-teal">
<rect x="55" y="540" width="70" height="64" rx="6" stroke-width="0.5"/>
<text class="th" x="90" y="560" text-anchor="middle" dominant-baseline="central">Port 0</text>
<text class="tx" x="90" y="576" text-anchor="middle" dominant-baseline="central">ALU</text>
<text class="tx" x="90" y="590" text-anchor="middle" dominant-baseline="central">DIV</text>
</g>
```
### Reorder Buffer (Pink)
Wide horizontal bar at bottom showing retirement:
```xml
<g class="c-pink">
<rect x="40" y="670" width="530" height="40" rx="10" stroke-width="0.5"/>
<text class="th" x="305" y="694" text-anchor="middle" dominant-baseline="central">Reorder buffer (ROB) — 512 entries, 8-wide retire</text>
</g>
```
### Memory Hierarchy Sidebar (Blue)
Separate column showing cache levels:
```xml
<!-- Container -->
<g class="c-blue">
<rect x="600" y="30" width="190" height="360" rx="16" stroke-width="0.5"/>
<text class="th" x="695" y="54" text-anchor="middle">Memory hierarchy</text>
</g>
<!-- Cache levels stacked vertically -->
<g class="node c-blue">
<rect x="620" y="70" width="150" height="50" rx="8" stroke-width="0.5"/>
<text class="th" x="695" y="88" text-anchor="middle" dominant-baseline="central">L1-I cache</text>
<text class="ts" x="695" y="106" text-anchor="middle" dominant-baseline="central">32 KB, 8-way</text>
</g>
<!-- Additional levels follow same pattern -->
```
## Connection Patterns
### Instruction Fetch Path
Horizontal arrow from L1-I cache to fetch unit:
```xml
<path d="M620 95 L200 95" fill="none" class="arr" marker-end="url(#arrow)"/>
<text class="tx" x="410" y="88" text-anchor="middle" opacity=".6">instruction fetch</text>
```
### Load/Store Path
Complex path from execution ports to L1-D cache:
```xml
<path d="M250 604 L250 640 L580 640 L580 160 L620 160" fill="none" class="arr" marker-end="url(#arrow)"/>
<text class="tx" x="415" y="652" text-anchor="middle" opacity=".6">load / store</text>
```
### Commit Path (dashed)
Dashed line showing write-back from ROB to register file:
```xml
<path d="M550 690 L580 690 L580 445 L595 445" fill="none" class="arr" stroke-dasharray="4 3"/>
<text class="tx" x="590" y="578" opacity=".6" transform="rotate(-90 590 578)">commit</text>
```
### Path Merge (Decode + µop Cache)
Two paths converging before rename:
```xml
<line x1="390" y1="98" x2="430" y2="98" class="arr"/>
<line x1="390" y1="175" x2="430" y2="175" class="arr"/>
<path d="M430 98 L430 175" fill="none" stroke="var(--text-secondary)" stroke-width="1.5"/>
<line x1="430" y1="136" x2="470" y2="136" class="arr" marker-end="url(#arrow)"/>
```
## Text Classes
This diagram uses an additional text class for very small labels:
```css
.tx { font-family: system-ui, -apple-system, sans-serif; font-size: 10px; fill: var(--text-secondary); }
```
Used for:
- Execution port capability labels (ALU, Branch, Load, etc.)
- Connection labels (instruction fetch, load/store, commit)
- DRAM latency annotation
## Color Semantic Mapping
| Color | Stage | Components |
|-------|-------|------------|
| `c-purple` | Front end | Fetch, Branch predictor, Decode |
| `c-teal` | Execution | µop cache, Execution ports |
| `c-coral` | Rename | RAT, Physical RF, Free list |
| `c-amber` | Schedule | Unified scheduler |
| `c-pink` | Retire | Reorder buffer |
| `c-blue` | Memory | L1-I, L1-D, L2, DRAM |
| `c-gray` | External | Off-chip DRAM |
## Layout Notes
- **ViewBox**: 820×720 (taller than wide for vertical pipeline flow)
- **Main pipeline**: x=40 to x=570 (530px width)
- **Memory sidebar**: x=600 to x=790 (190px width)
- **Stage labels**: x=30, left-aligned, 50% opacity
- **Vertical spacing**: ~80-100px between major stages
- **Container padding**: 20px inside containers
- **Port spacing**: 80px between execution port centers
- **Legend**: Bottom-right of memory sidebar, explains color coding
## Architectural Details Shown
| Component | Specification | Notes |
|-----------|---------------|-------|
| Fetch | 6-wide, 32B/cycle | Typical modern Intel/AMD |
| Decode | 6-wide, x86→µops | Complex decoder |
| µop Cache | 4K entries, 8-wide | Bypass for hot code |
| RAT | 180 physical regs | Supports deep OoO |
| Scheduler | 97 entries | Unified RS |
| Execution | 6 ports | ALU×2, Load, Store×2, Vector |
| ROB | 512 entries, 8-wide | In-order retirement |
| L1-I | 32 KB, 8-way | Instruction cache |
| L1-D | 48 KB, 12-way | Data cache |
| L2 | 1.25 MB, 20-way | Unified |
| DRAM | DDR5-6400, ~80ns | Off-chip |
## When to Use This Pattern
Use this diagram style for:
- CPU/GPU microarchitecture visualization
- Compiler pipeline stages
- Network packet processing pipelines
- Any system with parallel execution units fed by a scheduler
- Hardware designs with multiple functional units
@@ -0,0 +1,182 @@
# Electricity Grid: Generation to Consumption
A left-to-right flow diagram showing electricity from multiple generation sources through transmission and distribution networks to end consumers. Demonstrates multi-stage flow layout, voltage level visual hierarchy, and smart grid data overlay.
## Key Patterns Used
- **Multi-stage horizontal flow**: Four distinct columns (Generation → Transmission → Distribution → Consumption)
- **Stage dividers**: Vertical dashed lines separating each phase
- **Voltage level hierarchy**: Different line weights/colors for HV, MV, LV
- **Smart grid data overlay**: Dashed data flow lines from control center
- **Capacity labels**: Power ratings on generation sources
- **Multiple source convergence**: Four generators feeding into single transmission grid
## New Shape Techniques
### Nuclear Plant (cooling tower + reactor)
```xml
<!-- Cooling tower (hyperbolic curve) -->
<path class="nuclear-tower" d="M 25 80 Q 15 60 20 40 Q 25 20 40 15 Q 55 20 60 40 Q 65 60 55 80 Z"/>
<!-- Steam clouds -->
<ellipse class="nuclear-steam" cx="40" cy="8" rx="12" ry="6"/>
<!-- Reactor dome -->
<rect class="nuclear-building" x="65" y="45" width="40" height="35" rx="3"/>
<ellipse class="nuclear-building" cx="85" cy="45" rx="20" ry="8"/>
```
### Gas Peaker Plant (with flames)
```xml
<rect class="gas-plant" x="0" y="25" width="70" height="40" rx="3"/>
<!-- Smokestacks -->
<rect class="gas-stack" x="15" y="5" width="8" height="25" rx="1"/>
<!-- Flame -->
<path class="gas-flame" d="M 19 5 Q 17 0 19 -3 Q 21 0 19 5"/>
<!-- Turbine housing -->
<ellipse class="gas-plant" cx="55" cy="45" rx="12" ry="8"/>
```
### Transmission Pylon with Insulators
```xml
<!-- Tapered tower -->
<polygon class="pylon" points="20,0 25,0 30,80 15,80"/>
<!-- Cross arms -->
<line class="pylon-arm" x1="5" y1="10" x2="40" y2="10"/>
<line class="pylon-arm" x1="8" y1="25" x2="37" y2="25"/>
<!-- Insulators (where lines attach) -->
<circle class="insulator" cx="8" cy="10" r="3"/>
<circle class="insulator" cx="37" cy="10" r="3"/>
```
### Transformer Symbol
```xml
<!-- Two coils with core -->
<circle class="transformer-coil" cx="25" cy="25" r="12"/>
<circle class="transformer-coil" cx="55" cy="25" r="12"/>
<rect class="transformer-core" x="35" y="15" width="10" height="20" rx="2"/>
<!-- Busbars -->
<line x1="0" y1="15" x2="-10" y2="15" stroke="#EF9F27" stroke-width="3"/>
```
### Pole-mounted Transformer
```xml
<rect class="pole" x="18" y="0" width="4" height="60"/>
<line x1="10" y1="8" x2="30" y2="8" stroke="#854F0B" stroke-width="2"/>
<rect class="dist-transformer" x="8" y="15" width="24" height="18" rx="2"/>
<line class="lv-line" x1="20" y1="33" x2="20" y2="60"/>
```
### House with Roof
```xml
<rect class="home" x="0" y="25" width="35" height="30" rx="2"/>
<polygon class="home-roof" points="0,25 17,8 35,25"/>
<!-- Door -->
<rect x="8" y="35" width="8" height="15" fill="#085041"/>
<!-- Window -->
<rect x="22" y="32" width="8" height="8" fill="#9FE1CB"/>
```
### Factory Building
```xml
<rect class="factory" x="0" y="15" width="90" height="50" rx="3"/>
<!-- Smokestacks -->
<rect class="factory-stack" x="15" y="0" width="10" height="20"/>
<!-- Windows row -->
<rect x="10" y="30" width="15" height="12" fill="#F5C4B3"/>
<rect x="30" y="30" width="15" height="12" fill="#F5C4B3"/>
<!-- Loading dock -->
<rect x="55" y="50" width="30" height="15" fill="#993C1D"/>
```
### EV Charger with Car
```xml
<!-- Charging station -->
<rect class="ev-charger" x="20" y="0" width="25" height="45" rx="3"/>
<rect x="24" y="5" width="17" height="12" rx="1" fill="#3C3489"/>
<!-- Cable -->
<path d="M 32 20 Q 32 35 45 40" stroke="#534AB7" stroke-width="2" fill="none"/>
<circle cx="45" cy="40" r="4" fill="#534AB7"/>
<!-- Status light -->
<circle cx="32" cy="38" r="3" fill="#97C459"/>
<!-- EV Car -->
<path class="ev-car" d="M 5 20 L 5 12 Q 5 5 15 5 L 45 5 Q 55 5 55 12 L 55 20 Z"/>
<!-- Windows -->
<rect x="10" y="8" width="15" height="8" rx="2" fill="#534AB7"/>
<!-- Wheels -->
<circle cx="15" cy="22" r="5" fill="#2C2C2A"/>
<!-- Charging bolt icon -->
<path d="M 28 12 L 32 8 L 30 11 L 34 11 L 30 16 L 32 13 Z" fill="#97C459"/>
```
## Voltage Level Line Styles
```css
/* High voltage (transmission) - thick, bright */
.hv-line { stroke: #EF9F27; stroke-width: 2.5; fill: none; }
/* Medium voltage (distribution) - medium */
.mv-line { stroke: #BA7517; stroke-width: 2; fill: none; }
/* Low voltage (consumer) - thin, darker */
.lv-line { stroke: #854F0B; stroke-width: 1.5; fill: none; }
/* Smart grid data - dashed purple */
.data-flow { stroke: #7F77DD; stroke-width: 1; fill: none; stroke-dasharray: 3 2; opacity: 0.7; }
```
## Flow Arrow Marker
```xml
<defs>
<marker id="flow-arrow" viewBox="0 0 10 10" refX="9" refY="5"
markerWidth="6" markerHeight="6" orient="auto">
<path d="M0,0 L10,5 L0,10 Z" fill="#EF9F27"/>
</marker>
</defs>
<!-- Usage -->
<line x1="140" y1="105" x2="210" y2="105" class="hv-line" marker-end="url(#flow-arrow)"/>
```
## CSS Classes
```css
/* Generation */
.nuclear-tower { fill: #B4B2A9; stroke: #5F5E5A; stroke-width: 1; }
.nuclear-building { fill: #EEEDFE; stroke: #534AB7; stroke-width: 1; }
.solar-panel { fill: #3C3489; stroke: #534AB7; stroke-width: 0.5; }
.wind-tower { fill: #B4B2A9; stroke: #5F5E5A; stroke-width: 1; }
.wind-blade { fill: #F1EFE8; stroke: #888780; stroke-width: 0.5; }
.gas-plant { fill: #FAECE7; stroke: #993C1D; stroke-width: 1; }
.gas-flame { fill: #EF9F27; }
/* Transmission */
.pylon { fill: #5F5E5A; stroke: #444441; stroke-width: 0.5; }
.insulator { fill: #FAEEDA; stroke: #854F0B; stroke-width: 0.5; }
.substation { fill: #E6F1FB; stroke: #185FA5; stroke-width: 1; }
.transformer-coil { fill: none; stroke: #185FA5; stroke-width: 1.5; }
/* Distribution */
.pole { fill: #854F0B; stroke: #633806; stroke-width: 0.5; }
.dist-transformer { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 1; }
/* Consumption */
.home { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 1; }
.home-roof { fill: #0F6E56; stroke: #085041; stroke-width: 0.5; }
.factory { fill: #FAECE7; stroke: #993C1D; stroke-width: 1; }
.ev-charger { fill: #EEEDFE; stroke: #534AB7; stroke-width: 1; }
.ev-car { fill: #3C3489; stroke: #534AB7; stroke-width: 0.5; }
/* Smart grid */
.smart-grid { fill: #EEEDFE; stroke: #534AB7; stroke-width: 1.5; }
```
## Layout Notes
- **ViewBox**: 820×520 (wide for 4-column layout)
- **Column widths**: ~200px per stage
- **Stage dividers**: Vertical dashed lines at x=200, 420, 620
- **Stage labels**: Top of diagram, uppercase for emphasis
- **Flow direction**: Left-to-right with arrows showing power flow
- **Data overlay**: Smart grid data lines use different style (dashed purple) to distinguish from power lines
- **Capacity labels**: Show MW ratings on generators for context
- **Voltage labels**: Show transformation ratios at substations
@@ -0,0 +1,172 @@
# Feature Film Production Pipeline
A phased workflow showing the five stages of filmmaking, using containers with inner nodes and horizontal sub-flows within a phase.
## Key Patterns Used
- **Phase containers**: Large rounded rectangles with neutral background and dashed borders
- **Inner task nodes**: Smaller colored nodes inside containers for sub-tasks
- **Horizontal flow within container**: Post-production shows sequential pipeline with arrows (Editing → Color → VFX → Sound → Score)
- **Consistent phase spacing**: ~30px gap between phase containers
- **Phase labels with subtitles**: Each container has title + description
## Diagram
```xml
<svg width="100%" viewBox="0 0 680 780" xmlns="http://www.w3.org/2000/svg">
<defs>
<marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
</defs>
<!-- Phase 1: Development -->
<g>
<rect x="40" y="30" width="600" height="110" rx="16" stroke-width="1" stroke-dasharray="6 4" fill="var(--bg-secondary)" stroke="var(--border)"/>
<text class="th" x="66" y="56">Development</text>
<text class="ts" x="66" y="74">Concept to greenlight</text>
</g>
<g class="node c-purple">
<rect x="70" y="90" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="150" y="108" text-anchor="middle" dominant-baseline="central">Script / screenplay</text>
</g>
<g class="node c-purple">
<rect x="260" y="90" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="340" y="108" text-anchor="middle" dominant-baseline="central">Financing / budget</text>
</g>
<g class="node c-purple">
<rect x="450" y="90" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="530" y="108" text-anchor="middle" dominant-baseline="central">Casting leads</text>
</g>
<!-- Arrow to Phase 2 -->
<line x1="340" y1="140" x2="340" y2="170" class="arr" marker-end="url(#arrow)"/>
<!-- Phase 2: Pre-production -->
<g>
<rect x="40" y="170" width="600" height="110" rx="16" stroke-width="1" stroke-dasharray="6 4" fill="var(--bg-secondary)" stroke="var(--border)"/>
<text class="th" x="66" y="196">Pre-production</text>
<text class="ts" x="66" y="214">Planning and preparation</text>
</g>
<g class="node c-teal">
<rect x="70" y="230" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="150" y="248" text-anchor="middle" dominant-baseline="central">Storyboards</text>
</g>
<g class="node c-teal">
<rect x="260" y="230" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="340" y="248" text-anchor="middle" dominant-baseline="central">Location scouting</text>
</g>
<g class="node c-teal">
<rect x="450" y="230" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="530" y="248" text-anchor="middle" dominant-baseline="central">Crew hiring</text>
</g>
<!-- Arrow to Phase 3 -->
<line x1="340" y1="280" x2="340" y2="310" class="arr" marker-end="url(#arrow)"/>
<!-- Phase 3: Production -->
<g>
<rect x="40" y="310" width="600" height="110" rx="16" stroke-width="1" stroke-dasharray="6 4" fill="var(--bg-secondary)" stroke="var(--border)"/>
<text class="th" x="66" y="336">Production</text>
<text class="ts" x="66" y="354">Principal photography</text>
</g>
<g class="node c-coral">
<rect x="70" y="370" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="150" y="388" text-anchor="middle" dominant-baseline="central">Filming / shooting</text>
</g>
<g class="node c-coral">
<rect x="260" y="370" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="340" y="388" text-anchor="middle" dominant-baseline="central">Production sound</text>
</g>
<g class="node c-coral">
<rect x="450" y="370" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="530" y="388" text-anchor="middle" dominant-baseline="central">VFX plates</text>
</g>
<!-- Arrow to Phase 4 -->
<line x1="340" y1="420" x2="340" y2="450" class="arr" marker-end="url(#arrow)"/>
<!-- Phase 4: Post-production -->
<g>
<rect x="40" y="450" width="600" height="150" rx="16" stroke-width="1" stroke-dasharray="6 4" fill="var(--bg-secondary)" stroke="var(--border)"/>
<text class="th" x="66" y="476">Post-production</text>
<text class="ts" x="66" y="494">Assembly and finishing</text>
</g>
<g class="node c-amber">
<rect x="70" y="510" width="110" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="125" y="528" text-anchor="middle" dominant-baseline="central">Editing</text>
</g>
<g class="node c-amber">
<rect x="195" y="510" width="110" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="250" y="528" text-anchor="middle" dominant-baseline="central">Color grade</text>
</g>
<g class="node c-amber">
<rect x="320" y="510" width="90" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="365" y="528" text-anchor="middle" dominant-baseline="central">VFX</text>
</g>
<g class="node c-amber">
<rect x="425" y="510" width="100" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="475" y="528" text-anchor="middle" dominant-baseline="central">Sound mix</text>
</g>
<g class="node c-amber">
<rect x="540" y="510" width="80" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="580" y="528" text-anchor="middle" dominant-baseline="central">Score</text>
</g>
<!-- Flow arrows within post -->
<line x1="180" y1="528" x2="195" y2="528" class="arr" marker-end="url(#arrow)"/>
<line x1="305" y1="528" x2="320" y2="528" class="arr" marker-end="url(#arrow)"/>
<line x1="410" y1="528" x2="425" y2="528" class="arr" marker-end="url(#arrow)"/>
<line x1="525" y1="528" x2="540" y2="528" class="arr" marker-end="url(#arrow)"/>
<!-- Final delivery label -->
<g class="node c-amber">
<rect x="240" y="556" width="200" height="32" rx="6" stroke-width="0.5"/>
<text class="ts" x="340" y="572" text-anchor="middle" dominant-baseline="central">Final master / DCP</text>
</g>
<line x1="340" y1="546" x2="340" y2="556" class="arr" marker-end="url(#arrow)"/>
<!-- Arrow to Phase 5 -->
<line x1="340" y1="600" x2="340" y2="630" class="arr" marker-end="url(#arrow)"/>
<!-- Phase 5: Distribution -->
<g>
<rect x="40" y="630" width="600" height="110" rx="16" stroke-width="1" stroke-dasharray="6 4" fill="var(--bg-secondary)" stroke="var(--border)"/>
<text class="th" x="66" y="656">Distribution</text>
<text class="ts" x="66" y="674">Release and exhibition</text>
</g>
<g class="node c-blue">
<rect x="70" y="690" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="150" y="708" text-anchor="middle" dominant-baseline="central">Film festivals</text>
</g>
<g class="node c-blue">
<rect x="260" y="690" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="340" y="708" text-anchor="middle" dominant-baseline="central">Theatrical release</text>
</g>
<g class="node c-blue">
<rect x="450" y="690" width="160" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="530" y="708" text-anchor="middle" dominant-baseline="central">Streaming / VOD</text>
</g>
</svg>
```
## Color Assignments
| Element | Color | Reason |
|---------|-------|--------|
| Phase containers | Neutral (dashed) | Subtle grouping, doesn't compete with content |
| Development tasks | `c-purple` | Creative/concept work |
| Pre-production tasks | `c-teal` | Planning and preparation |
| Production tasks | `c-coral` | Active filming (main event) |
| Post-production tasks | `c-amber` | Processing/refinement |
| Distribution tasks | `c-blue` | Outward delivery/release |
## Layout Notes
- **ViewBox**: 680×780 (standard width, tall for 5 phases)
- **Container style**: Dashed border (`stroke-dasharray="6 4"`), neutral fill (`var(--bg-secondary)`), `stroke-width="1"`
- **Container height**: 110px for 3-node phases, 150px for post-production (more complex)
- **Inner node dimensions**: 160×36px for standard tasks, variable width for post-production sequential flow
- **Phase gap**: 30px between containers
- **Horizontal sub-flow**: Post-production uses tightly packed nodes with arrows between them to show sequence
- **Convergence node**: "Final master / DCP" sits below the horizontal flow, collecting all post outputs
@@ -0,0 +1,165 @@
# Hospital Emergency Department Flow
A multi-path flowchart showing patient journey through an emergency department with priority-based routing using semantic colors (red=critical, amber=urgent, green=stable).
## Key Patterns Used
- **Semantic color coding**: Red/amber/green for priority levels (not arbitrary decoration)
- **Stage labels**: Left-aligned faded labels marking workflow phases
- **Convergent paths**: Multiple entry points merging, then branching, then converging again
- **Nested containers**: Diagnostics grouped in a container with inner nodes
- **Legend**: Color key at bottom explaining priority levels
## Diagram
```xml
<svg width="100%" viewBox="0 0 680 620" xmlns="http://www.w3.org/2000/svg">
<defs>
<marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
</defs>
<!-- Stage labels -->
<text class="ts" x="40" y="68" text-anchor="start" opacity=".5">Arrival</text>
<text class="ts" x="40" y="168" text-anchor="start" opacity=".5">Assessment</text>
<text class="ts" x="40" y="288" text-anchor="start" opacity=".5">Priority routing</text>
<text class="ts" x="40" y="418" text-anchor="start" opacity=".5">Diagnostics</text>
<text class="ts" x="40" y="518" text-anchor="start" opacity=".5">Outcome</text>
<!-- Arrival: Ambulance -->
<g class="node c-gray">
<rect x="140" y="40" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="220" y="60" text-anchor="middle" dominant-baseline="central">Ambulance</text>
<text class="ts" x="220" y="80" text-anchor="middle" dominant-baseline="central">Emergency transport</text>
</g>
<!-- Arrival: Walk-in -->
<g class="node c-gray">
<rect x="380" y="40" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="460" y="60" text-anchor="middle" dominant-baseline="central">Walk-in</text>
<text class="ts" x="460" y="80" text-anchor="middle" dominant-baseline="central">Self-arrival</text>
</g>
<!-- Arrows to Triage -->
<line x1="220" y1="96" x2="300" y2="140" class="arr" marker-end="url(#arrow)"/>
<line x1="460" y1="96" x2="380" y2="140" class="arr" marker-end="url(#arrow)"/>
<!-- Triage -->
<g class="node c-purple">
<rect x="240" y="140" width="200" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="340" y="160" text-anchor="middle" dominant-baseline="central">Triage</text>
<text class="ts" x="340" y="180" text-anchor="middle" dominant-baseline="central">Nurse assessment, vitals</text>
</g>
<!-- Arrows from Triage to Priority -->
<line x1="280" y1="196" x2="140" y2="260" class="arr" marker-end="url(#arrow)"/>
<line x1="340" y1="196" x2="340" y2="260" class="arr" marker-end="url(#arrow)"/>
<line x1="400" y1="196" x2="540" y2="260" class="arr" marker-end="url(#arrow)"/>
<!-- Priority: Red - Trauma -->
<g class="node c-red">
<rect x="60" y="260" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="140" y="280" text-anchor="middle" dominant-baseline="central">Trauma bay</text>
<text class="ts" x="140" y="300" text-anchor="middle" dominant-baseline="central">Priority: critical</text>
</g>
<!-- Priority: Yellow - Exam rooms -->
<g class="node c-amber">
<rect x="260" y="260" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="340" y="280" text-anchor="middle" dominant-baseline="central">Exam rooms</text>
<text class="ts" x="340" y="300" text-anchor="middle" dominant-baseline="central">Priority: urgent</text>
</g>
<!-- Priority: Green - Waiting -->
<g class="node c-green">
<rect x="460" y="260" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="540" y="280" text-anchor="middle" dominant-baseline="central">Waiting area</text>
<text class="ts" x="540" y="300" text-anchor="middle" dominant-baseline="central">Priority: stable</text>
</g>
<!-- Arrows to Diagnostics -->
<line x1="140" y1="316" x2="220" y2="390" class="arr" marker-end="url(#arrow)"/>
<line x1="340" y1="316" x2="340" y2="390" class="arr" marker-end="url(#arrow)"/>
<line x1="540" y1="316" x2="460" y2="390" class="arr" marker-end="url(#arrow)"/>
<!-- Diagnostics container -->
<g class="c-teal">
<rect x="140" y="390" width="400" height="56" rx="12" stroke-width="0.5"/>
</g>
<!-- Labs -->
<g class="node c-teal">
<rect x="160" y="400" width="110" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="215" y="418" text-anchor="middle" dominant-baseline="central">Labs</text>
</g>
<!-- Imaging -->
<g class="node c-teal">
<rect x="285" y="400" width="110" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="340" y="418" text-anchor="middle" dominant-baseline="central">Imaging</text>
</g>
<!-- Diagnosis -->
<g class="node c-teal">
<rect x="410" y="400" width="110" height="36" rx="6" stroke-width="0.5"/>
<text class="ts" x="465" y="418" text-anchor="middle" dominant-baseline="central">Diagnosis</text>
</g>
<!-- Arrows to Outcomes -->
<line x1="215" y1="446" x2="160" y2="490" class="arr" marker-end="url(#arrow)"/>
<line x1="340" y1="446" x2="340" y2="490" class="arr" marker-end="url(#arrow)"/>
<line x1="465" y1="446" x2="520" y2="490" class="arr" marker-end="url(#arrow)"/>
<!-- Outcome: Admission -->
<g class="node c-coral">
<rect x="80" y="490" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="160" y="510" text-anchor="middle" dominant-baseline="central">Admission</text>
<text class="ts" x="160" y="530" text-anchor="middle" dominant-baseline="central">Inpatient ward</text>
</g>
<!-- Outcome: Surgery -->
<g class="node c-coral">
<rect x="260" y="490" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="340" y="510" text-anchor="middle" dominant-baseline="central">Surgery</text>
<text class="ts" x="340" y="530" text-anchor="middle" dominant-baseline="central">Operating room</text>
</g>
<!-- Outcome: Discharge -->
<g class="node c-coral">
<rect x="440" y="490" width="160" height="56" rx="8" stroke-width="0.5"/>
<text class="th" x="520" y="510" text-anchor="middle" dominant-baseline="central">Discharge</text>
<text class="ts" x="520" y="530" text-anchor="middle" dominant-baseline="central">Home with instructions</text>
</g>
<!-- Legend -->
<text class="ts" x="140" y="580" opacity=".5">Priority levels</text>
<g class="c-red"><rect x="140" y="592" width="14" height="14" rx="3" stroke-width="0.5"/></g>
<text class="ts" x="162" y="604">Critical</text>
<g class="c-amber"><rect x="240" y="592" width="14" height="14" rx="3" stroke-width="0.5"/></g>
<text class="ts" x="262" y="604">Urgent</text>
<g class="c-green"><rect x="340" y="592" width="14" height="14" rx="3" stroke-width="0.5"/></g>
<text class="ts" x="362" y="604">Stable</text>
</svg>
```
## Color Assignments
| Element | Color | Reason |
|---------|-------|--------|
| Entry points (Ambulance, Walk-in) | `c-gray` | Neutral starting points |
| Triage | `c-purple` | Processing/assessment step |
| Trauma bay | `c-red` | Critical priority (semantic) |
| Exam rooms | `c-amber` | Urgent priority (semantic) |
| Waiting area | `c-green` | Stable priority (semantic) |
| Diagnostics | `c-teal` | Clinical services category |
| Outcomes | `c-coral` | Final disposition category |
## Layout Notes
- **ViewBox**: 680×620 (standard width, extended height for 5 stages)
- **Stage spacing**: ~110-130px between stage rows
- **Diagonal arrows**: Connect nodes across columns naturally
- **Container with inner nodes**: Diagnostics uses outer `c-teal` rect with inner node rects
@@ -0,0 +1,114 @@
# ML Benchmark Grouped Bar Chart with Dual Axis
A quantitative data visualization comparing LLM inference speed across quantization levels with dual Y-axes, threshold markers, and an inset accuracy table.
## Key Patterns Used
- **Grouped bars**: Min/max range pairs per category using semantic color pairs (lighter=min, darker=max)
- **Dual Y-axis**: Left axis for primary metric (tok/s), right axis for secondary metric (VRAM GB)
- **Overlay line graph**: `<polyline>` with labeled dots showing VRAM usage across categories
- **Threshold marker**: Dashed red horizontal line indicating hardware limit (24 GB GPU)
- **Zone annotations**: Subtle text labels above/below threshold for context
- **Inset data table**: Alternating row fills below chart with quantitative accuracy data
- **Semantic color coding**: Each quantization level gets its own color from the skill palette (red=OOM, amber=slow, teal=sweet spot, blue=fast)
## Diagram Type
This is a **quantitative data chart** with:
- **Grouped vertical bars**: Range bars showing minmax performance per category
- **Secondary axis line**: VRAM usage overlaid as a connected scatter plot
- **Threshold annotation**: Hardware constraint line
- **Inset table**: Supporting accuracy metrics
## Chart Layout Formula
```
Chart area: x=90590, y=70410 (500px wide, 340px tall)
Left Y-axis: Primary metric (tok/s)
y = 410 (val / max_val) × 340
Right Y-axis: Secondary metric (VRAM GB)
Same formula, different scale labels
Groups: Divide width by number of categories
Bars: Each group → min bar (34px) + 8px gap + max bar (34px)
Line overlay: <polyline> connecting data points across group centers
Threshold: Horizontal dashed line at critical value
Table: Below chart, alternating row fills
```
## Data Mapped
| Quantization | Model Size | Speed (tok/s) | VRAM (GB) | MMLU Pro | Status |
|-------------|-----------|---------------|-----------|----------|--------|
| FP16 | 62 GB | 0.52 | 62 | 75.2 | OOM / unusable |
| Q8_0 | 32 GB | 35 | 32 | 75.0 | Partial offload |
| Q4_K_M | 16.8 GB | 812 | 16.8 | 73.1 | Fits in VRAM ✓ |
| IQ3_M | 12 GB | 1215 | 12 | 70.5 | Full GPU speed |
## Bar CSS Classes
```css
/* Light mode */
.bar-fp16-min { fill: #FCEBEB; stroke: #A32D2D; stroke-width: 0.75; }
.bar-fp16-max { fill: #F7C1C1; stroke: #A32D2D; stroke-width: 0.75; }
.bar-q8-min { fill: #FAEEDA; stroke: #854F0B; stroke-width: 0.75; }
.bar-q8-max { fill: #FAC775; stroke: #854F0B; stroke-width: 0.75; }
.bar-q4-min { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 0.75; }
.bar-q4-max { fill: #9FE1CB; stroke: #0F6E56; stroke-width: 0.75; }
.bar-iq3-min { fill: #E6F1FB; stroke: #185FA5; stroke-width: 0.75; }
.bar-iq3-max { fill: #B5D4F4; stroke: #185FA5; stroke-width: 0.75; }
/* Dark mode */
@media (prefers-color-scheme: dark) {
.bar-fp16-min { fill: #501313; stroke: #F09595; }
.bar-fp16-max { fill: #791F1F; stroke: #F09595; }
.bar-q8-min { fill: #412402; stroke: #EF9F27; }
.bar-q8-max { fill: #633806; stroke: #EF9F27; }
.bar-q4-min { fill: #04342C; stroke: #5DCAA5; }
.bar-q4-max { fill: #085041; stroke: #5DCAA5; }
.bar-iq3-min { fill: #042C53; stroke: #85B7EB; }
.bar-iq3-max { fill: #0C447C; stroke: #85B7EB; }
}
```
## Overlay Line CSS
```css
.vram-line { stroke: #534AB7; stroke-width: 2.5; fill: none; }
.vram-dot { fill: #534AB7; stroke: var(--bg-primary); stroke-width: 2; }
.vram-label { font-family: system-ui, sans-serif; font-size: 10px; fill: #534AB7; font-weight: 500; }
```
## Threshold CSS
```css
.threshold { stroke: #A32D2D; stroke-width: 1; stroke-dasharray: 6 3; fill: none; }
.threshold-label { font-family: system-ui, sans-serif; font-size: 10px; fill: #A32D2D; font-weight: 500; }
```
## Table CSS
```css
.tbl-header { fill: var(--bg-secondary); stroke: var(--border); stroke-width: 0.5; }
.tbl-row { fill: transparent; stroke: var(--border); stroke-width: 0.25; }
.tbl-alt { fill: var(--bg-secondary); stroke: var(--border); stroke-width: 0.25; }
```
## Layout Notes
- **ViewBox**: 680×660 (portrait, chart + legend + table)
- **Chart area**: y=70410, x=90590
- **Legend row**: y=458470
- **Inset table**: y=490620
- **Bar width**: 34px each, 8px gap between min/max pair
- **Group spacing**: 125px center-to-center
- **Dot halo**: White circle (r=6) behind colored dot (r=5) for legibility over bars/grid
## When to Use This Pattern
Use this diagram style for:
- Model benchmark comparisons across quantization levels
- Performance vs. resource usage tradeoff analysis
- Any multi-metric comparison with a hardware/software constraint
- GPU/TPU/accelerator benchmarking dashboards
- Accuracy vs. speed Pareto frontiers
- Hardware requirement sizing charts
@@ -0,0 +1,325 @@
# Place Order — UML Sequence Diagram
A UML sequence diagram for the 'Place Order' use case in an e-commerce system. Six lifelines (:Customer, :ShoppingCart, :OrderController, :PaymentGateway, :InventorySystem, :EmailService) interact across 14 numbered messages. An **alt** combined fragment (amber) covers the three conditional outcomes — payment authorized, payment failed, and item unavailable. A **par** combined fragment (teal) nested inside the success branch shows concurrent email confirmation and stock-level update. Demonstrates activation bars, two distinct arrowhead types, UML pentagon fragment tags, and guard conditions.
## Key Patterns Used
- **6 lifelines at equal spacing**: Lifeline centers placed at x=90, 190, 290, 390, 490, 590 (100px apart) so the first box left-edge lands at x=40 and the last right-edge lands at x=640 — exactly filling the safe area
- **Two-row actor headers**: Each lifeline box shows `":"` (small, tertiary color) on one line and the class name (slightly larger, bold) on a second line, matching the UML anonymous-instance notation `:ClassName`
- **Two separate arrowhead markers**: `#arr-call` is a filled triangle (`<polygon>`) for synchronous calls; `#arr-ret` is an open chevron (`fill="none"`) for dashed return messages — both use `context-stroke` to inherit line color
- **Activation bars**: Narrow 8px-wide rectangles (`class="activation"`) layered on top of lifeline stems to show object execution periods; OrderController's bar spans the entire interaction; shorter bars mark PaymentGateway, InventorySystem, and EmailService during their active windows
- **Combined fragment pentagon tag**: Each `alt` / `par` frame uses a `<polygon>` dog-eared label shape in the top-left corner — points follow the pattern `(x,y) (x+w,y) (x+w+6,y+6) (x+w+6,y+18) (x,y+18)` creating the characteristic UML notch
- **Nested par inside alt**: The `par` rect (teal) sits inside branch 1 of the `alt` rect (amber); inner rect uses inset x/y (+15/+2) so both borders remain visible and distinguishable
- **Guard conditions**: Italic text in `[square brackets]` placed immediately after each alt frame divider line, or just inside the top frame for branch 1 — rendered with a dedicated `guard-lbl` class (italic, amber color)
- **Alt branch dividers**: Solid horizontal lines (`.frag-alt-div`) span the full alt rect width to separate the three branches; par branch separator uses a dashed line (`.frag-par-div`) per UML spec
- **Lifeline end caps**: Short 14px horizontal tick marks at y=590 (bottom of all lifeline stems) to formally terminate each lifeline
- **Message sequence annotation**: A faint counter row below the legend (①–③ / ④–⑩ / ⑪–⑫ / ⑬–⑭) explains the four message groups without adding noise to the diagram body
## Diagram
```xml
<svg width="100%" viewBox="0 0 680 648" xmlns="http://www.w3.org/2000/svg">
<defs>
<!-- Open chevron arrowhead — return messages -->
<marker id="arr-ret" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
<!-- Filled triangle arrowhead — synchronous calls -->
<marker id="arr-call" viewBox="0 0 10 10" refX="9" refY="5"
markerWidth="7" markerHeight="7" orient="auto">
<polygon points="0,1 10,5 0,9" fill="context-stroke"/>
</marker>
</defs>
<!--
Lifeline centres (x):
L1 :Customer → 90
L2 :ShoppingCart → 190
L3 :OrderController → 290
L4 :PaymentGateway → 390
L5 :InventorySystem → 490
L6 :EmailService → 590
Actor boxes: x = cx50, y=20, w=100, h=56, rx=6
Lifelines: x = cx, y1=76, y2=590
-->
<!-- ── 1. LIFELINE DASHED STEMS (drawn first, behind everything) ── -->
<line x1="90" y1="76" x2="90" y2="590" class="lifeline"/>
<line x1="190" y1="76" x2="190" y2="590" class="lifeline"/>
<line x1="290" y1="76" x2="290" y2="590" class="lifeline"/>
<line x1="390" y1="76" x2="390" y2="590" class="lifeline"/>
<line x1="490" y1="76" x2="490" y2="590" class="lifeline"/>
<line x1="590" y1="76" x2="590" y2="590" class="lifeline"/>
<!-- ── 2. ACTOR HEADER BOXES ── -->
<!-- :Customer -->
<rect x="40" y="20" width="100" height="56" rx="6" class="actor"/>
<text class="actor-colon" x="90" y="40" text-anchor="middle" dominant-baseline="central">:</text>
<text class="actor-name" x="90" y="58" text-anchor="middle" dominant-baseline="central">Customer</text>
<!-- :ShoppingCart -->
<rect x="140" y="20" width="100" height="56" rx="6" class="actor"/>
<text class="actor-colon" x="190" y="37" text-anchor="middle" dominant-baseline="central">:</text>
<text class="actor-name" x="190" y="55" text-anchor="middle" dominant-baseline="central">ShoppingCart</text>
<!-- :OrderController -->
<rect x="240" y="20" width="100" height="56" rx="6" class="actor"/>
<text class="actor-colon" x="290" y="37" text-anchor="middle" dominant-baseline="central">:</text>
<text class="actor-name" x="290" y="55" text-anchor="middle" dominant-baseline="central">OrderController</text>
<!-- :PaymentGateway -->
<rect x="340" y="20" width="100" height="56" rx="6" class="actor"/>
<text class="actor-colon" x="390" y="37" text-anchor="middle" dominant-baseline="central">:</text>
<text class="actor-name" x="390" y="55" text-anchor="middle" dominant-baseline="central">PaymentGateway</text>
<!-- :InventorySystem -->
<rect x="440" y="20" width="100" height="56" rx="6" class="actor"/>
<text class="actor-colon" x="490" y="37" text-anchor="middle" dominant-baseline="central">:</text>
<text class="actor-name" x="490" y="55" text-anchor="middle" dominant-baseline="central">InventorySystem</text>
<!-- :EmailService -->
<rect x="540" y="20" width="100" height="56" rx="6" class="actor"/>
<text class="actor-colon" x="590" y="37" text-anchor="middle" dominant-baseline="central">:</text>
<text class="actor-name" x="590" y="55" text-anchor="middle" dominant-baseline="central">EmailService</text>
<!-- ── 3. ACTIVATION BARS ── -->
<!-- ShoppingCart: active while forwarding checkout → placeOrder -->
<rect x="186" y="102" width="8" height="26" rx="1" class="activation"/>
<!-- OrderController: active throughout full sequence -->
<rect x="286" y="128" width="8" height="415" rx="1" class="activation"/>
<!-- PaymentGateway: active during auth check (happy-path branch only) -->
<rect x="386" y="154" width="8" height="46" rx="1" class="activation"/>
<!-- InventorySystem: active from reserveItems → updateStockLevels end -->
<rect x="486" y="225" width="8" height="128" rx="1" class="activation"/>
<!-- EmailService: active during confirmation send -->
<rect x="586" y="290" width="8" height="25" rx="1" class="activation"/>
<!-- ── 4. PRE-ALT MESSAGES ── -->
<!-- ① checkout() :Customer → :ShoppingCart -->
<line x1="90" y1="102" x2="186" y2="102" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="140" y="97" text-anchor="middle">checkout()</text>
<!-- ② placeOrder(cartItems) :ShoppingCart → :OrderController -->
<line x1="194" y1="128" x2="286" y2="128" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="242" y="123" text-anchor="middle">placeOrder(cartItems)</text>
<!-- ③ authorizePayment(amount) :OrderController → :PaymentGateway -->
<line x1="294" y1="154" x2="386" y2="154" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="342" y="149" text-anchor="middle">authorizePayment(amount)</text>
<!-- ── 5. ALT COMBINED FRAGMENT y=166 → y=563 ── -->
<!-- Outer alt rectangle -->
<rect x="45" y="166" width="590" height="397" rx="3" class="frag-alt-bg"/>
<!-- Pentagon "alt" tag: TL corner notch shape -->
<polygon points="45,166 84,166 90,173 90,185 45,185" class="frag-alt-tag"/>
<text class="frag-alt-kw" x="67" y="178" text-anchor="middle" dominant-baseline="central">alt</text>
<!-- Guard: branch 1 -->
<text class="guard-lbl" x="96" y="179" dominant-baseline="central">[payment authorized]</text>
<!-- ─── Branch 1: payment authorized ─── -->
<!-- ④ « authorized » :PaymentGateway → :OrderController (dashed return) -->
<line x1="386" y1="200" x2="294" y2="200" class="msg-ret" marker-end="url(#arr-ret)"/>
<text class="rlbl" x="342" y="195" text-anchor="middle">« authorized »</text>
<!-- ⑤ reserveItems(cartItems) :OrderController → :InventorySystem -->
<line x1="294" y1="225" x2="486" y2="225" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="392" y="220" text-anchor="middle">reserveItems(cartItems)</text>
<!-- ⑥ « itemsReserved » :InventorySystem → :OrderController (dashed return) -->
<line x1="486" y1="250" x2="294" y2="250" class="msg-ret" marker-end="url(#arr-ret)"/>
<text class="rlbl" x="392" y="245" text-anchor="middle">« itemsReserved »</text>
<!-- ── 6. PAR COMBINED FRAGMENT (nested inside alt branch 1) y=266 → y=373 ── -->
<!-- Inner par rectangle -->
<rect x="60" y="266" width="560" height="107" rx="3" class="frag-par-bg"/>
<!-- Pentagon "par" tag -->
<polygon points="60,266 97,266 102,272 102,284 60,284" class="frag-par-tag"/>
<text class="frag-par-kw" x="81" y="275" text-anchor="middle" dominant-baseline="central">par</text>
<!-- Par branch 1: email confirmation -->
<!-- ⑦ sendConfirmationEmail() :OrderController → :EmailService -->
<line x1="294" y1="295" x2="586" y2="295" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="442" y="290" text-anchor="middle">sendConfirmationEmail()</text>
<!-- ⑧ « emailQueued » :EmailService → :OrderController (dashed return) -->
<line x1="586" y1="318" x2="294" y2="318" class="msg-ret" marker-end="url(#arr-ret)"/>
<text class="rlbl" x="442" y="313" text-anchor="middle">« emailQueued »</text>
<!-- Par branch divider (dashed, per UML spec) -->
<line x1="60" y1="336" x2="620" y2="336" class="frag-par-div"/>
<!-- Par branch 2: stock level update -->
<!-- ⑨ updateStockLevels() :OrderController → :InventorySystem -->
<line x1="294" y1="355" x2="486" y2="355" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="392" y="350" text-anchor="middle">updateStockLevels()</text>
<!-- PAR fragment ends at y=373 -->
<!-- ⑩ « orderPlaced » :OrderController → :Customer (dashed return, after par) -->
<line x1="286" y1="395" x2="90" y2="395" class="msg-ret" marker-end="url(#arr-ret)"/>
<text class="rlbl" x="190" y="390" text-anchor="middle">« orderPlaced »</text>
<!-- ─── Alt else: [payment failed] ─── -->
<!-- Alt branch divider 1 (solid line) -->
<line x1="45" y1="415" x2="635" y2="415" class="frag-alt-div"/>
<text class="guard-lbl" x="50" y="429" dominant-baseline="central">[payment failed]</text>
<!-- ⑪ « authFailed » :PaymentGateway → :OrderController (dashed return) -->
<line x1="390" y1="448" x2="294" y2="448" class="msg-ret" marker-end="url(#arr-ret)"/>
<text class="rlbl" x="344" y="443" text-anchor="middle">« authFailed »</text>
<!-- ⑫ error(PAYMENT_FAILED) :OrderController → :Customer -->
<line x1="286" y1="470" x2="90" y2="470" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="190" y="465" text-anchor="middle">error(PAYMENT_FAILED)</text>
<!-- ─── Alt else: [item unavailable] ─── -->
<!-- Alt branch divider 2 (solid line) -->
<line x1="45" y1="490" x2="635" y2="490" class="frag-alt-div"/>
<text class="guard-lbl" x="50" y="504" dominant-baseline="central">[item unavailable]</text>
<!-- ⑬ « unavailable » :InventorySystem → :OrderController (dashed return) -->
<line x1="486" y1="523" x2="294" y2="523" class="msg-ret" marker-end="url(#arr-ret)"/>
<text class="rlbl" x="392" y="518" text-anchor="middle">« unavailable »</text>
<!-- ⑭ error(ITEM_UNAVAILABLE) :OrderController → :Customer -->
<line x1="286" y1="545" x2="90" y2="545" class="msg-call" marker-end="url(#arr-call)"/>
<text class="mlbl" x="190" y="540" text-anchor="middle">error(ITEM_UNAVAILABLE)</text>
<!-- ALT fragment ends at y=563 -->
<!-- ── 7. LIFELINE END CAPS (short horizontal tick at y=590) ── -->
<line x1="83" y1="590" x2="97" y2="590" stroke="var(--text-tertiary)" stroke-width="1.5"/>
<line x1="183" y1="590" x2="197" y2="590" stroke="var(--text-tertiary)" stroke-width="1.5"/>
<line x1="283" y1="590" x2="297" y2="590" stroke="var(--text-tertiary)" stroke-width="1.5"/>
<line x1="383" y1="590" x2="397" y2="590" stroke="var(--text-tertiary)" stroke-width="1.5"/>
<line x1="483" y1="590" x2="497" y2="590" stroke="var(--text-tertiary)" stroke-width="1.5"/>
<line x1="583" y1="590" x2="597" y2="590" stroke="var(--text-tertiary)" stroke-width="1.5"/>
<!-- ── 8. LEGEND ── -->
<text class="ts" x="45" y="612" opacity=".45">Legend —</text>
<line x1="110" y1="609" x2="148" y2="609"
stroke="var(--text-primary)" stroke-width="1.5" marker-end="url(#arr-call)"/>
<text class="ts" x="154" y="613" opacity=".75">Synchronous call</text>
<line x1="288" y1="609" x2="326" y2="609"
stroke="var(--text-secondary)" stroke-width="1.5"
stroke-dasharray="5 3" marker-end="url(#arr-ret)"/>
<text class="ts" x="332" y="613" opacity=".75">Return message</text>
<rect x="458" y="603" width="22" height="13" rx="2"
fill="#FAEEDA" fill-opacity="0.5" stroke="#854F0B" stroke-width="0.75"/>
<text class="ts" x="484" y="613" opacity=".75">alt fragment</text>
<rect x="558" y="603" width="22" height="13" rx="2"
fill="#E1F5EE" fill-opacity="0.6" stroke="#0F6E56" stroke-width="0.75"/>
<text class="ts" x="584" y="613" opacity=".75">par fragment</text>
<!-- Message group annotation -->
<text class="ts" x="45" y="632" opacity=".35">
①–③ pre-condition · ④–⑩ happy path · ⑪–⑫ payment failure · ⑬–⑭ item unavailable
</text>
</svg>
```
## Custom CSS
Add these classes to the hosting page `<style>` block (in addition to the standard skill CSS):
```css
/* ── Actor lifeline header boxes ── */
.actor { fill: var(--bg-secondary); stroke: var(--text-secondary); stroke-width: 0.5; }
.actor-name { font-family: system-ui, sans-serif; font-size: 11.5px; font-weight: 600;
fill: var(--text-primary); }
.actor-colon { font-family: system-ui, sans-serif; font-size: 10px; fill: var(--text-tertiary); }
/* ── Lifeline dashed stems ── */
.lifeline { stroke: var(--text-tertiary); stroke-width: 1; stroke-dasharray: 6 4; fill: none; }
/* ── Activation bars ── */
.activation { fill: var(--bg-secondary); stroke: var(--text-secondary); stroke-width: 0.75; }
/* ── Message arrows ── */
.msg-call { stroke: var(--text-primary); stroke-width: 1.5; fill: none; }
.msg-ret { stroke: var(--text-secondary); stroke-width: 1.5; fill: none; stroke-dasharray: 6 3; }
/* ── Message labels ── */
.mlbl { font-family: system-ui, sans-serif; font-size: 11px; fill: var(--text-primary); }
.rlbl { font-family: system-ui, sans-serif; font-size: 11px; fill: var(--text-secondary);
font-style: italic; }
/* ── Combined fragment: alt (amber) ── */
.frag-alt-bg { fill: #FAEEDA; fill-opacity: 0.18; stroke: #854F0B; stroke-width: 1; }
.frag-alt-tag { fill: #FAEEDA; stroke: #854F0B; stroke-width: 0.75; }
.frag-alt-kw { font-family: system-ui, sans-serif; font-size: 11px; font-weight: 700;
fill: #633806; }
.frag-alt-div { stroke: #854F0B; stroke-width: 0.75; fill: none; }
.guard-lbl { font-family: system-ui, sans-serif; font-size: 10.5px; font-style: italic;
fill: #854F0B; }
/* ── Combined fragment: par (teal) ── */
.frag-par-bg { fill: #E1F5EE; fill-opacity: 0.35; stroke: #0F6E56; stroke-width: 1; }
.frag-par-tag { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 0.75; }
.frag-par-kw { font-family: system-ui, sans-serif; font-size: 11px; font-weight: 700;
fill: #085041; }
.frag-par-div { stroke: #0F6E56; stroke-width: 0.75; stroke-dasharray: 5 3; fill: none; }
/* ── Dark mode overrides ── */
@media (prefers-color-scheme: dark) {
.actor { fill: #2c2c2a; stroke: #b4b2a9; }
.actor-name { fill: #e8e6de; }
.actor-colon { fill: #888780; }
.frag-alt-bg { fill: #633806; fill-opacity: 0.25; stroke: #EF9F27; }
.frag-alt-tag { fill: #633806; stroke: #EF9F27; }
.frag-alt-kw { fill: #FAC775; }
.frag-alt-div { stroke: #EF9F27; }
.guard-lbl { fill: #EF9F27; }
.frag-par-bg { fill: #085041; fill-opacity: 0.35; stroke: #5DCAA5; }
.frag-par-tag { fill: #085041; stroke: #5DCAA5; }
.frag-par-kw { fill: #9FE1CB; }
.frag-par-div { stroke: #5DCAA5; }
}
```
## Color Assignments
| Element | Color | Reason |
|---------|-------|--------|
| Actor header boxes | Neutral (`var(--bg-secondary)`) | Structural / non-semantic — all lifelines share one style |
| Activation bars | Neutral (`var(--bg-secondary)`) | Show execution periods without adding semantic color |
| Synchronous call arrows | `var(--text-primary)` + filled triangle | High contrast for calls — the primary interaction direction |
| Return / dashed arrows | `var(--text-secondary)` + open chevron | Lower contrast for returns — secondary flow direction |
| `alt` fragment | Amber (`#FAEEDA` / `#854F0B`) | Warning / conditional — matches `c-amber` semantic meaning |
| Guard condition text | Amber italic | Belongs visually to the alt fragment |
| `par` fragment | Teal (`#E1F5EE` / `#0F6E56`) | Concurrent success path — matches `c-teal` semantic meaning |
| Alt branch dividers | Amber solid line | Continuity with the alt frame color |
| Par branch divider | Teal dashed line | UML spec: par branches separated by dashed lines |
## Layout Notes
- **ViewBox**: 680×648 (standard width; height = lifeline bottom y=590 + legend + annotation + 16px buffer)
- **Lifeline spacing formula**: `(safe_area_width) / (n_lifelines 1) = 600 / 5 = 120px` — but use `spacing = 100px` starting at `x=90` so that first box left = 40 and last box right = 640 exactly
- **Actor box split-label trick**: Two separate `<text>` elements per box — one for `":"` (10px, tertiary color) and one for the class name (11.5px bold, primary color) — avoids the 14px font needing ~150px+ per box for long names like "OrderController"
- **Pentagon tag formula**: For a fragment starting at `(fx, fy)`, the tag polygon points are `(fx,fy) (fx+w,fy) (fx+w+6,fy+6) (fx+w+6,fy+18) (fx,fy+18)` where `w` = approximate text width of the keyword + 8px padding each side
- **Nested fragment inset**: The `par` rect uses `x = alt_x + 15` and `y = alt_y_current + 2` so both borders remain simultaneously visible — inset enough to separate visually, not so much that it wastes vertical space
- **Activation bar placement**: `x = lifeline_cx 4`, `width = 8` — centered on the lifeline and narrow enough not to obscure the dashed stem behind it
- **Message label y-offset**: All labels are placed at `y = arrow_y 5` to sit just above the arrow line; this applies to both left-going and right-going arrows since `text-anchor="middle"` handles horizontal centering automatically
- **Return arrows entering activation bars**: End `x1/x2` at lifeline center (e.g. x=294 for OrderController) rather than the bar edge (x=286) — the small overlap is intentional and clarifies the target object
- **Alt guard label placement**: Branch 1 guard goes at `y = frame_top + 13` to the right of the pentagon tag; subsequent branch guards go at `divider_y + 14` so they sit just inside the new branch
- **Lifeline end cap pattern**: `<line x1="cx7" y1="590" x2="cx+7" y2="590" stroke-width="1.5"/>` — a simple symmetric tick, no special marker needed
@@ -0,0 +1,173 @@
# Smart City Infrastructure
A multi-system integration diagram showing interconnected city infrastructure (power, water, transport) connected through a central IoT platform with a citizen dashboard on top. Demonstrates hub-spoke layout, diverse physical shapes, and UI mockups.
## Key Patterns Used
- **Hub-spoke layout**: Central IoT platform with radiating data connections to subsystems
- **Connection dots**: Visual indicators where data lines attach to the central hub
- **Dashboard/UI mockup**: Screen with mini-charts, gauges, and status indicators
- **Multi-system integration**: Three independent systems unified by central platform
- **Semantic line styles**: Different stroke styles for data (dashed), power, water, roads
- **Physical infrastructure shapes**: Solar panels, wind turbines, dams, pipes, roads, vehicles
## New Shape Techniques
### Solar Panels (angled polygons with grid lines)
```xml
<polygon class="solar-panel" points="0,25 35,8 38,12 3,29"/>
<line class="solar-frame" x1="12" y1="22" x2="24" y2="13"/>
<line x1="19" y1="29" x2="19" y2="40" stroke="#5F5E5A" stroke-width="2"/>
```
### Wind Turbine (tower + nacelle + blades)
```xml
<!-- Tapered tower -->
<polygon class="wind-tower" points="20,70 30,70 28,25 22,25"/>
<!-- Nacelle -->
<rect class="wind-hub" x="18" y="20" width="14" height="8" rx="2"/>
<!-- Hub -->
<circle class="wind-hub" cx="25" cy="18" r="5"/>
<!-- Blades (rotated ellipses) -->
<ellipse class="wind-blade" cx="25" cy="5" rx="3" ry="13"/>
<ellipse class="wind-blade" cx="14" cy="26" rx="3" ry="13" transform="rotate(-120, 25, 18)"/>
<ellipse class="wind-blade" cx="36" cy="26" rx="3" ry="13" transform="rotate(120, 25, 18)"/>
```
### Battery with Charge Level
```xml
<rect class="battery" x="0" y="0" width="45" height="65" rx="5"/>
<!-- Terminals -->
<rect x="10" y="-6" width="10" height="8" rx="2" fill="#27500A"/>
<rect x="25" y="-6" width="10" height="8" rx="2" fill="#27500A"/>
<!-- Charge level fill -->
<rect class="battery-level" x="5" y="12" width="35" height="48" rx="3"/>
<text x="22" y="42" text-anchor="middle" fill="#173404" style="font-size:10px">85%</text>
```
### Dam/Reservoir with Water Waves
```xml
<!-- Dam wall -->
<polygon class="reservoir-wall" points="0,60 10,0 70,0 80,60"/>
<!-- Water behind dam -->
<polygon class="water" points="12,10 68,10 68,55 75,55 75,58 5,58 5,55 12,55"/>
<!-- Wave effect -->
<path d="M 15 25 Q 25 22 35 25 Q 45 28 55 25" fill="none" stroke="#378ADD" stroke-width="1" opacity="0.5"/>
```
### Pipe Network with Joints and Valves
```xml
<path class="pipe" d="M 80 85 L 110 85"/>
<circle class="pipe-joint" cx="10" cy="30" r="8"/>
<circle class="valve" cx="190" cy="85" r="6"/>
<!-- Distribution branches -->
<path class="pipe-thin" d="M 18 30 L 50 30"/>
<path class="pipe-thin" d="M 10 22 L 10 5 L 50 5"/>
```
### Road Intersection with Lane Markings
```xml
<!-- Road surface -->
<line class="road" x1="0" y1="50" x2="170" y2="50"/>
<line class="road-mark" x1="10" y1="50" x2="160" y2="50"/>
<!-- Cross road -->
<line class="road" x1="85" y1="0" x2="85" y2="100"/>
<line class="road-mark" x1="85" y1="10" x2="85" y2="90"/>
<!-- Embedded sensors -->
<circle class="sensor" cx="40" cy="50" r="5"/>
```
### Traffic Light with Signal States
```xml
<rect class="traffic-light" x="0" y="0" width="14" height="32" rx="3"/>
<circle class="light-red" cx="7" cy="8" r="4"/>
<circle class="light-off" cx="7" cy="16" r="4"/>
<circle class="light-off" cx="7" cy="24" r="4"/>
```
### Bus with Windows and Wheels
```xml
<rect class="bus" x="0" y="0" width="55" height="28" rx="6"/>
<!-- Windows -->
<rect class="bus-window" x="5" y="5" width="12" height="12" rx="2"/>
<rect class="bus-window" x="20" y="5" width="12" height="12" rx="2"/>
<!-- Wheels with hubcaps -->
<circle cx="14" cy="30" r="6" fill="#2C2C2A"/>
<circle cx="14" cy="30" r="3" fill="#5F5E5A"/>
```
### Dashboard UI Mockup
```xml
<!-- Monitor frame -->
<rect class="dashboard" x="0" y="0" width="200" height="120" rx="8"/>
<!-- Screen -->
<rect class="screen" x="10" y="10" width="180" height="85" rx="4"/>
<!-- Mini bar chart -->
<rect class="screen-content" x="18" y="18" width="50" height="35" rx="2"/>
<rect class="screen-chart" x="22" y="38" width="8" height="12"/>
<rect class="screen-chart" x="33" y="32" width="8" height="18"/>
<!-- Gauge -->
<circle class="screen-bar" cx="100" cy="35" r="12"/>
<text x="100" y="39" text-anchor="middle" fill="#E8E6DE" style="font-size:8px">78%</text>
<!-- Status indicators -->
<circle cx="35" cy="74" r="6" fill="#97C459"/>
<circle cx="75" cy="74" r="6" fill="#97C459"/>
<circle cx="115" cy="74" r="6" fill="#EF9F27"/>
```
### Hexagonal IoT Hub with Connection Points
```xml
<!-- Outer hexagon -->
<polygon class="iot-hex" points="0,-45 39,-22 39,22 0,45 -39,22 -39,-22"/>
<!-- Inner hexagon -->
<polygon class="iot-inner" points="0,-20 17,-10 17,10 0,20 -17,10 -17,-10"/>
<!-- Connection dots on data lines -->
<circle cx="321" cy="248" r="4" fill="#7F77DD"/>
```
## CSS Classes for Infrastructure
```css
/* Power system */
.solar-panel { fill: #3C3489; stroke: #534AB7; stroke-width: 0.5; }
.solar-frame { fill: none; stroke: #EEEDFE; stroke-width: 0.5; }
.wind-tower { fill: #B4B2A9; stroke: #5F5E5A; stroke-width: 1; }
.wind-blade { fill: #F1EFE8; stroke: #888780; stroke-width: 0.5; }
.battery { fill: #27500A; stroke: #3B6D11; stroke-width: 1.5; }
.battery-level { fill: #97C459; }
.power-line { stroke: #EF9F27; stroke-width: 2; fill: none; }
/* Water system */
.reservoir-wall { fill: #B4B2A9; stroke: #5F5E5A; stroke-width: 1; }
.water { fill: #85B7EB; stroke: #378ADD; stroke-width: 0.5; }
.pipe { fill: none; stroke: #378ADD; stroke-width: 4; stroke-linecap: round; }
.pipe-joint { fill: #185FA5; stroke: #0C447C; stroke-width: 1; }
.valve { fill: #0C447C; stroke: #185FA5; stroke-width: 1; }
/* Transport */
.road { stroke: #888780; stroke-width: 8; fill: none; stroke-linecap: round; }
.road-mark { stroke: #F1EFE8; stroke-width: 1; fill: none; stroke-dasharray: 6 4; }
.traffic-light { fill: #444441; stroke: #2C2C2A; stroke-width: 0.5; }
.light-red { fill: #E24B4A; }
.light-green { fill: #97C459; }
.light-off { fill: #2C2C2A; }
.bus { fill: #E1F5EE; stroke: #0F6E56; stroke-width: 1.5; }
/* Data/IoT */
.data-line { stroke: #7F77DD; stroke-width: 2; fill: none; stroke-dasharray: 4 3; }
.iot-hex { fill: #EEEDFE; stroke: #534AB7; stroke-width: 2; }
/* Dashboard */
.dashboard { fill: #F1EFE8; stroke: #5F5E5A; stroke-width: 1.5; }
.screen { fill: #1a1a18; }
.screen-chart { fill: #5DCAA5; }
```
## Layout Notes
- **ViewBox**: 720×620 (wider for three-column system layout)
- **Hub position**: Central IoT at (360, 270) - geometric center
- **Data lines**: Use quadratic curves or L-shaped paths, add connection dots at hub attachment points
- **System spacing**: ~200px width per system section
- **Vertical layers**: Dashboard (top) → IoT Hub (middle) → Systems (bottom)
- **Component grouping**: Use `<g transform="translate(x,y)">` for each major component for easy positioning
@@ -0,0 +1,154 @@
# Smartphone Layer Anatomy
An exploded view diagram showing all internal layers of a smartphone from front glass to back, with alternating left/right labels to avoid overlap. Demonstrates layered product teardown visualization and component detail.
## Key Patterns Used
- **Exploded vertical stack**: Layers separated vertically to show internal structure
- **Alternating labels**: Left/right label placement prevents text overlap
- **Component detail**: Chips, coils, lenses rendered with realistic shapes
- **Thickness scale**: Measurement indicator on the side
- **Progressive depth**: Each layer slightly offset to create 3D stack effect
## New Shape Techniques
### Capacitive Touch Grid
```xml
<rect class="digitizer" x="0" y="0" width="140" height="90" rx="14"/>
<g transform="translate(8, 8)">
<!-- Horizontal lines -->
<line class="digitizer-grid" x1="0" y1="15" x2="124" y2="15"/>
<line class="digitizer-grid" x1="0" y1="37" x2="124" y2="37"/>
<!-- Vertical lines -->
<line class="digitizer-grid" x1="20" y1="0" x2="20" y2="74"/>
<line class="digitizer-grid" x1="50" y1="0" x2="50" y2="74"/>
</g>
<!-- Touch point indicator -->
<circle cx="70" cy="45" r="12" fill="none" stroke="#7F77DD" stroke-width="2" opacity="0.6"/>
<circle cx="70" cy="45" r="5" fill="#7F77DD" opacity="0.4"/>
```
### OLED RGB Subpixels
```xml
<rect class="oled-panel" x="0" y="0" width="140" height="90" rx="12"/>
<g transform="translate(10, 10)">
<!-- RGB pixel group -->
<rect class="oled-subpixel-r" x="0" y="0" width="2" height="6"/>
<rect class="oled-subpixel-g" x="3" y="0" width="2" height="6"/>
<rect class="oled-subpixel-b" x="6" y="0" width="2" height="6"/>
<!-- Repeat pattern -->
<rect class="oled-subpixel-r" x="11" y="0" width="2" height="6"/>
<rect class="oled-subpixel-g" x="14" y="0" width="2" height="6"/>
<rect class="oled-subpixel-b" x="17" y="0" width="2" height="6"/>
</g>
```
### Logic Board with Chips
```xml
<rect class="pcb" x="0" y="0" width="116" height="106" rx="3"/>
<!-- PCB traces -->
<path class="pcb-trace" d="M 8 50 L 30 50 L 30 35"/>
<!-- CPU chip -->
<rect class="chip-cpu" x="30" y="20" width="55" height="35" rx="3"/>
<text class="chip-label" x="57" y="35" text-anchor="middle">A17 Pro</text>
<!-- RAM chip -->
<rect class="chip-ram" x="30" y="62" width="35" height="18" rx="2"/>
<text class="chip-label" x="47" y="74" text-anchor="middle">8GB RAM</text>
<!-- Storage chip -->
<rect class="chip-storage" x="30" y="85" width="55" height="16" rx="2"/>
<text class="chip-label" x="57" y="96" text-anchor="middle">256GB NAND</text>
```
### Camera Lens Array
```xml
<!-- Main camera -->
<circle class="camera-lens" cx="20" cy="20" r="18"/>
<circle class="camera-lens-inner" cx="20" cy="20" r="13"/>
<circle class="camera-sensor" cx="20" cy="20" r="8"/>
<circle cx="20" cy="20" r="3" fill="#1a1a18"/>
<!-- Secondary camera (smaller) -->
<circle class="camera-lens" cx="15" cy="15" r="13"/>
<circle class="camera-lens-inner" cx="15" cy="15" r="9"/>
<circle class="camera-sensor" cx="15" cy="15" r="5"/>
```
### Wireless Charging Coil with Magnets
```xml
<!-- Concentric coil rings -->
<circle class="charging-coil-outer" cx="0" cy="0" r="30"/>
<circle class="charging-coil" cx="0" cy="0" r="23"/>
<circle class="charging-coil" cx="0" cy="0" r="16"/>
<circle class="charging-coil" cx="0" cy="0" r="9"/>
<!-- MagSafe magnet ring -->
<circle class="magnet" cx="0" cy="-35" r="3"/>
<circle class="magnet" cx="25" cy="-25" r="3"/>
<circle class="magnet" cx="35" cy="0" r="3"/>
<circle class="magnet" cx="25" cy="25" r="3"/>
<!-- ... continue around circle -->
```
### Battery Cell
```xml
<rect class="battery" x="0" y="0" width="140" height="90" rx="10"/>
<rect class="battery-cell" x="10" y="12" width="120" height="60" rx="6"/>
<text x="70" y="38" text-anchor="middle" fill="#27500A" style="font-size:9px">Li-Ion Polymer</text>
<text x="70" y="52" text-anchor="middle" fill="#27500A" style="font-size:12px; font-weight:bold">4422 mAh</text>
<rect class="battery-connector" x="55" y="75" width="30" height="10" rx="2"/>
```
## CSS Classes
```css
/* Glass */
.front-glass { fill: #E8E6DE; stroke: #888780; stroke-width: 1; opacity: 0.9; }
.back-glass { fill: #2C2C2A; stroke: #444441; stroke-width: 1; }
/* Touch digitizer */
.digitizer { fill: #EEEDFE; stroke: #534AB7; stroke-width: 1; }
.digitizer-grid { stroke: #AFA9EC; stroke-width: 0.3; fill: none; }
/* OLED */
.oled-panel { fill: #1a1a18; stroke: #444441; stroke-width: 1; }
.oled-subpixel-r { fill: #E24B4A; }
.oled-subpixel-g { fill: #97C459; }
.oled-subpixel-b { fill: #378ADD; }
/* Midframe */
.midframe { fill: #B4B2A9; stroke: #5F5E5A; stroke-width: 1.5; }
/* Logic board */
.pcb { fill: #0F6E56; stroke: #085041; stroke-width: 1; }
.pcb-trace { stroke: #5DCAA5; stroke-width: 0.3; fill: none; }
.chip-cpu { fill: #3C3489; stroke: #534AB7; stroke-width: 0.5; }
.chip-ram { fill: #185FA5; stroke: #378ADD; stroke-width: 0.5; }
.chip-storage { fill: #27500A; stroke: #3B6D11; stroke-width: 0.5; }
/* Battery */
.battery { fill: #EAF3DE; stroke: #3B6D11; stroke-width: 1.5; }
.battery-cell { fill: #97C459; stroke: #639922; stroke-width: 0.5; }
/* Camera */
.camera-lens { fill: #0C447C; stroke: #185FA5; stroke-width: 0.5; }
.camera-lens-inner { fill: #1a1a18; stroke: #378ADD; stroke-width: 0.3; }
.camera-sensor { fill: #3C3489; stroke: #534AB7; stroke-width: 0.3; }
/* Wireless charging */
.charging-coil { fill: none; stroke: #EF9F27; stroke-width: 1.5; }
.magnet { fill: #5F5E5A; stroke: #444441; stroke-width: 0.5; }
```
## Layout Notes
- **ViewBox**: 900×780 (tall for vertical stack)
- **Layer offset**: Each layer offset 10px right and down for depth effect
- **Label alternation**: Odd layers → RIGHT labels, Even layers → LEFT labels
- **Thickness scale**: Vertical measurement bar on left side
- **Front/Back markers**: Text labels at top and bottom
- **Chip labels**: Use small white text (6px) directly on chip shapes
@@ -0,0 +1,247 @@
# SN2 Reaction Mechanism
A chemistry diagram showing the bimolecular nucleophilic substitution (SN2) mechanism between hydroxide ion and methyl bromide. Demonstrates molecular structure rendering, electron movement arrows, transition state notation, and reaction energy profiles.
## Key Patterns Used
- **Molecular structures**: Ball-and-stick style atoms with bonds
- **Electron movement**: Curved arrows showing nucleophilic attack
- **Transition state**: Bracketed pentacoordinate intermediate with partial charges
- **Stereochemistry**: Wedge/dash bonds showing 3D configuration
- **Energy profile**: Potential energy vs reaction coordinate plot
- **Annotation boxes**: Key features and mechanistic notes
## Diagram Type
This is a **chemistry mechanism diagram** with:
- **Molecular rendering**: Atoms as colored circles with element symbols
- **Bond notation**: Solid, wedge, dash, and partial (dashed) bonds
- **Reaction arrows**: Curved for electron movement, straight for reaction progress
- **Energy landscape**: Quantitative energy profile below mechanism
## Molecular Structure Elements
### Atom Rendering
```xml
<!-- Carbon atom (dark) -->
<circle cx="0" cy="0" r="14" class="carbon"/>
<text class="chem" x="0" y="5" text-anchor="middle" fill="white" font-weight="500">C</text>
<!-- Oxygen atom (red) -->
<circle cx="0" cy="0" r="14" class="oxygen"/>
<text class="chem" x="0" y="5" text-anchor="middle" fill="white" font-weight="500">O</text>
<!-- Hydrogen atom (light with border) -->
<circle cx="38" cy="0" r="8" class="hydrogen"/>
<text class="chem-sm" x="38" y="4" text-anchor="middle">H</text>
<!-- Bromine atom (brown) -->
<circle cx="52" cy="0" r="16" class="bromine"/>
<text class="chem" x="52" y="5" text-anchor="middle" fill="white" font-weight="500">Br</text>
```
```css
.carbon { fill: #2C2C2A; }
.hydrogen { fill: #F1EFE8; stroke: #888780; stroke-width: 1; }
.oxygen { fill: #E24B4A; }
.bromine { fill: #993C1D; }
.nitrogen { fill: #378ADD; } /* for other reactions */
```
### Bond Types
```xml
<!-- Single bond (solid) -->
<line x1="14" y1="0" x2="38" y2="0" class="bond"/>
<!-- Wedge bond (coming toward viewer) -->
<polygon class="bond-wedge" points="0,-14 -6,-35 6,-35"/>
<!-- Dash bond (going away from viewer) -->
<line x1="-10" y1="10" x2="-28" y2="28" class="bond-dash"/>
<!-- Partial bond (forming/breaking) -->
<line x1="-40" y1="0" x2="-14" y2="0" class="bond-partial"/>
```
```css
.bond { stroke: var(--text-primary); stroke-width: 2.5; fill: none; stroke-linecap: round; }
.bond-thin { stroke: var(--text-primary); stroke-width: 1.5; fill: none; }
.bond-partial { stroke: var(--text-primary); stroke-width: 2; fill: none; stroke-dasharray: 4 3; }
.bond-wedge { fill: var(--text-primary); stroke: none; }
.bond-dash { stroke: var(--text-primary); stroke-width: 2; fill: none; stroke-dasharray: 2 2; }
```
### Lone Pairs and Charges
```xml
<!-- Lone pair electrons (dots) -->
<circle cx="-8" cy="-18" r="2" fill="var(--text-primary)"/>
<circle cx="0" cy="-18" r="2" fill="var(--text-primary)"/>
<!-- Formal negative charge -->
<text class="charge" x="12" y="-12" fill="#A32D2D" font-weight="bold">⊖</text>
<!-- Partial charges (delta notation) -->
<text class="partial" x="0" y="-18" text-anchor="middle" fill="#A32D2D">δ⁻</text>
<text class="partial" x="0" y="-22" text-anchor="middle" fill="#3B6D11">δ⁺</text>
```
```css
.charge { font-family: "Times New Roman", Georgia, serif; font-size: 12px; }
.partial { font-family: "Times New Roman", Georgia, serif; font-size: 11px; font-style: italic; }
```
### Curved Arrow (Electron Movement)
```xml
<defs>
<marker id="curved-arrow" viewBox="0 0 10 10" refX="8" refY="5" markerWidth="6" markerHeight="6" orient="auto">
<path d="M0,0 L10,5 L0,10 L3,5 Z" class="arrow-fill"/>
</marker>
</defs>
<!-- Nucleophilic attack arrow -->
<path d="M -5,15 Q 30,60 70,25" class="arrow-curved" marker-end="url(#curved-arrow)"/>
```
```css
.arrow-curved { stroke: #534AB7; stroke-width: 2; fill: none; }
.arrow-fill { fill: #534AB7; }
```
### Transition State Brackets
```xml
<!-- Left bracket -->
<path d="M -75,-70 L -85,-70 L -85,75 L -75,75" class="ts-bracket"/>
<!-- Right bracket -->
<path d="M 95,-70 L 105,-70 L 105,75 L 95,75" class="ts-bracket"/>
<!-- Double dagger symbol -->
<text class="chem" x="115" y="-60" fill="var(--text-primary)">‡</text>
```
```css
.ts-bracket { stroke: var(--text-primary); stroke-width: 1.5; fill: none; }
```
## Energy Profile Diagram
### Axes
```xml
<!-- Y-axis (Energy) -->
<line x1="0" y1="280" x2="0" y2="0" class="axis" marker-end="url(#straight-arrow)"/>
<text class="t" x="-15" y="-10" text-anchor="middle" transform="rotate(-90 -15 140)">Potential Energy</text>
<!-- X-axis (Reaction Coordinate) -->
<line x1="0" y1="280" x2="600" y2="280" class="axis" marker-end="url(#straight-arrow)"/>
<text class="t" x="580" y="305" text-anchor="middle">Reaction Coordinate</text>
```
### Energy Curve
```xml
<!-- Filled area under curve -->
<path class="energy-fill" d="
M 40,200
Q 150,200 250,50
Q 350,200 500,220
L 500,280 L 40,280 Z
"/>
<!-- Curve line -->
<path class="energy-curve" d="
M 40,200
Q 100,200 150,150
Q 200,80 250,50
Q 300,80 350,150
Q 400,210 500,220
"/>
```
```css
.energy-curve { stroke: #534AB7; stroke-width: 2.5; fill: none; }
.energy-fill { fill: rgba(83, 74, 183, 0.1); }
```
### Energy Levels and Annotations
```xml
<!-- Reactants level -->
<line x1="20" y1="200" x2="80" y2="200" stroke="#3B6D11" stroke-width="2"/>
<text class="ts" x="50" y="218" text-anchor="middle">Reactants</text>
<!-- Transition state peak -->
<circle cx="250" cy="50" r="5" fill="#534AB7"/>
<line x1="250" y1="50" x2="250" y2="280" class="energy-level"/>
<text class="ts" x="250" y="30" text-anchor="middle" fill="#534AB7" font-weight="500">Transition State [‡]</text>
<!-- Products level (lower = exergonic) -->
<line x1="470" y1="220" x2="530" y2="220" stroke="#3B6D11" stroke-width="2"/>
<!-- Activation energy arrow -->
<line x1="100" y1="200" x2="100" y2="55" class="delta-arrow" marker-end="url(#delta-arrow)"/>
<text class="ts" x="85" y="125" text-anchor="end" fill="#3B6D11">E<tspan baseline-shift="sub" font-size="8">a</tspan></text>
```
```css
.energy-level { stroke: var(--text-secondary); stroke-width: 1; stroke-dasharray: 4 2; fill: none; }
.delta-arrow { stroke: #3B6D11; stroke-width: 1.5; fill: none; }
.delta-fill { fill: #3B6D11; }
```
## Chemistry Text Styles
```css
/* Chemistry notation (serif font for formulas) */
.chem { font-family: "Times New Roman", Georgia, serif; font-size: 16px; fill: var(--text-primary); }
.chem-sm { font-family: "Times New Roman", Georgia, serif; font-size: 12px; fill: var(--text-primary); }
.chem-lg { font-family: "Times New Roman", Georgia, serif; font-size: 18px; fill: var(--text-primary); }
```
## Subscript/Superscript in SVG
```xml
<!-- Subscript using tspan -->
<text class="ts">E<tspan baseline-shift="sub" font-size="8">a</tspan></text>
<!-- Superscript for charges -->
<text class="chem-sm">OH⁻</text> <!-- Using Unicode superscript minus -->
<text class="chem-sm">CH₃Br</text> <!-- Using Unicode subscript 3 -->
```
## Color Coding
| Element | Color | Hex |
|---------|-------|-----|
| Carbon | Dark gray | #2C2C2A |
| Hydrogen | Light cream | #F1EFE8 |
| Oxygen | Red | #E24B4A |
| Bromine | Brown | #993C1D |
| Nitrogen | Blue | #378ADD |
| Electron arrows | Purple | #534AB7 |
| Positive charge | Green | #3B6D11 |
| Negative charge | Red | #A32D2D |
## Layout Notes
- **ViewBox**: 800×680 (landscape for mechanism + energy profile)
- **Mechanism section**: y=60-300, showing reactants → TS → products
- **Energy profile**: y=320-630, with axes and curve
- **Atom sizes**: C/O/Br ~12-16px radius, H ~7-8px radius
- **Bond lengths**: ~25-40px between atom centers
- **Spacing**: ~140px between mechanism stages
## When to Use This Pattern
Use this diagram style for:
- Organic reaction mechanisms (SN1, SN2, E1, E2, additions, eliminations)
- Reaction energy profiles and kinetics
- Stereochemistry illustrations
- Enzyme mechanism diagrams
- Transition state theory visualization
- Any chemistry concept requiring molecular structures
@@ -0,0 +1,338 @@
# Modern Onshore Wind Turbine Structure
A physical/structural cross-section diagram showing all major components of a modern wind turbine from underground foundation to blade tips.
## Key Patterns Used
- **Underground section**: Soil layers, deep concrete foundation with rebar reinforcement grid, spread footing
- **Cross-section view**: Tower wall thickness shown, internal components visible
- **Tapered tower**: Path elements creating realistic tower silhouette that narrows toward top
- **Internal access**: Ladder with rungs, elevator shaft inside tower
- **Cable routing**: Power cables running from nacelle down through tower to transformer
- **Nacelle cutaway**: Gearbox, generator, brake, yaw system all visible inside housing
- **Rotor assembly**: Hub with pitch motors at blade roots, three composite blades with gradient fill
- **Ground level marker**: Clear separation between above/below ground
- **Component color coding**: Each system type has distinct color (blue=generator, gold=gearbox, red=brake, green=yaw, purple=pitch)
- **Legend bar**: Quick reference for color meanings
## Diagram
```xml
<svg width="100%" viewBox="0 0 680 920" xmlns="http://www.w3.org/2000/svg">
<defs>
<marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5"
markerWidth="6" markerHeight="6" orient="auto-start-reverse">
<path d="M2 1L8 5L2 9" fill="none" stroke="context-stroke"
stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
</marker>
<!-- Blade gradient for 3D effect -->
<linearGradient id="bladeGrad" x1="0%" y1="0%" x2="100%" y2="0%">
<stop offset="0%" style="stop-color:#D3D1C7"/>
<stop offset="50%" style="stop-color:#F1EFE8"/>
<stop offset="100%" style="stop-color:#B4B2A9"/>
</linearGradient>
</defs>
<!-- ===== GROUND LEVEL LINE ===== -->
<line x1="40" y1="680" x2="640" y2="680" stroke="#3B6D11" stroke-width="2"/>
<text class="tl" x="45" y="675">Ground level</text>
<!-- ===== UNDERGROUND: FOUNDATION ===== -->
<!-- Soil layers -->
<rect x="120" y="680" width="300" height="180" class="soil"/>
<rect x="120" y="780" width="300" height="80" class="soil-dark"/>
<!-- Deep concrete foundation -->
<path d="M170 680 L170 820 L200 850 L340 850 L370 820 L370 680 Z" class="concrete"/>
<!-- Foundation base spread -->
<path d="M140 820 L170 820 L200 850 L340 850 L370 820 L400 820 L400 860 L140 860 Z" class="concrete-dark"/>
<!-- Rebar reinforcement -->
<g class="rebar">
<line x1="185" y1="700" x2="185" y2="840"/>
<line x1="210" y1="700" x2="210" y2="845"/>
<line x1="235" y1="700" x2="235" y2="848"/>
<line x1="260" y1="700" x2="260" y2="848"/>
<line x1="285" y1="700" x2="285" y2="848"/>
<line x1="310" y1="700" x2="310" y2="845"/>
<line x1="335" y1="700" x2="335" y2="840"/>
<!-- Horizontal rebar -->
<line x1="175" y1="720" x2="365" y2="720"/>
<line x1="175" y1="760" x2="365" y2="760"/>
<line x1="175" y1="800" x2="365" y2="800"/>
<line x1="155" y1="835" x2="385" y2="835"/>
</g>
<!-- Foundation labels -->
<line x1="410" y1="770" x2="480" y2="770" class="leader"/>
<text class="ts" x="485" y="766">Deep concrete foundation</text>
<text class="tl" x="485" y="778">Reinforced with steel rebar</text>
<text class="tl" x="485" y="790">15-25m deep typical</text>
<line x1="400" y1="850" x2="480" y2="870" class="leader"/>
<text class="ts" x="485" y="866">Foundation spread footing</text>
<text class="tl" x="485" y="878">Distributes load to soil</text>
<!-- ===== TOWER BASE ===== -->
<!-- Tower base flange -->
<ellipse cx="270" cy="680" rx="70" ry="12" class="concrete-dark"/>
<rect x="200" y="668" width="140" height="12" class="tower"/>
<!-- Transformer at base -->
<g transform="translate(470, 640)">
<rect x="0" y="0" width="50" height="40" rx="3" class="transformer"/>
<!-- Cooling fins -->
<rect x="52" y="5" width="4" height="30" class="transformer-fin"/>
<rect x="58" y="5" width="4" height="30" class="transformer-fin"/>
<rect x="64" y="5" width="4" height="30" class="transformer-fin"/>
<!-- Connection box -->
<rect x="10" y="-8" width="30" height="10" rx="2" class="transformer-fin"/>
</g>
<line x1="470" y1="660" x2="430" y2="640" class="leader"/>
<text class="ts" x="385" y="636" text-anchor="end">Transformer</text>
<text class="tl" x="385" y="648" text-anchor="end">Steps up voltage for grid</text>
<!-- ===== TUBULAR STEEL TOWER ===== -->
<!-- Tower outer shell (tapered) -->
<path d="M200 680 L220 200 L320 200 L340 680 Z" class="tower"/>
<!-- Tower inner surface (cutaway) -->
<path d="M215 680 L232 210 L308 210 L325 680 Z" class="tower-inner"/>
<!-- Tower section joints -->
<line x1="205" y1="550" x2="335" y2="550" class="tower-section"/>
<line x1="210" y1="420" x2="330" y2="420" class="tower-section"/>
<line x1="215" y1="300" x2="325" y2="300" class="tower-section"/>
<!-- Internal ladder (left side) -->
<g transform="translate(225, 220)">
<!-- Ladder rails -->
<line x1="0" y1="0" x2="8" y2="450" class="ladder"/>
<line x1="15" y1="0" x2="23" y2="450" class="ladder"/>
<!-- Rungs -->
<g class="ladder-rung">
<line x1="1" y1="20" x2="22" y2="21"/>
<line x1="1" y1="50" x2="22" y2="52"/>
<line x1="2" y1="80" x2="22" y2="83"/>
<line x1="2" y1="110" x2="23" y2="114"/>
<line x1="2" y1="140" x2="23" y2="145"/>
<line x1="3" y1="170" x2="23" y2="176"/>
<line x1="3" y1="200" x2="24" y2="207"/>
<line x1="3" y1="230" x2="24" y2="238"/>
<line x1="4" y1="260" x2="24" y2="269"/>
<line x1="4" y1="290" x2="25" y2="300"/>
<line x1="4" y1="320" x2="25" y2="331"/>
<line x1="5" y1="350" x2="25" y2="362"/>
<line x1="5" y1="380" x2="26" y2="393"/>
<line x1="6" y1="410" x2="26" y2="424"/>
<line x1="6" y1="440" x2="27" y2="455"/>
</g>
</g>
<!-- Elevator shaft (right side) -->
<rect x="280" y="230" width="25" height="430" rx="2" class="elevator"/>
<text class="tl" x="292" y="450" text-anchor="middle" transform="rotate(-90, 292, 450)" fill="#185FA5">ELEVATOR</text>
<!-- Electrical cables running down -->
<path d="M270 220 C270 300 268 400 268 500 C268 600 268 650 310 665 L470 665" class="cable"/>
<path d="M260 225 C258 350 256 500 256 600 C256 650 256 670 256 680" class="cable-thin"/>
<!-- Tower labels -->
<line x1="340" y1="350" x2="400" y2="320" class="leader"/>
<text class="ts" x="405" y="316">Tubular steel tower</text>
<text class="tl" x="405" y="328">80-120m height typical</text>
<text class="tl" x="405" y="340">Tapered for strength</text>
<line x1="248" y1="400" x2="130" y2="380" class="leader"/>
<text class="ts" x="125" y="376" text-anchor="end">Internal ladder</text>
<text class="tl" x="125" y="388" text-anchor="end">Service access</text>
<line x1="305" y1="500" x2="400" y2="520" class="leader"/>
<text class="ts" x="405" y="516">Service elevator</text>
<line x1="268" y1="580" x2="130" y2="600" class="leader"/>
<text class="ts" x="125" y="596" text-anchor="end">Power cables</text>
<text class="tl" x="125" y="608" text-anchor="end">To transformer</text>
<!-- ===== NACELLE ===== -->
<g transform="translate(270, 160)">
<!-- Nacelle base/bedplate -->
<rect x="-60" y="30" width="120" height="15" class="nacelle"/>
<!-- Yaw bearing -->
<ellipse cx="0" cy="42" rx="35" ry="6" class="bearing"/>
<!-- Yaw motors -->
<rect x="-55" y="32" width="12" height="18" rx="2" class="yaw"/>
<rect x="43" y="32" width="12" height="18" rx="2" class="yaw"/>
<!-- Nacelle housing -->
<path d="M-65 30 L-70 -10 L-65 -35 L70 -35 L85 -10 L85 30 Z" class="nacelle-cover"/>
<!-- Main shaft -->
<rect x="-90" y="-8" width="35" height="16" rx="2" fill="#888780" stroke="#5F5E5A" stroke-width="0.5"/>
<!-- Gearbox -->
<rect x="-55" y="-25" width="40" height="45" rx="3" class="gearbox"/>
<text class="tl" x="-35" y="5" text-anchor="middle" fill="#633806">GEAR</text>
<!-- Generator -->
<rect x="-10" y="-20" width="50" height="38" rx="4" class="generator"/>
<ellipse cx="15" cy="0" rx="15" ry="15" fill="none" stroke="#0C447C" stroke-width="1"/>
<text class="tl" x="15" y="4" text-anchor="middle" fill="#E6F1FB">GEN</text>
<!-- Brake disc -->
<rect x="45" y="-12" width="8" height="24" rx="1" class="brake"/>
<!-- Electrical cabinet -->
<rect x="58" y="-25" width="20" height="35" rx="2" fill="#5F5E5A" stroke="#444441" stroke-width="0.5"/>
<!-- Anemometer on top -->
<line x1="60" y1="-35" x2="60" y2="-50" stroke="#5F5E5A" stroke-width="1"/>
<ellipse cx="60" cy="-52" rx="8" ry="3" fill="#D3D1C7" stroke="#888780" stroke-width="0.5"/>
</g>
<!-- Nacelle labels -->
<line x1="215" y1="135" x2="130" y2="115" class="leader"/>
<text class="ts" x="125" y="111" text-anchor="end">Gearbox</text>
<text class="tl" x="125" y="123" text-anchor="end">Speed multiplier</text>
<line x1="285" y1="145" x2="400" y2="125" class="leader"/>
<text class="ts" x="405" y="121">Generator</text>
<text class="tl" x="405" y="133">Converts rotation to electricity</text>
<line x1="315" y1="155" x2="400" y2="165" class="leader"/>
<text class="ts" x="405" y="161">Brake system</text>
<line x1="215" y1="200" x2="130" y2="220" class="leader"/>
<text class="ts" x="125" y="216" text-anchor="end">Yaw motors</text>
<text class="tl" x="125" y="228" text-anchor="end">Rotate nacelle to face wind</text>
<line x1="330" y1="108" x2="400" y2="90" class="leader"/>
<text class="ts" x="405" y="86">Anemometer</text>
<text class="tl" x="405" y="98">Wind speed sensor</text>
<!-- ===== ROTOR HUB & BLADES ===== -->
<!-- Hub -->
<g transform="translate(180, 152)">
<!-- Hub body -->
<ellipse cx="0" cy="0" rx="25" ry="30" class="hub"/>
<!-- Hub nose cone -->
<path d="M-25 -20 Q-50 0 -25 20 Q-30 0 -25 -20" class="hub-cap"/>
<!-- Blade roots with pitch motors -->
<!-- Blade 1 (up) -->
<g transform="translate(-10, -25) rotate(-80)">
<ellipse cx="0" cy="0" rx="12" ry="8" class="blade-root"/>
<rect x="-8" y="-5" width="10" height="10" rx="2" class="pitch-motor"/>
</g>
<!-- Blade 2 (lower left) -->
<g transform="translate(-18, 18) rotate(40)">
<ellipse cx="0" cy="0" rx="12" ry="8" class="blade-root"/>
<rect x="-8" y="-5" width="10" height="10" rx="2" class="pitch-motor"/>
</g>
<!-- Blade 3 (lower right) -->
<g transform="translate(5, 22) rotate(160)">
<ellipse cx="0" cy="0" rx="12" ry="8" class="blade-root"/>
<rect x="-8" y="-5" width="10" height="10" rx="2" class="pitch-motor"/>
</g>
</g>
<!-- Blade 1 (pointing up-left) -->
<path d="M165 125 Q140 80 130 40 Q125 20 115 15 Q110 18 112 25 Q115 50 125 90 Q140 120 158 128 Z" class="blade" fill="url(#bladeGrad)"/>
<!-- Blade 2 (pointing down-left) -->
<path d="M158 175 Q120 200 80 230 Q60 245 55 255 Q60 258 68 252 Q95 235 130 210 Q155 190 163 178 Z" class="blade" fill="url(#bladeGrad)"/>
<!-- Blade 3 (pointing down-right, partially visible) -->
<path d="M188 175 Q195 200 205 230 Q210 250 215 255 Q220 252 218 245 Q212 220 202 195 Q192 175 186 172 Z" class="blade" fill="url(#bladeGrad)"/>
<!-- Blade labels -->
<line x1="115" y1="35" x2="60" y2="35" class="leader"/>
<text class="ts" x="55" y="31" text-anchor="end">Composite blade</text>
<text class="tl" x="55" y="43" text-anchor="end">Fiberglass/carbon fiber</text>
<text class="tl" x="55" y="55" text-anchor="end">40-80m length each</text>
<line x1="170" y1="130" x2="130" y2="155" class="leader"/>
<text class="ts" x="85" y="151" text-anchor="end">Pitch motor</text>
<text class="tl" x="85" y="163" text-anchor="end">Adjusts blade angle</text>
<line x1="180" y1="152" x2="130" y2="180" class="leader"/>
<text class="ts" x="85" y="183" text-anchor="end">Rotor hub</text>
<!-- ===== LEGEND ===== -->
<g transform="translate(40, 895)">
<rect x="0" y="-15" width="600" height="30" rx="4" fill="none" stroke="#D3D1C7" stroke-width="0.5"/>
<rect x="15" y="-5" width="12" height="12" rx="2" class="generator"/>
<text class="tl" x="32" y="5">Generator</text>
<rect x="95" y="-5" width="12" height="12" rx="2" class="gearbox"/>
<text class="tl" x="112" y="5">Gearbox</text>
<rect x="170" y="-5" width="12" height="12" rx="2" class="brake"/>
<text class="tl" x="187" y="5">Brake</text>
<rect x="230" y="-5" width="12" height="12" rx="2" class="yaw"/>
<text class="tl" x="247" y="5">Yaw system</text>
<rect x="320" y="-5" width="12" height="12" rx="2" class="pitch-motor"/>
<text class="tl" x="337" y="5">Pitch motor</text>
<line x1="415" y1="1" x2="435" y2="1" class="cable" style="stroke-width:2"/>
<text class="tl" x="440" y="5">Power cable</text>
<rect x="515" y="-5" width="12" height="12" rx="2" class="transformer"/>
<text class="tl" x="532" y="5">Transformer</text>
</g>
</svg>
```
## CSS Classes
```css
/* Foundation */
.concrete { fill: #B4B2A9; stroke: #5F5E5A; stroke-width: 1; }
.concrete-dark { fill: #888780; stroke: #5F5E5A; stroke-width: 1; }
.rebar { stroke: #854F0B; stroke-width: 1.5; fill: none; }
.soil { fill: #8B7355; stroke: #5F5E5A; stroke-width: 0.5; }
.soil-dark { fill: #6B5344; }
/* Tower */
.tower { fill: #F1EFE8; stroke: #5F5E5A; stroke-width: 1; }
.tower-inner { fill: #D3D1C7; stroke: #888780; stroke-width: 0.5; }
.tower-section { stroke: #888780; stroke-width: 0.5; stroke-dasharray: 2 4; }
.ladder { stroke: #5F5E5A; stroke-width: 1; fill: none; }
.ladder-rung { stroke: #888780; stroke-width: 0.8; }
.elevator { fill: #E6F1FB; stroke: #185FA5; stroke-width: 0.5; }
.cable { stroke: #E24B4A; stroke-width: 2; fill: none; }
.cable-thin { stroke: #E24B4A; stroke-width: 1.5; fill: none; }
/* Nacelle */
.nacelle { fill: #F1EFE8; stroke: #5F5E5A; stroke-width: 1; }
.nacelle-cover { fill: #D3D1C7; stroke: #5F5E5A; stroke-width: 1; }
.gearbox { fill: #BA7517; stroke: #633806; stroke-width: 0.5; }
.generator { fill: #378ADD; stroke: #0C447C; stroke-width: 0.5; }
.brake { fill: #E24B4A; stroke: #791F1F; stroke-width: 0.5; }
.yaw { fill: #5DCAA5; stroke: #085041; stroke-width: 0.5; }
.bearing { fill: #444441; stroke: #2C2C2A; stroke-width: 0.5; }
/* Rotor */
.hub { fill: #D3D1C7; stroke: #5F5E5A; stroke-width: 1; }
.hub-cap { fill: #F1EFE8; stroke: #5F5E5A; stroke-width: 1; }
.blade { fill: #F1EFE8; stroke: #888780; stroke-width: 1; }
.blade-root { fill: #D3D1C7; stroke: #5F5E5A; stroke-width: 0.5; }
.pitch-motor { fill: #7F77DD; stroke: #3C3489; stroke-width: 0.5; }
/* Transformer */
.transformer { fill: #27500A; stroke: #173404; stroke-width: 1; }
.transformer-fin { fill: #3B6D11; stroke: #27500A; stroke-width: 0.5; }
```

Some files were not shown because too many files have changed in this diff Show More